JP4607830B2

JP4607830B2 - Interest information generating apparatus, interest information generating method, and interest information generating program

Info

Publication number: JP4607830B2
Application number: JP2006198413A
Authority: JP
Inventors: 真中辻; 優三好; 祥広大塚
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2006-07-20
Filing date: 2006-07-20
Publication date: 2011-01-05
Anticipated expiration: 2026-07-20
Also published as: JP2008027142A

Description

The present invention relates to an interest information generation apparatus, an interest information generation method, and an interest information generation program, and is particularly suitable for application to a method of automatically generating, by referring to blog entries, a personal ontology in which an individual's interest information is organized into a concept hierarchy.

Information retrieval on the Internet and elsewhere is performed mainly by keyword input. With keyword search, for example in the field of video recording devices, the content to be recorded is specified by keywords, so a user who cannot think of an appropriate keyword cannot record the intended video content.
Likewise, in a search using a search engine such as goo, a user who cannot think of an appropriate keyword not only fails to find the intended content but may also receive many unwanted results.
Furthermore, at blog providers such as doblog, users are limited to searching for information of interest by keyword and visiting the blog sites that appear in the results; they cannot discover previously unknown keywords, communities, or blog sites that would interest them.

One reason is that current search methods rely on keywords alone and cannot perform high-precision search that exploits the class (concept) hierarchy and the attributes of classes, namely class names and instances (entities). A keyword is merely a character string, whereas a class in an ontology has multiple instances as members, so the choice of which instances belong to a class can reflect an individual's preferences. The way the class hierarchy is arranged can also reflect those preferences.
The conventional method for calculating the similarity between ontologies, as disclosed in Non-Patent Document 1, targets ontologies constructed in a completely distributed environment. For this reason, the correspondence between the classes of the ontologies is derived as preprocessing for the similarity calculation.

FIG. 6 is a diagram showing a configuration example of conventional ontologies constructed in a completely distributed environment.
In FIG. 6, ontology OA has a class “Rock”. The class “Rock” has a child class “US”, “US” has a child class “Indies”, and “Indies” has a child class “Grunge”. The class “Grunge” is assumed to have the instances “nirvana”, “Foo Fighters”, and “superchunk”.

Ontology OB, on the other hand, also has a class “Rock”. The class “Rock” has child classes “America” and “UK”; “America” has a child class “indies”, and “indies” is assumed to have the instances “nirvana” and “Stone Temple Pilots”.

Furthermore, the child class “UK” has a child class “Alternative”, and “Alternative” has a child class “Scotland”. The class “Scotland” is assumed to have the instance “teenage”.
To measure the similarity between ontologies OA and OB, the classes contained in the two ontologies are first mapped to each other. For this mapping, the similarity can be measured for every combination of a class from OA with a class from OB.

That is, as shown in FIGS. 7 and 8, the mapping between ontologies OA and OB can be obtained by calculating the similarity between all classes of the two ontologies on a round-robin basis. The similarity can be calculated from the similarity of class characteristics such as the name attribute, various properties, and the instance set of each class.
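The round-robin comparison can be sketched as follows. This is a minimal illustration that scores class pairs by the overlap of their instance sets only; the function names `jaccard` and `map_classes` and the `threshold` parameter are assumptions made for the example, and the conventional method may also use name attributes and other properties.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard coefficient of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def map_classes(src, dst, threshold=0.0):
    """Compare every class of src with every class of dst and keep the best match.

    src and dst map a class name to its instance set (dst must be non-empty).
    threshold is a hypothetical cut-off: pairs scoring at or below it are dropped.
    """
    scores = {(s, d): jaccard(src[s], dst[d]) for s, d in product(src, dst)}
    mapping = {}
    for s in src:
        best = max(dst, key=lambda d: scores[(s, d)])
        if scores[(s, best)] > threshold:
            mapping[s] = best
    return mapping


# Leaf classes and their instance sets, taken from the FIG. 6 description.
oa = {"Grunge": {"nirvana", "Foo Fighters", "superchunk"}}
ob = {"indies": {"nirvana", "Stone Temple Pilots"}, "Scotland": {"teenage"}}

mapping = map_classes(oa, ob)
```

With these inputs “Grunge” is paired with “indies”, because the two classes share the instance “nirvana”.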

FIG. 9 is a diagram showing an example of the mapping result between the classes of FIG. 6.
In FIG. 9, the mapping between ontologies OA and OB is assumed to yield the following correspondences: class “Rock” of OA with class “Rock” of OB; class “US” of OA with class “America” of OB; class “Indies” of OA with class “indies” of OB; and class “Grunge” of OA with class “indies” of OB.

Once the classes of ontologies OA and OB have been mapped, the similarity is measured between class sets, each consisting of a highly similar class and its adjacent classes. The Jaccard coefficient, for example, can be used for this purpose: the similarity between class sets X and Y is given by |X∩Y| / |X∪Y|.

FIG. 10 is a diagram showing an example of the similarity measurement result between the ontologies of FIG. 6.
In FIG. 10, the mapping result of FIG. 9 is assumed to have been obtained between the classes of ontologies OA and OB of FIG. 6. In this case, ontology OA yields a class set G11 consisting of the classes “Rock” and “US”; a class set G12 consisting of the classes “Rock”, “US”, and “Indies”; and a class set G13 consisting of the classes “Indies” and “Grunge”.

Ontology OB, on the other hand, yields a class set G21 consisting of the classes “Rock”, “America”, and “UK”; a class set G22 consisting of the classes “Rock”, “America”, and “indies”; and a class set G23 consisting of the classes “America” and “indies”.

Between class sets G11 and G21, the class “Rock” of G11 corresponds to the class “Rock” of G21, so the pair counts as one member; the class “US” of G11 corresponds to the class “America” of G21, so that pair also counts as one member; and G21 additionally contains the class “UK”. The total number of members of G11 and G21 (G11∪G21) is therefore 3.

Meanwhile, “Rock” of G11 corresponds to “Rock” of G21 and “US” of G11 corresponds to “America” of G21, but G11 has no class corresponding to “UK” of G21, so the number of members common to G11 and G21 (G11∩G21) is 2. As a result, the Jaccard coefficient between class sets G11 and G21 is 2/3.
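The counting rule for G11 and G21, where a mapped pair of classes counts as a single member, can be sketched as below. The helper name `jaccard_mapped` is hypothetical, and the sketch only reproduces the G11/G21 case walked through above; how members are counted for the other set pairs in FIG. 10 is not spelled out in the text, so no claim is made about them.

```python
from fractions import Fraction


def jaccard_mapped(x, y, mapping):
    """Jaccard coefficient of class sets x and y, treating mapped pairs as one member.

    Classes of x are renamed to their counterparts in the other ontology;
    classes without a counterpart are tagged so they cannot collide with names in y.
    """
    translated = {mapping.get(c, ("unmapped", c)) for c in x}
    union = translated | set(y)
    if not union:
        return Fraction(0)
    return Fraction(len(translated & set(y)), len(union))


# Class correspondences between OA and OB, as given for FIG. 9.
mapping = {"Rock": "Rock", "US": "America", "Indies": "indies", "Grunge": "indies"}

g11 = {"Rock", "US"}
g21 = {"Rock", "America", "UK"}

coefficient = jaccard_mapped(g11, g21, mapping)
```

Here the union has three members (Rock, US/America, UK) and the intersection has two, giving 2/3.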

Similarly, the Jaccard coefficient between class sets G12 and G22 is 2/4, and that between class sets G13 and G23 is 1/3.
Once the Jaccard coefficients between the class sets of ontologies OA and OB have been obtained, they are summed, and the sum is divided by the number of classes in the source ontology OA to give the similarity of ontology OB as seen from ontology OA. In FIG. 10, for example, the source ontology OA has 4 classes, so the similarity of ontology OB as seen from ontology OA is (2/3 + 2/4 + 1/3) / 4 = 3/8.
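The final aggregation step is simple enough to check with exact fractions. The sketch below assumes the three Jaccard coefficients stated for FIG. 10 as inputs; the function name `ontology_similarity` is illustrative only.

```python
from fractions import Fraction


def ontology_similarity(coefficients, source_class_count):
    """Sum the class-set Jaccard coefficients and divide by the source ontology's class count."""
    return sum(coefficients, Fraction(0)) / source_class_count


# Jaccard coefficients for (G11, G21), (G12, G22), (G13, G23) as stated for FIG. 10.
coefficients_oa_ob = [Fraction(2, 3), Fraction(2, 4), Fraction(1, 3)]

# The source ontology OA has 4 classes: Rock, US, Indies, Grunge.
similarity_oa_ob = ontology_similarity(coefficients_oa_ob, 4)
```

Because the sum is divided by the class count of the source ontology, the measure is directional: the similarity of OB seen from OA need not equal that of OA seen from OB.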

FIG. 11 is a diagram showing a comparison of conventional similarity measurement results for a plurality of ontologies.
In FIG. 11, ontology OC is assumed to be ontology OB without the class “UK” and its descendant classes. Ontology OC yields a class set G31 consisting of the classes “Rock” and “America”; a class set G32 consisting of the classes “Rock”, “America”, and “indies”; and a class set G33 consisting of the classes “America” and “indies”.

In this case, the Jaccard coefficient between class sets G11 and G31 is 2/2, that between G12 and G32 is 2/4, and that between G13 and G33 is 1/3. As a result, the similarity of ontology OC as seen from ontology OA is (2/2 + 2/4 + 1/3) / 4 = 11/24.
Non-Patent Document 2 discloses a method for semi-automatically mapping the classes that constitute a plurality of ontologies.

Non-Patent Document 1: Maedche, A. and Staab, S.: Measuring Similarity between Ontologies, In Technical Report, E0448, University of Karlsruhe (2001)
Non-Patent Document 2: 中辻真 (Makoto Nakatsuji), 三好優 (Yu Miyoshi), 木村辰幸 (Tatsuyuki Kimura): “Proposal and Evaluation of a Message Mapping Method Based on Semantic Information for Flexible System Cooperation”, Database Society of Japan Letters, Vol. 4, No. 1, pp. 37-40 (2005)

However, generating a personal ontology in which a user's interests are organized into a concept hierarchy is costly, which makes it difficult to distribute each individual's personal ontology over the Internet. Consequently, it has not been possible to automatically form communities among users with matching interests by mapping each individual's personal ontology to those of other users. There has also been the problem that automatic, high-precision information retrieval and recommendation by matching the metadata of content such as music files on the Web against personal ontologies is not possible.

In addition, with conventional methods of generating a personal ontology, the closeness of interest in an artist is measured as the same whether both users write enthusiastically about that artist or one user writes enthusiastically about the artist and the other does not. There has therefore been the problem that a personal ontology sufficiently reflecting the user's interests cannot be generated accurately.
An object of the present invention is therefore to provide an interest information generation apparatus, an interest information generation method, and an interest information generation program capable of generating a personal ontology that reflects an individual's interests with high accuracy while keeping the cost of creation low.

To solve the problems described above, the interest information generation apparatus according to claim 1 comprises: storage means for storing blog entries for each individual; word extraction means for extracting words contained in the blog entries; storage means for storing a template ontology in which preset words are organized into a concept hierarchy; classifier application means for extracting, from the template ontology, classes or instances containing the words extracted by the word extraction means; interest degree measurement means for measuring the degree of interest of the individual who owns a blog entry in the classes or instances of the template ontology that contain words contained in that blog entry; and personal ontology extraction means for extracting from the template ontology, for each individual and on the basis of the degree of interest measured by the interest degree measurement means, a hierarchical structure containing each class or instance with a relatively high degree of interest together with its superordinate classes, as a personal ontology representing the individual's interest information. Letting E be the set of blog entries of each individual and N(Ei) be the number of kinds of classes and instances containing words that appear in one blog entry of the set E, the interest degree measurement means measures the degree of interest in an instance containing a word contained in the one blog entry from

(equation omitted in this text)

and measures the degree of interest in a class containing a word contained in the one blog entry from

(equation omitted in this text)

The interest information generation method according to claim 2 is a method executed by an interest information generation apparatus that generates, as interest information, a personal ontology in which the words contained in an individual's blog entries are organized into a concept hierarchy. The method comprises: a step of extracting words contained in blog entries stored for each individual in storage means; a step of extracting, from a template ontology that is stored in storage means and in which preset words are organized into a concept hierarchy, classes or instances containing the words extracted from the blog entries; and a step of measuring, where E is the set of blog entries of each individual and N(Ei) is the number of kinds of classes and instances containing words that appear in one blog entry of the set E, the degree of interest of the individual who owns the one blog entry in an instance containing a word contained in the one blog entry from

(equation omitted in this text)

and the degree of interest in a class containing a word contained in the one blog entry from

(equation omitted in this text)

The method further comprises a step of extracting from the template ontology, for each individual and on the basis of the measured degree of interest, a hierarchical structure containing each class or instance with a relatively high degree of interest together with its superordinate classes, as a personal ontology representing the individual's interest information.

The interest information generation program according to claim 3 causes a computer to execute: a step of extracting words contained in blog entries stored for each individual in storage means; a step of extracting, from a template ontology that is stored in storage means and in which preset words are organized into a concept hierarchy, classes or instances containing the words extracted from the blog entries; and a step of measuring, where E is the set of blog entries of each individual and N(Ei) is the number of kinds of classes and instances containing words that appear in one blog entry of the set E, the degree of interest of the individual who owns the one blog entry in an instance containing a word contained in the one blog entry from

(equation omitted in this text)

and the degree of interest in a class containing a word contained in the one blog entry from

(equation omitted in this text)

The program further causes the computer to execute a step of extracting from the template ontology, for each individual and on the basis of the measured degree of interest, a hierarchical structure containing each class or instance with a relatively high degree of interest together with its superordinate classes, as a personal ontology representing the individual's interest information.

As described above, the present invention makes it possible to search for information matching one's preferences on the basis of agreement between concepts rather than mere character strings. In addition, by matching the words contained in each individual's interest information against the template ontology while taking the individual's degree of interest into account, a personal ontology that accurately reflects the individual's interests can be generated while preventing polysemous words from being misclassified. As a result, personal ontologies can be generated accurately at low cost, the precision of information retrieval can be improved, each individual's personal ontology can be distributed widely over the Internet, and communities matching individual preferences can be formed.

An interest information generation apparatus and method according to an embodiment of the present invention are described below with reference to the drawings.
FIG. 1 is a block diagram showing the schematic configuration of a system to which an interest information generation apparatus according to one embodiment of the present invention is applied.
In FIG. 1, terminals 2 to 4 and a server 5 are connected via a communication network 1. The communication network 1 may be, for example, a public communication network that carries IP traffic, and may be the Internet. It may be a dedicated inter-company network or a public network, or a network such as an IP-VPN (Internet Protocol-Virtual Private Network) that provides dedicated communication with high reliability and security. Each of the terminals 2 to 4 may be a notebook or desktop personal computer, a mobile phone terminal, a PDA (Personal Data Assistant), or the like. The server 5 can be installed at a blog provider or an ISP (Information Service Provider); for example, a ping server that collects and provides blog update information can be used as the server 5.

The server 5 hosts blog sites 7 to 9 corresponding to the terminals 2 to 4, respectively, and the blog sites 7 to 9 hold blog entries 7a to 7n, 8a to 8n, and 9a to 9n, respectively. A blog entry is the smallest unit of an article in a blog and can be created for each day. The server 5 also holds a template ontology 6, which provides a template for personal ontologies in which individuals' interest information is organized into a concept hierarchy.

The template ontology 6 can be created at the discretion of the blog provider. For example, if the blog provider wants the users of the terminals 2 to 4 to build personal ontologies about music, it need only build a template ontology 6 about music. To express the interests of the users in fine detail, the template ontology 6 is preferably as finely subdivided and comprehensive as possible. The template ontology 6 itself is a text file written in an XML-based language such as the ontology description language OWL. To simplify the organization of information, instances may be assigned only to the lowest-level classes.

The server 5 further includes: frequent word extraction means 5a, which applies morphological analysis to the blog entries 7a to 7n, 8a to 8n, and 9a to 9n to extract the words that appear frequently in them; classifier application means 5b, which extracts from the template ontology 6 the classes or instances containing those frequent words; interest degree measurement means 5c, which measures the degree of interest in the classes or instances extracted by the classifier application means 5b on the basis of the proportion in which they occur in the blog entries; and personal ontology extraction means 5d, which, on the basis of the degree of interest measured by the interest degree measurement means 5c, extracts the classes or instances and all of their superordinate classes from the template ontology 6 as a personal ontology.

The frequent word extraction means 5a applies morphological analysis to each of the blog entries 7a to 7n, 8a to 8n, and 9a to 9n, and extracts the morphemes that appear frequently across the multiple blog entries of the same user. Morphemes that clearly do not represent interests (for example, “I”, “thing”, and the particle “ga”) can be filtered out at this stage.
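The counting and filtering step can be sketched as follows. This is only an illustration under several assumptions: the entries are taken to be already segmented into morphemes (the morphological analyzer itself is not shown), “frequent” is interpreted as appearing in at least `min_entries` entries, and the names `STOP_MORPHEMES`, `frequent_morphemes`, and `min_entries` are hypothetical. The text does not state how frequency is actually thresholded.

```python
from collections import Counter

# Morphemes that clearly do not express an interest (the examples given in the text).
STOP_MORPHEMES = {"私", "もの", "が"}


def frequent_morphemes(entries, min_entries=2):
    """Return morphemes that occur in at least min_entries of one user's entries.

    entries is a list of morpheme lists, one list per blog entry.
    Each morpheme is counted once per entry, and stop morphemes are ignored.
    """
    counts = Counter()
    for morphemes in entries:
        counts.update(set(morphemes) - STOP_MORPHEMES)
    return {m for m, n in counts.items() if n >= min_entries}


# Toy, pre-segmented entries for a single user.
entries = [
    ["私", "が", "Ride", "Shoegaze"],
    ["Ride", "もの"],
    ["nirvana"],
]

frequent = frequent_morphemes(entries)
```

Counting each morpheme once per entry keeps one long entry from dominating the result, which is a design choice of this sketch rather than something the text prescribes.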

Next, the classifier application means 5b applies each morpheme that appears frequently in the blog entries 7a to 7n, 8a to 8n, and 9a to 9n to the template ontology 6 and checks whether it matches the character string of any class or instance in the template ontology 6. If the blog entries contain a character string matching a class or instance in the template ontology 6, the interest degree measurement means 5c measures the degree of interest in those classes or instances on the basis of the proportion in which they occur in the blog entries. For each class or instance whose measured degree of interest is high, the personal ontology extraction means 5d then extracts, as the personal ontology, the classes and instances on the direct line of descent from the root class of the template ontology 6 down to that class or instance.
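The extraction of the root-to-node paths can be sketched as below. The sketch assumes the template ontology is available as a child-to-parent table and that “high” interest means meeting a cut-off value; the names `extract_personal_ontology`, `parent_of`, and `threshold`, as well as the sample interest values, are assumptions made for illustration.

```python
from fractions import Fraction


def extract_personal_ontology(parent_of, interest, threshold):
    """Collect every node whose interest meets the threshold, plus all of its ancestors.

    parent_of maps each class or instance to its parent class (None for the root).
    interest maps classes or instances to a measured degree of interest.
    """
    kept = set()
    for node, degree in interest.items():
        if degree < threshold:
            continue
        # Walk up towards the root, stopping early on a path that was already collected.
        while node is not None and node not in kept:
            kept.add(node)
            node = parent_of.get(node)
    return kept


# A small template hierarchy (child -> parent); None marks the root class.
parent_of = {
    "Alternative": None,
    "Madchester": "Alternative",
    "Shoegaze": "Alternative",
    "New Order": "Madchester",
    "My Bloody Valentine": "Shoegaze",
}

# Hypothetical measured interest degrees for two instances.
interest = {"My Bloody Valentine": Fraction(7, 12), "New Order": Fraction(1, 4)}

personal = extract_personal_ontology(parent_of, interest, threshold=Fraction(1, 2))
```

With a cut-off of 1/2 only the “My Bloody Valentine” branch is retained, so the result contains that instance, “Shoegaze”, and “Alternative”.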

This makes it possible to search for information matching one's preferences on the basis of agreement between concepts rather than mere character strings. In addition, by matching the words contained in each individual's interest information against the template ontology 6 while taking the individual's degree of interest into account, a personal ontology that accurately reflects the individual's interests can be generated while preventing polysemous words from being misclassified. As a result, personal ontologies can be generated accurately at low cost, the precision of information retrieval can be improved, each individual's personal ontology can be distributed widely over the Internet, and communities matching individual preferences can be formed.

The degree of interest in a class or instance can be defined, for example, as follows.
(1) The user's degree of interest per entry is 1.
(2) If N(E_i) kinds of classes and instances appear in a given entry E_i of the user, the user's degree of interest in each of those classes and instances in entry E_i is 1/N(E_i).
(3) The degree of interest in each instance I_i in the ontology is given by the following equation, where E is the set of all of the user's accumulated entries.

(4) The degree of interest in each class C_i in the ontology is given by the following equation, where E is the set of all of the user's accumulated entries.

(5) The degree of interest in an instance is passed up to the degree of interest in the class to which the instance belongs, and the degree of interest in a child class is passed up to the degree of interest in its parent class.
FIG. 2 is a diagram showing an example of the degrees of interest in classes and instances according to one embodiment of the present invention.
In FIG. 2, the accumulated entries of user A are assumed to consist of entries E_1 and E_2. User A's interest ontology OA has a class “Alternative”, which has child classes “Madchester” and “Shoegaze”. The class “Alternative” has the instance “nirvana”; the class “Madchester” has the instances “New Order” and “Stone Roses”; and the class “Shoegaze” has the instances “My Bloody Valentine” and “Ride”.
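The equations referred to in items (3) and (4) are not reproduced in this text. One reconstruction that is consistent with items (1), (2), and (5) is sketched below; it is an assumption rather than the patent's own notation. The symbols S(·), inst(·), and child(·) are introduced here for illustration, and “I_i ∈ E_j” is shorthand for “I_i appears in entry E_j”.

```latex
% Interest in instance I_i: one share 1/N(E_j) from every entry that mentions it.
S(I_i) = \sum_{\substack{E_j \in E \\ I_i \in E_j}} \frac{1}{N(E_j)}

% Interest in class C_i: shares from entries that mention the class name itself,
% plus the interest inherited from its instances and its child classes.
S(C_i) = \sum_{\substack{E_j \in E \\ C_i \in E_j}} \frac{1}{N(E_j)}
       + \sum_{I_k \in \mathrm{inst}(C_i)} S(I_k)
       + \sum_{C_k \in \mathrm{child}(C_i)} S(C_k)
```

Under this reading, each entry contributes a total of 1 to the root class, in line with item (1).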

In entry E_1, the four artists “Stone Roses”, “My Bloody Valentine”, “nirvana” and “New Order” appear. In entry E_2, the artist “My Bloody Valentine” appears again, together with the genre “Shoegaze” and the artist “Ride”.
Therefore, the degree of interest of the instance “New Order” is 1/4, that of “Stone Roses” is 1/4, that of “My Bloody Valentine” is 1/4 + 1/3 = 7/12, that of “Ride” is 1/3, and that of “nirvana” is 1/4.
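The per-instance values above can be reproduced with a short sketch. It assumes each entry has already been reduced to the list of class and instance names found in it; the function name `interest_per_term` and the list encoding are hypothetical, not part of the original.

```python
from collections import defaultdict
from fractions import Fraction


def interest_per_term(entries):
    """Add 1/N(E_i) to a term for every entry E_i in which it appears.

    entries: list of lists of class/instance names found in each entry
    (assumed to be the output of an earlier matching step).
    """
    scores = defaultdict(Fraction)
    for terms in entries:
        distinct = set(terms)  # N(E_i) counts types, not occurrences
        if not distinct:
            continue
        share = Fraction(1, len(distinct))
        for term in distinct:
            scores[term] += share
    return dict(scores)


# Entries E_1 and E_2 of user A from FIG. 2
entries_user_a = [
    ["Stone Roses", "My Bloody Valentine", "nirvana", "New Order"],
    ["My Bloody Valentine", "Shoegaze", "Ride"],
]
scores = interest_per_term(entries_user_a)
```

Exact fractions are used so that the results can be compared directly with the values in the text, for example 7/12 for “My Bloody Valentine”.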

The degree of interest of the class “Madchester” is obtained by adding the degrees of interest of the instances “New Order” and “Stone Roses” under that class, giving 1/4 + 1/4 = 1/2. The degree of interest of the class “Shoegaze” is obtained by adding the degrees of interest of the instances “My Bloody Valentine” and “Ride” under that class, 7/12 + 1/3 = 11/12, to the 1/3 that the class name itself receives from its appearance in entry E_2, giving 7/12 + 1/3 + 1/3 = 5/4. Finally, the degree of interest of the class “Alternative” is obtained by adding the degrees of interest of the instance “nirvana” and of the classes “Madchester” and “Shoegaze” under it, giving 1/2 + 5/4 + 1/4 = 2.
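The roll-up to classes can be sketched as a recursive sum. The dictionary encoding of the FIG. 2 hierarchy and the names `HIERARCHY`, `DIRECT` and `class_interest` are assumptions made for this illustration. Note that the stated result of 5/4 for “Shoegaze” is only reached when the 1/3 that the class name itself receives in entry E_2 is included.

```python
from fractions import Fraction

# Hypothetical encoding of FIG. 2: class -> (child classes, instances)
HIERARCHY = {
    "Alternative": (["Madchester", "Shoegaze"], ["nirvana"]),
    "Madchester": ([], ["New Order", "Stone Roses"]),
    "Shoegaze": ([], ["My Bloody Valentine", "Ride"]),
}

# Scores earned directly from the entries (values taken from the text)
DIRECT = {
    "New Order": Fraction(1, 4),
    "Stone Roses": Fraction(1, 4),
    "My Bloody Valentine": Fraction(7, 12),
    "Ride": Fraction(1, 3),
    "nirvana": Fraction(1, 4),
    "Shoegaze": Fraction(1, 3),  # the class name itself appears in E_2
}


def class_interest(name):
    """Own score plus the scores passed up from instances and child classes."""
    children, instances = HIERARCHY[name]
    total = DIRECT.get(name, Fraction(0))
    for instance in instances:
        total += DIRECT.get(instance, Fraction(0))
    for child in children:
        total += class_interest(child)
    return total
```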

As a result, it can be recognized that user A has a strong interest in the artist “My Bloody Valentine”, because that artist is written about in both entries E_1 and E_2. For the genre “Shoegaze”, the class name itself appears in entry E_2 and the artists under it are discussed frequently in entries E_1 and E_2, so its degree of interest becomes large. This prevents a class from being treated as an equally strong interest merely because it has many artists under it.

Note that the template ontology 6 may be created manually as desk work and stored in the server 5, or it may be created by merging the personal ontology extracted by the personal ontology extraction means 5d with the existing template ontology 6. Furthermore, classes or instances that the user is interested in may be added to the personal ontology extracted from the template ontology 6, and classes or instances that the user is not interested in may be deleted from it.

The frequent word extraction means 5a, the classifier application means 5b, the interest degree measurement means 5c, and the personal ontology extraction means 5d can be realized by causing a computer to execute a program that describes instructions for carrying out the processing performed by these means.
If this program is stored on a storage medium such as a CD-ROM, the processing performed by the frequent word extraction means 5a, the classifier application means 5b, the interest degree measurement means 5c, and the personal ontology extraction means 5d can be realized by loading the storage medium into the computer of the server 5 and installing the program on that computer. The program can also be distributed easily by making it available for download via the communication network 1.

When a computer executes the program describing the instructions for the processing performed by the frequent word extraction means 5a, the classifier application means 5b, the interest degree measurement means 5c, and the personal ontology extraction means 5d, the program may be run on a stand-alone computer, or the processing may be distributed across a plurality of computers connected to a network.

FIG. 3 is a diagram illustrating a method for generating a personal ontology according to an embodiment of the present invention.
In FIG. 3, the entry sets of users A, B, ..., X are collected through a ping server or the like, and an index is created by performing morphological analysis on all of the collected blog entries (step S1).
Next, all the blog entries collected by the ping server are classified against the template ontology OH (step S2). As the classification method, if the name attribute of a class C_i of the template ontology OH appears in the text of an entry, that entry can be classified into class C_i. Likewise, if the name attribute of an instance I_i (∈ C_i) belonging to a class C_i of the template ontology OH appears in the text of an entry, that entry can be classified into the instance I_i belonging to class C_i. Note that the same entry may be classified into a plurality of classes.

For example, when the character string “Charlatans” appears in the text of an entry, that entry can be classified into the instance “Charlatans” of the class “Madchester”.
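The name-matching rule of step S2 can be sketched as below. This is a simplified illustration only: it uses case-insensitive substring matching instead of the morphological-analysis index described above, and the `TEMPLATE` dictionary, the sample text and the function name `classify_entry` are hypothetical.

```python
def classify_entry(text, template):
    """Return (class, instance or None) pairs whose name occurs in text.

    template: dict mapping a class name to the list of its instance names.
    A hit on the class name itself is reported with instance None.
    """
    lowered = text.lower()
    hits = []
    for cls, instances in template.items():
        if cls.lower() in lowered:
            hits.append((cls, None))
        for instance in instances:
            if instance.lower() in lowered:
                hits.append((cls, instance))
    return hits


# Hypothetical fragment of a template ontology
TEMPLATE = {
    "Madchester": ["Charlatans", "Stone Roses"],
    "Shoegaze": ["Ride"],
}
hits = classify_entry("Saw the Charlatans last night, pure Madchester.", TEMPLATE)
```

Plain substring matching would also match “Ride” inside a word such as “pride”, which is one reason a real implementation would match on analyzed tokens instead.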
Next, the number of users A, B, ..., X who are interested in each instance of the lowest-level classes C_l forming the template ontology OH is counted (step S3). When counting the users interested in an instance I_l of class C_l, a user is counted only once even if that user mentions instance I_l in a plurality of entries.

Next, the number of users A, B, ..., X who are interested in each lowest-level class C_l forming the template ontology OH is counted. This number can be calculated as the total of the users interested in any of the instances under the lowest-level class C_l and the users interested in class C_l itself. A user is counted only once even if that user is interested in a plurality of instances, or in both the lowest-level class and an instance belonging to it. By recursively counting the users A, B, ..., X interested in the classes and instances forming the template ontology OH up to the root class in this way, the distribution of users interested in that domain can be calculated.
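Because each user is counted once however many instances or classes that user mentions, the count behaves like the size of a set union that is rolled up toward the root. A minimal sketch under that reading follows; the dictionaries, the user IDs and the function name `users_interested` are hypothetical.

```python
def users_interested(cls, children, instances_of, direct_users):
    """Distinct users interested in cls, its instances, or any descendant class."""
    users = set(direct_users.get(cls, ()))
    for instance in instances_of.get(cls, ()):
        users |= set(direct_users.get(instance, ()))
    for child in children.get(cls, ()):
        users |= users_interested(child, children, instances_of, direct_users)
    return users


# Hypothetical data: which users mention which class or instance name
CHILDREN = {"Alternative": ["Madchester", "Shoegaze"]}
INSTANCES_OF = {
    "Alternative": ["nirvana"],
    "Madchester": ["New Order", "Stone Roses"],
    "Shoegaze": ["My Bloody Valentine", "Ride"],
}
DIRECT_USERS = {
    "New Order": {"A", "B"},
    "Stone Roses": {"A"},
    "Shoegaze": {"A"},
    "Ride": {"A", "C"},
    "nirvana": {"B"},
}
root_users = users_interested("Alternative", CHILDREN, INSTANCES_OF, DIRECT_USERS)
```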

Next, once all the blog entries collected by the ping server have been classified against the template ontology OH, the classification results are organized by user ID to generate the interest ontologies OA, ..., OX of the users A, B, ..., X (step S4).
When all the blog entries collected by the ping server are classified against the template ontology OH, classification errors can be eliminated by exploiting two properties of an ontology: instances belonging to the same class share the same properties, and classes that are close to each other in the class hierarchy have similar properties, as do their instances.

FIG. 4 is a diagram illustrating a method for measuring the degree of approximation between personal ontologies according to an embodiment of the present invention. In the following description, when the degree of approximation between one ontology and another is measured, the former is called the source ontology and the latter the target ontology.
In FIG. 4, it is assumed that the classes “b1”, “b2” and “b3” exist immediately below the class “a1” of the template ontology OH, that the classes “c1” and “c2” exist immediately below the class “b1”, and that the classes “d1” and “d2” exist immediately below the class “c1”. It is also assumed that the classes “c3” and “c4” exist immediately below the class “b2”, and that the class “c5” exists immediately below the class “b3”.

It is further assumed that the class “d1” has the instances “j” and “k”, the class “d2” has the instance “l”, the class “c2” has the instance “m”, the class “b2” has the instance “n”, the class “c3” has the instances “a”, “e”, “c”, “f”, “b”, “d” and “g”, and the class “c4” has the instances “p”, “g”, “j” and “h”.

It is assumed that the personal ontologies OA and OB have been created by extracting the words that appear frequently in each user's blog entries and then extracting, from the template ontology OH, the classes or instances that include those words together with all of their higher classes.
Here, it is assumed that the classes “b1” and “b2” exist immediately below the class “a1” of the personal ontology OA, that the classes “c1” and “c2” exist immediately below the class “b1”, that the class “d1” exists immediately below the class “c1”, and that the classes “c3” and “c4” exist immediately below the class “b2”. It is also assumed that the class “d1” has the instances “j” and “k”, the class “c2” has the instance “m”, the class “c3” has the instances “a”, “c”, “b” and “d”, and the class “c4” has the instances “q” and “h”.

It is assumed that the classes “b1”, “b2” and “b3” exist immediately below the class “a1” of the personal ontology OB, that the class “c1” exists immediately below the class “b1”, that the class “d2” exists immediately below the class “c1”, and that the classes “c3” and “c4” exist immediately below the class “b2”. It is also assumed that the class “d2” has the instance “l”, the class “b2” has the instance “n”, the class “c3” has the instances “a”, “c”, “e” and “f”, and the class “c4” has the instances “p” and “j”.
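Extracting a matched class together with all of its higher classes amounts to taking the ancestor closure within the template ontology. The sketch below illustrates this for the class structure of the personal ontology OA; the `PARENT_OF` map is a hypothetical encoding of the template ontology OH of FIG. 4, and the list of matched classes is an assumption about which classes user A's frequent words hit.

```python
def extract_personal_classes(parent_of, matched_classes):
    """Return the matched classes plus every ancestor up to the root."""
    kept = set()
    for cls in matched_classes:
        node = cls
        # Stop early when an already-kept ancestor chain is reached
        while node is not None and node not in kept:
            kept.add(node)
            node = parent_of.get(node)  # None once the root is passed
    return kept


# Hypothetical child -> parent map for the template ontology OH of FIG. 4
PARENT_OF = {
    "b1": "a1", "b2": "a1", "b3": "a1",
    "c1": "b1", "c2": "b1",
    "d1": "c1", "d2": "c1",
    "c3": "b2", "c4": "b2",
    "c5": "b3",
}
# Classes assumed to be hit by user A's frequent words
oa_classes = extract_personal_classes(PARENT_OF, ["d1", "c2", "c3", "c4"])
```

Because every kept class carries the ID it has in the template ontology, later comparisons between personal ontologies can be done by ID lookup alone.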

Then, the common classes, excluding terminal classes, are analyzed among the template ontology OH and the personal ontologies OA and OB, and topologies consisting of parent and child classes with each common class as the parent are extracted. In the example of FIG. 4, the terminal classes are “d1”, “d2”, “c3” and “c4”. As a result, among the template ontology OH and the personal ontologies OA and OB, it is possible to extract a child class set G1 whose parent is the class “a1”, a child class set G2 whose parent is the class “b1”, a child class set G3 whose parent is the class “c1”, a child class set G4 whose parent is the class “b2”, and a child class set G5 whose parent is the class “b3”.

This analysis of common classes only needs to check whether the same class ID exists in the template ontology OH and in the personal ontologies OA and OB. It is therefore unnecessary to measure the degree of approximation of class name attributes, instance set properties and the like, so the amount of calculation can be reduced while the correspondence between classes is maintained accurately.

Next, the degree of approximation between the child class sets X and Y forming each topology in the personal ontologies OA and OB is calculated in depth-first order. If Z is the set of child classes of the corresponding class in the template ontology OH, the degree of approximation between the child class sets X and Y is given by |X∩Y| / |Z|. By adding up the degrees of approximation between the child class sets of every topology, the degree of approximation S_T of the topology between the personal ontologies OA and OB can be measured.

For example, in the child class set G1, the template ontology OH contains the child classes “b1”, “b2” and “b3”, so the number of members of the child class set of the template ontology OH in G1 is 3. In the child class set G1, the personal ontology OA contains the child classes “b1” and “b2” and the personal ontology OB contains the child classes “b1”, “b2” and “b3”, so the only child classes common to OA and OB are “b1” and “b2”, and the intersection of their child class sets in G1 has 2 members. As a result, the degree of approximation between the personal ontologies OA and OB for the child class set G1 is 2/3.

Similarly, the degree of approximation between the personal ontologies OA and OB is 1/2 for the child class set G2, 0/2 for the child class set G3, and 2/2 for the child class set G4. As a result, the degree of approximation S_T of the topology between the personal ontologies OA and OB is 2/3 + 1/2 + 0/2 + 2/2.
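The sum of |X∩Y| / |Z| over the shared parent classes can be sketched as follows. The dictionaries map each parent class to its set of child class IDs and are a hypothetical encoding of FIG. 4; the child of “c1” in OB is taken to be “d2”, an assumption that reproduces the stated value of 0/2 for G3. The function name `topology_similarity` is also an assumption.

```python
from fractions import Fraction


def topology_similarity(template, source, target):
    """Sum |X ∩ Y| / |Z| over parents that have children in all three ontologies."""
    total = Fraction(0)
    for parent, z in template.items():
        if parent in source and parent in target and z:
            x, y = source[parent], target[parent]
            total += Fraction(len(x & y), len(z))
    return total


# parent class ID -> set of child class IDs (hypothetical encoding of FIG. 4)
OH = {
    "a1": {"b1", "b2", "b3"},
    "b1": {"c1", "c2"},
    "c1": {"d1", "d2"},
    "b2": {"c3", "c4"},
    "b3": {"c5"},
}
OA = {"a1": {"b1", "b2"}, "b1": {"c1", "c2"}, "c1": {"d1"}, "b2": {"c3", "c4"}}
OB = {"a1": {"b1", "b2", "b3"}, "b1": {"c1"}, "c1": {"d2"}, "b2": {"c3", "c4"}}

s_t = topology_similarity(OH, OA, OB)  # 2/3 + 1/2 + 0/2 + 2/2
```

Only set membership on class IDs is needed, which matches the point made in the text that no name or property comparison is required.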
In calculating the degree of approximation of the topology between the personal ontologies OA and OB as well, it suffices to refer to the class IDs assigned from the template ontology OH and to check only whether classes that follow the connection structure of the template ontology OH exist in both OA and OB. For example, whether the connection a1-b1-c1 in the personal ontologies OA and OB is the same as in the template ontology OH can be determined simply by checking whether OA and OB each hold the classes “a1”, “b1” and “c1”.

Therefore, to check how well the topologies of the personal ontologies OA and OB match, it is only necessary to examine the class IDs held by OA and OB. There is no need to start from a corresponding class and search the classes above and below it for further corresponding classes, so the amount of calculation for the degree of approximation of the topology between OA and OB can be reduced.

Next, the degree of approximation between the common classes of the personal ontologies OA and OB is calculated. The instance sets belonging to the classes can be used for this calculation. That is, for a given class C1, if x is the instance set of the source ontology, y is the instance set of the target ontology, and z is the instance set of the template ontology OH, the degree of approximation between the common classes of OA and OB is given by |x∩y| / |z|. By adding up the degrees of approximation between the common classes, the degree of approximation S_C between the classes of the personal ontologies OA and OB can be measured.

For example, for the common class “b2” of the personal ontologies OA and OB, the instance set G6 of the common class “b2” in OA contains the instance “n”, the instance set G6 of the common class “b2” in OB contains the instance “n”, and the common class “b2” of the template ontology OH contains the instance “n”. As a result, the number of instances of the common class “b2” in the template ontology OH is 1, the intersection of the instance sets G6 of OA and OB has 1 member, and the degree of approximation for the common class “b2” is 1/1.

For the common class “c3” of the personal ontologies OA and OB, the instance set G7 of the common class “c3” in OA contains the instances “a”, “c”, “b” and “d”, the instance set G7 of the common class “c3” in OB contains the instances “a”, “c”, “e” and “f”, and the common class “c3” of the template ontology OH contains the instances “a”, “e”, “c”, “f”, “b”, “d” and “g”.
As a result, the number of instances of the common class “c3” in the template ontology OH is 7, the intersection of the instance sets G7 of OA and OB has 2 members, and the degree of approximation for the common class “c3” is 2/7.
Therefore, the degree of approximation S_C between the classes of the personal ontologies OA and OB is 1/1 + 2/7.
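The class-level measure can be sketched in the same way as the topology measure, this time intersecting instance sets. The dictionaries below encode only the two common classes discussed in the text, “b2” and “c3”; the variable and function names are hypothetical.

```python
from fractions import Fraction


def class_similarity(template_inst, source_inst, target_inst, common_classes):
    """Sum |x ∩ y| / |z| over the given common classes."""
    total = Fraction(0)
    for cls in common_classes:
        z = template_inst[cls]
        x = source_inst.get(cls, set())
        y = target_inst.get(cls, set())
        if z:  # skip classes without instances in the template
            total += Fraction(len(x & y), len(z))
    return total


# class ID -> set of instance IDs, as stated in the text for "b2" and "c3"
OH_INST = {"b2": {"n"}, "c3": {"a", "e", "c", "f", "b", "d", "g"}}
OA_INST = {"b2": {"n"}, "c3": {"a", "c", "b", "d"}}
OB_INST = {"b2": {"n"}, "c3": {"a", "c", "e", "f"}}

s_c = class_similarity(OH_INST, OA_INST, OB_INST, ["b2", "c3"])  # 1/1 + 2/7
```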

In calculating the degree of approximation between the common classes of the personal ontologies OA and OB as well, it suffices to check whether the instance IDs assigned from the template ontology OH exist in both OA and OB. It is therefore unnecessary to establish the correspondence between instances beforehand, for example by matching instance names, and the amount of calculation can be reduced.
Once the degree of approximation S_T of the topology and the degree of approximation S_C between classes have been obtained for the personal ontologies OA and OB, the degree of approximation S_O(AB) between OA and OB can be given by the following expression, using an evaluation function f(X) that reflects the relative importance of topology and classes.
S_O(AB) = S_T + f(S_C)
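The text does not specify the evaluation function f, so the sketch below treats it as a parameter. The identity function and the weight of 1/2 are purely illustrative assumptions, as is the function name `ontology_similarity`. The inputs 13/6 and 9/7 are the sums 2/3 + 1/2 + 0/2 + 2/2 and 1/1 + 2/7 from the FIG. 4 example.

```python
from fractions import Fraction


def ontology_similarity(s_t, s_c, f=lambda s: s):
    """S_O = S_T + f(S_C); f defaults to the identity as a placeholder."""
    return s_t + f(s_c)


S_T = Fraction(13, 6)  # topology term from the FIG. 4 example
S_C = Fraction(9, 7)   # class term from the FIG. 4 example

s_o_identity = ontology_similarity(S_T, S_C)
# Hypothetical choice of f that halves the weight of the class term
s_o_weighted = ontology_similarity(S_T, S_C, f=lambda s: Fraction(1, 2) * s)
```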

This makes it possible to construct the personal ontologies OA and OB while inheriting the class characteristics of the template ontology OH, and to use the degree of co-occurrence between instance sets directly in measuring the degree of approximation between OA and OB. Personal ontologies OA and OB that fit individual preferences can thus be extracted properly while the amount of calculation is kept down. The result can also be used as a relative measure of how much of the domain knowledge held by the template ontology OH a personal ontology contains, so that users who have a great deal of knowledge about that domain can be narrowed down effectively.

Furthermore, when the degree of approximation S_T of the topology between the personal ontologies OA and OB is measured, using the number of members of the class set of the template ontology as the reference prevents the degree of approximation from becoming smaller even when the number of members of the class sets of OA and OB increases. The degree of approximation with a personal ontology that is rich in classes, and in that sense rich in knowledge, therefore becomes larger.

Generating the personal ontology while taking the user's degree of interest into account improves the accuracy of the degree of approximation between the personal ontologies OA and OB, so that information close to the user's own interests can be acquired efficiently.
That is, by introducing the user's degree of interest, when a user merely lists artists within a single blog entry, the interest in the artists appearing in that entry can be regarded as low, and when a user writes about an artist across different blog entries, the interest in that artist can be regarded as high.

Therefore, when a user B whose interests are close to those of user A is detected based on the degree of approximation between the personal ontologies OA and OB, the interests of a user A who merely lists artists within a blog entry, or who writes about artists across several blog entries but discusses each of them in only one entry, can be confined to a narrow range. This reduces the degree of approximation between the personal ontology OA of user A and the personal ontology OB of user B, and suppresses degradation of the recommendation accuracy of interest information.

For example, in FIG. 2, the four artists “Stone Roses”, “My Bloody Valentine”, “nirvana” and “New Order” appear in entry E_1, and the artist “My Bloody Valentine” appears again in entry E_2. In this case, user A can be treated as having a strong interest in the artist “My Bloody Valentine”, which prevents the four artists “Stone Roses”, “My Bloody Valentine”, “nirvana” and “New Order” from being treated as equal interests of user A.
This improves the accuracy with which the interest ontology of user A is generated, improves the accuracy with which the degree of approximation between interest ontologies is measured, and thereby improves the recommendation accuracy of interest information.

In FIG. 5, it is assumed that the interest ontologies KA and KB of users A and B have been generated by classifying the blog entries PA and PB of users A and B against the template ontology (step S11). The degree of approximation between the interest ontologies KA and KB of users A and B is then measured (step S12), and the classes and instances that co-occur between interest ontologies KA and KB with a high degree of approximation are analyzed. In this way, information that the user is likely to be interested in even though the topologies differ can be recommended to the user, via the entries of other users, as unexpected information (step S13).
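One way to picture steps S12 and S13 is to collect, from users whose interest ontologies are sufficiently close, the classes and instances that the target user does not yet hold. The sketch below is only an illustration of that idea: the threshold, the similarity scores and the function name `recommend_unknown_items` are hypothetical, and the text does not state how "high degree of approximation" is decided.

```python
def recommend_unknown_items(own_items, other_users, similarity, threshold):
    """Items held by sufficiently similar users that the target user lacks.

    own_items:   set of class/instance names in the target user's ontology
    other_users: dict mapping user ID -> set of class/instance names
    similarity:  dict mapping user ID -> degree of approximation to the target
    threshold:   assumed cut-off for a "high" degree of approximation
    """
    suggestions = set()
    for user, items in other_users.items():
        if similarity[user] >= threshold:
            suggestions |= items - own_items
    return suggestions


# Hypothetical data for a target user A and two other users
own = {"Madchester", "Happy Mondays"}
others = {
    "B": {"Madchester", "Happy Mondays", "Glasgow", "Teenage Fanclub"},
    "C": {"Jazz"},
}
scores = {"B": 2.0, "C": 0.1}
suggested = recommend_unknown_items(own, others, scores, threshold=1.0)
```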

For example, by measuring the degree of approximation between the interest ontologies KA and KB of users A and B, it can be seen that a user who is interested in a class such as “Madchester” or an instance such as “Happy Mondays” is also likely to be interested in the class “Glasgow” or the instance “Teenage Fanclub”.
Applying such interest ontologies KA and KB to blogs makes it possible to support community formation through unexpected entry recommendations based on the degree of approximation between the interest ontologies KA and KB, rather than through simple keyword search, and thereby to broaden the user's interests naturally (step S14).

As a method of measuring the degree of approximation between the interest ontologies KA and KB, the semantic degree of approximation between the classes of the ontologies can be measured by inference, learning and matching based on either the degree of approximation between class attributes, such as the class names and instances held by the classes, or the degree of approximation of the topology, which is the connection structure between classes. Alternatively, the semantic degree of approximation between the classes of the ontologies may be measured by inference, learning and matching based on both the degree of approximation between class attributes and the degree of approximation of the topology.

In the embodiment described above, a template ontology is required in order to generate the interest ontologies KA and KB automatically. Designing a template ontology requires adjusting the hierarchical relationships between classes and the granularity of the terminal classes so that users' interests are reflected in fine detail. Topic directories on portal sites such as goo have become increasingly detailed. For example, in the genres of the music domain, the genre hierarchy information published on the Web is set at a fine granularity with searches according to users' interests in mind. A template ontology can therefore be constructed by using a topic directory of a portal site on the Internet.

The present invention makes it possible to create a personal ontology easily and to obtain promptly information that accurately matches one's interests, so that information matching one's interests can be used automatically and efficiently from the information sources of an information communication system.

A block diagram illustrating a schematic configuration of a system to which an interest information generation apparatus according to an embodiment of the present invention is applied.
A diagram illustrating an example of the degree of interest of classes and instances according to an embodiment of the present invention.
A diagram illustrating a method for generating a personal ontology according to an embodiment of the present invention.
A diagram illustrating a method for measuring the degree of approximation between personal ontologies according to an embodiment of the present invention.
A diagram illustrating a community formation method that uses the closeness of interest ontologies according to an embodiment of the present invention.
A diagram illustrating a configuration example of ontologies constructed in a conventional, completely distributed environment.
A diagram illustrating an example of a brute-force method for measuring the degree of approximation between the classes of FIG. 6.
A diagram illustrating an example of a brute-force method for measuring the degree of approximation between the classes of FIG. 6.
A diagram illustrating an example of a mapping result between the classes of FIG. 6.
A diagram illustrating an example of a result of measuring the degree of approximation between the ontologies of FIG. 6.
A diagram illustrating a comparative example of results of measuring the degree of approximation for a plurality of conventional ontologies.

Explanation of symbols

1 Communication network
2-4 Terminals
5 Server
5a Frequent word extraction means
5b Classifier application means
5c Interest degree measurement means
5d Personal ontology extraction means
6 Template ontology
7-9 Blog sites
7a-7n, 8a-8n, 9a-9n Blog entries

Claims

An interest information generation apparatus comprising:
storage means for storing blog entries for each individual;
word extraction means for extracting words contained in the blog entries;
storage means for storing a template ontology in which preset words are conceptually hierarchized;
classifier application means for extracting, from the template ontology, a class or instance that includes a word extracted by the word extraction means;
interest degree measurement means for measuring the degree of interest of the individual who owns the blog entries in the class or instance of the template ontology that includes a word contained in the blog entries; and
personal ontology extraction means for extracting from the template ontology, for each individual and on the basis of the degree of interest measured by the interest degree measurement means, a hierarchical structure consisting of each class or instance having a relatively high degree of interest and its higher-level classes, as a personal ontology representing that individual's interest information,
wherein, where E is the set of the individual's blog entries and N(Ei) is the number of kinds of classes and instances that include words appearing in one blog entry Ei of the set E, the interest degree measurement means measures the degree of interest in the instance that includes a word contained in that one blog entry from the following expression:

(expression not reproduced in this text)

An interest information generation method executed by an interest information generation apparatus that generates, as interest information, a personal ontology in which words contained in an individual's blog entries are conceptually hierarchized, the method comprising:
a step of extracting words contained in the blog entries stored for each individual in storage means;
a step of extracting, from a template ontology that is stored in storage means and in which words are conceptually hierarchized, a class or instance that includes a word extracted from the blog entries;
a step of measuring, where E is the set of the individual's blog entries and N(Ei) is the number of kinds of classes and instances that include words appearing in one blog entry Ei of the set E, the degree of interest of the individual who owns that one blog entry in the instance that includes a word contained in that one blog entry, from the following expression:

(expression not reproduced in this text)

and a step of extracting from the template ontology, for each individual and on the basis of the measured degree of interest, a hierarchical structure consisting of each class or instance having a relatively high degree of interest and its higher-level classes, as a personal ontology representing that individual's interest information.

An interest information generation program for causing a computer to execute:
a step of extracting words contained in the blog entries stored for each individual in storage means;
a step of extracting, from a template ontology that is stored in storage means and in which words are conceptually hierarchized, a class or instance that includes a word extracted from the blog entries;
a step of measuring, where E is the set of the individual's blog entries and N(Ei) is the number of kinds of classes and instances that include words appearing in one blog entry Ei of the set E, the degree of interest of the individual who owns that one blog entry in the instance that includes a word contained in that one blog entry, from the following expression:

(expression not reproduced in this text)

and a step of extracting from the template ontology, for each individual and on the basis of the measured degree of interest, a hierarchical structure consisting of each class or instance having a relatively high degree of interest and its higher-level classes, as a personal ontology representing that individual's interest information.
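
The sequence of steps recited in the claims (word extraction, matching against the template ontology, interest measurement, and extraction of the personal ontology) can be sketched in Python as follows. This is an illustrative sketch only, not the patented implementation: the tiny `TEMPLATE` ontology, the function names, the whitespace tokenizer, and the `threshold` parameter are all hypothetical. In particular, the claimed expression for the degree of interest is not reproduced in this text, so the sketch substitutes an assumed weighting in which each blog entry Ei adds 1/N(Ei) to every class or instance it mentions.

```python
from collections import defaultdict

# Hypothetical template ontology: node -> (parent class, words that map to the node).
# A parent of None marks the root class.
TEMPLATE = {
    "Music":    (None,    {"music"}),
    "Rock":     ("Music", {"rock"}),
    "Jazz":     ("Music", {"jazz"}),
    "Beatles":  ("Rock",  {"beatles"}),
    "Coltrane": ("Jazz",  {"coltrane"}),
}


def extract_words(entry):
    """Word extraction step: naive lowercase whitespace tokenization (assumption)."""
    return {w.strip(".,!?").lower() for w in entry.split()}


def matched_nodes(words):
    """Classifier step: classes/instances of the template that contain any extracted word."""
    return {node for node, (_, vocab) in TEMPLATE.items() if vocab & words}


def measure_interest(entries):
    """Interest measurement step.

    ASSUMPTION: each entry Ei contributes 1 / N(Ei) to every matched node,
    where N(Ei) is the number of distinct classes and instances matched in Ei.
    The actual claimed expression is not available in this text.
    """
    interest = defaultdict(float)
    for entry in entries:
        nodes = matched_nodes(extract_words(entry))
        for node in nodes:
            interest[node] += 1.0 / len(nodes)
    return dict(interest)


def extract_personal_ontology(interest, threshold):
    """Extraction step: keep each high-interest node together with its higher classes.

    Returns a child -> parent mapping, i.e. a sub-hierarchy of the template.
    The fixed threshold stands in for "relatively high degree of interest".
    """
    kept = {}
    for node, score in interest.items():
        if score < threshold:
            continue
        # Walk up the template hierarchy until the root or an already kept node.
        while node is not None and node not in kept:
            parent = TEMPLATE[node][0]
            kept[node] = parent
            node = parent
    return kept


entries = [
    "I love the Beatles",
    "Beatles and rock forever",
    "Tried some jazz today",
]
interest = measure_interest(entries)
personal_ontology = extract_personal_ontology(interest, threshold=1.2)
print(interest)
print(personal_ontology)
```

With these three sample entries, only the `Beatles` instance exceeds the assumed threshold, so the extracted personal ontology consists of `Beatles` together with its higher classes `Rock` and `Music`, while the `Jazz` branch of the template is left out.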