JP2021163316A

JP2021163316A - Search program, search method, and search device

Info

Publication number: JP2021163316A
Application number: JP2020065952A
Authority: JP
Inventors: Hiroaki Morikawa
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2020-04-01
Filing date: 2020-04-01
Publication date: 2021-10-11
Anticipated expiration: 2040-04-01
Also published as: JP7375657B2

Abstract

PROBLEM TO BE SOLVED: To output a plurality of appropriate facets from a set of facets associated with a category.
SOLUTION: A designation of a specific category is accepted. A plurality of facets associated with the specific category in a knowledge graph are then arranged in order and output (210) according to a priority calculated for each facet based on at least one of a first index, which is based on the number of categories with which that facet is associated in the knowledge graph, and a second index, which is based on the distance in the knowledge graph between the specific category and that facet.
[Selected drawing] FIG. 3

Description

The present invention relates to a search program, a search method, and a search device.

A knowledge graph (KG) is known as an example of a knowledge base (KB) that collects information from various information sources.

A KG may be used, for example, for facet search (faceted search) over the entire data stored in the KG. Facet search is a search technique in which the user selects search conditions prepared by the data search system to narrow down the content in the KG.

Patent Document 1: Japanese Translation of PCT International Application Publication No. 2011-513819
Patent Document 2: Japanese Translation of PCT International Application Publication No. 2005-514673

Non-Patent Document 1: Takahiro Komamizu, Toshiyuki Amagasa, Hiroyuki Kitagawa, "D-022 Automation of Facet Extraction for Facet Search over XML Data", 13th Forum on Information Technology (FIT2014), Vol. 2, pp. 133-134, 2014
Non-Patent Document 2: Takahiro Komamizu, Toshiyuki Amagasa, Hiroyuki Kitagawa, "Proposal of FoX, a Framework for Faceted Navigation over XML Data", 1st Forum on Data Engineering and Information Management (DEIM), B7-6, 2009

In facet search, when KG data is narrowed down to a specific category before searching, the narrowing may not be performed with appropriate keys (facet keys).

For example, for the category of professional baseball players, narrowing down by appropriate facet keys that are highly related to the category, such as batting side, dominant arm, and experience of playing at Koshien, enables a facet search that matches the user's knowledge.

However, in a conventional facet search system, even when the search is narrowed down to a specific category, narrowing may be performed with unimportant or inappropriate facet keys such as date of birth, birthplace, or company type.

When narrowing is not performed with appropriate facet keys in this way, the number of steps the user needs to reach the desired data in a facet search may increase.

In one aspect, an object of the present invention is to output a plurality of appropriate facets from a set of facets associated with a category.

In one aspect, a search program may cause a computer to execute the following process. The process may accept a designation of a specific category. The process may also arrange and output a plurality of facets associated with the specific category in a knowledge graph, in an order according to a priority of the facets calculated based on at least one of a first index and a second index. The first index is based on the number of categories with which each of the facets is associated in the knowledge graph. The second index is based on the distance in the knowledge graph between the specific category and each of the facets.

In one aspect, the present invention can output a plurality of appropriate facets from a set of facets associated with a category.

FIG. 1 is a diagram showing an example of a representation in graph form, which is one example of an RDF (Resource Description Framework) description format.
FIG. 2 is a diagram illustrating targets narrowed down by category and the facets of each category in KG data.
FIG. 3 is a diagram showing a screen display example of the UI (User Interface) of a facet search system.
FIG. 4 is a block diagram showing a functional configuration example of a facet search system according to one embodiment.
FIG. 5 is a block diagram showing a hardware configuration example of a computer that implements the functions of a server.
FIG. 6 is a diagram showing an example of frequency table creation processing by a frequency table creation unit.
FIG. 7 is a diagram showing an example of a class set, a P frequency table, and a PO frequency table.
FIG. 8 is a diagram showing an example of the calculation formula of each index.
FIG. 9 is a diagram showing an example of index values.
FIG. 10 is a diagram showing an example of a category importance table that expresses category importance in tabular form.
FIG. 11 is a diagram showing an example of a facet score table that expresses facet scores in tabular form.
FIG. 12 is a diagram showing an example of a graph representing classes and facet keys in RDF Schema.
FIG. 13 is a diagram showing an example of a graph representing classes and facet keys in RDF Schema.
FIG. 14 is a diagram illustrating, in tabular form, a score (org_score), a facet score (new_score), a score (ont_score), and a final score (final_score).
FIG. 15 is a diagram showing an example of the facet keys displayed in the item list area.
FIG. 16 is a diagram showing a comparison of MRR (Mean Reciprocal Rank) when facet keys are sorted based on "org_score" and on "new_score".
FIG. 17 is a diagram showing a comparison of MRR when facet keys are sorted based on "new_score" and on "final_score".
FIG. 18 is a flowchart illustrating an operation example of DB (database) creation processing according to one embodiment.
FIG. 19 is a flowchart illustrating an operation example of the frequency table creation processing in step S1 of FIG. 18.
FIG. 20 is a flowchart illustrating an operation example of the score DB creation processing in step S2 of FIG. 18.
FIG. 21 is a flowchart illustrating an operation example of facet search processing according to one embodiment.

Hereinafter, embodiments of the present invention are described with reference to the drawings. The embodiments described below are merely examples, and there is no intention to exclude various modifications or applications of techniques that are not explicitly described below. For example, the present embodiment can be modified and implemented in various ways without departing from its spirit. In the drawings used in the following description, parts with the same reference numerals represent the same or similar parts unless otherwise specified.

[1] Embodiment
[1-1] Description of Facet Search System
First, the facet search system is briefly described. The facet search system is, for example, a system for performing facet search on a large-scale knowledge graph (KG).

The facet search system according to one embodiment improves efficiency by, for example, first narrowing down candidates by category and then performing a facet search within the category, instead of performing a facet search over all the data to be searched in the KG.

The data stored in the KG is expressed in a description format called RDF (Resource Description Framework), in which three elements, a subject, a predicate, and an object, form one set.

In facet search using a KG, a category is the class of an instance. A facet key is a "predicate" and a facet value is an "object". Through facet search, the facet search system searches for the instances that match the search conditions, for example a set of "subjects".
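The mapping described above (category = class, facet key = predicate, facet value = object, result = set of subjects) can be illustrated with a minimal Python sketch. The function name `facet_search`, the in-memory list of tuples, and the sample data are assumptions made for illustration only; the document does not prescribe this implementation.

```python
from typing import Iterable

Triple = tuple[str, str, str]  # (subject, predicate, object)

RDF_TYPE = "rdf:type"


def facet_search(triples: Iterable[Triple], category: str,
                 conditions: dict[str, str]) -> set[str]:
    """Return the subjects of class `category` whose facet values match `conditions`."""
    triples = list(triples)
    # The category is the class of the instance, stated through rdf:type.
    candidates = {s for s, p, o in triples if p == RDF_TYPE and o == category}
    # Each condition maps a facet key (predicate) to a facet value (object).
    for key, value in conditions.items():
        matching = {s for s, p, o in triples if p == key and o == value}
        candidates &= matching
    return candidates


# Hypothetical toy data, loosely following the baseball-player example.
KG = [
    ("Taro_Yamada", RDF_TYPE, "BaseballPlayer"),
    ("Taro_Yamada", "bats", "left"),
    ("Jiro_Sato", RDF_TYPE, "BaseballPlayer"),
    ("Jiro_Sato", "bats", "right"),
    ("Hanako_Suzuki", RDF_TYPE, "Politician"),
]

result = facet_search(KG, "BaseballPlayer", {"bats": "left"})
```

Narrowing by category first keeps the later predicate/object filters confined to a small candidate set, which is the efficiency argument made above.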

Here, an "instance" represents an event or thing in the real world and is also called an entity. For example, the baseball player "Taro Yamada" is an example of an instance, and may be represented by a character string such as "Taro Yamada" or by an ID in a DB (database).

Examples of the DB include various DBs that store data (open data) and publish it on the Internet, for example DBs that use LOD (Linked Open Data) technology.

An ID in the DB is the URI (Uniform Resource Identifier) of a location in the DB where the information on "Taro Yamada" can be referenced, for example the URL (Uniform Resource Locator) of a web page containing an article on "Taro Yamada". As an example, when the domain of the DB is "aaa.org", the ID of "Taro Yamada" in the DB is "http://aaa.org/resource/Taro_Yamada".

In one embodiment, IDs in one or more DBs may be set as instances in the KG. In other words, the KG is a knowledge base that collects information from one or more DBs as information sources.

A "class" represents the type of an instance; for example, the class of "Taro Yamada" is "baseball player". An instance may belong to multiple classes: "Taro Yamada" is a "baseball player" ("http://aaa.org/ontology/BaseballPlayer"), an "athlete" ("http://aaa.org/ontology/Athlete"), and a "person" ("http://aaa.org/ontology/Person").

Classes can hold superclass and subclass relationships. For example, using RDF Schema, "BaseballPlayer" (baseball player) and "Athlete" (athlete) are in an "rdfs:subClassOf" (subclass) relationship.
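As a small illustration of such a class hierarchy, the following Python sketch walks "rdfs:subClassOf" links upward from a class. The dictionary representation and the function name `superclasses` are hypothetical; the link from "Athlete" to "Person" is an assumption added for the example, since the text only states the "BaseballPlayer"/"Athlete" relationship explicitly.

```python
# Child class -> parent class, i.e. one rdfs:subClassOf link per entry.
SUBCLASS_OF = {
    "BaseballPlayer": "Athlete",  # stated in the text
    "Athlete": "Person",          # assumed for illustration
}


def superclasses(cls: str, subclass_of: dict[str, str]) -> list[str]:
    """Return the chain of superclasses of `cls`, nearest first."""
    chain = []
    seen = {cls}
    while cls in subclass_of:
        cls = subclass_of[cls]
        if cls in seen:  # guard against accidental cycles in the data
            break
        seen.add(cls)
        chain.append(cls)
    return chain


chain = superclasses("BaseballPlayer", SUBCLASS_OF)
```

A real ontology may give a class several parents; this sketch assumes a single parent per class to stay short.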

Hereinafter, RDF Schema (which may also be referred to as RDFS) and schemas defined using other standard vocabularies or independently defined vocabularies are referred to as an "ontology".

FIG. 1 is a diagram showing an example of a representation in graph form, which is one example of an RDF description format. The description of FIG. 1 shows an example in which a URI in the DB "http://aaa.org/" is included as the instance for "Hanako Suzuki" (鈴木花子), the governor of Tokyo (東京都).

The representation example shown in FIG. 1 is given below in text (N3) format.

<http://ja.aaa.org/resource/東京都> <http://aaaa.org/ontology/leader> <http://ja.aaa.org/resource/鈴木花子>.
<http://ja.aaa.org/resource/鈴木花子>
    rdf:type <http://aaa.org/ontology/Politician>;
    <http://aaa.org/ontology/birthPlace> <http://ja.aaa.org/resource/兵庫県>;
    <http://aaa.org/ontology/birthDate> "1960-01-01".

In this way, RDF expresses "things" and "matters" as triples of a "subject" (S), a "predicate" (P), and an "object" (O). From FIG. 1 and the representation example in text (N3) format above, the "things" and "matters" are organized as follows. Note that "rdf:type" is a predicate that defines the relationship between an instance and a class.

"The governor (P) of Tokyo (S) is Hanako Suzuki (O)."
"Hanako Suzuki (S) is a (P) politician (O), her birthplace (P) is Hyogo Prefecture (O), and her date of birth (P) is 1960-01-01 (O)."

FIG. 2 is a diagram illustrating targets narrowed down by category and the facets of each category in KG data. For example, the KG data may include, as sets of "subject", "predicate", and "object", the set "Taro Yamada", "rdf:type", "professional baseball player" and the set "Taro Yamada", "batting side", "left".

The facet search system may allow categories to be narrowed down (explored) hierarchically, for example from "thing" to "person" or "organization", from "person" to "politician", "professional baseball player", or "soccer player", and from "organization" to "company" or "professional baseball team".

For example, in the facet search system, the user can narrow down the search results by selecting a facet of one of the categories and then selecting one of the presented candidate values. In the example of FIG. 2, for the "politician" category, the facets are {name, political party, birthplace, date of birth}, and the character strings or DB IDs associated with each of them are the values of the facets.
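To make the relationship between a category, its facets, and the candidate values concrete, the sketch below collects, for one category, every facet key used by its instances together with the values seen for that key. The function name `facets_for_category` and the sample triples (including the person "Ichiro_Tanaka") are hypothetical and serve only as illustration.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def facets_for_category(triples, category):
    """Map each facet key used by instances of `category` to its candidate values."""
    members = {s for s, p, o in triples if p == RDF_TYPE and o == category}
    facets = defaultdict(set)
    for s, p, o in triples:
        # rdf:type expresses the category itself, so it is not listed as a facet.
        if s in members and p != RDF_TYPE:
            facets[p].add(o)
    return {key: sorted(values) for key, values in facets.items()}


# Hypothetical toy data.
KG = [
    ("Hanako_Suzuki", RDF_TYPE, "Politician"),
    ("Hanako_Suzuki", "birthPlace", "Hyogo"),
    ("Hanako_Suzuki", "birthDate", "1960-01-01"),
    ("Ichiro_Tanaka", RDF_TYPE, "Politician"),
    ("Ichiro_Tanaka", "birthPlace", "Tokyo"),
    ("Taro_Yamada", RDF_TYPE, "BaseballPlayer"),
    ("Taro_Yamada", "bats", "left"),
]

politician_facets = facets_for_category(KG, "Politician")
```

The keys of the returned dictionary correspond to what a UI would list as facets, and each value list to the candidate values offered for selection.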

FIG. 3 is a diagram showing a screen display example of the UI (User Interface) of the facet search system. As shown in FIG. 3, the UI of the facet search system may include, as screens displayed on the user's terminal, a category search screen 100 and a facet search screen 200 to which the display transitions from the category search screen 100.

The category search screen 100 is a screen for narrowing down categories from the KG data, in other words a screen that accepts the designation of a specific category, and may include, for example, a category selection area 110 and a class selection area 120. The category search screen 100 may also include a button 130 for displaying search conditions saved on the facet search screen 200.

The selection area 110 is, for example, an area for accepting the selection of a data set, such as "person" or "organization", from the "thing" shown in FIG. 2; in the example of FIG. 3, "data set A" and "data set B" are displayed. The selection area 120 is an area that displays a list of the classes associated with the "person" or "organization" selected in the selection area 110, for example an area for accepting a selection of "politician", "professional baseball player", "soccer player", or the like from "person" shown in FIG. 2.

In the example of FIG. 3, for convenience, the selection area 110 is labeled as a "data set" selection area and the selection area 120 as a "class" selection area, but the "class" of the "data set" selected in the selection areas 110 and 120 may be regarded as the "category".

The UI displays the facet search screen 200 based on the information on the class (category) selected in the selection area 120.

The facet search screen 200 is a screen for searching facets of the class (category) selected on the category search screen 100. That is, the facet search screen 200 is the search screen to which the display transitions from the category search screen 100, and is a search screen for searching over the plurality of facets associated with the specific category.

The facet search screen 200 may include, for example, an item list area 210, a search condition setting area 230, an output item setting area 250, an output language setting area 260, and a list display area 280.

The item list area 210 is an area that displays facet keys in the KG data based on the class (category) selected on the category search screen 100.

The search condition setting area 230 is an area for setting a search condition for a facet key for which the add button 220 was pressed while that key was selected in the item list area 210. The output item setting area 250 is an area for setting, for a facet key for which the add button 240 was pressed while that key was selected in the item list area 210, the items of the entities to be output and their output order. The output language setting area 260 is an area for setting the output language of the entities.

The list display area 280 is an area that displays a list of entities based on the settings in the setting areas 230, 250, and 260 when the search button 270 is pressed.

The facet search screen 200 may also display operation buttons 290 for one or more of the display content of the list display area 280, the settings in the setting areas 230, 250, and 260, and the display content of the item list area 210. The operation buttons 290 may include, for example, a button for output in CSV (Comma Separated Value) format, a button for checking the SPARQL (SPARQL Protocol and RDF Query Language) statement, SPARQL being an example of an RDF query language, and a button for saving the search conditions.
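As an illustration of the kind of SPARQL statement such a confirmation button might show, the following Python sketch assembles a query from a selected category and a set of facet conditions. The function name `build_sparql`, the query shape, and the default limit are assumptions; the document does not specify how the system generates its SPARQL. The sketch performs no escaping, so each value is assumed to already be a well-formed SPARQL term (an IRI in angle brackets or a quoted literal).

```python
def build_sparql(category_uri: str, conditions: dict[str, str],
                 limit: int = 100) -> str:
    """Build a SELECT query for instances of a category matching facet conditions."""
    lines = [
        "SELECT DISTINCT ?s WHERE {",
        # "a" is the SPARQL shorthand for rdf:type.
        f"  ?s a <{category_uri}> .",
    ]
    for key_uri, value in conditions.items():
        # Facet key = predicate, facet value = object.
        lines.append(f"  ?s <{key_uri}> {value} .")
    lines.append("}")
    lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


query = build_sparql(
    "http://aaa.org/ontology/Politician",
    {"http://aaa.org/ontology/birthDate": '"1960-01-01"'},
)
```

Printing `query` yields a statement with one triple pattern for the category and one per facet condition, mirroring the search condition setting area 230.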

Incidentally, facet keys suited to the category may not be displayed in the item list area 210 of the facet search screen 200. Such a case is described below as a comparative example.

(Comparative example)
For example, assume that the screen display shown in FIG. 3 is performed using the technique described in Non-Patent Document 1. In this technique, for example, a server extracts facet values from XML (eXtensible Markup Language) data by the following procedure.

(i) The server extracts a structure summary from the XML data.
The structure summary is information indicating the parent-child relationships of the elements in the XML data, the frequency of occurrence of child elements relative to their parent elements, and the like.

(ii) Class candidates and facet candidates are extracted from the structure summary.
In the above technique, a node in the structure summary that corresponds to an object in the XML data is called a class, and a node selected from among the descendant nodes of a class node in the structure summary is called a facet.

(iii) Appropriate candidates are extracted from the class candidates and facet candidates.
(iv) XML subtrees corresponding to the extracted classes are extracted from the XML data as objects.
(v) The values of the elements corresponding to the extracted facets are extracted as facet values.
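Step (i) of the procedure above can be sketched roughly as follows, using Python's standard `xml.etree.ElementTree` module. The function name `structure_summary`, the choice of counting (parent tag, child tag) pairs, and the sample XML are assumptions for illustration; the cited technique may represent its structure summary differently.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def structure_summary(xml_text: str) -> Counter:
    """Count how often each (parent tag, child tag) pair occurs in the document."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    # root.iter() visits the root and every descendant element.
    for parent in root.iter():
        for child in parent:
            counts[(parent.tag, child.tag)] += 1
    return counts


# Hypothetical sample document.
SAMPLE = (
    "<players>"
    "<player><name>Taro Yamada</name><bats>left</bats></player>"
    "<player><name>Jiro Sato</name></player>"
    "</players>"
)

summary = structure_summary(SAMPLE)
```

In this sample, "player" would be a natural class candidate and its children "name" and "bats" facet candidates, with the counts indicating how often each child appears under its parent.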

For example, in steps (i) to (v) above, the server could target KG data in the RDF description format instead of XML data.

In the technique described in Non-Patent Document 1, in step (iii) above the server adopts a method that combines a frequency-based approach with a semantics-based approach. In the frequency-based approach, the server can extract facet keys through which more instances in the KG data can be obtained. In the semantics-based approach, the server uses existing knowledge such as WordNet and Wikipedia (registered trademark) to determine whether a facet key can be interpreted by humans.

Note that the above technique does not take the importance of facet keys into account. Non-Patent Document 2, on the other hand, describes an index for ranking facet keys.

Now consider the case where the item list area 210 of the facet search screen 200 is displayed using the techniques described in Non-Patent Documents 1 and 2, in other words, the case where a facet search is performed within a specific category. When performing a facet search within a specific category, it is sometimes better to narrow down by facet keys specific to that category. For example, assume that the category of professional baseball players is selected on the category search screen 100.

In this case, with the technique described in Non-Patent Document 2, the statistical tendencies of the KG data as a whole cause birthplace and date of birth to be ranked higher, as important facet keys, than experience of playing at Koshien.

Furthermore, the technique described in Non-Patent Document 1 only uses whether a facet exists in the knowledge base and does not use the ontology of the knowledge. For example, for a "date" facet key, "date of birth" is the more important facet for a person, and "date of first appearance" or the like for a professional baseball player. Such information is structured in the KG by an ontology or the like.

However, when the category of professional baseball players is selected, narrowing down by facet keys such as batting side, dominant arm, and experience of playing at Koshien, rather than date of birth, birthplace, company type, and the like, achieves a facet search that better matches the user's knowledge.
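The frequency-based approach can be pictured with the following minimal sketch, which ranks facet keys by the number of distinct instances that carry them. The function name `rank_keys_by_coverage` and the sample data are hypothetical, and the semantics-based check against WordNet or Wikipedia is not modeled here.

```python
from collections import defaultdict


def rank_keys_by_coverage(triples):
    """Rank facet keys by how many distinct subjects use them (most first)."""
    subjects_by_key = defaultdict(set)
    for s, p, o in triples:
        subjects_by_key[p].add(s)
    # Sort by subject count, descending; ties are broken by key name.
    return sorted(
        ((key, len(subjects)) for key, subjects in subjects_by_key.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical toy data: birthDate is generic, bats is specific to players.
KG = [
    ("Taro_Yamada", "birthDate", "1990-01-01"),
    ("Taro_Yamada", "bats", "left"),
    ("Jiro_Sato", "birthDate", "1991-02-02"),
    ("Hanako_Suzuki", "birthDate", "1960-01-01"),
]

ranking = rank_keys_by_coverage(KG)
```

In this toy data the generic key "birthDate" outranks the category-specific key "bats", which shows how a purely frequency-driven ranking can favor generic facet keys.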

Therefore, the facet search system according to the embodiment described below mainly adopts methods (a) and (b) below to rank the facet keys suited to the category that are displayed in the item list area 210.

(a) Introduction of an index that emphasizes category-specific facet keys.
(b) Semantic interpretation of facet keys that takes the knowledge structure of the KG into account.

This makes it possible to output a plurality of appropriate facets from the set of facets associated with a category, and to reduce the number of steps the user needs to reach the desired data.
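One way to picture a ranking that combines a category-specificity index with a graph-distance index is the rough sketch below. It is not the calculation used by the embodiment: the functional forms (reciprocal of the category count, reciprocal of one plus the distance), the equal default weight, the function name `rank_facet_keys`, and all sample numbers are assumptions chosen only to show the general idea that facet keys tied to fewer categories and lying closer to the selected category are listed first.

```python
def rank_facet_keys(category_count: dict[str, int],
                    distance: dict[str, int],
                    weight: float = 0.5) -> list[str]:
    """Order facet keys by an assumed priority (highest first)."""
    scores = {}
    for key, n_categories in category_count.items():
        # Assumed first index: fewer associated categories -> more category-specific.
        specificity = 1.0 / n_categories
        # Assumed second index: shorter distance to the category -> closer in meaning.
        closeness = 1.0 / (1.0 + distance[key])
        scores[key] = weight * specificity + (1.0 - weight) * closeness
    # Highest score first; ties are broken by key name for determinism.
    return sorted(scores, key=lambda k: (-scores[k], k))


# Hypothetical inputs for the category "professional baseball player".
CATEGORY_COUNT = {"bats": 1, "birthDate": 3, "birthPlace": 3}
DISTANCE = {"bats": 1, "birthDate": 2, "birthPlace": 2}

order = rank_facet_keys(CATEGORY_COUNT, DISTANCE)
```

With these made-up inputs, "bats" scores 0.75 while "birthDate" and "birthPlace" each score about 0.33, so the category-specific key is listed first.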

[1-2] Configuration Example of Facet Search System
FIG. 4 is a block diagram showing a functional configuration example of the facet search system 1 according to one embodiment. The facet search system 1 is an example of a search system that performs facet search and, as shown in FIG. 4, may illustratively include a server 2, a knowledge graph (KG) 3, and one or more terminals 4 (one in the example of FIG. 4).

The KG 3 is an example of a knowledge base and may store, for example, data described in the RDF description format.

The terminal 4 is an example of an information processing terminal used by a user of the facet search system 1, and is a computer such as a PC (Personal Computer) or a server that accesses the server 2 in connection with facet search.

The KG 3 and the server 2, and the terminal 4 and the server 2, may each be connected so as to be able to communicate with each other via a network (not shown). The network may include a WAN (Wide Area Network), a LAN (Local Area Network), or a combination thereof. The WAN may include the Internet, and the LAN may include a VPN (Virtual Private Network).

The server 2 is an example of a search device, an information processing device, or a computer. For example, in the facet search system 1, the server 2 performs various processes, such as referring to the KG 3, responding to the terminal 4, and notifying it of information, in response to various accesses from the terminal 4 related to facet search.

The server 2 may provide, for example, a function that enables the terminal 4 to access it. Examples of this function include the generation and display control of screens, such as web pages, that the terminal 4 uses for access. For example, the terminal 4 may transmit an access request to the server 2 using an application such as a browser, and may access the server 2 via a web page that the application displays based on screen information received from the server 2.

The server 2 may be a virtual server (VM; Virtual Machine) or a physical server. The functions of the server 2 may be implemented by one computer or by two or more computers. Furthermore, at least some of the functions of the server 2 may be implemented using HW (hardware) resources and NW (network) resources provided by a cloud environment.

(Hardware configuration example)
FIG. 5 is a block diagram showing a hardware (HW) configuration example of a computer 10 that implements the functions of the server 2. When a plurality of computers are used as the HW resources that implement the functions of the server 2, each computer may have the HW configuration illustrated in FIG. 5.

As shown in FIG. 5, the computer 10 may illustratively include, as its HW configuration, a processor 10a, a memory 10b, a storage unit 10c, an IF (interface) unit 10d, an I/O (input/output) unit 10e, and a reading unit 10f.

The processor 10a is an example of an arithmetic processing unit that performs various kinds of control and computation. The processor 10a may be connected to each block in the computer 10 via a bus 10i so as to be able to communicate with it. The processor 10a may be a multiprocessor including a plurality of processors, a multi-core processor having a plurality of processor cores, or a configuration having a plurality of multi-core processors.

Examples of the processor 10a include integrated circuits (ICs) such as a CPU, an MPU, a GPU, an APU, a DSP, an ASIC, and an FPGA. A combination of two or more of these integrated circuits may be used as the processor 10a. CPU is an abbreviation for Central Processing Unit, and MPU is an abbreviation for Micro Processing Unit. GPU is an abbreviation for Graphics Processing Unit, and APU is an abbreviation for Accelerated Processing Unit. DSP is an abbreviation for Digital Signal Processor, ASIC is an abbreviation for Application Specific IC, and FPGA is an abbreviation for Field-Programmable Gate Array.

The memory 10b is an example of HW that stores information such as various data and programs. Examples of the memory 10b include one or both of a volatile memory such as DRAM (Dynamic Random Access Memory) and a non-volatile memory such as PM (Persistent Memory).

記憶部１０ｃは、種々のデータやプログラム等の情報を格納するＨＷの一例である。記憶部１０ｃとしては、ＨＤＤ（Hard Disk Drive）等の磁気ディスク装置、ＳＳＤ（Solid State Drive）等の半導体ドライブ装置、不揮発性メモリ等の各種記憶装置が挙げられる。不揮発性メモリとしては、例えば、フラッシュメモリ、ＳＣＭ（Storage Class Memory）、ＲＯＭ（Read Only Memory）等が挙げられる。 The storage unit 10c is an example of HW that stores information such as various data and programs. Examples of the storage unit 10c include a magnetic disk device such as an HDD (Hard Disk Drive), a semiconductor drive device such as an SSD (Solid State Drive), and various storage devices such as a non-volatile memory. Examples of the non-volatile memory include a flash memory, an SCM (Storage Class Memory), a ROM (Read Only Memory), and the like.

また、記憶部１０ｃは、コンピュータ１０の各種機能の全部若しくは一部を実現するプログラム１０ｇ（検索プログラム）を格納してよい。例えば、サーバ２のプロセッサ１０ａは、記憶部１０ｃに格納されたプログラム１０ｇをメモリ１０ｂに展開して実行することにより、図４に例示するサーバ２としての機能を実現できる。 Further, the storage unit 10c may store a program 10g (search program) that realizes all or a part of various functions of the computer 10. For example, the processor 10a of the server 2 can realize the function as the server 2 illustrated in FIG. 4 by expanding and executing the program 10g stored in the storage unit 10c in the memory 10b.

ＩＦ部１０ｄは、ネットワークとの間の接続及び通信の制御等を行なう通信ＩＦの一例である。例えば、ＩＦ部１０ｄは、イーサネット（登録商標）等のＬＡＮ（Local Area Network）、或いは、ＦＣ（Fibre Channel）等の光通信等に準拠したアダプタを含んでよい。当該アダプタは、無線及び有線の一方又は双方の通信方式に対応してよい。例えば、サーバ２は、ＩＦ部１０ｄを介して、ＫＧ３及び端末４のそれぞれと相互に通信可能に接続されてよい。また、例えば、プログラム１０ｇは、当該通信ＩＦを介して、ネットワークからコンピュータ１０にダウンロードされ、記憶部１０ｃに格納されてもよい。 The IF unit 10d is an example of a communication IF that controls connection with a network and communication. For example, the IF unit 10d may include an adapter compliant with LAN (Local Area Network) such as Ethernet (registered trademark) or optical communication such as FC (Fibre Channel). The adapter may support one or both wireless and wired communication methods. For example, the server 2 may be connected to each of the KG3 and the terminal 4 so as to be able to communicate with each other via the IF unit 10d. Further, for example, the program 10g may be downloaded from the network to the computer 10 via the communication IF and stored in the storage unit 10c.

Ｉ／Ｏ部１０ｅは、入力装置、及び、出力装置、の一方又は双方を含んでよい。入力装置としては、例えば、キーボード、マウス、タッチパネル等が挙げられる。出力装置としては、例えば、モニタ、プロジェクタ、プリンタ等が挙げられる。 The I / O unit 10e may include one or both of an input device and an output device. Examples of the input device include a keyboard, a mouse, a touch panel, and the like. Examples of the output device include a monitor, a projector, a printer and the like.

読取部１０ｆは、記録媒体１０ｈに記録されたデータやプログラムの情報を読み出すリーダの一例である。読取部１０ｆは、記録媒体１０ｈを接続可能又は挿入可能な接続端子又は装置を含んでよい。読取部１０ｆとしては、例えば、ＵＳＢ（Universal Serial Bus）等に準拠したアダプタ、記録ディスクへのアクセスを行なうドライブ装置、ＳＤカード等のフラッシュメモリへのアクセスを行なうカードリーダ等が挙げられる。なお、記録媒体１０ｈにはプログラム１０ｇが格納されてもよく、読取部１０ｆが記録媒体１０ｈからプログラム１０ｇを読み出して記憶部１０ｃに格納してもよい。 The reading unit 10f is an example of a reader that reads data and program information recorded on the recording medium 10h. The reading unit 10f may include a connection terminal or device to which the recording medium 10h can be connected or inserted. Examples of the reading unit 10f include an adapter compliant with USB (Universal Serial Bus) and the like, a drive device for accessing a recording disk, a card reader for accessing a flash memory such as an SD card, and the like. The program 10g may be stored in the recording medium 10h, or the reading unit 10f may read the program 10g from the recording medium 10h and store it in the storage unit 10c.

記録媒体１０ｈとしては、例示的に、磁気／光ディスクやフラッシュメモリ等の非一時的なコンピュータ読取可能な記録媒体が挙げられる。磁気／光ディスクとしては、例示的に、フレキシブルディスク、ＣＤ（Compact Disc）、ＤＶＤ（Digital Versatile Disc）、ブルーレイディスク、ＨＶＤ（Holographic Versatile Disc）等が挙げられる。フラッシュメモリとしては、例示的に、ＵＳＢメモリやＳＤカード等の半導体メモリが挙げられる。 Examples of the recording medium 10h include non-transitory computer-readable recording media such as magnetic/optical disks and flash memories. Examples of magnetic/optical disks include flexible disks, CDs (Compact Discs), DVDs (Digital Versatile Discs), Blu-ray discs, and HVDs (Holographic Versatile Discs). Examples of flash memories include semiconductor memories such as USB memories and SD cards.

上述したコンピュータ１０のＨＷ構成は例示である。従って、コンピュータ１０内でのＨＷの増減（例えば任意のブロックの追加や削除）、分割、任意の組み合わせでの統合、又は、バスの追加若しくは削除等は適宜行なわれてもよい。例えば、サーバ２において、Ｉ／Ｏ部１０ｅ及び読取部１０ｆの少なくとも一方は、省略されてもよい。 The HW configuration of the computer 10 described above is an example. Therefore, the increase / decrease of HW (for example, addition or deletion of arbitrary blocks), division, integration in any combination, addition or deletion of buses, etc. in the computer 10 may be performed as appropriate. For example, in the server 2, at least one of the I / O unit 10e and the reading unit 10f may be omitted.

なお、情報処理端末の一例である端末４は、上述したコンピュータ１０と同様のＨＷ構成により実現されてよい。 The terminal 4, which is an example of the information processing terminal, may be realized by the same HW configuration as the computer 10 described above.

例えば、端末４のプロセッサ１０ａは、記憶部１０ｃに格納されたプログラム１０ｇをメモリ１０ｂに展開して実行することにより、図４に示す端末４としての機能を実現できる。 For example, the processor 10a of the terminal 4 can realize the function as the terminal 4 shown in FIG. 4 by expanding and executing the program 10g stored in the storage unit 10c in the memory 10b.

なお、図４に示す端末４は、Ｉ／Ｏ部１０ｅの一例である入力装置及び表示装置を備えてよい。例えば、端末４のプロセッサ１０ａは、ＩＦ部１０ｄを介してサーバ２から受信した情報に基づき、各画面を表示装置に表示してよい。また、端末４のプロセッサ１０ａは、入力された情報を、ＩＦ部１０ｄを介してサーバ２に送信してよい。 The terminal 4 shown in FIG. 4 may include an input device and a display device, which are examples of the I / O unit 10e. For example, the processor 10a of the terminal 4 may display each screen on the display device based on the information received from the server 2 via the IF unit 10d. Further, the processor 10a of the terminal 4 may transmit the input information to the server 2 via the IF unit 10d.

（機能構成例） (Functional configuration example)
図４の説明に戻り、サーバ２は、例示的に、メモリ部２１、検索制御部２２、統計処理部２３、意味解釈処理部２４、及びランキング調整部２５を備えてよい。 Returning to the description of FIG. 4, the server 2 may illustratively include a memory unit 21, a search control unit 22, a statistical processing unit 23, a semantic interpretation processing unit 24, and a ranking adjustment unit 25.

メモリ部２１は、記憶領域の一例であり、ファセットの検索に関する種々の情報を記憶する。図４に示すように、メモリ部２１は、例示的に、頻度表２１ａ及びスコアＤＢ２１ｂを記憶してよい。以下の説明では、便宜上、頻度表２１ａ及びスコアＤＢ２１ｂのデータ形式をテーブル形式として説明するが、これに限定されるものではなく、種々のＤＢのデータ形式であってよい。 The memory unit 21 is an example of a storage area, and stores various information related to facet search. As shown in FIG. 4, the memory unit 21 may illustratively store a frequency table 21a and a score DB 21b. In the following description, for convenience, the frequency table 21a and the score DB 21b are described as being in a table format, but they are not limited to this and may be in any of various DB data formats.

なお、頻度表２１ａ及びスコアＤＢ２１ｂは、例えば、図５に示すメモリ１０ｂ及び記憶部１０ｃの少なくとも１つが有する記憶領域に格納されてよい。換言すれば、メモリ部２１は、メモリ１０ｂ及び記憶部１０ｃの少なくとも１つが有する記憶領域により実現されてよい。 The frequency table 21a and the score DB 21b may be stored in, for example, a storage area of at least one of the memory 10b and the storage unit 10c shown in FIG. In other words, the memory unit 21 may be realized by a storage area included in at least one of the memory 10b and the storage unit 10c.

検索制御部２２は、端末４に対して、図３に例示するカテゴリ探索画面１００及びファセット検索画面２００を含むＵＩを提供する。例えば、検索制御部２２は、カテゴリ探索画面１００及びファセット検索画面２００のそれぞれの画面情報を生成し、端末４に送信するとともに、端末４でＵＩを介して入力された文字列や選択項目を示す制御情報を、端末４から受信してよい。 The search control unit 22 provides the terminal 4 with a UI including the category search screen 100 and the facet search screen 200 illustrated in FIG. For example, the search control unit 22 generates screen information for each of the category search screen 100 and the facet search screen 200, transmits the screen information to the terminal 4, and indicates a character string and selection items input via the UI on the terminal 4. The control information may be received from the terminal 4.

一例として、検索制御部２２は、カテゴリ探索画面１００の選択領域１１０及び１２０で選択されたクラス（カテゴリ）を含む制御情報を端末４から受信すると、当該制御情報を統計処理部２３及び意味解釈処理部２４のそれぞれに通知してよい。換言すれば、検索制御部２２は、特定のカテゴリの指定を受け付ける受付部の一例である。 As an example, when the search control unit 22 receives, from the terminal 4, control information including the class (category) selected in the selection areas 110 and 120 of the category search screen 100, the search control unit 22 may notify each of the statistical processing unit 23 and the semantic interpretation processing unit 24 of the control information. In other words, the search control unit 22 is an example of a reception unit that accepts the designation of a specific category.

また、検索制御部２２は、複数のファセットキーを示す情報をランキング調整部２５から通知されると、当該情報に含まれるファセットキーを、その表示順序（並び順）も含めて、項目一覧領域２１０に表示させる。換言すれば、検索制御部２２は、ランキング調整部２５により算出される優先度に応じて、複数のファセットを順に並べて出力する（例えば項目一覧領域２１０に表示する）、出力部の一例である。 Further, when the search control unit 22 is notified by the ranking adjustment unit 25 of information indicating a plurality of facet keys, the search control unit 22 causes the facet keys included in the information to be displayed in the item list area 210, in the display order (arrangement order) given by that information. In other words, the search control unit 22 is an example of an output unit that arranges and outputs a plurality of facets in order (for example, displays them in the item list area 210) according to the priority calculated by the ranking adjustment unit 25.

統計処理部２３は、ファセット検索が行なわれる前の事前フェーズ（準備フェーズ）として、ＫＧ３及び制御情報に基づき、スコアＤＢ２１ｂを作成又は更新する。例えば、統計処理部２３は、頻度表作成部２３ａ及びスコア算出部２３ｂを備えてよい。 The statistical processing unit 23 creates or updates the score DB 21b based on the KG3 and the control information as a preliminary phase (preparation phase) before the facet search is performed. For example, the statistical processing unit 23 may include a frequency table creation unit 23a and a score calculation unit 23b.

頻度表作成部２３ａは、頻度表２１ａを作成する。例えば、頻度表作成部２３ａは、図６に示すように、ＫＧ３から、ファセット検索の対象となる全てのクラス集合２１ａ１を取得する。例えば、頻度表作成部２３ａは、図６に示すクラス取得クエリＱ１を実行し、ＫＧ３からクラス集合２１ａ１を取得してよい。 The frequency table creation unit 23a creates the frequency table 21a. For example, as shown in FIG. 6, the frequency table creation unit 23a acquires, from KG3, the class set 21a1 covering all classes subject to the facet search. For example, the frequency table creation unit 23a may execute the class acquisition query Q1 shown in FIG. 6 to acquire the class set 21a1 from KG3.

そして、頻度表作成部２３ａは、クラス集合２１ａ１に含まれるクラスとＫＧ３とに基づき、ファセットキー頻度表２１ａ２、及び、ファセットキー・ファセット値頻度表２１ａ３（図７参照）を算出する。例えば、頻度表作成部２３ａは、図６に示すファセットキー頻度表クエリＱ２及びファセットキー・ファセット値頻度表クエリＱ３をそれぞれ実行し、ＫＧ３からファセットキー頻度表２１ａ２、及び、ファセットキー・ファセット値頻度表２１ａ３を取得してよい。 Then, the frequency table creation unit 23a calculates the facet key frequency table 21a2 and the facet key/facet value frequency table 21a3 (see FIG. 7) based on KG3 and the classes included in the class set 21a1. For example, the frequency table creation unit 23a may execute the facet key frequency table query Q2 and the facet key/facet value frequency table query Q3 shown in FIG. 6, respectively, to acquire the facet key frequency table 21a2 and the facet key/facet value frequency table 21a3 from KG3.

なお、ＫＧ３において、クラスは、主語（Ｓ）、ファセットキーは、述語（Ｐ）、ファセット値は、目的語（Ｏ）にそれぞれ相当する。このため、ファセットキー頻度表２１ａ２は、Ｐ頻度表２１ａ２と称されてもよく、ファセットキー・ファセット値頻度表２１ａ３は、ＰＯ頻度表２１ａ３と称されてもよい。 In KG3, the class corresponds to the subject (S), the facet key corresponds to the predicate (P), and the facet value corresponds to the object (O). Therefore, the facet key frequency table 21a2 may be referred to as the P frequency table 21a2, and the facet key facet value frequency table 21a3 may be referred to as the PO frequency table 21a3.

図６に示す各クエリＱ１〜Ｑ３は、ＲＤＦ問合せ言語の一例であるＳＰＡＲＱＬを用いたクエリの一例である。クエリＱ１〜Ｑ３において、“?s”は主語（Ｓ）、“?p”は述語（Ｐ）、“?o”は目的語（Ｏ）に相当し、クエリＱ２及びＱ３における“%CLASS%”は、クエリＱ１で取得された各クラスによって置換される。 Each of the queries Q1 to Q3 shown in FIG. 6 is an example of a query written in SPARQL, which is an example of an RDF query language. In the queries Q1 to Q3, "?s" corresponds to the subject (S), "?p" to the predicate (P), and "?o" to the object (O), and "%CLASS%" in the queries Q2 and Q3 is replaced by each class acquired by the query Q1.
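The placeholder substitution described above can be sketched as follows. This is only an illustration: the actual queries Q1 to Q3 appear in FIG. 6, which is not reproduced here, so the template text `Q2_TEMPLATE`, the helper `instantiate`, and the example class IRIs are all hypothetical stand-ins.

```python
# Hypothetical stand-in for the facet key frequency query Q2 (the real query is in FIG. 6).
# It counts, per predicate ?p, the triples whose subject is an instance of %CLASS%.
Q2_TEMPLATE = """SELECT ?p (COUNT(?s) AS ?freq)
WHERE { ?s a <%CLASS%> ; ?p ?o . }
GROUP BY ?p"""


def instantiate(template: str, class_iri: str) -> str:
    """Replace the %CLASS% placeholder with one class obtained by query Q1."""
    return template.replace("%CLASS%", class_iri)


# Hypothetical result of the class acquisition query Q1.
classes = [
    "http://example.org/Politician",
    "http://example.org/BaseballPlayer",
]

# One concrete query per class, ready to be sent to a SPARQL endpoint.
queries = [instantiate(Q2_TEMPLATE, c) for c in classes]
```

Issuing one query per class keeps each result set scoped to a single category, which matches the per-category frequency tables described in the text.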

図７は、クラス集合２１ａ１、Ｐ頻度表２１ａ２、及び、ＰＯ頻度表２１ａ３の一例を示す図である。 FIG. 7 is a diagram showing an example of the class set 21a1, the P frequency table 21a2, and the PO frequency table 21a3.

図７に例示するように、クラス集合２１ａ１は、ＫＧ３から、“政治家”等のカテゴリごとに、“名前”、“所属政党”等のファセットキー（Ｐ）を抽出した情報である。 As illustrated in FIG. 7, the class set 21a1 is information in which facet keys (P) such as "name" and "affiliated political party" are extracted from KG3 for each category such as "politician".

Ｐ頻度表２１ａ２は、ＫＧ３から、クラス集合２１ａ１に含まれるファセットキー（Ｐ）ごとに、ＫＧ３における「頻度」（例えば、クエリＱ２で得られたレコードの「数」）を抽出した情報である。 The P frequency table 21a2 is information obtained by extracting the "frequency" in the KG3 (for example, the "number" of the records obtained in the query Q2) for each facet key (P) included in the class set 21a1 from the KG3.

ＰＯ頻度表２１ａ３は、ＫＧ３から、クラス集合２１ａ１に含まれるファセットキー（Ｐ）ごと、且つ、ファセット値（Ｏ）ごとに、ＫＧ３における「頻度」（例えば、クエリＱ３で得られたレコードの「数」）を抽出した情報である。 The PO frequency table 21a3 shows the "frequency" in KG3 (for example, the "number" of the records obtained by the query Q3) for each facet key (P) and each facet value (O) included in the class set 21a1 from KG3. ") Is the extracted information.

頻度表作成部２３ａは、クラス集合２１ａ１、Ｐ頻度表２１ａ２、及び、ＰＯ頻度表２１ａ３のうちの少なくともＰＯ頻度表２１ａ３の情報を、頻度表２１ａとしてメモリ部２１に格納してよい。なお、頻度表２１ａとして、少なくともＰＯ頻度表２１ａ３の情報を格納するものとしたのは、ＰＯ頻度表２１ａ３のファセット値をファセットキー単位で合計することでＰ頻度表２１ａ２を導出可能だからである。 The frequency table creation unit 23a may store, in the memory unit 21 as the frequency table 21a, at least the information of the PO frequency table 21a3 among the class set 21a1, the P frequency table 21a2, and the PO frequency table 21a3. At least the PO frequency table 21a3 is stored because the P frequency table 21a2 can be derived from it by summing, for each facet key, the frequencies of that key's facet values.
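The derivation of the P frequency table from the PO frequency table can be sketched as follows. The table layout (a mapping keyed by category, facet key, and facet value) and the sample counts are assumptions made for illustration; the actual table format is the one shown in FIG. 7.

```python
from collections import defaultdict


def derive_p_frequency(po_frequency):
    """Sum PO frequencies per (category, facet key) to obtain P frequencies."""
    p_frequency = defaultdict(int)
    for (category, facet_key, _facet_value), count in po_frequency.items():
        p_frequency[(category, facet_key)] += count
    return dict(p_frequency)


# Hypothetical PO frequency table: (category, facet key, facet value) -> frequency.
po_frequency = {
    ("Politician", "party", "Party A"): 3,
    ("Politician", "party", "Party B"): 2,
    ("Politician", "name", "Alice"): 1,
}

p_frequency = derive_p_frequency(po_frequency)
```

Because the aggregation is a plain group-by sum, storing only the PO table loses no information needed for the P table.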

スコア算出部２３ｂは、頻度表作成部２３ａが作成した頻度表２１ａに基づき、スコアＤＢ２１ｂを算出する。 The score calculation unit 23b calculates the score DB 21b based on the frequency table 21a created by the frequency table creation unit 23a.

図８は、各指標の算出式の一例を示す図であり、図９は、指標値２１ｂ１の一例を示す図である。例えば、スコア算出部２３ｂは、下記式（１）〜式（４）に例示する各指標の算出式（図８参照）を用いて、図９に例示する指標値２１ｂ１を算出してよい。

FIG. 8 is a diagram showing an example of a calculation formula for each index, and FIG. 9 is a diagram showing an example of an index value 21b1. For example, the score calculation unit 23b may calculate the index value 21b1 illustrated in FIG. 9 by using the calculation formulas (see FIG. 8) of each index exemplified in the following formulas (1) to (4).

上記式（１）に示すファセット頻度（freq(f)）は、ＫＧ３の検索対象データ全体でのファセットの出現頻度を示す指標である。ファセット頻度が大きいファセットほど、多くの検索対象データに出現することを意味し、ファセット頻度が小さいファセットほど、検索対象データ内で出現する頻度がより局所的であることを意味する。上記式（１）において、“N”は、検索対象のデータの全体数であり、“n(facet)”は、ファセットあたりの検索対象のデータ数である。 The facet frequency (freq (f)) shown in the above formula (1) is an index showing the appearance frequency of facets in the entire search target data of KG3. A facet with a higher facet frequency means that it appears in more search target data, and a facet with a lower facet frequency means that it appears more locally in the search target data. In the above equation (1), "N" is the total number of data to be searched, and "n (facet)" is the number of data to be searched per facet.

上記式（２）に示すファセット均衡度（bala(f)）は、ファセットごとの検索できるデータ数の分布である。ファセット均衡度の大きいファセットほど、検索できるデータの範囲が広く、バランスのよいファセットであり、ファセット均衡度の小さいファセットほど、検索対象データ内でのファセットキーの出現の偏りが大きいことを意味する。上記式（２）において、“n(key_i)”はファセットキー“key_i”において検索できる検索対象データ数であり、“n_key”はファセット“f”におけるキーの数であり、“μ”は“n(key_i)”の平均である。 The facet equilibrium degree (bala(f)) shown in the above equation (2) represents the distribution of the number of data items that can be retrieved for each facet. A facet with a larger facet equilibrium degree covers a wider range of retrievable data and is a well-balanced facet, whereas a facet with a smaller facet equilibrium degree means that the appearance of its facet keys in the search target data is more biased. In the above equation (2), "n(key_i)" is the number of search target data items that can be retrieved with the facet key "key_i", "n_key" is the number of keys in the facet "f", and "μ" is the mean of "n(key_i)".

上記式（３）に示すキー濃度（card(f)）は、各ファセットにおけるファセットキーの数の分布である。上記式（３）において、“n_key”はファセット“f”におけるキーの数であり、“μ”は“n_key”の平均であり、“σ²”は分散である。 The key density (card(f)) shown in the above equation (3) represents the distribution of the number of facet keys in each facet. In the above equation (3), "n_key" is the number of keys in the facet "f", "μ" is the mean of "n_key", and "σ²" is the variance.

上記式（４）に示すキー単調性（mono(f)）は、ファセットキーの単調性の指標であり、或るファセットにおけるファセットキーごとに検索できるデータ数の平均を評価する指標である。上記式（４）において、“avg”は“n(key_i)”の平均であり、“μ”及び“σ²”はその平均及び分散である。 The key monotonicity (mono(f)) shown in the above equation (4) is an index of the monotonicity of the facet keys, and evaluates the average number of data items that can be retrieved per facet key in a given facet. In the above equation (4), "avg" is the average of "n(key_i)", and "μ" and "σ²" are its mean and variance.

なお、上記式（１）〜式（４）は、例えば非特許文献２に記載された各指標の算出式と同様であり、これらの詳細な説明を省略する。 The above formulas (1) to (4) are the same as the calculation formulas for each index described in, for example, Non-Patent Document 2, and detailed description thereof will be omitted.

スコア算出部２３ｂは、頻度表２１ａと上記式（１）〜（４）とに基づき、図９に例示するように、ファセットキーごとの指標値２１ｂ１を算出してよい。 The score calculation unit 23b may calculate the index value 21b1 for each facet key as illustrated in FIG. 9 based on the frequency table 21a and the above equations (1) to (4).

また、スコア算出部２３ｂは、頻度表２１ａと、指標値２１ｂ１とに基づいて、カテゴリごと、且つ、ファセットキーごとにカテゴリ重要度（signif(f, C)）を算出する。カテゴリ重要度（signif(f, C)）は、クラス内のファセットの出現頻度を考慮した指標であって、カテゴリ固有のファセットキーを重要視する指標である。 Further, the score calculation unit 23b calculates the category importance (signif (f, C)) for each category and for each facet key based on the frequency table 21a and the index value 21b1. The category importance (signif (f, C)) is an index that considers the frequency of appearance of facets in the class, and is an index that emphasizes the facet key unique to the category.

図１０は、カテゴリ重要度（signif(f, C)）を表形式で表したカテゴリ重要度表２１ｂ２の一例を示す図である。 FIG. 10 is a diagram showing an example of the category importance table 21b2 in which the category importance (signif (f, C)) is represented in a tabular format.

例えば、スコア算出部２３ｂは、下記式（５）に基づきカテゴリ重要度（signif(f, C)）を算出してよい。 For example, the score calculation unit 23b may calculate the category importance (signif(f, C)) based on the following equation (5).
signif(f, C) = weight(f, C) * uniq(f) （５）

上記式（５）において、“weight(f, C)”は、クラス内のファセットの出現頻度の重みであり、例えば、下記式（６）により表されてよい。 In the above equation (5), "weight(f, C)" is the weight of the appearance frequency of the facet within the class, and may be expressed, for example, by the following equation (6).
weight(f, C) = n(f, C) / N_d （６）

ここで、上記式（６）において、“N_d”はクラス“C”内の“f”の総数であり、“n(f, C)”はクラス“C”内のファセットの数である。 Here, in the above equation (6), "N_d" is the total number of "f" in the class "C", and "n(f, C)" is the number of facets in the class "C".

また、上記式（５）において、“uniq(f)”は、ファセットごとのユニーク度を示す指標である。ユニーク度とは、ファセットが複数のクラスのうちの特定のクラスに偏って（例えば特定のクラスのみに）出現するか否かを示す指標である。例えば、“uniq(f)”は、下記式（７）により表されてよい。 Further, in the above equation (5), "uniq(f)" is an index indicating the uniqueness of each facet. The uniqueness indicates whether or not a facet appears disproportionately in a specific class (for example, only in a specific class) among a plurality of classes. For example, "uniq(f)" may be expressed by the following equation (7).
uniq(f) = NC / uniq_count(f) （７）

ここで、上記式（７）において、“NC”はＫＧ３内の総クラス数であり、“uniq_count(f)”はＫＧ３内の“f”が含まれるクラス数である。このように、ユニーク度は、ＫＧ３内のクラス総数をファセットが出現するクラス数で除算することで得られてよい。 Here, in the above equation (7), "NC" is the total number of classes in KG3, and "uniq_count (f)" is the number of classes including "f" in KG3. In this way, the uniqueness may be obtained by dividing the total number of classes in KG3 by the number of classes in which facets appear.

このように、カテゴリ重要度（signif(f, C)）は、ＫＧ３における特定のカテゴリに関連付けられた複数のファセットのそれぞれがＫＧ３において関連付けられているカテゴリの個数に基づく第１指標の一例である。 Thus, the category importance (signif(f, C)) is an example of a first index based on the number of categories with which each of the plurality of facets associated with a specific category in KG3 is associated in KG3.

スコア算出部２３ｂは、頻度表２１ａ及び上記式（５）を用いて、ファセットキーごとのカテゴリ重要度（signif(f, C)）を算出してよい。なお、図１０に示すように、スコア算出部２３ｂは、カテゴリ重要度（signif(f, C)）を“0”〜“1”の範囲に正規化する。 The score calculation unit 23b may calculate the category importance (signif (f, C)) for each facet key by using the frequency table 21a and the above formula (5). As shown in FIG. 10, the score calculation unit 23b normalizes the category importance (signif (f, C)) to the range of “0” to “1”.
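Equations (5) to (7) and the normalization step can be sketched as follows. Several points are assumptions rather than statements of the source: N_d is read here as the total facet occurrence count within a class, n(f, C) as the occurrence count of facet key f in class C, and the normalization to the range 0 to 1 is done by dividing by the maximum raw value (the text does not specify the method). The function name and sample counts are hypothetical.

```python
from collections import defaultdict


def category_importance(counts):
    """Compute normalized signif(f, C) from counts[(category, facet_key)] = n(f, C).

    Assumes all counts are positive and that `counts` is non-empty.
    """
    total_classes = len({c for c, _ in counts})        # NC in equation (7)
    per_class_total = defaultdict(int)                 # assumed reading of N_d
    classes_with_facet = defaultdict(set)              # basis of uniq_count(f)
    for (c, f), n in counts.items():
        per_class_total[c] += n
        classes_with_facet[f].add(c)

    raw = {}
    for (c, f), n in counts.items():
        weight = n / per_class_total[c]                        # equation (6)
        uniq = total_classes / len(classes_with_facet[f])      # equation (7)
        raw[(c, f)] = weight * uniq                            # equation (5)

    peak = max(raw.values())
    # Assumed normalization: scale so that the largest value becomes 1.
    return {key: value / peak for key, value in raw.items()}


# Hypothetical P frequency counts for two categories.
counts = {
    ("Politician", "name"): 10,
    ("Politician", "party"): 10,
    ("BaseballPlayer", "name"): 10,
    ("BaseballPlayer", "position"): 10,
}
signif = category_importance(counts)
```

In this toy input, "name" appears in both categories while "party" and "position" each appear in one, so the category-specific keys receive twice the importance of "name".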

そして、スコア算出部２３ｂは、上述した指標値２１ｂ１に基づくスコア（org_score）と、上記式（５）に基づくスコア（signif）とを用いて、ファセットスコア（new_score）を算出する。 Then, the score calculation unit 23b calculates the facet score (new_score) by using the score (org_score) based on the index value 21b1 described above and the score (signif) based on the above formula (5).

例えば、スコア算出部２３ｂは、下記式（８）に例示するように、スコア（org_score）と、スコア（signif）との重み付き線形和を算出し、ファセットスコア（new_score）を取得してよい。 For example, the score calculation unit 23b may calculate a weighted linear sum of the score (org_score) and the score (signif) to obtain the facet score (new_score), as illustrated in the following equation (8).
new_score(f) = ω * org_score(f) + (1 - ω) * signif(f) （８）

上記式（８）において、“ω”は重みである。一実施形態においては、非限定的な例として、“ω=0.5”であるものとする。“ω=0.5”の場合、上記式（８）は、下記式（８’）のように表される。 In the above equation (8), "ω" is a weight. In one embodiment, as a non-limiting example, "ω = 0.5" is assumed. When "ω = 0.5", the above equation (8) is expressed as the following equation (8').
new_score(f) = 0.5 * org_score(f) + 0.5 * signif(f) （８’）

ここで、上記式（８）又は（８’）において、“org_score(f)”は、下記式（９）により表されてよい。なお、下記式（９）において、α＋β＋γ＋θ＝１であるものとする。

Here, in the above formula (8) or (8'), "org_score (f)" may be expressed by the following formula (9). In the following equation (9), it is assumed that α + β + γ + θ = 1.
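The weighted sums in equations (8) and (9) can be sketched as follows. Equation (9) is not reproduced in this text, so the form used here, a weighted sum of the four indices freq, bala, card, and mono with weights α, β, γ, and θ in that order, is an assumption based only on the constraint α + β + γ + θ = 1; the equal default weights are likewise hypothetical.

```python
def org_score(freq, bala, card, mono, alpha=0.25, beta=0.25, gamma=0.25, theta=0.25):
    """Assumed form of equation (9): weighted sum of the four index values."""
    if abs(alpha + beta + gamma + theta - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma + theta must equal 1")
    return alpha * freq + beta * bala + gamma * card + theta * mono


def new_score(org, signif, omega=0.5):
    """Equation (8): weighted linear sum of org_score and signif."""
    return omega * org + (1.0 - omega) * signif


# Hypothetical index values for one facet key.
base = org_score(freq=0.4, bala=0.4, card=0.4, mono=0.4)
combined = new_score(base, signif=0.8)  # omega = 0.5 corresponds to equation (8')
```

Keeping ω as a parameter makes it straightforward to shift emphasis between the general-purpose indices and the category importance.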

図１１は、ファセットスコア（new_score）を表形式で表したファセットスコア表２１ｂ３の一例を示す図である。例えば、スコア算出部２３ｂは、指標値２１ｂ１、カテゴリ重要度表２１ｂ２、及び、ファセットスコア表２１ｂ３のうちの少なくともファセットスコア表２１ｂ３の情報を、スコアＤＢ２１ｂとしてメモリ部２１に格納してよい。 FIG. 11 is a diagram showing an example of a facet score table 21b3 in which the facet score (new_score) is represented in a tabular format. For example, the score calculation unit 23b may store, in the memory unit 21 as the score DB 21b, at least the information of the facet score table 21b3 among the index value 21b1, the category importance table 21b2, and the facet score table 21b3.

図１１に例示するように、クラス（カテゴリ）内でのファセットの重要度を加味したスコア（new_score）により、例えば、“政治家”については、“名前”や“生年月日”よりも“所属政党”の方が高いスコアとなり、“プロ野球選手”については、“名前”や“生年月日”よりも“守備位置”の方が高いスコアとなる。 As illustrated in FIG. 11, according to the score (new_score) that takes into account the importance of facets within the class (category), for example, for "politician", "affiliation" rather than "name" or "date of birth" "Political party" has a higher score, and for "professional baseball player", "defensive position" has a higher score than "name" and "date of birth".

すなわち、図３に示す項目一覧領域２１０において、“政治家”についての“所属政党”や、“プロ野球選手”についての“守備位置”等のファセットキーが優先度の高い項目として表示されることになる。従って、サーバ２は、カテゴリに関連付けられたファセットの集合から、よりスコアの高いファセットキーにより絞り込まれた適切な複数のファセットを出力することができる。 That is, in the item list area 210 shown in FIG. 3, facet keys such as "affiliated party" for "politician" and "defensive position" for "professional baseball player" are displayed as high priority items. become. Therefore, the server 2 can output a plurality of appropriate facets narrowed down by the facet key having a higher score from the set of facets associated with the category.
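Ordering facet keys by score for display can be sketched as follows. The score values are hypothetical and are not taken from FIG. 11, and the function covers only the sorting step, not any further adjustment performed by the ranking adjustment unit 25.

```python
def rank_facet_keys(scores, top_k=None):
    """Return facet keys sorted by descending score, optionally truncated to top_k."""
    ranked = sorted(scores, key=lambda key: scores[key], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Hypothetical new_score values for the category "politician".
politician_scores = {
    "name": 0.31,
    "date of birth": 0.28,
    "affiliated party": 0.74,
}

display_order = rank_facet_keys(politician_scores)
```

With these sample values, the category-specific key "affiliated party" is placed ahead of the generic keys "name" and "date of birth".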

このように、スコア算出部２３ｂは、複数のファセットのそれぞれについてのuniq(f)を含むカテゴリ重要度を算出し、算出したカテゴリ重要度を含む情報（例えばスコアＤＢ２１ｂ）をメモリ部２１に格納する算出部の一例である。 In this way, the score calculation unit 23b calculates the category importance including uniq (f) for each of the plurality of facets, and stores the information including the calculated category importance (for example, the score DB 21b) in the memory unit 21. This is an example of the calculation unit.

図４の説明に戻り、意味解釈処理部２４は、ファセット検索が行なわれる際に、カテゴリ探索画面１００（図３参照）において選択されたクラス（カテゴリ）から、或るファセットが所属するクラスまでのＫＧ３上のパスを、スコアに反映する。このように、意味解釈処理部２４は、ファセット検索が行なわれる際に、選択されたクラスに応じて、オントロジーを考慮した優先度の高いファセットが抽出されるように、スコアを変更する。 Returning to the description of FIG. 4, when a facet search is performed, the semantic interpretation processing unit 24 reflects in the score the path on KG3 from the class (category) selected on the category search screen 100 (see FIG. 3) to the class to which a given facet belongs. In this way, when a facet search is performed, the semantic interpretation processing unit 24 changes the score according to the selected class so that facets with a high priority in light of the ontology are extracted.

例えば、意味解釈処理部２４は、下記式（１０）に示すように、オントロジーにおけるグラフ上の距離を考慮したスコア（ont_score）を算出してよい。スコア（ont_score）は、ＫＧ３における選択されたクラスと複数のファセットのそれぞれとの距離に基づく第２指標の一例である。 For example, the semantic interpretation processing unit 24 may calculate a score (ont_score) that takes into account the distance on the graph in the ontology, as shown in the following equation (10). The score (ont_score) is an example of a second index based on the distance between the selected class and each of the plurality of facets in KG3.
ont_score(f, C) = 1 / (distance(C, C_f) + 1) （１０）

ここで、上記式（１０）において、“f”はスコアを計算する対象のファセットであり、“C”は項目一覧領域２１０で選択されたクラス（カテゴリ）であり、“C_f”は、項目一覧領域２１０に表示される候補のファセットが所属するクラスである。 Here, in the above equation (10), "f" is the facet for which the score is calculated, "C" is the class (category) selected in the item list area 210, and "C_f" is the class to which a candidate facet displayed in the item list area 210 belongs.

また、上記式（１０）において、“distance(C, C_f)”は、選択されたクラスと選択されたファセットが所属するクラスとの間のスキーマ（オントロジー）上、すなわちグラフ上の距離である。グラフ上の距離とは、クラス（カテゴリ）の階層間の距離を意味してよい。例えば、意味解釈処理部２４は、階層的な各クラスを、木構造における各ノードと捉え、既知の手法により、ノード間の距離をグラフ上の距離として算出してよい。なお、“C = C_f”である場合、“distance(C, C_f) = 0”となる。 Further, in the above equation (10), "distance(C, C_f)" is the distance on the schema (ontology), that is, on the graph, between the selected class and the class to which the selected facet belongs. The distance on the graph may mean the distance between levels of the class (category) hierarchy. For example, the semantic interpretation processing unit 24 may treat each hierarchical class as a node in a tree structure and calculate the distance between nodes as the distance on the graph by a known method. When "C = C_f", "distance(C, C_f) = 0".

図１２は、ＲＤＦスキーマにおける、クラス及びファセットキーを表すグラフの一例を示す図である。以下、図１２の例において、選択されたカテゴリが“BaseballPlayer”（破線参照）である場合の、スコア（ont_score）の算出例を説明する。 FIG. 12 is a diagram showing an example of a graph representing a class and a facet key in RDF Schema. Hereinafter, in the example of FIG. 12, an example of calculating the score (ont_score) when the selected category is “BaseballPlayer” (see the broken line) will be described.

一例として、スコア算出対象のファセットが“名前（人）”である場合、スコア（ont_score）は、下記式（１１）に示すように算出される。 As an example, when the facet for which the score is calculated is "name (person)", the score (ont_score) is calculated as shown in the following equation (11).
ont_score(name (person), BaseballPlayer)
= 1 / (distance(BaseballPlayer, Person) + 1)
= 1 / (1 + 1) = 0.5 （１１）

他の例として、スコア算出対象のファセットが“本社所在地”である場合、スコア（ont_score）は、下記式（１２）に示すように算出される。 As another example, when the facet for which the score is calculated is "head office location", the score (ont_score) is calculated as shown in the following equation (12).
ont_score(head office location, BaseballPlayer)
= 1 / (distance(BaseballPlayer, Company) + 1)
= 1 / (3 + 1) = 0.25 （１２）

他の例として、スコア算出対象のファセットが“所属政党”である場合、スコア（ont_score）は、下記式（１３）に示すように算出される。 As another example, when the facet for which the score is calculated is "affiliated political party", the score (ont_score) is calculated as shown in the following equation (13).
ont_score(affiliated political party, BaseballPlayer)
= 1 / (distance(BaseballPlayer, Politician) + 1)
= 1 / (2 + 1) = 0.33 （１３）
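The three worked examples above can be reproduced with a short sketch that measures the distance between classes in a tree and applies equation (10). The class hierarchy below is hypothetical: FIG. 12 is not reproduced here, and this particular tree (with a root named "Thing") was chosen only because it yields the distances 1, 3, and 2 used in equations (11) to (13). The text leaves the distance algorithm to known methods; walking up to the nearest common ancestor is one such method.

```python
# Hypothetical class hierarchy (child -> parent); not taken from FIG. 12.
PARENT = {
    "Person": "Thing",
    "Company": "Thing",
    "BaseballPlayer": "Person",
    "Politician": "Person",
}


def ancestors(cls):
    """Return the path from cls up to the root, starting with cls itself."""
    path = [cls]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def distance(a, b):
    """Number of edges between classes a and b via their nearest common ancestor."""
    steps_from_a = {c: i for i, c in enumerate(ancestors(a))}
    for steps_from_b, c in enumerate(ancestors(b)):
        if c in steps_from_a:
            return steps_from_a[c] + steps_from_b
    raise ValueError("classes do not share a common ancestor")


def ont_score(selected_class, facet_class):
    """Equation (10): score decreases as the graph distance grows."""
    return 1.0 / (distance(selected_class, facet_class) + 1)


score_name = ont_score("BaseballPlayer", "Person")       # equation (11)
score_office = ont_score("BaseballPlayer", "Company")    # equation (12)
score_party = ont_score("BaseballPlayer", "Politician")  # equation (13)
```

Adding 1 to the distance in the denominator keeps the score finite and equal to 1 when the facet belongs to the selected class itself.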

このように、意味解釈処理部２４は、オントロジーにおける関連度が大きいファセットキー、一例として、グラフ上の距離が近い（distance(C, C_f)が小さい）ファセットキーほど、優先度が高くなるようなスコア（ont_score）を算出する。また、意味解釈処理部２４は、オントロジーにおける関連度が小さいファセットキー、一例として、グラフ上の距離が遠い（distance(C, C_f)が大きい）ファセットキーほど、優先度が低くなるようなスコア（ont_score）を算出する。換言すれば、スコア（ont_score）は、ＫＧ３の知識構造を考慮したファセットキーの意味解釈が反映されたスコアであるといえる。 In this way, the semantic interpretation processing unit 24 calculates the score (ont_score) so that a facet key with higher relevance in the ontology, for example a facet key at a shorter distance on the graph (smaller distance(C, C_f)), receives a higher priority. Conversely, the semantic interpretation processing unit 24 calculates the score (ont_score) so that a facet key with lower relevance in the ontology, for example a facet key at a longer distance on the graph (larger distance(C, C_f)), receives a lower priority. In other words, the score (ont_score) reflects a semantic interpretation of the facet keys that takes the knowledge structure of KG3 into account.

なお、意味解釈処理部２４は、ＫＧ３上のファセットキーと選択されたクラスとの間のグラフ上の距離（distance(C, C_f)）に加えて、又は、代えて、意味上の距離を考慮した指標に基づきスコア（ont_score）を算出してもよい。 In addition to, or instead of, the distance on the graph (distance(C, C_f)) between a facet key on KG3 and the selected class, the semantic interpretation processing unit 24 may calculate the score (ont_score) based on an index that takes semantic distance into account.

意味上の距離を考慮した指標（意味上の距離指標）としては、例えば、ファセットキーとなる述語（Ｐ；Predicate）の語彙が、標準語彙であるか否かに応じて定まる指標が挙げられる。標準語彙であるか否かの判断は、例えば、標準語彙を蓄積するＤＢに、ファセットキーとなる語彙が登録されているか否かの判断により行なわれてよい。標準語彙を蓄積するＤＢとしては、例えば、“prefix.cc”や、“Linked Open Vocabularies”等のＤＢが挙げられる。 Examples of the index considering the semantic distance (semantic distance index) include an index determined according to whether or not the vocabulary of the predicate (P; Predicate) serving as the facet key is a standard vocabulary. The determination of whether or not the vocabulary is the standard vocabulary may be made, for example, by determining whether or not the vocabulary serving as the facet key is registered in the DB for accumulating the standard vocabulary. Examples of the DB for accumulating the standard vocabulary include DBs such as "prefix.cc" and "Linked Open Vocabularies".

For example, the semantic interpretation processing unit 24 may calculate a score (ont_score) that takes the semantic distance in the ontology into account, as shown in equation (14) below.
ont_score(f, C) = (1 / (distance(C, C_f) + 1)) * std_vocab(f) (14)

Here, in equation (14), the term (1 / (distance(C, C_f) + 1)) is the same as in equation (10), and std_vocab(f) may be a function that, as shown in equation (15) below, returns "1.0" for a standard vocabulary, "0.5" for a non-standard vocabulary, and so on.
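Equation (15) is referenced here, but its body does not appear in the text. Read from the description, it would be a two-valued function along the following lines. This is a reconstruction under that reading: the set symbol V_std (the vocabularies registered in a standard-vocabulary DB) is introduced here for illustration, and the "and so on" in the description suggests that other values are also possible.

```latex
\mathrm{std\_vocab}(f) =
\begin{cases}
1.0 & \text{if } f \in V_{\mathrm{std}} \\
0.5 & \text{otherwise}
\end{cases}
\tag{15}
```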

For example, as shown in equation (14), the semantic interpretation processing unit 24 may multiply the graph-distance score of equation (10) by the semantic distance index of equation (15) to obtain a score that considers both the graph distance and the semantic distance.

Alternatively, instead of equation (14), the semantic interpretation processing unit 24 may adopt std_vocab(f) of equation (15) as a score (ont_score) that considers only the semantic distance.
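The two options described above, the product of equation (14) and the std_vocab-only score, can be sketched as follows. The prefix set stands in for a lookup against a standard-vocabulary DB and is an assumption made for this example, as are the function and parameter names. The values 1.0 and 0.5 are the ones given in the description.

```python
# Hypothetical registry of standard-vocabulary prefixes (an assumption; a real
# system might consult a DB such as prefix.cc or Linked Open Vocabularies).
STANDARD_PREFIXES = {"foaf", "skos", "rdfs", "dcterms"}

def std_vocab(predicate: str) -> float:
    """Semantic distance index: 1.0 for a standard vocabulary, 0.5 otherwise."""
    prefix = predicate.split(":", 1)[0]
    return 1.0 if prefix in STANDARD_PREFIXES else 0.5

def ont_score(distance: int, predicate: str, semantic_only: bool = False) -> float:
    """Equation (14), or std_vocab alone when semantic_only is set."""
    if semantic_only:
        return std_vocab(predicate)
    return (1.0 / (distance + 1)) * std_vocab(predicate)
```

For a class three hops away, a standard-vocabulary key such as "skos:prefLabel" scores (1 / 4) * 1.0 = 0.25, while a proprietary key such as "14a-ont:本社所在地" scores (1 / 4) * 0.5 = 0.125.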

FIG. 13 is a diagram showing an example of a graph representing classes and facet keys in RDF Schema. The following describes, for the example of FIG. 13, how the score (ont_score) is calculated with the semantic distance index taken into account when the selected category is "BaseballPlayer" (see the broken line).

In FIG. 13, for example, "name (person)" is assumed to be a standard vocabulary represented by "foaf:name", "name (company)" a standard vocabulary represented by "skos:prefLabel", and "head office location" a proprietary (non-standard) vocabulary represented by "14a-ont:本社所在地".

As an example, when the facet to be scored is "name (company)", which is a standard vocabulary, the score (ont_score) is calculated as shown in equation (16) below.
ont_score(name (company), BaseballPlayer)
= (1 / (distance(BaseballPlayer, Company) + 1)) * std_vocab(name (company))
= (1 / (3 + 1)) * 1.0 = 0.25 (16)

As another example, when the facet to be scored is "head office location", which is a non-standard vocabulary, the score (ont_score) is calculated as shown in equation (17) below.
ont_score(head office location, BaseballPlayer)
= (1 / (distance(BaseballPlayer, Company) + 1)) * std_vocab(head office location)
= (1 / (3 + 1)) * 0.5 = 0.125 (17)

In this way, the semantic interpretation processing unit 24 calculates the score (ont_score) so that a facet key with higher relevance in the ontology, for example one that is semantically closer (larger std_vocab), receives a higher priority, and a facet key with lower relevance, for example one that is semantically farther (smaller std_vocab), receives a lower priority.

As a result, in the item list area 210 shown in FIG. 3, facet keys that are close to the selected category in graph distance, semantic distance, or both are displayed as high-priority items. The server 2 can therefore output, from the set of facets associated with the category, a plurality of appropriate facets narrowed down by the facet keys with higher scores.

Returning to the description of FIG. 4, the ranking adjustment unit 25 calculates the final facet score (final_score) in the facet search.

For example, in the facet search, the search control unit 22 described above lets the user select a target category on the category search screen 100 by following the hierarchy of classes assigned to entities, and then displays the facet search screen 200.

At this time, based on the selected category, the semantic interpretation processing unit 24 calculates the score (ont_score) as a facet importance that uses the knowledge structure.

The ranking adjustment unit 25 may calculate the final score (final_score) based on the facet score (new_score) calculated by the score calculation unit 23b in the preliminary phase and the score (ont_score) calculated by the semantic interpretation processing unit 24. For example, the ranking adjustment unit 25 may calculate the score (final_score) by multiplying the facet score (new_score) by the score (ont_score), based on equation (18) below.
final_score(f, C) = new_score(f, C) * ont_score(f, C) (18)

FIG. 14 is a diagram illustrating, in table form, the score (org_score) of equation (9), the facet score (new_score) of equation (8), the score (ont_score) of equation (10) or equation (14), and the final score (final_score) of equation (18).

The example of FIG. 14 shows the case where "professional baseball player" is selected as the category on the category search screen 100. In this case, the final score (final_score) is highest, at "0.562", for "defensive position", for which both the facet score (new_score) and the score (ont_score) are high. In terms of the score (org_score) of equation (9), "defensive position" scores lower than "name" and "date of birth".

In this way, the index that emphasizes category-specific facet keys, combined with the semantic interpretation of facet keys that takes the knowledge structure of KG 3 into account, yields high scores for appropriate facet keys, such as "defensive position" for the "professional baseball player" category.

For example, based on the calculated final score (final_score), the ranking adjustment unit 25 may sort the facet keys to be displayed in the item list area 210 in descending order of score and output the sorted facet key information to the search control unit 22.
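The multiplication of equation (18) and the descending sort can be sketched together as below. The score pairs are hypothetical values chosen for illustration and are not the values of FIG. 14. The function names are likewise assumptions.

```python
def final_score(new_score: float, ont_score: float) -> float:
    """Equation (18): product of the two scores."""
    return new_score * ont_score

def rank_facet_keys(scores: dict[str, tuple[float, float]]) -> list[tuple[str, float]]:
    """Return (facet key, final_score) pairs, highest score first."""
    ranked = [(key, final_score(new, ont)) for key, (new, ont) in scores.items()]
    # sorted() is stable, so keys with equal scores keep their input order.
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Hypothetical (new_score, ont_score) pairs; not taken from FIG. 14.
SCORES = {
    "name": (0.9, 0.25),
    "defensive position": (0.75, 1.0),
    "date of birth": (0.8, 0.5),
}
RANKED = rank_facet_keys(SCORES)
```

In this example "name" has the largest new_score but drops to last place once its low ont_score is multiplied in, which is the kind of reordering the text describes for category-specific keys.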

The final score (final_score) calculated by the ranking adjustment unit 25 is an example of the priority of the plurality of facets.

As a result, as illustrated in FIG. 15, the search control unit 22 can display in the item list area 210 a facet key list 212 based on the final score (final_score) in place of the list 211 based on the score (org_score) of equation (9). The list 212 is an example of a list in which the plurality of facets are arranged in order according to the priority calculated by the ranking adjustment unit 25.

FIGS. 16 and 17 are diagrams showing comparative examples of the MRR (Mean Reciprocal Rank) when facet keys are sorted based on "org_score", "new_score", and "final_score", respectively. MRR is an evaluation index for the quality of search results and takes a value in the range from "0" to "1". An MRR closer to "1" means that the facet keys being searched for (selected) appear nearer the top of the item list area 210, that is, that the facet search matches the user's knowledge.
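MRR is the mean of the reciprocals of the ranks at which the selected items appear. A minimal sketch of that standard definition follows. The function name is an assumption, and the ranks used in the example are an assumed combination, not values read from the figures.

```python
def mean_reciprocal_rank(ranks: list[int]) -> float:
    """Average of 1/rank over the selected facet keys (ranks are 1-based)."""
    if not ranks:
        raise ValueError("at least one rank is required")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks must be 1 or greater")
    return sum(1.0 / r for r in ranks) / len(ranks)
```

For example, three keys ranked 3rd, 5th, and 6th give (1/3 + 1/5 + 1/6) / 3, approximately 0.2333. Because every term is at most the reciprocal of the best rank, the MRR can never exceed 1 divided by the best rank among the selected keys. Keys ranked 31st or lower, for instance, give an MRR of at most about 0.032.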

FIG. 16 shows a comparative example of the facet ranks and MRR based on the score (org_score) of equation (9) and those based on the facet score (new_score) of equation (8). The example of FIG. 16 assumes that, from a data set with "11401" data items subject to faceting, the "baseball player" category is selected and then the facet keys "at bat", "draft ranking", and "first appearance" are selected.

As illustrated in FIG. 16, on the "org_score" basis every facet key ranks around 100th and the MRR is "0.0094". On the "new_score" basis, by contrast, the facet ranks are 3rd to 6th and the MRR is "0.2333".

Thus, on the "new_score" basis, the facet ranks and the MRR of "at bat", "draft ranking", and "first appearance", which are facets specific to "baseball player", are higher than on the "org_score" basis.

FIG. 17 shows a comparative example of the facet ranks and MRR based on the facet score (new_score) of equation (8) and those based on the final score (final_score) of equation (18). The example of FIG. 17 assumes that, from a data set with "11401" data items subject to faceting, the "baseball player" category is selected and then the facet keys "name", "place of birth", and "date of birth" are selected.

As illustrated in FIG. 17, on the "new_score" basis the facet ranks are 31st to 66th and the MRR is "0.2333". On the "final_score" basis, by contrast, the facet ranks are 141st to 211th and the MRR is "0.0057".

As these results show, "name", "place of birth", and "date of birth" are not facets specific to "baseball player (BaseballPlayer)" but facets relating to "person (Person)". Therefore, on the "final_score" basis, which takes the relevance in the ontology (ont_score) into account, their facet ranks and MRR are lower than on the "new_score" basis.

As described above, the server 2 according to the embodiment realizes efficient facet search in KG 3 by first narrowing down by the category (class) assigned to the facets and then narrowing down by the facets within the category.

In doing so, the server 2 introduces an index that emphasizes class-specific facets, extracts the relationships between facets from KG 3, the target of the facet search, and uses those relationships to efficiently retrieve the set of entities belonging to a particular class.

[1-3] Operation Example
An operation example of the facet search system 1 described above is described below with reference to flowcharts.

[1-3-1] DB Creation Process
FIG. 18 is a flowchart illustrating an operation example of the DB creation process according to the embodiment.

As illustrated in FIG. 18, in the server 2, the frequency table creation unit 23a of the statistical processing unit 23 creates the frequency table 21a (step S1). The score calculation unit 23b of the statistical processing unit 23 then creates the score DB 21b based on the frequency table 21a (step S2), and the process ends.

FIG. 19 is a flowchart illustrating an operation example of the frequency table creation process in step S1 of FIG. 18. As illustrated in FIG. 19, the frequency table creation unit 23a acquires, as the frequency table creation process, all classes C_all in KG 3 (step S11).

The frequency table creation unit 23a determines whether every class C that is an element of all classes C_all has been processed (step S12). If all have been processed (YES in step S12), the process ends.

If not all have been processed (NO in step S12), the frequency table creation unit 23a calculates, for each predicate held by instances whose class is C, the number of instances, in other words the facet key frequency (step S13), and stores it in the P frequency table 21a2.

The frequency table creation unit 23a then calculates, for each predicate held by instances whose class is C, the number of objects, in other words the facet key / facet value frequency (step S14), stores it in the PO frequency table 21a3, and the process returns to step S12.
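Steps S11 to S14 can be sketched over a list of (subject, predicate, object) triples as follows. This is one reading of the flowchart rather than the actual implementation: the P table is taken to count instances per (class, predicate), and the PO table to count instances per (class, predicate, object). The triples, the "ex:" identifiers, and the function name are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

def build_frequency_tables(triples):
    """Return (p_freq, po_freq) built from (subject, predicate, object) triples."""
    classes_of = defaultdict(set)          # subject -> its classes (source for step S11)
    for s, p, o in triples:
        if p == RDF_TYPE:
            classes_of[s].add(o)

    p_subjects = defaultdict(set)          # (class, predicate) -> subjects
    po_subjects = defaultdict(set)         # (class, predicate, object) -> subjects
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for c in classes_of.get(s, ()):    # every class of the subject (loop of step S12)
            p_subjects[(c, p)].add(s)      # step S13: facet key frequency
            po_subjects[(c, p, o)].add(s)  # step S14: facet key / facet value frequency

    p_freq = {k: len(v) for k, v in p_subjects.items()}
    po_freq = {k: len(v) for k, v in po_subjects.items()}
    return p_freq, po_freq

# Hypothetical triples for illustration.
TRIPLES = [
    ("ex:a", "rdf:type", "BaseballPlayer"),
    ("ex:b", "rdf:type", "BaseballPlayer"),
    ("ex:a", "ex:position", "pitcher"),
    ("ex:b", "ex:position", "pitcher"),
    ("ex:b", "ex:position", "catcher"),
    ("ex:a", "foaf:name", "A"),
]
P_FREQ, PO_FREQ = build_frequency_tables(TRIPLES)
```

Counting distinct subjects rather than triples keeps an instance with two values for the same predicate from being counted twice in the P table.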

FIG. 20 is a flowchart illustrating an operation example of the score DB creation process in step S2 of FIG. 18. As illustrated in FIG. 20, the score calculation unit 23b calculates, as the score DB creation process, the facet frequency, the facet balance, the key density, and the key monotonicity based on the frequency table 21a and KG 3 (steps S21 to S24).

The score calculation unit 23b calculates the category importance based on the class and facet information in KG 3 (step S25).

The score calculation unit 23b then calculates the score org_score of equation (9) using the facet frequency, the facet balance, the key density, and the key monotonicity (step S26).

The score calculation unit 23b also calculates the facet score new_score using the score org_score calculated in step S26 and the category importance calculated in step S25, stores it in the score DB 21b (step S27), and the process ends.
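Appendix 2 of this document describes the first index as including the number of all categories divided by the number of categories with which a facet is associated. The sketch below computes only that ratio, as one possible ingredient of the category importance of step S25. How the ratio is scaled and combined with org_score in equation (8) is not shown in this section, so the sketch deliberately stops at the ratio. All names and the sample data are hypothetical.

```python
def category_ratio(facet_classes: dict[str, set[str]], all_classes: set[str]) -> dict[str, float]:
    """For each facet key: (number of all classes) / (number of classes it is associated with)."""
    total = len(all_classes)
    return {
        key: total / len(classes)
        for key, classes in facet_classes.items()
        if classes  # skip keys with no associated class to avoid dividing by zero
    }

# Hypothetical classes and facet associations.
ALL_CLASSES = {"BaseballPlayer", "SoccerPlayer", "Company", "City"}
FACET_CLASSES = {
    "name": set(ALL_CLASSES),                  # appears in every class
    "defensive position": {"BaseballPlayer"},  # specific to one class
}
RATIOS = category_ratio(FACET_CLASSES, ALL_CLASSES)
```

A key shared by every class gets the minimum ratio of 1.0, while a key specific to one class gets the maximum, here 4.0, which matches the stated aim of emphasizing category-specific facet keys.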

[1-3-2] Facet Search Process
FIG. 21 is a flowchart illustrating an operation example of the facet search process according to the embodiment.

As illustrated in FIG. 21, in the server 2, the search control unit 22 accepts a selection of a class C made by the terminal 4 on the category search screen 100 presented to the terminal 4 (step S31).

The semantic interpretation processing unit 24 acquires the predicates held by instances belonging to the class C (step S32).

The semantic interpretation processing unit 24 determines whether every facet key f that is an element of the predicates has been processed (step S33). If not all have been processed (NO in step S33), the semantic interpretation processing unit 24 acquires the score new_score of the facet key f in the class C from the score DB 21b (step S34).

Using the class C and the facet key f, the semantic interpretation processing unit 24 calculates the distance score ont_score based on the ontology in KG 3 (step S35), and the process returns to step S33.

If all have been processed in step S33 (YES in step S33), the semantic interpretation processing unit 24 calculates the score final_score of each facet key f using new_score and ont_score (step S36).

The ranking adjustment unit 25 notifies the search control unit 22 of the facet keys sorted based on final_score. The search control unit 22 displays the sorted facet keys in the item list area 210 of the facet search screen 200, which is the transition destination from the category search screen 100 (step S37), and the facet search process relating to the display of the item list area 210 ends.
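The flow of steps S31 to S37 can be sketched end to end as below, with the score DB modelled as a dictionary and the ontology score passed in as a callable. Both are assumptions made for this example, as are the function name and the sample values. The final score uses the product form of equation (18).

```python
from typing import Callable

def facet_search_order(
    selected_class: str,                         # step S31: class chosen on the category screen
    predicates: list[str],                       # step S32: predicates of instances of the class
    score_db: dict[tuple[str, str], float],      # (class, facet key) -> new_score
    ont_score: Callable[[str, str], float],      # (facet key, class) -> ont_score
) -> list[str]:
    """Return facet keys ordered by final_score, highest first."""
    new_scores, ont_scores = {}, {}
    for f in predicates:                                  # step S33: until every key is processed
        new_scores[f] = score_db[(selected_class, f)]     # step S34
        ont_scores[f] = ont_score(f, selected_class)      # step S35
    finals = {f: new_scores[f] * ont_scores[f] for f in predicates}  # step S36
    return sorted(finals, key=finals.get, reverse=True)   # ordering displayed in step S37

# Hypothetical score DB and ontology scores.
SCORE_DB = {("BaseballPlayer", "position"): 0.6, ("BaseballPlayer", "name"): 0.9}
ONT = {"position": 1.0, "name": 0.5}
ORDER = facet_search_order("BaseballPlayer", ["name", "position"], SCORE_DB,
                           lambda f, c: ONT[f])
```

Passing the ontology score as a callable keeps the sketch independent of whether it is computed from graph distance, from the standard-vocabulary index, or from both.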

[2] Others
The technique according to the embodiment described above can be modified or changed as follows.

For example, the search control unit 22, the statistical processing unit 23 (the frequency table creation unit 23a and the score calculation unit 23b), the semantic interpretation processing unit 24, and the ranking adjustment unit 25 included in the server 2 shown in FIG. 4 may be merged in any combination, or each may be divided.

The server 2 shown in FIG. 1 may also be configured so that a plurality of devices cooperate with one another via a network to realize each processing function. As an example, the search control unit 22 may be a Web server, the statistical processing unit 23, the semantic interpretation processing unit 24, and the ranking adjustment unit 25 may be an application server, and the memory unit 21 may be a DB server. In this case, the Web server, the application server, and the DB server may cooperate with one another via the network to realize each processing function of the server 2.

Furthermore, in the embodiment, the final score (final_score) input to the ranking adjustment unit 25 was described as the product of ont_score and new_score, in which signif is taken into account, as shown in equations (8) and (18). The final score (final_score) is not limited to this. It suffices that at least one of signif and ont_score is taken into account.

For example, the final score (final_score) may be equal to new_score, without taking ont_score into account, as shown in equation (19) below.

final_score(f, C) = new_score(f, C) (19)

Alternatively, the final score (final_score) may be a score that does not take signif into account, as shown in equation (20) below. In the embodiment, the weight ω in equation (20) is assumed to be "0.5" by way of example.

final_score(f, C) = ω * org_score(f, C) + (1 - ω) * ont_score(f, C) (20)
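The two variants can be sketched side by side. The default weight of 0.5 follows the example value in the text. The restriction of ω to the interval [0, 1] is an assumption added here so that the result stays a weighted average, and the function names are hypothetical.

```python
def final_score_eq19(new_score: float) -> float:
    """Equation (19): use new_score as is, ignoring ont_score."""
    return new_score

def final_score_eq20(org_score: float, ont_score: float, omega: float = 0.5) -> float:
    """Equation (20): weighted sum of org_score and ont_score, without signif."""
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega is assumed to lie in [0, 1]")
    return omega * org_score + (1.0 - omega) * ont_score
```

With ω = 0.5, an org_score of 0.5 and an ont_score of 0.25 give 0.375. Unlike the product of equation (18), the weighted sum does not drive the final score to zero when one of the two inputs is zero.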

The final score (final_score) of equation (19) or equation (20) can also provide the same effects as the embodiment, as illustrated in FIG. 16 or FIG. 17.

[3] Appendices
The following appendices are further disclosed with respect to the above embodiment.

(Appendix 1)
A search program that causes a computer to execute a process comprising:
accepting a designation of a specific category; and
outputting a plurality of facets associated with the specific category in a knowledge graph, arranged in order according to priorities of the plurality of facets, the priorities being calculated based on at least one of a first index and a second index, the first index being based on the number of categories with which each of the plurality of facets is associated in the knowledge graph, and the second index being based on a distance in the knowledge graph between the specific category and each of the plurality of facets.

(Appendix 2)
The search program according to Appendix 1, wherein the first index includes, for each of the plurality of facets, a result of dividing the number of all categories in the knowledge graph by the number of categories with which that facet is associated in the knowledge graph.

(Appendix 3)
The search program according to Appendix 2, further causing the computer to execute a process comprising:
calculating the first index including the result for each of the plurality of facets; and
storing information including the calculated first index in a storage area.

(Appendix 4)
The search program according to any one of Appendices 1 to 3, wherein the second index is based on a distance between a hierarchy level of the specific category in the knowledge graph and a hierarchy level of the category with which each of the plurality of facets is associated in the knowledge graph.

(Appendix 5)
The search program according to any one of Appendices 1 to 4, wherein the second index is based on whether each of the plurality of facets is a standard vocabulary.

(Appendix 6)
The search program according to any one of Appendices 1 to 5, wherein the outputting includes displaying the plurality of facets, arranged in order according to their priorities, on a search screen that is a transition destination from the screen on which the designation of the specific category was accepted and that is used to perform a search targeting the plurality of facets associated with the specific category.

(Appendix 7)
A search method in which a computer executes a process comprising:
accepting a designation of a specific category; and
outputting a plurality of facets associated with the specific category in a knowledge graph, arranged in order according to priorities of the plurality of facets, the priorities being calculated based on at least one of a first index and a second index, the first index being based on the number of categories with which each of the plurality of facets is associated in the knowledge graph, and the second index being based on a distance in the knowledge graph between the specific category and each of the plurality of facets.

(Appendix 8)
The search method according to Appendix 7, wherein the first index includes, for each of the plurality of facets, a result of dividing the number of all categories in the knowledge graph by the number of categories with which that facet is associated in the knowledge graph.

(Appendix 9)
The search method according to Appendix 8, wherein the computer further executes a process comprising:
calculating the first index including the result for each of the plurality of facets; and
storing information including the calculated first index in a storage area.

(Appendix 10)
The search method according to any one of Appendices 7 to 9, wherein the second index is based on a distance between a hierarchy level of the specific category in the knowledge graph and a hierarchy level of the category with which each of the plurality of facets is associated in the knowledge graph.

(Appendix 11)
The search method according to any one of Appendices 7 to 10, wherein the second index is based on whether each of the plurality of facets is a standard vocabulary.

(Appendix 12)
The search method according to any one of Appendices 7 to 11, wherein the outputting includes displaying the plurality of facets, arranged in order according to their priorities, on a search screen that is a transition destination from the screen on which the designation of the specific category was accepted and that is used to perform a search targeting the plurality of facets associated with the specific category.

(Appendix 13)
A search device comprising:
a reception unit that accepts a designation of a specific category; and
an output unit that outputs a plurality of facets associated with the specific category in a knowledge graph, arranged in order according to priorities of the plurality of facets, the priorities being calculated based on at least one of a first index and a second index, the first index being based on the number of categories with which each of the plurality of facets is associated in the knowledge graph, and the second index being based on a distance in the knowledge graph between the specific category and each of the plurality of facets.

(Appendix 14)
The search device according to Appendix 13, wherein the first index includes, for each of the plurality of facets, a result of dividing the number of all categories in the knowledge graph by the number of categories with which that facet is associated in the knowledge graph.

(Appendix 15)
The search device according to Appendix 14, further comprising a calculation unit that calculates the first index including the result for each of the plurality of facets and stores information including the calculated first index in a storage area.

(Appendix 16)
The search device according to any one of Appendices 13 to 15, wherein the second index is based on a distance between a hierarchy level of the specific category in the knowledge graph and a hierarchy level of the category with which each of the plurality of facets is associated in the knowledge graph.

(Appendix 17)
The search device according to any one of Appendices 13 to 16, wherein the second index is based on whether each of the plurality of facets is a standard vocabulary.

(Appendix 18)
The search device according to any one of Appendices 13 to 17, wherein the output unit displays the plurality of facets, arranged in order according to their priorities, on a search screen that is a transition destination from the screen on which the designation of the specific category was accepted and that is used to perform a search targeting the plurality of facets associated with the specific category.

1 Facet search system
10 Computer
2 Server
21 Memory unit
21a Frequency table
21b Score DB
22 Search control unit
23 Statistical processing unit
23a Frequency table creation unit
23b Score calculation unit
24 Semantic interpretation processing unit
25 Ranking adjustment unit
3 KG (knowledge graph)
4 Terminal

Claims

A search program that causes a computer to execute a process comprising:
accepting designation of a specific category; and
outputting a plurality of facets associated with the specific category in a Knowledge Graph, arranged in order according to priorities of the plurality of facets, the priorities being calculated based on at least one of a first index, which is based on the number of categories with which each of the plurality of facets is associated in the Knowledge Graph, and a second index, which is based on the distance between the specific category and each of the plurality of facets in the Knowledge Graph.
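The ordering step recited in this claim can be illustrated with a minimal Python sketch. The function name `rank_facets` and the choice to multiply the two indices are assumptions made only for illustration; the claim requires only that the priority be calculated from at least one of the indices and does not fix how they are combined.

```python
from typing import Dict, List, Optional


def rank_facets(
    facets: List[str],
    first_index: Optional[Dict[str, float]] = None,
    second_index: Optional[Dict[str, float]] = None,
) -> List[str]:
    """Return the facets sorted by descending priority.

    Assumption: when both indices are given, the priority is their
    product. Either index may be omitted, but not both.
    """
    if first_index is None and second_index is None:
        raise ValueError("at least one index is required")

    def priority(facet: str) -> float:
        value = 1.0
        if first_index is not None:
            value *= first_index[facet]
        if second_index is not None:
            value *= second_index[facet]
        return value

    # sorted() is stable, so facets with equal priority keep input order.
    return sorted(facets, key=priority, reverse=True)
```

Passing only `first_index` or only `second_index` ranks the facets by that index alone, which corresponds to the "at least one of" wording.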

The search program according to claim 1, wherein the first index includes the result of dividing the number of all categories in the Knowledge Graph by the number of categories with which each of the plurality of facets is associated in the Knowledge Graph.
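As a rough illustration of the quotient described in this claim, the following Python sketch computes, for each facet, the total number of categories divided by the number of categories associated with that facet. The function name, the dictionary layout, and the sample data are hypothetical. Because the claim says the first index "includes" this result, the actual index may apply further processing that is not shown here.

```python
from typing import Dict, Set


def first_index(
    facet_to_categories: Dict[str, Set[str]],
    total_categories: int,
) -> Dict[str, float]:
    """Quotient (all categories) / (categories associated with the facet)."""
    scores: Dict[str, float] = {}
    for facet, categories in facet_to_categories.items():
        if not categories:
            # Assumption: a facet with no associated category is skipped
            # to avoid division by zero.
            continue
        scores[facet] = total_categories / len(categories)
    return scores


# Hypothetical data: "price" appears in every category, "lens mount" in one.
example = {
    "price": {"camera", "tv", "phone", "book"},
    "lens mount": {"camera"},
}
scores = first_index(example, total_categories=4)
```

A facet tied to few categories receives a larger value than one tied to nearly all categories, so category-specific facets are favored by this index.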

The search program according to claim 2, further causing the computer to execute a process of calculating the first index, including the result for each of the plurality of facets, and storing information including the calculated first index in a storage area.

The search program according to any one of claims 1 to 3, wherein the second index is based on the distance between the hierarchy of the specific category in the Knowledge Graph and the hierarchy of each category with which each of the plurality of facets is associated in the Knowledge Graph.
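One possible reading of this hierarchy distance is sketched below in Python. It assumes that each category has an integer depth in the hierarchy, that the distance is the absolute difference in depth, that the closest associated category is used, and that the distance is mapped to a score with 1 / (1 + distance). None of these choices is stated in the claim; they are illustrative assumptions only.

```python
from typing import Dict, Iterable


def hierarchy_distance(
    depth: Dict[str, int],
    specific_category: str,
    facet_categories: Iterable[str],
) -> int:
    """Smallest absolute depth difference between the specific category
    and the categories associated with a facet.

    Raises ValueError if facet_categories is empty.
    """
    base = depth[specific_category]
    return min(abs(base - depth[c]) for c in facet_categories)


def second_index(distance: int) -> float:
    """Assumed mapping: a smaller distance gives a larger index value."""
    return 1.0 / (1.0 + distance)


# Hypothetical three-level hierarchy: Product > Electronics > Camera.
depth = {"Product": 0, "Electronics": 1, "Camera": 2}
d = hierarchy_distance(depth, "Camera", ["Product", "Electronics"])
```

Under these assumptions, a facet attached to the specific category itself has distance 0 and receives the maximum value of 1.0.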

The search program according to any one of claims 1 to 4, wherein the second index is based on whether or not each of the plurality of facets is a standard vocabulary term.
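The standard-vocabulary check could be sketched as follows in Python. The claim does not say how a facet is judged to be standard vocabulary or how the result affects the index. The namespace list, the prefix test on a facet URI, and the multiplicative `bonus` are all assumptions chosen for illustration.

```python
# Assumption: a facet is identified by a URI, and it counts as standard
# vocabulary when the URI starts with one of these well-known namespaces.
STANDARD_NAMESPACES = (
    "http://schema.org/",
    "https://schema.org/",
    "http://purl.org/dc/terms/",
)


def is_standard_vocabulary(facet_uri: str) -> bool:
    """True if the facet URI belongs to an assumed standard namespace."""
    # str.startswith accepts a tuple of prefixes.
    return facet_uri.startswith(STANDARD_NAMESPACES)


def adjust_second_index(base: float, facet_uri: str, bonus: float = 1.5) -> float:
    """Assumed adjustment: multiply the index by `bonus` for standard terms."""
    if is_standard_vocabulary(facet_uri):
        return base * bonus
    return base
```

With the default `bonus`, a facet such as `http://schema.org/price` is weighted 1.5 times higher than a locally defined facet with the same base value.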

The search program according to any one of claims 1 to 5, wherein the outputting includes displaying the plurality of facets, arranged in order according to their priorities, on a search screen that is the transition destination from the screen that accepted the designation of the specific category and that is used to perform a search targeting the plurality of facets associated with the specific category.

A search method in which a computer executes a process comprising:
accepting designation of a specific category; and
outputting a plurality of facets associated with the specific category in a Knowledge Graph, arranged in order according to priorities of the plurality of facets, the priorities being calculated based on at least one of a first index, which is based on the number of categories with which each of the plurality of facets is associated in the Knowledge Graph, and a second index, which is based on the distance between the specific category and each of the plurality of facets in the Knowledge Graph.

A search device comprising:
a reception unit that accepts designation of a specific category; and
an output unit that outputs a plurality of facets associated with the specific category in a Knowledge Graph, arranged in order according to priorities of the plurality of facets, the priorities being calculated based on at least one of a first index, which is based on the number of categories with which each of the plurality of facets is associated in the Knowledge Graph, and a second index, which is based on the distance between the specific category and each of the plurality of facets in the Knowledge Graph.