JP2013200862A

JP2013200862A - Method and device for diversifying query results

Info

Publication number: JP2013200862A
Application number: JP2012276584A
Authority: JP
Inventors: Jianqiang Li; shun cheng Liu
Original assignee: NEC China Co Ltd
Current assignee: NEC China Co Ltd
Priority date: 2012-03-23
Filing date: 2012-12-19
Publication date: 2013-10-03
Anticipated expiration: 2032-12-19
Also published as: CN103324644A; CN103324644B; JP5486667B2

Abstract

PROBLEM TO BE SOLVED: To provide a method and device for diversifying query results. SOLUTION: In the method, related keywords are extracted from a domain ontology for each query keyword of a query to construct a set of related keyword combinations; a search is performed with each related keyword combination in the set to obtain query result sets; a corresponding number of query results is extracted from each query result set to construct a final query result set; and the final query result set is ranked to obtain diversified query results.

Description

The present invention relates to information retrieval technology, and more particularly to a method and apparatus for diversifying query results.

Conventional information retrieval techniques generally diversify results by post-processing or re-ranking the retrieved documents, for example by clustering or classifying search results or by re-ranking results through mean-variance analysis.

With the development of information retrieval technology, user demand for diversified search results and for disambiguation of retrieval queries keeps growing. Diversification of search results means that the query keywords entered by a user may have several interpretations and that the returned query results must cover these different interpretations. Its purpose is to reduce, as far as possible, the risk of user dissatisfaction by balancing the relevance and novelty of the search results. Query disambiguation focuses on identifying all possible query intents from the user's input keywords and expressing those intents more precisely.

Query disambiguation serves as a new way of supporting search diversification; especially when the result scale is large, it can save computational cost and make the results easier to understand. In the prior art, diversified search is generally performed by applying statistical analysis (or machine learning, etc.) to query logs.

Specifically, an existing method for diversifying query results adopts a form of query-to-query transformation, as shown in FIG. 1, and includes the following steps.

Step S101: generate k related queries R(Q) for a given query Q by analyzing a large sample of query logs.

Step S102: obtain an initial DOC list by extracting n/(k+1) results from each set of query results, where n represents the number of user documents; and

Step S103: re-rank the initial DOC list using a relevance feedback technique.

A corresponding device for diversifying query results, as shown in FIG. 2, includes:

a query unit 201 configured to store the user's query keywords;

a query log memory unit 202 configured to store the user's query logs;

a query disambiguation unit 203 configured to determine the query keywords related to the target query according to the user's query keywords and query logs;

a subquery memory unit 204 configured to store the query keywords related to the target query;

a document memory unit 205 configured to store the documents to be searched;

a keyword search unit 206 configured to search the documents in the document memory unit 205 using the subquery keywords;

a subquery result memory unit 207 configured to store the query results of the search by each subquery;

a query result combining unit 208 configured to combine the respective query results;

a query result memory unit 209 configured to store the combined query results;

a query result ranking unit 210 configured to rank the combined query results; and

a diversified rank list memory unit 211 configured to store the final diversified query results of the target query.

Specifically, given, for example, the query keyword "window" and the target query q = (window), the subquery keywords "window XP", "house window", ... are obtained from the query keyword and the query log. The set of subqueries of q is then R(q) = {(q_1, q, window XP), (q_2, q, house window), ...}. Searching with the target query q and with each subquery in R(q) yields a document list for each, forming the set of document lists S(q) = {(q, document list 1), (q_1, document list 2), (q_2, document list 3), ...}, and n/(k+1) documents are selected from each document list to form a new query result set RF(q) for q. Here n is a predetermined value representing the scale of the results and k is the number of subqueries. The documents in RF(q) are ranked by the degree of match between each document and the user's interest, which yields the diversified query results for the user query.

As the above method shows, the prior art determines the set of subqueries from the query log. The inventors of the present invention have recognized, however, that the query log can be an unreliable data source: it is generated from user-entered query keywords that may not accurately express the user's real query intent at that time, and in enterprise search or other search scenarios a query log may be unavailable or too small to support query disambiguation. As a result, inaccurate diversified query results are produced.

Embodiments of the present invention provide a method and apparatus for diversifying query results so as to obtain accurate diversified query results.

A method for diversifying query results according to a preferred aspect of the present invention includes the steps of:
constructing a set of related keyword combinations by extracting related keywords from a domain ontology for each query keyword;
searching with each related keyword combination in the set of related keyword combinations to obtain query result sets;
constructing a final query result set by extracting a corresponding number of query results from each query result set; and
ranking the constructed final query result set to obtain diversified query results.

An apparatus for diversifying query results according to a preferred aspect of the present invention includes:
a related keyword combination set construction unit configured to construct a set of related keyword combinations by extracting related keywords from a domain ontology for each query keyword;
a query unit configured to search with each related keyword combination in the set of related keyword combinations to obtain query result sets;
a final query result set construction unit configured to construct a final query result set by extracting a corresponding number of query results from each query result set; and
a ranking unit configured to rank the constructed final query result set to obtain diversified query results.

According to the method and apparatus of the present invention, a set of related keyword combinations is determined for a given keyword query from the domain ontology, and an expanded query set is constructed from these combinations. This avoids deriving the expanded query set from an unreliable query log, so the diversified query results become more accurate.

FIG. 1 is a flowchart of a method for diversifying query results in the prior art.
FIG. 2 is a schematic structural diagram of an apparatus for diversifying query results in the prior art.
FIG. 3 is a flowchart of a method for diversifying query results according to an embodiment of the present invention.
FIG. 4 is a flowchart of a method for obtaining a minimal subgraph according to an embodiment of the present invention.
FIG. 5 is a flowchart of a method for determining a set of query results according to an embodiment of the present invention.
FIG. 6 is a flowchart of a method for obtaining query results according to an embodiment of the present invention.
FIG. 7 is a flowchart of a ranking method according to an embodiment of the present invention.
FIG. 8 is a flowchart of a method of ranking by similarity according to an embodiment of the present invention.
FIG. 9 is a schematic structural diagram of an apparatus for diversifying query results according to an embodiment of the present invention.

Embodiments of the present invention provide a method and apparatus for diversifying query results in which a set of related keyword combinations is determined for a given keyword query from the domain ontology and an expanded query set is constructed from these combinations. This avoids deriving the expanded query set from an unreliable query log, so the diversified query results become more accurate.

As shown in FIG. 3, a method for diversifying query results according to an embodiment of the present invention includes the following steps.

Step S301: construct a set of related keyword combinations by extracting related keywords from the domain ontology for each query keyword.

Step S302: search with each related keyword combination in the set of related keyword combinations to obtain query result sets.

Step S303: construct a final query result set by extracting a corresponding number of query results from each query result set.

Step S304: rank the constructed final query result set to obtain diversified query results.

Because each related keyword is extracted from the domain ontology, the selection of related keywords is more accurate. Moreover, interpreting the given query by constructing a query graph from the related keyword combinations brings the interpretation closer to the user's real intent, which makes the diversified query results more accurate. A domain ontology represents domain knowledge as a set of concepts together with the relationships between those concepts.

Specifically, in step S301, the related keywords of each keyword of the given query are first extracted from the domain ontology, and the set of related keyword combinations is then constructed from those related keywords. The constructed set is S(Q) = {(c_1, c_2, ..., c_m) | c_1 ∈ C_1 && c_2 ∈ C_2 && ... && c_m ∈ C_m}, where C_i is the related keyword set of the i-th of the m keywords in the given query.
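The set S(Q) defined in step S301 is the Cartesian product of the related keyword sets C_1, ..., C_m. A minimal sketch, assuming each C_i is simply a list of strings; the function name `build_combination_set` and the sample keywords are illustrative and not taken from the patent:

```python
from itertools import product


def build_combination_set(related_sets):
    """Return S(Q): every tuple (c_1, ..., c_m) with c_i drawn from C_i."""
    return list(product(*related_sets))


# Hypothetical related keyword sets C_1 and C_2 for a two-keyword query.
C1 = ["peony flower", "Peony TV", "Mudanjiang"]
C2 = ["Beijing City", "Beijing watch"]

combinations = build_combination_set([C1, C2])
print(len(combinations))  # one combination per pair: 3 * 2
```

The number of combinations grows as the product of the set sizes, which is one reason a later filtering step on the combinations is useful.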

When extracting the related keywords of a given query keyword, word matching can be performed between the query keyword and the ontology concepts, and the matching ontology concepts are then set as the related keywords. Those skilled in the art can of course extract related keywords from the domain ontology by other methods.

To make the query results more accurate, the related keyword combinations can be further filtered so that those closer to the user's intent are obtained.

Specifically, in step S301, after the set of related keyword combinations has been constructed for the given query according to the domain ontology,

for each related keyword combination in the set, a minimal subgraph connecting all keywords in the combination is extracted from the domain ontology. Here, the minimal subgraph is the subgraph that has the smallest number of edges among the subgraphs connecting the keywords in the domain ontology, that is, the subgraph that connects all the related keywords most closely.

As shown in FIG. 4, given five keywords in a related keyword combination, the extracted subgraph connects all five keywords and has the smallest number of edges.

In this case, as shown in FIG. 5, step S302 of searching with each related keyword combination in the set of related keyword combinations to obtain query result sets specifically includes the following.

Step S501: for each minimal subgraph, determine a subquery consisting of the keywords and the other nodes in the minimal subgraph.

Step S502: search with the keywords and other nodes of each subquery to obtain as many subquery result sets as there are minimal subgraphs; and

Step S503: determine the query result set as the set consisting of the subquery result sets.
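Steps S501 to S503 can be sketched as follows. This is only an illustration under stated assumptions: a minimal subgraph is given as a list of edges, the subquery is the set of all nodes on those edges, and `search` is a toy stand-in (plain substring matching over an in-memory dictionary) for whatever search engine is actually used. All names and sample documents are hypothetical.

```python
def subquery_terms(edges):
    """S501: the subquery consists of every node (keyword or not) in the subgraph."""
    return sorted({node for edge in edges for node in edge})


def search(documents, terms):
    """Toy search: ids of documents whose text contains every term."""
    return [doc_id for doc_id, text in documents.items()
            if all(term in text for term in terms)]


def query_result_sets(documents, subgraphs):
    """S502 and S503: one subquery result set per minimal subgraph."""
    return {name: search(documents, subquery_terms(edges))
            for name, edges in subgraphs.items()}


# Hypothetical minimal subgraphs and documents.
subgraphs = {
    "g1": [("peony flower", "Beijing City")],
    "g3": [("peony flower", "Li Qinqin"), ("Li Qinqin", "Beijing Story")],
}
documents = {
    "doc1": "peony flower festival in Beijing City",
    "doc2": "Li Qinqin stars in Beijing Story with a peony flower motif",
    "doc3": "Peony TV product catalogue",
}
result_sets = query_result_sets(documents, subgraphs)
```

Including the intermediate node (here "Li Qinqin") in the subquery is what distinguishes this from searching with the related keywords alone.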

For example, a user enters a query containing m keywords, Q = {k_1, ..., k_m}. For any keyword k_i, a set of related keywords C_i = {c_i1, c_i2, ..., c_in_i} can be extracted from the domain ontology, where n_i denotes the number of related keywords of k_i, and the relevance values of the related keywords with respect to k_i, R_i = {r_i1, r_i2, ..., r_in_i}, can be calculated based on the domain ontology. In total, n_1 × n_2 × ... × n_m related keyword combinations, S(Q) = {(c_1, c_2, ..., c_m) | c_1 ∈ C_1 && c_2 ∈ C_2 && ... && c_m ∈ C_m}, can be determined for the user's input query.

For each related keyword combination, a minimal query semantic graph can be determined from the domain ontology. The minimal query semantic graph contains every keyword in the combination, each keyword being a node of the graph, and may also contain other nodes that join the keywords together. The minimal query semantic graph, which can also be regarded as a subquery of the given user query, is the subgraph with the smallest number of edges among the subgraphs connecting the keywords.

The minimal subgraph can be obtained as follows: randomly select a keyword from the related keyword combination and set it as the starting node of the traversal; traverse each path in the domain ontology that connects the keyword to other nodes; and select the shortest path to a target node as a path of the minimal subgraph, repeating until a minimal subgraph connecting all the keywords is determined. If two paths have the same length, one of them can be selected at random. A target node here is a keyword in the related keyword combination.
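The traversal described above can be approximated with repeated breadth-first searches. The sketch below is one possible reading, not the patent's exact procedure: it assumes the ontology is an undirected adjacency dictionary, starts from the first keyword instead of a random one (to keep the result reproducible), and grows the subgraph by repeatedly attaching the nearest keyword that is not yet connected. This is a greedy heuristic and does not guarantee the globally smallest edge count. The names `shortest_path`, `minimal_subgraph`, and the sample ontology are hypothetical.

```python
from collections import deque


def shortest_path(graph, sources, targets):
    """BFS from any node in `sources` to the nearest node in `targets`."""
    parent = {s: None for s in sources}
    queue = deque(sorted(sources))
    while queue:
        node = queue.popleft()
        if node in targets:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None  # no target is reachable


def minimal_subgraph(graph, keywords):
    """Greedily connect all keywords; returns a set of undirected edges."""
    connected = {keywords[0]}
    remaining = set(keywords[1:]) - connected
    edges = set()
    while remaining:
        path = shortest_path(graph, connected, remaining)
        if path is None:
            return None  # the keywords cannot all be connected
        edges.update(frozenset(pair) for pair in zip(path, path[1:]))
        connected.update(path)
        remaining -= connected
    return edges


# Hypothetical fragment of a domain ontology.
ontology = {
    "peony flower": ["Li Qinqin", "Beijing City"],
    "Li Qinqin": ["peony flower", "Beijing Story"],
    "Beijing Story": ["Li Qinqin"],
    "Beijing City": ["peony flower"],
}
g3_edges = minimal_subgraph(ontology, ["peony flower", "Beijing Story"])
```

Ties between equally short paths are resolved here by BFS visiting order; the text allows a random choice instead.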

In step S303, there are two ways to construct the final query result set by extracting a corresponding number of query results from each query result set. The first is to extract a preset number of query results from each subquery result set. The second is to extract a number of query results from each subquery result set according to the relevance of the subquery to the given query, so that the more relevant a subquery is to the given query, the more results are selected from its result set and added to the final query result set.

Specifically, as shown in FIG. 6, a corresponding number of query results can be obtained from each subquery result set according to the relevance of each subquery to the given query as follows.

Step S601: determine the subgraph weight of each minimal subgraph by a formula (not reproduced in this text) in which m represents the number of query keywords, r_i represents the match value, with respect to the i-th query keyword, of the related keyword extracted from the domain ontology for that query keyword, and |E| represents the number of edges in the subgraph.

Step S602: extract a corresponding number of query results from the subquery result set of each minimal subgraph according to the subgraph weight of that minimal subgraph.
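The weight formula of step S601 is not legible in this text. One form that uses exactly the stated quantities (m, r_i, |E|) and reproduces the three subgraph weights given in the worked example later in this description (0.65, 0.5, and 0.138) is the mean match value divided by the number of edges. The sketch below uses that form as an assumption; it is an inference from the example values, not a quotation of the patent's formula, and the function name is hypothetical.

```python
def subgraph_weight(match_values, num_edges):
    """Assumed form: (r_1 + ... + r_m) / (m * |E|)."""
    m = len(match_values)
    return sum(match_values) / (m * num_edges)


# Match values and edge counts taken from the worked example.
w_g1 = subgraph_weight([0.5, 0.8], num_edges=1)   # peony flower, Beijing City
w_g2 = subgraph_weight([0.2, 0.8], num_edges=1)   # Peony TV, Beijing City
w_g3 = subgraph_weight([0.5, 0.05], num_edges=2)  # path through one intermediate node
```

Under this assumption w_g3 evaluates to 0.1375, which the example reports as 0.138. Dividing by |E| penalizes combinations whose keywords are far apart in the ontology.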

In step S602, the corresponding number of query results can be obtained from the subquery result set of each minimal subgraph, according to the subgraph weight of that minimal subgraph, as follows.

The query results extracted from the subquery result set corresponding to a minimal subgraph are the top k*a results, where a is the ratio of the subgraph weight of the current minimal subgraph to the sum of the subgraph weights of all minimal subgraphs, k is a predetermined value representing the scale of the final query result set, and k*a denotes the largest integer less than the product of k and a.
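The proportional allocation can be sketched as below. The sketch reads "the largest integer less than the product" as rounding down with `math.floor`, which is an assumption (the two differ only when the product is itself an integer). The function name `allocate` and the sample weights are hypothetical.

```python
import math


def allocate(weights, k):
    """Number of top results to take from each subquery result set."""
    total = sum(weights.values())
    return {name: math.floor(k * w / total) for name, w in weights.items()}


# Hypothetical subgraph weights; k is the desired size of the final set.
quota = allocate({"g1": 0.5, "g2": 0.3, "g3": 0.2}, k=7)
# quota["g1"] is 3 (from 3.5), quota["g2"] is 2 (from 2.1), quota["g3"] is 1 (from 1.4)
```

Because every share is rounded down, the quotas can add up to fewer than k results; an implementation might hand the leftover slots to the highest-weighted subgraphs, which the text does not specify.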

An embodiment of the present invention further provides a method of ranking the query results so that the user can more conveniently see the query results closest to the query intent. In this case, as shown in FIG. 7, step S304 of ranking the constructed final query result set to obtain diversified query results includes the following.

Step S701: for each query result in the final query result set, determine the relevance value of the query result with respect to its corresponding minimal subgraph.

Step S702: for each query result in the final query result set, determine the weight of the query result according to the relevance value of the query result with respect to its corresponding minimal subgraph and the subgraph weight of that minimal subgraph; and

Step S703: rank the query results in the final query result set by query result weight to obtain diversified query results.

Specifically, in step S702, the weight of a query result can be determined from the relevance value of the query result with respect to its corresponding minimal subgraph and the subgraph weight of that minimal subgraph as follows.

The weight of a query result is determined as the product of the relevance value of the query result with respect to its corresponding minimal subgraph and the subgraph weight of that minimal subgraph.
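A minimal sketch of steps S702 and S703 with direct ranking: each result carries the subgraph weight of its minimal subgraph and its relevance to that subgraph, and results are sorted by the product. The tuple layout and function name are assumptions made for illustration; the numbers are those of the worked example later in this description.

```python
def result_weight(subgraph_weight, relevance):
    """S702: query result weight = subgraph weight * relevance."""
    return subgraph_weight * relevance


# (document id, subgraph weight, relevance to the subgraph), deliberately unsorted.
final_set = [
    ("doc3", 0.5, 0.8),
    ("doc1", 0.65, 0.9),
    ("doc2", 0.65, 0.7),
]

# S703: rank by query result weight, highest first.
ranked = sorted(final_set, key=lambda r: result_weight(r[1], r[2]), reverse=True)
ranked_ids = [doc_id for doc_id, _, _ in ranked]
```

With these values the weights are 0.585 for doc1, 0.455 for doc2, and 0.4 for doc3, so the order is doc1, doc2, doc3.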

Furthermore, in step S703 the obtained query results can be ranked by query result weight either directly or by additionally taking the similarity between query results into account. In the latter case, as shown in FIG. 8, step S703 specifically includes the following.

Step S801: determine the query result with the highest weight as the first-ranked query result, and determine the similarity between every two query results.

Step S802: determine the similarity weight of each query result still to be ranked by a formula (not reproduced in this text) in which s represents the weight of the query result, d represents the current query result, D represents the set of query results already ranked, and similarity(d, d') represents the similarity between d and d'; and

Step S803: iteratively rank the query results other than the first-ranked query result by similarity weight.
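Since the similarity weight formula of step S802 is not legible in this text, the sketch below substitutes a hypothetical penalty in the same spirit: the weight s of a candidate d is multiplied by one minus its largest similarity to any already ranked result in D. This is an assumed stand-in, not the patent's formula; several other penalties would give the same ordering on the worked example. The function name and data layout are also illustrative.

```python
def diversify(weights, similarity):
    """Greedy re-ranking: S801 picks the top result, S802/S803 rank the rest."""
    remaining = dict(weights)
    first = max(remaining, key=remaining.get)
    ranked = [first]
    del remaining[first]
    while remaining:
        def similarity_weight(d):
            # Assumed penalty: s(d) * (1 - max similarity to ranked results).
            penalty = max(similarity[frozenset((d, r))] for r in ranked)
            return remaining[d] * (1 - penalty)

        best = max(remaining, key=similarity_weight)
        ranked.append(best)
        del remaining[best]
    return ranked


# Query result weights and pairwise similarities from the worked example.
weights = {"doc1": 0.65 * 0.9, "doc2": 0.65 * 0.7, "doc3": 0.5 * 0.8}
similarity = {
    frozenset(("doc1", "doc2")): 0.5,
    frozenset(("doc1", "doc3")): 0.1,
    frozenset(("doc2", "doc3")): 0.2,
}
order = diversify(weights, similarity)
```

Under this assumed penalty doc2 is demoted below doc3 because it is much more similar to the first-ranked doc1, giving the order doc1, doc3, doc2 stated in the worked example.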

A specific example illustrating the method for diversifying query results according to an embodiment of the present invention is described below.

When the user's query keywords are "peony" and "Beijing", C("peony") = {("peony flower", 0.5), ("Peony TV", 0.2), ("Mudanjiang", 0.2), ...} and C("Beijing") = {("Beijing City", 0.8), ("Beijing watch", 0.07), ("Beijing Story", 0.05), ...} can be determined from the domain ontology. Here, ("peony flower", 0.5) gives the match value of "peony flower", a related keyword of "peony", with respect to "peony".

After the related keyword combinations are determined, a subgraph connecting the keywords of each combination is obtained. For example, the set of minimal subgraphs is S(graph) = {(g1, peony flower, Beijing City, 0.65), (g2, Peony TV, Beijing City, 0.5), (g3, peony flower, Li Qinqin, Beijing Story, 0.138), ...}. As can easily be derived, subgraph g1 has a subgraph weight of 0.65, g2 has a subgraph weight of 0.5, and g3 has a subgraph weight of 0.138.

Searching with the keywords and other nodes of each subgraph yields the respective sets of subquery results, for example result(g1) = {(doc1, wg = 0.65, wr = 0.9), (doc2, wg = 0.65, wr = 0.7), ...}, result(g2) = {(doc3, wg = 0.5, wr = 0.8), (doc4, wg = 0.5, wr = 0.6), ...}, and so on. For each document in a set of subquery results, wg represents the subgraph weight of the corresponding minimal subgraph and wr represents the relevance of the document to that minimal subgraph; the documents in each set of subquery results are ranked by wr.

The query results obtained from the subquery result set corresponding to a minimal subgraph are the top 3*a query results most relevant to that subgraph. For example, the corresponding number of top documents (the expressions for these counts are not reproduced in this text) is selected from result(g1) and added to the final query result set RF(q), and likewise the corresponding number of top documents is selected from result(g2) and added to RF(q). Here, 3 is the expected scale of RF(q).

When RF(q) is {(doc1, 0.65, 0.9), (doc2, 0.65, 0.7), (doc3, 0.5, 0.8)},

the obtained query results can be ranked directly by query result weight. The weights of the three documents are s1 = 0.65 × 0.9 = 0.585, s2 = 0.65 × 0.7 = 0.455, and s3 = 0.5 × 0.8 = 0.4, so the ranked query results are RF(q) = {doc1, doc2, doc3}.

Alternatively, the obtained query results can be ranked by additionally using similarity. In this case, if similarity(doc1, doc2) = 0.5, similarity(doc1, doc3) = 0.1, and similarity(doc2, doc3) = 0.2, the ranked query results are RF(q) = {doc1, doc3, doc2}.

An embodiment of the present invention correspondingly further provides an apparatus for diversifying query results. As shown in FIG. 9, the apparatus includes:

a related keyword combination set construction unit 901 configured to construct a set of related keyword combinations by extracting related keywords from the domain ontology for each query keyword of a given query;

a query unit 902 configured to search with each related keyword combination in the set of related keyword combinations to obtain query result sets;

a final query result set construction unit 903 configured to construct a final query result set by extracting a corresponding number of query results from each query result set; and

a ranking unit 904 configured to rank the constructed final query result set to obtain diversified query results.

In particular, the related keyword combination set construction unit 901 is configured to:

extract, for each query keyword of the given query, the related keywords of that keyword from the domain ontology; and

construct the set of related keyword combinations based on the related keywords.

When constructing the set of related keyword combinations based on the related keywords, the related keyword combination set construction unit 901 specifically proceeds as follows.

The set of related keyword combinations is constructed as S(Q) = {(c_1, c_2, ..., c_m) | c_1 ∈ C_1 && c_2 ∈ C_2 && ... && c_m ∈ C_m}, where C_i is the related keyword set of the i-th of the m keywords in the given query.

特に、関連キーワード組み合わせセット構築ユニット９０１は、さらに In particular, the related keyword combination set construction unit 901 is further configured to,

所与のクエリのクエリキーワード毎に、領域オントロジからキーワードの関連キーワードを抽出した後、 after extracting, for each query keyword of the given query, the related keywords of that keyword from the domain ontology, and

領域オントロジにおける所与のクエリのキーワードセットの関連キーワードの組み合わせセットを構築した後、 after constructing the related keyword combination set for the keyword set of the given query in the domain ontology,

関連キーワードの組み合わせセットにおける関連キーワードの組み合わせ毎に、関連キーワードの組み合わせ内のそれぞれのキーワードを連結する最小サブグラフを領域オントロジから抽出するように構成されており、ここでは、最小サブグラフは、領域オントロジにおけるそれぞれのキーワードを連結するサブグラフの中の最小数のエッジを有するサブグラフである。 extract from the domain ontology, for each related keyword combination in the related keyword combination set, a minimum subgraph connecting the keywords in that combination. Here, the minimum subgraph is the subgraph with the fewest edges among the subgraphs of the domain ontology that connect those keywords.
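
For a combination of exactly two keywords, the subgraph with the fewest edges that connects them is a shortest path, which breadth-first search finds. The sketch below covers only that two-keyword case and assumes the ontology is an undirected adjacency dictionary; connecting three or more keywords with the fewest edges is a harder problem that this sketch does not attempt. The function name and the toy ontology are hypothetical:

```python
from collections import deque


def shortest_path_edges(graph, source, target):
    """Breadth-first search over an undirected adjacency dict.

    Returns the edges of a path with the fewest edges from source to
    target, or None when the two nodes are not connected.
    """
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbour in graph.get(node, ()):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    if target not in parent:
        return None
    # Walk back from the target to the source to recover the path.
    edges = []
    node = target
    while parent[node] is not None:
        edges.append((parent[node], node))
        node = parent[node]
    edges.reverse()
    return edges


# Hypothetical toy ontology (assumption, not from the patent).
ontology = {
    "car": ["vehicle"],
    "vehicle": ["car", "engine", "transport"],
    "engine": ["vehicle", "fuel"],
    "fuel": ["engine"],
    "transport": ["vehicle"],
}

subgraph = shortest_path_edges(ontology, "car", "fuel")
print(subgraph)
```

The intermediate nodes on the path ("vehicle" and "engine" in this toy data) are the "other nodes" that later become part of the subquery.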

クエリユニット９０２は特に、 The query unit 902 is specifically configured to:

最小サブグラフ毎に、最小サブグラフにおけるキーワードおよび他のノードから成るサブクエリを決定し、 determine, for each minimum subgraph, a subquery consisting of the keywords and the other nodes in that minimum subgraph;

各サブクエリにおけるキーワードおよび他のノードでサーチして、最小サブグラフの数と同じ数のサブクエリ結果セットを取得し、 search with the keywords and other nodes of each subquery to obtain as many subquery result sets as there are minimum subgraphs; and

クエリ結果セットをそれぞれのサブクエリ結果セットから成るセットとして決定するように構成される。 determine the query result set as the set consisting of the respective subquery result sets.
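
The query unit's behaviour can be sketched as: turn each minimum subgraph into a list of terms, run one search per subquery, and collect one result set per subgraph. The search function below is a deliberately naive stand-in (substring match over an in-memory dictionary) and is an assumption; the patent does not specify the search backend. All names and data are hypothetical:

```python
def subquery_terms(subgraph_edges):
    """Collect every node of a minimum subgraph (keywords and
    intermediate nodes) in first-seen order, without duplicates."""
    terms = []
    for u, v in subgraph_edges:
        for node in (u, v):
            if node not in terms:
                terms.append(node)
    return terms


def search(documents, terms):
    """Hypothetical search: ids of documents whose text contains every term."""
    return [doc_id for doc_id, text in documents.items()
            if all(term in text for term in terms)]


# Toy corpus and two toy minimum subgraphs (assumptions for illustration).
documents = {
    "d1": "car vehicle engine fuel economy",
    "d2": "car vehicle transport",
    "d3": "engine fuel",
}
subgraphs = [
    [("car", "vehicle"), ("vehicle", "engine"), ("engine", "fuel")],
    [("car", "vehicle"), ("vehicle", "transport")],
]

# One subquery result set per minimum subgraph.
result_sets = [search(documents, subquery_terms(g)) for g in subgraphs]
print(result_sets)
```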

最終クエリ結果セット構築ユニット９０３は特に、 The final query result set construction unit 903 is specifically configured to:

所与のクエリに対する各サブクエリの関連性に従って、各サブクエリ結果セットから対応する数のクエリ結果を抽出し、 extract a corresponding number of query results from each subquery result set according to the relevance of each subquery to the given query; and

それぞれのサブクエリ結果セットから取得されたクエリ結果を結合するように構成される。 combine the query results obtained from the respective subquery result sets.

さらに、最終クエリ結果セット構築ユニット９０３は特に、 Furthermore, the final query result set construction unit 903 is specifically configured to:

各最小サブグラフのサブグラフ重みを

として決定し、式中ｍはクエリキーワードの数を表し、ｒｉはクエリキーワードに対するクエリキーワードについて領域オントロジから抽出された関連キーワードのマッチ値を表し、｜Ｅ｜はサブグラフにおけるエッジの数を表し、 determine the subgraph weight of each minimum subgraph by a formula (not reproduced in this text) in which m represents the number of query keywords, r_i represents the match value, with respect to a query keyword, of the related keyword extracted from the domain ontology for that query keyword, and |E| represents the number of edges in the subgraph; and

最小サブグラフのサブグラフ重みに従って、各最小サブグラフに対応するサブクエリ結果セットから対応する数のクエリ結果を抽出し、 extract a corresponding number of query results from the subquery result set corresponding to each minimum subgraph according to the subgraph weight of that minimum subgraph.

具体的には、最小サブグラフのサブグラフ重みに従って、各最小サブグラフに対応するサブクエリ結果セットから対応する数のクエリ結果を抽出する最終クエリ結果セット構築ユニット９０３は、特に以下を含む。 Specifically, the final query result set construction unit 903 extracts the corresponding number of query results from the subquery result set corresponding to each minimum subgraph, according to the subgraph weight of that minimum subgraph, as follows.

最小サブグラフに対応するサブクエリ結果セットから抽出されたクエリ結果は、最小サブグラフに最も関連する上位ｋ＊ａ個のクエリ結果であり、ここでａはすべての最小サブグラフのサブグラフ重みの合計に対する現在の最小サブグラフのサブグラフ重みの比率を表し、ｋは最終的なクエリ結果セットのスケールを表すあらかじめ定められた値であり、ｋ＊ａはｋとａとの積未満の最大の整数を表す。 The query results extracted from the subquery result set corresponding to a minimum subgraph are the top k*a query results most relevant to that minimum subgraph, where a is the ratio of the subgraph weight of the current minimum subgraph to the sum of the subgraph weights of all minimum subgraphs, k is a predetermined value representing the size of the final query result set, and k*a denotes the largest integer less than the product of k and a.
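
The allocation rule above can be sketched as follows. The subgraph weights are taken as given inputs because the weight formula itself is not reproduced in this text. The sketch assumes that "the largest integer less than the product" can be read as rounding down with `math.floor`; the two readings differ only when k*a is exactly an integer. The function name and the sample weights are hypothetical:

```python
import math


def allocate_quota(subgraph_weights, k):
    """Number of results to take from each subquery result set.

    a = weight / sum of all weights; the quota is k * a rounded down
    (assumption: floor is used for "largest integer less than k * a").
    """
    total = sum(subgraph_weights)
    return [math.floor(k * w / total) for w in subgraph_weights]


# Hypothetical subgraph weights for three minimum subgraphs, k = 10.
quotas = allocate_quota([4, 2, 1], k=10)
print(quotas)
```

Because each quota is rounded down, the quotas can sum to less than k; how the remainder is handled is not stated in this passage.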

ランク付けユニット９０４は特に、 The ranking unit 904 is specifically configured to:

最終的なクエリ結果セットにおけるクエリ結果毎に、対応する最小サブグラフに対するクエリ結果の関連度の値を決定し、 determine, for each query result in the final query result set, the relevance value of the query result to its corresponding minimum subgraph;

最終的なクエリ結果セットにおけるクエリ結果毎に、対応する最小サブグラフに対するクエリ結果の関連度の値、および最小サブグラフのサブグラフ重みに従ってクエリ結果の重みを決定し、 determine, for each query result in the final query result set, the weight of the query result according to the relevance value of the query result to its corresponding minimum subgraph and the subgraph weight of that minimum subgraph; and

クエリ結果の重みによって最終的なクエリ結果セットにおけるクエリ結果をランク付けして、多様化されたクエリ結果を取得するように構成される。 rank the query results in the final query result set by query result weight to obtain the diversified query results.
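
A minimal sketch of the direct ranking path, assuming the query result weight is the product of the relevance value and the subgraph weight, as the supplementary notes (付記９) state. The relevance values and subgraph weights below are made-up inputs, and `rank_by_weight` is a hypothetical name:

```python
def rank_by_weight(results, subgraph_weights):
    """Rank results by relevance * subgraph weight, highest first.

    results: list of (doc_id, subgraph_index, relevance) tuples.
    """
    weighted = [(doc_id, relevance * subgraph_weights[index])
                for doc_id, index, relevance in results]
    return sorted(weighted, key=lambda item: item[1], reverse=True)


# Hypothetical inputs: two minimum subgraphs and three query results.
subgraph_weights = [0.5, 0.25]
results = [("d1", 0, 0.8), ("d2", 1, 1.0), ("d3", 0, 0.4)]

ranked = rank_by_weight(results, subgraph_weights)
print([doc_id for doc_id, _ in ranked])
```

In this toy data "d2" has the highest relevance to its own subgraph but still ranks below "d1", because its subgraph carries a lower weight.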

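クエリ結果の重みを、対応する最小サブグラフに対するクエリ結果の関連度の値、および最小サブグラフのサブグラフ重みに従って決定するランク付けユニット９０４は、特に以下を含む。 The ranking unit 904 determines the weight of a query result according to the relevance value of the query result to its corresponding minimum subgraph and the subgraph weight of that minimum subgraph as follows.

クエリ結果の重みは、対応する最小サブグラフに対するクエリ結果の関連度の値と最小サブグラフのサブグラフ重みとの積として決定される。 The weight of the query result is determined as the product of the relevance value of the query result to its corresponding minimum subgraph and the subgraph weight of that minimum subgraph.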

クエリ結果の重みによって取得されたクエリ結果をランク付けするランク付けユニット９０４は、特に以下を含む。 The ranking unit 904 ranks the obtained query results by query result weight as follows.

クエリ結果は、クエリ結果の重みに従って直接ランク付けされる。または The query results are ranked directly according to their query result weights; or

最も高い重みを有するクエリ結果は、第１位にランク付けされたクエリ結果として決定され、各２つのクエリ結果の間の類似性が決定され、ランク付けされる各クエリ結果の類似性の重みは、

として決定される。式中ｓはクエリ結果の重みを表し、ｄは現在のクエリ結果を表し、Ｄはすでにランク付けされている１組のクエリ結果を表し、類似性（ｄ，ｄ´）はｄとｄ´との間の類似性を表し、第１位にランク付けされたクエリ結果を除くクエリ結果が類似性の重みによって繰り返してランク付けされる。 the query result with the highest weight is determined as the first-ranked query result, the similarity between every two query results is determined, and the similarity weight of each query result to be ranked is determined by a formula (not reproduced in this text) in which s represents the weight of the query result, d represents the current query result, D represents the set of query results already ranked, and similarity(d, d′) represents the similarity between d and d′. The query results other than the first-ranked one are then ranked iteratively by similarity weight.
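
The iterative alternative can be sketched as a greedy loop: pick the highest-weight result first, then repeatedly pick the remaining result with the best similarity weight relative to the already-ranked set D. Because the similarity weight formula is not reproduced in this text, the sketch takes it as a caller-supplied function. The `placeholder_score` used in the example (weight minus the maximum similarity to any ranked result) is an assumption for demonstration only and is not the patent's formula; all names and numbers are hypothetical:

```python
def iterative_rank(weights, similarity_weight):
    """Greedy re-ranking.

    weights: dict mapping doc_id to its query result weight s.
    similarity_weight(d, ranked, weights): caller-supplied score of
    candidate d given the list of already ranked results.
    """
    if not weights:
        return []
    remaining = dict(weights)
    first = max(remaining, key=remaining.get)  # highest weight ranks first
    ranked = [first]
    del remaining[first]
    while remaining:
        best = max(remaining,
                   key=lambda d: similarity_weight(d, ranked, weights))
        ranked.append(best)
        del remaining[best]
    return ranked


# Hypothetical pairwise similarities and weights.
sim = {
    frozenset(("a", "b")): 0.9,
    frozenset(("a", "c")): 0.1,
    frozenset(("b", "c")): 0.2,
}
weights = {"a": 1.0, "b": 0.9, "c": 0.5}


def placeholder_score(d, ranked, weights):
    # Stand-in scoring, NOT the patent's formula: penalise a candidate
    # by its strongest similarity to anything already ranked.
    return weights[d] - max(sim[frozenset((d, r))] for r in ranked)


order = iterative_rank(weights, placeholder_score)
print(order)
```

With this stand-in score, "c" overtakes "b" despite its lower weight because "b" is nearly a duplicate of the first-ranked "a".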

本発明の実施例は、クエリ結果を多様化するための方法および装置を提供し、所与のクエリの１組のキーワードの１組の関連キーワードの組み合わせが領域オントロジにおいて決定され、これらの関連キーワードの組み合わせによって拡張されたクエリセットが構築され、それによって、信頼できないクエリログから拡張されたクエリセットが決定されることが回避され、従って多様化されたクエリ結果がより正確になる。 Embodiments of the present invention provide a method and an apparatus for diversifying query results. A set of related keyword combinations for the keyword set of a given query is determined in a domain ontology, and an expanded query set is constructed from these related keyword combinations. This avoids deriving the expanded query set from unreliable query logs, so the diversified query results are more accurate.

本発明の実施例を、方法、システム、またはコンピュータプログラム製品として具体化することができることを、当業者であれば理解されたい。従って本発明は、すべてハードウェアの実施例、すべてソフトウェアの実施例、またはソフトウェアとハードウェアとの組み合わせの実施例の形で具体化することができる。さらに、本発明は、コンピュータ使用可能プログラムコードが含まれる１つまたは複数のコンピュータ使用可能記憶媒体（ディスクメモリ、ＣＤ−ＲＯＭ、光メモリなどを含むが、これに限定されるものではない）において具体化されるコンピュータプログラム製品の形で具体化することができる。 Those skilled in the art will appreciate that the embodiments of the invention may be embodied as a method, system, or computer program product. Accordingly, the present invention may be embodied in the form of an entirely hardware embodiment, an all software embodiment, or an embodiment of a combination of software and hardware. Furthermore, the present invention is embodied in one or more computer-usable storage media (including but not limited to disk memory, CD-ROM, optical memory, etc.) that contain computer-usable program code. Can be embodied in the form of a computer program product.

本発明について、本発明の実施例による方法、デバイス（システム）、およびコンピュータプログラム製品のフローチャート／ブロック図で説明してきた。フローチャート／ブロック図のそれぞれのフロー／ブロック、ならびにフローチャート／ブロック図のフロー／ブロックの組み合わせをコンピュータプログラム命令で具体化することができることを理解されたい。これらのコンピュータプログラム命令は、コンピュータ、または他のプログラム可能なデータ処理装置のプロセッサ上で実行される命令がフローチャートのフロー／ブロック図のブロックで指定されている機能を実行するための手段を構築できるマシンを生産するために、汎用コンピュータ、専用コンピュータ、埋込み型プロセッサ、または別のプログラム可能なデータ処理装置のプロセッサ上にロードすることができる。 The present invention has been described in flowcharts / block diagrams of methods, devices (systems) and computer program products according to embodiments of the invention. It should be understood that each flow / block of the flowchart / block diagram, and combinations of flowcharts / block diagrams, may be embodied in computer program instructions. These computer program instructions can build a means for instructions executed on the processor of a computer or other programmable data processing device to perform the functions specified in the flowchart / block diagram blocks of the flowchart. To produce the machine, it can be loaded onto a general purpose computer, a dedicated computer, an embedded processor, or another programmable data processing device processor.

これらのコンピュータプログラム命令は、コンピュータまたは他のプログラム可能なデータ処理装置に特定の方法で動作するよう指示することができるコンピュータ可読メモリに格納することもできる。これによってコンピュータ可読メモリに格納された命令が、フローチャートのフロー／ブロック図のブロックで指定されている機能を実行する命令手段を含む製品を作成できるようになる。 These computer program instructions may also be stored in a computer readable memory that may instruct a computer or other programmable data processing device to operate in a particular manner. This allows the instructions stored in the computer readable memory to create a product that includes instruction means for performing the functions specified in the flowchart / block diagram blocks.

これらのコンピュータプログラム命令は、一連の操作ステップがコンピュータまたは他のプログラム可能なデータ処理装置上で実行され、コンピュータまたは他のプログラム可能なデバイス上で実行される命令が、フローチャートのフロー／ブロック図のブロックで指定されている機能を実行するためのステップを提供するようなコンピュータ実施プロセスを作成するために、コンピュータまたは他のプログラム可能なデータ処理装置上にロードすることもできる。 These computer program instructions are executed in a sequence of operational steps on a computer or other programmable data processing device, and the instructions executed on the computer or other programmable device are represented in the flowchart / block diagram of the flowchart. It can also be loaded onto a computer or other programmable data processing device to create a computer-implemented process that provides steps for performing the functions specified in the block.

本発明の好ましい実施例について説明してきたが、基礎をなす発明の概念から恩恵を受ける当業者は、これらの実施例に追加の変更および変形を行うことができる。従って添付の特許請求の範囲は、好ましい実施例および本発明の範囲に含まれるすべての変更および変形を含むものと解釈されるものとする。 Although preferred embodiments of the present invention have been described, those skilled in the art who benefit from the underlying inventive concepts can make additional changes and modifications to these embodiments. Accordingly, the appended claims are to be construed to include the preferred embodiments and all modifications and variations that fall within the scope of the invention.

当業者は、本発明の意図および範囲から逸脱することなく、本発明に様々な変更および変形を行うことができることは明らかである。従って変更および変形が本発明およびその同等物に添付される特許請求の範囲に含まれる限り、本発明はこれらの変更および変形も含むものとする。 It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit and scope of the invention. Accordingly, it is intended that the present invention include those modifications and variations as long as the modifications and variations are included in the scope of the claims appended hereto and their equivalents.

さらに、上記実施形態の一部又は全部は、以下の付記のようにも記載されうるが、これに限定されない。 Further, a part or all of the above-described embodiment can be described as in the following supplementary notes, but is not limited thereto.

（付記１）
クエリ結果を多様化するための方法であって、
所与のクエリの各クエリキーワードについて領域オントロジから関連キーワードを抽出することによって、関連キーワードの組み合わせセットを構築するステップと、
関連キーワードの組み合わせセットにおけるそれぞれの関連キーワードの組み合わせでサーチして、クエリ結果セットを取得するステップと、
それぞれのクエリ結果セットから対応する数のクエリ結果を抽出することによって、最終的なクエリ結果セットを構築するステップと、
構築された最終的なクエリ結果セットをランク付けして、多様化されたクエリ結果を取得するステップと
を含むことを特徴とするクエリ結果を多様化するための方法。
(Appendix 1)
A method for diversifying query results, comprising:
constructing a related keyword combination set by extracting related keywords from a domain ontology for each query keyword of a given query;
searching with each related keyword combination in the related keyword combination set to obtain query result sets;
constructing a final query result set by extracting a corresponding number of query results from each query result set; and
ranking the constructed final query result set to obtain diversified query results.

（付記２）
所与のクエリの各クエリキーワードについて領域オントロジから関連キーワードを抽出することによって、関連キーワードの組み合わせセットを構築する前記ステップが、
所与のクエリのクエリキーワード毎に、前記領域オントロジからキーワードの関連キーワードを抽出するステップと、
前記関連キーワードに基づいて、関連キーワードの組み合わせセットを構築するステップを含む
ことを特徴とする付記１に記載のクエリ結果を多様化するための方法。
(Appendix 2)
The method for diversifying query results according to appendix 1, wherein the step of constructing a related keyword combination set by extracting related keywords from a domain ontology for each query keyword of a given query comprises:
extracting, for each query keyword of the given query, the related keywords of that keyword from the domain ontology; and
constructing the related keyword combination set based on the related keywords.

（付記３）
前記関連キーワードに基づいて、関連キーワードの組み合わせセットを構築する前記ステップが、
関連キーワードの組み合わせセットを、Ｓ（Ｑ）＝｛（ｃ_１，ｃ_２，…，ｃ_ｍ）｜ｃ_１∈Ｃ_１＆＆ｃ_２∈Ｃ_２＆＆…ｃ_ｍ∈Ｃ_ｍ｝として構築する
ここでＣ_ｉは、所与のクエリにおけるｍ個のキーワードの中のｉ番目のキーワードの関連キーワードセットである
ことを特徴とする付記２に記載のクエリ結果を多様化するための方法。
(Appendix 3)
The method for diversifying query results according to appendix 2, wherein the step of constructing the related keyword combination set based on the related keywords comprises:
constructing the related keyword combination set as $S(Q) = \{(c_1, c_2, \ldots, c_m) \mid c_1 \in C_1 \land c_2 \in C_2 \land \cdots \land c_m \in C_m\}$,
where $C_i$ is the related keyword set of the i-th keyword among the m keywords in the given query.

（付記４）
各クエリキーワードについて領域オントロジから関連キーワードを抽出することによって、関連キーワードの組み合わせセットを構築した後に、
関連キーワードの組み合わせセットにおける関連キーワードの組み合わせ毎に、組み合わせ内のすべてのキーワードを連結する最小サブグラフを領域オントロジから抽出するステップを含み、
関連キーワードの組み合わせセットにおけるそれぞれの関連キーワードの組み合わせでサーチして、クエリ結果セットを取得する前記ステップは、
最小サブグラフ毎に、最小サブグラフにおけるキーワードおよび他のノードから成るサブクエリを決定するステップと、
各サブクエリにおけるキーワードおよび他のノードでサーチして、最小サブグラフの数と同じ数のサブクエリ結果セットを取得するステップと、
クエリ結果セットをそれぞれのサブクエリ結果セットから成るセットとして決定するステップとを含み、
前記最小サブグラフは、領域オントロジにおけるそれぞれのキーワードを連結するサブグラフの中の最小数のエッジを有し、最も近い方法ですべての関連キーワードを連結するサブグラフである
ことを特徴とする付記１に記載のクエリ結果を多様化するための方法。
(Appendix 4)
The method for diversifying query results according to appendix 1, further comprising, after constructing the related keyword combination set by extracting related keywords from the domain ontology for each query keyword:
extracting from the domain ontology, for each related keyword combination in the related keyword combination set, a minimum subgraph connecting all keywords in the combination,
wherein the step of searching with each related keyword combination in the related keyword combination set to obtain query result sets comprises:
determining, for each minimum subgraph, a subquery consisting of the keywords and the other nodes in that minimum subgraph;
searching with the keywords and other nodes of each subquery to obtain as many subquery result sets as there are minimum subgraphs; and
determining the query result set as the set consisting of the respective subquery result sets,
and wherein the minimum subgraph is the subgraph that has the fewest edges among the subgraphs of the domain ontology connecting the keywords and that connects all related keywords in the closest manner.

（付記５）
それぞれのクエリ結果セットから対応する数のクエリ結果を抽出することにより、最終的なクエリ結果セットを構築する前記ステップが、
所与のクエリに対する各サブクエリの関連性に従って、各サブクエリ結果セットから対応する数のクエリ結果を抽出するステップと、
各サブクエリ結果セットから取得したクエリ結果を結合するステップとを含む
ことを特徴とする付記４に記載のクエリ結果を多様化するための方法。
(Appendix 5)
The method for diversifying query results according to appendix 4, wherein the step of constructing a final query result set by extracting a corresponding number of query results from each query result set comprises:
extracting a corresponding number of query results from each subquery result set according to the relevance of each subquery to the given query; and
combining the query results obtained from each subquery result set.

（付記６）
所与のクエリに対する各サブクエリの関連性に従って、各サブクエリ結果セットから対応する数のクエリ結果を抽出する前記ステップが、
各最小サブグラフのサブグラフ重みを

ｍはクエリキーワードの数を表し、ｒｉはクエリキーワードに対するクエリキーワードについて領域オントロジから抽出された関連キーワードのマッチ値を表し、｜Ｅ｜はサブグラフにおけるエッジの数を表す
として決定するステップと、
最小サブグラフのサブグラフ重みに従って、各最小サブグラフに対応するサブクエリ結果セットから対応する数のクエリ結果を抽出するステップとを含む
ことを特徴とする付記５に記載のクエリ結果を多様化するための方法。
(Appendix 6)
The method for diversifying query results according to appendix 5, wherein the step of extracting a corresponding number of query results from each subquery result set according to the relevance of each subquery to the given query comprises:
determining the subgraph weight of each minimum subgraph by a formula (not reproduced in this text) in which m represents the number of query keywords, r_i represents the match value, with respect to a query keyword, of the related keyword extracted from the domain ontology for that query keyword, and |E| represents the number of edges in the subgraph; and
extracting a corresponding number of query results from the subquery result set corresponding to each minimum subgraph according to the subgraph weight of that minimum subgraph.

（付記７）
最小サブグラフのサブグラフ重みに従って、各最小サブグラフに対応するサブクエリ結果セットから対応する数のクエリ結果を抽出する前記ステップにおいて、
最小サブグラフに対応するサブクエリ結果セットから抽出されたクエリ結果は、最小サブグラフに最も関連する上位ｋ＊ａ個のクエリ結果であり、ここで、ａはすべての最小サブグラフのサブグラフ重みの合計に対する現在の最小サブグラフのサブグラフ重みの比率を表し、ｋは最終的なクエリ結果セットのスケールを表すあらかじめ定められた値であり、ｋ＊ａはｋとａとの積未満の最大の整数を表す
ことを特徴とする付記５に記載のクエリ結果を多様化するための方法。
(Appendix 7)
The method for diversifying query results according to appendix 5, wherein, in the step of extracting a corresponding number of query results from the subquery result set corresponding to each minimum subgraph according to the subgraph weight of that minimum subgraph,
the query results extracted from the subquery result set corresponding to a minimum subgraph are the top k*a query results most relevant to that minimum subgraph, where a is the ratio of the subgraph weight of the current minimum subgraph to the sum of the subgraph weights of all minimum subgraphs, k is a predetermined value representing the size of the final query result set, and k*a denotes the largest integer less than the product of k and a.

（付記８）
構築された最終的なクエリ結果セットをランク付けして、多様化されたクエリ結果を取得する前記ステップが、
最終的なクエリ結果セットにおけるクエリ結果毎に、対応する最小サブグラフに対するクエリ結果の関連度の値を決定するステップと、
最終的なクエリ結果セットにおけるクエリ結果毎に、対応する最小サブグラフに対するクエリ結果の関連度の値、および最小サブグラフのサブグラフ重みに従ってクエリ結果の重みを決定するステップと、
クエリ結果の重みによって最終的なクエリ結果セットにおけるクエリ結果をランク付けして、多様化されたクエリ結果を取得するステップとを含む
ことを特徴とする付記４に記載のクエリ結果を多様化するための方法。
(Appendix 8)
The method for diversifying query results according to appendix 4, wherein the step of ranking the constructed final query result set to obtain diversified query results comprises:
determining, for each query result in the final query result set, the relevance value of the query result to its corresponding minimum subgraph;
determining, for each query result in the final query result set, the weight of the query result according to the relevance value of the query result to its corresponding minimum subgraph and the subgraph weight of that minimum subgraph; and
ranking the query results in the final query result set by query result weight to obtain the diversified query results.

（付記９）
最終的なクエリ結果セットにおけるクエリ結果毎に、対応する最小サブグラフに対するクエリ結果の関連度の値、および最小サブグラフのサブグラフ重みに従ってクエリ結果の重みを決定する前記ステップが、
クエリ結果の重みを、対応する最小サブグラフに対するクエリ結果の関連度の値と最小サブグラフのサブグラフ重みとの積として決定する
ことを特徴とする付記８に記載のクエリ結果を多様化するための方法。
(Appendix 9)
The method for diversifying query results according to appendix 8, wherein the step of determining, for each query result in the final query result set, the weight of the query result according to the relevance value of the query result to its corresponding minimum subgraph and the subgraph weight of that minimum subgraph comprises:
determining the weight of the query result as the product of the relevance value of the query result to its corresponding minimum subgraph and the subgraph weight of that minimum subgraph.

（付記１０）
構築された最終的なクエリ結果セットをランク付けして、多様化されたクエリ結果を取得する前記ステップが、
クエリ結果を、クエリ結果の重みに従って直接ランク付けするか、あるいは
最も高い重みを有するクエリ結果を、第１位にランク付けされたクエリ結果として決定し、各２つのクエリ結果の間の類似性を決定し、ランク付けされる各クエリ結果の類似性の重みを、

ここで、ｓはクエリ結果の重みを表し、ｄは現在のクエリ結果を表し、Ｄはすでにランク付けされている１組のクエリ結果を表し、類似性（ｄ，ｄ´）はｄとｄ´との間の類似性を表す
として決定し、第１位にランク付けされたクエリ結果を除くクエリ結果を類似性の重みによって繰り返してランク付けする
ことを特徴とする付記８に記載のクエリ結果を多様化するための方法。
(Appendix 10)
The method for diversifying query results according to appendix 8, wherein the step of ranking the constructed final query result set to obtain diversified query results comprises:
ranking the query results directly according to their query result weights; or
determining the query result with the highest weight as the first-ranked query result, determining the similarity between every two query results, determining the similarity weight of each query result to be ranked by a formula (not reproduced in this text) in which s represents the weight of the query result, d represents the current query result, D represents the set of query results already ranked, and similarity(d, d′) represents the similarity between d and d′, and iteratively ranking the query results other than the first-ranked one by similarity weight.

（付記１１）
クエリ結果を多様化するための装置であって、
それぞれのクエリキーワードについて領域オントロジから関連キーワードを抽出することによって、関連キーワードの組み合わせセットを構築するように構成された関連キーワード組み合わせセット構築ユニットと、
関連キーワードの組み合わせセットにおけるそれぞれの関連キーワードの組み合わせでサーチして、クエリ結果セットを取得するように構成されたクエリユニットと、
それぞれのクエリ結果セットから対応する数のクエリ結果を抽出することによって、最終的なクエリ結果セットを構築するように構成された最終クエリ結果セット構築ユニットと、
構築された最終的なクエリ結果セットをランク付けして、多様化されたクエリ結果を取得するように構成されたランク付けユニットと
を備えることを特徴とするクエリ結果を多様化するための装置。
(Appendix 11)
An apparatus for diversifying query results, comprising:
a related keyword combination set construction unit configured to construct a related keyword combination set by extracting related keywords from a domain ontology for each query keyword;
a query unit configured to search with each related keyword combination in the related keyword combination set to obtain query result sets;
a final query result set construction unit configured to construct a final query result set by extracting a corresponding number of query results from each query result set; and
a ranking unit configured to rank the constructed final query result set to obtain diversified query results.

（付記１２）
前記関連キーワード組み合わせセット構築ユニットが、
所与のクエリのクエリキーワード毎に、前記領域オントロジからキーワードの関連キーワードを抽出し、
前記関連キーワードに基づいて、関連キーワードの組み合わせセットを構築するように構成される
ことを特徴とする付記１１に記載のクエリ結果を多様化するための装置。
(Appendix 12)
The apparatus for diversifying query results according to appendix 11, wherein the related keyword combination set construction unit is configured to:
extract, for each query keyword of a given query, the related keywords of that keyword from the domain ontology; and
construct the related keyword combination set based on the related keywords.

（付記１３）
前記関連キーワードに基づいて、関連キーワードの組み合わせセットを構築する前記関連キーワード組み合わせセット構築ユニットが、
関連キーワードの組み合わせセットを、Ｓ（Ｑ）＝｛（ｃ_１，ｃ_２，…，ｃ_ｍ）｜ｃ_１∈Ｃ_１＆＆ｃ_２∈Ｃ_２＆＆…ｃ_ｍ∈Ｃ_ｍ｝として構築するように構成され、
ここでＣ_ｉは、所与のクエリにおけるｍ個のキーワードの中のｉ番目のキーワードの関連キーワードセットである
ことを特徴とする付記１２に記載のクエリ結果を多様化するための装置。
(Appendix 13)
The apparatus for diversifying query results according to appendix 12, wherein the related keyword combination set construction unit, which constructs the related keyword combination set based on the related keywords, is configured to
construct the related keyword combination set as $S(Q) = \{(c_1, c_2, \ldots, c_m) \mid c_1 \in C_1 \land c_2 \in C_2 \land \cdots \land c_m \in C_m\}$,
where $C_i$ is the related keyword set of the i-th keyword among the m keywords in the given query.

（付記１４）
前記関連キーワード組み合わせセット構築ユニットが、
各クエリキーワードについて領域オントロジから関連キーワードを抽出することによって、関連キーワードの組み合わせセットを構築した後に、
関連キーワードの組み合わせセットにおける関連キーワードの組み合わせ毎に、組み合わせ内のすべてのキーワードを連結する最小サブグラフを領域オントロジから抽出するように構成され、
前記クエリユニットが、
最小サブグラフ毎に、最小サブグラフにおけるキーワードおよび他のノードから成るサブクエリを決定し、
各サブクエリにおけるキーワードおよび他のノードでサーチして、最小サブグラフの数と同じ数のサブクエリ結果セットを取得し、
クエリ結果セットをそれぞれのサブクエリ結果セットから成るセットとして決定するように構成され、
前記最小サブグラフは、領域オントロジにおけるそれぞれのキーワードを連結するサブグラフの中の最小数のエッジを有し、最も近い方法ですべての関連キーワードを連結するサブグラフである
ことを特徴とする付記１１に記載のクエリ結果を多様化するための装置。
(Appendix 14)
The apparatus for diversifying query results according to appendix 11, wherein the related keyword combination set construction unit is configured to,
after constructing the related keyword combination set by extracting related keywords from the domain ontology for each query keyword,
extract from the domain ontology, for each related keyword combination in the related keyword combination set, a minimum subgraph connecting all keywords in the combination,
wherein the query unit is configured to:
determine, for each minimum subgraph, a subquery consisting of the keywords and the other nodes in that minimum subgraph;
search with the keywords and other nodes of each subquery to obtain as many subquery result sets as there are minimum subgraphs; and
determine the query result set as the set consisting of the respective subquery result sets,
and wherein the minimum subgraph is the subgraph that has the fewest edges among the subgraphs of the domain ontology connecting the keywords and that connects all related keywords in the closest manner.

（付記１５）
前記最終クエリ結果セット構築ユニットが、
所与のクエリに対する各サブクエリの関連性に従って、各サブクエリ結果セットから対応する数のクエリ結果を抽出し、
各サブクエリ結果セットから取得したクエリ結果を結合するように構成される
ことを特徴とする付記１４に記載のクエリ結果を多様化するための装置。
(Appendix 15)
The apparatus for diversifying query results according to appendix 14, wherein the final query result set construction unit is configured to:
extract a corresponding number of query results from each subquery result set according to the relevance of each subquery to the given query; and
combine the query results obtained from each subquery result set.

（付記１６）
前記最終クエリ結果セット構築ユニットが、
各最小サブグラフのサブグラフ重みを

ｍはクエリキーワードの数を表し、ｒｉはクエリキーワードに対するクエリキーワードについて領域オントロジから抽出された関連キーワードのマッチ値を表し、｜Ｅ｜はサブグラフにおけるエッジの数を表す
として決定し、
最小サブグラフのサブグラフ重みに従って、各最小サブグラフに対応するサブクエリ結果セットから対応する数のクエリ結果を抽出し、
各サブクエリ結果セットから取得したクエリ結果を結合するように構成される
ことを特徴とする付記１５に記載のクエリ結果を多様化するための装置。
(Appendix 16)
The apparatus for diversifying query results according to appendix 15, wherein the final query result set construction unit is configured to:
determine the subgraph weight of each minimum subgraph by a formula (not reproduced in this text) in which m represents the number of query keywords, r_i represents the match value, with respect to a query keyword, of the related keyword extracted from the domain ontology for that query keyword, and |E| represents the number of edges in the subgraph;
extract a corresponding number of query results from the subquery result set corresponding to each minimum subgraph according to the subgraph weight of that minimum subgraph; and
combine the query results obtained from each subquery result set.

（付記１７）
最小サブグラフのサブグラフ重みに従って、各最小サブグラフに対応するサブクエリ結果セットから対応する数のクエリ結果を抽出する前記最終クエリ結果セット構築ユニットにおいて、
最小サブグラフに対応するサブクエリ結果セットから抽出されたクエリ結果は、最小サブグラフに最も関連する上位ｋ＊ａ個のクエリ結果であり、ここで、ａはすべての最小サブグラフのサブグラフ重みの合計に対する現在の最小サブグラフのサブグラフ重みの比率を表し、ｋは最終的なクエリ結果セットのスケールを表すあらかじめ定められた値であり、ｋ＊ａはｋとａとの積未満の最大の整数を表す
ことを特徴とする付記１６に記載のクエリ結果を多様化するための装置。
(Appendix 17)
The apparatus for diversifying query results according to appendix 16, wherein, in the final query result set construction unit that extracts a corresponding number of query results from the subquery result set corresponding to each minimum subgraph according to the subgraph weight of that minimum subgraph,
the query results extracted from the subquery result set corresponding to a minimum subgraph are the top k*a query results most relevant to that minimum subgraph, where a is the ratio of the subgraph weight of the current minimum subgraph to the sum of the subgraph weights of all minimum subgraphs, k is a predetermined value representing the size of the final query result set, and k*a denotes the largest integer less than the product of k and a.

（付記１８）
前記ランク付けユニットが、
最終的なクエリ結果セットにおけるクエリ結果毎に、対応する最小サブグラフに対するクエリ結果の関連度の値を決定し、
最終的なクエリ結果セットにおけるクエリ結果毎に、対応する最小サブグラフに対するクエリ結果の関連度の値、および最小サブグラフのサブグラフ重みに従ってクエリ結果の重みを決定し、
クエリ結果の重みによって最終的なクエリ結果セットにおけるクエリ結果をランク付けして、多様化されたクエリ結果を取得するように構成される
ことを特徴とする付記１４に記載のクエリ結果を多様化するための装置。
(Appendix 18)
The apparatus for diversifying query results according to appendix 14, wherein the ranking unit is configured to:
determine, for each query result in the final query result set, the relevance value of the query result to its corresponding minimum subgraph;
determine, for each query result in the final query result set, the weight of the query result according to the relevance value of the query result to its corresponding minimum subgraph and the subgraph weight of that minimum subgraph; and
rank the query results in the final query result set by query result weight to obtain the diversified query results.

（付記１９）
最終的なクエリ結果セットにおけるクエリ結果毎に、対応する最小サブグラフに対するクエリ結果の関連度の値、および最小サブグラフのサブグラフ重みに従ってクエリ結果の重みを決定する前記ランク付けユニットが、
クエリ結果の重みを、対応する最小サブグラフに対するクエリ結果の関連度の値と最小サブグラフのサブグラフ重みとの積として決定するように構成される
ことを特徴とする付記１８に記載のクエリ結果を多様化するための装置。
(Appendix 19)
The apparatus for diversifying query results according to appendix 18, wherein the ranking unit, which determines for each query result in the final query result set the weight of the query result according to the relevance value of the query result to its corresponding minimum subgraph and the subgraph weight of that minimum subgraph,
is configured to determine the weight of the query result as the product of the relevance value of the query result to its corresponding minimum subgraph and the subgraph weight of that minimum subgraph.

（付記２０）
クエリ結果の重みによって最終的なクエリ結果セットにおけるクエリ結果をランク付けする前記ランク付けユニットが、
クエリ結果を、クエリ結果の重みに従って直接ランク付けするか、あるいは
最も高い重みを有するクエリ結果を、第１位にランク付けされたクエリ結果として決定し、各２つのクエリ結果の間の類似性を決定し、ランク付けされる各クエリ結果の類似性の重みを、

ここで、ｓはクエリ結果の重みを表し、ｄは現在のクエリ結果を表し、Ｄはすでにランク付けされている１組のクエリ結果を表し、類似性（ｄ，ｄ´）はｄとｄ´との間の類似性を表す
として決定し、第１位にランク付けされたクエリ結果を除くクエリ結果を類似性の重みによって繰り返してランク付けするように構成される
ことを特徴とする付記１８に記載のクエリ結果を多様化するための装置。
(Appendix 20)
The apparatus for diversifying query results according to appendix 18, wherein the ranking unit, which ranks the query results in the final query result set by query result weight, is configured to:
rank the query results directly according to their query result weights; or
determine the query result with the highest weight as the first-ranked query result, determine the similarity between every two query results, determine the similarity weight of each query result to be ranked by a formula (not reproduced in this text) in which s represents the weight of the query result, d represents the current query result, D represents the set of query results already ranked, and similarity(d, d′) represents the similarity between d and d′, and iteratively rank the query results other than the first-ranked one by similarity weight.

２０１：クエリユニット
２０２：クエリログメモリユニット
２０３：クエリ曖昧性除去ユニット
２０４：サブクエリメモリユニット
２０５：文書メモリユニット
２０６：キーワードサーチユニット
２０７：サブクエリ結果メモリユニット
２０８：クエリ結果結合ユニット
２０９：クエリ結果メモリユニット
２１０：クエリ結果ランク付けユニット
２１１：多様化済みランクリストメモリユニット
９０１：関連キーワード組み合わせセット構築ユニット
９０２：クエリユニット
９０３：最終クエリ結果セット構築ユニット
９０４：ランク付けユニット
201: Query unit
202: Query log memory unit
203: Query disambiguation unit
204: Subquery memory unit
205: Document memory unit
206: Keyword search unit
207: Subquery result memory unit
208: Query result combination unit
209: Query result memory unit
210: Query result ranking unit
211: Diversified rank list memory unit
901: Related keyword combination set construction unit
902: Query unit
903: Final query result set construction unit
904: Ranking unit

Claims

A method for diversifying query results, comprising:
constructing a related keyword combination set by extracting related keywords from a domain ontology for each query keyword of a given query;
searching with each related keyword combination in the related keyword combination set to obtain query result sets;
constructing a final query result set by extracting a corresponding number of query results from each query result set; and
ranking the constructed final query result set to obtain diversified query results.

The method for diversifying query results according to claim 1, wherein the step of constructing a related keyword combination set by extracting related keywords from a domain ontology for each query keyword of a given query comprises:
extracting, for each query keyword of the given query, the related keywords of that keyword from the domain ontology; and
constructing the related keyword combination set based on the related keywords.

The method for diversifying query results according to claim 2, wherein the step of constructing the related keyword combination set based on the related keywords comprises:
constructing the related keyword combination set as $S(Q) = \{(c_1, c_2, \ldots, c_m) \mid c_1 \in C_1 \land c_2 \in C_2 \land \cdots \land c_m \in C_m\}$, where $C_i$ is the related keyword set of the i-th keyword among the m keywords in the given query.

The method for diversifying query results, further comprising, after building the related keyword combination set by extracting related keywords from the domain ontology for each query keyword:
extracting from the domain ontology, for each related keyword combination in the related keyword combination set, a minimum subgraph connecting all the keywords in the combination,
wherein the step of searching on each related keyword combination in the related keyword combination set to obtain a query result set comprises:
determining, for each minimum subgraph, a subquery consisting of the keywords and the other nodes in the minimum subgraph;
searching on the keywords and other nodes in each subquery to obtain as many subquery result sets as there are minimum subgraphs; and
determining the query result set as the set of the respective subquery result sets,
and wherein the minimum subgraph is the subgraph that has the minimum number of edges among the subgraphs connecting the keywords in the domain ontology, and that connects all the related keywords in the closest way.
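
The extraction of a connecting subgraph and its subquery can be sketched in Python as below. This is only a heuristic under stated assumptions: the ontology is modeled as an undirected adjacency dictionary, and the subgraph is taken as the union of breadth-first shortest paths from the first keyword to each other keyword. That union has the minimum number of edges when the combination has two keywords, but is not guaranteed to be minimal for three or more. The names shortest_path and connecting_subgraph and the sample ontology are hypothetical.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search; return the node list of one shortest path, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None

def connecting_subgraph(graph, keywords):
    """Heuristic: union of shortest paths from the first keyword to every other one."""
    edges = set()
    for target in keywords[1:]:
        path = shortest_path(graph, keywords[0], target)
        if path is None:
            return None  # the keywords are not connected in the ontology
        edges.update(frozenset(pair) for pair in zip(path, path[1:]))
    nodes = set(keywords).union(*edges)
    return nodes, edges

# Hypothetical toy ontology (undirected adjacency lists).
ontology = {
    "aspirin": ["drug"],
    "drug": ["aspirin", "disease", "ibuprofen"],
    "disease": ["drug", "headache"],
    "headache": ["disease"],
    "ibuprofen": ["drug"],
}

nodes, edges = connecting_subgraph(ontology, ["aspirin", "headache"])
subquery = sorted(nodes)  # the keywords plus the other nodes of the subgraph
print(subquery, len(edges))
```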

The method for diversifying query results according to claim 4, wherein the step of constructing a final query result set by extracting a corresponding number of query results from each query result set comprises:
extracting a corresponding number of query results from each subquery result set according to the relevance of each subquery to the given query; and
combining the query results extracted from each subquery result set.

The method for diversifying query results according to claim 5, wherein the step of extracting a corresponding number of query results from each subquery result set according to the relevance of each subquery to the given query comprises:
determining a subgraph weight for each minimum subgraph according to a formula in which m represents the number of query keywords, r_i represents the match value, with respect to the i-th query keyword, of the related keyword extracted from the domain ontology for that query keyword, and |E| represents the number of edges in the subgraph; and
extracting the corresponding number of query results from the subquery result set corresponding to each minimum subgraph according to the subgraph weight of that minimum subgraph.

The method for diversifying query results according to claim 5, wherein, in the step of extracting the corresponding number of query results from the subquery result set corresponding to each minimum subgraph according to the subgraph weight of the minimum subgraph,
the query results extracted from the subquery result set corresponding to a minimum subgraph are the top k * a query results most relevant to that minimum subgraph, where a represents the ratio of the subgraph weight of the current minimum subgraph to the sum of the subgraph weights of all the minimum subgraphs, k is a predetermined value representing the scale of the final query result set, and k * a represents the largest integer less than the product of k and a.
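
The proportional allocation described above can be illustrated with a short Python sketch. It assumes that the rounding is an ordinary floor (the largest integer not exceeding k times a), which is one reading of the claim wording rather than something the text confirms. The names allocate_quotas and take_top and the sample weights are hypothetical.

```python
import math

def allocate_quotas(subgraph_weights, k):
    """Return how many results to take from each subquery result set."""
    total = sum(subgraph_weights)
    # a = w / total is the share of one minimum subgraph; floor(k * a) is its quota.
    return [math.floor(k * (w / total)) for w in subgraph_weights]

def take_top(ranked_results, quota):
    """Keep the first `quota` results of a list already sorted by relevance."""
    return ranked_results[:quota]

# Hypothetical subgraph weights for three minimum subgraphs.
weights = [0.5, 0.25, 0.25]
quotas = allocate_quotas(weights, k=10)
print(quotas)
```

Because each quota is rounded down, the quotas can sum to fewer than k results, as they do for these sample weights.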

The method for diversifying query results according to claim 4, wherein the step of ranking the constructed final query result set to obtain diversified query results comprises:
determining, for each query result in the final query result set, a relevance value of the query result for the corresponding minimum subgraph;
determining, for each query result in the final query result set, a query result weight according to the relevance value of the query result for the corresponding minimum subgraph and the subgraph weight of that minimum subgraph; and
ranking the query results in the final query result set according to the query result weights to obtain the diversified query results.

The method for diversifying query results according to claim 8, wherein, in the step of determining, for each query result in the final query result set, the query result weight according to the relevance value of the query result for the corresponding minimum subgraph and the subgraph weight of the minimum subgraph,
the query result weight is determined as the product of the relevance value of the query result for the corresponding minimum subgraph and the subgraph weight of that minimum subgraph.
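
The product-based ranking can be sketched in Python as follows. Each query result is assumed to carry its relevance value and the weight of the minimum subgraph it came from; the tuple layout, the function name rank_results and the sample values are hypothetical.

```python
def rank_results(final_results):
    """Rank (doc_id, relevance, subgraph_weight) tuples by relevance * subgraph_weight."""
    weighted = [(doc, relevance * weight) for doc, relevance, weight in final_results]
    # Highest query result weight first.
    return sorted(weighted, key=lambda item: item[1], reverse=True)

# Hypothetical final query result set drawn from two subqueries.
final_results = [
    ("d1", 0.9, 0.5),
    ("d2", 0.8, 0.25),
    ("d3", 0.5, 0.5),
    ("d4", 0.95, 0.25),
]

ranked = rank_results(final_results)
print([doc for doc, _ in ranked])
```

In this example d3 ranks above d4 even though d4 has the higher relevance value, because d3 comes from a subgraph with a larger weight.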

A device for diversifying query results, comprising:
a related keyword combination set construction unit configured to build a related keyword combination set by extracting related keywords from a domain ontology for each query keyword;
a query unit configured to search on each related keyword combination in the related keyword combination set to obtain a query result set;
a final query result set construction unit configured to construct a final query result set by extracting a corresponding number of query results from each query result set; and
a ranking unit configured to rank the constructed final query result set to obtain diversified query results.