JP2012208917A

JP2012208917A - Document ranking method and apparatus

Info

Publication number: JP2012208917A
Application number: JP2011268139A
Authority: JP
Inventors: Jianqiang Li; ジェンチャンリイ; chun cheng Liu; チュンチェンリウ; Yu Zhao; ユウジャオ; Bo Liu; ボリウ
Original assignee: NEC China Co Ltd
Current assignee: NEC China Co Ltd
Priority date: 2011-03-28
Filing date: 2011-12-07
Publication date: 2012-10-25
Anticipated expiration: 2031-12-07
Also published as: CN102708104B; JP5362807B2; CN102708104A

Abstract

PROBLEM TO BE SOLVED: To provide a document ranking method and apparatus. SOLUTION: The method comprises the steps of: extracting query semantic information based on a user's query and an ontology; extracting document semantic information based on documents, the query, and the ontology; determining a relational semantic relevance score between the document semantic information and the query semantic information; and ranking the documents based on the relational semantic relevance score. The method and apparatus effectively improve document ranking accuracy.

Description

The present invention relates to the field of information retrieval, and more particularly to a document ranking method and apparatus.

With the wide application and spread of electronic information, diverse and large volumes of information continue to accumulate across various distributed systems. Interest in methods for finding useful information within this mass of information therefore continues to grow.

Information retrieval is the search for information within a collection of documents. It includes searching for a specific piece of information contained in a document, searching for the document itself, searching for metadata describing a document, searching within a database, and so on. The information searched may be text, audio, data, and other forms.

Current document ranking methods fall broadly into two types: query-dependent and query-independent. In a query-dependent method, when a user issues a query, documents are ranked based on the content of that query so that the desired information can be obtained more accurately. In conventional semantics-based document ranking methods, the semantic relevance between a query and a document is determined mainly from an ontology, and documents are ranked by that relevance. These conventional methods consider only the semantic relevance between the concepts contained in the query and the document; they do not consider the semantic relevance carried by the relations between those concepts. Yet the semantic relevance of these relations (hereinafter "relational semantic relevance") is extremely useful for understanding the intent of a user's query and matching it accurately to the desired documents.

As a result, the various document ranking methods of the prior art often fail to return the desired query results quickly and accurately.

In view of the above problems, the present invention provides a document ranking method and apparatus.

According to a first aspect of the present invention, a document ranking method is provided. The method comprises the steps of: extracting query semantic information based on a user's query and an ontology; extracting document semantic information based on a document, the query, and the ontology; determining a relational semantic relevance score between the document semantic information and the query semantic information; and ranking the documents based on the relational semantic relevance score.

According to a second aspect of the present invention, a document ranking apparatus is provided. The apparatus comprises: query semantic information extraction means configured to extract query semantic information based on a user's query and an ontology; document semantic information extraction means configured to extract document semantic information based on a document, the query, and the ontology; relational semantic relevance score determination means configured to determine a relational semantic relevance score between the document semantic information and the query semantic information; and ranking means configured to rank the documents based on the relational semantic relevance score.

According to the method and apparatus of the present invention, documents are ranked based not only on the concept semantic relevance score between the query and the document but also on the relational semantic relevance score between them. Taking into account the semantics of the relations contained in the documents and the query effectively improves query accuracy, so the user can obtain the desired query results more quickly and accurately.

Other features and advantages of the present invention will become apparent from the following description of preferred embodiments, which explains the principles of the invention with reference to the accompanying drawings.

Other objects and effects of the present invention will become clearer and easier to understand as the invention is more fully understood from the following description given with reference to the drawings.

FIG. 1 is a flowchart of a document ranking method according to an embodiment of the present invention.
FIG. 2 is a flowchart of a document ranking method according to another embodiment of the present invention.
FIG. 3 is a flowchart of a method for determining a relational semantic relevance score between document semantic information and query semantic information according to an embodiment of the present invention.
FIG. 4 is a flowchart of a method for determining a relational semantic relevance score between document semantic information and query semantic information according to another embodiment of the present invention.
FIG. 5 is a flowchart of a method for determining a relational semantic relevance score between document semantic information and query semantic information according to still another embodiment of the present invention.
FIG. 6 is a block diagram of a document ranking apparatus according to an embodiment of the present invention.

In all of the above figures, the same reference numbers denote the same, similar, or corresponding features or functions.

The flowcharts and block diagrams in the figures show systems, methods, architectures, functions, and operations that can be implemented by computer program products according to embodiments of the present invention. Each block of a flowchart or block diagram represents a module, program segment, or portion of code that contains one or more executable instructions for performing a specified logical function. Note that in some alternative embodiments the functions shown in the blocks may be performed in an order different from the order shown in the figures. For example, depending on the functions involved, two consecutive blocks may in fact be executed substantially in parallel or in the reverse order. Note also that each block of the block diagrams and flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

Prior-art document ranking methods fall broadly into query-dependent and query-independent types. In a query-dependent method, when a user issues a query, documents are ranked based on the content of the query entered by the user. A query-independent method ranks documents without considering how well a document matches a particular query; ranking documents directly by properties intrinsic to the documents is one example. The document ranking method according to the present invention is query-dependent. According to the method of the present invention, when a query entered by a user is received, the order of a plurality of documents is determined based on the query.

One embodiment of the present invention discloses a document ranking method and apparatus. The document ranking method according to the present invention is executed based on a query entered by a user and can be applied to rank a plurality of documents. According to one embodiment of the present invention, query semantic information is first extracted based on the user's query and an ontology; document semantic information is extracted based on the document, the user's query, and the ontology; a relational semantic relevance score between the document semantic information and the query semantic information is determined; and the documents are ranked based on the determined relational semantic relevance score. Because the method ranks documents using not only the concepts contained in the user's query and in the documents but also a semantic relevance score based on the relations in the query and the documents (also called the "relational semantic relevance score" in the present invention), document ranking accuracy is effectively improved.

For clarity, the terms used in the present invention are explained first.

1. Ontology

Ontology was originally a philosophical term, but today an ontology is understood as "an explicit, formalized description of a shared conceptual model." An ontology can be used to collect knowledge of a domain, provide a common understanding of that knowledge, determine the vocabulary (that is, the concepts) generally recognized in the domain, and provide explicit definitions of these concepts and of the relations among them, formalized at various levels.

From a semantic point of view, the relations between concepts fall broadly into four types, as shown in Table 1.

In practice, the relations between concepts are not limited to these four basic types; other relations can be defined to suit the circumstances of each field.

Widely used ontologies today include WordNet, FrameNet, SENSUS, and Mikrokosmos. WordNet is an English lexicon based on psycholinguistic principles; its unit of organization is the synset, a set of synonyms that are interchangeable in a particular context. FrameNet is also an English lexicon; it adopts a descriptive framework called "frame semantics" to provide strong semantic analysis capability, and has since developed into FrameNet II. GUM is a language-independent scheme for organizing concepts; it is oriented toward natural language processing, supports multilingual processing, and is composed of basic concepts. SENSUS is likewise oriented toward natural language processing; it provides a conceptual structure for machine translation and contains more than 70,000 concepts. Mikrokosmos is also oriented toward natural language processing and supports multilingual processing; it represents knowledge using TMR, an interlingua between languages.

2. Semantic path

A "semantic path" is a sequence of one or more relations between concepts in an ontology. The concepts are extracted, and the path is constructed, on the basis of meaning. Suppose $m$ relations in the ontology are denoted $r'_1, r'_2, \ldots, r'_m$ and the concepts are denoted $d_1, d_2, \ldots, d_m, r_1, \ldots, r_m$. If $r_i$ and $d_{i+1}$ (where $1 \le i < m$) are the same concept, the sequence $r'_1(d_1, r_1), r'_2(d_2, r_2), \ldots, r'_m(d_m, r_m)$ can be called a semantic path between concept $d_1$ and concept $r_m$.

If the semantic path $a = r'_1(d_1, r_1), r'_2(d_2, r_2), \ldots, r'_m(d_m, r_m)$ is called a forward semantic path, then the semantic path $b = r'_q(r_m, d_q), r'_{q-1}(r_{q-1}, d_{q-1}), \ldots, r'_p(r_p, d_1)$ can be called a reverse semantic path.

For example, among the semantic paths between concept A and concept B, let a semantic path from concept A to concept B be a "forward" semantic path, denoted $P_{AB}$. If there is also a semantic path from concept B to concept A, it can be denoted $P_{BA}$ and regarded as a "reverse" semantic path relative to the "forward" one.

Those skilled in the art will understand that, in the embodiments of the present invention, "forward" and "reverse" semantic paths are relative notions, and that it is not essential to designate any particular semantic path as the "forward" or the "reverse" one.
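The definition above can be sketched in code. This is a minimal illustration, not the patent's implementation: the `Step` class, the function names, and the example relations and concepts are all hypothetical. Each step stores one element $r'_i(d_i, r_i)$, and a sequence of steps is a semantic path when every step ends at the concept where the next one starts.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Step:
    """One element r'_i(d_i, r_i) of a semantic path."""
    relation: str  # r'_i, the ontology relation
    source: str    # d_i, the concept the relation starts from
    target: str    # r_i, the concept the relation leads to


def is_semantic_path(steps: Sequence[Step]) -> bool:
    """Check the chaining condition r_i == d_{i+1} for every consecutive pair."""
    if not steps:
        return False
    return all(a.target == b.source for a, b in zip(steps, steps[1:]))


def endpoints(steps: Sequence[Step]) -> Tuple[str, str]:
    """Return (d_1, r_m), the two concepts the path connects."""
    return steps[0].source, steps[-1].target


# Hypothetical relations and concepts, for illustration only.
p_ab = [Step("played_in", "Basketball", "League"), Step("based_in", "League", "USA")]
p_ba = [Step("hosts", "USA", "League"), Step("features", "League", "Basketball")]
broken = [Step("played_in", "Basketball", "League"), Step("based_in", "Team", "USA")]
```

Here `p_ab` and `p_ba` connect the same two concepts in opposite directions, so either one can be treated as the "forward" path and the other as the "reverse" path.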

3. Query semantic information

Query semantic information consists of the concepts contained in the query (the query concept set), the semantic paths between those concepts, and the number of semantic paths between those concepts.

Query semantic information can be represented in various forms. For example, it may be represented as a query graph made up of vertices and edges. Each vertex of the query graph corresponds to a concept in the query concept set, and each edge corresponds to the one or more semantic paths between a pair of concepts. The weight of an edge is the number of semantic paths between the two concepts at its two vertices. As another example, query semantic information may be represented as a text file that records the concepts contained in the query and the semantic paths between each pair of concepts. Query semantic information may also be represented in any other suitable form.
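The graph form described above can be sketched as follows. This is an illustrative sketch only: the function name, the dictionary layout, and the example data are assumptions, and each path is recorded simply as a tuple of hypothetical relation names. Vertices are concepts, an edge exists for each concept pair joined by at least one semantic path, and the edge weight is the number of such paths.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Path = Tuple[str, ...]  # assumption: a path is recorded as a tuple of relation names


def build_semantic_graph(
    concepts: Iterable[str],
    paths_between: Dict[FrozenSet[str], List[Path]],
) -> Tuple[Set[str], Dict[FrozenSet[str], dict]]:
    """Return (vertices, edges); each edge stores its paths and their count."""
    vertices = set(concepts)
    edges = {}
    for a, b in combinations(sorted(vertices), 2):
        pair = frozenset((a, b))
        paths = paths_between.get(pair, [])
        if paths:  # no semantic path between the pair means no edge
            edges[pair] = {"paths": paths, "weight": len(paths)}
    return vertices, edges


# Hypothetical query concepts and paths, for illustration only.
query_concepts = ["basketball", "usa", "sales"]
query_paths = {
    frozenset(("basketball", "usa")): [("sold_in",), ("played_in", "based_in")],
}
vertices, edges = build_semantic_graph(query_concepts, query_paths)
```

The same structure would serve for a document graph, since the text describes both graphs in the same terms.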

4. Document semantic information

In the present invention, a document is not limited to an ordinary document in the narrow sense; it may also be a part of the information in a document, the document itself, metadata describing the document, and so on.

Document semantic information can consist of the concepts contained in the document (called, for example, the "document concept set"), the semantic paths between those concepts, and the number of semantic paths between those concepts.

Document semantic information can be represented in various forms. For example, it may be represented as a document graph made up of vertices and edges. Each vertex of the document graph corresponds to a concept in the document concept set, and each edge corresponds to the one or more semantic paths between a pair of concepts. The weight of an edge is the number of semantic paths between the two concepts at its two vertices. Document semantic information may also be represented in any other suitable form, such as a text file.

5. Concept semantic relevance score

In the present invention, the concept semantic relevance score is a semantic relevance score based on concepts: it expresses, from the viewpoint of concepts, the semantic relevance between a query entered by a user and a document. The concept set extracted from the query reflects, to some extent, the user's information need, and the concept set extracted from a document reflects, to some extent, the content of the document. Computing a relevance score between the query concept set and the document concept set therefore serves to determine how well the user's query matches the document.

6. Relational semantic relevance score

In the present invention, the "relational semantic relevance score" is a semantic relevance score based on relations: it expresses, from the viewpoint of relations, the semantic relevance between a query entered by a user and a document. Relations are very important for understanding both the user's query request and the content described by a document. For example, if a user queries with the two keywords "basketball" and "America", what the user actually wants may be "basketball sales in the United States", "basketball games in the United States", and so on. Suppose there are two documents to be ranked, both of which contain the two concepts "basketball" and "America", but one describes "basketball production in the United States" and the other describes "basketball games in the United States". To decide which of the two documents is more relevant to the query, the latent semantic relations in the user's query and in each document must be extracted, and a relevance score between the two extracted sets of relations must be computed to judge whether the query matches the document. According to the present invention, the probability that the semantic relations described in a document match the semantic relations the user is asking for is calculated, yielding the relational semantic relevance score between the query and the document.

FIG. 1 is a flowchart of a document ranking method according to an embodiment of the present invention.

In step S101, query semantic information is extracted based on the user's query and an ontology.

In the present invention, query semantic information can consist of the concepts extracted from the query entered by the user and the semantic paths between those concepts. In an embodiment of the present invention, the extraction in step S101 proceeds as follows: the query concept set contained in the user's query is extracted based on the ontology; the semantic paths between each pair of concepts in the query concept set are obtained based on the ontology; and the number of semantic paths between each pair of concepts is determined from those semantic paths.

Step S101 thus determines the concepts contained in the user's query, the semantic paths that exist between those concepts, and the number of semantic paths between each pair of concepts.
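The three parts of step S101 can be sketched as below. This is an illustrative sketch under stated assumptions, not the patent's procedure: the toy ontology, the word-matching concept extraction, the depth-first path search, and the `max_len` bound are all hypothetical choices made only to give a runnable example.

```python
import re
from itertools import combinations

# Toy ontology as (relation, source concept, target concept) triples; hypothetical data.
ONTOLOGY = [
    ("played_in", "basketball", "league"),
    ("based_in", "league", "usa"),
    ("sold_in", "basketball", "usa"),
]


def extract_concepts(text, ontology):
    """Assumed extraction rule: keep the words that name a concept in the ontology."""
    vocabulary = {c for _, s, t in ontology for c in (s, t)}
    return {tok for tok in re.findall(r"\w+", text.lower()) if tok in vocabulary}


def find_paths(ontology, start, goal, max_len=3):
    """Enumerate directed paths from start to goal with at most max_len relations."""
    found = []

    def dfs(node, path, visited):
        if node == goal and path:
            found.append(tuple(path))
            return
        if len(path) == max_len:
            return
        for rel, s, t in ontology:
            if s == node and t not in visited:
                dfs(t, path + [(rel, s, t)], visited | {t})

    dfs(start, [], {start})
    return found


def query_semantic_info(query, ontology):
    """Return the query concept set plus paths and path counts per concept pair."""
    concepts = extract_concepts(query, ontology)
    info = {}
    for a, b in combinations(sorted(concepts), 2):
        paths = find_paths(ontology, a, b) + find_paths(ontology, b, a)
        info[(a, b)] = {"paths": paths, "count": len(paths)}
    return concepts, info


concepts, info = query_semantic_info("basketball USA", ONTOLOGY)
```

In this toy ontology the pair ("basketball", "usa") is joined by one direct path and one two-step path through "league".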

In the embodiments of the present invention, the number of semantic paths between each pair of concepts in the query concept set is determined, and there are several ways to optimize that number. In one embodiment, when the number of semantic paths between each pair of concepts is obtained, the set of forward semantic paths and the set of reverse semantic paths between the pair are determined, so that forward and reverse semantic paths counted more than once can be excluded. In another embodiment, the set of forward semantic paths and the set of reverse semantic paths are each optimized by removing redundant paths, which in turn optimizes the number of semantic paths obtained for each pair of concepts. In still another embodiment, the number of semantic paths obtained for each pair of concepts is optimized by excluding the count of mutual path pairs determined from the set of forward semantic paths and the set of reverse semantic paths.
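One possible reading of these optimizations is sketched below. The text here does not define a "mutual path pair", so the sketch assumes that a reverse path forms a mutual pair with a forward path when it visits the same concepts in the opposite order; that rule, the function names, and the example paths are all assumptions made for illustration.

```python
def concept_sequence(path):
    """Concepts visited by a path given as a tuple of (relation, source, target) steps."""
    return (path[0][1],) + tuple(target for _, _, target in path)


def count_semantic_paths(forward, reverse):
    """Count paths for one concept pair, counting each assumed mutual pair once."""
    forward, reverse = set(forward), set(reverse)  # drop exact duplicates
    forward_sequences = {concept_sequence(p) for p in forward}
    # Assumption: a reverse path mirrors a forward path if it visits the same
    # concepts in the opposite order.
    mutual_pairs = sum(
        1 for p in reverse if concept_sequence(p)[::-1] in forward_sequences
    )
    return len(forward) + len(reverse) - mutual_pairs


# Hypothetical paths between "basketball" and "usa".
forward_paths = [
    (("sold_in", "basketball", "usa"),),
    (("sold_in", "basketball", "usa"),),  # duplicate entry
    (("played_in", "basketball", "league"), ("based_in", "league", "usa")),
]
reverse_paths = [(("imports", "usa", "basketball"),)]
```

With these inputs the duplicate forward path is dropped and the single reverse path mirrors the direct forward path, so the pair is counted as two paths instead of four.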

In step S102, document semantic information is extracted based on the document, the query, and the ontology.

In the present invention, document semantic information can consist of the concepts extracted from a document to be ranked and the semantic paths between those concepts. In one embodiment of the present invention, the extraction of document semantic information in step S102 can be carried out in a number of ways.

For example, the concept set contained in the document and the concept set contained in the query are extracted based on the ontology; the document concept set is obtained as the intersection of these two concept sets; the semantic paths between each pair of concepts in the document concept set are obtained based on the document; and the number of semantic paths between each pair of concepts is determined from those semantic paths.
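The intersection step in this example can be sketched as follows. This is a minimal sketch, assuming that concepts are found by matching words against a known vocabulary of ontology concept names; the vocabulary, the function names, and the sample text are hypothetical.

```python
import re


def extract_concepts(text, vocabulary):
    """Assumed extraction rule: keep the words that name an ontology concept."""
    return {tok for tok in re.findall(r"\w+", text.lower()) if tok in vocabulary}


def document_concept_set(document, query, vocabulary):
    """Document concept set as the intersection of document and query concepts."""
    return extract_concepts(document, vocabulary) & extract_concepts(query, vocabulary)


# Hypothetical vocabulary and text, for illustration only.
VOCABULARY = {"basketball", "usa", "league", "factory"}
doc = "A factory in the USA produces every basketball used by the league."
shared = document_concept_set(doc, "basketball USA", VOCABULARY)
```

Restricting the document concept set to concepts that also occur in the query keeps the later path extraction focused on the concept pairs the query actually asks about.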

In another example, all concepts contained in a document are extracted in advance, and the semantic paths between all of those concepts are then obtained. When a query is received, the query concept set is obtained and matched against the concepts in the document to obtain the corresponding document semantic information. Step S102 thus determines, for each of the documents to be ranked, the concepts it contains, the semantic paths that exist between those concepts, and the number of semantic paths between each pair of concepts.
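The precomputation variant can be sketched as a small index built before any query arrives. The class name, its methods, and the sample documents are hypothetical, and the sketch covers only the concept part; precomputing the semantic paths between the stored concepts is omitted for brevity.

```python
import re


class DocumentConceptIndex:
    """Stores the concepts of every document ahead of query time (illustrative only)."""

    def __init__(self, documents, vocabulary):
        # Assumed extraction rule: words that name a concept in the vocabulary.
        self._concepts = {
            doc_id: {t for t in re.findall(r"\w+", text.lower()) if t in vocabulary}
            for doc_id, text in documents.items()
        }

    def match(self, query_concepts):
        """Return, per document, the stored concepts that also occur in the query."""
        query_concepts = set(query_concepts)
        return {doc_id: c & query_concepts for doc_id, c in self._concepts.items()}


# Hypothetical documents and vocabulary.
index = DocumentConceptIndex(
    {"d1": "Basketball production in the USA", "d2": "Football league results"},
    vocabulary={"basketball", "usa", "league"},
)
matched = index.match({"basketball", "usa"})
```

The trade-off is the usual one for indexing: more work and storage up front in exchange for less work per query.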

本発明の実施例においては、ドキュメント概念集合内の各概念ペア間の意味パス数が決定されるが、その数を最適化する方法は多数ある。一実施例においては、各概念ペア間の意味パス数を取得する際に、各概念ペア間の順方向意味パスの集合と逆方向意味パスの集合とを決定することによって、重複してカウントされた順方向意味パスと逆方向意味パスを除外することができる。他の実施例においては、順方向意味パスの集合および逆方向意味パスの集合は、順方向意味パスの集合および逆方向意味パスの集合からそれぞれ冗長パスを除外することによって最適化され、その結果、取得された各概念ペア間の意味パス数が最適化される。さらに他の実施例においては、取得された各概念ペア間の意味パス数は、順方向意味パスの集合および逆方向意味パスの集合に基づいて決定された相互パスペアのカウント結果を除外することによって、最適化される。 In an embodiment of the present invention, the number of semantic paths between each concept pair in the document concept set is determined, but there are many ways to optimize that number. In one embodiment, when acquiring the number of semantic paths between each concept pair, the number of forward semantic paths between each concept pair and the set of backward semantic paths are determined to be counted redundantly. The forward semantic path and the backward semantic path can be excluded. In another embodiment, the set of forward semantic paths and the set of backward semantic paths are optimized by excluding redundant paths from the set of forward semantic paths and the set of backward semantic paths, respectively. The number of semantic paths between each acquired concept pair is optimized. In yet another embodiment, the number of semantic paths between each acquired concept pair is determined by excluding the count result of the mutual path pairs determined based on the set of forward semantic paths and the set of backward semantic paths. Optimized.

It will be understood that steps S101 and S102 need not be executed in this order. In another embodiment of the present invention, step S102 is executed first, followed by step S101. Steps S101 and S102 may also be executed simultaneously. The execution order of steps S101 and S102 shown in the embodiment of FIG. 1 is merely an example and is not intended to limit the present invention.

In step S103, a relational semantic relevance score between the document semantic information and the query semantic information is determined.

In one embodiment of the present invention, the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information are acquired, and the relational semantic relevance score between the document semantic information and the query semantic information is determined based on the acquired numbers of semantic paths. FIGS. 3 to 5 show three embodiments of the present invention for determining the relational semantic relevance score between the document semantic information and the query semantic information. These are described in detail below.

In step S104, the documents are ranked based on the relational semantic relevance score.

Step S104 can be performed in a number of ways.

In one embodiment, the documents are ranked by directly sorting the relational semantic relevance scores obtained for the respective documents in descending order or in another suitable order.

In another embodiment, a conceptual semantic relevance score between each document and the query is obtained, a document score is determined based on the relational semantic relevance score and the conceptual semantic relevance score, and the documents are ranked based on the document score.

In yet another embodiment, a conceptual semantic relevance score between each document and the query is obtained, the documents are ranked based on the conceptual semantic relevance score, the ranked documents are grouped, and the documents within each group are ranked based on the relational semantic relevance score.
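The first and third ranking variants above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names and scores are hypothetical, and the grouping rule (documents with an identical conceptual score form one group) is an assumption, since the text does not say how groups are formed.

```python
from itertools import groupby

def rank_direct(relational):
    """Rank document ids by relational semantic relevance score, descending."""
    return sorted(relational, key=relational.get, reverse=True)

def rank_grouped(conceptual, relational):
    """Rank by conceptual score, then re-rank inside each group by relational score.

    Assumption: a group is a run of documents sharing the same conceptual score.
    """
    by_concept = sorted(conceptual, key=conceptual.get, reverse=True)
    ranked = []
    for _, group in groupby(by_concept, key=conceptual.get):
        ranked.extend(sorted(group, key=relational.get, reverse=True))
    return ranked

# Hypothetical scores for three documents.
relational = {"d1": 0.2, "d2": 0.9, "d3": 0.5}
conceptual = {"d1": 1.0, "d2": 0.5, "d3": 1.0}
print(rank_direct(relational))
print(rank_grouped(conceptual, relational))
```

In the grouped variant the relational score only breaks ties inside a group, so a document with a lower conceptual score never overtakes one with a higher conceptual score.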

This concludes the flow of FIG. 1.

It will be understood that the extraction of document semantic information based on the documents, the query, and the ontology according to the present invention can be carried out in a number of specific ways.

In one example of the present invention, the extraction of document semantic information is triggered by the user's query and proceeds as follows: the concept set contained in the document and the concept set contained in the query are extracted based on the ontology; the document concept set is obtained based on the intersection of the concept set contained in the document and the concept set contained in the query; the semantic paths between each concept pair in the document concept set are obtained based on the document; and the number of semantic paths between each concept pair is determined based on those semantic paths.

In another example of the present invention, preprocessing of the documents is completed before the user's query is received (for example, offline), or is performed in the background while other queries are being processed. The concepts contained in the documents and the semantic paths between these concepts can therefore be extracted in advance based on the ontology and stored in a database or in memory. When the user issues a query, the intersection of the concept set contained in the document and the concept set contained in the query is retrieved from the database or memory, and the document concept set is obtained based on this intersection. The semantic paths between each concept pair in the document concept set are then obtained from the semantic paths stored in the database or memory, and the number of semantic paths is determined. This example can be implemented as offline query processing.
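The preprocessing variant can be pictured as an in-memory index that is filled before any query arrives and only intersected at query time. The sketch below is an assumption-laden illustration: `DocEntry`, `lookup`, and the stored contents are hypothetical, and how concepts and paths are extracted from a document is left out entirely.

```python
from dataclasses import dataclass, field

@dataclass
class DocEntry:
    """Hypothetical record stored per document during preprocessing."""
    concepts: set
    paths: dict = field(default_factory=dict)  # (source, target) -> list of paths

def lookup(index, query_concepts):
    """At query time, keep only the stored concepts and paths shared with the query."""
    result = {}
    for doc_id, entry in index.items():
        common = entry.concepts & query_concepts
        paths = {pair: p for pair, p in entry.paths.items()
                 if pair[0] in common and pair[1] in common}
        result[doc_id] = (common, paths)
    return result

# Index built offline (contents are made up for illustration).
index = {
    "doc1": DocEntry({"basketball", "store", "game"},
                     {("store", "basketball"): [("sell",)]}),
    "doc2": DocEntry({"America", "basketball"},
                     {("America", "basketball"): [("produce",)]}),
}
hits = lookup(index, {"America", "basketball"})
```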

FIG. 2 is a flowchart of a document ranking method according to another embodiment of the present invention.

In step S201, the query concept set contained in the user's query is extracted based on the ontology.

In this step, the query content entered by the user is first received. Here it is assumed that the user has entered "American basketball" as the query in order to retrieve and display the desired documents. In the present invention, "document" means any file available to those skilled in the art, such as a Web page, a plain text file, a PDF file, a Word file, a PowerPoint file, or an Excel file.

The concepts contained in the user's query can be determined in a number of ways based on the ontology. Many methods for extracting concepts from text already exist. Examples include the concept recognition method described in Non-Patent Document 1 ("Unsupervised information extraction from unstructured, ungrammatical data sources on the World Wide Web", International Journal on Document Analysis and Recognition, 2007, vol. 10, no. 3-4, pages 211-226), the concept recognition method described in Non-Patent Document 2 ("Efficiently linking text documents with relevant structured information", in Proceedings of VLDB 2006), and the concept recognition method described in Non-Patent Document 3 ("Graph-Based Concept Identification and Disambiguation for Enterprise Search", in Proceedings of WWW 2010).

In this embodiment of the present invention, it is assumed that the concepts contained in the query "American basketball" entered by the user are determined to be "America" and "basketball", so that in step S201 the query concept set is determined to be {"America", "basketball"}.

In step S202, the semantic paths between each concept pair in the query concept set are acquired based on the ontology.

The ontology contains a large number of known concepts and the semantic paths between these concepts. By looking up the concepts "America" and "basketball" of the query concept set in the ontology, the semantic paths that exist in the ontology between the two concepts "America" and "basketball" are determined. For example, assume that there are five semantic paths: <produce (America, basketball)>, <sell (America, basketball)>, <hold (America, basketball game)>, <use (basketball game, basketball)>, and <produced in (basketball, America)>.
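One way to picture this lookup is to store the ontology as relation triples and enumerate directed paths between the two query concepts. The sketch below rests on an interpretation that the text does not spell out: the five listed items are read as single relations, two of which ("hold" and "use") chain through "basketball game" to connect "America" to "basketball". The function `find_paths` and the `max_hops` limit are hypothetical.

```python
def find_paths(triples, source, target, max_hops=3):
    """Return relation-name tuples for every simple directed path source -> target."""
    outgoing = {}
    for relation, head, tail in triples:
        outgoing.setdefault(head, []).append((relation, tail))

    paths = []

    def walk(node, relations, visited):
        if node == target and relations:
            paths.append(tuple(relations))
            return
        if len(relations) == max_hops:
            return
        for relation, nxt in outgoing.get(node, []):
            if nxt not in visited:
                walk(nxt, relations + [relation], visited | {nxt})

    walk(source, [], {source})
    return paths

# The five relations from the example, as (relation, from, to) triples.
ONTOLOGY = [
    ("produce", "America", "basketball"),
    ("sell", "America", "basketball"),
    ("hold", "America", "basketball game"),
    ("use", "basketball game", "basketball"),
    ("produced in", "basketball", "America"),
]
forward = find_paths(ONTOLOGY, "America", "basketball")
backward = find_paths(ONTOLOGY, "basketball", "America")
```

Under this reading, `forward` holds the paths leading from "America" to "basketball" and `backward` the paths in the opposite direction, which is the split that the counting in step S203 relies on.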

In step S203, the number of semantic paths between each concept pair is determined based on the semantic paths between each concept pair in the query concept set, and the query semantic information is obtained.

In one embodiment of the present invention, the set of forward semantic paths and the set of backward semantic paths between each concept pair are determined based on the semantic paths between each concept pair in the query concept set, and the number of semantic paths between each concept pair is then obtained from the number of elements in the set of forward semantic paths and the number of elements in the set of backward semantic paths. For example, for a query concept set containing the two concepts "America" and "basketball", the semantic paths from the concept "America" to the concept "basketball" are identified among the semantic paths acquired in step S202, which yields the set of forward semantic paths between the two concepts. Similarly, the semantic paths from the concept "basketball" to the concept "America" are identified among the semantic paths acquired in step S202, which yields the set of backward semantic paths between the two concepts. The number of elements in the set of forward semantic paths and the number of elements in the set of backward semantic paths are then computed, and their sum is taken as the number of semantic paths between the two concepts "America" and "basketball".

In another embodiment of the present invention, in addition to determining the set of forward semantic paths and the set of backward semantic paths between each concept pair based on the semantic paths between each concept pair in the query concept set, redundant paths are excluded from the set of forward semantic paths to optimize it, and redundant paths are excluded from the set of backward semantic paths to optimize it. The number of semantic paths between each concept pair is then obtained from the number of elements in the optimized set of forward semantic paths and the number of elements in the optimized set of backward semantic paths. For example, for a query concept set containing the two concepts "America" and "basketball", the set of forward semantic paths from the concept "America" to the concept "basketball" and the set of backward semantic paths are identified among the semantic paths acquired in step S202. Redundant paths are then searched for in the set of forward semantic paths and in the set of backward semantic paths, and excluding them optimizes each of the two sets. Finally, the number of elements in the optimized set of forward semantic paths and the number of elements in the optimized set of backward semantic paths are added, and their sum is taken as the number of semantic paths between the concepts "America" and "basketball".

In the present invention, if $r_m(C_1, C_2) \wedge r_n(C_2, C_3) \rightarrow r_p(C_1, C_3)$, then the semantic path $r_1 \ldots r_m r_n \ldots r_q$ between the concepts $C_1$ and $C_3$ is regarded as redundant with respect to another semantic path $r_1 \ldots r_p \ldots r_q$ (where $C_1$, $C_2$, and $C_3$ are three concepts, $r_1, \ldots, r_m, \ldots, r_n, \ldots, r_p, \ldots, r_q$ are relations between concepts, and the symbol $\wedge$ denotes the relation "AND").
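The redundancy rule can be sketched as a filter over a set of paths. This is a simplified illustration under stated assumptions: a path is represented only by its sequence of relation names (all paths in one set are assumed to share the same endpoints), and the composition rule in `RULES` is a made-up example rather than one taken from the patent.

```python
def remove_redundant(paths, rules):
    """Drop every path that a composition rule rewrites into another path in the set.

    rules maps a pair of consecutive relations (r_m, r_n) to the implied relation r_p.
    """
    path_set = set(paths)

    def is_redundant(path):
        for i in range(len(path) - 1):
            implied = rules.get((path[i], path[i + 1]))
            if implied is not None:
                # Replace r_m r_n by r_p and check whether that path already exists.
                shorter = path[:i] + (implied,) + path[i + 2:]
                if shorter in path_set:
                    return True
        return False

    return [p for p in paths if not is_redundant(p)]

RULES = {("located in", "part of"): "located in"}  # hypothetical rule
paths = [("located in", "part of"), ("located in",), ("borders",)]
kept = remove_redundant(paths, RULES)
```

A path is dropped only when the shorter path it reduces to is itself present in the set, which matches the wording "redundant with respect to another semantic path".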

In another embodiment of the present invention, in addition to determining the set of forward semantic paths and the set of backward semantic paths between each concept pair based on the semantic paths between each concept pair in the query concept set, mutual path pairs are determined based on the set of forward semantic paths and the set of backward semantic paths. The number of semantic paths between each concept pair is then determined from the number of elements in the set of forward semantic paths, the number of elements in the set of backward semantic paths, and the number of mutual path pairs. For example, for a query concept set containing the two concepts "America" and "basketball", the set of forward semantic paths from the concept "America" to the concept "basketball" and the set of backward semantic paths are identified among the semantic paths acquired in step S202. Mutual path pairs are then determined based on the two sets. Finally, the number of mutual path pairs is subtracted from the sum of the number of elements in the set of forward semantic paths and the number of elements in the set of backward semantic paths, and the result is taken as the number of semantic paths between the two concepts "America" and "basketball".

In the present invention, let $S_{ij}$ be the set of forward semantic paths between the concepts $C_i$ and $C_j$, and let $S_{ji}$ be the set of backward semantic paths. If a path $l_1$ is an element of the forward set, that is, $l_1 \in S_{ij}$ with $l_1 = r_1(C_1, C_2), \ldots, r_m(C_{2m-1}, C_{2m})$, and a path $l_2$ is an element of the backward set, that is, $l_2 \in S_{ji}$ with $l_2 = r_m^{-1}(C_{2m}, C_{2m-1}), \ldots, r_1^{-1}(C_2, C_1)$, where $r^{-1}$ denotes the inverse relation of $r$, then $(l_1, l_2)$ is a mutual path pair.
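The mutual-path-pair count and the resulting path number can be sketched as below. The sketch makes several assumptions that go beyond the text: paths are reduced to tuples of relation names, the concept arguments are not compared, and the `INVERSE` table (treating "produced in" as the inverse of "produce") is an illustrative guess. The sample path sets follow one possible reading of the "America" and "basketball" example.

```python
def count_mutual_pairs(forward, backward, inverse):
    """Count pairs (l1, l2) where l2 is l1 reversed with every relation inverted."""
    backward_set = set(backward)
    count = 0
    for path in forward:
        if all(r in inverse for r in path):
            mirrored = tuple(inverse[r] for r in reversed(path))
            if mirrored in backward_set:
                count += 1
    return count

def path_count(forward, backward, inverse):
    """Forward paths plus backward paths, minus the mutual path pairs."""
    return len(forward) + len(backward) - count_mutual_pairs(forward, backward, inverse)

INVERSE = {"produce": "produced in"}  # assumed inverse relation
forward = [("produce",), ("sell",), ("hold", "use")]
backward = [("produced in",)]
total = path_count(forward, backward, INVERSE)
```

Subtracting the pair count keeps a relation and its inverse from being counted twice for the same concept pair.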

The query semantic information is constructed from the query concept set contained in the user's query, the semantic paths between each concept pair in the query concept set, and the number of semantic paths. As noted above, the query semantic information can be represented in a number of ways. For example, it can be represented as a query graph based on graph theory, in which each vertex of the query graph corresponds to one concept in the query concept set, each edge corresponds to the semantic paths between two concepts, and the weight of each edge corresponds to the number of semantic paths between those two concepts. The query semantic information can also be represented as a text file. Those skilled in the art will understand that the query graph and text file mentioned here are merely examples, and that the query semantic information is not limited to them and can be represented in many other suitable forms.
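A query graph of this kind can be held in a plain adjacency dictionary. The following is only a sketch: `build_graph` is a hypothetical helper, the edge is treated as undirected because the path number already combines both directions, pairs with zero paths are simply left without an edge, and the weight 4 is an illustrative value.

```python
def build_graph(concepts, path_counts):
    """Weighted graph: vertex = concept, edge weight = number of semantic paths."""
    graph = {c: {} for c in concepts}
    for (a, b), n in path_counts.items():
        if n > 0:  # assumption: a pair without paths gets no edge
            graph[a][b] = n
            graph[b][a] = n
    return graph

query_graph = build_graph({"America", "basketball"},
                          {("America", "basketball"): 4})
```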

In step S204, the concept set contained in the document and the concept set contained in the query are extracted based on the ontology.

In the present invention, "document" means any file available to those skilled in the art, such as a Web page, a plain text file, a PDF file, a Word file, a PowerPoint file, or an Excel file.

As described above, the concepts contained in the user's query are determined in one of a number of ways based on the ontology, whereby the concept set contained in the query is extracted. Similarly, the concepts contained in the document are determined in one of a number of ways based on the ontology, whereby the concept set contained in the document is extracted.

In step S204, the operation of extracting the concept set contained in the document and the operation of extracting the concept set contained in the query are both performed. It will be understood that the two operations may be performed simultaneously or one after the other, and that their order is merely illustrative and not essential.

In one example of the present invention, the concept set contained in the document is extracted before the user's query is received (that is, the document is preprocessed). The semantic paths between the concepts obtained by preprocessing the document can be stored in a database or in memory. When the user's query is subsequently received, the concept set contained in the query is extracted based on the ontology. The document concept set is then obtained based on the semantic paths between the concepts obtained by preprocessing the document and the query concepts.

In step S205, the document concept set is obtained based on the intersection of the concept set contained in the document and the concept set contained in the query.

In the present invention, the way the document concept set is obtained is not necessarily the same as the way the query concept set is obtained. The query concept set acquired in step S201 is extracted directly from the user's query based on the ontology. The document concept set acquired in step S205 contains the same concepts as the query concept set, but these concepts can be classified into virtual concepts and common concepts.

A concept that is obtained, based on the ontology, from the intersection of the concept set extracted from the document and the query concept set (that is, the concept set contained in the query) is a common concept. For example, if in step S204 the concept set extracted from the document based on the ontology is {"basketball", "store", "game"} and the concept set extracted from the query based on the ontology is {"America", "basketball"}, the intersection of the two concept sets is determined to be {"basketball"}, so "basketball" is a common concept.

In step S204, the concepts extracted from the document based on the ontology do not include the concept "America". Accordingly, in the present invention, when the document concept set is determined to contain the concepts "America" and "basketball", "America" in the document concept set {"America", "basketball"} is regarded as a virtual concept. When the semantic paths between the concepts in the document concept set are subsequently determined, the number of semantic paths between a virtual concept and a common concept is always set to 0.
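The split into common and virtual concepts, and the zeroing of path numbers that involve a virtual concept, can be sketched as follows. The function names and the `counted` dictionary are hypothetical; the concept sets are the ones from the example above.

```python
def document_concept_set(doc_concepts, query_concepts):
    """Split the query concepts into common (found in the document) and virtual (not found)."""
    common = query_concepts & doc_concepts
    virtual = query_concepts - doc_concepts
    return common, virtual

def document_path_count(pair, virtual, counted):
    """Path number for a concept pair; 0 whenever either concept is virtual."""
    a, b = pair
    if a in virtual or b in virtual:
        return 0
    return counted.get(pair, 0)

common, virtual = document_concept_set({"basketball", "store", "game"},
                                       {"America", "basketball"})
n = document_path_count(("America", "basketball"), virtual, {})
```

Keeping the virtual concept in the set lets the document concept set line up one-to-one with the query concept set, while the forced zero ensures a missing concept never contributes to the document's score.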

In step S206, the semantic paths between each concept pair in the document concept set are acquired based on the document.

Unlike step S202, in step S206 the semantic paths between each concept pair in the document concept set are determined on the basis of both the document and the ontology. As a result, the features and characteristics of the document itself are reflected more fully, which makes it easier to determine how well the document matches the query.

In step S207, the number of semantic paths between each concept pair is determined based on the semantic paths between each concept pair in the document concept set, and the document semantic information is obtained.

In one embodiment of the present invention, the set of forward semantic paths and the set of backward semantic paths between each concept pair are determined based on the semantic paths between each concept pair in the document concept set, and the number of semantic paths between each concept pair is then obtained from the number of elements in the set of forward semantic paths and the number of elements in the set of backward semantic paths.

In another embodiment of the present invention, in addition to determining the set of forward semantic paths and the set of backward semantic paths between each concept pair based on the semantic paths between each concept pair in the document concept set, redundant paths are excluded from the set of forward semantic paths to optimize it, and redundant paths are excluded from the set of backward semantic paths to optimize it. The number of semantic paths between each concept pair is then obtained from the number of elements in the optimized set of forward semantic paths and the number of elements in the optimized set of backward semantic paths.

In another embodiment of the present invention, in addition to determining the set of forward semantic paths and the set of backward semantic paths between each concept pair based on the semantic paths between each concept pair in the document concept set, mutual path pairs are determined based on the set of forward semantic paths and the set of backward semantic paths. The number of semantic paths between each concept pair is then determined from the number of elements in the set of forward semantic paths, the number of elements in the set of backward semantic paths, and the number of mutual path pairs. In this embodiment, "mutual path pair" is defined as in step S203.

Note that in each of the above embodiments, when the number of semantic paths in the set of forward semantic paths and the number of semantic paths in the set of backward semantic paths are determined, the number of semantic paths between a virtual concept and a common concept is always set to 0.

The document semantic information is constructed from the document concept set contained in the document, the semantic paths between each concept pair in the document concept set, and the number of semantic paths. As noted above, the document semantic information can be constructed in a number of ways. For example, it can be represented as a document graph based on graph theory, in which each vertex of the document graph corresponds to one concept in the document concept set, each edge corresponds to the semantic paths between two concepts, and the weight of each edge corresponds to the number of semantic paths between those two concepts. The document semantic information can also be represented as a text file. Those skilled in the art will understand that the document graph and text file mentioned here are merely examples, and that the document semantic information is not limited to them and can be represented in many other suitable forms.

In step S208, the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information are acquired.

In step S209, the relational semantic relevance score between the document semantic information and the query semantic information is determined based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information.

Step S209 can be performed in a number of ways. FIGS. 3 to 5 each show, according to an embodiment of the present invention, a method of determining the relational semantic relevance score between the document semantic information and the query semantic information based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information.

FIG. 3 is a flowchart of a method of determining the relational semantic relevance score between the document semantic information and the query semantic information according to one embodiment of the present invention.

In step S301, the total number of semantic paths in the document semantic information is calculated as the document value. In this step, the number of semantic paths between each concept pair in the document semantic information is first obtained, and these numbers are then summed. In another embodiment of the present invention, the total is optimized, for example, by further subtracting the number of redundant paths and the number of mutual path pairs from it.

In step S302, the total number of semantic paths in the query semantic information is calculated as the query value. In this step, the number of semantic paths between each concept pair in the query semantic information is first obtained, and these numbers are then summed. In another embodiment of the present invention, the total is optimized, for example, by subtracting the number of redundant paths and the number of mutual path pairs from it.

In step S303, the ratio of the document value to the query value is determined as the relational semantic relevance score between the document semantic information and the query semantic information. This concludes the flow of FIG. 3.
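The flow of FIG. 3 reduces to two sums and one division. The sketch below uses hypothetical per-pair path numbers; returning 0.0 when the query value is 0 is an assumption added here, since the text does not say how that case is handled.

```python
def relational_score(doc_counts, query_counts):
    """Ratio of the document value to the query value (steps S301 to S303)."""
    doc_value = sum(doc_counts.values())      # S301: total paths in the document
    query_value = sum(query_counts.values())  # S302: total paths in the query
    if query_value == 0:
        return 0.0  # assumption: not covered by the text
    return doc_value / query_value            # S303

# Hypothetical path numbers for a single concept pair.
doc_counts = {("America", "basketball"): 2}
query_counts = {("America", "basketball"): 4}
score = relational_score(doc_counts, query_counts)
```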

図４は、本発明の他の実施例による、ドキュメント意味情報とクエリ意味情報間の関係意味関連度スコア判定方法のフローチャートである。 FIG. 4 is a flowchart of a relationship semantic relevance score determination method between document semantic information and query semantic information according to another embodiment of the present invention.

In step S401, the concept set included in the query semantic information is acquired.

In one embodiment according to the present invention, suppose that the concept set included in the query semantic information is {"USA", "basketball", "game"}. According to the present invention, the concept set included in the document semantic information is the same as the concept set included in the query semantic information, except that the concept set included in the document semantic information contains virtual concepts, common concepts, or both. For example, all concepts in the set may be common concepts, all may be virtual concepts, or the set may contain both common and virtual concepts.

In step S402, the number of document semantic paths between each concept pair in the concept set is determined based on the document semantic information.

When determining the number of semantic paths between each concept pair in the concept set, the presence of virtual concepts must be taken into account. If at least one of the two concepts in a pair is a virtual concept, the number of semantic paths between the two concepts is 0.

Note further that the number of document semantic paths between each concept pair in the concept set is determined based on the document semantic information, not on the ontology.

In step S403, the number of query semantic paths between each concept pair in the concept set is determined based on the query semantic information.

Note that the number of query semantic paths between each concept pair in the concept set is determined based on the query semantic information, not on the ontology.

In step S404, the ratio of the number of document semantic paths to the number of query semantic paths is calculated for each concept pair.

In step S405, the product of these ratios is determined as the relationship semantic relevance score between the document semantic information and the query semantic information.

For example, let the number of document semantic paths between each concept pair be $\lambda_i$ and the number of query semantic paths between each concept pair be $\eta_i$ (where $i$ is any number from 1 to $K$, and $K$ is the total number of concept pairs that can be formed from all the concepts in the concept set). The relationship semantic relevance score $\mathrm{Score}_R$ between the document semantic information and the query semantic information is then expressed by the following equation.

$$\mathrm{Score}_R = \prod_{i=1}^{K} \frac{\lambda_i}{\eta_i} \tag{1}$$

The flow of FIG. 4 ends here.
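The flow of FIG. 4 (steps S401 to S405) can be sketched as below. The function name and data layout are hypothetical. The source does not say how a concept pair with zero query semantic paths is treated, so skipping such pairs is an assumption of this sketch.

```python
from itertools import combinations
from math import prod


def relationship_score_by_product(concepts, doc_counts, query_counts,
                                  virtual=frozenset()):
    """Product over concept pairs of (document paths / query paths).

    Path counts are dicts keyed by alphabetically sorted concept pairs.
    """
    ratios = []
    for a, b in combinations(sorted(concepts), 2):
        eta = query_counts.get((a, b), 0)       # S403: query semantic paths
        if eta == 0:
            continue  # Assumption: pairs without query paths are skipped.
        # S402: a pair involving a virtual concept has 0 document paths.
        if a in virtual or b in virtual:
            lam = 0
        else:
            lam = doc_counts.get((a, b), 0)
        ratios.append(lam / eta)                # S404: per-pair ratio
    return prod(ratios) if ratios else 0.0      # S405: product of the ratios
```

Because the score is a product, a single pair with no document semantic paths (for instance, one involving a virtual concept) drives the whole score to 0.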

FIG. 5 is a flowchart of a method for determining the relationship semantic relevance score between document semantic information and query semantic information according to another embodiment of the present invention.

In step S501, a set of document spanning trees is determined based on the document semantic information.

As described above, the document semantic information can be represented as a document graph based on graph theory. It is well known in graph theory that a document graph can be decomposed into a plurality of mutually distinct spanning trees, none of which contains a cycle. The spanning trees obtained by decomposing the document graph constitute the document spanning tree set.

In step S502, a query spanning tree set is determined based on the query semantic information.

As in step S501, the query semantic information can be represented as a query graph based on graph theory, and this query graph can be decomposed into a plurality of mutually distinct spanning trees, none of which contains a cycle. The spanning trees obtained by decomposing the query graph constitute the query spanning tree set.

In step S503, the total number of combinations of document semantic relationships described by each document spanning tree in the document spanning tree set is calculated based on the number of semantic paths in the document semantic information.

In step S504, the total number of combinations of query semantic relationships described by each query spanning tree in the query spanning tree set is calculated based on the number of semantic paths in the query semantic information.

In step S505, the semantic pair score of each spanning tree pair is determined based on the total number of combinations of document semantic relationships and the total number of combinations of query semantic relationships.

A spanning tree pair consists of one query spanning tree from the query spanning tree set and the corresponding document spanning tree from the document spanning tree set. The two spanning trees of a pair correspond to each other one-to-one.

Let the weights of the edges between each two vertices (e.g., corresponding concepts) of each document spanning tree be $\lambda_1, \lambda_2, \ldots, \lambda_K$, and let the weights of the edges between each two vertices (e.g., corresponding concepts) of each query spanning tree be $\eta_1, \eta_2, \ldots, \eta_K$ (where $K$ is the total number of concept pairs that can be formed from all the concepts in the concept set). The semantic relevance score $\mathrm{Score}_{tree}$ of each spanning tree pair is then expressed by the following equation.

$$\mathrm{Score}_{tree} = \frac{\prod_{i=1}^{K} \lambda_i}{\prod_{i=1}^{K} \eta_i} \tag{2}$$

In Equation (2), the numerator is the total number of combinations of document semantic relationships described by each document spanning tree in the document spanning tree set, obtained in step S503, and the denominator is the total number of combinations of query semantic relationships described by each query spanning tree in the query spanning tree set, obtained in step S504.

In step S506, the average of the semantic pair scores of the spanning tree pairs is determined as the relationship semantic relevance score between the document semantic information and the query semantic information.

For example, the relationship semantic relevance score $\mathrm{Score}_R$ between the document semantic information and the query semantic information is calculated by the following equation.

$$\mathrm{Score}_R = \mathrm{Mean}(\mathrm{Score}_{tree}) \tag{3}$$

Here, "Mean(x)" denotes the average of x. In Equation (3), $\mathrm{Mean}(\mathrm{Score}_{tree})$ means that the average of the semantic relevance scores $\mathrm{Score}_{tree}$ of the spanning tree pairs is calculated. It will be understood that this average may be an arithmetic mean, a weighted mean, or any other form of average used by those skilled in the art.

The flow of FIG. 5 ends here.
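Steps S503 to S506 can be sketched as follows, assuming the spanning tree pairs have already been formed and that the number of combinations described by a tree is the product of its edge weights (one semantic path chosen per edge). That reading, the function names, and the use of an arithmetic mean are assumptions of this sketch; the text allows other kinds of average.

```python
from math import prod
from statistics import fmean


def tree_pair_score(doc_edge_weights, query_edge_weights):
    """Score of one spanning tree pair: document combinations / query combinations."""
    doc_combinations = prod(doc_edge_weights)      # S503 (assumed: product of weights)
    query_combinations = prod(query_edge_weights)  # S504 (assumed: product of weights)
    return doc_combinations / query_combinations   # S505


def relationship_score_by_trees(tree_pairs):
    """S506: arithmetic mean of the scores of all spanning tree pairs."""
    return fmean([tree_pair_score(d, q) for d, q in tree_pairs])
```

Each element of `tree_pairs` is a `(document_edge_weights, query_edge_weights)` tuple for one pair of corresponding spanning trees.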

In one embodiment according to the present invention, a set of document semantic information is formed in advance by obtaining the document semantic information between all concepts in the document. After a query is received, the concepts in the query are obtained to form a query concept set. A subset of the document semantic information is then obtained by matching the query concept set against the document semantic information set. This subset contains the document semantic information for every concept that is included in the document semantic information set and matches a concept in the query concept set.

Next, the number of semantic paths in the document semantic information subset and the number of semantic paths in the query semantic information are obtained. The relationship semantic relevance score between the document semantic information and the query semantic information is then determined based on these two numbers.
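The matching step described above can be sketched as a simple filter over precomputed per-pair information. The dictionary layout and function name are hypothetical; the source does not specify how the document semantic information set is stored.

```python
def document_subset(doc_info, query_concepts):
    """Keep only the precomputed entries whose concepts both occur in the query.

    doc_info: dict mapping (concept_a, concept_b) -> number of semantic paths,
              computed in advance for all concept pairs of the document.
    query_concepts: set of concepts extracted from the query.
    """
    return {
        pair: count
        for pair, count in doc_info.items()
        if pair[0] in query_concepts and pair[1] in query_concepts
    }
```

The design point is that the expensive extraction over all document concepts happens once, offline, while the per-query work reduces to this lookup.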

In step S210, the concept semantic relevance score between the document and the query is obtained.

The concept semantic relevance score is the semantic relevance score between a document and a query from the viewpoint of concepts. It can be calculated by a number of methods.

For example, the concept semantic relevance score (denoted $\mathrm{Score}_C$) may be calculated based on the vector space model. In this method, an n-dimensional query vector $q = (q_1, \ldots, q_n)$ is first constructed based on the query concept set (denoted $S_q$) and a semantic similarity calculation model, where $n$ is the total number of concepts in the ontology and each concept corresponds to one element of the vector $q$. Examples of the semantic similarity calculation model include Non-Patent Document 4 ("Improved Semantic Similarity Calculating Model and Application", Jilin University Press, Vol. 39, No. 1, 2009) and Non-Patent Document 5 ("Using information content to evaluate semantic similarity in a taxonomy", in IJCAI'95). When setting the value of an element of the vector $q$, if the concept $C_i$ ($i = 1, 2, \ldots, n$) corresponding to that element appears in $S_q$, the value of the element is set to 1. Otherwise, the value of the element is set to the semantic similarity between $C_i$ and the target concept in $S_q$.

Next, an n-dimensional document vector $d = (d_1, \ldots, d_n)$ is constructed for each document. Each element $d_i$ ($i = 1, 2, \ldots, n$) reflects the relevance between the concept $C_i$ and the document. Its value is based on the frequency of occurrence of the concept $C_i$ in the document and is calculated by the TF-IDF algorithm (Non-Patent Document 6: "Introduction to Modern Information Retrieval", McGraw-Hill, 1983). This algorithm is expressed by the formula

$$d_i = \frac{\mathrm{freq}_{i,d}}{\max_{k} \mathrm{freq}_{k,d}} \cdot \log\frac{|D|}{n_i}$$

where
$\mathrm{freq}_{i,d}$ is the frequency with which the concept $C_i$ appears in the document,
$\max_{k} \mathrm{freq}_{k,d}$ is the frequency of the most frequently occurring concept in the document,
$n_i$ is the total number of documents marked by $C_i$, and
$D$ is the set of documents in the search space.
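The document vector element can be sketched as below, assuming the classical max-normalized TF-IDF weighting, which is what the symbol definitions above suggest. The function and parameter names are hypothetical, and returning 0 for absent concepts is an assumption.

```python
import math


def tfidf_weight(freq, max_freq, n_docs_with_concept, n_docs):
    """Max-normalized term frequency times inverse document frequency.

    freq: occurrences of concept C_i in the document (freq_{i,d})
    max_freq: occurrences of the most frequent concept in the document
    n_docs_with_concept: number of documents marked by C_i (n_i)
    n_docs: size of the document set D
    """
    if freq == 0 or n_docs_with_concept == 0:
        return 0.0  # Assumption: an absent concept contributes weight 0.
    tf = freq / max_freq
    idf = math.log(n_docs / n_docs_with_concept)
    return tf * idf
```

A concept that appears in every document gets weight 0, since log(1) = 0: it does not help distinguish one document from another.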

Next, the concept semantic relevance score $\mathrm{Score}_C$ is calculated from the query vector $q$ and the document vector $d$ according to Equation (4).
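Equation (4) is not reproduced in this text. A common way to compare a query vector and a document vector in the vector space model is cosine similarity, so the sketch below uses it as a stand-in. That choice is an assumption, not a statement of what Equation (4) contains, and the function name is hypothetical.

```python
import math


def concept_score_cosine(q, d):
    """Cosine similarity between query vector q and document vector d."""
    dot = sum(x * y for x, y in zip(q, d))
    norm_q = math.sqrt(sum(x * x for x in q))
    norm_d = math.sqrt(sum(y * y for y in d))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0  # Assumption: a zero vector has no relevance.
    return dot / (norm_q * norm_d)
```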

The concept semantic relevance score may also be calculated, for example, by the method described in Non-Patent Document 7 ("Categorizing and Ranking Search Engine's Results by Semantic Similarity", in Proceedings of ICUIMC'08). In this method, a query concept set $S_q$ is obtained from the query and a document concept set $S_d$ is obtained from the documents. The semantic similarity between each pair of concepts drawn from $S_q$ and $S_d$ is then calculated, and the average of these similarities is taken as the concept semantic relevance score $\mathrm{Score}_C$.
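The averaging approach described above can be sketched as follows. The `similarity` argument is a placeholder for whichever semantic similarity model is used; it and the function name are hypothetical, and pairing every query concept with every document concept is this sketch's reading of "each concept pair".

```python
from itertools import product
from statistics import fmean


def concept_score_mean_similarity(query_concepts, doc_concepts, similarity):
    """Average semantic similarity over all (query concept, document concept) pairs."""
    pairs = list(product(query_concepts, doc_concepts))
    if not pairs:
        return 0.0  # Assumption: no concepts means no relevance.
    return fmean([similarity(a, b) for a, b in pairs])
```

With a trivial similarity that returns 1.0 for identical concepts and 0.0 otherwise, query concepts `["a", "b"]` and document concepts `["a", "c"]` give 1 match out of 4 pairs.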

Note that those skilled in the art can also obtain the concept semantic relevance score by other known methods. The methods described above are merely examples and are not intended to limit the scope.

The concept semantic relevance score may also be calculated in advance and stored in a storage device accessible to the document ranking apparatus of the present invention. The storage device may be, for example, local memory such as a semiconductor disk, a magnetic disk, an optical disk, or a floppy disk, portable memory, or memory from which data can be downloaded via a computer network such as the Internet.

The concept semantic relevance score may also be calculated in real time while an embodiment of the present invention is being executed (for example, in step S210). Furthermore, those skilled in the art are not limited to the specific examples disclosed herein and may obtain the concept semantic relevance score between the document and the query by any other suitable means based on current technical conditions and technical means.

In step S211, the score of each document is determined based on its relationship relevance score and its concept relevance score.

In one embodiment according to the present invention, let the relationship relevance score be $\mathrm{Score}_R$ and the concept relevance score be $\mathrm{Score}_C$, and suppose that the two scores are weighted by a concept weight $\lambda_C$ and a relationship weight $\lambda_R$, respectively. Both $\lambda_C$ and $\lambda_R$ lie in the range 0 to 1, and their sum is 1. The score of the document is obtained by summing the weighted relationship relevance score and the weighted concept relevance score. In this embodiment, the score of the document (denoted $\mathrm{Score}_d$) is determined as follows.

$$\mathrm{Score}_d = \lambda_C \cdot \mathrm{Score}_C + \lambda_R \cdot \mathrm{Score}_R \tag{5}$$

In Equation (5), $\lambda_R \in [0, 1]$, $\lambda_C \in [0, 1]$, and $\lambda_C + \lambda_R = 1$.

Because the sum of the concept weight $\lambda_C$ and the relationship weight $\lambda_R$ is 1, Equation (5) can be simplified by writing $\lambda = \lambda_C$ and $\lambda_R = 1 - \lambda$:

$$\mathrm{Score}_d = \lambda \cdot \mathrm{Score}_C + (1 - \lambda) \cdot \mathrm{Score}_R \tag{6}$$

In Equation (6), $\lambda \in [0, 1]$.
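The weighted combination of step S211 can be sketched in a few lines. The function name and the default weight of 0.5 are assumptions for illustration; the source only requires that the weight lie in [0, 1].

```python
def document_score(score_c, score_r, lam=0.5):
    """Combine concept and relationship relevance with a single weight.

    lam is the concept weight; the relationship weight is 1 - lam,
    so the two weights always sum to 1.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * score_c + (1.0 - lam) * score_r
```

Setting `lam=1.0` ranks purely by concept relevance, and `lam=0.0` purely by relationship relevance.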

In step S212, the documents are ranked based on their scores.

When step S211 is complete, a score is available for each document to be ranked. For example, if there are ten documents to be ranked, the scores of these ten documents are obtained in step S211. In step S212, the ten documents can then be ranked in descending order, in ascending order, or in any order defined by those skilled in the art. The score of each document indicates the degree of semantic relevance, in terms of both concepts and relationships, between the document and the query entered by the user. The higher the score, the greater the semantic relevance between the document and the user's query; the lower the score, the smaller the semantic relevance.
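Step S212 amounts to sorting document identifiers by score. A minimal sketch, assuming scores are held in a dictionary keyed by a hypothetical document identifier:

```python
def rank_documents(scores, descending=True):
    """Return document ids ordered by score (highest first by default)."""
    return sorted(scores, key=scores.get, reverse=descending)
```

Descending order places the documents most relevant to the query first, which is the usual choice for presenting search results.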

In another embodiment according to the present invention, steps S211 and S212 can be replaced by the following steps: the documents are ranked based on their concept relevance scores, the ranked documents are grouped, and the individual documents within each group are then ranked based on their relationship relevance scores. As an example, suppose there are ten documents to be ranked. The ten documents are ranked at coarse granularity based on the concept relevance score $\mathrm{Score}_C$ of each document and divided into several groups. Suppose, for example, that the ten documents are divided into two groups of five, and that the concept relevance scores of the first group are higher than those of the second group. The five documents of the first group are then ranked at fine granularity based on their respective relationship relevance scores, so that their order is further adjusted relative to their original order. The five documents of the second group are likewise finely ranked based on their respective relationship relevance scores. The result is a ranking of the ten documents that takes into account both the concept relevance score and the relationship relevance score between the query and each document, and that reflects the degree of semantic relevance, in terms of both concepts and relationships, between each document and the query entered by the user.
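The two-stage ranking can be sketched as a coarse sort followed by a sort within each group. The fixed group size is an assumption; the source does not say how group boundaries are chosen. The function name and data layout are hypothetical.

```python
def two_stage_rank(docs, group_size):
    """Coarse ranking by concept score, then fine ranking within each group.

    docs: dict mapping document id -> (score_c, score_r)
    group_size: number of documents per group (assumed fixed in this sketch)
    """
    # Coarse granularity: order all documents by concept relevance score.
    coarse = sorted(docs, key=lambda k: docs[k][0], reverse=True)
    ranked = []
    for start in range(0, len(coarse), group_size):
        group = coarse[start:start + group_size]
        # Fine granularity: reorder inside the group by relationship relevance.
        ranked.extend(sorted(group, key=lambda k: docs[k][1], reverse=True))
    return ranked
```

A document never moves out of its group, so the concept score acts as the primary criterion and the relationship score only refines the order locally.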

The flow of FIG. 2 ends here.

FIG. 6 is a block diagram of a document ranking apparatus 600 according to one embodiment of the present invention. The apparatus 600 comprises query semantic information extraction means 601, document semantic information extraction means 602, relationship semantic relevance score determination means 603, and ranking means 604. The query semantic information extraction means 601 is configured to extract query semantic information based on the user's query and the ontology. The document semantic information extraction means 602 is configured to extract document semantic information based on the document, the query, and the ontology. The relationship semantic relevance score determination means 603 is configured to determine the relationship semantic relevance score between the document semantic information and the query semantic information. The ranking means 604 is configured to rank the documents based on the relationship semantic relevance score.

In one embodiment according to the present invention, the query semantic information extraction means 601 comprises: means for extracting, based on the ontology, the set of query concepts included in the user's query; means for obtaining, based on the ontology, the semantic paths between each concept pair in the query concept set; and means for determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the query concept set.

In one embodiment according to the present invention, the means for determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the query concept set comprises: means for determining a forward semantic path set and a reverse semantic path set between each concept pair based on the semantic paths between each concept pair in the query concept set; and means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set.

In another embodiment according to the present invention, the means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set comprises: means for optimizing the forward semantic path set by removing redundant paths from it; means for optimizing the reverse semantic path set by removing redundant paths from it; and means for obtaining the number of semantic paths between each concept pair based on the number of elements in the optimized forward semantic path set and the number of elements in the optimized reverse semantic path set.

In another embodiment according to the present invention, the means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set comprises: means for determining mutual path pairs based on the forward semantic path set and the reverse semantic path set; and means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set, the number of elements in the reverse semantic path set, and the number of mutual path pairs.
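The path counting performed by these means can be sketched as below. The sketch assumes that the forward and reverse path sets have already been optimized (redundant paths removed) and that the number of mutual path pairs is supplied by the caller, since neither notion is defined in this passage. Subtracting the mutual path pair count follows the optimization mentioned for steps S301 and S302. The function name and the tuple representation of paths are hypothetical.

```python
def semantic_path_count(forward_paths, reverse_paths, mutual_pairs=0):
    """Number of semantic paths between one concept pair.

    forward_paths / reverse_paths: collections of paths, each path a tuple of
    concept names; converting to a set only drops exact duplicates.
    mutual_pairs: number of mutual path pairs, determined elsewhere.
    """
    n_forward = len(set(forward_paths))
    n_reverse = len(set(reverse_paths))
    return n_forward + n_reverse - mutual_pairs
```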

In one embodiment according to the present invention, the document semantic information extraction means 602 comprises: means for extracting, based on the ontology, the concept set included in the document and the concept set included in the query; means for obtaining a document concept set based on the intersection of the concept set included in the document and the concept set included in the query; means for obtaining, based on the document, the semantic paths between each concept pair in the document concept set; and means for determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the document concept set.

In another embodiment according to the present invention, the means for determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the document concept set comprises: means for determining a forward semantic path set and a reverse semantic path set between each concept pair based on the semantic paths between each concept pair in the document concept set; and means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set.

In another embodiment according to the present invention, the means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set comprises: means for optimizing the forward semantic path set by removing redundant paths from it; means for optimizing the reverse semantic path set by removing redundant paths from it; and means for obtaining the number of semantic paths between each concept pair based on the number of elements in the optimized forward semantic path set and the number of elements in the optimized reverse semantic path set.

In another embodiment according to the present invention, the means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set comprises: means for determining mutual path pairs based on the forward semantic path set and the reverse semantic path set; and means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set, the number of elements in the reverse semantic path set, and the number of mutual path pairs.

In one embodiment according to the present invention, the relationship semantic relevance score determination means 603 comprises: means for obtaining the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information; and means for determining the relationship semantic relevance score between the document semantic information and the query semantic information based on these two numbers.

In another embodiment according to the present invention, the means for determining the relationship semantic relevance score between the document semantic information and the query semantic information based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information comprises: means for calculating the total number of semantic paths in the document semantic information as the document numerical value; means for calculating the total number of semantic paths in the query semantic information as the query numerical value; and means for determining the ratio of the document numerical value to the query numerical value as the relationship semantic relevance score between the document semantic information and the query semantic information.

In another embodiment according to the present invention, the means for determining the relationship semantic relevance score between the document semantic information and the query semantic information based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information comprises: means for obtaining the set of concepts included in the query semantic information; means for determining the number of document semantic paths between each concept pair in the concept set based on the document semantic information; means for determining the number of query semantic paths between each concept pair in the concept set based on the query semantic information; means for calculating, for each concept pair, the ratio of the number of document semantic paths to the number of query semantic paths; and means for determining the product of these ratios as the relationship semantic relevance score between the document semantic information and the query semantic information.

In another embodiment according to the present invention, the means for determining the relationship semantic relevance score between the document semantic information and the query semantic information, based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information, comprises: means for determining a document spanning tree set based on the document semantic information; means for determining a query spanning tree set based on the query semantic information (each element of the query spanning tree set corresponds one-to-one with an element of the document spanning tree set, forming a plurality of spanning tree pairs); means for calculating, based on the number of semantic paths in the document semantic information, the total number of combinations of document semantic relationships described by each document spanning tree in the document spanning tree set; means for calculating, based on the number of semantic paths in the query semantic information, the total number of combinations of query semantic relationships described by each query spanning tree in the query spanning tree set; means for determining a semantic pair score for each spanning tree pair based on the total number of combinations of document semantic relationships and the total number of combinations of query semantic relationships; and means for determining the average of the semantic pair scores of the spanning tree pairs as the relationship semantic relevance score between the document semantic information and the query semantic information.
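The spanning tree embodiment above leaves several details open, so the following is only a sketch under stated assumptions: each spanning tree is given as a list of concept-pair edges shared by the query and the document, the total number of combinations for a tree is taken to be the product of the path counts on its edges, and the semantic pair score is taken to be the document total divided by the query total. None of these choices is confirmed by this passage.

```python
from math import prod
from statistics import mean
from typing import Dict, List, Tuple

Edge = Tuple[str, str]
PathCounts = Dict[Edge, int]


def combination_total(tree: List[Edge], counts: PathCounts) -> int:
    """Assumed: one path is chosen per edge, so totals multiply across edges."""
    return prod(counts.get(edge, 0) for edge in tree)


def spanning_tree_score(trees: List[List[Edge]],
                        doc_counts: PathCounts,
                        query_counts: PathCounts) -> float:
    """Average of the per-tree (document total / query total) pair scores."""
    pair_scores = []
    for tree in trees:
        query_total = combination_total(tree, query_counts)
        if query_total == 0:
            continue  # assumption: trees the query cannot realise are skipped
        pair_scores.append(combination_total(tree, doc_counts) / query_total)
    return mean(pair_scores) if pair_scores else 0.0


# Hypothetical three-concept example: the three spanning trees of a triangle.
trees = [[("A", "B"), ("A", "C")], [("A", "B"), ("B", "C")], [("A", "C"), ("B", "C")]]
query_counts = {("A", "B"): 2, ("A", "C"): 2, ("B", "C"): 1}
doc_counts = {("A", "B"): 1, ("A", "C"): 2, ("B", "C"): 1}
score = spanning_tree_score(trees, doc_counts, query_counts)
```

Here the three pair scores are 0.5, 0.5, and 1.0, so the averaged score is two thirds.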

In one embodiment according to the present invention, the ranking means 604 comprises: means for obtaining a concept semantic relevance score between the document and the query; means for determining a score for the document based on the relationship relevance score and the concept relevance score; and means for ranking the documents based on the document scores.

In another embodiment according to the present invention, the means for determining the document score based on the relationship relevance score and the concept relevance score comprises: means for weighting the relationship relevance score and the concept relevance score using a relationship weight and a concept weight, respectively (each weight is a value from 0 to 1, and the two weights sum to 1); and means for summing the weighted relationship relevance score and the weighted concept relevance score to obtain the document score.
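The weighted combination above amounts to a convex combination of the two scores. The sketch below assumes a hypothetical function name and a default weight of 0.5; the text only requires that both weights lie between 0 and 1 and sum to 1.

```python
def combined_score(relation_score: float,
                   concept_score: float,
                   relation_weight: float = 0.5) -> float:
    """Weighted sum of the relationship and concept relevance scores."""
    if not 0.0 <= relation_weight <= 1.0:
        raise ValueError("relation_weight must lie between 0 and 1")
    concept_weight = 1.0 - relation_weight  # the two weights sum to 1
    return relation_weight * relation_score + concept_weight * concept_score


# Hypothetical scores: the concept score dominates with a relation weight of 0.25.
doc_score = combined_score(relation_score=0.8, concept_score=0.4, relation_weight=0.25)
```

Setting the relation weight to 0 reduces the ranking to concept relevance alone, and setting it to 1 ranks purely on relationship relevance.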

In one embodiment according to the present invention, the ranking means 400 comprises: means for obtaining a concept semantic relevance score between the document and the query; means for ranking the documents based on the concept relevance score; means for grouping the ranked documents; and means for ranking the documents within each document group based on the relationship relevance score.
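The two-stage ranking above can be sketched as follows. The text does not say how the ranked documents are grouped, so fixed-size groups are used here purely as an assumption; the tuple layout and function name are also hypothetical.

```python
from typing import List, Tuple

# Assumed layout: (document id, concept relevance score, relationship relevance score).
ScoredDoc = Tuple[str, float, float]


def two_stage_rank(docs: List[ScoredDoc], group_size: int = 2) -> List[str]:
    """Rank by concept score, group the ranking, then re-rank inside each group."""
    by_concept = sorted(docs, key=lambda d: d[1], reverse=True)
    ranked: List[ScoredDoc] = []
    for start in range(0, len(by_concept), group_size):
        group = by_concept[start:start + group_size]  # assumption: fixed-size groups
        ranked.extend(sorted(group, key=lambda d: d[2], reverse=True))
    return [doc_id for doc_id, _, _ in ranked]


# Hypothetical documents.
docs = [("d1", 0.9, 0.2), ("d2", 0.8, 0.7), ("d3", 0.5, 0.1), ("d4", 0.4, 0.6)]
order = two_stage_rank(docs)
```

In this arrangement the relationship relevance score only reorders documents whose concept relevance is already similar, so it refines the concept-based ranking rather than overriding it.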

The present invention further relates to a computer program product comprising: code for extracting query semantic information based on a user's query and an ontology; code for extracting document semantic information based on a document, the query, and the ontology; code for determining a relationship semantic relevance score between the document semantic information and the query semantic information; and code for ranking documents based on the relationship semantic relevance score. Before use, the code may be stored in a memory in another computer system (for example, on a hard disk or on portable storage such as an optical disk or a floppy disk) or downloaded via a computer network such as the Internet.

The methods disclosed by the embodiments of the present invention may be implemented in software, in hardware, or in a combination of software and hardware. The hardware portion can be implemented using dedicated logic circuitry, and the software portion can be stored in memory and executed by a suitable instruction execution system such as a microprocessor, a personal computer, or a mainframe. In a preferred embodiment, the present invention can be implemented as software such as firmware, resident software, or microcode. The present invention may further be embodied as a computer program product, accessible from a computer-usable or computer-readable medium, that provides program code for use by or in connection with an instruction execution system such as a computer. Here, a computer-usable or computer-readable medium means any tangible means that can contain, store, communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Such a medium may be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (apparatus or device), or a propagation medium. Examples of computer-readable media include semiconductor or solid-state storage devices, magnetic tape, portable computer diskettes, random access memory (RAM), read-only memory (ROM), hard disks, and optical disks. Current examples of optical disks include compact disk read-only memory (CD-ROM), compact disk read/write (CD-R/W), and DVD.

A data processing system adapted to store or execute program code according to an embodiment of the present invention includes one or more processors coupled to memory elements either directly or through a system bus. The memory elements include local memory used during execution of the program code, mass storage, and cache memory that temporarily stores at least part of the program code in order to reduce the number of times code must be read from mass storage during execution.

Input/output devices (keyboards, displays, pointing devices, and the like) can be coupled to the system either directly or through intermediate I/O controllers. Network adapters may also be coupled to the system so that it can be connected to other processing systems, remote printers, remote storage devices, and the like through intermediate private or public networks. Modems, cable modems, and Ethernet cards are examples of currently available network adapters, but they are merely illustrative. The communication networks mentioned in this specification can be made up of a variety of networks, including local area networks (LAN), wide area networks (WAN), IP-based networks (such as the Internet), and end-to-end networks (such as ad hoc peer-to-peer networks).

Although the present invention has been described above with reference to preferred embodiments, it is not necessarily limited to those embodiments and can be modified in various ways within the scope of its technical idea.

Furthermore, part or all of the above embodiments may also be described as in the following appendices, but are not limited thereto.

(Appendix 1)
A document ranking method comprising:
extracting query semantic information based on a user's query and an ontology;
extracting document semantic information based on a document, the query, and the ontology;
determining a relationship semantic relevance score between the document semantic information and the query semantic information; and
ranking documents based on the relationship semantic relevance score.

(Appendix 2)
The document ranking method according to Appendix 1, wherein the step of extracting query semantic information based on the user's query and the ontology comprises:
extracting, based on the ontology, a set of query concepts contained in the user's query;
obtaining, based on the ontology, the semantic paths between each concept pair in the query concept set; and
determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the query concept set.

(Appendix 3)
The document ranking method according to Appendix 2, wherein the step of determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the query concept set comprises:
determining a forward semantic path set and a reverse semantic path set between each concept pair based on the semantic paths between each concept pair in the query concept set; and
obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set.

(Appendix 4)
The document ranking method according to Appendix 1, wherein the step of extracting document semantic information based on the document, the query, and the ontology comprises:
extracting, based on the ontology, the concept set contained in the document and the concept set contained in the query;
obtaining a document concept set based on the intersection of the concept set contained in the document and the concept set contained in the query;
obtaining, based on the document, the semantic paths between each concept pair in the document concept set; and
determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the document concept set.

(Appendix 5)
The document ranking method according to Appendix 4, wherein the step of determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the document concept set comprises:
determining a forward semantic path set and a reverse semantic path set between each concept pair based on the semantic paths between each concept pair in the document concept set; and
obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set.

(Appendix 6)
The document ranking method according to Appendix 3 or Appendix 5, wherein the step of obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set comprises:
optimizing the forward semantic path set by excluding redundant paths from it;
optimizing the reverse semantic path set by excluding redundant paths from it; and
obtaining the number of semantic paths between each concept pair based on the number of elements in the optimized forward semantic path set and the number of elements in the optimized reverse semantic path set.

(Appendix 7)
The document ranking method according to Appendix 3 or Appendix 5, wherein the step of obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set comprises:
determining mutual path pairs based on the forward semantic path set and the reverse semantic path set; and
obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set, the number of elements in the reverse semantic path set, and the number of mutual path pairs.

(Appendix 8)
The document ranking method according to Appendix 1, wherein the step of determining the relationship semantic relevance score between the document semantic information and the query semantic information comprises:
obtaining the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information; and
determining the relationship semantic relevance score between the document semantic information and the query semantic information based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information.

(Appendix 9)
The document ranking method according to Appendix 8, wherein the step of determining the relationship semantic relevance score between the document semantic information and the query semantic information based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information comprises:
calculating the total number of semantic paths in the document semantic information as a document value;
calculating the total number of semantic paths in the query semantic information as a query value; and
determining the ratio of the document value to the query value as the relationship semantic relevance score between the document semantic information and the query semantic information.

(Appendix 10)
The document ranking method according to Appendix 8, wherein the step of determining the relationship semantic relevance score between the document semantic information and the query semantic information based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information comprises:
obtaining the set of concepts contained in the query semantic information;
determining, based on the document semantic information, the number of document semantic paths between each concept pair in the concept set;
determining, based on the query semantic information, the number of query semantic paths between each concept pair in the concept set;
calculating, for each concept pair, the ratio of the number of document semantic paths to the number of query semantic paths; and
determining the product of these ratios as the relationship semantic relevance score between the document semantic information and the query semantic information.

(Appendix 11)
The document ranking method according to Appendix 8, wherein the step of determining the relationship semantic relevance score between the document semantic information and the query semantic information based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information comprises:
determining a document spanning tree set based on the document semantic information;
determining a query spanning tree set based on the query semantic information;
calculating, based on the number of semantic paths in the document semantic information, the total number of combinations of document semantic relationships described by each document spanning tree in the document spanning tree set;
calculating, based on the number of semantic paths in the query semantic information, the total number of combinations of query semantic relationships described by each query spanning tree in the query spanning tree set;
determining a semantic pair score for each spanning tree pair based on the total number of combinations of document semantic relationships and the total number of combinations of query semantic relationships; and
determining the average of the semantic pair scores of the spanning tree pairs as the relationship semantic relevance score between the document semantic information and the query semantic information,
wherein each element of the query spanning tree set corresponds one-to-one with an element of the document spanning tree set, forming a plurality of spanning tree pairs.

(Appendix 12)
The document ranking method according to Appendix 1, wherein the step of ranking documents based on the relationship semantic relevance score comprises:
obtaining a concept semantic relevance score between the document and the query;
determining a score for the document based on the relationship relevance score and the concept relevance score; and
ranking the documents based on the document scores.

(Appendix 13)
The document ranking method according to Appendix 12, wherein the step of determining the document score based on the relationship relevance score and the concept relevance score comprises:
weighting the relationship relevance score and the concept relevance score using a relationship weight and a concept weight, respectively; and
summing the weighted relationship relevance score and the weighted concept relevance score to obtain the document score,
wherein the relationship weight and the concept weight are each a value from 0 to 1, and the two weights sum to 1.

(Appendix 14)
The document ranking method according to Appendix 1, wherein the step of ranking documents based on the relationship semantic relevance score comprises:
obtaining a concept semantic relevance score between the document and the query;
ranking the documents based on the concept relevance score;
grouping the ranked documents; and
ranking the documents within each document group based on the relationship relevance score.

(Appendix 15)
A document ranking apparatus comprising:
query semantic information extraction means configured to extract query semantic information based on a user's query and an ontology;
document semantic information extraction means configured to extract document semantic information based on a document, the query, and the ontology;
relationship semantic relevance score determination means configured to determine a relationship semantic relevance score between the document semantic information and the query semantic information; and
ranking means configured to rank documents based on the relationship semantic relevance score.

(Appendix 16)
The document ranking apparatus according to Appendix 15, wherein the query semantic information extraction means comprises:
means for extracting, based on the ontology, a set of query concepts contained in the user's query;
means for obtaining, based on the ontology, the semantic paths between each concept pair in the query concept set; and
means for determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the query concept set.

(Appendix 17)
The document ranking apparatus according to Appendix 16, wherein the means for determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the query concept set comprises:
means for determining a forward semantic path set and a reverse semantic path set between each concept pair based on the semantic paths between each concept pair in the query concept set; and
means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set.

(Appendix 18)
The document ranking apparatus according to Appendix 15, wherein the document semantic information extraction means comprises:
means for extracting, based on the ontology, the concept set contained in the document and the concept set contained in the query;
means for obtaining a document concept set based on the intersection of the concept set contained in the document and the concept set contained in the query;
means for obtaining, based on the document, the semantic paths between each concept pair in the document concept set; and
means for determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the document concept set.

(Appendix 19)
The document ranking apparatus according to Appendix 18, wherein the means for determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the document concept set comprises:
means for determining a forward semantic path set and a reverse semantic path set between each concept pair based on the semantic paths between each concept pair in the document concept set; and
means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set.

(Appendix 20)
The document ranking apparatus according to Appendix 17 or Appendix 19, wherein the means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set comprises:
means for optimizing the forward semantic path set by excluding redundant paths from it;
means for optimizing the reverse semantic path set by excluding redundant paths from it; and
means for obtaining the number of semantic paths between each concept pair based on the number of elements in the optimized forward semantic path set and the number of elements in the optimized reverse semantic path set.

(Appendix 21)
The document ranking apparatus according to Appendix 17 or Appendix 19, wherein the means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set comprises:
means for determining mutual path pairs based on the forward semantic path set and the reverse semantic path set; and
means for obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set, the number of elements in the reverse semantic path set, and the number of mutual path pairs.

(Appendix 22)
The document ranking apparatus according to Appendix 15, wherein the relationship semantic relevance score determination means comprises:
means for obtaining the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information; and
means for determining the relationship semantic relevance score between the document semantic information and the query semantic information based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information.

(Appendix 23)
The document ranking apparatus according to Appendix 22, wherein the means for determining the relationship semantic relevance score between the document semantic information and the query semantic information based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information comprises:
means for calculating the total number of semantic paths in the document semantic information as a document value;
means for calculating the total number of semantic paths in the query semantic information as a query value; and
means for determining the ratio of the document value to the query value as the relationship semantic relevance score between the document semantic information and the query semantic information.

（付記２４）
前記ドキュメント意味情報内の意味パス数とクエリ意味情報内の意味パス数とに基づいてドキュメント意味情報とクエリ意味情報間の関係意味関連度スコアを決定する手段は、
クエリ意味情報に含まれる概念の集合を取得する手段と、
ドキュメント意味情報に基づいて概念集合内の各概念ペア間のドキュメント意味パス数を決定する手段と、
クエリ意味情報に基づいて概念集合内の各概念ペア間のクエリ意味パス数を決定する手段と、
各概念ペア間の、クエリ意味パス数に対するドキュメント意味パス数の比率を計算する手段と、
それぞれの比率の積をドキュメント意味情報とクエリ意味情報間の関係意味関連度スコアとして決定する手段と
を備えることを特徴とする付記２２に記載のドキュメントランク付け装置。
(Appendix 24)
The document ranking apparatus according to appendix 22, wherein the means for determining the relationship semantic relevance score between the document semantic information and the query semantic information based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information comprises:
means for acquiring the set of concepts included in the query semantic information;
means for determining the number of document semantic paths between each concept pair in the concept set based on the document semantic information;
means for determining the number of query semantic paths between each concept pair in the concept set based on the query semantic information;
means for calculating, for each concept pair, the ratio of the number of document semantic paths to the number of query semantic paths; and
means for determining the product of the ratios as the relationship semantic relevance score between the document semantic information and the query semantic information.
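
The per-pair product described in appendix 24 can be sketched as follows. This is a hedged illustration only: the function name `pairwise_product_score`, the use of alphabetically sorted concept pairs as dictionary keys, and the decision to skip pairs that have no query semantic path (to avoid division by zero) are assumptions that the text does not state.

```python
from itertools import combinations
from math import prod
from typing import Dict, Iterable, Tuple

# Hypothetical representation: path counts keyed by a sorted concept pair.
PathCounts = Dict[Tuple[str, str], int]


def pairwise_product_score(
    concepts: Iterable[str], doc_paths: PathCounts, query_paths: PathCounts
) -> float:
    """Multiply, over all concept pairs, (document paths) / (query paths)."""
    ratios = []
    for pair in combinations(sorted(concepts), 2):
        query_count = query_paths.get(pair, 0)
        if query_count == 0:
            # Assumption: pairs without any query semantic path are skipped.
            continue
        ratios.append(doc_paths.get(pair, 0) / query_count)
    # Assumption: with no usable pair at all, the score is 0.
    return prod(ratios) if ratios else 0.0


# Made-up sample data.
concepts = {"Company", "Product", "Person"}
query_paths = {("Company", "Product"): 4, ("Person", "Product"): 2}
doc_paths = {("Company", "Product"): 2, ("Person", "Product"): 1}
score = pairwise_product_score(concepts, doc_paths, query_paths)
```

Because the score is a product, a single concept pair with no document semantic path drives the whole score to zero, whereas the total-based ratio of appendix 23 only lowers it.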

（付記２５）
前記ドキュメント意味情報内の意味パス数とクエリ意味情報内の意味パス数とに基づいてドキュメント意味情報とクエリ意味情報間の関係意味関連度スコアを決定する手段は、
ドキュメント意味情報に基づいてドキュメントスパニングツリー集合を決定する手段と、
クエリ意味情報に基づいてクエリスパニングツリー集合を決定する手段と、
ドキュメント意味情報内の意味パス数に基づいてドキュメントスパニングツリー集合内の各ドキュメントスパニングツリーによって記述されたドキュメント意味関係の組み合わせ総数を計算する手段と、
クエリ意味情報内の意味パス数に基づいてクエリスパニングツリー集合内の各クエリスパニングツリーによって記述されたクエリ意味関係の組み合わせ総数を計算する手段と、
ドキュメント意味関係の組み合わせ総数とクエリ意味関係の組み合わせ総数とに基づいて各スパニングツリーペアの意味ペアスコアを決定する手段と、
スパニングツリーペアの意味ペアスコアの平均値を決定してドキュメント意味情報とクエリ意味情報間の関係意味関連度スコアとする手段を備え、
前記クエリスパニングツリー集合の各要素はドキュメントスパニングツリー集合の各要素と１対１で対応し、複数のスパニングツリーペアが生成される
ことを特徴とする付記２２に記載のドキュメントランク付け装置。
(Appendix 25)
The document ranking apparatus according to appendix 22, wherein the means for determining the relationship semantic relevance score between the document semantic information and the query semantic information based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information comprises:
means for determining a document spanning tree set based on the document semantic information;
means for determining a query spanning tree set based on the query semantic information;
means for calculating, based on the number of semantic paths in the document semantic information, the total number of combinations of document semantic relationships described by each document spanning tree in the document spanning tree set;
means for calculating, based on the number of semantic paths in the query semantic information, the total number of combinations of query semantic relationships described by each query spanning tree in the query spanning tree set;
means for determining a semantic pair score for each spanning tree pair based on the total number of combinations of document semantic relationships and the total number of combinations of query semantic relationships; and
means for determining the average of the semantic pair scores of the spanning tree pairs as the relationship semantic relevance score between the document semantic information and the query semantic information,
wherein each element of the query spanning tree set corresponds one-to-one to an element of the document spanning tree set, so that a plurality of spanning tree pairs are generated.

（付記２６）
前記ランク付け手段は、
ドキュメントとクエリ間の概念意味関連度スコアを取得する手段と、
関係関連度スコアと概念関連度スコアとに基づいてドキュメントのスコアを決定する手段と、
ドキュメントのスコアに基づいてドキュメントをランク付けする手段と
を備えることを特徴とする付記１５に記載のドキュメントランク付け装置。
(Appendix 26)
The document ranking apparatus according to appendix 15, wherein the ranking means comprises:
means for obtaining a concept semantic relevance score between a document and the query;
means for determining a score for the document based on the relationship relevance score and the concept relevance score; and
means for ranking the documents based on the document scores.

（付記２７）
前記関係関連度スコアと概念関連度スコアとに基づいてドキュメントのスコアを決定する手段は、
関係の重みと概念の重みをそれぞれ使用して関係関連度スコアと概念関連度スコアの重み付けをする手段と、
重み付けされた関係関連度スコアと重み付けされた概念関係度スコアとを合計してドキュメントのスコアを取得する手段を備え、
前記関係の重みと概念の重みはいずれも０〜１の値であり、関係の重みと概念の重みの合計は１である
ことを特徴とする付記２６に記載のドキュメントランク付け装置。
(Appendix 27)
The document ranking apparatus according to appendix 26, wherein the means for determining the score of the document based on the relationship relevance score and the concept relevance score comprises:
means for weighting the relationship relevance score and the concept relevance score using a relationship weight and a concept weight, respectively; and
means for summing the weighted relationship relevance score and the weighted concept relevance score to obtain the score of the document,
wherein the relationship weight and the concept weight each have a value from 0 to 1, and the sum of the relationship weight and the concept weight is 1.
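
The weighted combination described in appendix 27 can be sketched as follows. Since the two weights lie between 0 and 1 and sum to 1, a single parameter is enough to describe both. The function name `combined_score` and the sample values are hypothetical and used only for illustration.

```python
def combined_score(
    relation_score: float, concept_score: float, relation_weight: float
) -> float:
    """Weighted sum of the relationship and concept relevance scores."""
    if not 0.0 <= relation_weight <= 1.0:
        raise ValueError("relation_weight must lie between 0 and 1")
    # The two weights sum to 1, so the concept weight follows directly.
    concept_weight = 1.0 - relation_weight
    return relation_weight * relation_score + concept_weight * concept_score


# Made-up sample values with both weights equal to 0.5.
score = combined_score(relation_score=0.5, concept_score=0.75, relation_weight=0.5)
```

With a relationship weight of 0 the document score reduces to the concept relevance score alone, and with a weight of 1 it reduces to the relationship relevance score alone.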

（付記２８）
前記ランク付け手段は、
ドキュメントとクエリ間の概念意味関連度スコアを取得する手段と、
概念関連度スコアに基づいてドキュメントをランク付けする手段と、
ランク付けされたドキュメントをグループ化する手段と、
関連関係度スコアに基づいて各ドキュメントグループ内の各ドキュメントをランク付けする手段と
を備えることを特徴とする付記１５に記載のドキュメントランク付け装置。
(Appendix 28)
The document ranking apparatus according to appendix 15, wherein the ranking means comprises:
means for obtaining a concept semantic relevance score between a document and the query;
means for ranking the documents based on the concept relevance score;
means for grouping the ranked documents; and
means for ranking the documents within each document group based on the relationship relevance score.
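
The two-stage ranking described in appendix 28 can be sketched as follows. The text does not say how the ranked documents are grouped, so this sketch assumes fixed-size groups of consecutively ranked documents. The `ScoredDoc` type, the `two_stage_rank` function, the `group_size` parameter, and the sample scores are all hypothetical.

```python
from typing import List, NamedTuple


class ScoredDoc(NamedTuple):
    doc_id: str
    concept_score: float
    relation_score: float


def two_stage_rank(docs: List[ScoredDoc], group_size: int) -> List[ScoredDoc]:
    """Rank by concept score, then re-rank inside each group by relation score."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    # Stage 1: rank all documents by concept relevance score, best first.
    by_concept = sorted(docs, key=lambda d: d.concept_score, reverse=True)
    ranked: List[ScoredDoc] = []
    # Stage 2 (assumed grouping): cut the ranked list into consecutive groups
    # and reorder each group by relationship relevance score.
    for start in range(0, len(by_concept), group_size):
        group = by_concept[start:start + group_size]
        ranked.extend(sorted(group, key=lambda d: d.relation_score, reverse=True))
    return ranked


# Made-up sample data.
docs = [
    ScoredDoc("a", 0.9, 0.1),
    ScoredDoc("b", 0.8, 0.7),
    ScoredDoc("c", 0.5, 0.2),
    ScoredDoc("d", 0.4, 0.6),
]
ranked_ids = [d.doc_id for d in two_stage_rank(docs, group_size=2)]
```

Under this grouping assumption, the relationship relevance score only reorders documents within a group, so a document never moves past a document from a group with better concept relevance.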

６０１：クエリ意味情報抽出手段
６０２：ドキュメント意味情報抽出手段
６０３：関係意味関連度スコア決定手段
６０４：ランク付け手段
601: Query semantic information extraction means
602: Document semantic information extraction means
603: Relationship semantic relevance score determination means
604: Ranking means

Claims

A document ranking method comprising the steps of:
extracting query semantic information based on a user's query and an ontology;
extracting document semantic information based on documents, the query, and the ontology;
determining a relationship semantic relevance score between the document semantic information and the query semantic information; and
ranking the documents based on the relationship semantic relevance score.

The document ranking method according to claim 1, wherein the step of extracting the query semantic information based on the user's query and the ontology comprises:
extracting a query concept set contained in the user's query based on the ontology;
obtaining a semantic path between each concept pair in the query concept set based on the ontology; and
determining the number of semantic paths between each concept pair based on the semantic paths between the concept pairs in the query concept set.

The document ranking method, wherein said step of determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the query concept set comprises:
determining a forward semantic path set and a reverse semantic path set between each concept pair based on the semantic paths between each concept pair in the query concept set; and
obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set.

The document ranking method according to claim 1, wherein said step of extracting the document semantic information based on the documents, the query, and the ontology comprises:
extracting a concept set included in the document and a concept set included in the query based on the ontology;
obtaining a document concept set based on the common part of the concept set included in the document and the concept set included in the query;
obtaining a semantic path between each concept pair in the document concept set based on the document; and
determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the document concept set.

The document ranking method according to claim 4, wherein the step of determining the number of semantic paths between each concept pair based on the semantic paths between each concept pair in the document concept set comprises:
determining a forward semantic path set and a reverse semantic path set between each concept pair based on the semantic paths between each concept pair in the document concept set; and
obtaining the number of semantic paths between each concept pair based on the number of elements in the forward semantic path set and the number of elements in the reverse semantic path set.

The document ranking method, wherein the step of determining the relationship semantic relevance score between the document semantic information and the query semantic information comprises:
obtaining the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information; and
determining the relationship semantic relevance score between the document semantic information and the query semantic information based on the number of semantic paths in the document semantic information and the number of semantic paths in the query semantic information.

The document ranking method according to claim 1, wherein the step of ranking the documents based on the relationship semantic relevance score comprises:
obtaining a concept semantic relevance score between the document and the query;
determining a score for the document based on the relationship relevance score and the concept relevance score; and
ranking the documents based on the document scores.

The document ranking method according to claim 7, wherein the step of determining the document score based on the relationship relevance score and the concept relevance score comprises:
weighting the relationship relevance score and the concept relevance score using a relationship weight and a concept weight, respectively; and
summing the weighted relationship relevance score and the weighted concept relevance score to obtain the document score,
wherein the relationship weight and the concept weight each have a value from 0 to 1, and the sum of the relationship weight and the concept weight is 1.

The document ranking method according to claim 1, wherein the step of ranking the documents based on the relationship semantic relevance score comprises:
obtaining a concept semantic relevance score between the document and the query;
ranking the documents based on the concept relevance score;
grouping the ranked documents; and
ranking the documents within each document group based on the relationship relevance score.

A document ranking apparatus comprising:
query semantic information extraction means configured to extract query semantic information based on a user's query and an ontology;
document semantic information extraction means configured to extract document semantic information based on documents, the query, and the ontology;
relationship semantic relevance score determination means configured to determine a relationship semantic relevance score between the document semantic information and the query semantic information; and
ranking means configured to rank the documents based on the relationship semantic relevance score.