JP2005242807A - Related knowledge retrieval apparatus, sentences network generation device, sentences network generation method, and program - Google Patents

Related knowledge retrieval apparatus, sentences network generation device, sentences network generation method, and program

Info

Publication number
JP2005242807A
Authority
JP
Japan
Prior art keywords
sentence
information
end point
inter
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2004053424A
Other languages
Japanese (ja)
Inventor
Eiji Murakami
英治 村上
Takao Terano
隆雄 寺野
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Azbil Corp
Original Assignee
Azbil Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Azbil Corp
Priority to JP2004053424A
Publication of JP2005242807A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

PROBLEM TO BE SOLVED: To construct a sentence network without being influenced by a personal point of view, and to flexibly retrieve desired related knowledge having an arbitrary order relation to the knowledge under examination.

SOLUTION: A sentence network generation means 15D selects, from a sentence set 14A, sentences containing the same important words as a start point sentence designated by teaching information 14B as start point candidate sentences, and selects, from the sentence set 14A, sentences containing the same important words as an end point sentence designated by the teaching information 14B as end point candidate sentences. Using new directed line segments whose start points are the start point sentence and each start point candidate sentence and whose end points are the end point sentence and each end point candidate sentence, a sentence network indicating the relevance and order among the sentences is generated.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to a related knowledge search technique, and more particularly to a technique for retrieving, from a plurality of documents, documents containing knowledge related to desired knowledge.

In an advanced information society, as information processing and telecommunication technologies develop, environments in which enormous amounts of digitized information can easily be accumulated and used are becoming available. To solve a given problem using such an environment, knowledge related to that problem must be retrieved efficiently from a huge amount of information.

Conventionally, as a technique for retrieving knowledge related to such a problem, a technique that searches based on the similarity between sentences has been proposed (see, for example, Patent Document 1).
In this related knowledge search technique, in order to efficiently judge the similarity between a sentence (text data) describing the content of an event and other sentences, a similarity with respect to sentence structure and meaning is calculated between each event stored in advance in a case base and the input event, the two are collated, and sentences with high similarity are displayed as similar sentences resembling the sentence of the input event.

However, the above related knowledge search technique merely retrieves and displays similar sentences that closely resemble the sentence of the input event; it provides no information about the order among sentences. For example, when a search is made for an input event that occurred during equipment maintenance and operation work, only events similar to the input event are presented, and related events such as the event that caused it or the events that may follow it cannot be obtained.
In equipment maintenance and operation work, as described above, events that occur must be handled promptly and appropriately. To do so, it is important to accurately grasp the cause of the input event and the events that may follow it as circumstances related to the input event.

As a countermeasure, one conceivable method is for an expert who is thoroughly familiar with the content of each sentence to be classified to teach in advance the order and relevance among sentences, thereby constructing a sentence network representing that order and relevance, so that a non-expert can then search for sentences related to an arbitrary sentence based on the network.
With this method, not only the relevance between sentences but also their order is obtained, so the cause of an input event and the events that may follow it can be grasped easily, and the method can also be used in equipment maintenance and operation work.

The applicant had not found, by the time of filing, any prior art documents related to the present invention other than those identified by the prior art document information described in this specification.
Patent Document 1: JP 2000-276487 A

However, such a conventional technique uses a sentence network constructed from the subjective teaching of an expert, so the order and relevance among sentences depend heavily on that teaching, and related knowledge cannot be retrieved from a new viewpoint beyond the scope of what the expert taught.
The present invention has been made to solve this problem. Its object is to provide a related knowledge search device, a sentence network generation device, a sentence network generation method, and a program that can construct a sentence network free from subjectivity and can flexibly retrieve desired related knowledge having an arbitrary order relation to the knowledge under examination.

To achieve this object, a related knowledge search device according to the present invention has a storage unit that stores a sentence set consisting of a plurality of sentences in which various kinds of knowledge are described as character information using predetermined important words, together with sentence network information indicating the relations among the sentences belonging to the sentence set, and an arithmetic processing unit that uses the sentence network information in the storage unit to retrieve, from the sentence set in the storage unit, desired sentences related to a sentence containing the knowledge to be searched. The arithmetic processing unit is provided with sentence network generation means that receives inter-sentence relation information indicating the relevance and order between two sentences and generates the sentence network information based on the important words contained in the start point sentence located on the start point side of the inter-sentence relation information and in the end point sentence located on its end point side. The sentence network generation means selects, from the sentence set in the storage unit, each sentence containing all of the important words of the start point sentence designated by the input inter-sentence relation information as a start point candidate sentence, and selects each sentence containing all of the important words of the designated end point sentence as an end point candidate sentence. It then generates new inter-sentence relation information whose start point is each of the start point sentence and the start point candidate sentences and whose end point is each of the end point sentence and the end point candidate sentences, and generates, from the pairs of information indicating the start point sentence and information indicating the end point sentence of this new inter-sentence relation information, sentence network information indicating the relevance and order among the sentences.
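The candidate selection and edge expansion described above can be sketched as follows. This is a minimal illustration, not the implementation of the invention: it assumes each sentence has already been reduced to a set of important words, and the function names, the handling of an empty word set, and the removal of self-referencing edges are all assumptions introduced here.

```python
from itertools import product


def candidate_ids(sentences, taught_id):
    """Return the taught sentence plus every sentence containing all of its important words.

    sentences: dict mapping a sentence number to the set of important words it contains.
    """
    taught_words = sentences[taught_id]
    if not taught_words:
        # Assumption: a taught sentence with no important words matches only itself.
        return [taught_id]
    # The taught sentence always qualifies, since its word set is a subset of itself.
    return [sid for sid, words in sentences.items() if taught_words <= words]


def expand_taught_edge(sentences, start_id, end_id):
    """Expand one taught edge start_id -> end_id into the set of new directed edges."""
    starts = candidate_ids(sentences, start_id)
    ends = candidate_ids(sentences, end_id)
    # Assumption: edges from a sentence to itself are dropped.
    return {(s, e) for s, e in product(starts, ends) if s != e}
```

With five sentences in which sentence 2 contains every important word of sentence 1 and sentence 4 contains every important word of sentence 3, a single taught edge from 1 to 3 expands into the four edges 1→3, 1→4, 2→3, and 2→4.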

In this case, the arithmetic processing unit may further be provided with inter-sentence weight calculation means that decomposes arbitrary inter-sentence relation information into pieces of important-word relation information, each having one of the important words contained in the start point sentence as its start point and one of the important words contained in the end point sentence as its end point, and calculates the inter-sentence weight corresponding to that inter-sentence relation information based on the sum of the important-word weights corresponding to these pieces of important-word relation information. The sentence network generation means may then generate a sentence network in which each pair of sentences carries the inter-sentence weight obtained by the inter-sentence weight calculation means.
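A minimal sketch of this decomposition, under the assumption that important-word weights are held in a dictionary keyed by (start word, end word) pairs and that the inter-sentence weight is the plain sum of those weights. The text only says the weight is "based on" the sum, so any normalization is left out here, and the function name is hypothetical.

```python
from itertools import product


def sentence_weight(start_words, end_words, word_weights):
    """Sum the weights of every (start word, end word) pair.

    start_words, end_words: sets of important words in the two sentences.
    word_weights: dict mapping (start word, end word) to a weight.
    Pairs that have never been observed contribute 0 (an assumption).
    """
    return sum(word_weights.get(pair, 0.0) for pair in product(start_words, end_words))
```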

Further, the arithmetic processing unit may be provided with important-word weight calculation means that decomposes the input inter-sentence relation information into pieces of important-word relation information, each having one of the important words contained in the start point sentence as its start point and one of the important words contained in the end point sentence as its end point, and updates the corresponding important-word weight each time such a piece of important-word relation information occurs.
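The per-occurrence update can be illustrated as below. The passage does not state the update rule, so adding a fixed increment of 1 for each occurrence is an assumption made purely for illustration, as are the function and parameter names.

```python
from collections import Counter
from itertools import product


def update_word_weights(word_weights, start_words, end_words, increment=1):
    """Update the weight of every (start word, end word) pair implied by one taught edge.

    word_weights: Counter keyed by (start word, end word).
    increment: amount added per occurrence (assumed; the source does not give the rule).
    """
    for pair in product(start_words, end_words):
        word_weights[pair] += increment
    return word_weights
```

Calling the function once per piece of taught inter-sentence relation information accumulates, for each ordered word pair, how often that pair has been taught.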

The arithmetic processing unit may also be provided with knowledge search means that, based on the sentence network information, retrieves from the sentence set in the storage unit the sentences that have a sentence containing the desired knowledge as their start point sentence or end point sentence, and outputs them as sentences related to that knowledge.

Alternatively, the arithmetic processing unit may be provided with knowledge search means that, based on the sentence network information, retrieves from the sentence set in the storage unit the sentences that have a sentence containing the desired knowledge as their start point sentence or end point sentence, and outputs the retrieved sentences, in order of their inter-sentence weights, as sentences related to that knowledge.
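The two search variants above can be sketched as a lookup over weighted edges. The dictionary representation of the sentence network, the "precedes"/"follows" labels, and the descending sort order are assumptions introduced for this example; the text says only that results are output in order of inter-sentence weight.

```python
def related_sentences(network, query_id):
    """Find sentences linked to query_id, heaviest edge first.

    network: dict mapping (start_id, end_id) to an inter-sentence weight.
    Returns a list of (sentence_id, role, weight), where role is "precedes"
    if the sentence is the start point of an edge into query_id, and
    "follows" if it is the end point of an edge out of query_id.
    """
    hits = []
    for (start, end), weight in network.items():
        if start == query_id:
            hits.append((end, "follows", weight))
        elif end == query_id:
            hits.append((start, "precedes", weight))
    # Assumption: larger weights are listed first.
    hits.sort(key=lambda hit: hit[2], reverse=True)
    return hits
```

Because the edges are directed, the role label distinguishes sentences that may describe a cause of the queried event from sentences that may describe its consequences.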

A sentence network information generation device according to the present invention has a storage unit that stores a sentence set consisting of a plurality of sentences in which various kinds of knowledge are described as character information using predetermined important words, and an arithmetic processing unit that generates sentence network information used to retrieve, from the sentence set in the storage unit, desired sentences related to a sentence containing the knowledge to be searched. The arithmetic processing unit is provided with sentence network generation means that receives inter-sentence relation information indicating the relevance and order between two sentences and generates the sentence network information based on the important words contained in the start point sentence located on the start point side of the inter-sentence relation information and in the end point sentence located on its end point side. The sentence network generation means may select, from the sentence set in the storage unit, each sentence containing all of the important words of the start point sentence designated by the input inter-sentence relation information as a start point candidate sentence, select each sentence containing all of the important words of the designated end point sentence as an end point candidate sentence, generate new inter-sentence relation information whose start point is each of the start point sentence and the start point candidate sentences and whose end point is each of the end point sentence and the end point candidate sentences, and generate, from the pairs of information indicating the start point sentence and information indicating the end point sentence of this new inter-sentence relation information, sentence network information indicating the relevance and order among the sentences.

A sentence network generation method according to the present invention is used in a sentence network information generation device having a storage unit that stores a sentence set consisting of a plurality of sentences in which various kinds of knowledge are described as character information using predetermined important words, and an arithmetic processing unit that generates sentence network information used to retrieve, from the sentence set in the storage unit, desired sentences related to a sentence containing the knowledge to be searched. The method generates the sentence network information based on input predetermined inter-sentence relation information. It provides a sentence network generation step in which the arithmetic processing unit receives inter-sentence relation information indicating the relevance and order between two sentences and generates the sentence network information based on the important words contained in the start point sentence located on the start point side of the inter-sentence relation information and in the end point sentence located on its end point side. As this sentence network generation step, the method executes a step of selecting, from the sentence set in the storage unit, each sentence containing all of the important words of the start point sentence designated by the input inter-sentence relation information as a start point candidate sentence and each sentence containing all of the important words of the designated end point sentence as an end point candidate sentence, and a step of generating new inter-sentence relation information whose start point is each of the start point sentence and the start point candidate sentences and whose end point is each of the end point sentence and the end point candidate sentences, and generating, from the pairs of information indicating the start point sentence and information indicating the end point sentence of this new inter-sentence relation information, sentence network information indicating the relevance and order among the sentences.

In this case, the method may further provide an inter-sentence weight calculation step in which the arithmetic processing unit decomposes arbitrary inter-sentence relation information into pieces of important-word relation information, each having one of the important words contained in the start point sentence as its start point and one of the important words contained in the end point sentence as its end point, and calculates the inter-sentence weight corresponding to that inter-sentence relation information based on the sum of the important-word weights corresponding to these pieces of important-word relation information. The sentence network information generation step may then generate a sentence network in which each pair of sentences carries the inter-sentence weight obtained in the inter-sentence weight calculation step.

Further, the method may provide an important-word weight calculation step in which the arithmetic processing unit decomposes the input inter-sentence relation information into pieces of important-word relation information, each having one of the important words contained in the start point sentence as its start point and one of the important words contained in the end point sentence as its end point, and updates the corresponding important-word weight each time such a piece of important-word relation information occurs.

A program according to the present invention causes the computer of a sentence network information generation device, which has a storage unit that stores a sentence set consisting of a plurality of sentences in which various kinds of knowledge are described as character information using predetermined important words and an arithmetic processing unit that generates sentence network information used to retrieve, from the sentence set in the storage unit, desired sentences related to a sentence containing the knowledge to be searched, to execute the following steps: a step of receiving inter-sentence relation information indicating the relevance and order between two sentences, selecting from the sentence set in the storage unit sentences containing the same important words as the start point sentence located on the start point side of the inter-sentence relation information as start point candidate sentences, and selecting sentences containing the same important words as the end point sentence located on its end point side as end point candidate sentences; and a step of generating new inter-sentence relation information whose start point is each of the start point sentence and the start point candidate sentences and whose end point is each of the end point sentence and the end point candidate sentences, and generating, from the pairs of information indicating the start point sentence and information indicating the end point sentence of this new inter-sentence relation information, sentence network information indicating the relevance and order among the sentences.

According to the present invention, sentences containing the same important words as the start point sentence designated by the teaching information are selected from the sentence set as start point candidate sentences, and sentences containing the same important words as the end point sentence designated by the teaching information are selected from the sentence set as end point candidate sentences. A sentence network indicating the relevance and order among the sentences is then generated using new directed line segments whose start points are the start point sentence and each start point candidate sentence and whose end points are the end point sentence and each end point candidate sentence. As a result, a sentence network that is not bound by the subjectivity of an expert's teaching can be constructed, and desired related knowledge having an arbitrary order relation to the knowledge to be searched can be retrieved flexibly.

Next, embodiments of the present invention will be described with reference to the drawings.
First, a related knowledge search device according to an embodiment of the present invention will be described with reference to FIG. 1. FIG. 1 is a block diagram showing the configuration of the related knowledge search device according to the embodiment.

The related knowledge search device 1 according to this embodiment generates the sentence network used for related knowledge search as follows. For the start point sentence and the end point sentence designated by teaching information, it selects from the sentence set the sentences containing the same important words as these sentences as start point candidate sentences and end point candidate sentences. It then generates a sentence network indicating the relevance and order among the sentences, based on new inter-sentence relation information whose start point is each of the start point sentence and the start point candidate sentences and whose end point is each of the end point sentence and the end point candidate sentences.
In the following, inter-sentence relation information, which indicates the relation between two sentences, and important-word relation information, which indicates the relation between two important words, are each represented by a directed line segment in the sense of graph theory.

The related knowledge search device 1 as a whole is an information processing device having a computer, and is provided with a screen display unit 11, an operation input unit 12, an information input/output unit 13, a storage unit 14, and an arithmetic processing unit 15.
The screen display unit 11 is a screen display device such as an LCD or a CRT, and displays various kinds of information on the screen according to the output from the arithmetic processing unit 15.
The operation input unit 12 is an input device such as a keyboard or a mouse; it detects user operations and outputs them to the arithmetic processing unit 15.
The information input/output unit 13 is a circuit unit that is connected to a recording medium or a communication network and exchanges the various processing information and programs used for processing in the arithmetic processing unit 15.

The storage unit 14 is a storage device such as a hard disk or a memory, and stores the various processing information 14A to 14E and the program 14F used for processing in the arithmetic processing unit 15.
The processing information consists of: a sentence set 14A made up of a plurality of sentences in which various kinds of knowledge are described as text data; teaching information 14B indicating relations between arbitrary sentences as taught by an expert; an important word list 14C made up of a plurality of important words used to analyze the sentences in the sentence set 14A and grasp their content; sentence network information 14D indicating the relations among the sentences of the sentence set 14A; and important word network information 14E indicating the relations among the important words used in the sentences of the sentence set 14A.
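As a rough picture of how the processing information 14A to 14E relates, the following sketch groups it into one container. The class name, field names, and the concrete Python types are assumptions chosen for illustration; the specification does not prescribe any particular data layout.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeStore:
    """Hypothetical in-memory layout of the processing information 14A to 14E."""

    # 14A: sentence number -> sentence text
    sentence_set: dict[int, str] = field(default_factory=dict)
    # 14B: taught directed edges (start sentence number, end sentence number)
    teaching_info: list[tuple[int, int]] = field(default_factory=list)
    # 14C: term number -> (front keyword, rear keyword)
    important_words: dict[int, tuple[str, str]] = field(default_factory=dict)
    # 14D: (start sentence number, end sentence number) -> inter-sentence weight
    sentence_network: dict[tuple[int, int], float] = field(default_factory=dict)
    # 14E: (start term number, end term number) -> important-word weight
    word_network: dict[tuple[int, int], float] = field(default_factory=dict)
```

Keying both networks by ordered pairs keeps the direction of each directed line segment explicit.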

The arithmetic processing unit 15 consists of a CPU and its peripheral circuits. By reading and executing the program 14F in the storage unit 14, it makes the hardware and the program work together to realize various functional means.
The functional means are teaching information receiving means 15A, sentence network generation means 15B, inter-sentence weight calculation means 15C, important word network generation means 15D, important-word weight calculation means 15E, and knowledge search means 15F.

The teaching information receiving means 15A is a functional means that accepts operations from the operation input unit 12, generates teaching information 14B indicating the relations between arbitrary sentences as taught by an expert, and stores it in the storage unit 14.
The sentence network generation means 15B is a functional means that analyzes the relations among the sentences of the sentence set 14A based on the teaching information 14B and the important word list 14C, and generates the sentence network information 14D.
The inter-sentence weight calculation means 15C is a functional means that calculates the weight between the sentences constituting the sentence network information 14D, based on the number of important words contained in the two sentences.

The important word network generation means 15D is a functional means that analyzes the relations among the important words contained in the sentences of the sentence set 14A based on the teaching information 14B and the important word list 14C, and generates the important word network information 14E.
The important-word weight calculation means 15E is a functional means that calculates the weight between the important words constituting the important word network information 14E, based on the teaching information 14B.
The knowledge search means 15F is a functional means that searches for sentences related to a desired sentence based on the sentence network information 14D and outputs them as related knowledge.

FIG. 2 shows a configuration example of the sentence set 14A. This example compiles sentences about "stress" that a large number of respondents wrote freely on the Web, and a sentence number Di for managing the sentence is assigned to each sentence D.
FIG. 3 shows a configuration example of the important word list 14C. The important word list 14C is built by analyzing each sentence D with a predetermined algorithm and composing each important word T from the types of the words obtained and their order of appearance. A term number Tj for managing the important word is assigned to each important word T.

Each important word T consists of a pair of keywords: a front keyword, located toward the front, and a rear keyword, located toward the rear. For each keyword, the word indicating its content and the part-of-speech attribute type of that word are defined.
For example, important word "1" consists of the two keywords "stress" and "elimination", and its positional relation is defined such that "stress" is located at the front. Important word "2" likewise consists of the two keywords "stress" and "elimination", but its positional relation is defined such that "stress" is located at the rear.
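The distinction between important words "1" and "2" can be illustrated with an ordered pair of keywords. This is only a sketch: the class names, the part-of-speech labels, and the matching rule (the front keyword must appear somewhere before the rear keyword in a tokenized sentence) are assumptions, since the text leaves the analysis to "a predetermined algorithm".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyword:
    word: str
    pos: str  # part-of-speech attribute type; the label values used here are assumed


@dataclass(frozen=True)
class ImportantWord:
    term_number: int
    front: Keyword
    rear: Keyword


def contains(tokens, term):
    """Return True if term.front occurs before term.rear in the sentence.

    tokens: list of (word, pos) tuples in sentence order (assumed tokenizer output).
    """
    front = (term.front.word, term.front.pos)
    rear = (term.rear.word, term.rear.pos)
    for i, token in enumerate(tokens):
        if token == front and rear in tokens[i + 1:]:
            return True
    return False
```

Under this rule a sentence tokenized as "stress", "elimination", "walk" contains important word "1" (stress at the front) but not important word "2" (stress at the rear).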

[Sentence network generation operation]
Next, the sentence network generation operation of the related knowledge search device according to the first embodiment of the present invention will be described with reference to FIG. 4. FIG. 4 is a flowchart showing the sentence network generation process in the related knowledge search device according to the first embodiment.
In response to a predetermined sentence network generation instruction operation from the operation input unit 12, or to completion of reception of new teaching information 14B by the teaching information receiving means 15A, the arithmetic processing unit 15 starts the sentence network generation process of FIG. 4 using the sentence network generation means 15B.

First, for each sentence included in the sentence set 14A of the storage unit 14, the sentence network generation means 15B has the inter-sentence weight calculation means 15C calculate the inter-sentence weight v between that sentence and the start point sentence Ds located on the start point side of the directed line segment of the teaching information 14B (step 100). Here, each weight is calculated with the start point sentence Ds on the start point side and each sentence of the sentence set 14A on the end point side.

The inter-sentence weight v is obtained by the inter-sentence weight calculation means 15C using the following expression (1). In expression (1), |De| denotes the number of important words included in the end point sentence De located on the end point side of the directed line segment of the teaching information 14B, and |De∩DOCd| denotes the number of important words included in both the end point sentence De and the sentence DOCd located on the end point side of the directed line segment being calculated.

[Expression (1): equation image not reproduced in this text]

Thus, according to expression (1), when all of the important words included in the end point sentence De are also included in the sentence DOCd, the variable c is defined as 1 and the inter-sentence weight v is also "1". Conversely, when not all of the important words included in the end point sentence De are included in the sentence DOCd, the variable c is defined as 0 and the inter-sentence weight v is also "0".
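The image of expression (1) is not available in this text. One reading that is consistent with the description above, offered as a reconstruction and not as the original form of the expression, is:

```latex
v = c, \qquad
c =
\begin{cases}
1 & \text{if } |D_e \cap DOC_d| = |D_e| \\
0 & \text{otherwise}
\end{cases}
```

Under this reading, the condition |De∩DOCd| = |De| holds exactly when every important word of De also appears in DOCd.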

In this way, the inter-sentence weight v between the start point sentence Ds and each sentence of the sentence set 14A is calculated, each sentence whose inter-sentence weight v is a valid value, that is, a value other than zero, is selected as an end point candidate sentence for the start point sentence Ds (step 101), and new directed line segments are generated from the start point sentence Ds to the end point sentence De and to each end point candidate sentence (step 102).

Next, for each sentence included in the sentence set 14A of the storage unit 14, the sentence network generation means 15B has the inter-sentence weight calculation means 15C calculate the inter-sentence weight v between that sentence and the end point sentence De located on the end point side of the directed line segment of the teaching information 14B (step 103). Here, each weight is calculated with each sentence of the sentence set 14A on the start point side and the end point sentence De on the end point side. In expression (1), Ds is used in place of De, and the sentence DOCs located on the start point side of the directed line segment being calculated is used in place of DOCd.
Then, each sentence whose inter-sentence weight v is a valid value, that is, a value other than zero, is selected as a start point candidate sentence for the end point sentence De (step 104), and new directed line segments are generated from each start point candidate sentence to the end point sentence De and to each end point candidate sentence (step 105).

As a result, sentences that include all of the important words included in the end point sentence De are selected from the sentence set 14A as end point candidate sentences, sentences that include all of the important words included in the start point sentence Ds are selected from the sentence set 14A as start point candidate sentences, and new directed line segments are generated that have the start point sentence Ds or a start point candidate sentence as the start point and the end point sentence De or an end point candidate sentence as the end point.
Then, the pairs of start point sentence and end point sentence representing these directed line segments are output as the sentence network information 14D and stored in the storage unit 14 (step 106), and the series of sentence network generation processes ends.

FIG. 5 shows an example of the sentence network generation operation. In this example, as shown in FIG. 5(a), a directed line segment Vk0 having sentence D163 as the start point sentence Ds and sentence D357 as the end point sentence De is assumed to have been specified as the teaching information 14B.
Then, in steps 100 and 101 of FIG. 4, the end point candidate sentences D380 and D421, whose inter-sentence weights v1 and v2 with respect to the start point sentence D163 are valid values (other than zero), are selected from the sentence set 14A as shown in FIG. 5(b), and in step 102 of FIG. 4, directed line segments Vd1 and Vd2 are generated on the sentence network from the start point sentence D163 to each of the end point candidate sentences.

Subsequently, in steps 103 and 104 of FIG. 4, the start point candidate sentence D196, whose inter-sentence weight v3 with respect to the end point sentence D357 is a valid value, is selected from the sentence set 14A as shown in FIG. 5(c), and in step 105 of FIG. 4, directed line segments Vd3 to Vd5 are generated from the start point candidate sentence D196 to the end point sentence D357 and to the end point candidate sentences D380 and D421.
These directed line segments Vk0 and Vd1 to Vd5 are then output as the sentence network information 14D. FIG. 6 shows a configuration example of the sentence network information 14D. Here, for each directed line segment, the sentence number of the start point sentence located on its start point side and the sentence number of the end point sentence located on its end point side are recorded as a pair.
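Steps 100 to 106 and the FIG. 5 example can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the function name, the dictionary layout, and the important word numbers assigned to each sentence are invented for illustration (only the sentence numbers come from the example), every sentence is assumed to contain at least one important word, and self-loops are skipped, which the description does not address.

```python
def build_sentence_network(sentences, ds, de):
    """Return directed line segments as (start sentence, end sentence) pairs.

    sentences: dict mapping a sentence number to the set of important word
    numbers found in that sentence (hypothetical layout).
    ds, de: start point and end point sentence of the taught directed line segment.
    """
    ds_terms, de_terms = sentences[ds], sentences[de]
    # Steps 100-101: a sentence has weight v = 1 toward the end side when it
    # contains every important word of De; De itself trivially qualifies.
    end_points = [d for d, terms in sentences.items() if de_terms.issubset(terms)]
    # Steps 103-104: the same test with Ds in place of De selects start points.
    start_points = [d for d, terms in sentences.items() if ds_terms.issubset(terms)]
    # Steps 102 and 105: connect every start point to every end point.
    # Skipping self-loops is an assumption of this sketch.
    return [(s, e) for s in start_points for e in end_points if s != e]


# Important word numbers below are invented for illustration.
sentences = {
    "D163": {1, 2},
    "D196": {1, 2, 9},
    "D357": {3},
    "D380": {3, 4},
    "D421": {3, 5},
    "D500": {7},
}
edges = build_sentence_network(sentences, "D163", "D357")
```

With these invented important word sets, `edges` holds the taught segment D163 to D357 plus five new segments, matching Vk0 and Vd1 to Vd5 in the example, and the unrelated sentence D500 stays unconnected.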

In this way, for the start point sentence Ds and the end point sentence De specified by the teaching information 14B, sentences that include the same important words as these sentences are selected from the sentence set as start point candidate sentences and end point candidate sentences, and a sentence network indicating the relevance and order among the sentences is generated based on new directed line segments that have the start point sentence Ds or a start point candidate sentence as the start point and the end point sentence De or an end point candidate sentence as the end point. The teaching information 14B can therefore be analyzed at the important word level, and a sentence network containing new directed line segments not specified in the expert's teaching can be constructed easily. As a result, ordered related knowledge can be searched flexibly from new viewpoints that go beyond the subjective range of the expert's teaching.

Next, the sentence network generation operation of the related knowledge search device according to the second embodiment of the present invention will be described with reference to FIG. 7 and FIG. 9. FIG. 7 is a flowchart showing the important word network generation process in the related knowledge search device according to the second embodiment. FIG. 9 is a flowchart showing the sentence network generation process in the related knowledge search device according to the second embodiment.

In the first embodiment described above, the case was described in which, when new directed line segments are generated in the sentence network, sentences that include all of the important words of the start point sentence or the end point sentence specified by the teaching information are selected as start point candidate sentences and end point candidate sentences.
In the present embodiment, a case will be described in which the directed line segment specified by the teaching information is decomposed into directed line segments between the important words included in its start point sentence and its end point sentence, important word network information consisting of directed line segments that carry weights between the important words is generated, and the start point candidate sentences and end point candidate sentences used when generating new directed line segments are selected based on this important word network information.

In response to a predetermined sentence network generation instruction operation from the operation input unit 12, or to completion of reception of new teaching information 14B by the teaching information receiving means 15A, the arithmetic processing unit 15 first starts the important word network generation process of FIG. 7 using the important word network generation means 15D.
For the directed line segment Vk specified by the teaching information 14B in the storage unit 14, the important word network generation means 15D acquires the important words included in the start point sentence Ds located on its start point side and the important words included in the end point sentence De located on its end point side (step 110).

Then, by generating directed line segments between these start-point-side important words and end-point-side important words, the directed line segment specified by the teaching information 14B is decomposed into inter-important-word directed line segments (step 111). Each time an inter-important-word directed line segment is generated, the inter-important-word weight w corresponding to that directed line segment is updated (step 112).

The inter-important-word weight w is obtained by the inter-important-word weight calculation means 15E using the following expression (2). In expression (2), w0 is the previous inter-important-word weight; in the initial state in which the inter-important-word directed line segment has not yet been generated, that is, when the directed line segment is newly created, w0 = 0. α is a coefficient that determines the proportion of the previous weight w0 that is carried over, β is a coefficient that determines the update width on the positive side relative to w0, and γ is a coefficient that determines the update width on the negative side relative to w0. The coefficients α, β, and γ each take a value of 1 or more.

[Expression (2): equation image not reproduced in this text]

In this way, the inter-important-word weight w is calculated for each inter-important-word directed line segment. These inter-important-word directed line segments and their weights w are then output as the important word network information 14E and stored in the storage unit 14 (step 113), and the series of important word network generation processes ends.
As a result, important word network information such as that shown in FIG. 8 is generated. Here, for each directed line segment, the important word number of the start point important word located on its start point side, the important word number of the end point important word located on its end point side, and the inter-important-word weight w are recorded as a set.
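The decomposition in steps 110 to 113 can be sketched as follows. Because the image of expression (2) is not available here, the update rule `unit_increment` is a purely hypothetical stand-in (add 1 per teaching) and is not the patent's formula; the function names, the dictionary layout, and the assumption that every start-side important word is paired with every end-side important word are also illustrative.

```python
from itertools import product


def unit_increment(w0):
    # Hypothetical stand-in for expression (2): NOT the patent's update rule.
    return w0 + 1.0


def update_keyword_network(network, start_terms, end_terms, update=unit_increment):
    """Decompose one taught directed line segment into important word pairs.

    network: dict mapping (start important word no., end important word no.)
    to the inter-important-word weight w (hypothetical layout).
    """
    # Step 111: one directed line segment per (start word, end word) pairing.
    for edge in product(sorted(start_terms), sorted(end_terms)):
        w0 = network.get(edge, 0.0)   # w0 = 0 for a newly created segment
        network[edge] = update(w0)    # step 112: update the weight
    return network                    # step 113: the caller stores the result


network = {}
update_keyword_network(network, {1, 2}, {3})     # first teaching
update_keyword_network(network, {1}, {3, 4})     # second teaching repeats (1, 3)
```

Passing the update rule as a parameter keeps the decomposition independent of the exact form of expression (2), so a rule using α, β, and γ could be substituted without changing the loop.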

Next, the arithmetic processing unit 15 starts the sentence network generation process of FIG. 9 using the sentence network generation means 15B.
First, for each sentence included in the sentence set 14A of the storage unit 14, the sentence network generation means 15B has the inter-sentence weight calculation means 15C calculate the inter-sentence weight v between that sentence and the start point sentence Ds located on the start point side of the directed line segment of the teaching information 14B (step 120). Here, each weight is calculated with the start point sentence Ds on the start point side and each sentence of the sentence set 14A on the end point side.

The inter-sentence weight v is obtained by the inter-sentence weight calculation means 15C using the following expression (3). In expression (3), |De| denotes the number of important words included in the end point sentence De located on the end point side of the directed line segment of the teaching information 14B, and |De∩DOCd| denotes the number of important words included in both the end point sentence De and the sentence DOCd located on the end point side of the directed line segment being calculated. M denotes the number of important words included in the sentence Ds located on the start point side of the directed line segment being calculated, and N denotes the number of important words included in the sentence DOCd located on its end point side.

[Expression (3): equation image not reproduced in this text]

Thus, according to expression (3), when all of the important words included in the end point sentence De are also included in the end-point-side sentence DOCd, the variable c is defined as 1, and the inter-sentence weight v is the sum of the weights w of the inter-important-word directed line segments between the important words of the start point sentence Ds and the important words of the end-point-side sentence DOCd. Conversely, when not all of the important words included in the end point sentence De are included in the end-point-side sentence DOCd, the variable c is defined as 0 and the inter-sentence weight v is also "0".
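The behavior described for expression (3) can be sketched as a gate c multiplied by a sum of inter-important-word weights. This is a sketch of that description only, since the expression image is not available; the function name, the dictionary layout of the important word network, and the choice to treat a missing pair as weight 0 are assumptions.

```python
def inter_sentence_weight(ds_terms, de_terms, docd_terms, network):
    """Inter-sentence weight v from Ds toward a candidate end-side sentence DOCd.

    network: dict mapping (start important word no., end important word no.)
    to the inter-important-word weight w (hypothetical layout).
    """
    # c = 1 only when DOCd contains every important word of De.
    c = 1 if de_terms.issubset(docd_terms) else 0
    # Sum w over all M x N pairings of Ds words with DOCd words;
    # pairs absent from the network contribute 0 (an assumption).
    total = sum(network.get((i, j), 0.0) for i in ds_terms for j in docd_terms)
    return c * total


# Illustrative weights and important word numbers, not taken from FIG. 8.
network = {(1, 3): 2.0, (2, 3): 1.0, (1, 4): 1.0}
v_match = inter_sentence_weight({1, 2}, {3}, {3, 4}, network)
v_miss = inter_sentence_weight({1, 2}, {3}, {4}, network)
```

Here `v_match` sums the three stored weights because the candidate contains important word 3, while `v_miss` is zero because the candidate lacks it, so that sentence would not be selected as an end point candidate.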

In this way, the inter-sentence weight v between the start point sentence Ds and each sentence of the sentence set 14A is calculated, each sentence whose inter-sentence weight v is a valid value, that is, a value other than zero, is selected as an end point candidate sentence for the start point sentence Ds (step 121), and new directed line segments are generated from the start point sentence Ds to the end point sentence De and to each end point candidate sentence (step 122).

Next, for each sentence included in the sentence set 14A of the storage unit 14, the sentence network generation means 15B has the inter-sentence weight calculation means 15C calculate the inter-sentence weight v between that sentence and the end point sentence De located on the end point side of the directed line segment of the teaching information 14B (step 123). Here, each weight is calculated with each sentence of the sentence set 14A on the start point side and the end point sentence De on the end point side. In expression (3), Ds is used in place of De, and the sentence DOCs located on the start point side of the directed line segment being calculated is used in place of DOCd.
Then, each sentence whose inter-sentence weight v is a valid value, that is, a value other than zero, is selected as a start point candidate sentence for the end point sentence De (step 124), new directed line segments are generated from each start point candidate sentence to the end point sentence De and to each end point candidate sentence (step 125), and the inter-sentence weight v is calculated for each of these directed line segments using the inter-sentence weight calculation means 15C (step 126).

As a result, sentences that include all of the important words included in the end point sentence De are selected from the sentence set 14A as end point candidate sentences, sentences that include all of the important words included in the start point sentence Ds are selected from the sentence set 14A as start point candidate sentences, and new directed line segments are generated that have the start point sentence or a start point candidate sentence as the start point and the end point sentence De or an end point candidate sentence as the end point.
Then, the sets of start point sentence, end point sentence, and inter-sentence weight representing these directed line segments are output as the sentence network information 14D and stored in the storage unit 14 (step 127), and the series of sentence network generation processes ends.

FIG. 10 shows a configuration example of the sentence network information 14D. Here, for each directed line segment, the sentence number of the start point sentence located on its start point side, the sentence number of the end point sentence located on its end point side, and the inter-sentence weight v are recorded as a set; drawn as a graph, this gives the sentence network of FIG. 11.

Here, the directed line segment from the start point sentence D163 to the end point sentence D357, which was specified by the teaching information 14B, is given an inter-sentence weight of "5".
In addition, as new directed line segments other than the one specified by the teaching information 14B, directed line segments are generated from the start point sentence D163 to each of the end point candidate sentences D380 and D421, with inter-sentence weights of "12" and "7", respectively. Directed line segments are also generated from the start point candidate sentence D196 to the end point sentence D357 and to the end point candidate sentences D380 and D421, with inter-sentence weights of "10", "8", and "6", respectively.

In this way, when new directed line segments are generated in the sentence network, the directed line segment specified by the teaching information is decomposed into directed line segments between the important words included in its start point sentence and its end point sentence, important word network information consisting of directed line segments that carry weights between the important words is generated, and the start point candidate sentences and end point candidate sentences used when generating new directed line segments are selected based on this important word network information. The sentence network can therefore express not only whether sentences are related and in what order, but also the weight of each relation.
As a result, when related knowledge is searched for a given task, knowledge that is strongly connected to the task can be found easily based on the inter-sentence weights, and the desired knowledge can be retrieved efficiently from a large body of knowledge.

In addition, according to expression (2) for calculating the inter-important-word weight w, when teaching is repeated for the same inter-important-word directed line segment, the weight w corresponding to that directed line segment increases and the relevance between its start point important word and end point important word becomes stronger, so that valid teaching accumulates in the important word network. The coefficient γ may also be expressed as a function of time so that the inter-important-word weights decay as time passes in the important word network; directed line segments whose teaching is rarely repeated are then weeded out, which prunes relevance that is no longer much used and allows new relevance to be used effectively.

Next, the knowledge search operation of the related knowledge search device according to the third embodiment of the present invention will be described with reference to FIG. 12. FIG. 12 is a flowchart showing the knowledge search process in the related knowledge search device according to the third embodiment.
In the present embodiment, a case will be described in which sentences related to a desired sentence are searched for using the sentence network information 14D obtained by the sentence network generation operation of the second embodiment described above.

In response to a predetermined related knowledge search instruction operation from the operation input unit 12, the arithmetic processing unit 15 starts the knowledge search process of FIG. 12 using the knowledge search means 15F.
The knowledge search means 15F first acquires, from the operation input unit 12, the designation of a search target sentence as the task (step 130), and acquires from the sentence network information 14D in the storage unit 14 the end point sentence and inter-sentence weight of each directed line segment that has the search target sentence as its start point sentence (step 131). The end point sentences thus obtained are then displayed on the screen display unit 11 in descending order of inter-sentence weight, or only those whose inter-sentence weight exceeds a predetermined threshold are displayed (step 132).

Subsequently, the knowledge search means 15F acquires from the sentence network information 14D in the storage unit 14 the start point sentence and inter-sentence weight of each directed line segment that has the search target sentence as its end point sentence (step 133). The start point sentences thus obtained are then displayed on the screen display unit 11 in descending order of inter-sentence weight, or only those whose inter-sentence weight exceeds a predetermined threshold are displayed (step 134).
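Steps 130 to 134 can be sketched as a lookup over weighted directed line segments. The segment list uses the inter-sentence weights stated above for FIG. 10; the function name, the tuple layout, and the decision to apply the optional threshold before sorting (the description presents sorting and thresholding as alternatives) are assumptions of this sketch.

```python
# (start sentence, end sentence, inter-sentence weight v), as described for FIG. 10.
EDGES = [
    ("D163", "D357", 5), ("D163", "D380", 12), ("D163", "D421", 7),
    ("D196", "D357", 10), ("D196", "D380", 8), ("D196", "D421", 6),
]


def search_related(edges, target, threshold=None):
    """Return (following, preceding) sentences for a search target sentence."""

    def rank(pairs):
        # Keep weights above the optional threshold, heaviest first.
        kept = [(d, v) for d, v in pairs if threshold is None or v > threshold]
        return sorted(kept, key=lambda pair: pair[1], reverse=True)

    # Steps 131-132: segments that start at the target give its end point sentences.
    following = rank((e, v) for s, e, v in edges if s == target)
    # Steps 133-134: segments that end at the target give its start point sentences.
    preceding = rank((s, v) for s, e, v in edges if e == target)
    return following, preceding


following_163, preceding_163 = search_related(EDGES, "D163")
following_357, preceding_357 = search_related(EDGES, "D357")
```

For D163 the forward list ranks D380 and D421 above the taught end point D357, and for D357 the backward list ranks D196 above the taught start point D163.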

Therefore, when knowledge related to sentence D163 is searched for using the sentence network information of FIG. 10 and FIG. 11 described above, new directed line segments from sentence D163 to sentences D380 and D421 are obtained in addition to the directed line segment from sentence D163 to sentence D357 specified by the teaching information 14B, as shown in the search result screen display example of FIG. 13. The results are displayed from the top in descending order of inter-sentence weight, so it is easy to see that there are sentences more strongly related than the one reached by the directed line segment specified by the teaching information 14B.

Likewise, when knowledge related to sentence D357 is searched for using the sentence network information of FIG. 10 and FIG. 11 described above, a new directed line segment from sentence D196 with sentence D357 as its end point is obtained in addition to the directed line segment from sentence D163 to sentence D357 specified by the teaching information 14B, as shown in the search result screen display example of FIG. 14. The results are displayed from the top in descending order of inter-sentence weight, so it is easy to see that there is a sentence more important than the one on the directed line segment specified by the teaching information 14B.

In the above, the case was described in which the sentence network generation process, or the important word network generation process and the sentence network generation process, is performed in response to a predetermined sentence network generation instruction operation from the operation input unit 12 or to completion of reception of new teaching information 14B by the teaching information receiving means 15A; however, the timing of these processes is not limited to this.
For example, the teaching information receiving means 15A may accept teaching successively and store it in the teaching information 14B of the storage unit 14, and the sentence network generation process, or the important word network generation process and the sentence network generation process, may be executed based on the teaching information 14B in the storage unit 14 only when a related knowledge search instruction operation is actually received from the operation input unit 12. In that case, the sentence network generation process and the important word network generation process are not performed unless a related knowledge search is performed, so unnecessary processing can be avoided.
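The deferred timing described above can be sketched as a store that only records teaching and builds the network on the first search that needs it. The class name, method names, and the `build` callable are hypothetical, and caching the built network until the next teaching arrives is an added assumption; the description only says that generation may wait until a search is requested.

```python
class LazyNetworkStore:
    """Accumulates teaching and builds the network only when a search needs it."""

    def __init__(self, build):
        self._build = build        # hypothetical callable: list of teachings -> network
        self._teachings = []
        self._network = None
        self.build_count = 0       # counts how often generation actually ran

    def accept_teaching(self, start, end):
        # Store the teaching only; no network generation happens here.
        self._teachings.append((start, end))
        self._network = None       # any cached network is now stale

    def network(self):
        # Called at search time: generate only if nothing valid is cached.
        if self._network is None:
            self._network = self._build(list(self._teachings))
            self.build_count += 1
        return self._network


# A trivial stand-in builder; a real one would run the generation processes.
store = LazyNetworkStore(build=lambda teachings: set(teachings))
store.accept_teaching("D163", "D357")
store.accept_teaching("D196", "D357")
```

Until `store.network()` is called, `build_count` stays at zero, which corresponds to skipping generation when no related knowledge search is ever requested.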

In the above, a related knowledge search device having a knowledge search means as the function for searching related knowledge was described as an example; however, it is also possible to configure a network information generation device for related knowledge search that generates only the sentence network information or the important word network information used for related knowledge search.

In this case, for a network information generation device for related knowledge search corresponding to the first embodiment described above, it suffices, within the configuration of FIG. 1, to provide the arithmetic processing unit 15 with at least the sentence network generation means 15B and the inter-sentence weight calculation means 15C, and to provide the storage unit 14 with the sentence set 14A, the teaching information 14B, and the important word list 14C as processing information.
For a network information generation device for related knowledge search corresponding to the second embodiment described above, it suffices to additionally provide the arithmetic processing unit 15 with the important word network generation means 15D and the inter-important-word weight calculation means 15E.

FIG. 1 is a block diagram showing the configuration of a related knowledge search device according to an embodiment of the present invention.
FIG. 2 is a configuration example of a sentence set.
FIG. 3 is a configuration example of an important word list.
FIG. 4 is a flowchart showing the sentence network generation process according to the first embodiment of the present invention.
FIG. 5 is an explanatory diagram showing an example of the sentence network generation operation according to the first embodiment of the present invention.
FIG. 6 is a configuration example of the sentence network information obtained by the sentence network generation process of FIG. 4.
FIG. 7 is a flowchart showing the important word network generation process according to the second embodiment of the present invention.
FIG. 8 is a configuration example of the important word network information obtained by the important word network generation process of FIG. 7.
FIG. 9 is a flowchart showing the sentence network generation process according to the second embodiment of the present invention.
FIG. 10 is a configuration example of the sentence network information obtained by the sentence network generation process of FIG. 9.
FIG. 11 is a graph showing the sentence network of FIG. 10.
FIG. 12 is a flowchart showing the knowledge search process according to the third embodiment of the present invention.
FIG. 13 is a display example of a search result screen obtained by the knowledge search process of FIG. 12.
FIG. 14 is another display example of a search result screen obtained by the knowledge search process of FIG. 12.

Explanation of symbols

1 ... related knowledge search device, 11 ... screen display unit, 12 ... operation input unit, 13 ... information input/output unit, 14 ... storage unit, 14A ... sentence set, 14B ... teaching information, 14C ... important word list, 14D ... sentence network information, 14E ... important word network information, 14F ... program, 15 ... arithmetic processing unit, 15A ... teaching information receiving means, 15B ... sentence network generation means, 15C ... inter-sentence weight calculation means, 15D ... important word network generation means, 15E ... inter-important-word weight calculation means, 15F ... knowledge search means.

Claims (10)

A related knowledge search device comprising: a storage unit that stores a sentence set consisting of a plurality of sentences in which various kinds of knowledge are described as character information using predetermined important words, and sentence network information indicating relations between the sentences belonging to the sentence set; and an arithmetic processing unit that uses the sentence network information in the storage unit to search the sentence set in the storage unit for a desired sentence related to a sentence containing search target knowledge,
wherein the arithmetic processing unit comprises sentence network generation means that receives, as input, inter-sentence relation information indicating the relevance and order between two sentences, and generates the sentence network information based on the important words contained in a start point sentence located on the start point side of the inter-sentence relation information and in an end point sentence located on the end point side of the inter-sentence relation information, and
wherein the sentence network generation means selects, from the sentence set in the storage unit, sentences containing all of the same important words as the start point sentence specified by the input inter-sentence relation information as start point candidate sentences, and selects, from the sentence set in the storage unit, sentences containing all of the same important words as the end point sentence specified by the inter-sentence relation information as end point candidate sentences; generates new inter-sentence relation information having each of the start point sentence and the start point candidate sentences as a start point and each of the end point sentence and the end point candidate sentences as an end point; and generates, from pairs of information indicating the start point sentence and information indicating the end point sentence of the new inter-sentence relation information, the sentence network information indicating the relevance and order among the sentences.
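The candidate selection and expansion recited in this claim can be sketched in Python as follows. This is a hedged illustration under several assumptions that are not stated in the claim: important words are detected by plain substring matching, a sentence with no important words yields no candidates, and self-loops are dropped. The function names and the sample sentences are hypothetical.

```python
from itertools import product


def keywords_in(text, keyword_list):
    """Return the set of listed important words that occur in the text (substring match, an assumption)."""
    return {kw for kw in keyword_list if kw in text}


def candidates(sentence_id, sentences, keyword_list):
    """IDs of the other sentences that contain every important word of the given sentence."""
    target = keywords_in(sentences[sentence_id], keyword_list)
    if not target:
        return []  # assumption: a sentence without important words has no candidates
    return [sid for sid, text in sentences.items()
            if sid != sentence_id and target <= keywords_in(text, keyword_list)]


def expand_relation(start_id, end_id, sentences, keyword_list):
    """Expand one taught relation (start -> end) into relations over all candidate pairs."""
    starts = [start_id] + candidates(start_id, sentences, keyword_list)
    ends = [end_id] + candidates(end_id, sentences, keyword_list)
    # Each (start, end) pair is one directed edge; self-loops are skipped (an assumption).
    return {(s, e) for s, e in product(starts, ends) if s != e}


sentences = {
    1: "pump pressure drops",
    2: "check the valve seal",
    3: "pump pressure drops after restart",
    4: "replace the valve seal gasket",
}
keyword_list = ["pump", "pressure", "valve", "seal"]
edges = expand_relation(1, 2, sentences, keyword_list)
```

With this sample data, sentence 3 contains both important words of sentence 1 and sentence 4 contains both important words of sentence 2, so the single taught relation from 1 to 2 expands into four directed pairs.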
The related knowledge search device according to claim 1,
wherein the arithmetic processing unit further comprises inter-sentence weight calculation means that decomposes arbitrary inter-sentence relation information into pieces of inter-important-word relation information, each having one of the important words contained in the start point sentence of the inter-sentence relation information as a start point and one of the important words contained in the end point sentence of the inter-sentence relation information as an end point, and calculates an inter-sentence weight corresponding to the inter-sentence relation information based on the sum of the inter-important-word weights corresponding to those pieces of inter-important-word relation information, and
wherein the sentence network generation means generates a sentence network having, for each pair of sentences, the inter-sentence weight obtained by the inter-sentence weight calculation means.
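A small Python sketch may make the weight calculation concrete. The claim says only that the inter-sentence weight is "based on the sum" of the inter-important-word weights; the sketch below assumes the plain sum and treats a missing word pair as weight zero. The function name, the dictionary layout, and the numbers are hypothetical.

```python
def sentence_weight(start_keywords, end_keywords, keyword_weights):
    """Sum the weights of every (start word, end word) pair; unknown pairs count as 0 (an assumption)."""
    return sum(keyword_weights.get((a, b), 0.0)
               for a in start_keywords
               for b in end_keywords)


# Hypothetical inter-important-word weights, keyed by (start word, end word).
keyword_weights = {
    ("pump", "valve"): 2.0,
    ("pressure", "valve"): 1.0,
    ("pump", "seal"): 0.5,
}
weight = sentence_weight({"pump", "pressure"}, {"valve", "seal"}, keyword_weights)
```

Here the relation decomposes into four word pairs, three of which have a recorded weight, so the inter-sentence weight is 2.0 + 1.0 + 0.5 + 0 = 3.5.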
The related knowledge search device according to claim 2,
wherein the arithmetic processing unit further comprises inter-important-word weight calculation means that decomposes the input inter-sentence relation information into pieces of inter-important-word relation information, each having one of the important words contained in the start point sentence of the inter-sentence relation information as a start point and one of the important words contained in the end point sentence of the inter-sentence relation information as an end point, and updates, each time such a piece of inter-important-word relation information occurs, the inter-important-word weight corresponding to that piece of inter-important-word relation information.
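The per-occurrence update can be sketched as below. The claim does not say how the weight is updated; this example assumes a simple increment by a fixed step each time a word pair occurs, which is only one possible rule. The function name and the `step` parameter are hypothetical.

```python
from collections import Counter


def update_keyword_weights(weights, start_keywords, end_keywords, step=1):
    """Decompose one taught relation into word pairs and bump each pair's weight by `step` (an assumed rule)."""
    for a in start_keywords:
        for b in end_keywords:
            weights[(a, b)] += step
    return weights


weights = Counter()
update_keyword_weights(weights, ["pump", "pressure"], ["valve"])   # first taught relation
update_keyword_weights(weights, ["pump"], ["valve", "seal"])       # second taught relation
```

After these two teachings, the pair ("pump", "valve") has occurred twice and so carries a higher weight than the pairs that occurred once.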
The related knowledge search device according to claim 1,
wherein the arithmetic processing unit further comprises knowledge search means that, based on the sentence network information, searches the sentence set in the storage unit for sentences for which a sentence containing desired knowledge is the start point sentence or the end point sentence, and outputs the retrieved sentences as sentences related to the knowledge.
The related knowledge search device according to claim 2,
wherein the arithmetic processing unit further comprises knowledge search means that, based on the sentence network information, searches the sentence set in the storage unit for sentences for which a sentence containing desired knowledge is the start point sentence or the end point sentence, and outputs the retrieved sentences, in order of their inter-sentence weights, as sentences related to the knowledge.
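The weighted search can be illustrated with the following Python sketch. It assumes the sentence network is held as a dictionary from (start ID, end ID) to inter-sentence weight and that results are listed with the largest weight first; the claim says only "in order of" the weights, so the descending order, the function name, and the "follows"/"precedes" labels are assumptions.

```python
def search_related(network, query_id):
    """Return sentences linked to query_id, with their direction and weight, heaviest first."""
    hits = []
    for (start, end), weight in network.items():
        if start == query_id:
            hits.append((end, "follows", weight))     # query is the start point sentence
        elif end == query_id:
            hits.append((start, "precedes", weight))  # query is the end point sentence
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Hypothetical sentence network: (start ID, end ID) -> inter-sentence weight.
network = {(1, 2): 3.5, (3, 1): 1.0, (1, 4): 2.0, (3, 4): 0.5}
results = search_related(network, 1)
```

Because each edge is directed, the same lookup yields both the sentences that come after the query sentence and those that come before it, which is how related knowledge in either order can be retrieved.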
A sentence network information generation device comprising: a storage unit that stores a sentence set consisting of a plurality of sentences in which various kinds of knowledge are described as character information using predetermined important words; and an arithmetic processing unit that generates sentence network information used to search the sentence set in the storage unit for a desired sentence related to a sentence containing search target knowledge,
wherein the arithmetic processing unit comprises sentence network generation means that receives, as input, inter-sentence relation information indicating the relevance and order between two sentences, and generates the sentence network information based on the important words contained in a start point sentence located on the start point side of the inter-sentence relation information and in an end point sentence located on the end point side of the inter-sentence relation information, and
wherein the sentence network generation means selects, from the sentence set in the storage unit, sentences containing all of the same important words as the start point sentence specified by the input inter-sentence relation information as start point candidate sentences, and selects, from the sentence set in the storage unit, sentences containing all of the same important words as the end point sentence specified by the inter-sentence relation information as end point candidate sentences; generates new inter-sentence relation information having each of the start point sentence and the start point candidate sentences as a start point and each of the end point sentence and the end point candidate sentences as an end point; and generates, from pairs of information indicating the start point sentence and information indicating the end point sentence of the new inter-sentence relation information, the sentence network information indicating the relevance and order among the sentences.
A sentence network generation method used in a sentence network information generation device having a storage unit that stores a sentence set consisting of a plurality of sentences in which various kinds of knowledge are described as character information using predetermined important words, and an arithmetic processing unit that generates sentence network information used to search the sentence set in the storage unit for a desired sentence related to a sentence containing search target knowledge, the method generating the sentence network information based on input predetermined inter-sentence relation information and comprising:
a sentence network generation step in which the arithmetic processing unit receives, as input, inter-sentence relation information indicating the relevance and order between two sentences, and generates the sentence network information based on the important words contained in a start point sentence located on the start point side of the inter-sentence relation information and in an end point sentence located on the end point side of the inter-sentence relation information,
wherein the sentence network generation step includes: a step of selecting, from the sentence set in the storage unit, sentences containing all of the same important words as the start point sentence specified by the input inter-sentence relation information as start point candidate sentences, and selecting, from the sentence set in the storage unit, sentences containing all of the same important words as the end point sentence specified by the inter-sentence relation information as end point candidate sentences; and
a step of generating new inter-sentence relation information having each of the start point sentence and the start point candidate sentences as a start point and each of the end point sentence and the end point candidate sentences as an end point, and generating, from pairs of information indicating the start point sentence and information indicating the end point sentence of the new inter-sentence relation information, the sentence network information indicating the relevance and order among the sentences.
The sentence network generation method according to claim 7,
further comprising an inter-sentence weight calculation step in which the arithmetic processing unit decomposes arbitrary inter-sentence relation information into pieces of inter-important-word relation information, each having one of the important words contained in the start point sentence of the inter-sentence relation information as a start point and one of the important words contained in the end point sentence of the inter-sentence relation information as an end point, and calculates an inter-sentence weight corresponding to the inter-sentence relation information based on the sum of the inter-important-word weights corresponding to those pieces of inter-important-word relation information,
wherein the sentence network information generation step generates a sentence network having, for each pair of sentences, the inter-sentence weight obtained in the inter-sentence weight calculation step.
The sentence network generation method according to claim 7,
further comprising an inter-important-word weight calculation step in which the arithmetic processing unit decomposes the input inter-sentence relation information into pieces of inter-important-word relation information, each having one of the important words contained in the start point sentence of the inter-sentence relation information as a start point and one of the important words contained in the end point sentence of the inter-sentence relation information as an end point, and updates, each time such a piece of inter-important-word relation information occurs, the inter-important-word weight corresponding to that piece of inter-important-word relation information.
A program that causes a computer of a sentence network information generation device, the device having a storage unit that stores a sentence set consisting of a plurality of sentences in which various kinds of knowledge are described as character information using predetermined important words and an arithmetic processing unit that generates sentence network information used to search the sentence set in the storage unit for a desired sentence related to a sentence containing search target knowledge, to execute:
a step of receiving, as input, inter-sentence relation information indicating the relevance and order between two sentences, selecting, from the sentence set in the storage unit, sentences containing the same important words as a start point sentence located on the start point side of the inter-sentence relation information as start point candidate sentences, and selecting, from the sentence set in the storage unit, sentences containing the same important words as an end point sentence located on the end point side of the inter-sentence relation information as end point candidate sentences; and
a step of generating new inter-sentence relation information having each of the start point sentence and the start point candidate sentences as a start point and each of the end point sentence and the end point candidate sentences as an end point, and generating, from pairs of information indicating the start point sentence and information indicating the end point sentence of the new inter-sentence relation information, sentence network information indicating the relevance and order among the sentences.
JP2004053424A 2004-02-27 2004-02-27 Related knowledge retrieval apparatus, sentences network generation device, sentences network generation method, and program Pending JP2005242807A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2004053424A JP2005242807A (en) 2004-02-27 2004-02-27 Related knowledge retrieval apparatus, sentences network generation device, sentences network generation method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2004053424A JP2005242807A (en) 2004-02-27 2004-02-27 Related knowledge retrieval apparatus, sentences network generation device, sentences network generation method, and program

Publications (1)

Publication Number Publication Date
JP2005242807A true JP2005242807A (en) 2005-09-08

Family

ID=35024464

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2004053424A Pending JP2005242807A (en) 2004-02-27 2004-02-27 Related knowledge retrieval apparatus, sentences network generation device, sentences network generation method, and program

Country Status (1)

Country Link
JP (1) JP2005242807A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010066975A (en) * 2008-09-10 2010-03-25 Kobe Steel Ltd Sentence retrieval device, sentence retrieval program and sentence retrieval method
JP2016538615A (en) * 2013-09-29 2016-12-08 Peking University Founder Group Co., Ltd. Main knowledge point recommendation method and system
US10289623B2 (en) 2013-09-29 2019-05-14 Peking University Founder Group Co. Ltd. Method and system for key knowledge point recommendation

Similar Documents

Publication Publication Date Title
CN108647205B (en) Fine-grained emotion analysis model construction method and device and readable storage medium
US20100010803A1 (en) Text paraphrasing method and program, conversion rule computing method and program, and text paraphrasing system
JP3983265B1 (en) Dictionary creation support system, method and program
CN104933081A (en) Search suggestion providing method and apparatus
JP2006190298A (en) Method for applying automatically conceptual highlighting to electronic text
Almarsoomi et al. AWSS: An algorithm for measuring Arabic word semantic similarity
JP2004192398A (en) Information processor and information processing method, and information processing program
JP3428554B2 (en) Semantic network automatic creation device and computer readable recording medium
US20220222442A1 (en) Parameter learning apparatus, parameter learning method, and computer readable recording medium
JP4935243B2 (en) Search program, information search device, and information search method
JP2006323517A (en) Text classification device and program
JP2010146222A (en) Document classification apparatus, document classification method, and program
JP5330046B2 (en) Co-occurrence expression extraction apparatus and co-occurrence expression extraction method
Aliyanto et al. Supervised probabilistic latent semantic analysis (sPLSA) for estimating technology readiness level
JP6693032B2 (en) Method, program and system for parsing sentences
JP2005242807A (en) Related knowledge retrieval apparatus, sentences network generation device, sentences network generation method, and program
JP2011191834A (en) Method, device and program for classifying document
JP2009157620A (en) Information search support device
JP7200683B2 (en) Information processing device and program
JP6375367B2 (en) Objection generation method, objection generation system
JP2008250409A (en) Typical sentence analyzing device, method, and program therefor
JP5277090B2 (en) Link creation support device, link creation support method, and program
JP6797038B2 (en) Software material selection support device and software material selection support program
JP2008250893A (en) Information retrieval device, information retrieval method and its program
JP4592556B2 (en) Document search apparatus, document search method, and document search program

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20051226

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20090303

A02 Decision of refusal

Free format text: JAPANESE INTERMEDIATE CODE: A02

Effective date: 20090707