JPH06318234A - Document retrieving device - Google Patents

Document retrieving device

Info

Publication number
JPH06318234A
Authority
JP
Japan
Prior art keywords
document
search
attribute
similarity
difference
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP4160895A
Other languages
Japanese (ja)
Other versions
JP2581376B2 (en)
Inventor
Kenji Sato
研治 佐藤
Kazushi Muraki
一至 村木
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to JP4160895A
Publication of JPH06318234A
Application granted
Publication of JP2581376B2
Anticipated expiration
Expired - Lifetime

Abstract

PURPOSE: To enable document retrieval based on similarity between documents by calculating the mutual similarity and difference between documents from each document's set of attributes. CONSTITUTION: Through a document attribute value condition input means 6, the user inputs the points in which the desired document differs from, or is the same as, the currently presented document. A document attribute type condition input means 7 specifies a search by the type of attribute that the user is aware of as differing or being the same. Through a document neighborhood request input means 8, the user requests documents whose attributes are close to those of the currently presented document. Under these search conditions, a similarity/difference search means 11 retrieves documents using the information calculated by a similarity calculation means 9 and a difference calculation means 10.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a document retrieval device, and in particular to a document retrieval device that can search by difference from, and similarity to, a given document.

[0002]

[Prior Art] A conventional document retrieval device accepts only document attributes as search conditions and finds the target document by narrowing the set of documents with multiple conditions. During a search, a user may retrieve a document that is not the target but feels close to it. This gives rise to a similarity-based request (the target closely resembles this document) or a difference-based request (the target is the same as this document in some respects but differs in others). For example, Japanese Patent Laid-Open No. 2-2458 proposes a method that automatically extracts keywords from documents to which no keywords have been assigned, calculates the similarity between those keywords and the keywords given as a search key, and retrieves documents accordingly. Even when a document without keywords is given as the search key, similar documents can be retrieved by extracting keywords from that document.

[0003]

[Problems to Be Solved by the Invention] However, because conventional retrieval devices provide no such search means, the user must start over and enter document attributes as search conditions from the beginning.

[0004] Even the search method described in the above publication does not offer a way to take a document as the search source and then further specify differences from, or similarities to, that document.

[0005] Thus, when one document has been retrieved or is already being presented, a conventional document retrieval device cannot accept a request for documents similar to that document, nor a request for documents that differ from it in some particular respect.

[0006]

[Means for Solving the Problems] To solve the problems described above, the document retrieval device according to the present invention has: document storage means for holding documents; document attribute storage means for holding the attributes attached to documents; search condition input means for inputting document search conditions; document search means for retrieving documents according to the search conditions; document presentation means for presenting a retrieved document and information about that document; document attribute value condition input means for inputting a specific document and attribute values as a search condition; document attribute type condition input means for inputting a specific document and attribute types as a search condition; document neighborhood request input means for inputting a request to present the neighborhood of a specific document; similarity calculation means for calculating similarity from the attributes of the search source document and the attributes of other documents; difference calculation means for calculating difference from the attributes of the search source document and the attributes of other documents; similarity/difference search means for retrieving documents according to the calculated similarity or difference and the search conditions obtained through the document attribute value condition input means, the document attribute type condition input means, and the document neighborhood request input means; and neighborhood information presentation means for presenting, when the search result contains multiple documents, similarity and difference information for the search source document and its neighboring documents.

[0007]

[Embodiment] Next, the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing an embodiment of the present invention. Referring to FIG. 1, the embodiment consists of: a document database 1 that contains document storage means 2 for holding documents and document attribute storage means 3 for holding the attributes attached to documents; search condition input means 4 for inputting document search conditions; document search means 5 for retrieving documents according to the search conditions input through the search condition input means 4; document presentation means 12 for presenting a document retrieved by the document search means 5 and information about that document; document attribute value condition input means 6 for inputting a specific document and attribute values as a search condition; document attribute type condition input means 7 for inputting a specific document and attribute types as a search condition; document neighborhood request input means 8 for inputting a request to present the neighborhood of a specific document; similarity calculation means 9 for calculating similarity from the attributes of the search source document and the attributes of other documents; difference calculation means 10 for calculating difference from the attributes of the search source document and the attributes of other documents; similarity/difference search means 11 for retrieving documents according to the calculated similarity or difference and the search conditions obtained through the document attribute value condition input means 6, the document attribute type condition input means 7, and the document neighborhood request input means 8; and neighborhood information presentation means 13 for presenting, when the search result contains multiple documents, similarity and difference information for the search source document and its neighboring documents.

[0008] The ordinary search conditions entered through the search condition input means 4 are sent to the document search means 5 over communication line 45. The document search means 5 fetches document attributes from the document attribute storage means 3 over communication line 35, finds the attributes that satisfy the conditions, and fetches the corresponding documents from the document storage means 2 over communication line 25. The retrieved documents are sent to the document presentation means 12 over communication line 512 and presented to the user. Three kinds of conditions are sent to the similarity/difference search means 11: the specific document (the one being presented) and attribute values entered through the document attribute value condition input means 6, over communication line 611; the specific document and attribute types entered through the document attribute type condition input means 7, over communication line 711; and the specific document and neighborhood distance entered through the document neighborhood request input means 8, over communication line 811. The similarity/difference search means 11 fetches document attributes from the document attribute storage means 3 over communication line 311. The attributes of the documents being searched and of the displayed document are sent to the similarity calculation means 9 over communication line 119 and to the difference calculation means 10 over communication line 1110. The similarity and difference calculated by the similarity calculation means 9 and the difference calculation means 10 are returned to the similarity/difference search means 11 over communication lines 911 and 1011, respectively. Based on the similarity and difference obtained, the similarity/difference search means 11 finds the documents that satisfy the search conditions and fetches them from the document storage means 2 over communication line 211. If a single document is retrieved, it is sent to the document presentation means 12 over communication line 1112 and presented to the user. If multiple documents are retrieved, they are sent to the neighborhood information presentation means 13 over communication line 1113 and presented together with the similarities and differences between the documents.
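The final routing step of paragraph [0008] can be sketched as follows. This is only an illustration under stated assumptions: the `Document` class, the function name `present_results`, and the returned tuples are hypothetical and do not appear in the patent, and the behavior for zero hits is not specified by the source.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    attributes: dict = field(default_factory=dict)


def present_results(source, hits):
    """Route search hits the way paragraph [0008] describes.

    A single hit goes to the document presentation means (12); several
    hits go to the neighborhood information presentation means (13)
    together with their differences from the search source document.
    The zero-hit case is not covered by the source; this sketch returns
    an empty neighborhood for it.
    """
    if len(hits) == 1:
        return ("document_presentation", hits[0].doc_id)
    annotated = []
    for hit in hits:
        keys = set(source.attributes) | set(hit.attributes)
        # Names of the attributes whose values differ from the source.
        differing = sorted(
            k for k in keys if source.attributes.get(k) != hit.attributes.get(k)
        )
        annotated.append((hit.doc_id, differing))
    return ("neighborhood_presentation", annotated)
```

The communication lines of FIG. 1 are collapsed into plain function calls here, since the sketch only shows which presentation means receives the result.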

[0009] In this embodiment, attributes also include keywords that occur inside a document and keywords assigned to a document.

[0010] With the document attribute value condition input means 6, the user states, relative to the currently presented document, which points of the desired document differ and which are the same. For example, the user can specify "the presented document was written by Sato, but the document I want was written by me", "this document with the additional keyword 'economic friction'", or "this document without the keyword 'Prime Minister'".
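A minimal sketch of how the three example requests in paragraph [0010] might be evaluated is shown below. It assumes that attributes are stored as a dictionary with a `keywords` set, and that "same otherwise" means every unmentioned attribute must match exactly; both are assumptions, as are the hypothetical names `derive_target` and `search_by_value_condition`.

```python
def derive_target(reference, set_values=None, add_keywords=(), remove_keywords=()):
    """Build the attribute set of the wanted document from the reference."""
    target = dict(reference)
    target.update(set_values or {})
    keywords = set(reference.get("keywords", ()))
    # Apply the keyword additions first, then the removals.
    target["keywords"] = frozenset((keywords | set(add_keywords)) - set(remove_keywords))
    return target


def search_by_value_condition(documents, reference, **changes):
    """Return ids of documents whose attributes equal the derived target."""
    target = derive_target(reference, **changes)
    return [
        doc_id
        for doc_id, attrs in documents.items()
        if {**attrs, "keywords": frozenset(attrs.get("keywords", ()))} == target
    ]
```

For instance, `search_by_value_condition(docs, ref, set_values={"author": "me"})` corresponds to the "written by me instead of Sato" request. A real system would likely relax the exact-match rule, which the patent leaves open.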

[0011] The document attribute type condition input means 7 is used when the user cannot state precisely how the desired document differs from the currently presented one. The user specifies the search by the type of attribute that they know to be different or the same. For example, the user may not remember when the document was written, but can specify that its "creation date" differs from that of this document.
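The "creation date differs" request in paragraph [0011] could be sketched as a filter on which attribute types differ, without naming any target value. The interpretation that exactly the named types differ while all other attributes match is an assumption, and `search_by_type_condition` is a hypothetical name.

```python
def search_by_type_condition(documents, reference_id, differing_types):
    """Return ids of documents that differ from the reference in exactly
    the given attribute types and agree on every other attribute."""
    reference = documents[reference_id]
    wanted = set(differing_types)
    hits = []
    for doc_id, attrs in documents.items():
        if doc_id == reference_id:
            continue
        keys = set(reference) | set(attrs)
        # Attribute types whose values are not equal in the two documents.
        changed = {k for k in keys if reference.get(k) != attrs.get(k)}
        if changed == wanted:
            hits.append(doc_id)
    return hits
```

Only the attribute type is supplied by the user, so the function never needs to know the actual creation date being sought.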

[0012] The document neighborhood request input means 8 is used when the user feels that the currently presented document is not the desired one but is very similar to it. The user requests documents whose attributes are close to those of the presented document.
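A neighborhood request might be sketched as a distance-bounded lookup around the presented document. The sketch assumes the differing-attribute count that the description suggests as one possible distance, and a user-supplied `max_distance`; the function names are hypothetical.

```python
def attribute_distance(a, b):
    """Number of attributes whose values differ between two documents."""
    keys = set(a) | set(b)
    return sum(1 for k in keys if a.get(k) != b.get(k))


def neighbors(documents, reference_id, max_distance):
    """Return (doc_id, distance) pairs within max_distance, nearest first."""
    reference = documents[reference_id]
    scored = [
        (attribute_distance(reference, attrs), doc_id)
        for doc_id, attrs in documents.items()
        if doc_id != reference_id
    ]
    return [(doc_id, d) for d, doc_id in sorted(scored) if d <= max_distance]
```

Ties are broken by document id here purely so that the ordering is deterministic; the patent does not define an ordering.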

[0013] For these search conditions, the similarity calculation means 9 classifies documents, treating documents that share more attributes as similar documents. The difference calculation means 10 further classifies similar documents by which of their attributes differ. As a calculation method, one can, for example, take the number of attribute values that differ between two documents as the distance between them and classify documents using this distance. The similarity/difference search means 11 uses the calculated similarity and difference to retrieve the documents that satisfy the search conditions.
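The two-stage classification in paragraph [0013] can be illustrated with a short sketch: documents are first grouped by distance (the count of differing attribute values, as the paragraph suggests), then each group is split by which attributes differ. The function name `classify` and the nested-dictionary result shape are assumptions made for illustration.

```python
from collections import defaultdict


def classify(documents, reference_id):
    """Group documents by distance from the reference, then by the set of
    attribute names that differ.

    Returns {distance: {differing_attribute_names: [doc_id, ...]}}.
    """
    reference = documents[reference_id]
    classes = defaultdict(lambda: defaultdict(list))
    for doc_id, attrs in documents.items():
        if doc_id == reference_id:
            continue
        keys = set(reference) | set(attrs)
        differing = tuple(sorted(k for k in keys if reference.get(k) != attrs.get(k)))
        # The outer key plays the role of the similarity calculation, the
        # inner key the role of the difference calculation.
        classes[len(differing)][differing].append(doc_id)
    return {distance: dict(groups) for distance, groups in classes.items()}
```

Counting differing values treats every attribute as equally important; weighting attributes would be a natural variation, but the patent text does not describe one.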

[0014] When the search result contains multiple documents, the neighborhood information presentation means 13 attaches to each retrieved document its differences from the search source document and presents them to the user. One possible presentation is a graph structure in which the search source document and each retrieved document are nodes, and each arc connecting the search source document to a retrieved document is labeled with the attribute values that express their difference.
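The graph presentation in paragraph [0014] might be built as in the sketch below: a star-shaped graph with the search source document at the center and one labeled arc per retrieved document. The function name `neighborhood_graph` and the (source value, hit value) label format are assumptions; the patent only says that arcs carry the attribute values expressing the difference.

```python
def neighborhood_graph(source_id, documents, hit_ids):
    """Return (nodes, arcs) for presenting hits around the source document.

    Each arc is (source_id, hit_id, label), where label maps every
    differing attribute name to a (source value, hit value) pair.
    """
    source = documents[source_id]
    nodes = [source_id] + list(hit_ids)
    arcs = []
    for hit_id in hit_ids:
        hit = documents[hit_id]
        keys = sorted(set(source) | set(hit))
        label = {
            k: (source.get(k), hit.get(k))
            for k in keys
            if source.get(k) != hit.get(k)
        }
        arcs.append((source_id, hit_id, label))
    return nodes, arcs
```

The returned node and arc lists are plain data, so they can be handed to any drawing or text-rendering layer.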

[0015]

[Effects of the Invention] As described above, when one document has been retrieved or is already being presented, the document retrieval device according to the present invention makes it possible to search by requesting documents similar to that document or documents that differ from it in some particular respect. Moreover, by handling similarity and difference not only for documents composed of text but also for additional attributes such as image data and audio data, the device can also search multimedia documents by similarity and difference.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention.

[Explanation of Symbols]

1 Document database
2 Document storage means
3 Document attribute storage means
4 Search condition input means
5 Document search means
6 Document attribute value condition input means
7 Document attribute type condition input means
8 Document neighborhood request input means
9 Similarity calculation means
10 Difference calculation means
11 Similarity/difference search means
12 Document presentation means
13 Neighborhood information presentation means

Claims (1)

[Claims]

1. A document retrieval device comprising: document holding means for holding documents; document attribute holding means for holding attributes attached to the documents; search condition input means for inputting document search conditions; document search means for retrieving documents according to the search conditions; document presentation means for presenting a retrieved document and information about that document; document attribute value condition input means for inputting a specific document and attribute values as a search condition; document attribute type condition input means for inputting a specific document and attribute types as a search condition; document neighborhood request input means for inputting a request to present the neighborhood of a specific document; similarity calculation means for calculating similarity from the attributes of the search source document and the attributes of other documents; difference calculation means for calculating difference from the attributes of the search source document and the attributes of other documents; similarity/difference search means for retrieving documents according to the calculated similarity or difference and the search conditions obtained through the document attribute value condition input means, the document attribute type condition input means, and the document neighborhood request input means; and neighborhood information presentation means for presenting, when the search result contains multiple documents, similarity and difference information for the search source document and its neighboring documents.
JP4160895A 1992-06-19 1992-06-19 Document search device Expired - Lifetime JP2581376B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP4160895A JP2581376B2 (en) 1992-06-19 1992-06-19 Document search device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP4160895A JP2581376B2 (en) 1992-06-19 1992-06-19 Document search device

Publications (2)

Publication Number Publication Date
JPH06318234A (en) 1994-11-15
JP2581376B2 (en) 1997-02-12

Family

ID=15724685

Family Applications (1)

Application Number Title Priority Date Filing Date
JP4160895A Expired - Lifetime JP2581376B2 (en) 1992-06-19 1992-06-19 Document search device

Country Status (1)

Country Link
JP (1) JP2581376B2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH01304531A (en) * 1988-06-01 1989-12-08 Hitachi Ltd Data base system
JPH0415869A (en) * 1990-05-10 1992-01-21 Toshiba Corp Electronic filing device

Also Published As

Publication number Publication date
JP2581376B2 (en) 1997-02-12

Similar Documents

Publication Publication Date Title
US5544049A (en) Method for performing a search of a plurality of documents for similarity to a plurality of query words
US5523945A (en) Related information presentation method in document processing system
JPH06215029A (en) Retrieval method of text
WO2019009995A1 (en) System and method for natural language music search
JPH10171819A (en) Information retrieving device
JP3612769B2 (en) Information search apparatus and information search method
JP3281639B2 (en) Document search system
JPH08171569A (en) Document retrieval device
JP2581376B2 (en) Document search device
JPH0581326A (en) Data base retrieving device
JP3531344B2 (en) Information retrieval device
JP3222193B2 (en) Information retrieval device
JPS6378228A (en) Information retrieving device
JP3591813B2 (en) Data retrieval method, apparatus and recording medium
JPS6325774A (en) Information registering/retrieving device
JP4034503B2 (en) Document search system and document search method
JPH05204978A (en) Information retrieving device
JP3436109B2 (en) Related search formula search device and computer-readable recording medium storing related search formula search program
JPH07120357B2 (en) Document retrieval device
JPH1173420A (en) Document processor and computer-readable recording medium where document processing program is recorded
JPH04127371A (en) Device and method for registering data and device and method for retrieving data
JP2000029892A (en) Recommendation system
JP2666475B2 (en) Kanji compound word keyword search device
JPH10307849A (en) Retrieving keyword determining method, its device, document retrieving device, and recording medium
JPH0488474A (en) Document processor

Legal Events

Date Code Title Description
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 19961001