JP2773682B2

JP2773682B2 - Relevance feedback device

Info

Publication number: JP2773682B2
Application number: JP7128050A
Authority: JP
Inventors: 加奈子久保; 幹也谷
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1995-05-26
Filing date: 1995-05-26
Publication date: 1998-07-09
Anticipated expiration: 2013-07-09
Also published as: JPH08320879A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Industrial Field of Application] The present invention relates to a relevance feedback device for an information retrieval system. Based on relevance judgments indicating whether search results satisfy the searcher's request, the device performs a new search and outputs the new search results in order of relevance.

[0002]

[Prior Art] In a conventional information retrieval system, a searcher retrieves database records by composing a search expression and running it. When the results are unsatisfactory, the searcher typically composes the expression again, or partially modifies the previous one, and repeats the search. For searchers unfamiliar with retrieval, however, composing a search expression is difficult, and repeated searches often still fail to return records that satisfy the request.

[0003] Relevance feedback methods have therefore been proposed, in which the searcher's relevance judgments on the results are reflected in the search expression so that the expression is corrected or regenerated automatically. For example, in "Relevance Weighting of Search Terms" (S. E. Robertson and Karen Sparck Jones, Journal of the American Society for Information Science, vol. 26, pp. 129-146, 1976; hereinafter Reference 1), the searcher marks each retrieved record as relevant or non-relevant. The judged records are examined to compute the probability that a given word occurs in a relevant record, and from that value the probability that a record containing the word is relevant is obtained. This probability can be regarded as a weight indicating the word's ability to retrieve relevant records, and it is calculated as follows.

[0004] For a word in the search request, the following counts are taken over the set of records that the searcher has already judged relevant or non-relevant:
a: number of relevant records in which the word occurs
b: number of relevant records in which the word does not occur
c: number of non-relevant records in which the word occurs
d: number of non-relevant records in which the word does not occur
The weight of the word is then log(((a + 0.5)(d + 0.5)) / ((c + 0.5)(b + 0.5))). Reference 1 further reports, from an experimental comparison with other weighting methods, that this weighting gives the best retrieval effectiveness.

[0005] In the invention described in Japanese Unexamined Patent Publication No. H02-245971, "Information retrieval processing method and apparatus" (hereinafter Reference 2), judgment information on whether each search result satisfies the searcher's request is used to extract words that occur in relevant records and do not occur in non-relevant records. Words that are effective as search terms are chosen from the extracted words and added to the search expression as new search terms, and the search is performed.

[0006] Japanese Unexamined Patent Publication No. H05-151271, "Information retrieval apparatus" (hereinafter Reference 3), describes a technique in which the searcher enters several relevant records and several non-relevant records from the search results. Words whose probability of occurring in relevant records is higher than their probability of occurring in relevant and non-relevant records taken together are selected as search terms, and a new search expression is generated.

[0007]

[Problems to be Solved by the Invention] However, the methods described in References 1 to 3 have the following problems.

[0008] First, in References 1 to 3 the criterion for choosing which words in the relevant records become search terms is vague. For example, when the database consists of the items "title", "abstract", and "body", a word occurring in the title can be considered more important than a word occurring in the abstract, yet these methods do not take this into account.

[0009] In Reference 2 in particular, search terms are drawn only from words in text that describes the content of the document, such as the title and abstract, and the weighting of search terms likewise relies on methods specific to words in text, such as the degree of string match, occurrence frequency, and thesaurus relations. However, the factors that decide whether a record is relevant for the searcher may also include the author, the author's affiliation, and, for a journal article, the name of the journal that carries it. These factors are considered neither in the selection nor in the weighting of search terms.

[0010] Furthermore, References 1 to 3 do not consider the case in which no non-relevant records are entered. Without non-relevant records, every word in the relevant records is selected as a search term, so words that occur frequently throughout the database, such as "information" in an informatics database, are selected as well.

[0011] In Reference 3, moreover, the searcher must always enter both relevant and non-relevant records. When only relevant records are entered, search terms can be neither selected nor weighted.

[0012] Conventional relevance feedback methods thus left problems to be solved, including how search terms are selected and how the absence of non-relevant records is handled.

[0013] An object of the present invention is to solve the above problems by providing a relevance feedback device that extends the candidates for search terms beyond text items such as the title and abstract to the author, the author's affiliation, and similar items, and that can select and weight search terms even when no non-relevant records have been entered.

[0014]

[Means for Solving the Problems] The first invention is a relevance feedback device for a database search system in which a searcher composes a search expression, the database is searched with the search expression, and the search results are output. The device creates a relevance-judged record file in which the searcher has judged each search result as relevant or non-relevant, and uses the relevance-judged record file to create a search expression again and search the database again. The device is characterized by comprising:
- a per-judgment occurrence record table creation unit that extracts, from the relevance-judged record file, the words occurring in search results judged relevant together with the items to which the words belong, and creates a per-judgment occurrence record table recording how often each word occurs in each item of the relevance-judged record file;
- a search term selection unit that refers to the per-judgment occurrence record table and selects search terms and search items for searching the database again;
- a search expression generation unit that creates a new search expression from the search terms, the search items, and the relevance-judged record file; and
- a search execution unit that executes a search with the search expression and obtains search results.

[0015] The second invention is characterized in that the first invention further comprises:
- a search term weight calculation unit that calculates the weight of each search term selected by the search term selection unit by referring to the per-judgment occurrence record table; and
- a record relevance calculation unit that determines which of the selected search terms occur in each search result obtained by the search execution unit, obtains the weight of each such search term from the search term weight calculation unit, takes the sum of the weights of all search terms occurring in a search result as the relevance score of that result, and outputs relevance-ordered search results, that is, the search results sorted by relevance score.

[0016] The third invention is a relevance feedback device for a database search system in which a searcher composes a search expression, the database is searched with the search expression, and the search results are output. The device creates a relevance-judged record file in which the searcher has judged each search result as relevant or non-relevant, and uses the relevance-judged record file to create a search expression again and search the database again. The device is characterized by comprising:
- a per-judgment occurrence record table creation unit that extracts, from the relevance-judged record file, the words occurring in search results judged relevant together with the items to which the words belong, and creates a per-judgment occurrence record table recording how often each word occurs in each item of the relevance-judged record file;
- item knowledge holding a weight coefficient for each item of the database;
- a search term selection unit that refers to the per-judgment occurrence record table or to the weight coefficients of the item knowledge and selects search terms and search items for searching the database again;
- a search expression generation unit that generates a new search expression from the search terms, the search items, and the relevance-judged record file; and
- a search execution unit that executes a search with the search expression and obtains search results.

[0017] The fourth invention is characterized in that the third invention further comprises:
- a search term weight calculation unit that calculates the weight of each search term selected by the search term selection unit by referring to the per-judgment occurrence record table and the item knowledge; and
- a record relevance calculation unit that determines which of the selected search terms occur in each search result obtained by the search execution unit, obtains the weight of each such search term from the search term weight calculation unit, takes the sum of the weights of all search terms occurring in a search result as the relevance score of that result, and outputs relevance-ordered search results, that is, the search results sorted by relevance score.

[0018] The fifth invention is characterized in that, in the third or fourth invention, when the search term selection unit selects search terms, it determines whether the relevance-judged record file contains non-relevant records. If non-relevant records exist, it selects search terms by referring to the per-judgment occurrence record table; if none exist, it selects search terms by referring to the item knowledge.

[0019]

[Embodiment] An embodiment of the present invention is described next with reference to the drawings.

[0020] FIG. 1 is a block diagram showing one embodiment of the configuration of the relevance feedback device according to the present invention. FIG. 2 is a table showing an example of how a word occurs in relevance-judged records. FIG. 3 shows an example of the processing flow of the search term selection unit 104 in FIG. 1. FIG. 4 shows an example of the processing flow of the search term weight calculation unit 106 in FIG. 1. FIGS. 5 and 8 are examples of the per-judgment occurrence record table 103 in FIG. 1. FIGS. 6 and 10 are examples of the weights calculated by the search term weight calculation unit 106 in FIG. 1. FIGS. 7 and 11 are examples of search expressions generated by the search expression generation unit 107 in FIG. 1. FIG. 9 is an example of the item knowledge 105 in FIG. 1.

[0021] In FIG. 1, the relevance-judged record file 101 holds the relevant/non-relevant judgment that the searcher has given to each record obtained from an initial search.

[0022] The per-judgment occurrence record table creation unit 102 reads the relevance-judged record file 101 and extracts the words occurring in the relevant records. Which items the words are extracted from is decided in advance by the database administrator or the database search system administrator.

[0023] The items selected here need only be items that express the characteristics of each record. For a database of journal articles, these are the title, author, author affiliation, journal name, abstract, keywords, and so on. For a database about companies, these are the company name, officer names, industry, corporate group, and so on. An item such as the record ID of each record is unsuitable.

[0024] When the items are selected, they are also sorted in advance into two kinds: text items such as the title and abstract, from which unnecessary words must first be removed by natural language analysis or a similar technique, and items such as the author, affiliation, and journal name, whose value is extracted as a single word as it stands. When words are actually extracted from the relevant records, each predetermined item is processed by the method suited to that item.

[0025] Next, for each word and each item, the number of relevant and non-relevant records in which the word occurs is counted. These counts are denoted a, b, c, and d, as in FIG. 2:
a: number of relevant records in which the word occurs in the item
b: number of relevant records in which the word does not occur in the item
c: number of non-relevant records in which the word occurs in the item
d: number of non-relevant records in which the word does not occur in the item
Finally, the per-judgment occurrence record table creation unit 102 creates the per-judgment occurrence record table 103, which records the values of a, b, c, and d for each extracted word. The per-judgment occurrence record table is described in more detail later.
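The counting step in [0025] can be sketched as follows. This is a minimal illustration, not the patented implementation: the record layout (a `relevant` flag plus an `items` mapping from item name to a set of already-extracted words) and the function name `build_occurrence_table` are assumptions made for the example.

```python
def build_occurrence_table(records):
    """Count a, b, c, d for every (item, word) pair seen in a relevant record.

    records: list of {"relevant": bool, "items": {item_name: set_of_words}}
    (hypothetical layout; word extraction is assumed to be done already).
    """
    n_relevant = sum(1 for r in records if r["relevant"])
    n_nonrelevant = len(records) - n_relevant

    # Candidate pairs come from relevant records only, as in [0022].
    candidates = {
        (item, word)
        for r in records if r["relevant"]
        for item, words in r["items"].items()
        for word in words
    }

    table = {}
    for item, word in candidates:
        a = sum(1 for r in records
                if r["relevant"] and word in r["items"].get(item, ()))
        c = sum(1 for r in records
                if not r["relevant"] and word in r["items"].get(item, ()))
        # b and d follow from the totals: a + b = relevant, c + d = non-relevant.
        table[(item, word)] = {"a": a, "b": n_relevant - a,
                               "c": c, "d": n_nonrelevant - c}
    return table
```

Keying the table by (item, word) rather than by word alone keeps the same word in the title and in the abstract as separate rows, which is what the later selection and weighting steps operate on.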

[0026] This embodiment adopts the same weighting method as Reference 1 for two reasons. First, as the experiments reported in Reference 1 show, it performs better than other weighting methods. Second, the weighting needs to know only whether a word occurs in each relevant or non-relevant record; frequency information, such as how many times the word occurs, is not needed, which makes the method very simple. For these reasons the weighting method of this embodiment follows Reference 1.

[0027] The search term selection unit 104 uses the per-judgment occurrence record table 103 and the item knowledge 105 to select the search items and search terms from which a new search expression is created. The processing differs as follows depending on whether the relevance-judged record file contains non-relevant records.

[0028] When there are no non-relevant records, the item knowledge 105 is consulted, the items whose weight coefficient is at or above a threshold are taken as search items, and the words occurring in those search items are taken as search terms.

[0029] The item knowledge 105 lists the items that make up the database, such as the title, abstract, and author, together with a weight coefficient for each item that indicates how useful the item is for retrieving relevant records. The weight coefficient may be entered freely by the database administrator, or it may be computed by taking into account, for example, the variety of words that occur in the item.
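The no-non-relevant-records branch of [0028], driven by the item knowledge of [0029], can be sketched as below. The coefficient values, the threshold, and the names `ITEM_KNOWLEDGE` and `select_by_item_knowledge` are hypothetical; the source does not give concrete numbers in this passage.

```python
# Hypothetical item knowledge: item name -> weight coefficient.
ITEM_KNOWLEDGE = {"title": 1.0, "keywords": 0.8, "author": 0.6, "abstract": 0.4}


def select_by_item_knowledge(relevant_records, item_knowledge, threshold):
    """Select (item, word) pairs when no non-relevant records exist.

    relevant_records: list of {item_name: set_of_words} (assumed layout).
    Items at or above the threshold become search items; every word found
    in those items of the relevant records becomes a search term.
    """
    search_items = [item for item, coef in item_knowledge.items()
                    if coef >= threshold]
    selected = set()
    for record in relevant_records:
        for item in search_items:
            for word in record.get(item, ()):
                selected.add((item, word))
    return selected
```

With a threshold of 0.5 in this sketch, words from the abstract would be ignored while titles, keywords, and authors would contribute search terms.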

[0030] When there are non-relevant records, two probabilities are calculated for every word and item in the per-judgment occurrence record table 103. The first is the probability of occurring in a relevant record:
(number of relevant records in which the word occurs) / (number of relevant records) = a / (a + b) ... (A)
The second is the probability of occurring in any relevance-judged record:
(number of relevant or non-relevant records in which the word occurs) / (number of relevant records + number of non-relevant records) = (a + c) / (a + b + c + d) ... (B)

[0031] If the probability of occurring in a relevant record is greater than the probability of occurring in relevant and non-relevant records together, the word is considered likely to retrieve further relevant records. Therefore, the combinations of word and item that satisfy
a / (a + b) > (a + c) / (a + b + c + d) ... (C)
are found, and they are selected and output as the new search terms and search items.
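As a side note on condition (C): the short derivation below, which is not part of the patent text, shows that (C) reduces to a simpler test. It assumes at least one relevant and one non-relevant record, so that a + b > 0 and c + d > 0.

```latex
\begin{align*}
\frac{a}{a+b} > \frac{a+c}{a+b+c+d}
  &\iff a\,(a+b+c+d) > (a+c)(a+b) \\
  &\iff a^2 + ab + ac + ad > a^2 + ab + ac + bc \\
  &\iff ad > bc \\
  &\iff \frac{a}{a+b} > \frac{c}{c+d}.
\end{align*}
```

In words, a word-item pair passes (C) exactly when it occurs in a larger fraction of the relevant records than of the non-relevant records.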

[0032] The search term weight calculation unit 106 calculates the weight of each search term selected by the search term selection unit 104. It first obtains, from the per-judgment occurrence record table 103, the values of a, b, c, and d for each search term in its search item. The weight indicates how likely each search term is to retrieve new relevant records. A word that occurs in many relevant records and does not occur in non-relevant records is considered likely to retrieve new relevant records. In other words, search terms with a and d as large as possible and b and c as small as possible are desirable, which suggests the following weighting formulas. The database administrator or the searcher may choose which formula to use.
(a + 0.5)(d + 0.5) ... (1)
((a + 0.5)(d + 0.5)) / ((c + 0.5)(b + 0.5)) ... (2)
log(((a + 0.5)(d + 0.5)) / ((c + 0.5)(b + 0.5))) ... (3)
Here, when there are no non-relevant records, c = 0 and d = 0, and with formula (1) the weights of all search terms become equal. If, in addition, there is only one relevant record, then b = 0, and the weights of all search terms become equal whichever of formulas (1), (2), and (3) is used.
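The three candidate formulas of [0032] are small enough to write out directly. The sketch below is illustrative only: the function names are invented, and base 10 is used for the logarithm because [0037] specifies it. The last lines show the degenerate case from the text, a single relevant record and no non-relevant records, where every extracted word has a = 1 and b = c = d = 0.

```python
import math


def weight_1(a, b, c, d):
    """Formula (1): (a + 0.5)(d + 0.5). b and c are unused here."""
    return (a + 0.5) * (d + 0.5)


def weight_2(a, b, c, d):
    """Formula (2): formula (1) divided by (c + 0.5)(b + 0.5)."""
    return weight_1(a, b, c, d) / ((c + 0.5) * (b + 0.5))


def weight_3(a, b, c, d):
    """Formula (3): base-10 logarithm of formula (2)."""
    return math.log10(weight_2(a, b, c, d))


# Degenerate case: one relevant record, no non-relevant records.
# Every word taken from that record has the same counts, so no formula
# can tell the words apart.
degenerate_counts = (1, 0, 0, 0)
degenerate_weights = [f(*degenerate_counts)
                      for f in (weight_1, weight_2, weight_3)]
```

The 0.5 added to every count keeps formulas (2) and (3) defined even when b or c is zero.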

[0033] For this reason, as the processing flow in FIG. 4 shows, when the relevance-judged record file 101 contains no non-relevant records and only one relevant record, the weight is the value of a multiplied by the weight coefficient in the item knowledge 105. When there are no non-relevant records and two or more relevant records, some words may have b ≠ 0, so the weight is calculated with either formula (2) or formula (3) and then multiplied by the weight coefficient of the search item.
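The branching described in [0033] and [0034] could be sketched as follows. This is one reading of the text, not the flow of FIG. 4 itself (the figure is not reproduced here): the function name `term_weight` and the `use_log` switch are assumptions, and for the branch with non-relevant records the sketch uses only formulas (2) and (3) without the item coefficient, although the source also permits formula (1).

```python
import math


def term_weight(a, b, c, d, item_coef, use_log=True):
    """Weight of one (item, word) pair, following the cases in [0033]-[0034]."""
    n_relevant = a + b
    n_nonrelevant = c + d

    # Case 1: no non-relevant records and a single relevant record.
    # The formulas cannot discriminate, so fall back to a * item coefficient.
    if n_nonrelevant == 0 and n_relevant == 1:
        return a * item_coef

    ratio = ((a + 0.5) * (d + 0.5)) / ((c + 0.5) * (b + 0.5))  # formula (2)
    weight = math.log10(ratio) if use_log else ratio           # formula (3) or (2)

    # Case 2: no non-relevant records, two or more relevant records.
    # The formula value is scaled by the item's weight coefficient.
    if n_nonrelevant == 0:
        return weight * item_coef

    # Case 3: non-relevant records exist; the formula value is used as is
    # (an assumption of this sketch).
    return weight
```

The effect of cases 1 and 2 is that, when the searcher gives no negative examples, the item a word came from (title versus abstract, say) is what separates otherwise similar words.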

[0034] When the relevance-judged record file 101 contains non-relevant records, the weight may be calculated with any of formulas (1), (2), and (3). The characteristics of each formula are described below; the database administrator, the search system administrator, or the searcher may freely choose which formula to use.

[0035] Formula (1) is simply the product of a and d. Since a + b is the number of relevant records and c + d is the number of non-relevant records, these sums are the same for every search term. Consequently, the larger a is, the smaller b is, and the larger d is, the smaller c is. So even the simple product of a and d indicates that the larger the value, the more the word occurs in relevant records.

[0036] Formula (2) is the value of formula (1) further divided by the product of b and c. As with formula (1), a + b is the number of relevant records and c + d is the number of non-relevant records, and these sums are the same for every search term, so the larger a and d are, the smaller b and c are. The smaller b and c are, the smaller the denominator, and as a result the larger the weight.

[0037] Formula (3) is the logarithm (base 10) of the value of formula (2). Taking the logarithm has the following effect on values differentiated by formula (2). For values greater than 10, for example 200 and 300, the results are 2.301 and 2.477, so the ratio of the difference becomes small. For values less than 10, for example 2 and 3, the results are 0.301 and 0.477, so the ratio of the difference is roughly preserved. In other words, outlying large weights from formula (2) disappear and the weights fall within a range of about 0 to 3, while for formula (2) values below 10 the ratio of the difference between values is roughly preserved.
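The numbers quoted in [0037] can be checked with a few lines of Python. The helper `log_ratio` is introduced only for this illustration; it compares two formula (2) values after the base-10 logarithm is applied.

```python
import math


def log_ratio(x, y):
    """Ratio of log10(y) to log10(x); both inputs are assumed to be > 1."""
    return math.log10(y) / math.log10(x)


# Values quoted in the text, rounded to three decimals.
logs = {v: round(math.log10(v), 3) for v in (2, 3, 200, 300)}

# Before the logarithm both pairs differ by a factor of 1.5.
large_pair = log_ratio(200, 300)  # close to 1: the gap is compressed
small_pair = log_ratio(2, 3)      # still near 1.5: the gap is kept
```

Here `logs` reproduces 0.301, 0.477, 2.301, and 2.477, and the two ratios show the compression of large values and the preservation of small ones described above.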

[0038] The search expression generation unit 107 receives the search terms and search items selected by the search term selection unit and generates a search expression. In the search expression, multiple search terms for the same search item are joined with OR, and different items are also joined with OR.

[0039] When generating the search expression, the unit also refers to the relevance-judged record file 101 and builds the expression so that records the searcher has already judged non-relevant are not included in the results.
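A rough sketch of the expression building in [0038] and [0039] follows. The concrete query syntax (`item="word"`, `OR`, `NOT`, and an `id` item for record identifiers) is invented for illustration; the actual syntax of the expressions in FIGS. 7 and 11 is not shown in this passage.

```python
def build_query(selected, rejected_ids):
    """Build a search expression from (item, word) pairs.

    selected: set of (item, word) pairs chosen as search items and terms.
    rejected_ids: ids of records already judged non-relevant (hypothetical
    field name "id"); these are excluded from the results.
    """
    by_item = {}
    for item, word in sorted(selected):
        by_item.setdefault(item, []).append(word)

    # Terms within one item are joined with OR ...
    groups = ["(" + " OR ".join(f'{item}="{w}"' for w in words) + ")"
              for item, words in by_item.items()]
    # ... and the item groups are joined with OR as well.
    query = " OR ".join(groups)

    if rejected_ids:
        excluded = " OR ".join(f'id="{rid}"' for rid in sorted(rejected_ids))
        query = f"({query}) NOT ({excluded})"
    return query
```

Because everything is joined with OR, the expression is deliberately broad; ordering of the results is left to the relevance score computed afterwards.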

[0040] The search execution unit 108 performs a search using the search expression generated by the search expression generation unit 107 and obtains search results.

[0041] The record relevance calculation unit 109 takes, as the relevance score of each record in the search results obtained by the search execution unit 108, the sum of the weights of the search terms contained in that record. Specifically, whenever a search term selected by the search term selection unit 104 occurs in its selected search item, the weight obtained for it by the search term weight calculation unit 106 is added to the score.
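The scoring and ordering in [0041] amount to a weighted sum followed by a sort, as in the sketch below. The data layout (results keyed by record id, weights keyed by (item, word)) and the name `rank_records` are assumptions for the example.

```python
def rank_records(results, weights):
    """Score each record by the summed weights of the search terms it contains.

    results: {record_id: {item_name: set_of_words}}
    weights: {(item, word): weight}
    Returns (record_id, score) pairs, highest score first.
    """
    scored = []
    for record_id, items in results.items():
        # A term counts only if it occurs in its own search item.
        score = sum(weight
                    for (item, word), weight in weights.items()
                    if word in items.get(item, ()))
        scored.append((record_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A word that matches only in a different item (for example a title term appearing in the abstract) contributes nothing in this sketch, which mirrors the item-specific selection described above.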

[0042] An example is now given. For the purpose of explanation, the search target is a database of journal articles in library and information science with the items "title", "abstract", "author", "affiliation", and "journal name", which is a typical example of a database in that field. The relevance-judged record file 101 is assumed to contain 20 records in total: 10 that the searcher judged relevant and 10 that the searcher judged non-relevant.

[0043] The per-judgment occurrence record table creation unit 102 extracts words, item by item, from the 10 relevant records in the relevance-judged record file 101, using the items and the extraction methods decided in advance by the database administrator or the database search system administrator. In this example the items are "title", "abstract", "author", "affiliation", and "journal name". For "author", "affiliation", and "journal name", the value of the item itself is extracted as a word.

[0044] Because "title" and "abstract" are generally text items, natural language analysis is used to delete unnecessary words, and the remaining words are extracted. The natural language analysis in this embodiment is, for example, the kind described in Chapters 1 and 2 of 「自然言語処理の基礎技術」 (Basic Techniques of Natural Language Processing; 野村浩郷 (Hirosato Nomura), published by the Institute of Electronics, Information and Communication Engineers, 1988). The unnecessary words here are conjugational endings, auxiliary verbs, adnominal particles, sentence-final particles, adverbial particles, case particles, and parallel particles.

[0045] Then, for each extracted word, the numbers of records corresponding to a, b, c, and d in FIG. 2 are counted over the total of 20 relevant and non-relevant records, and the judgment-specific occurrence record table 103 shown in FIG. 5(a) is created. According to FIG. 5(a), the word "university library" appears in the title item of 8 of the 10 relevant records and in the title item of none of the 10 non-relevant records. In the abstract item, it appears in 9 relevant records and 2 non-relevant records. The author "Taro Yamada" appears in 5 of the 10 relevant records and in none of the 10 non-relevant records.
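The counting step in this paragraph can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_occurrence_table`, the record layout, and the three sample records are all hypothetical. Only the counts of records that contain a word are tallied; the counts of records that do not contain it follow by subtraction from the number of relevant and non-relevant records.

```python
from collections import defaultdict

def build_occurrence_table(judged_records):
    """Tally, per (item, word), how many relevant and non-relevant records contain the word.

    judged_records is a list of (is_relevant, {item: set_of_words}) pairs
    (a hypothetical layout chosen for this sketch).
    """
    table = defaultdict(lambda: {"rel": 0, "nonrel": 0})
    for is_relevant, items in judged_records:
        key = "rel" if is_relevant else "nonrel"
        for item, words in items.items():
            for word in set(words):  # count each record at most once per word
                table[(item, word)][key] += 1
    return dict(table)

# Hypothetical sample data, much smaller than the 20 records of the example.
records = [
    (True, {"title": {"university library", "network"}, "author": {"Taro Yamada"}}),
    (True, {"title": {"university library"}, "author": {"Hanako Tanaka"}}),
    (False, {"title": {"public library"}, "author": {"Taro Yamada"}}),
]
table = build_occurrence_table(records)
```

Keying the table by (item, word) keeps the same word in different items separate, which matches the way the example treats "university library" in the title and in the abstract as distinct entries.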

[0046] The search term selection unit 104 follows the processing flow of FIG. 3 and, in order to select search terms and search items from the judgment-specific occurrence record table 103 of FIG. 5(a), creates the judgment-specific occurrence record table of FIG. 5(b).

[0047] This processing will be described in detail. Since the relevance-judged record file 101 here contains non-relevant records, the probability that each word appears in the relevant records and the probability that it appears in all relevance-judged records are calculated using equations (A) and (B) above. For example, for the word "university library" in the title item of FIG. 5, the probability of appearing in the relevant records is 8/10 = 0.8 and the probability of appearing in all records is (8 + 0)/20 = 0.4, so the probability of appearing in the relevant records is higher. Therefore, "university library" in the title item is selected as a search term.

[0048] Similarly, for "public library" in the title item, the probability of appearing in the relevant records is 1/10 = 0.1 and the probability of appearing in all records is (1 + 3)/20 = 0.2. The probability of appearing in all records is higher, so this word is not selected as a search term. In this way, expression (C) is evaluated for each extracted word, and the judgment-specific occurrence record table of FIG. 5(b), in which the search terms and search items have been selected, is created. The search terms and search items selected here are passed to the search expression generation unit 107 and the search term weight calculation unit 106.
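The selection test worked through in the two examples above can be sketched as follows. Equations (A) to (C) are defined earlier in the description, so the comparison below is reconstructed from the worked numbers only; the function and parameter names are hypothetical.

```python
def is_selected(rel_with, nonrel_with, n_rel, n_nonrel):
    """Return True when a word is more likely in relevant records than in all judged records.

    rel_with / nonrel_with: relevant / non-relevant records containing the word.
    n_rel / n_nonrel: total relevant / non-relevant records judged.
    """
    p_relevant = rel_with / n_rel
    p_all = (rel_with + nonrel_with) / (n_rel + n_nonrel)
    return p_relevant > p_all

# "university library" in the title item: 8/10 = 0.8 against (8 + 0)/20 = 0.4
university_library_selected = is_selected(8, 0, 10, 10)
# "public library" in the title item: 1/10 = 0.1 against (1 + 3)/20 = 0.2
public_library_selected = is_selected(1, 3, 10, 10)
```

With the counts from the example, the first call selects the word and the second rejects it, in line with the text.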

[0049] The search term weight calculation unit 106 calculates the weights of the search terms selected in FIG. 5(b) (by the search term selection unit 104) according to the processing flow of FIG. 4. This processing will be described in detail. Since there are non-relevant records here, the weight of a search term is calculated using one of equations (1), (2), and (3) above. For example, for "university library" in the title item, equation (1) gives 8.5 × 10.5 = 89.25, equation (2) gives (8.5 × 10.5)/(2.5 × 0.5) = 71.4, and equation (3) gives log((8.5 × 10.5)/(2.5 × 0.5)) = 1.85. Calculating the weights of the other words in the same way gives the result shown in FIG. 6.
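The three weight values for "university library" can be reproduced with the sketch below. Equations (1) to (3) are given earlier in the description, so their form here is inferred from the worked numbers: 8.5, 10.5, 2.5, and 0.5 are taken to be the four occurrence counts (8, 10, 2, 0) each increased by 0.5, and the logarithm is taken to be base 10 because that yields 1.85. The function and parameter names are hypothetical.

```python
import math

def term_weights(rel_with, nonrel_with, n_rel, n_nonrel):
    """Return the weights corresponding to equations (1), (2) and (3) as inferred here."""
    rel_without = n_rel - rel_with            # relevant records lacking the word
    nonrel_without = n_nonrel - nonrel_with   # non-relevant records lacking the word
    w1 = (rel_with + 0.5) * (nonrel_without + 0.5)
    w2 = w1 / ((rel_without + 0.5) * (nonrel_with + 0.5))
    w3 = math.log10(w2)  # base 10 is an assumption that matches the example
    return w1, w2, w3

# "university library" in the title item: 8 of 10 relevant, 0 of 10 non-relevant
w1, w2, w3 = term_weights(8, 0, 10, 10)
```

The 0.5 added to each count keeps the ratio and its logarithm defined when a count is zero, as happens here for the non-relevant records.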

[0050] Meanwhile, the search expression generation unit 107 examines the search terms and their search items in FIG. 5(b) and generates a search expression that retrieves records in which a search term is present in its search item. Search terms for the same search item, for example "university library" and "network" in the title item, are joined with the OR operator. Similarly, "university library" and "network" in the abstract item, and "Taro Yamada" and "Hanako Tanaka" in the author item, are each joined with OR, and the search items are also joined to one another with OR. Furthermore, so that records already in the relevance-judged file are not retrieved again in the search results, the record numbers contained in the relevance-judged file are attached with the NOT operator. FIG. 7 shows an example of a search expression generated by the search expression generation unit 107. This search expression is passed to the search execution unit 108, and the database is searched.
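The way the search expression is assembled can be sketched as follows. The concrete syntax of FIG. 7 is not reproduced in this text, so the `item="term"` notation, the `#n` record-number notation, the function name, and the record numbers 1 and 2 are all hypothetical; only the OR and NOT structure is taken from the description.

```python
def build_search_expression(terms_by_item, judged_record_numbers):
    """Join terms within an item with OR, join items with OR, then exclude judged records with NOT."""
    groups = []
    for item, terms in terms_by_item.items():
        clause = " OR ".join(f'{item}="{term}"' for term in terms)
        groups.append(f"({clause})")
    expression = " OR ".join(groups)
    if judged_record_numbers:
        excluded = " OR ".join(f"#{number}" for number in judged_record_numbers)
        expression = f"({expression}) NOT ({excluded})"
    return expression

expression = build_search_expression(
    {
        "title": ["university library", "network"],
        "author": ["Taro Yamada", "Hanako Tanaka"],
    },
    [1, 2],  # hypothetical record numbers of already judged records
)
```

Excluding the judged record numbers in the expression itself means the searcher is never shown a record that has already been judged.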

[0051] For each record retrieved by the search expression of FIG. 7, the record relevance calculation unit 109 adds up the weights of the search terms that are present in their search items and takes the sum as the relevance of the record. For example, if the search term weighting equation is (3), the relevance of a record in which "university library" and "network" appear in the title item is 1.85 + 0.49 = 2.34. The relevance of the other records is calculated in the same way, the records are sorted in order of relevance, and the relevance-ordered search results 110 are output.
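The relevance calculation and sorting can be sketched as follows. The two weights are the values quoted in the example; the record layout, the record identifiers 31 and 32, and the function names are hypothetical.

```python
def record_relevance(record, weights):
    """Sum the weights of the search terms that are present in their search items."""
    return sum(
        weight
        for (item, term), weight in weights.items()
        if term in record.get(item, set())
    )

def rank_by_relevance(records, weights):
    """Return the records sorted from highest to lowest relevance."""
    return sorted(records, key=lambda r: record_relevance(r, weights), reverse=True)

weights = {("title", "university library"): 1.85, ("title", "network"): 0.49}
records = [
    {"id": 31, "title": {"network"}},
    {"id": 32, "title": {"university library", "network"}},
]
ranked = rank_by_relevance(records, weights)
```

Record 32 contains both title terms, so its relevance is about 2.34 and it is listed first.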

[0052] A further example will be described. For the purposes of explanation, the relevance-judged record file 101 is assumed to contain only one relevant record. The judgment-specific occurrence record table creation unit 102 receives this relevance-judged record file 101, extracts the words present in the relevant record, and creates the judgment-specific occurrence record table 103 of FIG. 8(a). Since there are no non-relevant records, the search term selection unit 104 follows the branch of FIG. 3 for the case without non-relevant records: items whose weight coefficient in the item knowledge 105 of FIG. 9 is at or above a threshold become search items, and the words appearing in those items become search terms. If the threshold is 0.8, the search items are the title, abstract, author, and cited references, and the words selected as search terms are those marked with a circle in FIG. 8(b).

[0053] The search term weight calculation unit 106 calculates the weights of the search terms of FIG. 8(b) according to the processing flow shown in FIG. 4. Since there are no non-relevant records and only one relevant record, the weight is the value of a, which is 1, multiplied by the weight coefficient given in the item knowledge 105 of FIG. 9.

[0054] For example, the weight of "reference" in the abstract item is 1 × 0.8 = 0.8. FIG. 10 shows the search terms and the weights calculated in this way. The search expression generation unit 107 generates a search expression by joining the search terms and search items of FIG. 8 selected by the search term selection unit 104 with the OR operator. If the record number of the relevant record in the relevance-judged record file 101 is #21, it is attached with the NOT operator so that this record is not included in the search results. FIG. 11 shows an example of a search expression generated in this way.
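The handling of the case without non-relevant records can be sketched as follows. Only the threshold 0.8, the resulting four search items, and the abstract coefficient 0.8 come from the text; the other coefficients in `item_knowledge` are hypothetical stand-ins for FIG. 9, as are the function names.

```python
def select_items(item_knowledge, threshold):
    """Keep the items whose weight coefficient is at or above the threshold."""
    return [item for item, coefficient in item_knowledge.items() if coefficient >= threshold]

def weight_from_item_knowledge(rel_with, coefficient):
    """Weight = number of relevant records containing the word times the item's coefficient."""
    return rel_with * coefficient

# Hypothetical coefficients; only the value for "abstract" is stated in the example.
item_knowledge = {
    "title": 1.0,
    "abstract": 0.8,
    "author": 0.9,
    "cited references": 0.8,
    "affiliation": 0.5,
    "journal name": 0.3,
}
search_items = select_items(item_knowledge, 0.8)
reference_weight = weight_from_item_knowledge(1, item_knowledge["abstract"])
```

With a single relevant record the count is 1, so each weight equals the coefficient of the item in which the word appears.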

[0055] The search execution unit 108 executes the search expression of FIG. 11 and receives the search results. The record relevance calculation unit 109 then calculates the relevance of each record in the search results from the search term weights of FIG. 10 and outputs the relevance-ordered search results 110, sorted by relevance.

[0056] If the searcher judges the relevance of the output results again, that information is added to the judgment-specific occurrence record table 103 and a new search is performed. The processing continues until the searcher is satisfied.

[0057]

[Effects of the Invention] As described above, according to the present invention, the searcher can obtain new search results arranged in order of relevance simply by judging results as relevant or non-relevant.

[0058] In calculating relevance, search term weights are calculated item by item, not only for words in the title and abstract but also for words in items such as the author and the author's affiliation, so a more accurate relevance can be obtained.

[0059] Furthermore, by using the item knowledge, search terms can be selected and weighted even when no non-relevant records are input or when only one relevant record is input.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of the present invention.

FIG. 2 is a diagram showing the occurrence of words by relevance judgment.

FIG. 3 is a diagram showing the processing flow of the search term selection unit.

FIG. 4 is a diagram showing the processing flow of the search term weight calculation unit.

FIG. 5 is an example of a judgment-specific occurrence record table.

FIG. 6 is an example of weights calculated by the search term weight calculation unit.

FIG. 7 is an example of a search expression generated by the search expression generation unit.

FIG. 8 is an example of a judgment-specific occurrence record table.

FIG. 9 is an example of item knowledge.

FIG. 10 is an example of weights calculated by the search term weight calculation unit.

FIG. 11 is an example of a search expression generated by the search expression generation unit.

[Explanation of symbols]

101 Relevance-judged record file
102 Judgment-specific occurrence record table creation unit
103 Judgment-specific occurrence record table
104 Search term selection unit
105 Item knowledge
106 Search term weight calculation unit
107 Search expression generation unit
108 Search execution unit
109 Record relevance calculation unit
110 Relevance-ordered search results

Continuation of the front page

(56) References: JP-A-2-245971 (JP, A); JP-A-3-294963 (JP, A); JP-A-4-281565 (JP, A); JP-A-6-318234 (JP, A); JP-A-5-204975 (JP, A); JP-A-5-151271 (JP, A); S. E. Robertson and Karen Sparck Jones, "Relevance Weighting of Search Terms", Journal of the American Society for Information Science, vol. 26, pp. 129-146, 1976

(58) Fields searched (Int. Cl.⁶, DB name): G06F 17/30

Claims

(57) [Claims]

1. A relevance feedback device for a database search system in which a searcher creates a search expression, the database is searched using the search expression, and search results are output, the device creating a relevance-judged record file in which the searcher has judged whether each search result is relevant or non-relevant, creating a search expression again using the relevance-judged record file, and searching the database again, the device comprising: a judgment-specific occurrence record table creation unit that extracts, from the relevance-judged record file, the words appearing in the search results and the items to which the words belong, and creates a judgment-specific occurrence record table describing to what extent each word appears in each item of the relevance-judged record file; a search term selection unit that selects, with reference to the judgment-specific occurrence record table, search terms and search items for searching the database again; a search expression generation unit that creates a new search expression from the search terms, the search items, and the relevance-judged record file; and a search execution unit that executes a search using the search expression and obtains search results.

2. The relevance feedback device according to claim 1, further comprising: a search term weight calculation unit that calculates the weights of the search terms selected by the search term selection unit with reference to the judgment-specific occurrence record table; and a record relevance calculation unit that determines to what extent the search terms selected by the search term selection unit are present in the search results obtained by the search execution unit, receives the weight of each search term from the search term weight calculation unit, takes the sum of the weights of all search terms present in a search result as the relevance of that search result, and outputs relevance-ordered search results, which are the search results sorted by relevance.

3. A relevance feedback device for a database search system in which a searcher creates a search expression, the database is searched using the search expression, and search results are output, the device creating a relevance-judged record file in which the searcher has judged whether each search result is relevant or non-relevant, creating a search expression again using the relevance-judged record file, and searching the database again, the device comprising: a judgment-specific occurrence record table creation unit that extracts, from the relevance-judged record file, the words appearing in the search results and the items to which the words belong, and creates a judgment-specific occurrence record table describing to what extent each word appears in each item of the relevance-judged record file; item knowledge that holds a weight coefficient for each item of the database; a search term selection unit that selects search terms and search items for searching the database again with reference to the judgment-specific occurrence record table or the weight coefficients of the item knowledge; a search expression generation unit that creates a new search expression from the search terms, the search items, and the relevance-judged record file; and a search execution unit that executes a search using the search expression and obtains search results.

4. The relevance feedback device according to claim 3, further comprising: a search term weight calculation unit that calculates the weights of the search terms selected by the search term selection unit with reference to the judgment-specific occurrence record table and the item knowledge; and a record relevance calculation unit that determines to what extent the search terms selected by the search term selection unit are present in the search results obtained by the search execution unit, receives the weight of each search term from the search term weight calculation unit, takes the sum of the weights of all search terms present in a search result as the relevance of that search result, and outputs relevance-ordered search results, which are the search results sorted by relevance.

5. The relevance feedback device according to claim 3, wherein, when selecting search terms, the search term selection unit determines whether the relevance-judged record file contains a non-relevant record, selects search terms with reference to the judgment-specific occurrence record table when there is a non-relevant record, and selects search terms with reference to the item knowledge when there is no non-relevant record.