JP2002366577A

JP2002366577A - System, method, program for information retrieval, recording medium with information retrieval program recorded thereon, device, method, program for selecting output information and recording medium with output information selection program recorded thereon

Info

Publication number: JP2002366577A
Application number: JP2001168547A
Authority: JP
Inventors: Shigeki Muramatsu; 茂樹村松; Kazunori Matsumoto; 一則松本
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2001-06-04
Filing date: 2001-06-04
Publication date: 2002-12-20
Anticipated expiration: 2021-06-04
Also published as: JP4636734B2

Abstract

PROBLEM TO BE SOLVED: To provide a technique that lets a user browse the information he or she needs from similarity search results with as little wasted effort as possible. SOLUTION: A group of pieces of information whose feature vectors have a similarity to the query higher than a threshold is extracted, and the content of the information whose feature vector has the highest similarity is presented first. If the user judges that information as YES, information whose feature vector is highly similar to that of the presented information is presented next; if the user judges it as NO, a virtual vector pointing in the opposite direction from that feature vector is used from then on. By repeating this procedure, the information the user wants is predicted and presented ahead of time.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an information retrieval system, an information retrieval method, an information retrieval program, a recording medium on which the information retrieval program is recorded, an output information selection device, an output information selection method, an output information selection program, and a recording medium on which the output information selection program is recorded.

[0002]

[Description of the Related Art] Information retrieval systems are known in which, when a user searches for documents on the Internet or accesses a document database installed at a particular location, the user enters as a query either text data describing the content of the document he or she needs or the text data of an existing document that serves as a model for it, and issues a search command; the search engine then extracts documents whose text data resembles the text data entered as the query and presents them to the user.

[0003] Such an information retrieval system uses, for example, a vector space model typified by tf*idf. The text data of a large number of documents is analyzed in advance to obtain feature vectors, which are registered in a database. A feature vector is then obtained for the text data entered by the user, similar feature vectors are extracted from the group of feature vectors already registered in the database, and the documents having the extracted feature vectors are presented to the user as the search result.

[0004] The tf*idf mentioned above is a feature quantity in which the frequency of occurrence of each word in a document's text data is weighted to take account of how often that word occurs in the text data of the other search target documents. The weight w_t for a word is given by Equation 1 below.

[0005]

[Equation 1]

Here, N is the number of search target documents and f_t is the number of documents containing the word t.

[0006] Each element w_{d,t} of the feature vector of a document's text data is calculated as follows.

[0007]

[Equation 2]

Here, f_{d,t} is the frequency of occurrence of the word t in the text data of document d.
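The formula images for Equations 1 and 2 are not reproduced in this text, so the following is only a minimal sketch of the weighting described in [0004] to [0007]. It assumes one common tf*idf variant, w_t = log(N / f_t) + 1 and w_{d,t} = f_{d,t} * w_t, which may differ from the patent's exact formulas. The function names `idf_weights` and `tfidf_vector` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Return {term: w_t} for a list of tokenized documents.

    Assumed variant: w_t = log(N / f_t) + 1, where N is the number of
    documents and f_t is the number of documents containing term t.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count documents, not occurrences
    return {t: math.log(n_docs / f) + 1.0 for t, f in doc_freq.items()}


def tfidf_vector(tokens, weights):
    """Return {term: w_dt} with the assumed w_dt = f_dt * w_t."""
    term_freq = Counter(tokens)  # f_dt: occurrences of t in this document
    return {t: f * weights[t] for t, f in term_freq.items() if t in weights}


# Hypothetical three-document collection.
docs = [["vector", "search", "search"], ["vector", "model"], ["query", "search"]]
weights = idf_weights(docs)
vec0 = tfidf_vector(docs[0], weights)
```

Terms that occur in fewer documents receive a larger weight, so `weights["model"]` exceeds `weights["vector"]` in this collection.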

[0008] One measure of the similarity between documents whose feature quantities are expressed as vectors is, for example, the cosine coefficient. With the cosine coefficient, the similarity sim(x, y) between two vectors x and y is expressed by Equation 3 below.

[0009]

[Equation 3] sim(x, y) = (x · y) / (|x| |y|)

A conventional information retrieval system extracts the documents whose feature vectors have a similarity to the feature vector of the user's query text data higher than a predetermined threshold, and presents the search result as a list sorted in descending order of similarity, as shown in FIG. 12.
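The conventional retrieval step can be sketched as follows, using the standard cosine coefficient sim(x, y) = (x · y) / (|x| |y|). This is an illustrative sketch only: the function names, the dense-list vector representation, the sample vectors, and the threshold value are all assumptions, not taken from the patent.

```python
import math


def cosine(x, y):
    """Cosine coefficient between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_x * norm_y)


def ranked_hits(query, docs, threshold):
    """Names of documents above the threshold, most similar first."""
    scored = [(cosine(query, vec), name) for name, vec in docs.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > threshold]


# Hypothetical two-dimensional feature vectors.
docs = {"A": [1.0, 0.1], "B": [0.8, 0.6], "C": [0.0, 1.0]}
hits = ranked_hits([1.0, 0.0], docs, threshold=0.5)
```

Here document C is orthogonal to the query and falls below the threshold, so only A and B are listed, in that order.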

[0010] FIG. 13 schematically represents the similarity-ordered search results of FIG. 12 in the vector space. That is, FIG. 13 shows the spatial relationship between the feature vector of the query Q and the feature vectors of documents A to E, each of which has a high similarity within the threshold. In FIG. 13, the distances from the query Q satisfy A < D < B < C < E, so the documents within the threshold rank A, D, B, C, E in descending order of similarity to the query Q.

[0011] Besides tf*idf, there are various other models for representing a document as a feature vector, such as the Term-discrimination Value and Probabilistic Term Weighting described in, for example, Gerard Salton, "Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer".

[0012] Likewise, besides the cosine coefficient above, there are various other measures for obtaining similarity, such as the inner product, the Dice coefficient, and the Jaccard coefficient, as described in the reference cited above.

[0013]

[Problems to Be Solved by the Invention] Such conventional information retrieval systems have the following problem. As shown in FIG. 13, even though documents A to E all have a high similarity within the threshold, the directions of their feature vectors relative to the query Q vary. Nevertheless, search results are normally just displayed in descending order of similarity, as shown in FIG. 12. Consequently, even if the user opens and reads document A, the one closest to the query Q, and finds that it is what he or she is looking for, the user next opens document D, which has the second-highest similarity but lies in the opposite direction from document A with respect to the query Q. After that, the user opens document B, which has the third-highest similarity, lies in almost exactly the opposite direction from document D with respect to the query Q, and has a feature vector close to that of document A. Next, the user opens document C, whose direction differs completely from those of the preceding documents.

[0014] In reality, however, if the user opens document A, which has the highest similarity, and finds its content highly relevant to what he or she is seeking, it is preferable for the user to next open document B, whose feature vector points in the same direction as that of document A, rather than document D, which has the second-highest similarity but whose feature vector points in the opposite direction from that of document A with respect to the query Q.
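The point about direction can be illustrated with a small sketch. It assumes that "direction with respect to the query Q" can be measured from the offset vectors A - Q, B - Q, and D - Q; that reading, the helper names, and the sample angles are assumptions made for illustration, not the patent's definition.

```python
import math


def unit(deg):
    """Unit vector at the given angle in degrees (hypothetical test data)."""
    rad = math.radians(deg)
    return [math.cos(rad), math.sin(rad)]


def offset(base, vec):
    """Vector pointing from `base` to `vec`."""
    return [v - b for b, v in zip(base, vec)]


def same_side(query, doc_a, doc_b):
    """True if doc_a and doc_b lie in roughly the same direction from the query."""
    off_a, off_b = offset(query, doc_a), offset(query, doc_b)
    return sum(p * q for p, q in zip(off_a, off_b)) > 0.0


# Q at 45 degrees; A and B on one side of Q, D on the other side.
Q, A, B, D = unit(45), unit(40), unit(30), unit(55)
a_and_b = same_side(Q, A, B)
a_and_d = same_side(Q, A, D)
```

With these angles the similarity order to Q is A, D, B, yet B lies on the same side of Q as A (`a_and_b` is true) while D lies on the opposite side (`a_and_d` is false).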

[0015] Conventionally, however, results were simply opened in order of similarity without regard to the direction of the feature vectors. Such a user need therefore could not be met, and the user had to spend time and effort before finding a document with the required content.

[0016] The present invention has been made in view of this conventional problem, and its object is to provide a technique that lets a user browse the information he or she needs from similarity search results with as little wasted effort as possible.

[0017]

[Means for Solving the Problems] The information retrieval system of the invention of claim 1 comprises: a query input unit that accepts a query entered by a user; a feature vector creation unit that creates a feature vector from the query accepted by the query input unit; an information database in which an index, data representing the content, and feature vector data are registered for each of a large number of pieces of search target information; a similarity calculation unit that calculates the similarity between the feature vector of the query created by the feature vector creation unit and the feature vector of each piece of search target information registered in the information database, identifies the search target information whose feature vectors show a similarity higher than a predetermined threshold, and retrieves the indexes, the data representing the content, and the feature vector data of that information; a search result holding unit that stores the data retrieved by the similarity calculation unit; a search result output unit that outputs the index and the data representing the content of one of the pieces of search target information retrieved by the similarity calculation unit; and a feedback processing unit that accepts the user's suitable/unsuitable judgment on the search result being output by the search result output unit and (1) in the case of a suitable input, searches the search result holding unit for another piece of search target information whose feature vector has the highest similarity to the feature vector of the search target information currently being output and, if one is found, causes the search result output unit to output its content as the next candidate, and (2) in the case of an unsuitable input, calculates a virtual vector that points in the opposite direction from the feature vector of the search target information currently being output, relative to the feature vector of the query at the first stage or relative to the feature vector output at the previous stage at the second and subsequent stages, searches the search target information held in the search result holding unit for another piece of search target information whose feature vector has the highest similarity to this virtual vector and, if one is found, causes the search result output unit to output its content as the next candidate.
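One feedback step of claim 1 might be sketched as follows. The claim does not give a formula for the virtual vector here, so this sketch assumes one plausible reading: the rejected vector is reflected through the base vector (the query at the first stage, otherwise the previously output vector), giving 2 * base - rejected. The cosine coefficient is assumed as the similarity measure, and all names and sample angles are hypothetical.

```python
import math


def cosine(x, y):
    """Cosine coefficient between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def virtual_vector(base, rejected):
    """Assumed construction: reflect `rejected` through `base`."""
    return [2 * b - r for b, r in zip(base, rejected)]


def next_candidate(base, current, remaining, accepted):
    """Pick the next hit to show after the user's suitable/unsuitable answer.

    base      -- query vector at the first stage, else the previous output's vector
    current   -- vector of the hit currently being shown
    remaining -- {name: vector} of held hits not yet shown
    accepted  -- True for "suitable", False for "unsuitable"
    """
    if not remaining:
        return None  # no corresponding search target information
    target = current if accepted else virtual_vector(base, current)
    return max(sorted(remaining), key=lambda name: cosine(target, remaining[name]))


def unit(deg):
    """Unit vector at the given angle in degrees (hypothetical test data)."""
    rad = math.radians(deg)
    return [math.cos(rad), math.sin(rad)]


Q, A = unit(45), unit(40)
remaining = {"B": unit(30), "D": unit(55)}
after_yes = next_candidate(Q, A, remaining, accepted=True)
after_no = next_candidate(Q, A, remaining, accepted=False)
```

In this example, accepting A leads to B, which lies on the same side of Q as A, while rejecting A leads to D, which lies on the other side.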

[0018] The information retrieval system of the invention of claim 2 comprises: a query input unit that accepts a query entered by a user; a feature vector creation unit that creates a feature vector from the query accepted by the query input unit; an information database in which an index, data representing the content, and feature vector data are registered for each of a large number of pieces of search target information; a similarity calculation unit that calculates the similarity between the feature vector of the query created by the feature vector creation unit and the feature vector of each piece of search target information registered in the information database, identifies the search target information whose feature vectors show a similarity higher than a predetermined threshold, and retrieves the indexes, the data representing the content, and the feature vector data of that information; a search result holding unit that stores the data retrieved by the similarity calculation unit; a decision tree creation unit that creates a decision tree by processing the group of high-similarity feature vectors obtained by the similarity calculation unit as follows: starting from the feature vector with the highest similarity, it branches to suitable on the assumption that the user accepts the content and to unsuitable on the assumption that the user does not; on a suitable branch, it takes another piece of search target information whose feature vector has the highest similarity to the current feature vector as the next candidate; on an unsuitable branch, it calculates a virtual vector that points in the opposite direction from the feature vector of the search target information judged unsuitable, relative to the feature vector of the query at the first stage or relative to the feature vector of the previous stage's output candidate at the second and subsequent stages, searches for another piece of search target information whose feature vector has the highest similarity to this virtual vector and, if one is found, takes it as the next candidate; and it repeats this processing until all of the search target information held in the search result holding unit is covered; a search result output unit that outputs the index and the data representing the content of one of the pieces of search target information retrieved by the similarity calculation unit; and a feedback processing unit that accepts the user's suitable/unsuitable judgment on the search result being output by the search result output unit, identifies the search target information to be output next on the basis of the decision tree created by the decision tree creation unit, and causes the search result output unit to output the content of that search target information.
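The decision tree of claim 2 could be precomputed along the following lines. This is a sketch under stated assumptions: the virtual vector is taken to be the reflection 2 * base - rejected, similarity is the cosine coefficient, and the dictionary-based node layout and all names are hypothetical. Because every hit is expanded on both branches, the tree grows quickly and the sketch is only practical for small result sets.

```python
import math


def cosine(x, y):
    """Cosine coefficient between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def reflect(base, rejected):
    """Assumed virtual vector: `rejected` reflected through `base`."""
    return [2 * b - r for b, r in zip(base, rejected)]


def build_tree(base, name, remaining, hits):
    """Build the node for hit `name`; `remaining` is a frozenset of unused names."""
    node = {"doc": name, "yes": None, "no": None}
    if remaining:
        current = hits[name]
        for label, target in (("yes", current), ("no", reflect(base, current))):
            nxt = max(sorted(remaining), key=lambda n: cosine(target, hits[n]))
            # The current hit becomes the base for the next stage.
            node[label] = build_tree(current, nxt, remaining - {nxt}, hits)
    return node


def unit(deg):
    """Unit vector at the given angle in degrees (hypothetical test data)."""
    rad = math.radians(deg)
    return [math.cos(rad), math.sin(rad)]


query = unit(45)
hits = {"A": unit(40), "B": unit(30), "D": unit(55)}
root_name = max(sorted(hits), key=lambda n: cosine(query, hits[n]))
tree = build_tree(query, root_name, frozenset(hits) - {root_name}, hits)
```

At run time the feedback step only has to follow the `yes` or `no` link of the current node, so no similarity calculation is needed while the user is browsing.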

[0019] The information retrieval system of the invention of claim 3 comprises: a query input unit that accepts a query entered by a user; a feature vector creation unit that creates a feature vector from the query accepted by the query input unit; an information database in which an index, data representing the content, and feature vector data are registered for each of a large number of pieces of search target information; a similarity calculation unit that calculates the similarity between the feature vector of the query created by the feature vector creation unit and the feature vector of each piece of search target information registered in the information database, identifies the search target information whose feature vectors show a similarity higher than a predetermined threshold, and retrieves the indexes, the data representing the content, and the feature vector data of that information; a search result holding unit that stores the data retrieved by the similarity calculation unit; a decision tree creation unit that creates a decision tree by processing the group of high-similarity feature vectors obtained by the similarity calculation unit as follows: starting from the feature vector with the highest similarity, it branches to suitable on the assumption that the user accepts the content and to unsuitable on the assumption that the user does not; on a suitable branch, it takes another piece of search target information whose feature vector has the highest similarity to the current feature vector as the next candidate; on an unsuitable branch, it calculates a virtual vector that points in the opposite direction from the feature vector of the search target information judged unsuitable, relative to the feature vector of the query at the first stage or relative to the feature vector of the previous stage's output candidate at the second and subsequent stages, searches for another piece of search target information whose feature vector has the highest similarity to this virtual vector and, if one is found, takes it as the next candidate; and it repeats this processing up to a predetermined stage for the group of search target information held in the search result holding unit; a search result output unit that outputs the index and the data representing the content of one of the pieces of search target information retrieved by the similarity calculation unit; and a feedback processing unit that accepts the user's suitable/unsuitable judgment on the search result being output by the search result output unit, identifies the search target information to be output next on the basis of the decision tree created by the decision tree creation unit, causes the search result output unit to output the content of that search target information, and, when a condition for growing the existing decision tree is reached, instructs the decision tree creation unit to grow the decision tree by a predetermined number of stages.
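The staged variant of claim 3 might look like the sketch below, where the tree is built only to a limited depth and extended on demand. The claim does not state the growth condition, so the sketch assumes it is "the walk reaches a node whose needed child has not been built yet". The reflection-based virtual vector, the cosine measure, the node layout, and all names are likewise assumptions.

```python
import math


def cosine(x, y):
    """Cosine coefficient between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def reflect(base, rejected):
    """Assumed virtual vector: `rejected` reflected through `base`."""
    return [2 * b - r for b, r in zip(base, rejected)]


def make_node(base, name, remaining):
    """Node that remembers enough state to be grown later."""
    return {"doc": name, "base": base, "remaining": remaining, "yes": None, "no": None}


def grow(node, hits, depth):
    """Extend the tree below `node` by up to `depth` further stages."""
    if depth == 0 or not node["remaining"]:
        return
    current = hits[node["doc"]]
    for label, target in (("yes", current), ("no", reflect(node["base"], current))):
        if node[label] is None:
            nxt = max(sorted(node["remaining"]), key=lambda n: cosine(target, hits[n]))
            node[label] = make_node(current, nxt, node["remaining"] - {nxt})
        grow(node[label], hits, depth - 1)


def walk(root, hits, answers, step=1):
    """Follow the user's answers, growing the tree by `step` stages when needed."""
    node, shown = root, [root["doc"]]
    for accepted in answers:
        label = "yes" if accepted else "no"
        if node[label] is None:
            grow(node, hits, step)  # assumed growth condition
        node = node[label]
        if node is None:
            break  # every held hit has been shown
        shown.append(node["doc"])
    return shown


def unit(deg):
    """Unit vector at the given angle in degrees (hypothetical test data)."""
    rad = math.radians(deg)
    return [math.cos(rad), math.sin(rad)]


hits = {"A": unit(40), "B": unit(30), "D": unit(55)}
root = make_node(unit(45), "A", frozenset({"B", "D"}))
grow(root, hits, 1)  # build only the first stage up front
shown = walk(root, hits, [True, False])
```

Building a few stages at a time keeps the initial cost low when the held result set is large, at the price of occasional similarity calculations during browsing.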

[0020] The invention of claim 4 is the information retrieval system of any of claims 1 to 3, characterized in that the query and the search target information are text data.

[0021] The information retrieval method of the invention of claim 5 comprises: step 1 of accepting a query entered by a user; step 2 of creating a feature vector from the accepted query; step 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying the search target information whose feature vectors show a similarity higher than a predetermined threshold, and retrieving the indexes, the data representing the content, and the feature vector data of that information; step 4 of outputting the index and the data representing the content of one of the pieces of search target information retrieved in step 3; and step 5 of accepting the user's suitable/unsuitable judgment on the search result being output in step 4 and (1) in the case of a suitable input, searching the search target information retrieved in step 3 for another piece of search target information whose feature vector has the highest similarity to the feature vector of the search target information currently being output and, if one is found, outputting its content as the next candidate, and (2) in the case of an unsuitable input, calculating a virtual vector that points in the opposite direction from the feature vector of the search target information currently being output, relative to the feature vector of the query at the first stage or relative to the feature vector output at the previous stage at the second and subsequent stages, searching the search target information retrieved in step 3 for another piece of search target information whose feature vector has the highest similarity to this virtual vector and, if one is found, outputting its content as the next candidate.

[0022] The information retrieval method of the invention of claim 6 comprises: step 1 of accepting a query entered by a user; step 2 of creating a feature vector from the accepted query; step 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying the search target information whose feature vectors show a similarity higher than a predetermined threshold, and retrieving the indexes, the data representing the content, and the feature vector data of that information; step 4 of creating a decision tree by processing the group of high-similarity feature vectors obtained in step 3 as follows: starting from the feature vector with the highest similarity, branching to suitable on the assumption that the user accepts the content and to unsuitable on the assumption that the user does not; on a suitable branch, taking another piece of search target information whose feature vector has the highest similarity to the current feature vector as the next candidate; on an unsuitable branch, calculating a virtual vector that points in the opposite direction from the feature vector of the search target information judged unsuitable, relative to the feature vector of the query at the first stage or relative to the feature vector of the previous stage's output candidate at the second and subsequent stages, searching for another piece of search target information whose feature vector has the highest similarity to this virtual vector and, if one is found, taking it as the next candidate; and repeating this processing until all of the search target information retrieved in step 3 is covered; step 5 of outputting the index and the data representing the content of one of the pieces of search target information retrieved in step 3; and step 6 of accepting the user's suitable/unsuitable judgment on the search result being output in step 5, identifying the search target information to be output next on the basis of the decision tree created in step 4, and outputting the content of that search target information.

[0023] The information retrieval method of the invention of claim 7 comprises: step 1 of accepting a query entered by a user; step 2 of creating a feature vector from the accepted query; step 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying the search target information whose feature vectors show a similarity higher than a predetermined threshold, and retrieving the indexes, the data representing the content, and the feature vector data of that information; step 4 of creating a decision tree by processing the group of high-similarity feature vectors obtained in step 3 as follows: starting from the feature vector with the highest similarity, branching to suitable on the assumption that the user accepts the content and to unsuitable on the assumption that the user does not; on a suitable branch, taking another piece of search target information whose feature vector has the highest similarity to the current feature vector as the next candidate; on an unsuitable branch, calculating a virtual vector that points in the opposite direction from the feature vector of the search target information judged unsuitable, relative to the feature vector of the query at the first stage or relative to the feature vector of the previous stage's output candidate at the second and subsequent stages, searching for another piece of search target information whose feature vector has the highest similarity to this virtual vector and, if one is found, taking it as the next candidate; and repeating this processing up to a predetermined stage for the search target information retrieved in step 3; step 5 of outputting the index and the data representing the content of one of the pieces of search target information retrieved in step 3; and step 6 of accepting the user's suitable/unsuitable judgment on the search result being output in step 5, identifying the search target information to be output next on the basis of the decision tree created in step 4, outputting the content of that search target information, and, when a condition for growing the decision tree created in step 4 is reached, growing the decision tree by a predetermined number of stages.

[0024] The invention of claim 8 is the information retrieval method of any of claims 5 to 7, characterized in that the query and the search target information are text data.

[0025] The information retrieval program of the invention of claim 9 causes a computer to execute: process 1 of accepting a query entered by a user; process 2 of creating a feature vector from the accepted query; process 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying the search target information whose feature vectors show a similarity higher than a predetermined threshold, and retrieving the indexes, the data representing the content, and the feature vector data of that information; process 4 of outputting the index and the data representing the content of one of the pieces of search target information retrieved in process 3; and process 5 of accepting the user's suitable/unsuitable judgment on the search result being output in process 4 and (1) in the case of a suitable input, searching the search target information retrieved in process 3 for another piece of search target information whose feature vector has the highest similarity to the feature vector of the search target information currently being output and, if one is found, outputting its content as the next candidate, and (2) in the case of an unsuitable input, calculating a virtual vector that points in the opposite direction from the feature vector of the search target information currently being output, relative to the feature vector of the query at the first stage or relative to the feature vector output at the previous stage at the second and subsequent stages, searching the search target information retrieved in process 3 for another piece of search target information whose feature vector has the highest similarity to this virtual vector and, if one is found, outputting its content as the next candidate.

[0026] The information retrieval program according to the tenth aspect of the invention causes a computer to execute: process 1 of receiving a query input by a user; process 2 of creating a feature vector from the received query; process 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying the search target information whose feature vectors show a similarity higher than a predetermined threshold, and extracting their indexes, the data representing their contents, and their feature vector data; process 4 of creating a decision tree over the group of high-similarity feature vectors obtained in process 3, starting from the feature vector with the highest similarity and branching to "suitable" on the assumption that the user accepts its content and to "unsuitable" on the assumption that the user does not, wherein on a suitable branch the other search target information whose feature vector has the highest similarity to that feature vector is taken as the next candidate, and on an unsuitable branch a virtual vector is calculated that points in the opposite direction to the feature vector of the rejected search target information, relative to the feature vector of the query at the first stage or relative to the feature vector of the preceding stage's output candidate from the second stage onward, and the other search target information whose feature vector has the highest similarity to this virtual vector is searched for and, if found, is taken as the next candidate, this procedure being repeated until all of the search target information extracted in process 3 is covered; process 5 of outputting the index and the data indicating the content of one piece of the search target information extracted in process 3; and process 6 of receiving the user's suitable/unsuitable judgment on the search result output in process 5, identifying the search target information to output next on the basis of the decision tree created in process 4, and outputting the content of that search target information.
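
The decision tree of process 4 and its use in process 6 might look like the sketch below. It is illustrative only and rests on assumptions not stated in the text: the virtual vector is taken as `2 * reference - rejected`, similarity is cosine similarity, items already on a path are excluded from that path, and the names `Node`, `build_decision_tree`, and `walk` are hypothetical. A tree built this way grows exponentially with the number of candidates, so the example is only practical for small candidate sets.

```python
import math
from dataclasses import dataclass
from typing import Optional


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Node:
    item: str                      # id of the item presented at this node
    yes: Optional["Node"] = None   # next node after a "suitable" judgment
    no: Optional["Node"] = None    # next node after an "unsuitable" judgment


def _build(candidates, item, reference, used):
    """Build the subtree rooted at `item`; `used` holds ids earlier on the path."""
    node = Node(item)
    used = used | {item}
    remaining = [i for i in candidates if i not in used]
    if not remaining:
        return node  # every extracted item is covered on this path
    vec = candidates[item]
    # ASSUMPTION: the virtual vector mirrors the rejected vector through the reference.
    virtual = [2 * r - c for r, c in zip(reference, vec)]
    yes_item = max(remaining, key=lambda i: cosine(candidates[i], vec))
    no_item = max(remaining, key=lambda i: cosine(candidates[i], virtual))
    # From the second stage onward the reference is the previous stage's vector.
    node.yes = _build(candidates, yes_item, vec, used)
    node.no = _build(candidates, no_item, vec, used)
    return node


def build_decision_tree(candidates, query):
    """Start from the item most similar to the query (first stage reference)."""
    root_item = max(candidates, key=lambda i: cosine(candidates[i], query))
    return _build(candidates, root_item, query, frozenset())


def walk(root, answers):
    """Follow the user's judgments (True = suitable) and list the items shown."""
    shown, node = [root.item], root
    for suitable in answers:
        node = node.yes if suitable else node.no
        if node is None:
            break
        shown.append(node.item)
    return shown
```

Because the whole tree exists before the user answers, each judgment in `walk` is a pointer lookup rather than a similarity search.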

[0027] The information retrieval program according to the eleventh aspect of the invention causes a computer to execute: process 1 of receiving a query input by a user; process 2 of creating a feature vector from the received query; process 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying the search target information whose feature vectors show a similarity higher than a predetermined threshold, and extracting their indexes, the data representing their contents, and their feature vector data; process 4 of creating a decision tree over the group of high-similarity feature vectors obtained in process 3, starting from the feature vector with the highest similarity and branching to "suitable" on the assumption that the user accepts its content and to "unsuitable" on the assumption that the user does not, wherein on a suitable branch the other search target information whose feature vector has the highest similarity to that feature vector is taken as the next candidate, and on an unsuitable branch a virtual vector is calculated that points in the opposite direction to the feature vector of the rejected search target information, relative to the feature vector of the query at the first stage or relative to the feature vector of the preceding stage's output candidate from the second stage onward, and the other search target information whose feature vector has the highest similarity to this virtual vector is searched for and, if found, is taken as the next candidate, this procedure being repeated up to a predetermined stage for the search target information extracted in process 3; process 5 of outputting the index and the data indicating the content of one piece of the search target information extracted in process 3; and process 6 of receiving the user's suitable/unsuitable judgment on the search result output in process 5, identifying the search target information to output next on the basis of the decision tree created in process 4, outputting the content of that search target information, and, when a condition for growing the decision tree created in process 4 is met, growing that decision tree by a predetermined number of stages.
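
A depth-limited tree that is grown on demand could be sketched as below. The growth condition is not specified in this passage, so the example assumes the tree is grown whenever the user reaches a node that has not been expanded yet. The reflection formula for the virtual vector, cosine similarity, the exclusion of items already on the path, and the names `Node`, `expand`, and `session` are likewise assumptions for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Node:
    item: str                      # id of the item presented at this node
    reference: list                # vector the virtual vector is computed against
    used: frozenset                # ids on the path so far, including `item`
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    expanded: bool = False         # True once children have been computed


def expand(candidates, node, depth):
    """Grow the subtree under `node` by `depth` further stages."""
    if depth == 0:
        return
    if not node.expanded:
        node.expanded = True
        remaining = [i for i in candidates if i not in node.used]
        if remaining:
            vec = candidates[node.item]
            # ASSUMPTION: mirror the rejected vector through the reference.
            virtual = [2 * r - c for r, c in zip(node.reference, vec)]
            for branch, target in (("yes", vec), ("no", virtual)):
                best = max(remaining, key=lambda i: cosine(candidates[i], target))
                setattr(node, branch, Node(best, vec, node.used | {best}))
    for child in (node.yes, node.no):
        if child is not None:
            expand(candidates, child, depth - 1)


def session(candidates, query, answers, step=1):
    """Present items according to `answers` (True = suitable), growing lazily."""
    root_item = max(candidates, key=lambda i: cosine(candidates[i], query))
    node = Node(root_item, query, frozenset({root_item}))
    expand(candidates, node, step)         # initial tree of `step` stages
    shown = [node.item]
    for suitable in answers:
        if not node.expanded:              # assumed growth condition
            expand(candidates, node, step)
        node = node.yes if suitable else node.no
        if node is None:
            break
        shown.append(node.item)
    return shown
```

Compared with building the full tree up front, this trades a small delay at the frontier for a tree whose size follows the path the user actually takes.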

[0028] The twelfth aspect of the invention is the information retrieval program according to any of the ninth to eleventh aspects, wherein the query and the search target information are text data.
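
For text data, processes 2 and 3 could be illustrated as follows. This passage does not say how feature vectors are derived from text, so the term-frequency (bag-of-words) vectors, the whitespace tokenization, the cosine similarity, and the names `text_to_vector` and `retrieve` are all assumptions chosen only to make the threshold step concrete.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def text_to_vector(text, vocabulary):
    """ASSUMPTION: a term-frequency vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def retrieve(query, documents, threshold):
    """Return the query vector and the items whose similarity exceeds `threshold`.

    documents: dict index -> text content
    Each hit is (index, similarity, feature vector), sorted by similarity.
    """
    texts = [query, *documents.values()]
    vocabulary = sorted({w for t in texts for w in t.lower().split()})
    query_vec = text_to_vector(query, vocabulary)
    hits = []
    for index, text in documents.items():
        vec = text_to_vector(text, vocabulary)
        score = cosine(query_vec, vec)
        if score > threshold:
            hits.append((index, score, vec))
    hits.sort(key=lambda h: h[1], reverse=True)
    return query_vec, hits
```

The items returned by `retrieve` correspond to the candidate group that the later feedback steps search within.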

[0029] The recording medium according to the thirteenth aspect of the invention records an information retrieval program that causes a computer to execute: process 1 of receiving a query input by a user; process 2 of creating a feature vector from the received query; process 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying the search target information whose feature vectors show a similarity higher than a predetermined threshold, and extracting their indexes, the data representing their contents, and their feature vector data; process 4 of outputting the index and the data indicating the content of one piece of the search target information extracted in process 3; and process 5 of receiving the user's suitable/unsuitable judgment on the search result output in process 4 and (1) for a suitable input, searching the search target information extracted in process 3 for another piece whose feature vector has the highest similarity to the feature vector of the search target information currently being output and, if one is found, outputting its content as the next candidate, and (2) for an unsuitable input, calculating a virtual vector that points in the opposite direction to the feature vector of the search target information currently being output, relative to the feature vector of the query at the first stage or relative to the feature vector output at the preceding stage from the second stage onward, searching the search target information extracted in process 3 for another piece whose feature vector has the highest similarity to this virtual vector and, if one is found, outputting its content as the next candidate.

[0030] The recording medium according to the fourteenth aspect of the invention records an information retrieval program that causes a computer to execute: process 1 of receiving a query input by a user; process 2 of creating a feature vector from the received query; process 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying the search target information whose feature vectors show a similarity higher than a predetermined threshold, and extracting their indexes, the data representing their contents, and their feature vector data; process 4 of creating a decision tree over the group of high-similarity feature vectors obtained in process 3, starting from the feature vector with the highest similarity and branching to "suitable" on the assumption that the user accepts its content and to "unsuitable" on the assumption that the user does not, wherein on a suitable branch the other search target information whose feature vector has the highest similarity to that feature vector is taken as the next candidate, and on an unsuitable branch a virtual vector is calculated that points in the opposite direction to the feature vector of the rejected search target information, relative to the feature vector of the query at the first stage or relative to the feature vector of the preceding stage's output candidate from the second stage onward, and the other search target information whose feature vector has the highest similarity to this virtual vector is searched for and, if found, is taken as the next candidate, this procedure being repeated until all of the search target information extracted in process 3 is covered; process 5 of outputting the index and the data indicating the content of one piece of the search target information extracted in process 3; and process 6 of receiving the user's suitable/unsuitable judgment on the search result output in process 5, identifying the search target information to output next on the basis of the decision tree created in process 4, and outputting the content of that search target information.

[0031] The recording medium according to the fifteenth aspect of the invention records an information retrieval program that causes a computer to execute: process 1 of receiving a query input by a user; process 2 of creating a feature vector from the received query; process 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying the search target information whose feature vectors show a similarity higher than a predetermined threshold, and extracting their indexes, the data representing their contents, and their feature vector data; process 4 of creating a decision tree over the group of high-similarity feature vectors obtained in process 3, starting from the feature vector with the highest similarity and branching to "suitable" on the assumption that the user accepts its content and to "unsuitable" on the assumption that the user does not, wherein on a suitable branch the other search target information whose feature vector has the highest similarity to that feature vector is taken as the next candidate, and on an unsuitable branch a virtual vector is calculated that points in the opposite direction to the feature vector of the rejected search target information, relative to the feature vector of the query at the first stage or relative to the feature vector of the preceding stage's output candidate from the second stage onward, and the other search target information whose feature vector has the highest similarity to this virtual vector is searched for and, if found, is taken as the next candidate, this procedure being repeated up to a predetermined stage for the search target information extracted in process 3; process 5 of outputting the index and the data indicating the content of one piece of the search target information extracted in process 3; and process 6 of receiving the user's suitable/unsuitable judgment on the search result output in process 5, identifying the search target information to output next on the basis of the decision tree created in process 4, outputting the content of that search target information, and, when a condition for growing the decision tree created in process 4 is met, growing that decision tree by a predetermined number of stages.

[0032] The sixteenth aspect of the invention is the recording medium according to any of the thirteenth to fifteenth aspects, wherein the query and the search target information are text data.

[0033] The output information selection device according to the seventeenth aspect of the invention comprises: a feature data holding unit that holds feature vector data together with information representing the content of each piece of a group of search target information; a display information output unit that outputs information representing the content of a designated piece of search target information from the group; a feedback receiving unit that receives the user's suitable/unsuitable judgment on the output search target information; and an output information selection unit that (1) when the user's judgment is suitable, searches the group of search target information for another piece whose feature vector has the highest similarity to the feature vector of the search target information currently being output by the display information output unit and, if one is found, causes the display information output unit to output information representing its content, and (2) when the user's judgment is unsuitable, calculates a virtual vector that points in the opposite direction to the feature vector of the search target information currently being output by the display information output unit, relative to a reference vector given in advance at the first stage or relative to the feature vector of the search target information output at the preceding stage from the second stage onward, searches the group of search target information for another piece whose feature vector has the highest similarity to this virtual vector and, if one is found, causes the display information output unit to output information representing its content.
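
One way to picture how the units of this device fit together is the sketch below. The class and method names are hypothetical, the display unit merely records what it would show, the feedback receiving unit is reduced to a method call, and the virtual vector is again assumed to be `2 * reference - rejected` under cosine similarity with already shown items excluded.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class FeatureDataStore:
    """Feature data holding unit: content and feature vector per item id."""
    contents: dict
    vectors: dict


@dataclass
class DisplayOutput:
    """Display information output unit; here it only logs the content shown."""
    log: list = field(default_factory=list)

    def show(self, store, item_id):
        self.log.append(store.contents[item_id])


@dataclass
class OutputSelector:
    """Output information selection unit driven by the user's judgments."""
    store: FeatureDataStore
    display: DisplayOutput
    reference: list                      # reference vector given in advance
    current: str                         # id of the item currently displayed
    shown: set = field(default_factory=set)

    def feedback(self, suitable):
        """Handle one judgment; return the next item id, or None if none is left."""
        cur_vec = self.store.vectors[self.current]
        # ASSUMPTION: the virtual vector mirrors the rejected vector.
        virtual = [2 * r - c for r, c in zip(self.reference, cur_vec)]
        target = cur_vec if suitable else virtual
        self.shown.add(self.current)
        remaining = [i for i in self.store.vectors if i not in self.shown]
        if not remaining:
            return None
        nxt = max(remaining, key=lambda i: cosine(self.store.vectors[i], target))
        self.reference = cur_vec         # previous output becomes the reference
        self.current = nxt
        self.display.show(self.store, nxt)
        return nxt
```

Keeping the selection logic in its own unit means the same store and display could be driven by a precomputed decision tree instead, without changing either of them.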

[0034] The output information selection device according to the eighteenth aspect of the invention comprises: a feature data holding unit that holds feature vector data together with information representing the content of each piece of a group of search target information; a display information output unit that outputs information representing the content of a designated piece of search target information from the group; a decision tree creation unit that creates a decision tree over the group of search target information held in the feature data holding unit, starting from the search target information whose feature vector has the highest similarity to a reference vector given in advance and branching to "suitable" on the assumption that the user accepts its content and to "unsuitable" on the assumption that the user does not, wherein on a suitable branch the other search target information whose feature vector has the highest similarity to that feature vector is taken as the next candidate, and on an unsuitable branch a virtual vector is calculated that points in the opposite direction to the feature vector of the rejected search target information, relative to the reference vector at the first stage or relative to the feature vector of the preceding stage's output candidate from the second stage onward, and the other search target information whose feature vector has the highest similarity to this virtual vector is searched for and, if found, is taken as the next candidate, this procedure being repeated until all of the search target information held in the feature data holding unit is covered; and a feedback processing unit that receives the user's suitable/unsuitable judgment on the search target information being output by the display information output unit, identifies the search target information to output next on the basis of the decision tree created by the decision tree creation unit, and causes the display information output unit to output information representing its content.

[0035] The output information selection device according to the nineteenth aspect of the invention comprises: a feature data holding unit that holds feature vector data together with information representing the content of each piece of a group of search target information; a display information output unit that outputs information representing the content of a designated piece of search target information from the group; a decision tree creation unit that creates a decision tree over the group of search target information held in the feature data holding unit, starting from the search target information whose feature vector has the highest similarity to a reference vector given in advance and branching to "suitable" on the assumption that the user accepts its content and to "unsuitable" on the assumption that the user does not, wherein on a suitable branch the other search target information whose feature vector has the highest similarity to that feature vector is taken as the next candidate, and on an unsuitable branch a virtual vector is calculated that points in the opposite direction to the feature vector of the rejected search target information, relative to the reference vector at the first stage or relative to the feature vector of the preceding stage's output candidate from the second stage onward, and the other search target information whose feature vector has the highest similarity to this virtual vector is searched for and, if found, is taken as the next candidate, this procedure being repeated up to a predetermined stage for the group of search target information held in the feature data holding unit; and a feedback processing unit that receives the user's suitable/unsuitable judgment on the search target information being output by the display information output unit, identifies the search target information to output next on the basis of the decision tree created by the decision tree creation unit, causes the display information output unit to output information representing its content, and, when a condition for growing the existing decision tree is met, instructs the decision tree creation unit to grow the decision tree by a predetermined number of stages.

[0036] The output information selection method according to the twentieth aspect of the invention comprises: step 1 of holding feature vector data together with information representing the content of each piece of a group of search target information; step 2 of outputting information representing the content of a designated piece of search target information from the group; and step 3 of receiving the user's suitable/unsuitable judgment on the search target information currently being output and (1) when the user's judgment is suitable, searching the group of search target information for another piece whose feature vector has the highest similarity to the feature vector of the search target information currently being output and, if one is found, outputting information representing its content, and (2) when the user's judgment is unsuitable, calculating a virtual vector that points in the opposite direction to the feature vector of the search target information currently being output, relative to a reference vector given in advance at the first stage or relative to the feature vector of the search target information output at the preceding stage from the second stage onward, searching the group of search target information for another piece whose feature vector has the highest similarity to this virtual vector and, if one is found, outputting information representing its content.

[0037] The output information selection method according to the twenty-first aspect of the invention comprises: step 1 of holding feature vector data together with information representing the content of each piece of a group of search target information; step 2 of outputting information representing the content of a designated piece of search target information from the group; step 3 of creating a decision tree over the group of search target information, starting from the search target information whose feature vector has the highest similarity to a reference vector given in advance and branching to "suitable" on the assumption that the user accepts its content and to "unsuitable" on the assumption that the user does not, wherein on a suitable branch the other search target information whose feature vector has the highest similarity to that feature vector is taken as the next candidate, and on an unsuitable branch a virtual vector is calculated that points in the opposite direction to the feature vector of the rejected search target information, relative to the reference vector at the first stage or relative to the feature vector of the preceding stage's output candidate from the second stage onward, and the other search target information whose feature vector has the highest similarity to this virtual vector is searched for and, if found, is taken as the next candidate, this procedure being repeated until all of the group of search target information is covered; and step 4 of receiving the user's suitable/unsuitable judgment on the search target information currently being output, identifying the search target information to output next on the basis of the decision tree, and outputting information representing its content.

[0038] The output information selection method according to the twenty-second aspect of the invention comprises: step 1 of holding feature vector data together with information representing the content of each piece of a group of search target information; step 2 of outputting information representing the content of a designated piece of search target information from the group; step 3 of creating a decision tree over the group of search target information, starting from the search target information whose feature vector has the highest similarity to a reference vector given in advance and branching to "suitable" on the assumption that the user accepts its content and to "unsuitable" on the assumption that the user does not, wherein on a suitable branch the other search target information whose feature vector has the highest similarity to that feature vector is taken as the next candidate, and on an unsuitable branch a virtual vector is calculated that points in the opposite direction to the feature vector of the rejected search target information, relative to the reference vector at the first stage or relative to the feature vector of the preceding stage's output candidate from the second stage onward, and the other search target information whose feature vector has the highest similarity to this virtual vector is searched for and, if found, is taken as the next candidate, this procedure being repeated up to a predetermined stage for the group of search target information; and step 4 of receiving the user's suitable/unsuitable judgment on the search target information currently being output, identifying the search target information to output next on the basis of the decision tree, outputting information representing its content, and, when a condition for growing the existing decision tree is met, growing the decision tree by a predetermined number of stages.

[0039] The output information selection program according to the twenty-third aspect of the invention causes a computer to execute: process 1 of holding feature vector data together with information representing the content of each piece of a group of search target information; process 2 of outputting information representing the content of a designated piece of search target information from the group; and process 3 of receiving the user's suitable/unsuitable judgment on the search target information currently being output and (1) when the user's judgment is suitable, searching the group of search target information for another piece whose feature vector has the highest similarity to the feature vector of the search target information currently being output and, if one is found, outputting information representing its content, and (2) when the user's judgment is unsuitable, calculating a virtual vector that points in the opposite direction to the feature vector of the search target information currently being output, relative to a reference vector given in advance at the first stage or relative to the feature vector of the search target information output at the preceding stage from the second stage onward, searching the group of search target information for another piece whose feature vector has the highest similarity to this virtual vector and, if one is found, outputting information representing its content.

【００４０】[0040] The output information selection program of the invention of claim 24 causes a computer to execute: a process 1 of holding feature vector data together with information representing the contents of each item in a group of search target information; a process 2 of outputting information representing the contents of search target information designated from the group; a process 3 of creating a decision tree by repeating the following processing until all of the group of search target information is covered: starting from the search target information whose feature vector has the highest similarity to a reference vector given in advance, branching to "suitable" on the assumption that the user accepts its contents and to "unsuitable" on the assumption that the user does not; on a "suitable" branch, taking the other search target information whose feature vector has the highest similarity to the current feature vector as the next candidate; and on an "unsuitable" branch, calculating a virtual vector that points in the opposite direction to the feature vector of the rejected search target information, relative to the reference vector at the first stage or relative to the feature vector of the search target information that was the output candidate of the preceding stage at the second and later stages, searching for other search target information whose feature vector has the highest similarity to this virtual vector and, if such information exists, taking it as the next candidate; and a process 4 of accepting the user's suitable/unsuitable judgment input for the search target information currently being output, specifying the search target information to be output next on the basis of the decision tree, and outputting information representing its contents.

【００４１】[0041] The output information selection program of the invention of claim 25 causes a computer to execute: a process 1 of holding feature vector data together with information representing the contents of each item in a group of search target information; a process 2 of outputting information representing the contents of search target information designated from the group; a process 3 of creating a decision tree by repeating the following processing for the group of search target information up to a predetermined stage: starting from the search target information whose feature vector has the highest similarity to a reference vector given in advance, branching to "suitable" on the assumption that the user accepts its contents and to "unsuitable" on the assumption that the user does not; on a "suitable" branch, taking the other search target information whose feature vector has the highest similarity to the current feature vector as the next candidate; and on an "unsuitable" branch, calculating a virtual vector that points in the opposite direction to the feature vector of the rejected search target information, relative to the reference vector at the first stage or relative to the feature vector of the search target information that was the output candidate of the preceding stage at the second and later stages, searching for other search target information whose feature vector has the highest similarity to this virtual vector and, if such information exists, taking it as the next candidate; and a process 4 of accepting the user's suitable/unsuitable judgment input for the search target information currently being output, specifying the search target information to be output next on the basis of the decision tree, outputting information representing its contents, and, when a condition for growing the existing decision tree is reached, growing the decision tree by a predetermined number of stages.

【００４２】[0042] The recording medium of the invention of claim 26 records an output information selection program that causes a computer to execute: a process 1 of holding feature vector data together with information representing the contents of each item in a group of search target information; a process 2 of outputting information representing the contents of search target information designated from the group; and a process 3 of accepting the user's suitable/unsuitable judgment input for the search target information currently being output and, (1) when the judgment input is "suitable", searching the group for other search target information whose feature vector has the highest similarity to the feature vector of the search target information currently being output and, if such information exists, outputting information representing its contents, and (2) when the judgment input is "unsuitable", calculating a virtual vector that points in the opposite direction to the feature vector of the search target information currently being output, relative to a reference vector given in advance at the first stage or relative to the feature vector of the search target information output at the preceding stage at the second and later stages, searching the group for other search target information whose feature vector has the highest similarity to this virtual vector and, if such information exists, outputting information representing its contents.

【００４３】[0043] The recording medium of the invention of claim 27 records an output information selection program that causes a computer to execute: a process 1 of holding feature vector data together with information representing the contents of each item in a group of search target information; a process 2 of outputting information representing the contents of search target information designated from the group; a process 3 of creating a decision tree by repeating the following processing until all of the group of search target information is covered: starting from the search target information whose feature vector has the highest similarity to a reference vector given in advance, branching to "suitable" on the assumption that the user accepts its contents and to "unsuitable" on the assumption that the user does not; on a "suitable" branch, taking the other search target information whose feature vector has the highest similarity to the current feature vector as the next candidate; and on an "unsuitable" branch, calculating a virtual vector that points in the opposite direction to the feature vector of the rejected search target information, relative to the reference vector at the first stage or relative to the feature vector of the search target information that was the output candidate of the preceding stage at the second and later stages, searching for other search target information whose feature vector has the highest similarity to this virtual vector and, if such information exists, taking it as the next candidate; and a process 4 of accepting the user's suitable/unsuitable judgment input for the search target information currently being output, specifying the search target information to be output next on the basis of the decision tree, and outputting information representing its contents.

【００４４】[0044] The recording medium of the invention of claim 28 records an output information selection program that causes a computer to execute: a process 1 of holding feature vector data together with information representing the contents of each item in a group of search target information; a process 2 of outputting information representing the contents of search target information designated from the group; a process 3 of creating a decision tree by repeating the following processing for the group of search target information up to a predetermined stage: starting from the search target information whose feature vector has the highest similarity to a reference vector given in advance, branching to "suitable" on the assumption that the user accepts its contents and to "unsuitable" on the assumption that the user does not; on a "suitable" branch, taking the other search target information whose feature vector has the highest similarity to the current feature vector as the next candidate; and on an "unsuitable" branch, calculating a virtual vector that points in the opposite direction to the feature vector of the rejected search target information, relative to the reference vector at the first stage or relative to the feature vector of the search target information that was the output candidate of the preceding stage at the second and later stages, searching for other search target information whose feature vector has the highest similarity to this virtual vector and, if such information exists, taking it as the next candidate; and a process 4 of accepting the user's suitable/unsuitable judgment input for the search target information currently being output, specifying the search target information to be output next on the basis of the decision tree, outputting information representing its contents, and, when a condition for growing the existing decision tree is reached, growing the decision tree by a predetermined number of stages.

【００４５】[0045]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Embodiments of the present invention are described in detail below with reference to the drawings. The invention is characterized in that the documents obtained by a similarity search based on feature vectors are classified with the directions of their feature vectors taken into account. When the user opens one document from the group presented as the search result, the user enters YES or NO to indicate satisfaction with its contents (that is, whether the document has the content the user was looking for), and the computer automatically selects and presents the next document according to that response. In this way, search results matching the information the user wants can be presented one after another, based primarily on the degree of similarity but also taking the direction of the feature vectors into account.

【００４６】[0046] FIG. 1 shows the functional configuration of the information retrieval system of the present invention. The system is implemented on a single computer system or on a network system in which multiple computers are connected by cable. Its functional components are: a query input unit 1, through which the user enters as a query either text data expressing the content the user needs or the text data of a document containing that content; a feature vector creation unit 2, which creates a feature vector from the text data entered through the query input unit 1; a document database 3, which stores the indexes, text data, and per-document feature vector data of a large number of stored documents; a similarity calculation unit 4, which calculates the similarity between the query feature vector created by the feature vector creation unit 2 and the feature vector of each document registered in the document database 3, identifies the documents whose feature vectors show a similarity higher than a predetermined threshold, and retrieves their indexes, text data, and feature vectors; and a search result holding unit 5, which stores the data retrieved by the similarity calculation unit 4.

【００４７】[0047] The system also includes a search result output unit 7, which displays the indexes and text data of the high-similarity documents retrieved by the similarity calculation unit 4 and prints them when required, and a feedback processing unit 8, which accepts the user's feedback operations on the search results and issues the corresponding processing to the search result holding unit 5.

【００４８】[0048] Next, the information retrieval method performed by the system configured as above is described with reference to FIG. 2. When the user enters, through the query input unit 1, text data expressing the needed content, or the text data of a document containing that content, as a query (step S1), the feature vector creation unit 2 creates a feature vector from the text data of the entered query (step S3). The feature vector is created by tf*idf using Equations 1 and 2, which have long been in wide use. The feature vector may instead be created by any of the other methods listed in the description of the prior art. Next, the similarity calculation unit 4 calculates, using Equation 3 given above, the similarity between the feature vector of each of the many documents stored in the document database 3 and the query feature vector created by the feature vector creation unit 2, identifies the documents whose feature vectors show a similarity higher than a predetermined threshold, retrieves their indexes, text data, and feature vectors (step S5), and temporarily stores them in the search result holding unit 5 as search result data (step S7). The similarity may likewise be calculated by any of the other methods listed in the description of the prior art.
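The retrieval stage in steps S3 to S7 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: Equations 1 to 3 are not reproduced in this passage, so the sketch assumes the common weighting tf × log(N/df) and cosine similarity, and the names `idf_table`, `vectorize`, `cosine`, and `retrieve`, as well as the toy data, are hypothetical.

```python
import math
from collections import Counter

def idf_table(docs):
    """Inverse document frequency per term (assumed form: log(N / df))."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}

def vectorize(tokens, idf):
    """Sparse tf*idf vector; terms unknown to the collection are ignored."""
    tf = Counter(tokens)
    return {t: tf[t] * idf[t] for t in tf if t in idf}

def cosine(u, v):
    """Cosine similarity of two sparse vectors (0.0 if either has zero norm)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

def retrieve(query_vec, doc_vecs, threshold):
    """Indexes of documents whose similarity exceeds the threshold, best first."""
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    return [i for score, i in sorted(scored, reverse=True) if score > threshold]

# Toy collection of already tokenized documents (illustrative data only).
DOCS = [
    ["vector", "search", "similarity"],
    ["decision", "tree", "search"],
    ["cooking", "recipe"],
]
IDF = idf_table(DOCS)
DOC_VECS = [vectorize(doc, IDF) for doc in DOCS]
QUERY_VEC = vectorize(["vector", "similarity"], IDF)
HITS = retrieve(QUERY_VEC, DOC_VECS, threshold=0.5)
```

Only the first toy document shares terms with the query, so it is the only one that passes the threshold. The retrieved vectors would then be kept for the feedback stage, which corresponds to the role of the search result holding unit 5.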

【００４９】[0049] For the search results obtained in this way, the search result output unit 7 displays information representing the contents of the document whose feature vector has the highest similarity to the query (step S9). This information includes the document's title, author name, date of publication, and summary. The full text data of the document can also be expanded and displayed at the user's request.

【００５０】[0050] In response to this display, the user enters, with a UI device (for example, a keyboard or a pointing device such as a mouse), a judgment as to whether the document has the content the user was looking for (YES) or not (NO). The feedback processing unit 8 accepts this judgment input and performs the corresponding YES or NO processing described below (step S11).

【００５１】[0051] If the user judges that the first document presented has the content the user was looking for and enters YES, the feedback processing unit 8 searches the search result holding unit 5 for the document whose feature vector has the highest similarity to the feature vector of the document currently being output and, if such a document exists, has the search result output unit 7 display its contents (steps S13, S19, S21).

【００５２】[0052] If the user judges that the first document presented does not have the content the user was looking for and enters NO in step S11, the feedback processing unit 8 calculates a virtual vector that points in the opposite direction to the feature vector of the document currently being output, relative to the query feature vector (step S15), extracts from the documents held in the search result holding unit 5 the document whose feature vector has the highest similarity to this virtual vector, and has the search result output unit 7 display its contents (steps S17 to S21).

【００５３】[0053] When the user then enters a YES/NO judgment with the UI device for the second document, displayed in step S21, the processing of steps S11 to S21 is repeated. The same applies to the third and subsequent documents. When the search result holding unit 5 no longer holds a document with a matching feature vector, output of the search results ends (NO branch at step S19).

【００５４】[0054] The form in which search results are output by the above information retrieval method is described with reference to FIGS. 3 to 5. Assume that the document group obtained from the similarity calculation by the similarity calculation unit 4 is the one shown in FIG. 12, as in the conventional case. That is, for a query Q, the documents whose feature vectors have similarities exceeding the threshold are, in descending order of similarity, A, D, B, C, and E. Assume also that the feature vectors of the documents point in various directions, as illustrated.

【００５５】[0055] In the first search result display, as shown in FIG. 3, the contents of document A, which has the highest similarity to query Q, are displayed. Selection buttons 11 and 12, labeled "Relevant (suitable)" and "Not relevant (unsuitable)", are displayed so that the user can enter a YES/NO judgment.

【００５６】[0056] If the user checks the contents of document A and, because it has the content the user was looking for, operates the "Relevant" button 11, the document whose feature vector has the highest similarity to the feature vector of document A in FIG. 13 is then searched for among the documents of the search result, and document B is extracted. The contents of the extracted document B are then displayed as shown in FIG. 4(a).

【００５７】[0057] Conversely, if the user checks the contents of document A and operates the "Not relevant" button 12, a virtual vector A′ pointing in the direction exactly opposite to the feature vector of document A, relative to query Q (the document that served as the reference for the similarity), is obtained by Equation 4 below, as described above.

【００５８】[0058]

[Equation 4] Here, α is a constant given in advance.

【００５９】[0059] As shown in FIG. 5, this virtual vector A′ points in the direction exactly opposite to the feature vector of document A, relative to query Q.
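The body of Equation 4 does not survive in this text. One form consistent with the description, offered purely as an assumption and not as the patent's equation, places A′ on the far side of Q from A:

```latex
\vec{A}' \;=\; \vec{Q} \;-\; \alpha\,\bigl(\vec{A} - \vec{Q}\bigr),
\qquad\text{so that}\qquad
\vec{A}' - \vec{Q} \;=\; -\,\alpha\,\bigl(\vec{A} - \vec{Q}\bigr).
```

Under this assumed form with a positive α, the offset of A′ from Q is antiparallel to the offset of A from Q, and α = 1 makes A′ the mirror image of A through Q. For the second and later stages, the description replaces Q with the feature vector of the previously output document.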

【００６０】[0060] Using this virtual vector A′, the document whose feature vector is closest to it is searched for and document D is extracted; the contents of document D are displayed as shown in FIG. 4(b).

【００６１】[0061] The same applies thereafter. For example, if the user operates the "Relevant" button while the contents of document B in FIG. 4(a) are displayed, or while the contents of document D in FIG. 4(b) are displayed, the document whose feature vector has the highest similarity to the feature vector of document B or document D, respectively, is extracted from the documents of the search result.

【００６２】[0062] Conversely, if the user operates the "Not relevant" button while the contents of document B in FIG. 4(a) are displayed, a virtual vector B′ pointing in the opposite direction to the feature vector of document B, relative to the feature vector of document A (the preceding document, from which document B was derived), is obtained by Equation 4, and the document whose feature vector has the highest similarity to this virtual vector B′ is searched for. As a result, document D is extracted and its contents are displayed.

【００６３】[0063] By proceeding in the same way, documents whose contents are close to what the user needs can be presented one after another with priority, reflecting the user's judgment inputs.
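The YES/NO loop of the first embodiment can be sketched as below. Several points are assumptions rather than statements from the patent: similarity is taken to be cosine similarity, the virtual vector uses the assumed form base − α(rejected − base), documents already shown are excluded from later searches, and the two-dimensional vectors are invented so that the similarity order to Q is A, D, B, C, E as in the example. All function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two dense vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def virtual_vector(base, rejected, alpha=1.0):
    """ASSUMED form: the point on the far side of `base` from `rejected`."""
    return [b - alpha * (r - b) for b, r in zip(base, rejected)]

def most_similar(target, vectors, shown):
    """Name of the not-yet-shown document most similar to `target`, or None."""
    candidates = sorted(name for name in vectors if name not in shown)
    return max(candidates, key=lambda name: cosine(target, vectors[name]),
               default=None)

def browse(query, vectors, judge, alpha=1.0):
    """Return the presentation order; `judge(name)` returns True for YES."""
    shown = []
    previous = query                       # reference vector at the first stage
    current = most_similar(query, vectors, shown)
    while current is not None:             # stop when no candidate remains
        shown.append(current)              # "display" the document
        vec = vectors[current]
        if judge(current):                 # YES: search around this document
            target = vec
        else:                              # NO: search around the virtual vector
            target = virtual_vector(previous, vec, alpha)
        previous = vec                     # becomes the base for the next stage
        current = most_similar(target, vectors, shown)
    return shown

def unit(degrees):
    """Unit vector at the given angle (illustrative test data only)."""
    r = math.radians(degrees)
    return [math.cos(r), math.sin(r)]

QUERY = unit(45)
VECTORS = {"A": unit(50), "B": unit(55), "C": unit(62),
           "D": unit(38), "E": unit(25)}
ALL_YES = browse(QUERY, VECTORS, judge=lambda name: True)
REJECT_A = browse(QUERY, VECTORS, judge=lambda name: name != "A")
REJECT_B = browse(QUERY, VECTORS, judge=lambda name: name != "B")
```

With this invented data, accepting A leads to B, rejecting A leads to D, and rejecting B after accepting A also leads to D, which mirrors the transitions described for FIG. 4.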

【００６４】[0064] In the system shown in FIG. 1, depending on the performance of the computers used, the query input unit 1 through the similarity calculation unit 4 may be implemented as server-side functions, with the search result holding unit 5, the search result output unit 7, and the feedback processing unit 8 provided on a client connected to the server over a LAN, the Internet, or another network. Alternatively, the query input unit 1 through the search result holding unit 5 may be implemented as server-side functions, with the search result output unit 7 and the feedback processing unit 8 provided on a client connected in the same way.

【００６５】[0065] Next, an information retrieval system according to a second embodiment of the present invention is described with reference to FIG. 6. As functional components, the system of the second embodiment includes the same query input unit 1, feature vector creation unit 2, document database 3, similarity calculation unit 4, search result holding unit 5, search result output unit 7, and feedback processing unit 8 as the system of the first embodiment shown in FIG. 1. It further includes a decision tree creation unit 6, which creates a decision tree by the logical operations described later, based on the per-document feature vectors stored in the search result holding unit 5, and has the search result holding unit 5 hold the resulting decision tree data.

【００６６】[0066] Next, the information retrieval method performed by the system of the second embodiment is described with reference to the flowchart of FIG. 7. In this system too, as in the first embodiment shown in FIG. 2, when the user enters text data expressing the needed content, or the text data of a document containing that content, as a query (step S1), a feature vector is created from the text data of the entered query (step S3). The similarity between the feature vector of each of the many documents stored in the document database 3 and the query feature vector is then calculated, the documents whose feature vectors show a similarity higher than a predetermined threshold are identified, and their indexes, text data, and feature vectors are retrieved (step S5) and temporarily stored in the search result holding unit 5 as search result data (step S7).

【００６７】[0067] For the search results obtained in this way, the decision tree creation unit 6, which is the distinguishing feature of this embodiment, creates a decision tree such as that shown in FIG. 8 by the processing described later and has the search result holding unit 5 hold it (step S8). The search result output unit 7 then displays information representing the contents of the document whose feature vector has the highest similarity to the query (step S9).

【００６８】[0068] In response to this display, the user enters, with a UI device, a judgment as to whether the document has the content the user was looking for (YES) or not (NO). The feedback processing unit 8 accepts this judgment input and performs the corresponding YES or NO processing described below (step S11).

【００６９】[0069] If the user judges that the first document presented has the content the user was looking for and enters YES, the feedback processing unit 8 refers to the decision tree held in the search result holding unit 5, identifies the next document to move to when the answer for the document currently being output is YES, and has the search result output unit 7 display its contents (steps S12, S16, S18).

【００７０】[0070] If the user judges that the first document presented does not have the content the user was looking for and enters NO in step S11, the feedback processing unit 8 refers to the decision tree held in the search result holding unit 5, identifies the next document to move to when the answer for the document currently being output is NO, and has the search result output unit 7 display its contents (steps S14, S16, S18).

[0071] When the user then enters a YES/NO judgment through the UI device for the content of the second document displayed in step S18, the processing of steps S11 to S18 is repeated. The same applies to the third and subsequent documents. When no applicable document remains in the search result holding unit 5, output of the search results ends (NO branch at step S16).

[0072] Next, the decision tree creation processing performed by the decision tree creating unit 6 is described using the example decision tree of FIG. 8. Suppose that feature vectors A to E, whose similarity to the feature vector Q of the query is within the threshold, have been retrieved. The shorter the distance to the query Q, the higher the similarity to the query Q, and the direction of each vector is shown as its position relative to the query Q.
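The retrieval step assumed by this example can be sketched as follows. This is only an illustration: the text states that a shorter distance to the query Q means a higher similarity but does not name a distance measure, so Euclidean distance is an assumption here, and the function name `search_within_threshold`, the coordinates, and the threshold value are hypothetical.

```python
import math

def search_within_threshold(query, docs, max_distance):
    """Return (name, distance) pairs within max_distance of query, nearest first.

    Assumption: similarity is modelled as Euclidean distance (smaller = more similar).
    """
    hits = []
    for name, vec in docs.items():
        d = math.dist(query, vec)
        if d <= max_distance:
            hits.append((name, d))
    hits.sort(key=lambda pair: pair[1])
    return hits

# Hypothetical 2-D feature vectors; the coordinates are not taken from FIG. 8.
Q = (0.0, 0.0)
DOCS = {
    "A": (1.0, 0.0), "B": (1.0, 1.2), "C": (2.4, 0.0),
    "D": (-1.1, 0.2), "E": (0.0, -2.6), "F": (6.0, 6.0),
}
RESULT = search_within_threshold(Q, DOCS, max_distance=3.0)
```

With these made-up values, document F falls outside the threshold and the remaining documents are ordered by their distance to Q.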

[0073] For the high-similarity feature vectors A to E obtained by the similarity calculation unit 4, the decision tree creating unit 6 starts from the feature vector A, which has the highest similarity, and branches to YES if its content is assumed to be accepted by the user and to NO if it is assumed not to be accepted.

[0074] On the YES branch, the document having feature vector B, which has the highest similarity to feature vector A, is set as the document to output after the document having feature vector A. On the NO branch from the document of feature vector A, a virtual vector A′ pointing in the direction opposite to feature vector A with respect to the query Q is assumed, as described above, and the document of feature vector D, which lies closest to this virtual vector A′, is chosen as the document to display next. Branching to YES and NO is then repeated in the same way for each feature vector, the vectors to output are extracted one after another from the retrieved vector group, and a decision tree such as the one shown in FIG. 8 is created.
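The construction described above can be sketched in a few lines. The sketch rests on assumptions that the text does not spell out: similarity is taken to be Euclidean distance, and the virtual vector is computed as the mirror image of the current feature vector through the reference point (the query at the first stage, the previously output vector afterwards), that is, 2 × reference − current. All function names and coordinates are hypothetical; the coordinates were picked so that the first branches match the ones named in the text (YES from A leads to B, NO from A leads to D).

```python
import math

def nearest(target, names, vectors):
    """Name of the vector closest to target (Euclidean distance is an assumption)."""
    return min(names, key=lambda n: math.dist(target, vectors[n]))

def reflect(point, origin):
    """Assumed virtual vector: point mirrored through origin, i.e. 2*origin - point."""
    return tuple(2 * o - p for p, o in zip(point, origin))

def build_subtree(doc, remaining, reference, vectors):
    """Create the node for doc and, recursively, its YES and NO successors."""
    node = {"doc": doc, "yes": None, "no": None}
    if not remaining:          # every retrieved document is already on this path
        return node
    cur = vectors[doc]
    # YES: the remaining document most similar to the accepted one.
    yes_doc = nearest(cur, remaining, vectors)
    # NO: the remaining document closest to the virtual vector.
    no_doc = nearest(reflect(cur, reference), remaining, vectors)
    node["yes"] = build_subtree(yes_doc, remaining - {yes_doc}, cur, vectors)
    node["no"] = build_subtree(no_doc, remaining - {no_doc}, cur, vectors)
    return node

def build_decision_tree(query, vectors):
    """Start from the document closest to the query and expand all branches."""
    names = set(vectors)
    first = nearest(query, names, vectors)
    return build_subtree(first, names - {first}, query, vectors)

# Hypothetical 2-D feature vectors; not taken from FIG. 8.
Q = (0.0, 0.0)
VECTORS = {
    "A": (1.0, 0.0), "B": (1.0, 1.2), "C": (2.4, 0.0),
    "D": (-1.1, 0.2), "E": (0.0, -2.6),
}
TREE = build_decision_tree(Q, VECTORS)
```

Because every node gets both a YES and a NO successor, the number of nodes grows quickly with the number of retrieved documents, which is worth keeping in mind when the whole tree is built in advance.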

[0075] As a result, when the user checks the contents of the documents retrieved for the query Q one after another, the feedback processing unit 8 refers to the decision tree of FIG. 8, created by the decision tree creating unit 6 and held in the search result holding unit 5, and unfolds the retrieved documents as follows.

[0076] First, A is output as the document whose feature vector is closest to the query Q. If the user judges NO, the feedback processing unit 8 next has the search result output unit 7 output the document of feature vector D. If the user also judges NO for the document of feature vector D, the document of feature vector C is output next. If the user judges YES for the document of feature vector C, the document of feature vector B is output next. Finally, the document of feature vector E is output.
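The walkthrough above amounts to following one branch of a stored tree per answer. The sketch below encodes only the branches that the text names; branches it does not describe are left out. That B leads to E on either answer is an inference from E being the only document left at that point. The nested-dictionary layout and the function name `browse` are hypothetical.

```python
# Only the branches mentioned in the text; everything else is omitted.
TREE = {
    "doc": "A",
    "yes": {"doc": "B"},  # deeper branches on this side are not described
    "no": {
        "doc": "D",
        "no": {
            "doc": "C",
            "yes": {
                "doc": "B",
                # Inference: E is the only document left, so both answers lead to it.
                "yes": {"doc": "E"},
                "no": {"doc": "E"},
            },
        },
    },
}

def browse(tree, answers):
    """Return the documents shown when the user gives one answer per document."""
    node = tree
    shown = [node["doc"]]
    for answer in answers:
        node = node.get(answer)
        if node is None:   # no applicable document: output ends
            break
        shown.append(node["doc"])
    return shown

PATH = browse(TREE, ["no", "no", "yes", "yes"])
```

Since the tree is prepared beforehand, each user answer costs only a dictionary lookup at browsing time.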

[0077] The flowchart of FIG. 9 shows the decision tree creation processing of the decision tree creating unit 6 in general form. A document group whose feature vectors have a similarity within the threshold to the initially entered query Q is retrieved first, and the processing of the flowchart of FIG. 9 is then repeated within that document group to create the decision tree in advance.

[0078] According to this second embodiment, particularly in a system equipped with a server capable of high-speed processing, a user who accesses the server as a client to receive the search service can browse the search results at high speed, in descending order of priority, even if the user's own client machine lacks high-speed processing capability.

[0079] Next, an information retrieval system according to a third embodiment of the present invention is described with reference to the flowchart of FIG. 10 and the operation diagram of FIG. 11. The functional configuration of the information retrieval system of the third embodiment is the same as that of the second embodiment shown in FIG. 6. However, the decision tree creating unit 6 is characterized in that, rather than creating a decision tree such as the one shown in FIG. 8 for the entire document group extracted by the similarity calculation unit 4, it grows the decision tree a predetermined number of stages at a time in response to the user's judgment input.

[0080] That is, if the document group having feature vectors within the threshold obtained for the query Q by the similarity calculation unit 4 is the one shown in FIG. 13, the decision tree creating unit 6 first creates a decision tree of, for example, up to three stages, as shown in FIG. 11(a) (step S31). The number of stages is not fixed: it may be as few as two, or four or more if the search result contains many documents.

[0081] If the user judges the content of the first-stage document A to be relevant (YES) (steps S33 to S37), the direction in which the documents will unfold is largely settled, so the decision tree is grown as shown in FIG. 11(b) (steps S39, S41, S43). Document B, whose feature vector is at the shortest distance from the feature vector of document A, is then extracted and output (steps S45, S35).

[0082] In step S41, the condition for growing the decision tree is, for example, that unpresented documents remain in the retrieved document group but the decision tree built so far has been followed to its terminal document. Alternatively, the decision tree is grown by one stage, or by a suitable number of stages, each time one stage or a predetermined suitable number of stages of candidate documents has been presented. This condition for advancing stages can be decided in advance, for example as the point at which one stage or a suitable number of stages of documents has been presented.
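The staged growth of the third embodiment can be sketched as follows. Several points are assumptions made for illustration only: Euclidean distance stands in for similarity, the virtual vector is taken as 2 × reference − current, and growth is triggered by the first condition mentioned above (the user reaches a terminal node while unpresented documents remain). The names `make_node`, `grow`, `browse_lazily`, and the coordinates are hypothetical.

```python
import math

def nearest(target, names, vectors):
    """Name of the vector closest to target (Euclidean distance is an assumption)."""
    return min(names, key=lambda n: math.dist(target, vectors[n]))

def reflect(point, origin):
    """Assumed virtual vector: point mirrored through origin, i.e. 2*origin - point."""
    return tuple(2 * o - p for p, o in zip(point, origin))

def make_node(doc, remaining, reference):
    return {"doc": doc, "remaining": remaining, "reference": reference,
            "yes": None, "no": None}

def grow(node, vectors, stages):
    """Add up to `stages` further stages below node, reusing existing branches."""
    if stages == 0 or not node["remaining"]:
        return
    cur = vectors[node["doc"]]
    targets = {"yes": cur, "no": reflect(cur, node["reference"])}
    for answer, target in targets.items():
        if node[answer] is None:
            nxt = nearest(target, node["remaining"], vectors)
            node[answer] = make_node(nxt, node["remaining"] - {nxt}, cur)
        grow(node[answer], vectors, stages - 1)

def count_nodes(node):
    """Number of nodes created so far in the (partial) tree."""
    if node is None:
        return 0
    return 1 + count_nodes(node["yes"]) + count_nodes(node["no"])

def browse_lazily(query, vectors, answers, initial_stages=3, step=1):
    """Build a few stages up front, then grow only below the node the user is on."""
    names = set(vectors)
    first = nearest(query, names, vectors)
    root = make_node(first, names - {first}, query)
    grow(root, vectors, initial_stages - 1)    # initial partial tree
    node, shown = root, [first]
    for answer in answers:
        if node[answer] is None:               # reached the end of the built tree
            grow(node, vectors, step)
        node = node[answer]
        if node is None:                       # no unpresented document remains
            break
        shown.append(node["doc"])
    return shown, root

# Hypothetical 2-D feature vectors; not taken from FIG. 11 or FIG. 13.
Q = (0.0, 0.0)
VECTORS = {
    "A": (1.0, 0.0), "B": (1.0, 1.2), "C": (2.4, 0.0),
    "D": (-1.1, 0.2), "E": (0.0, -2.6),
}
SHOWN, ROOT = browse_lazily(Q, VECTORS, ["no", "no", "yes", "yes"])
```

In this run only the nodes near the path the user actually follows are created, whereas building every branch for the same five documents in advance would create 31 nodes.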

[0083] According to this third embodiment, the user's judgments are fed back and the decision tree is created only for the branches likely to be followed, so unnecessary computation is avoided and the computational load on the server is reduced.

[0084] In each of the above embodiments, the search service and the computation are all executed on the server machine. However, all or part of the processing functions of the search result holding unit 5 through the feedback processing unit 8, including the decision tree creating unit 6, may instead be provided on the client machine side and connected to the server machine side via a network.

[0085] In the system shown in FIG. 6, depending on the performance of the computers used, the query input unit 1 through the similarity calculation unit 4 may be server-side functions, with the search result holding unit 5, the decision tree creating unit 6, the search result output unit 7, and the feedback processing unit 8 provided on a client connected to the server via a LAN, the Internet, or another network. Alternatively, the query input unit 1 through the decision tree creating unit 6 may be server-side functions, with the search result output unit 7 and the feedback processing unit 8 provided on a client connected via a LAN, the Internet, or another network.

[0086] The technical scope of the present invention also covers a program that implements the processing functions of the information retrieval system of each of the above embodiments, a computer-readable recording medium on which that program is recorded, and the information retrieval method that a computer system executes when the program is installed on a computer.

[0087] Furthermore, although each of the above embodiments illustrates document information retrieval based on text data, the invention is not limited to this and can also be applied to retrieval of music, film, and image information whose content is expressed as document information. In addition, for music information and image information, the data itself may be converted into feature vectors and registered in the database together with text data such as the index, title, author, publisher, and distributor. If data representing music or image information is then entered directly as a query and converted into a feature vector, music or image information whose feature vectors have a high similarity is extracted, and the data, feature vectors, index, title, and so on of that music or image information are fetched from the database, a decision-tree-based search like that of the above embodiments becomes possible.

【００８８】[0088]

[Effects of the Invention]

As described above, according to the present invention, a group of information whose feature vectors have a similarity higher than a threshold to the query entered by the user is extracted, and the content of the information having the feature vector with the highest similarity is presented first. If the user judges that information suitable (YES), information having a feature vector highly similar to the feature vector of that information is presented next. Conversely, if the user judges it unsuitable (NO), a virtual vector pointing in the direction opposite to that feature vector is used, and information having a feature vector highly similar to the virtual vector is presented next. This procedure is repeated, so that the information the user is looking for is predicted and presented preferentially. Consequently, even when a large amount of information is highly similar to what the user needs, information with the content the user needs can be selected automatically and presented in sequence in response to the user's suitable/unsuitable judgment on each single piece of information presented.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the functional configuration of an information retrieval system according to a first embodiment of the present invention.

FIG. 2 is a flowchart of the search processing and search result output processing of the above embodiment.

FIG. 3 is an explanatory diagram of a display example of search results in the above embodiment.

FIG. 4 is an explanatory diagram showing the transitions of the search result display output in the above embodiment.

FIG. 5 is an explanatory diagram of the search and output processing of the above embodiment.

FIG. 6 is a block diagram showing the functional configuration of an information retrieval system according to a second embodiment of the present invention.

FIG. 7 is a flowchart of the search processing and search result output processing of the above embodiment.

FIG. 8 is an explanatory diagram of the decision tree creation processing applied to the search results in the above embodiment.

FIG. 9 is a flowchart of the decision tree creation processing applied to the search results in the above embodiment.

FIG. 10 is a flowchart of the decision tree creation processing applied to the search results by an information retrieval system according to a third embodiment of the present invention.

FIG. 11 is an explanatory diagram of the decision tree creation processing applied to the search results in the above embodiment.

FIG. 12 is an explanatory diagram showing an output example of similarity-based search results in a conventional example.

FIG. 13 is an explanatory diagram of similarity-based search processing in a conventional example.

[Explanation of symbols]

1 Query input unit
2 Feature vector creating unit
3 Document database
4 Similarity calculation unit
5 Search result holding unit
6 Decision tree creating unit
7 Search result output unit
8 Feedback processing unit


Claims

[Claims]

1. An information retrieval system comprising: a query input unit for receiving a query input by a user; a feature vector creating unit for creating a feature vector from the query received by the query input unit; an information database in which an index, data representing the content, and feature vector data are registered for each of a large number of pieces of search target information; a similarity calculation unit for calculating the similarity between the feature vector of the query created by the feature vector creating unit and the feature vector of each piece of search target information registered in the information database, identifying search target information whose feature vector shows a similarity higher than a predetermined threshold, and extracting its index, data representing its content, and feature vector data; a search result holding unit for holding the data extracted by the similarity calculation unit; a search result output unit for outputting data indicating the index and content of one piece of the search target information retrieved by the similarity calculation unit; and a feedback processing unit for accepting the user's suitable/unsuitable judgment input for the search result output by the search result output unit and, (1) in the case of a suitable input, searching the search target information held in the search result holding unit for other search target information having the feature vector with the highest similarity to the feature vector of the search target information currently being output and, if such search target information exists, having the search result output unit output its content as the next candidate, and, (2) in the case of an unsuitable input, calculating a virtual vector pointing in the direction opposite to the feature vector of the search target information currently being output, with respect to the feature vector of the query at the first stage and with respect to the feature vector output at the preceding stage at the second and subsequent stages, searching the search target information held in the search result holding unit for other search target information having the feature vector with the highest similarity to this virtual vector and, if such search target information exists, having the search result output unit output its content as the next candidate.

2. An information retrieval system comprising: a query input unit for receiving a query input by a user; a feature vector creating unit for creating a feature vector from the query received by the query input unit; an information database in which an index, data representing the content, and feature vector data are registered for each of a large number of pieces of search target information; a similarity calculation unit for calculating the similarity between the feature vector of the query created by the feature vector creating unit and the feature vector of each piece of search target information registered in the information database, identifying search target information whose feature vector shows a similarity higher than a predetermined threshold, and extracting its index, data representing its content, and feature vector data; a search result holding unit for holding the data extracted by the similarity calculation unit; a decision tree creating unit for creating a decision tree by repeating the following process until all the search target information held in the search result holding unit is covered: for the group of high-similarity feature vectors obtained by the similarity calculation unit, starting from the feature vector with the highest similarity, branching to suitable if its content is assumed to be accepted by the user and to unsuitable if it is not; when branching to suitable, setting other search target information having the feature vector with the highest similarity to that feature vector as the next candidate search target information; and when branching to unsuitable from that feature vector, calculating a virtual vector pointing in the direction opposite to the feature vector of the unsuitable search target information, with respect to the feature vector of the query at the first stage and with respect to the feature vector that was the output candidate at the preceding stage at the second and subsequent stages, searching for other search target information having the feature vector with the highest similarity to this virtual vector and, if such search target information exists, setting it as the next candidate search target information; a search result output unit for outputting data indicating the index and content of one piece of the search target information extracted by the similarity calculation unit; and a feedback processing unit for accepting the user's suitable/unsuitable judgment input for the search result output by the search result output unit, identifying the search target information to output next on the basis of the decision tree created by the decision tree creating unit, and having the search result output unit output the content of that search target information.

3. An information retrieval system comprising: a query input unit for receiving a query input by a user; a feature vector creating unit for creating a feature vector from the query received by the query input unit; an information database in which an index, data representing the content, and feature vector data are registered for each of a large number of pieces of search target information; a similarity calculation unit for calculating the similarity between the feature vector of the query created by the feature vector creating unit and the feature vector of each piece of search target information registered in the information database, identifying search target information whose feature vector shows a similarity higher than a predetermined threshold, and extracting its index, data representing its content, and feature vector data; a search result holding unit for holding the data extracted by the similarity calculation unit; a decision tree creating unit for creating a decision tree by repeating the following process, up to a predetermined stage, on the group of search target information held in the search result holding unit: for the group of high-similarity feature vectors obtained by the similarity calculation unit, starting from the feature vector with the highest similarity, branching to suitable if its content is assumed to be accepted by the user and to unsuitable if it is not; when branching to suitable, setting other search target information having the feature vector with the highest similarity to that feature vector as the next candidate search target information; and when branching to unsuitable from that feature vector, calculating a virtual vector pointing in the direction opposite to the feature vector of the unsuitable search target information, with respect to the feature vector of the query at the first stage and with respect to the feature vector that was the output candidate at the preceding stage at the second and subsequent stages, searching for other search target information having the feature vector with the highest similarity to this virtual vector and, if such search target information exists, setting it as the next candidate search target information; a search result output unit for outputting data indicating the index and content of one piece of the search target information extracted by the similarity calculation unit; and a feedback processing unit for accepting the user's suitable/unsuitable judgment input for the search result output by the search result output unit, identifying the search target information to output next on the basis of the decision tree created by the decision tree creating unit, having the search result output unit output the content of that search target information, and, when a condition for growing the existing decision tree is reached, instructing the decision tree creating unit to grow the decision tree by a predetermined number of stages.

4. The information search system according to claim 1, wherein the query and the search target information are text data.

5. An information retrieval method comprising: step 1 of receiving a query input by a user; step 2 of creating a feature vector from the received query; step 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying search target information whose feature vector shows a similarity higher than a predetermined threshold, and extracting its index, data representing its content, and feature vector data; step 4 of outputting data indicating the index and content of one piece of the search target information extracted in step 3; and step 5 of accepting the user's suitable/unsuitable judgment input for the search result output in step 4 and, (1) in the case of a suitable input, searching the search target information extracted in step 3 for other search target information having the feature vector with the highest similarity to the feature vector of the search target information currently being output and, if such search target information exists, outputting its content as the next candidate, and, (2) in the case of an unsuitable input, calculating a virtual vector pointing in the direction opposite to the feature vector of the search target information currently being output, with respect to the feature vector of the query at the first stage and with respect to the feature vector output at the preceding stage at the second and subsequent stages, searching the search target information extracted in step 3 for other search target information having the feature vector with the highest similarity to this virtual vector and, if such search target information exists, outputting its content as the next candidate.

6. An information retrieval method comprising: step 1 of receiving a query input by a user; step 2 of creating a feature vector from the received query; step 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying search target information whose feature vector shows a similarity higher than a predetermined threshold, and extracting its index, data representing its content, and feature vector data; step 4 of creating a decision tree by repeating the following process until all the search target information extracted in step 3 is covered: for the group of high-similarity feature vectors obtained in step 3, starting from the feature vector with the highest similarity, branching to suitable if its content is assumed to be accepted by the user and to unsuitable if it is not; when branching to suitable, setting other search target information having the feature vector with the highest similarity to that feature vector as the next candidate search target information; and when branching to unsuitable from that feature vector, calculating a virtual vector pointing in the direction opposite to the feature vector of the unsuitable search target information, with respect to the feature vector of the query at the first stage and with respect to the feature vector that was the output candidate at the preceding stage at the second and subsequent stages, searching for other search target information having the feature vector with the highest similarity to this virtual vector and, if such search target information exists, setting it as the next candidate search target information; step 5 of outputting data indicating the index and content of one piece of the search target information extracted in step 3; and step 6 of accepting the user's suitable/unsuitable judgment input for the search result output in step 5, identifying the search target information to output next on the basis of the decision tree created in step 4, and outputting the content of that search target information.

7. An information retrieval method comprising: step 1 of receiving a query input by a user; step 2 of creating a feature vector from the received query; step 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying search target information whose feature vector shows a similarity higher than a predetermined threshold, and extracting its index, data representing its content, and feature vector data; step 4 of creating a decision tree by repeating the following process, up to a predetermined stage, on the search target information extracted in step 3: for the group of high-similarity feature vectors obtained in step 3, starting from the feature vector with the highest similarity, branching to suitable if its content is assumed to be accepted by the user and to unsuitable if it is not; when branching to suitable, setting other search target information having the feature vector with the highest similarity to that feature vector as the next candidate search target information; and when branching to unsuitable from that feature vector, calculating a virtual vector pointing in the direction opposite to the feature vector of the unsuitable search target information, with respect to the feature vector of the query at the first stage and with respect to the feature vector that was the output candidate at the preceding stage at the second and subsequent stages, searching for other search target information having the feature vector with the highest similarity to this virtual vector and, if such search target information exists, setting it as the next candidate search target information; step 5 of outputting data indicating the index and content of one piece of the search target information extracted in step 3; and step 6 of accepting the user's suitable/unsuitable judgment input for the search result output in step 5, identifying the search target information to output next on the basis of the decision tree created in step 4, outputting the content of that search target information, and, when a condition for growing the decision tree created in step 4 is reached, growing the decision tree by a predetermined number of stages.

8. The information search method according to claim 5, wherein the query and the search target information are text data.

9. An information retrieval program for causing a computer to execute: process 1 of receiving a query input by a user; process 2 of creating a feature vector from the received query; process 3 of calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in an information database, identifying search target information whose feature vector shows a similarity higher than a predetermined threshold, and extracting its index, data representing its content, and feature vector data; process 4 of outputting data indicating the index and content of one piece of the search target information extracted in process 3; and process 5 of accepting the user's suitable/unsuitable judgment input for the search result output in process 4 and, (1) in the case of a suitable input, searching the search target information extracted in process 3 for other search target information having the feature vector with the highest similarity to the feature vector of the search target information currently being output and, if such search target information exists, outputting its content as the next candidate, and, (2) in the case of an unsuitable input, calculating a virtual vector pointing in the direction opposite to the feature vector of the search target information currently being output, with respect to the feature vector of the query at the first stage and with respect to the feature vector output at the preceding stage at the second and subsequent stages, searching the search target information extracted in process 3 for other search target information having the feature vector with the highest similarity to this virtual vector and, if such search target information exists, outputting its content as the next candidate.

10. An information search program for causing a computer to execute: a process 1 for receiving a query input by a user; a process 2 for creating a feature vector from the received query; a process 3 for calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in the information database, identifying the search target information whose feature vector has a similarity higher than a predetermined threshold, and extracting its index, the data representing its content, and its feature vector data; a process 4 for creating a decision tree from the group of high-similarity feature vectors obtained in process 3 by starting with the feature vector having the highest similarity, branching to "appropriate" on the assumption that its content is accepted by the user and to "inappropriate" on the assumption that it is not, setting, on an appropriate branch, other search target information whose feature vector has the highest similarity to that feature vector as the next candidate search target information, calculating, on an inappropriate branch, a virtual vector that points in the direction opposite to the feature vector of the inappropriate search target information, relative to the feature vector of the query in the first stage and relative to the feature vector of the output candidate of the preceding stage in the second and subsequent stages, searching for other search target information whose feature vector has the highest similarity to the virtual vector, and, if such search target information exists, setting it as the next candidate search target information, this procedure being repeated until all the search target information extracted in process 3 is covered; a process 5 for outputting data indicating the index and content of one piece of the search target information extracted in process 3; and a process 6 for accepting the user's appropriate/inappropriate judgment on the search result output in process 5, specifying the search target information to be output next based on the decision tree created in process 4, and outputting the content of that search target information.
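
The decision tree of process 4 and the lookup of process 6 can be sketched as below. This is a sketch under stated assumptions rather than the claimed implementation: cosine similarity, the reflection 2*base - vec as the virtual vector, and the reading of "covered" as "no unseen item remains on the current path" are all assumptions, and `Node`, `build_tree`, and `follow` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Optional

def cosine(a, b):
    """Cosine similarity; assumed here as the similarity measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

@dataclass
class Node:
    item: str                      # index of the item output at this node
    yes: Optional["Node"] = None   # next candidate if the user accepts
    no: Optional["Node"] = None    # next candidate if the user rejects

def build_tree(target, base, pool, shown=frozenset()):
    """Process 4: subtree rooted at the unseen item most similar to `target`."""
    remaining = [i for i in pool if i not in shown]
    if not remaining:
        return None  # every extracted item is covered on this path
    item = max(remaining, key=lambda i: cosine(target, pool[i]))
    vec, seen = pool[item], shown | {item}
    virtual = [2 * b - v for b, v in zip(base, vec)]  # assumed reflection form
    # In later stages the base is the vector output in the preceding stage.
    return Node(item,
                yes=build_tree(vec, vec, pool, seen),
                no=build_tree(virtual, vec, pool, seen))

def follow(node, accepted):
    """Process 6: look up the next item from the prebuilt tree."""
    return node.yes if accepted else node.no
```

The root is built with `build_tree(query_vec, query_vec, pool)`, so the query serves as both the first search target and the first-stage base. Because every path in this sketch lists each remaining item, a full tree over n items can hold up to 2^n - 1 nodes.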

11. An information search program for causing a computer to execute: a process 1 for receiving a query input by a user; a process 2 for creating a feature vector from the received query; a process 3 for calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in the information database, identifying the search target information whose feature vector has a similarity higher than a predetermined threshold, and extracting its index, the data representing its content, and its feature vector data; a process 4 for creating a decision tree from the group of high-similarity feature vectors obtained in process 3 by starting with the feature vector having the highest similarity, branching to "appropriate" on the assumption that its content is accepted by the user and to "inappropriate" on the assumption that it is not, setting, on an appropriate branch, other search target information whose feature vector has the highest similarity to that feature vector as the next candidate search target information, calculating, on an inappropriate branch, a virtual vector that points in the direction opposite to the feature vector of the inappropriate search target information, relative to the feature vector of the query in the first stage and relative to the feature vector of the output candidate of the preceding stage in the second and subsequent stages, searching for other search target information whose feature vector has the highest similarity to the virtual vector, and, if such search target information exists, setting it as the next candidate search target information, this procedure being repeated up to a predetermined stage for the search target information extracted in process 3; a process 5 for outputting data indicating the index and content of one piece of the search target information extracted in process 3; and a process 6 for accepting the user's appropriate/inappropriate judgment on the search result output in process 5, specifying the search target information to be output next based on the decision tree created in process 4, outputting the content of that search target information, and growing the decision tree by a predetermined number of stages when a condition for growing the decision tree created in process 4 is reached.
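
A staged variant, in which the tree is built only to a limited depth and extended on demand, might look like the following sketch. The claim leaves the growth condition open; here it is assumed to be "feedback arrives at a node whose branches have not been built yet". Cosine similarity and the reflection 2*base - vec are likewise assumptions, and `Node`, `pick`, `grow`, and `step` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Optional

def cosine(a, b):
    """Cosine similarity; assumed here as the similarity measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

@dataclass
class Node:
    item: str          # index of the item output at this node
    base: list         # vector a rejection of this item is reflected about
    shown: frozenset   # items already used on this path, including `item`
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    expanded: bool = False  # True once both branches have been computed

def pick(target, base, pool, shown):
    """Create a node for the unseen item most similar to `target`, if any."""
    remaining = [i for i in pool if i not in shown]
    if not remaining:
        return None
    item = max(remaining, key=lambda i: cosine(target, pool[i]))
    return Node(item, base, shown | {item})

def grow(node, pool, levels):
    """Add up to `levels` further stages below `node`."""
    if node is None or levels == 0:
        return
    if not node.expanded:
        vec = pool[node.item]
        virtual = [2 * b - v for b, v in zip(node.base, vec)]  # assumed form
        node.yes = pick(vec, vec, pool, node.shown)
        node.no = pick(virtual, vec, pool, node.shown)
        node.expanded = True
    grow(node.yes, pool, levels - 1)
    grow(node.no, pool, levels - 1)

def step(node, accepted, pool, levels=1):
    """Process 6: follow the feedback, growing the tree when its edge is reached."""
    if not node.expanded:  # assumed growth condition
        grow(node, pool, levels)
    return node.yes if accepted else node.no
```

A tree with S initial stages would be created as `root = pick(q, q, pool, frozenset())` followed by `grow(root, pool, S - 1)`. Only the branches the user actually reaches are extended afterwards, which keeps the number of precomputed nodes bounded by the chosen depth.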

12. The information search program according to claim 9, wherein the query and the search target information are text data.

13. A computer-readable recording medium on which is recorded an information search program for causing a computer to execute: a process 1 for receiving a query input by a user; a process 2 for creating a feature vector from the received query; a process 3 for calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in the information database, identifying the search target information whose feature vector has a similarity higher than a predetermined threshold, and extracting its index, the data representing its content, and its feature vector data; a process 4 for outputting data indicating the index and content of one piece of the search target information extracted in process 3; and a process 5 for accepting the user's appropriate/inappropriate judgment on the search result output in process 4, and (1) in the case of an appropriate judgment, searching the search target information extracted in process 3 for other search target information whose feature vector has the highest similarity to the feature vector of the search target information currently being output, (2) in the case of an inappropriate judgment, calculating a virtual vector that points in the direction opposite to the feature vector of the search target information currently being output, relative to the feature vector of the query in the first stage and relative to the feature vector of the search target information output in the previous stage in the second and subsequent stages, and searching the search target information extracted in process 3 for other search target information whose feature vector has the highest similarity to the virtual vector, and, if such search target information exists, outputting its content as the next candidate.

14. A computer-readable recording medium on which is recorded an information search program for causing a computer to execute: a process 1 for receiving a query input by a user; a process 2 for creating a feature vector from the received query; a process 3 for calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in the information database, identifying the search target information whose feature vector has a similarity higher than a predetermined threshold, and extracting its index, the data representing its content, and its feature vector data; a process 4 for creating a decision tree from the group of high-similarity feature vectors obtained in process 3 by starting with the feature vector having the highest similarity, branching to "appropriate" on the assumption that its content is accepted by the user and to "inappropriate" on the assumption that it is not, setting, on an appropriate branch, other search target information whose feature vector has the highest similarity to that feature vector as the next candidate search target information, calculating, on an inappropriate branch, a virtual vector that points in the direction opposite to the feature vector of the inappropriate search target information, relative to the feature vector of the query in the first stage and relative to the feature vector of the output candidate of the preceding stage in the second and subsequent stages, searching for other search target information whose feature vector has the highest similarity to the virtual vector, and, if such search target information exists, setting it as the next candidate search target information, this procedure being repeated until all the search target information extracted in process 3 is covered; a process 5 for outputting data indicating the index and content of one piece of the search target information extracted in process 3; and a process 6 for accepting the user's appropriate/inappropriate judgment on the search result output in process 5, specifying the search target information to be output next based on the decision tree created in process 4, and outputting the content of that search target information.

15. A computer-readable recording medium on which is recorded an information search program for causing a computer to execute: a process 1 for receiving a query input by a user; a process 2 for creating a feature vector from the received query; a process 3 for calculating the similarity between the feature vector of the created query and the feature vector of each piece of search target information registered in the information database, identifying the search target information whose feature vector has a similarity higher than a predetermined threshold, and extracting its index, the data representing its content, and its feature vector data; a process 4 for creating a decision tree from the group of high-similarity feature vectors obtained in process 3 by starting with the feature vector having the highest similarity, branching to "appropriate" on the assumption that its content is accepted by the user and to "inappropriate" on the assumption that it is not, setting, on an appropriate branch, other search target information whose feature vector has the highest similarity to that feature vector as the next candidate search target information, calculating, on an inappropriate branch, a virtual vector that points in the direction opposite to the feature vector of the inappropriate search target information, relative to the feature vector of the query in the first stage and relative to the feature vector of the output candidate of the preceding stage in the second and subsequent stages, searching for other search target information whose feature vector has the highest similarity to the virtual vector, and, if such search target information exists, setting it as the next candidate search target information, this procedure being repeated up to a predetermined stage for the search target information extracted in process 3; a process 5 for outputting data indicating the index and content of one piece of the search target information extracted in process 3; and a process 6 for accepting the user's appropriate/inappropriate judgment on the search result output in process 5, specifying the search target information to be output next based on the decision tree created in process 4, outputting the content of that search target information, and growing the decision tree by a predetermined number of stages when a condition for growing the decision tree created in process 4 is reached.

16. The computer-readable recording medium according to claim 13, wherein the query and the search target information are text data.

17. An output information selection device comprising: a feature data holding unit that holds, for each piece of a group of search target information, feature vector data together with information representing its content; a display information output unit that outputs information representing the content of search target information specified from the group of search target information; a feedback receiving unit that receives the user's appropriate/inappropriate judgment on the output search target information; and an output information selection unit that (1) when the user's judgment is appropriate, searches the group of search target information for other search target information whose feature vector has the highest similarity to the feature vector of the search target information currently being output by the display information output unit and, if such search target information exists, causes the display information output unit to output the information representing its content, and (2) when the user's judgment is inappropriate, calculates a virtual vector that points in the direction opposite to the feature vector of the search target information currently being output by the display information output unit, relative to a reference vector given in advance in the first stage and relative to the feature vector of the search target information output in the previous stage in the second and subsequent stages, searches the group of search target information for other search target information whose feature vector has the highest similarity to the virtual vector and, if such search target information exists, causes the display information output unit to output the information representing its content.

18. An output information selection device comprising: a feature data holding unit that holds, for each piece of a group of search target information, feature vector data together with information representing its content; a display information output unit that outputs information representing the content of search target information specified from the group of search target information; a decision tree creating unit that creates a decision tree for the group of search target information held in the feature data holding unit by starting with the search target information whose feature vector has the highest similarity to a reference vector given in advance, branching to "appropriate" on the assumption that its content is accepted by the user and to "inappropriate" on the assumption that it is not, setting, on an appropriate branch, other search target information whose feature vector has the highest similarity to that feature vector as the next candidate search target information, calculating, on an inappropriate branch, a virtual vector that points in the direction opposite to the feature vector of the inappropriate search target information, relative to the reference vector in the first stage and relative to the feature vector of the search target information that was the output candidate of the preceding stage in the second and subsequent stages, searching for other search target information whose feature vector has the highest similarity to the virtual vector and, if such search target information exists, setting it as the next candidate search target information, this procedure being repeated until all the search target information held in the feature data holding unit is covered; and a feedback processing unit that accepts the user's appropriate/inappropriate judgment on the search target information output by the display information output unit, specifies the search target information to be output next based on the decision tree created by the decision tree creating unit, and causes the display information output unit to output information representing the content of that search target information.

19. An output information selection device comprising: a feature data holding unit that holds, for each piece of a group of search target information, feature vector data together with information representing its content; a display information output unit that outputs information representing the content of search target information specified from the group of search target information; a decision tree creating unit that creates a decision tree for the group of search target information held in the feature data holding unit by starting with the search target information whose feature vector has the highest similarity to a reference vector given in advance, branching to "appropriate" on the assumption that its content is accepted by the user and to "inappropriate" on the assumption that it is not, setting, on an appropriate branch, other search target information whose feature vector has the highest similarity to that feature vector as the next candidate search target information, calculating, on an inappropriate branch, a virtual vector that points in the direction opposite to the feature vector of the inappropriate search target information, relative to the reference vector in the first stage and relative to the feature vector of the search target information that was the output candidate of the preceding stage in the second and subsequent stages, searching for other search target information whose feature vector has the highest similarity to the virtual vector and, if such search target information exists, setting it as the next candidate search target information, this procedure being repeated up to a predetermined stage for the group of search target information; and a feedback processing unit that accepts the user's appropriate/inappropriate judgment on the search target information output by the display information output unit, specifies the search target information to be output next based on the decision tree created by the decision tree creating unit, causes the display information output unit to output information representing the content of that search target information, and instructs the decision tree creating unit to grow the decision tree by a predetermined number of stages when a condition for growing the existing decision tree is reached.

20. An output information selection method comprising: a step 1 of holding, for each piece of a group of search target information, feature vector data together with information representing its content; a step 2 of outputting information representing the content of search target information specified from the group of search target information; and a step 3 of accepting the user's appropriate/inappropriate judgment on the search target information currently being output, and (1) when the user's judgment is appropriate, searching the group of search target information for other search target information whose feature vector has the highest similarity to the feature vector of the search target information currently being output and, if such search target information exists, outputting information representing its content, and (2) when the user's judgment is inappropriate, calculating a virtual vector that points in the direction opposite to the feature vector of the search target information currently being output, relative to a reference vector given in advance in the first stage and relative to the feature vector of the search target information output in the previous stage in the second and subsequent stages, searching the group of search target information for other search target information whose feature vector has the highest similarity to the virtual vector and, if such search target information exists, outputting information representing its content.

21. An output information selection method comprising: a step 1 of holding, for each piece of a group of search target information, feature vector data together with information representing its content; a step 2 of outputting information representing the content of search target information specified from the group of search target information; a step 3 of creating a decision tree for the group of search target information by starting with the search target information whose feature vector has the highest similarity to a reference vector given in advance, branching to "appropriate" on the assumption that its content is accepted by the user and to "inappropriate" on the assumption that it is not, setting, on an appropriate branch, other search target information whose feature vector has the highest similarity to that feature vector as the next candidate search target information, calculating, on an inappropriate branch, a virtual vector that points in the direction opposite to the feature vector of the inappropriate search target information, relative to the reference vector in the first stage and relative to the feature vector of the search target information that was the output candidate of the preceding stage in the second and subsequent stages, searching for other search target information whose feature vector has the highest similarity to the virtual vector and, if such search target information exists, setting it as the next candidate search target information, this procedure being repeated until the entire group of search target information is covered; and a step 4 of accepting the user's appropriate/inappropriate judgment on the search target information currently being output, specifying the search target information to be output next based on the decision tree, and outputting information representing the content of that search target information.

22. An output information selection method comprising: a step 1 of holding, for each piece of a group of search target information, feature vector data together with information representing its content; a step 2 of outputting information representing the content of search target information specified from the group of search target information; a step 3 of creating a decision tree for the group of search target information by starting with the search target information whose feature vector has the highest similarity to a reference vector given in advance, branching to "appropriate" on the assumption that its content is accepted by the user and to "inappropriate" on the assumption that it is not, setting, on an appropriate branch, other search target information whose feature vector has the highest similarity to that feature vector as the next candidate search target information, calculating, on an inappropriate branch, a virtual vector that points in the direction opposite to the feature vector of the inappropriate search target information, relative to the reference vector in the first stage and relative to the feature vector of the search target information that was the output candidate of the preceding stage in the second and subsequent stages, searching for other search target information whose feature vector has the highest similarity to the virtual vector and, if such search target information exists, setting it as the next candidate search target information, this procedure being repeated up to a predetermined stage for the group of search target information; and a step 4 of accepting the user's appropriate/inappropriate judgment on the search target information currently being output, specifying the search target information to be output next based on the decision tree, outputting information representing the content of that search target information, and growing the decision tree by a predetermined number of stages when a condition for growing the existing decision tree is reached.

23. An output information selection program for causing a computer to execute: a process 1 for holding, for each piece of a group of search target information, feature vector data together with information representing its content; a process 2 for outputting information representing the content of search target information specified from the group of search target information; and a process 3 for accepting the user's appropriate/inappropriate judgment on the search target information currently being output, and (1) when the user's judgment is appropriate, searching the group of search target information for other search target information whose feature vector has the highest similarity to the feature vector of the search target information currently being output and, if such search target information exists, outputting information representing its content, and (2) when the user's judgment is inappropriate, calculating a virtual vector that points in the direction opposite to the feature vector of the search target information currently being output, relative to a reference vector given in advance in the first stage and relative to the feature vector of the search target information output in the previous stage in the second and subsequent stages, searching the group of search target information for other search target information whose feature vector has the highest similarity to the virtual vector and, if such search target information exists, outputting information representing its content.

24. An output information selection program for causing a computer to execute: a process 1 for holding, for each piece of a group of search target information, feature vector data together with information representing its content; a process 2 for outputting information representing the content of search target information specified from the group of search target information; a process 3 for creating a decision tree for the group of search target information by starting with the search target information whose feature vector has the highest similarity to a reference vector given in advance, branching to "appropriate" on the assumption that its content is accepted by the user and to "inappropriate" on the assumption that it is not, setting, on an appropriate branch, other search target information whose feature vector has the highest similarity to that feature vector as the next candidate search target information, calculating, on an inappropriate branch, a virtual vector that points in the direction opposite to the feature vector of the inappropriate search target information, relative to the reference vector in the first stage and relative to the feature vector of the search target information that was the output candidate of the preceding stage in the second and subsequent stages, searching for other search target information whose feature vector has the highest similarity to the virtual vector and, if such search target information exists, setting it as the next candidate search target information, this procedure being repeated until the entire group of search target information is covered; and a process 4 for accepting the user's appropriate/inappropriate judgment on the search target information currently being output, specifying the search target information to be output next based on the decision tree, and outputting information representing the content of that search target information.

25. An output information selection program for causing a computer to execute: a process 1 for holding, for each piece of a group of search target information, feature vector data together with information representing its content; a process 2 for outputting information representing the content of search target information specified from the group of search target information; a process 3 for creating a decision tree for the group of search target information by starting with the search target information whose feature vector has the highest similarity to a reference vector given in advance, branching to "appropriate" on the assumption that its content is accepted by the user and to "inappropriate" on the assumption that it is not, setting, on an appropriate branch, other search target information whose feature vector has the highest similarity to that feature vector as the next candidate search target information, calculating, on an inappropriate branch, a virtual vector that points in the direction opposite to the feature vector of the inappropriate search target information, relative to the reference vector in the first stage and relative to the feature vector of the search target information that was the output candidate of the preceding stage in the second and subsequent stages, searching for other search target information whose feature vector has the highest similarity to the virtual vector and, if such search target information exists, setting it as the next candidate search target information, this procedure being repeated up to a predetermined stage for the group of search target information; and a process 4 for accepting the user's appropriate/inappropriate judgment on the search target information currently being output, specifying the search target information to be output next based on the decision tree, outputting information representing the content of that search target information, and growing the decision tree by a predetermined number of stages when a condition for growing the existing decision tree is reached.

26. A computer-readable recording medium on which is recorded an output information selection program for causing a computer to execute: a process 1 for holding, for each piece of a group of search target information, feature vector data together with information representing its content; a process 2 for outputting information representing the content of search target information specified from the group of search target information; and a process 3 for accepting the user's appropriate/inappropriate judgment on the search target information currently being output, and (1) when the user's judgment is appropriate, searching the group of search target information for other search target information whose feature vector has the highest similarity to the feature vector of the search target information currently being output and, if such search target information exists, outputting information representing its content, and (2) when the user's judgment is inappropriate, calculating a virtual vector that points in the direction opposite to the feature vector of the search target information currently being output, relative to a reference vector given in advance in the first stage and relative to the feature vector of the search target information output in the previous stage in the second and subsequent stages, searching the group of search target information for other search target information whose feature vector has the highest similarity to the virtual vector and, if such search target information exists, outputting information representing its content.

27. A computer-readable recording medium on which is recorded an output information selection program for causing a computer to execute: a process 1 for holding, for each piece of a group of search target information, feature vector data together with information representing its content; a process 2 for outputting information representing the content of search target information specified from the group of search target information; a process 3 for creating a decision tree for the group of search target information by starting with the search target information whose feature vector has the highest similarity to a reference vector given in advance, branching to "appropriate" on the assumption that its content is accepted by the user and to "inappropriate" on the assumption that it is not, setting, on an appropriate branch, other search target information whose feature vector has the highest similarity to that feature vector as the next candidate search target information, calculating, on an inappropriate branch, a virtual vector that points in the direction opposite to the feature vector of the inappropriate search target information, relative to the reference vector in the first stage and relative to the feature vector of the search target information that was the output candidate of the preceding stage in the second and subsequent stages, searching for other search target information whose feature vector has the highest similarity to the virtual vector and, if such search target information exists, setting it as the next candidate search target information, this procedure being repeated until the entire group of search target information is covered; and a process 4 for accepting the user's appropriate/inappropriate judgment on the search target information currently being output, specifying the search target information to be output next based on the decision tree, and outputting information representing the content of that search target information.

28. A computer-readable recording medium on which is recorded an output information selection program for causing a computer to execute: a process 1 for holding, for each piece of a group of search target information, feature vector data together with information representing its content; a process 2 for outputting information representing the content of search target information specified from the group of search target information; a process 3 for creating a decision tree for the group of search target information by starting with the search target information whose feature vector has the highest similarity to a reference vector given in advance, branching to "appropriate" on the assumption that its content is accepted by the user and to "inappropriate" on the assumption that it is not, setting, on an appropriate branch, other search target information whose feature vector has the highest similarity to that feature vector as the next candidate search target information, calculating, on an inappropriate branch, a virtual vector that points in the direction opposite to the feature vector of the inappropriate search target information, relative to the reference vector in the first stage and relative to the feature vector of the search target information that was the output candidate of the preceding stage in the second and subsequent stages, searching for other search target information whose feature vector has the highest similarity to the virtual vector and, if such search target information exists, setting it as the next candidate search target information, this procedure being repeated up to a predetermined stage for the group of search target information; and a process 4 for accepting the user's appropriate/inappropriate judgment on the search target information currently being output, specifying the search target information to be output next based on the decision tree, outputting information representing the content of that search target information, and growing the decision tree by a predetermined number of stages when a condition for growing the existing decision tree is reached.