JPH02163876A

JPH02163876A - Literature retrieving method using neural network model

Info

Publication number: JPH02163876A
Application number: JP63317911A
Authority: JP
Inventors: Jiichi Igarashi; 五十嵐　治一
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1988-12-16
Filing date: 1988-12-16
Publication date: 1990-06-25

Abstract

PURPOSE: To let a search user carry out searches as intended, by using calculation formulas that combine the results of two combining functions of different character through a parameter-weighted average. CONSTITUTION: When the state r_j of unit j of the output layer and the state a_i of unit i of the intermediate layer are calculated, formulas are used that combine the results of two functions f and g by a weighted average with parameters α1 and α2. The "distance" between the keyword group offered by the search user and the keyword group assigned in advance to each document can thus be expressed more accurately. The two functions f and g, which differ in character, compensate for each other's defects. By changing the values of α1 and α2, a search criterion suited to the user's level and purpose is obtained, so the user can retrieve documents as intended.

Description

DETAILED DESCRIPTION OF THE INVENTION

Field of Industrial Application

The present invention relates to a document search method that uses a neural network model for document search or data search.

Prior Art

Information processing technology has advanced remarkably in recent years, and document search is one such area. One example is "A Fuzzy Document Retrieval System and Learning Method Based on the Dynamic Keyword Connection Method" (an intelligent document retrieval system using dynamic keyword connections; International Workshop on Applications of Fuzzy Systems, '88.8.22 to 8.24), hereinafter referred to as Document 1. In outline, this method computes the "closeness", that is, the depth of relationship (file accuracy) $r_i$, between the keyword group $Q$ given by the search user and the keyword group $A_i$ assigned in advance to each document $i$, from the relatedness $W_{jk}$ between keywords as

$$r_i = \bigoplus_{j \in A_i,\; k \in Q} W_{jk}, \qquad \bigoplus_i X_i = 1 - \prod_i (1 - X_i),$$

and retrieves the related documents. The relatedness between keywords (the link weight) $W_{jk}$ is learned by the steepest descent method.

Problems to Be Solved by the Invention

However, the method of Document 1 has the following drawbacks in the step that computes the file accuracy $r_i$ from the relatedness $W_{jk}$ between keywords. First, if even one keyword in the group presented by the search user is contained in the keyword group assigned to a document, that document is selected; this happens even when the relatedness between all the other keywords is 0. Second, when the search user presents a keyword that is not in the keyword group assigned to a document but is quite similar to one that is, the file accuracy of the document poorly reflects how appropriate that keyword is. Third, a document is often selected even when its assigned keyword group contains no keyword strongly related to the keyword group presented by the search user.

Furthermore, there is the problem of how to choose the keyword group $A_i$ assigned to each document.

Means for Solving the Problems

The invention of claim 1 is a document search method using a neural network model with a three-layer structure consisting of: an input layer that has a unit $k$ for each keyword and receives the keyword group specified by the search user as an input vector unit group $\{b_k\}$; an intermediate layer consisting of a unit group $\{a_i\}$ that has the same structure as the input vector unit group $\{b_k\}$, with a unit $i$ for each keyword taking a real value $0 \le a_i \le 1$; and an output layer that has a unit $j$ for each document $j$. Each unit $i$ of the intermediate layer is linked to each unit $k$ of the input layer, and unit $j$ of the output layer is linked to unit $i$ of the intermediate layer. Documents are searched by calculating the state $a_i$, which indicates how strongly keyword $i$ at unit $i$ of the intermediate layer is related to the keyword group presented by the user, and the state $r_j$, which indicates how strongly document $j$ at unit $j$ of the output layer is to be retrieved.

In this method, let $W_{ik}$ (where $0 \le W_{ik} \le 1$) be the weight, indicating the relatedness between keywords, of the link between unit $i$ of the intermediate layer and unit $k$ of the input layer; let $V_{ji}$ (a positive, zero, or negative real value) be the weight, indicating the relatedness between document $j$ and keyword $i$, of the link between unit $j$ of the output layer and unit $i$ of the intermediate layer; let $y_1, y_2, h_1, h_2$ be parameters; let the function $g$ be [formula not reproduced in the source text]; and let the function $f$ be

$$f(\{V_{ji} a_i\}) = 1 - \prod_i (1 - V_{ji} a_i).$$

Then $r_j$ and $a_i$ are calculated, using parameters $\alpha_1, \alpha_2$ (where $0 \le \alpha_1 \le 1$, $0 \le \alpha_2 \le 1$), by

$$r_j = \alpha_2 \, g(\{a_i\}; \{V_{ji}\}, y_2, h_2) + (1 - \alpha_2) \, f(\{V_{ji} a_i\})$$

$$a_i = \alpha_1 \, g(\{b_k\}; \{W_{ik}\}, y_1, h_1) + (1 - \alpha_1) \, f(\{W_{ik} b_k\})$$

In the invention of claim 2, the weight $V_{ji}$ of the link between unit $j$ of the output layer and unit $i$ of the intermediate layer, which indicates the relatedness between document $j$ and keyword $i$, is allowed to take any real value, and its value is set by learning.

Operation

According to the invention of claim 1, the state $r_j$ of unit $j$ of the output layer and the state $a_i$ of unit $i$ of the intermediate layer are calculated with formulas that combine the results of two functions $f$ and $g$ by a weighted average with parameters $\alpha_1$ and $\alpha_2$. The "distance" between the keyword group presented by the search user and the keyword group assigned in advance to a document can therefore be expressed more accurately. The functions $g$ and $f$, which differ in character, compensate for each other's shortcomings, and the choice of $\alpha_1$ and $\alpha_2$ yields a search criterion matched to the level and purpose of the search user, so that the search is carried out as the user intends.

According to the invention of claim 2, the link weight $V_{ji}$ is set by learning, that is, by presenting sample data and adjusting the weights with the steepest descent method, so search results close to what the search user has in mind can be obtained.

Embodiment

An embodiment of the present invention is described with reference to the drawings.

Before this embodiment is described, the "Intelligent Document Creation System Using Dynamic Keyword Connections" (hereinafter, Document 2), which the present applicant has already proposed and on which Document 1 is premised, is explained first. In the system of Document 2, retrieving documents closely related to the keyword set specified by the search user involves the following two steps.

Step 1: With the support of the system, the search user selects the keyword set best suited to the search.

Step 2: Based on the search keyword set selected in Step 1, the "closeness (file accuracy)" to each document is calculated.

Of these, Step 2 is the important one, and only Step 2 is described below. First, as shown in FIG. 2, let the relatedness between keywords $K_i$ and $K_k$ be $W_{ik}$ (where $0 \le W_{ik} \le 1$). Thus $W_{ik} = 0$ indicates that keywords $K_i$ and $K_k$ are entirely unrelated, and $W_{ik} = 1$ means that the relationship between them is the strongest. A keyword set is assigned in advance to each document $i$ to be searched; call it $A_i$. If $Q$ denotes the keyword set specified by the search user, the "closeness (file accuracy)" between the $i$-th document and $Q$ is calculated by

$$r_i(Q) = \bigoplus_{j \in A_i,\; k \in Q} W_{jk} \qquad (1)$$

where $\oplus$ denotes the algebraic sum

$$\bigoplus_i X_i = 1 - \prod_i (1 - X_i) \qquad (2)$$

This formula for $r_i(Q)$ can be understood as implementing an OR search over multiple keywords by exploiting the MAX-like character of the algebraic sum.
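Equations (1) and (2) can be tried out with a few lines of Python. The following is a minimal sketch, not the system of Document 2: the function names `algebraic_sum` and `file_accuracy`, the dictionary layout of `W`, and all numeric values are assumptions made for illustration.

```python
from math import prod


def algebraic_sum(values):
    """Algebraic sum of equation (2): 1 - prod(1 - x) over all x."""
    return 1.0 - prod(1.0 - x for x in values)


def file_accuracy(W, doc_keywords, query_keywords):
    """Equation (1): algebraic sum of W[j][k] over j in A_i and k in Q."""
    return algebraic_sum(
        W[j][k] for j in doc_keywords for k in query_keywords
    )


# Hypothetical relatedness table for three keywords (illustrative values).
W = {
    "neural": {"neural": 1.0, "network": 0.5, "fuzzy": 0.0},
    "network": {"neural": 0.5, "network": 1.0, "fuzzy": 0.0},
    "fuzzy": {"neural": 0.0, "network": 0.0, "fuzzy": 1.0},
}

A_i = ["network"]          # keywords assigned to document i
Q = ["neural", "fuzzy"]    # keywords given by the search user

r_i = file_accuracy(W, A_i, Q)
print(r_i)
```

Because a single term equal to 1 drives the product to 0, one perfectly matching keyword pair is enough to give a file accuracy of 1, which is the MAX-like behaviour the text describes.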

The link weights $W_{jk}$ between keywords are learned as follows. An evaluation function is defined for each document, regarded as a function of the weights $W_{jk}$, and the steepest descent method is applied. That is, with the teacher pattern $r_i^* = 1$ (when the document is appropriate) or $r_i^* = 0$ (when it is inappropriate),

$$E_i = \frac{1}{2} (r_i^* - r_i)^2, \qquad -\frac{\partial E_i}{\partial W_{mn}} = (r_i^* - r_i) \, \frac{1 - r_i}{1 - W_{mn}} \qquad (3)$$

(where $m \in A_i$, $n \in Q$, and $W_{mn} \ne 0$, $W_{mn} \ne 1$). To make the evaluation function global over all documents, set $E = \sum_i E_i$. The above covers the case $W_{mn} \ne 0$ and $W_{mn} \ne 1$, that is, $0 < W_{mn} < 1$; the cases $W_{mn} \le 0$ and $W_{mn} \ge 1$ must be considered separately. For simplicity, however, the description below also assumes $0 < W_{mn} < 1$.
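The gradient behind this learning rule follows from equations (1) and (2). The short derivation below is a sketch under stated assumptions: the squared-error form of $E_i$ and the step size $\varepsilon$ are assumed here, since the printed formula is only partly legible in the source text.

```latex
% Assumptions: E_i = (1/2)(r_i^* - r_i)^2 and a step size \varepsilon > 0.
\begin{align*}
r_i &= 1 - \prod_{j \in A_i,\, k \in Q} (1 - W_{jk}), \\
\frac{\partial r_i}{\partial W_{mn}}
    &= \prod_{(j,k) \neq (m,n)} (1 - W_{jk})
     = \frac{1 - r_i}{1 - W_{mn}} \qquad (W_{mn} \neq 1), \\
\frac{\partial E_i}{\partial W_{mn}}
    &= -(r_i^* - r_i)\, \frac{\partial r_i}{\partial W_{mn}}
     = -(r_i^* - r_i)\, \frac{1 - r_i}{1 - W_{mn}}, \\
\Delta W_{mn} &= -\varepsilon\, \frac{\partial E_i}{\partial W_{mn}}
     = \varepsilon\, (r_i^* - r_i)\, \frac{1 - r_i}{1 - W_{mn}}.
\end{align*}
```

The factor $1 - W_{mn}$ in the denominator is why the case $W_{mn} = 1$ has to be excluded and handled separately.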

According to the present inventor's proposal, "A concept for a keyword-based document search scheme that introduces neural network ideas", the system of Document 2 described above can be mapped onto a multilayer network model such as that shown in FIG. 1, which is a typical neural network model.

First, consider a three-layer neural network model consisting of an input layer 1, an intermediate layer 2, and an output layer 3, as illustrated.

The input layer 1 and the intermediate layer 2 have units (shown as circles) $k$ and $i$, one for each keyword. The set of keywords specified by the search user, that is, the keyword group, is input to the input layer 1 as a vector $\{b_k\}$ whose components take the binary values 0 or 1. The unit group $\{a_i\}$ of the intermediate layer 2 has the same structure as the unit group $\{b_k\}$ of the input layer 1, with each unit corresponding to a keyword, but differs in that it takes real values $0 \le a_i \le 1$. Each unit $i$ of the intermediate layer 2 is linked to each unit $k$ of the input layer 1, and the weight of that link (the relatedness between keywords) is written $W_{ik}$.

This weight $W_{ik}$ is a real number satisfying $0 \le W_{ik} \le 1$, with $W_{ii} = 1$ fixed for all $i$. Each unit $j$ of the output layer 3 corresponds to a document $j$, and the state $r_j$ of unit $j$ (a real value with $0 \le r_j \le 1$) represents how strongly document $j$ is selected by the search. Unit $j$ of the output layer 3 and unit $i$ of the intermediate layer 2 are linked. Writing $V_{ji}$ for the strength of that link, that is, the strength of the relationship between document $j$ and keyword $i$, $V_{ji}$ in general takes positive, zero, or negative real values.

When documents are searched, the value of the file accuracy $r_j$ is calculated. In Document 2 described above, the values $\{V_{ji}\}$ are 0 or 1 and are fixed for each document: $V_{ji} = 1$ if $i \in A_j$, and $V_{ji} = 0$ if $i \notin A_j$. The file accuracy $r_j$ can therefore be regarded as being calculated by

$$r_j = 1 - \prod_i (1 - V_{ji} a_i) \qquad (4)$$

$$a_i = 1 - \prod_k (1 - W_{ik} b_k) \qquad (5)$$

This calculation rule amounts to using the MAX-like function $f(\{y_i\}) = 1 - \prod_i (1 - y_i)$ as the combining rule for drawing a conclusion from multiple pieces of evidence $\{y_i\}$ (where $0 \le y_i \le 1$). Viewed as an inference method, however, it is a rather "optimistic" one.

This is because, as noted in the Problems section, if there is even one affirmative piece of evidence ($\exists i: y_i \approx 1$), the conclusion is affirmed (that is, $f(\{y_i\}) \approx 1$) even when all the other evidence is negative ($\forall j \ne i: y_j \approx 0$). Conversely, rejecting the conclusion requires all the evidence to be negative. From the standpoint of document search, it is sometimes desirable that every document containing even one keyword strongly related to the keyword set specified by the search user be selected, but in practice this has the following drawback.

Suppose there are $n$ pieces of evidence near 0.5 and all the other evidence is 0. Then

$$f(\{y_i\}) \approx 1 - (1 - 0.5)^n = 1 - \frac{1}{2^n},$$

which for $n = 6$ gives $f(\{y_i\}) \approx 0.98$, a value close to 1. In other words, six pieces of uncertain information at 0.5, with all the other information completely negative ($y_i = 0$), lead to the quite affirmative conclusion 0.98. Moreover, when the final "closeness" $\{r_j\}$ is to be evaluated quantitatively, all the values end up close to 1, so this combining rule is unsuitable.
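The saturation described here is easy to check numerically. The snippet below is a small illustration only; the helper name `algebraic_sum` and the choice of ten zero-valued pieces of evidence are assumptions made for the example.

```python
def algebraic_sum(values):
    """Algebraic sum f({y_i}) = 1 - prod(1 - y_i)."""
    remaining = 1.0
    for y in values:
        remaining *= 1.0 - y
    return 1.0 - remaining


# n pieces of uncertain evidence at 0.5, plus ten pieces of evidence at 0.
for n in (1, 2, 4, 6):
    evidence = [0.5] * n + [0.0] * 10
    print(n, algebraic_sum(evidence))
```

The zero-valued evidence has no effect on the result, and the output approaches 1 quickly as n grows, which is why the algebraic sum alone is poorly suited to quantitative ranking.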

In contrast to combining by the algebraic sum $f(\{y_i\})$, the combining method used in multilayer network models is the sigmoid function $g(\{y_i\})$, given by equation (6) [not reproduced in the source text]. For the multilayer network model shown in FIG. 1, it can be expressed as [equations not reproduced in the source text], where $y_0, y_1, y_2, h_0, h_1, h_2$ are parameters.

Now consider the meaning of the sigmoid function of equation (6). As shown in FIG. 3, when the parameter $y_0$ is made small, the curve representing $g(\{y_i\})$ rises more sharply near $\sum_i W_i y_i = h_0$ and approaches a step function. Thus, when $y_0$ is small, $g(\{y_i\})$ behaves like a threshold function. In this case, if $h_0$ is set to about 1, the combined result $g(\{y_i\})$ is close to 1 however many links among $\{W_i y_i\}$ are close to 1, which resembles the behaviour of the algebraic sum. Conversely, when $y_0$ is large, $g$ responds gently to the magnitude of the input sum $\sum_i W_i y_i$ and is suited to evaluating the file accuracy quantitatively. However, the "MAX-like" character of the algebraic sum cannot be reproduced for any choice of the parameters $y_0$ and $h_0$.
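The effect of $y_0$ can be illustrated with a short Python sketch. Since equation (6) is not legible in the source text, the logistic form used below, $1 / (1 + \exp(-(\sum_i W_i y_i - h_0) / y_0))$, is an assumption chosen to match the described behaviour (a sharper rise near $h_0$ as $y_0$ shrinks); the function name and the sample values are likewise hypothetical.

```python
import math


def sigmoid_combine(inputs, weights, y0, h0):
    """Assumed logistic form of g: 1 / (1 + exp(-(sum(W_i*y_i) - h0) / y0))."""
    total = sum(w * y for w, y in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-(total - h0) / y0))


weights = [1.0, 1.0, 1.0]
below = [0.5, 0.0, 0.0]   # weighted sum 0.5, below the threshold h0 = 1
above = [0.5, 0.5, 0.5]   # weighted sum 1.5, above the threshold

# Small y0: close to a step function at h0.
sharp_below = sigmoid_combine(below, weights, y0=0.05, h0=1.0)
sharp_above = sigmoid_combine(above, weights, y0=0.05, h0=1.0)

# Large y0: gentle, graded response to the size of the weighted sum.
gentle_below = sigmoid_combine(below, weights, y0=1.0, h0=1.0)
gentle_above = sigmoid_combine(above, weights, y0=1.0, h0=1.0)

print(sharp_below, sharp_above)
print(gentle_below, gentle_above)
```

With the small $y_0$ the two outputs sit near 0 and 1, while with the large $y_0$ they stay in the middle of the range and differ only gradually, which is the graded behaviour suited to quantitative comparison.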

The characteristics of these two combining methods for calculating the file accuracy can be summarized as follows.

■ Document 2 method (algebraic sum)
  * Unsuited to quantitative evaluation
  * A document is retrieved if even one related keyword exists

■ Neural network method (weighted sum + sigmoid function)
  * Suited to quantitative evaluation (especially when $W_i y_i$ is not near 0 or 1)
  * Cannot express the MAX operation
  * Suited to calculating the closeness between the keyword group as a whole and a document

This embodiment therefore proposes a combining scheme that merges these two approaches.

That is, the two are superposed using parameters $\alpha_1, \alpha_2$ satisfying $0 \le \alpha_1 \le 1$, $0 \le \alpha_2 \le 1$, and the state $r_j$ of unit $j$ of the output layer 3 and the state $a_i$ of unit $i$ of the intermediate layer 2 are calculated by

$$r_j = \alpha_2 \, g(\{a_i\}; \{V_{ji}\}, y_2, h_2) + (1 - \alpha_2) \, f(\{V_{ji} a_i\}) \qquad (9)$$

$$a_i = \alpha_1 \, g(\{b_k\}; \{W_{ik}\}, y_1, h_1) + (1 - \alpha_1) \, f(\{W_{ik} b_k\}) \qquad (10)$$

Here $r_j$ and $a_i$ both take real values satisfying $0 \le r_j \le 1$ and $0 \le a_i \le 1$, and $b_k$ of the input layer 1 takes the binary values $\{0, 1\}$. The weight coefficient $W_{ik}$ takes real values satisfying $0 \le W_{ik} \le 1$, and $V_{ji}$ in general takes arbitrary real values.
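Equations (9) and (10) describe a two-stage forward pass, which can be sketched in Python as below. This is an illustration under assumptions, not the patented implementation: the names `f_alg`, `g_sig`, and `forward`, the example weights, and in particular the logistic form assumed for g (its formula is not reproduced in the source text) are all introduced here.

```python
import math


def f_alg(values):
    """Algebraic sum f: 1 - prod(1 - v)."""
    remaining = 1.0
    for v in values:
        remaining *= 1.0 - v
    return 1.0 - remaining


def g_sig(inputs, weights, y, h):
    """Assumed logistic sigmoid of the weighted input sum."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-(total - h) / y))


def forward(b, W, V, alpha1, alpha2, y1, h1, y2, h2):
    """Return (a, r): intermediate-layer and output-layer states."""
    # Equation (10): state a_i of each intermediate-layer unit.
    a = [
        alpha1 * g_sig(b, W_i, y1, h1)
        + (1.0 - alpha1) * f_alg(w * x for w, x in zip(W_i, b))
        for W_i in W
    ]
    # Equation (9): state r_j of each output-layer unit.
    r = [
        alpha2 * g_sig(a, V_j, y2, h2)
        + (1.0 - alpha2) * f_alg(v * x for v, x in zip(V_j, a))
        for V_j in V
    ]
    return a, r


# Hypothetical example: three keywords, two documents.
b = [1, 0, 0]                                  # user selected keyword 0
W = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]  # W_ii = 1
V = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]         # document-keyword links

a, r = forward(b, W, V, alpha1=0.0, alpha2=0.0,
               y1=1.0, h1=0.5, y2=1.0, h2=0.5)
print(a, r)
```

With both $\alpha$ values at 0 the sketch reduces to equations (4) and (5); raising either $\alpha$ toward 1 blends in the sigmoid term for that layer.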

The combining function $g$ used in equations (9) and (10) is, as described above, the sigmoid function commonly used in multilayer neural network models, given by equation (11) [not reproduced in the source text]. The values of its parameters are determined later by learning. The function $f$, on the other hand, is the function of Document 2:

$$f(\{V_{ji} a_i\}) = 1 - \prod_i (1 - V_{ji} a_i) \qquad (12)$$

Thus the calculation by equations (9) and (10) in this embodiment uses a function that is a linear combination of the functions $g$ and $f$. The function $g$ maps the weighted sum of its inputs into the interval $[0, 1]$ and is suited to determining a state by evaluating the whole input keyword group comprehensively. The function $f$ has a MAX-like character that is hard to express with $g$. Equations (9) and (10) therefore let the functions $f$ and $g$ compensate for each other's shortcomings, and the search user can choose a strategy through the choice of the parameters $\alpha_1$ and $\alpha_2$. Choosing 0 or 1 for $\alpha_1$ and $\alpha_2$ selects between two strategies, the algebraic sum scheme and the sum-plus-sigmoid scheme, and setting $\alpha_1$ and $\alpha_2$ to values between 0 and 1 gives intermediate combining methods.

For example, with $\alpha_1 \approx 0$ and $\alpha_2 \approx 0$, equations (9) and (10) reduce to the algebraic sum scheme using the function $f$: a document is retrieved if there is even one keyword closely related to any of the keywords assigned to that document. This is considered effective when the search user cannot choose appropriate keywords.

With $\alpha_1 \approx 1$ and $\alpha_2 \approx 1$, equations (9) and (10) reduce to the sum-plus-sigmoid scheme using the function $g$, which is considered effective when the "overall closeness" between the keyword group presented by the search user and the keyword group assigned to a document must be compared quantitatively. It is therefore considered effective when the search user is sufficiently confident in the choice of keywords.

With $\alpha_1 \approx 1$ and $\alpha_2 \approx 0$, a keyword group that is "overall close" to $Q$ is selected, and all documents strongly related to the keywords in that group are retrieved.

With $\alpha_1 \approx 0$ and $\alpha_2 \approx 1$, all keywords strongly related to the keywords in $Q$ are selected, and documents that are "overall close" to that keyword group are retrieved.

In this way, setting the values of $\alpha_1$ and $\alpha_2$ selects between the two functions $g$ and $f$ of different character, so the search criterion can be changed to suit the level and purpose of the search user.
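Written out at the exact limits 0 and 1, the four settings correspond to the following special cases of equations (9) and (10). This is a restatement for illustration; the text itself speaks of values near 0 or 1 rather than exactly equal to them.

```latex
\begin{align*}
\alpha_1 = 0,\ \alpha_2 = 0:&\quad
  a_i = f(\{W_{ik} b_k\}), &
  r_j &= f(\{V_{ji} a_i\}) \\
\alpha_1 = 1,\ \alpha_2 = 1:&\quad
  a_i = g(\{b_k\}; \{W_{ik}\}, y_1, h_1), &
  r_j &= g(\{a_i\}; \{V_{ji}\}, y_2, h_2) \\
\alpha_1 = 1,\ \alpha_2 = 0:&\quad
  a_i = g(\{b_k\}; \{W_{ik}\}, y_1, h_1), &
  r_j &= f(\{V_{ji} a_i\}) \\
\alpha_1 = 0,\ \alpha_2 = 1:&\quad
  a_i = f(\{W_{ik} b_k\}), &
  r_j &= g(\{a_i\}; \{V_{ji}\}, y_2, h_2)
\end{align*}
```

In each case $\alpha_1$ controls how the intermediate keyword states are formed from the query, and $\alpha_2$ controls how documents are scored from those keyword states.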

FIG. 4 shows the processing procedure of the system at search time.

In the system of this embodiment, the link weight coefficients $\{V_{ji}\}$ and $\{W_{ik}\}$ can be set to optimal values by learning (with $W_{ii} = 1$ fixed for all $i$). As the learning method, the steepest descent method is applied to teacher input and output patterns $(\{b_k^*\}, \{r_j^*\})$ and the like, and the weight coefficients $\{V_{ji}\}$, $\{W_{ik}\}$ and the parameters $h_1, h_2, y_1, y_2$ are varied accordingly. FIG. 5 shows the processing procedure during learning.

More specifically, learning proceeds as follows. Let the teacher output pattern be $\{r_j^*\}$. An evaluation function [formula not reproduced in the source text] is defined as a measure of agreement with the output pattern, and the steepest descent method is applied to this evaluation function.

The result is a set of update equations for the weights and parameters. [Update equations not recoverable from the source text; the surviving fragments indicate that the update for $W_{mn}$ is 0 when $m = n$ and that separate updates are given for the parameters $y_1$ and $y_2$.]

Thus, with this processing scheme, the relatedness between keywords $\{W_{ik}\}$, the relatedness between keywords and documents $\{V_{ji}\}$, and the other parameters ($h_1, h_2, y_1, y_2$) can be set automatically to appropriate values by learning, and search results close to what the search user has in mind can be obtained. Moreover, because $\{V_{ji}\}$ can also take negative real values, exclusivity between a keyword and a document (specifying a certain keyword makes a certain document less likely to be retrieved) can also be expressed. In the Document 2 method, the weight of the link between a document and the keywords attached to it was fixed to the binary values $\{0, 1\}$; in the method of this embodiment, $\{V_{ji}\}$ is extended to real numbers, so inhibitory effects can also be expressed and the model can be called more dynamic.
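How steepest descent can drive a link weight negative is shown by the following Python sketch for a single output unit. It is a sketch under assumptions, not the update rule of the embodiment: the squared-error evaluation function $E = \frac{1}{2}(r_j^* - r_j)^2$, the logistic form of g, the fixed intermediate states, the learning rate, and all function names are hypothetical choices made here.

```python
import math


def output_state(V_j, a, alpha2, y2, h2):
    """Return (r_j, g) from equation (9) with an assumed logistic g."""
    total = sum(v * x for v, x in zip(V_j, a))
    g = 1.0 / (1.0 + math.exp(-(total - h2) / y2))
    remaining = 1.0
    for v, x in zip(V_j, a):
        remaining *= 1.0 - v * x
    return alpha2 * g + (1.0 - alpha2) * (1.0 - remaining), g


def gradient(V_j, a, target, alpha2, y2, h2):
    """dE/dV_ji for the assumed error E = 0.5 * (target - r_j)**2."""
    r, g = output_state(V_j, a, alpha2, y2, h2)
    grads = []
    for i, a_i in enumerate(a):
        others = 1.0  # product of (1 - V_jl * a_l) over l != i
        for l, (v, x) in enumerate(zip(V_j, a)):
            if l != i:
                others *= 1.0 - v * x
        dr = alpha2 * g * (1.0 - g) * a_i / y2 + (1.0 - alpha2) * a_i * others
        grads.append(-(target - r) * dr)
    return grads


def train(V_j, a, target, alpha2=0.5, y2=1.0, h2=0.5, rate=0.5, steps=200):
    """Plain steepest descent on the link weights of one output unit."""
    V_j = list(V_j)
    for _ in range(steps):
        grads = gradient(V_j, a, target, alpha2, y2, h2)
        V_j = [v - rate * dv for v, dv in zip(V_j, grads)]
    return V_j


a = [1.0, 0.5, 0.0]        # fixed intermediate-layer states (illustrative)
V_start = [0.2, 0.2, 0.2]  # initial link weights for document j
target = 0.0               # teacher says: document j is inappropriate

r_before, _ = output_state(V_start, a, 0.5, 1.0, 0.5)
V_trained = train(V_start, a, target)
r_after, _ = output_state(V_trained, a, 0.5, 1.0, 0.5)
print(V_trained, r_before, r_after)
```

In this toy run the weights attached to active keywords are pushed downward until the output matches the teacher value, while the weight of the inactive keyword (state 0) receives no update.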

Effects of the Invention

With the configuration described above, according to the invention of claim 1, the state $r_j$ of each output-layer unit and the state $a_i$ of each intermediate-layer unit in the three-layer neural network are calculated with formulas that combine the results of two functions $f$ and $g$ of different character by a weighted average with parameters $\alpha_1$ and $\alpha_2$. The functions $g$ and $f$ compensate for each other's shortcomings, so the "distance" between the keyword group presented by the search user and the keyword group assigned in advance to a document can be expressed more accurately, and the choice of $\alpha_1$ and $\alpha_2$ yields a search criterion matched to the level and purpose of the search user, so that the search is carried out as the user intends. According to the invention of claim 2, the link weight $V_{ji}$ is set by learning, that is, by presenting sample data and adjusting the weights with the steepest descent method, so search results close to what the search user has in mind can be obtained. Furthermore, because $V_{ji}$ can also take negative real values, exclusivity between a keyword and a document can be expressed.


[Brief Description of the Drawings]

The drawings show an embodiment of the present invention. FIG. 1 is a structural diagram of the neural network model, FIG. 2 is a structural diagram of the Document 1 method, FIG. 3 is a characteristic diagram of g(y), FIG. 4 is a flowchart of the search process, and FIG. 5 is a flowchart of the learning process.

1: input layer, 2: intermediate layer, 3: output layer

Claims

[Claims]

1. A document search method using a neural network model, the model having a three-layer structure consisting of an input layer that has a unit k corresponding to each keyword and takes the keyword group specified by the search user as an input vector unit group {b_k}, an intermediate layer consisting of a unit group {a_i} that has the same structure as the input vector unit group {b_k}, has a unit i corresponding to each keyword, and takes real values such that 0≦a_i≦1, and an output layer having a unit j corresponding to each document j, wherein each unit i of the intermediate layer has a link with each unit k of the input layer, and unit j of the output layer has a link with unit i of the intermediate layer, and wherein documents are searched by calculating a state a_i indicating the degree to which keyword i at unit i of the intermediate layer is related to the keyword group presented by the user and a state r_j indicating the degree to which document j at unit j of the output layer is retrieved, characterized in that, where W_ik (0≦W_ik≦1) is the weight, indicating the relatedness between keywords, of the link between each unit i of the intermediate layer and each unit k of the input layer, V_ji (a positive, zero, or negative real value) is the weight, indicating the relatedness between document j and keyword i, of the link between unit j of the output layer and unit i of the intermediate layer, y_1, y_2, h_1, h_2 are parameters, the function g is [formula not reproduced in the source text], and the function f is f({V_ji a_i}) = 1 - Π_i (1 - V_ji a_i), said r_j and said a_i are calculated, using parameters α_1, α_2 (where 0≦α_1≦1, 0≦α_2≦1), by
r_j = α_2・g({a_i};{V_ji},y_2,h_2) + (1-α_2) f({V_ji a_i})
a_i = α_1・g({b_k};{W_ik},y_1,h_1) + (1-α_1) f({W_ik b_k}).

2. The document search method using a neural network model according to claim 1, wherein the weight V_ji of the link between unit j of the output layer and unit i of the intermediate layer, which indicates the relatedness between document j and keyword i, is allowed to take any real value, and its value is set by learning.