JPH04293161A

JPH04293161A - Method and device for retrieving document

Info

Publication number: JPH04293161A
Application number: JP3080547A
Authority: JP
Inventors: Hisamitsu Kawaguchi (川口 久光); Mitsuru Akisawa (秋沢 充); Kanji Kato (加藤 寛次); Atsushi Hatakeyama (畠山 敦); Hiromichi Fujisawa (藤澤 浩道)
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1991-03-20
Filing date: 1991-03-20
Publication date: 1992-10-16
Anticipated expiration: 2015-10-16
Also published as: JP3099298B2

Abstract

PURPOSE: To provide a concrete document retrieval method and device that can decide compound conditions, such as neighborhood conditions, context conditions, and logical conditions, in a full-text search. CONSTITUTION: A device consisting of a character-string collating circuit 200 and a compound-condition deciding circuit 300 collates the documents of a document database against the retrieval words designated in a retrieval conditional expression. When a retrieval word is matched, the document identification information, the identifier of the matched retrieval word, and its position information in the document are output as collating information. In deciding the compound condition, the neighborhood condition is decided first by checking, based on the collating information, whether the proximity-distance condition between the retrieval words designated in the retrieval conditional expression holds; the collating information together with the collating information of the decision result is output as the whole collating information. Next, the context condition is decided by checking, based on the whole collating information, whether the designated retrieval words co-occur in the same phrase, sentence, or the like, and the whole collating information is output again. Finally, the logical condition between the designated retrieval words is checked, and the whole collating information obtained is output as the final retrieval result information.

Description

[Detailed description of the invention]

[0001]

[Industrial Application Field] The present invention relates to full-text search in information processing systems, particularly information retrieval systems, and concerns a method and apparatus for performing at high speed the determination of compound conditions such as neighborhood conditions, context conditions, and logical conditions. It can be used for searching in text databases, word processors, document filing systems, and the like.

[0002]

[Prior Art] In recent years, large-scale database services that include not only secondary information (bibliographic information) such as literature and patent information, but also primary information (the original text), have become increasingly important. Conventionally, information retrieval in databases has relied on secondary information such as keywords and classification codes controlled by a thesaurus. With this method, however, the results can only be narrowed down to somewhere between several dozen and several hundred items, so the searcher must read the text directly at the final stage to confirm the content, which is inefficient. In addition, because the classification system itself changes over time, keywords and classification codes must be continually updated. Furthermore, because assigning keywords (called indexing) takes time, new documents are registered in sizable batches. As a result, the searchable information always lags by a certain period.

[0003] As one way of dealing with these problems, full-text search systems have been considered, in which a searcher can search the content by referring directly to the body text of documents using freely chosen search words. Several document retrieval devices for realizing such a full-text search system have been proposed. The configuration of a representative one is shown in FIG. 2 and is described below (L. A. Hollaar, "Hardware Systems for Text Information Retrieval", ACM SIGIR 6th Conference, 1983). In the document retrieval device 1, the search control means 101 controls the entire device and communicates with the host computer. That is, it accepts a search request 201 sent from the host computer, analyzes it, and sends it as search information 202 to the character string matching means 200 and the compound condition determining means 300. The search control means 101 also controls the storage device control means 104 to read the document data 204 stored in the character string storage means 105 into the character string matching means 200. The character string matching means 200 checks whether the document data 204 contains anything matching a search word specified in the search request (hereinafter called a search term) and, if so, outputs information 205 identifying the matched character string to the compound condition determining means 300. The compound condition determining means 300 checks whether the character string identification information 205 satisfies the conditions specified in the search request, such as logical conditions composed of AND, OR, and the like. If the compound condition is satisfied, the identification information and content of the matching document are returned to the host computer as the search result 206.

[0004] In this system, in order to narrow down results accurately, the following conditions for English text are proposed as search conditions of the compound condition determining means 300, in addition to logical conditions.

"A .n. B" (1-1)
"<A, B>n" (1-2)
"A AND B IN SENT" (1-3)

The conditional expression "A .n. B" of (1-1) means finding documents in which the two search terms "A" and "B" appear in this order and within n words of each other. The conditional expression "<A, B>n" of (1-2) means finding documents in which the two search terms "A" and "B" appear within n words of each other regardless of order, that is, whether "A" appears before "B" or "B" appears before "A". A search condition that, like (1-1) and (1-2), uses the degree of proximity between search terms as its measure is called a neighborhood condition. The conditional expression "A AND B IN SENT" of (1-3) means finding documents in which the two search terms "A" and "B" appear in the same sentence regardless of order. A condition that, like (1-3), determines the co-occurrence of two search terms within the same context (also called a field), such as a sentence or paragraph, is called a context condition.

[0005] Thus, the cited reference proposes, as compound conditions, search conditions such as neighborhood conditions and context conditions that are constrained by the distance-based and contextual connections between search terms. Compared with searching using logical conditions alone, these conditions take the semantic connections between keywords into account, so a finer-grained search is possible and, as a result, results can be narrowed down accurately. However, the cited reference does not describe any concrete method for realizing neighborhood conditions or context conditions. In addition, because a full-text search scans the document data directly, the processing time is enormous. Character string matching hardware called a term comparator has therefore been proposed to search for search terms at high speed. A concrete implementation is disclosed, for example, in Japanese Patent Laid-Open No. Sho 60-105039. Such a term comparator can perform character string matching at high speeds of several MByte/s to several tens of MByte/s. However, these term comparators have only a function equivalent to the character string matching means 200; they have no function for determining compound conditions such as neighborhood conditions, context conditions, and logical conditions, which are important for narrowing down a search. Moreover, the compound condition determining means 300 must process the large number of search terms matched at high speed by the character string matching means 200 (hereinafter called matching terms) at the same high speed as the matching itself. This is because, however fast the character string matching means 200 may be, slow processing in the compound condition determining means 300 lowers the search speed of the system as a whole, and the search time cannot be shortened. The compound condition determining means 300 must therefore be capable of determining neighborhood conditions, context conditions, and logical conditions at high speed.

[0006]

[Problems to be Solved by the Invention] An object of the present invention is to provide concrete methods for determining compound conditions, such as neighborhood condition determination, context condition determination, and logical condition determination, which enable the fine-grained narrowing down that is characteristic of full-text search, and to provide a compound condition determination method that can perform these determinations in combination at a speed equivalent to that of hardware character string matching means.

[0007] The compound condition search functions that the present invention specifically aims to realize are as follows. First, as neighborhood conditions, it realizes a character distance condition search for Japanese, in which an upper or lower limit is specified for the number of characters between search terms, and a word distance condition search for English, in which an upper or lower limit is specified for the number of words between search terms. Examples of character distance conditions are as follows.

"document [8C] search" (2-1)
"document [10c] search" (2-2)
"document [8c, 10c] search" (2-3)
"document <10c> search" (2-4)

The conditional expression "document [8C] search" of (2-1) means finding documents in which the two search terms "document" (文書) and "search" (検索) appear in this order with at most 8 characters between them. Among the example sentences shown in FIG. 3, ■ and ■ are therefore retrieved. The conditional expression "document [10c] search" of (2-2) means finding documents in which the two search terms appear within 10 characters of each other regardless of order, that is, whether "document" appears before "search" or "search" appears before "document". Among the example sentences shown in FIG. 3, ■, ■, and ■ are therefore retrieved. The conditional expression "document [8c, 10c] search" of (2-3) means finding documents in which the two search terms, in either order, appear at least 8 characters apart and within 10 characters of each other. Among the example sentences shown in FIG. 3, ■ and ■ are therefore retrieved. The conditional expression "document <10c> search" of (2-4) means finding documents in which the two search terms, in either order, appear at least 10 characters apart. Among the example sentences shown in FIG. 3, ■ and ■ are therefore retrieved.
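The four forms (2-1) to (2-4) can be read as a single predicate over the character gap between two matched terms. The following Python sketch is one possible reading, not the circuit of the invention: the function name `char_distance_ok`, the half-open `(start, end)` spans, and the choice to measure the distance as the number of characters lying strictly between the two terms are all assumptions made here for illustration.

```python
from typing import Optional, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive (assumed convention)


def char_distance_ok(a: Span, b: Span, *, ordered: bool,
                     min_gap: Optional[int] = None,
                     max_gap: Optional[int] = None) -> bool:
    """Check a character distance condition between term A and term B.

    The gap is the number of characters strictly between the two terms
    (an assumed interpretation of the distance in the text).
    """
    if a[0] <= b[0]:
        front, rear = a, b
    else:
        if ordered:            # B comes first, but A-then-B order is required
            return False
        front, rear = b, a
    gap = rear[0] - front[1]
    if gap < 0:                # overlapping matches are rejected in this sketch
        return False
    if min_gap is not None and gap < min_gap:
        return False
    if max_gap is not None and gap > max_gap:
        return False
    return True


# "文書を高速に検索する": 文書 occupies offsets 0-1, 検索 occupies offsets 6-7.
doc_span, search_span = (0, 2), (6, 8)
form_2_1 = char_distance_ok(doc_span, search_span, ordered=True, max_gap=8)
form_2_2 = char_distance_ok(search_span, doc_span, ordered=False, max_gap=10)
form_2_3 = char_distance_ok(doc_span, search_span, ordered=False, min_gap=8, max_gap=10)
form_2_4 = char_distance_ok(doc_span, search_span, ordered=False, min_gap=10)
```

In this reading the uppercase form [8C] maps to `ordered=True`, the lowercase forms map to `ordered=False`, and the angle-bracket form supplies only a lower limit.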

[0008] Next, examples of word distance conditions are as follows.

"text [8W] search" (3-1)
"text [10w] search" (3-2)
"text [8w, 10w] search" (3-3)
"text <10w> search" (3-4)

The conditional expression "text [8W] search" of (3-1) means finding documents in which the two search terms "text" and "search" appear in this order with at most 8 words between them. The conditional expression "text [10w] search" of (3-2) means finding documents in which the two search terms appear within 10 words of each other regardless of order, that is, whether "text" appears before "search" or "search" appears before "text". The conditional expression "text [8w, 10w] search" of (3-3) means finding documents in which the two search terms, in either order, appear at least 8 words apart and within 10 words of each other. The conditional expression "text <10w> search" of (3-4) means finding documents in which the two search terms, in either order, appear at least 10 words apart. These are the objectives concerning neighborhood conditions.
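The word distance forms (3-1) to (3-4) differ from the character distance forms only in the unit being counted. The sketch below, written under stated assumptions, tokenizes English text and counts the words lying between two single-word terms. The function name `word_distance_match`, the tokenizing regular expression, and the case-insensitive comparison are illustrative choices that do not come from the text.

```python
import re
from typing import Optional


def word_distance_match(text: str, a: str, b: str, *, ordered: bool,
                        min_words: Optional[int] = None,
                        max_words: Optional[int] = None) -> bool:
    """Return True if single-word terms a and b satisfy a word distance condition."""
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    for i in pos_a:
        for j in pos_b:
            if i == j or (ordered and j < i):
                continue
            between = abs(j - i) - 1      # words strictly between the two terms
            if min_words is not None and between < min_words:
                continue
            if max_words is not None and between > max_words:
                continue
            return True
    return False


sample = "Full text databases need fast search hardware"
form_3_1 = word_distance_match(sample, "text", "search", ordered=True, max_words=8)
form_3_2 = word_distance_match(sample, "search", "text", ordered=False, max_words=10)
form_3_3 = word_distance_match(sample, "text", "search", ordered=False,
                               min_words=8, max_words=10)
form_3_4 = word_distance_match(sample, "text", "search", ordered=False, min_words=10)
```

Three words separate "text" and "search" in the sample, so only the two upper-limit forms hold.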

[0009] Next, the context condition searches are as follows for both Japanese and English.

"document [P] search", "text [P] search" (4-1)
"document [p] search", "text [p] search" (4-2)
"document [S] search", "text [S] search" (4-3)
"document [s] search", "text [s] search" (4-4)
"document [PH] search", "text [PH] search" (4-5)
"document [ph] search", "text [ph] search" (4-6)

[0010] The following explanation uses the Japanese examples. The conditional expression "document [P] search" of (4-1) means finding documents in which the two search terms "document" and "search" appear in this order in the same paragraph. The conditional expression "document [p] search" of (4-2) means finding documents in which the two search terms appear in the same paragraph regardless of order. The conditional expression "document [S] search" of (4-3) means finding documents in which the two search terms appear in this order in the same sentence. The conditional expression "document [s] search" of (4-4) means finding documents in which the two search terms appear in the same sentence regardless of order. The conditional expression "document [PH] search" of (4-5) means finding documents in which the two search terms appear in this order in the same phrase. In Japanese, a phrase is a stretch of text delimited by "、", "，", or "。"; in English, it is a stretch of text delimited by "," or ".". The conditional expression "document [ph] search" of (4-6) means finding documents in which the two search terms appear in the same phrase regardless of order. These are the objectives concerning context conditions.
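The six operators of (4-1) to (4-6) combine a context unit (paragraph, sentence, phrase) with an order flag (uppercase for ordered, lowercase for unordered). The sketch below encodes that table and checks it against plain text. The phrase delimiters follow the text; treating "。" and "." as sentence delimiters and a newline as the paragraph boundary are assumptions of this sketch, as are the names `CONTEXT_OPS`, `DELIMS`, and `context_match`.

```python
import re

# Operator -> (context unit, whether order matters), following (4-1) to (4-6).
CONTEXT_OPS = {
    "P": ("paragraph", True), "p": ("paragraph", False),
    "S": ("sentence", True), "s": ("sentence", False),
    "PH": ("phrase", True), "ph": ("phrase", False),
}

# Delimiters per unit. Only the phrase marks are given in the text; the
# sentence marks and the newline paragraph boundary are assumed here.
DELIMS = {
    "phrase": r"[、，。,.\n]",
    "sentence": r"[。.\n]",
    "paragraph": r"\n",
}


def context_match(text: str, a: str, b: str, op: str) -> bool:
    """Return True if terms a and b co-occur in one context unit as op requires."""
    unit, ordered = CONTEXT_OPS[op]
    for segment in re.split(DELIMS[unit], text):
        i, j = segment.find(a), segment.find(b)
        if i < 0 or j < 0:
            continue
        if not ordered:
            return True
        # Ordered: the first occurrence of a must start before the last b.
        if i < segment.rfind(b):
            return True
    return False


sample = "文書を登録する。高速に検索する。\n検索した文書を表示する。"
same_paragraph_ordered = context_match(sample, "文書", "検索", "P")
same_sentence_ordered = context_match(sample, "文書", "検索", "S")
same_sentence_any_order = context_match(sample, "文書", "検索", "s")
```

In the sample, the two terms share a sentence only in the second paragraph, where 検索 precedes 文書, so the ordered sentence form fails while the unordered one holds.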

[0011] Finally, the logical conditions are as follows for both Japanese and English.

"document [AND] search", "text [AND] search" (5-1)
"document [OR] search", "text [OR] search" (5-2)
"document [NOT] search", "text [NOT] search" (5-3)

[0012] The following explanation uses the Japanese examples. The conditional expression "document [AND] search" of (5-1) means finding documents in which the two search terms "document" and "search" both appear. The conditional expression "document [OR] search" of (5-2) means finding documents in which the search term "document" or the search term "search" appears. The conditional expression "document [NOT] search" of (5-3) means finding documents in which the search term "document" appears and the search term "search" does not appear. These are the objectives concerning logical conditions.
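The three logical forms (5-1) to (5-3) reduce to set membership tests at the document level. A minimal sketch follows, assuming the set of terms found in a document is already known; the function name `logical_match` and the string operator names are illustrative only.

```python
from typing import Set


def logical_match(terms_in_doc: Set[str], a: str, b: str, op: str) -> bool:
    """Evaluate a two-term logical condition against the terms found in one document."""
    has_a, has_b = a in terms_in_doc, b in terms_in_doc
    if op == "AND":
        return has_a and has_b
    if op == "OR":
        return has_a or has_b
    if op == "NOT":            # a present and b absent, as in (5-3)
        return has_a and not has_b
    raise ValueError(f"unknown operator: {op}")


found = {"document"}
and_result = logical_match(found, "document", "search", "AND")
or_result = logical_match(found, "document", "search", "OR")
not_result = logical_match(found, "document", "search", "NOT")
```

Note that [NOT] here is a binary operator: it requires the first term to be present, not merely the second to be absent.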

[0013] To summarize, the object of the present invention is to provide concrete methods for determining compound conditions, such as neighborhood condition determination, context condition determination, and logical condition determination, which enable the fine-grained narrowing down that is characteristic of full-text search, and to provide a compound condition determination method that can perform these determinations in combination at a speed equivalent to that of hardware character string matching means.

[0014]

[Means for Solving the Problems] To solve these problems, the method of the present invention comprises a character string matching step and a compound condition determination step. In the character string matching step, when a specified search term is matched in a document, the document identifier (the identifier of that document), the identifier of the matched search term (that is, the matching term), and the first and last character positions of the matching term in the document are output as matching information. When a context condition is specified and a character string identifying a context is matched, the identifier of the document, the identifier of the matched context identification string, and the first and last character positions of that string in the document are output as matching information.
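The matching information described here is a four-field record: document identifier, term identifier, first character position, and last character position. The sketch below emits such records with a naive substring scan standing in for the hardware string matcher. The record name `Match`, the function `match_strings`, the identifiers "T1", "T2", and "D1", and the 0-based inclusive positions are assumptions for illustration; the text does not fix a record layout or numbering.

```python
from typing import Dict, Iterator, NamedTuple


class Match(NamedTuple):
    doc_id: int    # document identifier
    term_id: str   # identifier of the matched term or context identification string
    start: int     # position of the first character (0-based, assumed)
    end: int       # position of the last character (inclusive, assumed)


def match_strings(doc_id: int, text: str, patterns: Dict[str, str]) -> Iterator[Match]:
    """Yield one record per occurrence of each pattern, ordered by position."""
    found = []
    for term_id, pattern in patterns.items():
        pos = text.find(pattern)
        while pos >= 0:
            found.append(Match(doc_id, term_id, pos, pos + len(pattern) - 1))
            pos = text.find(pattern, pos + 1)
    yield from sorted(found, key=lambda m: (m.start, m.end))


records = list(match_strings(1, "文書を高速に検索する。",
                             {"T1": "文書", "T2": "検索", "D1": "。"}))
```

Search terms and context identification strings share one record format, which is what lets the later determination steps consume a single stream.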

[0015] The compound condition determination step consists of a neighborhood condition determination step, a context condition determination step, or a logical condition determination step, or a combination of these steps. In the neighborhood condition determination step, the proximity distance condition, expressed as the number of characters between the search terms specified in the search condition expression, is determined based on the matching information output in the character string matching step. The first character position of the preceding search term and the last character position of the following search term that satisfy the condition are taken as the matching information of the determination result, which is appended to the matching information output in the character string matching step and output.
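The neighborhood condition determination step can be pictured as a filter that passes its input records through and appends one result record per satisfying pair, spanning from the first character of the front term to the last character of the rear term. The sketch below assumes the four-field record described for the matching step; the names `Match` and `neighborhood_step`, the result identifier "N1", and the pairwise (quadratic) comparison are illustrative and are not claimed to reflect the actual circuit.

```python
from typing import List, NamedTuple, Optional


class Match(NamedTuple):
    doc_id: int
    term_id: str
    start: int   # first character position (inclusive)
    end: int     # last character position (inclusive)


def neighborhood_step(matches: List[Match], a_id: str, b_id: str, result_id: str,
                      *, ordered: bool, min_gap: Optional[int] = None,
                      max_gap: Optional[int] = None) -> List[Match]:
    """Return the input records plus one result record per satisfying pair."""
    results = []
    for a in (m for m in matches if m.term_id == a_id):
        for b in (m for m in matches
                  if m.term_id == b_id and m.doc_id == a.doc_id):
            front, rear = (a, b) if a.start <= b.start else (b, a)
            if ordered and front is not a:
                continue
            gap = rear.start - front.end - 1   # characters strictly between
            if gap < 0:
                continue
            if ((min_gap is None or gap >= min_gap)
                    and (max_gap is None or gap <= max_gap)):
                results.append(Match(a.doc_id, result_id, front.start, rear.end))
    return matches + results


stream = [Match(1, "T1", 0, 1), Match(1, "T2", 6, 7)]
out = neighborhood_step(stream, "T1", "T2", "N1", ordered=True, max_gap=8)
```

Because the result is appended and nothing is dropped, a later step can still see the original term records alongside the neighborhood result.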

[0016] In the context condition determination step, when the search condition expression includes a neighborhood condition, the co-occurrence condition of the search terms specified in the search condition expression within the same phrase, the same sentence, or the same paragraph is determined based on the matching information output in the neighborhood condition determination step. The first character position of the preceding context identification string and the last character position of the following context identification string that satisfy the condition are taken as matching information, which is appended to the matching information output in the neighborhood condition determination step and output. When the search condition expression does not include a neighborhood condition, the same determination is made based on the matching information output in the character string matching step, and the resulting matching information is appended to the matching information output in the character string matching step and output.
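On the record stream, the context condition amounts to asking whether records of both terms fall between the same pair of consecutive context identification strings. The sketch below assumes a single document, the four-field record described for the matching step, and a delimiter record on both sides of every context (for example a document-start marker), which is not spelled out in the text. The names `Match` and `context_step` and the identifiers "D" and "C1" are hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List, NamedTuple


class Match(NamedTuple):
    doc_id: int
    term_id: str
    start: int
    end: int


def context_step(matches: List[Match], a_id: str, b_id: str, delim_id: str,
                 result_id: str, *, ordered: bool) -> List[Match]:
    """Append one result record per delimiter-bounded context holding both terms."""
    delims = sorted((m for m in matches if m.term_id == delim_id),
                    key=lambda m: m.start)
    starts = [d.start for d in delims]
    slots: Dict[int, Dict[str, List[int]]] = {}
    for m in matches:
        if m.term_id not in (a_id, b_id):
            continue
        k = bisect_left(starts, m.start)     # first delimiter after the term
        if k == 0 or k == len(delims):
            continue                          # not bracketed by two delimiters
        key = "a" if m.term_id == a_id else "b"
        slots.setdefault(k, {"a": [], "b": []})[key].append(m.start)
    results = []
    for k in sorted(slots):
        a_pos, b_pos = slots[k]["a"], slots[k]["b"]
        if not a_pos or not b_pos:
            continue
        if ordered and min(a_pos) >= max(b_pos):
            continue
        # Span from the start of the front delimiter to the end of the rear one.
        results.append(Match(delims[k].doc_id, result_id,
                             delims[k - 1].start, delims[k].end))
    return matches + results


stream = [Match(1, "D", 0, 0), Match(1, "T1", 1, 2), Match(1, "T2", 5, 6),
          Match(1, "D", 8, 8), Match(1, "T2", 12, 13), Match(1, "D", 17, 17)]
out = context_step(stream, "T1", "T2", "D", "C1", ordered=True)
```

The same function serves phrase, sentence, and paragraph conditions; only the identifier of the delimiter records passed as `delim_id` changes.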

[0017] In the logical condition determination step, the logical condition between the search terms specified in the search condition expression is determined based on one of the following: the matching information output in the neighborhood condition determination step, when the search condition expression includes a neighborhood condition; the matching information output in the context condition determination step, when the expression includes both a neighborhood condition and a context condition, or includes a context condition; or the matching information output in the character string matching step, when the expression includes only a logical condition. The document-level matching information that satisfies the condition is appended to the matching information output by the preceding step and output as the final search result information.
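The rule for choosing the input of the logical condition determination step can be restated as a small selection function. This is only a restatement of the three cases in the paragraph; the function name `logical_input_stage` and the returned stage labels are invented for illustration.

```python
def logical_input_stage(has_neighborhood: bool, has_context: bool) -> str:
    """Name the step whose output feeds the logical condition determination step."""
    if has_context:
        # Context only, or neighborhood plus context: the context step is
        # the last step before the logical step.
        return "context"
    if has_neighborhood:
        return "neighborhood"
    return "string_matching"   # only a logical condition is present


stage_for_logic_only = logical_input_stage(False, False)
stage_for_both = logical_input_stage(True, True)
```

In every case the logical step reads from whichever earlier step ran last, so the steps form a fixed chain in which the unused ones are skipped.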

[0018] One apparatus of the present invention is composed of character string matching means and compound condition determining means, as follows. When a specified search term is matched in a document, the character string matching means outputs, as matching information, the document identifier (the identifier of that document), the identifier of the matched search term (that is, the matching term), and the first and last character positions of the matching term in the document. When a context condition is specified and a character string identifying a context is matched, it outputs, as matching information, the identifier of the document, the identifier of the matched context identification string, and the first and last character positions of that string in the document.

[0019] The compound condition determining means is composed of neighborhood condition determining means, context condition determining means, and logical condition determining means. The neighborhood condition determining means determines the proximity distance condition, expressed as the number of characters between the search terms specified in the search condition expression, based on the matching information output by the character string matching means. It takes the first character position of the preceding search term and the last character position of the following search term that satisfy the condition as the matching information of the determination result, appends this to the matching information output by the character string matching means, and outputs it. The context condition determining means determines the co-occurrence condition of the search terms specified in the search condition expression within the same phrase, the same sentence, or the same paragraph, based on the matching information output by the neighborhood condition determining means. It takes the first character position of the preceding context identification string and the last character position of the following context identification string that satisfy the condition as matching information, appends this to the matching information output by the neighborhood condition determining means, and outputs it. The logical condition determining means determines the logical condition between the search terms specified in the search condition expression, based on the matching information output by the context condition determining means, and outputs the document-level matching information that satisfies the condition as the final search result information.

【００２０】[0020]

[Operation] In character string matching, the document identifier, the identifier of each matching term, and the first and last character positions of that matching term in the document are output as matching information. When a context condition is specified, the identifier of the context identification string and the first and last character positions of each matched context identification string in the document are also output as matching information. The neighborhood condition is therefore judged in two steps: the search terms in the search condition expression are checked against the matching term identifiers, and the inter-character distance condition is judged by comparing the first and last character positions of the matching terms that were found. The context condition is judged in a similar way. The search terms in the search condition expression are checked against the matching term identifiers obtained from character string matching and neighborhood condition judgment, and the co-occurrence range is judged by comparing the positions of consecutive context identification strings with the first and last character positions of the matched terms, confirming that the matched terms co-occur between those positions. The logical condition is judged by checking the search terms in the search condition expression against the matching term identifiers obtained from character string matching, neighborhood condition judgment, and context condition judgment, and then determining whether the matched identifiers satisfy the logical condition in the search condition expression.

[0021] Configuring the character string matching means and the compound condition determining means in this way makes it possible to judge compound conditions such as the neighborhood, context, and logical conditions in a uniform manner, which enables the fine-grained retrieval characteristic of full-text search. Furthermore, if, for example, three microcomputers execute the neighborhood condition judgment, the context condition judgment, and the logical condition judgment respectively, the stages can operate without synchronizing with one another. That is, each microcomputer starts its condition judgment as soon as matching information is stored in its input buffer, so the stages form a pipeline and high-speed compound condition judgment can be realized.

[0022]

[Examples] First, the principle of the method and apparatus of the present invention is explained. In the character string matching means, when document data is input, the document identifier stored at the head of the document is first detected and output as a matching result. Next, when a specified search term is matched in the document, the identifier of the matching term and its matching position in the document, namely the first and last character positions of the matching term, are output as matching information. Thus, as shown in FIG. 13, the matching information for one document consists of the document identifier at the head, followed by the matching information for the search terms. This processing is repeated document by document until all document data has been read.

[0023] A concrete search term matching method is explained using FIG. 4. Suppose, for example, that the search term "文書" (document) is set in the character string matching means and the document "...。文書理解を用いた検索システムである。..." (roughly, "... This is a search system using document understanding. ...") is input. The document identification information and matching term identification information obtained as the output of the character string matching means are then written as (6-1) and (6-2).
(D1,  0,  0)    (6-1)
(T1, Xs, Xe)    (6-2)
In the document identification information (6-1), D1 is the document identifier, and the two fields that follow it are the constant 0. In the matching term identification information (6-2), T1 is the identifier of the search term (hereinafter, the matching term identifier), Xs is the first character position of the matching term found in the document, and Xe is likewise its last character position. In the example of FIG. 4, the document identification information is (D1, 0, 0) and the matching term identification information for "文書" is (T1, 31, 32).
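
As a rough illustration of the record format in (6-1) and (6-2), the following Python sketch emits (identifier, Xs, Xe) triples with 1-based character positions. The function name `match_terms`, the 29 filler characters standing in for the elided text before the first "。", and the use of plain substring search are all assumptions made for this example; the patent performs this step in hardware.

```python
def match_terms(doc_id, text, terms):
    """Return the document record followed by one (term_id, Xs, Xe) record per hit.

    Positions are 1-based character positions, as in the worked example.
    `terms` maps a term identifier such as "T1" to its search string.
    """
    hits = []
    for term_id, term in terms.items():
        start = text.find(term)
        while start != -1:
            # Xs = first character position, Xe = last character position
            hits.append((term_id, start + 1, start + len(term)))
            start = text.find(term, start + 1)
    hits.sort(key=lambda rec: (rec[1], rec[2]))  # order of appearance
    return [(doc_id, 0, 0)] + hits


# Hypothetical stand-in for the sample document: 29 filler characters replace
# the elided text so that the first "。" falls on character 30.
SAMPLE = "x" * 29 + "。文書理解を用いた検索システムである。"

records = match_terms("D1", SAMPLE, {"T1": "文書"})
print(records)
```

With this stand-in document, the record for "文書" carries positions 31 and 32, in line with the FIG. 4 example.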

[0024] Next, the compound condition determining means performs the following judgments. First, the neighborhood condition determining means judges the proximity distance condition, expressed as a number of characters between search terms, based on the matching information output by the character string matching means. Specifically, it computes the character distance between the last character position of the search term specified as preceding in the search condition expression and the first character position of the search term specified as following, and judges whether this distance satisfies the distance and order specified in the neighborhood condition. When the proximity distance condition holds, the first character position of the preceding search term and the last character position of the following search term are taken as matching information for the judgment result, appended to the matching information output by the character string matching means, and output. A concrete example of neighborhood condition processing is explained using FIG. 5. Suppose the neighborhood condition "文書[4C]理解" is set, which retrieves documents in which "文書" (document) and "理解" (understanding) appear in this order within 4 characters of each other, and the document "...。文書理解を用いた検索システムである。..." is input to the character string matching means. First, "文書" and "理解" are set as search terms in the character string matching means. When the document is input, character string matching is performed for these two search terms, and the following document identification information and matching term identification information are obtained.
(D1,  0,  0)    (6-3)
(T1, 31, 32)    (6-4)
(T2, 33, 34)    (6-5)
Here (6-3) is the document identification information, (6-4) is the matching term identification information for "文書", and (6-5) is that for "理解". The neighborhood condition "文書[4C]理解" is then processed on the basis of this information.

[0025] In this example, the last character position of the preceding search term "文書" is 32 and the first character position of the following search term "理解" is 33, so the character distance is (33 - 32) - 1 = 1 - 1 = 0. Because this is smaller than the specified 4 characters, the neighborhood condition "文書[4C]理解" is judged to hold. Finally, as the judgment result, matching information (6-6) is formed with the identifier PID of this neighborhood condition set to P1, Xs set to 31 (the first character position of the preceding search term "文書"), and Xe set to 34 (the last character position of the following search term "理解"). This record is hereinafter called neighborhood condition identification information; the corresponding records for context conditions and logical conditions are called context condition identification information and logical condition identification information, and the three are collectively called compound condition identification information. It is appended to the matching information (6-3), (6-4), and (6-5) output by the character string matching means and output as follows.
(D1,  0,  0)    (6-3)
(T1, 31, 32)    (6-4)
(T2, 33, 34)    (6-5)
(P1, 31, 34)    (6-6)
That is, the result of the neighborhood condition judgment is output in the form (PID, Xs, Xe), the same format as the matching information for a search term, appended to the matching term identification information.
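
The distance calculation and the appended record can be sketched in Python as follows. This is only an illustration of the arithmetic described above: the function name `judge_neighborhood` is hypothetical, and treating "within 4 characters" as a gap of at most 4 is an assumption about the boundary case.

```python
def judge_neighborhood(records, first_id, second_id, max_gap, cond_id):
    """Append (cond_id, Xs, Xe) for each ordered pair of hits within max_gap characters."""
    out = list(records)
    firsts = [r for r in records if r[0] == first_id]
    seconds = [r for r in records if r[0] == second_id]
    for _, xs1, xe1 in firsts:
        for _, xs2, xe2 in seconds:
            # Characters lying strictly between the two terms:
            # (first position of following term - last position of preceding term) - 1
            gap = (xs2 - xe1) - 1
            # gap >= 0 enforces the specified order; max_gap is inclusive (assumption)
            if 0 <= gap <= max_gap:
                out.append((cond_id, xs1, xe2))
    return out


records = [("D1", 0, 0), ("T1", 31, 32), ("T2", 33, 34)]
result = judge_neighborhood(records, "T1", "T2", 4, "P1")
print(result[-1])
```

Because the output keeps the input records and only appends a triple of the same shape, a later stage can consume it without knowing whether a neighborhood condition was evaluated.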

[0026] Next, the context condition determining means judges, based on the matching information output by the neighborhood condition determining means, whether the search terms specified in the search condition expression co-occur within the same phrase, the same sentence, or the same paragraph. The co-occurrence judgment checks whether the two search terms appear, in the order specified in the condition, within the context range that runs from the first character position of a context identification string specified in the condition to the last character position of the next context identification string. When the co-occurrence condition holds, the identifier of this context condition, the first character position of the preceding context identification string, and the last character position of the following context identification string are taken as matching information for the judgment result, appended to the matching information output by the neighborhood condition determining means, and output.

[0027] A concrete example of context condition judgment is explained using FIG. 6. Suppose the context condition "文書[S]検索" is set, which requires "文書" (document) and "検索" (search) to appear in this order and co-occur within the same sentence, and the document "...。文書理解を用いた検索システムである。..." (document identifier = 1) is input to the character string matching means. First, "文書" and "検索" are set in the character string matching means as search terms; in addition, because a context condition is specified, the string "。" used to identify the context is also set. The character string matching means matches these three search terms and, as shown in FIG. 6, outputs the following.
(D1,  0,  0)    (6-7)
(S1, 30, 30)    (6-8)
(T1, 31, 32)    (6-9)
(T3, 39, 40)    (6-10)
(S1, 48, 48)    (6-11)
In the matching term identification information (6-8) and (6-11), S1 is the identifier of the context identification string "。".

[0028] This matching information is sent to the neighborhood condition determining means, but because no neighborhood condition is set in this example, the neighborhood condition determining means outputs the input matching information unchanged, as follows.
(D1,  0,  0)    (6-7)
(S1, 30, 30)    (6-8)
(T1, 31, 32)    (6-9)
(T3, 39, 40)    (6-10)
(S1, 48, 48)    (6-11)
Here (6-7) is the document identification information, (6-9) and (6-10) are the matching term identification information for "文書" and "検索" respectively, and (6-8) and (6-11) are the matching term identification information for "。".

[0029] Next, the co-occurrence condition for the context condition "文書[S]検索" is judged on the basis of this matching information. In this example the context, that is, the extent of the sentence, runs from character 30, the first character position of the context identification string "。" specified in the condition, to character 48, the last character position of the next context identification string "。". In other words, it is the range from (6-8) to (6-11). Because the search terms "文書" and "検索" specified in the condition are contained in this context range in this order, "文書[S]検索" is judged to hold. The order of "文書" and "検索" is determined by comparing the last character position of "文書" (32) with the first character position of "検索" (39): since 32 < 39, "文書" is judged to precede "検索".

[0030] Finally, as the judgment result, matching information (6-12) (hereinafter, context condition identification information) is formed with the identifier CID of this context condition set to C1, the first-position information set to 30 (the first character position of the preceding context identification string "。" specified in the condition), and the last-position information set to 48 (the last character position of the following context identification string "。"). It is appended to the matching information output by the neighborhood condition determining means and output as follows.
(D1,  0,  0)    (6-7)
(S1, 30, 30)    (6-8)
(T1, 31, 32)    (6-9)
(T3, 39, 40)    (6-10)
(S1, 48, 48)    (6-11)
(C1, 30, 48)    (6-12)
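
A minimal Python sketch of the co-occurrence judgment follows, assuming the records are triples of the form (identifier, Xs, Xe). The function name `judge_context` is hypothetical, and "検索" is taken at positions 39 to 40, where it falls in the sample sentence when "。" is character 30.

```python
def judge_context(records, first_id, second_id, delim_id, cond_id):
    """Append (cond_id, Xs, Xe) for each delimiter-bounded range holding both terms in order."""
    out = list(records)
    delims = sorted((r for r in records if r[0] == delim_id), key=lambda r: r[1])
    # Each pair of consecutive context identification strings bounds one context range.
    for left, right in zip(delims, delims[1:]):
        lo, hi = left[1], right[2]  # first position of the preceding one, last of the next
        firsts = [r for r in records if r[0] == first_id and lo < r[1] and r[2] < hi]
        seconds = [r for r in records if r[0] == second_id and lo < r[1] and r[2] < hi]
        # Order check: last position of the first term < first position of the second term.
        if any(f[2] < s[1] for f in firsts for s in seconds):
            out.append((cond_id, lo, hi))
    return out


records = [("D1", 0, 0), ("S1", 30, 30), ("T1", 31, 32), ("T3", 39, 40), ("S1", 48, 48)]
result = judge_context(records, "T1", "T3", "S1", "C1")
print(result[-1])
```

Swapping the two term identifiers leaves the records unchanged, because the order check then fails.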

[0031] Finally, the logical condition determining means judges the logical conditions between the search terms specified in the search condition expression, based on the matching information output by the context condition determining means, and outputs the per-document matching information designated by the condition as the final search result information. A concrete example of logical condition judgment is explained using FIG. 7. Suppose the logical condition "文書[AND]検索" is set, which requires the two search terms "文書" (document) and "検索" (search) to appear in the same document, and the document "...。文書理解を用いた検索システムである。..." is input to the character string matching means. First, "文書" and "検索" are set as search terms in the character string matching means; when the document in the figure is input, the character string matching means yields the following matching term identification information.
(D0,  0,  0)    (6-13)
(T1, 31, 32)    (6-14)
(T3, 39, 40)    (6-15)
Here (6-13) is the document identification information, and (6-14) and (6-15) are the matching term identification information for "文書" and "検索" respectively. If no neighborhood condition or context condition is set, this information passes through the neighborhood condition determining means and the context condition determining means and is input unchanged to the logical condition determining means, where the logical condition "文書[AND]検索" is judged.

[0032] In this example, the logical condition determining means checks that the matching terms "文書" and "検索" both exist in one document and judges that the logical condition "文書[AND]検索" holds. It then takes the document identification information (6-13) that satisfied the condition and, as the judgment result, forms matching information (6-16) (hereinafter, logical condition identification information) from the identifier LID of this logical condition, set to L1, and the first and last character positions of the document. This is appended to the output of the context condition determining means and output as follows.
(D0,  0,  0)    (6-13)
(T1, 31, 32)    (6-14)
(T3, 39, 40)    (6-15)
(L1,  0, 99)    (6-16)
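
The AND judgment reduces to a set-membership test over the identifiers present in one document's records, as the Python sketch below suggests. The function name `judge_and` is hypothetical, the document bounds 0 and 99 are simply taken from record (6-16), and returning an empty list for a non-matching document is an assumption about how such documents are dropped.

```python
def judge_and(records, id_a, id_b, cond_id, doc_first, doc_last):
    """Return records plus (cond_id, doc_first, doc_last) if both terms occur, else []."""
    present = {rec[0] for rec in records}
    if id_a in present and id_b in present:
        return list(records) + [(cond_id, doc_first, doc_last)]
    return []  # assumption: documents that fail the condition produce no output


records = [("D0", 0, 0), ("T1", 31, 32), ("T3", 39, 40)]
result = judge_and(records, "T1", "T3", "L1", 0, 99)
print(result[-1])
```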

[0033] As described above, the compound condition determining means is made up of three parts: a neighborhood condition determining means, which judges the proximity distance condition expressed as a number of characters between the search terms specified in the search condition expression; a context condition determining means, which judges the co-occurrence of the specified search terms within the same phrase, sentence, or paragraph; and a logical condition determining means, which judges the logical conditions between the specified search terms. Using it enables the fine-grained narrowing search characteristic of full-text search. Moreover, because the neighborhood, context, and logical condition judgment processes that make up the compound condition judgment all use exactly the same input and output information format, they can be distributed and pipelined, which makes high-speed compound condition judgment possible.
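
Because every stage reads and writes the same record format, the stages can be chained through buffers and run independently. The Python sketch below imitates this with one thread per stage and a queue as each input buffer. It is a software analogy under stated assumptions, not the patent's microcomputer arrangement: `run_stage`, `run_pipeline`, and the three placeholder stage functions are hypothetical names, and each placeholder merely appends one fixed record.

```python
import queue
import threading


def run_stage(judge, inbox, outbox):
    """Process each batch as soon as it arrives in this stage's input buffer."""
    while True:
        batch = inbox.get()
        if batch is None:  # end-of-stream marker
            outbox.put(None)
            return
        outbox.put(judge(batch))


def run_pipeline(judges, batches):
    """Chain the stages through queues; each stage runs in its own thread."""
    buffers = [queue.Queue() for _ in range(len(judges) + 1)]
    workers = [
        threading.Thread(target=run_stage, args=(judge, buffers[i], buffers[i + 1]))
        for i, judge in enumerate(judges)
    ]
    for worker in workers:
        worker.start()
    for batch in batches:
        buffers[0].put(batch)
    buffers[0].put(None)
    results = []
    while True:
        item = buffers[-1].get()
        if item is None:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results


# Placeholder stages: each appends one record in the shared (id, Xs, Xe) format.
def neighborhood(batch):
    return batch + [("P1", 31, 34)]


def context(batch):
    return batch + [("C1", 30, 48)]


def logical(batch):
    return batch + [("L1", 0, 99)]


results = run_pipeline([neighborhood, context, logical], [[("D1", 0, 0)]])
print(results)
```

No stage waits on a global clock or on the other stages; each simply blocks on its own input queue, which mirrors the idea of starting work when matching information reaches the input buffer.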

[0034] Next, a first embodiment of the present invention is explained using FIG. 1. This embodiment consists of a character string matching circuit 200 and a compound condition determining circuit 300. The character string matching circuit 200 matches the search terms sent from the search control means 101 (FIG. 2) against the document data 204 read from the character string storage means 105 (FIG. 2) under the control of the storage device control means 104 (FIG. 2), and, when a match is found, sends the matching term information to the compound condition determining circuit 300 as matching result 205. The compound condition determining circuit 300 judges, on the basis of the compound condition sent from the search control means 101, whether the compound condition is satisfied for the matching term identification information output by the character string matching circuit 200. When it is, the circuit outputs the corresponding matching term identification information and compound condition identification information, document by document, as judgment result 206.

[0035] First, the character string matching circuit 200 is described in detail. It consists of a term comparator 210, a document identifier detection circuit 220, a character count circuit 230, a position information adding circuit 800, and a search term length table 250. The term comparator 210 matches the specified search terms against the incoming document data 204. When a match is found, it sends the matching term identifier 211 (a positive integer), which is the identifier of that search term, to the position information adding circuit 800; when there is no match, it sends 0. That is, a matching term identifier 211 of 0 is invalid data, and a positive integer is valid data representing the identifier of a matching term. The term comparator disclosed in Japanese Patent Laid-Open No. 60-105039 can be used as the term comparator 210. As shown in FIG. 8, the document identifier detection circuit 220 consists of registers 224 to 228 and 229a, a comparator 223, and a selector 229. Register 224 is initialized with the top-of-text code (TOT) that is attached to the head of the document data of each document, and register 229a is initialized with 0. The document data 204, which arrives in 8-bit units, is fed successively into the four-stage shift register 228b made up of registers 225 to 228. The comparator 223 compares the final-stage output 228a with the TOT stored in register 224. Registers 225 to 228 send their outputs to the selector 229 as a 32-bit output 222. When the final-stage output 228a is the TOT, the comparator 223 issues the select signal 223a, and the selector 229 selects the 32-bit output 222 as the document identifier 221 and sends it to the position information adding circuit 800. While the select signal 223a is not issued, the selector 229 selects the 0 stored in register 229a. In other words, the TOT attached to the head of each document is detected in the document data and the 32-bit document identifier 221 stored after it is sent to the position information adding circuit 800, while a document identifier 221 of 0 is sent for as long as no TOT is detected.

[0036] The character count circuit 230 counts, for each document, the number of 8-bit character codes in the incoming document data 204 from the head of the document, converts the count into a number of characters at 2 bytes per character, and sends it to the position information adding circuit 800. The TOT detection signal 223a output by the document identifier detection circuit 220 is used to reset the character code count for each document. As shown in FIG. 9, the search term length table 250 stores the length of each search term in the slot addressed by its matching term identifier 211. It receives a matching term identifier from the position information adding circuit 800 and returns the corresponding search term length 873 to that circuit. In the figure, the table holds a term length of 2 for the search term "理解" (understanding), whose matching term identifier is 1, and a term length of 4 for the search term "システム" (system), whose matching term identifier is 2. Thus, for example, on receiving 1 as the matching term identifier 211 for "理解", the table returns 2 as the search term length.

[0037] As shown in FIG. 10, the position information adding circuit 800 consists of registers 810 to 816, OR gates 880 and 881, selectors 820 to 822, a subtractor 830, and an adder 831. As initial settings, registers 813, 814, and 815 are set to 0 and output 0 to selectors 820, 821, and 822 respectively, and register 816 is set to 1 and outputs 1 to the adder 831. Selectors 820 to 822 select the Z port when select signals 890 and 891 are both 0. Each selector then selects register 813, 814, or 815 respectively, so 0 is output from selectors 820 to 822 as matching information 205. When select signal 890 is 1 and select signal 891 is 0, the X port is selected; when select signal 890 is 0 and select signal 891 is 1, the Y port is selected. Each time a document identifier 221 is sent from the document identifier detection circuit 220, it is stored in register 810 and output to selector 820 and OR gate 880. A document identifier 221 of 0 indicates that something other than a document identifier was detected. When the document identifier 221 is stored in register 810, OR gate 880 takes the logical OR of all bits of the document identifier 221 and sends the result, select signal 890, to selectors 820 to 822.

[0038] Therefore, when a document identifier 221 is stored in register 810, its value is not 0, so the result 890 of OR gate 880 is 1. This 1 is sent to selectors 820 to 822, and each selector selects the X port. As matching information 205, the document identifier 221 stored in register 810 is output as the identifier, the 0 stored in register 814 as the first-position information, and the 0 stored in register 813 as the last-position information; together these form the document identification information. Accordingly, as shown in FIG. 11, the document identification information consists of a 32-bit document identifier and 32 bits of fixed 0, in which both the first-position information and the last-position information are 0.
[0039] Each time a collation term identifier 211 is sent from the term comparator 210, the register 811 stores it, and the collation term identifier 211 is output to the selector 820, the OR gate 881, and the search term length table 250. The search term length 873 corresponding to the collation term identifier 211 is then read from the search term length table 250 and output to the subtracter 830. A collation term identifier 211 of 0 indicates that no search term has been matched. Likewise, each time a character count 231 is sent from the character count circuit 230, the register 812 stores it as the tail position information 812a of the collation term and outputs it to the subtracter 830 and the selector 822. The subtracter 830 subtracts the search term length 873 from the tail position information 812a, the adder 831 adds 1, and the resulting head position information 831a of the collation term is output to the selector 821. When the collation term identifier 211 is stored in the register 811, the OR gate 881 takes the logical OR across the bits of the collation term identifier 211 and sends the result, the select signal 891, to the selectors 820 to 822.

[0040] Therefore, when the collation term identifier 211 is stored in the register 811, its value is not 0, so the output of the OR gate 881 becomes 1. The select signal 891 is accordingly sent as 1 to the selectors 820 to 822, and each selector selects the Y port. As the collation information 205, the collation term identifier 211 stored in the register 811 is output as the identifier, the head position information 831a output from the adder 831 as the head position information, and the tail position information 812a stored in the register 812 as the tail position information; together these form the collation term identification information. Accordingly, as shown in FIG. 12, the collation term identification information is output as a 32-bit collation term identifier followed by 32-bit collation position information consisting of 16-bit head position information and 16-bit tail position information.
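The head position arithmetic (tail position minus term length, plus 1) and the 32 + 16 + 16 bit record layout can be illustrated with a short sketch. The function names are hypothetical. Placing the identifier in the upper 32 bits and the head position above the tail position inside the lower 32 bits is an assumption made for illustration; the text fixes only the field widths.

```python
from typing import Tuple

ID_BITS = 32   # width of the identifier field
POS_BITS = 16  # width of each position field


def head_position(tail: int, term_length: int) -> int:
    """Subtracter 830 followed by adder 831: tail - length + 1."""
    return tail - term_length + 1


def pack_record(identifier: int, head: int, tail: int) -> int:
    """Pack a record into one 64-bit word (field order is an assumption)."""
    if not 0 <= identifier < (1 << ID_BITS):
        raise ValueError("identifier does not fit in 32 bits")
    if not (0 <= head < (1 << POS_BITS) and 0 <= tail < (1 << POS_BITS)):
        raise ValueError("position does not fit in 16 bits")
    return (identifier << (2 * POS_BITS)) | (head << POS_BITS) | tail


def unpack_record(word: int) -> Tuple[int, int, int]:
    """Inverse of pack_record."""
    mask = (1 << POS_BITS) - 1
    return (word >> (2 * POS_BITS), (word >> POS_BITS) & mask, word & mask)
```

With a four-character term whose last character is at position 44, `head_position(44, 4)` gives 41.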

[0041] As is clear from the above, collation information can be expressed as (Tn, Xs, Xe). Here Tn is a document identifier or a collation term identifier, Xs is the head position information of the collation term, and Xe is its tail position information. In the collation information example (T2, 33, 34) in FIG. 4, therefore, T2 is the collation term identifier, 33 is the head position information of the collation term, and 34 is its tail position information.

[0042] Through the operation of the term comparator 210, the document identifier detection circuit 220, the character count circuit 230, the search term length table 250, and the position information addition circuit 800 described above, the character string matching circuit 200 outputs collation information as shown in FIG. 13 for each document. That is, for each document the document identification information comes first, followed by the collation term identification information. The document identification information has the same structure as the collation term identification information and can be regarded as collation term identification information whose position information is 0. Because it can be handled in the same way as collation term identification information, the whole stream can be processed uniformly without treating the document identification information as structurally special. This concludes the detailed description of the character string matching circuit 200.
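Because a document record is simply a record whose position fields are both 0, a consumer can recover the per-document grouping from the flat stream with a single pass. The following is a minimal sketch under that reading; `split_by_document` is a hypothetical helper, not something named in the text.

```python
from typing import List, Tuple

Record = Tuple[str, int, int]  # (identifier, head position, tail position)


def split_by_document(records: List[Record]) -> List[Tuple[str, List[Record]]]:
    """Group a flat record stream into (document id, term records) pairs."""
    documents: List[Tuple[str, List[Record]]] = []
    for record in records:
        identifier, head, tail = record
        if head == 0 and tail == 0:
            # Document identification information starts a new group.
            documents.append((identifier, []))
        elif documents:
            documents[-1][1].append(record)
        else:
            raise ValueError("term record seen before any document record")
    return documents
```

The same loop handles both record kinds, which mirrors the point that no separate structure is needed for document identification information.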

[0043] Next, the operation of the character string matching circuit 200 is explained using a concrete example. The search condition expression
"Q = ((文書 [4C] 理解) [S] システム) [AND] (文書 [S] 検索)" (7-1)
is used as the example, where 文書 means "document", 理解 "understanding", システム "system", and 検索 "search". In this example, five search terms, "T1: 文書", "T2: 理解", "T3: 検索", "T4: システム", and "S1: 。", are sent from the search control means 101; the search terms are set in the term comparator 210 and their lengths in the search term length table 250. Here T1, T2, T3, T4, and S1 are the collation term identifiers of the search terms "文書", "理解", "検索", "システム", and "。", respectively. "S1: 。" corresponds to the context condition "[S]", that is, to the designation of the sentence as the context, and serves to detect "。" (the Japanese full stop) as the context identification character string of a sentence (hereinafter called a context marker).

[0044] Suppose that the document
"・・・。文書理解を用いた検索システムである。・・・・" (7-2)
("... It is a retrieval system using document understanding. ...") is input, with document identifier D1. When this document data is input to the character string matching circuit 200, the following collation information, shown in FIG. 14, is output as the collation result 205.
(D1, 0, 0) (8-1)
(S1, 30, 30) (8-2)
(T1, 31, 32) (8-3)
(T2, 33, 34) (8-4)
(T3, 39, 40) (8-5)
(T4, 41, 44) (8-6)
(S1, 48, 48) (8-7)
(8-1) is the document identification information; in it, D1 is the document identifier and the two fields that follow are the constant 0. (8-2) and (8-7) are the collation term identification information of the context marker "。", (8-3) that of "文書", (8-4) that of "理解", (8-5) that of "検索", and (8-6) that of "システム". S1 is the collation term identifier of "。", T1 of "文書", T2 of "理解", T3 of "検索", and T4 of "システム". The collation information 205 of (8-1) to (8-7) is sent as the input to the composite condition determination circuit 300.
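The records (8-1) to (8-7) can be reproduced with a small software model of the matching step. This is a sketch only: a naive scan stands in for the hardware term comparator, the function name `collate` is hypothetical, and the elided text before the first "。" is replaced by 29 filler characters so that the first full stop falls at position 30, which is an assumption chosen to match the positions in the example.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, int, int]  # (identifier, head position, tail position)

TERMS: Dict[str, str] = {
    "文書": "T1", "理解": "T2", "検索": "T3", "システム": "T4", "。": "S1",
}

# Assumption: 29 filler characters stand in for the elided leading text.
TEXT = "・" * 29 + "。文書理解を用いた検索システムである。"


def collate(doc_id: str, text: str, terms: Dict[str, str]) -> List[Record]:
    """Emit a document record, then one record per term occurrence."""
    records: List[Record] = [(doc_id, 0, 0)]
    # `tail` plays the role of the character count (1-based).
    for tail in range(1, len(text) + 1):
        for term, identifier in terms.items():
            head = tail - len(term) + 1  # tail - length + 1
            if head >= 1 and text[head - 1:tail] == term:
                records.append((identifier, head, tail))
    return records


for record in collate("D1", TEXT, TERMS):
    print(record)
```

Because the scan advances by tail position, the records come out in ascending order of tail position, as in the listing above.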

[0045] Next, the condition determination processing of the composite condition determination circuit 300 is described. As shown in FIG. 1, the composite condition determination circuit 300 consists of three microcomputers: MPUa 301, MPUb 302, and MPUc 303. The microcomputer MPUa 301 executes the neighborhood condition determination program 310, the microcomputer MPUb 302 executes the context condition determination program 320, and the microcomputer MPUc 303 executes the logical condition determination program 330. Buffers 350, 360, and 370, which use first-in first-out (FIFO) memory, are placed between the MPUs and are used to pass data from one to the next.
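The staged arrangement, in which each program reads from one FIFO buffer and writes to the next, can be sketched as below. This is a simplified model with hypothetical names (`run_stage`, `run_pipeline`): the stages run one after another rather than concurrently on separate MPUs, and the stage programs are supplied by the caller.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Sequence, Tuple

Record = Tuple[str, int, int]
Program = Callable[[Record], Iterable[Record]]


def run_stage(program: Program, fifo_in: Deque[Record],
              fifo_out: Deque[Record]) -> None:
    """Drain fifo_in in arrival order, appending the program's output."""
    while fifo_in:
        fifo_out.extend(program(fifo_in.popleft()))


def run_pipeline(records: Iterable[Record],
                 programs: Sequence[Program]) -> List[Record]:
    """Feed records through the programs, one FIFO per hand-off."""
    fifo: Deque[Record] = deque(records)  # plays the role of buffer 350
    for program in programs:              # e.g. programs 310, 320, 330
        next_fifo: Deque[Record] = deque()
        run_stage(program, fifo, next_fifo)
        fifo = next_fifo
    return list(fifo)
```

A FIFO preserves arrival order, so a stage that passes records through unchanged leaves the stream exactly as it received it.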

[0046] First, the determination processing of the neighborhood condition determination program 310 is described. The neighborhood condition determination program 310 reads the collation information 205 that the character string matching circuit 200 has sent to the buffer 350 and determines whether it satisfies the neighborhood condition specified as the search information 202. An example of a neighborhood condition is the expression "文書 [4C] 理解" in (7-1). This expression asks for documents in which the two search terms "文書" (document) and "理解" (understanding) appear in this order and within 4 characters of each other. Treating the search expression as a kind of arithmetic expression, "文書" is called the forward operand Ta, "理解" the backward operand Tb, and "[4C]" the operation. The identifier representing this neighborhood condition is "Pi", and Pi is assigned a code that can be distinguished from collation term identifiers. With these definitions, a neighborhood condition can be written as "Pi:Ta[nC]Tb". The following description uses this notation.

[0047] As noted earlier, other neighborhood condition operations include
"Pi:Ta[nC,mC]Tb"
"Pi:Ta<nC>Tb"
"Pi:Ta[nc]Tb"
"Pi:Ta[nc,mc]Tb"
"Pi:Ta<nc>Tb"
and so on. The procedure for processing a neighborhood condition is described in detail with reference to FIG. 15. First, the iteration step 1000 repeats processing steps 1001 to 1010 until all of the collation information 205 in the buffer 350 has been read, that is, until the collation information of the last document is exhausted.
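To make the "Pi:Ta[nC]Tb" notation concrete, the sketch below parses that one form into its four parts. The names `NeighborhoodCondition` and `parse_condition` are hypothetical, and the other operation forms listed above are deliberately rejected because their semantics are not spelled out in this passage.

```python
import re
from typing import NamedTuple


class NeighborhoodCondition(NamedTuple):
    ident: str      # Pi, the neighborhood condition identifier
    forward: str    # Ta, the forward operand
    max_chars: int  # n in [nC]
    backward: str   # Tb, the backward operand


# Matches only the basic form, e.g. "P1:T1[4C]T2".
_BASIC_FORM = re.compile(r"^(\w+):(\w+)\[(\d+)C\](\w+)$")


def parse_condition(text: str) -> NeighborhoodCondition:
    """Parse 'Pi:Ta[nC]Tb'; raise ValueError for any other form."""
    match = _BASIC_FORM.match(text)
    if match is None:
        raise ValueError(f"unsupported neighborhood condition: {text!r}")
    ident, forward, count, backward = match.groups()
    return NeighborhoodCondition(ident, forward, int(count), backward)
```

For example, `parse_condition("P1:T1[4C]T2")` returns the identifier `P1`, forward operand `T1`, limit 4, and backward operand `T2`.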

[0048] In the collation information read processing step 1001, one item of collation information 205 is read from the buffer 350 and output to the buffer 360. In the collation information identification processing step 1002, the collation information 205 read in step 1001 is examined to see whether it is document identification information or collation term identification information: if the lower 32 bits of the collation information are 0 (zero), it is judged to be document identification information. In that case, the per-document initialization processing step 1004 is executed, which clears to 0 (zero) the forward operand buffer 311 used as a work area. If the collation information 205 is collation term identification information, the backward operand identification processing step 1003 is executed. This step examines the collation term identifier in the collation term identification information and determines whether the collation term is specified as a backward operand in a neighborhood condition. If it is a backward operand, its distance from the collation term identification information stored in the forward operand buffer 311 (described below) is computed, and it is determined whether the specified neighborhood condition is satisfied.

[0049] Hereinafter, collation term identification information for a search term specified as a forward operand is called forward collation term identification information, and collation term identification information for a search term specified as a backward operand is called backward collation term identification information. In the forward operand buffer iteration step 1005, the proximity condition is evaluated between the forward collation term identification information stored in the forward operand buffer 311 and the backward collation term identification information specified as the backward operand. In this evaluation, the forward collation term identification information read processing step 1006 first reads forward collation term identification information from the forward operand buffer 311. Next, the proximity condition determination processing step 1007 compares the position information of the forward collation term identification information just read with that of the backward collation term identification information and determines whether the specified proximity condition is satisfied. If it is, the determination result is output as collation information to the buffer 360 and to the forward operand buffer 311.

[0050] In the forward operand identification processing step 1009, if the collation term identifier of the collation term identification information is specified as a forward operand, the collation term identification information is output to the forward operand buffer 311, even when the same identifier is also specified as a backward operand. This is needed when, for example, "Pi:Ta[nC]Tb" and "Pj:Tb[nC]Tc" are both specified as neighborhood conditions. Tb is then both a backward operand and a forward operand, so it must go through both the backward operand identification processing step 1003 and the forward operand identification processing step 1009. For this reason the two steps are performed separately. Repeating the above processing steps over the collation information stored in the buffer 350 implements the neighborhood condition determination.
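Steps 1000 to 1010 can be summarized in a compact software sketch. It is an illustrative reading of the flow above rather than the program 310 itself: the names `Condition` and `judge_neighborhood` are hypothetical, only the "[nC]" operation is modeled, and the character distance is taken as the number of characters strictly between the two terms (backward head minus forward tail minus 1).

```python
from typing import List, NamedTuple, Tuple

Record = Tuple[str, int, int]  # (identifier, head position, tail position)


class Condition(NamedTuple):
    ident: str      # Pi
    forward: str    # Ta
    backward: str   # Tb
    max_chars: int  # n in [nC]


def judge_neighborhood(records: List[Record],
                       conditions: List[Condition]) -> List[Record]:
    out: List[Record] = []             # stands in for buffer 360
    forward_buffer: List[Record] = []  # stands in for forward operand buffer 311
    for record in records:             # steps 1000 and 1001
        ident, head, tail = record
        out.append(record)             # every input record is passed through
        if head == 0 and tail == 0:    # step 1002: document identification info
            forward_buffer.clear()     # step 1004
            continue
        hits: List[Record] = []
        for cond in conditions:        # step 1003: is this a backward operand?
            if ident != cond.backward:
                continue
            for f_ident, f_head, f_tail in forward_buffer:  # steps 1005, 1006
                if f_ident != cond.forward:
                    continue
                gap = head - f_tail - 1                      # step 1007
                if 0 <= gap <= cond.max_chars:
                    hits.append((cond.ident, f_head, tail))  # step 1008
        out.extend(hits)
        forward_buffer.extend(hits)    # results are also kept for later conditions
        if any(ident == cond.forward for cond in conditions):  # step 1009
            forward_buffer.append(record)                       # step 1010
    return out
```

Running it on a stream containing (T1, 31, 32) followed by (T2, 33, 34) with the condition `Condition("P1", "T1", "T2", 4)` emits (P1, 31, 34) immediately after the T2 record.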

[0051] The above procedure is now explained with a concrete example, using the search condition expression (7-1), "Q = ((文書 [4C] 理解) [S] システム) [AND] (文書 [S] 検索)". Each condition determination program is given the condition expression that the search control means 101 has analyzed and separated into individual conditions. Specifically, the neighborhood condition determination program 310 is given the neighborhood condition part of expression (7-1), "文書 [4C] 理解", in the form "P1:T1[4C]T2", using the neighborhood condition identifier P1 and the search term identifiers T1 and T2 that correspond to "文書" and "理解". If the document shown in (7-2),
"・・・。文書理解を用いた検索システムである。・・・・"
is input, the character string matching circuit 200 outputs the following items (8-1) to (8-7) to the buffer 350 as collation information, as described above. This collation information is shown in FIG. 17.

[0052]
(D1, 0, 0) (8-1)
(S1, 30, 30) (8-2)
(T1, 31, 32) (8-3)
(T2, 33, 34) (8-4)
(T3, 39, 40) (8-5)
(T4, 41, 44) (8-6)
(S1, 48, 48) (8-7)
(8-1) is the document identification information, and D1 is the document identifier. (8-2) and (8-7) are the collation term identification information of "。". (8-3), (8-4), (8-5), and (8-6) are the collation term identification information of "文書", "理解", "検索", and "システム", respectively. S1, T1, T2, T3, and T4 are the collation term identifiers of "。", "文書", "理解", "検索", and "システム", respectively. The neighborhood condition determination under these conditions is explained step by step with reference to FIG. 15. In the initial state, as shown in the initial state of FIGS. 16a and 16b, the collation information (8-1) to (8-7) is stored in the buffer 350, and the forward operand buffer 311 and the buffer 360 have been cleared to 0.

[0053] The neighborhood condition determination program reads the collation information 205 from the buffer 350 one item at a time and evaluates the neighborhood condition "P1:T1[4C]T2". In step 1, the read processing step 1001 is executed: the collation information (8-1), that is, (D1, 0, 0), is read into the program's work area as shown in step 1 of FIG. 16 and output unchanged to the buffer 360 as collation information. Next, the collation information identification processing step 1002 is executed to check whether the collation information (D1, 0, 0) is document identification information. Its last two fields are both 0 (zero), that is, its lower 32 bits are 0 (zero), so it is judged to be document identification information. The initialization processing step 1004 is therefore executed, and the forward operand buffer 311, an internal work area, is cleared to zero. Then, in step 2, the collation information read processing step 1001 is executed again: the collation information (8-2), that is, (S1, 30, 30), is read and likewise output unchanged to the buffer 360. Next, the collation information identification processing step 1002 checks whether the collation information (8-2) is document identification information or collation term identification information. Because the lower 32 bits of (S1, 30, 30) are not 0 (zero), it is judged to be collation term identification information rather than document identification information. The backward operand identification processing step 1003 then checks whether this collation term identification information corresponds to a backward operand specified in the neighborhood condition.

[0054] The search term specified as the backward operand of the neighborhood condition "P1:T1[4C]T2" is T2, and the collation term S1 is not a backward operand, so steps 1005 to 1008, that is, the neighborhood condition determination, are skipped, and the forward operand identification processing step 1009 is executed next. This step checks whether the collation term identification information (8-2) corresponds to the forward operand of the neighborhood condition "P1:T1[4C]T2". Because the collation term is S1, it is not a forward operand, so processing step 1010 is not executed; the processing of this item ends without anything being stored in the forward operand buffer 311.

[0055] In step 3, the iteration process 1000 executes the read processing step 1001: the third item of collation information (8-3), that is, (T1, 31, 32), is read and, as with the second input, output to the buffer 360. The collation information identification processing step 1002 is then executed to check whether it is document identification information or collation term identification information. Since (8-3) is collation term identification information, the backward operand identification processing step 1003 is executed; since it is not a backward operand, the forward operand identification processing step 1009 is executed. The collation term identification information (8-3), that is, (T1, 31, 32), is specified as the forward operand of the neighborhood condition "P1:T1[4C]T2", so the forward operand storage processing step 1010 is executed and the item is stored in the forward operand buffer as shown in step 3 of FIG. 16.

[0056] In step 4, the iteration process 1000 executes the read processing step 1001: the fourth item of collation information (8-4), that is, (T2, 33, 34), is read and, as with the third input, output to the buffer 360. The collation information identification processing step 1002 then checks whether it is document identification information or collation term identification information. Since (8-4) is collation term identification information, the backward operand identification processing step 1003 is executed. (8-4), that is, (T2, 33, 34), is specified as the backward operand of the neighborhood condition "P1:T1[4C]T2", so the proximity condition determination of processing steps 1005 to 1008 is executed. First, the forward operand buffer read processing step 1006 reads the forward collation term identification information (8-3) stored in the forward operand buffer 311. Next, the proximity condition determination processing step 1007 examines the character distance between the forward collation term identification information (8-3) and the backward collation term identification information (8-4). The tail position Xe of (8-3), that is, (T1, 31, 32), is 32, and the head position of (8-4), that is, (T2, 33, 34), is 33, so the character distance between them is 0, which satisfies the specified condition of 4 characters or fewer. The determination result output processing step 1008 is therefore executed. The result is the neighborhood condition identification information (P1, 31, 34) (FIG. 17), whose head position is 31, the head position of (8-3), whose tail position is 34, the tail position of (8-4), and whose identifier is P1. It is output to the forward operand buffer 311 and the buffer 360 as shown in step 4 of FIGS. 16a and 16b. Next, the iteration process 1000 executes the read processing step 1001 again: the fifth item of collation information (8-5), that is, (T3, 39, 40), is read and, as with the fourth input, output to the buffer 360. The collation information identification processing step 1002 then checks whether it is document identification information or collation term identification information. Because the collation term identification information (8-5) is not specified in the neighborhood condition, neither the backward operand processing of step 1003 nor the forward operand processing of step 1009 applies, and processing moves to the next input. The neighborhood condition determination is repeated in the same way up to the last item of collation term information (8-7).

[0057] As a result of the neighborhood condition determination described above, the collation term identification information (9-1) to (9-8) shown below (FIG. 17) is output to the buffer 360.
(D1, 0, 0) (9-1)
(S1, 30, 30) (9-2)
(T1, 31, 32) (9-3)
(T2, 33, 34) (9-4)
(P1, 31, 34) (9-5)
(T3, 39, 40) (9-6)
(T4, 41, 44) (9-7)
(S1, 48, 48) (9-8)
The point to note is that the neighborhood condition determination result (9-5) is also stored as collation term identification information, sorted in ascending order of tail position together with the other items. As described later, this means that context condition determination need not check the inclusion relationship for every combination of context and collation term, which reduces the processing load. This collation term identification information is sent to the context condition determination program 320.
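The ordering property noted above can be checked mechanically. The helper below is a hypothetical sketch, not part of the described programs: it verifies that, within each document, records appear in non-decreasing order of tail position, treating a record with both positions equal to 0 as the start of a new document.

```python
from typing import List, Tuple

Record = Tuple[str, int, int]  # (identifier, head position, tail position)


def tails_ascending(records: List[Record]) -> bool:
    """Return True if tail positions never decrease within a document."""
    last_tail = 0
    for _ident, head, tail in records:
        if head == 0 and tail == 0:
            last_tail = 0  # a document record restarts the ordering
            continue
        if tail < last_tail:
            return False
        last_tail = tail
    return True
```

A neighborhood result inherits the tail position of its backward operand, so emitting it directly after that operand, as in (9-4) and (9-5), keeps the stream in order without a separate sort.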

[0058] Next, the determination processing of the context condition determination program 320 is described. The context condition determination program 320 reads the collation information that the neighborhood condition determination program 310 has sent to the buffer 360 and determines whether it satisfies the context condition specified as the search information 202. An example of a context condition is "文書 [S] 検索" in (7-1). This expression asks for documents in which the two search terms "文書" (document) and "検索" (search) appear in this order within the same sentence. The identifier representing this context condition (hereinafter called the context condition identifier) is Ci, and Ci is assigned a code that can be distinguished from collation term identifiers. With these definitions, a context condition can be written as "Ci:Ta[S]Tb". The following description uses this notation. Other context conditions include
"Ci:Ta[P]Tb",
"Ci:Ta[PH]Tb",
"Ci:Ta[p]Tb",
"Ci:Ta[s]Tb",
"Ci:Ta[ph]Tb"
and so on.
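The meaning of "Ci:Ta[S]Tb" (Ta and then Tb within one sentence) can be illustrated with a deliberately simple sketch. It is not the determination method of program 320; it only shows the condition's semantics on a record stream. The function name and the default marker identifier "S1" are assumptions for illustration.

```python
from typing import List, Tuple

Record = Tuple[str, int, int]  # (identifier, head position, tail position)


def sentence_cooccurs(records: List[Record], forward: str, backward: str,
                      marker: str = "S1") -> bool:
    """True if `forward` is followed by `backward` before the next marker."""
    seen_forward = False
    for ident, head, tail in records:
        if (head == 0 and tail == 0) or ident == marker:
            seen_forward = False   # a new document or sentence starts
        elif ident == forward:
            seen_forward = True
        elif ident == backward and seen_forward:
            return True
    return False
```

Because the stream is ordered by tail position, one forward pass with a single flag is enough here; no pairwise comparison of records is needed.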

[0059] The principle of context condition determination will be explained using the output example of the neighborhood condition determination process shown in FIG. 18. Assume that the document "…。文書理解を用いた検索システムである。…" ("... This is a retrieval system using document understanding. ...") shown in (7-2) is input. As described above, the neighborhood condition determination program 310 then sends the following items (9-1) to (9-8), shown in the figure, as collation information to the context condition determination program 320 via the buffer 360.
    (D1,  0,  0)    (9-1)
    (S1, 30, 30)    (9-2)
    (T1, 31, 32)    (9-3)
    (T2, 33, 34)    (9-4)
    (P1, 31, 34)    (9-5)
    (T3, 39, 40)    (9-6)
    (T4, 41, 44)    (9-7)
    (S1, 48, 48)    (9-8)
Item (9-1) is document identification information: D1 is the document identifier, and the two fields that follow it are the constant 0. Items (9-2) and (9-8) are collation term identification information for the context marker "。". Likewise, (9-3), (9-4), (9-6), and (9-7) are the collation term identification information for "document", "understanding", "search", and "system", respectively. Here S1, T1, T2, T3, and T4 are the collation term identifiers of "。", "document", "understanding", "search", and "system", respectively. Item (9-5) is the neighborhood condition identification information produced when the neighborhood condition "document [4C] understanding" was matched; P1 is the neighborhood condition identifier of that condition.
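The record stream above can be modelled as plain (identifier, start, end) triples. The following is a minimal sketch for illustration only: the `Record` type and the `kind` and `sentence_span` helpers are hypothetical names, and classifying a record by the first letter of its identifier is an assumption made for readability, not something the specification prescribes.

```python
from typing import List, NamedTuple, Optional, Tuple


class Record(NamedTuple):
    ident: str   # identifier such as "D1", "S1", "T1", "P1"
    start: int   # start position within the document
    end: int     # end position within the document


# Collation information (9-1) to (9-8) from the example.
RECORDS: List[Record] = [
    Record("D1", 0, 0),    # (9-1) document identification information
    Record("S1", 30, 30),  # (9-2) context marker
    Record("T1", 31, 32),  # (9-3)
    Record("T2", 33, 34),  # (9-4)
    Record("P1", 31, 34),  # (9-5) neighborhood condition
    Record("T3", 39, 40),  # (9-6)
    Record("T4", 41, 44),  # (9-7)
    Record("S1", 48, 48),  # (9-8) context marker
]

# Assumed naming convention: the first letter encodes the record kind.
KINDS = {
    "D": "document",
    "S": "context marker",
    "T": "term",
    "P": "neighborhood condition",
}


def kind(rec: Record) -> str:
    """Classify a record by the (assumed) first letter of its identifier."""
    return KINDS.get(rec.ident[0], "unknown")


def sentence_span(records: List[Record], marker: str = "S1") -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first context bounded by two markers."""
    marks = [r for r in records if r.ident == marker]
    if len(marks) < 2:
        return None
    return (marks[0].start, marks[1].end)
```

Here `sentence_span(RECORDS)` gives the start position of the first marker and the end position of the second, which is the span the later paragraphs use for the sentence context.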

[0060] Based on the above collation information, co-occurrence determination is first performed for the context condition "Ci:Ta[S]Tb", in which the order of the collation term identifiers is specified. Here the identifier Ta is called the 0th identifier and the identifier Tb the 1st identifier. Co-occurrence determination checks whether all the identifiers specified in the context condition have appeared within the context. A co-occurrence counter is used to decide whether the determination succeeds; one counter is provided for each specified context condition. The control of the co-occurrence counter for an order-specified context condition is as follows. For the context condition "Ci:Ta[S]Tb", the context, that is, the sentence, extends from the collation information (9-2) of one context marker "。" to the collation information (9-8) of the next context marker "。". The sentence therefore spans (9-2) to (9-8), that is, (S1, 30, 30) to (S1, 48, 48), and the collation information is examined in order starting from (9-2), (S1, 30, 30). When the context marker "。" appears, the co-occurrence counter is reset to 0, and the subsequent collation information is checked against the identifiers specified in the context condition. First, the counter value is left unchanged until the 0th identifier Ta appears. In other words, while the counter value is 0, only the 0th identifier Ta is watched for. When collation information for Ta appears, the co-occurrence counter is incremented, changing its value from 0 to 1.

[0061] Since the counter now holds 1, the 1st identifier Tb specified in the context condition is watched for. When collation information for Tb appears, the counter is incremented, updating its value from 1 to 2. Once the counter reaches 2, all the identifiers specified in this context condition have appeared, so the co-occurrence condition is known to be satisfied. Controlling the co-occurrence counter in this way performs co-occurrence determination for an order-specified context condition. The counter is evaluated when the next context marker appears, as described below. Next, the collation information (9-8) of the context marker "。", that is, (S1, 48, 48), appears. The counter value at this point is 2, which means that the identifiers of the two search terms specified in the context condition have appeared and the co-occurrence condition is satisfied. Context condition identification information for the satisfied context condition is then output. Its start position is set to 30, the start position of the preceding context marker "。", and its end position is set to 48, the end position of the following context marker "。". Its identifier is set to the context condition identifier "Ci". That is, the context condition identification information (Ci, 30, 48) is output.
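The ordered counter control described above can be sketched as a single pass over the records. This is a minimal illustration under stated assumptions: the function name `ordered_cooccurrence` is hypothetical, records are plain (identifier, start, end) tuples, and Ta and Tb are taken to be T1 ("document") and T3 ("search") from the example listing.

```python
def ordered_cooccurrence(records, cond_id, idents, marker):
    """Return (cond_id, start, end) for each context in which `idents`
    appear in the given order between two context markers."""
    results = []
    counter = 0        # co-occurrence counter for this one condition
    ctx_start = None   # no context is open until the first marker is seen
    for ident, start, end in records:
        if ident == marker:
            # The counter is evaluated when the next marker appears.
            if ctx_start is not None and counter == len(idents):
                results.append((cond_id, ctx_start, end))
            counter = 0          # reset for the context that starts here
            ctx_start = start
        elif (ctx_start is not None
              and counter < len(idents)
              and ident == idents[counter]):
            # Only the identifier at index `counter` is watched for.
            counter += 1
    return results


RECORDS = [
    ("D1", 0, 0), ("S1", 30, 30), ("T1", 31, 32), ("T2", 33, 34),
    ("P1", 31, 34), ("T3", 39, 40), ("T4", 41, 44), ("S1", 48, 48),
]

in_order = ordered_cooccurrence(RECORDS, "Ci", ("T1", "T3"), "S1")
reversed_order = ordered_cooccurrence(RECORDS, "Ci", ("T3", "T1"), "S1")
```

With the example records, `in_order` contains the single entry ("Ci", 30, 48), while `reversed_order` is empty because T1 never appears after T3 within the sentence.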

[0062] Next, the control of the co-occurrence counter for the context condition "Cj:Ta[s]Tb", in which the order of the collation term identifiers does not matter, is described. The context is again the sentence spanning (9-2) to (9-8), that is, (S1, 30, 30) to (S1, 48, 48), so the collation information is likewise examined in order starting from (9-2). When the context marker "。" appears, the counter is reset to 0 as before, and the subsequent collation information is checked against the identifiers specified in the context condition. Suppose collation information for the identifier "Tb" appears first. The counter is incremented, updating its value from 0 to 1. If collation information for "Tb", which has already appeared, appears again in the same context, the counter value is not changed. When collation information for "Ta" then appears, the counter is incremented, updating its value from 1 to 2. Once the counter reaches 2, both identifiers specified in this context condition have appeared, so the co-occurrence condition is known to be satisfied. If collation information for "Ta", which has already appeared, appears again in the same context, the counter value is again left unchanged. When the collation information (9-8) of the context marker "。", that is, (S1, 48, 48), then appears, the counter is already 2 and the co-occurrence condition is satisfied, so the context condition identification information of this context condition, (Cj, 30, 48), is output. Controlling the counter in this way allows it to be used for co-occurrence determination of order-independent context conditions as well. As described above, controlling the co-occurrence counter makes it possible to decide whether co-occurrence determination succeeds.
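The order-independent case differs only in that any not-yet-counted identifier increments the counter and repeats are ignored. The sketch below is illustrative: `unordered_cooccurrence` is a hypothetical name, and the set used to remember which identifiers have already been counted is an assumption, since the text does not say how repeats are detected.

```python
def unordered_cooccurrence(records, cond_id, idents, marker):
    """Return (cond_id, start, end) for each context in which every
    identifier in `idents` appears at least once, in any order."""
    results = []
    wanted = set(idents)
    seen = set()       # identifiers already counted; len(seen) is the counter
    ctx_start = None
    for ident, start, end in records:
        if ident == marker:
            if ctx_start is not None and len(seen) == len(wanted):
                results.append((cond_id, ctx_start, end))
            seen = set()         # counter reset for the next context
            ctx_start = start
        elif ctx_start is not None and ident in wanted:
            seen.add(ident)      # a repeated identifier leaves the count unchanged
    return results


# Hypothetical stream: Tb twice, then Ta twice, inside one sentence.
STREAM = [
    ("S1", 30, 30), ("Tb", 31, 32), ("Tb", 33, 34),
    ("Ta", 35, 36), ("Ta", 37, 38), ("S1", 48, 48),
]

both = unordered_cooccurrence(STREAM, "Cj", ("Ta", "Tb"), "S1")
only_tb = unordered_cooccurrence(STREAM[:3] + STREAM[5:], "Cj", ("Ta", "Tb"), "S1")
```

In this hypothetical stream the counter reaches 2 despite the repeats, so `both` holds ("Cj", 30, 48); with the Ta records removed, `only_tb` is empty.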

[0063] The procedure for processing context conditions is described in detail with reference to FIG. 19. First, iterative processing step 1100 repeats processing steps 1101 to 1112 until all the collation information in the buffer 360 has been read, that is, until the collation information of the last document is exhausted. Collation information reading step 1101 reads one item of collation information from the buffer 360 and outputs it to the work area. Collation information identification step 1102 checks whether the item read in step 1101 is collation term identification information: if the lower 32 bits of the collation information are not 0, the item is judged to be collation term identification information. In that case, context marker identification step 1103, which detects context markers, is executed. Otherwise, collation information output step 1112 is executed and the item is output to the buffer 370. Context marker identification step 1103 examines the collation term identifier of the collation term identification information to determine whether it is the context marker of a context specified in a context condition. If it is a context marker, post-processing step 1104 is executed.
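The lower-32-bit test in step 1102 can be illustrated with a packed record. The layout below is purely an assumption: the text states only that the lower 32 bits are zero for document identification information, so the 16-bit start and end fields, the identifier codes, and the names `pack`, `is_term_record`, and `next_step` are all hypothetical.

```python
MASK_32 = 0xFFFFFFFF


def pack(ident_code: int, start: int, end: int) -> int:
    """Pack a record into one integer (assumed layout):
    identifier code above bit 32, then 16-bit start and 16-bit end."""
    if not (0 <= start < 1 << 16 and 0 <= end < 1 << 16):
        raise ValueError("positions must fit in 16 bits in this sketch")
    return (ident_code << 32) | (start << 16) | end


def is_term_record(word: int) -> bool:
    """Step 1102: non-zero lower 32 bits mean collation term identification information."""
    return (word & MASK_32) != 0


def next_step(word: int) -> int:
    """Return the step number that handles the record next."""
    return 1103 if is_term_record(word) else 1112


doc_record = pack(0xD1, 0, 0)       # stands in for (D1, 0, 0)
marker_record = pack(0x51, 30, 30)  # stands in for (S1, 30, 30)
```

Under this layout `next_step(doc_record)` routes the document record to output step 1112, and `next_step(marker_record)` routes the marker record to context marker identification step 1103.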

[0064] Post-processing step 1104 sets the end position information of the context for which co-occurrence determination has been performed so far and stores the result in the work area. This context identification information is output to the buffer 370 in collation term identification information output step 1110, described later. Pre-processing step 1104a is then executed. It prepares for co-occurrence determination on the next context, which has this context marker as its end point. First, the co-occurrence counters of the context conditions concerning a context bounded by this context marker are reset to 0. Then the start position of this context marker is taken as the start position of the context bounded by it. Post-processing step 1104 stores the context condition identification information in the work area and subjects it to context condition determination again in order to handle nested context conditions. A nested context condition is one in which a context condition is specified inside another context condition, as in "Cj:(Ta[s]Tb)[p]Tc". This expression searches for documents in which the two search terms "Ta" and "Tb" appear in the same sentence in either order, and in which that sentence and the search term "Tc" appear in the same paragraph in either order.

[0065] To determine such a condition, the context condition is divided into "Ci:(Ta[s]Tb)" and "Cj:Ci[p]Tc". The contained context condition is determined first; in this example that is the context condition "Ci:(Ta[s]Tb)" for the sentence contained in the paragraph. Suppose this sentence context condition is satisfied. At that moment the context condition identification information of the sentence is stored in the temporary storage buffer. When the sentence context marker next appears, the end position information is set in this context condition identification information, which is thereby finalized. The finalized sentence context condition identification information is stored in the work area and becomes a determination target of the containing context condition, in this example the paragraph context condition "Cj:Ci[p]Tc". Suppose collation term identification information with the collation term identifier Tc then appears. The paragraph context condition is now satisfied, and the context condition identification information of the paragraph is stored in the temporary storage buffer. When the paragraph context marker next appears, the end position information is set in this paragraph context condition identification information, which is thereby finalized. Processing in this way realizes determination of the nested context condition "Cj:(Ta[s]Tb)[p]Tc".
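The division of a nested condition into flat two-operand conditions, inner condition first, can be sketched as a small recursive walk. This is an illustration only: the nested-tuple input format, the function names, and the generated identifiers C1, C2 (standing in for Ci, Cj) are assumptions.

```python
from itertools import count


def _flatten(expr, out, ids):
    """Return the identifier that stands for `expr`, appending one flat
    condition per nesting level to `out`, innermost first."""
    if isinstance(expr, str):
        return expr                  # a plain collation term identifier
    left, op, right = expr
    left_id = _flatten(left, out, ids)
    right_id = _flatten(right, out, ids)
    cond_id = f"C{next(ids)}"        # hypothetical identifier numbering
    out.append((cond_id, left_id, op, right_id))
    return cond_id


def decompose(expr):
    """Split a nested context condition into flat (id, left, op, right) tuples."""
    out = []
    _flatten(expr, out, count(1))
    return out


# (Ta[s]Tb)[p]Tc written as nested tuples.
flat = decompose((("Ta", "s", "Tb"), "p", "Tc"))
```

For the nested example, `flat` lists the sentence condition over Ta and Tb first and then the paragraph condition whose left operand is the identifier of that sentence condition, which is the order in which the text says the conditions are determined.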

[0066] After context marker identification step 1103, work area iteration step 1106 is executed. Step 1106 repeats the co-occurrence determination processing of steps 1107 to 1110 for every item of collation term identification information stored in the work area by collation information reading step 1101 or post-processing step 1104. Context condition designation identification step 1107 checks, from the collation term identifier of the item stored in the work area, whether the identifier is specified in a context condition. If it is, co-occurrence determination step 1108 is executed: it checks whether the co-occurrence counter, which is incremented each time a collation term identifier specified in the context condition appears in the context, has the value 2. A counter value of 2 means that the two collation term identifiers specified in the context condition have been found and the context condition is satisfied. When the context condition is satisfied, temporary storage step 1109 is executed and the context condition identifier of that context condition is stored in the temporary storage buffer 321. Context marker identification step 1110a is then executed, and if the item is not the collation term identification information of a context marker, collation term identification information output step 1110 is executed and the item is output to the buffer 370. Context condition determination is realized by repeating the above steps for the collation information stored in the buffer 360.
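The overall loop of FIG. 19 can be approximated in a few lines. This is a simplified sketch, not the patented procedure: it omits the re-examination of finalized context condition records that nesting requires, it stands in for the lower-32-bit test by checking that both position fields are zero, and the function name and the `conditions` dictionary format are hypothetical.

```python
def context_pass(buffer_360, conditions, marker="S1"):
    """conditions maps cond_id -> (identifier tuple, ordered flag).
    Returns the records that would be written to buffer 370."""
    buffer_370 = []
    temp_321 = []      # satisfied conditions waiting for an end position
    seen = {}          # per-condition matched identifiers; len() is the counter
    ctx_start = None
    for ident, start, end in buffer_360:
        if start == 0 and end == 0:
            # Step 1102: not term information, pass through (step 1112).
            buffer_370.append((ident, start, end))
            continue
        if ident == marker:
            # Step 1104: finalize pending conditions with the end position.
            for cond_id, cond_start in temp_321:
                buffer_370.append((cond_id, cond_start, end))
            temp_321.clear()
            # Step 1104a: reset counters and open the next context.
            seen = {cond_id: [] for cond_id in conditions}
            ctx_start = start
            continue   # the marker record itself is not output
        if ctx_start is not None:
            for cond_id, (idents, ordered) in conditions.items():  # step 1107
                got = seen[cond_id]
                if ident not in idents or ident in got:
                    continue
                if ordered and idents[len(got)] != ident:
                    continue
                got.append(ident)                                   # step 1108
                if len(got) == len(idents):
                    temp_321.append((cond_id, ctx_start))           # step 1109
        buffer_370.append((ident, start, end))                      # step 1110
    return buffer_370


RECORDS = [
    ("D1", 0, 0), ("S1", 30, 30), ("T1", 31, 32), ("T2", 33, 34),
    ("P1", 31, 34), ("T3", 39, 40), ("T4", 41, 44), ("S1", 48, 48),
]
CONDITIONS = {
    "C1": (("P1", "T3"), True),    # C1:P1[S]T3, ordered
    "C2": (("T1", "T4"), False),   # C2:T1[s]T4, any order
}
result = context_pass(RECORDS, CONDITIONS)
```

Run on the example records, the sketch passes the document, term, and neighborhood records through, drops the two marker records, and appends (C1, 30, 48) and (C2, 30, 48) when the closing marker is reached.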

[0067] The above procedure is illustrated with a concrete example. The search condition is the expression "Q=((document [4C] understanding) [S] system) [AND] (document [s] search)" shown in (7-1). Each condition determination program is given the conditional expressions that the search control means 101 has parsed and separated by condition type. Specifically, the context condition determination program 320 is given the context condition parts of expression (7-1), "(document [4C] understanding) [S] system" and "document [s] search". Here the identifier of the neighborhood condition "document [4C] understanding" is P1 and the search term identifier of "system" is T3, so the context condition "(document [4C] understanding) [S] system" is written as "C1:P1[S]T3", where C1 is the context condition identifier. Similarly, the context condition "document [s] search" is written as "C2:T1[s]T4", where C2 is the context condition identifier, T1 is the search term identifier of "document", and T4 is that of "search".
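Strings in the "C1:P1[S]T3" notation can be turned into a structured form with a small parser. The sketch below is hypothetical throughout: the function name, the dictionary keys, and the regular expression are illustrative, and treating every upper-case operator as order-specified generalizes from the [S] versus [s] pair, which is the only pair whose meaning the text spells out.

```python
import re

# identifier ":" left operand "[" operator letters "]" right operand
_CONDITION = re.compile(r"^(\w+):(\w+)\[([A-Za-z]+)\](\w+)$")


def parse_condition(text: str) -> dict:
    """Parse a flat context condition such as 'C1:P1[S]T3'."""
    match = _CONDITION.match(text)
    if match is None:
        raise ValueError(f"not a flat context condition: {text!r}")
    cond_id, left, op, right = match.groups()
    return {
        "id": cond_id,
        "left": left,
        "right": right,
        "scope": op.lower(),       # "s", "p" or "ph", kept as written
        "ordered": op.isupper(),   # assumption: upper case means order matters
    }


c1 = parse_condition("C1:P1[S]T3")
c2 = parse_condition("C2:T1[s]T4")
```

With this convention `c1` is marked as ordered and `c2` as order-independent, matching how the two conditions are treated in the worked example.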

[0068] Suppose that the document "…。文書理解を用いた検索システムである。…" ("... This is a retrieval system using document understanding. ...") shown in (7-2) is input. As described above, the neighborhood condition determination program 310 outputs the following items (9-1) to (9-8), shown in FIG. 18, to the buffer 360 as collation information.
    (D1,  0,  0)    (9-1)
    (S1, 30, 30)    (9-2)
    (T1, 31, 32)    (9-3)
    (T2, 33, 34)    (9-4)
    (P1, 31, 34)    (9-5)
    (T3, 39, 40)    (9-6)
    (T4, 41, 44)    (9-7)
    (S1, 48, 48)    (9-8)
In this example, it is determined whether the context conditions "(document [4C] understanding) [S] system" and "document [s] search" are satisfied within the context, that is, the sentence, bounded by the context marker collation information (9-2) and (9-8).

[0069] Context condition determination for this conditional expression is described with reference to FIGS. 20a, 20b, 21a, and 21b. In the initial state shown in FIGS. 20a and 20b, the buffer 360 holds the collation information (9-1) to (9-8), and the temporary storage buffer 321 and the buffer 370 have been cleared to 0. The context condition determination program 320 reads the collation information from the buffer 360 one item at a time and performs determination for the context conditions "C1:P1[S]T3" and "C2:T1[s]T4". In step 1, reading step 1101 shown in FIG. 19 is executed, and the collation information (9-1), that is, (D1, 0, 0), is read into the program's work area as shown in step 1 of FIGS. 20a and 20b. Collation information identification step 1102 is then executed to check whether (9-1) is collation term identification information. Both trailing fields of (D1, 0, 0) are 0, that is, its lower 32 bits are 0, so it is judged to be document identification information. Because (D1, 0, 0) is not collation term identification information, context marker identification step 1110a judges that it does not belong to a context marker. The next step, collation information output step 1110, is therefore executed, and the item is output unchanged to the buffer 370.

[0070] In step 2, collation information reading step 1101 is executed again, and the collation information (9-2), that is, (S1, 30, 30), is read. Collation information identification step 1102 then checks whether (S1, 30, 30) is collation term identification information. Its lower 32 bits are not 0, so it is judged to be collation term identification information rather than document identification information. Context marker identification step 1103 is executed next to check whether this item corresponds to the context marker of a context specified in a context condition. The collation term identifier S1 of (S1, 30, 30) corresponds to "。", which is designated as the context marker, so post-processing step 1104 is executed.

[0071] Post-processing step 1104 first sets the end position information of the context that has this context marker S1 as its end point. It then checks whether any collation information is stored in the temporary storage buffer 321 and, if so, reads the entire contents of the temporary storage buffer 321 into the work area. As described later, the temporary storage buffer 321 holds, as collation information, the context condition identification information corresponding to the specified context conditions. At this point it holds no collation information, so nothing is read into the work area. Pre-processing step 1104a is executed next. It sets to 0 the co-occurrence counters a and b, which correspond to the sentence context conditions "C1:P1[S]T3" and "C2:T1[s]T4". It then sets the context start position information to 30, the start position of the context marker S1. Work area iteration step 1106 is then executed. Because the collation term identifier "S1" is not specified in any context condition, context condition designation identification step 1107 is not executed. Context marker identification step 1110a is executed as well, but because "S1" belongs to the collation term identification information of a context marker, collation term identification information output step 1110 is not executed. Consequently the collation term identification information of the context marker is not output to the buffer 370.

[0072] In step 3, reading step 1101 is executed under iterative processing step 1100, and the third item of collation information (9-3), that is, (T1, 31, 32), is read into the work area. Collation information identification step 1102 then checks whether it is collation term identification information. (T1, 31, 32) is collation term identification information, so context marker identification step 1103 is executed. Its collation term identifier T1 is not the context marker's collation term identifier S1, so post-processing step 1104 is not executed. Work area iteration step 1106 is then executed, and the co-occurrence determination processing of steps 1107 to 1110 is applied to the collation term identification information stored in the work area. Context condition designation identification step 1107 refers to the collation term identifier T1 of (T1, 31, 32) in the work area and checks whether it is a collation term specified in a context condition. T1 is specified in the context condition "C2:T1[s]T4", so co-occurrence determination step 1108 is executed. The co-occurrence counter a, which corresponds to "C2:T1[s]T4", is incremented, updating its value from 0 to 1. Because the value of counter a is not 2, the co-occurrence determination does not yet succeed. Context marker identification step 1110a is then executed; the item is not the collation term identification information of a context marker, so collation term identification information output step 1110 is executed and (T1, 31, 32) is output to the buffer 370.

[0073] In step 4, reading step 1101 is executed under iterative processing step 1100, and the fourth item of collation information (9-4), that is, (T2, 33, 34), is read into the work area. Collation information identification step 1102 checks whether it is collation term identification information. It is, so context marker identification step 1103 is executed; its collation term identifier is not S1, so post-processing step 1104 is not executed. Under work area iteration step 1106, context condition designation identification step 1107 is then executed to check whether the collation term identification information in the work area is specified in a context condition. The collation term identifier T2 of (T2, 33, 34) is not specified in any context condition, so co-occurrence determination step 1108 is not executed. Because the item is not a context marker, collation term identification information output step 1110 is executed and (T2, 33, 34) is output to the buffer 370.

[0074] In step 5, reading step 1101 is executed under iterative processing step 1100, and the fifth item of collation information (9-5), that is, (P1, 31, 34), is read into the work area. Collation information identification step 1102 checks whether it is collation term identification information; (P1, 31, 34) is, so context marker identification step 1103 is executed. Its collation term identifier is not S1, so work area iteration step 1106 is executed next, and the co-occurrence determination processing of steps 1107 to 1110 is applied to the collation term identification information stored in the work area. Context condition designation identification step 1107 refers to the collation term identifier P1 of (P1, 31, 34) and checks whether it is specified in a context condition. Here the co-occurrence counter b corresponding to the context condition "C1:P1[S]T3" is 0, and P1 is specified as the 0th collation term identifier of "C1:P1[S]T3". Co-occurrence determination step 1108 is therefore executed. Counter b is incremented, changing its value from 0 to 1. Because the value of counter b is not 2, the co-occurrence determination does not yet succeed. The item does not belong to a context marker, so collation term identification information output step 1110 is executed and (P1, 31, 34) is output to the buffer 370.

【００７５】Next, as step 6, read processing step 1101 is executed under repetition processing step 1100, the sixth item of collation information (9-6), namely (T3, 39, 40), is read into the work area, and collation information identification processing step 1102 is then executed. This step checks whether the collation information (T3, 39, 40) is collation term identification information; since it is, context marker identification processing step 1103 is executed. Because (T3, 39, 40) is not a context marker, post-processing step 1104 is not executed. Work area repetition processing step 1106 is then executed, and the co-occurrence determination processing of steps 1107 to 1110 is applied to the collation term identification information stored in the work area. Context condition designation identification processing step 1107 refers to the collation term identifier T3 of the collation term identification information (T3, 39, 40) stored in the work area and checks whether this identifier is specified in a context condition. In this case, the value of co-occurrence counter b, which corresponds to the context condition "C1:P1[S]T3", is 1, and the collation term identifier T3 is specified as the 1st identifier of that context condition, so co-occurrence determination processing 1108 is executed. Co-occurrence counter b is counted up, and its value is updated from 1 to 2. Since the value of co-occurrence counter b has reached 2, the co-occurrence determination is judged to be satisfied. Temporary storage processing step 1109 is then executed, and the context condition identification information (C1, 30, Xe1) of the context condition "C1:P1[S]T3" is stored in temporary storage buffer 321. At this point the position of the context marker that forms the rear end point of the context is not yet known, so the context end position information is provisionally written as Xe1. When the collation term identification information of a context marker subsequently appears, this context end position information is set in post-processing step 1104. Finally, since this collation term identification information is not that of a context marker, collation term identification information output processing step 1110 is executed, and the collation term identification information (T3, 39, 40) is output to buffer 370.

【００７６】As step 7, read processing step 1101 is executed under repetition processing step 1100, and the seventh item of collation information (9-7), namely (T4, 41, 46), is read into the work area. Collation information identification processing step 1102 is then executed to check whether this collation information is collation term identification information. Since (T4, 41, 46) is collation term identification information, context marker identification processing step 1103 is executed; because (T4, 41, 46) is not a context marker, post-processing step 1104 is not executed. Work area repetition processing step 1106 is then executed, and the co-occurrence determination processing of steps 1107 to 1110 is applied to the collation term identification information stored in the work area. Context condition designation identification processing step 1107 refers to the collation term identifier T4 of the collation term identification information (T4, 41, 46) stored in the work area and checks whether this collation term is specified in a context condition. In this case, the collation term identifier T4 is specified in the context condition "C2:T1[s]T4", so co-occurrence determination processing 1108 is executed. Because the collation term identifier T4 of the context condition "C2:T1[s]T4" has appeared, co-occurrence counter a, which corresponds to this context condition, is counted up from 1 to 2, and the co-occurrence determination is satisfied. Temporary storage processing step 1109 is therefore executed, and the context condition identification information (C2, 30, Xe2) of the newly satisfied context condition "C2:T1[s]T4" is stored in temporary storage buffer 321. Since the context end position information has not yet been determined, Xe2 is set provisionally. After this, since this collation term identification information is not that of a context marker, collation term identification information output processing step 1110 is executed, and the collation term identification information (T4, 41, 46) is output to buffer 370.

【００７７】Finally, as step 8, collation information read processing step 1101 is executed again, and collation information (9-8), namely (S1, 48, 48), is read. Collation information identification processing step 1102 is then executed to check whether the collation information (S1, 48, 48) is collation term identification information. Since it is, context marker identification processing step 1103 is executed to check whether it corresponds to a context marker for the context specified in the context conditions. Because the collation term identifier is S1, it corresponds to the context marker "。" of the specified sentence context, and post-processing step 1104 is executed. In post-processing step 1104, the end position information 48 of this context marker is first taken as the context end position information. Next, the sentence context end position information 48 is set in temporary storage buffer 321, which holds the sentence context condition identification information (C1, 30, Xe1) and (C2, 30, Xe2); the result is (C1, 30, 48) and (C2, 30, 48), as shown in step 8 of FIGS. 21a and 21b. This result is stored in the work area. Pre-processing step 1104a is then executed: the co-occurrence counters of the sentence context conditions are cleared to 0, and the start position information 48 of this context marker is set as the context start position information.

【００７８】Work area repetition processing step 1106 is then executed. First, context condition designation identification processing step 1107 is executed for the context condition identification information (C1, 30, 48); because the identifier C1 is not specified in any context condition, co-occurrence determination processing 1108 is not executed, collation term identification information output processing step 1110 is executed, and the context condition identification information (C1, 30, 48) is output to buffer 370. The position information of this context condition identification information thus consists of the start position 30 and the end position 48 of the sentence whose end points are the context markers (9-2) and (9-8). Work area repetition processing step 1106 is then executed again, and context condition designation identification processing step 1107 is executed for the context condition identification information (C2, 30, 48); because the identifier C2 is likewise not specified in any context condition, the context condition identification information (C2, 30, 48) is output to buffer 370 by collation term identification information output processing step 1110.
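
As an illustration only, the context condition determination walked through in steps 5 to 8 can be sketched in Python roughly as follows. The names Rec, judge_context, pending and ctx_start are hypothetical, the record (S1, 30, 30) for collation information (9-2) is assumed from the context start position 30, and the rule "count up when the identifier is the k-th term of the condition and the counter equals k" is one reading of the description, not a statement of the actual circuit or program.

```python
from typing import NamedTuple


class Rec(NamedTuple):
    ident: str   # collation term identifier, e.g. "T1" or "S1"
    start: int   # start position
    end: int     # end position


def judge_context(records, conditions, marker="S1"):
    """conditions maps a context condition identifier to its ordered term identifiers."""
    counters = {c: 0 for c in conditions}   # co-occurrence counters
    pending = []                            # stands in for temporary storage buffer 321
    ctx_start = 0                           # context start position
    out = []                                # stands in for buffer 370
    for rec in records:
        if rec.ident == marker:
            # Post-processing (1104): the context end position is now known.
            out.extend(Rec(c, ctx_start, rec.end) for c in pending)
            pending.clear()
            # Pre-processing (1104a): clear the counters and start a new context.
            counters = {c: 0 for c in conditions}
            ctx_start = rec.start
            continue  # the context marker itself is not output
        for c, terms in conditions.items():
            k = counters[c]
            # Count up only when the identifier is the k-th term of the condition.
            if k < len(terms) and terms[k] == rec.ident:
                counters[c] = k + 1
                if counters[c] == len(terms):
                    pending.append(c)       # satisfied; end position still unknown
        out.append(rec)
    return out


records = [Rec("D1", 0, 0), Rec("S1", 30, 30), Rec("T1", 31, 32), Rec("T2", 33, 34),
           Rec("P1", 31, 34), Rec("T3", 39, 40), Rec("T4", 41, 46), Rec("S1", 48, 48)]
conditions = {"C1": ["P1", "T3"], "C2": ["T1", "T4"]}
result = judge_context(records, conditions)
```

In this sketch the satisfied conditions C1 and C2 wait in pending with no end position, which plays the role of the provisional values Xe1 and Xe2, and they are emitted as (C1, 30, 48) and (C2, 30, 48) only when the second S1 record arrives.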

【００７９】As a result of the above context condition determination processing, the collation information (10-1) to (10-8) shown in FIG. 22 is output to buffer 370.
(D1,  0,  0)    (10-1)
(T1, 31, 32)    (10-2)
(T2, 33, 34)    (10-3)
(P1, 31, 34)    (10-4)
(T3, 39, 40)    (10-5)
(T4, 41, 44)    (10-6)
(C1, 30, 48)    (10-7)
(C2, 30, 48)    (10-8)
Here, (10-7) is the context condition identification information of the context condition "(document [4C] understanding) [S] system", and (10-8) is likewise the context condition identification information of "document [s] search". This collation information is then sent on to logical condition determination program 330.

【００８０】Finally, the determination processing of logical condition determination program 330 is described. Logical condition determination program 330 reads the collation information that context condition determination program 320 has sent to buffer 370 and determines whether it satisfies the logical condition specified as search information 202 by search control means 101. An example of a logical condition is the conditional expression "document [AND] search", which means that documents in which the two search terms "document" and "search" both appear are to be found. The identifier representing a logical condition (hereinafter, logical condition identifier) is written Li. Li is assigned a code that can be distinguished from collation term identifiers. With this definition, the logical condition can be written "Li:Ta[AND]Tb", and the following description uses this notation. The other logical conditions are "Li:Ta[OR]Tb" and "Li:Ta[NOT]Tb". The logical condition "Li:Ta[OR]Tb" means that documents in which the search term "Ta" or "Tb" appears are to be found, and the logical condition "Li:Ta[NOT]Tb" means that documents in which the search term "Ta" appears and the search term "Tb" does not appear are to be found.

【００８１】Search control means 101 converts these logical conditions into the general product form shown below and specifies them to logical condition determination program 330 as search information 202.
Li: (A11+A12+...+A1j)*(A21+A22+...+A2k)*...*(An1+An2+...+Anm)    (10-1)
In formula (10-1), "+" denotes logical sum and "*" denotes logical product. Anm is called an element, and (An1+An2+...+Anm) is called a term. An element may be negated (negation is written "¬"), giving ¬Anm, and a term may likewise be negated, giving ¬(An1+An2+...+Anm). ¬Anm is called a negative logic element, and an element that is not negated is called a positive logic element. Similarly, ¬(An1+An2+...+Anm) is called a negative logic term, and a term that is not negated is called a positive logic term.
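
For orientation, a formula in the product form (10-1) can be evaluated directly against the set of identifiers found in a document, as in the following Python sketch. This is a reference evaluation for illustration, not the counter-based method the specification goes on to describe; the function name formula_holds, the tuple layout, and the example formula (Ta+Tb)*¬(Tc) are assumptions introduced here.

```python
def formula_holds(terms, found):
    """Evaluate a product-of-sums formula.

    terms: list of (term_negated, [(identifier, element_negated), ...])
    found: set of identifiers that appeared in the document
    """
    def element_true(ident, negated):
        # A positive element holds when found; a negated one when not found.
        return (ident in found) != negated

    def term_true(term_negated, elements):
        # A term is the logical sum of its elements, optionally negated.
        return any(element_true(i, n) for i, n in elements) != term_negated

    # The formula is the logical product of all terms.
    return all(term_true(neg, elems) for neg, elems in terms)


# Hypothetical example: (Ta + Tb) * ¬(Tc)
example = [(False, [("Ta", False), ("Tb", False)]),
           (True, [("Tc", False)])]
only_ta = formula_holds(example, {"Ta"})
ta_and_tc = formula_holds(example, {"Ta", "Tc"})
```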

【００８２】Since formula (10-1) is a logical product of terms, every term must hold for formula (10-1) to hold (be true). A counter is therefore provided to count the terms that hold. If the value of this counter (hereinafter, term counter) equals the number of terms, formula (10-1) holds. A negative logic term that contains no negative logic element, and a positive logic term that contains a negative logic element, hold from the start. The initial value of the term counter is therefore set to the number of negative logic terms containing no negative logic element plus the number of positive logic terms containing a negative logic element. Controlling the term counter as follows makes it possible to determine whether formula (10-1) holds. The term counter is set to this initial value; when a term changes from not holding to holding, 1 is added to the term counter, and conversely, when a term changes from holding to not holding, 1 is subtracted from the term counter. When the state of a term does not change (not holding to not holding, or holding to holding), the value of the term counter is not updated. With the term counter controlled in this way, whether formula (10-1) holds can be determined by checking, for each document, whether the value of the term counter equals the number of terms. Conversely, when the whole formula is negated, whether formula (10-1) holds can be determined by checking whether the value of the term counter is less than the number of terms.
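
The initial value of the term counter and the final test on it can be sketched as follows. This is a minimal Python illustration; the names initial_term_counter and formula_satisfied and the flag-list representation of a term are assumptions made for the example.

```python
def initial_term_counter(terms):
    """terms: list of (term_negated, [element_negated, ...]).

    Counts the terms that hold before any identifier has been seen:
    negated terms with no negated element, and positive terms with
    at least one negated element.
    """
    count = 0
    for term_negated, element_flags in terms:
        has_negated_element = any(element_flags)
        if term_negated != has_negated_element:
            count += 1
    return count


def formula_satisfied(term_counter, number_of_terms, formula_negated=False):
    """Per-document check on the term counter."""
    if formula_negated:
        return term_counter < number_of_terms
    return term_counter == number_of_terms


# A negated term whose two elements are both negated does not hold initially.
start_value = initial_term_counter([(True, [True, True])])
```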

【００８３】Next, since every term is a logical sum of elements, a term holds if any one of its elements holds. The method of determining whether a single term holds is as follows. A counter is used to count, among the elements specified in the term, those that hold. If the value of this counter (hereinafter, element counter) is 1 or more, that is, if any element holds, the term is regarded as holding. Since negative logic elements can be regarded as holding from the start, the initial value of the element counter is set to the number of negative logic elements. One element counter is provided for each term, so there are as many element counters as there are terms.

【００８４】Controlling the element counter as follows makes it possible to determine whether a term holds. The element counter is initialized to the number of negative logic elements; when an element changes from not holding to holding, 1 is added to the element counter, and conversely, when an element changes from holding to not holding, 1 is subtracted from the element counter. Because an element here corresponds to a collation term identifier, an element never goes from not holding to not holding or from holding to holding. Hence only positive logic elements change from not holding to holding, and only negative logic elements change from holding to not holding. Accordingly, when collation term identification information corresponding to a positive logic element appears, 1 is added to the element counter, and conversely, when collation term identification information corresponding to a negative logic element appears, 1 is subtracted from the element counter.

【００８５】With the element counter controlled in this way, whether a term holds is determined by checking, each time collation term identification information is input, whether the value of the element counter is 1 or more. Conversely, when the term is negated, whether the term holds is determined by checking whether the value of the element counter is 0. Then, based on the result of this determination, if the term has changed from not holding to holding, 1 is added to the term counter, and if the term has changed from holding to not holding, 1 is subtracted from the term counter. If the state of the term, negation included, is unchanged (not holding to not holding, or holding to holding), the value of the term counter is not updated. With the term counter controlled in this way, whether formula (10-1) holds can be determined by checking, for each document, whether the value of the term counter equals the number of terms. For example, the above "Li:Ta[AND]Tb", that is, "Li:Ta*Tb", is converted to "Li:¬((¬Ta)+(¬Tb))" and passed to logical condition determination program 330 as search information 202.
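
The element counter rules above can be written as a few small Python helpers. The function names are hypothetical, and the sketch assumes that each collation term identifier is counted at most once per document; the description does not say how repeated occurrences of the same identifier are handled.

```python
def initial_element_counter(element_flags):
    """element_flags: one bool per element, True if the element is negated."""
    return sum(1 for negated in element_flags if negated)


def on_identifier_found(counter, element_negated):
    """A found positive element starts holding; a found negated one stops holding."""
    return counter - 1 if element_negated else counter + 1


def term_holds(counter, term_negated=False):
    """A positive term needs at least one holding element; a negated term needs none."""
    return counter == 0 if term_negated else counter >= 1


def term_counter_delta(held_before, holds_after):
    """+1 when the term starts holding, -1 when it stops, 0 when unchanged."""
    return int(holds_after) - int(held_before)


# The term ¬((¬Ta)+(¬Tb)): the counter starts at 2 and drops as Ta and Tb are found.
c0 = initial_element_counter([True, True])
c1 = on_identifier_found(c0, True)
c2 = on_identifier_found(c1, True)
```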

【００８６】The procedure for processing this logical condition is described in detail with reference to FIG. 23. For example, the above "Li:Ta[AND]Tb", that is, "Li:Ta*Tb", is converted to
Li: ¬((¬Ta)+(¬Tb))    (10-2)
and passed to logical condition determination program 330 as search information 202. This logical condition has one negative logic term, and that term contains two negative logic elements. Since there is one term, one element counter is used here. The initial value of the term counter is the number of negative logic terms containing no negative logic element plus the number of positive logic terms containing a negative logic element; the term in this example is a negative logic term that contains negative logic elements, so it falls into neither category and the term counter is set to 0. The element counter is set to 2, the number of negative logic elements. Because the term is negated, the element counter is counted down, and the term is judged to hold when the value of the element counter reaches 0. Also, since logical condition formula (10-2) as a whole is not negated, formula (10-2) is judged to hold when the value of the term counter is 1.
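
That the counter settings for formula (10-2) reproduce an ordinary AND can be checked over all four combinations of Ta and Tb. The following Python sketch does so; holds_by_counters is a hypothetical helper that applies the initial values 2 and 0 given above.

```python
from itertools import product


def holds_by_counters(ta_found, tb_found):
    element_counter = 2          # two negated elements: ¬Ta and ¬Tb
    term_counter = 0             # the single term is negated and has negated elements
    for found in (ta_found, tb_found):
        if found:
            element_counter -= 1     # a negated element stops holding when found
    if element_counter == 0:         # the negated term holds when no element holds
        term_counter += 1
    return term_counter == 1         # one term in total, formula not negated


# Truth table keyed by (Ta found, Tb found).
table = {(a, b): holds_by_counters(a, b)
         for a, b in product([False, True], repeat=2)}
```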

【００８７】First, repetition processing step 1200 repeats processing steps 1201 to 1210 until all the collation information in buffer 370 has been read, that is, until the collation information of the last document is exhausted. Read processing step 1201 reads collation information from buffer 370 and stores it in the work area of the program. Collation information identification processing step 1202 checks whether the collation information read in read processing step 1201 is document identification information or collation term identification information. If the last two fields of the collation information are both 0 (zero), that is, if the lower 32 bits are 0 (zero), it is judged to be document identification information. In this case, logical condition satisfaction determination processing step 1203 is executed. For the logical condition of formula (10-2), the logical condition is satisfied when the value of the term counter is 1.
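
The test on the lower 32 bits can be illustrated with a packed record. In the Python sketch below, the layout (identifier code in the upper bits, followed by two 16-bit position fields) is an assumption chosen so that the two position fields together occupy 32 bits; the actual field widths are not given in this passage.

```python
POSITION_BITS = 16  # assumed width of each of the two position fields


def pack(ident_code, start, end):
    """Pack (identifier, start, end) into one integer, identifier in the upper bits."""
    return (ident_code << (2 * POSITION_BITS)) | (start << POSITION_BITS) | end


def is_document_record(word):
    """Document identification information has both position fields equal to 0."""
    return (word & 0xFFFFFFFF) == 0


doc_word = pack(1, 0, 0)       # hypothetical code 1 for a document identifier
term_word = pack(5, 31, 32)    # hypothetical code 5 for a collation term identifier
```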

【００８８】When the logical condition is satisfied, document identification information determination processing step 1203a is executed, and if this document identification information is not the first document identification information, result output processing step 1204 is executed. For the first document identification information, document identification information determination processing step 1203a performs no determination, because no logical condition determination processing has yet been carried out. Result output processing step 1204 outputs, as composite condition determination result 206, the contents of output buffer 331, which holds the collation information described later, together with the logical condition determination result (hereinafter, logical condition identification information). The logical condition determination is performed when document identification information is detected because whether the specified logical condition holds must be determined document by document.

【００８９】The logical condition determination result information output here is collation information of the form (Li, document start position information, document end position information): the position information holds the start position and end position of the document, and the identifier holds the logical condition identifier Li. Because the position information is always cleared to 0 at the start of a document, the document start position information is always 0 (zero). The logical condition identification information is therefore (Li, 0, document end position information). After logical condition satisfaction determination processing step 1203, initialization processing step 1205 is executed, which clears output buffer 331 to 0 and initializes the element counter and the term counter. In this example, the element counter is set to 2 and the term counter is set to 0.

【００９０】When collation information identification processing step 1202 judges the collation information to be collation term identification information, collation term identification processing step 1206 is executed to check whether the collation term identifier in the collation term identification information is specified as an element of the logical condition. If it is, element determination processing step 1206a is executed: the logic of the found element is examined, and if the element is not negated (no ¬), count-up processing step 1206b is executed and 1 is added to the element counter. Conversely, if the element is negated, count-down processing step 1206c is executed and 1 is subtracted from the element counter. An element counter value of 0 indicates that the term does not hold, and a value of 1 or more indicates that the term holds. When the term is negated, the reverse applies: a value of 0 indicates that the term holds, and a value of 1 or more indicates that it does not.

【００９１】Next, term satisfaction determination processing step 1207 is executed. This step checks whether the term in which the element is specified has changed from not holding to holding, or from holding to not holding. The state of the term before and after the element counter update is compared: if the term has changed from not holding to holding, the number of terms that hold has increased by one, so 1 is added to the term counter; if the term has changed from holding to not holding, the number of terms that hold has decreased by one, so 1 is subtracted from the term counter. In this example, the element counter is initially 2 and the term counter is initially 0. When element Ta is found, Ta is negated, so 1 is subtracted from the element counter, whose value becomes 1. Because the term is negated, the term has gone from not holding to not holding, and the term counter is not updated. When element Tb is then found, Tb is negated, so 1 is subtracted from the element counter. The value of the element counter becomes 0, which means the term has changed from not holding to holding, so 1 is added to the term counter. As a result, the value of the term counter becomes 1. The term satisfaction determination processing is performed in this way.

[0092] After collation information identification step 1202, collation information saving step 1210 is executed and the collation information is written to output buffer 331. Output buffer 331 holds the collation information for one document, and its contents are output as composite condition determination result 206 when logical condition satisfaction determination step 1203 finds that the logical condition is satisfied. Logical condition determination is realized by repeating the above steps for each item of collation information stored in buffer 370. When iterative processing step 1200 ends, the logical condition has not yet been evaluated for the last document processed. This is because the evaluation of a document is triggered by the arrival of the next document's identification information. Logical condition satisfaction determination step 1203 is therefore executed once more at this point, so that the last document read is also evaluated.
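The deferred evaluation and the extra evaluation for the last document can be sketched as below. This is an illustrative sketch under stated assumptions: `judge_documents` and `both_contexts_present` are hypothetical names, the satisfaction test is passed in as a plain function instead of being computed with counters, and document D2 is invented for the example.

```python
def judge_documents(records, condition_holds):
    """Return the buffered records of every document whose condition holds."""
    results = []
    output_buffer = []          # stands in for output buffer 331
    seen_document = False
    for rec in records:
        _ident, start, end = rec
        if start == 0 and end == 0:      # document identification record
            # A new document begins, so the previous one is judged now.
            if seen_document and condition_holds(output_buffer):
                results.append(list(output_buffer))
            output_buffer.clear()
            seen_document = True
        output_buffer.append(rec)
    # The loop never judges the final document, so judge it here.
    if seen_document and condition_holds(output_buffer):
        results.append(list(output_buffer))
    return results


def both_contexts_present(buffer):
    """Stand-in for the logical condition C1 [AND] C2."""
    idents = {ident for ident, _start, _end in buffer}
    return {"C1", "C2"} <= idents


stream = [
    ("D1", 0, 0), ("C1", 30, 48),
    ("D2", 0, 0), ("C1", 10, 20), ("C2", 10, 20),
]
matches = judge_documents(stream, both_contexts_present)
```

Without the final check after the loop, D2 would never be reported, because no further document identification record arrives to trigger its evaluation.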

[0093] The above procedure is now explained with a concrete example. The search condition used is expression (7-1), "Q = ((document [4C] understanding) [S] system) [AND] (document [s] search)". Each condition determination program is set with a conditional expression that search control means 101 has parsed and separated by condition type. Specifically, logical condition determination program 330 is set with the logical condition part of expression (7-1), "((document [4C] understanding) [S] system) [AND] (document [s] search)". Here the context condition "(document [4C] understanding) [S] system" is given the identifier C1 and the context condition "document [s] search" is given the identifier C2, so that the logical condition part of (7-1) is written "L1: C1 [AND] C2". L1 is the logical condition identifier. This is further converted to "L1: ¬((¬C1)+(¬C2))" and set in logical condition determination program 330.
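The conversion from "C1 [AND] C2" to "¬((¬C1)+(¬C2))" is an application of De Morgan's law. The short check below assumes that "+" denotes logical OR, which is the reading under which the two forms are equivalent; the function names are hypothetical.

```python
from itertools import product


def and_form(c1: bool, c2: bool) -> bool:
    """L1: C1 [AND] C2."""
    return c1 and c2


def converted_form(c1: bool, c2: bool) -> bool:
    """L1: ¬((¬C1)+(¬C2)), reading '+' as logical OR."""
    return not ((not c1) or (not c2))


truth_table = [
    (c1, c2, and_form(c1, c2), converted_form(c1, c2))
    for c1, c2 in product([False, True], repeat=2)
]
for row in truth_table:
    print(row)
```

The third and fourth columns agree on every row, so the converted form can be evaluated in place of the original AND.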

[0094] Suppose that the document shown in (7-2), "... This is a retrieval system using document understanding. ...", is input. As described above, context condition determination program 320 then outputs the following items (11-1) to (11-8), shown in FIG. 22, to buffer 370 as collation information.
(D1, 0, 0)    (11-1)
(T1, 31, 32)    (11-2)
(T2, 33, 34)    (11-3)
(P1, 31, 34)    (11-4)
(T3, 39, 40)    (11-5)
(T4, 41, 44)    (11-6)
(C1, 30, 48)    (11-7)
(C2, 30, 48)    (11-8)
(11-1) is document identification information: D1 is the document identifier, and the two fields that follow are the constant 0. (11-2), (11-3), (11-5), and (11-6) are the collation term identification information for "document", "understanding", "search", and "system", respectively, and T1, T2, T3, and T4 are the corresponding collation term identifiers. (11-4) is the neighborhood condition identification information for the neighborhood condition "document [4C] understanding", and P1 is its neighborhood condition identifier. (11-7) is the context condition identification information for the context condition "(document [4C] understanding) [S] system", and C1 is its context condition identifier. (11-8) is the context condition identification information for the context condition "document [s] search", and C2 is its context condition identifier.
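The collation information listed in (11-1) to (11-8) can be modelled as three-field records. The sketch below is an assumption-laden illustration: the `Record` type and the mapping from the first letter of an identifier to a record kind are hypothetical conveniences inferred from the identifiers in this example, not a format defined by the source.

```python
from typing import NamedTuple


class Record(NamedTuple):
    ident: str   # identifier such as D1, T1, P1, C1
    start: int   # start position (0 for a document record)
    end: int     # end position (0 for a document record)


# Assumed naming convention, taken from the identifiers in the example.
KINDS = {"D": "document", "T": "term", "P": "neighborhood",
         "C": "context", "L": "logical"}


def kind_of(rec: Record) -> str:
    return KINDS[rec.ident[0]]


RECORDS = [
    Record("D1", 0, 0), Record("T1", 31, 32), Record("T2", 33, 34),
    Record("P1", 31, 34), Record("T3", 39, 40), Record("T4", 41, 44),
    Record("C1", 30, 48), Record("C2", 30, 48),
]
kinds = [kind_of(r) for r in RECORDS]
# The neighborhood record P1 spans from the start of T1 to the end of T2.
p1_spans_t1_t2 = (RECORDS[3].start, RECORDS[3].end) == (RECORDS[1].start, RECORDS[2].end)
print(kinds, p1_spans_t1_t2)
```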

[0095] The logical condition determination process under the above conditions is explained step by step with reference to FIGS. 24a, 24b, and 25. In the initial state, as shown in the initial state of FIGS. 24a and 24b, collation information (11-1) to (11-8) is stored in buffer 370, and output buffer 331 and buffer 370 are in a zero-cleared state. Logical condition determination program 330 reads these items of collation information from buffer 370 one at a time and performs the determination for the logical condition "L1: ¬((¬C1)+(¬C2))".

[0096] First, as shown in step 1 of FIGS. 24a and 24b, reading step 1201 is executed and collation information (11-1), i.e. (D1, 0, 0), is read into the program's work area. Collation information identification step 1202 is then executed to check whether (11-1) is document identification information or collation term identification information. Because the last two fields of (D1, 0, 0) are both 0 (zero), it is judged to be document identification information. Logical condition satisfaction determination step 1203 and document identification information determination step 1203a are therefore executed, but since this is the first document identification information, result output step 1204 is not executed. Next, initialization step 1205 is executed: output buffer 331 is cleared to 0 and the element counter and term counter are initialized. In this example, the element counter is initialized to 2, the number of negated elements (¬C1) and (¬C2). The term ((¬C1)+(¬C2)) is a negated term that contains negated elements. The term counter is therefore initialized to 0, because there is neither a negated term that contains no negated elements nor a non-negated term that contains negated elements. When collation information identification step 1202 ends, collation information saving step 1210 is executed and (D1, 0, 0) is stored in output buffer 331.

[0097] Next, as step 2, collation information reading step 1201 is executed again and collation information (11-2), i.e. (T1, 31, 32), is read. Collation information identification step 1202 then checks whether (T1, 31, 32) is collation term identification information. Because its lower 32 bits are not 0 (zero), it is judged to be collation term identification information. Collation term identification step 1206 then checks whether this collation term identification information corresponds to an element specified in the logical condition. The identifier T1 of (T1, 31, 32) is not specified in this logical condition, so steps 1206a to 1209 are not executed. After collation information identification step 1202 ends, collation information saving step 1210 is executed and (T1, 31, 32) is stored in output buffer 331. Likewise, up to step 6 none of the identifiers read is specified in this logical condition, so only collation information saving step 1210 is executed, and (T2, 33, 34), (P1, 31, 34), (T3, 39, 40), and (T4, 41, 44) are stored in output buffer 331.

[0098] Then, as step 7, collation information reading step 1201 is executed again and collation information (11-7), i.e. (C1, 30, 48), is read. Collation information identification step 1202 checks whether (C1, 30, 48) is collation term identification information. Because its lower 32 bits are not 0 (zero), it is judged to be collation term identification information. Collation term identification step 1206 then checks whether it corresponds to an element specified in the logical condition. The identifier C1 of (C1, 30, 48) is specified in this logical condition, so steps 1206a to 1209 are executed. First, element determination step 1206a is executed. For this logical condition, the initial value of the element counter is 2 and the initial value of the term counter is 0. Because element C1 is negated, count-down step 1206c is executed and 1 is subtracted from the element counter, which is updated from 2 to 1. Next, term determination step 1207 is executed. Since the element counter is 1 at this point, the term has gone from unsatisfied to unsatisfied, and the term counter is not updated. After collation information identification step 1202 ends, collation information saving step 1210 is executed and (C1, 30, 48) is stored in output buffer 331.

[0099] Finally, as step 8, collation information reading step 1201 is executed and collation information (11-8), i.e. (C2, 30, 48), is read. Collation information identification step 1202 checks whether (C2, 30, 48) is collation term identification information. Because its lower 32 bits are not 0 (zero), it is judged to be collation term identification information. Collation term identification step 1206 then checks whether it corresponds to an element specified in the logical condition. The identifier C2 of (C2, 30, 48) is specified in this logical condition, so steps 1206a to 1209 are executed. At this point the element counter is 1 and the term counter is 0. First, element determination step 1206a is executed. Because element C2 is negated, count-down step 1206c is executed and 1 is subtracted from the element counter, which changes from 1 to 0. Next, term determination step 1207 is executed. Because this term is negated, it has now changed from unsatisfied to satisfied. Count-up step 1208 is therefore executed, and the term counter is incremented from 0 to 1. Collation information saving step 1210 is then executed and (C2, 30, 48) is stored in output buffer 331. Iterative process 1200 ends here, but because this is the last document, logical condition satisfaction determination step 1203 is executed. Since the term counter is 1, the logical condition is satisfied. Result output step 1204 is therefore executed, and the contents of output buffer 331, which holds the collation information, are output together with the logical condition determination result information (L1, 0, 99) as composite condition determination result 206. L1 is the logical condition identifier, 0 is the document start position, and 99 is the document end position.

[0100] As a result of the above logical condition determination process, the following collation information (11-1) to (11-9) is finally output as composite condition determination result 206 for search condition expression (7-1), "Q = ((document [4C] understanding) [S] system) [AND] (document [s] search)".
(D1, 0, 0)    (11-1)
(T1, 31, 32)    (11-2)
(T2, 33, 34)    (11-3)
(P1, 31, 34)    (11-4)
(T3, 39, 40)    (11-5)
(T4, 41, 44)    (11-6)
(C1, 30, 48)    (11-7)
(C2, 30, 48)    (11-8)
(L1, 0, 99)    (11-9)
(11-1) is document identification information, and D1 is the document identifier. (11-2), (11-3), (11-5), and (11-6) are the collation term identification information for "document", "understanding", "search", and "system", respectively. (11-4) is the neighborhood condition identification information for the neighborhood condition "document [4C] understanding". (11-7) is the context condition identification information for the context condition "(document [4C] understanding) [S] system", and (11-8) is that for the context condition "document [s] search". The last item, (11-9), is the logical condition identification information for the logical condition "((document [4C] understanding) [S] system) [AND] (document [s] search)", and L1 is its logical condition identifier. This completes the description of how composite condition determination is realized by neighborhood condition determination program 310, context condition determination program 320, and logical condition determination program 330.
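Steps 1 to 8 of the walk-through can be replayed with a few lines of Python. The sketch below is a simplified model under stated assumptions: the function name `trace` is hypothetical, only the single negated term of L1 is modelled, and the document end position 99 is taken from the example as a parameter because the text here does not say how it is obtained.

```python
RECORDS = [
    ("D1", 0, 0), ("T1", 31, 32), ("T2", 33, 34), ("P1", 31, 34),
    ("T3", 39, 40), ("T4", 41, 44), ("C1", 30, 48), ("C2", 30, 48),
]
NEGATED_ELEMENTS = {"C1", "C2"}   # elements of L1: ¬((¬C1)+(¬C2))


def trace(records, doc_end):
    """Return (step, identifier, element counter, term counter) rows and the output."""
    element_counter = term_counter = 0
    output_buffer, rows = [], []
    for step, (ident, start, end) in enumerate(records, start=1):
        if start == 0 and end == 0:
            # Document record: clear the buffer and initialize both counters.
            output_buffer = []
            element_counter = len(NEGATED_ELEMENTS)
            term_counter = 0
        elif ident in NEGATED_ELEMENTS:
            was_satisfied = element_counter == 0
            element_counter -= 1                       # count-down
            if not was_satisfied and element_counter == 0:
                term_counter += 1                      # count-up
        output_buffer.append((ident, start, end))      # save the record
        rows.append((step, ident, element_counter, term_counter))
    if term_counter >= 1:                              # final judgment for the last document
        output_buffer.append(("L1", 0, doc_end))
    return rows, output_buffer


rows, result = trace(RECORDS, doc_end=99)
for row in rows:
    print(row)
print(result[-1])
```

The counters stay at (2, 0) through step 6, become (1, 0) at step 7, and reach (0, 1) at step 8, after which the nine records of (11-1) to (11-9) are available as the result.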

[0101] As explained above, configuring character string matching circuit 200 and composite condition determination circuit 300 in this way allows composite conditions such as neighborhood conditions, context conditions, and logical conditions to be evaluated in a unified manner, which makes possible the fine-grained searching that is characteristic of full-text search. Furthermore, if, for example, three microcomputers each execute one of neighborhood condition determination program 310, context condition determination program 320, and logical condition determination program 330, the programs can run without being synchronized with one another. That is, each program starts its condition determination whenever collation information is stored in its input buffer, so the programs operate as a pipeline and high-speed composite condition determination is achieved.
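The pipeline arrangement can be imitated in software with one thread per program and a FIFO queue between stages. This is only a sketch: threads stand in for the three microcomputers, the stage body is a placeholder that tags each record instead of evaluating a condition, and the `None` end-of-stream marker and the names `stage` and `run_pipeline` are assumptions of the example.

```python
import queue
import threading


def stage(name, inbox, outbox):
    """Process records as they arrive; the FIFO is the only coordination."""
    while True:
        rec = inbox.get()
        if rec is None:              # end-of-stream marker (assumed)
            outbox.put(None)
            return
        outbox.put(rec + (name,))    # placeholder for the real condition check


def run_pipeline(records):
    names = ["neighborhood", "context", "logical"]
    fifos = [queue.Queue() for _ in range(len(names) + 1)]
    threads = [
        threading.Thread(target=stage, args=(name, fifos[i], fifos[i + 1]))
        for i, name in enumerate(names)
    ]
    for t in threads:
        t.start()
    for rec in records:
        fifos[0].put(rec)
    fifos[0].put(None)
    out = []
    while (rec := fifos[-1].get()) is not None:
        out.append(rec)
    for t in threads:
        t.join()
    return out


piped = run_pipeline([("T1", 31, 32), ("T2", 33, 34)])
print(piped)
```

Each stage blocks only on its own input queue, so a later stage can begin work on the first record while an earlier stage is still handling the next one.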

[0102] Next, a second embodiment of the present invention is described with reference to FIG. 26. The first embodiment has the drawback that the three composite condition programs must always be run as a pipeline, even though only one microprocessor does useful work when only one of the composite conditions is set. This embodiment resolves that drawback by bypassing the determination program for any composite condition that is not used in the given search condition, and thereby provides efficient composite condition determination. This embodiment consists of character string matching circuit 200 and composite condition determination circuit 300a. Composite condition determination circuit 300a consists of three microcomputers, namely MPUa 301, MPUb 302, and MPUc 303, together with multiplexers 390 to 392 and selectors 380 to 382. Microcomputer MPUa 301 executes neighborhood condition determination program 310, microcomputer MPUb 302 executes context condition determination program 320, and microcomputer MPUc 303 executes logical condition determination program 330. In addition, buffers 350, 360, and 370 using first-in, first-out (FIFO) memory are placed between the MPUs and are used to pass data between them.

[0103] The bypass function of the composite condition determination circuit, which is the distinguishing feature of this embodiment, is described next. This function uses the multiplexers and selectors to bypass any microprocessor whose determination program corresponds to a composite condition that is not used in the specified search condition expression, so that only the microprocessors carrying the determination programs actually needed are run. For example, when the search condition expression "document [4C] understanding" is specified, only the neighborhood condition is used. Collation information 205 obtained from character string matching circuit 200 is therefore input to neighborhood condition determination program 310 via buffer 350, neighborhood condition determination is performed by program 310, and the resulting collation information is sent out directly, via buffer 360, as composite condition determination result 206. When the search condition expression "(document [S] understanding) [AND] system" is specified, the context condition and the logical condition are used. Collation information 205 obtained from character string matching circuit 200 is therefore input directly to context condition determination program 320 via buffer 350, the collation information output from program 320 is input to logical condition determination program 330 via buffer 370, and the collation information output from program 330 is sent out as composite condition determination result 206.

[0104] A concrete way of realizing this bypass function of composite condition determination circuit 300a is now described. The bypass operation is realized by the settings of multiplexers 390 to 392 and selectors 380 to 382. The setting information is supplied as search information 202 from search control means 101 of FIG. 2. Multiplexer 390 sends collation information 205 from character string matching circuit 200 to logical condition determination program 330 when a1 is selected, to context condition determination program 320 when b1 is selected, to neighborhood condition determination program 310 when c1 is selected, and out as composite condition determination result 206 of composite condition circuit 300a when d1 is selected. Multiplexer 391 sends the output of neighborhood condition determination program 310 to context condition determination program 320 when a2 is selected, to logical condition determination program 330 when b2 is selected, and directly out as the output of composite condition circuit 300a when c2 is selected. Multiplexer 392 sends the output of context condition determination program 320 to logical condition determination program 330 when a3 is selected, and out as composite condition determination result 206 when b3 is selected.
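The multiplexer destinations described above form a small routing table. The sketch below follows the selected links from the input to the output and lists the programs visited. The node names and the function `route` are hypothetical, and the sketch assumes that the output of the logical condition program always goes to the result, since no multiplexer is described for it.

```python
# (source, destination) for each multiplexer setting named in the text.
MUX = {
    "a1": ("input", "logical"), "b1": ("input", "context"),
    "c1": ("input", "neighborhood"), "d1": ("input", "output"),
    "a2": ("neighborhood", "context"), "b2": ("neighborhood", "logical"),
    "c2": ("neighborhood", "output"),
    "a3": ("context", "logical"), "b3": ("context", "output"),
}


def route(settings):
    """Return the programs visited, in order, for the given settings."""
    links = dict(MUX[s] for s in settings)
    links.setdefault("logical", "output")   # assumed fixed connection
    path, node = [], "input"
    while node != "output":
        node = links[node]
        path.append(node)
    return path[:-1]                        # drop the final "output" node


print(route(["c1", "a2", "b3"]))   # ['neighborhood', 'context']
print(route(["b1", "a3"]))         # ['context', 'logical']
print(route(["d1"]))               # []
```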

[0105] Selector 380 selects the input of context condition determination program 320. When b1 is set in multiplexer 390, it selects X1, i.e. matching result 205 of character string matching circuit 200; when a2 is set in multiplexer 391, it selects Y1, i.e. the output of neighborhood condition determination program 310. Selector 381 selects the input of logical condition determination program 330. When a1 is set in multiplexer 390, it selects X2, i.e. matching result 205 of character string matching circuit 200; when multiplexer 391 is set to b2, it selects Y2, i.e. the output of neighborhood condition determination program 310; and when multiplexer 392 is set to a3, it selects Z2, i.e. the output of context condition determination program 320. Selector 382 selects composite condition determination result 206. When d1 is set in multiplexer 390, it selects Z3, i.e. matching result 205 of character string matching circuit 200; when multiplexer 391 is set to c2, it selects Y3, i.e. the output of neighborhood condition determination program 310; when multiplexer 392 is set to b3, it selects X3, i.e. the output of context condition determination program 320; and in all other cases it selects W3, i.e. the output of logical condition determination program 330. As described above, the settings of selectors 380 to 382 follow from the settings of multiplexers 390 to 392.
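Because each selector input is determined by a multiplexer setting, the selector settings can be derived mechanically. The lookup below transcribes the pairs given in the text; the function name `selector_settings` and the dictionary layout are hypothetical, and selectors that feed an unused program are simply left out of the result.

```python
# Multiplexer setting -> (selector number, selector input), as listed in the text.
SELECTOR_FOR = {
    "b1": (380, "X1"), "a2": (380, "Y1"),
    "a1": (381, "X2"), "b2": (381, "Y2"), "a3": (381, "Z2"),
    "d1": (382, "Z3"), "c2": (382, "Y3"), "b3": (382, "X3"),
}


def selector_settings(mux_settings):
    """Derive the selector inputs implied by a set of multiplexer settings."""
    chosen = {382: "W3"}   # default: output of logical condition program 330
    for setting in sorted(mux_settings):
        if setting in SELECTOR_FOR:
            selector, selector_input = SELECTOR_FOR[setting]
            chosen[selector] = selector_input
    return chosen


all_three = selector_settings({"c1", "a2", "a3"})
neighborhood_only = selector_settings({"c1", "c2"})
print(all_three, neighborhood_only)
```

Setting c1 has no entry because neighborhood condition program 310 has only one possible input and so needs no selector.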

[0106] Depending on the combination of composite conditions, the multiplexers are set as follows.
■ No composite condition: select d1 only
■ Neighborhood condition only: select c1 and c2
■ Context condition only: select b1 and b3
■ Logical condition only: select a1 only
■ Neighborhood and context conditions: select c1, a2, and b3
■ Neighborhood and logical conditions: select c1 and b2
■ Context and logical conditions: select b1 and a3
■ Neighborhood, context, and logical conditions: select c1, a2, and a3
As described above, based on the setting information for multiplexers 390 to 392 and selectors 380 to 382 supplied as search information 202 from search control means 101, the composite condition determination programs, namely neighborhood condition determination program 310, context condition determination program 320, and logical condition determination program 330, are selectively connected through multiplexers 390 to 392 and selectors 380 to 382. This resolves the drawback of the first embodiment, in which the three composite condition determination programs are always run as a pipeline even though only one microprocessor is used when only one of the composite conditions is set. Efficient composite condition determination thus becomes possible, and a document retrieval device that searches at high speed can be realized.
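The eight combinations in the list follow one rule: connect the input to the first condition in use, each condition in use to the next, and the last one to the output. The sketch below derives the settings from that rule and compares them with the list. The names `mux_settings`, `SETTING`, and `ORDER` are hypothetical, and the rule itself is an inference from the list, not something the source states.

```python
ORDER = ["neighborhood", "context", "logical"]

# (source, destination) -> multiplexer setting, from the description of 390 to 392.
SETTING = {
    ("input", "logical"): "a1", ("input", "context"): "b1",
    ("input", "neighborhood"): "c1", ("input", "output"): "d1",
    ("neighborhood", "context"): "a2", ("neighborhood", "logical"): "b2",
    ("neighborhood", "output"): "c2",
    ("context", "logical"): "a3", ("context", "output"): "b3",
}


def mux_settings(used):
    """Return the multiplexer settings for the set of conditions in use."""
    chain = ["input"] + [s for s in ORDER if s in used] + ["output"]
    # The link from the logical program to the output needs no setting.
    return [SETTING[link] for link in zip(chain, chain[1:]) if link in SETTING]


TABLE = {
    frozenset(): ["d1"],
    frozenset({"neighborhood"}): ["c1", "c2"],
    frozenset({"context"}): ["b1", "b3"],
    frozenset({"logical"}): ["a1"],
    frozenset({"neighborhood", "context"}): ["c1", "a2", "b3"],
    frozenset({"neighborhood", "logical"}): ["c1", "b2"],
    frozenset({"context", "logical"}): ["b1", "a3"],
    frozenset({"neighborhood", "context", "logical"}): ["c1", "a2", "a3"],
}
agrees = all(mux_settings(used) == expected for used, expected in TABLE.items())
print(agrees)
```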

[0107] Next, a third embodiment of the present invention is described with reference to FIG. 27. The aim of this embodiment is to realize composite condition determination in composite condition determination circuit 300b by placing the three composite condition determination programs, namely neighborhood condition determination program 310, context condition determination program 320, and logical condition determination program 330, on a single microcomputer and executing them in turn by switching among them. Composite condition determination circuit 300b of this embodiment is slower than the first embodiment, but because it can be realized with a single microcomputer it has the advantage of lower cost.

[0108] This embodiment consists of character string matching circuit 200 and composite condition determination circuit 300b. Composite condition determination circuit 300b in turn consists of microcomputer MPUa 301 and buffer 350, which passes data from character string matching circuit 200. Microcomputer MPUa 301 carries the three composite condition determination programs, namely neighborhood condition determination program 310, context condition determination program 320, and logical condition determination program 330, together with scheduler 340, which switches among them. Buffers 360 and 370 are allocated in the programs' work area, but because the programs use them as first-in, first-out (FIFO) memory, the same function as in the first embodiment is obtained. Scheduler 340 switches the composite condition programs in the following order: neighborhood condition determination program 310 is executed first, then context condition determination program 320, then logical condition determination program 330, after which control returns to neighborhood condition determination program 310. The point at which scheduler 340 switches programs can be, for example, every n items of collation term information or every n documents. It is chosen in view of the time scheduler 340 takes to switch programs. If switching is too frequent, the switching time becomes a large fraction of the execution time of the composite condition programs, so it is effective to switch programs every several hundred to several thousand items of collation information, or every several tens to several hundred documents. Configuring composite condition determination circuit 300b in this way gives lower processing speed than the first embodiment, but because the composite condition determination of neighborhood condition determination program 310, context condition determination program 320, and logical condition determination program 330 is carried out by a single microcomputer, a low-cost document retrieval device can be realized.
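The effect of the switching interval can be illustrated with a round-robin loop over in-memory FIFOs. This is a sketch under stated assumptions: `run_scheduler` is a hypothetical name, the stage functions are placeholders that tag each record, switching is counted per turn (including turns with nothing to do), and the batch sizes are chosen only to keep the example small.

```python
from collections import deque


def run_scheduler(records, stages, batch_size):
    """Run the stages in a fixed cycle, each handling up to batch_size records per turn."""
    fifos = [deque(records)] + [deque() for _ in stages]
    switches = 0
    while any(fifos[:-1]):                    # work remains in some input FIFO
        for i, stage in enumerate(stages):
            for _ in range(min(batch_size, len(fifos[i]))):
                fifos[i + 1].append(stage(fifos[i].popleft()))
            switches += 1                     # one program switch per turn
    return list(fifos[-1]), switches


stages = [
    lambda rec: rec + ("neighborhood",),      # placeholder for program 310
    lambda rec: rec + ("context",),           # placeholder for program 320
    lambda rec: rec + ("logical",),           # placeholder for program 330
]
records = [("T%d" % i, i, i + 1) for i in range(1, 7)]

small_out, small_switches = run_scheduler(records, stages, batch_size=2)
large_out, large_switches = run_scheduler(records, stages, batch_size=6)
print(small_switches, large_switches)   # 9 3
```

Both runs produce the same output in the same order; the larger batch simply reaches it with fewer program switches, which is the trade-off the paragraph describes.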

[0109] Next, a fourth embodiment of the present invention will be described with reference to FIG. 28. In the complex condition determination circuit 300c of this embodiment, two of the complex condition determination programs, the neighborhood condition determination program 310 and the context condition determination program 320, are loaded on one microcomputer, which switches between them and processes them in turn, thereby carrying out the neighborhood condition and context condition determinations. The logical condition determination program 330 is loaded on a separate microcomputer MPUb 303. Although the complex condition determination circuit 300c is slower than the first embodiment, it can be built with two microcomputers, which keeps the cost down. This embodiment consists of the character string matching circuit 200 and the complex condition determination circuit 300c; the complex condition determination circuit 300c comprises a microcomputer MPUa 305, the microcomputer MPUb 303, and buffers 350 and 370 through which data is passed among the two microcomputers and the character string matching circuit 200.

[0110] The microcomputer MPUa carries two complex condition determination programs, the neighborhood condition determination program 310 and the context condition determination program 320, and a scheduler 341 that switches between them. The buffer 360 is allocated as a work area of the programs; by having the programs use it as a first-in, first-out (FIFO) memory, the same function as in the first embodiment is obtained. The scheduler 341 switches the programs in a cycle: it first runs the neighborhood condition determination program 310, then the context condition determination program 320, and then returns to the neighborhood condition determination program 310. The switch is triggered, for example, each time n pieces of collation term information have been processed or each time n documents have been processed. The interval is chosen by weighing it against the program switching time of the scheduler 341. If the programs are switched too frequently, the switching time takes up a large share of the execution time of the complex condition determination programs, so it is effective to switch every several hundred to several thousand pieces of collation information, or every several tens to several hundred documents. With the complex condition determination circuit 300c configured in this way, the processing speed is lower than in the first embodiment, but two microcomputers suffice to carry out the complex condition determination processing of the neighborhood condition determination program 310, the context condition determination program 320, and the logical condition determination program 330, so a relatively fast document retrieval device can be provided at reduced cost.
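
The reasoning about switching frequency can be made concrete with a small calculation. The sketch below assumes a simple cost model in which one program switch follows every batch of n items; the function name and the two timing constants are hypothetical values chosen only for illustration, not figures from the patent.

```python
def switch_overhead_fraction(items_per_switch, item_time, switch_time):
    """Fraction of total time spent switching when one switch follows every batch."""
    batch_time = items_per_switch * item_time
    return switch_time / (batch_time + switch_time)


# Hypothetical costs in arbitrary time units (assumptions, not measured values).
ITEM_TIME = 10      # time to process one piece of collation information
SWITCH_TIME = 100   # time for the scheduler to switch programs

for n in (1, 10, 100, 1000):
    fraction = switch_overhead_fraction(n, ITEM_TIME, SWITCH_TIME)
    print(f"n = {n:4d}: switching takes {fraction:.1%} of the time")
```

Under these assumed costs, switching after every item spends most of the time on switching, while batches of several hundred items or more reduce the overhead to about one percent or less, which matches the qualitative guidance in the text.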

[0111] Next, a fifth embodiment of the present invention will be described with reference to FIG. 29. The fourth embodiment has the drawback that the pipeline processing of complex conditions is always performed by two microprocessors, even when only one of the complex conditions is specified. This embodiment removes that drawback by bypassing the determination program of any complex condition that is not used in the given search condition, and its purpose is to provide a complex condition determination circuit 300d with high processing efficiency. This embodiment consists of the character string matching circuit 200 and the complex condition determination circuit 300d; the complex condition determination circuit 300d comprises the microcomputer MPUa 305, the microcomputer MPUb 303, buffers 350 and 370 through which data is passed among the two microcomputers and the character string matching circuit 200, multiplexers 391 and 392, and selectors 380 and 381.

[0112] The bypass function of the complex condition determination circuit, which characterizes this embodiment, is as follows. Using the multiplexers and selectors, it bypasses any microprocessor whose complex condition determination programs are not used in the specified search condition expression, so that only the microprocessors carrying the programs that are used are run. For example, when the search condition expression "document [4C] understanding" (文書[4C]理解 in the original) is specified, only the neighborhood condition is used. The collation information 205 obtained from the character string matching circuit 200 is therefore input via the buffer 350 to the neighborhood condition determination program 310 and the context condition determination program 320, and the collation information they output is sent out directly, via the buffer 370, as the complex condition determination result 206. When the search condition expression "understanding [AND] system" (理解[AND]システム in the original) is specified, the logical condition is used. The collation information 205 obtained from the character string matching circuit 200 is therefore input via the buffer 350 directly to the logical condition determination program 330, and the collation information output by that program is sent out as the complex condition determination result 206.

[0113] A concrete way of realizing this bypass function is as follows. The behavior is obtained through the settings of the multiplexers 391 and 392 and the selectors 380 and 381, which are supplied as search control information 202 from the search control means 101 of FIG. 2. The multiplexer 391 routes the collation information 205 from the character string matching circuit 200: when a2 is specified, it is sent to the logical condition determination program 330; when b2 is specified, it is sent to the neighborhood condition determination program 310; and when c2 is specified, it is sent out directly as the output of the complex condition circuit 300. The multiplexer 392 routes the output of the context condition determination program 320: when a3 is specified, it is sent to the logical condition determination program 330, and when b3 is specified, it is sent out as the complex condition determination result 206.

[0114] The selector 380 selects the input of the logical condition determination program 330. When a2 is set in the multiplexer 391, it selects X1, that is, the collation result 205 of the character string matching circuit 200; when a3 is set in the multiplexer 392, it selects Y1, that is, the output of the context condition determination program 320. The selector 381 selects the complex condition determination result 206. When c2 is set in the multiplexer 391, it selects X2, that is, the collation result 205 of the character string matching circuit 200; when the multiplexer 392 is set to b3, it selects Y2, that is, the output of the context condition determination program 320; otherwise it selects Z2, that is, the output of the logical condition determination program 330. As described above, the selectors 380 and 381 are set according to the settings of the multiplexers 391 and 392.

[0115] Depending on the combination of complex conditions in use, the multiplexers are set as follows.
(1) No complex condition: select c2 only
(2) Logical condition only: select a2 only
(3) Neighborhood condition and context condition: select b2 and b3
(4) Neighborhood condition, context condition, and logical condition: select b2 and a3
As described above, on the basis of the setting information for the multiplexers 391 and 392 and the selectors 380 and 381, supplied as the search control information 202 from the search control means 101, the complex condition determination programs (the neighborhood condition determination program 310, the context condition determination program 320, and the logical condition determination program 330) are selectively connected by means of the multiplexers 391 and 392 and the selectors 380 and 381. This removes the drawback of the fourth embodiment, in which the pipeline processing of complex conditions is always performed by two microprocessors even though only one microprocessor is actually needed when only one of the complex conditions is specified. Efficient complex condition determination thus becomes possible, and a document retrieval device that is relatively low in cost and relatively fast can be realized.
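
The relationship between the multiplexer settings and the selector settings of the fifth embodiment can be summarized in a short Python sketch. The function name and its two boolean parameters are hypothetical. The sketch also assumes that the row for "neighborhood condition and context condition" applies whenever either of those two conditions is used, which is how the neighborhood-only example of paragraph [0112] is routed.

```python
def fifth_embodiment_settings(front_used, logical_used):
    """Return multiplexer and selector settings for one search condition expression.

    front_used   -- a neighborhood and/or context condition is present
                    (programs 310 and 320, both on MPUa 305)
    logical_used -- a logical condition is present (program 330 on MPUb 303)
    """
    if front_used:
        mux391 = "b2"                             # 205 goes to program 310
        mux392 = "a3" if logical_used else "b3"   # output of 320 goes to 330, or to result 206
    else:
        mux391 = "a2" if logical_used else "c2"   # 205 goes to 330, or straight to the output
        mux392 = None                             # MPUa is bypassed, so 392 carries no data

    # Selector 380 chooses the input of program 330.
    if mux391 == "a2":
        sel380 = "X1"
    elif mux392 == "a3":
        sel380 = "Y1"
    else:
        sel380 = None                             # program 330 is bypassed

    # Selector 381 chooses what becomes result 206.
    if mux391 == "c2":
        sel381 = "X2"
    elif mux392 == "b3":
        sel381 = "Y2"
    else:
        sel381 = "Z2"

    return {"391": mux391, "392": mux392, "380": sel380, "381": sel381}
```

For instance, `fifth_embodiment_settings(True, False)` corresponds to the "document [4C] understanding" example, and `fifth_embodiment_settings(False, True)` to the "understanding [AND] system" example.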

[0116] Next, a sixth embodiment of the present invention will be described with reference to FIG. 30. In the complex condition determination circuit 300e of this embodiment, two of the complex condition determination programs, the context condition determination program 320 and the logical condition determination program 330, are loaded on one microcomputer, which switches between them and processes them in turn, so that the context condition and logical condition determinations are carried out by a single microcomputer. The neighborhood condition determination program 310 is assigned to a separate microcomputer. Although the complex condition determination circuit 300e is slower than the first embodiment, it can be built with two microcomputers, so the cost is kept down while relatively high performance is maintained.

[0117] This embodiment consists of the character string matching circuit 200 and the complex condition determination circuit 300e; the complex condition determination circuit 300e comprises a microcomputer MPUa 301, a microcomputer MPUb 306, and buffers 350 and 360 through which data is passed among the two microcomputers and the character string matching circuit 200. The microcomputer MPUb 306 carries two complex condition determination programs, the context condition determination program 320 and the logical condition determination program 330, and a scheduler 342 that switches between them. The buffer 370 is allocated in the work area of the programs; by having the programs use it as a first-in, first-out (FIFO) memory, the same function as in the first embodiment is obtained. The scheduler 342 switches the programs in a cycle: it first runs the context condition determination program 320, then the logical condition determination program 330, and then returns to the context condition determination program 320. The switch can be triggered, for example, each time n pieces of collation term information have been processed or each time n documents have been processed. The interval is chosen by weighing it against the program switching time of the scheduler 342. If the programs are switched too frequently, the switching time takes up a large share of the execution time of the complex condition determination programs, so it is effective to switch every several hundred to several thousand pieces of collation information, or every several tens to several hundred documents. With the complex condition determination circuit 300e configured in this way, the processing speed is lower than in the first embodiment, but the complex condition determination processing of the neighborhood condition determination program 310, the context condition determination program 320, and the logical condition determination program 330 is carried out by two microcomputers, so a relatively fast document retrieval device can be realized at low cost.

[0118] Next, a seventh embodiment of the present invention will be described with reference to FIG. 31. The sixth embodiment has the problem that the pipeline processing of complex conditions is always performed by two microprocessors, even when only one of the complex conditions is specified. This embodiment solves that problem by bypassing the determination program of any complex condition that is not used in the given search condition, and its purpose is to realize a complex condition determination circuit 300f with high processing efficiency. This embodiment consists of the character string matching circuit 200 and the complex condition determination circuit 300f; the complex condition determination circuit 300f comprises the microcomputer MPUa 301, the microcomputer MPUb 306, buffers 350 and 360 through which data is passed among the two microcomputers and the character string matching circuit 200, multiplexers 391 and 392, and selectors 380 and 381.

[0119] The bypass function of the complex condition determination circuit, which characterizes this embodiment, is as follows. Using the multiplexers and selectors, it bypasses any microprocessor whose complex condition determination programs are not used in the specified search condition expression, so that only the microprocessors carrying the programs that are used are run. For example, when the search condition expression "document [4C] understanding" (文書[4C]理解 in the original) is specified, only the neighborhood condition is used. The collation information 205 obtained from the character string matching circuit 200 is therefore input via the buffer 350 to the neighborhood condition determination program 310, and the collation information it outputs is sent out directly, via the buffer 370, as the complex condition determination result 206. When the search condition expression "understanding [AND] system" (理解[AND]システム in the original) is specified, the logical condition is used. The collation information 205 obtained from the character string matching circuit 200 is therefore input via the buffer 350 directly to the context condition determination program 320 and then, via the buffer 370, to the logical condition determination program 330, and the collation information output by the logical condition determination program 330 is sent out as the complex condition determination result 206.

[0120] A concrete way of realizing this bypass function is as follows. The behavior is obtained through the settings of the multiplexers 391 and 392 and the selectors 380 and 381, which are supplied as search control information 202 from the search control means 101 of FIG. 2. The multiplexer 391 routes the collation information 205 output from the character string matching circuit 200: when a2 is specified, it is sent to the context condition determination program 320; when b2 is specified, it is sent to the neighborhood condition determination program 310; and when c2 is specified, it is sent out directly as the output of the complex condition circuit 300. The multiplexer 392 routes the output of the neighborhood condition determination program 310: when a3 is specified, it is sent to the context condition determination program 320, and when b3 is specified, it is sent out as the complex condition determination result 206.

[0121] The selector 380 selects the input of the context condition determination program 320. When a2 is set in the multiplexer 391, it selects X1, that is, the collation result 205 of the character string matching circuit 200; when a3 is set in the multiplexer 392, it selects Y1, that is, the output of the neighborhood condition determination program 310. The selector 381 selects the complex condition determination result 206. When c2 is set in the multiplexer 391, it selects X2, that is, the collation result 205 of the character string matching circuit 200; when the multiplexer 392 is set to b3, it selects Y2, that is, the output of the neighborhood condition determination program 310; otherwise it selects Z2, that is, the output of the logical condition determination program 330.

[0122] As described above, the selectors 380 and 381 are set according to the settings of the multiplexers 391 and 392. That is, depending on the combination of complex conditions in use, the multiplexers are set as follows.
(1) No complex condition: select c2 only
(2) Neighborhood condition only: select b2 and b3
(3) Context condition and logical condition: select a2 only
(4) Neighborhood condition, context condition, and logical condition: select b2 and a3
As described above, on the basis of the setting information for the multiplexers 391 and 392 and the selectors 380 and 381, supplied as the search control information 202 from the search control means 101, the complex condition determination programs (the neighborhood condition determination program 310, the context condition determination program 320, and the logical condition determination program 330) are selectively connected by means of the multiplexers 391 and 392 and the selectors 380 and 381. This removes the drawback of the sixth embodiment, in which the pipeline processing of complex conditions is always performed by two microprocessors even though only one microprocessor is actually needed when only one of the complex conditions is specified. Efficient complex condition determination thus becomes possible, and a relatively fast document retrieval device can be provided at reduced cost.
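
To see which stages a given search expression actually passes through in the seventh embodiment, the four listed combinations can be written as a lookup table and traced. This is a sketch only: the table keys, the function name `trace_route`, and the stage labels are hypothetical, and combinations that the text does not list are deliberately left out, so they raise a `KeyError` rather than being guessed.

```python
# Multiplexer settings (391, 392) for the four combinations listed in the text.
SEVENTH_EMBODIMENT_TABLE = {
    frozenset(): ("c2", None),
    frozenset({"neighborhood"}): ("b2", "b3"),
    frozenset({"context", "logical"}): ("a2", None),
    frozenset({"neighborhood", "context", "logical"}): ("b2", "a3"),
}


def trace_route(conditions):
    """List the stages traversed from circuit 200 to result 206."""
    mux391, mux392 = SEVENTH_EMBODIMENT_TABLE[frozenset(conditions)]
    path = ["circuit 200"]
    if mux391 == "c2":                 # everything is bypassed
        return path + ["result 206"]
    if mux391 == "b2":                 # neighborhood program on MPUa 301
        path.append("program 310")
        if mux392 == "b3":             # MPUb 306 is bypassed
            return path + ["result 206"]
    # Either a2 (direct from 200) or a3 (from 310): context, then logical, on MPUb 306.
    return path + ["program 320", "program 330", "result 206"]
```

Calling `trace_route({"neighborhood"})` shows that only program 310 runs, so the microcomputer holding programs 320 and 330 stays idle for that query.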

[0123] Next, an eighth embodiment of the present invention will be described with reference to FIG. 32. In the first embodiment, the collation information output from the character string matching circuit 200 also contains collation information for context identification character strings, that is, context markers, which are not subject to neighborhood condition processing. The neighborhood condition determination program 310 therefore performs neighborhood condition determination on the context marker collation information as well, which slows down the neighborhood condition determination. To solve this problem, this embodiment provides a buffer 380 that stores only the context marker collation information, so that this information bypasses the neighborhood condition determination and is input to the context condition determination program 320, which needs it. The purpose is to realize a complex condition determination circuit 300g capable of high-speed complex condition determination.

[0124] This embodiment consists of the character string matching circuit 200 and the complex condition determination circuit 300g. The complex condition determination circuit 300g comprises a microcomputer MPUa 301, a microcomputer MPUb 302a, and a microcomputer MPUc 303; buffers 350, 360, 370, and 380 that pass data between them; a multiplexer 710; and a context marker detector 720. The multiplexer 710 normally selects port a and sends the collation information 205 from the character string matching circuit 200 to the buffer 350, which is the input of the neighborhood condition determination program 310. When a context marker detection signal 721 arrives from the context marker detector 720 described below, the multiplexer 710 selects port b and sends the information to the buffer 380, which is the input of the sort-merge program 730 described below. In other words, context marker collation information is sent to the buffer 380. When a document identification information detection signal 722 arrives from the context marker detector 720, the multiplexer 710 selects both port a and port b and sends the information to the buffer 350 and the buffer 380 simultaneously. In other words, document identification information is sent to both the buffer 350 and the buffer 380.

[0125] The context marker detector 720 examines the output of the character string matching circuit 200 and determines whether it is collation information of a context marker. If the collation information identifier equals the predetermined collation information identifier of the context marker, and neither the start position information nor the end position information is 0 (zero), the item is judged to be a context marker. If the start position information and the end position information are both 0 (zero), the item is judged to be document identification information. The context marker detector 720 consists of a comparator for detecting document identification information, two comparators for identifying context marker collation term information, a register that holds the context marker identifier, and a register that holds 0 (zero). The comparator for detecting document identification information checks whether the position information of the collation information is 0: it compares the position information with the register holding 0 and outputs the document identification information detection signal 722 when they are equal. Of the two comparators for identifying context marker collation term information, one checks whether the identifier is that of a context marker, and the other checks whether the collation information being examined is collation term information. The context marker detection signal 721 is output only when both comparators output a match signal. The first comparator compares the collation information identifier with the register holding the context marker identifier and outputs a match signal when they are equal. The second comparator compares the position information of the collation information with the register holding 0 and outputs a match signal when they are not equal, that is, when the position information is not 0. The context marker detector 720 is configured as described above.
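
The decision logic of the context marker detector 720 can be modeled in a few lines of Python. The record layout `Collation(identifier, start, end)` and the function name `detect` are hypothetical names introduced for this sketch; the patent describes hardware comparators and registers, which are represented here by ordinary comparisons.

```python
from typing import NamedTuple


class Collation(NamedTuple):
    identifier: int  # collation information identifier
    start: int       # start position information
    end: int         # end position information


def detect(rec, marker_identifier):
    """Return (signal_721, signal_722) for one piece of collation information."""
    # Comparator against the register holding 0: document identification information.
    position_is_zero = rec.start == 0 and rec.end == 0
    # Comparator against the same register, match on inequality: collation term information.
    position_is_nonzero = rec.start != 0 and rec.end != 0
    # Comparator against the register holding the context marker identifier.
    identifier_matches = rec.identifier == marker_identifier

    signal_722 = position_is_zero
    signal_721 = identifier_matches and position_is_nonzero
    return signal_721, signal_722
```

A record whose identifier equals the marker identifier but whose positions are both zero raises only signal 722, so it is treated as document identification information rather than as a context marker.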

[0126] When context marker collation information is input to the context marker detector 720, the detector outputs the context marker detection signal 721 to the multiplexer 710. In response, the multiplexer 710 switches the destination of the collation information 205 sent from the character string matching circuit 200 from the buffer 350 to the buffer 380. When the input is document identification information, the detector outputs the document identification information detection signal 722 to the multiplexer 710. In response, the multiplexer 710 sets the destination of the collation information 205 sent from the character string matching circuit 200 to both the buffer 350 and the buffer 380.
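
The routing performed by the multiplexer 710 can be sketched as follows. The function name `route_710` and the three string labels are hypothetical; each input pair stands for one piece of collation information together with the detector's verdict on it.

```python
def route_710(records):
    """Distribute collation information to buffers 350 and 380.

    records -- iterable of (kind, payload) pairs, where kind is
               'document' (signal 722), 'marker' (signal 721), or 'term' (no signal)
    """
    buffer_350, buffer_380 = [], []
    for kind, payload in records:
        if kind == "document":      # ports a and b: both buffers receive it
            buffer_350.append(payload)
            buffer_380.append(payload)
        elif kind == "marker":      # port b: bypasses the neighborhood determination
            buffer_380.append(payload)
        else:                       # port a (the default): goes to program 310
            buffer_350.append(payload)
    return buffer_350, buffer_380
```

Sending the document identification information down both paths keeps the two streams aligned on document boundaries, which the later per-document merge relies on.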

[0127] The microcomputer 302a executes the sort-merge program 730, the context condition determination program 320, and a scheduler 342 that controls them. The sort-merge program 730 merges the collation information output by the neighborhood condition determination program 310 and stored in the buffer 360 with the context marker collation information stored in the buffer 380, in ascending order of end position information. That is, it reads one piece of collation information from each of the buffer 360 and the buffer 380 and compares their end position information. If the end position information of the collation information read from the buffer 360 is smaller than that of the collation information read from the buffer 380, the collation information read from the buffer 360 is output to the buffer 390 first. The next piece of collation information is then read from the buffer 360 and compared in the same way with the collation information previously read from the buffer 380, and the smaller one is output to the buffer 390. Conversely, if the end position information of the collation information from the buffer 380 is smaller than that of the collation information from the buffer 360, the collation information from the buffer 380 is output to the buffer 390. The next piece of collation information is then read from the buffer 380 and compared in the same way with the collation information previously read from the buffer 360, and the smaller one is output to the buffer 390. Of the document identification information, only that read from the buffer 360 is output to the buffer 390; document identification information read from the buffer 380 is not output to the buffer 390. The sort-merge is performed document by document.
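The merge described for the sort-merge program 730 is an ordinary two-way merge keyed on the end position. The sketch below shows it for a single document; it is a hedged illustration, not the program itself. Records are assumed to be `(identifier, start, end)` tuples, document identification information is assumed to have position 0, and the handling of equal end positions (buffer 360 first) is a choice made here because the text only covers the "smaller" case.

```python
def is_document_id(record):
    # Assumption: document identification information has start == end == 0.
    return record[1] == 0 and record[2] == 0


def sort_merge_document(from_360, from_380):
    """Merge one document's records from buffers 360 and 380 into buffer 390 order."""
    # Only the document identification information from buffer 360 is forwarded.
    out_390 = [r for r in from_360 if is_document_id(r)]
    a = [r for r in from_360 if not is_document_id(r)]
    b = [r for r in from_380 if not is_document_id(r)]

    i = j = 0
    while i < len(a) and j < len(b):
        # Compare end position information; ties go to buffer 360 (assumption).
        if a[i][2] <= b[j][2]:
            out_390.append(a[i])
            i += 1
        else:
            out_390.append(b[j])
            j += 1
    # One input is exhausted; the rest of the other is already in order.
    out_390.extend(a[i:])
    out_390.extend(b[j:])
    return out_390
```

Both inputs must already be in ascending order of end position, which holds when each buffer is filled in the order the text is scanned.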

[0128] As a result of this processing, the collation information of the buffers 360 and 380 is sort-merged by end position information into the buffer 390, which therefore holds the same collation information as that output by the neighborhood condition determination program 310 in the first embodiment. The scheduler 343 switches programs in the following order: it first executes the sort-merge program 730, then executes the context condition determination program 320, and then returns to the sort-merge program 730. The scheduler 343 may switch programs, for example, every time n pieces of collation term information have been processed or every time n documents have been processed. The timing is chosen with regard to the time the scheduler 343 needs to switch programs. If programs are switched frequently, the switching time comes to occupy a large proportion of the program execution time, so it is effective to switch programs every several hundred to several thousand pieces of collation information, or every several tens to several hundreds of documents. By realizing the complex condition determination circuit 300g in this way, the context marker collation information, which is not needed for neighborhood condition determination, can bypass the neighborhood condition determination program 310, and a document search device faster than that of the first embodiment can be realized.
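The trade-off behind the switching interval can be made concrete with a small calculation. The following sketch assumes a simple cost model that is not stated in the source: each switch costs a fixed `switch_time`, and each piece of collation information costs a fixed `per_item_time`. The function name and the numbers in the demonstration loop are hypothetical.

```python
def switch_overhead_fraction(batch_size, per_item_time, switch_time):
    """Fraction of total time spent switching when programs alternate every batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    work_time = batch_size * per_item_time
    return switch_time / (switch_time + work_time)


if __name__ == "__main__":
    # Hypothetical timings: one switch costs as much as ten items.
    for n in (1, 10, 100, 1000):
        print(n, round(switch_overhead_fraction(n, 1.0, 10.0), 4))
```

Under this model the overhead fraction falls as the batch grows, which is consistent with the recommendation to switch only every few hundred to few thousand pieces of collation information.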

[0129] Next, a ninth embodiment of the present invention will be described with reference to FIG. 33. Like the eighth embodiment, this embodiment addresses the following problem of the first embodiment: because the collation information output from the character string matching circuit 200 also includes collation information for context markers, which are not subject to neighborhood condition processing, the neighborhood condition determination program 310 performs neighborhood condition determination on the context marker collation information as well, and neighborhood condition determination slows down. The aim is to realize a complex condition determination circuit 300h capable of high-speed complex condition determination by providing a buffer that stores only context marker collation information, so that this information bypasses neighborhood condition determination and is input only to the context condition determination program 320, which needs it.

[0130] This embodiment differs from the eighth embodiment in that a context marker character string matching circuit 200a is provided exclusively for context marker detection. In the eighth embodiment, all search terms, including the context markers, are set in the character string matching circuit 200, so the number of search terms set in that circuit grows and may even exceed the number of search terms the circuit allows. Moreover, although the context marker search terms need to be set only once, in the eighth embodiment they are set again every time a search condition is given, which lengthens the time needed to create and set the search information 202.

[0131] This embodiment comprises the character string matching circuit 200, the context marker character string matching circuit 200a, and the complex condition determination circuit 300h. The complex condition determination circuit 300h comprises a microcomputer MPUa 301, a microcomputer MPUb 302a, and a microcomputer MPUc 303, together with buffers 350, 360, 370, and 380 that pass data between them. The context marker character string matching circuit 200a has the same configuration as the character string matching circuit 200. Context markers are set as search terms in the context marker character string matching circuit 200a, and search terms other than context markers are set in the character string matching circuit 200. The search control means 101 sets the search information 202 in the character string matching circuit 200 every time a search condition is given, but sets up the context marker character string matching circuit 200a only once, when the search device is started.

[0132] The processing procedure of this embodiment is described using a concrete example. First, the operation of the character string matching circuit 200 and the context marker character string matching circuit 200a is described, using as an example the expression shown in (7-1):

"Q = ((文書 [4C] 理解) [S] システム) [AND] (文書 [S] 検索)"

Each complex condition determination program is set with the conditional expression that the search control means 101 has parsed and separated into the individual conditions. In this example, the search control means 101 sets four search terms in the character string matching circuit 200, "T1: 文書" (document), "T2: 理解" (understanding), "T3: 検索" (search), and "T4: システム" (system), and sets "S1: 。" (the Japanese full stop) as the search term in the context marker character string matching circuit 200a.

[0133] Suppose now that the document shown in (7-2), "・・・。文書理解を用いた検索システムである。・・・・" (roughly, "... It is a search system using document understanding. ..."), is input. The character string matching circuit 200 then outputs the following collation information (13-1) to (13-5) to the buffer 350 as collation information 205:

(D1, 0, 0) (13-1)
(T1, 31, 32) (13-2)
(T2, 33, 34) (13-3)
(T3, 39, 40) (13-4)
(T4, 41, 44) (13-5)

The context marker character string matching circuit 200a outputs the following context marker collation information (12-1) to (12-3) to the buffer 380 as collation information 205a:

(D1, 0, 0) (12-1)
(S1, 30, 30) (12-2)
(S1, 48, 48) (12-3)

The collation information in the buffers 360 and 380 is then processed, as in the eighth embodiment, by the neighborhood condition determination program 310, the sort-merge program 730, the context condition determination program 320, and the logical condition determination program 330. By providing the character string matching circuit 200a dedicated to context marker detection in this way, context marker collation information, which is not needed for neighborhood condition determination, can bypass the neighborhood condition determination program 310. In addition, less time is needed to create and set the search information 202 for context markers than in the eighth embodiment, so a document search device faster than that of the first embodiment can be realized.
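The collation information in this example can be reproduced with a short Python sketch. It is only an illustration of the record format `(identifier, start, end)`: the function `match_terms` and its use of `str.find` are stand-ins for the hardware matching circuits, and the value `OFFSET = 29` is an assumption (the number of characters taken to precede the quoted excerpt, chosen so that the first full stop falls at position 30 as in (12-2)).

```python
def match_terms(text, terms, doc_id, offset):
    """Return collation records for one document, with 1-based character positions."""
    records = [(doc_id, 0, 0)]  # document identification information comes first
    hits = []
    for ident, term in terms.items():
        pos = text.find(term)
        while pos != -1:
            start = offset + pos + 1
            hits.append((ident, start, start + len(term) - 1))
            pos = text.find(term, pos + 1)
    hits.sort(key=lambda record: record[2])  # ascending end position
    return records + hits


EXCERPT = "。文書理解を用いた検索システムである。"
OFFSET = 29  # assumption: 29 characters precede the excerpt in the document

SEARCH_TERMS = {"T1": "文書", "T2": "理解", "T3": "検索", "T4": "システム"}
CONTEXT_MARKERS = {"S1": "。"}

to_buffer_350 = match_terms(EXCERPT, SEARCH_TERMS, "D1", OFFSET)
to_buffer_380 = match_terms(EXCERPT, CONTEXT_MARKERS, "D1", OFFSET)
```

Keeping the two term tables separate mirrors the ninth embodiment: `CONTEXT_MARKERS` would be loaded once at start-up, while `SEARCH_TERMS` changes with each search condition.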

[0134] Next, a tenth embodiment of the present invention will be described with reference to FIG. 34. In the complex condition determination circuit 300 of the first embodiment, the buffers 360 and 370 used to pass data among the microcomputer MPUa 301, the microcomputer MPUb 302, and the microcomputer MPUc 303 use a special type of memory, first-in first-out (FIFO) memory, so the cost per unit of memory capacity is high. This embodiment aims to realize a low-cost complex condition determination circuit 300i by using general-purpose memory instead.

[0135] This embodiment comprises the microcomputer MPUa 301, the microcomputer MPUb 302, the microcomputer MPUc 303, the buffer 350, a bus 630, and a shared memory 620. The microcomputer MPUa 301 executes the neighborhood condition determination program 310, the microcomputer MPUb 302 executes the context condition determination program 320, and the microcomputer MPUc 303 executes the logical condition determination program 330.

[0136] The shared memory 620 is used to pass data among the neighborhood condition determination program 310, the context condition determination program 320, and the logical condition determination program 330. Specifically, a buffer 360a in the shared memory 620 is used to pass data between the neighborhood condition determination program 310 and the context condition determination program 320, and a buffer 370a is used to pass data between the context condition determination program 320 and the logical condition determination program 330. By having the neighborhood condition determination program 310, the context condition determination program 320, and the logical condition determination program 330 use the buffers 360a and 370a as first-in first-out (FIFO) memories under program control, the same functions as those of the buffers 360 and 370, respectively, can be obtained.
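One common way to obtain FIFO behavior from ordinary memory under program control is a ring buffer with read and write indices. The sketch below is an assumption about how such a software FIFO could look, not the scheme the source describes in detail: the class name `SoftwareFifo`, its fixed capacity, and its methods are hypothetical, and bus arbitration between the processors sharing the memory is not modeled.

```python
class SoftwareFifo:
    """FIFO built on a fixed-size array, standing in for a region of shared memory 620."""

    def __init__(self, capacity):
        self._mem = [None] * capacity
        self._head = 0   # index of the next record to read
        self._tail = 0   # index of the next free slot to write
        self._count = 0  # number of records currently stored

    def put(self, record):
        """Store a record; return False if the buffer is full and the writer must wait."""
        if self._count == len(self._mem):
            return False
        self._mem[self._tail] = record
        self._tail = (self._tail + 1) % len(self._mem)
        self._count += 1
        return True

    def get(self):
        """Return the oldest record, or None if the buffer is empty."""
        if self._count == 0:
            return None
        record = self._mem[self._head]
        self._head = (self._head + 1) % len(self._mem)
        self._count -= 1
        return record
```

In this picture, buffer 360a and buffer 370a would each be one such structure placed in the shared memory, with the producing program calling `put` and the consuming program calling `get`.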

[0137] The operation of the complex condition determination processing in this embodiment is as follows. The matching result of the character string matching circuit 200 is sent to the buffer 350 as collation information 205. The collation information stored in the buffer 350 is processed by the neighborhood condition determination program 310, and the determination result is stored in the buffer 360a in the shared memory 620. When collation information has been stored in the buffer 360a, the context condition determination program 320 is executed, and its determination result is stored in the buffer 370a in the shared memory 620. When collation information has been stored in the buffer 370a, the logical condition determination program 330 is executed, and its determination result is sent out as the complex condition determination result 206. By realizing the complex condition determination circuit 300i in this way, low-cost ordinary memory can be used for the buffers 360a and 370a in place of FIFO memory, and a low-cost document search device can be realized.

[0138] Finally, an eleventh embodiment of the present invention will be described with reference to FIG. 35. In the complex condition determination circuit 300 of the first embodiment, the buffers 360 and 370 used to pass data among the microcomputer MPUa 301, the microcomputer MPUb 302, and the microcomputer MPUc 303 use a special type of memory, first-in first-out (FIFO) memory, so the cost per unit of memory capacity is high. This embodiment aims to realize a low-cost complex condition determination circuit 300j by using general-purpose memory in place of the FIFO memory. Furthermore, in the tenth embodiment the shared memory 620 is accessed by three microprocessors on a time-shared basis, so if each processor performs the same number of memory accesses, the number of accesses to the memory triples and memory access becomes the bottleneck for processing speed. This embodiment aims to solve that problem by providing, between the processors, a double buffer (two memory banks) built from general-purpose memory.

[0139] This embodiment comprises the microcomputer MPUa 301, the microcomputer MPUb 302, the microcomputer MPUc 303, the buffer 350, a buffer 360b, and a buffer 370b. The microcomputer MPUa 301 executes the neighborhood condition determination program 310, the microcomputer MPUb 302 executes the context condition determination program 320, and the microcomputer MPUc 303 executes the logical condition determination program 330.

[0140] The buffer 360b is used to pass data between the neighborhood condition determination program 310 and the context condition determination program 320, and the buffer 370b is used to pass data between the context condition determination program 320 and the logical condition determination program 330. The buffer 360b comprises a multiplexer 630, a selector 631, memories 622 and 623, buses 640 and 641, and a communication memory 624. The buffer 370b is configured in the same way as the buffer 360b. The memories 622 and 623, which form the double buffer, are used to pass collation information from the neighborhood condition determination program 310 to the context condition determination program 320. While the neighborhood condition determination program 310 is writing collation information to the memory 622, the context condition determination program 320 reads from the memory 623 the collation information that the neighborhood condition determination program 310 wrote earlier. Likewise, while the neighborhood condition determination program 310 is writing collation information to the memory 623, the context condition determination program 320 reads from the memory 622 the collation information that the neighborhood condition determination program 310 wrote earlier. The communication memory 624 is used to pass the control information for switching between the memory 622 and the memory 623.

[0141] When the switching signal 630a is set to 0, the multiplexer 630 selects port a, and the collation information output by the neighborhood condition determination program 310 is stored in the memory 622. When the switching signal 630a is set to 1, the multiplexer 630 selects port b, and the collation information output by the neighborhood condition determination program 310 is stored in the memory 623. When the switching signal 631a is set to 0, the selector 631 selects port x, and the context condition determination program 320 reads collation information from the memory 622. When the switching signal 631a is set to 1, the selector 631 selects port y, and the context condition determination program 320 reads collation information from the memory 623.

[0142] The method of controlling the double-buffer operation of the buffer 360b is described below. First, the neighborhood condition determination program 310 sends 0 to the multiplexer 630 as the switching signal 630a, so that the collation information output by the neighborhood condition determination program 310 is written to the memory 622. When the neighborhood condition determination program 310 has finished writing a predetermined amount of collation information to the memory 622, it notifies the context condition determination program 320, via the communication memory 624, that the memory 622 is available. On receiving this notification, the context condition determination program 320 sets 0 in the selector 631 as the switching signal 631a. That is, it selects the memory 622 as its input, reads the collation information from the memory 622, and performs context condition determination.

[0143] Next, the neighborhood condition determination program 310 sends 1 to the multiplexer 630 as the switching signal 630a, so that the collation information output by the neighborhood condition determination program 310 is written to the memory 623. When the neighborhood condition determination program 310 has finished writing a predetermined amount of collation information to the memory 623, it notifies the context condition determination program 320, via the communication memory 624, that the memory 623 is available. On receiving this notification, the context condition determination program 320 sets 1 in the selector 631 as the switching signal 631a. That is, it selects the memory 623 as its input, reads the collation information from the memory 623, and performs context condition determination.

[0144] Thereafter, the neighborhood condition determination program 310 again sends 0 to the multiplexer 630 as the switching signal 630a. At this point, before writing collation information to the memory 622, the neighborhood condition determination program 310 must wait until the context condition determination program 320 reports that it has finished reading the memory 622. For this purpose, when the context condition determination program 320 finishes reading the memory 622, it reports this via the communication memory 624. By controlling the switching between the memory 622 and the memory 623 in this way, a double-buffer scheme is realized in which the neighborhood condition determination program 310 and the context condition determination program 320 never access the same memory at the same time. A buffer 370b of the same kind as the buffer 360b can be used to pass data between the context condition determination program 320 and the logical condition determination program 330. By configuring the complex condition determination circuit 300j with the buffers 360b and 370b, which use low-cost ordinary memory in place of FIFO memory, a low-cost and high-speed document search device can be realized.
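The handshake of the double buffer can be modeled in a few lines of Python. This is a sequential sketch under stated assumptions, not the hardware: the class `DoubleBuffer`, its `ready` flags (standing in for the control information in the communication memory 624), and the rule that the reader simply alternates banks are choices made for this illustration. Bank 0 stands for the memory 622 and bank 1 for the memory 623.

```python
class DoubleBuffer:
    """Two ordinary memories plus ready flags, modeling buffer 360b."""

    def __init__(self):
        self.memory = {0: [], 1: []}       # bank 0: memory 622, bank 1: memory 623
        self.ready = {0: False, 1: False}  # flags held in communication memory 624
        self.write_select = 0              # switching signal 630a (multiplexer 630)
        self.read_select = 0               # switching signal 631a (selector 631)

    def write_batch(self, records):
        """Writer side (program 310). Return False if the writer must wait."""
        bank = self.write_select
        if self.ready[bank]:
            # The reader has not yet reported that it finished reading this bank.
            return False
        self.memory[bank] = list(records)
        self.ready[bank] = True            # notify the reader: bank is available
        self.write_select = 1 - bank       # next batch goes to the other bank
        return True

    def read_batch(self):
        """Reader side (program 320). Return None if no filled bank is available."""
        bank = self.read_select
        if not self.ready[bank]:
            return None
        records = self.memory[bank]
        self.memory[bank] = []
        self.ready[bank] = False           # report: reading of this bank is finished
        self.read_select = 1 - bank
        return records
```

Because a bank is either owned by the writer (flag clear) or by the reader (flag set), the two programs never touch the same memory at once, and each memory sees the accesses of only two processors rather than three.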

[0145]

[Effects of the Invention] As described above, the document search method and device provided by the present invention make it possible to determine the complex conditions, namely the neighborhood condition, the context condition, and the logical condition, easily and at high speed, and thus to provide a document search device that performs, at high speed, the fine-grained searches characteristic of full-text search.

[Brief explanation of drawings]

FIG. 1 is an explanatory diagram of a complex condition determination circuit using the present invention.

FIG. 2 is an explanatory diagram of a document search device.

FIG. 3 is an explanatory diagram of an example of complex condition determination.

FIG. 4 is an explanatory diagram of collation information.

FIG. 5 is an explanatory diagram of an example of complex condition determination.

FIG. 6 is an explanatory diagram of an example of complex condition determination.

FIG. 7 is an explanatory diagram of an example of complex condition determination.

FIG. 8 is an explanatory diagram of a document identifier detection circuit.

FIG. 9 is an explanatory diagram of a search term length table.

FIG. 10 is an explanatory diagram of a position information adding circuit.

FIG. 11 is an explanatory diagram of document identification information.

FIG. 12 is an explanatory diagram of collation term identification information.

FIG. 13 is an explanatory diagram of collation information.

FIG. 14 is an explanatory diagram of collation information.

FIG. 15 is an explanatory diagram of neighborhood condition determination processing.

FIG. 16a is one part of an explanatory diagram of neighborhood condition determination processing.

FIG. 16b is the other part of the explanatory diagram of neighborhood condition determination processing.

FIG. 17 is an explanatory diagram of neighborhood condition determination processing.

FIG. 18 is an explanatory diagram of context condition determination processing.

FIG. 19 is an explanatory diagram of context condition determination processing.

FIG. 20a is one part of an explanatory diagram of context condition determination processing.

FIG. 20b is the other part of the explanatory diagram of context condition determination processing.

FIG. 21a is one part of an explanatory diagram of context condition determination processing.

FIG. 21b is the other part of the explanatory diagram of context condition determination processing.

FIG. 22 is an explanatory diagram of context condition determination processing.

FIG. 23 is an explanatory diagram of logical condition determination processing.

FIG. 24a is one part of an explanatory diagram of logical condition determination processing.

FIG. 24b is the other part of the explanatory diagram of logical condition determination processing.

FIG. 25 is an explanatory diagram of logical condition determination processing.

FIG. 26 is an explanatory diagram of a complex condition determination circuit using the present invention.

FIG. 27 is an explanatory diagram of a complex condition determination circuit using the present invention.

FIG. 28 is an explanatory diagram of a complex condition determination circuit using the present invention.

FIG. 29 is an explanatory diagram of a complex condition determination circuit using the present invention.

FIG. 30 is an explanatory diagram of a complex condition determination circuit using the present invention.

FIG. 31 is an explanatory diagram of a complex condition determination circuit using the present invention.

FIG. 32 is an explanatory diagram of a complex condition determination circuit using the present invention.

FIG. 33 is an explanatory diagram of a complex condition determination circuit using the present invention.

FIG. 34 is an explanatory diagram of a buffer between processors.

FIG. 35 is an explanatory diagram of a buffer between processors.

[Explanation of symbols]

101 Search control means
104 Storage device control means
105 Character string storage means
200 Character string matching means
300 Complex condition determination means
210 Document identifier detection circuit
220 Term comparator
230 Character counter
240 Position information adding circuit
250 Search term length table
310 Neighborhood condition determination program
320 Context condition determination program
330 Logical condition determination program
350 Buffer
360 Buffer
370 Buffer
380 Buffer

Claims

[Claims]

[Claim 1] A document search method for searching a document database stored as character codes for documents that contain search terms specified in a search condition expression, the method comprising: a character string matching step that, when a specified search term is matched in a document, outputs, as collation information, document identification information including a document identifier, the identifier of the matched search term, and the matching position in the document; and a complex condition determination step that determines a search condition concerning the positional relationship between the search terms specified in the search condition expression, creates collation information for the determination result indicating that the search condition is satisfied, and outputs it as the search result.

[Claim 2] The document search method according to claim 1, wherein the complex condition determination step comprises a neighborhood condition determination step of determining a proximity distance condition between the search terms specified in the search condition expression.

[Claim 3] The document retrieval method according to claim 1, wherein the compound condition determination step includes a context condition determination step of determining a co-occurrence condition of the search terms specified in the search condition expression within the same phrase, the same sentence, or the same paragraph.
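
A co-occurrence check of this kind can be sketched as follows, assuming that context boundaries (for example sentence-ending punctuation) have already been located and are available as a sorted list of character positions. The function name `same_context` and this representation are assumptions made for illustration only.

```python
from bisect import bisect_left


def same_context(pos_a: int, pos_b: int, delimiter_positions: list[int]) -> bool:
    """Return True if no context delimiter lies between the two positions.

    delimiter_positions must be sorted in ascending order.
    """
    lo, hi = sorted((pos_a, pos_b))
    # Index of the first delimiter located at or after the earlier position.
    i = bisect_left(delimiter_positions, lo)
    # Same context if there is no such delimiter, or it lies beyond the later position.
    return i == len(delimiter_positions) or delimiter_positions[i] > hi
```

Choosing a different delimiter set (phrase, sentence, or paragraph markers) changes the unit of context without changing the check itself.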

[Claim 4] The document retrieval method according to claim 1, wherein the compound condition determination step includes a logical condition determination step of determining a logical condition between the search terms specified in the search condition expression.

[Claim 5] The document retrieval method according to claim 1, wherein the compound condition determination step comprises: a neighborhood condition determination step of determining a proximity distance condition between the search terms specified in the search condition expression; a context condition determination step of determining a co-occurrence condition of the specified search terms within the same phrase, the same sentence, or the same paragraph; and a logical condition determination step of determining a logical condition between the search terms specified in the search condition expression.

[Claim 6] The document retrieval method according to claim 2, wherein, when the document data is Japanese text, the compound condition determination step includes a neighborhood condition determination step of determining a proximity distance condition expressed as the number of characters between the search terms specified in the search condition expression.

[Claim 7] The document retrieval method according to claim 2, wherein, when the document data is English text, the compound condition determination step includes a word spacing condition determination step of determining a proximity distance condition expressed as the number of words between the search terms specified in the search condition expression.
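
The two distance measures in claims 6 and 7 can be contrasted with a short sketch. The helper names are hypothetical, and counting only the characters or whitespace-delimited words strictly between the two matches is an assumption; the claims do not fix the exact counting convention.

```python
def char_gap(first_end: int, second_start: int) -> int:
    """Characters strictly between two matches (character-count distance)."""
    return second_start - first_end - 1


def word_gap(text: str, first_end: int, second_start: int) -> int:
    """Whitespace-delimited words strictly between two matches (word-count distance)."""
    return len(text[first_end + 1:second_start].split())


def within(gap: int, max_gap: int) -> bool:
    """Proximity condition: the second match follows the first by at most max_gap units."""
    return 0 <= gap <= max_gap
```

Here `first_end` is the last character position of the earlier match and `second_start` is the first character position of the later match, so a negative gap means the matches overlap or are out of order.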

[Claim 8] The document retrieval method according to claim 5, wherein:
in the character string matching step, when a specified search term is matched in a document, the identifier of the document, the identifier of the matched search term, and the first and last character positions of the matched search term in the document are output as collation information, and, when a context condition is specified and a character string identifying the context is matched, the identifier of the document, the identifier of the matched context identification character string, and the start and end positions of the matched context identification character string in the document are output as collation information;
in the neighborhood condition determination step, the proximity distance condition, expressed as the number of characters between the search terms specified in the search condition expression, is determined based on the collation information output in the character string matching step, and the first character position of the search term located at the front and the last character position of the search term located at the rear that satisfy the condition are added, as collation information of the determination result, to the collation information output in the character string matching step and output;
in the context condition determination step, the co-occurrence condition of the search terms specified in the search condition expression within the same phrase, the same sentence, or the same paragraph is determined based on the collation information output in the neighborhood condition determination step, and the first character position of the context identification character string located at the front and the last character position of the context identification character string located at the rear that satisfy the condition are added, as collation information, to the collation information output in the neighborhood condition determination step and output; and
in the logical condition determination step, the logical condition between the search terms specified in the search condition expression is determined based on the collation information output in the context condition determination step, and collation information for each document that satisfies the condition is output as final search result information.
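
The chaining of the three determination steps, each one appending its result records to the collation information it received, can be sketched roughly as below. This is a simplified illustration under several assumptions: all function and field names are hypothetical, the neighborhood step only considers term `a` appearing before term `b`, the context step records the span of the qualifying result rather than the positions of the surrounding context identification strings, and the logical step handles only a conjunction.

```python
from collections import defaultdict
from typing import NamedTuple


class Match(NamedTuple):
    doc_id: str
    term_id: str  # search term identifier, or an identifier given to a stage result
    start: int    # first character position
    end: int      # last character position (inclusive)


def neighborhood(matches, a, b, max_chars, result_id):
    """Pass the input through and append a record for each a-then-b pair within max_chars."""
    out = list(matches)
    for m in matches:
        if m.term_id != a:
            continue
        for n in matches:
            if (n.term_id == b and n.doc_id == m.doc_id
                    and 0 <= n.start - m.end - 1 <= max_chars):
                out.append(Match(m.doc_id, result_id, m.start, n.end))
    return out


def context(matches, target, delimiter, result_id):
    """Pass the input through and append a record for each target span with no delimiter inside."""
    out = list(matches)
    for m in matches:
        if m.term_id != target:
            continue
        crossed = any(d.term_id == delimiter and d.doc_id == m.doc_id
                      and m.start <= d.start <= m.end for d in matches)
        if not crossed:
            out.append(Match(m.doc_id, result_id, m.start, m.end))
    return out


def logical_and(matches, ids):
    """Return the documents in which every identifier in ids occurs at least once."""
    seen = defaultdict(set)
    for m in matches:
        seen[m.doc_id].add(m.term_id)
    return sorted(d for d, s in seen.items() if set(ids) <= s)
```

Because each stage forwards the records it received, the logical step can still refer to plain search terms (such as a third term combined by AND) alongside the results of the earlier positional checks.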

[Claim 9] A document retrieval device for searching a document database stored as character codes for documents that contain a search term specified in a search condition expression, the device comprising: character string matching means that, when a specified search term is matched in a document, outputs, as collation information, document identification information including the document identifier, the identifier of the matched search term, and the matching position in the document; and compound condition determining means that determines a search condition concerning the positional relationship between the search terms specified in the search condition expression, creates collation information for the determination result indicating that the search condition is satisfied, and outputs it as the search result.

[Claim 10] The document retrieval device according to claim 9, wherein the compound condition determining means includes neighborhood condition determining means for determining a proximity distance condition between the search terms specified in the search condition expression.

[Claim 11] The document retrieval device according to claim 9, wherein the compound condition determining means includes context condition determining means for determining whether the search terms specified in the search condition expression co-occur in the same phrase, the same sentence, or the same paragraph.

[Claim 12] The document retrieval device according to claim 9, wherein the compound condition determining means includes logical condition determining means for determining a logical condition between the search terms specified in the search condition expression.

[Claim 13] The document retrieval device according to claim 9, wherein the compound condition determining means comprises: neighborhood condition determining means for determining a proximity distance condition expressed as the number of characters between the search terms specified in the search condition expression; context condition determining means for determining a co-occurrence condition of the search terms specified in the search condition expression within the same phrase, the same sentence, or the same paragraph; and logical condition determining means for determining a logical condition between the search terms specified in the search condition expression.

[Claim 14] The document retrieval device according to claim 10, wherein, when the document data is Japanese text, the compound condition determining means includes neighborhood condition determining means for determining a proximity distance condition expressed as the number of characters between the search terms specified in the search condition expression.

[Claim 15] The document retrieval device according to claim 10, wherein, when the document data is English text, the compound condition determining means includes word spacing condition determining means for determining a proximity distance condition expressed as the number of words between the search terms specified in the search condition expression.

[Claim 16] The document retrieval device according to claim 13, wherein:
the character string matching means is configured to output, as collation information, when a specified search term is matched in a document, the identifier of the document, the identifier of the matched search term, and the first and last character positions of the matched search term in the document, and, when a context condition is specified and a character string identifying the context is matched, the identifier of the document, the identifier of the matched context identification character string, and the start and end positions of the matched context identification character string in the document;
the neighborhood condition determining means constituting the compound condition determining means is configured to determine, based on the collation information output by the character string matching means, the proximity distance condition expressed as the number of characters between the search terms specified in the search condition expression, and to output, as collation information of the determination result, the first character position of the search term located at the front and the last character position of the search term located at the rear that satisfy the condition, added to the collation information output by the character string matching means;
the context condition determining means constituting the compound condition determining means is configured to determine, based on the collation information output by the neighborhood condition determining means, whether the search terms specified in the search condition expression co-occur in the same phrase, the same sentence, or the same paragraph, and to output, as collation information, the first character position of the context identification character string located at the front and the last character position of the context identification character string located at the rear that satisfy the condition, added to the collation information output by the neighborhood condition determining means; and
the logical condition determining means constituting the compound condition determining means is configured to determine, based on the collation information output by the context condition determining means, the logical condition between the search terms specified in the search condition expression, and to output collation information for each document that satisfies the condition as final search result information.