JPH06162083A

JPH06162083A - Character-string retrieving device

Info

Publication number: JPH06162083A
Application number: JP4306748A
Authority: JP
Inventors: Katsumi Tada; 勝己多田; Hisamitsu Kawaguchi; 川口　　久光; Kanji Kato; 寛次加藤; Masatsugu Shinozaki; 雅継篠崎
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1992-11-17
Filing date: 1992-11-17
Publication date: 1994-06-10

Abstract

PURPOSE: To collate character strings at high throughput by operating a plurality of filtering circuits in parallel as preprocessing for character string collation. CONSTITUTION: A parallel filtering means 3000 comprises a distributing means 3100 that fetches several characters of the text at once and sends them out one character at a time; a plurality of filtering means 3200a and 3200b that decide in parallel whether each character code is contained in a designated search term; and a collecting means 3300 that aligns the character codes output by the filtering means and sends them to a character string collating means 102.

Description

Detailed Description of the Invention

[0001] [Field of Industrial Application] The present invention relates to a method of collating search terms in an information processing system, in particular an information retrieval system, and more specifically to a character string search device that determines, in a single pass, whether a plurality of substrings designated as search terms occur in a text character string.

[0002] [Prior Art] In the field of information processing systems, one important operation is to find, among documents consisting of collections of character string data (hereinafter, texts), all documents that contain the words a searcher is looking for, that is, particular character strings (hereinafter, search terms).

[0003] Several character string search devices have been proposed for realizing such a retrieval system. The configuration of a representative one (L. A. Hollaar, "Text Retrieval Computers," COMPUTER, March 1979) is shown in FIG. 2 and described below.

[0004] As shown in FIG. 2, in the character string search device 1, the search control means 101 controls the device as a whole and communicates with the host computer. That is, it accepts a search request 201 sent from the host computer, analyzes it, and sends the result as search control information 202 to the character string collating means 102 and the compound condition determination means 103. The search control means 101 also controls the storage device control means 104 so that character string data 204 stored in the character string storage means 105 is read out to the character string collating means 102.

[0005] The character string collating means 102 checks whether the input character string data 204 contains a character string matching the search request 201, that is, a search term, and if so outputs information 205 identifying that character string to the compound condition determination means 103. The compound condition determination means 103 checks whether the character string identification information 205 satisfies the logical conditions, composed of AND and OR, specified in the search request 201. If the specified compound condition is satisfied, the identification information of the matching document and the text data of its contents are returned to the host computer as the search result 206.

[0006] As the collation method of the character string collating means 102, which is the core of the character string search device 1, a method using a finite automaton that searches for a plurality of character strings in a single scan of the text is known. Hardware for executing this finite automaton at high speed is disclosed in Japanese Unexamined Patent Publication No. H3-95672.

[0007] In this conventional example, a mark called a token indicates which states of the automaton are to be collated against the input character. That is, when one character is input from the input text, collation is performed for each state on which a token is placed. A new token is always generated in the initial state whenever an input character code arrives. If collation shows that there is no state to transition to, the token is eliminated. The destination state is determined by looking up the state transition table, using the state number of the state holding the token together with the input character code as the address. Consequently, when several tokens exist in the automaton, the state transition table is looked up several times for a single input character, and the collation throughput drops to a fraction of its nominal value.

[0008] The collation operation when several tokens exist during the processing of one character is described using the automaton of FIG. 3. This automaton collates, in a single pass, "インタフェース" ("interface") and its spelling variants "インターフェース", "インターフェ−ス", "インターフェイス", "インタ−フェース", "インタ−フェ−ス", "インタ−フェイス", "インタフェ−ス", and "インタフェイス".

[0009] When "インタフェイス" is input as the text, the tokens move as shown in FIG. 4. First, when "イ" is input, token 1 is newly generated in state 0, the initial state. Because a transition on "イ" is defined for state 0 (see FIG. 3), the collation succeeds and token 1 moves to state 1.

[0010] When the next character "ン" is input, token 2 is newly generated in state 0, but no transition on "ン" is defined there, so the collation fails and token 2 is eliminated. Token 1, which had moved to state 1, is collated against "ン" in state 1; this collation succeeds, and token 1 moves to state 2. In this case, therefore, two collation operations are performed for one character.

[0011] In the same way, as "タ", "フ", and "ェ" are input, token 1 moves through state 3 and state 5 to state 6. Meanwhile, tokens 3 to 5 are likewise generated in the initial state, but they are eliminated because their collation fails.

[0012] The subsequent inputs "イ" and "ス" are processed in the same way.

[0013] As a result, in this example the 7-character text input causes 14 collation operations.
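
The count of 14 can be reproduced with a small Python sketch. It is only an illustration under stated assumptions: a trie built from the nine spellings stands in for the automaton of FIG. 3 (whose exact state numbering is not reproduced here), and `build_trie` and `count_lookups` are hypothetical names, not part of the cited hardware.

```python
def build_trie(terms):
    """Build a transition table {(state, char): next_state}; state 0 is the initial state."""
    table, next_free = {}, 1
    for term in terms:
        state = 0
        for ch in term:
            if (state, ch) not in table:
                table[(state, ch)] = next_free
                next_free += 1
            state = table[(state, ch)]
    return table


def count_lookups(table, text):
    """Count transition table lookups: one per token per input character."""
    live = []      # states that hold a token, other than the initial state
    lookups = 0
    for ch in text:
        candidates = [0] + live          # a fresh token always starts in state 0
        lookups += len(candidates)
        live = []
        for state in candidates:
            nxt = table.get((state, ch), 0)   # 0 means "no transition": token is dropped
            if nxt != 0 and nxt not in live:
                live.append(nxt)
    return lookups


# The nine spellings listed in [0008].
VARIANTS = [
    "インタフェース", "インターフェース", "インターフェ−ス", "インターフェイス",
    "インタ−フェース", "インタ−フェ−ス", "インタ−フェイス", "インタフェ−ス",
    "インタフェイス",
]

print(count_lookups(build_trie(VARIANTS), "インタフェイス"))  # 14
```

The per-character counts are 1, 2, 2, 2, 2, 2, and 3: the second "イ" starts a new token that survives one step, so the final "ス" is collated in three states.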

[0014] FIG. 5 shows the character string collating means 102 of the above conventional example, which carries out the collation processing described above. It consists of registers 211, 250, and 251, a state transition table 220, a collation result table 260, a selector 261, a gate 262, a multiplexer 263, buffers 280 and 281, and a comparator 252.

[0015] The collation operation of the character string collating means 102 is outlined below.

[0016] The input text 204 is stored in the register 211 one character at a time. The character code 302 output from the register 211 is input to the state transition table 220 as address information. The state transition table 220 is looked up using the current state number 305 and the character code 302 as the address; the destination state number 303 (hereinafter, the next state number) is read out and stored in the register 250.

[0017] In this conventional example, the next state number 303 serves as the token identifier. It is stored, via the gate 262 and the multiplexer 263, in whichever of the buffers 280 and 281 is currently selected, as information indicating where the token is. A next state number 303 of 0 (zero), that is, the initial state number, output from the state transition table 220 means that the token has no destination. In that case the token must be eliminated. This control is performed by the comparator 252 and the gate 262.

[0018] That is, the comparator 252 decides whether to eliminate the token, and the gate 262 carries out that decision.

[0019] Specifically, when the next state number 303 is the initial state number 0 (zero), the comparator 252 finds it equal to the state number 0 (the initial state number) held in the register 251, so the gate 262 is closed and the next state number 303 is discarded at the gate 262 without being sent to the multiplexer 263. Conversely, when the next state number 303 is not the initial state number 0 (zero), it is passed from the gate 262 to the multiplexer 263 and saved as a token.

[0020] The buffers 280 and 281 each hold the initial state number at their head address as an initial value, and the next state numbers 303 arriving via the multiplexer 263 are stored from the address following the initial state. This ensures that a token is always present in the initial state.

[0021] The next state number 303 stored in buffer 280 or buffer 281 is read out as the current state number 305 when the next character code is collated.

[0022] The selector 261 selects whichever of the buffers 280 and 281 holds the tokens, that is, the next state numbers 303, and the current state numbers 305 are read out from it one after another. When all have been read, a read end signal 307 is generated. The multiplexer 263 and the selector 261 operate in synchronization: while the multiplexer 263 selects buffer 280, the selector 261 selects buffer 281, and while the multiplexer 263 selects buffer 281, the selector 261 selects buffer 280. In other words, the tokens to be moved to destination states are stored as next state numbers 303 in a buffer different from the one holding the tokens of the source states (stored there as current state numbers).

[0023] The buffers 280 and 281 are switched when reading of the buffer selected by the selector 261 finishes, that is, when the read end signal 307 is generated. The register 211 normally takes in a character code from the text in synchronization with the register 250, but it holds the character code until the read end signal 307 is generated, so the next input waits until all tokens, that is, all current state numbers, have been read from the buffer. The collation result table 260 stores, for each state that ends a search term (hereinafter, a terminal state), a predetermined search term number identifying that search term, and stores 0 (zero) for all other states. Thus the search term number output from the collation result table 260 for a given state number is meaningful as the collation result 205 only when it is nonzero.

[0024] Character string collation is realized by repeating the above series of operations for every character of the input text.
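
The datapath of FIG. 5 can be approximated in software as follows. This is a hedged sketch, not the circuit itself: the two Python lists play the role of buffers 280 and 281, the `nxt != 0` test plays the role of the comparator 252 and gate 262, and the point at which the collation result table is consulted is an assumption. The function name `collate` and the tiny tables for the single term "BUS" are hypothetical.

```python
def collate(transitions, results, text):
    """Return (position, term_number) hits using two alternating token buffers."""
    buffers = [[0], [0]]   # each buffer keeps the initial state 0 at its head
    read_sel = 0           # buffer being read; the other one is being written
    hits = []
    for pos, ch in enumerate(text):
        write_sel = 1 - read_sel
        buffers[write_sel] = [0]                  # head entry: initial state
        for state in buffers[read_sel]:           # one table lookup per token
            nxt = transitions.get((state, ch), 0)
            if nxt != 0:                          # next state 0 means the token is dropped
                buffers[write_sel].append(nxt)
                term = results.get(nxt, 0)        # nonzero only for terminal states
                if term != 0:
                    hits.append((pos, term))
        read_sel = write_sel                      # swap buffers once reading has finished
    return hits


# Assumed tables for the single search term "BUS", registered as term number 1.
TRANSITIONS = {(0, "B"): 1, (1, "U"): 2, (2, "S"): 3}
RESULTS = {3: 1}

print(collate(TRANSITIONS, RESULTS, "SSSBUS"))  # [(5, 1)]
```

Holding the input character until the read buffer is exhausted corresponds to the inner loop finishing before the outer loop advances.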

[0025] Thus, in this cited example, the state transition table is looked up once for each token collated. In the collation example of FIG. 4, the 7-character text input causes 14 token collations and hence 14 lookups of the state transition table, an average of 2 lookups per character. Compared with the case where one collation per character suffices, the collation throughput falls to about half.

[0026] Japanese Patent Application No. H4-63067 proposes a method of preventing the drop in collation throughput of the character string search device caused by this increased load on the character string collating means 102.

[0027] FIG. 6 shows the configuration of the character string search device of that conventional example.

[0028] In this example, a filtering means 300 is placed between the character string storage means 105 and the character string collating means 102. It discards, from the text read out of the character string storage means 105, the character codes that do not occur in any search term and sends only the character codes that do occur in a search term to the character string collating means 102. By sparing the character string collating means 102 useless collation operations, it raises the effective collation throughput. The aim is an inexpensive character string collating device whose search speed does not fall even when the character string collating means 102 uses slow memory.

[0029] FIG. 7 shows an embodiment in which a single filtering circuit is used as the filtering means of this conventional example.

[0030] The single filtering circuit 300 consists of a character code register 320, a single filtering table 330, and a single character output circuit 340 comprising an output gate 360.

[0031] The single filtering table 330 is a one-dimensional memory addressed by character code. It stores "1" as a match flag at the address of each character code contained in a search term.

[0032] For example, when "BUS" is given as the search term, the single filtering table 330 is set up as shown in FIG. 8: a 1 is set in the slots corresponding to the character codes of "B", "U", and "S", the characters making up the search term "BUS".

[0033] The filtering operation begins by taking character codes from the input text 204 into the character code register 320 one character at a time.

[0034] The single filtering table 330 is accessed using the character code 310 output from the character code register 320 as the address, and a match signal 350 is read out. For a character code that occurs in a search term, the match signal 350 is 1; the output gate 360 then opens, and the character code 310 is output to the output line 207 and sent to the character string collating means 102 in the next stage. For a character code that does not occur in any search term, the match signal 350 is 0; the output gate 360 is then closed, and the character code 310 is neither output to the output line 207 nor sent to the character string collating means 102.

[0035] As an example, the operation of the single filtering circuit when "BUS" is given as the search term and "HIGH-SPEED SCSI BUS CONTROLLER" is input as the text is described using the timing chart of FIG. 9.

[0036] First, "H" is taken from the input text 204 into the character code register 320 and output as the character code 310. The single filtering table 330 is accessed with "H", and 0 is output as the match signal 350. Because the match signal 350 is 0, the character code "H" is not output from the output gate 360.

[0037] The following "I", "G", "H", and "-" are likewise taken into the character code register 320, but because the match signal 350 output from the single filtering table 330 is 0, they are not output from the output gate 360.

[0038] When the next character, "S", is taken into the character code register 320, the match signal 350 output from the single filtering table 330 becomes 1, and the character code 310 "S" is output from the output gate 360 to the output line 207.

[0039] For the following "P", "E", "E", "D", and " " (space), the match signal 350 output from the single filtering table 330 is 0, so nothing is output to the output line 207.

[0040] Filtering continues in the same way, and "S", "S", "B", "U", and "S" are output from the output gate 360 to the output line 207.

[0041] In this way, the six characters "SSSBUS" that occur in the search term are extracted from the 30-character input text "HIGH-SPEED SCSI BUS CONTROLLER" and output as the output text 207. That is, 24 of the 30 characters of the input text 204, or four fifths, are discarded as unnecessary, so only about one fifth is sent to the character string collating means 102, and the load on the character string collating means 102 can be cut to about one fifth.
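
The table lookup and gating described above can be sketched in Python. The sketch assumes 8-bit character codes (a 256-entry table), which suits this ASCII example but not Japanese text; `build_filter_table` and `filter_text` are hypothetical names used only for illustration.

```python
def build_filter_table(terms, size=256):
    """One-dimensional table addressed by character code; 1 marks codes used in a search term."""
    table = [0] * size
    for term in terms:
        for ch in term:
            table[ord(ch)] = 1
    return table


def filter_text(table, text):
    """Pass a character through only when its match flag is 1 (the output gate is open)."""
    out = []
    for ch in text:
        match = table[ord(ch)]   # one table access per input character
        if match == 1:
            out.append(ch)
    return "".join(out)


table = build_filter_table(["BUS"])
text = "HIGH-SPEED SCSI BUS CONTROLLER"
print(len(text), filter_text(table, text))  # 30 SSSBUS
```

Because the filter looks only at individual characters, it also passes characters such as the "S" of "SPEED" that are not part of an actual match; separating those cases is left to the collating stage.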

[0042] [Problems to be Solved by the Invention] In the above Japanese Patent Application No. H4-63067, a filtering means is placed between the character string storage means and the character string collating means. It discards, from the text read out of the character string storage means, the character codes that do not occur in any search term and outputs only the character codes that do occur in a search term to the character string collating means, thereby reducing the load on the character string collating means and raising the effective collation throughput. As a result, the character string collating means, which would otherwise have to collate the whole 30-character input text "HIGH-SPEED SCSI BUS CONTROLLER", need only collate the six characters "SSSBUS", and its load is cut to about one fifth.

[0043] For example, suppose the filtering circuit takes 50 nanoseconds per operation and the character string collating means takes 100 nanoseconds per operation. The processing times without and with the filtering means are then as follows.

[0044] First, without the filtering means, all 30 characters of the text "HIGH-SPEED SCSI BUS CONTROLLER" must be collated, so the character string collating means requires 100 nanoseconds per operation × 30 operations = 3,000 nanoseconds.

[0045] Next, with the filtering means, the 30-character text "HIGH-SPEED SCSI BUS CONTROLLER" is reduced by the filtering means to the six characters "SSSBUS". Filtering requires one lookup of the filtering table per input character code, so filtering the 30-character text takes 30 operations. The filtering means therefore requires 50 nanoseconds per operation × 30 operations = 1,500 nanoseconds.

[0046] As shown in FIG. 10, collating the six characters "SSSBUS" requires eight collation operations, so the character string collating means requires 100 nanoseconds per operation × 8 operations = 800 nanoseconds.

[0047] Thus, by placing a filtering means between the character string storage means and the character string collating means, a character string search that used to take 3,000 nanoseconds can be completed in 1,500 nanoseconds, doubling the search throughput.

[0048] However, the character string search device of this conventional example has the following problem.

[0049] Although the character string collating means is actually busy for only 800 nanoseconds, the filtering takes 1,500 nanoseconds. In other words, the filtering introduced to raise throughput itself becomes the bottleneck.

[0050] Because the filtering speed is the bottleneck in the character string search device of this conventional example, the character string collating means sits idle, and the collation throughput cannot be raised to the limit of what the character string collating means can deliver.

[0051] An object of the present invention is to operate a plurality of filtering means in parallel, thereby further raising the filtering throughput and removing the filtering bottleneck, and so to draw out the full performance of the character string collating means, that is, to raise the search throughput of the character string search device as a whole to its limit.

[0052] [Means for Solving the Problems] These problems are solved by providing the parallel filtering means described below between the character string storage means and the character string collating means in a character string search device that searches, in a single pass, for whether a plurality of designated search terms occur in a text composed of coded characters.

[0053] That is, the parallel filtering means consists of: a distributing means that takes in a plurality of characters at a time from the text read out of the character string storage means and sends them out one character at a time; a plurality of filtering means, arranged in parallel, each of which takes in one character code output from the distributing means and determines whether that character code occurs in a search term; and a collecting means that takes in the character codes output from the plurality of filtering means, aligns them, and sends them to the character string collating means.

[0054] [Operation] The principle of the present invention is described below with reference to FIG. 1. In the parallel filtering means 3000, the distributing means 3100 reads the text from the character string storage means 105 two characters at a time and sends one character each to the filtering means 3200a and 3200b. Each of the filtering means 3200a and 3200b determines whether its input character occurs in a search term and outputs the result to the collecting means 3300. Based on the results from the filtering means 3200a and 3200b, the collecting means 3300 aligns only the character codes that occur in a search term and outputs them to the character string collating means 102.

[0055] This effectively doubles the processing throughput of the filtering means and keeps it from becoming the bottleneck. As a result, the search throughput of the character string search device can be raised by a further factor of two.

[0056] The principle is explained below with a concrete example. Suppose "BUS" is specified as the search term and "HIGH-SPEED SCSI BUS CONTROLLER" is read from the character string storage means as the text.

[0057] In its first operation, the distributing means 3100 takes in the first and second characters of the text, "HI", and outputs the first character "H" to the filtering circuit 3200a and the second character "I" to the filtering circuit 3200b. The filtering circuits 3200a and 3200b attach a signal indicating that the input character codes "H" and "I" do not occur in the search term and send them to the collecting means 3300. The collecting means 3300 performs the filtering for both characters together on the basis of this information; that is, it deletes "H" and "I" and does not send them to the character string collating means 102. When the next two characters "GH" are input, they are likewise not sent to the character string collating means 102. When the following two characters "-S" are input to the distributing means 3100, "-" is output to the filtering circuit 3200a and "S" to the filtering circuit 3200b. The filtering circuit 3200a attaches a signal indicating that the input character code "-" does not occur in the search term and sends it to the collecting means 3300, while the filtering circuit 3200b attaches a signal indicating that "S" does occur in the search term and sends it to the collecting means 3300. The collecting means 3300 performs the filtering together on the basis of this information and sends "S" to the character string collating means 102.

[0058] When the next two characters "PE" are input, neither is a character code contained in the search term, so they are not sent to the character string collating means.

[0059] Similarly, when "ED", " S", "CS", "I ", "BU", "S ", "CO", "NT", "RO", "LL" and "ER" are subsequently input, the characters contained in the search term, namely "S", "S", "BU" and "S", are output.

[0060] In other words, as a result of this filtering process, the 30-character text "HIGH-SPEED SCSI BUS CONTROLLER" is reduced in 15 filtering operations to the character string "SSSBUS", which consists only of characters contained in the search term, and this string is sent to the character string collating means.
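As a rough software illustration of the worked example above, the following Python sketch models the distribute, filter and collect flow for two parallel lanes. The function name `parallel_filter` and the `lanes` parameter are hypothetical and introduced only for illustration; the patent describes hardware circuits, not this code.

```python
def parallel_filter(text, term, lanes=2):
    """Return (filtered_text, number_of_filtering_operations).

    Hypothetical software model: each loop iteration stands for one
    filtering operation that handles `lanes` characters at once.
    """
    allowed = set(term)
    kept = []
    operations = 0
    for i in range(0, len(text), lanes):
        group = text[i:i + lanes]                 # distributing: fetch `lanes` characters
        flags = [ch in allowed for ch in group]   # filtering: one character per lane
        # collecting: keep only flagged characters, preserving text order
        kept.extend(ch for ch, ok in zip(group, flags) if ok)
        operations += 1
    return "".join(kept), operations


result = parallel_filter("HIGH-SPEED SCSI BUS CONTROLLER", "BUS")
print(result)
```

With the example text and the search term "BUS", this model yields the same filtered string and operation count as the description, "SSSBUS" in 15 operations.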

[0061] Thus the parallel filtering means filters the 30-character text "HIGH-SPEED SCSI BUS CONTROLLER" down to the 6-character text "SSSBUS" in 15 filtering operations. The time required for the filtering process is therefore 50 ns per operation × 15 operations = 750 ns.

[0062] As described above, collating the six characters of "SSSBUS" requires eight character string collating operations, so the processing time of the character string collating means is 100 ns per operation × 8 operations = 800 ns.

[0063] The 30-character text "HIGH-SPEED SCSI BUS CONTROLLER" can therefore be collated in 800 ns.

[0064] Thus the filtering process, which required 1,500 ns in the conventional method, can be performed in 750 ns with the present invention. As a result, the search time of the character string search device, which was 1,500 ns in the conventional method, can be shortened to 800 ns. In other words, the filtering process no longer becomes a bottleneck, and the search throughput of the character string search device is equivalently improved by a factor of about two.
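The timing figures above can be checked with a short calculation. This is a sketch under two stated assumptions: the conventional method performs one filtering operation per character (which matches the 1,500 ns figure for 30 characters), and the filtering and collating stages overlap, so the slower stage determines the search time. The function name `search_times` is hypothetical.

```python
FILTER_NS = 50     # time per filtering operation, from the text
COLLATE_NS = 100   # time per collating operation, from the text


def search_times(text_len=30, collate_ops=8, lanes=2):
    """Return (filter_ns, collate_ns, search_ns) for a given number of lanes."""
    filter_ops = -(-text_len // lanes)            # ceiling division
    filter_ns = filter_ops * FILTER_NS
    collate_ns = collate_ops * COLLATE_NS
    # Assumption: the two stages overlap, so the slower one sets the search time.
    return filter_ns, collate_ns, max(filter_ns, collate_ns)


conventional = search_times(lanes=1)
parallel = search_times(lanes=2)
speedup = conventional[2] / parallel[2]
print(conventional, parallel, speedup)
```

Under these assumptions the ratio is 1,500 / 800 = 1.875, which agrees with the "about two" stated in the text.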

[0065]

[Embodiment] A first embodiment based on the principle of the present invention is described below with reference to FIG. 11, which is a block diagram showing the configuration of this embodiment. In this embodiment, a parallel filtering means 3000 is provided between the character string storage means 105 and the character string collating means 102 of the character string search device 1. A plurality of filtering circuits, each of which discards character codes not contained in the search term from the text read from the character string storage means 105, are operated in parallel to raise the throughput of the filtering process and thereby equivalently raise the collating throughput.

[0066] FIG. 1 shows an example configuration of the parallel filtering means 3000 used in this embodiment.

[0067] The parallel filtering means 3000 comprises the distributing means 3100, the filtering means 3200a and 3200b, and the collecting means 3300.

[0068] First, the operation of the parallel filtering means 3000 is outlined.

[0069] The distributing means 3100 takes in text from the character string storage means 105 two bytes at a time and sends one byte each to the two filtering means 3200a and 3200b.

[0070] Each time one character is input from the distributing means 3100, the filtering means 3200a and 3200b of this embodiment compare it with the search term, determine whether the search term contains the same character, and output the result as a match flag, together with the character code, to the collecting means 3300. That is, a match flag of 1 is output for an input character code that is contained in the search term, and 0 for any other character code.

[0071] The collecting means 3300 performs output control for the two characters at once, based on the character codes and match flags output from the filtering means 3200a and 3200b.

[0072] The above is an outline of the operation of the parallel filtering means 3000.

[0073] Next, the configuration and operation of the distributing means 3100, the filtering means 3200a and 3200b, and the collecting means 3300 are each described in detail.

[0074] FIG. 12 shows the configuration of the distributing means 3100 in this embodiment. The distributing means 3100 comprises a character code fetch register 3110 and input character code registers 3120a and 3120b.

[0075] The distributing means 3100 takes in text from the character string storage means 105 two bytes at a time and stores it in the character code fetch register 3110. In the next step, the upper byte of the two bytes stored in the character code fetch register 3110 is stored in the input character code register 3120a and the lower byte in the input character code register 3120b, and they are output to the filtering means 3200a and 3200b via data lines 3010a and 3010b, respectively. The following two bytes are then newly taken into the character code fetch register 3110.

[0076] FIG. 13 shows, as an example, the operation of the distributing means 3100 when the text "HIGH-SPEED SCSI BUS CONTROLLER" is input.

[0077] First, at the first input, the first two bytes "HI" are stored in the character code fetch register 3110. Next, immediately before the second input, the first byte "H" of the two bytes "HI" held in the character code fetch register 3110 is stored in the input character code register 3120a and the second byte "I" in the input character code register 3120b, and they are output to the filtering means 3200a and 3200b via the data lines 3010a and 3010b, respectively. At the second input, the following two bytes "GH" are newly taken into the character code fetch register 3110. Thereafter, the distributing means 3100 likewise takes in the text two bytes at a time and outputs it one byte each to the filtering means 3200a and 3200b in turn. The above is the configuration and operation of the distributing means 3100.
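The splitting of each two-byte fetch into an upper and a lower byte can be sketched as follows. This is an illustrative model only: the function name `distribute` is hypothetical, the pipelining between the fetch register and the two input registers is collapsed into one loop iteration, and the sketch assumes an even number of input bytes because the text does not say how an odd trailing byte is handled.

```python
def distribute(text):
    """Split a byte string into (upper_byte, lower_byte) pairs.

    Models registers 3110, 3120a and 3120b in software.
    """
    if len(text) % 2:
        raise ValueError("this sketch assumes an even number of bytes")
    pairs = []
    for i in range(0, len(text), 2):
        fetch_register = int.from_bytes(text[i:i + 2], "big")  # register 3110
        reg_a = fetch_register >> 8       # upper byte, to filtering means 3200a
        reg_b = fetch_register & 0xFF     # lower byte, to filtering means 3200b
        pairs.append((reg_a, reg_b))
    return pairs


pairs = distribute(b"HIGH-SPEED SCSI BUS CONTROLLER")
print([chr(a) + chr(b) for a, b in pairs[:3]])
```

The first three pairs correspond to "HI", "GH" and "-S", matching the order of inputs described for FIG. 13.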

[0078] Next, the configuration and operation of the filtering means 3200a and 3200b are described.

[0079] The filtering means 3200a and 3200b are identical circuits, so the configuration and operation of the filtering means 3200a are described here as representative.

[0080] FIG. 14 shows the configuration of the filtering means 3200a. The filtering means 3200a comprises a register 3210a, a single filtering table 3220a, a match flag register 3230a and a character code register 3240a.

[0081] As initial setup, a match flag of 1 is set in each slot of the single filtering table 3220a that corresponds to a character code contained in the search term, and 0 is set in all other slots. For example, when "BUS" is given as the search term, the contents shown in FIG. 8 are set in the single filtering table 3220a; that is, a match flag of 1 is set in the slots corresponding to the character codes of "B", "U" and "S", which make up "BUS".

[0082] The filtering operation begins by taking a character code from the distributing means 3100 into the register 3210a, one byte at a time.

[0083] The character code 3211a output from the register 3210a is stored as is in the character code register 3240a and output to the collecting means 3300 via the data line 3020a. The single filtering table 3220a is also accessed using the character code 3211a as the reference address, and the match flag read out is stored in the match flag register 3230a. That is, 1 is read from the match flag register 3230a if the character code 3020a is a character code making up the search term, and 0 otherwise, and this value is output to the collecting means 3300 via the data line 3030a.
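The table lookup described above can be modeled in a few lines. In this sketch the table is assumed to have 256 slots, one per possible one-byte character code; the source only states that the table is addressed by the character code, so the size is an assumption, and the function names `build_filter_table` and `filter_byte` are hypothetical.

```python
def build_filter_table(term):
    """Initial setup: flag 1 in each slot addressed by a search-term byte."""
    table = [0] * 256            # assumed size: one slot per one-byte code
    for code in term:
        table[code] = 1
    return table


def filter_byte(table, code):
    """Return (character_code, match_flag), as sent to the collecting means."""
    return code, table[code]     # the character code itself is the address


table = build_filter_table(b"BUS")
print(filter_byte(table, ord("S")), filter_byte(table, ord("H")))
```

Because the lookup is a single indexed read, each filtering operation takes the same time regardless of the length of the search term, which is what makes the fixed per-operation cost in the timing example plausible.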

[0084] FIG. 15 shows the specific operation of the filtering means when "BUS" is given as the search term and "HIGH-SPEED SCSI BUS CONTROLLER" is input as the text.

[0085] First, at the first input, "H" is taken from the distributing means 3100 into the filtering means 3200a and stored in the register 3210a, and "I" is taken into the filtering means 3200b and stored in the register 3210b. Immediately before the second input, the filtering means 3200a outputs "H" from the register 3210a as the character code 3211a, so the single filtering table 3220a is accessed with "H". Since "H" is a character code not contained in the search term, 0 is stored in the match flag register 3230a as the match flag value and output to the collecting means 3300 via the data line 3030a. As the character code, "H" is stored in the character code register 3240a and output to the collecting means 3300 via the data line 3020a.

[0086] Similarly, the filtering means 3200b outputs "I" from the register 3210b as the character code 3211b, so the single filtering table 3220b is accessed with "I". Since "I" is also a character code not contained in the search term, 0 is stored in the match flag register 3230b and output to the collecting means 3300 via the data line 3030b. As the character code, "I" is stored in the character code register 3240b and output to the collecting means 3300 via the data line 3020b.

[0087] At the second input, "G" is taken from the distributing means 3100 into the filtering means 3200a and stored in the register 3210a, and "H" is taken into the filtering means 3200b and stored in the register 3210b.

[0088] Thereafter, the filtering means 3200a and 3200b likewise take in character codes from the distributing means 3100 one byte at a time and output them to the collecting means 3300 as the character codes 3020a and 3020b. They also compare each character with the search term to determine whether the search term contains the same character, and output to the collecting means 3300 a match flag 3030a or 3030b of 1 for a character that is contained and 0 for any other character code.

[0089] The above is the configuration and operation of the filtering means 3200a and 3200b.

[0090] FIG. 16 shows the configuration of the collecting means 3300 in this embodiment. The collecting means 3300 comprises flag registers 3310a and 3310b, character code registers 3320a and 3320b, an OR circuit 3330, flag buffers 3340a and 3340b, character code buffers 3350a and 3350b, a filtering control circuit 3360 and a character code selector 3370.

[0091] The match flags 3030a and 3030b output from the filtering means are stored in the flag registers 3310a and 3310b, respectively, and the character codes 3020a and 3020b are temporarily stored in the character code registers 3320a and 3320b. When at least one of the two characters taken in passes the filter, that is, when the logical OR of the two match flags is 1, the match flags held in the flag registers 3310a and 3310b are stored in the flag buffers 3340a and 3340b, respectively, and the character codes held in the character code registers 3320a and 3320b are stored in the character code buffers 3350a and 3350b, respectively. When both characters are to be deleted by the filter, that is, when the logical OR of the two match flags is 0, the match flags and character codes for those two characters are not taken into the flag buffers 3340a and 3340b or the character code buffers 3350a and 3350b.

[0092] The filtering control circuit 3360 takes as input the values of the match flags 3303a and 3303b read from the flag buffers 3340a and 3340b, and outputs the read enable (RE) signal 3301 for the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b, and the select signal 3302 for the character code selector 3370. That is, it outputs 1 as the read enable (RE) signal 3301 at the timing at which the match flags and character codes are to be read from the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b. As the select signal 3302 of the character code selector 3370, it outputs 0 to select the character code 3304a (X port side) output from the character code buffer 3350a, and 1 to select the character code 3304b (Y port side) output from the character code buffer 3350b.

[0093] The character code selector 3370 selects the character code 207 to be output from the filtering stage according to the value of the select signal 3302 output from the filtering control circuit 3360. That is, when the select signal 3302 is 0 it selects the character code 3304a (X port side) output from the character code buffer 3350a, and when the select signal 3302 is 1 it selects the character code 3304b (Y port side) output from the character code buffer 3350b.

[0094] FIG. 17 shows the configuration of the filtering control circuit 3360. In this embodiment, the filtering control circuit 3360 comprises a filtering control decoder 3361 and a two-character pass flag register 3362.

[0095] FIG. 18 outlines the operation of the filtering control circuit 3360. The filtering control decoder 3361 takes as input the previous-step two-character pass flag 3363 output from the two-character pass flag register 3362, the output 3303a from the flag buffer 3340a and the output 3303b from the flag buffer 3340b, and generates and outputs the current-step two-character pass flag 3364, the select signal 3302 for the character code selector 3370 and the read enable (RE) signal 3301. The two-character pass flag register 3362 is initialized to 0.

[0096] First, when the output 3303a from the flag buffer 3340a is 0 and the output 3303b from the flag buffer 3340b is 1, only the character code 3304b output from the character code buffer 3350b is output to the character string collating means 102. That is, 1 is output as the select signal 3302 of the character code selector 3370 and 1 as the read enable (RE) signal.

[0097] When the output 3303a from the flag buffer 3340a is 1 and the output 3303b from the flag buffer 3340b is 0, only the character code 3304a output from the character code buffer 3350a is output to the character string collating means 102. That is, 0 is output as the select signal 3302 of the character code selector 3370 and 1 as the read enable (RE) signal.

[0098] Further, when the output 3303a from the flag buffer 3340a is 1 and the output 3303b from the flag buffer 3340b is also 1, the character code 3304a output from the character code buffer 3350a is first output to the character string collating means 102, and in the next step the character code 3304b output from the character code buffer 3350b is output to the character string collating means 102. That is, 0 is output as the select signal 3302 of the character code selector 3370 and 0 as the read enable (RE) signal, and 1 is output as the current-step two-character pass flag 3364. The current-step two-character pass flag 3364 is temporarily stored in the two-character pass flag register 3362 and appears in the next step as a previous-step two-character pass flag 3363 of 1. When the previous-step two-character pass flag 3363 is 1, 1 is output as the select signal 3302 of the character code selector 3370, 1 as the read enable (RE) signal, and 0 as the current-step two-character pass flag 3364.

[0099] Note that the output 3303a from the flag buffer 3340a and the output 3303b from the flag buffer 3340b can never both be 0, because this would contradict the condition for writing to the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b (the logical OR of the two match flags is 1).
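The decoder behavior described in the preceding paragraphs amounts to a small state machine with one bit of state. The sketch below is one way to express it in software; the function names `decode` and `drain`, and the representation of a buffer entry as a pair of `(character, flag)` tuples, are hypothetical and not taken from the patent.

```python
def decode(prev_two_pass, flag_a, flag_b):
    """Return (select, read_enable, two_pass) for one output step."""
    if prev_two_pass:
        return 1, 1, 0       # second half of a two-character pass: emit Y, advance
    if flag_a and flag_b:
        return 0, 0, 1       # emit X, hold the buffers, remember Y is pending
    if flag_a:
        return 0, 1, 0       # only X passes
    if flag_b:
        return 1, 1, 0       # only Y passes
    raise ValueError("both flags 0 cannot occur: such pairs are never buffered")


def drain(entries):
    """Replay buffered entries [((char_a, flag_a), (char_b, flag_b)), ...]."""
    out = []
    two_pass = 0             # two-character pass flag register, initial value 0
    i = 0
    while i < len(entries):
        (a, fa), (b, fb) = entries[i]
        select, read_enable, two_pass = decode(two_pass, fa, fb)
        out.append(b if select else a)   # selector: 0 = X port, 1 = Y port
        if read_enable:
            i += 1                       # read the next buffer entry
    return "".join(out)


buffered = [(("-", 0), ("S", 1)), ((" ", 0), ("S", 1)), (("C", 0), ("S", 1)),
            (("B", 1), ("U", 1)), (("S", 1), (" ", 0))]
print(drain(buffered))
```

The entry for "BU" takes two output steps because the read enable is held at 0 for the first of them, which is why six characters are emitted from five buffered pairs.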

[0100] Next, the specific operation of the collecting means 3300 is described, taking as an example the case where "BUS" is given as the search term and "HIGH-SPEED SCSI BUS CONTROLLER" is input as the text.

[0101] FIG. 19 shows an example of the operation of the flag registers 3310a and 3310b, the character code registers 3320a and 3320b, the OR circuit 3330, the flag buffers 3340a and 3340b, and the character code buffers 3350a and 3350b.

[0102] First, at the first input, "H" and "I" are taken in as character codes from the filtering means 3200a and 3200b and stored in the character code registers 3320a and 3320b, respectively. Since neither of these two characters is a character code contained in the search term, 0 is stored as the match flag in both flag registers 3310a and 3310b.

[0103] Next, immediately before the second input, 0 is output from both flag registers 3310a and 3310b, so the output of the OR circuit 3330 is 0. That is, because neither of the two characters held in the character code registers 3320a and 3320b is contained in the search term, the character codes and match flags for these two characters are not taken into the character code buffers 3350a and 3350b or the flag buffers 3340a and 3340b.

[0104] At the second input, "G" and "H" are stored in the character code registers 3320a and 3320b, and 0 is stored as the match flag in both flag registers 3310a and 3310b. The output of the OR circuit 3330 is again 0, so the character codes and match flags for these two characters are not taken into the character code buffers 3350a and 3350b or the flag buffers 3340a and 3340b.

[0105] At the third input, "-" and "S" are stored in the character code registers 3320a and 3320b. Here "-" is a character code not contained in the search term "BUS", but "S" is contained, so 0 is stored as the match flag in the flag register 3310a and 1 in the flag register 3310b. The output of the OR circuit 3330 is therefore 1, and the character codes and match flags for these two characters are taken into the character code buffers 3350a and 3350b and the flag buffers 3340a and 3340b, respectively.

[0106] Thereafter, the two bytes of character code and the match flags output from the filtering means 3200 are likewise stored temporarily in the character code registers 3320a and 3320b and the flag registers 3310a and 3310b. When at least one of these character codes is contained in the search term, that is, when the output of the OR circuit 3330 is 1, the corresponding match flags and character codes are taken into the character code buffers 3350a and 3350b and the flag buffers 3340a and 3340b.
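The buffer write condition can be illustrated with a short model that lists which character pairs of the example text reach the buffers. The function name `buffer_writes` is hypothetical, and the sketch assumes a text with an even number of characters.

```python
def buffer_writes(text, term):
    """Return the pairs written to the buffers as ((char_a, flag_a), (char_b, flag_b))."""
    allowed = set(term)
    buffered = []
    for i in range(0, len(text), 2):
        a, b = text[i], text[i + 1]            # assumes an even-length text
        flag_a, flag_b = int(a in allowed), int(b in allowed)
        if flag_a | flag_b:                    # OR circuit 3330 enables the write
            buffered.append(((a, flag_a), (b, flag_b)))
    return buffered


written = buffer_writes("HIGH-SPEED SCSI BUS CONTROLLER", "BUS")
print([a + b for (a, _), (b, _) in written])
```

Of the 15 input pairs, only five contain at least one search-term character, so the buffers hold far fewer entries than the text has pairs.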

[0107] FIG. 20 shows the specific operation of the filtering control circuit 3360 and the character code selector 3370.

[0108] At the first input, in the state where the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b are no longer empty, that is, immediately after "-" and "S" have been written to the character code buffers, "-" and "S" are read as character codes from the character code buffers 3350a and 3350b, and 0 and 1 are read from the flag buffers 3340a and 3340b as the match flags 3303a and 3303b, respectively. In addition, 0 is read from the two-character pass flag register 3362 as the previous-step two-character pass flag 3363.

[0109] In response to these inputs, the filtering control decoder 3361 outputs 0 as the current-step two-character pass flag 3364, 1 as the select signal 3302 of the character code selector 3370, and 1 as the read enable (RE) signal 3301 for the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b (see FIG. 18). The two-character pass flag register 3362 stores 0 as the current-step two-character pass flag 3364.

[0110] The character code selector 3370 selects the character code to be output from the filtering stage according to the value of the select signal 3302. Since the select signal 3302 is 1, it selects "S" (Y port side), the output of the character code buffer 3350b, and outputs it to the character string collating means 102. The two-character pass flag register 3362 stores 0 as the current-step two-character pass flag 3364.

[0111] At the second input, since the read enable (RE) signal 3301 for the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b is 1, the values for the next two characters are read from each buffer: " " and "S" from the character code buffers 3350a and 3350b, and 0 and 1 from the flag buffers 3340a and 3340b as the match flags 3303a and 3303b, respectively. As before, the filtering control decoder 3361 responds to these inputs by outputting 0 as the current-step two-character pass flag 3364, 1 as the select signal 3302 of the character code selector 3370, and 1 as the read enable (RE) signal 3301 for the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b (see FIG. 18). Since the select signal 3302 is 1, the character code selector 3370 selects "S" (Y port side), the output of the character code buffer 3350b. The two-character pass flag register 3362 stores 0 as the current-step two-character pass flag 3364.

[0112] At the third input, the read enable (RE) signal 3301 of the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b is 1, so the values for the next two characters are read from each buffer: "C" and "S" from the character code buffers 3350a and 3350b, and 0 as the match flag 3303a and 1 as the match flag 3303b from the flag buffers 3340a and 3340b. As before, in response to this input the filtering control decoder 3361 outputs 0 as the current-step two-character passage flag 3364, 1 as the select signal 3302 of the character code selector 3370, and 1 as the read enable (RE) signal 3301 of the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b (see FIG. 18). Since the value of the select signal 3302 is 1, the character code selector 3370 selects "S" (Y port side), the output of the character code buffer 3350b. In addition, 0 is stored in the two-character passage flag register 3362 as the current-step two-character passage flag 3364.

[0113] At the fourth input, the read enable (RE) signal 3301 of the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b is 1, so the values for the next two characters are read from each buffer: "B" and "U" from the character code buffers 3350a and 3350b, and 1 as the match flag 3303a and 1 as the match flag 3303b from the flag buffers 3340a and 3340b. In response to this input the filtering control decoder 3361 outputs 1 as the current-step two-character passage flag 3364, 0 as the select signal 3302 of the character code selector 3370, and 0 as the read enable (RE) signal 3301 of the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b (see FIG. 18). Since the value of the select signal 3302 is 0, the character code selector 3370 selects "B" (X port side), the output of the character code buffer 3350a. In addition, 1 is stored in the two-character passage flag register 3362 as the current-step two-character passage flag 3364.

[0114] At the fifth input, the read enable (RE) signal 3301 of the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b is 0, so the values for the next two characters are not read from the buffers. The character code buffers 3350a and 3350b therefore continue to output "B" and "U", and the flag buffers 3340a and 3340b continue to output 1 as the match flag 3303a and 1 as the match flag 3303b. The previous-step two-character passage flag 3364 receives 1 from the two-character passage flag register 3362. In response to these inputs the filtering control decoder 3361 outputs 0 as the current-step two-character passage flag 3364, 1 as the select signal 3302 of the character code selector 3370, and 1 as the read enable (RE) signal 3301 of the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b (see FIG. 18). Since the value of the select signal 3302 is 1, the character code selector 3370 selects "U" (Y port side), the output of the character code buffer 3350b. In addition, 0 is stored in the two-character passage flag register 3362 as the current-step two-character passage flag 3364.

[0115] Finally, at the sixth input, the read enable (RE) signal 3301 of the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b is 1, so the values for the next two characters are read from each buffer: "S" and " " from the character code buffers 3350a and 3350b, and 1 as the match flag 3303a and 0 as the match flag 3303b from the flag buffers 3340a and 3340b. In response to this input the filtering control decoder 3361 outputs 0 as the current-step two-character passage flag 3364, 0 as the select signal 3302 of the character code selector 3370, and 1 as the read enable (RE) signal 3301 of the flag buffers 3340a and 3340b and the character code buffers 3350a and 3350b (see FIG. 18). Since the value of the select signal 3302 is 0, the character code selector 3370 selects "S" (X port side), the output of the character code buffer 3350a.
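
The collection steps walked through above can be sketched in software. The following Python is only an illustration, not the circuit of FIG. 18. The names `control_decoder`, `collect`, and `DecoderOut`, the tuple layout of a buffered pair, and the rejection of a pair in which neither character matches are assumptions made for this example. The decoder outputs are taken solely from the four input combinations stated in the walkthrough.

```python
from typing import NamedTuple


class DecoderOut(NamedTuple):
    next_flag: int    # current-step two-character passage flag (3364)
    select: int       # select signal (3302): 0 = X port (buffer a), 1 = Y port (buffer b)
    read_enable: int  # read enable signal (3301): 1 = advance to the next buffered pair


def control_decoder(flag_a: int, flag_b: int, prev_flag: int) -> DecoderOut:
    """Map the two match flags and the previous-step flag to the control signals."""
    if flag_a and flag_b:
        if not prev_flag:
            # Both characters pass: emit the X-port character and hold the pair.
            return DecoderOut(next_flag=1, select=0, read_enable=0)
        # Second visit to the same pair: emit the Y-port character and advance.
        return DecoderOut(next_flag=0, select=1, read_enable=1)
    if flag_a:
        return DecoderOut(next_flag=0, select=0, read_enable=1)
    if flag_b:
        return DecoderOut(next_flag=0, select=1, read_enable=1)
    # Assumption: pairs without any matching character never reach the buffers.
    raise ValueError("pair with no matching character")


def collect(pairs):
    """Serialize buffered pairs ((char_a, flag_a), (char_b, flag_b)) into one string."""
    out = []
    prev_flag = 0
    index = 0
    while index < len(pairs):
        (char_a, flag_a), (char_b, flag_b) = pairs[index]
        signals = control_decoder(flag_a, flag_b, prev_flag)
        out.append(char_b if signals.select else char_a)
        prev_flag = signals.next_flag
        if signals.read_enable:
            index += 1
    return "".join(out)


# The five buffered pairs from the walkthrough, consumed in six steps.
walkthrough_pairs = [
    (("-", 0), ("S", 1)),
    ((" ", 0), ("S", 1)),
    (("C", 0), ("S", 1)),
    (("B", 1), ("U", 1)),
    (("S", 1), (" ", 0)),
]
print(collect(walkthrough_pairs))
```

Holding the read enable at 0 for one step is what lets a pair in which both characters pass be emitted one character at a time on a single output port.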

[0116] The above is the configuration and operation of the parallel filtering means 3000 in this embodiment.

[0117] As described above, according to this embodiment, operating two filtering circuits in parallel makes it possible to filter the 30-character input text "HIGH-SPEED SCSI BUS CONTROLLER" down to just the six characters "SSSBUS" contained in the search term in 15 processing cycles. That is, compared with the non-parallel case, the number of filtering processing cycles is reduced to 15/30, i.e. 1/2, which doubles the collation throughput of the character string search device.
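
The cycle count can be checked with a short software model. The sketch below is illustrative only: the function name `parallel_filter` and the `lanes` parameter are hypothetical, and the set of passing characters (B, U, S) is an assumption inferred from the filtered output "SSSBUS". It models the distributing, filtering, and collecting roles as plain Python steps and counts one processing cycle per group of characters fetched.

```python
import math


def parallel_filter(text: str, term_chars: set, lanes: int = 2):
    """Return (filtered_text, filtering_cycles) for `lanes` filters working in parallel."""
    cycles = math.ceil(len(text) / lanes)
    kept = []
    for start in range(0, len(text), lanes):
        group = text[start:start + lanes]           # distribution: one character per lane
        flags = [ch in term_chars for ch in group]  # each lane tests its own character
        # Collection: keep only the passing characters, in their original order.
        kept.extend(ch for ch, hit in zip(group, flags) if hit)
    return "".join(kept), cycles


text = "HIGH-SPEED SCSI BUS CONTROLLER"
filtered, cycles = parallel_filter(text, set("BUS"), lanes=2)
print(len(text), filtered, cycles)
```

With `lanes=1` the same function models the non-parallel case, which needs one cycle per input character.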

[0118] This embodiment has been described using, as an example, parallel operation of the single filtering circuit that does not insert a delimiter indicating the occurrence of discontinuous characters, one of the filtering circuits described in the specification of Japanese Patent Application No. 4-93067, "Character string search device equipped with filtering circuit". However, it is clear that the same can be achieved when other filtering circuits described in that prior application, for example the single filtering circuit that inserts delimiters or the single head filtering circuit, are operated in parallel.

[0119] Although this embodiment has described the case in which two filtering means operate in parallel, it is also clear that three or more filtering means can be operated in parallel by the same method as in the above embodiment.

[0120] Furthermore, although this embodiment has described the case in which the input text is represented in 1-byte characters, it is also clear that the case in which the input text is represented in 2-byte characters can be realized in the same way as the above embodiment.

【０１２１】[0121]

[Effects of the Invention] As described above, according to the present invention, operating two filtering circuits in parallel makes it possible, for example, to filter the 30-character input text "HIGH-SPEED SCSI BUS CONTROLLER" down to just the six characters "SSSBUS" contained in the search term in 15 processing cycles. That is, compared with the non-parallel case, the number of filtering processing cycles is reduced to 15/30, i.e. 1/2, which doubles the collation throughput of the character string search device.

【０１２２】[0122]

[Brief description of drawings]

FIG. 1 is an explanatory diagram of a character string search device of the present invention.

FIG. 2 is an explanatory diagram of a conventional character string search device.

FIG. 3 is a state transition diagram of a conventional automaton.

FIG. 4 is an explanatory diagram of a conventional token control method.

FIG. 5 is a block diagram showing the configuration of a conventional character string matching circuit.

FIG. 6 is an explanatory diagram of a conventional character string search device.

FIG. 7 is a block diagram showing the configuration of a conventional filtering means.

FIG. 8 is an explanatory diagram of a conventional filtering table.

FIG. 9 is a timing chart of a conventional filtering means.

FIG. 10 is a state transition diagram of a conventional automaton and an explanatory diagram of a conventional token control method.

FIG. 11 is an explanatory diagram of a character string search device of the present invention.

FIG. 12 is a block diagram showing the configuration of the distributing means of the present invention.

FIG. 13 is an explanatory diagram of an operation example of the distributing means of the present invention.

FIG. 14 is a block diagram showing the configuration of the filtering means of the present invention.

FIG. 15 is an explanatory diagram of an operation example of the filtering means of the present invention.

FIG. 16 is a block diagram showing the configuration of the collecting means of the present invention.

FIG. 17 is a block diagram showing the configuration of the filtering control circuit of the present invention.

FIG. 18 is an explanatory diagram of the input/output relationships of the filtering control decoder of the present invention.

FIG. 19 is an explanatory diagram of an operation example of the character code register, flag register, character code buffer, and flag buffer of the present invention.

FIG. 20 is an explanatory diagram of an operation example of the filtering control circuit and the character code selector of the present invention.

[Explanation of symbols]

102 ... character string collating means, 105 ... character string storing means, 3000 ... filtering means, 3200 ... character code register, 3300 ... single filtering table.

Continuation of the front page: (72) Inventor: Masatsugu Shinozaki, 4-6 Kanda-Surugadai, Chiyoda-ku, Tokyo, c/o Hitachi, Ltd.

Claims

[Claims]

1. A character string search device for collectively determining whether or not a plurality of designated search terms exist in a text composed of code-represented characters, comprising: character string storage means for storing the text; parallel filtering means for fetching a plurality of characters at a time from the text read from the character string storage means, judging in parallel whether or not each character code is included in the search terms, and outputting a character code only when it is included; and character string collating means for collectively checking whether or not the search terms are present in the character code string output from the parallel filtering means.

2. The character string search device according to claim 1, wherein the parallel filtering means comprises: distributing means for fetching a plurality of characters at a time from the text read from the character string storage means, dividing them into single characters, and sending them out; a plurality of filtering means, arranged in parallel, each for determining whether the character code output from the distributing means is a character code included in the search terms; and collecting means for fetching the character codes output from the plurality of filtering means, aligning them, and outputting them.

3. The character string search device according to claim 2, wherein the filtering means fetches characters one at a time from the text read from the character string storage means, outputs the character code to the character string collating means if it is included in the designated search terms and does not output it if it is not included, and outputs a specific delimiter code to the character string collating means only when the character code output immediately before is included in the search terms.

4. The character string search device according to claim 2, wherein the filtering means comprises: flag storage means in which the slots corresponding to the character codes of characters included in search terms designated in advance are set to 1 and the slots corresponding to other character codes are set to 0; and output selection means that refers to the flag storage means using the character code input from the character string storage means, outputs the character code to the character string collating means when the flag read out is 1, and does not output it when the flag is 0.

5. The character string search device according to claim 2, wherein the filtering means fetches characters one at a time from the text read from the character string storage means, outputs the character code to the character string collating means when it is the first character code of a search term and does not output character codes other than a first character code, and, once a first character code has been output to the character string collating means, outputs an input character code to the character string collating means only when it is included in the search terms and does not output it when it is not included.

6. The character string search device according to claim 2, wherein the filtering means fetches characters one at a time from the text read from the character string storage means, outputs the character code to the character string collating means when it is the first character code of a search term, and, once a first character code has been output to the character string collating means, outputs an input character code to the character string collating means only when it is included in the search terms and does not output it when it is not included, and outputs a specified delimiter code to the character string collating means only when the character code immediately before was output to the character string collating means.
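
As an informal illustration of the flag table of claim 4 combined with the delimiter behavior of claim 3, the following Python sketch may help. It is a software reading of the claim language, not the claimed circuit. The function names `build_flag_table` and `filter_with_delimiter`, the 256-slot table size (1-byte character codes), and the choice of delimiter byte are all assumptions made for the example.

```python
def build_flag_table(terms):
    """Build a 256-slot table: 1 for character codes used in any search term, else 0."""
    table = [0] * 256
    for term in terms:
        for code in term.encode("ascii"):
            table[code] = 1
    return table


def filter_with_delimiter(text: bytes, table, delimiter: bytes = b"|") -> bytes:
    """Pass flagged character codes; emit one delimiter right after each passed run."""
    out = bytearray()
    previous_passed = False
    for code in text:
        passed = table[code] == 1
        if passed:
            out.append(code)
        elif previous_passed:
            # The code just before this one was output, so mark the discontinuity.
            out += delimiter
        previous_passed = passed
    return bytes(out)


flag_table = build_flag_table(["BUS"])
print(filter_with_delimiter(b"HIGH-SPEED SCSI BUS CONTROLLER", flag_table))
```

The delimiter keeps characters that were not adjacent in the input from appearing adjacent to the character string collating means.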