JPH06176071A

JPH06176071A - Information retrieving system

Info

Publication number: JPH06176071A
Application number: JP4330126A
Authority: JP
Inventors: Nobuo Muto; 信夫武藤
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1992-12-10
Filing date: 1992-12-10
Publication date: 1994-06-24
Anticipated expiration: 2015-10-30
Also published as: JP3104893B2

Abstract

PURPOSE: To prevent data of equivalent meaning from being retrieved more than once when information is retrieved by prefix-matching (right-truncation) logic, which retrieves information beginning with a specified search keyword. CONSTITUTION: A code matching length calculation means 3 assigns priorities to the collation code strings of the search target data. The code matching length L1 of the highest-priority collation code string M1 is set to 0. Then, in descending order of priority, each collation code string Mi (i ≥ 2) is compared code by code from the head with the collation code strings M1 through Mi-1, and the maximum number of consecutively matching codes is taken as the code matching length Li of Mi. Each Li is stored in a collation code string / code matching length storage means 2. A search target data retrieval means 4 compares a key code string K0 consisting of L0 codes with the collation code strings Mi (i = 1, 2, ...), finds those whose first L0 codes match and whose code matching length is smaller than L0, and retrieves the corresponding search target data from a search target data storage means 1.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an information retrieval system that performs retrieval by keyword prefix-matching logic (logic that compares, from the head, as many codes as the length of the keyword specified as the condition).

[0002]

[Prior Art] In searches of dictionaries and the like, a scheme is used in which multiple keywords, such as aliases, are assigned to a single item of search target data so that retrieval is possible even under ambiguous conditions.

[0003]

[Problems to Be Solved by the Invention] However, when search target data is retrieved by prefix-matching logic against a character string given as the search condition, the following problem arises.

[0004] As shown in FIG. 6, suppose that search keyword (collation code string) 251 「あまみや」 (amamiya) is set for search target data 451 「天宮」, and that search keywords 252 「あまみや」 (amamiya) and 253 「あめみや」 (amemiya) are set for search target data 452 「雨宮」. When names beginning with 「あ」 (a) are searched for, the three search keywords 251, 252, and 253 all match. Because search keywords 252 and 253 are both keywords assigned to search target data 452 「雨宮」, processing is required so that these two hits yield only the single search result 452 「雨宮」.

[0005] Conventionally, as shown in FIG. 7, to guarantee that search keywords 252 and 253 point to the same search target data 452, schemes such as the following are used: unique codes 521 and 522 are assigned to search target data 451 and 452, the search results are sorted by these codes, and duplicates of the same search target data are removed by merge processing. Consequently, every search requires sort processing, merge processing, and the like over the entire search result, which imposes a heavy load in online processing where searches are frequent.
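The conventional scheme described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout and the function name `conventional_search` are hypothetical, and the pairing of unique codes 521 and 522 with 「天宮」 and 「雨宮」 follows the order in which the text lists them.

```python
# Hypothetical records based on FIG. 6 and FIG. 7:
# (search keyword, unique code, search target data).
KEYWORDS = [
    ("あまみや", 521, "天宮"),  # keyword 251
    ("あまみや", 522, "雨宮"),  # keyword 252
    ("あめみや", 522, "雨宮"),  # keyword 253
]


def conventional_search(key: str) -> list[str]:
    """Prefix match, then sort by unique code and drop adjacent duplicates."""
    hits = [(code, data) for kw, code, data in KEYWORDS if kw.startswith(key)]
    hits.sort()  # the per-query sort that the text identifies as costly
    merged: list[tuple[int, str]] = []
    for hit in hits:
        # Merge step: keep one row per unique code.
        if not merged or merged[-1][0] != hit[0]:
            merged.append(hit)
    return [data for _, data in merged]


print(conventional_search("あ"))
```

The sort and merge run over the whole hit list on every query, which is the cost the invention aims to remove.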

[0006] An object of the present invention is to provide an information retrieval system that, when information is retrieved by prefix-matching logic (retrieving information that begins with a specified search keyword), avoids retrieving data of equivalent meaning more than once and extracts only the necessary data by simple processing and at high speed.

[0007]

[Means for Solving the Problems] To achieve the above object, the information retrieval system of the present invention comprises: a search target data storage means in which search target data is stored; a collation code string / code matching length storage means in which the collation code strings assigned to the search target data and the code matching lengths of those collation code strings are stored; a code matching length calculation means that assigns priorities to the collation code strings for each item of search target data, sets the code matching length L₁ of the highest-priority collation code string M₁ to 0, compares, in descending order of priority, each collation code string Mᵢ (i ≥ 2) code by code from the head with the collation code strings M₁ through Mᵢ₋₁, takes the maximum number of consecutively matching codes as the code matching length Lᵢ of Mᵢ, and stores these code matching lengths Lᵢ (i = 1, 2, ...) in the collation code string / code matching length storage means; and a search target data retrieval means that compares a specified key code string K₀ consisting of L₀ codes with the collation code strings Mᵢ (i = 1, 2, ...) stored in the collation code string / code matching length storage means, finds the collation code strings whose first L₀ codes match and whose code matching length is smaller than L₀, and retrieves the corresponding search target data from the search target data storage means.

[0008]

[Operation] In the present invention, a simple priority order is first assigned to the multiple items of search target data (and hence to their collation code strings) according to their identity, inclusion relations, and hierarchical relations. Next, in descending order of priority, each collation code string is compared from the head with the collation code strings of higher priority than itself, and the maximum number of consecutively matching codes is taken as its code matching length. The highest-priority collation code string has nothing to be compared with, so its code matching length is set to 0. Finally, when a key code string is specified, the search target data is retrieved for each collation code string whose first L₀ codes match and whose code matching length is smaller than the number of codes in the key code string. Consequently, when several collation code strings contain the same codes as the specified key code string, the search target data for the highest-priority one among them is retrieved.

[0009]

[Embodiment] An embodiment of the present invention will now be described with reference to the drawings.

[0010] FIG. 1 is a block diagram of an information retrieval system according to one embodiment of the present invention.

[0011] The information retrieval system of this embodiment comprises: a search target data storage means 1 in which search target data is stored; a collation code string / code matching length storage means 2 in which the collation code strings assigned to the search target data and the code matching lengths of those collation code strings are stored; a code matching length calculation means 3 that assigns priorities to the collation code strings M for the search target data, sets the code matching length L₁ of the highest-priority collation code string M₁ to 0, compares, in descending order of priority, each collation code string Mᵢ (i ≥ 2) code by code from the head with the collation code strings M₁ through Mᵢ₋₁, takes the maximum number of consecutively matching codes as the code matching length Lᵢ of Mᵢ, and stores these code matching lengths Lᵢ (i = 1, 2, ...) in the collation code string / code matching length storage means 2; and a search target data retrieval means 4 that compares a specified key code string K₀ consisting of L₀ codes with the collation code strings Mᵢ (i = 1, 2, ...) stored in the collation code string / code matching length storage means 2, finds the collation code strings whose first L₀ codes match and whose code matching length is smaller than L₀, and retrieves the corresponding search target data from the search target data storage means 1.

[0012] FIG. 2 is a flowchart showing the processing of the code matching length calculation means 3. First, one item of search target data is fetched from the search target data storage means 1 (step 11). If all search target data has been processed, processing ends (step 12). Priorities i = 1, 2, ..., m (1 is the highest, m the lowest) are assigned to the m collation code strings for that search target data (step 13). The code matching length L₁ of collation code string M₁ is set to 0 and i is set to 1 (step 14). i is incremented by 1 (step 15). i is compared with m (step 16); if i is greater than m, the next item of search target data is fetched from the search target data storage means 1 and the above processing is repeated (steps 11 to 16). If i is less than or equal to m, collation code string Mᵢ is compared code by code from the head with each collation code string Mⱼ (1 ≤ j ≤ i − 1), the maximum number of consecutively matching codes is stored in the collation code string / code matching length storage means 2 as the code matching length Lᵢ of Mᵢ (step 17), and processing returns to step 15.
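The flow of FIG. 2 for a single group of collation code strings can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function names `common_prefix_len` and `code_matching_lengths` are hypothetical, and the sample input is the pair of readings of 「雨宮」 from FIG. 6, with 「あめみや」 assumed to have the higher priority.

```python
def common_prefix_len(a, b) -> int:
    """Number of codes that match consecutively from the head of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def code_matching_lengths(strings) -> list[int]:
    """strings: collation code strings M1..Mm of one group, highest priority first."""
    lengths = []
    for i, m_i in enumerate(strings):
        # L1 = 0 (step 14); otherwise the maximum over all
        # higher-priority strings M1..M(i-1) (step 17).
        lengths.append(
            max((common_prefix_len(m_i, m_j) for m_j in strings[:i]), default=0)
        )
    return lengths


print(code_matching_lengths(["あめみや", "あまみや"]))
```

Because the comparison only uses `zip` and equality, the same functions work for any sequence of codes, not only character strings.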

[0013] FIG. 3 shows a first specific example of information retrieval according to this embodiment. In this example, search target data 452 「雨宮」 has two readings as collation code strings, 252 「あまみや」 (amamiya) and 253 「あめみや」 (amemiya), and the priority order is 「あめみや」 → 「あまみや」.

[0014] The code matching length calculation in this example is described with reference to FIG. 2. Search target data 452 「雨宮」 is fetched (step 11), and from the priority order M₁ = 「あめみや」, M₂ = 「あまみや」, and m = 2 (step 13). L₁ = 0 and i = 1 are set (step 14), and i is advanced to 2 (step 15). Comparing collation code string M₂ with collation code string M₁, the first character 「あ」 matches and the second characters 「ま」 and 「め」 differ, so the code matching length of M₂ is L₂ = 1 (step 17). Next, i becomes 3 (step 15), and since i > m = 2, processing returns to step 11 (step 16). The same processing is then repeated for the other search target data 451 「天宮」 and 453 「飯田」, and the code matching lengths 351 to 354 for collation code strings 251 to 254 are obtained as shown in FIG. 3.

[0015] When the key code string 「あ」 is specified as search condition 151, the leading 「あ」 of collation code strings 251 to 253 matches. Since L₀ = 1, however, the only code matching lengths that satisfy the condition Lᵢ < L₀ are 351 and 353, so the intended search target data 「天宮」 and 「雨宮」 are obtained, each exactly once.
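The selection rule used here (the first L₀ codes match and Lᵢ < L₀) can be illustrated with a short Python sketch. The table layout and the name `retrieve` are hypothetical. The stored lengths are the ones implied by the text: 0 for 251 and 253, and 1 for 252. Collation code string 254 is left out because its reading is not given in the text.

```python
# (collation code string, code matching length, search target data)
STORE = [
    ("あまみや", 0, "天宮"),  # 251 / 351
    ("あまみや", 1, "雨宮"),  # 252 / 352
    ("あめみや", 0, "雨宮"),  # 253 / 353
]


def retrieve(key: str) -> list[str]:
    """Return data whose collation string starts with key and whose length < L0."""
    l0 = len(key)
    return [
        data
        for code, length, data in STORE
        if code[:l0] == key and length < l0
    ]


print(retrieve("あ"))    # 252 is skipped because its length 1 is not below L0 = 1
print(retrieve("あま"))  # 252 now qualifies, so 雨宮 is still reachable
```

No sort or merge is needed at query time: the precomputed length alone decides whether a lower-priority keyword would duplicate a hit already produced by a higher-priority one.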

[0016] FIG. 4 shows a second specific example of information retrieval according to this embodiment. This example applies the invention to a system that searches occupational classifications by kana reading. When a user tries to find the relevant occupational classification from whatever headword comes to mind, the headwords have inclusion relations such as the following.

[0017] When 「健康食品」 (health food), 「自然食品」 (natural food), and the like are grouped and handled as 「健康・自然食品」 (health and natural food), 「健康食品」 and 「自然食品」 are in an inclusion relation with 「健康・自然食品」. This example covers the case where a search with the key 「けんこう」 (kenkou) should return only the grouped entry 「健康・自然食品」.

[0018] In this example, search target data 461 「健康・自然食品」 includes search target data 462 「健康食品」 and search target data 463 「自然食品」, so the priority order of collation code strings 261 to 263 is set to 261 「けんこうしぜんしょくひん」 → 262 「けんこうしょくひん」 → 263 「しぜんしょくひん」, and the code matching lengths 361 to 363 are calculated according to FIG. 2.

[0019] First, in step 13, M₁ = 「けんこうしぜんしょくひん」, M₂ = 「けんこうしょくひん」, M₃ = 「しぜんしょくひん」, and m = 3. In the first pass through step 17, L₂ = 4 is obtained from M₁ and M₂. In the second pass through step 17, L₃ = 0 is obtained as the maximum of the matching length between M₁ and M₃ and the matching length between M₂ and M₃.

[0020] When 「けんこう」 is specified as key code string 161 of the search condition, a plain prefix search finds collation code strings 261 and 262. However, the length of key code string 161 is L₀ = 4 and 「けんこうしょくひん」 262 has code matching length Lᵢ = 4, which is not smaller than L₀. Therefore 262 is excluded, and only search target data 461 「健康・自然食品」, which corresponds to 「けんこうしぜんしょくひん」 261 (the entry that includes 262), is retrieved.
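The second example can be reproduced with the following Python sketch. The text states L₂ = 4. The sketch obtains that value under the assumption that a small ゃ/ゅ/ょ forms a single code together with the preceding kana (so しょ is one code). This tokenization is not stated in the source; a character-by-character comparison would count 5, since both readings have し in fifth position. All names (`to_codes`, `ENTRIES`, `retrieve`) are hypothetical.

```python
SMALL_KANA = set("ゃゅょ")


def to_codes(reading: str) -> list[str]:
    """Split a kana reading into codes, merging small ゃ/ゅ/ょ into the previous kana."""
    codes: list[str] = []
    for ch in reading:
        if ch in SMALL_KANA and codes:
            codes[-1] += ch
        else:
            codes.append(ch)
    return codes


def common_prefix_len(a, b) -> int:
    """Number of codes that match consecutively from the head of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Priority order 261 -> 262 -> 263, paired with search target data 461 to 463.
ENTRIES = [
    ("けんこうしぜんしょくひん", "健康・自然食品"),
    ("けんこうしょくひん", "健康食品"),
    ("しぜんしょくひん", "自然食品"),
]
CODES = [to_codes(reading) for reading, _ in ENTRIES]
LENGTHS = [
    max((common_prefix_len(c, prev) for prev in CODES[:i]), default=0)
    for i, c in enumerate(CODES)
]


def retrieve(key: str) -> list[str]:
    """Select entries whose first L0 codes equal the key and whose length < L0."""
    k = to_codes(key)
    return [
        data
        for codes, length, (_, data) in zip(CODES, LENGTHS, ENTRIES)
        if codes[: len(k)] == k and length < len(k)
    ]


print(LENGTHS)
print(retrieve("けんこう"))
```

With either tokenization the key 「けんこう」 selects only 「健康・自然食品」, because the length stored for 262 is not smaller than L₀ = 4 in both cases.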

[0021] FIG. 5 shows a third specific example of information retrieval according to this embodiment. In this example, the code string is not limited to a character string: each unit such as prefecture, city/ward/county, or town/village is treated as a single code (for example, a prefecture code or a municipality code), and the invention is applied to a system that searches for companies and the like using an address as the key.

[0022] In this example, the search target data are 471 「A本社」 (A Head Office), 472 「本社」 (Head Office), 473 「AA部」 (AA Department), 474 「BB部」 (BB Department), 475 「B支店」 (B Branch), 476 「BB部」 (BB Department), 477 「CC部」 (CC Department), and 478 「DD部」 (DD Department). The collation code strings for search target data 471 to 478 are 271 「東京都千代田区内幸町」 (Tokyo, Chiyoda-ku, Uchisaiwaicho), 272 「東京都千代田区内幸町」, 273 「東京都千代田区内幸町」, 274 「東京都千代田区大手町」 (Tokyo, Chiyoda-ku, Otemachi), 275 「東京都中央区銀座」 (Tokyo, Chuo-ku, Ginza), 276 「東京都中央区銀座」, 277 「東京都中央区日本橋」 (Tokyo, Chuo-ku, Nihonbashi), and 278 「東京都台東区上野公園」 (Tokyo, Taito-ku, Ueno Park). The priority order of search target data 471 to 478 is 471 → 472 → 475 → 473 → 474 → 476 → 477 → 478. The code matching lengths 371 to 378 of collation code strings 271 to 278 are calculated according to the flowchart of FIG. 2, as in the preceding examples.

[0023] For key code strings 171 「東京都台東区上野公園」 (L₀ = 3), 172 「東京都中央区」 (L₀ = 2), 173 「東京都千代田区大手町」 (L₀ = 3), 174 「東京都千代田区」 (L₀ = 2), and 175 「東京都」 (L₀ = 1), search target data 478, 475, 474, 471, and 471 are retrieved, respectively.
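The third example, in which each address unit is one code, can be checked with a Python sketch that represents a collation code string as a tuple. The tuple representation and all names are assumptions made for illustration. The entries, their priority order, and the five queries are taken from the text.

```python
TOKYO = "東京都"
# (search target data number, collation code string), in the stated priority order.
ENTRIES = [
    (471, (TOKYO, "千代田区", "内幸町")),
    (472, (TOKYO, "千代田区", "内幸町")),
    (475, (TOKYO, "中央区", "銀座")),
    (473, (TOKYO, "千代田区", "内幸町")),
    (474, (TOKYO, "千代田区", "大手町")),
    (476, (TOKYO, "中央区", "銀座")),
    (477, (TOKYO, "中央区", "日本橋")),
    (478, (TOKYO, "台東区", "上野公園")),
]


def common_prefix_len(a, b) -> int:
    """Number of codes that match consecutively from the head of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Code matching length of each entry against all higher-priority entries.
LENGTHS = [
    max((common_prefix_len(code, prev) for _, prev in ENTRIES[:i]), default=0)
    for i, (_, code) in enumerate(ENTRIES)
]


def retrieve(key: tuple) -> list[int]:
    """Select entries whose first L0 codes equal the key and whose length < L0."""
    l0 = len(key)
    return [
        number
        for (number, code), length in zip(ENTRIES, LENGTHS)
        if code[:l0] == key and length < l0
    ]


for key in [
    (TOKYO, "台東区", "上野公園"),  # key code string 171
    (TOKYO, "中央区"),              # 172
    (TOKYO, "千代田区", "大手町"),  # 173
    (TOKYO, "千代田区"),            # 174
    (TOKYO,),                       # 175
]:
    print(key, retrieve(key))
```

Each query yields a single entry, the highest-priority one under the given address prefix, which agrees with the results listed in the text.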

[0024]

[Effects of the Invention] As described above, the present invention reduces multiple items of search target data to a simple priority order according to their identity, inclusion relations, hierarchical relations, and the like; compares, in descending order of priority, each collation code string with the collation code strings of higher priority than itself; calculates the code matching length as the maximum number of codes that match consecutively from the head; and, for a given key code string consisting of L₀ codes, selects the search target data corresponding to the collation code strings whose first L₀ codes match and whose code matching length is smaller than L₀. This avoids retrieving data of equivalent meaning more than once and allows only the necessary data to be extracted at high speed by simple processing. Because the selection can be implemented simply with the magnitude-comparison predicates of the query language of a general database management system (DBMS), the search program is simplified and search performance is improved.
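As an illustration of the DBMS remark, the selection can be written as a single SQL query. The sketch below uses Python's standard `sqlite3` module with an in-memory database. The table name `collation`, its columns, and the function `retrieve` are hypothetical, and the rows are those of the first specific example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE collation (code TEXT, match_len INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO collation VALUES (?, ?, ?)",
    [("あまみや", 0, "天宮"), ("あまみや", 1, "雨宮"), ("あめみや", 0, "雨宮")],
)


def retrieve(key: str) -> list[str]:
    """Prefix match plus the comparison match_len < L0, both done in SQL."""
    rows = conn.execute(
        "SELECT data FROM collation "
        "WHERE substr(code, 1, length(:key)) = :key "
        "AND match_len < length(:key) "
        "ORDER BY rowid",
        {"key": key},
    )
    return [data for (data,) in rows]


print(retrieve("あ"))
```

The duplicate is filtered by the `match_len < length(:key)` comparison inside the query, so no application-side sort or merge is required.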

[Brief description of drawings]

[FIG. 1] A block diagram of an information retrieval system according to one embodiment of the present invention.

[FIG. 2] A flowchart of the processing of the code matching length calculation means 3.

[FIG. 3] A diagram showing a first specific example of information retrieval according to the embodiment of FIG. 1.

[FIG. 4] A diagram showing a second specific example of information retrieval according to the embodiment of FIG. 1.

[FIG. 5] A diagram showing a third specific example of information retrieval according to the embodiment of FIG. 1.

[FIG. 6] A diagram showing an example of search target data and search keywords.

[FIG. 7] A diagram showing the conventional scheme applied to the search target data of FIG. 6.

[Explanation of symbols]

1: search target data storage means
2: collation code string / code matching length storage means
3: code matching length calculation means
4: search target data retrieval means
11 to 17: steps
151, 161, 171 to 175: key code strings
251 to 254, 261 to 263, 271 to 278: collation code strings
351 to 354, 361 to 363, 371 to 378: code matching lengths
451 to 453, 461 to 463, 471 to 478: search target data

Claims

[Claims]

1. An information retrieval system comprising: a search target data storage means in which search target data is stored; a collation code string / code matching length storage means in which the collation code strings assigned to each item of search target data and the code matching lengths of those collation code strings are stored; a code matching length calculation means that assigns priorities to the collation code strings for each item of search target data, sets the code matching length L₁ of the highest-priority collation code string M₁ to 0, compares, in descending order of priority, each collation code string Mᵢ (i ≥ 2) code by code from the head with the collation code strings M₁ through Mᵢ₋₁, takes the maximum number of consecutively matching codes as the code matching length Lᵢ of Mᵢ, and stores these code matching lengths Lᵢ (i = 1, 2, ...) in the collation code string / code matching length storage means; and a search target data retrieval means that compares a specified key code string K₀ consisting of L₀ codes with the collation code strings Mᵢ (i = 1, 2, ...) stored in the collation code string / code matching length storage means, finds the collation code strings whose first L₀ codes match and whose code matching length is smaller than L₀, and retrieves the corresponding search target data from the search target data storage means.