JPH0612550B2 - Data retrieval method

Info

Publication number: JPH0612550B2
Application number: JP61237235A
Authority: JP
Inventors: 恵美香鈴木
Original assignee: Fujitsu Ltd; Fujitsu Communication Systems Ltd
Current assignee: Fujitsu Ltd; Fujitsu Communication Systems Ltd
Priority date: 1986-10-07
Filing date: 1986-10-07
Publication date: 1994-02-16
Anticipated expiration: 2009-02-16
Also published as: JPS6393033A

Description

[Detailed description of the invention]

[Summary]

When retrieving the data corresponding to a keyword, a sub-keyword is generated in advance for every keyword according to a fixed rule and attached to that keyword. Each time a request arises to retrieve the data for a specified keyword, the corresponding sub-keyword is computed from that keyword under the same fixed rule, and the target data is retrieved using the computed sub-keyword. This shortens the data retrieval time.

[Industrial application field]

The present invention relates to a data retrieval method.

A database is always accessed by specifying a desired keyword. For example, if the database is a telephone directory, a person's name (the keyword) is entered to obtain the corresponding telephone number (the data). If the database is, say, a book catalog, a book title or author name (the keyword) is entered to obtain the corresponding catalog entry (the data). The present invention relates to data retrieval by keyword of this kind.

[Conventional technology]

In conventional data retrieval, keywords are compared character by character to detect whether they match the desired keyword. For example, in the case above, suppose the telephone number (data) corresponding to a person's name (say "EMIKA") is to be read from the data table. First, EMIKA is converted character by character, for example into ASCII codes, giving "45", "4D", "49", "4B", "41". Then, among the keywords attached to the data items in the data table, the one matching "45" ... "41" is detected. Here each of the five characters is compared one at a time. When a match is found, the corresponding data (the telephone number) is retrieved.
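The conventional comparison described above can be sketched roughly as follows. This is a minimal illustration, not code from the patent: the table contents and the names `to_codes` and `conventional_lookup` are hypothetical, and the table is assumed to be a simple list of (keyword, data) pairs.

```python
def to_codes(keyword: str) -> list[str]:
    """Convert each character to its two-digit hexadecimal ASCII code."""
    return [format(b, "02X") for b in keyword.encode("ascii")]


def conventional_lookup(table, wanted):
    """Scan the table, comparing stored keywords code by code.

    Returns (data, comparisons), where comparisons counts every
    single-character comparison that was performed.
    """
    wanted_codes = to_codes(wanted)
    comparisons = 0
    for keyword, data in table:
        codes = to_codes(keyword)
        matched = len(codes) == len(wanted_codes)
        if matched:
            for a, b in zip(codes, wanted_codes):
                comparisons += 1
                if a != b:
                    matched = False
                    break
        if matched:
            return data, comparisons
    return None, comparisons


# Hypothetical telephone-directory entries: (keyword, data).
table = [("ERIKA", "03-0000-0001"), ("EMIKA", "03-0000-0002")]
print(to_codes("EMIKA"))
print(conventional_lookup(table, "EMIKA"))
```

Even in this two-entry table, finding EMIKA takes five character comparisons for the matching entry, on top of those spent rejecting the other entry.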

[Problems to be solved by the invention]

In the conventional data retrieval described above, a keyword of N characters (N is a natural number) is entered and the identical keyword is detected in the data table. The keywords in the data table vary in length, but in every case at least N character-by-character comparisons are needed for the N characters of the input keyword. Accessing the target keyword therefore takes considerable time, and as a result the time needed to retrieve the target data becomes long.

[Means for solving problems]

FIG. 1 schematically shows the principle configuration of the method of the present invention. In the figure, reference numeral 11 in the data retrieval device 10 denotes a data table, which serves as the database. The data table consists of n data blocks 11-1, 11-2, ..., 11-i, ..., 11-n. All data blocks have the same structure; data block 11-i is shown in detail as an example. Data block 11-i comprises at least an area KEY-i for an ordinary keyword (KEY) and an area DATA-i for the corresponding data (DATA), and in addition an area key-i for a sub-keyword (key), which is the characteristic feature of the present invention.

The sub-keyword key is generated in advance for each keyword KEY by the conversion unit 12 and is placed at the head of each data block (11-i).

When a data retrieval request is made, the desired access keyword KEY is input from the lower right of the figure. It is converted by the conversion unit 12 into an access sub-keyword key, and this key is used to access the data blocks (11-i) in the data table 11.

[Operation]

The conversion unit 12, which generates the sub-keyword key from the keyword KEY, produces under a fixed rule a sub-keyword key whose bit length is shorter than that of the keyword KEY. Detecting a match or mismatch between the access sub-keyword key and the sub-keyword area (key-i) of each data block (11-i) therefore takes less time than detecting a match or mismatch between the access keyword KEY and the keyword area (KEY-i) of each data block (11-i). In other words, the data retrieval time is shortened. The upper and lower conversion units 12 in FIG. 1 may be one and the same unit, or may be separate units provided that both apply the same fixed rule.

Because the sub-keyword key has a shorter bit length than the keyword KEY, the number of data blocks (11-i) that sub-keywords can distinguish is naturally smaller. A single access sub-keyword key may therefore select two or more data blocks. In that case the keywords KEY themselves are compared to identify the desired block.
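The access flow described in this section can be sketched as below. This is only an illustration under stated assumptions: the names `DataBlock`, `build_table`, `lookup` and `byte_sum` are hypothetical, and `byte_sum` is a placeholder conversion (sum of ASCII codes modulo 256) chosen only because it collides easily, not the rule of the patent.

```python
from dataclasses import dataclass


@dataclass
class DataBlock:
    subkey: int   # key-i, stored at the head of the block
    key: str      # KEY-i
    data: str     # DATA-i


def byte_sum(keyword: str) -> int:
    """Placeholder fixed rule (an assumption): sum of ASCII codes mod 256."""
    return sum(keyword.encode("ascii")) & 0xFF


def build_table(entries, convert):
    """Compute each sub-keyword in advance and store it with its block."""
    return [DataBlock(convert(k), k, d) for k, d in entries]


def lookup(table, access_key, convert):
    """Filter by sub-keyword first; compare full keywords only on a collision."""
    access_subkey = convert(access_key)
    candidates = [b for b in table if b.subkey == access_subkey]
    if len(candidates) == 1:
        return candidates[0].data
    for block in candidates:
        if block.key == access_key:
            return block.data
    return None


# Hypothetical entries; EMIKA and AKIME share a sub-keyword under byte_sum.
table = build_table(
    [("EMIKA", "tel-1"), ("AKIME", "tel-2"), ("TARO", "tel-3")], byte_sum
)
print(lookup(table, "AKIME", byte_sum))
print(lookup(table, "TARO", byte_sum))
```

When only one block shares the sub-keyword, the sketch returns its data without comparing the full keyword, which presumes that the access keyword is actually present in the table.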

[Embodiment]

FIGS. 2A and 2B are flowcharts showing one example of the operation of the conversion unit 12, and are one concrete instance of the "fixed rule" mentioned above. Under the rule of this example, the following procedure is taken as one unit and is repeated character by character up to the last character:
(a) the bit string representing the first character of the keyword (KEY) is shifted by 1 bit in a fixed direction;
(b) the 1-bit-shifted bit string, the 1-bit carry that overflowed in that shift, and the bit string representing the second character are added together.
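The procedure above can be sketched in a few lines. This is a hedged reading, not the patent's own implementation: the function name `make_subkey`, the 8-bit width and the left shift direction are assumptions. The text also does not say what happens when the addition itself overflows 8 bits; the sketch assumes that this overflow bit is carried forward together with the shift carry, because that reading reproduces the result BA stated for EMIKA, while discarding it gives B9 for the same input.

```python
def make_subkey(keyword: str, keep_add_overflow: bool = True) -> int:
    """Fold an ASCII keyword into one byte by repeated shift-and-add."""
    shifted = 0   # shifted bit string carried into the first step, preset to 0
    carry = 0     # carry into the first step, preset to 0
    result = 0
    for w in keyword.encode("ascii"):
        total = w + shifted + carry        # step (b): add character, shifted string, carry
        result = total & 0xFF              # addition bit string, kept to 8 bits
        shifted = (result << 1) & 0xFF     # step (a): shift left by one bit
        carry = result >> 7                # bit pushed out by the shift
        if keep_add_overflow:
            carry += total >> 8            # assumption: overflow of the addition is carried too
    return result


print(format(make_subkey("EMIKA"), "02X"))
print(format(make_subkey("EMIKA", keep_add_overflow=False), "02X"))
```

The value after the last addition is returned as the sub-keyword, so the final character is added but not shifted.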

Applied to the earlier example (EMIKA), the operation proceeds as shown in FIG. 3. FIG. 3 illustrates the fixed rule used in the present invention with an actual example. FIG. 4 shows the example of FIG. 3 in simplified form and carries it through to the final result for the keyword (EMIKA), the sub-keyword key (BA in that figure). In FIG. 3, for the first character W1 (E), there is neither an addition bit string (AD0) nor a carry (CR0) to be added, so both AD0 and CR0 are preset to 0. The addition bit string AD1 for the first character W1 is therefore W1 itself.

Next, AD1 is shifted left by 1 bit, giving the carry CR1 (0 in this case) and the shifted bit string AD1'.

CR1 and AD1' are added to the second character W2 (the M of EMIKA) to obtain the addition bit string AD2. AD2 is in turn shifted by 1 bit to obtain AD2', together with the carry CR2 that overflows in that shift (1 in this case).

AD2' and CR2 are added to the third character W3 (the I of EMIKA), and the same operation is repeated in sequence up to the last character (A).

The above procedure is shown in flowchart form in FIGS. 2A and 2B.

Put more plainly (in ASCII codes), the process is as shown in FIG. 4, and the final result BA is obtained as the sub-keyword key for the keyword KEY (EMIKA). Unlike FIG. 3, FIG. 4 shows the addition of the values (AD0, CR0, W1, ...) flowing from left to right and from the upper row to the lower row.
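The intermediate values of the EMIKA walk-through can be reproduced with a short trace. The figures themselves are not available here, so this is a reconstruction from the text only; the function name `trace` and the row layout are hypothetical. As a labeled assumption, a bit that overflows the 8-bit addition is carried into the next step along with the shift carry, since that is the reading under which the final value comes out as BA.

```python
def trace(keyword: str):
    """Return one row (character, AD, AD', CR) per character of the keyword."""
    rows = []
    shifted, carry = 0, 0                    # AD0 (shifted) and CR0 preset to 0
    for w in keyword.encode("ascii"):
        total = w + shifted + carry
        ad = total & 0xFF                    # addition bit string AD_k
        shifted = (ad << 1) & 0xFF           # AD_k' after the 1-bit left shift
        shift_carry = ad >> 7                # CR_k from the shift
        carry = shift_carry + (total >> 8)   # assumption: add-overflow bit is carried too
        rows.append((chr(w), ad, shifted, shift_carry))
    return rows


for ch, ad, ad_shifted, cr in trace("EMIKA"):
    print(f"{ch}: AD={ad:02X}  AD'={ad_shifted:02X}  CR={cr}")
```

The first two rows agree with the carries given in the text (CR1 = 0, CR2 = 1), and the AD value in the last row is the sub-keyword. The shift shown in the last row is computed only for uniformity and is not used.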

The fixed rule described above is only one example, but simulation has confirmed that its match rate is quite low. The match rate here means the number of keywords that are accessed in duplicate by a single sub-keyword; naturally, the smaller this number, the better.

[Effect of the invention]

As described above, whereas a 5-byte keyword, for example, conventionally had to be checked five times, byte by byte (character by character), the present invention converts it under a fixed rule into a 1-byte sub-keyword so that a single check suffices. The corresponding data block can therefore be accessed at high speed. A single sub-keyword key may select two or more keywords KEY, but matching of the original keywords against each other need be performed only in that case.

[Brief description of drawings]

FIG. 1 schematically shows the principle configuration of the method of the present invention; FIGS. 2A and 2B are flowcharts showing one example of the operation of the conversion unit 12; FIG. 3 illustrates the fixed rule used in the present invention with an actual example; and FIG. 4 shows the example of FIG. 3 in simplified form.

10 ... data retrieval device, 11 ... data table, 11-1, 11-2 to 11-i to 11-n ... data blocks, 12 ... conversion unit, KEY ... keyword, key ... sub-keyword, DATA ... data.

Claims

[Claims]

1. A data retrieval device in which a data table (11) having a plurality of data blocks (11-i), each comprising a keyword area (KEY-i) (i is a natural number 1, 2, 3, ...) and a corresponding data area (DATA-i), is accessed externally with an access keyword (KEY) to read out the desired data (DATA) corresponding to that access keyword (KEY), wherein: each data block (11-i) has a structure in which an area (key-i) for a sub-keyword (key) is further added at the head of the data block; each sub-keyword (key) is converted in advance, according to a fixed rule, into a word having a shorter bit length than the corresponding keyword (KEY) and is written into the sub-keyword area (key-i); when the access keyword (KEY) is input, it is converted into the corresponding sub-keyword (key) according to the same fixed rule and each data block (11-i) is then accessed, and the desired data (DATA) is read out from the data area (DATA-i) of the data block (11-i) that has at its head the same sub-keyword (key) as the converted sub-keyword (key); and when there are two or more data blocks (11-i) having the same sub-keyword (key) at their head, the keyword area (KEY-i) of each of those data blocks is accessed, the data block (11-i) having the same keyword (KEY) as the access keyword (KEY) is detected, and the desired data (DATA) is read out.