JPS6393033A

JPS6393033A - Data retrieving system

Info

Publication number: JPS6393033A
Application number: JP61237235A
Authority: JP
Inventors: Emika Suzuki; 鈴木　恵美香
Original assignee: Fujitsu Dai Ichi Communications Software Ltd; Fujitsu Ltd
Current assignee: Fujitsu Dai Ichi Communications Software Ltd; Fujitsu Ltd
Priority date: 1986-10-07
Filing date: 1986-10-07
Publication date: 1988-04-23
Anticipated expiration: 2009-02-16
Also published as: JPH0612550B2

Abstract

PURPOSE: To retrieve data at high speed by generating, for each keyword, a sub-keyword with a shorter bit length, writing it into the corresponding data block, and converting the access keyword into a sub-keyword before accessing the table. CONSTITUTION: A data table 11 of a data retrieving device 10 contains data blocks 11-1 to 11-n. A block 11-i includes an area KEY-i for a normal keyword (KEY), an area DATA-i for the corresponding data (DATA), and an area key-i for a sub-keyword (key). The sub-keyword (key) is generated in advance by a converting part 12 for each KEY and added to the head of the block 11-i. When a data retrieval request is received, the KEY for access is converted by the part 12 into the key for access, which is then used to access each block 11-i. In this way, data can be retrieved at high speed.

Description

[Detailed description of the invention]

[Summary]

When retrieving the data corresponding to each keyword, a sub-keyword is created in advance for every keyword according to a certain rule and attached to that keyword. Each time a request arises to retrieve the data for a specified keyword, the corresponding sub-keyword is calculated from that keyword under the same rule, and the target data is retrieved using the calculated sub-keyword. This shortens the data search time.

[Industrial application field]

The present invention relates to a data search method.

A database is always accessed by specifying a desired keyword. For example, if the database is a telephone directory, the name of the desired person (the keyword) is entered to obtain that person's telephone number (the data). Alternatively, if the database is a library catalog, the desired book title or author name (the keyword) is entered to obtain the corresponding catalog entry (the data). The present invention relates to a data search method using such keywords.

[Conventional technology]

In conventional data search, keywords are compared character by character to detect whether a stored keyword matches the desired keyword. For example, to read from the data table the telephone number (data) corresponding to a person's name (say, "EMIKA") in the example above, EMIKA is first converted character by character, for example into ASCII codes, giving "45", "4D", "49", "4B", and "41".

Then, among the keywords attached to the data items in the data table, the one that matches "45" ... "41" is detected. In this case, each of the five characters is compared one by one. When a match is found, the corresponding data (the telephone number) is retrieved.
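The character-by-character matching described above can be sketched roughly as follows. This is only an illustration in Python: the function name `conventional_lookup`, the table layout (a list of keyword/data pairs), and the telephone numbers are assumptions made for the example and do not come from the patent.

```python
from typing import Optional


def conventional_lookup(table: list[tuple[str, str]], wanted: str) -> Optional[str]:
    """Scan the table, comparing stored keywords to `wanted` one character at a time."""
    wanted_codes = wanted.encode("ascii")  # "EMIKA" -> 45 4D 49 4B 41 (hex)
    for stored, data in table:
        stored_codes = stored.encode("ascii")
        if len(stored_codes) != len(wanted_codes):
            continue
        # One comparison per character; all() stops at the first mismatch.
        if all(a == b for a, b in zip(stored_codes, wanted_codes)):
            return data
    return None


# Hypothetical directory entries, for illustration only.
table = [("EMIKO", "03-0000-0001"), ("EMIKA", "03-0000-0002")]
codes = [f"{c:02X}" for c in "EMIKA".encode("ascii")]
print(codes)                                 # ['45', '4D', '49', '4B', '41']
print(conventional_lookup(table, "EMIKA"))   # 03-0000-0002
```

Note that "EMIKO" is rejected only at its fifth character, which is the kind of cost the invention sets out to reduce.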

[Problem that the invention seeks to solve]

In the conventional data search described above, a keyword of N characters (N is a natural number) is input and the identical keyword is detected in the data table. The number of characters in the keywords in the data table varies from keyword to keyword, but in any case at least N character-by-character comparison operations are required for the N characters of the input keyword. Accessing the target keyword therefore takes considerable time, and as a result it takes a long time to retrieve the target data.

[Means for solving problems]

FIG. 1 is a diagram schematically showing the principle configuration of the system of the present invention. In the figure, reference numeral 11 in the data search device 10 denotes a data table, which in effect constitutes a database. The data table consists of n data blocks 11-1, 11-2, ..., 11-i, ..., 11-n. All data blocks have the same structure; data block 11-i is shown in detail as an example. The data block 11-i comprises at least an area KEY-i for a normal keyword (KEY) and an area DATA-i for the corresponding data (DATA); in addition, an area key-i for a sub-keyword (key), which is the characteristic feature of the present invention, is added.

This sub-keyword key is generated in advance for each keyword KEY by the conversion unit 12 and is added to the head of each data block (11-i).

When a data search request is made, the desired keyword KEY for access is input from the lower right of the figure. It is converted by the conversion unit 12 into a sub-keyword key for access, and this key is used to access each data block (11-i) in the data table 11.
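The block layout and the two uses of the conversion unit (once when the table is built, once per search request) might be modelled as in the sketch below. The names `DataBlock`, `build_table`, `candidates`, and `toy_convert` are hypothetical, and `toy_convert` is a stand-in rule chosen only to keep the example short; it is not the rule described in this patent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DataBlock:
    key: int   # sub-keyword area key-i, stored at the head of the block
    KEY: str   # keyword area KEY-i
    DATA: str  # data area DATA-i


def toy_convert(KEY: str) -> int:
    """Placeholder conversion rule (an assumption): ASCII sum modulo 256."""
    return sum(KEY.encode("ascii")) % 256


def build_table(entries: list[tuple[str, str]],
                convert: Callable[[str], int]) -> list[DataBlock]:
    # The sub-keyword is generated in advance, once per keyword.
    return [DataBlock(convert(KEY), KEY, DATA) for KEY, DATA in entries]


def candidates(table: list[DataBlock], access_KEY: str,
               convert: Callable[[str], int]) -> list[DataBlock]:
    # The access keyword goes through the same rule, then only the short
    # sub-keyword areas are compared.
    access_key = convert(access_KEY)
    return [block for block in table if block.key == access_key]


table = build_table([("EMIKA", "data for EMIKA"), ("TARO", "data for TARO")],
                    toy_convert)
hits = candidates(table, "EMIKA", toy_convert)
print([block.DATA for block in hits])  # ['data for EMIKA']
```

Passing the rule in as a parameter mirrors the remark that the two conversion units only need to apply the same rule.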

[Operation]

The conversion unit 12, which generates a sub-keyword key from a keyword KEY, produces under a certain rule a sub-keyword key whose bit length is shorter than that of the keyword KEY. Consequently, detecting a match or mismatch between the access sub-keyword key and the sub-keyword area (key-i) of each data block (11-i) takes less time than detecting a match or mismatch between the access keyword KEY and the keyword area (KEY-i) of each data block (11-i).

In other words, the data search time is shortened. The upper conversion unit 12 and the lower conversion unit 12 in FIG. 1 may be one and the same unit, or they may be separate units as long as both apply the same rule.

Because the bit length of the sub-keyword key is shorter than that of the keyword KEY, the number of data blocks (11-i) that can be distinguished by the sub-keyword key is naturally smaller. A single access sub-keyword key may therefore select two or more data blocks. In that case, the keywords KEY themselves are additionally compared to identify the desired block.
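A minimal sketch of this two-stage lookup is given below, assuming blocks are stored as `(key, KEY, DATA)` tuples. The function names and the placeholder rule `toy_convert` (ASCII sum modulo 256, chosen because anagrams such as "AMI" and "MIA" collide under it) are assumptions for illustration, not the patent's own rule.

```python
from typing import Callable, Optional

Block = tuple[int, str, str]  # (sub-keyword key-i, keyword KEY-i, data DATA-i)


def toy_convert(KEY: str) -> int:
    """Placeholder rule (an assumption): ASCII sum modulo 256."""
    return sum(KEY.encode("ascii")) % 256


def retrieve(table: list[Block], access_KEY: str,
             convert: Callable[[str], int]) -> Optional[str]:
    access_key = convert(access_KEY)
    # Stage 1: compare only the short sub-keyword areas.
    hits = [block for block in table if block[0] == access_key]
    if not hits:
        return None
    if len(hits) == 1:
        return hits[0][2]
    # Stage 2: several blocks share the sub-keyword, so compare full keywords.
    for _, KEY, DATA in hits:
        if KEY == access_KEY:
            return DATA
    return None


entries = [("AMI", "data for AMI"), ("MIA", "data for MIA"), ("TARO", "data for TARO")]
table = [(toy_convert(KEY), KEY, DATA) for KEY, DATA in entries]
print(retrieve(table, "MIA", toy_convert))   # data for MIA
print(retrieve(table, "TARO", toy_convert))  # data for TARO
```

As in the text, the full keyword is consulted only when more than one block shares the sub-keyword. A system that must also reject keywords absent from the table would presumably verify KEY in the single-hit case as well; that is a design consideration outside what the text describes.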

[Example]

FIGS. 2A and 2B are flowcharts showing an example of the operation of the conversion unit 12, and constitute a specific example of the "certain rule" mentioned above. In this example, the rule takes the following operating procedure as one unit and repeats it for each character up to the final character: (a) the bit string representing the first character of the keyword (KEY) is shifted by 1 bit in a fixed direction, and (b) the bit string obtained by the 1-bit shift, the 1-bit carry that overflowed due to that shift, and the bit string representing the second character are added together.
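One way to read steps (a) and (b) as code is sketched below. Several points are assumptions rather than statements from the text: characters are taken as 8-bit ASCII codes, the accumulator is 8 bits wide, and the shift is a left shift. The text also does not say what happens when the addition in step (b) itself exceeds 8 bits, so the hypothetical flag `fold_add_overflow` covers both readings: carrying that overflow into the next addition, or discarding it.

```python
def sub_keyword(KEY: str, fold_add_overflow: bool = True) -> int:
    """Shift-and-add rule over the characters of KEY, returning an 8-bit value.

    Assumptions: 8-bit ASCII characters, 8-bit accumulator, left shift.
    fold_add_overflow=True also carries an overflow of the addition forward;
    False keeps only the bit pushed out by the shift.
    """
    shifted = 0  # previous sum after the 1-bit shift
    carry = 0    # carry brought into the next addition
    total = 0
    for code in KEY.encode("ascii"):
        raw = code + shifted + carry       # step (b): add shifted bits, carry, next char
        total = raw & 0xFF                 # keep 8 bits
        add_overflow = raw >> 8            # 1 if the addition exceeded 8 bits
        shift_carry = total >> 7           # step (a): bit pushed out by the left shift
        shifted = (total << 1) & 0xFF
        carry = (shift_carry | add_overflow) if fold_add_overflow else shift_carry
    return total                           # sum after the final character


print(f"{sub_keyword('E'):02X}")   # 45
print(f"{sub_keyword('EM'):02X}")  # D7
```

Whatever the keyword length, the result fits in one byte, so comparing sub-keywords costs a single byte comparison.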

For the example above (EMIKA), the operations shown in FIG. 3 are performed. FIG. 3 is a diagram illustrating the rule used in the present invention by means of an actual example.

FIG. 4 is a simplified representation of the example of FIG. 3, and additionally shows the sub-keyword key (BA in the figure) that is the final result for the keyword (EMIKA). In FIG. 3, for the first character W1 (the first character, E), there is neither an addition bit string (AD0) nor a carry (CR0) to be added, so both AD0 and CR0 are preset to 0. The addition bit string AD1 for the first character W1 is therefore the first character W1 itself.

Next, AD1 is shifted by 1 bit (left shift), yielding the carry CR1 (0 in this case) and the 1-bit-shifted bit string AD1'.

CR1 and AD1' are added to the second character W2 (the M of EMIKA) to obtain the addition bit string AD2.

AD2 is in turn shifted by 1 bit to obtain AD2', together with the carry CR2 that overflows at this point (1 in this case).

AD2' and CR2 are added to the third character W3 (the I of EMIKA), and the same operation is repeated in sequence up to the final character (A).

The above operating procedure is shown in flowchart form in FIGS. 2A and 2B.

Shown more plainly (using ASCII codes), the procedure is as illustrated in FIG. 4, and the final result BA is obtained as the sub-keyword key for the keyword KEY (EMIKA). Unlike FIG. 3, FIG. 4 shows the addition of the values (AD0, CR0, W1, ...) flowing from left to right and from the top row to the bottom row.
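The EMIKA walk-through can be reproduced step by step with the hedged sketch below, which prints the sum, the carry, and the shifted sum for each character. It assumes 8-bit ASCII codes, an 8-bit accumulator, and a left shift. The first two rows agree with the values stated in the text (carry 0 after E, carry 1 after M). The final value comes out as BA only if an overflow of the addition itself is also carried into the next step; the text does not state this, so it is exposed as the hypothetical flag `fold_add_overflow`. With that overflow discarded, the same trace ends in B9.

```python
def trace(KEY: str, fold_add_overflow: bool = True) -> list[tuple[str, int, int, int]]:
    """Return one (character, sum, carry, shifted sum) row per character of KEY."""
    rows = []
    shifted, carry = 0, 0                  # initial addition string and carry preset to 0
    for ch in KEY:
        raw = ord(ch) + shifted + carry    # add character, shifted sum, and carry
        ad = raw & 0xFF                    # 8-bit sum for this character
        cr = ad >> 7                       # bit pushed out by the left shift
        if fold_add_overflow:
            cr |= raw >> 8                 # assumption: addition overflow is carried too
        shifted = (ad << 1) & 0xFF         # 1-bit left shift, kept to 8 bits
        carry = cr
        rows.append((ch, ad, cr, shifted))
    return rows


rows = trace("EMIKA")
for ch, ad, cr, ad_shifted in rows:
    print(f"{ch}: AD={ad:02X} CR={cr} AD'={ad_shifted:02X}")
# E: AD=45 CR=0 AD'=8A
# M: AD=D7 CR=1 AD'=AE
# I: AD=F8 CR=1 AD'=F0
# K: AD=3C CR=1 AD'=78
# A: AD=BA CR=1 AD'=74

# The sub-keyword is the sum in the last row; its carry and shift are unused.
print(f"{rows[-1][1]:02X}")  # BA
```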

The rule described above is only one example, but simulation has confirmed that its match rate is quite low.

The match rate here is the number of keywords that are accessed in duplicate by a single sub-keyword; naturally, the smaller this number, the better.
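The match rate defined above could be estimated for a given keyword set along the following lines. This is a sketch, not the simulation the text refers to: the keyword list is made up, the helper names are hypothetical, and `sub_keyword` is the 8-bit shift-and-add reading of the rule with overflow from the addition carried forward (an assumption).

```python
from collections import Counter


def sub_keyword(KEY: str) -> int:
    """8-bit shift-and-add rule; carrying the addition overflow is an assumption."""
    shifted, carry, total = 0, 0, 0
    for code in KEY.encode("ascii"):
        raw = code + shifted + carry
        total = raw & 0xFF
        carry = (total >> 7) | (raw >> 8)
        shifted = (total << 1) & 0xFF
    return total


def bucket_sizes(keywords: list[str]) -> Counter:
    """Map each sub-keyword to the number of distinct keywords sharing it."""
    return Counter(sub_keyword(KEY) for KEY in set(keywords))


# Hypothetical keyword set, for illustration only.
names = ["EMIKA", "EMIKO", "TARO", "JIRO", "HANAKO", "KENJI"]
sizes = bucket_sizes(names)
worst = max(sizes.values())               # most keywords behind one sub-keyword
average = len(set(names)) / len(sizes)    # mean keywords per used sub-keyword
print(f"worst={worst} average={average:.2f}")
```

With a 1-byte sub-keyword there are only 256 possible values, so any table with more than 256 keywords must contain duplicates; the figure of interest is how evenly the rule spreads them.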

[Effect of the invention]

As explained above, according to the present invention, whereas conventionally a 5-byte keyword, for example, was checked five times, byte by byte (character by character), the keyword is converted by a certain rule into a 1-byte sub-keyword so that a single check suffices. The corresponding data block can therefore be accessed at high speed.

In this case, one sub-keyword key may select two or more keywords KEY in duplicate, but matching between the original keywords needs to be performed only when that happens.

[Brief explanation of the drawing]

FIG. 1 is a diagram schematically showing the principle configuration of the system of the present invention; FIGS. 2A and 2B are flowcharts showing an example of the operation of the conversion unit 12; FIG. 3 is a diagram illustrating the rule used in the present invention by means of an actual example; and FIG. 4 is a simplified representation of the example of FIG. 3.

10: data search device; 11: data table; 11-1, 11-2 to 11-i to 11-n: data blocks; 12: conversion unit; KEY: keyword; key: sub-keyword; DATA: data.

Claims

[Claims]

1. A data search method for retrieving, by means of an access keyword (KEY), the corresponding data (DATA) from a data table (11) comprising a plurality of data blocks (11-i), each including a keyword area (KEY-i) and a data area (DATA-i) corresponding to the keyword area (KEY-i), characterized in that a sub-keyword (key) having a bit length shorter than that of each keyword (KEY) is generated in advance for each keyword (KEY) under a certain rule and written into a sub-keyword area (key-i) of each corresponding data block (11-i), and, when the access keyword (KEY) is input, it is converted into the corresponding sub-keyword (key) under said certain rule, the sub-keyword area (key-i) of each data block (11-i) is accessed, and the desired data (DATA) is read from the data area (DATA-i) corresponding to that sub-keyword area (key-i).

2. The data search method according to claim 1, wherein, when there are two or more data blocks (11-i) having the same sub-keyword (key), the keyword area (KEY-i) in those data blocks is further accessed and the desired data (DATA) is identified using the access keyword (KEY).

3. The data search method according to claim 1, wherein said certain rule comprises an operating procedure of: (a) shifting the bit string representing the first character of the keyword (KEY) by 1 bit in a certain direction; (b) adding the bit string obtained by the 1-bit shift, the 1-bit carry that overflowed due to that shift, and the bit string representing the second character of the keyword (KEY) to obtain an addition bit string; and (c) sequentially repeating, on the addition bit string, the same operations as in (a) and (b) above from the third character to the final character.