JP2005135221A

JP2005135221A - Method and device for joining spreadsheet data and program

Info

Publication number: JP2005135221A
Application number: JP2003371624A
Authority: JP
Inventors: Shinji Kosho; 晋二古庄
Original assignee: TURBO DATA LAB KK; Turbo Data Laboratories Inc
Current assignee: TURBO DATA LAB KK; Turbo Data Laboratories Inc
Priority date: 2003-10-31
Filing date: 2003-10-31
Publication date: 2005-05-26
Also published as: WO2005043409A1

Abstract

PROBLEM TO BE SOLVED: To join a plurality of spreadsheet data without actually creating a join table.

SOLUTION: First and second spreadsheet data, each represented as a sequence of records including item values corresponding to information items, are prepared. An information item belonging to both spreadsheet data is determined as a key item. The key item value corresponding to the key item is obtained for a record to be processed in the first spreadsheet data. From the records included in the second spreadsheet data, a group of records is identified whose item values for the key item meet a predetermined matching requirement with respect to that key item value. The pair of the record to be processed and the record group is output, thereby creating a virtual join table.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to a data processing method and a data processing apparatus for processing large amounts of data using an information processing apparatus such as a computer, and more particularly to a method, an apparatus, and a program for joining a plurality of tabular data.

Databases are used for a variety of purposes, and in medium- to large-scale systems relational databases (RDBs), which can exclude logical inconsistencies, are the mainstream. For example, RDBs are used in systems such as airline seat reservation. In such a system, specifying a key item makes it possible to retrieve the target (in most cases a single record) quickly, or to confirm, cancel, or change a reservation. Moreover, because each flight has at most a few hundred seats, the number of vacant seats on a particular flight can also be obtained.

However, performing a particular calculation (for example, computing the boarding rate) with such an RDB by year, by day of the week, by month, by route, by time of day, or by aircraft type is known to take a very long time. That is, while an RDB excels at carrying out processing without inconsistency, its performance in searching, aggregating, or sorting a substantial number of records is low.

For this reason, it has become common in recent years to build into the system, separately from the RDB, a database called a data warehouse (DWH) for searching and aggregation. That is, an extremely large database with specific data formats and data item names is built to suit a specific purpose of the end user, and the end user uses it to perform specific searches and aggregations.

However, providing a DWH in addition to the RDB, that is, providing a plurality of databases, departs from the original purpose for which databases, and RDBs in particular, were devised, namely that data can be managed centrally. This gives rise to various problems, for example the following.

(1) Because a DWH is fixed, it is difficult to search or aggregate on items other than those provided in the DWH in advance.
(2) Because a fixed DWH is provided in addition to the RDB, the data volume becomes extremely large, and updates to the RDB and the like cannot be accommodated.

To solve these problems, it has been proposed to join a plurality of tabular data (tables). Joining tabular data means designating, in each of a plurality of tabular data, an item that serves as a join key and, when the keys satisfy a certain relationship, associating the records contained in the respective tabular data with one another to create new tabular data (a join table). Such a join of tabular data is an operation required in various situations when handling the data of a large-scale system.
Patent Document 1: JP 7-253991 A

However, because such a join of a plurality of tabular data is a logical product (AND) operation, explosively large join tables are frequently generated, which causes shortages of memory and disk space. In addition, because the newly generated join table has no index, such as a B-Tree or a bitmap index, that favors searching, it is difficult to search or sort it at high speed. The technique described in Patent Document 1 also creates an actually merged table, so a huge table must inevitably be created.

An object of the present invention is to provide a method and an apparatus for joining a plurality of tabular data without creating a join table, which may become explosively large and is not suited to high-speed searching and sorting.
A further object of the present invention is to provide a search method and a sort method for a plurality of tabular data based on the above method for joining a plurality of tabular data.

The principle of the present invention that achieves the above objects is to create a join table virtually by means of searching, rather than to actually create one.
The objects of the present invention are achieved by a method of joining tabular data, comprising:
a procedure for preparing first tabular data and second tabular data, each represented as an array of records including item values corresponding to items of information, and determining, as a key item, an item of information belonging in common to the first tabular data and the second tabular data;
a procedure for obtaining the key item value corresponding to the key item of a record to be processed included in the first tabular data;
a procedure for identifying, from the records included in the second tabular data, a record group including item values that, with respect to the key item, satisfy a predetermined matching condition with the key item value; and
a procedure for outputting a pair of the record to be processed included in the first tabular data and the record group included in the second tabular data.

According to a preferred embodiment of the present invention, the second tabular data is tabular data in which an index is assigned to the key item, and the procedure for identifying the record group identifies the record group by a search using the index assigned to the key item of the second tabular data.

According to a preferred embodiment of the present invention, the method of joining tabular data further comprises, before the procedure for identifying the record group, a procedure for generating, for each item value corresponding to the key item in the second tabular data, index data that designates, from that item value, the records including that item value. The procedure for identifying the record group then identifies the record group by a search using the index data of the second tabular data.

According to a preferred embodiment of the present invention, the procedure for generating the index data includes a procedure for sorting the index data based on the item values corresponding to the key item values.
According to a preferred embodiment of the present invention, the procedure for generating the index data includes a procedure for assigning an index to the item value portion of the index data generated for each item value corresponding to the key item.

According to a preferred embodiment of the present invention, the index data is a pointer to a record.
According to a preferred embodiment of the present invention, the index data is a record number.
According to a preferred embodiment of the present invention, the index data is associated with the address in memory at which a record is stored.

The method of joining tabular data of the present invention can be used to search a join table. To that end, the first tabular data is tabular data prepared by retrieving, from initial tabular data represented as an array of records including item values corresponding to items of information, the records including item values that meet a search condition. According to the present invention, searching the first tabular data in advance yields the same result as searching the join table.

By using this principle, a search of a general join table can be replaced by a search of a table before the join followed by a join of the tables. Accordingly, the method of searching tabular data according to the present invention comprises:
a procedure for preparing first tabular data by retrieving, from initial tabular data represented as an array of records including item values corresponding to items of information, the records including item values that meet a search condition, and for preparing second tabular data represented as an array of records including item values corresponding to items of information;
a procedure for determining, as a key item, an item of information belonging in common to the first tabular data and the second tabular data;
a procedure for obtaining the key item value corresponding to the key item of a record to be processed included in the first tabular data;
a procedure for identifying, from the records included in the second tabular data, a record group including item values that, with respect to the key item, satisfy a predetermined matching condition with the key item value; and
a procedure for generating a pair of the record to be processed included in the first tabular data and the record group included in the second tabular data.

Furthermore, the method of joining tabular data of the present invention can be used to sort a join table. To that end, the first tabular data is tabular data prepared by rearranging, with respect to a predetermined item, the records of initial tabular data represented as an array of records including item values corresponding to items of information. According to the present invention, sorting the first tabular data in advance yields the same result as sorting the join table.

By using this principle, a sort of a general join table can be replaced by a sort of a table before the join followed by a join of the tables. Accordingly, the method of sorting tabular data according to the present invention comprises:
a procedure for preparing first tabular data by rearranging, with respect to a predetermined item, the records of initial tabular data represented as an array of records including item values corresponding to items of information, and for preparing second tabular data represented as an array of records including item values corresponding to items of information;
a procedure for determining, as a key item, an item of information belonging in common to the first tabular data and the second tabular data;
a procedure for obtaining the key item value corresponding to the key item of a record to be processed included in the first tabular data;
a procedure for identifying, from the records included in the second tabular data, a record group including item values that, with respect to the key item, satisfy a predetermined matching condition with the key item value; and
a procedure for generating a pair of the record to be processed included in the first tabular data and the record group included in the second tabular data.

The objects of the present invention can also be achieved by a joining apparatus that carries out the above method of joining tabular data. The apparatus for joining tabular data according to the present invention comprises:
means for preparing first tabular data and second tabular data, each represented as an array of records including item values corresponding to items of information, and determining, as a key item, an item of information belonging in common to the first tabular data and the second tabular data;
means for obtaining the key item value corresponding to the key item of a record to be processed included in the first tabular data;
means for identifying, from the records included in the second tabular data, a record group including item values that, with respect to the key item, satisfy a predetermined matching condition with the key item value; and
means for outputting a pair of the record to be processed included in the first tabular data and the record group included in the second tabular data.

The objects of the present invention are also achieved by a program for causing a computer to execute the above procedures of the method of joining tabular data of the present invention.

According to the present invention, a plurality of tables are joined logically and a huge join table is never actually generated, so the consumption of memory and disk storage is reduced. Furthermore, according to the present invention, a search or sort of the virtual join table can be carried out by searching or sorting the tables before the join, to which indexes and the like are assigned, and then joining the plurality of tables logically.

Embodiments of the present invention are described below with reference to the accompanying drawings. FIG. 1 is a block diagram showing the hardware configuration of a computer system capable of realizing the join, search, and sort of tabular data according to an embodiment of the present invention. As shown in FIG. 1, the computer system 10 has the same configuration as an ordinary computer and comprises: a CPU 12 that controls the entire system and its individual components by executing programs; a RAM (Random Access Memory) 14 that stores work data and the like; a ROM (Read Only Memory) 16 that stores programs and the like; a fixed storage medium 18 such as a hard disk; a CD-ROM driver 20 for accessing a CD-ROM 19; an interface (I/F) 22 provided between the system and the CD-ROM driver 20 or an external terminal connected to an external network (not shown); an input device 24 comprising a keyboard and a mouse; and a CRT display device 26. The CPU 12, RAM 14, ROM 16, external storage medium 18, I/F 22, input device 24, and display device 26 are interconnected via a bus 28.

The program for joining tabular data, the program for creating a table (view) of predetermined items from the joined tabular data, the search program, and the sort program according to this embodiment may be stored on the CD-ROM 19 and read by the CD-ROM driver 20, or may be stored in the ROM 16 in advance. Programs once read from the CD-ROM 19 may also be stored in a predetermined area of the external storage medium 18. Alternatively, the programs may be supplied from outside via a network (not shown), the external terminal, and the I/F 22.

Next, the join of a plurality of tabular data is described. In the following description, tabular data is simply called a “table”. A virtual join according to the present invention means designating, in each of a plurality of tables, an item that serves as a join key and, when the keys satisfy a predetermined relationship, associating the records of the respective tables with one another to create a new table virtually. FIG. 2 is an explanatory diagram of a method for virtually creating a new join table from two tables, a baseball club member table and a game schedule table. The baseball club member table consists of three records having “member name” and “support team” as items. For example, in the first record the item value for the item “member name” is “TANAKA” and the item value for the item “support team” is “Giants”. The game schedule table consists of four records having “ball team” and “game date” as items. For example, in the first record the item value for the item “ball team” is “Tigers” and the item value for the item “game date” is “May 10”.
Although the item “support team” of the baseball club member table and the item “ball team” of the game schedule table have different item names, the sets of item values corresponding to these items contain common elements. Therefore, the item “support team” of the baseball club member table and the item “ball team” of the game schedule table are determined as the key item common to both tables, and joining the records whose key item values match generates the virtual table shown in FIG. 2.
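As a rough illustration, the two tables of FIG. 2 could be held in memory as lists of records. In this sketch only the values quoted in the description (TANAKA, Giants, Tigers, Dragons, May 10, and the Giants entries in records 1 and 3 of the schedule) come from the text; the remaining member names and dates, and the Python field names, are hypothetical placeholders.

```python
# First table: baseball club members (3 records).
members = [
    {"member_name": "TANAKA", "support_team": "Giants"},   # record 0, from the text
    {"member_name": "SUZUKI", "support_team": "Tigers"},   # hypothetical
    {"member_name": "SATO",   "support_team": "Dragons"},  # hypothetical
]

# Second table: game schedule (4 records).
schedule = [
    {"ball_team": "Tigers",  "game_date": "May 10"},  # record 0, from the text
    {"ball_team": "Giants",  "game_date": "May 11"},  # record 1, date hypothetical
    {"ball_team": "Dragons", "game_date": "May 12"},  # record 2, date hypothetical
    {"ball_team": "Giants",  "game_date": "May 13"},  # record 3, date hypothetical
]

# The item names differ, but the sets of item values overlap,
# so "support_team" and "ball_team" can serve as the common key item.
shared_values = ({r["support_team"] for r in members}
                 & {r["ball_team"] for r in schedule})
print(sorted(shared_values))
```

Checking that the two value sets intersect is only one possible way to confirm that two differently named items can act as a key item; the description itself leaves the choice of key item to the user.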

For example, when the game schedule table is searched for records whose “ball team” item value matches “Giants”, the “support team” item value of the first record (= record 0) of the baseball club member table, the record in the second row (= record 1) and the record in the fourth row (= record 3) are found. Record 0 of the baseball club member table is therefore associated with two records of the game schedule table, record 1 and record 3. The records newly (virtually) generated by this association are shown as record 0 and record 1 of the virtually generated table.

Here, a virtual table means a table that is not actually expanded in memory but whose records are associated only logically. With the conventional join method, the join table was actually generated and stored in memory or on a disk device.

In the example of FIG. 2 the predetermined relationship that the keys satisfy is a “match” relationship, but the predetermined relationship in the present invention is not limited to a match; it may also be an order relationship such as “<”, “>”, “<=”, or “>=”.
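The choice of relationship can be sketched as a pluggable comparison. In the following minimal example the function name `matching_group`, the table of operators, and the numeric key values are assumptions made for illustration; in particular, the direction of comparison (item value of the second table on the left, key item value on the right) is one possible reading, since the description does not fix it.

```python
import operator

# Supported matching conditions mapped to standard comparison functions.
MATCHERS = {
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}

def matching_group(key_value, second_keys, condition="=="):
    """Return the record numbers n for which second_keys[n] <condition> key_value."""
    match = MATCHERS[condition]
    return [n for n, value in enumerate(second_keys) if match(value, key_value)]

second_keys = [10, 20, 30, 20]  # hypothetical key item values of the second table
print(matching_group(20, second_keys, "=="))
print(matching_group(20, second_keys, "<"))
print(matching_group(20, second_keys, ">="))
```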

In the following, the method of joining tabular data according to the present invention is described on the basis of the example of FIG. 2, the virtual join of the baseball club member table (= first table) and the game schedule table (= second table). FIG. 3 is a flowchart of an embodiment of the method of joining tabular data of the present invention. In step 301, the first table and the second table are prepared and the key item is determined. For example, the first table and the second table are expanded in memory, and “support team” and “ball team” are designated as the key item. Next, in step 302, the key item value of the record to be processed in the first table is obtained. For example, the record in the first row of the first table is taken as the record to be processed, and the item value “Giants” of the “support team” item is extracted. Then, in step 303, the record group satisfying the predetermined matching condition is identified from the records of the second table. For example, the “ball team” item of the second table is searched for records having the item value “Giants”, and the record group consisting of record 1 and record 3 is identified. Finally, in step 304, the pair of the record to be processed in the first table and the record group in the second table is output. In this embodiment, the pair of the record to be processed and the record group is output, for example, to the display device 26 of FIG. 1, or to an external device via the I/F 22 and the external terminal.
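Steps 302 to 304 can be sketched as a generator that yields one pair at a time, so that no join table is ever built. This is a minimal sketch under stated assumptions rather than the patented implementation: the function name `virtual_join` and the field names are hypothetical, the record group is represented by record numbers, and table values not quoted in the description are placeholders.

```python
def virtual_join(first, second, key1, key2):
    """Lazily yield (record to be processed, matching record numbers)."""
    for record in first:                       # advance the record to be processed
        key_value = record[key1]               # step 302: obtain the key item value
        group = [n for n, other in enumerate(second)
                 if other[key2] == key_value]  # step 303: identify the record group
        yield record, group                    # step 304: output the pair

# Step 301: prepare both tables and choose the key items.
members = [
    {"member_name": "TANAKA", "support_team": "Giants"},
    {"member_name": "SUZUKI", "support_team": "Tigers"},   # hypothetical
    {"member_name": "SATO", "support_team": "Dragons"},    # hypothetical
]
schedule = [
    {"ball_team": "Tigers", "game_date": "May 10"},
    {"ball_team": "Giants", "game_date": "May 11"},    # date hypothetical
    {"ball_team": "Dragons", "game_date": "May 12"},   # date hypothetical
    {"ball_team": "Giants", "game_date": "May 13"},    # date hypothetical
]

pairs = list(virtual_join(members, schedule, "support_team", "ball_team"))
for record, group in pairs:
    print(record["member_name"], group)
```

The linear scan in step 303 is the simplest choice and costs one pass over the second table per record of the first table.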

In this embodiment, advancing the record to be processed through the first table one record at a time makes it possible to output the pairs with their record groups one after another. FIG. 4 is an explanatory diagram of the process of sequentially outputting these pairs. In the figure, (A) shows the pair of record 0 of the first table and the corresponding record group of the second table, (B) shows the pair of record 1 of the first table and the corresponding record group of the second table, and (C) shows the pair of record 2 of the first table and the corresponding record group of the second table. This completes the virtual join table.
Thus, according to this embodiment, the first table and the second table can be joined logically and the join result presented without actually creating a join table.

In step 303 of FIG. 3, records having a particular item value in the “ball team” item are searched for in the second table in order to identify the record group satisfying the predetermined matching condition. To speed up this search, according to a preferred embodiment of the present invention, an index such as a B-Tree or a bitmap index is assigned to the “ball team” item of the second table. This index may be assigned in advance when the second table is prepared, or it may be assigned after the key item has been determined. With the index assigned in this way, step 303 can identify the record group by a search using the index assigned to the “ball team” item of the second table.
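The effect of indexing the key item can be illustrated with a much simpler structure than the B-Tree or bitmap index named above. The sketch below swaps in a plain hash index (a Python dictionary) as a stand-in; `build_index` and the field names are hypothetical, and the dates other than May 10 are placeholders. Note that a hash index serves only the “match” condition, whereas an ordered index such as a B-Tree would also serve order relationships.

```python
from collections import defaultdict

def build_index(table, key):
    """Map each item value of `key` to the record numbers that contain it."""
    index = defaultdict(list)
    for n, record in enumerate(table):
        index[record[key]].append(n)
    return index

schedule = [
    {"ball_team": "Tigers", "game_date": "May 10"},
    {"ball_team": "Giants", "game_date": "May 11"},    # date hypothetical
    {"ball_team": "Dragons", "game_date": "May 12"},   # date hypothetical
    {"ball_team": "Giants", "game_date": "May 13"},    # date hypothetical
]

team_index = build_index(schedule, "ball_team")

# Step 303 becomes a single lookup instead of a scan of the second table.
giants_group = team_index.get("Giants", [])
print(giants_group)
```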

Alternatively, in another embodiment of the present invention, in order to search the “ball team” item of the second table at high speed, index data that designates, from each item value of the “ball team” item, the records including that item value is generated in advance, and the record group is identified by a search using the generated index data. To speed up the search further, this index data is sorted in advance on the item values of the key item, or an index is assigned to the item value portion of the index data.

FIG. 5 is an explanatory diagram of an example, according to another embodiment of the present invention, in which pointers to records are used as the index data for speeding up the search for the record group. A list of pointers that point, from each value of the “ball team” item of the second table, for example [Dragons], [Giants], and [Tigers], to the records containing that value is prepared as a pointer list sorted on the value of the “ball team” item. Alternatively, an index may be assigned to the “ball team” value portion of this list of pointers to records.
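A sorted pointer list of this kind can be searched by binary search. The following is a minimal sketch, assuming that a Python object reference plays the role of the pointer; the names `pointer_list` and `lookup`, the field names, and the dates other than May 10 are hypothetical.

```python
from bisect import bisect_left, bisect_right

schedule = [
    {"ball_team": "Tigers", "game_date": "May 10"},
    {"ball_team": "Giants", "game_date": "May 11"},    # date hypothetical
    {"ball_team": "Dragons", "game_date": "May 12"},   # date hypothetical
    {"ball_team": "Giants", "game_date": "May 13"},    # date hypothetical
]

# Index data: (item value, reference to the record), sorted on the item value.
# sorted() is stable, so records with equal values keep their table order.
pointer_list = sorted(((rec["ball_team"], rec) for rec in schedule),
                      key=lambda entry: entry[0])
values = [value for value, _ in pointer_list]

def lookup(value):
    """Binary-search the sorted values and follow the pointers of the matching run."""
    lo = bisect_left(values, value)
    hi = bisect_right(values, value)
    return [rec for _, rec in pointer_list[lo:hi]]

print(values)
print([rec["game_date"] for rec in lookup("Giants")])
```

Because the list is ordered, the same structure can also answer order relationships such as “<” or “>=” by slicing from one end of the list up to the position returned by the binary search.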

In another embodiment of the present invention, record numbers are used as the index data. In the example of FIG. 2, the record number of the record in the first row of the second table is 0, the record number of the record in the second row is 1, and so on. Index data based on record numbers can be sorted or given an index in the same way as the index data based on pointers to records described above.

In yet another embodiment of the present invention, values associated with the addresses in memory at which records are stored are used as the index data. For example, if each record starts at every 1000th address, the values 1000, 2000, 3000, and so on may be used to identify record 0, record 1, record 2, and so on. Index data based on these values can also be sorted or given an index in the same way as the index data based on pointers to records described above.
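The address-based variant can be sketched as a simple conversion between record numbers and address-like values. In this sketch the helper names and the constant `RECORD_SIZE` are assumptions, and the values are simulated integers rather than real memory addresses; the mapping follows the example in the text, where 1000 identifies record 0 and 2000 identifies record 1.

```python
RECORD_SIZE = 1000  # each record starts at a multiple of 1000, as in the example

def address_of(record_number):
    """Record 0 -> 1000, record 1 -> 2000, and so on."""
    return (record_number + 1) * RECORD_SIZE

def record_number_of(address):
    """Inverse of address_of."""
    return address // RECORD_SIZE - 1

ball_teams = ["Tigers", "Giants", "Dragons", "Giants"]  # key item of the second table

# Index data: item value -> address-like values of the records containing it.
index_data = {}
for n, team in enumerate(ball_teams):
    index_data.setdefault(team, []).append(address_of(n))

print(index_data["Giants"])
print([record_number_of(a) for a in index_data["Giants"]])
```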

The method of joining tabular data according to the present invention can be extended to searching and sorting a join table. As described above, in a conventional join that actually creates a join table, even if indexes are assigned to the tables being joined, those indexes are not reflected in the newly created join table. It was therefore not possible to search or sort at high speed using the join table after the join. In contrast, the method of joining tabular data based on the principle of the virtual join according to the present invention searches and narrows down the first table that is the source of the join and then joins the narrowed-down first table with the second table, thereby virtually realizing a search in the join table. Likewise, by sorting in advance the first table that is the source of the join and joining the sorted first table with the second table, a sort in the join table can be virtually realized.

FIG. 6 is an explanatory diagram of a search method using the tabular data joining method according to the embodiment of the present invention. FIG. 7 is a flowchart of the tabular data search method according to the embodiment of the present invention. In this embodiment, records whose member name begins with S are first retrieved from the baseball club member table (step 701). Next, the baseball club member table containing only the retrieved records, that is, the first table, is virtually joined with the game schedule table, which is the second table, on "support team" and "ball team" (steps 702 to 705), which yields the table virtually generated by the virtual join. This table matches the result of searching the virtual join table of FIG. 2 for records whose member name begins with S. Thus, according to the embodiment of the present invention, executing the virtual join on the result of a search performed in advance gives the same result as searching the join table generated by a join. Because the table before the virtual join can be given an index such as a B-Tree or bitmap index in advance, or other tags for speeding up searches, the embodiment of the present invention makes it possible to search the virtual join table at high speed.
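The equivalence stated above (search first, then virtually join, versus join and then search) can be illustrated with a minimal sketch. The function names `virtual_join` and `expand`, the item names, and the sample member and schedule records are hypothetical; the actual contents of the tables in FIG. 2 and FIG. 6 are not reproduced here. The grouping dictionary merely stands in for whatever index the second table carries.

```python
from collections import defaultdict

# Hypothetical sample data (not the tables shown in the figures).
members = [
    {"member": "Sato", "support_team": "G"},
    {"member": "Tanaka", "support_team": "T"},
    {"member": "Suzuki", "support_team": "T"},
]
schedule = [
    {"ball_team": "G", "date": "5/10"},
    {"ball_team": "T", "date": "5/11"},
    {"ball_team": "G", "date": "5/12"},
]


def virtual_join(first, second, first_key, second_key):
    """Yield (target record, record group) pairs; no join table is built."""
    groups = defaultdict(list)
    for rec in second:
        groups[rec[second_key]].append(rec)
    for rec in first:
        yield rec, groups.get(rec[first_key], [])


def expand(pairs):
    """Flatten the pairs into joined rows, only so the results can be compared."""
    return [{**a, **b} for a, group in pairs for b in group]


# Corresponds to step 701: narrow down the first table before joining.
narrowed = [m for m in members if m["member"].startswith("S")]
# Corresponds to steps 702 to 705: virtually join the narrowed first table.
search_then_join = expand(
    virtual_join(narrowed, schedule, "support_team", "ball_team"))

# Conventional order for comparison: join everything, then search.
join_then_search = [
    row
    for row in expand(virtual_join(members, schedule, "support_team", "ball_team"))
    if row["member"].startswith("S")
]
```

In this sketch both lists contain the same rows in the same order, while the search-first variant only ever touches the two members that satisfy the condition.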

FIG. 8 is an explanatory diagram of a sorting method using the tabular data joining method according to the embodiment of the present invention. FIG. 9 is a flowchart of the tabular data sorting method according to the embodiment of the present invention. In this embodiment, the records of the baseball club member table are first rearranged by sorting the member names in alphabetical order (step 901). Next, the baseball club member table with the rearranged record order, that is, the first table, is virtually joined with the game schedule table, which is the second table, on "support team" and "ball team" (steps 902 to 905), which yields the table virtually generated by the virtual join. This table matches the result of rearranging the virtual join table of FIG. 2 so that the member names are in alphabetical order. Thus, according to the embodiment of the present invention, executing the virtual join on a result that has been sorted in advance gives the same result as sorting the join table generated by a join. Because the table before the virtual join can be given an index such as a B-Tree or bitmap index in advance, or other tags for speeding up sorting, the embodiment of the present invention makes it possible to sort the virtual join table at high speed.
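A corresponding sketch for sorting is given below. It is an illustration under assumptions, not the implementation of the embodiment: the sample records, the item names, and the names `virtual_join_by_order` and `expand` are hypothetical. Here the first table is not physically rearranged; instead a sorted list of record numbers is used, which is one possible way to represent the rearranged record order.

```python
from collections import defaultdict

# Hypothetical sample data (not the tables shown in the figures).
members = [
    {"member": "Tanaka", "support_team": "T"},
    {"member": "Sato", "support_team": "G"},
    {"member": "Suzuki", "support_team": "T"},
]
schedule = [
    {"ball_team": "G", "date": "5/10"},
    {"ball_team": "T", "date": "5/11"},
    {"ball_team": "G", "date": "5/12"},
]


def virtual_join_by_order(first, order, second, first_key, second_key):
    """Yield (target record, record group) pairs, visiting `first` in `order`."""
    groups = defaultdict(list)
    for rec in second:
        groups[rec[second_key]].append(rec)
    for n in order:
        rec = first[n]
        yield rec, groups.get(rec[first_key], [])


def expand(pairs):
    """Flatten the pairs into joined rows, only so the results can be compared."""
    return [{**a, **b} for a, group in pairs for b in group]


# Corresponds to step 901: sort record numbers of the first table by member name.
order = sorted(range(len(members)), key=lambda n: members[n]["member"])
# Corresponds to steps 902 to 905: virtually join in the sorted order.
sort_then_join = expand(
    virtual_join_by_order(members, order, schedule, "support_team", "ball_team"))

# Conventional order for comparison: join in the original order, then sort.
joined = expand(virtual_join_by_order(
    members, range(len(members)), schedule, "support_team", "ball_team"))
join_then_sort = sorted(joined, key=lambda row: row["member"])
```

Python's `sorted` is stable, so both variants keep the game schedule order within each member and produce identical row sequences; the sort-first variant sorts three member records rather than every joined row.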

FIG. 1 is a block diagram showing the hardware configuration of a computer system that implements the present invention.
FIG. 2 is an explanatory diagram of virtual table generation according to the present invention.
FIG. 3 is a flowchart of an embodiment of the tabular data joining method of the present invention.
FIG. 4 is an explanatory diagram of the process of sequentially outputting pairs of a target record and a record group according to the embodiment of the present invention.
FIG. 5 is an explanatory diagram of index data for speeding up the search for a record group according to another embodiment of the present invention.
FIG. 6 is an explanatory diagram of a search method using the tabular data joining method according to the embodiment of the present invention.
FIG. 7 is a flowchart of the tabular data search method according to the embodiment of the present invention.
FIG. 8 is an explanatory diagram of a sorting method using the tabular data joining method according to the embodiment of the present invention.
FIG. 9 is a flowchart of the tabular data sorting method according to the embodiment of the present invention.

Explanation of symbols

10 Computer system
12 CPU
14 RAM
16 ROM
18 Fixed storage device
20 CD-ROM driver
22 I/F
24 Input device
26 Display device

Claims

1. A method for joining tabular data, comprising:
a procedure of preparing first tabular data and second tabular data, each represented as an array of records including item values corresponding to information items, and determining, as a key item, an information item belonging to both the first tabular data and the second tabular data;
a procedure of obtaining a key item value corresponding to the key item of a processing target record included in the first tabular data;
a procedure of identifying, from the records included in the second tabular data, a record group including item values that satisfy a predetermined matching condition with the key item value with respect to the key item; and
a procedure of outputting a pair of the processing target record included in the first tabular data and the record group included in the second tabular data.

2. The method for joining tabular data according to claim 1, wherein
the second tabular data is tabular data in which an index is assigned to the key item, and
the procedure of identifying the record group identifies the record group by a search using the index assigned to the key item of the second tabular data.

3. The method for joining tabular data according to claim 1, further comprising, prior to the procedure of identifying the record group, a procedure of generating, for each item value corresponding to the key item of the second tabular data, index data for designating, from the item value, the records including that item value, wherein
the procedure of identifying the record group identifies the record group by a search using the index data of the second tabular data.

4. The method for joining tabular data according to claim 3, wherein the procedure of generating the index data includes a procedure of sorting the index data based on the item values corresponding to the key item.

5. The method for joining tabular data according to claim 3, wherein the procedure of generating the index data includes a procedure of adding an index to the item value portion of the index data generated for each item value corresponding to the key item.

6. The method for joining tabular data according to claim 3, wherein the index data is a pointer to a record.

7. The method for joining tabular data according to claim 3, wherein the index data is a record number.

8. The method for joining tabular data according to claim 3, wherein the index data is a value associated with an address in a memory in which a record is stored.

9. The method for joining tabular data according to any one of claims 1 to 8, wherein the first tabular data is tabular data prepared by retrieving records including item values that meet a search condition from initial tabular data represented as an array of records including item values corresponding to information items.

10. The method for joining tabular data according to any one of claims 1 to 8, wherein the first tabular data is tabular data prepared by rearranging, with respect to a predetermined item, the records of initial tabular data represented as an array of records including item values corresponding to information items.

11. A method for searching tabular data, comprising:
a procedure of preparing first tabular data by retrieving records including item values that meet a search condition from initial tabular data represented as an array of records including item values corresponding to information items, and preparing second tabular data represented as an array of records including item values corresponding to information items;
a procedure of determining, as a key item, an information item belonging to both the first tabular data and the second tabular data;
a procedure of obtaining a key item value corresponding to the key item of a processing target record included in the first tabular data;
a procedure of identifying, from the records included in the second tabular data, a record group including item values that satisfy a predetermined matching condition with the key item value with respect to the key item; and
a procedure of generating a pair of the processing target record included in the first tabular data and the record group included in the second tabular data.

12. A method for sorting tabular data, comprising:
a procedure of preparing first tabular data by rearranging, with respect to a predetermined item, the records of initial tabular data represented as an array of records including item values corresponding to information items, and preparing second tabular data represented as an array of records including item values corresponding to information items;
a procedure of determining, as a key item, an information item belonging to both the first tabular data and the second tabular data;
a procedure of obtaining a key item value corresponding to the key item of a processing target record included in the first tabular data;
a procedure of identifying, from the records included in the second tabular data, a record group including item values that satisfy a predetermined matching condition with the key item value with respect to the key item; and
a procedure of generating a pair of the processing target record included in the first tabular data and the record group included in the second tabular data.

13. A tabular data joining device, comprising:
means for preparing first tabular data and second tabular data, each represented as an array of records including item values corresponding to information items, and determining, as a key item, an information item belonging to both the first tabular data and the second tabular data;
means for obtaining a key item value corresponding to the key item of a processing target record included in the first tabular data;
means for identifying, from the records included in the second tabular data, a record group including item values that satisfy a predetermined matching condition with the key item value with respect to the key item; and
means for outputting a pair of the processing target record included in the first tabular data and the record group included in the second tabular data.

14. A program that causes a computer to execute:
a procedure of preparing first tabular data and second tabular data, each represented as an array of records including item values corresponding to information items, and determining, as a key item, an information item belonging to both the first tabular data and the second tabular data;
a procedure of obtaining a key item value corresponding to the key item of a processing target record included in the first tabular data;
a procedure of identifying, from the records included in the second tabular data, a record group including item values that satisfy a predetermined matching condition with the key item value with respect to the key item; and
a procedure of outputting a pair of the processing target record included in the first tabular data and the record group included in the second tabular data.