JPWO2018100734A1

JPWO2018100734A1 - Data processing system

Info

Publication number: JPWO2018100734A1
Application number: JP2018553621A
Authority: JP
Inventors: 晃一郎椿; 清水　晃; 太郎藤本; 直哉明渡
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2016-12-02
Filing date: 2016-12-02
Publication date: 2019-01-10
Anticipated expiration: 2036-12-02
Also published as: JP6608544B2; WO2018100734A1

Abstract

A data processing system according to an embodiment of the present invention manages one or more tables, accepts search requests for those tables, and is configured to be able to import data recorded in files into the tables. On receiving a data search request for a table, the data processing system checks whether any file holds data that is to be imported into that table. If there is no file whose import is incomplete, the system searches the table for the data specified in the search request. If there is a file whose import is incomplete, the system searches the table for the specified data and also searches the file for it.

Description

The present invention relates to a data processing system.

In recent years, big data analysis has been actively pursued: diagnosis and prediction are performed by analyzing large volumes of time-series data, such as measurement values obtained from sensors installed in plant facilities and, in the financial field, stock prices and exchange rates. In such analysis, large volumes of time-series data that are updated from moment to moment are collected continuously, and analysis is performed on the collected data.

Data used for big data analysis is often managed by a so-called database management system (DBMS). In a DBMS, and in particular in a relational database management system (RDBMS), data items are managed as tabular information called “tables”. Before the data comes under DBMS management, however, for example immediately after it has been acquired from a sensor, it is generally saved in a storage device as an ordinary file such as a text file.

Many DBMSs provide users with a function for storing the data held in a file into a table managed by the DBMS. This function is hereinafter called the “import function”. With the import function, the user does not have to insert data into the table one record at a time; the data in the file can be stored in the table in a single operation (this is called “importing”).

If the data in a table cannot be accessed during an import, the user cannot carry out work such as analysis in the meantime. In particular, when a file contains a large amount of data, the import takes a long time, and the period during which work cannot be carried out becomes correspondingly long. To address this problem, some DBMSs support a technique that allows data to be imported while search requests for the table are still accepted (hereinafter, the “background import function”) (Non-Patent Document 1). With the background import function, the user can access the table even while an import is in progress.

Hitachi Advanced Data Binder Setup and Operation Guide [online], Hitachi Ltd., (http://itdoc.hitachi.co.jp/manuals/3000/3000650150e/A0650150.PDF)

Big data analysis calls for large volumes of time-series data to be collected and analyzed in real time. With the background import function, the user can search the data already in a table while a file is being imported into it, but cannot search the data recorded in the file being imported. To analyze the data recorded in the file, the user must therefore wait until the import process has completed.

A data processing system according to an embodiment of the present invention manages one or more tables, accepts search requests for those tables, and is configured to be able to import data recorded in files into the tables. On receiving a data search request for a table, the data processing system checks whether any file holds data that is to be imported into that table. If there is no file whose import is incomplete, the system searches the table for the data specified in the search request. If there is a file whose import is incomplete, the system searches the table for the specified data and also searches the file for it.

According to the present invention, data can be accessed immediately, without waiting for the import of the data to complete.
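The search behavior summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the embodiments: the function name `search`, the in-memory representation of the table as a list of rows, and the representation of each not-yet-imported file as a CSV string are all hypothetical.

```python
import csv
import io

def search(table_rows, pending_files, predicate):
    """Return rows that satisfy predicate.

    table_rows    : rows already stored in the table (hypothetical in-memory form)
    pending_files : CSV text of files whose import is incomplete (may be empty)
    predicate     : function that decides whether a row matches the request
    """
    # The table is always searched.
    results = [row for row in table_rows if predicate(row)]
    # Files are searched only when some import is still incomplete.
    for text in pending_files:
        for row in csv.reader(io.StringIO(text)):
            if predicate(row):
                results.append(row)
    return results
```

When `pending_files` is empty, the sketch degenerates to a plain table search, which corresponds to the case in which no file remains to be imported.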

A configuration diagram of the data processing system.
An example of a table in which data to be searched is stored.
An example of the format of an import file.
A functional block diagram of the server.
An example of the dictionary.
An example (2) of the dictionary.
An example of the data file management table.
A flowchart of the file monitoring process.
A flowchart of the file update check process.
A flowchart of the import process.
A conceptual diagram of the background import function.
A flowchart of the search process.
An example of a query before rewriting.
An example of partial query 2.
An example of a query after rewriting.
A flowchart of the query rewriting process.
A generalized example of a query before rewriting.
A generalized example of a query after rewriting.
An example of partial query 2 generated by the query rewriting process in Embodiment 2.
A flowchart of the query rewriting process in Embodiment 2.
A flowchart of the search process in Embodiment 3.
A diagram comparing Embodiment 1 or 2 with Embodiment 3.
An example of an SQL query for storing the data of an import file in a table.

Embodiments of the present invention are described below with reference to the drawings. The embodiments described below do not limit the invention as set out in the claims, and not all of the elements described in the embodiments, or their combinations, are necessarily essential to the solution provided by the invention.

(1) System configuration
FIG. 1 shows the hardware configuration of a data processing system according to an embodiment of the present invention. The data processing system has a database server 1 (hereinafter abbreviated as “server 1” or “DB server 1”), a client 2, and storage devices 3 and 4. The server 1 and the client 2 are connected so that they can communicate with each other via a local area network (LAN) 6 built using, for example, Ethernet. The server 1 is connected to the storage devices 3 and 4 via a network 5 (also called the SAN 5) built using, for example, Fibre Channel.

The server 1 is a computer that processes database access requests received from users of the data processing system (hereinafter, “users”). It has a CPU 11, a memory 12, a network port 13 for connecting to the LAN 6, an input/output device 14, and a storage port 16. The memory 12 is a storage device such as DRAM; when the CPU 11 executes a program, the memory 12 holds that program and the control information used during its execution. The CPU 11 is the component that executes the programs that perform database access processing. In the data processing system according to this embodiment, the server 1 may be a so-called SMP (symmetric multiprocessing) server that has a plurality of CPUs 11, each able to execute processing in parallel. Instead of a plurality of CPUs 11, the server 1 may contain one or more multi-core processors, each having a plurality of processor cores.

The input/output device 14 includes, for example, devices that the user uses to enter information, such as a keyboard and a mouse, and display (output) devices such as a display and a printer. The storage port 16 is an interface for connecting the server 1 to the storage device 3.

The client 2 is a computer that the user uses to issue reference and update requests for the database to the server 1 and to receive the processing results returned from the server 1. The client 2 has a CPU 21, a memory 22, a network port 23 for connecting to the LAN 6, and an input/output device 24. The CPU 21, memory 22, network port 23, and input/output device 24 are equivalent to the CPU 11, memory 12, network port 13, and input/output device 14 of the server 1, respectively. In addition to the memory 22, the client 2 may have an auxiliary storage device such as a magnetic disk.

The storage devices 3 and 4 are devices that have nonvolatile storage devices such as magnetic disks and that store the database 31 and the import files 32. Each of the storage devices 3 and 4 may be a device that has a plurality of nonvolatile storage devices, such as a so-called disk array (or RAID). The storage device 3 is connected to the storage port 16 of the server 1 via the SAN 5.

The data processing system according to this embodiment collects and manages data generated by an external data source 8. The data source 8 is, for example, a device that includes sensors installed in plant facilities or the like, and it is connected to the server 1 via a wide area network (WAN) 7 and the LAN 6. FIG. 1 shows only one data source 8, but there may be more than one.

The data source 8 has a function of creating a file that stores a large number of items of information measured by the sensors and of sending the created file to the server 1. A file sent to the server 1 is first stored in the storage device 4, and the information recorded in the file is then written (imported) into a table 300 in the storage device 3. The table 300 is a data structure created by the database management program 120 (described later). In this embodiment, a file sent from the data source 8 is called an “import file” (the import file 32 in FIG. 1). This embodiment describes an example in which the storage device 4, which stores the import files 32, and the storage device 3, which stores the tables 300, are separate storage devices, but a configuration in which the tables 300 and the import files 32 are stored in the same storage device 3 (or storage device 4) may also be adopted.

The memory 12 of the server 1 stores the programs executed by the server 1 and the control information used by those programs. The programs executed by the server 1 include, for example, a database management program 120, a file system program 121, an OS 122, and a data collection program 123.

The OS 122 is a program that controls the scheduling of the various programs executed on the server 1. The OS 122 according to this embodiment may also include device drivers, which are programs that provide abstracted hardware resources to the various programs.

The file system program 121 is a program that stores files in the storage device 4 or the like and manages them. In this embodiment, the file system program 121 is mainly used to access the import files 32 stored in the storage device 4; it is not used when the database 31 is accessed. As another embodiment, however, the database 31 may be stored on a file system (a data structure for storing and managing files) created by the file system program 121.

The database management program 120 is a program that is sometimes called a relational database management system (RDBMS). It creates tabular data structures called “tables” and stores and manages data in them. In this embodiment, the tables are created in the storage device 3. As a rule, the database management program 120 mainly accesses the data in the tables. However, the database management program 120 according to this embodiment supports table functions (described later) and can therefore also search the data stored in files managed by the file system program 121 (the import files 32 stored in the storage device 4). Details are given later. In addition to a program for processing database access requests from users (called “queries” or “SQL queries”), the database management program 120 also includes a program for importing data stored in files into the table 300 (called the import program in this embodiment).

The data collection program 123 is a program that receives files from the data source 8 and stores them in the storage device 4.

The management information used by these programs consists of a dictionary 500 and a data file management table 700. These are described in detail later.

While the server 1 is not running, the programs and management information described above are stored in the storage device 3 or 4 (or in an auxiliary storage device, not shown, built into the server 1). After the server 1 starts, these programs and management information are read from the storage device 3 or 4 into the memory 12 when needed (for example, when a search process is performed) and are used by the CPU 11. The server 1 may also store in the memory 12 programs other than those described above and information other than the management information described above.

The memory 22 of the client 2 holds a client program 221, which the CPU 21 executes. The client program 221 is a program that provides a GUI (graphical user interface) or a CLI (command-line interface) through which the user issues information search instructions.

(2) Configuration of the database and import files
Next, the configuration of the database 31 and the import files 32 handled by the data processing system according to this embodiment is described. In the server 1 according to this embodiment, the database management program 120 defines, in the storage device 3, a database area (called the database 31) in which the tables 300 and the like are defined, and one or more tables 300 are defined in the database 31.

First, FIG. 2 shows an example of a table 300 managed by the server 1. In this embodiment, the data (records) stored in the table 300 is, as an example, time-series data. Time-series data is, for example, a set of measurement data that the data source 8 acquires continuously from sensors.

The table 300 holds a plurality of records, each having four columns: DATE (311), VALUE1 (312), VALUE2 (313), and VALUE3 (314). The VALUE1 (312), VALUE2 (313), and VALUE3 (314) columns store physical quantities, such as temperature, measured by the sensors. DATE (311) is a column that stores the date and time at which the sensor measurement was made. The table 300 in FIG. 2 is only an example: each record may have four or more columns, and conversely the table 300 may have fewer than four columns.

FIG. 3 shows an example of an import file 32. In this embodiment, the so-called CSV (comma-separated values) format is used as the file format of the import file 32, although a file format other than CSV may be used. The data in the import file 32 may be stored in compressed form, but this embodiment describes an example in which the import file 32 holds uncompressed data.

Each line of the import file 32 corresponds to a record of the table 300. That is, the items of information to be stored in the columns of a table record (DATE (311), VALUE1 (312), VALUE2 (313), and VALUE3 (314)) are written separated by commas. In FIG. 3, the line 320 is the set of information corresponding to one record of the table, and the elements 321, 322, 323, and 324 are the items of information stored in DATE (311), VALUE1 (312), VALUE2 (313), and VALUE3 (314) of the table, respectively. In the following, unless otherwise noted, a line of the import file 32 is also called a “record”, in the same way as a record of the table 300.
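As an illustration of the correspondence between a line of the import file and a table record, the following sketch parses CSV text into records keyed by column name. The column names come from the table 300 described above; the function name `parse_import_file` and the sample values are hypothetical and are not taken from FIG. 3.

```python
import csv
import io

# Column names of the table 300, in the order in which they appear on each line.
COLUMNS = ["DATE", "VALUE1", "VALUE2", "VALUE3"]

def parse_import_file(text):
    """Turn CSV text into a list of records, one dict per line."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=COLUMNS)
    return [dict(row) for row in reader]

# Hypothetical sample content; each line is one record.
SAMPLE = "2016-12-01,10,20,30\n2016-12-02,11,21,31\n"
records = parse_import_file(SAMPLE)
```

All values are kept as strings here; converting them to the data types declared for each column is left out of the sketch.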

(3) Functional block configuration
Next, the functional blocks of the server 1 are described with reference to FIG. 4. When the programs described above (mainly the database management program 120) are executed by the CPU 11, the server 1 according to this embodiment operates as a device having the following functional blocks: a dictionary management unit 201, a file collection unit 202, a file monitoring unit 203, a query reception unit 204, a query rewriting unit 205, a query optimization unit 206, a query execution unit 207, a database access unit 208, and a table function processing unit 209. The role of each functional block and the management information it uses are described below.

The dictionary management unit 201 creates tables in accordance with table creation requests received from users. When creating a table, the dictionary management unit 201 records the definition information of the table in the dictionary 500. The contents of the dictionary 500 are described later.

The query reception unit 204 accepts a database access request from a user, has the appropriate functional blocks perform the processing for that request, and returns the processing result to the user. The DB server 1 according to this embodiment receives from the client 2 a database access request written in SQL (Structured Query Language), called a “query” or “SQL query”, and processes the query. When the query reception unit 204 determines, for example, that the query needs to be rewritten, it passes the query to the query rewriting unit 205.

The query rewriting unit 205 is a functional block that rewrites an accepted query. The processing performed by the query rewriting unit 205 is described in detail later.

The query optimization unit 206 is a functional block that analyzes an accepted query and determines the execution procedure (execution plan) for the processing that the query requires. The query execution unit 207 performs processing such as searching for records stored in a table in accordance with the execution procedure determined by the query optimization unit 206.

The database access unit 208 is a functional block that accesses the records stored in the table 300. It reads and writes records in accordance with instructions from the query execution unit 207 and returns the results to the query execution unit 207. For example, if the instruction from the query execution unit 207 is a record search instruction, the database access unit 208 reads records from the table 300 and returns them to the query execution unit 207.

The table function processing unit 209 is a functional block that reads files (in particular, the import files 32). A table function is a feature standardized in SQL:2003, and the database management program 120 according to this embodiment supports table functions. The table function processing unit 209 has a function of reading an import file 32 and returning each line written in the import file 32 to the query execution unit 207 as tabular data.
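The role of the table function processing unit can be pictured with the following sketch, in which a generator reads CSV lines from a text stream and hands them back as rows that a caller can filter like table rows. It is an analogy written in Python under stated assumptions, not SQL table function syntax; the name `csv_table_function` and the sample data are hypothetical.

```python
import csv
import io
from typing import Iterator, TextIO, Tuple

def csv_table_function(stream: TextIO) -> Iterator[Tuple[str, ...]]:
    """Yield each CSV line of the stream as one row (a tuple of column values)."""
    for fields in csv.reader(stream):
        yield tuple(fields)

# Hypothetical usage: the caller consumes the rows as if they came from a table.
sample = io.StringIO("2016-12-01,10,20,30\n2016-12-02,11,21,31\n")
rows_after_first_day = [r for r in csv_table_function(sample) if r[0] > "2016-12-01"]
```

Because the rows are produced lazily, the caller can start filtering before the whole file has been read.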

The file collection unit 202 is a functional block that stores files (import files) sent from the data source 8 in the storage device 4. The server 1 functions as the file collection unit 202 when the CPU 11 executes the data collection program 123. The file monitoring unit 203 is a functional block that monitors the import files 32 stored in the storage device 4. The file monitoring unit 203 also manages the correspondence between the import files 32 and the tables 300 by storing it in the data file management table 700.

In this embodiment, the processing executed by the server 1 is in places described with a functional block, such as the query rewriting unit, as the grammatical subject. As stated above, in the data processing system according to this embodiment the server 1 operates as a device having these functional blocks because the programs (mainly the database management program 120) are executed by the CPU 11, so strictly speaking the entity that actually performs the processing is the CPU 11 of the server 1. To keep the description from becoming verbose, however, the flow of processing is sometimes described with a functional block as the subject. Where processing is described in this way, the entity performing it is the CPU 11.

(4) Management information
Next, the dictionary 500 and the data file management table 700 are described. When the server 1 creates a table, it records the attribute information and other details of the table being defined in the dictionary 500. An example of the information recorded in the dictionary 500 when the table 300 shown in FIG. 2 is defined (created) is described with reference to FIGS. 5 and 6. The dictionary 500 has SQL_TABLES (510), which stores the attributes of the table 300, and SQL_COLUMNS (520), which stores the attributes of each column of the table. FIG. 5 shows the structure of SQL_TABLES (510), and FIG. 6 shows the structure of SQL_COLUMNS (520).

SQL_TABLES (510) has one or more records, each having the columns schema name (511), table identifier (512), table ID (513), chunk specification (514), and import file storage directory path (516). Each time a table is created, one record is created in SQL_TABLES (510). The schema name (511), table identifier (512), and table ID (513) are, respectively, the name of the schema to which the created table belongs (generally the user name of the user who requested the table creation), the identifier of the created table (the table name specified by the user), and the identification number of the created table. Because this information is also managed by known RDBMSs, a detailed description is omitted. In this embodiment, the combination of the schema name and the table identifier is sometimes called the “table name”.

The attributes of a table managed by the database management program 120 according to this embodiment also include the chunk specification (514), the maximum number of chunks (515), and the import file storage directory path (516). The chunk specification (514) is an attribute that specifies whether chunks can be created. The database management program 120 according to this embodiment manages the data stored in a table by a single import process as one unit, and this unit is called a “chunk”. A chunk specification (514) of “Y” means that chunks can be created in the table. The maximum number of chunks (515) is the maximum number of chunks that can be created in the table. The information stored in the chunk specification (514) and the maximum number of chunks (515) is specified by the user when the table is defined.

The import file storage directory path (516) stores the name of the directory in which import files are stored. This directory name is specified by the user when the user defines the table. If the user does not specify a directory name when defining the table, a NULL value is stored in the import file storage directory path (516). In this embodiment, the content (directory name) of the import file storage directory path (516) is assumed to differ from table to table (the user specifies it so). That is, the one or more files stored in a given directory are imported into one specific table.

SQL_COLUMNS (520) is a kind of table having one or more records, each having the columns schema name (521), table identifier (522), column name (523), column ID (524), data type (525), and data definition length (526). This information is also managed by known RDBMSs. Each time a table is created, one SQL_COLUMNS (520) storing information about that table is created.

The schema name (521) and the table identifier (522) are the same information as the table identifier (512) and the table ID (513) of SQL_TABLES (510), respectively. The column name (523) is the name of a column created in the table, and the column ID (524) is the identification number of that column. The data type (525) specifies the type of data stored in the created column. As shown in FIG. 6, the data type is, for example, an integer type (INTEGER) or a date type (DATE). The data definition length (526) specifies the length (maximum length) of the data stored in the created column.

続いて、本実施例に係るデータ処理システムがインポートファイルを管理するための方法について説明する。 Next, a method for managing the import file by the data processing system according to the present embodiment will be described.

データファイル管理表７００の例を図７に示す。データファイル管理表７００は、各インポートファイルについての情報を格納するための表で、１つのレコードに１つのインポートファイルの情報が格納される。 An example of the data file management table 700 is shown in FIG. The data file management table 700 is a table for storing information about each import file, and information on one import file is stored in one record.

ファイルパス（７０３）は、インポートファイルのファイル名である。ファイルパス（７０３）に格納されるファイル名には、絶対パス名が用いられる。 The file path (703) is the file name of the import file. An absolute path name is used as the file name stored in the file path (703).

The schema name (701) and the table identifier (702), on the other hand, identify the table that is the import destination of the file (import file) whose name is stored in the file path (703). In the first row of FIG. 7, the file path (703) is “/home/data_dir/aaa.csv”, the schema name (701) is “USER1”, and the table identifier (702) is “T1”. This indicates that the import file “/home/data_dir/aaa.csv” is to be imported (or has been imported) into the table whose schema name (701) is “USER1” and whose table identifier (702) is “T1”. The imported flag (707) represents the processing state of the import file. Specifically, the imported flag (707) stores information indicating whether the file named in the file path (703) has been imported into the table. When the imported flag (707) is “N”, the file identified by the file path (703) has not yet been imported into the table. When the imported flag (707) is “Y”, the file has already been imported into the table (the table identified by the schema name (701) and the table identifier (702)).

The file update date and time (705) represents the date and time at which the file was created (or updated), whereas the row insertion date and time (706) represents the date and time at which this record was created in the data file management table 700 (or the date and time at which the record was updated). The triggers for storing (inserting) a record in the data file management table 700, and the specific processing performed at that time, are described later.

なお、本実施例では、データファイル管理表７００もデータベース管理プログラム１２０が管理するテーブルである。そのためサーバ１がデータファイル管理表７００のレコードにアクセスする際、ＳＱＬクエリを発行することでアクセスできる。またデータファイル管理表７００の属性情報もディクショナリ５００に格納される。 In the present embodiment, the data file management table 700 is also a table managed by the database management program 120. Therefore, when the server 1 accesses a record in the data file management table 700, it can be accessed by issuing an SQL query. The attribute information of the data file management table 700 is also stored in the dictionary 500.
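Because the data file management table 700 is itself an ordinary table that can be accessed with SQL queries, its role can be illustrated with a small sketch. The following is a minimal example using SQLite, not the embodiment's own database management program. The table name `data_file_management`, the column names, the second row (`bbb.csv`), and all sizes and timestamps are hypothetical values chosen for illustration; only the first row's path, schema name, and table identifier come from FIG. 7 as described above.

```python
import sqlite3

# Hypothetical column names mirroring fields (701) to (707).
DDL = """
CREATE TABLE data_file_management (
    schema_name   TEXT NOT NULL,      -- (701) schema of the destination table
    table_ident   TEXT NOT NULL,      -- (702) identifier of the destination table
    file_path     TEXT PRIMARY KEY,   -- (703) absolute path of the import file
    file_size     INTEGER,            -- (704)
    file_updated  TEXT,               -- (705) file creation/update time
    row_inserted  TEXT,               -- (706) time the record was created/updated
    imported_flag TEXT NOT NULL DEFAULT 'N'
                  CHECK (imported_flag IN ('Y', 'N'))  -- (707)
)
"""


def pending_files(conn, schema_name, table_ident):
    """Return paths of files registered for the table but not yet imported."""
    rows = conn.execute(
        "SELECT file_path FROM data_file_management "
        "WHERE schema_name = ? AND table_ident = ? AND imported_flag = 'N' "
        "ORDER BY file_path",
        (schema_name, table_ident),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(DDL)
conn.executemany(
    "INSERT INTO data_file_management VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        # Illustrative rows; sizes and timestamps are made up.
        ("USER1", "T1", "/home/data_dir/aaa.csv", 1024,
         "2016-12-01 10:00:00", "2016-12-01 10:00:05", "N"),
        ("USER1", "T1", "/home/data_dir/bbb.csv", 2048,
         "2016-11-30 09:00:00", "2016-11-30 09:00:05", "Y"),
    ],
)
```

Calling `pending_files(conn, "USER1", "T1")` returns only the files whose imported flag is “N”, which is the lookup the later search processing relies on.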

(5) Process Flow
The flow of processing executed in the data processing system is described below.

(5-1) File Monitoring Process
First, the flow of the file monitoring process is described. The file monitoring process is executed by the file monitoring unit 203. In the file monitoring process, it is checked whether an import file has been newly stored in the storage device 4, or whether an import file that was already stored has been updated.

記憶装置４にファイルを格納する処理は、ファイル収集部２０２が実施する。ファイル収集部２０２はデータソース８からのファイル送信要求に応じて実行される。データソース８では常時、センサを用いた計測を行っており、計測結果の格納されたファイル（図３に示されたファイル）を適宜サーバ１に送信する。サーバ１へのファイル送信は定期的に行われてもよいし、或いは不定期なファイル送信が行われてもよい。 The process of storing the file in the storage device 4 is performed by the file collection unit 202. The file collection unit 202 is executed in response to a file transmission request from the data source 8. The data source 8 always performs measurement using a sensor, and transmits a file storing the measurement results (file shown in FIG. 3) to the server 1 as appropriate. File transmission to the server 1 may be performed periodically, or irregular file transmission may be performed.

データソース８がファイル送信要求をサーバ１に発行することに応じて、サーバ１ではファイル収集部２０２の実行が開始される。ファイル収集部２０２は、主にデータ収集プログラム１２３とファイルシステムプログラム１２１とがＣＰＵ１１で実行されることで実現される機能ブロックである。本実施例では、データソース８の発行するファイル送信要求に、ファイルと共に、ファイルの格納先ディレクトリ名も含まれている前提とする。つまりデータソース８では、あらかじめインポートファイルの格納先ディレクトリ名を把握している。そのためファイル収集部２０２は、データソース８から受領したファイルを、ファイル送信要求で指定されているディレクトリに格納する処理を行うだけであるため、詳細な説明は略す。 In response to the data source 8 issuing a file transmission request to the server 1, the server 1 starts executing the file collection unit 202. The file collection unit 202 is a functional block realized mainly by the data collection program 123 and the file system program 121 being executed by the CPU 11. In this embodiment, it is assumed that the file transmission request issued by the data source 8 includes the file storage directory name along with the file. That is, the data source 8 knows the name of the storage directory of the import file in advance. Therefore, the file collection unit 202 only performs a process of storing the file received from the data source 8 in the directory specified by the file transmission request, and thus detailed description thereof is omitted.

The flow of the file monitoring process is described below with reference to FIG. 8. The file monitoring process is executed periodically. That is, the timing at which the file monitoring process runs is independent of the timing at which files are transmitted from the data source 8 to the server 1. If a request to stop the file monitoring process is received from the user, the file monitoring process stops.

ステップ１００１：ファイル監視部２０３は、所定時間待機する。待機する時間の長さは、あらかじめ定められた固定値でも良いし、あるいは待機する時間をユーザが指定することができるように、ファイル監視部２０３が構成されていてもよい。 Step 1001: The file monitoring unit 203 waits for a predetermined time. The length of the waiting time may be a predetermined fixed value, or the file monitoring unit 203 may be configured so that the user can specify the waiting time.

Step 1002: The file monitoring unit 203 refers to the dictionary 500 and identifies the name of the directory in which import files are stored. Specifically, the file monitoring unit 203 can identify this directory name by referring to column 516 (import file storage directory path) of SQL_TABLES (510) in the dictionary 500. When SQL_TABLES (510) contains a plurality of records (that is, when a plurality of tables are defined in the server 1), the file monitoring unit 203 refers to each record of SQL_TABLES (510) and identifies all directory names stored in the import file storage directory path (516). To keep the explanation simple, the following describes the case in which only one record of SQL_TABLES (510) has a directory name stored in its import file storage directory path (516). The directory identified here is called the “monitoring target directory”.

Step 1003: The file monitoring unit 203 checks the files in the monitoring target directory. The details of this processing are described later (FIG. 9). As a result of executing step 1003, if a new file (import file) has been stored in the monitoring target directory, the contents of the data file management table 700 are updated.

ステップ１００４：ステップ１００３において、ファイル監視部２０３がデータファイル管理表７００の内容を更新した場合（ステップ１００４：Ｙｅｓ）、ファイル監視部２０３は次にステップ１００５を実行する。もしステップ１００３でファイル監視部２０３がデータファイル管理表７００の内容を更新しなかった場合（ステップ１００４：Ｎｏ）、ファイル監視部２０３は次にステップ１００１に戻る。 Step 1004: In step 1003, when the file monitoring unit 203 updates the contents of the data file management table 700 (step 1004: Yes), the file monitoring unit 203 next executes step 1005. If the file monitoring unit 203 has not updated the contents of the data file management table 700 in step 1003 (step 1004: No), the file monitoring unit 203 returns to step 1001 next.

Step 1005: This step is executed when the file monitoring unit 203 has updated the contents of the data file management table 700 in step 1003 (that is, when a new import file has been stored in the monitoring target directory, or when an import file that was already stored has been updated). In step 1005, the file monitoring unit 203 performs the import process on the newly stored (or updated) import file. The details of the import process are described later.

ステップ１００６：ファイル監視部２０３は、ユーザがファイル監視処理の停止要求を発行したか確認する。もしユーザがファイル監視処理の停止要求を発行した場合には、ファイル監視部２０３は処理を終了する。ファイル監視処理の停止要求が発行されていない場合には、ファイル監視部２０３は再びステップ１００１から処理を繰り返す。 Step 1006: The file monitoring unit 203 confirms whether the user has issued a request to stop the file monitoring process. If the user issues a request to stop the file monitoring process, the file monitoring unit 203 ends the process. When the request for stopping the file monitoring process has not been issued, the file monitoring unit 203 repeats the process from step 1001 again.

The file monitoring unit 203 performs the processing described above. The description above covers the case in which only one monitoring target directory is identified in step 1002. If a plurality of directory names are identified in step 1002, the file monitoring unit 203 performs steps 1003 to 1005 for each of the identified directories.
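The loop formed by steps 1001 to 1006 can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the function `monitor`, its callback parameters (`list_monitored_dirs`, `check_directory`, `import_directory`), and the `max_cycles` argument are hypothetical names introduced here. As a simplification, the sketch checks for a stop request once per cycle, even when no directory was updated.

```python
import threading


def monitor(list_monitored_dirs, check_directory, import_directory,
            stop_event, wait_seconds=1.0, max_cycles=None):
    """Run the monitoring cycle until a stop is requested.

    list_monitored_dirs: callable returning directory names (step 1002).
    check_directory:     callable returning True if the management table
                         was updated for that directory (step 1003).
    import_directory:    callable performing the import (step 1005).
    max_cycles:          optional bound, useful for testing (assumption).
    Returns the number of cycles executed.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        # Step 1001: wait for a fixed or user-specified time.
        stop_event.wait(wait_seconds)
        # Step 1002: identify the monitoring target directories.
        for directory in list_monitored_dirs():
            # Step 1003: check the files in the directory.
            updated = check_directory(directory)
            # Step 1004: import only if the management table was updated.
            if updated:
                # Step 1005: import new or updated files.
                import_directory(directory)
        cycles += 1
        # Step 1006: stop if the user has requested it.
        if stop_event.is_set():
            break
    return cycles
```

Passing the three operations as callables keeps the loop independent of how the dictionary and the data file management table are actually stored.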

続いて、図８のステップ１００３で行われる処理（ファイル更新チェック処理と呼ぶ）の詳細を、図９を用いて説明する。なお、図８の説明と同様に、ここでは図８のステップ１００２で監視対象ディレクトリが１つだけ特定されたケースについて説明する。もしステップ１００２で複数のディレクトリが特定された場合には、特定されたそれぞれのディレクトリについて、以下で述べるファイル更新チェック処理が行われる。 Next, details of the processing (referred to as file update check processing) performed in step 1003 of FIG. 8 will be described with reference to FIG. As in the description of FIG. 8, here, a case where only one monitoring target directory is specified in step 1002 of FIG. 8 will be described. If a plurality of directories are specified in step 1002, the file update check process described below is performed for each specified directory.

Step 1201: The file monitoring unit 203 acquires information on each file stored in the monitoring target directory. The information that the file monitoring unit 203 needs to read here is the information recorded in the data file management table 700, specifically the information recorded in the file path (703), the file size (704), and the file update date and time (705). When a file (import file) sent from the data source 8 is stored in the storage device 4, the file collection unit 202 (the data collection program 123 and the file system program 121) stores the file together with attribute information such as the file name and the file creation date and time (or update date and time). The file monitoring unit 203 can therefore acquire the necessary information on each file by reading, from the storage device 4, the attribute information assigned to each file by the file system program 121.

Step 1202: If the processing from step 1203 onward has been performed for all files in the monitoring target directory, the file monitoring unit 203 ends the processing. If files remain in the monitoring target directory for which the processing from step 1203 onward has not yet been performed, the file monitoring unit 203 next executes step 1203.

ステップ１２０３：ファイル監視部２０３は、ステップ１２０１で取得された各ファイルの情報のうち、ステップ１２０４以降の処理を行う対象となる１つのファイルの情報を選択する。以下の説明では、ここで選択されたファイルの情報のことを「対象ファイルの情報」と呼び、また対象ファイルの情報を属性情報として持つファイルのことを「対象ファイル」と呼ぶ。そしてファイル監視部２０３は、データファイル管理表７００内の行（レコード）のうち、ファイルパス（７０３）が、対象ファイルのファイル名と同じであるレコードがあるか検索する。 Step 1203: The file monitoring unit 203 selects information on one file to be processed in step 1204 and subsequent steps from the information on each file acquired in step 1201. In the following description, the information on the selected file is referred to as “target file information”, and the file having the target file information as attribute information is referred to as a “target file”. Then, the file monitoring unit 203 searches the row (record) in the data file management table 700 for a record having a file path (703) that is the same as the file name of the target file.

ステップ１２０４：ファイル監視部２０３は、ステップ１２０３の処理の結果、対象ファイルのファイル名と同じであるレコードがデータファイル管理表７００内にない場合（ステップ１２０４：Ｎｏ）、次にステップ１２０５を行い、ある場合には、次にステップ１２０６を実行する。なおステップ１２０４の判定が否定的な場合とは、監視対象ディレクトリに新たなインポートファイルが格納された場合である。 Step 1204: If the result of the processing in step 1203 is that the record having the same file name as the target file does not exist in the data file management table 700 (step 1204: No), the file monitoring unit 203 performs step 1205, If there is, then step 1206 is executed. The case where the determination in step 1204 is negative is a case where a new import file is stored in the monitoring target directory.

Step 1205: The file monitoring unit 203 creates one record in the data file management table 700 and stores the information of the target file in the file path (703), file size (704), and file update date and time (705) of the created record. At this time, the file monitoring unit 203 also records the current time (the time at which step 1205 is being executed) in the row insertion date and time (706) of the created record, and records in the schema name (701) and the table identifier (702) the schema name and table identifier of the table into which the target file is to be imported. Since the schema name and table identifier of the table into which a file is to be imported are recorded in SQL_TABLES (510), the file monitoring unit 203 identifies the row of SQL_TABLES (510) whose import file storage directory path (516) holds the same directory name as the monitoring target directory, and stores the schema name (511) and table identifier (512) of that row in the schema name (701) and table identifier (702) of the newly created record.

ステップ１２０６：ファイル監視部２０３は、対象ファイルの情報が、データファイル管理表７００内のレコードに記録されているファイルの情報より新しい場合（ステップ１２０６：Ｙｅｓ）、ステップ１２０７を実行し、そうでない場合にはステップ１２０２から処理を繰り返す。具体的にはファイル監視部２０３はファイルの更新日時を比較することで、ステップ１２０６の判定を行う。 Step 1206: When the information of the target file is newer than the information of the file recorded in the record in the data file management table 700 (step 1206: Yes), the file monitoring unit 203 executes step 1207, and otherwise The process is repeated from step 1202. Specifically, the file monitoring unit 203 performs the determination in step 1206 by comparing the update date and time of the file.

ステップ１２０７：対象ファイルの情報が、データファイル管理表７００内のレコードに記録されているファイルの情報より新しい場合、そのファイルが更新されたことを意味する。そのためファイル監視部２０３は、データファイル管理表７００内のレコードに記録されているファイルの情報を、ステップ１２０３で選択されたファイルの情報を用いて更新する。具体的には、ファイルサイズ（７０４）、ファイル更新日時（７０５）が、ステップ１２０３で選択されたファイルの情報を用いて更新される。またファイル監視部２０３は、行挿入日時（７０６）を現在時刻（ステップ１２０７を実行した時刻）に更新し、またインポート済みフラグ（７０７）には“Ｎ”を格納する。 Step 1207: If the information of the target file is newer than the information of the file recorded in the record in the data file management table 700, it means that the file has been updated. Therefore, the file monitoring unit 203 updates the file information recorded in the record in the data file management table 700 using the file information selected in Step 1203. Specifically, the file size (704) and the file update date / time (705) are updated using the information of the file selected in step 1203. Further, the file monitoring unit 203 updates the row insertion date / time (706) to the current time (the time when step 1207 is executed), and stores “N” in the imported flag (707).
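The file update check of steps 1201 to 1207 can be sketched as below. This is an illustrative sketch only: the data file management table is modelled as a Python dictionary keyed by absolute file path, and the names `scan_directory`, `check_files`, and the record keys are hypothetical. Setting the imported flag to “N” for a newly created record is an assumption, since step 1205 as described does not state the initial flag value.

```python
import os
from datetime import datetime


def scan_directory(directory):
    """Step 1201: collect (absolute path, size, mtime) for each regular file."""
    infos = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                st = entry.stat()
                infos.append((os.path.abspath(entry.path), st.st_size, st.st_mtime))
    return infos


def check_files(infos, management, schema_name, table_ident, now=None):
    """Steps 1202 to 1207: register new files and refresh updated ones.

    management maps file path -> record dict (a stand-in for table 700).
    Returns True if any record was created or updated.
    """
    now = now if now is not None else datetime.now()
    changed = False
    for path, size, mtime in infos:            # steps 1202-1203: next target file
        record = management.get(path)
        if record is None:                     # step 1204: No -> step 1205
            management[path] = {
                "schema_name": schema_name,    # (701)
                "table_ident": table_ident,    # (702)
                "file_size": size,             # (704)
                "file_updated": mtime,         # (705)
                "row_inserted": now,           # (706)
                "imported": "N",               # (707), initial value assumed
            }
            changed = True
        elif mtime > record["file_updated"]:   # step 1206: file is newer
            record.update(                     # step 1207
                file_size=size,
                file_updated=mtime,
                row_inserted=now,
                imported="N",
            )
            changed = True
    return changed
```

A caller would typically pass `scan_directory(monitoring_target_directory)` as `infos`; separating the scan from the comparison keeps the comparison logic free of file system access.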

Next, the import process executed in step 1005 of FIG. 8 is described with reference to FIG. 10. To keep the explanation simple, the following describes the case in which there is one file to be imported (import file), that is, the case in which only one record of the data file management table 700 was updated (or added) in step 1003. In the following description of the import process, the import file that is the target of the import process (the process shown in FIG. 10) is called the “import source file”.

本実施例において、ファイル監視部２０３は、バックグラウンドインポート機能をサポートしている。バックグラウンドインポート機能は公知の機能で、インポート処理中に、インポート処理対象となっているテーブルに対する検索を可能とする機能である。この機能はたとえば、データベース管理プログラム１２０に含まれるインポートプログラムにより実現される機能である。たとえば以下で説明するステップ１１０１，１１０２は、サーバ１のＣＰＵ１１でインポートプログラムが実行されることにより行われる。またステップ１１０１，１１０２は公知の処理である。 In this embodiment, the file monitoring unit 203 supports a background import function. The background import function is a well-known function, and is a function that enables a search for a table that is an import process target during the import process. This function is realized by an import program included in the database management program 120, for example. For example, steps 1101 and 1102 described below are performed by the import program being executed by the CPU 11 of the server 1. Steps 1101 and 1102 are known processes.

ステップ１１０１：ファイル監視部２０３は、インポート処理対象のテーブルにチャンクを作成する。ここでは、空のチャンクが作成される。 Step 1101: The file monitoring unit 203 creates a chunk in the import processing target table. Here, an empty chunk is created.

ステップ１１０２：ファイル監視部２０３は、インポートソースファイルからインポート処理対象のテーブルにデータのインポートを行う。ここでインポートされるデータは、ステップ１１０１で作成されたチャンクに追加される。 Step 1102: The file monitoring unit 203 imports data from the import source file to the import processing target table. The data to be imported here is added to the chunk created in step 1101.

ステップ１１０３：インポートソースファイル内の全てのデータがインポートされた後、ファイル監視部２０３は、データファイル管理表７００のレコードのうち、インポートソースファイルについてのレコード（ファイルパス（７０３）にインポートソースファイルのファイル名が格納されているレコード）のインポート済みフラグ（７０７）を“Ｙ”に変更し、処理を終了する。 Step 1103: After all the data in the import source file has been imported, the file monitoring unit 203 records the import source file among the records in the data file management table 700 (the file path (703) contains the import source file). The imported flag (707) of the record in which the file name is stored is changed to “Y”, and the process ends.

バックグラウンドインポート機能、及び図１０のインポート処理で用いられるチャンクについて、図１１を用いて概要を説明する。図１１において、点線で囲まれた領域１５０１，１５０２はそれぞれチャンクを表している。チャンク１５０１はインポート処理の前から存在するもので、チャンク１５０２は、ステップ１１０１が実行された時に作成されたチャンク（空のチャンク）を表している。 An overview of the background import function and chunks used in the import process of FIG. 10 will be described with reference to FIG. In FIG. 11, areas 1501 and 1502 surrounded by dotted lines represent chunks. The chunk 1501 exists before the import process, and the chunk 1502 represents a chunk (empty chunk) created when step 1101 is executed.

During the import process (step 1102), the data stored in the import source file is added sequentially to the chunk 1502, and the content of the chunk 1501 does not change. Therefore, when the user issues a search request to the table while data is being imported (while step 1102 is being executed), the server 1 can perform the search process using the content of the chunk 1501 and return the search result to the user.

そしてステップ１１０２が完了すると、チャンク１５０２に格納されたデータも、検索処理における検索処理対象に変更される。そのためステップ１１０２の完了後に、ユーザがテーブルに対して検索要求を発行した時には、サーバ１はチャンク１５０１及びチャンク１５０２からデータの検索を行う。 When step 1102 is completed, the data stored in the chunk 1502 is also changed to the search process target in the search process. Therefore, when the user issues a search request to the table after the completion of step 1102, the server 1 searches for data from the chunk 1501 and the chunk 1502.
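The visibility behaviour of chunks during a background import can be illustrated with a small in-memory model. This is a sketch under assumptions and not how the database management program 120 stores data: the class `ChunkedTable` and its methods are hypothetical, and a chunk is modelled as a list of rows plus a flag indicating whether it is a search target.

```python
class ChunkedTable:
    """Toy model of a table whose data is grouped into chunks."""

    def __init__(self):
        self._chunks = []

    def begin_import(self):
        """Step 1101: create an empty chunk that is not yet searchable."""
        chunk = {"rows": [], "visible": False}
        self._chunks.append(chunk)
        return chunk

    def append(self, chunk, row):
        """Step 1102: add imported data to the new chunk only."""
        chunk["rows"].append(row)

    def finish_import(self, chunk):
        """After step 1102 completes, the chunk becomes a search target."""
        chunk["visible"] = True

    def search(self, predicate=lambda row: True):
        """Search only chunks whose import has completed."""
        return [
            row
            for chunk in self._chunks
            if chunk["visible"]
            for row in chunk["rows"]
            if predicate(row)
        ]
```

In this model, a search issued between `begin_import` and `finish_import` sees only the earlier chunks, which corresponds to the limitation that the search processing described next is intended to address.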

(5-2) Search Process
Next, the flow of the search process is described. As noted above, a database system that supports the known background import function can return search results without making the user wait, even if it receives a search request for the table being imported during the import process. However, as also noted above, when a search request for the table being imported is received from the user during the import process, the data to be searched is limited to the data in the chunk 1501 of FIG. 11 (the data that existed in the table when the import process started), and the data recorded in the import file cannot be searched. When the import file is extremely large, the import process takes a long time, so the period during which the data recorded in the import file cannot be used also becomes long.

その問題を解決するため、本実施例に係るサーバ１では、インポート処理中に検索要求を受け付けた時に、インポートファイルに格納されているデータの検索も可能にする。図１２は、本実施例に係るデータ処理システムで行われる検索処理の流れを示している。 In order to solve the problem, the server 1 according to the present embodiment enables the search of the data stored in the import file when a search request is accepted during the import process. FIG. 12 shows the flow of search processing performed in the data processing system according to the present embodiment.

Step 1301: When the query reception unit 204 receives an SQL query from the client 2 used by the user, it analyzes the query to determine whether the query needs to be rewritten. Specifically, the query reception unit 204 first determines whether the data file management table 700 contains a record in which the same table name as the one described in the FROM clause of the received SQL query is registered. If such a record exists, the query reception unit 204 determines whether the imported flag (707) of that record is “N”. A record whose imported flag (707) is “N” means that an import file for the table to be searched is stored in the storage device 4 but has not yet been imported into the table.

Step 1302: If the processing in step 1301 finds an import file that has not yet been imported into the table to be searched (step 1302: Yes), step 1303 is performed next. On the other hand, step 1303 is skipped if no import file information for the table to be searched is registered in the data file management table 700, or if such information is registered but the import file has already been imported (the imported flag (707) is “Y”).

ステップ１３０３：クエリ受付部２０４は、受け付けたＳＱＬをクエリ書換部２０５に渡す。クエリを渡されたクエリ書換部２０５は、クエリ書き換え処理を行うが、この処理は後述する。 Step 1303: The query receiving unit 204 passes the received SQL to the query rewriting unit 205. The query rewriting unit 205 to which the query is passed performs a query rewriting process, which will be described later.

ステップ１３０４：クエリ受付部２０４は、クエリをクエリ最適化部２０６に渡す。クエリ最適化部２０６では、クエリについての実行プランが生成される。この処理は公知のＲＤＢＭＳで行われるものと同様である。 Step 1304: The query reception unit 204 passes the query to the query optimization unit 206. The query optimization unit 206 generates an execution plan for the query. This process is the same as that performed in a known RDBMS.

ステップ１３０５：クエリの実行プランはクエリ実行部２０７に渡され、クエリ実行部２０７は実行プランに従って処理を行う。クエリ実行部２０７は、データベースアクセス部２０８や表関数処理部２０９を用いて、テーブル３００やインポートファイル３２からレコードの読み出しを行い、読み出されたレコードから、クエリで指定された条件に該当するレコードを抽出する。そしてクエリ実行部２０７は、抽出されたレコードをクエリ受付部２０４に返送する。クエリ受付部２０４は、返送されてきた結果をユーザ（クライアント２）へ出力する。 Step 1305: The query execution plan is passed to the query execution unit 207, and the query execution unit 207 performs processing according to the execution plan. The query execution unit 207 reads records from the table 300 and the import file 32 using the database access unit 208 and the table function processing unit 209, and records corresponding to the conditions specified by the query from the read records. To extract. Then, the query execution unit 207 returns the extracted record to the query reception unit 204. The query reception unit 204 outputs the returned result to the user (client 2).

これが検索処理の全体の流れである。以下では各ステップの詳細について説明していく。 This is the overall flow of the search process. The details of each step will be described below.
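The overall flow of steps 1301 to 1305 can be summarized in the following sketch. It is a simplified illustration, not the embodiment's implementation: the function names, the representation of management records as dictionaries, and the `rewrite`, `optimize`, and `execute` callables are all hypothetical. The destination table is passed in explicitly rather than parsed from the FROM clause, to avoid implying a particular SQL parser.

```python
def has_pending_import(management_rows, schema_name, table_ident):
    """Steps 1301-1302: is any file for this table still not imported?"""
    return any(
        row["schema_name"] == schema_name
        and row["table_ident"] == table_ident
        and row["imported"] == "N"
        for row in management_rows
    )


def handle_query(sql, schema_name, table_ident, management_rows,
                 rewrite, optimize, execute):
    """Route a query through rewrite (if needed), optimization, and execution."""
    # Steps 1301-1302: decide whether the query must be rewritten.
    if has_pending_import(management_rows, schema_name, table_ident):
        # Step 1303: rewrite so that the import file is also searched.
        sql = rewrite(sql)
    # Step 1304: generate an execution plan.
    plan = optimize(sql)
    # Step 1305: execute the plan and return the result to the caller.
    return execute(plan)
```

Keeping the rewrite as an optional stage in front of the optimizer reflects the description above, in which the optimization and execution stages are the same as in a known RDBMS.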

(5-3) Query Rewriting Process
This section describes the query rewriting process performed in step 1303. Before that, however, description examples are given of the query passed from the user (called the pre-rewrite query) and of the query after it has been rewritten in step 1303 (called the post-rewrite query).

図１３に記載のクエリは、ユーザがサーバ１に対して発行するＳＱＬクエリ（書き換え前クエリ）の例である。ＳＱＬは公知であるため、ここではクエリの概要のみ説明する。なお、図１３等に示されているクエリの各行の先頭に付されている番号は、説明のために付されている行番号である。 The query illustrated in FIG. 13 is an example of an SQL query (pre-rewrite query) issued by the user to the server 1. Since SQL is well known, only the outline of the query will be described here. Note that the numbers given to the top of each line of the query shown in FIG. 13 and the like are line numbers given for explanation.

The content of the pre-rewrite query in FIG. 13 is briefly described. The pre-rewrite query in FIG. 13 is a query whose search target is the table “USER1.T1” specified in the FROM clause. Hereinafter, the table specified in the FROM clause of a query is called the “search target table”. To simplify the explanation, this embodiment describes an example in which the user issues an SQL query with no WHERE clause (the clause that specifies the conditions of the records to be retrieved). The query in FIG. 13 instructs the server to extract and output all records in the search target table “USER1.T1”.

Attributes (columns) may also be specified after the SELECT clause of an SQL query, and when the server 1 receives such a query, it outputs only the specified columns. To simplify the explanation, however, this section describes an example in which an asterisk (*) is written after the SELECT clause of the SQL query (the case in which the user requests output of all attributes in each record).

FIG. 13 shows an example in which only one search target table is specified. In practice, a query whose FROM clause specifies multiple search target tables may also be received from the user. Such an example is described later.

In this embodiment, when the storage device 4 holds an import file 32 whose data is being imported into the search target table, the query rewriting unit 205 rewrites the SQL query received from the user so that the query execution unit 207, which performs the subsequent processing, acquires records from the import file 32 as well as from the table 300. To have the query execution unit 207 read the import file 32, the query rewriting unit 205 generates a query that uses a table function. FIG. 14 shows an example of a query, written with a table function, for reading the import file 32.

The query shown in FIG. 14 performs the same search as the pre-rewrite query of FIG. 13, but against the import file 32 (the file '/home/data_dir/aaa.csv'). Hereinafter, this query is called "partial query 2", and the query that retrieves data from the search target table specified in the pre-rewrite query is called "partial query 1". Partial query 1 has substantially the same content as the pre-rewrite query.

The query of FIG. 14 differs from the query of FIG. 13 only in the information specified in the FROM clause. Specifically, the search target table name (USER1.T1) specified in the query of FIG. 13 is replaced with the table function described below (lines 2 to 7 of FIG. 14). There is no other difference between the two.

The table function used here works as follows. The function ADB_CSVREAD(), written in the argument part of the table function TABLE() (lines 3 to 6 of FIG. 14), outputs each line read from the file (a CSV file) specified in its argument. TABLE() outputs those text lines as table-format data. Although FIG. 14 specifies only one file name in the argument of ADB_CSVREAD(), multiple file names can be specified. For example, ADB_CSVREAD(MULTISET['xxx.csv', 'aaa.csv']) causes the files xxx.csv and aaa.csv to be read. The argument "COMPRESSION_FORMAT=NONE" on line 5 is specified when the file named in the argument is not compressed; when the import file 32 to be read is compressed, an argument such as "COMPRESSION_FORMAT=GZIP" is written instead.

The query rewriting unit 205 creates partial query 2 described above, and then creates a query that obtains the union of the output of partial query 2 and the output of the query that searches the records in the table 300 (partial query 1). In this embodiment, this is called the post-rewrite query. FIG. 15 is an example of the query obtained by rewriting the pre-rewrite query of FIG. 13. The part labeled "partial query 1" in the figure is the query that searches the records in the table 300; in this example, its content is the same as the pre-rewrite query. Line 5 onward of FIG. 15 (partial query 2) is the query that retrieves records from the import file 32 and has the same content as FIG. 14. The post-rewrite query joins partial query 1 and partial query 2 with the "UNION ALL" operator (the operator for obtaining a union) written on line 3.
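The composition of partial query 1 and partial query 2 can be sketched in Python as simple string assembly. This is a minimal illustration under stated assumptions, not the implementation of the query rewriting unit 205: the helper names `table_function_source` and `rewrite_single_table_query` are hypothetical, and the exact argument syntax of ADB_CSVREAD() in the generated text is assumed, since the figures are not reproduced here.

```python
def table_function_source(file_paths, compression="NONE"):
    """Return a FROM-clause fragment that reads CSV import files.

    The argument layout of ADB_CSVREAD is an assumption for illustration.
    """
    files = ", ".join(f"'{p}'" for p in file_paths)
    return (
        f"TABLE(ADB_CSVREAD(MULTISET[{files}], "
        f"'COMPRESSION_FORMAT={compression}'))"
    )


def rewrite_single_table_query(select_list, table_name, file_paths):
    """Join partial query 1 (table) and partial query 2 (files)."""
    partial_query_1 = f"SELECT {select_list} FROM {table_name}"
    partial_query_2 = (
        f"SELECT {select_list} FROM {table_function_source(file_paths)}"
    )
    # UNION ALL keeps every record from both sources.
    return f"{partial_query_1}\nUNION ALL\n{partial_query_2}"


rewritten = rewrite_single_table_query(
    "*", "USER1.T1", ["/home/data_dir/aaa.csv"]
)
print(rewritten)
```

Only the FROM clause differs between the two partial queries, which mirrors the relationship between FIG. 13 and FIG. 14 described above.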

The post-rewrite query described above is only an example, and the generated query need not be identical to the one shown in FIG. 15 and elsewhere. It suffices to generate a query that can retrieve the records satisfying the conditions of the pre-rewrite query from both the table 300 and the import file 32. In addition, the processing that the query rewriting unit 205 of this embodiment actually performs differs slightly from the above, and the resulting post-rewrite query differs in some respects from the one in FIG. 15. The actual processing of the query rewriting unit 205 is described below.

The flow of the query rewriting process performed by the query rewriting unit 205 is described with reference to FIG. 16. The description uses the query of FIG. 17 as the pre-rewrite query. FIG. 17 generalizes the query example of FIG. 13 and specifies multiple search target tables. In an actual query, the FROM clause contains concrete table names specified by the user; to describe the rewriting method in general terms, the description below uses a query in which the table names are replaced with variables.

To keep the description simple, the example pre-rewrite query has no WHERE clause and an asterisk in the SELECT clause.

In FIG. 17, the variables ($B1), ($B2), ..., ($Bn) in the FROM clause each represent the name of a table to be searched; each corresponds to one table name. In other words, the query of FIG. 17 is an example with multiple search target tables. Hereinafter, ($Bx) may also be called "table ($Bx)", where x is an integer from 1 to n.

The following describes how the query rewriting unit 205 rewrites the query of FIG. 17 when, among the tables ($B1), ($B2), ..., ($Bn) specified in the FROM clause, data is being imported into table ($B1) from a file (named "/home/data_dir/aaa.csv"). FIG. 18 shows the result of rewriting the pre-rewrite query of FIG. 17: lines 3 to 4 are partial query 1 and lines 8 to 14 are partial query 2.

Step 1401: The query rewriting unit 205 parses the SQL query received from the query receiving unit 204 and identifies one search target table (a table listed in the FROM clause). The table identified here is a search target table that has not yet been subjected to the processing of step 1403 onward.

Step 1402: If step 1401 found a search target table that has not yet undergone the processing of step 1403 onward (step 1402: No), the query rewriting unit 205 proceeds to step 1403. If the processing of step 1403 onward has already been performed for every search target table, so that step 1401 could not identify a table to process (step 1402: Yes), the process ends.

The description of step 1403 onward below assumes that table ($B1) was identified in step 1401.

Step 1403: The query rewriting unit 205 searches the data file management table 700 for records, using the table name ($B1) identified in step 1401 as the key. Specifically, it searches the data file management table 700 for records whose table name (that is, the combination of schema name (701) and table identifier (702)) matches the table name identified in step 1401 (that is, ($B1)). This search may find multiple records, which is the case where the storage device 4 holds multiple import files 32.

Step 1404: The query rewriting unit 205 determines whether the records found in step 1403 include a record whose imported flag (707) is "N". If there is at least one such record (step 1404: Yes), step 1405 is performed next. Conversely, if none of the records found in step 1403 has an imported flag (707) of "N", or if the data file management table 700 contains no record whose table name (the combination of schema name (701) and table identifier (702)) is ($B1) (step 1404: No), the query rewriting unit 205 returns to step 1401.

Step 1405: The query rewriting unit 205 generates partial query 1 from the accepted query and the table name identified in step 1401. As described earlier, partial query 1 retrieves records from the search target table (not from an import file).

As shown in FIG. 18, when the accepted query is that of FIG. 17 and table ($B1) is identified in step 1401, the following is generated as partial query 1:
SELECT * FROM ($B1)
That is, the SELECT clause of partial query 1 (line 3) carries the same content as the accepted query (that is, "*"). The FROM clause names the table ($B1) identified in step 1401 from among the multiple search target tables specified in the pre-rewrite query.

Step 1407: The query rewriting unit 205 generates partial query 2 using the file path (703) of the data file management table 700 records found in step 1403. The information specified in the SELECT clause of partial query 2 is the same as in partial query 1. The FROM clause of partial query 2 (lines 9 to 14 of FIG. 18) specifies a table function that takes the file path (703) as an argument.

Step 1408: The query rewriting unit 205 generates a query that outputs the union of the results of partial query 1 and partial query 2 created so far. Specifically, as shown in FIG. 18 (lines 3 to 15), it generates a query that joins partial query 1 and partial query 2 with the "UNION ALL" operator. It then replaces the search target table name in the FROM clause of the pre-rewrite query with the generated query. For example, if table ($B1) was selected in step 1401, the ($B1) part of the FROM clause of the pre-rewrite query is replaced with the generated query. FIG. 18 shows the example in which ($B1) has been replaced.

The query rewriting unit 205 then executes the processing from step 1401 again. Once the above processing has been performed for every search target table specified in the pre-rewrite query, the post-rewrite query is complete and the query rewriting process ends.
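The loop over steps 1401 to 1408 can be sketched in Python as follows. This is a simplified illustration, not the actual processing of the query rewriting unit 205: the `FileRecord` class and the `rewrite_query` function are hypothetical, the management table is reduced to four fields, the ADB_CSVREAD() text is schematic, and the sketch assumes that only files whose imported flag is "N" are passed to the table function.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    """One row of a simplified data file management table (hypothetical)."""
    schema: str
    table_id: str
    file_path: str
    imported: str  # "Y" once the import has completed, otherwise "N"


def rewrite_query(select_list, tables, management_table):
    """Rewrite each FROM-clause table that still has pending import files."""
    sources = []
    for table in tables:                                    # steps 1401-1402
        schema, table_id = table.split(".", 1)
        records = [r for r in management_table              # step 1403
                   if (r.schema, r.table_id) == (schema, table_id)]
        # Assumption: only files not yet imported are read.
        pending = [r.file_path for r in records if r.imported == "N"]
        if not pending:                                     # step 1404: No
            sources.append(table)
            continue
        partial_1 = f"SELECT {select_list} FROM {table}"    # step 1405
        files = ", ".join(f"'{p}'" for p in pending)
        partial_2 = (f"SELECT {select_list} FROM "          # step 1407
                     f"TABLE(ADB_CSVREAD(MULTISET[{files}]))")
        sources.append(f"({partial_1} UNION ALL {partial_2})")  # step 1408
    return f"SELECT {select_list} FROM " + ", ".join(sources)


management_table = [
    FileRecord("USER1", "T1", "/home/data_dir/aaa.csv", "N"),
    FileRecord("USER1", "T1", "/home/data_dir/old.csv", "Y"),
]
result = rewrite_query("*", ["USER1.T1", "USER1.T2"], management_table)
print(result)
```

In this sketch, a table with no pending import file is left unchanged in the FROM clause, which corresponds to returning to step 1401 when step 1404 yields No.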

(5-4) Execution Plan Generation and Optimization
Finally, the generation and execution of the execution plan performed in steps 1304 and 1305 are described. Because execution plan generation in this embodiment does not differ greatly from that of known RDBMSs, only an outline of plan generation and execution is given.

In the data processing system of this embodiment, the server 1 has multiple CPUs 11 and can therefore execute some processes in parallel. Accordingly, when the query optimization unit 206 generates an execution plan and finds multiple processes that can be parallelized, it generates a plan in which those processes run in parallel.

Processes that can be parallelized are, for example, processes with no dependency on one another. Conversely, processes that depend on each other cannot run in parallel. For example, in the post-rewrite query of FIG. 15 and elsewhere, there is no dependency between partial query 1 and partial query 2. Executing partial query 1 requires reading the table 300 in the storage device 3, and executing partial query 2 requires reading the import file 32 in the storage device 4, but the two reads do not depend on each other (there is no constraint such as the import file 32 being unreadable until the table 300 has been read). The query optimization unit 206 therefore generates an execution plan that reads the table 300 and the import file 32 in parallel and has the query execution unit 207 execute it. On receiving such a plan, the query execution unit 207 creates, for example, a task (thread) that reads the table 300 and a task that reads the import file 32, and runs both in parallel.

Likewise, when partial query 2 has multiple (for example, m) import files 32 to read, the m import files 32 can be read in parallel. The query optimization unit 206 therefore generates an execution plan that reads the import files 32 in parallel and has the query execution unit 207 execute it.
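The parallel reads described above can be illustrated with a small Python sketch that uses a thread pool. It is only a model of the idea: the table 300 is represented by an in-memory list, the functions `read_table`, `read_import_file`, and `run_union_all` are hypothetical names, and a real RDBMS execution plan would be considerably more involved.

```python
import csv
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_table(rows):
    """Stand-in for reading the table; here the table is just a list."""
    return list(rows)


def read_import_file(path):
    """Read every record of one CSV import file."""
    with open(path, newline="") as f:
        return [tuple(row) for row in csv.reader(f)]


def run_union_all(table_rows, import_paths):
    """Run the table read and each file read as independent tasks."""
    with ThreadPoolExecutor() as pool:
        table_future = pool.submit(read_table, table_rows)
        file_futures = [pool.submit(read_import_file, p) for p in import_paths]
        # Results are collected in submission order, so output is stable.
        result = table_future.result()
        for future in file_futures:
            result.extend(future.result())
    return result


with tempfile.TemporaryDirectory() as d:
    paths = []
    for name, rows in [("aaa.csv", [("3", "c")]), ("bbb.csv", [("4", "d")])]:
        path = os.path.join(d, name)
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(rows)
        paths.append(path)
    records = run_union_all([("1", "a"), ("2", "b")], paths)

print(records)
```

Because none of the tasks depends on another, they can all be submitted before any result is awaited.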

This concludes the description of the data processing system according to the first embodiment. Because this system performs the processing described above, when a query is received from the user while the data in an import file is being imported into a table, the system can search the records in the import file as well as the records in the table.

The above describes an example in which a dedicated program called the import program, included in the database management program 120, is executed so that data import from the import file to the table runs in the background. Methods other than a dedicated program may also be used. For example, the import may be performed by creating an SQL query that inserts the records of the import file into the table and having the server 1 execute it.

FIG. 23 shows an example of an SQL query for storing the data of an import file in a table, for the case where the import destination table is "USER1.T1" and the import file is "/home/data_dir/aaa.csv". To implement the import with such an SQL query, the file monitoring unit 203 creates a query like that of FIG. 23 instead of performing steps 1101 and 1102 of FIG. 10, and has it executed through the query optimization unit 206 and the query execution unit 207.
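As a rough analogue of importing through an SQL statement rather than a dedicated import program, the following Python sketch inserts CSV records into a table with the standard sqlite3 module. SQLite is used here only as a stand-in: the table name `t1`, its two columns, and the in-memory file contents are assumptions, and the query of FIG. 23 itself is not reproduced.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, name TEXT)")

# Stand-in for the contents of an import file such as aaa.csv.
import_file = io.StringIO("1,alpha\n2,beta\n")

rows = [(int(r[0]), r[1]) for r in csv.reader(import_file)]
with conn:  # commit the whole import as one transaction
    conn.executemany("INSERT INTO t1 (id, name) VALUES (?, ?)", rows)

imported = conn.execute("SELECT id, name FROM t1 ORDER BY id").fetchall()
print(imported)
```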

Next, the data processing system according to the second embodiment is described. Its hardware configuration and functional block configuration are the same as those of the first embodiment, so they are not illustrated.

In the data processing system according to the second embodiment, part of the query rewriting process differs from the first embodiment; all other points are the same. The following describes how the query rewriting process of the second embodiment differs from that of the first.

FIG. 19 shows an example of partial query 2 generated by the query rewriting process of the second embodiment. This example is for the case where the server 1 receives the query of FIG. 13 as the pre-rewrite query, that is, a query whose search target table name is "USER1.T1" (schema name "USER1", table identifier "T1"). Partial query 2 of FIG. 19 differs from partial query 2 of the first embodiment (see, for example, FIG. 15 and FIG. 18) in what is specified as the argument of the function ADB_CSVREAD(). In the first embodiment, the argument of ADB_CSVREAD() is a file name. In partial query 2 of FIG. 19, an SQL query is written in place of the file name, as shown on lines 5 to 9 of FIG. 19. Hereinafter, this query is called "partial query 3".

As shown in FIG. 19, partial query 3 extracts from the data file management table 700 the file path (703) of the records whose schema name (701) is "USER1", whose table identifier (702) is "T1", and whose imported flag (707) is "N". It performs the same search as step 1403 of the query rewriting process shown in FIG. 16.

The flow of the query rewriting process of the second embodiment, in particular its differences from the first embodiment, is described with reference to FIG. 20. The process differs in that step 1406 is added and step 1407' is performed in place of step 1407 of the first embodiment. The other processing is the same as in the first embodiment, so only steps 1406 and 1407' are described below.

After step 1405, the query rewriting unit 205 executes step 1406. In step 1406, as described above, it generates partial query 3, which performs the same search as step 1403.

Then, in step 1407', the query rewriting unit 205 generates partial query 2 containing partial query 3. The processing of step 1407' is the same as step 1407 of the first embodiment, except that partial query 3 is written as the argument of the function ADB_CSVREAD().

When partial query 2 containing partial query 3 is processed by the query optimization unit 206 and the query execution unit 207, the search written in partial query 3 (the record search in the data file management table 700) is performed first. This identifies the file names of the import files, and the search written in partial query 2 is then performed on the identified files.

Whichever of the query rewriting processes of the first and second embodiments is used, data is retrieved from the import file being imported at query execution time (the processing in the query execution unit 207). In most cases, therefore, the user obtains the same result in the first and second embodiments. However, there is a slight delay between the query rewriting process and the execution of the post-rewrite query by the query execution unit 207, so the following situation can arise in the first embodiment.

An example of the query rewriting process in the data processing system of the first embodiment is described with reference to FIG. 12 and FIG. 16, which were used for the first embodiment. The case considered is one where step 1403 of the query rewriting process (FIG. 16) finds one matching record in the data file management table 700 and the file path (703) of that record is "aaa.csv".

In this case, partial query 2 generated in step 1407 of the query rewriting process has the file name "aaa.csv" written as the argument of the function ADB_CSVREAD(). As a result, when the query execution unit 207 processes the query, it reads the file "aaa.csv" and searches the records read from it.

However, the import of the file "aaa.csv" may complete after the query rewriting process has run (after step 1303 of FIG. 12) and before the processing of the query execution unit 207 (step 1305 of FIG. 12) starts. Because the import has already reflected the contents of "aaa.csv" in the table, reading the file is unnecessary in this case. In the first embodiment, however, the file name "aaa.csv" is written in partial query 2 as the argument of ADB_CSVREAD(), so the query execution unit 207 reads the file even though its import has completed (an unnecessary file read occurs).

In the second embodiment, by contrast, partial query 2 is generated as in FIG. 19 and does not contain the file name directly. The file name is determined when the processing of the query execution unit 207 (step 1305 of FIG. 12) starts. If the import of "aaa.csv" has completed by then, the imported flag (707) of the data file management table 700 record whose file path (703) is "aaa.csv" has been changed to "Y". Consequently, when the query execution unit 207 of the second embodiment executes partial query 2, the file "aaa.csv" is not read, and the second embodiment thus suppresses unnecessary file reads.
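The timing difference between the two embodiments can be reproduced with a small Python sketch that uses the standard sqlite3 module as a stand-in for the data file management table 700. The table name `data_file_mgmt`, its column names, and the helper `pending_files` are hypothetical; the lookup query plays the role of partial query 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_file_mgmt ("
    "schema_name TEXT, table_id TEXT, file_path TEXT, imported TEXT)"
)
conn.execute(
    "INSERT INTO data_file_mgmt VALUES ('USER1', 'T1', 'aaa.csv', 'N')"
)

# Counterpart of partial query 3: look up files not yet imported.
PENDING_FILES_SQL = (
    "SELECT file_path FROM data_file_mgmt "
    "WHERE schema_name = ? AND table_id = ? AND imported = 'N'"
)


def pending_files(schema, table_id):
    rows = conn.execute(PENDING_FILES_SQL, (schema, table_id)).fetchall()
    return [r[0] for r in rows]


# First-embodiment style: file names are fixed at rewrite time.
files_fixed_at_rewrite = pending_files("USER1", "T1")

# The import finishes between rewriting and execution.
conn.execute(
    "UPDATE data_file_mgmt SET imported = 'Y' WHERE file_path = 'aaa.csv'"
)

# Second-embodiment style: the lookup runs at execution time.
files_resolved_at_execution = pending_files("USER1", "T1")

print(files_fixed_at_rewrite)
print(files_resolved_at_execution)
```

The list fixed at rewrite time still names "aaa.csv", while the list resolved at execution time is empty, so the already-imported file is not read again.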

Next, the data processing system according to the third embodiment is described. Its hardware configuration and functional block configuration are the same as those of the first or second embodiment, so they are not illustrated. In the third embodiment, part of the search process differs from the first embodiment; all other points are the same. The following describes how the search process of the third embodiment differs from that of the first.

FIG. 21 shows the flow of the search process performed in the data processing system according to the third embodiment.

Step 2301: On receiving an SQL query from the client 2 used by the user, the query receiving unit 204 identifies the search target table specified in the query. It then refers to SQL_TABLES (510) and identifies the import file storage directory path (516) of the search target table.

Step 2302: The query receiving unit 204 calls the file monitoring unit 203 and has it check the files in the directory identified in step 2301. This processing is the same as step 1003 (FIG. 8 and FIG. 9) of the first embodiment: it checks whether a new import file has been stored in the directory identified in step 2301, or whether an import file already in the directory has been updated. As in the first embodiment, if a new import file has been stored (or an import file has been updated), the data file management table 700 is updated.

Step 2303: The query receiving unit 204 determines whether the received query needs to be rewritten. This processing is the same as step 1301: the query receiving unit 204 determines whether the data file management table 700 contains a record that has the same table name as the one written in the FROM clause of the received SQL query and whose imported flag (707) is "N" (that is, whether there is an import file that has not yet been imported into the search target table).

Step 2304: If step 2303 finds an import file that has not yet been imported into the search target table (step 2304: Yes), step 2305 is performed next. If the data file management table 700 has no import file information registered for the search target table, or has such information registered but the import file has already been imported (the imported flag (707) is "Y"), step 2305 is skipped.

Step 2305: Query rewriting is performed. This processing is the same as step 1303.

Step 2306: This processing is the same as step 1304, so its description is omitted here.

Step 2307: This processing is the same as step 1305, so its description is omitted here.
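The flow of steps 2302 to 2305 can be illustrated with the following minimal Python sketch. It is not the implementation of the embodiment. The names `FileRecord`, `check_directory`, `needs_rewrite`, `rewrite_query`, and `handle_query`, and the table function name `IMPORT_FILE_ROWS`, are hypothetical. The sketch also assumes that an updated file is detected by comparing modification times, that an updated file is flagged "N" again, and that the rewritten query combines the partial queries with UNION ALL.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    """One row of a simplified data file management table (hypothetical layout)."""
    table_name: str      # table the file is to be imported into
    file_name: str       # import file name
    mtime: float         # modification time last seen by the file check
    imported: str = "N"  # imported flag: "N" = not yet imported, "Y" = imported


def check_directory(records, table_name, listing):
    """Step 2302: register new or updated import files as not yet imported."""
    known = {r.file_name: r for r in records if r.table_name == table_name}
    for file_name, mtime in listing.items():
        rec = known.get(file_name)
        if rec is None:
            # A new import file was stored in the directory.
            records.append(FileRecord(table_name, file_name, mtime))
        elif mtime > rec.mtime:
            # Assumption: an updated file must be searched (and imported) again.
            rec.mtime = mtime
            rec.imported = "N"


def needs_rewrite(records, table_name):
    """Steps 2303 and 2304: is any file for this table still flagged "N"?"""
    return any(r.table_name == table_name and r.imported == "N" for r in records)


def rewrite_query(table_name, where, records):
    """Step 2305: combine the table search with a search of each pending file."""
    parts = [f"SELECT * FROM {table_name} WHERE {where}"]
    for r in records:
        if r.table_name == table_name and r.imported == "N":
            # IMPORT_FILE_ROWS is a hypothetical table function name.
            parts.append(
                f"SELECT * FROM IMPORT_FILE_ROWS('{r.file_name}') WHERE {where}"
            )
    return " UNION ALL ".join(parts)


def handle_query(table_name, where, records, listing):
    """Run the file check first, then rewrite the query only if needed."""
    check_directory(records, table_name, listing)
    if needs_rewrite(records, table_name):
        return rewrite_query(table_name, where, records)
    return f"SELECT * FROM {table_name} WHERE {where}"
```

In this sketch, `listing` stands in for the directory contents identified in step 2301, given as a mapping from file name to modification time. Because the file check runs on every query, a file that has just been stored is already flagged "N" by the time the rewrite decision is made.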

The difference between the third embodiment and the first (or second) embodiment is outlined with reference to FIG. 22. In the data processing system according to the third embodiment, the point at which an import file becomes a search target differs from that in the first and second embodiments. FIG. 22 is a conceptual diagram showing the flow from when the import file 32 (file name "aaa.csv") is stored in the storage device 4 until its data is stored in the import-target table 300 (table name "USER1.T1"), together with the search targets at each point in time for each embodiment. In this description, the query issued by the user is a query that searches for records in the table "USER1.T1" (for example, the query shown in FIG. 13).

FIG. 22(a) shows an example in which the import file 32 is stored in the storage device 4 from the data source 8 at time t0, and the import into the table 300 is performed at time t2 (t2 > t0). If a query is received from the user at or after time t2 (for example, at time t1' in FIG. 22(b)), the data processing systems according to the first to third embodiments all treat both the table 300 and the import file 32 as search targets, and both the records in the table 300 and the records in the import file 32 are output to the user.

On the other hand, suppose a query is received from the user before time t2 (the time at which import processing starts), for example at time t1 (where t0 < t1 < t2). In that case, the data processing system according to the first or second embodiment treats only the table 300 as a search target, and the records in the import file 32 are not searched. Because import processing is executed periodically, it does not start immediately after the import file 32 is stored in the storage device 4. As a result, there is a period during which the user cannot access the records in the import file 32 even though the import file 32 is already stored in the storage device 4.

The data processing system according to the third embodiment, however, performs steps 2301 and 2302 when it receives a search request. If the import file 32 has been newly stored or has been updated, the system makes the import file 32 a search target by performing query rewriting. The user can therefore access the records in the import file 32 as soon as the import file 32 is stored in the storage device 4.
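The comparison described with reference to FIG. 22 can be summarized in a small Python sketch. The function name `search_targets` and the string labels are hypothetical, and the sketch assumes that the query arrives before the import of the file has completed, since the passage above only describes that period.

```python
def search_targets(query_time, t0, t2, embodiment):
    """Return what a query searches, per the description of FIG. 22.

    t0: time the import file is stored in the storage device.
    t2: time the periodic import processing starts (t2 > t0).
    embodiment: 1, 2, or 3.
    Assumption: the import has not yet completed at query_time.
    """
    if query_time < t0:
        # The import file does not exist yet.
        return ["table"]
    if embodiment in (1, 2) and query_time < t2:
        # First and second embodiments: the file is not yet known to the
        # system, because it is only detected when import processing starts.
        return ["table"]
    # Third embodiment at any time from t0, or any embodiment from t2:
    # both the table and the import file are searched.
    return ["table", "import file"]
```

For a query at time t1 with t0 < t1 < t2, the sketch returns only the table for the first and second embodiments and both targets for the third, which is the period during which the third embodiment gives the user earlier access to the file's records.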

Several embodiments have been described above, but they are examples for explaining the present invention and are not intended to limit the scope of the present invention to these embodiments alone. That is, the present invention can also be implemented in various other forms.

For example, in the data processing systems according to the embodiments described above, a client is provided separately from the DB server, and the user uses the input and output devices of the client. However, providing a client is not essential, and the client program may instead be executed on the DB server. In that case, the user issues information search requests using the input/output devices of the DB server.

The number of DB servers is also not limited to one. A plurality of DB servers may be provided in the data processing system, and search processing and the like may be executed in parallel by the plurality of DB servers.

The various programs described above may be provided by a program distribution server or on a computer-readable storage medium, and installed on each device that executes them. The computer-readable storage medium is a non-transitory computer-readable medium, for example a non-volatile storage medium such as an IC card, an SD card, or a DVD.

In addition, some or all of the processing of the programs described in the above embodiments may be implemented by dedicated hardware.

1: Server, 2: Client, 3, 4: Storage device, 5: SAN, 6: LAN, 7: WAN, 8: Data source

Claims

A data processing system comprising a server, a first storage device storing one or more tables each including a plurality of columns, and a second storage device for storing an import file having data to be imported into the table, wherein
when the server accepts a search request, the server:
determines whether an import-incomplete file, which is an import file that has data to be imported into the table targeted by the search request and that has not yet been imported into the table, exists in the second storage device;
if the import-incomplete file does not exist, searches the table for the data specified in the search request; and
if the import-incomplete file exists, searches the table for the data specified in the search request and also searches the import-incomplete file for the data specified in the search request.

The data processing system according to claim 1, wherein, when an import-incomplete file having data to be imported into the table targeted by the search request exists in the second storage device, the server:
generates a first partial query that searches the table for records meeting the conditions specified in the search request;
generates a second partial query that searches the import-incomplete file for records meeting the conditions specified in the search request;
generates a query that obtains the union of the output results of the first and second partial queries; and
executes processing for the generated query.

The data processing system according to claim 2, wherein the second partial query includes a table function that outputs the data of the import-incomplete file in the format of the table.

The data processing system according to claim 2, wherein the server analyzes the generated query, generates at least a task for the first partial query and a task for the second partial query, and executes the generated tasks in parallel.

The data processing system according to claim 2, wherein
the server manages a data file management table that records, in association with one another, the table name of the table and the file name and processing state of an import file having data to be imported into the table,
when the server detects that a new import file has been stored in the second storage device, the server records the file name of the import file in the data file management table and records that the processing state of the import file is import-incomplete, and
when import processing of the import file is completed, the server records in the data file management table that the processing state of the import file is import-completed.

The data processing system according to claim 5, wherein, when generating the second partial query, the server:
generates a third partial query that searches the data file management table for import files whose processing state is import-incomplete, among the import files having data to be imported into the table targeted by the search request; and
generates, as the second partial query, a partial query that includes the third partial query.

The data processing system according to claim 5, wherein, when the server receives a search request, the server determines whether a new import file has been stored in the second storage device.

A data search method for a data processing system comprising a first storage device that stores one or more tables each including a plurality of columns, and a second storage device that stores an import file having data to be imported into the table, the method comprising, by the data processing system:
a) accepting a search request;
b) determining whether an import-incomplete file, which is an import file that has data to be imported into the table targeted by the search request and that has not yet been imported into the table, exists in the second storage device;
c-1) if the import-incomplete file does not exist, searching the table for the data specified in the search request; and
c-2) if the import-incomplete file exists, searching the table for the data specified in the search request and also searching the import-incomplete file for the data specified in the search request.

The data search method according to claim 8, wherein, in step c-2), the data processing system:
generates a first partial query that searches the table for records meeting the conditions specified in the search request;
generates a second partial query that searches the import-incomplete file for records meeting the conditions specified in the search request;
generates a query that obtains the union of the output results of the first and second partial queries; and
executes processing for the generated query.

The data search method according to claim 9, wherein the second partial query includes a table function that outputs the data of the import-incomplete file in the format of the table.

The data search method according to claim 9, wherein
the data processing system manages a data file management table that records, in association with one another, the table name of the table and the file name and processing state of an import file having data to be imported into the table, and
the data processing system further executes:
A) detecting that a new import file has been stored in the second storage device;
B) when step A) detects that a new import file has been stored, recording the file name of the import file in the data file management table and recording that the processing state of the import file is import-incomplete; and
C) when import processing of the import file is completed, recording in the data file management table that the processing state of the import file is import-completed.

The data search method according to claim 11, wherein, when generating the second partial query, the data processing system:
generates a third partial query that searches the data file management table for import files whose processing state is import-incomplete, among the import files having data to be imported into the table targeted by the search request; and
generates, as the second partial query, a partial query that includes the third partial query.

The data search method according to claim 11, wherein steps A) and B) are performed between steps a) and b).

A computer-readable storage medium storing a program that causes a processor of a data processing system, the data processing system comprising a first storage device storing one or more tables each including a plurality of columns and a second storage device storing an import file having data to be imported into the table, to execute:
a) accepting a search request;
b) determining whether an import-incomplete file, which is an import file that has data to be imported into the table targeted by the search request and that has not yet been imported into the table, exists in the second storage device;
c-1) if the import-incomplete file does not exist, searching the table for the data specified in the search request; and
c-2) if the import-incomplete file exists, searching the table for the data specified in the search request and also searching the import-incomplete file for the data specified in the search request.

The computer-readable storage medium according to claim 14, wherein the program causes the processor, in step c-2), to execute:
generating a first partial query that searches the table for records meeting the conditions specified in the search request;
generating a second partial query that searches the import-incomplete file for records meeting the conditions specified in the search request;
generating a query that obtains the union of the output results of the first and second partial queries; and
executing processing for the generated query.