JP5069525B2 - Data processing system

Info

Publication number: JP5069525B2
Application number: JP2007235421A
Authority: JP
Inventors: 裕三石田; 麻衣中山; 由実杉山
Original assignee: Nomura Research Institute Ltd
Current assignee: Nomura Research Institute Ltd
Priority date: 2007-09-11
Filing date: 2007-09-11
Publication date: 2012-11-07
Anticipated expiration: 2027-09-11
Also published as: JP2009069971A

Description

The present invention relates to a data processing system, and more particularly to a technique for speeding up search processing and the like by caching the classification scheme of records managed by a DB server as an index in an AP server, and by specifying the primary keys of the records when issuing an SQL statement to the DB server.

With the advance of client-server systems, so-called three-tier client-server systems have become widespread in order to meet demands for larger-scale information processing. In addition to the client that displays data, such systems include an AP server that processes data and a DB server that stores data.
To improve processing speed, the load is also distributed by arranging a plurality of AP servers in parallel.
JP 2005-165610 A

Because an AP server can be built from inexpensive PC servers, it is relatively easy to raise processing speed by increasing the number of installed servers. A DB server, however, must keep its data synchronized and therefore cannot be moved to distributed processing as easily as an AP server. Database system vendors have of course applied various techniques to speed up the DB server itself in both software and hardware, and have achieved a certain degree of success, but the price of the system inevitably rises accordingly. As long as the scale of the databases handled by client-server systems continues to grow, disk I/O (data read/write) speed is expected eventually to become a bottleneck, and a time will come when improving DB server performance can no longer keep up.

In a DB server, each record is generally given a unique primary key together with a plurality of classification attributes that define the position of the record. For example, in the slip management table shown in FIG. 11, each record is assigned a unique primary key called the “slip number”, and a target record can be extracted directly by specifying this slip number. For convenience of search, classification attributes of fiscal year, corporation, and slip type are also assigned. As a result, even a user who does not know a specific slip number can extract a plurality of records matching a search condition from the DB server by specifying “fiscal year × corporation × slip type” as the search condition.

On the other hand, when an SQL statement instructs record extraction by specifying only the classification attributes of fiscal year, corporation, and slip type, the DB server must execute a process called a full scan, in which it scans all records from the first record onward and extracts those that match the specified fiscal year, corporation, and slip type. This takes a corresponding amount of time.

Moreover, because classification attribute fields must be provided for every record, the data volume naturally grows accordingly, which bloats the table. In the case of a major retail group, slips number in the tens of millions per year, so holding data such as “2006” and “H001” redundantly in every record consumes an enormous amount of resources.

In addition, once a classification scheme has been assigned, revising it partway through requires changes to the table definition (adding, changing, or deleting columns). The scheme must therefore be retained over a long period, which impairs the flexibility of the system.

Furthermore, to speed up search processing, indexes (index information) are generally created in the DB server according to the combination patterns of the classification attributes. When there are many classification attribute fields and a variety of search patterns is expected, however, the index information itself becomes complex and bloated, as with secondary indexes, tertiary indexes, and so on, and this puts pressure on the disk capacity of the DB server. These indexes must also be rebuilt whenever records are added, and the resulting CPU load cannot be ignored.

The present invention has been devised to solve the above problems of conventional data processing systems. Its object is to provide a data processing system that speeds up search processing by reducing the frequency of full scans in the DB server without depending on performance improvements of the DB server itself, that does not put pressure on the disk capacity of the DB server with classification attribute fields in each record, that can flexibly accommodate changes to the record classification scheme, and that does not require secondary or tertiary indexes to be created.

To achieve the above object, the data processing system according to claim 1 is a data processing system comprising a DB server and an AP server. The DB server comprises a first table storing first records to each of which a primary key is assigned, and a second table storing second records that define hierarchically structured classification attributes of the first records. The AP server comprises: data reading means for issuing an SQL statement to the DB server to instruct reading of the second records stored in the second table; index generating means for generating a tree-structured index based on the second records transmitted from the DB server and storing it in a storage device of the AP server; means for referring to the index, when a search condition combining classification attributes is transmitted from a client terminal, and acquiring the primary keys of the first records that match the search condition; and means for generating an SQL statement specifying the primary keys of the first records and issuing it to the DB server. Each second record has a parent item in which a higher-level classification attribute is described and a child item in which a lower-level classification attribute directly connected to it is described. The second records have a recursive structure in which the classification attribute described in the child item of one record is described as the classification attribute of the parent item of another record, and in which the primary key of a first record is described in the child item of a record whose parent item describes a lowest-level classification attribute. The AP server further comprises means for adding, to the necessary classification attributes stored in the second records, unique information that gives uniqueness to those classification attributes, and means for deleting the unique information from each classification attribute when the index is generated.

The data processing system according to claim 2 is the system of claim 1, wherein the unique information is a serial number consisting of mutually non-overlapping numerical values.

The data processing system according to claim 3 is the system of claim 1, wherein the unique information is path information formed by concatenating the classification attributes positioned above each classification attribute.

In the data processing system according to claim 1, the hierarchically structured classification attributes of the first records stored in the first table of the DB server are cached in the memory or on the disk of the AP server. When instructing the DB server to perform a search, it is therefore possible to specify the primary keys of the first records and issue an SQL statement whose extraction targets are narrowed down precisely. As a result, full scans no longer occur in the DB server, and search processing is speeded up accordingly. Because the AP server issues SQL statements whose extraction targets are narrowed down in this way, there is also no need to prepare secondary or tertiary indexes on the DB server side.

In the second table, the classification scheme of the first records is expressed by a so-called recursive data structure, in which the child item of a higher-level record becomes the parent item of a record at the next level, whose child item in turn describes a lower-level classification attribute. Compared with defining the classification attributes in tabular form for every first record, this greatly reduces data duplication. Furthermore, even if the classification scheme has to be changed while the system is in operation, only the records of the second table need to be changed and the table definition does not, which improves the flexibility of the system.

Furthermore, the system has a mechanism for adding unique information such as serial numbers or path information, as needed, to the classification attributes stored in the second records. Even when a plurality of mutually identical classification attributes exist, their uniqueness is therefore guaranteed, and the AP server can generate a correct tree-structured index without confusing them.
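As an illustration of path information used as the unique information, the following minimal Python sketch qualifies a classification attribute with the attributes above it and strips that qualifier again. The `/` separator, the function names, and the assumption that attribute values never contain the separator are all hypothetical; the text only states that higher-level classification attributes are concatenated.

```python
# Minimal sketch: path information as the unique information.
# The "/" separator and both function names are illustrative assumptions.

def qualify_with_path(ancestors, attribute, sep="/"):
    """Prefix an attribute with the chain of attributes above it."""
    return sep.join([*ancestors, attribute])


def strip_path(node_id, sep="/"):
    """Recover the bare attribute, as done when the index is generated."""
    return node_id.rsplit(sep, 1)[-1]


# The same corporation code under two fiscal years becomes distinguishable.
h001_in_2005 = qualify_with_path(["2005"], "H001")
h001_in_2006 = qualify_with_path(["2006"], "H001")
```

With this scheme the two occurrences of `H001` no longer collide in the second table, while the bare value can still be restored for the index.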

FIG. 1 is an overall configuration diagram of a data processing system 10 according to the present invention. The system 10 includes a plurality of AP servers 12, a DB server 14, and a load balancer (load distribution device) 16.
The load balancer 16 and each AP server 12, and each AP server 12 and the DB server 14, are connected by a network.
A large number of client terminals 20 are connected to each AP server 12 via a network such as an intranet 18 or the Internet and via the load balancer 16.

Each AP server 12 includes a data processing unit 22, an index generation unit 24, a memory 26, and a hard disk (HDD) 27.
An OS and an application program dedicated to this system are installed on the hard disk 27 of each AP server 12, and the data processing unit 22 and the index generation unit 24 are realized by the CPU of the AP server 12 operating in accordance with these programs.

The DB server 14 includes a database management system (RDBMS) 28, a first table 30, and a second table 32.
The database management system 28 manages the first table 30 and the second table 32, and performs input and output, updating, and predetermined operations on the data stored in each table.

The load balancer 16 distributes the requests transmitted from the client terminals 20 according to the load on each AP server 12.

Each client terminal 20 is a computer such as a PC, on which application programs such as a Web browser program are installed in addition to the OS.

FIG. 2 shows a configuration example of the first table 30, in which a large number of first records 34 having data items such as slip number, issue date, issuing store, and person in charge are registered. Each first record 34 is uniquely identified by its unique slip number.

FIG. 3 shows a configuration example of the second table 32, in which a large number of second records 36 having two data items, a parent ID (parent item) and a child ID (child item), are registered.
The second table 32 defines the hierarchically structured classification attributes of the first records 34 stored in the first table 30.

FIG. 4 represents the classification scheme of the first records 34 as a tree structure, with the fiscal years “2005”, “2006”, and “2007” placed directly under ROOT. The corporation codes “H001”, “H002”, and “H003” are placed under each fiscal year, and the slip type codes “C01”, “C02”, and “C03” are placed under each corporation code. Slip numbers such as “10001”, “10002”, and “10003” are placed under each slip type code (for convenience of illustration, everything other than the “2006-H001-C01” line is omitted).
By referring to this classification scheme it can be seen, for example, that the slip with “slip number = 10002” belongs to slip type “C01” and was issued by corporation H001 in fiscal year 2006.
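To make the reading of this tree concrete, the sketch below models the classification scheme as a nested Python dictionary and walks it to find the classification of a given slip number. The nested-dict representation and the function name `find_classification` are illustrative assumptions and are not taken from the patent.

```python
# Minimal sketch: the classification tree as nested dicts, with lists of
# slip numbers at the lowest level. The representation is an assumption.

tree = {
    "2005": {},
    "2006": {
        "H001": {"C01": ["10001", "10002", "10003"], "C02": [], "C03": []},
        "H002": {},
        "H003": {},
    },
    "2007": {},
}


def find_classification(node, slip_number, path=()):
    """Return the chain of classification attributes above a slip number."""
    for key, child in node.items():
        if isinstance(child, dict):
            found = find_classification(child, slip_number, path + (key,))
            if found is not None:
                return found
        elif slip_number in child:
            return path + (key,)
    return None


classification = find_classification(tree, "10002")
```

For slip number 10002 the walk passes through fiscal year 2006, corporation H001, and slip type C01, matching the reading described above.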

In a conventional data management system, as shown in FIG. 11, such a classification scheme was expressed by providing fiscal year code, corporation code, and slip type code fields in each record. In this system 10, by contrast, the slip classification scheme is defined by the second table 32, which has the two data items parent ID and child ID.

First, in the second table 32, record group A, consisting of the three records “parent ID = ROOT, child ID = 2005”, “parent ID = ROOT, child ID = 2006”, and “parent ID = ROOT, child ID = 2007”, defines that the top-level classification attributes connected to ROOT are the three values “2005”, “2006”, and “2007”.

Also in the second table 32, record group B, consisting of the three records “parent ID = 2006, child ID = 1001_H001”, “parent ID = 2006, child ID = 1002_H002”, and “parent ID = 2006, child ID = 1003_H003”, defines that the mid-level classification attributes connected to 2006 are the three values “H001”, “H002”, and “H003”.

The prefixes “1001_” added before H001, “1002_” added before H002, and “1003_” added before H003 are serial numbers that the AP server 12 automatically assigns to the data that needs them, in order to give uniqueness to each node (classification attribute).
That is, the corporation code H001 is not limited to fiscal year 2006 and may also exist under 2005 or 2007. To distinguish it from H001 in other fiscal years, the data processing unit 22 of the AP server 12 automatically assigns unique serial numbers in advance to the second records 36 stored in the second table 32.
Note that 2005, 2006, and 2007 are the top-level classification attributes connected to ROOT and are inherently unique, so they are excluded from automatic serial numbering.
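The serial-number prefixing described here can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that a node ID carries a serial number exactly when it has the form of digits followed by an underscore; values such as `2006` or a slip number have no underscore and pass through unchanged.

```python
# Minimal sketch of the "<serial>_<attribute>" convention used in the
# second table. Function names are illustrative assumptions.

def add_serial(serial, attribute):
    """Attach a serial number so that equal attributes become distinct."""
    return f"{serial}_{attribute}"


def strip_serial(node_id):
    """Remove the serial prefix, as done when the index is generated."""
    prefix, sep, rest = node_id.partition("_")
    if sep and prefix.isdigit():
        return rest
    return node_id  # top-level years and slip numbers carry no prefix


h001_under_2006 = add_serial(1001, "H001")
```

Stripping is the inverse of prefixing, so the tree built on the AP server shows only the bare classification attributes.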

Also in the second table 32, record group C, consisting of the three records “parent ID = 1001_H001, child ID = 1004_C01”, “parent ID = 1001_H001, child ID = 1005_C02”, and “parent ID = 1001_H001, child ID = 1006_C03”, defines that the lower-level classification attributes connected to H001 are the three values “C01”, “C02”, and “C03”.
Here too, the prefixes “1004_” added before C01, “1005_” added before C02, and “1006_” added before C03 are serial numbers automatically assigned by the AP server 12 to give uniqueness to each node (classification attribute).

Furthermore, in the second table 32, record group D, consisting of the five records “parent ID = 1004_C01, child ID = 10001”, “parent ID = 1004_C01, child ID = 10002”, “parent ID = 1004_C01, child ID = 10003”, “parent ID = 1004_C01, child ID = 10004”, and “parent ID = 1004_C01, child ID = 10005”, defines that the specific slip numbers connected to C01, the lowest-level classification attribute, are the five values “10001”, “10002”, “10003”, “10004”, and “10005”.

As described above, the second records 36, which define the classification scheme of the first records 34 uniquely identified by their slip numbers, do not take the usual tabular form in which classification attributes such as fiscal year, corporation, and slip type are associated with every first record 34. Instead, the classification scheme is expressed by a recursive structure in which a child item at a higher level becomes the parent item at the next level and a lower-level classification attribute is specified (associated) as its child item. This greatly reduces data duplication.

In addition, even if the classification scheme has to be changed while the system is in operation, it is sufficient to change (add and delete) records in the second table, and the table definition does not need to be changed. This has the advantage that the system can be modified flexibly.

When a new record 36 is added to the second table 32, the data processing unit 22 of the AP server 12 refers to the second table 32 to check the last serial number and automatically assigns a later serial number to the data that needs it (this need not be the immediately following number; a gap may be left).

Alternatively, a serial number management table may be provided in the DB server 14. In that case, when a new serial number is to be assigned, the AP server 12 refers to this table and adds the value of the last serial number + 1 to the second record 36. The AP server 12 then updates the last value in the serial number management table with this newest serial number.
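A serial number management table of this kind might look like the following sketch, which uses Python's standard `sqlite3` module as a stand-in for the DB server. The table name `serial_counter`, the column `last_value`, and the function names are hypothetical. The sketch also ignores concurrent access from several AP servers, which a real deployment would have to serialize with locking on the DB server.

```python
import sqlite3

# Minimal sketch of a serial number management table.
# Table, column, and function names are illustrative assumptions.


def init_counter(conn, last_value):
    """Create the management table holding the last issued serial number."""
    conn.execute("CREATE TABLE serial_counter (last_value INTEGER NOT NULL)")
    conn.execute("INSERT INTO serial_counter (last_value) VALUES (?)", (last_value,))
    conn.commit()


def next_serial(conn):
    """Read the last serial number, add 1, and write the new value back."""
    with conn:  # commits on success, rolls back if an exception is raised
        (last,) = conn.execute("SELECT last_value FROM serial_counter").fetchone()
        new_value = last + 1
        conn.execute("UPDATE serial_counter SET last_value = ?", (new_value,))
    return new_value


conn = sqlite3.connect(":memory:")
init_counter(conn, 1006)  # 1006 is the last serial number in the example data
first_new = next_serial(conn)
second_new = next_serial(conn)
```

Each call both returns the new serial number and stores it as the latest value, mirroring the read-then-update sequence described above.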

The procedure by which the AP server 12 generates the tree-structured index is described below with reference to the flowchart of FIG. 5.
First, the data processing unit 22 of the AP server 12 periodically issues an SQL statement to the DB server 14 and acquires the second records 36 stored in the second table 32 whose parent is the ROOT node (S10).
Next, the index generation unit 24 extracts the child values of these second records 36 and generates a list of the top-level nodes connected directly to the ROOT node (S12). Specifically, as shown in FIG. 6, “2005”, “2006”, and “2007” are identified as the top-level nodes.

Next, the data processing unit 22 issues an SQL statement to the DB server 14 and acquires from the second table 32, for each top-level node, the second records 36 whose parent is not the ROOT node (S14).

Next, the index generation unit 24 collects the second records 36 that share a common parent and forms them into groups (S16). As shown in FIG. 7, each group 38 has the common parent as its key part and the respective children as its member part.

Next, the index generation unit 24 refers to the key part of each group 38 and generates a list of parent nodes (S18). That is, a node that appears in any key part is judged to be a parent node, and a node that never appears in a key part is regarded as not being a parent node and is excluded from the list. This parent node list is referred to later when judging whether a child in one group 38 is the parent of another group 38.
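Steps S12, S16, and S18 can be sketched in Python as follows, treating the second records as `(parent ID, child ID)` pairs already fetched from the DB server. The tuple representation, the function names, and the use of a dictionary for the groups are illustrative assumptions.

```python
from collections import defaultdict

ROOT = "ROOT"

# Second records as (parent ID, child ID) pairs, following the example data.
records = [
    (ROOT, "2005"), (ROOT, "2006"), (ROOT, "2007"),
    ("2006", "1001_H001"), ("2006", "1002_H002"), ("2006", "1003_H003"),
    ("1001_H001", "1004_C01"), ("1001_H001", "1005_C02"), ("1001_H001", "1006_C03"),
    ("1004_C01", "10001"), ("1004_C01", "10002"), ("1004_C01", "10003"),
]


def top_level_nodes(records):
    """S12: children of records whose parent is ROOT."""
    return [child for parent, child in records if parent == ROOT]


def group_by_parent(records):
    """S16: one group per common parent (key part -> member part)."""
    groups = defaultdict(list)
    for parent, child in records:
        if parent != ROOT:
            groups[parent].append(child)
    return dict(groups)


def parent_node_list(groups):
    """S18: every node that appears as the key part of some group."""
    return set(groups)


tops = top_level_nodes(records)
groups = group_by_parent(records)
parents = parent_node_list(groups)
```

A slip number such as `10001` never appears as a key, so it is absent from `parents`, which is what later distinguishes a leaf group from a branch group.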

Next, the index generation unit 24 selects one of the top-level nodes (S20) and then forms a tree structure by attaching lower-level nodes to that top-level node. The description here assumes that the tree structure under “2005” has already been completed and that “2006” has been selected as the next node to process.

First, the index generation unit 24 searches for the group whose parent is “2006” (S22). As a result, the group 38a of FIG. 7(a), which has “H001”, “H002”, and “H003” in its member part, is extracted.

Next, the index generation unit 24 refers to the parent node list generated in S18 and determines whether the members of the group 38a are parents of other groups 38 (S24).
In this case, “H001” and the like are parents in the group 38b, whose member part contains “C01”, “C02”, “C03”, and so on, so the determination result is YES.
As a result, the index generation unit 24 selects “H001”, one of the members, and attaches the group of FIG. 7(b) as a branch to the 2006 tree (S26). At this time, the serial numbers that had been attached to each piece of data are deleted by the index generation unit 24 (the same applies below).

Next, the index generation unit 24 selects “C01”, one of the child nodes belonging to the group 38b (S30), and then searches for the group 38 whose parent is “C01” (S32). As a result, the group 38c of FIG. 7(c) is extracted.

Next, the index generation unit 24 refers to the parent node list generated in S18 and determines whether the members of the group 38c are parents of other groups (S34).
In this case, none of “10001” to “10005”, the members under “C01”, is a parent in any other group, so the determination result is NO.
As a result, the index generation unit 24 adds the group 38c to the tree as a leaf (S36).

After the “2006-H001-C01” line has been completed in this way, as shown in FIG. 6, the index generation unit 24 repeats the processing of S30 to S36 and adds the groups whose key parts are “C02” and “C03” to the tree as leaves, thereby completing the “2006-H001-C02” and “2006-H001-C03” lines (not shown).
The index generation unit 24 then repeats the processing of S26 to S36 for “H002” and below and for “H003” and below, completing the tree under “2006”.
Once the tree with “2006” as its top-level node is complete, the index generation unit 24 starts generating the tree with “2007” as its top-level node.

If the determination in S24 above is No (the children of the group 38 are not parents of any other group 38), the index generation unit 24 adds that group 38 to the tree as a leaf, stops deepening the hierarchy, and moves on to adding the next leaf to the right.
If the determination in S34 above is Yes (the children of the group 38 are parents of another group 38), the index generation unit 24 adds that group 38 to the tree as a branch and moves on to building the next level of the hierarchy.
Because the index generation unit 24 determines in this way whether the children of a given group 38 are or are not parents of other groups 38, and switches between adding the group to the tree as a branch or as a leaf according to the result, it can reproduce the hierarchical structure correctly even when the number of levels in the classification scheme increases or decreases.
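The branch-or-leaf decision can be sketched as a short recursive Python function. This is a simplified reading of the flowchart rather than a step-by-step transcription: a group is treated as a branch if any of its members appears in the parent node list, and as a leaf otherwise. The nested dict and list representation, the function names, and the assumption that the records contain no cycles are all illustrative.

```python
# Minimal sketch of the branch/leaf decision (cf. S24 and S34).
# Representation and names are assumptions; the data is assumed acyclic.


def strip_serial(node_id):
    """Drop a '<digits>_' serial prefix if one is present."""
    prefix, sep, rest = node_id.partition("_")
    return rest if sep and prefix.isdigit() else node_id


def build_subtree(node, groups, parents):
    """Return a dict for a branch group or a list for a leaf group."""
    members = groups.get(node, [])
    if any(member in parents for member in members):
        # Branch: descend one level, removing serial numbers from the keys.
        return {strip_serial(m): build_subtree(m, groups, parents) for m in members}
    # Leaf: the members are primary keys (slip numbers) and stop the descent.
    return list(members)


groups = {
    "2006": ["1001_H001", "1002_H002", "1003_H003"],
    "1001_H001": ["1004_C01", "1005_C02", "1006_C03"],
    "1004_C01": ["10001", "10002", "10003", "10004", "10005"],
}
parents = set(groups)
tree = {top: build_subtree(top, groups, parents) for top in ["2005", "2006", "2007"]}
```

Because the decision depends only on whether a member is itself a parent, the same function handles schemes with more or fewer levels without modification.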

Having completed on the memory 26 the trees under all top-level classification attributes as described above, the index generation unit 24 generates a folder-file format index on the hard disk 27 based on this tree-structured classification scheme.
Specifically, as shown in FIG. 8, fiscal year folders 46 named “2005”, “2006”, and “2007” are created under the “ROOT” folder 44 on the hard disk 27, and corporation folders 48 (“H001” and so on) are created under each of them.
The index generation unit 24 also generates text files 50 whose file names correspond to the slip types (“C01” and so on) and stores them in the corresponding corporation folders 48.
The specific slip numbers 52 belonging to the slip type are written in each of these text files 50.
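The conversion from the in-memory tree to the folder-file format could be sketched as below using Python's standard `pathlib`. The nested dict and list input, the one-slip-number-per-line file layout, and the function name `write_index` are assumptions; the text specifies only that folders correspond to fiscal years and corporations and that text files named after slip types contain slip numbers.

```python
import tempfile
from pathlib import Path

# Minimal sketch: dicts become folders, lists become text files.
# The file layout (one slip number per line) is an assumption.


def write_index(node, path):
    """Write a nested dict/list tree under `path` in folder-file form."""
    if isinstance(node, dict):
        path.mkdir(parents=True, exist_ok=True)
        for name, child in node.items():
            write_index(child, path / name)
    else:
        path.write_text("\n".join(node) + "\n", encoding="utf-8")


tree = {"2006": {"H001": {"C01": ["10001", "10002", "10003", "10004", "10005"]}}}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "ROOT"
    write_index(tree, root)
    # Reading one file back restores only the part of the index that is needed.
    stored = (root / "2006" / "H001" / "C01").read_text(encoding="utf-8").split()
```

Only the file for the requested fiscal year, corporation, and slip type has to be read at search time, which is the reason given for keeping a large index on disk instead of wholly in memory.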

Next, search processing by the system 10 is described with reference to the flowchart of FIG. 9.
First, when the AP server 12 receives a search request from a client terminal 20 via the load balancer 16 (S50), the data processing unit 22 refers to the folder-file format index 42 formed on the hard disk 27 and acquires all slip numbers that satisfy the search condition (S52).
For example, when the search condition “2006 (fiscal year) × H001 (corporation) × C01 (slip type)” is transmitted from the client terminal 20, the data processing unit 22 acquires the values “10001”, “10002”, “10003”, “10004”, and “10005”.

Next, the data processing unit 22 generates an SQL statement that explicitly specifies these slip numbers and issues it to the DB server 14 (S54).
In response, the DB server 14 extracts the corresponding first records 34 from the first table 30 and sends them to the AP server 12.
On receiving them, the data processing unit 22 processes the data into a predetermined format and sends the result to the client terminal 20 (S56).
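A minimal sketch of step S54 follows, using Python's built-in `sqlite3` module as a stand-in for the DB server 14. The table name `first_table` and the columns `slip_no` and `amount` are hypothetical. The sketch binds the slip numbers as parameters instead of writing them literally into the statement, which is a choice made here and not something the source specifies.

```python
import sqlite3


def fetch_first_records(conn, slip_numbers):
    """Fetch the records whose primary key is one of slip_numbers.

    Hypothetical helper; table and column names are assumptions.
    """
    if not slip_numbers:
        return []  # nothing matched in the index, so no query is needed
    # One "?" placeholder per primary key, e.g. "?, ?, ?".
    placeholders = ", ".join(["?"] * len(slip_numbers))
    sql = (
        "SELECT slip_no, amount FROM first_table "
        f"WHERE slip_no IN ({placeholders}) ORDER BY slip_no"
    )
    return conn.execute(sql, list(slip_numbers)).fetchall()


# In-memory stand-in for the first table 30, with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE first_table (slip_no TEXT PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO first_table VALUES (?, ?)",
    [("10001", 100), ("10002", 200), ("10003", 300), ("20001", 999)],
)
rows = fetch_first_records(conn, ["10001", "10003"])
```

Because the `WHERE` clause names primary keys directly, the database can resolve each row through its primary key index.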

In this data processing system 10, the classification system of the first records 34 stored in the first table 30 of the DB server 14 is cached in the AP server 12 as a tree-structured index. When instructing the DB server 14 to perform a search, the AP server 12 can therefore identify the primary keys of the first records 34 and issue an SQL statement that pinpoints the records to be extracted.
As a result, no full scan occurs on the DB server 14, and search processing is correspondingly faster.

As described above, the tree-structured index (classification system) built in the memory 26 is converted into the folder-file format index 42 and stored on the hard disk 27. Even when the index is relatively large, only the necessary parts need to be restored to memory individually, which has the advantage of effectively avoiding momentary pressure on the memory of the AP server 12. When the number of classification attributes or the number of first records 34 is relatively small, however, the system may be configured to refer directly to the tree-structured index generated in the memory 26.

FIG. 10 shows another configuration example of the second table 32. Its distinguishing feature is that, instead of automatically assigning serial numbers such as “1001_” or “1002_” to guarantee the uniqueness of each node (classification attribute), the data processing unit 22 of the AP server 12 adds the path information of the upper nodes to the data that requires it.

For example, prefixing “H001” with “2006_” to give “2006_H001” makes it clear that this is the H001 under 2006, so that it can be distinguished from the H001 under 2005 and the H001 under 2007.
Similarly, prefixing “C01” with “2006H001_” to give “2006H001_C01” makes it clear that this is the C01 under 2006 × H001, so that it can be distinguished from the C01 under 2006 × H002.
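The two examples suggest that the prefix is the concatenation of the upper-level attributes followed by an underscore. The following Python sketch reproduces that pattern; the function name `qualify` is hypothetical, and the rule is inferred from these two examples only.

```python
def qualify(ancestors, attribute):
    """Prefix a classification attribute with its upper-node path.

    Hypothetical helper. `ancestors` lists the upper-level attributes
    from the top of the tree down to the direct parent.
    """
    if not ancestors:
        return attribute  # a top-level attribute needs no path prefix
    # Upper attributes are joined without a separator, then "_" is added.
    return "".join(ancestors) + "_" + attribute


corporation_key = qualify(["2006"], "H001")
slip_type_key = qualify(["2006", "H001"], "C01")
```

With this rule, the C01 under 2006 × H002 gets a different key from the C01 under 2006 × H001, so the two values no longer collide.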

As with the second table 32 of the serial number scheme described above, this path information is deleted by the index generation unit 24 at the time the tree-structured index is generated in the memory 26.
In addition, when the classification system needs to be changed while the system is in operation and second records 36 are added to or deleted from the second table 32, the data processing unit 22 of the AP server 12 rewrites the path information of each piece of data located under the changed part.
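Deleting the unique information at index generation time can be sketched as stripping everything up to the first underscore. This assumes that neither the prefix nor the classification attribute itself contains an underscore, and that a serial-numbered value looks like `1001_H001`; both points are assumptions, and the function name `strip_unique_info` is hypothetical.

```python
def strip_unique_info(value):
    """Remove a serial-number or path prefix such as "1001_" or "2006H001_".

    Hypothetical helper; values without an underscore are returned as-is.
    """
    _prefix, separator, attribute = value.partition("_")
    # partition() leaves separator empty when there is no "_" in the value.
    return attribute if separator else value


stripped = [
    strip_unique_info(v)
    for v in ["2006", "2006_H001", "2006H001_C01", "1001_H001"]
]
```

The same function covers both the serial number scheme and the path information scheme, because in either case the unique information sits in front of the first underscore.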

Needless to say, when there is no possibility that the same value will be duplicated among the classification attributes, there is no need to add serial numbers or path information to the classification attributes of the records 36 as described above.

FIG. 1 is an overall configuration diagram of a data processing system according to the present invention.
FIG. 2 is a diagram showing a configuration example of the first table.
FIG. 3 is a diagram showing a configuration example of the second table.
FIG. 4 is a diagram representing the classification system of the first records as a tree structure.
FIG. 5 is a flowchart showing the index generation procedure.
FIG. 6 is a diagram showing the process of generating the tree-structured index.
FIG. 7 is a diagram showing the group generation process.
FIG. 8 is a diagram showing the folder-file format index.
FIG. 9 is a flowchart showing the search procedure.
FIG. 10 is a diagram showing another configuration example of the second table.
FIG. 11 is a diagram showing a configuration example of a conventional table.

Explanation of symbols

10 Data processing system
12 AP server
14 DB server
16 Load balancer
18 Intranet
20 Client terminal
22 Data processing unit
24 Index generation unit
26 Memory
27 Hard disk
28 Database management system
30 First table
32 Second table
34 First record
36 Second record
38a-38c Group
42 Folder-file format index
44 Root folder
46 Year folder
48 Corporate folder
50 Text file
52 Slip number

Claims

A data processing system comprising a DB server and an AP server, wherein
the DB server comprises a first table storing a first record to which a primary key is assigned, and a second table storing a second record that defines a hierarchically structured classification attribute of the first record,
the AP server comprises:
data reading means for issuing an SQL statement to the DB server and instructing reading of the second record stored in the second table;
index generating means for generating a tree-structured index based on the second record transmitted from the DB server and storing it in a storage device of the AP server;
means for acquiring, when a search condition combining classification attributes is transmitted from a client terminal, the primary key of the first record that matches the search condition by referring to the index; and
means for generating an SQL statement specifying the primary key of the first record and issuing it to the DB server,
the second record has a parent item in which an upper classification attribute is described and a child item in which a lower classification attribute directly connected to it is described, and has a recursive structure in which the classification attribute described in the child item of one record is described as the classification attribute of the parent item of another record, and the primary key of the first record is described in the child item of a record whose parent item describes the lowest classification attribute, and
the AP server further comprises means for adding, to a classification attribute stored in the second record that requires it, unique information for giving uniqueness to the classification attribute, and means for deleting the unique information from each classification attribute when generating the index.

The data processing system according to claim 1, wherein the unique information is a serial number composed of numerical values that do not overlap each other.

The data processing system according to claim 1, wherein the unique information is path information obtained by concatenating the classification attributes located above the classification attribute concerned.