JP2001291048A

JP2001291048A - Data tabulation method and storage medium stored with program for the data tabulation method

Info

Publication number: JP2001291048A
Application number: JP2000102349A
Authority: JP
Inventors: Shinji Kosho; 晋二古庄
Original assignee: TAABO DATA LAB KK; TAABO DATA LABORATORY KK
Current assignee: TAABO DATA LAB KK; TAABO DATA LABORATORY KK
Priority date: 2000-04-04
Filing date: 2000-04-04
Publication date: 2001-10-19

Abstract

PROBLEM TO BE SOLVED: To provide a data tabulation method that reduces memory usage and performs cross tabulation at high speed. SOLUTION: The method tabulates a plurality of tabular data, each expressed as an array of records containing items and their item values. It comprises: a step of structuring the tabular data as one or more information blocks, each consisting of a value list, which stores the item values of a specific item in order of their item value numbers, and a pointer array, which stores pointer values designating the item value numbers in order of unique record numbers; a step of providing, in the information block, an existence number array in which an existence number, indicating the number of records having the same item value number, is arranged in correspondence with each value in the value list; a step of finding the item values whose existence number is 0 in the existence number array; and a step of creating a summary table, with the found item values removed, consisting of the existence numbers corresponding to the item values of the item.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a data processing method and a data processing apparatus for processing a large amount of data by using an information processing apparatus such as a computer, and more particularly to a method for compressing the data obtained when a plurality of tabular data are cross-tabulated in a relational database.

[0002]

[Prior Art] Databases (DB) that store various data by associating, for each record, a value for each of a plurality of items (name, age, address, school division, occupation, and so on), and relational databases (RDB) in particular, are widely used to manage sets of related data, for example in customer management, order management, and inventory management. In such an RDB, a technique called cross tabulation is used to create a matrix over a plurality of items, for example two, and to present a table (view) showing the number of records that fall into each cell of the matrix. For example, consider the case shown in FIG. 21(a), where each record is given item values for the items "school division" and "age". Cross-tabulating these two items yields a table (view) such as that shown in FIG. 21(b).

[0003] In the example of FIG. 21(b), no records exist in the rows corresponding to ages 8 and 9, among others, and no records exist in the column corresponding to junior high school students. Thus, in cross tabulation it is not unusual for the number of records corresponding to an element of the matrix (for example, the element for records whose age is 8 and whose school division is junior high school) to be 0. Accordingly, when the non-zero elements are scattered in this way (the sparse case), conventional cross tabulation has compressed the data of the matrix elements by the following method.

[0004] When cross-tabulating an item with R item values against an item with C item values, an R × C matrix is first created to form the cross tabulation table (view). Next, for each column and each row of the matrix, the sum of its elements (the numbers of existing records) is calculated, and the columns and/or rows whose sum is 0 are detected. The table is then created again, with the rows and columns whose sum is 0 deleted from the cross tabulation table (view) first created (see FIG. 21(c)).
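
The conventional procedure of paragraph [0004] can be sketched as follows. This is a minimal illustration rather than code from the patent: the function name `dense_cross_tab`, the sample data, and the use of integer item value numbers for both items are assumptions made for the example.

```python
def dense_cross_tab(row_numbers, col_numbers, R, C):
    """Conventional approach: build the full R x C matrix, then prune it."""
    # Step 1: allocate and fill all R * C cells, whether or not they are used.
    matrix = [[0] * C for _ in range(R)]
    for r, c in zip(row_numbers, col_numbers):
        matrix[r][c] += 1
    # Step 2: sum every row and every column to detect the all-zero ones.
    keep_rows = [r for r in range(R) if sum(matrix[r]) > 0]
    keep_cols = [c for c in range(C)
                 if sum(matrix[r][c] for r in range(R)) > 0]
    # Step 3: create the table a second time without those rows and columns.
    pruned = [[matrix[r][c] for c in keep_cols] for r in keep_rows]
    return keep_rows, keep_cols, pruned


# Hypothetical data: 5 records, 4 possible row values, 3 possible column values.
rows = [0, 3, 0, 2, 3]
cols = [0, 2, 0, 0, 2]
keep_rows, keep_cols, pruned = dense_cross_tab(rows, cols, R=4, C=3)
print(keep_rows, keep_cols, pruned)
```

The cost to note is that step 1 always allocates R × C cells and step 2 always visits all of them, regardless of how sparse the result is.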

[0005]

[Problems to Be Solved by the Invention] However, with conventional cross tabulation, even when there are many rows or columns in which the number of records for every element is 0, a matrix consisting of (the number R of values of one item) × (the number C of values of the other item) elements is generated first.

[0006] In particular, when the numbers of item values R and C are large, for example when R and C are both 10,000, cross tabulation generates a matrix of 100 million elements. As the number of item values grows, therefore, more memory is required and the processing time for cross tabulation increases markedly. In addition, because the sums of the existence numbers of each column and each row are calculated only after the R × C cross tabulation table has been created, still more processing time is required.

[0007] An object of the present invention is to provide a processing method that reduces memory usage and performs cross tabulation at high speed.

[0008]

[Means for Solving the Problems] The object of the present invention is achieved by a data tabulation method for tabulating a plurality of tabular data, each represented as an array of records containing items and the item values belonging to them, the method comprising: a step of structuring the tabular data so that it is divided into one or more information blocks, each consisting of a value list in which the item values belonging to a specific item are stored in order of their corresponding item value numbers, and a pointer array in which pointer values designating the item value numbers are stored in order of unique record numbers; a step of providing, in the information block, an existence number array in which existence numbers, each indicating the number of records having the same item value number, are arranged in correspondence with the values in the value list; a step of finding the item values whose existence number is 0 in the existence number array; and a step of creating, with the found item values removed, a summary table consisting of the existence numbers corresponding to the item values of the item.

[0009] According to the present invention, the existence number indicating the number of records is calculated for each item value, and the summary table is created with the item values whose existence number is 0 already excluded. The procedure of first creating a summary table and then removing the elements that are 0 can therefore be omitted.
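
As a minimal sketch of the single-item case, assuming the existence number array has already been computed, the summary table can be emitted in one pass that simply skips zero counts. The function name `summary_table` and the sample values and counts are hypothetical.

```python
def summary_table(value_list, existence_numbers):
    """Pair each item value with its existence number, skipping zero counts."""
    return [(value, count)
            for value, count in zip(value_list, existence_numbers)
            if count != 0]


# Hypothetical "school division" block; the counts are illustrative only.
value_list = ["elementary school", "junior high school", "high school"]
existence_numbers = [3, 0, 2]
table = summary_table(value_list, existence_numbers)
print(table)
```

No intermediate table containing the zero entry is ever built, which is the saving the paragraph describes.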

[0010] The object of the present invention is also achieved by a data tabulation method for tabulating, over a plurality of items, tabular data represented as an array of records each containing items and the item values belonging to them, the method comprising: a step of structuring the data, for each of the plurality of items used in the tabulation, so that it is divided into one or more information blocks, each consisting of a value list in which the item values belonging to the item are stored in order of their corresponding item value numbers, and a pointer array in which pointer values designating the item value numbers are stored in order of unique record numbers; a step of providing, for each item, an existence number array in which existence numbers, each indicating the number of records having the same item value number, are arranged in correspondence with the values in the value list; a step of finding, in each existence number array, the item values whose existence number is 0; a step of projecting the multidimensional position of each element of the matrix-shaped summary table formed by the plurality of items onto a one-dimensional position, so that the existence number of each element corresponds to an address; a step of generating, for each existence number array, an offset array that contains at least elements corresponding to the elements whose existence number is not 0, each such element being an offset value based on the projection; a step of identifying, for each item and on the basis of the record number, the offset value in the corresponding offset array by referring to the corresponding pointer value in the pointer array; a step of adding the obtained offset values to calculate an added offset value; a step of calculating an address by associating the added offset value with the projection; and a step of counting up the value at the position indicated by the obtained address, wherein the summary table is obtained from the values at the positions indicated by the addresses finally obtained.

[0011] According to the present invention, when a summary table of a plurality of dimensions is created, a projection that matches the addresses of the elements in the summary table is obtained from the added offset values. The desired summary table can therefore be obtained without first building the multidimensional summary table and then searching for the regions whose elements are 0.
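
The multi-item procedure can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: it assumes a mixed-radix projection in which the offsets of each item step by the product of the non-zero value counts of the items before it, it marks the offsets of zero-count values as `None` because no record ever refers to them, and the function names and sample data are hypothetical.

```python
def existence_numbers(pointer_array, n_values):
    """Count the records per item value number."""
    counts = [0] * n_values
    for pointer_value in pointer_array:
        counts[pointer_value] += 1
    return counts


def offset_array(counts, stride):
    """Give the k-th non-zero item value the offset k * stride."""
    offsets, kept = [], 0
    for count in counts:
        if count == 0:
            offsets.append(None)          # never referenced by any record
        else:
            offsets.append(kept * stride)
            kept += 1
    return offsets, kept                  # kept = number of non-zero values


def cross_tabulate(items):
    """items: list of (number of item values, pointer array), one per item."""
    stride, offset_arrays = 1, []
    for n_values, pointers in items:
        offsets, kept = offset_array(existence_numbers(pointers, n_values), stride)
        offset_arrays.append(offsets)
        stride *= kept                    # assumed mixed-radix layout
    table = [0] * stride                  # only non-zero rows/columns get cells
    for record in range(len(items[0][1])):
        # Sum the per-item offsets to obtain the one-dimensional address.
        address = sum(offsets[pointers[record]]
                      for offsets, (_, pointers) in zip(offset_arrays, items))
        table[address] += 1               # count up at that address
    return offset_arrays, table


# Hypothetical data: item 0 has 3 values, item 1 has 4 values, 5 records.
school_ptr = [0, 2, 0, 0, 2]
age_ptr = [0, 3, 0, 2, 3]
offset_arrays, table = cross_tabulate([(3, school_ptr), (4, age_ptr)])
print(offset_arrays, table)
```

In this example the flat table has 2 × 3 = 6 cells instead of 3 × 4 = 12. Individual cells can still be 0, but no row or column consists only of zeros.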

[0012] In a preferred embodiment of the above invention, when the number of item values of each item is denoted N_i (0 ≤ i ≤ I−1, where I is the total number of items involved in the tabulation) and the number of elements whose value is 0 in the existence number array of each item is denoted n_i (0 ≤ i ≤ I−1), the offset values P_{i,m} (where 0 ≤ m ≤ N_i − n_i) of the offset array of each item are given by

P_{i,0} = 0
P_{i,m+1} - P_{i,m} = \prod_{k} (N_k - n_k)   (0 ≤ k ≤ m−1)

[0013] In another embodiment of the present invention, the method comprises, prior to providing the existence number array: a step of selecting, from among the record numbers, the record numbers that are to be subject to tabulation; a step of generating a record number set in which the selected record numbers are arranged in a predetermined order; and a step of generating the existence number array by referring to the corresponding pointer values in the pointer array on the basis of the record numbers contained in the record number set. The object of the present invention is further achieved by a computer-readable storage medium storing a program that executes the above method.
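
Restricting the tabulation to selected records can be sketched as below. The function name `existence_for_selection`, the choice of ascending order as the "predetermined order", and the sample data are assumptions made for illustration.

```python
def existence_for_selection(pointer_array, n_values, selected_records):
    """Build the existence number array from a subset of record numbers."""
    # Arrange the selected record numbers in a predetermined (ascending) order.
    record_number_set = sorted(selected_records)
    existence_numbers = [0] * n_values
    for record_number in record_number_set:
        # Look up the pointer value of the record and count it.
        existence_numbers[pointer_array[record_number]] += 1
    return record_number_set, existence_numbers


# Hypothetical block: 6 records over 4 item values; records 0, 3 and 4 selected.
pointer_array = [0, 2, 0, 3, 2, 1]
record_number_set, counts = existence_for_selection(pointer_array, 4, {4, 0, 3})
print(record_number_set, counts)
```

Item values that occur in the full table but not in the selection end up with an existence number of 0, which is exactly the case the zero-removal step handles.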

[0014]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the accompanying drawings. FIG. 1 is a block diagram showing the hardware configuration of a computer system capable of implementing the cross tabulation and data compression according to an embodiment of the present invention. As shown in FIG. 1, the computer system 10 has the same configuration as an ordinary computer system and comprises: a CPU 12 that controls the entire system and its individual components by executing programs; a RAM (Random Access Memory) 14 that stores work data and the like; a ROM (Read Only Memory) 16 that stores programs and the like; a fixed storage medium 18 such as a hard disk; a CD-ROM driver 20 for accessing a CD-ROM 19; an interface (I/F) 22 to the CD-ROM driver 20 and to an external terminal connected to an external network (not shown); an input device 24 consisting of a keyboard and a mouse; and a CRT display device 26. The CPU 12, RAM 14, ROM 16, external storage medium 18, I/F 22, input device 24, and display device 26 are interconnected via a bus 28.

[0015] The program that implements cross tabulation based on tabular data according to the present embodiment, the program that compresses the data obtained by cross tabulation, and the program that creates the cross tabulation table (view) may be stored on the CD-ROM 19 and read by the CD-ROM driver 20, or may be stored in the ROM 16 in advance. Programs once read from the CD-ROM 19 may also be stored in a predetermined area of the external storage medium 18. Alternatively, the programs may be supplied from outside via a network (not shown), the external terminal, and the I/F 22.

[0016] In the present embodiment, information blocks in a predetermined data format must be generated, as described later, in order to perform the cross tabulation and data compression at high speed. This information block generation program may likewise be stored on the CD-ROM 19, in the ROM 16, or on the external storage medium 18. Needless to say, these programs may also be supplied from outside via a network (not shown). In the present embodiment, the data (information blocks) generated by the information block generation program are stored in the RAM 14 or in a predetermined area of the external storage medium 18.

[0017] Next, the data format and the principle of tabulation on which the present invention is premised are described. To achieve very high processing speed, the present inventor devised a way of constructing tabular data in a specific data format, together with methods for searching, tabulating, and sorting it (PCT WO00/10913). In the present invention as well, tabular data is basically constructed as a collection of predetermined information blocks on the basis of the technique disclosed in that application, and these blocks are used to perform tabulation, including cross tabulation.

[0018] FIG. 2 is a diagram showing an information block used in the present embodiment. As shown in FIG. 2, the information block 100 contains a value list 110 and a pointer array 120 to the value list. The value list 110 is a table that, for each item of the tabular data, stores the item values 111 belonging to that item in order of their item value numbers, which are obtained by ordering (converting to integers) the item values. The pointer array 120 to the value list is an array that stores the item value numbers of a given column (that is, item) of the tabular data, in other words pointers into the value list 110, in order of the record numbers of the tabular data.

[0019] By combining the pointer array 120 to the value list with the value list 110, an item value can be obtained from a record number: given a record number, the item value number stored for that record number is taken from the pointer array 120 of the item in question, and the item value stored for that item value number is then taken from the value list 110. As with a conventional data table, therefore, all data (item values) can be referenced by the coordinates of record number (row) and item (column).
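
The two-step lookup from record number to item value can be sketched as follows. The dictionary layout, the item names `customer_id` and `city`, and all sample values are hypothetical and are not taken from the figures.

```python
# Hypothetical information blocks for two items of a four-record table.
blocks = {
    "customer_id": {"value_list": [1, 2, 3, 4],
                    "pointer_array": [0, 1, 2, 3]},
    "city": {"value_list": ["Kyoto", "Osaka", "Tokyo"],   # ascending order
             "pointer_array": [2, 0, 2, 1]},
}


def item_value(record_number, item):
    """Resolve (record number, item) to an item value."""
    block = blocks[item]
    # Step 1: the pointer array gives the item value number of the record.
    item_value_number = block["pointer_array"][record_number]
    # Step 2: the value list gives the item value for that number.
    return block["value_list"][item_value_number]


# Reassemble one row of the original table from its per-item blocks.
row_2 = [item_value(2, item) for item in blocks]
print(row_2)
```

Each distinct item value is stored once in the value list, while the pointer array holds only small integers, one per record.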

[0020] For example, consider the tabular data shown in FIG. 3(a). In this example, various item values are given for the items customer ID, customer name, and telephone number. In the present embodiment, such tabular data is held as information blocks of the form shown in FIGS. 3(b) to 3(d). For example, in FIG. 3(b), the pointer array 120-1 is associated with the value list 110-1, which stores the item values representing customer IDs. That is, the pointer value in the pointer array for the first record (record number "0") is 0, and the corresponding item value "1" representing the customer ID is obtained. In FIG. 3(b), the pointer array 120-2 is associated with the value list 110-2, which stores the item values representing customer names. For example, the pointer value in the pointer array for the first record (record number "0") is "5", and the corresponding item value "山田 ○男" representing the customer name is obtained. Likewise, in FIG. 3(c), the pointer array 120-3 is associated with the value list 110-3, which stores the item values representing telephone numbers. Note also that the item values in each value list are ordered (in ascending order in this example).

[0021] Further, in the present embodiment, the value management table of the information block 100 contains, in addition to the value list 110, a classification number flag array used for searching and tabulation, a start position array indicating the start address of the memory space in which the pointers corresponding to each item value are to be stored, and an existence number array. Each flag in the classification number flag array and each existence number in the existence number array is associated with one of the item values. The value of a classification number flag is normally "0" and is set to "1" for the item values to be found during searching or tabulation. The existence number is the number of records having that item value. The start position equals the sum of the existence numbers corresponding to pointer values smaller than the corresponding pointer value, so the start position array need not necessarily be provided.
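
Because each start position is the sum of the existence numbers that precede it, the start position array can be derived on demand. A minimal sketch, with a hypothetical function name and sample counts:

```python
from itertools import accumulate


def start_positions(existence_numbers):
    """Exclusive prefix sum: entry v totals the existence numbers before v."""
    if not existence_numbers:
        return []
    return [0] + list(accumulate(existence_numbers))[:-1]


# Hypothetical existence numbers for four item values.
existence_numbers = [3, 0, 2, 4]
starts = start_positions(existence_numbers)
print(starts)
```

An item value with an existence number of 0 shares its start position with the next item value, since it occupies no space.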

[0022] FIG. 4(a) shows another example of tabular data, and FIGS. 4(b) and 4(c) show the information blocks for "sex" and "age", respectively. As shown in FIG. 4(b), the value management table 210-1 of the information block 200-1 for sex holds the item values ("male" and "female") corresponding to the pointer values of the pointer array 220, together with the classification number, start position, and existence number corresponding to each item value. For example, the number of records whose pointer value is "0" (that is, whose item value in the value list is "male") is 632564, whereas the number of records whose pointer value is "1" (that is, whose item value in the value list is "female") is 367426. The start position corresponding to each item value indicates the start address in the pointer array 230-1 to records, described later. The same applies to FIG. 4(c).

[0023] An example of searching and tabulation using information blocks with this data structure, and the process of generating the information blocks, are described below. FIG. 5 is a flowchart showing a search method for a single item. This processing is performed by the CPU 12 (see FIG. 1) executing a predetermined search program. In this example, the records whose "age" item value is 16 or 19 are retrieved. First, among the information blocks of the tabular data, the information block 200-2 for "age" shown in FIG. 4(c) is identified (step 501).

[0024] Next, in the value list 210-2 of the identified information block (hereinafter, the "specific information block"), the classification number of each row whose item value matches the search condition (16 or 19) is set to "1" (step 502). In this example, the classification numbers of the rows corresponding to item value number "0" and item value number "3" are set to 1. Next, the start position and existence number of each row whose classification number is set to "1" are obtained (step 503). This information is referred to as pointer extraction information. Based on the pointer extraction information obtained in step 503, the record numbers, that is, the pointers to the records matching the search condition, are extracted from the pointer array to records (step 504). In this example, the pointers to the records corresponding to item value number "0" are stored in the region of the pointer array to records running from start position "0", that is, the beginning, up to the 45898th entry, whereas the pointers to the records corresponding to item value number "3" are stored in the region of 189653 entries starting at the 2383137th position. Finally, so that it can be used in later processing, an array of the extracted record numbers is created as a result set and retained (step 505).
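
Steps 502 to 505 can be sketched on a small block as follows. The seven-record "age" block, the variable names, and the function name `search` are hypothetical; the counts quoted in the text come from a much larger table that is not reproduced here.

```python
# Hypothetical "age" information block for seven records.
value_list = [16, 17, 18, 19]
existence_numbers = [2, 1, 1, 3]          # records per item value
start_positions = [0, 2, 3, 4]            # exclusive prefix sums of the above
record_pointers = [1, 3, 2, 4, 0, 5, 6]   # record numbers grouped by item value


def search(wanted_values):
    """Return the record numbers whose item value is in wanted_values."""
    # Step 502: set the classification flag of every matching row to 1.
    flags = [1 if value in wanted_values else 0 for value in value_list]
    result_set = []
    for item_value_number, flag in enumerate(flags):
        if flag == 1:
            # Step 503: fetch the pointer extraction information.
            start = start_positions[item_value_number]
            count = existence_numbers[item_value_number]
            # Step 504: copy the matching slice of record numbers.
            result_set.extend(record_pointers[start:start + count])
    # Step 505: the result set is kept for later processing.
    return result_set


result = search({16, 19})
print(result)
```

The search never scans the records themselves: it touches one flag per item value and then copies contiguous slices of the pointer array to records.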

[0025] Next, the process of generating the information blocks used in search processing such as that described above is explained. FIG. 6 is a flowchart of the process for creating information blocks from tabular data. First, the system 10 obtains the original tabular data and decomposes it by item (step 601). The original data may be, for example, as shown in FIG. 7(a) or as shown in FIG. 7(b). The original data may be supplied from outside or may be stored on the fixed storage medium 18. Processing block 610, consisting of steps 602 to 604 described below, represents the generation of the information block for one item. When information blocks are generated for a plurality of items, therefore, the processing corresponding to processing block 610 is executed once per item. The following description takes the information block for the item "sex" as an example.

[0026] First, an area for the information block of the item "sex" is reserved, for example in the RAM 14 (step 602). Next, a value management table is generated in the reserved area. More specifically, the value management table is first initialized. Then, by scanning the "sex" data in the original data from beginning to end, it is determined which item values occur and how many of each exist. In this example, the item values "female" and "male" are found to occur 367436 and 632564 times, respectively. Accordingly, the item values "female" and "male" are set in the value list, and the corresponding counts are set in the existence number array. The item values are then sorted according to a predetermined criterion, and the existence numbers are rearranged along with them. Next, the values of the start position array are determined, each being obtained by accumulating the existence numbers of the entries that precede it after sorting. The values of the start position array are also assigned to the corresponding elements of the classification number array; these values are used in the next step. After the value management table has been generated in this way, the pointer array to records is generated. The size of the area for this pointer array corresponds to the sum of the existence numbers. In this way, an information block for a given item can be produced. The information blocks are generated in advance, and processing such as searching and tabulation is carried out using the generated information blocks.
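
The generation of one information block can be sketched as below. This is an illustration under assumptions: the function name `build_information_block`, the dictionary layout, and the five-record sample column are hypothetical, and a local `cursor` list plays the role that the text assigns to the copy of the start positions held in the classification number array.

```python
from collections import Counter


def build_information_block(column):
    """Build the arrays of one information block from a column of item values."""
    # Scan the column once: which item values occur, and how many of each.
    counts = Counter(column)
    # Sort the item values; the existence numbers follow the same order.
    value_list = sorted(counts)
    existence_numbers = [counts[value] for value in value_list]
    # Start position = sum of the existence numbers of the preceding entries.
    start_positions, running = [], 0
    for count in existence_numbers:
        start_positions.append(running)
        running += count
    # Pointer array to the value list: one item value number per record.
    number_of = {value: i for i, value in enumerate(value_list)}
    pointer_array = [number_of[value] for value in column]
    # Pointer array to records: record numbers grouped by item value number.
    cursor = list(start_positions)        # next free slot per item value
    record_pointers = [0] * len(column)
    for record_number, item_value_number in enumerate(pointer_array):
        record_pointers[cursor[item_value_number]] = record_number
        cursor[item_value_number] += 1
    return {"value_list": value_list,
            "existence_numbers": existence_numbers,
            "start_positions": start_positions,
            "pointer_array": pointer_array,
            "record_pointers": record_pointers}


block = build_information_block(["male", "female", "male", "male", "female"])
print(block)
```

The pointer array to records has exactly as many entries as the sum of the existence numbers, as the text states.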

[0027] Next, tabulation processing using the computer system according to the present embodiment is described. In principle, this tabulation processing only needs to use the values of the elements (existence numbers) in the existence number array of the information block according to the present embodiment (see, for example, FIGS. 4(b) and 4(c)). FIG. 8 is a flowchart showing the tabulation processing according to the present embodiment. This processing is performed by the CPU 12 (see FIG. 1) executing a predetermined tabulation program. FIG. 9(a) shows the information block for the item "school division" in the tabular data shown in FIG. 21(a). Here, the number of records that exist is tallied for each item value of the school division.

[0028] In the tabulation processing, the storage position number is first initialized, and an area 902 is reserved for this information block to hold an existence number array with the same number of elements as the value list 901 (step 801). Initially, every element of the existence number array is given the value "0". When all records are tabulated, the storage position number coincides with the record number. When, on the other hand, particular records have been selected from all records by a search, the selected records are arranged in an array in a predetermined order (for example, ascending order), and the numbers assigned in ascending order to the elements of that array (the record numbers of the selected records) serve as the storage position numbers.

[0029] Next, in ascending order of storage position number, the pointer value in the pointer array to the value list corresponding to that storage position number is referenced (step 802), and the element of the existence number array corresponding to the value-list number indicated by that pointer value is incremented (step 803). For example, for storage position number "0", the pointer value in the pointer array is "0". The element of the existence number array corresponding to the first row (number 0) of the value list is therefore changed from "0" to "1" (see FIG. 9(b)). Likewise, for storage position number "1", the pointer value in the pointer array is "2", so the element of the existence number array corresponding to the third row (number 2) of the value list is changed from "0" to "1" (see FIG. 9(b)).

[0030] By executing steps 802 and 803 for all storage position numbers (steps 804 and 805), the number of records having each item value in the value list is obtained as an element of the existence number array (see FIG. 9(c)). The CPU 12 then refers to the existence number array, identifies the elements whose value is "0", and finds the item values associated with those elements (step 806). After that, the CPU 12 creates a tabulation table (view) for the item that excludes the item values whose corresponding element (existence number) in the existence number array is 0 (step 807). FIG. 10 shows an example of a view obtained in this way. As FIG. 10 shows, the tallying process according to the present embodiment yields a view from which the columns for item values that do not occur have been removed. With the tallying method according to the present embodiment, when the number of records is "N", the elements of the existence number array need to be incremented (counted up) only N times, so the tally can be executed at very high speed.
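The tallying steps 801 to 807 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `tally`, the item values, and all pointer values other than the first two (which follow the walk-through above) are hypothetical.

```python
def tally(value_list, pointer_array, record_numbers=None):
    """Count records per item value and drop item values that never occur."""
    # With no prior search, storage positions coincide with record numbers.
    if record_numbers is None:
        record_numbers = range(len(pointer_array))
    # Step 801: existence number array, same length as the value list, all 0.
    existence = [0] * len(value_list)
    # Steps 802 to 805: one increment per record, N increments in total.
    for record in record_numbers:
        existence[pointer_array[record]] += 1
    # Steps 806 and 807: build the view without item values whose count is 0.
    view = [(value, n) for value, n in zip(value_list, existence) if n != 0]
    return existence, view


# Hypothetical "school category" information block (illustrative data only).
value_list = ["elementary", "junior high", "high school", "university"]
pointer_array = [0, 2, 1, 0, 2]  # records 0 and 1 point at values 0 and 2

existence, view = tally(value_list, pointer_array)
print(existence)  # [2, 1, 2, 0]
print(view)       # [('elementary', 2), ('junior high', 1), ('high school', 2)]
```

Passing a list of selected record numbers as `record_numbers` corresponds to tallying only the records left after a search.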

[0031] Next, cross tabulation according to the present embodiment will be described. Here too, consider performing a cross tabulation over the item values of the items "school category" and "age" in the tabular data shown in FIG. 21(a). In cross tabulation, processing substantially equivalent to that shown in FIG. 8 is executed for each item. FIG. 11 is a flowchart showing the cross tabulation process according to the present embodiment. As shown in FIG. 11, the information blocks containing the items to be cross-tabulated are processed in an arbitrary order, and an existence number array is generated for each (step 1101). This processing is the same as steps 800 to 805 in FIG. 8. Next, the elements of the generated existence number array are referenced, the elements whose value is "0" are identified, and the item values associated with those elements are found (step 1102). This processing is the same as step 806 in FIG. 8. After this has been executed for all items (steps 1103 and 1104), a tabulation table (view) over the plurality of items is created that excludes the item values whose corresponding element (existence number) in the existence number array is 0 (step 1105). FIG. 12 is a flowchart showing the processing of step 1105 in more detail for the case of cross tabulation over two items. In a cross tabulation table, a matrix over the plurality of items is formed, and each element holds the number of records associated with one combination of item values of those items. For example, for the two items "age" and "school category", the number of matching records is placed for each combination (an element of "age", an element of "school category").

[0032] In practice, memory addresses are one-dimensional, so the record counts of the elements of a multi-item tabulation table are also placed in a one-dimensional array. FIG. 13 shows an example of the correspondence between a tabulation table over school category and age and the addresses of the areas in the table. For example, if the address in the array shown in FIG. 13 is increased by one as the age increases, the addresses are as indicated by the numbers in the array of FIG. 13. In the present embodiment, one-dimensional addresses are thus projected onto the elements of the multidimensional matrix; the item along which the address increases by one is designated the lowest-order item, and items along which the address offset is larger are designated higher-order items. In the example of FIG. 13, the low-order item is "age" (the vertical item) and the high-order item is "school category" (the horizontal item). When three or more items are involved, an item with a still larger address offset is likewise regarded as a still higher-order item.
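The projection of a two-item table onto one-dimensional addresses can be illustrated with a short Python sketch. The table shape (5 ages by 3 school categories) and the function name `address` are assumptions chosen for illustration; the figure itself is not reproduced here.

```python
def address(age_index, school_index, n_ages):
    """Project a (school category, age) cell onto a one-dimensional address."""
    # "Age" is the lowest-order item: its index moves the address by 1.
    # "School category" is higher-order: its index moves the address by n_ages.
    return age_index + n_ages * school_index


n_ages, n_schools = 5, 3  # hypothetical table shape
addresses = [
    [address(a, s, n_ages) for s in range(n_schools)]  # one row per age
    for a in range(n_ages)
]
for row in addresses:
    print(row)
# First row: [0, 5, 10]; last row: [4, 9, 14]
```

Each cell receives a distinct address in the range 0 to 14, so a flat array of 15 counters is enough to hold the whole table.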

[0033] Accordingly, as shown in FIG. 12, the view creation process refers to the items in the tabulation table to determine their dimensions (step 1200), and the item of the lowest dimension is taken as the first processing target (step 1201).

[0034] Next, for the item being processed, an area is allocated for an offset array having the same number of elements as the existence number array (step 1202). The storage position number is then initialized (step 1203), and a basic offset value is calculated for use in setting the element values of the offset array area corresponding to the storage position numbers (step 1204). The basic offset value is "1" for the lowest-order dimension. For each subsequent dimension, it is the product of the offset value of the next lower dimension and the number of elements other than "0" in the existence number array of the item of that next lower dimension.

[0035] Next, an offset value is placed as the element at the position corresponding to the storage position number (step 1205). More specifically, the offset value is initially "0", and the basic offset value is added to it as the storage position number increases. However, when the element of the existence number array corresponding to the storage position number is "0", the offset value is not increased. Moreover, when the corresponding element of the existence number array is "0", there is no need to calculate the offset value or to place the resulting value. This processing is executed for all elements (in particular, those whose element in the existence number array is not "0") (see steps 1206 and 1207). The processing of steps 1203 to 1207 is executed in order from the lowest dimension to the highest (see steps 1209 and 1210). An offset array is thereby produced for each item.

[0036] For example, for the tabular data shown in FIGS. 21(a) and 21(b), consider the information block for the item "school category" (see FIG. 9(c)) and the information block for "age". Of these items, "age" is defined as the low-order (lower) dimension and "school category" as the high-order (upper) dimension. The processing of steps 1202 to 1207 is therefore executed first for the item "age". In FIG. 14(a), reference numeral 1400 denotes the offset array obtained by these steps. As FIG. 14(a) shows, the elements of the offset array whose corresponding elements in the existence number array are other than "0" take the values "0", "1", "2", and so on, increasing by "1" each time.

[0037] After the processing for the item "age" is finished, the processing of steps 1202 to 1207 is executed for the item "school category". In FIG. 14(b), reference numeral 1410 denotes the offset array obtained by these steps. As FIG. 14(b) shows, the elements of the offset array whose corresponding elements in the existence number array are other than "0" take the values "0", "5", and "10", increasing by "5" each time. This is because the existence number array of the next lower item contains "5" elements that are not "0", so the offset value increases by "1 × 5 = 5".
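The generation of offset arrays (steps 1202 to 1207, repeated from the lowest to the highest dimension) might be sketched as follows. The existence number arrays below are hypothetical; they are chosen only so that "age" has 5 non-zero elements and "school category" has 3, as in the example above. Using `None` for item values that never occur is also an assumption of this sketch.

```python
def build_offset_arrays(existence_arrays):
    """Build one offset array per item; items are ordered lowest dimension first."""
    offset_arrays = []
    base = 1  # basic offset value of the lowest-order dimension
    for existence in existence_arrays:
        offsets = [None] * len(existence)  # None marks item values with count 0
        running = 0
        for position, count in enumerate(existence):
            if count != 0:
                offsets[position] = running
                running += base  # only non-zero elements advance the offset
        offset_arrays.append(offsets)
        # Basic offset of the next dimension: multiply by the number of
        # non-zero elements in this dimension's existence number array.
        base *= sum(1 for count in existence if count != 0)
    return offset_arrays


# Hypothetical existence number arrays (illustrative data only).
age_existence = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 2, 1, 1]  # 5 non-zero elements
school_existence = [2, 0, 3, 1]                          # 3 non-zero elements

age_offsets, school_offsets = build_offset_arrays([age_existence, school_existence])
print(age_offsets)     # [0, None, None, None, None, 1, None, None, None, None, 2, 3, 4]
print(school_offsets)  # [0, None, 5, 10]
```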

[0038] Once the offset array for each item has been generated in this way, a matrix-shaped area (cross tabulation area) for creating the cross tabulation table is allocated (step 1210). The size of the area to be allocated is the product, over the items, of the number of elements taking a value other than "0" in each item's existence number array. In the example shown in FIGS. 14(a) and 14(b), the size is "5 × 3 = 15". Within the cross tabulation area, each matrix area can be identified by its address. That is, addresses are calculated so that the address increases by one along the lowest-order item and, for a higher-order item, increases by the number of low-order item values when the low-order item value stays the same.

[0039] Next, the storage position number for the set of record numbers is initialized (step 1211), and the following processing is executed in order from the first record number. First, for each item, the pointer value at the position corresponding to the record number is identified in the pointer array to the value list (step 1212). Then, for each item, the element (offset value) at the position indicated by that pointer value is retrieved from the offset array, and these offset values are added together (step 1213). The resulting sum (added offset value) is the address of a matrix area within the cross tabulation area. The element of the matrix area whose address is the added offset value is therefore incremented (step 1214).

[0040] By executing this processing for all storage position numbers (see steps 1215 and 1216), each matrix area in the cross tabulation area is given the number of corresponding records. As an example, consider the case in which the offset arrays shown in FIGS. 14(a) and 14(b) have been created. As shown in FIG. 15, the first record number "0" (storage position number "0") is taken from the set of record numbers; the corresponding pointer value "0" in the pointer array for "school category" is identified, and the corresponding pointer value "0" in the pointer array for "age" is identified.

[0041] Next, the offset value "0" in the offset array indicated by the pointer value "0" in the pointer array for "school category" and the corresponding offset value "0" for "age" are added. The added offset value "0 + 0 = 0" obtained in this way represents an address in the cross tabulation area. The value in the matrix area at address "0" of the cross tabulation area (see reference numeral 1500) is therefore incremented.

[0042] For the second record number "1" (storage position number "1") in the set of record numbers, the pointer value "2" in the pointer array for "school category" and the pointer value "10" in the pointer array for "age" are identified, and the offset values "5" and "2" indicated by these pointer values in the respective offset arrays are identified. The added offset value is therefore "5 + 2 = 7", and the value in the matrix area of the cross tabulation area whose address is this added offset value (see reference numeral 1502) is incremented. In the same way, the added offset value "5 + 3 = 8" is calculated for the third record number "2" (storage position number "2"). By repeating, for every record number stored in the set of record numbers, the calculation of the added offset value, the designation of the matrix area in the cross tabulation area addressed by it, and the incrementing of the element of that matrix area, a tabulation table such as that shown in FIG. 21(c) is obtained.
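The accumulation loop of steps 1211 to 1216 can be sketched in Python as below. The offset arrays are written out by hand and are hypothetical, as is the "age" pointer value of the third record; they are chosen so that the first three records reproduce the addresses 0, 7, and 8 from the walk-through above. Only those three records are processed here.

```python
def cross_tabulate(pointer_arrays, offset_arrays, size, record_numbers):
    """Fill a flat cross tabulation area by summing per-item offset values."""
    table = [0] * size
    for record in record_numbers:
        # Steps 1212 and 1213: look up each item's pointer value, then the
        # offset value it indicates, and add the offset values together.
        addr = sum(
            offsets[pointers[record]]
            for pointers, offsets in zip(pointer_arrays, offset_arrays)
        )
        # Step 1214: increment the matrix area at the added offset value.
        table[addr] += 1
    return table


NA = None  # item value that never occurs, so it has no offset
age_offsets = [0, NA, NA, NA, NA, 1, NA, NA, NA, NA, 2, 3, 4]  # hypothetical
school_offsets = [0, NA, 5, 10]                                # hypothetical
age_pointers = [0, 10, 11]    # first three records only
school_pointers = [0, 2, 2]

table = cross_tabulate(
    [age_pointers, school_pointers],
    [age_offsets, school_offsets],
    size=5 * 3,
    record_numbers=range(3),
)
print(table)  # [1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
```

Each record costs one lookup per item plus a single increment, so the loop performs N increments for N records.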

[0043] Thus, according to the present embodiment, an existence number array is generated for each value list, the values (item values) whose existence number is other than "0" are identified, an offset value is obtained for each item in correspondence with the record number, and the sum of the obtained offset values (added offset value) indicates the address in the cross tabulation area.

[0044] A further example of the above cross tabulation will now be described briefly. Consider, for example, the tabular data shown in FIG. 16(a). In the present embodiment, this tabular data consists of information blocks for "school category", "sex", "age", and "name". As described above, each information block includes a "pointer array to the value list" and a "value list" (see FIG. 17). In FIG. 17, no values have yet been given to the elements of the existence number arrays.

[0045] First, narrowing the records down to "male" on the item "sex" yields the set of record numbers and the table (view) shown in FIG. 16(b). Sorting this in Japanese syllabary order yields the set of record numbers and the table (view) shown in FIG. 16(c). For this narrowing (search) and sorting, the technique devised by the present inventor and disclosed in PCT WO00/10913 can be used. The narrowing and sorting yield the set of record numbers shown in FIG. 16(c) (see reference numeral 1601). Next, the existence number arrays for "school category" and "age" are created by referring to each record number in the set of record numbers shown in FIG. 16(c) (see FIG. 18). Then, after the offset arrays shown in FIG. 19 are created, the numbers of corresponding records are placed in the matrix areas of the cross tabulation area shown in FIG. 20.
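The flow just described (select a record number set, build existence number arrays from that set, derive offset arrays, then fill the cross tabulation area) can be sketched end to end as follows. All data and names are hypothetical, the dictionary layout of an information block is an assumption of this sketch, and the search is done by a plain scan rather than by the technique of the cited publication. The sort step is omitted because the order of the record numbers does not change the counts.

```python
def existence_counts(block, record_numbers):
    """Existence number array built only from the selected record numbers."""
    counts = [0] * len(block["values"])
    for record in record_numbers:
        counts[block["pointers"][record]] += 1
    return counts


def offsets_for(counts, base):
    """Offset array for one item; None where the existence number is 0."""
    offsets, running = [None] * len(counts), 0
    for position, count in enumerate(counts):
        if count:
            offsets[position] = running
            running += base
    return offsets


def cross_tab(low, high, record_numbers):
    low_counts = existence_counts(low, record_numbers)
    high_counts = existence_counts(high, record_numbers)
    n_low = sum(1 for c in low_counts if c)
    n_high = sum(1 for c in high_counts if c)
    low_off = offsets_for(low_counts, 1)
    high_off = offsets_for(high_counts, n_low)
    table = [0] * (n_low * n_high)  # only occurring values get cells
    for record in record_numbers:
        table[low_off[low["pointers"][record]]
              + high_off[high["pointers"][record]]] += 1
    low_labels = [v for v, c in zip(low["values"], low_counts) if c]
    high_labels = [v for v, c in zip(high["values"], high_counts) if c]
    return low_labels, high_labels, table


# Hypothetical information blocks for six records.
sex = {"values": ["female", "male"], "pointers": [1, 0, 1, 1, 0, 1]}
school = {"values": ["elementary", "junior high", "high"],
          "pointers": [0, 1, 2, 0, 2, 2]}
age = {"values": [10, 13, 16, 17], "pointers": [0, 1, 2, 0, 3, 3]}

male = sex["values"].index("male")
selected = [r for r, p in enumerate(sex["pointers"]) if p == male]

ages, schools, table = cross_tab(age, school, selected)
print(selected)       # [0, 2, 3, 5]
print(ages, schools)  # [10, 16, 17] ['elementary', 'high']
print(table)          # [2, 0, 0, 0, 1, 1]
```

In this sketch the age 13 and the category "junior high" occur only in records excluded by the selection, so neither receives a row or column in the resulting table.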

[0046] Finally, the processing speed of cross tabulation by the method according to this embodiment is considered. Let "N" be the number of records to be cross-tabulated, "R" the number of item values of one item in a two-dimensional cross tabulation (for example, the "rows" of the cross tabulation table), and "C" the number of item values of the other item (for example, the "columns" of the cross tabulation table). With the method according to the present embodiment, only the following are required: (1) "2 × N" count-ups for generating the existence number arrays, (2) "R + C" count-ups for generating the offset arrays, and (3) "N" count-ups for creating the cross tabulation table. Depending on the values of R and C, therefore, about "3 × N" count-ups suffice. Even in the worst case (that is, R = C = N), "5 × N" count-ups are enough.

[0047] With the conventional method, by contrast, "2 × C × R" calculations were required. When "C ≈ N" and "R ≈ N", the amount of computation reaches O(N²). It can therefore be understood that, except when R and C are sufficiently small relative to N, the cross tabulation method according to the present embodiment is markedly faster than the conventional one.
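The operation counts quoted above can be compared numerically with a small Python sketch. The function names and the sample values of N, R, and C are assumptions made for illustration; the formulas are the ones stated in the text.

```python
def count_ups_embodiment(n, r, c):
    """2N for the existence arrays, R + C for the offset arrays, N for the table."""
    return 2 * n + (r + c) + n


def calculations_conventional(r, c):
    """Count stated in the text for the conventional method."""
    return 2 * c * r


# Worst case for the embodiment: R = C = N.
n = 1000
print(count_ups_embodiment(n, n, n))    # 5000, i.e. 5 x N
print(calculations_conventional(n, n))  # 2000000, i.e. O(N^2)

# R and C very small relative to N: the conventional count is the smaller one.
print(count_ups_embodiment(1_000_000, 10, 10))  # 3000020
print(calculations_conventional(10, 10))        # 200
```

The second pair of figures shows the exception noted in the text: when R and C are sufficiently small relative to N, the stated advantage does not hold.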

[0048] Furthermore, with the conventional method it is not known which elements of the cross tabulation are "empty" (that is, have the value "0"), so a huge memory area for the cross tabulation first had to be allocated. (For example, if one item has "100,000" values and the other has "1000" values, an area for storing "100,000 × 1000 = 100 million" elements must first be allocated.) If that memory area could not be allocated, the tabulation itself might be impossible. According to the present invention, by contrast, areas whose elements (values) would end up as "0" are not created in the cross tabulation area, so the possibility that the memory area cannot be allocated is greatly reduced.
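The difference in allocation size can be made concrete with the following sketch. The first figure reproduces the 100,000 × 1000 example from the text; the numbers of item values that actually occur (200 and 50) are purely hypothetical, as are the function names.

```python
from math import prod


def dense_cells(value_counts):
    """Cells needed when every item value gets a row or column."""
    return prod(value_counts)


def compact_cells(existence_arrays):
    """Cells needed when item values with existence number 0 are excluded."""
    return prod(sum(1 for count in existence if count != 0)
                for existence in existence_arrays)


print(dense_cells([100_000, 1000]))  # 100000000

# Hypothetical case: only 200 and 50 of the item values occur in the records.
item_a = [1] * 200 + [0] * 99_800
item_b = [1] * 50 + [0] * 950
print(compact_cells([item_a, item_b]))  # 10000
```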

[0049] The present invention is not limited to the above embodiments; various modifications are possible within the scope of the invention set forth in the claims, and it goes without saying that these are also included within the scope of the present invention. For example, although one-dimensional and two-dimensional tabulation have been described in the above embodiments, it is clear that the technique of the present invention can also be used for tabulation in three or more dimensions. The projection of addresses in the multidimensional case is considered here.

[0050] Suppose that a tabulation table over "I" items is to be obtained, that the number of item values of each item is denoted by $N_i$ ($0 \le i \le I-1$), and that the number of elements whose value is 0 in the existence number array of each item is denoted by $n_i$ ($0 \le i \le I-1$). The offset values $P_{i,m}$ ($0 \le m \le N_i - n_i$) of the offset array of each item are then given by

$$P_{i,0} = 0, \qquad P_{i,m+1} - P_{i,m} = \prod_{0 \le k \le m-1} (N_k - n_k).$$

That is, in a given dimension "i", the difference between adjacent offset values is the successive product of the numbers of elements whose existence number is not "0" in the items of lower dimension than that item.
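For three or more items, the step between adjacent offset values can be computed as a running product. The sketch below follows the prose reading of the paragraph above, namely that the step for item i is the product of the non-zero counts of all items of lower order than i; the function names and the counts are hypothetical.

```python
from itertools import accumulate
from operator import mul


def offset_steps(nonzero_counts):
    """Step between adjacent offset values per item, lowest order first."""
    # Item 0 steps by 1; item i steps by the product of the non-zero
    # counts (N_k - n_k) of every item k below it.
    return list(accumulate([1] + nonzero_counts[:-1], mul))


def address(compact_indices, steps):
    """Added offset value for one combination of occurring item values."""
    return sum(index * step for index, step in zip(compact_indices, steps))


nonzero_counts = [2, 3, 2]               # hypothetical N_i - n_i for I = 3 items
steps = offset_steps(nonzero_counts)
size = steps[-1] * nonzero_counts[-1]

print(steps)                      # [1, 2, 6]
print(size)                       # 12
print(address([1, 2, 1], steps))  # 11
```

Here `compact_indices` holds, for each item, the rank of the item value among the values that actually occur, so every combination maps to a distinct address from 0 to 11.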

[0051] The specific values given in the offset arrays are not limited to those shown in the above embodiments. That is, when a value in a matrix area is identified by one or more addresses in the area allocated for the cross tabulation area (that is, when the difference between the addresses of adjacent matrix areas is 2 or more), the offset values may be multiplied by "k" (k: the above address difference).
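One way to picture the scaling by "k" is a cross tabulation area held as raw bytes, where each counter occupies several addresses. The sketch below assumes 4-byte little-endian counters (k = 4) and uses hypothetical offset arrays and records; it is an illustration of the idea, not a layout taken from the embodiment.

```python
import struct

CELL = 4  # k: address difference between adjacent matrix areas, in bytes


def scale_offsets(offsets, k):
    """Multiply every offset value by k, keeping None for unused item values."""
    return [None if offset is None else offset * k for offset in offsets]


def increment(area, addr):
    """Read the 4-byte counter at addr, add one, and write it back."""
    (value,) = struct.unpack_from("<I", area, addr)
    struct.pack_into("<I", area, addr, value + 1)


age_offsets = scale_offsets([0, 1, 2], CELL)   # [0, 4, 8]
school_offsets = scale_offsets([0, 3], CELL)   # [0, 12]
area = bytearray(3 * 2 * CELL)                 # 6 counters of 4 bytes each

# Hypothetical records given as (age pointer value, school pointer value).
for age_ptr, school_ptr in [(0, 0), (2, 1), (0, 0)]:
    increment(area, age_offsets[age_ptr] + school_offsets[school_ptr])

print(struct.unpack("<6I", area))  # (2, 0, 0, 0, 0, 1)
```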

[0052] Furthermore, in the above embodiments, a predetermined program is loaded into a general computer system 10 and executed to realize the joining of a plurality of tabular data and the processing of the joined tabular data. The present invention is not limited to this, however; it goes without saying that a board computer dedicated to database processing may be connected to a general computer system such as a personal computer and configured to execute the above processing. Accordingly, in this specification, "means" does not necessarily denote physical means, and also covers the case in which the function of each means is realized by software. Furthermore, the function of one means may be realized by two or more physical means, or the functions of two or more means may be realized by one physical means.

【００５３】[0053]

[Effects of the Invention] According to the present invention, it is possible to provide a processing method that reduces memory usage and achieves high-speed cross tabulation.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the hardware configuration of a computer system capable of implementing cross tabulation and the like according to an embodiment of the present invention.

FIG. 2 is a diagram showing an information block used in the present embodiment.

FIG. 3 is a diagram showing an example of tabular data and an example of information blocks based on that tabular data.

FIG. 4 is a diagram showing another example of tabular data and another example of information blocks based on that tabular data.

FIG. 5 is a flowchart showing a search technique for a single item.

FIG. 6 is a flowchart explaining the process for creating an information block from tabular data.

FIG. 7 is a diagram showing an example of original data for creating an information block.

FIG. 8 is a flowchart showing the tallying process according to the present embodiment.

FIG. 9 is a diagram showing an example of an information block according to the present embodiment.

FIG. 10 is a diagram showing an example of a view obtained by the tallying method according to the present embodiment.

FIG. 11 is a flowchart showing the cross tabulation process according to the present embodiment.

FIG. 12 is a flowchart showing in more detail the processing of step 1105 when cross tabulation is performed over two items.

FIG. 13 is a diagram showing an example of a tabulation table over school category and age generated by the method according to the present embodiment.

FIG. 14 is a diagram showing an example of information blocks including offset arrays according to the present embodiment.

FIG. 15 is a diagram for explaining the creation of a tabulation table using the offset arrays according to the present embodiment.

FIG. 16 is a diagram explaining the process of creating another cross tabulation table according to the present embodiment.

FIG. 17 is a diagram explaining the process of creating another cross tabulation table according to the present embodiment.

FIG. 18 is a diagram explaining the process of creating another cross tabulation table according to the present embodiment.

FIG. 19 is a diagram explaining the process of creating another cross tabulation table according to the present embodiment.

FIG. 20 is a diagram explaining the process of creating another cross tabulation table according to the present embodiment.

FIG. 21 is a diagram showing an example of cross tabulation.

[Explanation of symbols]

10 Computer system
12 CPU
14 RAM
16 ROM
18 Fixed storage device
20 CD-ROM driver
22 I/F
24 Input device
26 Display device

Continued from the front page
(51) Int. Cl.⁷: G06F 17/30 (identification symbols 170, 180, 340)
FI: G06F 17/30 170H, 180D, 340D

Claims

[Claims]

1. A data tabulation method for performing tabulation processing on a plurality of tabular data, each represented as an array of records containing items and the item values belonging to them, the method comprising: a step of structuring each of the tabular data so that it is divided into one or more information blocks, each consisting of a value list in which the item values belonging to a specific item are stored in order of the item value numbers corresponding to those item values, and a pointer array in which pointer values indicating the item value numbers are stored in order of unique record numbers; a step of providing, in the information block, an existence number array in which existence numbers, each indicating the number of records having the same item value number, are arranged in correspondence with the values in the value list; a step of finding, in the existence number array, the item values whose existence number is 0; and a step of creating, with the found item values removed, a tabulation table consisting of the existence numbers corresponding to the item values of the item.

2. A data tabulation method for performing tabulation processing over a plurality of items on tabular data, each represented as an array of records containing items and the item values belonging to them, the method comprising: a step of structuring the data so that, for each of the plurality of items used for tabulation, it is divided into one or more information blocks, each consisting of a value list in which the item values belonging to the item are stored in order of the item value numbers corresponding to those item values, and a pointer array in which pointer values indicating the item value numbers are stored in order of unique record numbers; a step of providing, in each of the information blocks, an existence number array in which existence numbers, each indicating the number of records having the same item value number, are arranged in correspondence with the values in the value list; a step of finding, in each of the existence number arrays, the item values whose existence number is 0; a step of projecting the multidimensional positions of the elements of a matrix-shaped tabulation table over the plurality of items onto one-dimensional positions so that each element corresponds to an address; a step of generating, for each of the existence number arrays, an offset array in which at least the elements corresponding to elements whose existence number is not 0 are offset values based on the projection; a step of identifying, for each item, the offset value in the corresponding offset array by referring, on the basis of a record number, to the corresponding pointer value in the pointer array; a step of adding the obtained offset values to calculate an added offset value; a step of calculating the address by relating the added offset value to the projection; and a step of counting up the value at the position indicated by the obtained address, wherein the tabulation table is obtained on the basis of the values at the positions indicated by the finally obtained addresses.

3. The data tabulation method according to claim 2, wherein, when the number of item values of each item is denoted by $N_i$ ($0 \le i \le I-1$, where $I$ is the total number of items involved in the tabulation) and the number of elements whose value is 0 in the existence number array of each item is denoted by $n_i$ ($0 \le i \le I-1$), the offset values $P^i_m$ (where $0 \le m \le N_i - n_i$) of the offset array of each item satisfy $P^i_0 = 0$ and $P^i_{m+1} - P^i_m = \prod (N_k - n_k)$ ($0 \le k \le m-1$).
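
As a rough illustration of the offset recurrence, the sketch below computes offset values that start at 0 and grow by a constant step. It rests on an assumption: the step for item i is taken to be the product of (N_k − n_k) over the items k that precede item i, which is one reading of the product in the claim, and one offset is generated per surviving item value. The function name and sample numbers are hypothetical.

```python
from math import prod


def offset_values(N, n, i):
    """Offsets for item i: start at 0, constant step between neighbours.

    Assumption: the step is the product of (N[k] - n[k]) for k < i.
    """
    step = prod(N[k] - n[k] for k in range(i))   # empty product is 1
    return [m * step for m in range(N[i] - n[i])]


# Hypothetical sizes: three items with 3, 3 and 4 item values,
# of which 1, 1 and 0 have an existence number of 0.
N = [3, 3, 4]
n = [1, 1, 0]
for i in range(len(N)):
    print(i, offset_values(N, n, i))
# 0 [0, 1]
# 1 [0, 2]
# 2 [0, 4, 8, 12]
```

Under this reading the largest address is 1 + 2 + 12 = 15, which fits a table of 2 × 2 × 4 = 16 cells exactly.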

4. The data tabulation method according to any one of claims 1 to 3, further comprising, prior to the step of providing the existence number array, the steps of: selecting, from among the record numbers, the record numbers that are subject to tabulation; and generating a record number set in which the selected record numbers are arranged in a predetermined order, wherein the existence number array is generated by referring to the corresponding pointer values in the pointer array on the basis of the record numbers included in the record number set.
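
The record selection of claim 4 can be sketched in Python as below. The selection rule (an age threshold), the ascending order used for the record number set, and all names are assumptions made for illustration only.

```python
def select_records(n_records, is_selected):
    """Record number set: selected record numbers in ascending order."""
    return [r for r in range(n_records) if is_selected(r)]


def existence_counts_for(record_set, pointer_array, n_values):
    """Existence numbers counted only over the records in record_set."""
    counts = [0] * n_values
    for r in record_set:
        counts[pointer_array[r]] += 1
    return counts


# Hypothetical data: a "city" item plus an age column used for selection.
city_values = ["Osaka", "Tokyo", "Yokohama"]
city_pointers = [1, 0, 2, 1, 1]
ages = [25, 41, 33, 52, 19]

record_set = select_records(len(ages), lambda r: ages[r] >= 40)
counts = existence_counts_for(record_set, city_pointers, len(city_values))
print(record_set)  # [1, 3]
print(counts)      # [1, 1, 0]
```

Restricting the count to the selected records is what typically produces existence numbers of 0: "Yokohama" appears in the full data but not among the selected records, so it can be removed from the table.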

5. A computer-readable storage medium storing a program for implementing a data tabulation method for tabulating a plurality of tabular data, each represented as an array of records each including items and the item values belonging to them, the program comprising the steps of: structuring each of the tabular data so that it is divided into one or more information blocks, each consisting of a value list in which the item values belonging to a specific item are stored in the order of the item value numbers corresponding to those item values, and a pointer array in which pointer values designating the item value numbers are stored in the order of unique record numbers; providing an existence number array in which existence numbers, each indicating the number of records having the same item value number, are arranged in correspondence with the values in the value list of the information block; finding, in the existence number array, any item value whose existence number is 0; and creating, with the found item values removed, a tabulation table consisting of the existence numbers corresponding to the item values of the item.

6. A computer-readable storage medium storing a program for implementing a data tabulation method for performing tabulation over a plurality of items of tabular data, each represented as an array of records each including items and the item values belonging to them, the program comprising the steps of: for each of the plurality of items used for the tabulation, structuring the data so that it is divided into one or more information blocks, each consisting of a value list in which the item values belonging to the item are stored in the order of the item value numbers corresponding to those item values, and a pointer array in which pointer values designating the item value numbers are stored in the order of unique record numbers; for each of the items, providing an existence number array in which existence numbers, each indicating the number of records having the same item value number, are arranged in correspondence with the values in the value list; finding, in each of the existence number arrays, any item value whose existence number is 0; projecting the multidimensional position of each element of a matrix-like tabulation table over the plurality of items onto a one-dimensional position, so that the count of each element corresponds to an address; generating, for each of the existence number arrays, an offset array that contains at least elements corresponding to the elements whose existence number is not 0, each such element being an offset value based on the projection; for each item, referring to the corresponding pointer value in the pointer array on the basis of the record number, thereby identifying the offset value in the corresponding offset array; adding the offset values thus obtained to calculate a summed offset value; calculating the address by relating the summed offset value to the projection; and counting up the value at the position indicated by the calculated address, whereby the tabulation table is obtained from the values at the positions indicated by the finally obtained addresses.

7. The storage medium according to claim 6, wherein, when the number of item values of each item is denoted by $N_i$ ($0 \le i \le I-1$, where $I$ is the total number of items involved in the tabulation) and the number of elements whose value is 0 in the existence number array of each item is denoted by $n_i$ ($0 \le i \le I-1$), the offset values $P^i_m$ (where $0 \le m \le N_i - n_i$) of the offset array of each item satisfy $P^i_0 = 0$ and $P^i_{m+1} - P^i_m = \prod (N_k - n_k)$ ($0 \le k \le m-1$).

8. The storage medium according to claim 1, storing a program that further executes, prior to the step of providing the existence number array, the steps of: selecting, from among the record numbers, the record numbers that are subject to tabulation; and generating a record number set in which the selected record numbers are arranged in a predetermined order, wherein the existence number array is generated by referring to the corresponding pointer values in the pointer array on the basis of the record numbers included in the record number set.