JPWO2009044486A1

JPWO2009044486A1 - Method for sorting tabular data, multi-core type apparatus, and program

Info

Publication number: JPWO2009044486A1
Application number: JP2009535944A
Authority: JP
Inventors: 古庄　晋二; 晋二古庄
Original assignee: Turbo Data Laboratories Inc
Current assignee: Turbo Data Laboratories Inc
Priority date: 2007-10-05
Filing date: 2007-10-05
Publication date: 2011-02-03
Also published as: WO2009044486A1

Abstract

In a multi-core processing apparatus, a sorting method rearranges the records of tabular data according to a predetermined item value. A plurality of arithmetic units operate in parallel to (i) apply a sort to the item value access information, using the item value information as a key, within the block assigned to each arithmetic unit; (ii) sort, between blocks, pairs of item value information and record sequence number into a predetermined order; (iii) repeat the inter-block sort (ii) to create a sorted block number array; and (iv) distribute the elements of the block number array to the respective blocks to create a sorted record sequence number array.

Description

The present invention relates to a sorting method for use in a multi-core processing apparatus, the method rearranging, according to a predetermined item value, the records of tabular data that is represented as an array of records containing item values corresponding to data items and is constructed in a global memory.

The present invention also relates to a multi-core processing apparatus that rearranges, according to a predetermined item value, the records of tabular data that is represented as an array of records containing item values corresponding to data items and is constructed in a global memory.

Furthermore, the present invention relates to a program for causing a computer equipped with a multi-core processor to execute the sorting method, to a computer program product, and to a recording medium on which the computer program is recorded.

Conventionally, high-speed processing of large-scale data has been demanded in various industrial fields. Large-scale data processing has continued to become faster through the development of hardware technologies, such as faster memory access through caching and prefetching, faster memory itself, and faster arithmetic processing through processor parallelization, as well as through the development of data processing algorithms.

The present inventor has proposed basic data processing algorithms for processing large-scale data at high speed, for example the "on-memory data processing algorithm" described in Patent Document 1. This technique is based on the idea of decomposing tabular data into components by item (that is, by column) rather than by record (that is, by row) as in the conventional approach. More specifically, tabular data is represented by a data structure consisting of (1) an array representing the record order, (2) a value table in which the unique item values belonging to an item are arranged in a predetermined order (for example, ascending order), and (3) an array representing the position in the value table at which the item value corresponding to each record is stored. Adopting this data structure enables high-speed searching, sorting, merging, joining, and similar processing of tabular data.
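The three-part structure described above can be illustrated with a minimal sketch. The function name `decompose_column` and the sample column are hypothetical and are not taken from Patent Document 1; the sketch only shows one way a single column could be split into a value table and a position array.

```python
def decompose_column(values):
    """Split one column into a sorted table of unique values and a position array."""
    value_table = sorted(set(values))                 # (2) unique item values, ascending
    position = {v: i for i, v in enumerate(value_table)}
    pointers = [position[v] for v in values]          # (3) per-record position in the value table
    return value_table, pointers


ages = [30, 25, 30, 41, 25]                           # hypothetical column
record_order = list(range(len(ages)))                 # (1) array representing the record order
value_table, pointers = decompose_column(ages)

# Following the position array recovers the item value of every record.
restored = [value_table[pointers[r]] for r in record_order]
```

In this representation each record carries only a small integer position, and each distinct value is stored once, in sorted order.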

Furthermore, the present inventor has proposed various on-memory data processing algorithms adapted to processor parallelization, such as distributed-memory multiprocessor systems and shared-memory multiprocessor systems. For example, a search and sort algorithm for distributed-memory multiprocessor systems is described in Patent Document 2, and an aggregation algorithm is described in Patent Document 3. An efficient sort algorithm for shared-memory multiprocessor systems is described in Patent Document 4.

Sorting is an operation that occurs frequently when processing large-scale data, in particular large-scale tabular data. Radix sort and counting sort (also called distribution counting sort) are known as efficient sorting algorithms. Radix sort takes each digit of the data to be sorted as a key and sorts digit by digit. Counting sort computes the number of occurrences and the cumulative frequency distribution of the data to be sorted, and rearranges the data according to the cumulative frequency distribution. Counting sort is sometimes used to sort each digit in radix sort. Patent Document 4 proposes a method by which, in a shared-memory processor system, a plurality of processors sort large-scale tabular data on the shared memory in parallel.
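As an illustration of the counting sort just described, the following minimal sketch counts occurrences, builds the cumulative frequency distribution, and places each element accordingly. The function name `counting_sort` and the assumption that keys are small non-negative integers are choices made for this example only.

```python
def counting_sort(keys, num_values):
    """Return the indices that put `keys` in ascending order (stable).

    Assumption: every key is an integer in range(num_values).
    """
    counts = [0] * num_values
    for k in keys:
        counts[k] += 1                     # number of occurrences of each key

    starts, total = [], 0
    for c in counts:
        starts.append(total)               # cumulative frequency: first slot of each key
        total += c

    order = [0] * len(keys)
    for i, k in enumerate(keys):
        order[starts[k]] = i               # place element i in the next free slot of its key
        starts[k] += 1
    return order


keys = [2, 0, 1, 2, 0]
order = counting_sort(keys, 3)
sorted_keys = [keys[i] for i in order]
```

Because elements with equal keys are placed in their original order, the same routine could serve as the per-digit pass of a radix sort.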

In contrast, processor architectures that include a plurality of (or many) cores inside a single processor have been proposed in recent years. The Cell Broadband Engine™ is known as an example of a multi-core processor (see Non-Patent Document 1). This type of processor is intended for applications such as high-speed processing of multimedia data and distributed computing. In this architecture, each core has a dedicated local memory, although not a large one, and can perform computation independently of the other cores. In practice, because the capacity of the local memory is insufficient for processing multimedia data and the like, an external global memory is provided. A multi-core processor architecture is highly scalable because parallelism and processing capacity increase through the addition of cores rather than through reliance on higher clock speeds.

Therefore, it is desirable that such a multi-core processor architecture be applied not only to multimedia data processing but also to various applications that require high speed, in particular to the sorting of large-scale tabular data.

Non-Patent Document 1: "Cell Broadband Engine Architecture", Version 1.01, October 3, 2006 [retrieved July 26, 2007], Internet (URL: http://cell.scei.co.jp/pdf/CBE_Architecture_v101.pdf)
Patent Document 1: International Publication No. 00/10103
Patent Document 2: International Publication No. 2005/041066
Patent Document 3: International Publication No. 2005/041067
Patent Document 4: International Publication No. 2006/126467

The present inventor has recognized the importance of a technique that uses the highly scalable multi-core processor architecture described above in order to sort large-scale tabular data at high speed.

However, in applications that use a large amount of memory, such as databases, the data to be sorted cannot all be held in the local memory attached to a core, which makes the sort algorithm more complex. For example, if data too large to fit in the local memory attached to each core is accessed randomly, accesses to the external global memory occur frequently and processing performance drops significantly. A new sort algorithm that does not cause this problem is therefore needed.

Accordingly, it is desirable to be able to provide a sorting method that, in a computer equipped with a multi-core processor, rearranges the records of tabular data representing an array of records containing item values corresponding to one or more data items, according to the item values of a predetermined data item, using a small-capacity working memory and without degrading parallel processing performance.

It is also desirable to be able to provide a multi-core information processing apparatus that rearranges the records of tabular data representing an array of records containing item values corresponding to one or more data items, according to the item values of a predetermined data item, using a small-capacity working memory and without degrading parallel processing performance.

It is further desirable to be able to provide a program, a computer program product, and a recording medium on which the computer program is recorded, for rearranging, in a computer equipped with a multi-core processor, the records of tabular data representing an array of records containing item values corresponding to one or more data items, according to the item values of a predetermined data item, using a small-capacity working memory and without degrading parallel processing performance.

According to one embodiment of the present invention, there is provided a sorting method for use in a multi-core processing apparatus comprising a plurality of arithmetic units, each including a dedicated local memory, and a global memory connected to the plurality of arithmetic units, the sorting method rearranging, according to the item values of a predetermined data item, the records of tabular data that represents an array of records containing item values corresponding to one or more data items and is constructed in the global memory,
wherein the tabular data is represented by:
a block number array in which block numbers, each corresponding to a record in the array of records divided into blocks of a size that fits in the local memory of each arithmetic unit, are stored in the order of the record sequence numbers of the records in the tabular data;
a record sequence number array in which the record sequence numbers of the assigned records, which are the records belonging to each block, are stored in the order of the record sequence numbers;
an item value access information array in which item value access information for accessing the item values contained in the assigned records is stored in the order of the record sequence numbers; and
item value information that describes, for each data item, the item values accessed using the item value access information,
and wherein the sorting method comprises, for the predetermined data item, the steps of:
(i) the plurality of arithmetic units operating in parallel to apply, for each block containing the assigned records, a sort to the item value access information array using the item value information as a key, thereby creating in the global memory a sorted item value access information array, together with a work item value information array and a work record sequence number array that correspond to the sorted item value access information array;
(ii) the plurality of arithmetic units operating in parallel to merge, into a predetermined order, element pairs for a pair of blocks, each element pair consisting of a first element from the work item value information array and the corresponding second element from the work record sequence number array, using the first element as the upper digit and the second element as the lower digit, and to associate with each merged element pair the block number of the block that contains it, thereby creating a new work item value information array, a new work record sequence number array, and a new block number array for the pair of blocks;
(iii) the plurality of arithmetic units operating in parallel and hierarchically to execute step (ii) repeatedly, thereby creating a final block number array in the global memory; and
(iv) the plurality of arithmetic units operating in parallel to distribute the record sequence numbers at which the elements of the final block number array are stored to the block numbers designated by those elements and to arrange them in a predetermined order, thereby creating a sorted record sequence number array in the global memory.
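Steps (i) to (iv) can be sketched sequentially as follows. This is an illustration under simplifying assumptions, not the patented implementation: the function name `sort_blocks`, the dictionary layout of a block, and the modeling of item value access information as an index into the block are all hypothetical, and the parallel execution and the transfers between global and local memory are omitted.

```python
from heapq import merge


def sort_blocks(blocks):
    """Sequential sketch of steps (i) to (iv).

    Assumption: each block is a dict with "recno" (record sequence numbers of
    its assigned records, ascending) and "values" (their key item values).
    """
    sorted_access, runs = [], []
    for b, blk in enumerate(blocks):
        # Step (i): sort the access information within the block by item value.
        order = sorted(range(len(blk["values"])), key=lambda i: blk["values"][i])
        sorted_access.append(order)
        # Work arrays: (item value, record sequence number), tagged with the block number.
        runs.append([(blk["values"][i], blk["recno"][i], b) for i in order])

    # Steps (ii) and (iii): merge pairs of runs level by level, comparing the
    # item value as the upper digit and the record sequence number as the lower digit.
    while len(runs) > 1:
        runs = [
            list(merge(*runs[j:j + 2], key=lambda t: (t[0], t[1])))
            for j in range(0, len(runs), 2)
        ]
    final_block_numbers = [b for _, _, b in runs[0]] if runs else []

    # Step (iv): the position of each element is a new record sequence number;
    # hand it to the block named by that element.
    sorted_recno = [[] for _ in blocks]
    for new_recno, b in enumerate(final_block_numbers):
        sorted_recno[b].append(new_recno)
    return final_block_numbers, sorted_recno, sorted_access


blocks = [
    {"recno": [0, 1, 2], "values": [30, 10, 20]},
    {"recno": [3, 4, 5], "values": [20, 10, 30]},
]
final_block_numbers, sorted_recno, sorted_access = sort_blocks(blocks)
```

In this sketch, entry `k` of `sorted_access[b]` identifies the record of block `b` that ends up at record sequence number `sorted_recno[b][k]` in the sorted table.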

Tabular data constructed in the global memory of a computer equipped with a multi-core processor is described in two types of data format, so that its records can be rearranged using a small-capacity working memory without degrading parallel processing performance. Data of the first type is a group of arrays that is divided into pieces small enough to be guaranteed to fit in the local memory and is held in the global memory (or on disk). Because data of the first type can be transferred from the global memory to the local memory in one batch, no delay arises even when it is accessed randomly. Data of the second type is a group of arrays that, when a large amount of data is accessed, is guaranteed always to be accessed consecutively in a predetermined order (for example, ascending or descending order) and is held in the global memory (or on disk). Because data of the second type does not fit in the local memory as it is, it is transferred from the global memory to the local memory by sequential access, in units of a size that fits in the local memory. Of course, for data of the first type, elements stored in the global memory may in some cases be accessed directly and in part, rather than by sequential access.
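The access pattern for data of the second type can be sketched as follows. The generator name `stream_in_chunks` and the parameter `local_capacity` are hypothetical; a Python list stands in for the global memory and each yielded slice stands in for one transfer into local memory.

```python
def stream_in_chunks(global_array, local_capacity):
    """Yield consecutive slices of at most `local_capacity` elements.

    Models second-type data: the array is too large for local memory, so it is
    read strictly in order, one local-memory-sized piece at a time.
    """
    for start in range(0, len(global_array), local_capacity):
        yield global_array[start:start + local_capacity]


global_array = list(range(7))
chunks = list(stream_in_chunks(global_array, 3))
```

Data of the first type would instead be copied into local memory as a whole block, after which any element of the copy can be read in any order.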

In this document, tabular data means data represented as an array of records containing item values corresponding to data items.

In this document, a multi-core processing apparatus or processor means an apparatus comprising a plurality of arithmetic units each including a dedicated local memory, a global memory connected to the plurality of arithmetic units, a bus connecting the plurality of arithmetic units, and at least one control unit connected to the global memory and the plurality of arithmetic units.

Furthermore, in this document, "applying a sort (process) to an array" means "sorting the elements of the array".

According to this embodiment, the component decomposition approach described above is combined with the two types of data format described above, and the (plural or many) records of tabular data constructed in the global memory of the multi-core processing apparatus are divided into blocks identified by block numbers. Each block corresponds to the arithmetic unit responsible for processing the records contained in that block. The records for which each arithmetic unit is responsible are called assigned records in this document. A block number array, in which these block numbers are stored in the order of the record sequence numbers, is created in the global memory. The block number array holds data of the second type. A record sequence number corresponds to the position at which a record is held in the tabular data, for example a row number. Because the sort processing according to this embodiment rearranges the records of the tabular data, the record sequence number assigned to each record is changed by the sort processing.

To identify its assigned records, each arithmetic unit can access a record sequence number array in which the record sequence numbers of the assigned records are stored in the order of the record sequence numbers. This record sequence number array holds data of the first type and is transferred, as needed, from the global memory to the local memory in each arithmetic unit.
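The relationship between the block number array and the per-block record sequence number arrays can be sketched as follows. The function name `build_block_arrays` is hypothetical, and assigning consecutive records to the same block is an assumption made for this example; the text only requires that each block fit in local memory.

```python
def build_block_arrays(num_records, block_size):
    """Build a block number array and one record sequence number array per block.

    Assumption: records are assigned to blocks in consecutive runs of `block_size`.
    """
    # Block number array: indexed by record sequence number.
    block_numbers = [r // block_size for r in range(num_records)]

    num_blocks = (num_records + block_size - 1) // block_size
    recno_arrays = [[] for _ in range(num_blocks)]
    for recno, b in enumerate(block_numbers):
        recno_arrays[b].append(recno)      # stays in record sequence number order
    return block_numbers, recno_arrays


block_numbers, recno_arrays = build_block_arrays(num_records=5, block_size=2)
```

The block number array is read in order and so behaves as second-type data, while each per-block array is small enough to be loaded whole as first-type data.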

Furthermore, to access the item values contained in its assigned records, each arithmetic unit can access an item value access information array in which item value access information is stored in the order of the record sequence numbers. This item value access information array holds data of the first type and is transferred, as needed, from the global memory to the local memory in each arithmetic unit.

The item values contained in the assigned records of each arithmetic unit are held in the global memory as item value information, so that each arithmetic unit can access them for each data item using the item value access information array, and they are transferred, as needed, from the global memory to the local memory in each arithmetic unit.

According to one embodiment of the present invention,
step (i) further includes the steps of:
the plurality of arithmetic units operating in parallel to transfer, from the global memory to the local memory, the record sequence number array and the item value access information array that correspond to the block containing the assigned records, and the item value information array for the predetermined item; and
the plurality of arithmetic units operating in parallel to transfer, to the global memory, the sorted item value access information array, the work item value information array, and the work record sequence number array created in the local memory,
and step (ii) further includes the steps of:
the plurality of arithmetic units operating in parallel to transfer, from the global memory to the local memory, the work item value information array and the work record sequence number array for a pair of blocks; and
the plurality of arithmetic units operating in parallel to transfer the new work item value information array, the new work record sequence number array, and the new block number array created in the local memory to the global memory or to an arithmetic unit for further processing.

According to one embodiment of the present invention, the item value information comprises:
a global item value array in which, for each data item, unique item values are stored in a predetermined order;
a local item value number array in which, for each data item, local item value numbers identifying the item values contained in the assigned records are stored in the order of the record sequence numbers; and
an item value designation pointer array in which, for each data item, item value designation pointers, each designating the position in the global item value array at which the item value represented by a local item value number is stored, are stored in a predetermined order,
wherein the sort applied in step (i) is a distribution counting sort, and the work item value information array is a work item value designation pointer array.

According to this embodiment, the item value information of the tabular data is constructed in the global memory as a global item value array in which, for each data item, unique item values are stored in a predetermined order (ascending or descending). This global item value array holds data of the second type. In addition, so that each arithmetic unit can access the item values contained in its assigned records using the item value access information array, two further arrays are constructed in the global memory for each data item and transferred, as needed, from the global memory to the local memory in each arithmetic unit: a local item value number array, in which local item value numbers identifying the item values contained in the assigned records are stored, for example, in the order of the original record position numbers (that is, the initial record sequence numbers); and an item value designation pointer array, in which item value designation pointers designating the positions in the global item value array at which the item values represented by the local item value numbers are stored are held in a predetermined order (ascending or descending). The local item value number array and the item value designation pointer array hold data of the first type. In this way, the item values of the tabular data are expanded, for each data item, into the form of a global item value array, a local item value number array, and an item value designation pointer array.
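The three arrays described above can be sketched for one data item as follows. The function name `build_item_value_info` and the sample city names are hypothetical; ascending order is chosen for both the global item value array and the pointer arrays.

```python
def build_item_value_info(block_values):
    """Build the global item value array plus per-block number and pointer arrays.

    `block_values` holds, per block, the item values of the assigned records
    in record sequence number order (an assumed input layout).
    """
    global_values = sorted({v for blk in block_values for v in blk})
    gpos = {v: i for i, v in enumerate(global_values)}

    local_numbers, pointers = [], []
    for blk in block_values:
        ptr = sorted({gpos[v] for v in blk})           # positions in global_values, ascending
        lpos = {p: n for n, p in enumerate(ptr)}
        pointers.append(ptr)
        local_numbers.append([lpos[gpos[v]] for v in blk])
    return global_values, local_numbers, pointers


block_values = [["Tokyo", "Osaka", "Tokyo"], ["Kyoto", "Osaka"]]
global_values, local_numbers, pointers = build_item_value_info(block_values)

# Item value of record 0 in block 0, reached through both levels of indirection.
first_value = global_values[pointers[0][local_numbers[0][0]]]
```

Because the pointers of a block are ascending, the local item value numbers are small integers whose numeric order matches the order of the item values, which is the property a distribution counting sort relies on.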

Step (i) further includes the steps of:
the plurality of arithmetic units operating in parallel to transfer, from the global memory to the local memory, the record sequence number array and the item value access information array that correspond to the block containing the assigned records, and the local item value number array and the item value designation pointer array for the predetermined item; and
the plurality of arithmetic units operating in parallel to transfer, to the global memory, the sorted item value access information array, the work item value designation pointer array, and the work record sequence number array created in the local memory,
and step (ii) further includes the steps of:
the plurality of arithmetic units operating in parallel to transfer, from the global memory to the local memory, the work item value designation pointer array and the work record sequence number array for a pair of blocks; and
the plurality of arithmetic units operating in parallel to transfer the new work item value designation pointer array, the new work record sequence number array, and the new block number array created in the local memory to the global memory or to an arithmetic unit for further processing.
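For a single block, step (i) of this embodiment might look like the following sketch, which applies a distribution counting sort keyed on the local item value numbers. The function name `local_counting_sort` is hypothetical, the item value access information is modeled as an index into the block, and the transfers between global and local memory are left out.

```python
def local_counting_sort(recno, local_numbers, pointers):
    """Counting sort of one block, keyed on local item value numbers.

    Returns the sorted access information (assumed here to be indices into the
    block), the work item value designation pointer array, and the work record
    sequence number array.
    """
    counts = [0] * len(pointers)
    for num in local_numbers:
        counts[num] += 1

    starts, total = [], 0
    for c in counts:
        starts.append(total)               # cumulative count: first slot of each number
        total += c

    size = len(local_numbers)
    sorted_access = [0] * size
    work_pointers = [0] * size
    work_recno = [0] * size
    for i, num in enumerate(local_numbers):
        dst = starts[num]
        starts[num] += 1
        sorted_access[dst] = i
        work_pointers[dst] = pointers[num]  # position in the global item value array
        work_recno[dst] = recno[i]
    return sorted_access, work_pointers, work_recno


sorted_access, work_pointers, work_recno = local_counting_sort(
    recno=[0, 1, 2], local_numbers=[1, 0, 1], pointers=[1, 2]
)
```

The work pointer array is comparable across blocks because every pointer refers to the same global item value array, so it can serve as the upper digit in the merge of step (ii).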

According to another embodiment of the present invention, the item value information comprises an item value array in which, for each data item, the item values contained in the assigned records are stored,
the sort applied in step (i) is a stable sort, and
the work item value information array is a work item value array.

In this embodiment, the item value information of the tabular data is constructed in the global memory as an item value array in which, for each data item, the item values contained in the assigned records are stored, for example, in the order of the original record position numbers (that is, the initial record sequence numbers). In step (i), a stable sort, that is, a sort in which elements with equal key values keep their relative order before and after the sort, is applied to the item value designation pointer array using the item values as keys.
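The effect of a stable sort in step (i) can be seen in the following sketch for one block. The function name `stable_local_sort` and the sample data are hypothetical, the item value access information is modeled as an index into the item value array, and Python's built-in `sorted`, which is guaranteed to be stable, stands in for whatever stable sort an implementation would use.

```python
def stable_local_sort(recno, access, item_values):
    """Stable sort of one block's access information, keyed on the item value."""
    order = sorted(range(len(access)), key=lambda i: item_values[access[i]])
    sorted_access = [access[i] for i in order]
    work_values = [item_values[access[i]] for i in order]   # work item value array
    work_recno = [recno[i] for i in order]                  # work record sequence number array
    return sorted_access, work_values, work_recno


sorted_access, work_values, work_recno = stable_local_sort(
    recno=[10, 11, 12, 13], access=[0, 1, 2, 3], item_values=[5, 3, 5, 3]
)
```

Records with equal item values keep their original relative order, so within each run of equal values the work record sequence numbers remain ascending.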

In addition, step (i) further includes the steps of:
the plurality of arithmetic units operating in parallel to transfer, from the global memory to the local memory, the record sequence number array and the item value access information array that correspond to the block containing the assigned records, and the item values for the predetermined item; and
the plurality of arithmetic units operating in parallel to transfer, to the global memory, the sorted item value access information array, the work item value array, and the work record sequence number array created in the local memory,
and step (ii) further includes the steps of:
the plurality of arithmetic units operating in parallel to transfer, from the global memory to the local memory, the work item value array and the work record sequence number array for a pair of blocks; and
the plurality of arithmetic units operating in parallel to transfer the new work item value array, the new work record sequence number array, and the new block number array created in the local memory to the global memory or to an arithmetic unit for further processing.

According to one embodiment of the present invention, in order to execute a sort on two or more data items in multiple stages, the sorting method further comprises the step of:
(v) executing steps (i), (ii), (iii), and (iv) for another data item, using the sorted record sequence number array and the sorted item value access information array as the record sequence number array and the item value access information array, respectively.
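The multi-stage idea in step (v) can be illustrated with ordinary stable sorts on a single, unblocked table. The helper `sort_by` and the columns `age` and `city` are hypothetical; the sketch relies only on the general property that a stable sort preserves the order left by the previous pass among records with equal keys.

```python
def sort_by(order, column):
    """Stably re-sort an existing record ordering by one column."""
    return sorted(order, key=lambda r: column[r])


age = [30, 25, 30, 25]
city = ["b", "a", "a", "b"]

order = list(range(len(age)))
order = sort_by(order, city)   # first pass: the item that breaks ties
order = sort_by(order, age)    # second pass: the item that dominates the final order

rows = [(age[r], city[r]) for r in order]
```

The pass executed last determines the primary order, and ties within it follow the order produced by the earlier pass.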

According to one embodiment of the present invention, a multi-core processing apparatus that carries out the method of the present invention is provided. The multi-core processing apparatus according to this embodiment includes a plurality of arithmetic units each including a dedicated local memory, and a global memory connected to the plurality of arithmetic units. It rearranges the records of tabular data, which represents an array of records including item values corresponding to one or more data items and is constructed in the global memory, according to the item values of a predetermined data item. In this multi-core processing apparatus, the tabular data is represented by:
a block number array in which the block number corresponding to each record in the array of records, which is divided into blocks of a size that fits in the local memory of each arithmetic unit, is stored in the order of the record sequence numbers of the records in the tabular data;
a record sequence number array in which the record sequence numbers of the assigned records, that is, the records belonging to each block, are stored in the order of the record sequence numbers;
an item value access information array in which item value access information for accessing the item values included in the assigned records is stored in the order of the record sequence numbers; and
item value information that describes, for each data item, the item values accessed using the item value access information.
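The four structures listed above can be pictured with a small sketch. It is a simplification under stated assumptions: the block size, the sample values, and the choice to model the item value information as one sorted list of distinct values (with the access information being an index into that list) are all hypothetical and are not taken from the embodiment.

```python
BLOCK_SIZE = 3  # assumed number of records that fit in one local memory

# Hypothetical item values of one data item, indexed by record sequence number.
values = ["West", "East", "West", "East", "North", "East", "West"]

# Item value information: modeled here as the sorted list of distinct values.
item_values = sorted(set(values))
value_no = {v: i for i, v in enumerate(item_values)}

# Block number array: block number of each record, in record sequence order.
block_number = [rec // BLOCK_SIZE for rec in range(len(values))]
num_blocks = block_number[-1] + 1

# Per-block record sequence number arrays and item value access information arrays.
record_seq = [[] for _ in range(num_blocks)]
access_info = [[] for _ in range(num_blocks)]
for rec, blk in enumerate(block_number):
    record_seq[blk].append(rec)
    access_info[blk].append(value_no[values[rec]])

def item_value(blk, i):
    """Return the item value of the i-th assigned record of block `blk`."""
    return item_values[access_info[blk][i]]

print(block_number)
print(record_seq)
print(access_info)
```

Keeping the access information separate from the values themselves means that a sort only has to move small integers, which is what makes per-block processing in a small local memory plausible.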

Each arithmetic unit of the multi-core processing apparatus includes:
(a) means for operating in parallel with the other arithmetic units and, for each block containing the assigned records, applying a sort to the item value access information array using the item value information for the predetermined data item as a key, thereby creating in the global memory a sorted item value access information array together with a work item value information array and a work record sequence number array corresponding to the sorted item value access information array;
(b) means for operating in parallel with the other arithmetic units and merging, in a predetermined order, element pairs each consisting of a first element of the work item value information array and the corresponding second element of the work record sequence number array for a pair of blocks, using the first element as the upper digits and the second element as the lower digits, and associating with each merged element pair the block number of the block containing that pair, thereby creating a new work item value information array, a new work record sequence number array, and a new block number array for the pair of blocks;
(c) means for operating in parallel and hierarchically with the other arithmetic units and repeatedly activating the means (b), thereby creating a final block number array in the global memory; and
(d) means for operating in parallel with the other arithmetic units, distributing the record sequence numbers at which the elements of the final block number array are stored to the block numbers specified by those elements, and arranging them in a predetermined order, thereby creating a sorted record sequence number array in the global memory.
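Means (a) to (c) can be sketched as a small sequential program. This is an illustration under assumptions, not the apparatus itself: the function names in_block_sort, merge_pair, and merge_all and the sample values are hypothetical, and the merges that would run on separate arithmetic units are simply executed one after another. The composite key (item value, record sequence number) corresponds to using the first element as the upper digits and the second element as the lower digits.

```python
def in_block_sort(block_no, rec_nos, values):
    """(a) Sort one block; return work arrays and a matching block number array."""
    pairs = sorted(zip(values, rec_nos))
    return ([v for v, _ in pairs], [r for _, r in pairs], [block_no] * len(pairs))

def merge_pair(left, right):
    """(b) Merge two sorted runs on the key (item value, record sequence number)."""
    lv, lr, lb = left
    rv, rr, rb = right
    out_v, out_r, out_b = [], [], []
    i = j = 0
    while i < len(lv) or j < len(rv):
        take_left = j == len(rv) or (i < len(lv) and (lv[i], lr[i]) <= (rv[j], rr[j]))
        if take_left:
            out_v.append(lv[i]); out_r.append(lr[i]); out_b.append(lb[i])
            i += 1
        else:
            out_v.append(rv[j]); out_r.append(rr[j]); out_b.append(rb[j])
            j += 1
    return out_v, out_r, out_b

def merge_all(runs):
    """(c) Repeat pairwise merges level by level until a single run remains."""
    while len(runs) > 1:
        nxt = [merge_pair(runs[k], runs[k + 1]) for k in range(0, len(runs) - 1, 2)]
        if len(runs) % 2:
            nxt.append(runs[-1])  # an unpaired run moves up to the next level
        runs = nxt
    return runs[0]

# Hypothetical item values per block; record sequence numbers are global.
blocks = [[5, 3], [3, 9], [1, 5]]
runs, base = [], 0
for b, vals in enumerate(blocks):
    runs.append(in_block_sort(b, range(base, base + len(vals)), vals))
    base += len(vals)

final_values, final_recs, final_blocks = merge_all(runs)
print(final_blocks)  # the final block number array
```

The merges within one level do not depend on each other, which is the property that would allow them to be assigned to different arithmetic units.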

According to a preferred embodiment of the present invention, each arithmetic unit in the multi-core processing apparatus further includes a dedicated memory interface for data transfer between its dedicated local memory and the global memory. Via the dedicated memory interface, the means (a) transfers the record sequence number array and the item value access information array corresponding to the block containing the assigned records, together with the item value information array for the predetermined item, from the global memory to the local memory, and transfers the sorted item value access information array, the work item value information array, and the work record sequence number array created in the local memory to the global memory. Via the dedicated memory interface, the means (b) transfers the work item value information array and the work record sequence number array for a pair of blocks from the global memory to the local memory, and transfers the new work item value information array, the new work record sequence number array, and the new block number array created in the local memory to the global memory or to an arithmetic unit for further processing. Via the dedicated memory interface, the means (c) transfers the final block number array created in the local memory to the global memory.

Furthermore, according to one embodiment of the present invention, there is provided a computer-readable program that is loaded into a computer including a plurality of arithmetic units each including a dedicated local memory and a global memory connected to the plurality of arithmetic units, and that causes the computer to execute code for rearranging the records of tabular data, which represents an array of records including item values corresponding to one or more data items and is constructed in the global memory, according to the item values of a predetermined data item.
In the computer, the tabular data is represented by:
a block number array in which the block number corresponding to each record in the array of records, which is divided into blocks of a size that fits in the local memory of each arithmetic unit, is stored in the order of the record sequence numbers of the records in the tabular data;
a record sequence number array in which the record sequence numbers of the assigned records, that is, the records belonging to each block, are stored in the order of the record sequence numbers;
an item value access information array in which item value access information for accessing the item values included in the assigned records is stored in the order of the record sequence numbers; and
item value information that describes, for each data item, the item values accessed using the item value access information.
The program includes:
(a) code that causes each arithmetic unit to operate in parallel with the other arithmetic units and, for each block containing the assigned records, to apply a sort to the item value access information array using the item value information for the predetermined data item as a key, thereby creating in the global memory a sorted item value access information array together with a work item value information array and a work record sequence number array corresponding to the sorted item value access information array;
(b) code that causes each arithmetic unit to operate in parallel with the other arithmetic units and to merge, in a predetermined order, element pairs each consisting of a first element from the work item value information array and the corresponding second element from the work record sequence number array for a pair of blocks, using the first element as the upper digits and the second element as the lower digits, and to associate with each merged element pair the block number of the block containing that pair, thereby creating a new work item value information array, a new work record sequence number array, and a new block number array for the pair of blocks;
(c) code that causes each arithmetic unit to operate in parallel and hierarchically with the other arithmetic units and to execute the code (b) repeatedly, thereby creating a final block number array in the global memory; and
(d) code that causes each arithmetic unit to operate in parallel with the other arithmetic units, to distribute the record sequence numbers at which the elements of the final block number array are stored to the block numbers specified by those elements, and to arrange them in a predetermined order, thereby creating a sorted record sequence number array in the global memory.
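The distribution in (d) can be read as follows; this reading and the names used are assumptions made for illustration. Each position in the final block number array is taken as a new record sequence number, and the element stored there names the block that receives it. Because positions are visited in ascending order, every block ends up with an ascending list of the new sequence numbers of its own records.

```python
def distribute(final_block_numbers, num_blocks):
    """(d) Hand each position to the block named by the element stored there."""
    sorted_rec_seq = [[] for _ in range(num_blocks)]
    for position, blk in enumerate(final_block_numbers):
        sorted_rec_seq[blk].append(position)  # positions arrive in ascending order
    return sorted_rec_seq

# Hypothetical final block number array produced by the merge phase.
final_block_numbers = [2, 0, 1, 0, 2, 1]
per_block = distribute(final_block_numbers, 3)
print(per_block)
```

In this sketch the k-th entry of a block's list pairs with the k-th entry of that block's sorted item value access information array, so no item values need to move between blocks.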

Furthermore, according to one embodiment of the present invention, there is provided a computer program product that is loaded into a computer including a plurality of arithmetic units each including a dedicated local memory and a global memory connected to the plurality of arithmetic units, and that causes the computer to execute the method of the present invention for rearranging the records of tabular data, which represents an array of records including item values corresponding to one or more data items and is constructed in the global memory, according to the item values of a predetermined data item.

Furthermore, according to one embodiment of the present invention, there is provided a recording medium on which is recorded a computer program that is loaded into a computer including a plurality of arithmetic units each including a dedicated local memory and a global memory connected to the plurality of arithmetic units, and that causes the computer to execute the method of the present invention for rearranging the records of tabular data, which represents an array of records including item values corresponding to one or more data items and is constructed in the global memory, according to the item values of a predetermined data item.

According to at least one embodiment of the present invention, sort processing by a multi-core processing apparatus can be applied to the records of tabular data, so that sort processing of large-scale tabular data, and various applications that include this sort processing, can be realized at high speed.

A schematic diagram of a multi-core processing apparatus according to an embodiment of the present invention.
A schematic diagram of a computer system according to an embodiment of the present invention.
A diagram showing an example of tabular data for explaining the data management mechanism on which the present invention is based.
An explanatory diagram of the basic data management mechanism on which the present invention is based.
Explanatory diagrams of the data structure for a multi-core processing apparatus according to an embodiment of the present invention.
An explanatory diagram of tabular data before sort processing according to an embodiment of the present invention is applied.
An explanatory diagram of tabular data after sort processing according to an embodiment of the present invention is applied.
Explanatory diagrams of tabular data obtained by applying sort processing according to an embodiment of the present invention to the tabular data of FIG. 5A.
A flowchart of item value acquisition processing according to an embodiment of the present invention.
A schematic flowchart of sort processing of tabular data (first type of item value information) according to an embodiment of the present invention.
Explanatory diagrams of in-block sort processing in the sort processing of tabular data according to an embodiment of the present invention.
An explanatory diagram of the result of the in-block sort processing in the sort processing of tabular data according to an embodiment of the present invention.
Diagrams explaining the hierarchical structure of inter-block sort processing 1 in the sort processing of tabular data according to an embodiment of the present invention.
Explanatory diagrams of the first-stage merge processing in the sort processing of tabular data according to an embodiment of the present invention.
Explanatory diagrams of the second-stage merge processing in the sort processing of tabular data according to an embodiment of the present invention.
An explanatory diagram of the third-stage merge processing in the sort processing of tabular data according to an embodiment of the present invention.
Explanatory diagrams of tabular data representing the result of inter-block sort processing 1 (merge processing) in the sort processing of tabular data according to an embodiment of the present invention.
Explanatory diagrams of inter-block sort processing 2 (distribution processing) in the sort processing of tabular data according to an embodiment of the present invention.
A diagram explaining the result of the sort processing of tabular data according to an embodiment of the present invention.
A flowchart of the sort processing of tabular data according to an embodiment of the present invention.
An explanatory diagram of block number array separation processing according to an alternative embodiment of the present invention.
Explanatory diagrams of an application example of multi-stage sort processing according to an embodiment of the present invention.
Explanatory diagrams of the data structure of tabular data according to an embodiment of the present invention.
A schematic flowchart of sort processing of tabular data (second type of item value information) according to an embodiment of the present invention.
Explanatory diagrams of in-block sort processing in the sort processing of tabular data according to an embodiment of the present invention.
An explanatory diagram of the result of the in-block sort processing in the sort processing of tabular data according to an embodiment of the present invention.
Explanatory diagrams of the first-stage merge processing in the sort processing of tabular data according to an embodiment of the present invention.
Explanatory diagrams of the second-stage merge processing in the sort processing of tabular data according to an embodiment of the present invention.
An explanatory diagram of the third-stage merge processing in the sort processing of tabular data according to an embodiment of the present invention.
Explanatory diagrams of tabular data representing the result of inter-block sort processing 1 (merge processing) in the sort processing of tabular data according to an embodiment of the present invention.

Explanation of symbols

100 Multi-core processing apparatus
101 Multi-core processor chip
110, 120, 130, 140 Arithmetic unit
111, 121, 131, 141 Core
112, 122, 132, 142 Local memory
150 On-chip bus
160, 161, 162, 163 Bus
170, 171, 172, 173 Global memory
200 Computer system
202 Multi-core processing apparatus
210 CPU
212 RAM
214 ROM
216 Fixed storage device
218 CD-ROM
220 CD-ROM driver
222 I/F
224 Input device
226 Display device
228 Bus
500 Tabular data
501 Data item "School"
502 Data item "Age"
510 Record 0
511 Record 14
520, 521, ..., 527 Block
530 Order information
531 Item information "School"
532 Item information "Age"
540 Block number array
551-0, 551-1, ..., 551-7 Record sequence number array
552-0, 552-1, ..., 552-7 Item value access information array
560-0, 560-1, ..., 560-7 Block information "School"
561-0, 561-1, ..., 561-7 Local item value number array "School"
562-0, 562-1, ..., 562-7 Item value designation pointer array "School"
570 Global item value array "School"
580-0, 580-1, ..., 580-7 Block information "Age"
581-0, 581-1, ..., 581-7 Local item value number array "Age"
582-0, 582-1, ..., 582-7 Item value designation pointer array "Age"
590 Global item value array "Age"

Hereinafter, various embodiments for carrying out the present invention will be described in detail with reference to the drawings.

[Multi-core processing apparatus]
First, a multi-core processing apparatus that implements data processing according to an embodiment of the present invention will be described. FIG. 1 is a schematic diagram of one embodiment of the multi-core processing apparatus. In the multi-core processing apparatus 100, a plurality of arithmetic units 110, 120, 130, 140 (for example, two, four, or eight units; four in this example) are provided on a multi-core processor chip 101. Each arithmetic unit 110, 120, 130, 140 includes a core 111, 121, 131, 141 for data processing and a local memory 112, 122, 132, 142 dedicated to that core. The arithmetic units 110, 120, 130, 140 are connected to one another by an on-chip bus 150, which is preferably a ring bus. Because the arithmetic units are connected by the on-chip bus 150, they can exchange data at high speed. Furthermore, each arithmetic unit 110, 120, 130, 140 is connected to a global memory 170, 171, 172, 173 external to the chip 101 via a bus 160, 161, 162, 163 that supports DMA transfer.

The storage capacity of each on-chip local memory 112, 122, 132, 142 is, for example, about 256 KB (kilobytes), whereas the global memories 170, 171, 172, 173 are large-capacity memories of several tens of GB (gigabytes). In FIG. 1, the global memories 170, 171, 172, 173 are drawn separately. This indicates that, because bus communication performance would become a bottleneck if every core accessed the global memory over a single bus, each core is provided with a dedicated memory interface (not shown) and accesses the external global memory through this interface. Of course, even with such a configuration, the global memory can be managed so that it appears as a whole as one logically contiguous memory, as in the NUMA (non-uniform memory access) scheme. In an alternative embodiment, each arithmetic unit is connected via a single bus to a physically unified external global memory.
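The gap between local and global memory size is what forces the tabular data to be split into blocks. The following back-of-the-envelope sketch shows how a block size might be estimated; only the 256 KB figure comes from the text, while the reserved space and the bytes needed per record are hypothetical assumptions chosen for illustration.

```python
LOCAL_MEMORY_BYTES = 256 * 1024  # about 256 KB, as stated in the text
RESERVED_BYTES = 64 * 1024       # assumed: space kept for code, stack, and buffers
BYTES_PER_RECORD = 3 * 4         # assumed: three 4-byte array entries per record

def records_per_block(local_bytes=LOCAL_MEMORY_BYTES,
                      reserved=RESERVED_BYTES,
                      per_record=BYTES_PER_RECORD):
    """Largest number of records whose arrays fit in one local memory."""
    return (local_bytes - reserved) // per_record

def block_count(total_records, block_records):
    """Number of blocks needed to cover all records (ceiling division)."""
    return -(-total_records // block_records)

per_block = records_per_block()
print(per_block, block_count(1_000_000, per_block))
```

Under these assumed figures a table of one million records would be divided into a few dozen blocks, each small enough to be sorted entirely inside one local memory.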

Furthermore, in a processor such as the Cell Broadband Engine™ mentioned above, a general-purpose processor core and arithmetic processor cores are mounted on one chip. The general-purpose processor core can control the operation of the plurality of arithmetic processor cores. Accordingly, the multi-core processor preferably includes a control unit such as a general-purpose processor core; however, the control unit need not be mounted on the chip and may be provided outside the chip.

The control unit and the arithmetic units, or the arithmetic units themselves, can communicate with one another using, for example, a mailbox or a signal mechanism.

[Computer system configuration]
FIG. 2 is a schematic diagram of a computer system 200 that manipulates tabular data according to an embodiment of the present invention. The computer system 200 includes a multi-core processing apparatus 202, such as that shown in FIG. 1, in which a plurality of arithmetic units share the work of manipulating tabular data represented as an array of records including item values corresponding to data items. As shown in FIG. 2, the computer system 200 further includes a CPU 210 that controls the entire system and its individual components by executing a program; a memory 212, for example a RAM (Random Access Memory), that stores work data and the like; a ROM (Read Only Memory) 214 that stores programs and the like; a fixed storage medium 216 such as a hard disk; a CD-ROM driver 220 for accessing a CD-ROM 218; an interface (I/F) 222 connected to the CD-ROM driver 220 and to an external terminal leading to an external network or the like (not shown); an input device 224 such as a keyboard and mouse; and a display device 226 such as a computer monitor. The multi-core processing apparatus 202, the RAM 212, the ROM 214, the fixed storage medium 216, the I/F 222, the input device 224, and the display device 226 are connected to one another via a bus 228.

The program that causes the multi-core processing apparatus 202 and the CPU 210 of the computer system 200 to manipulate tabular data may be stored on the CD-ROM 218 and read by the CD-ROM driver 220, or may be stored in the ROM 214 in advance. Alternatively, the program, once read from the CD-ROM 218, may be stored in a predetermined area of the fixed storage medium 216. The program may also be supplied from outside via a network (not shown), the external terminal, and the I/F 222.

A multi-core processor system according to an embodiment of the present invention is realized by causing the computer system 200 to execute the program for manipulating tabular data.

In the computer system 200 shown in FIG. 2, the CPU 210 is provided in addition to the multi-core processing apparatus 202 and controls the entire system and its individual components. However, the present invention is not limited to such an embodiment; in an alternative embodiment, a control unit included in the multi-core processing apparatus 202 controls the entire system and its individual components.

[Data management mechanism based on information blocks]
FIG. 3 shows an example of tabular data used to explain the data management mechanism underlying the present invention. Using the data management mechanism proposed in the above-mentioned International Publication No. WO00/10103, this tabular data is stored in a computer as the data structure shown in FIG. 4. Note that this data structure, which is placed in the memory of a computer, was proposed in order to search, sort, and aggregate large-scale tabular data using the hardware resources, in particular the processor and memory, of a commercially available computer such as a personal computer.

In this document, "information indicating the position at which a record is stored in the original tabular data (the source record position number)" is distinguished from "information indicating the order of the records (the record order number)". The source record position number is virtual information used to identify an individual record that includes item values corresponding to data items. For example, when ordinary tabular data is converted into tabular data based on information blocks, the position at which a record was stored in the original ordinary tabular data is represented by its source record position number. In general, the records of tabular data based on information blocks are not necessarily arranged in the order of their source record position numbers. For example, when tabular data is sorted in ascending order of the item values of some item, the order of the records after sorting differs from the order of the records in the original tabular data. However, the records of tabular data based on information blocks may be arranged in the order of their source record position numbers immediately after conversion from ordinary tabular data, in which case the source record position numbers and the record order numbers initially coincide.

As shown in FIG. 4, the order number of each record of the tabular data (the record order number) is associated with the source record position number by a record order designation array 401 (hereinafter abbreviated as "OrdSet"). The record order designation array 401 stores the source record position numbers in the order of the record order numbers. In the example of FIG. 4, the records are arranged in the order of their source record position numbers.

The array notation used in this specification is as follows. In general, an element of an array A with subscript i is written A[i]. In the drawings, each element A[i] is shown in an area enclosed by a solid line, and the boundary between element A[i] and element A[i+1] is shown by a dotted line. The subscript i of element A[i] is shown to the left of the element. Array subscripts are integers starting from 0.

Returning to FIG. 4 and taking gender as an example, the array entry OrdSet[0] shows that the source record position number corresponding to record order number 0 of the tabular data is "0". The actual gender value of the record whose source record position number is "0", that is, "male" or "female", is obtained by referring to the item value number array 402 (hereinafter abbreviated as "VNo"), which is an array of pointers into the item value array 403 (hereinafter abbreviated as "VL"), a value list in which the actual values are sorted in a predetermined order (for example, ascending or descending). The pointer array 402 stores pointers to the elements of the actual value list 403 in the order of the source record position numbers stored in the array OrdSet 401. The gender item value corresponding to record "0" of the tabular data is therefore obtained by (1) taking the source record position number 0 corresponding to record order number 0 from the array OrdSet 401, (2) taking the element "1" corresponding to source record position number 0 from the pointer array 402 to the value list, and (3) taking from the value list 403 the element "female" indicated by the element "1" taken from the pointer array 402.
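The three-step lookup through OrdSet, VNo, and VL can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of WO00/10103: the helper names `build_information_block` and `item_value` and the sample age values are introduced here for illustration, and the value list is assumed to be sorted in ascending order.

```python
def build_information_block(column):
    """Build (VNo, VL) for one data item from raw values listed in
    source record position order (hypothetical helper)."""
    vl = sorted(set(column))                      # VL: distinct values, ascending
    position = {value: n for n, value in enumerate(vl)}
    v_no = [position[value] for value in column]  # VNo: pointer into VL per record
    return v_no, vl


def item_value(ord_set, v_no, vl, record_order_number):
    """Follow OrdSet -> VNo -> VL for one record."""
    source_position = ord_set[record_order_number]  # step (1)
    pointer = v_no[source_position]                 # step (2)
    return vl[pointer]                              # step (3)


# Illustrative data only (not taken from FIG. 3 or FIG. 4).
ages = [20, 18, 24, 20]
v_no, vl = build_information_block(ages)
ord_set = [0, 1, 2, 3]  # records still in source record position order
first_age = item_value(ord_set, v_no, vl, 0)
```

In this sketch, sorting the records by age would only replace `ord_set` (with `[1, 0, 3, 2]` for this sample), while `v_no` and `vl` stay as they are.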

Item values can be obtained in the same way for the other records, and for age and height.

Tabular data is thus represented by the combination of a value list VL and a pointer array VNo to the value list, and this combination is referred to as an "information block". In FIG. 4, the information blocks for gender, age, and height are shown as information blocks 408, 409, and 410, respectively.

If a single computer has a single memory (which may physically consist of several memories, but is single in the sense that it is placed in, and accessed through, a single address space), the computer only needs to store in that memory the ordered set array OrdSet and the value list VL and pointer array VNo that make up each information block. In the various embodiments of the present invention, however, tabular data is manipulated by a multi-core processing device composed of a plurality of arithmetic units, each with a small dedicated local memory. A new mechanism for holding tabular data is therefore proposed in order to achieve efficient parallel processing.

[Data structure for multi-core processing devices]
Next, a data structure for a multi-core processing device according to one embodiment of the present invention is described. FIGS. 5A to 5D are explanatory diagrams of the data structure according to one embodiment of the present invention. FIG. 5A shows an example of tabular data. The tabular data 500 shown in FIG. 5A is represented as an array of records including item values corresponding to the data item 501 "School" (for example, "West", "South", "North", and "East") and item values corresponding to the data item 502 "Age" (for example, "12", "8", "11", and "10"). The records of the tabular data 500 are arranged from the top in the order of record order numbers 0, 1, 2, ..., 31. The record 510 at the head of the array is the record to which record order number 0 is assigned. In the record 510, the item value of the data item "School" is "West" and the item value of the data item "Age" is "12". In the record 511, the item value of the data item "School" is "North" and the item value of the data item "Age" is "9". Note that when the records of this tabular data are rearranged by a sort process, the record order number assigned to each record changes.

In the data structure for a multi-core processing device according to one embodiment of the present invention, the records of this tabular data are divided into blocks 520, 521, ..., 527 identified by block numbers (eight block numbers, 0 to 7, in this example). Initially, each block corresponds to the arithmetic unit of the multi-core processing device that is responsible for processing the records in that block.

The data structure for a multi-core processing device consists of information about order (order information), which associates the order of the records (that is, the record order numbers) with the storage locations of the item values in the data structure, and information about the item values of each data item (item value information). Functionally, the order information corresponds to the record order designation array OrdSet of the data management mechanism underlying the present invention, and the item value information corresponds to the information blocks. Both the order information and the item value information are held in the global memory, and parts of them are transferred to the local memory of each arithmetic unit as needed. FIG. 5B shows the order information 530, and FIGS. 5C and 5D show the item value information 531 and 532 of the data item "School" and the data item "Age", respectively.

The order information 530 includes a block number array 540 in which block numbers are stored in the order of the record order numbers. In the data structure of this embodiment, the arithmetic unit responsible for operating on a record is determined for each record. The records are thus divided into the records that each arithmetic unit is in charge of, that is, assigned records, and a block number is assigned to each group of assigned records. With BlkNo denoting the block number array and i a record order number, BlkNo[i] is the block number of the block to which the record with record order number i belongs. The block number array 540 is an integer array whose size equals the number of records, and it is data of the second type. In the example of FIG. 5, the records with record order numbers 0 to 3 belong to the block with block number 0, the records with record order numbers 4 to 7 belong to the block with block number 1, and so on.

In the data structure of this embodiment, all records are divided into the assigned records of the blocks, so each block needs information that associates each of its assigned records with a record of the original tabular data. The order information 530 therefore includes, for each block, a record order number array 551-1, 551-2, ..., 551-7 in which the record order numbers of the assigned records are stored in the order of the record order numbers. The record order number array is sometimes called GOrd. In the example of FIG. 5, the record order numbers of the assigned records of the block with block number 0 are 0, 1, 2, and 3, those of the block with block number 1 are 4, 5, 6, and 7, and so on. Each record order number array is an integer array whose size equals the number of assigned records in its block. Because the record order number arrays are divided into sizes that fit in the local memory of each arithmetic unit, they are data of the first type. A record order number array is therefore transferred from the global memory to the local memory of an arithmetic unit as needed.

Note that the block number array 540 and the record order number arrays 551-1, 551-2, ..., 551-7 can be converted into each other. Let BlkNo[i] denote the block number of the block to which the record with record order number i belongs, and let GOrd[j][k] denote the element with subscript k of the record order number array belonging to the block with block number j. If the total number of records is Rmax, the conversion from the block number array to the record order number arrays is expressed as
for (i = 0; i < Rmax; i++) {
    GOrd[BlkNo[i]][j[BlkNo[i]]] = i;
    j[BlkNo[i]]++;
}
Here, j[BlkNo[i]] is the subscript that designates the next element to be written in the record order number array GOrd[BlkNo[i]] of the block whose block number is BlkNo[i]. All elements of the array j are initialized to 0.
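The conversion from the block number array to the record order number arrays can be sketched in Python as follows. The function name `blkno_to_gord` and the sample array are assumptions made for illustration; appending to a per-block list stands in for the counter array j used in the text.

```python
def blkno_to_gord(blk_no, num_blocks):
    """Convert a block number array into per-block record order number arrays."""
    gord = [[] for _ in range(num_blocks)]
    for i, block in enumerate(blk_no):
        # Appending plays the role of GOrd[b][j[b]] = i followed by j[b]++.
        gord[block].append(i)
    return gord


# Hypothetical block number array for six records split over two blocks.
blk_no_example = [1, 0, 0, 1, 0, 1]
gord_example = blkno_to_gord(blk_no_example, 2)
```

Because i only increases, each resulting per-block array is in ascending order of record order number.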

Conversely, with i denoting the block number, Bmax the total number of blocks, and BRmax the total number of records in each block, the conversion from the record order number arrays to the block number array is written as
for (i = 0; i < Bmax; i++) {
    for (j = 0; j < BRmax; j++) {
        BlkNo[GOrd[i][j]] = i;
    }
}
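The reverse conversion can be sketched in the same way. The function name `gord_to_blkno` is an assumption for illustration. Unlike the loop in the text, which assumes that every block holds BRmax records, this sketch also accepts blocks of different sizes.

```python
def gord_to_blkno(gord):
    """Convert per-block record order number arrays into a block number array."""
    total_records = sum(len(record_numbers) for record_numbers in gord)
    blk_no = [None] * total_records
    for block_number, record_numbers in enumerate(gord):
        for record_order_number in record_numbers:
            # Corresponds to BlkNo[GOrd[i][j]] = i.
            blk_no[record_order_number] = block_number
    return blk_no


# Hypothetical record order number arrays for two blocks.
blk_no_restored = gord_to_blkno([[1, 2, 4], [0, 3, 5]])
```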

Because the block number array and the record order number arrays can be converted into each other in this way, it is sufficient to prepare either one of them.

Furthermore, because the item values contained in each record are held in the form of item value information, each arithmetic unit needs address information for accessing the item values contained in its assigned records, that is, item value access information. In the data structure of this embodiment, the order information 530 therefore further includes, for each block, an item value access information array 552-1, 552-2, ..., 552-7 in which the item value access information of the assigned records is stored in the order of the record order numbers. The item value access information array is an integer array whose size equals the number of assigned records. It is also data of the first type and is transferred from the global memory to the local memory of an arithmetic unit as needed. The item value access information array is sometimes called LOrd. In the example of FIG. 5, the item values of the record with record order number 0, which belongs to the block with block number 0, can be accessed within that block through the item value access information 0, and the item values of the record with record order number 5, which belongs to the block with block number 1, can be accessed within that block through the item value access information 1.

In this embodiment, the item value information is held separately for each data item. In the example of FIG. 5, the item value information 531 for the data item "School" and the item value information 532 for the data item "Age" are constructed in the global memory. The item values contained in the assigned records of each block are held in the global memory so that each arithmetic unit can access them, data item by data item, using the item value access information array, and they are transferred from the global memory to the local memory of an arithmetic unit as needed. The item values themselves are constructed in the global memory, for each data item, as a global item value array in which the unique item values are stored in a predetermined order (ascending or descending). In the example of FIG. 5, the item values of the data item "School" are held in the global memory as the global item value array 570, and the item values of the data item "Age" are held in the global memory as the global item value array 590. The global item value array is data of the second type. Because it stores the item values themselves, the global item value array can have various data types such as integer, floating point, and character string.

The item value information is structured so that an item value stored in the global item value array can be identified using the item value access information associated with an assigned record. For each data item, the item value information therefore includes a local item value number array, in which local item value numbers identifying the item values contained in the assigned records are stored in the order of the record order numbers (for example, ascending or descending), and an item value designation pointer array, in which item value designation pointers are stored in the order of the local item value numbers. Each item value designation pointer designates the position in the global item value array at which the item value represented by a local item value number is stored. A local item value number array and an item value designation pointer array are provided for each block. The local item value number array is an integer array whose size equals the number of assigned records, is data of the first type, and is sometimes called VNo. The item value designation pointer array is an integer array whose size equals the number of unique item values contained in the assigned records, is data of the first type, and is sometimes called LVL. Because both arrays are data of the first type, they are constructed in the global memory block by block and are transferred from the global memory to the local memory of an arithmetic unit as needed.

In the example of FIG. 5, the item value information 531 for the data item "School" includes local item value number arrays 561-0, 561-1, ..., 561-7, item value designation pointer arrays 562-0, 562-1, ..., 562-7, and the global item value array 570. The local item value number arrays and the item value designation pointer arrays are divided block by block. In the figure, for example, the value of the first element of the local item value number array VNo is "1". This means that the item value number of the item value contained in the record designated by the item value access information whose value is "0" is "1". Referring to the second element of the item value designation pointer array LVL, that is, LVL[1], shows that the item value whose item value number is "1" is the third element of the global item value array GVL, that is, GVL[2]. The same applies to the other blocks and to the other data items.

In the data structure of this embodiment, the item values contained in the records of each block are thus represented by the local item value numbers assigned to the item values within the block, the item value designation pointers that associate the local item value numbers with item values in the global item value array, and the global item value array.
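The three-level representation of item values (VNo, LVL, GVL) can be sketched in Python as follows. This is a hedged illustration: the function name `build_block_item_values` and the sample values are assumptions, GVL is assumed to be in ascending order, and local item value numbers are assumed to be assigned in ascending order of item value within each block.

```python
def build_block_item_values(block_values, gvl):
    """Build (VNo, LVL) for one block from its raw item values."""
    local_values = sorted(set(block_values))             # distinct values in the block
    lvl = [gvl.index(value) for value in local_values]   # LVL: pointers into GVL
    local_number = {value: n for n, value in enumerate(local_values)}
    vno = [local_number[value] for value in block_values]  # VNo: one entry per record
    return vno, lvl


# Hypothetical item values of one data item, already split into two blocks.
columns_by_block = [["D", "B", "D"], ["A", "D", "C"]]
gvl = sorted({value for block in columns_by_block for value in block})
blocks = [build_block_item_values(values, gvl) for values in columns_by_block]

# Reading the values back through VNo -> LVL -> GVL restores the original columns.
restored = [[gvl[lvl[n]] for n in vno] for vno, lvl in blocks]
```

Only GVL holds actual item values here. The per-block arrays are small integer arrays, which is what lets them be moved into a local memory of limited size.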

[Overview of the sort process for tabular data]
Next, the sort process for tabular data performed by the multi-core processing device according to one embodiment of the present invention is described. FIG. 6 is an explanatory diagram of the sort process for tabular data according to one embodiment of the present invention. FIG. 6A shows the tabular data before the sort process and FIG. 6B shows the tabular data after the sort process. The tabular data before the sort process is identical to the tabular data shown in FIG. 5A. In the example of the figure, the records of the tabular data are rearranged in ascending order (more specifically, alphabetical order) of the item values of the data item "School", which is used as the key. The record (West, 12), which has record order number 0 before sorting, has record order number 24 after the sort process. The record (West, 11), which has record order number 2 before sorting, has record order number 25 after the sort process. A sort process in which the relative order of two records with the same key value (here, West) does not change is called a "stable" sort process. Here, the notation record (A, B) denotes a record whose item value for the data item "School" is A and whose item value for the data item "Age" is B.
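The stability property can be illustrated with a short Python sketch. The four records are a small hypothetical sample whose values are borrowed from the examples in the text, not the 32 records of FIG. 6A. The sketch relies on Python's built-in `sorted`, which is documented to be stable, and is not the parallel sort method of this embodiment.

```python
# Hypothetical sample of (School, Age) records in original record order.
records = [("West", 12), ("North", 9), ("West", 11), ("East", 6)]

# Sort record positions by the key "School" only; ties keep their original order.
order = sorted(range(len(records)), key=lambda i: records[i][0])
sorted_records = [records[i] for i in order]
```

In the result, (West, 12) still precedes (West, 11), as in the example of FIG. 6, because the two records share the same key and the sort does not reorder them.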

With the data structure for a multi-core processing device, the tabular data before the sort process shown in FIG. 6A is represented by the data structure shown in FIGS. 5A to 5D. FIGS. 7A to 7D are explanatory diagrams of the tabular data obtained by applying the sort process according to one embodiment of the present invention to the tabular data shown in FIGS. 5A to 5D. The result of the sort process is described with reference to FIGS. 7A to 7D. For example, BlkNo[0] = 2 in the block number array of FIG. 7B shows that the record (East, 6), which has record order number 0 in the tabular data of FIG. 7A, belongs to the block with block number 2. Referring next to the record order number array GOrd of the block with block number 2 shows that the element of GOrd that matches record order number 0 is GOrd[0]. The storage position of the element GOrd[0] in which record order number 0 is stored, that is, 0, represents the rank of the target record among the assigned records of the arithmetic unit in charge of this block. Because the record order number array is in ascending order in this embodiment, this storage position can be found efficiently by a well-known method such as binary search.
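Finding the rank of a record inside its block can be sketched with Python's standard `bisect` module. The function name `rank_in_block` and the sample array are assumptions for illustration; the sketch only relies on the record order number array being in ascending order, as stated above.

```python
from bisect import bisect_left


def rank_in_block(gord_block, record_order_number):
    """Return the position (rank) of a record order number in an ascending GOrd array."""
    rank = bisect_left(gord_block, record_order_number)
    if rank == len(gord_block) or gord_block[rank] != record_order_number:
        # The record does not belong to this block.
        raise KeyError(record_order_number)
    return rank


# Hypothetical record order number array of one block after a sort.
example_rank = rank_in_block([0, 5, 9, 14], 9)
```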

Next, the item value access information array LOrd is referred to in order to obtain the item value information of the record with rank 0 in the block with block number 2. More specifically, LOrd[0] = 0 is the item value access information, that is, the index used to access the item value information of the record with rank 0 in the block with block number 2.

The item value information for the data item "School" is shown in FIG. 7C. The item value contained in the record with rank 0 in the block with block number 2 is given by
GVL[LVL[VNo[LOrd[0]]]] = "East"
This item value is obtained by the following procedure. First, in the local item value number array VNo of block 2, the element designated by the item value access information LOrd[0] = 0 is obtained:
VNo[LOrd[0]] = VNo[0] = 0
Next, in the item value designation pointer array LVL, the element designated by VNo[0] = 0 is obtained:
LVL[VNo[0]] = LVL[0] = 0
Finally, the item value is obtained from the global item value array GVL using the item value designation pointer:
GVL[LVL[0]] = GVL[0] = "East"
This shows that the item value of the data item "School" in the record with record order number 0 in the tabular data of FIG. 7A is "East". It can be shown in the same way that the item value of the data item "Age" is "6".
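The chain GVL[LVL[VNo[LOrd[rank]]]] can be written out as a small Python function. In the sample arrays, only the first elements (LOrd[0] = 0, VNo[0] = 0, LVL[0] = 0, GVL[0] = "East") follow the text; the remaining elements and the function name `block_item_value` are hypothetical and are included only so that the arrays are well formed.

```python
def block_item_value(rank, lord, vno, lvl, gvl):
    """Resolve the item value of the record with the given rank in one block."""
    access = lord[rank]          # item value access information
    local_number = vno[access]   # local item value number
    pointer = lvl[local_number]  # item value designation pointer
    return gvl[pointer]          # item value in the global item value array


# First elements follow the text; all other elements are hypothetical.
gvl = ["East", "North", "South", "West"]
lord = [0, 1]
vno = [0, 1]
lvl = [0, 2]

school_of_rank_0 = block_item_value(0, lord, vno, lvl, gvl)
```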

In the sort process according to one embodiment of the present invention, only the order information (that is, the block number array, the record order number arrays, and the item value access information arrays) changes as a result of the sort process. The item value information is the same before and after the sort process.

In the description of this embodiment, the item value information consists of the local item value number array VNo, the item value designation pointer array LVL, and the global item value array GVL. The item value information can, however, also consist of an item value array that contains the unprocessed item values. In the following description, the former is called item value information of the first type and the latter item value information of the second type.

[Overview of item value acquisition processing]
In the multi-core processing apparatus according to an embodiment of the present invention, the above processing for acquiring an item value is performed through cooperation between the control unit and the arithmetic units and through data transfer between the global memory and the local memories. FIG. 8 is a flowchart of an item value acquisition method according to an embodiment of the present invention. As shown in FIG. 8, the control unit first refers to the block number array in the global memory and determines the block number of the block containing the predetermined record whose item values are to be acquired, together with the arithmetic unit in charge of this block (step 802). Next, the control unit notifies the arithmetic unit identified by the determined block number of the record sequence number of the predetermined record (step 804). The arithmetic unit then transfers the record sequence number array and the item value access information array for the records it is in charge of from the global memory to its local memory (step 806). Subsequently, the arithmetic unit locates the position at which the notified record sequence number is stored in the record sequence number array transferred to the local memory (step 808). The arithmetic unit then identifies the item value access information designated by the located position in the item value access information array transferred to the local memory (step 810). Finally, for each data item, the arithmetic unit acquires, from the global item value array held in the global memory, the item value designated by the identified item value access information, and transfers the acquired item value to the global memory (step 812).
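Steps 802 to 812 can be sketched in C roughly as follows. This is a minimal single-process sketch under stated assumptions, not the multi-core implementation itself: the transfers between the global memory and the local memory are omitted, the Block struct and the function name lookup_item_value are hypothetical names introduced only for illustration, and the first type of item value information is assumed, so that the item value is reached through the chain LVL[VNo[LOrd[k]]].

```c
#include <stddef.h>

/* Hypothetical per-block layout for one data item
 * (first type of item value information). */
typedef struct {
    size_t n;         /* number of records in the block */
    const int *GOrd;  /* record sequence numbers of the block */
    const int *LOrd;  /* item value access information */
    const int *VNo;   /* local item value numbers */
    const int *LVL;   /* pointers into the global item value array GVL */
} Block;

/* Returns the index into GVL of the item value of record `rec`,
 * or -1 if the record is not found in its block. */
int lookup_item_value(const int *BlkNo, const Block *blocks, int rec)
{
    const Block *b = &blocks[BlkNo[rec]];    /* step 802: find the block */
    for (size_t k = 0; k < b->n; k++) {      /* step 808: locate rec in GOrd */
        if (b->GOrd[k] == rec) {
            int access = b->LOrd[k];         /* step 810: access information */
            return b->LVL[b->VNo[access]];   /* step 812: position in GVL */
        }
    }
    return -1;
}
```

A caller would then read GVL[result] for the data item in question; in the apparatus described here, that read and the preceding array accesses would go through transfers between the global memory and the local memory.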

[Sort processing of tabular data (first type of item value information)]
The sort processing of tabular data including the first type of item value information consists of the following three steps. FIG. 9 is a schematic flowchart of the sort processing of tabular data including the first type of item value information.

Step 901: Intra-block sort
The arithmetic units operate in parallel. Within its own block, each arithmetic unit applies a distribution counting sort to the item value access information array LOrd, using the local item value number array VNo as the key, and generates an item value designation pointer array LVL' and a record sequence number array GOrd' that conform to the sorted item value access information array LOrd.

Step 902: Inter-block sort 1 (merge)
The arithmetic units operate in parallel and hierarchically to generate the block number array BlkNo from the item value designation pointer arrays LVL' and the record sequence number arrays GOrd'. Here, the item value designation pointer array LVL', the record sequence number array GOrd', and the block number array BlkNo' from the respective blocks are merged between blocks in a predetermined order, in tournament fashion.

Step 903: Inter-block sort 2 (distribution)
The arithmetic units operate in parallel to generate the record sequence number array GOrd from the block number array BlkNo.

As described above, the sort processing of tabular data is realized in three steps: the intra-block sort, inter-block sort 1 (merge), and inter-block sort 2 (distribution). Inter-block sort 1 (merge) is a sort process in the sense that data from a pair of blocks are rearranged in a predetermined order, and it is also a merge process in the sense that data from a pair of blocks are integrated into one set of data. In this document, "merging in a predetermined order" refers to the processing performed in inter-block sort 1 (merge).

In the following, each step of the tabular data sort processing (first type of item value information) according to an embodiment of the present invention is described in more detail with reference to the examples shown in FIGS. 5A to 5D, FIGS. 6A and 6B, and FIGS. 7A to 7D.

FIGS. 10A to 10C are explanatory diagrams of the intra-block sort process in the tabular data sort processing according to an embodiment of the present invention. As an example, the figures show the processing for the block with block number = 1 (that is, block 1), but it will be apparent to those skilled in the art that the other blocks are processed in the same way. The processing for each block is executed by the arithmetic unit to which that block is assigned. In the following description, array operations may be expressed in pseudo-instructions similar to the C language.

FIG. 10A is an explanatory diagram of the count-up process in the distribution counting sort. FIG. 10A shows, for block 1, the record sequence number array GOrd, the item value access information array LOrd, the local item value number array VNo, and the item value designation pointer array LVL, together with the transitions of the count array Count, which counts the number of occurrences of each local item value number used as the key of the distribution counting sort. The count-up process can be written as
for (i = 0; i < number of records in block; i++) {
  Count[VNo[LOrd[i]]]++;
}
In the example of FIG. 10A, the first row corresponds to i = 0, the second row to i = 1, the third row to i = 2, and the fourth row to i = 3.

FIG. 10B is an explanatory diagram of the accumulation process in the distribution counting sort. The count array Count obtained as a result of the count-up process is
Count[0] = 1
Count[1] = 2
Count[2] = 1
Converting these occurrence counts into a cumulative frequency distribution (that is, accumulating them) yields the cumulative frequency distribution array Aggr. Note that the first element of the array Aggr generated by this accumulation process is 0, and the actual cumulative frequencies are stored from Aggr[1] onward. The accumulation process can be written, for example, as
Aggr[0] = 0;
for (i = 1; i < number of key values; i++) {
  Aggr[i] = Aggr[i-1] + Count[i-1];
}
For the counts above, this gives Aggr[0] = 0, Aggr[1] = 1, and Aggr[2] = 3.
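The count-up and accumulation processes can be sketched as two small C functions. This is only an illustrative sketch: the function names count_up and accumulate and the parameter names n and num_keys are assumptions introduced here, and the caller is assumed to pass a Count array that is already zeroed.

```c
/* Count-up: Count must have one entry per key value, all zero on entry.
 * LOrd holds n entries; VNo is indexed by the values stored in LOrd. */
void count_up(int n, const int *LOrd, const int *VNo, int *Count)
{
    for (int i = 0; i < n; i++)
        Count[VNo[LOrd[i]]]++;
}

/* Accumulation: after the call, Aggr[k] is the number of records whose
 * key is smaller than k, i.e. the first output slot for key k. */
void accumulate(int num_keys, const int *Count, int *Aggr)
{
    if (num_keys <= 0)
        return;
    Aggr[0] = 0;
    for (int k = 1; k < num_keys; k++)
        Aggr[k] = Aggr[k - 1] + Count[k - 1];
}
```

Because Aggr is an exclusive prefix sum, each entry can be used directly as a write position in the transfer process that follows.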

Finally, a sorted item value access information array LOrd' is obtained by copying the elements of the item value access information array LOrd, using the elements of the cumulative frequency distribution array Aggr generated in this way as pointers. FIG. 10C is an explanatory diagram of the transfer process in the distribution counting sort. The transfer process not only copies the elements of the array LOrd to the array LOrd' but also generates the item value designation pointer array LVL' and the record sequence number array GOrd' corresponding to the newly generated array LOrd'. For example, the top row of FIG. 10C shows the copying of LOrd[0]: LOrd[0] = 0 is copied to the element LOrd'[1] of LOrd'. Because the local item value number VNo[0] of the item value contained in the record LOrd[0] is 1, the element Aggr[1] = 1 of the cumulative frequency distribution array Aggr is used as a pointer, and LOrd[0] = 0 is copied to the element LOrd'[1] of the item value access information array LOrd'. In general, this transfer process can be written, for example, as
for (i = 0; i < number of records in block; i++) {
  LOrd'[Aggr[VNo[LOrd[i]]]] = LOrd[i];
  LVL'[Aggr[VNo[LOrd[i]]]] = LVL[VNo[LOrd[i]]];
  GOrd'[Aggr[VNo[LOrd[i]]]] = GOrd[i];
  Aggr[VNo[LOrd[i]]]++;
}

The item value access information array LOrd' obtained in this way coincides with the final item value access information array LOrd in the block after sorting. The newly generated item value designation pointer array LVL' and record sequence number array GOrd' correspond to the records sorted within the block with the item value as the key.
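The transfer process described above can be sketched in C as follows. This is a minimal sketch, assuming that Aggr already holds the cumulative frequency distribution with a leading 0. The function name transfer and the output parameter names LOrd2, LVL2 and GOrd2 (standing for LOrd', LVL' and GOrd') are hypothetical, chosen only because C identifiers cannot contain a prime symbol.

```c
/* Transfer step of the distribution counting sort for one block.
 * Aggr must hold the exclusive prefix sums of the key counts; it is
 * advanced in place, so it cannot be reused after the call. */
void transfer(int n, const int *LOrd, const int *GOrd,
              const int *VNo, const int *LVL, int *Aggr,
              int *LOrd2, int *LVL2, int *GOrd2)
{
    for (int i = 0; i < n; i++) {
        int key = VNo[LOrd[i]];  /* local item value number of record i */
        int pos = Aggr[key]++;   /* next free output slot for this key */
        LOrd2[pos] = LOrd[i];
        LVL2[pos]  = LVL[key];
        GOrd2[pos] = GOrd[i];
    }
}
```

Records with equal keys are written in the order in which they are read, so the sketch preserves the input order among ties, which is the usual property of a counting sort written this way.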

According to an embodiment of the present invention, the records of the tabular data sorted within each block by the respective arithmetic units are next merged between blocks. In the inter-block merge, the data sorted within the individual blocks are merged to generate merged data that are sorted as a whole. More specifically, pairs consisting of an element of the item value designation pointer array LVL' and an element of the record sequence number array GOrd' are sorted. Since each element of the record sequence number array GOrd' has a value uniquely determined for each record, each pair of an LVL' element and a GOrd' element is unique. The elements of the item value designation pointer array LVL' for a data item point to elements of the global item value array, in which the item values for that data item are arranged in a predetermined order; sorting in the order of the LVL' element values is therefore equivalent to sorting in the order of the item values. For later processing, a block number array BlkNo', which represents the block number to which each record in a block belongs, is added to the item value designation pointer array LVL' and the record sequence number array GOrd' obtained as a result of the intra-block sort.

FIG. 11 is an explanatory diagram of the result of the intra-block sort obtained in this way in the tabular data sort processing according to an embodiment of the present invention. At this point, the item value access information array LOrd in each block already coincides with the final result and is therefore written as LOrd, without a prime symbol ('). The item value designation pointer array LVL' and the record sequence number array GOrd', on the other hand, are not final results but working arrays in the middle of processing, and are therefore written with a prime symbol ('). The block number array BlkNo and the record sequence number array GOrd, which are final results, have not yet been determined at this point and are left blank.

FIGS. 12A to 12C are diagrams illustrating the hierarchical structure of inter-block sort 1 (merge process) in the tabular data sort processing according to an embodiment of the present invention. In the example of these figures, at least seven arithmetic units SPU-0, SPU-1, ..., SPU-6 execute the sort processing of 128 blocks, from block 0 to block 127.

The first pass, shown in FIG. 12A, sorts blocks Block-0 to Block-7, then blocks Block-8 to Block-15, and so on up to blocks Block-120 to Block-127; that is, a sort over eight blocks is executed sequentially 16 times. In the first stage, for example, SPU-0 applies the merge process in the predetermined order to Block-0 and Block-1 and outputs block 0~1, and SPU-1 applies the merge process in the predetermined order to Block-2 and Block-3 and outputs block 2~3. Here, the notation block A~B, as in block 0~1, denotes the data on the block obtained by applying the merge process in the predetermined order to blocks A through B (more precisely, a working item value designation pointer array, a working record sequence number array, and a working block number array). Next, in the second stage, SPU-4 applies the merge process in the predetermined order to block 0~1, merged by SPU-0, and block 2~3, merged by SPU-1, and outputs block 0~3. Similarly, SPU-5 outputs block 4~7. In the third stage, SPU-6 applies the merge process in the predetermined order to block 0~3 output by SPU-4 and block 4~7 output by SPU-5 and outputs block 0~7. By repeating this three-stage merge process, 16 blocks are output: block 0~7, block 8~15, ..., block 120~127. In the figures, the white arrows represent input/output between the global memory and the local memory in an arithmetic unit, and the black arrows represent data transfer between the local memories of arithmetic units via the on-chip bus.

The second pass, shown in FIG. 12B, applies the merge process in the predetermined order to eight of the 16 blocks output by the first pass, namely block 0~7, ..., block 56~63, and outputs block 0~63. It likewise applies the merge process in the predetermined order to the other eight blocks, block 64~71, ..., block 120~127, and outputs block 64~127.

Finally, in the third pass, shown in FIG. 12C, SPU-0 applies the merge process in the predetermined order to block 0~63 and block 64~127 and outputs the single final block 0~127.

In inter-block sort 1 (that is, the merge process), each arithmetic unit merges the information on a pair of blocks and generates the information on one merged block of the next higher layer. The merge process is thus realized by the parallel operation of a plurality of arithmetic units. Each arithmetic unit also merges the information on a pair of merged blocks belonging to the same layer and generates the information on one merged block of a still higher layer. By repeating the merge process in parallel and hierarchically in this way, the information on the single block of the top layer is finally generated. The single block of the top layer is the block that contains all the records.
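The pairing used in this hierarchy can be made concrete with a small helper that reports which original blocks feed a given merge. This is only a sketch: the struct MergeInputs, the function merge_inputs, and the convention that stages are numbered from 1 and merges from 0 within a stage are assumptions introduced for illustration.

```c
/* Original block numbers covered by the two inputs of one merge. */
typedef struct {
    int left_first, left_last;    /* blocks merged into the left input */
    int right_first, right_last;  /* blocks merged into the right input */
} MergeInputs;

/* Inputs of the idx-th merge (0-based) of the given stage (1-based).
 * Assumes 1 <= stage < 31 and that every merge takes exactly two inputs. */
MergeInputs merge_inputs(int stage, int idx)
{
    int width = 1 << stage;  /* blocks covered by the merge output */
    int half = width / 2;    /* blocks covered by each input */
    MergeInputs m;
    m.left_first = idx * width;
    m.left_last = m.left_first + half - 1;
    m.right_first = m.left_first + half;
    m.right_last = m.left_first + width - 1;
    return m;
}
```

For example, merge_inputs(2, 0) describes the second-stage merge of block 0~1 with block 2~3, and merge_inputs(3, 0) the third-stage merge of block 0~3 with block 4~7.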

For example, suppose that there are 2^n - 1 arithmetic units and that each arithmetic unit takes the information on two blocks as input, merges it, and outputs the information on one block. Then an n-stage (n-layer) merge process is realized with each arithmetic unit executing the merge process once. In this case, communication between the arithmetic units and the global memory accounts for 1/n of the total data traffic of all arithmetic units, and communication between arithmetic units accounts for (n - 1)/n of the total data traffic. With n = 3, for instance, seven arithmetic units merge eight blocks in three stages, and one third of the traffic involves the global memory.

Next, inter-block sort 1 (the inter-block merge process) in the tabular data sort processing according to an embodiment of the present invention is described in more detail. FIGS. 13A to 13C are explanatory diagrams of the first-stage inter-block merge process in the tabular data sort processing according to an embodiment of the present invention. In this example, SPU-0 executes the merge process in the predetermined order between block Block-0 (hereinafter, block 0) and block Block-1 (block 1). The inter-block merge process is a merge of ascending lists in that it generates one ascending list from two ascending lists. First, SPU-0 transfers the item value designation pointer arrays LVL', the record sequence number arrays GOrd', and the block number arrays BlkNo' of block 0 and block 1 from the global memory to its local memory. If the whole cannot be transferred to the local memory at once, it is transferred from the global memory to the local memory piece by piece.

FIG. 13A shows the process of comparing the information on the first record in block 0 (written as B0(LVL', GOrd')) with the information on the first record in block 1 (written as B1(LVL', GOrd')). At this time, the read pointer on the block 0 side and the read pointer on the block 1 side are both positioned at the head of the data. Regarding LVL' as the upper digit and GOrd' as the lower digit, the pairs (LVL', GOrd') from the block 0 side and the block 1 side are arranged in a predetermined order (for example, ascending order). In this example,
B0(LVL', GOrd') = (2, 1) > B1(LVL', GOrd') = (1, 5)
so B1(LVL', GOrd') is determined to be the first (smallest) element. The element set B1(1, 5, 1), which includes the block number, is therefore taken out as the result of the merge process. This result can be written as
LVL'[0] = 1
GOrd'[0] = 5
BlkNo'[0] = 1
Since data on the block 1 side have been taken out, the read pointer on the block 1 side is advanced by one.

FIG. 13B shows the process of comparing the first record in block 0, where the read pointer is still positioned at the head, with the second record in block 1, where the read pointer has been advanced by one. Comparing B0(2, 1) with B1(2, 4) in the same way, B0(2, 1) is determined to be smaller, so the element set B0(2, 1, 0), which includes the block number, is taken out as the result of the merge process, that is, as
LVL'[1] = 2
GOrd'[1] = 1
BlkNo'[1] = 0
Since data on the block 0 side have been taken out, the read pointer on the block 0 side is advanced by one.

By sequentially comparing the data on the block 0 side with the data on the block 1 side while advancing the pointers in this way, an ascending list in which the data on the block 0 side and the data on the block 1 side are merged is finally obtained. FIG. 13C shows the item value designation pointer array LVL', the record sequence number array GOrd', and the block number array BlkNo' that are finally taken out.
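The two-way merge walked through in FIGS. 13A to 13C can be sketched in C as follows. This is a minimal sketch under stated assumptions: the three working arrays are packed into a hypothetical Entry struct for brevity, the names entry_less and merge_blocks are introduced only for illustration, and both inputs are assumed to fit in memory at once, whereas the apparatus described here may stream them piece by piece.

```c
#include <stddef.h>

/* One record's entry in the working arrays LVL', GOrd', BlkNo'. */
typedef struct {
    int lvl;   /* item value designation pointer (upper digit) */
    int gord;  /* record sequence number (lower digit) */
    int blk;   /* block number the record belongs to */
} Entry;

/* Nonzero if a precedes b in ascending (LVL', GOrd') order. */
static int entry_less(const Entry *a, const Entry *b)
{
    if (a->lvl != b->lvl)
        return a->lvl < b->lvl;
    return a->gord < b->gord;
}

/* Merge ascending lists a[0..na) and b[0..nb) into out[0..na+nb). */
void merge_blocks(const Entry *a, size_t na,
                  const Entry *b, size_t nb, Entry *out)
{
    size_t i = 0, j = 0, k = 0;  /* read pointers and write pointer */
    while (i < na && j < nb)
        out[k++] = entry_less(&b[j], &a[i]) ? b[j++] : a[i++];
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Each input is read strictly front to back and the output is written strictly front to back, which matches the sequential access pattern emphasized in the description.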

The item value designation pointer array LVL', record sequence number array GOrd', and block number array BlkNo' that have been taken out are sent, in this example, to SPU-4 for the second-stage merge process. Instead of sending the final result all at once, the data taken out may be sent from SPU-0 to SPU-4 as needed while the comparison between the block 0 side and the block 1 side proceeds.

As can be seen from the above description, in the first stage of inter-block sort 1 (merge process) according to an embodiment of the present invention, data access is limited to sequential access, and the SPUs can execute inter-block sort 1 in parallel. The performance of the multiprocessor processing apparatus is therefore fully exploited.

Next, the second stage of inter-block sort 1 is described, in which the arithmetic unit SPU-4 applies the merge process in the predetermined order to block 0~1 output from SPU-0 and block 2~3 output from SPU-1 and outputs one block 0~3. FIGS. 14A and 14B are explanatory diagrams of the second-stage merge process in the tabular data sort processing according to an embodiment of the present invention. The second-stage merge process is the same as the first-stage merge process except that the input information is transferred from the local memories of other arithmetic units. Briefly, as shown in FIG. 14A, the read pointers for the item value designation pointer arrays LVL', the record sequence number arrays GOrd', and the block number arrays BlkNo' from the two blocks are first set to the head. Comparing the values of the two (LVL', GOrd') pairs gives
B0~1(LVL', GOrd') = (1, 5) > B2~3(LVL', GOrd') = (0, 8)
which shows that the element on the B2~3 side is smaller. The leading element set on the B2~3 side, B2~3(0, 8, 2), is therefore taken out, and the read pointer on the B2~3 side, from which the element set was taken, is advanced by one. Here, the notation Ba~b denotes the data obtained as a result of the merge process in the predetermined order over blocks a through b.

By repeating this comparison of element sets and the reading of the smaller element set, the element set B2~3(3, 13, 3) on the B2~3 side is taken out last, as shown in FIG. 14B, and the second-stage merge process ends.

The item value designation pointer array LVL', record sequence number array GOrd', and block number array BlkNo' that have been taken out are sent, in this example, to SPU-6 for the third-stage merge process. Instead of sending the final result all at once, the data taken out may be sent from SPU-4 to SPU-6 as needed while the comparison between the block 0~1 side and the block 2~3 side proceeds. As can be seen from the above description, in the second stage of inter-block sort 1 (merge process) according to an embodiment of the present invention, data access is again limited to sequential access, and the SPUs can execute inter-block sort 1 in parallel. The performance of the multiprocessor processing apparatus is therefore fully exploited.

In this example, the second-stage merge process is executed in parallel by SPU-4 and SPU-5. SPU-4 outputs the result of the inter-block merge of blocks 0 to 3 to SPU-6 as block 0~3, and SPU-5 outputs the result of the inter-block merge of blocks 4 to 7 to SPU-6 as block 4~7. Since the total number of blocks in this example is eight, from block 0 to block 7, the third-stage merge process by SPU-6 completes the merging of the data from all blocks; that is, the sort processing that takes all blocks into account is completed. As will be understood by those skilled in the art, when there are nine or more blocks, a sort result in which the data from all blocks are merged can still be obtained by increasing the number of merge stages, for example as shown in FIGS. 12A to 12C.

FIG. 15 is an explanatory diagram of the third-stage merge process in the tabular data sort processing according to an embodiment of the present invention. Like the first-stage and second-stage merge processes, the third-stage merge process repeats the following operation: the data B0~3(LVL', GOrd') on the B0~3 side and the data B4~7(LVL', GOrd') on the B4~7 side are compared in order from the head, the smaller element set is taken out, and the read pointer of the side from which the element was taken is advanced by one. This yields the set of arrays shown on the right side of FIG. 15, that is, the item value designation pointer array LVL', the record sequence number array GOrd', and the block number array BlkNo'. This set of arrays represents the sort result for all records. For example, the values in the item value designation pointer array LVL' are arranged in ascending order from the head, equal values included, which shows that the records are sorted in the order in which the item values are arranged in the global item value array. Referring to the record sequence number array GOrd' also shows that records whose LVL' elements have the same value are arranged in ascending order of the record sequence numbers they had before sorting. A sort result that is stable with respect to the record sequence number is obtained in this way because, in the inter-block sort, the records are rearranged on the basis of the magnitude relation of the pair consisting of the item value designation pointer and the record sequence number.
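The property described here, ascending LVL' values with ties ordered by ascending GOrd', can be checked with a short C function. This is a hypothetical helper for illustration only: the name is_sorted_by_pair and its parameters are not part of the described method, and LVL2 and GOrd2 stand for the arrays LVL' and GOrd'.

```c
#include <stddef.h>

/* Returns 1 if the pairs (LVL2[i], GOrd2[i]) are strictly ascending,
 * with LVL2 as the upper digit and GOrd2 as the lower digit; else 0. */
int is_sorted_by_pair(const int *LVL2, const int *GOrd2, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (LVL2[i - 1] > LVL2[i])
            return 0;  /* item values out of order */
        if (LVL2[i - 1] == LVL2[i] && GOrd2[i - 1] >= GOrd2[i])
            return 0;  /* tie not in ascending record sequence order */
    }
    return 1;
}
```

The comparison is strict on the pair because each record sequence number is unique, so two entries can never be equal.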

The following can be seen from the three arrays obtained by inter-block sort 1 (merge process) in the tabular data sort processing according to an embodiment of the present invention. For example, the first row of the set of arrays, (0, 8, 2), shows that, for the record that is given record sequence number 0 after sorting,
(1) the value of the item value designation pointer, which designates the item value of the data item serving as the sort key, is 0,
(2) the record sequence number assigned before sorting is 8, and
(3) the record belongs to block 2.

To aid understanding of the invention, the result of this inter-block sort process 1 (merge process) is expressed in the data structure for the multi-core processing apparatus. FIGS. 16A to 16C are explanatory diagrams of the result of the inter-block sort process 1 (merge process) in the tabular data sort process according to an embodiment of the present invention. The tabular data shown in FIGS. 16A to 16C is similar to that shown in FIGS. 7B to 7D, but at this point the values to be stored in the record sequence number array GOrd shown in FIG. 16A are still undetermined. The item value information shown in FIGS. 16B and 16C, on the other hand, is unaffected by the sort process and is therefore unchanged.

In the tabular data sort process according to an embodiment of the present invention, the record sequence numbers of the records belonging to each block are determined last. This process is called the inter-block sort process 2 (distribution process). In the distribution process, the plurality of arithmetic units operate in parallel to distribute the record sequence number corresponding to each subscript i of the block number array BlkNo to the block indicated by the block number BlkNo[i], and to arrange the distributed record sequence numbers in a predetermined order (for example, ascending order) within each block. If, for example, the k-th element of the record sequence number array of block j is written GOrd[j][k], and the write pointer k used to set record sequence numbers in GOrd[j] is written index[j], this process can be described as follows.
initialize the index array index (all elements to 0);
for (i = 0; i < total number of records; i++) {
  GOrd[BlkNo[i]][index[BlkNo[i]]] = i;
  index[BlkNo[i]]++;
}
In practice, however, the plurality of arithmetic units each take charge of a part of the block number array BlkNo and perform the distribution process. The record sequence number array GOrd[j] for a given block is therefore built piecewise by several arithmetic units, and the pieces are then integrated into a single record sequence number array per block. If the block number array BlkNo is assigned to the arithmetic units contiguously, that is, if the part of BlkNo handled by each arithmetic unit is a contiguous range, this integration becomes very simple, because for a given block there is no need to reorder elements across the record sequence number arrays created by different arithmetic units. The integration is achieved simply by concatenating the separately created record sequence number arrays.
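
By way of illustration only, the distribution process with contiguous assignment can be simulated sequentially as follows. The function name `distribute`, the parameters `num_blocks` and `num_units`, and the sample block number array are assumptions introduced here; each loop iteration over `u` stands in for one arithmetic unit that would run in parallel on the actual apparatus.

```python
def distribute(blk_no, num_blocks, num_units):
    """Simulate the distribution process with a contiguous range of BlkNo per unit."""
    n = len(blk_no)
    chunk = (n + num_units - 1) // num_units  # ceiling division
    partial = []  # partial[u][j]: the piece of GOrd[j] built by unit u
    for u in range(num_units):
        local = [[] for _ in range(num_blocks)]
        for i in range(u * chunk, min((u + 1) * chunk, n)):
            # Subscript i is the post-sort record sequence number.
            local[blk_no[i]].append(i)
        partial.append(local)
    # Integration: concatenate the pieces in unit order; no reordering is needed.
    return [[i for u in range(num_units) for i in partial[u][j]]
            for j in range(num_blocks)]

blk_no = [2, 0, 1, 0, 2, 1, 1, 0]  # hypothetical final block number array
gord = distribute(blk_no, num_blocks=3, num_units=4)
print(gord)  # [[1, 3, 7], [2, 5, 6], [0, 4]]
```

Since each unit handles a contiguous range of subscripts, every value written by unit u is smaller than every value written by unit u + 1, which is why plain concatenation already yields ascending order within each block.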

FIGS. 17A to 17C are explanatory diagrams of the inter-block sort process 2 (distribution process) in the tabular data sort process according to an embodiment of the present invention. Referring to FIG. 17A, the elements of the block number array BlkNo are assigned, four at a time in order from the head, to the eight arithmetic units SPU-0, SPU-1, ..., SPU-7. In FIG. 17A, each arithmetic unit distributes, in parallel, record sequence numbers to the block numbers according to the subscript i (= record sequence number) and the element BlkNo[i] (= block number) of its part of the block number array BlkNo. For example, in accordance with its data BlkNo[0] = 2, SPU-0 writes record sequence number 0 at the head of the record sequence number array GOrd-2 for block number 2; that is, GOrd-2[0] = 0. Each arithmetic unit performs the distribution process by creating, in its local memory, the data of its own range of the block number array and the record sequence number arrays GOrd-0, GOrd-1, ..., GOrd-7.

Each arithmetic unit performs the distribution process sequentially on the four data items in its range. FIG. 17B shows the state in which each arithmetic unit has distributed the fourth data item in its range. For example, SPU-7 sets GOrd-7[1] to 31 in accordance with BlkNo[31] = 7.

In this example, each arithmetic unit uses the arrays GOrd-0 to GOrd-7 for block 0 to block 7. In this case, a work area multiplied by the number of arithmetic units (= 8) is reserved in the local memory and the global memory. A linked list may be used to make this work area compact. The important point is that the work area is held in the local memory for as long as it fits there and, once it is found that the work area no longer fits, all or part of the work area in the local memory is transferred to the global memory in reasonably large batches, which makes it possible to consolidate memory accesses to the global memory.

Finally, the per-block record sequence number arrays created by the arithmetic units are integrated to create the final record sequence number arrays. Specifically, the values stored in each of GOrd-0, GOrd-1, ..., GOrd-7 are taken out in ascending order of record sequence number, that is, in the order SPU-0 to SPU-7, and stored in order from the head of the final GOrd-0, GOrd-1, ..., GOrd-7. This operation can be executed by, for example, any one arithmetic unit or the control unit; alternatively, a plurality of arithmetic units may operate in parallel, one per block number, taking values out of GOrd-0 and so on and writing them into the final GOrd-0 and so on.

GOrd-0 obtained in this way holds the record sequence numbers assigned to the records belonging to block 0, GOrd-1 holds the record sequence numbers assigned to the records belonging to block 1, and so on. FIG. 18 is an explanatory diagram of the result of the tabular data sort process according to an embodiment of the present invention. It can be seen that the record sequence number arrays obtained in FIG. 17C match the record sequence number arrays of the respective blocks.

[Summary of the tabular data sort process]
The tabular data sorting method according to the embodiment of the present invention described above is now described once more as a whole with reference to the flowchart shown in FIG. 19.

The tabular data sorting method comprises:
step 1902, in which the plurality of arithmetic units operate in parallel to apply, for each block containing the records in their charge, a sort to the item value access information array using the item value information as a key, thereby creating in the global memory a sorted item value access information array together with a work item value information array and a work record sequence number array that correspond to the sorted item value access information array;
step 1904, in which the plurality of arithmetic units operate in parallel to merge, in a predetermined order, element sets each consisting of a first element from the work item value information array and a corresponding second element from the work record sequence number array for a pair of blocks, using the first element as the upper digit and the second element as the lower digit, and to associate with each merged element set the block number of the block containing that element set, thereby creating a new work item value information array, a new work record sequence number array, and a new block number array for the pair of blocks;
step 1906, in which the plurality of arithmetic units operate in parallel and hierarchically to repeat the above step (ii) (step 1904), thereby creating in the global memory one set consisting of a final work item value information array, a final work record sequence number array, and a final block number array; and
step 1908, in which the plurality of arithmetic units operate in parallel to distribute the record sequence number at which each element of the final block number array is stored to the block number designated by that element and to arrange the distributed numbers in a predetermined order, thereby creating a sorted record sequence number array in the global memory.

When the item value information comprises, for each data item, a global item value array in which unique item values are stored in a predetermined order, a local item value number array in which local item value numbers identifying the item values contained in the records in charge are stored in the order of the record sequence numbers, and an item value designation pointer array in which item value designation pointers, each designating the position in the global item value array at which the item value represented by a local item value number is stored, are stored in a predetermined order, the sort applied in the above step (i) (step 1902) is a distribution counting sort, and the work item value information array is a work item value designation pointer array.
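
By way of illustration only, a distribution counting sort within one block might be sketched as follows. The exact array layout is an assumption made for this sketch: `lord[p]` is taken to index the local item value number array `lvno`, `lvl[k]` is taken to be the item value designation pointer for local item value number k, and `lvl` is assumed to be ascending so that sorting by local item value number also sorts by item value. The function name and sample values are hypothetical.

```python
def counting_sort_block(lord, gord, lvno, lvl):
    """Stable counting sort of one block keyed by local item value number."""
    keys = [lvno[p] for p in lord]   # local item value number of each record
    count = [0] * (len(lvl) + 1)
    for k in keys:                   # histogram of key values
        count[k + 1] += 1
    for k in range(len(lvl)):        # prefix sums: count[k] = first slot of key k
        count[k + 1] += count[k]
    n = len(lord)
    lord_s, lvl_s, gord_s = [None] * n, [None] * n, [None] * n
    for pos, k in enumerate(keys):   # forward scan keeps equal keys in input order
        slot = count[k]
        count[k] += 1
        lord_s[slot] = lord[pos]     # sorted item value access information
        lvl_s[slot] = lvl[k]         # work item value designation pointer
        gord_s[slot] = gord[pos]     # work record sequence number
    return lord_s, lvl_s, gord_s

# Hypothetical block holding records 8 to 11 and two distinct item values.
lord_s, lvl_s, gord_s = counting_sort_block(
    lord=[0, 1, 2, 3], gord=[8, 9, 10, 11], lvno=[1, 0, 1, 0], lvl=[0, 3])
print(lord_s, lvl_s, gord_s)  # [1, 3, 0, 2] [0, 0, 3, 3] [9, 11, 8, 10]
```

The counting sort needs only one histogram entry per distinct item value in the block, which is small enough to fit comfortably in a local memory when the blocks are kept small.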

Furthermore, step 1902 further includes:
a step in which the plurality of arithmetic units operate in parallel to transfer, from the global memory to the local memory, the record sequence number array and the item value access information array corresponding to the block containing the records in charge, together with the local item value number array and the item value designation pointer array for the predetermined item; and
a step in which the plurality of arithmetic units operate in parallel to transfer, to the global memory, the sorted item value access information array, the work item value designation pointer array, and the work record sequence number array created in the local memory.
Step 1904 includes:
a step in which the plurality of arithmetic units operate in parallel to transfer the work item value designation pointer array and the work record sequence number array for a pair of blocks from the global memory to the local memory; and
a step in which the plurality of arithmetic units operate in parallel to transfer the work item value designation pointer array, the work record sequence number array, and the new block number array created in the local memory to the global memory or to an arithmetic unit for further processing.
Furthermore, step 1906 includes a step of transferring the final block number array created in the local memory to the global memory.

[Alternative distribution process]
In the example of the distribution process shown in FIGS. 17A to 17C, the work area in the local memory grows as the number of blocks increases. In an alternative embodiment of the present invention, therefore, particularly when the number of blocks is large, the blocks are first grouped and the distribution process is then performed group by group to improve efficiency. For example, the block numbers are separated (grouped) into upper block numbers and lower block numbers by dividing each block number by 4, and the distribution process is applied separately to the upper block numbers and to the lower block numbers. Specifically, a block number array BlkNo' for the lower block numbers and a block number work array BlkNo' for the upper block numbers are generated from the block number array BlkNo' obtained by the merge process of the inter-block sort process 1. This generation process can also be executed by a plurality of arithmetic units operating in parallel.

For example, when the block number array BlkNo shown in FIGS. 16A to 16C is separated into the lower block numbers (that is, 0 to 3) and the upper block numbers (that is, 4 to 7), two block number arrays are generated. FIG. 20 is an explanatory diagram of the block number array separation process according to the alternative embodiment of the present invention. The distribution process of the inter-block sort process according to an embodiment of the present invention, described with reference to FIGS. 17A to 17C, is applied to BlkNo' and LVL' for the lower block numbers and to BlkNo' and LVL' for the upper block numbers obtained in this way, yielding the record sequence numbers for each block number.
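
By way of illustration only, the separation into groups followed by a per-group distribution might be sketched as follows. The text does not spell out how the original subscript is carried along after the separation, so keeping each subscript i next to its block number is an assumption of this sketch, as are the function names and the sample array.

```python
def split_by_group(blk_no, group_size=4):
    """Split BlkNo into groups; group 0 holds blocks 0-3, group 1 holds blocks 4-7."""
    groups = {}
    for i, b in enumerate(blk_no):
        # Integer division by the group size selects the group (lower or upper).
        groups.setdefault(b // group_size, []).append((i, b))
    return groups

def distribute_group(entries):
    """Distribute the record sequence numbers of one group to its blocks."""
    gord = {}
    for i, b in entries:
        gord.setdefault(b, []).append(i)
    return gord

blk_no = [2, 4, 0, 2, 5, 4, 0, 5]  # hypothetical final block number array
groups = split_by_group(blk_no)
lower = distribute_group(groups[0])
upper = distribute_group(groups[1])
print(lower)  # {2: [0, 3], 0: [2, 6]}
print(upper)  # {4: [1, 5], 5: [4, 7]}
```

Each per-group pass needs write buffers for only `group_size` blocks rather than for all blocks, which is the reduction in local work area that the alternative embodiment aims at.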

[Multi-item sort process for tabular data]
The tabular data sort process according to an embodiment of the present invention is a sort process on one predetermined data item. What this sort process changes are the block number array, the record sequence number array, and the item value access information array. The records belonging to each block and the item value information, on the other hand, do not change. A sort on a plurality of data items is therefore realized by repeating the above sort process on a predetermined data item. According to a preferred embodiment of the multi-core processing apparatus of the present invention, the multi-item sort process is realized by the control unit controlling the arithmetic units so as to repeat the sort process for the plurality of data items.

Thus, for example, to sort the tabular data shown in FIG. 5A first on the data item "School" and then on the data item "Age" (a multi-stage sort process), the sort on "School" is performed by the sort process described with reference to FIGS. 10A to 10C through FIG. 18, and the sort on "Age" is then executed in the same way, starting from the record sequence number array and the item value access information array shown in FIG. 18. When sort processes are applied successively for a plurality of data items, the sort on a data item with higher priority is applied later.

FIG. 21 is a diagram showing the results of applying the multi-stage sort process according to an embodiment of the present invention to the tabular data shown in FIG. 5A. In FIG. 21A, the sort on the data item "School" is applied first, followed by the sort on the data item "Age". In FIG. 21B, conversely, the sort on "Age" is applied first, followed by the sort on "School". The result of the sort process thus depends on the order in which the data items are sorted. It can also be seen that the multi-stage sort process according to an embodiment of the present invention is a stable sort.
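
By way of illustration only, the effect of the order of the stages can be reproduced with any stable sort. The sketch below uses Python's built-in `sorted`, which is guaranteed to be stable, in place of the block-based sort of the embodiment; the rows are hypothetical and are not the data of FIG. 5A.

```python
# Hypothetical (School, Age) rows; the row index plays the role of the record.
rows = [("West", 14), ("South", 12), ("West", 12), ("North", 14), ("South", 14)]
order = list(range(len(rows)))

# "School" first, then "Age": "Age" is applied later, so it has higher priority.
school_then_age = sorted(sorted(order, key=lambda r: rows[r][0]),
                         key=lambda r: rows[r][1])

# "Age" first, then "School": "School" is applied later and has higher priority.
age_then_school = sorted(sorted(order, key=lambda r: rows[r][1]),
                         key=lambda r: rows[r][0])

print(school_then_age)  # [1, 2, 3, 4, 0]
print(age_then_school)  # [3, 1, 4, 2, 0]
```

The second stage preserves the order produced by the first stage among records with equal keys, which is why the item sorted last ends up as the primary key.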

Alternatively, in an alternative embodiment, the multi-item sort process can be realized in a single stage. This is achieved by performing the distribution counting sort executed in the intra-block sort process of the above embodiment on a plurality of data items combined. Consider, for example, performing a multi-item sort on the tabular data of FIG. 5A with the data item "School" as the lower-priority data item and the data item "Age" as the higher-priority data item. The sort process is executed by treating a value Z, generated by combining the value X of the item value designation pointer for "School" with the value Y of the item value designation pointer for "Age", as a new item value. In this example, X can take four values, 0 ≤ X ≤ 3, and Y can take seven values, 0 ≤ Y ≤ 6, so 4 × 7 = 28 values, that is, the values 0 to 27, are assigned to the combinations of the two item values. More specifically, since "Age" is the higher-priority data item, the value
Z = 4 × Y + X
is treated as the new item value and the sort process is executed on it.

In practice, there is no need to create new tabular data holding the new item values; it suffices to use, as the key value Z of the distribution counting sort, the value Z = 4 × Y + X that combines the value X of the item value designation pointer for "School" with the value Y of the item value designation pointer for "Age". By executing the intra-block sort process, the inter-block sort process 1 (merge process), and the inter-block sort process 2 (distribution process) with this new item value designation pointer Z, the same result is obtained as when the sort process is executed for each data item in turn.
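
By way of illustration only, the equivalence between the combined key Z = 4 × Y + X and the two-stage sort can be checked as follows. The (X, Y) pairs are hypothetical pointer values within the ranges stated above, and Python's stable `sorted` stands in for the sort process of the embodiment.

```python
NUM_X = 4  # number of values X can take ("School": 0 to 3)
NUM_Y = 7  # number of values Y can take ("Age": 0 to 6)

pairs = [(3, 2), (1, 0), (3, 0), (0, 6), (1, 2)]  # hypothetical (X, Y) per record
z = [NUM_X * y + x for x, y in pairs]             # combined key, Y in the upper digit
records = range(len(pairs))

one_stage = sorted(records, key=lambda r: z[r])
two_stage = sorted(sorted(records, key=lambda r: pairs[r][0]),  # lower priority first
                   key=lambda r: pairs[r][1])                   # higher priority last

print(z)          # [11, 1, 3, 24, 9]
print(one_stage)  # [1, 2, 4, 0, 3]
```

Both X and Y can be recovered from Z as X = Z mod 4 and Y = Z div 4, so no information is lost, and a counting sort on Z needs a histogram of only 28 entries in this example.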

[Another data structure for tabular data]
The description so far has covered the tabular data sort process according to the present invention for the case in which the first type of item value information is used. As already explained, however, the sort process is also applicable when the data structure of the tabular data holds the second type of item value information. FIGS. 22A to 22D are explanatory diagrams of a data structure of tabular data according to an embodiment of the present invention that describes the same tabular data as FIG. 5A. FIG. 22A conceptually illustrates the tabular data, and FIG. 22B shows the order information in the data structure of the tabular data. FIG. 22C shows the item value information for the data item "School", and FIG. 22D shows the item value information for the data item "Age". In this embodiment, the item value information is the item values themselves. In this embodiment too, the item value information does not change before and after the sort process. The following describes a sort process on such tabular data that uses the item values of the data item "School" as key values.

[Sort process for tabular data (second type of item value information)]
The tabular data sort process using the first type of item value information differs from the tabular data sort process using the second type of item value information as follows. In the sort process using the first type of item value information, a distribution counting sort was executed within each block using the local item value numbers as key values to rearrange the item value access information, after which the inter-block sort process was performed using pairs of item value designation pointer value and record sequence number as key values. In the sort process using the second type of item value information, by contrast, a stable sort is executed within each block using the item values as key values to rearrange the item value access information, after which the inter-block sort is executed using pairs of item value and record sequence number as key values.

FIG. 23 is a schematic flowchart of the sort process for tabular data containing the second type of item value information.

Step 2301: Intra-block sort
The arithmetic units operate in parallel and, within their respective blocks, apply a stable sort (for example, merge sort or bubble sort) to the item value access information array LOrd using the item values in the item value array RV as key values, and generate the item value array RV' and the record sequence number array GOrd' so as to match the sorted item value access information array LOrd'.

Step 2302: Inter-block sort 1 (merge)
The arithmetic units operate in parallel and hierarchically to generate the block number array BlkNo from the item value arrays RV' and the record sequence number arrays GOrd'. Here, the item value array RV', the record sequence number array GOrd', and the block number array BlkNo' from each block are merged between blocks in tournament fashion.

Step 2303: Inter-block sort 2 (distribution)
The arithmetic units operate in parallel to generate the record sequence number arrays GOrd from the block number array BlkNo.
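
By way of illustration only, steps 2301 to 2303 can be chained in a sequential sketch as follows. The standard library function `heapq.merge` is used as a stand-in for the pairwise merge of the embodiment, all function names are assumptions, and the item values of block 1 are partly assumed (only the item values of block 0 and the first two sorted records of block 1 follow the worked example in this description).

```python
import heapq

def intra_block_sort(rv, gord, blk):
    """Step 2301: stable sort of one block by item value, as (RV', GOrd', BlkNo')."""
    lord_s = sorted(range(len(rv)), key=lambda p: rv[p])  # sorted() is stable
    return [(rv[p], gord[p], blk) for p in lord_s]

def tournament_merge(runs):
    """Step 2302: merge runs pairwise, stage by stage, keyed by (RV', GOrd')."""
    while len(runs) > 1:
        nxt = [list(heapq.merge(runs[k], runs[k + 1], key=lambda t: t[:2]))
               for k in range(0, len(runs) - 1, 2)]
        if len(runs) % 2:  # an unpaired run advances to the next stage unchanged
            nxt.append(runs[-1])
        runs = nxt
    return runs[0]

def distribute(blk_no, num_blocks):
    """Step 2303: subscript i of BlkNo goes to the block BlkNo[i]."""
    gord = [[] for _ in range(num_blocks)]
    for i, b in enumerate(blk_no):
        gord[b].append(i)
    return gord

blocks = [
    (["West", "South", "West", "South"], [0, 1, 2, 3]),  # block 0
    (["South", "North", "West", "South"], [4, 5, 6, 7]),  # block 1 (partly assumed)
]
runs = [intra_block_sort(rv, gord, b) for b, (rv, gord) in enumerate(blocks)]
merged = tournament_merge(runs)
blk_no = [t[2] for t in merged]
gord_final = distribute(blk_no, num_blocks=len(blocks))
print(blk_no)      # [1, 0, 0, 1, 1, 0, 0, 1]
print(gord_final)  # [[1, 2, 5, 6], [0, 3, 4, 7]]
```

Here `gord_final[j][k]` is the post-sort record sequence number given to the k-th record of block j in that block's sorted order.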

FIGS. 24A to 24H are explanatory diagrams of the intra-block sort process in the tabular data sort process according to an embodiment of the present invention. FIGS. 24A to 24H show the intra-block sort process in block 0 to block 7, respectively. In this example, the eight arithmetic units SPU-0 to SPU-7 are in charge of the intra-block sort process in block 0 to block 7, respectively. In this embodiment too, each arithmetic unit transfers the data needed for processing from the global memory to its dedicated local memory and manipulates the data in the local memory. When all the data cannot be held in the local memory at once, processing proceeds by transferring data between the global memory and the local memory in reasonably sized chunks.

Referring to FIG. 24A, for example, the element values of RV in this example are character strings, so LOrd is sorted in alphabetical order of the key values, using the values of RV as key values. Because a stable sort algorithm such as merge sort or bubble sort is used, the array LOrd = (0, 1, 2, 3) is rearranged into the array LOrd' = (1, 3, 0, 2). Here, the notation A = (a, b, c, d) means that the elements of array A are a, b, c, d in order from the head. Since the elements "West", "South", "West", "South" of the item value array RV correspond to the element values 0, 1, 2, 3 of the array LOrd, respectively, it is clear that rearranging the elements of LOrd with the item values as key values yields LOrd'. Taking the corresponding elements of the item value array RV and the record sequence number array GOrd in the order of the elements of LOrd' then yields the item value array RV' and the record sequence number array GOrd'. The item value array RV' is naturally an array in which the item values are sorted in ascending (alphabetical) order. The elements of the record sequence number array GOrd' are sorted so that, among equal item values, the LOrd value that appeared earlier comes first. That is, the pairs of item value and record sequence number have been rearranged by a stable sort. The same processing is performed for the other blocks, block 1 to block 7.
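
By way of illustration only, the rearrangement of LOrd for block 0 can be reproduced with a hand-written bubble sort, one of the stable algorithms named above. The function name is an assumption, and the initial GOrd of block 0 is assumed to be (0, 1, 2, 3); the RV and LOrd values are those given in the text.

```python
def bubble_sort_stable(lord, rv):
    """Bubble sort of LOrd keyed by rv[LOrd[k]]; stable because it swaps only on '>'."""
    out = list(lord)
    for end in range(len(out) - 1, 0, -1):
        for k in range(end):
            # Equal keys are never swapped, so their original order is preserved.
            if rv[out[k]] > rv[out[k + 1]]:
                out[k], out[k + 1] = out[k + 1], out[k]
    return out

rv = ["West", "South", "West", "South"]  # item value array RV of block 0
lord = [0, 1, 2, 3]                      # item value access information array LOrd
gord = [0, 1, 2, 3]                      # assumed record sequence numbers of block 0

lord_s = bubble_sort_stable(lord, rv)    # LOrd'
rv_s = [rv[p] for p in lord_s]           # RV'
gord_s = [gord[p] for p in lord_s]       # GOrd'
print(lord_s)  # [1, 3, 0, 2]
print(rv_s)    # ['South', 'South', 'West', 'West']
```

Replacing the strict comparison `>` with `>=` would still sort the item values but would let equal keys pass each other, so the result would no longer be stable.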

FIG. 25 is an explanatory diagram of the result, obtained in this way, of the intra-block sort process in the tabular data sort process according to an embodiment of the present invention. At this point, the item value access information array LOrd in each block already matches the final result, so it is written without a prime (') as LOrd. The item value array RV' and the record sequence number array GOrd', on the other hand, are not final results but working arrays in the middle of processing, so they are written with a prime ('). The block number array BlkNo and the record sequence number array GOrd, which are final results, have not yet been determined at this point and are left blank. In addition, the block number BlkNo' has been appended to the arrays RV' and GOrd' that result from the intra-block sort.

The inter-block sort process 1 (merge process) in the tabular data sort process according to an embodiment of the present invention is executed in parallel and hierarchically, as described with reference to FIGS. 12A to 12C.

Next, the inter-block sort process 1 (inter-block merge process) in the tabular data sort process according to an embodiment of the present invention is described in more detail. FIGS. 26A to 26C are explanatory diagrams of the first-stage inter-block merge process in the tabular data sort process according to an embodiment of the present invention. In this example, SPU-0 executes a merge process in a predetermined order between block Block-0 (hereinafter, block 0) and block Block-1 (block 1). The inter-block sort process is a merge of ascending lists in the sense that it generates one ascending list from two ascending lists. First, SPU-0 transfers the item value arrays RV', the record sequence number arrays GOrd', and the block number arrays BlkNo' for block 0 and block 1 from the global memory to the local memory. When the whole cannot be transferred to the local memory at once, it is transferred from the global memory to the local memory a little at a time.

FIG. 26A shows the process of comparing the information on the first record in block 0 (written as B0(RV', GOrd')) with the information on the first record in block 1 (written as B1(RV', GOrd')). At this time, the read pointer on the block 0 side and the read pointer on the block 1 side are both positioned at the head of the data. Treating RV' as the upper digit and GOrd' as the lower digit, the pairs (RV', GOrd') from the block 0 side and the block 1 side are arranged in a predetermined order (for example, ascending order). In this example,
B0(RV', GOrd') = ("South", 1) > B1(RV', GOrd') = ("North", 5)
so B1(RV', GOrd') is determined to be the head (smallest) element. The element set B1("North", 5, 1), which includes the block number, is therefore taken out as a result of the merge process. This result can be written as
RV'[0] = "North"
GOrd'[0] = 5
BlkNo'[0] = 1
Since the data on the block 1 side has been taken out, the read pointer on the block 1 side is advanced by one.

FIG. 26B shows the process of comparing the first record in block 0, whose read pointer is still at the head, with the second record in block 1, whose read pointer has been advanced by one. As before, comparing B0("South", 1) with B1("South", 4) shows that B0("South", 1) is smaller, so the element set B0("South", 1, 0), which includes the block number, is taken out as a result of the merge process, that is,
RV'[1] = "South"
GOrd'[1] = 1
BlkNo'[1] = 0
Since the data on the block 0 side has been taken out, the read pointer on the block 0 side is advanced by one.

By sequentially comparing the data on the block 0 side with the data on the block 1 side while advancing the pointers in this way, an ascending list in which the data of block 0 and the data of block 1 are merged is finally obtained. FIG. 26C shows the item value array RV', the record sequence number array GOrd', and the block number array BlkNo' that are finally taken out.
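The pointer-driven merge described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the embodiment: the function name `merge_blocks` and the tuple layout `(rv, gord, blkno)` are hypothetical, and the block contents other than the elements quoted in the text, ("South", 1), ("North", 5) and ("South", 4), are made up for the example.

```python
def merge_blocks(left, right):
    """Merge two ascending lists of (rv, gord, blkno) tuples.

    The comparison key is (rv, gord): rv acts as the upper digit and
    gord as the lower digit. blkno is carried along but never compared.
    """
    out = []
    i = j = 0  # read pointers; both start at the head of the data
    while i < len(left) and j < len(right):
        if (left[i][0], left[i][1]) <= (right[j][0], right[j][1]):
            out.append(left[i])
            i += 1  # advance the pointer on the side that was taken out
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])   # one side is exhausted; copy the remainder
    out.extend(right[j:])
    return out


# Hypothetical block contents (only the quoted elements come from the text).
block0 = [("South", 1, 0), ("West", 0, 0)]
block1 = [("North", 5, 1), ("South", 4, 1)]

merged = merge_blocks(block0, block1)
rv = [t[0] for t in merged]      # merged item value array RV'
gord = [t[1] for t in merged]    # merged record sequence number array GOrd'
blkno = [t[2] for t in merged]   # merged block number array BlkNo'
```

With these inputs the first two output rows are ("North", 5, 1) and ("South", 1, 0), which agrees with the results described for FIGS. 26A and 26B.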

The item value array RV', the record sequence number array GOrd', and the block number array BlkNo' that are taken out are sent, in this example, to SPU-4 for the second-stage merge process. Rather than sending the final result all at once, the data taken out may be sent from SPU-0 to SPU-4 as needed while the comparison between the block 0 side and the block 1 side is still in progress.

Next, the second-stage inter-block sort process 1 is described, in which the arithmetic unit SPU-4 applies a merge process in a predetermined order to blocks 0~1 output from SPU-0 and blocks 2~3 output from SPU-11, and outputs a single set of blocks 0~3. FIGS. 27A and 27B are explanatory diagrams of the second-stage merge process in the tabular data sort process according to an embodiment of the present invention. The second-stage merge process is the same as the first-stage merge process, except that the input information is transferred from the local memories of other arithmetic units. Briefly, as shown in FIG. 27A, the read pointers for the item value array RV', the record sequence number array GOrd', and the block number array BlkNo' from the two inputs are first set to the head. Comparing the (RV', GOrd') pairs on both sides gives
B0~1(RV', GOrd') = ("North", 5) > B2~3(RV', GOrd') = ("East", 8)
which shows that the element on the B2~3 side is smaller. The head element set on the B2~3 side, B2~3("East", 8, 2), is therefore taken out, and the read pointer on the B2~3 side, from which the element set was taken out, is advanced by one. Here, the notation Ba~b denotes the data obtained as a result of the sort process over blocks a through b.

By repeating this comparison of element sets and the reading out of the smaller element set, the element set B2~3("West", 13, 3) on the B2~3 side is taken out last, as shown in FIG. 27B, and the second-stage merge process ends.

The item value array RV', the record sequence number array GOrd', and the block number array BlkNo' that are taken out are sent, in this example, to SPU-6 for the third-stage merge process. Rather than sending the final result all at once, the data taken out may be sent from SPU-4 to SPU-6 as needed while the comparison between the blocks 0~1 side and the blocks 2~3 side is still in progress. As the above description shows, in the second stage of the inter-block sort process 1 (merge process) according to an embodiment of the present invention, data access is limited to sequential access, and the SPUs can execute the inter-block sort process 1 in parallel. The performance of the multiprocessor-type processing apparatus is therefore fully exploited.
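The staged merging, including the option of forwarding elements to the next stage before a merge has finished, can be illustrated with lazy iterators. The following is a single-threaded Python sketch under stated assumptions: it relies on the standard-library `heapq.merge`, which yields merged elements one at a time; the names `sort_key`, `merge_stage` and `merge_tree`, the four-block layout, and the data other than the elements quoted in the text are hypothetical. In the apparatus described here, each pairwise merge would instead run on its own arithmetic unit.

```python
import heapq


def sort_key(t):
    # (rv, gord): item value as upper digit, record sequence number as lower digit
    return (t[0], t[1])


def merge_stage(streams):
    """Merge adjacent pairs of ascending streams into half as many streams."""
    merged = [heapq.merge(a, b, key=sort_key)
              for a, b in zip(streams[0::2], streams[1::2])]
    if len(streams) % 2:          # an unpaired stream passes through unchanged
        merged.append(streams[-1])
    return merged


def merge_tree(blocks):
    """Repeat pairwise merging stage by stage until one stream remains."""
    streams = [iter(b) for b in blocks]
    while len(streams) > 1:
        streams = merge_stage(streams)
    return list(streams[0])       # elements are pulled through all stages lazily


# Hypothetical intra-block sort results, as (rv, gord, blkno) tuples.
blocks = [
    [("South", 1, 0), ("West", 0, 0)],
    [("North", 5, 1), ("South", 4, 1)],
    [("East", 8, 2), ("North", 9, 2)],
    [("South", 12, 3), ("West", 13, 3)],
]
result = merge_tree(blocks)
```

Because each stage is an iterator over the previous one, every access in the sketch is sequential, which mirrors the property noted in the text.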

FIG. 28 is an explanatory diagram of the third-stage merge process in the tabular data sort process according to an embodiment of the present invention. Like the first-stage and second-stage merge processes, the third-stage merge process repeats the following operation: the data B0~3(RV', GOrd') on the B0~3 side and the data B4~7(RV', GOrd') on the B4~7 side are compared in order from the head, the smaller element set is taken out, and the read pointer of the side from which the element was taken out is advanced by one. This yields the set of arrays shown on the right side of FIG. 28, namely the item value array RV', the record sequence number array GOrd', and the block number array BlkNo'. This set of arrays represents the sort result for all records. For example, in the item value array RV', the values, including equal values, are arranged in ascending order from the head, which shows that the records are sorted in the order of the item values in the global item value array. The record sequence number array GOrd' further shows that records whose elements in the item value array RV' are equal are arranged in ascending order of their record sequence numbers before sorting. A sort result that is stable with respect to the record sequence numbers is obtained because, in the inter-block sort, the records are rearranged according to the ordering of the pair consisting of the item value designation pointer and the record sequence number.
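The role of the record sequence number as the lower digit can be seen in a small comparison. The sketch below is illustrative only and assumes Python's standard `heapq.merge`, which, for equal keys, yields the element from the earlier input first; the two single-element blocks are hypothetical and are chosen so that the left block holds the larger record sequence number.

```python
import heapq

# Hypothetical (rv, gord, blkno) elements with equal item values.
left = [("South", 7, 0)]
right = [("South", 4, 1)]

# Key on the item value only: the tie is resolved by input position,
# so record 7 is emitted before record 4.
by_value_only = list(heapq.merge(left, right, key=lambda t: t[0]))

# Key on (item value, record sequence number): the tie is resolved by the
# pre-sort record sequence number, so record 4 is emitted before record 7.
by_value_and_order = list(heapq.merge(left, right, key=lambda t: (t[0], t[1])))
```

Merging on the item value alone would preserve the original record order only if the left input always held the smaller record sequence numbers; using the pair as the key removes that dependency.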

The following can be seen from the three arrays obtained by the inter-block sort process 1 (merge process) in the tabular data sort process according to an embodiment of the present invention. For example, the first row of the set of arrays, ("East", 8, 2), shows that the record to which record sequence number 0 is assigned after sorting
(1) has the item value "East" for the data item used as the sort key,
(2) had record sequence number 8 before sorting, and
(3) belongs to block 2.

To aid understanding of the invention, the result of the inter-block sort process 1 (merge process) is expressed in the data structure for the multi-core processing apparatus. FIGS. 29A to 29C are explanatory diagrams of the result of the inter-block sort process 1 (merge process) in the tabular data sort process according to an embodiment of the present invention. The tabular data shown in FIG. 29A is similar to the tabular data shown in FIG. 7B, but at this point the values to be stored in the record sequence number array GOrd shown in FIG. 29A are still undetermined. The item value information shown in FIGS. 29B and 29C, that is, the item value array RV, is not affected by the sort process and is therefore the same as the item value information shown in FIGS. 22A to 22D.

In the tabular data sort process according to an embodiment of the present invention, the record sequence numbers of the records belonging to each block are determined last. This process of determining the record sequence numbers is called the inter-block sort process 2 (distribution process). The inter-block sort process 2 (distribution process) is exactly the same as the process described with reference to FIGS. 17A, 17B, 17C and 18, so no further description is given.
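Since FIGS. 17A to 18 are not reproduced in this passage, the following Python sketch is only one reading of the distribution step, based on the statement that the sorted record sequence number array is created by distributing the elements of the block number array block by block. It assumes that the position of an element in the final block number array is the new record sequence number and that this position is handed to the block named by the element. The function name `distribute` and the sample array are hypothetical, and the sketch is sequential, whereas the embodiment performs this step in parallel.

```python
def distribute(final_blkno, num_blocks):
    """Hand each sorted position to the block named at that position.

    final_blkno[i] is the block that owns the record placed i-th by the
    merge process; the result lists, per block, the new record sequence
    numbers of that block's records in ascending order.
    """
    gord_per_block = [[] for _ in range(num_blocks)]
    for new_order, blk in enumerate(final_blkno):
        gord_per_block[blk].append(new_order)
    return gord_per_block


# Hypothetical final block number array for eight records in four blocks.
final_blkno = [2, 1, 2, 0, 1, 3, 0, 3]
gord = distribute(final_blkno, 4)
```

Under this reading, the k-th entry of a block's list pairs with the k-th entry of that block's already sorted item value access information array.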

The multi-item sort process can also be applied to tabular data that includes the first type of item value information. The multi-item sort process used in this case is a multi-stage sort process. That is, it is realized by repeating, for a plurality of data items, the sort process for a predetermined data item described with reference to FIGS. 22A to 22D through FIGS. 29A to 29C.

Thus, for example, to sort the tabular data shown in FIG. 22A first on the data item "School" and then on the data item "Age" (multi-stage sort process), the sort on the data item "School" is performed by the sort process described above, and the sort on the data item "Age" is then performed in the same way, starting from the block number array, the record sequence number array, and the item value access information array shown in FIG. 18. When the sort process is applied to a plurality of data items in sequence, the sort on the data item with the higher priority is applied later. The result of the sort process therefore depends on the order in which the data items are sorted.
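The effect of applying the higher-priority item last can be checked on a small table. This is a minimal Python sketch that relies on the stability of the built-in `sorted`; the records, including the `id` field and all values, are hypothetical and do not reproduce FIG. 22A.

```python
# Hypothetical records; "id" stands in for the original record order.
records = [
    {"id": 0, "School": "West", "Age": 14},
    {"id": 1, "School": "East", "Age": 13},
    {"id": 2, "School": "West", "Age": 13},
    {"id": 3, "School": "East", "Age": 14},
    {"id": 4, "School": "East", "Age": 13},
]

# Stage 1: sort on the lower-priority item.
by_school = sorted(records, key=lambda r: r["School"])

# Stage 2: a stable sort on the higher-priority item keeps the stage 1
# order among records whose "Age" values are equal.
by_age_then_school = sorted(by_school, key=lambda r: r["Age"])

ids = [r["id"] for r in by_age_then_school]
```

The two-stage result equals a single sort on the pair (Age, School); reversing the two stages would instead give the order for the pair (School, Age).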

The present invention is not limited to the embodiments described above. Various modifications are possible within the scope of the invention set forth in the claims, and it goes without saying that such modifications are also included within the scope of the present invention.

Claims

A sorting method, in a multi-core processing apparatus comprising a plurality of arithmetic units each including a dedicated local memory and a global memory connected to the plurality of arithmetic units, for rearranging records of tabular data, which is represented as an array of records including item values corresponding to one or more data items and is constructed in the global memory, according to the item values relating to a predetermined data item, wherein
the tabular data is represented by:
a block number array in which the block numbers corresponding to the respective records in the array of records, which is divided into blocks each having a size that fits in the local memory of each arithmetic unit, are stored in the order of the record sequence numbers of the records in the tabular data;
a record sequence number array in which the record sequence numbers of assigned records, which are the records belonging to each block, are stored in the order of the record sequence numbers;
an item value access information array in which item value access information for accessing the item values included in the assigned records is stored in the order of the record sequence numbers; and
item value information describing, for each data item, the item values accessed using the item value access information,
the sorting method comprising, with respect to the predetermined data item:
(i) a step in which the plurality of arithmetic units operate in parallel and, for each block including the assigned records, apply a sort to the item value access information array using the item value information as a key, thereby creating in the global memory a sorted item value access information array, and a work item value information array and a work record sequence number array corresponding to the sorted item value access information array;
(ii) a step in which the plurality of arithmetic units operate in parallel and merge, in a predetermined order, element sets each consisting of a first element from the work item value information array and a second element from the corresponding work record sequence number array relating to a pair of blocks, using the first element as the upper digit and the second element as the lower digit, and associate with each merged element set the block number of the block that includes that element set, thereby creating a new work item value information array, a new work record sequence number array, and a new block number array relating to the pair of blocks;
(iii) a step in which the plurality of arithmetic units operate in parallel and hierarchically and repeatedly execute step (ii), thereby creating a final block number array in the global memory; and
(iv) a step in which the plurality of arithmetic units operate in parallel, distribute the record sequence numbers at which the elements of the final block number array are stored to the block numbers designated by those elements, and arrange them in a predetermined order, thereby creating a sorted record sequence number array in the global memory.

The sorting method according to claim 1, wherein
step (i) further includes:
a step in which the plurality of arithmetic units operate in parallel and transfer, from the global memory to the local memory, the record sequence number array and the item value access information array corresponding to the block including the assigned records, and the item value information array relating to the predetermined item; and
a step in which the plurality of arithmetic units operate in parallel and transfer, to the global memory, the sorted item value access information array, the work item value information array, and the work record sequence number array created in the local memory, and
step (ii) further includes:
a step in which the plurality of arithmetic units operate in parallel and transfer the work item value information array and the work record sequence number array relating to the pair of blocks from the global memory to the local memory; and
a step in which the plurality of arithmetic units operate in parallel and transfer the new work item value information array, the new work record sequence number array, and the new block number array created in the local memory to the global memory or to an arithmetic unit that performs further processing.

The sorting method according to claim 1 or 2, wherein
the item value information comprises:
for each data item, a global item value array in which unique item values are stored in a predetermined order;
for each data item, a local item value number array in which local item value numbers designating the item values included in the assigned records are stored in the order of the record sequence numbers; and
for each data item, an item value designation pointer array in which item value designation pointers, each designating the position in the global item value array at which the item value represented by a local item value number is stored, are stored in a predetermined order,
the sort applied in step (i) is a distribution counting sort, and
the work item value information array is a work item value designation pointer array.

The sorting method according to claim 1 or 2, wherein
the item value information includes, for each data item, an item value array in which the item values included in the assigned records are stored,
the sort applied in step (i) is a stable sort, and
the work item value information array is a work item value array.

The sorting method according to claim 1, further comprising (v) a step of executing steps (i), (ii), (iii), and (iv) with respect to another data item, using the sorted record sequence number array and the sorted item value access information array as the record sequence number array and the item value access information array, respectively.

A multi-core processing apparatus comprising a plurality of arithmetic units each including a dedicated local memory and a global memory connected to the plurality of arithmetic units, the apparatus rearranging records of tabular data, which is represented as an array of records including item values corresponding to one or more data items and is constructed in the global memory, according to the item values relating to a predetermined data item, wherein
the tabular data is represented by:
a block number array in which the block numbers corresponding to the respective records in the array of records, which is divided into blocks each having a size that fits in the local memory of each arithmetic unit, are stored in the order of the record sequence numbers of the records in the tabular data;
a record sequence number array in which the record sequence numbers of assigned records, which are the records belonging to each block, are stored in the order of the record sequence numbers;
an item value access information array in which item value access information for accessing the item values included in the assigned records is stored in the order of the record sequence numbers; and
item value information describing, for each data item, the item values accessed using the item value access information,
and each arithmetic unit of the multi-core processing apparatus comprises:
(a) means for operating in parallel with the other arithmetic units and, for each block including the assigned records, applying a sort to the item value access information array using the item value information relating to the predetermined data item as a key, thereby creating in the global memory a sorted item value access information array, and a work item value information array and a work record sequence number array corresponding to the sorted item value access information array;
(b) means for operating in parallel with the other arithmetic units and merging, in a predetermined order, element sets each consisting of a first element from the work item value information array and a second element from the corresponding work record sequence number array relating to a pair of blocks, using the first element as the upper digit and the second element as the lower digit, and associating with each merged element set the block number of the block that includes that element set, thereby creating a new work item value information array, a new work record sequence number array, and a new block number array relating to the pair of blocks;
(c) means for operating in parallel and hierarchically with the other arithmetic units and repeatedly operating the means (b), thereby creating a final block number array in the global memory; and
(d) means for operating in parallel with the other arithmetic units, distributing the record sequence numbers at which the elements of the final block number array are stored to the block numbers designated by those elements, and arranging them in a predetermined order, thereby creating a sorted record sequence number array in the global memory.

The multi-core processing apparatus according to claim 6, wherein
each arithmetic unit further comprises a dedicated memory interface for data transfer between the dedicated local memory and the global memory,
the means (a), via the dedicated memory interface, transfers the record sequence number array and the item value access information array corresponding to the block including the assigned records, and the item value information array relating to the predetermined item, from the global memory to the local memory, and transfers the sorted item value access information array, the work item value information array, and the work record sequence number array created in the local memory to the global memory,
the means (b), via the dedicated memory interface, transfers the work item value information array and the work record sequence number array relating to the pair of blocks from the global memory to the local memory, and transfers the new work item value information array, the new work record sequence number array, and the new block number array created in the local memory to the global memory or to an arithmetic unit that performs further processing, and
the means (c) transfers the final block number array created in the local memory to the global memory via the dedicated memory interface.

A computer-readable program that is loaded into a computer comprising a plurality of arithmetic units each including a dedicated local memory and a global memory connected to the plurality of arithmetic units, and that causes the computer to execute code for rearranging records of tabular data, which is represented as an array of records including item values corresponding to one or more data items and is constructed in the global memory, according to the item values relating to a predetermined data item, wherein
in the computer, the tabular data is represented by:
a block number array in which the block numbers corresponding to the respective records in the array of records, which is divided into blocks each having a size that fits in the local memory of each arithmetic unit, are stored in the order of the record sequence numbers of the records in the tabular data;
a record sequence number array in which the record sequence numbers of assigned records, which are the records belonging to each block, are stored in the order of the record sequence numbers;
an item value access information array in which item value access information for accessing the item values included in the assigned records is stored in the order of the record sequence numbers; and
item value information describing, for each data item, the item values accessed using the item value access information,
and the program includes:
(a) code that causes each arithmetic unit to operate in parallel with the other arithmetic units and, for each block including the assigned records, to apply a sort to the item value access information array using the item value information relating to the predetermined data item as a key, thereby creating in the global memory a sorted item value access information array, and a work item value information array and a work record sequence number array corresponding to the sorted item value access information array;
(b) code that causes each arithmetic unit to operate in parallel with the other arithmetic units and to merge, in a predetermined order, element sets each consisting of a first element from the work item value information array and a second element from the corresponding work record sequence number array relating to a pair of blocks, using the first element as the upper digit and the second element as the lower digit, and to associate with each merged element set the block number of the block that includes that element set, thereby creating a new work item value information array, a new work record sequence number array, and a new block number array relating to the pair of blocks;
(c) code that causes each arithmetic unit to operate in parallel and hierarchically with the other arithmetic units and to repeatedly execute the code (b), thereby creating a final block number array in the global memory; and
(d) code that causes each arithmetic unit to operate in parallel with the other arithmetic units, to distribute the record sequence numbers at which the elements of the final block number array are stored to the block numbers designated by those elements, and to arrange them in a predetermined order, thereby creating a sorted record sequence number array in the global memory.

A computer program product that is loaded into a computer comprising a plurality of arithmetic units each including a dedicated local memory and a global memory connected to the plurality of arithmetic units, and that causes the computer to execute the sorting method according to claim 1, in which records of tabular data, represented as an array of records including item values corresponding to one or more data items and constructed in the global memory, are rearranged according to the item values relating to a predetermined data item.

A recording medium on which is recorded a computer program that is loaded into a computer comprising a plurality of arithmetic units each including a dedicated local memory and a global memory connected to the plurality of arithmetic units, and that causes the computer to execute the sorting method according to claim 1, in which records of tabular data, represented as an array of records including item values corresponding to one or more data items and constructed in the global memory, are rearranged according to the item values relating to a predetermined data item.