WO2010013320A1 - Method for operating on tabular data, distributed memory multiprocessor, and program - Google Patents

Info

Publication number
WO2010013320A1
Authority
WO
WIPO (PCT)
Prior art keywords
array
local
item
record
data
Prior art date
Application number
PCT/JP2008/063660
Other languages
English (en)
Japanese (ja)
Inventor
晋二 古庄
Original Assignee
株式会社ターボデータラボラトリー
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社ターボデータラボラトリー
Priority to PCT/JP2008/063660
Publication of WO2010013320A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs

Definitions

  • The disclosure of the present application relates to a method in which a plurality of arithmetic units share and manipulate tabular data represented as an array of records including item values corresponding to data items, and in particular to a method of constructing tabular data, sorting tabular data, retrieving specific data from tabular data, or tabulating tabular data.
  • The disclosure also relates to a distributed memory multiprocessor in which a plurality of arithmetic units share and manipulate such tabular data, and in particular to a distributed memory multiprocessor that constructs tabular data, sorts tabular data, retrieves specific data from tabular data, or tabulates tabular data.
  • The disclosure further relates to a program for causing a distributed memory multiprocessor to execute the above method, a computer program product, and a recording medium on which the computer program is recorded.
  • The present inventor has proposed a basic data processing algorithm for processing large-scale data at high speed, for example, the "on-memory data processing algorithm" described in Patent Document 1.
  • This technique is based on the idea that tabular data is decomposed into items (i.e., columns) rather than into records (i.e., rows) as in the prior art.
  • Specifically, the tabular data is represented by a data structure including (1) an array representing the record order, (2) a value table in which the unique item values belonging to an item are arranged in a predetermined order (for example, ascending order), and (3) an array representing, for each record, the position in the value table at which the item value corresponding to that record is stored.
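The three-part structure described above can be illustrated with a minimal Python sketch. The function name `decompose` and the sample city names are assumptions made for illustration; they do not appear in the patent.

```python
def decompose(column):
    """Split one column into a value table and a per-record pointer array."""
    value_table = sorted(set(column))               # (2) unique item values, ascending
    position = {v: i for i, v in enumerate(value_table)}
    pointers = [position[v] for v in column]        # (3) position in value_table per record
    return value_table, pointers

column = ["Tokyo", "Osaka", "Tokyo", "Kyoto", "Osaka"]
value_table, pointers = decompose(column)
record_order = list(range(len(column)))             # (1) initial record order

# Reading a record follows the chain: order -> pointer -> value.
restored = [value_table[pointers[r]] for r in record_order]
```

Sorting or filtering the table then only needs to rewrite `record_order`; the value table and the pointer array stay untouched.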
  • Patent Document 2 describes a search and sort algorithm for a distributed memory multiprocessor system, Patent Document 3 describes an aggregation algorithm, and Patent Document 4 describes an efficient sorting algorithm for a shared memory multiprocessor system.
  • The techniques of Patent Documents 2 and 3 are specifically designed to be compatible with a massively parallel processor (MPP) architecture.
  • A comparison operation may be performed with data stored in the module. Because the cost of this comparison operation is on the order of the square of the number of processor modules, it increases as the degree of parallelization increases.
  • The sort for the shared memory multiprocessor system described in Patent Document 4 is specifically designed to be compatible with a symmetric multiprocessor (SMP) architecture.
  • Computer systems that employ a non-uniform memory access (NUMA) architecture, as described in Non-Patent Document 1, are commercially available.
  • A NUMA computer system is a distributed memory multiprocessor system in the sense that remote memory access is possible but no shared memory is provided.
  • The techniques described in Patent Documents 2 and 3 can be applied directly to such a NUMA computer system if remote memory access is not used.
  • It is preferable to provide a method for processing data such that the amount of computation in each arithmetic unit, the amount of data held in each arithmetic unit, and the amount of data communication between the arithmetic units are reduced as much as possible.
  • It is likewise preferable to provide a program for processing data, a computer program product, and a recording medium on which the computer program is recorded, such that these three quantities are reduced as much as possible.
  • A distributed memory multiprocessor is provided that comprises a plurality of arithmetic units, each including a dedicated local memory and a processor and connected so as to communicate with one another to transmit and receive data. A series of data is divided into a plurality of blocks associated with the plurality of arithmetic units and held in the dedicated local memory of each arithmetic unit, and the series of data is manipulated by pipeline processing of the plurality of arithmetic units.
  • Each arithmetic unit includes merging means that receives local data from one or more arithmetic units in the previous pipeline stage, converts at least two received local data into one further local data, and transmits the further local data to one arithmetic unit in a later pipeline stage.
  • The merging means of the plurality of arithmetic units can be dynamically connected in a tournament manner so that one global data is finally generated.
  • At least one of the arithmetic units includes distribution means that divides the global data and allocates it to the plurality of arithmetic units based on block numbers corresponding to the plurality of blocks.
  • The distributed memory multiprocessor is a device that includes a plurality of arithmetic units, each having a dedicated local memory and connected so as to communicate with one another to transmit and receive data, and in which the plurality of arithmetic units are configured to realize pipeline processing. An arithmetic unit is sometimes called a node.
  • The processor in an arithmetic unit may be a single-core processor or a multi-core processor including two or more processor cores.
  • With a single-core processor, each block obtained from a series of data is associated with one arithmetic unit, that is, with the single-core processor in that arithmetic unit.
  • With a multi-core processor, the local memory in the arithmetic unit is used as a shared memory between the two or more processor cores. In this case each block obtained from a series of data is preferably associated with one processor core in the arithmetic unit, so that a plurality of blocks are associated with one arithmetic unit.
  • A series of data is divided into a plurality of blocks, and a block number is assigned to each block.
  • When data is merged by pipeline processing of the plurality of arithmetic units, this block number is copied and kept associated with the merged data. Therefore, by referring to the block number corresponding to each piece of data, the merged data can be distributed to the arithmetic unit associated with the original block.
  • Because block numbers are associated with data in this way, each arithmetic unit does not need to receive data from all other arithmetic units. Therefore, operations such as comparison operations can still be performed even when the degree of parallelization increases.
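The role of the block number can be sketched as follows. This is an illustrative Python sketch, not the patented implementation: the helper names `merge_tagged` and `distribute` and the sample values are assumptions. Each value carries the number of the block it came from through the merge, and that tag alone decides where the result is sent back.

```python
import heapq

def merge_tagged(a, b):
    """Merge two sorted lists of (value, block_no) pairs, keeping the tags."""
    return list(heapq.merge(a, b))

def distribute(merged, n_blocks):
    """Send each position in the merged list back to the block named by its tag."""
    per_block = [[] for _ in range(n_blocks)]
    for rank, (_value, block_no) in enumerate(merged):
        per_block[block_no].append(rank)
    return per_block

block0 = [(10, 0), (30, 0)]   # sorted local data of block 0
block1 = [(20, 1), (40, 1)]   # sorted local data of block 1
merged = merge_tagged(block0, block1)
ranks = distribute(merged, 2)
```

Each block receives only the ranks of its own values, so no unit has to exchange data with every other unit.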
  • The merging means of each arithmetic unit is connected on its input side to the merging means of one or more arithmetic units and on its output side to the merging means of one arithmetic unit. It merges two local data input from at least one arithmetic unit into one local data and outputs the result to one arithmetic unit. In this way, one local data is finally produced.
  • The merging means of each arithmetic unit may be configured to merge three or more local data from one or more arithmetic units into one local data. Further, to reduce the amount of data communication between the arithmetic units, one of the two arithmetic units on the input side may be the same arithmetic unit as the one that includes the merging means for merging the local data from the input side.
  • When the processor of the arithmetic unit is a multi-core processor, the first-layer merge processing in the tournament table may be performed between processor cores in the same arithmetic unit. The first-layer merge processing may also be executed using local data from different arithmetic units.
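The tournament-style connection of the merging means can be pictured as a pairwise reduction, layer by layer. The following Python sketch runs sequentially in one process, which is an assumption made for illustration; in the described system each merge node would run on an arithmetic unit or processor core. The name `tournament_merge` is hypothetical.

```python
import heapq

def tournament_merge(local_lists):
    """Merge sorted lists pairwise, layer by layer, until one list remains."""
    layer = [list(x) for x in local_lists]
    rounds = 0
    while len(layer) > 1:
        next_layer = []
        for i in range(0, len(layer), 2):
            pair = layer[i:i + 2]                 # one or two inputs per merge node
            next_layer.append(list(heapq.merge(*pair)))
        layer = next_layer
        rounds += 1
    return (layer[0] if layer else []), rounds

merged, rounds = tournament_merge([[1, 9], [4, 6], [2, 8], [3, 5], [0, 7]])
```

With five inputs the sketch needs three layers, and each merge node receives data from at most two predecessors.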
  • The series of data is tabular data expressed as an array of records including item values corresponding to data items.
  • The records assigned to each arithmetic unit, that is, the assigned records (records in charge), are separated into order information, which depends on the order of the records, and item information, which depends on each data item. When tabular data is separated in this way, applying a search or sort to the tabular data affects only the order information, and the item information remains as it was before the search or sort. This greatly reduces the calculation cost and the data communication cost.
  • The order information includes a record sequence number array (also called the record order number array) and an item value access information array. The record sequence number array stores, in numerical order, the record sequence numbers that identify the assigned records within the tabular data. The item value access information array stores item value access information in the order of the record sequence numbers.
  • The item value access information is used to access the item values included in the assigned records.
  • The record sequence number array and the item value access information array are integer arrays of the same size. Each element of the record sequence number array represents the position of the record corresponding to this element in the entire tabular data, that is, the original record position number.
  • Each element of the item value access information array is used as an index into a specific array in the item information, namely the local item value number array described later, in order to link the order information with the item information.
  • The item information includes (a) a local item value array, in which the unique item values included in the assigned records are stored in a predetermined order; (b) a local item value number array, in which local item value numbers, each specifying the position in the local item value array at which the item value of an assigned record is stored, are stored in the order of the original record position numbers of the assigned records; and (c) a global item value sequence number array, in which the sequence numbers assigned to the unique item values in the local item value array, based on the predetermined order over the entire tabular data, are stored.
  • The local item value number array (local field value number array) has as many elements as there are assigned records, and each element specifies the item value included in one of the assigned records.
  • The local item value array is an array in which the unique item values included in all of the assigned records are arranged in a predetermined order (for example, ascending or descending order). Because it stores the item values themselves, it can take various data types such as integer, floating point, and character string.
  • The global item value sequence number array stores, in the order of the item values in the local item value array, the rank of each of those item values among the item values held in the entire tabular data. The rank of each item value is determined by a predetermined order such as ascending or descending order.
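The three item-information arrays can be made concrete with a small Python sketch for two blocks of one integer item. The function names and sample values are assumptions for illustration. Note also that the global ranks are computed here by looking at all blocks at once, which is a shortcut for readability and not the pipeline mechanism the text describes.

```python
def build_block(values):
    """Return (local item value array, local item value number array)."""
    local_values = sorted(set(values))                 # unique values, ascending
    pos = {v: i for i, v in enumerate(local_values)}
    local_numbers = [pos[v] for v in values]           # one entry per assigned record
    return local_values, local_numbers

def global_sequence_numbers(all_local_values):
    """Rank every local value within a virtual global item value array."""
    global_values = sorted({v for lv in all_local_values for v in lv})
    rank = {v: i for i, v in enumerate(global_values)}
    return [[rank[v] for v in lv] for lv in all_local_values]

block_rows = [[16, 18, 16], [20, 18, 24]]              # item values per block
built = [build_block(rows) for rows in block_rows]
gseq = global_sequence_numbers([lv for lv, _ in built])
```

The value 18 appears in both blocks and receives the same global sequence number in each, which is what makes ranks comparable across blocks without comparing the values themselves.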
  • By combining the above-described concept of component decomposition, the concept of block numbers, and the concept of pipeline processing, a distributed memory multiprocessor is provided that constructs tabular data in memory.
  • The process of building tabular data in the memory of the distributed memory multiprocessor includes (i) an order set creation process for creating the record sequence number array and the item value access information array, (ii) an intra-block compilation process for creating the local item value number array and the local item value array, and (iii) an inter-block compilation process for creating the global item value sequence number array across the blocks.
  • The records of the tabular data are divided into blocks, each containing one or more records and identified by a block number.
  • Each block is initially associated with the arithmetic unit that is responsible for processing the records included in the block.
  • The records that each arithmetic unit is responsible for are referred to as its assigned records (records in charge) in this document.
  • The assigned records defined in this way are transmitted from an external device to each arithmetic unit.
  • The external device may be an external storage device, an external arithmetic unit, or an external computer.
  • To recognize its assigned records, each arithmetic unit generates a record sequence number array in which the record sequence numbers of the assigned records (which initially match the original record position numbers) are stored in order.
  • The original record position number corresponds to the position at which each record is held in the original tabular data, for example, a line number.
  • To access the item values included in the assigned records, each arithmetic unit also generates an item value access information array in which item value access information is stored in the order of the record sequence numbers.
  • The item values included in the assigned records of each arithmetic unit are stored in a local item value array in that arithmetic unit, so that the unit can access each data item through the item value access information array.
  • In the local item value array, the unique item values are stored in a predetermined order (ascending or descending) for each data item. Because each arithmetic unit uses the item value access information array to access the item values of its assigned records, a local item value number array is generated for each data item, in which the local item value numbers identifying the item values of the assigned records are stored in the order of the original record position numbers.
  • In addition, a global item value sequence number array is generated that specifies the position of each local item value in a virtual global item value array, in which the item values included in the entire tabular data are stored in the predetermined order.
  • The generated arrays are stored in the dedicated local memory of the arithmetic unit.
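The inter-block compilation step, which produces the global item value sequence numbers by merging rather than by a global lookup, might look like the following Python sketch. The triple layout `(values, seq_nos, block_nos)` and all function names are assumptions for illustration. The sketch recomputes sequence numbers during each merge so that equal values from different blocks share one number.

```python
import heapq

def make_local(values, block_no):
    """Local data of one block: unique sorted values, local ranks, block tags."""
    values = sorted(set(values))
    return values, list(range(len(values))), [block_no] * len(values)

def merge_local_data(a, b):
    """Merge two local data triples; equal values receive the same sequence number."""
    values, seq_nos, block_nos = [], [], []
    for value, block_no in heapq.merge(zip(a[0], a[2]), zip(b[0], b[2])):
        if values and value == values[-1]:
            seq = seq_nos[-1]                          # duplicate across blocks
        else:
            seq = seq_nos[-1] + 1 if seq_nos else 0    # next distinct value
        values.append(value)
        seq_nos.append(seq)
        block_nos.append(block_no)
    return values, seq_nos, block_nos

def distribute(merged, n_blocks):
    """Append each sequence number to the array of the block named by its tag."""
    out = [[] for _ in range(n_blocks)]
    for seq, block_no in zip(merged[1], merged[2]):
        out[block_no].append(seq)
    return out

local0 = make_local([16, 18, 16], 0)
local1 = make_local([20, 18, 24], 1)
merged = merge_local_data(local0, local1)
global_seq = distribute(merged, 2)
```

Because the merged entries arrive in value order, each block can store the numbers it receives sequentially and ends up with an array aligned with its own local item value array.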
  • Each arithmetic unit includes:
    • means for receiving the assigned record assigned to the arithmetic unit from an external device connected to the distributed memory multiprocessor;
    • means for generating the record sequence number array and the item value access information array from the received assigned record and storing them in the dedicated local memory of the arithmetic unit;
    • local compilation means for generating the local item value array and the local item value number array by sorting the item values in the assigned record for each item in the predetermined order, and storing them in the dedicated local memory of the arithmetic unit; and
    • means for transmitting, for each item, as the local data, the local item value array, a local item value sequence number array in which the sequence numbers assigned to the unique item values included in the local item value array are stored based on the predetermined order within the range of the assigned record, and a block number array including the block number indicating the assigned record related to the sequence numbers in the local item value sequence number array, to the arithmetic unit in the pipeline stage after this arithmetic unit.
  • The merging means of each arithmetic unit includes means for converting the at least two local data into one local data formed by a further block number array, a further local item value sequence number array, and a further local item value array, by merging, in the predetermined order, the item values in the at least two local item value arrays included in the at least two local data from the previous pipeline stage.
  • The distribution means includes means for associating a sequence number stored in the finally generated further local item value sequence number array with a block number stored in the corresponding finally generated further block number array, and transmitting to the a
  • By combining the above-described concepts of component decomposition, block numbers, and pipeline processing, a distributed memory multiprocessor is provided that sorts tabular data using the item values of a predetermined item as keys.
  • The sort processing of tabular data is a process of creating a sorted record sequence number array and a sorted item value access information array.
  • The sort processing consists of (i) an intra-block sort process, which executes a counting sort within each block, creates a new item value access information array, and creates the global item value sequence number array and record sequence number array used in later processing; (ii) an inter-block merge process, in which the global item value sequence number arrays and record sequence number arrays created in the previous process are merged, together with the block number arrays, by a tournament method; and (iii) an inter-block distribution process, which creates a new record sequence number array from the merged block number array.
  • Each arithmetic unit includes:
    • local sort means for generating, for a predetermined item and for each block including assigned records, a locally sorted record sequence number array, a locally sorted item value access information array, and a locally sorted global item value sequence number array, by applying a sort keyed on the local item value number to the record sequence number array, the item value access information array, and the global item value sequence number array; and
    • means for transmitting, for the predetermined item, the locally sorted global item value sequence number array, the locally sorted record sequence number array, and a block number array including the block number indicating the assigned records associated with the locally sorted record sequence number array.
  • The merging means of each arithmetic unit includes means for converting at least two local data into one local data formed by a further global item value sequence number array, a further record sequence number array, and a further block number array, by merging, in a predetermined order, the combinations of the locally sorted global item value sequence number array and the locally sorted record sequence number array included in the at least two local data from the previous pipeline stage.
  • The distribution means includes means for transmitting, for each block number included in the finally generated further block number array, its sequence number in that array to the arithmetic unit associated with the block number.
  • Each arithmetic unit further includes means for sequentially storing the transmitted sequence numbers in the record sequence number array.
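The three sort phases (intra-block counting sort, inter-block merge, inter-block distribution) can be traced with a compact Python sketch. It runs in one process and uses dictionaries with hypothetical field names (`record_seq`, `access`, `local_numbers`, `gseq`, `local_values`); these are assumptions for illustration and do not reflect the memory layout or message passing of the described multiprocessor.

```python
import heapq

def counting_sort_block(record_seq, access, local_numbers, n_values):
    """Stable counting sort of one block, keyed by local item value number."""
    counts = [0] * (n_values + 1)
    for a in access:
        counts[local_numbers[a] + 1] += 1
    for k in range(n_values):
        counts[k + 1] += counts[k]            # counts[k] = first output slot for key k
    new_seq = [None] * len(access)
    new_access = [None] * len(access)
    for seq, a in zip(record_seq, access):
        key = local_numbers[a]
        new_seq[counts[key]], new_access[counts[key]] = seq, a
        counts[key] += 1
    return new_seq, new_access

def sort_blocks(blocks):
    streams = []
    for blk, b in enumerate(blocks):
        seq, acc = counting_sort_block(b["record_seq"], b["access"],
                                       b["local_numbers"], len(b["gseq"]))
        b["access"] = acc                     # new item value access information array
        keys = [b["gseq"][b["local_numbers"][a]] for a in acc]
        streams.append(list(zip(keys, seq, [blk] * len(seq))))
    new_seq = [[] for _ in blocks]
    # Merge on (global sequence number, record sequence number); the block tag
    # routes each merged position back to its owner.
    for position, (_g, _s, blk) in enumerate(heapq.merge(*streams)):
        new_seq[blk].append(position)
    for b, seq in zip(blocks, new_seq):
        b["record_seq"] = seq                 # new record sequence number array
    return blocks

blocks = [
    {"local_values": [16, 18], "local_numbers": [1, 0, 1], "gseq": [0, 1],
     "record_seq": [0, 1, 2], "access": [0, 1, 2]},        # rows 18, 16, 18
    {"local_values": [16, 18, 24], "local_numbers": [0, 2, 1], "gseq": [0, 1, 2],
     "record_seq": [3, 4, 5], "access": [0, 1, 2]},        # rows 16, 24, 18
]
sort_blocks(blocks)
in_order = sorted(
    (s, b["local_values"][b["local_numbers"][a]])
    for b in blocks for s, a in zip(b["record_seq"], b["access"])
)
```

Only the order information changes; `local_values`, `local_numbers`, and `gseq` are the same before and after the sort.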
  • By combining the concept of component decomposition, the concept of block numbers, and the concept of pipeline processing, a distributed memory multiprocessor is provided that searches tabular data for records satisfying a predetermined search condition.
  • The tabular data search process is a process of creating order information, that is, a new record sequence number array and a new item value access information array, from the tabular data before the search.
  • The search processing consists of (i) local processing, which determines whether each item value stored in the local item value array of a block matches the search condition and extracts the record sequence numbers and item value access information corresponding to the matching item values; and (ii) processing in which the record sequence number arrays extracted in the blocks are merged, together with block number arrays, in a predetermined order, and a final new record sequence number array is obtained according to the merged block number array.
  • Each arithmetic unit includes:
    • local search means for generating, for a predetermined item and for each block including assigned records, a new record sequence number array and a new item value access information array corresponding to the records that include an item value matching the search condition, and for replacing the item value access information array with the new item value access information array; and
    • means for transmitting, for the predetermined item, as the local data, the new record sequence number array and a block number array including the block number indicating the assigned records associated with the new record sequence number array, to an arithmetic unit in a later pipeline stage.
  • The merging means of each arithmetic unit merges, in a predetermined order, the sets of the new record sequence number array and the block number array included in at least two local data from the previous pipeline stage.
  • Each arithmetic unit further includes means for sequentially storing the transmitted sequence numbers in the new record sequence number array.
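The two search phases can be sketched in Python as follows. The function names `local_search` and `renumber`, the predicate, and the sample data are assumptions for illustration. The local phase tests each unique item value only once and then filters the order information; the second phase merges the surviving record sequence numbers with their block tags and hands out new consecutive numbers.

```python
import heapq

def local_search(record_seq, access, local_numbers, local_values, predicate):
    """Keep the order information of records whose item value satisfies predicate."""
    hit = [predicate(v) for v in local_values]       # one test per unique value
    kept = [(s, a) for s, a in zip(record_seq, access) if hit[local_numbers[a]]]
    return [s for s, _ in kept], [a for _, a in kept]

def renumber(found_seqs):
    """Merge per-block sequence numbers and assign new consecutive numbers."""
    tagged = [[(s, blk) for s in seqs] for blk, seqs in enumerate(found_seqs)]
    out = [[] for _ in found_seqs]
    for new_seq, (_old, blk) in enumerate(heapq.merge(*tagged)):
        out[blk].append(new_seq)
    return out

# Block 0 holds rows 18, 16, 18; block 1 holds rows 16, 24, 18.
seq0, acc0 = local_search([0, 1, 2], [0, 1, 2], [1, 0, 1], [16, 18], lambda v: v >= 18)
seq1, acc1 = local_search([3, 4, 5], [0, 1, 2], [0, 2, 1], [16, 18, 24], lambda v: v >= 18)
new_seqs = renumber([seq0, seq1])
```

The new item value access information arrays (`acc0`, `acc1`) still index the unchanged local item value number arrays, so the item information does not need to be rewritten.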
  • By combining the concept of component decomposition, the concept of block numbers, and the concept of pipeline processing, a distributed memory multiprocessor is provided that tabulates (aggregates) tabular data.
  • In the tabulation, tabular data representing the tabulation result is newly generated from the source tabular data.
  • The tabular data of the tabulation result includes items related to the tabulation dimensions and items related to the tabulation targets (items before tabulation, items in the middle of the tabulation calculation, items of the tabulation result, and so on).
  • The tabulation processing consists of (i) a process of calculating the size of the tabular data of the tabulation result from the source tabular data; (ii) a process of dividing the tabular data of the tabulation result into blocks and assigning them to the plurality of arithmetic units; (iii) a process of specifying the set of item values of the one or more items relating to the tabulation dimensions and assigning it to each arithmetic unit; and (iv) a process of specifying the item values of the items relating to the tabulation targets and assigning them to the plurality of arithmetic units.
  • The plurality of arithmetic units are configured so that each dedicated local memory stores a first assigned record, relating to first tabular data used as the source of the tabulation, and a second assigned record, relating to second tabular data that represents the result of aggregating, for each group of item values of a predetermined set of items of the first tabular data, the item values of at least one other aggregation item.
  • Each arithmetic unit includes means for receiving, from the external device connected to the distributed memory multiprocessor, the range information of the second assigned record allotted to the arithmetic unit within the second tabular data and the number of unique sets of item values belonging to the predetermined set of items; and means for generating, based on the range information of the second assigned record, a record sequence number array and an item value access information array for the second assigned record assigned to the arithmetic unit, and storing them in the dedicated local memory of the arithmetic unit.
  • The merging means of each arithmetic unit generates a further local item value array by merging, in a predetermined order, the local item value arrays included in the at least two local data from the previous pipeline stage.
  • It also generates a further block number array and a further global item value sequence number array by merging, in a predetermined order, the block number arrays and global item value sequence number arrays contained in the at least two local data from the previous pipeline stage. The merging means thus includes means for converting the at least two local data into one local data formed by the further local item value array, the further block number array, and the further global item value sequence number array.
  • the distribution means corresponds to the item value stored in the finally generated further local item value array specified by the sequence number stored in the finally generated further global item value sequence number array.
  • Each arithmetic unit further includes means for sequentially storing the transmitted item value in the local item value array of the second assigned record of the arithmetic unit.
  • Each arithmetic unit includes means for generating a dimension value number array, which includes the dimension value numbers identifying the sets of item values of the predetermined set of items included in the first assigned record, and a local aggregate value array, which includes the aggregate values of the item values of the at least one aggregation item corresponding to each dimension value number, and for transmitting them as the local data to the arithmetic unit in the pipeline stage after this arithmetic unit.
  • The merging means of each arithmetic unit includes means for converting the at least two local data into one local data formed by a further dimension value number array and a further local aggregate value array, by merging, in a predetermined order, the dimension value number arrays and the local aggregate value arrays included in at least two local data from the previous pipeline stage.
  • The distribution means further includes means for transmitting the aggregate values stored in the finally generated further local aggregate value array to the arithmetic units according to the range information of the second assigned record.
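The aggregation flow (local aggregation per dimension value number, ordered merge of partial aggregates, distribution by range) can be sketched in Python. Several things here are assumptions made for illustration: the aggregate is a sum, every dimension value number from 0 upward occurs at least once so that list positions match dimension value numbers, and the range information is given as half-open `(start, end)` pairs. The function names are hypothetical.

```python
def local_aggregate(dim_numbers, measures):
    """Sum measures per dimension value number within one block."""
    totals = {}
    for d, m in zip(dim_numbers, measures):
        totals[d] = totals.get(d, 0) + m
    dims = sorted(totals)
    return dims, [totals[d] for d in dims]

def merge_aggregates(a, b):
    """Merge two (dims, sums) pairs in dimension order, adding sums of equal keys."""
    dims, sums = [], []
    i = j = 0
    while i < len(a[0]) or j < len(b[0]):
        take_a = j == len(b[0]) or (i < len(a[0]) and a[0][i] <= b[0][j])
        if take_a:
            d, s = a[0][i], a[1][i]
            i += 1
        else:
            d, s = b[0][j], b[1][j]
            j += 1
        if dims and dims[-1] == d:
            sums[-1] += s                     # same dimension value seen before
        else:
            dims.append(d)
            sums.append(s)
    return dims, sums

def distribute_by_range(sums, ranges):
    """Give each unit the slice of results named by its (start, end) range."""
    return [sums[lo:hi] for lo, hi in ranges]

part0 = local_aggregate([0, 1, 0], [10, 20, 30])
part1 = local_aggregate([1, 2, 2], [5, 7, 8])
dims, sums = merge_aggregates(part0, part1)
shares = distribute_by_range(sums, [(0, 2), (2, 3)])
```

Partial sums for the same dimension value number meet during the merge, so no unit ever needs the raw records of another unit.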
  • Each arithmetic unit includes local compilation means for generating, for each of the at least one total item, a local item value array and a local item value number array relating to the total item by sorting the transmitted item values in a predetermined order, and storing them in the dedicated local memory of this arithmetic unit.
  • Each arithmetic unit also includes means for transmitting, for each of the at least one total item, as the local data, the local item value array, a local item value sequence number array storing the sequence numbers assigned to the unique item values included in the local item value array based on a predetermined order within the range of the second assigned record, and a block number array including the block number indicating the assigned record related to the sequence numbers in the local item value sequence number array.
  • The merging means of each arithmetic unit further comprises means for converting the at least two local data into one local data formed by the further block number array, the further local item value sequence number array, and the further local item value array, by merging, in the predetermined order, the item values in the at least two local item value arrays included in the at least two local data from the previous pipeline stage.
  • The distribution means further comprises means for associating a sequence number stored in the finally generated further local item value sequence number array with a block number stored in the corresponding finally generated further block number array, and transmitting it to the arithmetic unit.
  • Each arithmetic unit further includes means for sequentially storing the transmitted sequence numbers in a global item value sequence number array, relating to the at least one total item, provided in the dedicated local memory of the arithmetic unit.
  • A tabular data manipulation method is provided for a distributed memory multiprocessor comprising a plurality of arithmetic units, each including a dedicated local memory and a processor and connected so as to communicate with one another to transmit and receive data, wherein a series of data is divided into a plurality of blocks associated with the plurality of arithmetic units and held in the dedicated local memory of each arithmetic unit, and the series of data is manipulated by pipeline processing of the plurality of arithmetic units. In this method:
    • each arithmetic unit receives local data from one or more arithmetic units in the previous pipeline stage, converts at least two received local data into one further local data, and transmits the further local data to one arithmetic unit in a later pipeline stage, the arithmetic units operating dynamically in a tournament manner so that one global data is finally generated; and
    • at least one of the arithmetic units divides the global data and assigns it to the plurality of arithmetic units based on block numbers corresponding to the plurality of blocks.
  • A computer-readable program is provided for a computer comprising a plurality of arithmetic units, each including a dedicated local memory and a processor and connected so as to communicate with one another to send and receive data, wherein a series of data is divided into a plurality of blocks associated with the plurality of arithmetic units and held in the dedicated local memory of each arithmetic unit, and the series of data is manipulated by pipeline processing of the plurality of arithmetic units. The program causes the computer to execute:
    • code by which each arithmetic unit receives local data from one or more arithmetic units in the previous pipeline stage, converts at least two received local data into one further local data, and transmits the further local data to one arithmetic unit in a later pipeline stage; and
    • code by which at least one of the arithmetic units divides the global data and assigns it to the plurality of arithmetic units based on block numbers corresponding to the plurality of blocks.
  • A storage medium is provided on which is recorded a computer program for causing a computer to execute the tabular data manipulation method, the computer comprising a plurality of arithmetic units, each including a dedicated local memory and a processor and connected so as to communicate with one another to send and receive data, wherein a series of data is divided into a plurality of blocks associated with the plurality of arithmetic units, held in the dedicated local memory of each arithmetic unit, and manipulated by the pipeline processing of the plurality of arithmetic units.
  • the calculation amount of each arithmetic unit of the distributed memory type multiprocessor, the data amount held in each arithmetic unit, and the data communication amount between the arithmetic units are reduced as much as possible. Therefore, it is possible to realize a distributed memory type multiprocessor capable of efficiently operating large-scale data.
  • FIG. 1 is a schematic diagram of a distributed memory processor according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a computer system according to an embodiment of the present invention. FIG. 3 shows an example of tabular data for explaining the data management mechanism on which an embodiment of the present invention is based. FIG. 4 is an explanatory diagram of that basic data management mechanism. Further drawings are explanatory diagrams of the data structure for the distributed memory multiprocessor.
  • FIG. 5 is a schematic flowchart of a tabular data sorting method according to an embodiment of the present invention. Three further drawings are explanatory diagrams of the in-block sort process in the sorting of tabular data according to an embodiment of the present invention, and one further drawing is an explanatory diagram of the result of that in-block sort process.
  • FIG. 1 is a schematic diagram of an embodiment of a distributed memory multiprocessor.
  • the distributed memory multiprocessor 100 includes a plurality of arithmetic units 110, 120, 130, and 140 (for example, two, four, eight, etc., four in this example).
  • Each arithmetic unit 110, 120, 130, 140 includes a processor 111, 121, 131, 141 for data processing and a dedicated local memory 112, 122, 132, 142 connected directly to the processor.
  • the arithmetic units 110, 120, 130, and 140 are connected by an interconnect 150 that enables high-speed data communication between the arithmetic units.
  • the arithmetic units 110, 120, 130, and 140 may include a cache memory.
  • the arithmetic unit may be configured to allow access (remote access) to the local memory of another arithmetic unit. In that case, the arithmetic unit may further include a cache memory for remote access.
  • the local memory of each arithmetic unit may be configured to appear as one logically continuous memory as a whole.
  • the processor in the arithmetic unit may be a single core processor or a multi-core processor including two or more processor cores.
  • the local memory in the arithmetic unit may be used as a shared memory between the processor cores.
  • Each arithmetic unit can be configured such that a series of data such as tabular data is operated by pipeline processing of a plurality of arithmetic units.
  • each arithmetic unit includes merging means that receives local data from one or more (e.g., two) arithmetic units in the preceding pipeline stage, converts at least two (e.g., two) pieces of received local data into one further piece of local data, and sends the further local data to one arithmetic unit in a subsequent pipeline stage.
  • the merging means of these plural arithmetic units can be dynamically connected by a tournament method so as to finally generate one global data.
  • “dynamic” means that the number of stages of the tournament table, the arrangement of each arithmetic unit in the tournament table, and the like are variable according to the objects to be merged. Furthermore, at least one of the arithmetic units includes distribution means that divides the data generated by the merging and allocates it to the plurality of arithmetic units based on the block number assigned to each block when the series of data was divided into a plurality of blocks, that is, the block number of the arithmetic unit to which each block is assigned.
  • each block obtained from a series of data is associated with each arithmetic unit, that is, a single core processor in the arithmetic unit.
  • each block obtained from a series of data is associated with each processor core in the arithmetic unit. Therefore, a plurality of blocks are associated with one arithmetic unit, and various processes related to the blocks are executed in units of processor cores in the arithmetic unit.
  • the processor in the arithmetic unit may be either a single core processor or a multi-core processor.
  • the embodiments of the invention are described below in units of arithmetic units, as if the processor of each arithmetic unit were a single-core processor. However, even when the processor in an arithmetic unit is a multi-core processor, the following description applies in the same way if each processor core is regarded as corresponding to the processor of one arithmetic unit.
  • FIG. 2 is a schematic diagram of a computer system 200 for manipulating a large series of data according to an embodiment of the present invention.
  • the computer system 200 includes a distributed memory type multiprocessor 202 as shown in FIG. 1, which operates a series of data in a shared manner by a plurality of arithmetic units.
  • the computer system 200 further includes a CPU 210 that controls the entire system and its individual components by executing a program, a RAM (Random Access Memory) 212 that stores work data and the like, a ROM (Read Only Memory) 214 that stores programs and the like, a storage device 216 such as a hard disk, a CD-ROM driver 220 for accessing a CD-ROM 218, an interface (I/F) 222 to an external terminal connected to an external network or the like (not shown), an input device 224 such as a keyboard and a mouse, and a display device 226 such as a computer monitor.
  • the CPU 210, the RAM 212, the ROM 214, the storage device 216, the I/F 222, the input device 224, and the display device 226 are connected to each other via a bus 228.
  • a program that causes the distributed memory multiprocessor 202 of the computer system 200 to execute operations on a series of data may be stored in the CD-ROM 218 and read by the CD-ROM driver 220, or may be stored in advance in the ROM 214. Further, a program once read from the CD-ROM 218 may be stored in a predetermined area of the storage device 216. Alternatively, the program may be supplied from the outside via a network (not shown), an external terminal, and the I/F 222.
  • the distributed memory type computer is realized by causing the computer system 200 to execute a program for operating a series of data.
  • in this embodiment, a CPU 210 is provided in addition to the distributed memory multiprocessor 202 to control the entire system and its individual components.
  • However, the present invention is not limited to such an embodiment.
  • an arithmetic unit included in the distributed memory multiprocessor 202 may instead control the entire system and its individual components.
  • FIG. 3 is a diagram showing an example of tabular data for explaining the data management mechanism which is the basis of the present invention.
  • This tabular data is stored as a data structure as shown in FIG. 4 in the computer by using the data management mechanism proposed in the above-mentioned International Publication No. WO00 / 10103.
  • This data structure has been proposed to realize retrieval, sorting, aggregation, and the like of large-scale tabular data using the hardware resources, in particular the processor and memory, of commercially available computers such as personal computers. Note that it is the data structure of the tabular data as placed in the computer's memory.
  • the source record position number is virtual information used for specifying individual records including item values corresponding to data items. For example, when normal tabular data is converted into tabular data based on an information block, the position where a record is accommodated in the original normal tabular data is represented by a source record position number. In general, in tabular data based on information blocks, records are not always arranged in the order of the original record position numbers.
  • the sort order of the tabular data records after sorting is different from the sort order of the original tabular data records.
  • records in tabular data based on information blocks immediately after conversion from normal tabular data may be arranged in the order of the source record position numbers. In this case, the source record position number and the record sequence number initially match.
  • the correspondence between the order number (record order number) of each record of the tabular data and the original record position number is held in a record order designation array 401 (hereinafter this array is abbreviated as “OrdSet”).
  • the record order specification array 401 stores the original record position numbers in the order of the record order numbers. In the example of FIG. 4, the records are arranged in the order of the original record position numbers.
  • the array A can be expressed as A[i], where the subscript i is an index. In the drawings, an array is shown as a region surrounded by a solid line, and the boundary between element A[i] and element A[i+1] is indicated by a dotted line.
  • the subscript i of element A [i] is shown on the left side of element A [i]. Further, the subscript i of the array is represented by an integer starting from 0.
  • the actual gender value for the record whose source record position number is “0”, i.e., “male” or “female”, is obtained by following the item value number array 402 (hereinafter the item value number array, that is, the pointer array, is abbreviated as “VNo”).
  • The item value number array 402 is a pointer array into the item value array 403 (hereinafter the item value array, that is, the value list, is abbreviated as “VL”), a value list in which the actual values are sorted in a predetermined order (e.g., ascending or descending order).
  • the pointer array 402 stores pointers that point to elements in the actual value list 403 according to the order of the source record position numbers stored in the array OrdSet 401.
  • Item values can be obtained in the same way for the other records, and likewise for age and height.
  • the tabular data is expressed by a combination of the value list VL and the pointer array VNo to the value list, and this combination is particularly referred to as an “information block”.
  • information blocks regarding gender, age, and height are shown as information blocks 408, 409, and 410, respectively.
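  • The information block described above can be sketched as follows. This is a minimal illustration, not the implementation of the cited publication: the function names `build_information_block` and `item_value` and the sample column are hypothetical, and ascending order is assumed as the predetermined order of the value list.

```python
def build_information_block(column):
    """Split one column into a sorted value list VL and a pointer array VNo."""
    vl = sorted(set(column))                      # unique values in ascending order
    index_of = {value: i for i, value in enumerate(vl)}
    vno = [index_of[value] for value in column]   # one pointer per original record
    return vl, vno


def item_value(ord_set, vno, vl, record_order_number):
    """Follow OrdSet -> VNo -> VL to recover the actual item value."""
    position = ord_set[record_order_number]       # original record position number
    return vl[vno[position]]


# Hypothetical column (not the data of FIG. 3).
gender = ["male", "female", "female", "male"]
VL, VNo = build_information_block(gender)
OrdSet = [0, 1, 2, 3]          # records still in their original order
first_value = item_value(OrdSet, VNo, VL, 0)
```

  • Reordering the records only rewrites OrdSet; VL and VNo stay untouched, which is why the value list can be kept sorted independently of the record order.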
  • if a single computer has a single memory (physically there may be several memories, but they form a single memory in the sense that they are located and accessed in a single address space), the ordered set array OrdSet and the value list VL and pointer array VNo constituting each information block may be stored in that memory.
  • the manipulation of tabular data is performed by a distributed memory multiprocessor composed of multiple arithmetic units with local memory. Therefore, a new mechanism for holding tabular data has been proposed in order to achieve efficient parallel processing.
  • FIG. 5A shows an example of tabular data.
  • the tabular data 500 shown in FIG. 5A is represented as an array of records including item values corresponding to the data item 501 “School” (for example, “West”, “South”, “North”, “East”) and item values corresponding to the data item 502 “Age” (for example, “12”, “8”, “11”, “10”).
  • Record 510 at the head of the array is the record to which record sequence number 0 is assigned; its item value for the data item “School” is “West” and its item value for the data item “Age” is “12”.
  • For record 511, the item value of the data item “School” is “North” and the item value of the data item “Age” is “9”.
  • Note that the record order number given to each record may change.
  • the records of this tabular data are divided into blocks 520, 521, ..., 527 identified by block numbers (in this example, eight block numbers from 0 to 7). Initially, each block is associated with the arithmetic unit of the distributed memory multiprocessor responsible for processing the records included in that block, more specifically with the processor of that arithmetic unit.
  • the data structure for the distributed memory multiprocessor includes information (order information) relating to the order in which records are arranged (that is, record order numbers) and to the storage locations of item values in the data structure, and information (item information) relating to the item values of each data item.
  • the order information functionally corresponds to the record order designation array OrdSet in the data management mechanism that is the basis of the present invention, and the item information similarly corresponds to the information block. Both the order information and the item information are held in the global memory, and a part of them is transferred to the local memory of each arithmetic unit as necessary.
  • FIG. 5B shows the order information 530
  • FIGS. 5C and 5D show the item information 531 and 532 of the data item “School” and the data item “Age”, respectively.
  • an arithmetic unit responsible for the operation of the record is determined for each record. Therefore, the record (s) is divided into records that each arithmetic unit is responsible for, that is, the records in charge, and a block number is assigned to each record in charge. If the block and the arithmetic unit correspond one-to-one as in the present embodiment, each arithmetic unit does not need to store the block number.
  • a block number array BlkNo [i] indicates that the block number of the block to which the record having the record order number i belongs is BlkNo [i]. For example, in the example of FIG. 5A, records with record sequence numbers 0 to 3 are included in the block with block number 0, records with record sequence numbers 4 to 7 are included in the block with block number 1, and so on.
  • the order information 530 includes record order number arrays 551-0, 551-1, 551-2, ..., 551-7 in which the record order numbers of the records in charge are stored in the order of the record order numbers for each block.
  • the record order number array may be referred to as GOrd below.
  • the record sequence numbers of the assigned records belonging to the block of block number 0 are 0, 1, 2, and 3
  • the record sequence numbers of the assigned records belonging to the block of block number 1 are 4, 5, 6, 7 and so on.
  • the record sequence number array is an integer type array having the same size as the number of records in charge belonging to each block and storing the record sequence numbers in ascending order.
  • the record sequence number array is divided into sizes that can be accommodated in the local memory of each arithmetic unit, and stored in the local memory in each arithmetic unit.
  • the block number array and the record sequence number arrays 551-0, 551-1, 551-2, ..., 551-7 can be converted to each other.
  • the block number of the block to which the record having the record sequence number i belongs is expressed by BlkNo [i]
  • the conversion from the block number array to the record sequence number array is as follows.
  • J [BlkNo [i]] of the array J represents a subscript that designates an element of the record sequence number array GOrd [BlkNo [i]] belonging to the block whose block number is BlkNo [i]. Note that all elements of the array J are initialized to 0.
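  • The conversion between the two representations can be sketched as follows, using the counter array J exactly as described above. The listing is a reconstruction under stated assumptions rather than the listing of the original text: the function names are hypothetical, and the example is shortened to two blocks instead of the eight blocks of FIG. 5A.

```python
def block_numbers_to_record_order_arrays(blk_no, block_count):
    """Build GOrd[b] for every block b from the block number array BlkNo."""
    g_ord = [[0] * blk_no.count(b) for b in range(block_count)]
    j = [0] * block_count              # all elements of J are initialized to 0
    for i, b in enumerate(blk_no):
        g_ord[b][j[b]] = i             # GOrd[BlkNo[i]][J[BlkNo[i]]] = i
        j[b] += 1                      # next free slot of that block
    return g_ord


def record_order_arrays_to_block_numbers(g_ord):
    """Inverse conversion: rebuild BlkNo from the per-block GOrd arrays."""
    blk_no = [0] * sum(len(orders) for orders in g_ord)
    for b, orders in enumerate(g_ord):
        for i in orders:
            blk_no[i] = b
    return blk_no


BlkNo = [0, 0, 0, 0, 1, 1, 1, 1]       # records 0 to 3 in block 0, 4 to 7 in block 1
GOrd = block_numbers_to_record_order_arrays(BlkNo, 2)
```

  • Because the records are visited in ascending order of i, every GOrd[b] comes out in ascending order without an explicit sort.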
  • the order information 530 includes the item value access information arrays 552-0, 552-1, 552-2, ..., 552-7 in which the item value access information of the assigned records is stored in the order of the record order numbers for each block.
  • This item value access information array is an integer type array, and the size of the item value access information array matches the number of records in charge.
  • the item value access information array may be called by the name LOrd. For example, in the example of FIG.
  • the item value included in the record whose record sequence number is 0 included in the block of block number 0 can be accessed by the item value access information of 0 with respect to this block number 0.
  • the item value included in the record having the record sequence number 5 included in the block 1 can be accessed by the item value access information 1 regarding the block number 1.
  • item information is held as item information for each data item.
  • the item information 531 regarding the data item “School” and the item information 532 regarding the data item “Age” are divided and stored in the local memory of each arithmetic unit.
  • the item values included in the record in charge for each block are held in the local memory of each arithmetic unit so that each arithmetic unit can access each data item using the item value access information array.
  • the item value itself is constructed on the local memory as a local item value array LVL in which unique item values are stored in a predetermined order (ascending order or descending order) for each data item. For example, in the examples of FIGS.
  • the item values relating to the data item “School” are held in the local memory of each arithmetic unit as local item value arrays 562-0, 562-1,.
  • Item values relating to the data item “Age” are held in the local memory of each arithmetic unit as local item value arrays 582-0, 582-1,. Since the local item value array is an array for storing the item value itself, it takes various data types such as an integer type, a floating point type, and a character string type.
  • Item information is configured so that item values stored in the global item value array can be specified using item value access information related to the record in charge. Therefore, the item information includes, for each data item, a local item value number array in which local item value numbers specifying the item values included in the assigned records are stored in the order of the record order numbers (for example, ascending or descending order), and a global item value sequence number array that specifies the position at which the item value represented by each local item value number is stored in the array of unique item values held over the entire tabular data (hereinafter sometimes referred to as the global item value array). Note that the global item value array does not need to be actually constructed in memory.
  • the local item value number array and the global item value sequence number array are provided for each block and are stored in the local memory of each arithmetic unit.
  • the local item value number array is an integer type array having a size that matches the number of records of the assigned record, and is sometimes called by the name VNo.
  • the global item value sequence number array is an integer type array having the same size as the number of unique item values included in the record in charge, and is sometimes referred to as GVOrd.
  • the item information 531 includes local item value number arrays 561-0, 561-1, ..., 561-7, local item value arrays 562-0, 562-1, ..., 562-7, and global item value sequence number arrays 563-0, 563-1, and so on.
  • the local item value number array, the local item value array, and the global item value sequence number array are all divided into blocks.
  • the value of the head element of the local item value number array VNo of Block-0 is “1”. This means that the item value number of the item value included in the record specified by the item value access information whose value is “0” is “1”.
  • the item value whose item value number is “1” is found to be “West” by referring to the second element of the local item value array LVL, that is, LVL[1]. Furthermore, by referring to the second element of the global item value sequence number array, that is, GVOrd[1], it can be seen that the sequence number of this item value within the entire tabular data, that is, its position in the virtual global item value array, is “3”. The same applies to other blocks and other data items.
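  • The chain VNo, LVL, GVOrd can be illustrated with a small sketch. The block contents below are hypothetical; only VNo[0] == 1, LVL[1] == "West" and GVOrd[1] == 3 for Block-0 are taken from the text, and the helper name `resolve` is an assumption.

```python
# Hypothetical contents of two blocks for the data item "School".
# Assumed global order of unique values: East(0), North(1), South(2), West(3).
block0 = {"VNo": [1, 0, 1, 0], "LVL": ["North", "West"], "GVOrd": [1, 3]}
block1 = {"VNo": [0, 1, 1, 0], "LVL": ["East", "West"], "GVOrd": [0, 3]}


def resolve(block, access_info):
    """Return (item value, global sequence number) for one assigned record."""
    local_no = block["VNo"][access_info]          # local item value number
    return block["LVL"][local_no], block["GVOrd"][local_no]


value0, seq0 = resolve(block0, 0)   # record with access information 0 in Block-0
value1, seq1 = resolve(block1, 1)   # record with access information 1 in Block-1
same_value = seq0 == seq1           # compared without touching the strings
```

  • The point of GVOrd in this sketch is that values held in different local memories can be compared through their global sequence numbers alone, so the global item value array never has to be materialized.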
  • in this way, the item values included in the records belonging to each block are represented by the local item value numbers assigned to the item values within the block and by a local item value array associated with those local item value numbers.
  • FIG. 6 is a flowchart of a method for constructing a data structure for a distributed memory multiprocessor according to an embodiment of the present invention on a local memory of each arithmetic unit.
  • a plurality of arithmetic units of the distributed memory type multiprocessor 202 operate in parallel to construct a tabular data structure for the distributed memory type multiprocessor in the local memory of the arithmetic unit.
  • the receiving units of the arithmetic units operate in parallel to receive, from an external device such as the storage device 216 or a device on a network connected via the I/F 222, the records of the tabular data assigned to each arithmetic unit, and store them in the dedicated local memory (step 602).
  • the sequence information creation unit of the arithmetic unit operates in parallel to generate a record sequence number array and an item value access information array from the received assigned records, and stores them in a dedicated local memory (step 604).
  • the local compiling unit of the arithmetic unit operates in parallel, and for each item, sorts the item values in the assigned record in a predetermined order (for example, ascending or descending order), whereby a local item value array, Then, a local item value number array is generated and stored in a local memory dedicated to the arithmetic unit (step 606).
  • the transmission units of the arithmetic units operate in parallel to generate, for each item, local data that includes the local item value array and the order of the unique item values in the local item value array, determined according to the predetermined order within the range of the assigned records, and transmit the local data to the arithmetic unit in the subsequent pipeline stage (step 608).
  • the merge unit of the arithmetic unit operates in parallel to merge the item values in the two local item value arrays included in the two local data from the previous pipeline stage in a predetermined order.
  • the distribution unit of one of the arithmetic units transmits each sequence number stored in the finally generated further local item value sequence number array to the arithmetic unit associated with the block number stored in the corresponding position of the finally generated further block number array (step 612).
  • the sequence number storage unit of the arithmetic unit operates in parallel and sequentially stores the transmitted sequence numbers in the global item value sequence number array secured in the dedicated local memory (step 614).
  • FIG. 7 is a flowchart of an item value acquisition method according to an embodiment of the present invention.
  • the item value is stored in the local memory of each arithmetic unit in the form of item information for each data item. Therefore, for example, the distributed memory multiprocessor can easily acquire the item value included in the designated record.
  • the tabular data is divided into records in charge of the arithmetic units, and is held independently for each arithmetic unit.
  • each arithmetic unit can acquire the item value held in the local memory of this arithmetic unit completely independently of the other arithmetic units.
  • a situation is also considered in which item values included in a large number of records are acquired simultaneously by a large number of arithmetic units operating simultaneously. Even in such a situation, it will be understood that the basic operation of acquiring the item value is a process in which a specific arithmetic unit acquires an item value included in a certain record in the assigned record.
  • each arithmetic unit determines whether or not the designated record sequence number exists in the record sequence number array held in its local memory (step 702). If the specified record sequence number does not exist (No in step 702), the process ends. If the specified record sequence number exists, the arithmetic unit identifies the position at which it is stored in the record sequence number array and reads the item value access information stored at the corresponding position in the item value access information array (step 704). Thereafter, the arithmetic unit reads, for each item, the local item value number in the local item value number array designated by the read item value access information (step 706).
  • the arithmetic unit specifies an item value in the local item value array designated by the read local item value number for each item (step 708). Finally, the arithmetic unit determines whether there is an item whose item value has not yet been specified (step 710). If there is an item for which the item value has not yet been specified (Yes in Step 710), the process returns to Step 706 to continue the processing for the next item. If all item values have been identified (No at step 710), the process ends.
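  • Steps 702 to 710 can be sketched for a single arithmetic unit as follows. This is a minimal sketch under stated assumptions: the dictionary layout of `unit`, the function name `get_item_values`, and the sample values are hypothetical, and the binary search relies on the record sequence number array being stored in ascending order.

```python
from bisect import bisect_left


def get_item_values(unit, record_order_number):
    """Return the item values of one record, or None if it is not assigned here."""
    g_ord = unit["GOrd"]
    pos = bisect_left(g_ord, record_order_number)      # GOrd is ascending
    if pos == len(g_ord) or g_ord[pos] != record_order_number:
        return None                                    # step 702: not an assigned record
    access = unit["LOrd"][pos]                         # step 704
    values = {}
    for name, info in unit["items"].items():           # loop closed by step 710
        local_no = info["VNo"][access]                 # step 706
        values[name] = info["LVL"][local_no]           # step 708
    return values


# Hypothetical local data of the arithmetic unit in charge of block 1.
unit1 = {
    "GOrd": [4, 5, 6, 7],
    "LOrd": [0, 1, 2, 3],
    "items": {
        "School": {"VNo": [0, 1, 1, 0], "LVL": ["East", "West"]},
        "Age": {"VNo": [1, 0, 2, 1], "LVL": [8, 10, 11]},
    },
}
record5 = get_item_values(unit1, 5)
```

  • Every lookup touches only the unit's own local memory, which matches the statement that each arithmetic unit can acquire its item values independently of the others.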
  • An example of data acquisition according to the present embodiment will be described in more detail using the data structure shown in FIGS. 5A to 5D.
  • Consider obtaining the item values of the record indicated by reference numeral 511 in FIGS. 5A to 5D.
  • FIG. 8 is a schematic flowchart of the compiling process according to an embodiment of the present invention.
  • Assigned record acquisition: First, the assigned records of the tabular data handled by each arithmetic unit are taken into that arithmetic unit (step 802).
  • a record in charge is fetched for each processor, and block unit processing is executed for each processor.
  • Order information creation: Next, order information including a record order number array and an item value access information array is created on the local memory of each arithmetic unit (step 804). As described above, the record sequence number array and the item value access information array are created in parallel by a plurality of arithmetic units.
  • In-block compilation: Next, the arithmetic units operate in parallel and, for each data item, create a local item value number array in which local item value numbers are stored in the order of the original record position numbers of the assigned records contained in a single block. At the same time, the arithmetic units create a local item value array in which the unique values among the item values included in the assigned records are stored in a predetermined order (for example, ascending or descending order) (step 806).
  • Inter-block compilation 1 (merge): Next, the arithmetic units operate in parallel and hierarchically and, for each data item, execute a merge process on at least two (in this embodiment, two) blocks. The input for each block is a set consisting of a block number array, a local item value array, and a global item value sequence number array, the last of which stores, in the order of the local item value numbers, pointers specifying the positions at which the item values are stored in the virtual global item value array that holds the unique item values of the entire tabular data in a predetermined order. From a pair of such sets, the merge process creates one set of a block number array, a local item value array, and a global item value sequence number array for the block obtained by merging the two blocks. The arithmetic units repeat this merge process until everything is finally merged into one block, so that the final block number array, the final local item value array, and the final global item value sequence number array are created (step 808).
  • Inter-block compilation 2 (distribution): Finally, at least one arithmetic unit, for each data item, sequentially distributes the elements of the final global item value sequence number array to the arithmetic units in charge of the blocks whose block numbers are given by the corresponding elements of the final block number array, so that a global item value sequence number array relating to the assigned records of each arithmetic unit is created on the local memory of each arithmetic unit (step 810).
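  • The merge and distribution steps (steps 808 and 810) can be sketched as below. The exact array layout is not given in this section, so the sketch makes several assumptions: local data is modelled as three parallel arrays (block numbers, item values, sequence numbers), a value appears once for every block that contains it, sequence numbers are recomputed at each merge, and all function names and sample values are hypothetical.

```python
def make_local_data(block_no, lvl):
    """Initial local data of one block, built from its sorted unique values."""
    return [block_no] * len(lvl), list(lvl), list(range(len(lvl)))


def merge_local_data(left, right):
    """Merge two local data sets; equal values share one sequence number."""
    lb, lv, _ = left
    rb, rv, _ = right
    blocks, values, seq = [], [], []
    i = j = 0
    current = -1
    while i < len(lv) or j < len(rv):
        if j == len(rv) or (i < len(lv) and lv[i] <= rv[j]):
            b, v = lb[i], lv[i]
            i += 1
        else:
            b, v = rb[j], rv[j]
            j += 1
        if not values or values[-1] != v:
            current += 1               # a new unique value starts here
        blocks.append(b)
        values.append(v)
        seq.append(current)
    return blocks, values, seq


def distribute(final, block_count):
    """Send each sequence number to the block named in the block number array."""
    blocks, _, seq = final
    gvord = [[] for _ in range(block_count)]
    for b, s in zip(blocks, seq):
        gvord[b].append(s)             # arrives in local item value number order
    return gvord


lvls = [["North", "West"], ["East", "West"], ["South"], ["East", "North"]]
local = [make_local_data(b, lvl) for b, lvl in enumerate(lvls)]
final = merge_local_data(merge_local_data(local[0], local[1]),
                         merge_local_data(local[2], local[3]))
GVOrd = distribute(final, len(lvls))
```

  • Since each block's values are already sorted and the merge preserves their relative order, appending the distributed numbers one after another is enough to line them up with the block's local item value array.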
  • FIGS. 9A and 9B are explanatory diagrams of order information creation processing according to an embodiment of the present invention.
  • the data shown in FIGS. 9A and 9B is the same as the data shown in FIGS. 5A and 5B, and the order information 530 in FIG. 9B is created from the tabular data 500 in FIG. 9A.
  • the order information creation process is as described above.
  • the row number of the original tabular data is set as it is in the record order number array GOrd, and a sequential number starting from 0 is set in the item value access information array LOrd.
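  • For one block, this order information creation can be sketched as follows. The function name `create_order_information` and its parameters are hypothetical; the sketch assumes that a block holds a contiguous range of rows of the original tabular data, as in the example of FIG. 5A.

```python
def create_order_information(first_row, record_count):
    """Create GOrd and LOrd for a block holding rows first_row .. first_row + record_count - 1."""
    g_ord = list(range(first_row, first_row + record_count))  # original row numbers
    l_ord = list(range(record_count))                         # sequential numbers from 0
    return g_ord, l_ord


# Block 1 of the example: record order numbers 4 to 7.
GOrd1, LOrd1 = create_order_information(4, 4)
```

  • With these arrays, the record with record order number 5 sits at position 1 of GOrd1, and LOrd1[1] == 1 is the item value access information used to reach its item values.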
  • the order information creation process is executed in parallel by a plurality of arithmetic units.
  • FIGS. 10A to 10C are schematic views of the in-block compilation processing according to the embodiment of the present invention.
  • the item information includes a local item value number array VNo and a local item value array LVL.
  • the intra-block compilation process is executed in parallel by each arithmetic unit for each block of Block-0, Block-1,..., Block-7.
  • FIG. 11 is a schematic diagram of the intra-block compilation process according to an embodiment of the present invention.
  • the local item value array LVL is a list of values in which unique item values extracted from the item values included in the item value array School are stored in a predetermined order (in this example, ascending alphabetical order).
  • the following processing is executed by the arithmetic unit using the local memory of the arithmetic unit.
  • FIGS. 12A to 12E are explanatory diagrams of the in-block compilation processing according to the embodiment of the present invention.
  • An item value array as shown in FIG. 12A is created in the local memory of the arithmetic unit.
  • the elements of the item value array A are sorted in ascending order of the item values using the item values as keys, and the elements of the position array B are rearranged at the same time.
  • the item value array A and the position array B as shown in FIG. 12C are obtained.
  • an item value number array C is generated by assigning item value numbers to unique item values in order from 0.
  • the item value can be accessed without duplication using the item value number.
  • a local item value array LVL is generated as shown in FIG. 12D.
  • the local item value number array VNo and the local item value array LVL are generated in the local memory of the arithmetic unit.
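  • The in-block compilation steps can be sketched as follows, using the array names A, B and C from the description. The function name `compile_block`, the final scatter step that writes the item value numbers back through the position array, and the sample values are assumptions made for illustration.

```python
def compile_block(item_values):
    """Return (VNo, LVL) for the item values of the assigned records of one block."""
    # Item value array A paired with position array B, sorted by item value.
    pairs = sorted(zip(item_values, range(len(item_values))))
    a = [value for value, _ in pairs]
    b = [position for _, position in pairs]

    c = []      # item value number array C, aligned with the sorted array A
    lvl = []    # local item value array: unique values in ascending order
    for value in a:
        if not lvl or lvl[-1] != value:
            lvl.append(value)          # first occurrence of a new unique value
        c.append(len(lvl) - 1)         # numbers are assigned from 0

    vno = [0] * len(item_values)
    for number, position in zip(c, b):
        vno[position] = number         # put each number back at its record position
    return vno, lvl


# Hypothetical item values of one block for the data item "School".
VNo, LVL = compile_block(["West", "North", "West", "North"])
```

  • Each arithmetic unit can run this on its own block without communication, since only the item values of its assigned records are involved.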
  • the inter-block compilation process consists of a merge process and a distribution process. In the merge process, the arithmetic units operate in parallel and hierarchically and, for each data item, repeatedly merge pairs of local data until they are finally merged into one piece of local data, thereby generating the final block number array, the final global item value sequence number array, and the final local item value array. In the distribution process, the global item value sequence numbers in the final global item value sequence number array generated by the merge process are distributed to the arithmetic units corresponding to the block numbers stored in the corresponding positions of the block number array. Each arithmetic unit stores the distributed global item value sequence numbers in a global item value sequence number array secured in its local memory.
  • each arithmetic unit merges information about a pair of blocks, that is, a pair of local data, and generates information about one block of the merged higher layer, that is, further local data. Therefore, the merge process is realized by a parallel operation of a plurality of arithmetic units. Each arithmetic unit also merges information about merged block pairs belonging to the same layer and generates information about one block of the merged higher layer. In this way, by repeating the merge processing in parallel and hierarchically, information on one block at the top layer is finally generated.
  • One block in the uppermost layer is a block including the entire record.
  • each arithmetic unit inputs information about two blocks, merges them, and outputs information about one block, each arithmetic unit
  • an n-stage (layer) merge process is realized.
  • the ratio of the communication performed by the arithmetic unit with the global memory is 1 / n.
  • the communication volume between the arithmetic units is (n − 1) / n of the total data communication volume.
  • the block 0 related to the arithmetic unit 0 and the block 1 related to the arithmetic unit 1 may be merged by an arithmetic unit other than the arithmetic unit 0 or the arithmetic unit 1, or by either the arithmetic unit 0 or the arithmetic unit 1.
  • When one of the arithmetic units holding the original local data is responsible for the merge processing of the next pipeline stage, the data communication amount between the arithmetic units is reduced.
  • FIG. 13 is an outline diagram of merge processing of inter-block compilation processing.
  • PE-i represents the arithmetic unit i
  • Block-i represents the block i.
  • arithmetic unit 0 merges block 0 from arithmetic unit 0 and block 1 from arithmetic unit 1.
  • the arithmetic unit 2 merges the block 2 from the arithmetic unit 2 and the block 3 from the arithmetic unit 3.
  • the arithmetic unit 4 merges the block 4 from the arithmetic unit 4 and the block 5 from the arithmetic unit 5.
  • the arithmetic unit 6 merges the block 6 from the arithmetic unit 6 and the block 7 from the arithmetic unit 7.
  • the arithmetic unit 1 merges the blocks 0 to 1 from the arithmetic unit 0 and the blocks 2 to 3 from the arithmetic unit 2.
  • the arithmetic unit 5 merges the blocks 4 to 5 from the arithmetic unit 4 and the blocks 6 to 7 from the arithmetic unit 6.
  • the notation of blocks i to j means one block generated by merging blocks i to j.
  • the arithmetic unit executes pipeline processing in the tournament method.
  • the arithmetic unit 3 merges the blocks 0 to 3 from the arithmetic unit 1 and the blocks 4 to 7 from the arithmetic unit 5 to generate one block, namely blocks 0 to 7.
  • the arrangement of the arithmetic units in the tournament table (that is, the correspondence between the blocks and the arithmetic units) is not limited to the arrangement shown in FIG.
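  • The tournament-style pairing of blocks can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `tournament_stages` is hypothetical, and the sketch only lists which block ranges are merged at each stage; which arithmetic unit performs each merge is left open, as the text allows.

```python
def tournament_stages(n_blocks):
    """List, per stage, the (first, last) block range produced by each merge."""
    groups = [(i, i) for i in range(n_blocks)]   # stage 0: one range per block
    stages = []
    while len(groups) > 1:
        # Merge neighbouring ranges pairwise: (a..b) + (c..d) -> (a..d).
        merged = [(groups[k][0], groups[k + 1][1])
                  for k in range(0, len(groups) - 1, 2)]
        if len(groups) % 2:                      # an unpaired range passes through
            merged.append(groups[-1])
        stages.append(merged)
        groups = merged
    return stages


stages = tournament_stages(8)
```

  • For eight blocks this yields three stages, ending with the single range covering blocks 0 to 7.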
  • the blocks 0 to 7 generated by the arithmetic unit 3 are sequentially transmitted to the arithmetic unit 7 when values are obtained.
  • the arithmetic unit 7 distributes the blocks 0 to 7 to the arithmetic unit 0, the arithmetic unit 1, ..., and the arithmetic unit 7 corresponding to the original blocks.
  • the arrays handled (that is, the block number array, the global item value sequence number array, and the local item value array) may reach a size that cannot be accommodated in the local memory of one arithmetic unit. Therefore, when a pipelined arithmetic unit receives data from the arithmetic unit of the previous pipeline stage, it immediately processes the received data and transmits the processed data to the arithmetic unit of the subsequent pipeline stage. The arithmetic unit in the subsequent pipeline stage receives data from the previous pipeline stage if the data can be stored in its local memory; if the data cannot be stored, it makes the previous pipeline stage wait before transmitting.
  • the arithmetic unit that has completed the data transmission can release the storage area in the local memory in which the transmitted data is stored.
  • An arithmetic unit that has not completed data transmission receives and processes data from the previous pipeline stage as long as it can store data in its local memory. When no more data can be stored in the local memory, it makes the arithmetic unit in the previous pipeline stage wait before transmitting data; once an area for storing data is secured again in the local memory, it has the arithmetic unit in the previous pipeline stage resume sending data.
  • the data reception side controls data communication between the arithmetic units by issuing a data transmission stop request and a data transmission restart request to the data transmission source. It should be noted that this data communication can be realized by any generally known flow control.
  • the arithmetic unit creates an initialized block number array BlkNo in the local memory of the arithmetic unit.
  • the block number array BlkNo has the same size as the already created local item value array LVL, and a block number for identifying the assigned record assigned to each arithmetic unit is set as an initial value.
  • 0 is stored in the block number array of the arithmetic unit 0
  • 1 is stored in the block number array of the arithmetic unit 1, and so on.
  • the arithmetic unit further creates an initialized global item value sequence number array GVOrd on the local memory.
  • the size of the global item value order number array GVOrd is the same size as the local item value array LVL, and a sequential number starting from 0 is set in order from the top as an initial value.
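  • The initialization of the two work arrays can be sketched as follows. The function name `init_merge_arrays` is an assumption made for illustration.

```python
def init_merge_arrays(block_number, lvl):
    """Create the initial BlkNo and GVOrd arrays for one block (hypothetical helper)."""
    blk_no = [block_number] * len(lvl)   # every element carries the block's own number
    gv_ord = list(range(len(lvl)))       # sequential numbers 0, 1, 2, ... from the top
    return blk_no, gv_ord


blk_no, gv_ord = init_merge_arrays(1, ["North", "West"])
```

  • Before any merging, each element of GVOrd simply points at the element of the block's own LVL with the same index.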
  • With reference to FIGS. 15A, 15B, and 15C, the first-stage merge process of the inter-block compilation process according to an embodiment of the present invention will be described.
  • An example in which the arithmetic unit 0 merges the block 0 and the block 1 will be described.
  • the pointers to both local item value arrays LVL (i.e., the pointers specifying the elements to be compared) are advanced.
  • the pointers to both local item value arrays LVL (i.e., the pointers specifying the elements to be compared)
  • the merge processing at the first stage of the inter-block compilation processing by PE-0 is completed.
  • two sets of local data, i.e., two sets each consisting of a block number array, a global item value sequence number array, and a local item value array, are converted into one set of further local data, i.e., one set consisting of a further block number array, a further global item value sequence number array, and a further local item value array.
  • Note that the merge process at the first stage of the inter-block compilation process is executed only by sequential access to the local memory, so it can be implemented using only a small working memory regardless of the sizes of the block number array, the global item value sequence number array, and the local item value array.
  • Note that the block numbers stored in the further block number array BlkNo′ are always in a predetermined order (ascending order in this example) among elements for which the values stored in the further global item value sequence number array GVOrd′ are the same.
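  • One possible reading of this merge step is sketched below. The function name `merge_local_data` and the sample blocks are assumptions. Each input is a triple (BlkNo, GVOrd, LVL) in which GVOrd holds indices into that input's LVL in ascending order, and the first input is assumed to cover the lower block numbers.

```python
def merge_local_data(a, b):
    """Merge two (BlkNo, GVOrd, LVL) triples into one further triple."""
    blk_a, ord_a, lvl_a = a
    blk_b, ord_b, lvl_b = b
    blk_out, ord_out, lvl_out = [], [], []
    i = j = 0      # pointers into lvl_a and lvl_b (elements being compared)
    pa = pb = 0    # pointers into the (BlkNo, GVOrd) entries of a and b
    while i < len(lvl_a) or j < len(lvl_b):
        take_a = j >= len(lvl_b) or (i < len(lvl_a) and lvl_a[i] <= lvl_b[j])
        take_b = i >= len(lvl_a) or (j < len(lvl_b) and lvl_b[j] <= lvl_a[i])
        pos = len(lvl_out)                       # position in the merged LVL'
        lvl_out.append(lvl_a[i] if take_a else lvl_b[j])
        if take_a:                               # re-point a's entries for this value
            while pa < len(ord_a) and ord_a[pa] == i:
                blk_out.append(blk_a[pa])
                ord_out.append(pos)
                pa += 1
            i += 1
        if take_b:                               # equal values advance both pointers
            while pb < len(ord_b) and ord_b[pb] == j:
                blk_out.append(blk_b[pb])
                ord_out.append(pos)
                pb += 1
            j += 1
    return blk_out, ord_out, lvl_out


block0 = ([0, 0], [0, 1], ["East", "West"])
block1 = ([1, 1], [0, 1], ["North", "West"])
merged01 = merge_local_data(block0, block1)
```

  • The sketch reads every array strictly front to back, which matches the remark that only sequential access is needed. Because entries of the first input are emitted before those of the second for equal values, block numbers stay ascending within one GVOrd′ value.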
  • FIGS. 16A to 16D are explanatory diagrams of the second-stage merge process in the inter-block compilation according to the embodiment of the present invention.
  • the second-stage merge process is the same as the first-stage merge process except that the input information is information that has already been merged by another arithmetic unit.
  • FIGS. 16A, 16B, 16C and 16D show the process and the resulting further block number array BlkNo′, further global item value sequence number array GVOrd′, and further local item value array LVL′.
  • FIG. 17 is a diagram illustrating the result of the third-stage merge process in the inter-block compilation process according to the embodiment of the present invention.
  • FIG. 17 shows the result of merge processing by inter-block compilation for the data item “School”. It should be noted here that the final further local item value array LVL 'matches the virtual global item value array.
  • FIG. 18A is an explanatory diagram of distribution processing in inter-block compilation processing according to an embodiment of the present invention.
  • the distribution process is executed after the merge process.
  • at least one arithmetic unit distributes, for each data item, each element of the final global item value sequence number array GVOrd′ to the arithmetic unit corresponding to the block number specified by the corresponding element of the final block number array BlkNo′.
  • each arithmetic unit stores the distributed global item value order numbers, in a predetermined order (for example, ascending order), in the global item value order number array GVOrd that was initially set.
  • the global item value sequence number array of the arithmetic unit i is denoted GVOrd[i][j]
  • the global item value sequence numbers are distributed to the global item value sequence number array GVOrd reserved in the local memory of each arithmetic unit according to the following procedure.
  • the global item value sequence number array GVOrd[i][j] may first be created in the local memory of the arithmetic unit that is executing the distribution process. As its elements are filled in little by little, they may be sent to each arithmetic unit little by little, or a certain amount may be sent to each arithmetic unit collectively.
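  • The distribution step can be sketched as follows, assuming the final arrays BlkNo′ and GVOrd′ are already available in one place. The function name `distribute_gv_ord` is hypothetical, and appending to a per-block list stands in for the write index j of GVOrd[i][j].

```python
def distribute_gv_ord(blk_no_final, gv_ord_final, n_blocks):
    """Route each global item value sequence number to its block's GVOrd array."""
    gv_ord = [[] for _ in range(n_blocks)]   # gv_ord[i] plays the role of GVOrd[i]
    for blk, order in zip(blk_no_final, gv_ord_final):
        gv_ord[blk].append(order)            # next free slot j of block blk
    return gv_ord


per_block = distribute_gv_ord([0, 1, 0, 1], [0, 1, 2, 2], 2)
```

  • Since the final arrays are scanned in ascending GVOrd′ order, each per-block array is filled in ascending order, matching the order of that block's local item value array.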
  • FIG. 18B is a diagram showing a result of distribution processing in inter-block compilation processing according to an embodiment of the present invention.
  • a distribution process is performed for each group after a plurality of blocks are grouped in order to improve the processing efficiency. For example, the block number is divided by 4 to separate the upper block number and the lower block number (grouping), and the distribution process is separately applied to the upper block number and the lower block number.
  • from the set of the block number array BlkNo′ and the global item value sequence number array GVOrd′ obtained by the merge processing in the inter-block compilation processing, a set of the block number array BlkNo′ and the global item value sequence number array GVOrd′ for the upper block numbers and a set of the block number array BlkNo′ and the global item value order number array GVOrd′ for the lower block numbers are generated. This processing can also be executed by a plurality of arithmetic units operating in parallel.
  • FIGS. 19A to 19C are diagrams for explaining the result of the compilation process according to the embodiment of the present invention.
  • FIGS. 19A, 19B and 19C are the same as FIGS. 5B, 5C and 5D, respectively.
  • FIGS. 20A and 20B are explanatory diagrams of the sort processing of tabular data according to an embodiment of the present invention.
  • FIG. 20A shows tabular data before sorting processing
  • FIG. 20B shows tabular data after sorting processing
  • the tabular data before sorting processing is the same as the tabular data shown in FIG. 5A.
  • a sort process in which the order of two records having the same key value (that is, West) does not change before and after the sort process is called a “stable” sort process.
  • FIGS. 21A to 21D are explanatory diagrams of tabular data obtained by applying the sort processing according to the embodiment of the present invention to the tabular data shown in FIGS. 5A to 5D.
  • the result of the sorting process will be described with reference to FIGS. 21A to 21D.
  • the storage position of the element GOrd[0] in the record sequence number array GOrd in which this record sequence number 0 is stored, that is, 0, is the rank of the target record among the records in the charge of the arithmetic unit responsible for this block.
  • the record order number array is in ascending order, so this storage position can be found efficiently by a well-known method such as binary search (bisection).
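  • Locating a record sequence number in an ascending GOrd array by bisection can be sketched as follows. The function name `local_rank` and the sample array are assumptions; `bisect_left` comes from the Python standard library.

```python
from bisect import bisect_left


def local_rank(g_ord, record_number):
    """Return the storage position of record_number in ascending g_ord, or None."""
    k = bisect_left(g_ord, record_number)          # first index with g_ord[k] >= target
    if k == len(g_ord) or g_ord[k] != record_number:
        return None                                # the record is not in this block
    return k


rank = local_rank([0, 3, 5, 8], 5)
```

  • The lookup takes a number of comparisons logarithmic in the size of the block's record sequence number array.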
  • the order information (that is, the record order number array and the item value access information array)
  • the item information does not change before and after the sort process.
  • the item information is configured by a local item value number array VNo, a local item value array LVL, and a global item value order number array GVOrd.
  • FIG. 22 is a schematic flowchart of sort processing of tabular data according to an embodiment of the present invention.
  • Step 2201: Sorting within a block. Each arithmetic unit operates in parallel and, within each block, sorts the item value access information array LOrd by applying a distribution counting sort using the local item value number array VNo as a key. A global item value sequence number array GVOrd′ and a record sequence number array GOrd′ for the next step are generated so as to conform to the item value access information array LOrd.
  • When a plurality of processors are accommodated in one arithmetic unit, processing in units of blocks is executed for each processor.
  • Step 2202: Inter-block sort 1 (merge). The arithmetic units operate in parallel and hierarchically. Each adds a block number array BlkNo′ to the global item value order number array GVOrd′ and the record order number array GOrd′ from the previous pipeline stage and, using the global item value order number array GVOrd′ and the record order number array GOrd′ as a key, merges the global item value order number array GVOrd′, the record order number array GOrd′ and the block number array BlkNo′ from each block into a predetermined order by the tournament method.
  • This step is a sequential process.
  • Step 2203: Inter-block sort 2 (distribution). At least one arithmetic unit, in order from the first element of the block number array BlkNo′, distributes the value of the position where the element is stored to the block designated by the element value, that is, to the arithmetic unit associated with this block
  • the record order number array GOrd is generated on the local memory of each arithmetic unit.
  • the sorting process of the tabular data is realized by three steps of sorting within a block, sorting 1 between blocks (merge), and sorting 2 between blocks (distribution).
  • the inter-block sort 1 (merge) is a sort process in the sense that data from a pair of blocks is rearranged in a predetermined order, and the data from the pair of blocks is integrated into a set of data. It is also a merge process in meaning.
  • “merge in a predetermined order” refers to the processing in inter-block sort 1 (merge).
  • FIGS. 23A to 23C are explanatory diagrams of the intra-block sorting process in the tabular data sorting process according to the embodiment of the present invention.
  • the processing related to each block is executed by each arithmetic unit to which the block is assigned.
  • an array operation may be expressed by a pseudo instruction similar to the C language.
  • FIG. 23A is an explanatory diagram of the count-up process in the distribution counting sort process.
  • FIG. 23A shows the record order number array GOrd, the item value access information array LOrd, the local item value number array VNo, the global item value order number array GVOrd, and the transition of the count array Count, which counts the number of occurrences of each local item value number used as the distribution counting sort key.
  • FIG. 23B is an explanatory diagram of the cumulative number process in the distribution counting sort process.
  • a cumulative frequency distribution array Aggr is obtained. Note that the head element of the cumulative frequency distribution array Aggr generated by this cumulative number processing is 0, and the actual cumulative frequency is stored after Aggr [1].
  • FIG. 23C is an explanatory diagram of the transfer process in the distribution counting sort process.
  • in the transfer process, not only are the elements of the array LOrd copied to the array LOrd′, but the global item value sequence number array GVOrd′ and the record sequence number array GOrd′ corresponding to the newly generated array LOrd′ are also generated.
  • a process of copying LOrd [0] is shown at the top of FIG. 23C.
  • the item value access information array LOrd 'obtained in this way matches the final item value access information array LOrd in the sorted block.
  • the newly generated global item value order number array GVOrd 'and record order number array GOrd' correspond to records sorted in the block using item values as keys.
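  • The three phases of the distribution counting sort (count-up, accumulation, transfer) can be sketched as follows. The function name `sort_within_block` and the sample arrays are assumptions. LOrd is taken to hold local record positions, VNo to map a position to a local item value number, and GVOrd to map that number to a global item value sequence number.

```python
def sort_within_block(g_ord, l_ord, vno, gv_ord, n_values):
    """Stable counting sort of one block; returns (LOrd', GVOrd', GOrd')."""
    # Count-up: occurrences of each local item value number.
    count = [0] * n_values
    for rec in l_ord:
        count[vno[rec]] += 1
    # Accumulation: aggr[0] is 0, real cumulative counts start at aggr[1].
    aggr = [0] * (n_values + 1)
    for v in range(n_values):
        aggr[v + 1] = aggr[v] + count[v]
    # Transfer: copy each record to its slot and build the companion arrays.
    n = len(l_ord)
    l_out, gv_out, g_out = [0] * n, [0] * n, [0] * n
    for i, rec in enumerate(l_ord):
        v = vno[rec]
        dst = aggr[v]          # next free slot for this key
        aggr[v] += 1
        l_out[dst] = rec
        gv_out[dst] = gv_ord[v]
        g_out[dst] = g_ord[i]
    return l_out, gv_out, g_out


result = sort_within_block([0, 2, 5], [0, 1, 2], [1, 0, 1], [0, 2], 2)
```

  • Records with equal keys keep their input order because each key's slots are filled front to back, which is what makes the later inter-block merge stable.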
  • the tabular data records sorted within the block by each arithmetic unit are then merged between the blocks.
  • data sorted in each block is merged, and merged data sorted as a whole is generated. More specifically, the pairs of elements of the global item value sequence number array GVOrd′ and elements of the record sequence number array GOrd′ are sorted. Since the elements of the record order number array GOrd′ have values that are uniquely determined for each record, each pair of an element of the global item value order number array GVOrd′ and an element of the record order number array GOrd′ is unique.
  • an element of the global item value sequence number array GVOrd′ relating to a certain data item indicates the position of the element in the virtual global item value array in which the item values relating to the data item are arranged in a predetermined order. Sorting in the order of the element values of the global item value sequence number array GVOrd′ is therefore equivalent to sorting in the order of the item values.
  • a block number array BlkNo′ representing the block number to which each record in each block belongs is added for later processing.
  • FIG. 24 is an explanatory diagram of the result of the intra-block sort process in the tabular data sort process according to the embodiment of the present invention obtained as described above.
  • the item value access information array LOrd in each block is consistent with the final result, and thus is indicated without a prime symbol (') like LOrd.
  • the global item value sequence number array GVOrd 'and the record sequence number array GOrd' are not final results but represent work arrays in the middle of processing, and therefore are indicated with a prime symbol (').
  • an additionally generated block number array BlkNo ' is also shown.
  • the inter-block sort process 1 (merge process) in the tabular data sort process can be realized by the same hierarchical structure as the hierarchical structure shown in FIG.
  • as the first-stage merge processing, the merge processing of block 0 and block 1 by the arithmetic unit 0, the merge processing of block 2 and block 3 by the arithmetic unit 2, the merge processing of block 4 and block 5 by the arithmetic unit 4, and the merge processing of block 6 and block 7 by the arithmetic unit 6 are executed.
  • the second-stage merge processing is realized by the merge processing of blocks 0-1 and 2-3 by the arithmetic unit 1, and the merge processing of blocks 4-5 and 6-7 by the arithmetic unit 5.
  • the arithmetic unit 3 executes the merge processing of blocks 0-3 and blocks 4-7.
  • the arithmetic units in charge of the merge process in each pipeline stage are not limited to the above combinations.
  • the distribution process is executed by the arithmetic unit 7, for example.
  • each arithmetic unit merges information related to a pair of blocks, and generates information related to one block of the merged higher layer. Therefore, the merge process is realized by a parallel operation of a plurality of arithmetic units. Each arithmetic unit also merges information about merged block pairs belonging to the same layer and generates information about one block of the merged higher layer. In this way, by repeating the merge processing in parallel and hierarchically, information on one block at the top layer is finally generated.
  • One block in the uppermost layer is a block including the entire record.
  • inter-block sort process 1 in the tabular data sort process according to an embodiment of the present invention will be described in more detail.
  • FIGS. 25A to 25C are explanatory diagrams of the first-stage inter-block merge processing in the tabular data sort processing according to an embodiment of the present invention.
  • PE-0 executes a merge process in a predetermined order between block Block-0 (hereinafter referred to as block 0) and block Block-1 (block 1).
  • the inter-block merge process is an ascending list merge process in that one ascending list is generated from two ascending lists.
  • FIG. 25A shows information on the first record in block 0 (represented as B0 (GVOrd ′, GOrd ′)) and information on the first record in block 1 (B1 (GVOrd ′, GOrd ′)).
  • FIG. 25B shows a process of comparing the first record in block 0 where the read pointer is positioned at the head with the second record in block 1 where the read pointer is advanced one step forward.
  • when B0 (2,1) is compared with B1 (2,4), B0 (2,1) is determined to be smaller, so the element set B0 (2,1), including the block number, is extracted.
  • the read pointer on the block 0 side is advanced by one.
  • FIG. 25C shows the global item value sequence number array GVOrd ', the record sequence number array GOrd', and the block number array BlkNo 'that are finally extracted.
  • the retrieved global item value sequence number array GVOrd ′, record sequence number array GOrd ′, and block number array BlkNo ′ are sent to, for example, PE-1 in this example for the second stage merge processing.
  • the extracted data may be sent from PE-0 to PE-1 as necessary while the comparison process on the block 0 side and the block 1 side proceeds.
  • in the inter-block sort process 1 (merge process) according to one embodiment of the present invention, data access is limited to sequential access, and the arithmetic units can execute the inter-block sort process 1 in parallel. Therefore, the performance of the multiprocessor type processing apparatus is fully utilized.
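  • The ascending list merge on the key (GVOrd′, GOrd′) can be sketched as follows. The function name `merge_sorted_blocks` and the sample tuples are assumptions. Each record is represented as a tuple (GVOrd′, GOrd′, BlkNo′), and both inputs are assumed to be already ascending by the first two fields.

```python
def merge_sorted_blocks(a, b):
    """Merge two ascending lists of (gv_ord, g_ord, blk_no) tuples into one."""
    out = []
    i = j = 0                            # read pointers on the a side and b side
    while i < len(a) and j < len(b):
        if a[i][:2] <= b[j][:2]:         # compare (GVOrd', GOrd') pairs only
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])                    # one side is exhausted: copy the rest
    out.extend(b[j:])
    return out


b0 = [(2, 1, 0), (3, 0, 0)]
b1 = [(1, 5, 1), (2, 4, 1)]
merged = merge_sorted_blocks(b0, b1)
```

  • Because the record sequence number is part of the key and is unique per record, ties on the item value order are always broken by the original record order.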
  • FIGS. 26A and 26B are explanatory diagrams of the second-stage merge processing in the tabular data sort processing according to the embodiment of the present invention.
  • the second-stage merge process is the same as the first-stage merge process, except that input information is transferred from the local memory of another arithmetic unit. Briefly describing this processing, as shown in FIG.
  • the global item value sequence number array GVOrd ′, the record sequence number array GOrd ′, and the block number array BlkNo ′ are read from the two blocks.
  • the notations Ba to b represent data obtained as a result of the merge processing in a predetermined order from block a to block b.
  • the retrieved global item value sequence number array GVOrd′, record sequence number array GOrd′, and block number array BlkNo′ are sent to PE-3 in this example for the third-stage merge process. Rather than sending the final result all at once, the extracted data may be sent from PE-1 to PE-3 as necessary while the comparison process on the block 0 side and the block 1 side proceeds.
  • in the second stage of the inter-block sort process 1 (merge process) according to an embodiment of the present invention as well, data access is limited to sequential access, and the arithmetic units can execute the inter-block sort process 1 in parallel. Therefore, the performance of the multiprocessor type processing apparatus is fully utilized.
  • the second stage merge processing is executed in parallel by PE-1 and PE-5 in this example.
  • PE-1 outputs the result of the inter-block merge processing from block 0 to block 3 to PE-3 as blocks 0 to 3
  • PE-5 outputs the result of the inter-block merge processing from block 4 to block 7 to PE-3 as blocks 4 to 7.
  • the merge of data from all blocks is completed by the third-stage merge processing by PE-3, that is, the sort process taking all blocks into account ends.
  • even when the number of blocks is 9 or more, for example, a sort processing result in which data from all blocks are merged can finally be obtained by increasing the number of stages of merge processing.
  • FIG. 27 is an explanatory diagram of the third merging process in the tabular data sorting process according to the embodiment of the present invention.
  • in the third-stage merge process as well, the data B0-3 (GVOrd′, GOrd′) of blocks 0-3 and the data B4-7 (GVOrd′, GOrd′) of blocks 4-7 are compared in order from the head, the smaller element set is extracted, and the operation of advancing the read pointer of the data from which the element has been extracted is repeated.
  • This set of arrays represents the sorting result of all records. For example, the values in the global item value sequence number array GVOrd′ are arranged in ascending order from the top, with equal values allowed, which shows that the records are sorted in the order of the item values in the virtual global item value array. Referring to the record sequence number array GOrd′ also shows that records having the same element value in the global item value sequence number array GVOrd′ are arranged in ascending order of their record sequence numbers before sorting. A sorting result that is stable with respect to the record sequence number is obtained because, when sorting between blocks, the records were rearranged based on the magnitude relation of the pair consisting of the item value designation pointer and the record sequence number.
  • the inter-block sort process 1 (merge process) in the tabular data sort process according to the embodiment of the present invention.
  • the record to which the record sequence number 0 is given after sorting (i) has a global item value sequence number of 0 for the item value of the data item that is the sort key, (ii) had the record sequence number 8 before sorting, and (iii) belongs to block 2
  • this inter-block sort process 1 (merge process) is expressed in a data structure for a distributed memory multiprocessor.
  • the record sequence number of the record belonging to each block is finally determined.
  • This process of determining the record sequence number is called inter-block sort process 2 (distribution process).
  • at least one arithmetic unit or processor core distributes the record sequence number corresponding to the subscript i of the block number array BlkNo′ to the block represented by the block number BlkNo′[i], and the distributed record order numbers are arranged in a predetermined order (for example, ascending order) within the block.
  • the k-th element of the record sequence number array of block j is GOrd [j] [k]
  • the write pointer k for setting the record sequence number in the record sequence number array GOrd [j] is set.
  • Offsets [j] it can be described as follows.
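  • A minimal sketch of this distribution, assuming a single arithmetic unit scans the whole block number array BlkNo′. The function name `distribute_record_numbers` is hypothetical, and appending to a per-block list stands in for the write pointer k of GOrd[j][k]; the role of Offsets[j] when several units share the work is not modeled here.

```python
def distribute_record_numbers(blk_no_final, n_blocks):
    """Give each block the new record sequence numbers of the records it holds."""
    g_ord = [[] for _ in range(n_blocks)]    # g_ord[j] plays the role of GOrd[j]
    for i, blk in enumerate(blk_no_final):
        g_ord[blk].append(i)                 # subscript i is the record sequence number
    return g_ord


g_ord = distribute_record_numbers([1, 0, 1, 0], 2)
```

  • Scanning BlkNo′ from the top means each block receives its record sequence numbers already in ascending order.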
  • the record sequence number arrays created in a shared manner by a plurality of arithmetic units are integrated into one record sequence number array for each block. If the block number array BlkNo′ is assigned to the plurality of arithmetic units in contiguous parts, that is, if the part of the block number array BlkNo′ handled by each arithmetic unit is contiguous, this integration processing is greatly simplified. This is because it is not necessary to change the order of elements between record sequence number arrays created by different arithmetic units for the same block. That is, the integration of the record sequence number arrays is achieved by simply concatenating the separately created record sequence number arrays.
  • FIG. 29A is an explanatory diagram of an inter-block sort process according to an embodiment of the present invention.
  • the item value access information array LOrd and the global item value sequence number array GVOrd′ relating to the assigned record (that is, each block) are stored in the local memory of each arithmetic unit.
  • a record sequence number array GOrd′ and a block number array BlkNo′ are created.
  • the arithmetic unit 0 merges the local data from the arithmetic units 0 and 1, the arithmetic unit 2 merges the local data from the arithmetic units 2 and 3,
  • the arithmetic unit 4 merges the local data from the arithmetic units 4 and 5
  • the arithmetic unit 6 merges the local data from the arithmetic units 6 and 7.
  • the arithmetic unit 1 merges the local data from the arithmetic units 0 and 2
  • the arithmetic unit 5 merges the local data from the arithmetic units 4 and 6.
  • the arithmetic unit 3 merges the local data from the arithmetic units 1 and 5.
  • the merge processing in the inter-block processing is performed hierarchically by the tournament method.
  • the arithmetic unit 7 receives the block number array BlkNo′ of the final local data generated by the arithmetic unit 3 in order from the top, and distributes the position at which each block number is stored in the block number array, that is, the record sequence number, to the arithmetic unit associated with that block number.
  • the arithmetic units to which the record order numbers are distributed sequentially store the distributed record order numbers in the record order number array GOrd on the respective local memories.
  • FIG. 29D shows the global item value order number array GVOrd′ and the record order number array GOrd′ in addition to the block number array BlkNo′ for convenience of explanation. Note, however, that the result actually required is only the block number array BlkNo′. Therefore, only the block number array BlkNo′ needs to be generated at the final stage of the merge process. Note also that the arithmetic unit 3 located at the third stage of the pipeline processing may further perform the distribution processing.
  • FIG. 30 is an explanatory diagram of the order information generated by the sort processing of the tabular data according to the embodiment of the present invention. This is consistent with the sorted ordered set shown in FIG. 21B. It should be noted that the item information is not changed by the sorting process.
  • the sort processing of tabular data is a sort processing related to predetermined data items. What is changed by this sort processing is the record sequence number array and the item value access information array. On the other hand, records belonging to each block and item information do not change. Therefore, the sorting process for a plurality of data items is realized by repeating the sorting process for the predetermined data item. According to a preferred embodiment of the distributed memory multiprocessor of the present invention, the multi-item sort process is realized by the control unit controlling the arithmetic unit to repeat the sort process for a plurality of data items.
  • FIGS. 31A and 31B are explanatory diagrams of tabular data search processing according to an embodiment of the present invention.
  • FIG. 31A shows tabular data before search processing
  • FIG. 31B shows tabular data after search processing
  • tabular data before search processing is the same as the tabular data shown in FIG. 5A.
  • FIGS. 32A to 32D are explanatory diagrams of tabular data obtained by applying the search processing according to the embodiment of the present invention to the tabular data shown in FIGS. 5A to 5D.
  • the results of the search process will be described with reference to FIGS. 32A to 32D.
  • FIG. 32A shows the order information of the tabular data before the search (that is, the search source) corresponding to FIG. 31A.
  • FIG. 32B shows the order information of the tabular data after the search corresponding to FIG. 31B.
  • FIGS. 32C and 32D show the item information of the tabular data after the search corresponding to FIG. 32B.
  • the record (North, 6) that matches the search condition exists in the block 1 of the tabular data in FIG. 31A, the order information related to the block 1 of the tabular data in FIG.
  • the item information includes a local item value number array VNo, a local item value array LVL, and a global item value sequence number array GVOrd.
  • FIG. 33 is a schematic flowchart of a tabular data search process according to an embodiment of the present invention. As described above, only the order information changes in the search. Accordingly, the record sequence number array GOrd and the item value access information array LOrd are created by the search process.
  • Step 3301 Local processing (hit flag array setting) The arithmetic units operate in parallel. For the item to be searched in its block, each arithmetic unit determines whether each item value in the local item value array LVL matches the search condition, and sets the element of the hit flag array corresponding to each item value that matches.
  • If a plurality of processors are accommodated in one arithmetic unit, processing in units of blocks is executed for each processor.
  • Step 3302 Local processing (record extraction) Each arithmetic unit operates in parallel to extract the record sequence number and item value access information corresponding to the item values for which the hit flag is set, and creates the record sequence number array GOrd and the item value access information array LOrd.
  • Step 3303 Global processing (block number setting) Each arithmetic unit operates in parallel to create a block number array BlkNo having the same size as the created record sequence number array GOrd and sets the block number in it.
  • Step 3304 Global processing (merge processing) The merging means of each arithmetic unit merges the record sequence number array GOrd and the block number array BlkNo by a tournament method.
  • Step 3305 Global processing (distribution processing) The distribution means of at least one arithmetic unit takes the elements of the final block number array BlkNo in order from the first element and transmits the position at which each element is stored, that is, the sequence number, to the block specified by the value of that element, that is, to the arithmetic unit associated with that block. Each arithmetic unit stores the sequence numbers transmitted to it sequentially in its own record sequence number array GOrd.
  • the tabular data search process includes a local process corresponding to a search process within a block, a merge process between blocks, and a distribution process between blocks.
  • FIG. 34 is an explanatory diagram of hit flag array setting processing in tabular data search processing according to an embodiment of the present invention.
  • an element corresponding to the item value that matches the search condition is marked (that is, a flag is set).
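  • The hit flag array setting can be sketched as below. This is a minimal illustration for one block only; the item values and the search condition are hypothetical, and the function name `set_hit_flags` is not taken from the specification.

```python
# Local item value array LVL of one block (hypothetical values)
LVL = [2, 5, 6, 9]

def set_hit_flags(lvl, condition):
    """Return a hit flag array: 1 where the item value matches the condition."""
    return [1 if condition(value) else 0 for value in lvl]

# Hypothetical search condition: the item value equals 6
hit_flags = set_hit_flags(LVL, lambda value: value == 6)
```

  • The condition is evaluated once per unique item value rather than once per record, which is why the flags are indexed by local item value number.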
  • FIGS. 35A to 35C are explanatory diagrams of local processing in tabular data search processing according to an embodiment of the present invention.
  • In the local processing, the records for which a flag in the hit flag array is set are extracted.
  • The arithmetic unit creates, on its local memory, an item value access information array LOrd′ and a record sequence number array GOrd′ of the records for which a flag in the hit flag array is set.
  • The operation of the arithmetic unit 1 related to block 1 will be described.
  • The arithmetic unit 1 extracts the elements of the local item value number array of the records it is in charge of, that is, the local item value numbers, in order from the top of the array, and determines whether the element of the hit flag array specified by each local item value number is set. If the hit flag is set, the record corresponding to this local item value number matches the search condition, so the item value access information and the record sequence number of this record are stored in the item value access information array LOrd′ and the record sequence number array GOrd′. As shown in FIG. 35B, this record extraction processing is executed in parallel by a plurality of arithmetic units. Finally, as shown in the drawings, an item value access information array LOrd′ and a record sequence number array GOrd′ relating to the records that match the search condition are constructed on the local memory of each arithmetic unit.
  • The item value access information array LOrd′ constructed at this time matches the item value access information LOrd included in the order information generated as the search result.
  • Each arithmetic unit creates, in its local memory, a block number array BlkNo′ of the same size as the record sequence number array GOrd′ created by the record extraction process, and fills this block number array with the block number indicating the records that the arithmetic unit is in charge of.
  • For a block in which no record matches the search condition, the record order number array GOrd′ and the block number array BlkNo′ are empty.
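  • The record extraction and block number setting for a single block might look like the following sketch. It assumes that GOrd and LOrd are parallel arrays and that `VNo[LOrd[i]]` gives the local item value number of the i-th record; the sample arrays and the function name `extract_hits` are hypothetical.

```python
def extract_hits(block_no, GOrd, LOrd, VNo, hit_flags):
    """Collect GOrd', LOrd' and BlkNo' for records whose item value is flagged."""
    GOrd_hit, LOrd_hit = [], []
    for rec_seq, access in zip(GOrd, LOrd):
        if hit_flags[VNo[access]]:      # flag looked up via the local item value number
            GOrd_hit.append(rec_seq)
            LOrd_hit.append(access)
    BlkNo_hit = [block_no] * len(GOrd_hit)  # same size as GOrd', filled with the block number
    return GOrd_hit, LOrd_hit, BlkNo_hit

# Hypothetical block 1 holding record sequence numbers 4..7
GOrd = [4, 5, 6, 7]
LOrd = [0, 1, 2, 3]
VNo = [2, 0, 2, 1]
hit_flags = [0, 0, 1]

g, l, b = extract_hits(1, GOrd, LOrd, VNo, hit_flags)
empty = extract_hits(2, [8, 9], [0, 1], [0, 1], [0, 0])
```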
  • the tabular data records retrieved within the block by each arithmetic unit are then merged between the blocks.
  • data searched in the respective blocks are merged to generate merged data searched as a whole.
  • a set of elements of the record sequence number array GOrd 'and elements of the block number array BlkNo' are merged in ascending order of the record sequence numbers. Since the elements of the record sequence number array GOrd 'have values that are uniquely determined for each record, the set of elements is unique.
  • the merge process in the tabular data search process according to the embodiment of the present invention can be realized by the same hierarchical structure as the hierarchical structure shown in FIG.
  • As the first-stage merge processing, the merge processing of block 0 and block 1 by the arithmetic unit 0, the merge processing of block 2 and block 3 by the arithmetic unit 2, the merge processing of block 4 and block 5 by the arithmetic unit 4, and the merge processing of block 6 and block 7 by the arithmetic unit 6 are executed.
  • The second-stage merge processing is realized by the merge processing of blocks 0-1 and blocks 2-3 by the arithmetic unit 1, and the merge processing of blocks 4-5 and blocks 6-7 by the arithmetic unit 5.
  • As the third-stage merge processing, the arithmetic unit 3 executes the merge processing of blocks 0-3 and blocks 4-7.
  • the arithmetic units in charge of the merge process in each pipeline stage are not limited to the above combinations.
  • the distribution process may be executed by, for example, an arithmetic unit 7 different from the arithmetic unit 3.
  • Each arithmetic unit merges information related to a pair of blocks and generates information related to one block of the next higher layer. The merge process is therefore realized by a parallel operation of a plurality of arithmetic units. Each arithmetic unit likewise merges information about a pair of already merged blocks belonging to the same layer and generates information about one block of the next higher layer. By repeating the merge processing in parallel and hierarchically in this way, information on one block at the top layer is finally generated.
  • The one block in the uppermost layer is a block including all the records.
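  • The hierarchical (tournament) merge can be illustrated by the following sketch, which runs sequentially in one process. In the described system each pairwise merge of a layer would be handled by a different arithmetic unit in parallel. The per-block data is hypothetical, and the pairwise merge is delegated to the standard `heapq.merge`.

```python
import heapq

def merge_pair(left, right):
    """Merge two ascending lists of (record sequence number, block number) pairs."""
    return list(heapq.merge(left, right))

def tournament_merge(blocks):
    """Merge per-block lists pairwise, layer by layer, until one list remains."""
    layer = list(blocks)
    while len(layer) > 1:
        next_layer = []
        for i in range(0, len(layer), 2):
            right = layer[i + 1] if i + 1 < len(layer) else []
            next_layer.append(merge_pair(layer[i], right))
        layer = next_layer
    return layer[0] if layer else []

# Hypothetical search hits per block: (GOrd' element, BlkNo' element)
blocks = [[(0, 0), (9, 0)], [(4, 1)], [], [(2, 3), (7, 3)]]
merged = tournament_merge(blocks)
GOrd_final = [seq for seq, _ in merged]
BlkNo_final = [blk for _, blk in merged]
```

  • Because the record sequence numbers are unique, the merged order of the pairs is fully determined by the first component.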
  • The arithmetic unit 3 finally generates one record order number array GOrd′ and one block number array BlkNo′.
  • The elements of the record order number array GOrd′ are arranged in a predetermined order (in this example, ascending order).
  • The merge process in the tabular data search process according to the embodiment of the present invention can be realized in exactly the same manner as the inter-block sort process 1 (inter-block merge process) in the tabular data sort process, and is therefore not described in further detail.
  • the record sequence number of the record belonging to each block is finally determined.
  • the process for determining the record sequence number is called a distribution process.
  • At least one arithmetic unit or processor core distributes the record sequence number corresponding to the subscript i of the block number array BlkNo′ to the block represented by the block number BlkNo′[i], and the distributed record sequence numbers are arranged in a predetermined order (for example, ascending order) within each block.
  • Here, GOrd[j][k] denotes the k-th element of the record sequence number array of block j, and Offsets[j] denotes the write pointer k used when setting a record sequence number in the record sequence number array GOrd[j]; the distribution process is described in terms of these arrays.
  • A plurality of arithmetic units or processor cores may share the block number array BlkNo′ and perform the distribution process. In that case, the record sequence number array GOrd[j] relating to a certain block is created in parts by a plurality of arithmetic units, and the parts are then integrated into one record sequence number array for each block. If the block number array BlkNo′ is assigned to the arithmetic units in contiguous portions, that is, if the part of BlkNo′ handled by each arithmetic unit is contiguous, this integration process is greatly simplified, because it is not necessary to change the order of elements between the record sequence number arrays created by different arithmetic units for the same block. The integration is then achieved by simply concatenating the separately created record sequence number arrays.
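  • A minimal sketch of the distribution process, first with one unit and then shared among several units that each take a contiguous portion of the block number array. The function names and the sample array are hypothetical, and `list.append` stands in for writing at `GOrd[j][Offsets[j]]` and advancing the write pointer.

```python
def distribute(BlkNo, num_blocks):
    """Give each block the positions (new record sequence numbers) that name it."""
    GOrd = [[] for _ in range(num_blocks)]
    for i, j in enumerate(BlkNo):   # i: sequence number, j: destination block
        GOrd[j].append(i)           # equivalent to GOrd[j][Offsets[j]] = i; Offsets[j] += 1
    return GOrd

def distribute_shared(BlkNo, num_blocks, num_units):
    """Distribute contiguous chunks separately, then concatenate per block."""
    chunk = max(1, -(-len(BlkNo) // num_units))   # ceiling division
    partials = []
    for start in range(0, len(BlkNo), chunk):
        part = [[] for _ in range(num_blocks)]
        for offset, j in enumerate(BlkNo[start:start + chunk]):
            part[j].append(start + offset)
        partials.append(part)
    # Chunks are contiguous, so plain concatenation keeps ascending order.
    return [[seq for part in partials for seq in part[j]] for j in range(num_blocks)]

BlkNo = [0, 3, 1, 3, 0]
single = distribute(BlkNo, 4)
shared = distribute_shared(BlkNo, 4, 2)
```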
  • FIGS. 37A and 37B are explanatory diagrams of tabulation processing for tabular data.
  • FIG. 37A shows the tabular data of the tabulation source, and FIG. 37B shows the tabular data of the tabulation result.
  • The tabular data of the tabulation source is formed by three items: School, Class, and Age.
  • Tabulation calculates, for each item value (dimension value) of a certain item (dimension) in the tabular data, a quantity (measure) based on the item values of an item.
  • Calculation of a measure means counting the number of measure values, calculating the sum of the measure values, or calculating the average value of the measure values. The number of dimensions may be two or more, as in this example. For example, for tabular data including an item “School”, an item “Class”, and an item “Age”, the process of obtaining the average value of Age by School and Class is a tabulation process in which “School” and “Class” are the dimensions and “Age” is the measure.
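  • As a plain, non-distributed illustration of what this tabulation computes, the sketch below derives the count, the sum, and the average of Age for each (School, Class) pair. The records are hypothetical and are not the data of FIG. 37A.

```python
from collections import defaultdict

# Hypothetical source records: (School, Class, Age)
records = [
    ("East", "A", 10), ("East", "A", 12),
    ("East", "B", 11), ("West", "A", 9),
]

count = defaultdict(int)   # number of measure values per dimension pair
total = defaultdict(int)   # sum of measure values per dimension pair
for school, cls, age in records:
    count[(school, cls)] += 1
    total[(school, cls)] += age

# The average follows from the two aggregates kept per dimension pair.
average = {key: total[key] / count[key] for key in count}
```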
  • FIGS. 38A to 38D are explanatory diagrams of the tabular data of the tabulation source, equivalent to FIG. 37A, expressed by the data structure according to the embodiment of the present invention.
  • FIG. 38A shows the order information. Record sequence numbers 0 to 3 are included in block 0, record sequence numbers 4 to 7 in block 1, record sequence numbers 8 to 11 in block 2, record sequence numbers 12 to 15 in block 3, record sequence numbers 16 to 19 in block 4, record sequence numbers 20 to 23 in block 5, record sequence numbers 24 to 27 in block 6, and record sequence numbers 28 to 31 in block 7.
  • It is obvious that this tabular data can be built on the local memories of the arithmetic units using the above-described compiling process of the present invention.
  • FIGS. 39A to 39E are explanatory diagrams of the tabular data of the tabulation result, equivalent to FIG. 37B, expressed by the data structure according to an embodiment of the present invention.
  • The tabular data of the tabulation result shown in FIG. 37B is divided into blocks differently from the tabular data of the tabulation source shown in FIG. 37A.
  • FIG. 39A shows the order information. Record sequence numbers 0 to 4 are included in block 0, record sequence numbers 5 and 6 in block 1, record sequence numbers 7 and 8 in block 2, and record sequence numbers 9 to 11 in block 3.
  • If a method for dividing the tabular data into blocks is defined in this way, it is obvious that the tabular data can be constructed on the local memories of the arithmetic units using the above-described compiling process of the present invention.
  • The size of the tabular data of the tabulation result can be estimated from the number of unique item values in each dimension. Specifically, for example, the number of records of the tabular data of the tabulation result is determined by the product of the numbers of unique item values of the dimensions. The number of unique item values of a dimension is the value obtained by adding 1 to the maximum value of the global item value sequence number of the item corresponding to that dimension in the tabular data of the tabulation source. The size of the tabular data of the tabulation result can therefore be determined in advance. Furthermore, the block division of the tabular data of the tabulation result can also be determined in advance based on that size. By using such prior knowledge, tabular data can be tabulated more efficiently.
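  • The size estimate can be sketched as follows, under the assumption that every block exposes its global item value sequence number array GVOrd for each dimension item. The arrays shown are hypothetical and `unique_count` is an illustrative helper name.

```python
from math import prod

def unique_count(gvord_per_block):
    """Number of unique item values = max global item value sequence number + 1."""
    return max(max(g) for g in gvord_per_block if g) + 1

# Hypothetical GVOrd arrays (one list per block) for two dimension items
dim1_gvord = [[0, 1], [1, 3], [2]]
dim2_gvord = [[0, 2], [1], [0, 1]]

sizes = [unique_count(dim1_gvord), unique_count(dim2_gvord)]
N = prod(sizes)   # size of the dimension space, i.e. rows of the tabulation result
```

  • No item values need to be compared or merged for this step; only the integer sequence numbers are inspected.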
  • FIG. 40 shows a flowchart of a tabular data tabulation method according to an embodiment of the present invention.
  • The tabulation method comprises the following steps:
  • Step 4002: expanding the tabular data of the tabulation source on the local memory of each arithmetic unit;
  • Step 4004: determining the size of the tabular data of the tabulation result and the block division definition;
  • Step 4006: creating the order information of the tabular data of the tabulation result;
  • Step 4008: determining the dimension division and creating the dimension item information;
  • Step 4010: assigning the measure item values to the dimension space;
  • Step 4012: aggregating the item values of the measure; and
  • Step 4014: creating the measure item information.
  • If a plurality of processors are accommodated in one arithmetic unit, processing in units of blocks is executed for each processor.
  • the size of the virtual global item value array GVL that is, the number of unique item values is considered.
  • the virtual global item value array GVL is an array generated by merging element values of the local item value array LVL without duplication.
  • the size of this GVL is the maximum value +1 of the global item value sequence number for the dimension item in the tabular table of the tabulation source.
  • the maximum value of the global item value sequence number is easily determined from the tabular data of the aggregation source.
  • the size of the dimension space (that is, the number of unique item value pairs of the dimension items) is a product of the sizes of the GVL.
  • the set of the dimension 1 item value number and the dimension 2 item value number (dimension 1 item value number, dimension 2 item value number) is (0,0), (0,1), (0, 2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2), (3,0), (3,1) , (3, 2).
  • Let N be the size of the dimension space.
  • The tabular data of the tabulation result is divided into M blocks, from the 0th to the $(M - 1)$th, with reference to the size of the dimension space.
  • The size of each block is determined in consideration of the number of records that can be accommodated in the block.
  • Block k is in charge of rows $R_k$ to $R_{k+1} - 1$ of the tabular data of the tabulation result.
  • Block 0: rows 0 to 4
  • Block 1: rows 5 to 6
  • Block 2: rows 7 to 8
  • Block 3: rows 9 to 11
  • the block division definition is not limited to this example.
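  • The block division of this example can be held as an array of boundaries, as in the sketch below. The boundary list `R` encodes the four ranges given above; the helper names are hypothetical.

```python
from bisect import bisect_right

# Block boundaries of the example: block k covers rows R[k] .. R[k+1]-1
R = [0, 5, 7, 9, 12]

def block_of_row(row):
    """Return the number of the block in charge of a row of the tabulation result."""
    return bisect_right(R, row) - 1

def rows_of_block(k):
    """Return the rows that block k is in charge of."""
    return range(R[k], R[k + 1])
```

  • With this representation, any other block division definition only changes the contents of `R`.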
  • the dimension value belonging to the p-th dimension of the block k can be specified as follows.
  • It is not necessary to handle the dimension value, that is, the item value, itself; each item value can be specified by the global item value sequence number assigned to it.
  • By using the global item value sequence number, the dimension value can be handled as an integer regardless of the data type of the item value.
  • specifying the dimension value specifically corresponds to specifying the local item value number array VNo and the local item value array LVL.
  • the calculation of the local item value number array VNo and the local item value array LVL of the p-th dimension block k can be considered by classifying into the following three cases.
  • Case A: the relation $(R_{k+1} - R_k) \ge \mathrm{CSize\_high}$ holds, and all item values of the p-th dimension are included in block k.
  • Case B: the relations $(R_{k+1} - R_k) < \mathrm{CSize\_high}$ and $((R_k \bmod \mathrm{CSize\_high}) \operatorname{div} \mathrm{CSize\_low}) \le (((R_{k+1} - 1) \bmod \mathrm{CSize\_high}) \operatorname{div} \mathrm{CSize\_low})$ hold. In this case, some of the item values of the p-th dimension are included in block k, in order from the smallest value.
  • Case C: the relations $(R_{k+1} - R_k) < \mathrm{CSize\_high}$ and $((R_k \bmod \mathrm{CSize\_high}) \operatorname{div} \mathrm{CSize\_low}) > (((R_{k+1} - 1) \bmod \mathrm{CSize\_high}) \operatorname{div} \mathrm{CSize\_low})$ hold. In this case, some of the item values of the p-th dimension are included in block k, but they do not appear in a single monotonic order such as ascending or descending order: the item value increases, returns from a large value to a small value in the middle of block k, and then increases again.
  • Here, div represents integer division, in which the fractional part of the quotient is rounded down, and mod represents the integer remainder.
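  • The three-way classification can be sketched as follows. The sketch rests on several assumptions that are not stated in this passage: that case A means the block spans at least CSize_high rows, that CSize_low is the product of the sizes of the lower-priority dimensions, and that CSize_high is CSize_low multiplied by the size of the p-th dimension. It uses the 4 x 3 example with the block boundaries 0, 5, 7, 9, 12.

```python
def classify(R_k, R_k1, csize_low, csize_high):
    """Classify block k (rows R_k .. R_k1-1) for one dimension as 'A', 'B' or 'C'."""
    if R_k1 - R_k >= csize_high:
        return "A"                                   # every value of the dimension occurs
    first = (R_k % csize_high) // csize_low          # value number at the first row
    last = ((R_k1 - 1) % csize_high) // csize_low    # value number at the last row
    return "B" if first <= last else "C"             # C: the values wrap around

R = [0, 5, 7, 9, 12]
# Assumed parameters: dimension -> (CSize_low, CSize_high)
params = {1: (3, 12), 2: (1, 3)}
table = {(k, p): classify(R[k], R[k + 1], *params[p])
         for k in range(4) for p in params}
```

  • Under these assumptions, block 1 of the second dimension falls into case C, since its two rows hold the last and then the first value of that dimension.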
  • Block 0: assigned range is rows 0 to 4
  • Block 1: assigned range is rows 5 to 6
  • FIG. 41 shows the first-dimensional and second-dimensional classification results of all the blocks from block 0 to block 3.
  • this block is classified as case A.
  • In case A, generally speaking, all values of the p-th dimension are present in block k. Since all values of the p-th dimension are assigned to the arithmetic processor in charge of block k, the number of unique values of the p-th dimension (the size of GVL) can be obtained by referring to the assigned values. Therefore, the global item value sequence number array GVOrd can be created.
  • The arithmetic processor in charge of block k creates a local item value number array VNo for the p-th dimension.
  • The local item value number VNo of row L can be calculated by $\mathrm{VNo} = (L \bmod \mathrm{CSize\_high}) \operatorname{div} \mathrm{CSize\_low}$.
  • Accordingly, for the i-th row of block k, $\mathrm{VNo}[i] = ((i + R_k) \bmod \mathrm{CSize\_high}) \operatorname{div} \mathrm{CSize\_low}$.
  • In case C, in general, the values of the p-th dimension in block k are separated into ascending part 1 in the first half and ascending part 2 in the second half; the value reaches the maximum at the end of ascending part 1 and returns to the minimum at the beginning of ascending part 2.
  • The following variable Gap is calculated.
  • $\mathrm{Gap} = ((R_k \bmod \mathrm{CSize\_high}) \operatorname{div} \mathrm{CSize\_low}) - (((R_{k+1} - 1) \bmod \mathrm{CSize\_high}) \operatorname{div} \mathrm{CSize\_low}) - 1$
  • the end of the intermediate section matches the end of the global item value sequence number array GVOrd.
  • the arithmetic processor in charge of the block k creates a local item value number array VNo for the pth dimension.
  • The item values included in block k are, among the entire set of item values, all the values between the minimum value and the first intermediate value, and all the values between the second intermediate value and the maximum value.
  • Block k does not include the item values between the first intermediate value and the second intermediate value.
  • The local item value number array VNo stores, in order from the top, first the item value numbers from the one corresponding to the second intermediate value to the one corresponding to the maximum value, and then the item value numbers from the one corresponding to the minimum value to the one corresponding to the first intermediate value.
  • In this example, $\mathrm{VNo}[0] = 1$ and $\mathrm{VNo}[1] = 0$.
  • a general calculation method of the local item value number array VNo is as follows.
  • block 2 is classified as case B.
  • the p-th dimension value of block k generally appears in ascending order.
  • the minimum value and the maximum value of the global item value sequence number may be determined.
  • $\mathrm{VNo}[i] = (((i + R_k) \bmod \mathrm{CSize\_high}) \operatorname{div} \mathrm{CSize\_low}) - \mathrm{Min}(\mathrm{GVOrd})$
  • the local item value number array VNo and the global item value order number array GVOrd are obtained for each block with respect to the second dimension.
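  • One uniform way to obtain both arrays for any block, without distinguishing the three cases, is sketched below. This is an alternative illustration and not the case-by-case procedure of the specification. It assumes that CSize_low is the product of the sizes of the lower-priority dimensions and that CSize_high is CSize_low multiplied by the size of the dimension in question; the function name is hypothetical.

```python
def dimension_item_info(R_k, R_k1, csize_low, csize_high):
    """Return (VNo, GVOrd) of one dimension for the block covering rows R_k .. R_k1-1."""
    # Global item value sequence number of the dimension for every row of the block
    per_row = [(row % csize_high) // csize_low for row in range(R_k, R_k1)]
    GVOrd = sorted(set(per_row))                    # unique values held by this block
    local_no = {g: n for n, g in enumerate(GVOrd)}  # global number -> local number
    VNo = [local_no[g] for g in per_row]            # local item value number per row
    return VNo, GVOrd

# Second dimension (CSize_low = 1, CSize_high = 3) of the 4 x 3 example
block0 = dimension_item_info(0, 5, 1, 3)   # all values present
block1 = dimension_item_info(5, 7, 1, 3)   # values wrap around inside the block
block2 = dimension_item_info(7, 9, 1, 3)   # ascending part of the values
```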
  • each arithmetic unit creates on the local memory a block number array BlkNo having the same size as the global item value sequence number array GVOrd created by the local item information creation processing of tabular data as a result of aggregation.
  • This block number array is filled with the block number indicating the record in charge of the unit.
  • each arithmetic unit prepares a local item value array LVL of tabular data of the aggregation source.
  • Each arithmetic unit transmits the local item value array LVL of the tabular data of the tabulation source, together with the block number array BlkNo and the global item value sequence number array GVOrd of the tabular data of the tabulation result, as local data to the arithmetic unit that executes the first-stage merge processing.
  • The local data prepared in each block by each arithmetic unit is then merged between the blocks.
  • In the merging between blocks, the data prepared in the respective blocks are merged, and new merged data is generated as a whole.
  • the local item value array LVL of the tabular data of the summation source is merged in ascending order of the local item values, and the elements of the global item value sequence number array GVOrd of the tabular data of the tabulation result and the block number array A set of BlkNo elements is merged using the global item value sequence number as a key.
  • the item information related to the tabular data of the tabulation source and the item value information related to the tabular data of the tabulation result are merged independently.
  • the merge process in the tabular data search process according to the embodiment of the present invention can be realized by the same hierarchical structure as the hierarchical structure shown in FIG.
  • As the first-stage merge processing, the merge processing of block 0 and block 1 by the arithmetic unit 0, the merge processing of block 2 and block 3 by the arithmetic unit 2, the merge processing of block 4 and block 5 by the arithmetic unit 4, and the merge processing of block 6 and block 7 by the arithmetic unit 6 are executed.
  • The second-stage merge processing is realized by the merge processing of blocks 0-1 and blocks 2-3 by the arithmetic unit 1, and the merge processing of blocks 4-5 and blocks 6-7 by the arithmetic unit 5.
  • As the third-stage merge processing, the arithmetic unit 3 executes the merge processing of blocks 0-3 and blocks 4-7.
  • the arithmetic units in charge of the merge process in each pipeline stage are not limited to the above combinations.
  • the distribution process may be executed by, for example, an arithmetic unit 7 different from the arithmetic unit 3.
  • Each arithmetic unit merges information related to a pair of blocks and generates information related to one block of the next higher layer. The merge process is therefore realized by a parallel operation of a plurality of arithmetic units. Each arithmetic unit likewise merges information about a pair of already merged blocks belonging to the same layer and generates information about one block of the next higher layer. By repeating the merge processing in parallel and hierarchically in this way, information on one block at the top layer is finally generated.
  • The one block in the uppermost layer is a block including all the records.
  • The arithmetic unit 3 finally generates one local item value number array (that is, a virtual global item value number array) related to the tabular data of the aggregation source, and one global item value sequence number array GVOrd and one block number array BlkNo related to the tabular data of the aggregation result.
  • The item information related to the tabular data of the aggregation result is arranged in the order of the global item value sequence numbers (in this example, ascending order).
  • The merge process in the tabular data tabulation process according to the embodiment of the present invention can be realized in exactly the same manner as the inter-block sort process 1 (inter-block merge process) in the tabular data sort process, and is therefore not described in further detail.
  • a local item value array LVL included in each block is created.
  • the process for generating the local item value array is also called a distribution process.
  • At least one arithmetic unit (arithmetic unit PE-7 in this example) takes out, from the virtual global item value array, the item value designated by the global item value sequence number corresponding to the subscript i of the block number array BlkNo, and distributes this item value to the block designated by the block number.
  • The arithmetic unit that receives a distributed item value sequentially stores the received item value in the local item value array LVL on its local memory.
  • Here, LVL[j][k] denotes the k-th element of the local item value array of block j, Offsets[j] denotes the write pointer k used when setting an item value in the local item value array LVL[j], and GVL denotes the virtual global item value array (to distinguish it from the local item value array of each block); the distribution process is described in terms of these arrays.
  • a plurality of arithmetic units may be in charge of distribution processing.
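  • The distribution of item values to the blocks of the tabulation result can be sketched as follows. The sketch assumes that the merged arrays GVOrd and BlkNo are parallel and ordered by global item value sequence number; the sample data and the function name are hypothetical, and `list.append` stands in for writing at `LVL[j][Offsets[j]]` and advancing the write pointer.

```python
def distribute_item_values(GVL, GVOrd, BlkNo, num_blocks):
    """Build each block's local item value array LVL from merged GVOrd/BlkNo pairs."""
    LVL = [[] for _ in range(num_blocks)]
    for g, j in zip(GVOrd, BlkNo):
        LVL[j].append(GVL[g])   # equivalent to LVL[j][Offsets[j]] = GVL[g]; Offsets[j] += 1
    return LVL

GVL = ["East", "North", "South", "West"]   # hypothetical virtual global item values
GVOrd = [0, 1, 1, 2, 3, 3]                 # merged in ascending order
BlkNo = [0, 0, 1, 1, 1, 2]                 # destination block of each entry
LVL = distribute_item_values(GVL, GVOrd, BlkNo, 3)
```

  • An item value needed by several blocks is sent once per block, so each local array stays free of duplicates and in ascending order.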
  • The item information related to the dimensions of the aggregation, that is, the local item value number array VNo, the global item value sequence number array GVOrd, and the local item value array LVL, may also be obtained independently by each arithmetic unit that shares and holds the tabular data of the aggregation result, for example by acquiring from the other arithmetic units the global item value sequence number array GVOrd and the local item value array LVL of the tabular data of the tabulation source that those units hold.
  • a global item value array GVL is defined for the measure item.
  • the global item value array GVL is a virtual item value array obtained by merging item values existing in each block without duplication.
  • “virtual” means that it is not necessary to actually create it.
  • the data type of the virtual global item value array GVL is the same as the data type of the local item value array LVL, and has various data types such as a character string type, an integer type, and a floating point type.
  • The item values stored in the global item value array GVL are arranged in a predetermined order (for example, ascending order). As described above, the stored item values do not overlap. Further, the size of the global item value array GVL is the maximum value + 1 of the values stored in the global item value sequence number arrays GVOrd existing in the blocks.
  • A measure is created by aggregating values for each set of dimension values. Therefore, the measure creation process sorts the tabular data of the aggregation source by the set of dimension values, that is, by the set of item value numbers corresponding to the dimension values, in ascending order of the item values, calculates the measure values, and creates item information for each item from the calculated measure values.
  • FIGS. 44A to 44J are explanatory diagrams of the sorting processing of a set of dimension values for creating a measure.
  • In this sort processing, the item value access information array LOrd is first sorted with respect to the second dimension, which has a low priority, and is then sorted again with respect to the first dimension, which has a high priority.
  • An example for block 0 of the tabular data of the tabulation source is described.
  • The sorting process uses a well-known counting sort.
  • FIGS. 44A to 44D are explanatory diagrams of the process of counting up the local item value numbers.
  • The count array Count generated by the count-up is accumulated to create a cumulative number array Aggr.
  • The array Aggr is created by shifting the elements of the Count array backward by one position and accumulating them.
  • The item value access information array LOrd′ is created by transferring the elements of the item value access information array LOrd using the elements of the cumulative number array Aggr as pointers.
  • FIG. 44H shows the item value access information array LOrd′ generated by the sort processing for the second dimension.
  • When the item value access information array LOrd′ is then sorted with respect to the first dimension, which has a high priority, the item value access information array LOrd″ is obtained.
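  • The two-pass counting sort can be sketched for a single block as follows. The arrays are hypothetical and are not the data of FIGS. 44A to 44J; the sketch assumes that `VNo[LOrd[i]]` yields the local item value number used as the sort key.

```python
def counting_sort(LOrd, VNo, num_values):
    """Stable counting sort of LOrd keyed by the local item value number VNo[LOrd[i]]."""
    Count = [0] * num_values
    for access in LOrd:                 # count-up of the local item value numbers
        Count[VNo[access]] += 1
    # Aggr: Count shifted back by one position and accumulated
    Aggr, running = [], 0
    for c in Count:
        Aggr.append(running)
        running += c
    result = [None] * len(LOrd)
    for access in LOrd:                 # transfer using Aggr as write pointers
        key = VNo[access]
        result[Aggr[key]] = access
        Aggr[key] += 1
    return result

# Hypothetical block with four records and two dimensions
VNo_dim1 = [1, 0, 1, 0]
VNo_dim2 = [1, 1, 0, 0]
LOrd = [0, 1, 2, 3]
LOrd1 = counting_sort(LOrd, VNo_dim2, 2)    # low-priority dimension first
LOrd2 = counting_sort(LOrd1, VNo_dim1, 2)   # then the high-priority dimension
```

  • The second pass must be stable for the result of the first pass to survive among records that share the same first-dimension value, which counting sort guarantees.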
  • Next, the dimension space sequence number array CubeAdr and the measure item value array storing the measure values corresponding to the dimension space sequence numbers are created.
  • The sequence number in the dimension space is a sequence number assigned to each set of dimension item value numbers in the sort order of the dimension values.
  • The dimension space sequence number is calculated as (the GVOrd value of the first dimension) × (the size of the virtual global item value array of dimension 2) + (the GVOrd value of the second dimension).
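  • For two dimensions, the calculation of the dimension space sequence number can be sketched as below, using the 4 x 3 dimension space of the example. The function name `cube_address` is hypothetical.

```python
def cube_address(gvord_dim1, gvord_dim2, size_dim2):
    """Dimension space sequence number of a pair of global item value sequence numbers."""
    return gvord_dim1 * size_dim2 + gvord_dim2

size_dim1, size_dim2 = 4, 3   # sizes of the virtual global item value arrays
pairs = [(a, b) for a in range(size_dim1) for b in range(size_dim2)]
addresses = [cube_address(a, b, size_dim2) for a, b in pairs]
```

  • Enumerating the pairs in sorted order yields consecutive sequence numbers, so every row of the tabulation result corresponds to exactly one pair.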
  • FIG. 45 shows the creation processing of the sequence number array in the dimensional space and the measure item value array.
  • the computation unit calculates an aggregate value for each order number in the dimension space.
  • the number of occurrences of the measure value Count and the sum Sum of the measure values are calculated.
  • FIG. 46 shows a process of creating a measure appearance count array Count and a measure sum array Sum from the sequence number array CubeAdr in the dimension space and the measure item value array wVL. In this example, if a value of CubeAdr is not duplicated, 1 is stored in the Count array and the value of wVL is stored in the Sum array.
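  • The creation of the appearance count array and the sum array within one block can be sketched as follows. The sketch assumes that CubeAdr is already in ascending order, as it is after the dimension sort; the sample values and the function name are hypothetical.

```python
def aggregate_block(CubeAdr, wVL):
    """Collapse sorted CubeAdr/wVL into unique addresses with Count and Sum arrays."""
    CubeAdr_u, Count, Sum = [], [], []
    for adr, value in zip(CubeAdr, wVL):
        if CubeAdr_u and CubeAdr_u[-1] == adr:
            Count[-1] += 1          # same dimension space sequence number again
            Sum[-1] += value
        else:
            CubeAdr_u.append(adr)   # first occurrence: count 1, sum = value
            Count.append(1)
            Sum.append(value)
    return CubeAdr_u, Count, Sum

adr, cnt, tot = aggregate_block([0, 0, 4, 7, 7, 7], [10, 12, 9, 11, 13, 12])
```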
  • the array Count and the array Sum are merged between a plurality of blocks using the value of CubeAdr ′ as a key.
  • the merge process is realized by a pipeline process performed by a merge unit of a plurality of arithmetic units in a tournament manner.
  • the order number array CubeAdr ′ in the dimension space, the appearance count array Count, and the sum array Sum are used as local data, and the appearance count and sum associated with the same order number in the dimension space are merged.
  • the elements of the local data are arranged in ascending order of the sequence numbers in the dimension space.
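  • One pairwise step of this merge can be sketched as follows, with the local data of each block held as a list of (sequence number in the dimension space, appearance count, sum) tuples in ascending order. The data is hypothetical, and the standard `heapq.merge` is used for the ordered merge.

```python
import heapq

def merge_aggregates(left, right):
    """Merge two lists of (CubeAdr, Count, Sum) sorted by CubeAdr, adding equal keys."""
    merged = []
    for adr, cnt, tot in heapq.merge(left, right):
        if merged and merged[-1][0] == adr:
            prev = merged[-1]
            merged[-1] = (adr, prev[1] + cnt, prev[2] + tot)   # combine same key
        else:
            merged.append((adr, cnt, tot))
    return merged

block_a = [(0, 2, 22), (4, 1, 9)]
block_b = [(4, 1, 8), (7, 3, 36)]
result = merge_aggregates(block_a, block_b)
```

  • Because counts and sums are both additive, the same step can be repeated at every stage of the tournament; an average, if needed, is derived only at the end from the final sum and count.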
  • the appearance count array Count and the sum array Sum merged by the tournament method are converted into the tabular data item information of the tabulation result using the order information of the tabular data of the tabulation result already created.
  • FIG. 48A shows the order information of the tabular data of the tabulation results, FIG. 48B is an explanatory diagram of the compile process for the appearance count array Count, and FIG. 48C is an explanatory diagram of the compile process for the sum array Sum.
  • The compile process has already been described in this specification and will not be described in further detail.
  • The tabular data of the tabulation results shown in FIGS. 39D and 39E is created by the above tabulation processing.

Abstract

The invention relates to a distributed memory multiprocessor comprising a plurality of arithmetic units, each having its own dedicated local memory and processor, which are communicatively connected so as to send and receive data to and from one another. A data set is divided into a plurality of blocks and manipulated by pipeline processing of the arithmetic units. Each arithmetic unit receives individual local data from one or more arithmetic units in the preceding pipeline stage, converts the local data into other local data, and transmits it to an arithmetic unit in the following pipeline stage. The arithmetic units can be dynamically interconnected in the form of a tournament so as to finally generate one piece of global data, and at least one of them assigns the global data to the multiple arithmetic units on the basis of block numbers.
PCT/JP2008/063660 2008-07-30 2008-07-30 Method for operating tabular data, distributed memory multiprocessor, and program WO2010013320A1

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2008/063660 WO2010013320A1 2008-07-30 2008-07-30 Method for operating tabular data, distributed memory multiprocessor, and program

Publications (1)

Publication Number Publication Date
WO2010013320A1 2010-02-04

Family

ID=41610044

Country Status (1)

Country Link
WO (1) WO2010013320A1 (fr)

Legal Events

Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 08791891; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 08791891; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)