Parallel computing array architecture composed of multiple CELL processors
Technical field
The invention belongs to the fields of FPGA and parallel computing, and specifically relates to a parallel computing array based on the CELL processor.
Background technology
Beginning in March 2001, Sony, Sony Computer Entertainment, Toshiba, and IBM of the United States jointly invested in and developed the CELL processor, a processor intended for high-speed computation. It is designed on the PowerPC architecture with a RISC instruction set and features a high clock frequency and high execution efficiency. It is used mainly in the PlayStation 3 and in blade servers. The CELL processor has one processor core, comprising one PPE simplified from the PowerPC 970 and eight coprocessors called SPEs, and its operating frequency exceeds 4 GHz. The CELL processor is a 64-bit Power processor with eight built-in cooperating processing units; it is capable of distributed computing, and a single processor can run multiple operating systems.
Beyond high performance, the two highlights of CELL are its highly flexible design and its support for distributed computing. CELL is suitable for nearly all computing devices, from embedded devices to mainframe computers, and is therefore designed as a general-purpose processor platform. For workstation or server systems, IBM can directly integrate two CELL processors to obtain higher performance; for mainframe computers, CELL can be configured as an "MCM module" comprising four independent processors. Building arrays of multiple CELL processors for high-performance parallel computing has therefore become an important direction of current research.
Summary of the invention
The object of the present invention is to provide a parallel computing array architecture that improves the utilization of the CELL processors, balances load, and reduces power consumption as far as possible. The array architecture is oriented toward FPGA software systems and is suitable for FPGA structures of varying complexity, for performing high-performance general-purpose packing and FPGA placement and routing.
The parallel computing array architecture proposed by the present invention is built from multiple CELL processors. Specifically, it redefines the connection mode between CELL processors, the communication mode between CELL processors, and the thread scheduling and load balancing mode between CELL processors, wherein:
The connection mode connects a plurality of CELL processors through the broadband engine interface to form an array.
The communication mode performs memory coupling between the CELL processors through the broadband engine interface BIF protocol, and thereby carries out communication between the CELL processors.
The thread scheduling and load balancing mode uses software to decompose the original single-CELL multithreaded implementation into multiple tasks and multiple programs, which are distributed and scheduled for execution on the CELL array to realize large-scale data-parallel computation.
The concrete operations are as follows:
A) A plurality of CELL processors are connected through the broadband engine interface to form an array comprising several CELL processors;
B) Software is configured for communication between the CELL processors, realizing memory coupling between the CELL processors through the broadband engine interface BIF protocol;
C) Software is configured for thread scheduling and load balancing on the CELL array, so that the array supports large-scale parallel computation.
The parallel computing array architecture of the present invention can perform high-performance parallel computation at low power consumption. The array architecture is oriented toward FPGA software systems and is suitable for FPGA structures of varying complexity, for performing high-performance general-purpose packing and FPGA placement and routing.
Description of drawings
Fig. 1 is a model diagram of the parallel computing array architecture of the present invention.
Embodiment
The basic embodiment of the parallel computing array architecture proposed by the present invention is as follows: using the broadband engine interface already present in the CELL processor, a plurality of CELL processors are connected to form an array, and software configuration realizes memory coupling within the array, uncoupled I/O transfers with the outside, and load balancing. In the software configuration, the memory-coupling BIF protocol of the CELL broadband engine interface is first used so that each CELL processor in the array can connect to the remaining CELL processors in the array through IOIF0; second, an optimized scheduling algorithm schedules the computing load among the CELL processors in the array to achieve load balance. The completed CELL array architecture takes one of the CELL processors as its core, which is responsible for task analysis and thread scheduling. Using the configuration software and the scheduling algorithm, a large-scale data-parallel computation is packaged into programs suited to computation on a single CELL processor, which are distributed to the CELL processors in the array for execution. Each CELL processor in the array keeps its original architecture; since a single CELL processor already has powerful parallel computing capability, the CELL array can multiply that capability.
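The packaging step described above can be illustrated with a minimal sketch. It is not the configuration software of the invention: the names Task, package_tasks, run_task and run_on_array are hypothetical, the chunk size is an assumed tuning parameter, and the per-task computation (a sum of squares) merely stands in for whatever a single CELL processor would actually execute.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Task:
    """One unit of work sized for a single CELL processor (hypothetical type)."""
    task_id: int
    data: List[float]


def package_tasks(data: Sequence[float], chunk_size: int) -> List[Task]:
    """Split a large data set into tasks of at most chunk_size elements each."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        Task(task_id=start // chunk_size, data=list(data[start:start + chunk_size]))
        for start in range(0, len(data), chunk_size)
    ]


def run_task(task: Task) -> float:
    """Stand-in for the program one processor runs: a partial sum of squares."""
    return sum(x * x for x in task.data)


def run_on_array(data: Sequence[float], chunk_size: int) -> float:
    """Core processor's view: package the job, run each task, combine the results.

    The tasks run sequentially here; on a real array each would be sent to a
    different CELL processor.
    """
    return sum(run_task(task) for task in package_tasks(data, chunk_size))
```

In this sketch the core processor only splits the data and combines partial results, which mirrors the division of roles described above; how the tasks are assigned to particular processors is a separate scheduling decision.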
The details are as follows:
A) A plurality of CELL processors are connected through the broadband engine interface to form an array comprising several CELL processors, and memory coupling between the CELL processors is realized through the broadband engine interface BIF protocol;
B) One CELL processor in the array is the control processor, which is responsible for creating tasks and distributing them to the other processors for execution according to the load of each processor in the array;
C) The load factor of a CELL processor is the weighted mean of three factors: the data traffic between the PPE and the SPEs per unit time, the data traffic between the PPE and the other processors in the array, and the SPE occupancy. The weights of the three factors can be adjusted according to the needs of different applications. The control processor assigns each newly created task to the processor with the lowest load factor;
D) A task is a multithreaded program comprising one or more PPE threads and a plurality of SPE threads;
E) A background program running on each CELL processor in the array monitors the processor's communication and computational load and reports them to the control processor;
F) The background program on each CELL processor receives tasks from the control processor and starts them running.
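The load factor and the assignment rule in the steps above can be sketched as follows. This is an illustration under stated assumptions, not the scheduling algorithm of the invention: the names LoadReport, load_factor and pick_processor are hypothetical, the three metrics are assumed to be normalized to the range 0 to 1 before weighting, and equal weights are an arbitrary default.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class LoadReport:
    """Metrics a background program might report (assumed normalized to 0..1)."""
    ppe_spe_traffic: float     # data traffic between the PPE and the SPEs per unit time
    inter_cell_traffic: float  # data traffic between the PPE and other processors in the array
    spe_occupancy: float       # fraction of SPEs currently in use


def load_factor(report: LoadReport,
                weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted mean of the three factors; the weights are application-tunable."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    w_ppe_spe, w_inter, w_occ = weights
    weighted = (w_ppe_spe * report.ppe_spe_traffic
                + w_inter * report.inter_cell_traffic
                + w_occ * report.spe_occupancy)
    return weighted / total


def pick_processor(reports: Dict[int, LoadReport],
                   weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> int:
    """Return the id of the processor with the lowest load factor."""
    if not reports:
        raise ValueError("no load reports available")
    return min(reports, key=lambda pid: load_factor(reports[pid], weights))
```

Changing the weights can change which processor is chosen: a processor with heavy SPE occupancy but little traffic may be preferred under equal weights and rejected when only occupancy is weighted, which is the adjustment to application needs that the steps above allow for.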