CN102184093A

CN102184093A - Parallel computing array framework constructed by multi CELL processors

Info

Publication number: CN102184093A
Application number: CN2011101588623A
Authority: CN
Inventors: 周学功; 王伶俐; 曹伟; 叶晓敏
Original assignee: Fudan University
Current assignee: Fudan University; Shanghai Redneurons Co Ltd
Priority date: 2011-06-14
Filing date: 2011-06-14
Publication date: 2011-09-14

Abstract

The invention belongs to the technical field of FPGAs (field-programmable gate arrays) and parallel computing arrays, and in particular relates to a parallel computing array architecture built from multiple CELL processors. The architecture uses the existing broadband engine interface of the CELL to connect a plurality of CELL processors into an array, and uses software configuration to realize memory coupling inside the array, non-coupled I/O (input/output) transfer with the outside, and load balancing. On the software configuration side, the memory-coupled CELL broadband engine interface (BIF) protocol of the CELL processor lets each CELL processor in the array connect to the remaining CELL processors through IOIF0; an optimized scheduling algorithm then distributes the computing load among the CELL processors in the array so that the load is balanced. The architecture enables optimized thread scheduling among the CELL processors, improves CELL processor utilization, balances the load, and reduces power consumption as far as possible.

Description

Parallel computing array architecture built from multiple CELL processors

Technical field

The invention belongs to the field of FPGAs and parallel computing, and specifically relates to a parallel computing array based on CELL processors.

Background technology

The CELL processor is a processor for high-speed computation, funded by Sony starting in March 2001 and jointly developed by Sony, Sony Computer Entertainment, Toshiba, and IBM. It is designed on the PowerPC architecture with a RISC instruction set and features a high clock frequency and high execution efficiency. It is mainly used in the PlayStation 3 and in blade servers. The CELL processor has one processor core (comprising one PPE, simplified from the PowerPC 970, and eight coprocessors called SPEs), and its operating frequency exceeds 4 GHz. The CELL processor is a 64-bit Power processor with eight built-in processing units that cooperate with each other; it is capable of distributed computing, and a single processor can run multiple operating systems.

Besides high performance, a highly flexible design and distributed computing are two highlights of CELL. CELL is suitable for nearly all computing devices, from embedded devices to mainframe computers, so it is designed as a general-purpose processor platform. For workstation or server systems, IBM can directly integrate two CELL processors to obtain higher performance; for mainframe computers, CELL can be configured as an "MCM module" containing four independent processors. Using multiple CELL processors to build arrays for high-performance parallel computing has therefore become an important direction of current research.

Summary of the invention

The object of the present invention is to provide a parallel computing array architecture that improves the utilization of CELL processors, balances the load, and reduces power consumption as far as possible. The array architecture targets FPGA software systems, is applicable to FPGA structures of various complexity, and performs high-performance general-purpose packing and FPGA placement and routing.

The parallel computing array architecture proposed by the present invention is built from multiple CELL processors. Specifically, it redefines the connection mode between CELL processors, the communication mode between CELL processors, and the thread scheduling and load balancing mode between CELL processors, wherein:

The connection mode connects a plurality of CELL processors through the broadband engine interface to form an array.

The communication mode performs memory coupling between CELL processors through the broadband engine interface (BIF) protocol and carries the communication between CELL processors.

The thread scheduling and load balancing mode takes what was originally multithreaded execution on a single CELL, decomposes it in software into multiple tasks and multiple programs, and distributes and schedules these onto the CELL array for execution, realizing large-scale data-parallel computing.

The specific steps are as follows:

A) A plurality of CELL processors are connected through the broadband engine interface to form an array comprising several CELL processors;

B) Software is configured for communication between the CELL processors, realizing memory coupling between the CELL processors through the broadband engine interface (BIF) protocol;

C) Software is configured for thread scheduling and load balancing of the CELL array, to support large-scale parallel computing on the array.

The parallel computing array architecture of the present invention can perform high-performance parallel computing under low power consumption. The array architecture targets FPGA software systems, is applicable to FPGA structures of various complexity, and performs high-performance general-purpose packing and FPGA placement and routing.

Description of drawings

Fig. 1 is a model diagram of the parallel computing array architecture of the present invention.

Embodiment

The basic implementation of the parallel computing array architecture proposed by the present invention is as follows. The existing broadband engine interface of the CELL is used to connect a plurality of CELL processors into an array, and software configuration realizes memory coupling inside the array, non-coupled I/O transfer with the outside, and load balancing. On the software configuration side, the memory-coupled CELL broadband engine interface (BIF) protocol of the CELL processor is first used so that each CELL processor in the array can connect to the remaining CELL processors in the array through IOIF0. An optimized scheduling algorithm then schedules the computing load among the CELL processors in the array to achieve load balance. The resulting CELL array architecture takes one of the CELL processors as its core, which is responsible for task analysis and thread scheduling. Using the configuration software and the scheduling algorithm, large-scale parallel data operations are packaged into programs suited to computation on a single CELL processor and assigned to the individual CELL processors in the array for execution. Each CELL processor in the array keeps its original architecture; because a single CELL processor already has powerful parallel computing capability, the CELL array multiplies that parallel computing capability.
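The packaging and distribution described above can be illustrated with a minimal sketch. It is not taken from the patent: the helper names (`split_into_tasks`, `run_on_cell`, `run_on_array`), the chunk size, and the sum-of-squares workload are all hypothetical, and Python threads merely stand in for the CELL processors of the array.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tasks(data, chunk_size):
    """Package a large data set into chunks sized for one processor (hypothetical helper)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def run_on_cell(chunk):
    """Stand-in for the program a single CELL processor would run on its chunk."""
    return sum(x * x for x in chunk)


def run_on_array(data, num_cells, chunk_size):
    """Control-processor role: split the job, hand chunks to workers, combine results."""
    tasks = split_into_tasks(data, chunk_size)
    # Assumption for illustration: worker threads play the part of the array's processors.
    with ThreadPoolExecutor(max_workers=num_cells) as pool:
        partials = list(pool.map(run_on_cell, tasks))
    return sum(partials)


# Example: a 1000-element job spread over a simulated array of 4 processors.
total = run_on_array(list(range(1000)), num_cells=4, chunk_size=128)
```

In this sketch the chunks are handed out in order; the patent instead assigns each task according to the current load of each processor.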

The details are as follows:

A) A plurality of CELL processors are connected through the broadband engine interface to form an array comprising several CELL processors, and memory coupling between the CELL processors is realized through the broadband engine interface (BIF) protocol;

B) One CELL processor in the array is the control processor, which is responsible for creating tasks and, according to the load of each processor in the array, distributing the tasks to the other processors to run;

C) The load factor of a CELL processor is the weighted average of three factors measured per unit time: the data traffic between the PPE and the SPEs, the data traffic between the PPE and the other processors in the array, and the SPE occupancy. The weights of the three factors can be adjusted to suit different applications. The control processor assigns each newly created task to the processor with the lowest load factor;

D) A task is a multithreaded program comprising one or more PPE threads and a plurality of SPE threads;

E) A background program runs on each CELL processor in the array; it monitors the processor's communication and computational load and reports them to the control processor;

F) The background program on each CELL processor receives tasks from the control processor and starts them running.
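The load factor and dispatch rule in the list above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the class and field names are hypothetical, the three factors are assumed to be pre-normalized to the range 0..1, and the weighted average is taken as (w1·a + w2·b + w3·c) / (w1 + w2 + w3).

```python
from dataclasses import dataclass


@dataclass
class LoadReport:
    """What a node's background program reports (hypothetical field names)."""
    ppe_spe_traffic: float     # PPE <-> SPE data traffic per unit time, assumed normalized to 0..1
    inter_cell_traffic: float  # PPE <-> other processors' traffic per unit time, assumed normalized
    spe_occupancy: float       # fraction of SPE capacity in use, 0..1


@dataclass
class Task:
    """A multithreaded program: one or more PPE threads plus several SPE threads."""
    name: str
    ppe_threads: int = 1
    spe_threads: int = 8


def load_factor(report, weights=(1.0, 1.0, 1.0)):
    """Weighted average of the three factors; weights are tunable per application."""
    w1, w2, w3 = weights
    weighted = (w1 * report.ppe_spe_traffic
                + w2 * report.inter_cell_traffic
                + w3 * report.spe_occupancy)
    return weighted / (w1 + w2 + w3)


class ControlProcessor:
    """Collects load reports and sends each new task to the least-loaded node."""

    def __init__(self, weights=(1.0, 1.0, 1.0)):
        self.weights = weights
        self.reports = {}   # node id -> latest LoadReport
        self.assigned = {}  # node id -> tasks sent to that node

    def receive_report(self, node_id, report):
        self.reports[node_id] = report
        self.assigned.setdefault(node_id, [])

    def dispatch(self, task):
        # Pick the node whose latest report gives the lowest load factor.
        node_id = min(self.reports,
                      key=lambda n: load_factor(self.reports[n], self.weights))
        self.assigned[node_id].append(task)
        return node_id


# Example with made-up reports; SPE occupancy is weighted twice as heavily.
ctrl = ControlProcessor(weights=(1.0, 1.0, 2.0))
ctrl.receive_report("cell-1", LoadReport(0.6, 0.2, 0.9))
ctrl.receive_report("cell-2", LoadReport(0.3, 0.4, 0.5))
ctrl.receive_report("cell-3", LoadReport(0.8, 0.1, 0.25))
chosen = ctrl.dispatch(Task("fpga-placement"))
```

With these sample figures the load factors are 0.65, 0.425, and 0.35, so the task goes to `cell-3`. Changing the weights changes which node is chosen, which is the adjustment the text allows for different applications.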

Claims

1. A parallel computing array architecture built from multiple CELL processors, characterized in that it is built from multiple CELL processors and redefines the connection mode between CELL processors, the communication mode between CELL processors, and the thread scheduling and load balancing mode between CELL processors, wherein:

the connection mode connects a plurality of CELL processors through the broadband engine interface to form an array;

the communication mode performs memory coupling between CELL processors through the broadband engine interface (BIF) protocol and carries the communication between CELL processors.

2. The parallel computing array architecture built from multiple CELL processors according to claim 1, characterized in that:

one CELL processor in the array is the control processor, which is responsible for creating tasks and, according to the load of each processor in the array, distributing the tasks to the other processors to run;

a background program runs on each CELL processor in the array, monitors the processor's communication and computational load, and reports them to the control processor; the background program also receives tasks from the control processor and starts them running.