CN101505319B - Method for accelerating adaptive reconfigurable processing unit array system based on network - Google Patents

Method for accelerating adaptive reconfigurable processing unit array system based on network

Info

Publication number
CN101505319B
Authority
CN
China
Prior art keywords
node
task
reconfigurable
device
processing unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2009100959563A
Other languages
Chinese (zh)
Other versions
CN101505319A (en)
Inventor
胡威
吴斌斌
冯德贵
王超
曹满
马建良
陈度
王罡
施青松
陈天洲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN2009100959563A priority Critical patent/CN101505319B/en
Publication of CN101505319A publication Critical patent/CN101505319A/en
Application granted granted Critical
Publication of CN101505319B publication Critical patent/CN101505319B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Multi Processors (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a method for accelerating a network-based adaptive reconfigurable processing unit array system. In this method, an adaptive reconfigurable processing unit network is formed from computing nodes, each consisting of a general-purpose processor and a reconfigurable device. A program to be executed is divided into a set of tasks that can run independently. Each task in the set is allocated through the adaptive reconfigurable processing unit network to a suitable computing resource for execution, and the reconfigurable device, which is both efficient and flexible, can be reconfigured so that its function meets the needs of different tasks. Through the adaptive reconfigurable processing unit network, the system can allocate tasks intelligently according to the state of the nodes in the network, giving priority to other nodes with idle computing resources, so as to reduce program execution time, improve the utilization of the reconfigurable devices, and accelerate the program.

Description

Method for accelerating a network-based adaptive reconfigurable processing unit array system
Technical field
The present invention relates to the fields of multi-core technology and reconfigurable technology, and in particular to a method by which reconfigurable multi-core nodes communicating over a network accelerate program execution.
Background art
With the continuous improvement of computer manufacturing technology, very-large-scale integration has made great progress, and multi-core has become the mainstream processor technology. However, as the number of processor cores keeps growing, improving the utilization of these cores has become a difficult problem.
Because the parallelism of typical applications is limited, it is generally believed that once the number of general-purpose cores exceeds 16, adding more general-purpose cores rarely brings further performance gains.
A heterogeneous multi-core processor can integrate special-purpose cores customized for particular tasks and thereby achieve high performance for the targeted applications. However, such a customized heterogeneous multi-core processor performs well only when it runs the applications it was designed for; otherwise it offers no advantage over general-purpose cores.
The emergence of reconfigurable technology offers another solution: integrating an FPGA-based reconfigurable core with a general-purpose processor. This keeps the flexibility with which a general-purpose processor adapts to most applications while obtaining the high performance and efficiency of a special-purpose processor. In addition, a reconfigurable core can support many different types of applications through reconfiguration, so it can be applied to a wide variety of programs more flexibly.
However, an isolated node lacks the capacity and flexibility to handle some applications:
Insufficient reconfigurable resources: when an application needs more reconfigurable resources than a single isolated node has, some of the application's requests must be suspended and can proceed only after the occupied reconfigurable resources have been released and reconfigured, which reduces performance and efficiency.
Wasted reconfigurable resources: when an application is not suited to the reconfigurable device, the node's reconfigurable resources sit idle.
High cost of frequent reconfiguration: when the functional modules that an application calls frequently cannot all reside on the reconfigurable device at the same time, the modules on the device must be swapped repeatedly, and the resulting frequent reconfiguration lowers performance and efficiency.
Summary of the invention
The object of the present invention is to provide a network-based acceleration method for an adaptive reconfigurable processing unit array.
The technical solution adopted by the present invention to solve its technical problem is as follows:
1) Task partitioning of the original program:
The system divides a program into a set of tasks that can run independently;
2) Building the network-based adaptive reconfigurable processing unit array:
The network-based adaptive reconfigurable processing unit array consists of n nodes, n ∈ {1, 2, 3, ...}. Each node has 0 to 4 neighbor nodes directly connected to it by Ethernet. Each node consists of two parts: the first is a general-purpose computer, and the second is the reconfigurable device NetFPGA;
The reconfigurable device is initialized according to the tasks that will execute on it;
3) Task allocation:
For any node in the network-based adaptive reconfigurable processing unit array, the node itself is the local node and all other nodes are remote nodes;
Task allocation assigns the tasks in the program's task set to the computing resources of the local node and of remote nodes;
4) Changing the function of the reconfigurable device:
For the reconfigurable device of any of the n nodes, if the device's resources are idle and the device does not currently have the logic function module required by the task assigned to it, the device needs to be reconfigured with that functional module;
5) Task execution:
After the tasks have been allocated, they are executed, and the execution result is returned when each task finishes.
The steps for building the network-based adaptive reconfigurable processing unit array are as follows:
1) The array is built on a high-speed local area network with a transmission rate of 100 Mb/s or 1000 Mb/s;
2) The array consists of n nodes, n ∈ {1, 2, 3, ...}, where each node consists of two parts: the first is a general-purpose computer, and the second is a reconfigurable device;
3) The reconfigurable device NetFPGA has 4 Ethernet ports, so the reconfigurable device of one node can be directly connected to, and communicate with, at most four other nodes over the high-speed local area network. Directly connected nodes are called neighbor nodes, and a node can communicate with non-neighbor nodes through its neighbor nodes;
4) The two parts of a node, the general-purpose computer and the reconfigurable device, are connected through the Peripheral Component Interconnect (PCI) interface;
5) The control module SuperBlock on the reconfigurable device is customized;
6) The functional modules on the reconfigurable device are initialized.
The steps for changing the function of the reconfigurable device are as follows:
1) Before the whole network-based adaptive reconfigurable processing unit array system starts working, the configuration for the tasks that will run on the reconfigurable device is generated for the device;
2) After the system starts working, tasks are allocated dynamically. If the logic function module required by a task that the reconfigurable device of any of the n nodes is about to run is not present on the device, the device sends a reconfiguration request to the host of its own node;
3) After the host receives the reconfiguration request sent by its own node's reconfigurable device, the host calls a program to reconfigure that device.
The control module SuperBlock customized on the reconfigurable device acts as the communication controller between the reconfigurable device and the general-purpose computer inside any of the n nodes; acts as the communication controller between the reconfigurable device of any of the n nodes and the reconfigurable devices of its neighbor nodes; and, while the system is running, manages the allocation of the tasks that the reconfigurable device receives, processes those tasks, and records the state of each task.
Compared with the background art, the present invention has the following beneficial effects:
The present invention is a method by which reconfigurable multi-core nodes communicating over a network accelerate program execution. Its main function is to use reconfigurable devices to strengthen the computing power of general-purpose processors and accelerate the execution of specific functional modules. By communicating over the network to find idle computing resources, it relieves the pressure on nodes that lack computing resources and also raises the resource utilization of all nodes across the network.
(1) Adaptivity: the system can allocate tasks intelligently according to the state of the nodes on the network. Tasks waiting for resources on a busy node can be allocated to idle nodes.
(2) High efficiency: because the system allocates tasks intelligently according to the state of the nodes on the network, it can use idle resources in the network and thereby improve application performance and efficiency. Experiments show that the system improves performance effectively.
Description of drawings
Fig. 1 is a diagram of the system composition in an example of the present invention.
Fig. 2 shows a node in the system, which consists of a general-purpose computer and one or more reconfigurable devices, with the two parts connected by PCI.
Fig. 3 is a flow chart showing the task allocation flow according to the present invention.
Embodiment
The symbols used in the method are explained as follows:
C1: the task can run on the reconfigurable device;
C2: the general-purpose processor of the local node is idle;
C3: the reconfigurable device of the local node is idle;
C4: some neighbor node has an idle reconfigurable device;
The specific implementation flow of the network-based adaptive reconfigurable processing unit array acceleration system is as follows.
Step 1: dividing the original program into tasks:
1: The system divides a program into two sets of independently runnable tasks, {{A}, {B}}: set A contains the tasks that cannot run on the reconfigurable device, and set B contains the tasks that can run on the reconfigurable device.
2: A dependency table between the tasks is generated according to the dependencies between them.
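The first step can be sketched in Python as follows. This is only a minimal illustration, assuming that set A holds the tasks that cannot run on the reconfigurable device and set B those that can; the `Task` class, its field names, and the example task names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task record; field names are illustrative only."""
    name: str
    runs_on_device: bool                      # condition C1
    depends_on: list = field(default_factory=list)


def partition(tasks):
    """Split tasks into set A (general-purpose only) and set B (device-capable)."""
    set_a = [t for t in tasks if not t.runs_on_device]
    set_b = [t for t in tasks if t.runs_on_device]
    return set_a, set_b


def dependency_table(tasks):
    """Map each task name to the names of the tasks it depends on."""
    return {t.name: list(t.depends_on) for t in tasks}


tasks = [
    Task("load", runs_on_device=False),
    Task("fft", runs_on_device=True, depends_on=["load"]),
    Task("report", runs_on_device=False, depends_on=["fft"]),
]
set_a, set_b = partition(tasks)
table = dependency_table(tasks)
```

A scheduler could consult `table` before releasing a task, so that a task is offered for allocation only after the tasks it depends on have returned their results.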
Step 2: building the network-based adaptive reconfigurable processing unit array:
As shown in Fig. 1, the array is built on a high-speed local area network with a transmission rate of 100 Mb/s or 1000 Mb/s.
The array consists of n nodes, n ∈ {1, 2, 3, ...}, where each node consists of two parts: the first is a general-purpose computer, and the second is a reconfigurable device.
As shown in Fig. 2, the two parts of a node, the general-purpose computer and the reconfigurable device, are connected through the Peripheral Component Interconnect (PCI) interface.
The reconfigurable device NetFPGA has 4 Ethernet ports, so a node can be directly connected to, and communicate with, at most four other nodes over the high-speed local area network. Directly connected nodes are called neighbor nodes, and a node can communicate with non-neighbor nodes through its neighbor nodes.
Customizing the control module SuperBlock on the reconfigurable device means writing a logic function module that is responsible for communication between the node's reconfigurable device and its general-purpose computer and for communication between the node's reconfigurable device and the reconfigurable devices of neighbor nodes. While the system is running, it manages the allocation of the tasks that the reconfigurable device receives, processes those tasks, and records the state of each task. The SuperBlock control module is fixed on the reconfigurable device: when the device is reconfigured, the portion of the device occupied by this control module remains unchanged.
Initializing the functional modules on the reconfigurable device means predicting the tasks that will execute on the device and programming the binary bitstream files of the corresponding functional modules onto the device.
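The neighbor topology in this step can be illustrated with a short Python sketch. It enforces the four-port limit of the NetFPGA and shows one way a node might reach a non-neighbor through its neighbors. The breadth-first search is an assumption made for illustration; the patent does not specify a routing algorithm, and the function names `connect` and `route` are hypothetical.

```python
from collections import deque

MAX_PORTS = 4  # the NetFPGA device has four Ethernet ports


def connect(links, a, b):
    """Add a direct link between nodes a and b, respecting the port limit."""
    links.setdefault(a, set())
    links.setdefault(b, set())
    if b in links[a]:
        return
    if len(links[a]) >= MAX_PORTS or len(links[b]) >= MAX_PORTS:
        raise ValueError("a node can have at most four neighbor nodes")
    links[a].add(b)
    links[b].add(a)


def route(links, src, dst):
    """Return a shortest hop path from src to dst via neighbors, or None.

    Breadth-first search is an illustrative choice, not the patent's method.
    """
    previous = {src: None}
    pending = deque([src])
    while pending:
        node = pending.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(links.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                pending.append(neighbor)
    return None


links = {}
connect(links, 0, 1)
connect(links, 1, 2)
path = route(links, 0, 2)  # node 0 reaches non-neighbor 2 through neighbor 1
```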
Step 3: task allocation:
For any node in the network-based adaptive reconfigurable processing unit array, the node itself is the local node and all other nodes are remote nodes.
Task allocation assigns the tasks in the program's task set to the computing resources of the local node and of remote nodes. Once a task has been allocated, it is removed from the task set it belongs to. Any node in the network-based adaptive reconfigurable processing unit array handles two kinds of tasks: tasks from the local node's task set, and tasks forwarded by neighbor nodes. Fig. 3 shows the allocation flow for both kinds of tasks.
When a task from the local node's task set is allocated:
1: If the task satisfies condition C1, it enters waiting queue 2.
2: If the task does not satisfy condition C1, it enters waiting queue 1.
3: When condition C2 is satisfied, a task taken from waiting queue 1 is executed.
4: When condition C2 is not satisfied, a task taken from waiting queue 1 re-enters waiting queue 1.
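The four rules for local tasks can be sketched as follows. This is a minimal Python illustration, assuming each waiting queue is a simple FIFO; the function names and the boolean arguments standing in for conditions C1 and C2 are hypothetical.

```python
from collections import deque


def enqueue_local(task, can_run_on_device, queue1, queue2):
    """Rules 1 and 2: route a local task by condition C1."""
    if can_run_on_device:        # C1 holds: candidate for a reconfigurable device
        queue2.append(task)
    else:                        # C1 fails: must run on the general-purpose processor
        queue1.append(task)


def dispatch_queue1(queue1, cpu_idle):
    """Rules 3 and 4: take one task from waiting queue 1.

    Returns the task to execute if condition C2 (local processor idle) holds;
    otherwise puts the task back into waiting queue 1 and returns None.
    """
    if not queue1:
        return None
    task = queue1.popleft()
    if cpu_idle:                 # C2 holds
        return task
    queue1.append(task)          # C2 fails: the task waits again
    return None


queue1, queue2 = deque(), deque()
enqueue_local("parse", False, queue1, queue2)
enqueue_local("fft", True, queue1, queue2)
```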
When a task forwarded by a neighbor node is allocated:
1: The task enters waiting queue 2.
2: When condition C3 is satisfied, a task taken from waiting queue 2 is executed.
3: When condition C3 is not satisfied and condition C4 is satisfied, a task taken from waiting queue 2 is forwarded to a neighbor.
4: When neither condition C3 nor condition C4 is satisfied, a task taken from waiting queue 2 re-enters waiting queue 2.
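The handling of waiting queue 2 can be sketched in the same style. This is a minimal Python illustration; the function name, the returned action labels, and the boolean arguments standing in for conditions C3 and C4 are assumptions made for the example.

```python
from collections import deque


def dispatch_queue2(queue2, device_idle, neighbor_device_idle):
    """Take one task from waiting queue 2 and decide what happens to it.

    Returns an (action, task) pair, where action is one of the hypothetical
    labels "execute", "forward", "requeue", or "idle" (queue empty).
    """
    if not queue2:
        return ("idle", None)
    task = queue2.popleft()
    if device_idle:                  # C3 holds: run on the local device
        return ("execute", task)
    if neighbor_device_idle:         # C3 fails, C4 holds: pass to a neighbor
        return ("forward", task)
    queue2.append(task)              # neither holds: wait in queue 2 again
    return ("requeue", task)


queue2 = deque(["fft"])
first = dispatch_queue2(queue2, device_idle=False, neighbor_device_idle=False)
second = dispatch_queue2(queue2, device_idle=False, neighbor_device_idle=True)
```

Because a forwarded task enters waiting queue 2 on the receiving node, the same function would apply there, which is how a task can travel more than one hop toward an idle device.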
Step 4: changing the function of the reconfigurable device:
1: Before the whole network-based adaptive reconfigurable processing unit array system starts working, the logic function modules corresponding to the tasks that will run on the reconfigurable device are written. Because each logic function occupies a different amount of hardware resources on the reconfigurable device, several logic function modules can be combined in a single configuration bitstream file that is loaded onto the device.
2: After the system starts working, tasks are allocated dynamically. When the logic function module required by a task that a node's reconfigurable device is about to run is not present on the device, the device sends a reconfiguration request to the host of its own node, that is, the general-purpose computer;
3: After the host receives the reconfiguration request sent by its own node's reconfigurable device, the host calls a program to reconfigure that device. The host downloads the binary configuration file to the reconfigurable device over the PCI interface.
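The request and response between the reconfigurable device and its host can be modeled with a small Python sketch. It is a simplified illustration only: the class names, the module-to-file mapping, and the file names are hypothetical; the PCI download is represented by appending to a list; and the sketch does not model device capacity or the eviction of previously loaded modules.

```python
class Host:
    """Stands in for the general-purpose computer of a node."""

    def __init__(self, bitstreams):
        self.bitstreams = bitstreams    # hypothetical map: module name -> bitstream file
        self.downloads = []             # records what would be sent over PCI

    def reconfigure(self, device, module):
        """Handle a reconfiguration request from the node's own device."""
        bitstream = self.bitstreams[module]
        self.downloads.append(bitstream)    # placeholder for the PCI download
        device.loaded_modules.add(module)


class ReconfigurableDevice:
    """Stands in for the reconfigurable device of a node."""

    def __init__(self, host, loaded_modules):
        self.host = host
        self.loaded_modules = set(loaded_modules)   # set up at initialization

    def run(self, module):
        """Run a task needing `module`, requesting reconfiguration if absent."""
        if module not in self.loaded_modules:
            self.host.reconfigure(self, module)     # reconfiguration request
        return f"ran {module}"


host = Host({"fft": "fft.bit", "aes": "aes.bit"})
device = ReconfigurableDevice(host, loaded_modules={"fft"})
device.run("fft")    # module already present, no reconfiguration
device.run("aes")    # module missing, host downloads aes.bit
```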
Step 5: task execution:
After the tasks have been allocated, they are executed, and the execution result is returned when each task finishes.
1: If the task executes on the general-purpose computer, then after it finishes, the result is returned to the monitoring process, which in turn returns it to the process that called the task.
2: If the task executes on the reconfigurable device, then after it finishes, the state of the task, including the execution result, is sent to the control module SuperBlock of the reconfigurable device.
3: When SuperBlock receives a task's execution result, it checks the task record table. If the task was called by the local general-purpose computer, the result is sent over the PCI interface to the monitoring process of the general-purpose computer, which in turn returns it to the process that called the task. If the task was called by a remote node, the result is sent over the network to the SuperBlock control module of the reconfigurable device of the neighbor node that forwarded the task to the local node.
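The result-return decision made by SuperBlock can be sketched as a lookup in the task record table. This Python fragment is only illustrative: the table layout, the `LOCAL` marker, and the returned tuples are assumptions, since the patent does not describe the format of the record table.

```python
LOCAL = "local"   # hypothetical marker for tasks called by the local computer


def route_result(record_table, task_id, result):
    """Decide where SuperBlock sends a finished task's result.

    record_table maps a task id to LOCAL or to the id of the neighbor node
    that forwarded the task. Returns a (channel, destination, result) tuple.
    """
    origin = record_table[task_id]
    if origin == LOCAL:
        # Local caller: hand the result to the monitoring process over PCI.
        return ("pci", "monitor_process", result)
    # Remote caller: send the result back to the forwarding neighbor's SuperBlock.
    return ("network", origin, result)


record_table = {"t1": LOCAL, "t2": "node-3"}
local_reply = route_result(record_table, "t1", 42)
remote_reply = route_result(record_table, "t2", 7)
```

If each node along a forwarding chain applies the same rule, the result retraces the path the task took, one neighbor at a time, until it reaches the node whose general-purpose computer made the call.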

Claims (4)

1. A method for accelerating a network-based adaptive reconfigurable processing unit array system, characterized in that:
1) Task partitioning of the original program:
The system divides a program into a set of tasks that can run independently;
2) Building the network-based adaptive reconfigurable processing unit array:
The network-based adaptive reconfigurable processing unit array consists of n nodes, n ∈ {1, 2, 3, ...}. Each node has 1 to 4 neighbor nodes directly connected to it by Ethernet. Each node consists of two parts: the first is a general-purpose computer, and the second is the reconfigurable device NetFPGA;
The reconfigurable device is initialized according to the tasks that will execute on it;
3) Task allocation:
For any node in the network-based adaptive reconfigurable processing unit array, the node itself is the local node and all other nodes are remote nodes;
Task allocation assigns the tasks in the program's task set to the computing resources of the local node and of remote nodes;
4) Changing the function of the reconfigurable device:
For the reconfigurable device of any of the n nodes, if the device's resources are idle and the device does not currently have the logic function module required by the task assigned to it, the device is reconfigured with that functional module;
5) Task execution:
After the tasks have been allocated, they are executed, and the execution result is returned when each task finishes.
2. The method for accelerating a network-based adaptive reconfigurable processing unit array system according to claim 1, characterized in that the steps for building the network-based adaptive reconfigurable processing unit array are as follows:
1) The array is built on a high-speed local area network with a transmission rate of 100 Mb/s or 1000 Mb/s;
2) The reconfigurable device NetFPGA has 4 Ethernet ports, so the reconfigurable device of one node can be directly connected to, and communicate with, at most four other nodes over the high-speed local area network. Directly connected nodes are called neighbor nodes, and a node can communicate with non-neighbor nodes through its neighbor nodes;
3) The two parts of a node, the general-purpose computer and the reconfigurable device, are connected through the Peripheral Component Interconnect (PCI) interface;
4) The control module SuperBlock on the reconfigurable device is customized;
5) The functional modules on the reconfigurable device are initialized.
3. The method for accelerating a network-based adaptive reconfigurable processing unit array system according to claim 1, characterized in that the steps for changing the function of the reconfigurable device are as follows:
1) Before the whole network-based adaptive reconfigurable processing unit array system starts working, the configuration for the tasks that will run on the reconfigurable device is generated for the device;
2) After the system starts working, tasks are allocated dynamically. If the logic function module required by a task that the reconfigurable device of any of the n nodes is about to run is not present on the device, the device sends a reconfiguration request to the host of its own node;
3) After the host receives the reconfiguration request sent by its own node's reconfigurable device, the host calls a program to reconfigure that device.
4. The method for accelerating a network-based adaptive reconfigurable processing unit array system according to claim 2, characterized in that the control module SuperBlock customized on the reconfigurable device acts as the communication controller between the reconfigurable device and the general-purpose computer inside any of the n nodes; acts as the communication controller between the reconfigurable device of any of the n nodes and the reconfigurable devices of its neighbor nodes; and, while the system is running, manages the allocation of the tasks that the reconfigurable device receives, processes those tasks, and records the state of each task.
CN2009100959563A 2009-02-26 2009-02-26 Method for accelerating adaptive reconfigurable processing unit array system based on network Expired - Fee Related CN101505319B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009100959563A CN101505319B (en) 2009-02-26 2009-02-26 Method for accelerating adaptive reconfigurable processing unit array system based on network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009100959563A CN101505319B (en) 2009-02-26 2009-02-26 Method for accelerating adaptive reconfigurable processing unit array system based on network

Publications (2)

Publication Number Publication Date
CN101505319A CN101505319A (en) 2009-08-12
CN101505319B true CN101505319B (en) 2011-09-28

Family

ID=40977383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009100959563A Expired - Fee Related CN101505319B (en) 2009-02-26 2009-02-26 Method for accelerating adaptive reconfigurable processing unit array system based on network

Country Status (1)

Country Link
CN (1) CN101505319B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107229504A (en) * 2017-05-12 2017-10-03 广州接入信息科技有限公司 Program distribution operation method, apparatus and system

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102063410B (en) * 2010-10-22 2012-05-23 中国科学技术大学 Computer based on programmable hardware computing platform
CN102081841B (en) * 2011-01-18 2013-06-19 北京世纪高通科技有限公司 Method and system for processing huge traffic data
CN103677991A (en) * 2013-12-16 2014-03-26 重庆川仪自动化股份有限公司 Task execution method based on single chip microcomputer system framework and single chip microcomputer system framework
CN105700956A (en) * 2014-11-28 2016-06-22 国际商业机器公司 Distributed job processing method and system
CN106933212B (en) * 2017-04-21 2019-12-10 华南理工大学 reconfigurable industrial robot programming control method in distributed manufacturing environment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070283128A1 (en) * 2006-06-06 2007-12-06 Matsushita Electric Industrial Co., Ltd. Asymmetric multiprocessor

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Kritchalach Thitikamol, Peter Keleher. Thread Migration and Load Balancing in Non-Dedicated Environments. 《Parallel and Distributed Processing Symposium, 2000. IPDPS 2000. Proceedings. 14th International》. 2000, full text. *
KritchalachThitikamol Peter Keleher.Thread Migration and Load Balancing in Non-Dedicated Environments.《Parallel and Distributed Processing Symposium

Also Published As

Publication number Publication date
CN101505319A (en) 2009-08-12

Similar Documents

Publication Publication Date Title
CN101505319B (en) Method for accelerating adaptive reconfigurable processing unit array system based on network
CN108319563B (en) Network function acceleration method and system based on FPGA
US9563474B2 (en) Methods for managing threads within an application and devices thereof
US20130283286A1 (en) Apparatus and method for resource allocation in clustered computing environment
US11620510B2 (en) Platform for concurrent execution of GPU operations
CN106201720B (en) Virtual symmetric multi-processors virtual machine creation method, data processing method and system
CN103761146A (en) Method for dynamically setting quantities of slots for MapReduce
CN1979423A (en) Multi-processor load distribution-regulation method
CN112882828A (en) Upgrade processor management and scheduling method based on SLURM job scheduling system
JP2024020271A5 (en)
US9509562B2 (en) Method of providing a dynamic node service and device using the same
Wang et al. Dependency-aware network adaptive scheduling of data-intensive parallel jobs
Carretero et al. Mapping and scheduling HPC applications for optimizing I/O
Han et al. Energy efficient VM scheduling for big data processing in cloud computing environments
CN111427822A (en) Edge computing system
WO2023020010A1 (en) Process running method, and related device
CN112068964A (en) Slice type edge computing force management method
JP2016071886A (en) Scheduler computing device, data node for distributed computing system including the same, and method thereof
CN110401939B (en) Low-power consumption bluetooth controller link layer device
Filippini et al. SPACE4AI-R: a runtime management tool for AI applications component placement and resource scaling in computing continua
CN113608861B (en) Virtualized distribution method and device for software load computing resources
CN112416538B (en) Multi-level architecture and management method of distributed resource management framework
Fu et al. Optimizing data locality by executor allocation in spark computing environment
CN103488527A (en) PHP (hypertext preprocessor) API (application programing interface) calling method, related equipment and system
Attiya et al. Optimal allocation of tasks onto networked heterogeneous computers using minimax criterion

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110928

Termination date: 20120226