CN101505319B - Method for accelerating adaptive reconfigurable processing unit array system based on network - Google Patents
- Publication number
- CN101505319B
- Authority
- CN
- China
- Prior art keywords
- node
- task
- reconfigurable
- equipment
- processing unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Multi Processors (AREA)
- Computer And Data Communications (AREA)
Abstract
The invention discloses a method for accelerating a network-based adaptive reconfigurable processing unit array system. In this method, computing nodes, each consisting of a general-purpose processor and a reconfigurable device, form an adaptive reconfigurable processing unit network. The program to be executed is divided into a set of tasks that can run independently. Each task in the set is allocated through the network to a suitable computing resource for execution, and the reconfigurable devices, which are both efficient and flexible, can be reconfigured so that their functions meet the needs of different tasks. The system allocates tasks intelligently according to the state of the nodes in the network and preferentially assigns them to other nodes with idle computing resources, which reduces program execution time, improves the utilization of the reconfigurable devices, and accelerates the program.
Description
Technical field
The present invention relates to the fields of multi-core technology and reconfigurable technology, and in particular to a method for accelerating program execution on reconfigurable multi-core systems based on network communication.
Background technology
With the continuous improvement of computer manufacturing, very-large-scale integration has made great progress, and multi-core technology has become the mainstream processor technology. But as the number of processor cores keeps growing, improving the utilization of these cores has become a difficult problem.
Because the parallelism of typical applications is not high, it is generally believed that once the number of general-purpose cores exceeds 16, adding more general-purpose cores brings little further performance gain.
A heterogeneous multi-core processor can integrate more cores customized for specific tasks and thereby reach high performance for those applications. However, such a customized heterogeneous multi-core processor performs well only when running the applications it targets; otherwise its performance is no better than that of general-purpose cores.
The emergence of reconfigurable technology offers another solution: integrating FPGA-based reconfigurable cores with a general-purpose processor. This retains the general-purpose processor's flexibility to suit most applications while obtaining the high performance and efficiency of an application-specific processor. In addition, a reconfigurable core can support many different types of application through reconfiguration, so it can be applied to a wide range of programs more flexibly.
But an isolated node does not have enough capability and flexibility to handle some applications.
Insufficient reconfigurable resources: when an application needs more reconfigurable resources than an isolated node has, some of the application's requests must be suspended and can proceed only after occupied reconfigurable resources are released and reconfigured, which lowers performance and efficiency.
Wasted reconfigurable resources: when an application is not suited to the reconfigurable device, the node's reconfigurable resources sit idle.
High cost of frequent reconfiguration: when the functional modules that an application calls frequently cannot all be provided on the reconfigurable device at the same time, the modules on the device must be switched frequently, and this frequent reconfiguration lowers performance and efficiency.
Summary of the invention
The object of the present invention is to provide a network-based adaptive reconfigurable processing unit array acceleration method.
The technical solution adopted by the present invention to solve its technical problem is as follows:
1) Task division of the original program:
The system divides a program into a set of tasks that can run independently;
2) Building the network-based adaptive reconfigurable processing unit array:
The network-based adaptive reconfigurable processing unit array consists of n nodes, n ∈ {1, 2, 3, ...}. Each node has 0-4 neighbor nodes directly connected by Ethernet. Each node consists of two parts: the first is a general-purpose computer and the second is the reconfigurable device NetFPGA;
The reconfigurable device is initialized according to the tasks to be executed on it;
3) Distribution of tasks:
For any node in the network-based adaptive reconfigurable processing unit array, the node itself is the local node and all other nodes are remote nodes;
Task distribution assigns the tasks in the program's task set to the computing resources of the local node and of remote nodes;
4) Changing the function of the reconfigurable device:
For the reconfigurable device of any of the n nodes, if the device's resources are idle and it does not currently have the logic function module required by the assigned task, the device needs to be reconfigured with that functional module;
5) Execution of tasks:
After tasks are distributed, they are executed, and the execution result is returned when a task finishes.
The steps for building the network-based adaptive reconfigurable processing unit array are as follows:
1) The array is built on a high-speed local area network with a transmission rate of 100 Mb/s or 1000 Mb/s;
2) The array consists of n nodes, n ∈ {1, 2, 3, ...}, where each node consists of two parts: the first is a general-purpose computer and the second is a reconfigurable device;
3) The reconfigurable device NetFPGA has 4 Ethernet ports, so a node's reconfigurable device can be directly connected to at most four other nodes over the high-speed local area network and communicate with them. Directly connected nodes are called neighbor nodes, and a node can communicate with non-neighbor nodes through its neighbor nodes;
4) The two parts of a node, the general-purpose computer and the reconfigurable device, are connected through the Peripheral Component Interconnect (PCI) interface;
5) Customize the control module SuperBlock on the reconfigurable device;
6) Initialize the functional modules on the reconfigurable device.
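The port limit in the steps above can be sketched as a small data structure. This is a minimal illustration only: the names `Node`, `connect`, and `MAX_PORTS` are hypothetical, and the only facts taken from the text are the two-part node and the limit of four directly connected neighbors.

```python
from dataclasses import dataclass, field

MAX_PORTS = 4  # the text states that the NetFPGA device has 4 Ethernet ports


@dataclass
class Node:
    """One array node: a general-purpose host plus a reconfigurable device."""
    name: str
    neighbors: set = field(default_factory=set)  # names of directly linked nodes


def connect(a: Node, b: Node) -> bool:
    """Link two nodes directly; refuse if either side has no free port."""
    if a.name == b.name or b.name in a.neighbors:
        return False
    if len(a.neighbors) >= MAX_PORTS or len(b.neighbors) >= MAX_PORTS:
        return False
    a.neighbors.add(b.name)
    b.neighbors.add(a.name)
    return True


# A hub node can accept four neighbors; the fifth link is refused.
hub = Node("n0")
others = [Node(f"n{i}") for i in range(1, 6)]
results = [connect(hub, other) for other in others]
```

A node that needs to reach a fifth peer would have to do so through one of its neighbors, as step 3 describes.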
The steps for changing the function of the reconfigurable device are as follows:
1) Before the whole network-based adaptive reconfigurable processing unit array system starts working, a configuration is generated for the reconfigurable device according to the tasks that will run on it;
2) After the system starts working, as tasks are dynamically allocated, if the logic function block required by a task to be run on the reconfigurable device of any of the n nodes is not present on that device, the device sends a reconfiguration request to the host of its own node;
3) After the host receives the reconfiguration request sent by the reconfigurable device of its own node, the host calls a program to reconfigure that device.
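The request-and-reconfigure exchange in the steps above might be modeled as follows. This is a sketch under stated assumptions: `Device`, `Host`, and `ensure_module` are hypothetical names, and replacing the whole set of loaded modules on reconfiguration is a simplification that the text does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical model of a node's reconfigurable device."""
    loaded_modules: set = field(default_factory=set)
    busy: bool = False


@dataclass
class Host:
    """Hypothetical model of the node's general-purpose computer."""
    reconfig_log: list = field(default_factory=list)

    def reconfigure(self, device: Device, module: str) -> None:
        # Stand-in for downloading a new configuration to the device.
        device.loaded_modules = {module}
        self.reconfig_log.append(module)


def ensure_module(host: Host, device: Device, module: str) -> bool:
    """Return True once the device can run a task that needs `module`."""
    if device.busy:
        return False                      # device resources are not idle
    if module not in device.loaded_modules:
        host.reconfigure(device, module)  # the device asks its own host
    return True


host = Host()
device = Device(loaded_modules={"fft"})
already_loaded = ensure_module(host, device, "fft")  # no reconfiguration needed
newly_loaded = ensure_module(host, device, "aes")    # triggers one reconfiguration
```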
The customized control module SuperBlock on the reconfigurable device acts as the communication controller between the reconfigurable device and the general-purpose computer inside any of the n nodes; acts as the communication controller between the reconfigurable device of any of the n nodes and the reconfigurable devices of its neighbor nodes; and, while the system is running, manages the task assignments that the reconfigurable device receives, handles the tasks, and records the state of each task.
Compared with the background art, the present invention has the following beneficial effects:
The present invention is a method for accelerating program execution on reconfigurable multi-core systems based on network communication. Its main function is to use reconfigurable devices to strengthen the computing power of general-purpose processors and accelerate the execution of specific functional modules. By using network communication to find idle computing resources on the network, it relieves the pressure on nodes that are short of computing resources and also improves the resource utilization of all nodes across the network.
(1) Adaptivity: the system allocates tasks intelligently according to the state of the nodes on the network. Tasks waiting for resources on a busy node can be distributed to idle nodes.
(2) High efficiency: because the system allocates tasks intelligently according to the state of the nodes on the network, it can use idle resources in the network and thereby improve application performance and efficiency. Experiments show that the system improves performance effectively.
Description of drawings
Fig. 1 is a system structure diagram of an example of the present invention.
Fig. 2 shows a node in the system, consisting of a general-purpose computer and one or more reconfigurable devices, with the two parts connected by PCI.
Fig. 3 is a flow chart showing the task distribution flow according to the present invention.
Embodiment
Symbols used in the method:
C1: the task can run on the reconfigurable device;
C2: the general-purpose processor of the local node is idle;
C3: the reconfigurable device of the local node is idle;
C4: a neighbor node has an idle reconfigurable device;
The specific implementation flow of the network-based adaptive reconfigurable processing unit array acceleration system is as follows.
Step 1: Dividing the original program into tasks:
1: The system divides a program into two sets of independently runnable tasks, {{A}, {B}}: set A contains the tasks that cannot run on the reconfigurable device, and set B contains the tasks that can run on the reconfigurable device.
2: According to the dependencies among the tasks, a dependency table between tasks is generated.
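The two outputs of this step, the task sets and the dependency table, can be sketched as below. The sketch assumes that set A holds the tasks that cannot run on the reconfigurable device and set B those that can (condition C1); the `Task` type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    runs_on_device: bool      # condition C1
    depends_on: tuple = ()    # names of tasks that must finish first


def partition(tasks):
    """Split tasks into set A (host only) and set B (device capable)."""
    set_a = {t.name for t in tasks if not t.runs_on_device}
    set_b = {t.name for t in tasks if t.runs_on_device}
    return set_a, set_b


def dependency_table(tasks):
    """Map each task name to the set of task names it depends on."""
    return {t.name: set(t.depends_on) for t in tasks}


tasks = [
    Task("parse", runs_on_device=False),
    Task("fft", runs_on_device=True, depends_on=("parse",)),
    Task("report", runs_on_device=False, depends_on=("fft",)),
]
set_a, set_b = partition(tasks)
table = dependency_table(tasks)
```

A scheduler could consult `table` to release a task only after every task it depends on has returned a result; the text does not say how the table is used, so that is left out here.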
Step 2: Building the network-based adaptive reconfigurable processing unit array:
As shown in Fig. 1, the array is built on a high-speed local area network with a transmission rate of 100 Mb/s or 1000 Mb/s.
The array consists of n nodes, n ∈ {1, 2, 3, ...}, where each node consists of two parts: the first is a general-purpose computer and the second is a reconfigurable device.
As shown in Fig. 2, the two parts of a node, the general-purpose computer and the reconfigurable device, are connected through the Peripheral Component Interconnect (PCI) interface.
The reconfigurable device NetFPGA has 4 Ethernet ports, so a node can be directly connected to at most four other nodes over the high-speed local area network and communicate with them. Directly connected nodes are called neighbor nodes, and a node can communicate with non-neighbor nodes through its neighbor nodes.
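The text says that a node reaches non-neighbor nodes through its neighbors but does not describe how a path is chosen. As one possible illustration, the sketch below uses a breadth-first search over the neighbor links; the choice of breadth-first search, the `route` function, and the `links` mapping are all assumptions rather than part of the described method.

```python
from collections import deque


def route(links, src, dst):
    """Return a shortest hop path from src to dst, or None if unreachable."""
    previous = {src: None}
    queue = deque([src])
    while queue:
        current = queue.popleft()
        if current == dst:
            path = []
            while current is not None:   # walk back to the source
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in links.get(current, ()):
            if neighbor not in previous:
                previous[neighbor] = current
                queue.append(neighbor)
    return None


# Four nodes in a line: "a" reaches "d" only through "b" and "c".
links = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
path = route(links, "a", "d")
```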
Customizing the control module SuperBlock on the reconfigurable device means writing a logic function module that is responsible for communication between the reconfigurable device and the general-purpose computer inside a node and for communication between the node's reconfigurable device and the reconfigurable devices of its neighbor nodes, and that, while the system is running, manages the task assignments the reconfigurable device receives, handles the tasks, and records the state of each task. The SuperBlock control module is fixed on the reconfigurable device: when the device is reconfigured, the part of the device occupied by the control module does not change.
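The state-recording role of the control module can be pictured as a small table keyed by task. The sketch below is a software model only; the actual module is hardware logic, and the state names, the `origin` field, and the `TaskTable` class are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TaskState(Enum):
    WAITING = "waiting"
    RUNNING = "running"
    DONE = "done"


@dataclass
class TaskRecord:
    task_id: int
    origin: str                      # "local" or the name of the forwarding neighbor
    state: TaskState = TaskState.WAITING
    result: Optional[object] = None


class TaskTable:
    """Hypothetical bookkeeping kept by the control module."""

    def __init__(self):
        self._records = {}

    def accept(self, task_id, origin):
        self._records[task_id] = TaskRecord(task_id, origin)

    def start(self, task_id):
        self._records[task_id].state = TaskState.RUNNING

    def finish(self, task_id, result):
        record = self._records[task_id]
        record.state = TaskState.DONE
        record.result = result
        return record


table = TaskTable()
table.accept(7, origin="n2")   # task 7 was forwarded by neighbor n2
table.start(7)
record = table.finish(7, result=42)
```

Keeping the origin with each record is what would let the module decide later whether a result goes back to the local host or to a neighbor.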
Initializing the functional modules on the reconfigurable device means predicting which tasks will run on the device and programming the binary bitstream files of the corresponding functional modules onto it.
Step 3: Distribution of tasks:
For any node in the network-based adaptive reconfigurable processing unit array, the node itself is the local node and all other nodes are remote nodes.
Task distribution assigns the tasks in the program's task set to the computing resources of the local node and of remote nodes. After a task is assigned, it is removed from the task set it belongs to. Any node in the array has two kinds of tasks to handle: the first comes from the local node's task set, and the second is forwarded by a neighbor node. Fig. 3 shows the distribution flow for both kinds of task.
When a task from the local node's task set is distributed:
1: If the task satisfies condition C1, it enters waiting queue 2.
2: If the task does not satisfy condition C1, it enters waiting queue 1.
3: When condition C2 is satisfied, a task taken from waiting queue 1 is executed.
4: When condition C2 is not satisfied, a task taken from waiting queue 1 re-enters waiting queue 1.
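The four rules above for local tasks can be sketched as two small functions. The queue numbering and conditions C1 and C2 follow the text; the function names and the `run_on_host` callback are hypothetical.

```python
from collections import deque


def enqueue_local(task, runs_on_device, queue1, queue2):
    """Rules 1-2: route a local task by condition C1."""
    (queue2 if runs_on_device else queue1).append(task)


def service_queue1(queue1, host_idle, run_on_host):
    """Rules 3-4: run the head of queue 1 if C2 holds, else requeue it."""
    if not queue1:
        return None
    task = queue1.popleft()
    if host_idle:
        return run_on_host(task)
    queue1.append(task)   # C2 not satisfied: the task re-enters waiting queue 1
    return None


queue1, queue2 = deque(), deque()
enqueue_local("parse", False, queue1, queue2)   # cannot run on the device
enqueue_local("fft", True, queue1, queue2)      # can run on the device
busy_result = service_queue1(queue1, host_idle=False, run_on_host=str.upper)
idle_result = service_queue1(queue1, host_idle=True, run_on_host=str.upper)
```

Here `str.upper` merely stands in for executing a task on the general-purpose processor.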
When a task forwarded by a neighbor node is distributed:
1: The task enters waiting queue 2.
2: When condition C3 is satisfied, a task taken from waiting queue 2 is executed.
3: When condition C3 is not satisfied and condition C4 is satisfied, a task taken from waiting queue 2 is forwarded to the neighbor.
4: When neither condition C3 nor condition C4 is satisfied, a task taken from waiting queue 2 re-enters waiting queue 2.
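The handling of waiting queue 2 can be sketched in the same style. Conditions C3 and C4 follow the text; the function name, the two callbacks, and the returned tags are hypothetical.

```python
from collections import deque


def service_queue2(queue2, device_idle, idle_neighbor, run_on_device, forward):
    """Handle the head of waiting queue 2 according to C3 and C4.

    idle_neighbor is the name of a neighbor whose device is idle, or None.
    Returns a short tag describing what happened to the task.
    """
    if not queue2:
        return "empty"
    task = queue2.popleft()
    if device_idle:                   # C3 holds
        run_on_device(task)
        return "executed"
    if idle_neighbor is not None:     # C3 fails, C4 holds
        forward(task, idle_neighbor)
        return "forwarded"
    queue2.append(task)               # neither holds: the task waits again
    return "requeued"


executed, forwarded = [], []


def record_forward(task, neighbor):
    forwarded.append((task, neighbor))


queue2 = deque(["t1", "t2", "t3"])
outcomes = [
    service_queue2(queue2, True, None, executed.append, record_forward),
    service_queue2(queue2, False, "n2", executed.append, record_forward),
    service_queue2(queue2, False, None, executed.append, record_forward),
]
```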
Step 4: Changing the function of the reconfigurable device:
1: Before the whole network-based adaptive reconfigurable processing unit array system starts working, logic function modules corresponding to the tasks that will run on the reconfigurable device are written. Because each logic function module occupies a different amount of hardware resources on the reconfigurable device, several logic function modules can be combined in one configuration binary bitstream file and loaded onto the device.
2: After the system starts working, as tasks are dynamically allocated, when the logic function block required by a task to be run on a node's reconfigurable device is not present on the device, the device sends a reconfiguration request to the host of its own node, that is, the general-purpose computer;
3: After the host receives the reconfiguration request sent by the reconfigurable device of its own node, the host calls a program to reconfigure that device. The host downloads the binary configuration file to the reconfigurable device through the PCI interface.
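Item 1 says only that several modules can share one bitstream because their resource needs differ; it does not say how they are grouped. The sketch below shows one possible grouping, first-fit decreasing bin packing, which is a swapped-in heuristic and not the method described. The module names, sizes, and `capacity` are hypothetical units.

```python
def pack_modules(module_sizes, capacity):
    """Group modules into bitstreams using first-fit decreasing.

    module_sizes maps module name to resource units; capacity is the
    resource budget of one device configuration (hypothetical units).
    """
    bitstreams = []   # each entry: [remaining_capacity, [module names]]
    ordered = sorted(module_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in ordered:
        if size > capacity:
            raise ValueError(f"{name} does not fit on the device")
        for entry in bitstreams:
            if entry[0] >= size:      # first bitstream with enough room
                entry[0] -= size
                entry[1].append(name)
                break
        else:
            bitstreams.append([capacity - size, [name]])
    return [names for _, names in bitstreams]


groups = pack_modules({"fft": 60, "aes": 50, "crc": 30, "sha": 40}, capacity=100)
```

With these example sizes the four modules fit into two bitstreams, so a task that needs `sha` would not force a reconfiguration while the bitstream containing `fft` is loaded.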
Step 5: Execution of tasks:
After tasks are distributed, they are executed, and the execution result is returned when a task finishes.
1: If the task is executed on the general-purpose computer, then after it finishes the result is returned to the monitoring process, which in turn returns the result to the process that called the task.
2: If the task is executed on the reconfigurable device, then after it finishes the state of the task, including the execution result, is sent to the control module SuperBlock of the reconfigurable device.
3: When SuperBlock receives the execution result, it checks the task record table. If the task was called by the local general-purpose computer, the result is sent through the PCI interface to the monitoring process on the general-purpose computer, which returns it to the process that called the task. If the task was called by a remote node, the result is sent over the network to the control module SuperBlock on the reconfigurable device of the neighbor node that forwarded the task to the local node.
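The decision in item 3 can be sketched as a single function. The dictionary layout of the record and both callbacks are hypothetical; the only behavior taken from the text is that local results go to the host and remote results go back to the forwarding neighbor.

```python
def route_result(record, result, send_to_host, send_to_neighbor):
    """Deliver a finished task's result according to who requested it.

    record is a dict with key "origin": "local" for tasks called by the
    local host, otherwise the name of the neighbor that forwarded the task.
    """
    if record["origin"] == "local":
        send_to_host(result)                     # to the monitoring process
        return "host"
    send_to_neighbor(record["origin"], result)   # back over the network
    return record["origin"]


host_inbox, network_log = [], []


def send_over_network(neighbor, result):
    network_log.append((neighbor, result))


local_target = route_result({"origin": "local"}, 10, host_inbox.append, send_over_network)
remote_target = route_result({"origin": "n3"}, 20, host_inbox.append, send_over_network)
```

If a task was forwarded more than once, each node on the way back would apply the same rule, so the result retraces the forwarding path one neighbor at a time.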
Claims (4)
1. A method for accelerating a network-based adaptive reconfigurable processing unit array system, characterized in that:
1) Task division of the original program:
The system divides a program into a set of tasks that can run independently;
2) Building the network-based adaptive reconfigurable processing unit array:
The network-based adaptive reconfigurable processing unit array consists of n nodes, n ∈ {1, 2, 3, ...}. Each node has 1-4 neighbor nodes directly connected by Ethernet. Each node consists of two parts: the first is a general-purpose computer and the second is the reconfigurable device NetFPGA;
The reconfigurable device is initialized according to the tasks to be executed on it;
3) Distribution of tasks:
For any node in the network-based adaptive reconfigurable processing unit array, the node itself is the local node and all other nodes are remote nodes;
Task distribution assigns the tasks in the program's task set to the computing resources of the local node and of remote nodes;
4) Changing the function of the reconfigurable device:
For the reconfigurable device of any of the n nodes, if the device's resources are idle and it does not currently have the logic function module required by the assigned task, the device is reconfigured with that functional module;
5) Execution of tasks:
After tasks are distributed, they are executed, and the execution result is returned when a task finishes.
2. The method for accelerating a network-based adaptive reconfigurable processing unit array system according to claim 1, characterized in that the steps for building the network-based adaptive reconfigurable processing unit array are as follows:
1) The array is built on a high-speed local area network with a transmission rate of 100 Mb/s or 1000 Mb/s;
2) The reconfigurable device NetFPGA has 4 Ethernet ports, so a node's reconfigurable device can be directly connected to at most four other nodes over the high-speed local area network and communicate with them. Directly connected nodes are called neighbor nodes, and a node can communicate with non-neighbor nodes through its neighbor nodes;
3) The two parts of a node, the general-purpose computer and the reconfigurable device, are connected through the Peripheral Component Interconnect (PCI) interface;
4) Customize the control module SuperBlock on the reconfigurable device;
5) Initialize the functional modules on the reconfigurable device.
3. The method for accelerating a network-based adaptive reconfigurable processing unit array system according to claim 1, characterized in that the steps for changing the function of the reconfigurable device are as follows:
1) Before the whole network-based adaptive reconfigurable processing unit array system starts working, a configuration is generated for the reconfigurable device according to the tasks that will run on it;
2) After the system starts working, as tasks are dynamically allocated, if the logic function block required by a task to be run on the reconfigurable device of any of the n nodes is not present on that device, the device sends a reconfiguration request to the host of its own node;
3) After the host receives the reconfiguration request sent by the reconfigurable device of its own node, the host calls a program to reconfigure that device.
4. The method for accelerating a network-based adaptive reconfigurable processing unit array system according to claim 2, characterized in that the customized control module SuperBlock on the reconfigurable device acts as the communication controller between the reconfigurable device and the general-purpose computer inside any of the n nodes; acts as the communication controller between the reconfigurable device of any of the n nodes and the reconfigurable devices of its neighbor nodes; and, while the system is running, manages the task assignments that the reconfigurable device receives, handles the tasks, and records the state of each task.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2009100959563A CN101505319B (en) | 2009-02-26 | 2009-02-26 | Method for accelerating adaptive reconfigurable processing unit array system based on network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2009100959563A CN101505319B (en) | 2009-02-26 | 2009-02-26 | Method for accelerating adaptive reconfigurable processing unit array system based on network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101505319A CN101505319A (en) | 2009-08-12 |
CN101505319B true CN101505319B (en) | 2011-09-28 |
Family
ID=40977383
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2009100959563A Expired - Fee Related CN101505319B (en) | 2009-02-26 | 2009-02-26 | Method for accelerating adaptive reconfigurable processing unit array system based on network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101505319B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107229504A (en) * | 2017-05-12 | 2017-10-03 | 广州接入信息科技有限公司 | Program distribution operation method, apparatus and system |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102063410B (en) * | 2010-10-22 | 2012-05-23 | 中国科学技术大学 | Computer based on programmable hardware computing platform |
CN102081841B (en) * | 2011-01-18 | 2013-06-19 | 北京世纪高通科技有限公司 | Method and system for processing huge traffic data |
CN103677991A (en) * | 2013-12-16 | 2014-03-26 | 重庆川仪自动化股份有限公司 | Task execution method based on single chip microcomputer system framework and single chip microcomputer system framework |
CN105700956A (en) * | 2014-11-28 | 2016-06-22 | 国际商业机器公司 | Distributed job processing method and system |
CN106933212B (en) * | 2017-04-21 | 2019-12-10 | 华南理工大学 | reconfigurable industrial robot programming control method in distributed manufacturing environment |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070283128A1 (en) * | 2006-06-06 | 2007-12-06 | Matsushita Electric Industrial Co., Ltd. | Asymmetric multiprocessor |
Non-Patent Citations (2)
Title |
---|
Kritchalach Thitikamol, Peter Keleher. Thread Migration and Load Balancing in Non-Dedicated Environments. 《Parallel and Distributed Processing Symposium, 2000. IPDPS 2000. Proceedings. 14th International》. 2000, full text. * |
KritchalachThitikamol Peter Keleher.Thread Migration and Load Balancing in Non-Dedicated Environments.《Parallel and Distributed Processing Symposium |
Also Published As
Publication number | Publication date |
---|---|
CN101505319A (en) | 2009-08-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101505319B (en) | Method for accelerating adaptive reconfigurable processing unit array system based on network | |
CN108319563B (en) | Network function acceleration method and system based on FPGA | |
US9563474B2 (en) | Methods for managing threads within an application and devices thereof | |
US20130283286A1 (en) | Apparatus and method for resource allocation in clustered computing environment | |
US11620510B2 (en) | Platform for concurrent execution of GPU operations | |
CN106201720B (en) | Virtual symmetric multi-processors virtual machine creation method, data processing method and system | |
CN103761146A (en) | Method for dynamically setting quantities of slots for MapReduce | |
CN1979423A (en) | Multi-processor load distribution-regulation method | |
CN112882828A (en) | Upgrade processor management and scheduling method based on SLURM job scheduling system | |
JP2024020271A5 (en) | ||
US9509562B2 (en) | Method of providing a dynamic node service and device using the same | |
Wang et al. | Dependency-aware network adaptive scheduling of data-intensive parallel jobs | |
Carretero et al. | Mapping and scheduling HPC applications for optimizing I/O | |
Han et al. | Energy efficient VM scheduling for big data processing in cloud computing environments | |
CN111427822A (en) | Edge computing system | |
WO2023020010A1 (en) | Process running method, and related device | |
CN112068964A (en) | Slice type edge computing force management method | |
JP2016071886A (en) | Scheduler computing device, data node for distributed computing system including the same, and method thereof | |
CN110401939B (en) | Low-power consumption bluetooth controller link layer device | |
Filippini et al. | SPACE4AI-R: a runtime management tool for AI applications component placement and resource scaling in computing continua | |
CN113608861B (en) | Virtualized distribution method and device for software load computing resources | |
CN112416538B (en) | Multi-level architecture and management method of distributed resource management framework | |
Fu et al. | Optimizing data locality by executor allocation in spark computing environment | |
CN103488527A (en) | PHP (hypertext preprocessor) API (application programing interface) calling method, related equipment and system | |
Attiya et al. | Optimal allocation of tasks onto networked heterogeneous computers using minimax criterion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C17 | Cessation of patent right | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20110928 Termination date: 20120226 |