WO2012049247A1 - A computer cluster arrangement for processing a computation task and method for operation thereof - Google Patents

Info

Publication number
WO2012049247A1
WO2012049247A1 PCT/EP2011/067888 EP2011067888W WO2012049247A1 WO 2012049247 A1 WO2012049247 A1 WO 2012049247A1 EP 2011067888 W EP2011067888 W EP 2011067888W WO 2012049247 A1 WO2012049247 A1 WO 2012049247A1
Authority
WO
WIPO (PCT)
Prior art keywords
computation
booster
group
assignment
computer cluster
Prior art date
Application number
PCT/EP2011/067888
Other languages
English (en)
French (fr)
Inventor
Thomas Lippert
Original Assignee
Partec Cluster Competence Center Gmbh
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=43831684&utm_source=google_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=WO2012049247(A1) ("Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.)
Priority to RSP20191093 priority Critical patent/RS59165B1/sr
Priority to SI201131773T priority patent/SI2628080T1/sl
Priority to LTEP11768015.7T priority patent/LT2628080T/lt
Priority to CN201180056850.8A priority patent/CN103229146B/zh
Priority to PL11768015T priority patent/PL2628080T3/pl
Priority to KR1020187002295A priority patent/KR102103596B1/ko
Priority to CA2814309A priority patent/CA2814309C/en
Priority to EP19179073.2A priority patent/EP3614263A3/de
Priority to EP11768015.7A priority patent/EP2628080B1/de
Priority to JP2013533215A priority patent/JP6494161B2/ja
Priority to DK11768015.7T priority patent/DK2628080T3/da
Priority to ES11768015T priority patent/ES2743469T3/es
Priority to KR1020197005913A priority patent/KR102074468B1/ko
Priority to RU2013121560/12A priority patent/RU2597556C2/ru
Priority to KR1020137011931A priority patent/KR101823505B1/ko
Application filed by Partec Cluster Competence Center Gmbh
Publication of WO2012049247A1
Priority to US13/861,429 priority patent/US10142156B2/en
Priority to US16/191,973 priority patent/US10951458B2/en
Priority to CY20191100948T priority patent/CY1122108T1/el
Priority to HRP20191640 priority patent/HRP20191640T1/hr
Priority to US17/196,665 priority patent/US11934883B2/en
Priority to US18/429,370 priority patent/US20240168823A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5072 Grid computing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/509 Offload

Definitions

  • the present invention is directed towards a computer cluster arrangement.
  • it relates to a computer cluster arrangement with improved resource management as regards the application of computing nodes for processing scalable computation tasks as well as complex computation tasks. It is especially directed towards a computer cluster arrangement for processing a computation task and a method for operating the computer cluster arrangement.
  • the computer cluster arrangement in accordance with the present invention makes use of acceleration functionality, which assists the computing nodes in accomplishing a given computation task.
  • the present invention is furthermore directed towards a computer program product being configured for accomplishing the method as well as a computer readable medium for storing the computer program product.
  • Known in the art are computer cluster arrangements comprising computing nodes including at least one processor as well as accelerators being tightly coupled to the computing nodes for outsourcing computations of high resource requirements.
  • a tight coupling of accelerators to computing nodes results in a static assignment and leads to over- or undersubscription of accelerators. This may lead to a lack of resources or to an excessive supply of resources.
  • Such a static assignment of accelerators to computing nodes does furthermore not provide fault tolerance in case of accelerator failures.
  • Fig. 1 shows a computer cluster arrangement according to the state of the art.
  • the computer cluster arrangement comprises several computation nodes CN, which are interconnected and jointly compute a computation task.
  • Each computation node CN is tightly coupled with an accelerator Acc. As can be seen in Fig.
  • a computation node CN comprises an accelerator unit ACC which is virtually integrated on the computation node CN along with a microprocessor, for instance a central processing unit CPU.
  • the fixed coupling of accelerators Acc to computation nodes CN leads to an over- or under subscription of accelerators Acc depending on the computation task.
  • no fault tolerance is provided in case of failure of one of the accelerators Acc.
  • computing nodes CN communicate with each other over an infrastructure, wherein accelerators Acc do not exchange information directly, but require a computation node CN interfacing the infrastructure IN for data exchange.
  • a computer cluster arrangement for processing a computation task, comprising: - a plurality of computation nodes, each of which interfacing a communication infrastructure, at least two of which being arranged to jointly compute at least a first part of the computation task; - at least one booster being arranged to compute at least a second part of the computation task, each booster interfacing the communication infrastructure; and - a resource manager being arranged to assign at least one booster to at least one of the plurality of computation nodes for computation of the second part of the computation task, the assignment being accomplished as a function of a predetermined assignment metric.
  • acceleration functionality is being provided by independent boosters.
  • the described computer cluster arrangement allows a loose coupling of those boosters to computation nodes, which may also be referred to as compute nodes.
  • a sharing of accelerators, here in form of boosters, by computation nodes is feasible.
  • a resource manager, in the form of a resource manager module or a resource manager node, may be provided.
  • the resource manager may establish a static assignment at start of a processing of a computation task. Alternatively or additionally it may establish a dynamic assignment at runtime, which means during processing of the computation task.
  • the resource manager is arranged to provide assignment information to the computation nodes for outsourcing parts of the computation tasks from at least one computation node to at least one booster.
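The static and dynamic assignment described above can be illustrated with a minimal sketch. All names here (`ResourceManager`, `assign`, `release`, the identifiers `CN1`, `B1`, ...) are hypothetical and not taken from the patent; the sketch assumes that any free booster can serve any computation node.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Hypothetical resource manager tracking booster-to-node assignments."""
    free_boosters: set
    assignment: dict = field(default_factory=dict)

    def assign(self, node: str) -> str:
        """Assign one free booster to `node` and return its identifier."""
        if not self.free_boosters:
            raise RuntimeError("no free booster available")
        booster = min(self.free_boosters)  # deterministic pick for the sketch
        self.free_boosters.remove(booster)
        self.assignment.setdefault(node, []).append(booster)
        return booster

    def release(self, node: str, booster: str) -> None:
        """Return a booster to the free pool so other nodes can share it."""
        self.assignment[node].remove(booster)
        self.free_boosters.add(booster)


rm = ResourceManager(free_boosters={"B1", "B2", "B3"})
static = rm.assign("CN1")    # static assignment at the start of processing
dynamic = rm.assign("CN1")   # dynamic assignment added at runtime
rm.release("CN1", static)    # booster becomes available to other nodes
```

The assignment information handed to a computation node is simply the returned booster identifier; a real system would more likely return a network address.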
  • the resource manager may be implemented as a specific hardware unit, a virtual unit or a combination of these.
  • the resource manager may be formed by any one of: a microprocessor, a hardware component, a virtualized hardware component or a daemon.
  • parts of the resource manager may be distributed over the system and communicate via a communication infrastructure.
  • booster allocation is performed as a function of application needs, which means in dependency of processing a specific computation task.
  • Fault tolerance in case of booster failure is provided, and scalability is fostered.
  • Scalability is made possible by support of incremental system development, as boosters are provided independently of computation nodes. Hence, the number of computation nodes and the number of provided boosters may differ. Thus, a maximum flexibility in providing hardware resources is established. Furthermore, all computation nodes do share the same growth capacity.
  • a computation task may be defined by means of an algorithm, a source code or a binary code, and may furthermore be a combination of any of them.
  • a computation task may for instance be a simulation, which is to be computed by the computer cluster arrangement.
  • the computation task may comprise several sub problems, also referred to as sub tasks, which in their entirety describe the overall computation task. It is possible to divide the computation task into several parts, for instance at least a first part of the computation task and at least a second part of the computation task. It is also possible for the computer cluster arrangement to solve the parts of the computation task in parallel or in succession.
  • Each computation node interfaces a communication infrastructure, also referred to as interconnect. Analogously, each booster interfaces the communication infrastructure.
  • each computation node communicates with each booster over the communication infrastructure, without the necessity to involve a further communication node while exchanging data from a computation node to a booster.
  • a dynamic assignment of computation nodes to boosters is established, wherein computation nodes process at least a part of the computation task and are not required for passing through of information from one computation node to one booster. Therefore, it is possible to directly couple boosters to the communication infrastructure without the necessity of an intermediate computation node as it is typically implemented in the state of the art.
  • an assignment metric is provided, which serves as a basis for the decision which booster is coupled with which computation node.
  • the assignment metric may be managed by a resource manager.
  • Managing the assignment metric refers to establishing and updating rules naming at least one booster, which is assigned to at least one further named computation node. Hence, it is possible to update the assignment metric at runtime.
  • Such assignment rules may be created as a function of a load balancing, which detects workload of the computer cluster arrangement, especially of the boosters.
  • the assignment metric is predetermined but may be altered at runtime. Hence, static assignment is provided at start of the processing of computation task and dynamic assignment is provided at runtime.
  • the predetermined assignment metric is formed according to at least one of a group of metric specification techniques, the group comprising: a temporal logic, an assignment matrix, an assignment table, a probability function and a cost function.
  • temporal dependencies may be considered for assigning the boosters. It may be the case, that a temporal order is defined on the boosters, which makes sure that a specific booster is always assigned to a computation node in case a further booster failed to solve at least a part of the computation task.
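A temporal order on boosters can be read as a fallback chain. The following sketch assumes that a failed booster signals its failure by raising `RuntimeError`; the function names and the simulated failure of `B1` are illustrative assumptions only.

```python
def run_with_fallback(task, boosters_in_order, execute):
    """Try boosters in their predefined order; move on when one fails."""
    for booster in boosters_in_order:
        try:
            return booster, execute(booster, task)
        except RuntimeError:
            continue  # this booster failed, fall back to the next one
    raise RuntimeError("all boosters failed")


def execute(booster, task):
    """Stand-in for a real offload call; booster B1 is simulated as broken."""
    if booster == "B1":
        raise RuntimeError("B1 failed")
    return sum(task)


used, result = run_with_fallback([1, 2, 3], ["B1", "B2"], execute)
```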
  • An assignment metric may name an identification of a computation node and may furthermore define identifications of compatible boosters which can be assigned.
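An assignment table of this kind can be sketched as a mapping from computation node identifiers to lists of compatible booster identifiers. The identifiers and helper names below are hypothetical; the update function illustrates how a rule could be changed at runtime.

```python
# Hypothetical assignment table: node identifier -> compatible boosters.
assignment_table = {
    "CN1": ["B1", "B2"],
    "CN2": ["B2", "B3"],
}


def compatible_boosters(table, node):
    """Return the boosters that may be assigned to `node`."""
    return list(table.get(node, []))


def update_rule(table, node, boosters):
    """Replace the rule for `node`, e.g. after a booster has failed."""
    table[node] = list(boosters)


before = compatible_boosters(assignment_table, "CN1")
update_rule(assignment_table, "CN1", ["B2", "B3"])  # runtime update
after = compatible_boosters(assignment_table, "CN1")
```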
  • a probability function may for instance describe that in case a specific booster failed to compute a certain computation task a further booster may solve the same computation task at a specific probability.
  • cost functions may be applied for evaluation of required resource capacities and furthermore for evaluation of provided computation capacities of boosters.
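One possible cost function is sketched below. The patent does not specify a formula; this sketch assumes each booster reports a provided capacity and a current load, and that a lower cost is better. Boosters that cannot meet the required capacity receive infinite cost.

```python
import math


def cost(required, provided, load):
    """Illustrative cost: current load plus the fraction of wasted capacity."""
    if provided < required:
        return math.inf  # booster cannot satisfy the requirement
    return load + (provided - required) / provided


def select_booster(required, boosters):
    """Pick the cheapest booster, or None if none has enough capacity."""
    if not boosters:
        return None
    best = min(boosters, key=lambda name: cost(required, *boosters[name]))
    return best if cost(required, *boosters[best]) < math.inf else None


# name -> (provided capacity, current load); values are made up
boosters = {"B1": (4.0, 0.9), "B2": (8.0, 0.1), "B3": (2.0, 0.0)}
choice = select_booster(3.0, boosters)
```

Here `B3` is excluded because it is too small, and `B2` wins over `B1` because its low load outweighs its larger spare capacity.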
  • a computation history, also referred to as computation log record, may also be applied for dynamic assignment.
  • computation tasks can be empirically evaluated by computation on at least one first booster and recording response times and furthermore by processing the same computation task on at least one further booster and recording response times.
  • capacities of boosters can be recorded, empirically evaluated and therefore be assigned to computation nodes as a function of required capacities and their provided capacities.
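The empirical evaluation described above can be sketched as a log of response times per task kind and booster. The class name `ComputationLog`, the task label `"fft"` and the timings are assumptions for illustration; the sketch picks the booster with the lowest mean recorded response time.

```python
from collections import defaultdict
from statistics import mean


class ComputationLog:
    """Hypothetical computation log record of booster response times."""

    def __init__(self):
        self._times = defaultdict(list)

    def record(self, task_kind, booster, seconds):
        """Store one measured response time."""
        self._times[(task_kind, booster)].append(seconds)

    def fastest(self, task_kind):
        """Return the booster with the lowest mean response time, if any."""
        candidates = {
            booster: mean(times)
            for (kind, booster), times in self._times.items()
            if kind == task_kind
        }
        return min(candidates, key=candidates.get) if candidates else None


log = ComputationLog()
log.record("fft", "B1", 2.0)
log.record("fft", "B1", 4.0)
log.record("fft", "B2", 2.5)
best = log.fastest("fft")
```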
  • Specific computation tasks may comprise priority information, which indicates how urgently this specific computation task has to be computed. It may also be the case that specific computation nodes provide a priority, which indicates how urgent a processing of a computation task, or at least a part of a computation task, is compared to other parts of computation tasks being originated from other computation nodes. Hence, it is possible to provide priority information as regards single parts of the computation task as well as priority information referring to computation nodes.
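Priority information on task parts and on computation nodes can be combined in a queue of pending offload requests. The ordering rule below (task priority first, node priority as tie-breaker, then submission order) is an assumption of this sketch and is not stated in the patent.

```python
import heapq
import itertools


class OffloadQueue:
    """Hypothetical queue of task parts waiting for a booster."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # keeps submission order stable

    def submit(self, part, task_priority, node_priority=0):
        """Queue a task part; larger numbers mean more urgent."""
        entry = (-task_priority, -node_priority, next(self._counter), part)
        heapq.heappush(self._heap, entry)

    def next_part(self):
        """Remove and return the most urgent task part."""
        return heapq.heappop(self._heap)[-1]


queue = OffloadQueue()
queue.submit("a", task_priority=1)
queue.submit("b", task_priority=5)
queue.submit("c", task_priority=5, node_priority=2)
order = [queue.next_part() for _ in range(3)]
```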
  • the booster processes specific parts of a computation task. This may be accomplished by a remote procedure call, a parameter handover or data transmission.
  • the complexity of the part of the computation task may be evaluated as a function of a parameter handover. In case a parameter contains a matrix, the complexity of the parameter handover can be evaluated by the number of dimensions of the matrix.
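Evaluating complexity by the number of matrix dimensions can be sketched for parameters passed as nested Python lists. The offload threshold of two dimensions is a made-up value for illustration.

```python
def num_dimensions(param):
    """Count nesting depth of a list-based matrix; scalars have 0 dimensions."""
    dims = 0
    while isinstance(param, (list, tuple)):
        dims += 1
        if not param:
            break  # empty container: cannot look deeper
        param = param[0]
    return dims


def should_offload(param, threshold=2):
    """Hypothetical rule: offload when the parameter has enough dimensions."""
    return num_dimensions(param) >= threshold


scalar_dims = num_dimensions(3.0)
vector_dims = num_dimensions([1, 2, 3])
matrix_dims = num_dimensions([[1, 2], [3, 4]])
```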
  • an interfacing unit may be provided, which is arranged between one computation node and the communication infrastructure.
  • a further interfacing unit, being different from the first interfacing unit, may be arranged between the booster and the communication infrastructure.
  • the interfacing unit can be different from the computation node and is also different from the booster.
  • the interfacing unit merely provides network functionality, without being arranged to process parts of the computation task.
  • the interfacing unit merely provides functionality as regards the administration and communication issues of the computation tasks. It may for example provide functionality as regards routing and transmission of data referring to the computation task.
  • acceleration can also be performed reversely by outsourcing at least a part of the computation task from at least one booster to at least one computation node. Hence, control and information flow is reversed as regards the above introduced aspects of the invention.
  • the predetermined assignment metric may be formed according to at least one of a group of metric specification techniques, the group comprising: a temporal logic, an assignment matrix, an assignment table, a probability function and a cost function.
  • the predetermined assignment metric is specified as a function of at least one of a group of assignment parameters, the group comprising: resource information, cost information, complexity information, scalability information, a computation log record, compiler information, priority information and a time stamp.
  • each computation node and each booster interfaces the communication infrastructure respectively via an interfacing unit.
  • This may provide the advantage that data can be communicated via the communication infrastructure without the necessity of an intermediate computation node. Hence, it is not required to couple a booster with a computation node directly but a dynamic assignment is reached.
  • the interfacing unit comprises at least one of a group of components, the group comprising: a virtual interface, a stub, a socket, a network controller and a network device.
  • the computation nodes as well as the boosters can also be virtually connected to the communication infrastructure.
  • existing communication infrastructures can be easily accessed.
  • the communication infrastructure comprises at least one of a group of components, the group comprising: a bus, a communication link, a switching unit, a router and a high speed network. This may provide the advantage that existing communication infrastructures can be used and new communication infrastructures can be created by commonly available network devices.
  • each computation node comprises at least one of a group of components, the group comprising: a multi core processor, a cluster, a computer, a workstation and a multipurpose processor. This may provide the advantage that the computation nodes are highly scalable.
  • the at least one booster comprises at least one of a group of components, the group comprising: a many-core-processor, a scalar processor, a co-processor, a graphical processing unit, a cluster of many-core-processors and a monolithic processor. This may provide the advantage that the boosters are implemented to process specific problems at high speed.
  • Computation nodes typically apply processors comprising an extensive control unit as several computation tasks have to be processed simultaneously.
  • Processors being applied in boosters typically comprise an extensive arithmetic logic unit and a simple control structure when compared to the processors of computation nodes. For instance, SIMD computers, also referred to as single instruction, multiple data computers, may find application in boosters.
  • processors being applied in computation nodes differ in their processor design compared to processors being applied in boosters.
  • the resource manager is arranged to update said predetermined assignment metric during computation of at least a part of said computation task. This may provide the advantage that the assignment of boosters to computation nodes can be performed dynamically at runtime.
  • a method for operating a computer cluster arrangement for processing a computation task, the method comprising: - computing at least a first part of the computation task by at least two of the plurality of computation nodes, each computation node interfacing a communication infrastructure;
  • Fig. 1 shows a computer cluster arrangement according to the state of the art
  • Fig. 2 shows a schematic illustration of a computer cluster arrangement according to an aspect of the present invention.
  • Fig. 3 shows a schematic illustration of a computer cluster arrangement according to a further aspect of the present invention.
  • Fig. 4 shows a schematic illustration of a method for operating a computer cluster arrangement according to an aspect of the present invention.
  • Fig. 5 shows a schematic illustration of a method for operating a computer cluster arrangement according to a further aspect of the present invention.
  • Fig. 6 shows a schematic illustration of control flow of a computer cluster arrangement
  • Fig. 7 shows a schematic illustration of control flow implementing reverse acceleration of a computer cluster arrangement according to a further aspect of the present invention.
  • Fig. 8 shows a schematic illustration of control flow of a computer cluster arrangement according to a further aspect of the present invention.
  • Fig. 9 shows a schematic illustration of network topology of a computer cluster arrangement according to an aspect of the present invention.
  • Fig. 2 shows a computer cluster arrangement comprising a cluster C as well as a booster group BG.
  • the cluster comprises in the present embodiment four computation nodes, also referred to as CN, as well as three boosters, also referred to as B.
  • each of the boosters B can be shared by any of the computation nodes CN. Furthermore a virtualization on cluster level can be accomplished. Each booster, or at least a part of the boosters, can be virtualized and made available to the computation nodes virtually.
  • computation tasks are processed by at least one of the computation nodes CN and at least a part of the computation tasks may be forwarded to at least one of the boosters B.
  • the boosters B are arranged to compute specific problems and provide specific processing power. Hence, problems can be outsourced from one of the computation nodes CN to the boosters B, be computed by the booster and the result may be delivered back to the computation node.
  • the assignment of boosters ESB to computation nodes CN can be accomplished by a resource manager, also referred to as RM.
  • the resource manager initializes a first assignment and further on establishes a dynamic assignment of boosters B to
  • For communication between boosters and computation nodes, an application programming interface, also referred to as API, can be provided.
  • the boosters B may be controlled
  • the API abstracts and enhances actual native programming models of the boosters. Furthermore the API may provide means for fault tolerance in case of a booster failure.
  • a communication protocol involved in API calls may be layered on top of a communication layer.
  • the direction of the copy operation can be:
  • the number of threads is determined by the number of threads per block (block_dim) and the number of blocks in the grid (grid_dim).
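How the two launch parameters determine the thread count can be sketched as follows. The source only states that the count depends on `block_dim` and `grid_dim`; taking their product, and numbering threads block by block, is an assumption borrowed from common GPU programming models.

```python
def total_threads(block_dim, grid_dim):
    """Assumed rule: threads per block times number of blocks in the grid."""
    return block_dim * grid_dim


def global_thread_ids(block_dim, grid_dim):
    """Enumerate one global index per thread, block by block."""
    return [
        block * block_dim + thread
        for block in range(grid_dim)
        for thread in range(block_dim)
    ]


count = total_threads(block_dim=256, grid_dim=4)
ids = global_thread_ids(block_dim=2, grid_dim=3)
```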
  • Fig. 3 shows a further cluster arrangement according to an aspect of the present invention.
  • the depicted computer cluster arrangement is arranged to compute scientific computation tasks, especially in the context of high performance cluster technology.
  • codes with Exascale needs include, on the one hand, code blocks that are well suited for Exascaling, and, on the other hand, such code blocks that are too complex to be so scalable.
  • highly scalable and complex is made on the level of code blocks, and we introduce the notions Exascale Code Blocks (ECB) and complex Code Blocks (CCB).
  • a coarse-grained architectural model emerges, where the highly scalable parts or ECBs of an application code are executed on a parallel many-core architecture, which is accessed dynamically, while the CCBs are executed on a traditional cluster system that is suitably dimensioned, including the connectivity along with a refined dynamical resource allocation system.
  • Clusters at Exascale require virtualization elements in order to guarantee resilience and reliability. While local accelerators, in principle, allow for a simple view on the entire system and in particular can utilize the extremely high local bandwidth, they are absolutely static hardware elements, well suited for farming or master-slave parallelization. Hence, it would be difficult to include them in a virtualization software layer. In addition, there would be no fault tolerance if an accelerator fails, and there would be no tolerance for over- or undersubscription.
  • the cluster's computation nodes CN are internally coupled by a standard cluster interconnect, e.g. Mellanox InfiniBand.
  • This network is extended to include the booster (ESB) as well.
  • the ESBs each consist of a multitude of many-core accelerators connected by a specific fast low-latency network.
  • KC: Intel's many-core processor Knight's Corner
  • the KC-chip will consist of more than 50 cores and is expected to provide a DP compute capacity of over 1 Teraflop/s per chip. With 10.000 elements a total performance of 10 Petaflop/s would be in reach.
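The performance estimate can be checked with a short calculation. It takes the stated lower bound of 1 Teraflop/s per chip at face value and assumes ideal scaling across all 10.000 chips, which real systems would not fully reach.

```python
TERA = 10**12
PETA = 10**15

chips = 10_000                 # number of many-core elements
per_chip_flops = 1 * TERA      # stated DP capacity per chip (lower bound)

total_flops = chips * per_chip_flops
total_petaflops = total_flops / PETA
```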
  • the predecessor of KC, the Knight's Ferry processor (KF), will be used in the project to create a PCIe-based pilot system to study the cluster-booster (CN-ESB) concept.
  • the communication system requires at least 1 Terabit/s per card (duplex).
  • the communication system EXTOLL may be used as an implementation of a bus system, which provides a communication rate of 1.44 Terabit/s per card. It realizes a 3D topology providing 6 links per card. Owing to its simplicity, this topology appears to be applicable for a booster based on many-core accelerators. Even with two directions reserved for cut-through routing, EXTOLL can saturate the PCI Express performance as far as the data rate is concerned. The latency can reach 0.3 µs when based on an ASIC realization. Currently, EXTOLL is realized by means of FPGAs.
  • Fig. 4 shows a flow diagram for illustrating an aspect of a method for operating a computer cluster arrangement according to the present invention.
  • in a first step 100, at least the first part of a computation task is computed by at least two of the plurality of computation nodes CN, each computation node CN interfacing a communication infrastructure IN.
  • computing of at least a second part of the computation task in step 101 by at least one booster B is performed, each booster B interfacing the communication infrastructure IN.
  • assigning at least one booster B to one of the plurality of computation nodes CN in step 102 by a resource manager RM, for computation of the second part of the computation task is performed.
  • the right arrow in Fig. 4 indicates the control flow may point back to step 100.
  • Fig. 5 shows a flow diagram illustrating a method for operating a computer cluster arrangement according to an aspect of the present invention.
  • the step of computing 201 at least a second part of the computation task is performed.
  • a booster B computes the at least second part of the computation task. This may be of advantage in case the at least second part of the computation task is forwarded to the resource manager RM, which assigns a booster B to the second part of the computation task. The resource manager RM can then transmit the second part of the computation task to the booster B, without the necessity that the computation node CN directly contacts the booster B.
  • step 102 may be performed before step 101, which results in a computation of a first part of the computation task, an assignment of one booster to one computation node and finally
  • Step 102 may comprise sub steps such as returning the computed at least second part of the computation task back to the computation node CN.
  • the booster B returns the computed result back to the computation nodes CN.
  • the computation nodes CN may use the returned value for computation of further computation tasks and may again forward at least a further part of a computation task to at least one of the boosters B.
  • Fig. 6 shows a block diagram of control flow of a computer cluster arrangement according to an aspect of the present invention.
  • a computation node CN receives a computation task and requests a booster B for outsourcing at least a part of the received computation task. Therefore, a resource manager RM is accessed, which forwards the part of the computation task to a selected booster B.
  • the booster B computes the part of the computation task and returns a result, which is indicated by the rightmost arrow.
  • the return value can be passed back to the
  • Fig. 7 shows a block diagram of control flow, implementing reverse acceleration, of a computer cluster arrangement according to an aspect of the present invention.
  • an acceleration of computation of computation tasks being computed by at least one booster B is performed by assigning at least one computation node CN to at least one booster B.
  • the control and information flow is reversed as regards the embodiment being shown in Fig. 6.
  • Computation of tasks can therefore be accelerated by outsourcing computation tasks from the boosters B to at least one computation node CN.
  • Fig. 8 shows a block diagram of control flow of a computer cluster arrangement according to a further aspect of the present invention.
  • the resource manager RM does not pass the at least one part of the computation task to the booster B, but the computation node CN requests an address or a further identification of a booster B, which is arranged to compute the specific at least one part of the computation task.
  • the resource manager RM returns the required address to the computation node CN.
  • the computation node CN is now able to directly access the booster B by means of the communication infrastructure IN.
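The control flow of Fig. 8, where the resource manager only hands out an address and the computation node then contacts the booster directly, can be sketched as follows. The classes, the address string `booster-0` and the squared-sum workload are placeholders, not part of the patent.

```python
class Booster:
    """Stand-in booster reachable under an address."""

    def __init__(self, address):
        self.address = address

    def compute(self, part):
        return sum(x * x for x in part)  # placeholder workload


class Infrastructure:
    """Stand-in for the communication infrastructure IN."""

    def __init__(self):
        self._endpoints = {}

    def attach(self, booster):
        self._endpoints[booster.address] = booster

    def send(self, address, part):
        return self._endpoints[address].compute(part)


class LookupResourceManager:
    """Returns booster addresses only; it never touches the task data."""

    def __init__(self, addresses):
        self._free = list(addresses)

    def request_booster(self):
        return self._free.pop(0)


infra = Infrastructure()
infra.attach(Booster("booster-0"))
rm = LookupResourceManager(["booster-0"])

address = rm.request_booster()           # CN asks RM for an address
result = infra.send(address, [1, 2, 3])  # CN accesses the booster directly
```

In contrast to the flow of Fig. 6, the task data never passes through the resource manager in this variant.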
  • the communication infrastructure IN is accessed via interfacing units.
  • the computation node CN accesses the communication infrastructure IN by interfacing unit IU1 and the booster B interfaces the communication infrastructure IN by interfacing unit IU2.
  • the resource manager RM is arranged to evaluate the resource capacities of the booster B and performs the assignment, which means the selection of the booster B, as a function of the evaluated resource capacities of each of the boosters B. For doing so the resource manager RM may access the assignment metric, which may be stored in a database DB or any kind of data source. The resource manager RM is arranged to update the assignment metric, which can be performed under usage of a database management system.
  • the database DB can be implemented as any kind of storage. It may for instance be implemented as a table, a register or a cache.
  • Fig. 9 shows a schematic illustration of network topology of a computer cluster arrangement according to an aspect of the present invention.
  • the computation nodes share a common, first, communication infrastructure, for instance a star topology with a central switching unit S.
  • a further, second, communication infrastructure is provided for communication of the computation nodes CN with booster nodes BN.
  • a third communication infrastructure is provided for communication among booster nodes BN.
  • a high speed network interface for communication among booster nodes BN can be provided with a specific BN-BN communication interface.
  • one communication infrastructure can be implemented as a 3d topology.
  • two communication infrastructures are provided, one for communication among computation nodes CN and one further communication infrastructure for communication among booster nodes BN. Both communication infrastructures can be coupled by at least one communication link from the first network to the second network or from the second network to the first network.
  • one selected computation node CN or one selected booster node BN is connected with the respectively other network.
  • one booster node BN is connected with the communication infrastructure of the computation nodes CN under usage of a switching unit S.
  • the booster group BG itself may be connected to the communication infrastructure of the computation nodes CN or an intermediate communication infrastructure.
  • the communication infrastructures may generally differ, among other characteristics, in their topology, bandwidth, communication protocols, throughput and message exchange.
  • a booster B may for example comprise 1 to 10.000 booster nodes BN, but is not restricted to this range.
  • the resource manager RM may generally manage parts of the booster nodes BN and can therefore partition the overall number of booster nodes BN and dynamically form boosters B out of said number of booster nodes BN.
  • the switching unit S may be implemented by a switch, a router or any network device.
  • the database DB may be accessed by further components or nodes of the computer cluster arrangement.
  • the illustrated computation nodes CN and the illustrated booster group BG may each be among many further computation nodes CN and booster groups BG, respectively, which access the resource manager RM and/or the communication infrastructure IN.
  • acceleration can also be performed in the reverse direction by outsourcing at least a part of the computation task from at least one booster B to at least one computation node CN.
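
The resource manager behaviour described above (dynamically forming boosters B out of a pool of booster nodes BN, selecting a booster as a function of evaluated resource capacities, and updating the assignment metric) can be illustrated with a minimal Python sketch. It is not the claimed implementation. The class and method names, the use of a plain dictionary in place of the database DB, the "one task part per booster node" capacity measure and the "most free capacity wins" selection rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Booster:
    """A booster B: a named set of booster nodes BN (names are hypothetical)."""
    name: str
    nodes: list
    load: int = 0  # task parts currently assigned (assumed load measure)

    @property
    def free_capacity(self) -> int:
        # Assumption: each booster node handles one task part at a time.
        return len(self.nodes) - self.load


class ResourceManager:
    """Toy resource manager RM; the dict `metric` stands in for the database DB."""

    def __init__(self, booster_nodes):
        self.free_nodes = list(booster_nodes)  # booster nodes not yet in a booster
        self.boosters = {}     # booster name -> Booster
        self.metric = {}       # assignment metric: booster name -> free capacity
        self.assignments = {}  # computation node -> booster name

    def form_booster(self, name, size):
        """Dynamically form a booster B out of `size` unallocated booster nodes."""
        if size > len(self.free_nodes):
            raise ValueError("not enough free booster nodes")
        nodes, self.free_nodes = self.free_nodes[:size], self.free_nodes[size:]
        self.boosters[name] = Booster(name, nodes)
        self.metric[name] = size
        return self.boosters[name]

    def assign(self, computation_node, parts=1):
        """Select the booster with the most free capacity, then update the metric."""
        candidates = [n for n, cap in self.metric.items() if cap >= parts]
        if not candidates:
            return None  # no booster can currently take this part of the task
        best = max(candidates, key=lambda n: self.metric[n])
        self.boosters[best].load += parts
        self.metric[best] = self.boosters[best].free_capacity
        self.assignments[computation_node] = best
        return best


# Example: ten booster nodes partitioned into two boosters of different size.
rm = ResourceManager([f"BN{i}" for i in range(10)])
rm.form_booster("B1", 4)
rm.form_booster("B2", 6)
first = rm.assign("CN1", parts=3)   # B2 has the most free capacity
second = rm.assign("CN2", parts=2)  # B1 now has more free capacity than B2
third = rm.assign("CN3", parts=4)   # no booster has four free nodes left
```

In this sketch the assignment metric is refreshed after every assignment, so each later selection sees the current capacities. A real arrangement could store the metric in a database DB and use a different selection function.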

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mathematical Physics (AREA)
  • Multi Processors (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Computer And Data Communications (AREA)
  • Hardware Redundancy (AREA)
  • Debugging And Monitoring (AREA)
  • Advance Control (AREA)
  • Stored Programmes (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Electrotherapy Devices (AREA)
  • Complex Calculations (AREA)
PCT/EP2011/067888 2010-10-13 2011-10-13 A computer cluster arrangement for processing a computation task and method for operation thereof WO2012049247A1 (en)

Priority Applications (21)

Application Number Priority Date Filing Date Title
SI201131773T SI2628080T1 (sl) 2010-10-13 2011-10-13 Razmestitev gruče računalnikov za obdelavo računske naloge in postopek za delovanje le-te
ES11768015T ES2743469T3 (es) 2010-10-13 2011-10-13 Disposición de ordenadores en racimo para el tratamiento de una tarea de cálculo y procedimiento correspondiente
DK11768015.7T DK2628080T3 (da) 2010-10-13 2011-10-13 Computerklyngearrangement til bearbejdning af en beregningsopgave og fremgangsmåde til drift deraf
CN201180056850.8A CN103229146B (zh) 2010-10-13 2011-10-13 用于处理计算任务的计算机集群布置及其操作方法
PL11768015T PL2628080T3 (pl) 2010-10-13 2011-10-13 Układ klastrów komputerowych do przetwarzania zadania komputerowego i sposób działania
KR1020187002295A KR102103596B1 (ko) 2010-10-13 2011-10-13 계산 작업을 처리하기 위한 컴퓨터 클러스터 장치 및 이를 작동시키기 위한 방법
CA2814309A CA2814309C (en) 2010-10-13 2011-10-13 A computer cluster arrangement for processing a computation task and method for operation thereof
EP19179073.2A EP3614263A3 (de) 2010-10-13 2011-10-13 Computerclusteranordnung zur verarbeitung einer berechnungsaufgabe und verfahren zum betrieb davon
EP11768015.7A EP2628080B1 (de) 2010-10-13 2011-10-13 Computerclusteranordnung zur verarbeitung einer berechnungsaufgabe und betriebsverfahren dafür
KR1020197005913A KR102074468B1 (ko) 2010-10-13 2011-10-13 계산 작업을 처리하기 위한 컴퓨터 클러스터 장치 및 이를 작동시키기 위한 방법
LTEP11768015.7T LT2628080T (lt) 2010-10-13 2011-10-13 Kompiuterio klasterio išdėstymas, skirtas skaičiavimo užduoties apdorojimui ir jo veikimo būdas
RSP20191093 RS59165B1 (sr) 2010-10-13 2011-10-13 Raspored klastera računara za obradu računskog zadatka i njegov način rada
JP2013533215A JP6494161B2 (ja) 2010-10-13 2011-10-13 計算タスクを処理するためのコンピュータクラスタ構成、およびそれを動作させるための方法
RU2013121560/12A RU2597556C2 (ru) 2010-10-13 2011-10-13 Структура компьютерного кластера для выполнения вычислительных задач и способ функционирования указанного кластера
KR1020137011931A KR101823505B1 (ko) 2010-10-13 2011-10-13 계산 작업을 처리하기 위한 컴퓨터 클러스터 장치 및 이를 작동시키기 위한 방법
US13/861,429 US10142156B2 (en) 2010-10-13 2013-04-12 Computer cluster arrangement for processing a computation task and method for operation thereof
US16/191,973 US10951458B2 (en) 2010-10-13 2018-11-15 Computer cluster arrangement for processing a computation task and method for operation thereof
CY20191100948T CY1122108T1 (el) 2010-10-13 2019-09-11 Διαταξη συστοιχιας ηλεκτρονικων υπολογιστων για επεξεργασια ενος υπολογιστικου εργου και μεθοδος λειτουργιας αυτης
HRP20191640 HRP20191640T1 (hr) 2010-10-13 2019-09-12 Način raspoređivanja računalnog klastera za obradu računskih zadataka i njihov način rada
US17/196,665 US11934883B2 (en) 2010-10-13 2021-03-09 Computer cluster arrangement for processing a computation task and method for operation thereof
US18/429,370 US20240168823A1 (en) 2010-10-13 2024-01-31 Computer cluster arrangement for processing a computation task and method for operation thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP10187436.0 2010-10-13
EP10187436A EP2442228A1 (de) 2010-10-13 2010-10-13 Computerclusteranordnung zur Verarbeitung einer Berechnungsaufgabe und Betriebsverfahren dafür

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/861,429 Continuation US10142156B2 (en) 2010-10-13 2013-04-12 Computer cluster arrangement for processing a computation task and method for operation thereof

Publications (1)

Publication Number Publication Date
WO2012049247A1 (en) 2012-04-19

Family

ID=43831684

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2011/067888 WO2012049247A1 (en) 2010-10-13 2011-10-13 A computer cluster arrangement for processing a computation task and method for operation thereof

Country Status (18)

Country Link
US (4) US10142156B2 (de)
EP (3) EP2442228A1 (de)
JP (3) JP6494161B2 (de)
KR (3) KR102103596B1 (de)
CN (2) CN103229146B (de)
CA (3) CA2814309C (de)
CY (1) CY1122108T1 (de)
DK (1) DK2628080T3 (de)
ES (1) ES2743469T3 (de)
HR (1) HRP20191640T1 (de)
HU (1) HUE044788T2 (de)
LT (1) LT2628080T (de)
PL (1) PL2628080T3 (de)
PT (1) PT2628080T (de)
RS (1) RS59165B1 (de)
RU (1) RU2597556C2 (de)
SI (1) SI2628080T1 (de)
WO (1) WO2012049247A1 (de)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20150088045A (ko) * 2014-01-23 2015-07-31 서울대학교산학협력단 매니코어 클러스터 시스템 상에서 병렬 프로그래밍을 수행하는 방법 및 매니코어 클러스터 시스템
KR20160042848A (ko) * 2013-01-18 2016-04-20 서울대학교산학협력단 클러스터 시스템의 계산 디바이스 가상화 방법 및 그 시스템
US9501325B2 (en) 2014-04-11 2016-11-22 Maxeler Technologies Ltd. System and method for shared utilization of virtualized computing resources
US9584594B2 (en) 2014-04-11 2017-02-28 Maxeler Technologies Ltd. Dynamic provisioning of processing resources in a virtualized computational architecture
WO2018065530A1 (en) 2016-10-05 2018-04-12 Partec Cluster Competence Center Gmbh High performance computing system and method
KR20190038195A (ko) * 2017-09-29 2019-04-08 주식회사 트레드링스 작업 할당 시스템, 방법, 및 컴퓨터 프로그램
WO2019145354A1 (en) 2018-01-23 2019-08-01 Partec Cluster Competence Center Gmbh Application runtime determined dynamical allocation of heterogeneous compute resources
WO2019219747A1 (en) 2018-05-15 2019-11-21 Partec Cluster Competence Center Gmbh Apparatus and method for efficient parallel computation
US10715587B2 (en) 2014-04-11 2020-07-14 Maxeler Technologies Ltd. System and method for load balancing computer resources
WO2020221799A1 (en) 2019-04-30 2020-11-05 Bernhard Frohwitter Apparatus and method to dynamically optimize parallel computations
RU2815262C2 (ru) * 2018-05-15 2024-03-12 Партек Кластер Компитенс Сентер Гмбх Устройство и способ эффективного параллельного вычисления

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2442228A1 (de) 2010-10-13 2012-04-18 Thomas Lippert Computerclusteranordnung zur Verarbeitung einer Berechnungsaufgabe und Betriebsverfahren dafür
WO2014188643A1 (ja) * 2013-05-24 2014-11-27 日本電気株式会社 スケジュールシステム、スケジュール方法、及び、記録媒体
JP2014078214A (ja) * 2012-09-20 2014-05-01 Nec Corp スケジュールシステム、スケジュール方法、スケジュールプログラム、及び、オペレーティングシステム
WO2014188642A1 (ja) * 2013-05-22 2014-11-27 日本電気株式会社 スケジュールシステム、スケジュール方法、及び、記録媒体
US9576039B2 (en) 2014-02-19 2017-02-21 Snowflake Computing Inc. Resource provisioning systems and methods
CN105681366A (zh) * 2014-09-26 2016-06-15 广西盛源行电子信息有限公司 一种把上万台北斗终端接入同一台服务器的算法
CN111865657B (zh) 2015-09-28 2022-01-11 华为技术有限公司 一种加速管理节点、加速节点、客户端及方法
US10432450B2 (en) * 2016-06-30 2019-10-01 Microsoft Technology Licensing, Llc. Data plane API in a distributed computing network
US11049025B2 (en) * 2017-03-15 2021-06-29 Salesforce.Com, Inc. Systems and methods for compute node management protocols
CN111108474A (zh) * 2017-09-30 2020-05-05 英特尔公司 通过云资源管理器管理加速器资源的技术
CN110390516B (zh) * 2018-04-20 2023-06-06 伊姆西Ip控股有限责任公司 用于数据处理的方法、装置和计算机存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040098447A1 (en) * 2002-11-14 2004-05-20 Verbeke Jerome M. System and method for submitting and performing computational tasks in a distributed heterogeneous networked environment
US20040257370A1 (en) * 2003-06-23 2004-12-23 Lippincott Louis A. Apparatus and method for selectable hardware accelerators in a data driven architecture
US20050097300A1 (en) * 2003-10-30 2005-05-05 International Business Machines Corporation Processing system and method including a dedicated collective offload engine providing collective processing in a distributed computing environment
US20090213127A1 (en) * 2008-02-22 2009-08-27 International Business Machines Corporation Guided attachment of accelerators to computer systems

Family Cites Families (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0744504A (ja) * 1993-07-27 1995-02-14 Hitachi Ltd Cpuと複数のpu,fpuから成る演算ユニット
EP1046994A3 (de) * 1994-03-22 2000-12-06 Hyperchip Inc. Direkte Zellenersetzung für fehlertolerante Architektur mit gänzlich integrierten Systemen und mit Mitteln zur direkten Kommunikation mit Systembediener
US7051188B1 (en) 1999-09-28 2006-05-23 International Business Machines Corporation Dynamically redistributing shareable resources of a computing environment to manage the workload of that environment
US7418470B2 (en) * 2000-06-26 2008-08-26 Massively Parallel Technologies, Inc. Parallel processing systems and method
JP2002084302A (ja) * 2000-09-06 2002-03-22 Nippon Telegr & Teleph Corp <Ntt> ネットワークによる通信方法及び装置
RU2188451C2 (ru) * 2000-10-25 2002-08-27 Курский государственный технический университет Система взаимораспределения ресурсов
US7739398B1 (en) * 2000-11-21 2010-06-15 Avaya Inc. Dynamic load balancer
US6922832B2 (en) 2000-12-12 2005-07-26 Lockheed Martin Corporation Execution of dynamic services in a flexible architecture for e-commerce
US20030164842A1 (en) * 2002-03-04 2003-09-04 Oberoi Ranjit S. Slice blend extension for accumulation buffering
US8397269B2 (en) * 2002-08-13 2013-03-12 Microsoft Corporation Fast digital channel changing
US7137040B2 (en) 2003-02-12 2006-11-14 International Business Machines Corporation Scalable method of continuous monitoring the remotely accessible resources against the node failures for very large clusters
CN1754146B (zh) 2003-02-24 2010-04-28 Bea系统公司 用于服务器负载均衡和服务器亲缘关系的系统和方法
US7093147B2 (en) 2003-04-25 2006-08-15 Hewlett-Packard Development Company, L.P. Dynamically selecting processor cores for overall power efficiency
US7996839B2 (en) 2003-07-16 2011-08-09 Hewlett-Packard Development Company, L.P. Heterogeneous processor core systems for improved throughput
US9264384B1 (en) 2004-07-22 2016-02-16 Oracle International Corporation Resource virtualization mechanism including virtual host bus adapters
US7437581B2 (en) * 2004-09-28 2008-10-14 Intel Corporation Method and apparatus for varying energy per instruction according to the amount of available parallelism
JP2006277458A (ja) 2005-03-30 2006-10-12 Hitachi Ltd リソース割当管理装置およびリソース割当方法
WO2007038445A2 (en) * 2005-09-26 2007-04-05 Advanced Cluster Systems, Llc Clustered computer system
US7490223B2 (en) 2005-10-31 2009-02-10 Sun Microsystems, Inc. Dynamic resource allocation among master processors that require service from a coprocessor
US7441224B2 (en) * 2006-03-09 2008-10-21 Motorola, Inc. Streaming kernel selection for reconfigurable processor
US8713574B2 (en) * 2006-06-05 2014-04-29 International Business Machines Corporation Soft co-processors to provide a software service function off-load architecture in a multi-core processing environment
JP4936517B2 (ja) * 2006-06-06 2012-05-23 学校法人早稲田大学 ヘテロジニアス・マルチプロセッサシステムの制御方法及びマルチグレイン並列化コンパイラ
US8589935B2 (en) 2007-05-08 2013-11-19 L-3 Communications Corporation Heterogeneous reconfigurable agent compute engine (HRACE)
US8250578B2 (en) * 2008-02-22 2012-08-21 International Business Machines Corporation Pipelining hardware accelerators to computer systems
US8615647B2 (en) 2008-02-29 2013-12-24 Intel Corporation Migrating execution of thread between cores of different instruction set architecture in multi-core processor and transitioning each core to respective on / off power state
US8434087B2 (en) * 2008-08-29 2013-04-30 International Business Machines Corporation Distributed acceleration devices management for streams processing
US9104617B2 (en) * 2008-11-13 2015-08-11 International Business Machines Corporation Using accelerators in a hybrid architecture for system checkpointing
FR2938943B1 (fr) * 2008-11-21 2010-11-12 Thales Sa Systeme multiprocesseur.
CN101441564B (zh) * 2008-12-04 2011-07-20 浙江大学 为程序定制的可重构加速器实现方法
US9588806B2 (en) * 2008-12-12 2017-03-07 Sap Se Cluster-based business process management through eager displacement and on-demand recovery
US8869160B2 (en) * 2009-12-24 2014-10-21 International Business Machines Corporation Goal oriented performance management of workload utilizing accelerators
CN101763288B (zh) * 2010-01-19 2012-09-05 湖南大学 考虑硬件预配置因素的动态软硬件划分方法
US8875152B2 (en) * 2010-04-22 2014-10-28 Salesforce.Com, Inc. System, method and computer program product for dynamically increasing resources utilized for processing tasks
US8739171B2 (en) * 2010-08-31 2014-05-27 International Business Machines Corporation High-throughput-computing in a hybrid computing environment
EP2442228A1 (de) * 2010-10-13 2012-04-18 Thomas Lippert Computerclusteranordnung zur Verarbeitung einer Berechnungsaufgabe und Betriebsverfahren dafür

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
AMNON BARAK ET AL., A PACKAGE FOR OPENCL BASED HETEROGENEOUS COMPUTING ON CLUSTERS WITH MANY GPU DEVICES
JOSE DUATO, RAFAEL MAYO ET AL.: "rCUDA: reducing the number of GPU-based accelerators in high performance clusters", INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND SIMULATION (HPCS), 28 June 2010 (2010-06-28), pages 224 - 231

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101682113B1 (ko) 2013-01-18 2016-12-02 서울대학교산학협력단 클러스터 시스템의 계산 디바이스 가상화 방법 및 그 시스템
KR20160042848A (ko) * 2013-01-18 2016-04-20 서울대학교산학협력단 클러스터 시스템의 계산 디바이스 가상화 방법 및 그 시스템
KR101594915B1 (ko) 2014-01-23 2016-02-17 서울대학교산학협력단 매니코어 클러스터 시스템 상에서 병렬 프로그래밍을 수행하는 방법 및 매니코어 클러스터 시스템
US9396033B2 (en) 2014-01-23 2016-07-19 Snu R&Db Foundation Method of executing parallel application on manycore cluster system and the manycore cluster system
KR20150088045A (ko) * 2014-01-23 2015-07-31 서울대학교산학협력단 매니코어 클러스터 시스템 상에서 병렬 프로그래밍을 수행하는 방법 및 매니코어 클러스터 시스템
US10715587B2 (en) 2014-04-11 2020-07-14 Maxeler Technologies Ltd. System and method for load balancing computer resources
US9584594B2 (en) 2014-04-11 2017-02-28 Maxeler Technologies Ltd. Dynamic provisioning of processing resources in a virtualized computational architecture
US9501325B2 (en) 2014-04-11 2016-11-22 Maxeler Technologies Ltd. System and method for shared utilization of virtualized computing resources
WO2018065530A1 (en) 2016-10-05 2018-04-12 Partec Cluster Competence Center Gmbh High performance computing system and method
EP3944084A1 (de) 2016-10-05 2022-01-26 ParTec AG Hochleistungsrechensystem und -verfahren
US11494245B2 (en) 2016-10-05 2022-11-08 Partec Cluster Competence Center Gmbh High performance computing system and method
KR20190038195A (ko) * 2017-09-29 2019-04-08 주식회사 트레드링스 작업 할당 시스템, 방법, 및 컴퓨터 프로그램
KR101985899B1 (ko) 2017-09-29 2019-06-04 주식회사 트레드링스 작업 할당 시스템, 방법, 및 컴퓨터 프로그램
WO2019145354A1 (en) 2018-01-23 2019-08-01 Partec Cluster Competence Center Gmbh Application runtime determined dynamical allocation of heterogeneous compute resources
WO2019219747A1 (en) 2018-05-15 2019-11-21 Partec Cluster Competence Center Gmbh Apparatus and method for efficient parallel computation
RU2815262C2 (ru) * 2018-05-15 2024-03-12 Партек Кластер Компитенс Сентер Гмбх Устройство и способ эффективного параллельного вычисления
WO2020221799A1 (en) 2019-04-30 2020-11-05 Bernhard Frohwitter Apparatus and method to dynamically optimize parallel computations

Also Published As

Publication number Publication date
JP6433554B2 (ja) 2018-12-05
RU2013121560A (ru) 2014-11-20
RS59165B1 (sr) 2019-10-31
EP2442228A1 (de) 2012-04-18
KR20190025746A (ko) 2019-03-11
EP2628080A1 (de) 2013-08-21
US11934883B2 (en) 2024-03-19
PL2628080T3 (pl) 2019-11-29
CA3027973A1 (en) 2012-04-19
CA3145494A1 (en) 2012-04-19
US20130282787A1 (en) 2013-10-24
EP2628080B1 (de) 2019-06-12
CN103229146A (zh) 2013-07-31
HRP20191640T1 (hr) 2019-12-13
HUE044788T2 (hu) 2019-11-28
JP2017216000A (ja) 2017-12-07
KR102074468B1 (ko) 2020-02-06
SI2628080T1 (sl) 2019-10-30
ES2743469T3 (es) 2020-02-19
RU2597556C2 (ru) 2016-09-10
DK2628080T3 (da) 2019-09-02
US20240168823A1 (en) 2024-05-23
CA2814309C (en) 2019-03-12
US20210194748A1 (en) 2021-06-24
US10142156B2 (en) 2018-11-27
CA2814309A1 (en) 2012-04-19
KR101823505B1 (ko) 2018-02-01
KR20180014185A (ko) 2018-02-07
KR102103596B1 (ko) 2020-04-23
CY1122108T1 (el) 2020-11-25
EP3614263A3 (de) 2021-10-06
LT2628080T (lt) 2019-10-10
US10951458B2 (en) 2021-03-16
JP6653366B2 (ja) 2020-02-26
CA3027973C (en) 2022-03-22
CN103229146B (zh) 2018-12-11
JP2013539881A (ja) 2013-10-28
US20190089574A1 (en) 2019-03-21
CN109491795A (zh) 2019-03-19
JP2019057303A (ja) 2019-04-11
EP3614263A2 (de) 2020-02-26
KR20140018187A (ko) 2014-02-12
JP6494161B2 (ja) 2019-04-03
PT2628080T (pt) 2019-09-13

Similar Documents

Publication Publication Date Title
US11934883B2 (en) Computer cluster arrangement for processing a computation task and method for operation thereof
CN109791509B (zh) 高性能计算系统和方法
US7716336B2 (en) Resource reservation for massively parallel processing systems
KR20160087706A (ko) 가상화 플랫폼을 고려한 분산 데이터 처리 시스템의 자원 할당 장치 및 할당 방법
KR20130088512A (ko) 클러스터 컴퓨팅 환경에서의 자원 관리 장치 및 방법
JP2016541072A (ja) リソース処理方法、オペレーティング・システム、およびデバイス
US11169846B2 (en) System and method for managing tasks and task workload items between address spaces and logical partitions
Neuwirth et al. Scalable communication architecture for network-attached accelerators
US11138146B2 (en) Hyperscale architecture
CN111459871A (zh) 一种基于fpga异构计算的区块链加速系统及方法
Neuwirth et al. Communication models for distributed intel xeon phi coprocessors
Misawa et al. Dynamic Reconfiguration of Computer Platforms at the Hardware Device Level for High Performance Computing Infrastructure as a Service
Ihnotic Scaling to 32 GPUs on a Novel Composable System Architecture

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11768015

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2814309

Country of ref document: CA

ENP Entry into the national phase

Ref document number: 2013533215

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 20137011931

Country of ref document: KR

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 2011768015

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2013121560

Country of ref document: RU

Kind code of ref document: A