WO2009146721A1 - Resource allocation in a distributed system - Google Patents

Resource allocation in a distributed system Download PDF

Info

Publication number
WO2009146721A1
Authority
WO
WIPO (PCT)
Prior art keywords
message
nodes
identifier
resources
node
Prior art date
Application number
PCT/EP2008/004498
Other languages
French (fr)
Inventor
Jens Elmenthaler
Original Assignee
Verigy (Singapore) Pte. Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Verigy (Singapore) Pte. Ltd. filed Critical Verigy (Singapore) Pte. Ltd.
Priority to PCT/EP2008/004498 priority Critical patent/WO2009146721A1/en
Publication of WO2009146721A1 publication Critical patent/WO2009146721A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/542 Event management; Broadcasting; Multicasting; Notifications
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5011 Pool
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/546 Xcast


Abstract

A distributed system has a plurality of nodes and a central unit couplable to the plurality of nodes via a communication interface having multicast capabilities. In order to manage resources on the nodes, the nodes receive from the central unit a first message including an identifier and a request to allocate required resources to the identifier. The nodes allocate the required resources accordingly and receive a second message including the identifier and resource configuration information defining how to configure the allocated resources. The allocated resources are configured accordingly and, upon receipt of a third message including the identifier, the configured resources are utilized.

Description

Resource Allocation in a Distributed System
Description
Embodiments of the invention concern a concept for allocating resources in a distributed system. Embodiments of the invention concern a node for a distributed system, which is configured for a corresponding resource allocation. Embodiments of the invention concern a distributed system and embodiments of the invention concern a method for distributing identical resource usage information from a central unit to nodes of a distributed system.
A distributed system is shown in Fig. 5 of the present application. The distributed system shown in Fig. 5 may, for example, be implemented in automatic test equipment, such as the Verigy V93000 series. The distributed system comprises a central unit 1000 coupled to multiple processing nodes 1002 via a communication infrastructure 1004. Three nodes, node 0, node 1 and node n, are shown in Fig. 5, standing in for a possibly greater number of nodes, as indicated by the dots between node 1 and node n. The communication infrastructure 1004 has multicast capability, i.e. the possibility of sending messages to a sub-set of the nodes 1002 only. For example, multicast capability is implemented in the Verigy V93000 series by making use of so-called "common channels", which represent communication means between the central unit and a sub-set of the distributed nodes. At present, 32 common channels may be implemented in the Verigy V93000 series.
Each node 1002 has associated therewith a resource pool 1006, such as resource pools 0 to n shown in Fig. 5. A resource pool 1006 may comprise memory, registers, interrupts, ports, file descriptors, etc. For example, memory associated with each resource pool may comprise different address regions, as shown by the respective address regions indicated in Fig. 5.
Again referring to the Verigy V93000 series, the central unit 1000 may represent the test system controller and the distributed nodes 1002 may represent respective testing modules, such as test processor driven instruments, having associated therewith one or more pins to be connected to a DUT (Device Under Test).
Managing resource pools, such as memory, registers, interrupts, ports, file descriptors, etc. replicated on multiple processing nodes, may comprise allocating a resource, accessing it and freeing it. Access operations, such as write to, read from, open, execute, trigger, etc., may happen in common for all nodes for which, for a specific access operation, a resource is allocated. Triggering such an operation in the minimum time possible on all nodes would be beneficial. For a given operation, a resource is not allocated on all nodes. Instead, many different sub-sets of nodes may be created by an application, and a resource is only allocated for one of these sub-sets, i.e. a resource allocated for one sub-set of nodes may not be allocated to another sub-set of nodes.
As an operation example, the following sequence of actions may be considered:
1. Allocate space for 10 instructions on nodes 0...15;
2. Write these 10 instructions to the allocated space; and
3. Execute these instructions.
According to conventional distributed systems, such as the Verigy V93000 series, a central instance, such as the central unit 1000, manages the resource pools of all nodes in common. The same resource is picked on all nodes for an allocation request. Thus, the physical resource ID can be used as a broadcast parameter. Multicast groups are predefined and each node knows to which multicast groups it belongs. This way, access operations are triggered by a single multicast, achieving the best throughput.
With respect to the above operation example, this results in the following sequence of actions:
1. The central unit 1000 allocates space for 10 instructions on nodes 0 to 15. To be more specific, the central unit 1000 allocates memory in the intersection of the physical memory of nodes 0 to 15, i.e. allocates memory in overlapping address areas of nodes 0 to 15. Nodes 0 to 15 may be selected corresponding to the requirements of an application running on the central unit. Thereupon, a multicast group for nodes 0 to 15 is created and a group ID is assigned to the multicast group, for example, ID0. Creating a multicast group may include configuring the communication infrastructure 1004 to implement a connection between the central unit 1000 and each of the nodes of the multicast group, i.e. nodes 0 to 15. In the Verigy V93000 series, such a connection is referred to as a "common channel" and the Verigy V93000 series provides for the possibility of implementing 32 common channels.
2. The 10 instructions are written into the allocated resources, making use of a multicast message. To be more specific, the central unit 1000 sends a multicast message to the group having the group ID ID0, the multicast message including a request to write the instructions, such as i0, ..., i9, to the allocated memory, such as starting from address 0x0000ffff. In response to the multicast message, nodes 0 to 15 write the instructions to the allocated memory.
3. In order to trigger execution of the instructions, the central unit 1000 sends a multicast message to group 0, the multicast message including a request to start execution at the starting address of the allocated memory, such as 0x0000ffff. Upon receipt of this message, nodes 0 to 15 start execution of the corresponding instructions.
The above steps of allocating memory, creating a multicast group and writing the instructions into the memory of the nodes of the multicast group take place during a set-up phase, while the step of executing takes place during an execution phase.
Thus, in the conventional approach, a central instance manages the pools of all nodes in common. The same resource is picked on all nodes for an allocation request. Thus, the physical resource ID can be used as a broadcast parameter. Multicast groups are predefined and each node knows to which multicast groups it belongs. Accordingly, in the execution phase, operations are triggered by a single multicast, achieving the best throughput.
In the above approach, if the nodes have different physical value ranges of resources in their pools, only the intersection of the physical resources available on the requesting nodes can be utilized. If resources are allocated on many overlapping sub-sets of nodes, i.e. if nodes belong to many different multicast groups generated in the above manner, resources are wasted. In addition, multicast is fast only if the multicast group for the receiving nodes is already defined. The number of required sub-sets is huge. Thus, for a given hardware implementation, a multicast group cannot be permanently defined for every sub-set.
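A short numeric sketch may make the resource waste concrete. The following Python fragment is purely illustrative; the address ranges and the helper intersection() are assumptions, not part of the patent. It shows that a centrally managed allocator restricted to the intersection of the nodes' physical memory windows can only use the addresses common to every node of a sub-set, even though each node owns further memory outside that window.

    def intersection(ranges):
        """Common address window available on all given nodes, or None."""
        ranges = list(ranges)
        start = max(lo for lo, _ in ranges)
        end = min(hi for _, hi in ranges)
        return (start, end) if start < end else None

    # Hypothetical physical memory windows of three nodes.
    node_memory = {
        0: (0x00000000, 0x00020000),
        1: (0x00008000, 0x00028000),
        2: (0x00010000, 0x00030000),
    }

    subset = [0, 1, 2]
    # Only the overlap 0x10000...0x20000 may be used by the central allocator,
    # although every node owns 0x20000 bytes of memory in total.
    print(intersection(node_memory[n] for n in subset))

Every additional overlapping sub-set must be served from the same shared window, which is why overlapping multicast groups waste resources in the conventional scheme.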
Accordingly, there is a need for an improved concept for distributing identical resource configuration information from a central unit to a sub-set of a plurality of nodes of a distributed system. It would be beneficial to have an approach with a reduced resource waste and with a reduced number of required multicast groups in the execution phase, while triggering operations in the execution phase can still be achieved in a short time.
Summary of the Invention
It is the object of the present invention to provide an improved approach for allocating resources of nodes in a distributed system with a reduced resource waste, which may permit triggering execution of operations using the resources in a fast manner.
This object is solved by a node according to claim 1, a distributed system according to claim 7 and a method for distributing identical resource configuration information according to claim 15.
An embodiment of the invention provides for a node for a distributed system, the node comprising:
a plurality of resources;
a local processor configured to:
receive a first message including an identifier and a request to allocate required resources to the identifier;
store the identifier and allocate the required resources from the plurality of resources to the identifier;
receive a second message including the identifier and resource configuration information defining how to configure the allocated resources; and
configure the allocated resources according to the resource configuration information.
Embodiments of the invention provide a distributed system comprising a plurality of such nodes; and a central unit coupleable to the plurality of nodes via a communication interface having multicast capabilities.
Embodiments of the invention provide a method for distributing identical resource configuration information from a central unit to a sub-set of a plurality of nodes of a distributed system, comprising:
by the central unit, multicasting a first message to a subset of at least two of said plurality of nodes, the first message including an identifier and a request to allocate required resources to the identifier;
by the sub-set of the plurality of nodes, receiving the first message, storing the identifier and allocating required resources to the identifier;
by the central unit, broadcasting a second message to the plurality of nodes, the second message including the identifier and resource configuration information defining how to configure the allocated resources; and
by the plurality of nodes, receiving the second message, deriving the identifier from the second message, configuring the allocated resources according to the resource configuration information if the node has stored the identifier and, otherwise, ignoring the second message.
According to embodiments of the invention, a node for a distributed system is provided with a local facility, such as a local processor core, to manage its resource pool. The local facility may be configured to map an identifier, such as a logical resource ID, to a local physical resource, upon receipt of a message from a central facility including the identifier. The identifier, such as the logical resource ID, may be assigned by the central facility during the allocation of this resource.
Accordingly, embodiments of the invention permit a simpler algorithm for managing the resource pool of a node by the node itself. In addition, due to the per-node management of resources, embodiments of the invention permit an optimal usage of the resources in a node's pool. The data structure overhead for managing the resource pools may be distributed over the nodes, so that the local facility associated with each node relieves the central unit of the algorithmic complexity of per-node management.
According to embodiments of the invention, the first message including the identifier and the request to allocate required resources to the identifier is a multicast message. In embodiments of the invention, the second message including the identifier and the resource configuration information is a broadcast message. According to embodiments of the invention, a third message including the identifier and a request to execute a program utilizing the configured resources is a broadcast message. Thus, according to embodiments of the invention, with respect to the second and third messages, multicast can be replaced by broadcast or by multicast to always the same group. No multicast group needs to be reconfigured for an access operation utilizing the configured resources. Rather, a broadcast message may be used, wherein nodes not knowing the identifier, such as the logical resource ID, ignore the broadcast message.
In embodiments of the invention, the resource pool of a node, i.e. the plurality of resources of a node, comprises a memory having a plurality of storage locations, wherein the resource configuration information defines instructions to be written to the memory and wherein the local processor is further configured to write the instructions to the memory, receive a third message including the identifier and a request to execute the instructions, and execute the instructions written to the memory. In embodiments of the invention, setup data may be stored in the memory upon receipt of the second message and utilized upon receipt of the third message.
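The node-side behaviour just described can be summarised in a short sketch. The Python below is a minimal illustrative model, not the claimed implementation: the dict-based message format, the Node class and its flat memory model are assumptions introduced here; the essential point is the locally kept mapping from the logical resource ID to a locally chosen physical resource.

    class Node:
        """Minimal model of a node with a local memory pool (illustrative only)."""

        def __init__(self, mem_base, mem_size):
            self.free_base = mem_base        # next free physical address
            self.mem_end = mem_base + mem_size
            self.alloc = {}                  # logical resource ID -> physical base address
            self.memory = {}                 # physical address -> instruction word

        def on_first_message(self, msg):
            # Multicast: store the identifier and allocate locally chosen resources.
            size = msg["size"]
            if self.free_base + size > self.mem_end:
                raise MemoryError("resource pool exhausted")
            self.alloc[msg["id"]] = self.free_base
            self.free_base += size

        def on_second_message(self, msg):
            # Broadcast: configure the allocated resources, or ignore the message.
            base = self.alloc.get(msg["id"])
            if base is None:
                return                       # identifier unknown -> ignore
            for offset, word in enumerate(msg["instructions"], start=msg.get("offset", 0)):
                self.memory[base + offset] = word

        def on_third_message(self, msg):
            # Broadcast: utilize the configured resources, or ignore the message.
            base = self.alloc.get(msg["id"])
            if base is None:
                return
            # A real node would start its sequencer at `base`; the sketch only reports it.
            print("executing program of resource", msg["id"], "at", hex(base))

Because each node picks the physical address itself, nodes with different memory maps can serve the same logical resource ID without being restricted to a common address window.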
In other embodiments of the invention, the resource pools may comprise registers, interrupts, ports, file descriptors, etc. replicated on multiple processing nodes, which may be configured upon receipt of the second message and which may be utilized accordingly upon receipt of the third message. With respect to such other types of resources, the procedures are almost identical and the necessary modifications can easily be applied by a person skilled in the art.
Brief Description of the Drawings
Embodiments of the present invention will be described in the following with reference to the accompanying drawings in which:
Fig. 1 shows a schematic view of a distributed system according to an embodiment of the invention;
Fig. 2 shows a schematic view of a testing system;
Fig. 3 shows a schematic view of a communication infrastructure with multicast capability to a set of nodes;
Figs. 4a and 4b show schematic views of a flow chart of a method according to an embodiment of the invention; and
Fig. 5 shows a schematic view of a conventional distributed system. Fig. 1 shows a schematic view of a distributed system comprising a central unit 10, a communication interface 12, a plurality of nodes 14 (designated node 0 to node n in Fig. 1), and a resource pool 16 associated with each node 14. The resource pools 16 are designated resource pool 0 to resource pool n in Fig. 1 and each resource pool comprises a memory having physical memory addresses as indicated in Fig. 1. As can be derived from Fig. 1, the physical memory addresses of the memories associated with the different nodes may be different. In this respect, the structure of the distributed system may be comparable to the structure of the distributed system explained above referring to Fig. 5. An example of such a distributed system is implemented in the Verigy V93000 series.
In comparison to the conventional structure shown in Fig. 5, the central unit 10 and the nodes 14 shown in Fig. 1 are adapted to implement the inventive approach for managing the resource pools 16 replicated on the plurality of processing nodes 14. Managing the resource pools may include allocating a resource, accessing it and freeing it. Access operations, such as write to, read from, open, execute, trigger, etc. may happen in common for all nodes 14 that a resource is allocated for. It would be beneficial, however, if the operations could be triggered in the minimum time possible on all nodes.
In the following, embodiments of the invention are explained in part referring to a testing system. Persons skilled in the art will understand that the present application is not restricted to testing systems, but may be implemented in other environments in which resources are replicated on multiple processing nodes and in which access operations may happen in common for a plurality of nodes, such as distributed sensor systems, distributed actuation systems, and the like. The central unit 10 comprises a central processor 20, which may be part of a workstation, for example. Each node 14 is provided with a local facility, such as a local processor 22, configured to implement the inventive approach. It goes without saying that the central unit 10 and the nodes 14 comprise additional hardware structures required besides the processors 20, 22 in order to implement the inventive approach, such as memories, interfaces, connection lines, and the like. In embodiments of the invention, the central unit 10 may be the central workstation of a testing system and the nodes 14 may be testing modules having individual processors associated with a single test pin or a plurality of test pins. Reference is made to Fig. 2, showing a tester 30 comprising a central processor 32 and a plurality of individual processors 34, which are coupled to the central processor 32 via a communication infrastructure 36. Each local processor 34 is associated with and coupled to a single test pin 38. A device under test 40 or a plurality of devices under test may be contacted by the test pins 38 to apply test signals to the device under test 40.
The test signals may be generated by executing instructions stored in the memories of the resource pools 16 associated with the nodes 14. In addition, instructions that reference setup data stored in the memories may be used in generating the test signals. To permit the execution of the instructions during the execution phase, the instructions may be distributed to the nodes during a set-up phase. To this end, the central processor 20 may run one test application or a plurality of test applications. The test application(s) may create many different sub-sets of nodes to which specific instructions are to be written. To be more specific, the application(s) may create a first sub-set of nodes to which a first set of instructions is to be written and a different second sub-set of nodes to which a different second set of instructions is to be written. For example, a test application running on the central processor 20 may require that 10 instructions i0, ..., i9 are to be distributed to each of nodes 0 to 15. In reply to the application creating the sub-set of nodes 0 to 15 in this manner, the central processor 20 generates a multicast group for nodes 0 to 15, step S1 in Fig. 4a.
Creating a multicast group may include assigning a group ID, such as ID0, to nodes 0 to 15. Creating a multicast group may also include configuring the communication infrastructure 12, 36 to provide for a direct connection between the central unit 10 and nodes 0 to 15. An example of a communication infrastructure with multicast capability is shown in Fig. 3. According to the embodiment shown in Fig. 3, the communication infrastructure comprises a number of channels, each channel comprising a switching unit 50, a first signal line 52 coupling the switching unit 50 to the central unit 10 and a plurality of second signal lines 54 coupling the switching unit to each of the nodes 14. The switching unit 50 may be configured to connect the first signal line 52 to any desired number of second signal lines 54, so that a given channel may connect the central unit 10 to any desired number of nodes 14. For example, to create a multicast group for nodes 0 to 15, one of the switching units 50 may be configured to connect the associated first signal line 52 with each of the second signal lines 54 connected to nodes 0 to 15. Accordingly, in embodiments of the invention, a multicast group is created by hardware means. In alternative embodiments of the invention, creating a multicast group may be implemented simply by using the group ID to address each member of the multicast group.
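Such switching-unit based channels can be modelled very simply. The sketch below is an illustration under stated assumptions, not the V93000 hardware interface: each common channel is treated as a set of node indices, creating a multicast group means programming one free channel with the member set, and sending on that channel delivers the message to exactly those nodes via a caller-supplied delivery callback.

    class CommonChannels:
        """Toy model of a limited pool of multicast channels (e.g. 32)."""

        def __init__(self, num_channels=32):
            self.members = [None] * num_channels     # channel index -> set of node IDs

        def create_group(self, nodes):
            for ch, used in enumerate(self.members):
                if used is None:
                    self.members[ch] = set(nodes)    # "close" the switches for these nodes
                    return ch
            raise RuntimeError("no free common channel")

        def discard_group(self, ch):
            self.members[ch] = None                  # channel becomes available again

        def multicast(self, ch, message, deliver):
            # `deliver` is a callback taking (node_id, message).
            for node_id in sorted(self.members[ch]):
                deliver(node_id, message)

With only a fixed number of channels, the group is created for the first message and discarded again (step S5); the second and third messages need no channel at all because they are broadcast.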
After generating the multicast group, the central unit 10 assigns a logical resource ID, such as ID15, to the memory space required to store the 10 instructions. The central unit 10 generates a first message, the first message including the logical resource ID, such as ID15, and a request to allocate space for 10 instructions. The central unit multicasts the first message to the members of the multicast group, i.e. to nodes 0 to 15, step S2 in Fig. 4a.
Nodes 0 to 15 receive the first message, step S3 in Fig. 4a. Each of nodes 0 to 15 stores the logical resource identifier, such as ID15, and allocates resources as requested in the first message, step S4. Optionally, there may be some kind of handshake protocol indicating to the central unit that each of nodes 0 to 15 has received the first message and has allocated resources accordingly.
Thereupon, the multicast group is discarded by the central unit, see step S5. In embodiments of the invention, there is no additional multicast group required for the further process steps.
In embodiments of the invention, a second multicast group might be required if a distributed system comprises both nodes supporting the inventive approach and nodes not supporting it. In such embodiments, a multicast group, such as a permanent multicast group, may be permanently configured for all nodes supporting the inventive approach. The further messages would then be multicast to this permanent multicast group.
The central unit then broadcasts a second message, the second message including a request to write the instructions i0, ..., i9 and the logical resource identifier, such as ID15, see step S6 in Fig. 4a. The second message may also include an offset value, such as offset0, concerning the address in the allocated memory space. In embodiments of the invention, the second message is broadcast, i.e. sent to all nodes of the distributed system rather than to nodes 0 to 15 only.
Upon receipt, step S7, of the second message, each node checks whether it has stored the identifier, such as ID15, included in the second message, step S8. If a node has stored the identifier, it will follow the request in the second message, i.e. will write (step S9) the instructions i0 to i9 to the allocated memory space, so that the resources are configured for an execution of the instructions. If a node has not stored the identifier, it will ignore the second message, step S10.
By the above steps, instructions i0 to i9 have been written to the allocated memories of each of nodes 0 to 15. Accordingly, the resources associated with these nodes are configured for execution during an execution stage. In embodiments of the invention, steps S1 to S10 are performed once during a set-up phase in order to configure the nodes for a later execution. In order to execute the instructions, the central unit 10 broadcasts a third message to all nodes, step S12. The third message includes a request to start execution and includes the logical resource identifier, such as ID15. Optionally, the third message may also include an offset value, such as offset0. Upon receipt, step S13, of the third message, each of the nodes 14 checks whether it has stored the identifier, step S14. If so, the instructions stored at the allocated space assigned to the identifier are executed. In other words, a program (formed by the instructions) is executed, step S15. Nodes which have not stored the identifier ignore the third message, step S16. Steps S12 to S16 may be performed a number of times during the execution phase. In a test environment, steps S12 to S16 may be performed a number of times for testing the same device under test or different devices under test.
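Putting steps S1 to S16 together, the central unit's side of the protocol is a short, fixed sequence. The Python below is a sketch under the same assumptions as the node model above (dict-based messages, a channel pool as sketched for Fig. 3, hypothetical multicast/broadcast helpers); it only illustrates the message order, not the patented implementation.

    def setup_and_run(channels, multicast, broadcast, subset, instructions, logical_id):
        """Set-up phase (S1-S10) followed by one pass of the execution phase (S12-S16)."""
        # S1: create a temporary multicast group for the chosen sub-set of nodes.
        ch = channels.create_group(subset)

        # S2: first message - identifier plus allocation request (multicast).
        multicast(ch, {"id": logical_id, "size": len(instructions)})

        # S5: the group is no longer needed once the nodes have allocated locally.
        channels.discard_group(ch)

        # S6: second message - identifier plus configuration data (broadcast).
        broadcast({"id": logical_id, "instructions": instructions, "offset": 0})

        # S12: third message - identifier plus execution request (broadcast);
        # this call may be repeated any number of times during the execution phase.
        broadcast({"id": logical_id, "action": "execute"})

Nodes that never stored the identifier simply ignore the two broadcasts, so no multicast group has to be kept alive or reconfigured for the execution phase.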
In other embodiments of the invention, other resources, such as registers, may be allocated according to the inventive approach. A set of registers may be provided in the nodes, such as the SW flags (software flags) provided by the Verigy V93000, which support the central unit in controlling the control flow of a program in the nodes. SW flags are meant to change the control flow of a program, such as a sequencer program, in the nodes very fast. If SW flags were not available, the central unit would need to change the programs in the affected nodes, which might take much longer.
With respect to allocating such registers, a message sequence might be as follows: 1) The central unit creates a multicast group with ID2 for nodes 0 ... 255. 2) The central unit tells the multicast group with ID2 to allocate a physical SW flag register to be assigned to a logical SW flag with the ID 1. 3) The nodes in the multicast group each allocate one of their available SW flag registers. 4) The multicast group with ID2 is discarded.
The application of this SW flag is a variation of the instruction download scenario, assuming the 10 instructions on nodes 0 ... 15 with the memory resource ID 15. When downloading the instructions, one of the instructions is a conditional jump that, depending on the value of the logical SW flag 1, either jumps over the following 3 instructions or continues with the next instruction. The nodes that know the affected program resource internally translate the jump instruction such that the locally assigned physical SW flag is applied. Later, when running the program, the central unit broadcasts a message to all nodes to set SW flag 1 to 0. All nodes that have been told to assign a physical SW flag to the logical SW flag 1 follow this instruction. The remaining nodes ignore this message. Then the central unit continues to start the program with the ID 15.
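The SW flag variant can be sketched along the same lines. Everything concrete below (the size of the register file, the message handlers, the jump encoding) is an assumption made for illustration; the point is the local translation of a logical flag ID into whichever physical flag register the node happened to allocate.

    class SwFlagNode:
        """Node fragment handling logical software flags (illustrative sketch)."""

        def __init__(self, num_flag_registers=8):
            self.flag_regs = [0] * num_flag_registers   # physical SW flag registers
            self.free_regs = list(range(num_flag_registers))
            self.logical_to_physical = {}               # logical flag ID -> register index

        def on_allocate_flag(self, logical_id):
            # Received via the temporary multicast group (e.g. ID2 for nodes 0..255).
            self.logical_to_physical[logical_id] = self.free_regs.pop(0)

        def translate_jump(self, logical_id, skip=3):
            # While downloading instructions: rewrite the conditional jump so that it
            # tests the locally assigned physical register instead of the logical ID.
            return ("jump_if_flag_zero", self.logical_to_physical[logical_id], skip)

        def on_set_flag(self, logical_id, value):
            # Broadcast "set SW flag 1 to 0": ignored by nodes without the mapping.
            reg = self.logical_to_physical.get(logical_id)
            if reg is not None:
                self.flag_regs[reg] = value

Because the flag message is a broadcast carrying only the logical ID, the central unit can steer the control flow of the already downloaded program without knowing which physical register each node assigned.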
Accordingly, embodiments of the invention provide for an approach in which resources are allocated locally at the nodes of a distributed system rather than by a central unit. Thus, the waste of resources that occurs when resources are allocated on many overlapping sub-sets of nodes can be avoided. In embodiments of the invention, a logical resource ID is distributed using a multicast message. Referring to the Verigy V93000, a common channel may be used to this end. In embodiments of the invention, the subsequent second and third messages are broadcast and, therefore, common channels are not required for these messages.
Embodiments of the invention can be implemented in a digital storage medium, for example, a floppy disc, a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a flash memory having electronically readable control signals stored thereon, which co-operate with a programmable computer system such that the respective method is performed. Generally, embodiments of the present invention can be implemented as a computer program or a computer program product with a program code stored on a machine-readable carrier, the program code being operative for performing the method when the computer program product runs on a computer. In other words, embodiments of the inventive method are, therefore, a computer program having a program code for performing the method when the computer program runs on a computer.

Claims

Claims
1. Node (14; 34) for a distributed system (30), the node comprising:
a plurality of resources (16);
a local processor (22) configured to:
receive (S3) a first message including an identifier and a request to allocate required resources to the identifier;
store (S4) the identifier and allocate the required resources from the plurality of resources to the identifier;
receive (S7) a second message including the identifier and resource configuration information defining how to configure the allocated resources; and
configure (S9) the allocated resources according to the resource configuration information.
2. The node of claim 1, wherein the local processor (22) is further configured to receive (S14) a third message including the identifier and a request to execute a program utilizing the configured resources.
3. The node of claim 1, wherein the plurality of resources (16) comprise a memory comprising a plurality of storage locations, wherein the resource configuration information define instructions to be written to the memory, and wherein the local processor (22) is further configured to:
write the instructions to the memory; receive (S14) a third message including the identifier and a request to execute the instructions; and
execute (S18) the instructions written to the memory.
4. The node of claim 1, wherein the first message is a multicast message, and wherein the second message is a broadcast message, wherein the local processor (22) is configured to derive from the second message the identifier and ignore the second message if the node has not stored the identifier.
5. The node of claim 2 or 3, wherein the first message is a multicast message and wherein the second and third messages are broadcast messages, and wherein the local processor (22) is configured to:
derive from the second message the identifier and to ignore the second message if the node has not stored the identifier, and to derive from the third message the identifier and to ignore the third message if the node has not stored the identifier.
6. The node of one of claims 1 to 4, comprising a test pin (38) associated therewith, wherein the resource configuration information define test instructions to be stored in the allocated resources (16), execution of the test instructions causing test signals to be applied to the test pin (38).
7. The node of one of claims 1 to 6, wherein the resources comprise at least one of memory, registers, interrupts, ports and file descriptors.
8. A distributed system comprising: a plurality of nodes (14; 34) according to one of claims 1 to 7;
a central unit (10; 32) couplable to the plurality of nodes (14; 34) via a communication interface (12; 36) having multicast capabilities.
9. The distributed system of claim 8, wherein the central unit (10; 32) is configured to:
multicast (S2) the first message to a sub-set of at least two of the plurality of nodes (14; 34); and
broadcast (S6) the second message to the plurality of nodes (14; 34).
10. The distributed system of claim 9, wherein the local processor (22) is further configured to receive a third message including the identifier and a request to execute a program utilizing the configured resources, and wherein the central unit (10; 32) is further configured to broadcast (S12) the third message to the plurality of nodes.
11. The distributed system of one of claims 8 to 10, wherein the central unit (10; 32) is configured to select the sub-set of the plurality of nodes depending on the requirements of an application, to generate (Sl) a multicast group by assigning a group ID to each node of the sub-set of the plurality of nodes, to assign the resource identifier to each node of the multicast group, and to discard (S5) the multicast group after having multicast the first message.
12. The distributed system of one of claims 8 to 11, comprising signal lines (52, 54) and switches (50) by which the plurality of nodes (14) are couplable to the central unit (10) via the signal lines (52, 54).
13. The distributed system of one of claims 8 to 12, wherein the broadcast channel comprises signal lines coupling the central unit (10; 32) to each of the plurality of nodes (14; 34).
14. The distributed system of one of claims 8 to 13, wherein the multicast channel comprises signal lines coupling a sub-set of the plurality of nodes (14; 34) to the central unit (10; 32).
15. The distributed system of one of claims 8 to 14, wherein the central unit (10; 32) is a test system controller, wherein the nodes (14; 34) are testing modules each associated with at least one test pin (38), and wherein the resource configuration information define instructions to be stored in the allocated resources and relating to test signals to be applied to the at least one test pin (38).
16. Method for distributing identical resource configuration information from a central unit (10; 32) to a sub-set of a plurality of nodes (14; 34) of a distributed system, comprising:
by the central unit (10; 32), multicasting (S2) a first message to a sub-set of at least two of said plurality of nodes (14; 34), the first message including an identifier and a request to allocate required resources (16) to the identifier;
by the sub-set of the plurality of nodes (14; 34), receiving (S3) the first message, storing (S4) the identifier and allocating required resources to the identifier;
by the central unit (10; 32), broadcasting (S6) a second message to the plurality of nodes, the second message including the identifier and resource configuration information defining how to configure the allocated resources (16); and
by the plurality of nodes (14; 34), receiving (S7) the second message, deriving the identifier from the second message, configuring the allocated resources (16) according to the resource configuration information if the node has stored the identifier and, otherwise, ignoring the second message.
17. The method of claim 16, further comprising:
by the central unit (10; 32), broadcasting (S12) a third message including the identifier and a request to execute a program utilizing the configured resources (16); and
by the plurality of nodes (14; 34), receiving the third message, executing the program if the node has stored the identifier and, otherwise, ignoring the third message.
18. The method of claim 17, wherein multicasting the first message and broadcasting the second message are performed in a set-up phase once, and wherein broadcasting the third message is performed in an execution phase a plurality of times.
19. Method of claim 18, wherein the central unit (10; 32) is a central testing unit and the nodes (14; 34) are testing modules each associated with at least one test pin (38), wherein broadcasting (S12) the third message is performed for each of a number of devices under test, which are connected to pins (38) of the test modules sequentially.
20. The method of one of claims 16 to 19, wherein the plurality of resources (16) comprise a memory comprising a plurality of storage locations, and wherein the resource configuration information define instructions to be stored in the memory.
21. The method of one of claims 16 to 20, wherein the resources comprise at least one of memory, registers, interrupts, ports and file descriptors.
22. A program comprising a program code performing a method according to one of claims 16 to 21 if running on a computing device.
PCT/EP2008/004498 2008-06-05 2008-06-05 Resource allocation in a distributed system WO2009146721A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2008/004498 WO2009146721A1 (en) 2008-06-05 2008-06-05 Resource allocation in a distributed system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2008/004498 WO2009146721A1 (en) 2008-06-05 2008-06-05 Resource allocation in a distributed system

Publications (1)

Publication Number Publication Date
WO2009146721A1 (en)

Family

ID=40125792

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2008/004498 WO2009146721A1 (en) 2008-06-05 2008-06-05 Resource allocation in a distributed system

Country Status (1)

Country Link
WO (1) WO2009146721A1 (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6981027B1 (en) * 2000-04-10 2005-12-27 International Business Machines Corporation Method and system for memory management in a network processing system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018075529A1 (en) * 2016-10-19 2018-04-26 Advanced Micro Devices, Inc. System and method for dynamically allocating resources among gpu shaders
US10311626B2 (en) 2016-10-19 2019-06-04 Advanced Micro Devices, Inc. System and method for identifying graphics workloads for dynamic allocation of resources among GPU shaders

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08759048

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08759048

Country of ref document: EP

Kind code of ref document: A1