GB2454996A - Handling inbound initiatives for a multi-processor system by its input/output subsystem using data that defines which processor is to handle it. - Google Patents


Info

Publication number
GB2454996A
Authority
GB
United Kingdom
Prior art keywords
processors
initiatives
node
processor
initiative
Prior art date
Legal status
Granted
Application number
GB0822779A
Other versions
GB2454996B (en)
GB0822779D0 (en)
Inventor
Elke Nass
Martin Taubert
Hans-Helge Lehmann
Joachim Von Buttlar
Frank Koeble
Alexander Albus
Kenneth J Oakes
John S Trotter
Current Assignee
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Publication of GB0822779D0
Publication of GB2454996A
Application granted
Publication of GB2454996B
Status: Expired - Fee Related

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005: Allocation of resources, e.g. of the central processing unit [CPU], to service a request
    • G06F 9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 9/505: Allocation of resources to service a request, the resource being a machine, considering the load
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005: Allocation of resources, e.g. of the central processing unit [CPU], to service a request
    • G06F 9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 9/5044: Allocation of resources to service a request, the resource being a machine, considering hardware capabilities
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00: Indexing scheme relating to G06F9/00
    • G06F 2209/50: Indexing scheme relating to G06F9/50
    • G06F 2209/504: Resource capping
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Disclosed is a method for multiprocessor computer systems using an optimized input/output subsystem that is based on dedicated System Assist Processors (SAPs) or I/O Processors (IOPs) to handle inbound initiatives, i.e. request responses and/or external event notifications. In each node (10), the I/O processors (12) communicate with I/O devices via I/O paths (2) corresponding to the nodes (10); the responses are generated by I/O hardware and/or firmware addressed by a precedent I/O request issued by a respective one of said I/O processors (12). The external event notifications are generated by I/O hardware and/or firmware, and the initiatives are to be processed by one, or a group, of the multiple I/O processors. The method uses a pre-defined first data element (20) indicating incoming or existing initiatives from any I/O path (2) of all of the nodes, to be served by one of the I/O processors (12); a set of second data structures (22), one per node (10), each of which defines which initiative is preferably handled by which processor (12) or group of processors (12); and a set of third data structures (24), one per node (10), in which multiple bits can be set to indicate the occurrence of respective initiatives for respective multiple I/O processors (12).

Description

DESCRIPTION
Method for Balanced Handling of Initiative in a Non-uniform Multiprocessor Computing System
1. BACKGROUND OF THE INVENTION
1.1. FIELD OF THE INVENTION
The present invention relates to the field of high-performance, multiprocessor computer systems which use an optimized Input/Output subsystem that is based on dedicated System Assist Processors (SAPs) or I/O Processors (IOPs). These SAPs perform communication between multiple CPUs and many peripheral I/O devices attached via multiple I/O channel paths, e.g. for database business applications. More specifically, the present invention relates to a method and respective system as given in the preamble of the independent claims.
1.2. DESCRIPTION AND DISADVANTAGES OF PRIOR ART
The present invention relates to prior art methods and systems for processing I/O requests issued by said System Assist Processors, and particularly to the way a request response issued by said I/O periphery is processed in the multiprocessor system.
An exemplary prior art system of the above mentioned shape is described in "Multiple-logical-channel subsystems: Increasing zSeries I/O scalability and connectivity", in IBM Journal of Research and Development, Volume 48, Number 3, 4, May/July 2004, pages 489 to 505.
A further prior art method and system is described in US Patent No. 4,959,781, assigned to Stardent Computer, Inc. That patent discloses a hardware-implemented apparatus in a symmetrical multiprocessor computer system, in which a control method for dispatching processor work to different processors is implemented such that an idle processor has higher priority than a busy processor when the handling of interrupts is dispatched among the plurality of available processors.
Disadvantageously, the teaching of that patent cannot be applied to non-uniform multiprocessor computing systems, which are the focus of the present invention, because the patent disclosure is based on the general assumption that any work to be dispatched across the processors can basically be performed by each processor with exactly the same performance impact. In a non-uniform multiprocessor computing system, also referred to as an asymmetrical computer system (in contrast to symmetrical systems), the various processors have different tasks to perform. In particular, some specific I/O related work concerning a given I/O channel path can only be performed by one dedicated System Assist Processor.
The second aspect of prior art which is not reflected in the before-mentioned US Patent is the special processing required when the multiple processors are implemented in a number of substructures, referred to in prior art as "nodes". A single node typically comprises a large plurality of CPUs and a respective plurality of I/O processors which act as I/O servers for the CPUs. In addition, it usually also contains a portion of the computing system's distributed shared main storage and cache subsystem, and provides a portion of the computing system's I/O connectivity. Prior art does not disclose a performance-optimizing control method for avoiding so-called "cross-node communication" or cross-node traffic with respect to main storage accesses and any cache coherency protocol implemented in the system when handling the request responses coming back from the I/O hardware. A non-optimized response handling, however, causes a serious performance loss of the multiprocessor system.
The prior art technical disadvantages described before are also incorporated in an exemplary prior art system, that of the IBM System z server product line.
Figure 1 illustrates the most basic structural components of prior art processor node hardware and software environment used for a prior art method in such System z server products.
This product line includes large multiprocessor systems with a high number of I/O paths 2 attached via an I/O expansion network 4 to one or more nodes 10, which are subsystems of the overall system. Each node contains multiple Processing Units 12 (PUs), which in an IBM System z server are used as general CPUs or System Assist Processors (SAPs). A shared cache 8 is provided amongst the multiple processors and nodes, with the node interconnect implementing a cache coherency protocol. Shared memory 6 is distributed across all nodes. In prior art IBM System z servers, every I/O path 2 has a static functional affinity to one SAP, which means that all its inbound initiative is handled by this dedicated processor.
In this context, an "initiative" is to be understood as a response to a former initiated I/O operation or as a request coming from an I/O device via an I/O path.
In this context functional affinity is to be understood as the requirement to do initialization or handle initiative from an I/O path on a specific SAP.
Statistically, the operations that must honor the functional affinity are rare, but they can occur at any time and must be handled appropriately.
With respect to functional affinity, all I/O paths are equally distributed amongst all available SAPs. This fixed assignment is optimized only from a numerical point of view and in order to avoid single points of failure in the I/O configuration. In prior art IBM System z servers, the functional affinity statically defines which SAP handles initiatives coming from a given I/O path, regardless of the current overall workload distribution in the running system.
To receive initiative and handle the inbound requests, a prior art System z SAP uses processor-local hardware registers, processor-local initiative vectors in memory, and processor-local summary vectors in memory.
As the distribution of the I/O paths among the System Assist Processors is static, but the workloads targeted at an I/O path (and in turn at the respective processor) are not deterministic and depend on the overall system workload and the actual configuration, prior art initiative handling has the following problems and disadvantages:
* It may lead to an unbalanced system with some processors being much more utilized than others,
* In extreme situations this may even lead to an overload situation where one or more processors become a bottleneck,
* It limits the capacity of one I/O path to the capacity of one processor, even if the I/O path has a higher capacity,
* In addition, the accumulated capacity of all I/O paths assigned to one processor is limited by the capacity of this processor (even though other processors may have capacity available).
Furthermore, the following constraints must be met:
* Only a part of the inbound work (depending on its technical nature) can be done on every processor;
* All inbound initiatives from one I/O path must be handled in order;
* Not all required resources can be accessed with the same performance characteristics (like local or remote cache) by all processors.
Therefore it is not sufficient to just introduce a simple, straightforward "floating concept", e.g. where any I/O initiative can be handled by any available processor. For this kind of I/O work in such a non-uniform environment, a more sophisticated method is required.
Generally, the following hierarchy is given in a prior art I/O subsystem:
* The I/O subsystem runs on multiple System Assist Processors (SAPs), which are denoted in figures 1 and 2 just as PUs (processing units),
* One SAP drives multiple I/O paths, also referred to as I/O channels,
* One channel drives multiple peripheral devices (such as disks, screens, printers, etc.).
An I/O configuration consists of:
* One to about a dozen SAPs,
* Several hundred channels,
* Thousands of devices.
The present invention solely deals with the phases where the I/O path 2 signals status responses and external events to the SAP and the SAP processes the channel's initiative.
In a running system, the vast majority of I/O operations deal with individual devices, such as disks. The invention allows that for these operations, status responses from the channel can be processed by any SAP.
But, in a specific prior art system where a processing unit (PU) serves as a SAP, there are other I/O operations that must be handled exclusively by a specific SAP, the so-called "SAP that has affinity to that channel". Such operations typically occur:
1. during initialization (e.g. to initialize the link to the channel),
2. during or after I/O network configuration (establishing a path to a newly added device),
3. during recovery and reset (e.g. any failure that requires re-initialization of the channel),
4. during and as part of concurrent maintenance procedures (e.g. the channel is removed from the configuration or dynamically added).
In general, when processing such initiatives, the following steps are performed: 1. Initialization, 2. I/O initiative signaling, 3. Selection and firmware execution.
Resuming the description of the disadvantages of prior art given above: prior art does not disclose a performance-optimized control method for avoiding so-called "cross-node communication" or cross-node traffic when handling the request responses coming back from the I/O hardware. A non-optimized response handling, however, leads to a serious performance loss of the multiprocessor system.
1.3. OBJECTIVES OF THE INVENTION
The objective of this invention is to overcome throughput limitations caused by static routing of I/O responses to I/O processors in a multiprocessor system.
2. SUMMARY AND ADVANTAGES OF THE INVENTION
This objective of the invention is achieved by the features stated in the enclosed independent claims. Further advantageous arrangements and embodiments of the invention are set forth in the respective subclaims. Reference should now be made to the appended claims.
According to a preferred aspect of the present invention, a method and respective system is disclosed that allows dynamic distribution of inbound I/O workload, i.e. the before-mentioned initiatives, across the I/O subsystem, taking into account:
1. hardware structures, for optimum performance,
2. workload that must be handled by a specific processor, i.e. that is not eligible for distribution (functional affinity),
3. the processing of requests in the correct order (FIFO) on an I/O path basis,
4. firmware path length optimization.
In general, any System Assist or I/O processor is able to handle inbound initiative from any I/O path. However, due to the given hardware structures, it is advisable to associate inbound work preferably with a processor or group of processors that have the closest ties (in terms of latency) with that I/O path. All pairs of "node and respective I/O path" are assigned to performance groups. There may be multiple performance groups, e.g. based on a multi-level cache hierarchy.
The term "I/O" as used herein usually describes any kind of system activity involving reading from and writing to external devices, e.g. disks, tapes, terminals or printers. But it can be extended to any other kind of non-uniform initiative handling.
To achieve this behavior, in a preferred embodiment, the following data elements are defined: * A Global Summary Vector (GSV) indicating initiative from any existing I/O paths. This vector is used for fast detection of initiative.
* A Node Local Summary Mask (NLSM), one per performance group (node) that defines which initiative is preferably handled by each I/O processor. One bit in the NLSM corresponds to one bit in the GSV.
* Node Local Initiative Vectors (NLIV) that indicate inbound initiatives for a system node. A 0/1-bit in the GSV corresponds to a zero/non-zero double word (64 bits) of the NLIV.
* A Functional Affinity Work Indicator (FAWI), associated with an I/O path, which is set for initiative that must be handled by the processor with functional affinity.
* A semaphore mechanism to ensure that requests are handled in the correct order (FIFO).
The GSV indicates to all processors whether there is initiative at all. In general, a processor handles initiatives in descending order of preference. If there is no initiative in the most preferable performance group, initiatives from less-preferred performance groups are evaluated. For initiatives that can only be handled by a specific processor, the FAWI is honored instead of the performance group.
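The data elements above can be pictured as plain C structures. This is an illustrative sketch only: the sizes (a 96-bit GSV, four nodes, one 64-bit NLIV double word per GSV bit) follow the examples given later in the text, and all type and function names are invented here rather than taken from actual System z firmware.

```c
#include <stdint.h>

/* Hypothetical layout of the per-system and per-node control data. */

#define GSV_BITS   96                 /* bits in the Global Summary Vector */
#define GSV_WORDS  (GSV_BITS / 32)    /* held here as three 32-bit words   */
#define MAX_NODES  4                  /* illustrative node count           */

typedef struct {
    uint32_t gsv[GSV_WORDS];              /* global: initiative from any path      */
    uint32_t nlsm[MAX_NODES][GSV_WORDS];  /* per node: which GSV bits are local    */
    uint64_t nliv[MAX_NODES][GSV_BITS];   /* per node: one double word per GSV bit */
} io_initiative_state;

/* A 1-bit in the GSV corresponds to a non-zero double word in the owning
 * node's NLIV; the NLSM masks the GSV down to the bits local to a node. */
static int gsv_bit_is_set(const io_initiative_state *s, int bit)
{
    return (s->gsv[bit / 32] >> (bit % 32)) & 1u;
}

static void gsv_set_bit(io_initiative_state *s, int bit)
{
    s->gsv[bit / 32] |= 1u << (bit % 32);
}
```

In a real implementation these fields would live in shared storage visible to all SAPs, which is exactly what distinguishes them from the processor-local vectors of the prior art.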
Thus, according to a basic aspect of the present invention, a method and system is disclosed for handling inbound initiatives, i.e., request responses and/or external event notifications, in an input/output (I/O) subsystem of a multi-node computer system, wherein in each node a plurality of I/O processors read from and write to I/O devices via a plurality of I/O paths corresponding to said plurality of nodes, wherein said initiatives are generated by I/O hardware and/or firmware addressed by a precedent I/O request issued by a respective one of said I/O processors, wherein said external event notifications are generated by I/O hardware and/or firmware, and wherein said initiatives are to be processed by one of the multiple I/O processors. The method is characterized by the steps of:
a) using a first, pre-defined data element, exemplarily referred to herein as Global Summary Vector (GSV), or a plurality of them, indicating incoming or existing initiatives from all I/O paths of all of said nodes to be served by one of said I/O processors, in order to quickly detect an initiative,
b) using a plurality of second data structures, exemplarily referred to herein as Node Local Summary Mask (NLSM), one of said second structures per node, wherein each of said second data structures defines which initiative is preferably handled by which processor,
c) using a plurality of third data structures, exemplarily referred to herein as Node-Local Initiative Vector (NLIV), one of said third structures per node, wherein multiple bits can be set in order to indicate the occurrence of respective initiatives for respective multiple I/O processors,
d) preferably and optionally using a plurality of fourth data structures, exemplarily referred to herein as Functional Affinity Work Indicator (FAWI), one of said fourth data structures per I/O path, which is set to indicate that an initiative must be handled by a processor with functional affinity corresponding to the type of initiative, and optionally
e) using a semaphore mechanism for ensuring that requests are handled in a predefined order (FIFO).
Instead of one Global Summary Vector, a plurality of equivalently used summary vectors can also be used, e.g. the same number as there are nodes, in order to adapt the system to use cases where the load is concentrated on a single node or a small number of them. In the latter case, no mask such as the NLSM above (which is required when a true GSV is present) is needed anymore. In this case, a node summary vector (NSV) indicates initiatives in the NLIV.
Instead of a node, a number of processors can also be collectively regarded as a group. A node then comprises a group of processors, wherein group members are defined to be in the same group independently of any associated I/O paths, in contrast to the node-I/O path association depicted in figure 1. The criteria used to build a group, including a certain processor and excluding another, may be various and wide-ranging in nature. The inventive method is then performed group-wise instead of node-wise.
This invention achieves the following advantages:
1. As each processor is generally eligible to handle any initiative, workload is balanced across the available I/O processor capacity, thereby improving overall throughput and latency.
2. Workload from high-performance I/O paths whose traffic would exceed the capacity of one I/O processor is distributed across multiple I/O processors.
3. As the workload is distributed honoring hardware structures, optimum overall system performance is achieved.
4. It is still possible to route certain initiatives to certain processors.
As a person skilled in the art may appreciate, the inventive method and system successfully meets the following constraints:
* Only a part of the inbound work (depending on its technical nature) can be done on every processor,
* All inbound initiatives from one I/O path must be handled in order,
* Not all required resources can be accessed with the same performance characteristics (like local or remote cache) by all processors.
3. BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is illustrated by way of example and is not limited by the shape of the figures of the drawings, in which:
Figure 1 illustrates the most basic structural components of a prior art hardware and software environment used for a prior art method,
Figure 2 illustrates the most basic structural components of an inventive hardware and software environment used for a preferred embodiment of the inventive method,
Figure 3 illustrates the control flow of the most important steps of a preferred embodiment of the inventive method.
4. DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
In general, when applying the inventive method according to this embodiment, the following steps are performed:
1. Initialization,
2. I/O initiative signaling,
3. Selection and firmware execution.
Step 1 is performed once at system startup, or occasionally in hotplug scenarios, and is described below. Steps 2 and 3 are performed continuously and with high frequency in a running system with ongoing I/O operations, and are described in more detail below.
Figure 2 illustrates the resources which are required as the base for the inventive selective floating algorithm and how they correlate to the system hardware structure. Figure 3 illustrates the control flow.
With reference to figure 2, as to the initialization step, the system assist processor (SAP) system initialization allocates and initializes the Global Summary Vector 20 (GSV), the Node Local Summary Masks 22 (NLSM), and the Node Local Interrupt Vectors (NLIV) 24. Broken line printed circles interconnected by bidirectional broken line arrows indicate bits or paths, respectively, corresponding to each other.
The GSV 20 -96 bits for example -is initialized to zero. Groups of (up to 64) channels are assigned one bit in this vector.
The NLIVs 24 are initialized to zero. There is one vector per node, and each channel physically attached to that node is assigned a unique bit position in that vector. One double word (64 bits) in the NLIV corresponds to one bit in the GSV. The NLIV is filled in by channels to signal initiative to the SAPs.
One NLSM 22 is defined per node. It has the same size as the GSV. A 1-bit in the NLSM indicates that the corresponding bit in the GSV describes summary initiative for channels of that node.
GSV and NLSM are logically ANDed to identify those initiatives that are preferably handled by SAPs on the local node. However, when no node-local initiative is found, a SAP will attempt to process initiative from the other nodes.
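This local-first scan can be sketched in a few lines of C: AND the GSV with the node's NLSM to find node-local initiative, and only if that yields nothing, scan the remaining (remote) GSV bits. The word count and the `pick_initiative` helper are illustrative assumptions, not the actual firmware interface.

```c
#include <stdint.h>

#define GSV_WORDS 3   /* 96-bit GSV held as three 32-bit words */

/* Returns the GSV bit index to service next, or -1 if nothing matches.
 * The caller runs the local pass (prefer_local = 1) first and falls back
 * to the remote pass (prefer_local = 0) only when nothing local is found. */
static int pick_initiative(const uint32_t gsv[GSV_WORDS],
                           const uint32_t nlsm[GSV_WORDS],
                           int prefer_local)
{
    for (int w = 0; w < GSV_WORDS; w++) {
        uint32_t cand = prefer_local ? (gsv[w] & nlsm[w])    /* local bits  */
                                     : (gsv[w] & ~nlsm[w]);  /* remote bits */
        if (cand)
            return w * 32 + __builtin_ctz(cand);  /* lowest set bit */
    }
    return -1;
}
```

The real firmware additionally rotates its starting position (round robin) so one channel group cannot starve the others; that detail is omitted here for brevity.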
The inventive algorithm takes the following specific characteristics into consideration: * Distinction of local vs. remote nodes which corresponds to the "non-uniform" aspect, and * The need to honor a functional affinity for specific operations.
It is conceivable to extend the hierarchical concept (local vs. remote nodes) to a multi-level hierarchy.
With general reference to the figures, and with special reference now to figure 3, the following section describes the selective and performance-optimized balanced handling of I/O initiative as an algorithm which will be realized in System z firmware.
Inbound I/O work according to the selective floating algorithm can be divided into two phases:
1. "I/O Initiative Signaling": the hardware has to be set up such that the initiative coming from an I/O path causes a bit to be set in the Node Local Interrupt Vector (NLIV). The important difference to prior art is that this initiative is now kept in a data structure which is common for all processors on one node and no longer kept local for each processor. If such a bit is set by the I/O path in the NLIV, an attention interrupt to the processor with functional affinity is additionally generated if the affected double word containing this NLIV bit changes from zero to non-zero. This behavior is unchanged compared to previous implementations and therefore does not require any new hardware or changes to existing processor or I/O hardware. In turn, the attention handler which gets invoked on the processor with functional affinity now accumulates the appropriate processor-local hardware register content into a system-unique data structure, the Global Summary Vector (GSV), instead of into a processor-local summary vector, as was done before in prior art.
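The zero-to-non-zero condition that gates the attention interrupt can be sketched as follows. `signal_initiative` is an invented helper name, and real channel hardware would perform this update atomically; the sketch only shows the transition test.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sets channel_bit (0..63) in the node-local NLIV double word *dword and
 * reports whether the double word made the zero -> non-zero transition
 * that warrants an attention interrupt to the affinity processor. */
static bool signal_initiative(uint64_t *dword, int channel_bit)
{
    bool was_empty = (*dword == 0);
    *dword |= (uint64_t)1 << channel_bit;
    return was_empty;
}
```

Raising the interrupt only on the empty-to-non-empty edge keeps the interrupt rate bounded: further bits arriving while the double word is already non-zero are picked up by the polling SAPs without another interrupt.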
2. "Selection and Firmware Execution": firmware on each processor in the system polls (its local copy of) the GSV and thus detects available inbound work when the GSV becomes non-zero. Using a single, global summary vector (the GSV) has the advantage that each processor can determine very quickly whether any inbound work is pending in the system at all. A processor can then start working on this global initiative in the following selective and cache-optimized way, the control flow of which is depicted in figure 3. In figure 3, the first step 302 is to check if there is any GSV initiative at all. If not, control is fed back to the end step 400, and a new control flow begins.
In the YES branch of step 302, the next step 305 is to use the inventive Node Local Summary Mask (NLSM) to check if any work which is indicated in the GSV 20 is local to this node (preferably implemented by a logical AND of GSV and NLSM).
If any local work is available, then the inventive firmware selects (steps 310, 315) such an I/O path from the NLIV (round robin) for further handling. This is the performance- and cache-optimal case and has an increasing probability as the I/O rate goes up. If there is no local work pending, then the implemented firmware selects initiative in steps 320, 325 from any other I/O path (round robin), even if this I/O path is not local to this node.
Then, in step 330, the firmware checks the "functional affinity work indicator" bit (FAWI), which may have been set by some processor earlier in step 370. If this indication is active, and if this processor is not the one with functional affinity (see decision 335), then the initiative for the selected I/O path is skipped for this task, see the n-branch of step 335, and step 340.
The firmware then tries to obtain the semaphore which is associated with the selected I/O path, step 345. If this semaphore cannot be obtained, see the n-branch of decision 350, then another processor is already handling the initiative of that same I/O path, and in turn the initiative for the selected I/O path is skipped for this task, see step 340 again. This ensures the exclusiveness and strict ordering of the initiative handling on an I/O path basis.
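The per-path semaphore of steps 345/350 behaves like a try-lock: only the processor that wins it may handle that path's initiative, and a loser simply skips the path rather than waiting. A minimal sketch using C11 atomics, with invented names:

```c
#include <stdatomic.h>

/* One semaphore per I/O path; illustrative only. */
typedef struct {
    atomic_flag locked;
} path_sem;

/* Returns 1 on success; 0 means another processor already owns the path,
 * so this processor skips the path (step 340) instead of blocking. */
static int path_sem_try_acquire(path_sem *s)
{
    return !atomic_flag_test_and_set(&s->locked);
}

static void path_sem_release(path_sem *s)
{
    atomic_flag_clear(&s->locked);
}
```

Because a path's initiative is only ever drained by the single current semaphore holder, in-order (FIFO) handling per I/O path falls out of this exclusiveness without any further ordering machinery.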
Having successfully obtained the semaphore, see the y-branch of step 350, the firmware examines the data which is associated with this I/O path initiative, step 355, in order to determine whether this kind of work coming from the selected I/O path can be done by any processor (all performance-critical work) or whether this work is dedicated to the processor with functional affinity (non-performance-critical work only), see step 360. This is the selective part of the algorithm and is used to distinguish between floating work and dedicated work requiring a processor with functional affinity.
In the latter case (y-branch of step 360), and if this processor is not the one with functional affinity, the "functional affinity work indicator" for this I/O path is set, step 370, the semaphore is released, step 380, and the task ends without processing the initiative, step 400.
In the n-branch of decision 360, the "functional affinity work indicator" for this I/O path is cleared, step 385, and the work associated with the initiative received is handled, step 390.
Then the semaphore is released, step 395, and finally also the end 400 of the processing is reached.
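Putting the figure-3 steps together, one polling pass of a single processor can be condensed roughly as below. The `pass_state` flags are stand-ins for the real GSV/NLSM/FAWI/semaphore state, and the round-robin path selection over the NLIV (steps 310-325) is deliberately omitted; this is a sketch of the decision structure, not of the actual firmware.

```c
/* Possible outcomes of one polling pass (steps 302-400). */
enum outcome { NO_INITIATIVE, SKIPPED, DEFERRED_TO_AFFINITY, HANDLED };

typedef struct {
    int has_gsv_initiative;   /* step 302: any bit set in the GSV          */
    int local_work;           /* step 305: GSV & NLSM non-zero             */
    int fawi_set;             /* step 330: affinity work indicator         */
    int has_affinity;         /* steps 335/360: this CPU owns the path     */
    int sem_free;             /* step 350: path semaphore obtainable       */
    int needs_affinity;       /* step 360: work is dedicated, not floating */
} pass_state;

static enum outcome polling_pass(pass_state *st)
{
    if (!st->has_gsv_initiative)                 /* step 302 */
        return NO_INITIATIVE;
    /* steps 305-325: pick an I/O path, node-local first, else remote */
    if (st->fawi_set && !st->has_affinity)       /* steps 330-340 */
        return SKIPPED;
    if (!st->sem_free)                           /* steps 345-350 */
        return SKIPPED;                          /* another CPU owns the path */
    if (st->needs_affinity && !st->has_affinity) {
        st->fawi_set = 1;                        /* step 370 */
        st->sem_free = 1;                        /* step 380: release semaphore */
        return DEFERRED_TO_AFFINITY;             /* step 400, nothing processed */
    }
    st->fawi_set = 0;                            /* step 385 */
    /* step 390: handle the initiative */
    st->sem_free = 1;                            /* step 395: release semaphore */
    return HANDLED;                              /* step 400 */
}
```

Note how a deferral leaves the FAWI set: the next pass of the affinity processor then takes the FAWI branch and handles the dedicated work, while every other processor keeps skipping that path.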
In a preferred embodiment of the inventive method as implementable on IBM System z computer systems, the processor type that handles inbound I/O initiative and thereby makes use of the invention is a System Assist Processor (SAP). This does not preclude applying the invention to other processor types.
While the before-mentioned embodiment describes an example of IBM System z-typical inbound I/O initiative handling, the invention is generally applicable to any other non-uniform initiative handling.
The invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In a preferred embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc. Furthermore, the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use -17 -by or in connection with the instruction execution system, apparatus, or device.
The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk read-only memory (CD-ROM), compact disk read/write (CD-R/W) and DVD.
A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus.
The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers.
Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modem and Ethernet cards are just a few of the currently available types of network adapters.
The respective circuit of a hardware-implemented embodiment as mentioned above is part of the design for an integrated circuit chip. The chip design is created in a graphical computer programming language, and stored in a computer storage medium (such as a disk, tape, physical hard drive, or virtual hard drive such as in a storage access network). If the designer does not fabricate chips or the photolithographic masks used to fabricate chips, the designer transmits the resulting design by physical means (e.g., by providing a copy of the storage medium storing the design) or electronically (e.g., through the Internet) to such entities, directly or indirectly. The stored design is then converted into the appropriate format (e.g., GDSII) for the fabrication of photolithographic masks, which typically include multiple copies of the chip design in question that are to be formed on a wafer. The photolithographic masks are utilized to define areas of the wafer (and/or the layers thereon) to be etched or otherwise processed.

Claims (8)

CLAIMS
1. A method for handling inbound initiatives in an input/output (I/O) subsystem of a multi-node computer system, wherein in each node (10) a plurality of I/O processors (12) communicate with I/O devices via a plurality of I/O paths (2) corresponding to said plurality of nodes (10), and wherein said initiatives are generated by I/O hardware and/or firmware addressed by a precedent I/O request issued by a respective one of said I/O processors (12), and wherein said initiatives are to be processed by one or by a group of said multiple I/O processors (12), characterized by the steps of: a) using at least one pre-defined, first data element (20) indicating incoming or existing initiatives from any I/O path (2) of all of said nodes and to be served by one of said I/O processors (12), b) using a plurality of second data structures (22), one of said second structures per node (10), wherein each of said second data structures defines which initiative is preferably handled by which processor (12) or group of processors (12), respectively, c) using a plurality of third data structures (24), one of said third structures per node (10), wherein multiple bits can be set in order to indicate the occurrence of respective initiatives for respective multiple I/O processors (12).
2. The method according to claim 1, further comprising the step of using a semaphore mechanism for ensuring that requests are handled in a predefined order.
3. The method according to claim 1, further comprising the step of: using a plurality of fourth data structures, one of said fourth data structures per I/O path, which is set to indicate that an initiative must be handled by a processor (12) with functional affinity corresponding to a given type of initiative.
4. The method according to the preceding claim, wherein said second, third, and fourth data structures are implemented node-local, respectively.
5. The method according to claim 1, wherein said node (10) comprises a group of processors, wherein group members are defined to be in a same group independently of any associated I/O path.
6. The method according to claim 1, wherein said steps are performed by firmware implemented within the system.
7. An electronic data processing system comprising means for handling inbound initiatives in an input/output (I/O) subsystem of a multi-node computer system, wherein in each node (10) a plurality of I/O processors (12) communicate with I/O devices via a plurality of I/O paths (2) corresponding to said plurality of nodes (10), and wherein said initiatives are generated by I/O hardware and/or firmware addressed by a precedent I/O request issued by a respective one of said I/O processors (12), and wherein said initiatives are to be processed by one or by a group of said multiple I/O processors (12), characterized by: a) at least one pre-defined, first data element (20) indicating incoming or existing initiatives from any I/O path (2) of all of said nodes and to be served by one of said I/O processors (12), b) a plurality of second data structures (22), one of said second structures per node, wherein each of said second data structures defines which initiative is preferably handled by which group of I/O processors (12), c) a plurality of third data structures (24), one of said third structures per node (10), wherein multiple bits can be set in order to indicate the occurrence of respective initiatives for respective multiple I/O processors (12).
8. A computer program product for handling inbound initiatives in an input/output (I/O) subsystem of a multi-node computer system, wherein in each node (10) a plurality of I/O processors (12) communicate with I/O devices via a plurality of I/O paths (2) corresponding to said plurality of nodes (10), and wherein said initiatives are generated by I/O hardware and/or firmware addressed by a precedent I/O request issued by a respective one of said I/O processors (12), and wherein said initiatives are to be processed by one or by a group of said multiple I/O processors (12), comprising a computer useable medium including a computer readable program, wherein the computer readable program includes functional components (20, 22, 24) that when executed on a computer cause the computer to perform the steps of: a) using at least one pre-defined, first data element (20) indicating incoming or existing initiatives from any I/O path (2) of all of said nodes and to be served by one of said I/O processors (12), b) using a plurality of second data structures (22), one of said second structures per node (10), wherein each of said second data structures defines which initiative is preferably handled by which processor (12) or group of processors (12), respectively, c) using a plurality of third data structures (24), one of said third structures per node (10), wherein multiple bits can be set in order to indicate the occurrence of respective initiatives for respective multiple I/O processors (12).
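As a rough illustration of the data elements recited in claims 1 and 3 - the system-wide first element (20), the per-node preferred-handler table (22), the per-node per-processor initiative bits (24), and the per-path functional-affinity flag - they might be modeled as follows. All identifiers and the simple dispatch policy are assumptions for illustration, not taken from the patent:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # second data structure (22): which processor(s) preferably
    # handle which type of initiative on this node
    preferred_handlers: dict = field(default_factory=dict)  # type -> [processor ids]
    # third data structure (24): one bit per I/O processor,
    # set when an initiative occurred for that processor
    pending_bits: int = 0

@dataclass
class IOSubsystem:
    nodes: list = field(default_factory=list)
    # first data element (20): initiatives exist somewhere in the system
    global_initiative: bool = False
    # fourth data structures (claim 3): per-I/O-path flag requiring a
    # processor with matching functional affinity
    path_needs_affinity: dict = field(default_factory=dict)  # path id -> bool

    def post_initiative(self, node_idx: int, initiative_type: str) -> None:
        """Record an inbound initiative: set the global element and the
        per-node bits of the processors that preferably handle it."""
        self.global_initiative = True
        node = self.nodes[node_idx]
        for proc in node.preferred_handlers.get(initiative_type, []):
            node.pending_bits |= 1 << proc
```

For example, posting an 'interrupt' initiative on a node whose preferred handlers are processors 0 and 2 sets the global element and bits 0 and 2 of that node's pending mask.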
GB0822779A 2008-01-23 2008-12-15 Method for balanced handling of initiative in a non-uniform multiprocessor computing system Expired - Fee Related GB2454996B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP08150526 2008-01-23

Publications (3)

Publication Number Publication Date
GB0822779D0 GB0822779D0 (en) 2009-01-21
GB2454996A true GB2454996A (en) 2009-05-27
GB2454996B GB2454996B (en) 2011-12-07

Family

ID=40326082

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0822779A Expired - Fee Related GB2454996B (en) 2008-01-23 2008-12-15 Method for balanced handling of initiative in a non-uniform multiprocessor computing system

Country Status (1)

Country Link
GB (1) GB2454996B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8219725B2 (en) 2010-06-16 2012-07-10 International Business Machines Corporation Cache optimized balanced handling of initiatives in a non-uniform multiprocessor computing system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1492005A2 (en) * 2003-06-27 2004-12-29 Kabushiki Kaisha Toshiba Method and system for scheduling threads to perform real-time operations
US20070113231A1 (en) * 2005-11-11 2007-05-17 Hitachi, Ltd. Multi processor and task scheduling method
WO2007069951A1 (en) * 2005-12-15 2007-06-21 Telefonaktiebolaget Lm Ericsson (Publ) A method and apparatus for load distribution in multiprocessor servers
EP1840742A2 (en) * 2006-03-31 2007-10-03 Technology Properties Limited Method and apparatus for operating a computer processor array

Similar Documents

Publication Publication Date Title
JP7313381B2 (en) Embedded scheduling of hardware resources for hardware acceleration
JP5583837B2 (en) Computer-implemented method, system and computer program for starting a task in a computer system
US9658877B2 (en) Context switching using a context controller and on-chip context cache
US8495267B2 (en) Managing shared computer memory using multiple interrupts
US20180225155A1 (en) Workload optimization system
WO2003040912A1 (en) Method and apparatus for dispatching tasks in a non-uniform memory access (numa) computer system
EP2652611A1 (en) Device discovery and topology reporting in a combined cpu/gpu architecture system
US9256465B2 (en) Process device context switching
US11556391B2 (en) CPU utilization for service level I/O scheduling
US20100095032A1 (en) Use of completer knowledge of memory region ordering requirements to modify transaction attributes
EP3204853B1 (en) Efficient interruption routing for a multithreaded processor
US10614004B2 (en) Memory transaction prioritization
US20160210255A1 (en) Inter-processor bus link and switch chip failure recovery
US8073993B2 (en) Management of redundant physical data paths in a computing system
US20080155137A1 (en) Processing an input/output request on a multiprocessor system
US10579416B2 (en) Thread interrupt offload re-prioritization
US10459771B2 (en) Lightweight thread synchronization using shared memory state
JP2018022345A (en) Information processing system
JP2013125549A (en) Method and device for securing real time property of soft real-time operating system
US20190196993A1 (en) Information processing device, information processing method, and program
WO2012083012A1 (en) Device discovery and topology reporting in a combined cpu/gpu architecture system
US11609879B2 (en) Techniques for configuring parallel processors for different application domains
US8589598B2 (en) Management of redundant physical data paths in a computing system
US20160041933A1 (en) System and method for implementing a multi-threaded device driver in a computer system
US20090307403A1 (en) System for executing system management interrupts and methods thereof

Legal Events

Date Code Title Description
746 Register noted 'licences of right' (sect. 46/1977)

Effective date: 20130107

PCNP Patent ceased through non-payment of renewal fee

Effective date: 20191215