GB2243052A - Switching network - Google Patents

Switching network

Info

Publication number
GB2243052A
Authority
GB
United Kingdom
Prior art keywords
message
routed
network
processors
preferred destination
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB9100531A
Other versions
GB9100531D0 (en)
GB2243052B (en)
Inventor
Trevor Hall
Stephen Roy Leunig
Richard Henry Banach
John Sargeant
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Services Ltd
Original Assignee
Fujitsu Services Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Services Ltd filed Critical Fujitsu Services Ltd
Publication of GB9100531D0 publication Critical patent/GB9100531D0/en
Publication of GB2243052A publication Critical patent/GB2243052A/en
Application granted granted Critical
Publication of GB2243052B publication Critical patent/GB2243052B/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/25Routing or path finding in a switch fabric
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/10Packet switching elements characterised by the switching fabric construction
    • H04L49/101Packet switching elements characterised by the switching fabric construction using crossbar or matrix
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/25Routing or path finding in a switch fabric
    • H04L49/253Routing or path finding in a switch fabric using establishment or release of connections between ports
    • H04L49/254Centralised controller, i.e. arbitration or scheduling
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/30Peripheral units, e.g. input or output ports
    • H04L49/3009Header conversion, routing tables or routing tags
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/30Peripheral units, e.g. input or output ports
    • H04L49/3018Input queuing

Abstract

A multi-processor data processing system is described in which the processors are interconnected by a switching network which routes messages between the processors. Each processor has an activity level, and these levels are propagated backwards through the switching network. Each message has a preferred destination. If the difference between the activity levels of the preferred destination and the least active processor in the neighbourhood of that preferred destination is less than a threshold value, the message is routed to the preferred destination. If, however, this difference is greater than the threshold, the message is routed to the least active processor in that neighbourhood. This ensures that the workload of the system is distributed evenly, while ensuring that, where possible, messages are directed to a preferred locality where they can be handled most efficiently.

Description

Switching Network

Background to the invention

This invention relates to switching networks. The invention is particularly, although not exclusively, concerned with a switching network for use in a multi-processor data processing system, for routing messages between the processors.
One such data processing system is described in "Flagship hardware and implementation" by Paul Townsend, ICL Technical Journal, May 1987, pages 575-594 (Oxford University Press, England). This describes a system comprising a number of processors, interconnected by a delta network. The workload of the system is divided into units referred to as packets, and each processor has an activity level, representing the number of packets that it has waiting to be processed.
A problem that arises in such a system is how to balance the workloads of the processors. One way of doing this, as described in the above reference, is for processors with relatively high workloads to distribute some of their packets to processors with relatively low workloads. This implies that each packet is routed through the network to the processor with the lowest activity level. However, the above reference also mentions that it is desirable, as far as possible, to distribute packets to processors in which their associated data structures reside. This implies that each packet should be routed to a particular preferred destination, irrespective of activity level.
The object of the present invention is to provide a way of resolving these two conflicting requirements.
Summary of the invention

According to the invention, there is provided a switching network for routing messages from a plurality of input ports to a plurality of output ports, wherein each output port has an activity level associated with it, and each message has a preferred destination associated with it, identifying one of the output ports to which the message should preferably be routed, and wherein: (a) if the difference between the activity level of the preferred destination and the minimum activity level in a predetermined neighbourhood of the preferred destination is less than a threshold value, the message is routed to the preferred destination; (b) if said difference is greater than the threshold value, the message is routed to the port with said minimum activity level in the neighbourhood of the preferred destination.
Brief description of the drawings

One data processing system including a switching network in accordance with the invention will now be described by way of example with reference to the accompanying drawings.
Figure 1 is an overall view of the data processing system.
Figure 2 shows the switching network in more detail.
Figure 3 shows one switching element of the switching network in detail.
Figure 4 shows control logic in the switching element.
Description of an embodiment of the invention

Referring to Figure 1, this shows a multi-processor data processing system comprising sixteen processors 10 (PROC 0-PROC 15), each of which has its own local memory 11 (MEM 0-MEM 15). The processors 10 are interconnected by a switching network 12.
Each processor 10 holds in its local memory a number of executable packets, representing work which is required to be performed by the processor. Execution of these packets may, in turn, create further packets which must also be executed. The exact nature of these packets, and the way in which they are executed by the processors, form no part of the present invention and so will not be described in detail.
For efficient operation of the processing system, it is necessary to provide a load balancing scheme to distribute the workload evenly between the processors. For example, it must ensure that one processor does not stand idle while the others are overloaded. For this purpose, each of the processors has an activity level A associated with it, indicating how busy it is. The activity level is an integer in the range 0-255.
As will be described, these activity levels are all fed to the switching network 12. The network determines the lowest activity level in the system, and returns this global-minimum level to each processor.
Each processor compares its own activity level with the global minimum level and, if its own activity level is higher, sends one or more of its executable packets through the switching network to one of the other processors, thereby reducing its own workload.
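By way of illustration, this load-balancing step might be sketched as follows (a minimal Python sketch; the function name, the queue representation and the "send half the surplus" policy are illustrative assumptions, not part of the specification):

    def balance_step(queue, own_level, global_min, send_packet):
        # A processor busier than the global minimum sends one or more
        # of its executable packets into the network, reducing its load.
        surplus = own_level - global_min
        if surplus <= 0:
            return 0                                     # already least busy
        to_send = min(len(queue), max(surplus // 2, 1))  # illustrative policy
        for _ in range(to_send):
            send_packet(queue.pop())                     # route via network
        return to_send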
The packets are transmitted in the form of messages. Each message has a header, which includes three parameters as follows.
Preferred address (PAD). This indicates the preferred destination of the message, i.e. the processor to which the message should preferably be routed.
Strength of feeling (SOF). This is an integer in the range 0-255 and indicates how much weight is to be given to the preferred address. The higher the value of SOF, the more likely the message is to be routed to the preferred destination. Conversely, the lower the value of SOF, the more likely it is to be routed dynamically, i.e. on the basis of the activity levels of the processors.
Neighbourhood (NHD). This defines a set of processors that are regarded as the neighbourhood of the preferred destination. More specifically, the NHD parameter specifies a particular logical grouping of the processors PROC 0-15 into neighbourhoods of predetermined size, according to the value of this parameter, as follows.
NHD = 0 : sixteen neighbourhoods, each consisting of one processor.
NHD = 1 : eight neighbourhoods, each consisting of two processors (PROC 0-1, PROC 2-3 etc.)
NHD = 2 : four neighbourhoods, each consisting of four processors (PROC 0-3, PROC 4-7 etc.)
NHD = 3 : two neighbourhoods, each consisting of eight processors (PROC 0-7, PROC 8-15)
NHD = 4-7: one neighbourhood, consisting of all sixteen processors (PROC 0-15)
Thus, for example, if PAD = 9 and NHD = 2, then the preferred destination is PROC 9, and the neighbourhood of the preferred destination consists of the four processors PROC 8-11.
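Expressed as code, this grouping rule might look as follows (a Python sketch; the function name and the use of a half-open range are illustrative choices, not part of the specification):

    def neighbourhood(pad, nhd):
        # NHD selects a power-of-two block size (values 4-7 all denote
        # the whole machine); the neighbourhood is the aligned block of
        # that size containing the preferred destination PAD.
        size = 1 << min(nhd, 4)           # 1, 2, 4, 8 or 16 processors
        base = (pad // size) * size       # align PAD down to a block boundary
        return range(base, base + size)

    # The example above: PAD = 9, NHD = 2 gives PROC 8-11.
    assert list(neighbourhood(9, 2)) == [8, 9, 10, 11]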
As will be described in detail later, each message sent through the network is routed preferentially to the processor indicated by PAD.
However, if the difference between the activity levels of the preferred processor and the least active processor in its neighbourhood is greater than SOF, the message is routed to the least active processor in that neighbourhood.
As a result, it can be seen that if SOF has its maximum value (255), the routing will be purely static, i.e. the message will always be routed to the preferred destination irrespective of the activity levels of the processors. Conversely, if SOF has its minimum value (0), the message will always be routed to the least active processor in the neighbourhood of the preferred destination, i.e. the routing is purely dynamic. For intermediate values of SOF, routing will be either to the preferred destination or to the least active processor in the neighbourhood, depending on their relative activity levels.
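The overall routing rule can thus be summarised as follows (a behavioural Python sketch of the rule, not of the hardware described below; the function and variable names are illustrative):

    def choose_destination(pad, sof, nhd, activity):
        # activity: one integer (0-255) per processor, indexed by number.
        size = 1 << min(nhd, 4)                  # neighbourhood size, as above
        base = (pad // size) * size
        nbhd = range(base, base + size)
        least = min(nbhd, key=lambda p: activity[p])
        dif = activity[pad] - activity[least]
        if sof == 255 or dif < sof:              # static routing: PAD wins
            return pad
        return least                             # dynamic: least active member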
Referring now to Figure 2, this shows the switching network 12 in more detail.
The switching network is a delta network comprising eight switching elements 20-27 arranged in two switching-levels (level 0, 1). Each switching element has four inputs and four outputs, and includes a 4 x 4 crossbar switch which can connect the inputs to the outputs in any desired pattern.
The data outputs of the processors PROC 0-15 are connected in groups of four to the inputs of the switching elements 20-23 in level 0 of the network. The outputs of these switching elements are in turn connected to the inputs of the switching elements 24-27 in level 1 as shown, such that each switch in level 0 has a connection to each of the switches in level 1.
The outputs of the switches 24-27 are connected to the data inputs of the sixteen processors. Thus, it can be seen that the network is able to route a message from any processor to any other processor. As well as this flow of messages from left to right as viewed in Figure 2, there is also a flow of activity level information through the network in the opposite direction, from right to left. Each processing unit applies its current activity level to the corresponding one of the switching elements 24-27 in level 1. Each of the switching elements in level 1 compares the four activity levels received by it, and passes the lowest of these levels to all four switching elements in level 0. Each of the switching elements in level 0 then compares the four activity levels received by it, and passes the lowest of these levels to the four processors connected to it. Thus, it can be seen that each processor receives from the network the global minimum activity level of the system.
Referring now to Figure 3, this shows one of the switching elements in more detail.
The switching element has four data input ports DATAIN 0-3 for receiving messages from the processors or switching elements to its left (as viewed in Figure 2), and four data output ports DATAOUT 0-3 for passing messages to the processors or switching elements to its right. The switching element also receives four input activity level signals AL0-3 from the processors or switching elements to its right, and produces an output activity level signal ALX which is passed to the processors or switching elements to its left.
Messages received at the data input ports DATAIN 0-3 are stored in four first-in first-out (FIFO) buffers 30. The outputs of the FIFO buffers are connected to the inputs of a 4 x 4 crossbar switch 31.
The outputs of the crossbar switch are connected to the data outputs DATAOUT 0-3.
The headers of the messages are stored in four separate FIFO buffers 32 while the messages are waiting to be routed through the switch.
A channel select circuit 33 selects the next message to be routed, resolving any contention between messages if more than one message is waiting to be routed. The circuit 33 passes the header of the selected message to the control logic circuit 34.
The control logic circuit 34, when it receives the header, decides which of the output ports DATAOUT 0-3 the message is to be routed to, on the basis of the signals PAD, SOF, NHD and AL0-3. The circuit produces a control signal PSEL identifying the desired output port.
The signal PSEL is fed to an arbiter circuit 35, which resolves contention between conflicting claims for the output ports by different messages, and produces the necessary control signals for the crossbar switch 31 so as to set up the desired connection.
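A behavioural sketch of this arbitration step might look as follows (the patent does not specify the arbitration policy; the fixed-priority tie-break shown here is purely an illustrative assumption):

    def arbitrate(requests):
        # requests: mapping from input port to the output port (PSEL)
        # claimed by the message waiting at that input. Each output is
        # granted to at most one input; losers wait for the next cycle.
        grants = {}
        for inp in sorted(requests):             # lowest input wins ties
            port = requests[inp]
            if port not in grants.values():
                grants[inp] = port
        return grants                            # drives the crossbar set-up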
The control logic 34 also generates the output activity level signal ALX which, in this example, is equal to the minimum of the four input activity level signals AL0-3.
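Composed across the two levels of the network, these per-element minima yield the global minimum described earlier. The backward flow might be modelled as follows (a Python sketch with illustrative names; in the real network each level-0 element receives the same four level-1 minima, so all sixteen processors see the single global value computed here):

    def propagate_min(levels, fan_in=4):
        # One backward stage: each element forwards the minimum of the
        # fan_in activity levels presented to it.
        return [min(levels[i:i + fan_in])
                for i in range(0, len(levels), fan_in)]

    activity = [12, 3, 200, 45, 7, 7, 90, 1, 33, 0, 64, 5, 18, 2, 99, 250]
    level1 = propagate_min(activity)        # one minimum per level-1 element
    global_min = propagate_min(level1)[0]   # level-0 elements all agree
    assert global_min == min(activity)      # every processor receives 0 here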
Referring now to Figure 4, this shows the control logic circuit 34 in more detail.
As mentioned above, the control logic circuit 34 receives the header of the next message to be routed, and also receives the four input activity levels AL0-3.
The PAD parameter from the header is decoded by a decoder circuit 41 to produce a signal PREF indicating which of the four output ports leads to the preferred destination. The signal PREF controls a multiplexer 42 which selects the one of the four activity levels AL0-3 corresponding to the preferred destination. The output ALP of the multiplexer thus indicates the activity level of the preferred destination.
A decoder 43 receives the parameters NHD and PAD from the header and the four activity levels AL0-3, and decodes these to produce the following signals:
MIN : the minimum activity level in the neighbourhood of the preferred destination.
PMIN : the output port which leads to the processor with this minimum activity level.
ALX : the minimum of the four levels AL0-3.
The signals MIN and ALP are applied to a difference circuit 44 to produce a signal DIF which equals the difference between the activity levels of the preferred destination and the least active processor in its neighbourhood.
The signals DIF and SOF are applied to a comparator circuit 45 which produces an output signal indicating whether DIF is less than SOF.
The signal SOF is also applied to a comparator circuit 46 which produces an output signal indicating whether SOF has its maximum possible value (255).
The outputs of the comparators 45, 46 and the signal NHD are decoded in a decoder circuit 47, to produce a control signal for a multiplexer 48. The multiplexer 48 has two inputs, which receive the signals PREF and PMIN. The multiplexer selects one of these two inputs to produce the output signal PSEL for controlling the routing of the message.
The decoder 47 controls the multiplexer 48 according to the following rules.
(1) If SOF equals 255, then the multiplexer 48 selects PREF. In other words, if SOF has its maximum value, the message is routed towards the preferred destination, irrespective of the activity levels.
(2) If NHD is such that there is only one possible output port leading to the neighbourhood of the preferred destination, the multiplexer again selects PREF, so that in this case also the message is routed towards the preferred destination, irrespective of the activity levels. For example, if the neighbourhood consists of the four processors PROC 8-11, then at level 0 in the network there is only one output port leading to this neighbourhood: namely, the port that is connected to switching element 26.
(3) If neither of these two conditions applies, then the routing of the message is governed by the output of comparator 45, i.e. by the relative values of DIF and SOF. If DIF is less than SOF, the multiplexer 48 selects PREF. If, on the other hand, DIF is greater than or equal to SOF, the multiplexer selects PMIN. Hence it can be seen that the message is routed towards the preferred destination, unless the difference between the activity levels of the preferred destination and the least active processor in its neighbourhood equals or exceeds the threshold value SOF, in which case it is routed towards that least active processor.
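The three rules might be summarised as follows (a Python sketch of the selection logic only; the signal names follow the text, while the boolean single_port_neighbourhood argument stands in for the NHD decoding of rule (2)):

    def select_psel(pref, pmin, dif, sof, single_port_neighbourhood):
        if sof == 255:                       # rule (1): purely static routing
            return pref
        if single_port_neighbourhood:        # rule (2): one port serves the
            return pref                      # whole neighbourhood anyway
        return pref if dif < sof else pmin   # rule (3): threshold comparison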
It will be appreciated that while the system described above has 16 processing units, and the switching network uses 4 x 4 switching elements arranged in 2 levels, in other embodiments of the invention these numbers can be varied.
Other modifications may also be made to the system described above without departing from the scope of the present invention. For example, the signal ALX may be formed by calculating the average of the four activity level signals ALO-3, rather than the minimum of these signals. In that case, the global activity level received by each processor will be the global average activity level, rather than the global minimum.
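In terms of the propagate_min sketch given earlier, this variant simply replaces the minimum with an average (truncating integer division being an illustrative assumption about the hardware):

    def propagate_avg(levels, fan_in=4):
        # Variant ALX: forward the average of the inputs, not the minimum.
        return [sum(levels[i:i + fan_in]) // fan_in
                for i in range(0, len(levels), fan_in)]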

Claims (8)

1. A switching network for routing messages from a plurality of input ports to a plurality of output ports, wherein each output port has an activity level associated with it, and each message has a preferred destination associated with it, identifying one of the output ports to which the message should preferably be routed, and wherein: (a) if the difference between the activity level of the preferred destination and the minimum activity level in a predetermined neighbourhood of the preferred destination is less than a threshold value, the message is routed to the preferred destination, and (b) if said difference is greater than the threshold value, the message is routed to the port with said minimum activity level in the neighbourhood of the preferred destination.
2. A network according to claim 1 wherein said threshold value for each message is determined by a parameter associated with that message.
3. A network according to claim 1 or 2 wherein said neighbourhood for each message is determined by a parameter associated with that message.
4. A network according to any preceding claim wherein said network is a multi-stage switching network, and wherein said activity levels are propagated through the network to provide activity levels associated with each stage of the network; and wherein at each stage the messages are routed in accordance with the activity levels associated with that stage.
5. A switching network substantially as hereinbefore described with reference to the accompanying drawings.
6. A data processing system comprising a plurality of processing units, and a switching network according to any preceding claim for routing packets from any of the processing units to any other of the processing units.
7. A data processing system according to claim 6 wherein said packets represent units of work to be performed by the processors, and wherein said activity levels represent the number of packets waiting to be executed in each processor.
8. A data processing system substantially as hereinbefore described with reference to the accompanying drawings.
GB9100531A 1990-04-03 1991-01-10 Switching network Expired - Fee Related GB2243052B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB909007541A GB9007541D0 (en) 1990-04-03 1990-04-03 Switching network

Publications (3)

Publication Number Publication Date
GB9100531D0 GB9100531D0 (en) 1991-02-20
GB2243052A true GB2243052A (en) 1991-10-16
GB2243052B GB2243052B (en) 1994-03-16

Family

ID=10673824

Family Applications (2)

Application Number Title Priority Date Filing Date
GB909007541A Pending GB9007541D0 (en) 1990-04-03 1990-04-03 Switching network
GB9100531A Expired - Fee Related GB2243052B (en) 1990-04-03 1991-01-10 Switching network

Family Applications Before (1)

Application Number Title Priority Date Filing Date
GB909007541A Pending GB9007541D0 (en) 1990-04-03 1990-04-03 Switching network

Country Status (1)

Country Link
GB (2) GB9007541D0 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0572721A1 (en) * 1992-06-01 1993-12-08 ALCATEL BELL Naamloze Vennootschap Switching network
EP1445703A1 (en) * 2003-02-10 2004-08-11 Nokia Corporation Content Transfer
US11075987B1 (en) * 2017-06-12 2021-07-27 Amazon Technologies, Inc. Load estimating content delivery network


Also Published As

Publication number Publication date
GB9100531D0 (en) 1991-02-20
GB2243052B (en) 1994-03-16
GB9007541D0 (en) 1990-05-30

Similar Documents

Publication Publication Date Title
CA2156654C (en) Dynamic queue length thresholds in a shared memory atm switch
KR100247022B1 (en) A single switch element of atm switching system and buffer thresholds value decision method
JP3075251B2 (en) Virtual Path Bandwidth Distribution System in Asynchronous Transfer Mode Switching Network
US8391174B2 (en) Data packet routing
US7746784B2 (en) Method and apparatus for improving traffic distribution in load-balancing networks
CA2123951C (en) Output-buffer switch for asynchronous transfer mode
AU764546C (en) Dynamic load balancer for multiple network servers
US8984526B2 (en) Dynamic processor mapping for virtual machine network traffic queues
US5721820A (en) System for adaptively routing data in switching network wherein source node generates routing message identifying one or more routes form switch selects
WO1998023127A1 (en) Scalable parallel packet router
US20030231627A1 (en) Arbitration logic for assigning input packet to available thread of a multi-threaded multi-engine network processor
KR100334871B1 (en) How to respond to overload in a distributed real-time system
US5568468A (en) Usage parameter control apparatus for performing a plurality of conformance checking operations at high speed
CA2329357A1 (en) System and method for regulating message flow in a digital data network
CA2329542A1 (en) System and method for scheduling message transmission and processing in a digital data network
GB2365665A (en) Switching arrangement for data packets
KR100258157B1 (en) Priority control method of virtual clrcuit and device thereof
US6625160B1 (en) Minimum bandwidth guarantee for cross-point buffer switch
CN111356181B (en) Traffic forwarding method, traffic forwarding device, network equipment and computer readable storage medium
GB2243052A (en) Switching network
WO1998044686A2 (en) Method and device in telecommunications system
JPH09321768A (en) Atm exchange
JPH04838A (en) Buffer control system
JPH08130560A (en) Inter-network connecting device and its method
Lin et al. On rearrangeability of multirate Clos networks

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20050110