CA2114725A1 - Message structure for scalable self-routing non-blocking message switching and routing system - Google Patents

Message structure for scalable self-routing non-blocking message switching and routing system

Info

Publication number
CA2114725A1
CA2114725A1 CA 2114725 CA2114725A CA2114725A1 CA 2114725 A1 CA2114725 A1 CA 2114725A1 CA 2114725 CA2114725 CA 2114725 CA 2114725 A CA2114725 A CA 2114725A CA 2114725 A1 CA2114725 A1 CA 2114725A1
Authority
CA
Canada
Prior art keywords
message
address
switch
network
stage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA 2114725
Other languages
French (fr)
Inventor
Aloke Guha
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Honeywell Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Publication of CA2114725A1 publication Critical patent/CA2114725A1/en
Abandoned legal-status Critical Current

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q3/00Selecting arrangements
    • H04Q3/64Distributing or queueing
    • H04Q3/68Grouping or interlacing selector groups or stages

Abstract

Described is a messaging system for use with truly non-blocking networks. It is adapted for use with single stage crossbars or multistage truly non-blocking networks of various configurations.
The invention is a self-routing design which uses both the message destination address and an associated turn signal. For returning blocked messages, a return (or source) address is sent with the message and it, too, has an associated return signal. The switches themselves in the network either turn or not depending on whether there is a match with the appropriate address and turn signal.

Description

W~ 93/03581 ~ 2 3~ 2 ~ Pcr/usg2/~52 ~SAGE STRUCTI~liOR S~ALABI~ELF-130~I~Ç~ ;
~O~-BI,OC~ IG MES~GE SVVITC~G AND R~T~M
l~is inven~on reiates to a self-routing, non-bl~vking switch sys~em for rou~ng messages ~rom any one of a number of inputs to any one of a number of outputs in an efficient and cost effective manner. ~ -:

E3ACKGROU~D OF~ INVEN~ON
As data transfer becomes more and more complex, the numbers of the messages wl-ich need to be transferred across a network from one location to another, or one area of a computing device to another, has grown remarkably. ~ ~; s AccoIdingly, any system or subcomponent thereof which allows for ease of -routing messages from any of a large number of inputs to any ~cular one of a large number of ou~puts is important. The invention described herQn allows ;;
for the routing of such messages with v~y lit~e overhead, at high speed.
The invention herein employs a retum network, is capable of using different kinds of switches in the switching network, may operate in a ` -broadcast mode, may reduce the number of switches necessary by using a Clos network or by reliance on busses and con~rollers, and uses a self-routing ;
algorithm, elil~inahng the need for preprocessing ovelhead.
2 0 ~ Sdf-rwting algorilhms use informabon contained in tho message itself to rou~e~the message at the~ switch level to the desired loc3~0n.
Prior selif-routing algori~ms such as ~e one described in the 1990 ,,,~F~PP.(Vol.~78, No. 1), entitled nF~5L~Witch ~r~kite~tures for Broad ~ervices Digital Networ!~, by Tobagi, do not indicate as~ how these algonthms can actually ope ab, and do not describe how ~eir operation can be accomplished when using a Clos-type network.
Clos networks are preferred for larger networks because they reduce the number of sw~tehes necessary to accomplish a N x N message switching system to a much smaller number than N2. See, for example, U.S. Patent -21:1 ~ i 2 ~ ~
WO 93/03581 PCr/USs2/06652 ~`~ 3 - 2 - -;

No. 4,696,000, issued to Payne, which refers to the Clos network and rea~angeable networks.
The invention described herein simplilSes all these ideas.
The switches may be a~sanged in a standard crossbar network for sma11 ;~ ~
s scale implementations, a Clos network or a variation of a Clos network using ~ ~;
busses, and the types of switches and routings, dependent on the switch t~pe ; -used, may also vary de~ending on the implementation. As described in the detailed description, a preferred implementation employs a return network for ~ ~
all blocked messages that wish to address the same output node and returns ~ ;
thenn to the source or sending node. I)ue to the nature of the routing system itself and the construction of the switching net vork, essentially no overhead or ~ -pr~processing is required to route messages through the system described from the input nodes to the output nodes.

BRIEP DESCRIPIION ~I~E DRAwINGs Fig. 1 is a two-dimensional layout of the switches in an N x N
crossbar.
Fig. 2 is a block diagram of a trap network for removing connection request confhcts.
2o ~ Fig. 3 is a layout of a second~implementa~on of an n by m crossbar.
Fig. 4 is a~ yout~of a n by m cros~ar wi~ an ~socia~d retum ;
nehvor~
Fig. 5a is a schematic diaglam of 2 x 2 crossbar switch.
Fig. -Sb is a diagram of a gate level implementation of ~e switch 2s~ described~in Fig. Sa. ~
Pig. 6a (i), (ii), and (iii) are ~ree possible modes of opelalion of the 2 x 2 broadcast switch. --Fig. 6a is a diagram of a gate-level implementation of a broadcast switch.
- - -, 2 ~ 2 ~ ` ~
} : - -.n~ a~am of ~ implementation of a crossbar switch.
Fig. 8 is a 36 x 36 Clos network diagram.
Fig. 9 is a 16 x 16 Clos network diagram.
Fig. 10 is a layout of a non-blocking nr x nr self-routing Clos ne~vork with ;
returnnetworks.
Figs~ 1 la, b and c are diagrams of the i~31~tS and ou~puts of the cross~rs usedin Fig. 10. -Fig. 12 is a schematic diagram of the bus implementation of a truly non-blocking network in accord with the invention.
Fig. 13 is a block diagrara of a finite-state machine model of the bus controller which may be used with the invention as described in Fig. 12.
Fig. 14 is a detailed schematic diagram of the databus controller blocks employed in one of the 2n-1 databus controller blocks used with the invention asdescribed with reference to Fig. 12.
Fig. 15 is a drawing of a complete 4 x 4 crossbar with associated retum net.
Pig. 16 is a bloek diagram of one message embodiment ~;, the invention having a header ~ortion H and a data portion D.
SUMMARY OF l`HE IN~TION ~;
Message Structure The invention teaches a messaging system~ for use with a switching network for switching messages from any of a multiplicity of sending node addresses to any one of 25 a plurali~ of destination node addresses. Each sending node has a renlrn path for messages~ returned to it, and each message has a message head~er which contains representadons for: the destmation address for the message, the source address for the message and a turn signal for the destination and for the returned source message. - -The source nodes have aIl incrementing processor device associated with the 30 return path for messages which i~crements one half of the destination $~C~T~ *~ g ~2. 9~ ~ "' ~' -' ' Wog3/035~l 2 ~ ~ ~ 7 2 ~ PCr/US92/06~52 ; ~
, . ,, ~
,., . , ' ~-' .;

address each ~ime a message is returned. The message can then be resent over a different route. The system w~rlcs by having the hlrn signal set in each message to be routed. The switching network is made up of a truly non-blocl~ng network of 2 X 2 switches, each switch setable into either a pass or s exchange mode, but all set in default to exchange mode or pass mode. Each ~ -switch has a way to compare the address of the switch with the destination address of the message header, and the switch node is changed on a match if the turn signal has not previously been reset. If the turn signal is set and theaddress matches, the turn signal is reset before passing it on to the switchls ~ ;
out~)ut.
In a preferred embodiment, the truly non-blocking network is a ;
crossbar and said address comparing means compares the entire des~nation address wi~ the switch address of said comparing means.
The inven~on may be employed in cases where the truly non-blocking network is a multistage network and the compaling means used in each switch in each stage compares that part of the header destination address which is associated with that stage of the;routing. In such cases, the incrementing means increments only the first part of the desdnation address and wherein at the output of each stage, the turn signal for ~e des~nation address is reset before or upon encountering its first switch in the next stage.

DE~AIIJ3D~PJ~SCRIPI~N C)F TH~ INVENTION
Several concepts are included here, which are, all centaed on ~e idea that self-routing algori~ms ma,y be employed to provide local, control of 2 x 2 ~5 ~ ~ switcheswhichcompriva~iousnetw~rks. Indoing~is,theneedfora global controller moni~g all of ~e inputs and all of ~e outputs, and their connec~ons, is not required. Where plain crossbars are used,, a total of NM of the 2 x 2 switches would be re~quired for an N input and M output network.
Because, the switches are relatively simple in functionality and layout of the WO93/03581 2~ 2~ -: P~r/usg2~o6652 , ~
design would be extremely regular, such networks can be fabricated easily in elect~onic as well as optical implementations. Using a Clos varie~r of ~uly non-bloeking switching network, even the number of switches may be substantially redueed. Even further reductions in the number of switches required is possible using the bus design. This kind of saviDgs improves the scalability of the inventive concepts described herein.
For exarnple, in a 36 x 36 network using a straight crossbar design, 1,296 2 x 2 switches would be required. Using a Clos netWork design, only 1,188 switches would be required. When moving to a network of approximately 128 x 128 design, 16,384 switches would be required, whereas ~ -with a Clos design, only 9,936 switches would be required.
An additional and important part of this set of inven~ons is the use of a return network. The return network operates to retum messages that attempt to route themselves to outputs which are already occupied. This re-routing or routing back is automatic, and takes less time than a time-out, es~ecially in larger networks, to inform the sending node that the message has not gotten through. (A timeout is a counter or tima that waits some increment longer than the maximum delay, and upon finding the time e~cpired or the counter : .:
~ fullj a new action is begun.) In povidmg this descIiption, some te~minology ought first be defined.
- ~ The~f;rstdcfini~onwouldbe"TrulyNon-Blocl~ngNetwollcs"~INBs). A ;~
truly non-bloclcing network is one in which rear~ngement of e~isting conditions is~not ~equired in order that every message get ~rough. It requires a set of feature~ lnding the follo~
- ~ (i) seffing ~e approp~iate network switches for all requested connections, -(ii) ensuring that no conflict occurs (a conflict occurs when ; -mul~ple inputs are connected to the same output), :
- .:., .:..... ... ....... ....................................................................
............. ..... ........ .. ~
' . ~.' ..... ... ....... .....................................................................
............ ..... ......... , ,':

:: , ~;
. ~ .

Wo 93/03~81 2 1 ~ ~ 7 2 ~ ~ Pcr/us92/o~652 - 6 - ~ . . .
,'-~ ~ '. .:., (iii) in the event of a conflict, only one connection may be .
satisfied while the other requests must be ke~t waiting or infonned, (iv) in case of broadcasting networks, the con~rol must also ' ;. . ~ .
.... . .....
allow a single source to be CoMeCted to multiple outputs.
The inventive conce~ts descnbed in this application show how INBs can provide for prac~cal implementation for high speed and high band width ;
applications using a distributed local ~ou~ng con~ol. ('I~ey are principally . ~:
oriented toward electrical switching but additional benefilts can.be gained using optical switching, such as no separate line being required for a return signal.)lo The cr~ssbar switch in general can be thought of as a square or rectangular hVO dimensional array connecting a set of inputs to an equal or, if .
recta~gl:lar, unequal, number of outputs. Thus, for N inputs and M ou~uts, a ~ ~
crossbar has NM crosspoints and can simultaneously pro~ide any combination ~ ;
of input and out~ut on~t~one connections in a non-blocking manner. In practice, other methods besides using an array of switches, where each switch ~epresents a crosspoint, have been used. These include for example, in electronics, bus arbitrated architecture used to simplify the quadratic complexity of switches and control and, in optics, outerproduct matnx multiplying architectures have been used. The prima~y approach here is to use 20 ~ - switch level design for sel~-routing. ~
.
A INB that ~in:s less switches ~an are r~q~ Ln a crossbar was first proposed by Clos in Study of Non-Blo~ldn~_$witç~ etw~rks, Bell ;ri System Technical Jou~nal, 32, pp. ;406 424,-1953. A general Clos network ~ ~ ~ (CN) has an odd numberj say, 2k+1 (where k=1,2,....), of stages and is built 25 ~ in~a modular design employing a multiplici~r of crossb~ which are . .
substan~ally smaller than the total N x N siæ desired. Thus, the number of switches which would be reql~ired for a Clos network would be on ~e order of N x ~/N, when a comparably sized crossbar would be N x N. (See above .~
paragraph.) Also, networks of more than three stages can be built from :.

WO 93/03581 2 1 i 4 ~ 2 ~ PCr/U~s2/0~652 - 7 ~

three-stage networks by successively replacing the center-stage by ano~her - three stage CN.
, ., ~ . .
It is htown that in a N ~ N Clos network, if the input stage is built of r crossbars of dimension n x m where rn=N, the second stage is made up of m, ; ~ ';
r x r crossbars, and the third stage is made up of r, m x n crossbars. The size of a crossbar can be designated with the sequence (n,m,r). The Clos network wiU be truly non-blocking if m 2 2n-1. The CN is not truly non-blocking but may be rearrangeably non-blocldng if m < 2n-l.
It is not particularly desirable to have a rearrangeably non-blocldng ne~vork in that messages may be required to be interrupted to rearrange tbe network when such rea~angement becomes necessary. For applications, therefore, re~quiring long duration or large messages, interruptions are .
extremely undesirable.
. ~
Refer first to Fig. 1 in which a simple crossbar network (10) is shown.
It has inputs 1 through n and outputs l through m as shown, and each one of the 2 x 2 switches may be labeled as points in a matrix or, 1, l to n,m. The switches may be labeled with their column address only since this is the one which will be compared with the m~sage header as will be explained later. ; ;
In Fig 8, a Clos network is drawn ~or a 36 x 36 ar~y of inputs to : ., ... :
~ outputs.- The netwo~ is com~ised of 6, 6 x 11 crossbars in column l, plus 11, 6 x 6 crossbars in column 2, plus another 6, I} ~ 6 crossbars in~columll 3.
.
Ihus, an input on line 81 may be directed at switch or cros~point 82 to ~ ~ -- t ~ c onnect to the f;rst top line 83 of crossbar 84 in colwTm 2, row 1, which may ~ -e di~d by switch 85 to output 86 of the same crossbar, conne~lg it to input 87 of crossbar 88.~ At ~is poiné it may only be routed to one of ~e six outputs of crossbar 88 by one of ~e six switches in that column of crossbar . ~ -.,:
88. It will occur to the rPader at this point that three crosspoints are involYed ~ -in transferring the message from input 81 to one of the outputs in crossbar 88.
. .. ~. ~
.~ :..: : ~.
....

... ~, . :

~3 -2 ~ A ry c~ _, W093/03581 l y ( ~G~ PCI~/US92/06652 ~3 - ~:

The Clos network, it will also be noted, is generally constructed having ~ ~ ~
square IOOt of N rows (plus m rows of center cr~ssbars, not labeled) by ~ree - - ~ ::
columns. The network may be expanded and the number of switches ~urther reduced by s~litting the central portion into three in accordance with the teachings of Clos. ::
Refer now to Fig. 2 in which a trap network scheme is used to remove :
connection request conflicts before they happen, according to the design 20.
This design is avoided by the invention. In Fig. 2, inputs l-n are provided to a parallel SO~Dg netwo~k 21 and provided, sorted, to a comparator stage 22.
The comparator stage 22 provides an indication of a conflict to the duplicate ::
rout~r stage 23 along with each message being passed. The duplicate router ~ :
stage 23 retuMs messages, along lines 24 or 25, to the inputs prior to the sorting network a, to be handled in accordance with whatever scheme is desired. For instance, either ~e message header itself m~y go back to the . input node where the masage header and the message may be directed to a .
. buffer, or any number of other tbings may be desired for messages which may .
.~ require resending at some other time. The crossbar 27 then allows its switches i ~:
.:. .: .
. . : to be set in accordance with ~e instructions carried in the header which . : : .
~. . indicate tho output address for each message.routed through it. Using ~e 20 ~ achings descri~t herein,-such an~implementation, including units 21, 22,:23 ,and oulputs 24 and 25, is :not ~equired. ~ - . . . - : ~ .
n~; Switches~
.. , , ~ ::
. ~ g ~ ; Befole m~ving on to a descnption of the algont~m for self-routing, and : -because it would create some compUcations to desc~ibe them later, a ~ desaiption of the switches which may be u~ with ~is device, or at least their functionality, is now described. -.
While this invention may be well suited to light switehes, only ~e : electronic versions are shown, however, it should be clear to one of ordinary skill in the art that light switches will provide enhanced flexibility allowing for WO 93/0358~ 2 Pcr/usg2/o66s2 ~ ~
9 ~

etum signals over any path once established. Higher speed communications, higher bandwidth and so forth become available when using light switching devices. Also, by not requiring messages to get a return routing, the ~eturn network described herein may even be avoided when using light switches. ~ ;
The basic 2 x 2 switch can be seen in Fig. Sa, 50, as having two ;
inputs, IN 1 and IN 2, and nvo outputs, OUT 1 and OUT 2. It may be ~ ; ~
generally set in either a pass or exchange mode. Under the pass mode, lines ; ;
51 and 52 will be open and lines 53 and 54 will be closed. In the exchange mode, the reverse is true. - `
In Pig. 5b, the switch 50 is shown in logic diagram form. The "change" in input 55 will control whether the switch is in the pass or exchange -mode. Thus, when the input to 55 is zero, OUI 1 is open to I 1 and closed to I 2, and OUT 2 is open to I 2 and closed to I l. When 55 is one, the reverse is true. OUT 1 is open to IN 2 and OUT 2 is open to IN l.
:: :. .,. ,~.
Refernng now to Fig. 6a, three modes of operation of a broadcast ~ ~
switch logical connection are shown, i, ii, and iii. In mode i, one input may ;
be connected to both outputs. In mode ii, the inputs are connected to their ". ~ .. : ,:
direct outputs. In mode iii, the inputs are conneeted to their crossed ou~uts.
~ ~ ~ InFig 6b,agate-levelimpbmentationofsuchasvAtch~isshown.
20~ Thus,~where input 66 is ze~ and input 65 is hi (or "onen), ~e output of a' will be b and not a, and ~e output of b' will be a and not b, ~thus, a sv/ih h ,' '," .'i ~ndition. If input ~ is zero, or low, ~d output 65 is ~so low, ~e~ ~ ~ - -- a broadcast pass condition in which both outputs will be b. ~`
likewise, if the input 66 is hi and the output 65 is low, the input b will be passed on a', and ~e input b will also be passed on the output b'. In ~e case where both 66 and 65 are hi, ~e output of a' will be b, and the ou~put of ~ -b' will be b, another broadcast condition.
The switch may be redesigned to eliminate broadcast conditions if ;;
desir~d. As is also true with the gate-level implementadon of ~e switches ~ I

Wo 93/03581 2 1 1 ~ 7 2 ~ PCI /US92/06652 ~ ~

- 10- '' ~ ' ' described previously, this switch may be designed differently as is well hlown ; ~
to those of ordinary sldll in the art. ~ -In setting the switches, or not setting tbem, or setting them in~ a broadcast state, ~is invention requires ~eference to the intended address of themessage. Thus, it requires a device similar to that described with reference to Fig. 7, 70, which contains the 2 x 2 switch, sw. Input data containing the address is received by the crossbar switch device 70 across datalines 71 and 72. Depending on the overall implementation, either a serial or parallel input may be desired. Either way, a delay buffer 73a and 73b is provided before the data reaches the switch sw. The address label part of the message is read into the address comparator 74 at line 75. The signal output by comparator 74 on line 75 is positive if the switch address matches the appropriate part of the ~-des~lnabon address. Such a positive signal on line 75 latches And-ga~e 78. In ` -~
this way, if the And 78 is not disabled by line 76 or reset by a signal from line 77, the control output will provide input C to switch sw. The C signal also changes the output on 77a from Hturn" to l'n~turn" by means of latched And 78. Thus, all succeeding switches would receive the changed turn signal, allowing the message to ~ontinue down the column. Different schema for this switch which perform similar functions may be developed easily within the ~ ~ ~
2 0 ~ sldll of ~ose of ordina y ~s~ll in the art. This particular design is pr~ferred - -for tho networks and algo~ms descnbed, but variations will be apparent to - ~ t hose of ordinary sl~ll in the art. For example, line 76 could be ~emoved without any loss of function described herein. There may be parallel input hs if desu~d. ~ May o~er vaIiations may be constructed wi~out 25 ~ ~ircumven~ng this invention as claimed.
e Basic Sel~outin~ Ai~orithm Please refer to Fig. 1. The switches should all have a default sefflng, either pass or exchange, and all should be set in that mode when beginning to make any~connections. For the preferr~d embodiment, the default se~ng is Wo 93/03581 2 i 3L 4 rl 2 S PC~/US92/06652 exchange. ~If desired, the default settings and switches could all be reversed and appropnate adjustments made to accommodate the reversed order. Such adjustments would be principally to reverse the input lines to each switch.
The switch itself may requir~ redesign.) ;
When a message amves at any switch, the header having a destination address will be routed along the row corresponding to ~e source from which it is sent. Thus, if it enters from input port or node 1, it will be routed along switches l,l; 1,2; 1,3; ... 1,m. The mes~sage will be accompanied with a s~parate, active signal which will be denoted for the purposes of this explanation "turn". The label on the switch will be compared with the destination address in the header of the message at each switch at which the message arrives. If they are the same and the turn signal is high, the switch ~ - : . ....
will be set into, for example, the pass mode, and the turn signal will be reset by the switch (see Fig. 7). If a des6nation label or indicator does not match ;
the address of the switch, or if the turn signal is low, ~e switch will remain in ~ -the default mode (in the preferred embodiment, the case exchange mode). ; ~;
. . .
Thus, if a message is to be t~ansferred from node 1 to output node 3, at switch ~
1,1, it will be told to exchange to switch 1,2 across line a'. The turn signal ~;
~ will not have been reset since it is to continue to ~avel across this row. No ;
2 0 ~ match has been made so the switch will remain in an exchange condition.
On arnving at switch 1,2, the message header will again be compar~
; with the address of the switch. On again finding no match, and the turn signalt ~ again not rese~t, the message again will be exchanged thr~ugh to switeh l,3 0n ~ ~ -line a'2. When the message rteaches~swiitch 1,3, the turn signal wiU be reset.
: ~ : : . .: .
2 5 ~ When the control signal is provlded to switch sw (see Fig. 7) because the . . - . . .
add~ess matches~ ~e switch position wal be changed to pass, and the message he~der and ensuing message will be tlansferred from input line a'2 to ou~ut .: ~
line b'3 of switch 1,3. The same control signal C, will reset the tum signal to ~ ~
. .:
itsopposite.
~. ~....

WO ~73/03581 2 ~ 7 2 ~ P~/US92/06~52 ,~ .

There are several ways to reset turn signals when the message is complete, such as reset~ng the entire network together at contro31ed intervals determined based on predetermined message sizing. However, the inven~ve concepts herein will work with numerous global or pass message reset s tec}miques so these are not explained fur~er here.
Refer now to Fig. 3 in which a network 30 is shown ha~ing nearly the identic~ layout as the network of Fig. 1. The difference here is that each one of the input nodes broadcasts its mcssage across all OI the columns of switches so that each one may compare its address at the same time, pr~viding for much faster connections. These broadcast lines are indicated as lines 31-34. : :
Bç~rn Networks ! , The }~ey conce~t for elimina~dng the need for a conflict ~esolu~on devi~ is the provision of the return network for each input node or line. A
simplified implementation is described with reference to Fig.~ 4 in which a ............................................................................................ .......... ... ... ... ~ .
crossbar network 40 and an associated return net 41 are shown. For each node ;
42 ox input line 1, an output as well as the input line is provided. A
modification is also required in the header: both the destination address and the source address are required in order to route back the message which has found a conflict. Each should also have its own turn signaL The first ~art of 2 o ~ t he header should contain the destination address in ~e prefe~Kd embodiment.
, Thus, ~or~a message from node 1, it is ` 1' of 1 to m outputs. If ~1 goes well, . . .
the header routes its~lf to the pro~er ou~t as descnbed proviously.
Howeve~, if the des~nation output L~ine is already busy, ~e header will be diverted lo ~e return networlc 41. Consider, for exampb, a case where input n ode 42~using input line l, and a second input node using input line 2, have connection requests for output node 3 (and thus send messages with headers to node 3). The header from the second node, for exalrlple, reaches the switch 2,3 before the header from node 42, if the two headers are clocked out ~ ~ ;
together. Thus, switch 2,3 will be set (in the prefe~ed embodiment) into the .

WO 93/03581 21~ ~7 2 5 - PClr/US92/06652 pass mode by the header from the second node and, therefore, the header from node 42 (input line 1) on arnving at node 2,3 will be routed ~rough switch 2,n into the rebum net 41.
The switches in ~e return net will look only at the source address part of the header and its associated turn signal. Since the source has sent a message request, it should not be busy when it receives the ~uest back"n ;
this case across line 44. To get there it will have come through switch s from switch t. The routing through the switches in the return net may work exactly the same way as in the forward net, with a separate turn signal for the sending address, which is only reset when the switch address matches the origin address, and it then stays in reset condition through the switches in that (diagonal) return switch line until the header with its mess3ge (if any) is ~ --ret lrned across one of the long horizontal lines in the crossbar (see Fig. 15). ; - -an cases where it is being routed back to a previous stage crossbar, ~he tum -signal must change again before it is processed in the lower stage.) . ~ - .
See Fig. lS (or Fig. 4) for a complete 4 x 4 array of switches 150 and -~ :, i an associated return network. If Node 1 sends a message Ml at the same time ;
- Node 2 sends a message M2 the routing tilrough both the crossbars and the return net 151 can be t~ on this diagram as labelled by following the M1 20 ~ ~ and M2 labels. ~ (Note that to the right of ~dotted line 153, the s~hes may be shorted pc =lly if a different tum signal scheme is used than that ~ --~
described in the plefn d embodiment.) (You can also diminat~ the -connection ~rom the topmost row switch in ~e crossbar to the return net and -....;
~ - ~e lower right corner switch in the retum net because no output from row 1 ~ - will ever need to be returned unless the crossbar receives messages back from - ~ -a later stage cr~ssbar by bus.) e ~los Implementation This self-rou~ng algorithm may be applied to any N x N Clos network ' . !
with r n x m crossbars in the first and third stages, and where rn=N, m rXr Wo 93/03581 2 .~ ~ ~ 7 2 ~ PCI/US92/06652 - 14- `

clossbars in the second stage. Also, because we are interested in INBs, m~2n~
Previously, solving a self-routing erossbar network problem in a distributed mannery permutation of ~e rn requests such that the requests (to route a message) routed to add~esses in different crossbars in the third stage are routed from different crossbars in the input stages. See, for e~ample, Lin et al, T-y~pln~Q~sional~tical Clos Interconnection Networlc and Its Uses, Applied Optics, Volume 27, No. 9, May 1, 1988, pp. 1734-1741. This solution, however, requires permutating or rearranging all nr requests -simultaneously and cannot be implemented in the self-routing distributed mamner used by this invention.
The self-routing approach described previously works fine for crossbars but not ~or Clos networks. The important inventive aspect for Clos networks for achievmg non-blocking self-routing is in selecting the crossbars to be used in rwting the requests through the second stage of the network. Therefore, when conflicts are detected in the first n shared crossbars of the second stage crossbars, these messages are re-routed using the remaining n-l crossbars in thesecond stage.
"Gr~edy" Sç~f-~RQ~c~ orithm For Multistage or (~lQs~yQLb , 2 0 ~ i ~ ; Refer now to Fig. 9 which illustrates a crossbar network similar to ~at , u~bd in Fig. 8 but sr~ll~. Each input line Cl has an address, in ~is network being a 16 X 16 Clos network there addresses being (i~or b~ lO, l - ;
through 16 but as shown) for base 2, 0000 through 1111. ~he des~nation sddrsses~in column C3 have the same designation and C3 is~;likowise 2 s ~ ' comprised of the same number of crossbar switches. C2 has a greater number of smaller crossbar switches in accordance with the requirements of a Clos :
network. ('I~e crossbars in columns Cl and C3 are of size n by 2n-1). For .~ -the pulposes of the preferred embodiment if the number of crossbar switches :

WO93/03581 2~72~ Pcr/lJsg2/o66s2 in Cl and C3 is not a poweF of 2 (thus r is not a power of 2) the number should be increased until a power of 2 is achieved.
Thus it can be seen that, for the prefe~ed embodiment, with four digits the number of bits and the label for each address is l log2r ~
s In the preferred embodiment again, no switches are considered latched thus they are all in the pass position.
When any message ar~ives at any switch, the header conta~ning the : :
.:: . . .: .
destination address is routed following the self-routing algorithm described above for the first column. The destination address will be made up of two parts: the output crossbar address of 1log2rl bits and a local destination : ::
add~ess I log2n I bits witllin the individual crossbar. This matching is . ~ ::
perfonned only on the first llog2rl bits. ::-~
If ~e message~header finds ~at ~e desired address (Ilog2r1, say 00 ) ;: :
output is busy in ~e first of the intermedia~y switches in column C2, it will be - -~: routed to the next intermedia~y switch and so on until a pa* to ~e crossbar :;. .. : .: . ;:
: ": . :
switch in column C3 desired is achieved. Thus the algorithm is termed greedy .`:;. :. ~. . .
. . ~ .. .: in that any unused column may be ùsed for routing. In the i11ustrated example of ~ig. 9 the address 00~ was f~ee and a connection made on 91 to the second . . ~
.. ~ stage crossbar is C2. Tlie oth message that wants ~is crossbar is coming ~in .; .
20 ~ on 92. If ~hey are going to differa~t final crossbars as shown, ~oss lines 93 :. . .
2 ~ . and~94 :there is~no~plob!em,~ othelwise the second message through will be ,.blocked and have to be rerouted. This-is why~the greedy algon~m was -:

The;key to resl)lvin~ ccnflicts in~ each stago is the use of the crossbar 2s~ the ~ network for~conflict det~ction and resolutiQn. Stage 1~ and 3 ...
conflicting requests, due to true input ~uest conflicts (whe~e two mput ; `
requests specify the same output address) are retumed to the source node. -. .
Howev, conflicts in the second stage or second column that arise due to ; .
sha~ing of the crossbars in the second stage are resolved by detecting conflicts :

7 ~ ~
Wo 93/03581 PCr/USs2/06652 '; -in those labelled crossbars and then, on receiving the return header, scanning available oonnec~Qns in the ~e~uning second column c~ossbars. In the worst case then, a ~equest may scan through n outputs in its stage one or Cl crossbar before its request is met by the (2n-l)th output.
Another way to say this is there are two fonns of 10 conflict that can occur, the first occurs in the first or third stages due to direct input conflicts.
This can be wher0 two or more inputs request the sarne output address in the third stage or subaddresses in the first stage. These are handled by the conflict resolution as described in the case of the single crossbar. The second kind of conflict arises due to the shanng of crossbars in the middle stages and requiresscarming the last n-l free crossbars to resolve the conflicts.
Because of these multiple stage conflicts, crossbars 20 in the second or middle and ~ird stages must provide return paths so that when the header part of the message is sent, ~e response ean be sent back to the source to contirm that a rou~ng path is available. Thus, the multi-stage network acts more like a router than a pure interconnection network.
Please refer to Fig. lO which describes such a system, including the return nets and ex~a return paths.
~ Again, if light switches are used, return paths may be obviated because 2 0 ~ ~ the light can travd back over the same open switch ~at it travded to get to the ~ -des~on node. It ~hould also be noted here that electronic switches with ; return pa~s ~pened at ~e same ~me and location as sending paths would also obviate the need for a return net or separate return path system.
- ~ . Crossbars~ la- to ra es~blish the first stage, lb to 2n-lb the second and .
lc to rc the third. Eac}i has a return net lO2, similar to ~e one desclibed in detail with Ieference Fig. 15.~ E~or every connection between crossbars like ;
line 103 there is a return line like 103r. Please refer briefly back ~o Fig. 15 and line 103r thereon.
.,, . . ,~ .

: .

WO93/03581 21~72~ PCr/US92/06652 - 17~

For ease of explanation, assume the network is p~ulated with 4 x 4 crossb~s in the first stage li~e 150 of Fig. 15. Output 1 goeis to the first ~ ;
crossbar in the second stage, output 2 to the second, etsi. The line returning f~om the first crossbar of the second stage would be line 103r, ~rom the second ~-S crossbar of the second stage would be line 104r, from the third crossbar of the ~ ~;
second stage would be retu~n line 105r, and from the four~ crossbar of the second stage, the return line would be 106r.
Reviewing the two message state of crossbar 150 descr~bed earlier, two ` `~
messages, Ml and M2, have tried to reach output 3, only message M2 made ` `
it. Ml was returned. Let us assume M2 did not make it through the third i i ~
-.
crossbar of the second stage. It would then be reh~rned on line 105r. Switch 3 in the first ~w is still held in the pass mode and the network message is routed ~rough across to the return net. The return net routes tho message bac~: to node 2. (All switches, once changed from the default state, should be held in that state for the maximum dela~r required for the message to return if - ;;
blocked at the last stage.) M2 then would be routed through the second row, 4th column switch, into tho return net which would forward the header back to n ode 2, using the return address. As in the explanation which follows, des~ ion address can then be incremental; and the 1~eader resent.
.
20 S ~ Figs. lla, b, and c detail the number~of lines used in each of the T he ranaimng detail is how to scan to the next or subsequent second - ~ stage crossbar when the original choice is blocked. When the message is -returned~to its originating node, the header's des~nation address is simply 25 ~; - incrementcd and the header reset to try ~e ~next second stage switch. Ihe .-..: ~.
incrementing is only done to the first part of the destina~don address, however.So, for example, (with r~ference to Fig. 9 assurning it has return nets), if output node at receiving address ~00 wants to send a message to a receiving ~ -- ~ ` -node at 0000, the incrementing can be done to the first two digits of the 0000 Wo 93/03581 2 ~ ~ 4 7 2 ~ PCI /US92/06652 address, making the next attempt through line 01-of crossbar 9S. I~is would send the header to the isecond crossbar in C2 (not shown) whose switches would look at the second part of the destination address9 ~us routing the message ~o the co~Tect stage 3 or C3 crossbar, in this case crossbar 96 If a third crossbar in stage 3 is tried, because the second part of the destination address remains unchanged, the message will still be routed to crsssbar 96 if the output address is 0000 Use of Busises Instead of Clos networks having an intermediary stage or set of stages of crossbars, they can use busses for routing messages This embodiment ~;
further reduces the number of switches but adds complexity in that it requires bus con~ollers Refer to Fig. 12 in which a network 120 is shown having again 1 -, . . .
through r crossbars with return nets for both the firs~ and third stages. Again, ;
as in Fig. 10, the return lines are shown attached near the sending line even though they are located as described with reference to Fig. lS on the side ~ ~ -opposite the output. The difference in structure between this Fig. and Fig. 10 .
is that there are 2n-I controllers 1211 - 1212n l. Each controller itself has a . : ~ set of r data bus con~s l211d -~ 12~1rd. -z o ~ T he algori~thm, howev, works about the samo way on this structure as it~does on the~kind illusW~1 in Fig. lO. l~us, a signal or _e sent ~m - ;
node 1 of crossbar 123, passes on output line 1 to Data and Bus Con~roUer 1 211d. I f ~e message is supposed to go to crossbar 124, for one of its 1 to 2n-l outputs it will try line l of its set of busses. If ~ere ~s no "busy- signal 25~ set by some~v outpui on Li~e-a of ~bus l, the message will pass onto the bus line b on output w. These v and w outputs are bi~irectional (or may be pai~d inputs and outputs, thus 2 v's and 2 w's if desired~
If this bus line is busy, the message will be returned to node 1, incremented and resent on line 2 (not shown) to the next controller. ~e next ::

WO93/03581 2~ 2~ PCr/USg2/06652 .
- 19- ::.';' ':,~"

controller 1212 (not shown but next in series) would try line 1 of its first ~ ~
Data/Bus Controller, which also connects to crossbar 124. This would go on ~ ; -until line 2n-1 of crossbar 1231 is tried, and a link attempted through Data andBus Controller 1212n ld is tried and its line 1. If still unsucceissful the next ;
increment can reset the header to try line 1 and start the cycle over.
Enhancements to this concept, such as sending a signal backward beyond the input node to cancel a message or resend later, should be apparent to one of ordinary skill in the art, without going outside the scope of this invention.
(Note also that there are single input/output lines between busses and crossbar 124r and paired input and output lines between busses and crossbar 124, in the drawing. This is done to show that it could be either way although it would be preferred if all the crossbars in a stage are connected to busses in the same manner).
The data/bus controller is next described ~or completeness. Refer first ~ ;
to Fig. 14 in which a bus controller 140 is shown. This is essentially an exploded view of the box labeled 1211 iD Fig. 12. However, the diagram also illustrates a variation in embodiments in which a demultiplexer is used rather ~ -,.:: .: -: . .
than an input and output line from and to each one of the first stage crossbars -and their return networks. The input data coming in on line 141 is 2 0 ~ demultiplexed by loolcing at the destination address, first part, and the Iequest ~ -issent~ooneo~thebuscontro!units,saybuscontroli. Buscontrol1~en needs to seize databus 1 in order to send the message as datz ~oss ~e data I ine to da~bus 1. To do so it first checks the st~tus of bus status 1 to determine if databus 1 is busy. This is done on line status 1~ If it is not busy, i t places a high signal or other appropriate signal on bus 1 status line across~e set/reset 1 line of bus control 1. If the status is busy, bus control 1 will rerout the request out the response line which is wire-ored to line 142. A turn signal will be tagged onto this retumed message so that it may find its way ~- ;; ~ `
back ~rough the return net. As was described in the earlier implementation, WO g3/03581 2 i ~ ~ 7 2 ~ PCr/USg2/06652 .
- 2~

sueh a turn signal for return coming from a intermedia~y level or seeond stage orossbar was not required bec~use each line went to ~e a~propriate address in the return net. ;
FSM of the Bu~ ContrQI
1. Inputs: Parallel Input Data/~Ieader bu~, the Bus Status (0/1), and the Next Stage Response, a single packet of data equal to the bus wid~
usually a header bit, and the source and destination ad~,dresses sent back from the next stage. The input Data/Header always contains 2 res~rved bits for routing.
2. Outputs: Data Bus, parallel output header/d,ata to the next stage, SetJReset control for the bus status line, Response Back lines that ~ --~ .
eontains a bit (available or note ~vailable) ~ source/destination information which may also be obtained fonn the Next_Stage Response.
Note the bus lines are tri-state (low, high, high impedance) to : :
.
distinguish between~ no signal and ~valued signal.
3 ~iti~
I~ Input Data/Header bit = 00 /*leading bit denotes header or message : packet*/ ~::
AND Bus_Status = l /~Bus occupied or busy*/
........
20 ~ Response_Back = O!Input Data/Header;
/*send~a:rlot ~vailable signal in response~
ELSE
s . ~ Response Back = l!InputDatalHeader; .i~
i``
/*send available signal in response*/
25:: ~ : Bus Status = 1 ~*set bus status busy reserve the bus- . . .
lines*/
ELSE IF Input DatafHeader bit = 10 /*message packet r~ceived :: . .~ - . .~
after the source received an available signal*/
...~. ,, ~
Data Bus = Input DatafHeader /*output WO 93/03581 2 ~ ~ ~ 7 2 3 Pcr/us92/o66s2 - 21 - ~ ~-... '.,,~
bus = input bus - transmit data*/
IF Input Data/Header bit--11 /*s~ecial code for end of message*/ - -T~IEN Bus_Status = O /~free the bus*/
Data Bus = XXX /*set bus back to high impedance state~
We note that this implementation in rou~ng ~rough ~e second stage assumes that the source node operates in two modes: first i~ sends out the re~uesting header pa~ket. If the response received from the rou~ers is negative, it retransmits the headers until it obtains a connection. Second, if apositive response is received, the source node transmits its message in packets until it frees all the latches switches in the crossbars or the busses it had used.
Rou~ing Delays We ~,vill only consider the average delays in setting up the routing path in the bus-based self-routing Clos net~,vork. The average ddays in *e first and thlrd ~stage as before is (n+ l) for the Xl implementation and n/2 for the X2 ;~ -implementation. The delay in the seoond stage is not as simple. Consider the delay in determining a conflict using the Data/Bus Controller architecture. ;
From F'igure 14, it will be clear that deterrnining the availability of a ~ ~
." ~. . ~
connecdon is tA (access time) = ~og2r+k) where log2r is delay in the ~ ~ ;
demultipla~er while k is a constant due ~e latching delay in the FSM of the 2 0 ~ bus control and response logic. This delay tA is r~ at most n ~dmes if in the wor~st call a11 ~n crossbars in the soc~nd s~age are scanned. Thus, ~e worst case delay in seffing up the routing path in this implementation is 0(2n+nlog2r). Once the routing path is set up, the switching delay is smaller since DO conflicts have to be further resolved. Since ~e optimal value of ~-25~ r--n~N, the worst case delay in sefflng up the routing path is ~ -O(~N(2+10g2~1N)) (only 0(~/N) for normal routing delay). We believe an optical implementation may have lower delays than the ele tronic one when ` - ;
implementing the switch version. ~ ~ ~
:

: . ~.. ..

Wo 93/03581 2 ~ 2 ~ PCr/US92/066~2 ~' '?

Broadcasting in CN's follows very simply from the above self-routing -scheme. To implement broadcasting CN's, broadcas~ng switches discus~d ;
above must be used ~or the individual c~ossbars in the CN. The only difference in ~e broadcasting is the limit on the maximum number adidresses to which a message can be broadcast. If the original Clos design is used, ~e ~ -limit is (n-l). This limit is imposed by the maximum number of crossbars used in the second stage. ;;

. ~

"' ~''',,,'."
',,.' '," ' '~

.. ~ ,-: '' ., . . ' ' '; . ' ':
~ : ~ . -`'''"' ''''''',`'' .: ~' ~, ' - ' .
~: : ' : :

Claims (7)

What is claimed is:
1. A messaging system for use with a switching network for switching messages from any of a multiplicity of sending node addresses to any one of a plurality of destination node addresses and wherein each sending node has a return path for messages returned to it, wherein substantially each message has a message header, said header containing representations for; the destination address for the message, the source address for the message and a turn signal.
2. A messaging system employing message headers as set forth in claim 1 for use in a multistage network, and further comprising incrementing means associated with the return path for messages wherein said incrementing means increments one half of the destination address each time a message is returned.
3. A messaging system as set forth in claim 2 wherein said sending nodes resend messages, the header destination addresses of which have been returned and incremented.
4. A messaging system as set forth in claim 1 wherein in each message to be routed, the turn signal is set and said switching network is made up of a truly non-blocking network of 2 X 2 switches, each said switch setable into either a pass or exchange mode, but all set in default to exchange mode and each switch having address comparing means for comparing the address of the switch with the destination address of the message header, said address comparing means so arranged and disposed to change the mode of the switch from exchange to pass mode on the condition that the turn signal has not yet been reset, and on the resetting of the switch, also resetting the turn signal.
5. A messaging system as set forth in claim 4 wherein the truly non-blocking network is a crossbar and said address comparing means compares the entire destination address with the switch address of said comparing means.
6. A messaging system as set forth in claim 4 wherein the truly non-blocking network is a multistage network and the comparing means for each stage of the multistage network compare that part of the header destination address which is associated with that stage of the routing, and wherein the incrementing means increments only the first part of the destination address and wherein at the output of each stage, the turn signal forsaid destination address is reset before or upon encountering its first switch in the next stage.
7. A messaging system as set forth in claim 1 wherein the message header further contains a second turn signal for use in returning the message header toits source address.
CA 2114725 1991-08-05 1992-08-05 Message structure for scalable self-routing non-blocking message switching and routing system Abandoned CA2114725A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US74026091A 1991-08-05 1991-08-05
US07/740,260 1991-08-05

Publications (1)

Publication Number Publication Date
CA2114725A1 true CA2114725A1 (en) 1993-02-18

Family

ID=24975729

Family Applications (1)

Application Number Title Priority Date Filing Date
CA 2114725 Abandoned CA2114725A1 (en) 1991-08-05 1992-08-05 Message structure for scalable self-routing non-blocking message switching and routing system

Country Status (4)

Country Link
EP (1) EP0600997A1 (en)
JP (1) JPH06509918A (en)
CA (1) CA2114725A1 (en)
WO (1) WO1993003581A1 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IT1156369B (en) * 1982-06-22 1987-02-04 Cselt Centro Studi Lab Telecom MODULAR AND SELF-INSTRADING PCM SWITCHING NETWORK WITH ROAD CONTROL ON VIRTUAL CHANNEL FOR SELF-SWITCHING TELEPHONE WITH DISTRIBUTED CONTROL
JPS61214694A (en) * 1985-03-18 1986-09-24 インタ−ナショナル ビジネス マシ−ンズ コ−ポレ−ション Switching unit for data transmission
US4696000A (en) * 1985-12-12 1987-09-22 American Telephone And Telegraph Company, At&T Bell Laboratories Nonblocking self-routing packet and circuit switching network
US4984237A (en) * 1989-06-29 1991-01-08 International Business Machines Corporation Multistage network with distributed pipelined control

Also Published As

Publication number Publication date
JPH06509918A (en) 1994-11-02
EP0600997A1 (en) 1994-06-15
WO1993003581A1 (en) 1993-02-18

Similar Documents

Publication Publication Date Title
US4984237A (en) Multistage network with distributed pipelined control
US5319639A (en) Crossbar with return net for scalable self-routing non-blocking message switching and routing system
US5654695A (en) Multi-function network
US5801641A (en) Controller for a non-blocking broadcast network
EP0425807B1 (en) One-sided crosspoint switch with distributed control
US8170040B2 (en) Fully connected generalized butterfly fat tree networks
JP2523207B2 (en) Multi-stage network control method
JPS6360579B2 (en)
JPH0151108B2 (en)
JPH0720102B2 (en) Collision crossbar switch and its operating method
US6992984B1 (en) Credit-based adaptive flow control for multi-stage multi-dimensional switching architecture
US4276611A (en) Device for the control of data flows
US4456987A (en) Digital switching network
EP0363822B1 (en) Broadcast network
US3582560A (en) Multistage telephone switching system for different priority users
US7468974B1 (en) Efficient strictly non-blocking multicast switch architecture for time division multiplexed traffic
US3916124A (en) Nodal switching network arrangement and control
CA2114725A1 (en) Message structure for scalable self-routing non-blocking message switching and routing system
CA1045231A (en) Network routing and control arrangement
KR0170493B1 (en) Non-blocking fault tolerance gamma network for multi-processor system
EP0598030A1 (en) Scalable self-routing nonblocking message switching and routing network
WO1993006676A1 (en) Nonblocking point-to-point fast packet/circuit switching networks
Lin et al. Repackable multi-Log2N networks
EP0505782A2 (en) Multi-function network
KR930011986B1 (en) Fan switch of exchange for isdn and switching method

Legal Events

Date Code Title Description
FZDE Dead