GB2268859A

GB2268859A - Communication between processing elements in a network system

Info

Publication number: GB2268859A
Application number: GB9215104A
Authority: GB
Inventors: William Alden Crossland; Peter John Ayliffe
Original assignee: Northern Telecom Ltd
Current assignee: Nortel Networks Ltd
Priority date: 1992-07-16
Filing date: 1992-07-16
Publication date: 1994-01-19
Anticipated expiration: 2012-07-16
Also published as: GB2268859B; GB9215104D0

Abstract

In a network, e.g. a distributed computer system, communication between pairs of elements, e.g. processing elements, is effected via an optical crossbar or matrix switch. Each processing element has means for storing in a queued sequential order the addresses of those system elements wishing to transmit to that element. The address at the head of the queue is broadcast to indicate that the system element is ready to receive a transmission from the requesting element. The technique reduces the system latency (average delay) and prevents queue jumping. The optical switch has a rectangular array of liquid crystal spatial light modulator elements. The means for storing queued addresses is a first-in-first-out (FIFO) store.

Description

NETWORK SYSTEM

This invention relates to a network system, e.g. for interconnecting a number of terminals or system elements in a distributed computer system.

A recent development in computer technology is the concept of a distributed system in which a number of processors together with a host element and an input/output element are interconnected. Typically this interconnection incorporates some form of switch whereby information may be conveyed between pairs of system elements. In an attempt to increase the system operating speed it has been proposed to provide optical message transmission between the system elements and to route these transmissions via an optical crossbar or matrix switch. In such an arrangement each system element is associated with a corresponding row and column of the switch. The system elements are interrogated sequentially to determine which of these elements wish to transmit information. The destination of each message is accessed to ensure that this destination is not already busy with another information transmission and, if the destination is free, the transmission is routed via the switch. It will be appreciated that in such a system a significant proportion of the message handling time is spent in checking the state of each destination. At low system loadings the check on the destination will normally return an idle answer. At higher loadings the destination may well be busy, and if this is so the requesting terminal must await the next cycle of interrogation, i.e. the next scanning cycle, before the request can be reconsidered, even if the destination becomes idle immediately after the first request. Furthermore, in such circumstances the destination can again become busy during the scanning cycle as a result of 'queue jumping'. This can place a significant restriction on the system operating speed under high load conditions.

The object of the invention is to minimise or to overcome this disadvantage.

According to the invention there is provided a network system, including a plurality of system elements each having a corresponding address, a switch whereby pairs of said elements may be interconnected to establish a communications path therebetween, means associated with each said system element for storing in queued sequential order the addresses of those system elements requesting communication with that element, and means associated with each said system element for broadcasting to the other system elements the stored address of the element at the head of the queue whereby to establish communication via the switch between those two system elements.

An embodiment of the invention will now be described with reference to the accompanying drawings in which: Fig. 1 is a schematic diagram of a distributed computer network; Fig. 2 shows an optical switch whereby information may be transmitted between the system elements or terminals of the network of Fig. 1; Fig. 3 shows a portion of the switch matrix of Fig. 2; Fig. 4 illustrates a control arrangement for supervising communication between terminals via the switch of Fig. 2; and Fig. 5 is a graphical illustration of the relationship between latency and loading level for the network of Fig. 1.

Referring to Fig. 1, the network includes at least one host element 11, a plurality of processing elements 12 and an input/output element 13 interconnected via an optical transmission path 14. Advantageously the system may also include a diagnostic element 15. Typically each processing element contains a random access memory which forms part of a distributed store.

When used as a database server the system may have its database embedded in the distributed store. In a typical arrangement a 64 element system may comprise 60 processing elements, 3 host elements and 1 diagnostic element. The system may rely on a host processor for disc input/output support.

Optionally a number of input/output elements could replace processing or host elements to provide local disc access.

The system may employ the Relational Execution module which communicates by using message passing. The primary characteristics of the message passing approach are high integrity, low error rate, error handling, message ordering, flow control, asynchronous message passing and buffer management. Message passing is used to copy data from one processing element's store to another processing element's store. In order to perform this message passing function the system provides for a large number of concurrent peer-to-peer communications between elements via the optical crossbar switch.

Fig. 2 shows a suitable optical crossbar switch in schematic form. Each processor element of the system is coupled via optical fibres (not shown) to one row (column) of an input array 21 and to one column (row) of an output rectangular array 22 of photodetectors. Light signals generated by the elements 22' of the transmitter array 21 are directed via a holographic fan-out element 23 on to a spatial light modulator array 24. The purpose of the fan-out element is to replicate the input array N times, where N is the number of elements in that array. The spatial light modulator thus has a total of N² elements or pixels. Light passing through the spatial light modulator array 24 is directed via a fan-in lens array 25 on to the detection array 22. Advantageously, each element or pixel of the spatial light modulator comprises a ferro-electric liquid crystal material which is capable of being switched rapidly between transmitting and non-transmitting conditions. To effect transmission between a pair of system elements the appropriate rows and columns of the spatial light modulator are addressed to render a corresponding set of pixels transmitting whereby to provide a light path between a selected element of the transmitter array and a selected element of the detector array.
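By way of illustration only, the pixel arrangement described above can be modelled as an N by N boolean matrix in which one pixel gates the light path from one transmitter to one detector. The class name CrossbarSLM and its methods are hypothetical and do not appear in the specification; the one-pixel-per-path mapping is a simplifying assumption.

```python
class CrossbarSLM:
    """Toy model of an N x N spatial light modulator (names are hypothetical)."""

    def __init__(self, n):
        self.n = n
        # pixels[i][j] is True when the pixel passes light from
        # transmitter i to detector j (assumed one pixel per path).
        self.pixels = [[False] * n for _ in range(n)]

    def set_path(self, source, destination, transmitting):
        """Render the pixel for (source, destination) transmitting or not."""
        self.pixels[source][destination] = transmitting

    def detectors_lit_by(self, source):
        """Return the detectors currently reachable from one transmitter."""
        return [j for j in range(self.n) if self.pixels[source][j]]

    def pixel_count(self):
        """Total number of pixels, i.e. N squared."""
        return self.n * self.n
```

For the 64 element system mentioned earlier, such a model would hold 64 x 64 pixels, and opening a path from element 3 to element 10 would leave every other pixel non-transmitting.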

Fig. 3 shows a portion of the switch matrix of Fig. 2.

The switch comprises an array of pixels each addressed by a row conductor 311 and a column conductor 312 via a field effect transistor 313. In use, a drive voltage is applied to a back electrode 314 via the transistor 313 to render the cell transmitting or non-transmitting. The arrangement is provided with a common transparent front electrode (not shown).

The construction of an optical crossbar switch is discussed in our Specification No. 2 243 967 (W A Crossland 58-1-1). Smart pixels for use in an optical crossbar switch are discussed in our UK specification No. 2 233 469 (W A Crossland 57-9-1).

Before transmission via the spatial light modulator can take place it is necessary to ensure that the receiving column (row) of the detector array is not already receiving another transmission, i.e. the processor element corresponding to that column (row) is not busy. An arrangement for providing this communication control or supervision is shown in Fig. 4.

Referring to Fig. 4, each system element is coupled to two electrical buses 31, 32. The former, the I/P bus, is controlled by an external controller 311 which places on the I/P bus 31 the binary address of each system element in turn. The system elements continually compare the broadcast address with their own address using the address comparator, and remain idle until the addresses match. If the entire control system is built on a single chip each element will be hardwired with its own address, or if the system is built from discrete units the element addresses could be externally set.

When an element recognises its own address it can issue a request to send a message to another element (destination), by asserting on the 'request destination' line and simultaneously placing on the O/P bus 32 the binary address of the destination. It does this by enabling the tri-state high impedance buffers to output mode, whereby they transfer the contents of the internally generated request destination to the output bus.

If the element has no request to send a message, it does not assert 'request destination' and the external controller moves to the next element address in the sequence.

Meanwhile all the other elements are idle and continually compare the address on the O/P bus with their own address. When the addresses match, an idle element knows that the element whose address is present on the I/P bus wishes to send it a message, and it writes the present contents of the I/P bus into its own queue 33. This queue is a first-in-first-out (FIFO) store, which stores the incoming addresses of the requesting devices in the order in which they are received, and outputs them in the same order. It could be a discrete device, or it could be an area of memory associated with the internal control unit, similar to the stack memory area of a microprocessor.

If the destination queue FIFO is full, the element signals this via the 'queue full' line to the external controller, which signals 'request refused' to the source element.
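The queueing behaviour at a destination element may be sketched as follows. This is a minimal sketch under stated assumptions: the class name SystemElement, the method observe_buses, the returned strings and the queue capacity of 4 are all hypothetical, since the specification gives no FIFO depth or signal encoding.

```python
from collections import deque


class SystemElement:
    """Toy model of one element's address comparator and FIFO queue."""

    def __init__(self, address, queue_capacity=4):
        self.address = address
        # Capacity is an assumed value; the specification states no depth.
        self.queue_capacity = queue_capacity
        self.queue = deque()  # requesting addresses, oldest first

    def observe_buses(self, ip_bus_address, op_bus_address):
        """Compare the O/P bus with our own address and queue the source.

        Returns 'ignored' if the request is for another element,
        'queue full' if the FIFO cannot accept it, else 'queued'.
        """
        if op_bus_address != self.address:
            return "ignored"
        if len(self.queue) >= self.queue_capacity:
            return "queue full"
        # The I/P bus carries the address of the requesting (source) element.
        self.queue.append(ip_bus_address)
        return "queued"
```

In this sketch a 'queue full' result stands in for the 'queue full' line, on which the external controller would signal 'request refused' to the source element.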

When the external controller recognises asserted 'request destination' and 'request valid' lines, it opens both the optical channels on the SLM between the element whose address is on the I/P bus (source) and the element whose address is on the O/P bus (destination). This operation occurs in parallel with the queue updating described above. The external controller then moves on to the next element address in the sequence.

The operation described so far results in each element having in its queue store a list of the elements which wish to send to it, in the order in which the requests were received. Also the optical channels between each element and those that wish to send to it are open. Therefore each element broadcasts optically to all those waiting elements a non-data header followed by the address at the top of its queue. Waiting elements recognise the non-data header and check the broadcast address against their own address. If they match they are able to send their message; if not they remain silent.
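The broadcast and recognition step may be sketched as follows. The value used for the non-data header and the function names broadcast_frame and may_transmit are hypothetical; the specification does not define how the header is encoded.

```python
from collections import deque

# Placeholder marker; the real non-data header encoding is not specified.
NON_DATA_HEADER = "HDR"


def broadcast_frame(queue):
    """Build the frame a destination broadcasts: header plus head of queue."""
    if not queue:
        return None  # nothing is waiting, so nothing is broadcast
    return (NON_DATA_HEADER, queue[0])


def may_transmit(own_address, frame):
    """A waiting element may send only if the broadcast address is its own."""
    if frame is None:
        return False
    header, address = frame
    return header == NON_DATA_HEADER and address == own_address
```

With a queue holding the addresses 7, 2 and 9 in that order, only element 7 would be permitted to transmit, while elements 2 and 9 remain silent until they reach the head of the queue.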

Once a message has been passed, closure of the optical channels is initiated by the source element, which issues a 'request closure' signal when it is next addressed by the external controller. The controller maintains a current directory of source-destination connections and closes the corresponding optical channels.

The sequence of bus cycles is as follows:

1) Address an element and obtain a destination address and a possible closure request.

2) Close optical channels if required.

3) Source address is written into queue at destination.

4) Optical channels are opened between source and destination.
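The four steps above can be sketched as a single pass of the external controller over the element addresses. This is an illustrative model only: the function scan_once and its arguments are hypothetical, the directory is represented as a plain mapping from source to destination, and the 'queue full' and 'request valid' handling is omitted for brevity.

```python
from collections import deque


def scan_once(addresses, requests, closures, queues, directory):
    """One scanning cycle of the external controller (toy model).

    addresses: iterable of element addresses in scanning order
    requests:  dict mapping source -> requested destination
    closures:  set of sources asserting 'request closure'
    queues:    dict mapping address -> deque of waiting source addresses
    directory: dict mapping source -> destination for open optical channels
    """
    for source in addresses:
        # 1) Address the element; obtain a destination address and a
        #    possible closure request.
        destination = requests.pop(source, None)
        wants_closure = source in closures

        # 2) Close optical channels if required.
        if wants_closure:
            directory.pop(source, None)
            closures.discard(source)

        if destination is not None:
            # 3) Source address is written into the queue at the destination.
            queues[destination].append(source)
            # 4) Optical channels are opened between source and destination.
            directory[source] = destination
```

For example, if elements 0 and 1 both request element 3 in the same cycle, element 3's queue would hold 0 then 1, so the later request cannot jump ahead of the earlier one.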

The procedure has the advantage that only two bus cycles are required for a request and that there is minimal delay between several requests to the same destination. Also full two-way communication is provided without further electrical requests.

In order for the system to operate successfully it is necessary to apply the following conditions.

i A requesting element, once in a queue for a destination, must remain optically silent and cannot service its own queue buffer until its transaction has been completed and the corresponding optical channels closed.

ii A system element cannot initiate a transmission request until its own queue buffer is empty, i.e. incoming messages have priority over outgoing messages. This is necessary to prevent a deadlock situation in which two system elements are trying to transmit to each other.

iii While transmitting or receiving, a system element cannot transmit (receive) signals to (from) any other element, i.e. simultaneous one-way transactions with two elements are not permitted.

The arrangement described above reduces, in comparison with a conventional arrangement, the average delay or latency between making a transmission request and effecting that transmission. The latency is to some extent dependent on the message length, i.e. the number of bytes, and is reduced at longer message lengths. To demonstrate the efficiency of the arrangement described above, a computer model has been developed to predict the latency for message lengths of 14, 128 and 256 bytes and for various loading levels. For comparison purposes similar computations have been performed for a conventional system. The results of these computations are illustrated in the graph of Fig. 5 which clearly shows the improvement achieved by the present arrangement.

Although the network system has been described with particular reference to use in a distributed processor system, it is not limited to that particular application and may, for example, also be employed in data switching applications.

Claims


1. A network system, including a plurality of system elements each having a corresponding address, a switch whereby pairs of said elements may be interconnected to establish a communications path therebetween, means associated with each said system element for storing in queued sequential order the addresses of those system elements requesting communication with that element, and means associated with each said system element for broadcasting to the other system elements the stored address of the element at the head of the queue whereby to establish communication via the switch between those two system elements.

2. A network system, including one or more host elements, a plurality of processing elements, each said element being allocated an address, an optical transmission medium whereby said elements are interconnected, an optical crossbar or matrix switch whereby communication may be established between selected pairs of elements, a first bus for carrying in sequential order the addresses of the system elements, a second bus for carrying the addresses of those elements to which a communication path is to be established via the switch, means associated with each said element for recognising the address of that element on the first bus and for placing on the second bus the address of an element with which communication is to be established, means associated with each system element for recognising its address on the second bus and for storing in queued sequential order the addresses of those system elements requesting communication with that element, means for opening an optical channel via the switch between each system element and those system elements that wish to communicate with that element, means associated with each element for broadcasting the address of the element at the head of its queue, means associated with each system element for recognising its own broadcast address and in response thereto transmitting a message to the broadcasting element, and means consequent upon the completion of transmission between two elements for closing the optical channel between those elements.

3. A network as claimed in claim 1 or 2, wherein said switch comprises a rectangular array of liquid crystal spatial light modulator elements or pixels.

4. A network as claimed in claim 1, 2 or 3, wherein said means for storing addresses in a queued sequential order comprises a first-in-first-out (FIFO) store.

5. A network as claimed in claim 4, wherein each said FIFO store comprises a respective area of memory in a common stack memory.

6. A network substantially as described herein with reference to and as shown in the accompanying drawings.

7. A method of intercommunication between pairs of elements in a network, which method is substantially as described herein with reference to and as shown in the accompanying drawings.