GB2238142A - Computer systems - Google Patents


Info

Publication number
GB2238142A
GB2238142A (application GB8921359A)
Authority
GB
United Kingdom
Prior art keywords
processing
module
controlling
transputer
user
Prior art date
Legal status
Withdrawn
Application number
GB8921359A
Other versions
GB8921359D0 (en)
Inventor
David John Phillips
Current Assignee
CAPLIN CYBERNETICS
Original Assignee
CAPLIN CYBERNETICS
Priority date
Filing date
Publication date
Application filed by CAPLIN CYBERNETICS filed Critical CAPLIN CYBERNETICS
Priority to GB8921359A
Publication of GB8921359D0
Publication of GB2238142A

Links

Classifications

    • G — Physics
    • G06 — Computing; Calculating or Counting
    • G06F — Electric Digital Data Processing
    • G06F 15/00 — Digital computers in general; Data processing equipment in general
    • G06F 15/16 — Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F 15/163 — Interprocessor communication
    • G06F 15/173 — Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F 15/17337 — Direct connection machines, e.g. completely connected computers, point to point communication networks
    • G06F 15/17343 — Direct connection machines wherein the interconnection is dynamically configurable, e.g. having loosely coupled nearest neighbour architecture
    • G06F 15/76 — Architectures of general purpose stored program computers
    • G06F 15/80 — Architectures comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F 15/8007 — Single instruction multiple data [SIMD] multiprocessors
    • G06F 15/8015 — One dimensional arrays, e.g. rings, linear arrays, buses


Abstract

A parallel processing system can be used in co-operation with another computer to enhance the operation of that other computer. The system includes an interface module for interfacing the other computer to a plurality of interface links. The interface module 2 is arranged such that a single interface link 9 is assignable for use by a user and at least one processing node 13 from an array of such nodes is connectable under the control of a controlling sub-system 10, 14, 15 to the said interface link 9 to provide a parallel processing domain for exclusive use by that user. Each processing node may comprise a transputer and dedicated memory.

Description

IMPROVEMENTS IN COMPUTER SYSTEMS

Field of the invention

The present invention relates to improvements in computer systems and more particularly, though not exclusively, to parallel processing systems.
Background of the invention

In recent years the processing speed of sequential processing computers, i.e. computers which perform individual operations one after another and which are based on von Neumann's definition first proposed in the 1940s, has increased significantly.
This increase has been due to rapid advances in technology, but it is now becoming clear that advances in this area of technology are reaching a limit and that further increases in the power of sequential computers will only be possible by use of expensive materials in the manufacture of components and/or expensive cooling techniques. Sequential processing computers are therefore approaching the limit of what can be achieved and it is now being acknowledged that further significant increases in computational power are likely to be possible only by way of systems based on parallel processing techniques where several operations are performed concurrently.
A silicon based parallel processing system is now offered by Inmos by way of so-called transputers which provide a parallel processing architecture in the form of a microcomputer with communications links. The communications links enable each transputer to be connected to other transputers and in this way a full parallel processing system can be provided as an array of interconnected transputers. Each transputer also includes special circuitry and interfaces which can be used to adapt the transputer to a particular application. Transputers can be used individually as a single processing system, and there can be advantages in this in terms of the additional power gained as compared to that of conventional sequential processors, but more commonly they are used as processing elements or nodes in networks or arrays which form a high power concurrent or parallel processing system.
Whilst parallel processing systems offer significant advantages in terms of computational power compared to that of sequential processing systems, the majority of existing computers are sequential systems and there has been a substantial investment in the development of operating systems and applications programs for these computers. There are therefore advantages to be had, in order that this investment should not be wasted, in providing a parallel processing system which can be used in co-operation with existing computers without the need to develop completely new operating systems and applications software. By using existing operating system facilities available on established computers, such as disc management, keyboard scanning, display control, file servers etc., costs in terms of development time and user training time can be minimised.
Such systems have previously been proposed and indeed our own QT Series (Trade mark) of products provides a parallel processing system which is designed for use with a microVAX computer, manufactured by Digital Equipment Corporation. A QT system can be used to provide a host computer, in this case a microVAX, with an expandable on-board transputer array in such a way that groups or arrays of transputers can be allocated to individual users who can gain access to their allocated arrays via the microVAX. An interface module provides an interface between microVAX I/O lines, i.e. the Q-bus of the microVAX, and links which connect to individual transputers or arrays of transputers. The interface module receives commands and requests from the microVAX host and, in response to these commands and requests, establishes connections between the Q-bus and the transputer links. Each transputer link can be used to connect to independent transputer sub-systems or sub-arrays provided in the form of discrete modules.
Whilst any predefined number of transputers may be connected to each transputer link in the interface module, the QT system requires these connections to be made at the time of installation of the system by hardwiring the required number of transputers which form the sub-array to the appropriate link. This hardwiring arrangement has been found to be highly satisfactory in most situations, but there are situations where it would be desirable to provide a reconfigurable system in which transputers can be assigned to a given link without the need to dismantle and rewire the system. This is particularly so in multi-user situations where individuals may require the use of different numbers of transputers at different times in order to optimise the execution of particular applications programs. Moreover, users will expect to have exclusive use of the transputers assigned to them and it is therefore important that groups of transputers are assigned to each user in a way which provides for a secure link to those transputers.
It will be appreciated that the secure assignment of transputers to individual users is not a problem where individual transputers are hardwired to a link so that transputers are in effect dedicated to a single user. However, this arrangement can be wasteful of computational resources because a given link will have a fixed number of transputers permanently assigned, i.e. hardwired, to it regardless of whether or not all of those transputers are actually required by the user to perform a particular task.
Clearly then there is a need to provide a parallel processing system which can be used in conjunction with existing computers and in which transputers or other processing nodes in a parallel processing array can be assigned to a given user without the need to dismantle and rewire the system.
Summary and objects of the invention

It is therefore an object of the invention to provide such a parallel processing system.
It is also an object of the invention to provide a secure multi-user, user-reconfigurable, parallel processing system.
It is also an object of the invention to provide a parallel processing system in which processing resources are user reconfigurable under software control and thereby can be made available on request by the user.
According to one aspect of the invention, therefore, there is provided a parallel processing system for use in co-operation with another computer to enhance the operation of that other computer, in which parallel processing system an interface means is provided for interfacing the other computer to a plurality of interface links, the interface means being arranged such that a single interface link is assignable for use by a user and at least one processing node from an array of such nodes is connectable under the control of a controlling subsystem to the said interface link to provide a parallel processing domain for exclusive use by that user.
In an embodiment of the invention, to be described in greater detail hereinafter, the interface means includes means for connecting data lines from the other computer to said interface links for data transfer therebetween. The interface means is provided in the form of an interface module including a transputer which functions in accordance with firmware in ROM and co-operates with, among other things, dual port RAM under DMA control to supervise the transferring of data between the said other computer and the interface links. The interface module also includes a sub-system controller which is supervised by the transputer to control the abovementioned controlling sub-system. Thus, the controlling sub-system is arranged to connect a user defined plurality of processing nodes to the interface link assigned to that user.
The embodiment of the invention to be described provides the array of processing nodes in a plurality of processing modules, with each processing module comprising a number of processing nodes and being adapted to be cascadably connected to other such modules.
Each processing module may further comprise controlling means associated with the controlling subsystem for controlling the connecting of processing nodes in the module to a user link. In practice, this controlling means may be a transputer which is adapted to control a link adapter or crossbar switch.
Commands to this controlling sub-system are delivered from the other computer, which acts merely as a host, along lines associated with the controlling subsystem. Such commands among other things instruct the controlling transputer in the switching of the crossbar switch and in this way the connecting of processing nodes to other nodes to define a sub-array or processing domain of nodes is controllable.
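The bookkeeping implied by this arrangement, in which the controlling sub-system connects a user-defined group of nodes to one link and keeps each user's domain exclusive, can be sketched as a simple allocator. The class and method names below are illustrative assumptions only; in the described embodiment this role is performed by the monitor transputers driving the crossbar switches.

```python
# Hypothetical sketch of the controlling sub-system's bookkeeping when it
# assigns workers to a user's link. All names here are illustrative
# assumptions, not taken from the patent.

class CrossbarAllocator:
    """Tracks which workers are connected to which user link."""

    def __init__(self, num_workers):
        self.free = set(range(num_workers))   # unassigned workers
        self.domains = {}                     # user link -> set of workers

    def allocate(self, link, count):
        """Connect `count` free workers to `link` for exclusive use."""
        if count > len(self.free):
            raise RuntimeError("not enough free workers")
        chosen = {self.free.pop() for _ in range(count)}
        self.domains[link] = chosen
        return chosen

    def release(self, link):
        """Disconnect a user's domain, returning its workers to the pool."""
        self.free |= self.domains.pop(link)

alloc = CrossbarAllocator(num_workers=8)
domain_a = alloc.allocate(link=0, count=3)   # user on link 0 gets 3 workers
domain_b = alloc.allocate(link=1, count=2)   # user on link 1 gets 2 workers
assert domain_a.isdisjoint(domain_b)         # domains never overlap
alloc.release(link=0)                        # workers return to the free pool
```

Because allocation only ever draws from the free pool, one user's domain can never be disturbed by another user's request, which mirrors the exclusivity requirement stated above.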
Each processing node, in the embodiment to be described, itself comprises a transputer and a dedicated memory arranged such that an applications program or part of such a program can be executed thereon.
In another of its aspects the invention provides a processing system for use with a host computer, which processing system comprises an interface module for connecting data lines from the host computer to links within the system, at least one processing module defining a plurality of processing nodes connectable to each other and to said links, and a controlling sub-system comprising a controlling link associated with the interface module for connection to the host computer and a controlling node in the said at least one processing module, which controlling node is adapted to control the connection of processing nodes to each other and to said links.
In the embodiment, the system further comprises other processing modules adapted for cascadable connection by way of connecting and controlling links associated respectively with the processing and control nodes in each module.
It should become clear from the detailed description hereinafter that the embodiment provides a system in which a controlling sub-system of transputers controls the usage of other processing transputers or workers in an array. The controlling sub-system is linked independently to the other, i.e. host, computer and this enables independent control of the controlling sub-system. Moreover, the host computer can control and communicate with any worker in the system, although this is limited by the fact that each user is assigned an independent link into the system, with the system allocating spare processing resources within the system to meet the user's requirements without affecting sub-arrays or domains of processing nodes assigned to other users.
Thus the system in the embodiment provides for the seamless and flexible integration of an array of transputers with a host computer.
Further features and advantages of the invention, together with those above mentioned, should become clearer from consideration of the detailed description of an embodiment of the invention that is given hereinafter with reference to the accompanying drawings.
Brief description of the drawings

Figure 1 is a schematic block diagram of a parallel processing system in accordance with the present invention;

Figure 2 is a schematic block diagram of the functional components of an interface module;

Figure 3 is a diagram showing the format of (a) a control field and (b) a control word;

Figure 4 is a more detailed schematic diagram showing a typical processor module from the system in Figure 1;

Figure 5 is a schematic block diagram of a worker;

Figure 6 is a schematic block diagram of a monitor and control sub-system;

Figure 7 shows schematic functional block diagrams of two types of special purpose modules; and

Figure 8 illustrates the software structure of the system.
Detailed description of an embodiment Referring now to Figure 1 of the accompanying drawings, a parallel processing system 1 comprises a number of cascadable modules 2 to 7 linked together by a data highway 8 consisting of sixteen worker links 9 and by a monitor link 10 and a sub-system control path 11. The various links and paths 9 to 11 can be implemented as a set of ribbon cables in order to simplify installation of the system 1. It will be appreciated that, whilst in the above and the following description reference is and will be made to specific numbers of lines, etc, these references are made merely to assist in understanding the embodiment and are not intended to be in any way limiting.
With the exception of the interface module 2, each module 3 to 7 contains a number, typically from one to eight, of processing nodes or workers 13 which each include an INMOS T8 Series floating point transputer and an associated memory (not shown in Figure 1) with a storage capacity from say one to sixty four megabytes as required. Each module 3 to 7, with the exception of interface module 2, also includes a crossbar switch 14 which is connected to each worker 13 in the module and to the data highway 8 to enable "up" communication, i.e. the transfer of data between modules towards the interface module 2, and "down" communication, i.e. the transfer of data between modules away from the interface module 2. Each crossbar switch 14 provides sixteen "up" worker links and sixteen "down" worker links which are connected between modules to form a cascading data highway 8.
In each module, the crossbar switch 14 is controlled by a monitor 15, which is preferably an Inmos T222 transputer, which is connected to other monitors in other modules via the monitor link 10.
Thus, it should be apparent that each module 3 to 7 is divided into two parts: a user system and a controlling sub-system. The user system comprises one or more workers 13, each with local memory (not shown in Figure 1), the workers being linked to the crossbar switch 14 for flexible user-definable connection to other workers and to the interface module 2, thereby to enable groups or arrays of workers (including one worker) to be assigned exclusively to a particular user. The controlling sub-system comprises a monitor 15 which controls the operation of the crossbar switch 14 and can also read signals output from the workers 13 to identify error situations and control resetting of individual workers and of the module. The monitor 15, which it will be recalled is in itself a transputer, can communicate directly with associated worker transputers 13, i.e. transputers in workers in the same module, and with monitors 15 in other modules and thereby also with the workers in other modules.
It can be seen from Figure 1 that the system is modular and cascadable which enables the abovementioned links and paths 9 to 11 to be daisy-chained between modules. By using cascadable modules, the system 1 can be readily expanded to include further processor modules 3 to 5 or specialist modules such as interface module 6 and graphics module 7 with its associated display monitor 7a and user operable input device, e.g. mouse 7b, which modules 6,7 will also be described in further detail hereinafter.
The system 1 is not intended to be used as a stand alone parallel processing facility (i.e. a supercomputer) but is instead intended to be associated with a host computer to increase the speed of execution of applications programs run on the host.
The system is modular and expandable and is best suited, but not limited, to small to medium sized arrangements of up to say sixty four workers. Extra modules can be added easily to the system because of the simplicity of the one dimensional connection scheme, i.e. the worker links 9, the monitor link 10 and the control path 11, and because the specific assignment of workers to a user is determined electronically by way of the crossbar switches under monitor control.
A host computer (not shown in Figure 1), which in this embodiment is intended to be a DEC microVAX running VMS software for example, can be connected to the system 1 via its Q-bus 12 connected to the interface module 2 and in this way a means by which users can access and control the system 1 is provided.
The interface module 2 provides a number of links which can be used to connect between the Q-bus and other modules in the system 1. Typically an interface module will include four links, one of which is used as the monitor link 10 to connect to the monitor 15 of the first module 3 in the system 1. Other links in the interface module 2 are used as user worker links 9 and if only one interface module 2 is used in the system, the remaining three links will provide three user worker links 9 for exclusive use by three users.
If the system is required to provide access for more than three users further interface modules can be added and, since only one monitor link 10 is ever required, all four links in each extra interface module are available for exclusive use as user worker links.
The interface module 2 is shown in greater detail in Figure 2 of the accompanying drawings. As can be seen from Figure 2, the interface module 2 is based around an INMOS T222 transputer 16 which supervises control of the operation of other components in the module. The transputer 16 is linked to link terminations and buffers 17 which provide the above mentioned user links 9, depicted here as links 9a to 9d, which are interfaced by the module 2 to the Q-bus 12. The buffers 17 also provide ESD protection for each link 9a to 9d. Each link 9a to 9d is associated with a sub-system controller 11' which appears as eight registers which are memory mapped into the transputer's address space to allow it to generate and respond to reset, analyse and error signals. A separate sub-system control line 11a to 11d is provided for each link 9a to 9d. Each link comprises an input device and an output device (not shown) with each device providing a communications channel. Thus each link comprises two communications channels and in this way full duplex communication is possible over each link. The sub-system controller 11' enables the module 2 to control and monitor four independent transputer sub-systems associated respectively with links 9a to 9d. As mentioned above, the controller 11' appears as eight registers mapped into a transputer address space and these registers are decoded in such a way as to replicate them throughout the entire sub-system controller 11' address space. The transputer 16 executes firmware instructions held in a ROM 18, typically 8K words, to provide a large number of complex data transfer related functions which enable the transputer 16 to "intelligently" buffer data between the links 9a to 9d and the Q-bus 12. The transputer 16 can be booted from one of the links 9a to 9d to facilitate, for example, the development of application firmware or the running of extended diagnostics programs in the event of a system failure.
The Q-bus 12 is interfaced by transceivers 19 to a multiplexed address and data highway 20 and data is transferred between the Q-bus and the transputer 16 via a dual port RAM 21. The transputer 16 communicates with the RAM 21 and other units within the module over a transputer address and data highway 22. Thus data and status information from the transputer 16 are written by the transputer to one port 23 of the dual port RAM 21 and read from the other port 24 for transfer to the Q-bus.
Additionally, a Q-bus master (not shown) via the Q-bus 12 writes commands and associated parameters to the RAM 21 at the port 24 and these commands and parameters are read from the port 23 by the transputer 16. A direct memory access (DMA) control unit 25 controls all DMA transfers and uses the dual port RAM 21 for buffering data during these transfers. At least a pair of DMA buffers (not shown but provided in the RAM 21) is used for each channel to enable data to be transmitted over a link 9a to 9d at the same time as Q-bus data is being transferred. The RAM 21 is large enough to accommodate multiple buffers for each channel, i.e. input and output devices for a link 9a to 9d, and so permits the interleaving of DMA transfers.
The dual port RAM 21 is 2K words deep. The majority of the RAM 21 is available for buffering data during DMA transfers. A few locations, i.e. sixty-four, in the RAM 21 are used to pass commands and data between a Q-bus master and the transputer 16. The RAM 21 also provides programmable I/O registers which are programmed and grouped according to function. To this end, there are eight of each type of register (input and output), one for each communication channel. The number or code identifying the required channel can be used as an array index to select a particular register from within the block or array of eight registers.
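The channel-indexed register blocks just described can be sketched as follows; the names and the use of plain Python lists are illustrative assumptions standing in for memory mapped hardware registers.

```python
# Minimal sketch of channel-indexed register blocks: eight input registers
# and eight output registers, one per communications channel, with the
# channel code used directly as the array index. Names are illustrative.

NUM_CHANNELS = 8                      # two channels per link, four links

input_regs  = [0] * NUM_CHANNELS
output_regs = [0] * NUM_CHANNELS

def reg_for_channel(block, channel):
    """Select one register from a block using the channel code as index."""
    if not 0 <= channel < NUM_CHANNELS:
        raise ValueError("channel code out of range")
    return block[channel]

output_regs[3] = 0xBEEF               # illustrative write on channel 3
assert reg_for_channel(output_regs, 3) == 0xBEEF
```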
A Control and Status Register associated with the DMA control 25 is used to initiate DMA transfers and provide error information when a DMA transfer is complete. This register is written to initiate a transfer and to set the direction of that transfer.
The transputer 16 can only set a DMA request bit, and this bit is cleared on DMA completion along with the bit indicating direction. The direction bit should not be changed during a DMA transfer; it is sampled before each burst of data is transmitted on the bus.
DMA completion is signalled by an event signal or event line 16a between the transputer 16 and the DMA control 25.
Two bits provide an indication of timeout and parity error and can be read on completion of a DMA transfer to determine whether a parity error occurred during data reads, or whether a bus timeout condition occurred during the DMA transfer.
A control and status register 26 enables the Q-bus master to reset and monitor the transputer 16 and is also used during register access from the Q-bus 12 to point to a channel of interest in the dual port RAM 21. Thus, a method of paging is provided which minimises the number of Q-bus word locations required for the Q-bus master to access the interface module 2.
The transputer 16 uses the control and status register 26 together with an interrupter 27 to initiate an interrupt in accordance with Q-bus protocol and to indicate the channel requesting the interrupt. The interrupter 27 supplies an interrupt vector to the Q-bus 12 at the appropriate time under Q-bus protocol.
The interrupt vector is placed on the Q-bus 12 during an interrupt and is unique to each Q-bus device. The interrupt vector is a nine bit quantity. The interface module 2 can be configured to generate interrupts at one level (according to Q-bus protocol) between levels four and seven inclusive.
Firmware support is provided for block mode DMA to facilitate high data transfer rates. The interface module 2 is arranged so that it can acquire control of the Q-bus and become a Q-bus master to transfer data using block mode DMA. During block mode DMA, Q-bus word addresses are generated. Q-bus addresses are in general 22-bits wide and the Q-bus word addresses generated during block mode DMA can be anywhere in this address space. When an address is within the so-called I/O page of the Q-bus, a Q-bus I/O device select signal known as bbs7 is generated. The DMA control unit 25 generates bursts of multiple bus requests during a DMA transfer and is arranged such that the period between bursts can be defined under program control. If the destination for data, i.e. a DMA slave, does not have the capabilities for block mode DMA transfer then normal DMA transfers are performed. Bus requests from other devices on the Q-bus are monitored during DMA transfers. If there are no requests during a block mode DMA transfer, then DMA will occur for sixteen bus cycles before the bus is released. If other devices request the bus during a block mode DMA transfer then the transfer is limited to eight cycles. Normal DMA transfers are limited to four cycles. If the DMA slave does not respond to a request within a predetermined time period, e.g. 8 µs, then a timeout condition occurs and the bus is released for other devices to use. By defining the period between DMA request bursts and by limiting the number of cycles for a DMA transfer, the interface module 2 is prevented from excluding other devices from using the Q-bus 12 so as to enable other devices to share the Q-bus bandwidth more evenly. The period or delay between DMA request bursts is defined in an eight bit register which is loaded with information relating to the transfer size before a DMA transfer is requested. When a DMA transfer has been requested, register writes are disabled until completion of the DMA transfer. Typically the maximum delay will be 14.4 µs. Parity is checked during read cycles.
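The bus-sharing policy described above can be summarised as a small decision function. The cycle limits (sixteen, eight and four) come from the text; the function itself, and its name, are assumptions about how those limits combine.

```python
# Illustrative model of the Q-bus burst-length policy described above.
# The cycle counts are taken from the description; the function is an
# assumed sketch, not firmware from the actual interface module.

def burst_limit(block_mode, other_requests_pending):
    """Maximum bus cycles before the interface module releases the Q-bus."""
    if not block_mode:
        return 4          # normal DMA transfers are limited to four cycles
    if other_requests_pending:
        return 8          # contended block mode: limited to eight cycles
    return 16             # uncontended block mode: sixteen cycles

assert burst_limit(block_mode=True, other_requests_pending=False) == 16
```

Capping the burst length whenever other devices are requesting the bus is what lets the interface module share Q-bus bandwidth rather than monopolise it.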
The interface module 2 is arranged to respond to a field of eight Q-bus words. This field is indicated generally at 28 in Figure 3(a). The base address 29 of this field 28 identifies the location of a control and status register (CSR) control word which is used to access the control and status register 26 which it will be recalled is shared between the eight communications channels which constitute the four links 9a to 9d. The remaining seven Q-bus word addresses 30 to 36 are used to access registers, within the control and status register 26, associated with a channel of interest. The words at these addresses 30 to 36 are paged by a field in the CSR control word 29. The CSR control word, the format of which is shown in Figure 3(b), is generated by the host and comprises sixteen bits which are used for various purposes. The control and service of interrupts, mentioned above in relation to the interrupter 27, is by way of two bits 37, 38 in the CSR control word. An interrupt enable bit 37, when active, in combination with an attention bit 38, when active, causes the interrupter 27 to generate an interrupt signal which is output to the Q-bus 12 via the address and data highway 20 and transceivers 19.
The system is arranged such that the attention bit 38 can only be set (active) by the transputer 16 and can only be cleared by a Q-bus master (not shown).
Clearing the attention bit serves to indicate to the transputer 16 that the attention request has been seen and dealt with by the Q-bus master.
The three least significant bits 39 of the CSR control word 29 are used to indicate the selecting of a particular channel and to control the transputer.
These three least significant bits 39 are written by the Q-bus master and identify one of eight pages of registers, in the control and status register 26, associated with the channel of interest. The registers in the selected page are then accessed using the words at the other addresses 30 to 36 in the field 28.
Reading these three bits 39 gives the identifying number of the channel which has requested a Q-bus interrupt, this number being set by the transputer 16.
Another bit 40 in the CSR control word 29 is used to control and monitor the transputer 16. Writing to this bit 40 controls a reset line (not shown) on the transputer 16 and reading from it provides a status indication of an error line (not shown) on the transputer.
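The CSR control word fields described above can be sketched in code. Note that 37, 38, 39 and 40 are drawing reference numerals, not bit positions: the text pins only the channel-select field to a position (the three least significant bits), so the positions chosen below for the interrupt enable and attention bits are purely illustrative assumptions.

```python
# Sketch of the sixteen-bit CSR control word. Only the channel-select
# field (three LSBs) has a position given in the text; the other bit
# positions here are assumptions chosen for illustration.

CHANNEL_MASK   = 0b0000_0000_0000_0111  # three LSBs: channel of interest
INT_ENABLE_BIT = 1 << 14                # assumed position of interrupt enable
ATTENTION_BIT  = 1 << 15                # assumed position of attention bit

def should_interrupt(csr):
    """An interrupt is generated only when both bits are active."""
    return bool(csr & INT_ENABLE_BIT) and bool(csr & ATTENTION_BIT)

def selected_channel(csr):
    """Channel page selected by the three least significant bits."""
    return csr & CHANNEL_MASK

csr = INT_ENABLE_BIT | ATTENTION_BIT | 0b101   # channel 5, both bits set
assert should_interrupt(csr)
assert selected_channel(csr) == 5
```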
A command word 30 in field 28 as shown in Figure 3(a) is used to issue a command on a selected channel.
Since the command word is sixteen bits wide, the total number of possible commands is very large. Commands can be predefined in firmware or user defined in software. Hardware support within the module 2 simplifies the task of the transputer 16 in searching for and keeping track of outstanding commands. To this end, a command and detect register (not shown) is provided to assist the transputer 16 in detecting newly issued commands from the Q-bus and keeping track of pending command requests. Whenever a command is written to the module 2, a unique bit is set in the command and detect register. The transputer 16 can read this unique bit and will know from that whether any requests are outstanding. The command and detect register is arranged so that the type of request and the relevant channel number can be found by examining the position of the bits. Thus, the first eight bits of the register represent command requests on channels zero to seven respectively and the next eight bits represent cancel requests. Bits are cleared in this register by writing a binary 1 back to the position to be cleared.
This enables the transputer 16 to service command requests by reading a command bit from the register, storing the value of the command bit as a variable and then writing the same value back to the corresponding cancel bit. If a command bit value of 1 is read from the register, subsequently writing this value back to the corresponding cancel bit will cause that bit to be cancelled. The cancel word 31, in a similar way to the command word 30, provides a flexible means of cancelling incomplete commands. The remaining words 32 to 36 in the field have no special features and simply correspond to locations in the dual port RAM 21. These words 32 to 36 are used to pass command parameters and channel status information between the Q-bus 12 and the transputer 16. They are each sixteen bits wide and this gives a total of ten bytes of data storage, which is enough to transfer a quad-word in a single programmed I/O operation for example.
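The write-one-to-clear protocol described above can be sketched as follows. This is an illustrative model only, assuming the bit layout stated (bits 0 to 7 for command requests, bits 8 to 15 for cancel requests); the class and function names are not taken from the patent.

```python
class CommandDetectRegister:
    """Write-one-to-clear register: bits 0-7 flag command requests on
    channels 0-7; bits 8-15 flag cancel requests on the same channels."""

    def __init__(self):
        self.bits = 0

    def issue_command(self, channel):
        # Q-bus side: writing a command to the module sets a unique bit.
        self.bits |= 1 << channel

    def issue_cancel(self, channel):
        self.bits |= 1 << (8 + channel)

    def read(self):
        return self.bits

    def write(self, value):
        # Bits are cleared by writing a binary 1 back to the position.
        self.bits &= ~value & 0xFFFF


def service_pending_commands(reg):
    """Transputer side: read the command bits, note which channels are
    pending, then write the same value back to clear those positions."""
    snapshot = reg.read() & 0xFF          # command bits, channels 0-7
    reg.write(snapshot)                   # write-one-to-clear
    return [ch for ch in range(8) if (snapshot >> ch) & 1]
```

Reading the register thus tells the transputer both that requests are outstanding and, from the bit positions, which channels they concern.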
Turning now to Figure 4 of the accompanying drawings, there is shown a schematic representation of a typical module, for example a processor module 3, used in the system 1. As can be seen from this drawing, the module 3 includes a block 41 of worker or user transputers which define a processing array in which each worker can be assigned to a particular user under the control of control and communication logic 42 associated with the crossbar switch 14 and the monitor transputer 15. The crossbar switch 14 preferably comprises one or more INMOS IMSC004 programmable link switches configured so as to provide a programmable crossbar switch. A processor module 3 may also include special purpose logic 41a to facilitate execution of specific functions. For example, the graphics module 7 (to be described in greater detail hereinbelow) includes special purpose logic in the form of, among other things, a D/A converter and a colour look up table which enable a colour display to be driven.
As can be seen from Figure 5 of the drawings, each worker 13 comprises a worker transputer 43 which is connected to a local memory, such as DRAM 44, and to a link adapter 45. The DRAM 44 may have a capacity of say one to sixty four megabytes as required and is used as a storage medium for storing programs and for storing data to be processed by the transputer 43.
The link adapter 45 is preferably an INMOS IMSC012 device and is connected to the worker transputer 43 to provide a fifth or pseudo link 46 between the worker transputer 43 and the monitor transputer 15 (also see Figure 6 of the drawings). The worker transputer 43 can write a data byte into the link adapter 45 for serial transmission by the link adapter along the pseudo link 46. Similarly, a data byte received serially on the pseudo link 46 is buffered by the link adapter 45 for delivery to the worker transputer 43.
When a data byte is received by the link adapter 45 from the pseudo link 46, the link adapter outputs an interrupt signal which is delivered via line 47 to the so-called EVENT input of the transputer 43, where the signal is interpreted as an indication that there is valid data in the link adapter waiting to be read by the transputer. The worker transputer 43 can also set an interrupt bit in a similar manner so as to request service from the monitor transputer 15. The provision of the extra or pseudo link 46 by the link adapter leaves all four of the links 9a of the worker transputer free to be used as worker links 9 in the data highway 8 for communication with transputers in other workers 13 or with the interface module 2. It will be appreciated that a number of workers similar to the worker 13 shown in Figure 5 are provided in the module 3 and that these workers and workers in other modules are connected to each other via the links of crossbar switch 14 to form an array of transputers.
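The pseudo link behaviour described above may be modelled as a simple buffered adapter: a byte arriving serially is held and an event flag raised until the worker transputer reads it. The class and attribute names are assumptions for illustration, not identifiers from the patent.

```python
class LinkAdapter:
    """Sketch of the link adapter 45: buffers one incoming byte and
    raises an event flag that drives the transputer's EVENT input."""

    def __init__(self):
        self._buffer = None
        self.event = False            # asserted via line 47 when data waits

    def receive_serial(self, byte):
        """A byte arrives serially on the pseudo link 46."""
        self._buffer = byte
        self.event = True             # interrupt: valid data waiting

    def read_byte(self):
        """The worker transputer 43 reads the buffered byte,
        which clears the event indication."""
        byte, self._buffer = self._buffer, None
        self.event = False
        return byte
```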
Each worker 13 forms a processing node within this array, and individual processing nodes or sub-arrays of processing nodes can be securely assigned to specific users under the control of the monitor 15.
The components of the module shown in Figure 4 which make up the control sub-system, namely the crossbar switch 14, the control and communication logic 42 and the monitor 15, are shown in greater detail in Figure 6 of the drawings. Referring now to Figure 6, it can be seen that the monitor transputer 15 controls the crossbar or link switch 14 and, via pseudo links 46 and interface logic 48, controls the workers 13. The monitor 15 receives tagged system commands from the host computer (not shown) delivered along the monitor links 10 from the interface module 2. Each monitor in the modules which make up the system examines the tagged command and, if the tag indicates that the command is not relevant to that monitor, it simply relays the command along its down monitor link line 10a to the up monitor link line 10b of the monitor transputer of the next module in the system. Once the command reaches the appropriate monitor transputer it is interpreted by the monitor, which acts accordingly.
Commands transmitted in this way may include commands to reset one or all workers 13, analyse any worker 13, or detect any worker 13 which is indicating that an error or interrupt condition has occurred.
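The tagged-command relay along the monitor pipeline can be sketched as follows, assuming a simple dictionary message format; the field names are illustrative only.

```python
def deliver(monitors, message):
    """Walk the daisy-chained monitors (nearest the host first) until one
    claims the tag; otherwise the message is relayed down the pipeline
    and falls off the end unclaimed."""
    for monitor in monitors:
        if message["tag"] == monitor["tag"]:
            monitor["received"].append(message["command"])  # act on it
            return monitor["tag"]
        # not relevant: relayed along the down monitor link to the next module
    return None
```

In the real system the relay happens module by module over the monitor link lines 10a/10b; the loop above simply collapses that chain into one pass.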
The interface logic 48 is provided to facilitate execution of these commands and a memory 49 contains programs for use by the monitor transputer 15 and the interface logic 48. The exact nature of the program stored in the memory 49 depends on the particular arrangement of worker transputers and link or crossbar switching in a given module. However, the program will normally be arranged so that each module will respond in the desired manner to commands dispatched from other modules and other parts of the system, and will include unique load identity codes which can be transmitted along the monitor link 10 so as to enable the host to determine how and where modules are connected in the system. The program also enables the control sub-system to monitor error signals and the like from boards further down the line, i.e. away from the interface module 2.
Error and interrupt signals 50 (also see Figure 5) from the workers 13 are delivered to the interface logic which outputs an event signal in response to these signals along event line 27 to the monitor transputer 15. These interrupt and error signals can be individually or globally masked if so desired.
When the event input 49 to the monitor transputer 15 is set, thereby indicating an interrupt request from a worker transputer, the monitor 15 causes the interface logic 48 to write the identity, i.e. the address, of the interrupting worker to a pseudo link switch 50 along select line 51. The pseudo link switch 50 may be implemented by way of an INMOS IMSC004 programmable link switch similar to that used to provide the crossbar switch 14, but since the number of connections to the pseudo link switch 50 is relatively small it is advantageous in terms of cost to use programmable logic to realise the pseudo link switch 50. Where a large number of worker transputers are provided in a single module, more than one programmable link switch device (e.g. IMSC004) may be required to realise the crossbar switch 14 and under these circumstances the monitor link 52 used for configuring the crossbar switch 14 must itself be switched between the switch devices. To this end, a small link switch (not shown in Figure 6), similar to the pseudo link switch 50, can be provided for switching the monitor link 52 between the link switch devices which make up the crossbar switch 14.
When the address of an interrupting worker is input to the pseudo link switch 50, this causes the switch 50 to connect the pseudo link 46 of the interrupting worker to a link 53 of the monitor transputer 15. The interface logic 48 also sends a signal to cancel the interrupt request of the interrupting worker. Once the connection has been made between the pseudo link 46 and link 53 the monitor transputer 15 sends a message to the interrupting worker to indicate that the connection has been established. Communication between the interrupting worker and the monitor can then commence.
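The interrupt-service sequence in the preceding paragraphs might be summarised as below; the data structures standing in for the pseudo link switch 50 and the workers are hypothetical, and the real hardware performs these steps over the select line 51 and pseudo link 46.

```python
def service_worker_interrupt(pseudo_link_switch, workers, worker_id):
    """Connect the interrupting worker's pseudo link to the monitor's
    link 53, cancel its interrupt request, then acknowledge."""
    pseudo_link_switch["monitor_link_53"] = worker_id   # write via select line
    workers[worker_id]["interrupt"] = False             # cancel the request
    # monitor's message indicating the connection is established
    workers[worker_id]["pseudo_link_in"] = "connection established"
    return pseudo_link_switch["monitor_link_53"]
```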
The monitor transputer is also able to reset the crossbar switch 14 and in this way all worker links can be disconnected.
In addition to, or instead of, the processor modules 3, 4, 5, the system in Figure 1 may include special purpose modules such as the sub-system interface module 6 or the graphics module 7.
The sub-system interface module 6 provides a means by which the parallel processing system can be interfaced to other independent sub-systems such as the independent transputer sub-system 6a. The sub-system interface module 6 includes a crossbar switch 14 and monitor 15 which form part of the above discussed control sub-system. The crossbar switch 14 is arranged to enable links from the other transputer sub-systems 6a to be hardwired to it and therefore to enable transputers in the other sub-systems 6a to be dynamically allocated to a user on request under the control of the abovementioned control sub-system. A sub-system interface module 6 may be used to connect to existing QT systems for example. The crossbar switch, which it will be recalled may be an INMOS IMSC004 programmable link switch, provides a sufficient number of incoming and outgoing links to connect to up to eight independent QT systems.
The graphics module 7 provides a means by which, for example, user interactive or graphics applications programs may be executed in a parallel processing environment. More detailed views of two exemplary graphics modules 7a, 7b are shown in Figure 7 of the accompanying drawings. Turning now to Figure 7, it can be seen that the graphics module 7a comprises a processor node 60, i.e. a transputer, and a memory 61 of say 2 Mbytes in which applications programs, image data and the like are stored. The processor 60 and memory are connected by way of an internal bus 62 which also provides a path to an overlay store 63 and a framestore 64. An application program may be arranged, for example, to extract images from the memory and to transfer this data to the overlay store 63 and the framestore 64 for selective combination before being output from the framestore 64 via a D/A (digital to analog) converter 65 for display on a display monitor having a resolution of up to say 1024 x 768 pixels. Additional storage areas and controllers such as a colour palette 66, a cursor generator 67 and a zoom/pan controller 68 are provided for facilitating effects such as colouring of the image, moving a cursor about on the displayed image, and zooming and panning of the displayed image.
An alternative graphics module 7b is also shown in Figure 7. This graphics module 7b again comprises a processor 60, a memory 61, an internal bus 62, a framestore 64, a D/A converter 65 and a colour palette 66, each of which can be used to serve similar functions to those discussed in relation to the module 7a. This graphics module 7b however also includes a drawing processor 68 and a window manager 69 which may be of the type described in our British Patent Application Publication No. 2,202,115. The drawing processor 68 and the window manager 69 are controlled by the processor 60 and can be configured to generate windowing effects in the displayed image, that is to say multiple image portions from one or more images held in the memory 61 may be extracted for display in an overlapping or transparent manner as described in our abovementioned patent application. Each of the graphics modules 7a, 7b shown in Figure 7 also includes a monitor 15 which communicates with other monitors in other modules in the system, as already described hereinabove in relation to the processor module 3, and a crossbar switch represented in this Figure as a topology controller 14.
It will be appreciated by those possessed of the appropriate skills in the art that the exact operation of the functional units which make up the graphics modules 7a, 7b is dependent upon the specific application for which they are to be used, as defined by the application program. The above discussion of the graphics module, and indeed of the processor module 3 and sub-system interface module 6 is therefore merely exemplary in order to assist in understanding the invention and is not intended to be in any way limiting.
The system is arranged so that a user may request the use of a sub-array of transputers through the host computer. A suite of supervising programs is also run on the host computer. An example of the structure and interaction of this suite is shown in Figure 8 of the accompanying drawings. This software suite 70 fully controls the configuration and allocation of hardware resources together with all data flow to and from the control bus, i.e. the Q-bus. The supervising programs can accept messages both from processes on the host and from any node.
A typical multi-user software structure is shown in Figure 8 of the accompanying drawings. Turning now to Figure 8, it will be seen that the software in this example can accommodate up to three separate users, each using their own Digital Command Language (DCL) command line 70, 71, 72 on a microVAX computer. The microVAX computer hosts a number of files that include configuration tools 73a to 73c to enable the user to request transputer sub-arrays, run time servers 74a to 74c which are a standard feature of the microVAX adapted to allow transputer sub-arrays to access files within the microVAX, and user application files 75a to 75c, including I/O libraries 76a to 76c, which are also used in the allocation of networks etc. A system process 79 in the host computer supervises allocation of the hardware resources of the system 1. The system process 79 sends commands along the monitor link 10 requesting a report from each module in the system to discover what hardware is available in the system.
The report from each module includes information as to the nature of the module and the number and nature of each worker in that module. When the system is started up the process controlling the system will execute this procedure and the resulting information is stored for later use. Initially all workers in the system are deemed unused and available for allocation.
Each module is identified in terms of its type, i.e. a general purpose processing module or a dedicated module designed for a specific task for example, its location within the system, and/or its configuration.
With this information the host computer transmits further commands which allocate specific worker transputers to the requesting user and which arrange the allocated worker transputers in the required configuration. Once these tasks have been performed the sub-array will have been defined, creating a user domain in which the user can work, for example by running an application program, to the exclusion of all other users. Thus, the system process is arranged to keep an internal record of which workers of each module have been assigned to particular users. The system process also stores information on which intermodule connection links have been used in this respect, together with information about each module and the properties of the workers on that module.
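A minimal sketch of the bookkeeping described above, assuming a simple table keyed by module identity in which each worker slot holds either None (unassigned) or a user name; the data layout and function names are assumptions for illustration.

```python
def discover(modules):
    """Build the initial resource table from the modules' start-up
    reports: all workers are initially deemed unused."""
    return {m["id"]: {"type": m["type"],
                      "workers": [None] * m["workers"]}   # None = unassigned
            for m in modules}


def assign(table, module_id, worker_index, user):
    """Record that a worker has been allocated to a user, refusing to
    reassign a worker that is already in use."""
    slots = table[module_id]["workers"]
    if slots[worker_index] is not None:
        raise ValueError("worker already assigned")
    slots[worker_index] = user
```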
The configuration tools 73a to 73c are able both to interrogate the system process for the abovementioned information and to request the allocation of certain module workers and links to a given user. Hence, the system enables a utility program to be written using the configuration tools to assign and wire up a network of worker nodes according to the requirements specified by the user.
To allow multi-user allocation to proceed without interference and corruption, a pair of exclusive-use procedures are provided in the configuration tools to enable the user to prevent any other user from simultaneously attempting to obtain use of a network of worker nodes.
Thus, it will be appreciated that the control sub-system, which comprises amongst other things the crossbar switches 14 and monitors 15 in each module, provides a means for defining the topology of the array of processing nodes provided by the cascading processor modules 3 to 7 of Figure 1. That is to say, the control sub-system provides a topology controlling function which is responsible for controlling the configuration of the system to suit the needs of multiple users and ensuring complete electrical isolation between sub-arrays of transputers assigned to different users.
While a user may request an array of transputers it is up to the system software running on the host computer to allocate specific processing nodes and to implement their configuration by the appropriate setting of link switches. This new user domain must then be reset before the user may download applications software into it. Should a detectable run time error occur the host computer will be informed and the appropriate user given the option of analysing his array. All of these control functions are handled by the sub-system transputers under control of the host computer. The host computer must also ensure isolation between user domains. The system may include a number of sub-system allocating algorithms and is also arranged to enable users to implement their own allocating algorithms. In this way the user can select the best algorithm according to his needs. A typical algorithm is a "breadth first" algorithm which has five stages. In the first stage, when a user has requested the allocation of a sub-system of nodes, the algorithm checks that the required number of transputers, i.e. nodes, of the correct type are available. If there are not enough suitable transputers available then the allocation will fail and a corresponding message is sent to the user to indicate this. If there are sufficient nodes to meet the user's needs, the second stage of the algorithm is to assign nodes which are closest in terms of link distance to the host. Thus, nodes closest to the host in terms of link distance are allocated first to a user. If there are multiple users, and thus multiple user links, the shortest link distance can be calculated from the user link to the nodes. Once this has been done, the third stage of the algorithm is to create a list of sub-system nodes sorted into the order of the assigned link distances.
At this stage specific transputers are not necessarily allocated to particular nodes and therefore the fourth stage of the algorithm is concerned with allocating transputers to each of the nodes. Allocation starts at the node which is directly connected to the user link and transputers are allocated to each node by looking for a transputer of the correct type from the module closest to the interface module 2. Once transputers have been allocated to each node the fifth stage of the algorithm forms link connections between the allocated transputers. Transputers from different modules may be assigned to the same user and this requires transputer links to be established between the modules using the up/down links. It is possible that the algorithm may fail at this stage if all of the up/down links have already been allocated to other users. Tagged messages can be dispatched from the host along the sub-system pipeline. On receipt of a message a sub-system transputer inspects the tag to determine whether it should implement the attached command, or simply relay the message down to the next module in the system. Each sub-system transputer can send responses and status messages up to the host. A user transputer may interrupt the sub-system transputer and then communicate with the sub-system via a byte wide port that is mapped into its address space. This port is interfaced to a link on the sub-system transputer. Through the control sub-system the host computer is able to control and communicate with every user transputer in the system.
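The five-stage "breadth first" algorithm described above can be condensed into the following sketch, under simplifying assumptions: each free transputer is represented as a (module distance, type) pair, each requested node as a (link distance, required type) pair, and stage five (forming the link connections) is omitted. All names and the representation are illustrative, not taken from the patent.

```python
def allocate(request, free_transputers):
    """request: list of (link_distance, required_type) node descriptions.
    free_transputers: list of (module_distance, type) pairs.
    Returns a list of (link_distance, allocated_transputer) pairs,
    or None if the allocation fails at stage one."""
    # Stage 1: enough free transputers of each required type?
    for required in {t for _, t in request}:
        need = sum(1 for _, t in request if t == required)
        have = sum(1 for _, t in free_transputers if t == required)
        if have < need:
            return None            # allocation fails; user is informed
    # Stages 2-3: order the requested nodes by link distance from
    # the user link, nearest first.
    nodes = sorted(request)
    # Stage 4: allocate, starting at the node directly connected to the
    # user link, preferring transputers from the module closest to the
    # interface module.
    pool = sorted(free_transputers)
    allocation = []
    for distance, required in nodes:
        chosen = next(t for t in pool if t[1] == required)
        pool.remove(chosen)
        allocation.append((distance, chosen))
    # Stage 5 (forming link connections between the allocated
    # transputers over the up/down links) is omitted from this sketch.
    return allocation
```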
Every module is able to identify itself (type and/or configuration) to the host computer on request. This facility allows the host to interrogate the system to determine what resources are available for it to allocate to users.
The system 1 is controlled by a suite of operating system software and firmware programs which provide the multi-user, multi-task environment within the above described hardware. The operating system allows the user to partition the processing nodes and other hardware resources dynamically into independent and protected sub-systems but imposes no restrictions on the processing activities performed within each sub-system. Applications programs can therefore take advantage of the parallel processing environment provided by the system 1 by using the operating system facilities to communicate transparently with the host computer or even other sub-systems.
Each module 3 to 7, with the exception of the interface module 2, contains a number of processing nodes which can be connected directly to other nodes in the same or other modules by way of the crossbar switches 14 which provide reconfigurable point to point connections. Thus, a sub-system may be composed of nodes located on many different modules.
To allow communication between a user and the system process 79, VMS mailboxes 77, 78 are used.
User processes create their own mailboxes 77 and these mailboxes 77 receive replies to commands from the HEX system process 79. The HEX system process 79 has its own mailbox 78 to which all user requests must be sent. The system process 79 accepts requests from its mailbox in the order in which they are placed.
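The mailbox scheme just described can be sketched with a plain FIFO queue standing in for the VMS mailboxes; each user process supplies its own reply mailbox and the system process serves requests strictly in arrival order. The names and the reply format are illustrative assumptions.

```python
from collections import deque

system_mailbox = deque()                     # all user requests go here


def send_request(user_mailbox, command):
    """A user process posts a request, naming its own reply mailbox."""
    system_mailbox.append((user_mailbox, command))


def run_system_process():
    """Serve requests in the order in which they were placed,
    replying to each user's own mailbox."""
    while system_mailbox:
        reply_to, command = system_mailbox.popleft()
        reply_to.append(f"done: {command}")
```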
Thus, it will be appreciated that the software suite consists firstly of an object library containing routines for controlling and configuring the sub-arrays or processing domains, and passing integer and real data to and from those domains.
These routines can be called from any programming language. Layered on top of this library is a group of tools invoked from the host computer's command line, for example allowing a user to configure his system by typing commands when prompted. The library transparently communicates with a secure system process; this process is responsible for domain allocation and message-passing along the shared system link.
Users have the option of either running their complete application program on a selected domain, using a standard "server" running on the host to handle requests for file access, etc., or they can split an application into a host-resident section and a transputer-resident section. This latter mode of operation allows an application accelerated by transputers still to take advantage of any features available on the host, such as, in the case of a microVAX, DECnet, VAX-dependent file access or control of specific VAX hardware devices, while simultaneously using the number-crunching power of transputers to great effect. In this way, it is possible for an end-user of an application program to be unaware that transputers are involved at all.
The system 1 and associated software and firmware suite are designed to cater for a wide spectrum of users. Those users who simply wish to run an application program in a parallel processing environment can do so automatically and transparently.
At the other end of the spectrum, users who wish to have control over the system's configuration mechanism and the host's interface are also accommodated. In between these two extremes lies the level at which most user programmers are likely to operate. Such users can generate "configuration files" in the course of creating a parallel program and the system 1 will then allocate the appropriate number of processing nodes and wire them into the required topology while automatically optimising the use of these hardware resources when the program is run.
To summarise, the embodiment according to the invention provides a range of transputer based processing modules which connect together in expandable manner to a host computer to allow multiple users access to individual sub-arrays (domains) of transputers. The system design can be split into two parts: user and sub-system. The user part comprises one or more T222 transputers, each with local memory and, in the case of special purpose modules, additional logic to perform the required function. All user transputer links and I/O links, i.e. "up" links and "down" links, are taken to a link switch in the module.
This allows full configuration of the user transputers, flexible connection of user transputer links to I/O links and flexible connection of I/O links to other I/O links. The sub-system comprises a T222 transputer and additional logic which enables the transputer to control the reset and analyse inputs, and to read the error output, of all user transputers in a module. The sub-system transputer can communicate with all user transputers in the module and also configure the link switch. The host computer can detect errors in the sub-system pipeline and can reset and/or analyse the sub-system if required by using global sub-system control signals.
Modules are designed to be cascaded, with both the user links and the sub-system links being daisy-chained between boards on twisted pair cabling. The I/O and sub-system links are divided into 'up' links and 'down' links, with the down links from one module connecting to the up links on the next. This strategy connects all sub-system transputers in a pipeline, with system control generated by the host computer, which connects to the up sub-system link on the first module in the pipeline. Similarly, user links from the host computer connect to up I/O links on the first module in the pipeline. Down links of the last board in a system can be left unconnected.

Claims (13)

CLAIMS:
1. A parallel processing system for use in cooperation with another computer to enhance the operation of that other computer, in which parallel processing system an interface means is provided for interfacing the other computer to a plurality of interface links, the interface means being arranged such that a single interface link is assignable for use by a user and at least one processing node from an array of such nodes is connectable under the control of a controlling sub-system to the said interface link to provide a parallel processing domain for exclusive use by that user.
2. A system according to claim 1, wherein the interface means includes means for connecting data lines from the other computer to said interface links for data transfer therebetween.
3. A system according to claim 1 or 2, wherein the controlling sub-system is arranged to connect a user defined plurality of processing nodes to the interface link assigned to that user.
4. A system according to claim 1 or 2 or 3, wherein the array of processing nodes comprises a plurality of processing modules, each module comprising a number of processing nodes and being adapted to be cascadably connected to other such modules.
5. A system according to claim 4 wherein each processing module further comprises controlling means associated with the controlling sub-system for controlling the connecting of processing nodes in the module to a user link.
6. A system according to any preceding claim, wherein the interface means is adapted to interface the said other computer to a control link associated with the controlling sub-system such that the system can be controlled by the other computer via the controlling sub-system.
7. A parallel processing system according to any preceding claim, wherein the or each processing node comprises a transputer device.
8. A parallel processing system according to any preceding claim, wherein the controlling sub-system comprises at least one controlling transputer device.
9. A parallel processing system according to claim 5, wherein said controlling means comprises a controlling transputer.
10. A processing system for use with a host computer, which processing system comprises an interface module for connecting data lines from the host computer to links within the system, at least one processing module defining a plurality of processing nodes connectable to each other and to said links, and a controlling sub-system comprising a controlling link associated with the interface module for connection to the host computer and a controlling node in the said at least one processing module which controlling node is adapted to control the connection of processing nodes to each other and to said links.
11. A processing system according to claim 10, further comprising other processing modules adapted for cascadable connection by way of connecting and controlling links associated respectively with the processing and control nodes in each module.
12. A parallel processing system that provides for dynamic allocation and wiring of arrays of processors in a multi user environment.
13. A system substantially as herein described with reference to any of the accompanying drawings.
GB8921359A 1989-09-21 1989-09-21 Computer systems Withdrawn GB2238142A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB8921359A GB2238142A (en) 1989-09-21 1989-09-21 Computer systems


Publications (2)

Publication Number Publication Date
GB8921359D0 GB8921359D0 (en) 1989-11-08
GB2238142A true GB2238142A (en) 1991-05-22

Family

ID=10663421

Family Applications (1)

Application Number Title Priority Date Filing Date
GB8921359A Withdrawn GB2238142A (en) 1989-09-21 1989-09-21 Computer systems

Country Status (1)

Country Link
GB (1) GB2238142A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0735508A1 (en) * 1995-03-29 1996-10-02 Sony United Kingdom Limited Data processing apparatus

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB1233714A (en) * 1967-12-20 1971-05-26
GB1271928A (en) * 1969-06-10 1972-04-26 Ibm Data processing system





Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)