WO1984004190A1 - Multiprocessor computer system - Google Patents

Multiprocessor computer system

Info

Publication number
WO1984004190A1
Authority
WO
WIPO (PCT)
Prior art keywords
computer
processor
file
message
computer architecture
Prior art date
Application number
PCT/US1984/000557
Other languages
English (en)
Inventor
Richard Lowenthal
Jonathan Huie
Milan Momirov
Ben Wegbreit
David Cline
John P Burger
Original Assignee
Convergent Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Convergent Technologies Inc
Priority to AU28608/84A
Publication of WO1984004190A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163 Interprocessor communication
    • G06F15/17 Interprocessor communication using an input/output type connection, e.g. channel, I/O port

Definitions

  • the present invention relates to computer architectures. More particularly, the present invention relates to a computer architecture including a plurality of parallel asynchronous independent computers interconnected by a transparent parallel bus to form a computer network.
  • the computer system is required to grow in any or all of three directions by adding additional: 1) terminals or communications ports;
  • the first system bottleneck is that of terminal I/O. Terminals interrupting a CPU a character at a time drastically slow down prior art shared logic computer systems. A partial remedy for this situation involves dedicating a front end processor to off-load the communications overhead from the main CPU, such as in the IBM 3705 front end processor manufactured by International Business Machines of Armonk, New York.
  • the second system bottleneck involves file I/O. File access demands that the CPU spend some of its time handling the disc and file system, rather than executing main line code. A partial remedy for this situation was provided by dedicating back end processors to off-load the file processing overhead from the main CPU.
  • the third system bottleneck results from the fact that existing large scale minicomputers and main frame computers are limited by the fixed amount of processing power inherent in the single CPU which executes the applications code. The amount of applications processing available to the system user is thereby limited.
  • the Digital Equipment Corporation VAX-11/782 manufactured by Digital Equipment Corporation of Maynard, Massachusetts, is an example of a system that can add an additional processor. It should be noted, however, that on most mainframes, users are unable to add multiple processors or field upgrade their existing processor to get increased processing power. Thus a user requiring more computing power must purchase a new system.
  • the mentioned bottlenecks in mainframe performance have been caused by the mainframe's dependence on the traditional shared logic architecture which has dominated computer system design for the last thirty years.
  • the present invention addresses the three single processor bottlenecks - file, communications, and application processing - by providing multiple concurrent processors.
  • the present invention multi-computer computer architecture provides a series of independent parallel processors of which any number may be added to structure the system as desired.
  • Each of the file, terminal, application, and communications processors runs its own operating system, and they all execute in parallel.
  • multiple applications, file, terminal, and communications processors may also be added to meet the additional computing requirements.
  • the system resources gracefully grow to meet user requirements.
  • the present invention is a system of multiple processors tied together on a high speed asynchronous bus.
  • Each of the processors on the bus consists of a CPU and memory; the processors can also include I/O interfaces.
  • the bus in the present invention may be extended across multiple enclosures, each with a multi-slot backplane. Each enclosure supports integral mass storage.
  • the present invention uniquely provides a file system which executes on parallel processors concurrently with applications execution by adding multiple file processors as system needs grow.
  • Terminal handling executes on parallel processors concurrently with application execution and multiple terminal processors may also be added as system needs grow.
  • the total available computing power provided by the present invention can grow by adding multiple applications processors, each of which may run a distributed version of the UNIX operating system, developed and licensed by Bell Laboratories of Murray Hill, New Jersey.
  • the present invention can support a mix of dumb terminals, intelligent terminals and work stations, and can thereby allow system users to tailor a system to their needs.
  • the hardware and software architecture provided by the present invention is modular and includes selected system entry points and multiple upgrade paths as system needs grow.
  • an eight-user minicomputer configuration may be upgraded to a 128-user mainframe configuration without software modification.
  • Traditional computer architectures use a single, synchronized operating system to control the overall operation of the computer and to perform such tasks as assigning places in memory to programs and data, processing interrupts, scheduling jobs, and controlling the overall input/output of the computer.
  • the present invention combines a message passing operating system with the UNIX operating system to maximize reliability and software compatibility.
  • two or more operating systems can run concurrently in a manner completely transparent to the application or the system user. Accordingly, each processor is supplied with its own independent and unique operating system.
  • the architecture includes virtual memory hardware on each application processor to provide a demand paged virtual memory system.
  • the memory management hardware provides a high speed two-level paging scheme.
  • the present invention provides a departure for operating systems, such as UNIX, which have previously been executed as a monolithic program on a single processor.
  • multiple processors are dedicated for each function.
  • the file processor, a back-end data base processor, runs the file system under the message-based operating system. Up to twelve file processors may be run in parallel in any system.
  • the terminal processor, a front-end processor, runs all communications protocols and terminal handling under the message-based operating system. Up to sixteen terminal processors can be run in parallel on any system.
  • the applications processor is capable of being replicated up to sixteen times in the present invention. Each applications processor runs its own copy of the UNIX® kernel.
  • the UNIX operating system is distributed across multiple processors in the following manner:
  • the file system code runs on one or more file processors each of which also runs the message-based operating system;
  • processors and their applications communicate over the system bus.
  • Each parallel processor within the present invention communicates via short messages which allow the processors to DMA directly into each other's memory.
  • the only connection the UNIX kernel has with the other operating systems in the system is through the inter-computer communications (ICC) module.
  • the ICC provides request and response blocks to the kernel and all processors.
  • the software running on the applications processor communicates with the file system, paging, and terminals via the ICC module.
  • the system bus is a high speed asynchronous backplane interconnect. It provides the throughput necessary to insure that all of the processors in a system can communicate and process in parallel.
  • the system bus provides a 32-bit wide data path and has a maximum transfer rate of 11 Mbytes/second.
  • the system bus is a central feature in the present architecture in which hardware and software functionality are bundled in subsystems rather than specific devices. Device dependent information is not transferred across the system bus but logical concepts are. Thus, the system bus allows the processors in the system to communicate without using the interrupt structure necessary in conventional unit processor systems.
  • the hardware provides:
  • the inter-CPU bus traffic consists of the request and response blocks, to and from all processor boards, and DMA transfers to and from the discs.
  • the present computer architecture invention consists of three main processing elements or computers: the file processor, the applications processor, and cluster processor. Various embodiments of the invention can also include a disc processor and terminal processor.
  • the applications processor contains a 10-MHz Motorola 68010 CPU, memory management hardware to support a two-level paging scheme, and 512 Kbytes - 4 Mbytes of dual-ported error-correcting RAM.
  • the applications processor is dedicated to run both the
  • the memory management hardware provides a high speed (no wait states), two-level paging scheme with 8 Mbytes of virtual address space per applications processor. Each page is 4 Kbytes and has associated status. The page is either not present, present but not accessed, accessed but not written, or written.
  • a segment map provides protection for up to 64 system segments. Each segment can be an execute-only, read-only, writable, or user segment.
  • the system handles multiple processor addressing via an extended address.
  • When addressing memory "off board", the processors issue a 5-byte address (CPU number, address). The appropriate CPU recognizes its CPU number and uses the address as an input to its map.
  • the file processor contains an 8-MHz INTEL 80186 processor with 256 Kbytes - 768 Kbytes of triple-ported RAM including full error correction and an LSI Winchester controller.
  • the high speed bus and memory allow the system to provide DMA access to and from other processor boards.
  • the file processor runs the UNIX file system concurrently with application execution on the applications processor. Additionally, the file processor runs file-oriented data management tools.
  • the file processor can support up to three 5¼-inch Winchester disc drives of 50 Mbytes or more and a removable 5 Mbyte cartridge. Four additional drives may be provided by expansion units.
  • additional file processors, each with its own parallel file system and up to four additional discs, may be added, off-loading disc and file system overhead.
  • Each file processor is responsible only for the discs and files that reside on the particular processor. All file processors are "known" to the first file processor installed within the system, which is designated as the master file processor.
  • the master file processor redirects file requests from one file processor to another.
  • the cluster processor in the present invention contains an 8-MHz INTEL 80186 processor having 256 Kbytes - 768 Kbytes of triple-ported RAM with full error correction.
  • the cluster processor controls two cluster RS-422 ports and can run terminals at speeds from 300 Kbits/second up to 500 Kbits/second; work stations may be run at 1.8 Mbits/second. All RS-422 lines have DMA to provide high throughput.
  • the cluster processor supports three RS-232 ports; two RS-232 port lines are synchronous or asynchronous, while the third line is a serial printer interface (asynchronous-only in the present embodiment of the invention).
  • the terminal processor contains an 8-MHz
  • the terminal processor contains ten RS-232 ports, four of which support synchronous or asynchronous operation, while six ports support asynchronous operation only. Each RS-232 line can operate at up to 19.2 Kbaud.
  • the terminal processor includes an operating system kernel and provides a virtual terminal interface for dumb terminals. Additionally, the terminal processor can run communications oriented products, such as modems.
  • the terminal processor has access to a table kept on disc that describes the default characteristics of the devices attached to each port. After the system is reset, the terminal processor monitors each terminal for activity, and if active state data are received, it requests service from one of the applications processors. Upon power-up, a list of file and applications processors is made known to the terminal processor. If the system contains applications processors, the terminal processor dispenses initial requests in a round-robin fashion. When additional terminals log on to a system that already has terminals assigned to all applications processors, the terminal processor assigns the latest requests to the least loaded applications processor.
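The assignment policy described above (round-robin for initial requests, then least loaded) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `TerminalDispatcher`, the processor labels, and the use of a per-processor terminal count as the measure of load are all assumptions.

```python
class TerminalDispatcher:
    """Hypothetical model of how a terminal processor might pick an applications processor."""

    def __init__(self, app_processors):
        # app_processors: illustrative labels for the known applications processors.
        self.order = list(app_processors)
        self.load = {ap: 0 for ap in self.order}  # assumed load metric: terminals assigned
        self.next_index = 0

    def assign(self):
        if self.next_index < len(self.order):
            # Initial requests are dispensed in round-robin order.
            chosen = self.order[self.next_index]
            self.next_index += 1
        else:
            # Every processor already has a terminal: choose the least loaded one.
            chosen = min(self.order, key=lambda ap: self.load[ap])
        self.load[chosen] += 1
        return chosen
```

A real system would presumably use a richer load figure than a simple terminal count; the source does not say how load is measured.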
  • a storage processor which contains an 8-MHz INTEL 80186 processor having up to 768 Kbytes of triple-ported RAM and including full error correction. Additionally, the storage processor provides a tape interface for ⁇ _.-inch tape drive units. The storage processor also provides memory, DMA, and compute power for the disc controller. The disc controller contains a microcontroller circuit and controls up to six 600 Mbyte disc drives. The controller interfaces to the system bus via the storage processor.
  • Fig. 1 is a block diagram of the multi-computer computer architecture basic components
  • Fig. 2 is a block diagram showing individual computer structure and message passing across the system bus within the multi-computer computer architecture;
  • Fig. 3 is a block diagram of a file processor
  • Fig. 4 is a block diagram of operating system intercommunications via the inter-CPU communication module
  • Fig. 5 is a block diagram of operating system file structure
  • Fig. 6 is a block diagram of interprocess communication
  • Fig. 7 is a block diagram of an inter-CPU request
  • Fig. 8 is a block diagram of remote DMA initiation by a file processor
  • Fig. 9 is a block diagram of file system request routing
  • Fig. 10 is a schematic diagram of an exemplary computer microprocessor circuit
  • Fig. 11 is a schematic diagram of an exemplary computer memory circuit
  • Fig. 12 is a schematic diagram of an exemplary computer system bus interface circuit
  • Fig. 13 is a schematic diagram of an exemplary doorbell interrupt PAL circuit.
  • FIG. 1 A significant feature of the present invention is that it dedicates several types of single board computers to specific functions.
  • the computer boards provide high throughput in four functional areas by providing the following specialized computers: 1) Applications processor 11; 2) File processor 12; 3) Cluster processor 13; and
  • the self-contained computers are interconnected via a high-speed, 11 Mbytes/second, 32 bit asynchronous system bus 10.
  • the bus is used primarily for inter-computer communications. Although bus speed and width have been stated for the exemplary embodiment of the invention, other such speeds and system bus widths may be provided in different embodiments of the invention.
  • the present invention may be expanded by the inclusion of an expansion enclosure (Fig. 2) which is added to form a powerful collection of computers linked together on bus 10.
  • the exemplary embodiment of the invention provides a bus that may span up to six system enclosures 15/16. If one of the four above defined functional areas requires extra processing, more processors of that type may be added to an enclosure and/or additional enclosures may be added.
  • Each applications processor is a 68010 based computer with 512 K - 4 Mbytes of error correcting memory.
  • the 68010 is manufactured by Motorola of Phoenix, Arizona and supports true virtual memory with instruction restart capability, accessing virtual memory through a two-level segment/page map.
  • the first level of the segment/page map consists of up to 16 contexts in use at once, where a context is a region in memory in which a process runs.
  • a context may contain up to 64 segments of 64 Kbytes each, providing up to 4 Mbytes of virtual space to each processor.
  • within each segment there are 16 pages of 4 Kbytes each. Segments are protected against unauthorized access, and both segments and pages are protected against accessing non-present entries.
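The sizes quoted for the segment/page map can be checked with a little arithmetic: 16 pages of 4 Kbytes make a 64 Kbyte segment, and 64 such segments make 4 Mbytes per context. The sketch below also splits an address into segment, page, and offset; treating the three fields as contiguous within the address is an assumption, and the function name is illustrative.

```python
PAGE_SIZE = 4 * 1024                                 # 4 Kbytes per page
PAGES_PER_SEGMENT = 16
SEGMENTS_PER_CONTEXT = 64
SEGMENT_SIZE = PAGE_SIZE * PAGES_PER_SEGMENT         # 64 Kbytes
CONTEXT_SIZE = SEGMENT_SIZE * SEGMENTS_PER_CONTEXT   # 4 Mbytes


def split_virtual_address(addr):
    """Return (segment, page, offset) for an address inside one context.

    Assumed layout: segment number in the high bits, then page, then offset.
    """
    if not 0 <= addr < CONTEXT_SIZE:
        raise ValueError("address outside the 4 Mbyte context")
    segment, within_segment = divmod(addr, SEGMENT_SIZE)
    page, offset = divmod(within_segment, PAGE_SIZE)
    return segment, page, offset
```

For example, under this assumed layout address `0x12345` falls in segment 1, page 2, at offset `0x345`.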
  • the file processor and storage processors are dedicated to controlling secondary storage devices 21. These devices include high capacity interface Winchester disc drives that can be in the 5 inch format, large capacity SMD interface disc drives, and -inch streaming tape drives.
  • the file processor may be provided in two embodiments - the first embodiment for Winchester type discs and the second embodiment for SMD discs and i_, inch tape units.
  • the first embodiment of the file processor computer uses an 80186 microprocessor, manufactured by Intel of Sunnyvale, California.
  • the first embodiment includes 256 K of random access memory and controls the Winchester 5 discs in memory 17.
  • Disc control is provided by disc controller circuit 19 which performs all formatting, sector ID scans with sector interleaving, CRC calculations, data encoding, multiple sector reads and writes, and implied seek operations.
  • the second embodiment of the file processor computer controls SMD discs and streaming tape drives.
  • This embodiment of the file processor includes a disc controller 20, disc drives 21, memory 17, and an Intel 80186 microprocessor 18.
  • the file processor is coupled to the high speed bus via one connection only. A local (private) interconnection is used for command and data transfer.
  • the file processor and associated software provide back-end support for secondary storage I/O.
  • the file processor computer is a critical element in the present multi-computer computer system. Therefore, high bandwidth between the secondary storage devices and applications is provided. To accomplish this, the file processor performs DMA transfers directly between a disc and any remote processor's memory (Fig. 2).
  • the file processor executes a DMA transfer across the high speed bus to or from another processor's memory using only a small buffer in the controlling file processor's main memory.
  • Each file processor may control up to four disc drives (in the first embodiment of the file processor; or up to six SMD devices and four tape drives in the second embodiment of the file processor). When more devices are needed, more file processors can be added.
  • the addition of autonomous file processors does not affect the software appearance of the secondary storage because software communication modules assure that all file processors act in concert.
  • the hardware is configured to provide further assistance in the interaction of various file processors by designating one file processor the master. Such designation insures a reliable bootstrap procedure, allows a central CPU to coordinate certain types of operations, and permits a unique connection to control the system via a system control panel.
  • the file processors do not operate in a master-slave relationship. Rather, each can initiate transfers over the bus.
  • the master file processor serves only as a single point of coordination for events which have system-wide implications.
  • each computer includes a microprocessor, a memory store, and (with the exception of the applications processor) I/O circuitry.
  • the cluster processor computer also contains an Intel 80186 with 256 K of random access memory.
  • the major function of the cluster processor is to provide back-end support for work stations via two RS422 multi-drop lines running at 307 Kbits/second to 1.8 Mbits/second in this embodiment of the invention.
  • the cluster processor also includes three RS232 ports, two of which are intelligent HDLC USARTs, and a parallel interface for supporting a line printer.
  • the terminal processor computer contains an Intel 80186 microprocessor and 256 K of random access memory. To support asynchronous terminals, the terminal processor contains ten RS232 ports, four of which are intelligent HDLC USARTs. The terminal processor also provides a parallel printer interface. The terminal processor serves as a front-end processor dedicated to support up to ten RS232 compatible user terminals.
  • the present invention can provide multiple copies of the same operating system, running on
  • CTOS on all but the Applications Processor
  • CTOS is a message based operating system with a real time kernel.
  • each computer is endowed with only the necessary functions to enable it to perform its unique specialized function.
  • the terminal processor has no file management capability, but has extensive terminal handling capabilities.
  • the file processor, however, has no terminal handling capability, but does have a sophisticated file management capability. Much of the operating system effort is therefore generalized for all 80186 based computers.
  • each separate computer within the present invention architecture runs its own specialized operating system.
  • ICC inter-CPU communications
  • the protocol allows the passing of discrete messages across the system bus. Each message is either a request for a task to be performed or a response saying whether or not the task was performed and, if not, why. Every processor can access the memory of any other processor, permitting the message based transfers of information.
  • CTOS message based operating system
  • This layer, the ICC, allows the UNIX® operating system to communicate with all other processors in a system (Fig. 4). To improve performance and allow sharing of file resources, the UNIX file system is off-loaded and included as a server process in the present invention's operating system on the file processor computer.
  • the UNIX system is properly interfaced with the operating system of the present invention by converting the file system into a message based server process (Fig. 4).
  • the file system (Fig. 4) is now single-threaded in a multi-processor environment. That is, the file system operating system executes any given task from beginning to end without interruption.
  • a parallel server process which uses the same data structures as the regular file system, but does not modify them, is added. Any operation that modifies the data structure is sent to the single-threaded file system server for completion.
  • This arrangement allows high volume traffic, such as normal reads and writes, to flow quickly around the single-threaded bottleneck. This approach also eliminates concurrency problems that arise in a multiple threaded operation.
  • each file system in UNIX® is managed as if it were a continuous piece of raw disc (Fig. 5).
  • each UNIX ® file system is a separate operating system file in the present invention.
  • the files are stored such that the file processor CPU can read/write as a raw device to understand the control structures within the file.
  • the UNIX ® version incorporated within the present invention allows as many simultaneous reads and writes as desired.
  • the UNIX ® system is buffered properly.
  • This situation is known as a controlled access situation and in the present invention causes the file processor to pick a file master for that file. From that point on, or until the file is closed by the last processor, all operations concerning this file are funnelled through the file master to the file processor.
  • the UNIX ® system is also configured to consider all of its processes to be running on the same processor, which implies that all information concerning the processes is in one central place.
  • the problem of process location is solved by assigning unique process identification numbers and arranging the utilities so that they use the information to look the same as for a standard UNIX system.
  • Running multiple operating systems in the same computer architecture also requires a method of sharing service resources and data.
  • To let the UNIX® system share resources with the operating system of the present invention it is necessary to enhance the message handler running in the UNIX® kernel.
  • the message passing operating system present in the present invention is available with full capability, including direct sending of messages to another processor, which messages may be used on the UNIX® system.
  • the message structure used by the present invention (and in the adapted version of UNIX®) is a request/response algorithm. For each request sent to a process, the process must respond in some way (even by dying). To reach a process, a request or a response must pass through an exchange - a place where messages wait to be received and where processes wait for messages to arrive.
  • a request code is a number which specifies the basic format of a request and location of the service exchange.
  • a service exchange is an exchange on which a request is queued for service by some process.
  • the service process de-queues each message and services its request.
  • the service process calls a subroutine to respond to the request.
  • Embedded in the request is the exchange to which the response is queued. Because request blocks are self-describing, they can be checked for validity regardless of the actual requests.
  • the request/response function is illustrated in Fig. 6.
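The request/response pattern described above can be modelled in a few lines. This is a sketch under stated assumptions: the names `Request`, `Kernel`, `serve_one`, and `wait`, the field layout, and the use of plain integers for request codes and exchanges are illustrative and do not come from the source.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    request_code: int             # selects the request format and the service exchange
    response_exchange: int        # embedded in the request: where the response is queued
    payload: bytes = b""
    status: Optional[int] = None  # filled in by the service process when it responds


class Kernel:
    """Hypothetical single-processor model of exchanges and request routing."""

    def __init__(self, service_exchange_for_code):
        self.service_exchange_for_code = dict(service_exchange_for_code)
        self.exchanges = {}  # exchange number -> queue of waiting messages

    def _queue(self, exchange, message):
        self.exchanges.setdefault(exchange, deque()).append(message)

    def request(self, req):
        # The request code determines the service exchange the request waits on.
        self._queue(self.service_exchange_for_code[req.request_code], req)

    def serve_one(self, service_exchange, handler):
        # The service process de-queues one request, services it, and responds.
        req = self.exchanges[service_exchange].popleft()
        req.status = handler(req)
        self._queue(req.response_exchange, req)

    def wait(self, exchange):
        # The client collects its response from the exchange named in the request.
        return self.exchanges[exchange].popleft()
```

Because the response exchange travels inside the request, the service process needs no separate knowledge of its clients.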
  • the processor determines whether the message is a request or a response. The message is then copied from the client processor to an area in the server processor. If the message is a response, the processor finds out which exchange the response is to be queued on. If there is a process waiting for something to be queued on that exchange, the process is woken up to process the response. If the message is a request, the processor finds out if there is a functional unit that services such requests on this particular processor. If not, an error code is set within the request and a response is initiated. If there is a server for the request, the message is queued to the proper service exchange; if the server is waiting for a request, it is woken up.
  • the processor repeats the same sequence as above with a response.
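The arrival handling described above can be sketched as a single dispatch routine. The message representation (a dictionary), the error code value, and the names `Processor`, `deliver`, `outgoing`, and `woken` are assumptions made for illustration; the source does not give the ICC's data layout.

```python
from collections import deque

ERC_NO_SERVER = 33  # hypothetical error code; the real value is not given in the source


class Processor:
    """Hypothetical model of what a processor does when an ICC message arrives."""

    def __init__(self, services):
        self.services = dict(services)  # request code -> local service exchange
        self.exchanges = {}             # exchange number -> queued messages
        self.outgoing = []              # responses initiated back toward the sender
        self.woken = []                 # exchanges whose waiting process was woken

    def deliver(self, message):
        msg = dict(message)  # copy the message into this processor's memory
        if msg["kind"] == "response":
            exchange = msg["response_exchange"]
        else:
            exchange = self.services.get(msg["request_code"])
            if exchange is None:
                # No server here: set an error code and turn the request around.
                msg["kind"] = "response"
                msg["error"] = ERC_NO_SERVER
                self.outgoing.append(msg)
                return
        self.exchanges.setdefault(exchange, deque()).append(msg)
        self.woken.append(exchange)  # wake whoever waits on that exchange
```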
  • the particular design of the computer architecture allows the software to be functionally partitioned. Most significant in the software structure is the ICC module (included as a Microfiche Appendix with this document).
  • the ICC module is discussed in more detail below.
  • a cluster processor needs only a file processor to complete the architecture.
  • the cluster processor controls communication on the high speed cluster lines, on which both intelligent work stations and terminals can be placed.
  • the cluster processor polls each work station and terminal connected to the cluster line; it also allows multiple printers to be operated at the same time.
  • the terminal processor is a sophisticated RS232 communication module designed to handle up to ten RS232 lines at speeds of up to 19,200 baud. To run the lines at this speed, it is necessary to poll for input characters every 500 microseconds. A polling loop is included which consumes 18% of the total processor bandwidth in this embodiment of the invention. The rest of the processor's bandwidth is used to run communication utilities.
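The polling requirement can be sanity-checked against the time one character occupies the line at 19,200 baud. The figure of 10 bits per character (one start bit, eight data bits, one stop bit) is an assumption; the source does not state the framing.

```python
BAUD = 19_200
BITS_PER_CHARACTER = 10  # assumption: 1 start + 8 data + 1 stop bit


def character_time_us(baud=BAUD, bits=BITS_PER_CHARACTER):
    """Time in microseconds for one character to arrive on a serial line."""
    return bits / baud * 1_000_000


print(round(character_time_us(), 1))  # prints 520.8
```

Under that assumption a new character can arrive roughly every 521 microseconds on each line, so a receiver that holds only one character at a time has to be polled slightly more often than that.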
  • the application processor runs a UNIX® kernel operating system. In the present embodiment of the invention, the UNIX® kernel has been converted from a swapping system to a virtual memory system. In the present architecture there can be many processors running UNIX® concurrently. Lacking the described adaptations, locating a single process is difficult.
  • Assigning a unique process identification makes a processor data command look as it would on any standard UNIX® system. The command is modified to clearly process information for processors across the bus. This information is then processed to be displayed in the same manner as in a standard UNIX® process command.
  • Each processing module in the present computer architecture invention executes only from its own local memory, but has the ability to read and write the local memory of all other processors. Additionally, each computer module has the ability to interrupt other processors and to perform indivisible read-modify-write access to remote memory. Although all of the memory is sharable, only the ICC module makes remote references. In this way, bus bandwidth is preserved for operations that cannot tolerate latency, such as disc DMA.
  • the ICC module is message based and performs all of the transport, routing, and presentation functions transparently to the user. Messages in this system are entirely self-describing. It is not necessary for the ICC module to understand anything of the content of the message to be able to write, transport, or present it.
  • a client process makes a request of a service process and receives a response. This high level view is exactly what occurs if the client and the server are actually located on the same processor. This is the user's model of the transaction and, in fact, the user is not generally aware that the request is actually being serviced at a remote location.
  • location transparency is accomplished by the client agent on the requester's processor and the service agent on the remote processor.
  • the term "agent" is used in a descriptive sense.
  • some of the agent functions are implemented at different levels in the kernel.
  • One function of the ICC module is to avoid burdening service processes with the details of the addressing structure of the client processor, as in dissimilar operating systems.
  • a presentation function is provided to reconcile the internal architectures of the 68010 and 80186 processors resident in the various computers.
  • the 80186 stores words in low byte/high byte order, while the 68010 stores words in high byte/low byte order.
  • Addresses on the 80186 consist of a 16 bit segment address together with a 16 bit offset; effective address calculation consists of shifting the segment address up 4 bits and adding the offset to achieve a 20 bit memory address.
  • Addresses on the 68010 are 32 bit quantities wherein a 24 bit address is presented to the memory bus.
  • the presentation function maps the remote data into local format, completely isolating the service processes from these variations between the microprocessor circuits.
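The two differences the presentation function has to reconcile, word byte order and address formation, can be illustrated as follows. The function names are hypothetical, and the code is a sketch of the arithmetic described above rather than the ICC module's own routines.

```python
import struct


def physical_address_80186(segment, offset):
    """Shift the 16 bit segment up 4 bits and add the 16 bit offset (20 bit result)."""
    return ((segment << 4) + offset) & 0xFFFFF  # keep only the 20 address bits


def word_80186_to_68010(word_bytes):
    """Convert a 16 bit word from low byte/high byte order to high byte/low byte order."""
    (value,) = struct.unpack("<H", word_bytes)  # read as stored by the 80186
    return struct.pack(">H", value)             # write as expected by the 68010
```

For instance, segment `0x1234` with offset `0x0010` yields the 20 bit address `0x12350`, and the word stored as bytes `34 12` on the 80186 appears as `12 34` on the 68010.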
  • various microprocessor-based operating systems transparently intercommunicate one with the other. This enables each computer in the architecture to have a diverse operating system adapted to the computer's specialized function.
  • Each processor (computer) in the present computer architecture is identified by an 8 bit hardware assigned "slot number".
  • the pair can address any byte on any computer in the computer architecture.
  • the processors address remote memory using special hardware registers to establish a mapping of a portion of remote address space into local address space. Normal memory reference instructions are used to manipulate remote data.
  • an address in the present architecture is a 40 bit quantity: 8 bits of slot number and 32 bits of address.
  • Such addresses are called full bus addresses (FBAs).
  • the address portion of an FBA is stored in 68010 format (high order byte at lowest memory address).
  • the present format accommodates the fact that the 80186 format cannot address more than 1 Mbyte, while an application processor can have considerably more than 1 Mbyte of memory.
  • the FBAs allow the lowest layers of the ICC module to be unaware of the remote processor type.
  • the FBAs also simplify the initialization and configuration of the system.
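A full bus address can be sketched as a five byte value. The text specifies 8 bits of slot number and a 32 bit address stored high order byte first; placing the slot byte ahead of the address, and the names `pack_fba` and `unpack_fba`, are assumptions made for illustration only.

```python
import struct

# ">BI" = one unsigned byte followed by a big-endian unsigned 32 bit integer,
# with no padding, for a total of 5 bytes (40 bits).
FBA_FORMAT = ">BI"

def pack_fba(slot, address):
    """Build a 40 bit full bus address (assumed layout: slot, then address)."""
    return struct.pack(FBA_FORMAT, slot, address)

def unpack_fba(fba):
    """Split a 5 byte full bus address back into (slot, address)."""
    slot, address = struct.unpack(FBA_FORMAT, fba)
    return slot, address
```

Because the address part is always in 68010 order, a receiver under this layout needs no knowledge of the sender's processor type to decode it.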
  • each processor executes the local read only memory (ROM) program.
  • the ROM program performs a self-test diagnostic, initializes its CPU description table (CDT), arms the interrupt system, and waits to be awakened. All of the processors in the system perform this sequence with the exception of the master file processor.
  • the master file processor is hardware designated by slot location at a control panel portion (not shown) of the system.
  • the master file processor bootstraps a system image from disc, rather than waiting to be awakened. It then probes the CDT of the potential processors in the system. There is a pointer to the CDT at the same location in every processor within the architecture, although the CDT itself may be located anywhere in memory.
  • the CDT contains a three byte signature field which is used to distinguish between a valid CDT and random memory contents.
  • One of the fields which is initialized by each processor is the CPU type field. CPU type information allows the master file processor to build a map of slot numbers to processor types. Initialization failures encountered during these self-tests result in an entry in the CDT.
  • the master file processor otherwise reads a configuration file, downloads the appropriate operating system image into the remote processors, initializes certain information in the remote CDTs, and awakens remote processors with a doorbell (ICC) interrupt.
  • the remote processor then performs its initialization and marks the CDT as ready for operation.
  • the master file processor is fault tolerant. That is, it brings up the system even if one or more of the remote processors fail during system initialization.
  • the master file processor also "watchdogs" each of the remote processors once per second by setting a flag word in the remote CDT. If the flag is not reset by the time the master file processor checks, the remote processor is assumed to have died and the master file processor begins logging a shutdown sequence.
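The watchdog check described above can be sketched as follows. The class `RemoteCDT`, the flag values 1 and 0, and the function name are assumptions for illustration; the text states only that the master sets a flag word and treats a flag that has not been reset as a sign that the remote processor has died.

```python
class RemoteCDT:
    """Stand-in for the watchdog flag word in a remote processor's CDT."""
    def __init__(self):
        self.watchdog_flag = 0

def watchdog_pass(cdts):
    """One once-per-second pass by the master file processor.

    Returns the slots whose flag is still set from the previous pass,
    then sets every flag again for the next pass.
    """
    dead = []
    for slot, cdt in cdts.items():
        if cdt.watchdog_flag != 0:   # the remote processor never reset it
            dead.append(slot)
        cdt.watchdog_flag = 1        # arm the flag for the next check
    return dead
```

A healthy remote processor would clear `watchdog_flag` between passes, so only unresponsive slots appear in the returned list.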
  • the master file processor initializes certain fields in the remote CDTs. These fields tell the remote processor what the system configuration is and which processor is the master file processor.
  • each CDT contains a lock byte and the request and response circular buffer pointers.
  • the request and response circular buffers are used during message transport.
  • Each circular buffer is described by four pointers (START, END, GET, and PUT).
  • Each of these pointers is a single 16 bit word. The words are taken to be offsets relative to the start of the CDT. Double words (a full 32 bit address) are not storable in an 80186 microprocessor in a single individual operation and are therefore not used.
  • Because multiple remote processors may attempt to update the PUT pointers simultaneously, they must lock the CDT. Only the local processor takes information from the buffers; this operation does not require a lock if the operation of updating the pointers is an indivisible one.
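The locking rule above can be sketched with a small ring buffer. This is a minimal illustration under stated assumptions: entries are a fixed 5 bytes (one FBA), one slot is kept free to tell a full buffer from an empty one, a dictionary stands in for CDT memory, and a `threading.Lock` stands in for the CDT lock byte. None of these details are given in the text.

```python
import threading

class CircularBuffer:
    """Ring described by START, END, GET, and PUT offsets into the CDT."""

    def __init__(self, start, end, entry_size=5):
        self.start, self.end, self.entry_size = start, end, entry_size
        self.get = self.put = start      # empty when GET == PUT
        self.slots = {}                  # offset -> entry (stand-in for memory)
        self.lock = threading.Lock()     # stand-in for the CDT lock byte

    def _advance(self, offset):
        nxt = offset + self.entry_size
        return self.start if nxt >= self.end else nxt

    def enqueue(self, entry):
        """Used by remote processors; PUT updates are serialized by the lock."""
        with self.lock:
            nxt = self._advance(self.put)
            if nxt == self.get:          # full: one slot is kept free
                return False
            self.slots[self.put] = entry
            self.put = nxt
            return True

    def dequeue(self):
        """Used only by the local processor, so no lock is taken."""
        if self.get == self.put:
            return None
        entry = self.slots.pop(self.get)
        self.get = self._advance(self.get)
        return entry
```

Only the producers contend for PUT, which is why the sketch locks `enqueue` and leaves `dequeue` unlocked.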
  • the master file processor CDT also contains additional information which is used for routing requests.
  • Messages are routed by a request code which is one of the fields in the fixed length portion of the message header.
  • the first step in routing a request is to look the code up in a local table to determine the routing class. Possible values for the routing class include "local", "possibly remote", "route to master file processor", and several other types of remote (that is, see request block for exact destination). In the latter two types of requests, the target is known and the message is transported immediately. If the routing class has the value "local", the request is routed locally by consulting a table that maps request codes to exchanges.
  • If the routing class is "possibly remote", a special table in the master file processor CDT is consulted to determine if there is a server for this request and, if so, where it is to be found (slot number). If the result of any of the routing lookups is the client processor's slot number, then the request is treated exactly as a local request. Keeping the master routing table in the master file processor allows the dynamic installation of various operating services, the UNIX ® filing system, etc. without the necessity of either distributing the information (which creates inconsistencies during distributed updates) or hardwiring the locations of the service processors.
  • Additional user defined requests may be added to the system and routed using the same mechanisms.
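The routing decision described above can be sketched as a pair of table lookups. The request code values, the table shapes, and the fallback to local service when no server is registered are assumptions for illustration; the text does not specify what happens when the master table has no entry.

```python
LOCAL = "local"
POSSIBLY_REMOTE = "possibly remote"
TO_MASTER = "route to master file processor"
SEE_REQUEST_BLOCK = "see request block"

def route(request_code, my_slot, master_slot, routing_class, master_table,
          block_slot=None):
    """Return the slot number of the processor that should serve the request."""
    cls = routing_class[request_code]          # local table: code -> class
    if cls == LOCAL:
        return my_slot
    if cls == TO_MASTER:
        return master_slot
    if cls == SEE_REQUEST_BLOCK:
        return block_slot                      # destination named in the request
    # POSSIBLY_REMOTE: consult the table kept in the master file processor CDT.
    # Assumption: with no registered server the request is served locally.
    return master_table.get(request_code, my_slot)
```

If the returned slot equals `my_slot`, the caller treats the request exactly as a local one, as the text describes.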
  • sending a message consists of the following steps:
  • When the remote processor receives the doorbell interrupt, it awakens the ICC module in the server agent.
  • the ICC module server agent removes the FBA from the request circular buffer, allocates space to hold the request, and copies the request in from the client's address space.
  • the request is converted to local format (the previously mentioned presentation service) and the request is sent to the local exchange, which serves the request.
  • the request is then sent directly to the proper exchange to avoid potential loops in the routing function.
  • global printer name resolution is performed by a routing process on the master file processor.
  • a cluster processor routes requests which require printer name resolution to the master file processor.
  • the result of the master file processor's name resolution process may in fact be the cluster processor that originated the request.
  • the cluster processor server agent must route the requests locally to avoid a loop. In the process of making a local copy of the request, it is modified so that when the user responds, the response goes to the ICC module response process.
  • the server processes the request just as it would process a local request. In fact, it cannot tell the difference.
  • the server processor puts output data into the request block and responds, which awakens the ICC module response process.
  • the ICC response process copies the output data back to the client.
  • the client CDT is then locked, the FBA of the original request is put into the client's response circular buffer, the CDT is unlocked, and the client's doorbell is rung.
  • the client ICC module server is then awakened by the doorbell interrupt handler. In response thereto the response is removed from the circular buffer and a response is made to the process that initiated the request.
  • the present embodiment of the invention is able to handle several hundred such messages per second.
  • Character output in the present architecture is as follows: Characters to be output are stored directly into a circular buffer maintained on the terminal processor or requester processor that is handling the terminal. The processors periodically poll their output circular buffers and emit any characters that are found.
  • the terminal processor can support I/O to all ten of its RS232 ports at the full 19,200 baud rate. High speed input is also supported with a request to "read up to X characters in Y milliseconds". If "Y" milliseconds elapse before "X" characters are received, the characters received up to that time are returned.
  • the "X" and "Y" parameters are adjustable, but the defaults, which are based on the terminal baud rate, are acceptable for interactive use.
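The "read up to X characters in Y milliseconds" request can be sketched with a queue standing in for the terminal input stream. The function name and the use of `queue.Queue` are assumptions for illustration, not the patent's interface.

```python
import queue
import time

def read_up_to(source, max_chars, timeout_ms):
    """Return up to max_chars characters, or fewer if timeout_ms elapses first."""
    deadline = time.monotonic() + timeout_ms / 1000.0
    received = []
    while len(received) < max_chars:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break                                   # Y milliseconds elapsed
        try:
            received.append(source.get(timeout=remaining))
        except queue.Empty:
            break                                   # timed out while waiting
    return "".join(received)
```

The call returns as soon as X characters are available, so a fast sender is not delayed by the timeout.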
  • the primary role of the file processor is to provide direct service between the other processors and the secondary storage devices.
  • the master file processor serves as a coordination point for many of the activities outlined above.
  • One activity controlled by the master file processor provides an essential name service, used by any processor that needs to access a system resource known only by its name.
  • the essential name service determines where the resource is located.
  • the file processor provides all the services that implement a base file system.
  • the base file system supports the operating system files directly, the UNIX ® system being built on top of it.
  • the base file system provides both I/O at the disc sector level and directory services, such as creation and deletion of files and directories.
  • the base file system has a base, simple directory hierarchy - the top level addresses the physical disc device, the middle level a particular directory within the device, and the bottom level a particular file within that directory.
  • the interface subsystems allow a specific file and directory access method to be based on these capabilities, such as the UNIX ® access methods, and provide the code that maps the particular structures to the base file system.
  • the base system remains constant while retaining flexibility in types of application file access methods that can be built on top of it. Such structure affords several benefits.
  • the structure provides flexibility in supporting diverse operating systems with their special need for file services.
  • the base system allows the storage devices to have one unique format, such that backup and restoration operations always function regardless of the type of application file system supported in present embodiments or future embodiments of the invention.
  • the base system allows the basic operating system file system software running in the file processor to remain unchanged when a new file system or file access method is added in other embodiments of the invention.
  • each file processor controls all accesses to the files physically resident on the devices actually connected to that file processor, and only to those files and those devices. That is, each file processor has complete control over file storage on the devices connected to it, but it cannot operate on files residing on devices connected to any other file processor.
  • a file processor (other than the master) has no information about the devices and files on the other file processors.
  • Each file processor is only aware that other file processors exist.
  • the particular parameters describing the directories, files, or open files reside exclusively in the controlling file processor.
  • the separateness of this approach is further imposed on file handles which are created for open files, and on the device names themselves so that the device is uniquely addressed from any computer in the system.
  • requests which originated on other processors must be routed to a particular file processor by the ICC.
  • the ICC plays an important role in the interaction of multiple file processors.
  • the file processor performs all the functions required of an efficient secondary storage driver.
  • the file processor optimizes the execution of multiple sector transfers by transferring as much data as is physically possible for each single I/O operation, depending upon the characteristics of the disc controller and the drive itself.
  • the disc driver schedules all pending I/O operations using an elevator structure.
  • the driver code performs overlapping seeks by issuing buffered seeks to each drive at the highest possible rate.
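The elevator ordering of pending disc operations can be sketched as a single sweep. The text says only that an elevator structure is used; the particular variant shown here (serve requests in the direction of travel, then reverse) and the use of cylinder numbers as the sort key are assumptions.

```python
def elevator_order(pending, head, ascending=True):
    """Order pending cylinder numbers as one elevator sweep starting at head."""
    if ascending:
        first = sorted(c for c in pending if c >= head)
        then = sorted((c for c in pending if c < head), reverse=True)
    else:
        first = sorted((c for c in pending if c <= head), reverse=True)
        then = sorted(c for c in pending if c > head)
    return first + then
```

With the head at cylinder 53 and moving upward, requests for 65 and 67 are served before the arm reverses toward 37 and 14.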
  • the file processors function as the main data servers for the distributed processing system. In fact, they are the only type of processor that controls storage devices. This status puts a high demand on the file processors for producing and consuming secondary storage data. The demand comes from all the other file processors, but particularly from the applications processor. Generally, the file processor has two responsibilities: 1) Providing secondary storage service for the system when on-line; and
  • Because file processors have the ability to establish a direct DMA channel between the disc device and the memory of remote processors, the operating system uses this capability to achieve a high disc bandwidth.
  • the most common service a file processor provides is reading and writing the discs.
  • the destination of the disc data may be a remote processor. That is, the requesting process could be running in a processor other than the one receiving the request.
  • an address of the buffer where the disc data is to be delivered is given. As described above, the address has two basic components: 1) A single byte hardware encoded bus address of the processor; and
  • When the file processor determines that the disc is ready to start the data transfer, it issues a read operation to the disc controller along with a start remote DMA operation to the DMA logic. This causes the entire disc transfer to run to completion, although the hardware is performing several discrete steps, as follows: As the disc controller starts to transfer data, the file processor hardware captures the byte stream as it is sent by the controller and assembles it into four byte word aligned packets called quads, placing them into the small circular buffer in the file processor main memory. The quads are then transferred with a hardware DMA operation over the main bus to the correct location within the destination processor's memory. Each discrete DMA transfer length (or "burst") is 8 quads. After the transfer, the processor releases the buffer used by other processors.
  • the disc transfers data at 5 Mbits/second. Because file processor buffering is minimal, it must transfer disc data across the system bus at a high speed. The very high speed at which the bus runs in burst mode - 11 Mbytes/second - makes the high transfer rates possible.
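The quad and burst sizes and the two transfer rates quoted above can be tied together with a small sketch. The zero padding of a final partial quad and the 512 byte sector used in the example are assumptions; the 4 byte quad, the 8 quad burst, and the 5 Mbits/second and 11 Mbytes/second figures come from the text.

```python
QUAD_BYTES = 4
QUADS_PER_BURST = 8

def to_bursts(data):
    """Split a byte stream into bursts of up to 8 four byte quads."""
    # Assumption: a trailing partial quad is padded with zero bytes.
    padded = data + b"\x00" * (-len(data) % QUAD_BYTES)
    quads = [padded[i:i + QUAD_BYTES]
             for i in range(0, len(padded), QUAD_BYTES)]
    return [quads[i:i + QUADS_PER_BURST]
            for i in range(0, len(quads), QUADS_PER_BURST)]

def bus_utilization(disc_mbits_per_s=5.0, bus_mbytes_per_s=11.0):
    """Fraction of burst-mode bus bandwidth consumed by one disc stream."""
    return (disc_mbits_per_s / 8.0) / bus_mbytes_per_s
```

A hypothetical 512 byte sector becomes 128 quads, or 16 bursts of 32 bytes each. At 5 Mbits/second the disc delivers 0.625 Mbytes/second, which is under 6 percent of the 11 Mbytes/second burst rate, consistent with the small amount of buffering described.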
  • When the operation is determined to be completed, a signal is issued to the requester process, posting the status of the request in the requester's address space. If the destination buffer is within the servicing file processor, the operation is essentially the same as described above, except that the inter-computer bus is not used and there is no intermediate buffering of the data. Rather, the data is transferred directly to the destination memory address.
  • the downloading function is a special responsibility of the master file processor 12a (Fig. 9). During the downloading process, the master file processor is in control and all other processors in the system act as slaves.
  • Each processor enters its ROM code, which inserts its processor type code in a special table in RAM, along with a signature bit pattern. In this way, the CDT is assembled.
  • the processor then runs ROM-based diagnostics. If the diagnostics succeed, the processor sets a flag indicating it is okay, sets another flag representing a request to be bootstrapped and downloaded, enables interrupts so that the master file processor can communicate with it, and enters an idle loop waiting for service.
  • the master file processor executes the boot strap ROM in the same way as the other processors.
  • When the master file processor finds that all is well after polling the various system processors, it reads each processor's request for service, downloads the appropriate system image into each processor, and issues an interrupt to that processor, waking it up and causing it to execute the code that was downloaded to it.
  • the master file processor is thus the critical element in bringing the system to life.
  • the power and flexibility of the present architecture invention comes from its ability to permit several autonomous processors of the same or different types to function by interacting with each other and by delivering higher throughput.
  • Multiple file processors in the same system act in concert to support a very large unified data area capability.
  • the ICC module is the functional part of the invention that ties all the multiple file processors together, allowing them to function as a unit.
  • Multiple file processors use the ICC in three different ways to achieve unified file system service.
  • the master file processor "broadcasts" requests to insure that all file processors are synchronized.
  • the ICC routes all file system requests which involve a path name (that is, the device/directory/file specification) to the master, which then determines which file processor services the device name specified and which performs another ICC route to that file processor.
  • the file processor uses the ICC's ability to route file system requests containing a file handle for an open file directly to the file processor serving that file.
  • Although each file processor manages the devices and the data on them almost completely independently, there is a small class of information and activity that is "synchronized" among all the file processors.
  • the relevant original request is routed to the master file processor, which implements the synchronization.
  • the shared information is the operating system user profile information. Since a user or some application process may potentially access data on any file processor, each file processor must have the user profile information available for every other user.
  • the activities that must be synchronized are those which have implications global to all file system storage. There are two such cases: First, the file system supports requests which permit a user to close all files that the user currently has open. Because the user may have files open on any device, each processor must receive the request so that any of the user's open files controlled by that processor can be closed. Second, the file system requires the ability to quiesce all disc activity. This request must be broadcast to all file processors so that all activity can be quieted.
  • the enhanced multiple file processor code running in the master file processor executes and duplicates the request, and sends it to each of the other file processors for execution. When all of the other file processors have replied to the master that they have executed the request, the master file processor posts the completion to the original requesting user, indicating the activity has been globally executed.
  • a typical series of file requests may be for simple open-read/write-close files.
  • the ICC module server local to the processor in which the request is made routes the request to the master file processor.
  • When the master receives the request and determines, via a table maintained therein, which file processor serves the device being addressed, it routes the request via the ICC module to that processor.
  • the master is not initiating a new request, but merely passing on the original request to the destination file processor. The result is that when the request is completed, the completion status and response are sent to the original requester. That is, the master file processor functions as a filter process.
  • the master acts as a processor which services the request.
  • the master does not act as a filter but, instead, actually services the request and posts completion to original requester using the ICC module.
  • One of the primary functions of the open request, in terms of request routing, is to establish a logical connection between the requesting user application and the file processor serving the file being opened (Fig. 9).
  • Each file processor controls the volumes with names as labelled.
  • an applications processor can request OPEN FILE [C]<JOB>NAME, where file "NAME" is in directory "JOB" on volume "C".
  • the file processor servicing the open request places an encoding of its processor bus address into the file handle.
  • the ICC module uses this encoding to route requests, such as read sector or write sector, from the user application directly to the servicing file processor. This procedure produces the logical connection that allows the efficient, direct routing to take place.
  • When the user application is finished with the file, it issues a close file request using the file handle.
  • the servicing file processor closes the file and considers that file handle invalid. The logical connection is now severed.
  • the request is first routed to the master file processor.
  • the file processor volume location is determined by the master file processor which redirects the request accordingly.
  • the appropriate file processor completes the request and responds directly to the requesting applications processor.
  • the ICC module itself knows nothing of the file system activities. If the user application erroneously issues another file request using that file handle, the ICC module routes it to the correct file processor. Upon arrival at the file processor, it is determined that the file handle is not currently valid and a completion status is posted with an error indicating an invalid file handle.
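The logical connection created by an open request can be sketched as follows. The handle layout (slot number in the high byte, a per-processor index in the low byte), the class `FileProcessor`, and the status strings are all hypothetical; the text states only that an encoding of the processor bus address is placed in the file handle and that the ICC module routes on it without knowing anything about file system state.

```python
SLOT_SHIFT = 8   # hypothetical layout: high byte = slot, low byte = index

def make_handle(slot, index):
    if not (0 <= slot <= 0xFF and 0 <= index <= 0xFF):
        raise ValueError("slot and index must each fit in one byte")
    return (slot << SLOT_SHIFT) | index

def slot_of(handle):
    return handle >> SLOT_SHIFT

class FileProcessor:
    def __init__(self, slot):
        self.slot = slot
        self.open_files = {}     # handle -> path; validity is tracked here only
        self.next_index = 0

    def open(self, path):
        handle = make_handle(self.slot, self.next_index)
        self.next_index += 1
        self.open_files[handle] = path
        return handle

    def read(self, handle):
        if handle not in self.open_files:
            return "error: invalid file handle"
        return "ok"

    def close(self, handle):
        self.open_files.pop(handle, None)

def icc_route(processors, handle):
    """The ICC looks only at the encoded slot, never at file system state."""
    return processors[slot_of(handle)]
```

A request that uses a closed handle still reaches the correct file processor; it is that processor, not the ICC module, that reports the invalid handle.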
  • the ICC module and various functions in the file processor operate in concert to form a unified storage system distributed among the devices controlled by several processors.
  • FIG. 10 is a schematic diagram of the exemplary computer microprocessor circuit.
  • An Intel 80186 microprocessor integrated circuit 16E is shown coupled via a microprocessor bus to a plurality of address latches 13F-16F and 22E. Address latches 15F, 16F, and 22E produce an internal memory address; address latches 13F and 14F produce an internal I/O address.
  • CPU data is transported to and from microprocessor 16E by data latches 13G and 14G.
  • Microprocessor bus 30 couples microprocessor 16E to local ROM 13E/14E, which is the boot strap ROM mentioned above.
  • the microprocessor bus is coupled to a file processor internal bus via microprocessor bus transceiver latches 9G-12G.
  • Fig. 11 is a schematic diagram of an exemplary computer memory circuit as is present in a file processor.
  • the memory circuit provides a three bus structure including a memory address bus 31, a data bus 32, and a control bus 33.
  • the circuit shown in Fig. 11 is configured for 256 K of local random access memory.
  • Latches U8, U19, U30, and U41 provide memory address decoding of the memory address supplied to the RAM memory chips shown in the figure.
  • Fig. 12 is a schematic diagram of an exemplary computer system bus interface circuit by which the system bus 34 is coupled to the local memory data bus 35 and to CPU data bus 36. Latches 8H-11H decode the system bus to produce the memory data bus.
  • the processor address in terms of slot number is decoded from the CPU data bus by a slot number decoding circuit 20H and coupled to a slot number converter 22H.
  • the local slot number - my slot - is also coupled to converter 22H.
  • a slot compare is performed. If the message is intended for the processor, a slot match signal is produced by converter 22H which is also coupled to program array logic (PAL 20G).
  • PAL 20G determines if the slot match refers to a memory access from a remote processor or if slot match refers to a doorbell interrupt.
  • Fig. 13 is a schematic diagram of an exemplary doorbell interrupt PAL circuit. Referring to the inputs (marked I) it can be seen that various states produce various outputs (marked O).
  • IF (VCC) SFACK = SLOTEQ * SFRES + SLOTEQ * DBI + SLOTEQ * INVADDR
  • VADR = SLOTEQ * /A15 * /A14 * /A13 * /A12 * /A11 * /A10 * /A9 * /A8 * /A7 * /LATCH + SLOTEQ * VADR
  • INVADDR = SLOTEQ * INV1 * /LATCH +
  • the present invention provides a significant step in full realization of microprocessor based systems.
  • the use of true distributed processing within a local environment produces a powerful high bandwidth system.
  • the present invention provides a powerful base for a diverse set of applications encompassing all data processing environments.
  • the use of multiple back-end file processors for modular additions of disc storage enhances system throughput by off-loading the major portions of I/O activity from the other processors and is a critical feature of the present invention.
  • Because the file processors are true computer systems themselves, they help support the sophisticated applications that may be required in such a computer system, such as data base management, which applications now have the advantage of system-wide availability.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Abstract

A multiple processor computer system comprises a plurality of parallel, asynchronous, independent, specialized-function processors (11-14). Each processor has a discrete, independent operating system; the processors are interconnected to provide transparent interprocessor communication at an operating system level over a parallel asynchronous bus (10). Each of the processors includes a central processing unit (18) and a memory (17). The processors send messages to one another by placing them in a memory of the receiving processor. The receiving processor is notified of the presence of the message by a "doorbell" interrupt signal received from the processor transmitting the message. The processors are coupled to one another through a plurality of connection slots that define an enclosure (15), which forms an independent, functional multiple computer unit. A plurality of enclosures (15, 16) can be transparently interconnected to define a multiple computer system. In this way, the resulting computer system can be expanded from a minicomputer system up to a large mainframe, depending on the application and use.
PCT/US1984/000557 1983-04-15 1984-04-12 Systeme d'ordinateur a processeur multiple WO1984004190A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU28608/84A AU2860884A (en) 1983-04-15 1984-04-12 Multi-computer computer architecture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US48556983A 1983-04-15 1983-04-15

Publications (1)

Publication Number Publication Date
WO1984004190A1 true WO1984004190A1 (fr) 1984-10-25

Family

ID=23928660

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1984/000557 WO1984004190A1 (fr) 1983-04-15 1984-04-12 Systeme d'ordinateur a processeur multiple

Country Status (2)

Country Link
EP (1) EP0139727A1 (fr)
WO (1) WO1984004190A1 (fr)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3349375A (en) * 1963-11-07 1967-10-24 Ibm Associative logic for highly parallel computer and data processing systems
US3480914A (en) * 1967-01-03 1969-11-25 Ibm Control mechanism for a multi-processor computing system
US3648256A (en) * 1969-12-31 1972-03-07 Nasa Communications link for computers
US3768074A (en) * 1972-05-12 1973-10-23 Burroughs Corp Multiprocessing system having means for permissive coupling of different subsystems
US3886524A (en) * 1973-10-18 1975-05-27 Texas Instruments Inc Asynchronous communication bus
US3905023A (en) * 1973-08-15 1975-09-09 Burroughs Corp Large scale multi-level information processing system employing improved failsaft techniques
US4141067A (en) * 1977-06-13 1979-02-20 General Automation Multiprocessor system with cache memory
US4253144A (en) * 1978-12-21 1981-02-24 Burroughs Corporation Multi-processor communication network
US4320452A (en) * 1978-06-29 1982-03-16 Standard Oil Company (Indiana) Digital bus and control circuitry for data routing and transmission

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5101342A (en) * 1985-02-06 1992-03-31 Kabushiki Kaisha Toshiba Multiple processor data processing system with processors of varying kinds
WO2014164310A1 (fr) * 2013-03-13 2014-10-09 Qualcomm Incorporated Contrôleur de dispositif partagé intégré d'hôte double
US9431077B2 (en) 2013-03-13 2016-08-30 Qualcomm Incorporated Dual host embedded shared device controller

Also Published As

Publication number Publication date
EP0139727A1 (fr) 1985-05-08

Legal Events

Date Code Title Description
AK Designated states

Designated state(s): AU DE GB JP

AL Designated countries for regional patents

Designated state(s): AT BE CH DE FR GB LU NL SE

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642