GB1577658A - Digital data processors - Google Patents

Publication number
GB1577658A
Authority
GB
United Kingdom
Prior art keywords
control
instruction
data
values
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
GB37048/77A
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Massachusetts Institute of Technology
Original Assignee
Massachusetts Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US05/721,083 (US4145733A)
Application filed by Massachusetts Institute of Technology
Publication of GB1577658A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/448 Execution paradigms, e.g. implementations of programming paradigms
    • G06F9/4494 Execution paradigms, e.g. implementations of programming paradigms data driven

Description

(54) DIGITAL DATA PROCESSORS
(71) We, MASSACHUSETTS INSTITUTE OF TECHNOLOGY, a corporation existing and organised under the laws of the State of Massachusetts, U.S.A., having its principal place of business at 77 Massachusetts Avenue, Cambridge, Massachusetts, U.S.A., do hereby declare the invention for which we pray that a patent may be granted to us, and the method by which it is to be performed, to be particularly described in and by the following statement:-
This invention relates to digital data processors.
As background to the present invention the reader is referred to U.S. Patent No. 3 962 706, for Data Processing Apparatus For Highly Parallel Execution of Stored Programs, and to the publication "A preliminary architecture for a basic data-flow processor" by J. B. Dennis and D. P. Misunas, published in Proceedings of the Second Annual Symposium on Computer Architecture, Institute of Electrical and Electronics Engineers, New York (January 1975), pp 126-132.
The work resulting in the present invention was supported by research grants from the Office of Computing Activities of the National Science Foundation and the Advanced Research Projects Agency of the Department of Defense.
The study of the expression of concurrent operation within programming languages has yielded a data-driven form of program representation known as data flow; that is, an instruction of a program in a processor is enabled for execution upon the arrival of all required operands and, upon its execution, copies of the resulting value are sent to all instructions which require it for their execution. The development of data-flow representation was accompanied by the development of processors designed to fully exploit the high degree of local parallelism exposed by the data-flow representation. The architectures of two such processors are described in the two publications mentioned above, which are incorporated into the present specification by reference.
The "Elementary Processor" presented in U.S. Patent No. 3 962 706 was designed to execute a simple class of programs which are well-suited for the representation of signal processing computations. This class of programs permits only elementary computation; no decision capability is provided. The "Basic Processor" presented in the second-mentioned publication adds conditional and iterative constructs to the language and architecture, and incorporates a multi-level memory system in which an active memory is operated as a cache, and individual instructions are retrieved from an auxiliary memory as they are required for computation. It is desirable to expand the capabilities of The Elementary and Basic Processors to avoid possible deadlock conditions in the execution of stream-oriented and iterative computations, and the present invention aims to provide a processor which may be improved in this respect.
According to the present invention, there is provided a digital data processor arranged to execute data-flow programs by the concurrent processing of a plurality of instructions each of which is executed upon the arrival of at least one data value and/or at least one control value which is generated upon the execution of a downstream instruction, the processor comprising (a) a read/write memory comprising a plurality of instruction cells, each of which has a respective index, contains a respective one of said instructions, and is arranged to receive each data value and/or control value necessary for the execution of its instruction; (b) operation means for performing arithmetic and/or logic operations on said data values; (c) control identity means for identifying said control values; (d) first arbitration means transmitting signals from said memory to said operation means, each of which signals represents one of said instructions together with each data value necessary for its execution; (e) second arbitration means transmitting signals representing said control values from said memory to said control identity means; (f) control means transmitting said signals representing said control values to respective cells of said memory; and (g) distribution means transmitting signals from said operation means to respective cells of said memory, each of which signals represents the result of an operation of said operation means.
To assist in understanding the invention and to show how it may be carried out, reference will now be made, by way of example, to the accompanying drawings, in which:
Fig. 1 is a general schematic of a processor embodying the present invention;
Fig. 2 is a diagram of an elementary data-flow program, illustrating certain background principles underlying the present invention;
Fig. 3 is a general schematic of a system for executing the data-flow program of Fig. 2;
Fig. 4 is a detailed schematic of an instruction cell, which constitutes a component of the systems of Figs. 1 and 3;
Fig. 5 is a detailed schematic of an instruction format, which describes an aspect of the instruction cell of Fig. 4;
Fig. 6 illustrates symbols representing links of basic data-flow language;
Fig. 7 illustrates symbols representing actors of basic data-flow language;
Fig. 8 is a diagram of a basic data-flow program, illustrating further background principles underlying the present invention;
Fig. 9 is a general schematic of a system for executing the data-flow program of Fig. 8;
Fig. 10 illustrates symbols representing additional actors of data-flow language;
Fig. 11 is a diagram of a deadlock-free version of the elementary data-flow program of Fig. 2;
Fig. 12 is a diagram of a deadlock-free version of the basic data-flow program of Fig. 8;
Fig. 13 illustrates a revised instruction cell format;
Fig. 14 illustrates a detail of an instruction cell format of Fig. 13;
Fig. 15 illustrates the format of instruction cells when containing the program of Fig. 12; and
Fig. 16 illustrates the format of instruction cells when containing the program of Fig. 11.
Generally, the embodiment of Fig. 1 comprises an active memory 20 for holding at least a record of active instructions, one or more operation units 22 for managing signals in correspondence with data computations, one or more decision units 24 for managing signals in correspondence with selections, one or more control identity units 26 for managing signals in correspondence with program control, a first arbitration network 28 for transmitting signals representing information packets from active memory 20 to operation unit(s) 22 and decision unit(s) 24, a second arbitration network 30 for transmitting signals representing information packets from active memory 20 to control identity unit(s) 26, a control network 32 for transmitting signals representing information packets from decision unit(s) 24 to active memory 20, another control network 34 for transmitting signals representing information packets from control identity unit(s) 26 to active memory 20, and a distribution network 36 for transmitting signals representing information packets from operation unit(s) 22 to active memory 20.
Details of the various components of the above system and their function will become apparent later, following an initial discussion of background considerations with reference to what are termed herein The Elementary Processor and The Basic Processor.
The Elementary Processor
The Elementary Processor executes programs represented in elementary data-flow language. A program in this language is a directed graph in which the nodes are operators or links. These nodes are connected by arcs along which values (carried by tokens) may travel. An operator of the program is enabled when tokens are present on all input arcs. An enabled operator may fire at any time, removing a token from each input arc, computing a value from the operands associated with the input tokens, and associating that value with a result token placed on its output arc. A result may be sent to more than one destination by means of a link which removes a token on its input and places identical tokens on its outputs. An operator or a link cannot fire unless there is no token present on any output arc of that operator or link.
The elementary data-flow program of Fig. 2 has a token present on each input arc X, Y. Links L1 and L2 are enabled, and either one can fire; suppose L2 does. Then operator A2 and link L1 are enabled, and once again, either one can fire. In this manner, tokens travel through the program until a token appears on the output conveying the value A(x)(x+y). Once operators A1 and A2 have fired, there are no tokens present on any of the arcs emanating from L1 and L2, and the links can fire as soon as the input operators deliver new values.
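By way of illustration only, the firing rule described above may be sketched in Python as follows. The class names `Arc`, `Operator` and `Link`, the `run` scheduler and the small graph built in `demo` are assumptions introduced for this sketch; the graph is a hypothetical one computing x(x+y) and is not asserted to be the program of Fig. 2.

```python
class Arc:
    """An arc that may hold at most one token (elementary data-flow rule)."""
    def __init__(self):
        self.tokens = []

    def has_token(self):
        return bool(self.tokens)

    def put(self, value):
        assert not self.tokens, "arc already holds a token"
        self.tokens.append(value)

    def take(self):
        return self.tokens.pop()


class Operator:
    """Enabled when every input arc holds a token and the output arc is empty."""
    def __init__(self, fn, inputs, output):
        self.fn, self.inputs, self.output = fn, inputs, output

    def enabled(self):
        return all(a.has_token() for a in self.inputs) and not self.output.has_token()

    def fire(self):
        operands = [a.take() for a in self.inputs]
        self.output.put(self.fn(*operands))


class Link:
    """Removes the token on its input and places identical tokens on its outputs."""
    def __init__(self, source, outputs):
        self.source, self.outputs = source, outputs

    def enabled(self):
        return self.source.has_token() and not any(o.has_token() for o in self.outputs)

    def fire(self):
        value = self.source.take()
        for o in self.outputs:
            o.put(value)


def run(nodes):
    """Fire enabled nodes in list order until no node is enabled."""
    fired = True
    while fired:
        fired = False
        for node in nodes:
            if node.enabled():
                node.fire()
                fired = True


def demo(x_val, y_val):
    # Hypothetical graph: a link fans x out to an adder and a multiplier.
    x, y, a, b, s, out = (Arc() for _ in range(6))
    nodes = [
        Link(x, [a, b]),
        Operator(lambda p, q: p + q, [a, y], s),
        Operator(lambda p, q: p * q, [b, s], out),
    ]
    x.put(x_val)
    y.put(y_val)
    run(nodes)
    return out.take()
```

In this sketch the scheduler visits nodes in a fixed order; the language itself permits any enabled node to fire at any time, and the result does not depend on the order chosen.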
The structure of The Elementary Processor is presented in Fig. 3. A data-flow program to be executed is stored in a Memory of the processor. The Memory is a collection of Instruction Cells; one Instruction Cell is associated with each operator of the program. Each Instruction Cell (Fig. 4) is composed of three registers. The first register holds an instruction (Fig. 5) which specifies the operation to be performed and the addresses of the registers to which the result of the operation is to be directed. The second and third registers hold the operands for use in execution of the instruction.
When an Instruction Cell contains an instruction and the necessary operands, it is enabled and signals an Arbitration Network that it is ready to transmit its contents as an operation packet to an Operation Unit which can perform the desired function.
The operation packet flows through the Arbitration Network which directs it to an appropriate Operation Unit by decoding the instruction portion of the packet.
The result of an operation leaves an Operation Unit as one or more data packets, each consisting of the computed value and the address of a register in the Memory to which the value is to be delivered. A Distribution Network accepts data packets from the Operation Units and utilizes the address contained in each to direct the data item through the network to the correct register in the Memory.
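The packet flow just described may be sketched, under stated assumptions, as follows. The names `InstructionCell`, `OPERATION_UNITS` and `step`, the opcodes `ADD` and `MUL`, and the addressing of a register as a (cell index, operand slot) pair are all hypothetical choices made for this sketch; the two networks are reduced to a dictionary lookup and a direct store.

```python
from dataclasses import dataclass, field

# Hypothetical Operation Units, selected by decoding the instruction.
OPERATION_UNITS = {
    "ADD": lambda a, b: a + b,
    "MUL": lambda a, b: a * b,
}


@dataclass
class InstructionCell:
    opcode: str
    destinations: list                     # (cell index, operand slot) pairs
    operands: list = field(default_factory=lambda: [None, None])

    def enabled(self):
        # Enabled once both operand registers hold a value.
        return all(v is not None for v in self.operands)


def step(memory):
    """One pass: collect operation packets, execute them, deliver data packets."""
    packets = []
    for cell in memory:
        if cell.enabled():
            # Operation packet: instruction plus operands; the cell is emptied.
            packets.append((cell.opcode, tuple(cell.operands), cell.destinations))
            cell.operands = [None, None]
    for opcode, operands, destinations in packets:
        # "Arbitration": route the packet to a unit by decoding its opcode.
        value = OPERATION_UNITS[opcode](*operands)
        # "Distribution": one data packet (value, address) per destination.
        for cell_index, slot in destinations:
            memory[cell_index].operands[slot] = value
    return len(packets)
```

For instance, a cell holding `ADD` with operands 2 and 3 and destination (1, 0) delivers the value 5 to the first operand register of cell 1 on the first call of `step`.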
The Basic Processor
The computational capability of The Basic Processor is greater than that of The Elementary Processor due to the addition of conditional and iterative constructs to the language executed by the processor. To illustrate this additional capability, presented herein is the structure of the instruction execution section of The Basic Processor.
The representation of conditional and iterative program constructs in data-flow form requires additional types of links and actors beyond those described for The Elementary Processor. The types of links and actors in the basic data-flow language are shown in Figs. 6 and 7.
Data values pass through data links in the manner presented previously. Tokens transmitted by control links are known as control tokens and carry a Boolean value of either true or false. A control token is generated at a decider (Fig. 7b) which, when it receives values from its input links, applies its associated predicate and produces either a true or false control token at its output arc. The control token produced at a decider can be combined with other control tokens by means of a Boolean operator (Fig. 7f).
Control tokens enable the flow of data tokens by means of either a T-gate, an F-gate, or a merge (Fig. 7c, d, e). A T-gate will pass the data token on its input arc to its output arc when it receives a control token conveying the value true at its control input. It will absorb the data token on its input arc and place nothing on its output arc if a false-valued control token is received. Similarly, the F-gate will pass its input data token to its output arc only on receipt of a false-valued token on the control input. Upon receipt of a true-valued token, it will absorb the data token.
A merge actor has a true input, a false input, and a control input. It passes to its output arc a data token from the input arc corresponding to the value of the control token received. Any tokens on the other input are not affected.
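The behaviour of the gate and merge actors may be summarised by the following minimal sketch. The function names and the representation of an output arc as a list (empty when the token is absorbed) and of the merge inputs as `deque` objects are assumptions made here for illustration.

```python
from collections import deque


def t_gate(data, control):
    """Pass the data token on a true control token; otherwise absorb it."""
    return [data] if control else []


def f_gate(data, control):
    """Pass the data token on a false control token; otherwise absorb it."""
    return [] if control else [data]


def merge(true_input, false_input, control):
    """Take one token from the input selected by the control token.

    Tokens waiting on the other input are left untouched.
    """
    chosen = true_input if control else false_input
    return chosen.popleft()
```

For example, with one token waiting on each merge input, a false-valued control token removes the token from the false input only.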
In illustration of the use of the actors and links of the basic data-flow language, Fig. 8 gives a basic data-flow program for the following computation:
    input x, y
    n := 0
    while y < x do
        x := x - y
        n := n + 1
    end
    output x, n
In the data-flow program, the control input arcs of the gate and merge actors are to be considered connected to the output of the decider. The control input arcs of the three merge actors carry false-valued control tokens in the initial configuration to allow the input values of x and y and the constant 0 to be admitted as initial values for the iteration. Once these values have been received, the predicate y < x is tested. If it is true, the value of y and the new value of x are cycled back into the body of the iteration through the T-gate and two merge nodes. Concurrently, the remaining T-gate and merge node return the incremented value of the iteration count n. When the output of the decider is false, the current values of x and n are delivered through the two F-gates, and the initial configuration is restored.
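As a hedged illustration, the computation above may be written in Python in two ways: a direct transcription (`sequential`) and a version (`token_flow`) which follows the roles of the merge actors, the decider and the gates as described. Both function names are introduced here for the sketch, and `token_flow` collapses the concurrent firing of the actors into one loop body per cycle, which is a simplification. A positive value of y is assumed so that the iteration terminates.

```python
def sequential(x, y):
    """Direct transcription of the computation; assumes y > 0."""
    n = 0
    while y < x:
        x = x - y
        n = n + 1
    return x, n


def token_flow(x_in, y_in):
    """Same computation, organised around the control token of the decider."""
    control = False              # initial false-valued tokens at the merges
    initial = (x_in, y_in, 0)    # input values of x and y and the constant 0
    fed_back = None              # values cycled back through the T-gates
    while True:
        # Merge actors: false selects the initial values, true the fed-back ones.
        x, y, n = fed_back if control else initial
        control = y < x          # decider applies the predicate
        if control:
            # T-gates pass; the body computes the next values of x and n.
            fed_back = (x - y, y, n + 1)
        else:
            # F-gates pass the current x and n to the output.
            return x, n
```

With inputs x = 10 and y = 3, both functions perform three cycles and deliver x = 1 and n = 3.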
The organization of the instruction processing section of The Basic Processor is shown in Fig. 9. As in The Elementary Processor, each Instruction Cell consists of three registers, the first of which holds an instruction, and the remaining ones of which contain space for a data value, for one or more Boolean values, for one or more destination cell identifiers, or for a combination of the three. Each instruction corresponds to an operator or a decider of a basic data-flow program. The gate and merge actors of the data-flow program are not represented by separate instructions; rather, the function of the gates is incorporated into the instructions associated with operators and deciders, and the function of the merge actors is implemented for free by the nature of the Distribution Network.
As in The Elementary Processor, the Operation Units of The Basic Processor correspond to operators of a data-flow program. Decision Units of The Basic Processor correspond to deciders of a program, and each produces a control packet containing either the value true or the value false for each destination upon receipt of a decision packet containing a decision specification, one or more destination specifications, and the necessary operands.
Each data or Boolean value held as an operand by an Instruction Cell has associated with it a gating code which specifies whether the associated value is to be true-gated, false-gated, not gated at all, or is a constant. The specification of a true or false gating code designates that the associated register is enabled only upon receipt of a data value and a Boolean value of type matching the gating code. The receipt of a Boolean value not matching the gating code causes the corresponding data value to be discarded upon arrival.
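The effect of a gating code on an operand register may be sketched as below. The class `OperandRegister`, its method names and the use of the strings "true", "false", "no" and "cons" as gating codes are assumptions for this sketch, as is the choice to consume the mismatching Boolean value together with the discarded data value.

```python
class OperandRegister:
    """Operand register with a gating code (hypothetical model)."""

    GATED = {"true": True, "false": False}

    def __init__(self, gating, value=None):
        self.gating = gating      # "true", "false", "no" or "cons"
        self.value = value        # data value, or a constant for "cons"
        self.control = None       # Boolean gating value, if one has arrived

    def receive_data(self, value):
        self.value = value
        self._resolve()

    def receive_control(self, boolean):
        self.control = boolean
        self._resolve()

    def _resolve(self):
        # Once both a data value and a Boolean value are present in a gated
        # register, a mismatch with the gating code discards the data value.
        gated = self.gating in self.GATED
        if gated and self.value is not None and self.control is not None:
            if self.control != self.GATED[self.gating]:
                self.value = None
                self.control = None   # assumption: the Boolean is consumed too

    def enabled(self):
        if self.value is None:
            return False
        if self.gating in self.GATED:
            return self.control == self.GATED[self.gating]
        return True               # "no" and "cons" need only the value
```

A true-gated register thus becomes enabled only after both a data value and a true Boolean value have arrived, in either order.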
The Deadlock-Free Architecture
The deadlock problem in The Elementary Processor and The Basic Processor arises due to the fact that, in practice, an operator or link of a data-flow program being executed in either of the processors may not necessarily obey the rule that an operator or link cannot fire unless there is no token on the output arc of that operator or link. Thus, it is possible for a number of values destined for the same register to be in the Distribution Network simultaneously. In such a case, several values will be stored in buffers within the Distribution Network, blocking access to succeeding portions of the network and preventing any other packets from being transferred to the portion of the Memory serviced by the succeeding portions of the network. A deadlock condition arises when one of the stored values blocks a packet which is needed by the program in order to enable the cell to which the blocked packet is destined.
The solution to the deadlock problem requires the addition of a form of feedback between operators of a program in order to place an upper bound on the number of tokens which may be present upon an arc of a given data-flow program. This feedback is accomplished through the backward flow of control tokens. For this purpose a new type of control token is introduced, having value control to differentiate it from the Boolean control tokens which have value true or false.
A control-valued control token is produced at a decider with the nil predicate, and controls the flow of data tokens by means of a C-gate or control-gate (Fig. 10). Such a gate is enabled upon receipt of a data value and a control-valued control token, and upon firing, transfers the data value from its data input arc to its output arc. An operator and gate may be combined as in Fig. 10b. Such a joint operator is enabled when there is a data token present on each data input and a control-valued token is present on the control input arc.
A deadlock-free version of a pipelined data-flow program is constructed from the original version by replacing each operator which could possibly place multiple tokens on its output arc by a joint operator and gate. The gate is controlled by the output of succeeding operators. When the link on the output of each of these succeeding operators receives a data token, it sends one copy to a decider with the nil predicate. This decider generates a control-valued token which is passed to the gate on the output of the first operator, allowing that operator to become reenabled as soon as all necessary operands are present.
A deadlock-free version of the elementary data-flow program of Fig. 2 is shown in Fig. 11. The program contains an initial marking of control-valued tokens which permit only one token to be present on each arc of the program. To establish a larger bound on the number of tokens on an arc of the program, the initial marking must contain several control-valued tokens on each arc. Link L2 of the program provides a fan-out of two for the values produced by input operator 1, and hence, the control token for that operator is produced by ANDing two control-valued tokens from succeeding operators A1 and A2.
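How an initial marking of control-valued tokens bounds the number of tokens on an arc may be illustrated by the following sketch of a single producer and a slower consumer. The function `simulate`, its parameters and the tick-based timing are hypothetical; the control-valued tokens are modelled simply as a count held at the producer.

```python
from collections import deque


def simulate(values, bound, consumer_period):
    """Return (values received, maximum number of tokens seen on the arc).

    bound           -- number of control-valued tokens in the initial marking
    consumer_period -- the consumer fires only on ticks divisible by this
    """
    arc = deque()                 # data tokens in flight to the consumer
    control_tokens = bound        # control-valued tokens held by the producer
    pending = deque(values)       # operands waiting at the producer
    received = []
    max_on_arc = 0
    tick = 0
    while pending or arc:
        # Joint operator and C-gate: fires only with a data value and a
        # control-valued token, consuming one of each.
        if pending and control_tokens > 0:
            control_tokens -= 1
            arc.append(pending.popleft())
        max_on_arc = max(max_on_arc, len(arc))
        # Consumer: on firing, a control-valued token flows backward.
        if arc and tick % consumer_period == 0:
            received.append(arc.popleft())
            control_tokens += 1
        tick += 1
    return received, max_on_arc
```

In this sketch the producer can never run more than `bound` tokens ahead of the consumer, however slowly the consumer fires.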
In the case of an iterative data-flow program, deadlock can arise due to possible conflict between tokens of simultaneously active cycles of the iteration. The addition of feedback to an iterative program assures that all operations of one cycle of the iteration are concluded before the next cycle is initiated.
Due to the fact that there exist alternate paths in an iterative data-flow program, to ensure freedom from deadlock, the structure of a T- or F-gate must be redefined as shown in Fig. 10c, d. Each T- or F-gate of the program operates in the manner described previously; however, the gate is required to generate a control-valued token upon firing, regardless of whether the firing of the gate actor propagates its input data value or not.
The deadlock-free version of the iterative data-flow program of Fig. 8 is shown in Fig. 12.
The merge actors must now be included as separate instructions in the implementation of the iterative data-flow program since they are now utilized to control the initiation of each cycle of the iteration. This control is accomplished by placing a C-gate on the output of each merge actor, permitting the actor to become enabled only after receiving a data value and a control-valued control token.
Through use of such feedback loops in a data-flow program, one can precisely control the number of tokens on a given arc of the program. The number of tokens allowed on a single arc may vary, due to architectural constraints to be presented next.
The deadlock-free architecture of The Basic Processor is presented in Fig. 1. The difference between the processor depicted in Fig. 1 and the processor presented in Fig. 9 arises in the structure of the Memory Cells and in the addition of the second Arbitration Network 30, Control Identity Units 26, and the Control Network 34 for the conveyance of control-valued control packets.
The revised Instruction Cell format is shown in Fig. 13. Each operand register of the Instruction Cell is structured as a first-in-first-out (FIFO) queue. The depth of this queue determines the number of data packets which can be simultaneously destined for the operand register, and hence determines the maximum number of tokens which can be present on an arc of a data-flow program. The Instruction Cell of Fig. 13 has a queue of depth three in each operand register, hence up to three tokens may be present on an arc of a program in such a processor.
The revised instruction format is shown in Fig. 14. Each destination address has a network identifier associated with it. This identifier specifies whether the address is a destination address for a value generated at the appropriate Operation or Decision Unit or is an address to which a control-valued control packet is to be sent. The identifier performs this specification function by designating one of the two Arbitration Networks 28 and 30, the first of which delivers operation and decision packets to the Operation and Decision Units 22 and 24, and the second of which accepts control packets from the Memory 20 and delivers them to the Control Identity Units 26. The Control Identity Units 26 present each control packet received to the Control Network 34 for conveyance to the proper destination Instruction Cell.
Each instruction contains a control status field which specifies the number of control-valued packets which must be received in order for the instruction to become reenabled. A number of control receipt fields equal to the operand queue depth are used to note the arrival of control-valued packets at the Cell; one control receipt field is associated with each level of the operand queue. The control receipt fields also operate as a FIFO queue; that is, when the first control receipt field is cleared after transmission of the Cell content to the Arbitration Network, each of the remaining fields shifts up one field, and the last field is set to zero. Initially, all control receipt fields have value zero.
The control receipt field of an Instruction Cell is utilized to perform the ANDing of control tokens from the appropriate destination Cells. To avoid conflict between succeeding control-valued packets from the same Cell in a processor which has an operand queue length greater than one, each control-valued packet from a particular Cell has associated with it an integer value which is uniquely associated with that particular Cell, and a control-valued packet conveying the unique integer value x received by a Cell in a processor which has operand queue length m is processed in the following manner:
1. let the value of the variable d be 0
2. if the xth bit of control receipt field d equals 1 then go to step 3, else set the xth bit of control receipt field d to 1 and stop
3. if d has value m then signal an error, else d receives a new value equal to the old value of d plus 1 and go to step 2
This procedure ensures that there will be no conflict between control-valued packets generated by the same Cell, because if several control-valued packets are received by a Cell from the same succeeding Cell, their receipt will be noted in different control receipt fields of the Cell which receives them.
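The procedure above may be sketched in Python as follows, each control receipt field being held as an integer used as a set of bits. The function names are hypothetical. As an assumption of this sketch, the fields are indexed 0 to m-1 and the error is signalled when no field remains in which bit x is clear, which is one reading of the test on d in step 3.

```python
def note_control_packet(receipt_fields, x):
    """Record a control-valued packet carrying integer x.

    receipt_fields -- list of m integers, one per level of the operand queue
    Returns the index d of the field in which the receipt was noted.
    """
    for d in range(len(receipt_fields)):      # steps 1 and 3: d = 0, 1, ...
        if not receipt_fields[d] & (1 << x):  # step 2: is bit x of field d clear?
            receipt_fields[d] |= 1 << x       # note the receipt and stop
            return d
    raise OverflowError("more control-valued packets than queue levels")


def shift_receipt_fields(receipt_fields):
    """After the Cell fires: drop the first field, shift up, zero the last."""
    receipt_fields.pop(0)
    receipt_fields.append(0)
```

Two successive packets carrying the same integer are thus noted in fields 0 and 1 respectively, so that the second is held over for the next enabling of the Cell.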
An Instruction Cell is enabled when all three registers of the Cell are enabled. An instruction register is enabled when there is an instruction present in the register and the value contained in the first control receipt field is equal to the control status value. An operand register is enabled upon receipt of a data value and an appropriate Boolean gating value (if any). When all three registers are enabled, the instruction specification, destination specifications whose associated network identifiers designate the first Arbitration Network, and the first set of operands in the operand queues are transmitted to the first Arbitration Network, and the first element of each operand queue and the first control receipt field are emptied. Simultaneously, an appropriate control-valued packet is placed in the second Arbitration Network for each destination address which has an associated network identifier designating the second Arbitration Network.
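The enabling condition of the revised Instruction Cell may be sketched as below. The class `QueuedCell` and its field names are hypothetical. Two simplifications are assumed: Boolean gating of operands is omitted, and the comparison of the first control receipt field with the control status value is modelled by counting the bits set in that field, on the reading that the control status gives the number of control-valued packets required.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class QueuedCell:
    control_status: int      # number of control-valued packets required
    receipt_fields: list     # one integer bit set per operand queue level
    operand_queues: tuple    # one FIFO queue (deque) per operand register

    def enabled(self):
        # Instruction register: first receipt field matches the control status.
        control_ok = bin(self.receipt_fields[0]).count("1") == self.control_status
        # Operand registers: a value waits at the head of every queue.
        operands_ok = all(len(q) > 0 for q in self.operand_queues)
        return control_ok and operands_ok

    def fire(self):
        """Emit the first set of operands and empty the first queue level."""
        operands = tuple(q.popleft() for q in self.operand_queues)
        self.receipt_fields.pop(0)      # first control receipt field is cleared,
        self.receipt_fields.append(0)   # the rest shift up, the last becomes zero
        return operands
```

After firing, the cell is enabled again only when the next level of each operand queue is occupied and the shifted control receipt field again shows the required number of receipts.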
To assure freedom from deadlock, each Instruction Cell is also required to provide a control-valued packet upon consumption of any data value by a gate actor contained in the Cell. This is accomplished by having a Cell, upon consumption of a data value, place in the second Arbitration Network a control-valued packet for each destination with a network identifier designating the second Arbitration Network.
If the buffer length of a processor is equal to one then it is not necessary to associate an integer value with each control-valued packet since each succeeding Instruction Cell can send at most one control-valued packet between successive enablings of an Instruction Cell. In such a case, each Instruction Cell must merely count the number of control-valued packets which arrive, and a Cell is enabled when all operands are present and the correct number of control-valued packets have been received.
The initial contents of Instruction Cells for the iterative data-flow program of Fig. 12 are presented in Fig. 15. For the sake of simplicity, the destination address field of each instruction holds the specification of all required destinations.
The Cell configuration depicted in Fig. 15 has a buffer length of one; however, integer values are associated with control-valued packets to demonstrate their use. The control status value in each instruction register is underlined to indicate that it is a constant and to differentiate it from the value contained in the preceding control receipt field. Cells 0, 1, and 6 have initial values of 1 in their control receipt fields to initiate the computation. A destination address preceded by a c and an integer value designates that a control-valued packet with the specified value is to be placed in the second Arbitration Network at the appropriate time, as described previously.
An empty operand register contains a dash. Each non-empty operand register contains first a gating code specifying whether the register is to act as a T-gate (true), an F-gate (false), is not gated (no), or contains a constant (cons). The remaining fields of the operand registers contain the initial data and Boolean values necessary for the computation. Operand registers which are initially empty are indicated by parentheses.
Initially, Cells 0, 1, and 6 are enabled, and upon processing by an identity operation unit, pass the initial values of x, y, and 0 into the body of the iteration contained in Cells 2, 3, and 7. Cells 0, 1, 4, 7, and 8 each generate a control-valued packet upon execution of a gating action. The control values produced by these Cells are ANDed by Cell 5, which sends out control packets to reenable Cells 0, 1, and 6.
The structure of a deadlock-free version of The Elementary Processor is similar to the deadlock-free version of The Basic Processor with the exception that there are no Decision Units 24 and no corresponding Control Network 32. The structure differs from that of The Elementary Processor presented previously by the addition of a Control Network and a second Arbitration Network which connects to the Control Network via Control Identity Units.
The required modifications to the Operation Units and the instruction format of The Elementary Processor to achieve deadlock-free operation are identical to those utilized in The Basic Processor, and the instruction format shown in Fig. 13 is valid for The Elementary Processor.
Buffering of operands is utilized to allow several packets to simultaneously specify the same destination register, and the depth of the buffering controls the number of values which may have a common destination. Addition of the control status and control receipt fields permits the ANDing of control tokens from several destinations.
The initial contents of the Instruction Cells for the deadlock-free version of the elementary data-flow program of Fig. 11 are shown in Fig. 16. Initially Cells 0 and 1 of the program are enabled and are directed to an input Operation Unit. Two input values are accepted over channels 1 and 2 and are sent as data packets through the Distribution Network to registers 7, 10, and 11. Upon transferring their contents as operation packets to the Arbitration Network, Cells 0 and 1 cannot be enabled again until each has received the specified number of control-valued packets. These control packets are provided to Cells 0 and 1 by Cells 2 and 3 (which also are initially enabled).
It may be appreciated that the illustrated embodiments of the invention provide processors which may achieve highly parallel execution of programs represented in data-flow form, incorporating a form of deadlock prevention between the instructions of a data-flow program, allowing a value to be generated by an instruction and sent to the successor instructions in the computation only when those instructions are ready to receive the value. The incorporation of this mechanism prevents the possibility of conflict between successive stages of a pipelined computation and between successive iterations of an iterative computation.
A significant distinctive feature of the illustrated embodiments of the invention is that, in execution of a program, no central control is required or provided, but internal communication between instructions is purely local, as contrasted with conventional serial processing programs in processors having central control.
WHAT WE CLAIM IS:
1. A digital data processor arranged to execute data-flow programs by the concurrent processing of a plurality of instructions, each of which is executed upon the arrival of at least one data value and/or at least one control value which is generated upon the execution of a downstream instruction, the processor comprising:
(a) a read/write memory comprising a plurality of instruction cells, each of which has a respective index, contains a respective one of said instructions, and is arranged to receive each data value and/or control value necessary for the execution of its instruction;
(b) operation means for performing arithmetic and/or logic operations on said data values;
(c) control identity means for identifying said control values;
(d) first arbitration means transmitting signals from said memory to said operation means, each of which signals represents one of said instructions together with each data value necessary for its execution;
(e) second arbitration means transmitting signals representing said control values from said memory to said control identity means;
(f) control means transmitting said signals representing said control values to respective cells of said memory; and
(g) distribution means transmitting signals from said operation means to respective cells of said memory, each of which signals represents the result of an operation of said operation means.
2. A processor according to claim 1, further comprising: decision means for performing comparison operations on data values and outputting corresponding Boolean values; and further control means transmitting signals representing said Boolean values to respective cells of said memory: said first arbitration means transmitting signals from said memory to said decision means, each of which signals represents one of said instructions together with each data value necessary for its execution; and each instruction in said memory being executed upon the arrival of at least one data value, and/or said control value, and/or said Boolean value.
3. A processor according to claim 1 or 2, wherein each of said instruction cells comprises a plurality of register, one of which contains an instruction, and the or each other of which is for containing a respective data value and/or Boolean value.
4. A processor according to claim 3, wherein each register containing an instruction has an instruction field containing a functional specification and a destination field containing a destination specification.
5. A processor according to claim 4, wherein the functional specifications contain at least one member of a set of operational specifications, to be performed by said operation means.
6. A processor according to claims 2 to 5, wherein each functional specification contains at least one member of a set of operational and decision specifications, to be performed respectively by said operation and decision means.
7. A processor according to claim 4, 5 or 6, wherein each destination field contains at least one of said cell indices and at least one routing specification indicating whether, upon execution of the respective instruction of the cell, the associated signal is to be transmitted via the first or second arbitration means eventually to the cell having said one index.
8. A processor according to claim 3, 4, 5 or 6, wherein each register containing an instruction has a control receipt field and a control status field, the control receipt field indicating the receipt of control values by the respective cell and/or the availability of the respective cell to receive data values and/or Boolean values, and the control status field indicating the number of control values which must be received by the respective cell in order for the instruction to be enabled for execution.
9. A processor according to any preceding claim, wherein each data value or Boolean value contained in a respective one of said instruction cells has an associated index indicating the presence of that value.
10. A processor according to any preceding claim, wherein at least some of said instructions cells are provided with buffers for storing a queue of data values and/or Boolean values and/or control values necessary for successive executions of each instruction of those cells.
11. A processor according to any preceding claim, wherein said memory is an auxiliary memory and the processor has a main memory, from which said instructions are loaded into the auxiliary memory.
12. A digital data processor substantially
**WARNING** end of DESC field may overlap start of CLMS **.

Claims (15)

WHAT WE CLAIM IS:
1. A digital data processor arranged to execute data-flow programs by the concurrent processing of a plurality of instructions, each of which is executed upon the arrival of at least one data value and/or at least one control value which is generated upon the execution of a downstream instruction, the processor comprising:
(a) a read/write memory comprising a plurality of instruction cells, each of which has a respective index, contains a respective one of said instructions, and is arranged to receive each data value and/or control value necessary for the execution of its instruction;
(b) operation means for performing arithmetic and/or logic operations on said data values;
(c) control identity means for identifying said control values;
(d) first arbitration means transmitting signals from said memory to said operation means, each of which signals represents one of said instructions together with each data value necessary for its execution;
(e) second arbitration means transmitting signals representing said control values from said memory to said control identity means;
(f) control means transmitting said signals representing said control values to respective cells of said memory; and
(g) distribution means transmitting signals from said operation means to respective cells of said memory, each of which signals represents the result of an operation of said operation means.
2. A processor according to claim 1, further comprising: decision means for performing comparison operations on data values and outputting corresponding Boolean values; and further control means transmitting signals representing said Boolean values to respective cells of said memory; said first arbitration means transmitting signals from said memory to said decision means, each of which signals represents one of said instructions together with each data value necessary for its execution; and each instruction in said memory being executed upon the arrival of at least one data value, and/or said control value, and/or said Boolean value.
3. A processor according to claim 1 or 2, wherein each of said instruction cells comprises a plurality of registers, one of which contains an instruction, and the or each other of which is for containing a respective data value and/or Boolean value.
4. A processor according to claim 3, wherein each register containing an instruction has an instruction field containing a functional specification and a destination field containing a destination specification.
5. A processor according to claim 4, wherein the functional specifications contain at least one member of a set of operational specifications, to be performed by said operation means.
6. A processor according to claims 2 to 5, wherein each functional specification contains at least one member of a set of operational and decision specifications, to be performed respectively by said operation and decision means.
7. A processor according to claim 4, 5 or 6, wherein each destination field contains at least one of said cell indices and at least one routing specification indicating whether, upon execution of the respective instruction of the cell, the associated signal is to be transmitted via the first or second arbitration means eventually to the cell having said one index.
8. A processor according to claim 3, 4, 5 or 6, wherein each register containing an instruction has a control receipt field and a control status field, the control receipt field indicating the receipt of control values by the respective cell and/or the availability of the respective cell to receive data values and/or Boolean values, and the control status field indicating the number of control values which must be received by the respective cell in order for the instruction to be enabled for execution.
9. A processor according to any preceding claim, wherein each data value or Boolean value contained in a respective one of said instruction cells has an associated index indicating the presence of that value.
10. A processor according to any preceding claim, wherein at least some of said instruction cells are provided with buffers for storing a queue of data values and/or Boolean values and/or control values necessary for successive executions of each instruction of those cells.
11. A processor according to any preceding claim, wherein said memory is an auxiliary memory and the processor has a main memory, from which said instructions are loaded into the auxiliary memory.
12. A digital data processor substantially as hereinbefore described with reference to Figure 1 of the accompanying drawings.
13. A processor according to claim 12 and also substantially as hereinbefore described with reference to Figures 4 to 7 and 10 of the accompanying drawings.
14. A processor according to claim 12 or 13 and also substantially as hereinbefore described with reference to Figures 13 and 14 of the accompanying drawings.
15. A processor according to claim 12, 13 or 14, and also substantially as hereinbefore described with reference to Figures 11 and 16 or Figures 12 and 15 of the accompanying drawings.
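The cell layout recited in claims 3, 4, 7, 8 and 9 can be sketched as a data structure. This is an illustrative model only, under loud assumptions: the names `Register`, `Destination`, `InstructionCell`, `fire` and `example`, the two-operand layout, the `"ADD"`/`"MUL"` function codes and the `"data"`/`"control"` route labels are all hypothetical and do not come from the specification. The arbitration, control and distribution networks are collapsed into direct writes to the target cell, and a cell that starts free to fire is assumed to be loaded with its acknowledge count already satisfied.

```python
import operator
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical functional specifications (the claims do not enumerate them).
OPS = {"ADD": operator.add, "MUL": operator.mul}


@dataclass
class Register:
    value: Optional[int] = None
    present: bool = False  # claim 9: indicator that a value is present


@dataclass
class Destination:
    cell: int          # index of the receiving cell
    route: str         # "data" or "control" (routing specification, claim 7)
    register: int = 0  # operand register to fill; ignored for control signals


@dataclass
class InstructionCell:
    function: str                    # instruction field (claim 4)
    destinations: List[Destination]  # destination field (claims 4 and 7)
    acks_required: int = 0           # control status field (claim 8)
    acks_received: int = 0           # control receipt field (claim 8)
    operands: List[Register] = field(
        default_factory=lambda: [Register(), Register()])

    def enabled(self) -> bool:
        """Enabled when every operand is present and enough control values arrived."""
        return (all(r.present for r in self.operands)
                and self.acks_received >= self.acks_required)


def fire(memory: Dict[int, InstructionCell], index: int) -> int:
    """Execute one enabled cell and deliver its data and control signals."""
    cell = memory[index]
    if not cell.enabled():
        raise RuntimeError(f"cell {index} is not enabled")
    result = OPS[cell.function](*(r.value for r in cell.operands))
    for reg in cell.operands:          # operands are consumed by execution
        reg.value, reg.present = None, False
    cell.acks_received = 0             # must be re-acknowledged before firing again
    for dest in cell.destinations:
        target = memory[dest.cell]
        if dest.route == "data":       # result delivered to an operand register
            reg = target.operands[dest.register]
            reg.value, reg.present = result, True
        else:                          # control value delivered to the receipt field
            target.acks_received += 1
    return result


def example() -> Dict[int, InstructionCell]:
    """Two cells computing (2 + 3) * 4; cell 1 acknowledges cell 0 when it fires."""
    memory = {
        0: InstructionCell("ADD", [Destination(1, "data", 0)],
                           acks_required=1, acks_received=1),
        1: InstructionCell("MUL", [Destination(0, "control")]),
    }
    for index, register, value in [(0, 0, 2), (0, 1, 3), (1, 1, 4)]:
        reg = memory[index].operands[register]
        reg.value, reg.present = value, True
    return memory
```

In this sketch, `fire(memory, 0)` on the memory returned by `example()` delivers the sum to cell 1 and resets cell 0's receipt count, so cell 0 cannot fire again until `fire(memory, 1)` has consumed the value and returned a control signal. Explicit presence flags are used instead of testing for `None` so that a legitimate operand value of zero is not mistaken for an empty register.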
GB37048/77A 1976-09-07 1977-09-05 Digital data processors Expired GB1577658A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US05/721,083 US4145733A (en) 1974-03-29 1976-09-07 Data processing apparatus for highly parallel execution of stored programs

Publications (1)

Publication Number Publication Date
GB1577658A (en) 1980-10-29

Family

ID=24896467

Family Applications (1)

Application Number Title Priority Date Filing Date
GB37048/77A Expired GB1577658A (en) 1976-09-07 1977-09-05 Digital data processors

Country Status (4)

Country Link
JP (1) JPS5334439A (en)
DE (1) DE2740118A1 (en)
FR (1) FR2363833A1 (en)
GB (1) GB1577658A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5561850A (en) * 1978-10-31 1980-05-09 Nippon Telegr & Teleph Corp <Ntt> Computer system
JPH0661110B2 (en) * 1985-05-31 1994-08-10 松下電器産業株式会社 Data processing device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3962706A (en) * 1974-03-29 1976-06-08 Massachusetts Institute Of Technology Data processing apparatus for highly parallel execution of stored programs

Also Published As

Publication number Publication date
FR2363833B1 (en) 1983-04-08
JPS5334439A (en) 1978-03-31
FR2363833A1 (en) 1978-03-31
JPS627580B2 (en) 1987-02-18
DE2740118A1 (en) 1978-03-09

Similar Documents

Publication Publication Date Title
US4145733A (en) Data processing apparatus for highly parallel execution of stored programs
Dennis et al. A preliminary architecture for a basic data-flow processor
US4153932A (en) Data processing apparatus for highly parallel execution of stored programs
US4149240A (en) Data processing apparatus for highly parallel execution of data structure operations
US3978452A (en) System and method for concurrent and pipeline processing employing a data driven network
Smith Architecture and applications of the HEP multiprocessor computer system
US5465368A (en) Data flow machine for data driven computing
US4943916A (en) Information processing apparatus for a data flow computer
US4295193A (en) Machine for multiple instruction execution
US5208914A (en) Method and apparatus for non-sequential resource access
US3962706A (en) Data processing apparatus for highly parallel execution of stored programs
US5471626A (en) Variable stage entry/exit instruction pipeline
US20060259744A1 (en) Method for information processing
US5561808A (en) Asymmetric vector multiprocessor composed of a vector unit and a plurality of scalar units each having a different architecture
US4683547A (en) Special accumulate instruction for multiple floating point arithmetic units which use a putaway bus to enhance performance
Amamiya et al. Dataflow computing and eager and lazy evaluations
Dennis et al. A computer architecture for highly parallel signal processing
US5274777A (en) Digital data processor executing a conditional instruction within a single machine cycle
US4654780A (en) Parallel register transfer mechanism for a reduction processor evaluating programs stored as binary directed graphs employing variable-free applicative language codes
GB1577658A (en) Digital data processors
Treleaven et al. A multi-processor reduction machine for user-defined reduction languages.
JPS6134629A (en) Graph manager
JP3737573B2 (en) VLIW processor
JPH073655B2 (en) Organizing / editing processor

Legal Events

Date Code Title Description
PS Patent sealed
PCNP Patent ceased through non-payment of renewal fee