WO1993011485A1 - Method for ordering events in a parallel data processing system - Google Patents

Method for ordering events in a parallel data processing system

Info

Publication number
WO1993011485A1
Authority
WO
WIPO (PCT)
Prior art keywords
processes
queues
bit
stacks
vectors
Prior art date
Application number
PCT/DK1992/000352
Other languages
French (fr)
Inventor
S. M. G. B. S. Harlequin
Richard John Bird
Original Assignee
KLAUSTRUP, Edel, Kirstine
KLAUSTRUP ANDERSEN, Henning
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by KLAUSTRUP, Edel, Kirstine and KLAUSTRUP ANDERSEN, Henning

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/52: Program synchronisation; Mutual exclusion, e.g. by means of semaphores

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Abstract

A method for controlling many processes or threads of processes operating in parallel with transparent intercommunication between these processes or threads of processes, recursively driven by events. The processes or threads of processes transparently address sets of pairs of complementary vectors which address queues in a shared memory area and are recursively updated after each access which encounters an event whereby the event itself becomes stacked. Stacked events will thus control the sequence of execution in an optimally stochastic manner, not the linear sequence in which the problem was presented, as such a sequence would not execute in parallel.

Description

METHOD FOR ORDERING EVENTS IN A PARALLEL DATA PROCESSING SYSTEM
Background to The Invention
Everywhere in the industrialised world there is an overwhelming and ever-increasing demand for computers with ever higher speed, greater capacity and improved structures for data control. This requirement may be for purposes of research, space programs, meteorology, database programs, pattern recognition, CAD/CAM, artificial intelligence, neural network models and genetics, to give only a few examples. One obvious way to achieve these objectives is through the development of faster processor technology, e.g. the use of new materials for chip construction, bus communication etc., and the development of designs such as RISC. Another approach is that of parallel processing, in which many processors carry out parts of the same task(s) simultaneously. By the use of such a parallel processing approach a task can be speeded up by many orders of magnitude, even using conventional chip designs.
The invention (method) achieves parallel processing using a new structural principle in the organisation of computers. It is embodied in a chip (the QCC-chip, specified as the kernel). Though the cardinal principle is one of parallel processing among a plurality of processors, it may be used for many other purposes, e.g. communication switching, computer network control, etc. Parallel processing has been attempted in many forms, using hardware, software and mixes of the two approaches. What distinguishes this invention from such previous approaches is the use of a totally dynamic and implicitly self-organising method which transparently leads to an automatic ordering of events. This ordering is optimal in processing efficiency, allowing the processors to approach the ideal "theoretical peak" performance as defined by Dongarra in his paper "Performance of Various Computers Using Standard Linear Equations Software" (Mathematical Sciences Section, Oak Ridge National Laboratory, Knoxville USA). To summarise this approach one could say that it implies the structural sorting of chaos into order with coherence, implicitly carried out at the lowest possible level, i.e. machine instruction level.
The State of the Art
To meet the demands for parallel processing many different methods have been developed, and these may be divided into three classes: hardware solutions, software solutions and mixes of the two.
Hardware designs for parallel processing raise many problems involving communication between one or more processors. More or less exotic bus geometries have been attempted, such as rings, stars, hypercubes, butterflies and so on.
Conventional parallel processing systems are extremely expensive in hardware and even more so in software. Hardware solutions may involve elaborate bus structures which raise problems of process scheduling.
Software solutions require special parallel programming languages. These need a massive reprogramming effort if existing software is to be reproduced in the new environment. Also, programming for parallel processes is very complex, since the three major problems, synchronisation, coherence and ordering of events, have not yet found a satisfactory solution.
This is why parallel processing has, up until now, been for the very few.
The Inventive Step
The QCC-system, according to the invention (method), is original by embodying a method distinguished by a new transparent event ordering system, be it physical and/or logical event ordering, which implies synchronisation and coherence in a recursive manner, thus enabling serial processes to be parallelised. This is achieved by introducing a massive amount of variable sized transparent Queue/Stack cultures named PIPES, spanning the complete address space for which event ordering is needed and dynamically controlled by the QCC-Chip, i.e. the invention is distinguished and original by being a totally transparent dynamic recursive system adhering to memory management, as opposed to other systems using Queue/Stack principles.
The QCC is a pipe controller with the ability to transparently handle multiple pipes recursively, arbitrarily organized as queues and/or stacks, thus enabling cross referencing between pipes, between pipes and their related functions, and between functions. It was originated for the purpose of high performance parallel processing, but can be used in any design demanding high performance intercommunication with synchronisation, event-ordering and coherence.
Kernel Description
Fig. 2 relates completely to this paragraph.
The function of the kernel is essentially that of an address handler, which generates the addresses of data in a main memory device according to the sequence of events requesting their access. The kernel performs this function by the use of recursively handled vector pointers to sets of complementary stacks and queues, generically referred to as pipes.
The kernel receives inputs U, L, F and E, these designating the Upper and Lower vector addresses of pipes and the Full and Empty queue bits, together with a Read/Write bit from a participant, a Queue/Stack select bit, a Refocus request bit and a Bit Zero status bit (1 or 0). It also receives an Adders Control, which is the value of a pipe step increment/decrement and which may in the simplest case be unity. The output from the kernel is principally an Address at which data may be found in the memory interlocked between locals, globals and the QCC system, together with Bit Zero, which may be a control bit for queue and/or stack selection, and NMI, a non-maskable interrupt signal.
The mode of operation of the kernel is as follows. If an F & W or an E & R (i.e. full and write, or empty and read) condition is present, the status of the Q/S signal is tested. If it is S, a non-maskable interrupt is emitted by the kernel. If it is Q, then Bit Zero is set to 1 and this is output by the kernel. If neither the F & W nor the E & R combination is detected, the status of the R/W bit is tested. If it is found to be low (write signal), the status of the Q/S bit is tested. If this is Q, the Upper vector address is immediately asserted as the Address output from the kernel and the Refocus bit is tested. If the Refocus bit is zero, U is incremented by the amount of the Adders Control. The R/W bit is then tested and if it is high (read signal) the values of U and L are compared. If they are equal F is set to 1 and if they are unequal F is set to 0. If the value of the R/W bit is low (write signal) the values of U and L are compared. If they are equal E is set to 1 and if they are unequal E is set to 0.
If after testing the F & W and E & R combination the R/W signal is found to be high (read), then the status of the Q/S bit is tested. If it is Q, then L is immediately asserted as the Address output from the kernel and the Refocus bit is tested. If the Refocus bit is zero, L is incremented by the amount of the Adders Control. The R/W bit is then tested and if it is high (read signal) the values of U and L are compared. If they are equal F is set to 1 and if they are unequal F is set to 0. If the value of the R/W bit is low (write signal) the values of U and L are compared. If they are equal E is set to 1 and if they are unequal E is set to 0.
If after testing the F & W and E & R combination the R/W signal is found to be high (read) and the Q/S signal is found to be S, then L is immediately asserted as the Address output from the kernel and the Refocus bit is tested. If the Refocus bit is zero, L is decremented by the amount of the Adders Control. The R/W bit is then tested and if it is high (read signal) the values of U and L are compared. If they are equal F is set to 1 and if they are unequal F is set to 0. If the value of the R/W bit is low (write signal) the values of U and L are compared. If they are equal E is set to 1 and if they are unequal E is set to 0.
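The decision flow described above can be sketched in software. The following Python model is only an illustration under stated assumptions, not the annexed chip implementation: the names `PipeVector`, `KernelOutput` and `kernel_step` are hypothetical, signals are modelled as booleans and integers, and the flag updates are transcribed as the text states them (a read updates F, a write updates E). The text does not describe the write path for a stack, so the sketch raises an error there rather than guess.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipeVector:
    """Hypothetical software stand-in for one pair of complementary vectors."""
    U: int = 0        # upper vector address
    L: int = 0        # lower vector address
    F: bool = False   # full bit
    E: bool = True    # empty bit


@dataclass
class KernelOutput:
    address: Optional[int] = None  # Address output, None when the access is refused
    bit_zero: int = 0              # Bit Zero output
    nmi: bool = False              # non-maskable interrupt output


def kernel_step(v: PipeVector, read: bool, queue: bool,
                refocus: bool = False, step: int = 1) -> KernelOutput:
    """One kernel cycle. read=True models R/W high, queue=True models Q/S = Q,
    step models the Adders Control value."""
    write = not read
    out = KernelOutput()

    # F & W or E & R: the access cannot proceed.
    if (v.F and write) or (v.E and read):
        if queue:
            out.bit_zero = 1   # Q: Bit Zero is set and output
        else:
            out.nmi = True     # S: a non-maskable interrupt is emitted
        return out

    if write:
        if not queue:
            # Assumption: this path is missing from the text, so it is left open.
            raise NotImplementedError("write to a stack is not described in the text")
        out.address = v.U      # U is asserted as the Address
        if not refocus:
            v.U += step
    else:
        out.address = v.L      # L is asserted as the Address
        if not refocus:
            v.L += step if queue else -step  # queue increments, stack decrements

    # Flag update exactly as stated: read sets F, write sets E, from U == L.
    equal = v.U == v.L
    if read:
        v.F = equal
    else:
        v.E = equal
    return out
```

For example, calling `kernel_step(PipeVector(), read=True, queue=True)` on a pipe whose E bit is set returns Bit Zero = 1 and no address, while the same call with `queue=False` raises the NMI output instead.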
Further annexed hereto is an implementation of the QCC chip kernel, entitled Q, made to the above specification, including Fig. 3, by Derik Renton of EVJ Electronics, thus incidentally demonstrating that a technician who is an outsider can follow the above specification.
The QCC-System - Principles of Operation
The QCC-system is built around two complementarily operating vector memories, each with between 512 (minimal system) and 4 giga (maximal in a 32 bit system) entries. Each entry is minimally a 64 bit word, made up of two 32 bit words formatted as shown in "THE COMPLEMENTARY VECTOR FORMAT", page 14. These complementary vector memories contain the upper addresses (U), lower addresses (L), queue/stack offset (D), full (F) and empty (E) status of the queues/stacks, and the semaphores (S0-Sn). The semaphores and their purpose may be defined at will by the users, and the word width of the vector memories may also be expanded for any purpose, such as restart procedures.
The U contains the upper vector addresses of the queues/stacks. The L contains the lower vector addresses of the queues/stacks, both together with their appropriate status being updated as a result of their last operation, according to the rules embodied in the chip design as shown in the drawing named Fig. 2 and described in "Kernel Description", page 4.
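As a rough check on the sizes quoted above, the storage needed for one vector memory follows directly from the entry count and the 64 bit minimum entry width. The sketch below assumes that "4 giga" means 2**32 entries (one per 32 bit address) and ignores any widening of the word for additional semaphores; the function name `vector_memory_bytes` is hypothetical.

```python
ENTRY_BITS = 64                 # minimum entry width stated in the text
ENTRY_BYTES = ENTRY_BITS // 8   # 8 bytes per entry


def vector_memory_bytes(entries: int, entry_bytes: int = ENTRY_BYTES) -> int:
    """Storage for one vector memory holding `entries` entries."""
    return entries * entry_bytes


minimal = vector_memory_bytes(512)      # 512 * 8 = 4096 bytes (4 KiB)
maximal = vector_memory_bytes(2 ** 32)  # assumed reading of "4 giga": 2**35 bytes (32 GiB)

# The system uses two complementary vector memories, so the totals double.
minimal_total = 2 * minimal
maximal_total = 2 * maximal
```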
Whenever an attempt is made to either write to a full pipe (F=1) or read from an empty pipe (E=1), the sequence is terminated and the address of the vectors concerned is recursively stacked in the neighbouring event pipe, asserted by means of bit zero. The attemptator (the process or thread of a process which attempted the read or write) receives a signal that an event occurred (interrupted), whereafter the attemptator may start a new sequence of operations by recursively reading a new vector address from a stack in the neighbouring event pipe, or initiate a new task. The beginning of a sequence is implicit and inherent in the nature of the attemptator's normal interrupt procedure and will succeed when the condition for the termination has been resolved by a process or a thread of a process writing, making it possible to read again (E=0), or by reading, making it possible to write again (F=0). Thus the system is stochastic and operates transparently and recursively in concordance with e.g. a Markov chain principle, fully exploited to any depth of recursivity. It can be seen that the system is alive by any means, i.e. its logical behaviour is complete towards any condition which may arise whatsoever. It will solve a problem of any complexity, even its own, by its recursivity. It cannot, by its own rules, produce disorder. On the contrary it will reduce entropy to the minimum possible level. The system may therefore be regarded as an "entropy machine" similar in operation to the concept of a Maxwell demon used to illustrate thermodynamic theory and by Norbert Wiener in his book "Cybernetics" (page 57).
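The event handling described above, where a refused access stacks the vector address in an event pipe and the attemptator later picks it up again, can be illustrated with a small software model. This is a sketch under assumptions rather than the chip's mechanism: the class names `Pipe` and `EventOrderingModel`, the fixed capacity, and the use of a Python list as the "neighbouring event pipe" are all hypothetical simplifications.

```python
from collections import deque


class Pipe:
    """Bounded queue standing in for one pipe."""

    def __init__(self, capacity: int):
        self.items = deque()
        self.capacity = capacity

    @property
    def full(self) -> bool:    # corresponds to F=1
        return len(self.items) >= self.capacity

    @property
    def empty(self) -> bool:   # corresponds to E=1
        return not self.items


class EventOrderingModel:
    def __init__(self):
        self.pipes = {}        # vector address -> Pipe
        self.event_stack = []  # assumed stand-in for the neighbouring event pipe

    def add_pipe(self, vector_address: int, capacity: int) -> None:
        self.pipes[vector_address] = Pipe(capacity)

    def write(self, vector_address: int, value) -> bool:
        pipe = self.pipes[vector_address]
        if pipe.full:
            # Refused write: the vector address itself becomes a stacked event.
            self.event_stack.append(vector_address)
            return False
        pipe.items.append(value)
        return True

    def read(self, vector_address: int):
        pipe = self.pipes[vector_address]
        if pipe.empty:
            # Refused read: the vector address itself becomes a stacked event.
            self.event_stack.append(vector_address)
            return False, None
        return True, pipe.items.popleft()

    def next_event(self):
        """The attemptator takes the most recently stacked vector address, if any."""
        return self.event_stack.pop() if self.event_stack else None


model = EventOrderingModel()
model.add_pipe(0x10, capacity=2)
ok, _ = model.read(0x10)        # pipe is empty, so the read is refused and stacked
model.write(0x10, "a")          # another process resolves the condition (E=0)
retry_address = model.next_event()
ok_retry, value = model.read(retry_address)  # the retried read now succeeds
```

In this model the order in which work resumes is decided by which accesses were refused and when the blocking condition was cleared, not by the order in which the operations were originally issued.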

Claims

1. A method for ordering events and activities in a mechanical, electromechanical, electronic or other similar device using the principle of queues and stacks controlled by complementary vectors, distinguished by its transparency and the ability that any auxiliary information may become contents of the address vectors and also that the addresses of the address vectors may recursively become the contents of queues or stacks, allowing the system to operate automatically in a recursive manner with synchronization and coherence.
2. A method according to claim 1 in which a process or a number of processes can be controlled by the recursive ordering of queues and stacks controlled by complementary vectors.
3. A method according to claims 1 and 2 in which the queues, stacks, vectors, addresses, control lines for addressing data, instructions and environments are embodied in a microprocessor chip or other device or devices.
4. A method according to claims 1, 2 and 3 in which the control device is embodied in one or more full custom designed chips or uncommitted logic array(s) or other device(s).
5. Methods according to claims 1, 2, 3 and 4 in which this concept may be used wholly and/or partly in any device and/or apparatus for any other purpose(s).
PCT/DK1992/000352 1991-11-26 1992-11-26 Method for ordering events in a parallel data processing system WO1993011485A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DK1924/91 1991-11-26
DK192491A DK192491D0 (en) 1991-11-26 1991-11-26 DEVICE FOR PARALLEL COUPLING OF COMPUTERS

Publications (1)

Publication Number Publication Date
WO1993011485A1 (en) 1993-06-10

Family

ID=8108999

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/DK1992/000352 WO1993011485A1 (en) 1991-11-26 1992-11-26 Method for ordering events in a parallel data processing system

Country Status (3)

Country Link
AU (1) AU3081892A (en)
DK (1) DK192491D0 (en)
WO (1) WO1993011485A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100421797B1 (en) * 1994-12-09 2004-05-20 텔레폰아크티에볼라게트 엘엠 에릭슨 An internal execution thread management system and method thereof

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4819201A (en) * 1983-09-29 1989-04-04 Alain Thomas Asynchronous FIFO device comprising a stack of registers having a transparent condition
EP0340344A2 (en) * 1988-03-11 1989-11-08 International Business Machines Corporation Fast access priority queue
US4980824A (en) * 1986-10-29 1990-12-25 United Technologies Corporation Event driven executive

Also Published As

Publication number Publication date
DK192491D0 (en) 1991-11-26
AU3081892A (en) 1993-06-28

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AT AU BB BG BR CA CH CS DE DK ES FI GB HU JP KP KR LK LU MG MN MW NL NO PL RO RU SD SE US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)

Free format text: PL

EX32 Extension under rule 32 effected after completion of technical preparation for international publication

Ref country code: BY

LE32 Later election for international application filed prior to expiration of 19th month from priority date or according to rule 32.2 (b)

Ref country code: BY

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: CA

122 Ep: pct application non-entry in european phase