WO2002071249A9 - Methods and devices for treating and/or processing data
- Publication number
- WO2002071249A9 (PCT/EP2002/002403)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- bus
- transmitter
- identifier
- transmitters
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/447—Target code generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/78—Architectures of general purpose stored program computers comprising a single central processing unit
- G06F15/7867—Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/448—Execution paradigms, e.g. implementations of programming paradigms
- G06F9/4494—Execution paradigms, e.g. implementations of programming paradigms data driven
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K19/00—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
- H03K19/02—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components
- H03K19/173—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components
- H03K19/177—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components arranged in matrix form
Definitions
- The invention describes methods and devices for managing and transferring data within multidimensional arrangements of transmitters and receivers.
- the division of a data stream into several independent branches and the subsequent combination of the individual branches into a data stream should be easy to carry out, the individual data streams being summarized again in the correct chronological order. This process is particularly important for processing reentrant code.
- The described method is particularly suitable for configurable architectures; the efficient control of configuration and reconfiguration receives special attention.
- the object of the invention is to provide something new for commercial use.
- the solution to the problem is claimed independently.
- Preferred embodiments are in the subclaims.
- A reconfigurable architecture is understood to mean modules (VPU) with configurable function and/or networking, in particular integrated modules with a plurality of arithmetic and/or logical and/or analog and/or storing and/or internally/externally networking modules arranged in one or more dimensions, which are connected to each other directly or through a bus system.
- This class of modules includes, in particular, systolic arrays, neural networks, multiprocessor systems, processors with several arithmetic units and/or logical cells and/or communicative/peripheral cells (IO), networking and network modules such as crossbar switches, as well as known modules of the types FPGA, DPGA, Chameleon, XPUTER, etc.
- the above architecture is used as an example for clarification and is referred to below as the VPU.
- The architecture consists of any arithmetic, logical (also memory) and/or memory cells and/or network cells and/or communicative/peripheral (IO) cells (PAEs), which can be arranged in a one- or multi-dimensional matrix (PA); the matrix can contain different cells of any design, and the bus systems are also understood as cells.
- a configuration unit (CT) is assigned to the matrix as a whole or in part, which influences the networking and function of the PA.
- the configurable cells of a VPU must be synchronized with each other for the correct processing of data. Two different protocols are used for this purpose, one for the synchronization of the data traffic and another for the sequence control of the data processing.
- Data is preferably transmitted via a plurality of configurable bus systems. Configurable bus systems mean in particular that any PAE can send data and that the connection to the receiver PAEs, and in particular the choice of receiver PAEs, can be configured as desired.
- The data traffic is preferably synchronized using handshake protocols that are transmitted with the data.
- The following description covers simple handshakes and complex methods, the preferred use of which depends on the particular application or set of applications to be executed.
- the sequence control is carried out by signals (triggers) which
- Triggers can be used independently of the
- Triggers are generated by a status of a sending PAE (e.g. zero flag, overflow flag, negative flag) by forwarding individual states or combinations of states.
- Data processing cells (PAEs) within a VPU can assume different processing states, which depend on the configuration state of the cells and/or on incoming triggers: "not configured":
- STOP: incoming data are not calculated. STEP: exactly one calculation is carried out.
- GO, STOP and STEP are triggered by the triggers described below.
- a particularly simple, yet very powerful handshake protocol which is preferably used for the transmission of data and triggers, is described below.
- The control of the handshake protocols is preferably predetermined in hardware and can represent an essential part of the data processing paradigm of a VPU.
- the basics of this protocol are already described in PACT02.
- An RDY signal is sent with every piece of information sent by a transmitter via any bus, indicating the validity of the information.
- The receiver processes only information that is accompanied by an RDY signal; all other information is ignored.
- A special task of handshake protocols for VPUs is to support pipeline-like data processing, in which data can be processed in each PAE in every clock cycle. This requirement places special demands on the way handshakes work.
- the problem and solution of this task is shown using the example of an RDY / ACK protocol:
- Figure 1a shows a structure of a pipeline within a VPU.
- the data are led via (preferably configurable) bus systems (0107, 0108, 0109) to registers (0101, 0104), which are followed by data processing logic (0102, 0105), if applicable.
- This is assigned an output stage (0103, 0106), which preferably again contains a register in order to connect the results to a bus again.
- Both the bus systems (0107, 0108, 0109) and the data processing logic (0102, 0105) preferably transmit the RDY/ACK protocol for synchronization.
- a) ACK means "Receiver will take over data", with the effect that the pipeline works in every cycle.
- b) ACK means "Receiver has taken over data", with the effect that the ACK only ever runs to the next stage, where there is a register. The problem that arises from this is that the pipeline only works in every second cycle, because of the delay of the register required in the hardware implementation.
- Protocol b) is used in that a register (0110) delays the incoming RDY by one clock, with the writing of the transferred data into an input register, and forwards it on the bus as ACK.
- This stage (0110) works as a kind of protocol converter between a bus protocol and the protocol within a data processing logic.
- The data processing logic uses protocol a). This is generated by a downstream protocol converter (0111). The special feature of 0111 is that a prediction must be made as to whether the data coming from the data processing logic will actually be accepted by the bus system. This is solved by introducing an additional buffer register (0112) in the output stages (0103, 0106) for the data to be transferred to the bus system. The data generated by the data processing logic are written simultaneously to the bus system and to the buffer register. If the bus cannot accept the data, i.e. the ACK of the bus system is absent, the data are available in the buffer register and are switched to the bus system via a multiplexer (0113) as soon as the bus system is ready.
- the data is forwarded directly to the bus via multiplexer (0113).
- the buffer register enables acknowledgment with the semantics a), since "Receiver will accept data" can be acknowledged as long as the buffer register is empty, since writing to the buffer register ensures that the data is not lost.
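- The role of the buffer register can be illustrated with a minimal behavioral sketch in Python. It is not taken from the patent: the class name `OutputStage`, the method names and the one-word-per-clock model are assumptions made for illustration, and the model is not cycle-accurate hardware.

```python
class OutputStage:
    """Behavioral model of an output stage with one buffer register.

    Names are hypothetical; `buffer` stands in for register 0112 and the
    branches in `cycle` for multiplexer 0113.
    """

    def __init__(self):
        self.buffer = None  # None means the buffer register is empty

    def can_accept(self):
        # Semantics a): "receiver will accept data" can be acknowledged
        # as long as the buffer register is empty.
        return self.buffer is None

    def cycle(self, new_data, bus_ack):
        """Advance one clock; return the word the bus took over, or None."""
        if self.buffer is not None:
            if new_data is not None:
                raise ValueError("data offered while buffer is occupied")
            if bus_ack:
                # The multiplexer selects the buffered word once the bus is ready.
                word, self.buffer = self.buffer, None
                return word
            return None
        if new_data is None:
            return None
        if bus_ack:
            return new_data  # forwarded directly, buffer stays empty
        self.buffer = new_data  # bus not ready: the word is kept, not lost
        return None


stage = OutputStage()
trace = [
    stage.cycle("d0", bus_ack=True),   # bus ready: d0 passes straight through
    stage.cycle("d1", bus_ack=False),  # bus busy: d1 is parked in the buffer
    stage.can_accept(),                # buffer occupied, no ACK towards the logic
    stage.cycle(None, bus_ack=True),   # bus ready again: d1 is delivered
]
print(trace)
```

- In this sketch the data processing logic may hand over a word whenever `can_accept()` is true, which corresponds to acknowledging with semantics a) while the buffer register is empty.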
- Triggers, which are described in PACT08, are used in VPU modules for the transmission of simple information. Triggers are transmitted using a one- or multi-dimensional bus system divided into segments. The individual segments can be equipped with drivers to improve the signal quality.
- the respective trigger connections which are implemented by interconnecting several segments, are programmed by the user and configured via the CT.
- triggers primarily, but not exclusively, transmit the following information or any combination of these
- Triggers are generated by any cells and are triggered by any events in the individual cells.
- Triggers can also be generated by a CT or by an external unit outside the cell array or the module.
- Triggers are received by any cells and evaluated in any way.
- Triggers can also be evaluated by a CT or by an external unit outside the cell array or the module.
- Triggers are mainly used for process control within a VPU, for example for comparisons and / or loops. Data paths and / or branches can be enabled or disabled using triggers. Another important area of application for triggers is the synchronization and control of sequencers, as well as their information exchange; as well as the control of data processing in the cells.
- the management of triggers and the control of data processing can be done according to the state of the art by a permanently implemented state machine (see PACT02, PACT08), by a finely configured state machine (see PACT01, PACT04, PACT08, [Chameleon]) or preferably by a programmable state machine (PACT13).
- the programmable state machine is configured according to the procedure to be carried out.
- The EPS448 device from Altera [ALTERA Data Book 1993] realizes such a programmable sequencer.
- RDY is pulsed, i.e. it is asserted for exactly one clock, so that the data is not incorrectly read multiple times.
- this control is stored for the period of data transmission (RdyHold).
- the effect of this is that the position of the gates and / or multiplexers and / or other suitable transmission elements remains valid even after the RDY pulse and thus there is still valid data on the bus.
- ACK is preferably also transmitted as a pulse. If an ACK passes through a multiplexer and / or gate and / or another suitable one
- One solution is to pulse ACK as a matter of principle and to store the incoming ACK of each branch at the branching point. Only when the ACKs of all branches have arrived is an ACK pulse forwarded towards the transmitter, and at the same time all stored ACKs (AckHold) and, if applicable, the RdyHold are cleared.
- Figure 1c shows the basics of the method.
- a transmitter 0120 sends data via a bus system 0121 together with an RDY 0122.
- Several receivers (0123, 0124, 0125, 0126) receive the data and the associated RDY (0122).
- Each receiver generates an ACK (0127, 0128, 0129, 0130); these are combined by suitable Boolean logic (0131, 0132, 0133), e.g. a logical AND function, and sent to the transmitter (0134).
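- The ACK handling for one transmitter and several receivers (store the ACK pulse of each branch, forward a single ACK pulse once all have arrived, then clear the stored ACKs) can be sketched as follows. The class name `AckCollector` and its interface are assumptions for illustration only.

```python
class AckCollector:
    """Collects pulsed ACKs from several branches (hypothetical AckHold model)."""

    def __init__(self, n_branches):
        self.ack_hold = [False] * n_branches  # one stored ACK per branch

    def clock(self, ack_pulses):
        """ack_pulses: indices of branches pulsing ACK in this clock.

        Returns True when an ACK pulse is forwarded to the transmitter.
        """
        for branch in ack_pulses:
            self.ack_hold[branch] = True
        if all(self.ack_hold):  # logical AND over all stored ACKs
            self.ack_hold = [False] * len(self.ack_hold)  # clear AckHold
            return True
        return False


collector = AckCollector(3)
pulses = [collector.clock([0]), collector.clock([2]), collector.clock([1])]
print(pulses)
```

- The transmitter only sees an ACK in the clock in which the last outstanding branch acknowledges, regardless of the order in which the branches respond.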
- FIG. 1c shows a possible preferred embodiment with 2 receivers (a, b).
- An output stage (0103) sends data and the associated RDY (0131) pulsed in this example.
- RdyHold levels (0130) in front of the target PAEs translate the pulsed RDY into a standing RDY.
- A standing RDY should have the Boolean value b'1.
- The contents of all RdyHold levels are returned to 0103 via a chain of logical OR functions (0133). If a target PAE confirms the acceptance of the data, the respective RdyHold level is reset only by the incoming ACK (0134).
- b'1 means "some PAE has not accepted the data".
- a simple n: 1 transmission can be realized by routing several data paths to the inputs of PAEs.
- The PAEs are configured as multiplexer stages. Incoming triggers control the multiplexer and each select one of the plurality of data paths. If necessary, tree structures can be constructed from PAEs configured as multiplexers in order to combine a large number of data streams (large n). The method requires special care from the programmer to sort the different data streams correctly in time. In particular, all data paths should be of the same length and / or
- FIG. 2 shows a first possible implementation example.
- a FIFO (0206) is used to correctly store and process the chronological order of transmission requests to a bus system (0208).
- each transmitter (0201, 0202, 0203, 0204) is assigned a unique number that represents its address.
- Each transmitter requests data transmission to the 0208 bus system by displaying its address on a bus (0209, 0210, 0211, 0212).
- the respective addresses are stored in a FIFO (0206) via a multiplexer (0205) in accordance with the sequence of the send requests.
- The FIFO is processed step by step and the address of the respective FIFO entry is displayed on another bus (0207).
- This bus addresses the transmitters and the transmitter with the appropriate address receives access to bus 0208.
- The internal memory of the VPU technology can be used as a FIFO (cf. PACT04, PACT13).
- An additional counter (REQCNT, 0301) counts the clocks t.
- Each transmitter (0201, 0202, 0203, 0204) that requests transmission at clock t stores the value of REQCNT at clock t (REQCNT(t)) as its address.
- Each transmitter that requests transmission at clock t + 1 stores the value of REQCNT at clock t + 1 (REQCNT(t + 1)) as its address.
- Each transmitter that requests transmission at clock t + n stores the value of REQCNT at clock t + n (REQCNT(t + n)) as its address.
- the FIFO (0206) now stores the values of REQCNT (tb) at a particular clock tb.
- the FIFO shows a stored value of REQCNT as a request to send on a separate bus (0207).
- Each transmitter compares this value with the one it has saved. If the values are the same, it sends the data. If several transmitters have the same value, i.e. if data are to be transmitted at the same time, the transmission is now arbitrated using a suitable arbiter (CHNARB, 0302b) and switched to the bus using a multiplexer (0302a) controlled by the arbiter.
- the FIFO advances to the next value. If the FIFO no longer contains any valid entries (empty), the values are marked as invalid so that there are no incorrect bus accesses.
- Only those values of REQCNT are stored in the FIFO (0206) at whose clock there was a bus request from a transmitter (0201, 0202, 0203, 0204). For this purpose, each transmitter signals its bus request (0310, 0311, 0312, 0313). These are logically linked (0314), e.g. through an OR function.
- the resulting send request from all transmitters (0315) is routed to a gate (0316) that only forwards the values from REQCNT to the FIFO (0206) for which there was actually a bus request.
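- The scheme with REQCNT, FIFO and arbiter can be sketched in Python as follows. The function name `schedule`, the list-based input format and the arbiter policy (lowest transmitter number first) are assumptions for illustration; the sketch also assumes that each transmitter has at most one outstanding request.

```python
from collections import deque


def schedule(requests_per_clock):
    """requests_per_clock[t] lists the transmitters requesting at clock t.

    Returns (REQCNT value, transmitter) pairs in the order of bus grants.
    """
    fifo = deque()  # stores REQCNT values of clocks with a bus request
    stored = {}     # transmitter -> REQCNT value saved as its address
    for reqcnt, requesting in enumerate(requests_per_clock):
        if requesting:  # OR of all bus requests opens the gate to the FIFO
            fifo.append(reqcnt)
            for tx in requesting:
                stored[tx] = reqcnt

    grants = []
    while fifo:
        value = fifo.popleft()  # value presented on the separate bus
        # Every transmitter compares the value with its saved one.
        matching = sorted(tx for tx, v in stored.items() if v == value)
        for tx in matching:     # assumed arbiter: lowest number first
            grants.append((value, tx))
            del stored[tx]
    return grants


print(schedule([[2], [], [0, 1], [3]]))
```

- Clock 1 has no request, so no REQCNT value is stored for it and the FIFO never presents a value that no transmitter is waiting for.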
- According to a preferred embodiment shown in Figure 4, the method described can be further optimized as follows: REQCNT (0410) generates a gapless linear sequence of values (REQCNT(tb)) if, instead of all clocks t, only those clocks are counted in which a bus request from a transmitter (0315) exists. Because of this gapless linear sequence, the FIFO can be replaced by a simple counter (SNDCNT, 0402), which also counts linearly and whose value (0403) enables the respective transmitters in the same way as 0207. SNDCNT continues to count as soon as no transmitter responds to the value of SNDCNT. As soon as the value of REQCNT is equal to the value of SNDCNT, SNDCNT stops counting because the last value has been reached.
- The maximum required width of REQCNT is log2(number_of_senders). If the largest possible value is exceeded, REQCNT and SNDCNT start again at the minimum value (usually 0).
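- The counter-based variant, including the wrap-around of REQCNT and SNDCNT, can be sketched as follows. The function name, the parameters `width_bits` and `start`, and the arbiter policy (lowest transmitter number first) are assumptions for illustration. The sketch processes all requests before granting and therefore assumes that fewer than 2**width_bits request clocks are outstanding and that each transmitter has at most one outstanding request.

```python
def schedule_with_counters(requests_per_clock, width_bits=2, start=0):
    """Grant order when the FIFO is replaced by the counter SNDCNT."""
    modulo = 1 << width_bits
    reqcnt = start % modulo
    stored = {}  # transmitter -> REQCNT value saved as its address
    for requesting in requests_per_clock:
        if requesting:  # REQCNT counts only clocks with a bus request
            for tx in requesting:
                stored[tx] = reqcnt
            reqcnt = (reqcnt + 1) % modulo  # wrap to the minimum value

    sndcnt = start % modulo
    grants = []
    while sndcnt != reqcnt:  # SNDCNT stops when it reaches REQCNT
        matching = sorted(tx for tx, v in stored.items() if v == sndcnt)
        for tx in matching:  # assumed arbiter: lowest number first
            grants.append(tx)
            del stored[tx]
        sndcnt = (sndcnt + 1) % modulo
    return grants


# Counters start at 3 with 2 bits, so the value sequence is 3, 0, 1.
print(schedule_with_counters([[2], [], [0, 1], [3]], width_bits=2, start=3))
```

- Because REQCNT skips clocks without requests, every value between SNDCNT and REQCNT belongs to at least one waiting transmitter, which is what allows a plain counter to take the place of the FIFO.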
- arbiters can be used as CHNARB according to the prior art.
- Either prioritized or unprioritized arbiters may be the better fit, with prioritized ones offering the advantage that they can favor certain transmissions for real-time tasks.
- A serial arbiter is described below, which is particularly simple and resource-saving to implement in VPU technology.
- The arbiter offers the advantage of working with priorities, which enables the preferred processing of certain transmissions.
- Blocks of the VPU type have a network of parallel data bus systems (0502), with each PAE having at least one connection to a data bus for data transmission.
- A network is usually built up from several equivalent parallel data buses (0502), where one data bus at a time can be configured for a data transmission. The remaining data buses can be freely available for other data transfers.
- the data buses can be segmentable, i.e. configuration (0521) enables a bus segment (0502) to be switched through to the neighboring bus segment (0522) via gates (G).
- the gates (G) can be constructed from transmission gates and preferably have signal amplifiers and / or registers.
- A PAE (0501) preferably taps data from one of the buses (0502) via multiplexers (0503) or a comparable circuit.
- The activation of the multiplexer arrangement can be configured (0504).
- The data (results) generated by a PAE (0510) are preferably connected to a bus (0502) via a similar, independently configurable (0505) multiplexer circuit.
- the circuit described in FIG. 5 is referred to as a bus node.
- A simple arbiter for a bus node can be implemented as follows, as shown in FIG. 6:
- the basic element 0610 of a simple serial arbiter can be constructed using two AND gates (0601, 0602).
- Figure 6a: The basic element has an input (RDY, 0603) through which an input bus indicates that it is transmitting data and requests to be switched onto the receiver bus.
- Another input (ACTIVATE, 0604) which in this example indicates by a logic 1 level that none of the previous ones
- An output RDY_OUT (0605) indicates to a downstream bus node that the basic element enables bus access (if there is a bus request (RDY)), and ACTIVATE_OUT (0606) indicates that the basic element currently does not (or no longer) enable bus access, since no bus request (RDY) exists (any longer) and/or no previous arbiter stage has occupied the receiver bus (ACTIVE).
- Serial chaining of ACTIVATE and ACTIVATE_OUT across the basic elements 0610, as shown in FIG. 6b, creates a serial prioritizing arbiter, the first basic element having the highest priority and its ACTIVATE input always being activated.
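- The serial prioritizing arbiter can be sketched as a chain of basic elements. The exact wiring of the two AND gates is not given in the text above; the sketch assumes that RDY_OUT = RDY AND ACTIVATE and that ACTIVATE_OUT = ACTIVATE AND NOT RDY. The function names are hypothetical.

```python
def basic_element(rdy, activate):
    """One arbiter stage built from two AND gates (wiring is an assumption)."""
    rdy_out = rdy and activate           # grant: request present and bus still free
    activate_out = activate and not rdy  # pass the free bus on to the next stage
    return rdy_out, activate_out


def serial_arbiter(rdy_inputs):
    """Chain basic elements; index 0 has the highest priority."""
    activate = True  # ACTIVATE of the first element is always activated
    grants = []
    for rdy in rdy_inputs:
        rdy_out, activate = basic_element(rdy, activate)
        grants.append(rdy_out)
    return grants


print(serial_arbiter([False, True, True]))
```

- With requests on inputs 1 and 2, only input 1 is granted, because its ACTIVATE_OUT is deasserted and blocks every later stage in the chain.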
- the method can be used over long distances. From a length dependent on the system frequency, the transmission of data and execution of the protocol are no longer possible in one cycle.
- One solution is to design the data paths to be exactly the same length and to merge them in exactly one place. This means that all control signals for the protocol are local, which makes it possible to increase the system frequency.
- A considerably better solution, in which data paths can also be merged in a tree, can be constructed as follows:
- FIG. 7a shows an example of a CASE-like construct.
- a REQCNT (0702) is assigned at the latest to the last PAE before a branch (0701), which assigns a value (timestamp) to each data word, which is then always transmitted together with the data word.
- REQCNT continues to count linearly with each data word, so that the position of a data word within a data stream can be determined by a unique value.
- the data words subsequently branch into several different data paths (0703, 0704, 0705). With each data word, the value assigned to it (timestamp) is routed through the data paths.
- a multiplexer (0707) sorts the data words back into the correct order before the PAE (0708) that processes the merged data path.
- a linearly counting SNDCNT (0706) is assigned to the multiplexer for this purpose.
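- The combination of timestamping before the branch and re-sorting before the merge can be sketched as follows. The function names `stamp`, `branch` and `merge`, and the use of queues for the data paths, are assumptions for illustration; path latencies are not modeled.

```python
from collections import deque


def stamp(words):
    """REQCNT: attach a linearly increasing timestamp to each data word."""
    return [(ts, word) for ts, word in enumerate(words)]


def branch(stamped, select_path, n_paths):
    """Distribute stamped words over several data paths (CASE-like construct)."""
    paths = [deque() for _ in range(n_paths)]
    for ts, word in stamped:
        paths[select_path(word)].append((ts, word))
    return paths


def merge(paths):
    """Multiplexer with SNDCNT: forward the word whose timestamp is expected."""
    total = sum(len(p) for p in paths)
    sndcnt = 0
    out = []
    while len(out) < total:
        for p in paths:
            if p and p[0][0] == sndcnt:  # comparator of this data path matches
                out.append(p.popleft()[1])
                sndcnt += 1              # SNDCNT counts linearly
                break
    return out


words = [5, 2, 8, 1, 9, 4]
paths = branch(stamp(words), lambda w: w % 3, 3)
print(merge(paths))
```

- Since every path preserves the order of its own words, the word carrying the expected timestamp is always at the head of exactly one path, so the merged stream has the original order.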
- In order to achieve the highest possible clock frequency, the data paths must be combined very locally. This minimizes the line lengths and keeps the associated runtimes short.
- The lengths of the data paths can be balanced by register stages (pipelines) until all data paths can be brought together at a common point. Care should be taken to ensure that the lengths of the pipelines are approximately the same in order not to get too large a time shift between the data words.
- the PAE-E each have a different, permanently configured address, which is compared with the TimeStamp bus.
- the PAE-S selects the receiving PAE by outputting the address of the receiving PAE on the TimeStamp bus. This addresses the PAE for which the data is intended.
- Speculative execution and task switch: The problem of speculative execution is known from classic microprocessors. It occurs when the processing of data depends on a result of the previous data processing, but, for performance reasons, the processing of the dependent data is started in advance, without the required result being available. If the result differs from what was assumed, the processing of the data based on incorrect assumptions must be carried out again (incorrect speculation). In general, this can also occur in VPUs.
- A similar problem exists if the data processing is interrupted, before it has been carried out completely, by a unit that is superior to the data processing within the PA (e.g. the task scheduler of an operating system, a real-time request, etc.).
- the state of the pipeline must be saved in such a way that data processing starts again after the location of the operands that led to the calculation of the last finished result.
- The state MISS_PREDICT can also be used, which indicates that incorrect speculation has occurred. Alternatively, this state can also be generated by negating the DONE state at a suitable time.
- Data is usually processed linearly in VPUs, so that the FIFO operating mode is often preferred.
- a special expansion of the memory for the FIFO operating mode is to be presented as an example, which directly supports speculation and, in the event of incorrect speculation, enables repeated processing of the incorrectly speculated data.
- the FIFO also supports task switches at any time.
- The extended FIFO operating mode is explained using the example of a memory that is accessed for reading (read side) as part of a specific data processing.
- the exemplary FIFO is shown in FIG. 8.
- the structure of the write circuit corresponds to the state of the art with a conventional write pointer (WR_PTR, 0801), which moves on with each write access (0810).
- the read circuit has, for example, the usual counter (RD_PTR, 0802), which counts each word read according to a read signal (0811) and modifies the read address of the memory (0803) accordingly.
- DONE_PTR (0804) records not the data that has been read, but the data that has been read and correctly processed; in other words, only the data for which no errors occurred, whose result was output at the end of the calculation, and for which the correct end of the calculation was indicated by a signal (0812). Possible circuits are described below.
- The FULL flag (0805) (according to the prior art), which indicates that the FIFO is full and no further data can be stored, is now generated by a comparison (0806) of DONE_PTR with WR_PTR. This ensures that data that may have to be accessed again because of a possible incorrect speculation is not overwritten.
- The EMPTY flag (0807) is generated according to the usual structure by comparing (0808) RD_PTR with WR_PTR. If an incorrect speculation occurred (MISS_PREDICT, 0809), the read pointer is loaded with the value DONE_PTR + 1. This restarts the data processing at the value that triggered the incorrect speculation.
- Two possible implementations of DONE_PTR are described in more detail by way of example: a) Implementation by a counter
- DONE_PTR is implemented as a counter, which is set to RD_PTR when the circuit is reset or at the start of data processing.
- An incoming signal (DONE) indicates that the data has been processed successfully, i.e. without incorrect speculation. This modifies DONE_PTR in such a way that it points to the next data word being processed.
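- The read side of such a FIFO, with DONE_PTR implemented as a counter, can be sketched as follows. The class and method names are hypothetical. The sketch assumes a convention in which `done_ptr` points to the oldest word that has not yet been confirmed, so a restart loads `rd_ptr` with `done_ptr`; under the convention in which DONE_PTR points to the last confirmed word, this corresponds to loading DONE_PTR + 1.

```python
class SpeculativeFifo:
    """FIFO whose entries are only released once processing is confirmed."""

    def __init__(self, depth):
        self.mem = [None] * depth
        self.depth = depth
        self.wr_ptr = 0    # WR_PTR: next slot to write
        self.rd_ptr = 0    # RD_PTR: next word to read (possibly speculatively)
        self.done_ptr = 0  # oldest word not yet confirmed (assumed convention)

    def full(self):
        # FULL compares DONE_PTR with WR_PTR, not RD_PTR with WR_PTR.
        return self.wr_ptr - self.done_ptr == self.depth

    def empty(self):
        return self.rd_ptr == self.wr_ptr

    def write(self, word):
        if self.full():
            raise OverflowError("FIFO full")
        self.mem[self.wr_ptr % self.depth] = word
        self.wr_ptr += 1

    def read(self):
        if self.empty():
            raise IndexError("FIFO empty")
        word = self.mem[self.rd_ptr % self.depth]
        self.rd_ptr += 1
        return word

    def done(self):
        """DONE: the oldest outstanding word was processed correctly."""
        if self.done_ptr == self.rd_ptr:
            raise RuntimeError("no outstanding word to confirm")
        self.done_ptr += 1

    def miss_predict(self):
        """MISS_PREDICT: re-read everything that was not confirmed."""
        self.rd_ptr = self.done_ptr


fifo = SpeculativeFifo(4)
for w in "abcd":
    fifo.write(w)
first_reads = [fifo.read(), fifo.read(), fifo.read()]
fifo.done()          # only "a" is confirmed
fifo.miss_predict()  # "b" and "c" were processed on a wrong assumption
print(first_reads, fifo.read())
```

- Words that have been read but not confirmed still count as occupied for the FULL flag, so the writer cannot overwrite them before a possible restart.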
- b) A subtractor can be used. The length of the pipeline from the connection of the memory to the detection of a possible incorrect speculation is stored in an assigned register.
- the data processing must be restarted after incorrect speculation on the data word, which can be calculated from the difference.
- A correspondingly configured memory is required on the write side in order to save the result of the data processing of a configuration, the function of DONE_PTR being implemented for the write pointer in order to overwrite already (incorrectly) calculated results when the data processing is run again.
- the function of the read / write pointer is reversed according to the addresses bracketed in the drawing.
- FIFOs for input / output levels e.g. 0101, 0103
- FIFOs have adjustable latency times, so that the delay of different edges/branches, i.e. the runtime of data over different but mostly parallel data paths, can be matched.
- A FIFO stage can be constructed as follows, for example, as shown in FIG. 9: A register (0901) is followed by a multiplexer (0902). The register stores the data (0903) and its validity, i.e. the associated RDY (0904). The data is written into the register when the neighboring FIFO stage that is closer to the output (0920) of the FIFO indicates that it is full (0905) and there is an RDY (0904) for the data.
- The multiplexer forwards incoming data (0903) directly to the output (0906) until data has been written into the register and the FIFO stage itself is thus full, which is indicated (0907) to the neighboring FIFO stage that is closer to the input (0921) of the FIFO.
- the acceptance of data in a FIFO stage is confirmed with an input acknowledge (IACK, 0908).
- the acceptance of data from a FIFO is confirmed by output acknowledge (OACK, 0909).
- OACK reaches all FIFO levels and causes the data in the FIFO to be pushed forward by one level at a time.
- a new data word is routed past the registers via the multiplexers of the individual FIFO stages.
- the first full FIFO stage (1001) signals the previous stage (1002) based on the stored RDY that it cannot accept any data.
- The previous stage (1002) has no RDY stored, but knows the "full" state of the following stage (1001). Therefore, the stage stores the data and RDY (1003) and acknowledges the storage by an ACK to the transmitter.
- the multiplexer (1004) of the FIFO stage switches over in such a way that it no longer forwards the data path to the subsequent stage, but the content of the register.
- the first full stage (1012) stores the data. As previously described, their data are stored in the same cycle by the subsequent stage. In other words: New data to be written automatically slip into the first free FIFO stage (1012), i.e. the previous last full FIFO level that was emptied when ACK arrived.
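- The behavior of the chained FIFO stages (new data slips past empty stages into the first free stage, and an output acknowledge shifts all stored words forward by one stage) can be sketched as follows. This is a behavioral abstraction, not a cycle-accurate model of the registers and multiplexers; the class name and methods are hypothetical, and a word held in the stage nearest the output stands for a word the receiver has not yet taken.

```python
class FallThroughFifo:
    """Chain of FIFO stages; index 0 is the stage closest to the output."""

    def __init__(self, n_stages):
        self.stages = [None] * n_stages  # None means the stage register is empty

    def push(self, word):
        """Offer a word at the input; returns True if it is acknowledged (IACK)."""
        for i, content in enumerate(self.stages):
            if content is None:
                # Empty stages pass the word on; it lands in the first free
                # stage directly behind the full ones.
                self.stages[i] = word
                return True
        return False  # every stage is full, no acknowledge

    def oack(self):
        """Output acknowledge: hand over the front word, shift the rest forward."""
        word = self.stages[0]
        self.stages = self.stages[1:] + [None]
        return word


fifo3 = FallThroughFifo(3)
accepted = [fifo3.push(w) for w in ("a", "b", "c", "d")]
front = fifo3.oack()
print(accepted, front, fifo3.stages)
```

- After the acknowledge, the stage that was previously the last full one is free again, so the next word offered at the input is stored there.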
- a total of 3 methods are available for merging the data streams, which are suitable depending on the application: a) Local Merge b) Tree Merge c) Memory Merge
- a local SNDCNT uses a multiplexer to select exactly the data word whose timestamp corresponds to the value of SNDCNT and is therefore currently expected. Two possibilities will be explained in more detail with reference to FIGS. 7a and 7b: a) A counter SNDCNT (0706) continues to count for each incoming data packet; A comparator is connected for each data path, which compares the counter reading with the timestamp of the data path. If the values match, the current data packet is forwarded to the subsequent PAEs via the multiplexer. b) The solution according to a) is expanded such that according to the
- a target data path is assigned.
- the source data path is determined by comparing (0712) the timestamp arriving with the data in accordance with method a) with an SNDCNT (0711) and addressing (0714) the corresponding data path and selecting it via a multiplexer (0713).
- the address (0714) is assigned to a target data path address (0715) by means of the exemplary lookup table (0710) which uses a demultiplexer (0716) to select the target path.
- the data connection to the PAE (0718) assigned to the bus node can also be established via the exemplary lookup table (0710), for example via a gate function (transmission gates) (0717) to the input of the PAE.
- a PAE (0720) has 3 data inputs (A, B, C), such as in the XPU128ES.
- Bus systems (0733) can be configured and / or multiplexed and selected per clock cycle to be connected to the data inputs. Each bus system transmits data, handshakes and the assigned timestamp (0721).
- Inputs A and C of the PAE (0720) are used to forward the timestamp of the data channels to the PAE (0722, 0723).
- Timestamp can be bundled, for example, using the SIMD bus system described below.
- The bundled timestamp is separated again in the PAE, and each timestamp (0725, 0726, 0727) is compared individually with an SNDCNT (0724) implemented/configured in the PAE (0728).
- The results of the comparisons are used to control the input multiplexer (0730) such that the bus system with the correct timestamp is switched through to a busbar (0731).
- The busbar is preferably connected to input B in order to enable data to be forwarded to the PAE in accordance with 0717, 0718.
- The output demultiplexers (0732) for forwarding the data to different bus systems are also controlled by the results, the results preferably being rearranged by a flexible translation, for example by a lookup table (0729), so that the results can be freely assigned to the bus systems to be selected by the demultiplexers (0732).
- Points to merge parts of a data stream The result is a tree-like structure.
- the problem arises that a central decision about the selection of a data word cannot be made, but that the decision is distributed over several nodes. It is therefore necessary to transmit the respective value of SNDCNT to all nodes. At high clock frequencies, however, this is only possible with a latency period that arises, for example, from several register stages during the transmission. As a result, this solution initially does not offer meaningful performance.
- One method of improving performance is to allow local decisions in each node, independently of the value of SNDCNT.
- A simple approach, for example, is to select the data word with the smallest timestamp at a node. However, this approach becomes problematic when a data path at a node does not supply a data word for a clock. It then cannot be decided which data path is the preferred one.
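- The simple local rule and its weakness can be made concrete with a short sketch. The function name `select_smallest` and the representation of a missing data word as `None` are assumptions for illustration.

```python
def select_smallest(heads):
    """heads[i] is the (timestamp, word) waiting on input path i, or None.

    Returns the index of the path with the smallest timestamp, or None
    if no path currently supplies a data word.
    """
    candidates = [(head[0], i) for i, head in enumerate(heads) if head is not None]
    if not candidates:
        return None
    return min(candidates)[1]


# Both paths supply a word: the choice is unambiguous.
print(select_smallest([(3, "a"), (2, "b")]))
# Path 0 supplies nothing in this clock: path 1 is chosen, although a word
# with timestamp 4 might still be in flight on path 0.
print(select_smallest([None, (5, "x")]))
```

- The second call shows the problem described above: with only local knowledge, the node cannot tell whether the silent path will later deliver a word with a smaller timestamp.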
- Each node receives an independent SNDCNT counter SNDCNT K.
- Each node should have n input data paths (P0..Pn).
- Each node can have several output data paths which are selected depending on the input data path by means of a translation method, for example a lookup table that can be configured by a higher-level configuration unit CT.
- the root node has the main SNDCNT to which all SNDCNT K are synchronized if necessary.
- the root node has the SNDCNT, which counts each time a valid data word is selected and - the correct one
- FIG. 11 shows a possible tree which, for example, is based on PAEs similar to those of the VPU XPU128ES.
- a root node (1101) has an integrated SNDCNT, the value of which is available at output H (1102).
- the data words at inputs A and C are selected in accordance with the described method and the data word is led to output L in the correct order.
- the PAEs of the next hierarchy level (1103) and at every further higher hierarchy level (1104, 1105) work accordingly, but have the following difference:
- the integrated SNDCNT K is local, the respective value is not passed on.
- SNDCNT K is synchronized with SNDCNT, the value of which is present at input B, in accordance with the method described.
- SNDCNT can be pipelined between all nodes, but in particular also between the individual hierarchy levels, for example via registers.
- This method uses memory to merge data streams.
- a memory location is assigned to each value of the timestamp.
- The data is then stored in the memory according to the value of its timestamp; in other words, the timestamp serves as the address of the storage location for the assigned data. This creates a data space that is linear with respect to the timestamp, i.e. sorted by timestamp. Only when the data space is complete, i.e. all data has been saved, is the memory released for further processing or read out linearly. This can be determined easily, for example, by counting how much data has been written to a memory. If as much data has been written as the memory has data entries, it is full.
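A minimal Python sketch of this memory-based merge follows. The class name `TimestampMemory` and its methods are hypothetical, and the sketch assumes that every timestamp value is written exactly once before the memory is read.

```python
class TimestampMemory:
    """Memory in which the timestamp is used directly as the write address."""

    def __init__(self, size: int) -> None:
        self.cells = [None] * size
        self.written = 0  # number of entries written so far

    def write(self, timestamp: int, data) -> None:
        # The timestamp is the address of the storage location.
        self.cells[timestamp] = data
        self.written += 1

    def is_full(self) -> bool:
        # Full once as many words were written as the memory has entries.
        return self.written == len(self.cells)

    def read_all(self):
        # The memory is only released once the data space is complete.
        if not self.is_full():
            raise RuntimeError("data space is not complete yet")
        return list(self.cells)


memory = TimestampMemory(4)
for timestamp, word in [(2, "c"), (0, "a"), (3, "d"), (1, "b")]:
    memory.write(timestamp, word)
ordered = memory.read_all()
```

Words arriving in any order come out sorted by timestamp, because the linear read-out simply walks the addresses.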
- a timestamp is a number from a finite linear number space (TSR).
- TSR finite linear number space
- The assignment of timestamps is strictly monotonic, which means that each timestamp assigned is unique within the TSR number space. If the end of the number range is reached when assigning a timestamp, the assignment continues at the start of TSR; this creates a point of discontinuity.
- the timestamps now assigned are no longer unambiguous compared to the previous ones. It must always be ensured that these discontinuities are taken into account during processing.
- TSR number space
- the new data cannot be written to the storage locations of the old data, since these have not yet been read out. Therefore, several (at least two) independent memory blocks must be provided so that the old and new data can be written separately.
- Any method can be used to manage the memory blocks. Two options are explained in more detail: a) If it is always ensured that the old data of a certain timestamp value arrives before the new data of this timestamp value, it is tested whether the storage location for the old data is still free. If it is, the data is old data and that storage location is written; if it is not, the data is new data and the storage location for the new data is written. b) If it is not certain that the old data of a certain timestamp value arrives before the new data of this timestamp value, the timestamp may be provided with an identifier that distinguishes old and new timestamps.
- This identifier can be one or more bits wide. If the timestamp overflows, the identifier is changed linearly. As a result, old and new data are provided with unambiguous timestamps. According to the identifier, the data is assigned to one of several memory blocks.
- Identifiers are therefore preferably used, the maximum numerical values of which are considerably smaller than the maximum numerical value of the timestamps.
- A preferred ratio can be specified using the following formula: $\mathrm{identifier}_{\max} < \mathrm{TimeStamp}_{\max} / 2$
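How an identifier that advances on timestamp overflow selects a memory block can be sketched as follows. The widths `TS_BITS` and `ID_BITS`, the class `Stamper` and the function `store` are assumptions chosen for illustration; the text only requires that the identifier range be considerably smaller than the timestamp range.

```python
TS_BITS = 4  # assumed timestamp width: values 0..15
ID_BITS = 1  # assumed identifier width: two memory blocks


class Stamper:
    """Assigns (identifier, timestamp) pairs; the identifier advances on overflow."""

    def __init__(self) -> None:
        self.timestamp = 0
        self.identifier = 0

    def next_stamp(self):
        stamp = (self.identifier, self.timestamp)
        self.timestamp += 1
        if self.timestamp == 1 << TS_BITS:  # end of the number space reached
            self.timestamp = 0              # continue at the start
            self.identifier = (self.identifier + 1) % (1 << ID_BITS)
        return stamp


def store(blocks, stamp, data):
    """Write data into the block chosen by the identifier, at the timestamp address."""
    identifier, timestamp = stamp
    blocks[identifier][timestamp] = data


stamper = Stamper()
stamps = [stamper.next_stamp() for _ in range(17)]
blocks = [[None] * (1 << TS_BITS) for _ in range(1 << ID_BITS)]
store(blocks, stamps[0], "old")   # timestamp 0 before the overflow
store(blocks, stamps[16], "new")  # timestamp 0 after the overflow
```

Both words carry timestamp value 0, but the differing identifier places them in separate blocks, so the new word does not overwrite the old one.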
- Edges (1201, 1202, 1203) are present at the interface (1204).
- the partitioning can be carried out in accordance with the invention in such a way that all edges are cut according to FIG. 12b.
- the data of each edge of a first configuration (1213) is written to a separate memory (1211).
- the data and / or status information is read from a subsequent configuration (1214) from the memories and processed further by this configuration.
- The memories work as data receivers of the first configuration (i.e. in a mainly writing mode of operation) and as data transmitters of the subsequent configuration (i.e. in a mainly read-out mode of operation).
- the memories (1211) themselves are part / resources of both configurations.
- This can be guaranteed by a) sorting the data streams when writing into a memory and/or b) sorting them when reading out from a memory and/or c) storing the sorting order with the data and making it available for subsequent data processing.
- Control units are assigned to the memories, which ensure the management of the data sequences and data dependencies both when writing the data (1210) into the memories (1211) and when reading out the data from the memories (1212).
- different types of administration and corresponding control mechanisms are used.
- The memories are assigned to an array (1310, 1320) made of PAEs: a) In FIG. 13a, the memories generate their addresses synchronously, for example by means of common address generators and/or independent but synchronously switched address generators. In other words, the write address (1301) is incremented per cycle, regardless of whether a memory actually has valid data to store. A plurality of memories (1303, 1304) thus have the same time base, i.e. the same read/write address. An additional flag (VOID, 1302) for each data storage location in the memory indicates whether valid data has been written to a memory address. The flag VOID can be generated from the RDY (1305) assigned to the data.
- the data RDY (1306) is generated from the VOID flag.
- A common read address (1307) is generated in accordance with the writing of the data and is advanced per cycle.
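The synchronously addressed variant can be sketched in Python as shown below. The class `SyncMemories` is hypothetical, `None` stands for a cycle in which RDY is not set, and the polarity of the flag (VOID set means no valid data) is an assumption of this sketch.

```python
class SyncMemories:
    """Several memories that share one write address and one read address."""

    def __init__(self, count: int, depth: int) -> None:
        self.depth = depth
        self.data = [[None] * depth for _ in range(count)]
        self.void = [[True] * depth for _ in range(count)]
        self.write_addr = 0
        self.read_addr = 0

    def write_cycle(self, words) -> None:
        """words[i] is the word for memory i, or None if its RDY is not set."""
        for i, word in enumerate(words):
            self.data[i][self.write_addr] = word
            self.void[i][self.write_addr] = word is None  # VOID derived from RDY
        # The common write address advances every cycle, valid data or not.
        self.write_addr = (self.write_addr + 1) % self.depth

    def read_cycle(self):
        """Return one (rdy, word) pair per memory; RDY is derived from VOID."""
        out = []
        for i in range(len(self.data)):
            is_void = self.void[i][self.read_addr]
            out.append((not is_void, self.data[i][self.read_addr]))
        self.read_addr = (self.read_addr + 1) % self.depth
        return out


memories = SyncMemories(count=2, depth=4)
memories.write_cycle(["a", None])
memories.write_cycle([None, "b"])
first = memories.read_cycle()
second = memories.read_cycle()
```

Because both memories step through the same addresses, the temporal relation between the two streams is preserved without storing a timestamp.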
- Each memory has independent write pointers (1313, 1314) for the data-writing configuration and read pointers (1315, 1316) for the subsequent data-reading configuration. In accordance with the known methods (for example corresponding to FIG. 7a or FIG. 11), the data word which is correct in terms of time is selected in each case on the basis of the assigned and stored timestamp (1312).
- The sorting of the data into the memories / from the memories can thus be carried out according to different algorithmic methods, for example by:
  - a) allocation of a memory location by the TimeStamp
  - b) sorting into the data stream according to the TimeStamp
  - c) storage per cycle together with a VALID flag
  - d) storing the TimeStamp and passing it on to the subsequent algorithm when the memory is read
- A method is described below for assigning timestamps to IO channels for peripheral components and/or external memories.
- the assignment can serve different purposes, for example to allow correct sorting of data streams between sender and receiver and / or to clearly select sources and / or destinations of data streams.
- the following explanations are illustrated using the example of the interface cells from PACT03.
- PACT03 describes a procedure for bundling VPU-internal buses and for data exchange between different VPUs or between VPUs and peripherals (IO).
- a disadvantage of the method is that the data source can no longer be identified at the receiver and the correct chronological order is also not ensured.
- The following new methods solve this problem; depending on the application, one or more of the methods described can be used, if necessary in combination:
- FIG. 14 describes, by way of example, such identification between arrays (PAs, 1408) from reconfigurable elements (PAEs) of two VPUs.
- An arbiter (1401) selects one of the possible data sources (1405) on a data-sending module (VPU, 1410) in order to switch it to the IO via a multiplexer (1402).
- The address of the data source (1403) is sent to the IO together with the data (1404).
- the data receiving module (VPU, 1411) selects the corresponding receiver (1406) according to the address (1403) of the data source via a demultiplexer (1407).
- A translation method, for example a lookup table which can be configured by a higher-level configuration unit (CT), enables flexible assignment between the transmitted address (1403) and the receiver (1406).
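The identification of the data source across the IO can be sketched as follows. The functions `arbitrate` and `deliver` are hypothetical, the fixed-priority arbitration is an assumption, and the lookup table contents are example values only.

```python
def arbitrate(sources):
    """Select one data source with pending data; return (source_address, word).

    Fixed priority (lowest address first) is an assumption of this sketch.
    """
    for address, queue in enumerate(sources):
        if queue:
            return address, queue.pop(0)
    return None


def deliver(packet, lookup_table, receivers):
    """Demultiplex a received (address, word) pair to the configured receiver."""
    address, word = packet
    receivers[lookup_table[address]].append(word)


# Sending module: three data sources, of which sources 1 and 2 hold data.
sources = [[], ["x"], ["y"]]
# Receiving module: the table maps a source address to a receiver index.
lookup_table = {0: 2, 1: 0, 2: 1}
receivers = [[], [], []]

packet = arbitrate(sources)
deliver(packet, lookup_table, receivers)
```

The source address travels with the data word, so the receiving side can pick the receiver without any knowledge of the arbitration order.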
- CT higher-level configuration unit
- Interface modules can be connected upstream of the multiplexers (1402) and/or downstream of the demultiplexers (1407).
- PACT03 and / or PACT15 can be used for the configurable connection of bus systems.
- Adherence to the chronological order: b1) The simplest method is to send the timestamp to the IO and leave the evaluation to the receiver, which receives the timestamp.
- The timestamp is decoded by the arbiter, which selects only the transmitter with the correct timestamp and sends its data to the IO. The receiver receives the data in the correct order.
- the procedure can also be expanded by assigning and identifying channel numbers.
- A channel number identifies a specific transmitter area.
- a channel number can consist of several identifications, such as specifying the bus within a block, the block, or the block group. This provides simple identification even in applications with a large number of PAEs and / or a combination of many modules.
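A channel number composed of several identifications can be modelled as a packed bit field. The field widths below and the function names `pack_channel` and `unpack_channel` are assumptions for illustration; the text does not specify any widths or ordering.

```python
BUS_BITS = 3    # assumed: up to 8 buses per block
BLOCK_BITS = 4  # assumed: up to 16 blocks per block group
GROUP_BITS = 2  # assumed: up to 4 block groups


def pack_channel(group: int, block: int, bus: int) -> int:
    """Combine block group, block and bus into a single channel number."""
    if not (0 <= group < 1 << GROUP_BITS
            and 0 <= block < 1 << BLOCK_BITS
            and 0 <= bus < 1 << BUS_BITS):
        raise ValueError("field out of range")
    return (group << (BLOCK_BITS + BUS_BITS)) | (block << BUS_BITS) | bus


def unpack_channel(channel: int):
    """Split a channel number back into (group, block, bus)."""
    bus = channel & ((1 << BUS_BITS) - 1)
    block = (channel >> BUS_BITS) & ((1 << BLOCK_BITS) - 1)
    group = channel >> (BLOCK_BITS + BUS_BITS)
    return group, block, bus


channel = pack_channel(group=1, block=5, bus=2)
fields = unpack_channel(channel)
```

A hierarchical layout like this keeps the identification simple even when many modules are combined, since each level only inspects its own field.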
- individual data words are preferably not transmitted in each case, but rather a plurality of data words are combined to form a data packet and then transmitted by specifying the channel number.
- The combination of the individual data words can be done using, for example, a suitable memory, as described for example in PACT18 (BURST-FIFO).
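Combining several data words into one packet tagged with a channel number can be sketched as below. The class `PacketBuilder` and the fixed packet size are hypothetical; this is not the BURST-FIFO of PACT18, only a simple buffer with the same purpose.

```python
class PacketBuilder:
    """Collects data words and emits them as one packet per channel number."""

    def __init__(self, channel: int, size: int) -> None:
        self.channel = channel
        self.size = size
        self.buffer = []

    def push(self, word):
        """Buffer one word; return (channel, words) once the packet is complete."""
        self.buffer.append(word)
        if len(self.buffer) < self.size:
            return None
        packet = (self.channel, tuple(self.buffer))
        self.buffer = []
        return packet


builder = PacketBuilder(channel=7, size=3)
results = [builder.push(word) for word in (10, 11, 12, 13)]
```

The channel number is transmitted once per packet instead of once per data word.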
- The transmitted addresses and/or timestamps can preferably be used as identifiers or parts of identifiers in bus systems according to PACT15.
- The use of timestamps or comparable methods enables sequencers to be constructed easily from groups of PAEs.
- The buses and basic functions of the circuit are configured; the detailed function and data addresses are set flexibly at runtime using an OpCode.
- sequencers can be set up and operated simultaneously within a PA (array of PAEs).
- sequencers within a VPU can be constructed in accordance with the algorithm, examples have already been given in several fully integrated documents of the inventor.
- PACT13 describes the construction of sequencers from a plurality of PAEs, which serves as an exemplary basis for the following description.
- the following designs of sequencers can be freely adapted:
- Type and amount of IO/memory
- Type and amount of interrupts (e.g. via trigger)
- a simple sequencer can be created, for example, from 1.
- sequencer is expanded by IO elements (PACT03, PACT22 / 24).
- PAEs can be connected as data sources or receivers.
- the procedure according to PACT08 can be used, which allows the direct setting of OpCodes of a PAE via data buses, as well as the specification of the data sources / targets.
- the addresses of the data sources / destinations can be transferred, for example, using the time stamp procedure.
- the bus can also be used to transmit OpCodes.
- A sequencer consists of a RAM for storing the program (1501), a PAE to calculate the data (ALU) (1502), a PAE to calculate the program pointer (1503), a memory as a register set (1504) and an IO for external devices (1505).
- Wiring creates two bus systems, an input bus to the ALU IBUS (1506) and an output bus from the ALU OBUS (1507).
- a 4-bit wide timestamp is assigned to the buses, which addresses the source IBUS-ADR (1508) or the destination OBUS-ADR (1509).
- the program pointer (1510) is passed to 1501.
- 1501 returns the OpCode (1511).
- the OpCode is split into commands for the ALU (1512) and the program pointer (1513), as well as the data addresses (1508, 1509).
- the SIMD methods and bus systems described below can be used to split the bus.
- 1502 is designed as an accumulator machine and supports, for example, the following functions:
  - ld <reg>: load accumulator (1520) from register
  - add_sub <reg>: add/subtract register to/from accumulator
  - sl_sr: shift accumulator
  - rl_rr: rotate accumulator
  - st <reg>: write accumulator to register
- A fourth bit indicates the type of operation: add or subtract, shift left or right.
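The behaviour of such an accumulator machine can be sketched with a small interpreter. The 32-bit word width, the split of `add_sub`, `sl_sr` and `rl_rr` into separate mnemonics (standing in for the fourth bit), and the function name `run` are assumptions of this sketch; the binary encoding of the commands is not modelled.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit accumulator


def run(program, registers):
    """Execute (mnemonic, register) pairs on an accumulator; return the accumulator."""
    acc = 0
    for op, reg in program:
        if op == "ld":        # load accumulator from register
            acc = registers[reg]
        elif op == "add":     # add_sub with the fourth bit selecting add
            acc = (acc + registers[reg]) & MASK
        elif op == "sub":     # add_sub with the fourth bit selecting subtract
            acc = (acc - registers[reg]) & MASK
        elif op == "sl":      # sl_sr, shift left
            acc = (acc << 1) & MASK
        elif op == "sr":      # sl_sr, shift right
            acc >>= 1
        elif op == "rl":      # rl_rr, rotate left
            acc = ((acc << 1) | (acc >> 31)) & MASK
        elif op == "rr":      # rl_rr, rotate right
            acc = ((acc >> 1) | (acc << 31)) & MASK
        elif op == "st":      # write accumulator to register
            registers[reg] = acc
        else:
            raise ValueError(f"unknown operation: {op}")
    return acc


registers = [3, 4, 0]
result = run([("ld", 0), ("add", 1), ("sl", None), ("st", 2)], registers)
```

The example loads 3, adds 4, shifts left once and stores the result in register 2.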
- <reg> is coded as follows:
- 1503 supports the following operations on the program pointer:
  - jmp: jump to address in input register (2321)
  - jt0: jump to address in input register if trigger0 is set
  - jt1: jump to address in input register if trigger1 is set
  - jt2: jump to address in input register if trigger2 is set
  - jmpr: jump to PP plus address in the input register
- 3 bits are necessary for the commands.
- a fourth bit indicates the type of operation: add or subtract.
- The OpCode 1511 is broken down into 3 groups of 4 bits each: (1508, 1509), 1512, 1513. 1508 and 1509 can be identical for the given instruction set.
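Splitting the OpCode into its three 4-bit groups can be sketched as follows. The bit positions chosen here (data address in the lowest nibble, then the ALU command, then the program pointer command) and the function name `split_opcode` are assumptions; the text only states that there are three groups of 4 bits.

```python
def split_opcode(opcode: int):
    """Split the low 12 bits of an OpCode into three 4-bit groups."""
    data_address = opcode & 0xF        # 1508/1509: source/destination address
    alu_command = (opcode >> 4) & 0xF  # 1512: command for the ALU
    pp_command = (opcode >> 8) & 0xF   # 1513: command for the program pointer
    return data_address, alu_command, pp_command


groups = split_opcode(0xABC)
```

Any bits above the lowest 12 are ignored by this sketch, which matches the idea that wider OpCode words leave spare bits for extensions.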
- 1512, 1513 are, for example, passed to the C register of the PAEs (see PACT22 / 24) and decoded as a command within the PAEs (see PACT08).
- the sequencer can be built into a more complex structure.
- Data sources and receivers can be of any kind, in particular PAEs.
- The circuit shown requires only 12 bits of the OpCode 1511. With a 32-bit architecture, 20 bits are therefore optionally available for expanding the basic circuit.
- the multiplexer functions of the buses can be implemented according to the time stamping procedure described. Other configurations are also possible, for example PAEs could be used as multiplexer stages.
- the configuration specifies for each arithmetic unit whether an arithmetic unit should work undivided or whether the arithmetic unit should be broken down into one or more blocks, each of the same or different width.
- An arithmetic unit can also be broken down in such a way that different word widths are configured within one arithmetic unit at the same time (e.g. a width of 32 bits, broken down into 1x16, 1x8 and 2x4 bits).
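The decomposition of a 32-bit arithmetic unit into blocks of different widths can be illustrated in Python. The helper names `split_word`, `join_word` and `simd_add`, and the most-significant-first field order, are assumptions of this sketch.

```python
def split_word(word: int, widths):
    """Split `word` into fields of the given widths, most significant field first."""
    fields = []
    shift = sum(widths)
    for width in widths:
        shift -= width
        fields.append((word >> shift) & ((1 << width) - 1))
    return fields


def join_word(fields, widths) -> int:
    """Pack fields back into one word, truncating each field to its width."""
    word = 0
    for field, width in zip(fields, widths):
        word = (word << width) | (field & ((1 << width) - 1))
    return word


def simd_add(a: int, b: int, widths) -> int:
    """Add two words field by field; carries do not cross field boundaries."""
    sums = [x + y for x, y in zip(split_word(a, widths), split_word(b, widths))]
    return join_word(sums, widths)


WIDTHS = [16, 8, 4, 4]  # 1x16, 1x8 and 2x4 bits, as in the example above
fields = split_word(0x1234ABCD, WIDTHS)
total = simd_add(0x0001FF0F, 0x00010101, WIDTHS)
```

In the addition, the 8-bit and 4-bit fields overflow and wrap to zero without disturbing the neighbouring 16-bit field.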
- SIMD-WORD disassembled data words
- The network always transmits a complete packet, i.e. all data words within one packet are valid and are transmitted using the well-known handshake procedure.
- the bus switches according to FIGS. 5 and 7b, c can be modified in such a way that the individual SIMD-WORD can be networked flexibly.
- the matrix structure of the buses (FIG. 5) enables simple data sorting, as shown in FIG. 16c.
- a first PAE sends data over two buses (1601, 1602), each of which is divided into 4 sub-buses.
- a bus system (1603) interconnects the individual subbuses with additional subbuses located on the bus.
- a second PAE receives differently sorted subbuses on its two input buses (1604, 1605).
- The handshakes of the buses between two PAEs with, for example, a double SIMD arithmetic logic unit (1614, 1615) are logically linked in FIG. 16a in such a way that a common handshake (1610) for the rearranged bus (1611) is generated from the handshakes of the original buses.
- an RDY for a newly sorted bus can be generated from a logical AND operation of all RDYs of the data supplying buses for this bus.
- the ACK of a bus providing data can be generated from an AND operation of the ACKs of all buses that process the data further.
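The two AND rules for the handshakes of rearranged buses can be sketched as follows. The `routing` dictionary, which lists for each newly sorted bus the original buses that supply it, and both function names are hypothetical.

```python
def rdy_for_targets(routing, rdy):
    """RDY of each newly sorted bus: AND of the RDYs of all buses supplying it."""
    return {
        target: all(rdy[source] for source in sources)
        for target, sources in routing.items()
    }


def ack_for_sources(routing, ack):
    """ACK of each supplying bus: AND of the ACKs of all buses consuming its data."""
    sources = {s for supplied_by in routing.values() for s in supplied_by}
    return {
        source: all(ack[target] for target, supplied_by in routing.items()
                    if source in supplied_by)
        for source in sources
    }


# Hypothetical routing: bus X takes sub-buses from A and B, bus Y only from B.
routing = {"X": ("A", "B"), "Y": ("B",)}
rdy = rdy_for_targets(routing, {"A": True, "B": False})
ack = ack_for_sources(routing, {"X": True, "Y": False})
```

Bus B feeds both X and Y, so it is acknowledged only once both consumers have acknowledged.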
- the common handshake controls a control unit (1613) for the administration of the PAE (1612).
- the PAE bus 1611 is split internally between two arithmetic units (1614, 1615).
- the handshakes are linked within each bus node. This makes it possible to assign only one handshake protocol to a bus system of width m, consisting of n sub-buses of width b.
- all bus systems are configured in width b, which corresponds to the smallest realizable input / output data width b of a SIMD word.
- a PAE with 3 32-bit input buses and 2 32-bit output buses with a minimum SIMD word width of 8 actually has 3x4 8-bit input buses and 2x4 8-bit output buses.
- All handshake and control signals are assigned to each of the sub-buses.
- the output of a PAE sends the same control signals for all n sub-buses.
- Incoming acknowledgment signals of all sub-buses are logically linked to each other, e.g. by an AND function.
- the bus systems can freely interconnect each sub-bus and route independently.
- The bus system and in particular the bus nodes do not process or link the handshake signals of the individual buses, regardless of their routing, arrangement and sorting.
- The control signals of all n sub-buses are linked to each other such that a generally valid control signal is generated as a quasi bus control signal for the data path.
- RdyHold levels can be used for each individual data path and only when all RdyHold levels signal pending data are they accepted by the PAE.
- the data of each sub-bus are written and acknowledged individually in the input register of the PAE, whereby the sub-bus is immediately free for the next data transfer.
- The presence of all the necessary data from all sub-buses in the input registers is detected within the PAE by means of a suitable logic combination of the RDY signals stored in the input register for each sub-bus, whereupon the PAE begins data processing.
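The per-sub-bus input register with individually stored RDY signals can be sketched as below. The class `InputRegister` and its method names are hypothetical, and the AND of all stored RDY signals is one possible choice for the "suitable logic combination".

```python
class InputRegister:
    """Input register of a PAE with one slot and one stored RDY per sub-bus."""

    def __init__(self, sub_buses: int) -> None:
        self.data = [None] * sub_buses
        self.rdy = [False] * sub_buses

    def accept(self, sub_bus: int, word) -> bool:
        """Store a word from one sub-bus; the return value models its ACK."""
        if self.rdy[sub_bus]:
            return False  # slot still occupied, no acknowledgment yet
        self.data[sub_bus] = word
        self.rdy[sub_bus] = True
        return True       # acknowledged: the sub-bus is free again

    def all_present(self) -> bool:
        # Data processing may start once every sub-bus has delivered its word.
        return all(self.rdy)

    def consume(self):
        """Hand all words to the arithmetic unit and clear the stored RDYs."""
        if not self.all_present():
            return None
        words = list(self.data)
        self.rdy = [False] * len(self.rdy)
        return words


register = InputRegister(sub_buses=4)
acks = [register.accept(i, word) for i, word in enumerate((1, 2, 3))]
early = register.consume()          # sub-bus 3 has not delivered yet
register.accept(3, 4)
words = register.consume()
```

Each sub-bus is released as soon as its own word is stored, while processing waits for the slowest sub-bus.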
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Multi Processors (AREA)
- Information Transfer Systems (AREA)
Priority Applications (21)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2002570104A JP2004536373A (ja) | 2001-03-05 | 2002-03-05 | データ処理方法およびデータ処理装置 |
EP02712937A EP1454258A2 (de) | 2001-03-05 | 2002-03-05 | Verfahren und vorrichtungen zur datenbe- und/oder verarbeitung |
US10/469,910 US20070299993A1 (en) | 2001-03-05 | 2002-03-05 | Method and Device for Treating and Processing Data |
EP02777144A EP1466264B1 (de) | 1995-12-29 | 2002-09-18 | Verfahren zur konfiguration der verbindung zwischen datenverarbeitungszellen |
AU2002338729A AU2002338729A1 (en) | 2001-09-19 | 2002-09-18 | Router |
PCT/EP2002/010479 WO2003025781A2 (de) | 2001-09-19 | 2002-09-18 | Verfahren zur konfiguration der verbindung zwischen datenverarbeitungszellen |
AU2002357982A AU2002357982A1 (en) | 2001-09-19 | 2002-09-19 | Reconfigurable elements |
EP02791644A EP1472616B8 (de) | 2001-09-19 | 2002-09-19 | Rekonfigurierbare elemente |
AT02791644T ATE533111T1 (de) | 2001-09-19 | 2002-09-19 | Rekonfigurierbare elemente |
US10/490,081 US8429385B2 (en) | 2001-09-03 | 2002-09-19 | Device including a field having function cells and information providing cells controlled by the function cells |
PCT/EP2002/010572 WO2003036507A2 (de) | 2001-09-19 | 2002-09-19 | Rekonfigurierbare elemente |
JP2003538928A JP4456864B2 (ja) | 2001-09-19 | 2002-09-19 | リコンフィギュアブル素子 |
US12/247,076 US8209653B2 (en) | 2001-09-03 | 2008-10-07 | Router |
US12/389,116 US20090210653A1 (en) | 2001-03-05 | 2009-02-19 | Method and device for treating and processing data |
JP2009271120A JP2010079923A (ja) | 2001-09-19 | 2009-11-30 | 処理チップ、チップを含むシステム、マルチプロセッサ装置およびマルチコアプロセッサ装置 |
US13/023,796 US8686475B2 (en) | 2001-09-19 | 2011-02-09 | Reconfigurable elements |
US14/318,211 US9250908B2 (en) | 2001-03-05 | 2014-06-27 | Multi-processor bus and cache interconnection system |
US14/500,618 US9141390B2 (en) | 2001-03-05 | 2014-09-29 | Method of processing data with an array of data processors according to application ID |
US14/728,422 US9411532B2 (en) | 2001-09-07 | 2015-06-02 | Methods and systems for transferring data between a processing device and external devices |
US15/225,638 US10152320B2 (en) | 2001-03-05 | 2016-08-01 | Method of transferring data between external devices and an array processor |
US16/190,931 US20190102173A1 (en) | 2001-03-05 | 2018-11-14 | Methods and systems for transferring data between a processing device and external devices |
Applications Claiming Priority (72)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE10110530 | 2001-03-05 | ||
DE10110530.4 | 2001-03-05 | ||
DE10111014 | 2001-03-07 | ||
DE10111014.6 | 2001-03-07 | ||
PCT/EP2001/006703 WO2002013000A2 (de) | 2000-06-13 | 2001-06-13 | Pipeline ct-protokolle und -kommunikation |
EPPCT/EP01/06703 | 2001-06-13 | ||
EP01115021 | 2001-06-20 | ||
DE10129237.6 | 2001-06-20 | ||
EPPCT/EP01/115021.6 | 2001-06-20 | ||
DE10135211.5 | 2001-07-24 | ||
DE10135210 | 2001-07-24 | ||
DE10135210.7 | 2001-07-24 | ||
PCT/EP2001/008534 WO2002008964A2 (de) | 2000-07-24 | 2001-07-24 | Integrierter schaltkreis |
EPEP0108534 | 2001-07-24 | ||
DE10135211 | 2001-07-24 | ||
DE10139170 | 2001-08-16 | ||
DE10139170.6 | 2001-08-16 | ||
DE10142231 | 2001-08-29 | ||
DE10142231.8 | 2001-08-29 | ||
DE10142904 | 2001-09-03 | ||
DE10142894.4 | 2001-09-03 | ||
DE10142903 | 2001-09-03 | ||
DE10142904.5 | 2001-09-03 | ||
DE10142894 | 2001-09-03 | ||
DE10142903.7 | 2001-09-03 | ||
US31787601P | 2001-09-07 | 2001-09-07 | |
US60/317,876 | 2001-09-07 | ||
DE10144732 | 2001-09-11 | ||
DE10144733 | 2001-09-11 | ||
DE10144732.9 | 2001-09-11 | ||
DE10144733.7 | 2001-09-11 | ||
DE10145795 | 2001-09-17 | ||
DE10145795.2 | 2001-09-17 | ||
DE10145792.8 | 2001-09-17 | ||
DE10145792 | 2001-09-17 | ||
DE10146132 | 2001-09-19 | ||
DE10146132.1 | 2001-09-19 | ||
US09/967,847 | 2001-09-28 | ||
US09/967,847 US7210129B2 (en) | 2001-08-16 | 2001-09-28 | Method for translating programs for reconfigurable architectures |
EP0111299 | 2001-09-30 | ||
EPEP01/11299 | 2001-09-30 | ||
EPEP01/11593 | 2001-10-08 | ||
PCT/EP2001/011593 WO2002029600A2 (de) | 2000-10-06 | 2001-10-08 | Zellenarordnung mit segmentierterwischenzellstruktur |
DE10154260.7 | 2001-11-05 | ||
DE10154260 | 2001-11-05 | ||
DE10154259.3 | 2001-11-05 | ||
DE10154259 | 2001-11-05 | ||
EPEP01/129923.7 | 2001-12-14 | ||
EP01129923 | 2001-12-14 | ||
EP02001331 | 2002-01-18 | ||
EPEP02/001331.4 | 2002-01-18 | ||
DE10202044.2 | 2002-01-19 | ||
DE10202044 | 2002-01-19 | ||
DE10202175.9 | 2002-01-20 | ||
DE10202175 | 2002-01-20 | ||
DE10206653.1 | 2002-02-15 | ||
DE10206653 | 2002-02-15 | ||
DE10206856 | 2002-02-18 | ||
DE10206857 | 2002-02-18 | ||
DE10206857.7 | 2002-02-18 | ||
DE10206856.9 | 2002-02-18 | ||
DE10207225 | 2002-02-21 | ||
DE10207225.6 | 2002-02-21 | ||
DE10207224.8 | 2002-02-21 | ||
DE10207224 | 2002-02-21 | ||
DE10207226.4 | 2002-02-21 | ||
DE10207226 | 2002-02-21 | ||
DE10208434 | 2002-02-27 | ||
DE10208435 | 2002-02-27 | ||
DE10208435.1 | 2002-02-27 | ||
DE10208434.3 | 2002-02-27 | ||
DE10129237A DE10129237A1 (de) | 2000-10-09 | 2002-06-20 | Verfahren zur Bearbeitung von Daten |
Related Parent Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/967,847 Continuation US7210129B2 (en) | 2001-03-05 | 2001-09-28 | Method for translating programs for reconfigurable architectures |
US09/967,847 Continuation-In-Part US7210129B2 (en) | 2001-03-05 | 2001-09-28 | Method for translating programs for reconfigurable architectures |
Related Child Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10469910 A-371-Of-International | 2001-09-07 | ||
US10/469,910 A-371-Of-International US20070299993A1 (en) | 2001-03-05 | 2002-03-05 | Method and Device for Treating and Processing Data |
US12/389,116 Continuation US20090210653A1 (en) | 2001-03-05 | 2009-02-19 | Method and device for treating and processing data |
Publications (4)
Publication Number | Publication Date |
---|---|
WO2002071249A2 WO2002071249A2 (de) | 2002-09-12 |
WO2002071249A9 true WO2002071249A9 (de) | 2003-04-10 |
WO2002071249A8 WO2002071249A8 (de) | 2003-10-30 |
WO2002071249A3 WO2002071249A3 (de) | 2004-07-08 |
Family
ID=27586945
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2002/002403 WO2002071249A2 (de) | 1995-12-29 | 2002-03-05 | Verfahren und vorrichtungen zur datenbe- und/oder verarbeitung |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2002071249A2 (de) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
USRE44365E1 (en) | 1997-02-08 | 2013-07-09 | Martin Vorbach | Method of self-synchronization of configurable elements of a programmable module |
US8869121B2 (en) | 2001-08-16 | 2014-10-21 | Pact Xpp Technologies Ag | Method for the translation of programs for reconfigurable architectures |
US8914590B2 (en) | 2002-08-07 | 2014-12-16 | Pact Xpp Technologies Ag | Data processing method and device |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1466264B1 (de) | 1995-12-29 | 2011-09-14 | Richter, Thomas | Verfahren zur konfiguration der verbindung zwischen datenverarbeitungszellen |
US8058899B2 (en) | 2000-10-06 | 2011-11-15 | Martin Vorbach | Logic cell array and bus system |
US9037807B2 (en) | 2001-03-05 | 2015-05-19 | Pact Xpp Technologies Ag | Processor arrangement on a chip including data processing, memory, and interface elements |
US7444531B2 (en) | 2001-03-05 | 2008-10-28 | Pact Xpp Technologies Ag | Methods and devices for treating and processing data |
EP1514193B1 (de) | 2002-02-18 | 2008-07-23 | PACT XPP Technologies AG | Bussysteme und rekonfigurationsverfahren |
US7634597B2 (en) | 2003-10-08 | 2009-12-15 | Micron Technology, Inc. | Alignment of instructions and replies across multiple devices in a cascaded system, using buffers of programmable depths |
DE102014007308A1 (de) | 2014-05-17 | 2015-11-19 | Diehl Bgt Defence Gmbh & Co. Kg | Verfahren zum Betreiben eines bodengebundenen Luftabwehrsystems |
US11061682B2 (en) | 2014-12-15 | 2021-07-13 | Hyperion Core, Inc. | Advanced processor architecture |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4663706A (en) * | 1982-10-28 | 1987-05-05 | Tandem Computers Incorporated | Multiprocessor multisystem communications network |
US5581778A (en) * | 1992-08-05 | 1996-12-03 | David Sarnoff Researach Center | Advanced massively parallel computer using a field of the instruction to selectively enable the profiling counter to increase its value in response to the system clock |
US6038656A (en) * | 1997-09-12 | 2000-03-14 | California Institute Of Technology | Pipelined completion for asynchronous communication |
-
2002
- 2002-03-05 WO PCT/EP2002/002403 patent/WO2002071249A2/de active Application Filing
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
USRE44365E1 (en) | 1997-02-08 | 2013-07-09 | Martin Vorbach | Method of self-synchronization of configurable elements of a programmable module |
USRE44383E1 (en) | 1997-02-08 | 2013-07-16 | Martin Vorbach | Method of self-synchronization of configurable elements of a programmable module |
USRE45109E1 (en) | 1997-02-08 | 2014-09-02 | Pact Xpp Technologies Ag | Method of self-synchronization of configurable elements of a programmable module |
USRE45223E1 (en) | 1997-02-08 | 2014-10-28 | Pact Xpp Technologies Ag | Method of self-synchronization of configurable elements of a programmable module |
US8869121B2 (en) | 2001-08-16 | 2014-10-21 | Pact Xpp Technologies Ag | Method for the translation of programs for reconfigurable architectures |
US8914590B2 (en) | 2002-08-07 | 2014-12-16 | Pact Xpp Technologies Ag | Data processing method and device |
Also Published As
Publication number | Publication date |
---|---|
WO2002071249A2 (de) | 2002-09-12 |
WO2002071249A3 (de) | 2004-07-08 |
WO2002071249A8 (de) | 2003-10-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP0961980B1 (de) | Verfahren zur selbstsynchronisation von konfigurierbaren elementen eines programmierbaren bausteines | |
EP1057117B1 (de) | VERFAHREN ZUM HIERARCHISCHEN CACHEN VON KONFIGURATIONSDATEN VON DATENFLUSSPROZESSOREN UND BAUSTEINEN MIT ZWEI- ODER MEHRDIMENSIONALER PROGRAMMIERBARER ZELLSTRUKTUR (FPGAs, DPGAs, o.dgl.) | |
EP0948842B1 (de) | VERFAHREN ZUM SELBSTÄNDIGEN DYNAMISCHEN UMLADEN VON DATENFLUSSPROZESSOREN (DFPs) SOWIE BAUSTEINEN MIT ZWEI- ODER MEHRDIMENSIONALEN PROGRAMMIERBAREN ZELLSTRUKTUREN (FPGAs, DPGAs, o.dgl.) | |
EP0951682B1 (de) | IO- UND SPEICHERBUSSYSTEM FÜR DFPs SOWIE BAUSTEINE MIT ZWEI- ODER MEHRDIMENSIONALEN PROGRAMMIERBAREN ZELLSTRUKTUREN | |
EP1454258A2 (de) | Verfahren und vorrichtungen zur datenbe- und/oder verarbeitung | |
DE69323861T2 (de) | Multiprozessorsystem mit gemeinsamem Speicher | |
EP1342158A2 (de) | Pipeline ct-protokolle und -kommunikation | |
DE69130106T2 (de) | Arbitrierung von paketvermittelten Bussen, einschliesslich Bussen von Multiprozessoren mit gemeinsam genutztem Speicher | |
EP1329816A2 (de) | Verfahren zum selbständigen dynamischen Umladen von Datenflussprozessoren (DFPs) sowie Bausteinen mit zwei- oder mehrdimensionalen programmierbaren Zellstrukturen (FPGAs, DPGAs, o.dgl.) | |
WO2000077652A2 (de) | Sequenz-partitionierung auf zellstrukturen | |
DE10028397A1 (de) | Registrierverfahren | |
DE3114961A1 (de) | Datenverarbeitungssystem | |
DE3114921C2 (de) | Mikroprogramm-Speicheranordnung | |
WO2002071249A9 (de) | Verfahren und vorrichtungen zur datenbe- und/oder verarbeitung | |
DE102017200456A1 (de) | Recheneinheit und Betriebsverfahren hierfür | |
DE102004009610A1 (de) | Heterogener paralleler Multithread-Prozessor (HPMT) mit geteilten Kontexten | |
DE102011009518B4 (de) | Schaltungsanordnung für Verbindungsschnittstelle | |
EP1308846B1 (de) | Datenübertragungseinrichtung | |
WO2006029986A1 (de) | Rechnereinrichtung mit rekonfigurierbarer architektur zur aufnahme eines globalen zellularen automaten | |
DE102005037234A1 (de) | Vorrichtung und Verfahren zur Speicherung von Daten und/oder Befehlen in einem Rechnersystem mit wenigstens zwei Ausführungseinheiten und wenigstens einem ersten Speicher oder Speicherbereich für Daten und/oder Befehle | |
EP1316891A1 (de) | Datenübertragungseinrichtung | |
WO2003071432A2 (de) | Bussysteme und rekonfigurationsverfahren | |
DE102017216991B4 (de) | Kommunikationsbaustein und Vorrichtung zur Datenübertragung | |
DE10360637B4 (de) | Programmgesteuerte Einheit | |
EP1069513A1 (de) | Programmgesteuerte Einheit |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A2 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
COP | Corrected version of pamphlet |
Free format text: PAGE 15/16, DRAWINGS, ADDED |
|
REEP | Request for entry into the european phase |
Ref document number: 2002712937 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2002712937 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2002570104 Country of ref document: JP |
|
CFP | Corrected version of a pamphlet front page | ||
CR1 | Correction of entry in section i |
Free format text: IN PCT GAZETTE 37/2002 DUE TO A TECHNICAL PROBLEM AT THE TIME OF INTERNATIONAL PUBLICATION, SOME INFORMATION WAS MISSING (81). THE MISSING INFORMATION NOW APPEARS IN THE CORRECTED VERSION. |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
WWP | Wipo information: published in national office |
Ref document number: 2002712937 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 10469910 Country of ref document: US |
|
WWP | Wipo information: published in national office |
Ref document number: 10469910 Country of ref document: US |