EP1812861A1

EP1812861A1 - Method and device for delaying accesses to data and/or commands of a multiprocessor system

Info

Publication number: EP1812861A1
Application number: EP05811008A
Authority: EP
Inventors: Thomas Kottke
Original assignee: Robert Bosch GmbH
Current assignee: Robert Bosch GmbH
Priority date: 2004-10-25
Filing date: 2005-10-25
Publication date: 2007-08-01
Also published as: DE502005005490D1; WO2006045801A2; US20080126718A1; JP2008518312A; EP1807763B1; WO2006045804A1; KR20070083772A; WO2006045800A1; JP2008518311A; JP2008518308A; DE502005005284D1; KR20070083771A; US20090164826A1; US7853819B2; WO2006045798A1; EP1807763A2; EP1820102A2; KR20070067168A; JP4532561B2; EP1810145B1

Abstract

The invention relates to a method and device for delaying accesses to data and/or commands of a multiprocessor system comprising a first and a second processor to both of which a memory unit is assigned. The second processor operates with a clock pulse offset, and the device is designed in such a manner that the first processor accesses the memory unit, and the second processor, with a clock pulse offset, receives the data and/or commands.

Description

Method and device for delaying access to data and/or commands of a multiprocessor system

State of the art

The invention relates to a method for delaying access to data and/or commands of a multi-computer system and to a corresponding delay unit, based on the features of the independent claims that are known from the prior art.

In technical applications, in particular in motor vehicles or in the industrial goods sector, e.g. in mechanical engineering and automation, more and more microprocessor-based or computer-based control systems are being used for safety-critical applications. Dual-computer or dual-processor systems (dual cores) are today's common computer systems for safety-critical applications, in particular in vehicles, for example for anti-lock braking systems, the electronic stability program (ESP), X-by-wire systems such as drive-by-wire, steer-by-wire and brake-by-wire, or other networked systems. To meet these high safety requirements in future applications, powerful error detection and error handling mechanisms are required, in particular to counteract transient errors that arise, for example, as the semiconductor structures of computer systems are miniaturized. It is relatively difficult to protect the core itself, i.e. the processor. One solution, as mentioned, is to use a dual-computer or dual-core system for error detection. Such processor units with at least two integrated execution units are known as dual-core or multi-core architectures. According to the current state of the art, such architectures are proposed mainly for two reasons:

On the one hand, an increase in performance can be achieved by regarding and treating the two execution units or cores as two arithmetic units on one semiconductor component. In this configuration, the two execution units or cores process different programs or tasks. This increases performance, which is why this configuration is referred to as performance mode.

The second reason for realizing a dual-core or multi-core architecture is increased safety, since the two execution units redundantly execute the same program. The results of the two execution units or CPUs, i.e. cores, are compared, and an error can be identified when they are checked for agreement. In the following, this configuration is referred to as safety mode or error detection mode.

Nowadays, there are on the one hand two- or multi-processor systems that work redundantly to detect hardware errors (see dual-core or master-checker systems) and, on the other hand, two- or multi-processor systems that process different data on their processors.

Advantages and objects of the invention

If these two modes of operation are combined in a two- or multi-processor system according to an embodiment of the present invention (for simplicity, only a two-processor system is discussed from here on, but the invention is equally applicable to multiprocessor systems), the two processors must receive different data in performance mode and the same data in error detection mode. Such a device or unit enables the effective operation of a two-processor system in which it is possible to switch between the safety and performance modes during operation. The term processor as used below conceptually also includes cores and arithmetic units.

In implementations of two-processor systems, a cache is usually provided for each processor. A single cache is usually not sufficient because it would have to be located spatially between the two processors. Due to the long signal delay between the cache and the two processors, the processors could then operate only at a limited clock frequency. Caches serve as fast buffer memory in the system so that the processor does not always have to fetch data from the slow main memory. To make this possible, close attention must be paid to the access time when implementing the cache. The access time is made up of the actual time needed to fetch the data from the cache and the time needed to pass the data to the processor. If the cache is located far away from the processor, transferring the data takes a long time and the processor can no longer work at its full clock rate. Because of this timing problem, two-processor systems typically provide a separate cache for each processor.

It is an object of the invention to provide a method and a device by which one cache can be saved in a two-processor system, or the redundant caches in multiprocessor systems. The saving is achieved by utilizing a clock offset.

Description of the embodiments and advantages of the invention

To achieve this object, the invention describes a method and a device for delaying access to data and/or commands of a multiprocessor system having a first and a second processor, to which a memory unit is assigned, wherein the second processor operates with a clock offset and the device is designed such that the first processor accesses the memory unit and the second processor receives the data and/or commands with a clock offset.

Advantageously, if the memory unit is a cache memory, the advantages of this memory technology can be combined with the advantages of the invention.

Conveniently, the memory unit is addressed by at least one processor and is directly coupled to the processor that addresses it.

It is advantageous that a delay element is included and the device is designed such that the delay element uses the clock offset to bridge the transit time of the data and/or commands from the memory unit to the second processor.

It is furthermore advantageous that comparison means are provided by which the data and / or commands are compared and these comparison means are arranged spatially close to the following processor.

Conveniently, the device is configured such that the clock offset is utilized to guide the comparison data of the first processor to the second processor.

It is advantageous that, depending on the configuration, either write operations and read operations, only read operations, or only write operations are delayed as accesses.

If these two processors are operated with a clock offset, the second cache for the slave processor can be omitted with the proposed method and the corresponding device.

A two-processor system contains two processors that can process the same or different tasks. The two processors of the dual-computer system can execute these tasks clock-synchronously or with a clock offset. If a two-processor system is constructed for error detection, it is advantageous, in order to avoid common-mode errors, for the two processors to operate with a clock offset. This method is most effective when a non-integer clock offset > 1 is chosen. In this first form of application, both processors or cores process the same tasks. If the two processors process different tasks, it is more advantageous for them to run clock-edge synchronously, since external components such as memories can be driven with the clock of only one processor. A two-processor system that can be switched between these two modes is therefore normally optimized for only one operating mode.

According to the invention, this is compensated for by the fact that in a two-processor system (or multiprocessor system) that can be switched between two modes such as safety and performance, the two processors work with a clock offset in safety mode and without a clock offset in performance mode. In performance mode, operating without a clock offset is advantageous because external components such as memories are usually operated at a lower clock frequency and their clock edges are matched to only one processor. Otherwise, the processor running with the clock offset would need a wait cycle on every memory access, because it would address the external component, for example, half a clock too late.

By switching the clock in a two-processor system, the optimum in error detection is obtained in safety mode and the maximum performance in performance mode.

Thus, the invention advantageously relates to a method and a device for delaying access to data and/or instructions of a multiprocessor system having a first and a second processor, to which a memory unit is assigned, wherein the first and second processors operate with a clock offset and the device is designed such that both processors access the same memory unit with this clock offset.

Expediently, write operations and read operations are delayed as accesses, and the device can be switched between delaying and not delaying the accesses. In addition, a multiprocessor system with such a device is disclosed.

In at least one mode, the two processors operate with a clock offset. The clocks can be shifted relative to each other by whole clock cycles as well as by fractions of a clock cycle. Another variant is to use a different clock frequency in the two modes. In the safety-critical mode, for example, a lower clock frequency than in performance mode can be used to suppress interference. These two variants can also be combined with each other.

In this case, the first operating mode corresponds to a safety mode in which the two arithmetic units process the same programs and/or data, and comparison means are provided which check the states arising during the execution of the same programs for agreement.

The unit according to the invention and the method according to the invention allow an optimized implementation of the two modes in a two-processor system.

If the two processors operate in error detection mode (F mode), they receive the same data/instructions; if they operate in performance mode (P mode), each processor can access the memory. The unit then manages the accesses to the memory or peripherals, which exist only once.

In F mode, the unit takes over the data/addresses of one processor (called the master here) and forwards them to components such as the memory, bus, etc. The second processor (here the slave) wants to make the same access. The data distribution unit accepts this at a second port but does not forward the request to the other components. The data distribution unit gives the slave the same data as the master and compares the data of the two processors. If they differ, the data distribution unit (here DVE) indicates this by an error signal. Thus, only the master works on the bus/memory and the slave receives the same data (as in a dual-core system).

In P mode, the two processors work on different parts of the program. The memory accesses are therefore also different. The DVE accepts the requests from the processors and returns the results/requested data to the processor that requested them. If both processors want to access the same component at the same time, one processor is placed in a wait state until the other has been served.

The switching between the two modes and thus the different operation of the data distribution unit is effected by a control signal. This can either be generated by one of the two processors or externally.
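The behaviour of the data distribution unit in the two modes can be sketched as follows. This is only a minimal illustration under stated assumptions: the class and method names are hypothetical, the memory is modelled as a dictionary, and the rule that the lowest processor number is served first in P mode is an assumption that the text does not fix at this point.

```python
from dataclasses import dataclass, field


@dataclass
class DataDistributionUnit:
    """Toy model of the data distribution unit (DVE); all names are hypothetical."""
    memory: dict = field(default_factory=dict)
    mode: str = "F"          # "F" = error detection mode, "P" = performance mode
    error: bool = False

    def set_mode(self, control_signal: str) -> None:
        # The control signal may come from one of the processors or from outside.
        if control_signal not in ("F", "P"):
            raise ValueError("unknown mode")
        self.mode = control_signal

    def read_f_mode(self, master_addr: int, slave_addr: int) -> tuple:
        # Only the master's request is forwarded to the memory.
        data = self.memory.get(master_addr, 0)
        # The slave's request is accepted at the second port and only compared.
        if slave_addr != master_addr:
            self.error = True
        # Both processors receive the same data.
        return data, data

    def read_p_mode(self, requests: dict) -> tuple:
        # requests maps processor number -> address; one request is served per call.
        # Assumption: the lowest processor number is served first, the others wait.
        served = min(requests)
        waiting = sorted(p for p in requests if p != served)
        return served, self.memory.get(requests[served], 0), waiting
```

In this sketch a mismatch between the master and slave addresses in F mode only sets the error flag; how the system reacts to the error signal is outside its scope.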

If the two-processor system is operated with a clock offset in F mode and without one in P mode, the DVE unit delays the data for the slave accordingly, or stores the output data of the master until they can be compared with the output data of the slave for error detection.
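Buffering the master outputs until the slave has caught up can be illustrated with a small delay line. The sketch below is an assumption-laden model, not the circuit of the patent: time is counted in half clocks so that an offset of 1.5 clocks becomes three steps, and the function name and the sampled-sequence interface are hypothetical.

```python
from collections import deque

HALF_TICKS_OFFSET = 3  # 1.5 clocks expressed in half-clock steps


def compare_with_offset(master_out, slave_out, offset=HALF_TICKS_OFFSET):
    """Return the half-tick indices at which the delayed master output differs
    from the slave output.

    master_out and slave_out are sequences sampled once per half clock.
    """
    delay_line = deque([None] * offset, maxlen=offset)
    mismatches = []
    for tick, (m, s) in enumerate(zip(master_out, slave_out)):
        delayed = delay_line[0]   # value the master produced `offset` ticks ago
        delay_line.append(m)      # the oldest entry is dropped automatically
        if delayed is not None and delayed != s:
            mismatches.append(tick)
    return mismatches
```

For example, `compare_with_offset([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 9, 3])` reports a mismatch at half-tick 4, because the slave produced 9 where the master had produced 2 three half-ticks earlier.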

The clock offset is explained in more detail with reference to FIG. 1 for a dual-computer system:

FIG. 1 shows a dual-computer system with a first computer 100, in particular a master computer, and a second computer 101, in particular a slave computer. The entire system is operated with a predeterminable clock or in predeterminable clock cycles CLK. The clock is supplied via the clock input CLK1 of computer 100 and the clock input CLK2 of computer 101. This dual-computer system also includes, by way of example, a special feature for error detection: the first computer 100 and the second computer 101 work with a time offset, in particular a predeterminable time offset or a predeterminable clock offset. Any time can be specified for the time offset, and any number of clock cycles for the clock offset. This may be an integer number of clock cycles but, as in this example, also an offset of 1.5 clock cycles, in which case the first computer 100 works exactly 1.5 clock cycles ahead of the second computer 101. This offset prevents common-mode errors from disturbing the computers or processors, i.e. the cores of the dual-core system, in the same way and thus remaining undetected. Because of the offset, such common-mode errors affect the computers at different points in the program sequence and therefore have different effects on the two computers, so that the errors become detectable. Identical error effects without a clock offset could not be detected in a comparison; this is avoided here. To implement this time or clock offset, in particular of 1.5 clock cycles, the offset blocks 112 to 115 are provided in the dual-computer system.

To detect the aforementioned common-mode errors, this system is designed to operate with a predetermined time or clock offset, here in particular 1.5 clock cycles; that is, while one computer, e.g. computer 100, addresses the components, in particular the external components 103 and 104, directly, the second computer 101 operates with a delay of exactly 1.5 clock cycles. To produce the desired delay of one and a half clock cycles, computer 101 is fed with the inverted clock at its clock input CLK2. As a result, however, the aforementioned connections of the computer, i.e. its data and commands on the buses, must also be delayed by the stated clock cycles, here in particular 1.5 clock cycles, for which purpose the offset or delay blocks 112 to 115 are provided, as mentioned.

In addition to the two computers or processors 100 and 101, components 103 and 104 are provided, which are connected to the two computers 100 and 101 via bus 116, consisting of bus lines 116A, 116B and 116C, and bus 117, consisting of bus lines 117A and 117B. Bus 117 is a command bus, in which 117A is a command address bus and 117B is the partial command (data) bus. Address bus 117A is connected to computer 100 via a command address connection IA1 (Instruction Address 1) and to computer 101 via a command address connection IA2 (Instruction Address 2). The commands themselves are transmitted via the partial command bus 117B, which is connected to computer 100 via a command connection I1 (Instruction 1) and to computer 101 via a command connection I2 (Instruction 2). A component 103, e.g. a command memory, in particular a secure command memory or the like, is interposed in this command bus 117 consisting of 117A and 117B. This component, in particular as a command memory, is operated with the clock CLK in this example. In addition, 116 denotes a data bus, which includes a data address bus or data address line 116A and a data bus or data line 116B. The data address line 116A is connected to computer 100 via a data address connection DA1 (Data Address 1) and to computer 101 via a data address connection DA2 (Data Address 2). Likewise, the data bus or data line 116B is connected to computer 100 and computer 101 via a data connection DO1 (Data Out 1) and a data connection DO2 (Data Out 2), respectively. The data bus line 116C, which is connected to computer 100 and computer 101 via a data connection DI1 (Data In 1) and a data connection DI2 (Data In 2), respectively, also belongs to data bus 116. A component 104, for example a data memory, in particular a secure data memory or the like, is interposed in this data bus 116 consisting of lines 116A, 116B and 116C. This component 104 is also supplied with the clock CLK in this example.

Components 103 and 104 are representative of any components that are connected to the computers of the dual-computer system via a data bus and/or command bus and that, in line with the accesses of the dual-computer system via data and/or commands, may receive or output erroneous data and/or commands during write operations and/or read operations. To avoid errors, error identifier generators 105, 106 and 107 are provided, which generate an error identifier such as a parity bit or another error code such as an error correction code (ECC) or the like. In addition, corresponding error identifier checking devices (check devices) 108 and 109 are provided for checking the respective error identifier, for example the parity bit or another error code such as ECC.

The comparison of the data and/or commands with respect to the redundant design in the dual-computer system takes place in the comparators 110 and 111, as shown in FIG. 1. If there is a time offset, in particular a clock or clock cycle offset, between computers 100 and 101, caused either by a non-synchronous two-processor system, by synchronization errors in a synchronous two-processor system or, as in this particular example, by a time or clock cycle offset desired for error detection, here in particular 1.5 clock cycles, then during this time or clock offset one computer, here in particular computer 100, can write or read erroneous data and/or commands to or from components, in particular external components such as the memories 103 or 104 here, but also other participants, actuators or sensors. It may thus also erroneously perform a write access instead of an intended read access during this clock offset. These scenarios naturally lead to errors in the entire system, in particular without a clear indication of which data and/or commands have just been changed erroneously, which also creates a recovery problem.

To solve this problem, a delay unit 102 is connected into the lines of the data bus and/or into the command bus as shown. For the sake of clarity, only the connection into the data bus is shown; this is of course equally possible and conceivable for the command bus. This delay unit 102 delays the accesses, here in particular the memory accesses, such that a possible time or clock offset is compensated, in particular in the case of error detection, for example via the comparators 110 and 111, e.g. at least until the error signal is generated in the dual-computer system, i.e. until error detection has been carried out in the dual-computer system. Various variants can be implemented here: delaying the read and write operations, delaying only the write operations or, although not preferred, delaying only the read operations. A change signal, in particular the error signal, can convert a delayed write operation into a read operation in order to prevent erroneous writing.
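The variant in which only write operations are delayed, and a delayed write is turned into a read when the error signal arrives, might look like the following sketch. It is a behavioural model under assumptions: the class name, the step-based timing and the dictionary standing in for the memory are hypothetical, and the number of delay steps is a free parameter rather than a value taken from the text.

```python
from collections import deque


class WriteDelayUnit:
    """Holds writes back for a few steps so an error signal can still cancel them."""

    def __init__(self, memory, delay_steps=2):
        self.memory = memory
        self.delay_steps = delay_steps
        self.pending = deque()   # entries: [remaining_steps, address, value]

    def write(self, address, value):
        self.pending.append([self.delay_steps, address, value])

    def step(self, error_signal=False):
        """Advance one step; return the list of accesses actually performed."""
        performed = []
        if error_signal:
            # Convert every delayed write into a harmless read.
            while self.pending:
                _, address, _ = self.pending.popleft()
                performed.append(("read", address, self.memory.get(address)))
            return performed
        for entry in self.pending:
            entry[0] -= 1
        while self.pending and self.pending[0][0] <= 0:
            _, address, value = self.pending.popleft()
            self.memory[address] = value
            performed.append(("write", address, value))
        return performed
```

A write issued before a detected error therefore never reaches the memory, which is the property the delay is meant to provide.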

An exemplary implementation of the data distribution unit (DVE) is now described below with reference to FIG. 2. The DVE preferably consists of a device for detecting the changeover request (IllOpDetect), the ModeSwitch unit and the Iram and Dram Control modules:

IllOpDetect: The switching between the two modes is detected by the "Switch-Detect" units. Such a unit is located between the cache and the processor on the instruction bus and checks whether the IllOp command is being loaded into the processor. If the command is detected, this event is reported to the ModeSwitch unit. A "Switch-Detect" unit is provided separately for each processor. The "Switch-Detect" unit does not have to be fault-tolerant since it is present in duplicate and is thus redundant. On the other hand, it is conceivable to design this unit as fault-tolerant and thus provide it only once, but the redundant design is preferred.

ModeSwitch: Switching between the two modes is triggered by the "Switch-Detect" unit. If a switch is to be made from Lock to Split mode, both "Switch-Detect" units detect the switchover, since both processors execute the same program code in Lock mode. The "Switch-Detect" unit of processor 1 detects it 1.5 clocks before the "Switch-Detect" unit of processor 2. The ModeSwitch unit stops processor 1 for 2 clocks with the help of the wait signal. Processor 2 is stopped 1.5 clocks later, but only for half a clock, so that it is synchronized to the system clock. Subsequently, the status signal is switched to Split for the other components and the two processors continue to work. For the two processors to perform different tasks, they must diverge in the program code. This is done by reading the processor ID directly after switching to Split mode. The processor ID read is different for each of the two processors. If it is then compared with a target processor ID, the corresponding processor can be taken to a different program location with a conditional jump command.

When switching from Split mode to Lock mode, one processor will notice this, or rather one of the two will notice it first. This processor will execute program code containing the switchover command. This is registered by its "Switch-Detect" unit, which informs the ModeSwitch unit. The ModeSwitch unit stops the corresponding processor and informs the second processor of the synchronization request by an interrupt. The second processor receives the interrupt and can now execute a software routine to complete its task. It then also jumps to the program location where the switchover command is located. Its "Switch-Detect" unit now also signals the mode change request to the ModeSwitch unit. The wait signal is now deactivated for processor 1 at the next rising system clock edge and 1.5 clocks later for processor 2. Both now work synchronously again with a clock offset of 1.5 clocks.
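The timing figures given for the mode change can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text (offset 1.5 clocks, stop times of 2 clocks and half a clock); the function names and the use of exact fractions are illustrative choices.

```python
from fractions import Fraction

OFFSET = Fraction(3, 2)  # clock offset between processor 1 and processor 2


def lock_to_split_resume_times(detect_p1, stop_p1=Fraction(2), stop_p2=Fraction(1, 2)):
    """Return (resume_p1, resume_p2) in system clocks for a Lock -> Split switch."""
    detect_p2 = detect_p1 + OFFSET   # processor 2 sees the command 1.5 clocks later
    resume_p1 = detect_p1 + stop_p1  # processor 1 is stopped for 2 clocks
    resume_p2 = detect_p2 + stop_p2  # processor 2 is stopped for half a clock
    return resume_p1, resume_p2


def split_to_lock_release_times(release_p1):
    """Return (release_p1, release_p2): the wait signals end 1.5 clocks apart."""
    return release_p1, release_p1 + OFFSET
```

With detection by processor 1 at clock 10, both processors resume at clock 12, so the 1.5-clock offset has been removed; on the way back to Lock mode the staggered release of the wait signals restores it.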

If the system is in Lock mode, both "Switch-Detect" units must notify the ModeSwitch unit that they want to switch to Split mode. If the changeover request comes from only one unit, the error is detected by the comparison units, since they continue to receive data from one of the two processors and these data do not match those of the stopped processor.

If the two processors are in Split mode and one does not switch back to Lock mode, this can be detected by an external watchdog. If there is a trigger signal for each processor, the watchdog notices that the waiting processor no longer responds. If there is only one watchdog signal for the processor system, the watchdog must be triggered only in Lock mode. The watchdog would thus recognize that the mode switch did not occur. The mode signal is available as a dual-rail signal, where "10" stands for Lock mode and "01" for Split mode. With "00" and "11", errors have occurred.
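A dual-rail signal carries one bit on two complementary wires, so that equal levels on both wires can never be a valid code word. The decoder below is a sketch under an explicit assumption: it takes "10" to mean Lock mode and "01" to mean Split mode, following the usual dual-rail convention; the function name is hypothetical.

```python
def decode_mode(rail_a: int, rail_b: int) -> str:
    """Decode a two-wire mode signal; equal rails can never be a valid code word."""
    if rail_a not in (0, 1) or rail_b not in (0, 1):
        raise ValueError("rails must be 0 or 1")
    if rail_a == rail_b:
        return "error"          # "00" and "11" indicate a fault
    # Assumption: "10" encodes Lock mode and "01" encodes Split mode.
    return "lock" if (rail_a, rail_b) == (1, 0) else "split"
```

A single stuck or flipped wire always turns a valid code word into "00" or "11", which is why this encoding suits a safety-critical status signal.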

IramControl: Access of the two processors to the instruction memory is controlled via the IRAM Control. It must be designed to be safe because it is a single point of failure. It consists of two state machines for each processor: one clock-synchronous, iram1clkreset, and one asynchronous, readiram1. In safety-critical mode, the state machines of the two processors monitor each other; in performance mode, they work separately.

The reloading of the two caches of the processors is controlled by two state machines: a synchronous state machine, iramclkreset, and an asynchronous one, readiram. These two state machines also distribute the memory accesses in split mode. Here processor 1 has the higher priority. After an access to the main memory by processor 1, processor 2 is granted the memory access permission if both processors want to access the main memory again. These two state machines are implemented for each processor. In lock mode, the output signals of the state machines are compared in order to detect any errors.

The data for updating cache 2 in lock mode are delayed by 1.5 clocks in the IRAM Control unit.

Bit 5 in register 0 of the SysControl encodes which core is concerned: for core 1 the bit is 0 and for core 2 it is high. This register is mirrored in the memory area at address 65528.

When core 2 accesses the memory, it is first checked which mode the computer is in. If it is in lock mode, its memory access is suppressed. This signal is provided as a dual-rail signal because it is safety-critical. The program counter of processor 1 is delayed by 1.5 clocks so that it can be compared with the program counter of processor 2 in lock mode.

In split mode, the caches of the two processors can be reloaded differently. When switching to lock mode, the two caches are then not coherent with each other. As a result, the two processors can diverge and the comparators then signal an error. To avoid this, a flag table is set up in the IRAM Control. It records whether a cache line was written in lock mode or in split mode. In lock mode, the entry belonging to a cache line is set to 0 when the line is reloaded; in split mode it is set to 1, even if the cache line is updated in only one cache. If the processor now executes a memory access in lock mode, it is checked whether this cache line was updated in lock mode, i.e. whether it is identical in both caches. In split mode, the processor can always access the cache line, regardless of the state of the flag vector. This table only has to be present once, since in the event of an error the two processors diverge and the error is thus reliably detected at the comparators. Since the access times to the central table are relatively long, the table can also be copied to each cache.
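The flag table for the cache lines can be modelled in a few lines. The sketch assumes one flag per cache line with 0 meaning "reloaded in lock mode" and 1 meaning "reloaded in split mode"; the class name, the method names and the choice to start with all flags at 1 after reset are assumptions made for illustration.

```python
class CacheLineFlags:
    """One flag per cache line: 0 = loaded in lock mode, 1 = loaded in split mode."""

    def __init__(self, num_lines: int):
        # Assumption: after reset no line is trusted, so every flag starts at 1.
        self.flags = [1] * num_lines

    def on_reload(self, line: int, mode: str) -> None:
        # A reload in lock mode marks the line as identical in both caches.
        self.flags[line] = 0 if mode == "lock" else 1

    def may_use(self, line: int, mode: str) -> bool:
        # Split mode ignores the flag; lock mode requires a lock-mode reload.
        return mode == "split" or self.flags[line] == 0
```

A line that fails `may_use` in lock mode would have to be reloaded before it is used, so that both processors see the same contents.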

DramControl: In this component, the parity is formed for each of the address, data, and memory control signals from each processor.

There is a process for both processors to lock the memory. This process does not have to be implemented in a safe manner because in lock mode faulty memory accesses are detected by the comparators and no safety-relevant applications are executed in split mode. It checks whether a processor wants to lock the memory against the other processor. The data memory is locked by accessing the memory address $FBFF = 64511. This signal should be present for exactly one clock, even if a wait command is present at the processor at the time of the call. The state machine for managing the data memory accesses consists of 2 main states:

- Processor status Lock: The two processors are in lock mode. This means that the data memory locking functionality is not needed. Processor 1 coordinates the memory accesses.

- Processor status Split: Access conflict resolution for the data memory is now necessary, and locking the memory must be possible.

The split-mode state is in turn divided into 7 states, which resolve the access conflicts and can lock the data memory against the other processor. If both processors request access at the same time, the order listed also represents the prioritization.

- Core1_Lock: Processor 1 has locked the data memory. If processor 2 wants to access the memory in this state, it is stopped by a wait signal until processor 1 releases the data memory again.

- Core2_Lock: The same state as the previous one, except that now processor 2 has locked the data memory and processor 1 is stopped during data memory operations.

- Lock1_wait: The data memory was locked by processor 2 when processor 1 also wanted to reserve it for itself. Processor 1 is thus earmarked for the next memory lock.

- nex: The same for processor 2. The data memory was locked by processor 1 during the attempted lock. Processor 2 gets the memory pre-reserved. In the case of normal memory accesses without locking, processor 2 can access the memory before processor 1 if processor 1 had its turn before.

- Memory access by processor 1: The memory is not locked in this case. Processor 1 is allowed to access the data memory. If it wants to lock the memory, it can do so in this state.

- Memory access by processor 2: Processor 1 did not want to access the memory in the same clock, so the memory is free for processor 2.

- No processor wants to access the data memory.
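The prioritization among these states can be approximated by a small arbitration function. This is a deliberately simplified sketch: it covers only the lock states and the plain access states, leaves out the two pre-reservation states, and does not model how a lock is released, since the text does not describe that. All function and parameter names are hypothetical; only the lock address is taken from the description.

```python
LOCK_ADDRESS = 0xFBFF  # 64511, the address whose access locks the data memory


def arbitrate(lock_owner, wants_p1, wants_p2):
    """Return (granted, stalled) for one clock in split mode.

    lock_owner is None, 1 or 2; granted is None, 1 or 2; stalled is a list.
    """
    requests = [p for p, wants in ((1, wants_p1), (2, wants_p2)) if wants]
    if lock_owner is not None:
        # Lock states: only the owner of the lock may access the memory.
        granted = lock_owner if lock_owner in requests else None
    else:
        # No lock: processor 1 is served first, then processor 2.
        granted = requests[0] if requests else None
    stalled = [p for p in requests if p != granted]
    return granted, stalled


def next_lock_owner(lock_owner, granted, address):
    """A granted access to LOCK_ADDRESS takes the lock if the memory is free."""
    if lock_owner is None and granted is not None and address == LOCK_ADDRESS:
        return granted
    return lock_owner
```

A stalled processor corresponds to one that is held by the wait signal until the memory becomes available again.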

As mentioned, the DVE is made up of the detection of the changeover request (IllOpDetect), the ModeSwitch unit and the Iram and DramControl.

FIG. 3 shows the clock changeover using an example in which the clock in one mode is changed compared to the other mode. The two modes, the clock clk and the two processor or core clocks are shown. In one mode, the two processors work with a clock offset. The clocks can be shifted relative to each other by whole clock cycles as well as by fractions of a clock cycle. Another variant is to use a different clock frequency in the two modes. In the safety-critical mode, for example, a lower clock frequency than in performance mode can be used to suppress interference. These two variants can also be combined with each other.

In addition, however, the illustrated special implementation solves the aforementioned tasks.

In implementations of two-processor systems in particular (dual-core), a cache is provided for each processor, as shown again schematically in FIG. A single cache is usually not sufficient because it would have to be located spatially between the two processors. Due to the long signal delay between the cache and the two processors, the processors could then operate only at a limited clock frequency.

Caches serve as fast buffer memory so that the processor does not always have to fetch data from the slow main memory. To make this possible, close attention must be paid to the access time when implementing the cache. The access time is made up of the actual time needed to fetch the data from the cache and the time needed to pass the data to the processor. If the cache is located far away from the processor, transferring the data takes a long time and the processor can no longer work at its full clock rate. Because of this timing problem, two-processor systems typically provide a separate cache for each processor.

If these two processors are now operated with a clock offset, the second cache for the slave processor can be dispensed with by means of the method proposed in FIG. 5.

A cache requires a lot of chip area and also a lot of power. As a result, it also produces a lot of waste heat, which must be dissipated. If a cache can be dispensed with, a two-processor system can be implemented much more cost-effectively. In the dual-computer system presented here, one processor is the master and one processor is the slave. The master processes the data first and thus also drives the peripheral components such as memory, cache, DMA controller, etc. The slave processes the same data with a clock offset of, for example, 1.5 clocks. This also means that it receives the data from the shared memory and from the external components later by this amount of time. The output data of the two processors, such as memory address, data, etc., are compared with each other. To be able to compare the data, the results of the master must also be buffered for 1.5 clocks. Such an example system is shown below.

In order to be able to use one cache for both processors according to FIG. 5, the command and data caches are arranged directly at the master, as in a single-processor system. The master therefore does not have to accept any performance loss with respect to the cache-to-processor propagation times. Since the slave only processes the data 1.5 clocks later, this time can be used to transfer the data to the second processor, which is now further away from the cache.

For this purpose, with an exemplary clock offset of 1.5 clocks, two flip-flops can be used, as shown in FIG. 6. The first is clocked with the master's clock, the second with the slave's clock. The first flip-flop is positioned directly at the output of the source. The second is positioned closer to the slave, at the distance the signal can travel in the time difference between the two clocks. With a clock offset of 1.5 clocks, this corresponds to the distance covered in half a clock; with a clock offset of 2 clocks, to the distance covered in one clock. The second flip-flop then takes over the signal. From there, the distance that the signal can cover during one whole clock can be bridged once more. In the figure this is represented by 1.) the close arrangement at the sink, 2.) the length which can be covered in the clock difference, and 3.) the length which can be covered in one clock after the second flip-flop.
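The placement rule for the two flip-flops can be expressed as a short calculation. The sketch below assumes that the distance between the flip-flops corresponds to the clock offset minus one clock, which reproduces the two cases named in the text (half a clock at an offset of 1.5 clocks, one clock at an offset of 2 clocks); extending this to other offsets is an assumption. The parameter `reach_per_clock_mm`, the distance a signal can cover in one clock, is a hypothetical input.

```python
def flip_flop_spans(offset_clocks, reach_per_clock_mm):
    """Return (between_flip_flops, after_second_flip_flop, total) in mm.

    offset_clocks: clock offset between master and slave, assumed here
        to lie in the range (1, 2] covered by the examples in the text.
    reach_per_clock_mm: hypothetical distance a signal covers in one clock.
    """
    if not 1 < offset_clocks <= 2:
        raise ValueError("sketch only covers offsets above 1 and up to 2 clocks")
    # Segment 2: distance covered in the difference between the two clocks.
    between = (offset_clocks - 1) * reach_per_clock_mm
    # Segment 3: one full clock of travel after the second flip-flop.
    after = reach_per_clock_mm
    return between, after, between + after


print(flip_flop_spans(1.5, 4.0))
print(flip_flop_spans(2, 4.0))
```

With these assumptions, the total bridgeable distance grows in proportion to the clock offset, which is why the slave can be placed further away from the cache than the master.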

Claims


1. A method for delaying accesses to data and / or instructions of a multiprocessor system having a first and a second processor with which a memory unit is associated, wherein the second processor operates with a clock offset, and the device is configured such that the first processor accesses the memory unit and the second processor receives the data with a clock offset.

2. A method for delaying the access to data and / or instructions according to claim 1, characterized in that the clock offset is used by a delay element to bridge the propagation time of the data and / or commands from the memory unit to the second processor.

3. A method for delaying the access to data and / or commands according to claim 1, characterized in that the clock offset is exploited to pass comparison data of the first processor to the second processor.

4. A method for delaying the access to data and / or commands according to claim 1, characterized in that write operations and read operations are delayed as accesses.

5. A method for delaying the access to data and / or commands according to claim 1, characterized in that only write operations are delayed as accesses.

6. A method for delaying the access to data and / or commands according to claim 1, characterized in that only read operations are delayed as accesses.

7. A method for delaying the access to data and / or instructions according to claim 1, characterized in that the clock offset is specified as a half-integer value.

8. A method for delaying the access to data and / or commands according to claim 1, characterized in that the clock offset is specified as an integer value.

9. A method for delaying the access to data and / or commands according to claim 1, characterized in that the clock offset is set to 1.5 clocks.

10. A device for delaying access to data and / or instructions of a multiprocessor system having a first and a second processor with which a memory unit is associated, wherein the second processor operates with a clock offset and the device is designed such that the first processor accesses the memory unit and the second processor receives the data and / or commands with a clock offset.

11. Device for delaying the access to data and / or commands according to claim 10, characterized in that the memory unit is a cache.

12. Device for delaying the access to data and / or commands according to claim 10, characterized in that the memory unit is addressed by at least one processor and the memory unit is coupled directly to the processor that addresses it.

13. A device for delaying the access to data and / or commands according to claim 10, characterized in that a delay element is included and the device is configured such that the clock offset is used by the delay element to bridge the propagation time of the data and / or commands from the memory unit to the second processor.

14. Device for delaying the access to data and / or commands according to claim 10, characterized in that comparison means are provided by which the data and / or commands are compared.

15. Device for delaying the access to data and / or commands according to claim 14, characterized in that the comparison means are arranged spatially close to the subsequent processor.

16. Device for delaying the access to data and / or commands according to claim 14, characterized in that the device is designed such that the clock offset is utilized to pass the comparison data of the first processor to the second processor.

17. Device for delaying the access to data and / or commands according to claim 10, characterized in that the device is designed such that write operations and read operations are delayed as accesses.

18. Device for delaying the access to data and / or commands according to claim 10, characterized in that the device is designed such that only write operations are delayed as accesses.

19. Device for delaying the access to data and / or commands according to claim 10, characterized in that the device is designed such that only read operations are delayed as accesses.

20. Device for delaying the access to data and / or commands according to claim 10, characterized in that the device is designed such that the clock offset is specified as a half-integer value.

21. Device for delaying the access to data and / or commands according to claim 10, characterized in that the device is designed such that the clock offset is specified as an integer value.

22. Device for delaying the access to data and / or commands according to claim 10, characterized in that the device is designed such that the clock offset is set to 1.5 clocks.

23. Multiprocessor system with a device according to one of claims 10 to 22.