US20080126718A1 - Method And Device For Monitoring A Memory Unit In A Mutliprocessor System - Google Patents
- Publication number
- US20080126718A1 (application number US11/666,407)
- Authority
- US
- United States
- Prior art keywords
- memory
- mode
- processor
- data
- logging
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- All entries fall under G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/30181—Instruction operation extension or modification
- G06F9/46—Multiprogramming arrangements
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1641—Error detection by comparing the output of redundant processing systems where the comparison is not performed by the redundant processing components
- G06F11/1695—Error detection or correction of the data by redundancy in hardware which are operating with time diversity
- G06F11/30—Monitoring
- G06F12/16—Protection against loss of memory contents
- G06F9/30189—Instruction operation extension or modification according to execution mode, e.g. mode flag
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F2201/845—Systems in which the redundancy can be transformed in increased performance
Definitions
- The present invention relates to a method and device for monitoring a memory unit in a multiprocessor system.
- Dual-processor systems are computer systems commonly used for safety-critical applications, especially in motor vehicles, for example for antilock systems, the electronic stability program (ESP), X-by-wire systems such as drive-by-wire or brake-by-wire, or for other networked systems.
- Processor units having at least two integrated execution units are known as dual-core or multi-core architectures.
- Dual-core or multi-core architectures are proposed mainly for two reasons:
- First, such an architecture allows performance to be increased by regarding and treating the two execution units or cores as two processing units on one semiconductor device.
- In this configuration, the two execution units or cores execute different programs or tasks. Because this increases performance, the configuration is termed performance mode.
- The second reason for implementing a dual-core or multi-core architecture is to increase safety by having the two execution units redundantly execute the same program.
- The results of the two execution units (CPUs or cores) are compared, and a fault can be detected in this conformity check.
- This configuration is referred to as “safety mode” or “fault detection mode”.
- The clock frequency of today's processors is typically significantly higher than the frequency with which a memory, especially an external memory, can be accessed.
- Cache memories are used to compensate for this time lag.
- The interaction of such a fast buffer memory with the corresponding main memory allows access times to be significantly reduced.
- In dual-processor (dual-core) systems, one cache is provided for each processor.
- Caches are used as fast intermediate memories so that the processor does not always have to retrieve the data from the slow main memory.
- Particular attention must be paid to the access time of a cache during its implementation. The access time is made up of the actual access time for retrieving the data from the cache and the time for transferring the data to the processor.
- A plurality of processors executes the same or different tasks. If they execute different tasks, a cache is usually coupled between each processor and the main memory. The cache is needed to decouple the different operating speeds of the main memory and the processor.
- If the dual-processor system operates in the mode in which the two processors execute different tasks, the caches of the processors are loaded with different data.
- When switching over to safety mode, in which the processors execute the same tasks and the output data are compared, the cache content must be deleted or marked invalid prior to switching over.
- An object of the present invention is to provide a method, a device, and an implementation that avoid this performance-reducing drawback, eliminating the need to completely delete or invalidate the cache each time a switchover is made from performance mode to safety mode.
- Processors are understood to also include cores or processing units.
- The present invention discloses a method and device for monitoring a memory unit in a system including at least two processing units. A switchover arrangement allows switching between at least two operating modes of the system, and the device is arranged to log the memory content and/or the operating mode in which the memory content was generated.
- The present invention also discloses a corresponding system and a corresponding memory unit, in particular a cache memory.
- A unit for distributing data from at least one data source is provided in a system including at least two processing units. A switchover arrangement (ModeSwitch) allows switching between at least two operating modes of the system, and the unit is designed such that the data distribution and/or the data source (in particular, instruction memory, data memory, cache) depends on the operating mode. Also disclosed is a system including such a unit.
- FIG. 1 shows a dual-processor system including a first processor 100, in particular a master processor, and a second processor 101, in particular a slave processor.
- FIG. 2 shows another view of the dual-processor or dual-core system.
- FIG. 3 is a schematic view showing a switchable dual-processor system with the caches.
- FIG. 4 shows an exemplary cache memory.
- The first operating mode corresponds to a safety mode, in which the two processing units execute or process the same programs and/or data and in which a comparison arrangement is provided by which the states occurring during the execution of the same programs are checked for conformity.
- The unit and method according to the present invention allow the two modes to be implemented in a dual-processor system without reducing the cache utilization performance.
- When the two processors operate in the fault detection mode (F mode), they receive the same data/instructions; when operating in the performance mode (P mode), each processor can access the memory. This unit then manages the accesses to the single existing memory or peripheral equipment.
- In the F mode, the unit reads the data/addresses of one processor (herein referred to as “master”) and forwards them to components such as the memory, bus, etc.
- The second processor (hereinafter “slave”) wishes to make the same access.
- The data distribution unit receives this request at a second port, but does not forward it to the other components.
- The data distribution unit (hereinafter “DDU”) transfers to the slave the same data as those transferred to the master and compares the data of the two processors. If these data differ, the DDU indicates this by an error signal.
- In the P mode, the two processors execute different program parts.
- The memory accesses are therefore also different.
- The DDU receives the requests of the processors and returns the results/requested data to the requesting processor. If both processors wish to access a component simultaneously, one processor is put into a waiting state until the other has been served.
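The two DDU behaviors described above can be sketched roughly as follows. This is only an illustrative Python model, not the disclosed circuit: the class name `DataDistributionUnit`, its method names, and the dictionary standing in for the single memory are all assumptions made for the example.

```python
class DataDistributionUnit:
    """Toy model of the DDU; every name here is a hypothetical illustration."""

    def __init__(self, memory):
        self.memory = memory   # the single shared memory, modelled as a dict
        self.error = False     # error signal raised on an F-mode mismatch

    def f_mode_read(self, master_addr, slave_addr):
        # Only the master's request is forwarded to the memory.
        data = self.memory[master_addr]
        # The slave's request is received on a second port and only compared.
        if master_addr != slave_addr:
            self.error = True
        # Both processors receive the same data.
        return data, data

    def p_mode_read(self, requests):
        # requests: dict of processor name -> address, issued simultaneously.
        # Requests are served one at a time; every later requester has to wait.
        results, waited = {}, []
        for position, (proc, addr) in enumerate(requests.items()):
            if position > 0:
                waited.append(proc)
            results[proc] = self.memory[addr]
        return results, waited
```

In this sketch the F-mode comparison is done on the request addresses; a fuller model would also compare write data and control signals.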
- The control signal can be generated either by one of the two processors or externally.
- The logging includes recording which memory content was generated in the performance mode. Additionally or alternatively, the logging advantageously includes recording which memory content was generated in the safety mode. For logging purposes, a table can be created and evaluated as a function of a mode signal indicative of the operating mode of at least one processing unit.
- Data are advantageously distinguished from other memory contents, in particular instructions, and the table additionally logs whether the data were changed in the memory unit in the performance mode and/or in the safety mode.
- The logging additionally includes recording whether the respective memory content is valid. This allows all memory contents, especially all data, to be invalidated when starting the processing units.
- One memory unit is provided for each processing unit, the logging is performed for each memory unit, and, in addition, the logs of the memory units are compared.
- Either only one table is created for the two processing units during logging, or one table is created for each processing unit, in which case the table entries are interchangeable between the tables.
- The table entries are checked for conformity. It is also advantageous that the validity information is evaluated in the safety mode.
- The DDU delays the data for the slave accordingly, or stores the output data of the master until they can be compared to the output data of the slave for fault detection purposes.
- The clock offset will be explained in more detail with reference to FIG. 1.
- FIG. 1 shows a dual-processor system including a first processor 100, in particular a master processor, and a second processor 101, in particular a slave processor.
- The entire system is operated with a predeterminable clock pulse, i.e., in predeterminable clock cycles (CLK).
- The clock pulse is supplied to the system via clock input CLK1 of processor 100 and via clock input CLK2 of processor 101.
- This dual-processor system includes, by way of example, a special feature for fault detection, namely that first processor 100 and second processor 101 operate with a time offset, in particular a predeterminable time offset or a predeterminable clock offset.
- Any time may be specified for a time offset, and any clock pulse may be specified for an offset of the clock cycles.
- This may be an integral clock cycle offset, but also, for example, an offset of 1.5 clock cycles, as illustrated in this example, in which first processor 100 operates 1.5 clock cycles ahead of second processor 101.
- This offset prevents so-called common mode failures from affecting the processors, i.e., the cores of the dual-core system, in the same manner, in which case the common mode failures would remain undetected. Due to the offset, such common mode failures affect the processors at different points in the program execution and therefore have different effects on the two processors, so that the faults become detectable. Without a clock offset, identical fault effects might not be detectable in a comparison.
- To implement this offset with respect to time or clock pulse, here in particular 1.5 clock cycles, offset blocks 112 through 115 are implemented in the dual-processor system.
- This system is designed, for example, to operate with a predetermined time offset or clock cycle offset, here in particular 1.5 clock cycles; i.e., while one processor, for example processor 100, accesses the components, in particular external components 103 and 104, directly, second processor 101 operates with a delay of exactly 1.5 clock cycles with respect thereto.
- To generate the desired delay of 1.5 clock cycles, processor 101 is supplied with the inverted clock pulse at clock input CLK2.
- 117 is an instruction bus, in which the instruction address bus is denoted by 117A and the sub-instruction (data) bus is denoted by 117B.
- Address bus 117A is connected via an instruction address port IA1 (Instruction Address 1) to processor 100 and via an instruction address port IA2 (Instruction Address 2) to processor 101.
- The instructions themselves are transmitted via sub-instruction bus 117B, which is connected via an instruction port I1 (Instruction 1) to processor 100 and via an instruction port I2 (Instruction 2) to processor 101.
- Instruction bus 117, which is formed by 117A and 117B, has interconnected therein a component 103, for example an instruction memory, in particular a safe instruction memory, or the like. In this example, this component, especially as an instruction memory, is also operated with clock pulse CLK.
- 116 represents a data bus, which includes a data address bus or data address line 116A and a data bus or data line 116B.
- Data address line 116A is connected via a data address port DA1 (Data Address 1) to processor 100 and via a data address port DA2 (Data Address 2) to processor 101.
- The data bus or data line 116B is connected via a data port DO1 (Data Out 1) and a data port DO2 (Data Out 2) to processor 100 and processor 101, respectively.
- Data bus line 116C is also part of data bus 116, and is connected via a data port DI1 (Data In 1) and a data port DI2 (Data In 2) to processor 100 and processor 101, respectively.
- Data bus 116, which is formed by lines 116A, 116B and 116C, has interconnected therein a component 104, for example a data memory, in particular a safe data memory, or the like. In this example, this component 104 is also supplied with clock pulse CLK.
- Components 103 and 104 represent any components that are connected via a data bus and/or instruction bus to the processors of the dual-processor system and that may receive or output erroneous data during write and/or read operations, depending on the accesses via data and/or instructions of the dual-processor system.
- Error detection generators 105, 106 and 107 are provided, which generate an error code such as a parity bit or another suitable error code, for example an error correction code (ECC).
- Corresponding error detection checkers or checking devices 108 and 109 are also provided, which check the respective error code, for example the parity bit or another error code such as ECC.
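As a rough illustration of the generator and checker pair, the following Python sketch computes and verifies a single even-parity bit. The function names `parity_bit` and `check_parity` are hypothetical, and even parity is an assumption; the text leaves the choice of error code open.

```python
def parity_bit(word: int) -> int:
    """Generator side: even-parity bit, 1 when the word has an odd number of set bits."""
    return bin(word).count("1") % 2


def check_parity(word: int, stored_parity: int) -> bool:
    """Checker side: True when the stored parity bit still matches the word."""
    return parity_bit(word) == stored_parity
```

A single parity bit detects any odd number of flipped bits but cannot correct them, which is why the text mentions ECC as an alternative.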
- The comparison of the data and/or instructions with respect to the redundant embodiment in the dual-processor system is performed in comparators 110 and 111, as shown in FIG. 1.
- If there is a time offset, in particular a clock pulse or clock cycle offset, processors 100 and 101 may write or read erroneous data and/or instructions in components, especially external components such as memories 103 or 104, or also with respect to other stations, actuators or sensors, during this time or clock offset.
- The processor may also erroneously perform, for example, a write access instead of an intended read access.
- A delay unit 102 is interconnected in the lines of the data bus and/or in the instruction bus, as shown. For the sake of clarity, only the interconnection in the data bus is illustrated; the same is equally possible for the instruction bus.
- Delay unit 102 delays the accesses, here especially the memory accesses, in such a way that a possible time or clock offset is compensated for, especially when performing fault detection, for example using comparators 110 and 111. The accesses are delayed, for example, until the error signal is generated in the dual-processor system, i.e., until the fault detection has been performed.
- A delayed write operation can be converted to a read operation in order to prevent erroneous writing.
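The behavior of the delay unit can be sketched as a short pipeline that holds each access for a fixed number of ticks and downgrades a held write to a read if a fault has been flagged in the meantime. This is a minimal Python model under stated assumptions: the class name `DelayUnit`, the `(kind, address, data)` tuple layout, and the tick granularity are all hypothetical.

```python
from collections import deque


class DelayUnit:
    """Holds each access for `delay` ticks before releasing it (illustrative model)."""

    def __init__(self, delay: int):
        # Pre-fill with empty slots so the first access waits the full delay.
        self.pipe = deque([None] * delay)

    def tick(self, access, error: bool):
        # Push the new access in and release the one that has waited `delay` ticks.
        self.pipe.append(access)
        released = self.pipe.popleft()
        if released is not None and error and released[0] == "write":
            # A fault was flagged while the write was held back:
            # convert it to a read so that no erroneous data is written.
            released = ("read", released[1], None)
        return released
```

With a delay of three half-cycle ticks, this corresponds to the 1.5 clock cycle offset used in the example system.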
- The data distribution unit (DDU) is formed by a device for detecting the switchover request (IllOpDetect), the Mode Switch unit, and the Iram and Dram control blocks, described below with reference to FIG. 2.
- IllOpDetect: The switchover between the two modes is detected by the “Switch Detect” units. Such a unit is located between the cache and the processor on the instruction bus and observes whether the IllOp instruction is loaded into the processor. If this instruction is detected, the result is communicated to the Mode Switch unit.
- A “Switch Detect” unit is provided for each processor separately.
- The “Switch Detect” unit does not need to be fault-tolerant, because two such units are provided, which makes them redundant. Alternatively, the unit can be designed to be fault-tolerant and thus implemented as a single unit, but preference may be given to the redundant design.
- ModeSwitch: Switching between the two modes is triggered by the “Switch Detect” unit. If a switchover is to be made from the lock mode to the split mode, both “Switch Detect” units detect the switchover, because both processors execute the same program code in the lock mode.
- The “Switch Detect” unit of processor 1 detects this 1.5 clock pulses before the “Switch Detect” unit of processor 2.
- The “Mode Switch” unit halts processor 1 for 2 clock pulses with the aid of the wait signal. 1.5 clock pulses later, processor 2 is also halted, but only for half a clock pulse, in order to be synchronized with the system clock.
- The status signal is then switched to split for the other components, and the two processors continue to operate.
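The halt durations given above remove the 1.5 clock pulse offset, which a short calculation makes visible. The Python sketch below is only a worked arithmetic example; the function name `resume_times` and the use of clock pulses as the time unit are assumptions made for illustration.

```python
from fractions import Fraction

OFFSET = Fraction(3, 2)  # processor 1 runs 1.5 clock pulses ahead of processor 2


def resume_times(t_detect_p1):
    """Return the times (in clock pulses) at which each processor resumes."""
    # Processor 1 is halted for 2 clock pulses from the moment of detection.
    p1_resume = t_detect_p1 + 2
    # Processor 2 reaches the switching instruction 1.5 pulses later
    # and is halted for only half a clock pulse.
    p2_resume = t_detect_p1 + OFFSET + Fraction(1, 2)
    return p1_resume, p2_resume
```

Both values come out equal, so after the switchover to split mode the two processors continue without the offset.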
- In order for the two processors to execute different tasks, they must diverge in terms of the program code. This is accomplished by performing a read access to the processor ID immediately after switching to the split mode. The processor ID read out is different for each of the two processors. If a comparison is now made with a reference processor ID, the respective processor can be taken to a different program location using a conditional jump instruction.
- A switchover from the split mode to the lock mode is detected by one of the two processors first.
- This processor executes program code that contains the switching instruction. This is detected by the “Switch Detect” unit and communicated by it to the Mode Switch unit.
- The Mode Switch unit halts the respective processor and communicates the request for synchronization to the second processor using an interrupt.
- The second processor receives the interrupt and can now execute a software routine to complete its task. It then also jumps to the program location containing the switching instruction, and its “Switch Detect” unit also signals the mode switch request to the Mode Switch unit.
- The wait signal is deactivated for processor 1, and 1.5 clock pulses later for processor 2.
- Both processors then operate synchronously again with a clock offset of 1.5 clock pulses.
- When the system is in lock mode, both “Switch Detect” units must inform the Mode Switch unit that they wish to change to the split mode. If only one unit issues a switchover request, the comparison units will detect the fault, because one of the two processors continues to supply data to the comparison units, and these data do not match those of the halted processor.
- If the two processors are in split mode and one processor does not switch back to lock mode, this can be detected by an external watchdog. If there is a trigger signal for each processor, the watchdog notices that the waiting processor is no longer sending any signals. If there is only one watchdog signal for the processor system, the watchdog may be triggered only in the lock mode. Consequently, the watchdog would detect that the mode switch has not occurred.
- The mode signal is present as a dual-rail signal. In this context, “10” stands for the lock mode and “01” stands for the split mode. The values “00” and “11” indicate that a fault has occurred.
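The dual-rail encoding of the mode signal can be decoded as in the following Python sketch. The function name `decode_mode` and the representation of the signal as a two-character string are assumptions for illustration only.

```python
def decode_mode(signal: str) -> str:
    """Decode the two-bit dual-rail mode signal into 'lock', 'split' or 'fault'."""
    if signal == "10":
        return "lock"
    if signal == "01":
        return "split"
    if signal in ("00", "11"):
        # Both rails carry the same value: a valid mode is never encoded this way.
        return "fault"
    raise ValueError(f"not a two-bit dual-rail signal: {signal!r}")
```

Because the two valid code words differ in both bits, any single stuck or flipped rail yields “00” or “11” and is therefore recognizable as a fault.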
- IramControl: Access to the instruction memory of the two processors is controlled by the IRAM Control, which must be safe, because it is a single point of failure.
- The IRAM Control includes two finite automatons for each processor, in the form of a clocked iram1clkreset and an asynchronous readiram1, respectively. In the safety-critical mode, the finite automatons of the two processors monitor each other; in the performance mode, they operate separately.
- The reloading of the two caches of the processors is controlled by two finite automatons.
- These two finite automatons also distribute the memory accesses in the split mode. In this process, processor 1 has the higher priority. After an access to the main memory by processor 1, processor 2 is given memory access authorization if both processors wish to access the main memory again.
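The split-mode arbitration rule described above (processor 1 has priority, but processor 2 goes next if both request again right after processor 1 was served) can be sketched as a small Python function. The name `arbitrate` and the encoding of the result as 0, 1 or 2 are hypothetical.

```python
def arbitrate(req1: bool, req2: bool, last_granted: int) -> int:
    """Return 1 or 2 for the processor granted main-memory access, 0 if none requests."""
    if req1 and req2:
        # Processor 1 has the higher priority, unless it was served last time:
        # then processor 2 is given memory access authorization.
        return 2 if last_granted == 1 else 1
    if req1:
        return 1
    if req2:
        return 2
    return 0
```

Feeding the returned value back in as `last_granted` on the next call makes the two processors alternate under sustained contention.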
- These two finite automatons are implemented for each processor. In the lock mode, the output signals of the automatons are compared in order to detect any errors that may occur.
- The data for updating cache 2 in the lock mode are delayed by 1.5 clock pulses in the IRAM Control unit.
- The program counter of processor 1 is delayed by 1.5 clock pulses so that it can be compared to the program counter of processor 2 in the lock mode.
- In split mode, the caches of the two processors can be reloaded differently. If a switchover is now made to lock mode, the two caches are not coherent with each other. Because of this, the two processors may diverge, and the comparators would signal a fault. To avoid this, a flag table is set up in the IRAM Control. This table records whether a cache line was written in lock mode or in split mode. In the lock mode, the corresponding entry for the cache line is set to 0, and in the split mode, it is set to 1, even if the cache line of only one cache is updated. If the processor now performs a memory access in lock mode, a check is made as to whether this cache line was updated in the lock mode, that is, whether it is identical in both caches.
- In split mode, the processor can always access the cache line, regardless of the state of the Flag_Vector. Only one such table needs to exist, because a fault causes the two processors to diverge and is therefore reliably detected in the comparators. Since the access times to the central table are relatively high, this table may also be copied to each cache.
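The flag table can be modelled as one bit per cache line, as in the Python sketch below. This is an illustration under assumptions rather than the disclosed hardware: the class name `FlagTable` and its methods are hypothetical, and initializing every flag to 1 (so that the first lock-mode access forces a reload) is a choice made for this example.

```python
class FlagTable:
    """One flag per cache line: 0 = written in lock mode, 1 = written in split mode."""

    def __init__(self, n_lines: int):
        # Assumption: nothing is known to be identical in both caches at start-up.
        self.flags = [1] * n_lines

    def record_write(self, line: int, mode: str) -> None:
        # Lock-mode fills clear the flag; split-mode fills set it,
        # even if only one of the two caches was updated.
        self.flags[line] = 0 if mode == "lock" else 1

    def may_use_cached(self, line: int, mode: str) -> bool:
        # Split mode ignores the flag; lock mode requires a lock-mode fill.
        return mode == "split" or self.flags[line] == 0
```

When `may_use_cached` returns False in lock mode, the line would have to be reloaded into both caches before the processors use it.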
- DramControl: In this component, the parity is generated for the address, data, and memory control signals from each processor.
- Processor state LOCK: The two processors operate in the lock mode. That is, the data memory locking functionality is not needed. Processor 1 coordinates the memory accesses.
- Processor state SPLIT: Now, access conflict resolution is needed on the data memory, and memory locking must be possible.
- The state in the split mode is, in turn, divided into 7 states, which allow access conflicts to be resolved and the data memory to be locked for the respective other processor.
- The order in which the states are listed also represents the priority assignment.
- Core1_Lock: Processor 1 has locked the data memory. If, in this state, processor 2 wishes to access the memory, it is halted by a wait signal until processor 1 releases the data memory.
- Core2_Lock: This is the same state as the previous one, except that now processor 2 has locked the data memory and processor 1 is halted in the case of data memory operations.
- lock1_wait: The data memory was locked by processor 2 when processor 1 also wished to reserve it for itself. Therefore, processor 1 is earmarked for the next memory locking operation.
- nex: The same for processor 2.
- The data memory was locked when processor 1 attempted to lock it.
- The memory is pre-reserved for processor 2.
- Processor 2 can perform an access before processor 1 here if, before that, it had been the turn of processor 1.
- Memory access of processor 1: In this case, the memory is not locked. Processor 1 may access the data memory. If it wishes to lock the data memory, it can do so in this state.
- Processor 1 did not wish to access the memory and, consequently, the memory is free for processor 2.
- The DDU is formed by the device for detecting the switchover request (IllOpDetect), the ModeSwitch unit, and the IramControl and DramControl.
- FIG. 3 is a schematic view showing a switchable dual-processor system with the caches.
- One cache memory is shown by way of example in FIG. 4.
- No coherence problem occurs with respect to an instruction cache. Therefore, no snooping has been used so far.
- The approach is now to perform snooping of the instructions that are loaded into the respective caches of the processors.
- A table is set up:
- Each time a cache access is made, it is only checked whether the cache contains valid values. In the lock mode, however, this new table is also queried. If the data are marked invalid in this table, there may indeed be valid data in the caches, but these data are not identical in the caches. In the lock mode, the comparator of the dual-processor system would therefore indicate a fault, because the two processors would diverge.
- If this table is also used for the data memory, it must additionally be checked, for data that were loaded in the lock mode, not only whether the cache line was replaced in the split mode, but also whether the data were updated by a processor in one of the caches.
Event | Cache Valid Field | New Higher-Order Table | Action |
---|---|---|---|
System starts up | All data invalid | All data invalid | |
Cache line is loaded in lock mode | Cache line valid | Cache line valid | |
Cache line is loaded in split mode | Cache line valid | Cache line invalid | When accessing the cache line in lock mode, this cache line must be reloaded in all caches of the processors, even if it is already marked valid in one cache. |
- A second variant of the table may be as follows:
- In the second variant, the table includes only the set and tag fields, but for each member separately. This makes the table larger, but the advantage is that, in the split mode, the system centrally documents for both caches what their contents look like. Then, in the lock mode, a comparison of the tables makes it possible to determine whether these data are identical in both caches. Thus, unlike the first method, cache lines can be updated at different points in time without being marked as invalid for the lock mode.
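The lock-mode comparison of the second variant can be sketched as a lookup in two per-cache tables that map a set index to the stored tag. This Python fragment is a simplified illustration: the function name `identical_in_both` and the dictionary representation are assumptions, and it models a direct-mapped cache with one tag per set.

```python
def identical_in_both(tags_cache1: dict, tags_cache2: dict, set_index: int) -> bool:
    """True if both caches hold a line with the same tag in the given set."""
    tag1 = tags_cache1.get(set_index)
    tag2 = tags_cache2.get(set_index)
    # A missing entry in either table means the line cannot be relied on in lock mode.
    return tag1 is not None and tag1 == tag2
```

This check covers only which line is cached; for a data cache, whether the contents were modified in split mode would have to be tracked in addition.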
- the core of the present invention is the logging of the data in the cache.
- the object mentioned at the outset is also achieved by the specific implementation described.
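The two table variants described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the names `CacheLine`, `ValidityTable`, `TagLog`, and the `LOCK`/`SPLIT` labels are hypothetical, the caches are modeled as direct-mapped Python lists, and main memory is a plain list. For the second variant the sketch further assumes that equal tags imply equal contents, which holds when the memory contents do not change between the two loads (as with instructions).

```python
from dataclasses import dataclass

LOCK, SPLIT = "lock", "split"  # hypothetical mode labels


@dataclass
class CacheLine:
    valid: bool = False  # the cache's own valid field
    data: int = 0


class ValidityTable:
    """First variant: one extra valid bit per cache line, shared by all caches."""

    def __init__(self, n_lines, n_cpus=2):
        self.caches = [[CacheLine() for _ in range(n_lines)] for _ in range(n_cpus)]
        self.lock_valid = [False] * n_lines  # system start: all data invalid

    def load(self, mode, line, memory, cpu=0):
        if mode == LOCK:
            # A lock-mode load fills every cache identically and marks the line valid.
            for cache in self.caches:
                cache[line] = CacheLine(True, memory[line])
            self.lock_valid[line] = True
        else:
            # A split-mode load touches one cache only, so the line is no longer
            # guaranteed identical and becomes invalid for the lock mode.
            self.caches[cpu][line] = CacheLine(True, memory[line])
            self.lock_valid[line] = False

    def read(self, mode, line, memory, cpu=0):
        hit = self.caches[cpu][line].valid
        if mode == LOCK:
            hit = hit and self.lock_valid[line]  # the new table is also queried
        if not hit:
            self.load(mode, line, memory, cpu)  # in lock mode this reloads all caches
        return self.caches[cpu][line].data, hit


class TagLog:
    """Second variant: the tag held in each set is logged separately per cache."""

    def __init__(self, n_sets, n_caches=2):
        self.tags = [[None] * n_sets for _ in range(n_caches)]

    def record_load(self, cache_id, set_index, tag):
        self.tags[cache_id][set_index] = tag

    def identical(self, set_index):
        # Usable in lock mode only if every cache logged the same tag for this set.
        first = self.tags[0][set_index]
        return first is not None and all(t[set_index] == first for t in self.tags)
```

In the first model, a line loaded in the split mode stays usable in the split mode but forces a reload of all caches on the next lock-mode access. In the second model, the two caches may load the same line at different times and the line still counts as identical once both logs agree. Updates by a processor to cached data, which the text raises for the data memory, are not modeled here.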
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Hardware Redundancy (AREA)
- Multi Processors (AREA)
- Techniques For Improving Reliability Of Storages (AREA)
- Synchronisation In Digital Transmission Systems (AREA)
- Debugging And Monitoring (AREA)
Applications Claiming Priority (11)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE102004051964.1 | 2004-10-25 | ||
DE102004051952A DE102004051952A1 (de) | 2004-10-25 | 2004-10-25 | Method for data distribution and data distribution unit in a multiprocessor system |
DE102004051950A DE102004051950A1 (de) | 2004-10-25 | 2004-10-25 | Method and device for clock changeover in a multiprocessor system |
DE102004051950.1 | 2004-10-25 | ||
DE200410051964 DE102004051964A1 (de) | 2004-10-25 | 2004-10-25 | Method and device for monitoring a memory unit in a multiprocessor system |
DE102004051952.8 | 2004-10-25 | ||
DE200410051937 DE102004051937A1 (de) | 2004-10-25 | 2004-10-25 | Method and device for synchronization in a multiprocessor system |
DE102004051937.4 | 2004-10-25 | ||
DE200410051992 DE102004051992A1 (de) | 2004-10-25 | 2004-10-25 | Method and device for delaying accesses to data and/or instructions of a multiprocessor system |
DE102004051992.7 | 2004-10-25 | ||
PCT/EP2005/055538 WO2006045801A2 (de) | 2004-10-25 | 2005-10-25 | Method and device for monitoring a memory unit in a multiprocessor system |
Publications (1)
Publication Number | Publication Date |
---|---|
US20080126718A1 (en) | 2008-05-29 |
Family
ID=35677569
Family Applications (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/666,405 Active 2027-04-27 US7853819B2 (en) | 2004-10-25 | 2005-10-25 | Method and device for clock changeover in a multi-processor system |
US11/666,413 Abandoned US20090164826A1 (en) | 2004-10-25 | 2005-10-25 | Method and device for synchronizing in a multiprocessor system |
US11/666,406 Abandoned US20080163035A1 (en) | 2004-10-25 | 2005-10-25 | Method for Data Distribution and Data Distribution Unit in a Multiprocessor System |
US11/666,407 Abandoned US20080126718A1 (en) | 2004-10-25 | 2005-10-25 | Method And Device For Monitoring A Memory Unit In A Mutliprocessor System |
Country Status (8)
Country | Link |
---|---|
US (4) | US7853819B2 (de) |
EP (5) | EP1812861A1 (de) |
JP (5) | JP4532561B2 (de) |
KR (4) | KR20070067168A (de) |
AT (2) | ATE407398T1 (de) |
DE (2) | DE502005005490D1 (de) |
RU (1) | RU2007119316A (de) |
WO (5) | WO2006045804A1 (de) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090177290A1 (en) * | 2007-12-27 | 2009-07-09 | Horst-Dieter Nikolai | Safety controller |
US20090249271A1 (en) * | 2008-03-27 | 2009-10-01 | Hiromichi Yamada | Microcontroller, control system and design method of microcontroller |
US20100262811A1 (en) * | 2009-04-08 | 2010-10-14 | Moyer William C | Debug signaling in a multiple processor data processing system |
US20110213948A1 (en) * | 2010-02-01 | 2011-09-01 | Steven Perry | Efficient Processor Apparatus and Associated Methods |
US20120272007A1 (en) * | 2011-04-19 | 2012-10-25 | Freescale Semiconductor, Inc. | Cache memory with dynamic lockstep support |
US10025281B2 (en) | 2011-03-15 | 2018-07-17 | Omron Corporation | Control device and system program, and recording medium |
Families Citing this family (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7882379B2 (en) * | 2006-09-22 | 2011-02-01 | Sony Computer Entertainment Inc. | Power consumption reduction in a multiprocessor system |
US20080244305A1 (en) * | 2007-03-30 | 2008-10-02 | Texas Instruments Deutschland, Gmbh | Delayed lock-step cpu compare |
US7941698B1 (en) * | 2008-04-30 | 2011-05-10 | Hewlett-Packard Development Company, L.P. | Selective availability in processor systems |
JP2010198131A (ja) * | 2009-02-23 | 2010-09-09 | Renesas Electronics Corp | Processor system and operation mode switching method for processor system |
US8295287B2 (en) * | 2010-01-27 | 2012-10-23 | National Instruments Corporation | Network traffic shaping for reducing bus jitter on a real time controller |
US9052887B2 (en) | 2010-02-16 | 2015-06-09 | Freescale Semiconductor, Inc. | Fault tolerance of data processing steps operating in either a parallel operation mode or a non-synchronous redundant operation mode |
KR101664108B1 (ko) | 2010-04-13 | 2016-10-11 | 삼성전자주식회사 | Hardware acceleration apparatus and method for efficiently processing multi-core synchronization |
JP5718600B2 (ja) * | 2010-09-10 | 2015-05-13 | 日本電気通信システム株式会社 | Information processing system and information processing method |
US8683251B2 (en) | 2010-10-15 | 2014-03-25 | International Business Machines Corporation | Determining redundancy of power feeds connecting a server to a power supply |
WO2012144011A1 (ja) | 2011-04-18 | 2012-10-26 | 富士通株式会社 | Thread processing method and thread processing system |
US9842014B2 (en) | 2012-11-22 | 2017-12-12 | Nxp Usa, Inc. | Data processing device, method of execution error detection and integrated circuit |
US9429981B2 (en) * | 2013-03-05 | 2016-08-30 | St-Ericsson Sa | CPU current ripple and OCV effect mitigation |
US9823983B2 (en) | 2014-09-25 | 2017-11-21 | Nxp Usa, Inc. | Electronic fault detection unit |
WO2016087175A1 (de) * | 2014-12-01 | 2016-06-09 | Continental Teves Ag & Co. Ohg | Computing system for a motor vehicle system |
JP6516097B2 (ja) * | 2015-06-11 | 2019-05-22 | 大日本印刷株式会社 | Arithmetic device, IC card, arithmetic method, and arithmetic processing program |
JP2019061392A (ja) | 2017-09-26 | 2019-04-18 | ルネサスエレクトロニクス株式会社 | Microcontroller and control method of microcontroller |
US10642826B1 (en) | 2018-08-30 | 2020-05-05 | Gravic, Inc. | Mixed-mode method for combining active/active and validation architectures utilizing a check integrity module |
US11269799B2 (en) * | 2019-05-03 | 2022-03-08 | Arm Limited | Cluster of processing elements having split mode and lock mode |
US11899547B2 (en) * | 2021-11-30 | 2024-02-13 | Mellanox Technologies, Ltd. | Transaction based fault tolerant computing system |
US12032460B2 (en) * | 2022-02-11 | 2024-07-09 | Stmicroelectronics S.R.L. | Systems and methods to test an asynchronous finite machine |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5809522A (en) * | 1995-12-18 | 1998-09-15 | Advanced Micro Devices, Inc. | Microprocessor system with process identification tag entries to reduce cache flushing after a context switch |
US20020073357A1 (en) * | 2000-12-11 | 2002-06-13 | International Business Machines Corporation | Multiprocessor with pair-wise high reliability mode, and method therefore |
US6615366B1 (en) * | 1999-12-21 | 2003-09-02 | Intel Corporation | Microprocessor with dual execution core operable in high reliability mode |
US6947047B1 (en) * | 2001-09-20 | 2005-09-20 | Nvidia Corporation | Method and system for programmable pipelined graphics processing with branching instructions |
US20070245133A1 (en) * | 2003-10-24 | 2007-10-18 | Reinhard Weiberle | Method and Device for Switching Between at Least Two Operating Modes of a Processor Unit |
Family Cites Families (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE1269827B (de) * | 1965-09-09 | 1968-06-06 | Siemens Ag | Method and auxiliary device for synchronizing data processing systems operating in parallel |
US3783250A (en) * | 1972-02-25 | 1974-01-01 | Nasa | Adaptive voting computer system |
US4823256A (en) | 1984-06-22 | 1989-04-18 | American Telephone And Telegraph Company, At&T Bell Laboratories | Reconfigurable dual processor system |
AU616213B2 (en) * | 1987-11-09 | 1991-10-24 | Tandem Computers Incorporated | Method and apparatus for synchronizing a plurality of processors |
US6038584A (en) * | 1989-11-17 | 2000-03-14 | Texas Instruments Incorporated | Synchronized MIMD multi-processing system and method of operation |
US5226152A (en) * | 1990-12-07 | 1993-07-06 | Motorola, Inc. | Functional lockstep arrangement for redundant processors |
DE4104114C2 (de) * | 1991-02-11 | 2000-06-08 | Siemens Ag | Redundant data processing system |
JPH05128080A (ja) * | 1991-10-14 | 1993-05-25 | Mitsubishi Electric Corp | Information processing device |
US5751932A (en) * | 1992-12-17 | 1998-05-12 | Tandem Computers Incorporated | Fail-fast, fail-functional, fault-tolerant multiprocessor system |
JPH07121483A (ja) * | 1993-10-28 | 1995-05-12 | Nec Eng Ltd | Shared memory access control circuit |
US5758132A (en) | 1995-03-29 | 1998-05-26 | Telefonaktiebolaget Lm Ericsson | Clock control system and method using circuitry operating at lower clock frequency for selecting and synchronizing the switching of higher frequency clock signals |
CA2178440A1 (en) * | 1995-06-07 | 1996-12-08 | Robert W. Horst | Fail-fast, fail-functional, fault-tolerant multiprocessor system |
JPH096733A (ja) * | 1995-06-14 | 1997-01-10 | Toshiba Corp | Parallel signal processing device |
JPH0973436A (ja) * | 1995-09-05 | 1997-03-18 | Mitsubishi Electric Corp | Operation mode switching method in a multiplexed computer |
US5732209A (en) * | 1995-11-29 | 1998-03-24 | Exponential Technology, Inc. | Self-testing multi-processor die with internal compare points |
FR2748136B1 (fr) * | 1996-04-30 | 1998-07-31 | Sextant Avionique | Electronic module with redundant architecture for checking the integrity of operation |
GB2317032A (en) * | 1996-09-07 | 1998-03-11 | Motorola Gmbh | Microprocessor fail-safe system |
GB9704542D0 (en) * | 1997-03-05 | 1997-04-23 | Sgs Thomson Microelectronics | A cache coherency mechanism |
EP0978784A1 (de) * | 1998-08-04 | 2000-02-09 | Motorola, Inc. | Method for computer program coding and method for debugging coded computer programs |
GB2340627B (en) * | 1998-08-13 | 2000-10-04 | Plessey Telecomm | Data processing system |
JP2000200255A (ja) * | 1999-01-07 | 2000-07-18 | Hitachi Ltd | Synchronization method between processors and synchronization circuit |
WO2000079405A1 (fr) * | 1999-06-21 | 2000-12-28 | Hitachi, Ltd. | Data processor |
US6640313B1 (en) * | 1999-12-21 | 2003-10-28 | Intel Corporation | Microprocessor with high-reliability operating mode |
DE10136335B4 (de) | 2001-07-26 | 2007-03-22 | Infineon Technologies Ag | Processor with multiple arithmetic units |
US20040076189A1 (en) * | 2002-10-17 | 2004-04-22 | International Business Machines Corporation | Multiphase clocking method and apparatus |
US7055060B2 (en) | 2002-12-19 | 2006-05-30 | Intel Corporation | On-die mechanism for high-reliability processor |
JP2004234144A (ja) * | 2003-01-29 | 2004-08-19 | Hitachi Ltd | Operation comparison device and operation comparison method for processors |
WO2005003962A2 (de) * | 2003-06-24 | 2005-01-13 | Robert Bosch Gmbh | Method for switching between at least two operating modes of a processor unit and corresponding processor unit |
US7134031B2 (en) * | 2003-08-04 | 2006-11-07 | Arm Limited | Performance control within a multi-processor system |
- 2005
- 2005-10-25 EP EP05811008A patent/EP1812861A1/de not_active Ceased
- 2005-10-25 US US11/666,405 patent/US7853819B2/en active Active
- 2005-10-25 KR KR1020077009252A patent/KR20070067168A/ko not_active Application Discontinuation
- 2005-10-25 WO PCT/EP2005/055542 patent/WO2006045804A1/de active Application Filing
- 2005-10-25 EP EP05797084A patent/EP1810145B1/de active Active
- 2005-10-25 RU RU2007119316/09A patent/RU2007119316A/ru not_active Application Discontinuation
- 2005-10-25 JP JP2007537302A patent/JP4532561B2/ja not_active Expired - Fee Related
- 2005-10-25 EP EP05801268A patent/EP1807761A1/de not_active Ceased
- 2005-10-25 WO PCT/EP2005/055539 patent/WO2006045802A2/de active Application Filing
- 2005-10-25 WO PCT/EP2005/055532 patent/WO2006045798A1/de active Application Filing
- 2005-10-25 EP EP05811107A patent/EP1820102A2/de not_active Withdrawn
- 2005-10-25 KR KR1020077009251A patent/KR20070062579A/ko not_active Application Discontinuation
- 2005-10-25 AT AT05797084T patent/ATE407398T1/de not_active IP Right Cessation
- 2005-10-25 EP EP05801543A patent/EP1807763B1/de not_active Not-in-force
- 2005-10-25 WO PCT/EP2005/055538 patent/WO2006045801A2/de active IP Right Grant
- 2005-10-25 KR KR1020077009253A patent/KR20070083772A/ko active IP Right Grant
- 2005-10-25 WO PCT/EP2005/055537 patent/WO2006045800A1/de active Application Filing
- 2005-10-25 AT AT05801543T patent/ATE409327T1/de not_active IP Right Cessation
- 2005-10-25 US US11/666,413 patent/US20090164826A1/en not_active Abandoned
- 2005-10-25 DE DE502005005490T patent/DE502005005490D1/de active Active
- 2005-10-25 US US11/666,406 patent/US20080163035A1/en not_active Abandoned
- 2005-10-25 JP JP2007537304A patent/JP2008518311A/ja active Pending
- 2005-10-25 JP JP2007537301A patent/JP2008518308A/ja active Pending
- 2005-10-25 KR KR1020077009250A patent/KR20070083771A/ko not_active Application Discontinuation
- 2005-10-25 JP JP2007537303A patent/JP2008518310A/ja active Pending
- 2005-10-25 DE DE502005005284T patent/DE502005005284D1/de active Active
- 2005-10-25 JP JP2007537305A patent/JP2008518312A/ja active Pending
- 2005-10-25 US US11/666,407 patent/US20080126718A1/en not_active Abandoned
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8010723B2 (en) * | 2007-12-27 | 2011-08-30 | Robert Bosch Gmbh | Safety controller with data lock |
US20090177290A1 (en) * | 2007-12-27 | 2009-07-09 | Horst-Dieter Nikolai | Safety controller |
US20090249271A1 (en) * | 2008-03-27 | 2009-10-01 | Hiromichi Yamada | Microcontroller, control system and design method of microcontroller |
US7890233B2 (en) | 2008-03-27 | 2011-02-15 | Renesas Electronics Corporation | Microcontroller, control system and design method of microcontroller |
US20110106335A1 (en) * | 2008-03-27 | 2011-05-05 | Renesas Electronics Corporation | Microcontroller, control system and design method of microcontroller |
US8046137B2 (en) | 2008-03-27 | 2011-10-25 | Renesas Electronics Corporation | Microcontroller, control system and design method of microcontroller |
US8275977B2 (en) | 2009-04-08 | 2012-09-25 | Freescale Semiconductor, Inc. | Debug signaling in a multiple processor data processing system |
US20100262811A1 (en) * | 2009-04-08 | 2010-10-14 | Moyer William C | Debug signaling in a multiple processor data processing system |
US20110213948A1 (en) * | 2010-02-01 | 2011-09-01 | Steven Perry | Efficient Processor Apparatus and Associated Methods |
US8954714B2 (en) | 2010-02-01 | 2015-02-10 | Altera Corporation | Processor with cycle offsets and delay lines to allow scheduling of instructions through time |
US10025281B2 (en) | 2011-03-15 | 2018-07-17 | Omron Corporation | Control device and system program, and recording medium |
US20120272007A1 (en) * | 2011-04-19 | 2012-10-25 | Freescale Semiconductor, Inc. | Cache memory with dynamic lockstep support |
US9086977B2 (en) * | 2011-04-19 | 2015-07-21 | Freescale Semiconductor, Inc. | Cache memory with dynamic lockstep support |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20080126718A1 (en) | Method And Device For Monitoring A Memory Unit In A Mutliprocessor System | |
US20090044044A1 (en) | Device and method for correcting errors in a system having at least two execution units having registers | |
TWI502376B (zh) | Method and system for error detection in a multiprocessor data processing system | |
US8140828B2 (en) | Handling transaction buffer overflow in multiprocessor by re-executing after waiting for peer processors to complete pending transactions and bypassing the buffer | |
US6732250B2 (en) | Multiple address translations | |
US20100287443A1 (en) | Processor based system having ecc based check and access validation information means | |
JP2006164277A (ja) | Apparatus and method for error removal in a processor, and processor | |
JPH0239254A (ja) | Data processing system and its cache storage system | |
KR19980023978A (ko) | Memory update history storage device and memory update history storage method | |
EP3404537A1 (de) | Processing node, computer system, and method for detecting transaction conflicts | |
JP4182948B2 (ja) | Fault-tolerant computer system and interrupt control method therefor | |
CN100511167C (zh) | Method and device for monitoring a memory unit in a multiprocessor system | |
US20070294559A1 (en) | Method and Device for Delaying Access to Data and/or Instructions of a Multiprocessor System | |
US20090024908A1 (en) | Method for error registration and corresponding register | |
US20180157549A1 (en) | Multi-core processor and cache management method thereof | |
US6584580B1 (en) | Processor and multiprocessor system | |
US20100011183A1 (en) | Method and device for establishing an initial state for a computer system having at least two execution units by marking registers | |
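JP3746957B2 (ja) | Control method for a logical partitioning system | |
JP3239935B2 (ja) | Control method for a tightly coupled multiprocessor system, tightly coupled multiprocessor system, and recording medium therefor | |
JP2968484B2 (ja) | Multiprocessor computer and failure recovery method in a multiprocessor computer | |
JP2008176731A (ja) | Multiprocessor system | |
JP3068491B2 (ja) | Cache index failure handling method | |
JPH04157543A (ja) | Cache memory control circuit | |
JPH0469747A (ja) | Arithmetic processing device | |
JPH0498326A (ja) | Microprocessor |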
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: ROBERT BOSCH GMBH, GERMANY. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: KOTTKE, THOMAS; TRITTLER, STEFAN; REEL/FRAME: 019266/0203; SIGNING DATES FROM 20060811 TO 20060903 |
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |