EP2452264A1 - System-on-chip fehlererkennung - Google Patents
System-on-chip fehlererkennung (System-on-chip error detection)
- Publication number
- EP2452264A1 (application number EP10744654A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- core
- message
- trm
- messages
- sent
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/55—Prevention, detection or correction of errors
- H04L49/555—Error detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/004—Error avoidance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
- G06F11/26—Functional testing
- G06F11/273—Tester hardware, i.e. output processing circuits
- G06F11/2736—Tester hardware, i.e. output processing circuits using a dedicated service processor for test
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/78—Architectures of general purpose stored program computers comprising a single central processing unit
- G06F15/7807—System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
- G06F15/7825—Globally asynchronous, locally synchronous, e.g. network on chip
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/10—Packet switching elements characterised by the switching fabric construction
- H04L49/109—Integrated on microchip, e.g. switch-on-chip
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/55—Prevention, detection or correction of errors
- H04L49/552—Prevention, detection or correction of errors by ensuring the integrity of packets received through redundant connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
- G06F11/183—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits by voting, the voting not being performed by the redundant components
- G06F11/184—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits by voting, the voting not being performed by the redundant components where the redundant components implement processing functionality
Definitions
- This invention relates to a method and apparatus for improving the reliability of a system-on-chip in an embedded computer system.
- The invention relates to a method for error detection in a system-on-chip (SoC) comprising a number of IP cores, where each IP core is a fault-containment unit, the IP cores communicate with each other by means of messages via a network-on-chip, and one distinguished IP core implements a TRM (Trusted Resource Monitor).
- SoC system-on-chip
- TRM Trusted Resource Monitor
- An IP core is a hardware / software component that fulfills a given function.
- The communication of IP cores can take place either through access by the IP cores to a common memory or via messages.
- In PCT/AT 2009/00207, an SoC architecture is presented in which the IP cores communicate exclusively via messages.
- The present invention aims to prevent a faulty IP core of an SoC from causing the failure of other IP cores that are not directly affected by the fault.
- More specifically, in a system-on-chip (SoC) in which a plurality of components (IP cores) communicate exclusively by means of messages, the present invention aims to prevent a fault in one IP core from propagating to the other IP cores that are not directly affected by the fault.
- This goal is achieved by having a (by definition independent) fault-containment unit detect and discard a faulty control message sent from one non-privileged IP core to another non-privileged IP core, so that the faulty control message cannot cause the failure of the message recipient.
- Any message from one IP core that can cause another IP core to fail can be checked by a third IP core and discarded if necessary, in order to prevent an erroneous message sent by a failed IP core from causing the failure of another IP core.
- Any control message to be sent from a non-privileged IP core to another non-privileged IP core is first sent to a third IP core, which verifies the message; if the message is not faulty, this third IP core forwards it to the intended final recipient.
- The checking IP core may classify a message as faulty if the evaluation of one of the assertions known a priori to the checking IP core yields the value false.
- Advantageously, the third IP core is the TRM.
- The TRM only forwards messages from a sender that is entitled to send a control message to the IP core specified in the message.
- The TRM can send a control message to the TII (technology-independent interface) of a non-privileged IP core.
- At least three messages, each from a different IP core, must be sent to the TRM within a predetermined time interval; the receiving TRM verifies that at least two of the three messages contain the same instruction before this message is forwarded to the TII interface of the addressed IP core.
- At least three messages, each from a different SoC, must be sent to the TRM within a predetermined time interval; the receiving TRM checks whether at least two of the three messages contain the same instruction before that message is forwarded to the TII interface of the addressed IP core. It is expedient if the functions of the privileged subsystem, which consists of the TRM, the network-on-chip and the network interfaces, are protected by error-correcting codes.
- The invention further relates to an apparatus for performing a method as described above, wherein one, several, or all method steps are performed directly in the hardware of the SoCs.
- Fig. 1 shows the structure of a system-on-chip (SoC).
- Fig. 2 shows the structure of an IP core of a SoC.
- Fig. 3 shows the sending of a control message from one IP core to another IP core of a SoC.
- Fig. 1 shows an SoC 100 with the eight IP cores 111, 112, 113, 114, 115, 116, 117 and 118. These eight IP cores can exchange messages via a network-on-chip (NoC) 101.
- Each IP core, e.g., the IP core 114, is connected to the NoC 101 via a network interface (NI) 102.
- NI network interface
- One of these eight IP cores, e.g., the IP core 111, is a privileged IP core called the Trusted Resource Monitor (TRM), while the remaining seven IP cores 112, 113, 114, 115, 116, 117 and 118 are non-privileged IP cores.
- The TRM 111, the network-on-chip 101, and the eight network interfaces 102 form the privileged subsystem of the SoC 100.
- An error in this privileged subsystem may result in the failure of the entire SoC. Therefore, according to the invention, the functions of the privileged subsystem are to be protected by special error control measures, such as the use of error-correcting codes. Appropriate error-correcting codes can be used to detect and correct transient and permanent hardware errors in the privileged subsystem.
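As an illustration of how an error-correcting code can mask a single hardware bit error, the following minimal sketch uses a Hamming(7,4) code. The patent text does not name a specific code, so the choice of Hamming(7,4) and the function names `hamming74_encode` and `hamming74_decode` are assumptions made purely for this example.

```python
# Illustrative sketch only: the source does not specify which
# error-correcting code protects the privileged subsystem.
# Hamming(7,4) is assumed here because it is small and corrects
# any single flipped bit in a 7-bit codeword.

def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into 7 bits.

    Codeword layout (positions 1..7): p1 p2 d1 p3 d2 d3 d4
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the flipped bit (0 = no error).
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


# Usage: flip one bit of a stored word and recover the original data.
word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1  # simulate a transient single-bit hardware error
recovered = hamming74_decode(stored)
```

A code of this kind corrects one bit error per codeword; detecting or correcting multi-bit errors would need a stronger code.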
- Each of the seven non-privileged IP cores forms its own fault-containment unit (FCU) (Kopetz, H. (1997). Real-Time Systems: Design Principles for Distributed Embedded Applications. ISBN 0-7923-9894-7. Boston: Kluwer Academic Publishers), i.e., the consequences of any software or hardware fault within a non-privileged IP core can directly disrupt only the functions of the affected IP core and can affect the functionality of the other IP cores only indirectly, through faulty messages. If faulty messages are detected and discarded, the indirect consequences of an IP core error cannot propagate. PCT/AT 2006/00278 describes an architecture in which temporal errors of IP core messages are recognized and discarded by the privileged network interface (NI) 102 of the NoC 101.
- NI privileged network interface
- According to PCT/AT 2009/00207 (WO 2009/140707), only the TRM 111 is allowed to write temporal parameters to the NI 102, in order to prevent a faulty IP core from independently changing the transmission parameters of a message.
- The method described in PCT/AT 2006/00278 does not prevent control messages with incorrect content from being sent from a faulty non-privileged IP core to the other non-privileged IP cores.
- Fig. 2 shows the structure of a non-privileged IP core, e.g., the IP core 114.
- This IP core has four external interfaces: 211, 212, 213 and 122.
- The three message interfaces 211, 212 and 213 are connected to the network interface (NI) 102 of Fig. 1.
- The interface 122 is a local interface of the IP core, via which a connection to the outside world of the SoC 100 is realized.
- This interface 122 may be, e.g., an input/output network (such as a CAN network) or a wireless connection to the environment of the SoC 100.
- The message interface 211 is referred to as the Linking Interface (LIF) of the IP core 114. Via the LIF 211, the services of the IP core 114 are offered to the seven other IP cores of the SoC 100.
- LIF Linking Interface
- The message interface 212 is referred to as the Technology Dependent Interface (TDI); it allows a service technician to communicate with the internal functions of the IP core 114. Since the format and content of these TDI messages depend on the specific implementation technology of the IP core, this interface is implementation-dependent.
- TDI Technology Dependent Interface
- The message interface 213 is called the TII (Technology-Independent Interface). Via this TII interface 213, the configuration and the sequence control of the IP core 114 are realized by means of control messages.
- A control message is a message that controls the flow of the computation in an IP core. For example, control messages are used to initiate a hardware reset of the entire IP core 114, or to order the start or the termination of a program execution of the IP core 114. Furthermore, control messages can be used to configure or reconfigure the SoC.
- A faulty control message sent to the TII interface of the IP core 114 may cause the failure of the IP core 114, e.g., if a faulty hardware reset message is suddenly received at the TII interface 213 while the IP core 114 is operating correctly.
- Fig. 2 also shows the internal structure of the IP core 114.
- At the lowest level 201 is the IP core hardware, which executes the software loaded in the IP core 114.
- At the next level 202 is the IP core internal operating system, and at level 203 is the IP core internal middleware.
- At level 204 is the application software.
- API application program interface
- The messages received over the TII interface 213 communicate either directly with the IP core hardware 201 (e.g., a reset message), with the operating system 202 (e.g., a control message for scheduling a process) or with the middleware 203, but not with the application software 204. It is therefore not possible for the application software of a non-privileged IP core to detect faulty control messages that arrive via the TII interface 213.
- Fig. 3 shows the sending of a control message to the TII interface of a non-privileged IP core. If, for example, the IP core 115 wants to send a reset message 140 to the IP core 116, then according to the invention it must first send this message 140 to an independent third IP core, the TRM 111. The TRM 111 checks whether the message 140 is faulty. This check is based on assertions that the TRM must know a priori. These assertions may relate to the state of the overall system, the identity of the sender, the time of the message, and the content of the message. If all the assertions evaluated by the TRM are correct, the TRM sends the reset message 141 to the TII interface of the IP core 116.
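The assertion-based check performed by the TRM can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `ControlMessage` fields, the four example assertions, and the layout of the `state` dictionary are all hypothetical names introduced for this example. Only the four assertion categories (system state, sender identity, time, content) come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class ControlMessage:
    sender: int      # requesting IP core, e.g. 115
    target: int      # IP core whose TII should receive the message, e.g. 116
    command: str     # e.g. "reset" (hypothetical encoding)
    send_time: int   # time of the request in ticks (hypothetical unit)


# An assertion returns True if it holds for the message in the given state.
Assertion = Callable[[ControlMessage, Dict], bool]


def sender_is_entitled(msg: ControlMessage, state: Dict) -> bool:
    # Identity of the sender: may this core control that target at all?
    return (msg.sender, msg.target) in state["allowed_pairs"]


def time_is_plausible(msg: ControlMessage, state: Dict) -> bool:
    # Time of the message: it must fall inside an expected window.
    return state["window_start"] <= msg.send_time <= state["window_end"]


def content_is_known(msg: ControlMessage, state: Dict) -> bool:
    # Content of the message: only known commands are accepted.
    return msg.command in state["known_commands"]


def system_state_permits(msg: ControlMessage, state: Dict) -> bool:
    # State of the overall system: e.g. no reset during a critical phase.
    return not (msg.command == "reset" and state["mode"] == "critical")


ASSERTIONS: List[Assertion] = [
    sender_is_entitled,
    time_is_plausible,
    content_is_known,
    system_state_permits,
]


def trm_check_and_forward(msg: ControlMessage, state: Dict) -> Optional[ControlMessage]:
    """Return the message to forward to the target's TII, or None to discard it."""
    if all(check(msg, state) for check in ASSERTIONS):
        return msg
    return None


# Usage with hypothetical a priori knowledge held by the TRM.
example_state = {
    "allowed_pairs": {(115, 116)},
    "window_start": 0,
    "window_end": 100,
    "known_commands": {"reset", "start", "stop"},
    "mode": "normal",
}
forwarded = trm_check_and_forward(ControlMessage(115, 116, "reset", 50), example_state)
discarded = trm_check_and_forward(ControlMessage(117, 116, "reset", 50), example_state)
```

A single violated assertion is enough to discard the message, which matches the rule that a message is classified as faulty as soon as one assertion evaluates to false.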
- The architecture must ensure that only the (privileged) TRM 111 is able to send messages to the TII interface of a non-privileged IP core.
- The implementation of a non-privileged IP core must ensure that control messages (such as the reset message) that could cause an IP core failure can only be received via the TII interface. Therefore, according to the invention, it is not possible for a non-privileged IP core to send a control message directly to another non-privileged IP core.
- Error detection of the control messages by means of assertions may be regarded as insufficient.
- Three IP cores running in parallel must then compute the control commands that are embedded in the control messages.
- The TRM compares these three control messages and sends a corresponding message to the TII interface of the receiver only if at least two of these messages are identical. This masks any error in one of the three sending IP cores.
- These three parallel control messages must come from three independent SoCs in order to prevent a common-mode error that can occur within a single SoC.
- This invention substantially improves the reliability of an SoC by preventing a faulty IP core from causing the failure of another IP core.
- Error detection in the receiving IP core does not make sense, since the receiving IP core cannot correctly execute its own error detection in the event of an error.
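The two-out-of-three comparison that the TRM performs on redundant control messages can be sketched as follows. This is a minimal illustration, assuming exactly three votes per decision; the `Vote` tuple layout, the function name `trm_vote`, and the tick-based `interval` parameter are hypothetical and not taken from the patent text.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

# Hypothetical vote layout: (sender_id, arrival_time_in_ticks, command)
Vote = Tuple[int, int, str]


def trm_vote(votes: Sequence[Vote], interval: int) -> Optional[str]:
    """Return the command to forward to the target's TII, or None.

    A command is forwarded only if
      * exactly three votes are present (assumption of this sketch),
      * they come from three different senders,
      * they all arrived within `interval` ticks of each other, and
      * at least two of them carry the same command.
    """
    if len(votes) != 3:
        return None
    if len({sender for sender, _, _ in votes}) != 3:
        return None
    times = [arrival for _, arrival, _ in votes]
    if max(times) - min(times) > interval:
        return None
    command, count = Counter(cmd for _, _, cmd in votes).most_common(1)[0]
    return command if count >= 2 else None


# Usage: one of three redundant senders is faulty and is outvoted.
decision = trm_vote(
    [(112, 10, "reset"), (113, 11, "reset"), (114, 12, "start")],
    interval=5,
)
```

Requiring distinct senders keeps one faulty core from forming a majority on its own by sending the same message twice.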
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Quality & Reliability (AREA)
- Microelectronics & Electronic Packaging (AREA)
- Computing Systems (AREA)
- Computer Security & Cryptography (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Hardware Redundancy (AREA)
- Multi Processors (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AT10772009 | 2009-07-09 | ||
PCT/AT2010/000248 WO2011003121A1 (de) | 2009-07-09 | 2010-07-07 | System-on-chip fehlererkennung |
Publications (1)
Publication Number | Publication Date |
---|---|
EP2452264A1 (de) | 2012-05-16 |
Family
ID=43012654
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP10744654A Withdrawn EP2452264A1 (de) | 2009-07-09 | 2010-07-07 | System-on-chip fehlererkennung |
Country Status (5)
Country | Link |
---|---|
US (1) | US8732522B2 (ja) |
EP (1) | EP2452264A1 (ja) |
JP (1) | JP2012532385A (ja) |
CN (1) | CN102473121A (ja) |
WO (1) | WO2011003121A1 (ja) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9164852B2 (en) | 2009-07-09 | 2015-10-20 | Fts Computertechnik Gmbh | System on chip fault detection |
WO2013123543A1 (de) * | 2012-02-22 | 2013-08-29 | Fts Computertechnik Gmbh | Verfahren zur fehlererkennung in einem system-of-systems |
AT512665B1 (de) * | 2012-03-20 | 2013-12-15 | Fts Computertechnik Gmbh | Verfahren und Apparat zur Bildung von Software Fault Containment Units in einem verteilten Echtzeitsystem |
US9160617B2 (en) | 2012-09-28 | 2015-10-13 | International Business Machines Corporation | Faulty core recovery mechanisms for a three-dimensional network on a processor array |
US8990616B2 (en) | 2012-09-28 | 2015-03-24 | International Business Machines Corporation | Final faulty core recovery mechanisms for a two-dimensional network on a processor array |
AT515454A3 (de) * | 2013-03-14 | 2018-07-15 | Fts Computertechnik Gmbh | Verfahren zur Behandlung von Fehlern in einem zentralen Steuergerät sowie Steuergerät |
US10318458B2 (en) * | 2013-08-21 | 2019-06-11 | Siemens Ag Österreich | Method and circuit arrangement for temporally limiting and separately accessing a system on a chip |
FR3026869B1 (fr) * | 2014-10-07 | 2016-10-28 | Sagem Defense Securite | Systeme embarque sur puce a haute surete de fonctionnement |
CN105991384B (zh) * | 2016-06-23 | 2019-03-08 | 天津大学 | 兼容时间触发以太网与1553b的航天以太网通信方法 |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH09153020A (ja) * | 1995-11-29 | 1997-06-10 | Hitachi Ltd | 疎結合計算機システム |
US7007099B1 (en) * | 1999-05-03 | 2006-02-28 | Lucent Technologies Inc. | High speed multi-port serial-to-PCI bus interface |
AT411948B (de) * | 2002-06-13 | 2004-07-26 | Fts Computertechnik Gmbh | Kommunikationsverfahren und apparat zur übertragung von zeitgesteuerten und ereignisgesteuerten ethernet nachrichten |
US7606190B2 (en) * | 2002-10-18 | 2009-10-20 | Kineto Wireless, Inc. | Apparatus and messages for interworking between unlicensed access network and GPRS network for data services |
EP1977566B1 (de) * | 2006-01-27 | 2017-03-15 | FTS Computertechnik GmbH | Zeitgesteuerte sichere kommunikation |
JP5190586B2 (ja) * | 2007-04-11 | 2013-04-24 | ティーティーテック コンピュータテクニック アクティエンゲセルシャフト | Ttイーサネットメッセージの効率的かつ安全な伝送のためのコミュニケーション方法及び装置 |
-
2010
- 2010-07-07 WO PCT/AT2010/000248 patent/WO2011003121A1/de active Application Filing
- 2010-07-07 JP JP2012518691A patent/JP2012532385A/ja active Pending
- 2010-07-07 US US13/383,011 patent/US8732522B2/en active Active
- 2010-07-07 CN CN2010800311123A patent/CN102473121A/zh active Pending
- 2010-07-07 EP EP10744654A patent/EP2452264A1/de not_active Withdrawn
Non-Patent Citations (1)
Title |
---|
See references of WO2011003121A1 * |
Also Published As
Publication number | Publication date |
---|---|
CN102473121A (zh) | 2012-05-23 |
US20120124411A1 (en) | 2012-05-17 |
US8732522B2 (en) | 2014-05-20 |
JP2012532385A (ja) | 2012-12-13 |
WO2011003121A1 (de) | 2011-01-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2011003121A1 (de) | System-on-chip fehlererkennung | |
DE3687777T2 (de) | Verfahren und geraet zur sicherstellung eines datenuebertragungssystems. | |
DE3879071T2 (de) | Verwaltung einer defekten Hilfsquelle in einem Multiplex-Kommunikationssystem. | |
DE102005009795A1 (de) | Mikroprozessorsystem für eine Maschinensteuerung in sicherheitszertifizierbaren Anwendungen | |
EP1789857B1 (de) | Datenübertragungsverfahren und automatisierungssystem zum einsatz eines solchen datenübertragungsverfahrens | |
EP3662601A1 (de) | Konzept zum unidirektionalen übertragen von daten | |
DE102010012904B4 (de) | Systeme zum Durchführen eines Tests | |
DE102015119643A1 (de) | Verfahren und Vorrichtungen zur Bereitstellung von Redundanz in einem Prozesssteuerungssystem | |
DE102010010198A1 (de) | System und Verfahren zum Testen eines Moduls | |
DE102014111361A1 (de) | Verfahren zum Betreiben einer Sicherheitssteuerung und Automatisierungsnetzwerk mit einer solchen Sicherheitssteuerung | |
DE102012000188A1 (de) | Verfahren zum Betreiben eines Kommunikationsnetzwerkes und Netzwerkanordnung | |
DE19831720A1 (de) | Verfahren zur Ermittlung einer einheitlichen globalen Sicht vom Systemzustand eines verteilten Rechnernetzwerks | |
EP3201774B1 (de) | Verteiltes echtzeitcomputersystem und zeitgesteuerte verteilereinheit | |
EP3061213B1 (de) | Verfahren zur übertragung von nachrichten in einem computernetzwerk sowie computernetzwerk | |
DE102004044764B4 (de) | Datenübertragungsverfahren und Automatisierungssystem zum Einsatz eines solchen Datenübertragungsverfahrens | |
DE102017103147A1 (de) | Alarmabwicklungs-Schaltungsanordnung und Verfahren zur Abwicklung eines Alarms | |
CN110381035A (zh) | 网络安全测试方法、装置、计算机设备及可读存储介质 | |
EP2250560B1 (de) | Verfahren zur erhöhung der robustheit von computersystemen sowie computersystem | |
EP4127934A1 (de) | Verfahren und sicherheitsgerichtetes system zum ausführen von sicherheitsfunktionen | |
AT411853B (de) | Sichere dynamische softwareallokation | |
DE102013204371B4 (de) | Verfahren und Bussystem zum protokollunabhängigen Übertragen von Standarddatenpaketen mit Sicherheitsdaten | |
DE112019002278T5 (de) | Integritätsüberwachungsperipheriegerät für mikrocontroller- und prozessor-eingangs-/ausgangs-pins | |
US9164852B2 (en) | System on chip fault detection | |
EP3696629A1 (de) | Verfahren zur überprüfung einer industriellen anlage, computerprogramm, computerlesbares medium und system | |
EP2290882B1 (de) | Verfahren zur mehrfachen Fehlerredundanz in Netzwerken mit Ringtopologien |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| | 17P | Request for examination filed | Effective date: 20111215 |
| | AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK SM TR |
| | DAX | Request for extension of the european patent (deleted) | |
| | 17Q | First examination report despatched | Effective date: 20141114 |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
| | 18D | Application deemed to be withdrawn | Effective date: 20150325 |