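WO2021164679A1 - Fault management system oriented to the functional safety of vehicle-level chips - Google Patents
Fault management system oriented to the functional safety of vehicle-level chips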
- Publication number
- WO2021164679A1 (application PCT/CN2021/076492)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- fault
- chip
- type
- module
- car
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0766—Error or fault reporting or storing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
- G06F11/2273—Test methods
-
- G—PHYSICS
- G07—CHECKING-DEVICES
- G07C—TIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
- G07C5/00—Registering or indicating the working of vehicles
- G07C5/08—Registering or indicating performance data other than driving, working, idle, or waiting time, with or without registering driving, working, idle or waiting time
- G07C5/0808—Diagnosing performance data
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60T—VEHICLE BRAKE CONTROL SYSTEMS OR PARTS THEREOF; BRAKE CONTROL SYSTEMS OR PARTS THEREOF, IN GENERAL; ARRANGEMENT OF BRAKING ELEMENTS ON VEHICLES IN GENERAL; PORTABLE DEVICES FOR PREVENTING UNWANTED MOVEMENT OF VEHICLES; VEHICLE MODIFICATIONS TO FACILITATE COOLING OF BRAKES
- B60T17/00—Component parts, details, or accessories of power brake systems not covered by groups B60T8/00, B60T13/00 or B60T15/00, or presenting other characteristic features
- B60T17/18—Safety devices; Monitoring
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B23/00—Testing or monitoring of control systems or parts thereof
- G05B23/02—Electric testing or monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0733—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a data processing system embedded in an image processing device, e.g. printer, facsimile, scanner
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0736—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in functional embedded systems, i.e. in a data processing system designed as a combination of hardware and software dedicated to performing a certain function
- G06F11/0739—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in functional embedded systems, i.e. in a data processing system designed as a combination of hardware and software dedicated to performing a certain function in a data processing system embedded in automotive or aircraft systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0766—Error or fault reporting or storing
- G06F11/0769—Readable error formats, e.g. cross-platform generic formats, human understandable formats
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0793—Remedial or corrective actions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
- G06F11/26—Functional testing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
- G06F11/2284—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing by power-on test, e.g. power-on self test [POST]
Definitions
- This application relates to a system fault management system for passenger cars, and in particular to a system fault management system oriented to the functional safety of car-level chips.
- Functional safety is essential for safety-related electrical and electronic systems (such as power control systems) in the automotive field.
- Functional safety (Functional Safety) applications impose strict constraints on the system so that it executes safely and reliably in a complex system environment.
- Safety Mechanism is integrated inside the car-level chip.
- the safety mechanism can include the safety mechanism inside the IP (a designed module inside the chip) and the safety mechanism at the system level.
- the current vehicle-level chips have a great load in terms of fault identification, classification, and processing, and they cannot take reasonable fault response measures in an effective and timely manner, thereby reducing the availability of the system when a fault occurs.
- the fault management further includes a fault injection module (Fault Injector), a static signal detection module (Static Signal Monitor), and a fault control module (Fault Controller).
- the fault injection module (Fault Injector) is connected to all the functional modules (IP1...IPn) inside the chip through electrical connection, and each functional module (IP1...IPn) is equipped with a safety mechanism.
- the fault control module (Fault Controller) is electrically connected to each IP (IP1...IPn), the static signal detection module (Static Signal Monitor), the processor (CPU), the system controller (System Controller), and the system external to the chip (out of chip).
- the static signal detection module (Static Signal Monitor) is connected to the system configuration module (System Configure) inside the chip through an electrical connection.
- the fault controller (Fault Controller) is responsible for summarizing the fault indication signals (Fault Indicated Signals) sent to it.
- the static signal detection module (Static Signal Monitor) performs real-time monitoring of the static signals generated by the system configuration module (System Configure) inside the chip, so as to avoid failures caused by stuck-at faults (Stuck-at Fault).
- the fault indication signal generated by the static signal detection module is output to the fault controller (Fault Controller) for classification processing.
- the fault control module (Fault Controller) has a built-in fault classification management model composed of four types of faults.
- the four fault types are configured as follows: Type 1: faults that require assistance from an external system are configured as Fail Fatal; Type 2: faults that cause the main function to fail are configured as Fail Safe; Type 3: faults handled by automatic degraded operation are configured as Fail Operational; Type 4: faults handled by automatic error correction are configured as Fail Correctable.
- the severity of the four fault types is configured as: Rule 1: Type 1 > Type 2 > {Type 3, Type 4}, where "{Type 3, Type 4}" denotes the set consisting of Type 3 and Type 4; Rule 2: Type 3 > Type 4; Rule 3: Rule 1 > Rule 2.
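Taken together, the severity rules amount to a strict ordering of the four fault types. The following is a minimal Python sketch of that ordering; the names `FaultType` and `most_severe` are illustrative assumptions and do not appear in the application.

```python
from enum import IntEnum

class FaultType(IntEnum):
    """Fault types of the four-level model; a smaller value is more severe."""
    FAIL_FATAL = 1        # Type 1: needs assistance from an external system
    FAIL_SAFE = 2         # Type 2: the main function fails
    FAIL_OPERATIONAL = 3  # Type 3: handled by automatic degraded operation
    FAIL_CORRECTABLE = 4  # Type 4: handled by automatic error correction

def most_severe(faults):
    """Return the most severe fault type among those reported."""
    # Rules 1 and 2 combine to Type 1 > Type 2 > Type 3 > Type 4,
    # so the most severe fault has the smallest type number.
    return min(faults)

pending = [FaultType.FAIL_CORRECTABLE, FaultType.FAIL_SAFE, FaultType.FAIL_OPERATIONAL]
print(most_severe(pending).name)  # FAIL_SAFE
```

Encoding the types as integers keeps the comparison trivial, which matches the statement elsewhere in the text that a smaller level number means a more severe fault.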
- the fault controller (Fault Controller) generates, according to its pre-configuration and to the scenario and fault types of the chip's application, fault information with a four-level structure composed of the four types of faults.
- the fault controller further includes 4 fault selection units (Fault Selection), and various correspondences can be formed between the generated fault information and the input fault indication signals through the configuration of the fault selection units (Fault Selection).
- the correspondences include one-to-one (1 to 1), one-to-many (1 to N), and/or many-to-one (N to 1), to adapt to different application scenarios and different functional safety level requirements.
- through a fine-grained fault classification system, the system fault management system oriented to the functional safety of vehicle-level chips provided by this application ensures that the system software accurately locates and responds to various faults and can take reasonable fault response measures effectively and in a timely manner, improving the availability of the system when a fault occurs; at the same time, it reduces the fault detection load on the system software and is conducive to fast, high-coverage, individually configurable power-on and power-down self-checks of the chip.
- FIG. 1 shows a schematic diagram of a four-level fault classification management model designed according to the severity of a chip function fault (Severity Level) in an embodiment of the present application;
- Fig. 2 shows a logical application flow chart of a four-level fault classification management model (F4CM) according to an embodiment of the present application
- FIG. 5 shows a logical structure diagram of a fault management system (Fault Management) oriented to the functional safety of a car-level chip according to an embodiment of the present application.
- the hardware device may be specially designed and manufactured for the required purpose, or may also be a known device in a general-purpose computer or other known hardware devices.
- the general-purpose computer has a program stored in it to be selectively activated or reconfigured.
- Automotive functional safety (Functional Safety) design generally follows the ISO (International Organization for Standardization) 26262 standard (for automobiles, first released in 2011, second edition in 2018), which is derived from IEC (International Electrotechnical Commission) 61508 (first released in 1998, latest version released in 2010), the basic functional safety standard for electrical, electronic and programmable electronic devices. ISO 26262 is aimed at the electrical devices, electronic equipment, programmable electronic devices and other components used specifically in the automotive field, and is an international standard intended to improve the functional safety of automotive electrical and electronic products.
- the safety goal derives the system-level safety requirement, and then the safety requirement is allocated to the hardware and software.
- the ASIL level determines the safety requirements for the system. The higher the ASIL level, the higher the safety requirements and the higher the cost of achieving safety, which means higher diagnostic coverage of the hardware and a stricter development process.
- as a result, the company's development cost increases, the development cycle is extended, and the technical requirements become stricter.
- the ISO 26262 Functional Safety (Functional Safety) standard requires that the Single-Point Fault Metric (SPFM) be greater than or equal to 99% to achieve the highest safety integrity level ASIL D. Therefore, meeting functional safety can be complicated and difficult for real-time systems.
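The application cites the 99% threshold but does not spell out how the metric is computed. The sketch below uses the usual ISO 26262-5 formulation, in which SPFM is one minus the share of the safety-related failure rate attributable to single-point and residual faults. The function name and the failure-rate figures are hypothetical and serve only as illustration.

```python
def spfm(total_fit, single_point_fit, residual_fit):
    """Single-point fault metric: share of the safety-related failure rate
    that is neither a single-point fault nor a residual fault."""
    if total_fit <= 0:
        raise ValueError("total failure rate must be positive")
    return 1.0 - (single_point_fit + residual_fit) / total_fit

# Hypothetical failure rates in FIT (failures per 10^9 device hours).
value = spfm(total_fit=1000.0, single_point_fit=2.0, residual_fit=6.0)
print(f"SPFM = {value:.1%}, meets the 99% target: {value >= 0.99}")
```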
- the safety mechanism can include the safety mechanism inside the IP (a designed module inside the chip) and the safety mechanism at the system level.
- these safety mechanisms need to report the occurrence of the fault in time, so that the system can respond according to the type and degree of the fault and thereby prevent the fault from becoming latent or directly causing a functional failure.
- the lack of a centralized fault management module inside the chip imposes a heavy load on the system software for fault identification, classification and processing, and is not conducive to fast, high-coverage, individually configurable power-on and power-down self-checks of the chip.
- the faults are classified, but the classification granularity is very coarse (the faults are divided into only two categories, Fatal and Error), so the system cannot take reasonable fault response measures effectively and in a timely manner, which reduces the availability of the system when a fault occurs.
- the embodiment of the present application provides a fault management system oriented to the functional safety of a vehicle-level chip.
- the fault management system includes an out of chip system (out of chip) and a vehicle-level chip.
- the vehicle-level chip includes a fault management (Fault Management).
- the Fault Management is configured with a fault classification management model.
- the fault manager with the fault classification management model ensures, through a fine-grained fault classification system, that the system software accurately locates and responds to various faults, so that reasonable fault response measures are taken effectively and in a timely manner to improve the availability of the system when a fault occurs.
- the vehicle-level chip may also include a processor (CPU), a system controller (System Controller), a system configuration module (System Configure), in-chip functional modules (IP1...IPn), etc.
- application scenario refers to an application scenario in a car to which a chip (vehicle-level chip) is applied, and mainly relates to an environment constituted by different systems or components in a car.
- the car-level chip integrates the safety mechanisms inside the IPs and the safety mechanisms at the system level. When a fault occurs and is detected by the corresponding safety mechanism, the safety mechanism needs to report the occurrence of the fault in time, so that the system can respond according to the type and degree of the fault and thereby prevent the fault from becoming latent or directly causing a functional failure.
- random failures of the internal hardware of the chip can be distinguished according to the following dimensions (W1 to W3):
- a failure that requires assistance from an external system is defined as a "fatal failure (Fail Fatal)";
- the failures of all functional modules (IP1...IPn) in the vehicle-level chip can be divided into the four categories described in Table 1 (fault levels 1-4 correspond to types 1-4 in turn).
- Table 1 can be used in engineering practice to classify and mark random hardware faults inside the chip so that the system can automatically determine the type of fault and accurately locate the fault location.
- the fault classification is refined from the current common fatal (Fatal) and error (Error) faults into the above-mentioned four categories (Type 1 to Type 4), which improves the classification granularity.
- software or hardware can handle each fault type directly, which improves the fault response speed.
- the use scenario can be customized.
- the fault classification method can be customized to meet different application scenarios and improve the flexibility of chip application.
- in step S2-1, a functional failure of an IP inside the chip is detected, that is, a fault indication signal (Fault Indicated Signals) sent by the safety mechanism is received.
- in step S2-3, if the judgment result is "Yes", the fault is determined to be Fail Safe, and the IP function failure signal (Fail Safe) information is output to the system controller (System Controller) inside the chip, which carries out necessary operations such as automatic reset to make the system enter a safe state or resume operation; if the judgment result is "No", the next judgment step of the four-level fault classification management model (F4CM) follows, namely whether, after the failure, the main functions of the chip's internal hardware or of the software system running on the chip need to be degraded.
- in step S2-4, if the judgment result is "Yes", the fault is determined to be Fail Operational, and the IP function failure signal (Fail Operational) information is output to the processor (CPU) inside the chip and handed over to the software running on the CPU for degraded operation processing; if the judgment result is "No", the fault is determined to be Fail Correctable, and the IP function failure signal (Fail Correctable) information is output to the processor (CPU) inside the chip and handed over to the software running on the CPU, which performs automatic error correction through a safety mechanism, or the safety mechanism in the IP performs self-correction.
- the fault management system determines from low to high to which level the fault should be classified, and during execution, the fault is processed in the order from low to high.
- the process of handling relatively serious faults can be accelerated, and the response time of fault handling can be shortened.
- the classification of high and low fault levels is based on the numbers shown in Table 1, that is, the highest fault level is the correctable fault represented by the number 4, and the lowest fault level is the fatal fault represented by the number 1; the smaller the number of the fault level, the greater the severity of the fault.
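The level-by-level judgment of the Fig. 2 embodiment can be sketched as a cascade of checks that starts at the most severe level. In the Python sketch below, the `FaultReport` fields and the function name are hypothetical. The first check (external assistance) is an assumption inferred from the definition of Type 1, since the corresponding step is not reproduced in this excerpt.

```python
from dataclasses import dataclass

@dataclass
class FaultReport:
    """Hypothetical attributes of a detected IP fault (names are illustrative)."""
    needs_external_assistance: bool
    main_function_failed: bool
    needs_degraded_operation: bool

def classify_sequential(report):
    """Walk the checks from level 1 (most severe) to level 4.

    Returns (fault type, destination of the fault information)."""
    if report.needs_external_assistance:   # assumed first check (Type 1)
        return "Fail Fatal", "out of chip"
    if report.main_function_failed:        # outcome handled in step S2-3
        return "Fail Safe", "System Controller"
    if report.needs_degraded_operation:    # outcome handled in step S2-4
        return "Fail Operational", "CPU"
    return "Fail Correctable", "CPU"       # remaining case in step S2-4

print(classify_sequential(FaultReport(False, True, False)))
```

Because the checks run from the most severe level downward, a serious fault leaves the cascade after the fewest comparisons, which is consistent with the shortened response time claimed for relatively serious faults.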
- Fig. 3 shows a logic application flowchart of a four-level fault classification management model (F4CM) according to another embodiment of the present application.
- the fault manager may further include a classifier, which is used to receive a signal of a functional failure of each functional module inside the chip and determine the type of the functional failure. Using the classifier to pre-judge the type of functional failure can reduce the steps of logical judgment, simplify calculations, and improve processing efficiency.
- the fault manager including the classifier may execute the following steps S3-1 to S3-3; the embodiment in FIG. 3 differs from the embodiment in FIG. 2 in that the judgment logic for the four fault levels has been changed.
- a classifier is used to receive the functional fault signals from the internal IP1...IPn of the chip, and simultaneously determine which type of fault the functional fault belongs to based on the four different types of fault attributes.
- the four-level fault classification management model (F4CM) is configured in the classifier.
- in step S3-1, a functional failure of an IP inside the chip is detected, that is, a fault indication signal (Fault Indicated Signals) sent by the safety mechanism is received.
- in step S3-2, according to the four-level fault classification management model (F4CM), it is judged which of the four types the functional fault of the IP belongs to: fatal fault (Fail Fatal), fail safe (Fail Safe), fail operational (Fail Operational), or correctable fault (Fail Correctable).
- in step S3-3, when the type of the functional failure is a fatal failure (Fail Fatal), the IP functional failure signal (Fail Fatal) information is output to the system outside the chip (out of chip), and the external system assists with reset, power-off or other necessary operations.
- in step S3-3, when the type of the functional failure is fail safe (Fail Safe), the IP functional failure signal (Fail Safe) information is output to the system controller (System Controller) inside the chip, which performs necessary operations such as automatic reset to make the system enter a safe state or resume operation.
- in step S3-3, when the type of the functional failure is a correctable failure (Fail Correctable), the IP functional failure signal (Fail Correctable) information is output to the processor (CPU) inside the chip, and the software running on the CPU performs automatic error correction through the safety mechanism, or the safety mechanism in the IP performs automatic error correction.
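Once a classifier has determined the fault type in a single step, routing the fault information reduces to a table lookup. The Python sketch below uses the destinations described for Fig. 2 and Fig. 3; the Fail Operational entry follows the Fig. 2 description, and the names `ROUTES` and `route` are illustrative assumptions.

```python
# Destination of the fault information for each fault type.
# The dictionary and function names are illustrative, not from the application.
ROUTES = {
    "Fail Fatal": "out of chip",        # external system resets or powers off
    "Fail Safe": "System Controller",   # automatic reset into a safe state
    "Fail Operational": "CPU",          # software performs degraded operation
    "Fail Correctable": "CPU",          # software or the IP corrects the error
}

def route(fault_type):
    """Look up where a pre-classified fault should be reported."""
    try:
        return ROUTES[fault_type]
    except KeyError:
        raise ValueError(f"unknown fault type: {fault_type!r}") from None

print(route("Fail Safe"))  # System Controller
```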
- the logic application embodiment of the four-level fault classification management model (F4CM) of the present application is a low-cost, high-efficiency system fault management system for the functional safety of car-level chips. As a centralized, hierarchical, fine-grained chip functional fault management system, it can effectively detect faults inside the chip and classify them by severity, so as to provide the system with accurate fault information, ensure that the system software accurately locates and responds to various faults, reduce the fault detection load on the system software, and take reasonable fault response measures effectively and in a timely manner to improve the availability of the system when a fault occurs.
- the fault controller (Fault Controller) is responsible for summarizing the fault indication signals (Fault Indicated Signals) sent by each IP (IP1...IPn) inside the chip and by all safety mechanisms in the chip system, and, according to the pre-configuration and to the scenarios and fault types of the chip's application, for generating the fault information corresponding to the four-level fault classification management model (F4CM) shown in Figure 1.
- the fault controller can further be used to summarize the fault indication signals (Fault Indicated Signals) sent by its own static signal detection module (Static Signal Monitor), by each IP inside the chip, and by all safety mechanisms of the chip system.
- the fault controller may include 4 fault selection units (Fault Selection).
- Various correspondences can be formed between the generated fault information and the input fault indication signal through the configuration of the fault selection unit (Fault Selection).
- multiple correspondence relationships include: one-to-one (1 to 1), one-to-many (1 to N), and/or many-to-one (N to 1), where N is a positive integer not less than 2.
- the fault management system with the controller in this embodiment can adapt to different application scenarios and different functional safety level requirements.
- four fault selection units are provided in the fault controller (Fault Controller), corresponding respectively to fatal faults (Fail Fatal), fail-safe faults (Fail Safe), fail-operational faults (Fail Operational), and correctable faults (Fail Correctable).
- Each IP (IP1...IPn) inside the chip is connected to the fault selection unit (Fault Selection) through electrical signals, so that the fault selection unit (Fault Selection) can receive the fault indication signals (Fault Indicated Signals) sent by each IP inside the chip.
- when one fault selection unit (for example, fault selection unit 1) is signal-connected with multiple functional modules IP1 to IPn, the correspondence established is the above-mentioned many-to-one;
- when one functional module (for example, IP1) is signal-connected with multiple fault selection units 1 to 4, the correspondence established is the above-mentioned one-to-many;
- the correspondence established by the signal connection between one fault selection unit (for example, fault selection unit 1) and one functional module (for example, IP1) is the above-mentioned one-to-one.
- the one-to-one, one-to-many, and many-to-one correspondence can exist independently or coexist as shown in FIG. 4, which can be specifically designed according to actual needs. There is no restriction here.
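The three kinds of correspondence can be pictured as a configurable mapping from fault indication signals to the four fault selection units. The Python sketch below is illustrative only: the signal names, the `SELECTION_CONFIG` table, and the assumption that a unit reports a fault when any of its selected inputs is active are not taken from the application.

```python
# Hypothetical configuration: which fault indication signals feed each of the
# four fault selection units. Signal names such as "IP1" are placeholders.
SELECTION_CONFIG = {
    "Fail Fatal": {"IP1", "IP2", "IP3"},  # many-to-one: several IPs, one unit
    "Fail Safe": {"IP1"},                 # IP1 also feeds this unit: one-to-many
    "Fail Operational": {"IP4"},          # one-to-one
    "Fail Correctable": set(),            # nothing routed here in this setup
}

def fault_information(active_signals):
    """Return the fault types whose selection unit sees at least one active input."""
    return [fault_type
            for fault_type, inputs in SELECTION_CONFIG.items()
            if inputs & set(active_signals)]

print(fault_information({"IP1"}))  # ['Fail Fatal', 'Fail Safe']
print(fault_information({"IP4"}))  # ['Fail Operational']
```

Changing only the table adapts the same logic to a different application scenario, which is the flexibility the text attributes to the configurable fault selection units.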
- a software configuration module may also be provided outside the fault controller (Fault Controller).
- the software configuration module (Software Configuration) is connected to the 4 fault selection units (Fault Selection) through electrical signals and pre-configures them according to the scenarios and fault types of the chip's application, so that the fault selection units can receive the fault indication signals (Fault Indicated Signals) sent by each IP in the chip.
- the software configuration module (Software Configuration) can also be used to monitor the working status of the fault selection unit (Fault Selection) in real time. When a fault or logic error occurs in the fault selection unit (Fault Selection), it can perform external monitoring and correction in time. After the software configuration module (Software Configuration) collects and judges the fault indication signals (Fault Indicated Signals), the fault information (Fault Information) is generated.
- the generated fault information can be sent to the chip's internal modules and to the outside of the chip (external systems, such as the software configuration module, etc.) for the following processing: 1) fail operational (Fail Operational) and correctable fault (Fail Correctable) information is output to the processor (CPU) inside the chip for processing by the software running on the CPU; 2) fail safe (Fail Safe) information is output to the system controller (System Controller) inside the chip, which performs necessary operations such as automatic reset to make the system enter a safe state or resume operation; 3) fatal fault (Fail Fatal) information is output to the outside of the chip (out of chip), and the external system assists with reset, power-off, or other necessary operations.
- Fig. 5 shows a logical structure diagram of a fault management system according to an embodiment of the present application.
- the fault management system (Fault Management) in Fig. 5 is configured with: a fault controller (Fault Controller), a static signal detection module (Static Signal Monitor), and a fault injection module (Fault Injector) as shown in Fig. 4.
- The static signal detection module (Static Signal Monitor) monitors in real time the static signals that the system configuration module (System Configure) inside the chip generates according to the pre-configuration, and detects failures caused by stuck-at faults (Stuck-at Fault).
- A stuck-at fault (Stuck-at Fault) is the stuck-at-0 or stuck-at-1 type of fault known in the art: a signal or pin in a circuit is unexpectedly fixed at logic 0 (stuck-at 0) or logic 1 (stuck-at 1) and cannot be changed.
- The fault controller can be configured with a fault classification management model, namely the four-level fault classification management model (F4CM) designed in this application.
- The fault management system for the functional safety of automotive-grade chips provided by this application uses a fine-grained fault classification scheme so that system software can accurately locate each fault and respond with reasonable measures in an effective and timely manner, which improves system availability when a fault occurs. It also reduces the fault detection load on system software, which facilitates fast, high-coverage, and individually configurable power-on and power-off (Power-down) self-checks of the chip.
- Table 2 below lists the correspondence between the functional effects and the technical means of the fault management system provided by the embodiments of the present application.
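As a minimal illustrative sketch (not the patented hardware implementation), the four-level classification and the routing of fault information described above could be modeled in Python as follows. The decision order follows steps S2-1 to S2-4 of the claims. The names `FaultType`, `ROUTES`, `classify`, `route`, and `highest_priority`, as well as the three boolean inputs, are hypothetical and introduced only for illustration.

```python
from enum import IntEnum


class FaultType(IntEnum):
    """The four F4CM fault types; a lower value means a higher fault level."""
    FAIL_FATAL = 1        # needs assistance from the chip external system
    FAIL_SAFE = 2         # a main function has failed
    FAIL_OPERATIONAL = 3  # handled by automatic degraded operation
    FAIL_CORRECTABLE = 4  # handled by automatic error correction


# Destination of each kind of fault information, as described in the text.
ROUTES = {
    FaultType.FAIL_FATAL: "out of chip",
    FaultType.FAIL_SAFE: "System Controller",
    FaultType.FAIL_OPERATIONAL: "CPU",
    FaultType.FAIL_CORRECTABLE: "CPU",
}


def classify(needs_external_help: bool,
             main_function_failed: bool,
             needs_degraded_operation: bool) -> FaultType:
    """Apply the decision chain in order (hypothetical boolean inputs)."""
    if needs_external_help:        # step S2-2
        return FaultType.FAIL_FATAL
    if main_function_failed:       # step S2-3
        return FaultType.FAIL_SAFE
    if needs_degraded_operation:   # step S2-4
        return FaultType.FAIL_OPERATIONAL
    return FaultType.FAIL_CORRECTABLE


def route(fault_type: FaultType) -> str:
    """Return where fault information of the given type is sent."""
    return ROUTES[fault_type]


def highest_priority(faults):
    """Pick the fault with the highest level: Type 1 > Type 2 > Type 3 > Type 4."""
    return min(faults)
```

For example, `route(classify(False, True, False))` evaluates to `"System Controller"`, because a failed main function that needs no external help is classified as Fail Safe. Encoding the types as an `IntEnum` is only one way to express the priority rules; it makes "higher level" a simple comparison of type numbers.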
Claims (20)
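- A fault management system for functional safety of an automotive-grade chip, characterized by comprising a chip external system and an automotive-grade chip, wherein the automotive-grade chip comprises a fault manager, and the fault manager is configured with a fault classification management model.
- The fault management system for functional safety of an automotive-grade chip according to claim 1, characterized in that the fault manager has built in the fault classification management model, which consists of four types of faults ranked from the highest fault level to the lowest.
- The fault management system for functional safety of an automotive-grade chip according to claim 1 or 2, characterized in that the four types of faults are configured as: Type 1: a fault that requires assistance from the chip external system is configured as a fatal fault; Type 2: a fault in which a main function fails is configured as a fail-safe fault; Type 3: a fault handled by automatic degraded operation is configured as a fail-operational fault; and Type 4: a fault handled by automatic error correction is configured as a correctable fault.
- The fault management system for functional safety of an automotive-grade chip according to claim 3, characterized in that the four types of faults are further configured with: Rule 1: Type 1 > Type 2 > {Type 3, Type 4}, where "{Type 3, Type 4}" denotes the union of Type 3 and Type 4; Rule 2: Type 3 > Type 4; and Rule 3: Rule 1 > Rule 2.
- The fault management system for functional safety of an automotive-grade chip according to claim 3 or 4, characterized in that the automotive-grade chip comprises a processor, a system controller, a system configuration module, and at least one functional module located in the automotive-grade chip.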
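- The fault management system for functional safety of an automotive-grade chip according to claim 5, characterized in that the fault manager further comprises a fault injection module, a static signal detection module, and a fault control module, wherein the fault injection module is electrically connected to each of the at least one functional module located inside the chip, each functional module being configured with a safety mechanism; the fault control module is electrically connected to each functional module, the static signal detection module, the processor, the system controller, and the chip external system, respectively, and has the fault classification management model built in; and the static signal detection module is electrically connected to the system configuration module located inside the chip.
- The fault management system for functional safety of an automotive-grade chip according to claim 6, characterized in that the fault injection module injects faults into the safety mechanism by means of fault injection signals, detects the corresponding fault indication signals, and determines whether the safety mechanism itself has failed.
- The fault management system for functional safety of an automotive-grade chip according to claim 6 or 7, characterized in that the fault control module is responsible for aggregating the fault indication signals sent by its own static signal detection module and by the safety mechanism.
- The fault management system for functional safety of an automotive-grade chip according to claim 8, characterized in that the fault control module sends the generated fault information to the functional module or to the chip external system, including: outputting information classified as the fail-operational fault or the correctable fault to the processor for processing; outputting information classified as the fail-safe fault to the system controller for an automatic reset so that the system enters a safe state or resumes operation; and outputting information classified as the fatal fault to the chip external system, which assists in performing reset and power-off operations.
- The fault management system for functional safety of an automotive-grade chip according to claim 9, characterized in that the steps executed by the fault manager comprise: step S2-1, receiving a fault indication signal sent by a safety mechanism; step S2-2, determining whether the chip external system is needed to assist in handling the fault, including: if the result is "yes", determining the fault to be the fatal fault, with the chip external system assisting in reset and power-off operations; if the result is "no", executing step S2-3; step S2-3, determining whether a main function of the hardware inside the chip or of the software system running on the chip has failed, including: if the result is "yes", determining the fault to be the fail-safe fault and outputting the fault indication signal to the system controller for an automatic reset operation so that the hardware or the software system enters a safe state or resumes operation; if the result is "no", executing step S2-4; step S2-4, determining whether the main function of the hardware or the software system needs to run in a degraded mode, including: if the result is "yes", determining the fault to be the fail-operational fault and outputting the fault indication signal to the processor for degraded-operation processing; if the result is "no", determining the fault to be the correctable fault and outputting the fault indication signal to the processor for automatic error correction through the safety mechanism.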
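- The fault management system for functional safety of an automotive-grade chip according to any one of claims 6-10, characterized in that the static signal detection module monitors in real time the static signals generated by the system configuration module located inside the chip and detects failures caused by stuck-at faults.
- The fault management system for functional safety of an automotive-grade chip according to claim 11, characterized in that the fault indication signals generated by the static signal detection module are output to the fault control module and classified.
- A fault manager for functional safety of an automotive-grade chip, the fault manager being applied to a fault management system that comprises a chip external system and an automotive-grade chip, characterized in that the fault manager is configured with a fault classification management model.
- The fault manager for functional safety of an automotive-grade chip according to claim 13, characterized in that the fault control module has built in a fault classification management model consisting of four types of faults ranked from the highest fault level to the lowest.
- The fault manager for functional safety of an automotive-grade chip according to claim 14, characterized in that the four types of faults are configured as: Type 1: a fault that requires assistance from the chip external system is configured as a fatal fault; Type 2: a fault in which a main function fails is configured as a fail-safe fault; Type 3: a fault handled by automatic degraded operation is configured as a fail-operational fault; and Type 4: a fault handled by automatic error correction is configured as a correctable fault.
- The fault manager for functional safety of an automotive-grade chip according to claim 14, characterized in that the four types of faults are further configured with: Rule 1: Type 1 > Type 2 > {Type 3, Type 4}, where "{Type 3, Type 4}" denotes the union of Type 3 and Type 4; Rule 2: Type 3 > Type 4; and Rule 3: Rule 1 > Rule 2.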
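- The fault manager for functional safety of an automotive-grade chip according to any one of claims 14-16, characterized in that the fault manager comprises a fault injection module, a static signal detection module, and a fault control module, wherein: the fault injection module is electrically connected to each of the at least one functional module located inside the chip, each functional module being configured with a safety mechanism; the fault control module is electrically connected to each functional module, the static signal detection module, the processor, the system controller, and the chip external system, respectively, and has the fault classification management model built in; and the static signal detection module is electrically connected to the system configuration module located inside the chip.
- The fault manager for functional safety of an automotive-grade chip according to any one of claims 14-17, characterized in that the fault control module generates fault information according to the different scenarios in which the chip is applied and the type of the fault.
- The fault manager for functional safety of an automotive-grade chip according to claim 7, characterized in that the fault injection module generates fault indication signals for input to the fault control module, the fault control module further comprises four fault selection units, and multiple correspondences between the fault information and the fault indication signals can be formed by configuring the fault selection units.
- The fault manager for functional safety of an automotive-grade chip according to claim 11, characterized in that the multiple correspondences include one-to-one, one-to-many, and/or many-to-one, so as to accommodate different application scenarios and different functional safety level requirements.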
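The configurable correspondence between fault indication signals and fault information can be illustrated with a small Python sketch. It assumes, purely for illustration, that each of the four fault selection units feeds one fault type and asserts its output when any of its enabled signals is active (a logical OR). The signal names, `EXAMPLE_CONFIG`, and the `FaultSelection` class are hypothetical and do not come from the application.

```python
FAULT_TYPES = ("Fail Fatal", "Fail Safe", "Fail Operational", "Fail Correctable")


class FaultSelection:
    """One fault selection unit (assumed: one unit per fault type)."""

    def __init__(self, fault_type):
        self.fault_type = fault_type
        self.enabled_signals = set()

    def configure(self, signals):
        """Software pre-configuration: choose which signals this unit listens to."""
        self.enabled_signals = set(signals)

    def evaluate(self, active_signals):
        """Assert the output if any enabled signal is active (assumed OR logic)."""
        return bool(self.enabled_signals & set(active_signals))


def build_units(config):
    """Create the four units and apply a signal-to-type configuration."""
    units = {}
    for fault_type in FAULT_TYPES:
        unit = FaultSelection(fault_type)
        unit.configure(config.get(fault_type, ()))
        units[fault_type] = unit
    return units


def fault_information(units, active_signals):
    """Return the fault types asserted for the given active signals."""
    return [ft for ft in FAULT_TYPES if units[ft].evaluate(active_signals)]


# Hypothetical configuration showing the three kinds of correspondence.
EXAMPLE_CONFIG = {
    "Fail Fatal": ["lockstep_mismatch"],                     # one-to-one
    "Fail Safe": ["watchdog_timeout", "ecc_uncorrectable"],  # many-to-one
    "Fail Operational": ["ecc_uncorrectable"],               # one-to-many (with Fail Safe)
    "Fail Correctable": ["ecc_corrected"],                   # one-to-one
}
```

With this configuration, `fault_information(build_units(EXAMPLE_CONFIG), ["ecc_uncorrectable"])` returns `["Fail Safe", "Fail Operational"]`, which is a one-to-many correspondence: a single fault indication signal raises two kinds of fault information.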
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/891,501 US20220392280A1 (en) | 2020-02-20 | 2022-08-19 | Fault management system for functional safety of automotive grade chip |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010103727.8A CN110955571B (zh) | 2020-02-20 | 2020-02-20 | 面向车规级芯片功能安全的故障管理系统 |
CN202010103727.8 | 2020-02-20 |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/891,501 Continuation US20220392280A1 (en) | 2020-02-20 | 2022-08-19 | Fault management system for functional safety of automotive grade chip |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021164679A1 (zh) | 2021-08-26 |
Family
ID=69985704
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/076492 WO2021164679A1 (zh) | 2020-02-20 | 2021-02-10 | 面向车规级芯片功能安全的故障管理系统 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20220392280A1 (zh) |
CN (1) | CN110955571B (zh) |
WO (1) | WO2021164679A1 (zh) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110955571B (zh) * | 2020-02-20 | 2020-07-03 | 南京芯驰半导体科技有限公司 | 面向车规级芯片功能安全的故障管理系统 |
CN114968646A (zh) * | 2022-07-27 | 2022-08-30 | 南京芯驰半导体科技有限公司 | 一种功能故障处理系统及其方法 |
CN115792583B (zh) * | 2023-02-06 | 2023-05-12 | 中国第一汽车股份有限公司 | 一种车规级芯片的测试方法、装置、设备及介质 |
CN116501008B (zh) * | 2023-03-31 | 2024-03-05 | 北京辉羲智能信息技术有限公司 | 一种面向自动驾驶控制芯片的故障管理系统 |
CN116681015B (zh) * | 2023-08-03 | 2023-12-22 | 苏州国芯科技股份有限公司 | 一种芯片设计方法、装置、设备及存储介质 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140201583A1 (en) * | 2013-01-15 | 2014-07-17 | Scaleo Chip | System and Method For Non-Intrusive Random Failure Emulation Within an Integrated Circuit |
CN105365712A (zh) * | 2015-11-05 | 2016-03-02 | 东风汽车公司 | 一种用于车身控制系统的功能安全电路及控制方法 |
CN109308367A (zh) * | 2017-07-26 | 2019-02-05 | 台湾积体电路制造股份有限公司 | 对电子装置的安全电路进行仿真的方法 |
CN109709849A (zh) * | 2018-12-20 | 2019-05-03 | 浙江吉利汽车研究院有限公司 | 单片机安全运行控制方法与装置 |
CN109709963A (zh) * | 2018-12-29 | 2019-05-03 | 百度在线网络技术(北京)有限公司 | 无人驾驶控制器及无人驾驶车辆 |
CN110658807A (zh) * | 2019-10-16 | 2020-01-07 | 上海仁童电子科技有限公司 | 一种车辆故障诊断方法、装置及系统 |
CN110955571A (zh) * | 2020-02-20 | 2020-04-03 | 南京芯驰半导体科技有限公司 | 面向车规级芯片功能安全的故障管理系统 |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104360868B (zh) * | 2014-11-29 | 2017-10-24 | 中国航空工业集团公司第六三一研究所 | 一种大型飞机综合处理平台中的多级故障管理方法 |
US10685159B2 (en) * | 2018-06-27 | 2020-06-16 | Intel Corporation | Analog functional safety with anomaly detection |
CN109484474B (zh) * | 2018-09-19 | 2021-06-08 | 上海汽车工业(集团)总公司 | Eps控制模块及其控制系统和控制方法 |
- 2020-02-20: CN application CN202010103727.8A (patent CN110955571B/zh), status: Active
- 2021-02-10: WO application PCT/CN2021/076492 (patent WO2021164679A1/zh), status: Application Filing
- 2022-08-19: US application US17/891,501 (patent US20220392280A1/en), status: Pending
Also Published As
Publication number | Publication date |
---|---|
US20220392280A1 (en) | 2022-12-08 |
CN110955571B (zh) | 2020-07-03 |
CN110955571A (zh) | 2020-04-03 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
WO2021164679A1 (zh) | 面向车规级芯片功能安全的故障管理系统 | |
US20180111626A1 (en) | Method and device for handling safety critical errors | |
US8732522B2 (en) | System on chip fault detection | |
US10649487B2 (en) | Fail-safe clock monitor with fault injection | |
US11774487B2 (en) | Electrical and logic isolation for systems on a chip | |
CN107193680A (zh) | 一种心跳检测方法、设备及系统 | |
CN116049249A (zh) | 报错信息处理方法、装置、系统、设备和存储介质 | |
US8255769B2 (en) | Control apparatus and control method | |
CN114968646A (zh) | 一种功能故障处理系统及其方法 | |
US10467889B2 (en) | Alarm handling circuitry and method of handling an alarm | |
CN108254670A (zh) | 用于高速交换SoC的健康监控电路结构 | |
JP7012915B2 (ja) | コントローラ | |
US8478478B2 (en) | Processor system and fault managing unit thereof | |
CN104050051B (zh) | 一种星载计算机的故障诊断方法 | |
US20210397502A1 (en) | Method and system for fault collection and reaction in system-on-chip | |
CN107179911A (zh) | 一种重启管理引擎的方法和设备 | |
US9164852B2 (en) | System on chip fault detection | |
CN110991673A (zh) | 用于复杂系统的故障隔离和定位方法 | |
JP5337661B2 (ja) | メモリ制御装置及びメモリ制御装置の制御方法 | |
CN111859843B (zh) | 检测电路故障的方法及其装置 | |
CN103391207B (zh) | 异构的故障管理系统 | |
CN109885450B (zh) | 主动式星载计算机健康状态监视优化方法及系统 | |
Pandya et al. | Software Validation for Safety System based on IEC61508 | |
JP5151216B2 (ja) | 論理機能回路と自己診断回路とからなる統合回路の設計方法 | |
CN111061243B (zh) | 电子控制器程序流监控系统及方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21757528; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 21757528; Country of ref document: EP; Kind code of ref document: A1 |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 21757528; Country of ref document: EP; Kind code of ref document: A1 |
| | 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.07.2023) |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 21757528; Country of ref document: EP; Kind code of ref document: A1 |