CN116932272A - Error reporting method and microprocessor - Google Patents

Info

Publication number
CN116932272A
Authority
CN
China
Prior art keywords
error
functional module
reporter
reporting
microprocessor
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311181976.9A
Other languages
Chinese (zh)
Other versions
CN116932272B (en)
Inventor
郭御风
窦强
周强
吴欢欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Phytium Technology Co Ltd
Original Assignee
Phytium Technology Co Ltd
Application filed by Phytium Technology Co Ltd
Priority to CN202311181976.9A
Publication of CN116932272A
Application granted
Publication of CN116932272B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/24 Resetting means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation, the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0721 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation, the processing taking place on a specific hardware platform or in a specific software environment, within a central processing unit [CPU]
    • G06F11/0724 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation, the processing taking place on a specific hardware platform or in a specific software environment, within a central processing unit [CPU], in a multiprocessor or a multi-core unit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/24 Handling requests for interconnection or transfer for access to input/output bus using interrupt

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application provides an error reporting method and a microprocessor. The method is applied to a microprocessor that comprises a functional module, an error reporter and an MCU, where the functional module is connected to the error reporter, the error reporter is connected to the MCU, and the functional module comprises a hardware module for implementing a specific processor function. The method comprises the following steps: the MCU receives an error report message sent by the error reporter, the error report message being generated by the error reporter upon receiving an error signal sent by the functional module; and the MCU controls the functional module to reset when it detects that the error report message has not been processed within a set time length. With this method, an error of the functional module can be repaired in time even when the error reporting path between the functional module and the processor core is abnormal.

Description

Error reporting method and microprocessor
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and a microprocessor for reporting errors.
Background
Microprocessors are large-scale integrated circuit devices having a central processing unit function, and their internal architecture includes various functional modules, such as peripheral type controllers, on-chip memory units, memory type controllers, etc., in addition to a processor core.
During operation of the microprocessor, a functional module may encounter an error and malfunction, in which case the processor core must repair the error. In some cases, however, the error reporting path between the functional module and the processor core may itself become abnormal, so that the error is not reported to the processor core in time. The functional module then cannot be repaired in time, which may lead to a more serious processor exception.
Disclosure of Invention
In view of the above technical problem, the present application provides an error reporting method and a microprocessor that can repair an error of a functional module in time even when the error reporting path between the functional module and the processor core is abnormal.
To achieve this technical purpose, the present application provides the following technical solutions:
A first aspect of the present application provides an error reporting method applied to a microprocessor. The microprocessor includes a functional module, an error reporter and an MCU; the functional module is connected to the error reporter, the error reporter is connected to the MCU, and the functional module includes a hardware module for implementing a specific processor function. The method includes: the MCU receives an error report message sent by the error reporter, where the error report message is generated by the error reporter upon receiving an error signal sent by the functional module; and the MCU controls the functional module to reset when it detects that the error report message has not been processed within a set time length.
A second aspect of the present application provides another error reporting method applied to a microprocessor. The microprocessor includes a functional module, an error reporter and an MCU; the functional module is connected to the error reporter, the error reporter is connected to the MCU, and the functional module includes a hardware module for implementing a specific processor function. The method includes: upon receiving an error signal sent by the functional module, the error reporter sends an error report message to the MCU, so that the MCU controls the functional module to reset when it detects that the error report message has not been processed within a set time length. The error report message indicates that an error has occurred in the functional module.
A third aspect of the present application provides a microprocessor including a functional module, an error reporter and an MCU, where the functional module is connected to the error reporter, the error reporter is connected to the MCU, and the functional module includes a hardware module for implementing a specific processor function. The MCU is configured to execute the steps of the above error reporting method that are performed by the MCU, and/or the error reporter is configured to execute the steps of the above error reporting method that are performed by the error reporter.
According to any one of the first to third aspects, an error reporter is provided in the microprocessor. After receiving an error signal sent by the functional module, the error reporter sends an error report message to the MCU in the microprocessor, and the MCU controls the functional module to reset when it detects that the error report message has not been processed within a set time length. With this scheme, when a functional module of the microprocessor has an error that is not repaired in time because the error reporting path between the functional module and the processor core is abnormal, the functional module can still be reset by the MCU. The functional module is thus repaired in time, and a more serious processor exception caused by a late repair is avoided.
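The timeout behaviour of the MCU described above can be sketched in software as follows. This is a minimal illustrative model, not the patented implementation: the class name Mcu, the method names, the use of seconds as the time unit, and the choice to reset once the elapsed time reaches the set time length are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Mcu:
    """Hypothetical model of the MCU's timeout check; all names are illustrative."""
    timeout: float                                  # the "set time length" (unit assumed: seconds)
    pending: dict = field(default_factory=dict)     # module name -> arrival time of its report
    resets: list = field(default_factory=list)      # modules the MCU has reset so far

    def on_error_report(self, module: str, now: float) -> None:
        # Remember when the first unprocessed report for this module arrived.
        self.pending.setdefault(module, now)

    def on_handled(self, module: str) -> None:
        # Called when the error has been processed, e.g. by a processor core.
        self.pending.pop(module, None)

    def tick(self, now: float) -> None:
        # Reset every module whose report stayed unprocessed for the full timeout.
        for module, arrived in list(self.pending.items()):
            if now - arrived >= self.timeout:
                self.resets.append(module)
                del self.pending[module]


mcu = Mcu(timeout=5.0)
mcu.on_error_report("memory_controller", now=0.0)
mcu.on_error_report("peripheral_controller", now=1.0)
mcu.on_handled("peripheral_controller")   # processed in time, so no reset is needed
mcu.tick(now=4.0)                         # too early for any reset
mcu.tick(now=5.0)                         # memory_controller has timed out
print(mcu.resets)                         # ['memory_controller']
```

In this sketch only the module whose report was never processed is reset; a report that is processed before the deadline is simply dropped from the pending set.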
In some implementations, the error reporter is provided with a packet status register and an error record status register. A first register bit in the packet status register stores an error identifier corresponding to the error signal, where the first register bit is the register bit corresponding to the error signal. The error record status register corresponds to the error identifier and records error information corresponding to the error signal, the error information including an error type. Controlling the functional module to reset includes: accessing the packet status register of the error reporter to obtain the error identifier corresponding to the error report message; accessing the error record status register corresponding to the error identifier to determine the error information corresponding to the error report message; and controlling the functional module to reset based on the error information. In this implementation, the error reporter records error-related information effectively; even when many errors occur, the error information does not become confused, and the error records remain organized and accurate. By accessing the error reporter, the MCU obtains accurate error information about the functional module and can therefore accurately repair the functional module in which the error occurred.
In some implementations, the microprocessor further includes a clock reset module connected to the MCU, and the MCU controlling the functional module to reset includes: the MCU instructs the clock reset module to reset the functional module. In this implementation, the MCU invokes the clock reset module in the microprocessor to reset the functional module, which improves reset efficiency and reduces the workload of the MCU.
In some implementations, the microprocessor further includes a processor core, and the error reporter is further connected to the processor core, and the error reporter sends an error report message to the processor core when receiving an error signal sent by the functional module. In the implementation mode, the error reporting device reports errors to the MCU and the processor core at the same time, so that the error repairing efficiency can be improved.
In some implementations, the microprocessor further includes a processor core and an interrupt controller, where the error reporter is connected to the interrupt controller and the interrupt controller is connected to the processor core. Upon receiving an error signal sent by the functional module, the error reporter triggers the interrupt controller to send a first interrupt signal to the processor core, the first interrupt signal indicating that an error has occurred in the functional module. In this implementation, the error reporter reports the error to the MCU and to the processor core at the same time, which improves error repair efficiency.
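The dual reporting path (error report message to the MCU, first interrupt signal to the processor core via the interrupt controller) can be illustrated with a small software model. The sketch below is only an assumption-laden illustration: McuInbox, InterruptController, ErrorReporter and their methods are hypothetical names, and real hardware would do this with signals rather than method calls.

```python
class McuInbox:
    """Stand-in for the MCU side: it only collects error report messages."""

    def __init__(self):
        self.reports = []

    def receive(self, module):
        self.reports.append(module)


class InterruptController:
    """Stand-in for the interrupt controller: it records the interrupts it forwards."""

    def __init__(self):
        self.interrupts = []

    def trigger(self, module):
        # Models the "first interrupt signal" sent on to the processor core.
        self.interrupts.append(module)


class ErrorReporter:
    """Hypothetical error reporter that notifies both paths for every error signal."""

    def __init__(self, mcu, interrupt_controller):
        self.mcu = mcu
        self.interrupt_controller = interrupt_controller

    def on_error_signal(self, module):
        self.mcu.receive(module)                    # path 1: error report message to the MCU
        self.interrupt_controller.trigger(module)   # path 2: interrupt toward the processor core


mcu = McuInbox()
controller = InterruptController()
reporter = ErrorReporter(mcu, controller)
reporter.on_error_signal("on_chip_memory")
print(mcu.reports, controller.interrupts)
```

Because both paths are notified for the same error signal, the MCU still holds a report even if the interrupt path to the processor core fails.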
In some implementations, when there are a plurality of error signals, the error reporter sending an error report message to the MCU includes: the error reporter determines the reporting priority of each error signal and sends the error report messages corresponding to the error signals to the MCU in order of reporting priority from high to low. In this implementation, the error reporter reports errors in an orderly manner and ensures that high-priority errors are reported, and therefore processed, first, which improves the error processing efficiency of the microprocessor.
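Priority-ordered reporting can be sketched as a simple sort. The example below is illustrative only: the priority table, the convention that a larger number means a higher priority, and the function name reporting_order are assumptions, since the text does not specify how priorities are encoded.

```python
def reporting_order(signals, priority_of):
    """Return the error signals ordered from highest to lowest reporting priority.

    sorted() is stable, so signals with equal priority keep their arrival order.
    """
    return sorted(signals, key=lambda signal: priority_of[signal], reverse=True)


# Hypothetical priority table: a larger number means a higher reporting priority.
PRIORITY = {"uncorrectable error": 3, "illegal access": 2, "correctable error": 1}

arrived = ["correctable error", "uncorrectable error", "illegal access"]
order = reporting_order(arrived, PRIORITY)
print(order)  # ['uncorrectable error', 'illegal access', 'correctable error']
```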
In some implementations, the error reporter sending an error report message to the MCU includes: determining whether the error corresponding to the error signal is a correctable error, where a correctable error is an error that can be corrected while the functional module is running; if the error is a correctable error, determining whether the number of correctable errors that have occurred is greater than a preset number; and if the number is greater than the preset number, sending an error report message to the MCU and resetting the count of correctable errors. In this implementation, the error reporter identifies correctable errors and reports them only after a certain number have accumulated, which avoids wasting error reporting resources and improves error reporting efficiency.
In some implementations, the method further includes: when the number of correctable errors that have occurred is not greater than the preset number, continuing to count the correctable errors. In this implementation, the error reporter counts and records correctable errors, which ensures that they are tallied accurately and helps avoid more serious processor errors caused by repeated correctable errors.
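The correctable-error threshold described in the two preceding implementations can be sketched as a small counter. This is a hedged illustration: the class and method names are hypothetical, the count is assumed to be incremented before it is compared with the preset number, and errors that are not correctable are assumed to be reported immediately (the text does not state this explicitly here).

```python
class CorrectableErrorFilter:
    """Hypothetical filter deciding when an error report message goes to the MCU."""

    def __init__(self, preset_number: int):
        self.preset_number = preset_number
        self.count = 0   # correctable errors seen since the last report

    def should_report(self, correctable: bool) -> bool:
        if not correctable:
            # Assumption: errors that cannot be corrected are reported at once.
            return True
        self.count += 1
        if self.count > self.preset_number:
            self.count = 0   # reset the count once a report is sent
            return True
        return False         # keep counting, nothing reported yet


error_filter = CorrectableErrorFilter(preset_number=2)
decisions = [error_filter.should_report(correctable=True) for _ in range(4)]
print(decisions)  # [False, False, True, False]
```

With a preset number of 2, the third correctable error triggers a report and the count starts again from zero.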
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a microprocessor according to an embodiment of the application.
Fig. 2 is a schematic diagram of another microprocessor according to an embodiment of the present application.
Fig. 3 is a flow chart of an error reporting method according to an embodiment of the present application.
Fig. 4 is a schematic diagram of another microprocessor according to an embodiment of the present application.
Fig. 5 is a schematic diagram of another microprocessor according to an embodiment of the present application.
Fig. 6 is a flowchart of another error reporting method according to an embodiment of the present application.
Fig. 7-12 are schematic diagrams illustrating other configurations of microprocessors according to embodiments of the present application.
Fig. 13 is a flowchart of another error reporting method according to an embodiment of the present application.
Fig. 14 (a) -14 (c) are schematic diagrams illustrating other configurations of microprocessors according to embodiments of the present application.
Fig. 15-17 are schematic structural diagrams of some computer systems according to embodiments of the present application.
Fig. 18 is a flowchart of another error reporting method according to an embodiment of the present application.
Fig. 19 is a schematic diagram of another computer system according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
A microprocessor is a large-scale integrated circuit device with CPU functions. It fetches and executes instructions and exchanges information with external memory and logic components. It is the operation and control part of a microcomputer and can be combined with memory and peripheral circuit chips to form a microcomputer.
Fig. 1 shows a schematic diagram of a microprocessor. As shown in fig. 1, the microprocessor includes one or more processor cores 001 and a plurality of different functional modules 002, such as a peripheral controller, an on-chip memory unit, and a memory controller. The microprocessor also includes an interrupt controller 003, which is connected to the processor core 001 through an address bus of the microprocessor. The functional module 002 is connected to the interrupt controller 003, so that a path is formed among the functional module 002, the interrupt controller 003 and the processor core 001.
The processor core 001 is responsible for the control and operation processing of the whole microprocessor, and the interrupt controller 003 is used for generating an interrupt and reporting the interrupt to the processor core 001, so that the processor core 001 processes the interrupt. Interrupt controller 003 generates interrupts including, but not limited to, interrupts generated when a business process interrupt, an emergency interrupt, a high priority event interrupt, a fault interrupt, etc., occurs.
During operation of the microprocessor, errors may also occur in a functional module 002, such as errors in the module's operating program or data errors. When an error occurs, the functional module 002 sends an error signal to the interrupt controller 003, and the interrupt controller 003 then sends an interrupt to the processor core 001 according to that error signal, so that the processor core 001 handles the interrupt and repairs the error of the functional module 002.
However, the interrupt controller 003 is responsible for controlling the interrupts of the entire microprocessor, that is, every type of interrupt must pass through the interrupt controller 003. Having the functional module 002 send its error signals to the interrupt controller 003 therefore further increases the workload of the interrupt controller 003 and occupies its resources. It may also cause interrupts generated by errors of the functional module 002 to be mixed with other types of interrupts, so that the processor core 001 cannot learn of and handle the errors of the functional module 002 in time, which affects the hardware functions of the microprocessor.
To solve the above technical problem, an embodiment of the present application first proposes an improved microprocessor. As shown in fig. 2, the microprocessor includes a processor core 001, a functional module 002, and an error reporter 004. There may be one or more processor cores 001 and one or more functional modules 002. Each functional module 002 is connected to the error reporter 004, and the error reporter 004 is directly connected to the processor core 001 through a separate path. The error reporter 004 may be implemented by adding an error reporting function to any device having a data processing function, for example by a processor that executes the error reporting function, or by a hardware circuit that executes an error reporting program.
Based on the above-mentioned microprocessor structure, the embodiment of the present application proposes an error reporting method applied to the microprocessor, where the method is cooperatively executed by the error reporting device 004 and the processor core 001 in the microprocessor, that is, in the above-mentioned microprocessor according to the embodiment of the present application, the error reporting device 004 and the processor core 001 are respectively configured to execute the processing steps of the corresponding execution bodies in the error reporting method as described below.
Referring to fig. 3, the processing procedure of the error reporting method provided by the embodiment of the present application includes:
S101, the functional module sends an error signal to the error reporter when an error occurs.
Specifically, during operation of the microprocessor, if an error occurs in one or more of the functional modules 002 on the microprocessor, such as the peripheral controller, the on-chip memory unit, or the memory controller, that functional module actively sends an error signal to the error reporter 004 to inform it that an error has occurred.
In some implementations, the functional module 002 and the error reporter 004 are connected through a plurality of ports, where different ports correspond to different types of errors. When a certain type of error occurs in the functional module 002, it sends a signal to the error reporter 004 through the port corresponding to that type of error, so that the error reporter 004 learns both that an error has occurred in the functional module 002 and what type of error it is.
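The port-per-error-type arrangement can be illustrated by decoding which ports are asserted. The sketch below is an assumption: the port numbering, the representation of asserted ports as a bitmask, and the particular error types assigned to each port are invented for illustration and are not specified in the text.

```python
# Hypothetical assignment of error types to ports 0..3 of one functional module.
PORT_ERROR_TYPES = (
    "correctable error",
    "uncorrectable error",
    "illegal address",
    "external timeout",
)


def decode_ports(asserted: int) -> list:
    """Return the error types whose ports are asserted in the given bitmask."""
    return [
        error_type
        for port, error_type in enumerate(PORT_ERROR_TYPES)
        if asserted & (1 << port)
    ]


types_seen = decode_ports(0b0101)   # ports 0 and 2 are asserted
print(types_seen)                   # ['correctable error', 'illegal address']
```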
S102, the error reporting device sends an error reporting message to the processor core under the condition that the error reporting device receives an error signal sent by the functional module.
The error report message is used to indicate that the functional module has an error.
Specifically, when receiving the error signal sent by the functional module 002, the error reporter 004 can determine that an error occurs in the functional module, and at this time, the error reporter 004 sends an error reporting message to the processor core 001, so as to notify the processor core 001 that an error occurs in the functional module.
The error report message may take any form and carry any content. In practice, any message that lets the processor core 001 know that an error has occurred in the functional module can serve as the error report message.
In some implementations, a register is also provided inside the error reporter 004 for storing information about errors occurring in the functional module 002.
Specifically, a packet status register and an error record status register are provided inside the error reporter 004. The packet status register stores error identifiers, and an error identifier may be, for example, an error number. Each error identifier corresponds to one error record status register, which stores the specific error information for that identifier, such as the error type, the time at which the error occurred, and information about the functional module in which it occurred. Depending on the classification used, error types may include correctable errors, uncorrectable errors, delayed errors, internal memory data errors, external memory data errors, illegal addresses, illegal accesses, slave error responses, external timeouts (such as timeouts when accessing other modules), internal timeouts (such as timeouts when accessing other modules), and the like.
When the error reporter 004 receives the error signal sent by the functional module 002, firstly, according to the error signal, an error identifier corresponding to the error signal is stored in a first register bit of the packet status register, and error information corresponding to the error signal, such as information of an error type, is stored in an error record status register corresponding to the error identifier.
The first register bit is a register bit corresponding to a received error signal in the packet status register. For example, the correspondence between different error signals and error identifiers may be preset, and when the error reporter 004 receives an error signal, the error identifier corresponding to the error signal may be determined according to the preset correspondence.
In some implementations, the functional module 002 is connected to the error reporter 004 through a plurality of ports, where different ports correspond to different types of errors and to different error identifiers. When a certain type of error occurs in the functional module 002, it sends an error signal to the error reporter 004 through the port corresponding to that type of error. The error reporter 004 can then determine the error identifier corresponding to the error signal from the port on which the signal was received.
Meanwhile, the correspondence relationship among the error signal, the error flag, and the register bits in the packet status register may be preset.
Based on the above correspondence, when the error reporter 004 receives the error signal sent by the functional module 002, it determines the corresponding error type and error identifier, determines the register bit corresponding to the error signal, and stores the error identifier in that register bit. The error reporter 004 then stores the error information corresponding to the error signal, such as the error type, the functional module in which the error occurred, the time, and the number of occurrences, in the error record status register corresponding to the error identifier.
By recording error-related information in the packet status register and the error record status register, the error reporter 004 records that information effectively. Even when many errors occur, the error information does not become confused, and the error records remain orderly and accurate.
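The recording step can be sketched as follows. This is a simplified software model under stated assumptions: the packet status register is modelled as an integer in which the bit at index "error identifier" is set, the error record status registers are modelled as dictionaries keyed by error identifier, and the class name, field names and port-to-identifier table are all hypothetical.

```python
class ErrorReporterRegisters:
    """Hypothetical model of the packet status and error record status registers."""

    def __init__(self, error_id_of_port):
        self.error_id_of_port = error_id_of_port   # preset mapping: port -> error identifier
        self.packet_status = 0                     # assumption: one bit per error identifier
        self.error_records = {}                    # error identifier -> recorded error information

    def record(self, port, error_type, module, time):
        error_id = self.error_id_of_port[port]
        # Mark the identifier in the register bit that corresponds to this signal.
        self.packet_status |= 1 << error_id
        # Create the record on first occurrence, then update time and occurrence count.
        entry = self.error_records.setdefault(
            error_id, {"type": error_type, "module": module, "time": time, "count": 0}
        )
        entry["time"] = time
        entry["count"] += 1
        return error_id


registers = ErrorReporterRegisters(error_id_of_port={0: 0, 1: 3})
registers.record(port=1, error_type="uncorrectable error", module="memory_controller", time=10)
registers.record(port=1, error_type="uncorrectable error", module="memory_controller", time=12)
print(bin(registers.packet_status), registers.error_records[3]["count"])  # 0b1000 2
```

Keeping one record per error identifier is what lets repeated occurrences of the same error update a single entry instead of overwriting unrelated ones.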
S103, when the processor core receives the error report message sent by the error reporter, the processor core controls the functional module to reset.
Specifically, when the processor core 001 receives the error report message, it may be determined that an error has occurred in the functional module, and at this time, the processor core 001 controls the reset of the functional module in which the error has occurred.
For example, the error report message sent by the error reporter 004 to the processor core 001 may carry information about the functional module in which the error occurred. When receiving the error report message, the processor core 001 can control that functional module to reset according to the information contained in the message.
In other implementations, the error reporter records the error identifier and the error information in a packet status register and an error record status register. In this case, when the processor core 001 receives the error report message sent by the error reporter 004, it accesses the packet status register in the error reporter 004, queries the error identifier stored there, then accesses the error record status register corresponding to that error identifier and reads from it the error information corresponding to the received error report message.
By reading the error information described above, the processor core 001 can confirm the functional module in which the error occurred, and then control the functional module to reset according to the error information described above.
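The read-back sequence (query the packet status register, then read the matching error record) can be sketched as below. The layout is an assumption made for illustration: the packet status register is taken to be an integer with one bit per error identifier, the error records are a dictionary keyed by identifier, and the function names are hypothetical.

```python
def pending_error_ids(packet_status: int) -> list:
    """Return the error identifiers whose bits are set in the packet status register."""
    ids = []
    bit = 0
    while packet_status >> bit:
        if (packet_status >> bit) & 1:
            ids.append(bit)
        bit += 1
    return ids


def modules_to_reset(packet_status: int, error_records: dict) -> list:
    """Look up each pending identifier's record and collect the faulty modules."""
    return [error_records[error_id]["module"] for error_id in pending_error_ids(packet_status)]


# Hypothetical register contents read from the error reporter.
packet_status = 0b1010
error_records = {
    1: {"type": "illegal access", "module": "peripheral_controller"},
    3: {"type": "uncorrectable error", "module": "memory_controller"},
}
targets = modules_to_reset(packet_status, error_records)
print(targets)  # ['peripheral_controller', 'memory_controller']
```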
For example, a communication link may be established directly between the processor core 001 and each functional module 002, so that the processor core 001 can send a reset control signal over that link to the functional module in which the error occurred and thereby control it to reset.
In other implementations, referring to fig. 4, the microprocessor according to the embodiment of the present application further includes an MCU (Microcontroller Unit) 005 and a clock reset module 006. The MCU 005 is connected to the processor core 001, and the clock reset module 006 is connected to the MCU 005.
The MCU 005 controls and schedules specific modules on the microprocessor, namely the modules configured to be controlled by the MCU, which include the clock reset module. The clock reset module 006 controls the clock and reset of the various functional modules on the microprocessor.
Based on the microprocessor structure shown in fig. 4, to reset a functional module the processor core 001 sends a reset instruction to the MCU 005. After receiving the reset instruction, the MCU 005 sends indication information to the clock reset module 006, instructing it to reset the functional module. After receiving the indication information from the MCU 005, the clock reset module 006 resets the functional module.
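The reset chain of fig. 4 (processor core to MCU to clock reset module) can be sketched as three cooperating objects. This is only an illustrative model: the class and method names are hypothetical, and in hardware the "instruction" and "indication information" would be signals or register writes rather than method calls.

```python
class ClockResetModule:
    """Stand-in for the clock reset module: it records which modules it has reset."""

    def __init__(self):
        self.reset_log = []

    def reset(self, module):
        self.reset_log.append(module)


class Mcu:
    """Stand-in for the MCU: it forwards reset instructions to the clock reset module."""

    def __init__(self, clock_reset):
        self.clock_reset = clock_reset

    def handle_reset_instruction(self, module):
        # Models the "indication information" sent to the clock reset module.
        self.clock_reset.reset(module)


class ProcessorCore:
    """Stand-in for the processor core: on an error report it asks the MCU to reset."""

    def __init__(self, mcu):
        self.mcu = mcu

    def on_error_report(self, module):
        self.mcu.handle_reset_instruction(module)


clock_reset = ClockResetModule()
core = ProcessorCore(Mcu(clock_reset))
core.on_error_report("peripheral_controller")
print(clock_reset.reset_log)  # ['peripheral_controller']
```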
As can be seen from the above description, in the error reporting method provided by the embodiment of the present application, an error reporter is provided in the microprocessor. After receiving an error signal sent by a functional module, the error reporter sends the processor core an error report message indicating that an error has occurred in the functional module, so that the processor core learns of the error when it occurs and can repair the functional module in time.
In the microprocessor shown in fig. 2, the error reporter 004 and the processor core 001 are each configured to execute the processing steps of the corresponding execution body in the error reporting method described in any of the foregoing embodiments, so that the microprocessor achieves higher error reporting efficiency and higher error repair efficiency.
Based on the microprocessor, any computer device containing the microprocessor, such as a personal computer, an industrial computer, an intelligent terminal, a handheld terminal, an intelligent wearable device, a server and the like, can be constructed.
In other embodiments, another microprocessor is also provided, and as shown in fig. 5, the microprocessor includes a processor core 001, a functional module 002, an interrupt controller 003, and an error reporter 004. The number of the processor cores 001 may be one or more, and the number of the functional modules 002 may be one or more, where each of the functional modules 002 is connected to the error reporter 004, the error reporter 004 is connected to the interrupt controller 003, and the interrupt controller 003 is connected to the processor cores 001 through an address bus.
The error reporter 004 may be implemented by adding an error reporting function to any device with data processing capability. For example, it may be implemented by a processor that executes an error reporting program, or by a hardware circuit that performs the error reporting function.
Based on the above microprocessor structure, the embodiment of the present application proposes an error reporting method applied to the microprocessor. The method is executed cooperatively by the error reporter 004, the processor core 001 and the interrupt controller 003; that is, in this microprocessor, these three components are each configured to execute the processing steps of their corresponding execution bodies in the error reporting method described below.
Referring to fig. 6, the processing procedure of the error reporting method provided by the embodiment of the present application includes:
S201, the functional module sends an error signal to the error reporter when an error occurs.
Specifically, while the microprocessor is operating, if an error occurs in a functional module 002 on the microprocessor, for example in one or more of a peripheral controller, an on-chip storage unit, or a storage controller, the functional module 002 actively sends an error signal to the error reporter 004 to inform it that an error has occurred in that functional module.
In some implementations, the functional module 002 and the error reporter 004 are connected through a plurality of ports, with different ports corresponding to different types of errors. When a certain type of error occurs in the functional module 002, it sends a signal to the error reporter 004 through the port corresponding to that type, so that the error reporter 004 learns both that an error has occurred in the functional module 002 and what type of error it is.
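The port-per-error-type arrangement can be illustrated with a small lookup, given here as a sketch only. The port numbers, the three error types chosen and the function name `decode_error_signal` are assumptions made for illustration; the application does not specify them.

```python
from enum import Enum


class ErrorType(Enum):
    CORRECTABLE = "correctable"
    UNCORRECTABLE = "uncorrectable"
    EXTERNAL_TIMEOUT = "external_timeout"


# Assumed wiring: each port of the error reporter is dedicated to one error type.
PORT_TO_ERROR_TYPE = {
    0: ErrorType.CORRECTABLE,
    1: ErrorType.UNCORRECTABLE,
    2: ErrorType.EXTERNAL_TIMEOUT,
}


def decode_error_signal(module_id, port):
    """Return (module, error type) for a signal that arrived on `port`."""
    return module_id, PORT_TO_ERROR_TYPE[port]


module, error_type = decode_error_signal("ddr_ctrl", 1)
```

Because the type is implied by the port, the error signal itself does not need to carry any payload describing the error.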
S202, the error reporter, on receiving an error signal sent by the functional module, triggers the interrupt controller to send a first interrupt signal to the processor core.
The first interrupt signal indicates that an error has occurred in the functional module.
Specifically, on receiving the error signal sent by the functional module 002, the error reporter 004 can determine that an error has occurred and which functional module it occurred in. The error reporter 004 then sends an error trigger signal to the interrupt controller 003. The error trigger signal informs the interrupt controller 003 that an error has occurred in a functional module and triggers it to send an interrupt signal to the processor core 001, so that the processor core 001 learns of the error.
The error trigger signal sent by the error reporter 004 to the interrupt controller 003 may take any form and carry any content. In an actual implementation, any signal whose form and content let the interrupt controller 003 know that an error has occurred in a functional module can serve as the error trigger signal.
In some implementations, a register is also provided inside the error reporter 004 for storing information about errors occurring in the functional module 002.
Specifically, a packet status register and error record status registers are provided inside the error reporter 004. The packet status register stores error identifiers, such as error numbers. Each error identifier corresponds to one error record status register, which stores the specific error information for that identifier, such as the error type, the time at which the error occurred, and information about the functional module in which it occurred. Depending on the classification scheme, error types may include correctable errors, uncorrectable errors, delayed errors, internal memory data errors, external memory data errors, illegal addresses, illegal accesses, slave error responses, external timeouts (such as timeouts when accessing other modules), internal timeouts (such as timeouts when accessing other modules), and the like.
When the error reporter 004 receives the error signal sent by the functional module 002, it parses the error type from the error signal and determines the corresponding error identifier. It then stores that error identifier in the first register bit of the packet status register and stores the error information corresponding to the error signal, such as the error type, in the error record status register corresponding to the error identifier.
The first register bit is a register bit corresponding to a received error signal in the packet status register. For example, the correspondence between different error signals and error identifiers may be preset, and when the error reporter 004 receives an error signal, the error identifier corresponding to the error signal may be determined according to the preset correspondence.
In some implementations, the functional module 002 is connected to the error reporter 004 through a plurality of ports, with different ports corresponding to different types of errors and to different error identifiers. When a certain type of error occurs in the functional module 002, it sends an error signal to the error reporter 004 through the port corresponding to that type. The error reporter 004 can then determine the error identifier corresponding to the error signal from the port on which the signal was received.
In addition, the correspondence among the error signal, the error identifier, and the register bits in the packet status register may be preset.
Based on this correspondence, when the error reporter 004 receives the error signal sent by the functional module 002, it determines the corresponding error type and error identifier, determines the register bit corresponding to the error signal, and stores the error identifier in that register bit. The error reporter 004 then stores the error information corresponding to the error signal, such as the error type, the functional module in which the error occurred, the time, and the number of occurrences, in the error record status register corresponding to the error identifier.
By recording error-related information in the packet status register and the error record status registers, the error reporter 004 keeps an effective record of errors. Even when many errors occur, the error information does not become mixed up, so the error records remain ordered and the data accurate.
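The two-level register scheme can be modelled as follows. This is a minimal sketch, assuming that the packet status register holds one bit per error identifier and that each error record status register can be represented as a small record; the application fixes neither the bit layout nor the field names, so `ErrorRecord`, `ErrorReporterRegisters` and their fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorRecord:
    """Assumed contents of one error record status register."""
    error_type: str
    module: str
    count: int = 0


@dataclass
class ErrorReporterRegisters:
    packet_status: int = 0  # assumed layout: bit i set means error identifier i is pending
    records: dict = field(default_factory=dict)  # error identifier -> ErrorRecord

    def record(self, error_id, error_type, module):
        # Set the register bit that corresponds to this error identifier.
        self.packet_status |= 1 << error_id
        # Store or update the detailed information for this identifier.
        rec = self.records.setdefault(error_id, ErrorRecord(error_type, module))
        rec.count += 1
        return rec

    def pending_ids(self):
        # List every error identifier whose bit is set, lowest first.
        return [i for i in range(self.packet_status.bit_length())
                if (self.packet_status >> i) & 1]


regs = ErrorReporterRegisters()
regs.record(3, "uncorrectable", "ddr_ctrl")
regs.record(3, "uncorrectable", "ddr_ctrl")
regs.record(0, "external_timeout", "pcie")
```

Keeping a compact summary register next to per-identifier detail registers lets a reader find out which errors are pending with a single read before fetching only the records it needs.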
S203, the interrupt controller, in response to the trigger from the error reporter, sends a first interrupt signal to the processor core.
Specifically, on receiving the error trigger signal sent by the error reporter 004, the interrupt controller 003 initiates an interrupt to the processor core 001 by sending it a first interrupt signal. The first interrupt signal is an error interrupt signal indicating that an error has occurred in a functional module.
S204, the processor core, on receiving the first interrupt signal sent by the interrupt controller, controls the functional module to reset.
Specifically, when the processor core 001 receives the first interrupt signal, it can determine that an error has occurred in a functional module, and it then controls that functional module to reset. The processor core 001 may either send a reset signal directly to the functional module in which the error occurred, or control another module to send the reset signal to it.
For example, the error trigger signal sent by the error reporter 004 to the interrupt controller 003 may carry information identifying the functional module in which the error occurred, and the first interrupt signal that the interrupt controller 003 then sends to the processor core 001 may carry the same information. On receiving the first interrupt signal, the processor core 001 can therefore control the reset of the functional module according to the module information contained in the first interrupt signal.
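The signal flow of S201 to S204, in the variant where the module information travels with the signals, can be sketched as below. The classes, the dictionary-based signal format and the module name "sram0" are assumptions for illustration only; the application leaves the form and content of these signals open.

```python
class Core:
    """Hypothetical model of processor core 001."""

    def __init__(self):
        self.reset_requests = []

    def on_interrupt(self, signal):
        # S204: the module to reset is read from the first interrupt signal.
        self.reset_requests.append(signal["module"])


class InterruptController:
    """Hypothetical model of interrupt controller 003."""

    def __init__(self, core):
        self.core = core

    def on_error_trigger(self, trigger):
        # S203: forward the module information in the first interrupt signal.
        self.core.on_interrupt({"kind": "error", "module": trigger["module"]})


class ErrorReporter:
    """Hypothetical model of error reporter 004."""

    def __init__(self, interrupt_controller):
        self.interrupt_controller = interrupt_controller

    def on_error_signal(self, module):
        # S202: trigger the interrupt controller, naming the faulty module.
        self.interrupt_controller.on_error_trigger({"module": module})


core = Core()
reporter = ErrorReporter(InterruptController(core))
reporter.on_error_signal("sram0")  # S201: "sram0" is an assumed module name
```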
In other implementations, the error reporter 004 records the error identifier and the error information in the packet status register and the error record status registers. In this case, when the processor core 001 receives the first interrupt signal sent by the interrupt controller 003, it accesses the packet status register in the error reporter 004 and queries the error identifier stored there. It then accesses the error record status register corresponding to that error identifier and reads from it the specific error information corresponding to the received first interrupt signal.
By reading the error information described above, the processor core 001 can confirm the functional module in which the error occurred, and then control the functional module to reset according to the error information described above.
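The register lookup performed by the processor core on receiving the first interrupt signal might look like the following sketch. It assumes a one-bit-per-identifier packet status register and dictionary-shaped error records; the function name `handle_error_interrupt` and the `reset_module` callback are hypothetical.

```python
def handle_error_interrupt(packet_status, error_records, reset_module):
    """Walk the set bits of the packet status register and reset each faulty module.

    packet_status: int, assumed layout of one bit per error identifier.
    error_records: dict mapping error identifier -> {"module": ...}.
    reset_module:  callable that performs the reset (hypothetical hook).
    """
    handled = []
    error_id = 0
    while packet_status >> error_id:  # stop once no higher bits remain set
        if (packet_status >> error_id) & 1:
            module = error_records[error_id]["module"]
            reset_module(module)
            handled.append(module)
        error_id += 1
    return handled


resets = []
handled = handle_error_interrupt(
    0b101,
    {0: {"module": "pcie"}, 2: {"module": "ddr_ctrl"}},
    resets.append,
)
```

In this variant the interrupt itself needs to carry no module information, because the core obtains everything it needs from the registers of the error reporter.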
For example, a communication link may be directly established between the processor core 001 and each functional module 002, so that the processor core 001 may send a reset control signal to the functional module having an error through the communication link, to control the functional module having an error to be reset.
In other implementations, referring to fig. 7, the microprocessor according to the embodiment of the present application further includes an MCU (Microcontroller Unit) 005 and a clock reset module 006. The MCU 005 is connected to the processor core 001, and the clock reset module 006 is connected to the MCU 005.
The MCU005 is used for controlling and scheduling specific modules (modules configured to be controlled by the MCU, including a clock reset module) on the microprocessor. The clock reset module 006 is used to control the clock and reset of the various functional modules on the microprocessor.
Based on the microprocessor structure shown in fig. 7, when the processor core 001 controls the reset of the functional module, a reset instruction is sent to the MCU005, and after receiving the reset instruction, the MCU005 sends indication information to the clock reset module 006, to instruct the clock reset module 006 to reset the functional module. After receiving the indication information sent by the MCU005, the clock reset module 006 controls the reset of the functional module.
As can be seen from the above description, in the error reporting method provided by the embodiment of the present application, an error reporter is provided in the microprocessor. After receiving an error signal sent by a functional module, the error reporter triggers the interrupt controller to send the processor core a first interrupt signal indicating that an error has occurred in that functional module. The processor core therefore learns of the error when it occurs and can repair the functional module in time.
Based on the above error reporting method, the microprocessor shown in fig. 5 is configured so that the error reporter 004, the interrupt controller 003 and the processor core 001 each execute the processing steps of their corresponding execution bodies in the error reporting method, which gives the microprocessor higher error reporting efficiency and higher error repair efficiency.
Based on the microprocessor, any computer device containing the microprocessor, such as a personal computer, an industrial computer, an intelligent terminal, a handheld terminal, an intelligent wearable device, a server and the like, can be constructed.
When the error reporting method provided by the embodiment of the application is actually applied, the error reporting method shown in fig. 3 can be selectively implemented by the microprocessor structure shown in fig. 2, and the error reporting method shown in fig. 6 can also be selectively implemented by the microprocessor structure shown in fig. 5.
In other embodiments, the microprocessor architecture shown in FIG. 2 may be integrated with the microprocessor architecture shown in FIG. 5 to provide the microprocessor shown in FIG. 8. The microprocessor includes a processor core 001, a functional module 002, an error reporting unit 004, and an interrupt controller 003.
Each of the functional modules 002 is connected to the error reporter 004, the error reporter 004 is connected to the processor core 001 through the error reporting path 1, and meanwhile, the error reporter 004 is connected to the interrupt controller 003 through the error reporting path 2, and the interrupt controller 003 is connected to the processor core 001 through the address bus.
For the specific functions and arrangement of the processor core 001, the functional module 002, the error reporter 004, and the interrupt controller 003, refer to the description of the above embodiments; they are not repeated here.
The microprocessor shown in fig. 8 includes two error reporting paths: error reporting path 1, formed by the error reporter 004 and the processor core 001, and error reporting path 2, formed by the error reporter 004, the interrupt controller 003 and the processor core 001. On error reporting path 1, the error reporter 004 and the processor core 001 execute the error reporting method shown in fig. 3, so that the error reporter 004 reports the error directly to the processor core 001. On error reporting path 2, the error reporter 004, the interrupt controller 003 and the processor core 001 execute the error reporting method shown in fig. 6, so that the error reporter 004 reports the error to the processor core 001 indirectly through the interrupt controller 003.
Based on the microprocessor shown in fig. 8, the error reporter 004 can flexibly choose whether to report an error to the processor core 001 through error reporting path 1 or through error reporting path 2. The choice may be made according to a preset path selection rule. For example, when the rule specifies that errors are to be reported through error reporting path 1, the error reporter 004 reports errors to the processor core 001 only through error reporting path 1; when the rule specifies error reporting path 2, the error reporter 004 reports errors only through error reporting path 2. The path selection rule may be adjusted in real time or updated according to a set rule.
Alternatively, the error reporter 004 may choose between error reporting path 1 and error reporting path 2 according to the processor resources available in real time. For example, when the processing resources of the interrupt controller 003 are scarce, the error reporter 004 may report an error directly to the processor core 001 through error reporting path 1; when the processing resources of the interrupt controller 003 are sufficient, it may report the error through error reporting path 2.
In other implementations, the error reporter 004 may also assign priorities to the errors to be reported and choose between error reporting path 1 and error reporting path 2 according to the resulting classification.
Specifically, the error reporter 004 assigns a priority to each error to be reported according to the type and urgency of the error generated by the functional module: the more serious the type and the greater the urgency, the higher the priority; otherwise, the lower the priority. The error reporter 004 then selects an error reporting path according to this classification. Higher-priority errors are preferably reported through error reporting path 1, so that the processor core 001 can repair the error more quickly. Lower-priority errors are preferably reported through error reporting path 2, which avoids frequently preempting the processing resources of the processor core 001 and adding to its workload.
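The three selection strategies described above (a preset rule, interrupt controller load, and error priority) can be combined in one small function, sketched below. The application presents them as alternatives, so the order in which this sketch checks them is an assumption, as are the severity ranking, the threshold value and all names used here.

```python
from enum import Enum


class Path(Enum):
    DIRECT = 1                    # error reporting path 1
    VIA_INTERRUPT_CONTROLLER = 2  # error reporting path 2


# Assumed severity ranking; a higher number means a more serious error.
SEVERITY = {"uncorrectable": 3, "external_timeout": 2, "correctable": 1}


def select_path(error_type, threshold=2, forced=None, intc_busy=False):
    """Pick a reporting path for one error (illustrative policy only)."""
    if forced is not None:
        return forced        # a preset path selection rule wins
    if intc_busy:
        return Path.DIRECT   # interrupt controller resources are scarce
    if SEVERITY.get(error_type, 1) >= threshold:
        return Path.DIRECT   # high priority: reach the core quickly
    return Path.VIA_INTERRUPT_CONTROLLER  # low priority: avoid preempting the core


urgent = select_path("uncorrectable")
routine = select_path("correctable")
```

Unknown error types fall back to the lowest severity in this sketch; a real design would need to decide that case explicitly.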
In the microprocessor shown in fig. 8, a communication link is provided between the processor core 001 and the functional module 002, so that the processor core 001 can reset the functional module 002 through the communication link.
In some implementations, the microprocessor shown in fig. 9 may be constructed on the basis of the microprocessor shown in fig. 8 by further including an MCU (Microcontroller Unit) 005 and a clock reset module 006. The MCU 005 is connected to the processor core 001, and the clock reset module 006 is connected to the MCU 005.
The MCU005 is used for controlling and scheduling specific modules (modules configured to be controlled by the MCU, including a clock reset module) on the microprocessor. The clock reset module 006 is used to control the clock and reset of the various functional modules on the microprocessor.
Based on the microprocessor structure shown in fig. 9, when the processor core 001 controls the reset of the functional module, a reset instruction is sent to the MCU005, and after receiving the reset instruction, the MCU005 sends indication information to the clock reset module 006, instructing the clock reset module 006 to reset the functional module. After receiving the indication information sent by the MCU005, the clock reset module 006 controls the reset of the functional module.
In the above-mentioned microprocessor, the specific processing procedures of the error reporting unit 004, the interrupt controller 003 and the processor core 001 when executing the above-mentioned error reporting method can be referred to the description of the embodiments of the above-mentioned error reporting method.
As can be seen from the above description, in the error reporting method provided by the embodiment of the present application, an error reporter is provided in the microprocessor. After receiving an error signal sent by a functional module, the error reporter either sends the processor core an error report message indicating that an error has occurred in the functional module, or triggers the interrupt controller to send the processor core a first interrupt signal indicating the same. The processor core therefore learns of the error when it occurs and can repair the functional module in time.
Based on the above error reporting method, the microprocessor shown in fig. 8 is configured so that the error reporter 004, the interrupt controller 003 and the processor core 001 each execute the processing steps of their corresponding execution bodies in the error reporting method, which gives the microprocessor higher error reporting efficiency and higher error repair efficiency.
Based on the microprocessor, any computer device containing the microprocessor, such as a personal computer, an industrial computer, an intelligent terminal, a handheld terminal, an intelligent wearable device, a server and the like, can be constructed.
As can be seen from the above description, the various microprocessors provided in the embodiments of the present application are internally provided with the error reporting device 004, and an error reporting path between the error reporting device 004 and the processor core 001 is established, so that the error reporting device 004 can collect, record, report, etc. various functional module errors occurring in the microprocessors.
However, in some cases the error reporting path between the error reporter 004 and the processor core 001 may itself be abnormal, so that the error reporter 004 cannot report an error to the processor core 001. The processor core 001 then does not learn of the functional module error, and the faulty functional module cannot be repaired. Abnormalities of the error reporting path include, but are not limited to, an interruption of the communication link between the error reporter 004 and the processor core 001, a fault in a device on the error reporting path, or a fault in the processor core 001 itself.
For example, in the microprocessor structure shown in fig. 8, error reporting path 1 may become abnormal because its communication link is interrupted or the processor core 001 fails, and error reporting path 2 may become abnormal because its communication link is interrupted, the interrupt controller 003 fails, or the processor core 001 fails.
When such an abnormality occurs, the error reporter 004 cannot report errors effectively. In the microprocessor structure shown in fig. 2, the error reporter 004 cannot report an error when the error reporting path is abnormal, and the same holds for the microprocessor structure shown in fig. 5. In the microprocessor structure shown in fig. 8, the error reporter 004 cannot report an error when both error reporting path 1 and error reporting path 2 are abnormal.
In view of the above technical problem, the embodiments of the present application further provide another microprocessor, which includes a functional module 002, an error reporter 004 and an MCU (Microcontroller Unit) 005. The functional module 002 is connected to the error reporter 004, and the error reporter 004 is connected to the MCU 005. For the specific functions of the functional module 002, the error reporter 004 and the MCU 005, refer to the descriptions in the above embodiments.
The above-described microprocessor structure may be applied to the microprocessor shown in fig. 2, thereby obtaining the microprocessor structure shown in fig. 10, or to the microprocessor shown in fig. 5, thereby obtaining the microprocessor structure shown in fig. 11, or to the microprocessor shown in fig. 8, thereby obtaining the microprocessor structure shown in fig. 12.
Based on any one or more of the microprocessors shown in fig. 10, 11 and 12, the present application proposes an error reporting method suitable for these microprocessors, and referring to fig. 13, the method includes:
S301, the functional module sends an error signal to the error reporter when an error occurs.
Specifically, while the microprocessor is operating, if an error occurs in a functional module 002 on the microprocessor, for example in one or more of a peripheral controller, an on-chip storage unit, or a storage controller, the functional module 002 actively sends an error signal to the error reporter 004 to inform it that an error has occurred in that functional module.
In some implementations, the functional module 002 and the error reporter 004 are connected through a plurality of ports, with different ports corresponding to different types of errors. When a certain type of error occurs in the functional module 002, it sends a signal to the error reporter 004 through the port corresponding to that type, so that the error reporter 004 learns both that an error has occurred in the functional module 002 and what type of error it is.
S302, the error reporter, on receiving an error signal sent by the functional module, sends an error report message to the MCU. The error report message indicates that an error has occurred in the functional module.
Specifically, on receiving the error signal sent by the functional module 002, the error reporter 004 can determine that an error has occurred and which functional module it occurred in. The error reporter 004 then sends an error report message to the MCU 005 to notify it that an error has occurred in the functional module.
The error report message may take any form and carry any content. In an actual implementation, any message whose form and content let the MCU 005 know that an error has occurred in a functional module can serve as the error report message.
In some implementations, where the microprocessor includes the processor core 001 and the error reporter 004 is connected to the processor core 001, as in the microprocessors shown in fig. 10 and fig. 12, the error reporter 004, on receiving the error signal sent by the functional module, sends an error report message to the processor core 001 in addition to the one sent to the MCU 005. The message to the processor core 001 informs it that an error has occurred in the functional module. The two error report messages, one to the MCU 005 and one to the processor core 001, may be the same or different in form and content.
In some implementations, the microprocessor includes a processor core 001 and an interrupt controller 003, the error reporter 004 is connected to the interrupt controller 003, and the interrupt controller 003 is connected to the processor core 001, as in the microprocessors shown in fig. 11 and fig. 12. In this case, on receiving an error signal sent by a functional module, the error reporter 004 sends an error report message to the MCU 005 and also triggers the interrupt controller 003 to send a first interrupt signal to the processor core 001. The first interrupt signal indicates that an error has occurred in the functional module, so that the processor core 001 also learns of the error.
In both of the above implementations, on receiving the error signal sent by the functional module, the error reporter 004 notifies the processor core 001 and the MCU 005 at the same time.
In some implementations, a register is also provided inside the error reporter 004 for storing information about errors occurring in the functional module 002.
Specifically, a packet status register and error record status registers are provided inside the error reporter 004. The packet status register stores error identifiers, such as error numbers. Each error identifier corresponds to one error record status register, which stores the specific error information for that identifier, such as the error type, the time at which the error occurred, and information about the functional module in which it occurred. Error types may include correctable errors, uncorrectable errors, delayed errors, internal memory data errors, external memory data errors, illegal addresses, illegal accesses, slave error responses, external timeouts (such as timeouts when accessing other modules), internal timeouts (such as timeouts when accessing other modules), and the like.
When the error reporter 004 receives the error signal sent by the functional module 002, it parses the error type from the error signal and determines the corresponding error identifier. It then stores that error identifier in the first register bit of the packet status register and stores the error information corresponding to the error signal, such as the error type, in the error record status register corresponding to the error identifier.
The first register bit is a register bit corresponding to a received error signal in the packet status register. For example, the correspondence between different error signals and error identifiers may be preset, and when the error reporter 004 receives an error signal, the error identifier corresponding to the error signal may be determined according to the preset correspondence.
In some implementations, the functional module 002 is connected to the error reporter 004 through a plurality of ports, wherein different ports correspond to different types of errors and to different error identifications. When a certain type of error occurs in the functional module 002, an error signal is sent to the error reporter 004 through the port corresponding to that type of error. The error reporter 004 can then determine the error identification corresponding to the error signal according to the port on which the error signal was received.
Meanwhile, the correspondence among the error signal, the error identification, and the register bits in the packet status register may be preset.
Based on the above correspondence, when the error reporter 004 receives the error signal sent by the functional module 002, the corresponding error type and error identification can be determined, and the register bit corresponding to the error signal is determined according to the above correspondence, and the error identification corresponding to the error signal is stored in the register bit. Then, the error reporter 004 stores the error information corresponding to the error signal, such as the error type, the functional module in which the error occurred, the time, the number of times of occurrence of the error, etc., into the error record status register corresponding to the error identification.
By recording error-related information in the packet status register and the error record status register, the error reporter 004 records that information effectively, keeps it from becoming disordered when many errors occur, and ensures that the error records are orderly and the data accurate.
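The recording flow described above can be illustrated with a small behavioral model. The following is a minimal Python sketch under stated assumptions, not the register-level design of the embodiment: the packet status register is modeled as a bitmap with one bit per error identification, and the names `PORT_MAP`, `ErrorRecord`, `ErrorReporter`, `on_error_signal` and `pending_ids`, as well as the example port assignments and module names, are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical preset correspondence: port -> (error identification, error type).
# The description only states that such a correspondence may be preset.
PORT_MAP = {
    0: (0, "correctable error"),
    1: (1, "uncorrectable error"),
    2: (2, "illegal address"),
}


@dataclass
class ErrorRecord:
    """Models one error record status register."""
    error_type: str = ""
    module: str = ""
    count: int = 0
    last_time: int = 0


@dataclass
class ErrorReporter:
    """Behavioral stand-in for the registers of error reporter 004."""
    packet_status: int = 0                       # assumed: one bit per error identification
    records: dict = field(default_factory=dict)  # error identification -> ErrorRecord

    def on_error_signal(self, port: int, module: str, now: int) -> int:
        # The receiving port determines the error identification and type.
        error_id, error_type = PORT_MAP[port]
        # Mark the register bit that corresponds to this error signal.
        self.packet_status |= 1 << error_id
        # Update the error record status register for this identification.
        rec = self.records.setdefault(error_id, ErrorRecord())
        rec.error_type = error_type
        rec.module = module
        rec.count += 1
        rec.last_time = now
        return error_id

    def pending_ids(self):
        # List the error identifications whose bits are currently set.
        return [i for i in range(self.packet_status.bit_length())
                if (self.packet_status >> i) & 1]
```

In this sketch, repeated signals on the same port update a single record and increment its count, which is one way of keeping the records ordered when many errors occur.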
S303, the MCU receives the error report message sent by the error report device and detects whether the error report message is processed within a set time length.
Specifically, referring to the above description, the error reporter 004 sends an error report message to the MCU005, and also sends an error report message to the processor core 001, or triggers the interrupt controller 003 to send a first interrupt message to the processor core 001, so that the processor core 001 also knows that an error occurs in a functional module.
If the processor core 001 can successfully receive the error report message or the first interrupt message and the functions and the operation resources of the processor core 001 are normal, the processor core 001 can repair the functional module with the error, that is, process the error report message or the first interrupt message.
However, if the error reporting path between the error reporter 004 and the processor core 001 is abnormal, or the processor core 001 itself is abnormal, the processor core 001 cannot repair the functional module in which the error occurred. For this case, the embodiment of the application has the MCU005 detect whether the error report message is processed in time, and when it is not, the MCU005 repairs the faulty functional module.
When receiving the error report message sent by the error reporter 004, the MCU005 takes the time of receipt as the starting time and determines whether the error report message is processed within a set time period from that starting time.
In some implementations, the MCU005 determines whether the error report message is processed by detecting a level signal on a connection path between the MCU005 and the error reporter 004.
Specifically, when the error reporter 004 sends an error report message to the MCU005, the level on the connection path between the MCU005 and the error reporter 004 is pulled high, forming a high level signal. When the error reporter 004 detects that the error report message sent to the processor core 001 has been processed, for example because the processor core 001 has successfully received the error report message or has added it to its message processing queue, the error reporter 004 can confirm that the error report message has been processed by the processor core 001. At this point, the error reporter 004 can cancel the error report message sent to the MCU005, so that the high level signal on the connection path between the MCU005 and the error reporter 004 shows a falling edge and becomes a low level signal.
Therefore, by detecting the level signal on the connection path between the MCU005 and the error reporter 004, the MCU005 can determine whether the error report message is processed within the set time period after receiving it. After the MCU005 receives the high level signal corresponding to the error report message, if a falling edge occurs on that connection path within the set time period, the error report message has been processed within the set time period; if no falling edge occurs within the set time period, the error report message has not been processed within the set time period.
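The level-based timeout check can be sketched as follows. This is an illustrative Python model only, assuming the MCU samples the line at discrete times; the function name `watch_report_line`, the `(time, level)` sample format and the returned strings are hypothetical and do not come from the embodiment.

```python
def watch_report_line(samples, timeout):
    """Classify what the monitor observes on the error-report line.

    samples: iterable of (time, level) pairs, level 1 = high, 0 = low.
    timeout: the set time period, in the same unit as the sample times.
    Returns "idle" (no report seen), "processed" (falling edge within the
    window), "reset" (window expired while still high) or "pending"
    (report seen, window not yet expired when the samples end).
    """
    start = None
    for t, level in samples:
        if start is None:
            if level == 1:
                start = t           # high level: error report message received
        elif t - start > timeout:
            return "reset"          # no falling edge within the set time period
        elif level == 0:
            return "processed"      # falling edge: the report was processed in time
    return "idle" if start is None else "pending"
```

For example, `watch_report_line([(0, 1), (4, 1), (9, 1)], timeout=5)` yields `"reset"`, the case in which the monitor would go on to control the reset of the functional module.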
S304, the MCU controls the functional module to reset under the condition that the error report message is detected not to be processed within a set time length.
Through the judgment in step S303, if the MCU005 confirms that the error report message has not been processed within the set time period after receiving it, the MCU005 itself processes the error report message. This avoids the situation in which an error in the functional module cannot be repaired in time because the error reporting path between the error reporter 004 and the processor core 001 is abnormal, which would in turn cause functional problems in the microprocessor.
When the MCU005 detects that the received error report message is not processed within the set time length, the MCU005 controls the functional module with the error to reset.
For example, the error report message sent by the error reporter 004 to the MCU005 may carry information about the functional module in which the error occurred. On receiving the error report message, the MCU005 can control the reset of that functional module according to the information contained in the message.
In other implementations, the error reporter 004 records the error identification and error information in the packet status register and the error record status register. In this case, when the MCU005 receives the error report message sent by the error reporter 004, it accesses the packet status register in the error reporter 004, queries the error identification stored there, then accesses the error record status register corresponding to that error identification and reads from it the error information corresponding to the received error report message.
By reading the error information, the MCU005 can confirm the functional module having the error, and then control the functional module to be reset according to the error information.
For example, a communication link may be directly established between the MCU005 and each functional module 002, so that the MCU005 may send a reset control signal to the functional module having an error through the communication link, and control the functional module having an error to be reset.
In other implementations, based on the microprocessors shown in fig. 10, 11 and 12, a clock reset module 006 may also be provided in the microprocessors, where the clock reset module 006 is connected to the MCU005, resulting in the microprocessors shown in fig. 14 (a), 14 (b) and 14 (c). The MCU005 is used for controlling and scheduling specific modules (modules configured to be controlled by the MCU, including a clock reset module) on the microprocessor. The clock reset module 006 is used to control the clock and reset of the various functional modules on the microprocessor.
Based on the microprocessor structure shown in fig. 14 (a), 14 (b) and 14 (c), when the MCU005 controls the functional module to be reset, indication information is sent to the clock reset module 006, and the clock reset module 006 is instructed to reset the functional module. After receiving the indication information sent by the MCU005, the clock reset module 006 controls the reset of the functional module.
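The reset path through the clock reset module can be illustrated with the following Python sketch. It is a simplified behavioral model under assumptions: the packet status register is passed in as a bitmap, each error record is a dictionary with a `"module"` entry, and the class and method names (`ClockResetModule`, `Mcu`, `handle_unprocessed_report`) and module names are hypothetical.

```python
class ClockResetModule:
    """Stands in for clock reset module 006; it only logs reset requests."""

    def __init__(self):
        self.reset_log = []

    def reset(self, module_name):
        self.reset_log.append(module_name)


class Mcu:
    """Stands in for MCU005 handling an error report that was not processed in time."""

    def __init__(self, clock_reset):
        self.clock_reset = clock_reset

    def handle_unprocessed_report(self, packet_status, error_records):
        # Walk the packet status bitmap; each set bit is an error identification
        # that has its own error record (assumed layout).
        reset_modules = []
        error_id = 0
        while packet_status >> error_id:
            if (packet_status >> error_id) & 1:
                module = error_records[error_id]["module"]
                if module not in reset_modules:
                    # Instruct the clock reset module rather than resetting directly.
                    self.clock_reset.reset(module)
                    reset_modules.append(module)
            error_id += 1
        return reset_modules
```

A functional module that appears in several error records is reset only once per pass in this sketch; whether a real design would deduplicate in this way is not stated in the description.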
As can be seen from the above description, in the error reporting method provided by the embodiment of the present application, an error reporter is disposed in the microprocessor. After receiving an error signal sent by the functional module, the error reporter sends an error report message to the MCU in the microprocessor, and the MCU controls the functional module to reset when it detects that the error report message has not been processed within a set time period. With this scheme, when a functional module of the microprocessor has an error that is not repaired in time, for example because the error reporting path between the functional module and the processor core is abnormal, the functional module can still be reset by the MCU. The functional module is thus repaired in time, and more serious processor abnormalities caused by a functional module that cannot be repaired in time are avoided.
Based on the error reporting method described in the foregoing embodiments, the microprocessors shown in fig. 10, 11 and 12 according to the embodiments of the present application are provided with the error reporter 004 and the MCU005, which respectively execute the processing steps of the corresponding execution bodies in the error reporting method, so that the microprocessor has higher error reporting efficiency and higher error repairing efficiency.
Based on the microprocessor, any computer device containing the microprocessor, such as a personal computer, an industrial computer, an intelligent terminal, a handheld terminal, an intelligent wearable device, a server and the like, can be constructed.
In the above embodiment, the microprocessor structures shown in fig. 10, 11 and 12 are constructed. Based on these structures, the error reporting path between the error reporter 004 and the MCU005 serves as a backup for the error reporting path between the error reporter 004 and the processor core 001. That is, when the path between the error reporter 004 and the processor core 001 is abnormal, the error report message can still be sent to the MCU005 over the path between the error reporter 004 and the MCU005, and the MCU005 resets the faulty functional module in place of the processor core 001.
However, in some cases the path between the error reporter 004 and the MCU005 may also be abnormal, leaving the error reporter 004 with no path over which to send an error report message. For example, in the microprocessors shown in fig. 10, 11 and 12, the path between the error reporter 004 and the processor core 001 and the path between the error reporter 004 and the MCU005 may be abnormal at the same time, in which case errors in the functional module cannot be handled inside the microprocessor at all.
Aiming at the technical problems, the embodiment of the application further provides another error reporting method and a computer system for executing the error reporting method.
The computer system comprises a microprocessor and a field programmable gate array (FPGA, Field Programmable Gate Array) unit 007, wherein the microprocessor comprises a functional module 002 and an error reporter 004, the functional module 002 is connected with the error reporter 004, and the error reporter 004 is connected with the field programmable gate array unit 007. For the specific functions of the functional module 002 and the error reporter 004, refer to the related descriptions in the above embodiments.
The field programmable gate array unit 007 is an FPGA hardware module that runs an error reporting program and has an error reporting function. Illustratively, the field programmable gate array unit 007 is connected to the error reporter 004 via a pin of the microprocessor. It is understood that the field programmable gate array unit 007 is a hardware module provided outside the microprocessor.
The above-described computer system configuration may be applied to the microprocessor shown in fig. 10 to obtain the computer system configuration shown in fig. 15, or to the microprocessor shown in fig. 11 to obtain the computer system configuration shown in fig. 16, or to the microprocessor shown in fig. 12 to obtain the computer system configuration shown in fig. 17.
Based on any one or more of the computer systems shown in fig. 15, 16 and 17, an embodiment of the present application proposes an error reporting method applicable to these computer systems, and referring to fig. 18, the method includes:
S401, the functional module sends an error signal to the error reporter when an error occurs.
Specifically, during operation of the microprocessor, if an error occurs in a functional module 002 on the microprocessor, for example in one or more of the peripheral type controller, the on-chip storage unit, the storage type controller, etc., the functional module 002 actively sends an error signal to the error reporter 004 to inform it that an error has occurred in the functional module.
In some implementations, the function module 002 and the error reporter 004 are connected through a plurality of ports, where different ports correspond to different types of errors, and when a certain type of error occurs in the function module 002, a signal is sent to the error reporter 004 through a port corresponding to the type of error, so that the error reporter 004 can determine the type of error occurring in the function module 002 while knowing that the error occurs in the function module 002.
S402, the error reporter sends an error report message to the field programmable gate array unit when it receives an error signal sent by the functional module. The error report message is used for indicating that the functional module has an error.
Specifically, when receiving the error signal sent by the functional module 002, the error reporter 004 can determine that the functional module has an error, and can determine which functional module has an error. At this time, the error reporter 004 sends an error report message to the field programmable gate array unit 007 for notifying the field programmable gate array unit 007 that an error has occurred in the functional module.
The error report message may take any form and carry any content. When the technical scheme of the embodiment of the present application is actually implemented, any message, whatever its form and content, that lets the field programmable gate array unit 007 learn that an error has occurred in the functional module can serve as the error report message.
In some implementations, where the microprocessor includes the processor core 001 and the error reporter 004 is connected to the processor core 001, as in the computer systems shown in fig. 15 and 17, the error reporter 004, upon receiving an error signal sent by the functional module, sends an error report message to the processor core 001 in addition to the field programmable gate array unit 007. The error report message sent to the processor core 001 informs the processor core 001 that the functional module has an error. The error report message sent to the field programmable gate array unit 007 and the one sent to the processor core 001 may be the same or different in form and content.
In some implementations, the processor core 001 and the interrupt controller 003 are included in the microprocessor, and the error reporter 004 is connected to the interrupt controller 003, where the interrupt controller 003 is connected to the processor core 001, for example, in the computer system shown in fig. 16 and 17, when the error reporter 004 receives the error signal sent by the functional module, the error reporter 004 sends a first interrupt signal for indicating that an error occurs in the functional module to the processor core 001 in addition to the error report message to the field programmable gate array unit 007, so that the processor core 001 can also know that an error occurs in the functional module.
Based on the above two implementations, when receiving the error signal sent by the functional module, the error reporter 004 sends a message to the processor core 001 and the field programmable gate array unit 007 at the same time.
In some implementations, a register is also provided inside the error reporter 004 for storing information about errors occurring in the functional module 002.
Specifically, a packet status register and an error record status register are provided inside the error reporter 004. The packet status register stores error identifications, and the error record status register stores error information. An error identification may be an error number or the like. Each error identification corresponds to one error record status register, which stores the specific error information for that identification, such as the error type, the time at which the error occurred, and information about the functional module in which the error occurred. Error types may include, among others, correctable error, uncorrectable error, delay error, internal memory data error, external memory data error, illegal address, illegal access, slave error response, external timeout (e.g., timeout when accessing other modules), internal timeout, etc.
When the error reporter 004 receives the error signal sent by the functional module 002, the error type is resolved according to the error signal, the corresponding error identifier is determined, then the error identifier corresponding to the error signal is stored in the first register bit of the packet status register, and the error information corresponding to the error signal, such as information of the error type, is stored in the error record status register corresponding to the error identifier.
The first register bit is a register bit corresponding to a received error signal in the packet status register. For example, the correspondence between different error signals and error identifiers may be preset, and when the error reporter 004 receives an error signal, the error identifier corresponding to the error signal may be determined according to the preset correspondence.
In some implementations, the functional module 002 is connected to the error reporter 004 through a plurality of ports, wherein different ports correspond to different types of errors and to different error identifications. When a certain type of error occurs in the functional module 002, an error signal is sent to the error reporter 004 through the port corresponding to that type of error. The error reporter 004 can then determine the error identification corresponding to the error signal according to the port on which the error signal was received.
Meanwhile, the correspondence among the error signal, the error identification, and the register bits in the packet status register may be preset.
Based on the above correspondence, when the error reporter 004 receives the error signal sent by the functional module 002, the corresponding error type and error identification can be determined, and the register bit corresponding to the error signal is determined according to the above correspondence, and the error identification corresponding to the error signal is stored in the register bit. Then, the error reporter 004 stores the error information corresponding to the error signal, such as the error type, the functional module in which the error occurred, the time, the number of times of occurrence of the error, etc., into the error record status register corresponding to the error identification.
By recording error-related information in the packet status register and the error record status register, the error reporter 004 records that information effectively, keeps it from becoming disordered when many errors occur, and ensures that the error records are orderly and the data accurate.
S403, the field programmable gate array unit receives the error report message sent by the error report device and detects whether the error report message is processed within a set time length.
Specifically, as described above, in addition to sending an error report message to the field programmable gate array unit 007, the error reporter 004 also sends an error report message to the processor core 001, or triggers the interrupt controller 003 to send a first interrupt message to the processor core 001, so that the processor core 001 also learns that an error has occurred in a functional module.
If the processor core 001 can successfully receive the error report message or the first interrupt message and the functions and the operation resources of the processor core 001 are normal, the processor core 001 can repair the functional module with the error, that is, process the error report message or the first interrupt message.
However, if the error reporting path between the error reporter 004 and the processor core 001 is abnormal, or the processor core 001 itself is abnormal, the processor core 001 cannot repair the functional module in which the error occurred. For this case, the embodiment of the present application has the field programmable gate array unit 007 detect whether the error report message is processed in time, and when it is not, the field programmable gate array unit 007 actively triggers the repair of the faulty functional module.
When receiving the error report message sent by the error reporter 004, the field programmable gate array unit 007 takes the time of receipt as the starting time and determines whether the error report message is processed within a set time period from that starting time.
In some implementations, the field programmable gate array unit 007 determines whether the error report message is processed by detecting a level signal on the connection path (pin) between the field programmable gate array unit 007 and the error reporter 004.
Specifically, when the error reporter 004 sends an error report message to the field programmable gate array unit 007, the level on the connection path (pin) between the field programmable gate array unit 007 and the error reporter 004 is pulled high, forming a high level signal. When the error reporter 004 detects that the error report message sent to the processor core 001 has been processed, for example because the processor core 001 has successfully received the error report message or has added it to its message processing queue, the error reporter 004 can confirm that the error report message has been processed by the processor core 001. At this point, the error reporter 004 can cancel the error report message sent to the field programmable gate array unit 007, so that the high level signal on the connection path (pin) between the field programmable gate array unit 007 and the error reporter 004 shows a falling edge and becomes a low level signal.
Accordingly, by detecting the level signal on the connection path (pin) between the field programmable gate array unit 007 and the error reporter 004, the field programmable gate array unit 007 can determine whether the error report message is processed within the set time period after receiving it. After the field programmable gate array unit 007 receives the high level signal corresponding to the error report message, if a falling edge occurs on that connection path (pin) within the set time period, the error report message has been processed within the set time period; if no falling edge occurs within the set time period, the error report message has not been processed within the set time period.
S404, the field programmable gate array unit controls the functional module to reset under the condition that the error report message is detected not to be processed within a set time length.
Through the judgment in step S403, if the field programmable gate array unit 007 confirms that the error report message has not been processed within the set time period after receiving it, the field programmable gate array unit 007 itself processes the error report message. This avoids the situation in which an error in the functional module cannot be repaired in time because the error reporting path between the error reporter 004 and the processor core 001 is abnormal, which would in turn cause functional problems in the microprocessor.
When the field programmable gate array unit 007 detects that the received error report message is not processed within the set time period, the field programmable gate array unit 007 controls the reset of the functional module having the error.
For example, the error report message sent by the error reporter 004 to the field programmable gate array unit 007 may carry information about the functional module in which the error occurred. On receiving the error report message, the field programmable gate array unit 007 may control the reset of that functional module according to the information contained in the message.
In other implementations, the error reporter 004 records the error identification and error information in the packet status register and the error record status register. In this case, when the field programmable gate array unit 007 receives the error report message sent by the error reporter 004, it accesses the packet status register in the error reporter 004, queries the error identification stored there, then accesses the error record status register corresponding to that error identification and reads from it the error information corresponding to the received error report message.
By reading the error information described above, the field programmable gate array unit 007 can confirm the function module in which the error occurred, and then control the reset of the function module according to the error information described above.
Illustratively, a communication link may be established directly between the field programmable gate array unit 007 and each of the functional modules 002. Thus, the field programmable gate array unit 007 may send a reset control signal to the functional module having an error through the communication link, and control the functional module having an error to perform a reset.
As can be seen from fig. 15, 16 and 17, the field programmable gate array unit 007 is a functional device disposed outside the microprocessor. If it were allowed to directly control the functional modules inside the microprocessor, then an attacker who gained malicious control of the field programmable gate array unit 007 could use it to maliciously control the microprocessor, compromising the on-chip security of the microprocessor.
In order to solve the above-mentioned problem, in other implementations, the microprocessor of the computer system according to the embodiment of the present application further includes a clock reset module 006, where the MCU005 is connected to the field programmable gate array unit 007, and the clock reset module 006 is connected to the MCU005. The MCU005 is used for controlling and scheduling specific modules (modules configured to be controlled by the MCU, including the clock reset module) on the microprocessor. The clock reset module 006 is used to control the clock and reset of the various functional modules on the microprocessor.
The above-described computer system configuration may be applied to any one or more of the computer systems shown in fig. 15, 16, and 17, and for example, the computer system shown in fig. 19 may be obtained by applying the above-described computer system configuration to the computer system shown in fig. 17.
Based on the computer system configuration shown in fig. 19, when the field programmable gate array unit 007 controls the function module to be reset, an error notification message is sent to the MCU005, thereby informing the MCU005 that an error has occurred in the function module, and triggering the MCU005 to reset the function module.
The MCU005 resets the function module after receiving the error notification message sent by the field programmable gate array unit 007.
For example, in the error notification message sent to the MCU005 by the field programmable gate array unit 007, information of the functional module in which the error occurs may be carried. When receiving the error notification message sent by the field programmable gate array unit 007, the MCU005 may control the reset of the functional module according to the information of the functional module having the error included in the error notification message.
In other implementations, the error reporter 004 records the error identification and error information in the packet status register and the error record status register. In this case, when the MCU005 receives the error notification message sent by the field programmable gate array unit 007, it accesses the packet status register in the error reporter 004, queries the error identification stored there, then accesses the error record status register corresponding to that error identification and reads the corresponding error information from it.
By reading the error information, the MCU005 can confirm the functional module having the error, and then control the functional module to be reset according to the error information.
For example, a communication link may be directly established between the MCU005 and each functional module 002, so that the MCU005 may send a reset control signal to the functional module having an error through the communication link, and control the functional module having an error to be reset.
In other implementations, when the MCU005 controls the functional module to reset, indication information may be sent to the clock reset module 006, instructing the clock reset module 006 to reset the functional module. After receiving the indication information sent by the MCU005, the clock reset module 006 controls the reset of the functional module.
In other implementations, in order to ensure that the security of the on-chip environment of the microprocessor is not affected by the off-chip field programmable gate array unit 007, the authority of the field programmable gate array unit 007 may be limited: the field programmable gate array unit 007 cannot read state or information inside the microprocessor and can only send error notification messages to the MCU in the microprocessor. Likewise, the microprocessor can send only error report messages, and no other messages, to the field programmable gate array unit 007.
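The restriction described above amounts to a message whitelist in each direction across the chip boundary. The following Python sketch illustrates the idea only; the direction labels, message-type strings and the function name `is_permitted` are hypothetical, and the embodiment does not specify how the restriction is enforced in hardware.

```python
# Assumed whitelists: exactly one message type is allowed per direction.
ALLOWED = {
    "fpga_to_chip": {"error_notification"},  # FPGA unit 007 -> MCU005
    "chip_to_fpga": {"error_report"},        # error reporter 004 -> FPGA unit 007
}


def is_permitted(direction, msg_type):
    """Return True only if this message type may cross the boundary in this direction."""
    return msg_type in ALLOWED.get(direction, set())
```

Under this model, a request from the off-chip unit to read on-chip state, such as a hypothetical `"register_read"` message, is simply rejected.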
In addition, for the MCU005 inside the microprocessor, its communication interface with the field programmable gate array unit 007 is governed by the chip life cycle of the microprocessor, and the field programmable gate array unit 007 can communicate with the MCU005 only in certain allowed chip life cycle states.
There are 5 chip life cycle states, described as follows:
CM state (Chip Manufacturing, chip-manufacturer-owned state): the initial state after chip production is the CM state. In the CM state, the chip is owned by the chip manufacturer and all functions of the chip can be used; the chip manufacturer injects its key and key data in the CM state, and all debugging and testing interface functions of the chip are open.
DM state (Device Manufacturing, whole-machine-manufacturer-owned state): before the chip manufacturer delivers the chip to the whole-machine manufacturer, the life cycle transition is completed under the control of the chip manufacturer, and the chip enters the whole-machine manufacturer stage. In the DM state, the chip is owned by the whole-machine manufacturer, which injects its key or key data in the DM state; all hardware debugging and testing interface functions are closed except the necessary software debugging interface function. The key and key data injected by the chip manufacturer in the CM state cannot be read or tampered with in the DM state, but the key can be used for related operations.
UM state (User Management, user-owned/secure state): before the whole-machine manufacturer delivers the whole machine to the final customer, the life cycle transition is completed under the control of the whole-machine manufacturer, and the chip enters the user stage. In the UM state, the chip is owned by the user, the user can store the user key and key data in the chip, and all debugging and testing interface functions of the chip are closed. The key and key data injected by the chip manufacturer in the CM state and those injected by the whole-machine manufacturer in the DM state cannot be read or written in the UM state, but derived keys can be generated from them.
DM RMA state (returned-to-whole-machine-manufacturer state): when the whole machine has a fault that the user cannot recover from, it is returned to the whole-machine manufacturer, the life cycle transition is completed under the control of the whole-machine manufacturer, and the chip enters the DM RMA state. In the DM RMA state, the chip is owned by the whole-machine manufacturer, the software debugging function is open, the hardware debugging and testing functions are closed, and the key access authority is the same as in the DM state.
CM RMA state (returned-to-chip-manufacturer state): when the chip has a fault that the whole-machine manufacturer cannot repair, it is returned to the chip manufacturer, the life cycle transition is completed under the control of the chip manufacturer, and the chip enters the CM RMA state. In the CM RMA state, the chip is owned by the chip manufacturer, all debugging and testing interfaces are open, and the key access authority is the same as in the CM state.
The CM, DM and CM RMA states are not open to the user, so one of the two states UM and DM RMA may be selected as the chip state in which the field programmable gate array unit 007 is allowed to communicate with the MCU005.
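The gating of the communication interface by life cycle state can be sketched as follows. This is only an illustration: the enumeration values, the function name `fpga_link_enabled` and the default choice of the UM state are assumptions, and the application leaves open which of UM and DM RMA is selected.

```python
from enum import Enum


class LifeCycle(Enum):
    CM = "chip manufacturer"
    DM = "whole-machine manufacturer"
    UM = "user"
    DM_RMA = "returned to whole-machine manufacturer"
    CM_RMA = "returned to chip manufacturer"


# Only these two states may be chosen as the state that opens the link.
SELECTABLE = (LifeCycle.UM, LifeCycle.DM_RMA)


def fpga_link_enabled(state, allowed=LifeCycle.UM):
    """Return True if the FPGA-to-MCU interface is open in `state`.

    `allowed` is the single state selected at design time (assumed UM here).
    """
    if allowed not in SELECTABLE:
        raise ValueError("only UM or DM RMA may be selected")
    return state is allowed
```

For example, with the default selection the link is open in the UM state and closed in the CM, DM, DM RMA and CM RMA states.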
In addition, the MCU005 may be configured to reset the functional module in response to an error notification message sent by the field programmable gate array unit 007 only when certain security conditions are met. For example, the MCU005 analyzes the access address of the error notification message sent by the field programmable gate array unit 007; if the access address is determined to be a secure address, the MCU005 responds to the error notification message and resets the functional module, and otherwise the MCU005 does not respond.
Based on the setting of the safety mechanism, the reset of the functional module with errors inside the microprocessor can be realized by the field programmable gate array unit outside the microprocessor under the condition of ensuring the safety of the microprocessor.
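The secure-address check mentioned above can be sketched as a simple range test. The address range and the names `SECURE_RANGES` and `should_respond` are hypothetical; the application does not specify how secure addresses are defined.

```python
# Hypothetical secure address window, given as inclusive (low, high) pairs.
SECURE_RANGES = [(0x8000_0000, 0x8000_FFFF)]


def should_respond(access_address, secure_ranges=SECURE_RANGES):
    """Return True only if the notification's access address is secure.

    The MCU resets the functional module when this returns True and
    ignores the error notification message otherwise.
    """
    return any(low <= access_address <= high for low, high in secure_ranges)
```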
As can be seen from the above description, in the error reporting method provided by the embodiments of the present application, an error reporter is disposed inside the microprocessor, and a field programmable gate array unit connected to the error reporter is disposed outside the microprocessor. After receiving an error signal sent by a functional module, the error reporter sends an error report message to the field programmable gate array unit, and the field programmable gate array unit controls the functional module to reset when it detects that the error report message has not been processed within a set period of time. Because the field programmable gate array unit is arranged outside the microprocessor, it can reset the functional module when an error of the functional module inside the microprocessor cannot be processed in time, which improves the error repair efficiency of the functional module. In addition, the field programmable gate array unit is not affected by abnormalities of the microprocessor, so the scheme provides a more robust repair function for the functional modules inside the microprocessor.
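The timeout behaviour summarized above can be modelled in software as a small watchdog. This is a sketch under stated assumptions, not the circuit of the application: time is passed in explicitly as a number, "processed" is signalled by a call to `on_processed`, and the class and method names are hypothetical.

```python
class ReportWatchdog:
    """Tracks error report messages that have not yet been processed."""

    def __init__(self, timeout):
        self.timeout = timeout   # the set period of time
        self.pending = {}        # module_id -> time the report arrived

    def on_report(self, module_id, now):
        # Keep the earliest arrival time if the module reports again.
        self.pending.setdefault(module_id, now)

    def on_processed(self, module_id):
        # Assumed signal that the error has been handled in time.
        self.pending.pop(module_id, None)

    def poll(self, now):
        """Return the modules whose reports have timed out; these are the
        modules that should be reset. Timed-out entries are cleared."""
        expired = [m for m, t in self.pending.items()
                   if now - t >= self.timeout]
        for m in expired:
            del self.pending[m]
        return expired
```

A caller would invoke `poll` periodically and trigger a reset for each module it returns.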
Based on the error reporting method described in the foregoing embodiments, in the computer systems shown in fig. 15, 16 and 17 of the embodiments of the present application, the error reporter 004 inside the microprocessor and the field programmable gate array unit 007 outside the microprocessor are configured to execute the processing steps of the corresponding execution bodies in the error reporting method, so that the microprocessor achieves higher error reporting efficiency and higher error repair efficiency.
Based on the computer system, any computer device containing the computer system can be constructed, such as a personal computer, an industrial computer, an intelligent terminal, a handheld terminal, an intelligent wearable device, a server and the like.
In the error reporting method described in any of the foregoing embodiments, after the error signal is received by the error reporter 004, the error signal may be classified and categorized, and the error generated by the functional module may be corrected and counted.
In some implementations, when the error reporter 004 receives multiple error signals sent by the functional module 002, it determines the reporting priority corresponding to each error signal before reporting, that is, before sending an error report message to the processor core 001, the MCU005 or the field programmable gate array unit 007, or before triggering the interrupt controller 003 to send the first interrupt signal to the processor core 001. The error reporter 004 then, in order of reporting priority from high to low, sequentially sends the error report message corresponding to each error signal to the processor core 001, the MCU005 or the field programmable gate array unit 007, or sequentially triggers the interrupt controller 003 to send the first interrupt signal corresponding to each error signal to the processor core 001.
Specifically, in the embodiment of the application, processing priorities are preset for different functional module errors, and when a certain functional module error occurs, the higher the processing priority corresponding to the functional module error is, the earlier the functional module error is processed; accordingly, a lower priority function module error may be processed after waiting for a higher priority function module error to be processed.
For example, the processing priority corresponding to different functional module errors may be determined according to the error type, the functional module in which the error occurs, the duration of the error, the frequency of the error, and so on.
Based on the above-mentioned processing priority setting, when the error reporter 004 receives a plurality of error signals (including a plurality of error signals sent by one functional module or a plurality of error signals sent by a plurality of functional modules), the reporting priority corresponding to each received error signal is determined in sequence according to the preset processing priority corresponding to the error of the different functional module. For example, assuming that the preset processing priorities corresponding to the errors of the different functional modules include processing priorities corresponding to error types of the different functional modules, when the error reporter 004 receives a plurality of error signals, the error types corresponding to the error signals are analyzed and determined first, and then the processing priorities corresponding to the error signals are determined according to the error types corresponding to the error signals and the preset processing priorities corresponding to the error types of the different functional modules, and the processing priorities are used as the reporting priorities of the error signal reporting.
Then, the error reporting unit 004 reports each error signal in turn according to the reporting priority corresponding to each error signal. For example, according to the order of the reporting priority corresponding to each error signal from high to low, error reporting messages corresponding to each error signal are sequentially sent to the processor core 001 or the MCU005 or the field programmable gate array unit 007, so that orderly reporting of a plurality of errors is realized. Alternatively, the error reporter 004 sequentially triggers the interrupt controller 003 to send the first interrupt signal corresponding to each error signal to the processor core 001 according to the order of the reporting priority corresponding to each error signal from high to low.
Through the processing, the error reporting device 004 can report a plurality of errors sequentially in order and priority, so that on one hand, the error reporting order can be ensured, on the other hand, the utilization rate of operation resources can be improved, and the urgent functional module errors can be timely processed.
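The priority-ordered reporting described above can be sketched with a priority queue. The priority table is entirely hypothetical (the application only states that priorities are preset), as are the error type names and the function name `report_order`; a lower number is treated as more urgent, and signals of equal priority keep their arrival order.

```python
import heapq
from itertools import count

# Hypothetical preset processing priorities; lower value = reported earlier.
PRIORITY = {"uncorrectable": 0, "bus_timeout": 1, "correctable": 2}


def report_order(signals):
    """Return (module_id, error_type) pairs in the order they are reported.

    `signals` is a list of (module_id, error_type) in arrival order.
    """
    arrival = count()  # tie-breaker that preserves arrival order
    heap = [(PRIORITY[etype], next(arrival), module, etype)
            for module, etype in signals]
    heapq.heapify(heap)
    ordered = []
    while heap:
        _, _, module, etype = heapq.heappop(heap)
        ordered.append((module, etype))
    return ordered
```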
In other implementations, the error reporter 004 may also correct and count the error types before sending the error report message or triggering the interrupt controller 003 to send the first interrupt signal to the processor core 001, and selectively report errors according to the correction and count results.
Specifically, after receiving the error signal sent by the functional module 002, the error reporter 004 determines whether the error corresponding to the received error signal is an error that can be corrected before sending an error report message to the processor core 001, the MCU005, or the field programmable gate array unit 007, or before triggering the interrupt controller 003 to send the first interrupt signal to the processor core 001.
A correctable error is an error, corresponding to an error signal, that can be corrected during the operation of the functional module, that is, an error that the functional module can correct itself without being reset.
For example, in a static random access memory SRAM on a microprocessor, data is stored in binary form. When data is written into the SRAM, a check code is added to the data through a check algorithm to form check bits. The check bits are used for error checking and correcting of the data, and the number of data bits which can be corrected by different check codes is different.
If the check code added for the data can correct the error of the single data bit, when the check code detects that the data of the single data bit in the SRAM is in error, the error can be corrected by changing the binary value of the single data bit, so that the error of the single data bit is the error which can be corrected; if the data of a plurality of data bits in the SRAM are all in error, the check code does not support the data error correction of the plurality of data bits, so that the plurality of data bits are in error, namely the error cannot be corrected.
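The difference between a correctable single-bit error and an uncorrectable multi-bit error can be illustrated with an extended Hamming (8,4) code, which corrects one flipped bit and detects two. The application does not name the check code used in the SRAM, so this particular code, the 4-bit data width and the function names `encode` and `decode` are assumptions chosen for illustration.

```python
def encode(data4):
    """Encode a 4-bit value into an 8-bit extended Hamming codeword.

    Index 0 holds the overall parity bit; indices 1..7 are the usual
    Hamming positions, with check bits at 1, 2 and 4.
    """
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(data4 >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) & 1   # makes the parity of all 8 bits even
    return code


def decode(code):
    """Return (data, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos        # XOR of the positions of all set bits
    overall = sum(code) & 1
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        code[syndrome] ^= 1        # single-bit error at position `syndrome`
        status = "corrected"
    else:
        return None, "uncorrectable"   # two bits flipped: detect only
    data = code[3] | code[5] << 1 | code[6] << 2 | code[7] << 3
    return data, status
```

In this sketch a single flipped bit is repaired locally by inverting it, while two flipped bits are detected but cannot be repaired, which corresponds to the uncorrectable case described above.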
A correctable error can be corrected locally in the functional module, without repairing the functional module through other components, so it does not need to be reported. An uncorrectable error cannot be repaired locally, so it needs to be reported to other components that repair the functional module, for example by resetting the functional module through a processor core. In addition, when the same kind of correctable error occurs many times, the functional module may be generating it repeatedly because of some persistent abnormality. In this case the error can also be reported to the processor core, which checks and repairs the functional module, so that the functional module stops generating the correctable error.
Based on the above-mentioned idea, the embodiment of the present application sets that, after receiving the error signal sent by the functional module 002, the error reporter 004 judges whether the error corresponding to the received error signal is an error-correctable error or not, or whether the error is a specific error-correctable error, before sending an error report message to the processor core 001 or the MCU005 or the field programmable gate array unit 007, or before triggering the interrupt controller 003 to send the first interrupt signal to the processor core 001.
For example, each type of error may be marked in advance as correctable or not, and based on this, the error reporter 004 may determine whether the error corresponding to the error signal is a correctable error, or a specific correctable error, by identifying the type of the error corresponding to the error signal.
If the error corresponding to the error signal received by the error reporter 004 from the functional module is a correctable error or a specific correctable error, the error reporter 004 further judges whether the number of correctable errors, or of that specific correctable error, that have occurred is greater than a preset number, that is, whether the number of times the functional module has generated the correctable error exceeds a preset number of times.
In the embodiment of the present application, the error reporter 004 also has an error statistics function, that is, can count and count the number of correctable errors that have occurred in the functional module or a specific correctable error that has occurred. When the error reporter 004 determines that the error corresponding to the received error signal is an error-correctable error or a specific error-correctable error, the error reporter 004 may add 1 to the number of error-correctable errors that have occurred or the specific error-correctable error that has occurred, thereby achieving the purpose of updating the number of error-correctable errors that have occurred.
When the error reporter 004 determines that the number of the error which can be corrected or the number of certain specific error which can be corrected is larger than the preset number, namely that the functional module repeatedly generates error which can be corrected or certain specific error which can be corrected, the functional module can be determined to generate an abnormality and can not be repaired independently through the functional module, the error reporter 004 is required to report the error to other devices, and the other devices control the functional module to reset so as to achieve the purpose of repairing the error. Therefore, the error reporter 004 sends an error report message to the processor core 001 or the MCU005 or the field programmable gate array unit 007, or triggers the interrupt controller 003 to send a first interrupt signal to the processor core 001, so that the processor core 001 or the MCU005 or the field programmable gate array unit 007 can reset the functional module, thereby achieving the purpose of repairing the functional module error. On this basis, the error reporter 004 clears the recorded number of the error which can be corrected and starts the statistics of the number of the error which can be corrected and is generated in the next turn.
In the case that the error reporter 004 determines that the number of the generated correctable errors or the number of the generated specific correctable errors is not greater than the preset number, it is indicated that the functional module generates the correctable errors or the specific correctable errors only a small number of times, and the functional module may not be reset at this time, but the number of the generated correctable errors or the number of the generated specific correctable errors is counted and updated, for example, the number of the generated correctable errors or the number of the generated specific correctable errors is increased by 1.
In addition, if the error corresponding to the error signal from the functional module received by the error reporter 004 is an uncorrectable error, the error reporter 004 directly reports the error, such as sending an error report message to the processor core 001 or the MCU005 or the field programmable gate array unit 007, or triggering the interrupt controller 003 to send a first interrupt signal to the processor core 001, so that the processor core 001 or the MCU005 or the field programmable gate array unit 007 can perform error repair on the functional module.
In the above embodiment, the error reporter 004 can identify and count correctable errors, which avoids the waste of reporting resources caused by indiscriminately reporting every received error, and also prevents correctable errors from developing into more serious errors because they are never reported.
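The counting and selective reporting of correctable errors can be sketched as follows. The sketch assumes a single counter, and it assumes that the counter is incremented before it is compared with the preset number; the description leaves the exact ordering open. The class name `CorrectableErrorFilter` and its method are hypothetical.

```python
class CorrectableErrorFilter:
    """Decides whether a received error should be reported."""

    def __init__(self, preset_number):
        self.preset_number = preset_number
        self.count = 0   # correctable errors seen in the current round

    def on_error(self, correctable):
        """Return True if the error is to be reported, False otherwise."""
        if not correctable:
            return True              # uncorrectable: report directly
        self.count += 1              # assumed: count first, then compare
        if self.count > self.preset_number:
            self.count = 0           # clear and start the next round
            return True
        return False
```

With a preset number of 2, the first two correctable errors are only counted, the third one is reported and clears the counter, and an uncorrectable error is reported immediately without changing the counter.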
The above embodiments are exemplary of the structure and function of the microprocessor and computer system according to the present application, and of error reporting methods based on these microprocessors and computer systems.
In the various embodiments described above, the elements of the microprocessor and/or computer system may be implemented in the form of hardware circuitry, and the functionality of some or all of the elements may be implemented by the design of hardware circuitry. For example, in one implementation, the hardware circuit is an ASIC, and the functions of some or all of the above units are implemented by designing the logic relationships of the elements in the circuit; for another example, in another implementation, the hardware circuit may be implemented by a PLD, for example, an FPGA may include a large number of logic gates, and the connection relationship between the logic gates is configured by a configuration file, so as to implement the functions of some or all of the above units.
Furthermore, the various units in the above microprocessor and/or computer system may be wholly or partially integrated together or may be implemented separately.
The above microprocessor and/or computer system may be applied to any computer device, so as to constitute a hardware device having the above microprocessor and/or computer system and capable of executing an error reporting method adapted to the structure and function of the microprocessor and/or computer system, and for these computer devices, embodiments of the present application will not be described in detail.
For the foregoing method embodiments, for simplicity of explanation, the methodologies are shown as a series of acts, but one of ordinary skill in the art will appreciate that the present application is not limited by the order of acts, as some steps may, in accordance with the present application, occur in other orders or concurrently. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present application.
It should be noted that, in the present specification, the embodiments are described in a progressive manner, each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. For the apparatus class embodiments, the description is relatively simple because they are substantially similar to the method embodiments, and reference is made to the description of the method embodiments for relevant points.
The steps in the method of each embodiment of the application can be sequentially adjusted, combined and deleted according to actual needs, and the technical features described in each embodiment can be replaced or combined.
The modules and the submodules in the device and the terminal of the embodiments of the application can be combined, divided and deleted according to actual needs.
In the embodiments provided in the present application, it should be understood that the disclosed terminal, apparatus and method may be implemented in other manners. For example, the above-described terminal embodiments are merely illustrative, and for example, the division of modules or sub-modules is merely a logical function division, and there may be other manners of division in actual implementation, for example, multiple sub-modules or modules may be combined or integrated into another module, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.
The modules or sub-modules illustrated as separate components may or may not be physically separate, and components that are modules or sub-modules may or may not be physical modules or sub-modules, i.e., may be located in one place, or may be distributed over multiple network modules or sub-modules. Some or all of the modules or sub-modules may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional module or sub-module in the embodiments of the present application may be integrated in one processing module, or each module or sub-module may exist alone physically, or two or more modules or sub-modules may be integrated in one module. The integrated modules or sub-modules may be implemented in hardware or in software functional modules or sub-modules.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software unit executed by a processor, or in a combination of the two. The software elements may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An error reporting method, characterized in that the method is applied to a microprocessor, the microprocessor includes a functional module, an error reporter and an MCU, the functional module is connected with the error reporter, the error reporter is connected with the MCU, the functional module includes a hardware module for realizing specific processor functions, and the method includes:
the MCU receives an error report message sent by the error report device, wherein the error report message is generated by the error report device under the condition of receiving an error signal sent by the functional module;
and the MCU controls the functional module to reset under the condition that the error report message is detected not to be processed within a set time length.
2. The method of claim 1, wherein the error reporter has a packet status register and an error record status register disposed therein; a first register bit in the packet status register is used to store an error identification corresponding to the error signal; the error record state register corresponds to the error identifier and is used for recording error information corresponding to the error signal; wherein the first register bit is a register bit corresponding to the error signal, the error information including an error type;
The controlling the reset of the functional module comprises the following steps:
accessing the packet status register of the error reporter, obtaining the error identification corresponding to the error report message, accessing the error record status register corresponding to the error identification, and determining the error information corresponding to the error report message;
and controlling the functional module to reset based on the error information.
3. The method of claim 1 or 2, wherein the microprocessor further comprises a clock reset module, the clock reset module being connected to the MCU;
the MCU controls the functional module to reset, and the method comprises the following steps:
the MCU instructs the clock reset module to reset the functional module.
4. The method according to claim 1 or 2, wherein the microprocessor further comprises a processor core, the error reporter is further connected to the processor core, and the error reporter sends an error report message to the processor core when receiving an error signal sent by the functional module.
5. The method according to claim 1 or 2, wherein the microprocessor further comprises a processor core and an interrupt controller, the error reporter is connected to the interrupt controller, the interrupt controller is connected to the processor core, and the error reporter triggers the interrupt controller to send a first interrupt signal to the processor core when receiving an error signal sent by the functional module, where the first interrupt signal is used to indicate that an error occurs in the functional module.
6. An error reporting method, characterized in that the method is applied to a microprocessor, the microprocessor includes a functional module, an error reporter and an MCU, the functional module is connected with the error reporter, the error reporter is connected with the MCU, the functional module includes a hardware module for realizing specific processor functions, and the method includes:
the error reporting device sends an error reporting message to the MCU under the condition that an error signal sent by the functional module is received, so that the MCU controls the functional module to reset under the condition that the error reporting message is detected not to be processed within a set duration;
the error report message is used for indicating that the functional module has an error.
7. The method of claim 6, wherein, when there are multiple error signals, the error reporter sending an error report message to the MCU comprises:
the error reporting device sequentially determines the reporting priority of each error signal, and sequentially sends error reporting messages corresponding to the error signals to the MCU according to the sequence of the reporting priority from high to low.
8. The method of claim 6, wherein the error reporter sending an error report message to the MCU comprises:
judging whether the error corresponding to the error signal is an error which can be corrected; wherein, the error-correctable error is used for representing that the error corresponding to the error signal can be corrected in the running process of the functional module;
judging whether the number of the error-correctable errors is larger than a preset number or not under the condition that the error corresponding to the error signal is an error-correctable error;
and sending an error report message to the MCU under the condition that the number of the error-correctable errors that have occurred is larger than the preset number, and resetting the number of the error-correctable errors that have occurred.
9. The method as recited in claim 8, further comprising:
and counting the number of the generated error-correctable errors under the condition that the number of the generated error-correctable errors is not larger than the preset number.
10. A microprocessor comprising a functional module, an error reporter and an MCU, the functional module being connected to the error reporter, the error reporter being connected to the MCU, the functional module comprising a hardware module for implementing a specific processor function, the MCU being configured to perform the error reporting method of any of claims 1 to 5 and/or the error reporter being configured to perform the error reporting method of any of claims 6 to 9.
CN202311181976.9A 2023-09-14 2023-09-14 Error reporting method and microprocessor Active CN116932272B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311181976.9A CN116932272B (en) 2023-09-14 2023-09-14 Error reporting method and microprocessor

Publications (2)

Publication Number Publication Date
CN116932272A (en) 2023-10-24
CN116932272B (en) 2023-11-21

Family

ID=88382914

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311181976.9A Active CN116932272B (en) 2023-09-14 2023-09-14 Error reporting method and microprocessor

Country Status (1)

Country Link
CN (1) CN116932272B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040205420A1 (en) * 2002-06-26 2004-10-14 Charles Seeley Error reporting network in multiprocessor computer
US20110126082A1 (en) * 2008-07-16 2011-05-26 Freescale Semiconductor, Inc. Micro controller unit including an error indicator module
US20180205837A1 (en) * 2017-01-13 2018-07-19 Fuji Xerox Co., Ltd. Relay apparatus, error information management system, and non-transitory computer readable medium
CN110191027A (en) * 2019-06-19 2019-08-30 上海电气泰雷兹交通自动化系统有限公司 A kind of communication mistake diagnostic method between CCU and MCU
CN110187659A (en) * 2019-05-28 2019-08-30 成都星时代宇航科技有限公司 Method for monitoring state, system and cube star
CN113064745A (en) * 2021-02-20 2021-07-02 山东英信计算机技术有限公司 Method, device and medium for reporting error information
CN113806132A (en) * 2021-09-22 2021-12-17 京东方科技集团股份有限公司 Exception reset processing method and device
CN217588014U (en) * 2022-07-28 2022-10-14 北京万里红科技有限公司 Reset circuit
CN115237644A (en) * 2022-06-16 2022-10-25 广州汽车集团股份有限公司 System failure processing method, central processing unit and vehicle
CN115827332A (en) * 2022-12-29 2023-03-21 成都蜀郡微电子有限公司 MCU processor core circuit structure and MCU flight nearby recovery field method
CN115934389A (en) * 2021-08-04 2023-04-07 三星电子株式会社 System and method for error reporting and handling
CN116049249A (en) * 2021-12-31 2023-05-02 海光信息技术股份有限公司 Error information processing method, device, system, equipment and storage medium

Also Published As

Publication number Publication date
CN116932272B (en) 2023-11-21

Similar Documents

Publication Publication Date Title
US9495233B2 (en) Error framework for a microprocessor and system
CN100440157C (en) Detecting correctable errors and logging information relating to their location in memory
TWI229796B (en) Method and system to implement a system event log for system manageability
CN101800675B (en) Failure monitoring method, monitoring equipment and communication system
CN102141947A (en) Method and system for processing abnormal task in computer application system adopting embedded operating system
CN113176963B (en) PCIe fault self-repairing method, device, equipment and readable storage medium
US20130246855A1 (en) Error Location Specification Method, Error Location Specification Apparatus and Computer-Readable Recording Medium in Which Error Location Specification Program is Recorded
JP5198154B2 (en) Fault monitoring system, device, monitoring apparatus, and fault monitoring method
CN101964724A (en) Energy conservation method of communication single plate and communication single plate
CN116932272B (en) Error reporting method and microprocessor
CN116909801B (en) Error reporting method, microprocessor and computer equipment
CN117009129B (en) Error reporting method, microprocessor and computer equipment
CN117009128B (en) Error reporting method and computer system
CN106155826A (en) For detecting and process the method and system of mistake in bus structures
CN113407391A (en) Fault processing method, computer system, substrate management controller and system
CN115599617B (en) Bus detection method and device, server and electronic equipment
JPWO2007097040A1 (en) Information processing apparatus control method, information processing apparatus
CN106506074B (en) A kind of method and apparatus detecting optical port state
US20080177901A1 (en) Communication error information output method, communication error information output device and recording medium therefor
CN115495301A (en) Fault processing method, device, equipment and system
CN110471814A (en) The control method of the error reporting function of server unit
CN114048156B (en) Multi-channel multi-mapping interrupt controller
US20230396634A1 (en) Universal intrusion detection and prevention for vehicle networks
CN115658373B (en) Server-based memory processing method and device, processor and electronic equipment
CN113032199B (en) Error injection method and device for expansion bus of high-speed serial computer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant