CN112463430A - Crash information storage method and medium for multi-core system, and electronic device

Info

Publication number: CN112463430A
Application number: CN202011408857.9A
Authority: CN
Inventors: 师雯
Original assignee: Zeku Technology Beijing Corp Ltd
Current assignee: Zeku Technology Beijing Corp Ltd
Priority date: 2020-12-03
Filing date: 2020-12-03
Publication date: 2021-03-09
Anticipated expiration: 2040-12-03
Also published as: CN112463430B; WO2022116755A1

Abstract

The application discloses a crash information storage method, a crash information storage device, a crash information storage medium, and an electronic device for a multi-core system. The crash information storage method of the multi-core system comprises the following steps: a first processor determines, through inter-core communication of the multi-core system, whether a second processor is in an interrupt failure state in which it does not respond to interrupts; and, in the case where the second processor is in the interrupt failure state, the first processor acts so that the crash information of the second processor is acquired into a storage device of the multi-core system, wherein the first processor is in an interrupt valid state in which it responds to interrupts. The method does not require the processing core in which the watchdog interrupt occurs to respond to the interrupt, yet still ensures the validity of that processing core's portion of the random access memory image, and can therefore provide more crash-related information for subsequent debugging and analysis.

Description

Crash information storage method and medium for multi-core system, and electronic device

Technical Field

The present disclosure relates to the field of embedded technologies, and in particular, to a crash information storage method and medium for a multi-core system, and an electronic device.

Background

A watchdog is a monitoring technology commonly used in embedded software, and includes both a software part and a hardware part. The hardware part comprises a hardware timer; if the timer is not reset within a few seconds, the hardware part notifies a Power Management Integrated Circuit (PMIC) unit of the system to reset the system. The software part may be implemented as a timer-scheduled process that periodically resets the hardware timer to prevent the PMIC unit from resetting the system. The watchdog can thus actively reset the system when the system is blocked and cannot work normally, so that the system can resume normal operation.
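As a non-limiting illustration, the interplay between the hardware timer and the software feeding process may be sketched as follows. This is a simplified simulation rather than driver code; the class `Watchdog`, the function `run`, the one-second tick, and the 3-second timeout are all assumptions introduced only for this example.

```python
class Watchdog:
    """Toy model of a hardware watchdog timer (all names are hypothetical)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.elapsed_s = 0.0
        self.reset_requested = False  # stands in for the notification to the PMIC

    def feed(self):
        """Software part: reset the hardware timer before it expires."""
        self.elapsed_s = 0.0

    def tick(self, dt_s):
        """Hardware part: advance time and request a system reset on timeout."""
        self.elapsed_s += dt_s
        if self.elapsed_s >= self.timeout_s:
            self.reset_requested = True


def run(feed_seconds, total_s, timeout_s=3.0):
    """Tick once per second and feed at the given seconds.

    Returns True if the watchdog requested a system reset.
    """
    wd = Watchdog(timeout_s)
    for second in range(1, total_s + 1):
        wd.tick(1.0)
        if wd.reset_requested:
            return True
        if second in feed_seconds:
            wd.feed()
    return False
```

In this sketch, a system that keeps feeding the watchdog is never reset, whereas a system that stops feeding (for example because it is blocked) is reset once the timeout elapses.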

In the related art, when the watchdog times out, a watchdog interrupt is triggered, and a crash of the whole system is triggered in the interrupt. During the crash, the contents of the TCM (Tightly Coupled Memory) and the Cache are flushed into the RAM (Random Access Memory), and then the content of the whole RAM is saved into a file system for subsequent debugging and analysis.

However, a system crash triggered by the timeout of the watchdog timer can save a valid memory image only on the precondition that the processing core corresponding to the watchdog can respond to the interrupt when the watchdog timer times out. If the corresponding processing core has disabled interrupts when the watchdog timer times out, the crash flow cannot be triggered actively. As a result, the Cache is not flushed before the memory image file is obtained, some of the data in the image are invalid, and the subsequent analysis is affected.

Disclosure of Invention

The present application is directed to solving, at least to some extent, one of the technical problems in the related art. To this end, a first object of the present application is to provide a crash information storage method for a multi-core system, so as to ensure the validity of the crashed processing core's portion of the random access memory image.

A second object of the present application is to propose a computer storage medium.

A third object of the present application is to provide an electronic device.

In a first aspect, the present application provides a crash information storage method for a multi-core system, including the following steps: the first processor determines, through inter-core communication of the multi-core system, whether the second processor is in an interrupt failure state in which it does not respond to interrupts; and when the second processor is in the interrupt failure state, the first processor acts so that the crash information of the second processor is acquired into a storage device of the multi-core system, wherein the first processor is in an interrupt valid state in which it responds to interrupts.

According to the crash information storage method of the multi-core system, when it is detected through inter-core communication of the multi-core system that the processing core in which the watchdog interrupt occurred is in the interrupt failure state, the crash information of that processing core can be flushed into the random access memory of the multi-core system through the action of at least one processing core of the multi-core system that is in the interrupt valid state. Therefore, the validity of the random access memory image of the processing core can be ensured without requiring the processing core in which the watchdog interrupt occurred to respond to the interrupt, and more crash-related information can be provided for subsequent debugging and analysis.

According to an embodiment of the application, when the second processor is in an interrupt disabled state and the first processor is in an interrupt enabled state, the first processor acquires crash information of the second processor, and acquires the acquired crash information to a storage device of the multi-core system.

According to one embodiment of the application, the first processor accesses a memory space of the second processor to acquire the crash information of the second processor.

According to one embodiment of the application, the first processor accesses the TCM and/or the Cache of the second processor through an inter-core AXI interface so as to acquire the crash information in the TCM and/or the Cache into a storage device of the multi-core system.

According to an embodiment of the present application, the first processor obtains the crash information of the second processor according to a mapping relationship between the first processor and the second processor.

According to an embodiment of the application, the Cache of the multi-core system is set to be in a Fresh mode through the first processor, so that the second processor acquires the crash information to a storage device of the multi-core system after the multi-core system is restarted.

According to one embodiment of the application, the first processor and the second processor determine whether the first processor and/or the second processor are in an interrupt failure state by sending heartbeat information to each other every first preset time.

According to an embodiment of the application, the first processor and the second processor send inter-core interrupts to each other every the first preset time, and determine whether to respond to the inter-core interrupts; determining that the first processor or the second processor is in an interrupt failure state if the first processor or the second processor does not respond to the inter-core interrupt more than once.

According to one embodiment of the application, when a watchdog interrupt occurs in at least one of the first processor and the second processor and that processor is in an interrupt valid state, state information of the multi-core system is acquired into a storage device of the multi-core system.

According to one embodiment of the application, when at least one of the first processor and the second processor is in a watchdog interrupt state and is in an interrupt valid state, at least one of the first processor and the second processor sends an inter-core interrupt to other processors, and the other processors are forced to acquire the TCM and/or the Cache into a storage device of the multi-core system.

According to one embodiment of the application, the storage device of the multi-core system is a RAM.

According to an embodiment of the application, the crash information includes state information of the multi-core system.

According to one embodiment of the present application, the inter-core interrupt is an IPI interrupt.

In a second aspect, the present application provides a computer-readable storage medium, on which a crash information storage program of a multi-core system is stored, and when being executed by a processor, the crash information storage program of the multi-core system implements the crash information storage method of the multi-core system according to the first aspect.

In a third aspect, the present application provides an electronic device, including a memory, a processor, and a crash information storage program of a multi-core system that is stored in the memory and executable on the processor, where the processor executes the crash information storage program to implement the crash information storage method of the multi-core system according to the first aspect.

Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.

Drawings

FIG. 1 is a flowchart of a crash information storage method of a multi-core system according to an embodiment of the present application;

FIG. 2 is a flowchart of a crash information storage method of a multi-core system according to an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of communication among a plurality of processing cores in a multi-core system according to an embodiment of the present application;

FIG. 4 is a block diagram of an electronic device according to an embodiment of the present application.

Detailed Description

Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.

The following describes a crash information storage method, a crash information storage medium, and an electronic device of a multi-core system according to an embodiment of the present application with reference to the drawings.

Fig. 1 is a flowchart of a crash information storage method of a multi-core system according to an embodiment of the present application.

As shown in fig. 1, the method for storing crash information of a multi-core system includes the following steps:

S101, the first processor determines, through inter-core communication of the multi-core system, whether the second processor is in an interrupt failure state in which it does not respond to interrupts.

The second processor is the processing core in which the watchdog interrupt occurs, and the first processor is in an interrupt valid state in which it responds to interrupts.

Specifically, the multi-core system includes a plurality of processing cores with communication connections between them, and the multi-core system may be an integrated chip. Each processing core may be provided with a corresponding watchdog module. Each processing core can configure the relevant registers inside its watchdog module, and the watchdog is enabled by configuring the watchdog control register. Each processing core can periodically send a dog-feeding signal to its watchdog module, and when the watchdog module receives the first dog-feeding signal, the watchdog counter of the watchdog module starts to count.

When the count value of any watchdog counter overflows for the first time, a watchdog interrupt is generated provided that the watchdog interrupt function is enabled; that is, a watchdog interrupt occurs in the corresponding processing core. That processing core is then treated as the second processor, and the other processing cores are treated as the first processor. Further, the first processor may monitor the state of the processing core in which the watchdog interrupt occurred through its communication connection with the second processor, for example by checking whether heartbeat messages are still exchanged between the first processor and the second processor, so as to determine whether that processing core is in an interrupt failure state.

S102, when the second processor is in the interrupt disabled state, the first processor operates to acquire the crash information of the second processor into the storage device of the multicore system.

The storage device of the multi-core system is a RAM.

In this embodiment, when the second processor is in the interrupt failure state, it cannot respond to interrupts. In this case, the crash information of the processing core in which the watchdog interrupt occurred may be flushed into the random access memory (RAM) of the multi-core system, such as a double data rate synchronous dynamic random access memory (DDR SDRAM), by the action of at least one processing core in the interrupt valid state among the first processors (e.g., the processing core selected by arbitration as having the best processing capability). Specifically, the crash information of the second processor may be flushed into the random access memory of the multi-core system through inter-core communication; alternatively, the Cache of the multi-core system may be set to the Fresh mode so that the crash information of the second processor is flushed into the random access memory of the multi-core system when the multi-core system is restarted.

Therefore, the method for storing the crash information of the multi-core system in the embodiment of the application does not need to require the processing core with the watchdog interrupt to respond to the interrupt, can also ensure the validity of the random access memory image of the processing core, and further can provide more crash related information for subsequent debugging and analysis.

In some embodiments, the first processor and the second processor determine whether the first processor and/or the second processor are in an interrupt disabled state by sending heartbeat information to each other every first predetermined time.

The first preset time can be calibrated as needed, and may be a value between 0.5 s and 3 s. The first processor and the second processor sending heartbeat information to each other every first preset time to determine whether the first processor and/or the second processor is in an interrupt failure state may specifically include: the first processor and the second processor send inter-core interrupts to each other every first preset time, and each determines whether the other responds to the inter-core interrupt; if the first processor or the second processor fails to respond to the inter-core interrupt more than once, it is determined that the first processor or the second processor is in an interrupt failure state.

Specifically, the first processor sends an inter-core interrupt to the second processor every first preset time, and detects whether the second processor responds to the inter-core interrupt. If the second processor does not respond to the inter-core interrupt for a plurality of consecutive times (e.g., a value from 3 to 10), it is determined that the second processor is in an interrupt failure state. In this way, whether the second processor is in the interrupt failure state can be accurately monitored.
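As a non-limiting illustration, the consecutive-miss judgment described above may be sketched as a small counter. The names `PeerMonitor` and `first_failure_round` are hypothetical, and the default threshold of 5 is merely one value within the 3 to 10 range mentioned above.

```python
class PeerMonitor:
    """Counts consecutive unanswered inter-core interrupts from one peer core."""

    def __init__(self, miss_threshold=5):
        self.miss_threshold = miss_threshold
        self.consecutive_misses = 0

    def record(self, responded):
        """Record one heartbeat round.

        Returns True once the peer is judged to be in the interrupt failure state.
        """
        if responded:
            self.consecutive_misses = 0  # any response restarts the count
        else:
            self.consecutive_misses += 1
        return self.consecutive_misses >= self.miss_threshold


def first_failure_round(responses, miss_threshold=5):
    """Return the 1-based round at which the peer is first judged failed, or None."""
    monitor = PeerMonitor(miss_threshold)
    for round_no, responded in enumerate(responses, start=1):
        if monitor.record(responded):
            return round_no
    return None
```

Because a single response resets the counter, occasional missed heartbeats do not cause a core to be judged as failed; only an unbroken run of misses does.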

Further, the second processor may also send an inter-core interrupt to the first processor every first preset time, and if the first processor does not respond to the inter-core interrupt for a plurality of consecutive times (for example, one value of 3 to 10 times), it is determined that the first processor is in an interrupt failure state.

Optionally, the second processor may further determine whether the first processor is in an interrupt failure state according to whether the inter-core interrupts sent by the first processor are received. Specifically, the second processor may know in advance the times at which the first processor sends inter-core interrupts to the second processor; for example, the second processor may record the time at which the first processor first sends an inter-core interrupt to the second processor and, taking that time as a reference, calculate the subsequent sending times using the first preset time, which is known in advance. Furthermore, if the second processor fails to receive an inter-core interrupt from the first processor at the pre-calculated sending time for a plurality of consecutive times (for example, a value from 3 to 10), it is determined that the first processor is in the interrupt failure state.
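As a non-limiting illustration, the receiver-side check may be sketched as follows: the expected sending times are derived from the first recorded arrival plus multiples of the first preset time. The function name `sender_has_failed`, the use of integer milliseconds, and in particular the arrival tolerance `tolerance_ms` are assumptions made for this example and are not specified in the description.

```python
def sender_has_failed(first_arrival_ms, period_ms, arrivals_ms, now_ms,
                      miss_threshold=3, tolerance_ms=100):
    """Judge whether the sending core has stopped sending inter-core interrupts.

    first_arrival_ms: time of the first interrupt received (the reference time)
    period_ms:        the first preset time between interrupts
    arrivals_ms:      times of interrupts actually received after the first one
    now_ms:           current time; only slots that are already due are checked
    """
    consecutive_misses = 0
    slot = 1
    while first_arrival_ms + slot * period_ms + tolerance_ms <= now_ms:
        expected_ms = first_arrival_ms + slot * period_ms
        received = any(abs(t - expected_ms) <= tolerance_ms for t in arrivals_ms)
        if received:
            consecutive_misses = 0
        else:
            consecutive_misses += 1
            if consecutive_misses >= miss_threshold:
                return True
        slot += 1
    return False
```

For instance, with a 1 s period and arrivals only near the 1 s and 2 s marks, the sender is judged failed once the slots at 3 s, 4 s, and 5 s have all passed without an arrival.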

For example, taking a multi-core system including two processing cores, the two processing cores may be referred to as a master processing core Core1 and a slave processing core Core0, respectively; see FIG. 2. In addition to resetting the watchdog register, the two processing cores may communicate with each other to determine whether the other is still alive. For example, the inter-core interrupt may be an IPI (Inter-Processor Interrupt), and each core may determine the survival status of the other through the IPI.

Referring to FIG. 2, after the multi-core system is started, the master processing core Core1 may send an IPI to the slave processing core Core0 every first preset time, e.g., every 1 s, to inform Core0 that Core1 has not crashed. Upon receiving the IPI sent by Core1, Core0 may respond to it and feed back corresponding information to Core1 to inform Core1 that Core0 has not crashed. If a watchdog interrupt has occurred in Core0 and Core1 detects that Core0 fails to respond to the inter-core interrupt a number of consecutive times, e.g., 5 times, it may be determined that Core0, in which the watchdog interrupt occurred, is in an interrupt failure state; otherwise, Core0 is determined to be in an interrupt valid state.

As an example, referring to FIG. 2, when a watchdog interrupt occurs in at least one of the first processor and the second processor and that processor is in an interrupt valid state, state information of the multi-core system is acquired into a storage device of the multi-core system. In this case, the validity of the crash information of the processing core in which the watchdog interrupt occurred can be ensured without any inter-core operation, which facilitates subsequent debugging and analysis.

Wherein the crash information includes state information of the multi-core system.

As one example, referring to FIG. 2, when a watchdog interrupt occurs in at least one of the first processor and the second processor and that processor is in an interrupt valid state, that processor sends an inter-core interrupt to the other processors and forces them to flush the contents of their TCM and/or Cache into a storage device of the multi-core system. In this way, more information about the multi-core system at the time of the watchdog interrupt can be obtained, which facilitates subsequent debugging and analysis and ensures the accuracy of the debugging data.

Alternatively, referring to FIG. 2, regardless of whether a watchdog interrupt has occurred, if a processing core can respond to interrupts, that processing core is operating normally; conversely, if a processing core fails to respond to interrupts multiple consecutive times, that processing core has a problem. In this case, a processing core that is working normally needs to actively trigger the system crash.
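As a non-limiting illustration, the case in which the core receiving the watchdog interrupt can still respond may be sketched as follows. The `Core` data class, its fields, and the function `on_watchdog_interrupt` are hypothetical; memories are modelled as plain dictionaries and the inter-core interrupt as a direct method call.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    core_id: int
    interrupts_enabled: bool = True
    tcm: dict = field(default_factory=dict)
    cache: dict = field(default_factory=dict)

    def flush_to(self, ram):
        """Copy this core's TCM and Cache contents into the shared RAM model."""
        ram[self.core_id] = {"tcm": dict(self.tcm), "cache": dict(self.cache)}

    def on_ipi(self, ram):
        """Inter-core interrupt handler: flush only if this core can respond."""
        if self.interrupts_enabled:
            self.flush_to(ram)
            return True
        return False


def on_watchdog_interrupt(source, cores, ram):
    """Return the ids of the cores whose contents reached RAM."""
    flushed = []
    if not source.interrupts_enabled:
        return flushed  # the watchdog interrupt is never serviced by this core
    source.flush_to(ram)
    flushed.append(source.core_id)
    for core in cores:
        if core is not source and core.on_ipi(ram):
            flushed.append(core.core_id)
    return flushed
```

The empty result returned when the source core has interrupts disabled corresponds to the problem identified in the Background, which the cross-core embodiments described in this application address.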

In some embodiments, in a case where the second processor is in an interrupt disabled state and the first processor is in an interrupt enabled state, the first processor acquires crash information of the second processor and acquires the acquired crash information into a storage device of the multicore system.

As a possible example, the first processor accesses the memory space of the second processor to obtain the crash information of the second processor.

The first processor accesses the TCM and/or the Cache of the second processor through an Advanced eXtensible Interface (AXI) Interface between the cores, so as to acquire the crash information in the TCM and/or the Cache into a storage device of the multi-core system.

Specifically, referring to FIG. 3, the plurality of processing cores Core 0-Core can access one another's TCM and Cache through an inter-core AXI interface. When the second processor is detected to be in an interrupt failure state, the first processor can actively call the inter-core interface to help the second processor flush the TCM and the L1 Cache into the L2 Cache, and then actively trigger a crash to flush the L2 Cache into the random access memory (RAM) and save a memory image. In FIG. 3, DTCM refers to the data tightly coupled memory, and ITCM refers to the instruction tightly coupled memory.
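As a non-limiting illustration, the two-stage flush performed by the first processor on behalf of the second processor may be modelled as follows. Each memory is represented as a dictionary from address to value; the function name `save_memory_image` and the assumption that the TCM and L1 Cache hold disjoint addresses are introduced only for this example.

```python
def save_memory_image(victim_tcm, victim_l1, l2_cache, ram):
    """Model the helper core flushing the failed core's state before the crash dump.

    Step 1: through the inter-core interface, move the failed core's TCM and
            L1 Cache contents into the L2 Cache.
    Step 2: trigger the crash, which flushes the L2 Cache into RAM.
    Step 3: return a snapshot of RAM as the saved memory image.
    """
    l2_cache.update(victim_tcm)
    l2_cache.update(victim_l1)
    ram.update(l2_cache)
    return dict(ram)
```

If RAM still holds a stale value at an address whose latest value resides only in the failed core's L1 Cache, the image produced by this sketch contains the latest value, which is the validity property the embodiment aims to ensure.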

As another possible example, the first processor obtains the crash information of the second processor according to the mapping relationship between the first processor and the second processor.

In this example, a mapping relationship between any two processing cores may be established in advance, and the first processor then obtains the mapping relationship between itself and the second processor. For example, in a multi-core system comprising four processing cores, a mapping relationship exists between Core0 and Core3 and another between Core1 and Core2; when a watchdog interrupt occurs in Core0, the crash information of Core0 can be acquired through Core3.
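As a non-limiting illustration, the pre-established mapping of the four-core example may be expressed as a lookup table. The names `HELPER_OF` and `pick_helper` are hypothetical, and the check that the mapped core is itself in the interrupt valid state is an assumption consistent with the method's requirement on the first processor.

```python
# Pairing taken from the four-core example: Core0 <-> Core3, Core1 <-> Core2.
HELPER_OF = {0: 3, 3: 0, 1: 2, 2: 1}


def pick_helper(failed_core, interrupt_valid):
    """Return the id of the mapped core that should collect the crash information.

    interrupt_valid maps core id -> True if that core still responds to interrupts.
    Returns None if there is no mapped core or it cannot respond either.
    """
    helper = HELPER_OF.get(failed_core)
    if helper is not None and interrupt_valid.get(helper, False):
        return helper
    return None
```

A fixed mapping of this kind avoids run-time arbitration among the remaining cores, at the cost of having no fallback when the mapped core has also failed.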

In other embodiments, the Cache of the multi-core system is set to a Fresh mode by the first processor, so that the second processor acquires the crash information into a storage device of the multi-core system after the multi-core system is restarted.

Specifically, when the second processor is in an interrupt failure state and the multi-core system does not support cross-core access to the Cache or the TCM, the Cache of the multi-core system can be set to the Fresh mode through the action of the first processor. This ensures that the contents of the Cache and/or the TCM are preserved across a warm restart of the multi-core system, so that the Cache and/or the TCM can be flushed to the RAM when the multi-core system is restarted and the contents are then saved into the memory image.

In summary, with the crash information storage method of the multi-core system according to the present application, when a certain processing core cannot respond to interrupts and the multi-core system is about to crash, the other processing cores flush the caches, save variables, print logs, and so on. This ensures the validity of the crashed processing core's portion of the random access memory image, provides more crash-related information, and helps to solve the problem that crashes caused by watchdog interrupts are difficult to debug.
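As a non-limiting illustration, the choice among the embodiments described above may be summarised as a small decision function. The enumeration `Strategy`, its member names, and the function `choose_strategy` are hypothetical labels introduced for this example.

```python
from enum import Enum


class Strategy(Enum):
    SELF_FLUSH = "failing core responds to the watchdog interrupt itself"
    CROSS_CORE_FLUSH = "first processor flushes via the inter-core interface"
    FRESH_MODE_THEN_RESTART = "first processor sets Fresh mode; flush after restart"
    NONE = "no core in the interrupt valid state is available"


def choose_strategy(second_failed, first_valid, cross_core_access_supported):
    """Select how the second processor's crash information reaches RAM."""
    if not second_failed:
        return Strategy.SELF_FLUSH
    if not first_valid:
        return Strategy.NONE
    if cross_core_access_supported:
        return Strategy.CROSS_CORE_FLUSH
    return Strategy.FRESH_MODE_THEN_RESTART
```

The Fresh mode path is thus a fallback for systems without cross-core access to the Cache or the TCM, and it defers the flush until after the restart.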

The present application also provides a computer-readable storage medium.

In this embodiment, a computer readable storage medium has stored thereon a crash information storage program of a multi-core system, which when executed by a processor implements the crash information storage method of the multi-core system of the above-described embodiment.

As shown in FIG. 4, the electronic device 100 includes a memory 110, a processor 120, and a crash information storage program of the multi-core system that is stored in the memory 110 and executable on the processor 120; when the processor 120 executes the crash information storage program, the crash information storage method of the multi-core system described above is implemented.

It should be noted that the logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

In the description of the present application, it is to be understood that the terms "central," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," "axial," "radial," "circumferential," and the like are used in the orientations and positional relationships indicated in the drawings for convenience in describing the present application and to simplify the description, and are not intended to indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and are therefore not to be considered limiting of the present application.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.

In this application, unless expressly stated or limited otherwise, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can include, for example, fixed connections, removable connections, or integral parts; can be mechanically or electrically connected; they may be directly connected or indirectly connected through intervening media, or they may be connected internally or in any other suitable relationship, unless expressly stated otherwise. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.

In this application, unless expressly stated or limited otherwise, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that the first and second features are in indirect contact through an intervening medium. Also, a first feature being "on," "over," or "above" a second feature may mean that the first feature is directly or diagonally above the second feature, or may simply indicate that the first feature is at a higher level than the second feature. A first feature being "under," "below," or "beneath" a second feature may mean that the first feature is directly or obliquely under the second feature, or may simply mean that the first feature is at a lower level than the second feature.

Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims

1. A crash information storage method of a multi-core system, wherein the multi-core system comprises a first processor and a second processor, the method comprising the steps of:

the first processor determines whether the second processor is in an interrupt failure state which does not respond to the interrupt through inter-core communication of the multi-core system;

in the case where the second processor is in an interrupt failure state, the first processor acts such that crash information of the second processor is acquired into a storage device of the multi-core system,

wherein the first processor is in an interrupt active state responsive to an interrupt.

2. The method for storing crash information of a multi-core system according to claim 1, comprising the steps of:

in the case where the second processor is in an interrupt failure state and the first processor is in an interrupt valid state, the first processor acquires the crash information of the second processor and stores the acquired crash information into a storage device of the multi-core system.

3. The method for storing crash information of a multi-core system according to claim 2, comprising the steps of:

the first processor accesses the storage space of the second processor to acquire the crash information of the second processor.

4. The method for storing crash information of a multi-core system according to claim 3, wherein the first processor accesses the TCM and/or the Cache of the second processor through an inter-core AXI interface, and saves the crash information held in the TCM and/or the Cache to a storage device of the multi-core system.

5. The method for storing crash information of a multi-core system according to claim 2, comprising the steps of:

the first processor acquires the crash information of the second processor according to the mapping relation between the first processor and the second processor.

6. The method for storing crash information of a multi-core system according to claim 1, comprising the steps of:

the first processor sets the Cache of the multi-core system to a Fresh mode, so that the second processor saves the crash information to a storage device of the multi-core system after the multi-core system is restarted.

7. The crash information storage method of a multi-core system according to any one of claims 1 to 6, comprising the steps of:

the first processor and the second processor send heartbeat information to each other at intervals of a first preset time to determine whether the first processor and/or the second processor is in an interrupt failure state.

8. The method for storing crash information of a multi-core system according to claim 7, comprising the steps of:

the first processor and the second processor send inter-core interrupts to each other at intervals of the first preset time, and each determines whether the other responds to the inter-core interrupt;

determining that the first processor or the second processor is in an interrupt failure state if that processor fails to respond to the inter-core interrupt more than once.

9. The method according to claim 1, wherein when at least one of the first processor and the second processor is in an interrupt valid state upon a watchdog interrupt, state information of the multi-core system is saved to a storage device of the multi-core system.

10. The method according to claim 9, wherein when at least one of the first processor and the second processor is in an interrupt valid state upon a watchdog interrupt, that processor sends an inter-core interrupt to the other processor, forcing the other processor to save the contents of its TCM and/or Cache to a storage device of the multi-core system.

11. The crash information storage method of a multi-core system according to any one of claims 1 to 6, wherein

the storage device of the multi-core system is a RAM.

12. The crash information storage method of a multi-core system according to any one of claims 1 to 6, wherein

the crash information includes state information of the multi-core system.

13. The crash information storage method of a multi-core system according to claim 8 or 10, wherein

the inter-core interrupt is an IPI interrupt.

14. A computer-readable storage medium, on which a crash information storage program of a multi-core system is stored, the crash information storage program of the multi-core system implementing a crash information storage method of the multi-core system according to any one of claims 1 to 13 when executed by a processor.

15. An electronic device, comprising a memory, a processor, and a crash information storage program of a multi-core system that is stored in the memory and executable on the processor, wherein the processor implements the crash information storage method of the multi-core system according to any one of claims 1 to 13 when executing the crash information storage program.
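For illustration only, the detection and storage flow recited in claims 1, 2, 7 and 8 can be modeled with a small simulation. The following is a minimal sketch, not the applicant's implementation: the names Core, Monitor, MISS_LIMIT, heartbeat_tick and save_crash_info are hypothetical, the TCM and the storage device are stood in for by ordinary Python objects, and the sketch monitors in one direction only (the first processor watching the second), whereas the claims describe mutual heartbeats.

```python
from dataclasses import dataclass, field

# Assumption drawn from claim 8: a core is judged failed once it has
# missed the inter-core interrupt more than once.
MISS_LIMIT = 1


@dataclass
class Core:
    """Hypothetical model of one processor in the multi-core system."""
    name: str
    responds_to_interrupts: bool = True
    tcm: bytes = b""   # stand-in for the core's TCM/Cache contents
    missed: int = 0    # consecutive heartbeats this core failed to answer

    def answer_ipi(self) -> bool:
        # A core in the interrupt failure state never acknowledges an IPI.
        return self.responds_to_interrupts


@dataclass
class Monitor:
    """Logic run by the first processor, assumed to stay responsive."""
    peer: Core
    ram: dict = field(default_factory=dict)  # stand-in for the storage device

    def heartbeat_tick(self) -> bool:
        """Send one heartbeat; return True once the peer is judged failed."""
        if self.peer.answer_ipi():
            self.peer.missed = 0
            return False
        self.peer.missed += 1
        return self.peer.missed > MISS_LIMIT

    def save_crash_info(self) -> None:
        # The first processor reads the peer's memory itself and copies it
        # to storage, so the hung peer does not need to run any handler.
        self.ram[self.peer.name] = bytes(self.peer.tcm)

    def run(self, ticks: int) -> bool:
        """Poll for `ticks` intervals; save the peer's state if it fails."""
        for _ in range(ticks):
            if self.heartbeat_tick():
                self.save_crash_info()
                return True
        return False
```

In this model a peer that keeps answering never triggers a save, while a peer that stops answering is declared failed on its second consecutive miss, at which point the monitoring core copies the peer's memory contents into the shared store.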