CN109032822A - Method and device for saving crash information - Google Patents

Method and device for saving crash information

Info

Publication number
CN109032822A
Authority
CN
China
Legal status
Granted
Application number
CN201710432510.XA
Other languages
Chinese (zh)
Other versions
CN109032822B (en)
Inventor
刘佳妮
周武
王中辉
Current Assignee
ZTE Corp
Original Assignee
ZTE Corp
Application filed by ZTE Corp
Priority to CN201710432510.XA
Publication of CN109032822A
Application granted
Publication of CN109032822B
Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766 Error or fault reporting or storing
    • G06F11/0787 Storage of error reports, e.g. persistent data storage, storage using memory protection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/24 Resetting means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G06F11/0754 Error or fault detection not based on redundancy by exceeding limits
    • G06F11/0757 Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Retry When Errors Occur (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a method and device for saving crash information. The method comprises: when the system is determined to be restarting abnormally, controlling a first watchdog to reset the CPU; after the CPU is reset, setting the first watchdog and a second watchdog to hardware feeding mode and saving the crash information to flash memory; and configuring the second watchdog for software feeding and, when feeding of the second watchdog times out, resetting all devices on the entire board. The invention solves the technical problem in the prior art that crash information cannot be saved effectively when the system crashes and restarts abnormally, which prevents effective analysis of the failure afterwards, and achieves the technical effect that crash information can be saved effectively even when the system crashes seriously.

Description

Method and device for saving crash information
Technical field
The present invention relates to the field of computers, and in particular to a method and device for saving crash information.
Background
Crash information plays an important role in analyzing the cause of a failure. Normally, the system saves crash information to flash when a crash occurs. Some crashes, however, are severe enough that the system cannot respond in time, so the crash information cannot be saved when the crash happens. Such cases are referred to as serious crashes. When a serious crash occurs, the crash information is lost because it cannot be saved, so the failure can be neither analyzed nor traced back to its cause.
No effective solution to this problem has been proposed so far.
Summary of the invention
The present invention provides a method and device for saving crash information, to solve the technical problem in the prior art that crash information cannot be saved effectively when the system crashes and restarts abnormally, so that the failure cannot be analyzed effectively afterwards.
To solve the above technical problem, in one aspect the present invention provides a method for saving crash information, comprising: when the system is determined to be restarting abnormally, controlling a first watchdog to reset the CPU; after the CPU is reset, setting the first watchdog and a second watchdog to hardware feeding mode and saving the crash information to flash memory; and configuring the second watchdog for software feeding and, when feeding of the second watchdog times out, resetting all devices on the entire board.
Optionally, controlling the first watchdog to reset the CPU when the system is determined to be restarting abnormally comprises: when the system has crashed and is unresponsive, software stops feeding the first watchdog; and when feeding times out, the first watchdog resets the CPU and a first flag is set, wherein the first flag identifies the restart as abnormal.
Optionally, after the CPU is reset, setting the first watchdog and the second watchdog to hardware feeding mode and saving the crash information to flash memory comprises: determining whether the first watchdog and the second watchdog were successfully set to hardware feeding mode for saving the crash information to flash memory; if not, retrying the setting and recording the number of retries; and if the number of retries exceeds a preset threshold, abandoning the attempt to set the first watchdog and the second watchdog to hardware feeding mode.
Optionally, software feeding is performed by software, and hardware feeding is performed by a programmable logic device.
Optionally, the crash information includes at least one of: memory image information, and register information of one or more devices on the board.
Optionally, before the first watchdog resets the CPU when the system is determined to be restarting abnormally, the method further comprises: while the system is running, software continuously writes the current stack pointer to a preset memory address. After all devices on the board are reset, the method further comprises: examining the data by inspecting the stack pointer recorded at the time of the crash.
In another aspect, the present invention also provides a device for saving crash information, comprising: a control module, configured to control a first watchdog to reset the CPU when the system is determined to be restarting abnormally; a saving module, configured to set the first watchdog and a second watchdog to hardware feeding mode after the CPU is reset and to save the crash information to flash memory; and a reset module, configured to configure the second watchdog for software feeding and, when feeding of the second watchdog times out, to reset all devices on the entire board.
Optionally, the control module comprises: a pause unit, configured so that software stops feeding the first watchdog when the system has crashed and is unresponsive; and a control unit, configured so that, when feeding times out, the first watchdog resets the CPU and a first flag is set, wherein the first flag identifies the restart as abnormal.
Optionally, the saving module comprises: a determination unit, configured to determine whether the first watchdog and the second watchdog were successfully set to hardware feeding mode for saving the crash information to flash memory; a retry unit, configured to retry the setting if it did not succeed and to record the number of retries; and an abandon unit, configured to abandon the attempt to set the first watchdog and the second watchdog to hardware feeding mode if the number of retries exceeds a preset threshold.
Optionally, the device for saving crash information further comprises: an update module, configured so that, before the first watchdog resets the CPU on an abnormal restart, software continuously writes the current stack pointer to a preset memory address while the system is running; and an inspection module, configured to examine the data after all devices on the board are reset, by inspecting the stack pointer recorded at the time of the crash.
In another aspect, the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above method.
The beneficial effects of the present invention are as follows. Two watchdogs are provided. When the system is determined to be restarting abnormally, the first watchdog resets only the CPU, which lets the system save the crash information; the second watchdog then resets the entire board, which restarts the system. Crash information can therefore be saved effectively even when a serious crash occurs. This solves the technical problem in the prior art that crash information cannot be saved effectively when the system crashes and restarts abnormally, which prevents effective analysis of the failure afterwards.
Brief description of the drawings
Fig. 1 is a flowchart of the method for saving crash information in an embodiment of the present invention;
Fig. 2 is a structural block diagram of the device for saving crash information in an embodiment of the present invention;
Fig. 3 is a schematic diagram of the system architecture in an embodiment of the present invention;
Fig. 4 is a flowchart of the method for saving crash information in an embodiment of the present invention.
Detailed description of the embodiments
To solve the technical problem in the prior art that crash information cannot be saved effectively when the system crashes and restarts abnormally, so that the failure cannot be analyzed effectively afterwards, the present invention provides a method and device for saving crash information. The invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
In this example, as shown in Fig. 1, a method for saving crash information is provided, which may include the following steps:
Step 101: when the system is determined to be restarting abnormally, control the first watchdog to reset the CPU;
Step 102: after the CPU is reset, set the first watchdog and the second watchdog to hardware feeding mode, and save the crash information to flash memory;
Step 103: configure the second watchdog for software feeding and, when feeding of the second watchdog times out, reset all devices on the entire board.
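Steps 101 to 103 can be pictured with a small simulation. This is a minimal sketch under stated assumptions, not the patented implementation: the `Board` class, its fields, the string-valued feeding modes, and the `handle_abnormal_restart` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field

HARDWARE = "hardware"  # fed by the programmable logic device
SOFTWARE = "software"  # fed by software running on the CPU


@dataclass
class Board:
    """Toy model of a board with two watchdogs (all names are illustrative)."""
    wdt1_mode: str = SOFTWARE   # first watchdog: resets only the CPU
    wdt2_mode: str = HARDWARE   # second watchdog: resets the entire board
    memory: dict = field(default_factory=dict)  # survives a CPU-only reset
    flash: dict = field(default_factory=dict)   # persistent storage
    log: list = field(default_factory=list)


def handle_abnormal_restart(board: Board) -> None:
    # Step 101: software has stopped feeding, so the first watchdog
    # times out and resets the CPU only; memory is left untouched.
    board.log.append("cpu_reset")

    # Step 102: put both watchdogs in hardware feeding mode so that neither
    # fires while the crash information is copied to flash.
    board.wdt1_mode = HARDWARE
    board.wdt2_mode = HARDWARE
    board.flash["crash_dump"] = dict(board.memory)
    board.log.append("dump_saved")

    # Step 103: switch the second watchdog to software feeding. Nothing
    # feeds it, so it times out and resets every device on the board.
    board.wdt2_mode = SOFTWARE
    board.memory.clear()
    board.log.append("board_reset")
```

The point of the ordering is that the memory contents are copied to flash while only the CPU has been reset, and the board-wide reset happens only afterwards.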
In the above example, two watchdogs are provided. When the system is determined to be restarting abnormally, the first watchdog resets only the CPU, which lets the system save the crash information; the second watchdog then resets the entire board, which restarts the system. Crash information can therefore be saved effectively even when a serious crash occurs, which solves the technical problem in the prior art that crash information cannot be saved effectively when the system crashes and restarts abnormally.
The path from a system crash to a restart is a process that requires several devices to cooperate. To record system information effectively and keep all components working together, a flag can be set. In one embodiment, controlling the first watchdog to reset the CPU when the system is determined to be restarting abnormally may proceed as follows: when the system has crashed and is unresponsive, software stops feeding the first watchdog; when feeding times out, the first watchdog resets the CPU and a first flag is set, wherein the first flag identifies the restart as abnormal.
Specifically, the programmable logic device can provide two registers: a crash_flag register, which records whether an abnormal restart has occurred, and a dump_retry register, which records the number of attempts to enter dump mode. When an abnormal restart is detected, a flag is recorded in the crash_flag register to identify the restart as abnormal. For example, the initial value of crash_flag can be set to 0, where 0 indicates a normal restart and 1 indicates an abnormal restart.
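The two registers can be modeled as follows. This is a sketch only: the register names `crash_flag` and `dump_retry` and their 0/1 meaning come from the text, while the `PldRegisters` class and its helper methods are hypothetical.

```python
from dataclasses import dataclass

NORMAL_RESTART = 0    # crash_flag value for a normal restart
ABNORMAL_RESTART = 1  # crash_flag value for an abnormal restart


@dataclass
class PldRegisters:
    """Toy model of the two registers held in the programmable logic device."""
    crash_flag: int = NORMAL_RESTART  # initial value 0
    dump_retry: int = 0               # number of attempts to enter dump mode

    def mark_abnormal(self) -> None:
        """Record that the restart in progress is abnormal."""
        self.crash_flag = ABNORMAL_RESTART

    def was_abnormal(self) -> bool:
        """Report whether the last restart was flagged as abnormal."""
        return self.crash_flag == ABNORMAL_RESTART
```

Because the registers live in the programmable logic device rather than in the CPU, their values are still available after a CPU-only reset.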
To save crash information effectively, a first watchdog chip and a second watchdog chip can be provided, both connected to the programmable logic device. The first watchdog chip is connected to the CPU reset signal, and the second watchdog chip is connected to the board-level reset signal. If the first watchdog chip triggers a restart, only the CPU is restarted, and the other devices keep the state they had before the reset.
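The difference in reset scope between the two watchdog chips can be stated in a few lines. The function name `reset_targets` and the device names are assumptions made for this sketch.

```python
from typing import List


def reset_targets(watchdog: int, devices: List[str]) -> List[str]:
    """Return the devices that are reset when the given watchdog (1 or 2) fires."""
    if watchdog == 1:
        # First watchdog chip: wired to the CPU reset signal only.
        return ["CPU"]
    if watchdog == 2:
        # Second watchdog chip: wired to the board-level reset signal,
        # which covers the CPU and every other device.
        return ["CPU"] + [d for d in devices if d != "CPU"]
    raise ValueError("watchdog must be 1 or 2")
```

For a board with `["CPU", "DDR", "DSP"]`, the first watchdog resets only the CPU, so DDR and DSP keep the state they had at the time of the crash.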
Sometimes, on an abnormal restart, the system tries to enter the state or mode in which crash information is saved but cannot. If the system kept trying to enter that mode indefinitely, it would be thrown into disorder. A retry limit can therefore be set, and the attempt abandoned once the limit is exceeded, so that the system continues to run in an orderly way. In one embodiment, after the CPU is reset, setting the first watchdog and the second watchdog to hardware feeding mode and saving the crash information to flash memory may include: determining whether the first watchdog and the second watchdog were successfully set to hardware feeding mode for saving the crash information to flash memory; if not, retrying the setting and recording the number of retries; and if the number of retries exceeds a preset threshold, abandoning the attempt to set the first watchdog and the second watchdog to hardware feeding mode. That is, the system determines whether it has successfully entered the exception handling mode, and gives up on that mode if repeated attempts fail.
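The retry rule can be sketched as a bounded loop. The function `try_enter_dump_mode`, the `enter` callback, and the default threshold of 3 are assumptions for illustration; the text only requires a preset threshold, and 3 is the value used in the specific embodiment later in the description.

```python
from typing import Callable, Tuple

MAX_RETRIES = 3  # assumed preset threshold


def try_enter_dump_mode(enter: Callable[[], bool],
                        max_retries: int = MAX_RETRIES) -> Tuple[bool, int]:
    """Try to enter dump mode; return (entered, number_of_retries_recorded)."""
    retries = 0
    while not enter():
        retries += 1                # record the retry
        if retries > max_retries:   # threshold exceeded: give up
            return False, retries
    return True, retries
```

In the real system each retry is a separate CPU restart and the counter has to survive that restart, which is why the description keeps it in a register of the programmable logic device; the loop here only illustrates the counting rule.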
When a watchdog is fed, software feeding can be performed by software and hardware feeding by the programmable logic device; the feeding method can be chosen as needed.
The crash information may include, but is not limited to, at least one of: memory image information, and register information of one or more devices on the board.
To allow the data to be read and examined, a stack pointer can be recorded. Specifically, before the first watchdog resets the CPU on an abnormal restart, software continuously writes the current stack pointer to a preset memory address while the system is running. After all devices on the board have been reset, the data can then be examined by inspecting the stack pointer recorded at the time of the crash. For example, that stack pointer can be used to unwind the stack and inspect variables, code segments, data, and similar information. Inspecting the register information of key devices makes it possible to analyze the state of the faulty module.
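The stack pointer bookkeeping can be illustrated with a toy stack that publishes its pointer on every change. The address `STACK_POINTER_SLOT`, the `TrackedStack` class, the base address, and the fixed frame size are all hypothetical values chosen for this sketch.

```python
STACK_POINTER_SLOT = 0x1000  # hypothetical preset memory address


class TrackedStack:
    """Toy stack that writes its stack pointer to a fixed memory slot."""

    def __init__(self, memory: dict, base: int = 0x8000, frame_size: int = 16):
        self.memory = memory
        self.sp = base
        self.frame_size = frame_size
        self._publish()

    def _publish(self) -> None:
        # Keep the preset address in step with the live stack pointer.
        self.memory[STACK_POINTER_SLOT] = self.sp

    def push_frame(self) -> None:
        self.sp -= self.frame_size  # the stack grows downward in this model
        self._publish()

    def pop_frame(self) -> None:
        self.sp += self.frame_size
        self._publish()
```

If the CPU stops responding, the value left at `STACK_POINTER_SLOT` is the stack pointer at the time of the crash, which gives a starting point for unwinding the stack in the saved memory image.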
The method for saving crash information provided in the above example can be applied in the embedded field, and can be implemented through hardware design and programming of the programmable logic device.
Based on the same inventive concept, an embodiment of the present invention also provides a device for saving crash information, as described in the following embodiments. Because the device solves the problem on a principle similar to that of the method, the implementation of the device may refer to the implementation of the method, and repeated details are omitted. As used below, the term "unit" or "module" denotes a combination of software and/or hardware that realizes a predetermined function. Although the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated. Fig. 2 is a structural block diagram of a device for saving crash information according to an embodiment of the present invention. As shown in Fig. 2, the device may include a control module 201, a saving module 202, and a reset module 203, which are described below.
The control module 201 is configured to control the first watchdog to reset the CPU when the system is determined to be restarting abnormally;
The saving module 202 is configured to set the first watchdog and the second watchdog to hardware feeding mode after the CPU is reset, and to save the crash information to flash memory;
The reset module 203 is configured to configure the second watchdog for software feeding and, when feeding of the second watchdog times out, to reset all devices on the entire board.
In one embodiment, the control module 201 may include: a pause unit, configured so that software stops feeding the first watchdog when the system has crashed and is unresponsive; and a control unit, configured so that, when feeding times out, the first watchdog resets the CPU and a first flag is set, wherein the first flag identifies the restart as abnormal.
In one embodiment, the saving module 202 may include: a determination unit, configured to determine whether the first watchdog and the second watchdog were successfully set to hardware feeding mode for saving the crash information to flash memory; a retry unit, configured to retry the setting if it did not succeed and to record the number of retries; and an abandon unit, configured to abandon the attempt to set the first watchdog and the second watchdog to hardware feeding mode if the number of retries exceeds a preset threshold.
In one embodiment, the device for saving crash information may further include: an update module, configured so that, before the first watchdog resets the CPU on an abnormal restart, software continuously writes the current stack pointer to a preset memory address while the system is running; and an inspection module, configured to examine the data after all devices on the board are reset, by inspecting the stack pointer recorded at the time of the crash.
This example also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the following steps:
S1: when the system is determined to be restarting abnormally, control the first watchdog to reset the CPU;
S2: after the CPU is reset, set the first watchdog and the second watchdog to hardware feeding mode, and save the crash information to flash memory;
S3: configure the second watchdog for software feeding and, when feeding of the second watchdog times out, reset all devices on the entire board.
That is, two watchdogs are provided. When the system is determined to be restarting abnormally, the first watchdog resets only the CPU, which lets the system save the crash information; the second watchdog then resets the entire board, which restarts the system. Crash information can therefore be saved effectively even when a serious crash occurs, which solves the technical problem in the prior art that crash information cannot be saved effectively when the system crashes and restarts abnormally, so that the failure cannot be analyzed effectively afterwards.
The method and device for saving crash information are described below with reference to a specific embodiment. It should be noted that this embodiment is only intended to illustrate the application better and does not unduly limit it.
Existing crash handling cannot record crash information when the CPU stops responding entirely. To solve this problem, this example recovers the system after it has lost all response and then records the crash information, so that crash information is recorded even when a serious crash occurs.
While the system is running normally, the current stack pointer is continuously written to memory. When the system needs to restart abnormally, it enters an exception handling mode (DUMP mode for short), in which the crash information is saved. The crash information may include a complete memory image, the register state of key devices, and so on.
Based on this design, this example provides a method for saving crash information that does not affect normal restarts and that can save crash information for failure analysis when a serious crash occurs. It may include the following steps:
S1: The programmable logic device is connected to two watchdog chips. The first watchdog chip is connected to the CPU reset signal, and the second watchdog chip is connected to the board-level reset signal (that is, the reset signal covering the CPU and all other devices). If the first watchdog chip triggers a restart, only the CPU is restarted, and the other devices (for example DDR and DSP) keep the state they had before the reset.
S2: The configuration of the watchdog chips is controlled through the programmable logic device. The programmable logic device provides two registers: a crash_flag register, which records whether an abnormal restart has occurred, and a dump_retry register, which records the number of attempts to enter dump mode.
The initial configuration of the programmable logic device is as follows: the first and second watchdog chips are both configured for hardware feeding and are fed by the programmable logic device; the initial value of the crash_flag register is 0; the initial value of the dump_retry register is 0. For the crash_flag register, 0 indicates a normal restart and 1 indicates an abnormal restart.
S3: During the system boot stage, before the code is copied to memory, the crash_flag register is read to determine whether the last restart was abnormal. crash_flag=1 indicates that the last restart was abnormal, so the system enters the exception handling mode (that is, DUMP mode) and proceeds to step S4. crash_flag=0 indicates that the last restart was normal, so the system continues to run normally and proceeds to step S5. Making this check before the code is copied to memory prevents boot from modifying the memory contents; that is, if the last restart was caused by an abnormality, the memory contents at this point are the same as when the abnormality occurred.
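The ordering in step S3 matters: the flag is checked before anything is written to memory. A minimal sketch of that decision, with the hypothetical function `boot_stage` and dictionaries standing in for memory and the boot image:

```python
def boot_stage(crash_flag: int, memory: dict, boot_image: dict) -> str:
    """Decide the boot path before any code is copied into memory."""
    if crash_flag == 1:
        # Abnormal restart: leave memory exactly as it was at the time of
        # the crash and go on to step S4 (DUMP mode).
        return "S4"
    # Normal restart: only now is it safe to copy the code into memory,
    # after which boot continues with step S5.
    memory.update(boot_image)
    return "S5"
```

If the copy happened first, the boot code would overwrite part of the memory image that DUMP mode is meant to preserve.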
S4: Attempt to enter DUMP mode:
1) If DUMP mode is entered successfully, both the first watchdog chip and the second watchdog chip are configured for hardware feeding, dump_retry is set to 0, and the crash information (for example, the complete memory image and the register state of key devices) is saved to flash. After the save operation finishes, the programmable logic device register crash_flag is set to 0 and the second watchdog chip is configured for software feeding, so that all devices on the whole board are reset and step S3 is entered again. The crash information saved in this way (that is, the complete memory image and so on) is the same as when the abnormality occurred.
2) If DUMP mode is not entered successfully, the programmable logic device is not restarted; the first watchdog chip is still software-fed and the second watchdog chip is hardware-fed. Therefore, after a fixed timeout (which differs depending on the selected device), the CPU restarts again and re-enters step S3, where crash_flag=1 indicates that entry into DUMP mode is to be retried. The number of retries is recorded in dump_retry. When the number of retries is greater than 3, the programmable logic device registers can be set to crash_flag=0 and dump_retry=0, indicating that the system gives up entering DUMP mode.
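The two branches of step S4 can be sketched as one function that updates the registers and reports what happens next. Several details are assumptions rather than statements from the text: the registers are modeled as a plain dictionary, the retry counter is incremented at the moment an attempt fails, and, following the worked example and claim 1, it is the second watchdog that is switched to software feeding to trigger the whole-board reset. The names `attempt_dump` and `save_crash_info` are hypothetical.

```python
MAX_DUMP_RETRIES = 3  # "greater than 3" threshold from the description


def attempt_dump(regs: dict, entered_ok: bool, save_crash_info) -> str:
    """Run one pass of step S4 and return the resulting action."""
    if entered_ok:
        # Both watchdogs are fed by the logic device while the dump runs.
        regs["wdt1"] = regs["wdt2"] = "hardware"
        regs["dump_retry"] = 0
        save_crash_info()  # e.g. memory image and key device registers
        regs["crash_flag"] = 0
        # Nothing feeds watchdog 2 in software, so it times out and
        # resets every device on the board.
        regs["wdt2"] = "software"
        return "board_reset"

    regs["dump_retry"] += 1  # assumption: counted when the attempt fails
    if regs["dump_retry"] > MAX_DUMP_RETRIES:
        regs["crash_flag"] = 0
        regs["dump_retry"] = 0
        return "give_up"
    return "cpu_reset_and_retry"  # watchdog 1 times out, CPU restarts
```

Clearing crash_flag when giving up means the next boot takes the normal path, so a board that cannot enter DUMP mode does not loop forever.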
S5: Boot continues to run normally, and crash_flag=1 is configured during the boot stage. Once software reaches the stage at which it can feed the watchdog (generally the kernel stage), the two watchdog chips are configured: the first watchdog chip is configured for software feeding and is fed by software, and the second watchdog chip is configured for hardware feeding and is fed by the programmable logic device. In addition, whenever a function is pushed onto the stack, the stack pointer stored at a fixed memory address is updated.
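The behavior of a software-fed watchdog that step S5 relies on can be illustrated with a tick-based model. This is a simplified sketch: the `Watchdog` class and its tick granularity are hypothetical, and real watchdog chips define their timeout in hardware.

```python
class Watchdog:
    """Hypothetical watchdog timer that fires after missed feeds."""

    def __init__(self, timeout_ticks: int):
        self.timeout_ticks = timeout_ticks
        self.elapsed = 0

    def feed(self) -> None:
        """Restart the countdown; called by whoever feeds the watchdog."""
        self.elapsed = 0

    def tick(self) -> bool:
        """Advance one tick; return True when the reset output fires."""
        self.elapsed += 1
        if self.elapsed >= self.timeout_ticks:
            self.elapsed = 0
            return True
        return False
```

While software keeps calling `feed()`, `tick()` never returns True; once a crash stops the feeding, the reset fires after `timeout_ticks` ticks.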
S6: The following three restart situations may be encountered during operation:
1) Crash restart: this is treated as an abnormal restart. The first watchdog chip is no longer fed by software, so after a fixed timeout (which differs depending on the selected device) the CPU is restarted while the other devices are not. Apart from the CPU, all other devices keep the state they had before the restart; in this example, mainly the memory contents and the register information of key devices are preserved. The flow returns to step S3 and, because crash_flag=1 at this time, eventually enters DUMP mode and saves the crash information.
2) Manual restart: this is treated as a normal restart. Software sets crash_flag=0, all devices on the whole board are reset, and the flow returns to step S3.
3) Power-off restart: this is treated as a normal restart. All devices on the whole board are powered off and restarted, and the flow returns to step S3.
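The three restart situations differ only in the value of crash_flag that boot sees afterwards, which can be summarized in a small function. The function name and the string labels are hypothetical, and the power-off case assumes the logic device register is volatile and returns to its initial value of 0.

```python
def crash_flag_at_next_boot(restart_kind: str, flag_while_running: int = 1) -> int:
    """Return the crash_flag value read in step S3 after a restart."""
    if restart_kind == "crash":
        # Only the CPU is reset; the logic device keeps the flag set in S5.
        return flag_while_running
    if restart_kind == "manual":
        # Software clears the flag before the whole-board reset.
        return 0
    if restart_kind == "power_off":
        # Assumption: the register reverts to its initial value of 0.
        return 0
    raise ValueError(f"unknown restart kind: {restart_kind!r}")
```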
With the method for saving crash information in the above example, crash information can be saved effectively even when the system has crashed and is unresponsive.
A specific example is described below:
As shown in Figure 3, in addition to the system body, this example mainly consists of a programmable logic device (a CPLD, an FPGA, or a similar device, hereinafter referred to as the logic device) and two watchdog chips (hereinafter referred to as watchdog 1 and watchdog 2).
The logic device outputs two watchdog input signals (MDI_1 and MDI_2), which are fed to the watchdog input (MDI) pins of the two watchdog chips respectively. The reset output (RESET) pin of watchdog 1 is connected to the CPU reset signal of the system (CPU RESET), and the reset output (RESET) pin of watchdog 2 is connected to the whole-board reset signal of the system (System RESET).
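The wiring above determines which devices lose their state when each watchdog fires. A small sketch of that reset scope follows; the device list `BOARD_DEVICES` is illustrative only (the text names the CPU, DDR, and DSP as examples), and the function names are hypothetical.

```python
BOARD_DEVICES = ("CPU", "DDR", "DSP")  # illustrative device list


def devices_reset_by(watchdog: int, devices=BOARD_DEVICES) -> set:
    """Return the devices reset when the given watchdog times out."""
    if watchdog == 1:
        return {"CPU"}       # watchdog 1 drives CPU RESET only
    if watchdog == 2:
        return set(devices)  # watchdog 2 drives System RESET
    raise ValueError("watchdog must be 1 or 2")


def devices_preserved_by(watchdog: int, devices=BOARD_DEVICES) -> set:
    """Return the devices that keep their pre-reset state."""
    return set(devices) - devices_reset_by(watchdog, devices)
```

Because a watchdog 1 timeout leaves DDR and DSP untouched, their contents are still available for the dump.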
In this example, the case of a crash exception in which the system is unresponsive is taken as an illustration. With reference to Figure 4, the method for saving crash information specifically includes the following steps:
Step S1: While the system is running, software continuously updates the current stack pointer at a fixed memory address. When the system crashes and becomes unresponsive, software stops feeding the watchdog, and after the feeding timeout watchdog 1 resets the CPU. At this time crash_flag=1.
Step S2: After the CPU is reset, it enters the boot stage. Before the code is relocated to memory, crash_flag is read; because crash_flag=1, the restart is identified as abnormal and the system enters the exception handling mode (DUMP mode).
Step S3: On entering DUMP mode, watchdog 1 and watchdog 2 are both configured for hardware feeding, dump_retry is set to 0, and the complete memory image and the register information of key devices (DSP and so on) are saved to flash. After the save operation, crash_flag=0 is written to the logic device register and watchdog 2 is configured for software feeding; after the feeding timeout, all devices on the whole board are reset.
Step S4: After the whole board is reset, the system restarts and enters the boot stage. Before the code is relocated to memory, crash_flag is read; because crash_flag=0, the system enters the normal startup stage.
Step S5: After the system has started normally, the saved crash information can be viewed by reading flash. By examining the stack pointer at the time of the crash, a stack backtrace can be performed to inspect information such as variables, code segments, and data; by examining the register information of key devices, the state of the faulty module can be analyzed.
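As an illustration of the analysis in step S5, the sketch below recovers the saved stack pointer from a memory image and reads the words at the top of the stack. The dump layout is not specified in the text, so everything here is an assumption: the image is a raw copy of memory starting at `image_base`, the stack pointer is a 32-bit little-endian value stored at `sp_slot_addr`, and the function names are hypothetical.

```python
import struct


def read_saved_stack_pointer(image: bytes, image_base: int, sp_slot_addr: int) -> int:
    """Read the stack pointer stored at the fixed address in the image."""
    offset = sp_slot_addr - image_base
    if offset < 0 or offset + 4 > len(image):
        raise ValueError("stack pointer slot lies outside the memory image")
    (sp,) = struct.unpack_from("<I", image, offset)
    return sp


def read_stack_words(image: bytes, image_base: int, sp: int, count: int) -> list:
    """Return up to `count` 32-bit words starting at the stack pointer."""
    offset = sp - image_base
    if offset < 0 or offset > len(image):
        raise ValueError("stack pointer lies outside the memory image")
    n = min(count, (len(image) - offset) // 4)
    return list(struct.unpack_from(f"<{n}I", image, offset))
```

A real backtrace would additionally need the frame layout of the target architecture and the symbol table of the running image, which are outside the scope of this sketch.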
In the above example, a combination of hardware design and software programming solves the problem that crash information is lost when an existing system crashes, becomes completely unresponsive, and has no time to save it. The normal operation of the system is not affected, and the scheme is relatively simple to implement.
Obviously, those skilled in the art should understand that the modules or steps of the above embodiments of the present invention can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described can be performed in an order different from the one given here, or they can be made into individual integrated circuit modules, or multiple modules or steps among them can be made into a single integrated circuit module. Thus, the embodiments of the present invention are not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the embodiments of the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (11)

1. A method for saving crash information, characterized by comprising:
in a case where the system is determined to have restarted abnormally, controlling a first watchdog to reset the CPU;
after the CPU is reset, setting the first watchdog and a second watchdog to hardware feeding mode, and saving the crash information to flash memory;
configuring the second watchdog for software feeding and, in a case where feeding of the second watchdog times out, resetting all devices on the whole board.
2. The method according to claim 1, wherein, in the case where the system is determined to have restarted abnormally, controlling the first watchdog to reset the CPU comprises:
in a case where the system crashes and is unresponsive, software stops performing software feeding of the first watchdog;
in a case where feeding times out, the first watchdog resets the CPU and a first identifier is set, wherein the first identifier is used to identify that the system has restarted abnormally.
3. The method according to claim 1, wherein, after the CPU is reset, setting the first watchdog and the second watchdog to hardware feeding mode and saving the crash information to flash memory comprises:
determining whether the first watchdog and the second watchdog have been successfully set to hardware feeding mode and the crash information saved to flash memory;
in a case of failure, attempting again to set the first watchdog and the second watchdog to hardware feeding mode and to save the crash information to flash memory, and recording the number of retries;
in a case where the number of retries exceeds a preset threshold, giving up setting the first watchdog and the second watchdog to hardware feeding mode.
4. The method according to claim 1, wherein software feeding is performed by software, and hardware feeding is performed by a programmable logic device.
5. The method according to claim 1, wherein the crash information includes at least one of: memory image information, and register information of one or more devices on the whole board.
6. The method according to claim 1, wherein, in the case where the system is determined to have restarted abnormally, before the first watchdog resets the CPU, the method further comprises:
while the system is running, software continuously updating the current stack pointer at a preset memory address;
and after all devices on the whole board are reset, the method further comprises:
examining data information by checking the stack pointer at the time of the crash.
7. A device for saving crash information, characterized by comprising:
a control module, configured to control a first watchdog to reset the CPU in a case where the system is determined to have restarted abnormally;
a saving module, configured to, after the CPU is reset, set the first watchdog and a second watchdog to hardware feeding mode and save the crash information to flash memory;
a reset module, configured to configure the second watchdog for software feeding and, in a case where feeding of the second watchdog times out, reset all devices on the whole board.
8. The device according to claim 7, wherein the control module comprises:
a pause unit, configured so that, in a case where the system crashes and is unresponsive, software stops performing software feeding of the first watchdog;
a control unit, configured so that, in a case where feeding times out, the first watchdog resets the CPU and a first identifier is set, wherein the first identifier is used to identify that the system has restarted abnormally.
9. The device according to claim 7, wherein the saving module comprises:
a determination unit, configured to determine whether the first watchdog and the second watchdog have been successfully set to hardware feeding mode and the crash information saved to flash memory;
a retry unit, configured to, in a case of failure, attempt again to set the first watchdog and the second watchdog to hardware feeding mode and to save the crash information to flash memory, and to record the number of retries;
an abandon unit, configured to give up setting the first watchdog and the second watchdog to hardware feeding mode in a case where the number of retries exceeds a preset threshold.
10. The device according to claim 7, further comprising:
an update module, configured so that, in the case where the system is determined to have restarted abnormally, before the first watchdog resets the CPU and while the system is running, software continuously updates the current stack pointer at a preset memory address;
a checking module, configured to examine data information by checking the stack pointer at the time of the crash after all devices on the whole board are reset.
11. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 6.
CN201710432510.XA 2017-06-09 2017-06-09 Method and device for storing crash information Active CN109032822B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710432510.XA CN109032822B (en) 2017-06-09 2017-06-09 Method and device for storing crash information


Publications (2)

Publication Number Publication Date
CN109032822A true CN109032822A (en) 2018-12-18
CN109032822B CN109032822B (en) 2024-01-09

Family

ID=64628786

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710432510.XA Active CN109032822B (en) 2017-06-09 2017-06-09 Method and device for storing crash information

Country Status (1)

Country Link
CN (1) CN109032822B (en)


Also Published As

Publication number Publication date
CN109032822B (en) 2024-01-09


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant