CN105279037A - Watchdog monitoring method and system - Google Patents
Watchdog monitoring method and system
- Publication number
- CN105279037A (application number CN201410306845.3A)
- Authority
- CN
- China
- Prior art keywords
- cpu
- fpga
- thread
- state value
- errorlevel
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a watchdog monitoring method and system. The method comprises the following steps: powering on and initializing an FPGA and a CPU; sending, by the FPGA, a watchdog feed pulse signal to a watchdog chip, and running, by the CPU, a thread corresponding to a task; communicating, by the FPGA and the CPU, with each other so that each reads the state information of the other side and judges whether its operation is abnormal, and resetting the other side if so; and storing, by the FPGA, the state information of the CPU in real time. The invention also discloses a corresponding watchdog monitoring system. With this technical scheme, the state information of the CPU is obtained and stored by the FPGA during monitoring, so that the user can locate the problem when the CPU is reset.
Description
Technical field
The present invention relates to the field of electronic technology, and in particular to a watchdog monitoring method and system.
Background
In a computer system, the CPU is often subject to interference from external electromagnetic fields, which can cause the program to run away and fall into an endless loop. Normal program execution is interrupted, the system hangs, and the consequences are unpredictable. To monitor the running state of the CPU in real time, a chip dedicated to monitoring the running state was created, commonly known as a "watchdog" chip.
In the traditional watchdog monitoring mode, the CPU controls an off-chip watchdog chip and feeds it in real time, which monitors how the software on the CPU is running. Because software control is easily affected by the system environment and is limited by the software itself, a CPU system fault can easily occur under some conditions. When the CPU system fails, the CPU does not record its own information, so the user cannot learn the cause of the system exception and cannot locate the problem after the CPU is reset and restarted. Moreover, in traditional hardware-based watchdog monitoring, the reset mode is fixed and inflexible.
Summary of the invention
In view of this, it is necessary to provide a watchdog monitoring method and system in which, during monitoring, the FPGA obtains and saves the CPU status information, so that the user can locate the problem when the CPU is reset.
A watchdog monitoring method, comprising:
powering on and initializing an FPGA and a CPU;
the FPGA sending a watchdog feed pulse signal to a watchdog chip, and the CPU running the thread corresponding to a task;
the FPGA and the CPU communicating with each other, each reading the status information of the other to judge whether its operation is abnormal and, if so, controlling the other to reset;
the FPGA saving the status information of the CPU in real time.
In one embodiment, the step in which the FPGA and the CPU communicate, read each other's status information, and judge whether operation is abnormal comprises:
after running the thread corresponding to the task, the CPU starts monitoring the thread and sends the state value of the thread within a preset time period to the FPGA, and the FPGA judges whether the CPU is operating abnormally according to the error level corresponding to the state value.
In one embodiment, the preset time period is 1 ms to 32 s.
In one embodiment, the error levels corresponding to the state value comprise slow response, occasional CRC error, persistent CRC error, and communication interruption; when the error level corresponding to the state value is persistent CRC error or communication interruption, the FPGA judges that the CPU is in an abnormal operating state.
In one embodiment, the method further comprises:
after being reset from the abnormal state, the CPU reads the saved CPU status information from the FPGA, initializes the threads whose state values correspond to the persistent CRC error or communication interruption error level, continues running the threads whose state values correspond to the other error levels, and generates a CPU running log.
A watchdog monitoring system, comprising: an FPGA, a CPU, and a watchdog chip;
the FPGA is configured to send a watchdog feed pulse signal to the watchdog chip after power-on initialization;
the CPU is configured to run the thread corresponding to a task after power-on initialization;
the FPGA and the CPU are further configured to communicate with each other, each reading the status information of the other to judge whether its operation is abnormal and, if so, controlling the other to reset;
the FPGA is further configured to save the status information of the CPU in real time.
In one embodiment, the CPU is configured to start monitoring the thread after running the thread corresponding to the task, and to send the state value of the thread within a preset time period to the FPGA;
the FPGA is configured to judge whether the CPU is operating abnormally according to the error level corresponding to the state value.
In one embodiment, the preset time period is 1 ms to 32 s.
In one embodiment, the error levels corresponding to the state value comprise slow response, occasional CRC error, persistent CRC error, and communication interruption;
the FPGA is configured to judge that the CPU is in an abnormal operating state when the error level corresponding to the state value is persistent CRC error or communication interruption.
In one embodiment, the CPU is configured to, after being reset from the abnormal state, read the saved CPU status information from the FPGA, initialize the threads whose state values correspond to the persistent CRC error or communication interruption error level, continue running the threads whose state values correspond to the other error levels, and generate a CPU running log.
In the above watchdog monitoring method and system, the FPGA and the CPU communicate with each other, and each reads the status information of the other to judge whether its operation is abnormal and, if so, controls the other to reset. The FPGA also saves the status information of the CPU in real time. Compared with the traditional technology, the CPU status information is obtained and saved by the FPGA during monitoring, so that the user can locate the problem when the CPU is reset.
Brief description of the drawings
Fig. 1 is a flow chart of the watchdog monitoring method in one embodiment;
Fig. 2 is a structural diagram of the watchdog monitoring system in one embodiment;
Fig. 3 is a diagram of the connection between the FPGA and the watchdog chip in one embodiment;
Fig. 4 is a diagram of the data stored for the CPU monitoring threads in one embodiment;
Fig. 5 is a diagram of the storage structure of the CPU status information obtained by the FPGA in one embodiment.
Detailed description of the embodiments
To make the object, technical scheme, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
Referring to Fig. 1, in one embodiment, a watchdog monitoring method is provided. The method comprises:
Step 101: the FPGA and the CPU are powered on and initialized.
Step 102: the FPGA sends a watchdog feed pulse signal to the watchdog chip, and the CPU runs the thread corresponding to a task.
Specifically, after power-on initialization the FPGA continuously sends the watchdog feed pulse signal to the watchdog chip. The CPU executes tasks according to the program run at power-on or instructions input from outside, for example reading and writing DDR or driving an LCD display. For each task it executes, the CPU runs a corresponding thread.
Step 103: the FPGA and the CPU communicate with each other; each reads the status information of the other to judge whether its operation is abnormal and, if so, controls the other to reset.
Specifically, the FPGA and the CPU monitor each other's running state. The FPGA performs its own logic processing and records its logic state; the CPU judges from the recorded logic state whether the FPGA is working abnormally and, if so, controls the FPGA to reset. The processing involved is similar in principle to the way a CPU controls other peripherals and is not described further here.
This step also includes the FPGA monitoring the CPU. After running the thread corresponding to a task, the CPU starts monitoring the thread and sends the state value of the thread within the preset time period to the FPGA, and the FPGA judges whether the CPU is operating abnormally according to the error level corresponding to that state value. The preset time period is configured by the CPU, may vary with the status information storage depth, and ranges from 1 ms to 32 s. The error level of a state value is obtained by CRC-checking the data communication that the CPU performs for the thread, and is divided in advance into multiple levels, such as slow response, occasional CRC error, persistent CRC error, and communication interruption. When the error level corresponding to the state value is persistent CRC error or communication interruption, the FPGA judges that the CPU is in an abnormal operating state.
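The grading described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the text names the four error levels but gives no thresholds, so `slow_ms`, `persistent_run`, the `NONE` level, and the use of CRC-32 from `zlib` are all hypothetical choices. The numeric codes are the ones listed for the Fig. 5 embodiment.

```python
import zlib
from enum import IntEnum


class ErrorLevel(IntEnum):
    NONE = 0x00                  # assumption: the source gives no "no error" code
    SLOW_RESPONSE = 0x10
    OCCASIONAL_CRC = 0x20
    PERSISTENT_CRC = 0x30
    COMM_INTERRUPTED = 0x40


def frame_ok(payload: bytes, received_crc: int) -> bool:
    """Check one frame of thread-related data; CRC-32 is an assumed polynomial."""
    return zlib.crc32(payload) == received_crc


def classify(crc_results, response_ms, *, slow_ms=100.0, persistent_run=3):
    """Grade one monitoring window.

    crc_results: one bool per frame received in the window (empty = no frames).
    response_ms: observed response time; slow_ms and persistent_run are
    hypothetical thresholds chosen only for this sketch.
    """
    if not crc_results:
        return ErrorLevel.COMM_INTERRUPTED
    longest = run = 0
    for ok in crc_results:
        run = 0 if ok else run + 1   # length of the current failure streak
        longest = max(longest, run)
    if longest >= persistent_run:
        return ErrorLevel.PERSISTENT_CRC
    if longest > 0:
        return ErrorLevel.OCCASIONAL_CRC
    if response_ms > slow_ms:
        return ErrorLevel.SLOW_RESPONSE
    return ErrorLevel.NONE


def cpu_is_abnormal(level: ErrorLevel) -> bool:
    """Only the two most severe levels count as abnormal operation."""
    return level in (ErrorLevel.PERSISTENT_CRC, ErrorLevel.COMM_INTERRUPTED)
```

Only the last function follows directly from the text; how a window of CRC results maps to "occasional" versus "persistent" is left open by the source.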
Step 104: the FPGA saves the status information of the CPU in real time.
Specifically, the FPGA saves the status information of the CPU in real time so that users can later analyze the running state of the CPU.
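A software model of this real-time saving might look like the sketch below. It assumes a fixed storage depth and a per-thread "latest value" table; the names `Snapshot` and `StatusStore` and the default depth are hypothetical, since the text does not describe how the FPGA organizes its storage.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    timestamp_ms: int
    thread_id: int
    status_word: int


class StatusStore:
    """Model of the FPGA-side store: bounded history plus latest word per thread."""

    def __init__(self, depth: int = 4):
        # deque(maxlen=...) discards the oldest snapshot once the depth is reached
        self._history = deque(maxlen=depth)
        self._latest = {}

    def record(self, timestamp_ms: int, thread_id: int, status_word: int) -> None:
        snap = Snapshot(timestamp_ms, thread_id, status_word & 0xFFFF)
        self._history.append(snap)
        self._latest[thread_id] = snap

    def latest(self) -> dict:
        """Status the CPU could read back after a reset, keyed by thread id."""
        return dict(self._latest)

    def history(self) -> list:
        return list(self._history)
```

Keeping the latest word per thread separately from the bounded history means a busy thread cannot push a quiet thread's last known state out of the store.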
Further, in this embodiment, when the FPGA judges that the CPU is operating abnormally, it sends a reset level to the CPU to reset it. After the reset, the CPU reads back from the FPGA the CPU status information saved there and initializes the threads whose state values correspond to the persistent CRC error or communication interruption error level. For threads whose state values have other error levels, it reads the state values from before the reset and continues running them, which speeds up the reset. In addition, the CPU can generate a CPU running log for users to analyze.
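The post-reset decision can be sketched as a simple partition of threads by their saved error level. The function name `plan_recovery`, the string level names, and the log format are hypothetical; the text only states which levels lead to re-initialization and that a running log is generated.

```python
# Error levels that force a thread to be re-initialized after the CPU reset.
REINIT_LEVELS = {"persistent_crc_error", "communication_interruption"}


def plan_recovery(saved_levels):
    """Split threads into those to re-initialize and those to resume.

    saved_levels: dict mapping a thread name to the error level name that
    the CPU read back from the FPGA (names are illustrative).
    Returns (reinit, resume, log_lines).
    """
    reinit, resume, log_lines = [], [], []
    for thread, level in sorted(saved_levels.items()):
        if level in REINIT_LEVELS:
            reinit.append(thread)
            log_lines.append(f"{thread}: {level} -> reinitialize")
        else:
            resume.append(thread)
            log_lines.append(f"{thread}: {level} -> resume from saved state")
    return reinit, resume, log_lines
```

For example, `plan_recovery({"lcd": "slow_response", "ddr": "persistent_crc_error"})` places `ddr` in the re-initialize list and `lcd` in the resume list.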
Referring to Fig. 2, in one embodiment, a watchdog monitoring system is provided. The system comprises an FPGA, a CPU, and a watchdog chip.
The FPGA is configured to send a watchdog feed pulse signal to the watchdog chip after power-on initialization.
The CPU is configured to run the thread corresponding to a task after power-on initialization.
The FPGA and the CPU are further configured to communicate with each other, each reading the status information of the other to judge whether its operation is abnormal and, if so, controlling the other to reset.
The FPGA is further configured to save the status information of the CPU in real time.
In one embodiment, referring to Fig. 3, the FPGA watchdog circuit has three control pins: DOG_FEED, RESET_CTR, and SYSTEM_RESET. The DOG_FEED signal is the watchdog feed pulse from the FPGA, whose precision can reach the nanosecond level; during normal feeding, the FPGA continuously sends fixed pulses to refresh the watchdog chip. The RESET_CTR signal is the watchdog reset enable signal; the logic controls this signal to switch the reset on or off. The SYSTEM_RESET signal is connected to the active-load reset signal of the FPGA and controls whether the FPGA is reset.
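The interplay of these signals can be modeled roughly as below. This is an interpretation rather than the circuit of Fig. 3: it assumes the watchdog chip times out when no DOG_FEED pulse arrives within a fixed window, and that RESET_CTR simply gates the resulting reset. The class name, the timeout value, and the gating logic are all assumptions.

```python
class WatchdogChip:
    """Toy model of an external watchdog refreshed by DOG_FEED pulses."""

    def __init__(self, timeout_ns: int):
        self.timeout_ns = timeout_ns   # hypothetical; depends on the real chip
        self.last_feed_ns = 0

    def feed(self, now_ns: int) -> None:
        """A DOG_FEED pulse restarts the timeout window."""
        self.last_feed_ns = now_ns

    def timed_out(self, now_ns: int) -> bool:
        return now_ns - self.last_feed_ns > self.timeout_ns


def reset_asserted(chip: WatchdogChip, now_ns: int, reset_ctr: bool) -> bool:
    """Assumed gating: the watchdog reset takes effect only while RESET_CTR is set."""
    return reset_ctr and chip.timed_out(now_ns)
```

Under this reading, clearing RESET_CTR would let the logic suppress a watchdog reset, which is one way the reset behavior could be made more flexible than a fixed hardware reset.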
In one embodiment, referring to Fig. 4, after running the thread corresponding to a task, the CPU starts monitoring the thread (the valid data bit in Fig. 4 is set from 0 to 1) and sends the state value of the thread within the preset time period to the FPGA. The CPU can configure this preset time period in the range from 1 ms to 32 s. In Fig. 4, data such as Thread1 and Thread2 are the monitoring time periods that the CPU has configured for each thread.
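One possible encoding of such a per-thread entry is sketched below. The layout is an assumption made for illustration: the text only says that each entry has a valid bit and a configured period between 1 ms and 32 s. The sketch places the valid bit at bit 15 and the period in milliseconds in the low 15 bits, which happens to accommodate 32 000 ms, but the source does not confirm this layout.

```python
MIN_PERIOD_MS = 1
MAX_PERIOD_MS = 32_000          # 32 s, the upper bound stated in the text
VALID_BIT = 1 << 15             # assumed position of the valid bit
PERIOD_MASK = 0x7FFF            # assumed: period in ms held in bits 14..0


def encode_monitor_entry(period_ms: int, valid: bool = True) -> int:
    """Build a 16-bit monitor entry for one thread (hypothetical layout)."""
    if not MIN_PERIOD_MS <= period_ms <= MAX_PERIOD_MS:
        raise ValueError(f"period must be 1..32000 ms, got {period_ms}")
    return (VALID_BIT if valid else 0) | period_ms


def decode_monitor_entry(word: int):
    """Return (valid, period_ms) from a 16-bit monitor entry."""
    return bool(word & VALID_BIT), word & PERIOD_MASK
```

For instance, a 1 s period with monitoring enabled encodes to `0x83E8` under this layout.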
In one embodiment, the FPGA communicates with the CPU and obtains the CPU status information shown in Fig. 5. Data bit 15 is the valid bit; when set to 1, it marks the thread as being monitored. Data bit 14 is the init bit, which marks that the peripheral corresponding to the thread has completed initialization. Data bits 13 to 0 record the error level. The error levels comprise slow response (0x10), occasional CRC error (0x20), persistent CRC error (0x30), and communication interruption (0x40). When the error level corresponding to the state value is persistent CRC error or communication interruption, the FPGA judges that the CPU is in an abnormal operating state. The FPGA then sends a reset level to the reset pin of the CPU to reset it.
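The bit layout just described can be packed and unpacked as in the following Python sketch. The bit positions and error codes come from the text; the function names are illustrative, and ignoring words whose valid bit is 0 in `needs_reset` is an assumption, since the text does not say how unmonitored threads are treated.

```python
VALID_BIT = 1 << 15      # bit 15: thread is being monitored
INIT_BIT = 1 << 14       # bit 14: peripheral for the thread is initialized
LEVEL_MASK = 0x3FFF      # bits 13..0: error level

SLOW_RESPONSE = 0x10
OCCASIONAL_CRC_ERROR = 0x20
PERSISTENT_CRC_ERROR = 0x30
COMMUNICATION_INTERRUPTION = 0x40


def pack_status(valid: bool, init_done: bool, level: int) -> int:
    """Build the 16-bit status word for one thread."""
    if level & ~LEVEL_MASK:
        raise ValueError("error level must fit in bits 13..0")
    return (VALID_BIT if valid else 0) | (INIT_BIT if init_done else 0) | level


def unpack_status(word: int):
    """Return (valid, init_done, level) from a 16-bit status word."""
    return bool(word & VALID_BIT), bool(word & INIT_BIT), word & LEVEL_MASK


def needs_reset(word: int) -> bool:
    """True when a monitored thread reports one of the two abnormal levels."""
    valid, _, level = unpack_status(word)
    # Assumption: words with the valid bit cleared are not evaluated.
    return valid and level in (PERSISTENT_CRC_ERROR, COMMUNICATION_INTERRUPTION)
```

A monitored, initialized thread reporting a persistent CRC error packs to `0xC030`, and `needs_reset(0xC030)` is true.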
In one embodiment, after being reset from the abnormal state, the CPU reads the saved CPU status information from the FPGA, initializes the threads whose state values correspond to the persistent CRC error or communication interruption error level, continues running the threads whose state values correspond to the other error levels, and generates a CPU running log for users to analyze.
In the above watchdog monitoring method and system, the FPGA and the CPU communicate with each other, and each reads the status information of the other to judge whether its operation is abnormal and, if so, controls the other to reset. The FPGA also saves the status information of the CPU in real time. Compared with the traditional technology, the CPU status information is obtained and saved by the FPGA during monitoring, so that the user can locate the problem when the CPU is reset.
The embodiments above express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and these all fall within the protection scope of the invention. The protection scope of this patent is therefore defined by the appended claims.
Claims (10)
1. A watchdog monitoring method, characterized in that the method comprises:
powering on and initializing an FPGA and a CPU;
the FPGA sending a watchdog feed pulse signal to a watchdog chip, and the CPU running the thread corresponding to a task;
the FPGA and the CPU communicating with each other, each reading the status information of the other to judge whether its operation is abnormal and, if so, controlling the other to reset;
the FPGA saving the status information of the CPU in real time.
2. The method according to claim 1, characterized in that the step in which the FPGA and the CPU communicate, read each other's status information, and judge whether operation is abnormal comprises:
after running the thread corresponding to the task, the CPU starts monitoring the thread and sends the state value of the thread within a preset time period to the FPGA, and the FPGA judges whether the CPU is operating abnormally according to the error level corresponding to the state value.
3. The method according to claim 2, characterized in that the preset time period is 1 ms to 32 s.
4. The method according to claim 2, characterized in that the error levels corresponding to the state value comprise slow response, occasional CRC error, persistent CRC error, and communication interruption; when the error level corresponding to the state value is persistent CRC error or communication interruption, the FPGA judges that the CPU is in an abnormal operating state.
5. The method according to claim 4, characterized in that the method further comprises:
after being reset from the abnormal state, the CPU reads the saved CPU status information from the FPGA, initializes the threads whose state values correspond to the persistent CRC error or communication interruption error level, continues running the threads whose state values correspond to the other error levels, and generates a CPU running log.
6. A watchdog monitoring system, characterized in that the system comprises: an FPGA, a CPU, and a watchdog chip;
the FPGA is configured to send a watchdog feed pulse signal to the watchdog chip after power-on initialization;
the CPU is configured to run the thread corresponding to a task after power-on initialization;
the FPGA and the CPU are further configured to communicate with each other, each reading the status information of the other to judge whether its operation is abnormal and, if so, controlling the other to reset;
the FPGA is further configured to save the status information of the CPU in real time.
7. The system according to claim 6, characterized in that the CPU is configured to start monitoring the thread after running the thread corresponding to the task, and to send the state value of the thread within a preset time period to the FPGA;
the FPGA is configured to judge whether the CPU is operating abnormally according to the error level corresponding to the state value.
8. The system according to claim 7, characterized in that the preset time period is 1 ms to 32 s.
9. The system according to claim 7, characterized in that the error levels corresponding to the state value comprise slow response, occasional CRC error, persistent CRC error, and communication interruption;
the FPGA is configured to judge that the CPU is in an abnormal operating state when the error level corresponding to the state value is persistent CRC error or communication interruption.
10. The system according to claim 9, characterized in that the CPU is configured to, after being reset from the abnormal state, read the saved CPU status information from the FPGA, initialize the threads whose state values correspond to the persistent CRC error or communication interruption error level, continue running the threads whose state values correspond to the other error levels, and generate a CPU running log.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410306845.3A CN105279037B (en) | 2014-06-30 | 2014-06-30 | Watch dog monitoring method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410306845.3A CN105279037B (en) | 2014-06-30 | 2014-06-30 | Watch dog monitoring method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105279037A true CN105279037A (en) | 2016-01-27 |
CN105279037B CN105279037B (en) | 2019-01-11 |
Family
ID=55148087
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410306845.3A Active CN105279037B (en) | 2014-06-30 | 2014-06-30 | Watch dog monitoring method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105279037B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105929811A (en) * | 2016-04-06 | 2016-09-07 | 清华大学 | Protection circuit for program deadlock |
CN107025160A (en) * | 2017-04-14 | 2017-08-08 | 济南浪潮高新科技投资发展有限公司 | A kind of system of quick positioning question for Shen prestige processor platform |
CN109062718A (en) * | 2018-07-12 | 2018-12-21 | 联想(北京)有限公司 | A kind of server and data processing method |
CN109726080A (en) * | 2018-12-29 | 2019-05-07 | 百度在线网络技术(北京)有限公司 | Monitor the method and device of the working condition of heterogeneous computing system |
CN109815044A (en) * | 2019-03-29 | 2019-05-28 | 深圳市广联智通科技有限公司 | A kind of cascade watchdog circuit |
CN110287055A (en) * | 2019-06-28 | 2019-09-27 | 联想(北京)有限公司 | The data reconstruction method and electronic equipment of a kind of electronic equipment |
CN118377644A (en) * | 2024-06-21 | 2024-07-23 | 南京国电南自维美德自动化有限公司 | FPGA-based rapid CPU fault diagnosis lifting method and system |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060236150A1 (en) * | 2005-04-01 | 2006-10-19 | Dot Hill Systems Corporation | Timer-based apparatus and method for fault-tolerant booting of a storage controller |
CN1908856A (en) * | 2005-08-05 | 2007-02-07 | 中兴通讯股份有限公司 | Position restoration circuit device |
CN101271415A (en) * | 2008-05-07 | 2008-09-24 | 深圳国人通信有限公司 | Monitoring watchdog implementing method of built-in equipment |
CN101964731A (en) * | 2010-06-18 | 2011-02-02 | 中兴通讯股份有限公司 | Method and device for monitoring data link |
CN102081573A (en) * | 2010-02-01 | 2011-06-01 | 杭州华三通信技术有限公司 | Device and method for recording equipment restart reason |
CN102141939A (en) * | 2010-02-01 | 2011-08-03 | 杭州华三通信技术有限公司 | Device capable of recording restart reason of whole machine |
CN203386143U (en) * | 2013-07-24 | 2014-01-08 | 三维通信股份有限公司 | Remote machine reset device |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060236150A1 (en) * | 2005-04-01 | 2006-10-19 | Dot Hill Systems Corporation | Timer-based apparatus and method for fault-tolerant booting of a storage controller |
CN1908856A (en) * | 2005-08-05 | 2007-02-07 | 中兴通讯股份有限公司 | Position restoration circuit device |
CN101271415A (en) * | 2008-05-07 | 2008-09-24 | 深圳国人通信有限公司 | Monitoring watchdog implementing method of built-in equipment |
CN102081573A (en) * | 2010-02-01 | 2011-06-01 | 杭州华三通信技术有限公司 | Device and method for recording equipment restart reason |
CN102141939A (en) * | 2010-02-01 | 2011-08-03 | 杭州华三通信技术有限公司 | Device capable of recording restart reason of whole machine |
CN101964731A (en) * | 2010-06-18 | 2011-02-02 | 中兴通讯股份有限公司 | Method and device for monitoring data link |
CN203386143U (en) * | 2013-07-24 | 2014-01-08 | 三维通信股份有限公司 | Remote machine reset device |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105929811A (en) * | 2016-04-06 | 2016-09-07 | 清华大学 | Protection circuit for program deadlock |
CN105929811B (en) * | 2016-04-06 | 2018-11-20 | 清华大学 | A kind of protection circuit for program deadlock |
CN107025160A (en) * | 2017-04-14 | 2017-08-08 | 济南浪潮高新科技投资发展有限公司 | A kind of system of quick positioning question for Shen prestige processor platform |
CN109062718A (en) * | 2018-07-12 | 2018-12-21 | 联想(北京)有限公司 | A kind of server and data processing method |
CN109726080A (en) * | 2018-12-29 | 2019-05-07 | 百度在线网络技术(北京)有限公司 | Monitor the method and device of the working condition of heterogeneous computing system |
CN109726080B (en) * | 2018-12-29 | 2023-07-14 | 百度在线网络技术(北京)有限公司 | Method and device for monitoring working state of heterogeneous computing system |
CN109815044A (en) * | 2019-03-29 | 2019-05-28 | 深圳市广联智通科技有限公司 | A kind of cascade watchdog circuit |
CN110287055A (en) * | 2019-06-28 | 2019-09-27 | 联想(北京)有限公司 | The data reconstruction method and electronic equipment of a kind of electronic equipment |
CN110287055B (en) * | 2019-06-28 | 2021-06-15 | 联想(北京)有限公司 | Data recovery method of electronic equipment and electronic equipment |
CN118377644A (en) * | 2024-06-21 | 2024-07-23 | 南京国电南自维美德自动化有限公司 | FPGA-based rapid CPU fault diagnosis lifting method and system |
Also Published As
Publication number | Publication date |
---|---|
CN105279037B (en) | 2019-01-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105279037A (en) | Watchdog monitoring method and system | |
CN102761439B (en) | Device and method for detecting and recording abnormity on basis of watchdog in PON (Passive Optical Network) access system | |
US10671416B2 (en) | Layered virtual machine integrity monitoring | |
EP3142011A1 (en) | Anomaly recovery method for virtual machine in distributed environment | |
US10216550B2 (en) | Technologies for fast boot with adaptive memory pre-training | |
US10956247B2 (en) | Collecting and transmitting diagnostics information from problematic devices | |
US20140068350A1 (en) | Self-checking system and method using same | |
CN108872762B (en) | Electronic equipment leakage detection method and device, electronic equipment and storage medium | |
CN106201755B (en) | The repositioning method and device of the network equipment | |
US9135193B2 (en) | Expander interrupt processing | |
US20220035438A1 (en) | Control method, apparatus, and electronic device | |
WO2014144043A4 (en) | Apparatus and method for generating descriptors to reaccess a non-volatile semiconductor memory of a storage drive due to an error | |
CN110704228A (en) | Solid state disk exception handling method and system | |
US9026720B2 (en) | Non-volatile memory monitoring | |
US20170371684A1 (en) | Pin control method and device | |
US9223740B2 (en) | Detection method and apparatus for hot-swapping of SD card | |
WO2023065601A1 (en) | Server component self-test anomaly recovery method and device, system, and medium | |
US20170052521A1 (en) | Programmable controller and arithmetic processing system | |
CN109933487B (en) | Intelligent robot monitoring method and device | |
CN112199642B (en) | Detection method for anti-debugging of android system, mobile terminal and storage medium | |
KR101139888B1 (en) | ???? middleware system for use in Container Security Device | |
US9401854B2 (en) | System and method for slow link flap detection | |
CN109102839B (en) | Bad block marking method, device, equipment and readable storage medium | |
CN104750551A (en) | A computer system and user-defined responding method thereof | |
CN115599617B (en) | Bus detection method and device, server and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | | Address after: 516025 No. 1, Shunchang Road, Huinan Industrial Park, Zhongkai high tech Zone, Huizhou City, Guangdong Province. Patentee after: WELLAV TECHNOLOGIES Ltd. Address before: 516006 Huitai Industrial Zone 63, Zhongkai High-tech Zone, Huizhou City, Guangdong Province. Patentee before: HUIZHOU WELLAV TECHNOLOGIES Co.,Ltd. |