CN105279037A - Watchdog monitoring method and system - Google Patents


Publication number
CN105279037A
Authority
CN
China
Prior art keywords
cpu
fpga
thread
state value
errorlevel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410306845.3A
Other languages
Chinese (zh)
Other versions
CN105279037B (en)
Inventor
邹伟华
江锐
杨雪松
刘撑乾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WELLAV TECHNOLOGIES Ltd
Original Assignee
Huizhou Wellav Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huizhou Wellav Technologies Co ltd
Priority to CN201410306845.3A
Publication of CN105279037A
Application granted
Publication of CN105279037B
Legal status: Active
Anticipated expiration


Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a watchdog monitoring method and system. The method comprises the following steps: powering on and initializing an FPGA and a CPU; sending, by the FPGA, a watchdog feed pulse signal to a watchdog chip, and running, by the CPU, a thread corresponding to a task; communicating, by the FPGA and the CPU, with each other so that each reads the status information of the other side, judges whether it is running abnormally, and controls the other side to reset if so; and saving, by the FPGA, the status information of the CPU in real time. The invention correspondingly discloses a watchdog monitoring system. With this technical scheme, the FPGA obtains and saves the status information of the CPU during monitoring, so that the user can locate the problem when the CPU is reset.

Description

Watchdog monitoring method and system
Technical field
The present invention relates to the field of electronic technology, and in particular to a watchdog monitoring method and system.
Background
In a computer system, the CPU is often subject to interference from external electromagnetic fields, which can cause a program to run away and fall into an endless loop. Normal program execution is interrupted, the system hangs, and the consequences are unpredictable. To monitor the running state of the CPU in real time, a chip dedicated to this purpose was created, commonly known as a "watchdog" chip.
In the traditional watchdog monitoring approach, the CPU controls an off-chip watchdog chip and feeds it in real time, which monitors how the CPU software is running. Because software control is easily affected by the system environment and is limited by the software itself, it can lead to a CPU system fault under some conditions. When such a fault occurs, the CPU does not record its own information, so the user cannot learn the cause of the system abnormality and cannot locate the problem after the CPU resets and restarts. In addition, traditional hardware-based watchdog monitoring uses a fixed, inflexible reset mode.
Summary of the invention
In view of this, it is necessary to provide a watchdog monitoring method and system in which the FPGA obtains and saves CPU status information during monitoring, so that the user can locate the problem when the CPU is reset.
A watchdog monitoring method, comprising:
powering on and initializing an FPGA and a CPU;
the FPGA sending a watchdog feed pulse signal to a watchdog chip, and the CPU running the thread corresponding to a task;
the FPGA and the CPU communicating with each other, each reading the status information of the other side to judge whether it is running abnormally and, if so, controlling the other side to reset;
the FPGA saving the status information of the CPU in real time.
In one embodiment, the step in which the FPGA communicates with the CPU, reads the status information of the other side, and judges whether it is running abnormally comprises:
after running the thread corresponding to the task, the CPU starts monitoring the thread and sends the state value of the thread within a preset time period to the FPGA; the FPGA judges whether the CPU is running abnormally according to the error level corresponding to the state value.
In one embodiment, the preset time period is 1 ms to 32 s.
In one embodiment, the error levels corresponding to the state value comprise slow response, occasional CRC error, persistent CRC error and communication interruption; when the error level corresponding to the state value is persistent CRC error or communication interruption, the FPGA judges that the CPU is in an abnormal running state.
In one embodiment, the method further comprises:
after being reset from the abnormal state, the CPU reads the saved CPU status information from the FPGA, initializes the threads whose state values correspond to the persistent CRC error or communication interruption error level, continues running the threads whose state values correspond to other error levels, and generates a CPU running log.
A watchdog monitoring system, comprising: an FPGA, a CPU and a watchdog chip;
the FPGA is configured to send a watchdog feed pulse signal to the watchdog chip after power-on initialization;
the CPU is configured to run the thread corresponding to a task after power-on initialization;
the FPGA and the CPU are further configured to communicate with each other, each reading the status information of the other side to judge whether it is running abnormally and, if so, controlling the other side to reset;
the FPGA is further configured to save the status information of the CPU in real time.
In one embodiment, the CPU is configured to start monitoring the thread after running the thread corresponding to the task, and to send the state value of the thread within a preset time period to the FPGA;
the FPGA is configured to judge whether the CPU is running abnormally according to the error level corresponding to the state value.
In one embodiment, the preset time period is 1 ms to 32 s.
In one embodiment, the error levels corresponding to the state value comprise slow response, occasional CRC error, persistent CRC error and communication interruption;
the FPGA is configured to judge that the CPU is in an abnormal running state when the error level corresponding to the state value is persistent CRC error or communication interruption.
In one embodiment, the CPU is configured to, after being reset from the abnormal state, read the saved CPU status information from the FPGA, initialize the threads whose state values correspond to the persistent CRC error or communication interruption error level, continue running the threads whose state values correspond to other error levels, and generate a CPU running log.
In the above watchdog monitoring method and system, the FPGA and the CPU communicate with each other, each reads the status information of the other side to judge whether it is running abnormally, and each controls the other side to reset if so. The FPGA also saves the status information of the CPU in real time. Compared with the traditional technology, the FPGA can therefore obtain and save CPU status information during monitoring, so that the user can locate the problem when the CPU is reset.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the watchdog monitoring method in one embodiment;
Fig. 2 is a schematic structural diagram of the watchdog monitoring system in one embodiment;
Fig. 3 is a schematic diagram of the connection between the FPGA and the watchdog chip in one embodiment;
Fig. 4 is a schematic diagram of the stored data of the CPU monitoring threads in one embodiment;
Fig. 5 is a schematic diagram of the storage structure of the CPU status information obtained by the FPGA in one embodiment.
Detailed description
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.
Referring to Fig. 1, in one embodiment, a watchdog monitoring method is provided. The method comprises:
Step 101: the FPGA and the CPU are powered on and initialized.
Step 102: the FPGA sends a watchdog feed pulse signal to the watchdog chip, and the CPU runs the thread corresponding to a task.
Specifically, after power-on initialization, the FPGA continuously sends the feed pulse signal to the watchdog chip. The CPU executes tasks according to the power-on program or externally input instructions, such as reading and writing DDR or driving an LCD display; for each task it executes, the CPU runs a corresponding thread.
Step 103: the FPGA and the CPU communicate with each other, each reading the status information of the other side to judge whether it is running abnormally and, if so, controlling the other side to reset.
Specifically, the FPGA and the CPU monitor each other's running state. The FPGA performs its own logic processing and can record its logic state; based on the recorded logic state, the CPU judges whether the FPGA is working abnormally and, if so, controls the FPGA to reset. The processing involved is similar in principle to how a CPU controls other peripherals and is not repeated here.
This step also includes the FPGA monitoring the CPU. After running the thread corresponding to a task, the CPU starts monitoring the thread and sends the state value of the thread within a preset time period to the FPGA, and the FPGA judges whether the CPU is running abnormally according to the error level corresponding to this state value. The preset time period can be configured by the CPU and may vary with the storage depth of the status information; its range is 1 ms to 32 s. The error level of a state value is obtained by the CPU performing CRC checks on the data communication related to the thread, and can be divided in advance into several levels, such as slow response, occasional CRC error, persistent CRC error and communication interruption. When the error level corresponding to the state value is persistent CRC error or communication interruption, the FPGA judges that the CPU is in an abnormal running state.
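The grading and the abnormality decision described in this step can be sketched as follows. This is only an illustration: the patent does not say how the CRC results are turned into a level, so the thresholds `PERSISTENT_RUN` and `SLOW_MS`, the `NONE` level, and the function names are all assumptions. The numeric codes are the ones quoted for Fig. 5 later in the description.

```python
from enum import IntEnum


class ErrorLevel(IntEnum):
    NONE = 0x00              # assumption: no error observed in the period
    SLOW_RESPONSE = 0x10
    OCCASIONAL_CRC = 0x20
    PERSISTENT_CRC = 0x30
    COMM_INTERRUPTED = 0x40


PERSISTENT_RUN = 3   # hypothetical: consecutive CRC failures counted as persistent
SLOW_MS = 50.0       # hypothetical: latency above this counts as a slow response


def classify(crc_ok, responded=True, latency_ms=0.0):
    """Grade one monitoring period from per-frame CRC results (list of bools)."""
    if not responded:
        return ErrorLevel.COMM_INTERRUPTED
    longest = run = 0
    for ok in crc_ok:
        run = 0 if ok else run + 1      # length of the current failure streak
        longest = max(longest, run)
    if longest >= PERSISTENT_RUN:
        return ErrorLevel.PERSISTENT_CRC
    if longest > 0:
        return ErrorLevel.OCCASIONAL_CRC
    if latency_ms > SLOW_MS:
        return ErrorLevel.SLOW_RESPONSE
    return ErrorLevel.NONE


def cpu_is_abnormal(level):
    """Decision rule from the text: only the two most severe levels are abnormal."""
    return level in (ErrorLevel.PERSISTENT_CRC, ErrorLevel.COMM_INTERRUPTED)
```

Under these assumptions, a single failed frame is graded as an occasional CRC error and does not trigger a reset, while three failures in a row or a missing response does.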
Step 104: the FPGA saves the status information of the CPU in real time.
Specifically, the FPGA saves the status information of the CPU in real time so that users can later analyze the running state of the CPU.
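One way to picture this storage is a bounded history in which the oldest record is dropped once the storage depth is reached. The patent does not specify the storage scheme; the ring-buffer behaviour, the class name `StatusHistory` and the record fields below are assumptions made purely for illustration.

```python
from collections import deque


class StatusHistory:
    """Software model of a fixed-depth status store (hypothetical layout)."""

    def __init__(self, depth):
        # deque with maxlen discards the oldest record when full
        self._records = deque(maxlen=depth)

    def save(self, timestamp_ms, thread_id, state_value):
        self._records.append((timestamp_ms, thread_id, state_value))

    def dump(self):
        """Return the retained records, oldest first, for later analysis."""
        return list(self._records)
```

With a depth of 2, saving three records leaves only the last two available for read-back.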
Further, in this embodiment, when the FPGA judges that the CPU is running abnormally, it sends a reset level to the CPU to reset it. After the reset, the CPU reads back the CPU status information saved in the FPGA, initializes the threads whose state values correspond to the persistent CRC error or communication interruption error level, and, for threads whose state values correspond to other error levels, reads the state values from before the reset and continues running them, which speeds up the reset. The CPU can also generate a CPU running log for users to analyze.
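The selective recovery described here can be sketched as a function that splits the threads into those to reinitialize and those to resume. The function name `plan_recovery`, the dictionary input, the thread names and the log wording are hypothetical; only the rule (reinitialize on persistent CRC error or communication interruption, resume otherwise) comes from the text.

```python
PERSISTENT_CRC = 0x30
COMM_INTERRUPTED = 0x40
REINIT_LEVELS = {PERSISTENT_CRC, COMM_INTERRUPTED}


def plan_recovery(saved):
    """saved: dict mapping thread name -> error-level code read back from the FPGA."""
    reinit, resume, log = [], [], []
    for name, level in saved.items():
        if level in REINIT_LEVELS:
            reinit.append(name)   # severe error: start this thread from scratch
            log.append(f"{name}: level 0x{level:02X}, reinitialized")
        else:
            resume.append(name)   # minor or no error: continue from the saved state
            log.append(f"{name}: level 0x{level:02X}, resumed from saved state")
    return reinit, resume, log
```

For example, with saved levels `{"ddr": 0x10, "lcd": 0x30, "net": 0x40}`, only `lcd` and `net` are reinitialized, and the returned log lines can serve as the basis of the running log.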
Referring to Fig. 2, in one embodiment, a watchdog monitoring system is provided. The system comprises: an FPGA, a CPU and a watchdog chip.
The FPGA is configured to send a watchdog feed pulse signal to the watchdog chip after power-on initialization.
The CPU is configured to run the thread corresponding to a task after power-on initialization.
The FPGA and the CPU are further configured to communicate with each other, each reading the status information of the other side to judge whether it is running abnormally and, if so, controlling the other side to reset.
The FPGA is further configured to save the status information of the CPU in real time.
In one embodiment, referring to Fig. 3, the FPGA watchdog circuit has three control pins: DOG_FEED, RESET_CTR and SYSTEM_RESET. The DOG_FEED signal is the feed pulse from the FPGA, whose precision can reach the nanosecond level; during normal feeding, the FPGA continuously sends a fixed pulse to refresh the watchdog chip. The RESET_CTR signal is the watchdog reset enable signal; logic control of this signal switches the reset on or off. The SYSTEM_RESET signal is connected to the FPGA's active loading reset signal and controls whether the FPGA is reset.
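The interplay of the feed pulse and the reset enable can be modelled in a few lines. This is a behavioural sketch, not the circuit of Fig. 3: the timeout value, the class name `WatchdogChip` and the interpretation of RESET_CTR as a simple gate on the reset output are assumptions.

```python
class WatchdogChip:
    """Behavioural model of a watchdog chip fed over DOG_FEED (hypothetical)."""

    def __init__(self, timeout_ns, reset_enabled=True):
        self.timeout_ns = timeout_ns        # assumed chip timeout
        self.reset_enabled = reset_enabled  # models the RESET_CTR gate
        self._last_feed_ns = 0

    def feed(self, now_ns):
        """A DOG_FEED pulse refreshes the chip's internal timer."""
        self._last_feed_ns = now_ns

    def reset_asserted(self, now_ns):
        """Reset fires only if the timer expired and RESET_CTR allows it."""
        expired = now_ns - self._last_feed_ns > self.timeout_ns
        return expired and self.reset_enabled
```

In this model, a missed feed leads to a reset only while `reset_enabled` is true, which is one reading of how logic control of RESET_CTR makes the reset mode switchable.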
In one embodiment, referring to Fig. 4, after running the thread corresponding to a task, the CPU starts monitoring the thread (the valid data bit in Fig. 4 is set from 0 to 1) and sends the state value of the thread within the preset time period to the FPGA. The preset time period can be configured by the CPU in the range of 1 ms to 32 s. In Fig. 4, data such as Thread1 and Thread2 are the preset monitoring period values that the CPU configures for each thread.
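A per-thread monitoring entry with a range-checked period might look like the sketch below. The actual encoding of Fig. 4 is not given in the text, so the dictionary layout, the millisecond unit and the function name `make_monitor_entry` are assumptions; only the 1 ms to 32 s range and the valid bit changing from 0 to 1 come from the description.

```python
MIN_PERIOD_MS = 1        # 1 ms lower bound from the text
MAX_PERIOD_MS = 32_000   # 32 s upper bound from the text


def make_monitor_entry(period_ms):
    """Build the entry for one thread when monitoring starts."""
    if not MIN_PERIOD_MS <= period_ms <= MAX_PERIOD_MS:
        raise ValueError(f"period {period_ms} ms is outside 1 ms to 32 s")
    # valid goes from 0 to 1 once monitoring of the thread has started
    return {"valid": 1, "period_ms": period_ms}
```

A call such as `make_monitor_entry(100)` yields an entry with the valid bit set, while a period of 0 ms or 32001 ms is rejected.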
In one embodiment, the FPGA communicates with the CPU and obtains the CPU status information shown in Fig. 5. Data bit 15 is the valid bit; when set to 1, it marks that the thread is being monitored. Data bit 14 is the init bit, which marks that the peripheral corresponding to the thread has completed initialization. Data bits 13 to 0 record the error level; the error levels comprise slow response (0x10), occasional CRC error (0x20), persistent CRC error (0x30) and communication interruption (0x40). When the error level corresponding to the state value is persistent CRC error or communication interruption, the FPGA judges that the CPU is in an abnormal running state. The FPGA then sends a reset level to the reset pin of the CPU to reset it.
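The bit layout described for Fig. 5 can be expressed as a pair of pack and unpack helpers. The sketch assumes a 16-bit status word and uses hypothetical function names; the bit positions and error-level codes follow the paragraph above.

```python
VALID_BIT = 1 << 15    # bit 15: thread is being monitored
INIT_BIT = 1 << 14     # bit 14: the thread's peripheral finished initialization
LEVEL_MASK = 0x3FFF    # bits 13..0: error level


def pack_status(valid, init_done, level):
    """Combine the three fields into one status word."""
    if not 0 <= level <= LEVEL_MASK:
        raise ValueError("error level does not fit in bits 13..0")
    word = level
    if valid:
        word |= VALID_BIT
    if init_done:
        word |= INIT_BIT
    return word


def unpack_status(word):
    """Split a status word into (valid, init_done, level)."""
    return bool(word & VALID_BIT), bool(word & INIT_BIT), word & LEVEL_MASK
```

For a monitored, initialized thread with a persistent CRC error (0x30), `pack_status(True, True, 0x30)` gives `0xC030`, and `unpack_status(0xC030)` recovers the three fields.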
In one embodiment, after being reset from the abnormal state, the CPU reads the saved CPU status information from the FPGA, initializes the threads whose state values correspond to the persistent CRC error or communication interruption error level, continues running the threads whose state values correspond to other error levels, and generates a CPU running log for users to analyze.
In the above watchdog monitoring method and system, the FPGA and the CPU communicate with each other, each reads the status information of the other side to judge whether it is running abnormally, and each controls the other side to reset if so. The FPGA also saves the status information of the CPU in real time. Compared with the traditional technology, the FPGA can therefore obtain and save CPU status information during monitoring, so that the user can locate the problem when the CPU is reset.
The embodiments above express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make a number of variations and improvements without departing from the inventive concept, and these all fall within the protection scope of the invention. The protection scope of this patent is therefore defined by the appended claims.

Claims (10)

1. A watchdog monitoring method, characterized in that the method comprises:
powering on and initializing an FPGA and a CPU;
the FPGA sending a watchdog feed pulse signal to a watchdog chip, and the CPU running the thread corresponding to a task;
the FPGA and the CPU communicating with each other, each reading the status information of the other side to judge whether it is running abnormally and, if so, controlling the other side to reset;
the FPGA saving the status information of the CPU in real time.
2. The method according to claim 1, characterized in that the step in which the FPGA communicates with the CPU, reads the status information of the other side, and judges whether it is running abnormally comprises:
after running the thread corresponding to the task, the CPU starts monitoring the thread and sends the state value of the thread within a preset time period to the FPGA; the FPGA judges whether the CPU is running abnormally according to the error level corresponding to the state value.
3. The method according to claim 2, characterized in that the preset time period is 1 ms to 32 s.
4. The method according to claim 2, characterized in that the error levels corresponding to the state value comprise slow response, occasional CRC error, persistent CRC error and communication interruption; when the error level corresponding to the state value is persistent CRC error or communication interruption, the FPGA judges that the CPU is in an abnormal running state.
5. The method according to claim 4, characterized in that the method further comprises:
after being reset from the abnormal state, the CPU reads the saved CPU status information from the FPGA, initializes the threads whose state values correspond to the persistent CRC error or communication interruption error level, continues running the threads whose state values correspond to other error levels, and generates a CPU running log.
6. A watchdog monitoring system, characterized in that the system comprises: an FPGA, a CPU and a watchdog chip;
the FPGA is configured to send a watchdog feed pulse signal to the watchdog chip after power-on initialization;
the CPU is configured to run the thread corresponding to a task after power-on initialization;
the FPGA and the CPU are further configured to communicate with each other, each reading the status information of the other side to judge whether it is running abnormally and, if so, controlling the other side to reset;
the FPGA is further configured to save the status information of the CPU in real time.
7. The system according to claim 6, characterized in that the CPU is configured to start monitoring the thread after running the thread corresponding to the task, and to send the state value of the thread within a preset time period to the FPGA;
the FPGA is configured to judge whether the CPU is running abnormally according to the error level corresponding to the state value.
8. The system according to claim 7, characterized in that the preset time period is 1 ms to 32 s.
9. The system according to claim 7, characterized in that the error levels corresponding to the state value comprise slow response, occasional CRC error, persistent CRC error and communication interruption;
the FPGA is configured to judge that the CPU is in an abnormal running state when the error level corresponding to the state value is persistent CRC error or communication interruption.
10. The system according to claim 9, characterized in that the CPU is configured to, after being reset from the abnormal state, read the saved CPU status information from the FPGA, initialize the threads whose state values correspond to the persistent CRC error or communication interruption error level, continue running the threads whose state values correspond to other error levels, and generate a CPU running log.
CN201410306845.3A 2014-06-30 2014-06-30 Watch dog monitoring method and system Active CN105279037B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410306845.3A CN105279037B (en) 2014-06-30 2014-06-30 Watch dog monitoring method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410306845.3A CN105279037B (en) 2014-06-30 2014-06-30 Watch dog monitoring method and system

Publications (2)

Publication Number Publication Date
CN105279037A true CN105279037A (en) 2016-01-27
CN105279037B CN105279037B (en) 2019-01-11

Family

ID=55148087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410306845.3A Active CN105279037B (en) 2014-06-30 2014-06-30 Watch dog monitoring method and system

Country Status (1)

Country Link
CN (1) CN105279037B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060236150A1 (en) * 2005-04-01 2006-10-19 Dot Hill Systems Corporation Timer-based apparatus and method for fault-tolerant booting of a storage controller
CN1908856A (en) * 2005-08-05 2007-02-07 中兴通讯股份有限公司 Position restoration circuit device
CN101271415A (en) * 2008-05-07 2008-09-24 深圳国人通信有限公司 Monitoring watchdog implementing method of built-in equipment
CN101964731A (en) * 2010-06-18 2011-02-02 中兴通讯股份有限公司 Method and device for monitoring data link
CN102081573A (en) * 2010-02-01 2011-06-01 杭州华三通信技术有限公司 Device and method for recording equipment restart reason
CN102141939A (en) * 2010-02-01 2011-08-03 杭州华三通信技术有限公司 Device capable of recording restart reason of whole machine
CN203386143U (en) * 2013-07-24 2014-01-08 三维通信股份有限公司 Remote machine reset device

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105929811A (en) * 2016-04-06 2016-09-07 清华大学 Protection circuit for program deadlock
CN105929811B (en) * 2016-04-06 2018-11-20 清华大学 A kind of protection circuit for program deadlock
CN107025160A (en) * 2017-04-14 2017-08-08 济南浪潮高新科技投资发展有限公司 A kind of system of quick positioning question for Shen prestige processor platform
CN109062718A (en) * 2018-07-12 2018-12-21 联想(北京)有限公司 A kind of server and data processing method
CN109726080A (en) * 2018-12-29 2019-05-07 百度在线网络技术(北京)有限公司 Monitor the method and device of the working condition of heterogeneous computing system
CN109726080B (en) * 2018-12-29 2023-07-14 百度在线网络技术(北京)有限公司 Method and device for monitoring working state of heterogeneous computing system
CN109815044A (en) * 2019-03-29 2019-05-28 深圳市广联智通科技有限公司 A kind of cascade watchdog circuit
CN110287055A (en) * 2019-06-28 2019-09-27 联想(北京)有限公司 The data reconstruction method and electronic equipment of a kind of electronic equipment
CN110287055B (en) * 2019-06-28 2021-06-15 联想(北京)有限公司 Data recovery method of electronic equipment and electronic equipment
CN118377644A (en) * 2024-06-21 2024-07-23 南京国电南自维美德自动化有限公司 FPGA-based rapid CPU fault diagnosis lifting method and system

Also Published As

Publication number Publication date
CN105279037B (en) 2019-01-11

Similar Documents

Publication Publication Date Title
CN105279037A (en) Watchdog monitoring method and system
CN102761439B (en) Device and method for detecting and recording abnormity on basis of watchdog in PON (Passive Optical Network) access system
US10671416B2 (en) Layered virtual machine integrity monitoring
EP3142011A1 (en) Anomaly recovery method for virtual machine in distributed environment
US10216550B2 (en) Technologies for fast boot with adaptive memory pre-training
US10956247B2 (en) Collecting and transmitting diagnostics information from problematic devices
US20140068350A1 (en) Self-checking system and method using same
CN108872762B (en) Electronic equipment leakage detection method and device, electronic equipment and storage medium
CN106201755B (en) The repositioning method and device of the network equipment
US9135193B2 (en) Expander interrupt processing
US20220035438A1 (en) Control method, apparatus, and electronic device
WO2014144043A4 (en) Apparatus and method for generating descriptors to reaccess a non-volatile semiconductor memory of a storage drive due to an error
CN110704228A (en) Solid state disk exception handling method and system
US9026720B2 (en) Non-volatile memory monitoring
US20170371684A1 (en) Pin control method and device
US9223740B2 (en) Detection method and apparatus for hot-swapping of SD card
WO2023065601A1 (en) Server component self-test anomaly recovery method and device, system, and medium
US20170052521A1 (en) Programmable controller and arithmetic processing system
CN109933487B (en) Intelligent robot monitoring method and device
CN112199642B (en) Detection method for anti-debugging of android system, mobile terminal and storage medium
KR101139888B1 (en) ???? middleware system for use in Container Security Device
US9401854B2 (en) System and method for slow link flap detection
CN109102839B (en) Bad block marking method, device, equipment and readable storage medium
CN104750551A (en) A computer system and user-defined responding method thereof
CN115599617B (en) Bus detection method and device, server and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP03 Change of name, title or address
CP03 Change of name, title or address

Address after: 516025 No. 1, Shunchang Road, Huinan Industrial Park, Zhongkai high tech Zone, Huizhou City, Guangdong Province

Patentee after: WELLAV TECHNOLOGIES Ltd.

Address before: 516006 Huitai Industrial Zone 63, Zhongkai High-tech Zone, Huizhou City, Guangdong Province

Patentee before: HUIZHOU WELLAV TECHNOLOGIES Co.,Ltd.