CN1547125A - Watchdog implementing method based on sharing memory - Google Patents

Watchdog implementing method based on sharing memory

Info

Publication number
CN1547125A
Authority
CN
China
Prior art keywords
controlled
shared memory
watchdog
physical process
method based
Legal status
Granted
Application number
CNA200310118551XA
Other languages
Chinese (zh)
Other versions
CN1262926C (en
Inventor
孙兵
刘建华
田茂良
Current Assignee
ZTE Corp
Original Assignee
ZTE Corp
Priority date
2003-12-12
Filing date
2003-12-12
Publication date
2004-11-17
Application filed by ZTE Corp
Priority to CN 200310118551
Publication of CN1547125A
Application granted
Publication of CN1262926C
Anticipated expiration
Expired - Fee Related (current legal status)

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention relates to a method for implementing a watchdog based on shared memory. The method includes the following: at every first time interval, the controlled physical process writes a first-type flag into shared memory; at every second time interval, the watchdog program scans the byte that corresponds to the controlled physical process in shared memory. If the byte's content matches the first-type flag written by the controlled physical process, the process is running, and the watchdog program writes a second-type flag into that byte. If the byte already holds the second-type flag, the controlled physical process has hung or exited; the watchdog kills the hung process and restarts the controlled physical process. The invention solves the problem of porting the code across platforms, broadens the range of application of software watchdogs, and reduces development and maintenance costs.

Description

A watchdog implementation method based on shared memory
Technical field
The present invention relates to methods by which a software watchdog monitors the state of a target resource, and in particular to a watchdog implementation method based on shared memory.
Background
Traditional software watchdogs on the Windows platform generally implement the interaction between the watchdog and the controlled resource (which usually exists as an operating-system physical process) through the window message mechanism. Specifically, the watchdog sends a monitoring request message to the controlled resource, and the controlled resource returns a monitoring response message immediately after receiving it. If the watchdog does not receive the response within a time period T (the monitoring timeout), it concludes that the controlled resource has failed.
This traditional monitoring method has limitations. First, watchdog software designed around the Windows window message mechanism is difficult to port across platforms, because platforms such as UNIX and Linux do not support that mechanism. Second, the Windows window message mechanism is implemented by the system providing each window with a message queue that holds pending messages. When the watchdog sends a monitoring request to the controlled resource, it in effect appends a copy of the message to the message queue of the controlled resource's main window, and the main window takes the request out of the queue and handles it. When the controlled resource is heavily loaded (for example, when it is processing a large volume of traffic), many pending messages can accumulate in the main window's message queue, and the watchdog's monitoring request, mixed in among them, may go unhandled for a long time. The watchdog then misjudges the monitored resource and concludes that it is inactive.
No prior-art document that addresses the above drawbacks has been found.
Summary of the invention
The object of the invention is to provide a watchdog implementation method based on shared memory that solves the software watchdog monitoring problem: the watchdog obtains the state of the controlled resource accurately, and the load on the controlled resource itself does not cause misjudgment.
Shared-memory technology is currently available on the mainstream operating system platforms (including Windows, Sun Solaris, IBM AIX, HP-UX, and Linux).
The object of the invention is achieved as follows:
The invention discloses a watchdog implementation method based on shared memory, comprising the following steps:
(1) at every first time interval, the controlled physical process writes a first-type flag into shared memory;
(2) at every second time interval, the watchdog program scans the byte corresponding to the controlled physical process in the shared memory block and performs a corresponding operation according to the byte's content;
wherein the second time interval is greater than the first time interval.
The corresponding operation in step (2) comprises: if the byte's content equals the first-type flag written by the controlled physical process, the process is running, and the watchdog program writes a second-type flag into the byte; if the byte's content is the second-type flag, the controlled physical process has hung or exited, and the watchdog kills the hung process and restarts the controlled physical process;
wherein the first-type flag and the second-type flag have different values.
The method further comprises a watchdog initialization step: obtain the current state of the controlled physical process; if the monitored process is running or hung, make it exit or kill it, and then start the controlled physical process; if the controlled physical process is not running, start it directly.
A control block structure is stored in the shared memory, comprising: the name of the controlled physical process, a flag indicating whether the physical process control block (PCB) is in use, a unique identifier of the controlled physical process, a flag indicating whether the controlled resource has started, and a heartbeat flag. When the controlled physical process first starts, it operates on each of these flags.
In the method, the operations on the flags when the controlled physical process first starts include: setting the PCB in-use flag to 1 and setting the controlled-resource started flag to 1.
In the method, the heartbeat flag is set to 1 while the controlled physical process is running normally.
In the method, the controlled physical process creates the shared memory, and both the controlled physical process and the watchdog software read and write it.
In the method, the shared memory is ultimately deleted and reclaimed by the operating system: once the monitored process and the watchdog have both relinquished control of the shared memory, the operating system releases the resource and takes back control of it.
The beneficial effects of the invention are as follows. The software watchdog uses the operating system's mature shared-memory mechanism to monitor and control the state of the controlled resource, which is efficient and accurate. The monitoring mechanism is not affected by the load on the controlled resource, and access to the shared memory by both the watchdog software and the controlled resource is reliable, efficient, and safe. Watchdog software and controlled-resource software written to this mechanism can also be ported across platforms smoothly, because all current mainstream operating systems support shared memory.
Description of drawings
Fig. 1 is a block diagram of the physical process control block structure TPhyPCBStruc (PPCB) of the invention;
Fig. 2 is a schematic diagram of the operations of the watchdog and the controlled resource on the shared memory;
Fig. 3 is a flowchart of the shared-memory operations performed when the controlled resource starts;
Fig. 4 is a flowchart of the shared-memory operations performed while the controlled resource is running;
Fig. 5 is a flowchart of how the watchdog obtains the state of the controlled resource;
Fig. 6 is a flowchart of the watchdog's periodic monitoring mechanism.
Embodiment
The software watchdog of the invention is implemented as follows:
I. How the software watchdog monitors the controlled physical process (controlled resource)
1. At every first time interval t1, the controlled physical process writes the first-type flag tag_running into shared memory;
2. At every second time interval t2, the watchdog program scans the byte corresponding to the controlled physical process in the shared memory block. If the byte holds tag_running, the physical process is running, and the watchdog program writes the second-type flag tag_stopped into the byte. If the byte holds tag_stopped, the physical process has hung or exited; the watchdog kills the hung process and restarts the controlled physical process;
3. To keep the detection mechanism reliable, t2 must be greater than t1, and the flags tag_running and tag_stopped written to shared memory must differ; for example, one is 1 and the other is 0.
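The three rules above can be sketched in a few lines of Python. This is only an illustration under assumptions: a one-byte `bytearray` stands in for the real shared memory, and the names `heartbeat` and `scan` are hypothetical, not taken from the patent.

```python
TAG_RUNNING = 1  # first-type flag, written by the controlled process
TAG_STOPPED = 0  # second-type flag, written by the watchdog


def heartbeat(shared: bytearray, offset: int) -> None:
    """Controlled process: intended to be called every t1 seconds."""
    shared[offset] = TAG_RUNNING


def scan(shared: bytearray, offset: int) -> bool:
    """Watchdog: intended to be called every t2 seconds (t2 > t1).

    Returns True if a heartbeat arrived since the previous scan, and
    False if the byte still holds the flag the watchdog left behind.
    """
    if shared[offset] == TAG_RUNNING:
        shared[offset] = TAG_STOPPED  # re-arm for the next period
        return True
    return False


shared = bytearray(1)     # stand-in for one byte of shared memory
heartbeat(shared, 0)
first = scan(shared, 0)   # heartbeat seen, flag cleared
second = scan(shared, 0)  # no heartbeat in between
```

Here `first` is true and `second` is false: the second scan finds the byte exactly as the watchdog left it, which is the condition the method treats as a hung or exited process.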
II. Watchdog initialization
First obtain the current state of the controlled physical process. If the monitored process is running or hung, make it exit or kill it, and then start the controlled physical process; if the controlled physical process is not running, start it directly. This procedure is how the watchdog software initializes and starts the controlled resource.
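A minimal sketch of this initialization step follows. The `State` enum and the `stop` and `start` callables are hypothetical hooks introduced for illustration; the patent does not prescribe how a process is stopped or launched.

```python
from enum import Enum


class State(Enum):
    RUNNING = "RUNNING"
    DEAD = "DEAD"        # hung
    STOPPED = "STOPPED"  # not running


def initialize(state, stop, start):
    """Bring the controlled process to a freshly started condition.

    Returns the list of actions taken, in order.
    """
    actions = []
    if state in (State.RUNNING, State.DEAD):
        stop()           # make it exit, or kill it
        actions.append("stop")
    start()              # in every case the process is started afterwards
    actions.append("start")
    return actions


log = []
result = initialize(State.DEAD,
                    lambda: log.append("killed"),
                    lambda: log.append("started"))
```

The point of the design is that the watchdog never adopts a process it did not start itself: whatever state it finds, it ends with a fresh launch.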
III. Shared memory management
The controlled physical process creates the shared memory, and both the monitored process and the watchdog software read and write it. The shared memory is ultimately deleted and reclaimed by the operating system: once the monitored process and the watchdog have both relinquished control of the shared memory, the operating system releases the resource and takes back control of it.
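The lifecycle described here (create, read and write from both sides, release) can be illustrated with Python's standard `multiprocessing.shared_memory` module (Python 3.8 or later). This is a sketch, not the patent's implementation: for brevity both handles live in one process, whereas in practice the block name would be passed to a separate watchdog process.

```python
from multiprocessing import shared_memory

# Controlled process side: create the block (one byte is enough for a flag).
owner = shared_memory.SharedMemory(create=True, size=1)
owner.buf[0] = 1                  # heartbeat: tag_running

# Watchdog side: attach to the same block by name, then read and write it.
watcher = shared_memory.SharedMemory(name=owner.name)
seen = watcher.buf[0]             # value written by the owner
watcher.buf[0] = 0                # watchdog clears the flag: tag_stopped
cleared = owner.buf[0]            # the owner observes the change

# Both sides give up their handles so the block can be released.
watcher.close()
owner.close()
owner.unlink()                    # requests destruction of the named block
```

One platform difference is worth noting: on POSIX systems a named shared memory object persists until it is explicitly unlinked, while on Windows the block is released once the last handle is closed, which is closer to the behaviour the text describes.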
The technical solution is described in further detail below with reference to the drawings, largely in the order of the figures:
Fig. 1 shows the physical process control block structure TPhyPCBStruc (PPCB). The structure is stored in shared memory and is operated on by both the watchdog software and the controlled resource. Its members are as follows: name[40] holds the name of the controlled resource (for example, a physical process); byUse is the flag indicating whether the physical process control block is in use, where 0 means not in use and 1 means in use; pid is the unique identifier of the controlled resource (for example, the PID of the physical process); Running indicates whether the controlled resource has started, where "ACTIVE" means started and any other value means not started, and the controlled resource sets this field to "ACTIVE" when it starts; beatFlag is the heartbeat flag monitored by the watchdog: every 2 seconds (t1) the controlled resource sets it to 1 (tag_running), and every 10 seconds (t2) the watchdog clears it to 0 (tag_stopped).
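One possible layout of this structure, sketched with Python's `ctypes`. Only the field names and the 40-byte width of `name` come from the text; the other field types and the 8-byte width of `Running` are assumptions made for illustration.

```python
import ctypes


class TPhyPCBStruc(ctypes.Structure):
    """Sketch of the control block; widths other than name[40] are assumed."""
    _fields_ = [
        ("name", ctypes.c_char * 40),    # controlled resource name
        ("byUse", ctypes.c_ubyte),       # 0 = not in use, 1 = in use
        ("pid", ctypes.c_uint32),        # unique identifier, e.g. process ID
        ("Running", ctypes.c_char * 8),  # b"ACTIVE" once started (width assumed)
        ("beatFlag", ctypes.c_ubyte),    # 1 = tag_running, 0 = tag_stopped
    ]


# Overlay the structure on a raw buffer, as one would on a shared-memory block.
raw = bytearray(ctypes.sizeof(TPhyPCBStruc))
ppcb = TPhyPCBStruc.from_buffer(raw)
ppcb.name = b"billing_server"   # hypothetical resource name
ppcb.beatFlag = 1
```

Because `from_buffer` shares memory with `raw` instead of copying it, writes through `ppcb` appear directly in the underlying bytes, which is the property that lets two processes cooperate through one block.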
Fig. 2 shows how the watchdog software and the controlled resource operate on the beatFlag field of the PPCB in shared memory. Every 2 seconds (t1) the controlled resource sets the beatFlag field of the PPCB to 1 (tag_running), and every 10 seconds (t2) the watchdog software clears it to 0 (tag_stopped).
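To see how the two periods interact, the following whole-second simulation may help. It is an illustrative model, not part of the patent: `detection_time` and its parameters are hypothetical, and when a heartbeat and a scan fall on the same second the heartbeat is assumed to land first.

```python
def detection_time(t1, t2, hang_at, horizon=1000):
    """Return the second at which the watchdog first finds the flag still
    cleared, or None if that never happens within `horizon` seconds."""
    flag = 0
    for now in range(1, horizon + 1):
        if now % t1 == 0 and now < hang_at:
            flag = 1              # heartbeat from the live process
        if now % t2 == 0:         # watchdog scan
            if flag == 0:
                return now        # no heartbeat since the last scan
            flag = 0              # clear and wait for the next heartbeat
    return None


late = detection_time(2, 10, hang_at=13)         # last heartbeat at second 12
early = detection_time(2, 10, hang_at=11)        # last heartbeat at second 10
healthy = detection_time(2, 10, hang_at=10**9, horizon=100)
false_alarm = detection_time(10, 2, hang_at=10**9)
```

In this model a hang at second 13 is reported at second 30 and a hang at second 11 at second 20, so detection can take up to nearly two watchdog periods. With the periods swapped (t2 smaller than t1) a healthy process is reported at second 2, which illustrates why t2 must exceed t1.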
Fig. 3 shows the operations on the PPCB structure in shared memory when the controlled resource first starts. First the byUse value of the PPCB is set to 1; then the Running value of the PPCB is set to "ACTIVE"; next the name of the controlled resource is written into name[40] of the PPCB; finally the pid value of the PPCB is set to the unique identifier of the controlled resource (for example, the PID of the physical process).
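The startup sequence can be sketched as follows. The `PPCB` dataclass is an in-process stand-in for the shared structure, and `register` is a hypothetical helper name; only the four steps and their order come from the text.

```python
import os
from dataclasses import dataclass


@dataclass
class PPCB:
    """In-memory stand-in for the shared control block."""
    name: bytes = b""
    byUse: int = 0
    pid: int = 0
    Running: bytes = b""
    beatFlag: int = 0


def register(ppcb, resource_name):
    """Startup sequence of Fig. 3, in the order the text gives."""
    ppcb.byUse = 1                           # mark the block as in use
    ppcb.Running = b"ACTIVE"                 # mark the resource as started
    ppcb.name = resource_name.encode()[:39]  # leave room for a NUL in name[40]
    ppcb.pid = os.getpid()                   # unique identifier of this process


block = PPCB()
register(block, "billing_server")            # hypothetical resource name
```

The sequence does not touch beatFlag; per the text, that flag is set by the periodic heartbeat once the resource is running normally.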
Fig. 4 shows the operations on the PPCB structure in shared memory while the controlled resource is running normally. Every 2 seconds (t1) the controlled resource sets the heartbeat flag beatFlag of the PPCB to 1 (tag_running).
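One way to produce such a periodic heartbeat is a small background thread, sketched below. The helper name `start_heartbeat` is hypothetical, a `bytearray` stands in for the shared block, and the interval is shortened to 0.01 s so the example finishes quickly (the described embodiment uses 2 s).

```python
import threading
import time


def start_heartbeat(flags, offset, interval, stop):
    """Set flags[offset] to 1 every `interval` seconds until `stop` is set."""
    def beat():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            flags[offset] = 1     # tag_running
    worker = threading.Thread(target=beat, daemon=True)
    worker.start()
    return worker


flags = bytearray(1)              # stand-in for the beatFlag byte
stop = threading.Event()
worker = start_heartbeat(flags, 0, 0.01, stop)
time.sleep(0.2)                   # let a few heartbeats happen
seen = flags[0]
stop.set()                        # ask the heartbeat thread to finish
worker.join()
```

A design caveat: a dedicated heartbeat thread only proves that this thread is alive. If the goal is to detect a stalled main loop, the flag would more usefully be set from that loop itself.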
Fig. 5 shows the flow by which the watchdog obtains the state of the controlled resource. First check whether the Running field of the PPCB is "ACTIVE"; if it is not, return the STOPPED state. If it is, check whether the process identified by the pid of the PPCB exists; if it does not exist, return the STOPPED state. If it exists, compare PPCB.name[40] with the name of the controlled resource; if the two differ, return the STOPPED state. If the names match, check the heartbeat flag beatFlag of the PPCB: if its value is 1 (tag_running), return the RUNNING state; otherwise return the DEAD state.
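The decision flow above maps directly onto a short function. In this sketch `get_status` and the `pid_exists` predicate are hypothetical names; how process existence is checked is platform-specific and left to the caller.

```python
from types import SimpleNamespace


def get_status(ppcb, expected_name, pid_exists):
    """Decision flow of Fig. 5.

    `ppcb` is any object with Running, pid, name and beatFlag attributes;
    `pid_exists` is a caller-supplied predicate taking a process ID.
    """
    if ppcb.Running != b"ACTIVE":
        return "STOPPED"
    if not pid_exists(ppcb.pid):
        return "STOPPED"
    if ppcb.name != expected_name:
        return "STOPPED"
    return "RUNNING" if ppcb.beatFlag == 1 else "DEAD"


ppcb = SimpleNamespace(Running=b"ACTIVE", pid=4242,
                       name=b"billing_server", beatFlag=0)
status = get_status(ppcb, b"billing_server", lambda pid: True)
```

With the values shown, the process exists and the names match but no heartbeat has been recorded, so `status` is "DEAD". The name comparison guards against a recycled PID that now belongs to an unrelated process.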
Fig. 6 shows the flow of the watchdog's periodic monitoring mechanism. Every 10 seconds (t2) the watchdog performs the following monitoring operation: it calls the flow described in Fig. 5 to obtain the state of the controlled resource, and then handles each of the three possible states separately. If the state is STOPPED, the watchdog restarts the controlled resource. If the state is DEAD, it first forcibly kills the controlled resource and records an exception log entry, and then restarts the controlled resource. If the state is RUNNING, the watchdog clears the heartbeat flag beatFlag of the PPCB in shared memory to 0 (tag_stopped).
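A single pass of this monitoring loop might look like the sketch below. The function name `monitor_once` and the `restart`, `kill` and `log` callables are hypothetical hooks, and a plain dict stands in for the shared PPCB.

```python
def monitor_once(status, ppcb, restart, kill, log):
    """One watchdog pass (run every t2 seconds), following Fig. 6."""
    if status == "STOPPED":
        restart()
    elif status == "DEAD":
        kill()                  # forcibly terminate the hung process
        log("controlled resource hung; killed and restarted")
        restart()
    elif status == "RUNNING":
        ppcb["beatFlag"] = 0    # tag_stopped: the process must set it again
    else:
        raise ValueError(f"unknown status: {status!r}")


events = []
ppcb = {"beatFlag": 1}
monitor_once("RUNNING", ppcb, lambda: events.append("restart"),
             lambda: events.append("kill"), events.append)
monitor_once("DEAD", ppcb, lambda: events.append("restart"),
             lambda: events.append("kill"), events.append)
```

After these two calls the flag has been cleared by the RUNNING pass, and `events` records the DEAD pass as a kill, a log entry, and a restart, in that order.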

Claims (9)

1. A watchdog implementation method based on shared memory, comprising the following steps:
(1) at every first time interval, the controlled physical process writes a first-type flag into shared memory;
(2) at every second time interval, the watchdog program scans the byte corresponding to the controlled physical process in the shared memory block and performs a corresponding operation according to the byte's content;
wherein the second time interval is greater than the first time interval.
2. The watchdog implementation method based on shared memory according to claim 1, characterized in that the corresponding operation comprises: if the byte's content equals the first-type flag written by the controlled physical process, the process is running, and the watchdog program writes a second-type flag into the byte; if the byte's content is the second-type flag, the controlled physical process has hung or exited, and the watchdog kills the hung process and restarts the controlled physical process.
3. The watchdog implementation method based on shared memory according to claim 2, characterized in that the content of the first-type flag and the content of the second-type flag are different.
4. The watchdog implementation method based on shared memory according to claim 1, characterized in that the method further comprises a watchdog initialization step: obtain the current state of the controlled physical process; if the monitored process is running or hung, make it exit or kill it, and then start the controlled physical process; if the controlled physical process is not running, start it directly.
5. The watchdog implementation method based on shared memory according to claim 1, characterized in that a control block structure is stored in the shared memory, comprising: the name of the controlled physical process, a flag indicating whether the physical process control block (PCB) is in use, a unique identifier of the controlled physical process, a flag indicating whether the controlled resource has started, and a heartbeat flag; when the controlled physical process first starts, it operates on each of these flags.
6. The watchdog implementation method based on shared memory according to claim 5, characterized in that the operations on the flags when the controlled physical process first starts include: setting the PCB in-use flag to 1 and setting the controlled-resource started flag to 1.
7. The watchdog implementation method based on shared memory according to claim 5, characterized in that the heartbeat flag is set to 1 while the controlled physical process is running normally.
8. The watchdog implementation method based on shared memory according to claim 1, characterized in that the controlled physical process creates the shared memory, and both the controlled physical process and the watchdog software read and write it.
9. The watchdog implementation method based on shared memory according to claim 1 or 8, characterized in that the shared memory is ultimately deleted and reclaimed by the operating system: once the monitored process and the watchdog have both relinquished control of the shared memory, the operating system releases the resource and takes back control of it.
CN 200310118551 2003-12-12 2003-12-12 Watchdog implementing method based on sharing memory Expired - Fee Related CN1262926C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200310118551 CN1262926C (en) 2003-12-12 2003-12-12 Watchdog implementing method based on sharing memory

Publications (2)

Publication Number Publication Date
CN1547125A (en) 2004-11-17
CN1262926C (en) 2006-07-05

Family

ID=34338049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200310118551 Expired - Fee Related CN1262926C (en) 2003-12-12 2003-12-12 Watchdog implementing method based on sharing memory

Country Status (1)

Country Link
CN (1) CN1262926C (en)

Also Published As

Publication number Publication date
CN1262926C (en) 2006-07-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20060705

Termination date: 20131212