GB2323191A - Detecting and handling zombie processes - Google Patents


Info

Publication number
GB2323191A
Authority
GB
United Kingdom
Prior art keywords
zombie
processes
information
detecting
moved
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB9800209A
Other versions
GB2323191B (en)
GB9800209D0 (en)
Inventor
Hiroshi Taguchi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Publication of GB9800209D0
Publication of GB2323191A
Application granted
Publication of GB2323191B
Anticipated expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0715 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a system implementing multitasking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1479 Generic software techniques for error detection or fault masking
    • G06F11/1482 Generic software techniques for error detection or fault masking by means of middleware or OS functionality

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

In a machine operating under UNIX, the occurrence of a zombie process can be detected and problems raised concomitantly with the occurrence of the zombie process can be dealt with. Process information pieces concerning all processes in operation are collected from a device file 4 storing performance information of a system, states of the individual processes are analyzed, and a zombie process is detected on the basis of a result of the analysis. The detection of the zombie process is indicated by transmitting an alarm and/or performing a write process to a log-file while adding information necessary for dealing with problems concomitant with the occurrence of the zombie process.

Description

SYSTEM AND METHOD OF DETECTING AND HANDLING ZOMBIE PROCESS
The present invention relates to the detection and handling of a zombie process and, more particularly, to a system and method of detecting and handling a zombie process in a machine operating under a UNIX operating system.
A conventional UNIX machine (a computer incorporating a version of UNIX as an operating system) does not have the function of automatically detecting a zombie process and effectively handling the zombie process. Therefore, in the event that a zombie process occurs, the system as a whole does not take any measures against the zombie process and when a problem of a level which interferes with the system working is raised, maintenance personnel are forced to analyze its causes and take necessary measures through manual operation.
Concomitantly with the occurrence of the zombie process, there arise (1) a problem that a new process cannot be generated and (2) a problem that assignment of the CPU to a different process is prevented, with the result that stable working of the system cannot be carried out.
JP-A-4-264655 proposes an information processing system adapted to perform a predetermined necessary process during makeup or log-in and having decision means for deciding the presence of a zombie process whose ending is not recognized by a master process and which is placed in extinguishment waiting condition while leaving behind only execution information, and informing means for informing a user of the presence of the zombie process when the presence is determined. This proposal, however, presupposes the existence of the master process and after the announcement, mere removal of the zombie process is effected and the proposal fails to refer to any expediency for dealing with problems encountered in the system concomitantly with the occurrence of the zombie process. Further, the conventional machine on the UNIX base does not have the function of detecting and handling a zombie process and hence, it cannot automatically detect a zombie process and deal with problems concomitant with the occurrence of the zombie process.
According to the present invention in a first aspect there is provided a zombie process detecting and handling method comprising the steps of periodically collecting process information pieces concerning all processes in operation from a performance information storing device file, analyzing the present states of the individual processes, and determining and detecting a process as a zombie process when said process has moved to a ready-to-end state but cannot end normally while keeping the ready-to-end state or when said process has moved to an execution mode but does not operate normally, and informing a notice to the effect that the zombie process is detected while adding necessary information.
Further according to the present invention, a zombie process detecting and handling system comprises detection means which collects, in an information processing system, process information pieces concerning all processes in operation from a file storing performance information of the system, analyzes the present states of the individual processes and detects and determines a process as a zombie process when the process has moved to a ready-to-end state but cannot end normally while keeping the ready-to-end state or when the process has moved to an execution mode but does not operate normally, and means which receives a notice of zombie process detection from the detection means and provides an indication that the zombie process is detected by transmitting an alarm and/or carrying out a write process to a log-file while adding predetermined information necessary to deal with problems concomitant with the occurrence of the zombie process.
According to the present invention, a recording medium is provided which records a program for executing, on a machine, a processing procedure in which information pieces concerning all processes in operation are collected from a file storing performance information of a system, the present states of the individual processes are analyzed, a process is determined and detected as a zombie process when the process has moved to a ready-to-end state but cannot end normally while keeping the ready-to-end state or when the process has moved to an execution mode but does not operate normally, and a notice to the effect that the zombie process is detected is informed together with necessary information.
Brief Description of the Drawings
Fig. 1 is a block diagram showing the construction of a system according to an embodiment of the present invention.
Fig. 2 is a flow chart for explaining the operation of the embodiment of the present invention.
Description of the Preferred Embodiments
A preferred embodiment of the present invention will now be described with reference to the accompanying drawings.
A zombie process detecting and handling system according to an embodiment of the present invention is constructed as schematically shown, in block form, in Fig. 1.
Referring to Fig. 1, the system of the present embodiment comprises a zombie process detecting/handling function section 1 including a detection unit 2 and a handling unit 3. The detection unit 2 collects process information pieces of individual processes in operation from a performance information storing device file 4, which is exemplified by a device special file of UNIX in the present embodiment, for storing information pieces concerning all the processes in operation, as indicated at (A) in Fig. 1.
Also, the detection unit 2 analyzes a state of a process on the basis of a collected process information piece and when the detection unit 2 detects and determines, as a result of the analysis, the process as a zombie process, it informs the handling unit 3 that the zombie process is detected, as indicated at (B) in Fig. 1.
Responsive to the notice from detection unit 2 which purports that the zombie process is detected, the handling unit 3 analyzes the zombie process and informs a maintenance man of the detection of the zombie process by transmitting an alarm and/or performing a write process to a log-file while adding information necessary for dealing with problems raised concomitantly with the generation of the zombie process, as indicated at (C) in Fig. 1.
The operation of the zombie process detecting and handling system embodying the present invention will be described with reference to a flow chart of Fig. 2.
Firstly, in connection with all processes in operation, process identifiers (hereinafter referred to as PID's) are acquired (step S1).
One of the PID's acquired in the step S1 is taken out and process information concerning a corresponding process is collected from the performance information storing device file 4 by using the taken-out PID as a key (step S2).
The process information collected in the step S2 includes, for example, a name of the process, a master process identifier, a state of process, a CPU use time/rate of process.
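Steps S1 and S2 can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the Linux `/proc` filesystem as a stand-in for the performance information storing device file 4, whose layout the description does not give, and the names `ProcessInfo`, `parse_stat`, `list_pids` and `collect` are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import List


@dataclass
class ProcessInfo:
    pid: int
    name: str        # name of the process
    ppid: int        # parent ("master") process identifier
    state: str       # one-letter state code, e.g. "R", "S" or "Z"
    cpu_ticks: int   # user + system CPU time, in clock ticks


def parse_stat(text: str) -> ProcessInfo:
    """Parse the contents of a Linux /proc/<pid>/stat file (assumed format)."""
    # The name is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the first "(" and the last ")".
    left, _, rest = text.partition("(")
    name, _, tail = rest.rpartition(")")
    fields = tail.split()
    return ProcessInfo(
        pid=int(left),
        name=name,
        ppid=int(fields[1]),
        state=fields[0],
        cpu_ticks=int(fields[11]) + int(fields[12]),  # utime + stime
    )


def list_pids(proc_root: str = "/proc") -> List[int]:
    """Step S1: acquire the PIDs of all processes (numeric directory names)."""
    return sorted(int(d) for d in os.listdir(proc_root) if d.isdigit())


def collect(pid: int, proc_root: str = "/proc") -> ProcessInfo:
    """Step S2: collect process information using the PID as a key."""
    with open(os.path.join(proc_root, str(pid), "stat")) as f:
        return parse_stat(f.read())
```

On other UNIX variants the same fields would have to come from whatever interface that system exposes; only the shape of the record matters for the later steps.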
On the basis of the collected process information, the state of the process of interest is analyzed (step S3) and it is decided, from a result of the analysis, whether the process is a zombie process (step S4). For example, when the process remains in the ready-to-end state for a time exceeding the predetermined threshold time but cannot actually end, or when the CPU use rate of the process exceeds the threshold value, the process is determined as being in zombie condition.
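The decision of steps S3 and S4 might look like the sketch below. The threshold values, the use of the state code "Z" for the ready-to-end state, and the names `Observation` and `zombie_reason` are assumptions made for illustration; the description leaves the concrete thresholds open.

```python
from dataclasses import dataclass
from typing import Optional

READY_TO_END = "Z"                # assumed code for the ready-to-end state
MAX_READY_TO_END_SECONDS = 60.0   # hypothetical threshold time
MAX_CPU_RATE = 0.95               # hypothetical threshold, fraction of one CPU


@dataclass
class Observation:
    state: str               # current state code of the process
    seconds_in_state: float  # how long the process has been in that state
    cpu_rate: float          # CPU use rate over the sampling window, 0.0 to 1.0


def zombie_reason(obs: Observation) -> Optional[str]:
    """Return why the process counts as a zombie, or None if it does not."""
    if obs.state == READY_TO_END:
        # Ready to end, but has not ended within the threshold time.
        if obs.seconds_in_state > MAX_READY_TO_END_SECONDS:
            return "stuck in ready-to-end state"
        return None
    # In execution mode, but consuming CPU beyond the threshold.
    if obs.cpu_rate > MAX_CPU_RATE:
        return "CPU use rate above threshold"
    return None
```

Returning the reason rather than a bare boolean lets the handling step attach it to the alarm or log entry.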
If the process of interest is determined as a zombie process in the step S4, the zombie process is analyzed (step S5) and the transmission of an alarm or the write process to the log-file is immediately carried out while affixing information necessary for dealing with problems raised in the system concomitantly with the generation of the zombie process, thus informing the maintenance man of the zombie process (step S6).
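A minimal sketch of steps S5 and S6 follows. It assumes that the "necessary information" consists of the PID, process name, parent PID and detection reason, which the description does not enumerate; `format_report`, `handle_zombie` and the `alarm` callback are hypothetical names.

```python
import logging
from typing import Callable, Optional


def format_report(pid: int, name: str, ppid: int, reason: str) -> str:
    """Step S5: gather the details a maintainer needs into one message."""
    return (
        f"zombie process detected: pid={pid} name={name} "
        f"parent_pid={ppid} reason={reason}"
    )


def handle_zombie(
    pid: int,
    name: str,
    ppid: int,
    reason: str,
    logger: logging.Logger,
    alarm: Optional[Callable[[str], None]] = None,
) -> str:
    """Step S6: write to the log and, if an alarm channel is given, raise it."""
    message = format_report(pid, name, ppid, reason)
    logger.warning(message)   # the write process to the log-file
    if alarm is not None:
        alarm(message)        # the transmission of an alarm
    return message
```

In practice the logger would be configured with a file handler, and `alarm` could be anything that reaches maintenance personnel, such as a mail or pager hook.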
After the state of the process of interest has been determined and the detection of the zombie process has been informed, the program returns to the step following the step S1 so that a different PID may be taken out and then steps succeeding the step S2 may be executed for a corresponding process by using the PID as a key.
At the time that the steps succeeding the step S2 are completed for all of the PID's, one cycle of the zombie process detecting/handling processing procedure ends. The one-cycle processing procedure can be carried out periodically.
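The per-PID loop and its periodic repetition can be sketched as below. The four callables are injected so the sketch stays independent of any particular data source; `run_cycle`, `run_periodically` and their parameters are hypothetical names, and skipping a process that disappears between S1 and S2 is an added assumption, not something the description states.

```python
import time
from typing import Any, Callable, Iterable, List, Optional


def run_cycle(
    list_pids: Callable[[], Iterable[int]],
    collect: Callable[[int], Any],
    classify: Callable[[Any], Optional[str]],
    handle: Callable[[Any, str], None],
) -> List[int]:
    """Run one detection cycle and return the PIDs reported as zombies."""
    detected = []
    for pid in list_pids():              # S1: acquire all PIDs
        try:
            info = collect(pid)          # S2: collect info keyed by PID
        except (FileNotFoundError, ProcessLookupError):
            continue                     # process vanished in the meantime
        reason = classify(info)          # S3, S4: analyze and decide
        if reason is not None:
            handle(info, reason)         # S5, S6: analyze and notify
            detected.append(pid)
    return detected


def run_periodically(
    cycle: Callable[[], Any],
    interval_seconds: float,
    max_cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Repeat the cycle at a fixed interval; max_cycles=None runs forever."""
    count = 0
    while max_cycles is None or count < max_cycles:
        cycle()
        count += 1
        if max_cycles is None or count < max_cycles:
            sleep(interval_seconds)
```

Passing `sleep` as a parameter keeps the loop testable without real delays.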
The program for realizing the processing procedure consisting of the individual steps shown in Fig. 2 is suitably practiced on the machine of UNIX base by using a recording medium which records the program.
To sum up the foregoing, in the detection unit (2 in Fig. 1) of the zombie process detecting/handling function section (1 in Fig. 1), process information pieces concerning all of the processes in operation are collected from the performance information storing device file (4 in Fig. 1) of the system, the analysis is carried out on the basis of the information pieces collected in respect of the individual processes and when a process exists which has moved to a ready-to-end state but cannot end normally for a time exceeding the predetermined time while keeping the ready-to-end state or which exhibits, for a predetermined time, a CPU use rate in excess of a threshold value, this process is detected as a zombie process, and the handling unit (3 in Fig. 1) is informed of the detected zombie process.
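Both conditions in this summary depend on a state persisting for a predetermined time, which means something has to be remembered between cycles. One possible way to do that is sketched below; `PersistenceTracker` is a hypothetical helper, and keeping the first-seen timestamp per key is an assumption about how the elapsed time could be measured.

```python
from typing import Dict, Hashable


class PersistenceTracker:
    """Tracks how long a condition has held continuously for each key."""

    def __init__(self, required_seconds: float) -> None:
        self.required_seconds = required_seconds
        self._since: Dict[Hashable, float] = {}

    def update(self, key: Hashable, condition_holds: bool, now: float) -> bool:
        """Record one observation; return True once the condition has
        held for longer than required_seconds without interruption."""
        if not condition_holds:
            # The condition was interrupted, so the clock starts over.
            self._since.pop(key, None)
            return False
        start = self._since.setdefault(key, now)
        return now - start > self.required_seconds
```

A key such as `(pid, "ready-to-end")` or `(pid, "cpu")` would let one tracker per condition cover all processes, with `now` taken from a monotonic clock on each cycle.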
Receiving the information purporting that the zombie process is detected, the handling unit performs the transmission of an alarm and/or the write process to the log-file while affixing information necessary for dealing with problems concomitant with the generation of the zombie process, thus informing the maintenance man of the detected zombie process.
Preferably, the states of processes in operation are monitored periodically to permit the collection of process information pieces and the analysis/determination of the process states and in the event that a zombie process occurs, automatic detection and handling of the zombie process is immediately carried out. This permits a trouble to be dealt with at the initial phase of its occurrence so as to ensure stable system working, thereby improving the reliability and productivity to advantage.

Claims (6)

  1. A zombie process detecting and handling method comprising the steps of:
    periodically collecting process information pieces concerning all processes in operation from a performance information storing device file;
    analyzing the present states of the individual processes;
    determining and detecting a process as a zombie process when said process has moved to a ready-to-end state but cannot end normally while keeping the ready-to-end state or when said process has moved to an execution mode but does not operate normally; and
    informing a notice to the effect that the zombie process is detected while adding necessary information.
  2. A zombie process detecting and handling method according to claim 1 further comprising the steps of acquiring process identifiers of all the processes in operation, taking out one of the acquired process identifiers, and collecting a process information piece concerning a corresponding process by using the taken-out process identifier as a key.
  3. A zombie process detecting and handling system comprising:
    detection means which collects, in an information processing system, process information pieces concerning all processes in operation from a file storing performance information of the system, analyzes the present states of the individual processes and detects and determines a process as a zombie process when said process has moved to a ready-to-end state but cannot end normally while keeping the ready-to-end state or when said process has moved to an execution mode but does not operate normally; and
    means which receives a notice of zombie process detection from said detection means and provides an indication that the zombie process is detected, by transmitting an alarm and/or performing a write process to a log-file while adding predetermined information necessary for dealing with problems raised concomitantly with the occurrence of said zombie process.
  4. A recording medium which records a program for executing, on a machine, a processing procedure in which process information pieces concerning all processes in operation are collected from a file storing performance information of a system, the present states of the individual processes are analyzed, a process is determined and detected as a zombie process when said process has moved to a ready-to-end state but cannot end normally while keeping the ready-to-end state or when said process has moved to an execution mode but does not operate normally, and a notice to the effect that said zombie process is detected is informed while adding predetermined necessary information.
  5. A zombie process detecting and handling method, substantially as herein described with reference to the drawings.
  6. A zombie process detecting and handling system, substantially as herein described with reference to the drawings.
GB9800209A 1997-01-10 1998-01-06 System and method of detecting and handling zombie process Expired - Fee Related GB2323191B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP9014538A JPH10198583A (en) 1997-01-10 1997-01-10 Detection/processing system/method for idle running process

Publications (3)

Publication Number Publication Date
GB9800209D0 GB9800209D0 (en) 1998-03-04
GB2323191A true GB2323191A (en) 1998-09-16
GB2323191B GB2323191B (en) 2001-12-19

Family

ID=11863939

Family Applications (1)

Application Number Title Priority Date Filing Date
GB9800209A Expired - Fee Related GB2323191B (en) 1997-01-10 1998-01-06 System and method of detecting and handling zombie process

Country Status (3)

Country Link
JP (1) JPH10198583A (en)
GB (1) GB2323191B (en)
NZ (1) NZ329564A (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4562568B2 (en) * 2005-03-28 2010-10-13 富士通テン株式会社 Abnormality detection program and abnormality detection method
JP2011141786A (en) * 2010-01-08 2011-07-21 Oki Networks Co Ltd Cpu monitoring device and cpu monitoring program
CN116960947A (en) * 2023-07-03 2023-10-27 国网青海省电力公司海西供电公司 Abnormal process diagnosis and recovery method and module for power distribution automation master station system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB1422603A (en) * 1972-05-23 1976-01-28 Ericsson Telefon Ab L M Arrangement for indicating abnormal programme execution in a computer
US3996567A (en) * 1972-05-23 1976-12-07 Telefonaktiebolaget L M Ericsson Apparatus for indicating abnormal program execution in a process controlling computer operating in real time on different priority levels
GB2047446A (en) * 1979-04-17 1980-11-26 Hitachi Ltd Multiprocessor information processing system having fault detection function
EP0190370A1 (en) * 1984-12-31 1986-08-13 International Business Machines Corporation Device for improving the detection of non-operational states in a non-attended interrupt-driven processor
EP0534884A1 (en) * 1991-09-26 1993-03-31 International Business Machines Corporation Task timeout prevention in a multi-task, real-time system
EP0636985A1 (en) * 1993-07-27 1995-02-01 International Business Machines Corporation Process monitoring in a multiprocessing server

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6382528A (en) * 1986-09-26 1988-04-13 Nec Corp Information output system at the time of trouble of task
JP2595733B2 (en) * 1989-11-30 1997-04-02 富士通株式会社 Task error detection method
JPH04280329A (en) * 1991-03-08 1992-10-06 Fujitsu Ltd Program abnormality detection system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB1422603A (en) * 1972-05-23 1976-01-28 Ericsson Telefon Ab L M Arrangement for indicating abnormal programme execution in a computer
US3996567A (en) * 1972-05-23 1976-12-07 Telefonaktiebolaget L M Ericsson Apparatus for indicating abnormal program execution in a process controlling computer operating in real time on different priority levels
GB2047446A (en) * 1979-04-17 1980-11-26 Hitachi Ltd Multiprocessor information processing system having fault detection function
EP0190370A1 (en) * 1984-12-31 1986-08-13 International Business Machines Corporation Device for improving the detection of non-operational states in a non-attended interrupt-driven processor
EP0534884A1 (en) * 1991-09-26 1993-03-31 International Business Machines Corporation Task timeout prevention in a multi-task, real-time system
EP0636985A1 (en) * 1993-07-27 1995-02-01 International Business Machines Corporation Process monitoring in a multiprocessing server

Also Published As

Publication number Publication date
JPH10198583A (en) 1998-07-31
NZ329564A (en) 1998-06-26
GB2323191B (en) 2001-12-19
GB9800209D0 (en) 1998-03-04

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20030106