CN110825593A - Method, device and equipment for detecting abnormal state of process and storage medium - Google Patents


Info

Publication number
CN110825593A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911096474.XA
Other languages
Chinese (zh)
Other versions
CN110825593B (en)
Inventor
向付晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201911096474.XA (patent CN110825593B)
Publication of CN110825593A
Application granted
Publication of CN110825593B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/323 Visualisation of programs or trace data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Testing And Monitoring For Control Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application relates to a method, a device, equipment and a storage medium for detecting abnormal states of processes, wherein the method comprises the following steps: acquiring a process identifier of a process to be detected, and determining a process control block corresponding to the process identifier; in one cycle, the following steps are performed: judging whether the process to be detected is in a first preset state or not based on the field information of the process state at the current moment; when the process to be detected is in a first preset state, comparing the switching frequency field information of the current moment with the switching frequency field information of the previous moment, and when the comparison result is consistent, repeatedly executing the steps in the cycle until a first preset condition is met; and determining the process state of the process to be detected based on the first preset condition met at the end of the cycle. The method and the device can avoid misjudgment of the process state, and are realized without additionally adding codes, so that the accuracy and the flexibility of process state detection are improved.

Description

Method, device and equipment for detecting abnormal state of process and storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to a method, an apparatus, a device, and a storage medium for detecting an abnormal state of a process.
Background
A process is the form in which a running program exists in the system. Over its lifetime a process can be in any of a set of different states. By checking a process's state information, the system resources it occupies can be determined and the running state of the system can be analyzed and adjusted, so that the system keeps running stably.
For example, the process states in the Linux system include: the executable state R (TASK_RUNNING), the interruptible sleep state S (TASK_INTERRUPTIBLE), the uninterruptible sleep state D (TASK_UNINTERRUPTIBLE), and the process dead (zombie) state Z (TASK_DEAD - EXIT_ZOMBIE). The uninterruptible sleep state D is not in itself an abnormal state, but a process that stays in state D for a long time without returning is in an abnormal state. The prior art mainly detects this abnormal state in two ways. One is to run the ps (Process Status) command multiple times to determine whether a process has been in the uninterruptible sleep state D for a long time, but this method can misjudge, giving inaccurate detection results. The other is to add a heartbeat mechanism to the processing thread, but this requires adding code to support the heartbeat function, which increases the workload of state detection.
Disclosure of Invention
The technical problem to be solved by the present application is to provide a method, an apparatus, a device and a storage medium for detecting an abnormal state of a process, which avoid misjudgment of the process state and can be implemented without adding extra code, so that the accuracy and flexibility of process state detection are improved.
In order to solve the above technical problem, in one aspect, the present application provides a method for detecting an abnormal state of a process, where the method includes:
acquiring a process identifier of a process to be detected, and determining a process control block corresponding to the process identifier, wherein the process control block is used for recording a plurality of field information of the process, and the plurality of field information is updated along with the running of the process;
in one cycle, the following steps are performed:
acquiring process state field information of the current moment from the process control block, and judging whether the process to be detected is in a first preset state or not based on the process state field information of the current moment;
when the process to be detected is in a first preset state, acquiring switching frequency field information of the current moment from the process control block;
acquiring switching frequency field information of the previous moment of the current moment;
comparing the switching frequency field information of the current moment with the switching frequency field information of the previous moment, and when the comparison result is consistent, repeatedly executing the steps in the cycle until a first preset condition is met;
and determining the process state of the process to be detected based on the first preset condition met at the end of the cycle.
In another aspect, the present application provides an apparatus for detecting an abnormal state of a process, the apparatus including:
the process control block determining module is used for acquiring a process identifier of a process to be detected and determining a process control block corresponding to the process identifier, wherein the process control block is used for recording a plurality of pieces of field information of the process, and the plurality of pieces of field information are updated along with the running of the process;
the first preset state judgment module is used for executing the following steps in a cycle: acquiring process state field information of the current moment from the process control block, and judging whether the process to be detected is in a first preset state or not based on the process state field information of the current moment;
the first acquisition module is used for acquiring switching frequency field information of the current moment from the process control block when the process to be detected is in a first preset state;
the second acquisition module is used for acquiring switching frequency field information of the previous moment of the current moment;
the repeated execution module is used for comparing the switching frequency field information at the current moment with the switching frequency field information at the previous moment, and when the comparison result is consistent, the steps in the cycle are repeatedly executed until a first preset condition is met;
and the process state determining module is used for determining the process state of the process to be detected based on the first preset condition met when the circulation is finished.
In another aspect, the present application provides an apparatus comprising a processor and a memory, wherein the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the abnormal state detection method of the process as described above.
In another aspect, the present application provides a computer storage medium having at least one instruction, at least one program, a set of codes, or a set of instructions stored therein, the at least one instruction, at least one program, set of codes, or set of instructions being loaded by a processor and executing the method for detecting an abnormal state of a process as described above.
The embodiment of the application has the following beneficial effects:
determining a process control block corresponding to a process identifier through the process identifier of the process to be detected; acquiring process state field information of the current moment from the process control block, and judging whether the process to be detected is in a first preset state or not; when the process to be detected is in a first preset state, comparing the switching frequency field information of the current moment with the switching frequency field information of the previous moment, and when the comparison result is consistent, repeatedly executing the steps until a first preset condition is met; and determining the process state of the process to be detected based on the first preset condition met at the end of the cycle. The method and the device detect the state of the process by combining the field information of the process state and the field information of the switching times, avoid the misjudgment of the process state based on the judgment of the multi-dimensional information, and improve the accuracy of the judgment of the process state; according to the state detection method, source codes do not need to be modified, and the process state detection flexibility is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application;
FIG. 2 is a flowchart of a method for detecting an abnormal state of a process according to an embodiment of the present application;
FIG. 3 is a flowchart of a first preset state judgment method according to an embodiment of the present application;
FIG. 4 is a flowchart of a method for comparing switching times field information according to an embodiment of the present application;
FIG. 5 is a flowchart of a process state determination method according to an embodiment of the present application;
FIG. 6 is a flowchart of another method for detecting an abnormal state of a process according to an embodiment of the present application;
FIG. 7 is a flowchart of a process state detection method according to an embodiment of the present application;
FIG. 8 is a flowchart of a detection process provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of an apparatus for detecting an abnormal state of a process according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a first preset state judgment module according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a repeated execution module provided in an embodiment of the present application;
FIG. 12 is a schematic diagram of a process state determination module provided by an embodiment of the present application;
FIG. 13 is a schematic diagram of another apparatus for detecting an abnormal state of a process according to an embodiment of the present application;
FIG. 14 is a schematic diagram of a process monitoring module provided by an embodiment of the present application;
FIG. 15 is a schematic structural diagram of a device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, the present application will be further described in detail with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to fig. 1, a schematic diagram of an implementation environment is shown, which may include: at least one user terminal 110 and a server 120, said user terminal 110 and said server 120 being in data communication via a network. Specifically, the user terminal 110 sends a process state detection request to the server 120; the server 120 receives the process state detection request and detects the state of the corresponding process to be detected.
The user terminal 110 may communicate with the server 120 based on a Browser/Server (B/S) mode or a Client/Server (C/S) mode. The user terminal 110 may include physical devices and may also include software running on the physical devices, such as application programs. The operating system running on the user terminal 110 in the embodiment of the present invention may include, but is not limited to, Android, iOS, Linux, Windows, and the like.
The server 120 and the user terminal 110 may establish a communication connection through a wired or wireless connection, and the server 120 may include a server operating independently, or a distributed server, or a server cluster composed of a plurality of servers, where the server may be a cloud server.
In order to solve the problem of detection state misjudgment existing in the prior art when a process is detected to be abnormal and the problem of workload increase caused by the need of adding a code supporting a heartbeat function to implement state detection, a method for detecting an abnormal state of a process is provided, an execution subject of which may be the server in fig. 1, specifically, please refer to fig. 2, and the method may include:
S210, acquiring a process identifier of a process to be detected, and determining a process control block corresponding to the process identifier, wherein the process control block is used for recording a plurality of pieces of field information of the process, and the plurality of pieces of field information are updated along with the running of the process.
The process identifier of the process to be detected may be determined based on a process state detection request of a user, or may be determined by a detection instruction actively initiated by the system.
The process identifier in the embodiment of the application is unique for each process, and the kernel distinguishes processes by their identifiers. When a program is loaded into memory it becomes a process, and a process control block is set up to manage that process and to record its characteristics and related information. The process control block is a data structure in the operating system kernel that mainly represents the state of the process; it is what makes a program a basic unit that can run independently and execute concurrently with other processes.
Specifically, taking a Linux system as an example, the task_struct structure is used for description here. Each process corresponds to one task_struct data structure, which is loaded into memory and contains the information of the process while it runs. The main field information in the task_struct structure includes: process state field information, switching times field information, scheduling field information, identifier field information of the process, process communication field information, time data field information, and the like.
When the process is in a normal operation state, the relevant information in the task_struct structure corresponding to the process is continuously updated.
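As an illustration only, the process state and switching times fields can be observed from user space on Linux through `/proc/<pid>/status`, which exposes a `State` line and the `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` counters. The sketch below assumes that these two counters correspond to the active and passive switching times described here; the patent does not state how the fields are read, and the function names `parse_status`, `snapshot` and `read_snapshot` are hypothetical.

```python
from typing import Dict, Union


def parse_status(text: str) -> Dict[str, str]:
    """Parse the 'Key:<tab>value' lines of a /proc/<pid>/status file."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


def snapshot(fields: Dict[str, str]) -> Dict[str, Union[str, int]]:
    """Keep only the state letter and the two context-switch counters.

    Assumption: voluntary / nonvoluntary context switches stand in for the
    active / passive switching times fields of the process control block.
    """
    return {
        "state": fields["State"].split()[0],  # "D" from "D (disk sleep)"
        "active": int(fields["voluntary_ctxt_switches"]),
        "passive": int(fields["nonvoluntary_ctxt_switches"]),
    }


def read_snapshot(pid: int) -> Dict[str, Union[str, int]]:
    """Read one snapshot for a live process (Linux only)."""
    with open(f"/proc/{pid}/status", encoding="utf-8") as f:
        return snapshot(parse_status(f.read()))
```

Calling `read_snapshot` twice for the same process identifier, some time apart, gives the "previous moment" and "current moment" values that the later steps compare.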
In one cycle, the following steps are performed:
A loop here refers to one execution cycle; that is, for each process the following steps are executed within one execution cycle. After each pass through the loop, it is judged whether the current relevant information meets a first preset condition; if not, all steps in the execution cycle are executed again, until the first preset condition is met.
S220, acquiring process state field information of the current time from the process control block, and judging whether the process to be detected is in a first preset state or not based on the process state field information of the current time.
The present application judges whether a process is in an abnormal state only once the process has entered the first preset state, so determining that the process to be detected is in the first preset state is a necessary condition for carrying out the state detection method of the present application. Specifically, referring to fig. 3, a first preset state judgment method is shown, where the process state field information includes a process state field name and a process state field value, and accordingly the method includes:
S310, acquiring the corresponding process state field value from the process control block based on the process state field name.
S320, when the process state field value indicates the uninterruptible waiting state, judging that the process is in the first preset state.
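Steps S310 and S320 can be sketched as a lookup by field name followed by a test of the field value. This is a minimal sketch that assumes the field information is held in a dictionary and that the uninterruptible waiting state is encoded as the letter `D`, as shown by ps; the name `in_first_preset_state` and the default field name `State` are illustrative choices, not taken from the patent.

```python
from typing import Dict

# Assumed encoding of the uninterruptible waiting state (the letter shown by ps).
UNINTERRUPTIBLE = "D"


def in_first_preset_state(fields: Dict[str, str], field_name: str = "State") -> bool:
    """S310: look up the state value by field name; S320: test it for D."""
    value = fields.get(field_name, "")
    # The value may carry a description, e.g. "D (disk sleep)"; keep the letter only.
    return value.split()[:1] == [UNINTERRUPTIBLE]
```

A missing field is treated as "not in the first preset state" here, which is a design choice of this sketch rather than something the text specifies.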
After determining that the process to be detected is in the first preset state, further judgment needs to be performed based on the switching number field information in the process control block.
S230, when the process to be detected is in a first preset state, acquiring switching frequency field information of the current moment from the process control block.
S240, acquiring switching frequency field information of the previous moment of the current moment.
In the embodiment of the present application, a time interval between the previous time and the current time, which may also be referred to as a detection period, may be set, that is, each step in a cycle is executed every other detection period; since each field information in the process control block of the process is continuously updated, the switching number field information for each current time may be recorded so as to be compared with the switching number field information acquired in the next time.
S250, comparing the switching frequency field information of the current moment with the switching frequency field information of the previous moment, and when the comparison result is consistent, repeatedly executing the steps in the cycle until a first preset condition is met.
In the step S250, when the comparison result is consistent, the steps in the loop may be repeatedly executed after waiting for a preset time interval.
Please refer to fig. 4, which shows a method for comparing field information of switching times, where the field information of switching times includes: the field value of the active switching time field and the field value of the passive switching time field may include:
S410, comparing the field value of the active switching time field at the current moment with the field value of the active switching time field at the previous moment.
S420, comparing the field value of the passive switching time field at the current moment with the field value of the passive switching time field at the previous moment.
S430, judging whether the field value of the active switching time field at the current moment is consistent with the field value of the active switching time field at the previous moment; if yes, go to step S440; if not, go to step S450.
S440, judging whether the field value of the passive switching time field at the current moment is consistent with the field value of the passive switching time field at the previous moment; if yes, go to step S460; if not, go to step S450.
S450, judging that the comparison results are inconsistent.
S460, judging that the comparison results are consistent.
Based on this method, the comparison result is judged to be consistent only if both the active switching times field value and the passive switching times field value at the current moment equal their values at the previous moment. If either the active switching times field value or the passive switching times field value differs from its value at the previous moment, the comparison result is judged to be inconsistent, and in that case the process is determined to be in the normal state.
In the embodiment of the application, the switching frequency field information is used as a judgment basis because whether the process normally runs can be judged through the change of the switching frequency field information, and if the switching frequency field information of the process changes, the process is in a normal state; if the field information of the switching times of the process is not changed, the process is possibly in an abnormal state, and particularly whether the process is in the abnormal state or not is judged by combining the subsequent steps.
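The comparison in steps S410 to S460 reduces to checking both counters for equality. The following is a minimal sketch; the `SwitchCounts` type and the function name `counts_consistent` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class SwitchCounts(NamedTuple):
    """Switching times field values at one moment (hypothetical container)."""
    active: int
    passive: int


def counts_consistent(current: SwitchCounts, previous: SwitchCounts) -> bool:
    """Return True only when neither counter changed since the previous moment."""
    if current.active != previous.active:    # S430 fails, go to S450
        return False
    if current.passive != previous.passive:  # S440 fails, go to S450
        return False
    return True                              # S460: comparison results consistent
```

A change in either counter is enough to conclude that the process was scheduled between the two moments, which is why an inconsistent result maps to the normal state.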
S260, determining the process state of the process to be detected based on the first preset condition met at the end of the loop.
The end of the cycle here means: after the steps in the loop are executed once or repeatedly, when the relevant information at the current moment meets a first preset condition, the loop is judged to be ended.
The first preset condition in the embodiment of the present application may be either of the following: (a) the comparison result between the switching frequency field information at the current moment and that at the previous moment is inconsistent, and the number of repeated executions is less than the preset number; or (b) the number of repeated executions reaches the preset number. That is, while the steps in the loop are being executed, the loop ends as soon as either of these conditions is met. The number of repeated executions here refers to the number of times the steps in one cycle, namely steps S220 to S250, have been executed.
Specifically, referring to fig. 5, a method for determining a process state is shown, which determines a current state of a process for a condition satisfied at the end of a loop, and specifically may include:
S510, when the first preset condition met at the end of the loop is that the switching frequency field information at the current moment is inconsistent with the switching frequency field information at the previous moment and the number of repeated executions is less than the preset number, determining that the process to be detected is in a normal state.
In this case, the loop is ended when the number of times of loop execution does not exceed the preset number of times, indicating that the field information of the number of times of switching in the process control block at the end of the loop is changed from the previous time, and indicating that the process is in a normal state as long as the field information of the number of times of switching is changed.
S520, when the first preset condition met at the end of circulation is that the number of times of repeated execution reaches a preset number, determining that the process to be detected is in an abnormal state.
In this case, the loop is ended when the number of times of executing the loop reaches the preset number, which indicates that the switching number field information in the process control block has not changed all the time in the process of executing the loop, and if the switching number field information has not changed and the number of times of executing the loop reaches the preset number, it may be determined that the process to be detected is in an abnormal state.
In a further case, the switching number field information in the process control block changes during the final pass, that is, on the pass in which the number of loop executions reaches the preset number. The process to be detected may then also be considered to be in a normal state.
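Putting steps S220 to S260 and S510 to S520 together, the loop can be sketched as follows. This is a sketch under stated assumptions, not the patented implementation: snapshots are supplied by a caller-provided `read` function, a baseline snapshot is taken before the first comparison, a process that is not (or no longer) in state D is reported as normal, and the names `detect`, `max_checks` and `interval` are hypothetical.

```python
import time
from typing import Callable, Tuple

# (state letter, active switching times, passive switching times)
Snapshot = Tuple[str, int, int]


def detect(read: Callable[[], Snapshot],
           max_checks: int = 5,
           interval: float = 1.0,
           sleep: Callable[[float], None] = time.sleep) -> str:
    """Return "normal" or "abnormal" for one detection run."""
    previous = read()                    # baseline for the first comparison
    if previous[0] != "D":               # not in the first preset state
        return "normal"                  # assumption: nothing further to check
    for _ in range(max_checks):          # preset number of repetitions
        sleep(interval)                  # preset time interval between passes
        current = read()
        if current[0] != "D":            # left state D: assumed normal
            return "normal"
        if current[1:] != previous[1:]:  # a counter changed: normal, even on
            return "normal"              # the final pass
        previous = current               # remember for the next comparison
    return "abnormal"                    # counters never changed in D state
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; in practice the default `time.sleep` would be used.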
Referring to fig. 6, another abnormal state detection method for a process is shown, which may include:
S610, creating a state detection result file in advance.
S620, when the process to be detected is determined to be in the abnormal state, determining the duration of the process to be detected in the abnormal state.
When the process to be detected is found to be in the abnormal state, the duration of the abnormal state may be taken as the duration of one detection process, that is, the time spent executing the preset number of cycles plus the preset time intervals between consecutive cycles. If, however, an earlier detection process already found the process to be detected in the abnormal state, that abnormality was not handled, and the switching number field information in the process control block at the end of the earlier detection process is the same as at the end of the current detection process, then the duration of the abnormal state runs from the start of the earlier detection process to the current time. In other words, if no abnormality was detected before, the duration of the current detection process is used as the duration of the abnormal state; if the abnormal state was detected before, the duration is determined from the earlier and the current detection processes together.
S630, outputting the process identifier of the process to be detected and the duration of the abnormal state to the state detection result file.
The process identifier of the process to be detected and the corresponding duration of the abnormal state are output to the state detection result file as one record.
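The duration rule and the record output of steps S620 and S630 can be sketched as below. The patent does not specify a file format, so this sketch assumes one JSON object per line with the hypothetical keys `pid` and `abnormal_seconds`; the function names are likewise illustrative.

```python
import json
from typing import Optional


def abnormal_duration(run_start: float, now: float,
                      earlier_run_start: Optional[float] = None) -> float:
    """Duration of the abnormal state, in seconds.

    If an earlier detection run already found the process abnormal (and its
    switching counters have not changed since), count from that earlier start;
    otherwise count from the start of the current run.
    """
    start = run_start if earlier_run_start is None else earlier_run_start
    return now - start


def append_record(out, pid: int, seconds: float) -> None:
    """Write one record to an open text file (assumed JSON-lines format)."""
    out.write(json.dumps({"pid": pid, "abnormal_seconds": seconds}) + "\n")
```

In use, the result file created in step S610 would be opened in append mode and passed as `out`, so that each abnormal process adds one line.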
Referring to fig. 7, a method for process state detection is shown, which may include:
S710, in response to a process abnormality monitoring instruction, acquiring the state detection result file.
S720, returning the process identifier of each process in the state detection result file and the corresponding duration of the abnormal state.
For the process detected to be in the abnormal state, the detection result is output to a state detection result file; when the abnormal process needs to be monitored, the state information of the related abnormal process can be directly obtained through the state detection result file.
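Steps S710 and S720 amount to reading the result file back and returning its records. The sketch below assumes, purely for illustration, that each line of the file is a JSON object with the hypothetical keys `pid` and `abnormal_seconds`; the patent does not define the file layout, and `load_abnormal` is an invented name.

```python
import json
from typing import Iterable, List, Tuple


def load_abnormal(lines: Iterable[str]) -> List[Tuple[int, float]]:
    """Return (process identifier, seconds in abnormal state) for each record."""
    results: List[Tuple[int, float]] = []
    for line in lines:
        line = line.strip()
        if not line:  # tolerate blank lines in the file
            continue
        record = json.loads(line)
        results.append((int(record["pid"]), float(record["abnormal_seconds"])))
    return results
```

Because an open text file iterates line by line, the file object itself can be passed as `lines`.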
The specific implementation process of the present application is described below by taking a Linux system as an example, but the application scenario of the present application is not limited to the Linux system, and may also be an Android system, an iOS system, a Windows system, and the like.
In Linux, a process enters the uninterruptible sleep state (D state) under the following condition: the process issues an IO request, for example reading a logical location on a disk or reading a file on an NFS file system, and until the IO returns the process stays in the D state, that is, it is waiting for IO. If the IO never returns, the process remains in the D state indefinitely, and the process is then in an abnormal state. However, a process cannot be judged abnormal merely because it is in the D state, since the D state itself is not an abnormal state. Only when a process stays in the D state for a long time is it abnormal, in which case the underlying disk has most likely failed or the network is abnormal.
Existing process abnormality detection mainly uses the following two methods:
1. Check whether the process is in the D state by using the ps command.
The ps command has a column dedicated to displaying the process state, and repeated checks are used to detect whether the process has been in the D state for a long time. However, a short-lived D state is not abnormal, and an IO-intensive application is in the D state most of the time and is likely to be found in the D state at every check even though it is in fact frequently processing IO requests, so this method may cause misjudgment.
2. Add a heartbeat mechanism to the IO thread of the process; if the monitor receives no heartbeat report for a long time, the process is considered abnormal.
This requires adding code to support the heartbeat function, which involves a certain workload, and the source code of some programs may not be available; how to ensure that the monitoring process itself stays alive and effective is also a significant problem.
In a specific implementation process, a kernel module is inserted to detect the process state, and meanwhile, a related interface is provided for a user so that the user can conveniently set and transmit related parameters; the execution operation of the kernel module is independent of other modules in the system and is decoupled from the operating system to the greatest extent.
Specifically, by adding one kernel module, the embodiment of the present application provides three setting interfaces: the process identifier (pid) of the process to be detected, the number of detections, and the detection period (total detection time = number of detections × detection period).
The result is output to a result file: the output includes the pid of the process and the length of time the process has been in the D state, and the user can learn whether an exception has been triggered by monitoring this file.
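The three settings and the total detection time can be illustrated with a small sketch. The class name `DetectConfig` and its field names are hypothetical and stand in for the setting interfaces of the kernel module, whose actual form is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectConfig:
    pid: int          # process identifier of the process to be detected
    count: int        # number of detections
    period_s: float   # detection period, in seconds

    def __post_init__(self) -> None:
        if self.pid <= 0 or self.count < 1 or self.period_s <= 0:
            raise ValueError("pid, count and period_s must be positive")

    @property
    def total_time_s(self) -> float:
        """Total detection time = number of detections x detection period."""
        return self.count * self.period_s


config = DetectConfig(pid=4321, count=6, period_s=10.0)
```

With 6 detections at a 10-second period, a process is reported only after it has shown no progress for about 60 seconds; shortening either value makes the detection more sensitive.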
The kernel module traverses all processes and finds the task_struct of the process to be detected according to the set pid. It first judges whether the process state is TASK_UNINTERRUPTIBLE, that is, the D state. If so, it records the number of active CPU switches and the number of passive CPU switches of the process, and checks again after one detection period. If the process is still in the D state and neither the number of active switches nor the number of passive switches has changed, the detection count is incremented by one. This procedure is repeated, and if the detection count is greater than or equal to the set number of detections, the pid of the process and the duration of the abnormality are output to the result file.
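The embodiment reads these values from task_struct inside a kernel module. As a rough user-space analogue, and only as an assumption made for illustration, the same three pieces of information can be read from the Linux file /proc/<pid>/status, whose `State`, `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` lines report the process state and the two context-switch counters. The names `Sample`, `parse_status` and `read_sample` are hypothetical.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    state: str        # single-letter state code, e.g. "D" or "S"
    voluntary: int    # active (voluntary) context switches
    involuntary: int  # passive (involuntary) context switches


def parse_status(text: str) -> Sample:
    """Extract state and switch counters from /proc/<pid>/status content."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return Sample(
        state=fields["State"].split()[0],
        voluntary=int(fields["voluntary_ctxt_switches"]),
        involuntary=int(fields["nonvoluntary_ctxt_switches"]),
    )


def read_sample(pid: int) -> Sample:
    """Take one sample of a live process (Linux only)."""
    with open(f"/proc/{pid}/status", encoding="utf-8") as f:
        return parse_status(f.read())


example = parse_status(
    "Name:\tdd\n"
    "State:\tD (disk sleep)\n"
    "voluntary_ctxt_switches:\t150\n"
    "nonvoluntary_ctxt_switches:\t3\n"
)
```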
The numbers of active and passive switches are used as the basis for state detection for the following reason: when a process enters the D state, that is, waits for IO, it is placed in a wait queue and cannot actively switch the CPU; a process in this queue cannot receive kernel signals, so passive switching does not occur either. Checking these two fields therefore shows whether the process has been switched at all, and if no switch occurs for a long time, the process has been waiting for the same IO the whole time. When the process is in the abnormal D state, its stack information also remains unchanged, because no CPU switch occurs.
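The difference between judging by state alone and judging by state together with the switch counters can be shown on made-up sample data. The sample values and function names below are invented for illustration; each sample is a tuple of (state, number of active switches, number of passive switches).

```python
from typing import List, Tuple

Sample = Tuple[str, int, int]  # (state, active switches, passive switches)


def stuck_by_state_only(samples: List[Sample]) -> bool:
    """Flags a process when every sample found it in the D state."""
    return all(state == "D" for state, _, _ in samples)


def stuck_by_state_and_counters(samples: List[Sample]) -> bool:
    """Additionally requires that neither switch counter ever changed."""
    if not stuck_by_state_only(samples):
        return False
    return all(a[1:] == b[1:] for a, b in zip(samples, samples[1:]))


# An IO-intensive process: in the D state at every check, but still being switched.
busy = [("D", 100, 7), ("D", 140, 7), ("D", 185, 8)]
# A hung process: in the D state at every check, counters frozen.
hung = [("D", 100, 7), ("D", 100, 7), ("D", 100, 7)]
```

The state-only check flags both `busy` and `hung`, whereas the combined check flags only `hung`, which is the misjudgment the switch counters are meant to avoid.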
The reliability of the detection result is ensured by detecting multiple times. The pid of the process to be detected, the detection interval duration and the number of detections are preset, and each detection process executes a loop of up to the preset number of detections. The specific procedure, shown in fig. 8, includes the following steps:
S810, judging whether the process to be detected is in the D state (checking whether the state field in the task_struct structure is TASK_UNINTERRUPTIBLE); if yes, go to step S820, otherwise go to step S860.
S820, judging whether the number of active switches of the process is the same as at the previous check (checking the nvcsw field in the task_struct structure); if yes, go to step S830, otherwise go to step S860.
S830, judging whether the number of passive switches of the process is the same as at the previous check (checking the nivcsw field in the task_struct structure); if yes, go to step S840, otherwise go to step S860.
S840, adding one to the detection count, and judging whether the detection count has reached the preset number of detections; if yes, go to step S870, otherwise go to step S850.
S850, waiting for the detection interval duration, and then returning to step S810.
S860, returning that the process is normal.
S870, returning that the process is abnormal, and outputting the process abnormality information to the result file.
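The flow of steps S810 to S870 can be sketched in user space as follows. This is an illustration under stated assumptions rather than the kernel module itself: the function name `detect` and the `sample` callback (which returns the state and the two switch counters) are hypothetical, and the sketch takes one baseline sample before the first comparison, since the first check has no previous value to compare against.

```python
import time
from typing import Callable, Tuple

Sample = Tuple[str, int, int]  # (state, active switches, passive switches)


def detect(
    sample: Callable[[], Sample],
    count: int,
    interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the process is judged abnormal, False if normal."""
    state, prev_active, prev_passive = sample()  # baseline sample (assumption)
    if state != "D":                 # S810
        return False                 # S860
    hits = 0
    while True:
        sleep(interval_s)            # S850
        state, active, passive = sample()
        if state != "D":             # S810
            return False             # S860
        if active != prev_active:    # S820: active switch count changed
            return False             # S860
        if passive != prev_passive:  # S830: passive switch count changed
            return False             # S860
        hits += 1                    # S840
        if hits >= count:
            return True              # S870


def replay(samples):
    """Build a sampler that replays a fixed list of samples (for the demo)."""
    it = iter(samples)
    return lambda: next(it)


no_wait = lambda _seconds: None
hung_result = detect(replay([("D", 5, 1)] * 4), count=3, interval_s=0, sleep=no_wait)
busy_result = detect(replay([("D", 5, 1), ("D", 6, 1)]), count=3, interval_s=0, sleep=no_wait)
idle_result = detect(replay([("S", 5, 1)]), count=3, interval_s=0, sleep=no_wait)
```

Injecting `sample` and `sleep` keeps the loop independent of where the data comes from, so the same logic can be exercised with recorded samples or with a live reader.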
According to the above example, the application source code does not need to be modified; the number of detections and the detection period are adjustable, so the detection strategy is flexible; the multidimensional judgment that combines the process state with the numbers of switches avoids misjudgment; and a plurality of processes can be monitored simultaneously.
In the present application, a process control block corresponding to the process identifier is determined through the process identifier of the process to be detected; process state field information at the current moment is acquired from the process control block, and whether the process to be detected is in a first preset state is judged; when the process to be detected is in the first preset state, the switching times field information at the current moment is compared with the switching times field information at the previous moment, and when the comparison result is consistent, the above steps are repeated until a first preset condition is met; and the process state of the process to be detected is determined based on the first preset condition met at the end of the loop. The application detects the process state by combining the process state field information with the switching times field information; judging on multi-dimensional information avoids misjudging the process state and improves the accuracy of the judgment. The state detection method requires no modification of source code, which improves the flexibility of process state detection.
The present embodiment further provides an apparatus for detecting an abnormal state of a process, referring to fig. 9, the apparatus may include:
a process control block determining module 910, configured to obtain a process identifier of a process to be detected, and determine a process control block corresponding to the process identifier, where the process control block is configured to record multiple pieces of field information of the process, and the multiple pieces of field information are updated along with running of the process;
the first preset state determining module 920 is configured to, in a cycle, perform the following steps: acquiring process state field information of the current moment from the process control block, and judging whether the process to be detected is in a first preset state or not based on the process state field information of the current moment;
a first obtaining module 930, configured to obtain field information of the switching times at the current time from the process control block when the process to be detected is in a first preset state;
a second obtaining module 940, configured to obtain field information of the number of times of switching at the previous time of the current time;
a repeated execution module 950, configured to compare the switching number field information at the current time with the switching number field information at the previous time, and when the comparison result is consistent, repeatedly execute the above steps in the loop until a first preset condition is met;
a process state determining module 960, configured to determine the process state of the process to be detected based on the first preset condition that is met when the loop ends.
The process status field information includes a process status field name and a process status field value, and accordingly, referring to fig. 10, it is shown that the first predetermined status determining module 920 includes:
a process status field value obtaining module 1010, configured to obtain the corresponding process status field value from the process control block based on the process status field name;
a first determining module 1020, configured to determine that the process is in a first preset state when the process state field value is in an uninterruptible wait state.
The switching times field information includes the field value of the active switching times field and the field value of the passive switching times field. Accordingly, referring to fig. 11, the repeated execution module 950 further includes a comparison module 1100, and the comparison module 1100 includes:
a first comparing module 1110, configured to compare the field value of the active switching times field at the current time with the field value of the active switching times field at the previous time;
a second comparing module 1120, configured to compare the field value of the passive switching times field at the current time with the field value of the passive switching times field at the previous time;
a second determination module 1130, configured to determine that the comparison result is consistent if the field value of the active switching times field at the current time is consistent with the field value of the active switching times field at the previous time, and the field value of the passive switching times field at the current time is consistent with the field value of the passive switching times field at the previous time;
a third determining module 1140, configured to determine that the comparison result is inconsistent if the field value of the active switching times field at the current time is inconsistent with the field value of the active switching times field at the previous time, or the field value of the passive switching times field at the current time is inconsistent with the field value of the passive switching times field at the previous time.
The first preset condition includes: the comparison result of the switching times field information at the current time and the switching times field information at the previous time is inconsistent and the number of repeated executions is less than a preset number, or the number of repeated executions reaches the preset number. Accordingly, referring to fig. 12, the process state determining module 960 includes:
a first determining module 1210, configured to determine that the process to be detected is in a normal state when the first preset condition met at the end of the loop is that the comparison result of the switching times field information at the current time and the switching times field information at the previous time is inconsistent and the number of repeated executions is less than the preset number;
a second determining module 1220, configured to determine that the process to be detected is in an abnormal state when the first preset condition met at the end of the loop is that the number of repeated executions reaches the preset number.
Referring to fig. 13, there is shown another abnormal state detection apparatus of a process, the apparatus comprising:
a file creating module 1310 for creating a state detection result file in advance;
an abnormal state duration determining module 1320, configured to determine, when it is determined that the process to be detected is in an abnormal state, a duration that the process to be detected is in the abnormal state;
the output module 1330 is configured to output the process identifier of the process to be detected and the time length in the abnormal state to the state detection result file.
Referring to FIG. 14, a process monitoring module 1400 is shown, the process monitoring module 1400 comprising:
an instruction response module 1410, configured to respond to the process exception monitoring instruction, and obtain the state detection result file;
and a result returning module 1420, configured to return the process identifiers of the processes in the state detection result file and the corresponding durations in the abnormal states.
The device provided in the above embodiments can execute the method provided in any embodiment of the present application, and has corresponding functional modules and beneficial effects for executing the method. Technical details not described in detail in the above embodiments may be referred to a method provided in any of the embodiments of the present application.
The present embodiments also provide a computer-readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions that is loaded by a processor and performs any of the methods described above in the present embodiments.
The present embodiment also provides a device, shown in fig. 15. The device 1500 may vary considerably with configuration or performance, and may include one or more Central Processing Units (CPUs) 1522 (e.g., one or more processors), a memory 1532, and one or more storage media 1530 (e.g., one or more mass storage devices) storing applications 1542 or data 1544. The memory 1532 and the storage media 1530 may be transient or persistent storage. The program stored on the storage medium 1530 may include one or more modules (not shown), each of which may include a series of instruction operations for the device. Further, the central processor 1522 may be configured to communicate with the storage medium 1530 and to execute, on the device 1500, the series of instruction operations in the storage medium 1530. The device 1500 may also include one or more power supplies 1526, one or more wired or wireless network interfaces 1550, one or more input-output interfaces 1558, and/or one or more operating systems 1541, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on. Any of the methods described above in this embodiment can be implemented based on the device shown in fig. 15.
The present specification provides the method operation steps as described in the embodiments or flowcharts, but more or fewer steps may be included based on routine or non-inventive labor. The order of steps recited in the embodiments is only one of many possible execution orders and does not represent the only order of execution. When an actual system or terminal product executes, the steps may be performed sequentially or in parallel (e.g., in a parallel-processor or multi-threaded environment) according to the embodiments or the methods shown in the figures.
The configurations shown in the present embodiment are only partial configurations related to the present application, and do not constitute a limitation on the devices to which the present application is applied, and a specific device may include more or less components than those shown, or combine some components, or have an arrangement of different components. It should be understood that the methods, apparatuses, and the like disclosed in the embodiments may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a division of one logic function, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or unit modules.
Based on such understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A method for detecting abnormal state of a process is characterized by comprising the following steps:
acquiring a process identifier of a process to be detected, and determining a process control block corresponding to the process identifier, wherein the process control block is used for recording a plurality of field information of the process, and the plurality of field information is updated along with the running of the process;
in one cycle, the following steps are performed:
acquiring process state field information of the current moment from the process control block, and judging whether the process to be detected is in a first preset state or not based on the process state field information of the current moment;
when the process to be detected is in a first preset state, acquiring switching frequency field information of the current moment from the process control block;
acquiring switching frequency field information of the previous moment of the current moment;
comparing the switching frequency field information of the current moment with the switching frequency field information of the previous moment, and when the comparison result is consistent, repeatedly executing the steps in the cycle until a first preset condition is met;
and determining the process state of the process to be detected based on the first preset condition met at the end of the cycle.
2. The method of claim 1, wherein the process status field information comprises a process status field name and a process status field value;
correspondingly, the acquiring the process state field information of the current time from the process control block, and determining whether the process to be detected is in the first preset state based on the process state field information of the current time includes:
acquiring the corresponding process state field value from the process control block based on the process state field name;
and when the process state field value is in an uninterruptible waiting state, judging that the process is in a first preset state.
3. The method according to claim 1, wherein the field information of the number of times of switching includes: the field value of the active switching time field and the field value of the passive switching time field;
correspondingly, the comparing the field information of the switching times at the current moment with the field information of the switching times at the previous moment includes:
comparing the field value of the active switching times field at the current moment with the field value of the active switching times field at the previous moment;
comparing the field value of the field of the passive switching times at the current moment with the field value of the field of the passive switching times at the previous moment;
if the field value of the active switching time field at the current moment is consistent with the field value of the active switching time field at the previous moment, and the field value of the passive switching time field at the current moment is consistent with the field value of the passive switching time field at the previous moment, judging that the comparison result is consistent;
and if the field value of the field with the active switching times at the current moment is inconsistent with the field value of the field with the active switching times at the previous moment, or the field value of the field with the passive switching times at the current moment is inconsistent with the field value of the field with the passive switching times at the previous moment, judging that the comparison result is inconsistent.
4. The method of claim 1, further comprising:
and if the comparison result of the switching frequency field information at the current moment and the switching frequency field information at the previous moment is inconsistent, determining that the process is in a normal state.
5. The method according to claim 1, wherein the first preset condition comprises: the comparison result of the switching frequency field information at the current moment and the switching frequency field information at the previous moment is inconsistent, and the repeated execution frequency is less than the preset frequency, or the repeated execution frequency reaches the preset frequency;
correspondingly, the determining the process state of the process to be detected based on the first preset condition met at the end of the loop includes:
when the first preset condition met at the end of the cycle is that the comparison result of the switching frequency field information at the current moment and the switching frequency field information at the previous moment is inconsistent and the repeated execution frequency is less than the preset frequency, determining that the process to be detected is in a normal state;
and when the first preset condition met by the circulation end is that the repeated execution times reach the preset times, determining that the process to be detected is in an abnormal state.
6. The method of claim 5, further comprising:
creating a state detection result file in advance;
when the process to be detected is determined to be in the abnormal state, determining the duration of the process to be detected in the abnormal state;
and outputting the process identifier of the process to be detected and the time length in the abnormal state to the state detection result file.
7. The method of claim 6, further comprising:
responding to a process abnormity monitoring instruction, and acquiring the state detection result file;
and returning the process identifiers of the processes in the state detection result file and the corresponding duration in the abnormal state.
8. An abnormal state detection apparatus for a process, comprising:
the process control block determining module is used for acquiring a process identifier of a process to be detected and determining a process control block corresponding to the process identifier, wherein the process control block is used for recording a plurality of pieces of field information of the process, and the plurality of pieces of field information are updated along with the running of the process;
the first preset state judgment module is used for executing the following steps in a cycle: acquiring process state field information of the current moment from the process control block, and judging whether the process to be detected is in a first preset state or not based on the process state field information of the current moment;
the first acquisition module is used for acquiring switching frequency field information of the current moment from the process control block when the process to be detected is in a first preset state;
the second acquisition module is used for acquiring switching time field information of the previous moment of the current moment;
the repeated execution module is used for comparing the switching frequency field information at the current moment with the switching frequency field information at the previous moment, and when the comparison result is consistent, the steps in the cycle are repeatedly executed until a first preset condition is met;
and the process state determining module is used for determining the process state of the process to be detected based on the first preset condition met when the circulation is finished.
9. An apparatus comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, set of codes, or set of instructions, the at least one instruction, the at least one program, set of codes, or set of instructions being loaded and executed by the processor to implement the abnormal state detection method of the process of any of claims 1 to 7.
10. A computer storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions that is loaded by a processor and that performs a method of abnormal state detection of a process as claimed in any one of claims 1 to 7.
CN201911096474.XA 2019-11-11 2019-11-11 Method, device and equipment for detecting abnormal state of process and storage medium Active CN110825593B (en)


Publications (2)

Publication Number Publication Date
CN110825593A true CN110825593A (en) 2020-02-21
CN110825593B CN110825593B (en) 2022-08-23

Family

ID=69554001

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911096474.XA Active CN110825593B (en) 2019-11-11 2019-11-11 Method, device and equipment for detecting abnormal state of process and storage medium

Country Status (1)

Country Link
CN (1) CN110825593B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112989323A (en) * 2021-02-03 2021-06-18 成都欧珀通信科技有限公司 Process detection method, device, terminal and storage medium
CN113495832A (en) * 2020-04-05 2021-10-12 杭州迪普科技股份有限公司 Cache region leakage detection system and method thereof

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07239820A (en) * 1994-03-01 1995-09-12 Nippon Telegr & Teleph Corp <Ntt> Method for detecting abnormal operation of communication software and device therefor
JP2006277115A (en) * 2005-03-28 2006-10-12 Fujitsu Ten Ltd Abnormality detection program and abnormality detection method
CN1924810A (en) * 2005-09-02 2007-03-07 中兴通讯股份有限公司 Distributed control method in priority for operation process
CN1996257A (en) * 2006-12-26 2007-07-11 华为技术有限公司 Method and system for monitoring process
CN103986762A (en) * 2014-05-15 2014-08-13 京信通信系统(中国)有限公司 Process state detection method and device
CN104331357A (en) * 2014-10-10 2015-02-04 北京金山安全软件有限公司 Application program abnormity detection method and device and mobile terminal
CN105988905A (en) * 2015-02-12 2016-10-05 中兴通讯股份有限公司 Exception processing method and apparatus
CN106972951A (en) * 2017-02-27 2017-07-21 杭州天宽科技有限公司 A kind of automatic maintenance implementation method based on multiple related function module abnormality detections
CN107133167A (en) * 2017-04-24 2017-09-05 北京北信源软件股份有限公司 The abnormal method and device of real-time monitoring process under a kind of linux system
CN108388797A (en) * 2018-01-23 2018-08-10 北京奇艺世纪科技有限公司 A kind of intrusion detection method, device and electronic equipment
CN109753411A (en) * 2019-01-17 2019-05-14 Oppo广东移动通信有限公司 Abnormality eliminating method, device, mobile terminal and storage medium
CN109858244A (en) * 2019-01-16 2019-06-07 四川大学 Process exception behavioral value method and system in a kind of container
CN110362418A (en) * 2019-07-09 2019-10-22 腾讯科技(深圳)有限公司 A kind of abnormal data restoration methods, device, server and storage medium

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07239820A (en) * 1994-03-01 1995-09-12 Nippon Telegr & Teleph Corp <Ntt> Method for detecting abnormal operation of communication software and device therefor
JP2006277115A (en) * 2005-03-28 2006-10-12 Fujitsu Ten Ltd Abnormality detection program and abnormality detection method
CN1924810A (en) * 2005-09-02 2007-03-07 中兴通讯股份有限公司 Distributed control method in priority for operation process
CN1996257A (en) * 2006-12-26 2007-07-11 华为技术有限公司 Method and system for monitoring process
CN103986762A (en) * 2014-05-15 2014-08-13 京信通信系统(中国)有限公司 Process state detection method and device
US20170212807A1 (en) * 2014-10-10 2017-07-27 Beijing Kingsoft Internet Security Software Co., Ltd. Method for detecting abnormal application and mobile terminal
CN104331357A (en) * 2014-10-10 2015-02-04 北京金山安全软件有限公司 Application program abnormity detection method and device and mobile terminal
CN105988905A (en) * 2015-02-12 2016-10-05 中兴通讯股份有限公司 Exception processing method and apparatus
CN106972951A (en) * 2017-02-27 2017-07-21 杭州天宽科技有限公司 A kind of automatic maintenance implementation method based on multiple related function module abnormality detections
CN107133167A (en) * 2017-04-24 2017-09-05 北京北信源软件股份有限公司 The abnormal method and device of real-time monitoring process under a kind of linux system
CN108388797A (en) * 2018-01-23 2018-08-10 北京奇艺世纪科技有限公司 A kind of intrusion detection method, device and electronic equipment
CN109858244A (en) * 2019-01-16 2019-06-07 四川大学 Process exception behavioral value method and system in a kind of container
CN109753411A (en) * 2019-01-17 2019-05-14 Oppo广东移动通信有限公司 Abnormality eliminating method, device, mobile terminal and storage medium
CN110362418A (en) * 2019-07-09 2019-10-22 腾讯科技(深圳)有限公司 A kind of abnormal data restoration methods, device, server and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
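LIST_LINUX: "Linux kernel debugging techniques: D-state process deadlock detection", 《HTTPS://MP.WEIXIN.QQ.COM/S/HKPF8ENEHYNBOWJ9JQWBHG》 *
黄崇滨 (Huang Chongbin): "Design and implementation of uninterrupted forwarding based on process hot restart", 《科技经济导刊》 *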

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113495832A (en) * 2020-04-05 2021-10-12 杭州迪普科技股份有限公司 Cache region leakage detection system and method thereof
CN112989323A (en) * 2021-02-03 2021-06-18 成都欧珀通信科技有限公司 Process detection method, device, terminal and storage medium
CN112989323B (en) * 2021-02-03 2024-02-13 成都欧珀通信科技有限公司 Process detection method, device, terminal and storage medium

Also Published As

Publication number Publication date
CN110825593B (en) 2022-08-23

Similar Documents

Publication Publication Date Title
US7401248B2 (en) Method for deciding server in occurrence of fault
CN108508874B (en) Method and device for monitoring equipment fault
US6078944A (en) Process management method and system
US20190138349A1 (en) Method and apparatus for migrating virtual machine
US20160140164A1 (en) Complex event processing apparatus and complex event processing method
US9864641B2 (en) Method for managing workloads in a multiprocessing computer system
US20080295095A1 (en) Method of monitoring performance of virtual computer and apparatus using the method
CN110825593B (en) Method, device and equipment for detecting abnormal state of process and storage medium
US20130080502A1 (en) User interface responsiveness monitor
JP5889332B2 (en) Activity recording system for concurrent software environments
Zhou et al. Bigroots: An effective approach for root-cause analysis of stragglers in big data system
JP5267681B2 (en) Performance data collection method, performance data collection device, and performance data management system
CN106528318B (en) Thread dead loop detection method and device
JPWO2005017736A1 (en) System and program for detecting bottleneck in disk array device
US8516482B2 (en) Virtual machine assigning method and storage medium thereof, information processing device having virtual machine environment
US11212174B2 (en) Network management device and network management method
US11706086B1 (en) Method and system for monitoring switch on basis of BMC, and device and medium
JP2009251871A (en) Contention analysis device, contention analysis method, and program
CN109446034B (en) Method and device for reporting crash event, computer equipment and storage medium
JP6064571B2 (en) Processing program, processing method, and processing apparatus
CN111506422B (en) Event analysis method and system
US9712380B2 (en) Analytical device control system
WO2013129061A1 (en) Control system for simultaneous number of connections, control server for simultaneous number of connections, control method for simultaneous number of connections and control program for simultaneous number of connections
KR102427477B1 (en) Apply multiple elements method for workload analysis in the micro data center
CN118034945A (en) Analysis method of kernel soft deadlock, processor and computing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code Ref country code: HK; Ref legal event code: DE; Ref document number: 40021123; Country of ref document: HK

GR01 Patent grant