CN110704281A - Method for monitoring system operation

Info

Publication number: CN110704281A
Application number: CN201910971300.7A
Authority: CN
Inventors: 黄刚; 张小亮; 李若寒; 李岩
Original assignee: Shandong Chaoyue CNC Electronics Co Ltd
Current assignee: Shandong Chaoyue CNC Electronics Co Ltd
Priority date: 2019-10-14
Filing date: 2019-10-14
Publication date: 2020-01-17

Abstract

The invention discloses a method for monitoring system operation and relates to the field of system safety. The method relies on the fact that, after a daemon process goes down, its synchronized pid in the monitoring configuration file becomes 0; this is used to detect whether a daemon process has exited unexpectedly and to forcibly restart it. Reading the pid of each process from the configuration file and having each daemon process record its own pid together form an asynchronous mechanism, so that an abnormality in any daemon process can be found in time without mutual interference and normal operation of the system is ensured. The state of each module of the system is also monitored, system alarming and log storage can be started in one step, and the user can see the abnormal state of each module intuitively.

Description

Method for monitoring system operation

Technical Field

The invention discloses a method for monitoring system operation, and relates to the field of system safety.

Background

A software system is a computer system composed of system software, support software, and application software; it is the software part of a computer system. With the development of informatization, software systems are used more and more widely in equipment, interaction between software systems and hardware systems is more and more common, and the stability and disaster tolerance of an equipment system during operation are more and more important. The invention discloses a method for monitoring system operation. After a daemon process of the system is started, the pid of the daemon process is synchronized into a monitoring configuration file; the monitoring configuration file is read to check whether the pid of the daemon process is 0, and any daemon process whose pid is 0 is forcibly restarted. Meanwhile, the status data of each module of the system is monitored and acquired. In this way the state of each software daemon process and the running state of each unit module can be monitored: data communication is carried out with each module periodically to obtain its running state, and alarming and log storage are performed when the running state is abnormal.

Disclosure of Invention

The invention provides a method for monitoring the operation of a system, which can monitor the state of each module of the system, carry out data communication with each module at regular time to acquire the operation state of each module, and realize the functions of alarming and log storage when the operation state is abnormal.

The specific scheme provided by the invention is as follows:

A method for monitoring system operation: after a daemon process of the system is started, the pid of the daemon process is synchronized into a monitoring configuration file; the monitoring configuration file is read to check whether the pid of the daemon process is 0, and any daemon process whose pid is 0 is forcibly restarted;

meanwhile, the status data of each module of the system is monitored and acquired.

In the method, the monitoring configuration file contains the key functions of all daemon processes to be monitored. When the pid of a daemon process is 0, that is, when the daemon process has exited unexpectedly, the corresponding start command is called to start the key function, and the daemon process is thereby forcibly restarted.

In the method, the pid of each daemon process in the system monitoring configuration file is read at regular intervals.
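As a non-limiting illustration, one monitoring pass over the configuration file might be sketched as follows. The JSON layout, the field names `name`, `pid`, and `start_command`, and all function names are assumptions made for this sketch and are not specified by the invention; the 10-second default matches the interval used in the embodiment. The sketch also assumes that the pid in the file is written, and reset to 0 on exit, by a mechanism outside the monitor.

```python
import json
import shlex
import subprocess
import time


def load_entries(config_path):
    """Read the monitoring configuration file (assumed here to be JSON)."""
    with open(config_path, "r", encoding="utf-8") as f:
        return json.load(f)["daemons"]


def exited_daemons(entries):
    """Return the entries whose recorded pid is 0, i.e. daemons treated as down."""
    return [e for e in entries if int(e.get("pid", 0)) == 0]


def default_launcher(start_command):
    """Start a daemon from its start command without waiting for it to finish."""
    subprocess.Popen(shlex.split(start_command))


def check_once(config_path, launcher=default_launcher):
    """One monitoring pass: restart every daemon whose recorded pid is 0.

    Returns the names of the daemons for which a restart was issued.
    """
    restarted = []
    for entry in exited_daemons(load_entries(config_path)):
        launcher(entry["start_command"])
        restarted.append(entry["name"])
    return restarted


def monitor(config_path, interval_seconds=10, launcher=default_launcher):
    """Poll the configuration file at a fixed interval (runs until interrupted)."""
    while True:
        check_once(config_path, launcher)
        time.sleep(interval_seconds)
```

Passing the launcher as a parameter keeps the polling logic separate from the way a process is actually started, so the pass can be exercised without launching real daemons.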

In the method, the status data of each module of the system is requested periodically, and whether the state of each module is normal is monitored.

In the method, a CRC (cyclic redundancy check) is performed before the status data of each module of the system is periodically requested, and the status data of each module of the system is acquired after the CRC check passes.
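As a non-limiting illustration of such a check, the sender may append a CRC value to each data packet and the receiver may recompute and compare it. The sketch below assumes CRC-32 (via Python's `zlib.crc32`) and a 4-byte big-endian trailer; the invention does not specify the CRC variant or the packet layout, and the function names are hypothetical.

```python
import struct
import zlib


def build_packet(payload: bytes) -> bytes:
    """Sender side: append the CRC-32 of the payload as a 4-byte trailer."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return payload + struct.pack(">I", crc)


def verify_packet(packet: bytes):
    """Receiver side: recompute the CRC and compare it with the transmitted one.

    Returns the payload if the check passes, otherwise None.
    """
    if len(packet) < 4:
        return None  # too short to carry a CRC trailer
    payload = packet[:-4]
    received = struct.unpack(">I", packet[-4:])[0]
    if zlib.crc32(payload) & 0xFFFFFFFF != received:
        return None
    return payload
```

In this sketch a monitored module would answer with its status data only when `verify_packet` returns a payload, and would ignore the request otherwise.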

A tool for monitoring system operation comprises a monitoring module,

wherein, after a daemon process of the system is started, the monitoring module synchronizes the pid of the daemon process into the monitoring configuration file, reads the monitoring configuration file to check whether the pid of the daemon process is 0, and forcibly restarts any daemon process whose pid is 0;

meanwhile, the monitoring module monitors and acquires the status data of each module of the system.

In the tool, the monitoring configuration file contains the key functions of all daemon processes to be monitored. When the pid of a daemon process is 0, that is, when the daemon process has exited unexpectedly, the monitoring module calls the corresponding start command to start the key function and thereby forcibly restarts the daemon process.

In the tool, the monitoring module reads the pid of each daemon process in the system monitoring configuration file at regular intervals.

In the tool, the monitoring module periodically requests the status data of each module of the system and monitors whether the state of each module is normal.

In the tool, the monitoring module performs a CRC (cyclic redundancy check) with the system before periodically requesting the status data of each module of the system, and acquires the status data of each module of the system after the CRC check passes.

The invention has the advantages that:

The invention provides a method for monitoring system operation. The method relies on the fact that, after a daemon process goes down, its synchronized pid in the monitoring configuration file becomes 0; this is used to detect whether a daemon process has exited unexpectedly and to forcibly restart it. Reading the pid of each process from the configuration file and having each daemon process record its own pid together form an asynchronous mechanism, so that an abnormality in any daemon process can be found in time without mutual interference and normal operation of the system is ensured. The state of each module of the system is also monitored, system alarming and log storage can be started in one step, and the user can see the abnormal state of each module intuitively.

Drawings

FIG. 1 is a schematic flow diagram of the process of the present invention;

FIG. 2 is a schematic flow diagram of a monitoring daemon;

FIG. 3 is a flow diagram of modules in a monitoring system.

Detailed Description

The invention provides a method for monitoring system operation, which comprises the following steps: after a daemon process of the system is started, the pid of the daemon process is synchronized into a monitoring configuration file; the monitoring configuration file is read to check whether the pid of the daemon process is 0, and any daemon process whose pid is 0 is forcibly restarted; meanwhile, the status data of each module of the system is monitored and acquired.

A tool for monitoring system operation corresponding to the method is also provided, which comprises a monitoring module that carries out the above steps.

The present invention is further described below in conjunction with the following figures and specific examples so that those skilled in the art may better understand the present invention and practice it, but the examples are not intended to limit the present invention.

Taking a network cipher machine system as an example, the method of the invention is used for monitoring the system operation, and the specific process is as follows:

After the daemon processes of the network cipher machine system are started (these include the management processes of the modules of the network cipher machine, the web interface management process, and the like), the pid of each daemon process is written into the monitoring configuration file. The monitoring configuration file is read every 10 seconds to check whether the pid of any daemon process is 0, and any daemon process whose pid is 0 is forcibly restarted.

At the same time, status data is periodically requested from each module of the system to monitor whether each module is in a normal state. The modules of the system may be a CPU, a communication module, a cryptographic processing module, and the like, and the acquired data may include CPU utilization, memory utilization, hard disk utilization, running time, communication efficiency, cryptographic access data, and the like. When the running status data is abnormal or exceeds a certain threshold, an alarm thread is started for alarming and log storage.
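As a non-limiting illustration, the threshold comparison might be sketched as follows. The metric names and the limit values are hypothetical; the embodiment does not state concrete thresholds. Treating a missing reading as abnormal is likewise an assumption of this sketch.

```python
# Hypothetical limits in percent; the embodiment does not give concrete values.
THRESHOLDS = {"cpu_usage": 90.0, "memory_usage": 85.0, "disk_usage": 95.0}


def find_abnormal(status, thresholds=THRESHOLDS):
    """Return (metric, value, limit) for every reading that is missing or over its limit."""
    abnormal = []
    for metric, limit in thresholds.items():
        value = status.get(metric)
        if value is None:
            # Assumption: a module that reports no value is treated as abnormal.
            abnormal.append((metric, None, limit))
        elif value > limit:
            abnormal.append((metric, value, limit))
    return abnormal
```

A non-empty result would be the point at which the alarm thread is started.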

In this process, the monitoring configuration file may include the key functions of all daemon processes to be monitored, together with the name, pid, start command, and the like of each daemon process. When the pid of a daemon process is 0 or the process does not respond, the daemon process is considered to have exited unexpectedly, and the corresponding start command is called to start the key function, so that the daemon process is forcibly restarted.

Meanwhile, a CRC check is performed before the status data of each module of the system is periodically requested. When system status data is sent, the sending party calculates the CRC value of the data packet and adds it to the data packet; the receiving party, namely the monitored module, calculates the CRC value after receiving the data packet and compares it with the CRC value carried in the data packet. If the CRC check passes, a running status data packet is returned, which may contain status data of each module of the system, such as system running time, CPU utilization, memory utilization, hard disk utilization, and the like.

When the status data acquired for a module is abnormal, an alarm thread is started, the buzzer and indicator lamp are activated to raise the alarm, and the abnormal information is written into a log database.
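As a non-limiting illustration, starting an alarm thread that signals the alarm and stores the abnormal information might be sketched as follows. SQLite, the `alarm_log` table layout, and the `signal` callback (a stand-in for driving the buzzer and indicator lamp) are assumptions of this sketch; the embodiment does not name a database or a hardware interface.

```python
import sqlite3
import threading
import time


def record_alarm(db_path, module, message, signal=None):
    """Signal the alarm and store the abnormal information in the log database."""
    if signal is not None:
        signal(module, message)  # stand-in for the buzzer and indicator lamp
    # The connection is opened inside the thread that uses it.
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS alarm_log "
                "(ts REAL, module TEXT, message TEXT)"
            )
            conn.execute(
                "INSERT INTO alarm_log VALUES (?, ?, ?)",
                (time.time(), module, message),
            )
    finally:
        conn.close()


def start_alarm_thread(db_path, module, message, signal=None):
    """Run record_alarm in its own thread so that monitoring is not blocked."""
    thread = threading.Thread(
        target=record_alarm,
        args=(db_path, module, message, signal),
        daemon=True,
    )
    thread.start()
    return thread
```

Running the alarm in a separate thread means a slow database write or hardware signal does not delay the next monitoring pass.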

Still taking the network cipher machine system as an example, the tool of the invention is used for monitoring the system operation, and the specific process is as follows:

After the daemon processes of the network cipher machine system are started (these include the management processes of the modules of the network cipher machine, the web interface management process, and the like), the monitoring module of the tool writes the pid of each daemon process into the monitoring configuration file, reads the monitoring configuration file every 10 seconds to check whether the pid of any daemon process is 0, and forcibly restarts any daemon process whose pid is 0.

Meanwhile, the monitoring module periodically requests status data from each module of the system and monitors whether each module is in a normal state. The modules of the system may be a CPU, a communication module, a cryptographic processing module, and the like, and the data acquired by the monitoring module may include CPU utilization, memory utilization, hard disk utilization, running time, communication efficiency, cryptographic access data, and the like. When the running status data is abnormal or exceeds a certain threshold, the monitoring module starts an alarm thread to raise the alarm and perform log storage.

In this process, the monitoring configuration file may include the key functions of all daemon processes to be monitored, together with the name, pid, start command, and the like of each daemon process. When the pid of a daemon process is 0 or the process does not respond, the daemon process is considered to have exited unexpectedly, and the monitoring module calls the corresponding start command to start the key function and forcibly restart the daemon process.

Meanwhile, the monitoring module performs a CRC check with the system before periodically requesting the status data of each module of the system. When system status data is sent, the monitoring module, as the sending party, calculates the CRC value of the data packet and adds it to the data packet; after receiving the data packet, the receiving party calculates the CRC value and compares it with the CRC value carried in the data packet. After the CRC check passes, a running status data packet is returned to the monitoring module, which may contain status data of each module of the system, such as system running time, CPU utilization, memory utilization, hard disk utilization, and the like.

When the monitoring module finds that the status data acquired for a module is abnormal, it starts an alarm thread, activates the buzzer and indicator lamp to raise the alarm, and writes the abnormal information into a log database.

The method or the tool can monitor each software process of the system. After a process goes down unexpectedly, its pid in the configuration file becomes 0; the configuration file is read periodically, and when the pid of a process is found to be 0, the daemon process is forcibly restarted to ensure normal operation of the system. In addition, the state of each module of the system is monitored: data communication is carried out with each module periodically to obtain its running state, and alarming and log storage are performed when the running state is abnormal.

The above-mentioned embodiments are merely preferred embodiments for fully illustrating the present invention, and the scope of the present invention is not limited thereto. Equivalent substitutions or changes made by those skilled in the art on the basis of the invention all fall within the protection scope of the invention. The protection scope of the invention is subject to the claims.

Claims

1. A method for monitoring system operation, characterized in that after a daemon process of the system is started, the pid of the daemon process is synchronized into a monitoring configuration file, the monitoring configuration file is read to check whether the pid of the daemon process is 0, and any daemon process whose pid is 0 is forcibly restarted; and meanwhile, the status data of each module of the system is monitored and acquired.

2. The method as claimed in claim 1, wherein the monitoring configuration file includes the key functions of all daemon processes to be monitored, and when the pid of a daemon process is 0, that is, when the daemon process has exited unexpectedly, the corresponding start command is called to start the key function, so as to forcibly restart the daemon process.

3. A method as claimed in claim 1 or 2, wherein the pid of each daemon in the system monitoring configuration file is read at regular intervals.

4. A method as claimed in claim 3, in which status data is periodically requested from the modules of the system to monitor the status of the modules.

5. The method as claimed in claim 4, wherein a CRC check is performed before the status data of the modules of the system is periodically requested, and the status data of the modules of the system is acquired after the CRC check passes.

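6. A tool for monitoring the operation of a system, characterized by comprising a monitoring module, wherein, after a daemon process of the system is started, the monitoring module synchronizes the pid of the daemon process into a monitoring configuration file, reads the monitoring configuration file to check whether the pid of the daemon process is 0, and forcibly restarts any daemon process whose pid is 0; and meanwhile, the monitoring module monitors and acquires the status data of each module of the system.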

7. The tool of claim 6, wherein the monitoring configuration file includes the key functions of all daemon processes to be monitored, and when the pid of a daemon process is 0, that is, when the daemon process has exited unexpectedly, the monitoring module calls the corresponding start command to start the key function, so as to forcibly restart the daemon process.

8. A tool as claimed in claim 6 or claim 7, wherein the monitoring module reads the pid of each daemon process in the system monitoring configuration file at regular intervals.

9. The tool of claim 8, wherein the monitoring module periodically requests status data of the modules of the system to monitor whether the status of the modules is normal.

10. The tool of claim 9, wherein the monitoring module performs a CRC check with the system before periodically requesting status data from the modules of the system, and wherein the status data of the modules of the system is acquired after the CRC check passes.