CN112612635B - Multi-level protection method for application program - Google Patents


Info

Publication number
CN112612635B
Authority
CN
China
Prior art keywords
daemon
daemon thread
thread
abnormal
level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011515855.XA
Other languages
Chinese (zh)
Other versions
CN112612635A (en)
Inventor
邬惠峰
胡俊杰
侯丰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Hangzhou Dianzi University Shangyu Science and Engineering Research Institute Co Ltd
Original Assignee
Hangzhou Dianzi University
Hangzhou Dianzi University Shangyu Science and Engineering Research Institute Co Ltd
Application filed by Hangzhou Dianzi University and Hangzhou Dianzi University Shangyu Science and Engineering Research Institute Co Ltd
Priority to CN202011515855.XA
Publication of CN112612635A
Application granted
Publication of CN112612635B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G06F11/0754 Error or fault detection not based on redundancy by exceeding limits
    • G06F11/0757 Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/563 Data redirection of data network streams

Abstract

The invention provides a multi-level protection method for an application program. It uses a hardware watchdog, a daemon process and daemon threads to provide reliability guarantees down to the code segment level, and the daemon process exposes, over the network, functions for viewing and controlling all daemon threads of a terminal. When a monitored unit becomes abnormal, the exception information is pushed to a specified target according to the configured working mode and recovery is carried out automatically; alternatively, only the information is pushed and the daemon thread waits for an external command, preserving the fault scene to facilitate troubleshooting. To use a daemon thread, a call to the daemon thread creation interface of the daemon thread pool is added at the entry of the monitored unit's code, which activates an idle daemon thread, and a call to the daemon thread release interface of the daemon thread pool is added at the exit of the monitored unit's code, which returns the daemon thread to the idle state. In this way, any code in the application can be flexibly monitored with daemon threads.

Description

Multi-level protection method for application program
Technical Field
The invention relates to the field of embedded development, in particular to a multi-level protection method for an application program.
Background
With the development of the internet, a large number of data rooms are needed to support the internet services people use in daily life. Operating a data room requires suitable power and environmental conditions, so the power environment of each data room must be monitored in real time by a monitoring terminal. To provide a high-quality monitoring service, the monitoring terminal itself must run stably. Because it is affected by many factors, the terminal cannot be guaranteed to be entirely fault-free, so it needs a degree of fault self-recovery: when the application software running on the terminal fails, it should return to a normal state automatically, without human intervention. Apart from faults that the application software can detect and recover from by itself, crashes and hangs of the application software generally have to be detected and handled by a watchdog. Watchdogs are generally divided into hardware watchdogs and software watchdogs. A hardware watchdog is generally used for terminal-level monitoring, while a software watchdog is generally used for process-level or thread-level detection: the watchdog is reset in the main flow of the process or thread, and when the application software crashes or that main flow hangs, the watchdog times out and calls a preset interface to restart the system or the application program so that it returns to a normal state. However, when code outside the main flow of the process or thread hangs without affecting the code that resets the watchdog, the watchdog is still reset normally and cannot detect the abnormality in time, so the application software remains in an abnormal state.
In addition, current watchdogs generally perform the reset directly when an abnormality occurs and cannot preserve the fault scene, which hinders troubleshooting.
Disclosure of Invention
To solve the technical problems in the prior art, the invention provides a multi-level protection method for an application program. A hardware watchdog, a daemon process and daemon threads are used to provide reliability guarantees down to the code segment level, and the daemon process exposes, over the network, functions for viewing and controlling all daemon threads of a terminal. When network resources are limited, multiple terminals can be networked in a cascade so that the information is aggregated at one terminal, which then provides a unified external interface. When the heartbeat between a daemon thread and its monitored unit is abnormal, the exception information is pushed to a specified target according to the configured working mode and recovery is carried out automatically; alternatively, only the information is pushed and the daemon thread waits for an external command, preserving the fault scene to facilitate troubleshooting.
In order to overcome the technical defects in the prior art, the technical scheme of the invention is as follows:
A multi-level protection method for an application program, in which multiple terminals are connected in a cascade and a monitoring module is provided in each terminal. The monitoring module monitors the operation of each level of the application program's code and carries out a recovery operation on the code of the relevant level when the application program runs abnormally, wherein:
a daemon process is started after the terminal system starts, and each daemon process registers with its upper-level daemon process;
a plurality of daemon threads are started at the corresponding positions of the application software, and the daemon process manages the daemon threads through a daemon thread pool;
the daemon processes of the terminals at all levels communicate in a cascade; each daemon process resets the hardware watchdog at regular intervals and keeps a heartbeat with the daemon thread pool in the application software, thereby monitoring the running state of the application software and carrying out further processing according to the configured working mode.
As a further improvement, the daemon process information is uploaded and aggregated level by level until it is collected at the main terminal, which then provides a unified external interface.
As a further improvement, when a monitored unit becomes abnormal, the exception information is pushed to a specified target according to the configured working mode and recovery is carried out automatically; alternatively, only the information is pushed and the daemon thread waits for an external command, preserving the fault scene.
As a further improvement, when a daemon thread is used, a call to the daemon thread creation interface of the daemon thread pool is added at the entry of the monitored unit's code to activate an idle daemon thread, and a call to the daemon thread release interface of the daemon thread pool is added at the exit of the monitored unit's code to return the daemon thread to the idle state, so that any code in the application can be flexibly monitored with daemon threads.
As a further improvement, the monitoring module monitors the application program at five levels: system level, process level, thread level, function level and code segment level. Each level is divided into a number of monitored units according to the monitoring requirements. The hardware watchdog monitors the system level. The daemon process is an independent process that monitors the process level and guarantees the reliability of the processes other than itself. The daemon threads reside inside a process and monitor the thread level, function level and code segment level, guaranteeing the reliability of the threads, functions and code segments in that process.
As a further improvement, the daemon thread has the following attributes:
thread tid: assigned by the system when the daemon thread is created, and unique within the process;
timeout: the daemon thread holds a reference time T_ref and a current time T_current; T_ref is set when the daemon thread is initialized or is set by the monitored unit; while running, the daemon thread periodically obtains T_current and compares it with T_ref, and when the difference between T_current and T_ref exceeds the timeout, the daemon thread considers the monitored unit abnormal;
running state: the value is normal while the application software runs normally and abnormal when the monitored unit becomes abnormal;
automatic exception handling function pointer: set by the monitored unit; when the monitored unit becomes abnormal, the daemon thread calls the function it points to in order to carry out recovery automatically;
exception waiting handling function pointer: set by the monitored unit; when the monitored unit becomes abnormal, the daemon thread calls the function it points to; after that function has executed, the exception information is pushed out and the daemon thread enters a waiting state to preserve the fault scene;
working mode: either automatic mode or waiting mode; in automatic mode the daemon thread calls the automatic exception handling function when the monitored unit becomes abnormal, otherwise it calls the exception waiting handling function;
source file of the calling code: the source file in which the monitored unit is located; this attribute records the source file of the monitored unit after an abnormality occurs;
line number of the calling code: the starting line number of the code corresponding to the monitored unit; this attribute records that starting line number after an abnormality occurs.
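The attribute list above can be pictured as a small record with a timeout check. The following is a minimal sketch, not the patented implementation: the names `DaemonThreadRecord`, `Mode` and `check` are hypothetical, the handlers are plain Python callables standing in for the function pointers, and the step that pushes exception information after the waiting handler runs is omitted.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Mode(Enum):
    AUTO = "auto"   # recover automatically
    WAIT = "wait"   # run the waiting handler, then hold for an external command


@dataclass
class DaemonThreadRecord:
    """Hypothetical record mirroring the attributes listed in the text."""
    tid: int
    timeout_s: float
    t_ref: float                      # reference time T_ref
    on_auto: Callable[[], None]       # automatic exception handling function
    on_wait: Callable[[], None]       # exception waiting handling function
    mode: Mode = Mode.AUTO
    source_file: str = ""
    line_no: int = 0
    state: str = "normal"

    def check(self, t_current: Optional[float] = None) -> bool:
        """Return True if the monitored unit is considered abnormal."""
        if t_current is None:
            t_current = time.monotonic()
        # Abnormal only when T_current - T_ref exceeds the timeout.
        if t_current - self.t_ref <= self.timeout_s:
            return False
        self.state = "abnormal"
        handler = self.on_auto if self.mode is Mode.AUTO else self.on_wait
        handler()
        return True
```

Passing `t_current` explicitly makes the comparison easy to exercise without waiting for real time to elapse; a running daemon thread would call `check()` periodically with no argument.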
As a further improvement, a daemon thread is initialized dynamically by the monitored unit. Specifically, at the entry of the monitored unit, the daemon thread creation interface of the daemon thread pool is called to issue a creation request containing the timeout, the automatic exception handling function, the exception waiting handling function, and the source file and line number of the code that calls the creation interface; the daemon thread takes effect immediately after it is created successfully. At the end of the monitored unit, the daemon thread release interface of the daemon thread pool is called to issue a release request containing the tid of the daemon thread.
As a further improvement, when the monitored unit calls the daemon thread creation interface of the daemon thread pool to issue a creation request, the daemon thread pool, on receiving the request, searches for an idle daemon thread, sets the parameters contained in the request, sets T_ref to the current time, and returns the tid of the daemon thread.
As a further improvement, the daemon thread pool is an independent thread within the process. After the process that hosts it starts, the daemon thread pool registers with the daemon process through a network interface, actively sending the name of its process during registration. After registration succeeds, the daemon thread pool actively pushes the change information to the daemon process whenever the state of one of its daemon threads changes, and it also sends a heartbeat to the daemon process at regular intervals. The daemon thread pool provides the daemon threads with an exception information push interface, so that when a daemon thread becomes abnormal the exception information is pushed to the daemon process promptly.
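The source does not specify a wire format for the traffic between the daemon thread pool and the daemon process. As a rough illustration only, the three kinds of messages described above (registration, state change, heartbeat) could be encoded as JSON; every field name and function name below is an assumption made for this sketch.

```python
import json
import time
from typing import Optional


def registration_message(process_name: str) -> str:
    """Sent once by the pool after its host process starts (hypothetical format)."""
    return json.dumps({"type": "register", "process": process_name})


def state_change_message(process_name: str, tid: int, state: str) -> str:
    """Pushed whenever the state of a daemon thread in the pool changes."""
    return json.dumps({"type": "state_change", "process": process_name,
                       "tid": tid, "state": state})


def heartbeat_message(process_name: str, now: Optional[float] = None) -> str:
    """Sent at regular intervals so the daemon process can detect a silent pool."""
    if now is None:
        now = time.time()
    return json.dumps({"type": "heartbeat", "process": process_name, "time": now})
```

The strings would then be written to whatever network connection links the pool to the daemon process; that transport is left out because the source gives no detail about it.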
As a further improvement, the daemon process performs the following operations:
1) managing each daemon thread pool in the terminal: when a daemon thread pool registers, its information is stored in a mapping table in which the terminal serial number and the name of the process corresponding to the daemon thread pool form the composite key, and the values are whether a heartbeat needs to be sent, the last heartbeat time, the heartbeat timeout of the daemon thread, the exception handling mode, the running state and the daemon thread mapping table, wherein:
whether a heartbeat needs to be sent: a value of 1 means a heartbeat needs to be sent, and a value of 0 means it does not;
last heartbeat time: the time at which the daemon process last received a heartbeat from the daemon thread pool;
heartbeat timeout: when the difference between the current time and the time at which the heartbeat of the daemon thread pool was last received exceeds this value, the daemon thread pool is considered abnormal;
exception handling mode: either automatic mode or waiting mode; in automatic mode, when the daemon thread pool becomes abnormal, the daemon process first sends the exception information to the specified target and then handles the abnormality automatically; in waiting mode, the daemon process enters a waiting state after sending the exception information;
running state: whether the monitored daemon thread pool is currently normal or abnormal;
daemon thread mapping table: stores the information of the daemon threads contained in the monitored daemon thread pool;
2) managing the daemon processes in other terminals: when the daemon process of another terminal registers with this daemon process, the daemon thread pool information held by that daemon process is added to the local mapping table, with the whether-to-send-heartbeat value set to 0;
3) when a monitored daemon thread pool becomes abnormal, pushing the exception information to the specified target, and then either recovering automatically or entering the waiting state according to the working mode;
4) if the terminal has a hardware watchdog, resetting it at regular intervals;
5) registering with the specified target and pushing the information of all daemon thread pools in the mapping table to it; in addition, keeping a heartbeat with the specified target, and automatically restarting the terminal's system when that heartbeat is found to be abnormal.
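The mapping table in operations 1) and 2) can be sketched as a dictionary keyed by the pair (terminal serial number, process name). This is a simplified illustration under stated assumptions: `PoolEntry`, `PoolTable` and their method names are hypothetical, times are plain floats, and pushing exception information or restarting software is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (terminal serial number, process name)


@dataclass
class PoolEntry:
    send_heartbeat: int            # 1 = heartbeat expected, 0 = not expected
    last_heartbeat: float
    heartbeat_timeout_s: float
    mode: str = "auto"             # "auto" or "wait"
    state: str = "normal"
    threads: Dict[int, dict] = field(default_factory=dict)  # daemon thread sub-table


class PoolTable:
    def __init__(self) -> None:
        self.entries: Dict[Key, PoolEntry] = {}

    def register_local(self, serial: str, process: str,
                       now: float, timeout_s: float) -> None:
        """A pool on this terminal registers: a heartbeat is expected (value 1)."""
        self.entries[(serial, process)] = PoolEntry(1, now, timeout_s)

    def register_remote(self, remote: Dict[Key, PoolEntry]) -> None:
        """Another terminal's daemon process registers: copy its entries with value 0."""
        for key, e in remote.items():
            self.entries[key] = PoolEntry(0, e.last_heartbeat, e.heartbeat_timeout_s,
                                          e.mode, e.state, dict(e.threads))

    def heartbeat(self, serial: str, process: str, now: float) -> None:
        self.entries[(serial, process)].last_heartbeat = now

    def scan(self, now: float) -> List[Key]:
        """Mark and return the pools whose expected heartbeat has timed out."""
        timed_out = []
        for key, e in self.entries.items():
            if e.send_heartbeat == 1 and now - e.last_heartbeat > e.heartbeat_timeout_s:
                e.state = "abnormal"
                timed_out.append(key)
        return timed_out
```

Entries copied from another terminal are skipped by `scan` because their heartbeat flag is 0; their liveness is the responsibility of the daemon process that owns them.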
Compared with the prior art, in the technical scheme of the invention the daemon process exposes, over the network, functions for viewing and controlling all daemon threads of a terminal. When network resources are limited, multiple terminals can be networked in a cascade so that the information is aggregated at one terminal, which then provides a unified external interface. When a monitored unit becomes abnormal, the exception information is pushed to a specified target according to the configured working mode and recovery is carried out automatically; alternatively, only the information is pushed and the daemon thread waits for an external command, preserving the fault scene to facilitate troubleshooting. To use a daemon thread, a call to the daemon thread creation interface of the daemon thread pool is added at the entry of the monitored unit's code, which activates an idle daemon thread, and a call to the daemon thread release interface of the daemon thread pool is added at the exit of the monitored unit's code, which returns the daemon thread to the idle state, so that any code in the application can be flexibly monitored with daemon threads. With the technical scheme of the invention, reliability guarantees can be provided for the application program at five levels, which greatly improves its exception handling capability and its reliability.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 is a networking structure diagram of the multi-level internet of things terminals provided by an embodiment of the invention;
Fig. 2 is a structure diagram of the internet of things terminal system provided by an embodiment of the invention;
Fig. 3 is an internal structure diagram of the application software provided by an embodiment of the invention;
Fig. 4 is a schematic diagram of the usage flow of the hardware watchdog provided by an embodiment of the invention;
Fig. 5 is a schematic diagram of the workflow of the daemon process provided by an embodiment of the invention;
Fig. 6 is a schematic diagram of the usage flow of the daemon thread pool provided by an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a networking structure diagram of the multi-level internet of things terminals. The network is divided into three levels: the first level contains terminal 1, the second level contains two terminals, 2-1 and 2-2, and the third level contains three terminals, 3-1, 3-2 and 3-3. Terminals 3-1 and 3-2 register, through their daemon processes, with the daemon process in 2-1 and aggregate the information of the daemon thread pools and daemon threads they maintain into 2-1; 2-1 sends configuration or control commands to 3-1 and 3-2. Terminal 3-3 registers, through its daemon process, with the daemon process in 2-2 and aggregates the information of the daemon thread pools and daemon threads it maintains into 2-2; 2-2 sends control commands to 3-3. Terminals 2-1 and 2-2 register, through their daemon processes, with the daemon process in 1 and aggregate the information they maintain into 1, so that 1 holds the information of the daemon thread pools and daemon threads of all terminals in the second-level and third-level networks. Terminal 1 sends control commands to 2-1 and 2-2; if a command actually targets a terminal in the third-level network, the second-level terminal that receives it forwards it to the corresponding third-level terminal. A control command has the following functions: 1) dynamically switching the working mode of a daemon thread; 2) when a daemon thread is in waiting mode and the monitored unit has become abnormal, instructing it to go on and execute the timeout handling function.
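The forwarding of control commands through the cascade can be illustrated with the Fig. 1 topology. The sketch below only computes the hop sequence from the top-level terminal to a target; the `PARENT` table and the `route` function are hypothetical names introduced here, and actual network transmission is not modelled.

```python
from typing import Dict, List, Optional

# Upper-level terminal of each terminal in the Fig. 1 topology ("1" is the root).
PARENT: Dict[str, Optional[str]] = {
    "1": None,
    "2-1": "1", "2-2": "1",
    "3-1": "2-1", "3-2": "2-1", "3-3": "2-2",
}


def route(target: str, root: str = "1") -> List[str]:
    """Return the terminals a control command passes through, from root to target."""
    path = [target]
    while path[-1] != root:
        parent = PARENT[path[-1]]
        if parent is None:
            raise ValueError(f"{target} is not below {root}")
        path.append(parent)
    path.reverse()
    return path
```

For example, `route("3-3")` yields `["1", "2-2", "3-3"]`: terminal 1 sends the command to 2-2, which forwards it to 3-3.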
The multi-level networking structure is used because data rooms are generally large and have many data points to monitor, so a single monitoring terminal cannot meet the monitoring requirements and multiple monitoring terminals are needed. At the same time, the network resources of the data rooms must largely be reserved for the servers. To reduce the network resources occupied by the monitoring terminals, the terminals are therefore networked among themselves, and one terminal provides the management interface for the monitoring modules. Through the interface provided by that terminal, a user can obtain the information of all monitoring modules of all terminals in the network, and that terminal sends control commands to the monitoring modules in all terminals in the network.
Fig. 2 is a structure diagram of the internet of things terminal system, which mainly comprises a hardware watchdog, an operating system and application software. The hardware watchdog monitors whether the operating system is running normally; when the hardware watchdog times out, it controls a relay to power the terminal off and on again, thereby restarting the operating system. The operating system on the terminal is embedded Linux, and the daemon process runs on it. On the one hand, the daemon process resets the hardware watchdog at regular intervals, so that the terminal can recover automatically if the system becomes abnormal; on the other hand, it keeps a heartbeat with the daemon thread pool in the application software to monitor the running state of the application software, and if the heartbeat is abnormal it pushes the exception information to the specified target and carries out further processing according to the working mode. In automatic mode, the daemon process automatically restarts the abnormal application software; in waiting mode, it enters a waiting state and performs no operation, so as to preserve the fault scene. The daemon process communicates with the monitored software over the network, and one daemon process can monitor several pieces of software at the same time.
Fig. 3 is an internal structure diagram of the application software, which contains one daemon thread pool and three business threads. The daemon thread pool contains five daemon threads, of which three are running and two are idle; each daemon thread has a unique tid. Daemon thread 1 works at the thread level and monitors whether thread 1 runs normally; daemon thread 2 works at the function level and monitors whether function 1 in thread 2 runs normally; daemon thread 3 works at the code segment level and monitors whether code 1 of function 2 in thread 3 runs normally.
Fig. 4 shows the usage flow of the hardware watchdog. The specific steps are as follows:
S401, after the internet of things terminal is powered on, the watchdog starts working.
S402, because the time from power-on to full system startup is longer than the watchdog timeout, U-Boot is first responsible for resetting the watchdog.
S403, after the kernel starts, the kernel is responsible for resetting the watchdog.
S404, after the system has fully started, the daemon process runs automatically, and from then on the daemon process resets the hardware watchdog.
Fig. 5 shows the workflow of the daemon process. The specific steps are as follows:
S501, the daemon process starts.
S502, the management thread M1 for the terminal's daemon thread pools is started; it manages the daemon thread pool in each piece of application software on the terminal. After receiving a registration request from a daemon thread pool, M1 creates a node in the mapping table for that daemon thread pool to store its information, and creates a sub-mapping table for the node to store the information of its daemon threads. The terminal serial number and the name of the process corresponding to the daemon thread pool form the composite key; whether a heartbeat needs to be sent, the heartbeat timeout of the daemon thread, the last heartbeat time, the exception handling mode and the sub-mapping table are the values; and the whether-to-send-heartbeat attribute is set to 1. M1 then traverses the daemon thread pool mapping table and checks whether the heartbeat of each node whose whether-to-send-heartbeat attribute is 1 has timed out.
S503, the management thread M2 for the daemon processes of other terminals is started; it manages the daemon processes in other terminals. After receiving a registration request from the daemon process of another terminal, M2 adds the node information of that daemon process's daemon thread pool mapping table to the daemon thread pool mapping table of this daemon process, with the whether-to-send-heartbeat attribute set to 0.
S504, an information push thread is started to push the information held by the daemon process to the specified target (which may be the daemon process of another terminal, a user client, a cloud platform or the like). After starting, the thread first registers with the specified target and pushes the node information of this daemon process's daemon thread pool mapping table to it. When the information in the daemon process changes, the thread pushes the change to the specified target. The changes are as follows:
1) when a daemon thread in a daemon thread pool managed by M1 becomes abnormal, the corresponding daemon thread pool pushes the exception information to M1; M1 updates the corresponding entry of the mapping table in the daemon process, and the exception information is then pushed to the specified target by the information push thread;
2) when a daemon thread pool managed by M1 becomes abnormal, M1 first updates the corresponding entry of the mapping table in the daemon process, and the exception information is then pushed to the specified target by the information push thread;
3) when a daemon thread pool, or a daemon thread in a daemon thread pool, of a daemon process managed by M2 becomes abnormal, M2 first updates the corresponding entry of the mapping table in the daemon process, and the exception information is then pushed to the specified target by the information push thread.
S505, a loop is started in which the daemon process takes over the reset of the hardware watchdog and resets it at regular intervals.
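The periodic reset loop of S505 can be sketched as follows. The source does not say how the daemon process talks to the watchdog hardware; this sketch assumes a Linux-style character device that is reset by writing to it, and the function name `feed_watchdog`, the stop condition and the injectable `sleep` are all assumptions made so that the loop can be exercised without real hardware.

```python
import time
from typing import BinaryIO, Callable


def feed_watchdog(device: BinaryIO,
                  interval_s: float,
                  keep_running: Callable[[], bool],
                  sleep: Callable[[float], None] = time.sleep) -> int:
    """Reset the watchdog every interval_s seconds; return the number of resets."""
    kicks = 0
    while keep_running():
        device.write(b"\0")   # any write counts as a reset on a Linux watchdog device
        device.flush()
        kicks += 1
        sleep(interval_s)
    return kicks
```

On an embedded Linux terminal the device would typically be opened with something like `open("/dev/watchdog", "wb", buffering=0)`, with `interval_s` chosen well below the hardware timeout; both the path and the interval depend on the board and are not given in the source.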
Fig. 6 shows the usage flow of the daemon thread pool. The daemon threads are managed uniformly by the daemon thread pool. When the application has just started, n idle daemon threads are created in the pool; the monitored units in the application then call the interfaces of the daemon thread pool as needed to activate daemon threads dynamically. The specific steps are as follows:
S601, at the entry of the monitored unit's code, the daemon thread creation interface of the daemon thread pool is called; the input parameters are the timeout, the automatic exception handling function, the exception waiting handling function, and the source file and line number of the code that calls the creation interface;
S602, the daemon thread pool searches for an idle daemon thread;
S603, the parameters mentioned in step S601 are set, T_ref is set to the current time, the daemon thread is activated, and its tid is returned;
S604, after the monitored unit's code has executed, the daemon thread release interface of the daemon thread pool is called at the exit of the monitored unit's code with the tid of the corresponding daemon thread, returning the daemon thread to the idle state.
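Steps S601 to S604 can be sketched as a pool with a fixed number of slots. This is a simplified model, not the patented implementation: the class and method names are hypothetical, each slot stands in for a pre-created idle daemon thread rather than a real thread, and the periodic timeout check is omitted.

```python
import threading
import time
from typing import Callable, Dict, Optional


class DaemonThreadPool:
    """Hypothetical pool: slot tid -> parameters, or None when the slot is idle."""

    def __init__(self, size: int) -> None:
        self._lock = threading.Lock()
        self._slots: Dict[int, Optional[dict]] = {tid: None for tid in range(1, size + 1)}

    def create(self, timeout_s: float,
               on_auto: Callable[[], None],
               on_wait: Callable[[], None],
               source_file: str, line_no: int) -> int:
        """S601-S603: find an idle slot, store the parameters, set T_ref, return tid."""
        with self._lock:
            for tid, slot in self._slots.items():
                if slot is None:
                    self._slots[tid] = {
                        "timeout_s": timeout_s,
                        "on_auto": on_auto,
                        "on_wait": on_wait,
                        "source_file": source_file,
                        "line_no": line_no,
                        "t_ref": time.monotonic(),   # T_ref = current time
                    }
                    return tid
        raise RuntimeError("no idle daemon thread available")

    def release(self, tid: int) -> None:
        """S604: return the daemon thread identified by tid to the idle state."""
        with self._lock:
            self._slots[tid] = None

    def idle_count(self) -> int:
        with self._lock:
            return sum(1 for slot in self._slots.values() if slot is None)
```

Raising an error when no slot is idle is one possible policy chosen for this sketch; the source does not state what happens when all n daemon threads are in use.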
When a monitored unit needs to use a daemon thread, code that calls the daemon thread creation interface and release interface must be added to the monitored unit, and the calls are placed differently for the thread level, the function level and the code segment level: the creation interface is called at the beginning of the loop in the thread's handler, at the beginning of each function, or at the beginning of the corresponding code segment, and the release interface is called at the corresponding end. In this way, abnormality monitoring of code at every level is implemented conveniently and software reliability is improved.
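The placement of the create and release calls at the three levels can be illustrated with a context manager that pairs them automatically. This is a sketch under stated assumptions: `RecordingPool` is a stand-in that only records calls, `guarded` is a hypothetical helper, and the labels and timeouts are illustrative values.

```python
from contextlib import contextmanager
from typing import Iterator, List


class RecordingPool:
    """Stand-in for the daemon thread pool; it only records the calls it receives."""

    def __init__(self) -> None:
        self.events: List[str] = []
        self._next_tid = 1

    def create(self, label: str, timeout_s: float) -> int:
        tid = self._next_tid
        self._next_tid += 1
        self.events.append(f"create {label} tid={tid}")
        return tid

    def release(self, tid: int) -> None:
        self.events.append(f"release tid={tid}")


@contextmanager
def guarded(pool: RecordingPool, label: str, timeout_s: float) -> Iterator[int]:
    """Call create at the entry of the monitored unit and release at its exit."""
    tid = pool.create(label, timeout_s)
    try:
        yield tid
    finally:
        pool.release(tid)


def compute(pool: RecordingPool) -> int:
    with guarded(pool, "function compute", 1.0):          # function level
        total = 0
        with guarded(pool, "code segment sum", 0.5):      # code segment level
            total = sum(range(10))
        return total


def worker_iteration(pool: RecordingPool) -> int:
    with guarded(pool, "thread loop", 5.0):               # thread level, one loop pass
        return compute(pool)
```

Using `finally` means the daemon thread is also released when the monitored unit raises an exception; that is a design choice of this sketch, since the source only describes release at the normal exit.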
The above description of the embodiments is only intended to facilitate the understanding of the method of the invention and its core idea. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (9)

1. A multi-level protection method for an application program, characterized in that a plurality of terminals are connected in cascade and a monitoring module is provided in each terminal, the monitoring module comprising at least a hardware watchdog, a daemon process and daemon threads, which monitor the operation of the code at each level of the application program and perform the recovery operation for the corresponding level of code when the application program runs abnormally, wherein:
starting the daemon processes after the terminal system is started, and registering each daemon process to an upper-level daemon process;
starting a plurality of daemon threads at corresponding positions of application software, and managing the daemon threads by daemon processes through a daemon thread pool;
cascade communication is carried out among daemon processes of terminals at all levels, the daemon processes are used for resetting a hardware watchdog at regular time and keeping heartbeat with a daemon thread pool in application software, so that the running state of the application software is monitored, and further processing is carried out according to a set working mode;
the daemon process performs the following operations:
1) managing each daemon thread pool in the terminal: when a daemon thread pool registers with the daemon process, a mapping table is used to store the information of the daemon thread pool, in which the terminal number combined with the name of the process corresponding to the daemon thread pool is the composite key, and the values are whether a heartbeat needs to be sent, the last heartbeat time, the heartbeat timeout time, the exception handling mode, the running state and the daemon thread mapping table, wherein:
whether a heartbeat needs to be sent: when this attribute is 1, a heartbeat needs to be sent, and when it is 0, no heartbeat needs to be sent;
the last heartbeat time represents the time when the daemon process last received a heartbeat from the daemon thread pool;
the heartbeat timeout time: when the difference between the current time and the time at which the heartbeat of the daemon thread pool was last received exceeds the value of this attribute, the daemon thread pool is considered abnormal;
the exception handling mode comprises an automatic mode and a waiting mode: in the automatic mode, when the daemon thread pool is abnormal, the daemon process first sends the exception information to a specified target and then automatically handles the exception; in the waiting mode, the daemon process enters a waiting state after sending the exception information;
the running state represents whether the monitored daemon thread pool is in a normal state or an abnormal state at present;
the daemon thread mapping table is used for storing daemon thread information contained in the monitored daemon thread pool;
2) managing all daemon processes in other terminals: when the daemon process of another terminal registers with this daemon process, the daemon thread pool information held by that daemon process is added to the local mapping table, and the value of whether to send a heartbeat is set to 0;
3) when a monitored daemon thread pool is abnormal, pushing the exception information to a specified target, and then automatically performing exception recovery or entering a waiting state according to the working mode;
4) if a hardware watchdog exists in the terminal, resetting the hardware watchdog at regular intervals;
5) registering with a specified target and pushing the information of all daemon thread pools in the mapping table to that target; in addition, maintaining a heartbeat with the specified target, and automatically restarting the system of the terminal when the heartbeat is found to be abnormal.
2. The method for multi-level protection of application programs according to claim 1, wherein the daemon processes upload and aggregate information level by level up to the main terminal, and the main terminal then provides a unified external interface.
3. The method for multi-level protection of application programs according to claim 2, wherein when a monitored unit is abnormal, depending on the configured working mode, either the exception information is pushed to the specified target and the recovery operation is performed automatically, or only the information is pushed and an external command is then awaited so that the fault scene is preserved.
4. The method according to claim 3, wherein when a daemon thread is used, a call to the daemon thread creation interface of the daemon thread pool is added at the entry of the monitored unit code to activate an idle daemon thread, and a call to the daemon thread release interface of the daemon thread pool is added at the exit of the monitored unit code so that the daemon thread returns to the idle state, whereby daemon threads can be used flexibly to monitor any code in the application.
5. The method according to claim 4, wherein the monitoring module comprises 5 levels, namely a system level, a process level, a thread level, a function level and a code segment level, and each level is divided into a plurality of monitored units according to monitoring requirements, wherein the hardware watchdog is used for monitoring the system level; the daemon process is an independent process and is used for monitoring the process level and ensuring the reliability of other processes except the daemon process; the daemon thread is positioned in the process and used for monitoring the thread level, the function level and the code segment level and ensuring the reliability of the thread, the function and the code segment in the process.
6. The application multi-level protection method of claim 5, wherein the daemon thread comprises the following properties:
the thread tid, which is allocated by the system when the daemon thread is created and is unique within the process;
the timeout time: the daemon thread holds a reference time T_ref and a current time T_current; T_ref is set when the daemon thread is initialized or is set by the monitored unit; while running, the daemon thread periodically acquires T_current and compares it with T_ref; when the difference between T_current and T_ref exceeds the timeout time, the daemon thread considers the monitored unit abnormal;
the running state, if the application software runs normally, the value of the attribute is normal, and when the monitored unit is abnormal, the value of the attribute is abnormal;
the automatic exception handling function pointer is set by the monitored unit, and when the monitored unit is abnormal, the daemon thread can call the function pointed by the function pointer to automatically carry out recovery operation;
the exception waiting processing function pointer is set by the monitored unit, when the monitored unit is abnormal, the daemon thread can call a function pointed by the function pointer, after the function is executed, exception information can be pushed outwards, and then the daemon thread enters a waiting state to reserve a fault site;
the working mode, which comprises an automatic working mode and a waiting working mode: in the automatic working mode, when the monitored unit is abnormal, the daemon thread calls the automatic exception handling function; otherwise, the daemon thread calls the exception waiting handling function;
the source file of the code that calls the daemon thread, namely the source file in which the monitored unit is located; this attribute is used to record the source file of the monitored unit after an exception occurs;
the line number of the code that calls the daemon thread, namely the starting line number of the code corresponding to the monitored unit; this attribute is used to record the starting line number of the monitored unit's code after an exception occurs.
7. The method according to claim 6, wherein the daemon thread is dynamically initialized by the monitored unit, specifically: at the entry of the monitored unit, the daemon thread creation interface in the daemon thread pool is called to initiate a creation request, the creation request including the timeout time, the automatic exception handling function, the exception waiting handling function, the source file of the code that calls the daemon thread creation interface and the line number of that code, and the daemon thread takes effect immediately after it is created successfully; at the end of the monitored unit, the daemon thread release interface in the daemon thread pool is called to initiate a daemon thread release request, the release request including the tid of the daemon thread.
8. The method for multi-level protection of application programs according to claim 6, wherein the monitored unit calls the daemon thread creation interface in the daemon thread pool to initiate a creation request; on receiving the request, the daemon thread pool searches for an idle daemon thread, sets the parameters contained in the request, sets T_ref to the current time, and returns the tid of the daemon thread.
9. The method for multi-level protection of an application according to claim 6, wherein the daemon thread pool is an independent thread in the process, and after the process in which the daemon thread pool is located is started, the daemon thread pool registers with the daemon process through the network interface; during registration, the daemon thread pool can actively send the name of the process, and after the registration is successful, when the state of the daemon thread is changed in the daemon thread pool, the daemon thread pool can actively push the change information to the daemon process; meanwhile, the daemon thread pool can send heartbeat to the daemon process at regular time; the daemon thread pool provides an abnormal information pushing interface for the daemon thread, and when the daemon thread is abnormal, abnormal information can be pushed to the daemon process in time.
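The mapping table and heartbeat timeout check recited in claims 1 and 9 can be sketched as follows. This is a minimal illustration and not the claimed implementation: the names `PoolRecord` and `PoolRegistry` and their methods are hypothetical, times are passed in explicitly to keep the example deterministic, and skipping the timeout check for entries whose heartbeat flag is 0 is an assumption about how that flag is used.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Key = Tuple[int, str]  # (terminal number, process name) composite key


@dataclass
class PoolRecord:
    send_heartbeat: int        # 1: heartbeat expected, 0: not expected
    last_heartbeat: float      # time the last heartbeat was received
    heartbeat_timeout: float   # allowed gap between heartbeats
    mode: str                  # exception handling mode: "auto" or "wait"
    state: str = "normal"      # running state: "normal" or "abnormal"
    threads: Dict[int, dict] = field(default_factory=dict)  # daemon thread table


class PoolRegistry:
    """Daemon-process-side table of registered daemon thread pools."""

    def __init__(self) -> None:
        self.table: Dict[Key, PoolRecord] = {}

    def register(self, terminal_no: int, process_name: str, timeout: float,
                 mode: str, now: float, send_heartbeat: int = 1) -> None:
        self.table[(terminal_no, process_name)] = PoolRecord(
            send_heartbeat, now, timeout, mode)

    def heartbeat(self, terminal_no: int, process_name: str, now: float) -> None:
        self.table[(terminal_no, process_name)].last_heartbeat = now

    def check(self, now: float) -> List[Key]:
        """Mark pools whose heartbeat is overdue and return the newly abnormal keys."""
        newly_abnormal: List[Key] = []
        for key, rec in self.table.items():
            if rec.send_heartbeat != 1 or rec.state != "normal":
                continue  # assumption: flag 0 means no local timeout check
            if now - rec.last_heartbeat > rec.heartbeat_timeout:
                rec.state = "abnormal"
                newly_abnormal.append(key)
        return newly_abnormal


def demo() -> List[List[Key]]:
    reg = PoolRegistry()
    reg.register(1, "app", timeout=3.0, mode="auto", now=0.0)
    # Pool information forwarded by another terminal: heartbeat flag set to 0.
    reg.register(2, "remote_app", timeout=3.0, mode="wait", now=0.0,
                 send_heartbeat=0)
    reg.heartbeat(1, "app", now=2.0)
    return [reg.check(now=4.0), reg.check(now=6.0), reg.check(now=7.0)]
```

In the demo, the local pool is reported once, at the first check after its heartbeat gap exceeds 3 time units, and the forwarded entry is never reported by the local timeout check.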
CN202011515855.XA 2020-12-21 2020-12-21 Multi-level protection method for application program Active CN112612635B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011515855.XA CN112612635B (en) 2020-12-21 2020-12-21 Multi-level protection method for application program

Publications (2)

Publication Number Publication Date
CN112612635A CN112612635A (en) 2021-04-06
CN112612635B true CN112612635B (en) 2022-06-10

Family

ID=75244116

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011515855.XA Active CN112612635B (en) 2020-12-21 2020-12-21 Multi-level protection method for application program

Country Status (1)

Country Link
CN (1) CN112612635B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113778514A (en) * 2021-09-17 2021-12-10 平安科技(深圳)有限公司 Program maintenance method, device, equipment and storage medium
CN117112284B (en) * 2023-10-25 2024-02-02 西安热工研究院有限公司 DCS controller trusted state sensing method and related device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968352A (en) * 2012-12-14 2013-03-13 杨晓松 System and method for process monitoring and multi-stage recovery
CN104407958A (en) * 2014-10-30 2015-03-11 广州博控自动化技术有限公司 High-reliability system monitoring method and system
CN106445712A (en) * 2016-08-31 2017-02-22 上海澳润信息科技有限公司 Implementation method for software watchdog based on message monitoring
CN107451046A (en) * 2016-05-30 2017-12-08 腾讯科技(深圳)有限公司 A kind of method and terminal for detecting thread
CN110865900A (en) * 2020-01-19 2020-03-06 南京火零信息科技有限公司 Method for enhancing robustness of embedded system

Also Published As

Publication number Publication date
CN112612635A (en) 2021-04-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant