CN106487599B - Method and system for distributed monitoring of running state of cloud access controller - Google Patents

Method and system for distributed monitoring of running state of cloud access controller

Info

Publication number
CN106487599B
Authority
CN
China
Prior art keywords
node
monitoring process
monitoring
child
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611092334.1A
Other languages
Chinese (zh)
Other versions
CN106487599A (en
Inventor
陈昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Shuwang Internet Technology Co.,Ltd.
Original Assignee
Shanghai Feixun Data Communication Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Feixun Data Communication Technology Co Ltd
Priority to CN201611092334.1A
Publication of CN106487599A
Application granted
Publication of CN106487599B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663 Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Abstract

The invention provides a method and a system for distributed monitoring of the operation state of a cloud access controller. The method comprises the following steps: the cloud access controller sets a plurality of monitoring processes, wherein one monitoring process is the current monitoring process and the remaining monitoring processes are backup monitoring processes, and the current monitoring process monitors server performance data of a plurality of servers and/or service processing information of a plurality of service modules in the cloud access controller; the distributed processing framework creates a permanent node and a temporary node under the permanent node, wherein the permanent node is a root node for monitoring the current monitoring process and the temporary node is a child node of the root node; according to a preset rule, the distributed processing framework allocates monitoring authority to the child nodes under the root node, and the child node holding the monitoring authority monitors the current monitoring process. Through distributed monitoring, the invention can monitor the running state of the cloud access controller in real time.

Description

Method and system for distributed monitoring of running state of cloud access controller
Technical Field
The embodiment of the invention relates to the technical field of communication, in particular to a method and a system for distributed monitoring of an operation state of a cloud access controller.
Background
A cloud Access Controller (AC) system often presents a functional interface of the system through HyperText Markup Language (HTML), and a user may connect to the AC system through a browser to perform various operations.
Because the cloud access controller system must process a large amount of terminal device connection information at the same time, a single server cannot meet the performance requirements of service data processing. To handle such big-data scenarios, service processing needs to be distributed across a plurality of servers. The distributed system of the cloud access controller is composed of a plurality of servers, each of which runs a plurality of service modules of the system. When the processing load reaches a threshold, the cloud access controller system can resolve the performance problem by dynamically adding servers or by adding system service modules to the servers.
In a distributed environment, a cloud access controller is composed of a plurality of servers and a plurality of service modules. During operation, various abnormal conditions may be encountered, such as a server hardware failure, an abnormal power supply, server CPU or memory usage exceeding a set threshold, or a processing process terminating because a service module fails at run time. When such abnormalities occur, they need to be reported to operation and maintenance personnel in real time so that they can be handled promptly. Operation and maintenance personnel also need to know, in real time, the operation state of each server in the cloud access controller system and the process state of each service module.
In the process of implementing the invention, the inventor finds that the prior art has at least the following problems:
In the prior art, when a cloud access controller processes a large amount of service data in a distributed manner, no reliable mechanism is provided for monitoring the operation state of each component of the distributed system in real time, despite the complexity of system operation and the possibility of various faults. In other words, the prior art offers no scheme for monitoring system operation in such a complex environment.
It should be noted that the above background description is only for the sake of clarity and complete description of the technical solutions of the present invention and for the understanding of those skilled in the art. Such solutions are not considered to be known to the person skilled in the art merely because they have been set forth in the background section of the invention.
Disclosure of Invention
In view of the foregoing problems, an object of embodiments of the present invention is to provide a method and a system for distributed monitoring of an operating state of a cloud access controller, which place system monitoring in a distributed environment, ensure that the monitoring system detects operating faults of the system in real time when they occur, and ensure correct operation of the monitoring mechanism even if one or more monitoring processes fail.
In order to achieve the above object, an embodiment of the present invention provides a method for distributed monitoring of an operating state of a cloud access controller, where the cloud access controller is a distributed system composed of a plurality of servers and a plurality of service modules, and the method includes: the cloud access controller sets a plurality of monitoring processes, wherein one monitoring process is a current monitoring process, and the rest monitoring processes are backup monitoring processes; the distributed processing framework establishes a root node for monitoring the current monitoring process, establishes child nodes under the root node, distributes monitoring authority to the child nodes under the root node according to a preset rule, and the child nodes with the monitoring authority monitor the current monitoring process; the current monitoring process monitors server performance data of a plurality of servers and/or service processing information of a plurality of service modules in the cloud access controller; if the current monitoring process is abnormal, the distributed processing framework deletes the child node monitoring the current monitoring process, determines a new current monitoring process from the backup monitoring process, and determines a new child node from all child nodes under the root node to monitor the new current monitoring process.
Further, the child nodes are identified by child node names and sequence values, wherein the sequence value of a child node under the root node is the sequence value of the previous child node plus 1; the distributed processing framework allocating monitoring authority to the child nodes under the root node according to the preset rule includes: acquiring the sequence values of all child nodes under the root node, and judging whether the sequence value of the current child node is the minimum among the sequence values of all child nodes; if so, the distributed processing framework allocates monitoring authority to the current child node with the minimum sequence value; if not, the current child node sets a state monitoring callback on the previous child node, and it is judged whether the sequence value of the previous child node is the minimum among the sequence values of all child nodes, and so on until the child node with the minimum sequence value under the root node is determined; the distributed processing framework then allocates monitoring authority to the child node with the minimum sequence value.
Further, the monitoring of the server performance data of the plurality of servers and/or the service processing information of the plurality of service modules by the current monitoring process includes: creating a service node under the root node and creating sub-nodes under the service node, wherein the sub-nodes comprise a first sub-node and a second sub-node, the first sub-node is used for monitoring server performance data of a server, and the second sub-node is used for monitoring service processing information of a service module; after the current monitoring process is started, all sub-nodes under the service node are added into the cache and the states of all the sub-nodes are monitored, when the states of the sub-nodes are changed, all the sub-nodes under the service node are obtained, and all the obtained sub-nodes are compared with all the sub-nodes in the cache; if the sub-node in the cache does not exist in the acquired sub-node, the current monitoring process judges that the non-existing sub-node is abnormal and sends abnormal information to the cloud access controller; and if the subnode in the cache exists in the acquired subnode, the current monitoring process acquires the server performance data and/or the service processing information monitored by the existing subnode and sends the server performance data and/or the service processing information to the cloud access controller for processing.
Further, determining that the current monitoring process is abnormal includes: the current monitoring process periodically sends server performance data and/or service processing information at a preset time interval; if the server performance data and/or the service processing information sent by the current monitoring process is not received within the time interval, the current monitoring process is determined to be abnormal.
Further, the step of deleting the child node monitoring the current monitoring process by the distributed processing framework, determining a new current monitoring process from the backup monitoring process, and determining a new child node from all child nodes under the root node to monitor the new current monitoring process includes: the distributed processing framework deletes the child node monitoring the current monitoring process and sends the node deletion information to the child node of the next sequence value; when the child node of the next sequence value receives the node deletion information, the abnormal information of the current monitoring process and the child node monitoring the current monitoring process is sent to the root node, and then the distributed processing framework determines a new current monitoring process from the backup monitoring process and determines a new child node from all child nodes under the root node to monitor the new current monitoring process according to the preset rule.
In order to achieve the above object, an embodiment of the present invention further provides a system for distributed monitoring of an operating state of a cloud access controller, including: the cloud access controller is a distributed system consisting of a plurality of servers and a plurality of service modules and is used for setting a plurality of monitoring processes, wherein one monitoring process is a current monitoring process, and the rest monitoring processes are backup monitoring processes; the distributed processing framework is used for creating a permanent node and creating a temporary node under the permanent node, wherein the permanent node is a root node for monitoring the current monitoring process, and the temporary node is a child node of the root node; distributing monitoring authority to child nodes under a root node according to a preset rule, and monitoring the current monitoring process by the child nodes with the monitoring authority; the current monitoring process monitors server performance data of a plurality of servers and/or service processing information of a plurality of service modules in the cloud access controller, if the current monitoring process is abnormal, child nodes monitoring the current monitoring process are deleted, a new current monitoring process is determined from the backup monitoring process, and new child nodes are determined from all child nodes under the root node to monitor the new current monitoring process.
Therefore, according to the method and the system for distributed monitoring of the operation state of the cloud access controller provided by the embodiment of the invention, the monitoring is placed in a distributed environment by using a distributed monitoring mode, so that the monitoring system can find the operation fault of the system in real time when the fault occurs, and the correct operation of a monitoring mechanism can be ensured even if a single or multiple monitoring processes have the fault.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show some embodiments of the present invention, and those skilled in the art can obtain other drawings based on them without creative effort.
Fig. 1 is a schematic flowchart of a method for distributed monitoring of an operating state of a cloud access controller according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a system for distributed monitoring of an operation state of a cloud access controller according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings of the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
The embodiment of the invention provides a distributed monitoring method for the running state of a cloud access controller. Referring to fig. 1, the method includes the following steps:
Step S1: The cloud access controller sets a plurality of monitoring processes, wherein one monitoring process is the current monitoring process and the remaining monitoring processes are backup monitoring processes.
In the embodiment of the invention, the cloud access controller is a distributed system consisting of a plurality of servers and a plurality of service modules.
The cloud access controller sets a plurality of monitoring processes, preferably three, deployed on three different servers. At any given time, only one monitoring process runs as the current monitoring process, and the others serve as backups.
Step S2: The distributed processing framework establishes a root node for monitoring the current monitoring process and establishes child nodes under the root node.
In this embodiment, the distributed processing framework is based on ZooKeeper, an open-source coordination service for distributed applications. ZooKeeper provides consistency services for distributed applications and is mainly used to solve data management problems in distributed applications; the functions it provides include unified naming services, state synchronization services, cluster management, management of distributed application configuration items, and the like. In addition, ZooKeeper stores data in a directory node tree similar to a file system, and is mainly used for maintaining the stored data and monitoring changes in its state. By monitoring these data state changes, data-based cluster management can be achieved.
A permanent node, monitors, is created in ZooKeeper and serves as the root node for monitoring the current monitoring process.
After a monitoring process is started, a temporary node /monitors/monitor-<process ID>- is created under the permanent node monitors; this temporary node is a child node of the root node. The child node is also set as a sequence-value auto-increment node, so ZooKeeper automatically appends a sequence value to the end of the child node's name; the sequence value of a child node under the root node is the sequence value of the previous child node plus 1.
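The naming behaviour described above can be illustrated with a small in-memory sketch. It is not ZooKeeper client code: the class SequentialParent, the method create_sequential, and the process IDs 4711 and 4712 are hypothetical names chosen for illustration. The 10-digit zero-padded suffix follows the format that ZooKeeper documents for sequence nodes.

```python
class SequentialParent:
    """Toy stand-in for a permanent parent node such as /monitors."""

    def __init__(self, path):
        self.path = path
        self.next_sequence = 0  # counter kept per parent node
        self.children = []      # names of the child nodes created so far

    def create_sequential(self, prefix):
        """Create a child whose name is prefix + zero-padded sequence value."""
        name = f"{prefix}{self.next_sequence:010d}"
        self.next_sequence += 1  # each new child gets the previous value plus 1
        self.children.append(name)
        return f"{self.path}/{name}"


monitors = SequentialParent("/monitors")
first = monitors.create_sequential("monitor-4711-")   # hypothetical process ID
second = monitors.create_sequential("monitor-4712-")  # hypothetical process ID
print(first)
print(second)
```

In a real deployment the counter is maintained by the ZooKeeper ensemble rather than by the client, so concurrent monitoring processes cannot receive the same sequence value.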
Step S3: according to a preset rule, the distributed processing framework distributes monitoring authority to child nodes under a root node, and the child nodes with the monitoring authority monitor the current monitoring process.
In the embodiment of the present invention, the preset rule for allocating the monitoring right is:
acquiring sequence values of all child nodes under a root node, and judging whether the sequence value of the current child node is minimum in the sequence values of all the child nodes;
if so, the distributed processing framework allocates monitoring authority to the current child node with the minimum sequence value, and the current child node starts monitoring the current monitoring process after acquiring the monitoring authority;
if not, the current child node sets a state monitoring callback on the previous child node; for example, if the sequence value of the current child node is 2, the state monitoring callback is set on the child node with the sequence value of 1. It is then judged whether the sequence value of the previous child node is the minimum among the sequence values of all the child nodes, and so on until the child node with the minimum sequence value under the root node is determined. The distributed processing framework allocates monitoring authority to the child node with the minimum sequence value, and that child node starts monitoring the current monitoring process after acquiring the monitoring authority.
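The preset rule can be sketched as a pure function over the list of child node names. This is a simplified model, not the patented implementation: the function names sequence_of and decide and the sample child names are assumptions, and the sequence value is assumed to be the numeric suffix after the last hyphen.

```python
def sequence_of(child_name):
    """Extract the numeric sequence value from a name like monitor-7-0000000000."""
    return int(child_name.rsplit("-", 1)[1])


def decide(children, current):
    """Apply the preset rule for one child node.

    Returns ("monitor", None) when the current child holds the minimum
    sequence value, otherwise ("watch", previous_child) so that a state
    monitoring callback can be set on the previous child node.
    """
    ordered = sorted(children, key=sequence_of)
    position = ordered.index(current)
    if position == 0:
        return ("monitor", None)
    return ("watch", ordered[position - 1])


children = [
    "monitor-9-0000000002",
    "monitor-7-0000000000",
    "monitor-8-0000000001",
]
for child in children:
    print(child, decide(children, child))
```

Having each child watch only its immediate predecessor means that the deletion of one node notifies a single successor rather than every waiting node.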
Step S4: the current monitoring process monitors server performance data of a plurality of servers and/or service processing information of a plurality of service modules in the cloud access controller.
In an embodiment of the present invention, a service (Server) node is created under the root node, and a sub-node is created under the service node, where the sub-node includes a first sub-node and a second sub-node.
Specifically, after the server performance acquisition program is started, a temporary node, that is, a first sub-node of the service node, is created under the service node, and the cloud access controller writes server information into the first sub-node, so that the first sub-node monitors server performance data of the server. The server information may include, but is not limited to, the host name and performance indexes such as CPU, memory and disk usage. After the service module is started, a temporary node, that is, a second sub-node of the service node, is created under the service node, and the cloud access controller writes service module information into the second sub-node, so that the second sub-node monitors service processing information of the service module. The service module information may include, but is not limited to, the host name, the module name and the service processing within a time period.
The current monitoring process then acquires the server information and the service module information for monitoring.
Specifically, after the monitoring process is started, it adds all the sub-nodes under the service node to a cache and monitors the states of all the sub-nodes. If the state of a sub-node changes, the monitoring process acquires all the sub-nodes under the service node and compares them with the sub-nodes in the cache. If a sub-node in the cache does not exist among the acquired sub-nodes, the monitoring process considers the missing sub-node abnormal and sends an abnormality message to the cloud access controller; if a sub-node in the cache exists among the acquired sub-nodes, the monitoring process acquires the server performance data and/or the service processing information in that sub-node and sends it to the cloud access controller for processing.
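A minimal sketch of the data that might be written to the two kinds of sub-node. The JSON encoding, the function names, the field names and the sample values are all assumptions; the source only lists the kinds of information involved.

```python
import json


def server_info(host, cpu_percent, memory_percent, disk_percent):
    """Encode server performance data for a first sub-node (hypothetical layout)."""
    record = {
        "host": host,
        "cpu": cpu_percent,
        "memory": memory_percent,
        "disk": disk_percent,
    }
    return json.dumps(record, sort_keys=True).encode("utf-8")


def service_module_info(host, module, processed_in_period):
    """Encode service processing information for a second sub-node (hypothetical layout)."""
    record = {"host": host, "module": module, "processed": processed_in_period}
    return json.dumps(record, sort_keys=True).encode("utf-8")


first_sub_node_data = server_info("ac-server-01", 37.5, 62.0, 48.0)
second_sub_node_data = service_module_info("ac-server-01", "device-connect", 1200)
print(first_sub_node_data)
print(second_sub_node_data)
```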
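The cache comparison can be sketched as follows. The function name on_sub_nodes_changed, the sub-node names and the sample data are hypothetical; the sketch models only the comparison itself, not the ZooKeeper watch that triggers it.

```python
def on_sub_nodes_changed(cache, fetched):
    """Compare cached sub-node names with the freshly acquired sub-nodes.

    cache   -- set of sub-node names recorded when the monitoring process started
    fetched -- dict mapping each currently existing sub-node name to its data
    Returns (abnormal, reports): the cached names that have disappeared, and
    the data of the cached names that still exist.
    """
    abnormal = sorted(name for name in cache if name not in fetched)
    reports = {name: fetched[name] for name in sorted(cache) if name in fetched}
    return abnormal, reports


cache = {"server-ac01", "server-ac02", "module-auth"}
fetched = {"server-ac01": b"cpu=20", "module-auth": b"processed=900"}
abnormal, reports = on_sub_nodes_changed(cache, fetched)
print("abnormal:", abnormal)
print("reports:", reports)
```

The source does not say how sub-nodes that appear after the cache was built are handled, so this sketch ignores them.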
Step S5: if the current monitoring process is abnormal, the distributed processing framework deletes the child node monitoring the current monitoring process, determines a new current monitoring process from the backup monitoring process, and determines a new child node from all child nodes under the root node to monitor the new current monitoring process.
In the embodiment of the present invention, through the child node that monitors it, the current monitoring process exchanges information, such as server performance data and/or service processing information, with the root node. In addition, the distributed processing framework presets a time interval, for example 5 minutes, and the current monitoring process periodically sends server performance data and/or service processing information to the root node at this interval.
If the root node does not receive the information sent by the current monitoring process within the time interval, namely the monitoring process terminates the information interaction with the root node, the distributed processing framework deletes the child node monitoring the current monitoring process and sends the node deletion information to the child node of the next sequence value.
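The timeout check can be sketched as a comparison of timestamps. The function name is_abnormal and the sample timestamps are assumptions; the 5-minute interval is the example value given in the text.

```python
REPORT_INTERVAL_SECONDS = 5 * 60  # the 5-minute example interval from the text


def is_abnormal(last_report_time, now, interval=REPORT_INTERVAL_SECONDS):
    """Return True when no report has arrived within the preset interval."""
    return (now - last_report_time) > interval


# Timestamps are plain seconds; a caller would typically use time.monotonic().
on_time = is_abnormal(last_report_time=1000.0, now=1200.0)
overdue = is_abnormal(last_report_time=1000.0, now=1301.0)
print(on_time, overdue)
```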
Because the child node sets a state monitoring callback in the previous child node, when the child node of the next sequence value receives the node deletion information of the previous child node, the current monitoring process is considered to be abnormal, and a new monitoring process is selected from other backup monitoring processes through an algorithm to take over the operation. The specific algorithm for selecting a new monitoring process from the backup monitoring processes to take over the operation is not limited in the embodiment of the present invention.
In addition, the distributed processing framework acquires the sequence values of all the child nodes under the root node again, and determines the child node with the minimum sequence value as a new child node to monitor a new current monitoring process according to the preset rule.
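The takeover sequence can be sketched in the same simplified model: the child node of the failed process is deleted, the child with the next sequence value is notified through its callback, and the minimum-sequence rule is applied again to the remaining children. The names fail_over and sequence_of and the sample child names are hypothetical, and the choice of the replacement monitoring process is left out because the source does not fix that algorithm.

```python
def sequence_of(child_name):
    """Extract the numeric sequence value from a name like monitor-101-0000000000."""
    return int(child_name.rsplit("-", 1)[1])


def fail_over(children, deleted):
    """Model deletion of one child node and re-application of the preset rule.

    Returns (remaining, notified, new_holder):
    remaining  -- child names left after the deletion, ordered by sequence value
    notified   -- the child with the next sequence value, whose callback fires
    new_holder -- the remaining child with the minimum sequence value
    """
    ordered = sorted(children, key=sequence_of)
    position = ordered.index(deleted)
    notified = ordered[position + 1] if position + 1 < len(ordered) else None
    remaining = ordered[:position] + ordered[position + 1:]
    new_holder = remaining[0] if remaining else None
    return remaining, notified, new_holder


children = [
    "monitor-101-0000000000",
    "monitor-102-0000000001",
    "monitor-103-0000000002",
]
remaining, notified, new_holder = fail_over(children, "monitor-101-0000000000")
print(remaining, notified, new_holder)
```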
As shown in fig. 2, an embodiment of the present invention provides a system for distributed monitoring of an operation state of a cloud access controller, including:
the cloud access controller is a distributed system consisting of a plurality of servers and a plurality of service modules and is used for setting a plurality of monitoring processes, wherein one monitoring process is a current monitoring process, and the rest monitoring processes are backup monitoring processes;
the distributed processing framework is used for creating a permanent node and creating a temporary node under the permanent node, wherein the permanent node is a root node for monitoring the current monitoring process, and the temporary node is a child node of the root node; distributing monitoring authority to child nodes under a root node according to a preset rule, and monitoring the current monitoring process by the child nodes with the monitoring authority; the current monitoring process monitors server performance data of a plurality of servers and/or service processing information of a plurality of service modules in the cloud access controller, if the current monitoring process is abnormal, child nodes monitoring the current monitoring process are deleted, a new current monitoring process is determined from the backup monitoring process, and new child nodes are determined from all child nodes under the root node to monitor the new current monitoring process.
Wherein the distributed processing framework allocates monitoring authority to the child nodes under the root node specifically as follows:
acquiring the sequence values of all child nodes under the root node, and judging whether the sequence value of the current child node is the minimum among the sequence values of all child nodes; if so, allocating monitoring authority to the current child node with the minimum sequence value; if not, the current child node sets a state monitoring callback on the previous child node, and it is judged whether the sequence value of the previous child node is the minimum among the sequence values of all the child nodes, and so on until the child node with the minimum sequence value under the root node is determined, whereupon monitoring authority is allocated to that child node.
The distributed processing framework is further specifically configured to:
creating a service node under the root node and creating sub-nodes under the service node, wherein the sub-nodes comprise a first sub-node and a second sub-node, the first sub-node is used for monitoring server performance data of a server, and the second sub-node is used for monitoring service processing information of a service module; after the current monitoring process is started, all sub-nodes under the service node are added into the cache and the states of all the sub-nodes are monitored, when the states of the sub-nodes are changed, all the sub-nodes under the service node are obtained, and all the obtained sub-nodes are compared with all the sub-nodes in the cache; if the sub-node in the cache does not exist in the acquired sub-node, the current monitoring process judges that the non-existing sub-node is abnormal and sends abnormal information to the cloud access controller; and if the subnode in the cache exists in the acquired subnode, the current monitoring process acquires the server performance data and/or the service processing information monitored by the existing subnode and sends the server performance data and/or the service processing information to the cloud access controller for processing.
The current monitoring process is determined to be abnormal specifically as follows:
the current monitoring process periodically sends server performance data and/or service processing information at a preset time interval; if the server performance data and/or the service processing information sent by the current monitoring process is not received within the time interval, the current monitoring process is determined to be abnormal.
The distributed processing framework is further specifically configured to:
if the current monitoring process is abnormal, the distributed processing framework deletes the child node monitoring the current monitoring process and sends node deletion information to the child node of the next sequence value; when the child node of the next sequence value receives the node deletion information, the abnormal information of the current monitoring process and the child node monitoring the current monitoring process is sent to the root node, and then the distributed processing framework determines a new current monitoring process from the backup monitoring process and determines a new child node from all child nodes under the root node to monitor the new current monitoring process according to the preset rule.
The specific technical details of the system for the distributed monitoring of the operating state of the cloud access controller are similar to those of the method for the distributed monitoring of the operating state of the cloud access controller, and therefore detailed description is omitted.
Therefore, according to the method and the system for distributed monitoring of the operation state of the cloud access controller provided by the embodiment of the invention, the monitoring is placed in a distributed environment by using a distributed monitoring mode, so that the monitoring system can find the operation fault of the system in real time when the fault occurs, and the correct operation of a monitoring mechanism can be ensured even if a single or multiple monitoring processes have the fault.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments.
Finally, it should be noted that: the foregoing description of various embodiments of the invention is provided to those skilled in the art for the purpose of illustration. It is not intended to be exhaustive or to limit the invention to a single disclosed embodiment. Various alternatives and modifications of the invention, as described above, will be apparent to those skilled in the art. Thus, while some alternative embodiments have been discussed in detail, other embodiments will be apparent or relatively easy to derive by those of ordinary skill in the art. The present invention is intended to embrace all such alternatives, modifications, and variances which have been discussed herein, and other embodiments which fall within the spirit and scope of the above application.

Claims (10)

1. A method for distributed monitoring of an operation state of a cloud access controller is characterized in that the cloud access controller is a distributed system composed of a plurality of servers and a plurality of service modules, and the method comprises the following steps:
the cloud access controller sets a plurality of monitoring processes, wherein one monitoring process is a current monitoring process, and the rest monitoring processes are backup monitoring processes;
the distributed processing framework establishes a root node for monitoring the current monitoring process, establishes child nodes under the root node, distributes monitoring authority to the child nodes under the root node according to a preset rule, and the child nodes with the monitoring authority monitor the current monitoring process;
the current monitoring process monitors server performance data of a plurality of servers and/or service processing information of a plurality of service modules in the cloud access controller;
if the current monitoring process is abnormal, the distributed processing framework deletes the child node monitoring the current monitoring process, determines a new current monitoring process from the backup monitoring processes, and determines a new child node from all child nodes under the root node to monitor the new current monitoring process.
2. The method according to claim 1, wherein the child nodes are identified by child node names and sequence values, wherein the sequence value of a child node under the root node is the sequence value of its previous child node plus 1;
according to the preset rule, the distributed processing framework distributes monitoring authority for the child nodes under the root node, and the method comprises the following steps:
acquiring the sequence values of all child nodes under the root node, and judging whether the sequence value of the current child node is minimum in the sequence values of all child nodes;
if so, the distributed processing framework allocates monitoring authority to the current child node with the minimum sequence value;
if not, the current child node sets a state monitoring callback on its previous child node, and it is judged whether the sequence value of the previous child node is the minimum among the sequence values of all the child nodes, and so on until the child node with the minimum sequence value under the root node is determined; the distributed processing framework then allocates monitoring authority to the child node with the minimum sequence value.
3. The method according to claim 1, wherein the current monitoring process monitoring server performance data of a plurality of servers and/or service processing information of a plurality of service modules in the cloud access controller comprises:
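The allocation rule in claim 2 can be illustrated with a minimal in-memory sketch. It is not the patented implementation and uses no real coordination service; the function names `elect_monitor` and `predecessor` and the node names are hypothetical, chosen only for illustration. The child with the minimum sequence value receives the monitoring authority, and every other child watches the child whose sequence value immediately precedes its own.

```python
from typing import Dict, Optional


def elect_monitor(children: Dict[str, int]) -> str:
    """Return the name of the child node holding the minimum sequence value."""
    if not children:
        raise ValueError("no child nodes under the root node")
    return min(children, key=children.get)


def predecessor(children: Dict[str, int], name: str) -> Optional[str]:
    """Return the child whose sequence value immediately precedes that of `name`.

    Returns None when `name` already holds the minimum sequence value,
    i.e. when it has no previous child node to set a callback on.
    """
    own = children[name]
    lower = {n: s for n, s in children.items() if s < own}
    if not lower:
        return None
    return max(lower, key=lower.get)


# Hypothetical child nodes under the root node: name -> sequence value.
children = {"monitor-a": 3, "monitor-b": 1, "monitor-c": 2}
leader = elect_monitor(children)
watches = {name: predecessor(children, name) for name in children}
```

Here `leader` is the child that would receive the monitoring authority, and `watches` maps each child to the previous child on which it would set its state monitoring callback.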
creating a service node under the root node and creating sub-nodes under the service node, wherein the sub-nodes comprise a first sub-node and a second sub-node, the first sub-node is used for monitoring server performance data of the plurality of servers, and the second sub-node is used for monitoring service processing information of the plurality of service modules;
after the current monitoring process is started, all sub-nodes under the service node are added into the cache and the states of all the sub-nodes are monitored, when the states of the sub-nodes are changed, all the sub-nodes under the service node are obtained, and all the obtained sub-nodes are compared with all the sub-nodes in the cache;
if a sub-node in the cache does not exist among the acquired sub-nodes, the current monitoring process judges that the missing sub-node is abnormal and sends abnormality information to the cloud access controller;
and if a sub-node in the cache exists among the acquired sub-nodes, the current monitoring process acquires the server performance data and/or the service processing information monitored by that sub-node and sends the server performance data and/or the service processing information to the cloud access controller for processing.
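The cache comparison described in claim 3 amounts to a set difference between the cached sub-nodes and the sub-nodes currently present under the service node. The following is a minimal sketch under that reading; the function name `compare_sub_nodes` and the sub-node names are hypothetical and do not come from the patent.

```python
from typing import Iterable, List, Tuple


def compare_sub_nodes(cached: Iterable[str],
                      acquired: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Compare cached sub-nodes with the sub-nodes just acquired.

    Returns (missing, present):
      missing - cached sub-nodes that no longer exist (treated as abnormal)
      present - cached sub-nodes that still exist (their data is collected)
    """
    cached_set, acquired_set = set(cached), set(acquired)
    missing = sorted(cached_set - acquired_set)
    present = sorted(cached_set & acquired_set)
    return missing, present


# Hypothetical sub-nodes: two for servers, one for a service module.
cache = ["server-1", "server-2", "service-1"]
acquired_now = ["server-1", "service-1"]
missing, present = compare_sub_nodes(cache, acquired_now)
```

In this example `missing` holds the sub-node that would be reported as abnormal to the cloud access controller, and `present` holds the sub-nodes whose monitored data would be forwarded for processing.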
4. The method for distributed monitoring of the operating state of the cloud access controller according to claim 3, wherein the occurrence of an exception in the current monitoring process includes:
the current monitoring process periodically sends server performance data and/or service processing information according to a preset time interval;
and if the server performance data and/or the service processing information sent by the current monitoring process are not received within the time interval, the current monitoring process is determined to be abnormal.
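The abnormality test in claim 4 is a timeout on the periodic report. A minimal sketch follows, assuming timestamps are plain floating-point seconds supplied by the caller; the class name `ReportWatch` and its methods are hypothetical.

```python
class ReportWatch:
    """Tracks the last report of the current monitoring process."""

    def __init__(self, interval: float, started_at: float) -> None:
        self.interval = interval        # preset time interval in seconds
        self.last_report = started_at   # time of the most recent report

    def record_report(self, now: float) -> None:
        """Note that performance data or service information arrived at `now`."""
        self.last_report = now

    def is_abnormal(self, now: float) -> bool:
        """True if no report has arrived within the preset interval."""
        return now - self.last_report > self.interval


watch = ReportWatch(interval=5.0, started_at=0.0)
watch.record_report(4.0)
status_at_8 = watch.is_abnormal(8.0)    # 4 s since the last report
status_at_10 = watch.is_abnormal(10.0)  # 6 s since the last report
```

Passing the clock value in explicitly keeps the sketch deterministic; a real deployment would more likely read a monotonic clock.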
5. The method of distributed monitoring of operating states of a cloud access controller according to claim 4, wherein the distributed processing framework deletes the child node monitoring the current monitoring process, determines a new current monitoring process from the backup monitoring processes, and determines a new child node from all child nodes under the root node to monitor the new current monitoring process, comprising:
the distributed processing framework deletes the child node monitoring the current monitoring process and sends the node deletion information to the child node with the next sequence value;
when the child node with the next sequence value receives the node deletion information, it sends abnormality information about the current monitoring process and about the child node that was monitoring it to the root node, and the distributed processing framework then determines a new current monitoring process from the backup monitoring processes and, according to the preset rule, determines a new child node from all child nodes under the root node to monitor the new current monitoring process.
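The failover sequence of claim 5 can be sketched as a small in-memory state machine. This is an illustration under simplifying assumptions, not the patented implementation: the class `MonitorGroup`, its attributes, and the choice of promoting the first backup in the list are all hypothetical, and the "preset rule" is taken to be the minimum-sequence rule of claim 2.

```python
from typing import Dict, List


class MonitorGroup:
    """In-memory stand-in for the root node, its children, and the processes."""

    def __init__(self, children: Dict[str, int], current_process: str,
                 backups: List[str]) -> None:
        self.children = dict(children)      # child node name -> sequence value
        self.current_process = current_process
        self.backups = list(backups)        # backup monitoring processes
        self.reports: List[str] = []        # messages received by the root node

    def holder(self) -> str:
        """Child node with the minimum sequence value (has monitoring authority)."""
        return min(self.children, key=self.children.get)

    def fail_over(self) -> str:
        """Handle an abnormal current monitoring process; return the new holder."""
        if len(self.children) < 2 or not self.backups:
            raise RuntimeError("no remaining child node or backup process")
        failed_node = self.holder()
        failed_process = self.current_process
        del self.children[failed_node]      # framework deletes the child node
        successor = self.holder()           # child with the next sequence value
        # The successor reports the abnormality to the root node.
        self.reports.append(
            f"{successor}: {failed_process} abnormal, {failed_node} deleted")
        # Assumption: the first backup in the list becomes the current process.
        self.current_process = self.backups.pop(0)
        return successor


group = MonitorGroup({"n1": 1, "n2": 2, "n3": 3}, "proc-a", ["proc-b", "proc-c"])
new_holder = group.fail_over()
```

After the call, `new_holder` names the child node that now monitors the promoted backup process, and `group.reports` contains the abnormality message delivered to the root node.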
6. A system for distributed monitoring of operation states of a cloud access controller is characterized by comprising:
the cloud access controller is a distributed system consisting of a plurality of servers and a plurality of service modules and is used for setting a plurality of monitoring processes, wherein one monitoring process is a current monitoring process, and the rest monitoring processes are backup monitoring processes;
the distributed processing framework is used for creating a permanent node and creating a temporary node under the permanent node, wherein the permanent node is a root node for monitoring the current monitoring process, and the temporary node is a child node of the root node; distributing monitoring authority to child nodes under a root node according to a preset rule, and monitoring the current monitoring process by the child nodes with the monitoring authority; the current monitoring process monitors server performance data of a plurality of servers and/or service processing information of a plurality of service modules in the cloud access controller, if the current monitoring process is abnormal, child nodes monitoring the current monitoring process are deleted, a new current monitoring process is determined from the backup monitoring process, and new child nodes are determined from all child nodes under the root node to monitor the new current monitoring process.
7. The system according to claim 6, wherein the child nodes are identified by child node names and sequence values, wherein the sequence value of a child node under the root node is the sequence value of its previous child node plus 1;
the distributed processing framework allocates monitoring authority for child nodes under the root node, and specifically comprises the following steps:
acquiring the sequence values of all child nodes under the root node, and judging whether the sequence value of the current child node is minimum in the sequence values of all child nodes;
if so, allocating monitoring authority to the current child node with the minimum sequence value;
if not, the current child node sets a state monitoring callback on its previous child node, and it is judged whether the sequence value of the previous child node is the minimum among the sequence values of all the child nodes, and so on until the child node with the minimum sequence value under the root node is determined, whereupon monitoring authority is allocated to the child node with the minimum sequence value.
8. The system for distributed monitoring of the operating state of a cloud access controller according to claim 7, wherein the distributed processing framework is further specifically configured to:
creating a service node under the root node and creating sub-nodes under the service node, wherein the sub-nodes comprise a first sub-node and a second sub-node, the first sub-node is used for monitoring server performance data of a server, and the second sub-node is used for monitoring service processing information of a service module;
after the current monitoring process is started, all sub-nodes under the service node are added into the cache and the states of all the sub-nodes are monitored, when the states of the sub-nodes are changed, all the sub-nodes under the service node are obtained, and all the obtained sub-nodes are compared with all the sub-nodes in the cache;
if a sub-node in the cache does not exist among the acquired sub-nodes, the current monitoring process judges that the missing sub-node is abnormal and sends abnormality information to the cloud access controller;
and if a sub-node in the cache exists among the acquired sub-nodes, the current monitoring process acquires the server performance data and/or the service processing information monitored by that sub-node and sends the server performance data and/or the service processing information to the cloud access controller for processing.
9. The system for distributed monitoring of the operating state of the cloud access controller according to claim 8, wherein the occurrence of an abnormality in the current monitoring process specifically includes:
the current monitoring process periodically sends server performance data and/or service processing information according to a preset time interval;
and if the server performance data and/or the service processing information sent by the current monitoring process are not received within the time interval, the current monitoring process is determined to be abnormal.
10. The system for distributed monitoring of the operating state of a cloud access controller according to claim 9, wherein the distributed processing framework is further specifically configured to:
if the current monitoring process is abnormal, the distributed processing framework deletes the child node monitoring the current monitoring process and sends node deletion information to the child node with the next sequence value; when the child node with the next sequence value receives the node deletion information, it sends abnormality information about the current monitoring process and about the child node that was monitoring it to the root node, and the distributed processing framework then determines a new current monitoring process from the backup monitoring processes and, according to the preset rule, determines a new child node from all child nodes under the root node to monitor the new current monitoring process.
CN201611092334.1A 2016-11-30 2016-11-30 Method and system for distributed monitoring of running state of cloud access controller Active CN106487599B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611092334.1A CN106487599B (en) 2016-11-30 2016-11-30 Method and system for distributed monitoring of running state of cloud access controller

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611092334.1A CN106487599B (en) 2016-11-30 2016-11-30 Method and system for distributed monitoring of running state of cloud access controller

Publications (2)

Publication Number Publication Date
CN106487599A CN106487599A (en) 2017-03-08
CN106487599B true CN106487599B (en) 2020-02-04

Family

ID=58274801

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611092334.1A Active CN106487599B (en) 2016-11-30 2016-11-30 Method and system for distributed monitoring of running state of cloud access controller

Country Status (1)

Country Link
CN (1) CN106487599B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11449479B2 (en) * 2018-08-06 2022-09-20 Accelario Software Ltd. Data migration methods and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001001272A2 (en) * 1999-06-30 2001-01-04 Apptitude, Inc. Method and apparatus for monitoring traffic in a network
CN102546256A (en) * 2012-01-12 2012-07-04 易云捷讯科技(北京)有限公司 System and method used for monitoring cloud computation service
CN102739775A (en) * 2012-05-29 2012-10-17 宁波东冠科技有限公司 Method for monitoring and managing Internet of Things data acquisition server cluster
CN102868736A (en) * 2012-08-30 2013-01-09 浪潮(北京)电子信息产业有限公司 Design and implementation method of cloud computing monitoring framework, and cloud computing processing equipment
CN105703940A (en) * 2015-12-10 2016-06-22 中国电力科学研究院 Multistage dispatching distributed parallel computing-oriented monitoring system and monitoring method
CN106021070A (en) * 2016-04-29 2016-10-12 乐视控股(北京)有限公司 Method and device for server cluster monitoring

Also Published As

Publication number Publication date
CN106487599A (en) 2017-03-08

Similar Documents

Publication Publication Date Title
CN108600029B (en) Configuration file updating method and device, terminal equipment and storage medium
CN110830283B (en) Fault detection method, device, equipment and system
CN112073265B (en) Internet of things monitoring method and system based on distributed edge computing
CN103581276A (en) Cluster management device and system, service client side and corresponding method
CN112463448B (en) Distributed cluster database synchronization method, device, equipment and storage medium
CN103036719A (en) Cross-regional service disaster method and device based on main cluster servers
CN106936620B (en) Alarm event processing method and processing device
CN109391691A (en) The restoration methods and relevant apparatus that NAS is serviced under a kind of single node failure
CN108304296A (en) A kind of server monitoring method, system, equipment and computer readable storage medium
CN107943665A (en) A kind of system host monitoring method and device
CN111400041A (en) Server configuration file management method and device and computer readable storage medium
CN104796283B (en) A kind of method of monitoring alarm
CN106487599B (en) Method and system for distributed monitoring of running state of cloud access controller
CN112947333B (en) Socket long connection-based balanced load fragmentation method
CN100359865C (en) Detecting method
CN113766013A (en) Session creation method, device, equipment and storage medium
CN113765690A (en) Cluster switching method, system, device, terminal, server and storage medium
US20150244780A1 (en) System, method and computing apparatus to manage process in cloud infrastructure
CN116010169A (en) Cloud platform RDS database migration disaster recovery method based on cloud protogenesis technology
CN112437146B (en) Equipment state synchronization method, device and system
CN105550094B (en) A kind of high-availability system state automatic monitoring method
CN111935296B (en) System for high-availability infinite MQTT message service capacity expansion
JP2015082131A (en) Monitoring system, monitoring method, monitoring program, and monitoring device
CN114036032A (en) Real-time program monitoring method and device
CN114168416A (en) Method for monitoring FastDFS storage system based on zabbix custom expansion

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20201026

Address after: 318015 no.2-3167, zone a, Nonggang City, no.2388, Donghuan Avenue, Hongjia street, Jiaojiang District, Taizhou City, Zhejiang Province

Patentee after: Taizhou Jiji Intellectual Property Operation Co.,Ltd.

Address before: 201616 Shanghai city Songjiang District Sixian Road No. 3666

Patentee before: Phicomm (Shanghai) Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20201217

Address after: 313300 Sunshine Industrial Park, Dipu Town, Anji County, Huzhou City, Zhejiang Province

Patentee after: Zhejiang Anji chair Technology Co.,Ltd.

Address before: 318015 no.2-3167, area a, nonggangcheng, 2388 Donghuan Avenue, Hongjia street, Jiaojiang District, Taizhou City, Zhejiang Province

Patentee before: Taizhou Jiji Intellectual Property Operation Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20211209

Address after: 518000 1006, building 2, China Phoenix building, No. 2008, Shennan Avenue, Fuzhong community, Lianhua street, Futian District, Shenzhen, Guangdong Province

Patentee after: Shenzhen Shuwang Internet Technology Co.,Ltd.

Address before: 313300 Sunshine Industrial Park, Dipu Town, Anji County, Huzhou City, Zhejiang Province

Patentee before: Zhejiang Anji chair Technology Co.,Ltd.
