CN105553766A - Monitoring method of abnormal node dynamic tracking cluster node state - Google Patents
- Publication number
- CN105553766A (application CN201510927404.XA)
- Authority
- CN
- China
- Prior art keywords
- module
- abnormal
- cluster
- node
- clustered node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/10—Active monitoring, e.g. heartbeat, ping or trace-route
- H04L43/103—Active monitoring, e.g. heartbeat, ping or trace-route with adaptive polling, i.e. dynamically adapting the polling rate
Landscapes
- Health & Medical Sciences (AREA)
- Cardiology (AREA)
- General Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Debugging And Monitoring (AREA)
Abstract
The present invention discloses a method for monitoring cluster node state with dynamic tracking of abnormal nodes. The method involves a cluster node state service module, a cluster node module, a cluster abnormal node detection module, and a cluster abnormal node queue module. The steps are: (A) the cluster node module sends a node report to the cluster node state service module and checks whether an offline abnormality exists; (B) the cluster abnormal node detection module performs abnormal node detection on the cluster node module and passes each abnormal node to the cluster abnormal node queue module; (C) the cluster abnormal node queue module sends the abnormal nodes in its queue to the cluster node state service module in order and adjusts the abnormality judgement; (D) the cluster node state service module receives the abnormality judgement from the cluster abnormal node queue module and completes the monitoring of cluster node state. The method reduces the cost of monitoring, raises its sensitivity, ensures the stability of the monitoring service, and offers good applicability and practicality.
Description
Technical field
The invention belongs to the technical field of distributed databases, and specifically relates to a method for monitoring cluster node state with dynamic tracking of abnormal nodes.
Background art
As data volumes grow, large clusters often reach more than 100 nodes. Traditional second-level timed polling, or having the monitored nodes report in sequence, consumes too much of the network and other resources and therefore cannot keep up with the monitoring of large clusters. A new monitoring method is needed that guarantees a certain monitoring sensitivity while consuming less of the network and other resources. Under this requirement, active polling can be combined with abnormal-node tracking: a longer polling interval ensures that cluster resources are not consumed frequently, and a tracking queue is set up so that nodes showing an offline abnormality are tracked closely. This tracking runs at second-level granularity to ensure sensitivity.
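The trade-off described above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `probes_per_second`, the 300-node cluster size, the 60-second polling interval, and the three tracked nodes are assumed values, not figures from the patent. For simplicity it counts every node in the long-interval poll, so the hybrid figure is an upper bound.

```python
def probes_per_second(total_nodes, poll_interval_s, tracked_nodes=0, track_interval_s=1.0):
    """Average probe rate for long-interval polling plus second-level tracking.

    All parameter values used below are assumptions chosen for illustration.
    """
    if poll_interval_s <= 0 or track_interval_s <= 0:
        raise ValueError("intervals must be positive")
    return total_nodes / poll_interval_s + tracked_nodes / track_interval_s


# Polling every node once per second (the traditional approach).
naive_rate = probes_per_second(300, 1.0)

# Polling every node once per minute, plus per-second tracking of 3 abnormal nodes.
hybrid_rate = probes_per_second(300, 60.0, tracked_nodes=3, track_interval_s=1.0)

print(naive_rate, hybrid_rate)
```

Under these assumed numbers the hybrid scheme issues far fewer probes per second while still checking the suspect nodes every second.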
Many existing schemes monitor by active polling alone. Their biggest drawback is that the polling interval is hard to set. If the interval is too long, system sensitivity is too low; if it is too short, system overhead is excessive because a large number of sockets must be created repeatedly. Consider a very large cluster of 300 compute nodes: continually creating and tearing down TCP connections has a considerable impact on the whole system. Other schemes use only dynamic tracking of abnormal nodes. This approach tracks abnormal nodes only; although its overhead is small and its sensitivity is high, a problem on a node cannot be discovered while nobody is sending to it over a socket, and the approach can sometimes cause problems such as deadlock.
Summary of the invention
To solve the above technical problems, the present invention provides a method for monitoring cluster node state with dynamic tracking of abnormal nodes that is stable in use and greatly improves sensitivity.
The technical solution that achieves the object of the invention is a method for monitoring cluster node state with dynamic tracking of abnormal nodes, involving a cluster node state service module, a cluster node module, a cluster abnormal node detection module, and a cluster abnormal node queue module. The specific steps are:
A. The cluster node module sends a node report to the cluster node state service module and self-checks whether an offline abnormality exists;
B. The cluster abnormal node detection module performs abnormal node detection on the cluster node module and, in combination with the cluster node module's offline-abnormality self-check, confirms whether the node is abnormal; when an abnormal node is confirmed, it passes that node to the cluster abnormal node queue module;
C. The cluster abnormal node queue module sends the abnormal nodes in its queue to the cluster node state service module in order, and adjusts the long-abnormality judgement;
D. The cluster node state service module receives the abnormality judgement from the cluster abnormal node queue module and, as the judgement requires, starts or stops second-level abnormality tracking, thereby completing the monitoring of cluster node state.
In step D, when the cluster abnormal node queue module detects a long abnormality, the cluster node state service module starts or continues second-level abnormality tracking; when the cluster abnormal node queue module does not detect a long abnormality, the cluster node state service module suspends second-level abnormality tracking and at the same time resumes polling of the abnormal node.
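Steps C and D can be sketched as a queue that hands judgements to the status service. This is a minimal illustration, not the patented implementation: the text does not define when an abnormality counts as "long", so the threshold of three consecutive detections (`LONG_ABNORMAL_THRESHOLD`) is an assumption, as are the class and method names.

```python
from collections import deque

LONG_ABNORMAL_THRESHOLD = 3  # assumed: detections needed to call an abnormality "long"


class AbnormalNodeQueue:
    """Hypothetical stand-in for the cluster abnormal node queue module."""

    def __init__(self):
        self._queue = deque()
        self._detections = {}

    def push(self, node_id):
        # Step B hands over a confirmed abnormal node; repeats only raise its count.
        if node_id not in self._detections:
            self._queue.append(node_id)
        self._detections[node_id] = self._detections.get(node_id, 0) + 1

    def drain(self):
        # Step C: emit (node_id, long_abnormal) in first-in, first-out order.
        while self._queue:
            node_id = self._queue.popleft()
            count = self._detections.pop(node_id)
            yield node_id, count >= LONG_ABNORMAL_THRESHOLD


class StatusService:
    """Hypothetical stand-in for the cluster node state service module."""

    def __init__(self, polled):
        self.polled = set(polled)   # nodes under long-interval polling
        self.tracked = set()        # nodes under second-level tracking

    def handle(self, node_id, long_abnormal):
        # Step D: start or continue tracking, or suspend it and resume polling.
        if long_abnormal:
            self.polled.discard(node_id)
            self.tracked.add(node_id)
        else:
            self.tracked.discard(node_id)
            self.polled.add(node_id)


abnormal_queue = AbnormalNodeQueue()
for _ in range(3):
    abnormal_queue.push("node-7")   # detected three times: long abnormality
abnormal_queue.push("node-9")       # detected once: not long

service = StatusService(polled={"node-7", "node-9"})
for node_id, long_abnormal in abnormal_queue.drain():
    service.handle(node_id, long_abnormal)

print(sorted(service.tracked), sorted(service.polled))
```

In this sketch a node is either polled or tracked, never both, which follows the wording of step D that polling resumes when tracking is suspended.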
There are at least two cluster node state service modules, and a switching module is provided between them.
There are at least two cluster abnormal node detection modules, and each of them is connected to the cluster abnormal node queue module.
The invention has the following positive effects: it monitors cluster nodes by combining active polling with dynamic tracking of abnormal nodes, which greatly reduces the cost of monitoring, substantially improves monitoring sensitivity, and ensures the stability of the monitoring service; its applicability is good and its practicality is high.
Brief description of the drawings
To make the content of the invention easier to understand clearly, the invention is described in further detail below according to a specific embodiment and with reference to the drawings, in which:
Fig. 1 is a structural block diagram of the invention;
Fig. 2 is a block diagram of the specific steps of the invention;
Fig. 3 is a flow diagram of the long-abnormality judgement of the invention.
Detailed description
(Embodiment 1)
Figs. 1 to 3 show one embodiment of the invention, in which Fig. 1 is a structural block diagram of the invention, Fig. 2 is a block diagram of the specific steps of the invention, and Fig. 3 is a flow diagram of the long-abnormality judgement of the invention.
Referring to Figs. 1 to 3, a method for monitoring cluster node state with dynamic tracking of abnormal nodes involves a cluster node state service module 1, a cluster node module 2, a cluster abnormal node detection module 3, and a cluster abnormal node queue module 4. The specific steps are:
A. The cluster node module sends a node report to the cluster node state service module and self-checks whether an offline abnormality exists;
B. The cluster abnormal node detection module performs abnormal node detection on the cluster node module and, in combination with the cluster node module's offline-abnormality self-check, confirms whether the node is abnormal; when an abnormal node is confirmed, it passes that node to the cluster abnormal node queue module;
C. The cluster abnormal node queue module sends the abnormal nodes in its queue to the cluster node state service module in order, and adjusts the long-abnormality judgement;
D. The cluster node state service module receives the abnormality judgement from the cluster abnormal node queue module and, as the judgement requires, starts or stops second-level abnormality tracking, thereby completing the monitoring of cluster node state.
In step D, when the cluster abnormal node queue module detects a long abnormality, the cluster node state service module starts or continues second-level abnormality tracking; when the cluster abnormal node queue module does not detect a long abnormality, the cluster node state service module suspends second-level abnormality tracking and at the same time resumes polling of the abnormal node.
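The confirmation in step B combines the detection module's own check with the node's self-check, but the text does not say how the two are weighed. The sketch below assumes one possible rule: a node is confirmed abnormal only when a TCP probe fails and the node has not just reported itself healthy. The helper names `tcp_probe` and `confirm_abnormal`, the three-valued `self_check_offline` argument, and the use of a TCP connection as the probe are all assumptions.

```python
import socket


def tcp_probe(host, port, timeout_s=1.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def confirm_abnormal(self_check_offline, probe_ok):
    """Assumed confirmation rule for step B.

    self_check_offline: True if the node reported an offline abnormality,
    False if it reported itself healthy, None if no report arrived.
    probe_ok: result of the detection module's own probe.
    """
    if probe_ok:
        return False
    # The probe failed; confirm unless the node's own report says it is healthy.
    return self_check_offline is not False


print(confirm_abnormal(True, False), confirm_abnormal(False, False))
```

A caller would pass `tcp_probe(host, port)` as `probe_ok`; keeping the probe separate from the rule lets the rule be tested without network access.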
There are at least two cluster node state service modules, and a switching module 5 is provided between them.
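One way to picture the switching module is as a selector that routes reports to the first healthy status service. The text does not describe the switching policy, so the ordered-preference rule below, together with the `SwitchModule` class and the service names, is an assumption made for illustration.

```python
class SwitchModule:
    """Hypothetical switching module between redundant status services."""

    def __init__(self, services):
        if len(services) < 2:
            raise ValueError("at least two status services are required")
        self._services = list(services)
        self._healthy = {name: True for name in self._services}

    def mark(self, name, healthy):
        # Record the latest known health of one status service.
        if name not in self._healthy:
            raise KeyError(name)
        self._healthy[name] = healthy

    def active(self):
        # Assumed policy: prefer services in their configured order.
        for name in self._services:
            if self._healthy[name]:
                return name
        raise RuntimeError("no healthy status service available")


switch = SwitchModule(["status-a", "status-b"])
first = switch.active()
switch.mark("status-a", False)   # simulate a failure of the primary
second = switch.active()
print(first, second)
```

With this policy the primary takes over again as soon as it is marked healthy; a real deployment might prefer to stay on the standby to avoid flapping.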
There are at least two cluster abnormal node detection modules, and each of them is connected to the cluster abnormal node queue module.
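Several detection modules feeding one queue module can be sketched with a thread-safe queue. This is only an illustration under assumptions: the detector names, the suspect lists, and the choice to merge duplicate reports by node are not taken from the patent.

```python
import queue
import threading


def detector(name, suspects, out):
    """Hypothetical detection module: report each suspect node to the shared queue."""
    for node_id in suspects:
        out.put((node_id, name))


shared_queue = queue.Queue()  # stands in for the cluster abnormal node queue module
workers = [
    threading.Thread(target=detector, args=("detector-1", ["node-3", "node-8"], shared_queue)),
    threading.Thread(target=detector, args=("detector-2", ["node-8"], shared_queue)),
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

# Merge reports so a node seen by several detectors appears once.
reported_by = {}
while not shared_queue.empty():
    node_id, name = shared_queue.get()
    reported_by.setdefault(node_id, set()).add(name)

print(sorted(reported_by))
```

Because both threads have been joined before the queue is drained, `empty()` is reliable here; a long-running consumer would block on `get()` instead.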
The invention monitors cluster nodes by combining active polling with dynamic tracking of abnormal nodes, which greatly reduces the cost of monitoring, substantially improves monitoring sensitivity, and ensures the stability of the monitoring service; its applicability is good and its practicality is high.
Obviously, the above embodiment is merely an example given to describe the invention clearly and does not limit the ways in which the invention may be implemented. Those of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Obvious changes or variations that derive from the spirit of the invention remain within its scope of protection.
Claims (4)
1. A method for monitoring cluster node state with dynamic tracking of abnormal nodes, involving a cluster node state service module, a cluster node module, a cluster abnormal node detection module, and a cluster abnormal node queue module, characterized in that the specific steps are:
A. The cluster node module sends a node report to the cluster node state service module and self-checks whether an offline abnormality exists;
B. The cluster abnormal node detection module performs abnormal node detection on the cluster node module and, in combination with the cluster node module's offline-abnormality self-check, confirms whether the node is abnormal; when an abnormal node is confirmed, it passes that node to the cluster abnormal node queue module;
C. The cluster abnormal node queue module sends the abnormal nodes in its queue to the cluster node state service module in order, and adjusts the long-abnormality judgement;
D. The cluster node state service module receives the abnormality judgement from the cluster abnormal node queue module and, as the judgement requires, starts or stops second-level abnormality tracking, thereby completing the monitoring of cluster node state.
2. The method for monitoring cluster node state with dynamic tracking of abnormal nodes according to claim 1, characterized in that: in step D, when the cluster abnormal node queue module detects a long abnormality, the cluster node state service module starts or continues second-level abnormality tracking; when the cluster abnormal node queue module does not detect a long abnormality, the cluster node state service module suspends second-level abnormality tracking and at the same time resumes polling of the abnormal node.
3. The method for monitoring cluster node state with dynamic tracking of abnormal nodes according to claim 2, characterized in that: there are at least two cluster node state service modules, and a switching module is provided between them.
4. The method for monitoring cluster node state with dynamic tracking of abnormal nodes according to claim 3, characterized in that: there are at least two cluster abnormal node detection modules, and each of them is connected to the cluster abnormal node queue module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510927404.XA CN105553766A (en) | 2015-12-12 | 2015-12-12 | Monitoring method of abnormal node dynamic tracking cluster node state |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510927404.XA CN105553766A (en) | 2015-12-12 | 2015-12-12 | Monitoring method of abnormal node dynamic tracking cluster node state |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105553766A true CN105553766A (en) | 2016-05-04 |
Family
ID=55832705
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510927404.XA Pending CN105553766A (en) | 2015-12-12 | 2015-12-12 | Monitoring method of abnormal node dynamic tracking cluster node state |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105553766A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106130761A (en) * | 2016-06-22 | 2016-11-16 | 北京百度网讯科技有限公司 | The recognition methods of the failed network device of data center and device |
CN106817700A (en) * | 2017-03-02 | 2017-06-09 | 中国人民解放军信息工程大学 | Detection of anomaly node method based on multiple integrality remote proving |
CN107231359A (en) * | 2017-06-08 | 2017-10-03 | 山东超越数控电子有限公司 | A kind of high-availability cluster node security method for monitoring state and system |
CN110716842A (en) * | 2019-10-09 | 2020-01-21 | 北京小米移动软件有限公司 | Cluster fault detection method and device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101640610A (en) * | 2009-09-02 | 2010-02-03 | 中兴通讯股份有限公司 | Method and system for automatic discovery of Ethernet link |
CN101938504A (en) * | 2009-06-30 | 2011-01-05 | 深圳市融创天下科技发展有限公司 | Cluster server intelligent dispatching method and system |
CN102404390A (en) * | 2011-11-07 | 2012-04-04 | 广东电网公司电力科学研究院 | Intelligent dynamic load balancing method for high-speed real-time database |
CN104460650A (en) * | 2014-10-24 | 2015-03-25 | 中国科学院遥感与数字地球研究所 | Fault diagnosis device and method for remote sensing satellite receiving system |
CN104639347A (en) * | 2013-11-07 | 2015-05-20 | 北大方正集团有限公司 | Multi-cluster monitoring method and device, and system |
KR101572672B1 (en) * | 2012-01-05 | 2015-12-04 | 한국전자통신연구원 | Method for monitoring node failure on communication network and system thereof |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106130761A (en) * | 2016-06-22 | 2016-11-16 | 北京百度网讯科技有限公司 | The recognition methods of the failed network device of data center and device |
CN106130761B (en) * | 2016-06-22 | 2019-06-18 | 北京百度网讯科技有限公司 | The recognition methods of the failed network device of data center and device |
CN106817700A (en) * | 2017-03-02 | 2017-06-09 | 中国人民解放军信息工程大学 | Detection of anomaly node method based on multiple integrality remote proving |
CN106817700B (en) * | 2017-03-02 | 2019-06-28 | 中国人民解放军信息工程大学 | Detection of anomaly node method based on multiple integrality remote proving |
CN107231359A (en) * | 2017-06-08 | 2017-10-03 | 山东超越数控电子有限公司 | A kind of high-availability cluster node security method for monitoring state and system |
CN110716842A (en) * | 2019-10-09 | 2020-01-21 | 北京小米移动软件有限公司 | Cluster fault detection method and device |
CN110716842B (en) * | 2019-10-09 | 2023-11-21 | 北京小米移动软件有限公司 | Cluster fault detection method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106453648B (en) | Equipment state determination method and device for intelligent household equipment | |
CN109274557B (en) | Intelligent CMDB management and cloud host monitoring method in cloud environment | |
US9141491B2 (en) | Highly available server system based on cloud computing | |
CN105553766A (en) | Monitoring method of abnormal node dynamic tracking cluster node state | |
CN106357469B (en) | A kind of dynamic adjusting method and device of monitoring resource mode | |
CN104320311A (en) | Heartbeat detection method of SCADA distribution type platform | |
CN104022904A (en) | Unified management platform for IT devices in distributed computer rooms | |
CN103986604A (en) | Method and device for locating network fault | |
CN107360239A (en) | A kind of client connection status detection method and system | |
WO2015188553A1 (en) | Link backup and power source backup method, device and system, and storage medium | |
WO2016065552A1 (en) | Heartbeat cycle setting method and terminal | |
US11539609B2 (en) | Method and apparatus for reporting power down events in a network node without a backup energy storage device | |
CN105739668A (en) | Power management method and power management system of notebook computers | |
CN104468201A (en) | Automatic deleting method and device for offline network equipment | |
CN104320285A (en) | Website running status monitoring method and device | |
CN103684897A (en) | Method, system and device for detecting network connectivity in client | |
CN105530145A (en) | Agentless equipment monitoring network based on ZABBIX framework, networking method and monitoring method | |
CN109639640B (en) | Message sending method and device | |
CN108933685A (en) | A kind of the monitoring maintaining method and system of Intelligent hardware product | |
CN103841047A (en) | Link aggregation method and device | |
CN103825765A (en) | Method and device for device state polling | |
CN103941843A (en) | Mode switching method and device | |
CN107682906B (en) | Routing inspection data communication method and system in machine room | |
CN203984097U (en) | Inclination of transmission line tower detection system based on stelliform connection topology configuration | |
EP4072106A1 (en) | Dynamic environment monitoring |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20160504 |