CN105681103A - Loongson-chip-based cluster resource monitoring realization method - Google Patents
Loongson-chip-based cluster resource monitoring realization method
- Publication number
- CN105681103A (application number CN201610117765.2A)
- Authority
- CN
- China
- Prior art keywords
- node
- cluster
- monitoring
- information
- resource
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/04—Processing captured monitoring data, e.g. for logfile generation
- H04L43/045—Processing captured monitoring data, e.g. for logfile generation for graphical visualisation of monitoring data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/10—Active monitoring, e.g. heartbeat, ping or trace-route
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Cardiology (AREA)
- General Health & Medical Sciences (AREA)
- Environmental & Geological Engineering (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention relates to the technical field of system resource monitoring on the Loongson platform, and in particular to a Loongson-based method for implementing cluster resource monitoring. The method realizes cluster system resource monitoring on a Loongson platform. It monitors node liveness, the resource usage of every node in the cluster, and each node's fan speed, processor temperature, and mainboard temperature, so that faults that may occur in the system can be predicted.
Description
Technical field
The present invention relates to the technical field of system resource monitoring on the Loongson platform, and in particular to a method for implementing cluster resource monitoring on the Loongson platform.
Background
A cluster is a group of computers that together provide network resources to the outside as a single whole. From the user's point of view, a cluster is one system rather than multiple computer systems. Clusters offer high scalability, high availability, and high performance. As information technology develops rapidly, cluster systems let users build a cluster from commodity hardware and add new hardware at any time as actual needs require, which improves the scalability and operability of the system.
Cluster system resource monitoring is the core of cluster management and mainly monitors the system resources of the nodes. The data obtained can be used to allocate and utilize cluster system resources. It also lets users learn whether a node has failed, or take measures in advance to prevent a failure, ultimately ensuring the reliability of the cluster.
In fields such as domestically produced, independently controllable computers, the Loongson platform occupies an important position, so implementing cluster system resource monitoring on the Loongson platform is of great significance.
Summary of the invention
To solve the problems of the prior art, the present invention provides a method for implementing cluster resource monitoring on the Loongson platform, which realizes cluster system resource monitoring on that platform.
The technical solution adopted by the present invention is as follows:
A method for implementing cluster resource monitoring on the Loongson platform comprises the following steps:
A. Deploy a monitoring agent on each node of the Loongson-based cluster to monitor the node's liveness and collect node information; the management node is responsible for gathering the information collected by each monitoring agent.
B. Deploy a database and plotting software on the monitor node.
C. Store the information collected from each node in the database.
D. Analyze the collected information and present it to the user.
Step D specifically comprises:
D1. Analyze the node liveness data; if a node has failed and shut down, prompt the user to handle the faulty node.
D2. Analyze each node's processor temperature, mainboard temperature, and fan speed; based on the results, predict problems the system may develop and warn the user to take preventive measures in time.
D3. Visualize the resource information, presenting resource usage to the user intuitively as line charts, pie charts, or bar charts.
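Steps D1 and D2 can be illustrated with a minimal Python sketch. The patent does not give any thresholds, field names, or message wording, so the timeout, the temperature and fan limits, and the `NodeReport` structure below are all hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds; the patent does not specify any values.
HEARTBEAT_TIMEOUT_S = 30.0
MAX_CPU_TEMP_C = 85.0
MAX_BOARD_TEMP_C = 70.0
MIN_FAN_RPM = 1000


@dataclass
class NodeReport:
    node: str
    last_seen: float      # time of the most recent report, in seconds
    cpu_temp_c: float
    board_temp_c: float
    fan_rpm: int


def analyze(report: NodeReport, now: float) -> List[str]:
    """Return user-facing messages for one node (steps D1 and D2)."""
    # D1: a node that has not reported within the timeout is treated as down.
    if now - report.last_seen > HEARTBEAT_TIMEOUT_S:
        return [f"{report.node}: node is down, please handle the faulty node"]

    # D2: out-of-range sensor readings are early warnings, not failures.
    warnings = []
    if report.cpu_temp_c > MAX_CPU_TEMP_C:
        warnings.append(f"{report.node}: processor temperature is high")
    if report.board_temp_c > MAX_BOARD_TEMP_C:
        warnings.append(f"{report.node}: mainboard temperature is high")
    if report.fan_rpm < MIN_FAN_RPM:
        warnings.append(f"{report.node}: fan speed is low")
    return warnings
```

A fixed-threshold check is the simplest possible reading of "predict"; a trend-based analysis over stored history would fit the text equally well.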
The resource monitoring system of the invention uses a centralized architecture. A monitoring agent is deployed on each node of the cluster; the agent obtains the resource information of its own node and responds to monitoring commands from the monitoring system. The management node (monitor node) gathers the node resource information obtained by each monitoring agent, such as processor utilization and memory usage. In addition, it collects fan speed, processor temperature, and mainboard temperature data, which are used to predict problems a node may develop.
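As an illustration of how an agent might obtain processor utilization and memory usage, the following Python sketch parses text in the format of Linux `/proc/stat` and `/proc/meminfo`. It assumes the Loongson nodes run Linux and expose these files; the patent does not say how the agent reads its data, and the function names are hypothetical.

```python
from typing import Tuple


def parse_cpu_times(stat_text: str) -> Tuple[int, int]:
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Idle time is idle + iowait (iowait may be absent on old kernels).
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            # Sum only user..steal; guest times are already counted in user/nice.
            return idle, sum(values[:8])
    raise ValueError("no aggregate cpu line found")


def cpu_utilization(prev: Tuple[int, int], curr: Tuple[int, int]) -> float:
    """Percent of non-idle time between two (idle, total) samples."""
    idle_delta = curr[0] - prev[0]
    total_delta = curr[1] - prev[1]
    if total_delta <= 0:
        return 0.0
    return 100.0 * (1.0 - idle_delta / total_delta)


def memory_utilization(meminfo_text: str) -> float:
    """Percent of memory in use, from /proc/meminfo-style text."""
    info = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    total = info["MemTotal"]
    # MemAvailable is missing on older kernels; fall back to MemFree.
    available = info.get("MemAvailable", info["MemFree"])
    return 100.0 * (total - available) / total
```

On a live node, the agent would read `/proc/stat` twice a short interval apart and pass both samples to `cpu_utilization`, since the counters are cumulative.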
The beneficial effects of the technical solution provided by the present invention are as follows:
The method implements cluster system resource monitoring on the Loongson platform. It mainly monitors node liveness and the resource usage of each node in the cluster, such as processor utilization, network traffic, and disk utilization. It also monitors each node's fan speed, processor temperature, and mainboard temperature in order to predict faults that may occur in the system.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is the system network architecture diagram of the cluster resource monitoring implementation method based on the Loongson platform according to the present invention.
Fig. 2 is the flowchart of the cluster resource monitoring implementation method based on the Loongson platform according to the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the invention are described in further detail below with reference to the drawings.
Embodiment 1
This embodiment designs a centralized resource monitoring architecture. A monitoring agent is deployed on each node of the cluster; the agent obtains the resource information of its own node and responds to monitoring commands from the monitoring system. The management node (monitor node) gathers the node resource information obtained by each monitoring agent, such as processor utilization and memory usage. In addition, it collects fan speed, processor temperature, and mainboard temperature data, which are used to predict problems a node may develop. The system network architecture of the cluster is shown in Fig. 1.
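The agent's role of responding to monitoring commands could look like the following Python sketch. The patent does not define a command format or transport, so the JSON messages, the `ping` and `collect` command names, and the `collectors` mapping are hypothetical and serve only to illustrate the request and response pattern between the management node and an agent.

```python
import json
from typing import Callable, Dict


def handle_command(raw_request: str,
                   collectors: Dict[str, Callable[[], float]],
                   node: str) -> str:
    """Answer one monitoring command from the management node as JSON text."""
    try:
        request = json.loads(raw_request)
    except json.JSONDecodeError:
        return json.dumps({"node": node, "error": "malformed request"})
    if not isinstance(request, dict):
        return json.dumps({"node": node, "error": "malformed request"})

    command = request.get("command")
    if command == "ping":
        # Liveness probe: any reply at all tells the manager the node is up.
        return json.dumps({"node": node, "alive": True})
    if command == "collect":
        # Default to every metric the agent knows when none are named.
        wanted = request.get("metrics") or list(collectors)
        unknown = [m for m in wanted if m not in collectors]
        if unknown:
            return json.dumps({"node": node, "error": f"unknown metrics: {unknown}"})
        values = {m: collectors[m]() for m in wanted}
        return json.dumps({"node": node, "metrics": values})
    return json.dumps({"node": node, "error": f"unknown command: {command}"})
```

Keeping the handler a pure function of the request text makes it independent of the transport, which could be a TCP socket, HTTP, or anything else the cluster already uses.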
As shown in Fig. 2, the system of this embodiment is implemented in the following steps:
(1) Deploy a monitoring agent on each node of the Loongson-based cluster to monitor the node's liveness and collect node information; the management node is responsible for gathering the information collected by each monitoring agent.
(2) Deploy a database and plotting software on the monitor node.
(3) Store the resource information and other data collected from each node in the database.
(4) Analyze the data:
1. Analyze the node liveness data; if a node has gone down, remind the user to repair the faulty node.
2. Analyze each node's processor temperature, fan speed, and mainboard temperature data to predict whether a fault may occur.
3. Display each node's resource usage (CPU utilization, memory utilization, and so on) graphically and intuitively, so that the user can analyze and make use of the cluster's resource usage.
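Steps (2) to (4) can be sketched in Python using the standard-library `sqlite3` module as a stand-in database. The patent names neither a database product nor a schema, so the `samples` table, its columns, and the text-based bar rendering (used here in place of real plotting software) are assumptions made for illustration.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the monitor node's database and create the (hypothetical) schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " node TEXT NOT NULL, ts REAL NOT NULL,"
        " cpu_pct REAL, mem_pct REAL)"
    )
    return conn


def store_sample(conn, node, ts, cpu_pct, mem_pct):
    """Step (3): persist one sample gathered from a node's agent."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO samples VALUES (?, ?, ?, ?)",
            (node, ts, cpu_pct, mem_pct),
        )


def latest_per_node(conn):
    """Return (node, cpu_pct, mem_pct) for the newest sample of each node."""
    rows = conn.execute(
        "SELECT s.node, s.cpu_pct, s.mem_pct FROM samples s"
        " JOIN (SELECT node, MAX(ts) AS ts FROM samples GROUP BY node) m"
        " ON s.node = m.node AND s.ts = m.ts ORDER BY s.node"
    )
    return rows.fetchall()


def render_bars(rows, width: int = 20) -> str:
    """Step (4): a plain-text bar per node, standing in for a bar chart."""
    lines = []
    for node, cpu_pct, _mem_pct in rows:
        filled = round(width * cpu_pct / 100)
        bar = "#" * filled + "." * (width - filled)
        lines.append(f"{node:<8} cpu {bar} {cpu_pct:5.1f}%")
    return "\n".join(lines)
```

Storing every sample rather than only the latest one keeps the history needed for line charts and for trend analysis of temperatures and fan speed.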
The foregoing is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (3)
1. A method for implementing cluster resource monitoring on the Loongson platform, comprising the following steps:
A. Deploy a monitoring agent on each node of the Loongson-based cluster to monitor the node's liveness and collect node information; the management node is responsible for gathering the information collected by each monitoring agent.
B. Deploy a database and plotting software on the monitor node.
C. Store the information collected from each node in the database.
D. Analyze the collected information and present it to the user.
2. The method for implementing cluster resource monitoring on the Loongson platform according to claim 1, characterized in that step D specifically comprises:
D1. Analyze the node liveness data; if a node has failed and shut down, prompt the user to handle the faulty node.
D2. Analyze each node's processor temperature, mainboard temperature, and fan speed; based on the results, predict problems the system may develop and warn the user to take preventive measures in time.
D3. Visualize the resource information, presenting resource usage to the user intuitively as line charts, pie charts, or bar charts.
3. The method for implementing cluster resource monitoring on the Loongson platform according to claim 1, characterized in that the monitoring agent is responsible for obtaining the resource information of its own node and for responding to monitoring commands from the monitoring system.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610117765.2A CN105681103A (en) | 2016-03-03 | 2016-03-03 | Loongson-chip-based cluster resource monitoring realization method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610117765.2A CN105681103A (en) | 2016-03-03 | 2016-03-03 | Loongson-chip-based cluster resource monitoring realization method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105681103A (en) | 2016-06-15 |
Family
ID=56306427
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610117765.2A Pending CN105681103A (en) | 2016-03-03 | 2016-03-03 | Loongson-chip-based cluster resource monitoring realization method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105681103A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108989080A (en) * | 2018-05-29 | 2018-12-11 | 华为技术有限公司 | The method and apparatus of management node |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104468810A (en) * | 2014-12-18 | 2015-03-25 | 山东超越数控电子有限公司 | Method for monitoring high-performance computing resource based on loongson platform |
CN105024880A (en) * | 2015-07-17 | 2015-11-04 | 哈尔滨工程大学 | Elastic monitoring method for key task computer cluster |
- 2016-03-03: application CN201610117765.2A filed (publication CN105681103A), status listed as Pending
Similar Documents
Publication | Title |
---|---|
US11733829B2 (en) | Monitoring tree with performance states |
US10243818B2 (en) | User interface that provides a proactive monitoring tree with state distribution ring |
US10515469B2 (en) | Proactive monitoring tree providing pinned performance information associated with a selected node |
US10523538B2 (en) | User interface that provides a proactive monitoring tree with severity state sorting |
US10489234B2 (en) | Large log file diagnostics system |
CN102694868B (en) | A kind of group system realizes and task dynamic allocation method |
CN110245052B (en) | Method and device for determining hot spot component of data system, electronic equipment and storage medium |
CN111418187A (en) | Scalable statistics and analysis mechanism in cloud networks |
Balliu et al. | A big data analyzer for large trace logs |
Wu et al. | An Auxiliary Decision‐Making System for Electric Power Intelligent Customer Service Based on Hadoop |
Wang et al. | Research on key technology of edge-node resource scheduling based on linear programming |
Wesolowski et al. | Datacenter-scale analysis and optimization of gpu machine learning workloads |
CN105681103A (en) | Loongson-chip-based cluster resource monitoring realization method |
Moguel et al. | Multilayer big data architecture for remote sensing in Eolic parks |
US11222072B1 (en) | Graph database management system and method for a distributed computing environment |
Metsch et al. | Apex lake: a framework for enabling smart orchestration |
Chen et al. | Big data storage architecture design in cloud computing |
Yongdnog et al. | A scalable and integrated cloud monitoring framework based on distributed storage |
Terai et al. | An operational data collecting and monitoring platform for Fugaku: system overviews and case studies in the prelaunch service period |
US11475017B2 (en) | Asynchronous data enrichment for an append-only data store |
CN110262943B (en) | Abnormal component determining method and device of data system, electronic equipment and storage medium |
Nguyen et al. | Hiperviz: Interactive visualization of CPU temperatures in high performance computing centers |
US8838414B2 (en) | Determining when to create a prediction based on deltas of metric values |
Fay et al. | Next generation monitoring: Tier 2 experience |
Xu | Automatic selection and parameter configuration of big data software core components based on retention pattern |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20160615 |