CN105681103A - Loongson-chip-based cluster resource monitoring realization method - Google Patents

Publication number
CN105681103A
Authority
CN
China
Prior art keywords
node
cluster
monitoring
information
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610117765.2A
Other languages
Chinese (zh)
Inventor
柳玉巧
陈乃阔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Chaoyue Numerical Control Electronics Co Ltd
Original Assignee
Shandong Chaoyue Numerical Control Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Chaoyue Numerical Control Electronics Co Ltd
Priority to CN201610117765.2A
Publication of CN105681103A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation
    • H04L43/045 Processing captured monitoring data, e.g. for logfile generation for graphical visualisation of monitoring data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Environmental & Geological Engineering (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to the technical field of system resource monitoring for the Loongson platform, and in particular to a cluster resource monitoring implementation method based on the Loongson platform. The method implements cluster system resource monitoring on the Loongson platform. It monitors whether each node is alive, the resource usage of every node in the cluster, and each node's fan speed, processor temperature, and mainboard temperature, so that faults that may occur in the system can be predicted.

Description

A cluster resource monitoring implementation method based on the Loongson platform
Technical field
The present invention relates to the technical field of system resource monitoring methods for the Loongson platform, and in particular to a cluster resource monitoring implementation method based on the Loongson platform.
Background art
A cluster is a group of computers that together provide network resources to the outside as a single whole. From the user's point of view, a cluster is one system rather than multiple computer systems. Clusters offer high scalability, high availability, and high performance. In an era of rapid information growth, cluster systems let users build a cluster from commodity hardware and add new hardware at any time as needs change, which improves the scalability and operability of the system.
Cluster system resource monitoring is the core of cluster management and mainly monitors the system resources of the nodes. The data obtained can be used to allocate and utilize cluster system resources. It also lets the user learn whether a node has failed, or take relevant measures in advance to prevent a failure, and thereby ensures the reliability of the cluster.
The Loongson platform occupies an important position in the field of domestically produced, independently controllable computers, so implementing cluster system resource monitoring on the Loongson platform is of great significance.
Summary of the invention
To solve the problems of the prior art, the present invention provides a cluster resource monitoring implementation method based on the Loongson platform, which implements cluster system resource monitoring on the Loongson platform.
The technical solution adopted by the present invention is as follows:
A cluster resource monitoring implementation method based on the Loongson platform, comprising the following steps:
A. Deploy a monitoring agent on each node of the Loongson-based cluster to monitor whether the node is alive and to collect node information; the management node is responsible for gathering the information collected by each monitoring agent;
B. Deploy a database and plotting software on the monitoring node;
C. Store the information collected from each node in the database;
D. Analyze the collected information and present it to the user.
Step D specifically comprises:
D1. Analyze the node liveness data; if a node has failed and shut down, prompt the user to deal with the faulty node;
D2. Analyze each node's processor temperature, mainboard temperature, and fan speed; based on the results, predict problems the system may develop and warn the user to take precautions in time;
D3. Visualize the resource information, presenting resource usage to the user intuitively as line charts, pie charts, or bar charts.
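As an illustration of steps D1 and D2, the following is a minimal sketch in Python. The patent does not specify how liveness is detected or which limits trigger a warning, so the heartbeat timeout, the temperature and fan-speed thresholds, and the names NodeSample and analyze are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical limits; the patent does not give any numeric values.
HEARTBEAT_TIMEOUT_S = 30.0
CPU_TEMP_MAX_C = 85.0
BOARD_TEMP_MAX_C = 70.0
FAN_MIN_RPM = 1000.0


@dataclass
class NodeSample:
    node: str
    last_seen: float      # time of the most recent agent report, in seconds
    cpu_temp_c: float
    board_temp_c: float
    fan_rpm: float


def analyze(samples: List[NodeSample], now: float) -> Dict[str, List[str]]:
    """Map each node name to its alert messages (an empty list means healthy)."""
    alerts: Dict[str, List[str]] = {}
    for s in samples:
        msgs: List[str] = []
        if now - s.last_seen > HEARTBEAT_TIMEOUT_S:
            # D1: treat the node as down; its sensor data is stale, so skip D2.
            msgs.append("node down: no report received within timeout")
        else:
            # D2: compare sensor readings against the assumed limits.
            if s.cpu_temp_c > CPU_TEMP_MAX_C:
                msgs.append("warning: processor temperature high")
            if s.board_temp_c > BOARD_TEMP_MAX_C:
                msgs.append("warning: mainboard temperature high")
            if s.fan_rpm < FAN_MIN_RPM:
                msgs.append("warning: fan speed low")
        alerts[s.node] = msgs
    return alerts
```

A fixed threshold is the simplest possible form of the prediction described in D2; a real deployment might instead look at trends over time, which the patent leaves open.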
The resource monitoring system of the invention adopts a centralized structure. A monitoring agent is deployed on each node of the cluster; the agent is responsible for obtaining the resource information of its node and for responding to monitoring commands from the monitoring system. The management node (monitoring node) is responsible for gathering the node resource information obtained by each monitoring agent, such as processor utilization and memory usage. In addition, data on fan speed, processor temperature, and mainboard temperature are collected in order to predict problems a node may develop.
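The patent does not say how a monitoring agent obtains processor utilization or memory usage. The sketch below assumes the nodes run Linux and that the agent reads the standard /proc/stat and /proc/meminfo files (with the MemAvailable field, present in newer kernels). The function names are hypothetical, and the functions take the file contents as text so they can be tried without a Loongson machine.

```python
from typing import Dict, Tuple


def parse_cpu_line(stat_text: str) -> Tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Count idle and iowait as idle time.
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            # Sum user..steal only; guest time is already included in user/nice.
            return idle, sum(values[:8])
    raise ValueError("no aggregate cpu line found")


def cpu_utilization(prev: Tuple[int, int], curr: Tuple[int, int]) -> float:
    """Fraction of non-idle time between two (idle, total) samples."""
    idle_delta = curr[0] - prev[0]
    total_delta = curr[1] - prev[1]
    if total_delta <= 0:
        return 0.0
    return 1.0 - idle_delta / total_delta


def memory_utilization(meminfo_text: str) -> float:
    """Fraction of memory in use, based on MemTotal and MemAvailable."""
    info: Dict[str, int] = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return 1.0 - info["MemAvailable"] / info["MemTotal"]
```

On a live node the agent would call parse_cpu_line on the contents of /proc/stat twice, a few seconds apart, and pass both results to cpu_utilization. Fan speed and temperatures would come from a hardware sensor interface, which is platform-specific and not sketched here.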
The beneficial effects of the technical solution provided by the present invention are:
The cluster resource monitoring implementation method based on the Loongson platform of the present invention implements cluster system resource monitoring on the Loongson platform. It mainly monitors whether nodes are alive and the resource usage of each node in the cluster, such as each node's processor utilization, network traffic, and disk utilization. It also monitors each node's fan speed, processor temperature, and mainboard temperature in order to predict faults that may occur in the system.
Brief description of the drawings
To illustrate the technical solution of the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is the system network architecture diagram of the cluster resource monitoring implementation method based on the Loongson platform of the present invention;
Fig. 2 is the method flowchart of the cluster resource monitoring implementation method based on the Loongson platform of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in further detail below with reference to the drawings.
Embodiment one
This embodiment designs a centralized system structure for resource monitoring. A monitoring agent is deployed on each node of the cluster; the agent is responsible for obtaining the resource information of its node and for responding to monitoring commands from the monitoring system. The management node (monitoring node) is responsible for gathering the node resource information obtained by each monitoring agent, such as processor utilization and memory usage. In addition, data on fan speed, processor temperature, and mainboard temperature are collected in order to predict problems a node may develop. The system network architecture of the cluster is shown in Fig. 1.
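The centralized structure described above can be sketched as a report format plus a collector on the management node. The patent does not define a wire format or transport, so the JSON layout, the field names (node, ts, metrics), and the Manager class below are assumptions chosen for illustration; the network transfer itself is left out.

```python
import json
from typing import Dict


def encode_report(node: str, timestamp: float, metrics: Dict[str, float]) -> bytes:
    """Agent side: serialize one report as a single JSON line."""
    payload = {"node": node, "ts": timestamp, "metrics": metrics}
    return (json.dumps(payload, sort_keys=True) + "\n").encode("utf-8")


class Manager:
    """Management-node side: keep the most recent report from each node."""

    def __init__(self) -> None:
        self.latest: Dict[str, dict] = {}

    def ingest(self, raw: bytes) -> None:
        report = json.loads(raw.decode("utf-8"))
        node = report["node"]
        previous = self.latest.get(node)
        # Ignore reports that arrive out of order.
        if previous is None or report["ts"] >= previous["ts"]:
            self.latest[node] = report

    def cluster_average(self, metric: str) -> float:
        """Average a metric over the nodes whose latest report includes it."""
        values = [
            r["metrics"][metric]
            for r in self.latest.values()
            if metric in r["metrics"]
        ]
        if not values:
            raise KeyError(metric)
        return sum(values) / len(values)
```

Keeping only the latest report per node is enough for a live overview; the history needed for charts would be written to the database mentioned in the implementation steps.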
As shown in Fig. 2, the specific implementation steps of the system of this embodiment are as follows:
(1) Deploy a monitoring agent on each node of the Loongson-based cluster to monitor whether the node is alive and to collect node information; the management node is responsible for gathering the information collected by each monitoring agent;
(2) Deploy a database and plotting software on the monitoring node;
(3) Store the resource information and other data collected from each node in the database;
(4) Data analysis:
1. Analyze the node liveness data; if a node has gone down, remind the user to repair the faulty node;
2. Analyze the node's processor temperature, fan speed, and mainboard temperature data to predict whether a fault will occur;
3. Display each node's resource usage (CPU utilization, memory utilization, etc.) intuitively in graphical form, making it easy for the user to analyze and make use of cluster resource usage.
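Steps (2) and (3) can be sketched with an embedded database. The patent does not name a database product or a schema, so Python's built-in sqlite3 module and the single samples table below are stand-ins chosen for this example, and the function names are hypothetical.

```python
import sqlite3
from typing import List, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the sample store and create the table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " node TEXT NOT NULL, ts REAL NOT NULL,"
        " metric TEXT NOT NULL, value REAL NOT NULL)"
    )
    return conn


def store_sample(conn: sqlite3.Connection, node: str, ts: float,
                 metric: str, value: float) -> None:
    """Step (3): persist one measurement reported by a node."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO samples (node, ts, metric, value) VALUES (?, ?, ?, ?)",
            (node, ts, metric, value),
        )


def series(conn: sqlite3.Connection, node: str, metric: str) -> List[Tuple[float, float]]:
    """Return (timestamp, value) pairs in time order, ready to hand to a plotting tool."""
    rows = conn.execute(
        "SELECT ts, value FROM samples WHERE node = ? AND metric = ? ORDER BY ts",
        (node, metric),
    )
    return [(ts, value) for ts, value in rows]
```

The list returned by series is the kind of input a line chart of CPU or memory utilization would need for the graphical display in step (4); which plotting software draws it is left open by the patent.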
The foregoing is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (3)

1. A cluster resource monitoring implementation method based on the Loongson platform, comprising the following steps:
A. Deploy a monitoring agent on each node of the Loongson-based cluster to monitor whether the node is alive and to collect node information; the management node is responsible for gathering the information collected by each monitoring agent;
B. Deploy a database and plotting software on the monitoring node;
C. Store the information collected from each node in the database;
D. Analyze the collected information and present it to the user.
2. The cluster resource monitoring implementation method based on the Loongson platform according to claim 1, characterized in that step D specifically comprises:
D1. Analyze the node liveness data; if a node has failed and shut down, prompt the user to deal with the faulty node;
D2. Analyze each node's processor temperature, mainboard temperature, and fan speed; based on the results, predict problems the system may develop and warn the user to take precautions in time;
D3. Visualize the resource information, presenting resource usage to the user intuitively as line charts, pie charts, or bar charts.
3. The cluster resource monitoring implementation method based on the Loongson platform according to claim 1, characterized in that the monitoring agent is responsible for obtaining the resource information of its node and for responding to monitoring commands from the monitoring system.
CN201610117765.2A 2016-03-03 2016-03-03 Loongson-chip-based cluster resource monitoring realization method Pending CN105681103A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610117765.2A CN105681103A (en) 2016-03-03 2016-03-03 Loongson-chip-based cluster resource monitoring realization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610117765.2A CN105681103A (en) 2016-03-03 2016-03-03 Loongson-chip-based cluster resource monitoring realization method

Publications (1)

Publication Number Publication Date
CN105681103A true CN105681103A (en) 2016-06-15

Family

ID=56306427

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610117765.2A Pending CN105681103A (en) 2016-03-03 2016-03-03 Loongson-chip-based cluster resource monitoring realization method

Country Status (1)

Country Link
CN (1) CN105681103A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108989080A (en) * 2018-05-29 2018-12-11 华为技术有限公司 The method and apparatus of management node

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104468810A (en) * 2014-12-18 2015-03-25 山东超越数控电子有限公司 Method for monitoring high-performance computing resource based on loongson platform
CN105024880A (en) * 2015-07-17 2015-11-04 哈尔滨工程大学 Elastic monitoring method for key task computer cluster

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20160615