CN110659182A - High-performance computer monitoring method and system - Google Patents

High-performance computer monitoring method and system

Publication number
CN110659182A
Authority
CN
China
Prior art keywords
data
monitoring
time
real
performance computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910862948.0A
Other languages
Chinese (zh)
Inventor
张春林
黄益明
建澜涛
张祯
吴智
韩小虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Jiangnan Computing Technology Institute
Original Assignee
Wuxi Jiangnan Computing Technology Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Jiangnan Computing Technology Institute
Priority to CN201910862948.0A
Publication of CN110659182A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G06F11/3093 Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/219 Managing data history or versioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data

Abstract

A high-performance computer monitoring method and system, belonging to the technical field of high-performance computer system monitoring. The method comprises the following steps: collecting general monitoring data by polling, and collecting key monitoring data by interrupt; storing the collected general monitoring data and key monitoring data by category, as real-time data and historical data; and caching the corresponding data according to a monitoring request and pushing the data in real time. The system comprises: a polling data collector, an interrupt data collector, an agent module, an in-memory database, a time-series database, a web back end, message middleware, a web front end and a server. The invention can effectively improve the real-time performance of the monitoring system's data and its query efficiency.

Description

High-performance computer monitoring method and system
Technical Field
The invention relates to the technical field of high-performance computer system monitoring, in particular to a high-performance computer monitoring method and system.
Background
As the computing performance of high-performance computers keeps improving, the number of node resources in the host system keeps growing, and collecting and monitoring the operating state of the supporting power supply and cooling systems, together with the environmental information of the machine room, becomes increasingly complex. Monitoring such a system efficiently is therefore a great challenge.
For a system with millions of monitoring targets and millions of monitoring metrics, traditional monitoring system designs struggle to provide an efficient, low-latency solution.
Disclosure of Invention
The present invention aims to solve the problems of the prior art by providing a high-performance computer monitoring method and system that effectively improve the real-time performance of the monitoring system's data and the efficiency of its queries.
The object of the invention is achieved by the following technical solution:
A high-performance computer monitoring method, comprising:
collecting general monitoring data by polling, and collecting key monitoring data by interrupt;
storing the collected general monitoring data and key monitoring data by category, as real-time data and historical data;
and caching the corresponding data according to a monitoring request, and pushing the data in real time.
Unlike the prior art, in which all monitoring data is collected by polling, the invention weights monitoring data differently: general monitoring data is collected by polling, and key monitoring data is collected by interrupt. On the one hand, this reduces the polling load and improves, to some extent, the efficiency of monitoring general monitoring data; on the other hand, it greatly reduces the monitoring latency for key monitoring data and improves the timeliness and efficiency of monitoring it. In addition, the collected data is stored by category, as real-time data and historical data, which effectively improves the efficiency of querying historical data.
Preferably, interrupt collection specifically means collecting the corresponding key monitoring data when an interrupt signal is received.
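The two collection modes can be illustrated with a minimal Python sketch. It is not the patent's implementation: the names `poll_loop`, `InterruptCollector` and `on_interrupt`, the sensor names, and the tuple layout of a sample are all hypothetical, and the interrupt is simulated by calling the handler directly rather than by a real hardware or operating-system signal.

```python
import queue
import time
from typing import Callable, Dict

Reader = Callable[[], float]  # hypothetical: a function that reads one sensor value


def poll_loop(sensors: Dict[str, Reader], out: queue.Queue,
              interval_s: float, rounds: int) -> None:
    """Polling collection: read every general sensor once per interval."""
    for i in range(rounds):
        for name, read in sensors.items():
            out.put(("general", name, read(), time.time()))
        if i < rounds - 1:
            time.sleep(interval_s)


class InterruptCollector:
    """Interrupt collection: a key metric is read only when its signal arrives."""

    def __init__(self, name: str, read: Reader, out: queue.Queue) -> None:
        self.name = name
        self.read = read
        self.out = out

    def on_interrupt(self) -> None:
        # Handler to be invoked by whatever delivers the interrupt signal.
        self.out.put(("key", self.name, self.read(), time.time()))


samples: queue.Queue = queue.Queue()
poll_loop({"cpu_temp_c": lambda: 61.5, "fan_rpm": lambda: 4200.0},
          samples, interval_s=0.01, rounds=2)
power_fault = InterruptCollector("psu_fault", lambda: 1.0, samples)
power_fault.on_interrupt()  # simulate the interrupt signal arriving
collected = [samples.get_nowait() for _ in range(samples.qsize())]
```

In this sketch a general sample can be up to one polling interval old when it is read, whereas the key sample is produced as soon as the handler runs, which is the latency difference the text describes.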
The invention also provides a high-performance computer monitoring system, comprising:
a polling data collector, used for collecting general monitoring data;
an interrupt data collector, used for collecting key monitoring data;
an agent module, used for receiving the data uploaded by the polling data collector and the interrupt data collector and storing it by category, as real-time data and historical data;
a web back end, used for listening for requests from the web front end, caching the corresponding data and pushing the data to the web front end in real time;
and a web front end, used for sending requests to the web back end and for receiving and displaying the data.
Preferably, the system further comprises:
an in-memory database, used for storing the real-time data;
and a time-series database, used for storing the historical data. Separate databases store the historical data and the real-time data, which effectively improves the efficiency of data queries.
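The split between real-time and historical storage can be sketched as follows. This is a minimal illustration under stated assumptions: a plain dictionary stands in for the in-memory database and a sorted list per metric stands in for the time-series database; the class `Agent` and its methods `store`, `latest` and `query_range` are hypothetical names, not taken from the patent.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (timestamp, value)


class Agent:
    """Stores each sample twice: latest value (real-time) and full series (history)."""

    def __init__(self) -> None:
        self.realtime: Dict[str, Point] = {}                     # in-memory DB stand-in
        self.history: Dict[str, List[Point]] = defaultdict(list)  # time-series DB stand-in

    def store(self, metric: str, ts: float, value: float) -> None:
        latest = self.realtime.get(metric)
        if latest is None or ts >= latest[0]:
            self.realtime[metric] = (ts, value)           # keep only the newest point
        bisect.insort(self.history[metric], (ts, value))  # keep the series time-ordered

    def latest(self, metric: str) -> Point:
        return self.realtime[metric]

    def query_range(self, metric: str, start: float, end: float) -> List[Point]:
        """Return all points with start <= timestamp <= end, using binary search."""
        series = self.history.get(metric, [])
        lo = bisect.bisect_left(series, (start, float("-inf")))
        hi = bisect.bisect_right(series, (end, float("inf")))
        return series[lo:hi]


agent = Agent()
agent.store("node1.cpu_temp_c", 100.0, 60.0)
agent.store("node1.cpu_temp_c", 105.0, 62.0)
agent.store("node1.cpu_temp_c", 110.0, 64.0)
agent.store("node1.cpu_temp_c", 102.0, 61.0)  # a late-arriving sample
newest = agent.latest("node1.cpu_temp_c")
window = agent.query_range("node1.cpu_temp_c", 101.0, 106.0)
```

The real-time lookup is a single key access, while the range query touches only the requested time window; keeping the two access patterns in separate stores is one plausible reading of why the split improves query efficiency.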
Preferably, the system further comprises:
message middleware, used for connecting the web front end and the web back end. Message middleware is particularly suitable for data communication in a distributed system, where it makes communication efficient and reliable.
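The role of the message middleware can be sketched with a tiny in-process publish/subscribe broker. The patent does not name a specific middleware product or protocol, so the `Broker` class, the topic string and the message fields below are assumptions made purely for illustration; a real deployment would use a networked message broker.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class Broker:
    """In-process stand-in for message middleware (topic-based publish/subscribe)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
received: List[dict] = []

# The web front end subscribes to the metrics it displays.
broker.subscribe("metrics/node1", received.append)

# The web back end publishes new data without knowing who is listening.
delivered = broker.publish("metrics/node1", {"metric": "cpu_temp_c", "value": 62.0})
undelivered = broker.publish("metrics/node2", {"metric": "cpu_temp_c", "value": 58.0})
```

The point of the indirection is that the back end and front end only share topic names, so either side can be scaled out or replaced independently.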
The advantages of the invention are as follows: the monitoring latency of monitoring data is greatly reduced, particularly for key monitoring data. General monitoring data is monitored with a latency within 5 s and key monitoring data with a latency within 1 s, which improves the timeliness and efficiency of the monitoring data. Meanwhile, querying given data over a given period within the massive historical data completes within 1 s, which effectively ensures the efficiency of data queries.
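The stated latency figures can be expressed as a simple check. The source gives only the targets (5 s for general data, 1 s for key data) and does not define how the delay is measured; the sketch below assumes it runs from the collection timestamp to the display timestamp, and the names `LATENCY_TARGET_S`, `monitoring_delay` and `meets_target` are hypothetical.

```python
# Latency targets stated in the text, in seconds.
LATENCY_TARGET_S = {"general": 5.0, "key": 1.0}


def monitoring_delay(collected_at: float, displayed_at: float) -> float:
    """Assumed definition: time from collection to display, in seconds."""
    return displayed_at - collected_at


def meets_target(kind: str, collected_at: float, displayed_at: float) -> bool:
    """True if the sample's delay is within the target for its kind."""
    return monitoring_delay(collected_at, displayed_at) <= LATENCY_TARGET_S[kind]


general_ok = meets_target("general", collected_at=100.0, displayed_at=103.2)
key_ok = meets_target("key", collected_at=100.0, displayed_at=101.5)
```

With these example timestamps the general sample (3.2 s) is within its 5 s target, while the key sample (1.5 s) would miss its 1 s target.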
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a logic block diagram of the system of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
A high-performance computer monitoring method, comprising:
collecting general monitoring data by polling, and collecting key monitoring data by interrupt;
storing the collected general monitoring data and key monitoring data by category, as real-time data and historical data;
and caching the corresponding data according to a monitoring request, and pushing the data in real time.
Interrupt collection specifically means collecting the corresponding key monitoring data when an interrupt signal is received.
Unlike the prior art, in which all monitoring data is collected by polling, the method weights monitoring data differently: general monitoring data is collected by polling, and key monitoring data is collected by interrupt. On the one hand, this reduces the polling load and improves, to some extent, the efficiency of monitoring general monitoring data; on the other hand, interrupt collection greatly reduces the monitoring latency for key monitoring data and improves the timeliness and efficiency of monitoring it. In addition, the collected data is stored by category, as real-time data and historical data, which effectively improves the efficiency of querying historical data.
In addition, the present invention also provides a high-performance computer monitoring system, comprising:
a polling data collector, used for collecting general monitoring data; a plurality of polling data collectors are provided, distributed across the distributed system;
an interrupt data collector, used for collecting key monitoring data; a plurality of interrupt data collectors are provided, distributed across the distributed system;
an agent module, used for receiving the data uploaded by the polling data collector and the interrupt data collector and storing it by category, as real-time data and historical data;
a web back end, used for listening for requests from the web front end, caching the corresponding data and pushing the data to the web front end in real time;
a web front end, used for sending requests to the web back end and for receiving and displaying the data;
an in-memory database, used for storing the real-time data;
a time-series database, used for storing the historical data. Separate databases store the historical data and the real-time data, which effectively improves the efficiency of data queries;
and message middleware, used for connecting the web front end and the web back end. Message middleware is particularly suitable for data communication in a distributed system, where it makes communication efficient and reliable.
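The web back end's behaviour described above (listen for requests, cache the corresponding data, push it in real time) can be sketched as follows. This is one possible reading, not the patent's implementation: it assumes that only metrics requested by some front end are cached, and the names `WebBackend`, `handle_request` and `on_new_data`, as well as the `push` callback that stands in for the real transport, are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

Push = Callable[[str, str, float], None]  # (client_id, metric, value)


class WebBackend:
    """Caches only requested metrics and pushes updates to the requesting clients."""

    def __init__(self, push: Push) -> None:
        self.push = push
        self.cache: Dict[str, float] = {}
        self.subscriptions: Dict[str, Set[str]] = {}  # metric -> client ids

    def handle_request(self, client_id: str, metric: str) -> None:
        """Register a front-end request and answer from the cache if possible."""
        self.subscriptions.setdefault(metric, set()).add(client_id)
        if metric in self.cache:
            self.push(client_id, metric, self.cache[metric])

    def on_new_data(self, metric: str, value: float) -> None:
        """Called when new real-time data arrives from the storage layer."""
        clients = self.subscriptions.get(metric)
        if not clients:
            return  # nobody requested this metric: neither cache nor push it
        self.cache[metric] = value
        for client_id in sorted(clients):
            self.push(client_id, metric, value)


pushed: List[Tuple[str, str, float]] = []
backend = WebBackend(lambda client, metric, value: pushed.append((client, metric, value)))

backend.on_new_data("node1.cpu_temp_c", 60.0)         # ignored, no request yet
backend.handle_request("browser-1", "node1.cpu_temp_c")
backend.on_new_data("node1.cpu_temp_c", 62.0)         # cached and pushed to browser-1
backend.handle_request("browser-2", "node1.cpu_temp_c")  # served from the cache
```

Caching per request keeps the back end's working set proportional to what users are actually viewing rather than to the total number of monitored metrics.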
The above description is only a preferred embodiment of the present invention, and the present invention is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope of the present invention shall fall within its protection scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

1. A high-performance computer monitoring method, comprising:
collecting general monitoring data by polling, and collecting key monitoring data by interrupt;
storing the collected general monitoring data and key monitoring data by category, as real-time data and historical data;
and caching the corresponding data according to a monitoring request, and pushing the data in real time.
2. The method according to claim 1, wherein interrupt collection means collecting the corresponding key monitoring data when an interrupt signal is received.
3. A high-performance computer monitoring system, comprising:
a polling data collector, used for collecting general monitoring data;
an interrupt data collector, used for collecting key monitoring data;
an agent module, used for receiving the data uploaded by the polling data collector and the interrupt data collector and storing it by category, as real-time data and historical data;
a web back end, used for listening for requests from the web front end, caching the corresponding data and pushing the data to the web front end in real time;
and a web front end, used for sending requests to the web back end and for receiving and displaying the data.
4. The high-performance computer monitoring system of claim 3, further comprising:
an in-memory database, used for storing the real-time data;
and a time-series database, used for storing the historical data.
5. The high-performance computer monitoring system of claim 3, further comprising:
message middleware, used for connecting the web front end and the web back end.
CN201910862948.0A 2019-09-12 2019-09-12 High-performance computer monitoring method and system Pending CN110659182A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910862948.0A CN110659182A (en) 2019-09-12 2019-09-12 High-performance computer monitoring method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910862948.0A CN110659182A (en) 2019-09-12 2019-09-12 High-performance computer monitoring method and system

Publications (1)

Publication Number Publication Date
CN110659182A (en) 2020-01-07

Family

ID=69037059

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910862948.0A Pending CN110659182A (en) 2019-09-12 2019-09-12 High-performance computer monitoring method and system

Country Status (1)

Country Link
CN (1) CN110659182A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112506735A (en) * 2020-11-26 2021-03-16 中移(杭州)信息技术有限公司 Service quality monitoring method, system, server and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104780220A (en) * 2015-04-28 2015-07-15 大连商品交易所 Intelligent monitoring system and method for large distributed system oriented to security futures industry
US20150358217A1 (en) * 2013-01-09 2015-12-10 Beijing Qihoo Technology Company Limited Web Polling Method, Device and System
CN107943668A (en) * 2017-12-15 2018-04-20 江苏神威云数据科技有限公司 Computer server cluster daily record monitoring method and monitor supervision platform

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150358217A1 (en) * 2013-01-09 2015-12-10 Beijing Qihoo Technology Company Limited Web Polling Method, Device and System
CN104780220A (en) * 2015-04-28 2015-07-15 大连商品交易所 Intelligent monitoring system and method for large distributed system oriented to security futures industry
CN107943668A (en) * 2017-12-15 2018-04-20 江苏神威云数据科技有限公司 Computer server cluster daily record monitoring method and monitor supervision platform

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
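Wang Hongqi et al.: "Research on High-Speed Network Data Acquisition Methods", Journal of Southwest Minzu University (Natural Science Edition) *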

Similar Documents

Publication Publication Date Title
CN103914485B (en) System and method for remotely collecting, retrieving and displaying application system logs
CN102479207B (en) Information search method, system and device
WO2021091489A1 (en) Method and apparatus for storing time series data, and server and storage medium thereof
CN103064731A (en) Device and method for improving message queue system performance
CN104506373A (en) Device and method for collecting and processing network information
CN101989283A (en) Monitoring method and device of performance of database
CN102254024A (en) Mass data processing system and method
CN108156225B (en) Micro-application monitoring system and method based on container cloud platform
CN103544261A (en) Method and device for managing global indexes of mass structured log data
CN102158550B (en) IEC61850-based power quality transient data transmission method
CN102722971A (en) Intelligent data acquisition device with solidified protocol
CN103117878A (en) Design method of Nagios-based distribution monitoring system
CN103678522A (en) Method for acquiring and converting data of metering system of intelligent transformer substation
CN110659182A (en) High-performance computer monitoring method and system
CN102122430B (en) Device and method for collecting agricultural product information
CN204790999U (en) Big data acquisition of industry and processing system
CN202431441U (en) State monitoring system of wind turbine
CN109526045B (en) Boiler information tracing method and device for intelligent workshop
CN103218396A (en) Dispatching and operating visualized analysis methodby generating static web pages according to visit frequency characteristic
CN204795120U (en) Split type extensible network message storage device
CN112241429A (en) Equipment thing allies oneself with system based on big data
CN207133857U (en) Intelligent inventory management system
CN202126601U (en) Remote data collection control device
CN203626978U (en) Diesel generator controller provided with intelligent system
CN114443410A (en) Service log processing method and system and Internet of things system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination