CN115934464A - Information platform monitoring and collecting system - Google Patents

Publication number
CN115934464A
Authority
CN
China
Prior art keywords
data
alarm
prometheus
monitoring
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211592846.XA
Other languages
Chinese (zh)
Inventor
于德江
左鹏
王禹博
徐士强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Cloud Information Technology Co Ltd
Priority to CN202211592846.XA
Publication of CN115934464A

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an information platform monitoring and collecting system in the technical field of container performance collection and monitoring. The technical problem to be solved is how to manage the containers of a K8S cluster at a fine-grained level so that problems can be found easily and handled promptly. The technical scheme is as follows: the system comprises a data collection and extraction unit and a monitoring alarm unit, wherein the data collection and extraction unit comprises a data collection layer and a data extraction layer, and the monitoring alarm unit comprises a data display layer, an alarm rule configuration layer, an alarm generation layer and an alarm display layer; the data collection layer is used for collecting host data, system data and container data, standardizing the collected data and storing it; and the data extraction layer is used for normalizing and filtering the data collected by the data collection layer through the alarm rule language in a yaml file written at deployment time, and extracting the required data to the monitoring alarm unit.

Description

Information platform monitoring and collecting system
Technical Field
The invention relates to the technical field of container performance collection and monitoring, and in particular to an information platform monitoring and collecting system.
Background
Kubernetes, K8S for short, can be used to manage containerized applications across multiple hosts in a cloud platform. Applications are deployed by deploying containers. Containers are isolated from one another: each container has its own file system, processes in different containers cannot affect each other, and computing resources can be partitioned between them. Compared with a virtual machine, a container can be deployed quickly, and because it is decoupled from the underlying infrastructure and the host file system, it can be migrated between different clouds and different operating system versions.
Monitoring is an essential part of K8S cluster operation and maintenance. Collecting the operating data of a cluster promptly and comprehensively is the basis for observing the cluster's running state, understanding its running trends, and sending alarm notifications according to defined rules. However, for a cluster with a large number of containers, existing monitoring methods tend to put excessive pressure on the gateway and lose monitoring data.
Therefore, how to manage the containers of a K8S cluster at a fine-grained level, so that problems can be found easily and handled promptly, is a technical problem that urgently needs to be solved.
Disclosure of Invention
The technical task of the invention is to provide an information platform monitoring and collecting system that solves the problem of how to manage the containers of a K8S cluster at a fine-grained level, so that problems can be found easily and handled promptly.
The technical task of the invention is achieved as follows: an information platform monitoring and collecting system comprises a data collection and extraction unit and a monitoring alarm unit, wherein the data collection and extraction unit comprises a data collection layer and a data extraction layer, and the monitoring alarm unit comprises a data display layer, an alarm rule configuration layer, an alarm generation layer and an alarm display layer;
the data collection layer is used for collecting host data, system data and container data, standardizing the collected data and storing it;
the data extraction layer is used for normalizing and filtering the data collected by the data collection layer through the alarm rule language in a yaml file written at deployment time, and extracting the required data to the monitoring alarm unit; Prometheus stores the data collected through the exporters in its built-in time series database, in a uniform format, for Grafana to query;
the data display layer uses a web interface to display the data collected by the data collection layer in a unified way; the display modes include line charts, bar charts and pie charts, and presenting the data graphically helps operation and maintenance personnel understand the running state and running trend of a host or network over a period of time, and serves as a basis for troubleshooting and solving problems;
the alarm rule configuration layer is used for configuring the built-in alarm rules of all configured resources in Prometheus's configuration file prometheus.yml, and for pushing alarm information;
the alarm event generation layer is used for recording alarm events in real time and notifying users;
the user display layer is a web display interface used for displaying monitoring statistics and alarm fault results in a unified way.
Preferably, the data collection layer collects data as follows:
(1) a Kubernetes cluster is built according to actual business needs and resource conditions, and the cluster is taken as the monitoring target;
(2) a collection component (an exporter, cAdvisor or Telegraf) is installed in the cluster to collect cluster performance data, which includes CPU, memory, disk and network resource data;
(3) monitoring metrics of different dimensions are collected by the exporters and exposed in a data format supported by Prometheus; Prometheus pulls the data periodically and Grafana displays it;
(4) performance metrics for containers and Pods are collected by cAdvisor, and Prometheus scrapes them through the exposed metrics interface;
(5) performance metrics for hosts are collected by prometheus-node-exporter, and Prometheus scrapes them through the exposed metrics interface.
Preferably, Prometheus is built and installed as follows:
(1) the Prometheus image is packaged and pushed to the cluster image registry for the subsequent installation of Prometheus;
(2) a namespace named monitoring is created in the Kubernetes cluster to hold the containers in which Prometheus runs;
(3) read permission on the cluster is granted to the monitoring namespace, so that Prometheus can obtain resource information about the cluster through the Kubernetes API;
(4) a ConfigMap is created in the monitoring namespace to store the configuration of the Prometheus container and the configuration for dynamically discovering the pods and running services in the Kubernetes cluster;
(5) Prometheus is created as a Deployment and installed through a yaml file;
(6) to connect to Prometheus, the internal port of Prometheus is mapped to an external port through a yaml file; the Kubernetes cluster then connects to Prometheus automatically, which indicates that Prometheus has been deployed successfully.
More preferably, Prometheus works as follows:
(1) the Prometheus server periodically pulls metrics from the configured exporters;
(2) the Prometheus server stores the collected metrics locally, runs the defined alert.rules, and records new time series or pushes alarms to Grafana;
(3) Grafana processes the received alarms according to its configuration file and sends out alarm notifications;
(4) the collected data is visualized in a graphical interface.
Preferably, the data display layer uses the Grafana tool, which is deployed as follows:
(1) the Grafana image is packaged and pushed to the cluster image registry for the subsequent installation of Grafana;
(2) Grafana is installed through a yaml file;
(3) to connect to Grafana, the internal port of Grafana is mapped to an external port through a yaml file, and the Kubernetes cluster connects to Grafana automatically;
(4) an administrator account is used to log in to Grafana and configure Prometheus as a data source;
(5) JSON files for the required chart types are edited and imported into Grafana, and the chart styles are applied to display charts for each data type;
(6) connecting to Grafana shows the monitoring data in the default views, which indicates that Grafana has been deployed successfully.
Preferably, the alarm rule configuration layer comprises an alarm rule configuration module, a receiving module, a sending module and a message notification module;
the alarm rule configuration module is used for configuring the built-in alarm rules of all configured resources in Prometheus's configuration file prometheus.yml;
the receiving module is used for receiving the alarm information sent by the data collection and extraction unit when real-time metric data scraped from a container on the tenant-side cluster of the data collection and extraction unit triggers an alarm rule, and for pushing the alarm information to the alarm management component Alertmanager;
the sending module is used for sending the alarm information held in the alarm management component Alertmanager to the message notification module;
the message notification module is used for sending the alarm information to the corresponding subscription terminals according to the preset account and preset topic of the message sending channel and the subscription terminals of that topic.
Preferably, after the alarm rule configuration module has loaded its configuration, it obtains the addresses of the data collection and extraction units and the metric scraping rules through the K8S dynamic discovery mechanism and periodically scrapes the instantaneous metrics of each data collection and extraction unit; Prometheus periodically evaluates, according to the alarm rules, whether an alarm rule expression has reached its metric threshold:
when the alarm rule expression is satisfied, Prometheus pushes alarm information to Alertmanager;
the alarm information includes the UUID of the container, the name of the container, the node on which the container runs, the configured threshold of the monitored metric and the current instantaneous value of the monitored metric.
Preferably, the message sending channels used by the message notification module include email, SMS, DingTalk and WeChat.
The information platform monitoring and collecting system has the following advantages:
first, the invention monitors K8S cluster resources and raises alarms on them: it can monitor the CPU and memory of the containers on the cluster servers, can continue to monitor the resources of a container group without interruption after the group is rescheduled, can monitor a set of application services running with different numbers of replicas, and can obtain both raw and aggregated monitoring data for multiple container groups; the monitored data is then sent to users in real time as alarms and displayed in different ways; the containers of a K8S cluster are thus managed at a fine-grained level, and problems can be found easily and handled promptly;
second, the monitoring collection component exporter is used to collect data on K8S container resources, and read permission on the cluster is granted, so that resource information about the cluster can be obtained through the Kubernetes API;
third, the invention achieves fine-grained management of the containers of a K8S cluster, makes problems easy to find and handle promptly, helps in understanding the system behavior of containers, and enables monitoring of resource usage.
Drawings
The invention is further described below with reference to the accompanying drawings.
Fig. 1 is a schematic structural diagram of the information platform monitoring and collecting system.
Detailed Description
An information platform monitoring and collecting system according to the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
Example:
as shown in Fig. 1, this embodiment provides an information platform monitoring and collecting system, which comprises a data collection and extraction unit and a monitoring alarm unit, wherein the data collection and extraction unit comprises a data collection layer and a data extraction layer, and the monitoring alarm unit comprises a data display layer, an alarm rule configuration layer, an alarm generation layer and an alarm display layer;
the data collection layer is used for collecting host data, system data and container data, standardizing the collected data and storing it;
the data extraction layer is used for normalizing and filtering the data collected by the data collection layer through the alarm rule language in a yaml file written at deployment time, and extracting the required data to the monitoring alarm unit; Prometheus stores the data collected through the exporters in its built-in time series database for Grafana to query;
the data display layer uses a web interface to display the data collected by the data collection layer in a unified way; the display modes include line charts, bar charts and pie charts, and presenting the data graphically helps operation and maintenance personnel understand the running state and running trend of a host or network over a period of time, and serves as a basis for troubleshooting and solving problems;
the alarm rule configuration layer is used for configuring the built-in alarm rules of all configured resources in Prometheus's configuration file prometheus.yml, and for pushing alarm information;
the alarm event generation layer is used for recording alarm events in real time and notifying users;
the user display layer is a web display interface used for displaying monitoring statistics and alarm fault results in a unified way.
The monitoring is implemented by bringing the hardware resources, software resources, system information and so on involved in the platform and the business systems into a unified operation and maintenance monitoring platform. By eliminating the differences between management software and between data collection methods, the various data sources are managed, standardized, processed and displayed in a unified way, ultimately achieving standardized, automated and intelligent large-scale operation and maintenance management. Operation monitoring and fault alarming are the two main functional modules of the monitoring system.
The data collection layer in this embodiment collects data as follows:
(1) a Kubernetes cluster is built according to actual business needs and resource conditions, and the cluster is taken as the monitoring target;
(2) a collection component (an exporter, cAdvisor or Telegraf) is installed in the cluster to collect cluster performance data, which includes CPU, memory, disk and network resource data;
(3) monitoring metrics of different dimensions are collected by the exporters and exposed in a data format supported by Prometheus; Prometheus pulls the data periodically and Grafana displays it;
(4) performance metrics for containers and Pods are collected by cAdvisor, and Prometheus scrapes them through the exposed metrics interface;
(5) performance metrics for hosts are collected by prometheus-node-exporter, and Prometheus scrapes them through the exposed metrics interface.
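The "data format supported by Prometheus" in steps (3) to (5) is a line-oriented text format served by each exporter's metrics interface. The following is a minimal Python sketch of how such a payload could be read into samples, assuming a simplified subset of the format (no escaped quotes or braces inside label values). The function name `parse_metrics` and the example payload are illustrative and do not come from the patent.

```python
import re

# One sample line in the Prometheus text exposition format looks like:
#   metric_name{label="value",...} 12.5
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Hypothetical payload, shaped like what cAdvisor and node-exporter expose.
EXAMPLE = """\
# HELP container_memory_usage_bytes Current memory usage in bytes.
# TYPE container_memory_usage_bytes gauge
container_memory_usage_bytes{pod="web-0",container="nginx"} 52428800
node_load1 0.75
"""

if __name__ == "__main__":
    for name, labels, value in parse_metrics(EXAMPLE):
        print(name, labels, value)
```

In a real deployment Prometheus performs this parsing itself; the sketch only illustrates what the collection components hand over at each scrape.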
In this embodiment, Prometheus is built and installed as follows:
(1) the Prometheus image is packaged and pushed to the cluster image registry for the subsequent installation of Prometheus;
(2) a namespace named monitoring is created in the Kubernetes cluster to hold the containers in which Prometheus runs;
(3) read permission on the cluster is granted to the monitoring namespace, so that Prometheus can obtain resource information about the cluster through the Kubernetes API;
(4) a ConfigMap is created in the monitoring namespace to store the configuration of the Prometheus container and the configuration for dynamically discovering the pods and running services in the Kubernetes cluster;
(5) Prometheus is created as a Deployment and installed through a yaml file;
(6) to connect to Prometheus, the internal port of Prometheus is mapped to an external port through a yaml file; the Kubernetes cluster then connects to Prometheus automatically, which indicates that Prometheus has been deployed successfully.
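The installation steps above map onto a small set of Kubernetes objects. As a hedged illustration, the Python sketch below builds those objects as plain dictionaries and prints them as a JSON list. The image path, object names, NodePort number and scrape interval are assumptions chosen for the example, not values taken from the patent.

```python
import json

# Minimal scrape configuration with pod discovery (assumed content for step 4).
PROMETHEUS_YML = """\
global:
  scrape_interval: 30s
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
"""


def prometheus_manifests(namespace="monitoring",
                         image="registry.example.local/prometheus:latest",
                         node_port=30090):
    """Build Kubernetes objects for installation steps (2) to (6) as dicts."""
    labels = {"app": "prometheus"}
    rbac = "rbac.authorization.k8s.io"
    return [
        # Step 2: namespace that holds the Prometheus containers.
        {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": namespace}},
        # Step 3: read-only access to cluster resources through the Kubernetes API.
        {"apiVersion": "v1", "kind": "ServiceAccount",
         "metadata": {"name": "prometheus", "namespace": namespace}},
        {"apiVersion": rbac + "/v1", "kind": "ClusterRole",
         "metadata": {"name": "prometheus-read"},
         "rules": [{"apiGroups": [""],
                    "resources": ["nodes", "services", "endpoints", "pods"],
                    "verbs": ["get", "list", "watch"]}]},
        {"apiVersion": rbac + "/v1", "kind": "ClusterRoleBinding",
         "metadata": {"name": "prometheus-read"},
         "roleRef": {"apiGroup": rbac, "kind": "ClusterRole", "name": "prometheus-read"},
         "subjects": [{"kind": "ServiceAccount", "name": "prometheus",
                       "namespace": namespace}]},
        # Step 4: ConfigMap with the Prometheus and service discovery configuration.
        {"apiVersion": "v1", "kind": "ConfigMap",
         "metadata": {"name": "prometheus-config", "namespace": namespace},
         "data": {"prometheus.yml": PROMETHEUS_YML}},
        # Step 5: Prometheus itself, created as a Deployment.
        {"apiVersion": "apps/v1", "kind": "Deployment",
         "metadata": {"name": "prometheus", "namespace": namespace},
         "spec": {"replicas": 1, "selector": {"matchLabels": labels},
                  "template": {"metadata": {"labels": labels}, "spec": {
                      "serviceAccountName": "prometheus",
                      "containers": [{
                          "name": "prometheus", "image": image,
                          "args": ["--config.file=/etc/prometheus/prometheus.yml"],
                          "ports": [{"containerPort": 9090}],
                          "volumeMounts": [{"name": "config",
                                            "mountPath": "/etc/prometheus"}]}],
                      "volumes": [{"name": "config",
                                   "configMap": {"name": "prometheus-config"}}]}}}},
        # Step 6: map the internal port 9090 to an external NodePort.
        {"apiVersion": "v1", "kind": "Service",
         "metadata": {"name": "prometheus", "namespace": namespace},
         "spec": {"type": "NodePort", "selector": labels,
                  "ports": [{"port": 9090, "targetPort": 9090, "nodePort": node_port}]}},
    ]


if __name__ == "__main__":
    bundle = {"apiVersion": "v1", "kind": "List", "items": prometheus_manifests()}
    print(json.dumps(bundle, indent=2))
```

The printed JSON could be saved to a file and applied with kubectl, which accepts JSON manifests as well as yaml. Generating the objects in code rather than by hand keeps the namespace and labels consistent across all seven objects.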
In this embodiment, Prometheus works as follows:
(1) the Prometheus server periodically pulls metrics from the configured exporters;
(2) the Prometheus server stores the collected metrics locally, runs the defined alert.rules, and records new time series or pushes alarms to Grafana;
(3) Grafana processes the received alarms according to its configuration file and sends out alarm notifications;
(4) the collected data is visualized in a graphical interface.
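Steps (1) and (2) of this working process, pulling on a schedule, storing locally and recording a new time series from a rule, can be sketched in a few lines of Python. This is a toy model under stated assumptions: `TimeSeriesStore`, `scrape_once` and `record_average` are hypothetical names, the store is an in-memory dictionary rather than Prometheus's actual database, and the exporter is a stand-in function.

```python
from collections import defaultdict


class TimeSeriesStore:
    """Tiny in-memory stand-in for Prometheus's local time series database."""

    def __init__(self):
        self._series = defaultdict(list)  # series name -> [(timestamp, value), ...]

    def append(self, name, timestamp, value):
        self._series[name].append((timestamp, value))

    def window(self, name, start, end):
        """Return the values of a series with start <= timestamp <= end."""
        return [v for t, v in self._series[name] if start <= t <= end]


def scrape_once(store, targets, now):
    """Steps (1) and (2): pull every configured target and store the samples."""
    for fetch in targets:
        for name, value in fetch().items():
            store.append(name, now, value)


def record_average(store, source, recorded, now, window_seconds):
    """Step (2): a recording rule that writes the windowed average as a new series."""
    values = store.window(source, now - window_seconds, now)
    if values:
        store.append(recorded, now, sum(values) / len(values))


if __name__ == "__main__":
    readings = iter([0.2, 0.4, 0.9])
    # Hypothetical exporter: each call returns the next CPU usage reading.
    targets = [lambda: {"cpu_usage": next(readings)}]
    store = TimeSeriesStore()
    for now in (0, 15, 30):  # three scrape cycles, 15 seconds apart
        scrape_once(store, targets, now)
        record_average(store, "cpu_usage", "cpu_usage:avg_30s", now, 30)
    print(store.window("cpu_usage:avg_30s", 0, 30))
```

The pull model shown here is what lets the server decide the collection interval centrally, instead of each exporter pushing on its own schedule.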
The data display layer in this embodiment uses the Grafana tool, which is deployed as follows:
(1) the Grafana image is packaged and pushed to the cluster image registry for the subsequent installation of Grafana;
(2) Grafana is installed through a yaml file;
(3) to connect to Grafana, the internal port of Grafana is mapped to an external port through a yaml file, and Grafana connects to the Kubernetes cluster automatically;
(4) an administrator account is used to log in to Grafana and configure Prometheus as a data source;
(5) JSON files for the required chart types are edited and imported into Grafana, and the chart styles are applied to display charts for each data type;
(6) connecting to Grafana shows the monitoring data in the default views, which indicates that Grafana has been deployed successfully.
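Step (5) relies on dashboard definitions written as JSON. The sketch below shows, in Python, one way such a file could be generated for the three chart styles named in this embodiment. The panel type identifiers (`timeseries`, `barchart`, `piechart`) are those of recent Grafana releases and may differ in older ones. The data source name, panel titles and queries are assumptions for illustration.

```python
import json

# Chart style -> Grafana panel type identifier (assumed: recent Grafana versions).
PANEL_TYPES = {"line": "timeseries", "bar": "barchart", "pie": "piechart"}


def make_panel(panel_id, title, chart, expr, datasource="Prometheus"):
    """Build one dashboard panel that plots a PromQL expression."""
    slot = panel_id - 1  # panels are laid out three per row on the 24-column grid
    return {
        "id": panel_id,
        "title": title,
        "type": PANEL_TYPES[chart],
        "datasource": datasource,
        "gridPos": {"h": 8, "w": 8, "x": 8 * (slot % 3), "y": 8 * (slot // 3)},
        "targets": [{"expr": expr, "refId": "A"}],
    }


def make_dashboard(title, panels):
    """Wrap panels in a minimal dashboard document."""
    return {"title": title, "refresh": "30s", "panels": panels}


if __name__ == "__main__":
    dashboard = make_dashboard("K8S cluster overview", [
        make_panel(1, "CPU by pod", "line",
                   "sum(rate(container_cpu_usage_seconds_total[5m])) by (pod)"),
        make_panel(2, "Memory by namespace", "bar",
                   "sum(container_memory_usage_bytes) by (namespace)"),
        make_panel(3, "Free disk by node", "pie",
                   "sum(node_filesystem_avail_bytes) by (instance)"),
    ])
    print(json.dumps(dashboard, indent=2))
```

The resulting JSON is meant to be pasted into Grafana's dashboard import dialog; fields that Grafana fills in on import, such as the dashboard uid, are deliberately left out.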
The alarm rule configuration layer in this embodiment comprises an alarm rule configuration module, a receiving module, a sending module and a message notification module;
the alarm rule configuration module is used for configuring the built-in alarm rules of all configured resources in Prometheus's configuration file prometheus.yml;
the receiving module is used for receiving the alarm information sent by the data collection and extraction unit when real-time metric data scraped from a container on the tenant-side cluster of the data collection and extraction unit triggers an alarm rule, and for pushing the alarm information to the alarm management component Alertmanager;
the sending module is used for sending the alarm information held in the alarm management component Alertmanager to the message notification module;
the message notification module is used for sending the alarm information to the corresponding subscription terminals according to the preset account and preset topic of the message sending channel and the subscription terminals of that topic.
After the alarm rule configuration module in this embodiment has loaded its configuration, it obtains the addresses of the data collection and extraction units and the metric scraping rules through the K8S dynamic discovery mechanism and periodically scrapes the instantaneous metrics of each data collection and extraction unit; Prometheus periodically evaluates, according to the alarm rules, whether an alarm rule expression has reached its metric threshold:
when the alarm rule expression is satisfied, Prometheus pushes alarm information to Alertmanager;
the alarm information includes the UUID of the container, the name of the container, the node on which the container runs, the configured threshold of the monitored metric and the current instantaneous value of the monitored metric.
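The threshold check and the resulting alarm information can be illustrated with a short Python sketch. It is a simplification under stated assumptions: a rule here is a single comparison against one instantaneous value, whereas real Prometheus alerting rules are PromQL expressions that may also require the condition to hold for a duration. The class names, field names and sample values are hypothetical.

```python
import operator
from dataclasses import dataclass

COMPARATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass
class AlarmRule:
    metric: str        # name of the monitored metric
    comparator: str    # one of the keys of COMPARATORS
    threshold: float   # configured threshold of the monitored metric


@dataclass
class ContainerSample:
    uuid: str          # UUID of the container
    name: str          # name of the container
    node: str          # node on which the container runs
    metric: str
    value: float       # current instantaneous value


def evaluate(rule, sample):
    """Return the alarm information if the sample satisfies the rule, else None."""
    if sample.metric != rule.metric:
        return None
    if not COMPARATORS[rule.comparator](sample.value, rule.threshold):
        return None
    return {
        "container_uuid": sample.uuid,
        "container_name": sample.name,
        "node": sample.node,
        "metric": rule.metric,
        "threshold": rule.threshold,
        "current_value": sample.value,
    }


if __name__ == "__main__":
    rule = AlarmRule(metric="cpu_usage_ratio", comparator=">", threshold=0.8)
    samples = [
        ContainerSample("3f2b6c1e-0000-4000-8000-000000000001", "web", "node-1",
                        "cpu_usage_ratio", 0.93),
        ContainerSample("3f2b6c1e-0000-4000-8000-000000000002", "db", "node-2",
                        "cpu_usage_ratio", 0.41),
    ]
    for sample in samples:
        info = evaluate(rule, sample)
        if info is not None:
            print("push to Alertmanager:", info)
```

Only the first sample exceeds the threshold, so only one alarm is produced; its dictionary carries the five fields listed in this embodiment plus the metric name.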
The message sending channels used by the message notification module in this embodiment include email, SMS, DingTalk and WeChat.
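The topic-based delivery performed by the message notification module can be sketched as a small publish/subscribe router. The Python below is a hypothetical illustration: `MessageNotifier` and its methods are invented for the example, and the channel senders only record messages in a list instead of calling real email, SMS, DingTalk or WeChat services.

```python
from collections import defaultdict


class MessageNotifier:
    """Routes alarm messages to the terminals subscribed to a topic."""

    def __init__(self):
        self._senders = {}                        # channel name -> send function
        self._subscriptions = defaultdict(list)   # topic -> [(channel, terminal), ...]

    def register_channel(self, channel, send):
        """Register a sending function taking (terminal, message)."""
        self._senders[channel] = send

    def subscribe(self, topic, channel, terminal):
        """Subscribe a terminal (address, phone number, chat id) to a topic."""
        if channel not in self._senders:
            raise ValueError(f"unknown channel: {channel}")
        self._subscriptions[topic].append((channel, terminal))

    def notify(self, topic, message):
        """Send the message to every subscriber of the topic; return the count."""
        delivered = 0
        for channel, terminal in self._subscriptions.get(topic, []):
            self._senders[channel](terminal, message)
            delivered += 1
        return delivered


if __name__ == "__main__":
    outbox = []  # stand-in for real delivery: records (channel, terminal, message)
    notifier = MessageNotifier()
    for channel in ("email", "sms", "dingtalk", "wechat"):
        notifier.register_channel(
            channel,
            lambda terminal, message, channel=channel:
                outbox.append((channel, terminal, message)),
        )
    notifier.subscribe("container-cpu", "email", "ops@example.com")
    notifier.subscribe("container-cpu", "dingtalk", "ops-group")
    count = notifier.notify("container-cpu", "cpu_usage_ratio above 0.8 on web")
    print(count, outbox)
```

Keeping the channels behind a common send function means a new channel can be added without touching the subscription or routing logic.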
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or substitutions do not cause the corresponding technical solutions to depart from the spirit of the technical solutions of the embodiments of the present invention.

Claims (8)

1. An information platform monitoring and collecting system, characterized by comprising a data collection and extraction unit and a monitoring alarm unit, wherein the data collection and extraction unit comprises a data collection layer and a data extraction layer, and the monitoring alarm unit comprises a data display layer, an alarm rule configuration layer, an alarm generation layer and an alarm display layer;
the data collection layer is used for collecting host data, system data and container data, standardizing the collected data and storing it;
the data extraction layer is used for normalizing and filtering the data collected by the data collection layer through the alarm rule language in a yaml file written at deployment time, and extracting the required data to the monitoring alarm unit; Prometheus stores the data collected through the exporters in its built-in time series database, in a uniform format, for Grafana to query;
the data display layer uses a web interface to display the data collected by the data collection layer in a unified way, and the display modes include line charts, bar charts and pie charts;
the alarm rule configuration layer is used for configuring the built-in alarm rules of all configured resources in Prometheus's configuration file prometheus.yml, and for pushing alarm information;
the alarm event generation layer is used for recording alarm events in real time and notifying users;
the user display layer is a web display interface used for displaying monitoring statistics and alarm fault results in a unified way.
2. The information platform monitoring and collecting system according to claim 1, wherein the data collection layer collects data as follows:
(1) a Kubernetes cluster is built according to actual business needs and resource conditions, and the cluster is taken as the monitoring target;
(2) a collection component (an exporter, cAdvisor or Telegraf) is installed in the cluster to collect cluster performance data, which includes CPU, memory, disk and network resource data;
(3) monitoring metrics of different dimensions are collected by the exporters and exposed in a data format supported by Prometheus; Prometheus pulls the data periodically and Grafana displays it;
(4) performance metrics for containers and Pods are collected by cAdvisor, and Prometheus scrapes them through the exposed metrics interface;
(5) performance metrics for hosts are collected by prometheus-node-exporter, and Prometheus scrapes them through the exposed metrics interface.
3. The information platform monitoring and collecting system according to claim 2, wherein Prometheus is built and installed as follows:
(1) the Prometheus image is packaged and pushed to the cluster image registry for the subsequent installation of Prometheus;
(2) a namespace named monitoring is created in the Kubernetes cluster to hold the containers in which Prometheus runs;
(3) read permission on the cluster is granted to the monitoring namespace, so that Prometheus can obtain resource information about the cluster through the Kubernetes API;
(4) a ConfigMap is created in the monitoring namespace to store the configuration of the Prometheus container and the configuration for dynamically discovering the pods and running services in the Kubernetes cluster;
(5) Prometheus is created as a Deployment and installed through a yaml file;
(6) to connect to Prometheus, the internal port of Prometheus is mapped to an external port through a yaml file; the Kubernetes cluster then connects to Prometheus automatically, which indicates that Prometheus has been deployed successfully.
4. The information platform monitoring and collecting system according to claim 3, wherein Prometheus works as follows:
(1) the Prometheus server periodically pulls metrics from the configured exporters;
(2) the Prometheus server stores the collected metrics locally, runs the defined alert.rules, and records new time series or pushes alarms to Grafana;
(3) Grafana processes the received alarms according to its configuration file and sends out alarm notifications;
(4) the collected data is visualized in a graphical interface.
5. The information platform monitoring and collecting system according to claim 1, wherein the data display layer uses the Grafana tool, which is deployed as follows:
(1) the Grafana image is packaged and pushed to the cluster image registry for the subsequent installation of Grafana;
(2) Grafana is installed through a yaml file;
(3) to connect to Grafana, the internal port of Grafana is mapped to an external port through a yaml file, and the Kubernetes cluster connects to Grafana automatically;
(4) an administrator account is used to log in to Grafana and configure Prometheus as a data source;
(5) JSON files for the required chart types are edited and imported into Grafana, and the chart styles are applied to display charts for each data type;
(6) connecting to Grafana shows the monitoring data in the default views, which indicates that Grafana has been deployed successfully.
6. The information platform monitoring and collecting system according to claim 1, wherein the alarm rule configuration layer comprises an alarm rule configuration module, a receiving module, a sending module and a message notification module;
the alarm rule configuration module is used for configuring the built-in alarm rules of all configured resources in Prometheus's configuration file prometheus.yml;
the receiving module is used for receiving the alarm information sent by the data collection and extraction unit when real-time metric data scraped from a container on the tenant-side cluster of the data collection and extraction unit triggers an alarm rule, and for pushing the alarm information to the alarm management component Alertmanager;
the sending module is used for sending the alarm information held in the alarm management component Alertmanager to the message notification module;
the message notification module is used for sending the alarm information to the corresponding subscription terminals according to the preset account and preset topic of the message sending channel and the subscription terminals of that topic.
7. The information platform monitoring and collecting system according to claim 6, wherein after the alarm rule configuration module has loaded its configuration, it obtains the addresses of the data collection and extraction units and the metric scraping rules through the K8S dynamic discovery mechanism and periodically scrapes the instantaneous metrics of each data collection and extraction unit, and Prometheus periodically evaluates, according to the alarm rules, whether an alarm rule expression has reached its metric threshold:
when the alarm rule expression is satisfied, Prometheus pushes alarm information to Alertmanager;
the alarm information includes the UUID of the container, the name of the container, the node on which the container runs, the configured threshold of the monitored metric and the current instantaneous value of the monitored metric.
8. The information platform monitoring and collecting system according to claim 6 or 7, wherein the message sending channels used by the message notification module include email, SMS, DingTalk and WeChat.
CN202211592846.XA 2022-12-13 2022-12-13 Information platform monitoring and collecting system Pending CN115934464A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211592846.XA CN115934464A (en) 2022-12-13 2022-12-13 Information platform monitoring and collecting system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211592846.XA CN115934464A (en) 2022-12-13 2022-12-13 Information platform monitoring and collecting system

Publications (1)

Publication Number Publication Date
CN115934464A true CN115934464A (en) 2023-04-07

Family

ID=86650490

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211592846.XA Pending CN115934464A (en) 2022-12-13 2022-12-13 Information platform monitoring and collecting system

Country Status (1)

Country Link
CN (1) CN115934464A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117251353A (en) * 2023-11-20 2023-12-19 青岛民航凯亚系统集成有限公司 Monitoring method, system and platform for civil aviation weak current system

Similar Documents

Publication Publication Date Title
CN109714192B (en) Monitoring method and system for monitoring cloud platform
CN112511339B (en) Container monitoring alarm method, system, equipment and storage medium based on multiple clusters
CN107689953B (en) Multi-tenant cloud computing-oriented container security monitoring method and system
CN106776212B (en) Supervision system and method for container cluster deployment of multi-process application
CN107508722B (en) Service monitoring method and device
CN106487574A (en) Automatic operating safeguards monitoring system
US20140337474A1 (en) System and method for monitoring and managing data center resources in real time incorporating manageability subsystem
CN104699759A (en) Method for maintaining automatic operation of database
CN105610648A (en) Operation and maintenance monitoring data collection method and server
CN108390907B (en) Management monitoring system and method based on Hadoop cluster
CN111488258A (en) System for analyzing and early warning software and hardware running state
CN114328124A (en) Method and device for business monitoring, storage medium and electronic device
CN109905262A (en) A kind of monitoring system and monitoring method of CDN device service
CN115934464A (en) Information platform monitoring and collecting system
CN114356499A (en) Kubernetes cluster alarm root cause analysis method and device
CN114048090A (en) K8S-based container cloud platform monitoring method and device and storage medium
EP1622310B1 (en) Administration method and system for network management systems
CN113570347A (en) RPA operation and maintenance method for micro-service architecture system
CN113037549A (en) Operation and maintenance environment warning method
CN108599978B (en) Cloud monitoring method and device
CN110557283A (en) power distribution communication network management and control method, server, system and readable storage medium
CN109951313A (en) A kind of monitoring device and method of Hadoop cloud platform
CN113765717A (en) Operation and maintenance management system based on secret-related special computing platform
CN115687036A (en) Log collection method and device and log system
CN114338350A (en) Alarm method, alarm device, electronic equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination