CN110543410A - Method for processing cluster index, method and device for inquiring cluster index

Info

Publication number: CN110543410A
Application number: CN201910841010.0A
Authority: CN
Inventors: 王晨露; 吕灼恒; 原帅; 王家尧
Original assignee: Dawning Information Industry Beijing Co Ltd
Current assignee: Dawning Information Industry Beijing Co Ltd
Priority date: 2019-09-05
Filing date: 2019-09-05
Publication date: 2019-12-06

Abstract

The embodiment of the application provides a method for processing cluster indexes, a method for inquiring the cluster indexes and a device thereof, wherein the method for processing the cluster indexes comprises the following steps: acquiring operation index data cached in advance by each node in a plurality of nodes; screening out operation index data triggering alarm from the operation index data of the nodes; and displaying the operation index data triggering the alarm. According to the method and the device, the operation index data are cached in advance through each node in the cluster, and then the operation index data cached in advance are sent to the central node in the cluster by each node, so that the central node can conveniently screen out the operation index data triggering the alarm from the operation index data of the nodes, and the operation index data triggering the alarm are displayed. Therefore, each node caches the operation index data in advance, so that each node can directly upload the cached operation index data to the central node, and the problem that the real-time monitoring result is slow to obtain is solved.

Description

Method for processing cluster index, method and device for inquiring cluster index

Technical Field

The application relates to the field of servers, in particular to a method for processing cluster indexes, a method for inquiring the cluster indexes and a device.

Background

In a large-scale server cluster system, operation and maintenance work is particularly important. For operation and maintenance, monitoring is crucial because it ensures that problems are found and that the relevant personnel are notified promptly, so that the problems can be solved at the earliest opportunity.

The existing cluster service monitoring is realized by adopting a bottom-layer script to collect operation indexes and display the index data to related personnel. But as the cluster size continues to grow, calls between scripts need to be cross-node. Because the cross-node script calling mode needs to occupy a certain bandwidth, the process of acquiring the operation index data is relatively slow, and the problem that the monitoring result is slow to obtain may be caused, so that the user experience is influenced.

Disclosure of Invention

An object of the embodiments of the present application is to provide a method for processing a cluster index, a method for querying a cluster index, and an apparatus, so as to solve the problem in the prior art that a monitoring result is slow to obtain.

In a first aspect, an embodiment of the present application provides a method for processing a cluster indicator, where the method includes: acquiring operation index data cached in advance by each node in a plurality of nodes; screening out operation index data triggering alarm from the operation index data of the nodes; and displaying the operation index data triggering the alarm.

Therefore, in the embodiment of the application, the operation index data is cached in advance by each node in the cluster, and then the operation index data cached in advance is sent to the central node in the cluster by each node, so that the central node can screen out the operation index data triggering the alarm from the operation index data of the plurality of nodes and display the operation index data triggering the alarm. Therefore, each node caches the operation index data in advance, so that each node can directly upload the cached operation index data to the central node, and the problem that the real-time monitoring result is slow to obtain is solved.

In one possible embodiment, the method further comprises: storing the alarm triggering operation index data in a database, wherein all the alarm triggering operation index data in the database are stored according to different time periods.

Therefore, the operation index data are stored according to different time periods through the database, and when a user wants to check the data of a certain time period subsequently, historical operation index data can be quickly inquired in the storage mode, so that the inquiry efficiency is increased, the inquiry speed is increased, and the user experience degree can be further improved.

In one possible embodiment, the screening out the operation index data triggering the alarm from the operation index data of the plurality of nodes includes: at least one group of operation index data is screened out from the operation index data of the nodes, wherein each group of operation index data is data related to one alarm, and each group of operation index data comprises at least two operation index data.

Therefore, according to the embodiment of the application, at least two pieces of operation index data are associated with each other, so that when a user wants to check a certain piece of alarm data, the operation index data related to that alarm data can be displayed quickly.

In one possible embodiment, the displaying of the operation index data triggering the alarm includes: displaying the operation index data triggering the alarm in a preset display mode, wherein the preset display mode comprises one of the following modes: graphics and colors.

Therefore, the operation index data triggering the alarm is displayed in a graphical or color display mode, and compared with the existing table display mode, this display mode enables a user to quickly and accurately know the operation condition of the node.

In a second aspect, an embodiment of the present application provides a method for processing a cluster indicator, where the method includes: caching the collected operation index data; and sending the operation index data cached in advance to the central node so that the central node can screen out the operation index data triggering the alarm from the operation index data and display the operation index data triggering the alarm.

In a third aspect, an embodiment of the present application provides a method for querying a cluster index, where the method includes: acquiring a query instruction, wherein the query instruction comprises a query time interval; and inquiring operation index data corresponding to the inquiry time period from the database according to the inquiry instruction, wherein the operation index data in the database are stored according to different time periods, and the operation index data are screened out from the operation index data cached in advance sent by each node and used for triggering the alarm.

In a fourth aspect, an embodiment of the present application provides an apparatus for processing a cluster indicator, where the apparatus includes: the first acquisition module is used for acquiring operation index data cached in advance by each node in the plurality of nodes; the screening module is used for screening out operation index data triggering alarms from the operation index data of the nodes; and the display module is used for displaying the operation index data of the triggered alarm.

In one possible embodiment, the apparatus comprises: the storage module is used for storing the operation index data for triggering the alarm in a database, wherein all the operation index data for triggering the alarm in the database are stored according to different time periods.

In a possible embodiment, the screening module is further configured to screen at least one set of operation index data from the operation index data of the plurality of nodes, where each set of operation index data is data related to an alarm, and each set of operation index data includes at least two operation index data.

In one possible embodiment, the display module is configured to display the operation index data triggering the alarm in a preset display manner, where the preset display manner includes one of the following manners: graphics and colors.

In a fifth aspect, an embodiment of the present application provides an apparatus for processing a cluster indicator, where the apparatus includes: the cache module is used for caching the collected operation index data; and the sending module is used for sending the operation index data cached in advance to the central node so that the central node can screen out the operation index data triggering the alarm from the operation index data and display the operation index data triggering the alarm.

In a sixth aspect, an embodiment of the present application provides an apparatus for querying a cluster index, where the apparatus includes: the second acquisition module is used for acquiring a query instruction, and the query instruction comprises a query time interval; and the query module is used for querying the operation index data corresponding to the query time period from the database according to the query instruction, wherein the operation index data in the database are stored according to different time periods, and the operation index data are screened out from the operation index data cached in advance sent by each node and used for triggering the alarm.

In a seventh aspect, this application provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the method according to the first aspect or any optional implementation manner of the first aspect.

In an eighth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the method of the second aspect or any optional implementation manner of the second aspect.

In a ninth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, and the computer program is executed by a processor to perform the method according to the third aspect or any optional implementation manner of the third aspect.

In a tenth aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the method of the first aspect or any of the alternative implementations of the first aspect.

In an eleventh aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the method of the second aspect or any of the alternative implementations of the second aspect.

In a twelfth aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is operating, the machine-readable instructions when executed by the processor performing the method of the third aspect or any of the alternative implementations of the third aspect.

In a thirteenth aspect, the present application provides a computer program product, which when run on a computer, causes the computer to execute the method of the first aspect or any possible implementation manner of the first aspect.

In a fourteenth aspect, the present application provides a computer program product, which when run on a computer, causes the computer to execute the method of the second aspect or any possible implementation manner of the second aspect.

In a fifteenth aspect, the present application provides a computer program product, which when run on a computer, causes the computer to execute the method of the third aspect or any possible implementation manner of the third aspect.

In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below. It should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on these drawings without inventive effort.

Fig. 1 is a schematic diagram illustrating one implementation scenario in which examples of the present application may be applied;

Fig. 2 is a flowchart of a method for processing cluster indexes according to an embodiment of the present disclosure;

Fig. 3 is a specific flowchart of a method for processing cluster indexes according to an embodiment of the present disclosure;

Fig. 4 is a flowchart of a method for querying a cluster index according to an embodiment of the present application;

Fig. 5 is a specific flowchart of a method for querying a cluster index according to an embodiment of the present application;

Fig. 6 shows a block diagram of an apparatus for processing cluster metrics according to an embodiment of the present application;

Fig. 7 is a block diagram illustrating a structure of an apparatus for processing cluster metrics according to an embodiment of the present application;

Fig. 8 is a block diagram illustrating a structure of an apparatus for querying a cluster index according to an embodiment of the present application;

Fig. 9 is a block diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.

In a large-scale server cluster monitoring system, operation index data corresponding to each node in a cluster is usually displayed, for example, the operating status of the CPU (Central Processing Unit), the network card rate, the memory utilization rate, etc., together with the system alarm information corresponding to each index.

The server cluster monitoring system can effectively monitor the operation index data of each node in a large-scale server cluster, quickly determine whether the operation index data of each node has been acquired, and determine whether abnormal operation index data has generated an alarm.

With the continuous increase of the scale of the server cluster, a large amount of operation index data is generated, so existing clusters adopt a distributed deployment mode. In the existing distributed deployment mode, the operation index data of each node is monitored through a script (for example, a Linux script).

However, since there are many nodes in the cluster, the call between scripts needs to be made across nodes. For example, a script on one node needs to collect operation index data of other nodes in addition to the operation index data of the node. In the process of cross-node calling of the script, network transmission needs to be consumed, and a certain bandwidth is occupied, so that the process of acquiring the operation index data is relatively slow, and further, the problem that the monitoring result is slow to acquire (or the monitored page is slow to load) may be caused, and the experience of a part of users is influenced.

In addition, the existing server cluster monitoring system uses a traditional relational database to store the operation index data in the form of rows and columns. The continuous increase of the cluster size causes the amount of operation index data to keep growing, and with the storage mode of the relational database, subsequent data queries from the relational database become too slow. That is to say, the conventional server cluster monitoring system also has the problem that querying the historical operation index data of one node or a plurality of nodes is slow.

Based on this, the embodiment of the application provides a method for processing cluster indexes and a method for inquiring the cluster indexes, wherein operation index data are cached in advance by each node in a cluster, and then the operation index data cached in advance are sent to a central node in the cluster by each node, so that the central node can screen out operation index data triggering alarm from the operation index data of a plurality of nodes and display the operation index data triggering alarm. Therefore, each node caches the operation index data in advance, so that each node can directly upload the cached operation index data to the central node, and the problem that the real-time monitoring result is slow to obtain is solved.

In addition, the embodiment of the application stores all the operation index data triggering the alarm in the database according to different time periods, so that the operation index data can be stored in the form of a list along the time dimension. Because the operation index data are stored by time period in advance, when a user wants to check the data of a certain time period, the historical operation index data can be queried quickly, which improves the query efficiency and the query speed and can further improve the user experience.
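As a non-limiting illustration, storage by time period may be sketched as follows. The sketch assumes hourly periods and uses an in-memory dictionary in place of a real database; the names `period_key` and `PeriodStore` are hypothetical and do not appear in the application.

```python
from collections import defaultdict
from datetime import datetime


def period_key(ts: datetime) -> str:
    """Map a timestamp to the name of its storage table (assumed: one per hour)."""
    return ts.strftime("%Y%m%d%H")


class PeriodStore:
    """In-memory stand-in for a database with one storage table per time period."""

    def __init__(self) -> None:
        self.tables = defaultdict(list)

    def save(self, ts: datetime, record: dict) -> None:
        # Each record goes into the table of the period it belongs to.
        self.tables[period_key(ts)].append(record)

    def query(self, ts: datetime) -> list:
        # Only the table of the requested period is read; other tables are not scanned.
        return list(self.tables.get(period_key(ts), []))


store = PeriodStore()
store.save(datetime(2019, 9, 5, 10, 15), {"node": "node11", "cpu": 96})
store.save(datetime(2019, 9, 5, 11, 5), {"node": "node12", "cpu": 93})
hits = store.query(datetime(2019, 9, 5, 10, 59))
```

Because a query touches only the table of the requested period, its cost depends on the size of that period rather than on the total amount of stored data.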

Referring to Fig. 1, Fig. 1 is a schematic diagram illustrating an implementation scenario 100 to which an example of the present application may be applied. As shown in Fig. 1, the implementation scenario 100 includes: node 11 to node 1n, node 21 to node 2n, hierarchical node 1, hierarchical node 2 and central node 3, where n is a positive integer.

In some embodiments, the nodes 11 to 1n, the nodes 21 to 2n, the hierarchical node 1, the hierarchical node 2, and the central node 3 may each be provided with a monitoring program for acquiring operation index data of the corresponding node. The monitoring program may be a script, a monitoring process, or the like. That is to say, the monitoring program may be set according to actual requirements, and the embodiment of the present application is not limited to this.

In some embodiments, the central node 3 may comprise a processor. The processor may process information and/or data related to the service request to perform one or more of the functions described herein. For example, the central node 3 screens out the operation index data that triggers an alarm from the plurality of operation index data uploaded from the nodes 11 to 1n. In some embodiments, a processor may include one or more processing cores (e.g., a single-core processor or a multi-core processor).

In the embodiment of the present application, each node from the node 11 to the node 1n obtains its own operation index data in advance through the monitoring program, and each node also caches the obtained operation index data in its own memory in advance. Correspondingly, each of the nodes 21 to 2n also obtains its own operation index data in advance through the monitoring program, and each node also caches the obtained operation index data in its own memory in advance.

Subsequently, after each of the nodes 11 to 1n acquires the instruction of the hierarchical node 1 or reaches the preset operation index data uploading time, the nodes 11 to 1n respectively upload the operation index data cached in the respective memories to the hierarchical node 1. Thus, the hierarchical node 1 can upload the operation index data of itself and the operation index data of the nodes 11 to 1n to the central node 3.

Similarly, after each of the nodes 21 to 2n acquires the instruction of the hierarchical node 2 or reaches the preset operation index data uploading time, the nodes 21 to 2n also upload the operation index data cached in the respective memories to the hierarchical node 2 respectively. Thus, the hierarchical node 2 can upload the operation index data of itself and the operation index data of the nodes 21 to 2n to the central node 3.
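As a non-limiting illustration, the two-level upload described above may be sketched as follows. Network transport is replaced by plain function calls, and the function names, node names and metric values are hypothetical.

```python
def collect_at_hierarchical_node(own_name, own_metrics, child_caches):
    """Merge a hierarchical node's own metrics with those uploaded by its nodes."""
    batch = {own_name: dict(own_metrics)}
    for child_name, cached in child_caches.items():
        batch[child_name] = dict(cached)
    return batch


def collect_at_central_node(batches):
    """Combine the batches forwarded by every hierarchical node."""
    merged = {}
    for batch in batches:
        merged.update(batch)
    return merged


# Hierarchical node 1 gathers nodes 11 and 12; hierarchical node 2 gathers node 21.
batch1 = collect_at_hierarchical_node(
    "hier1", {"cpu": 35}, {"node11": {"cpu": 96}, "node12": {"cpu": 40}}
)
batch2 = collect_at_hierarchical_node("hier2", {"cpu": 20}, {"node21": {"cpu": 55}})
all_metrics = collect_at_central_node([batch1, batch2])
```

In this sketch the central node receives one batch per hierarchical node instead of one message per node, which is the reduction in cross-node traffic that the hierarchy is meant to provide.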

After obtaining the operation index data of all the nodes, the central node 3 may sequentially match the operation index data of each node with the alarm judgment rule, so as to screen out the operation index data triggering the alarm from the operation index data, and display the operation index data triggering the alarm in real time. And the central node can also store the operation index data into a storage table of the corresponding time period in the database.

It should be noted that, although Fig. 1 illustrates a hierarchical node 1 and a hierarchical node 2, those skilled in the art should understand that hierarchical nodes may also be set according to actual needs, and the embodiments of the present application are not limited thereto. For example, in the case that the number of nodes in the server cluster is small, the hierarchical nodes may not be provided in the server cluster, that is, the central node directly communicates data with other nodes. For another example, when the number of nodes in the server cluster is large, a plurality of hierarchical nodes may be set in the server cluster, and each hierarchical node communicates with at least one node.

It should be further noted that the method for processing the cluster index and the method for querying the cluster index provided in the embodiment of the present invention may be further extended to other suitable implementation scenarios, and are not limited to the implementation scenario 100 shown in Fig. 1. Moreover, it should be understood by those skilled in the art that, in actual application, the implementation scenario 100 may include more or fewer nodes or hierarchical nodes, that is, the number of nodes and/or the number of hierarchical nodes in the embodiment of the present application may be set according to actual requirements, and the embodiment of the present application is not limited thereto.

Referring to fig. 2, fig. 2 is a flowchart of a method for processing a cluster indicator according to an embodiment of the present disclosure. As shown in fig. 2, the method includes:

Step S210, the node caches the collected operation index data.

It should be understood that a node may also be referred to as a server, and may also be referred to as a server node. That is to say, the node in the embodiment of the present application may set a name according to an actual requirement, and the embodiment of the present application is not limited thereto.

It should also be understood that the operational metric data may also be referred to as metrics, may also be referred to as operational parameters, and may also be referred to as operational conditions. That is to say, the operation index data in the embodiment of the present application may also be named according to actual requirements, and the embodiment of the present application is not limited thereto.

It should also be understood that the node here may be any one of all nodes in the cluster, and may also be any one of a plurality of nodes (i.e., some of all nodes) specified in the cluster, and the embodiment of the present application is not limited thereto.

Specifically, a monitoring program capable of collecting operation index data of the node is installed on the node in the cluster, and in the process of starting a project program of the node, the monitoring program can be preferentially executed, and the operation index data acquired by the monitoring program is stored in a memory of the node, so that a cache mechanism of the operation index data is added to the node.

It should be noted that the node may cache only part of the operation index data that the current node needs to collect, or may cache all of the operation index data, and the embodiment of the present application is not limited thereto.

For example, when the node needs to upload two operation index data, i.e., the memory utilization rate and the network card rate, to the central node, the node may cache the network card rate in advance. Subsequently, when receiving an instruction from the central node, the node acquires the memory utilization rate in real time, and then uploads the memory utilization rate and the network card rate to the central node.

For another example, when the node needs to upload two operation index data, that is, the memory utilization rate and the network card rate, to the central node, the node may cache both the memory utilization rate and the network card rate in advance, and subsequently, when an instruction of the central node is received, the node may directly upload the memory utilization rate and the network card rate to the central node.
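As a non-limiting illustration, partial and full caching on a node may be sketched as follows. The class `NodeCache`, the collector functions and the sample values are hypothetical; a real monitoring program would read the values from the operating system.

```python
class NodeCache:
    """Keeps selected metrics in memory and collects the others on demand."""

    def __init__(self, collectors, cached_names):
        self.collectors = collectors          # metric name -> zero-argument callable
        self.cached_names = set(cached_names)
        self.memory = {}

    def refresh(self):
        """Collect the metrics chosen for caching and keep them in memory."""
        for name in self.cached_names:
            self.memory[name] = self.collectors[name]()

    def build_upload(self):
        """Return all metrics: cached ones from memory, the rest collected now."""
        upload = {}
        for name, collect in self.collectors.items():
            if name in self.memory:
                upload[name] = self.memory[name]
            else:
                upload[name] = collect()
        return upload


calls = {"mem": 0, "nic": 0}


def read_mem():
    calls["mem"] += 1
    return 62.5   # hypothetical memory utilization rate, percent


def read_nic():
    calls["nic"] += 1
    return 940.0  # hypothetical network card rate, Mbit/s


# Partial caching: only the network card rate is cached in advance.
cache = NodeCache({"memory_utilization": read_mem, "nic_rate": read_nic}, ["nic_rate"])
cache.refresh()
upload = cache.build_upload()
```

Passing both metric names as `cached_names` gives the full-caching variant, in which `build_upload` performs no collection at upload time.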

It should also be noted that, when the period at which the node uploads data to the central node is short (e.g., 1 minute), the operation index data acquired by the node may be regarded as real-time operation index data because the uploading period is short. When the period at which the node uploads data to the central node is long (e.g., 3 hours), the node may collect the operation index data within a preset time period before the uploading time (e.g., 2 minutes in advance), thereby ensuring that the operation index data collected in this way is real-time operation index data.

Step S220, the node sends the operation index data cached in advance to the central node. Correspondingly, the central node acquires the operation index data cached in advance by each node in the plurality of nodes.

It should be understood that the condition for sending the pre-cached operation index data to the central node by the node may be set according to an actual requirement, and the embodiment of the present application is not limited thereto.

For example, when a node acquires an acquisition instruction of operation index data sent by a central node, the node may directly send the operation index data cached in the memory to the central node.

For another example, the node may upload the operation index data to the central node periodically at a preset period, so that the node may directly send the operation index data cached in the memory to the central node when the preset time is reached.
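As a non-limiting illustration, the two sending conditions above may be combined into a single check. The function name `should_upload` and its parameters are hypothetical; times are in seconds.

```python
def should_upload(now, last_upload, period_s, instruction_received):
    """Upload when the central node asks, or when the preset period has elapsed."""
    if instruction_received:
        return True
    return now - last_upload >= period_s


# 50 s after the last upload with a 60 s period: only an instruction triggers an upload.
on_instruction = should_upload(100.0, 50.0, 60.0, True)
too_early = should_upload(100.0, 50.0, 60.0, False)
period_reached = should_upload(110.0, 50.0, 60.0, False)
```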

It should also be understood that the sending manner in which the node sends the pre-cached operation index data to the central node may also be set according to actual requirements, and the embodiment of the present application is not limited thereto.

For example, the node may directly send the operation index data cached in the memory to the central node.

For another example, the node may send the operation index data cached in the memory to the corresponding hierarchical node (or management node), and then the hierarchical node forwards the operation index data to the central node.

Step S230, the central node screens out the operation index data triggering the alarm from the operation index data of the plurality of nodes.

It should be understood that the operation index data that triggers an alarm may also be referred to as abnormal operation index data, an abnormal index, an abnormal operation parameter, or an abnormal operation condition. That is to say, the operation index data for triggering an alarm in the embodiment of the present application may also be named according to actual requirements, and the embodiment of the present application is not limited thereto.

It should also be understood that the determination rule for the central node to filter the operation index data triggering the alarm may be set according to actual requirements, and the embodiment of the present application is not limited thereto.

In order to facilitate understanding of the embodiments of the present application, the following description will be given by way of specific examples.

Optionally, the central node may compare one operation index data with a corresponding alarm threshold, and determine the operation condition of the node or whether an alarm needs to be triggered according to the comparison result.

It should be understood that the alarm threshold may be set according to actual requirements, and the embodiment of the present application is not limited thereto.

For example, when the alarm threshold corresponding to the CPU utilization is 90% and the CPU utilization of the node acquired by the central node is 96%, the central node may determine that the CPU utilization of the node exceeds the alarm threshold through comparison, so that the central node may determine that the current CPU utilization is the CPU utilization capable of triggering an alarm.
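As a non-limiting illustration, screening by a single alarm threshold may be sketched as follows. The 90% threshold and the 96% reading come from the example above; the function name `screen_alarms` and the node names are hypothetical.

```python
ALARM_THRESHOLDS = {"cpu_utilization": 90.0}  # percent, as in the example above


def screen_alarms(node_metrics, thresholds=ALARM_THRESHOLDS):
    """Return (node, metric, value) for every value above its alarm threshold."""
    triggered = []
    for node, metrics in node_metrics.items():
        for name, value in metrics.items():
            limit = thresholds.get(name)
            # Metrics without a configured threshold never trigger an alarm.
            if limit is not None and value > limit:
                triggered.append((node, name, value))
    return triggered


alarms = screen_alarms(
    {"node11": {"cpu_utilization": 96.0}, "node12": {"cpu_utilization": 41.0}}
)
```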

Optionally, the central node may use at least two operational metric data to determine the operational condition of the node or whether an alarm needs to be triggered.

For example, when the central node compares the I/O (Input/Output) of the node with the alarm threshold, the central node determines that the I/O of the node is higher according to the comparison result.

In addition, in order to avoid raising an alarm for operation index data that is only temporarily abnormal due to application program interruption, network fluctuation and the like, the central node may continuously monitor the operation index data of the abnormal node (or the central node may determine the duration of the abnormal operation index data), and trigger a real-time alarm when the duration of the abnormal operation index data exceeds a preset time.
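As a non-limiting illustration, triggering an alarm only after a sustained anomaly may be sketched as follows. The class `SustainedAlarm`, the 120-second preset time and the sample readings are hypothetical.

```python
class SustainedAlarm:
    """Raises an alarm only when a metric stays abnormal longer than a preset time."""

    def __init__(self, threshold, min_duration_s):
        self.threshold = threshold
        self.min_duration_s = min_duration_s
        self.abnormal_since = None

    def observe(self, timestamp_s, value):
        """Feed one sample; return True once the anomaly has lasted long enough."""
        if value <= self.threshold:
            self.abnormal_since = None       # back to normal: reset the timer
            return False
        if self.abnormal_since is None:
            self.abnormal_since = timestamp_s
        return timestamp_s - self.abnormal_since > self.min_duration_s


detector = SustainedAlarm(threshold=90.0, min_duration_s=120)
samples = [(0, 95), (60, 97), (90, 50), (120, 96), (180, 96), (300, 99)]
results = [detector.observe(t, v) for t, v in samples]
```

In this run the normal reading at 90 s resets the timer, so the first abnormal stretch is treated as transient, and only the sample at 300 s, which is 180 s into the second stretch, triggers the alarm.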

It should be understood that the preset time may be set according to a time requirement, and the embodiment of the present application is not limited thereto.

In addition, in the process of screening the operation index data by the central node, the central node can also acquire alarm data related to the alarm. The data type included in the alarm data may be set according to actual requirements, and the embodiment of the present application is not limited to this.

For example, in the process that the central node may filter the operation index data according to the preset determination rule, the central node may further obtain alarm data such as an alarm level, an alarm time, and a node name where an alarm occurs.
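As a non-limiting illustration, the alarm data obtained during screening may be held in a small record. The fields mirror the alarm level, alarm time and node name mentioned above; the class `AlarmData`, the function `build_alarm` and the rule used to pick the level are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AlarmData:
    node_name: str
    metric: str
    value: float
    level: str
    time: datetime


def build_alarm(node_name, metric, value, threshold, now):
    """Create the alarm record for a value that has exceeded its threshold."""
    # Hypothetical rule: 10 or more above the threshold is "critical", else "warning".
    level = "critical" if value - threshold >= 10 else "warning"
    return AlarmData(node_name, metric, value, level, now)


alarm = build_alarm("node11", "cpu_utilization", 96.0, 90.0, datetime(2019, 9, 5, 10, 0))
```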

Step S240, the central node displays the operation index data for triggering the alarm.

It should be understood that, in the process of displaying the operation index data triggering the alarm, the central node may also display the alarm data, that is, the central node may simultaneously display the operation index data and the alarm data.

It should also be understood that, in addition to displaying the operation index data triggering the alarm, the central node may also push the operation index data triggering the alarm and/or the alarm data to the mobile terminal of the user (e.g., a mobile phone, a tablet computer, etc.), thereby ensuring that the user can quickly obtain the alarm information.

It should also be understood that the display mode of the central node displaying the operation index data triggering the alarm may also be set according to actual requirements, and the embodiment of the present application is not limited thereto.

Existing approaches display operation index data in tables. As the cluster scale keeps growing, however, the amount of data to be displayed increases, so the existing display mode cannot meet the user's need to learn the operating condition of the nodes accurately and quickly.

Therefore, the operation index data triggering the alarm can be displayed in a preset display mode, so that the display meets the need to learn the operating condition of the nodes accurately and quickly, and the data appears vivid and visually rich when the user views it.

Optionally, the central node may graphically display the operation index data that triggers the alarm.

For example, the central node may show the data collected in the last hour as a dynamically fluctuating graph. As another example, the central node may present the TOP N data (the N top-ranked data items, N being a positive integer) in a histogram.

Optionally, the central node may display the operation index data triggering the alarm in a display manner of different colors.

For example, the central node may also represent different alarm levels by different colors.
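
The two display aids mentioned above, a TOP N ranking and a color per alarm level, can be sketched as below. The level names, the colors, and the function names `top_n` and `color_for` are assumptions made for illustration; the text only states that the top N data may be shown and that levels may be distinguished by color.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical alarm levels and colors; the embodiment does not name them.
LEVEL_COLORS: Dict[str, str] = {"critical": "red", "warning": "orange", "info": "blue"}


def top_n(values: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Return the n nodes with the largest metric values, highest first."""
    return heapq.nlargest(n, values.items(), key=lambda item: item[1])


def color_for(level: str) -> str:
    """Map an alarm level to a display color, falling back to gray."""
    return LEVEL_COLORS.get(level, "gray")
```

The ranked pairs returned by `top_n` can be handed to whatever charting component draws the histogram.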

It should be noted that, in the embodiment of the present application, the central node may directly display the operation index data triggering the alarm, or may first store the operation index data triggering the alarm in the database and then read it from the database for display; the embodiment of the present application is not limited thereto.

It should be further noted that, for convenience of description, the data storage portions of the method for processing the cluster index are described in the method for querying the cluster index of fig. 4 below. They are not detailed here, and reference may be made to the relevant portions of that method.

To facilitate understanding of the method for processing the cluster index in the embodiment of the present application, a description is given below with a specific embodiment.

Referring to fig. 3, fig. 3 is a specific flowchart of a method for processing a cluster indicator according to an embodiment of the present disclosure. As shown in fig. 3, the method includes:

A plurality of nodes collect operation index data and cache the operation index data; each node in the plurality of nodes sends the cached operation index data to the hierarchical node; the hierarchical nodes forward the operation index data of the nodes to the central node, the central node puts the operation index data into a queue, then the central node reads the operation index data from the queue, and an alarm threshold corresponding to the current operation index data is obtained in the process of reading the operation index data.

The central node performs the alarm judgment in Redis memory; that is, the central node matches the operation index data against the alarm threshold to determine whether to raise an alarm. If the central node determines that the operation index data triggers an alarm, it executes the next step; otherwise, it deletes the operation index data.
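
The queue-then-match flow described above can be sketched as follows. This is only an illustration under stated assumptions: an in-process `deque` stands in for both the queue and the Redis memory (no Redis API is used), and the names `Sample` and `screen_queue` are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, NamedTuple


class Sample(NamedTuple):
    node: str
    metric: str
    value: float


def screen_queue(queue: Deque[Sample], thresholds: Dict[str, float]) -> List[Sample]:
    """Drain the queue and keep only the samples that exceed their threshold."""
    alarming: List[Sample] = []
    while queue:
        sample = queue.popleft()
        # The threshold is looked up while the sample is being read.
        threshold = thresholds.get(sample.metric)
        if threshold is not None and sample.value > threshold:
            alarming.append(sample)
        # Samples that do not trigger an alarm are simply dropped.
    return alarming
```

Samples that survive screening are the ones passed on to the next step; everything else is discarded, which keeps the stored data limited to alarm-related records.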

The alarm includes a quasi-alarm and a real-time alarm. A quasi-alarm means that the operation index data of the node meets the alarm-triggering condition; a real-time alarm means that the duration of the abnormal operation index data of the node exceeds the preset time.

Therefore, by matching the operation index data with the alarm threshold, the central node can determine a quasi-alarm, a real-time alarm, or the deletion of a quasi-alarm. Deleting a quasi-alarm means that, when the duration of the abnormal operation index data is less than the preset time, the central node deletes the alarm information of the corresponding earlier quasi-alarm.

Finally, the central node synchronizes the data related to the quasi-alarm, the quasi-alarm deletion and the real-time alarm recovery to the database, and the central node can also display the operation index data triggering the alarm. The operation index data triggering the alarm may be the operation index data triggering the quasi-alarm, the operation index data triggering the real-time alarm, or the operation index data triggering both the quasi-alarm and the real-time alarm.
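
One way to picture the three outcomes (quasi-alarm, real-time alarm, deletion of a quasi-alarm) is a small per-metric state tracker. The sketch below is an assumption-laden illustration: the class name `AlarmTracker`, the event strings, and the use of plain seconds for time are all hypothetical, and the recovery event is included only because real-time alarm recovery is mentioned in the flow of fig. 3.

```python
from typing import Dict, Optional, Set, Tuple

Key = Tuple[str, str]  # (node name, metric name)


class AlarmTracker:
    def __init__(self, preset_seconds: float) -> None:
        self.preset_seconds = preset_seconds
        self._since: Dict[Key, float] = {}  # start time of each open quasi-alarm
        self._fired: Set[Key] = set()       # keys already promoted to real-time alarm

    def observe(self, key: Key, abnormal: bool, now: float) -> Optional[str]:
        """Feed one reading; return the event it causes, or None."""
        if abnormal:
            if key not in self._since:
                self._since[key] = now
                return "quasi_alarm"
            if key not in self._fired and now - self._since[key] > self.preset_seconds:
                self._fired.add(key)
                return "real_time_alarm"
            return None
        # The reading is back to normal.
        started = self._since.pop(key, None)
        if started is None:
            return None
        if key in self._fired:
            self._fired.discard(key)
            return "real_time_alarm_recovered"
        # The abnormal state ended before the preset time elapsed.
        return "delete_quasi_alarm"
```

A short-lived spike therefore produces a quasi-alarm followed by its deletion, while an abnormal state that outlasts the preset time is promoted to a real-time alarm.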

Referring to fig. 4, fig. 4 is a flowchart of a method for querying a cluster index according to an embodiment of the present disclosure. As shown in fig. 4, the method includes:

Step S410, the central node obtains a query instruction, where the query instruction includes query time.

It should be understood that the query instruction may include a node name of the query, an alarm time, and the like, in addition to the query time. That is to say, the query instruction may be set according to actual requirements, and the embodiment of the present application is not limited to this.

It should also be understood that the query time may be a month, the last week, the last hour, etc. That is to say, the query time may also be set according to actual requirements, and the embodiment of the present application is not limited to this.

Step S420, the central node queries the operation index data corresponding to the query time period from the database according to the query instruction. The operation index data in the database is stored according to different time periods, and it is the operation index data triggering the alarm that has been screened out from the pre-cached operation index data sent by each node.

It should be understood that the time period may be set according to actual requirements, and the embodiment of the present application is not limited thereto. Different data may be stored according to different time periods (for example, the database may store the data for August 23, 2017 by hour, but store the data for August 24, 2017 by day), and the embodiment of the present application is not limited to this.

For example, the data in the database may be aggregated and stored by hour or by day, or divided into real-time data and historical data, so that index data can be queried and aggregated over different time intervals as required, and the running condition of the indexes in the cluster can be analyzed in real time.

Specifically, the central node may obtain a storage rule of the database (e.g., statistics by day) and archive the data in the database according to that rule; this storage manner can also improve the efficiency of data storage.

For example, in the case that the storage rule of the database is statistics by day, the database may be stored in a manner of establishing one table for each day, that is, one table corresponds to one day, and each table stores the operation index data for triggering the alarm in the corresponding day. Therefore, under the condition that the central node screens out the operation index data triggering the alarm, the central node can store the operation index data triggering the alarm into the database, namely the central node can store the operation index data triggering the alarm into a corresponding table in the database.
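
The one-table-per-day layout can be sketched with Python's built-in `sqlite3` module, which here merely stands in for whatever database the central node uses. The table naming scheme `alarm_YYYYMMDD`, the column set, and the function names are assumptions for illustration.

```python
import sqlite3
from datetime import datetime


def table_for(day: datetime) -> str:
    """Derive the per-day table name, e.g. alarm_20170823."""
    # The name is built only from a formatted date, so interpolating it is safe.
    return f"alarm_{day:%Y%m%d}"


def store_alarm(conn: sqlite3.Connection, node: str, metric: str,
                value: float, alarm_time: datetime) -> str:
    """Insert one alarm-triggering record into the table of its day."""
    table = table_for(alarm_time)
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(node TEXT, metric TEXT, value REAL, alarm_time TEXT)"
    )
    conn.execute(
        f"INSERT INTO {table} VALUES (?, ?, ?, ?)",
        (node, metric, value, alarm_time.isoformat(sep=" ")),
    )
    return table
```

A query for a given day then touches only that day's table, which is the source of the faster historical lookups described in this section.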

In addition, in order to establish the relevance between different operation index data, the central node can screen out at least one group of operation index data from the operation index data of the plurality of nodes and store the operation index data in the database. Each set of operation index data is data related to one alarm, and each set of operation index data comprises at least two operation index data.

For example, when the plurality of operation index data uploaded by the node includes the operating state and the memory utilization rate of the CPU, and when the central node determines a real-time alarm through the two operation index data, that is, the operating state and the memory utilization rate of the CPU, the central node may store the real-time alarm, the operating state and the memory utilization rate of the CPU into the database, and there is an association relationship between the real-time alarm, the operating state and the memory utilization rate of the CPU, that is, when the central node reads the relevant information of the real-time alarm, the central node may read not only the real-time alarm from the database, but also the information related to the real-time alarm.
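
The association between one alarm and the operation index data behind it can be pictured as a single grouped record. The dataclass below is a hypothetical sketch: the name `AlarmGroup`, its fields, and the metric keys are assumptions, and the only rule taken from the text is that each group holds at least two operation index data.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AlarmGroup:
    node: str
    alarm_type: str  # for example "real_time"
    metrics: Dict[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Each group relates one alarm to at least two operation index data.
        if len(self.metrics) < 2:
            raise ValueError("a group needs at least two operation index data")
```

Storing and reading the group as a unit means that reading the alarm also yields the metrics that caused it.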

It should be noted that, in the method for querying a cluster index in the embodiment of the present application, the read operation index data corresponding to the query time may also be displayed. The manner of this presentation can be seen from the related description of step S240 in fig. 2, and is not described in detail here.

Therefore, in the embodiment of the application, all the operation index data triggering the alarm in the database are stored according to different time periods, that is, in sub-tables split along the time dimension. Because the operation index data are stored by time period in advance, historical operation index data for a given time period can be queried quickly when a user wants to view them, which improves query efficiency and speed and further improves the user experience.

In addition, according to the technical scheme, the running conditions of index acquisition, fault nodes and key services in the cluster can be loaded at one time. The key services include Web services, Nis (Network Information Service) services, Mysql (relational database management system) services, login services, scheduling services, and the like. And the embodiment of the application also has the effect of updating and acquiring dynamic data in real time, so that the data is more accurate and reliable.

To facilitate understanding of the method for querying the cluster index in the embodiment of the present application, the following description is made with a specific embodiment.

As shown in fig. 5, fig. 5 is a specific flowchart of a method for querying a cluster index according to an embodiment of the present application. As shown in fig. 5, the method includes:

Step S510, the central node obtains a query instruction input by the user. The query instruction includes a node name (or cabinet identifier) and a query time.

Specifically, the central node may query the chassis from the database according to the cabinet identifier passed from the front end, and then query the nodes according to the chassis. If the user queries a specific node, the user can directly input the node name to perform a fuzzy query.

If the user wants to query a plurality of nodes, the user can connect the node names with "or".
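
A fuzzy query over several node names joined by "or" could be turned into a parameterized SQL condition as sketched below. The exact input syntax (whitespace around "or"), the column name `node`, and the function name `node_filter` are assumptions; the text only states that names are connected through "or" and matched fuzzily.

```python
import re
from typing import List, Tuple


def node_filter(query: str) -> Tuple[str, List[str]]:
    """Build a WHERE fragment and its parameters from 'name1 or name2 ...'."""
    parts = re.split(r"\s+or\s+", query.strip(), flags=re.IGNORECASE)
    names = [part.strip() for part in parts if part.strip()]
    clause = " OR ".join("node LIKE ?" for _ in names)
    # Wrapping each name in % gives a substring (fuzzy) match.
    params = [f"%{name}%" for name in names]
    return clause, params
```

Because the names travel as parameters rather than being pasted into the SQL text, user input cannot alter the structure of the query.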

Step S520, the central node reads the operation index data and the alarm data related to the query instruction from the database.

Specifically, the central node reads out the alarm data and the abnormal index data related to the query instruction from the database according to the node name in the query instruction, and may also sort the alarm data and the abnormal index data according to the node name.

If an alarm in the alarm data occurred before the start time of the query time and has not been restored to a normal state, the corresponding node is also within the query range.
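
Read as an interval-overlap rule, the condition above can be sketched as follows. This assumes that an alarm which began before the query window and has not yet recovered still counts as inside the window; the function name `in_query_range` and the use of `None` for "not yet recovered" are hypothetical.

```python
from datetime import datetime
from typing import Optional


def in_query_range(alarm_start: datetime, recovered_at: Optional[datetime],
                   query_start: datetime, query_end: datetime) -> bool:
    """Return True when the alarm was active at some point inside the window."""
    if alarm_start > query_end:
        return False
    # An alarm with no recovery time is still active and therefore included.
    return recovered_at is None or recovered_at >= query_start
```

An alarm that started and recovered entirely before the window is excluded, while one that started earlier and is still open is kept.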

Step S530, the central node counts the operation index data and the alarm data related to the query instruction.

Specifically, the central node may count the number of total alarm categories in the query time period through the operation index data and the alarm data related to the query instruction. The central node can respectively match each alarm data, the operation index data and the alarm categories, so that the total number of the alarm categories is counted in the traversal mode.

In addition, before traversing, the central node may convert the alarm time in the alarm data into the minimum unit that can be displayed by the front-end interface, that is, perform format conversion on the alarm time in the alarm data. For example, if the smallest unit that the front-end interface can display is one hour, the format is yyyy-MM-dd HH:00:00.
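
The hour-level format conversion can be illustrated in Python, where the pattern yyyy-MM-dd HH:00:00 corresponds to the `strftime` format `%Y-%m-%d %H:00:00`. The function name `to_hour_bucket` is an assumption.

```python
from datetime import datetime


def to_hour_bucket(alarm_time: datetime) -> str:
    """Truncate a timestamp to the hour and render it as yyyy-MM-dd HH:00:00."""
    return alarm_time.strftime("%Y-%m-%d %H:00:00")
```

For instance, an alarm at 14:37:55 on June 12, 2017 is rendered as `2017-06-12 14:00:00`, so all alarms within the same hour share one label.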

In addition, when the queried node is a plurality of nodes, the central node may compare the current node name with the node name recorded in the previous record in the traversal process of the central node, and perform accumulation statistics if the current node name is the same node, or perform initialization statistics if the current node name is not the same node.
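
The traversal with a comparison against the previous record can be sketched as below. It assumes the records are already sorted by node name (as the reading step allows) and that each record is a `(node, category)` pair; both the record shape and the name `count_by_node` are illustrative assumptions.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def count_by_node(records: Iterable[Tuple[str, str]]) -> List[Tuple[str, Dict[str, int]]]:
    """Count alarm categories per node in a single pass over sorted records."""
    results: List[Tuple[str, Dict[str, int]]] = []
    previous: Optional[str] = None
    counts: Dict[str, int] = {}
    for node, category in records:
        if node != previous:
            # A different node name: initialize fresh statistics.
            counts = {}
            results.append((node, counts))
            previous = node
        # Same node as the previous record: keep accumulating.
        counts[category] = counts.get(category, 0) + 1
    return results
```

Because only the previous node name is remembered, the statistics for many nodes are produced in one pass without a lookup structure keyed by node.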

And step S540, the central node constructs a line graph through the statistical data.

Specifically, if the alarm time and the alarm recovery time are identical after formatting, the central node treats the data as "point" data and performs classification statistics according to the index name. If the alarm time and the recovery time differ after formatting, the central node treats the data as "line" data and performs an accumulation operation, using the formatted time for each judgment and accumulation.
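
The distinction between "point" and "line" data can be sketched as follows, assuming an hourly minimum unit and assuming that a "line" alarm contributes one count to every hour bucket it spans. The accumulation rule, the omission of the per-index-name classification, and the names `hour_floor` and `add_alarm` are simplifications made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def hour_floor(moment: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour."""
    return moment.replace(minute=0, second=0, microsecond=0)


def add_alarm(series: Counter, start: datetime, end: datetime) -> str:
    """Add one alarm to the hourly series and report whether it is point or line data."""
    first, last = hour_floor(start), hour_floor(end)
    if first == last:
        # Alarm and recovery fall into the same bucket after formatting.
        series[first] += 1
        return "point"
    bucket = first
    while bucket <= last:
        # Accumulate once per formatted hour between alarm and recovery.
        series[bucket] += 1
        bucket += timedelta(hours=1)
    return "line"
```

The resulting counter maps each hour to a count, which is the shape a line graph needs for its data points.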

And step S550, the central node displays the alarm data and the abnormal operation index data through a line graph.

Specifically, the central node calculates the minimum units used by the front-end Echarts graph presentation and the formatted start time and end time according to the start time and end time of the query. In other words, the central node performs time conversion again.

For example, if the front-end Echarts chart needs to show the date with a two-digit year (year 17, month 6, day 12) while the alarm time in the background uses the four-digit year (year 2017, month 6, day 12), the leading "20" of 2017 cannot be displayed on the front-end interface, so time conversion is required here for the alarm time to be displayed.

It is to be understood that the above-described method is merely exemplary, and various modifications may be made by those skilled in the art based on the above-described method.

For example, while the operations of the method of the invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Rather, the steps depicted in the flowcharts may change the order of execution. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.

Referring to fig. 6, fig. 6 shows a block diagram of a device 600 for processing cluster indexes provided in an embodiment of the present application. It should be understood that the device 600 corresponds to the central node side in the method embodiments of fig. 2 to fig. 5 and is capable of performing the steps of the central node side in those method embodiments. For the specific functions of the device 600, reference may be made to the foregoing description; detailed descriptions are omitted here where appropriate to avoid repetition. The device 600 comprises at least one software function module that can be stored in a memory in the form of software or firmware or that is fixed in an Operating System (OS) of the device 600. Specifically, the device 600 comprises:

a first obtaining module 610, configured to obtain operation index data cached in advance by each node in the multiple nodes; the screening module 620 is configured to screen operation index data triggering an alarm from the operation index data of the plurality of nodes; and a display module 630, configured to display the operation index data for triggering the alarm.

In one possible embodiment, the apparatus comprises: a storage module (not shown) for storing the operation index data triggering the alarm in a database, wherein all the operation index data triggering the alarm in the database are stored according to different time periods.

In a possible embodiment, the screening module 620 is further configured to screen at least one set of operation index data from the operation index data of the plurality of nodes, where each set of operation index data is data related to an alarm, and each set of operation index data includes at least two operation index data.

In a possible embodiment, the displaying module 630 is configured to display the operation index data triggering the alarm in a preset displaying manner, where the preset displaying manner includes one of the following manners: graphics and colors.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method, and will not be described in too much detail herein.

Referring to fig. 7, fig. 7 shows a block diagram of a device 700 for processing cluster indexes provided in an embodiment of the present application. It should be understood that the device 700 corresponds to the node side in the method embodiments of fig. 2 to fig. 5 and is capable of performing the steps of the node side in those method embodiments. For the specific functions of the device 700, reference may be made to the foregoing description; detailed descriptions are omitted here to avoid repetition. The device 700 comprises at least one software function module that can be stored in a memory in the form of software or firmware or that is fixed in an Operating System (OS) of the device 700. Specifically, the device 700 comprises:

The cache module 710 is configured to cache the collected operation index data; the sending module 720 is configured to send the operation index data cached in advance to the central node, so that the central node screens out the operation index data triggering the alarm from the operation index data, and displays the operation index data triggering the alarm.

Referring to fig. 8, fig. 8 shows a block diagram of a device 800 for querying a cluster index according to an embodiment of the present application. It should be understood that the device 800 corresponds to the central node side in the method embodiments of fig. 2 to fig. 5 and is capable of performing the steps of the central node side in those method embodiments. For the specific functions of the device 800, reference may be made to the foregoing description; detailed descriptions are omitted here where appropriate to avoid repetition. The device 800 comprises at least one software function module that can be stored in a memory in the form of software or firmware or that is fixed in an Operating System (OS) of the device 800. Specifically, the device 800 comprises:

A second obtaining module 810, configured to obtain a query instruction, where the query instruction includes a query time interval; the query module 820 is configured to query, according to the query instruction, operation index data corresponding to the query time period from the database, where the operation index data in the database is stored according to different time periods, and the operation index data is operation index data that triggers an alarm and is screened from pre-cached operation index data sent by each node.

The present application further provides an electronic device 900, where the electronic device 900 may be disposed in a node or a central node.

Fig. 9 is a block diagram of an electronic device 900 according to an embodiment of the present application. As shown in fig. 9, the electronic device 900 may include a processor 910, a communication interface 920, a memory 930, and at least one communication bus 940. The communication bus 940 is used for direct connection communication among these components. In this embodiment, the communication interface 920 is used for signaling or data communication with other node devices. The processor 910 may be an integrated circuit chip having signal processing capabilities. The processor 910 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 910 may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor 910 may be any conventional processor or the like.

The memory 930 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 930 stores computer-readable instructions; when the computer-readable instructions are executed by the processor 910, the electronic device 900 may perform the steps of the central node side in the method embodiments of fig. 2 to 5 described above.

The electronic device 900 may further include a memory controller, an input-output unit, an audio unit, an image capture unit.

The memory 930, the memory controller, the processor 910, the peripheral interface, the input/output unit, the audio unit, and the image capturing unit are electrically connected to each other directly or indirectly to realize data transmission or interaction. For example, these components may be electrically coupled to each other via one or more communication buses 940. The processor 910 is configured to execute executable modules stored in the memory 930, such as software functional modules or computer programs included in the electronic device 900.

The input and output unit is used for realizing interaction between the unmanned aerial vehicle and the unmanned aerial vehicle control device. The input/output unit may be, but is not limited to, a data input/output interface, etc.

The image acquisition unit may be configured to cause the drone to acquire a gesture image or a face image. The image acquisition unit may be, but is not limited to, a camera or the like.

The audio unit provides an audio interface to the user, which may include one or more microphones, one or more speakers, and audio circuitry.

It will be appreciated that the configuration shown in FIG. 9 is merely illustrative and that the electronic device 900 may include more or fewer components than shown in FIG. 9 or have a different configuration than shown in FIG. 9. The components shown in fig. 9 may be implemented in hardware, software, or a combination thereof.

The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method according to any of the alternative implementations of the central node side in fig. 2 to 5.

The present application also provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, performs the method according to any one of the alternative implementations of the node side in fig. 2 to 5.

It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the system described above may refer to the corresponding process in the foregoing method, and will not be described in too much detail herein.

It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other. For the device-like embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.

The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.

The above description covers only specific embodiments of the present application, but the scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can easily conceive of within the technical scope of the present application shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method for processing cluster metrics, comprising:

acquiring operation index data cached in advance by each node in a plurality of nodes;

Screening out operation index data triggering an alarm from the operation index data of the plurality of nodes;

And displaying the operation index data of the triggering alarm.

2. The method of claim 1, further comprising:

And storing the operation index data of the triggered alarm in a database, wherein all the operation index data of the triggered alarm in the database are stored according to different time periods.

3. The method according to claim 1 or 2, wherein the screening out operation index data triggering an alarm from operation index data of the plurality of nodes comprises:

And screening at least one group of operation index data from the operation index data of the nodes, wherein each group of operation index data is data related to one alarm, and each group of operation index data comprises at least two operation index data.

4. The method of claim 1, wherein the presenting the operation index data that triggers the alarm comprises:

Displaying the operation index data of the triggered alarm in a preset display mode, wherein the preset display mode comprises one of the following modes: graphics and colors.

5. A method for processing cluster metrics, comprising:

Caching the collected operation index data;

And sending pre-cached operation index data to a central node, so that the central node can screen out the operation index data triggering the alarm from the operation index data and display the operation index data triggering the alarm.

6. A method for querying a cluster index, comprising:

Acquiring a query instruction, wherein the query instruction comprises a query time interval;

And querying operation index data corresponding to the query time period from a database according to the query instruction, wherein the operation index data in the database are stored according to different time periods, and the operation index data are screened out from the operation index data cached in advance sent by each node and used for triggering alarm.

7. An apparatus for processing cluster metrics, comprising:

The first acquisition module is used for acquiring operation index data cached in advance by each node in the plurality of nodes;

The screening module is used for screening out operation index data triggering alarms from the operation index data of the nodes;

And the display module is used for displaying the operation index data of the triggered alarm.

8. The apparatus of claim 7, wherein the apparatus comprises:

And the storage module is used for storing the operation index data for triggering the alarm in a database, wherein all the operation index data for triggering the alarm in the database are stored according to different time periods.

9. An apparatus for processing cluster metrics, comprising:

The cache module is used for caching the collected operation index data;

The sending module is used for sending the operation index data cached in advance to a central node, so that the central node can screen out the operation index data triggering the alarm from the operation index data and display the operation index data triggering the alarm.

10. An apparatus for querying cluster metrics, comprising:

The second acquisition module is used for acquiring a query instruction, and the query instruction comprises a query time interval;

And the query module is used for querying the operation index data corresponding to the query time period from a database according to a query instruction, wherein the operation index data in the database are stored according to different time periods, and the operation index data are screened out from the operation index data cached in advance sent by each node and used for triggering the alarm.