CN114238008A - Data acquisition method, device and system, electronic equipment and storage medium - Google Patents


Info

Publication number
CN114238008A
Authority
CN
China
Prior art keywords
distributed cluster
data
cluster
host
distributed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111320518.XA
Other languages
Chinese (zh)
Inventor
赵宇
马欢
侯雪峰
王东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN202111320518.XA
Publication of CN114238008A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3452 Performance evaluation by statistical analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Hardware Design (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Embodiments of the invention provide a data acquisition method, apparatus, and system, an electronic device, and a storage medium. A host in a distributed cluster acquires, from the running data of each component deployed in the host, the target data indicated by a specified monitoring index, and sends the acquired target data to a management server. For each distributed cluster, the management server receives the target data sent by each host in the distributed cluster; determines, from that target data, the to-be-processed data indicated by a calculation index corresponding to the distributed cluster; and aggregates the to-be-processed data corresponding to the distributed cluster to obtain a corresponding aggregation result. Based on this processing, the running data of the components deployed in the hosts of each distributed cluster can be acquired and managed in a unified manner.

Description

Data acquisition method, device and system, electronic equipment and storage medium
Technical Field
The present invention relates to the field of internet technologies, and in particular, to a data acquisition method, apparatus, system, electronic device, and storage medium.
Background
With the rapid development of big data technology, distributed clusters based on big data technology are widely applied in various fields. A distributed cluster includes multiple hosts (e.g., servers), in each of which multiple big data technology-based components can be deployed to implement corresponding functionality. For example, an HBase (distributed database) component is deployed on each host in a distributed cluster to implement distributed storage of data.
To optimize the performance of a distributed cluster, it is necessary to obtain operational data for the components deployed in each host in the distributed cluster. Subsequently, performance of the host (e.g., CPU usage of the host, memory usage of the host, etc.) may be analyzed based on the obtained operating data of each component.
Disclosure of Invention
An object of the embodiments of the present invention is to provide a data acquisition method, apparatus, system, electronic device, and storage medium, which can acquire operation data of components deployed in hosts in each distributed cluster, and can perform unified management on the operation data of the components deployed in the hosts in the distributed cluster. The specific technical scheme is as follows:
In a first aspect, to achieve the above object, an embodiment of the present invention provides a data acquisition system, where the system includes a distributed cluster and a management server, wherein:
the host in the distributed cluster is used for acquiring data indicated by a specified monitoring index from the running data of each component deployed in the host as target data; sending the obtained target data to the management server;
the management server is used for receiving, for each distributed cluster, the target data sent by each host in the distributed cluster; determining data indicated by a calculation index corresponding to the distributed cluster from the target data sent by each host in the distributed cluster, and taking the data as to-be-processed data corresponding to the distributed cluster; and aggregating the to-be-processed data corresponding to the distributed cluster to obtain a corresponding aggregation result, wherein the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster.
In a second aspect, to achieve the above object, an embodiment of the present invention provides a data acquisition method, where the method is applied to a management server, and the method includes:
for each distributed cluster, receiving target data sent by each host in the distributed cluster; wherein the target data is: data indicated by a specified monitoring index in the running data of each component deployed in the host;
determining data indicated by a calculation index corresponding to the distributed cluster from target data sent by each host in the distributed cluster, and taking the data as to-be-processed data corresponding to the distributed cluster; wherein, the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster;
and carrying out aggregation processing on the data to be processed corresponding to the distributed cluster to obtain a corresponding aggregation result.
Optionally, before receiving, for each distributed cluster, target data sent by each host in the distributed cluster, the method further includes:
for each distributed cluster, determining a preset queue corresponding to the cluster priority of the distributed cluster as a target queue according to the corresponding relation between the cluster priority recorded in a preset database and the preset queue through a data transmission process;
the receiving, for each distributed cluster, target data sent by each host in the distributed cluster includes:
receiving target data sent by each host in the distributed cluster through the data transmission process;
after receiving, for each distributed cluster, the target data sent by each host in the distributed cluster, the method further includes:
adding the target data of the distributed cluster to the target queue through the data transmission process;
and acquiring the target data of the distributed cluster from the target queue through a data processing process.
Optionally, after adding the target data of the distributed cluster to the target queue, the method further includes:
if the target data of the distributed cluster is not successfully added to the target queue, adding the target data of the distributed cluster to a backup queue; wherein the backup queue is used to restore data that was not successfully added to the queue.
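The fallback to a backup queue described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes that "not successfully added" can be modelled as a bounded in-memory queue being full, and the function name `enqueue_with_backup` is hypothetical.

```python
import queue


def enqueue_with_backup(target_queue, backup_queue, target_data):
    """Try the target queue first; fall back to the backup queue on failure.

    Assumption: an add "fails" when the bounded target queue is full.
    A real deployment could fail for other reasons (e.g. a broker outage).
    """
    try:
        target_queue.put_nowait(target_data)
        return "target"
    except queue.Full:
        # Keep the data so it can be restored to the target queue later.
        backup_queue.put_nowait(target_data)
        return "backup"
```

Data held in the backup queue can later be drained and re-added to the target queue once the target queue accepts writes again.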
Optionally, before determining, in target data sent from each host in the distributed cluster, data indicated by a calculation index corresponding to the distributed cluster, as to-be-processed data corresponding to the distributed cluster, the method further includes:
acquiring a calculation index corresponding to the distributed cluster from the preset database through the data processing process;
determining data indicated by the calculation index corresponding to the distributed cluster from the target data sent by each host in the distributed cluster, and taking the data as to-be-processed data corresponding to the distributed cluster, wherein the data comprises:
and determining data indicated by the calculation indexes corresponding to the distributed cluster from the target data sent by each host in the distributed cluster through the data processing process, wherein the data are used as the data to be processed corresponding to the distributed cluster.
Optionally, the method further includes:
and when a monitoring strategy updating instruction is received, updating the corresponding relation between the cluster priority in the preset database and the preset queue and/or the calculation index of each distributed cluster.
Optionally, the method further includes:
for each distributed cluster, when a registration request sent by a host in the distributed cluster is received, acquiring the specified monitoring index from the preset database;
and sending the specified monitoring index to the host in the distributed cluster.
In a third aspect, to achieve the above object, an embodiment of the present invention provides a data obtaining method, where the method is applied to a host in a distributed cluster, and the method includes:
acquiring data indicated by a specified monitoring index from the running data of each component deployed in the host as target data;
sending the obtained target data to a management server, so that the management server determines data indicated by a calculation index corresponding to each distributed cluster from the target data sent by a host in the distributed cluster as to-be-processed data corresponding to the distributed cluster; and aggregating the data to be processed corresponding to the distributed cluster to obtain a corresponding aggregation result, wherein the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster.
Optionally, before obtaining, as target data, data indicated by a specified monitoring index from the operation data of each component deployed in the host, the method further includes:
sending a registration request of the distributed cluster to the management server, so that the management server, when receiving the registration request, obtains a specified monitoring index from a preset database and sends the specified monitoring index to the host;
and receiving the specified monitoring index sent by the management server.
Optionally, before obtaining the data indicated by the specified monitoring index from the operation data of each component deployed in the host, as target data, the method further includes:
when receiving a component adding message, acquiring a monitoring interface address of an added component from the component adding message; wherein the monitoring interface address is used for acquiring the operation data of the added component.
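As a rough illustration of the component addition step, the sketch below records the monitoring interface address carried in a component adding message. The message layout (keys `component` and `monitor_address`), the registry dictionary, and the example address are all assumptions made for illustration; the source does not specify a message format.

```python
def handle_component_add(message, registry):
    """Record the monitoring interface address of a newly added component.

    `message` is a hypothetical dict with the keys "component" and
    "monitor_address"; `registry` maps component name -> address.
    """
    registry[message["component"]] = message["monitor_address"]
    return registry
```

The host can then poll each address in the registry to obtain the running data of the corresponding component.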
In a fourth aspect, to achieve the above object, an embodiment of the present invention provides a data acquisition apparatus, where the apparatus is applied to a management server, and the apparatus includes:
the receiving module is used for receiving target data sent by each host in each distributed cluster; wherein the target data is: data indicated by a specified monitoring index in the running data of each component deployed in the host;
the first determining module is used for determining data indicated by the calculation indexes corresponding to the distributed cluster from target data sent by each host in the distributed cluster, and the data are used as to-be-processed data corresponding to the distributed cluster; wherein, the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster;
and the aggregation module is used for performing aggregation processing on the data to be processed corresponding to the distributed cluster to obtain a corresponding aggregation result.
Optionally, the apparatus further comprises:
a second determining module, configured to, before the receiving module receives, for each distributed cluster, the target data sent by each host in the distributed cluster, determine, for each distributed cluster and through a data transmission process, the preset queue corresponding to the cluster priority of the distributed cluster as a target queue, according to the correspondence between cluster priorities and preset queues recorded in a preset database;
the receiving module is specifically configured to receive, through the data transmission process, target data sent by each host in the distributed cluster;
the device further comprises:
the first adding module is used for adding the target data of each distributed cluster to the target queue through the data transmission process after the receiving module receives the target data sent by each host in each distributed cluster;
and the first acquisition module is used for acquiring the target data of the distributed cluster from the target queue through a data processing process.
Optionally, the apparatus further comprises:
a second adding module, configured to, after the first adding module attempts to add the target data of the distributed cluster to the target queue, add the target data of the distributed cluster to a backup queue if the target data was not successfully added to the target queue; wherein the backup queue is used to restore data that was not successfully added to the queue.
Optionally, the apparatus further comprises:
a second obtaining module, configured to, before the first determining module determines, from the target data sent by each host in the distributed cluster, the data indicated by the calculation index corresponding to the distributed cluster as the to-be-processed data corresponding to the distributed cluster, obtain the calculation index corresponding to the distributed cluster from the preset database through the data processing process;
the first determining module is specifically configured to determine, through the data processing process, data indicated by a calculation index corresponding to the distributed cluster from target data sent by each host in the distributed cluster, and use the data as to-be-processed data corresponding to the distributed cluster.
Optionally, the apparatus further comprises:
and the updating module is used for updating the corresponding relation between the cluster priority in the preset database and the preset queue and/or the calculation index of each distributed cluster when a monitoring strategy updating instruction is received.
Optionally, the apparatus further comprises:
the third acquisition module is used for acquiring the specified monitoring index from the preset database when a registration request sent by a host in each distributed cluster is received;
and the sending module is used for sending the specified monitoring index to the host in the distributed cluster.
In a fifth aspect, to achieve the above object, an embodiment of the present invention provides a data acquisition apparatus, where the apparatus is applied to a host in a distributed cluster, and the apparatus includes:
the first acquisition module is used for acquiring data indicated by a specified monitoring index from the running data of each component deployed in the host computer as target data;
the first sending module is used for sending the obtained target data to a management server so that the management server determines data indicated by a calculation index corresponding to each distributed cluster from the target data sent by a host in the distributed cluster as to-be-processed data corresponding to the distributed cluster; and aggregating the data to be processed corresponding to the distributed cluster to obtain a corresponding aggregation result, wherein the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster.
Optionally, the apparatus further comprises:
a second sending module, configured to, before the first obtaining module obtains, from the running data of each component deployed in the host, the data indicated by the specified monitoring index as target data, send a registration request of the distributed cluster to the management server, so that the management server, when receiving the registration request, obtains the specified monitoring index from a preset database and sends the specified monitoring index to the host;
and the receiving module is used for receiving the specified monitoring index sent by the management server.
Optionally, the apparatus further comprises:
a second obtaining module, configured to, before the first obtaining module obtains, from the running data of each component deployed in the host, the data indicated by the specified monitoring index as target data, obtain, when a component addition message is received, the monitoring interface address of the added component from the component addition message; wherein the monitoring interface address is used for acquiring the operation data of the added component.
The embodiment of the invention also provides electronic equipment which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete mutual communication through the communication bus;
a memory for storing a computer program;
a processor, configured to implement the steps of the data acquisition method according to the second aspect or the steps of the data acquisition method according to the third aspect when executing the program stored in the memory.
An embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the data obtaining method steps of the second aspect, or the data obtaining method steps of the third aspect.
Embodiments of the present invention further provide a computer program product containing instructions, which when run on a computer, cause the computer to execute the data acquisition method according to the second aspect or the data acquisition method according to the third aspect.
According to the technical scheme provided by the embodiment of the invention, a host in a distributed cluster acquires data indicated by a specified monitoring index from running data of each component deployed in the host as target data; and sending the acquired target data to a management server. Aiming at each distributed cluster, the management server receives target data sent by each host in the distributed cluster; determining data indicated by a calculation index corresponding to the distributed cluster from target data sent by each host in the distributed cluster, and taking the data as to-be-processed data corresponding to the distributed cluster; and carrying out aggregation processing on the data to be processed corresponding to the distributed cluster to obtain a corresponding aggregation result.
Based on the above processing, the management server and the hosts in the distributed clusters can be used to obtain the operation data of the components deployed in each host in each distributed cluster, and the management server can be used to perform uniform management on the operation data of the components deployed in each host in the distributed clusters. In addition, the target data of each distributed cluster can be processed according to the calculation index corresponding to the cluster priority of the distributed cluster, and the targeted processing of different distributed clusters can be realized.
Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other embodiments from these drawings.
FIG. 1 is a flowchart of a data acquisition method according to an embodiment of the present invention;
FIG. 2 is a flowchart of another data acquisition method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a preset queue according to an embodiment of the present invention;
FIG. 4 is a flowchart of another data acquisition method according to an embodiment of the present invention;
FIG. 5 is a flowchart of another data acquisition method according to an embodiment of the present invention;
FIG. 6 is a flowchart of another data acquisition method according to an embodiment of the present invention;
FIG. 7 is a flowchart of a component registration method according to an embodiment of the present invention;
FIG. 8 is a flowchart of another data acquisition method according to an embodiment of the present invention;
FIG. 9 is a flowchart of another data acquisition method according to an embodiment of the present invention;
FIG. 10 is a structural diagram of a data acquisition apparatus according to an embodiment of the present invention;
FIG. 11 is a structural diagram of another data acquisition apparatus according to an embodiment of the present invention;
FIG. 12 is a structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived from the embodiments given herein by one of ordinary skill in the art, are within the scope of the invention.
To optimize the performance of a distributed cluster, it is necessary to obtain operational data for the components deployed in each host in the distributed cluster. Subsequently, performance of the host (e.g., CPU usage of the host, memory usage of the host, etc.) may be analyzed based on the obtained operating data of each component.
An embodiment of the invention provides a data acquisition system that can acquire the running data of each component deployed in each host in a distributed cluster. The system comprises the distributed cluster and a management server, wherein:
The host in the distributed cluster is used for acquiring data indicated by a specified monitoring index from the running data of each component deployed in the host as target data, and for sending the acquired target data to the management server.
The management server is used for receiving, for each distributed cluster, the target data sent by each host in the distributed cluster; determining data indicated by a calculation index corresponding to the distributed cluster from the target data sent by each host in the distributed cluster, and taking the data as to-be-processed data corresponding to the distributed cluster; and aggregating the to-be-processed data corresponding to the distributed cluster to obtain a corresponding aggregation result, wherein the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster.
According to the data acquisition system provided by the embodiment of the invention, the operation data of the components deployed in each host in each distributed cluster can be acquired through the management server and the hosts in the distributed clusters, and the operation data of the components deployed in each host in the distributed clusters can be uniformly managed through the management server. In addition, the target data of each distributed cluster can be processed according to the calculation index corresponding to the cluster priority of the distributed cluster, and the targeted processing of different distributed clusters can be realized.
For other embodiments of the data acquisition system described above, reference may be made to the following description of the methods performed by the hosts in the distributed cluster and by the management server.
Referring to fig. 1, fig. 1 is a flowchart of a data acquisition method according to an embodiment of the present invention, where the method may be applied to a management server in the data acquisition system, and the method may include the following steps:
S101: For each distributed cluster, receive the target data sent by each host in the distributed cluster.
The target data is the data indicated by the specified monitoring index in the running data of each component deployed in the host.
S102: Determine, from the target data sent by each host in the distributed cluster, the data indicated by the calculation index corresponding to the distributed cluster, and take the data as the to-be-processed data corresponding to the distributed cluster.
The calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster.
S103: Aggregate the to-be-processed data corresponding to the distributed cluster to obtain a corresponding aggregation result.
According to the data acquisition method provided by the embodiment of the invention, the operation data of the components deployed in the hosts in each distributed cluster can be acquired through the management server and the hosts in the distributed clusters, and the operation data of the components deployed in the hosts in the distributed clusters can be uniformly managed through the management server. In addition, the target data of each distributed cluster can be processed according to the calculation index corresponding to the cluster priority of the distributed cluster, and the targeted processing of different distributed clusters can be realized.
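Steps S102 and S103 can be sketched as follows. This is only an illustrative sketch under stated assumptions: the priority labels ("high", "low"), the mapping `CALC_INDEXES_BY_PRIORITY`, and the choice of summation as the aggregation are all hypothetical, since the source does not fix a particular aggregation function.

```python
from collections import defaultdict

# Hypothetical mapping from cluster priority to calculation indexes.
# A higher-priority cluster is given more calculation indexes.
CALC_INDEXES_BY_PRIORITY = {
    "high": ["LogError", "LogWarn", "LogInfo"],
    "low": ["LogError"],
}


def aggregate_cluster(priority, target_data_per_host):
    """Select the data named by the calculation indexes (S102), then aggregate (S103).

    `target_data_per_host` is a list with one dict per host, mapping a
    monitoring index name to its value. Summation is an assumed aggregation.
    """
    calc_indexes = CALC_INDEXES_BY_PRIORITY[priority]
    to_process = defaultdict(list)
    for host_data in target_data_per_host:
        for name in calc_indexes:
            if name in host_data:
                to_process[name].append(host_data[name])
    return {name: sum(values) for name, values in to_process.items()}
```

With two hosts reporting event counts, a low-priority cluster would yield only the summed `LogError` count, while a high-priority cluster would also yield `LogWarn` and `LogInfo`.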
For step S101, for each distributed cluster, multiple components may be deployed in each host in the distributed cluster to implement corresponding functions. In order to analyze the performance of the host in the distributed cluster, each host in the distributed cluster may obtain data indicated by a specified monitoring index from the operating data of each component deployed in the host, obtain target data of the distributed cluster, and send the target data of the distributed cluster to the management server. The operational data for a distributed cluster may include an operational log for the distributed cluster.
The specified monitoring index may include monitoring indexes corresponding to different performances of the host, for example, a monitoring index for determining a memory state of the host, a monitoring index for determining a state of each thread in the host, and a monitoring index for determining various events in the host.
For example, the specified monitoring index corresponding to the memory state may include: MemNonHeapUsedM (the size of the non-heap memory currently used by the JVM (Java Virtual Machine)), MemNonHeapMaxM (the size of the non-heap memory configured by the JVM), MemHeapUsedM (the size of the heap memory currently used by the JVM), MemHeapMaxM (the size of the heap memory configured by the JVM), MemMaxM (the maximum memory that can be used when the JVM is running), and the like.
The specified monitoring index corresponding to the state of each thread may include: threadnew (the number of threads currently in the New state), threadrunnable (the number of threads currently in the Runnable state), threadblocked (the number of threads currently in the Blocked state), threadwaiting (the number of threads currently in the Waiting state), and the like.
The specified monitoring index corresponding to various events may include: LogError (the number of Error events in a fixed time interval), LogWarn (the number of Warn events in a fixed time interval), LogInfo (the number of Info events in a fixed time interval), and the like.
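On the host side, obtaining the target data amounts to keeping only the entries named by the specified monitoring index. The sketch below assumes that each component's running data is already available as a dictionary of metric name to value; the function name and the component names in the usage note are hypothetical.

```python
# Example set of specified monitoring indexes, taken from the event
# metrics named in the text. In practice it would come from the
# management server.
SPECIFIED_MONITORING_INDEXES = {"LogError", "LogWarn", "LogInfo"}


def collect_target_data(running_data_by_component,
                        specified=SPECIFIED_MONITORING_INDEXES):
    """Keep, per component, only the metrics named by the specified index set."""
    target = {}
    for component, running_data in running_data_by_component.items():
        target[component] = {
            name: value
            for name, value in running_data.items()
            if name in specified
        }
    return target
```

For instance, if a component reports `LogError` together with an unrelated metric, only `LogError` is kept in the target data sent to the management server.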
For a specific processing manner of the hosts in the distributed cluster, reference may be made to the related description of the subsequent embodiments.
Accordingly, for each distributed cluster, the management server may receive the target data sent by each host in the distributed cluster.
In an embodiment of the present invention, on the basis of fig. 1, referring to fig. 2, before step S101, the method may further include the steps of:
S104: For each distributed cluster, determine, through a data transmission process, the preset queue corresponding to the cluster priority of the distributed cluster as a target queue, according to the correspondence between cluster priorities and preset queues recorded in a preset database.
Accordingly, step S101 may include the steps of:
S1011: Receive, through the data transmission process, the target data sent by each host in the distributed cluster.
Accordingly, after step S101, the method may further include the steps of:
S105: Add the target data of the distributed cluster to the target queue through the data transmission process.
S106: Acquire the target data of the distributed cluster from the target queue through a data processing process.
The higher the cluster priority of a distributed cluster, the more important the distributed cluster is. A distributed cluster with a high cluster priority has more calculation indexes than a distributed cluster with a low cluster priority; that is, more operation data is acquired for a distributed cluster with a high cluster priority.
The cluster priority of a distributed cluster may be determined according to the function of the distributed cluster, e.g., the cluster priority of a distributed cluster used to process traffic is higher than the cluster priority of a distributed cluster used for testing.
The management server may preset a plurality of queues, where different preset queues are used to store target data of distributed clusters with different cluster priorities.
The preset database may be Redis, and the correspondence between cluster priorities and preset queues is stored in the preset database.
The management server can acquire the corresponding relation between the cluster priority and the preset queue from the preset database through a data transmission process. Then, for each distributed cluster, the management server may determine, according to the corresponding relationship, a preset queue corresponding to the cluster priority of the distributed cluster through a data transmission process, and use the preset queue as a target queue.
Furthermore, when the target data sent by each host in the distributed cluster is received, the target data of the distributed cluster can be added to the target queue through a data transmission process. Subsequently, the management server may obtain the target data of the distributed cluster from the target queue through a data processing process.
In addition, for each distributed cluster, if the correspondence between cluster priorities and preset queues contains no preset queue for the cluster priority of the distributed cluster, the target data of the distributed cluster may have been tampered with by a malicious attacker. In that case, the data transmission process may discard the target data of the distributed cluster, which can improve the security of the management server.
Based on the above processing, when the volume of target data received from the distributed clusters is large, the management server may be unable to process all of it at once. After storing the target data of each distributed cluster in the preset queues, the management server can take target data from the preset queues and process it in sequence, which reduces the processing pressure on the management server.
In addition, if the preset queue corresponding to the cluster priority of a distributed cluster already holds so much data that the target data of the distributed cluster cannot be added to it, a new queue may be added to the management server and the target data of the distributed cluster added to that new queue, so that data loss can be avoided to a certain extent.
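The routing and discard behaviour of the data transmission process can be illustrated with a minimal sketch. It assumes the correspondence has already been loaded from the preset database into an in-memory dictionary; the names `PRIORITY_TO_QUEUE`, `QUEUES`, and `route_target_data`, as well as the priority labels, are hypothetical and not taken from the patent.

```python
from collections import deque

# Hypothetical correspondence between cluster priority and preset queue.
# In the described system this would be read from the preset database (e.g. Redis).
PRIORITY_TO_QUEUE = {"high": "high_priority_queue", "normal": "normal_queue"}

# One in-memory queue per preset queue name (an assumption for illustration).
QUEUES = {name: deque() for name in PRIORITY_TO_QUEUE.values()}


def route_target_data(cluster_priority, target_data):
    """Add target data to the queue matching the cluster priority.

    Returns the name of the target queue, or None when no preset queue
    corresponds to the priority and the data is therefore discarded.
    """
    queue_name = PRIORITY_TO_QUEUE.get(cluster_priority)
    if queue_name is None:
        # No corresponding preset queue: discard the target data.
        return None
    QUEUES[queue_name].append(target_data)
    return queue_name
```

Calling `route_target_data("high", data)` appends `data` to the high-priority queue, while an unknown priority leaves every queue unchanged.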
In an embodiment of the present invention, after step S105, the method may further include the following step: if the target data of the distributed cluster is not successfully added to the target queue, add the target data of the distributed cluster to the backup queue.
The backup queue is used to recover data that was not successfully added to a queue.
In one implementation, for each distributed cluster, when the target data of the distributed cluster is added to the target queue corresponding to the cluster priority of the distributed cluster, the target queue may be in an abnormal state (for example, the target queue already holds too much data), with the result that the target data of the distributed cluster is not successfully added to the target queue.
If the target data of the distributed cluster is not successfully added to the target queue, the target data of the distributed cluster may be added to the backup queue in order to avoid data loss.
Subsequently, when the time length for adding the target data of the distributed cluster to the backup queue reaches the preset time length, the target queue may have recovered to be normal, and the management server may obtain the target data of the distributed cluster from the backup queue and add the target data of the distributed cluster to the target queue corresponding to the distributed cluster again.
Illustratively, referring to fig. 3, the distributed clusters include: cluster A, cluster B, and cluster C. The preset queues may include: a high-priority queue, a normal queue, and a backup queue.
For each distributed cluster, after receiving the target data of the distributed cluster, the management server may determine, through the data transmission process, the preset queue corresponding to the cluster priority of the distributed cluster from the plurality of preset queues, and add the target data of the distributed cluster to that target queue. For example, the cluster priority of cluster A is high, so the management server may add the target data of cluster A to the high-priority queue through the data transmission process. The cluster priority of cluster B is low, so the management server may add the target data of cluster B to the normal queue through the data transmission process.
When the target data of cluster A is added to the high-priority queue, if the addition does not succeed, the target data of cluster A may be added to the backup queue in order to avoid data loss. Subsequently, when the time for which the target data of cluster A has been in the backup queue reaches the preset duration, the management server may obtain the target data of cluster A from the backup queue and add it to the high-priority queue again.
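The backup-queue fallback can be sketched as follows. This is only an illustration under stated assumptions: a bounded `queue.Queue` stands in for a target queue that can become full, the function names are hypothetical, and the preset duration is left to the caller, who would invoke `retry_from_backup` once that time has elapsed.

```python
import queue


def enqueue_with_backup(target_queue, backup_queue, target_data):
    """Try the target queue first; fall back to the backup queue on failure."""
    try:
        target_queue.put_nowait(target_data)
        return "target"
    except queue.Full:
        # The target queue is abnormal (here: full), so keep the data in the backup queue.
        backup_queue.put_nowait(target_data)
        return "backup"


def retry_from_backup(target_queue, backup_queue):
    """Move backed-up items to the target queue again; return how many were moved."""
    moved = 0
    while not backup_queue.empty():
        item = backup_queue.get_nowait()
        try:
            target_queue.put_nowait(item)
            moved += 1
        except queue.Full:
            # Still abnormal: put the item back and stop until the next retry.
            backup_queue.put_nowait(item)
            break
    return moved
```

A real deployment would more likely use a message broker than in-process queues; the sketch only shows the order of operations described above.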
For step S102, for each distributed cluster, the target data of the distributed cluster is the data indicated by the specified monitoring indexes. In order to manage the plurality of distributed clusters uniformly, the specified monitoring indexes are the same for all distributed clusters, and the number of specified monitoring indexes is larger than the number of calculation indexes corresponding to each distributed cluster. Thus, for each distributed cluster, the target data of that distributed cluster contains data that is not needed.
Accordingly, after receiving the target data of the distributed cluster, the management server may obtain the calculation index corresponding to the distributed cluster. Furthermore, the management server may determine, from the target data sent by each host in the distributed cluster, data indicated by the calculation index corresponding to the distributed cluster, and obtain to-be-processed data corresponding to the distributed cluster.
In an embodiment of the present invention, on the basis of fig. 2, referring to fig. 4, before step S102, the method may further include the steps of:
S107: Acquire the calculation index corresponding to the distributed cluster from the preset database through the data processing process.
Accordingly, step S102 may include the steps of:
S1021: Determine, through the data processing process, the data indicated by the calculation index corresponding to the distributed cluster from the target data sent by each host in the distributed cluster, as the to-be-processed data corresponding to the distributed cluster.
The management server can acquire the calculation index of each distributed cluster from the preset database through the data processing process, and process the target data of the distributed clusters according to the calculation index of the distributed clusters, so that the different distributed clusters can be processed in a targeted manner.
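As a minimal sketch of step S1021, assume each host reports its target data as a flat mapping from index name to value; the patent does not specify the data layout, and `select_pending_data` is a hypothetical name. The function keeps only the entries named by the calculation indexes of the cluster.

```python
def select_pending_data(host_reports, calculation_indexes):
    """Keep, for every host report, only the values named by the calculation indexes."""
    wanted = set(calculation_indexes)
    return [
        {name: value for name, value in report.items() if name in wanted}
        for report in host_reports
    ]


# Example: the hosts report three specified monitoring indexes,
# but this cluster only computes LogError and MemHeapUsedM.
reports = [
    {"LogError": 3, "LogWarn": 10, "MemHeapUsedM": 512.0},
    {"LogError": 1, "LogWarn": 4, "MemHeapUsedM": 256.0},
]
pending = select_pending_data(reports, ["LogError", "MemHeapUsedM"])
```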
For step S103, for each distributed cluster, after determining the to-be-processed data of the distributed cluster, the management server may perform aggregation processing on the to-be-processed data of the distributed cluster to obtain a corresponding aggregation result.
Illustratively, suppose the calculation indexes of a distributed cluster include LogError. The to-be-processed data of the distributed cluster then includes the number of Error events, within a fixed time interval, of each component deployed in each host in the distributed cluster. The management server may aggregate this to-be-processed data to obtain the total number of Error events, within the fixed time interval, of the components deployed in the hosts in the distributed cluster.
After obtaining the aggregation result of the distributed cluster, the management server may store the aggregation result locally. Subsequently, when a query request for the distributed cluster sent by the client is received, the aggregation result of the distributed cluster may be sent to the client. Accordingly, the client may present the aggregation result of the distributed cluster to the user.
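The LogError example can be written as a short sketch. Summation is used here because the example speaks of a total; the record layout (one dictionary per host and component) and the name `aggregate_total` are assumptions made for illustration.

```python
def aggregate_total(pending_data, index_name):
    """Sum one calculation index over all host/component records of a cluster."""
    return sum(record.get(index_name, 0) for record in pending_data)


# Hypothetical to-be-processed data: Error counts per host and component
# within one fixed time interval.
pending_data = [
    {"host": "host-1", "component": "component-a", "LogError": 3},
    {"host": "host-1", "component": "component-b", "LogError": 0},
    {"host": "host-2", "component": "component-a", "LogError": 5},
]
total_errors = aggregate_total(pending_data, "LogError")
```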
In addition, for each distributed cluster, the management server may also raise an alarm according to the aggregation result of the distributed cluster. For example, in the above embodiment, when the total number of Error events, within the fixed time interval, of the components deployed in the hosts in the distributed cluster reaches a preset threshold, the management server may send alarm information to the client, where the alarm information may carry the identifier of the distributed cluster and the aggregation result of the distributed cluster.
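The threshold check can be sketched as below. The field names in the alarm dictionary and the function name `build_alarm` are hypothetical; the text only states that the alarm information may carry the cluster identifier and the aggregation result.

```python
def build_alarm(cluster_id, aggregation_result, threshold):
    """Return alarm information when the result reaches the threshold, else None."""
    if aggregation_result >= threshold:
        return {"cluster": cluster_id, "aggregation_result": aggregation_result}
    return None
```

For instance, `build_alarm("cluster-A", 8, 5)` yields alarm information for the client, whereas a result below the threshold yields `None` and nothing is sent.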
In one embodiment of the invention, the method may further comprise the following step: when a monitoring policy update instruction is received, update the correspondence between cluster priorities and preset queues in the preset database and/or the calculation indexes of each distributed cluster.
In one implementation, the user may set the correspondence between cluster priorities and preset queues and the calculation indexes of each distributed cluster. When this information needs to be modified, a technician can send a monitoring policy update instruction to the management server through the client, and the monitoring policy update instruction can carry the correspondence between cluster priorities and preset queues and/or the calculation indexes of each distributed cluster. When receiving the monitoring policy update instruction, the management server may update the correspondence between cluster priorities and preset queues in the preset database, and/or the calculation indexes of each distributed cluster.
Illustratively, if the calculation indexes of a distributed cluster include index A, index B, and index C, and the calculation indexes of the distributed cluster carried in the monitoring policy update instruction include index D, then upon receiving the monitoring policy update instruction the management server may update the calculation indexes of the distributed cluster to: index A, index B, index C, and index D.
Based on the processing, the corresponding relation between the cluster priority and the preset queue and the calculation indexes of all the distributed clusters can be configured by the user, and the individual requirements of the user can be met.
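The index A to index D example suggests that carried indexes are added to the existing ones. A minimal sketch of that reading follows; the function name is hypothetical, and whether an update instruction can also remove indexes is not stated in the text, so removal is not modelled.

```python
def apply_policy_update(current_indexes, carried_indexes):
    """Return the calculation indexes after merging those carried by the update instruction."""
    updated = list(current_indexes)
    for index in carried_indexes:
        if index not in updated:  # avoid duplicates while preserving order
            updated.append(index)
    return updated


updated_indexes = apply_policy_update(["index A", "index B", "index C"], ["index D"])
```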
In one embodiment of the invention, the method may further comprise the steps of:
Step 1: For each distributed cluster, when a registration request sent by a host in the distributed cluster is received, acquire the specified monitoring index from the preset database.
Step 2: Send the specified monitoring index to the host in the distributed cluster.
In one implementation, for each distributed cluster, each host in the distributed cluster may send a registration request to a management server before obtaining target data for the distributed cluster. The registration request may carry the name of the distributed cluster, the name of the host in the distributed cluster, and the name of each component deployed in the host.
Correspondingly, when receiving the registration request, the management server may obtain the specified monitoring index from the preset database, and send the specified monitoring index to the host in the distributed cluster.
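The registration exchange can be sketched on the management server side as follows. The request fields mirror the items the text says a registration request may carry (cluster name, host name, component names); the dictionary keys, the in-memory `SPECIFIED_INDEXES` list standing in for the preset database, and the function name are all assumptions.

```python
# Stand-in for the specified monitoring indexes held in the preset database.
SPECIFIED_INDEXES = ["MemHeapUsedM", "LogError", "LogWarn"]

# Registered hosts: (cluster name, host name) -> component names.
registered_hosts = {}


def handle_registration(request):
    """Record the registering host and return the specified monitoring indexes."""
    key = (request["cluster"], request["host"])
    registered_hosts[key] = list(request["components"])
    # Return a copy so callers cannot modify the stored indexes.
    return list(SPECIFIED_INDEXES)
```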
In addition, the specified monitoring index can be set by a technician, and the monitoring policy update instruction can also carry the specified monitoring index; when such a monitoring policy update instruction is received, the management server can also update the specified monitoring index.
Referring to fig. 5, fig. 5 is a flowchart of a data acquisition method according to an embodiment of the present invention, where the method may be applied to a host in a distributed cluster in the data acquisition system, and the method may include the following steps:
S501: Acquire the data indicated by the specified monitoring index from the operation data of each component deployed in the host, as target data.
S502: Send the acquired target data to the management server, so that the management server, for each distributed cluster, determines the data indicated by the calculation index corresponding to the distributed cluster from the target data sent by the hosts in the distributed cluster, as the to-be-processed data corresponding to the distributed cluster, and aggregates the to-be-processed data corresponding to the distributed cluster to obtain a corresponding aggregation result.
The calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster.
According to the data acquisition method provided by the embodiment of the invention, the operation data of the components deployed in the hosts in each distributed cluster can be acquired through the management server and the hosts in the distributed clusters, and the operation data of the components deployed in the hosts in the distributed clusters can be uniformly managed through the management server. In addition, the target data of each distributed cluster can be processed according to the calculation index corresponding to the cluster priority of the distributed cluster, and the targeted processing of different distributed clusters can be realized.
For step S501, for each distributed cluster, multiple components may be deployed in each host in the distributed cluster to implement corresponding functions. Each component generates corresponding operation data during operation, and in order to analyze the performance of the host in the distributed cluster, each host in the distributed cluster can acquire data indicated by a specified monitoring index from the operation data of each component deployed in the host to obtain target data of the distributed cluster.
In one implementation of the present invention, the host may further obtain the specified monitoring index from the management server before obtaining the target data. Accordingly, on the basis of fig. 5, referring to fig. 6, before step S501, the method may further include the steps of:
S503: Send a registration request of the distributed cluster to the management server, so that the management server, when receiving the registration request, acquires the specified monitoring index from the preset database and sends the specified monitoring index to the host.
S504: Receive the specified monitoring index sent by the management server.
In one implementation, for each distributed cluster, each host in the distributed cluster may send a registration request to a management server before obtaining target data for the distributed cluster. Correspondingly, when receiving the registration request, the management server may obtain the specified monitoring index from the preset database, and send the specified monitoring index to the host in the distributed cluster.
For step S502, after the target data is acquired, each host in the distributed cluster may send the acquired target data to the management server.
Correspondingly, after receiving the target data of each distributed cluster, the management server may determine, from the target data sent by the host in the distributed cluster, data indicated by the calculation index corresponding to the distributed cluster, as to-be-processed data corresponding to the distributed cluster; and carrying out aggregation processing on the data to be processed corresponding to the distributed cluster to obtain a corresponding aggregation result.
For a specific processing manner of the management server, reference may be made to the related description of the foregoing embodiments.
In an embodiment of the present invention, before step S501, the method may further include the following step: when a component addition message is received, obtain the monitoring interface address of the added component from the component addition message.
The monitoring interface address is used for acquiring the operation data of the added component.
The monitoring interface address of a component may be a JMX (Java Management Extensions, a framework that embeds management functions for applications, devices, systems, etc.) interface address of the component, or a JVM interface address of the component.
For example, referring to fig. 7, the components deployed in the hosts in the distributed cluster include a component a and a component B, and if a new component (e.g., a component C) is deployed in a host, the host receives a component addition message through a data collection process, where the component addition message carries the JMX interface address of the component C. Subsequently, the data acquisition process can acquire the operating data of the component C according to the JMX interface address of the component C.
For example, when component C is Hadoop (an infrastructure of a distributed system), the JMX interface address of Hadoop is: http://host:60010/jmx.
Based on the above processing, when a new component is deployed in a host in a distributed cluster, the host can acquire the operation data of the new component through the monitoring interface address of the new component, so that dynamic expansion of components can be realized.
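The handling of a component addition message and the later extraction of target data can be sketched as follows. The message fields (`component`, `monitor_address`) and function names are hypothetical. The sketch also assumes the monitoring interface returns JSON with a top-level `beans` list, as Hadoop's `/jmx` endpoint does; other components may use a different layout. The HTTP request itself is omitted so that the example stays self-contained.

```python
import json

# Component name -> monitoring interface address known to the data collection process.
monitor_addresses = {}


def handle_component_add(message):
    """Register the monitoring interface address carried by a component addition message."""
    monitor_addresses[message["component"]] = message["monitor_address"]


def extract_target_data(jmx_json_text, specified_indexes):
    """Pick the specified monitoring indexes out of a JMX-style JSON document."""
    wanted = set(specified_indexes)
    target = {}
    for bean in json.loads(jmx_json_text).get("beans", []):
        for name, value in bean.items():
            if name in wanted:
                target[name] = value
    return target


handle_component_add({"component": "component C", "monitor_address": "http://host:60010/jmx"})
sample = '{"beans": [{"name": "JvmMetrics", "MemHeapUsedM": 128.5, "LogError": 2, "LogInfo": 40}]}'
target_data = extract_target_data(sample, ["MemHeapUsedM", "LogError"])
```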
Referring to fig. 8, fig. 8 is a flowchart of another data acquisition method provided in an embodiment of the present invention, where the data acquisition method is applied to the hosts in the distributed clusters and to the management server. The distributed clusters include: distributed cluster 1 and distributed cluster 2. The components deployed in the hosts in each distributed cluster include: Hadoop, HDFS (a distributed file system), HBase (an open-source non-relational database), Hive (a data warehouse tool based on Hadoop), and Presto (a data query engine).
Each host in the distributed cluster sends a registration request to the management server through a data acquisition process. Upon receiving the registration request, the management server may send a specified monitoring index to the hosts in the distributed cluster through a data management process.
A host in the distributed cluster may acquire the target data indicated by the specified monitoring index from the operation data of each component deployed in the host, and send the acquired target data to the data transmission process (Data-transfer) in the management server. The data transmission process is developed based on OpenResty + Lua (a web platform based on the Lua language).
Correspondingly, after receiving the target data sent by each distributed cluster, the management server may obtain a corresponding relationship between the cluster priority and the preset queue from the Redis through a data transmission process, determine the target queue corresponding to the cluster priority of the distributed cluster according to the corresponding relationship, and add the target data of the distributed cluster to the target queue.
Then, the management server may obtain target data of one distributed cluster from the preset queue through a data processing process, and determine to-be-processed data indicated by the calculation index of the distributed cluster from the target data of the distributed cluster. Then, the management server may perform aggregation processing on the to-be-processed data of the distributed cluster to obtain an aggregation result of the distributed cluster, and store the aggregation result of the distributed cluster in an aggregation result storage database.
In addition, the management server can also receive a monitoring policy updating instruction sent by a user through a data management process. The management server may communicate with the client through a designated API (Application Programming Interface), the user may input a monitoring policy update instruction at the client, and the client sends the monitoring policy update instruction to the management server through the API. Or, the management server may configure a user access page, and the user may input a monitoring policy update instruction to the management server through the user access page.
The monitoring policy update instruction may carry a corresponding relationship between the cluster priority and the preset queue, and the management server may store the received corresponding relationship to the Redis through the data management process. The monitoring strategy updating instruction can also carry specified monitoring indexes and calculation indexes of all distributed clusters, and the management server can store the received specified monitoring indexes and the calculation indexes of all the distributed clusters to the local through a data management process.
According to the data acquisition method provided by the embodiment of the invention, the operation data of the components deployed in the hosts in each distributed cluster can be acquired through the management server and the hosts in the distributed clusters, and the operation data of the components deployed in the hosts in the distributed clusters can be uniformly managed through the management server. In addition, the target data of each distributed cluster can be processed according to the calculation index corresponding to the cluster priority of the distributed cluster, and the targeted processing of different distributed clusters can be realized.
Referring to fig. 9, fig. 9 is a flowchart of another data acquisition method provided in an embodiment of the present invention, where the method may include the following steps:
S901: Register the monitored component through the monitoring acquisition module.
The monitoring and collecting module is a data collecting process in the foregoing embodiment. When a host in the distributed cluster receives the component adding message, the host determines the monitoring interface address of the component carried in the component adding message. Furthermore, the host in the distributed cluster can obtain target data indicated by the specified monitoring index from the running data of each component through the monitoring interface address of each component.
S902: the monitoring data is reported to openreserve.
The monitoring data is the target data in the foregoing embodiment. When the target data is acquired, the host in the distributed cluster can upload the target data to a data transmission process in the management server, wherein the data transmission process is developed based on openreserve.
S903: Judge whether a Topic policy exists; if so, execute step S905; if not, execute step S904.
The Topic policy is the preset queue corresponding to the cluster priority of the distributed cluster in the foregoing embodiment. For each distributed cluster, the management server judges, through the data transmission process, whether a target queue corresponding to the cluster priority of the distributed cluster exists, and performs the corresponding processing according to the judgment result.
S904: Discard the index.
For each distributed cluster, if the management server judges, through the data transmission process, that no target queue corresponding to the cluster priority of the distributed cluster exists, the target data of the distributed cluster is discarded.
S905: Upload to the preset queue.
For each distributed cluster, if the management server judges, through the data transmission process, that a target queue corresponding to the cluster priority of the distributed cluster exists, the target data of the distributed cluster is added to that target queue.
S906: Obtain the calculation index through the calculation module.
The calculation module is the data processing process in the foregoing embodiment. For each distributed cluster, the management server acquires the calculation indexes of the distributed cluster through the data processing process, and determines the to-be-processed data indicated by the calculation indexes of the distributed cluster from the target data of the distributed cluster. The management server then aggregates the to-be-processed data of the distributed cluster through the data processing process to obtain a corresponding aggregation result.
S907: Store the calculation result.
The calculation result is the aggregation result in the foregoing embodiment. For each distributed cluster, when the aggregation result of the distributed cluster is obtained, the management server may further store the aggregation result of the distributed cluster locally.
S908: The monitoring management center configures a new component monitoring index.
The monitoring management center is a data management process in the foregoing embodiment, and the management server may receive a monitoring policy update instruction through the data management process, and update the corresponding relationship between the cluster priority and the preset queue and/or the calculation index of each distributed cluster according to the received monitoring policy update instruction.
According to the data acquisition method provided by the embodiment of the invention, the operation data of the components deployed in the hosts in each distributed cluster can be acquired through the management server and the hosts in the distributed clusters, and the operation data of the components deployed in the hosts in the distributed clusters can be uniformly managed through the management server. In addition, the target data of each distributed cluster can be processed according to the calculation index corresponding to the cluster priority of the distributed cluster, and the targeted processing of different distributed clusters can be realized.
Corresponding to the embodiment of the method in fig. 1, referring to fig. 10, fig. 10 is a structural diagram of a data acquisition apparatus provided in an embodiment of the present invention, where the apparatus is applied to a management server, and the apparatus includes:
a receiving module 1001, configured to receive, for each distributed cluster, target data sent by each host in the distributed cluster; wherein the target data is: data indicated by a specified monitoring index in the running data of each component deployed in the host;
a first determining module 1002, configured to determine, from target data sent by each host in the distributed cluster, data indicated by a calculation index corresponding to the distributed cluster, where the data is used as to-be-processed data corresponding to the distributed cluster; wherein, the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster;
an aggregation module 1003, configured to perform aggregation processing on the to-be-processed data corresponding to the distributed cluster to obtain a corresponding aggregation result.
Optionally, the apparatus further comprises:
a second determining module, configured to, before the receiving module 1001 receives, for each distributed cluster, the target data sent by each host in the distributed cluster, determine, for each distributed cluster and through a data transmission process, the preset queue corresponding to the cluster priority of the distributed cluster as a target queue, according to the correspondence between cluster priorities and preset queues recorded in a preset database;
the receiving module 1001 is specifically configured to receive, through the data transmission process, the target data sent by each host in the distributed cluster;
the apparatus further comprises:
a first adding module, configured to, after the receiving module 1001 receives, for each distributed cluster, the target data sent by each host in the distributed cluster, add the target data of the distributed cluster to the target queue through the data transmission process;
a first acquisition module, configured to acquire the target data of the distributed cluster from the target queue through a data processing process.
Optionally, the apparatus further comprises:
a second adding module, configured to, after the first adding module adds the target data of the distributed cluster to the target queue, add the target data of the distributed cluster to a backup queue if the target data of the distributed cluster was not successfully added to the target queue; wherein the backup queue is used to recover data that was not successfully added to a queue.
Optionally, the apparatus further comprises:
a second obtaining module, configured to, before the first determining module 1002 determines, from the target data sent by each host in the distributed cluster, the data indicated by the calculation index corresponding to the distributed cluster as the to-be-processed data corresponding to the distributed cluster, acquire the calculation index corresponding to the distributed cluster from the preset database through a data processing process;
the first determining module 1002 is specifically configured to determine, through the data processing process, the data indicated by the calculation index corresponding to the distributed cluster from the target data sent by each host in the distributed cluster, as the to-be-processed data corresponding to the distributed cluster.
Optionally, the apparatus further comprises:
an updating module, configured to, when a monitoring policy update instruction is received, update the correspondence between cluster priorities and preset queues in the preset database and/or the calculation indexes of each distributed cluster.
Optionally, the apparatus further comprises:
a third acquisition module, configured to, for each distributed cluster, acquire the specified monitoring index from the preset database when a registration request sent by a host in the distributed cluster is received;
a sending module, configured to send the specified monitoring index to the host in the distributed cluster.
According to the data acquisition device provided by the embodiment of the invention, the operation data of the components deployed in each host in each distributed cluster can be acquired through the management server and the hosts in the distributed clusters, and the operation data of the components deployed in each host in the distributed clusters can be uniformly managed through the management server. In addition, the target data of each distributed cluster can be processed according to the calculation index corresponding to the cluster priority of the distributed cluster, and the targeted processing of different distributed clusters can be realized.
Corresponding to the embodiment of the method in fig. 5, referring to fig. 11, fig. 11 is a structural diagram of a data acquisition apparatus according to an embodiment of the present invention, where the apparatus is applied to a host in a distributed cluster, and the apparatus includes:
a first obtaining module 1101, configured to obtain, from the operation data of each component deployed in the host, data indicated by a specified monitoring index as target data;
a first sending module 1102, configured to send the obtained target data to a management server, so that the management server determines, for each distributed cluster, data indicated by a calculation index corresponding to the distributed cluster from the target data sent by a host in the distributed cluster, where the data is used as to-be-processed data corresponding to the distributed cluster; and aggregating the data to be processed corresponding to the distributed cluster to obtain a corresponding aggregation result, wherein the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster.
Optionally, the apparatus further comprises:
a second sending module, configured to send a registration request of the distributed cluster to the management server before the first obtaining module 1101 obtains, from the operation data of each component deployed in the host, the data indicated by the specified monitoring index as target data, so that the management server, when receiving the registration request, obtains the specified monitoring index from a preset database and sends the specified monitoring index to the host;
and the receiving module is used for receiving the specified monitoring index sent by the management server.
Optionally, the apparatus further comprises:
a second obtaining module, configured to, before the first obtaining module 1101 obtains, from the operation data of each component deployed in the host, the data indicated by the specified monitoring index as target data, obtain, when a component addition message is received, a monitoring interface address of the added component from the component addition message; wherein the monitoring interface address is used for acquiring the operation data of the added component.
According to the data acquisition device provided by the embodiment of the invention, the operation data of the components deployed in each host in each distributed cluster can be acquired through the management server and the hosts in the distributed clusters, and the operation data of the components deployed in each host in the distributed clusters can be uniformly managed through the management server. In addition, the target data of each distributed cluster can be processed according to the calculation index corresponding to the cluster priority of the distributed cluster, and the targeted processing of different distributed clusters can be realized.
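The host-side behavior can be sketched in the same spirit. The class `HostCollector`, the message fields `component` and `monitor_address`, the example address, and the injected `fetch` function are all hypothetical names introduced for illustration; the embodiment does not specify a message format or a transport for reading a monitoring interface.

```python
class HostCollector:
    """Hypothetical host-side collector for one host in a distributed cluster."""

    def __init__(self, specified_indexes, fetch):
        # Specified monitoring indexes, as received from the management
        # server after registration (passed in directly in this sketch).
        self.specified_indexes = set(specified_indexes)
        # Injected function: monitoring interface address -> operation data dict.
        self.fetch = fetch
        # Component name -> monitoring interface address.
        self.interfaces = {}

    def on_component_added(self, message):
        """Record the monitoring interface address carried by an addition message."""
        self.interfaces[message["component"]] = message["monitor_address"]

    def collect(self):
        """Return only the operation data indicated by the specified indexes."""
        target_data = []
        for component, address in self.interfaces.items():
            run_data = self.fetch(address)
            for metric, value in run_data.items():
                if metric in self.specified_indexes:
                    target_data.append(
                        {"component": component, "metric": metric, "value": value}
                    )
        return target_data


def fake_fetch(address):
    """Stand-in for querying a component's monitoring interface."""
    return {"cpu_usage": 12.5, "open_files": 42}


collector = HostCollector(["cpu_usage"], fake_fetch)
collector.on_component_added(
    {"component": "datanode", "monitor_address": "http://127.0.0.1:9100/metrics"}
)
target_data = collector.collect()
```

Injecting `fetch` keeps the sketch independent of any particular monitoring protocol; in a real deployment the resulting `target_data` would then be sent to the management server.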
An embodiment of the present invention further provides an electronic device, as shown in fig. 12, including a processor 1201, a communication interface 1202, a memory 1203, and a communication bus 1204, where the processor 1201, the communication interface 1202, and the memory 1203 communicate with one another through the communication bus 1204;
a memory 1203 for storing a computer program;
the processor 1201 is configured to implement the steps of the data acquisition method applied to the management server in the above embodiment or the steps of the data acquisition method applied to the hosts in the distributed cluster in the above embodiment when executing the program stored in the memory 1203.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, a discrete Gate or transistor logic device, or a discrete hardware component.
In still another embodiment provided by the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the data acquisition method applied to the management server in the above-described embodiment or the steps of the data acquisition method applied to the hosts in the distributed cluster in the above-described embodiment.
In yet another embodiment provided by the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the data acquisition method applied to the management server in the above embodiment, or the data acquisition method applied to the hosts in the distributed cluster in the above embodiment.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus, system, electronic device, computer-readable storage medium, and computer program product embodiments, the description is relatively simple as it is substantially similar to the method embodiments, and reference may be made to some descriptions of the method embodiments for related points.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (14)

1. A data acquisition system, characterized in that the system comprises: a distributed cluster and a management server, wherein:
the host in the distributed cluster is used for acquiring data indicated by a specified monitoring index from the running data of each component deployed in the host as target data; sending the obtained target data to the management server;
the management server is used for receiving, for each distributed cluster, target data sent by each host in the distributed cluster; determining data indicated by a calculation index corresponding to the distributed cluster from the target data sent by each host in the distributed cluster, and taking the data as to-be-processed data corresponding to the distributed cluster; and aggregating the data to be processed corresponding to the distributed cluster to obtain a corresponding aggregation result, wherein the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster.
2. A data acquisition method, wherein the method is applied to a management server, the method comprising:
for each distributed cluster, receiving target data sent by each host in the distributed cluster; wherein the target data is: data indicated by a specified monitoring index in the running data of each component deployed in the host;
determining data indicated by a calculation index corresponding to the distributed cluster from target data sent by each host in the distributed cluster, and taking the data as to-be-processed data corresponding to the distributed cluster; wherein, the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster;
and carrying out aggregation processing on the data to be processed corresponding to the distributed cluster to obtain a corresponding aggregation result.
3. The method of claim 2, wherein prior to receiving, for each distributed cluster, target data sent by each host in the distributed cluster, the method further comprises:
for each distributed cluster, determining a preset queue corresponding to the cluster priority of the distributed cluster as a target queue according to the corresponding relation between the cluster priority recorded in a preset database and the preset queue through a data transmission process;
the receiving, for each distributed cluster, target data sent by each host in the distributed cluster includes:
receiving target data sent by each host in the distributed cluster through the data transmission process;
after receiving, for each distributed cluster, the target data sent by each host in the distributed cluster, the method further includes:
adding the target data of the distributed cluster to the target queue through the data transmission process;
and acquiring the target data of the distributed cluster from the target queue through a data processing process.
4. The method of claim 3, wherein after adding the target data for the distributed cluster to the target queue, the method further comprises:
if the target data of the distributed cluster is not successfully added to the target queue, adding the target data of the distributed cluster to a backup queue; wherein the backup queue is used to restore data that was not successfully added to the queue.
5. The method according to claim 3, wherein before determining, from the target data sent by each host in the distributed cluster, data indicated by the calculation index corresponding to the distributed cluster as the to-be-processed data corresponding to the distributed cluster, the method further includes:
acquiring a calculation index corresponding to the distributed cluster from the preset database through the data processing process;
the determining, from the target data sent by each host in the distributed cluster, data indicated by the calculation index corresponding to the distributed cluster as the to-be-processed data corresponding to the distributed cluster includes:
determining, through the data processing process, data indicated by the calculation index corresponding to the distributed cluster from the target data sent by each host in the distributed cluster, and taking the data as the to-be-processed data corresponding to the distributed cluster.
6. The method of claim 5, further comprising:
and when a monitoring strategy updating instruction is received, updating the corresponding relation between the cluster priority in the preset database and the preset queue and/or the calculation index of each distributed cluster.
7. The method of claim 6, further comprising:
for each distributed cluster, when a registration request sent by a host in the distributed cluster is received, acquiring the specified monitoring index from the preset database;
and sending the specified monitoring index to the host in the distributed cluster.
8. A method for data acquisition, the method being applied to a host in a distributed cluster, the method comprising:
acquiring data indicated by a specified monitoring index from the running data of each component deployed in the host as target data;
sending the obtained target data to a management server, so that the management server determines, for each distributed cluster, data indicated by a calculation index corresponding to the distributed cluster from the target data sent by a host in the distributed cluster, as to-be-processed data corresponding to the distributed cluster; and aggregating the data to be processed corresponding to the distributed cluster to obtain a corresponding aggregation result, wherein the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster.
9. The method according to claim 8, wherein before obtaining, as target data, data indicated by a specified monitoring index from the operation data of each component deployed in the host, the method further comprises:
sending a registration request of the distributed cluster to the management server, so that the management server obtains a specified monitoring index from a preset database when receiving the registration request, and sending the specified monitoring index to the host;
and receiving the specified monitoring index sent by the management server.
10. The method according to claim 8, wherein before obtaining, as target data, data indicated by the specified monitoring index from the operation data of each component deployed in the host, the method further comprises:
when receiving a component adding message, acquiring a monitoring interface address of an added component from the component adding message; wherein the monitoring interface address is used for acquiring the operation data of the added component.
11. A data acquisition apparatus, wherein the apparatus is applied to a management server, the apparatus comprising:
the receiving module is used for receiving target data sent by each host in each distributed cluster; wherein the target data is: data indicated by a specified monitoring index in the running data of each component deployed in the host;
the first determining module is used for determining data indicated by the calculation indexes corresponding to the distributed cluster from target data sent by each host in the distributed cluster, and the data are used as to-be-processed data corresponding to the distributed cluster; wherein, the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster;
and the aggregation module is used for performing aggregation processing on the data to be processed corresponding to the distributed cluster to obtain a corresponding aggregation result.
12. A data acquisition apparatus, the apparatus being applied to a host in a distributed cluster, the apparatus comprising:
the first acquisition module is used for acquiring data indicated by a specified monitoring index from the running data of each component deployed in the host computer as target data;
the first sending module is used for sending the obtained target data to a management server, so that the management server determines, for each distributed cluster, data indicated by a calculation index corresponding to the distributed cluster from the target data sent by a host in the distributed cluster, as to-be-processed data corresponding to the distributed cluster; and aggregating the data to be processed corresponding to the distributed cluster to obtain a corresponding aggregation result, wherein the calculation index corresponding to the distributed cluster is determined according to the cluster priority of the distributed cluster.
13. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 2 to 7, or of any of claims 8 to 10, when executing a program stored in the memory.
14. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when executed by a processor, carries out the method steps of any of claims 2-7 or of any of claims 8-10.
CN202111320518.XA 2021-11-09 2021-11-09 Data acquisition method, device and system, electronic equipment and storage medium Pending CN114238008A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111320518.XA CN114238008A (en) 2021-11-09 2021-11-09 Data acquisition method, device and system, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111320518.XA CN114238008A (en) 2021-11-09 2021-11-09 Data acquisition method, device and system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114238008A true CN114238008A (en) 2022-03-25

Family

ID=80748830

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111320518.XA Pending CN114238008A (en) 2021-11-09 2021-11-09 Data acquisition method, device and system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114238008A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination