WO2021093171A1 - Monitoring method, system and device, and storage medium - Google Patents

Monitoring method, system and device, and storage medium

Info

Publication number
WO2021093171A1
Authority
WO
WIPO (PCT)
Prior art keywords
monitoring
task
analysis
data
arbitration
Application number
PCT/CN2020/073122
Other languages
French (fr)
Chinese (zh)
Inventor
田琳
Original Assignee
苏州浪潮智能科技有限公司
Application filed by 苏州浪潮智能科技有限公司

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H04L67/025 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP] for remote control or remote monitoring of applications
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation
    • H04L43/045 Processing captured monitoring data, e.g. for logfile generation for graphical visualisation of monitoring data
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • H04L43/16 Threshold monitoring

Definitions

  • This application relates to the field of server operation and maintenance, and in particular to a monitoring method, system, equipment, and storage medium.
  • a data center is a network of specific equipment for global collaboration, composed of a large number of servers, used to transmit, accelerate, display, calculate, and store data information on the network infrastructure.
  • the data center is often used to store important data required for business operations. To ensure the reliable operation of the servers in the data center, technicians often need to monitor the status information of the servers in the data center.
  • the current monitoring solution for data center servers is usually to set up separate monitoring equipment in the data center, and the monitoring system is deployed in the monitoring equipment, and then the monitoring system performs complete monitoring operations on the servers in the data center.
  • However, the business line of the current monitoring operation is long, which increases the operational risk of the monitoring system and makes it difficult to ensure the overall reliability of the monitoring process.
  • The purpose of this application is to provide a monitoring method, system, equipment and storage medium that better ensure the overall reliability of the monitoring process.
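  • This application provides a monitoring method applied to a task agent device, including: obtaining the monitoring task passed in by the arbitration device; performing a monitoring operation on the data center according to the monitoring task and obtaining monitoring data; and uploading the monitoring data to the analysis device, so that the analysis device performs an analysis operation on the monitoring data and generates an analysis result.
  • Before obtaining the monitoring task passed in by the arbitration device, the method further includes: responding to the binding instruction passed in by the arbitration device and establishing a binding relationship with the arbitration device; and sending operating status information to the bound arbitration device at a preset frequency, so that the arbitration device can determine the availability of the task agent device.
  • Obtaining the monitoring task passed in by the arbitration device then includes obtaining the monitoring task passed in by the arbitration device with which the binding relationship exists.
  • The operating status information includes MAC address, IP address, resource utilization, and software version number.
  • When the monitoring task includes target device information, performing the monitoring operation on the data center and obtaining monitoring data according to the monitoring task includes: performing the monitoring operation on the target device in the data center according to the target device information of the monitoring task and obtaining the monitoring data.
  • This application also provides a monitoring method applied to an arbitration device, including: generating monitoring tasks and obtaining the total number of monitoring tasks; and determining whether the total number of tasks is greater than a preset threshold, where the preset threshold is an integer greater than 1;
  • if so, delivering the monitoring tasks in groups to the task agent device corresponding to the data center, with a preset time interval between the delivery of successive groups;
  • otherwise, sending the monitoring tasks to the task agent device all at once.
  • Delivering the monitoring tasks in groups to the task agent device corresponding to the data center includes: distributing the monitoring tasks to each task agent device in a balanced manner in groups.
  • This application also provides a monitoring method applied to an analysis device, including: obtaining the monitoring data passed in by the task agent device and performing an analysis operation on the monitoring data according to preset rules to generate an analysis result; and sending the analysis result to the arbitration device for the arbitration device to display the analysis result.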
  • this application also provides a monitoring system, including:
  • the task agent device is used to obtain the monitoring tasks passed in by the arbitration device; perform monitoring operations on the data center according to the monitoring tasks and obtain monitoring data; upload the monitoring data to the analysis device to perform analysis operations on the monitoring data and generate analysis results;
  • the arbitration device is used to generate monitoring tasks and obtain the total number of monitoring tasks; determine whether the total number of tasks is greater than a preset threshold, where the preset threshold is an integer greater than 1; if so, deliver the monitoring tasks in groups to the task agent device corresponding to the data center, with each group issued at a preset time interval; otherwise, issue the monitoring tasks to the task agent device all at once.
  • the analysis device is used to obtain the monitoring data transmitted by the task agent device, and perform analysis operations on the monitoring data according to preset rules to generate analysis results; and send the analysis results to the arbitration device for the arbitration device to display the analysis results.
  • this application also provides a monitoring device, including:
  • Memory used to store computer programs
  • the processor is used to implement the steps of the above-mentioned monitoring method when the computer program is executed.
  • the present application also provides a computer-readable storage medium with a computer program stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the monitoring method as described above are realized.
  • The task agent device obtains the monitoring task passed in by the arbitration device and performs monitoring operations on the monitored data center according to the monitoring task, thereby obtaining monitoring data. It then uploads the monitoring data to the analysis device, so that the analysis device performs analysis operations on the monitoring data and finally generates analysis results, achieving the purpose of monitoring the data center. Since this method monitors the data center by means of the task agent device, the arbitration device, and the analysis device, the complete business line of the monitoring business is distributed across multiple devices for collaborative execution, which relatively reduces the overall operational risk of the current monitoring system and ensures the overall reliability of the monitoring process. In addition, the present application also provides a monitoring system, equipment, and storage medium, and the beneficial effects are the same as those described above.
  • FIG. 1 is a flowchart of a monitoring method applied to task agent equipment disclosed in this application;
  • FIG. 3 is a flowchart of a specific monitoring method applied to task agent equipment disclosed in this application;
  • FIG. 5 is a flowchart of a specific monitoring method applied to arbitration equipment disclosed in this application.
  • FIG. 6 is a flowchart of a monitoring method applied to analysis equipment disclosed in this application.
  • FIG. 7 is a schematic structural diagram of a monitoring system disclosed in this application.
  • the core of this application is to provide a monitoring method to relatively ensure the overall reliability of the monitoring process.
  • an embodiment of the present application discloses a monitoring method, which is applied to a task agent device, and includes:
  • Step S10: Obtain the monitoring task transmitted by the arbitration device.
  • The execution subject of this embodiment is the task agent device, and the task agent device and the arbitration device are independent of each other and have a communication relationship.
  • The task agent device also has a communication relationship with the server devices in the data center, so that it can perform the corresponding monitoring operations on the data center in response to the arbitration device.
  • the arbitration device is also responsible for the management of task agents.
  • the monitoring task transmitted by the arbitration device may be initiated by the user in real time through the arbitration device, or it may be read item by item in a task list pre-written by the user.
  • the monitoring content corresponding to the monitoring task includes, but is not limited to, the occupancy of various computing resources in the data center.
  • Step S11: Perform a monitoring operation on the data center according to the monitoring task and obtain monitoring data.
  • After the task agent device obtains the monitoring task passed in by the arbitration device, it performs the corresponding monitoring operations on the data center according to the monitoring task and obtains the corresponding monitoring data. Because the task agent device communicates directly with the server devices in the data center, it can issue the corresponding monitoring instructions directly to those server devices according to the monitoring requirements of the monitoring task, and thereby obtain the monitoring data of the data center.
  • Step S12: Upload the monitoring data to the analysis device to perform an analysis operation on the monitoring data, and generate an analysis result.
  • After the task agent device obtains the monitoring data of the data center, it uploads the monitoring data to the analysis device, so that the analysis device performs the corresponding analysis operations on the monitoring data and generates the final analysis result.
  • The analysis device and the task agent device are independent of each other and have a communication relationship, and the analysis device is responsible for aggregating and analyzing the monitoring results collected by the task agent device.
  • The task agent device obtains the monitoring task passed in by the arbitration device and performs monitoring operations on the monitored data center according to the monitoring task, thereby obtaining monitoring data. It then uploads the monitoring data to the analysis device, so that the analysis device performs analysis operations on the monitoring data and finally generates analysis results, achieving the purpose of monitoring the data center. Since this method monitors the data center by means of the task agent device, the arbitration device, and the analysis device, the complete business line of the monitoring business is distributed across multiple devices for collaborative execution, which relatively reduces the overall operational risk of the current monitoring system and ensures the overall reliability of the monitoring process.
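  • The flow of steps S10 to S12 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `MonitoringTask`, `AnalysisDevice` and `TaskAgent`, the `probe` callable, and the metric name are hypothetical and do not come from the application, and the devices are modelled as in-process objects rather than networked equipment.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MonitoringTask:
    task_id: str
    metric: str  # hypothetical metric name, e.g. "cpu_utilization"


@dataclass
class AnalysisDevice:
    """Stand-in for the analysis device: records uploads and returns a result."""
    received: List[dict] = field(default_factory=list)

    def analyze(self, monitoring_data: dict) -> dict:
        self.received.append(monitoring_data)
        return {"task_id": monitoring_data["task_id"], "status": "analyzed"}


class TaskAgent:
    """Stand-in for the task agent device."""

    def __init__(self, probe: Callable[[str], float], analysis: AnalysisDevice):
        self.probe = probe        # assumed way of querying a server in the data center
        self.analysis = analysis

    def handle(self, task: MonitoringTask) -> dict:
        # S10: the task has been passed in by the arbitration device.
        # S11: perform the monitoring operation and obtain monitoring data.
        data = {
            "task_id": task.task_id,
            "metric": task.metric,
            "value": self.probe(task.metric),
        }
        # S12: upload the data to the analysis device, which generates the result.
        return self.analysis.analyze(data)
```

  • The agent itself keeps no copy of the data it collects; it only forwards it, which matches the division of work between the three devices described above.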
  • an embodiment of the present application discloses a monitoring method, which is applied to a task agent device, and includes:
  • Step S20: Respond to the binding instruction transmitted by the arbitration device and establish a binding relationship with the arbitration device.
  • Step S21: Send the operating status information to the arbitration device having the binding relationship according to the preset frequency, so that the arbitration device can determine the availability of the task agent device.
  • Step S22: Obtain the monitoring task transmitted by the arbitration device with the binding relationship.
  • Step S23: Perform a monitoring operation on the data center according to the monitoring task and obtain monitoring data.
  • Step S24: Upload the monitoring data to the analysis device to perform an analysis operation on the monitoring data, and generate an analysis result.
  • the focus of this embodiment is to establish a binding relationship between the task agent device and the arbitration device in advance before acquiring the monitoring task passed in by the arbitration device.
  • The arbitration device first initiates a binding instruction to the task agent device, and after receiving the binding instruction, the task agent device establishes a communication binding relationship with the arbitration device.
  • the binding relationship is the task response relationship of the task agent device to the arbitration device.
  • the task agent device responds to the monitoring task of the arbitration device that has a binding relationship with itself, and executes the corresponding monitoring service.
  • After the task agent device establishes a binding relationship with the arbitration device, it sends operating status information, at the preset frequency, only to the arbitration device with which it has the binding relationship. The purpose is to inform the arbitration device of its own working status so that the arbitration device can judge whether the corresponding task agent device is available. When the arbitration device determines that the task agent device is available, it passes the monitoring task to the task agent device, which then performs the corresponding monitoring operations according to the monitoring task.
  • Because the binding relationship between the arbitration device and the task agent device is established in advance, and the task agent device serves only the arbitration device with which it has a binding relationship, the arbitration device can assign monitoring tasks to the task agent device in a relatively orderly and reliable way, which in turn ensures the overall reliability of the monitoring process.
  • the operating status information includes a MAC address, an IP address, a resource utilization rate, and a software version number.
  • The connectivity of the MAC address and IP address determines whether the monitoring task of the arbitration device can be delivered to the task agent device normally.
  • Resource utilization includes, but is not limited to, CPU utilization, memory utilization, and network utilization. The task agent device needs computing resources to respond to monitoring tasks, and the amount of available computing resources directly determines whether a monitoring task can be processed normally by the task agent device.
  • The software version number refers to the version of the software program that the task agent device runs to perform the monitoring operation. Different versions may impose different requirements on the data packet structure of a monitoring task, so the version of the monitoring program in the task agent device may not match the version of the software program of the arbitration device, causing the task agent device to fail to respond normally to the monitoring task transmitted by the arbitration device.
  • Because the operating status information that the task agent device sends to the arbitration device in this embodiment includes the MAC address, IP address, resource utilization, and software version number, the arbitration device can comprehensively determine whether the task agent device is available. This further ensures the orderliness and reliability with which the arbitration device assigns monitoring tasks to the task agent device, and thereby the overall reliability of the monitoring process.
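  • One way the arbitration device might combine these four fields into an availability decision is sketched below. The application only says the judgment is comprehensive, so the concrete rule here (non-empty addresses, an exact version match, and a utilization ceiling of 0.9) is an assumption, as are the names `OperatingStatus` and `is_agent_available`.

```python
from dataclasses import dataclass


@dataclass
class OperatingStatus:
    """Operating status information reported by a task agent device."""
    mac_address: str
    ip_address: str
    cpu_utilization: float       # fractions in the range 0.0 to 1.0
    memory_utilization: float
    network_utilization: float
    software_version: str


def is_agent_available(status: OperatingStatus,
                       arbiter_version: str,
                       max_utilization: float = 0.9) -> bool:
    """Hypothetical availability rule; the thresholds are illustrative only."""
    # Without an address the monitoring task cannot be delivered.
    if not status.mac_address or not status.ip_address:
        return False
    # A version mismatch may mean the task packet structure is not understood.
    if status.software_version != arbiter_version:
        return False
    # The agent needs spare computing resources to process the task.
    busiest = max(status.cpu_utilization,
                  status.memory_utilization,
                  status.network_utilization)
    return busiest <= max_utilization
```

  • A real deployment would probably tolerate compatible but unequal versions; exact matching is used here only to keep the sketch short.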
  • an embodiment of the present application discloses a monitoring method, which is applied to a task agent device, and includes:
  • Step S30: Obtain the monitoring task passed in by the arbitration device, where the monitoring task includes the target device information.
  • Step S31: Perform a monitoring operation on the target device in the data center according to the target device information of the monitoring task and obtain the monitoring data.
  • Step S32: Upload the monitoring data to the analysis device corresponding to the target device in the analysis device cluster to perform an analysis operation on the monitoring data, and generate an analysis result.
  • In this embodiment, the monitoring task that the arbitration device passes to the task agent device includes target device information. The target device information refers to the identity information of the specific server device in the data center that the user needs to monitor. The task agent device then performs monitoring operations on the target device in the data center according to the target device information of the monitoring task and obtains monitoring data. That is, the monitoring scope in this embodiment is narrowed to one or more specific target devices in the data center.
  • In addition, this embodiment uploads the monitoring data to the analysis device that corresponds to the target device in the analysis device cluster, and that analysis device performs the analysis operation on the monitoring data. That is, in this embodiment there is more than one analysis device, the analysis devices form an analysis device cluster, and different analysis devices are responsible for analyzing the data of their corresponding server devices (the target devices) in the data center.
  • the monitoring data of different target devices in the data center can be uniquely located to the corresponding analysis device in the analysis device cluster based on the HASH value of the target device.
  • multiple analysis devices in the analysis device cluster jointly perform an analysis operation on the monitoring data, which further ensures the reliability of the analysis operation on the monitoring data, thereby ensuring the overall reliability of the monitoring process.
  • After the analysis device obtains the analysis result, it can further record the analysis result synchronously into the storage system in a master/standby manner, so as to ensure the data security of the analysis result.
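  • The HASH-based mapping from a target device to one analysis device in the cluster can be illustrated as follows. The application does not say which hash function or mapping is used, so SHA-256 with a modulo over the cluster size is an assumption, and `pick_analysis_device` and the device names are hypothetical.

```python
import hashlib
from typing import List


def pick_analysis_device(target_device_id: str, cluster: List[str]) -> str:
    """Map a target device to exactly one analysis device in the cluster.

    Assumption: a stable hash of the target device identity, reduced modulo
    the cluster size. The same identity always selects the same device as
    long as the cluster list is unchanged.
    """
    if not cluster:
        raise ValueError("the analysis device cluster must not be empty")
    digest = hashlib.sha256(target_device_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(cluster)
    return cluster[index]
```

  • With plain modulo hashing, adding or removing an analysis device remaps many target devices; a consistent-hashing scheme would reduce that churn, but the application does not state which variant is intended.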
  • an embodiment of the present application discloses a monitoring method, which is applied to an arbitration device, and includes:
  • Step S40: Generate monitoring tasks and obtain the total number of monitoring tasks.
  • Step S41: Determine whether the total number of tasks is greater than a preset threshold, where the preset threshold is an integer greater than 1. If so, execute step S42; otherwise, execute step S43.
  • Step S42: Deliver the monitoring tasks in groups to the task agent device corresponding to the data center, with a preset time interval between the delivery of successive groups.
  • Step S43: Send the monitoring tasks to the task agent device all at once.
  • The execution subject of this embodiment is the arbitration device, which interacts directly with the user. The user initiates monitoring of the data center through the arbitration device, and the arbitration device responds to the user's monitoring needs by generating the corresponding monitoring tasks.
  • Because the user's monitoring needs for the data center may be numerous, after generating the monitoring tasks the arbitration device further obtains the total number of monitoring tasks initiated by the user and delivers the monitoring tasks to the task agent device according to that total.
  • The focus of this embodiment is that when the total number of tasks is greater than the preset threshold, the monitoring tasks are delivered in groups to the task agent device corresponding to the data center, with a preset time interval between the delivery of successive groups. This relatively reduces the processing pressure that the monitoring tasks place on the task agent device and thereby ensures the overall reliability of the monitoring process.
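  • The decision in steps S41 to S43 can be sketched as below. The application fixes only the threshold test and the preset interval between groups; the `group_size` parameter, the `send` callable, and the function name `deliver_tasks` are assumptions introduced for illustration.

```python
import time
from typing import Callable, List, Sequence


def deliver_tasks(tasks: Sequence[str],
                  send: Callable[[List[str]], None],
                  threshold: int,
                  group_size: int,
                  interval_seconds: float,
                  sleep: Callable[[float], None] = time.sleep) -> int:
    """Deliver tasks to a task agent device and return the number of sends.

    `group_size` is hypothetical: the source does not say how groups are sized.
    """
    if threshold <= 1:
        raise ValueError("the preset threshold must be an integer greater than 1")
    if group_size < 1:
        raise ValueError("group_size must be at least 1")

    # S41 -> S43: at or below the threshold, send everything at once.
    if len(tasks) <= threshold:
        send(list(tasks))
        return 1

    # S41 -> S42: above the threshold, send group by group with a pause between.
    groups = [list(tasks[i:i + group_size])
              for i in range(0, len(tasks), group_size)]
    for number, group in enumerate(groups):
        if number > 0:
            sleep(interval_seconds)  # preset interval between successive groups
        send(group)
    return len(groups)
```

  • Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting; in production the default `time.sleep` (or a scheduler) would apply the interval.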
  • an embodiment of the present application discloses a monitoring method applied to arbitration devices, including:
  • Step S50: Generate monitoring tasks and obtain the total number of monitoring tasks.
  • Step S51: Determine whether the total number of tasks is greater than a preset threshold, where the preset threshold is an integer greater than 1. If so, execute step S52; otherwise, execute step S53.
  • Step S52: Distribute the monitoring tasks to the task agent devices in a balanced manner in groups, with a preset time interval between the distribution of successive groups.
  • Step S53: Distribute the monitoring tasks to the task agent devices in a balanced manner all at once.
  • In this embodiment, the arbitration device distributes the monitoring tasks to the task agent devices in a balanced manner. This relatively balances the workload across the task agent devices, ensures the overall stability and reliability with which they respond to the monitoring tasks, and further ensures the overall reliability of the monitoring process.
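  • Balanced distribution is not defined further in the application. A simple round-robin assignment, shown here as one possible reading, keeps the task counts of any two task agent devices within one of each other. The function name `distribute_evenly` and the agent names are hypothetical.

```python
from typing import Dict, List, Sequence


def distribute_evenly(tasks: Sequence[str],
                      agents: Sequence[str]) -> Dict[str, List[str]]:
    """Assign tasks to task agent devices in round-robin order.

    Assumption: "balanced" means equal task counts, not equal task cost.
    """
    if not agents:
        raise ValueError("at least one task agent device is required")
    assignment: Dict[str, List[str]] = {agent: [] for agent in agents}
    for position, task in enumerate(tasks):
        assignment[agents[position % len(agents)]].append(task)
    return assignment
```

  • If tasks differ widely in cost, a load-aware policy based on the reported resource utilization would balance better than counting tasks.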
  • an embodiment of the present application discloses a monitoring method, which is applied to an analysis device, and includes:
  • Step S60: Obtain the monitoring data transmitted by the task agent device, and perform an analysis operation on the monitoring data according to a preset rule to generate an analysis result.
  • Step S61: Send the analysis result to the arbitration device for the arbitration device to display the analysis result.
  • After the analysis device obtains the monitoring data transmitted by the task agent device, it performs analysis operations on the monitoring data according to preset rules to generate analysis results.
  • The preset rules can be determined according to the actual monitoring and analysis needs of the user.
  • After the analysis device generates the analysis result, it sends the analysis result to the arbitration device, and the arbitration device displays it, so that the user can intuitively understand the specific content of the analysis result.
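  • The application leaves the form of the preset rules open. As one hedged example, the rules could be per-metric upper limits, with the analysis result listing the metrics that exceed them. The function name `analyze` and the metric names are assumptions.

```python
from typing import Dict, List


def analyze(monitoring_data: Dict[str, float],
            rules: Dict[str, float]) -> List[str]:
    """Return the metrics whose values exceed their preset upper limit.

    Assumption: a rule is a mapping from metric name to an upper limit.
    Metrics that are missing from the monitoring data are skipped.
    """
    exceeded = []
    for metric, limit in rules.items():
        value = monitoring_data.get(metric)
        if value is not None and value > limit:
            exceeded.append(metric)
    return exceeded
```

  • The returned list is what the analysis device would send on to the arbitration device for display in this sketch.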
  • the present application also provides a scenario embodiment in a specific application scenario for description.
  • a distributed multi-data center unified monitoring operation and maintenance architecture technical solution mainly includes arbitration equipment, analysis equipment and task agent equipment.
  • A task agent device is first added to the arbitration device. The arbitration device then sends a binding instruction to the task agent device according to the password, and the task agent device responds to the binding after receiving the binding instruction. After the two parties complete the binding, the task agent device reports a heartbeat to the arbitration device every 30 seconds.
  • the heartbeat content includes but is not limited to the following: physical address, IP, health information (CPU utilization, memory utilization, network utilization), version number.
  • the arbitration device detects the heartbeat information every 1 minute, and if a certain task agent device has no heartbeat, the task of the task agent device is migrated to other task agent devices.
  • The execution status of the task agent device is received and used to record, for each task, the execution time, waiting time, execution duration, success or failure, and other information.
  • A user name and password are set on the task agent device during installation and are used to verify the password when the arbitration device binds to it. The task agent device also has the ability to report heartbeats and execute tasks. Since task agent devices are distributed across the data centers, their execution should be stateless: there is no correlation between tasks, no correlation between one execution and the next, and no state is stored locally.
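  • The heartbeat check and task migration described in this scenario can be sketched as follows. The 30-second report and 1-minute check come from the text; treating an agent as having no heartbeat when its last report is older than 60 seconds, and re-assigning its tasks round-robin, are assumptions, as is the class name `HeartbeatMonitor`.

```python
from typing import Dict, List

HEARTBEAT_INTERVAL = 30.0  # seconds between agent heartbeats (from the text)
CHECK_INTERVAL = 60.0      # seconds between arbitration checks (from the text)


class HeartbeatMonitor:
    """Arbitration-side bookkeeping of agent heartbeats and task assignments."""

    def __init__(self, timeout: float = CHECK_INTERVAL):
        self.timeout = timeout  # assumed definition of "no heartbeat"
        self.last_seen: Dict[str, float] = {}
        self.tasks: Dict[str, List[str]] = {}

    def heartbeat(self, agent: str, now: float) -> None:
        self.last_seen[agent] = now
        self.tasks.setdefault(agent, [])

    def assign(self, agent: str, task: str) -> None:
        self.tasks.setdefault(agent, []).append(task)

    def check(self, now: float) -> List[str]:
        """Migrate tasks away from silent agents and return those agents."""
        silent = [a for a, seen in self.last_seen.items()
                  if now - seen > self.timeout]
        alive = [a for a in self.last_seen if a not in silent]
        if not alive:
            return silent  # nowhere to migrate; leave assignments untouched
        for agent in silent:
            orphaned = self.tasks.pop(agent, [])
            del self.last_seen[agent]
            for position, task in enumerate(orphaned):
                self.tasks[alive[position % len(alive)]].append(task)
        return silent
```

  • Migration is this simple only because task agent execution is stateless, as stated above: a task can be handed to another agent without transferring any local state.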
  • an embodiment of the present application discloses a monitoring system, including:
  • the task agent device 10 is used to obtain the monitoring tasks transmitted by the arbitration device 11; perform monitoring operations on the data center 13 according to the monitoring tasks and obtain monitoring data; and upload the monitoring data to the analysis device 12 to perform analysis operations on the monitoring data and generate analysis results;
  • the arbitration device 11 is used to generate monitoring tasks and obtain the total number of monitoring tasks; determine whether the total number of tasks is greater than a preset threshold, where the preset threshold is an integer greater than 1; if so, deliver the monitoring tasks in groups to the task agent device 10 corresponding to the data center 13, with a preset time interval between the delivery of successive groups; otherwise, issue the monitoring tasks to the task agent device 10 all at once.
  • the analysis device 12 is used to obtain the monitoring data passed in by the task agent device 10 and perform analysis operations on the monitoring data according to preset rules to generate analysis results; and send the analysis results to the arbitration device 11 for the arbitration device 11 to display the analysis results.
  • The task agent device obtains the monitoring task passed in by the arbitration device and performs monitoring operations on the monitored data center according to the monitoring task to obtain monitoring data. It then uploads the monitoring data to the analysis device, so that the analysis device performs analysis operations on the monitoring data and finally generates analysis results, achieving the purpose of monitoring the data center. Since this system monitors the data center by means of the task agent device, the arbitration device, and the analysis device, the complete business line of the monitoring business is distributed across multiple devices for collaborative execution, which relatively reduces the overall operational risk of the current monitoring system and ensures the overall reliability of the monitoring process.
  • this application also discloses a monitoring device, including:
  • Memory used to store computer programs
  • the processor is used to implement the steps of the monitoring method applied to the task agent device and/or applied to the arbitration device and/or applied to the analysis device when executing the computer program.
  • The task agent device obtains the monitoring task passed in by the arbitration device and performs monitoring operations on the monitored data center according to the monitoring task to obtain monitoring data. It then uploads the monitoring data to the analysis device, so that the analysis device performs analysis operations on the monitoring data and finally generates analysis results, achieving the purpose of monitoring the data center. Since this monitoring device monitors the data center by means of the task agent device, the arbitration device, and the analysis device, the complete business line of the monitoring business is distributed across multiple devices for collaborative execution, which relatively reduces the overall operational risk of the current monitoring system and ensures the overall reliability of the monitoring process.
  • The present application also provides a computer-readable storage medium on which a computer program is stored.
  • When the computer program is executed by a processor, the steps of the monitoring method applied to the task agent device and/or to the arbitration device and/or to the analysis device, as described above, are implemented.
  • The task agent device obtains the monitoring task passed in by the arbitration device and performs monitoring operations on the monitored data center according to the monitoring task, thereby obtaining monitoring data. It then uploads the monitoring data to the analysis device, so that the analysis device performs analysis operations on the monitoring data and finally generates analysis results, achieving the purpose of monitoring the data center. Since this computer-readable storage medium monitors the data center by means of the task agent device, the arbitration device, and the analysis device, the complete business line of the monitoring service is distributed across multiple devices for collaborative execution, which relatively reduces the overall operational risk of the current monitoring system and ensures the overall reliability of the monitoring process.

Abstract

Disclosed are a monitoring method, system and device, and a storage medium. In the method, a data center is monitored on the basis of a task agent device, an arbitration device and an analysis device, and therefore, a complete service line of a monitoring service is distributed to multiple devices for cooperative execution, thus relatively reducing the overall operation risk of a current monitoring system, and ensuring the overall reliability of a monitoring process. In addition, further provided is a monitoring system and device, and a storage medium, and the beneficial effects thereof are the same as described above.

Description

Monitoring method, system, device, and storage medium
This application claims priority to Chinese patent application No. 201911122494.X, entitled "Monitoring method, system, device, and storage medium", filed with the Chinese Patent Office on November 15, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of server operation and maintenance, and in particular to a monitoring method, system, device, and storage medium.
Background
A data center is a globally coordinated network of dedicated equipment, made up of a large number of servers, that transmits, accelerates, displays, computes, and stores data over network infrastructure. Data centers often store important data required for business operations, so the servers in them must run reliably, and technicians therefore often need to monitor the status information of those servers.
Current monitoring schemes for data center servers usually place a separate monitoring device in the data center. A monitoring system is deployed on that device and performs the complete monitoring operation on the data center's servers. However, because the service line of the monitoring operation is long, the operational risk of the monitoring system increases, and it is difficult to ensure overall reliability during monitoring.
It follows that providing a monitoring method that relatively ensures overall reliability during monitoring is a problem that those skilled in the art need to solve.
Summary of the Invention
The purpose of this application is to provide a monitoring method, system, device, and storage medium that relatively ensure the overall reliability of the monitoring process.
To solve the above technical problem, this application provides a monitoring method applied to a task agent device, including:
obtaining a monitoring task delivered by an arbitration device;
performing a monitoring operation on a data center according to the monitoring task and obtaining monitoring data;
uploading the monitoring data to an analysis device so that the monitoring data is analyzed and an analysis result is generated.
Preferably, before obtaining the monitoring task delivered by the arbitration device, the method further includes:
responding to a binding instruction from the arbitration device and establishing a binding relationship with the arbitration device;
sending operating status information at a preset frequency to the arbitration device with which the binding relationship exists, so that the arbitration device can determine the availability of the task agent device;
Obtaining the monitoring task delivered by the arbitration device includes:
obtaining the monitoring task delivered by the arbitration device with which the binding relationship exists.
Preferably, the operating status information includes a MAC address, an IP address, resource utilization, and a software version number.
Preferably, the monitoring task includes target device information;
Performing the monitoring operation on the data center according to the monitoring task and obtaining monitoring data includes:
performing the monitoring operation on the target device in the data center according to the target device information in the monitoring task, and obtaining monitoring data;
Uploading the monitoring data to the analysis device so that the monitoring data is analyzed includes:
uploading the monitoring data to the analysis device in an analysis device cluster that corresponds to the target device, so that the monitoring data is analyzed.
In addition, this application provides a monitoring method applied to an arbitration device, including:
generating monitoring tasks and obtaining the total number of monitoring tasks;
determining whether the total number of tasks is greater than a preset threshold, the preset threshold being an integer greater than 1;
if so, delivering the monitoring tasks in groups to the task agent device corresponding to the data center, with a preset interval between the delivery times of successive groups;
otherwise, delivering the monitoring tasks to the task agent device all at once.
Preferably, when the number of task agent devices is greater than 1, delivering the monitoring tasks in groups to the task agent device corresponding to the data center includes:
delivering the monitoring tasks in groups, evenly balanced across the task agent devices;
Delivering the monitoring tasks to the task agent device all at once includes:
delivering the monitoring tasks all at once, evenly balanced across the task agent devices.
In addition, this application provides a monitoring method applied to an analysis device, including:
obtaining monitoring data delivered by a task agent device, and analyzing the monitoring data according to a preset rule to generate an analysis result;
sending the analysis result to an arbitration device so that the arbitration device can display the analysis result.
In addition, this application provides a monitoring system, including:
a task agent device, configured to obtain a monitoring task delivered by an arbitration device; perform a monitoring operation on a data center according to the monitoring task and obtain monitoring data; and upload the monitoring data to an analysis device so that the monitoring data is analyzed and an analysis result is generated;
an arbitration device, configured to generate monitoring tasks and obtain the total number of monitoring tasks; determine whether the total number of tasks is greater than a preset threshold, the preset threshold being an integer greater than 1; if so, deliver the monitoring tasks in groups to the task agent device corresponding to the data center, with a preset interval between the delivery times of successive groups; otherwise, deliver the monitoring tasks to the task agent device all at once;
an analysis device, configured to obtain the monitoring data delivered by the task agent device, analyze the monitoring data according to a preset rule to generate an analysis result, and send the analysis result to the arbitration device so that the arbitration device can display the analysis result.
In addition, this application provides a monitoring device, including:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the monitoring method described above when executing the computer program.
In addition, this application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the monitoring method described above are implemented.
In the monitoring method provided by this application, the task agent device obtains the monitoring task delivered by the arbitration device and performs monitoring operations on the monitored data center according to that task to obtain monitoring data. It then uploads the monitoring data to the analysis device, which analyzes the data and generates an analysis result, thereby monitoring the data center. Because the method monitors the data center through the cooperation of the task agent device, the arbitration device, and the analysis device, the complete service line of the monitoring service is distributed across multiple devices for cooperative execution, which relatively reduces the overall operational risk of the monitoring system and helps ensure the overall reliability of the monitoring process. This application also provides a monitoring system, a monitoring device, and a storage medium, which have the same beneficial effects.
Description of the Drawings
FIG. 1 is a flowchart of a monitoring method applied to a task agent device disclosed in this application;
FIG. 2 is a flowchart of a specific monitoring method applied to a task agent device disclosed in this application;
FIG. 3 is a flowchart of a specific monitoring method applied to a task agent device disclosed in this application;
FIG. 4 is a flowchart of a monitoring method applied to an arbitration device disclosed in this application;
FIG. 5 is a flowchart of a specific monitoring method applied to an arbitration device disclosed in this application;
FIG. 6 is a flowchart of a monitoring method applied to an analysis device disclosed in this application;
FIG. 7 is a schematic structural diagram of a monitoring system disclosed in this application.
Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative work fall within the scope of protection of this application.
Current monitoring schemes for data center servers usually place a separate monitoring device in the data center. A monitoring system is deployed on that device and performs the complete monitoring operation on the data center's servers. However, because the service line of the monitoring operation is long, the operational risk of the monitoring system increases, and it is difficult to ensure overall reliability during monitoring.
The core of this application is therefore to provide a monitoring method that relatively ensures the overall reliability of the monitoring process.
To help those skilled in the art better understand the solution, this application is described in further detail below with reference to the drawings and specific embodiments.
Referring to FIG. 1, an embodiment of this application discloses a monitoring method applied to a task agent device, including:
Step S10: Obtain the monitoring task delivered by the arbitration device.
It should be noted that this embodiment is executed by the task agent device. The task agent device and the arbitration device are independent of each other and communicate with each other. The task agent device also communicates with the server devices in the data center; it responds to monitoring tasks delivered by the arbitration device, performs monitoring operations on the data center according to those tasks, and obtains monitoring data. The arbitration device is responsible for receiving monitoring tasks initiated by the user and sending them to the task agent device, so that the corresponding monitoring operations are performed on the data center through the task agent device. The arbitration device is also responsible for managing the task agent devices.
In this step, the monitoring task delivered by the arbitration device may be initiated by the user in real time through the arbitration device, or may be read item by item from a task list written by the user in advance. The content monitored by a monitoring task includes, but is not limited to, the usage of the data center's computing resources.
Step S11: Perform a monitoring operation on the data center according to the monitoring task and obtain monitoring data.
After obtaining the monitoring task delivered by the arbitration device, the task agent device performs the corresponding monitoring operation on the data center and obtains the corresponding monitoring data. Because the task agent device communicates directly with the server devices in the data center, it can issue monitoring instructions to those server devices according to the monitoring requirements of the task, and thereby obtain the data center's monitoring data.
Step S12: Upload the monitoring data to the analysis device so that the monitoring data is analyzed and an analysis result is generated.
After obtaining the data center's monitoring data, the task agent device uploads it to the analysis device, which performs the corresponding analysis operation on the data and generates the final analysis result.
It should be noted that the analysis device and the task agent device are independent of each other and communicate with each other. The analysis device is responsible for collecting and analyzing the monitoring results gathered by the task agent device.
In the monitoring method provided by this application, the task agent device obtains the monitoring task delivered by the arbitration device and performs monitoring operations on the monitored data center according to that task to obtain monitoring data. It then uploads the monitoring data to the analysis device, which analyzes the data and generates an analysis result, thereby monitoring the data center. Because the method monitors the data center through the cooperation of the task agent device, the arbitration device, and the analysis device, the complete service line of the monitoring service is distributed across multiple devices for cooperative execution, which relatively reduces the overall operational risk of the monitoring system and helps ensure the overall reliability of the monitoring process.
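Steps S10 to S12 can be sketched as follows. This is a minimal illustration only: the `MonitoringTask` fields and the `collect` and `upload` callables are hypothetical stand-ins for the communication with the data center servers and the analysis device, which the application does not specify.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class MonitoringTask:
    """A task received from the arbitration device (S10). Fields are assumed."""
    task_id: str
    metric: str  # hypothetical, e.g. "cpu_usage"


def run_task(
    task: MonitoringTask,
    collect: Callable[[MonitoringTask], float],
    upload: Callable[[Dict[str, Any]], Any],
) -> Any:
    """Run one task on the task agent device."""
    # S11: perform the monitoring operation and obtain monitoring data.
    data = {
        "task_id": task.task_id,
        "metric": task.metric,
        "value": collect(task),
    }
    # S12: hand the monitoring data to the analysis device.
    return upload(data)
```

Passing the collector and uploader as callables keeps the agent logic independent of any particular transport, which is a design choice of this sketch rather than something the application prescribes.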
Referring to FIG. 2, an embodiment of this application discloses a monitoring method applied to a task agent device, including:
Step S20: Respond to the binding instruction from the arbitration device and establish a binding relationship with the arbitration device.
Step S21: Send operating status information at a preset frequency to the arbitration device with which the binding relationship exists, so that the arbitration device can determine the availability of the task agent device.
Step S22: Obtain the monitoring task delivered by the arbitration device with which the binding relationship exists.
Step S23: Perform a monitoring operation on the data center according to the monitoring task and obtain monitoring data.
Step S24: Upload the monitoring data to the analysis device so that the monitoring data is analyzed and an analysis result is generated.
It should be noted that the focus of this embodiment is that the task agent device establishes a binding relationship with the arbitration device before obtaining monitoring tasks from it. To establish the binding, the arbitration device first sends a binding instruction to the task agent device; on receiving it, the task agent device establishes a communication binding with the arbitration device. The binding relationship is the task agent device's task response relationship to the arbitration device: the task agent device responds to monitoring tasks from the arbitration device it is bound to and performs the corresponding monitoring service. After the binding is established, the task agent device sends operating status information at the preset frequency only to the bound arbitration device. The purpose is to report its own working state so that the arbitration device can judge whether the task agent device is available. When the arbitration device determines that the task agent device is available, it delivers monitoring tasks to it, and the task agent device performs the corresponding monitoring operations. By establishing the binding in advance and having the task agent device serve only the arbitration device it is bound to, this embodiment relatively ensures that the arbitration device assigns monitoring tasks to task agent devices in an orderly and reliable way, which in turn helps ensure the overall reliability of the monitoring process.
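The binding behavior described in this embodiment can be sketched as a task agent that accepts tasks only from the arbitration device it is bound to. The class and method names here are hypothetical, and identifying an arbitration device by a string ID is an assumption of the sketch.

```python
from typing import List, Optional


class TaskAgent:
    """Minimal task agent that serves only its bound arbitration device."""

    def __init__(self) -> None:
        self.bound_arbiter: Optional[str] = None
        self.queue: List[str] = []

    def handle_bind(self, arbiter_id: str) -> bool:
        # S20: respond to the binding instruction and record the binding.
        self.bound_arbiter = arbiter_id
        return True

    def accept_task(self, arbiter_id: str, task: str) -> bool:
        # S22: only tasks from the bound arbitration device are accepted.
        if self.bound_arbiter is None or arbiter_id != self.bound_arbiter:
            return False
        self.queue.append(task)
        return True
```

In this sketch an unbound agent rejects every task, and a bound agent rejects tasks from any other arbitration device.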
On the basis of the foregoing embodiment, as a preferred implementation, the operating status information includes a MAC address, an IP address, resource utilization, and a software version number.
It should be noted that the MAC address and IP address are prerequisites for data communication between the arbitration device and the task agent device, so their reachability is a key factor in whether the arbitration device's monitoring tasks can be sent to the task agent device. Resource utilization refers to the computing resources that the task agent device needs in order to respond to and process monitoring tasks, including but not limited to CPU utilization, memory utilization, and network utilization; the amount of available computing resources directly determines whether the task agent device can process monitoring tasks normally. The software version number is the version of the software program that runs on the task agent device to implement monitoring operations. Different versions may require different packet structures for monitoring tasks, so if the version of the monitoring software on the task agent device does not match the version of the software on the arbitration device, the task agent device may be unable to respond properly to the monitoring tasks it receives. Taking these factors into account, the operating status information that the task agent device sends to the arbitration device in this implementation includes the MAC address, IP address, resource utilization, and software version number, so that the arbitration device can make an overall judgment about whether the task agent device is available. This further helps ensure that monitoring tasks are assigned to task agent devices in an orderly and reliable way, and thus the overall reliability of the monitoring process.
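One way the arbitration device might combine these fields into an availability decision is sketched below. The `StatusReport` layout, the 0.9 utilization limit, and the exact-match version check are all assumptions made for illustration; the application only states which fields are reported.

```python
from dataclasses import dataclass


@dataclass
class StatusReport:
    """Operating status information sent by a task agent device."""
    mac: str
    ip: str
    cpu: float      # utilization in the range 0.0 to 1.0
    memory: float   # utilization in the range 0.0 to 1.0
    network: float  # utilization in the range 0.0 to 1.0
    version: str


def is_available(
    report: StatusReport,
    arbiter_version: str,
    max_utilization: float = 0.9,  # assumed limit, not from the application
) -> bool:
    """Decide whether the reporting task agent device can take tasks."""
    # Addresses are needed for the arbitration device to reach the agent.
    if not report.mac or not report.ip:
        return False
    # An overloaded agent may not process tasks normally.
    if max(report.cpu, report.memory, report.network) > max_utilization:
        return False
    # Assumption: versions must match exactly to be compatible.
    return report.version == arbiter_version
```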
Referring to FIG. 3, an embodiment of this application discloses a monitoring method applied to a task agent device, including:
Step S30: Obtain the monitoring task delivered by the arbitration device, where the monitoring task includes target device information.
Step S31: Perform a monitoring operation on the target device in the data center according to the target device information in the monitoring task, and obtain monitoring data.
Step S32: Upload the monitoring data to the analysis device in the analysis device cluster that corresponds to the target device, so that the monitoring data is analyzed and an analysis result is generated.
It should be noted that the focus of this embodiment is that the monitoring task delivered by the arbitration device to the task agent device includes target device information, which is the identity information of the specific server device in the data center that the user wants to monitor. The task agent device performs the monitoring operation on that target device according to the target device information and obtains monitoring data. In other words, this embodiment narrows the monitoring scope to one or more specific target devices inside the data center and obtains the monitoring data of those devices. On this basis, the monitoring data is uploaded to the analysis device in the analysis device cluster that corresponds to the target device. That is, in this embodiment there is more than one analysis device, the analysis devices form a cluster, and different analysis devices are responsible for analyzing different server devices (target devices) in the data center. In a specific implementation scenario, the monitoring data of each target device can be routed to a unique analysis device in the cluster based on the hash value of the target device.
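The hash-based routing mentioned above can be sketched as follows. The application does not say which hash function or mapping is used; this sketch assumes SHA-256 over a string device identifier and a simple modulo over the cluster size, and the function name is hypothetical.

```python
import hashlib
from typing import Sequence


def pick_analyzer(target_id: str, analyzers: Sequence[str]) -> str:
    """Map a target device to one analysis device in the cluster."""
    if not analyzers:
        raise ValueError("analysis device cluster is empty")
    # A cryptographic digest is stable across processes, unlike the
    # built-in hash() for strings, which is randomized per interpreter run.
    digest = hashlib.sha256(target_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(analyzers)
    return analyzers[index]
```

With a fixed cluster, the same target device always maps to the same analysis device. A plain modulo remaps many targets when the cluster size changes; whether that matters depends on the deployment.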
In this embodiment, multiple analysis devices in the analysis device cluster jointly analyze the monitoring data, which further improves the reliability of the analysis and thus the overall reliability of the monitoring process.
On the basis of the above embodiment, after obtaining the analysis result, the analysis device may also write the result synchronously to the storage system in a primary/standby arrangement, to protect the analysis result data.
Referring to FIG. 4, an embodiment of this application discloses a monitoring method applied to an arbitration device, including:
Step S40: Generate monitoring tasks and obtain the total number of monitoring tasks.
Step S41: Determine whether the total number of tasks is greater than a preset threshold, the preset threshold being an integer greater than 1. If so, perform step S42; otherwise, perform step S43.
Step S42: Deliver the monitoring tasks in groups to the task agent device corresponding to the data center, with a preset interval between the delivery times of successive groups.
Step S43: Deliver the monitoring tasks to the task agent device all at once.
It should be noted that this embodiment is executed by the arbitration device, which interacts directly with the user. The user initiates monitoring of the data center through the arbitration device, and the arbitration device generates the corresponding monitoring tasks in response to the user's monitoring requirements. Because the number of monitoring requirements may be large, after generating the monitoring tasks the arbitration device obtains their total number and delivers them to the task agent device using a delivery policy chosen according to that total. The focus of this embodiment is that when the total number of tasks exceeds the preset threshold, the monitoring tasks are delivered in groups to the task agent device corresponding to the data center, with a preset interval between successive groups. This relatively reduces the processing pressure on the task agent device and helps ensure the overall reliability of the monitoring process.
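Steps S41 to S43 can be sketched as below. The application does not specify how groups are formed, so the fixed `group_size` is an assumption, as are the function and parameter names. The `sleep` parameter is injectable so the pacing can be checked without waiting.

```python
import time
from typing import Callable, List, Sequence


def dispatch(
    tasks: Sequence[str],
    send: Callable[[List[str]], None],
    threshold: int,
    group_size: int,      # assumed grouping rule: fixed-size groups
    interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Deliver tasks and return the number of deliveries made."""
    # S41/S43: at or below the threshold, deliver everything at once.
    if len(tasks) <= threshold:
        send(list(tasks))
        return 1
    # S42: above the threshold, deliver in groups with a pause between them.
    groups = [
        list(tasks[i:i + group_size])
        for i in range(0, len(tasks), group_size)
    ]
    for n, group in enumerate(groups):
        if n > 0:
            sleep(interval_s)
        send(group)
    return len(groups)
```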
Referring to FIG. 5, for the case where the number of task agent devices is greater than 1, an embodiment of this application discloses a monitoring method applied to an arbitration device, including:
Step S50: Generate monitoring tasks and obtain the total number of monitoring tasks.
Step S51: Determine whether the total number of tasks is greater than a preset threshold, the preset threshold being an integer greater than 1. If so, perform step S52; otherwise, perform step S53.
Step S52: Deliver the monitoring tasks in groups, evenly balanced across the task agent devices, with a preset interval between the delivery times of successive groups.
Step S53: Deliver the monitoring tasks all at once, evenly balanced across the task agent devices.
It should be noted that this embodiment covers the case where there is more than one task agent device, that is, where the task agent devices work together as a cluster. The arbitration device distributes the monitoring tasks evenly among the task agent devices, which balances their workloads, helps keep their handling of monitoring tasks stable and reliable, and further helps ensure the overall reliability of the monitoring process.
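A simple way to balance tasks across several task agent devices is round-robin assignment, sketched below. The application only requires that delivery be balanced; round-robin is one assumed policy, and the function name is hypothetical.

```python
from typing import Dict, List, Sequence


def balance(tasks: Sequence[str], agents: Sequence[str]) -> Dict[str, List[str]]:
    """Assign tasks to task agent devices in round-robin order."""
    if not agents:
        raise ValueError("no task agent devices available")
    assignment: Dict[str, List[str]] = {agent: [] for agent in agents}
    for i, task in enumerate(tasks):
        assignment[agents[i % len(agents)]].append(task)
    return assignment
```

With this policy the task counts of any two agents differ by at most one. A load-aware policy based on reported resource utilization would be another option.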
Referring to FIG. 6, an embodiment of this application discloses a monitoring method applied to an analysis device, including:
Step S60: Obtain the monitoring data delivered by the task agent device, and analyze the monitoring data according to a preset rule to generate an analysis result.
Step S61: Send the analysis result to the arbitration device so that the arbitration device can display the analysis result.
In this embodiment, after obtaining the monitoring data delivered by the task agent device, the analysis device analyzes it according to a preset rule to generate an analysis result. The preset rule can be chosen according to the user's actual monitoring and analysis needs. After generating the analysis result, the analysis device sends it to the arbitration device, which displays it so that the user can readily see what the result contains.
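As an illustration of step S60, the sketch below treats the preset rule as a set of per-metric upper limits. This is only one possible form of rule; the application leaves the rule to the user's needs, and the names and result layout here are hypothetical.

```python
from typing import Any, Dict


def analyze(data: Dict[str, float], rules: Dict[str, float]) -> Dict[str, Any]:
    """Compare monitoring data against per-metric limits."""
    # A metric missing from the data is treated as 0.0 (an assumption).
    exceeded = sorted(
        metric for metric, limit in rules.items()
        if data.get(metric, 0.0) > limit
    )
    return {"status": "alert" if exceeded else "ok", "exceeded": exceeded}
```

The returned dictionary is what would then be sent to the arbitration device for display in step S61.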
To aid understanding of the technical solution of the present application, a scenario embodiment for a specific application scenario is described below.
A technical solution for a unified monitoring and operations architecture across distributed, multiple data centers mainly comprises an arbitration device, an analysis device, and a task agent device.
The specific implementation in this application scenario is as follows:
1) Arbitration device:
Implementation:
a. Task agent device management
A task agent device is added on the arbitration device, and the arbitration device then sends a binding instruction, based on the password, to the task agent device. On receiving the binding instruction, the task agent device responds and completes the binding. Once the two sides are bound, the task agent device reports a heartbeat to the arbitration device every 30 seconds. The heartbeat includes, but is not limited to, the physical (MAC) address, IP address, health information (CPU utilization, memory utilization, network utilization), and version number.
The arbitration device checks the heartbeat information once every minute. If a task agent device has no heartbeat, its tasks are migrated to other task agent devices.
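The heartbeat check and task migration can be sketched as below. The 30-second and 1-minute intervals come from the text; treating an agent as lost when no heartbeat arrived within one check interval, and moving its tasks to the least-loaded surviving agent, are assumptions made for illustration. All function and agent names are hypothetical.

```python
HEARTBEAT_INTERVAL_S = 30  # agents report every 30 seconds (from the text)
CHECK_INTERVAL_S = 60      # the arbitration device checks once a minute (from the text)


def find_dead_agents(last_heartbeat, now, timeout=CHECK_INTERVAL_S):
    """Return agents whose latest heartbeat is older than `timeout` seconds."""
    return sorted(agent for agent, seen in last_heartbeat.items()
                  if now - seen > timeout)


def migrate_tasks(assignment, dead_agents):
    """Move each task of a dead agent to the currently least-loaded live agent."""
    live = {agent: list(tasks) for agent, tasks in assignment.items()
            if agent not in dead_agents}
    if not live:
        raise RuntimeError("no live task agent device available")
    for agent in dead_agents:
        for task in assignment.get(agent, []):
            target = min(live, key=lambda name: len(live[name]))
            live[target].append(task)
    return live


# Timestamps are plain seconds so the example is deterministic.
last_heartbeat = {"agent-a": 100.0, "agent-b": 10.0, "agent-c": 95.0}
dead = find_dead_agents(last_heartbeat, now=120.0)
migrated = migrate_tasks(
    {"agent-a": ["t1", "t2"], "agent-b": ["t3", "t4"], "agent-c": ["t5"]},
    dead,
)
```

With a 60-second timeout an agent can miss one 30-second report without being declared lost, which avoids migrating tasks because of a single dropped heartbeat.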
b. Task management
The arbitration device provides a task maintenance interface. After receiving a task, it assigns the task to the corresponding task agent device for execution according to the network monitoring policy of the bound data center.
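One way to picture the assignment step is a lookup from data center to its bound agents, followed by a choice among them. This is only a sketch: the source does not describe the monitoring policy, so the `bindings` table, the least-loaded selection, and every name below are hypothetical.

```python
def assign_task(task, bindings, load):
    """Pick the least-loaded agent bound to the task's data center.

    `bindings` maps a data center name to its bound agents and `load` counts
    the tasks already given to each agent (both are assumed structures).
    """
    candidates = bindings.get(task["data_center"])
    if not candidates:
        raise LookupError(f"no task agent device bound to {task['data_center']!r}")
    chosen = min(candidates, key=lambda agent: load.get(agent, 0))
    load[chosen] = load.get(chosen, 0) + 1
    return chosen


bindings = {"dc-east": ["agent-a", "agent-b"], "dc-west": ["agent-c"]}
load = {"agent-a": 2}
first = assign_task({"id": "t1", "data_center": "dc-east"}, bindings, load)
second = assign_task({"id": "t2", "data_center": "dc-east"}, bindings, load)
third = assign_task({"id": "t3", "data_center": "dc-east"}, bindings, load)
```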
2) Analysis device:
Implementation:
a. Receiving task execution results
The analysis device receives the execution results from the task agent devices, then cleans, analyzes, and stores the data.
b. Receiving task execution status
The analysis device receives the execution status from the task agent devices and uses it to record, for each task, the execution time, waiting time, execution duration, success or failure, and similar information.
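The per-task status record could be modeled as follows. This is a minimal sketch: the field names, the use of plain second-based timestamps, and the `summarize` helper are assumptions, since the source lists only the kinds of information recorded.

```python
from dataclasses import dataclass


@dataclass
class TaskExecutionStatus:
    """One execution record reported by a task agent device (assumed fields)."""
    task_id: str
    queued_at: float    # when the task was handed to the agent
    started_at: float   # execution time: when the agent started running it
    finished_at: float
    succeeded: bool

    @property
    def waiting_time(self):
        return self.started_at - self.queued_at

    @property
    def duration(self):
        return self.finished_at - self.started_at


def summarize(statuses):
    """Aggregate success and failure counts and the mean execution duration."""
    total = len(statuses)
    succeeded = sum(1 for status in statuses if status.succeeded)
    mean = sum(status.duration for status in statuses) / total if total else 0.0
    return {"total": total, "succeeded": succeeded,
            "failed": total - succeeded, "avg_duration": mean}


statuses = [
    TaskExecutionStatus("t1", queued_at=0.0, started_at=2.0, finished_at=5.0, succeeded=True),
    TaskExecutionStatus("t2", queued_at=1.0, started_at=2.0, finished_at=3.0, succeeded=False),
]
summary = summarize(statuses)
```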
3) Task agent device:
Implementation:
A user name and password are set when the task agent device is installed; they are used to verify the password when the arbitration device binds to it. The task agent device can also report heartbeats and execute tasks. Because task agent devices are distributed across the data centers, their execution must be stateless: tasks are independent of one another, one execution has no relationship to the next, and the device has no ability to store any state locally.
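The two agent-side behaviors, password verification at binding time and stateless task execution, can be sketched as below. Everything here is illustrative: `verify_binding_password`, `execute_task`, and the `probe` callable are hypothetical names, and comparing plain-text passwords is a simplification (a real deployment would more likely store a salted hash).

```python
import hmac


def verify_binding_password(presented, configured):
    """Compare the presented and configured passwords in constant time."""
    return hmac.compare_digest(presented.encode("utf-8"),
                               configured.encode("utf-8"))


def execute_task(task, probe):
    """Run one monitoring task without keeping any state between calls.

    `probe` is a hypothetical callable that collects data from a target;
    everything else the run needs is carried in `task`.
    """
    try:
        data = probe(task["target"])
    except Exception as exc:  # report the failure instead of remembering it
        return {"task_id": task["id"], "ok": False, "error": str(exc)}
    return {"task_id": task["id"], "ok": True, "data": data}


def fake_probe(target):
    """Stand-in for a real collector, used only in this example."""
    if target == "unreachable-host":
        raise ConnectionError("target did not respond")
    return {"target": target, "cpu_utilization": 0.42}


ok_result = execute_task({"id": "t1", "target": "server-01"}, fake_probe)
failed_result = execute_task({"id": "t2", "target": "unreachable-host"}, fake_probe)
```

Because `execute_task` reads only its arguments and returns its whole outcome, any agent can run any task, which is what makes migrating tasks between agents straightforward.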
Referring to FIG. 7, an embodiment of the present application discloses a monitoring system, including:
a task agent device 10, configured to obtain a monitoring task sent by the arbitration device 11; perform a monitoring operation on the data center 13 according to the monitoring task and obtain monitoring data; and upload the monitoring data to the analysis device 12 so that the monitoring data is analyzed and an analysis result is generated;
an arbitration device 11, configured to generate monitoring tasks and obtain the total number of monitoring tasks; determine whether the total number of tasks is greater than a preset threshold, the preset threshold being an integer greater than 1; if so, deliver the monitoring tasks in groups to the task agent device 10 corresponding to the data center 13, with a preset interval between the delivery times of the groups; otherwise, deliver the monitoring tasks to the task agent device 10 all at once; and
an analysis device 12, configured to obtain the monitoring data sent by the task agent device 10 and analyze the monitoring data according to a preset rule to generate an analysis result; and send the analysis result to the arbitration device 11 so that the arbitration device 11 can display it.
In the monitoring system provided by the present application, the task agent device obtains the monitoring task sent by the arbitration device and performs a monitoring operation on the monitored data center according to that task to obtain monitoring data. It then uploads the monitoring data to the analysis device, which analyzes the data and generates an analysis result, thereby achieving the purpose of monitoring the data center. Because the system monitors the data center through the cooperation of the task agent device, the arbitration device, and the analysis device, the complete monitoring workflow is divided among multiple devices that execute it together, which reduces the overall operational risk of the monitoring system and ensures the overall reliability of the monitoring process.
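The arbitration device's threshold decision can be sketched as follows. The threshold test and the pause between groups follow the description; the `group_size` parameter is an assumption, because the source does not say how the groups are sized, and `dispatch`, `send`, and `sleep` are hypothetical names.

```python
import time


def dispatch(tasks, send, threshold, group_size, interval_s, sleep=time.sleep):
    """Deliver tasks all at once, or in spaced groups when there are too many.

    threshold  -- preset threshold from the text (an integer greater than 1)
    group_size -- hypothetical: the source does not say how groups are sized
    interval_s -- preset pause between the delivery of successive groups
    Returns the number of deliveries made.
    """
    if threshold <= 1 or group_size < 1:
        raise ValueError("threshold must be > 1 and group_size must be >= 1")
    if len(tasks) <= threshold:
        send(list(tasks))
        return 1
    groups = [tasks[i:i + group_size] for i in range(0, len(tasks), group_size)]
    for index, group in enumerate(groups):
        if index > 0:
            sleep(interval_s)
        send(group)
    return len(groups)


# Record deliveries and pauses instead of really sending or sleeping.
sent, pauses = [], []
many = dispatch(list(range(10)), sent.append, threshold=4, group_size=4,
                interval_s=5, sleep=pauses.append)
few = dispatch([1, 2, 3], sent.append, threshold=4, group_size=4,
               interval_s=5, sleep=pauses.append)
```

Passing `sleep` as a parameter keeps the sketch testable without waiting; in normal use the default `time.sleep` provides the preset interval.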
In addition, the present application discloses a monitoring device, including:
a memory, configured to store a computer program; and
a processor, configured to implement, when executing the computer program, the steps of the monitoring method described above as applied to the task agent device and/or the arbitration device and/or the analysis device.
With the monitoring device provided by the present application, the task agent device obtains the monitoring task sent by the arbitration device and performs a monitoring operation on the monitored data center according to that task to obtain monitoring data. It then uploads the monitoring data to the analysis device, which analyzes the data and generates an analysis result, thereby achieving the purpose of monitoring the data center. Because the device monitors the data center through the cooperation of the task agent device, the arbitration device, and the analysis device, the complete monitoring workflow is divided among multiple devices that execute it together, which reduces the overall operational risk of the monitoring system and ensures the overall reliability of the monitoring process.
Further, the present application provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the steps of the monitoring method described above as applied to the task agent device and/or the arbitration device and/or the analysis device.
With the computer-readable storage medium provided by the present application, the task agent device obtains the monitoring task sent by the arbitration device and performs a monitoring operation on the monitored data center according to that task to obtain monitoring data. It then uploads the monitoring data to the analysis device, which analyzes the data and generates an analysis result, thereby achieving the purpose of monitoring the data center. Because the computer-readable storage medium monitors the data center through the cooperation of the task agent device, the arbitration device, and the analysis device, the complete monitoring workflow is divided among multiple devices that execute it together, which reduces the overall operational risk of the monitoring system and ensures the overall reliability of the monitoring process.
The monitoring method, system, device, and storage medium provided by the present application have been described in detail above. The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the parts that are the same or similar can be cross-referenced between embodiments. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, it is described more briefly, and the relevant details can be found in the description of the method. It should be noted that those of ordinary skill in the art may make improvements and modifications to the present application without departing from its principles, and such improvements and modifications also fall within the scope of protection of the claims of the present application.
It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to be non-exclusive, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes that element.

Claims (10)

  1. A monitoring method, characterized in that it is applied to a task agent device and comprises:
    obtaining a monitoring task sent by an arbitration device;
    performing a monitoring operation on a data center according to the monitoring task and obtaining monitoring data;
    uploading the monitoring data to an analysis device so that the monitoring data is analyzed and an analysis result is generated.
  2. The monitoring method according to claim 1, characterized in that, before the obtaining of the monitoring task sent by the arbitration device, the method further comprises:
    responding to a binding instruction sent by the arbitration device and establishing a binding relationship with the arbitration device;
    sending operating status information at a preset frequency to the arbitration device having the binding relationship, so that the arbitration device can determine the availability of the task agent device;
    wherein the obtaining of the monitoring task sent by the arbitration device comprises:
    obtaining the monitoring task sent by the arbitration device having the binding relationship.
  3. The monitoring method according to claim 2, characterized in that the operating status information comprises a MAC address, an IP address, resource utilization, and a software version number.
  4. The monitoring method according to any one of claims 1 to 3, characterized in that the monitoring task includes target device information;
    the performing of a monitoring operation on the data center according to the monitoring task and obtaining monitoring data comprises:
    performing a monitoring operation on a target device of the data center according to the target device information of the monitoring task and obtaining monitoring data;
    the uploading of the monitoring data to an analysis device so that the monitoring data is analyzed comprises:
    uploading the monitoring data to the analysis device in an analysis device cluster that corresponds to the target device, so that the analysis operation is performed on the monitoring data.
  5. A monitoring method, characterized in that it is applied to an arbitration device and comprises:
    generating monitoring tasks and obtaining the total number of the monitoring tasks;
    determining whether the total number of tasks is greater than a preset threshold, the preset threshold being an integer greater than 1;
    if so, delivering the monitoring tasks in groups to the task agent device corresponding to a data center, with a preset interval between the delivery times of the groups;
    otherwise, delivering the monitoring tasks to the task agent device all at once.
  6. The monitoring method according to claim 5, characterized in that, when the number of task agent devices is greater than 1, the delivering of the monitoring tasks in groups to the task agent device corresponding to the data center comprises:
    delivering the monitoring tasks in groups, evenly distributed, to each of the task agent devices;
    and the delivering of the monitoring tasks to the task agent device all at once comprises:
    delivering the monitoring tasks, evenly distributed, to each of the task agent devices all at once.
  7. A monitoring method, characterized in that it is applied to an analysis device and comprises:
    obtaining monitoring data sent by a task agent device, and analyzing the monitoring data according to a preset rule to generate an analysis result;
    sending the analysis result to an arbitration device so that the arbitration device can display the analysis result.
  8. A monitoring system, characterized in that it comprises:
    a task agent device, configured to obtain a monitoring task sent by an arbitration device; perform a monitoring operation on a data center according to the monitoring task and obtain monitoring data; and upload the monitoring data to an analysis device so that the monitoring data is analyzed and an analysis result is generated;
    the arbitration device, configured to generate monitoring tasks and obtain the total number of the monitoring tasks; determine whether the total number of tasks is greater than a preset threshold, the preset threshold being an integer greater than 1; if so, deliver the monitoring tasks in groups to the task agent device corresponding to the data center, with a preset interval between the delivery times of the groups; otherwise, deliver the monitoring tasks to the task agent device all at once;
    the analysis device, configured to obtain the monitoring data sent by the task agent device and analyze the monitoring data according to a preset rule to generate the analysis result; and send the analysis result to the arbitration device so that the arbitration device can display the analysis result.
  9. A monitoring device, characterized in that it comprises:
    a memory, configured to store a computer program;
    a processor, configured to implement, when executing the computer program, the steps of the monitoring method according to any one of claims 1 to 4 and/or the steps of the monitoring method according to any one of claims 5 to 6 and/or the steps of the monitoring method according to claim 7.
  10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the monitoring method according to any one of claims 1 to 4 and/or the steps of the monitoring method according to any one of claims 5 to 6 and/or the steps of the monitoring method according to claim 7.
PCT/CN2020/073122 2019-11-15 2020-01-20 Monitoring method, system and device, and storage medium WO2021093171A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911122494.XA CN110933148A (en) 2019-11-15 2019-11-15 Monitoring method, system, equipment and storage medium
CN201911122494.X 2019-11-15

Publications (1)

Publication Number Publication Date
WO2021093171A1 true WO2021093171A1 (en) 2021-05-20

Family

ID=69854170

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/073122 WO2021093171A1 (en) 2019-11-15 2020-01-20 Monitoring method, system and device, and storage medium

Country Status (2)

Country Link
CN (1) CN110933148A (en)
WO (1) WO2021093171A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106209482A (en) * 2016-09-13 2016-12-07 郑州云海信息技术有限公司 A kind of data center monitoring method and system
CN107070744A (en) * 2017-03-22 2017-08-18 上海合志信息技术有限公司 Server monitoring method
US9864417B2 (en) * 2013-03-08 2018-01-09 International Business Machines Corporation Server rack for improved data center management
CN109787850A (en) * 2017-11-10 2019-05-21 阿里巴巴集团控股有限公司 Monitoring system, monitoring method and calculate node
CN109905492A (en) * 2019-04-24 2019-06-18 苏州浪潮智能科技有限公司 Operation safety management system and method based on distributed modular data center

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677469B (en) * 2016-01-06 2019-12-27 北京京东世纪贸易有限公司 Timed task execution method and device
CN106878111A (en) * 2017-03-15 2017-06-20 郑州云海信息技术有限公司 The cloud monitoring system and monitoring method of a kind of High Availabitity
US10536505B2 (en) * 2017-04-30 2020-01-14 Cisco Technology, Inc. Intelligent data transmission by network device agent
CN109194546A (en) * 2018-09-14 2019-01-11 郑州云海信息技术有限公司 A kind of OpenStack mainframe cluster monitoring system and method based on Grafana

Also Published As

Publication number Publication date
CN110933148A (en) 2020-03-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20886688

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20886688

Country of ref document: EP

Kind code of ref document: A1