WO2021072844A1 - Cloud platform host node monitoring method, apparatus, and computer device - Google Patents

Cloud platform host node monitoring method, apparatus, and computer device

Info

Publication number
WO2021072844A1
Authority
WO
WIPO (PCT)
Prior art keywords
nodes
host node
node
host
importance
Prior art date
Application number
PCT/CN2019/116613
Other languages
French (fr)
Chinese (zh)
Inventor
刘洪晔
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201910977025.X
Application filed by 平安科技(深圳)有限公司
Publication of WO2021072844A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0631: Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/16: Threshold monitoring
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network

Abstract

Provided is a cloud platform host node monitoring method based on cloud monitoring, comprising: obtaining the importance of each host node of a cloud platform, the importance being determined according to the adjacency similarity between each host node of the cloud platform and the neighboring nodes within two levels; the adjacency similarity is used for indicating the similarity of a whole consisting of any two host nodes and their surrounding adjacent nodes, adjacent nodes being neighboring nodes within two levels of the host node, and the adjacency similarity being inversely proportional to the importance. A critical host node of the cloud platform is determined according to the importance of the host nodes, and the alarm threshold of the critical host node is adjusted, the alarm threshold of non-critical host nodes being greater than the alarm threshold of the critical host node. Each host node of the cloud platform is monitored according to each alarm threshold, such that if the monitoring parameter of the host node is greater than the corresponding alarm threshold, an alarm is generated, the monitoring parameters comprising the device parameters and operating parameters of each host node.

Description

Host node monitoring method, apparatus, and computer device for a cloud platform
Cross-reference to related applications
This application claims priority to Chinese patent application No. 201910977025X, filed with the Chinese Patent Office on October 15, 2019 and entitled "Host node monitoring method, apparatus, and computer device for a cloud platform", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to a host node monitoring method, apparatus, computer device, and storage medium for a cloud platform.
Background
With the development of network technology and the widespread adoption of cloud platforms, a cloud platform can be connected to different servers and can handle many different kinds of services.
However, the inventor has realized that, in current cloud platform operation and maintenance, the host nodes of different business servers are all monitored in the same way, without considering the actual situation of each server host node. When a risk arises, a large number of alarms are issued at the same time, so the false alarm rate is high and the monitoring is not effective.
Summary of the invention
According to various embodiments disclosed in this application, a host node monitoring method, apparatus, computer device, and storage medium for a cloud platform are provided.
A host node monitoring method for a cloud platform, comprising:
obtaining the importance of each host node of the cloud platform, the importance being determined according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two levels; the adjacency similarity indicates the degree of similarity of the whole formed by any two of the host nodes and their surrounding adjacent nodes, the adjacent nodes being neighbor nodes within two levels of the host node; the adjacency similarity is inversely proportional to the importance;
determining the critical host nodes of the cloud platform according to the importance of the host nodes;
adjusting the alarm threshold of the critical host nodes, wherein the alarm threshold of a non-critical host node is greater than the alarm threshold of a critical host node; and
monitoring each host node of the cloud platform according to the respective alarm thresholds, so as to generate an alarm when a monitoring parameter of a host node is greater than the corresponding alarm threshold; the monitoring parameters comprise the device parameters and operating parameters of each host node.
A host node monitoring apparatus for a cloud platform, comprising:
a host node importance obtaining module, configured to obtain the importance of each host node of the cloud platform, the importance being determined according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two levels; the adjacency similarity indicates the degree of similarity of the whole formed by any two of the host nodes and their surrounding adjacent nodes, the adjacent nodes being neighbor nodes within two levels of the host node; the adjacency similarity is inversely proportional to the importance;
a critical host node determining module, configured to determine the critical host nodes of the cloud platform according to the importance of the host nodes;
an alarm threshold adjustment module, configured to adjust the alarm threshold of the critical host nodes, wherein the alarm threshold of a non-critical host node is greater than the alarm threshold of a critical host node; and
a monitoring module, configured to monitor each host node of the cloud platform according to the respective alarm thresholds, so as to generate an alarm when a monitoring parameter of a host node is greater than the corresponding alarm threshold; the monitoring parameters comprise the device parameters and operating parameters of each host node.
A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
obtaining the importance of each host node, determined according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two levels; the adjacency similarity indicates the degree of similarity of the whole formed by any two of the host nodes and their surrounding adjacent nodes, the adjacent nodes being neighbor nodes within two levels of the host node; the adjacency similarity is inversely proportional to the importance;
determining the critical host nodes of the cloud platform according to the importance of the host nodes;
adjusting the alarm threshold of the critical host nodes, wherein the alarm threshold of a non-critical host node is greater than the alarm threshold of a critical host node; and
monitoring each host node of the cloud platform according to the respective alarm thresholds, so as to generate an alarm when a monitoring parameter of a host node is greater than the corresponding alarm threshold; the monitoring parameters comprise the device parameters and operating parameters of each host node.
One or more non-volatile storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
obtaining the importance of each host node, determined according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two levels; the adjacency similarity indicates the degree of similarity of the whole formed by any two of the host nodes and their surrounding adjacent nodes, the adjacent nodes being neighbor nodes within two levels of the host node; the adjacency similarity is inversely proportional to the importance;
determining the critical host nodes of the cloud platform according to the importance of the host nodes;
adjusting the alarm threshold of the critical host nodes, wherein the alarm threshold of a non-critical host node is greater than the alarm threshold of a critical host node; and
monitoring each host node of the cloud platform according to the respective alarm thresholds, so as to generate an alarm when a monitoring parameter of a host node is greater than the corresponding alarm threshold; the monitoring parameters comprise the device parameters and operating parameters of each host node.
The details of one or more embodiments of this application are set forth in the accompanying drawings and the description below. Other features and advantages of this application will become apparent from the description, the drawings, and the claims.
Description of the drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are merely some embodiments of this application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is an application scenario diagram of a host node monitoring method for a cloud platform according to one or more embodiments.
FIG. 2 is a schematic flowchart of a host node monitoring method for a cloud platform according to one or more embodiments.
FIG. 3 is a schematic flowchart of determining the adjacency similarity between each host node of a cloud platform and its neighbor nodes within two levels according to one or more embodiments.
FIG. 4 is a schematic diagram of the host node connection relationships used by the host node monitoring method for a cloud platform in one embodiment.
FIG. 5 is a block diagram of a host node monitoring apparatus for a cloud platform according to one or more embodiments.
FIG. 6 is a block diagram of a computer device according to one or more embodiments.
Detailed description
To make the technical solutions and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain this application and are not intended to limit it.
The host node monitoring method for a cloud platform provided by this application can be applied in the application environment shown in FIG. 1. The cloud platform 102 communicates with the server 104 over a network, and multiple host nodes 106 are deployed in the connection network of the cloud platform 102. The server 104 obtains the importance of each host node 106 in the cloud platform 102, where the importance is determined according to the adjacency similarity between each host node 106 in the cloud platform 102 and its neighbor nodes within two levels. The adjacency similarity indicates the degree of similarity of the whole formed by any two host nodes 106 and their surrounding adjacent nodes, the adjacent nodes being neighbor nodes within two levels of the host node 106, and the adjacency similarity is inversely proportional to the importance. The server 104 determines the critical host nodes in the cloud platform 102 according to the importance of the host nodes 106. The server 104 then adjusts the alarm threshold of the critical host nodes, so that the alarm threshold of a non-critical host node is greater than that of a critical host node, and monitors each host node of the cloud platform according to the respective alarm thresholds, generating an alarm when a monitoring parameter of a host node is greater than the corresponding alarm threshold. The monitoring parameters include the device parameters and operating parameters of each host node. The host nodes deployed in the connection network of the cloud platform 102 include, but are not limited to, personal computers and notebook computers. The server 104 may be implemented as a standalone server or as a server cluster composed of multiple servers.
In one embodiment, as shown in FIG. 2, a host node monitoring method for a cloud platform is provided. Taking the application of the method to the server in FIG. 1 as an example, the method includes the following steps:
Step S202: obtain the importance of each host node of the cloud platform, the importance being determined according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two levels. The adjacency similarity indicates the degree of similarity of the whole formed by any two host nodes and their surrounding adjacent nodes, the adjacent nodes being neighbor nodes within two levels of the host node, and the adjacency similarity is inversely proportional to the importance.
Specifically, the node parameters of each host node in the connection network of the cloud platform are obtained, the adjacency similarity between each host node and its neighbor nodes within two levels is calculated from the node parameters, and the importance of each host node is then determined from the adjacency similarity. The adjacency similarity is inversely proportional to the importance.
Multiple host nodes are deployed in the connection network of the cloud platform. Two host nodes are adjacent in this network when their security group policies allow the two hosts to access each other. The adjacency similarity expresses how similar the wholes formed by two nodes and their surrounding adjacent nodes are. Within the cloud platform, if a host node has many neighbor nodes and the topological attributes of those neighbors overlap little, that is, if the adjacency similarity between the host node and its neighbor nodes within two levels is low, the host node is harder to replace and its importance is higher; it is a critical node that requires focused monitoring. The importance of each host node in the cloud platform can therefore be determined by calculating the adjacency similarity between each host node and its neighbor nodes within two levels.
A security group is a logical grouping. Basic-network cloud server instances or elastic network interface instances in the same region that share the same network isolation requirements can be added to the same security group, and the security group policy filters the inbound and outbound traffic of those instances. A security group policy is configured per security group and is the set of rules that governs all security-related activity within a security zone, where a security zone is a collection of processing and communication resources belonging to one organization.
Step S204: determine the critical host nodes of the cloud platform according to the importance of the host nodes.
Specifically, the host nodes are sorted by importance to generate a node importance list, and the first N host nodes in the list are taken as the critical host nodes. N is determined by the number of host nodes in the connection network of the cloud platform. A host node with lower adjacency similarity has higher importance, so sorting in descending order yields a list of host nodes from highest to lowest importance.
N can be set automatically from the number of host nodes, or customized and modified according to user requirements. For example, when the connection network contains 100 host nodes, the first 10 host nodes in the sorted list can be taken as critical nodes; as the number of host nodes grows, the number of critical nodes increases automatically according to the configured ratio.
In other embodiments, the host nodes may instead be sorted in the reverse direction, yielding a list ordered from lowest to highest importance. The rule for setting N is unchanged, and the last N host nodes of the node importance list are taken as the critical host nodes.
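The selection in step S204 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `select_critical_nodes`, the `key_percent` parameter, and the choice of ceiling rounding are assumptions; only the 10-out-of-100 ratio comes from the example in the text.

```python
def select_critical_nodes(importance, key_percent=10):
    """Return the top-N host nodes by importance, most important first.

    `importance` maps a host node id to its importance score.
    `key_percent` is a hypothetical knob mirroring the 10/100 example.
    """
    # Rank host nodes from highest to lowest importance.
    ranked = sorted(importance, key=importance.get, reverse=True)
    if not ranked:
        return []
    # N scales with the number of host nodes (ceiling division, at least 1).
    n = max(1, (len(ranked) * key_percent + 99) // 100)
    return ranked[:n]


# 100 hypothetical host nodes with made-up importance scores.
scores = {f"host-{i}": float(i) for i in range(100)}
critical = select_critical_nodes(scores)
print(len(critical), critical[0])
```

Sorting in ascending order and taking the last N nodes, as in the alternative embodiment, selects the same set.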
Step S206: adjust the alarm threshold of the critical host nodes, wherein the alarm threshold of a non-critical host node is greater than the alarm threshold of a critical host node.
Specifically, the original alarm threshold and the rank of each critical host node are obtained, and the original alarm threshold of each critical host node is adjusted according to its rank to obtain the adjusted alarm threshold. The higher a critical host node ranks, the lower its adjusted alarm threshold. The alarm threshold of a non-critical host node is greater than that of a critical host node, and the alarm thresholds of non-critical host nodes remain unchanged during the adjustment.
Before the adjustment, all host nodes share the same original alarm threshold, so when a risk arises, alarms are triggered for many host nodes at once. The original alarm threshold of each critical host node is obtained and adjusted according to the node's rank in the node importance list; the adjustment may lower or raise the original threshold. The higher a critical host node ranks in the node importance list, the smaller its alarm threshold.
For example, the alarm threshold of the first critical host node in the node importance list can be changed from the uniform threshold to the lowest threshold. With 100 host nodes and a uniform original alarm threshold of 100, the threshold of the first critical node can be set to the lowest value, 1, and the thresholds of the subsequent critical nodes are increased in rank order.
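The rank-based adjustment in step S206 can be sketched as below. The text gives only the endpoints of the example (uniform threshold 100, lowest value 1) and says later critical nodes get successively larger thresholds; the fixed increment `step` and the function name `adjust_thresholds` are assumptions made for illustration.

```python
def adjust_thresholds(all_nodes, critical_ranked, original=100, lowest=1, step=1):
    """Map every host node to an alarm threshold; critical nodes get lower ones.

    `critical_ranked` lists critical host nodes from most to least important.
    `step` (hypothetical) is the increment between consecutive ranks.
    """
    # Non-critical nodes keep the original uniform threshold.
    thresholds = {node: original for node in all_nodes}
    for rank, node in enumerate(critical_ranked):
        value = lowest + rank * step
        # Keep every critical threshold strictly below the non-critical one.
        if value >= original:
            raise ValueError("critical threshold would reach the non-critical threshold")
        thresholds[node] = value
    return thresholds


nodes = [f"host-{i}" for i in range(100)]
thresholds = adjust_thresholds(nodes, critical_ranked=nodes[:10])
print(thresholds["host-0"], thresholds["host-9"], thresholds["host-10"])
```

Raising an error when a critical threshold would reach the original value is one way to enforce the stated constraint; capping the value would be another.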
Step S208: monitor each host node of the cloud platform according to the respective alarm thresholds, so as to generate an alarm when a monitoring parameter of a host node is greater than the corresponding alarm threshold. The monitoring parameters include the device parameters and operating parameters of each host node.
Specifically, while the cloud platform is running, the monitoring parameters of each host node are monitored in real time. The monitoring parameters include the device parameters and operating parameters of the host node. Device parameters are the parameters of the device's components, such as memory, motherboard, and graphics card parameters; operating parameters include, for example, running time and running status. When a monitoring parameter of a host node exceeds the configured alarm threshold, an alarm is generated for that host node.
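The comparison in step S208 reduces to a per-node threshold check. The sketch below assumes each monitoring parameter has already been normalized to a number comparable with the node's threshold; that normalization, the metric name `memory_load`, and the function name `check_alarms` are hypothetical and not taken from the text.

```python
def check_alarms(readings, thresholds):
    """Return (node, metric, value) for every reading above the node's threshold.

    `readings` maps node id -> {metric name: numeric value}.
    `thresholds` maps node id -> alarm threshold for that node.
    """
    alarms = []
    for node, metrics in readings.items():
        limit = thresholds[node]
        for name, value in metrics.items():
            # An alarm is raised only when the parameter exceeds the threshold.
            if value > limit:
                alarms.append((node, name, value))
    return alarms


thresholds = {"host-0": 1, "host-50": 100}   # critical vs. non-critical node
readings = {
    "host-0": {"memory_load": 5},
    "host-50": {"memory_load": 5},
}
alarms = check_alarms(readings, thresholds)
print(alarms)
```

The same reading triggers an alarm on the critical node but not on the non-critical one, which is the differentiation the method aims for.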
In the host node monitoring method described above, the importance of each host node of the cloud platform is obtained from the adjacency similarity between the host node and its neighbor nodes within two levels, so the differing importance of host nodes in the cloud platform's connection network is taken into account. The alarm thresholds of the critical host nodes identified by importance are adjusted, and every host node is then monitored against its own threshold. This strengthens monitoring of the more important critical nodes: an alarm is generated promptly when a monitoring parameter of a host node exceeds the corresponding threshold, while the situation in which many alarms fire at once and important critical nodes cannot be handled in time is avoided, which lowers the false alarm rate.
In one embodiment, as shown in FIG. 3, the step of determining the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two levels includes the following steps S302 to S310:
Step S302: obtain a host node.
Step S304: take two of the neighboring nodes of the host node as first-order neighboring nodes of the host node.
Step S306: take all neighboring nodes of the first-order neighboring nodes as second-order neighboring nodes of the host node, and obtain the number of second-order neighboring nodes.
Specifically, the connection relationships of the host nodes in the connection network of the cloud platform are shown in FIG. 4. Taking host nodes a, b, and c as an example, b and c are first-order neighboring nodes of a. Nodes 1, 2, 6, and 7 are first-order neighboring nodes of node b and second-order neighboring nodes of node a; nodes 3, 4, 5, 6, and 7 are first-order neighboring nodes of node c and second-order neighboring nodes of node a. Host node a therefore has 2 first-order neighboring nodes and 7 second-order neighboring nodes.
Step S308: from the second-order neighboring nodes, determine the common neighboring nodes of the first-order neighboring nodes, and obtain the number of common neighboring nodes.
Specifically, referring to FIG. 4, nodes 1, 2, 6, and 7 are first-order neighboring nodes of node b and nodes 3, 4, 5, 6, and 7 are first-order neighboring nodes of node c, all of them being second-order neighboring nodes of node a. Nodes 6 and 7 are therefore the common neighboring nodes of node b and node c, the two selected first-order neighbors of node a, and the number of common neighboring nodes is 2.
Step S310: determine the adjacency similarity corresponding to the two selected neighboring nodes from the number of common neighboring nodes and the number of second-order neighboring nodes, where the adjacency similarity is the ratio of the number of common neighboring nodes to the number of second-order neighboring nodes.
Specifically, referring to FIG. 4, the second-order neighboring nodes of host node a are nodes 1, 2, 3, 4, 5, 6, and 7, seven in total, and the common neighboring nodes of node b and node c are nodes 6 and 7, two in total. The adjacency similarity of nodes b and c is the ratio of the number of common neighboring nodes to the number of second-order neighboring nodes, that is, 2/7.
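Steps S302 to S310 can be checked with a small set-based sketch. The adjacency dictionary below is reconstructed from the FIG. 4 description in the text (only the entries needed here); excluding the host node itself from both neighbor sets is an interpretation of that description, and the names `GRAPH` and `adjacency_similarity` are illustrative.

```python
from fractions import Fraction

# Adjacency sets reconstructed from the FIG. 4 description (assumed layout).
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "1", "2", "6", "7"},
    "c": {"a", "3", "4", "5", "6", "7"},
}


def adjacency_similarity(graph, host, u, v):
    """Ratio of common neighbors of u and v to the host's second-order neighbors.

    u and v are two first-order neighbors of `host`; the host itself is
    excluded from both neighbor sets (an assumption, see lead-in).
    """
    neighbors_u = graph[u] - {host}
    neighbors_v = graph[v] - {host}
    second_order = neighbors_u | neighbors_v   # S306: second-order neighbors
    common = neighbors_u & neighbors_v         # S308: common neighbors
    if not second_order:
        return Fraction(0)
    return Fraction(len(common), len(second_order))  # S310: the ratio


similarity_bc = adjacency_similarity(GRAPH, "a", "b", "c")
print(similarity_bc)
```

Using `Fraction` keeps the result exact, so it can be compared directly with the 2/7 worked out in the text.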
When a host node has many adjacent nodes and the topological attributes of its neighbors overlap little, the host node is harder to replace and its importance is higher; it is a critical host node that requires focused monitoring. Because the cloud platform is closely coupled to the server host nodes through its business, the proportion of common adjacent nodes is calculated and used as the measure of adjacency similarity. This measure better fits the neighborhood structure of the network formed by the cloud platform and the host nodes, and therefore better fits the cloud platform's business scenario.
In the steps above, the adjacency similarity corresponding to the two selected neighboring nodes is determined from the ratio of the number of common neighboring nodes to the number of second-order neighboring nodes. Critical nodes of high importance can then be identified from the adjacency similarity and monitored with priority, which further strengthens the monitoring of host nodes in the cloud platform.
In one embodiment, the step of determining the importance of each host node according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two levels includes:
traversing the neighboring nodes of the host node, combining any two neighboring nodes of the host node, and calculating the adjacency similarity corresponding to each combination; then subtracting each adjacency similarity from a preset value to obtain the difference corresponding to each adjacency similarity, and summing all the differences to obtain the importance of the host node.
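Written as formulas, the procedure just described might read as follows. The notation is introduced here for illustration and does not appear in the source: N(x) is the set of neighboring nodes of x, v is the host node, c is the preset value (left unspecified in the text), and the host node is assumed to be excluded from both neighbor sets.

```latex
\[
s_v(u,w) \;=\; \frac{\bigl|\,(N(u)\cap N(w))\setminus\{v\}\,\bigr|}
                    {\bigl|\,(N(u)\cup N(w))\setminus\{v\}\,\bigr|},
\qquad
I(v) \;=\; \sum_{\substack{\{u,w\}\subseteq N(v)\\ u\neq w}} \bigl(c - s_v(u,w)\bigr)
\]
```

Under this reading, a host node with many neighbor pairs and little overlap between their neighborhoods accumulates many large terms, which matches the statement that low adjacency similarity means high importance.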
Specifically, the connection relationships among the host nodes in the cloud platform's connection network are shown in FIG. 4. Referring to FIG. 4, the adjacent nodes of host node a are nodes b, c, and d. Nodes 1, 2, 6, and 7 are first-order adjacent nodes of node b and second-order adjacent nodes of node a; nodes 3, 4, 5, 6, and 7 are first-order adjacent nodes of node c and second-order adjacent nodes of node a; nodes 6 and 8 are first-order adjacent nodes of node d and second-order adjacent nodes of node a. Combining the first-order adjacent nodes b, c, and d of node a two at a time gives three pairs: (b, c), (b, d), and (c, d).
For nodes b and c, the second-order adjacent nodes of host node a are nodes 1, 2, 3, 4, 5, 6, and 7, seven in total, and the common adjacent nodes of node b and node c are nodes 6 and 7, two in total. The adjacency similarity of nodes b and c is the ratio of the number of common adjacent nodes to the number of second-order adjacent nodes, namely 2/7.
For nodes b and d, the second-order adjacent nodes of host node a are nodes 1, 2, 6, 7, and 8, five in total, and the only common adjacent node of node b and node d is node 6. The adjacency similarity of nodes b and d is the ratio of the number of common adjacent nodes to the number of second-order adjacent nodes, namely 1/5.
For nodes c and d, the second-order adjacent nodes of host node a are nodes 3, 4, 5, 6, 7, and 8, six in total, and the only common adjacent node of node c and node d is node 6. The adjacency similarity of nodes c and d is the ratio of the number of common adjacent nodes to the number of second-order adjacent nodes, namely 1/6.
Further, the preset value in this embodiment is 1. Subtracting each adjacency similarity from the preset value 1 gives the difference corresponding to that similarity, and summing all the differences gives the importance of the host node. The importance of host node a is therefore (1-2/7)+(1-1/5)+(1-1/6)≈2.35.
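The worked example can be reproduced with a short Python sketch. The adjacency list below is reconstructed from the textual description of FIG. 4, and the names `GRAPH`, `adjacency_similarity`, and `importance` are hypothetical. The sketch assumes the host node itself is excluded from each neighbor set, which is what the counts in the example imply.

```python
from itertools import combinations

# Hypothetical adjacency list reconstructed from the FIG. 4 description.
# Only the host node and its first-order neighbors need entries here.
GRAPH = {
    "a": {"b", "c", "d"},
    "b": {"a", "1", "2", "6", "7"},
    "c": {"a", "3", "4", "5", "6", "7"},
    "d": {"a", "6", "8"},
}


def adjacency_similarity(graph, host, u, v):
    """Common neighbors of u and v divided by all second-order
    neighbors of the host reached through u and v."""
    # Assumption: the host itself is not counted as a second-order neighbor.
    neighbors_u = graph[u] - {host}
    neighbors_v = graph[v] - {host}
    second_order = neighbors_u | neighbors_v
    if not second_order:
        return 0.0  # assumption: no second-order neighbors means no overlap
    return len(neighbors_u & neighbors_v) / len(second_order)


def importance(graph, host, preset=1.0):
    """Sum of (preset - similarity) over every pair of first-order neighbors."""
    pairs = combinations(sorted(graph[host]), 2)
    return sum(preset - adjacency_similarity(graph, host, u, v) for u, v in pairs)


print(round(importance(GRAPH, "a"), 2))  # 2.35
```

The three pairs contribute 5/7, 4/5, and 5/6, matching the sum given in the embodiment.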
In the above steps, the adjacent nodes of the host node are traversed, every two adjacent nodes of the host node are combined into a pair, and the adjacency similarity corresponding to each pair is calculated. Each pair's adjacency similarity is subtracted from the preset value to obtain the corresponding difference, and all the differences are summed to obtain the importance of the host node. This makes it possible to subsequently sort the host nodes by importance and obtain the key host nodes, which improves efficiency.
It should be understood that although the steps in the flowcharts of FIGS. 2-3 are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, there is no strict order restriction on these steps, and they may be performed in other orders. Moreover, at least some of the steps in FIGS. 2-3 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times; nor are they necessarily executed one after another, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 5, a host node monitoring apparatus for a cloud platform is provided, including a host node importance acquisition module 502, a key host node determination module 504, an alarm threshold adjustment module 506, and a monitoring module 508, wherein:
The host node importance acquisition module 502 is configured to acquire the importance of each host node of the cloud platform, the importance being determined according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two hops. The adjacency similarity represents the degree of similarity between the wholes formed by any two host nodes and their surrounding adjacent nodes, the adjacent nodes being the neighbor nodes within two hops of the host node; the adjacency similarity is inversely proportional to the importance.
The key host node determination module 504 is configured to determine the key host nodes of the cloud platform according to the importance of the host nodes.
The alarm threshold adjustment module 506 is configured to adjust the alarm thresholds of the key host nodes, wherein the alarm threshold of a non-key host node is greater than the alarm threshold of a key host node.
The monitoring module 508 is configured to monitor each host node of the cloud platform according to the alarm thresholds, so as to generate an alarm when a monitored parameter of a host node is found to exceed the corresponding alarm threshold; the monitored parameters include the device parameters and operating parameters of each host node.
The above host node monitoring apparatus acquires the importance of each host node of the cloud platform, the importance being determined according to the adjacency similarity between each host node and its neighbor nodes within two hops. It thereby takes into account, and distinguishes between, the different degrees of importance of the host nodes in the cloud platform's connection network, adjusts the alarm thresholds of the key host nodes determined from the importance, and then monitors each host node of the cloud platform according to the alarm thresholds. This strengthens the monitoring of the more important key nodes: an alarm is generated promptly when a monitored parameter of a host node exceeds the corresponding alarm threshold, which avoids the situation where many alarms are raised at once and the important key nodes cannot be handled in time, and reduces the false alarm rate.
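The comparison performed by the monitoring module can be sketched in Python as follows. This is a minimal illustration, not the apparatus itself: the function name `check_alarms`, the host names, the parameter name `cpu_usage_percent`, and the threshold values are all assumptions made for the example.

```python
def check_alarms(metrics, thresholds):
    """Return (host, parameter, value) for every reading above its threshold."""
    alarms = []
    for host, readings in metrics.items():
        for name, value in readings.items():
            # Alarm only when the monitored parameter exceeds the threshold.
            if value > thresholds[host][name]:
                alarms.append((host, name, value))
    return alarms


# Hypothetical thresholds: the key host has the lower (stricter) threshold.
thresholds = {
    "key-host": {"cpu_usage_percent": 70},
    "normal-host": {"cpu_usage_percent": 90},
}
metrics = {
    "key-host": {"cpu_usage_percent": 75},
    "normal-host": {"cpu_usage_percent": 75},
}

print(check_alarms(metrics, thresholds))  # [('key-host', 'cpu_usage_percent', 75)]
```

With identical readings, only the key host raises an alarm, because its threshold is lower than that of the non-key host.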
In one embodiment, the host node importance acquisition module is further configured to:
Acquire the node parameters of each host node in the connection network of the cloud platform; calculate, according to the node parameters, the adjacency similarity between each host node and its neighbor nodes within two hops; and determine the importance of each host node according to the adjacency similarity.
The above host node importance acquisition module acquires the node parameters of each host node in the cloud platform's connection network, calculates from the node parameters the adjacency similarity between each host node and its neighbor nodes within two hops, and then determines the importance of each host node from the adjacency similarity. This makes it possible to subsequently sort the host nodes by importance and obtain the key host nodes, which improves efficiency.
In one embodiment, the key host node determination module is further configured to:
Sort the host nodes according to their importance to generate a node importance list; take the first N host nodes in the node importance list as the key host nodes, where N is determined according to the number of host nodes in the connection network of the cloud platform.
The above key host node determination module sorts the host nodes according to their importance to generate a node importance list and takes the first N host nodes in the list as the key host nodes, so that the key host nodes can be monitored more strictly and alarms can be raised promptly.
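The ranking and top-N selection might look like the following Python sketch. The application only states that N is determined from the number of host nodes; the percentage rule used here (`top_percent`), the function name `select_key_hosts`, and the sample scores are assumptions chosen for illustration.

```python
def select_key_hosts(importance_by_host, top_percent=20):
    """Rank hosts by descending importance and keep the first N.

    Assumption: N is a fixed percentage of the host count, with at
    least one key host; the application does not specify the rule.
    """
    ranked = sorted(importance_by_host, key=importance_by_host.get, reverse=True)
    n = max(1, len(ranked) * top_percent // 100)
    return ranked[:n]


# Hypothetical importance scores for five host nodes.
scores = {"a": 2.35, "b": 1.2, "c": 0.8, "d": 0.5, "e": 0.1}

print(select_key_hosts(scores, top_percent=40))  # ['a', 'b']
```

The returned list preserves the ranking order, so it can also serve as the ordering used when thresholds are adjusted per key host.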
In one embodiment, a host node monitoring apparatus for a cloud platform is provided, further including an adjacency similarity calculation module configured to:
Acquire a host node; take two of the adjacent nodes of the host node as first-order adjacent nodes of the host node; take all adjacent nodes of the first-order adjacent nodes as second-order adjacent nodes of the host node, and obtain the number of second-order adjacent nodes; determine, from the second-order adjacent nodes, the common adjacent nodes of the first-order adjacent nodes, and obtain the number of common adjacent nodes; and determine, according to the number of common adjacent nodes and the number of second-order adjacent nodes, the adjacency similarity corresponding to the two selected adjacent nodes, wherein the adjacency similarity is the ratio of the number of common adjacent nodes to the number of second-order adjacent nodes.
The above host node monitoring apparatus determines the adjacency similarity of the two selected adjacent nodes from the ratio of the number of common adjacent nodes to the number of second-order adjacent nodes. Key nodes of high importance can then be identified from the adjacency similarity and monitored closely, which further strengthens the monitoring of each host node in the cloud platform.
In one embodiment, the host node importance acquisition module is further configured to:
Traverse the adjacent nodes of the host node, combine every two adjacent nodes of the host node into a pair, and calculate the adjacency similarity corresponding to each pair; subtract each adjacency similarity from a preset value to obtain the difference corresponding to that adjacency similarity, and sum all the differences to obtain the importance of the host node.
The above host node importance acquisition module traverses the adjacent nodes of the host node, combines every two adjacent nodes into a pair, calculates the adjacency similarity corresponding to each pair, subtracts each pair's adjacency similarity from the preset value to obtain the corresponding difference, and sums all the differences to obtain the importance of the host node. This makes it possible to subsequently sort the host nodes by importance and obtain the key host nodes, which improves efficiency.
In one embodiment, the alarm threshold adjustment module is further configured to:
Acquire the original alarm threshold of each key host node; acquire the ranking of each key host node;
Adjust the original alarm threshold of each key host node according to its ranking to obtain an adjusted alarm threshold, wherein the higher a key host node ranks, the lower its adjusted alarm threshold.
The above alarm threshold adjustment module acquires the original alarm threshold and the ranking of each key host node and adjusts the original alarm threshold according to the ranking to obtain an adjusted alarm threshold. The adjusted thresholds allow each key host node to be monitored in real time and alarms to be raised promptly, which reduces the risk of host node failures on the cloud platform.
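One way to realize "the higher the rank, the lower the adjusted threshold" is a linear reduction per rank, sketched below in Python. The application does not give an adjustment formula, so the linear rule, the `step` and `floor` parameters, the function name `adjust_thresholds`, and the percentage values are all assumptions.

```python
def adjust_thresholds(ranked_key_hosts, original, step=5, floor=50):
    """Lower each key host's threshold; an earlier rank gets a lower threshold.

    Assumption: thresholds are integer percentages and are reduced by
    `step` for every rank position counted from the end of the list,
    never dropping below `floor`.
    """
    count = len(ranked_key_hosts)
    adjusted = {}
    for rank, host in enumerate(ranked_key_hosts):
        reduction = step * (count - rank)
        adjusted[host] = max(floor, original[host] - reduction)
    return adjusted


# Hypothetical original thresholds (CPU usage, percent) for two key hosts,
# listed in ranking order: "a" is the most important.
original = {"a": 90, "b": 90}

print(adjust_thresholds(["a", "b"], original))  # {'a': 80, 'b': 85}
```

Both adjusted values stay below the original 90, so a non-key host that keeps the original threshold still has a higher threshold than any key host, as the embodiment requires.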
For the specific limitations of the host node monitoring apparatus for a cloud platform, refer to the limitations of the host node monitoring method for a cloud platform above; they are not repeated here. Each module of the above host node monitoring apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device stores host node monitoring data. The network interface of the computer device communicates with external terminals through a network connection. When executed by the processor, the computer-readable instructions implement a host node monitoring method for a cloud platform.
Those skilled in the art will understand that the structure shown in FIG. 6 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
Acquiring the importance of each host node of the cloud platform, the importance being determined according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two hops; the adjacency similarity represents the degree of similarity between the wholes formed by any two host nodes and their surrounding adjacent nodes, the adjacent nodes being the neighbor nodes within two hops of the host node; the adjacency similarity is inversely proportional to the importance;
Determining the key host nodes of the cloud platform according to the importance of the host nodes;
Adjusting the alarm thresholds of the key host nodes, wherein the alarm threshold of a non-key host node is greater than the alarm threshold of a key host node; and
Monitoring each host node of the cloud platform according to the alarm thresholds, so as to generate an alarm when a monitored parameter of a host node is found to exceed the corresponding alarm threshold; the monitored parameters include the device parameters and operating parameters of each host node.
In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:
Acquiring the node parameters of each host node in the connection network of the cloud platform;
Calculating, according to the node parameters, the adjacency similarity between each host node and its neighbor nodes within two hops; and
Determining the importance of each host node according to the adjacency similarity.
In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:
Sorting the host nodes according to their importance to generate a node importance list; and
Taking the first N host nodes in the node importance list as the key host nodes, where N is determined according to the number of host nodes in the connection network of the cloud platform.
In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:
Acquiring a host node;
Taking two of the adjacent nodes of the host node as first-order adjacent nodes of the host node;
Taking all adjacent nodes of the first-order adjacent nodes as second-order adjacent nodes of the host node, and obtaining the number of second-order adjacent nodes;
Determining, from the second-order adjacent nodes, the common adjacent nodes of the first-order adjacent nodes, and obtaining the number of common adjacent nodes; and
Determining, according to the number of common adjacent nodes and the number of second-order adjacent nodes, the adjacency similarity corresponding to the two selected adjacent nodes, wherein the adjacency similarity is the ratio of the number of common adjacent nodes to the number of second-order adjacent nodes.
In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:
Traversing the adjacent nodes of the host node, combining every two adjacent nodes of the host node into a pair, and calculating the adjacency similarity corresponding to each pair; and
Subtracting each adjacency similarity from a preset value to obtain the difference corresponding to that adjacency similarity, and summing all the differences to obtain the importance of the host node.
In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:
Acquiring the original alarm threshold of each key host node;
Acquiring the ranking of each key host node; and
Adjusting the original alarm threshold of each key host node according to its ranking to obtain an adjusted alarm threshold, wherein the higher a key host node ranks, the lower its adjusted alarm threshold.
One or more non-volatile storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
Acquiring the importance of each host node of the cloud platform, the importance being determined according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two hops; the adjacency similarity represents the degree of similarity between the wholes formed by any two host nodes and their surrounding adjacent nodes, the adjacent nodes being the neighbor nodes within two hops of the host node; the adjacency similarity is inversely proportional to the importance;
Determining the key host nodes of the cloud platform according to the importance of the host nodes;
Adjusting the alarm thresholds of the key host nodes, wherein the alarm threshold of a non-key host node is greater than the alarm threshold of a key host node; and
Monitoring each host node of the cloud platform according to the alarm thresholds, so as to generate an alarm when a monitored parameter of a host node is found to exceed the corresponding alarm threshold; the monitored parameters include the device parameters and operating parameters of each host node.
In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:
Acquiring the node parameters of each host node in the connection network of the cloud platform;
Calculating, according to the node parameters, the adjacency similarity between each host node and its neighbor nodes within two hops; and
Determining the importance of each host node according to the adjacency similarity.
In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:
Sorting the host nodes according to their importance to generate a node importance list; and
Taking the first N host nodes in the node importance list as the key host nodes, where N is determined according to the number of host nodes in the connection network of the cloud platform.
In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:
Acquiring a host node;
Taking two of the adjacent nodes of the host node as first-order adjacent nodes of the host node;
Taking all adjacent nodes of the first-order adjacent nodes as second-order adjacent nodes of the host node, and obtaining the number of second-order adjacent nodes;
Determining, from the second-order adjacent nodes, the common adjacent nodes of the first-order adjacent nodes, and obtaining the number of common adjacent nodes; and
Determining, according to the number of common adjacent nodes and the number of second-order adjacent nodes, the adjacency similarity corresponding to the two selected adjacent nodes, wherein the adjacency similarity is the ratio of the number of common adjacent nodes to the number of second-order adjacent nodes.
In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:
Traversing the adjacent nodes of the host node, combining every two adjacent nodes of the host node into a pair, and calculating the adjacency similarity corresponding to each pair; and
Subtracting each adjacency similarity from a preset value to obtain the difference corresponding to that adjacency similarity, and summing all the differences to obtain the importance of the host node.
In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:
Acquiring the original alarm threshold of each key host node;
Acquiring the ranking of each key host node; and
Adjusting the original alarm threshold of each key host node according to its ranking to obtain an adjusted alarm threshold, wherein the higher a key host node ranks, the lower its adjusted alarm threshold.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be completed by computer-readable instructions directing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art may make several modifications and improvements without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the scope of protection of this patent application shall be subject to the appended claims.

Claims (20)

  1. 一种云平台的主机节点监控方法,包括:A method for monitoring host nodes of a cloud platform includes:
    获取云平台各主机节点的重要度,所述重要度根据所述云平台各主机节点与其两阶内的邻居节点之间邻接相似度确定;所述邻接相似度用于表示任意两个所述主机节点与其周围邻接节点组成的整体的相似程度,所述邻接节点为所述主机节点两阶内的邻居节点;所述邻接相似度与所述重要度成反比;Obtain the importance of each host node of the cloud platform, the importance is determined according to the adjacent similarity between each host node of the cloud platform and its neighbor nodes within two levels; the adjacent similarity is used to indicate any two of the hosts The degree of similarity between a node and its surrounding neighboring nodes as a whole, where the neighboring nodes are neighbor nodes within two levels of the host node; the neighboring similarity is inversely proportional to the importance;
    根据所述主机节点的重要度确定所述云平台的关键主机节点;Determining the key host node of the cloud platform according to the importance of the host node;
    调整所述关键主机节点的告警阈值,其中,非关键主机节点的告警阈值大于所述关键主机节点的告警阈值;及Adjusting the alarm threshold of the key host node, wherein the alarm threshold of the non-key host node is greater than the alarm threshold of the key host node; and
    根据各所述告警阈值对所述云平台的各主机节点进行监控,以在监控到主机节点的监控参数大于对应告警阈值时,产生告警;所述监控参数包括各所述主机节点的设备参数和运行参数。Monitor each host node of the cloud platform according to each alarm threshold to generate an alarm when the monitored parameter of the host node is greater than the corresponding alarm threshold; the monitoring parameter includes the device parameters of each host node and Operating parameters.
  2. The method according to claim 1, wherein obtaining the importance of each host node of the cloud platform, the importance being determined according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two orders, comprises:
    obtaining node parameters of each host node in the connection network of the cloud platform;
    calculating, according to the node parameters, the adjacency similarity between each host node and its neighbor nodes within two orders; and
    determining the importance of each host node according to the adjacency similarity.
  3. The method according to claim 1, wherein determining the key host nodes of the cloud platform according to the importance of the host nodes comprises:
    sorting the host nodes according to the importance of each host node to generate a node importance list; and
    taking the first N host nodes in the node importance list as the key host nodes, N being determined according to the number of host nodes in the connection network of the cloud platform.
  4. The method according to claim 1, wherein determining the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two orders comprises:
    obtaining a host node;
    taking two of the adjacent nodes of the host node as first-order adjacent nodes of the host node;
    taking all adjacent nodes of the first-order adjacent nodes as second-order adjacent nodes of the host node, and obtaining the number of second-order adjacent nodes;
    determining, from among the second-order adjacent nodes, the common adjacent nodes of the first-order adjacent nodes, and obtaining the number of common adjacent nodes; and
    determining, according to the number of common adjacent nodes and the number of second-order adjacent nodes, the adjacency similarity corresponding to the two selected adjacent nodes, wherein the adjacency similarity is the ratio of the number of common adjacent nodes to the number of second-order adjacent nodes.
  5. The method according to claim 4, wherein determining the importance of each host node according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two orders comprises:
    traversing the adjacent nodes of the host node, combining any two adjacent nodes of the host node, and calculating the adjacency similarity corresponding to the host node and each of its adjacent nodes; and
    subtracting each adjacency similarity from a preset value to obtain a difference corresponding to each adjacency similarity, and summing all the differences to obtain the importance of the host node.
  6. The method according to claim 1, wherein adjusting the alarm thresholds of the key host nodes comprises:
    obtaining the original alarm threshold of each key host node;
    obtaining the ranking of each key host node; and
    adjusting the original alarm thresholds of the key host nodes according to the ranking of the key host nodes to obtain adjusted alarm thresholds, wherein the earlier a key host node is ranked, the lower its adjusted alarm threshold.
  7. An apparatus for monitoring host nodes of a cloud platform, comprising:
    a host node importance obtaining module, configured to obtain the importance of each host node of the cloud platform, the importance being determined according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two orders; the adjacency similarity represents the degree of similarity between the wholes that any two of the host nodes each form with their surrounding adjacent nodes, the adjacent nodes being the neighbor nodes within two orders of the host node; the adjacency similarity is inversely proportional to the importance;
    a key host node determining module, configured to determine key host nodes of the cloud platform according to the importance of the host nodes;
    an alarm threshold adjustment module, configured to adjust the alarm thresholds of the key host nodes, wherein the alarm threshold of a non-key host node is greater than the alarm threshold of a key host node; and
    a monitoring module, configured to monitor each host node of the cloud platform according to the respective alarm thresholds, so as to generate an alarm when a monitored parameter of a host node is greater than the corresponding alarm threshold; the monitored parameters include device parameters and operating parameters of each host node.
  8. The apparatus according to claim 7, wherein the host node importance obtaining module is further configured to:
    obtain node parameters of each host node in the connection network of the cloud platform; calculate, according to the node parameters, the adjacency similarity between each host node and its neighbor nodes within two orders; and determine the importance of each host node according to the adjacency similarity.
  9. The apparatus according to claim 7, wherein the key host node determining module is further configured to:
    sort the host nodes according to the importance of each host node to generate a node importance list; and take the first N host nodes in the node importance list as the key host nodes, N being determined according to the number of host nodes in the connection network of the cloud platform.
  10. The apparatus according to claim 7, further comprising an adjacency similarity calculation module, configured to:
    obtain a host node; take two of the adjacent nodes of the host node as first-order adjacent nodes of the host node; take all adjacent nodes of the first-order adjacent nodes as second-order adjacent nodes of the host node, and obtain the number of second-order adjacent nodes; determine, from among the second-order adjacent nodes, the common adjacent nodes of the first-order adjacent nodes, and obtain the number of common adjacent nodes; and determine, according to the number of common adjacent nodes and the number of second-order adjacent nodes, the adjacency similarity corresponding to the two selected adjacent nodes, wherein the adjacency similarity is the ratio of the number of common adjacent nodes to the number of second-order adjacent nodes.
  11. The apparatus according to claim 10, wherein the host node importance obtaining module is further configured to:
    traverse the adjacent nodes of the host node, combine any two adjacent nodes of the host node, and calculate the adjacency similarity corresponding to the host node and each of its adjacent nodes; and subtract each adjacency similarity from a preset value to obtain a difference corresponding to each adjacency similarity, and sum all the differences to obtain the importance of the host node.
  12. The apparatus according to claim 7, wherein the alarm threshold adjustment module is further configured to:
    obtain the original alarm threshold of each key host node; obtain the ranking of each key host node; and adjust the original alarm thresholds of the key host nodes according to the ranking of the key host nodes to obtain adjusted alarm thresholds, wherein the earlier a key host node is ranked, the lower its adjusted alarm threshold.
  13. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    obtaining the importance of each host node of the cloud platform, the importance being determined according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two orders; the adjacency similarity represents the degree of similarity between the wholes that any two of the host nodes each form with their surrounding adjacent nodes, the adjacent nodes being the neighbor nodes within two orders of the host node; the adjacency similarity is inversely proportional to the importance;
    determining key host nodes of the cloud platform according to the importance of the host nodes;
    adjusting the alarm thresholds of the key host nodes, wherein the alarm threshold of a non-key host node is greater than the alarm threshold of a key host node; and
    monitoring each host node of the cloud platform according to the respective alarm thresholds, so as to generate an alarm when a monitored parameter of a host node is greater than the corresponding alarm threshold; the monitored parameters include device parameters and operating parameters of each host node.
  14. The computer device according to claim 13, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    obtaining node parameters of each host node in the connection network of the cloud platform;
    calculating, according to the node parameters, the adjacency similarity between each host node and its neighbor nodes within two orders; and
    determining the importance of each host node according to the adjacency similarity.
  15. The computer device according to claim 13, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    sorting the host nodes according to the importance of each host node to generate a node importance list; and
    taking the first N host nodes in the node importance list as the key host nodes, N being determined according to the number of host nodes in the connection network of the cloud platform.
  16. The computer device according to claim 13, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    obtaining a host node;
    taking two of the adjacent nodes of the host node as first-order adjacent nodes of the host node;
    taking all adjacent nodes of the first-order adjacent nodes as second-order adjacent nodes of the host node, and obtaining the number of second-order adjacent nodes;
    determining, from among the second-order adjacent nodes, the common adjacent nodes of the first-order adjacent nodes, and obtaining the number of common adjacent nodes; and
    determining, according to the number of common adjacent nodes and the number of second-order adjacent nodes, the adjacency similarity corresponding to the two selected adjacent nodes, wherein the adjacency similarity is the ratio of the number of common adjacent nodes to the number of second-order adjacent nodes.
  17. One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    obtaining the importance of each host node of the cloud platform, the importance being determined according to the adjacency similarity between each host node of the cloud platform and its neighbor nodes within two orders; the adjacency similarity represents the degree of similarity between the wholes that any two of the host nodes each form with their surrounding adjacent nodes, the adjacent nodes being the neighbor nodes within two orders of the host node; the adjacency similarity is inversely proportional to the importance;
    determining key host nodes of the cloud platform according to the importance of the host nodes;
    adjusting the alarm thresholds of the key host nodes, wherein the alarm threshold of a non-key host node is greater than the alarm threshold of a key host node; and
    monitoring each host node of the cloud platform according to the respective alarm thresholds, so as to generate an alarm when a monitored parameter of a host node is greater than the corresponding alarm threshold; the monitored parameters include device parameters and operating parameters of each host node.
  18. The storage medium according to claim 17, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed:
    obtaining node parameters of each host node in the connection network of the cloud platform;
    calculating, according to the node parameters, the adjacency similarity between each host node and its neighbor nodes within two orders; and
    determining the importance of each host node according to the adjacency similarity.
  19. The storage medium according to claim 17, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed:
    sorting the host nodes according to the importance of each host node to generate a node importance list; and
    taking the first N host nodes in the node importance list as the key host nodes, N being determined according to the number of host nodes in the connection network of the cloud platform.
  20. The storage medium according to claim 17, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed:
    obtaining a host node;
    taking two of the adjacent nodes of the host node as first-order adjacent nodes of the host node;
    taking all adjacent nodes of the first-order adjacent nodes as second-order adjacent nodes of the host node, and obtaining the number of second-order adjacent nodes;
    determining, from among the second-order adjacent nodes, the common adjacent nodes of the first-order adjacent nodes, and obtaining the number of common adjacent nodes; and
    determining, according to the number of common adjacent nodes and the number of second-order adjacent nodes, the adjacency similarity corresponding to the two selected adjacent nodes, wherein the adjacency similarity is the ratio of the number of common adjacent nodes to the number of second-order adjacent nodes.
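
As a non-limiting illustration, the steps recited in claims 1 and 3 to 6 might be sketched in Python as follows. The graph representation (a dictionary mapping each host to the set of its adjacent hosts), all function names, the preset value of 1, the rule N = ceil(fraction × number of hosts), the linear threshold reduction, and the use of integer percentage thresholds are assumptions made for this example only; the claims do not fix any of them.

```python
from itertools import combinations
from math import ceil

# Hypothetical undirected topology: host -> set of adjacent hosts.
GRAPH = {
    "hub": {"a", "b", "c"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub", "d"},
    "d": {"c"},
}


def adjacency_similarity(graph, a, b):
    """Claim 4: common adjacent nodes of (a, b) divided by all their adjacent nodes."""
    second_order = graph[a] | graph[b]  # all adjacent nodes of the two first-order nodes
    common = graph[a] & graph[b]        # adjacent nodes shared by both
    # The host itself is adjacent to both a and b, so it is counted in both sets;
    # the claims do not say whether to exclude it, and this sketch keeps it.
    return len(common) / len(second_order) if second_order else 0.0


def importance(graph, host, preset=1.0):
    """Claim 5: sum of (preset - similarity) over every pair of the host's adjacent nodes."""
    pairs = combinations(sorted(graph[host]), 2)
    return sum(preset - adjacency_similarity(graph, a, b) for a, b in pairs)


def key_hosts(graph, fraction=0.25):
    """Claim 3: sort hosts by descending importance and keep the first N."""
    ranked = sorted(graph, key=lambda h: (-importance(graph, h), h))
    n = max(1, ceil(fraction * len(ranked)))  # assumed rule for choosing N
    return ranked[:n]


def adjusted_thresholds(graph, base=90, step=10, fraction=0.25):
    """Claim 6: lower the threshold of key hosts; an earlier rank gives a lower threshold."""
    keys = key_hosts(graph, fraction)
    thresholds = {h: base for h in graph}  # non-key hosts keep the original value
    for rank, h in enumerate(keys):
        thresholds[h] = base - step * (len(keys) - rank)
    return thresholds


def alarms(metrics, thresholds):
    """Claim 1: hosts whose monitored parameter is greater than their threshold."""
    return sorted(h for h, value in metrics.items() if value > thresholds[h])
```

In this toy topology the host named hub joins two otherwise weakly connected groups, so the pairs of its adjacent nodes share few common neighbors, their adjacency similarity is low, and its importance is the highest. It therefore receives the lowest alarm threshold, and the same monitored value triggers an alarm on it earlier than on a non-key host.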
PCT/CN2019/116613 2019-10-15 2019-11-08 Cloud platform host node monitoring method, apparatus, and computer device WO2021072844A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910977025.XA CN110890977B (en) 2019-10-15 2019-10-15 Host node monitoring method and device of cloud platform and computer equipment
CN201910977025.X 2019-10-15

Publications (1)

Publication Number Publication Date
WO2021072844A1 (en) 2021-04-22

Family

ID=69746200

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/116613 WO2021072844A1 (en) 2019-10-15 2019-11-08 Cloud platform host node monitoring method, apparatus, and computer device

Country Status (2)

Country Link
CN (1) CN110890977B (en)
WO (1) WO2021072844A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114338372A (en) * 2020-09-25 2022-04-12 中国移动通信集团山东有限公司 Network information security monitoring method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102236693A (en) * 2010-04-28 2011-11-09 国际商业机器公司 Method and device for determining similarity between documents
US20170294112A1 (en) * 2016-04-06 2017-10-12 Alcatel-Lucent Usa, Inc. Alarm causality templates for network function virtualization
CN108009710A (en) * 2017-11-19 2018-05-08 国家计算机网络与信息安全管理中心 Node test importance appraisal procedure based on similarity and TrustRank algorithms
CN109194661A (en) * 2018-09-13 2019-01-11 网易(杭州)网络有限公司 Network attack alarm threshold configuration method, medium, device and calculating equipment

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060146875A1 (en) * 2005-01-04 2006-07-06 Yang Luiyang L Media access controller and methods for distributed hop-by-hop flow control in wireless mesh networks
CN102650964B (en) * 2011-02-28 2016-03-09 国际商业机器公司 For monitoring the method for OO application, system and self-monitoring system
CN102711107B (en) * 2012-05-17 2015-09-02 北京工业大学 Based on the wireless sensor network intrusion detection method of key node
CN104602288B (en) * 2015-02-13 2018-05-01 北京北交信控科技有限公司 A kind of railway GPRS network key network element equipment actively monitoring system and method
CN107360091A (en) * 2016-05-09 2017-11-17 中兴通讯股份有限公司 A kind of method and device for realizing QoS management
CN106385339B (en) * 2016-11-01 2020-02-07 上海携程商务有限公司 Monitoring method and monitoring system for access performance of enterprise network
CN106789322B (en) * 2017-01-05 2019-08-27 清华大学 The determination method and apparatus of key node in Information Network
EP3352111B1 (en) * 2017-01-24 2021-08-11 AIT Austrian Institute of Technology GmbH Method for identifying critical events
CN207095615U (en) * 2017-08-25 2018-03-13 河南瑞欧光电科技有限公司 Tunnel monitoring system based on fiber grating
CN109714180B (en) * 2017-10-26 2022-03-04 中兴通讯股份有限公司 Method for reducing redundant alarm, corresponding equipment and storage medium
CN109194703B (en) * 2018-06-29 2021-06-18 平安科技(深圳)有限公司 Processing method of communication load between cloud platform hosts, electronic device and medium

Also Published As

Publication number Publication date
CN110890977A (en) 2020-03-17
CN110890977B (en) 2022-06-21

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19949033; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 19949033; Country of ref document: EP; Kind code of ref document: A1)