CN108512673B - Cloud service quality monitoring method and device and server - Google Patents

Info

Publication number
CN108512673B
Authority
CN
China
Prior art keywords
cloud
server
cloud service
defect
reason
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710103863.5A
Other languages
Chinese (zh)
Other versions
CN108512673A (en
Inventor
马文霜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201710103863.5A
Publication of CN108512673A
Application granted
Publication of CN108512673B
Legal status: Active
Anticipated expiration

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0631Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0677Localisation of faults
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5003Managing SLA; Interaction between SLA and QoS
    • H04L41/5009Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/08Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/08Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852Delays
    • H04L43/087Jitter
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network

Abstract

Embodiments of the invention provide a cloud service quality monitoring method, a cloud service quality monitoring device, and a server. The method comprises: receiving network test detection packets directed at a server and at a cloud host in the server, respectively; determining a cloud service quality detection result of the server according to the network test detection packet for the server, and determining a cloud service quality detection result of the cloud host according to the network test detection packet for the cloud host; if the detection results of the server and the cloud host indicate that a cloud service quality defect currently exists, analyzing the operation data corresponding to the cloud service according to preset reasons for cloud service quality defects, and determining the target reason matched from the operation data; and acquiring a preset solution strategy corresponding to the target reason, and executing the solution strategy. Embodiments of the invention can improve the monitoring effect of cloud service quality and guarantee cloud service quality.

Description

Cloud service quality monitoring method and device and server
Technical Field
The invention relates to the technical field of data processing, in particular to a cloud service quality monitoring method, a cloud service quality monitoring device and a server.
Background
Cloud services are value-added services built on the internet and generally involve providing dynamic, easily scalable services over the internet; typical application scenarios include the cloud internet of things, cloud security, and cloud storage. Currently, cloud services are generally provided by cloud servers. A server can be virtualized into a plurality of virtual computers, called cloud hosts, so the server may be regarded as the carrier of its cloud hosts. Through the cloud hosts, a rental service based on the cloud computing model, used on demand and paid for on demand, can be provided, thereby realizing dynamic and easily scalable cloud services. For example, different cloud hosts may be allocated to different users, so that cloud service resources are configured on demand.
Cloud service quality refers to the quality of service of a cloud service. Good cloud service quality is important for the experience of users of the cloud service, so it is necessary to monitor cloud service quality and to resolve any cloud service quality defect when it appears. The difficulty in current cloud service quality monitoring is that cloud services involve both servers and the cloud hosts in those servers, and a quality defect may appear on a server as well as on a cloud host. As a result, it is difficult to accurately locate the reason for a cloud service quality defect, a solution strategy cannot be accurately matched, and the monitoring effect is poor.
Therefore, how to accurately locate the reason for a cloud service quality defect and provide a solution strategy matched with the located reason, so as to improve the monitoring effect of cloud service quality, has become a problem to be considered by those skilled in the art.
Disclosure of Invention
In view of this, embodiments of the present invention provide a cloud service quality monitoring method, apparatus, and server, so as to accurately locate the reason for a cloud service quality defect and provide a solution strategy matched with the located reason, thereby improving the monitoring effect of cloud service quality.
In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
A cloud service quality monitoring method, comprising:
receiving network test detection packets directed at a server and at a cloud host in the server, respectively;
determining a cloud service quality detection result of the server according to the network test detection packet for the server, and determining a cloud service quality detection result of the cloud host according to the network test detection packet for the cloud host;
if the determined cloud service quality detection results of the server and the cloud host indicate that a cloud service quality defect currently exists, analyzing the operation data corresponding to the cloud service according to preset reasons for cloud service quality defects, and determining the target reason matched from the operation data;
and acquiring a preset solution strategy corresponding to the target reason, and executing the solution strategy.
An embodiment of the present invention further provides a cloud service quality monitoring apparatus, including:
a network test detection packet receiving module, configured to receive network test detection packets directed at the server and at the cloud host in the server, respectively;
a detection result determining module, configured to determine a cloud service quality detection result of the server according to the network test detection packet for the server, and to determine a cloud service quality detection result of the cloud host according to the network test detection packet for the cloud host;
a target reason determining module, configured to, if the determined cloud service quality detection results of the server and the cloud host indicate that a cloud service quality defect currently exists, analyze the operation data corresponding to the cloud service according to preset reasons for cloud service quality defects and determine the target reason matched from the operation data;
and a solution strategy execution module, configured to acquire a preset solution strategy corresponding to the target reason and execute the solution strategy.
An embodiment of the invention further provides a server that comprises the cloud service quality monitoring apparatus described above.
Based on the above technical solution, in the cloud service quality monitoring method provided by the embodiments of the invention, the server receives network test detection packets directed at the server and at the cloud host in the server, respectively. It determines the cloud service quality detection result of the server according to the packet for the server, and that of the cloud host according to the packet for the cloud host. Once these detection results show that a cloud service quality defect currently exists, the operation data corresponding to the cloud service can be analyzed according to the preset reasons for cloud service quality defects, and the target reason for the current defect is matched, which accurately locates the reason for the defect. A preset solution strategy corresponding to the target reason is then acquired and executed, so that a solution strategy matched with the located reason is provided, the defect is resolved, the monitoring effect is improved, and cloud service quality is guaranteed.
In the embodiments of the invention, the cloud service quality of the server and the cloud host can be detected with network test detection packets such as ping detection packets. After a cloud service quality defect is found to exist, the operation data of the cloud service is analyzed against the various reasons that can cause such a defect in order to locate the target reason, and the preset solution strategy corresponding to the located reason is executed to resolve the defect.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a cloud service quality monitoring method according to an embodiment of the present invention;
Fig. 2 is another flowchart of a cloud service quality monitoring method according to an embodiment of the present invention;
Fig. 3 is a further flowchart of a cloud service quality monitoring method according to an embodiment of the present invention;
Fig. 4 is another flowchart of a cloud service quality monitoring method according to an embodiment of the present invention;
Fig. 5 is yet another flowchart of a cloud service quality monitoring method according to an embodiment of the present invention;
Fig. 6 is a further flowchart of a cloud service quality monitoring method according to an embodiment of the present invention;
Fig. 7 is another flowchart of a cloud service quality monitoring method according to an embodiment of the present invention;
Fig. 8 is a block diagram of a cloud service quality monitoring apparatus according to an embodiment of the present invention;
Fig. 9 is another block diagram of a cloud service quality monitoring apparatus according to an embodiment of the present invention;
Fig. 10 is a block diagram of a hardware configuration of the server.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a cloud service quality monitoring method according to an embodiment of the present invention. The method is applicable to a server, which may implement the method shown in Fig. 1 by means of corresponding program functions. The server may be regarded as the carrier of the cloud hosts, which are implemented as a plurality of virtual computers set up in the server. Referring to Fig. 1, the cloud service quality monitoring method may include:
Step S100, receiving network test detection packets directed at the server and at the cloud host in the server, respectively.
A network test detection packet may be used to test network information such as network connectivity; for example, it may be a ping (Packet Internet Groper) detection packet.
Optionally, in the embodiments of the present invention, a ping tool may be used to send ping detection packets to the server in the cloud and to each cloud host in the server. The ping tool may be deployed on user equipment at the user end, so that the server receives a ping detection packet for the server and a ping detection packet for each cloud host in the server.
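As an illustration only, the per-packet delays reported by a ping tool could be collected as sketched below. The patent does not specify how ping results are recorded; the function name `parse_rtts_ms`, the regular expression, and the sample output (including its address) are assumptions made for this sketch, modelled on the line format printed by common ping implementations.

```python
import re

# Assumed line format, e.g. "64 bytes from 10.0.0.5: icmp_seq=1 ttl=64 time=0.42 ms".
# The pattern only looks for the "time=<value> ms" field of each reply line.
RTT_PATTERN = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtts_ms(ping_output: str) -> list[float]:
    """Return the per-packet delays (in milliseconds) found in ping output text."""
    return [float(match.group(1)) for match in RTT_PATTERN.finditer(ping_output)]


# Hypothetical sample output for three ping detection packets.
sample_output = """\
64 bytes from 10.0.0.5: icmp_seq=1 ttl=64 time=0.42 ms
64 bytes from 10.0.0.5: icmp_seq=2 ttl=64 time=0.57 ms
64 bytes from 10.0.0.5: icmp_seq=3 ttl=64 time=12.8 ms
"""
rtts = parse_rtts_ms(sample_output)
```

The same parsing could be applied separately to the results for the server and for each cloud host, giving one delay series per monitored device.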
Step S110, determining a cloud service quality detection result of the server according to the network test detection packet for the server, and determining a cloud service quality detection result of the cloud host according to the network test detection packet for the cloud host.
Optionally, the cloud service quality detection result may be determined from the delay and jitter of the network test detection packet (e.g., a ping detection packet). The delay of a network test detection packet is the time elapsed between sending the packet and receiving it, generally measured in milliseconds; it reflects the speed of the network, and a smaller delay is better. The jitter of a network test detection packet is the variation in delay, for example the difference between the delays of two network test detection packets; it reflects the stability of the network, and a smaller jitter is better.
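The two metrics can be sketched as follows. The text defines jitter only as the variation in delay, giving the difference between two packets as an example; averaging the absolute differences of consecutive packets, as done here, is one possible reading and is an assumption of this sketch, as are the function names.

```python
from statistics import mean


def average_delay_ms(delays_ms: list[float]) -> float:
    """Average delay over a series of detection packets, in milliseconds."""
    return mean(delays_ms)


def jitter_ms(delays_ms: list[float]) -> float:
    """Mean absolute difference between the delays of consecutive packets."""
    if len(delays_ms) < 2:
        return 0.0  # a single packet carries no information about variation
    differences = [abs(later - earlier)
                   for earlier, later in zip(delays_ms, delays_ms[1:])]
    return mean(differences)


# Hypothetical delays of four consecutive detection packets.
delays = [10.0, 12.0, 11.0, 15.0]
delay = average_delay_ms(delays)   # (10 + 12 + 11 + 15) / 4
jitter = jitter_ms(delays)         # (2 + 1 + 4) / 3
```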
Optionally, the network test detection packet for the server may be processed by the server. By determining the delay and jitter of that packet, it can be determined whether the server has a delay defect and a jitter defect, which gives the cloud service quality detection result of the server.
Optionally, if the delay of the network test detection packet for the server is greater than a predetermined first delay, the server is determined to have a delay defect; otherwise (that is, the delay is not greater than the predetermined first delay), the server is determined not to have a delay defect. Likewise, if the jitter of the network test detection packet for the server is greater than a predetermined first jitter, the server is determined to have a jitter defect; otherwise (that is, the jitter is not greater than the predetermined first jitter), the server is determined not to have a jitter defect.
Optionally, the network test detection packet for each cloud host in the server may be processed by that cloud host. By determining the delay and jitter of the packet for the cloud host, it can be determined whether the cloud host has a delay defect and a jitter defect, which gives the cloud service quality detection result of the cloud host.
Optionally, if the delay of the network test detection packet for a given cloud host is greater than a predetermined second delay, the cloud host is determined to have a delay defect; otherwise (that is, the delay is not greater than the predetermined second delay), the cloud host is determined not to have a delay defect. If the jitter of the network test detection packet for a given cloud host is greater than a predetermined second jitter, the cloud host is determined to have a jitter defect; otherwise (that is, the jitter is not greater than the predetermined second jitter), the cloud host is determined not to have a jitter defect. The first and second delays, and likewise the first and second jitters, may be the same or different and may be set according to actual conditions.
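The threshold comparison above can be sketched as follows. The class names and the numeric limits are hypothetical; the text only requires that a defect be reported when a value is strictly greater than its predetermined limit, and that the server and the cloud hosts may use different limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    max_delay_ms: float
    max_jitter_ms: float


@dataclass(frozen=True)
class DetectionResult:
    delay_defect: bool
    jitter_defect: bool


def detect(delay_ms: float, jitter_ms: float, limits: Thresholds) -> DetectionResult:
    """A defect exists only when the measured value is greater than its limit."""
    return DetectionResult(
        delay_defect=delay_ms > limits.max_delay_ms,
        jitter_defect=jitter_ms > limits.max_jitter_ms,
    )


# Hypothetical limits: the "first" delay/jitter for the server,
# the "second" delay/jitter for cloud hosts.
SERVER_LIMITS = Thresholds(max_delay_ms=50.0, max_jitter_ms=10.0)
HOST_LIMITS = Thresholds(max_delay_ms=80.0, max_jitter_ms=15.0)

server_result = detect(delay_ms=42.0, jitter_ms=12.5, limits=SERVER_LIMITS)
host_result = detect(delay_ms=80.0, jitter_ms=3.0, limits=HOST_LIMITS)
```

In this example the server shows a jitter defect only, and the cloud host shows no defect because a delay equal to the limit is not greater than it.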
Step S120, if the determined cloud service quality detection results of the server and the cloud host indicate that the cloud service quality is defective currently, analyzing corresponding operation data of the cloud service according to preset reasons causing the cloud service quality defect, and determining a target reason matched from the operation data.
Optionally, after determining the cloud service quality detection result of the server and that of the cloud host, the embodiments of the present invention can tell whether the server has a delay defect and/or a jitter defect and whether the cloud host has a delay defect and/or a jitter defect, and can thus determine whether a cloud service quality defect currently exists.
Optionally, the detection results of the server and the cloud host indicate that a cloud service quality defect currently exists in any of the following cases:
the server has a delay defect and/or a jitter defect;
or a cloud host in the server has a delay defect and/or a jitter defect;
or both the server and a cloud host have a delay defect and/or a jitter defect.
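The three cases listed above reduce to a single check: a cloud service quality defect exists if any monitored device reports any defect. A minimal sketch, assuming each detection result is a dictionary with hypothetical `delay_defect` and `jitter_defect` flags:

```python
def has_defect(result: dict) -> bool:
    """True if a single device reports a delay defect and/or a jitter defect."""
    return result["delay_defect"] or result["jitter_defect"]


def quality_defect_exists(server_result: dict, host_results: dict) -> bool:
    """True if the server, any cloud host, or both report a defect."""
    return has_defect(server_result) or any(
        has_defect(result) for result in host_results.values()
    )


# Hypothetical results: the server is healthy, one cloud host shows jitter.
healthy = {"delay_defect": False, "jitter_defect": False}
jittery = {"delay_defect": False, "jitter_defect": True}
defect_found = quality_defect_exists(healthy, {"vm-1": healthy, "vm-2": jittery})
```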
Optionally, the operation data corresponding to the cloud service covers the resources that provide the cloud service, such as the network, the central processing unit (CPU), the memory, and the disks; one optional form of the operation data is an operation log.
After determining from the detection results of the server and the cloud host that a cloud service quality defect currently exists, the embodiments of the present invention need to locate the reason for the defect and provide a solution strategy for the located reason, thereby achieving the purpose of cloud service quality monitoring.
The embodiments of the present invention may sort out and analyze in advance the various reasons that can cause cloud service quality defects, so that these reasons are preset. After a defect is determined to exist, the operation data corresponding to the cloud service and the preset reasons are retrieved, the operation data is analyzed against the preset reasons, and the target reason is matched from the operation data, which locates the specific reason for the current defect. Optionally, the determined target reason is one of the preset reasons for cloud service quality defects.
Step S130, obtaining a preset solution strategy corresponding to the target reason, and executing the solution strategy.
Optionally, the embodiments of the present invention may set a corresponding solution strategy for each reason that can cause a cloud service quality defect. After the target reason is determined, the solution strategy corresponding to it is acquired from the preset solution strategies and executed, which resolves the target reason for the defect, improves the monitoring effect, and guarantees cloud service quality.
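The correspondence between reasons and preset solution strategies can be sketched as a lookup table of callables. The reason identifier, the context fields, and the logging stand-in for the real action are all hypothetical; the only strategy taken from the text is the bandwidth limit described later for the first defect form.

```python
from typing import Callable

# Records what each strategy did; stands in for real actions on the server.
applied_actions: list[str] = []


def limit_host_bandwidth(context: dict) -> None:
    """Stand-in for limiting a cloud host's bandwidth to a set range."""
    applied_actions.append(f"limit {context['host']} to {context['max_mbps']} Mbps")


# One preset solution strategy per preset reason (identifiers are hypothetical).
SOLUTION_STRATEGIES: dict[str, Callable[[dict], None]] = {
    "single_host_saturates_server_bandwidth": limit_host_bandwidth,
}


def execute_solution(target_reason: str, context: dict) -> bool:
    """Look up and run the strategy for the target reason; False if none is preset."""
    strategy = SOLUTION_STRATEGIES.get(target_reason)
    if strategy is None:
        return False
    strategy(context)
    return True


handled = execute_solution(
    "single_host_saturates_server_bandwidth", {"host": "vm-2", "max_mbps": 200}
)
```

Strategies for further reasons would be registered in the same table, so that adding a reason does not change the dispatch logic.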
In summary, in the cloud service quality monitoring method provided by the embodiments of the invention, the server receives network test detection packets (such as ping detection packets) directed at the server and at the cloud host in the server, respectively, and determines a cloud service quality detection result for each from its own packet. Once these results show that a cloud service quality defect currently exists, the operation data corresponding to the cloud service is analyzed against the preset reasons for cloud service quality defects and the target reason is matched, which accurately locates the reason for the defect. The preset solution strategy corresponding to the target reason is then acquired and executed, so that the defect is resolved, the monitoring effect is improved, and cloud service quality is guaranteed.
Optionally, when the detection results of the server and the cloud host show that a cloud service quality defect currently exists, the embodiments of the present invention may further determine the defect form of the current cloud service quality defect.
The defect form is a specific description of the current cloud service quality defect: which devices (the server and/or cloud hosts) have a delay defect and/or a jitter defect, how those devices are associated, and so on. That is, when the detection results indicate a defect, the embodiments of the present invention may further determine the devices that have the defect and the association between them, obtain a description of the current defect, and thereby determine its defect form.
Correspondingly, the embodiments of the present invention may preset the reasons that cause cloud service quality defects under each defect form. After the defect form of the current defect is determined, the operation data corresponding to the cloud service is analyzed against the preset reasons for that defect form, and the target reason matched from the operation data is determined.
Because the operation data is analyzed only against the reasons preset for the determined defect form, the amount of data processing involved in determining the target reason is reduced.
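One way to picture this narrowing is to derive a hashable defect form from the detection results and use it as a key into the preset reasons. The representation of a form as a tuple, the scope labels, and the reason identifier are assumptions of this sketch; only the form in which the server and all its cloud hosts show jitter is taken from the text.

```python
def describe_defect_form(server_defects: set[str],
                         host_defects: dict[str, set[str]]) -> tuple[bool, str]:
    """Describe a jitter-related form: (server has jitter, which hosts have jitter)."""
    jittery_hosts = [host for host, defects in host_defects.items()
                     if "jitter" in defects]
    if host_defects and len(jittery_hosts) == len(host_defects):
        host_scope = "all"
    elif jittery_hosts:
        host_scope = "some"
    else:
        host_scope = "none"
    return ("jitter" in server_defects, host_scope)


# Preset reasons per defect form (identifiers are hypothetical).
REASONS_BY_FORM: dict[tuple[bool, str], list[str]] = {
    (True, "all"): ["single_host_saturates_server_bandwidth"],
}


def candidate_reasons(form: tuple[bool, str]) -> list[str]:
    """Only the reasons preset for this form need to be checked against the data."""
    return REASONS_BY_FORM.get(form, [])


form = describe_defect_form({"jitter"},
                            {"vm-1": {"jitter"}, "vm-2": {"jitter", "delay"}})
reasons = candidate_reasons(form)
```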
Optionally, to analyze the operation data against the preset reasons, the embodiments of the present invention may preset a reason description for each reason that can cause a cloud service quality defect; a reason description may be a textual description of the reason, a description of a matching condition, or the like. After a defect is determined to exist, the operation data is checked against the preset reason descriptions, and the reason whose description matches the data content is taken as the target reason.
Optionally, cloud service quality defects may have many reasons. For example: during network concurrency, the instantaneous bandwidth of a cloud host reaches the upper limit of the network bandwidth and occupies all the bandwidth of the server; the vhost (virtual host) of the server competes with the vcpu (virtual processor), so that network packets cannot be processed; the load of the CPU (central processing unit) corresponding to a cloud host is too high, so that network packets cannot be processed in time; or the interrupts of a cloud host are distributed across the processing cores of the server. The embodiments of the present invention may sort out and analyze the various reasons as completely as possible and define a reason description for each.
Optionally, the embodiments of the present invention may also preset reason descriptions for the reasons under each defect form. When a cloud service quality defect is determined to exist, its defect form is further determined, the operation data corresponding to the cloud service is analyzed against the reason descriptions for that defect form, and the target reason matched from the operation data is determined.
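When a reason description takes the form of a matching condition, it can be sketched as a predicate over the operation data. The field names (`host_cpu_load`, `vhost_shares_core_with_vcpu`), the 0.9 load limit, and the reason identifiers are hypothetical; the two reasons themselves are among the examples given in the text.

```python
from typing import Callable, Optional

# A reason description expressed as a matching condition on the operation data.
ReasonDescription = Callable[[dict], bool]

REASON_DESCRIPTIONS: dict[str, ReasonDescription] = {
    # CPU load of the cloud host is too high to process network packets in time.
    "host_cpu_overloaded": lambda data: data.get("host_cpu_load", 0.0) > 0.9,
    # The server's vhost competes with the vcpu.
    "vhost_vcpu_contention": lambda data: data.get("vhost_shares_core_with_vcpu", False),
}


def match_target_reason(operation_data: dict) -> Optional[str]:
    """Return the first preset reason whose description matches the data, if any."""
    for reason, matches in REASON_DESCRIPTIONS.items():
        if matches(operation_data):
            return reason
    return None


target = match_target_reason({"host_cpu_load": 0.97})
```

Returning the first match is a simplification; the text does not say how several simultaneously matching reasons would be handled.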
The following describes the flow of the cloud service quality monitoring method provided by the embodiments of the present invention for each of six defect forms of cloud service quality defect, together with the reason behind each form.
First, the server has a jitter defect, and all cloud hosts on the server have a jitter defect. The reason for the cloud service quality defect may be that, during network concurrency, the instantaneous bandwidth of a single cloud host in the server reaches the upper limit of the network bandwidth and the proportion of server bandwidth it occupies reaches the upper limit of the bandwidth proportion (the cloud host can be considered to occupy almost all the bandwidth of the server). If the bandwidth of the server is occupied by one cloud host, all cloud hosts on the server will exhibit jitter.
For this reason, the embodiments of the present invention may provide the following solution strategy: limiting the bandwidth of the cloud host to within a set bandwidth range.
Optionally, taking a ping detection packet as an example of the network test detection packet, Fig. 2 shows another flowchart of the cloud service quality monitoring method provided by the embodiments of the present invention. The method is applicable to a server. Referring to Fig. 2, the method may include:
Step S200, receiving ping detection packets directed at the server and at the cloud host in the server, respectively.
Step S210, determining a cloud service quality detection result of the server according to the ping detection packet for the server, and determining a cloud service quality detection result of the cloud host according to the ping detection packet for the cloud host.
Optionally, in the embodiment of the present invention, whether the server has a delay defect and/or a jitter defect may be determined according to the ping detection packet for the server, and whether the cloud host has a delay defect and/or a jitter defect may be determined according to the ping detection packet for the cloud host.
Step S220, if the determined cloud service quality detection results of the server and the cloud hosts indicate that the server has a jitter defect and all the cloud hosts on the server have a jitter defect, analyzing corresponding operation data of the cloud service according to the reason description of the preset first reason, and judging whether the instantaneous bandwidth of a single cloud host in the server during network concurrency reaches the upper limit of the network bandwidth and whether the occupied bandwidth proportion of the server reaches the upper limit of the bandwidth proportion.
Optionally, in the embodiment of the present invention, a first reason may be preset for the defect form in which the server has a jitter defect and all cloud hosts on the server have the jitter defect, and a reason description of the first reason is defined; the reason description of the first reason includes: the instantaneous bandwidth of a single cloud host in the server during network concurrency reaches the upper limit of the network bandwidth, and the server bandwidth proportion occupied by the single cloud host reaches the upper limit of the bandwidth proportion;
the operation data is analyzed according to the preset reason description of the first reason to judge whether the data content of the operation data matches the reason description of the first reason, and when it matches, the target reason is determined to be the first reason; that is, the target reason is that a single cloud host in the server occupies almost all the bandwidth of the server, which is specifically expressed as the instantaneous bandwidth of the single cloud host during network concurrency reaching the upper limit of the network bandwidth and the server bandwidth proportion occupied by the single cloud host reaching the upper limit of the bandwidth proportion.
Optionally, in the embodiment of the present invention, the network card device of the server may obtain the operation data corresponding to the cloud service.
Step S230, if the instantaneous bandwidth of a single cloud host in the server during network concurrency reaches the upper limit of the network bandwidth, and the occupied server bandwidth proportion reaches the upper limit of the bandwidth proportion, determining that the target reason is the first reason.
Step S240, obtaining a first solution policy corresponding to the preset first reason, and executing the first solution policy; the first resolution strategy comprises: limiting the bandwidth of the cloud host within a set bandwidth range.
Optionally, in the embodiment of the present invention, the bandwidth of the cloud host may be limited by using a network tool on the server operating system, and the set bandwidth range of the cloud host may be determined by referring to the network configuration when the user purchases the cloud host, which may be specifically set according to an actual situation.
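The judgment made in steps S220 and S230 can be illustrated with a minimal Python sketch. It assumes that the instantaneous bandwidth of each cloud host has already been extracted from the operation data; the names `HostSample` and `find_bandwidth_hog` and both threshold parameters are hypothetical and are not defined by the embodiment.

```python
from dataclasses import dataclass


@dataclass
class HostSample:
    host_id: str
    peak_mbps: float  # instantaneous bandwidth seen during concurrent traffic


def find_bandwidth_hog(samples, server_mbps, bandwidth_upper_mbps, share_upper):
    """Return the id of the first cloud host matching the first reason, else None.

    A host matches when its instantaneous bandwidth reaches the bandwidth
    upper limit AND its share of the server bandwidth reaches the share limit.
    Both limits are assumed, operator-chosen values.
    """
    for sample in samples:
        reaches_limit = sample.peak_mbps >= bandwidth_upper_mbps
        share = sample.peak_mbps / server_mbps
        if reaches_limit and share >= share_upper:
            return sample.host_id
    return None
```

If a host id is returned, the first solution policy (limiting that host's bandwidth) would be the one to apply; returning `None` means the first reason does not match the operation data.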
Secondly, the server has a delay defect and a jitter defect. In this case, the CPU usage proportion occupied by the load of a numa (non-uniform memory access) node of the server and the network packet amount of the cloud host (the number of network packets received and sent by the cloud host) can be analyzed. If the CPU usage proportion occupied by the load of the numa node of the server is greater than the set CPU usage proportion and the network packet amount of the cloud host is greater than the set network packet amount, the cloud service quality defect may be caused by the vhost and the vcpu of the server competing for resources, so that network packets cannot be processed in time;
correspondingly, when the server has a delay defect and a jitter defect, the embodiment of the invention can analyze, from the operation data, the CPU usage proportion occupied by the load of the numa node of the server and the network packet amount of the cloud host; when the CPU usage proportion occupied by the load of the numa node is greater than the set CPU usage proportion and the network packet amount of the cloud host is greater than the set network packet amount, the resource usage changes of the vhost and the vcpu are analyzed from the operation data; and when the resource usage changes reflect that the vhost and the vcpu compete for server resources, the reason for the cloud service quality defect is determined to be that the vhost and the vcpu compete for server resources;
in this regard, embodiments of the invention may provide a resolution policy: migrating the cloud host, or raising the processing priority of the vhost to the real-time priority.
Optionally, fig. 3 shows another flowchart of a cloud service quality monitoring method provided in an embodiment of the present invention, where the method is applicable to a server, and referring to fig. 3, the cloud service quality monitoring method may include:
Step S300, receiving ping detection packets aiming at the server and the cloud host in the server respectively.
Step S310, determining a cloud service quality detection result of the server according to the ping detection packet for the server, and determining a cloud service quality detection result of the cloud host according to the ping detection packet for the cloud host.
Step S320, if the determined cloud service quality detection results of the server and the cloud host indicate that the server has a delay defect and a jitter defect, analyzing the corresponding operation data of the cloud service, and determining whether a CPU usage ratio occupied by a numa node of the server is greater than a set CPU usage ratio, and whether a network packet amount of the cloud host is greater than a set network packet amount.
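numa (non-uniform memory access) is a memory architecture designed for multiprocessor computers, in which the memory access time depends on the location of the memory relative to the processor; a numa node groups processors with the memory that is local to them.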
Optionally, the set network packet amount may be set in combination with the configuration of the cloud host itself.
Step S330, if the CPU usage proportion occupied by the numa node load of the server is larger than the set CPU usage proportion and the network packet quantity of the cloud host is larger than the set network packet quantity, analyzing corresponding operation data of the cloud service according to the reason description of a preset second reason, and determining the resource usage change conditions of the vhost and the vcpu of the server; the reason description of the second reason includes: the server's vhost competes with vcpu for server resources.
If the vhost and the vcpu compete for server resources, a network packet (including a ping detection packet) cannot be processed immediately, so that the time delay of the ping detection packet is very large, and a server and a cloud host both have a time delay defect.
Step S340, if the resource usage changes of the vhost and the vcpu reflect that the vhost and the vcpu compete for server resources, determining that the target reason is the second reason.
Step S350, obtaining a second solution strategy corresponding to the preset second reason, and executing the second solution strategy; the second resolution policy includes: migrating the cloud host, or raising the processing priority of the vhost to the real-time priority.
Optionally, migrating the cloud host refers to migrating the cloud host away from the server.
Optionally, Linux processes are divided into ordinary processes and real-time processes. Ordinary processes are non-real-time processes scheduled under SCHED_OTHER (also called SCHED_NORMAL), and real-time processes are scheduled under SCHED_FIFO or SCHED_RR. The priority of real-time processes (0-99) is higher than the priority of ordinary processes (100-139), and a real-time process remains an active process until it terminates; when real-time processes are running in the system, ordinary processes can hardly obtain time slices (only 5% of the CPU time).
After the processing priority of the vhost is raised to the real-time priority, both the server delay and the server jitter are reduced; for example, the server delay is 1.18 milliseconds before the processing priority of the vhost is raised to the real-time priority and can be reduced to 0.261 milliseconds afterwards.
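Steps S320 to S350 can be sketched as follows. This is only an illustration under stated assumptions: the function names and threshold parameters are hypothetical, the contention flag is assumed to have been derived elsewhere from the resource usage changes of the vhost and the vcpu, and the use of the util-linux `chrt` tool with the SCHED_FIFO policy is one possible way to raise a process to real-time priority that the embodiment does not itself specify.

```python
def matches_second_reason(numa_cpu_ratio, packet_amount,
                          cpu_ratio_limit, packet_limit, vhost_vcpu_contend):
    """Apply the two threshold checks, then the vhost/vcpu contention check."""
    over_thresholds = (numa_cpu_ratio > cpu_ratio_limit
                       and packet_amount > packet_limit)
    return bool(over_thresholds and vhost_vcpu_contend)


def realtime_priority_command(vhost_pid, priority=1):
    """Build (without running) a chrt command that moves a pid to SCHED_FIFO.

    chrt accepts real-time priorities from 1 to 99 for SCHED_FIFO.
    """
    if not 1 <= priority <= 99:
        raise ValueError("real-time priority must be between 1 and 99")
    return ["chrt", "-f", "-p", str(priority), str(vhost_pid)]
```

The command list is returned rather than executed so that the caller can decide whether to run it, for example with `subprocess.run`, which requires sufficient privileges on the server.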
Thirdly, the cloud host has a delay defect and a jitter defect. If the utilization rate of the CPU corresponding to the cloud host is greater than the set utilization rate and the network packet amount of the cloud host is greater than the set network packet amount, the reason for the cloud service quality defect may be that the utilization rate of the CPU corresponding to the cloud host is too high, so that network packets cannot be processed in time and the cloud host has the delay defect and the jitter defect;
in this regard, embodiments of the present invention may provide a solution policy: binding the CPU corresponding to the cloud host according to the logical core, which relieves the excessive utilization of the CPU corresponding to the cloud host.
Optionally, fig. 4 shows another flowchart of a cloud service quality monitoring method provided in an embodiment of the present invention, where the method is applicable to a server, and referring to fig. 4, the cloud service quality monitoring method may include:
Step S400, receiving ping detection packets aiming at the server and the cloud host in the server respectively.
Step S410, determining a cloud service quality detection result of the server according to the ping detection packet for the server, and determining a cloud service quality detection result of the cloud host according to the ping detection packet for the cloud host.
Step S420, if the determined cloud service quality detection results of the server and the cloud host indicate that the cloud host has a delay defect and a jitter defect, analyzing the corresponding operation data of the cloud service according to the reason description of the preset third reason, and determining whether the usage rate of the CPU corresponding to the cloud host is greater than the set usage rate and the network packet amount of the cloud host is greater than the set network packet amount.
Optionally, the reason description of the third reason includes: the utilization rate of the CPU corresponding to the cloud host is greater than the set utilization rate, and the network packet quantity of the cloud host is greater than the set network packet quantity.
Optionally, the utilization rate of the CPU corresponding to the cloud host is greater than the set utilization rate, which indicates that the utilization rate of the CPU corresponding to the cloud host is too high, and may be caused by too many processes, a large amount of calculations performed by the cloud host, and the like; and the CPU of the cloud host is fully occupied, so that the network packet cannot be processed in time, and the time delay defect and the jitter defect of the cloud host are generated.
Step S430, if the utilization rate of the CPU corresponding to the cloud host is greater than the set utilization rate and the network packet amount of the cloud host is greater than the set network packet amount, determining that the target reason is the third reason.
Step S440, obtaining a preset third solution strategy corresponding to the third reason, and executing the third solution strategy; the third resolution policy includes: binding the CPU corresponding to the cloud host according to the logical core.
Optionally, in the embodiment of the present invention, the thread of the cloud host may be bound to the specified CPU on the server operating system, so that the CPU corresponding to the cloud host is bound according to the logical core; optionally, the taskset and vcpupin commands may be used to bind the CPU corresponding to the cloud host according to the logical core.
After the CPU corresponding to the cloud host is bound according to the logical core, the jitter and delay of the cloud host can be reduced; for example, the maximum delay reaches 119 milliseconds before the binding and is only 19.5 milliseconds after the binding, which greatly reduces the jitter and delay of the cloud host.
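The binding mentioned above can be illustrated by constructing the corresponding command lines. The sketch below assumes the util-linux `taskset` tool and the libvirt `virsh vcpupin` subcommand; the function names, the domain name, and the CPU numbers used in the usage note are hypothetical, and the commands are only built, not executed.

```python
def _cpu_list(cpus):
    """Format logical core numbers as a comma-separated list, e.g. '2,3'."""
    return ",".join(str(cpu) for cpu in sorted(set(cpus)))


def taskset_command(thread_id, cpus):
    """Build a taskset command pinning an existing thread to logical cores."""
    return ["taskset", "-cp", _cpu_list(cpus), str(thread_id)]


def vcpupin_command(domain, vcpu, cpus):
    """Build a virsh command pinning one virtual CPU of a domain to cores."""
    return ["virsh", "vcpupin", domain, str(vcpu), _cpu_list(cpus)]
```

For example, `vcpupin_command("guest1", 0, [2])` yields the arguments for pinning virtual CPU 0 of a domain named `guest1` to logical core 2.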
Fourthly, the server has a jitter defect, and the operating systems of the servers having the jitter defect are the same. If the utilization rate of the CPU corresponding to the cloud host is not greater than the set utilization rate and the network packet amount of the cloud host is not greater than the set network packet amount, the cloud service quality defect may be caused by cloud host interrupts being distributed across the processing cores of the server, so that the server has the jitter defect;
in this regard, embodiments of the invention may provide a resolution policy: binding the cloud host interrupts to the zeroth CPU of the server.
Optionally, fig. 5 shows yet another flowchart of a cloud service quality monitoring method provided in an embodiment of the present invention, where the method is applicable to a server, and referring to fig. 5, the cloud service quality monitoring method may include:
Step S500, receiving ping detection packets respectively aiming at the server and the cloud host in the server.
Step S510, determining a cloud service quality detection result of the server according to the ping detection packet for the server, and determining a cloud service quality detection result of the cloud host according to the ping detection packet for the cloud host.
Step S520, if the determined cloud service quality detection results of the server and the cloud host indicate that the server has a jitter defect and the operating systems of the servers having the jitter defect are the same, analyzing the operation data, and determining whether the utilization rate of the CPU corresponding to the cloud host is greater than a set utilization rate and whether the network packet amount of the cloud host is greater than the set network packet amount.
Step S530, if the utilization rate of the CPU corresponding to the cloud host is not greater than the set utilization rate and the network packet amount of the cloud host is not greater than the set network packet amount, analyzing the corresponding operation data of the cloud service according to the reason description of the preset fourth reason, and judging whether the cloud host interrupts are distributed across the processing cores of the server.
The reason description of the fourth reason includes: cloud host interrupts are distributed across the various processing cores of the server.
Optionally, the fact that the utilization rate of the CPU corresponding to the cloud host is not greater than the set utilization rate and the network packet amount of the cloud host is not greater than the set network packet amount indicates that neither the CPU utilization rate nor the network packet amount of the cloud host is high; in that case, the jitter defect on the servers with the same operating system may be caused by cloud host interrupts being distributed across the processing cores of the server.
Step S540, if the cloud host interrupts are distributed on each processing core of the server, determining that the target reason is the fourth reason.
Step S550, obtaining a preset fourth solution strategy corresponding to the fourth reason, and executing the fourth solution strategy; the fourth resolution strategy includes: binding the cloud host interrupts to the zeroth CPU of the server.
Optionally, in the embodiment of the present invention, the cloud host interrupt may be bound to the zeroth CPU of the server by executing the command echo 1 > /proc/irq/47/smp_affinity on the server operating system, where 47 is the virtio0-input interrupt number and the value 1 indicates that the interrupt is bound to CPU0 (the zeroth CPU of the server);
it should be noted that an interrupt request (IRQ) is a request for service issued at the hardware level; it may be sent over a dedicated hardware line or across a hardware bus as an information packet (message signaled interrupt, MSI). An IRQ has an associated "affinity" attribute, smp_affinity, which defines the CPU cores that are allowed to execute the interrupt service routine (ISR) for that IRQ; this attribute can also be used to improve program performance by assigning both the interrupt affinity and the program's thread affinity to one or more specific CPU cores, which allows cache to be shared between the specified interrupt and program threads. The interrupt affinity value for a particular IRQ is saved in the associated /proc/irq/IRQ_NUMBER/smp_affinity file, which the root user can view and modify. The value stored in this file is a hexadecimal bitmask that represents all the CPU cores in the system.
According to the embodiment of the invention, after the cloud host interrupts are bound to the zeroth CPU of the server, the jitter defect of the server can be reduced; for example, the jitter of the server is 36.4 milliseconds before the cloud host interrupts are bound to the zeroth CPU of the server and 2.9 milliseconds after the binding.
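Because the value written to the smp_affinity file is a hexadecimal bitmask with one bit per CPU core, the value 1 used in the command above selects CPU0 only. The following Python sketch shows the conversion in both directions; the function names are hypothetical, and the sketch only computes mask strings without touching /proc.

```python
def affinity_mask(cpus):
    """Return the hexadecimal mask string that selects the given CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # bit n of the mask corresponds to CPU n
    return format(mask, "x")


def cpus_in_mask(mask_hex):
    """Return the CPU numbers selected by a hexadecimal mask string.

    Commas are removed first because large systems print the mask in
    comma-separated 32-bit groups.
    """
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]
```

Here `affinity_mask([0])` gives the string written by the command in the text, while a mask such as `f` would allow the interrupt on CPU0 through CPU3.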
Fifthly, a single cloud host in the server has a delay defect, and the delay corresponding to the single cloud host is greater than a set delay upper limit. The cloud service quality defect may be caused by the hard disk IO (input/output) of the single cloud host being in native mode, so that when the hard disk utilization is too high, the operation of the single cloud host blocks on hard disk operations and the single cloud host shows a very large delay;
in this regard, embodiments of the invention may provide a resolution policy: canceling the native mode of the hard disk IO of the single cloud host, and restarting the single cloud host.
Optionally, fig. 6 shows another flowchart of a cloud service quality monitoring method provided in an embodiment of the present invention, where the method is applicable to a server, and referring to fig. 6, the cloud service quality monitoring method may include:
Step S600, receiving ping detection packets respectively aiming at the server and the cloud host in the server.
Step S610, determining a cloud service quality detection result of the server according to the ping detection packet for the server, and determining a cloud service quality detection result of the cloud host according to the ping detection packet for the cloud host.
Step S620, if the determined cloud service quality detection results of the server and the cloud hosts indicate that a single cloud host in the server has a delay defect, and the delay corresponding to the single cloud host is greater than a set delay upper limit, analyzing the operation data corresponding to the cloud service according to the preset reason description of the fifth reason, and determining whether the hard disk IO of the single cloud host is in native mode.
The reason description of the fifth reason includes: the hard disk IO of a single cloud host is in native mode.
Step S630, if the hard disk IO of the single cloud host is in native mode, determining that the target reason is the fifth reason.
Step S640, obtaining a fifth solution policy corresponding to the preset fifth reason, and executing the fifth solution policy; the fifth resolution policy includes: canceling the native mode of the hard disk IO of the single cloud host, and restarting the single cloud host.
After the disk IO mode of the single cloud host is changed to a non-native mode, the operation of the cloud host no longer blocks on hard disk operations, and the delay defect is reduced; for example, after the disk IO mode of a single cloud host is changed to a non-native mode, the delay can be reduced from ten thousand milliseconds to single-digit milliseconds.
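The judgment in step S620 can be sketched in Python under a loudly stated assumption: that the cloud host is a libvirt-managed domain and that the "native mode" corresponds to the `io='native'` attribute of the disk `driver` element in the domain XML. The embodiment does not state this mapping, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def disks_with_native_io(domain_xml):
    """Return target device names of disks whose driver uses io='native'.

    Assumes libvirt domain XML; disks without a <target> are reported as '?'.
    """
    root = ET.fromstring(domain_xml)
    native = []
    for disk in root.findall("./devices/disk"):
        driver = disk.find("driver")
        target = disk.find("target")
        if driver is not None and driver.get("io") == "native":
            native.append(target.get("dev") if target is not None else "?")
    return native
```

A non-empty result would correspond to the fifth reason matching; under the same assumption, the fifth solution strategy would amount to changing or removing that attribute and restarting the cloud host.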
Sixthly, the cloud host has a jitter defect, and the cloud hosts having the jitter defect have no commonality. If the utilization rate of the CPU corresponding to the cloud host is not greater than the set utilization rate and the network packet amount of the cloud host is not greater than the set network packet amount, the cloud service quality defect may be caused by a large amount of memory on the server being consumed, so that the cloud host is swapped out;
in this regard, embodiments of the invention may provide a resolution policy: canceling the swap partition on the server.
Optionally, fig. 7 shows another flowchart of a cloud service quality monitoring method provided in an embodiment of the present invention, where the method is applicable to a server, and referring to fig. 7, the cloud service quality monitoring method may include:
Step S700, receiving ping detection packets respectively aiming at the server and the cloud host in the server.
Step S710, determining a cloud service quality detection result of the server according to the ping detection packet for the server, and determining a cloud service quality detection result of the cloud host according to the ping detection packet for the cloud host.
Step S720, if the determined cloud service quality detection results of the server and the cloud host indicate that the cloud host has a jitter defect and the cloud hosts having the jitter defect have no commonality, analyzing the operation data, and judging whether the utilization rate of the CPU corresponding to the cloud host is greater than a set utilization rate and whether the network packet amount of the cloud host is greater than the set network packet amount.
Step S730, if the utilization rate of the CPU corresponding to the cloud host is not greater than the set utilization rate and the network packet amount of the cloud host is not greater than the set network packet amount, analyzing the corresponding operation data of the cloud service according to the reason description of the preset sixth reason, and determining whether the memory consumption proportion of the server is greater than the set consumption proportion and whether the cloud host is swapped out.
The reason description of the sixth reason includes: the memory consumption proportion of the server is greater than the set consumption proportion, and the cloud host is swapped out.
When the memory consumption proportion of the server is greater than the set consumption proportion, a large amount of memory on the server is consumed, which may cause cloud hosts to be swapped out and cause jitter defects in individual cloud hosts in the server.
Step S740, if the memory consumption proportion of the server is greater than the set consumption proportion and the cloud host is swapped out, determining that the target reason is the sixth reason.
Step S750, obtaining a preset sixth solution policy corresponding to the sixth reason, and executing the sixth solution policy; the sixth resolution policy includes: canceling the swap partition on the server.
After the swap partition on the server is canceled, there is no swap partition for cloud hosts to be swapped out to, so even if the disk IO delay of the server is high for other reasons, the occurrence of a jitter defect in an individual cloud host in the server can be reduced.
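The check for the sixth reason can be illustrated with a short Python sketch that works on text in the format of the Linux /proc/meminfo file and of a process's /proc/<pid>/status file. Treating "memory consumption proportion" as one minus MemAvailable divided by MemTotal, and treating a non-zero VmSwap value of the cloud host process as "swapped out", are assumptions made for this illustration; the function names and the default threshold are hypothetical.

```python
def parse_numeric_fields(text):
    """Parse 'Name:   value [kB]' lines into a dict of integer values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def matches_sixth_reason(meminfo_text, status_text, consumption_limit=0.9):
    """True when server memory use exceeds the limit and the host has swap in use."""
    mem = parse_numeric_fields(meminfo_text)
    used_ratio = 1 - mem["MemAvailable"] / mem["MemTotal"]
    swapped_kb = parse_numeric_fields(status_text).get("VmSwap", 0)
    return used_ratio > consumption_limit and swapped_kb > 0
```

The functions take the file contents as strings so that the same logic can be applied to data collected from a remote server.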
The above describes the cloud service quality monitoring method provided by the embodiment of the invention for each of the six defect forms of the cloud service quality defect and the corresponding reasons. Therefore, after the server analyzes the current cloud service quality defect and determines its defect form, the operation data can be analyzed according to the preset reason description of each reason that may cause a cloud service quality defect of that defect form, the target reason for the defect is determined, and the corresponding solution strategy is executed, thereby realizing the monitoring of the cloud service quality;
optionally, the defect forms of the cloud service quality defect, as indicated in the above six cases, are:
the server has a jitter defect, and all cloud hosts on the server have the jitter defect;
or, the server has time delay defect and jitter defect;
or, the cloud host has a delay defect and a jitter defect;
or, the servers have jitter defects, and the operating systems of the servers with jitter defects are the same;
or, a single cloud host has a delay defect, and the corresponding delay of the single cloud host is greater than a set delay upper limit;
or the cloud host has jitter defect, and there is no commonality between the cloud hosts having jitter defect.
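The correspondence between the six defect forms listed above, their candidate reasons, and the solution strategies can be summarized as a lookup table. The Python sketch below only restates what the description already says; the enum member names and the `candidate_for` helper are hypothetical.

```python
from enum import Enum, auto


class DefectForm(Enum):
    SERVER_AND_ALL_HOSTS_JITTER = auto()
    SERVER_DELAY_AND_JITTER = auto()
    HOST_DELAY_AND_JITTER = auto()
    SAME_OS_SERVERS_JITTER = auto()
    SINGLE_HOST_DELAY_OVER_LIMIT = auto()
    UNRELATED_HOSTS_JITTER = auto()


# Defect form -> (candidate reason, solution strategy), as given in the text.
CANDIDATES = {
    DefectForm.SERVER_AND_ALL_HOSTS_JITTER:
        ("first reason", "limit the bandwidth of the cloud host"),
    DefectForm.SERVER_DELAY_AND_JITTER:
        ("second reason", "migrate the cloud host or raise vhost to real-time priority"),
    DefectForm.HOST_DELAY_AND_JITTER:
        ("third reason", "bind the cloud host CPU according to the logical core"),
    DefectForm.SAME_OS_SERVERS_JITTER:
        ("fourth reason", "bind cloud host interrupts to the zeroth CPU"),
    DefectForm.SINGLE_HOST_DELAY_OVER_LIMIT:
        ("fifth reason", "cancel native hard disk IO mode and restart the cloud host"),
    DefectForm.UNRELATED_HOSTS_JITTER:
        ("sixth reason", "cancel the swap partition on the server"),
}


def candidate_for(form):
    """Return the (reason, strategy) pair to verify against the operation data."""
    return CANDIDATES[form]
```

The table only names the candidate; the reason still has to be confirmed against the operation data before the strategy is executed.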
The cloud service quality monitoring method provided by the embodiment of the invention can accurately locate the reason for a cloud service quality defect and resolve the defect by executing the preset solution strategy corresponding to the located reason, so that the monitoring improves the cloud service quality and achieves the purpose of guaranteeing the cloud service quality.
In the following, the cloud service quality monitoring apparatus provided by the embodiment of the present invention is introduced, and the cloud service quality monitoring apparatus described below may be regarded as a functional module architecture that a server needs to set in order to implement the cloud service quality monitoring method provided by the embodiment of the present invention; the following description may be referred to in correspondence with the cloud service quality monitoring method described above.
Fig. 8 is a block diagram of a cloud service quality monitoring apparatus according to an embodiment of the present invention, where the cloud service quality monitoring apparatus may be applied to a server, and referring to fig. 8, the cloud service quality monitoring apparatus may include:
a network test detection packet receiving module 100, configured to receive network test detection packets for a server and a cloud host in the server respectively;
a detection result determining module 110, configured to determine a cloud service quality detection result of the server according to the network test detection packet for the server, and determine a cloud service quality detection result of the cloud host according to the network test detection packet for the cloud host;
a target cause determining module 120, configured to, if the determined cloud service quality detection results of the server and the cloud host indicate that a cloud service quality defect currently exists, analyze, according to preset causes causing the cloud service quality defect, operation data corresponding to the cloud service, and determine a target cause matched from the operation data;
a solution policy executing module 130, configured to obtain a preset solution policy corresponding to the target reason, and execute the solution policy.
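How the four modules cooperate can be sketched as a single pass through a pipeline. In the Python sketch below, the function name `monitor_once`, the shape of the detection result (a dict with a `defect` key), and the use of callables for the modules are all assumptions made for illustration, not part of the described apparatus.

```python
def monitor_once(packets, detect, find_target_reason, strategies):
    """Run one monitoring pass; return the matched reason or None.

    packets            -- detection packets received (module 100)
    detect             -- callable producing the detection result (module 110)
    find_target_reason -- callable matching a reason from operation data (module 120)
    strategies         -- dict mapping a reason to a callable strategy (module 130)
    """
    result = detect(packets)
    if not result.get("defect"):
        return None  # no cloud service quality defect currently exists
    reason = find_target_reason(result)
    if reason is None:
        return None  # no preset reason matched the operation data
    strategies[reason]()  # execute the preset solution strategy
    return reason
```

Passing the modules in as callables keeps the sketch independent of how each module is actually implemented on the server.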
Optionally, fig. 9 shows another structural block diagram of the cloud service quality monitoring apparatus provided in the embodiment of the present invention, and in combination with fig. 8 and fig. 9, the cloud service quality monitoring apparatus may further include:
a defect form determining module 140, configured to determine the defect form of the currently existing cloud service quality defect.
Correspondingly, the target cause determining module 120 is configured to analyze the operation data corresponding to the cloud service according to preset causes causing the cloud service quality defect, and determine a target cause matched from the operation data, specifically including:
analyzing the corresponding operation data of the cloud service according to the preset reason description corresponding to each reason causing the cloud service quality defect in the defect form, and determining the target reason matched from the operation data.
Optionally, the defect form determined by the defect form determining module 140 includes:
the server has a jitter defect, and all cloud hosts on the server have the jitter defect;
or, the server has time delay defect and jitter defect;
or, the cloud host has a delay defect and a jitter defect;
or, the servers have jitter defects, and the operating systems of the servers with jitter defects are the same;
or, a single cloud host has a delay defect, and the corresponding delay of the single cloud host is greater than a set delay upper limit;
or the cloud hosts have jitter defects, and the cloud hosts with jitter defects do not have commonality.
Optionally, the target cause determining module 120 is configured to analyze, according to a preset cause description corresponding to each cause causing the defect of cloud service quality in the defect form, operation data corresponding to the cloud service, and determine a target cause matched from the operation data, and specifically includes:
if the determined cloud service quality detection results of the server and the cloud hosts indicate that the server has a jitter defect and all the cloud hosts on the server have a jitter defect, analyzing corresponding operation data of the cloud service according to the reason description of a preset first reason, and judging whether the instantaneous bandwidth of a single cloud host in the server during network concurrency reaches the upper limit of the network bandwidth and whether the occupied server bandwidth proportion reaches the upper limit of the bandwidth proportion;
if the instantaneous bandwidth of a single cloud host in the server during network concurrency reaches the upper limit of the network bandwidth, and the occupied server bandwidth proportion reaches the upper limit of the bandwidth proportion, determining that the target reason is the first reason;
correspondingly, the solution policy executing module 130 is configured to obtain a preset solution policy corresponding to the target reason, and execute the solution policy, including:
acquiring a first solution strategy corresponding to the preset first reason, and executing the first solution strategy; the first resolution strategy comprises: limiting the bandwidth of the cloud host within a set bandwidth range.
Optionally, the target cause determining module 120 is configured to analyze, according to a preset cause description corresponding to each cause causing the defect of cloud service quality in the defect form, operation data corresponding to the cloud service, and determine a target cause matched from the operation data, and specifically includes:
if the determined cloud service quality detection results of the server and the cloud host indicate that the server has a time delay defect and a jitter defect, analyzing corresponding operation data of the cloud service, and judging whether the CPU usage proportion occupied by the numa node load of the server is greater than a set CPU usage proportion and whether the network packet amount of the cloud host is greater than a set network packet amount;
if the CPU usage proportion occupied by the load of the numa node of the server is larger than the set CPU usage proportion, and the network packet quantity of the cloud host is larger than the set network packet quantity, analyzing corresponding operation data of the cloud service according to the reason description of a preset second reason, and determining the resource usage change condition of the vhost and the vcpu of the server; the reason description of the second reason includes: the vhost of the server competes for server resources with the vcpu;
if the resource usage changes of the vhost and the vcpu reflect that the vhost and the vcpu compete for server resources, determining that the target reason is the second reason;
correspondingly, the solution policy executing module 130 is configured to obtain a preset solution policy corresponding to the target reason, and execute the solution policy, including:
acquiring a second solution strategy corresponding to the preset second reason, and executing the second solution strategy; the second solution strategy includes: migrating the cloud host, or raising the processing priority of the vhost to the real-time priority.
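By way of illustration only, the two-stage check for the second reason may be sketched as follows. The embodiment does not state how the resource usage change condition is evaluated; the sketch assumes, purely as a hypothetical heuristic, that the vhost and the vcpu are treated as competing when their usage moves in opposite directions in most sampling intervals. All function names, parameters, and the `min_opposed` threshold are assumptions.

```python
def deltas(series):
    """Interval-to-interval changes of a usage series."""
    return [b - a for a, b in zip(series, series[1:])]


def vhost_vcpu_compete(vhost_usage, vcpu_usage, min_opposed=0.7):
    """Assumed heuristic: one side gains while the other loses in most intervals."""
    moving = [(dv, dc)
              for dv, dc in zip(deltas(vhost_usage), deltas(vcpu_usage))
              if dv != 0 and dc != 0]
    if not moving:
        return False
    opposed = sum(1 for dv, dc in moving if dv * dc < 0)
    return opposed / len(moving) >= min_opposed


def matches_second_reason(numa_cpu_share, packet_amount,
                          vhost_usage, vcpu_usage,
                          cpu_share_limit, packet_limit):
    """Stage 1: NUMA load and packet amount exceed their set values.
    Stage 2: the usage series indicate vhost/vcpu competition."""
    if not (numa_cpu_share > cpu_share_limit and packet_amount > packet_limit):
        return False
    return vhost_vcpu_compete(vhost_usage, vcpu_usage)
```

The second stage is evaluated only after the threshold conditions hold, which mirrors the order of the judgments described above.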
Optionally, the target cause determining module 120 is configured to analyze, according to a preset cause description corresponding to each cause causing the defect of cloud service quality in the defect form, operation data corresponding to the cloud service, and determine a target cause matched from the operation data, and specifically includes:
if the determined cloud service quality detection results of the server and the cloud host indicate that the cloud host has a time delay defect and a jitter defect, analyzing corresponding operation data of the cloud service according to the reason description of a preset third reason, and judging whether the utilization rate of a CPU corresponding to the cloud host is greater than a set utilization rate and whether the network packet quantity of the cloud host is greater than the set network packet quantity;
if the utilization rate of the CPU corresponding to the cloud host is greater than the set utilization rate and the network packet quantity of the cloud host is greater than the set network packet quantity, determining that the target reason is the third reason;
correspondingly, the solution policy executing module 130 is configured to obtain a preset solution policy corresponding to the target reason, and execute the solution policy, including:
acquiring a third solution strategy corresponding to the preset third reason, and executing the third solution strategy; the third solution strategy includes: binding the CPU corresponding to the cloud host by logical core.
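By way of illustration only, the third-reason check and a binding plan by logical core may be sketched as follows. The function names and the one-to-one mapping of each virtual CPU to a dedicated logical core are assumptions of this sketch; the embodiment only states that the CPU corresponding to the cloud host is bound by logical core.

```python
def matches_third_reason(cpu_utilization, packet_amount,
                         utilization_limit, packet_limit):
    """Both the CPU utilization and the packet amount exceed their set values."""
    return cpu_utilization > utilization_limit and packet_amount > packet_limit


def plan_vcpu_pinning(vcpu_count, logical_cores):
    """Map each virtual CPU index to one dedicated logical core (assumed 1:1)."""
    if vcpu_count > len(logical_cores):
        raise ValueError("not enough logical cores for a one-to-one binding")
    return {vcpu: logical_cores[vcpu] for vcpu in range(vcpu_count)}
```

On Linux, one possible way to apply such a plan to a thread is Python's `os.sched_setaffinity`, which is deliberately not called here so that the sketch has no side effects.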
Optionally, the target cause determining module 120 is configured to analyze, according to a preset cause description corresponding to each cause causing the defect of cloud service quality in the defect form, operation data corresponding to the cloud service, and determine a target cause matched from the operation data, and specifically includes:
if the determined cloud service quality detection results of the server and the cloud host indicate that the server has a jitter defect and the operating systems of the servers with the jitter defect are the same, analyzing the operation data, and judging whether the utilization rate of the CPU corresponding to the cloud host is greater than a set utilization rate and whether the network packet amount of the cloud host is greater than the set network packet amount;
if the utilization rate of the CPU corresponding to the cloud host is not greater than the set utilization rate and the network packet amount of the cloud host is not greater than the set network packet amount, analyzing corresponding operation data of the cloud service according to the reason description of the preset fourth reason, and judging whether the interrupts of the cloud host are distributed across the processing cores of the server;
if the interrupts of the cloud host are distributed across the processing cores of the server, determining that the target reason is the fourth reason;
correspondingly, the solution policy executing module 130 is configured to obtain a preset solution policy corresponding to the target reason, and execute the solution policy, including:
acquiring a preset fourth solution strategy corresponding to the fourth reason, and executing the fourth solution strategy; the fourth solution strategy includes: binding the cloud host interrupts to the zeroth CPU of the server.
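By way of illustration only, the interrupt distribution check and the binding to the zeroth CPU may be sketched as follows. The sketch assumes that per-core interrupt counts for the cloud host are available (for example, as the per-CPU columns of a Linux `/proc/interrupts` row) and that interrupts count as distributed when more than one core has serviced them; both are assumptions. The helper names are hypothetical.

```python
def interrupts_are_spread(per_core_counts):
    """Assumed criterion: more than one core has serviced the host's interrupts."""
    return sum(1 for count in per_core_counts if count > 0) > 1


def smp_affinity_mask(cpu_index):
    """Hexadecimal CPU bitmask selecting a single core, in the form that
    Linux accepts in /proc/irq/<irq>/smp_affinity."""
    return format(1 << cpu_index, "x")
```

For the zeroth CPU the mask is `1`. Writing that value to the affinity file of the relevant interrupt would be one possible way of carrying out the fourth solution strategy on a Linux server.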
Optionally, the target cause determining module 120 is configured to analyze, according to a preset cause description corresponding to each cause causing the defect of cloud service quality in the defect form, operation data corresponding to the cloud service, and determine a target cause matched from the operation data, and specifically includes:
if the determined cloud service quality detection results of the server and the cloud host indicate that a single cloud host in the server has a time delay defect, and the time delay corresponding to the single cloud host is greater than a set time delay upper limit, analyzing corresponding operation data of the cloud service according to the reason description of a preset fifth reason, and judging whether the hard disk IO of the single cloud host is in a native mode or not;
if the hard disk IO of the single cloud host is in a native mode, determining that the target reason is the fifth reason;
correspondingly, the solution policy executing module 130 is configured to obtain a preset solution policy corresponding to the target reason, and execute the solution policy, including:
acquiring a fifth solution strategy corresponding to the preset fifth reason, and executing the fifth solution strategy; the fifth solution strategy includes: canceling the native mode of the hard disk IO of the single cloud host, and restarting the single cloud host.
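By way of illustration only, detecting and canceling the native mode may be sketched as follows. The sketch assumes that the cloud host is described by a libvirt-style domain XML in which the disk `driver` element carries an `io="native"` attribute; the embodiment does not name the virtualization stack, so this mapping, the function name, and the sample XML are all assumptions.

```python
import xml.etree.ElementTree as ET


def clear_native_io(domain_xml):
    """Remove io="native" from every disk driver; return (new_xml, changed_count)."""
    root = ET.fromstring(domain_xml)
    changed = 0
    for disk in root.iter("disk"):
        driver = disk.find("driver")
        if driver is not None and driver.get("io") == "native":
            del driver.attrib["io"]   # fall back to the hypervisor default
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical, heavily abbreviated domain definition used only for this sketch.
SAMPLE_DOMAIN = (
    '<domain><devices><disk type="file" device="disk">'
    '<driver name="qemu" type="raw" io="native"/>'
    '</disk></devices></domain>'
)
```

A non-zero `changed_count` corresponds to the fifth reason being matched; the rewritten definition would take effect only after the single cloud host is restarted, as the fifth solution strategy requires.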
Optionally, the target cause determining module 120 is configured to analyze, according to a preset cause description corresponding to each cause causing the defect of cloud service quality in the defect form, operation data corresponding to the cloud service, and determine a target cause matched from the operation data, and specifically includes:
if the determined cloud service quality detection results of the server and the cloud host indicate that the cloud hosts have a jitter defect and the cloud hosts with the jitter defect have no commonality, analyzing the operation data, and judging whether the utilization rate of the CPU corresponding to the cloud host is greater than a set utilization rate and whether the network packet amount of the cloud host is greater than the set network packet amount;
if the utilization rate of the CPU corresponding to the cloud host is not greater than the set utilization rate and the network packet amount of the cloud host is not greater than the set network packet amount, analyzing corresponding operation data of the cloud service according to the reason description of the preset sixth reason, and judging whether the memory consumption proportion of the server is greater than the set consumption proportion and whether the memory of the cloud host is being swapped;
if the memory consumption proportion of the server is greater than the set consumption proportion and the memory of the cloud host is being swapped, determining that the target reason is the sixth reason;
correspondingly, the solution policy executing module 130 is configured to obtain a preset solution policy corresponding to the target reason, and execute the solution policy, including:
acquiring a preset sixth solution strategy corresponding to the sixth reason, and executing the sixth solution strategy; the sixth solution strategy includes: canceling the swap partition on the server.
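By way of illustration only, the sixth-reason check (memory consumption above the set proportion together with swapping) may be sketched as follows. The sketch assumes a Linux server whose memory figures are read in `/proc/meminfo` format, and it uses server-wide swap usage as a stand-in for the cloud host being swapped; both are assumptions, as are the function names and the sample figures.

```python
def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integers (kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def matches_sixth_reason(meminfo, consumption_limit):
    """Memory consumption above the set proportion and swap space in use."""
    used_ratio = 1 - meminfo["MemAvailable"] / meminfo["MemTotal"]
    swap_used = meminfo["SwapTotal"] - meminfo["SwapFree"]
    return used_ratio > consumption_limit and swap_used > 0


# Hypothetical figures used only for this sketch.
SAMPLE_MEMINFO = """MemTotal:       16000000 kB
MemAvailable:    1600000 kB
SwapTotal:       4000000 kB
SwapFree:        3000000 kB"""
```

When the check matches, the sixth solution strategy would be carried out on the server, for example by disabling its swap space.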
The cloud service quality monitoring device provided by the embodiment of the invention can accurately locate the reason causing a cloud service quality defect and, by executing the preset solution strategy corresponding to the located reason, resolve the defect, thereby improving the monitoring effect and ensuring the cloud service quality.
The embodiment of the invention also provides a server, which can comprise the cloud service quality monitoring device.
Optionally, fig. 10 shows an optional hardware structure of the server. Referring to fig. 10, the server may include: a processor 1, a communication interface 2, a memory 3 and a communication bus 4;
wherein, the processor 1, the communication interface 2 and the memory 3 complete the communication with each other through the communication bus 4;
optionally, the communication interface 2 may be an interface of a communication module, such as an interface of a GSM module;
the processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.
The memory 3 may comprise a high-speed RAM memory and may also comprise a non-volatile memory, such as at least one disk memory.
Wherein, the processor 1 is specifically configured to:
receiving network test detection packets respectively aiming at a server and a cloud host in the server;
determining a cloud service quality detection result of the server according to the network test detection packet for the server, and determining a cloud service quality detection result of the cloud host according to the network test detection packet for the cloud host;
if the determined cloud service quality detection results of the server and the cloud host indicate that the cloud service quality is defective currently, analyzing corresponding operation data of the cloud service according to preset reasons causing the cloud service quality defect, and determining a target reason matched from the operation data;
and acquiring a preset solution strategy corresponding to the target reason, and executing the solution strategy.
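By way of illustration only, the last two steps performed by the processor (matching a target reason against the operation data, then acquiring the corresponding solution strategy) may be sketched as a table of preset rules. The `Rule` type, the defect form labels, and the keys of the operation data dictionary are hypothetical; only two of the six reasons are listed to keep the sketch short.

```python
from typing import Callable, NamedTuple, Optional


class Rule(NamedTuple):
    defect_form: str                  # label of the defect form (hypothetical)
    reason: str                       # preset reason causing the defect
    matches: Callable[[dict], bool]   # check against the operation data
    strategy: str                     # preset solution strategy


PRESET_RULES = [
    Rule("single_host_delay", "fifth reason",
         lambda data: data.get("disk_io_mode") == "native",
         "cancel the native mode of the hard disk IO and restart the host"),
    Rule("host_jitter_no_commonality", "sixth reason",
         lambda data: data.get("memory_ratio", 0.0) > data.get("memory_limit", 1.0)
         and data.get("swapping", False),
         "cancel the swap partition on the server"),
]


def find_target_rule(defect_form, operation_data, rules=PRESET_RULES) -> Optional[Rule]:
    """Return the first preset rule whose defect form and check both match."""
    for rule in rules:
        if rule.defect_form == defect_form and rule.matches(operation_data):
            return rule
    return None
```

A lookup that returns `None` corresponds to the case in which none of the preset reasons matches the operation data; how that case is handled is outside the scope of this sketch.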
The embodiments in this description are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and for relevant details reference may be made to the description of the method.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (13)

1. A cloud service quality monitoring method is applied to a server, and comprises the following steps:
receiving network test detection packets respectively aiming at a server and a cloud host in the server, wherein the network test detection packets are used for testing network connection quantity;
determining a cloud service quality detection result of the server according to the network test detection packet for the server, and determining a cloud service quality detection result of the cloud host according to the network test detection packet for the cloud host;
if the determined cloud service quality detection results of the server and the cloud host indicate that a cloud service quality defect exists currently, calling operation data corresponding to the cloud service and a preset reason causing the cloud service quality defect, analyzing the operation data corresponding to the cloud service according to the preset reason causing the cloud service quality defect, and determining a target reason matched from the operation data, wherein the target reason belongs to the preset reason causing the cloud service quality defect, and the preset reason causing the cloud service quality defect is obtained through pre-sorting and analyzing;
determining the current equipment with the cloud service quality defect, determining the association between the equipment, obtaining the defect description of the current cloud service quality defect, and determining the defect form of the current cloud service quality defect;
acquiring a preset solution strategy corresponding to the target reason, and executing the solution strategy, wherein the method specifically comprises the following steps: acquiring a fifth solution strategy corresponding to a preset fifth reason, and executing the fifth solution strategy; the fifth solution strategy includes: canceling a native mode of hard disk input and output of a single cloud host, and restarting the single cloud host;
analyzing the corresponding operation data of the cloud service according to the preset reason causing the cloud service quality defect, and determining the target reason matched from the operation data, wherein the target reason comprises the following steps:
analyzing the operation data corresponding to the cloud service according to the preset reason description corresponding to each reason causing the cloud service quality defect in the defect form, and determining the target reason matched from the operation data, wherein the method specifically comprises the following steps: if the determined cloud service quality detection results of the server and the cloud host indicate that a single cloud host in the server has a time delay defect, and the time delay corresponding to the single cloud host is greater than a set time delay upper limit, analyzing corresponding operation data of the cloud service according to a preset reason description of a fifth reason, and judging whether the hard disk input and output of the single cloud host are in a native mode, wherein the reason description of the fifth reason is that the hard disk input and output of the single cloud host are in the native mode; and if the hard disk input and output of the single cloud host are in a native mode, determining that the target reason is the fifth reason.
2. The cloud quality of service monitoring method of claim 1, wherein the defect form comprises:
the server has a jitter defect, and all cloud hosts on the server have the jitter defect;
or, the server has time delay defect and jitter defect;
or, the cloud host has a delay defect and a jitter defect;
or, the servers have jitter defects, and the operating systems of the servers with jitter defects are the same;
or, a single cloud host has a delay defect, and the corresponding delay of the single cloud host is greater than a set delay upper limit;
or the cloud hosts have jitter defects, and the cloud hosts with jitter defects do not have commonality.
3. The cloud service quality monitoring method according to claim 2, wherein analyzing the operation data corresponding to the cloud service according to the preset cause description corresponding to each cause causing the cloud service quality defect in the defect form, and determining the target cause matched from the operation data comprises:
if the determined cloud service quality detection results of the server and the cloud hosts indicate that the server has a jitter defect and all the cloud hosts on the server have a jitter defect, analyzing corresponding operation data of the cloud service according to the reason description of a preset first reason, and judging whether the instantaneous bandwidth of a single cloud host in the server during network concurrency reaches the upper limit of the network bandwidth and whether the occupied server bandwidth proportion reaches the upper limit of the bandwidth proportion;
if the instantaneous bandwidth of a single cloud host in the server during network concurrency reaches the upper limit of the network bandwidth, and the occupied server bandwidth proportion reaches the upper limit of the bandwidth proportion, determining that the target reason is the first reason;
the acquiring a preset solution strategy corresponding to the target reason and executing the solution strategy comprises:
acquiring a first solution strategy corresponding to the preset first reason, and executing the first solution strategy; the first solution strategy comprises: limiting the bandwidth of the cloud host within a set bandwidth range.
4. The cloud service quality monitoring method according to claim 2, wherein analyzing the operation data corresponding to the cloud service according to the preset cause description corresponding to each cause causing the cloud service quality defect in the defect form, and determining the target cause matched from the operation data comprises:
if the determined cloud service quality detection results of the server and the cloud host indicate that the server has a time delay defect and a jitter defect, analyzing corresponding operation data of the cloud service, and judging whether the CPU usage proportion occupied by the load of the non-uniform memory access architecture node of the server is greater than the set CPU usage proportion and the network packet quantity of the cloud host is greater than the set network packet quantity;
if the CPU usage proportion occupied by the load of the non-uniform memory access architecture node of the server is larger than the set CPU usage proportion, and the network packet quantity of the cloud host is larger than the set network packet quantity, analyzing corresponding operation data of the cloud service according to the reason description of a preset second reason, and determining the resource usage change condition of the virtual host and the virtual processor of the server; the reason description of the second reason includes: a virtual host of the server competes with a virtual processor for server resources;
and if the resource usage change condition of the virtual host and the virtual processor reflects that the virtual host and the virtual processor compete for server resources, determining that the target reason is the second reason.
5. The cloud service quality monitoring method according to claim 4, wherein the acquiring a preset solution strategy corresponding to the target reason and executing the solution strategy comprises:
acquiring a second solution strategy corresponding to the preset second reason, and executing the second solution strategy; the second solution strategy includes: migrating the cloud host, or raising the processing priority of the virtual host to the real-time priority.
6. The cloud service quality monitoring method according to claim 2, wherein analyzing the operation data corresponding to the cloud service according to the preset cause description corresponding to each cause causing the cloud service quality defect in the defect form, and determining the target cause matched from the operation data comprises:
if the determined cloud service quality detection results of the server and the cloud host indicate that the cloud host has a time delay defect and a jitter defect, analyzing corresponding operation data of the cloud service according to the reason description of a preset third reason, and judging whether the utilization rate of a CPU corresponding to the cloud host is greater than a set utilization rate and whether the network packet quantity of the cloud host is greater than the set network packet quantity;
if the utilization rate of the CPU corresponding to the cloud host is greater than the set utilization rate and the network packet quantity of the cloud host is greater than the set network packet quantity, determining that the target reason is the third reason;
the acquiring a preset solution strategy corresponding to the target reason and executing the solution strategy comprises:
acquiring a third solution strategy corresponding to the preset third reason, and executing the third solution strategy; the third solution strategy includes: binding the CPU corresponding to the cloud host by logical core.
7. The cloud service quality monitoring method according to claim 2, wherein analyzing the operation data corresponding to the cloud service according to the preset cause description corresponding to each cause causing the cloud service quality defect in the defect form, and determining the target cause matched from the operation data comprises:
if the determined cloud service quality detection results of the server and the cloud host indicate that the server has a jitter defect and the operating systems of the servers with the jitter defect are the same, analyzing the operating data, and judging whether the utilization rate of the CPU corresponding to the cloud host is greater than a set utilization rate and the network packet amount of the cloud host is greater than the set network packet amount;
if the utilization rate of the CPU corresponding to the cloud host is not greater than the set utilization rate and the network packet quantity of the cloud host is not greater than the set network packet quantity, analyzing corresponding operation data of the cloud service according to the reason description of the preset fourth reason, and judging whether the interrupts of the cloud host are distributed across the processing cores of the server;
if the interrupts of the cloud host are distributed across the processing cores of the server, determining that the target reason is the fourth reason;
the acquiring a preset solution strategy corresponding to the target reason and executing the solution strategy comprises:
acquiring a preset fourth solution strategy corresponding to the fourth reason, and executing the fourth solution strategy; the fourth solution strategy includes: binding the cloud host interrupts to the zeroth CPU of the server.
8. The cloud service quality monitoring method according to claim 2, wherein analyzing the operation data corresponding to the cloud service according to the preset cause description corresponding to each cause causing the cloud service quality defect in the defect form, and determining the target cause matched from the operation data comprises:
if the determined cloud service quality detection results of the server and the cloud host indicate that the cloud host has a jitter defect and the cloud host with the jitter defect has no commonality, analyzing the operation data, and judging whether the utilization rate of the CPU corresponding to the cloud host is greater than a set utilization rate and the network packet amount of the cloud host is greater than the set network packet amount;
if the utilization rate of the CPU corresponding to the cloud host is not greater than the set utilization rate and the network packet quantity of the cloud host is not greater than the set network packet quantity, analyzing corresponding operation data of the cloud service according to the reason description of the preset sixth reason, and judging whether the memory consumption proportion of the server is greater than the set consumption proportion and whether the memory of the cloud host is being swapped;
and if the memory consumption proportion of the server is greater than the set consumption proportion and the memory of the cloud host is being swapped, determining that the target reason is the sixth reason.
9. The cloud service quality monitoring method according to claim 8, wherein the acquiring a preset solution strategy corresponding to the target reason and executing the solution strategy comprises:
acquiring a preset sixth solution strategy corresponding to the sixth reason, and executing the sixth solution strategy; the sixth solution strategy includes: canceling the swap partition on the server.
10. A cloud service quality monitoring device is applied to a server, and the device comprises:
the network test detection packet receiving module is used for receiving network test detection packets respectively aiming at the server and the cloud host in the server, and the network test detection packets are used for testing the network connection quantity;
the detection result determining module is used for determining a cloud service quality detection result of the server according to the network test detection packet for the server and determining a cloud service quality detection result of the cloud host according to the network test detection packet for the cloud host;
the target reason determining module is used for calling the corresponding operation data of the cloud service and the preset reason causing the cloud service quality defect if the determined cloud service quality detection results of the server and the cloud host indicate that the cloud service quality defect exists at present, analyzing the corresponding operation data of the cloud service according to the preset reason causing the cloud service quality defect, and determining a target reason matched from the operation data, wherein the target reason belongs to the preset reason causing the cloud service quality defect, and the preset reason causing the cloud service quality defect is obtained by pre-sorting and analyzing the reason causing the cloud service quality defect;
the solution strategy execution module is used for acquiring a preset solution strategy corresponding to the target reason and executing the solution strategy;
the solution strategy execution module is specifically configured to acquire a fifth solution strategy corresponding to a preset fifth reason, and execute the fifth solution strategy; the fifth solution strategy includes: canceling a native mode of hard disk input and output of a single cloud host, and restarting the single cloud host;
the defect form determining module is used for determining the defect form of the quality defect of the cloud server;
the target cause determining module is configured to analyze operation data corresponding to the cloud service according to preset causes causing cloud service quality defects, and determine a target cause matched from the operation data, and specifically includes:
analyzing the operation data corresponding to the cloud service according to the preset reason description corresponding to each reason causing the cloud service quality defect in the defect form, and determining the target reason matched from the operation data, wherein the method specifically comprises the following steps: if the determined cloud service quality detection results of the server and the cloud host indicate that a single cloud host in the server has a time delay defect, and the time delay corresponding to the single cloud host is greater than a set time delay upper limit, analyzing corresponding operation data of the cloud service according to a preset reason description of a fifth reason, and judging whether the hard disk input and output of the single cloud host are in a native mode, wherein the reason description of the fifth reason is that the hard disk input and output of the single cloud host are in the native mode; if the hard disk input and output of the single cloud host are in a native mode, determining that the target reason is the fifth reason;
the device is further used for determining the current equipment with the cloud service quality defect, determining the association between the equipment, obtaining the defect description of the current cloud service quality defect, and determining the defect form of the current cloud server quality defect.
11. The cloud quality of service monitoring apparatus of claim 10, wherein the defect form comprises:
the server has a jitter defect, and all cloud hosts on the server have the jitter defect;
or, the server has time delay defect and jitter defect;
or, the cloud host has a delay defect and a jitter defect;
or, the servers have jitter defects, and the operating systems of the servers with jitter defects are the same;
or, a single cloud host has a delay defect, and the corresponding delay of the single cloud host is greater than a set delay upper limit;
or the cloud hosts have jitter defects, and the cloud hosts with jitter defects do not have commonality.
12. A server, characterized by comprising the cloud service quality monitoring apparatus of any one of claims 10 to 11.
13. A computer-readable storage medium comprising instructions for performing the method of any of claims 1-9.
CN201710103863.5A 2017-02-24 2017-02-24 Cloud service quality monitoring method and device and server Active CN108512673B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710103863.5A CN108512673B (en) 2017-02-24 2017-02-24 Cloud service quality monitoring method and device and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710103863.5A CN108512673B (en) 2017-02-24 2017-02-24 Cloud service quality monitoring method and device and server

Publications (2)

Publication Number Publication Date
CN108512673A CN108512673A (en) 2018-09-07
CN108512673B true CN108512673B (en) 2021-08-03

Family

ID=63373810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710103863.5A Active CN108512673B (en) 2017-02-24 2017-02-24 Cloud service quality monitoring method and device and server

Country Status (1)

Country Link
CN (1) CN108512673B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110740078B (en) * 2019-09-26 2023-08-22 平安科技(深圳)有限公司 Proxy monitoring method of server and related products
CN110784337B (en) * 2019-09-26 2023-08-22 平安科技(深圳)有限公司 Cloud service quality monitoring method and related products
CN111913660B (en) * 2020-07-15 2022-11-18 郑州阿帕斯数云信息科技有限公司 Dotting data processing method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1777126A (en) * 2005-12-12 2006-05-24 史文勇 System and method for conducting comprehensive measurement and association analysis to time delay and drop
CN1897547A (en) * 2005-07-14 2007-01-17 华为技术有限公司 Method for inspecting Qos in telecommunication network
CN104038392A (en) * 2014-07-04 2014-09-10 云南电网公司 Method for evaluating service quality of cloud computing resources
CN105760230A (en) * 2016-02-18 2016-07-13 广东睿江云计算股份有限公司 Method and device for automatically adjusting operation of cloud host
CN106411647A (en) * 2016-10-13 2017-02-15 腾讯科技(深圳)有限公司 Communication quality detection method and detection server

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101409651B (en) * 2008-11-26 2012-11-07 中国电信股份有限公司 Method, system and equipment for monitoring soft exchange load-bearing network quality
CN102035691A (en) * 2009-09-28 2011-04-27 中国移动通信集团公司 Method and device for detecting quality of network link
CN102692896B (en) * 2011-11-17 2013-12-11 上海理工大学 System for remotely maintaining printer in real time based on virtual reality technology
CN106130809B (en) * 2016-09-07 2019-06-25 东南大学 A kind of IaaS cloud platform network failure locating method and system based on log analysis

Also Published As

Publication number Publication date
CN108512673A (en) 2018-09-07

Similar Documents

Publication Publication Date Title
CN106302434B (en) Server adaptation method, device and system
US20180375726A1 (en) Resource Configuration Method, Virtualized Network Function Manager, and Element Management System
US11876731B2 (en) System and methods for sharing memory subsystem resources among datacenter applications
CN108512673B (en) Cloud service quality monitoring method and device and server
CN112003797B (en) Method, system, terminal and storage medium for improving performance of virtualized DPDK network
CN110224943B (en) Flow service current limiting method based on URL, electronic equipment and computer storage medium
CN113067875B (en) Access method, device and equipment based on dynamic flow control of micro-service gateway
US11048632B2 (en) Data storage system with performance-based distribution of I/O requests to processing cores
CN110557432B (en) Cache pool balance optimization method, system, terminal and storage medium
CN114138481A (en) Data processing method, device and medium
CN111597041B (en) Calling method and device of distributed system, terminal equipment and server
CN112596985A (en) IT asset detection method, device, equipment and medium
CN108804152B (en) Method and device for adjusting configuration parameters
CN110309036B (en) CPU occupancy rate detection method and detection equipment
CN115576698A (en) Network card interrupt aggregation method, device, equipment and medium
CN111427673B (en) Load balancing method, device and equipment
CN111352710A (en) Process management method and device, computing equipment and storage medium
CN109491948B (en) Data processing method and device for double ports of solid state disk
CN116431327B (en) Task current limiting processing method and fort machine
CN108289084B (en) Access traffic blocking method and apparatus, and non-transitory computer-readable storage medium
CN115033390B (en) Load balancing method and device
CN113992589B (en) Message distribution method and device and electronic equipment
CN114338169B (en) Request processing method, device, server and computer readable storage medium
CN116501450B (en) Translation control method, binary translation method, instruction execution method and device
US20240134731A1 (en) Intelligent exposure of hardware latency statistics within an electronic device or system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant