CN115865624A - Root cause positioning method of performance bottleneck in host, electronic equipment and storage medium - Google Patents


Info

Publication number
CN115865624A
CN115865624A (application CN202211500602.4A)
Authority
CN
China
Prior art keywords
path
host
tested
abnormal
performance
Prior art date
Legal status
Pending
Application number
CN202211500602.4A
Other languages
Chinese (zh)
Inventor
江卓
刘克非
魏浩然
钟小龙
王剑
Current Assignee
Beijing Youzhuju Network Technology Co Ltd
Original Assignee
Beijing Youzhuju Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Youzhuju Network Technology Co Ltd filed Critical Beijing Youzhuju Network Technology Co Ltd
Priority to CN202211500602.4A
Publication of CN115865624A
Legal status: Pending

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The disclosure relates to the technical field of computer networks, and in particular to a root cause location method for a performance bottleneck in a host, an electronic device, and a storage medium. The method includes: acquiring a path topological structure in a host to be tested; performing a loopback test on a target path in the host to be tested based on the path topological structure and determining the path performance of the target path, where the target path is a path between a network card in the host to be tested and an endpoint in the host to be tested; and analyzing the path state of the target path based on the path performance to determine the root cause of the performance bottleneck in the host to be tested. Root cause location is performed on the basis of the loopback test: the loopback test ensures the reliability of the measured path performance, and the performance bottleneck in the host is characterized by the path state, so the root cause of the performance bottleneck can be located accurately.

Description

Root cause positioning method of performance bottleneck in host, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer network technologies, and in particular, to a root cause location method of a performance bottleneck in a host, an electronic device, and a storage medium.
Background
Remote Direct Memory Access (RDMA) is widely used in data-center applications such as distributed machine learning and distributed storage to achieve high throughput and low latency. As the last hop of network communication, the network within the host significantly affects the performance of network applications. However, the network within the host can itself become a bottleneck: in-host bandwidth may drop because of a sudden link failure or because other traffic occupies the links, and when in-host bandwidth drops, traffic on the RDMA network card is more likely to be blocked. Further, in a multi-machine system such as a distributed training scenario, a performance bottleneck on one host can severely reduce the throughput of the entire system and, in severe cases, even stall the training task. It is therefore desirable to accurately locate the root cause of a performance bottleneck in the host to avoid impacting service performance.
Disclosure of Invention
In view of this, embodiments of the present disclosure provide a root cause location method for a performance bottleneck in a host, an electronic device, and a storage medium, so as to solve the problem of locating the root cause of a performance bottleneck in a host.
According to a first aspect, an embodiment of the present disclosure provides a root cause location method for a performance bottleneck in a host, including:
acquiring a path topological structure in a host to be tested;
performing loopback test on a target path in the host to be tested based on the path topological structure, and determining the path performance of the target path, wherein the target path is a path between a network card in the host to be tested and an endpoint in the host to be tested;
and analyzing the path state of the target path based on the path performance, and determining the root cause of the performance bottleneck in the host to be tested.
According to a second aspect, an embodiment of the present disclosure further provides a root cause locating device of a performance bottleneck in a host, including:
the acquisition module is used for acquiring a path topological structure in the host to be tested;
the test module is used for carrying out loopback test on a target path in the host to be tested based on the path topological structure and determining the path performance of the target path, wherein the target path is a path between a network card in the host to be tested and an end point in the host to be tested;
and the analysis module is used for analyzing the path state of the target path based on the path performance and determining the root cause of the performance bottleneck in the host to be tested.
According to a third aspect, an embodiment of the present disclosure provides an electronic device, including: a memory and a processor, the memory and the processor being communicatively connected to each other, the memory storing therein computer instructions, and the processor executing the computer instructions to perform the method for root cause localization of a performance bottleneck in a host according to the first aspect or any one of the embodiments of the first aspect.
According to a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing computer instructions for causing a computer to execute the method for root cause localization of a performance bottleneck in a host according to the first aspect or any one of the implementation manners of the first aspect.
According to a fifth aspect, embodiments of the present disclosure provide a root cause localization system of a performance bottleneck in a data center, including:
a host;
a root cause locating device, configured to determine the root cause of the performance bottleneck of each host by executing the root cause location method for a performance bottleneck in a host according to the first aspect or any one implementation manner of the first aspect, and to determine, when no host exhibits a performance bottleneck, that the performance bottleneck of the data center is a network bottleneck.
According to the root cause location method for a performance bottleneck in a host provided by the embodiments of the present disclosure, a loopback test is performed on a target path in the host to be tested to determine the path performance of the target path, and the path state of the target path is then analyzed based on the path performance to locate the root cause of the performance bottleneck in the host to be tested. The method performs root cause location on the basis of the loopback test: the loopback test ensures the reliability of the measured path performance, and the performance bottleneck in the host is characterized by the path state, so the root cause of the performance bottleneck can be located accurately.
Drawings
To illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present disclosure, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flow chart of a method for root cause location of a performance bottleneck in a host according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of a method of root cause location of a performance bottleneck within a host according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of a method of root cause location of a performance bottleneck within a host according to an embodiment of the present disclosure;
FIG. 4 is a block diagram of a root cause locator device of a performance bottleneck in a host, according to an embodiment of the present disclosure;
fig. 5 is a schematic hardware structure diagram of an electronic device provided by an embodiment of the present disclosure;
fig. 6 is a block diagram of a root cause location system of a performance bottleneck in a data center according to an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, but not all, embodiments of the present disclosure. All other embodiments derived by a person skilled in the art from the embodiments disclosed herein without creative effort shall fall within the protection scope of the present disclosure.
As described above, the RDMA network card is the last hop of network communication, and network performance within the host significantly affects the performance of network applications. However, the network within the host can also become a bottleneck, and in-host bandwidth can drop because of sudden link failures or occupation by other traffic. In the past, in-host bandwidth far exceeded the line rate of the RDMA network card (for example, a 25 Gb/s RDMA network card against the 62.96 Gb/s of PCIe Gen3 x8), which left ample bandwidth headroom for traffic on the RDMA network card. The network within the host therefore rarely became a bottleneck for network communication.
However, as applications continue to pursue high throughput and low latency, the line rate of RDMA network cards is rising rapidly, e.g., from 25 Gb/s to 200 Gb/s. In-host bandwidth has not kept pace; for example, PCIe bandwidth has increased only from 62.96 Gb/s to 252.06 Gb/s. Traffic on the RDMA network card is therefore more likely to be blocked when in-host bandwidth drops. Furthermore, both the topology and the traffic patterns within the host have become more complex, which makes in-host bandwidth drops caused by link failures or traffic contention more frequent. As traffic within the host grows more complex, the number of configuration items in the host also increases substantially, raising the probability of misconfiguration. Some of these configurations, such as turning on access control services, redirect traffic between a GPU (Graphics Processing Unit) and the RDMA network card through the CPU root complex, drastically increasing latency and severely reducing in-host bandwidth.
A performance bottleneck in the host can severely impact service performance: if in-host bandwidth falls below the receive rate of the RDMA network card, the receive buffer of the network card may back up or even fill. If this happens in a lossy environment, the RDMA network card may drop packets, and since RDMA is very sensitive to packet loss, even a very low loss rate causes a dramatic drop in throughput. If this happens in a lossless environment, the RDMA network card sends Priority-based Flow Control (PFC) pause frames to the egress port of the upstream switch to stop its traffic; if it keeps sending PFC pause frames, it may trigger a PFC storm and bring down the entire network. Furthermore, in a multi-machine system such as a distributed training scenario, a performance bottleneck on one host can severely reduce the throughput of the entire system and, in severe cases, even stall the training task. Therefore, when a network bottleneck occurs in the host, it needs to be discovered and accurately located as soon as possible.
Existing bottleneck-location mechanisms lack a monitoring system for network bottlenecks within the host, so a performance bottleneck in the host cannot be discovered promptly when it occurs. By the time the service side reports degraded service performance to the network team, service performance has typically already been severely affected. Because the performance degradation caused by an in-host bottleneck and by an in-network bottleneck can look similar, the first step when network performance degrades is to determine whether the problem lies in the network or in the host. Even once the bottleneck is confirmed to be in the host, an operator has to log in to the host, run a series of test cases, and use analysis tools to infer the bottleneck, so the whole process is time-consuming.
Moreover, existing analysis tools can each monitor only a particular device. Because each host may contain a different combination of devices, each diagnosis may require a different combination of tools, which incurs additional learning and execution overhead.
Based on this, the embodiments of the present disclosure provide a root cause location method for a performance bottleneck in a host, which can quickly discover a performance bottleneck in the host and diagnose its root cause. The scheme can be deployed on all RDMA servers at low cost, and at runtime it identifies the vendors of the devices in the host so that the APIs provided by different device vendors can be called automatically. For example, when the scheme identifies the CPU as an Intel CPU, it calls the interface provided by Intel; when it identifies the CPU as an AMD CPU, it calls the interface provided by AMD, so the scheme adapts to devices from different vendors. When a bottleneck appears in the host, the scheme can quickly discover it and automatically diagnose its root cause. Therefore, when network performance degrades, whether the cause lies in the host or in the network can be judged quickly based on this method.
The core idea of the scheme is to perform loopback tests on the paths within the host to determine their path states and to infer the performance bottleneck from those states. The scheme can be implemented on commodity RDMA network cards and deployed on all RDMA servers. Of course, the scheme may also be deployed on other types of servers; its specific application scenario is not limited here.
In accordance with an embodiment of the present disclosure, a method embodiment for root cause location of a performance bottleneck in a host is provided. It is noted that the steps illustrated in the flowcharts of the accompanying figures may be performed in a computer system such as a set of computer-executable instructions, and that, although logical sequences are illustrated in the flowcharts, in some cases the steps shown or described may be performed in an order different from that presented herein.
This embodiment provides a root cause location method for a performance bottleneck in a host, which can be used in electronic devices such as computers and test equipment. Fig. 1 is a flowchart of the root cause location method for a performance bottleneck in a host according to an embodiment of the present disclosure; as shown in Fig. 1, the flow includes the following steps:
s11, obtaining a path topological structure in the host to be tested.
The path topological structure is known in advance: hosts of the same model share the same internal path topological structure, so the path topological structure of the host to be tested can be obtained from its model information when the host to be tested is connected to the network.
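As an illustration of the lookup described above, the topology can be kept as a model-keyed table and the target paths enumerated as NIC-to-endpoint pairs. The model name, node names, and table contents below are hypothetical assumptions for the sketch, not values from the disclosure:

```python
from itertools import product

# Hypothetical in-host topology table, keyed by host model.
TOPOLOGY_BY_MODEL = {
    "model-A": {
        "nics": ["nic0", "nic1"],               # RDMA network cards
        "endpoints": ["mem0", "mem1", "gpu0"],  # in-host endpoints
    },
}

def paths_for_model(model: str):
    """Enumerate every NIC-to-endpoint pair as a candidate target path."""
    topo = TOPOLOGY_BY_MODEL[model]
    return list(product(topo["nics"], topo["endpoints"]))
```

With the table above, `paths_for_model("model-A")` yields the six NIC-endpoint pairs that a full loopback sweep would cover.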
S12, performing loopback test on the target path in the host to be tested based on the path topological structure, and determining the path performance of the target path.
The target path is a path between a network card in the host to be tested and an endpoint in the host to be tested.
The target paths in the host to be tested may be all paths in the host, that is, the paths between all network cards and all endpoints in the host to be tested, or may be designated paths, for example the paths connected to an abnormal network card. The target paths may be selected according to user interaction or according to the state of the host to be tested. The more target paths participate in the loopback test, the more computing resources are occupied, so setting the target paths according to the state of the host to be tested greatly reduces the impact of the loopback test on the host's service traffic.
The loopback test is a data transfer in which data travels from an endpoint in the host to be tested to the network card connected to that endpoint and then back to the endpoint; the path performance of the target path is determined by recording the time this process takes. Path performance includes, but is not limited to, latency or bandwidth. A loopback test is performed on each target path to determine the path performance of each target path.
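The timing logic of one loopback measurement can be sketched as follows. Plain Python buffer copies stand in for the DMA transfers that the RDMA network card performs in the actual scheme, and the function name is hypothetical:

```python
import time

def loopback_once(first_buf: bytearray, second_buf: bytearray, n: int) -> float:
    """One loopback: copy n bytes out of the first memory space and back
    into the second, returning the elapsed wall-clock time. In the real
    scheme the copy is performed by the RDMA network card, not the CPU."""
    t_start = time.perf_counter()    # moment the read begins
    payload = bytes(first_buf[:n])   # endpoint -> network card cache
    second_buf[:n] = payload         # network card -> endpoint
    t_end = time.perf_counter()      # moment the store completes
    return t_end - t_start

# Example: round-trip 4 KiB between two endpoint buffers.
first = bytearray(b"x" * 4096)
second = bytearray(4096)
lat = loopback_once(first, second, 4096)
```

Because `time.perf_counter()` is monotonic, the returned latency is never negative, mirroring the second-time-minus-first-time measurement in the text.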
And S13, analyzing the path state of the target path based on the path performance, and determining the root cause of the performance bottleneck in the host to be tested.
The target path may include a plurality of links. Specifically, the target path is the path between the network card and the endpoint; nodes may lie between the network card and the endpoint, dividing the path into several links, where a connection between adjacent nodes is called a link. After the processing of S12, the path performance of the target path is available and can be compared with a performance threshold to determine whether the path state of the target path is normal or abnormal. The links in an abnormal path are then analyzed to find the abnormal link and thereby determine the root cause of the performance bottleneck in the host to be tested. Root causes include, but are not limited to, link failure, abnormal configuration, and bus overload.
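A minimal sketch of the threshold comparison in S13; the threshold values are deployment-specific assumptions, not values from the disclosure:

```python
def classify_path(delay_s: float, bandwidth_gbps: float,
                  delay_thresh_s: float, bw_thresh_gbps: float) -> str:
    """Label a target path by comparing its measured performance with
    per-deployment thresholds: too slow or too narrow means abnormal."""
    if delay_s > delay_thresh_s or bandwidth_gbps < bw_thresh_gbps:
        return "abnormal"
    return "normal"

# Root cause labels named in the text, as a simple set for illustration.
ROOT_CAUSES = ("link failure", "abnormal configuration", "bus overload")
```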
In the root cause location method for a performance bottleneck in a host provided by this embodiment, a loopback test is performed on a target path in the host to be tested to determine the path performance of the target path, and the path state of the target path is then analyzed based on the path performance to locate the root cause of the performance bottleneck in the host to be tested. The method performs root cause location on the basis of the loopback test: the loopback test ensures the reliability of the measured path performance, and the performance bottleneck in the host to be tested is characterized by the path state, so the root cause of the performance bottleneck can be located accurately.
This embodiment provides a root cause location method for a performance bottleneck in a host, which can be used in electronic devices such as computers and test equipment. Fig. 2 is a flowchart of the root cause location method for a performance bottleneck in a host according to an embodiment of the present disclosure; as shown in Fig. 2, the flow includes the following steps:
and S21, acquiring a path topological structure in the host to be tested.
Please refer to S11 in fig. 1 for details, which are not described herein again.
S22, performing loopback test on the target path in the host to be tested based on the path topological structure, and determining the path performance of the target path.
And the target path is a path between the network card in the host to be tested and the endpoint in the host to be tested.
Specifically, the above S22 includes:
s221, a first memory space and a second memory space are registered in a target endpoint of a host under test.
Wherein the target endpoint is an endpoint of the target path.
For each target path, two memory spaces are registered at the endpoint of the target path: a first memory space for transmitting data and a second memory space for receiving data. The two memory spaces can be registered when a loopback test is required and deregistered when the loopback test finishes, so as to release the memory resources of the endpoint.
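The register-then-release lifecycle can be sketched with a context manager; plain bytearrays stand in for RDMA memory registration here, and all names are illustrative:

```python
from contextlib import contextmanager

@contextmanager
def loopback_memory(size: int):
    """Register a transmit buffer (first memory space) and a receive buffer
    (second memory space) for one loopback test, releasing both afterwards."""
    first = bytearray(size)   # first memory space: data to transmit
    second = bytearray(size)  # second memory space: data received
    try:
        yield first, second
    finally:
        # deregistration: drop the references so endpoint memory is freed
        del first, second
```

Usage mirrors the text: buffers exist only for the duration of the test, e.g. `with loopback_memory(4096) as (tx, rx): ...`.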
S222, issuing a first data reading instruction to the network card, so as to read the first data with the first length from the first memory space to the cache space of the network card and write the read first data into the second memory space.
The path between the network card and the endpoint is the target path; that is, for each target path, the corresponding network card is determined and a first data reading instruction is issued to it, instructing the network card to read first data of a first length from the first memory space into its own cache space and then write the first data directly into the second memory space. The first length is set according to actual requirements and is not limited here.
S223, recording a first time for reading the first data from the first memory space, and a second time for the network card to store the first data in the second memory space.
And S224, determining the delay of the target path based on the first time difference between the second time and the first time so as to determine the path performance of the target path.
When the network card corresponding to the target path reads the first data from the first memory space, the read time is recorded as the first time; correspondingly, when the network card stores the first data into the second memory space, the store time is recorded as the second time. The duration of the loopback test is measured by the time difference between the second time and the first time. The whole process takes place entirely within the host to be tested and requires no participation of the network outside the host, so the loopback test result reflects the path performance of the target path within the host to be tested.

The time Lat of the loopback test can be expressed by the following formula:

Lat = T_proc + Lat_host + Size / BW_host        (1)

where T_proc is the processing delay of the network card and comprises two parts: the interval from the issuing of the first data reading instruction until the network card sends the first read request, and the interval from the receipt of all write packets sent by the network card until the storage of the first data completes; Lat_host is the propagation delay of a packet within the host to be tested, i.e., the round-trip delay of a packet from the network card to the endpoint and back to the network card; and Size / BW_host is the transmission delay of sending the message within the host to be tested, i.e., the message size Size divided by the in-host bus bandwidth BW_host, where the bus bandwidth is determined by the minimum of the bus read bandwidth and the bus write bandwidth.
In some embodiments, the S224 includes:
(1) And issuing a second data reading instruction to the network card so as to read second data with a second length from the first memory space to the cache space of the network card and write the read second data into the second memory space, wherein the second length is greater than the first length.
(2) And recording a third time for the network card to read the second data from the first memory space and a fourth time for the network card to store the second data into the second memory space.
(3) Calculating a second time difference between the fourth time and the third time, and determining the bandwidth of the target path based on the difference between the second time difference and the first time difference.
The second data of the second length is read and stored in the same way as the first data of the first length, which is not repeated here. Reading and storing the second data of the second length yields the second time difference, and the difference between the second time difference and the first time difference reflects the bandwidth of the target path.
Specifically, in conjunction with equation (1) above, when the first data of the (small) first length is used, the transmission-delay term is negligible and equation (1) can be approximated as:

Lat_1 ≈ T_proc + Lat_host        (2)

When the network card utilization is low and no in-host bottleneck is hit, the processing delay T_proc typically remains unchanged, so the measured loopback delay reflects the delay of the target path in the host to be tested. When the second data of the (larger) second length is used, equation (1) can be approximated as:

Lat_2 ≈ T_proc + Lat_host + Size / BW_host        (3)

and the measured loopback delay reflects the bandwidth of the target path in the host to be tested. Subtracting (2) from (3) gives Lat_2 - Lat_1 ≈ Size / BW_host, so the bandwidth of the target path can be obtained from the difference between the second time difference and the first time difference without using an extremely large amount of data.
Because the path delay consists of three parts, namely the processing delay of the network card, the propagation delay of a packet in the host, and the transmission delay of sending the message in the host, and because the first two parts can be considered approximately equal whether the first data of the first length or the second data of the second length is used, subtracting the delay measured with the first data from the delay measured with the second data isolates the third part, from which the bandwidth of the target path is calculated. This approach therefore does not have to use extremely large data for the bandwidth test.
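Under these approximations the bandwidth computation reduces to one line. The message size and latencies below are illustrative numbers, not values from the disclosure:

```python
def bandwidth_gbps(size2_bytes: int, lat1_s: float, lat2_s: float) -> float:
    """BW_host ~ Size2 / (Lat2 - Lat1): the NIC processing delay and the
    in-host propagation delay cancel in the subtraction, assuming the
    small first message's transmission delay is negligible."""
    return size2_bytes * 8 / (lat2_s - lat1_s) / 1e9

# Example: a 1 MiB second message whose loopback takes 100 us longer
# than the small first message implies roughly 83.9 Gb/s of bandwidth.
bw = bandwidth_gbps(1 << 20, 10e-6, 110e-6)
```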
In some embodiments, the S22 further includes:
(1) And acquiring the state of the host to be tested, wherein the state comprises idle or busy.
(2) And determining a target path based on the state of the host to be tested.
(3) And performing loopback test on the target path to determine the path performance of the target path.
The state of the host to be tested is monitored in real time while it runs. If the host to be tested is idle, a loopback test is performed on all paths in the host to further ensure the accuracy of root cause location; if the host to be tested is busy, the loopback test may be performed only on designated paths. For details of the loopback test, refer to the description above.
Specifically, when the state of the host to be tested is idle, the target paths are determined to be the paths from all network cards to all endpoints in the host. When the state of the host to be tested is busy, the target paths are determined to be the paths from the abnormal network card to the first target endpoints in the host, where the first target endpoints comprise all memory nodes of the host and the graphics processors under the same central processing unit root node as the abnormal network card.
Target paths are determined separately for each state of the host to be tested, and loopback tests are performed on the corresponding target paths, which reduces the impact of the loopback tests on service traffic. For links under the same root node as the abnormal network card, the scheme uses only the abnormal network card to judge the link states, so as to reduce the impact on service traffic and avoid wrongly inferring the bottleneck in the host to be tested.
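The idle/busy path selection described above can be sketched as follows; the path and endpoint names are illustrative:

```python
def select_target_paths(host_state: str, all_paths, abnormal_nic: str,
                        first_target_endpoints) -> list:
    """Idle host: test every NIC-to-endpoint path. Busy host: test only
    the paths from the abnormal NIC to its first target endpoints (memory
    nodes plus GPUs under the same CPU root node)."""
    if host_state == "idle":
        return list(all_paths)
    return [(nic, ep) for (nic, ep) in all_paths
            if nic == abnormal_nic and ep in first_target_endpoints]

paths = [("nic0", "mem0"), ("nic0", "gpu0"),
         ("nic1", "mem0"), ("nic1", "gpu1")]
busy_targets = select_target_paths("busy", paths, "nic0", {"mem0", "gpu0"})
```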
And S23, analyzing the path state of the target path based on the path performance, and determining the root cause of the performance bottleneck in the host to be tested.
Please refer to S13 in fig. 1 for details, which are not described herein again.
According to the root cause location method for a performance bottleneck in a host provided above, the loopback test of the target path is implemented by reading the first data from the first memory space and storing it into the second memory space; owing to the convenience of this data reading and writing, the path performance of the target path can be determined quickly.
This embodiment provides a root cause location method for a performance bottleneck in a host, which can be used in electronic devices such as computers and test equipment. Fig. 3 is a flowchart of the root cause location method for a performance bottleneck in a host according to an embodiment of the present disclosure; as shown in Fig. 3, the flow includes the following steps:
and S31, acquiring a path topological structure in the host to be tested.
Please refer to S11 in fig. 1 for details, which are not described herein again.
S32, performing loopback test on the target path in the host to be tested based on the path topological structure, and determining the path performance of the target path.
And the target path is a path between the network card in the host to be tested and the endpoint in the host to be tested.
Please refer to S12 in fig. 1 for details, which are not described herein again.
And S33, analyzing the path state of the target path based on the path performance, and determining the root cause of the performance bottleneck in the host to be tested.
Specifically, the above S33 includes:
and S331, determining an abnormal link in the target path based on the size of the path performance.
As described above, path performance includes, but is not limited to, delay or bandwidth. The delay or bandwidth obtained by the loopback test is compared with the corresponding delay threshold or bandwidth threshold to determine the abnormal paths among the target paths. The abnormal paths are then analyzed further to determine the abnormal links.
Alternatively, the path performances are compared to determine their average value; the path performance of each target path is compared with the average to determine the abnormal paths and, from them, the abnormal links.
In some embodiments, S331 includes:
(1) Determining the state information of the target path based on the magnitude of the path performance, wherein the state information comprises a normal path or an abnormal path.
(2) Querying the state of each link in the abnormal path, and taking a link whose state is uncertain as the abnormal link.
(3) When an abnormal path on which the states of all links are normal exists, identifying the state of that abnormal path as a jitter state.
The determination of the state information of the target path is as described above and is not repeated herein. Of course, if the state information of all the target paths is determined to be a normal path, it can be determined that there is no performance bottleneck in the host to be tested; if an abnormal path exists, it is determined that a performance bottleneck exists in the host to be tested, and the bottleneck root cause is then located.
Through the processing of the above steps, the normal paths and the abnormal paths can be determined. All links are first marked as being in an uncertain state; then all normal paths are traversed, and every link on a normal path is marked as being in a normal state. All abnormal paths are then traversed, and any link on an abnormal path that is still in the uncertain state is marked as an abnormal link.
Further, if the states of all links on an abnormal path are normal, a jittered link may exist on that path, so the states of all links on the path are identified as jitter states, for example represented by a gray state. If a link is judged to be in the gray state for N consecutive measurement cycles, the link is diagnosed as a jittered link, where N may be set according to actual requirements, for example N = 2 or 3.
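The marking procedure above can be condensed into a minimal sketch, assuming each path is given as the list of links it traverses. The state names and the N-consecutive-cycle jitter rule follow the text; the data representation is an illustrative assumption.

```python
def mark_links(normal_paths, abnormal_paths):
    """normal_paths / abnormal_paths: lists of paths, each path a list of link ids.
    Returns {link: state}, state in {'uncertain', 'normal', 'abnormal', 'gray'}."""
    # 1) every link starts in the uncertain state
    state = {link: "uncertain"
             for path in normal_paths + abnormal_paths for link in path}
    # 2) traverse all normal paths: every link on them is normal
    for path in normal_paths:
        for link in path:
            state[link] = "normal"
    # 3) traverse all abnormal paths: still-uncertain links become abnormal
    for path in abnormal_paths:
        for link in path:
            if state[link] == "uncertain":
                state[link] = "abnormal"
    # 4) an abnormal path whose links all look normal may hide a jittered link:
    #    identify every link on such a path with the gray (jitter) state
    for path in abnormal_paths:
        if all(state[link] == "normal" for link in path):
            for link in path:
                state[link] = "gray"
    return state


def diagnose_jitter(gray_history, n=3):
    """A link judged gray for N consecutive measurement cycles is jittered."""
    return len(gray_history) >= n and all(gray_history[-n:])
```

In the sketch, a path that tests abnormal while all of its links lie on normal paths ends up gray, matching the jitter case in the text.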
In some embodiments, the path performance includes bandwidth when the state of the host under test is busy. Based on this, S331 includes:
(1) Comparing the bandwidths of the respective target paths.
(2) Determining an abnormal path among the target paths based on the comparison result of the bandwidths.
(3) Querying the state of each link in the abnormal path, and taking a link whose state is uncertain as a candidate abnormal link.
(4) Counting, for each candidate abnormal link, the number of network cards that have a connection relationship with it, and determining the candidate abnormal link with the largest count as the abnormal link.
When the state of the host to be tested is busy, service traffic may exist on some network cards, and the loopback bandwidth measured by those network cards is obviously reduced by the service traffic. On this basis, the path state is judged by comparison, which ensures the accuracy of the determined abnormal link. Specifically, the bandwidths of the target paths are compared with one another to determine the abnormal path. For example, the abnormal path is inferred using relative values: when the bandwidth measured by a network card on a certain target path is obviously lower than the bandwidths of the other target paths, that path is judged to be an abnormal path.
Meanwhile, since a network card carrying service traffic cannot judge whether a path is normal, some links may be wrongly marked as abnormal, which reduces the accuracy of abnormal-link inference. In this case, the state of each link in the abnormal path is queried, links whose state is uncertain are taken as candidate abnormal links, the number of network cards having a connection relationship with each candidate abnormal link is counted, and the candidate abnormal link with the largest count is determined as the abnormal link; that is, the link with the highest count is the most likely bottleneck link and should receive more attention. For example, in the process of traversing abnormal paths, abnormal path 1 (from network card 1 to an endpoint) is traversed first; this path passes through link 1 (which is in an uncertain state), so link 1 is marked as abnormal and its count value abnormal_cnt is changed from 0 to 1. If abnormal path 2 (from network card 1 to another endpoint) also passes through link 1, the count value abnormal_cnt of link 1 remains unchanged, because that abnormal path was marked by the same network card 1. If abnormal path 3 (from network card 2 to some endpoint) also passes through link 1, the count value abnormal_cnt of link 1 is increased by 1, because this abnormal path was marked by a new network card. That is, the count value abnormal_cnt of an abnormal link is obtained by counting the number of different network cards connected to it through abnormal paths. The function of abnormal_cnt is to infer the abnormal link when inference is triggered by an abnormal index on a network card: the amount of information obtained by the loopback test is reduced in this situation, more paths may be marked as abnormal, and a link with a higher abnormal_cnt is then more likely to be the abnormal link.
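The abnormal_cnt bookkeeping in the example above amounts to counting, per candidate link, the distinct network cards whose abnormal paths traverse it. A sketch, assuming illustrative path and link identifiers:

```python
def rank_candidate_links(abnormal_paths, link_state):
    """abnormal_paths: list of (nic, [links]) pairs observed in the busy state.
    link_state: current {link: state} map; only 'uncertain' links are candidates.
    Returns (link, abnormal_cnt) pairs, most suspect link first."""
    seen_nics = {}  # link -> set of network cards whose abnormal paths cross it
    for nic, links in abnormal_paths:
        for link in links:
            if link_state.get(link) == "uncertain":
                seen_nics.setdefault(link, set()).add(nic)
    # abnormal_cnt of a link = number of *different* network cards, so a
    # second abnormal path from the same NIC leaves the count unchanged
    ranked = sorted(seen_nics.items(), key=lambda kv: len(kv[1]), reverse=True)
    return [(link, len(nics)) for link, nics in ranked]
```

Replaying the example from the text, two abnormal paths from network card 1 plus one from network card 2 through link 1 give link 1 an abnormal_cnt of 2, not 3.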
S332, determining the root cause of the performance bottleneck in the host to be tested based on the information of the abnormal link.
Wherein the information of the abnormal link comprises the position or the utilization rate of the abnormal link.
After the abnormal link is obtained, the root cause of the performance bottleneck in the host to be tested is analyzed based on the information of the abnormal link. When the host to be tested is in an idle state, the root cause of the bottleneck is usually a link failure or an abnormal configuration; when the host to be tested is in a busy state, the root cause of the bottleneck is usually an overload. Accordingly, when the host to be tested is idle, the bottleneck in the host is monitored periodically; when the host to be tested is busy, root cause positioning of the bottleneck in the host is performed only when an abnormal index appears on a network card.
When the state of the host under test is idle, the step S332 includes:
(1) Determining the position of the abnormal link by using the path topology structure.
(2) When abnormal links exist on the paths from the network card to a second target endpoint, determining that the root cause of the performance bottleneck in the host to be tested is a link failure, wherein the second target endpoint is all memory nodes of the host to be tested and the graphics processor that is under the same central processing unit root node as the network card.
The path topology structure characterizes the connections of the paths in the host to be tested, and the abnormal link has been determined by the above processing, so the position of the abnormal link, i.e. the nodes to which it is connected, can be determined. If the paths from the network card to all second target endpoints are in an abnormal state, the root cause of the bottleneck is determined to be a failure of the bus link of the network card.
When the host to be tested is idle, the root cause judgment is carried out using the position of the abnormal link, which ensures the accuracy of the determined root cause.
In some embodiments, step (2) of S332 above includes:
2.1) When abnormal links exist on the paths from the network card to the target endpoint, acquiring the delay of the path to be measured from the network card to the graphics processor that is under the same root node as the network card.
2.2) When the delay is abnormal, determining that the root cause of the performance bottleneck caused by the path to be measured is an abnormal configuration, and determining that the root cause of the performance bottleneck caused by other paths is a link failure, wherein the other paths are the paths between the network card and the target endpoint other than the path to be measured.
When abnormal links exist on the paths from the network card to the target endpoint, if the state of the bus link connected with the GPU is abnormal, the delay from the network card to the GPU under the same root node is further checked; if the delay is likewise abnormal, the root cause of the bottleneck is determined to be an abnormal configuration, while for the other abnormal links the root cause is determined to be a link failure.
In the case where an abnormal link exists, the judgment thus needs to be made again in combination with the delay, further distinguishing whether the root cause is an abnormal configuration or a link failure, which improves the reliability of the determined root cause.
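The idle-state decision in 2.1)–2.2) reduces to a small rule: if the paths from a network card to all second target endpoints are abnormal, suspect the network card's own bus link; if the abnormal path reaches the GPU under the same root node, use the measured NIC-to-GPU delay to separate a misconfiguration from a hardware fault. A sketch with illustrative endpoint names and return values:

```python
def idle_root_cause(abnormal_endpoints, all_endpoints, gpu_delay_abnormal):
    """abnormal_endpoints: second target endpoints whose path from the NIC is abnormal.
    all_endpoints: all second target endpoints (memory nodes + same-root-node GPU).
    gpu_delay_abnormal: whether the measured NIC->GPU delay is abnormal.
    Returns {path_or_scope: cause}."""
    if set(abnormal_endpoints) == set(all_endpoints):
        # every path from this NIC is abnormal -> the NIC's own bus link failed
        return {"nic_bus_link": "link failure"}
    verdict = {}
    for ep in abnormal_endpoints:
        if ep == "gpu" and gpu_delay_abnormal:
            # an abnormal delay on the NIC->GPU path points at configuration
            verdict[ep] = "abnormal configuration"
        else:
            verdict[ep] = "link failure"
    return verdict
```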
When the state of the host to be tested is busy, root cause positioning is performed when an abnormal index appears on a network card, and the network card on which the abnormal index appears is defined as the abnormal network card. Based on this, S332 includes:
(1) Acquiring the utilization rate of the abnormal link.
(2) When the utilization rate is higher than a first utilization rate threshold, determining that the root cause of the performance bottleneck in the host to be tested is bus overload.
(3) When the utilization rate is lower than a second utilization rate threshold, determining that the root cause of the performance bottleneck in the host to be tested is a link failure, wherein the second utilization rate threshold is smaller than the first utilization rate threshold.
The utilization rate of a link is obtained by the bus monitoring module, which is responsible for monitoring the utilization rates of the bus links in the host to be tested, including the PCIe link connected with the RDMA network card, the PCIe link connected with the GPU, the CPU root port, the cross-CPU-socket bus, the memory channels, and the like. When a link is determined to be abnormal, whether the root cause of the problem is a link failure or traffic contention can be determined by monitoring the utilization rate of that link.
When the abnormal index on the abnormal network card is triggered, the abnormal link is usually in an overloaded state. Accordingly, the root cause of the problem is located with the bus monitoring module: when the utilization rate of the abnormal link is high, the root cause of the bottleneck is determined to be bus overload; when the utilization rate of the abnormal link is low, the root cause of the bottleneck is determined to be a link failure. "High" means higher than the first utilization rate threshold, "low" means lower than the second utilization rate threshold, and the specific values of the two thresholds are set according to actual needs and are not limited herein.
When an abnormal network card exists, the abnormal link is usually in an overloaded state. By locating the root cause of the problem with the utilization rate of the abnormal link in this way, the accuracy of root cause positioning is ensured.
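The busy-state classification follows directly from the two utilization rate thresholds. In this sketch the values 0.9 and 0.1 are placeholders; the patent leaves the thresholds to actual needs, and the middle band is treated as undetermined since the text does not assign it a cause.

```python
def busy_root_cause(link_utilization, high=0.9, low=0.1):
    """Classify the root cause for an abnormal link on a busy host from the
    utilization reported by the bus monitoring module. `high` and `low` are
    the first and second utilization rate thresholds (low < high)."""
    if link_utilization > high:
        return "bus overload"      # traffic contention has saturated the link
    if link_utilization < low:
        return "link failure"      # link is broken rather than contended
    return "undetermined"          # between the thresholds: keep observing
```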
According to the root cause positioning method of the performance bottleneck in the host described above, the abnormal link in the target path is screened out according to the path performance, and root cause positioning is carried out using the information of the abnormal link, which ensures the accuracy of the determined root cause.
In this embodiment, a root cause locating device of a performance bottleneck in a host is also provided. The device is used to implement the foregoing embodiments and preferred embodiments, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
The present embodiment provides a root cause locating device of a performance bottleneck in a host, as shown in fig. 4, including:
an obtaining module 41, configured to obtain a path topology structure in a host to be tested;
a test module 42, configured to perform a loopback test on a target path in the host to be tested based on the path topology structure, and determine a path performance of the target path, where the target path is a path between a network card in the host to be tested and an endpoint in the host to be tested;
an analysis module 43, configured to perform path state analysis on the target path based on the path performance, and determine a root cause of a performance bottleneck in the host to be tested.
In some embodiments, test module 42 includes:
a registration unit, configured to register a first memory space and a second memory space in a target endpoint of the host to be tested, where the target endpoint is an endpoint of the target path;
the instruction issuing unit is used for issuing a first data reading instruction to the network card so as to read first data with a first length from the first memory space to the cache space of the network card and write the read first data into the second memory space;
the recording unit is used for recording first time for reading the first data from the first memory space and second time for storing the first data into the second memory space by the network card;
a second determining unit, configured to determine a delay of the target path based on a first time difference between the second time and the first time to determine a path performance of the target path.
In some embodiments, the determining unit comprises:
the instruction issuing subunit is configured to issue a second data read instruction to the network card, so as to read second data of a second length from the first memory space to a cache space of the network card and write the read second data into the second memory space, where the second length is greater than the first length;
the recording subunit is configured to record a third time when the network card reads the second data from the first memory space, and a fourth time when the network card stores the second data into the second memory space;
and the first determining subunit is configured to calculate a second time difference between the fourth time and the third time, and determine a difference between the second time difference and the first time difference as the bandwidth of the target path.
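The loopback measurement performed by the registration, instruction-issuing, recording and determining units can be sketched as follows. The timing model is taken from the text: the delay of a path is the time difference of a short transfer, and the bandwidth follows from the extra time needed to move a longer payload over the same path (the claim phrases this as the difference between the second and first time differences; the sketch makes the implied division by the extra payload length explicit). The `transfer` primitive is an illustrative stand-in for the NIC's read/write of the registered memory spaces, not an actual driver API.

```python
def loopback_measure(transfer, first_len, second_len):
    """Two-phase loopback test of one target path.

    transfer(nbytes) stands in for issuing a read instruction to the network
    card: it reads nbytes from the first memory space into the NIC cache,
    writes them to the second memory space, and returns the (start, end)
    timestamps taken by the recording unit. Requires second_len > first_len.
    """
    t1, t2 = transfer(first_len)        # first time / second time
    t3, t4 = transfer(second_len)       # third time / fourth time
    delay = t2 - t1                     # first time difference = path delay
    extra_time = (t4 - t3) - (t2 - t1)  # second minus first time difference
    # The extra (second_len - first_len) bytes crossed the path in
    # extra_time, so the achievable bandwidth is their ratio.
    bandwidth = (second_len - first_len) / extra_time
    return delay, bandwidth
```

With a hypothetical path of fixed latency and 1 GB/s capacity, the fixed latency cancels out of the bandwidth estimate, which is the point of using two transfer lengths.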
In some embodiments, test module 42 further includes:
the acquisition unit is used for acquiring the state of the host to be tested, wherein the state comprises idle or busy;
a second determining unit, configured to determine the target path based on a state of the host to be tested;
and the test unit is used for performing loopback test on the target path and determining the path performance of the target path.
In some embodiments, the second determination unit comprises:
the first determining subunit is configured to determine, when the state of the host to be tested is idle, that the target path is a path from all network cards to all endpoints in the host to be tested;
and the second determining subunit is used for determining that the target path is a path from the abnormal network card to a first target endpoint in the host to be tested when the state of the host to be tested is busy, wherein the first target endpoint comprises all memory nodes of the host to be tested and a graphic processor which is under the same central processing unit root node as the abnormal network card.
In some embodiments, the analysis module 43 includes:
a third determining unit, configured to determine, based on the size of the path performance, an abnormal link in the target path;
a fourth determining unit, configured to determine a root cause of a performance bottleneck in the host to be tested based on the information of the abnormal link, where the information of the abnormal link includes a position or a utilization rate of the abnormal link.
In some embodiments, the third determination unit comprises:
a third determining subunit, configured to determine, based on the size of the path performance, state information of the target path, where the state information includes a normal path or an abnormal path;
the first inquiry subunit is configured to inquire the state of each link in the abnormal path, and use a link whose link state is uncertain as the abnormal link;
and the identification subunit is used for identifying the state of the abnormal path of which the states of all the links are normal as a jitter state when the abnormal path of which the states of all the links are normal exists.
In some embodiments, when the state of the host under test is idle, the fourth determining unit includes:
a fourth determining subunit, configured to determine a location of the abnormal link by using the path topology;
a fifth determining subunit, configured to determine that a root cause of a performance bottleneck in the host to be tested is a link fault when the abnormal link exists on a path between the network card and a second target endpoint, where the second target endpoint is all memory nodes of the host to be tested and a graphics processor that is located under the same root node of the central processing unit as the network card.
In some embodiments, the fifth determining subunit comprises:
the first obtaining subunit is configured to obtain, when the abnormal links exist on the paths from the network card to the target endpoint, a delay of a path to be measured from the network card to the graphics processor that is under the same root node as the network card;
a sixth determining subunit, configured to determine, when the delay is abnormal, that a root cause of the performance bottleneck caused by the path to be tested is an abnormal configuration, and determine that root causes of other paths causing the performance bottleneck are a link failure, where the other paths are paths other than the path to be tested, in a path from the network card to a target endpoint.
In some embodiments, when the status of the host under test is busy, the path performance includes a bandwidth, and the third determining unit includes:
the comparison subunit is used for comparing the bandwidth of each target path;
a seventh determining subunit, configured to determine, based on a comparison result of the bandwidths, an abnormal path in the target paths;
the second inquiry subunit is used for inquiring the state of each link in the abnormal path and taking a link whose state is uncertain as a candidate abnormal link;
and the counting subunit is used for counting the number of the network cards which have connection relations with the candidate abnormal links, and determining the candidate abnormal link with the largest number as the abnormal link.
In some embodiments, the fourth determination unit comprises:
the second obtaining subunit is configured to obtain a utilization rate of the abnormal link;
an eighth determining subunit, configured to determine that a root cause of a performance bottleneck in the host to be tested is bus overload when the utilization is higher than the first utilization threshold;
a ninth determining subunit, configured to determine that a root cause of a performance bottleneck in the host to be tested is a link failure when the utilization is lower than a second utilization threshold, where the second utilization threshold is smaller than the first utilization threshold.
The root cause locating device of the performance bottleneck in the host in this embodiment is presented in the form of functional units, where a unit may be an ASIC circuit, a processor and memory executing one or more pieces of software or firmware, and/or other devices that can provide the above-described functionality.
Further functional descriptions of the modules are the same as those of the corresponding embodiments, and are not repeated herein.
The embodiment of the present disclosure further provides an electronic device, which has the root cause positioning device of the performance bottleneck in the host shown in fig. 4.
Referring to fig. 5, fig. 5 is a schematic structural diagram of an electronic device according to an alternative embodiment of the present disclosure. As shown in fig. 5, the electronic device may include: at least one processor 51, such as a CPU (Central Processing Unit), at least one communication interface 53, a memory 54, and at least one communication bus 52, wherein the communication bus 52 is used to enable connection and communication between these components. The communication interface 53 may include a display and a keyboard, and the optional communication interface 53 may also include a standard wired interface and a standard wireless interface. The memory 54 may be a high-speed RAM (volatile random access memory) or a non-volatile memory, such as at least one disk memory. The memory 54 may alternatively be at least one storage device located remotely from the processor 51. The processor 51 may be connected with the apparatus described in fig. 4; the memory 54 stores an application program, and the processor 51 calls the program code stored in the memory 54 to perform any of the above-mentioned method steps.
The communication bus 52 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus. The communication bus 52 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 5, but this is not intended to represent only one bus or type of bus.
The memory 54 may include a volatile memory, such as a random-access memory (RAM); the memory may also include a non-volatile memory, such as a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); the memory 54 may also comprise a combination of the above types of memories.
The processor 51 may be a Central Processing Unit (CPU), a Network Processor (NP), or a combination of a CPU and an NP.
The processor 51 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a Programmable Logic Device (PLD), or a combination thereof. The PLD may be a Complex Programmable Logic Device (CPLD), a field-programmable gate array (FPGA), a General Array Logic (GAL), or any combination thereof.
Optionally, the memory 54 is also used to store program instructions. The processor 51 may invoke program instructions to implement a root cause location method of a performance bottleneck in a host as shown in any of the embodiments of the present application.
The embodiments of the present disclosure also provide a non-transitory computer storage medium, where the computer storage medium stores computer-executable instructions, and the computer-executable instructions can execute the root cause positioning method of the performance bottleneck in the host in any of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), a solid-state drive (SSD), or the like; the storage medium may also comprise a combination of memories of the kinds described above.
The embodiment of the present disclosure further provides a root cause positioning system of a performance bottleneck in a data center, as shown in fig. 6, including: a host 10 and a root cause positioning device 20, wherein the root cause positioning device 20 is configured to locate the root cause of a performance bottleneck in the data center. It should be noted that the root cause positioning device 20 may be a host 10 of the data center, or may be an additionally provided electronic device; the form of the root cause positioning device 20 is not limited in this respect. A plurality of hosts 10 are provided in the data center, and each host 10 serves as a processing node in the data center.
For root cause positioning of performance bottleneck in the data center, the root cause positioning may be triggered at regular time or triggered according to requirements, and the triggering time is not limited at all. For example, when a host in a data center is idle, it may be a timing trigger; if the host in the data center is busy, the method can be triggered only when the network card in the host is abnormal.
The root cause positioning device 20 first determines, using the above-described root cause positioning method for the performance bottleneck in the host, whether a performance bottleneck in the data center occurs within a host or in the network. If it is determined through the above method that no performance bottleneck exists in the hosts, the performance bottleneck of the data center is determined to be a network bottleneck.
The root cause positioning system for the performance bottleneck in the data center provided by this embodiment deploys the above root cause positioning method in the data center. It can quickly locate whether a performance bottleneck of the data center occurs in a computing node or in the network, can locate the root cause of a known in-host performance bottleneck at minute-level granularity, and can assist operation and maintenance personnel in discovering various in-host performance bottlenecks that were not found before deployment in the data center.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the method embodiments, since they are substantially similar to the apparatus and system embodiments, the description is simple, and reference may be made to the partial description of the apparatus and system embodiments for relevant points.
It is to be understood that in the embodiments of the present disclosure, where the collection of data within a host is involved, when the above embodiments of the present disclosure are applied to specific products or technologies, user permission or consent needs to be obtained, and the collection, use and handling of relevant data needs to comply with relevant laws and regulations and standards in relevant countries and regions.
Although the embodiments of the present disclosure have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the present disclosure, and such modifications and variations fall within the scope defined by the appended claims.

Claims (15)

1. A method for root cause location of a performance bottleneck in a host, comprising:
acquiring a path topological structure in a host to be tested;
performing loopback test on a target path in the host to be tested based on the path topological structure, and determining the path performance of the target path, wherein the target path is a path between a network card in the host to be tested and an endpoint in the host to be tested;
and analyzing the path state of the target path based on the path performance, and determining the root cause of the performance bottleneck in the host to be tested.
2. The method according to claim 1, wherein the performing a loopback test on a target path in the host under test based on the path topology to determine the path performance of the target path comprises:
registering a first memory space and a second memory space in a target endpoint of the host to be tested, wherein the target endpoint is an endpoint of the target path;
issuing a first data reading instruction to the network card so as to read first data with a first length from the first memory space to a cache space of the network card and write the read first data into the second memory space;
recording a first time for reading the first data from the first memory space and a second time for storing the first data into the second memory space by the network card;
determining a delay of the target path based on a first time difference between the second time and the first time to determine a path performance of the target path.
3. The method of claim 2, wherein determining the delay of the target path based on a first time difference between the second time and the first time to determine the path performance of the target path comprises:
issuing a second data reading instruction to the network card so as to read second data with a second length from the first memory space to a cache space of the network card and write the read second data into the second memory space, wherein the second length is greater than the first length;
recording a third time when the network card reads the second data from the first memory space and a fourth time when the network card stores the second data into the second memory space;
and calculating a second time difference between the fourth time and the third time, and determining a difference value between the second time difference and the first time difference as the bandwidth of the target path.
4. The method according to claim 1, wherein the performing a loopback test on a target path in the host under test based on the path topology structure to determine a path performance of the target path further comprises:
acquiring the state of the host to be tested, wherein the state comprises idle or busy;
determining the target path based on the state of the host to be tested;
and performing loopback test on the target path to determine the path performance of the target path.
5. The method of claim 4, wherein determining the target path based on the state of the host under test comprises:
when the state of the host to be tested is idle, determining that the target path is a path from all network cards to all end points in the host to be tested;
and when the state of the host to be tested is busy, determining that the target path is a path from an abnormal network card to a first target endpoint in the host to be tested, wherein the first target endpoint comprises all memory nodes of the host to be tested and a graphic processor which is under the same central processing unit root node as the abnormal network card.
6. The method of claim 1, wherein the performing a path state analysis on the target path based on the path performance to determine a root cause of a performance bottleneck in the host under test comprises:
determining an abnormal link in the target path based on the size of the path performance;
and determining the root cause of the performance bottleneck in the host to be tested based on the information of the abnormal link, wherein the information of the abnormal link comprises the position or the utilization rate of the abnormal link.
7. The method of claim 6, wherein the determining the abnormal link in the target path based on the magnitude of the path performance comprises:
determining state information of the target path based on the size of the path performance, wherein the state information comprises a normal path or an abnormal path;
inquiring the state of each link in the abnormal path, and taking the link with the uncertain link state as the abnormal link;
and when an abnormal path with all the link states being normal exists, marking the abnormal path with all the link states being normal as a jitter state.
8. The method according to claim 6, wherein when the status of the host under test is idle, the determining a root cause of a performance bottleneck in the host under test based on the information of the abnormal link comprises:
determining the position of the abnormal link by using the path topological structure;
and when the abnormal links exist on the paths from the network card to a second target endpoint, determining that the root cause of the performance bottleneck in the host to be tested is a link fault, wherein the second target endpoint is all memory nodes of the host to be tested and a graphic processor under the same central processing unit root node as the network card.
9. The method according to claim 8, wherein when the abnormal link exists on the path from the network card to the second target endpoint, determining that the root cause of the performance bottleneck in the host to be tested is a link failure comprises:
when abnormal links exist on the paths from the network card to the second target endpoint, acquiring the delay of a path to be measured from the network card to the graphics processor under the same root node as the network card;
and when the delay is abnormal, determining that the root cause of the performance bottleneck on the path to be measured is a configuration anomaly, and determining that the root cause of the performance bottleneck on the other paths is a link fault, wherein the other paths are the paths between the network card and the second target endpoint other than the path to be measured.
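Claim 9, as translated, disambiguates the idle-host case: an abnormal delay on the NIC-to-GPU path under the same root node indicates a configuration anomaly, while the remaining abnormal paths are attributed to link faults. A sketch of that decision, with the threshold and identifiers assumed for illustration:

```python
# Hypothetical sketch of claim-9 disambiguation: abnormal delay on the
# same-root NIC->GPU path points at configuration; every other abnormal
# path is attributed to a link fault.
DELAY_THRESHOLD_US = 5.0  # assumed bound above which delay is "abnormal"

def classify_root_cause(abnormal_paths, nic_to_gpu_path, gpu_path_delay_us):
    causes = {}
    delay_abnormal = gpu_path_delay_us > DELAY_THRESHOLD_US
    for path in abnormal_paths:
        if path == nic_to_gpu_path and delay_abnormal:
            causes[path] = "configuration anomaly"
        else:
            causes[path] = "link fault"
    return causes
```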
10. The method as claimed in claim 6, wherein when the status of the host to be tested is busy, the path performance comprises a bandwidth, and the determining the abnormal link in the target path based on the value of the path performance comprises:
comparing the bandwidth of each target path;
determining an abnormal path among the target paths based on the comparison result of the bandwidths;
querying the state of each link in the abnormal path, and taking a link whose state is indeterminate as a candidate abnormal link;
and counting the number of network cards having a connection relationship with each candidate abnormal link, and determining the candidate abnormal link with the largest count as the abnormal link.
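The busy-host case of claim 10, as translated, resolves between candidate abnormal links by a counting step: the candidate traversed by the most network cards is chosen, on the reasoning that a shared faulty link degrades every NIC whose paths cross it. A sketch of that voting step; the data shapes are illustrative assumptions:

```python
# Hypothetical sketch of the claim-10 counting step: among candidate
# abnormal links, pick the one connected to the largest number of NICs.
from collections import Counter

def pick_abnormal_link(candidates, nic_links):
    """candidates: iterable of candidate link ids
    nic_links: {nic_id: set of link ids that NIC's paths traverse}"""
    votes = Counter()
    for link in candidates:
        votes[link] = sum(1 for links in nic_links.values() if link in links)
    # the candidate with the maximum NIC count is taken as the abnormal link
    return max(votes, key=votes.get) if votes else None
```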
11. The method of claim 10, wherein the determining a root cause of a performance bottleneck in the host under test based on the information of the abnormal link comprises:
acquiring the utilization rate of the abnormal link;
when the utilization rate is higher than a first utilization rate threshold value, determining that the root cause of the performance bottleneck in the host to be tested is bus overload;
and when the utilization rate is lower than a second utilization rate threshold value, determining that the root cause of the performance bottleneck in the host to be tested is a link fault, wherein the second utilization rate threshold value is smaller than the first utilization rate threshold value.
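Claim 11, as translated, separates the two busy-host verdicts by the abnormal link's utilization: above the first threshold the bus is overloaded; below the (smaller) second threshold the link itself is at fault. A minimal sketch, with both threshold values assumed as placeholders:

```python
# Hypothetical sketch of the claim-11 decision: high utilization on the
# abnormal link -> bus overload; very low utilization despite degraded
# bandwidth -> link fault. Threshold values are illustrative only.
HIGH_UTIL = 0.9  # first utilization threshold (assumed)
LOW_UTIL = 0.1   # second utilization threshold, smaller than the first

def diagnose(link_utilization):
    if link_utilization > HIGH_UTIL:
        return "bus overload"
    if link_utilization < LOW_UTIL:
        return "link fault"
    return "undetermined"  # between thresholds: no verdict in this sketch
```

The claim does not state what happens between the two thresholds; the sketch returns "undetermined" there as an assumption.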
12. A root cause locating device for a performance bottleneck in a host, comprising:
the acquisition module is used for acquiring a path topological structure in the host to be tested;
the test module is used for performing a loopback test on a target path in the host to be tested based on the path topological structure, and determining the path performance of the target path, wherein the target path is a path between a network card in the host to be tested and an endpoint in the host to be tested;
and the analysis module is used for performing path state analysis on the target path based on the path performance, and determining the root cause of the performance bottleneck in the host to be tested.
13. An electronic device, comprising:
a memory and a processor, the memory and the processor being communicatively coupled to each other, the memory having stored therein computer instructions, the processor executing the computer instructions to perform the method for root cause localization of performance bottlenecks in hosts according to any one of claims 1-11.
14. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method for root cause localization of performance bottlenecks in hosts of any of claims 1-11.
15. A root cause localization system for a performance bottleneck in a data center, comprising:
a host;
a root cause locating device that determines the root cause of the performance bottleneck of each host by performing the root cause locating method for a performance bottleneck in a host according to any one of claims 1-11, wherein the performance bottleneck of the data center is determined to be a network bottleneck when no host is the bottleneck.
CN202211500602.4A 2022-11-28 2022-11-28 Root cause positioning method of performance bottleneck in host, electronic equipment and storage medium Pending CN115865624A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211500602.4A CN115865624A (en) 2022-11-28 2022-11-28 Root cause positioning method of performance bottleneck in host, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211500602.4A CN115865624A (en) 2022-11-28 2022-11-28 Root cause positioning method of performance bottleneck in host, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115865624A true CN115865624A (en) 2023-03-28

Family

ID=85667158

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211500602.4A Pending CN115865624A (en) 2022-11-28 2022-11-28 Root cause positioning method of performance bottleneck in host, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115865624A (en)

Similar Documents

Publication Publication Date Title
EP2903213B1 (en) Throughput test method and apparatus
CN102404170B (en) Detection method, device and system of message loss
CN108092854B (en) Test method and device for train-level Ethernet equipment based on IEC61375 protocol
CN115174432B (en) RDMA network state monitoring method, device, equipment and readable storage medium
CN108683528B (en) Data transmission method, central server, server and data transmission system
CN111294251B (en) Method, device, equipment and medium for detecting link time delay
US9612934B2 (en) Network processor with distributed trace buffers
CN116627877B (en) On-chip bus state recording system and method
CN112688837B (en) Network measurement method and device based on time sliding window
CN105791008A (en) Method and device for determining packet loss location and reason
CN106155826B (en) For the method and system of mistake to be detected and handled in bus structures
CA3103276A1 (en) Application-aware links
Liu et al. Hostping: Diagnosing intra-host network bottlenecks in RDMA servers
CN115865624A (en) Root cause positioning method of performance bottleneck in host, electronic equipment and storage medium
CN112148537B (en) Bus monitoring device and method, storage medium and electronic device
CN109120479B (en) Network throughput testing method based on network message format
CN110896368A (en) Network quality monitoring method and device
CN115766526A (en) Test method and device for switch physical layer chip and electronic equipment
CN114095398A (en) Method and device for determining detection time delay, electronic equipment and storage medium
US9900207B2 (en) Network control protocol
Rubio-Loyola et al. Maximizing packet loss monitoring accuracy for reliable trace collections
CN112583658A (en) Available bandwidth measuring method, storage medium and equipment
CN115348157B (en) Fault positioning method, device and equipment of distributed storage cluster and storage medium
CN113923139B (en) Method and device for evaluating reliability of train control data communication system
CN117155837A (en) Switch chip testing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination