CN114024858A

CN114024858A - Task execution method, device, equipment and storage medium

Info

Publication number: CN114024858A
Application number: CN202111293062.2A
Authority: CN
Inventors: 常韬; 孙鹏; 黎世勇
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2021-11-03
Filing date: 2021-11-03
Publication date: 2022-02-08
Anticipated expiration: 2041-11-03

Abstract

The disclosure provides a task execution method, a task execution device, task execution equipment and a storage medium, and relates to the field of computer technology, in particular to the field of cloud computing. The specific implementation scheme is as follows: obtaining a first physical topology of each device in a distributed computing system, wherein the first physical topology is the topology among the computing units in the device; obtaining a second physical topology among the devices in the distributed computing system; generating a total physical topology among the computing units in the distributed computing system according to the first physical topology and the second physical topology; obtaining a communication topology among the task execution units corresponding to an application program; and performing topology mapping on the total physical topology and the communication topology, and allocating computing resources to the task execution units corresponding to the application program based on the mapping result, so that each task execution unit executes its task based on the allocated resources. By applying the scheme provided by the embodiments of the disclosure, computing resources in the distributed computing system can be provided for the application program.

Description

Task execution method, device, equipment and storage medium

Technical Field

The present disclosure relates to the field of computer technology, and more particularly, to the field of cloud computing technology.

Background

Because distributed computing systems can provide rich computing resources, more and more large-scale applications are deployed in distributed computing systems, performing various tasks of the applications based on the devices in the distributed computing system.

In order to ensure that each task of the application program is executed smoothly, computing resources need to be allocated for each task.

Disclosure of Invention

The disclosure provides a task execution method, a task execution device, a task execution equipment and a storage medium.

According to an aspect of the present disclosure, there is provided a task execution method including:

obtaining a first physical topology of each device in a distributed computing system, wherein the first physical topology is the topology among the computing units in the device;

obtaining a second physical topology among devices in the distributed computing system;

generating a total physical topology among the computing units in the distributed computing system according to the first physical topology and the second physical topology;

acquiring communication topology between task execution units corresponding to the application program;

and performing topology mapping on the total physical topology and the communication topology, and allocating computing resources to the task execution units corresponding to the application program based on the mapping result, so that each task execution unit executes its task based on the allocated resources.

According to another aspect of the present disclosure, there is provided a task execution apparatus including:

a first physical topology obtaining module, configured to obtain a first physical topology of each device in a distributed computing system, wherein the first physical topology is the topology among the computing units in the device;

a second physical topology obtaining module, configured to obtain a second physical topology among the devices in the distributed computing system;

a total physical topology obtaining module, configured to generate a total physical topology among the computing units in the distributed computing system according to the first physical topology and the second physical topology;

a communication topology obtaining module, configured to obtain a communication topology among the task execution units corresponding to the application program;

and a topology mapping module, configured to perform topology mapping on the total physical topology and the communication topology, and to allocate computing resources to the task execution units corresponding to the application program based on the mapping result, so that each task execution unit executes its task based on the allocated resources.

According to another aspect of the present disclosure, there is provided an electronic device including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the above task execution method.

According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the above task execution method.

According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the above task execution method.

As can be seen from the above, in the solution provided by the embodiments of the present disclosure, the first physical topology is the topology among the computing units within each device in the distributed computing system, and the second physical topology is the topology among the devices in the distributed computing system. The total physical topology generated from the two can therefore reflect not only the topology among the computing units within each device, but also the topology among computing units in different devices. The communication topology reflects the topology among the task execution units corresponding to the application program. On this basis, when the total physical topology and the communication topology are mapped, the mapping can range over the computing units of every device in the whole distributed computing system, so that all of these computing units are taken into account when computing resources are allocated to each task execution unit. The solution provided by the embodiments of the present disclosure can thus not only allocate computing resources to each task execution unit, but also optimize the allocated computing resources, effectively ensuring that each task execution unit can execute its task smoothly.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1a is a schematic flowchart of a first task execution method according to an embodiment of the present disclosure;

FIG. 1b is a schematic diagram of a first physical topology provided by an embodiment of the present disclosure;

FIG. 1c is a schematic diagram of the total physical topology provided by an embodiment of the present disclosure;

FIG. 1d is a schematic diagram of a communication topology provided by an embodiment of the present disclosure;

FIG. 2 is a schematic flowchart of a second task execution method according to an embodiment of the present disclosure;

FIG. 3 is a schematic flowchart of a third task execution method according to an embodiment of the present disclosure;

FIG. 4 is a schematic flowchart of a fourth task execution method according to an embodiment of the present disclosure;

FIG. 5 is a flowchart illustrating a task execution method according to an embodiment of the present disclosure;

FIG. 6 is a schematic structural diagram of a first task execution device according to an embodiment of the present disclosure;

FIG. 7 is a schematic structural diagram of a second task execution device according to an embodiment of the present disclosure;

FIG. 8 is a schematic structural diagram of a third task execution device according to an embodiment of the present disclosure;

FIG. 9 is a schematic structural diagram of a fourth task execution device according to an embodiment of the present disclosure;

FIG. 10 is a block diagram of an electronic device used to implement a task execution method of an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

An application scenario of the scheme provided in the embodiment of the present disclosure is described below.

Applications are becoming increasingly rich and the functions they provide increasingly precise; at the same time, applications require more and more computing resources at run time. Because a single device can provide only limited computing resources, distributed computing systems are increasingly adopted in practice to keep applications running normally and efficiently, so that multiple devices in the system jointly provide computing resources for an application.

Take, for example, an application that implements deep learning model training. When the application runs to train a deep learning model, a large amount of sample data is needed, and deep learning models also have more and more layers. Taken together, these factors mean that running the application requires substantial computing resources, so each device in the distributed computing system can provide computing resources for the application to ensure that it executes smoothly. For example, the devices in the distributed computing system provide computing resources for the tasks corresponding to different layers of the deep learning model, so that those tasks can not only be executed smoothly but can also be executed in parallel on the allocated resources when they meet the conditions for parallel execution.

The above scenario for deep learning model training is only one application scenario of the scheme provided by the embodiment of the present disclosure, and the scheme provided by the embodiment of the present disclosure may also be applied to other scenarios requiring a large amount of computing resources.

The following specifically describes a task execution method provided by the embodiments of the present disclosure.

Referring to fig. 1a, fig. 1a is a schematic flowchart of a first task execution method provided by an embodiment of the present disclosure, where the method includes the following steps S101 to S105.

Step S101: a first physical topology of devices in a distributed computing system is obtained.

The distributed computing system may be a system based on various known distributed frameworks, wherein the framework adopted by the distributed system may be determined according to application scenarios, design requirements and other factors. The distributed system comprises a plurality of devices, each device can contain at least one type of computing unit, and one device can comprise a plurality of computing units of the same type, and the computing units can execute data processing indicated by tasks.

For example, the computing unit may be a core in a device such as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a TPU (Tensor Processing Unit), or an NPU (Neural-network Processing Unit).

In the case that a plurality of computing units exist in one device, communication can be carried out among the computing units in the device, so that the communication condition among the computing units can be represented based on the topology among the computing units in the device. For convenience of description, the topology between the computing units in the device is referred to as a first physical topology in the embodiments of the present disclosure.

Referring to fig. 1b, a schematic diagram of a first physical topology is shown. The squares in the figure represent nodes, each of which represents a computing unit in the device. The connecting lines represent edges between nodes; an edge between two nodes indicates that a communication link exists between the corresponding computing units, so that data communication can be performed between them.

Specifically, the first physical topology of the devices in the distributed computing system may be obtained in the following manner.

In one embodiment, a preset physical topology extraction tool is used for extracting a first physical topology of each device in the distributed computing system.

For example, when the computing units in the device use an NVIDIA architecture, the first physical topology of the computing units in the device may be extracted using tools such as NVML (NVIDIA Management Library), provided by NVIDIA, and hwloc (Portable Hardware Locality).

In another embodiment, for each device in the distributed computing system, the computing units and the communication links between the computing units existing in the device may be detected first, and then the first physical topology may be constructed by using the computing units as nodes and the communication links between the computing units as edges. The specific implementation process can be detailed in steps S201 to S204 in the embodiment shown in fig. 2, which will not be detailed here.

The communication link is a link for data interaction between the computing units.

Step S102: a second physical topology among the devices in the distributed computing system is obtained.

A distributed computing system includes multiple devices, and each two devices may or may not be capable of communicating directly with each other. The second physical topology is a physical topology between the devices in the distributed computing system, and therefore, the physical topology can reflect whether direct communication can be performed between the devices.

In one embodiment, devices and communication links between the devices present in the distributed computing system are detected, and a second physical topology is constructed based on the detected communication links using the devices as nodes.

The communication link is a link for data interaction between devices.

The specific implementation process can be detailed in steps S302-S306 in the following embodiment shown in fig. 3, and will not be detailed here.

Step S103: a total physical topology among the computing units in the distributed computing system is generated according to the first physical topology and the second physical topology.

The total physical topology is the physical topology among all the computing units of each device in the distributed computing system, so that the physical topology can reflect the communication situation among all the computing units in the distributed computing system.

Referring to fig. 1c, a schematic diagram of an overall physical topology is shown. The boxes in the figure represent nodes, each of which represents a computing unit in a device, wherein the four nodes on the left side are computing units in one device, and the four nodes on the right side are computing units in another device. The connecting lines in the graph indicate edges between the nodes, and if there are edges between the nodes, it indicates that there are communication links between the computing units corresponding to the two nodes, and data communication is possible. In addition, it can be seen from fig. 1c that there are communication links between some of the computing units in the two devices.

In one embodiment, whether a communication link exists between each device and the other devices in the distributed computing system may be determined according to the second physical topology, and where a communication link exists between two devices, a communication link may be considered to exist between the computing units of those devices as well. On this basis, starting from the first physical topologies, new edges can be established between the nodes corresponding to the computing units of devices that have a communication link, yielding the total physical topology.

Of course, for two devices that have a communication link, it is also possible that only some of their computing units have communication links with one another. These computing units may be determined according to information such as the types of data processing they can perform and the amount of computing resources they can provide.

For example, assume that device X includes computing units X1, X2, and X3, that device Y includes computing units Y1, Y2, and Y3, and that a communication link exists between device X and device Y. In that case, communication links may exist between X1, X2, and X3 on one side and Y1, Y2, and Y3 on the other, or only between some of X1, X2, and X3 and some of Y1, Y2, and Y3, for example between X1 and Y1 and between X3 and Y3.
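The merging described for step S103 can be sketched as follows. This is only an illustrative sketch, not the implementation of the disclosure: the function name `build_total_topology`, the dictionary layout of the inputs, and the intra-device edges chosen for devices X and Y are all assumptions made for the example. When no restriction is supplied, the sketch links every computing unit of one device to every computing unit of the other, which is one possible reading of the example above.

```python
from itertools import product


def build_total_topology(first_topologies, second_topology, cross_links=None):
    """Merge per-device graphs and add inter-device edges (hypothetical helper).

    first_topologies: {device: {"units": [...], "edges": [(u, v), ...]}}
    second_topology:  iterable of (device_a, device_b) pairs that share a link
    cross_links:      optional {(device_a, device_b): [(unit_a, unit_b), ...]}
                      restricting which units are linked across the two devices
    """
    cross_links = cross_links or {}
    nodes, edges = set(), set()
    # Keep every intra-device node and edge from the first physical topologies.
    for topo in first_topologies.values():
        nodes.update(topo["units"])
        edges.update(frozenset(edge) for edge in topo["edges"])
    # Add new edges between units of devices that are linked in the second topology.
    for dev_a, dev_b in second_topology:
        pairs = cross_links.get((dev_a, dev_b))
        if pairs is None:
            # Assumption: without a restriction, link all unit pairs.
            pairs = product(first_topologies[dev_a]["units"],
                            first_topologies[dev_b]["units"])
        edges.update(frozenset(pair) for pair in pairs)
    return nodes, edges


first = {
    "X": {"units": ["X1", "X2", "X3"], "edges": [("X1", "X2"), ("X2", "X3")]},
    "Y": {"units": ["Y1", "Y2", "Y3"], "edges": [("Y1", "Y2"), ("Y2", "Y3")]},
}
second = [("X", "Y")]

full_nodes, full_edges = build_total_topology(first, second)
partial_nodes, partial_edges = build_total_topology(
    first, second, {("X", "Y"): [("X1", "Y1"), ("X3", "Y3")]})
```

Edges are stored as unordered pairs because the description treats a communication link as a symmetric relation between two computing units.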

Step S104: a communication topology among the task execution units corresponding to the application program is obtained.

In the running process of the application program, a plurality of tasks generally need to be executed, and each task is completed by one task execution unit. Specifically, the task execution unit may include a process created by the application and/or a thread created in the created process.

Specifically, at least 2 threads may be created in each created process.

Referring to fig. 1d, a schematic diagram of a communication topology is shown.

The circles in the figure indicate nodes, each node indicates a task execution unit corresponding to an application, and the nodes within a rectangular frame correspond to a process. When two circles are included in a rectangular box, the nodes corresponding to the two circles represent two threads created in one process. When a circle is included in a rectangular frame, the node corresponding to the circle represents a process. The connecting lines in the graph indicate edges between the nodes, and if an edge exists between the nodes, it indicates that a communication link exists between the processes or threads corresponding to the two nodes, so that data communication can be performed.

As can be seen from the above description, a task execution unit may be not only a process but also a thread created in a process. A task of the application program may therefore be executed either by a process or by a thread created in a process, which widens the range of task execution units and allows more task execution units to participate in running the application program. In addition, when computing units are subsequently allocated to the task execution units, computing resources can be allocated not only to processes but also to threads.

In an embodiment of the present disclosure, communication data between task execution units corresponding to an application program may be acquired, and a communication topology may be constructed according to the communication data.

The communication data may be obtained in different manners; for details, see step S404 in the embodiment shown in fig. 4 below, which is not described in detail here.

Specifically, the communication topology between the task execution units may be acquired in the following manner.

In one embodiment, since the application program may correspond to a plurality of different task execution units, each task execution unit executes a different task, a data interaction requirement may exist between the task execution units during the task processing, and if a data interaction requirement exists between two task execution units, it may be considered that a communication link exists between the two task execution units. In view of the above, when constructing the communication topology, the communication topology may be constructed with each task execution unit as a node and a communication link between the task execution units as an edge. See steps S404-S407 in the embodiment shown in fig. 4, which will not be described in detail here.
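As a minimal sketch of this construction, the communication topology can be assembled from a list of observed data exchanges. The record format `(sender, receiver, bytes)`, the function name, and the use of accumulated traffic volume as an edge weight are assumptions made for illustration; the description above only requires that an edge exist wherever two task execution units need to exchange data.

```python
from collections import defaultdict


def build_communication_topology(task_units, comm_records):
    """Build nodes and weighted undirected edges from communication records.

    task_units:   identifiers of processes or threads (hypothetical names)
    comm_records: iterable of (sender, receiver, nbytes) tuples
    """
    traffic = defaultdict(int)
    for sender, receiver, nbytes in comm_records:
        if sender == receiver:
            continue  # a unit exchanging data with itself creates no edge
        # Both directions of an exchange contribute to the same edge.
        traffic[frozenset((sender, receiver))] += nbytes
    return set(task_units), dict(traffic)


units = ["proc0", "proc1.thread0", "proc1.thread1"]
records = [
    ("proc0", "proc1.thread0", 100),
    ("proc1.thread0", "proc0", 50),
    ("proc1.thread0", "proc1.thread1", 10),
]
comm_nodes, comm_edges = build_communication_topology(units, records)
```

Pairs of task execution units that never appear in the records simply have no edge, which matches the case where no data interaction requirement exists between them.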

In another embodiment, a communication topology between the task execution units may be generated in advance according to pre-stored communication data between the task execution units corresponding to the application, and the generated communication topology is stored, so that before the solution provided by the embodiment of the present disclosure is executed, the communication topology between the task execution units corresponding to the application is already generated, and thus when the step S104 is executed, the communication topology between the task execution units corresponding to the application may be obtained from the pre-stored communication topology.

The pre-stored communication data may be provided by a user, or may be obtained by running an application program in advance.

Since the communication topology is generated in advance, the communication topology can be obtained quickly in the embodiment, so that the efficiency of allocating computing resources to the task execution unit is improved, and the efficiency of executing the task by the task execution unit is improved.
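One way to picture the pre-stored variant is a topology that is written to storage once and read back when step S104 runs. The sketch below assumes a JSON file as the storage format; the file name, function names, and task execution unit identifiers are hypothetical and are not specified by the disclosure.

```python
import json
import os
import tempfile


def save_communication_topology(path, nodes, edges):
    """Store a communication topology (nodes plus undirected edges) as JSON."""
    payload = {
        "nodes": sorted(nodes),
        "edges": sorted(sorted(edge) for edge in edges),
    }
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)


def load_communication_topology(path):
    """Read back a topology written by save_communication_topology."""
    with open(path, encoding="utf-8") as fh:
        payload = json.load(fh)
    return set(payload["nodes"]), {frozenset(edge) for edge in payload["edges"]}


# Generated ahead of time, e.g. from user-provided data or an earlier run.
nodes = {"proc0", "proc1.thread0", "proc1.thread1"}
edges = {frozenset(("proc0", "proc1.thread0")),
         frozenset(("proc1.thread0", "proc1.thread1"))}
cache_path = os.path.join(tempfile.mkdtemp(), "comm_topology.json")
save_communication_topology(cache_path, nodes, edges)

# Later, when step S104 runs, the stored topology is simply read back.
loaded_nodes, loaded_edges = load_communication_topology(cache_path)
```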

Step S105: topology mapping is performed on the total physical topology and the communication topology, and computing resources are allocated to the task execution units corresponding to the application program based on the mapping result, so that each task execution unit executes its task based on the allocated resources.

Because the total physical topology reflects the communication condition among the computing units in the distributed computing system, and the communication topology reflects the communication condition among the task execution units, after the total physical topology and the communication topology are mapped, the mapping result can reflect the mapping relation between the task execution units and the computing units.

In this step, a mapping relation between the total physical topology and the communication topology may be established by using a graph mapping algorithm with respect to the total physical topology and the communication topology, and the computing resources provided by the computing units corresponding to the task execution units may be allocated to the task execution units according to the mapping relation, so that the task execution units may execute the tasks based on the allocated resources, thereby completing the execution process of the application program.

In one embodiment, the graph mapping algorithm may be a multivariate extremum-based graph cut algorithm.
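To make the idea of topology mapping concrete, the sketch below uses a simple greedy heuristic. It is not the multivariate extremum-based graph cut algorithm mentioned above, whose details are not given here; it is a deliberately simpler stand-in. The function name, the traffic weights on communication edges, and the example graphs are assumptions. The heuristic places the most heavily communicating task execution units first and prefers computing units that are directly linked to the unit already chosen for the communication partner.

```python
def greedy_topology_mapping(phys_nodes, phys_edges, comm_nodes, comm_edges):
    """Map each task execution unit to a distinct computing unit (greedy sketch).

    phys_edges: set of frozenset({u, v}) physical communication links
    comm_edges: {frozenset({a, b}): traffic} between task execution units
    """
    if len(comm_nodes) > len(phys_nodes):
        raise ValueError("not enough computing units for the task execution units")
    neighbours = {node: set() for node in phys_nodes}
    for edge in phys_edges:
        u, v = tuple(edge)
        neighbours[u].add(v)
        neighbours[v].add(u)
    mapping = {}
    free = set(phys_nodes)

    def place(task, near=None):
        if task in mapping:
            return
        # Prefer free units directly linked to the partner's unit, if any.
        pool = free & neighbours[near] if near is not None else set()
        if not pool:
            pool = free
        # Among candidates, pick the one with the most free neighbours.
        unit = min(pool, key=lambda n: (-len(neighbours[n] & free), n))
        mapping[task] = unit
        free.remove(unit)

    # Handle the heaviest communication edges first.
    ordered = sorted(comm_edges.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    for pair, _traffic in ordered:
        first, second = sorted(pair)
        if first not in mapping and second not in mapping:
            place(first)
            place(second, near=mapping[first])
        elif first in mapping:
            place(second, near=mapping[first])
        else:
            place(first, near=mapping[second])
    # Task execution units without any communication edge are placed last.
    for task in sorted(comm_nodes):
        place(task)
    return mapping


phys_nodes = {"X1", "X2", "Y1", "Y2"}
phys_edges = {frozenset(("X1", "X2")), frozenset(("Y1", "Y2")),
              frozenset(("X1", "Y1"))}
comm_nodes = {"a", "b", "c"}
comm_edges = {frozenset(("a", "b")): 100, frozenset(("b", "c")): 10}
mapping = greedy_topology_mapping(phys_nodes, phys_edges, comm_nodes, comm_edges)
```

In this small example every pair of communicating task execution units ends up on directly linked computing units. A greedy pass gives no such guarantee in general, which is one reason a stronger graph mapping algorithm may be preferred in practice.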

In addition to the implementations mentioned at step S101 in the embodiment shown in fig. 1a above, the first physical topology may also be obtained through steps S201 to S204 of the embodiment shown in fig. 2 below.

Referring to fig. 2, a flowchart of a second task execution method is provided, and the task execution method in this embodiment includes the following steps S201 to S208.

Step S201: detecting computing units present in the device and communication links between the computing units.

In one embodiment, a preset physical topology detection tool may be used to detect computing units present in the device and communication links between the computing units.

For example, the devices in the distributed computing system may access a preset communication library that stores various tools, such as physical topology detection tools, together with the components those tools require. In this case, each device in the distributed computing system connects to the communication library and invokes the physical topology detection tool in the library that targets the computing units in the device. The tool then uses the components stored in the communication library to detect the computing units in the device and the communication links between them, and feeds the detection result back to the device.

The communication library may include detection tools for various computing unit architectures, so that physical topologies of computing units in various architectures can be detected. In addition, the communication library may include a detection tool for detecting computing units in different communication modes, for example, a P2P (Point-to-Point) communication mode, a mode for performing communication based on collective semantics, and the like, so that the extensibility and portability of the scheme provided by the embodiment of the present disclosure may be enhanced.

Step S202: constructing a first topology by taking each computing unit as a node and the communication links among the computing units as edges.

In the first topology constructed in this step, each node corresponds to one computing unit. After the first topology is obtained, the nodes having edges can be known by observing the first topology, and then the computing units corresponding to the nodes having communication links can be known.
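Steps S201 and S202 can be pictured with a small adjacency structure. In the sketch below, the detection result is supplied as plain lists; the unit names, the ring-shaped link layout, and the helper names are assumptions made for the example and do not come from any particular detection tool.

```python
def build_first_topology(units, links):
    """Build an undirected graph: computing units as nodes, links as edges."""
    adjacency = {unit: set() for unit in units}
    for a, b in links:
        if a not in adjacency or b not in adjacency:
            raise ValueError(f"link ({a}, {b}) references an undetected unit")
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def linked_units(adjacency, unit):
    """Return the units that share a communication link with the given unit."""
    return sorted(adjacency[unit])


# Hypothetical detection result for one device with four computing units.
detected_units = ["gpu0", "gpu1", "gpu2", "gpu3"]
detected_links = [("gpu0", "gpu1"), ("gpu1", "gpu2"),
                  ("gpu2", "gpu3"), ("gpu3", "gpu0")]
first_topology = build_first_topology(detected_units, detected_links)
```

Looking up a node in the adjacency structure corresponds to "observing the first topology" in the text: it tells which computing units have a communication link with the unit in question.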

Step S203: obtaining the computing capability characterization value of each computing unit and the communication capability characterization value of each communication link.

Owing to differences in hardware attributes, the computing capability of each computing unit may differ. In application, the computing capability of a computing unit can be represented by a computing capability characterization value, so that the strength of the unit's computing capability can be learned by consulting this value. In one embodiment, the computing capability characterization value may include computing power information of the computing unit, for example, the amount of available computing resources of the computing unit, its hardware model, and the like.

Owing to factors such as the hardware attributes and communication requirements of the computing units, the communication capability of the communication link between each pair of computing units may differ. In application, the communication capability of a communication link between computing units can be represented by a communication capability characterization value, so that the communication capability of the link between two computing units can be learned by consulting this value. The communication capability may be the communication quality, communication reliability, and the like of the communication link.

In one embodiment, the communication capability characterizing value may include at least one of the following information: physical distance between computing units connected by the communication link, link bandwidth of the communication link, communication delay of the communication link, and the like.

The computing capability characterization value and the communication capability characterization value can each be obtained by combining multiple kinds of information, so that the computing capability of each computing unit and the communication capability of each communication link can be characterized comprehensively.

The manner of obtaining the computing capability characterization value of a computing unit and the communication capability characterization value of a communication link is explained below.

In one embodiment, the computing power information of each computing unit may be pre-stored in the device, in which case, the computing power characterization value may be obtained by directly reading the computing power information of the computing unit from the device.

In another embodiment, when the physical topology detection tool is used to detect the computing units existing in the device and the communication links among the computing units, the physical distances among the computing units connected with the detected communication links, the link bandwidths of the communication links, and the communication delay information of the communication links can be counted, so as to obtain the communication capability characterization value.

Step S204: setting the attributes of the nodes in the first topology based on the obtained computing capability characterization values, and setting the attributes of the edges in the first topology based on the obtained communication capability characterization values, to obtain the first physical topology.

After the first topology has been generated in step S202, although the communication links between the computing units corresponding to the nodes can be known through the first topology, in order to more clearly and intuitively know the computing capability of the computing unit corresponding to each node and the communication capability of each communication link, in the scheme provided in this embodiment, an attribute related to the computing capability is further set for each node, and an attribute related to the communication capability is set for each edge.

In this step, when setting the attributes of the nodes in the first topology, the computing power information may be set directly as the attribute value of the node. Thus, the larger the attribute value of a node in the first topology, the higher the computing capability of the computing unit corresponding to that node.

In setting the attributes of the edges in the first topology, in one embodiment, in the case that the communication capability characterizing value includes a physical distance between the computing units connected by the communication link, a link bandwidth of the communication link, and a communication delay of the communication link, the physical distance between the computing units connected by the communication link, the link bandwidth of the communication link, and the communication delay of the communication link may be respectively set as sub-attribute values of the attributes of the edges in the first topology, so that a plurality of attribute values may exist in the attributes of the edges. In this way, when the communication capability corresponding to the communication link corresponding to the edge is determined, the determination can be made from different aspects corresponding to the attribute values.
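A possible data layout for this embodiment keeps one attribute value per node and a small record of sub-attribute values per edge. The class names, field names, and numeric values below are hypothetical, and the sketch assumes that the computing power information has been reduced to a single number per computing unit.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LinkAttributes:
    """Sub-attribute values of an edge (units are left unspecified here)."""
    distance: float
    bandwidth: float
    delay: float


@dataclass
class FirstPhysicalTopology:
    node_attr: dict = field(default_factory=dict)  # unit -> computing power value
    edge_attr: dict = field(default_factory=dict)  # frozenset({u, v}) -> LinkAttributes

    def set_node(self, unit, computing_power):
        self.node_attr[unit] = computing_power

    def set_edge(self, a, b, attrs):
        if a not in self.node_attr or b not in self.node_attr:
            raise KeyError("both endpoints must be added as nodes first")
        self.edge_attr[frozenset((a, b))] = attrs

    def strongest_unit(self):
        """Return the unit whose node attribute value is largest."""
        return max(self.node_attr, key=self.node_attr.get)


topology = FirstPhysicalTopology()
topology.set_node("gpu0", 10.0)
topology.set_node("gpu1", 20.0)
topology.set_edge("gpu0", "gpu1",
                  LinkAttributes(distance=1.0, bandwidth=300.0, delay=0.002))
```

Keeping the three sub-attribute values separate lets a later step judge a link from each aspect individually, for instance by bandwidth alone or by delay alone.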

In another embodiment, in a case that the communication capability characterizing value includes a physical distance between computing units connected to the communication link, a link bandwidth of the communication link, and a communication delay of the communication link, the three pieces of information may be weighted by a preset weight, and an obtained computation result is set as an attribute value of the edge in the first topology, where the larger an absolute value of the preset weight of the information is, the more important the attribute represented by the information is for evaluating the communication capability. In addition, the weight value can be set to be positive or negative, so that the communication capacity of the communication link can be represented by the size of the attribute value after weighted calculation.

For example, the weight of the physical distance between the computing units connected by a communication link may be set to 0.2, the weight of the communication delay of the communication link may be set to 0.4, and the weight of the link bandwidth of the communication link may be set to -0.5. If the physical distance, the communication delay, and the link bandwidth of one communication link are v1, v2, and v3, respectively, the attribute value of the edge corresponding to the communication link may be obtained through weighted summation: 0.2 × v1 + 0.4 × v2 - 0.5 × v3. In this case, the smaller the attribute value of an edge in the first topology, the stronger the communication capability of the communication link corresponding to the edge.
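The weighted summation in the example above can be sketched as follows. This is a minimal illustration only: the function name `edge_attribute`, the dictionary keys, and the two sample links with their unitless measurements are assumptions introduced here, while the weights 0.2, 0.4, and -0.5 come from the example in the text.

```python
# Weights from the example in the text. A positive weight means a larger value
# weakens the link; a negative weight means a larger value strengthens it.
WEIGHTS = {"distance": 0.2, "delay": 0.4, "bandwidth": -0.5}


def edge_attribute(distance, delay, bandwidth, weights=WEIGHTS):
    """Return 0.2*v1 + 0.4*v2 - 0.5*v3 for one communication link."""
    return (weights["distance"] * distance
            + weights["delay"] * delay
            + weights["bandwidth"] * bandwidth)


# Hypothetical links with made-up, unitless measurements.
fast_link = edge_attribute(distance=1.0, delay=2.0, bandwidth=10.0)
slow_link = edge_attribute(distance=4.0, delay=8.0, bandwidth=1.0)

# A smaller attribute value indicates a stronger communication capability.
print(fast_link < slow_link)
```

In practice the three inputs would likely need to be normalized to comparable scales before weighting; the text does not specify how, so the sketch leaves that out.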

Step S205: a second physical topology among the devices in the distributed computing system is obtained.

Step S206: generating the total physical topology among the computing units in the distributed computing system according to the first physical topology and the second physical topology.

Step S207: obtaining the communication topology among the task execution units corresponding to the application program.

Step S208: performing topology mapping on the total physical topology and the communication topology, and allocating computing resources to the task execution units corresponding to the application program based on the mapping result, so that each task execution unit executes its task based on the allocated resources.

The steps S205 to S208 are the same as the steps S102 to S105 shown in fig. 1, and will not be described in detail.

As can be seen from the above, in the scheme provided in this embodiment, the attribute of the node of the first topology is set based on the computation capability characterization value of the computing unit, and the attribute of the edge of the first topology is set based on the communication capability characterization value of the communication link between the computing units, so as to obtain the first physical topology, which not only can characterize which computing units have communication links therebetween, but also can characterize the computing capability of the computing units and the communication capability of the communication link between the computing units, so that the information that can be characterized by the total physical topology graph obtained according to the first physical topology is richer, and thus the mapping relationship between the subsequently generated total physical topology and the communication topology is more accurate, the computing resources allocated to each task execution unit are more accurate, and the probability that each task execution unit executes smoothly is improved.

In addition to the implementation described for step S102 in the embodiment shown in fig. 1, the second physical topology may also be obtained through steps S302-S306 in the embodiment shown in fig. 3 below.

Referring to fig. 3, fig. 3 is a schematic flowchart of a third task execution method provided in the embodiment of the present disclosure, where the task execution method in the embodiment includes the following steps S301 to S309.

Step S301: a first physical topology of devices in a distributed computing system is obtained.

The above step S301 is the same as the step S101 shown in the foregoing embodiment of fig. 1, and is not described in detail here.

Step S302: detecting devices present in a distributed computing system and communication links between devices.

In one embodiment, a pre-defined physical topology detection tool may be used to detect devices present in a distributed computing system and communication links between devices.

For example, devices in a distributed computing system may access a pre-defined communications library that stores various tools such as physical topology detection tools and components that these tools need to use. In this case, the devices in the distributed computing system are connected to the communication library respectively, and a physical topology detection tool for the devices in the communication library is called, and at this time, the physical topology detection tool detects the devices existing in the distributed computing system and the communication links between the devices by using the components stored in the communication library, and feeds back the detection result to the devices.

Step S303: constructing a second topology by taking the devices as nodes and taking the communication links among the devices as edges.

In the second topology constructed in this step, each node corresponds to one device. After the second topology is obtained, the nodes having edges can be known by observing the second topology, and then the devices corresponding to the nodes having communication links can be known.

Step S304: an IP address of a device in a distributed computing system is obtained.

In one embodiment, each device in the distributed computing system may obtain an IP address of its own network card, and then send the obtained IP address to the electronic device serving as the execution subject, so that the electronic device obtains the IP address of each device in the distributed computing system.

Step S305: obtaining the hop count between devices in the distributed computing system, where the hop count is determined by means of route tracing based on the obtained IP addresses.

In one embodiment, after obtaining the IP address of each device, the electronic device as the execution subject may send the obtained IP address to each device, and each device may perform route tracing on the IP addresses of other devices, determine the number of hops with the other devices according to the result of the route tracing, and send the determined number of hops to the electronic device.

For example, each device may trace the IP addresses of the other devices through the traceroute command, obtain the trace results, and then determine the IP hop count to each of the other devices according to the trace results.
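As a rough sketch of how a hop count might be derived from a trace result, the snippet below parses traceroute-style text and returns the highest hop index. The function name `count_hops` and the sample output (addresses and timings) are fabricated for illustration, and the parser assumes the common layout in which each hop line begins with its hop index.

```python
import re

# A hop line starts with its index, e.g. " 2  10.0.0.7 (10.0.0.7)  0.5 ms".
HOP_LINE = re.compile(r"^\s*(\d+)\s+\S")


def count_hops(traceroute_output: str) -> int:
    """Return the highest hop index found in traceroute-style output."""
    hops = []
    for line in traceroute_output.splitlines():
        match = HOP_LINE.match(line)
        if match:
            hops.append(int(match.group(1)))
    return max(hops, default=0)


# Fabricated sample output: the destination is reached on the second hop.
SAMPLE = """traceroute to 10.0.0.7 (10.0.0.7), 30 hops max, 60 byte packets
 1  10.0.1.1 (10.0.1.1)  0.321 ms  0.298 ms  0.287 ms
 2  10.0.0.7 (10.0.0.7)  0.512 ms  0.498 ms  0.490 ms
"""

print(count_hops(SAMPLE))
```

On a real device the text could come from running the traceroute command, for example via `subprocess.run(["traceroute", ip], capture_output=True, text=True).stdout`, provided the command is installed and permitted on that host.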

Step S306: setting the attribute of the edges in the second topology based on the obtained hop count to obtain the second physical topology among the devices in the distributed computing system.

In the above step S303, the second topology has been generated, and although the communication links between the devices corresponding to the nodes can be known through the second topology, in order to more clearly and intuitively know the connection condition between the devices, in the scheme provided in this embodiment, the attribute of the edge in the second topology may also be set by using the hop count between the devices on the basis of the second topology, so as to obtain the second physical topology. Thus, according to the attribute of the edge in the second physical topology, it can be known whether two devices with communication links are directly connected or indirectly connected through other devices.
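Steps S303 and S306 can be pictured with a small sketch in which plain dictionaries stand in for the graph. The function names, the `"hops"` attribute key, and the three host names are assumptions made for this illustration; the disclosure does not prescribe a data structure.

```python
def build_second_physical_topology(devices, links, hop_counts):
    """Build a graph whose nodes are devices and whose edges carry hop counts.

    devices: iterable of device names
    links: iterable of (device_a, device_b) pairs that have a communication link
    hop_counts: {frozenset({device_a, device_b}): hops} from route tracing
    """
    edges = {}
    for a, b in links:
        pair = frozenset((a, b))
        edges[pair] = {"hops": hop_counts[pair]}
    return {"nodes": set(devices), "edges": edges}


def is_directly_connected(topology, a, b):
    """True when an edge exists and its hop-count attribute equals 1."""
    edge = topology["edges"].get(frozenset((a, b)))
    return edge is not None and edge["hops"] == 1


# Hypothetical three-host cluster.
hops = {frozenset(("host-a", "host-b")): 1, frozenset(("host-a", "host-c")): 3}
second = build_second_physical_topology(
    ["host-a", "host-b", "host-c"],
    [("host-a", "host-b"), ("host-a", "host-c")],
    hops,
)
print(is_directly_connected(second, "host-a", "host-b"))
```

Keying each edge by a `frozenset` of its endpoints makes the edge undirected, so the lookup works regardless of the order in which the two devices are given.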

Step S307: generating the total physical topology among the computing units in the distributed computing system according to the first physical topology and the second physical topology.

Because the edge of the second physical topology is provided with the attribute, when the total physical topology is generated, the hop count between the devices corresponding to the node can be obtained according to the attribute set by the edge of the second physical topology, and then the connection condition between the two devices can be judged. For example, if the attribute value of an edge between nodes corresponding to two devices is 1, the two devices may be considered to be directly connected; if the attribute value of an edge between nodes corresponding to two devices is greater than 1, the two devices may be considered to be connected via at least one other device.

In view of the foregoing, in an embodiment, when generating the total physical topology, the directly connected devices in the distributed computing system may be determined according to the attributes set on each edge in the second physical topology, and then a new edge is established between nodes corresponding to the computing units in the directly connected devices on the basis of the first physical topology, so as to obtain the total physical topology.

When a new edge is established for a node corresponding to a computing unit in two directly connected devices, a new edge may be established for only a node corresponding to a part of the computing units in the two devices. The partial calculation unit may be determined according to information such as the type of data processing that can be performed, the amount of calculation resources that can be provided, and the like.
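One possible way to generate the total physical topology in this manner is sketched below. It assumes dictionary-based graphs, globally unique computing unit names such as `a:gpu0`, and a hypothetical `bridge_units` table listing which computing units of each device receive the new inter-device edges; how that subset is chosen is left open by the text.

```python
def build_total_topology(first_topologies, second_topology, bridge_units):
    """Merge per-device topologies and link units of directly connected devices.

    first_topologies: {device: {"nodes": {unit: attr}, "edges": {frozenset: attr}}}
    second_topology: {"edges": {frozenset({dev_a, dev_b}): {"hops": n}}}
    bridge_units: {device: [units that receive inter-device edges]} (assumption)
    """
    total = {"nodes": {}, "edges": {}}
    # Start from the union of all first physical topologies.
    for topo in first_topologies.values():
        total["nodes"].update(topo["nodes"])
        total["edges"].update(topo["edges"])
    # Add new edges only between devices whose hop count is 1.
    for pair, attrs in second_topology["edges"].items():
        if attrs["hops"] != 1:
            continue
        dev_a, dev_b = tuple(pair)
        for unit_a in bridge_units[dev_a]:
            for unit_b in bridge_units[dev_b]:
                total["edges"][frozenset((unit_a, unit_b))] = {"inter_device": True}
    return total


first = {
    "host-a": {"nodes": {"a:gpu0": 10, "a:gpu1": 10},
               "edges": {frozenset(("a:gpu0", "a:gpu1")): {"inter_device": False}}},
    "host-b": {"nodes": {"b:gpu0": 10}, "edges": {}},
    "host-c": {"nodes": {"c:gpu0": 10}, "edges": {}},
}
second = {"edges": {frozenset(("host-a", "host-b")): {"hops": 1},
                    frozenset(("host-a", "host-c")): {"hops": 2}}}
bridges = {"host-a": ["a:gpu0"], "host-b": ["b:gpu0"], "host-c": ["c:gpu0"]}
total = build_total_topology(first, second, bridges)
print(sorted(sorted(edge) for edge in total["edges"]))
```

In this example host-a and host-c are two hops apart, so no edge is created between their computing units, and only `a:gpu0` (not `a:gpu1`) is linked to `b:gpu0`.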

Step S308: obtaining the communication topology among the task execution units corresponding to the application program.

Step S309: performing topology mapping on the total physical topology and the communication topology, and allocating computing resources to the task execution units corresponding to the application program based on the mapping result, so that each task execution unit executes its task based on the allocated resources.

The steps S308-S309 are the same as the steps S104-S105 shown in fig. 1, and will not be described in detail.

As can be seen from the above, in the scheme provided in this embodiment, the second topology is constructed according to the devices in the distributed computing system and the communication links between them, and the IP hop count between the devices is also obtained. After attribute values are set for the edges in the second topology based on the IP hop count, the resulting second physical topology can represent not only which devices have communication links between them, but also the IP hop count between those devices, so that the total physical topology obtained from it is more accurate.

In addition to the implementation described for step S104 in the embodiment shown in fig. 1, the communication topology may also be obtained through steps S404-S407 in the embodiment shown in fig. 4 below.

Referring to fig. 4, fig. 4 is a schematic flowchart of a fourth task execution method provided in the embodiment of the present disclosure, where the task execution method in the embodiment includes the following steps S401 to S408.

Step S401: a first physical topology of devices in a distributed computing system is obtained.

Step S402: a second physical topology among the devices in the distributed computing system is obtained.

Step S403: generating the total physical topology among the computing units in the distributed computing system according to the first physical topology and the second physical topology.

The steps S401 to S403 are respectively the same as the steps S101-S103 shown in the embodiment of fig. 1, and will not be described in detail here.

Step S404: obtaining communication data between the task execution units corresponding to the application program.

The communication data may include information such as communication records among the task execution units, computation resource demand of the task execution units, and communication data amount among the task execution units, where the communication records may record information such as start and end time of communication, communication duration, identifiers of tasks executed by the task execution units, and communication link identifiers.

Specifically, the communication data between the task execution units corresponding to the application programs may be obtained in different manners.

The first mode: obtaining communication data between task execution units that is generated while the application program runs and is collected by a preset data collection tool.

For example, the communication data between task execution units generated while the application program runs may be collected through an MPI (Message Passing Interface) profiling tool developed by NVIDIA. Of course, other data collection tools may also be used, and the embodiments of the present disclosure do not limit the specific tool, as long as it can collect the communication data.

The preset data collection tool is used for collecting the communication data, so that different data collection tools can be adopted according to different application scenes, and the use flexibility of the scheme provided by the embodiment of the disclosure is improved.

The second mode: receiving target communication data sent by the communication library through a preset data sending interface in a target time period, and determining the received target communication data as the communication data between the task execution units corresponding to the application program.

The target time period is the time period from the start to the end of the application program, and the target communication data is data generated by communication between the task execution units and captured in real time by the communication library through a preset data capture interface.

After the data sending interface and the data capture interface are set for the communication library, these interfaces can be accessed in various application scenarios, and communication data is obtained through a unified interface, which improves the compatibility of the scheme provided by the embodiment of the disclosure.

The third mode: obtaining communication data, provided by the user, between the task execution units corresponding to the application program.

In some cases, a user is familiar with the framework of the application program, and can know the condition of data communication between the task execution units without running the application program, and in this case, the communication data can be provided by the user in advance, so that the process of running the application program to obtain the communication data is avoided, the efficiency of allocating computing resources to the task execution units is improved, and the efficiency of executing tasks by the task execution units is improved.

Step S405: determining the communication links among the task execution units, the computing resource demand of the task execution units, and the communication data amount among the task execution units according to the obtained communication data.

If a communication record exists between one task execution unit and another task execution unit in the communication data, the two task execution units may be considered to have a communication link therebetween, so that the communication record may be analyzed to obtain the communication link between the two task execution units, and further, the corresponding calculation resource demand and the communication data amount between the task execution units may be obtained from the communication data according to the communication record.
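The analysis in step S405 might look like the following sketch. The layout of `comm_data` (a `"records"` list with `"src"`, `"dst"`, and `"bytes"` fields plus a `"demand"` table) and the rank names are assumptions made for illustration; the text only states what kinds of information the communication data may contain.

```python
from collections import defaultdict


def analyze_communication_data(comm_data):
    """Return (links, demands, volumes) derived from collected communication data.

    comm_data["records"]: list of {"src", "dst", "bytes"} communication records
    comm_data["demand"]: {task_unit: computing resource demand}
    """
    volumes = defaultdict(int)
    for record in comm_data["records"]:
        # A record between two task execution units implies a communication link.
        link = frozenset((record["src"], record["dst"]))
        volumes[link] += record["bytes"]
    links = set(volumes)
    demands = dict(comm_data["demand"])
    return links, demands, dict(volumes)


# Hypothetical data for three task execution units.
comm_data = {
    "records": [
        {"src": "rank0", "dst": "rank1", "bytes": 4096},
        {"src": "rank1", "dst": "rank0", "bytes": 1024},
        {"src": "rank1", "dst": "rank2", "bytes": 512},
    ],
    "demand": {"rank0": 4, "rank1": 4, "rank2": 2},
}
links, demands, volumes = analyze_communication_data(comm_data)
print(volumes[frozenset(("rank0", "rank1"))])
```

Traffic in both directions between the same pair is summed into one undirected link here; a directed variant would key on `(src, dst)` tuples instead.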

Step S406: constructing a third topology by taking each task execution unit as a node and taking the communication links between the task execution units as edges.

In the third topology constructed in this step, each node corresponds to one task execution unit. After the third topology is obtained, the nodes that have edges between them can be identified by observing the third topology, and thus the task execution units that have communication links between them can be identified.

Step S407: setting the attribute of the nodes in the third topology based on the obtained computing resource demand, and setting the attribute of the edges in the third topology based on the obtained communication data amount, to obtain the communication topology between the task execution units corresponding to the application program.

In the above step S406, the third topology has been generated. Although the communication links between the task execution units corresponding to the nodes can be known through the third topology, in order to show the computing resource demand of the task execution unit corresponding to each node and the communication data amount of each communication link more clearly and intuitively, in the scheme provided in this embodiment, the computing resource demand is used to set the attributes of the nodes of the third topology, and the communication data amount is used to set the attributes of the edges of the third topology.
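Steps S406 and S407 can be illustrated with a short sketch that attaches the two kinds of attributes to a dictionary-based graph. The function name and the attribute keys `"demand"` and `"data_amount"` are hypothetical names chosen for this example.

```python
def build_communication_topology(demands, volumes):
    """Nodes carry computing resource demand; edges carry communication data amount.

    demands: {task_unit: computing resource demand}
    volumes: {frozenset({task_a, task_b}): communication data amount}
    """
    nodes = {task: {"demand": demand} for task, demand in demands.items()}
    edges = {}
    for link, amount in volumes.items():
        # Every edge endpoint must be a known task execution unit.
        if not all(task in nodes for task in link):
            raise ValueError(f"link {sorted(link)} references an unknown task execution unit")
        edges[link] = {"data_amount": amount}
    return {"nodes": nodes, "edges": edges}


# Hypothetical input for two task execution units.
topology = build_communication_topology(
    {"rank0": 4, "rank1": 2},
    {frozenset(("rank0", "rank1")): 5120},
)
print(topology["edges"][frozenset(("rank0", "rank1"))]["data_amount"])
```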

Step S408: performing topology mapping on the total physical topology and the communication topology, and allocating computing resources to the task execution units corresponding to the application program based on the mapping result, so that each task execution unit executes its task based on the allocated resources.

As can be seen from step S408, in the solution provided in this embodiment, the communication topology includes not only the nodes corresponding to the task execution units and the edges between the nodes, but also the computing resource demand set as the attribute of each node and the communication data amount set as the attribute of each edge. Therefore, when topology mapping is performed on the total physical topology and the communication topology, the computing resource demand of each task execution unit and the communication data amount between task execution units can be taken into account. As a result, the amount of computing resources that the computing unit corresponding to a mapped node in the total physical topology can provide better matches the computing resource demand of the task execution unit corresponding to the node in the communication topology, and the data communication amount that the communication link corresponding to a mapped edge in the total physical topology can carry better matches the communication data amount of the communication link corresponding to the edge in the communication topology.

As can be seen from the above, in the scheme provided in this embodiment, the attribute of the node in the third topology is set based on the obtained amount of demand for computing resources, and the attribute of the edge in the third topology is set based on the obtained amount of communication data, so as to obtain the communication topology between the task execution units corresponding to the application program, so that the communication topology can not only represent which task execution units have communication links therebetween, but also represent the computing resources required by each task execution unit and the amount of communication data between the task execution units, so that the information that the communication topology can provide is richer and more diverse, and more reference information is provided for mapping the total physical topology and the communication topology, so that the mapping result is more accurate, the accuracy of allocating computing resources to the task execution units can be improved, and the efficiency of the task execution units for executing tasks can be improved.
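This section does not specify the algorithm used for topology mapping. Purely as an illustration of how node and edge attributes could inform the mapping, the sketch below uses a simple greedy heuristic, which is a stand-in chosen here and not the method of the disclosure. The function name, the `UNLINKED` penalty, and the convention that a smaller physical edge cost means a stronger link are all assumptions.

```python
UNLINKED = 1e9  # assumed penalty for two computing units with no link


def greedy_map(comm_topology, phys_topology):
    """Assign each task execution unit to a distinct computing unit.

    comm_topology: {"nodes": {task: demand}, "edges": {frozenset({t1, t2}): volume}}
    phys_topology: {"nodes": {unit: capacity}, "edges": {frozenset({u1, u2}): cost}}
    """
    mapping = {}
    free = dict(phys_topology["nodes"])
    # Place the most demanding task execution units first.
    for task, demand in sorted(comm_topology["nodes"].items(), key=lambda kv: -kv[1]):
        best_unit, best_cost = None, None
        for unit, capacity in free.items():
            if capacity < demand:
                continue  # this unit cannot satisfy the demand
            cost = 0.0
            for pair, volume in comm_topology["edges"].items():
                if task not in pair:
                    continue
                (peer,) = pair - {task}
                if peer in mapping:
                    link = frozenset((unit, mapping[peer]))
                    cost += volume * phys_topology["edges"].get(link, UNLINKED)
            if best_cost is None or cost < best_cost:
                best_unit, best_cost = unit, cost
        if best_unit is None:
            raise ValueError(f"no computing unit can host {task}")
        mapping[task] = best_unit
        del free[best_unit]
    return mapping


# Hypothetical topologies: g0 and g1 share a strong link, g2 is farther away.
comm = {"nodes": {"t0": 4, "t1": 4, "t2": 2},
        "edges": {frozenset(("t0", "t1")): 100, frozenset(("t1", "t2")): 1}}
phys = {"nodes": {"g0": 8, "g1": 8, "g2": 8},
        "edges": {frozenset(("g0", "g1")): 1, frozenset(("g0", "g2")): 10,
                  frozenset(("g1", "g2")): 10}}
print(greedy_map(comm, phys))
```

The heavily communicating pair t0 and t1 lands on the strongly linked units g0 and g1, while t2, which exchanges little data, is placed on g2. A greedy pass like this is not guaranteed to find the best mapping.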

It should be noted that, the process of obtaining the total physical topology in the foregoing embodiments may be executed in parallel or in series with the process of obtaining the communication topology, and the embodiments of the present disclosure do not limit this.

The following describes a task execution method provided by the embodiment of the present disclosure in detail with reference to fig. 5. Fig. 5 is a workflow diagram of a task execution method according to an embodiment of the present disclosure.

Firstly, when the method starts to be executed, judging whether the electronic equipment locally stores the total physical topology, if the electronic equipment does not locally store the total physical topology, detecting the physical topology, and thus obtaining the total physical topology; and if the electronic equipment stores the total physical topology locally, acquiring the total physical topology locally from the electronic equipment.

After the total physical topology is obtained, judging whether a user can provide a communication topology, if the user cannot provide the communication topology, operating an application program, detecting the communication topology, obtaining the communication topology, mapping the total physical topology and the communication topology, and generating a mapping file; and if the user provides the communication topology, mapping the total physical topology and the communication topology by using the communication topology provided by the user to generate a mapping file.

Computing resources are then allocated to each task execution unit of the application program according to the content recorded in the mapping file, and each task execution unit executes its task using the allocated computing resources.
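The decision flow of fig. 5 described above can be summarized in a short sketch. All five parameters are hypothetical callables or values supplied by the caller; the detection and mapping steps themselves are represented by placeholders.

```python
def allocate_for_application(load_cached_physical, detect_physical,
                             user_comm_topology, detect_comm, map_topologies):
    """Follow the fig. 5 flow: reuse what is available, detect what is missing."""
    # Use the locally stored total physical topology if there is one.
    physical = load_cached_physical()
    if physical is None:
        physical = detect_physical()
    # Use the user-provided communication topology if there is one;
    # otherwise run the application and detect it.
    comm = user_comm_topology
    if comm is None:
        comm = detect_comm()
    return map_topologies(physical, comm)


calls = []


def detect_physical():
    calls.append("detect_physical")
    return "physical"


def detect_comm():
    calls.append("detect_comm")
    return "comm"


# No cached physical topology, but the user supplies a communication topology.
mapping_file = allocate_for_application(
    load_cached_physical=lambda: None,
    detect_physical=detect_physical,
    user_comm_topology="user-comm",
    detect_comm=detect_comm,
    map_topologies=lambda physical, comm: {"physical": physical, "comm": comm},
)
print(calls)
```

Here only the physical topology detection runs, because the communication topology was supplied by the user.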

Corresponding to the task execution method, the disclosure also provides a task execution device.

Referring to fig. 6, fig. 6 is a schematic structural diagram of a first task execution device according to an embodiment of the present disclosure, where the device includes the following modules 601 to 605.

A first physical topology obtaining module 601, configured to obtain a first physical topology of each device in the distributed computing system, where the first physical topology is: computing a topology between units in the device;

a second physical topology obtaining module 602, configured to obtain a second physical topology among the devices in the distributed computing system;

a total physical topology obtaining module 603, configured to generate a total physical topology among the computing units in the distributed computing system according to the first physical topology and the second physical topology;

a communication topology obtaining module 604, configured to obtain a communication topology between task execution units corresponding to the application;

and a topology mapping module 605, configured to perform topology mapping on the total physical topology and the communication topology, and allocate computing resources to the task execution unit corresponding to the application program based on the mapping result, so that each task execution unit executes the task based on the allocated resources.

In an embodiment of the present disclosure, a task execution unit corresponding to the application includes: a process and/or a thread created in a process.

In an embodiment of the present disclosure, the communication topology obtaining module is specifically configured to obtain, from a pre-stored communication topology, the communication topology between the task execution units corresponding to the application program, where the pre-stored communication topology is a communication topology between the task execution units generated in advance according to pre-stored communication data between the task execution units corresponding to the application program.

Referring to fig. 7, fig. 7 is a schematic structural diagram of a second task execution device according to an embodiment of the disclosure, where the device includes the following modules 701 to 709.

A first link detection submodule 701, configured to detect a computing unit existing in the device and a communication link between the computing units;

a first topology constructing sub-module 702, configured to construct a first topology by using each computing unit as a node and using a communication link between the computing units as an edge;

a calculation capability characterization value obtaining submodule 703 for obtaining a calculation capability characterization value of each calculation unit;

a communication capability characterizing value obtaining submodule 704, configured to obtain a communication capability characterizing value of each communication link;

a first attribute value setting submodule 705, configured to set an attribute of a node in the first topology based on the obtained computing capability characterizing value, and set an attribute of an edge in the first topology based on the obtained communication capability characterizing value, so as to obtain a first physical topology.

A second physical topology obtaining module 706, configured to obtain a second physical topology among the devices in the distributed computing system;

a total physical topology obtaining module 707, configured to generate a total physical topology among computing units in the distributed computing system according to the first physical topology and the second physical topology;

a communication topology obtaining module 708, configured to obtain a communication topology between task execution units corresponding to the application;

a topology mapping module 709, configured to perform topology mapping on the total physical topology and the communication topology, and allocate computing resources to the task execution unit corresponding to the application program based on the mapping result, so that each task execution unit executes the task based on the allocated resources.

In an embodiment of the present disclosure, the computing power characterization value obtaining submodule is specifically configured to obtain computing power information of each computing unit, and use the computing power information of each computing unit as a computing power characterization value of each computing unit; and/or the communication capability characterization value obtaining submodule is specifically configured to obtain, for each communication link, at least one of the following information, and determine the obtained information as the communication capability characterization value of each communication link: a link bandwidth of the communication link; a communication delay of the communication link; the physical distance between the computing units connected by the communication link.

Referring to fig. 8, fig. 8 is a schematic structural diagram of a third task execution device according to an embodiment of the present disclosure, where the device includes the following modules 801 to 809.

A first physical topology obtaining module 801, configured to obtain a first physical topology of each device in a distributed computing system, where the first physical topology is: computing a topology between units in the device;

a second link detection sub-module 802, configured to detect devices and communication links between the devices in the distributed computing system;

a second topology constructing sub-module 803, configured to construct a second topology by using each device as a node and using a communication link between each device as an edge;

an IP address obtaining sub-module 804, configured to obtain an IP address of a device in the distributed computing system;

a hop count obtaining sub-module 805 configured to obtain a hop count between devices in the distributed computing system, which is determined by a route tracing manner based on the obtained IP address;

and a second attribute value setting submodule 806, configured to set an attribute of an edge in the second topology based on the obtained hop count, so as to obtain a second physical topology among devices in the distributed computing system.

A total physical topology obtaining module 807, configured to generate a total physical topology among computing units in the distributed computing system according to the first physical topology and the second physical topology;

a communication topology obtaining module 808, configured to obtain a communication topology between task execution units corresponding to the application;

and a topology mapping module 809, configured to perform topology mapping on the total physical topology and the communication topology, and allocate computing resources to the task execution unit corresponding to the application program based on the mapping result, so that each task execution unit executes the task based on the allocated resources.

Referring to fig. 9, fig. 9 is a schematic structural diagram of a fourth task execution device according to an embodiment of the disclosure, where the device includes the following modules 901 to 908.

A first physical topology obtaining module 901, configured to obtain a first physical topology of each device in the distributed computing system, where the first physical topology is: computing a topology between units in the device;

a second physical topology obtaining module 902, configured to obtain a second physical topology among the devices in the distributed computing system;

a total physical topology obtaining module 903, configured to generate a total physical topology among the computing units in the distributed computing system according to the first physical topology and the second physical topology;

a communication data obtaining sub-module 904, configured to obtain communication data between task execution units corresponding to the application;

an attribute value determination submodule 905 configured to determine, according to the obtained communication data, a communication link between the task execution units, a calculation resource demand of the task execution units, and a communication data amount between the task execution units;

a third topology constructing submodule 906, configured to construct a third topology by taking each task execution unit as a node and taking the communication links between the task execution units as edges;

a third attribute value setting submodule 907, configured to set an attribute of a node in the third topology based on the obtained computation resource demand, and set an attribute of an edge in the third topology based on the obtained communication data amount, so as to obtain a communication topology between task execution units corresponding to the application program.

And a topology mapping module 908, configured to perform topology mapping on the total physical topology and the communication topology, and allocate computing resources to task execution units corresponding to the application program based on a mapping result, so that each task execution unit executes a task based on the allocated resources.

In an embodiment of the present disclosure, the communication data obtaining sub-module is specifically configured to obtain communication data between task execution units corresponding to an application program according to at least one of the following manners:

acquiring communication data between task execution units that is generated in the running process of the application program and collected by a preset data collection tool;

receiving target communication data sent by a communication library through a preset data sending interface in a target time period, and determining the received target communication data as communication data between task execution units corresponding to the application program, wherein the target time period is the time period from the start to the end of the application program, and the target communication data is data generated by communication between task execution units and captured in real time by the communication library through a preset data capture interface;

acquiring communication data, provided by the user, between the task execution units corresponding to the application program.

The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.

In one embodiment of the present disclosure, there is provided an electronic device including:

at least one processor; and

a memory, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the task execution method described in the method embodiments above.

In one embodiment of the present disclosure, a non-transitory computer-readable storage medium is provided, having stored thereon computer instructions for causing a computer to perform the task execution method of the aforementioned method embodiment.

In one embodiment of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the task execution method described in the aforementioned method embodiment.

FIG. 10 illustrates a schematic block diagram of an example electronic device 1000 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in FIG. 10, the device 1000 includes a computing unit 1001 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 1002 or a computer program loaded from a storage unit 1008 into a Random Access Memory (RAM) 1003. The RAM 1003 can also store various programs and data necessary for the operation of the device 1000. The computing unit 1001, the ROM 1002, and the RAM 1003 are connected to each other by a bus 1004. An input/output (I/O) interface 1005 is also connected to the bus 1004.

A number of components in device 1000 are connected to I/O interface 1005, including: an input unit 1006 such as a keyboard, a mouse, and the like; an output unit 1007 such as various types of displays, speakers, and the like; a storage unit 1008 such as a magnetic disk, an optical disk, or the like; and a communication unit 1009 such as a network card, a modem, a wireless communication transceiver, or the like. The communication unit 1009 allows the device 1000 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.

The computing unit 1001 may be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 1001 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 1001 executes the methods and processes described above, such as the task execution method. For example, in some embodiments, the task execution method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 1008. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 1000 via the ROM 1002 and/or the communication unit 1009. When the computer program is loaded into the RAM 1003 and executed by the computing unit 1001, one or more steps of the task execution method described above may be performed. Alternatively, in other embodiments, the computing unit 1001 may be configured by any other suitable means (e.g., by means of firmware) to perform the task execution method.

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read Only Memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel or sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.

The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims

1. A task execution method, comprising:

and performing topology mapping on the total physical topology and the communication topology, and distributing computing resources to the task execution units corresponding to the application programs based on mapping results, so that each task execution unit executes the tasks based on the distributed resources.

2. The method of claim 1, wherein the obtaining a first physical topology of each device in the distributed computing system comprises:

obtaining the first physical topology of each device in the distributed computing system as follows:

detecting computing units existing in the device and communication links among the computing units;

constructing a first topology by taking each computing unit as a node and taking each communication link between the computing units as an edge;

obtaining a computing capability characterization value of each computing unit and obtaining a communication capability characterization value of each communication link;

and setting the attribute of each node in the first topology based on the obtained computing capability characterization value, and setting the attribute of each edge in the first topology based on the obtained communication capability characterization value, to obtain the first physical topology.
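The construction described in claim 2 can be sketched as an attributed graph. The sketch below is illustrative only: the `Topology` class, the attribute keys `capability`, `bandwidth`, and `delay`, and the numeric values are assumptions, and the detection of computing units and links is replaced by plain dictionaries.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Topology:
    """Attributed undirected graph: one dict of attributes per node and edge."""
    nodes: Dict[str, dict] = field(default_factory=dict)
    edges: Dict[Tuple[str, str], dict] = field(default_factory=dict)


def build_first_physical_topology(
    units: Dict[str, float],
    links: Dict[Tuple[str, str], Tuple[float, float]],
) -> Topology:
    """units: capability value per computing unit.
    links: (bandwidth, delay) per pair of computing units.
    """
    topo = Topology()
    for unit, capability in units.items():
        # Node attribute: computing capability characterization value.
        topo.nodes[unit] = {"capability": capability}
    for (a, b), (bandwidth, delay) in links.items():
        if a not in topo.nodes or b not in topo.nodes:
            raise KeyError(f"link ({a}, {b}) references an unknown unit")
        key = (a, b) if a <= b else (b, a)  # normalize undirected edges
        # Edge attribute: communication capability characterization value.
        topo.edges[key] = {"bandwidth": bandwidth, "delay": delay}
    return topo


# Hypothetical device with two computing units joined by one link.
topo = build_first_physical_topology(
    units={"gpu0": 19.5, "gpu1": 19.5},
    links={("gpu1", "gpu0"): (300.0, 2.0)},
)
```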

3. The method of claim 2, wherein,

the obtaining of the computing capability characterization value of each computing unit comprises:

acquiring computing power information of each computing unit, and taking the computing power information of each computing unit as the computing capability characterization value of that computing unit;

and/or

the obtaining of the communication capability characterization value of each communication link comprises:

obtaining at least one of the following items of information for each communication link, and determining the obtained information as the communication capability characterization value of that communication link:

a link bandwidth of the communication link;

a communication delay of the communication link;

the physical distance between the computing units connected by the communication link.

4. The method of claim 1, wherein the obtaining a second physical topology among devices in the distributed computing system comprises:

detecting devices existing in the distributed computing system and communication links among the devices;

constructing a second topology by taking each device as a node and taking a communication link between each device as an edge;

obtaining an IP address of a device in the distributed computing system;

obtaining hop counts among the devices in the distributed computing system, the hop counts being determined by route tracing based on the obtained IP addresses;

and setting the attribute of the edge in the second topology based on the obtained hop count to obtain a second physical topology among the devices in the distributed computing system.
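One way to obtain a hop count by route tracing is to parse the output of the `traceroute` command-line tool. The following is a minimal sketch, assuming the common Linux output format in which each hop line starts with its hop index, and assuming `traceroute` is installed on the device that runs it. The function names, the `hops` attribute key, and the sample addresses are hypothetical.

```python
import re
import subprocess
from typing import Dict, Tuple

# A hop line starts with optional spaces followed by the hop index.
HOP_LINE = re.compile(r"^\s*(\d+)\s")


def hop_count(traceroute_output: str) -> int:
    """Return the highest hop index found in traceroute output."""
    hops = []
    for line in traceroute_output.splitlines():
        match = HOP_LINE.match(line)
        if match:
            hops.append(int(match.group(1)))
    if not hops:
        raise ValueError("no hop lines found")
    return max(hops)


def trace(ip: str) -> str:
    """Run traceroute against ip (requires the traceroute CLI)."""
    result = subprocess.run(
        ["traceroute", "-n", ip], capture_output=True, text=True, check=True
    )
    return result.stdout


def set_hop_attribute(
    edges: Dict[Tuple[str, str], dict], a: str, b: str, output: str
) -> None:
    """Store the hop count as the attribute of edge (a, b)."""
    edges.setdefault((a, b), {})["hops"] = hop_count(output)


# Illustrative output; the addresses and timings are made up.
SAMPLE = """traceroute to 10.0.0.2 (10.0.0.2), 30 hops max, 60 byte packets
 1  10.0.1.1  0.412 ms  0.398 ms  0.377 ms
 2  10.0.0.2  0.655 ms  0.641 ms  0.630 ms
"""

second_topology_edges: Dict[Tuple[str, str], dict] = {}
set_hop_attribute(second_topology_edges, "dev0", "dev1", SAMPLE)
```

Keeping the parsing separate from the `trace` call allows the hop count logic to be checked against recorded output without network access.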

5. The method of claim 1, wherein the obtaining of the communication topology between task execution units corresponding to the application program comprises:

acquiring communication data between task execution units corresponding to the application program;

according to the obtained communication data, determining a communication link among the task execution units, the computing resource demand of the task execution units and the communication data quantity among the task execution units;

constructing a third topology by taking each task execution unit as a node and taking a communication link between each task execution unit as an edge;

and setting the attribute of the node in the third topology based on the acquired computing resource demand, and setting the attribute of the edge in the third topology based on the acquired communication data amount to obtain the communication topology among the task execution units corresponding to the application program.

6. The method of claim 5, wherein the obtaining communication data between task execution units corresponding to the application program comprises:

obtaining communication data between task execution units corresponding to the application program according to at least one of the following modes:

7. The method of claim 1, wherein the obtaining of the communication topology between task execution units corresponding to the application program comprises:

obtaining the communication topology between the task execution units corresponding to the application program from pre-stored communication topologies, wherein a pre-stored communication topology is a communication topology between task execution units that is generated in advance according to pre-stored communication data between the task execution units corresponding to the application program.

8. The method according to any one of claims 1-7, wherein the task execution unit corresponding to the application program comprises: a process and/or a thread created in a process.

9. A task execution apparatus, comprising:

10. The apparatus according to claim 9, wherein the first physical topology obtaining module is specifically configured to obtain a first physical topology of each device in the distributed computing system;

the first physical topology obtaining module includes:

the first link detection submodule is used for detecting the computing units existing in the device and the communication links among the computing units;

the first topology construction submodule is used for constructing a first topology by taking each computing unit as a node and taking each communication link between the computing units as an edge;

the computing capability characterization value obtaining submodule is used for obtaining the computing capability characterization value of each computing unit;

the communication capability characterization value obtaining submodule is used for obtaining the communication capability characterization value of each communication link;

and the first attribute value setting submodule is used for setting the attribute of each node in the first topology based on the obtained computing capability characterization value and setting the attribute of each edge in the first topology based on the obtained communication capability characterization value, to obtain the first physical topology.

11. The apparatus of claim 10, wherein,

the computing capability characterization value obtaining submodule is specifically used for obtaining computing power information of each computing unit, and taking the computing power information of each computing unit as the computing capability characterization value of that computing unit;

and/or

the communication capability characterization value obtaining submodule is specifically configured to obtain at least one of the following items of information for each communication link, and determine the obtained information as the communication capability characterization value of that communication link:

a link bandwidth of the communication link;

a communication delay of the communication link;

12. The apparatus of claim 9, wherein the second physical topology obtaining module comprises:

the second link detection submodule is used for detecting the devices existing in the distributed computing system and the communication links among the devices;

the second topology construction submodule is used for constructing a second topology by taking each device as a node and taking a communication link between each device as an edge;

an IP address obtaining submodule, configured to obtain an IP address of a device in the distributed computing system;

a hop count obtaining submodule for obtaining a hop count between devices in the distributed computing system determined by a route tracing manner based on the obtained IP address;

and the second attribute value setting submodule is used for setting the attribute of the edge in the second topology based on the obtained hop count to obtain a second physical topology among the devices in the distributed computing system.

13. The apparatus of claim 9, wherein the communication topology obtaining module comprises:

the communication data acquisition submodule is used for acquiring communication data between the task execution units corresponding to the application program;

the attribute value determining submodule is used for determining a communication link among the task execution units, the computing resource demand of the task execution units and the communication data quantity among the task execution units according to the obtained communication data;

the third topology construction submodule is used for constructing a third topology by taking each task execution unit as a node and taking a communication link between each task execution unit as an edge;

and the third attribute value setting submodule is used for setting the attribute of the node in the third topology based on the acquired computing resource demand, and setting the attribute of the edge in the third topology based on the acquired communication data volume to obtain the communication topology between the task execution units corresponding to the application program.

14. The apparatus according to claim 13, wherein the communication data obtaining sub-module is specifically configured to obtain communication data between task execution units corresponding to the application program according to at least one of the following manners:

15. The apparatus of claim 9, wherein,

the communication topology obtaining module is specifically configured to obtain the communication topology between the task execution units corresponding to the application program from pre-stored communication topologies, wherein a pre-stored communication topology is a communication topology between task execution units that is generated in advance according to pre-stored communication data between the task execution units corresponding to the application program.

16. The apparatus according to any one of claims 9-15, wherein the task execution unit corresponding to the application program comprises: a process and/or a thread created in a process.

17. An electronic device, comprising:

at least one processor; and

a memory, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.

18. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-8.

19. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-8.