WO2020238719A1 - Method and apparatus for establishing a communication link, and method and apparatus for determining node identifiers - Google Patents

Method and apparatus for establishing a communication link, and method and apparatus for determining node identifiers

Info

Publication number
WO2020238719A1
WO2020238719A1 PCT/CN2020/091221 CN2020091221W WO2020238719A1 WO 2020238719 A1 WO2020238719 A1 WO 2020238719A1 CN 2020091221 W CN2020091221 W CN 2020091221W WO 2020238719 A1 WO2020238719 A1 WO 2020238719A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
nodes
group
logical group
logical
Prior art date
Application number
PCT/CN2020/091221
Other languages
English (en)
French (fr)
Inventor
董建波
曹政
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2020238719A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1044 Group management mechanisms
    • H04L67/1046 Joining mechanisms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/141 Setup of application sessions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/146 Markers for unambiguous identification of a particular session, e.g. session cookie or URL-encoding

Definitions

  • This application relates to the field of computer technology, and in particular to a method and device for establishing a data communication link, a method and device for determining node identifiers in a cluster architecture, a method and device for establishing a data communication link in an artificial intelligence cluster architecture, a computer storage medium, and an electronic device.
  • the existing cluster architecture adds a large number of computing devices to improve the data processing capability.
  • It is therefore necessary to consider the network interconnection structure between switching devices in order to improve the data communication capability between nodes.
  • Fat-tree is one of the most commonly used topologies for high-performance computer clusters.
  • The Fat-tree topology is divided into three levels; from bottom to top these are edge, aggregation, and core. It can realize data communication between a source node and a target node through a reduce-like algorithm. Since there are multiple parallel paths between the source node and the target node, network versatility and scalability are good. However, under the demands of large-scale data processing, this brings a considerable performance loss.
  • Figure 1 is a network topology diagram of the cluster architecture, in which a large number of computing devices on one side perform data communication with a large number of computing devices on the opposite switching devices.
  • Any switching device on the lower side of the bipartite graph can reach the switching devices on the upper side that are connected to computing devices, realizing data communication between the upper computing devices.
  • Likewise, any switching device on the upper side of the bipartite graph can reach the switching devices on the lower side, realizing data communication between the lower computing devices. It can be seen that there are a large number of reachable paths, whether for data communication between the upper computing nodes or between the lower computing nodes.
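The multiplicity of reachable paths described above can be illustrated with a small sketch. Figure 1 is not reproduced here, so the example assumes a fully interconnected bipartite switch graph with 4 switches per side; the switch names and the helper `two_hop_paths` are hypothetical and chosen only for illustration.

```python
from itertools import product


def two_hop_paths(lower_switches, links, src, dst):
    """List every path src -> lower switch -> dst between two upper-side switches."""
    return [
        (src, mid, dst)
        for mid in lower_switches
        if (src, mid) in links and (dst, mid) in links
    ]


# Assumption: 4 upper and 4 lower switches, every upper switch linked to every lower switch.
upper = [f"U{i}" for i in range(4)]
lower = [f"L{i}" for i in range(4)]
links = set(product(upper, lower))

paths = two_hop_paths(lower, links, "U0", "U1")
for path in paths:
    print(" -> ".join(path))
```

Under this assumption, two upper-side switches can reach each other through any of the lower-side switches, so the number of candidate paths grows with the number of switches on the opposite side.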
  • the present application provides a method for establishing a data communication link to solve the problem of data congestion caused by non-unique communication paths between node devices in the prior art.
  • This application provides a method for establishing a data communication link, including:
  • a unique data communication link between the nodes in the logical group is established.
  • the grouping of nodes used for data communication in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes includes:
  • the nodes distributed on both sides of the bipartite graph in the cluster architecture are grouped to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes.
  • grouping the nodes distributed on both sides of the bipartite graph in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes includes:
  • the set of nodes connected to the same switching device in the bipartite graph is determined as the physical group, wherein the physical group includes nodes from different logical groups.
  • it further includes:
  • the node identifiers of the two nodes are determined as the set node identifiers of the designated node.
  • the determining the node identifier of the node in the logical group according to the identifier of the specified node set in the logical group includes:
  • the node identifiers of the nodes in the first node group in the second logical group are generated according to the node identifiers of the nodes in the first node group in the first logical group;
  • the node identifiers of the nodes in the second node group in the first logical group are generated, and so on, until the node identifiers of all the nodes in the logical group are generated.
  • the determining the node identifier of the node in the logical group according to the identifier of the specified node set in the logical group includes:
  • the establishing a unique data communication link between nodes in the logical group according to the node identifier includes:
  • according to the node identifier, a protocol operation is performed on the nodes in the logical group to obtain protocol information of the nodes;
  • a unique data communication link between the nodes in the logical group is established.
  • the performing a protocol operation on the nodes in the logical group according to the node identifier to obtain protocol information of the node includes:
  • the protocol information of the node paired with the first pairing distance is obtained.
  • it further includes:
  • the protocol information of the node paired at the second pairing distance is obtained.
  • it further includes:
  • the protocol information of the node paired with the second pairing distance obtained and the protocol information of the node paired with the third pairing distance received are sent to the node paired with the first pairing distance.
  • it further includes:
  • the node identifier of the node in the determined logical group is shifted to obtain the node identifier of the node in the shifted logical group.
  • the shifting the node identifiers of the nodes in the determined logical group to obtain the node identifiers of the nodes in the shifted logical group includes:
  • the node identifiers of the nodes located on the upper side or the lower side of the bipartite graph of the cluster architecture in the logical group are shifted to obtain the node identifiers of the nodes in the shifted logical group.
  • it further includes:
  • the node identification information of the nodes in the logical group is recorded on the nodes in the physical group connected to the switching device.
  • it further includes:
  • the information of the node identification is recorded in the node query table.
  • the establishing a node query table according to the node identifiers of the nodes in the logical group includes:
  • the node identification information with the same index value is recorded in the same node query list.
  • This application also provides a device for establishing a data communication link, including:
  • the grouping unit is used to group the nodes in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes;
  • the determining unit is configured to determine the node identifier of the node in the logical group according to the identifier of the designated node set in the logical group;
  • the establishment unit is configured to establish a unique data communication link between nodes in the logical group according to the node identifier.
  • This application also provides a method for determining node identification in a cluster architecture, including:
  • the grouping the nodes in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes includes:
  • the nodes distributed on both sides of the bipartite graph in the cluster architecture are grouped to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes.
  • the grouping of the nodes distributed on both sides of the bipartite graph in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes includes:
  • the set of nodes connected to the same switching device in the bipartite graph is determined as the physical group, wherein the physical group includes nodes from different logical groups.
  • it further includes:
  • the identifiers of the two nodes are determined as the set designated node identifiers.
  • the determining the node identifier of the node in the logical group according to the identifier of the specified node set in the logical group includes:
  • the node identifiers of the nodes in the first node group in the second logical group are generated according to the node identifiers of the nodes in the first node group in the first logical group;
  • the node identifiers of the nodes in the second node group in the first logical group are generated, and so on, until the node identifiers of all the nodes in the logical group are generated.
  • the determining the node identifier of the node in the logical group according to the identifier of the specified node set in the logical group includes:
  • This application also provides a device for determining node identifiers in a cluster architecture, including:
  • the grouping unit is used to group the nodes in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes;
  • the determining unit is configured to determine the node identifier of the node in the logical group according to the identifier of the designated node set in the logical group.
  • This application also provides a method for establishing a data communication link in an artificial intelligence cluster architecture, including:
  • a unique data communication link between the nodes in the logical group is established.
  • This application also provides a device for establishing a data communication link in an artificial intelligence cluster architecture, including:
  • the grouping unit is used to group the nodes used for data communication in the artificial intelligence cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes in the artificial intelligence cluster architecture;
  • the determining unit is configured to determine the node identifiers of other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group;
  • the establishment unit is configured to establish a unique data communication link between nodes in the logical group according to the node identifier.
  • This application also provides a computer storage medium for storing data generated by a network platform, and a program corresponding to the data generated by the network platform for processing;
  • When the program is read and executed by the processor, it executes the steps of the method for establishing a data communication link as described above, or the steps of the method for determining node identifiers in the cluster architecture as described above, or the steps of the method for establishing a data communication link in the artificial intelligence cluster architecture as described above.
  • This application also provides an electronic device, including:
  • the memory is used to store a program for processing data generated by the network platform.
  • When the program is read and executed by the processor, it executes the steps of the method for establishing a data communication link as described above, or the steps of the method for determining node identifiers in the cluster architecture as described above, or the steps of the method for establishing a data communication link in the artificial intelligence cluster architecture as described above.
  • The present application provides a method for establishing a data communication link, including: grouping nodes in a cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes; determining the node identifiers of the nodes in the logical group according to the identifier of the designated node set in the logical group; and establishing, according to the node identifiers, a unique data communication link between the nodes in the logical group. As a result, there are no duplicate data communication paths between nodes in the cluster architecture, which avoids performance degradation and network congestion under large-scale data processing.
  • The connection relationship between nodes can be learned by recording the node identification information on the nodes in the physical groups of the cluster architecture. Alternatively, a node query table can be established and the node identification information recorded in it, so that the connection relationship between nodes can be learned from the node query table.
  • This application also provides a method for determining node identifiers in a cluster architecture, including: grouping nodes in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes; and determining the node identifiers of the nodes in the logical group according to the identifier of the designated node set in the logical group. Each node in the cluster architecture thus has its own identification information, from which the data communication links between nodes in the cluster architecture can be determined, avoiding duplicate data communication links.
  • This application provides a method for establishing a data communication link based on an artificial intelligence cluster architecture, including: grouping nodes used for data communication in the artificial intelligence cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes in the artificial intelligence cluster architecture; determining the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group; and establishing, according to the node identifiers, a unique data communication link between the nodes in the logical group. This ensures that nodes in the artificial intelligence cluster architecture communicate through a unique data communication link, and avoids the network congestion, and the resulting performance decline of the artificial intelligence cluster architecture, caused by duplicate data communication links between nodes.
  • Figure 1 is a network topology diagram of the cluster architecture
  • FIG. 2 is a flowchart of an embodiment of a method for establishing a data communication link provided by the present application
  • FIG. 3 is a schematic diagram of the structure of node grouping in an embodiment of a method for establishing a data communication link provided by the present application;
  • FIG. 4 is a schematic diagram of method 1 for determining node identification in an embodiment of a method for establishing a data communication link provided by the present application;
  • FIG. 5 is a schematic diagram of the second method of determining node identification in an embodiment of a method for establishing a data communication link provided by the present application
  • FIG. 6 is a schematic diagram of protocol operations in an embodiment of a method for establishing a data communication link provided by the present application
  • FIG. 7 is a schematic diagram of a data communication link between paired nodes corresponding to a first matching distance in an embodiment of a method for establishing a data communication link provided by the present application;
  • FIG. 8 is a schematic diagram of a data communication link between paired nodes corresponding to a second matching distance in an embodiment of a method for establishing a data communication link provided by the present application;
  • FIG. 9 is a schematic diagram of a data communication link between paired nodes corresponding to a third matching distance in an embodiment of a method for establishing a data communication link provided by the present application;
  • FIG. 10 is a schematic diagram of a first connection relationship between nodes in different logical groups or the same logical group by color in an embodiment of a method for establishing a data communication link provided by the present application;
  • FIG. 11 is a schematic diagram of a second connection relationship between nodes in different logical groups or the same logical group by color in an embodiment of a method for establishing a data communication link provided by the present application;
  • FIG. 12 is a schematic structural diagram of an embodiment of an apparatus for establishing a data communication link provided by the present application.
  • FIG. 13 is a flowchart of an embodiment of a method for determining node identifiers in a cluster architecture provided by this application;
  • FIG. 14 is a schematic structural diagram of an embodiment of an apparatus for determining node identifiers in a cluster architecture provided by the present application.
  • FIG. 15 is a flowchart of an embodiment of a method for establishing a data communication link based on an artificial intelligence cluster architecture provided by the present application;
  • FIG. 16 is a flowchart of an embodiment of an apparatus for establishing a data communication link based on an artificial intelligence cluster architecture provided by the present application.
  • The method for establishing a data communication link provided by the present application may establish data communication links for data communication between different node devices in a cluster architecture, so as to avoid the network congestion caused by multiple reachable paths between node devices.
  • FIG. 2 is a flowchart of an embodiment of a method for establishing a data communication link provided by the present application.
  • the establishment method includes:
  • Step S201 Group the nodes in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes.
  • Before describing the specific implementation process of step S201, some technical terms involved in this step are explained first, as follows:
  • the cluster architecture structure in this embodiment adopts the topological structure of a bipartite graph.
  • The switching devices in the cluster architecture are distributed on the two sides of the bipartite graph, with nodes connected to the switching devices on the upper side and nodes connected to the switching devices on the opposite lower side. It is a network cluster architecture in which data can be communicated through the switching devices between upper nodes and lower nodes, between upper nodes, and between lower nodes.
  • the nodes respectively connected to the switching devices on the upper and lower sides of the bipartite graph may include: computing node devices, storage node devices, network node devices (network cards), co-processing node devices, and system node devices.
  • the co-processing node device may include: GPU, ASIC, FPGA, DSP, etc.
  • step S201 is:
  • Step S201-1 Group the nodes distributed on both sides of the bipartite graph in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes.
  • the logical group may represent a logical connection relationship between nodes
  • the physical group may represent a physical location relationship of nodes.
  • step S201-1 may include:
  • Step S201-1a: Determine the set of nodes respectively connected to different switching devices in the bipartite graph as a logical group;
  • Step S201-1b: Determine the set of nodes connected to the same switching device in the bipartite graph as the physical group, wherein the physical group includes nodes from different logical groups.
  • FIG. 3 is a schematic diagram of the structure of node grouping in an embodiment of a method for establishing a data communication link provided by the present application.
  • The so-called 4 × 8 structure may refer to a structure including 8 physical groups and 4 logical groups, with a total of 32 nodes.
  • Each switching device is connected to members of different logical groups to form a physical group; that is, each switch is connected to 4 nodes.
  • These 4 nodes form a physical group, and there are 8 physical groups in total; the 4 nodes come from different logical groups.
  • Nodes on different switching devices are selected to form a logical group; each logical group has 8 nodes, and there are 4 logical groups in total.
  • The logical groups can be divided according to the order in which nodes are connected within the physical groups. For example, as shown in FIG. 3, the first node in each physical group on the upper side and the first node in each physical group on the lower side are assigned to the first logical group; the second node in each physical group on the upper side and the second node in each physical group on the lower side are assigned to the second logical group; the third nodes are likewise assigned to the third logical group; and the fourth nodes are assigned to the fourth logical group.
  • the composition of the physical group can be divided arbitrarily.
  • The logical group can also be divided according to actual needs, provided that each physical group has 4 nodes and each logical group has 8 nodes. Normally, no node is repeated.
  • The above describes the logical group and the physical group only for a 4 × 8 network cluster architecture, and is not intended as a limitation.
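The 4 × 8 grouping described above can be sketched as follows. This is a minimal illustration, assuming that each node is labelled by a hypothetical pair `(switch_index, position)` and that logical groups are formed by position within each physical group, as in the FIG. 3 example; the function name `group_nodes` is not from the source.

```python
def group_nodes(num_switches=8, nodes_per_switch=4):
    """Build physical groups (one per switch) and logical groups (one per position)."""
    # A physical group is the set of nodes attached to the same switching device.
    physical_groups = [
        [(switch, pos) for pos in range(nodes_per_switch)]
        for switch in range(num_switches)
    ]
    # A logical group takes the node at the same position from every switching device.
    logical_groups = [
        [(switch, pos) for switch in range(num_switches)]
        for pos in range(nodes_per_switch)
    ]
    return physical_groups, logical_groups


physical_groups, logical_groups = group_nodes()
print(len(physical_groups), len(physical_groups[0]))  # 8 physical groups of 4 nodes
print(len(logical_groups), len(logical_groups[0]))    # 4 logical groups of 8 nodes
```

With this labelling, every physical group contains exactly one node from each logical group, and the 32 nodes are covered without repetition.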
  • each node in the logical group needs to be identified, so as to facilitate the determination of the communication connection relationship between the nodes between different physical groups.
  • the determination of the node identifier of this application will be described below.
  • Step S202 Determine the node identifiers of other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group.
  • the other nodes may be nodes other than designated nodes.
  • the purpose of the step S202 is to determine an identification reference in advance, so as to facilitate obtaining the identification of other nodes according to the identification reference.
  • By setting the identifier of the designated node in the logical group, the node identifiers of the nodes other than the designated node in the logical group are obtained. Therefore, a node identification reference needs to be set, which can include:
  • Step S20a Select node information in the first logical group.
  • The first logical group can be any logical group in the above 4 × 8 cluster architecture. Following the above example, please refer to FIG. 3.
  • the nodes are arranged from left to right.
  • a logical group may include: the first node in each physical group on the upper side and the first node in each physical group on the lower side.
  • Step S20b Divide the nodes in the first logical group into a node group including at least two nodes according to the bipartite graph.
  • A logical group may include 4 node groups. That is, the logical group is divided into two vectors according to the bipartite graph, x and y, where x represents the nodes on the upper side of the bipartite graph and y represents the nodes on the lower side of the bipartite graph.
  • the nodes in the upper physical group and the nodes in the lower physical group are taken out as a node group.
  • Step S20c Assign identifiers to two nodes of the first node group in the first logical group.
  • Step S20d Determine the node identifiers of the two nodes as the set node identifiers of the designated node.
  • step S20d determines the node identifiers of the two nodes assigned in the step S20c as the node identifiers of the designated node.
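Steps S20a to S20d can be sketched as below. The sketch assumes that each node group pairs the k-th upper-side node of a logical group with the k-th lower-side node, and that the two designated nodes receive the numbers 1 and 2 (the values used in the worked numbering example of this document). The node labels and the helper `split_into_node_groups` are hypothetical.

```python
def split_into_node_groups(upper_nodes, lower_nodes):
    """Pair each upper-side node (x) with the opposite lower-side node (y)."""
    if len(upper_nodes) != len(lower_nodes):
        raise ValueError("both sides of the bipartite graph must contribute equally")
    return list(zip(upper_nodes, lower_nodes))


# Hypothetical labels for the 8 nodes of one logical group: 4 upper, 4 lower.
upper_nodes = ["u0", "u1", "u2", "u3"]
lower_nodes = ["l0", "l1", "l2", "l3"]

node_groups = split_into_node_groups(upper_nodes, lower_nodes)

# Assign identifiers to the two nodes of the first node group (the designated nodes).
x_node, y_node = node_groups[0]
designated_ids = {x_node: 1, y_node: 2}
print(node_groups)
print(designated_ids)
```

The designated identifiers then serve as the reference from which the identifiers of the remaining nodes are derived.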
  • the node identification can be in a coding manner.
  • other identification methods that can distinguish nodes can also be used, such as shape identification, color identification, etc.
  • The above explains how to set the node identifier of the designated node. After the node identifier of the designated node is obtained, the node identifiers of the node members in the other logical groups are determined according to it, so that the node identifiers of the node members of each physical group are also known.
  • In step S202, there are two ways to determine the node identifiers of the other nodes according to the node identifier of the designated node, although the step is not limited to these two ways. A general description of the two methods is given first, followed by a detailed explanation. Method one includes:
  • Step S202-1a According to the node identifiers of the nodes in the first node group in the first logical group, generate the node identifiers of the nodes in the second node group adjacent to the first node group in the first logical group ;
  • Step S202-1b According to the node identifiers of the nodes in the first node group in the first logical group, generate the node identifiers of the nodes in the next node group adjacent to the second node group;
  • Step S202-1c When the node identification in the first logical group is generated, the node identification of the node in the first node group in the second logical group is generated according to the node identification of the node in the first node group in the first logical group ;
  • Step S202-1d According to the node identifiers of the nodes in the second node group in the first logical group, generate the node identifiers of the nodes in the second node group in the second logical group, and so on until all nodes in the logical group are generated The node ID.
  • Method two includes:
  • Step S202-2a Sort the node groups in the logical group in order to obtain a sorted set of node groups
  • Step S202-2b Determine the node identifier of the node in the first node group according to the node identifier of the node in the first node group preset in the node group set;
  • Step S202-2c Generate the node identifiers of the nodes in the second node group in the node group set according to the node identifiers of the nodes in the first node group, and obtain the node identifiers of the nodes in the second node group;
  • Step S202-2d Generate the node identifiers of the nodes in the third node group in the node group set according to the node identifiers of the nodes in the second node group, and obtain the node identifiers of the nodes in the third node group; and so on until Generate the node IDs of all nodes in the node group set.
  • The first method and the second method are the same in that both need to determine the node identifiers of all node members in the first logical group according to the node identifier of the designated node. The designated nodes may be the node members of the first node group in the first logical group.
  • After the node identifiers of the nodes in the first node group in the first logical group are determined, in method one the node identifiers in the second logical group can be determined according to the node identifiers of the nodes in the node groups in the first logical group.
  • The node identifier of the second node group in the first logical group is determined according to the designated node (the first node group), the node identifier of the third node group is determined according to the node identifier of the second node group, and the node identifier of the fourth node group is determined according to the node identifier of the third node group. The node identifier of the first node group in the second logical group is determined according to the node identifier of the fourth node group in the first logical group, the node identifier of the second node group in the second logical group is determined according to the node identifier of the first node group in the second logical group, and so on. The details are explained below.
  • FIG. 4 is a schematic diagram of Method 1 for determining a node identifier in an embodiment of a method for establishing a data communication link provided by the present application.
  • the logical groups are numbered with 0-N.
  • the numbers of the logical groups may include: the first logical group, that is, 0, expressed in binary as 0000; the second logical group, that is 1, and expressed in binary as 0001; The third logical group, namely 2, is represented as 0010 in binary; the fourth logical group, namely 3, is represented as 0011 in binary.
  • the first logical group includes 8 nodes, and the two opposite nodes are divided into a node group.
  • the first logical group includes from left to right: the first node group 0, which is expressed as 0000 in binary;
  • the combination in the first logical group is also applicable to the structure of other logical groups.
  • The nodes of the first node group in the first logical group are the designated nodes. The second, third, and fourth node groups each generate their node numbers according to the first node group, as follows:
  • calculate the node numbers of other node groups in the first logical group by the formula [x+i ⁇ B, y+i ⁇ B], where i is the current node group The number value or the number value of the current logical group.
  • i is the number value of the current node group.
  • the calculation formula for the node number is: [y+i ⁇ B, x+i ⁇ B], when the number 1 in the binary representation is an even number, the node The number calculation formula is: [x+i ⁇ B, y+i ⁇ B]. Among them, i is the number of the current node group, and B is the number of designated nodes 2.
  • the node numbers of the second node group (binary representation 0001, that is, node group number 1) are generated from the node numbers [1, 2] of the first node group using the formula [y+i×B, x+i×B], namely [2+1×2, 1+1×2], which gives [4, 3];
  • the node numbers of the third node group (binary representation 0010, that is, node group number 2) are based on the node numbers [4, 3] of the second node group and are generated using the formula [y+i×B, x+i×B], namely [2+2×2, 1+2×2], which gives [6, 5];
  • the node numbers of the fourth node group (binary representation 0011, that is, node group number 3) are based on the node numbers [6, 5] of the third node group and are generated using the formula [x+i×B, y+i×B], namely [1+3×2, 2+3×2], which gives [7, 8] for the fourth node group.
  • B in the above formula is the calculation base, which is set according to the logical group.
  • within the first logical group, the subsequent node numbers are generated from the two designated nodes, so the calculation base is 2; after the node numbers in the first logical group have been determined, the calculation base is set to 8.
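  • The numbering of the first logical group can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the rule that an odd count of 1 bits selects the swapped formula is read off the worked values above.

```python
def ones_are_odd(i: int) -> bool:
    """True when the binary representation of i has an odd number of 1 bits."""
    return bin(i).count("1") % 2 == 1


def number_first_logical_group(designated=(1, 2), group_count=4):
    """Node numbers for node groups 0..group_count-1 of the first logical group."""
    x, y = designated
    base = len(designated)  # B: the number of designated nodes, 2 here
    numbers = []
    for i in range(group_count):
        if ones_are_odd(i):
            numbers.append([y + i * base, x + i * base])  # swapped order
        else:
            numbers.append([x + i * base, y + i * base])  # original order
    return numbers


first_group = number_first_logical_group()
print(first_group)  # [[1, 2], [4, 3], [6, 5], [7, 8]]
```

  • The printed pairs match the node numbers [1, 2], [4, 3], [6, 5] and [7, 8] derived in the text.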
  • the node numbers in the second logical group can be determined, and the specific process can be:
  • the second logical group has the binary representation 0001; therefore, the node numbers in the second logical group are generated with [y+i×B, x+i×B], where i is the number of the current logical group, namely 1, and B is 8. Since the node numbers in the first logical group are already known, no binary judgment needs to be made for the node groups in the second logical group when determining its node numbers; each number is simply generated from the corresponding node in the first logical group.
  • the node numbers of the first node group are generated from the node numbers [1, 2] of the first node group in the first logical group, namely [2+1×8, 1+1×8], which gives [10, 9];
  • the node numbers of the second node group are generated from the node numbers [4, 3] of the second node group in the first logical group, namely [3+1×8, 4+1×8], which gives [11, 12];
  • the node numbers of the third node group are generated from the node numbers [6, 5] of the third node group in the first logical group, namely [5+1×8, 6+1×8], which gives [13, 14];
  • the node numbers of the fourth node group are generated from the node numbers [7, 8] of the fourth node group in the first logical group, namely [8+1×8, 7+1×8], which gives [16, 15].
  • the node numbers in the third logical group can be determined, and the specific process may be:
  • the third logical group has the binary representation 0010; therefore, the node numbers in the third logical group are generated with [y+i×B, x+i×B], where i is the number of the current logical group, namely 2, and B is 8. Since the node numbers in the first logical group are already known, no binary judgment needs to be made for the node groups in the third logical group; each number is simply generated from the corresponding node in the first logical group. For example:
  • the node numbers of the first node group are generated from the node numbers [1, 2] of the first node group in the first logical group, namely [2+2×8, 1+2×8], which gives [18, 17];
  • the node numbers of the second node group are generated from the node numbers [4, 3] of the second node group in the first logical group, namely [3+2×8, 4+2×8], which gives [19, 20];
  • the node numbers of the third node group are generated from the node numbers [6, 5] of the third node group in the first logical group, namely [5+2×8, 6+2×8], which gives [21, 22];
  • the node numbers of the fourth node group are generated from the node numbers [7, 8] of the fourth node group in the first logical group, namely [8+2×8, 7+2×8], which gives [24, 23].
  • the node numbers in the fourth logical group can be determined, and the specific process may be:
  • the fourth logical group has the binary representation 0011; therefore, the node numbers in the fourth logical group are generated with [x+i×B, y+i×B], where i is the number of the current logical group, namely 3, and B is 8. Since the node numbers in the first logical group are already known, no binary judgment needs to be made for the node groups in the fourth logical group; each number is simply generated from the corresponding node in the first logical group. For example:
  • the node numbers of the first node group are generated from the node numbers [1, 2] of the first node group in the first logical group, namely [1+3×8, 2+3×8], which gives [25, 26];
  • the node numbers of the second node group are generated from the node numbers [4, 3] of the second node group in the first logical group, namely [4+3×8, 3+3×8], which gives [28, 27];
  • the node numbers of the third node group are generated from the node numbers [6, 5] of the third node group in the first logical group, namely [6+3×8, 5+3×8], which gives [30, 29];
  • the node numbers of the fourth node group are generated from the node numbers [7, 8] of the fourth node group in the first logical group, namely [7+3×8, 8+3×8], which gives [31, 32].
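  • As a rough illustration of how the later logical groups follow from the first one, the sketch below applies the two formulas with B = 8 to every pair of the first logical group. The names FIRST_LOGICAL_GROUP and number_logical_group are hypothetical, and the parity rule (an odd count of 1 bits in the logical group number selects the swapped formula) is inferred from the worked values.

```python
# Node numbers of the first logical group, as derived in the text.
FIRST_LOGICAL_GROUP = [[1, 2], [4, 3], [6, 5], [7, 8]]


def number_logical_group(i, first_group=FIRST_LOGICAL_GROUP, base=8):
    """Node numbers of logical group i (0-based), derived pair by pair."""
    swap = bin(i).count("1") % 2 == 1  # odd number of 1 bits in the group number
    numbers = []
    for x, y in first_group:
        if swap:
            numbers.append([y + i * base, x + i * base])
        else:
            numbers.append([x + i * base, y + i * base])
    return numbers


for i in range(4):
    print(i, number_logical_group(i))
# 0 [[1, 2], [4, 3], [6, 5], [7, 8]]
# 1 [[10, 9], [11, 12], [13, 14], [16, 15]]
# 2 [[18, 17], [19, 20], [21, 22], [24, 23]]
# 3 [[25, 26], [28, 27], [30, 29], [31, 32]]
```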
  • FIG. 5 is a schematic diagram of a second method of determining a node identifier in an embodiment of a method for establishing a data communication link provided by the present application.
  • each node group is numbered 0 to N.
  • there are 4 logical groups, each with 4 node groups, for a total of 16 node groups. Following the arrangement order of the logical groups, the binary representation of each node group is: the first node group 0000; the second node group 0001; the third node group 0010; the fourth node group 0011; the fifth node group 0100; the sixth node group 0101; the seventh node group 0110; the eighth node group 0111; the ninth node group 1000; the tenth node group 1001; the eleventh node group 1010; the twelfth node group 1011; the thirteenth node group 1100; the fourteenth node group 1101; the fifteenth node group 1110; the sixteenth node group 1111.
  • the nodes in the first node group are the designated nodes, and their node numbers are [1, 2].
  • using the node numbers of the designated nodes as a reference, the node numbers of the other nodes are generated: the node numbers of the second node group are generated according to the node numbers of the first node group, the node numbers of the third node group are generated according to the node numbers of the first node group, the node numbers of the fourth node group are generated according to the node numbers of the first node group, and so on.
  • when the number of 1s in the binary representation of the node group number is odd, the calculation formula of the node number is: [y+B, x+B];
  • when the number of 1s in the binary representation is even, the calculation formula of the node number is: [x+B, y+B].
  • B is the largest node number among the node numbers already assigned.
  • the first node group 0000: the node number is preset to [1, 2];
  • the second node group 0001: the node number is generated according to [y+B, x+B], which is [4, 3];
  • the third node group 0010: the node number is generated according to [y+B, x+B], which is [6, 5];
  • the fourth node group 0011: the node number is generated according to [x+B, y+B], which is [7, 8];
  • the fifth node group 0100: the node number is generated according to [y+B, x+B], which is [10, 9];
  • the sixth node group 0101: the node number is generated according to [x+B, y+B], which is [11, 12];
  • the seventh node group 0110: the node number is generated according to [x+B, y+B], which is [13, 14];
  • the eighth node group 0111: the node number is generated according to [y+B, x+B], which is [16, 15];
  • the ninth node group 1000: the node number is generated according to [y+B, x+B], which is [18, 17];
  • the tenth node group 1001: the node number is generated according to [x+B, y+B], which is [19, 20];
  • the eleventh node group 1010: the node number is generated according to [x+B, y+B], which is [21, 22];
  • the twelfth node group 1011: the node number is generated according to [y+B, x+B], which is [24, 23];
  • the thirteenth node group 1100: the node number is generated according to [x+B, y+B], which is [25, 26];
  • the fourteenth node group 1101: the node number is generated according to [y+B, x+B], which is [28, 27];
  • the fifteenth node group 1110: the node number is generated according to [y+B, x+B], which is [30, 29];
  • the sixteenth node group 1111: the node number is generated according to [x+B, y+B], which is [31, 32].
  • the node number of the corresponding node group is recorded in the corresponding logical group according to the composition structure of the logical group, and the node number of the node in each logical group can be obtained.
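  • The second method can be sketched as a single pass over the sorted node groups, with B taken as the largest node number assigned so far. The function names below are hypothetical, and the odd/even rule on the count of 1 bits is an assumption inferred from the sixteen listed values.

```python
def number_node_groups(designated=(1, 2), total_groups=16):
    """Assign node numbers to node groups 0..total_groups-1 in sorted order."""
    x, y = designated
    numbers = [[x, y]]  # the first node group holds the designated nodes
    for k in range(1, total_groups):
        b = max(max(pair) for pair in numbers)  # B: largest number assigned so far
        if bin(k).count("1") % 2 == 1:          # odd number of 1 bits
            numbers.append([y + b, x + b])
        else:                                   # even number of 1 bits
            numbers.append([x + b, y + b])
    return numbers


def split_into_logical_groups(numbers, groups_per_logical_group=4):
    """Record consecutive node groups into their logical groups."""
    step = groups_per_logical_group
    return [numbers[s:s + step] for s in range(0, len(numbers), step)]


all_numbers = number_node_groups()
logical_groups = split_into_logical_groups(all_numbers)
print(logical_groups[1])  # [[10, 9], [11, 12], [13, 14], [16, 15]]
```

  • Under these assumptions the result coincides with the node numbers produced by the first method for all four logical groups.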
  • the node identifiers can be recorded in different presentation modes, for example by recording the node identification information of the nodes in the logical group onto the nodes in the physical group connected to the switching device. As shown in Figure 3, each physical group can present the node numbers of nodes in different logical groups.
  • the node identification information is recorded in the form of a lookup table. Specifically, a node lookup table is established according to the node identification; the node identification information is recorded in the node lookup table, and the node identification is determined through the lookup table.
  • the lookup table records the mapping relationship between a searched node identifier and its node (for example, the mapping between the node identifier and the node IP), so that the node corresponding to the searched node identifier is known; that is, the query table records the node identifier information, the information of the node, and the identification information of other nodes that have a connection relationship with it.
  • establishing the node query table according to the node identifiers of the nodes in the logical group includes:
  • the node identification information with the same index value is recorded in the same node query list.
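  • A node query table of this kind could look roughly like the sketch below. Everything in it is illustrative: the NodeRecord type, the build_query_table helper, and the 192.0.2.x addresses (a documentation address range) are assumptions, and only the idea of storing the identifier, the node information, and the connected identifiers comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class NodeRecord:
    node_id: int
    address: str                                     # node information, e.g. the node IP
    linked_ids: list = field(default_factory=list)   # identifiers of connected nodes


def build_query_table(addresses, links):
    """Build {node_id: NodeRecord} from an id-to-address map and id pairs."""
    table = {nid: NodeRecord(nid, addr) for nid, addr in addresses.items()}
    for a, b in links:
        table[a].linked_ids.append(b)  # record the connection in both directions
        table[b].linked_ids.append(a)
    return table


addresses = {1: "192.0.2.1", 9: "192.0.2.9"}  # placeholder addresses
table = build_query_table(addresses, links=[(1, 9)])
print(table[9].address, table[9].linked_ids)  # 192.0.2.9 [1]
```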
  • Step S203 Establish a unique data communication link between nodes in the logical group according to the node identifier.
  • FIG. 6 is a schematic diagram of protocol operations in an embodiment of a method for establishing a data communication link provided by the present application.
  • the specific implementation process may include:
  • Step S203-1 According to the node identifier, perform a protocol operation on the nodes in the logical group to obtain protocol information of the node.
  • the protocol operation (that is, the reduction operation) adopted in step S203-1 may be a multi-step reduction method, namely halving-doubling allreduce.
  • reduce is a classic concept from functional programming.
  • data reduction uses a function to combine a batch of data into a smaller batch. For example, the elements of an array are reduced to a single number by an addition function.
  • allreduce is a collective communication mode, which can be broadly understood as sending the data held by the current node to all other participating nodes and performing the corresponding reduction operation, so that all nodes obtain the reduced result.
  • the reduction operation may include averaging, addition, taking the maximum value, and so on. In this embodiment, reduction by addition is mainly used.
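  • To make the two concepts concrete, the sketch below first reduces an array to one number with an addition function, then shows the effect of an allreduce on a list of per-node values. The naive_allreduce helper is hypothetical and only models the outcome (every node holds the reduced result), not the communication pattern.

```python
from functools import reduce
import operator

# Reduce: the elements of an array are combined into a single number.
total = reduce(operator.add, [1, 2, 3, 4])
print(total)  # 10


def naive_allreduce(node_values, op=operator.add):
    """Every node ends up holding the reduction of all nodes' values."""
    result = reduce(op, node_values)
    return [result for _ in node_values]


print(naive_allreduce([1, 2, 3, 4]))      # [10, 10, 10, 10]
print(naive_allreduce([3, 9, 4], op=max)) # [9, 9, 9]
```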
  • step S203-1 may include:
  • Step S203-1-11 Pair the nodes in the logical group two by two according to the set first pairing distance.
  • Step S203-1-12 According to the data transmitted between the paired nodes, obtain the protocol information of the nodes paired at the first pairing distance, as shown in step 1 in Figure 6.
  • the purpose of step S203-1-12 is for the paired nodes to obtain part of each other's data and thereby obtain the protocol information.
  • Step S203-1-21 According to the set second pairing distance, pair the nodes in the logical group again, as shown in step 2 in FIG. 6.
  • Step S203-1-22 According to the data transmitted between the two paired nodes, and based on the already obtained protocol information of the nodes paired at the first pairing distance, obtain the protocol information of the nodes paired at the second pairing distance.
  • Step S203-1-31 According to the set third pairing distance, pair the nodes in the logical group again, as shown in step 3 in FIG. 6.
  • Step S203-1-32 According to the data transmitted between the two paired nodes, and based on the already obtained protocol information of the nodes paired at the second pairing distance, obtain the protocol information of the nodes paired at the third pairing distance;
  • Step S203-1-33 Send the protocol information of the nodes paired at the third pairing distance to the nodes paired at the second pairing distance; that is, reverse transmission starts from step 3.
  • Step S203-1-33 Send the obtained protocol information of the nodes paired at the second pairing distance, together with the received protocol information of the nodes paired at the third pairing distance, to the nodes paired at the first pairing distance.
  • Figure 6 only uses 8 nodes as an example.
  • each node obtains one-third of the protocol result; afterwards, the reverse protocol is performed, and the amount of data transmitted in the protocol result can be the same every time.
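  • The three forward pairing steps and the reverse transmission can be simulated in a single process as sketched below. This is an assumption-laden illustration, not the claimed procedure: the pairing distances are taken to be 1, 2 and 4 with the partner found by XOR, the reduction is addition, and the function name is hypothetical.

```python
def halving_doubling_allreduce(vectors):
    """Simulate a halving-doubling allreduce (sum) over len(vectors) nodes."""
    n = len(vectors)
    length = len(vectors[0])
    # This sketch needs a power-of-two node count and evenly divisible data.
    assert n > 0 and n & (n - 1) == 0 and length % n == 0

    data = [list(v) for v in vectors]
    owned = [(0, length)] * n          # slice each node is responsible for
    distances = []
    d = 1
    while d < n:                       # assumed pairing distances: 1, 2, 4, ...
        distances.append(d)
        d *= 2

    # Forward steps (halving): partners split their slice and reduce one half each.
    for d in distances:
        before = [list(v) for v in data]
        next_owned = list(owned)
        for rank in range(n):
            partner = rank ^ d
            lo, hi = owned[rank]
            mid = (lo + hi) // 2
            keep = (lo, mid) if rank & d == 0 else (mid, hi)
            for k in range(*keep):
                data[rank][k] = before[rank][k] + before[partner][k]
            next_owned[rank] = keep
        owned = next_owned

    # Reverse steps (doubling): partners swap their reduced slices.
    for d in reversed(distances):
        before = [list(v) for v in data]
        next_owned = list(owned)
        for rank in range(n):
            partner = rank ^ d
            p_lo, p_hi = owned[partner]
            data[rank][p_lo:p_hi] = before[partner][p_lo:p_hi]
            lo, hi = owned[rank]
            next_owned[rank] = (min(lo, p_lo), max(hi, p_hi))
        owned = next_owned
    return data


inputs = [[rank * 10 + k for k in range(8)] for rank in range(8)]
outputs = halving_doubling_allreduce(inputs)
expected = [sum(column) for column in zip(*inputs)]
print(all(row == expected for row in outputs))  # True
```

  • In this sketch each forward step halves the slice a node is responsible for, and each reverse step doubles it again, so every exchange within one step moves the same amount of data.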
  • Step S203-2 Establish a unique data communication link between the nodes in the logical group according to the protocol information of the nodes.
  • the purpose of step S203-2 is to establish a unique data communication link between nodes according to the protocol information of the nodes.
  • the specific implementation of step S203-2 may be: establishing the unique data communication links between nodes according to the protocol information of the nodes determined at the first pairing distance in steps S203-1-11 and S203-1-12, as shown in Figure 7; or establishing them according to the protocol information of the nodes determined at the second pairing distance in steps S203-1-21 and S203-1-22, as shown in Figure 8; or establishing them according to the protocol information of the nodes determined at the third pairing distance in steps S203-1-31 to S203-33, as shown in Figure 9; or, the protocol information of the nodes paired at the third pairing distance is sent to the nodes paired at the second pairing distance, and the obtained protocol information of the nodes paired at the second pairing distance and the received protocol information of
  • FIG. 10 is a schematic diagram of the first connection relationship between nodes in a logical group in an embodiment of a method for establishing a data communication link provided by this application; FIG. 11 is a schematic diagram of the second connection relationship between nodes in a logical group in that embodiment. Both figures describe the connection relationships between the nodes in the logical groups. Nodes of the same color are connected. Because a connection spans two logical groups, it is shown in a dashed box, so the 8 lines in a dashed box represent 8 connections joining 16 nodes.
  • node 1 and node 9 have a connection relationship.
  • FIG. 10 and FIG. 11 only take the node groups in some logical groups as examples for drawing.
  • the determined node identifiers of the nodes in the logical group are shifted to obtain the node identifiers of the nodes in the shifted logical group. Specifically, the node identifiers of the nodes located on the upper side or the lower side of the bipartite graph of the cluster architecture in the logical group are shifted to obtain the node identifiers of the nodes in the shifted logical group.
  • the node numbers of the upper-side (x) nodes in the second logical group are [10, 11, 13, 16] before the shift and [16, 10, 11, 13] after the shift;
  • the node numbers of the upper-side (x) nodes in the third logical group are [18, 19, 21, 24] before the shift and [21, 24, 18, 19] after the shift;
  • the node numbers of the upper-side (x) nodes in the fourth logical group are [25, 28, 30, 31] before the shift and [31, 25, 28, 30] after the shift.
  • the shift can be implemented in the process of determining the node identifier, for example: the formula of the node identifier of the first logical group node is not shifted [y+B ⁇ ⁇ 0, x+B]; the formula of the node identification of the second logical group node is added to the
  • the shift can be shifted for the upper side, or shifted for the lower side (y).
  • the formula for the node identifiers of the first logical group is not shifted (shift amount 0): [y+B, (x+B) shift 0];
  • the formula for the node identifiers of the second logical group adds a shift of 1: [y+i×B, (x+i×B) shift 1];
  • the formula for the node identifiers of the third logical group adds a shift of 2: [y+i×B, (x+i×B) shift 2];
  • the formula for the node identifiers of the fourth logical group adds a shift of 3: [x+i×B, (y+i×B) shift 3].
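  • One way to picture the shift is as a cyclic rotation of the upper-side identifiers within a logical group. The sketch below is an assumption: it rotates to the right, which reproduces the before/after lists given for the second and third logical groups with shift amounts 1 and 2; the helper names are hypothetical.

```python
def rotate_right(ids, shift):
    """Cyclically rotate a list of identifiers to the right by `shift` places."""
    if not ids:
        return []
    shift %= len(ids)
    return ids[-shift:] + ids[:-shift] if shift else list(ids)


def shift_upper_side(pairs, shift):
    """Rotate only the upper-side (x) identifiers of [x, y] pairs."""
    upper = rotate_right([x for x, _ in pairs], shift)
    return [[u, y] for u, (_, y) in zip(upper, pairs)]


print(rotate_right([10, 11, 13, 16], 1))  # [16, 10, 11, 13]
print(rotate_right([18, 19, 21, 24], 2))  # [21, 24, 18, 19]
print(shift_upper_side([[10, 9], [11, 12], [13, 14], [16, 15]], 1))
# [[16, 9], [10, 12], [11, 14], [13, 15]]
```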
  • FIG. 12 is a schematic structural diagram of an embodiment of an apparatus for establishing a data communication link provided by the present application, and the apparatus for establishing includes:
  • the grouping unit 1201 is used to group the nodes in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes.
  • the grouping unit 1201 is specifically configured to group the nodes distributed on both sides of the bipartite graph in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes.
  • the grouping unit 1201 includes:
  • a logical group determining subunit, configured to determine a set of nodes connected to different switching devices in the bipartite graph as the logical group;
  • a physical group determining subunit, configured to determine a set of nodes connected to the same switching device in the bipartite graph as the physical group, wherein the physical group includes nodes from different logical groups.
  • the selection unit is used to select node information in the first logical group
  • a node group dividing unit configured to divide the nodes in the first logical group into a node group including at least two nodes according to the bipartite graph;
  • the designated node identifier determining unit is configured to determine the node identifiers of the two nodes as the set node identifiers of the designated node.
  • the determining unit 1202 is configured to determine the node identifiers of other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group.
  • the determining unit 1202 includes two implementation manners.
  • the first manner includes:
  • the first generating subunit is configured to generate, according to the node identifiers of the nodes in the first node group in the first logical group, the node identifiers of the nodes in the second node group adjacent to the first node group in the first logical group;
  • a second generating subunit configured to generate, according to the node identifiers of the nodes in the second node group, the node identifiers of the nodes in the next node group adjacent to the second node group in the first logical group;
  • the third generating subunit is configured to generate, when generation of the node identifiers in the first logical group is complete, the node identifiers of the nodes in the first node group in the second logical group according to the node identifiers of the nodes in the first node group in the first logical group;
  • the fourth generating subunit is configured to generate the node identifiers of the nodes in the second node group in the second logical group according to the node identifiers of the nodes in the second node group in the first logical group, and so on until the node identifiers of the nodes in all logical groups are generated.
  • the second manner of the determining unit 1202 includes:
  • the first determining subunit is configured to determine the first node identification reference according to the node identification of the node in the first node group in the node group set;
  • the first generating subunit is configured to generate node identifiers of nodes in the second node group in the node group set according to the first node identifier reference;
  • a second determining subunit configured to determine a second node identification reference according to the node identification of the node in the second node group
  • the third generating subunit is configured to generate the node identifiers of the nodes in the third node group in the node group set according to the second node identifier reference; and so on until the node identifiers of the nodes in all the node groups in the node group set are generated.
  • the establishing unit 1203 is configured to establish a unique data communication link between nodes in the logical group according to the node identifier.
  • the establishment unit 1203 includes:
  • a protocol information obtaining subunit which is used to perform protocol operations on nodes in the logical group according to the node identifiers to obtain node protocol information
  • an establishing subunit, used to establish a unique data communication link between nodes belonging to the same logical group in different physical groups according to the protocol information of the nodes.
  • the protocol information obtaining subunit includes:
  • the first pairing subunit is configured to pair the nodes in the logical group in pairs according to the set first pairing distance
  • the first protocol obtaining subunit is configured to obtain protocol information of the node paired with the first pairing distance according to the data transmitted between the paired nodes.
  • the second pairing subunit is configured to re-pair the nodes in the logical group in pairs according to the set second pairing distance
  • the second protocol obtaining subunit is used to obtain the protocol information of the nodes paired at the second pairing distance, according to the data transmitted between the two paired nodes and based on the already obtained protocol information of the nodes paired at the first pairing distance.
  • the third pairing subunit is used to re-pair the nodes in the logical group in pairs according to the set third pairing distance;
  • the third protocol obtaining subunit is used to obtain the protocol information of the nodes paired at the third pairing distance, according to the data transmitted between the two paired nodes and based on the protocol information of the nodes paired at the second pairing distance;
  • the first sending subunit is configured to send the protocol information of the nodes paired at the third pairing distance to the nodes paired at the second pairing distance;
  • the second sending subunit is used to send the protocol information of the nodes paired at the second pairing distance, together with the received protocol information of the nodes paired at the third pairing distance, to the nodes paired at the first pairing distance.
  • the device for establishing a data communication link further includes:
  • the shift unit is configured to shift the determined node identifier of the node in the logical group to obtain the node identifier of the node in the shifted logical group.
  • the shift unit is specifically configured to shift the node identifiers of the nodes located on the upper side or the lower side of the bipartite graph of the cluster architecture in the logical group to obtain the node identifiers of the nodes in the shifted logical group.
  • the node identification in the device for establishing a data communication link provided by this application can be presented in different ways, and therefore, it also includes:
  • the recording unit is configured to record the node identification information of the nodes in the logical group to the nodes in the physical group connected to the switching device.
  • a query table establishing unit configured to establish a node query table according to the node identifier
  • the recording unit is configured to record the node identification information in the node query table.
  • the query table establishment unit includes:
  • An index subunit used to establish an index according to the distance between nodes in the logical group
  • the recording subunit is used to record node identification information with the same index value in the same node query list.
  • this application also provides a method for determining node identifiers in a cluster architecture. Part of this method is the same as the content of the embodiment of the method for establishing a data communication link; for that content, refer to the description of step S201 and the description of step S202 in the embodiment of the method for establishing a data communication link.
  • FIG. 13 is a flowchart of an embodiment of a method for determining node identifiers in a cluster architecture provided by the present application, and the determining method includes:
  • Step S1301 Group the nodes in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes.
  • the cluster architecture structure in this embodiment adopts the topological structure of a bipartite graph.
  • the switching devices in the cluster architecture are distributed on both sides of the bipartite graph. The architecture includes nodes connected to the switching devices on the upper side and nodes connected to the switching devices on the opposite lower side; it is a network cluster architecture in which data can be communicated through the switching devices between upper-side and lower-side nodes, among upper-side nodes, and among lower-side nodes.
  • the nodes respectively connected to the switching devices on the upper and lower sides of the bipartite graph may include: computing node devices, storage node devices, network node devices (network cards), co-processing node devices, and system node devices.
  • the co-processing node device may include: GPU, ASIC, FPGA, DSP, etc.
  • the specific implementation process of step S1301 is:
  • the nodes distributed on both sides of the bipartite graph in the cluster architecture are grouped to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes.
  • the logical group may represent a logical connection relationship between nodes
  • the physical group may represent a physical location relationship of nodes.
  • the set of nodes connected to different switching devices in the bipartite graph may be determined as a logical group; the set of nodes connected to the same switching device in the bipartite graph may be determined as a physical group, wherein the physical group includes nodes from different logical groups.
  • a 4×8 network cluster architecture is taken as an example for description; please refer to FIG. 3. For the specific content, refer to the description of step S201 in the method for establishing a data communication link, which is not repeated here.
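  • The distinction between the two kinds of group can be sketched as two different ways of bucketing the same nodes. The layout below is an assumption made for illustration only: 8 switching devices with 4 ports each, where a node's switch determines its physical group and its port position determines its logical group; the node names and the group_nodes helper are hypothetical.

```python
from collections import defaultdict


def group_nodes(attachments):
    """attachments: iterable of (node, switch, port) tuples."""
    physical = defaultdict(list)  # nodes attached to the same switching device
    logical = defaultdict(list)   # one node per switching device, same port position
    for node, switch, port in attachments:
        physical[switch].append(node)
        logical[port].append(node)
    return dict(physical), dict(logical)


# Assumed layout: 8 switching devices, 4 nodes attached to each.
attachments = [(f"s{s}p{p}", s, p) for s in range(8) for p in range(4)]
physical, logical = group_nodes(attachments)
print(len(physical), len(logical))  # 8 4
print(logical[0])  # ['s0p0', 's1p0', 's2p0', 's3p0', 's4p0', 's5p0', 's6p0', 's7p0']
```

  • With this layout each logical group has 8 nodes spread over different switching devices, and each physical group contains one node from each of the 4 logical groups.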
  • Step S1302 Determine the node identifiers of other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group.
  • the purpose of the step S1302 is to determine the identification reference of a node in advance, so as to facilitate obtaining node identifications of other nodes according to the identification reference.
  • by setting the identifiers of the designated nodes in the logical group, the node identifiers of the nodes other than the designated nodes in the logical group are obtained. Therefore, a node identification reference needs to be set, which may also include:
  • the identifiers of the two nodes are determined as the set designated node identifiers.
  • for the process of determining the identifiers of the designated nodes in step S1302, refer to the description of steps S20a to S20d in the embodiment of the method for establishing a data communication link, which is not repeated here.
  • the node identification can be in a coding manner.
  • other identification methods that can distinguish nodes can also be used, such as shape identification, color identification, etc.
  • the above explains how to set the node identifiers of the designated nodes. After the node identifiers of the designated nodes are obtained, the node identifiers of the node members in the other logical groups need to be determined according to them, so that the node identifiers of the node members of each physical group are also known.
  • step S1302 there are two ways to determine the node identification of other nodes according to the node identification of the specified node, but it is not limited to these two ways.
  • Way one includes:
  • Step S1302-1a: According to the node identifiers of the nodes in the first node group in the first logical group, generate the node identifiers of the nodes in the second node group adjacent to the first node group in the first logical group;
  • Step S1302-1b: According to the node identifiers of the nodes in the second node group, generate the node identifiers of the nodes in the next node group adjacent to the second node group in the first logical group;
  • Step S1302-1c: Once the node identifiers in the first logical group have been generated, generate the node identifiers of the nodes in the first node group in the second logical group according to the node identifiers of the nodes in the first node group in the first logical group;
  • Step S1302-1d: Generate the node identifiers of the nodes in the second node group in the second logical group according to the node identifiers of the nodes in the second node group in the first logical group, and so on until the node identifiers of the nodes in all logical groups have been generated.
  • Way two includes:
  • Step S1302-2a: Sort the node groups in the logical groups in order to obtain a sorted set of node groups;
  • Step S1302-2b: Determine a first node identifier reference according to the node identifiers of the nodes in the first node group in the node group set;
  • Step S1302-2c: Generate the node identifiers of the nodes in the second node group in the node group set according to the first node identifier reference;
  • Step S1302-2d: Determine a second node identifier reference according to the node identifiers of the nodes in the second node group;
  • Step S1302-2e: Generate the node identifiers of the nodes in the third node group in the node group set according to the second node identifier reference, and so on until the node identifiers of the nodes in all node groups in the node group set have been generated.
  • The above is only a general description of how step S1302 determines the node identifiers of the other nodes in the logical group from the node identifiers of the designated nodes; for details, refer to the description of step S202.
  • FIG. 14 is a schematic structural diagram of an embodiment of an apparatus for determining node identifiers in a cluster architecture provided by the present application; the determining apparatus includes:
  • the grouping unit 1401 is used to group the nodes in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes.
  • the grouping unit 1401 is specifically configured to group the nodes distributed on both sides of the bipartite graph in the cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes.
  • the grouping unit 1401 may include:
  • a logical group determining subunit, configured to determine a set of nodes in the bipartite graph that are each connected to a different switching device as the logical group;
  • a physical group determining subunit, configured to determine a set of nodes in the bipartite graph that are connected to the same switching device as the physical group, where the physical group includes nodes from different logical groups.
  • a selection unit, configured to select node information in the first logical group;
  • a node group dividing unit, configured to divide the nodes in the first logical group, according to the bipartite graph, into node groups each including at least two nodes;
  • a designated node identifier determining unit, configured to determine the node identifiers of the two nodes as the node identifiers of the set designated nodes.
  • the determining unit 1402 is configured to determine the node identifiers of other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group.
  • the determining unit 1402 includes two implementation manners.
  • the first manner includes:
  • a first generating subunit, configured to generate, according to the node identifiers of the nodes in the first node group in the first logical group, the node identifiers of the nodes in the second node group adjacent to the first node group in the first logical group;
  • a second generating subunit, configured to generate, according to the node identifiers of the nodes in the second node group, the node identifiers of the nodes in the next node group adjacent to the second node group in the first logical group;
  • a third generating subunit, configured to generate, once the node identifiers in the first logical group have been generated, the node identifiers of the nodes in the first node group in the second logical group according to the node identifiers of the nodes in the first node group in the first logical group;
  • a fourth generating subunit, configured to generate the node identifiers of the nodes in the second node group in the second logical group according to the node identifiers of the nodes in the second node group in the first logical group, and so on until the node identifiers of the nodes in all logical groups have been generated.
  • the second manner of the determining unit 1402 includes:
  • the first determining subunit is configured to determine the first node identification reference according to the node identification of the node in the first node group in the node group set;
  • the first generating subunit is configured to generate node identifiers of nodes in the second node group in the node group set according to the first node identifier reference;
  • a second determining subunit configured to determine a second node identification reference according to the node identification of the node in the second node group
  • the third generating subunit is configured to generate the node identifiers of the nodes in the third node group in the node group set according to the second node identifier reference; and so on until the node identifiers of the nodes in all the node groups in the node group set are generated.
  • FIG. 15 is a flowchart of an embodiment of the method for establishing a data communication link in an artificial intelligence cluster architecture provided by this application. The method includes:
  • Step S1501: Group the nodes used for data communication in the artificial intelligence cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes in the artificial intelligence cluster architecture;
  • Step S1502: Determine the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group;
  • Step S1503: Establish a unique data communication link between nodes in the logical group according to the node identifiers.
  • the method for establishing a data communication link based on the artificial intelligence cluster architecture aims to apply the uniqueness of the data communication link to the artificial intelligence cluster architecture, thereby ensuring the stability of the performance of the artificial intelligence cluster architecture.
  • this application also provides an embodiment of an apparatus for establishing a data communication link in an artificial intelligence cluster architecture. Please refer to FIG. 16.
  • the device includes:
  • the grouping unit 1601 is configured to group the nodes used for data communication in the artificial intelligence cluster architecture to obtain a logical group describing the connection relationship of the nodes and a physical group describing the physical location of the nodes in the artificial intelligence cluster architecture;
  • the determining unit 1602 is configured to determine the node identifiers of other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group;
  • the establishing unit 1603 is configured to establish a unique data communication link between nodes in the logical group according to the node identifier.
  • this application also provides a computer storage device for storing data generated by a network platform and a program for processing the data generated by the network platform;
  • when the program is read and executed by a processor, it executes the steps of the method for establishing a data communication link as described above, or the steps of the method for determining node identifiers in a cluster architecture as described above, or the steps of the method for establishing a data communication link in an artificial intelligence cluster architecture as described above.
  • this application also provides an electronic device, including:
  • a processor;
  • a memory, used to store a program for processing data generated by a network platform; when the program is read and executed by the processor, it executes the steps of the method for establishing a data communication link as described above, or the steps of the method for determining node identifiers in a cluster architecture as described above, or the steps of the method for establishing a data communication link in an artificial intelligence cluster architecture as described above.
  • the computing device includes one or more processors (CPU), input/output interfaces, network interfaces, and memory.
  • the memory may include non-permanent memory in computer readable media, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer readable media.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage can be realized by any method or technology.
  • the information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device.
  • As defined here, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
  • this application can be provided as methods, systems or computer program products. Therefore, this application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, this application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.

Abstract

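This application discloses a method and apparatus for establishing a data communication link, a method and apparatus for determining node identifiers in a cluster architecture, a method and apparatus for establishing a data communication link in an artificial intelligence cluster architecture, a computer storage medium, and an electronic device. The method for establishing a data communication link includes: grouping the nodes in a cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes; determining the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group; and establishing, according to the node identifiers, a unique data communication link between nodes in the logical group. As a result, there are no duplicate data communication paths between nodes in the cluster architecture, which avoids the performance degradation and network congestion that arise under large-scale data processing.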

Description

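Method and apparatus for establishing a communication link, and method and apparatus for determining node identifiers
This application claims priority to Chinese patent application No. 201910457828.2, filed on May 29, 2019 and entitled "Method and apparatus for establishing a communication link, and method and apparatus for determining node identifiers", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computer technology, and in particular to a method and apparatus for establishing a data communication link, a method and apparatus for determining node identifiers in a cluster architecture, a method and apparatus for establishing a data communication link in an artificial intelligence cluster architecture, and a computer storage device and an electronic device.
Background
With the widespread use of Internet technology, demand keeps growing for big data processing, machine learning, and similar applications. To speed up data processing, existing cluster architectures add large numbers of computing devices to raise processing capacity, and a cluster architecture must also consider the network interconnection structure between switching devices in order to improve data communication between nodes.
The topology most commonly used in cluster architectures today is Fat-tree, one of the most common topologies for high-performance computing clusters. The Fat-tree topology has three layers, from bottom to top the edge layer, the aggregation layer, and the core layer, and it can implement data communication between a source node and a destination node through a reduce-like algorithm. Because there are multiple parallel paths between the source node and the destination node, the network is general-purpose and scales well. Under large-scale data processing, however, it incurs a considerable performance loss.
To this end, the prior art provides a cluster network topology based on a bipartite graph, shown in FIG. 1. A large number of computing devices communicate, through the switching devices they are connected to, with the large number of computing devices attached to the switching devices on the opposite side. As FIG. 1 shows, when computing devices on the upper side of the topology communicate with each other, traffic can pass through any switching device on the lower side of the bipartite graph to reach the upper-side switching device connected to the target computing device. Likewise, when computing devices on the lower side communicate with each other, traffic can pass through any switching device on the upper side to reach the lower-side switching device. Whether the communication is between upper-side computing nodes or between lower-side computing nodes, a large number of reachable paths exist.
If a large number of reachable paths exist when computing devices communicate, network traffic becomes congested and data communication is delayed.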
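Summary of the Invention
This application provides a method for establishing a data communication link, to solve the prior-art problem of data congestion caused by non-unique communication paths between node devices.
This application provides a method for establishing a data communication link, including:
grouping the nodes in a cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes;
determining the node identifiers of the nodes in the logical group according to the identifiers of the designated nodes set in the logical group;
establishing, according to the node identifiers, a unique data communication link between nodes in the logical group.
In some embodiments, grouping the nodes used for data communication in the cluster architecture to obtain the logical group describing the connection relationships of the nodes and the physical group describing the physical locations of the nodes includes:
grouping the nodes distributed on the two sides of a bipartite graph in the cluster architecture to obtain the logical group describing the connection relationships of the nodes and the physical group describing the physical locations of the nodes.
In some embodiments, grouping the nodes distributed on the two sides of the bipartite graph in the cluster architecture to obtain the logical group and the physical group in the cluster architecture includes:
determining a set of nodes in the bipartite graph that are each connected to a different switching device as the logical group;
determining a set of nodes in the bipartite graph that are connected to the same switching device as the physical group, where the physical group includes nodes from different logical groups.
In some embodiments, the method further includes:
selecting node information in a first logical group;
dividing the nodes in the first logical group, according to the bipartite graph, into node groups each including at least two nodes;
assigning node identifiers to the two nodes of the first node group in the first logical group;
determining the node identifiers of the two nodes as the node identifiers of the set designated nodes.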
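In some embodiments, determining the node identifiers of the nodes in the logical group according to the identifiers of the designated nodes set in the logical group includes:
generating, according to the node identifiers of the nodes in the first node group in the first logical group, the node identifiers of the nodes in the second node group adjacent to the first node group in the first logical group;
generating, according to the node identifiers of the nodes in the second node group, the node identifiers of the nodes in the next node group adjacent to the second node group in the first logical group;
once the node identifiers in the first logical group have been generated, generating the node identifiers of the nodes in the first node group in a second logical group according to the node identifiers of the nodes in the first node group in the first logical group;
generating the node identifiers of the nodes in the second node group in the second logical group according to the node identifiers of the nodes in the second node group in the first logical group, and so on until the node identifiers of the nodes in all logical groups have been generated.
In some embodiments, determining the node identifiers of the nodes in the logical group according to the identifiers of the designated nodes set in the logical group includes:
sorting the node groups in the logical groups in order to obtain a sorted node group set;
determining a first node identifier reference according to the node identifiers of the nodes in the first node group in the node group set;
generating the node identifiers of the nodes in the second node group in the node group set according to the first node identifier reference;
determining a second node identifier reference according to the node identifiers of the nodes in the second node group;
generating the node identifiers of the nodes in the third node group in the node group set according to the second node identifier reference, and so on until the node identifiers of the nodes in all node groups in the node group set have been generated.
In some embodiments, establishing, according to the node identifiers, the unique data communication link between nodes in the logical group includes:
performing a reduction operation on the nodes in the logical group according to the node identifiers to obtain reduction information of the nodes;
establishing the unique data communication link between nodes in the logical group according to the reduction information of the nodes.
In some embodiments, performing the reduction operation on the nodes in the logical group according to the node identifiers to obtain the reduction information of the nodes includes:
pairing the nodes in the logical group two by two according to a set first pairing distance;
obtaining, according to the data transmitted between the paired nodes, the reduction information of the nodes paired at the first pairing distance.
In some embodiments, the method further includes:
pairing the nodes in the logical group two by two again according to a set second pairing distance;
obtaining the reduction information of the nodes paired at the second pairing distance according to the data transmitted between the two paired nodes and based on the already obtained reduction information of the nodes paired at the first pairing distance.
In some embodiments, the method further includes:
pairing the nodes in the logical group two by two again according to a set third pairing distance;
obtaining the reduction information of the nodes paired at the third pairing distance according to the data transmitted between the two paired nodes and based on the already obtained reduction information of the nodes paired at the second pairing distance;
sending the reduction information of the nodes paired at the third pairing distance to the nodes paired at the second pairing distance;
sending the obtained reduction information of the nodes paired at the second pairing distance, together with the received reduction information of the nodes paired at the third pairing distance, to the nodes paired at the first pairing distance.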
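In some embodiments, the method further includes:
shifting the determined node identifiers of the nodes in the logical group to obtain the shifted node identifiers of the nodes in the logical group.
In some embodiments, shifting the determined node identifiers of the nodes in the logical group to obtain the shifted node identifiers includes:
shifting the node identifiers of the nodes of the logical group located on the upper side, or of those located on the lower side, of the bipartite graph of the cluster architecture to obtain the shifted node identifiers of the nodes in the logical group.
In some embodiments, the method further includes:
recording the node identifier information of the nodes in the logical group on the nodes in the physical group connected to the switching device.
In some embodiments, the method further includes:
building a node lookup table according to the node identifiers;
recording the node identifier information in the node lookup table.
In some embodiments, building the node lookup table according to the node identifiers of the nodes in the logical group includes:
building an index according to the distance between nodes in the logical group;
recording the node identifier information with the same index value in the same node lookup list.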
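This application further provides an apparatus for establishing a data communication link, including:
a grouping unit, configured to group the nodes in a cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes;
a determining unit, configured to determine the node identifiers of the nodes in the logical group according to the identifiers of the designated nodes set in the logical group;
an establishing unit, configured to establish, according to the node identifiers, a unique data communication link between nodes in the logical group.
This application further provides a method for determining node identifiers in a cluster architecture, including:
grouping the nodes in a cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes;
determining the node identifiers of the nodes in the logical group according to the identifiers of the designated nodes set in the logical group.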
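In some embodiments, grouping the nodes in the cluster architecture to obtain the logical group describing the connection relationships of the nodes and the physical group describing the physical locations of the nodes includes:
grouping the nodes distributed on the two sides of a bipartite graph in the cluster architecture to obtain the logical group and the physical group.
In some embodiments, grouping the nodes distributed on the two sides of the bipartite graph in the cluster architecture to obtain the logical group and the physical group includes:
determining a set of nodes in the bipartite graph that are each connected to a different switching device as the logical group;
determining a set of nodes in the bipartite graph that are connected to the same switching device as the physical group, where the physical group includes nodes from different logical groups.
In some embodiments, the method further includes:
selecting node information in a first logical group;
dividing the nodes in the first logical group, according to the bipartite graph, into node groups each including at least two nodes;
assigning identifiers to the two nodes of the first node group in the first logical group;
determining the identifiers of the two nodes as the set designated node identifiers.
In some embodiments, determining the node identifiers of the nodes in the logical group according to the identifiers of the designated nodes set in the logical group includes:
generating, according to the node identifiers of the nodes in the first node group in the first logical group, the node identifiers of the nodes in the second node group adjacent to the first node group in the first logical group;
generating, according to the node identifiers of the nodes in the second node group, the node identifiers of the nodes in the next node group adjacent to the second node group in the first logical group;
once the node identifiers in the first logical group have been generated, generating the node identifiers of the nodes in the first node group in a second logical group according to the node identifiers of the nodes in the first node group in the first logical group;
generating the node identifiers of the nodes in the second node group in the second logical group according to the node identifiers of the nodes in the second node group in the first logical group, and so on until the node identifiers of the nodes in all logical groups have been generated.
In some embodiments, determining the node identifiers of the nodes in the logical group according to the identifiers of the designated nodes set in the logical group includes:
sorting the node groups in the logical groups in order to obtain a sorted node group set;
determining a first node identifier reference according to the node identifiers of the nodes in the first node group in the node group set;
generating the node identifiers of the nodes in the second node group in the node group set according to the first node identifier reference;
determining a second node identifier reference according to the node identifiers of the nodes in the second node group;
generating the node identifiers of the nodes in the third node group in the node group set according to the second node identifier reference, and so on until the node identifiers of the nodes in all node groups in the node group set have been generated.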
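This application further provides an apparatus for determining node identifiers in a cluster architecture, including:
a grouping unit, configured to group the nodes in a cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes;
a determining unit, configured to determine the node identifiers of the nodes in the logical group according to the identifiers of the designated nodes set in the logical group.
This application further provides a method for establishing a data communication link in an artificial intelligence cluster architecture, including:
grouping the nodes used for data communication in the artificial intelligence cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes in the artificial intelligence cluster architecture;
determining the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group;
establishing, according to the node identifiers, a unique data communication link between nodes in the logical group.
This application further provides an apparatus for establishing a data communication link in an artificial intelligence cluster architecture, including:
a grouping unit, configured to group the nodes used for data communication in the artificial intelligence cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes in the artificial intelligence cluster architecture;
a determining unit, configured to determine the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group;
an establishing unit, configured to establish, according to the node identifiers, a unique data communication link between nodes in the logical group.
This application further provides a computer storage medium for storing data generated by a network platform and a program for processing the data generated by the network platform;
when the program is read and executed by a processor, it executes the steps of the method for establishing a data communication link as described above, or the steps of the method for determining node identifiers in a cluster architecture as described above, or the steps of the method for establishing a data communication link in an artificial intelligence cluster architecture as described above.
This application further provides an electronic device, including:
a processor;
a memory, configured to store a program for processing data generated by a network platform; when the program is read and executed by the processor, it executes the steps of the method for establishing a data communication link as described above, or the steps of the method for determining node identifiers in a cluster architecture as described above, or the steps of the method for establishing a data communication link in an artificial intelligence cluster architecture as described above.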
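Compared with the prior art, this application has the following advantages:
This application provides a method for establishing a data communication link, including: grouping the nodes in a cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes; determining the node identifiers of the nodes in the logical group according to the identifiers of the designated nodes set in the logical group; and establishing, according to the node identifiers, a unique data communication link between nodes in the logical group. As a result, there are no duplicate data communication paths between nodes in the cluster architecture, and the performance degradation and network congestion that arise under large-scale data processing are avoided.
In addition, in the method for establishing a data communication link provided by this application, the node identifier information may be recorded in the physical groups of the cluster architecture so that the connection relationships between nodes are known; alternatively, a node lookup table may be built and the node identifier information recorded in it, so that the connection relationships between nodes can be learned from the node lookup table.
This application further provides a method for determining node identifiers in a cluster architecture, including: grouping the nodes in a cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes; and determining the node identifiers of the nodes in the logical group according to the identifiers of the designated nodes set in the logical group. Every node in the cluster architecture thus has its own identification information, which makes it possible to determine the data communication links between nodes in the cluster architecture and avoid duplicate data communication links.
This application also provides a method for establishing a data communication link in an artificial intelligence cluster architecture, including: grouping the nodes used for data communication in the artificial intelligence cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes in the artificial intelligence cluster architecture; determining the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group; and establishing, according to the node identifiers, a unique data communication link between nodes in the logical group. This ensures that nodes in the artificial intelligence cluster architecture communicate over unique data communication links, avoiding the network congestion, and the resulting performance degradation of the artificial intelligence cluster architecture, caused by duplicate data communication links between nodes.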
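Brief Description of the Drawings
FIG. 1 is a network topology diagram of a cluster architecture;
FIG. 2 is a flowchart of an embodiment of the method for establishing a data communication link provided by this application;
FIG. 3 is a schematic structural diagram of node grouping in an embodiment of the method for establishing a data communication link provided by this application;
FIG. 4 is a schematic diagram of way one of determining node identifiers in an embodiment of the method for establishing a data communication link provided by this application;
FIG. 5 is a schematic diagram of way two of determining node identifiers in an embodiment of the method for establishing a data communication link provided by this application;
FIG. 6 is a schematic diagram of the reduction operation in an embodiment of the method for establishing a data communication link provided by this application;
FIG. 7 is a schematic diagram of the data communication links between paired nodes corresponding to the first pairing distance in an embodiment of the method for establishing a data communication link provided by this application;
FIG. 8 is a schematic diagram of the data communication links between paired nodes corresponding to the second pairing distance in an embodiment of the method for establishing a data communication link provided by this application;
FIG. 9 is a schematic diagram of the data communication links between paired nodes corresponding to the third pairing distance in an embodiment of the method for establishing a data communication link provided by this application;
FIG. 10 is a schematic diagram of a first connection relationship, with colors distinguishing nodes in different logical groups or in the same logical group, in an embodiment of the method for establishing a data communication link provided by this application;
FIG. 11 is a schematic diagram of a second connection relationship, with colors distinguishing nodes in different logical groups or in the same logical group, in an embodiment of the method for establishing a data communication link provided by this application;
FIG. 12 is a schematic structural diagram of an embodiment of the apparatus for establishing a data communication link provided by this application;
FIG. 13 is a flowchart of an embodiment of the method for determining node identifiers in a cluster architecture provided by this application;
FIG. 14 is a schematic structural diagram of an embodiment of the apparatus for determining node identifiers in a cluster architecture provided by this application;
FIG. 15 is a flowchart of an embodiment of the method for establishing a data communication link in an artificial intelligence cluster architecture provided by this application;
FIG. 16 is a schematic structural diagram of an embodiment of the apparatus for establishing a data communication link in an artificial intelligence cluster architecture provided by this application.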
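Detailed Description
Many specific details are set forth in the following description to provide a thorough understanding of this application. This application can, however, be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from its spirit; this application is therefore not limited by the specific implementations disclosed below.
The terminology used in this application is for the purpose of describing particular embodiments only and is not intended to limit this application. Expressions such as "a", "first", and "second" used in this application and the appended claims do not limit quantity or order; they serve to distinguish information of the same type from one another.
The method for establishing a data communication link provided by this application may establish data communication links for data communication between different node devices in a cluster architecture, avoiding the network congestion caused by multiple reachable paths between node devices.
Refer to FIG. 2, which is a flowchart of an embodiment of the method for establishing a data communication link provided by this application. The method includes:
Step S201: Group the nodes in the cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes.
Before the specific implementation of step S201 is described, some technical terms involved in this step are explained as follows:
The cluster architecture in this embodiment uses a bipartite-graph topology. As shown in FIG. 1, the switching devices in the cluster architecture are distributed on the two sides of the bipartite graph. The architecture includes nodes connected to the switching devices on the upper side and nodes connected to the switching devices on the opposite lower side, and it is a network cluster architecture in which upper-side and lower-side nodes, upper-side nodes among themselves, and lower-side nodes among themselves can all communicate through the switching devices.
The nodes connected to the switching devices on the upper and lower sides of the bipartite graph may include computing node devices, storage node devices, network node devices (network interface cards), co-processing node devices, system node devices, and so on. The co-processing node devices may include, for example, GPUs, ASICs, FPGAs, and DSPs.
Step S201 is implemented as follows:
Step S201-1: Group the nodes distributed on the two sides of the bipartite graph in the cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes. The logical group may represent the logical connection relationships between nodes, and the physical group may represent the physical location relationships of nodes.
In this embodiment, step S201-1 may be implemented as follows:
Step S201-1a: Determine a set of nodes in the bipartite graph that are each connected to a different switching device as a logical group;
Step S201-1b: Determine a set of nodes in the bipartite graph that are connected to the same switching device as the physical group, where the physical group includes nodes from different logical groups.
For ease of understanding, a 4×8 network cluster architecture is used as an example below; see FIG. 3.
FIG. 3 is a schematic structural diagram of node grouping in an embodiment of the method for establishing a data communication link provided by this application. In this embodiment, 4×8 may mean 8 physical groups and 4 logical groups, with 32 nodes in total. Each switching device connects members of different logical groups, which together form a physical group; that is, each switch connects 4 nodes, these 4 nodes form one physical group, and there are 8 physical groups. The 4 nodes come from different logical groups; selecting nodes on different switching devices forms a logical group, each logical group has 8 nodes, and there are 4 logical groups.
In this embodiment, the logical groups may be divided according to the order in which nodes are connected within the physical groups. For example, as shown in FIG. 3, the first node of every upper-side physical group and the first node of every lower-side physical group are placed in the first logical group; the second node of every upper-side physical group and the second node of every lower-side physical group are placed in the second logical group; the third nodes are placed in the third logical group; and the fourth nodes are placed in the fourth logical group. The composition of the physical groups may be chosen arbitrarily, and the logical groups may likewise be divided as actually needed, provided that each physical group has 4 nodes and each logical group has 8 nodes. Normally no node is repeated. The 4×8 network cluster architecture above only illustrates logical groups and physical groups and is not limiting.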
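The 4×8 grouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `group_nodes`, the `(switch, slot)` tuple used to stand for a node, and the convention that switches 0 to 3 are the upper side and 4 to 7 the lower side are all hypothetical and do not come from the application.

```python
def group_nodes(num_switches=8, nodes_per_switch=4):
    """Build physical and logical groups for a bipartite cluster.

    A node is represented by the hypothetical tuple (switch, slot):
    `switch` is the switching device it hangs off, `slot` its position
    (left to right) on that switch.
    """
    # Physical group: all nodes attached to the same switching device.
    physical_groups = [
        [(switch, slot) for slot in range(nodes_per_switch)]
        for switch in range(num_switches)
    ]
    # Logical group: the slot-th node of every switching device, so each
    # member is attached to a different switch.
    logical_groups = [
        [(switch, slot) for switch in range(num_switches)]
        for slot in range(nodes_per_switch)
    ]
    return physical_groups, logical_groups


physical, logical = group_nodes()
print(len(physical), len(physical[0]))  # 8 physical groups of 4 nodes
print(len(logical), len(logical[0]))    # 4 logical groups of 8 nodes
```

With the defaults, every physical group contains exactly one member of each logical group, which is the property the description relies on.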
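After the physical groups and logical groups have been formed for the cluster architecture, every node in the logical groups needs to be identified, so that the communication connection relationships between the nodes of different physical groups can be determined. The determination of node identifiers in this application is described below.
Step S202: Determine the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group.
The other nodes may be the nodes other than the designated nodes.
The purpose of step S202 is to determine an identifier reference in advance, so that the identifiers of the other nodes can be obtained from that reference. By setting the identifiers of the designated nodes in the logical group, the node identifiers of the nodes other than the designated nodes in the logical group are obtained. A node identifier reference therefore needs to be set, which may further include:
Step S20a: Select node information in a first logical group.
The first logical group may be any logical group in the 4×8 cluster architecture above. Continuing the example and referring to FIG. 3, with the nodes in each physical group arranged from left to right, the first logical group may include the first node of every upper-side physical group and the first node of every lower-side physical group.
Step S20b: Divide the nodes in the first logical group, according to the bipartite graph, into node groups each including at least two nodes.
Continuing the example, a node of the first logical group located in an upper-side physical group and a node located in a lower-side physical group form one node group; for the 4×8 cluster architecture, one logical group may include 4 node groups. In other words, the bipartite graph yields two vector components, x and y, where x denotes a node on the upper side of the bipartite graph and y denotes a node on the lower side. A node in an upper-side physical group and a node in a lower-side physical group are taken together as one node group.
Step S20c: Assign identifiers to the two nodes of the first node group in the first logical group.
Step S20c may assign the vector value [1, 2] to the two nodes of the first node group, that is, x = 1 and y = 2. Here 1 and 2 are the identifiers of the nodes in the first node group: 1 identifies the node in the upper-side physical group of the bipartite graph, and 2 identifies the node in the lower-side physical group.
Step S20d: Determine the node identifiers of the two nodes as the node identifiers of the set designated nodes.
The purpose of step S20d is to take the node identifiers assigned to the two nodes in step S20c as the node identifiers of the designated nodes.
After the node identifiers of the designated nodes have been determined, the node identifiers of the other nodes need to be obtained from them, which lays the basis for establishing the connection relationships between nodes.
In this embodiment, the node identifier may take the form of a code; other identification schemes that can distinguish nodes, such as shape identifiers or color identifiers, may also be used.
The above explains how the node identifiers of the designated nodes are set. Once they are obtained, the node identifiers of the node members in the other logical groups are determined from them, so that the node identifiers of the node members in each physical group are also known.
In step S202, the node identifiers of the other nodes may be determined from the node identifiers of the designated nodes in two ways, although the step is not limited to these two ways. The two ways are first summarized and then described in detail.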
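Way one includes:
Step S202-1a: According to the node identifiers of the nodes in the first node group in the first logical group, generate the node identifiers of the nodes in the second node group adjacent to the first node group in the first logical group;
Step S202-1b: According to the node identifiers of the nodes in the first node group in the first logical group, generate the node identifiers of the nodes in the next node group adjacent to the second node group;
Step S202-1c: Once the node identifiers in the first logical group have been generated, generate the node identifiers of the nodes in the first node group in the second logical group according to the node identifiers of the nodes in the first node group in the first logical group;
Step S202-1d: Generate the node identifiers of the nodes in the second node group in the second logical group according to the node identifiers of the nodes in the second node group in the first logical group, and so on until the node identifiers of the nodes in all logical groups have been generated.
Way two includes:
Step S202-2a: Sort the node groups in the logical groups in order to obtain a sorted node group set;
Step S202-2b: Determine the node identifiers of the nodes in the first node group according to the preset node identifiers of the nodes in the first node group in the node group set;
Step S202-2c: Generate the node identifiers of the nodes in the second node group in the node group set according to the node identifiers of the nodes in the first node group, obtaining the node identifiers of the nodes in the second node group;
Step S202-2d: Generate the node identifiers of the nodes in the third node group in the node group set according to the node identifiers of the nodes in the second node group, obtaining the node identifiers of the nodes in the third node group; and so on until the node identifiers of the nodes in all node groups in the node group set have been generated.
Ways one and two are alike in that both determine the node identifiers of all node members of the first logical group from the node identifiers of the designated nodes; in this embodiment, the designated nodes may be the node members of the first node group in the first logical group.
Ways one and two differ as follows once the node identifiers of the first node group in the first logical group have been determined. In way one, the node identifiers of the node groups in the second logical group are determined from the node identifiers of the corresponding node groups in the first logical group; the node identifiers of the node groups in the third logical group are likewise determined from the corresponding node groups in the first logical group, and so are those of the fourth logical group, which yields the node identifiers of all nodes in all logical groups. Note that this is based on the 4×8 cluster architecture; for a 4×16 architecture, the process continues after the fourth logical group with the next logical group, and so on until all logical groups have been processed. This embodiment is described on the basis of the 4×8 cluster architecture, which is used throughout what follows. In way two, the node identifiers of the second node group in the first logical group are determined from the designated nodes (the first node group), those of the third node group from the second node group, and those of the fourth node group from the third node group; the node identifiers of the first node group in the second logical group are determined from the fourth node group in the first logical group, those of the second node group in the second logical group from the first node group in the second logical group, and so on. The details are described below.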
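Way one:
Refer to FIG. 4, which is a schematic diagram of way one of determining node identifiers in an embodiment of the method for establishing a data communication link provided by this application.
First, the logical groups are numbered 0 to N. In this embodiment the numbering may be: the first logical group is 0, binary 0000; the second logical group is 1, binary 0001; the third logical group is 2, binary 0010; the fourth logical group is 3, binary 0011. The first logical group is used as an example to explain the structure of a logical group. The first logical group includes 8 nodes, and each pair of vertically opposite nodes forms a node group. As shown in FIG. 4, the first logical group includes, from left to right: the first node group 0, binary 0000; the second node group 1, binary 0001; the third node group 2, binary 0010; the fourth node group 3, binary 0011. The structure of the first logical group applies equally to the other logical groups.
With this structure in mind, the determination of the node numbers in the first logical group is as follows. The nodes of the first node group in the first logical group are the designated nodes; the second, third, and fourth node groups each generate their node numbers from the first node group, specifically:
From steps S20a to S20d above, the two nodes of the first node group in the first logical group are the designated nodes (x, y), with node numbers x = 1 and y = 2. From the node numbers of the first node group in the first logical group, the node numbers of the other node groups in the first logical group are computed with the formula [x+i×B, y+i×B], where i is the number of the current node group or the number of the current logical group; when the node numbers in the first logical group are computed, i is the number of the current node group.
For the first logical group:
When the number of 1 bits in the binary representation of a node group in the first logical group is odd, the node numbers are computed as [y+i×B, x+i×B]; when the number of 1 bits is even, they are computed as [x+i×B, y+i×B]. Here i is the number of the current node group and B is the number of designated nodes, 2.
The node numbers of the first node group (binary 0000, node group number 0) are designated as x = 1, y = 2; the formula [1+0×2, 2+0×2] gives the same result, x = 1, y = 2.
The node numbers of the second node group (binary 0001, node group number 1) are generated from the node numbers [1, 2] of the first node group with the formula [y+i×B, x+i×B], that is, [2+1×2, 1+1×2], giving [4, 3] for the second node group;
The node numbers of the third node group (binary 0010, node group number 2) are generated from the node numbers [1, 2] of the first node group with the formula [y+i×B, x+i×B], that is, [2+2×2, 1+2×2], giving [6, 5] for the third node group;
The node numbers of the fourth node group (binary 0011, node group number 3) are generated from the node numbers [1, 2] of the first node group with the formula [x+i×B, y+i×B], that is, [1+3×2, 2+3×2], giving [7, 8] for the fourth node group.
This gives the node numbers of all node groups in the first logical group:
(Figure PCTCN2020091221-appb-000001: node numbers of the first logical group; per the derivation above, [1, 2], [4, 3], [6, 5], [7, 8].)
Note that B in the formulas above is the calculation base, which is set per logical group. In the first logical group the subsequent node numbers are generated from two designated nodes, so the calculation base is 2; once the node numbers of the first logical group have been determined, the calculation base is set to 8.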
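For the second logical group:
The node numbers in the second logical group can be determined from the node numbers of the first logical group as follows:
The second logical group has binary representation 0001, so its node numbers are generated with [y+i×B, x+i×B], where i is the number of the current logical group, 1, and B is 8. Because the node numbers of the first logical group are already known, no binary check is needed for the node groups of the second logical group; the numbers are simply generated from the corresponding node numbers in the first logical group. For example:
The node numbers of the first node group are generated from the node numbers [1, 2] of the first node group of the first logical group, that is, [2+1×8, 1+1×8], giving [10, 9] for the first node group;
The node numbers of the second node group are generated from the node numbers [4, 3] of the second node group of the first logical group, that is, [3+1×8, 4+1×8], giving [11, 12] for the second node group;
The node numbers of the third node group are generated from the node numbers [6, 5] of the third node group of the first logical group, that is, [5+1×8, 6+1×8], giving [13, 14] for the third node group;
The node numbers of the fourth node group are generated from the node numbers [7, 8] of the fourth node group of the first logical group, that is, [8+1×8, 7+1×8], giving [16, 15] for the fourth node group.
This gives the node numbers of all node groups in the second logical group:
(Figure PCTCN2020091221-appb-000002: node numbers of the second logical group; per the derivation above, [10, 9], [11, 12], [13, 14], [16, 15].)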
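For the third logical group:
The node numbers in the third logical group can be determined from the node numbers of the first logical group as follows:
The third logical group has binary representation 0010, so its node numbers are generated with [y+i×B, x+i×B], where i is the number of the current logical group, 2, and B is 8. Because the node numbers of the first logical group are already known, no binary check is needed for the node groups of the third logical group; the numbers are simply generated from the corresponding node numbers in the first logical group. For example:
The node numbers of the first node group are generated from the node numbers [1, 2] of the first node group of the first logical group, that is, [2+2×8, 1+2×8], giving [18, 17] for the first node group;
The node numbers of the second node group are generated from the node numbers [4, 3] of the second node group of the first logical group, that is, [3+2×8, 4+2×8], giving [19, 20] for the second node group;
The node numbers of the third node group are generated from the node numbers [6, 5] of the third node group of the first logical group, that is, [5+2×8, 6+2×8], giving [21, 22] for the third node group;
The node numbers of the fourth node group are generated from the node numbers [7, 8] of the fourth node group of the first logical group, that is, [8+2×8, 7+2×8], giving [24, 23] for the fourth node group.
This gives the node numbers of all node groups in the third logical group:
(Figure PCTCN2020091221-appb-000003: node numbers of the third logical group; per the derivation above, [18, 17], [19, 20], [21, 22], [24, 23].)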
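For the fourth logical group:
The node numbers in the fourth logical group can be determined from the node numbers of the first logical group as follows:
The fourth logical group has binary representation 0011, so its node numbers are generated with [x+i×B, y+i×B], where i is the number of the current logical group, 3, and B is 8. Because the node numbers of the first logical group are already known, no binary check is needed for the node groups of the fourth logical group; the numbers are simply generated from the corresponding node numbers in the first logical group. For example:
The node numbers of the first node group are generated from the node numbers [1, 2] of the first node group of the first logical group, that is, [1+3×8, 2+3×8], giving [25, 26] for the first node group;
The node numbers of the second node group are generated from the node numbers [4, 3] of the second node group of the first logical group, that is, [4+3×8, 3+3×8], giving [28, 27] for the second node group;
The node numbers of the third node group are generated from the node numbers [6, 5] of the third node group of the first logical group, that is, [6+3×8, 5+3×8], giving [30, 29] for the third node group;
The node numbers of the fourth node group are generated from the node numbers [7, 8] of the fourth node group of the first logical group, that is, [7+3×8, 8+3×8], giving [31, 32] for the fourth node group.
This gives the node numbers of all node groups in the fourth logical group:
(Figure PCTCN2020091221-appb-000004: node numbers of the fourth logical group; per the derivation above, [25, 26], [28, 27], [30, 29], [31, 32].)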
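The way-one numbering walked through above can be reproduced with a short Python sketch. It assumes the 4×8 example (4 logical groups of 4 node groups, designated nodes x = 1 and y = 2, base 2 inside the first logical group and base 8 across logical groups); the function and parameter names are hypothetical and chosen only for illustration.

```python
def has_odd_ones(n: int) -> bool:
    """True when the binary representation of n has an odd number of 1 bits."""
    return bin(n).count("1") % 2 == 1


def way_one(num_logical_groups=4, groups_per_logical=4,
            x=1, y=2, node_base=2, group_base=8):
    """Return, per logical group, the [x, y] node numbers of each node group."""
    # First logical group: every node group is derived from the designated
    # pair (x, y); an odd bit count in the node group number swaps the pair.
    first = []
    for i in range(groups_per_logical):
        if has_odd_ones(i):
            first.append((y + i * node_base, x + i * node_base))
        else:
            first.append((x + i * node_base, y + i * node_base))

    result = [first]
    # Later logical groups: derived from the matching node group of the
    # first logical group; the parity of the logical group number decides
    # whether the pair is swapped.
    for i in range(1, num_logical_groups):
        group = []
        for gx, gy in first:
            if has_odd_ones(i):
                group.append((gy + i * group_base, gx + i * group_base))
            else:
                group.append((gx + i * group_base, gy + i * group_base))
        result.append(group)
    return result


for index, group in enumerate(way_one()):
    print(index, group)
```

The odd/even swap follows the parity of the 1 bits in the group number, so the resulting orientation pattern is the Thue-Morse sequence; that observation is the editor's, not a claim made in the application.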
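The above describes the first way of determining node numbers; the second way is described below.
Way two:
Refer to FIG. 5, which is a schematic diagram of way two of determining node identifiers in an embodiment of the method for establishing a data communication link provided by this application.
First, every node group is numbered 0 to N following the order of the logical groups. In this embodiment there are 4 logical groups with 4 node groups each, 16 node groups in total, so the order of the logical groups gives the binary representation of each node group: first node group 0000; second 0001; third 0010; fourth 0011; fifth 0100; sixth 0101; seventh 0110; eighth 0111; ninth 1000; tenth 1001; eleventh 1010; twelfth 1011; thirteenth 1100; fourteenth 1101; fifteenth 1110; sixteenth 1111.
The nodes in the first node group are the designated nodes, with node numbers [1, 2]. The node numbers of the designated nodes serve as the reference from which the node numbers of the other nodes are generated: the node numbers of the second node group are generated from those of the first node group, as are those of the third node group and the fourth node group, and so on. When the number of 1 bits in the binary representation of a node group is odd, the node numbers are computed as [y+B, x+B]; when it is even, they are computed as [x+B, y+B]. B is the largest node number among the node numbers already assigned.
First node group 0000: the node numbers are preset to [1, 2];
Second node group 0001: generated with [y+B, x+B], giving [4, 3];
Third node group 0010: generated with [y+B, x+B], giving [6, 5];
Fourth node group 0011: generated with [x+B, y+B], giving [7, 8];
Fifth node group 0100: generated with [y+B, x+B], giving [10, 9];
Sixth node group 0101: generated with [x+B, y+B], giving [11, 12];
Seventh node group 0110: generated with [x+B, y+B], giving [13, 14];
Eighth node group 0111: generated with [y+B, x+B], giving [16, 15];
Ninth node group 1000: generated with [y+B, x+B], giving [18, 17];
Tenth node group 1001: generated with [x+B, y+B], giving [19, 20];
Eleventh node group 1010: generated with [x+B, y+B], giving [21, 22];
Twelfth node group 1011: generated with [y+B, x+B], giving [24, 23];
Thirteenth node group 1100: generated with [x+B, y+B], giving [25, 26];
Fourteenth node group 1101: generated with [y+B, x+B], giving [28, 27];
Fifteenth node group 1110: generated with [y+B, x+B], giving [30, 29];
Sixteenth node group 1111: generated with [x+B, y+B], giving [31, 32].
After the node numbers of every node group have been generated, the node numbers of each node group are recorded in the corresponding logical group according to the composition of the logical groups, which gives the node numbers of the nodes in every logical group.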
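Way two can be sketched the same way. The sketch below assumes the 16 node groups of the 4×8 example, designated nodes x = 1 and y = 2, and takes B as the largest node number assigned so far, as stated above; `way_two` and its helper are hypothetical names.

```python
def has_odd_ones(n: int) -> bool:
    """True when the binary representation of n has an odd number of 1 bits."""
    return bin(n).count("1") % 2 == 1


def way_two(num_node_groups=16, groups_per_logical=4, x=1, y=2):
    """Number all node groups in sorted order, then regroup them per logical group."""
    pairs = [(x, y)]  # node group 0 holds the designated nodes
    for g in range(1, num_node_groups):
        base = max(max(p) for p in pairs)  # largest node number assigned so far
        if has_odd_ones(g):
            pairs.append((y + base, x + base))
        else:
            pairs.append((x + base, y + base))
    # Record consecutive node groups into their logical groups.
    return [pairs[k:k + groups_per_logical]
            for k in range(0, num_node_groups, groups_per_logical)]


for index, group in enumerate(way_two()):
    print(index, group)
```

For the 4×8 example this produces the same 32 node numbers as the per-logical-group derivation of way one, which is a useful cross-check when changing either sketch.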
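After the node identifiers of the nodes in the logical groups have been obtained, they can be recorded in different forms. For example, the node identifier information of the nodes in a logical group may be recorded on the nodes in the physical group connected to the switching device; as shown in FIG. 3, every physical group can then show the node numbers of nodes from different logical groups. Alternatively, the node identifier information may be recorded in a lookup table: a node lookup table is built according to the node identifiers and the node identifier information is recorded in it. A node identifier is looked up in the table to determine the mapping between that identifier and a node (for example, the mapping between node identifier and node IP), which yields the node that the identifier corresponds to. In other words, the lookup table records the node identifier information, the node information, and the identifier information of the other nodes it has a connection relationship with. Building the node lookup table according to the node identifiers of the nodes in the logical group includes:
building an index according to the distance between nodes in the logical group;
recording the node identifier information with the same index value in the same node lookup list.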
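One possible reading of the distance-indexed lookup table is sketched below. The application does not spell out the table layout, so everything here is an assumption made for illustration: the "distance" is taken to be the pairing distance (1, 2, 4, ...), partners are found by XOR on the 0-based position of a node in the sorted identifier list, and `build_lookup_table` is a hypothetical name.

```python
from collections import defaultdict


def build_lookup_table(node_ids, distances=(1, 2, 4)):
    """Map each distance (the index value) to the list of node pairs at that distance."""
    ordered = sorted(node_ids)
    table = defaultdict(list)
    for distance in distances:
        for pos, node in enumerate(ordered):
            partner_pos = pos ^ distance  # assumed pairing rule
            # Record each pair once, from its lower-positioned member.
            if pos < partner_pos < len(ordered):
                table[distance].append((node, ordered[partner_pos]))
    return dict(table)


table = build_lookup_table(range(1, 9))
for distance, pairs in table.items():
    print(distance, pairs)
```

All entries that share an index value (here, the distance) end up in the same list, which mirrors the "same node lookup list" wording; a real table would also carry the node information, such as the node IP, next to each identifier.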
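Once the node identifiers have been determined as described above, the data communication links between the upper-side nodes and the lower-side nodes of the bipartite graph need to be established according to the node identifiers, as follows.
Step S203: Establish, according to the node identifiers, a unique data communication link between nodes in the logical group.
As the prior art shows, multiple reachable paths exist between the nodes of a network cluster architecture built on a bipartite graph, which causes network congestion. Once the node identifiers have been determined, a unique data communication link between nodes can be established according to them, either between nodes in different logical groups or between nodes in the same logical group. Refer to FIG. 6, which is a schematic diagram of the reduction operation in an embodiment of the method for establishing a data communication link provided by this application. The specific implementation may include:
Step S203-1: Perform a reduction operation on the nodes in the logical group according to the node identifiers to obtain the reduction information of the nodes.
The reduction operation used in step S203-1 may be a multi-step reduction, namely halving-doubling allreduce. Reduce is a classic concept from functional programming: a data reduction applies a function to turn a batch of data into a smaller batch, for example reducing the elements of an array to a single number with an addition function. Allreduce is a collective communication pattern; informally, each node sends the data it holds to all the other participating nodes, the corresponding reduction is performed, and every node ends up with the reduced result. Reduction operations include, for example, averaging, addition, and taking the maximum; this embodiment mainly uses addition.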
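The two notions introduced above can be illustrated with the Python standard library. This sketch only shows the semantics (an array reduced to one number by addition, and an allreduce in which every node ends with that number); it does not model any communication, and `allreduce_sum` is a hypothetical helper name.

```python
from functools import reduce
import operator


def allreduce_sum(values_per_node):
    """Return what each node holds after a sum allreduce: the same total everywhere."""
    total = reduce(operator.add, values_per_node)  # reduce: many values -> one
    return [total for _ in values_per_node]        # all: every node gets the result


print(reduce(operator.add, [1, 2, 3, 4]))  # a plain reduce by addition
print(allreduce_sum([1, 2, 3, 4]))         # every node ends with the same sum
```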
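As shown in FIG. 6, step S203-1 may be implemented as follows:
Step S203-1-11: Pair the nodes in the logical group two by two according to a set first pairing distance.
As in Step 1 of FIG. 6, with a pairing distance of 1, the nodes in the first logical group are paired two by two: node 1 with node 2, node 3 with node 4, node 5 with node 6, and node 7 with node 8. The nodes in the second logical group are paired two by two: node 9 with node 10, node 11 with node 12, node 13 with node 14, and node 15 with node 16. The nodes in the third logical group are paired two by two: node 17 with node 18, node 19 with node 20, node 21 with node 22, and node 23 with node 24. The nodes in the fourth logical group are paired two by two: node 25 with node 26, node 27 with node 28, node 29 with node 30, and node 31 with node 32.
Step S203-1-12: Obtain, according to the data transmitted between the paired nodes, the reduction information of the nodes paired at the first pairing distance, as in Step 1 of FIG. 6.
The purpose of step S203-1-12 is for the two nodes of each pair to obtain part of each other's data and thereby the reduction information.
On this basis, the method may further include:
Step S203-1-21: Pair the nodes in the logical group two by two again according to a set second pairing distance, as in Step 2 of FIG. 6.
Step S203-1-22: Obtain the reduction information of the nodes paired at the second pairing distance according to the data transmitted between the two paired nodes and based on the already obtained reduction information of the nodes paired at the first pairing distance.
On this basis, the method may further include:
Step S203-1-31: Pair the nodes in the logical group two by two again according to a set third pairing distance, as in Step 3 of FIG. 6.
Step S203-1-32: Obtain the reduction information of the nodes paired at the third pairing distance according to the data transmitted between the two paired nodes and based on the already obtained reduction information of the nodes paired at the second pairing distance;
Step S203-1-33: Send the reduction information of the nodes paired at the third pairing distance to the nodes paired at the second pairing distance; that is, transmission runs in reverse starting from Step 3.
Step S203-1-34: Send the obtained reduction information of the nodes paired at the second pairing distance, together with the received reduction information of the nodes paired at the third pairing distance, to the nodes paired at the first pairing distance.
Note that FIG. 6 uses only 8 nodes as an example; reducing over 32 nodes takes five steps, Step 1 to Step 5. After the forward reduction, each node holds 1/32 of the reduction result; the reverse pass is then performed, in which the amount of data transmitted may be the same at every step.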
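The forward and reverse passes described above match the general shape of a halving-doubling allreduce, which can be simulated in plain Python. This is a sketch under assumptions, not the application's implementation: nodes are addressed by 0-based position, the partner at distance d is taken to be position XOR d (the text only lists the distance-1 pairs), the node count is a power of two, and the vector length is a multiple of the node count.

```python
def halving_doubling_allreduce(data):
    """Simulate a sum allreduce; return (final buffers, forward pairing distances)."""
    n = len(data)                     # number of nodes, assumed a power of two
    length = len(data[0])             # vector length, assumed a multiple of n
    bufs = [list(v) for v in data]
    seg = [(0, length)] * n           # slice [lo, hi) each node is responsible for
    distances = []

    # Forward pass (reduce-scatter): distances 1, 2, 4, ...; each node keeps
    # half of its slice and adds in the partner's values for that half.
    d = 1
    while d < n:
        new_bufs = [list(b) for b in bufs]
        new_seg = list(seg)
        for i in range(n):
            p = i ^ d
            lo, hi = seg[i]
            mid = (lo + hi) // 2
            keep = (lo, mid) if i < p else (mid, hi)
            for k in range(*keep):
                new_bufs[i][k] = bufs[i][k] + bufs[p][k]
            new_seg[i] = keep
        bufs, seg = new_bufs, new_seg
        distances.append(d)
        d *= 2

    # Reverse pass (allgather): same distances in reverse order; each node
    # copies the partner's finished slice and its own slice doubles.
    d = n // 2
    while d >= 1:
        new_bufs = [list(b) for b in bufs]
        new_seg = list(seg)
        for i in range(n):
            p = i ^ d
            plo, phi = seg[p]
            for k in range(plo, phi):
                new_bufs[i][k] = bufs[p][k]
            new_seg[i] = (min(seg[i][0], plo), max(seg[i][1], phi))
        bufs, seg = new_bufs, new_seg
        d //= 2
    return bufs, distances


result, steps = halving_doubling_allreduce([[i + 1] * 8 for i in range(8)])
print(steps)      # forward pairing distances for 8 nodes
print(result[0])  # every node ends with the element-wise sum
```

After the forward pass each node owns 1/n of the reduced vector, and with 32 nodes the forward pass has five distances, which agrees with the Step 1 to Step 5 remark above.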
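Step S203-2: Establish the unique data communication link between nodes in the logical group according to the reduction information of the nodes.
The purpose of step S203-2 is to establish a unique data communication link between nodes according to the reduction information of the nodes. Note that step S203-2 may be implemented in several ways. The unique data communication links may be established from the reduction information determined at the first pairing distance in steps S203-1-11 and S203-1-12, as shown in FIG. 7; from the reduction information determined at the second pairing distance in steps S203-1-21 and S203-1-22, as shown in FIG. 8; or from the reduction information determined at the third pairing distance in steps S203-1-31 to S203-1-33, as shown in FIG. 9. Alternatively, the reduction information of the nodes paired at the third pairing distance is sent to the nodes paired at the second pairing distance, and the obtained reduction information of the nodes paired at the second pairing distance, together with the received reduction information of the nodes paired at the third pairing distance, is sent to the nodes paired at the first pairing distance, so that every node obtains the reduction result and the unique data communication links between nodes are established, as shown in FIG. 10 and FIG. 11.
FIG. 10 is a schematic diagram of a first connection relationship between nodes in logical groups in an embodiment of the method for establishing a data communication link provided by this application; FIG. 11 is a schematic diagram of a second such connection relationship. Both figures show the connection relationships between nodes in logical groups, and nodes of the same color are connected. Because a connection spans two logical groups and is drawn within one dashed box, the 8 lines inside one dashed box stand for 8 connections joining 16 nodes. For example, in the upper part of FIG. 10, node 1 is connected to node 9; in the lower part of FIG. 10, node 9 occupies the same position as node 2, so node 2 is drawn in place of node 9. This does not mean that node 1 is connected to node 2: the line between node 1 and node 2 stands for the connection between node 1 and node 9. Similarly, the line between node 4 and node 3 stands for the connection between node 4 and node 12. FIG. 10 and FIG. 11 show only the node groups of some logical groups as examples.
Note that, to avoid identical links between nodes in the logical groups, this embodiment may further include:
shifting the determined node identifiers of the nodes in the logical group to obtain the shifted node identifiers of the nodes in the logical group. Specifically, the node identifiers of the nodes of the logical group located on the upper side, or of those located on the lower side, of the bipartite graph of the cluster architecture may be shifted to obtain the shifted node identifiers.
Take the second logical group in FIG. 7: before the shift, the upper-side (x) nodes of the second logical group are numbered [10, 11, 13, 16]; after the shift they are numbered [16, 10, 11, 13]. The upper-side (x) nodes of the third logical group are numbered [18, 19, 21, 24] before the shift and [21, 24, 18, 19] after it. The upper-side (x) nodes of the fourth logical group are numbered [25, 28, 30, 31] before the shift and [31, 25, 28, 30] after it. The shift can be performed while the node identifiers are being determined. For example, the formula for the first logical group applies no shift, [y+B<<0, x+B]; the formula for the second logical group adds a shift, [y+i×B<<1, x+i×B]; the formula for the third logical group adds a shift, [y+i×B<<2, x+i×B]; the formula for the fourth logical group adds a shift, [x+i×B<<3, y+i×B]. Here <<0 means shifting by 0 positions, that is, no shift; <<1 means shifting by 1 position; <<2 means shifting by 2 positions; and <<3 means shifting by 3 positions.
Note that the shift may be applied to the upper side or to the lower side (y). For example, the formula for the first logical group applies no shift, [y+B, x+B<<0]; the formula for the second logical group adds a shift, [y+i×B, x+i×B<<1]; the formula for the third logical group adds a shift, [y+i×B, x+i×B<<2]; the formula for the fourth logical group adds a shift, [x+i×B, y+i×B<<3].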
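The listed before/after examples are consistent with reading the shift as a cyclic rotation of the upper-side identifier list, which the sketch below implements. That reading is an assumption: the `<<` notation in the formulas is not defined further, and the fourth-group example ([25, 28, 30, 31] to [31, 25, 28, 30]) matches a rotation by one position under this reading, so the shift amount is left as a parameter. `rotate_right` is a hypothetical helper name.

```python
def rotate_right(ids, positions):
    """Cyclically rotate a list of node identifiers to the right."""
    positions %= len(ids)
    if positions == 0:
        return list(ids)  # a shift of 0 leaves the identifiers unchanged
    return ids[-positions:] + ids[:-positions]


print(rotate_right([10, 11, 13, 16], 1))  # second logical group, upper side
print(rotate_right([18, 19, 21, 24], 2))  # third logical group, upper side
```

Rotating only one side of the bipartite graph changes which upper-side node faces which lower-side node in each node group, which is how different logical groups can end up on different links.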
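The above is a detailed description of the embodiment of the method for establishing a data communication link provided by this application. Corresponding to that method embodiment, this application also discloses an embodiment of an apparatus for establishing a data communication link; see FIG. 12. Because the apparatus embodiment is substantially similar to the method embodiment, it is described briefly, and the relevant parts of the method embodiment may be consulted. The apparatus embodiment described below is merely illustrative.
As shown in FIG. 12, which is a schematic structural diagram of an embodiment of the apparatus for establishing a data communication link provided by this application, the apparatus includes:
a grouping unit 1201, configured to group the nodes in the cluster architecture to obtain a logical group describing the connection relationships of the nodes and a physical group describing the physical locations of the nodes.
The grouping unit 1201 is specifically configured to group the nodes distributed on the two sides of the bipartite graph in the cluster architecture to obtain the logical group and the physical group.
The grouping unit 1201 includes:
a logical group determining subunit, configured to determine a set of nodes in the bipartite graph that are each connected to a different switching device as the logical group;
a physical group determining subunit, configured to determine a set of nodes in the bipartite graph that are connected to the same switching device as the physical group, where the physical group includes nodes from different logical groups.
The apparatus further includes:
a selection unit, configured to select node information in the first logical group;
a node group dividing unit, configured to divide the nodes in the first logical group, according to the bipartite graph, into node groups each including at least two nodes;
an assignment unit, configured to assign node identifiers to the two nodes of the first node group in the first logical group;
a designated node identifier determining unit, configured to determine the node identifiers of the two nodes as the node identifiers of the set designated nodes.
a determining unit 1202, configured to determine the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group.
The determining unit 1202 has two implementations. Way one includes:
a first generating subunit, configured to generate, according to the node identifiers of the nodes in the first node group in the first logical group, the node identifiers of the nodes in the second node group adjacent to the first node group in the first logical group;
a second generating subunit, configured to generate, according to the node identifiers of the nodes in the second node group, the node identifiers of the nodes in the next node group adjacent to the second node group in the first logical group;
a third generating subunit, configured to generate, once the node identifiers in the first logical group have been generated, the node identifiers of the nodes in the first node group in the second logical group according to the node identifiers of the nodes in the first node group in the first logical group;
a fourth generating subunit, configured to generate the node identifiers of the nodes in the second node group in the second logical group according to the node identifiers of the nodes in the second node group in the first logical group, and so on until the node identifiers of the nodes in all logical groups have been generated.
Way two of the determining unit 1202 includes:
an obtaining subunit, configured to sort the node groups in the logical groups in order to obtain a sorted node group set;
a first determining subunit, configured to determine a first node identifier reference according to the node identifiers of the nodes in the first node group in the node group set;
a first generating subunit, configured to generate the node identifiers of the nodes in the second node group in the node group set according to the first node identifier reference;
a second determining subunit, configured to determine a second node identifier reference according to the node identifiers of the nodes in the second node group;
a third generating subunit, configured to generate the node identifiers of the nodes in the third node group in the node group set according to the second node identifier reference, and so on until the node identifiers of the nodes in all node groups in the node group set have been generated.
an establishing unit 1203, configured to establish, according to the node identifiers, a unique data communication link between nodes in the logical group.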
所述建立单元1203包括:
规约信息获得子单元,用于根据所述节点标识,对所述逻辑组中的节点进行规约操作,获得节点的规约信息;
建立子单元,用于根据节点的规约信息,建立不同物理组中属于同一逻辑组内的节 点之间的唯一数据通信链路。
所述规约信息获得子单元包括:
第一配对子单元,用于根据设定的第一配对距离,对所述逻辑组中的节点进行两两配对;
第一规约获得子单元,用于根据配对的节点之间传输的数据,获得以所述第一配对距离进行配对的节点的规约信息。
第二配对子单元,用于根据设定的第二配对距离,对所述逻辑组中的节点重新两两配对;
第二规约获得子单元,用于根据配对的两个节点之间传输的数据,并基于已经获得的以所述第一配对距离进行配对的节点的规约信息,获得以所述第二配对距离进行配对的节点的规约信息。
第三配对子单元,用于根据设定的第三配对距离,对所述逻辑组中的节点重新两两配对;
第三规约获得子单元,用于根据配对的两个节点之间传输的数据,并基于已经获得的以所述第二配对距离进行配对的节点的规约信息,获得以所述第三配对距离进行配对的节点的规约信息;
第一发送子单元,用于将以所述第三配对距离进行配对的节点的规约信息发送至以所述第二配对距离进行配对的节点;
第二发送子单元,用于将获得以所述第二配对距离进行配对的节点的规约信息和接收的以所述第三配对距离进行配对的节点的规约信息,发送至以所述第一配对距离进行配对的节点。
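Based on the above, in order to prevent the links between nodes in different logical groups from coinciding, the apparatus for establishing a data communication link provided by the present application further includes:
a shifting unit, configured to shift the determined node identifiers of the nodes in the logical group, to obtain shifted node identifiers of the nodes in the logical group.
The shifting unit is specifically configured to shift the node identifiers of the nodes in the logical group that are located on the upper side, or of those located on the lower side, of the bipartite graph of the cluster architecture, to obtain shifted node identifiers of the nodes in the logical group.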
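Based on the above, the node identifiers in the apparatus for establishing a data communication link provided by the present application may be presented in different ways. Accordingly, the apparatus further includes:
a recording unit, configured to record the node identifier information of the nodes in the logical group on the nodes in the physical group connected to the switching device.
Alternatively, the apparatus further includes:
a lookup table establishing unit, configured to establish a node lookup table according to the node identifiers;
a recording unit, configured to record the node identifier information in the node lookup table.
The lookup table establishing unit includes:
an indexing subunit, configured to build an index according to the distance between nodes in the logical group;
a recording subunit, configured to record node identifier information with the same index value in the same node lookup list.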
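The distance-indexed node lookup table can be sketched as follows. The source does not define how the distance between nodes is measured, so this example assumes it is the gap between two nodes' positions in the ordered identifier list of a logical group; the function name `build_lookup_table` and the sample identifiers are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_lookup_table(logical_group_ids):
    """Map a distance index to the list of (node_id, peer_id) pairs.

    Assumption: "distance" is the position gap between two nodes in the
    ordered identifier list of one logical group.
    """
    table = defaultdict(list)
    for (i, a), (j, b) in combinations(enumerate(logical_group_ids), 2):
        # Pairs with the same index value land in the same lookup list.
        table[j - i].append((a, b))
    return dict(table)


table = build_lookup_table([10, 11, 12, 13])
```

A node that needs its peer for a given pairing distance can then read `table[distance]` instead of recomputing the pairing.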
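The above describes the embodiment of the apparatus for establishing a data communication link provided by the present application. Based on the method and the apparatus for establishing a data communication link provided above, the present application also provides a method for determining node identifiers in a cluster architecture. Because part of this determining method is the same as the method embodiment for establishing a data communication link, it is described only in outline here; for details, refer to the description of steps S201 and S202 in the method embodiment for establishing a data communication link.
Referring to Figure 13, which is a flowchart of an embodiment of the method for determining node identifiers in a cluster architecture provided by the present application, the determining method includes: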
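Step S1301: group the nodes in the cluster architecture to obtain logical groups describing node connection relationships and physical groups describing node physical locations.
First, the cluster architecture in this embodiment uses a bipartite-graph topology. As shown in Figure 1, the switching devices in the cluster architecture are distributed on the two sides of the bipartite graph, with nodes connected to the switching devices on the upper side and nodes connected to the switching devices on the opposite, lower side. It is a network cluster architecture in which upper-side and lower-side nodes, upper-side nodes among themselves, and lower-side nodes among themselves can communicate data through the switching devices.
The nodes connected to the switching devices on the upper and lower sides of the bipartite graph may include computing node devices, storage node devices, network node devices (network interface cards), coprocessor node devices, system node devices, and so on. The coprocessor node devices may include, for example, GPUs, ASICs, FPGAs, and DSPs.
Step S1301 is specifically implemented as follows:
The nodes distributed on the two sides of the bipartite graph in the cluster architecture are grouped to obtain logical groups describing node connection relationships and physical groups describing node physical locations. The logical group may represent the logical connection relationship between nodes, and the physical group may represent the physical location relationship of nodes.
In this embodiment, a set of nodes in the bipartite graph that are each connected to a different switching device may be determined as a logical group, and a set of nodes in the bipartite graph that are connected to the same switching device may be determined as the physical group, wherein the physical group includes nodes from different logical groups. For ease of understanding, a 4×8 network cluster architecture is taken as an example; see Figure 3. For details, refer to the description of step S201 in the method for establishing a data communication link, which is not repeated here.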
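The grouping in step S1301 can be sketched in a few lines. The text only states that a logical group collects nodes attached to different switching devices and a physical group collects nodes attached to the same switching device; this example additionally assumes that nodes occupying the same port index on different switching devices form one logical group. The function `group_nodes`, the tuple layout, and the sample names are all hypothetical.

```python
from collections import defaultdict


def group_nodes(attachments):
    """Split nodes into logical and physical groups (illustrative only).

    attachments: iterable of (node, switch, port) tuples.
    Assumption: the port index selects the logical group.
    """
    logical = defaultdict(list)   # port index -> one node per switching device
    physical = defaultdict(list)  # switching device -> nodes attached to it
    for node, switch, port in attachments:
        logical[port].append(node)
        physical[switch].append(node)
    return dict(logical), dict(physical)


# Four switching devices with two ports each, purely for illustration.
attachments = [(f"n{s}{p}", f"sw{s}", p) for s in range(4) for p in range(2)]
logical_groups, physical_groups = group_nodes(attachments)
```

Under this assumption every physical group contains exactly one node from each logical group, which matches the statement that a physical group includes nodes from different logical groups.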
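Step S1302: determine the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group.
The purpose of step S1302 is to fix an identifier baseline for one node in advance, so that the node identifiers of the other nodes can be derived from it. By setting the identifiers of the designated nodes in a logical group, the node identifiers of the remaining nodes in the logical group are obtained. A node identifier baseline therefore needs to be set, that is, the method may further include:
selecting node information in the first logical group;
dividing the nodes in the first logical group, according to the bipartite graph, into node groups each including at least two nodes;
assigning identifiers to the two nodes of the first node group in the first logical group;
determining the identifiers of the two nodes as the set designated node identifiers.
For how the designated node identifiers are determined in step S1302, refer to the description of steps S20a to S20d in the above method embodiment for establishing a data communication link, which is not repeated here.
After the node identifiers of the designated nodes are determined, the node identifiers of the other nodes need to be obtained from them, which lays the foundation for establishing the communication link relationships between nodes.
In this embodiment, the node identifier may be an encoding. Other identification schemes that can distinguish nodes may of course be used as well, for example shape identifiers or color identifiers.
The above explains how to set the node identifiers of the designated nodes. After these are obtained, the node identifiers of the node members in the other logical groups need to be determined from them, so that the node identifiers of the node members in each physical group also become known.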
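Step S1302 may determine the node identifiers of the other nodes from those of the designated nodes in two ways, although it is not limited to these two. The two approaches are summarized first and elaborated later. Approach one includes:
Step S1302-1a: according to the node identifiers of the nodes in the first node group of the first logical group, generate the node identifiers of the nodes in the second node group that is adjacent to the first node group in the first logical group;
Step S1302-1b: according to the node identifiers of the nodes in the second node group, generate the node identifiers of the nodes in the next node group that is adjacent to the second node group in the first logical group;
Step S1302-1c: once the node identifiers in the first logical group have all been generated, generate the node identifiers of the nodes in the first node group of the second logical group according to the node identifiers of the nodes in the first node group of the first logical group;
Step S1302-1d: according to the node identifiers of the nodes in the second node group of the first logical group, generate the node identifiers of the nodes in the second node group of the second logical group, and so on until the node identifiers of the nodes in all logical groups have been generated.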
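Approach one can be illustrated with a short sketch. The derivation rule is not given in this part of the text, so the example assumes integer identifiers stored as an (upper, lower) pair per node group, a fixed increment between adjacent node groups, and a fixed increment between consecutive logical groups. The function name and the parameters `group_step` and `logical_step` are hypothetical.

```python
def assign_ids_approach_one(seed, groups_per_logical, num_logical,
                            group_step, logical_step):
    """Return ids[l][g] = (upper_id, lower_id) for node group g of logical group l.

    Assumption: identifiers are integers and each derivation adds a fixed step.
    """
    # Steps S1302-1a / 1b: walk the first logical group, deriving each node
    # group from the adjacent one that was generated before it.
    first = [seed]
    for _ in range(1, groups_per_logical):
        upper, lower = first[-1]
        first.append((upper + group_step, lower + group_step))
    ids = [first]
    # Steps S1302-1c / 1d: node group g of the next logical group is derived
    # from node group g of the previous logical group (assumed chaining).
    for _ in range(1, num_logical):
        previous = ids[-1]
        ids.append([(u + logical_step, w + logical_step) for u, w in previous])
    return ids


ids = assign_ids_approach_one(seed=(0, 1), groups_per_logical=4,
                              num_logical=2, group_step=2, logical_step=8)
```

With these illustrative steps every node receives a distinct identifier, which is the only property the later link-establishing step depends on.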
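Approach two includes:
Step S1302-2a: sort the node groups in the logical groups in order, to obtain a sorted node group set;
Step S1302-2b: determine a first node identifier baseline according to the node identifiers of the nodes in the first node group of the node group set;
Step S1302-2c: generate, according to the first node identifier baseline, the node identifiers of the nodes in the second node group of the node group set;
Step S1302-2d: determine a second node identifier baseline according to the node identifiers of the nodes in the second node group;
Step S1302-2e: generate, according to the second node identifier baseline, the node identifiers of the nodes in the third node group of the node group set, and so on until the node identifiers of the nodes in all node groups of the node group set have been generated.
The above is only a summary of how step S1302 determines the node identifiers of the other nodes in the logical group from those of the designated nodes. For details, refer to the detailed description of step S202 in the above method embodiment for establishing a data communication link.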
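Approach two, in which each node group's identifiers yield the baseline for the next node group, can be sketched as below. How a baseline is computed is not specified here, so the example assumes the baseline is the largest identifier assigned so far and that the next node group receives the two following integers. The function name and sample group names are hypothetical.

```python
def assign_ids_approach_two(ordered_groups, seed_ids):
    """Assign (upper_id, lower_id) to each node group in sorted order.

    ordered_groups: node group names already sorted (step S1302-2a).
    seed_ids: identifiers of the two designated nodes in the first node group.
    Assumption: the baseline is the largest identifier of the previous group.
    """
    assigned = {ordered_groups[0]: seed_ids}
    baseline = max(seed_ids)                 # step S1302-2b
    for name in ordered_groups[1:]:
        ids = (baseline + 1, baseline + 2)   # steps S1302-2c / 2e
        assigned[name] = ids
        baseline = max(ids)                  # step S1302-2d
    return assigned


assigned = assign_ids_approach_two(["g0", "g1", "g2"], (0, 1))
```

Compared with approach one, this variant needs only one pass over a single sorted list and carries a single running baseline.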
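The above is a detailed description of the embodiment of the method for determining node identifiers in a cluster architecture provided by the present application. Corresponding to that method embodiment, the present application also discloses an embodiment of an apparatus for determining node identifiers in a cluster architecture; see Figure 14. Because the apparatus embodiment is substantially similar to the method embodiment, it is described briefly; for related details, refer to the corresponding parts of the method embodiment. The apparatus embodiment described below is merely illustrative.
Referring to Figure 14, which is a schematic structural diagram of an embodiment of the apparatus for determining node identifiers in a cluster architecture provided by the present application, the determining apparatus includes: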
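a grouping unit 1401, configured to group the nodes in a cluster architecture to obtain logical groups describing node connection relationships and physical groups describing node physical locations.
The grouping unit 1401 is specifically configured to group the nodes distributed on the two sides of the bipartite graph in the cluster architecture, to obtain logical groups describing node connection relationships and physical groups describing node physical locations.
The grouping unit 1401 may include:
a logical group determining subunit, configured to determine a set of nodes in the bipartite graph that are each connected to a different switching device as the logical group;
a physical group determining subunit, configured to determine a set of nodes in the bipartite graph that are connected to the same switching device as the physical group, wherein the physical group includes nodes from different logical groups.
The apparatus further includes:
a selecting unit, configured to select node information in a first logical group;
a node group dividing unit, configured to divide the nodes in the first logical group, according to the bipartite graph, into node groups each including at least two nodes;
an assigning unit, configured to assign node identifiers to the two nodes of the first node group in the first logical group;
a designated node identifier determining unit, configured to determine the node identifiers of the two nodes as the node identifiers of the set designated nodes.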
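a determining unit 1402, configured to determine the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group.
The determining unit 1402 has two implementations. Approach one includes:
a first generating subunit, configured to generate, according to the node identifiers of the nodes in the first node group of the first logical group, the node identifiers of the nodes in the second node group that is adjacent to the first node group in the first logical group;
a second generating subunit, configured to generate, according to the node identifiers of the nodes in the second node group, the node identifiers of the nodes in the next node group that is adjacent to the second node group in the first logical group;
a third generating subunit, configured to generate, once the node identifiers in the first logical group have all been generated, the node identifiers of the nodes in the first node group of the second logical group according to the node identifiers of the nodes in the first node group of the first logical group;
a fourth generating subunit, configured to generate, according to the node identifiers of the nodes in the second node group of the first logical group, the node identifiers of the nodes in the second node group of the second logical group, and so on until the node identifiers of the nodes in all logical groups have been generated.
Approach two of the determining unit 1402 includes:
an obtaining subunit, configured to sort the node groups in the logical groups in order, to obtain a sorted node group set;
a first determining subunit, configured to determine a first node identifier baseline according to the node identifiers of the nodes in the first node group of the node group set;
a first generating subunit, configured to generate, according to the first node identifier baseline, the node identifiers of the nodes in the second node group of the node group set;
a second determining subunit, configured to determine a second node identifier baseline according to the node identifiers of the nodes in the second node group;
a third generating subunit, configured to generate, according to the second node identifier baseline, the node identifiers of the nodes in the third node group of the node group set, and so on until the node identifiers of the nodes in all node groups of the node group set have been generated.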
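The above describes the embodiment of the apparatus for determining node identifiers in a cluster architecture provided by the present application. The description is given only in outline; for details, refer to the descriptions of the method embodiment and the apparatus embodiment for establishing a data communication link above.
As artificial intelligence and deep learning technology continues to develop, the performance demands on cluster architectures keep rising. To avoid a drop in data processing speed caused by problems such as congestion during data communication, the present application further provides a method for establishing data communication links in an artificial intelligence cluster architecture. Referring to Figure 15, which is a flowchart of an embodiment of that method, the method includes: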
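Step S1501: group the nodes used for data communication in the artificial intelligence cluster architecture, to obtain logical groups describing node connection relationships and physical groups describing node physical locations in the artificial intelligence cluster architecture;
Step S1502: determine the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group;
Step S1503: establish, according to the node identifiers, unique data communication links between the nodes in the logical group.
The purpose of the above method for establishing data communication links in an artificial intelligence cluster architecture is to apply the uniqueness of data communication links to the artificial intelligence cluster architecture, thereby keeping the performance of the artificial intelligence cluster architecture stable. For the specific implementation, refer to the description of the above method embodiment for establishing a data communication link.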
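Based on the embodiment of the method for establishing data communication links in an artificial intelligence cluster architecture provided by the present application, the present application also provides an embodiment of an apparatus for establishing data communication links in an artificial intelligence cluster architecture. Referring to Figure 16, the apparatus includes:
a grouping unit 1601, configured to group the nodes used for data communication in the artificial intelligence cluster architecture, to obtain logical groups describing node connection relationships and physical groups describing node physical locations in the artificial intelligence cluster architecture;
a determining unit 1602, configured to determine the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group;
an establishing unit 1603, configured to establish, according to the node identifiers, unique data communication links between the nodes in the logical group.
Likewise, for the specific implementation of this apparatus, refer to the descriptions of the method embodiment and the apparatus embodiment for establishing a data communication link above, which are not repeated here.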
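Based on the above, the present application also provides a computer storage device for storing data generated by a network platform and a program for processing the data generated by the network platform;
when read and executed by a processor, the program performs the steps of the method for establishing a data communication link described above, or the steps of the method for determining node identifiers in a cluster architecture described above, or the steps of the method for establishing data communication links in an artificial intelligence cluster architecture described above.
Based on the above, the present application also provides an electronic device, including:
a processor;
a memory, configured to store a program for processing data generated by a network platform, wherein the program, when read and executed by the processor, performs the steps of the method for establishing a data communication link described above, or the steps of the method for determining node identifiers in a cluster architecture described above, or the steps of the method for establishing data communication links in an artificial intelligence cluster architecture described above.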
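In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include non-persistent memory in computer-readable media, in forms such as random access memory (RAM) and/or non-volatile memory, for example read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
1. Computer-readable media include persistent and non-persistent, removable and non-removable media, which may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
2. Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.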
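Although the present application is disclosed above by way of preferred embodiments, these are not intended to limit the present application. Any person skilled in the art may make possible changes and modifications without departing from the spirit and scope of the present application; the scope of protection of the present application shall therefore be as defined by the claims of the present application.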

Claims (27)

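  1. A method for establishing a data communication link, characterized by comprising:
    grouping the nodes in a cluster architecture to obtain logical groups describing node connection relationships and physical groups describing node physical locations;
    determining the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group;
    establishing, according to the node identifiers, unique data communication links between the nodes in the logical group.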
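  2. The method for establishing a data communication link according to claim 1, characterized in that grouping the nodes used for data communication in the cluster architecture to obtain logical groups describing node connection relationships and physical groups describing node physical locations comprises:
    grouping the nodes distributed on the two sides of the bipartite graph in the cluster architecture, to obtain logical groups describing node connection relationships and physical groups describing node physical locations.
  3. The method for establishing a data communication link according to claim 2, characterized in that grouping the nodes distributed on the two sides of the bipartite graph in the cluster architecture, to obtain logical groups describing node connection relationships and physical groups describing node physical locations in the cluster architecture, comprises:
    determining a set of nodes in the bipartite graph that are each connected to a different switching device as the logical group;
    determining a set of nodes in the bipartite graph that are connected to the same switching device as the physical group, wherein the physical group includes nodes from different logical groups.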
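  4. The method for establishing a data communication link according to claim 2, characterized by further comprising:
    selecting node information in a first logical group;
    dividing the nodes in the first logical group, according to the bipartite graph, into node groups each including at least two nodes;
    assigning node identifiers to the two nodes of the first node group in the first logical group;
    determining the node identifiers of the two nodes as the node identifiers of the set designated nodes.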
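  5. The method for establishing a data communication link according to claim 4, characterized in that determining the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group comprises:
    generating, according to the node identifiers of the nodes in the first node group of the first logical group, the node identifiers of the nodes in the second node group that is adjacent to the first node group in the first logical group;
    generating, according to the node identifiers of the nodes in the second node group, the node identifiers of the nodes in the next node group that is adjacent to the second node group in the first logical group;
    once the node identifiers in the first logical group have all been generated, generating the node identifiers of the nodes in the first node group of the second logical group according to the node identifiers of the nodes in the first node group of the first logical group;
    generating, according to the node identifiers of the nodes in the second node group of the first logical group, the node identifiers of the nodes in the second node group of the second logical group, and so on until the node identifiers of the nodes in all logical groups have been generated.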
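  6. The method for establishing a data communication link according to claim 4, characterized in that determining the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group comprises:
    sorting the node groups in the logical groups in order, to obtain a sorted node group set;
    determining a first node identifier baseline according to the node identifiers of the nodes in the first node group of the node group set;
    generating, according to the first node identifier baseline, the node identifiers of the nodes in the second node group of the node group set;
    determining a second node identifier baseline according to the node identifiers of the nodes in the second node group;
    generating, according to the second node identifier baseline, the node identifiers of the nodes in the third node group of the node group set, and so on until the node identifiers of the nodes in all node groups of the node group set have been generated.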
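  7. The method for establishing a data communication link according to claim 1, characterized in that establishing, according to the node identifiers, unique data communication links between the nodes in the logical group comprises:
    performing a reduction operation on the nodes in the logical group according to the node identifiers, to obtain reduction information of the nodes;
    establishing, according to the reduction information of the nodes, unique data communication links between the nodes in the logical group.
  8. The method for establishing a data communication link according to claim 7, characterized in that performing a reduction operation on the nodes in the logical group according to the node identifiers, to obtain reduction information of the nodes, comprises:
    pairing the nodes in the logical group two by two according to a set first pairing distance;
    obtaining, according to the data transmitted between the paired nodes, the reduction information of the nodes paired at the first pairing distance.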
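  9. The method for establishing a data communication link according to claim 8, characterized by further comprising:
    pairing the nodes in the logical group two by two again according to a set second pairing distance;
    obtaining the reduction information of the nodes paired at the second pairing distance, according to the data transmitted between the two paired nodes and based on the already obtained reduction information of the nodes paired at the first pairing distance.
  10. The method for establishing a data communication link according to claim 9, characterized by further comprising:
    pairing the nodes in the logical group two by two again according to a set third pairing distance;
    obtaining the reduction information of the nodes paired at the third pairing distance, according to the data transmitted between the two paired nodes and based on the already obtained reduction information of the nodes paired at the second pairing distance;
    sending the reduction information of the nodes paired at the third pairing distance to the nodes paired at the second pairing distance;
    sending the obtained reduction information of the nodes paired at the second pairing distance, together with the received reduction information of the nodes paired at the third pairing distance, to the nodes paired at the first pairing distance.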
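  11. The method for establishing a data communication link according to claim 1, characterized by further comprising:
    shifting the determined node identifiers of the nodes in the logical group, to obtain shifted node identifiers of the nodes in the logical group.
  12. The method for establishing a data communication link according to claim 11, characterized in that shifting the determined node identifiers of the nodes in the logical group, to obtain shifted node identifiers of the nodes in the logical group, comprises:
    shifting the node identifiers of the nodes in the logical group that are located on the upper side, or of those located on the lower side, of the bipartite graph of the cluster architecture, to obtain shifted node identifiers of the nodes in the logical group.
  13. The method for establishing a data communication link according to claim 1, characterized by further comprising:
    recording the node identifier information of the nodes in the logical group on the nodes in the physical group connected to a switching device.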
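  14. The method for establishing a data communication link according to claim 1, characterized by further comprising:
    establishing a node lookup table according to the node identifiers;
    recording the information of the node identifiers in the node lookup table.
  15. The method for establishing a data communication link according to claim 14, characterized in that establishing a node lookup table according to the node identifiers of the nodes in the logical group comprises:
    building an index according to the distance between nodes in the logical group;
    recording node identifier information with the same index value in the same node lookup list.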
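  16. An apparatus for establishing a data communication link, characterized by comprising:
    a grouping unit, configured to group the nodes in a cluster architecture to obtain logical groups describing node connection relationships and physical groups describing node physical locations;
    a determining unit, configured to determine the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group;
    an establishing unit, configured to establish, according to the node identifiers, unique data communication links between the nodes in the logical group.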
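  17. A method for determining node identifiers in a cluster architecture, characterized by comprising:
    grouping the nodes in a cluster architecture to obtain logical groups describing node connection relationships and physical groups describing node physical locations;
    determining the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group.
  18. The method for determining node identifiers in a cluster architecture according to claim 17, characterized in that grouping the nodes in the cluster architecture to obtain logical groups describing node connection relationships and physical groups describing node physical locations comprises:
    grouping the nodes distributed on the two sides of the bipartite graph in the cluster architecture, to obtain logical groups describing node connection relationships and physical groups describing node physical locations.
  19. The method for determining node identifiers in a cluster architecture according to claim 18, characterized in that grouping the nodes distributed on the two sides of the bipartite graph in the cluster architecture, to obtain logical groups describing node connection relationships and physical groups describing node physical locations, comprises:
    determining a set of nodes in the bipartite graph that are each connected to a different switching device as the logical group;
    determining a set of nodes in the bipartite graph that are connected to the same switching device as the physical group, wherein the physical group includes nodes from different logical groups.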
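  20. The method according to claim 18, characterized by further comprising:
    selecting node information in a first logical group;
    dividing the nodes in the first logical group, according to the bipartite graph, into node groups each including at least two nodes;
    assigning identifiers to the two nodes of the first node group in the first logical group;
    determining the identifiers of the two nodes as the set designated node identifiers.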
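  21. The method for determining node identifiers in a cluster architecture according to claim 20, characterized in that determining the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group comprises:
    generating, according to the node identifiers of the nodes in the first node group of the first logical group, the node identifiers of the nodes in the second node group that is adjacent to the first node group in the first logical group;
    generating, according to the node identifiers of the nodes in the second node group, the node identifiers of the nodes in the next node group that is adjacent to the second node group in the first logical group;
    once the node identifiers in the first logical group have all been generated, generating the node identifiers of the nodes in the first node group of the second logical group according to the node identifiers of the nodes in the first node group of the first logical group;
    generating, according to the node identifiers of the nodes in the second node group of the first logical group, the node identifiers of the nodes in the second node group of the second logical group, and so on until the node identifiers of the nodes in all logical groups have been generated.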
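  22. The method for determining node identifiers in a cluster architecture according to claim 20, characterized in that determining the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group comprises:
    sorting the node groups in the logical groups in order, to obtain a sorted node group set;
    determining a first node identifier baseline according to the node identifiers of the nodes in the first node group of the node group set;
    generating, according to the first node identifier baseline, the node identifiers of the nodes in the second node group of the node group set;
    determining a second node identifier baseline according to the node identifiers of the nodes in the second node group;
    generating, according to the second node identifier baseline, the node identifiers of the nodes in the third node group of the node group set, and so on until the node identifiers of the nodes in all node groups of the node group set have been generated.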
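  23. An apparatus for determining node identifiers in a cluster architecture, characterized by comprising:
    a grouping unit, configured to group the nodes in a cluster architecture to obtain logical groups describing node connection relationships and physical groups describing node physical locations;
    a determining unit, configured to determine the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group.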
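  24. A method for establishing data communication links in an artificial intelligence cluster architecture, characterized by comprising:
    grouping the nodes used for data communication in the artificial intelligence cluster architecture, to obtain logical groups describing node connection relationships and physical groups describing node physical locations in the artificial intelligence cluster architecture;
    determining the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group;
    establishing, according to the node identifiers, unique data communication links between the nodes in the logical group.
  25. An apparatus for establishing data communication links in an artificial intelligence cluster architecture, characterized by comprising:
    a grouping unit, configured to group the nodes used for data communication in the artificial intelligence cluster architecture, to obtain logical groups describing node connection relationships and physical groups describing node physical locations in the artificial intelligence cluster architecture;
    a determining unit, configured to determine the node identifiers of the other nodes in the logical group according to the node identifiers of the designated nodes set in the logical group;
    an establishing unit, configured to establish, according to the node identifiers, unique data communication links between the nodes in the logical group.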
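  26. A computer storage medium for storing data generated by a network platform and a program for processing the data generated by the network platform;
    wherein the program, when read and executed by the processor, performs the steps of the method for establishing a data communication link according to any one of claims 1 to 15, or the steps of the method for determining node identifiers in a cluster architecture according to any one of claims 17 to 22, or the steps of the apparatus for establishing data communication links in an artificial intelligence cluster architecture according to claim 25.
  27. An electronic device, comprising:
    a processor;
    a memory, configured to store a program for processing data generated by a network platform, wherein the program, when read and executed by the processor, performs the steps of the method for establishing a data communication link according to any one of claims 1 to 15, or the steps of the method for determining node identifiers in a cluster architecture according to any one of claims 17 to 22, or the steps of the establishing apparatus according to claim 25.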
PCT/CN2020/091221 2019-05-29 2020-05-20 通信链路的建立方法及装置,节点标识确定方法及装置 WO2020238719A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910457828.2A CN110581880B (zh) 2019-05-29 2019-05-29 通信链路的建立方法及装置,节点标识确定方法及装置
CN201910457828.2 2019-05-29

Publications (1)

Publication Number Publication Date
WO2020238719A1 (zh) 2020-12-03

Family

ID=68811033

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/091221 WO2020238719A1 (zh) 2019-05-29 2020-05-20 通信链路的建立方法及装置,节点标识确定方法及装置

Country Status (2)

Country Link
CN (1) CN110581880B (zh)
WO (1) WO2020238719A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110581880B (zh) * 2019-05-29 2021-09-07 阿里巴巴集团控股有限公司 通信链路的建立方法及装置,节点标识确定方法及装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102761471A (zh) * 2011-04-29 2012-10-31 无锡江南计算技术研究所 无线计算互连网络及座标空间变换方法
CN107079045A (zh) * 2014-10-14 2017-08-18 微软技术许可有限责任公司 使用集群的节点识别
US20180041555A1 (en) * 2016-08-03 2018-02-08 Big Switch Networks, Inc. Systems and methods to manage multicast traffic
EP3346768A1 (en) * 2017-01-10 2018-07-11 Quantek, Inc. Method of performing timing arrangement in a mesh network
CN109714183A (zh) * 2017-10-26 2019-05-03 阿里巴巴集团控股有限公司 一种集群中的数据处理方法及装置
CN110581880A (zh) * 2019-05-29 2019-12-17 阿里巴巴集团控股有限公司 通信链路的建立方法及装置,节点标识确定方法及装置

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69840844D1 (de) * 1998-08-10 2009-07-02 Ibm Abstraktion einer "PNNI" Topologie
US7042846B2 (en) * 2000-01-27 2006-05-09 International Business Machines Corporation Restrictive costs in network systems
BRPI0520582A2 (pt) * 2005-10-11 2009-06-13 Ericsson Telefon Ab L M método para gerar árvores de distribuição em uma rede
US7865551B2 (en) * 2006-05-05 2011-01-04 Sony Online Entertainment Llc Determining influential/popular participants in a communication network
US20080256056A1 (en) * 2007-04-10 2008-10-16 Yahoo! Inc. System for building a data structure representing a network of users and advertisers
CN101311917B (zh) * 2007-05-24 2011-04-06 中国科学院过程工程研究所 一种面向粒子模型的多层直连集群并行计算系统
US20100205057A1 (en) * 2009-02-06 2010-08-12 Rodney Hook Privacy-sensitive methods, systems, and media for targeting online advertisements using brand affinity modeling
CN102571591B (zh) * 2012-01-18 2014-09-17 中国人民解放军国防科学技术大学 实现标识网络通信的方法、边缘路由器及系统
CN103428045A (zh) * 2012-05-25 2013-12-04 华为技术有限公司 连通性检测方法、装置和系统
US9432301B2 (en) * 2013-04-29 2016-08-30 Telefonaktiebolaget L M Ericsson (Publ) Defining disjoint node groups for virtual machines with pre-existing placement policies
US10341221B2 (en) * 2015-02-26 2019-07-02 Cisco Technology, Inc. Traffic engineering for bit indexed explicit replication
CN104618980B (zh) * 2015-03-05 2018-09-28 江苏中科羿链通信技术有限公司 无线多跳链状网的路由实现方法
CN106788664A (zh) * 2015-11-23 2017-05-31 上海交通大学 星座通信网中基于完美匹配模型的链路分配方法
WO2017186720A1 (en) * 2016-04-28 2017-11-02 Fairflow Technologies Holding B.V. Distributing and aggregating resource data in a network
CN108769842A (zh) * 2018-05-17 2018-11-06 北京邮电大学 多播业务保护构建方法及装置
CN109792406B (zh) * 2018-07-27 2021-06-18 袁振南 服务器集群中的消息传递方法、装置及存储介质
US11245617B1 (en) * 2018-12-28 2022-02-08 Juniper Networks, Inc. Compressed routing header


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HOU, JIE ET AL.: "A Framework for Identifier-based Routing for Future Internet", 2009 EIGHTH IEEE INTERNATIONAL CONFERENCE ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, 31 December 2009 (2009-12-31), XP031610226 *

Also Published As

Publication number Publication date
CN110581880A (zh) 2019-12-17
CN110581880B (zh) 2021-09-07

Similar Documents

Publication Publication Date Title
Jimenez et al. On the controller placement for designing a distributed SDN control layer
US20100027442A1 (en) Constructing scalable overlays for pub-sub with many topics: the greedy join-leave algorithm
US10404576B2 (en) Constrained shortest path determination in a network
US10038623B2 (en) Reducing flooding of link state changes in networks
EP3224995A1 (en) Modeling a border gateway protocol network
WO2022012576A1 (zh) 路径规划方法、装置、路径规划设备及存储介质
US11082358B2 (en) Network path measurement method, apparatus, and system
WO2020238719A1 (zh) 通信链路的建立方法及装置,节点标识确定方法及装置
CN105474168A (zh) 网络装置执行的数据处理方法和相关设备
Guo et al. DCube: A family of network structures for containerized data centers using dual-port servers
US10637739B2 (en) Network topology system and building method for topologies and routing tables thereof
CN106209559A (zh) 一种建立组播隧道的方法和装置
CN104219163B (zh) 一种基于动态副本法和虚拟节点法的节点动态前移的负载均衡方法
Tato et al. Designing overlay networks for decentralized clouds
US20180219746A1 (en) Cost Management Against Requirements for the Generation of a NoC
CN114237985B (zh) 修复纠删码存储系统中失效存储块的方法及相关设备
CN116915708A (zh) 路由数据包的方法、处理器及可读存储介质
Toda et al. Autonomous and distributed construction of locality aware skip graph
CN105072047B (zh) 一种报文传输及处理方法
KR102050828B1 (ko) 병렬 연산을 이용한 오픈 가상 스위치의 가속화 방법 및 이를 이용한 오픈 가상 스위치
US10084725B2 (en) Extracting features from a NoC for machine learning construction
CN115567541B (zh) 区块链网络、节点集合的维护方法及装置
WO2021111491A1 (ja) 分散深層学習システムおよび分散深層学習方法
WO2021095196A1 (ja) 分散深層学習システムおよび分散深層学習方法
Cao et al. Research on Improvement of Routing Algorithm Kademlia in IPFS

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20813357

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20813357

Country of ref document: EP

Kind code of ref document: A1