WO2023130656A1

WO2023130656A1 - Method for generating heterougeneous multi-node interconnection topology, and storage medium

Info

Publication number: WO2023130656A1
Application number: PCT/CN2022/096236
Authority: WO
Inventors: 杨宏斌; 金良; 胡克坤; 赵雅倩; 董刚; 刘海威; 蒋东东; 晁银银
Original assignee: 苏州浪潮智能科技有限公司
Priority date: 2022-01-10
Filing date: 2022-05-31
Publication date: 2023-07-13
Also published as: CN114050975A; CN114050975B

Abstract

The present application relates to a method for generating a heterogeneous multi-node interconnection topology, and a storage medium. The method comprises the following steps: on the basis of a graph convolutional network model, feature extraction node information, and a topological structure, acquiring node information, and a low-dimensional vector representation of the topological structure; inputting a performance parameter and the low-dimensional vector representation to a fully connected layer, so as to generate feature integration information; inputting the feature integration information into a preset generation network, so as to generate a heterogeneous multi-node interconnection topological structure; and acquiring a feature value of the heterogeneous multi-node interconnection topological structure, so as to ensure that the heterogeneous multi-node interconnection topological structure meets a preset accuracy requirement.

Description

A heterogeneous multi-node interconnection topology generation method and storage medium

Cross References to Related Applications

This application claims the priority of the Chinese patent application submitted to the China Patent Office on January 10, 2022, with the application number 202210024578.5, and the application name is "a heterogeneous multi-node interconnection topology generation method and storage medium", the entire content of which is passed References are incorporated in this application.

technical field

The present application relates to the technical field of computer networks, in particular to a heterogeneous multi-node interconnection topology generation method and storage medium.

Background technique

In the current heterogeneous computing field, there are various types of computing devices, such as CPUs, GPUs, FPGAs, and dedicated ICs. The computing power of a single computing node is different, as large as a server, including multiple CPUs and multiple GPU computing cards, as small as a single dedicated computing chip, including hundreds or thousands of PEs; the interface is different, such as the QPI interface of Intel CPU, Nvidia GPU Nvlink of FPGA, SRIO of FPGA; multi-node interconnection application scenarios are different, such as supercomputer, distributed computing, ultra-heterogeneous platform, network-on-chip, many-core CPU multi-core architecture and multi-PE interconnection of heterogeneous acceleration chip, among which supercomputer, Distributed computing and ultra-heterogeneous platforms use device nodes, such as CPUs, GPUs, FPGAs, and dedicated ICs; network-on-chips, many-core CPU multi-core architectures, and heterogeneous acceleration chips use core nodes, such as CPU cores, CUDA cores, and PE arrays wait.

The interconnection methods of each computing node are also various, such as: shared bus, Crossbar switching matrix (as shown in Figure 1 and Figure 2), Ring (as shown in Figure 3 is the topology diagram of the bus and Ring), Star connection, Mesh and Torus (as shown in Figure 4 is a schematic diagram of a 2D Mesh and 2D Torus distributed switching matrix, and Figure 5 is a schematic diagram of a 2D Torus structure), etc. Usually, the network connecting multiple independent computers is called Network, the interconnection network inside a chip or between multiple chips is called Fabric, and the large-scale interconnection network embedded in a single chip to connect multiple different modules in a chip Called NoC. Different from computer networks, more characteristic technologies can be implemented in NoC, such as QoS (Quality of Service), which can decide which request to forward first and which one to forward later. The receiving and sending ports of each node Crossbar can increase the queue buffer to realize QoS priority control; realize more advanced flow control strategy, make full use of queues; use more advanced routing algorithm and congestion judgment algorithm to calculate which way to go to the target node more smoothly. Commonly used Fabric topologies such as HyperCube are also used by Intel QPI, Fat Tree topology is also used by Tianhe II supercomputer to connect a large number of computer nodes, Pyramid topology, Butterfly topology, Intel in Its 12-core Ivy Bridge CPU micro-architecture is connected in series with stringed rings and triple rings (Triple Ring) used by its 12 cores. Ringed Cube and ClosNetwork topology etc. Figure 7 shows a schematic diagram of the internal architecture of an image processing chip. Local Router in the figure refers to a local router, Global Router refers to a global router, Hierarchical Star refers to a hierarchical star, and RISC refers to Reduced Instruction-Set Computing (reduced instruction Set computer), IIE refers to Integrated Information Environment (integrated information environment), ME refers to Motion Estimation (motion estimation), ST refers to store (storage), SM refers to Shared Memory (shared memory), PMC refers to Power Management Controller (power management controller), VAE refers to Variational autoEncoder (variational automatic encoder), GTMU refers to GSM Transmission Timing Management Unit for BBU (GSM main control transmission unit), OGW refers to Originating Gateway device (originating gateway device ), FMP refers to Functional Multiprocessor Architecture (functional multiprocessor structure), VPE refers to Vector Processing Element (vector processing unit), LTMU refers to Local Task Management Units (local task management unit), SPE refers to Scalar Processing Element ( Scalar processing unit), FEC refers to Forwarding Equivalence Class (forwarding equivalence class). The chip internally uses 6 Crossbar Switches to form a star network. At the same time, 4 13x13 Crossbars are connected in series to form a Ring. The overall hybrid topology is adopted. Among them, the inventor realized that due to the different computing power provided by different computing devices or computing cores, the interface types, numbers and bandwidths are different, and the interconnection topologies between computing nodes are various. High computing power, multiple interfaces, and multiple interconnect lines correspond to high power consumption and cost. If the computing nodes and the interconnection do not match, the computing nodes or interconnect lines will be idle. Therefore, when deploying a computing task or designing a heterogeneous multi-node hardware circuit , how to choose the type and number of nodes, which topology to use, and how to interconnect each node to make it as low as possible in the case of satisfying the computing performance is an optimization problem.

That is to say, different from the units and connection relationships determined in the netlist of chip placement and routing tasks, the heterogeneous multi-node topology selects computing units and connection relationships according to computing tasks. Since the performance of each computing node is different, the cost is also different. If there are many interconnecting lines between two nodes, the communication bandwidth will be high, but the cost will also be high. Larger, it will also increase the probability of channel congestion.

Contents of the invention

The present application provides a method for generating a heterogeneous multi-node interconnection topology, including the following steps:

Based on the graph convolutional network model, feature extraction node information and topology to obtain low-dimensional vector representation of node information and topology;

Input performance parameters and low-dimensional vector representations to fully connected layers to generate feature integration information;

Input feature integration information into a preset generation network to generate a heterogeneous multi-node interconnection topology; and

The eigenvalues of the heterogeneous multi-node interconnection topology are obtained to ensure that the heterogeneous multi-node interconnection topology meets the preset accuracy requirements.

In one embodiment of the present application, the generating network specifically includes: an upsampling layer, a convolutional layer, a fully connected layer, a batch normalization layer, a modified linear unit, and a Sigmoid function.

In one embodiment of the present application, the low-dimensional vector representation includes a node embedding vector and a connection embedding vector; based on the following formula, the node embedding vector of the graph convolutional network model is obtained:

Among them, v _i represents the node

Represents the neighbor nodes of node v _i , e _ij represents the connection between nodes v _i and v _j , mean represents the average function; based on the following formula, the connection embedding vector of the graph convolutional network model is obtained:

Among them, f _c0 and f _c1 represent two feed-forward networks of different sizes, w _ij ^e represents the learnable 1x1 weight of the corresponding adjacent edge, concat represents the concatenation function, and creates a node vector based on node features, and v _i and v _j both represent node.

In one embodiment of the present application, the upsampling process of the upsampling layer specifically includes: assuming that the feature integration information is a graph S(V, E) containing V vertices and E adjacent edges;

Based on the graph S(V,E), the following operations are performed sequentially: map the graph S(V,E) to a graph S'(V',E') containing N*n vertices and E*m adjacent edges;

Based on the graph S'(V', E'), generate a first adjacency matrix, and obtain an initial value of the first adjacency matrix; and

Based on the initial value of the first adjacency matrix, the first adjacency matrix is trained to obtain the optimal value A ^ω of the first adjacency matrix.

In one embodiment of the present application, the vertex features of the graph S'(V', E') are obtained based on the following formula:

Among them, f _in is the vertex feature of graph S'(V',E'), k _ij represents the geodesic distance between vertex j and vertex i of graph S'(V',E'), for calculating N*n The optimal value f _j of the graph S'(V',E') obtained after the weight of any vertex in the vertices represents the vertex characteristics of the graph S(V,E).

In one embodiment of the present application, the convolutional layer generates a global graph and an independent graph based on the upsampling result of the upsampling layer, and performs a convolution operation based on the global graph and the independent graph; the convolution operation specifically includes the following steps: initializing the independent graph For graph S(V,E), an independent graph is generated based on:

Among them, C _k represents the independent graph, _fin represents the vertex feature W _θk of the graph S'(V',E') and

are embedding functions θ and

Parameters, SoftMax is the normalization function; where, the normalization function is:

Among them, N represents the number of vertices in the graph S'(V',E'), θ(v _i ),

Respectively represent two 1×1 convolutional layers with different initial values.

In one embodiment of the present application, the eigenvalues of the heterogeneous multi-node interconnection topology are obtained based on the following formula:

Among them, B _k represents the global graph, C _k represents the independent graph α represents the parameter to adjust the weight of the independent graph, _fin represents the vertex features of the graph S'(V', E'), K _v represents the kernel size of the spatial dimension, W _k A vector of weights representing the 1x1 convolution operation.

In an embodiment of the present application, ensuring that the heterogeneous multi-node interconnection topology meets the preset accuracy requirements specifically includes: combining the heterogeneous multi-node interconnection topology with the preset real heterogeneous multi-node interconnection topology based on the following formula Perform the cross entropy loss operation:

Among them, E represents the expected value of the distribution function, P _data represents the distribution of actual topological samples, x is the real sample in P _data , P _z represents the distribution of input noise, D(x) represents the probability of judging the sample as correct, G(z ) represents the heterogeneous multi-node interconnection topology; z represents the input noise; based on the following formula, the topology reconstruction loss result is obtained:

Among them, P _t represents the heterogeneous multi-node interconnection topology P _t 'represents the real heterogeneous multi-node interconnection topology, L _topo represents the topological distance between the heterogeneous multi-node interconnection topology and the corresponding node of the real heterogeneous multi-node interconnection topology; According to the calculation results of topology reconstruction loss and cross-entropy loss, the final loss of the heterogeneous multi-node interconnection topology is obtained based on the following formula:

L=L _cGAN +λL _rec

Among them, λ is the weight of the reconstruction item L _rec is the topology reconstruction loss L _cGAN is the cross-entropy loss; compare the final loss of the heterogeneous multi-node interconnection topology with the preset loss of the heterogeneous multi-node interconnection topology, if the heterogeneous The final loss of the multi-node interconnection topology is greater than the preset loss of the heterogeneous multi-node interconnection topology, then repeat the execution to generate the heterogeneous multi-node interconnection topology until the final loss of the heterogeneous multi-node interconnection topology is not greater than the preset heterogeneous The loss of the heterogeneous multi-node interconnection topology ensures that the heterogeneous multi-node interconnection topology meets the preset accuracy requirements.

In one embodiment of the present application, the performance parameters include noise signal, performance requirements, power consumption requirements and cost requirements.

In an embodiment of the present application, ensuring that the heterogeneous multi-node interconnection topology meets preset accuracy requirements includes:

Based on the eigenvalues of the heterogeneous multi-node interconnection topology, the cross-entropy loss results and topology reconstruction loss results of the heterogeneous multi-node interconnection topology and the preset real heterogeneous multi-node interconnection topology are obtained;

Based on the cross-entropy loss result and the topology reconstruction loss result, the final loss of the heterogeneous multi-node interconnection topology is obtained; and

Comparing the final loss of the heterogeneous multi-node interconnection topology with the preset loss of the heterogeneous multi-node interconnection topology, if the final loss of the heterogeneous multi-node interconnection topology is greater than the preset heterogeneous multi-node interconnection topology loss, the upsampling process of the upsampling layer and the convolution operation of the convolutional layer are repeated until the final loss of the heterogeneous multi-node interconnection topology is not greater than the preset loss of the heterogeneous multi-node interconnection topology.

In order to achieve the above object, the application also proposes a second technical solution:

A non-volatile computer-readable storage medium storing computer-readable instructions, when the computer-readable instructions are executed by one or more processors, one or more processors are made to execute the method provided in any one of the above-mentioned embodiments. The steps of the heterogeneous multi-node interconnection topology generation method.

The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below. Other features and advantages of the application will be apparent from the description, drawings, and claims.

Description of drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings that need to be used in the description of the embodiments will be briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application. For those skilled in the art, other drawings can also be obtained based on these drawings without creative effort.

Fig. 1 is a schematic diagram of Crossbar in the prior art;

Fig. 2 is a schematic diagram of Crossbar cascading in the prior art;

Fig. 3 is a topological structure diagram of a bus and a Ring in the prior art;

Fig. 4 is the schematic diagram of 2D Mesh and 2D Torus distributed switch matrix in the prior art; Fig. 5 is the structural representation of 2D Torus in the prior art;

Fig. 6 is a schematic diagram of the topological structure of banded rings and triple rings in the prior art;

Fig. 7 is a schematic diagram of the internal architecture of an image processing dedicated chip in the prior art;

FIG. 8 is a flowchart of a method for generating a heterogeneous multi-node interconnection topology provided in one or more embodiments of the present application;

Fig. 9 is a schematic diagram of the application environment of the method provided in one or more embodiments of the present application;

FIG. 10 is a schematic diagram of the internal operation structure of the discriminant network provided in one or more embodiments of the present application;

FIG. 11 is a schematic diagram of the internal operation structure of the convolutional layer provided in one or more embodiments of the present application;

Fig. 12 is a schematic diagram of the internal operation structure of the generating network provided in one or more embodiments of the present application.

Detailed ways

In order to make the purpose, technical solutions and advantages of the application clearer, the technical solutions in the embodiments of the application will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the application. Obviously, the described embodiments are only It is a part of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the scope of protection of this application.

Embodiment one:

Referring to FIG. 8 , FIG. 8 is a flowchart of a method for generating a heterogeneous multi-node interconnection topology provided in Embodiment 1. Referring to FIG.

The method provided in this embodiment is applied in the application environment shown in FIG. 9 . The method of this embodiment includes the following steps:

Step S1. Based on the graph convolutional network model, feature extraction of node information and topological structure to obtain a low-dimensional vector representation of node information and topological structure.

In one of the implementations, the graph convolutional network model is based on node information library and topology library, feature node information and topology. Among them, the node information library includes node models established based on characteristics such as node computing power, node core number, node interface number, and node interface broadband; The topology model established by characteristics such as density and connection line length; performance parameters include performance, power consumption, cost, etc.

In one of the implementations, based on the low-dimensional vector representation of performance parameters, node information and topology, the node embedding vector of the graph convolutional network model is obtained based on the following formula:

Among them, v _i represents the node

in,

and

Denotes two feed-forward networks of different sizes,

is the learnable 1x1 weight corresponding to the adjacent edge, concat represents the concatenation function, and creates a node vector based on node features, and v _i and v _j both represent nodes.

Step S2, based on the node embedding vector, connection embedding vector and performance parameters input to the fully connected layer, the node embedding vector, connection embedding vector and performance parameters are fused in the fully connected layer to form feature integration information, and the feature integration information is input into the preset Generate networks to generate heterogeneous multi-node interconnection topologies.

In one embodiment, the generation network includes, but is not limited to, an upsampling layer and a convolutional layer, as shown in FIG. 10 , which is a schematic diagram of the generation network of this application. where the spatial upsampling layer operates using an aggregation function defined by a graph A ^ω that maps a graph S(V,E) with V vertices and E edges to a larger graph S'(V' , E'), by assigning different importances to new sets of vertices, the network can learn the optimal value of A ^ω for good upsampling of the graph. The upsampling process of the upsampling layer specifically includes: mapping the graph S(V, E) to a graph S'(V', E') containing N*n vertices and E*m adjacent edges; based on the graph S'(V',E') Generate the first adjacency matrix and obtain the initial value of the first adjacency matrix; based on the initial value of the first adjacency matrix, train the first adjacency matrix to obtain the optimal value of the first adjacency matrix A ^ω . The graph S'(V', E') vertex features are obtained based on the following formula:

Among them, f _in is the vertex feature of the graph S'(V', E'), _kij represents the geodesic distance A between the vertex j and the vertex i of the graph S'(V', E'), ^{and ω} is the calculation N* Any of the n vertices; the optimal value of the graph S'(V', E') obtained after the weight of the vertex; f _j represents the vertex feature of the graph S(V, E).

In one embodiment, after the upsampling layer performs upsampling processing, the convolution layer performs convolution operations based on the global graph and the independent graph. The operation process of the convolutional layer is shown in Figure 11. Specifically, the independent graph is initialized as graph S(V,E); the independent graph is generated based on the following formula:

Among them, C _k represents the independent graph, fin _represents the vertex features W _θk of the graph S'(V', E') and

are embedding functions θ and

Parameters, SoftMax is a normalization function; where, the normalization function is; where, the normalization function is:

are embedding functions θ and

Step S3. Based on the pre-built discriminant network, ensure that the heterogeneous multi-node interconnection topology meets the preset accuracy requirements. Among them, there is a similarity between the operation structure of the discrimination network and the generation network, as shown in FIG. 12 , which is the internal structure of the discrimination network of the present application. Specifically, the discriminative network uses an aggregation matrix B ^φ with trainable weights φ that is different from the weights learned by the generator network, since the aggregation is mapped from a larger graph S'(V', E') to A smaller graph S ₁ (V ₁ , E ₁ ), based on the following formula, obtains the vertex features of graph S ₁ (V ₁ , E ₁ ):

Among them, f _i is the vertex feature of graph S ₁ (V ₁ , E ₁ ); f _j ' is the vertex feature of graph S'(V', E'); k is the graph S'(V', E') The geodesic distance B ^φ between vertex j and vertex i is an aggregation matrix B ^φ with trainable weights φ.

In one embodiment, ensuring that the heterogeneous multi-node interconnection topology meets the preset accuracy requirements specifically includes: intersecting the heterogeneous multi-node interconnection topology with the preset real heterogeneous multi-node interconnection topology based on the following formula entropy loss operation;

L=L _cGAN +λL _rec

Among them, λ is the weight of the reconstruction item L _rec is the topology reconstruction loss L _cGAN is the cross-entropy loss; compare the final loss of the heterogeneous multi-node interconnection topology with the preset loss of the heterogeneous multi-node interconnection topology, if the heterogeneous The final loss of the multi-node interconnection topology is greater than the preset loss of the heterogeneous multi-node interconnection topology, then repeat step S2 until the final loss of the heterogeneous multi-node interconnection topology is not greater than the preset heterogeneous multi-node interconnection topology loss, to ensure that the heterogeneous multi-node interconnection topology meets the preset accuracy requirements.

Embodiment two:

The method of this embodiment includes the following steps: based on the graph convolutional network model, feature extraction of node information and topology to obtain low-dimensional vector representations of node information and topology; input performance parameters and low-dimensional vector representations to the fully connected layer , to generate feature integration information; input feature integration information into the preset generation network to generate heterogeneous multi-node interconnection topology; based on the pre-built discriminant network, ensure that the heterogeneous multi-node interconnection topology meets the preset accuracy requirements .

In one embodiment, the generating network specifically includes: an upsampling layer and a convolutional layer; the convolutional layer generates a heterogeneous multi-node interconnection topology based on the upsampling processing result of the upsampling layer.

In one of the implementations, the low-dimensional vector representation includes a node embedding vector and a connection embedding vector; based on the following formula, the node embedding vector of the graph convolutional network model is obtained:

Among them, v _i represents the node

in,

and

Denotes two feed-forward networks of different sizes,

In one of the implementations, the upsampling process of the upsampling layer specifically includes: assuming that the feature integration information is a graph S(V, E) containing V vertices and E adjacent edges; based on the graph S(V, E), sequentially Perform the following operations: map the graph S(V, E) to a graph S'(V', E') containing N*n vertices and E*m adjacent edges; based on the graph S'(V', E' ), generate the first adjacency matrix, and obtain the initial value of the first adjacency matrix; based on the initial value of the first adjacency matrix, train the first adjacency matrix to obtain the optimal value A ^ω of the first adjacency matrix.

In one of the implementations, the graph S'(V', E') vertex features are obtained based on the following formula:

In one of the implementations, the convolutional layer generates a global graph and an independent graph based on the upsampling result of the upsampling layer, and performs a convolution operation based on the global graph and the independent graph; the convolution operation specifically includes the following steps: initializing the independent graph as a graph S(V,E), which generates an independent graph based on:

are embedding functions θ and

Among them, N represents the number of vertices θ(v _i ) of the graph S'(V', E')

Respectively represent two 1x1 convolutional layers with different initial values.

In one of the implementation manners, the eigenvalues of the heterogeneous multi-node interconnection topology are obtained based on the following formula:

Among them, B _k represents the global graph, C _k represents the independent graph α represents the parameter to adjust the weight of the independent graph, _fin represents the vertex features of the graph S'(V', E'), K _v represents the kernel size of the spatial dimension, W _k A vector of weights representing the 1x1 convolution operation. The specific operation process inside the convolutional layer is shown in Figure 11. In the figure, B _k is a global map, which is unique to each layer. _Bk is an independent graph for learning a per-sample specific topology.

θ and

are two embedding functions, here a 1x1 convolutional layer. _Kv denotes the number of subgraphs,

represents the residual operation

Represents the matrix multiplication operation

is the gate that controls the importance weights of the two graphs. The importance of independent graphs in different layers is adjusted through a gating mechanism, using a different α value for each layer that is learned and updated through training.

In one of the implementation manners, ensuring that the heterogeneous multi-node interconnection topology meets preset accuracy requirements specifically includes:

The heterogeneous multi-node interconnection topology and the preset real heterogeneous multi-node interconnection topology are calculated based on the following formula for cross-entropy loss;

L=L _cGAN +λL _rec

Among them, λ is the weight of the reconstruction item L _rec is the topology reconstruction loss L _cGAN is the cross-entropy loss; compare the final loss of the heterogeneous multi-node interconnection topology with the preset loss of the heterogeneous multi-node interconnection topology, if the heterogeneous If the final loss of the multi-node interconnection topology is greater than the preset loss of the heterogeneous multi-node interconnection topology, the upsampling operation and graph convolution operation are repeated until the final loss of the heterogeneous multi-node interconnection topology is not greater than the preset heterogeneous multi-node interconnection topology. The loss of the heterogeneous multi-node interconnection topology ensures that the heterogeneous multi-node interconnection topology meets the preset accuracy requirements. Among them, the upsampling operation includes: the upsampling process of the upsampling layer specifically includes: assuming that the feature integration information is a graph S(V, E) containing V vertices and E adjacent edges; Perform the following operations: map the graph S(V, E) to a graph S'(V', E') containing N*n vertices and E*m adjacent edges; based on the graph S'(V', E' ), generate the first adjacency matrix, and obtain the initial value of the first adjacency matrix; based on the initial value of the first adjacency matrix, train the first adjacency matrix to obtain the best value A ^ω of the first adjacency matrix; Get the best value A ^ω , get the vertex features of the graph S'(V', E'). The graph S'(V', E') vertex features are obtained based on the following formula:

Among them, f _in is the vertex feature of the graph S'(V', E'), _kij represents the geodesic distance A between the vertex j and the vertex i of the graph S'(V', E'), ^{and ω} is the calculation N* The optimal value of the graph S'(V', E') obtained after the weight of any vertex in the n vertices; f _j represents the vertex feature of the graph S(V, E).

The graph convolution operation includes the convolution layer generating a global graph and an independent graph based on the upsampling results of the upsampling layer, and performing convolution operations based on the global graph and the independent graph; the convolution operation specifically includes the following steps: initializing the independent graph as a graph S( V,E), generating an independent graph based on:

are embedding functions θ and

N represents the number of vertices θ(v _i ) of the graph S'(V', E'),

Obtain the eigenvalues of the heterogeneous multi-node interconnection topology based on the following formula: Obtain the eigenvalues of the heterogeneous multi-node interconnection topology based on the following formula:

Among them, B _k represents the global graph, C _k represents the independent graph α represents the parameter to adjust the weight of the independent graph, _fin represents the vertex features of the graph S'(V', E'), K _v represents the kernel size of the spatial dimension W _k represents 1x1 A vector of weights for the convolution operation.

In one embodiment, the performance parameters include noise signal, performance requirements, power consumption requirements and cost requirements.

Embodiment three:

This embodiment provides a non-volatile computer-readable storage medium storing computer-readable instructions. When the program is executed by one or more processors, one or more processors execute the program provided in any of the above-mentioned embodiments. The steps of the heterogeneous multi-node interconnect topology generation method.

Those skilled in the art should understand that the embodiments in the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the embodiment of the present application may be in the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present application may take the form of computer program products implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes therein. .

Embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each procedure and/or block in the flowchart and/or block diagram, and a combination of procedures and/or blocks in the flowchart and/or block diagram can be realized by computer program instructions. These computer program instructions may be provided to a general purpose computer, special purpose computer, embedded processor, or processor of other programmable data processing equipment to produce a machine such that the instructions executed by the processor of the computer or other programmable data processing equipment produce a An apparatus for realizing the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instructions The device realizes the function specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions can also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, thereby The instructions provide steps for implementing the functions specified in the flow chart or blocks of the flowchart and/or the block or blocks of the block diagrams.

Note that the above are only preferred embodiments and technical principles used in this application. Those skilled in the art will understand that the present application is not limited to the specific embodiments here, and various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the protection scope of the present application. Therefore, although the present application has been described in detail through the above embodiments, the present application is not limited to the above embodiments, and can also include more other equivalent embodiments without departing from the concept of the present application. The scope is determined by the scope of the appended claims.

Claims

A heterogeneous multi-node interconnect topology generation method, characterized in that: the method includes the following steps:

Based on the graph convolutional network model, feature extraction node information and topology to obtain low-dimensional vector representation of node information and topology;

Input performance parameters and the low-dimensional vector representation to the fully connected layer to generate feature integration information;

Inputting the feature integration information into a preset generation network to generate a heterogeneous multi-node interconnection topology; and

The characteristic value of the heterogeneous multi-node interconnection topology is acquired to ensure that the heterogeneous multi-node interconnection topology meets a preset accuracy requirement.
The heterogeneous multi-node interconnection topology generation method according to claim 1, wherein: the generation network specifically includes: an upsampling layer, a convolutional layer, a fully connected layer, a batch normalization layer, a modified linear unit, and a S type function.
The heterogeneous multi-node interconnection topology generation method according to claim 1, wherein the low-dimensional vector representation includes node embedding vectors and connection embedding vectors;

Based on the following formula, the node embedding vector of the graph convolutional network model is obtained:

where ν i represents a node,
Represents the neighbor nodes of node ν i , e ij represents the connection between nodes ν i and ν j , mean represents the average function; and

Based on the following formula, the connection embedding vector of the graph convolutional network model is obtained:

Among them, f c0 and f c1 represent two feed-forward networks of different sizes, w ij e represents the learnable 1x1 weight of the corresponding adjacent edge, concat represents the concatenation function, and creates a node vector based on node features, and v i and v j both represent node.
The heterogeneous multi-node interconnection topology generation method according to claim 2, characterized in that: the upsampling process of the upsampling layer specifically includes: assuming that the feature integration information is a graph S containing V vertices and E adjacent edges (V, E); Carry out the following operations sequentially based on the graph S (V, E):

Mapping said graph S(V,E) to a graph S'(V',E') containing N*n vertices and E*m adjacent edges;

Based on the graph S'(V', E'), generate a first adjacency matrix, and obtain an initial value of the first adjacency matrix; and

Based on the initial value of the first adjacency matrix, the first adjacency matrix is trained to obtain an optimal value A ω of the first adjacency matrix.
The heterogeneous multi-node interconnection topology generation method according to claim 4, characterized in that: the following formula obtains the vertex characteristics of the graph S' (V', E'):

Among them, f in is the vertex feature k ij of the graph S'(V', E'), which means the geodesic distance between the vertex j and the vertex i of the graph S'(V', E'), A ω is the calculation described The optimal value of the graph S'(V', E') obtained after the weight of any vertex in the N*n vertices, and f j represents the vertex feature of the graph S(V, E).
The heterogeneous multi-node interconnection topology generation method according to claim 2, characterized in that: the convolution layer generates a global graph and an independent graph based on the upsampling result of the upsampling layer, and based on the global graph and the The independent graph is carried out convolution operation; The convolution operation specifically includes the following steps:

The independent graph is initialized as a graph S(V, E), and the independent graph is generated based on the following formula::

where C k represents the independent graph, fin represents the vertex features W θk of the graph S'(V', E') and
are embedding functions θ and
The parameters of , SoftMax is a normalization function; and

Wherein, the normalization function is:

Among them, N represents the number of vertices in the graph S'(V', E'), θ(v i ),
Respectively represent two 1x1 convolutional layers with different initial values.
The heterogeneous multi-node interconnection topology generation method according to claim 6, characterized in that: the characteristic value of the heterogeneous multi-node interconnection topology is obtained based on the following formula:

where B k represents the global graph, and C k represents the independent graph α represents the parameter to adjust the weight of the independent graph. f in represents the vertex feature K of the graph S'(V', E') , v represents the kernel size W of the spatial dimension, and K represents the weight vector of the 1x1 convolution operation.
The method for generating a heterogeneous multi-node interconnection topology according to claim 7, wherein ensuring that the heterogeneous multi-node interconnection topology meets preset accuracy requirements specifically includes:

The heterogeneous multi-node interconnection topology and the preset real heterogeneous multi-node interconnection topology are used to perform cross-entropy loss calculation based on the following formula:

Among them, E represents the expected value of the distribution function, P data represents the distribution of actual topological samples, x is the real sample in P data , P z represents the distribution of input noise, D(x) represents the probability of judging the sample as correct, G(z ) represents the heterogeneous multi-node interconnection topology; z represents the input noise;

Based on the following formula, the topology reconstruction loss result is obtained:

Among them, P t represents the heterogeneous multi-node interconnection topology P t 'represents the real heterogeneous multi-node interconnection topology, L topo represents the topological distance between the heterogeneous multi-node interconnection topology and the corresponding node of the real heterogeneous multi-node interconnection topology;

According to the topology reconstruction loss result and the cross-entropy loss calculation result, the final loss of the heterogeneous multi-node interconnection topology is obtained based on the following formula:

L=L cGAN +λL rec

Among them, λ is the weight of the reconstruction item, L rec is the topology reconstruction loss, and L cGAN is the cross-entropy loss; and

Comparing the final loss of the heterogeneous multi-node interconnection topology with the preset loss of the heterogeneous multi-node interconnection topology, if the final loss of the heterogeneous multi-node interconnection topology is greater than the preset heterogeneous multi-node interconnection topology the loss of the multi-node interconnection topology, repeat claims 4 to 7 until the final loss of the heterogeneous multi-node interconnection topology is not greater than the preset loss of the heterogeneous multi-node interconnection topology, ensuring that the heterogeneous multi-node interconnection topology The multi-node interconnection topology meets the preset accuracy requirements.
The method for generating a heterogeneous multi-node interconnection topology according to claim 1, wherein the performance parameters include noise signals, performance requirements, power consumption requirements and cost requirements.
The heterogeneous multi-node interconnection topology generation method according to claim 1, wherein the ensuring that the heterogeneous multi-node interconnection topology meets preset accuracy requirements includes:

Obtaining a cross-entropy loss result and a topology reconstruction loss result between the heterogeneous multi-node interconnection topology and a preset real heterogeneous multi-node interconnection topology based on the eigenvalues of the heterogeneous multi-node interconnection topology;

Obtaining a final loss of the heterogeneous multi-node interconnection topology based on the cross-entropy loss result and the topology reconstruction loss result; and

Comparing the final loss of the heterogeneous multi-node interconnection topology with the preset loss of the heterogeneous multi-node interconnection topology, if the final loss of the heterogeneous multi-node interconnection topology is greater than the preset heterogeneous multi-node interconnection topology the loss of the multi-node interconnection topology, repeat the upsampling process of the upsampling layer and the convolution operation of the convolution layer until the final loss of the heterogeneous multi-node interconnection topology is not greater than the preset Loss of heterogeneous multi-node interconnect topology.
One or more non-transitory computer-readable storage media storing computer-readable instructions, wherein when the computer-readable instructions are executed by one or more processors, the one or more processors Carrying out the steps of the method as claimed in any one of claims 1 to 10.