CN115865607A - Distributed training computing node management method and related device - Google Patents

Distributed training computing node management method and related device

Info

Publication number
CN115865607A
CN115865607A
Authority
CN
China
Prior art keywords
computing node
computing
node
similarity
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310180801.XA
Other languages
Chinese (zh)
Inventor
李仁刚
闫瑞栋
郭振华
赵雅倩
刘璐
金良
徐聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Mass Institute Of Information Technology
Original Assignee
Shandong Mass Institute Of Information Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Mass Institute Of Information Technology filed Critical Shandong Mass Institute Of Information Technology
Priority to CN202310180801.XA
Publication of CN115865607A
Legal status: Pending


Abstract

The application discloses a distributed training computing node management method and a related device, and relates to the technical field of computers, wherein the computing node management method comprises the following steps: acquiring node information of each computing node; grouping all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types; setting a local decentralized communication architecture for the computing nodes in each computing node group, and setting a global centralized communication architecture between each computing node group; and performing distributed model training in the plurality of computing node groups based on the input model and data to obtain a training result so as to improve the efficiency of the distributed model training.

Description

Distributed training computing node management method and related device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a server, and a computer-readable storage medium for managing compute nodes in distributed training.
Background
With the rapid development of big data, artificial intelligence, high-performance computation and internet technology, massive data and large-scale models generated in various fields are often modeled and solved through a neural network. The storage, calculation and solving processes of the neural network all depend on a distributed training system. A so-called distributed training system is a network formed by a plurality of computing nodes together, and each computing node may be formed by one host or a plurality of hosts.
In the related technology, a deep neural network model or a large data set to be trained is split in a model parallel, data parallel or mixed parallel mode and is distributed to corresponding computing nodes; then, each computing node separately trains the split small-scale data or the sub-model and generates a local or intermediate training result; and finally, the distributed training system aggregates all local training results in a certain mode to obtain a global result and outputs the global training result. However, in practical applications, there are differences between different computing nodes, which results in a decrease in efficiency of the training process of the distributed model, and thus resources of the computing nodes cannot be effectively utilized.
Therefore, how to improve the efficiency of the distributed model training is a key issue of attention for those skilled in the art.
Disclosure of Invention
The application aims to provide a computing node management method, a computing node management device, a server and a computer readable storage medium for distributed training, so as to improve the efficiency of distributed model training.
In order to solve the above technical problem, the present application provides a distributed training method for managing compute nodes, including:
acquiring node information of each computing node;
grouping all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types;
setting a local decentralized communication architecture for the computing nodes in each computing node group, and setting a global centralized communication architecture between each computing node group;
and carrying out distributed model training in the plurality of computing node groups based on the input model and data to obtain a training result.
Optionally, a synchronous update policy is adopted among the compute nodes in each compute node group, and an asynchronous update policy is adopted among the compute node groups.
Optionally, the obtaining node information of each computing node includes:
when a newly accessed computing node exists, acquiring node information of the newly accessed computing node; wherein the node information includes: hardware information, current load running state information, network connection and bandwidth conditions among the computing nodes;
and recording the node information in a database.
Optionally, grouping all the computing nodes based on the node information of each computing node to obtain multiple computing node groups of different types, where the grouping includes:
similarity calculation is carried out on each computing node based on the node information of each computing node, and the similarity between each computing node is obtained;
and clustering all the computing nodes based on the similarity between each computing node to obtain a plurality of computing node groups.
Optionally, performing similarity calculation on each computing node based on the node information of each computing node to obtain a similarity between each computing node, including:
calculating the firmware similarity between each computing node based on the firmware information of each computing node;
calculating the network structure similarity between each computing node based on the network information of each computing node;
calculating the load similarity between each computing node based on the load information of each computing node;
and determining the similarity between each computing node based on the firmware similarity, the network structure similarity and the load similarity between each computing node.
Optionally, calculating the firmware similarity between each computing node based on the firmware information of each computing node includes:
calculating a hardware index for each of the compute nodes based on the firmware information for each of the compute nodes;
and calculating Euclidean distance between hardware indexes among each computing node, and taking the Euclidean distance as the similarity of the firmware among each computing node.
Optionally, calculating the network structure similarity between each computing node based on the network information of each computing node includes:
calculating a network address distance and a network neighbor index between each computing node based on the network information of each computing node;
and taking the network address distance and the network neighbor index between each computing node as the network structure similarity between each computing node.
Optionally, calculating the load similarity between each computing node based on the load information of each computing node includes:
calculating an equipment load condition index and a network bandwidth condition index of each computing node based on the load information of each computing node;
and taking the equipment load condition index and the network bandwidth condition index as the load similarity of the computing node.
Optionally, determining the similarity between each computing node based on the firmware similarity, the network structure similarity, and the load similarity between each computing node includes:
and performing weighted calculation on the firmware similarity, the network structure similarity and the load similarity among the calculation nodes to obtain the similarity among the calculation nodes.
Optionally, performing distributed model training in the plurality of computing node groups based on the input model and data to obtain a training result, including:
processing the input model and data based on a distributed training format to obtain distributed training data and a distributed training model;
and performing distributed model training based on a synchronous updating strategy among each computing node, an asynchronous updating strategy among each computing node group, distributed training data and a model to obtain the training result.
Optionally, processing the input model and data based on the format of distributed training to obtain the data and model of distributed training, including:
and denoising and standardizing the input model and data based on a distributed training format to obtain the distributed training data and model.
Optionally, the process of asynchronous update policy between each computing node group includes:
and executing an asynchronous updating strategy among each computing node group based on a preset buffer zone.
The application also provides a distributed training computing node management method, which comprises the following steps:
the method comprises the steps that a client sends a model to be trained and data to a server, so that the server can obtain node information of each computing node; grouping all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types; setting a local decentralized communication architecture for the computing nodes in each computing node group, and setting a global centralized communication architecture between each computing node group; performing distributed model training in the plurality of computing node groups based on the input model and data to obtain and return a training result;
and the client acquires the training result and displays the training result.
The application also provides a distributed training computing node management method, which comprises the following steps:
the server acquires node information of each computing node; grouping all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types; setting a local decentralized communication architecture for the computing nodes in each computing node group, and setting a global centralized communication architecture between each computing node group; performing distributed model training in the plurality of computing node groups based on the model and data input by the client to obtain a training result;
and the client displays the training result.
The present application further provides a distributed training computing node management apparatus, including:
the node information acquisition module is used for acquiring the node information of each computing node;
the node grouping module is used for grouping all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types;
the communication architecture setting module is used for setting a local decentralized communication architecture for the computing nodes in each computing node group and setting a global centralized communication architecture between each computing node group;
and the model training module is used for carrying out distributed model training in the plurality of computing node groups based on the input model and data to obtain a training result.
The present application further provides a server, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the computing node management method as described above when executing the computer program.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of computing node management as described above.
The application provides a distributed training computing node management method, which comprises the following steps: acquiring node information of each computing node; grouping all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types; setting a local decentralized communication architecture for the computing nodes in each computing node group, and setting a global centralized communication architecture between each computing node group; and carrying out distributed model training in the plurality of computing node groups based on the input model and data to obtain a training result.
The method comprises the steps of grouping all computing nodes to obtain a plurality of computing node groups, and then executing differentiated communication architectures and data updating strategies between the computing nodes in each computing node group and each computing node group, so that the reduction of training efficiency caused by the difference between different computing nodes is avoided, the efficiency of the training process of the distributed model is improved, and the resources of the computing nodes are effectively utilized.
The present application further provides a distributed training computing node management apparatus, a server, and a computer-readable storage medium, which have the above beneficial effects and are not described herein again.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only the embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flowchart of a distributed training method for managing computing nodes according to an embodiment of the present disclosure;
FIG. 2 is a flowchart of another distributed training computing node management method according to an embodiment of the present application;
fig. 3 is a schematic diagram of a distributed communication architecture according to an embodiment of the present application;
FIG. 4 is a diagram illustrating a parallel training architecture according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a distributed training computing node management apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
The core of the application is to provide a distributed training computing node management method, a computing node management device, a server and a computer readable storage medium, so as to improve the efficiency of distributed model training.
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the related technology, a deep neural network model or a large data set to be trained is split in a model parallel, data parallel or mixed parallel mode and is distributed to corresponding computing nodes; then, each computing node separately trains the split small-scale data or the sub-model and generates a local or intermediate training result; and finally, the distributed training system aggregates all local training results in a certain mode to obtain a global result and outputs the global training result. However, in practical applications, there are differences between different computing nodes, which results in a decrease in efficiency of the training process of the distributed model, and thus resources of the computing nodes cannot be effectively utilized.
Therefore, the method for managing the computing nodes in the distributed training is provided, a plurality of computing node groups are obtained by grouping all the computing nodes, and then a differentiated communication architecture and a data updating strategy are executed between the computing nodes in each computing node group and each computing node group, so that the reduction of the training efficiency caused by the difference between different computing nodes is avoided, the efficiency of the training process of the distributed model is improved, and the resources of the computing nodes are effectively utilized.
The following describes a distributed training method for managing compute nodes according to an embodiment.
Referring to fig. 1, fig. 1 is a flowchart of a distributed training computing node management method according to an embodiment of the present disclosure.
In this embodiment, the method may include:
s101, acquiring node information of each computing node;
this step is intended to acquire node information of each computing node.
The computing nodes are nodes for distributed training, and include but are not limited to various heterogeneous computing nodes such as CPUs, GPUs, FPGAs, mobile computing devices and the like. Heterogeneous computing nodes are computing nodes whose hardware structures differ from one another.
Wherein the node information includes: hardware information, current load running state information, network connection between the computing nodes and bandwidth conditions.
Further, the step may include:
and when the newly accessed computing node exists, acquiring the node information of the newly accessed computing node.
It can be seen that the present alternative scheme mainly explains how to acquire new node information. In this alternative, when there is a newly accessed computing node, node information of the newly accessed computing node is acquired.
Further, the step may include:
and acquiring the node information of each computing node, and recording the node information in a database.
It can be seen that the present alternative scheme mainly illustrates how node information is saved. In this alternative, the node information of each computing node is acquired, and the node information is recorded in the database.
S102, grouping all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types;
on the basis of S101, this step is intended to group all the computing nodes based on the node information of each computing node, resulting in a plurality of computing node groups of different types.
That is, different computing nodes are grouped, and similar computing nodes are divided into the same computing node group, so that performance loss caused by differences among different computing nodes is eliminated.
Further, the step may include:
step1, similarity calculation is carried out on each calculation node based on node information of each calculation node to obtain similarity between each calculation node;
and 2, clustering all the computing nodes based on the similarity between each computing node to obtain a plurality of computing node groups.
It can be seen that the present alternative is primarily illustrative of how the grouping may be performed. In the alternative scheme, similarity calculation is carried out on each computing node based on the node information of each computing node, and the similarity between each computing node is obtained; and clustering all the computing nodes based on the similarity between each computing node to obtain a plurality of computing node groups. And clustering can be performed based on the obtained similarity.
Further, the process of calculating the similarity in the last alternative may include:
step1, calculating the firmware similarity between each computing node based on the firmware information of each computing node;
step2, calculating the network structure similarity between each computing node based on the network information of each computing node;
step 3, calculating the load similarity between each computing node based on the load information of each computing node;
and 4, determining the similarity between each computing node based on the firmware similarity, the network structure similarity and the load similarity between each computing node.
It can be seen that the present alternative scheme mainly explains how to calculate the similarity. In this alternative, the firmware similarity between each computing node is calculated based on the firmware information of each computing node; the network structure similarity between each computing node is calculated based on the network information of each computing node; the load similarity between each computing node is calculated based on the load information of each computing node; and the similarity between each computing node is determined based on the firmware similarity, the network structure similarity and the load similarity between each computing node. Namely, the similarity of the computing nodes is evaluated in the three directions of firmware similarity, network structure similarity and load similarity, so that the accuracy of judging the similarity is improved.
Further, the process of calculating the firmware similarity in the last alternative may include:
step1, calculating a hardware index of each computing node based on firmware information of each computing node;
and 2, calculating the Euclidean distance between hardware indexes of each computing node, and taking the Euclidean distance as the firmware similarity between each computing node.
It can be seen that the present alternative scheme mainly illustrates how the firmware similarity is calculated. In this alternative, the hardware index of each compute node is computed based on the firmware information of each compute node; and calculating Euclidean distance between hardware indexes between each computing node, and taking the Euclidean distance as the firmware similarity between each computing node.
Further, the process of calculating the network structure similarity in the above alternative may include:
step1, calculating a network address distance and a network neighbor index between each computing node based on network information of each computing node;
and 2, taking the network address distance and the network neighbor index between each computing node as the network structure similarity between each computing node.
It can be seen that the alternative scheme mainly explains how to calculate the network structure similarity. In the alternative, the network address distance and the network neighbor index between each computing node are calculated based on the network information of each computing node; and taking the network address distance and the network neighbor index between each computing node as the network structure similarity between each computing node.
Further, the process of calculating the load similarity in the last alternative may include:
step1, calculating equipment load condition indexes and network bandwidth condition indexes of each computing node based on load information of each computing node;
and 2, taking the equipment load condition index and the network bandwidth condition index as the load similarity of the computing node.
It can be seen that the present alternative scheme mainly explains how to calculate the load similarity. In the alternative, the equipment load condition index and the network bandwidth condition index of each computing node are calculated based on the load information of each computing node; and taking the equipment load condition index and the network bandwidth condition index as the load similarity of the computing node.
Further, the process of calculating the total similarity in the last alternative may include:
and performing weighted calculation on the firmware similarity, the network structure similarity and the load similarity among the calculation nodes to obtain the similarity among the calculation nodes.
Further, the process of clustering in the last alternative may include:
and clustering all the computing nodes based on the average value of the similarity between each computing node to obtain a plurality of computing node groups.
S103, setting a local decentralized communication architecture for the computing nodes in each computing node group, and setting a global centralized communication architecture between each computing node group;
on the basis of S102, the step aims to set a local decentralized communication framework for the computing nodes in each computing node group and set a global centralized communication framework between each computing node group. That is, similar nodes exist among nodes in each computing node group, and a local decentralized communication architecture can be adopted to improve the communication efficiency. And because different computing node groups have differences among the computing node groups, a global centralized communication architecture is adopted so as to coordinate the communication of the different computing node groups.
S104, performing distributed model training in a plurality of computing node groups based on the input model and data to obtain a training result; and the computing nodes in each computing node group adopt a synchronous updating strategy, and the computing nodes in each computing node group adopt an asynchronous updating strategy.
On the basis of S103, the step aims at carrying out distributed model training in a plurality of computing node groups based on the input model and data to obtain a training result; and the computing nodes in each computing node group adopt a synchronous updating strategy, and the computing nodes in each computing node group adopt an asynchronous updating strategy.
Further, the step may include:
step1, processing input models and data based on a distributed training format to obtain distributed training data and models;
and 2, performing distributed model training based on a synchronous updating strategy among each computing node, an asynchronous updating strategy among each computing node group, distributed training data and a model to obtain a training result.
It can be seen that this alternative is primarily illustrative of how training can be performed. In the alternative, the input model and data are processed based on the format of distributed training to obtain the data and model of distributed training; and performing distributed model training based on a synchronous updating strategy among each computing node, an asynchronous updating strategy among each computing node group, distributed training data and a model to obtain a training result.
Further, the process of performing data processing in the last alternative may include:
and carrying out denoising processing and standardization processing on the input model and the input data based on the format of the distributed training to obtain the data and the model of the distributed training.
It can be seen that the present alternative is primarily illustrative of how data and models may be processed. In the alternative, the input model and data are denoised and standardized based on the format of distributed training to obtain the data and model of distributed training.
Further, the process of data synchronization between the computing node groups in the last alternative may include:
and executing an asynchronous updating strategy among each computing node group based on a preset buffer zone.
It can be seen that the present alternative scheme is primarily illustrative of how an asynchronous update policy may be implemented. In this alternative, an asynchronous update policy is implemented between each compute node group based on a pre-set buffer.
Further, this embodiment may further include:
step1, performing visualization processing on a training result to obtain a visualization result;
and 2, displaying the visualization result.
It can be seen that the present alternative mainly illustrates that the training result can also be displayed visually. In the alternative scheme, the training result is visualized to obtain a visualized result; and displaying the visualization result.
In summary, in the embodiment, all the computing nodes are grouped to obtain a plurality of computing node groups, and then a differentiated communication architecture and a data updating strategy are executed between the computing nodes in each computing node group and each computing node group, so that reduction of training efficiency caused by differences between different computing nodes is avoided, efficiency of a training process of a distributed model is improved, and resources of the computing nodes are effectively utilized.
The following further describes a distributed training method for managing compute nodes according to another specific embodiment.
Referring to fig. 2, fig. 2 is a flowchart of another distributed training computing node management method according to an embodiment of the present disclosure.
In this embodiment, the following modules are used to implement the distributed training method for managing computing nodes.
1) And the heterogeneous computing node cluster information input module. The module records the hardware information, the current load running state information, the network connection and bandwidth condition between the computing nodes and other information of the computing nodes of the heterogeneous server participating in distributed training.
2) And the grouping module based on heterogeneous computing node clustering. The module quantifies the similarity between heterogeneous computing nodes according to the hardware information of the heterogeneous computing nodes, the current load running state, the importance and urgency of computing tasks and other characteristics. Secondly, the similarity is used as the key basis of clustering, and the clustering operation is carried out on the cluster formed by all the computing nodes. Finally, several "homogeneous" groups are generated by the clustering operation, providing a basis for the subsequent distributed communication architecture.
3) And designing a module based on the distributed communication architecture of the grouping information. The module designs a communication architecture combining local decentralized and global centralized architectures. The same group formed by the homogeneous computing nodes adopts a local decentralized framework; a global centralized architecture is adopted among different groups.
4) A data/model input module. The data/model input module mainly completes the input task of the data set and the model to be processed, processes the input data/model into the format required by the distributed training system and the like, and provides the data/model for the direct reading and calling of the module to be trained later.
5) And a heterogeneous hybrid parallel distributed training scheme module. A first-order optimization algorithm is adopted as a bottom-layer optimizer, a synchronous updating strategy is executed for homogeneous computing nodes in the same group, an asynchronous updating strategy is executed for heterogeneous computing nodes among different groups, a heterogeneous hybrid parallel distributed training scheme combining the synchronous updating strategy and the asynchronous updating strategy is designed, and the accuracy and the effectiveness of the execution of training tasks are guaranteed.
6) And a training result output module. This module is responsible for outputting a global solution to the training task.
In summary, all modules cooperatively complete various complex training tasks in the deep learning field.
Heterogeneous computing node cluster information input module
Computing node hardware devices are a necessary prerequisite for distributed training. With the popularization of technologies such as internet of things, edge computing and cloud computing, various heterogeneous computing nodes such as a CPU (central processing unit), a GPU (graphic processing unit), an FPGA (field programmable gate array) and mobile computing equipment are simultaneously accessed to a network to form a complete distributed training system. The heterogeneous computing node cluster information input module records hardware information of computing nodes of the heterogeneous server participating in distributed training, current load operation state information of each computing node, network connection and bandwidth conditions among the computing nodes and other information (as shown in table 1), and provides important basis for subsequent grouping modules.
Table 1: Schematic table of node information
And the grouping module is based on heterogeneous computing node clustering.
Because various heterogeneous devices exist in the distributed system, their performance differs greatly when computing or processing various tasks, so that efficient cooperation among heterogeneous computing nodes in the distributed system is difficult and the execution efficiency of the system is reduced. To this end, the present embodiment proposes a grouping idea for heterogeneous computing nodes, i.e., grouping "similar" computing devices into the same group to achieve efficient coordination with low overhead. The module quantifies the similarity between heterogeneous computing nodes according to the hardware information of the heterogeneous computing nodes, the current load running state, the importance and urgency of computing tasks and other characteristics. Secondly, the similarity is used as the key basis of clustering, and the clustering operation is carried out on the cluster formed by all the computing nodes. Finally, several "homogeneous" groups are generated by the clustering operation, providing a basis for the subsequent distributed communication architecture.
Among them, two key problems are: how to measure the similarity between heterogeneous computing nodes, and what clustering algorithm to use.
Measuring the similarity between heterogeneous computing nodes: the similarity (Sim) of the heterogeneous computing nodes in this embodiment is defined as: the combination of three similarities of firmware similarity (Dev), network structure similarity (Net) and Load similarity (Load) is as follows:
Sim(X,Y)= a*Dev(X,Y)+b*Net(X,Y)+c*Load(X,Y)。
wherein a, b, c respectively represent coefficients greater than 0, and a + b + c =1. Where Sim (X, Y) represents the similarity between compute node X and compute node Y.
Where Dev(X, Y) represents the firmware similarity between the computing node X and the computing node Y, and is defined as the Euclidean distance over the core computing hardware index A and the storage capacity index B (note that all indexes need to be normalized); its mathematical formula is defined as follows:
Dev(X,Y)=(A^2+B^2)^(1/2)。
the network structure similarity between the computing node X and the computing node Y is expressed by Net (X, Y), and is defined as the combination of a network 1P index C and a network neighbor index D, and the mathematical formula is defined as follows:
Net(X,Y)=(|ComNe1ghbor(X,Y)|/(|Ne1ghbor(X)|+|Ne1ghbor(Y)|))*d(X,Y)。
the network index C is denoted by d (X, Y) and represents the cos distance of the network 1P between the computation nodes, i.e., d (X, Y) = X × Y/(| X | + | Y |).
Wherein, the network neighbor index D is recorded as: (| ComNe1ghbor (X, Y) |/(| Ne1ghbor (X) | + | Ne1ghbor (Y) |)), wherein | ComNe1ghbor (X, Y) | represents the number of common neighbor nodes between the computing node X and the computing node Y, | Ne1ghbor (X) | + | Ne1ghbor (Y) | represents the total number of all neighbor nodes (including itself) of the computing node X and the computing node Y.
Load (X, Y) represents the Load similarity between the computing node X and the computing node Y, and is defined as the combination of the index of the device Load condition E and the index of the network bandwidth occupation condition F, and the specific mathematical formula is defined as follows:
Load(X,Y) = (1/2)*E + (1/2)*F.
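By way of illustration only, the Python sketch below evaluates Sim(X, Y) = a*Dev + b*Net + c*Load as written above. The record fields (compute_index, storage_index, ip_vec, load, bandwidth), the neighbor sets, and the example weights a, b, c are assumptions made for the sketch, as is the reading of the per-node indexes A, B, E, F as normalized scalar values; the disclosure itself does not prescribe these details.

```python
import math

def dev_similarity(x, y):
    # Firmware similarity Dev(X,Y) = sqrt(A^2 + B^2): A and B are taken here as the
    # differences of the normalized compute and storage indexes (an assumption).
    A = x["compute_index"] - y["compute_index"]
    B = x["storage_index"] - y["storage_index"]
    return math.sqrt(A ** 2 + B ** 2)

def net_similarity(x, y, neighbors):
    # Net(X,Y) = (|ComNeighbor| / (|Neighbor(X)| + |Neighbor(Y)|)) * d(X,Y),
    # with d(X,Y) the cosine distance of the network IP vectors.
    common = len(neighbors[x["id"]] & neighbors[y["id"]])
    total = len(neighbors[x["id"]]) + len(neighbors[y["id"]])
    dot = sum(p * q for p, q in zip(x["ip_vec"], y["ip_vec"]))
    norm = (math.sqrt(sum(p * p for p in x["ip_vec"]))
            * math.sqrt(sum(q * q for q in y["ip_vec"])))
    d = dot / norm if norm else 0.0
    return (common / total if total else 0.0) * d

def load_similarity(x, y):
    # Load(X,Y) = (1/2)*E + (1/2)*F: E and F are taken here as the agreement of
    # the normalized device-load and bandwidth-occupation indexes (an assumption).
    E = 1.0 - abs(x["load"] - y["load"])
    F = 1.0 - abs(x["bandwidth"] - y["bandwidth"])
    return 0.5 * E + 0.5 * F

def sim(x, y, neighbors, a=0.4, b=0.3, c=0.3):
    # Overall similarity Sim(X,Y) = a*Dev + b*Net + c*Load with a + b + c = 1.
    return (a * dev_similarity(x, y)
            + b * net_similarity(x, y, neighbors)
            + c * load_similarity(x, y))
```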
similarity-based grouping (clustering) algorithm: based on the above method for calculating the similarity between any two computing nodes (assuming that there are n computing nodes in the distributed training system), the present embodiment proposes the following grouping (clustering) strategy for heterogeneous computing nodes:
the method comprises the following steps: starting from a computing device with the reference number 1, the similarity between the node and all other nodes is calculated, namely Sim (1, 1), sim (1, 2), sim (1, 3), \ 8230;, sim (1, n) is obtained and calculated
Figure SMS_2
(ii) a From the reference numerals The computing device of 2 starts by computing its similarity to all the other nodes, i.e. obtainingSim (2, 2), sim (2, 3), sim (2, 4), \ 8230, sim (2, n), and calculate ^ or ^ based on the measured values>
Figure SMS_3
(ii) a And by analogy, calculating the similarity upper matrix of all the nodes and the average value of the similarity of each row.
Table 2: Schematic table of the similarity measurement
Step two: starting from the computing node numbered 1, the computing nodes numbered 1 to n are examined. If the similarity Sim(1, l) between the computing node numbered 1 and the computing node numbered l is greater than or equal to S1, the computing node numbered l is divided into the same group (cluster) as the computing node numbered 1. Through the above operations, the first group Group1 = {node 1, node 2, …, node k} is obtained. Note that the computing nodes whose similarity is lower than the average similarity S1 will be used in the grouping process for the subsequently numbered nodes.
Step three: starting from the computing node numbered 2, the computing nodes numbered 2 to n are examined, and only the nodes that were culled in step two (i.e., nodes that are not in Group1) are considered. If the similarity Sim(2, l) between the computing node numbered 2 and the computing node numbered l is greater than or equal to S2, they are classified into the same group. Through the above operation, a second group Group2 = {node 2, node l, …} is obtained.
Step four: and repeating the process until the n computing nodes are processed. Finally, m disjoint grouping sets Group1, group2, \ 8230and Group pm without coincident computing nodes can be obtained.
And designing a module based on the distributed communication architecture of the grouping information.
Referring to fig. 3, fig. 3 is a schematic diagram of a distributed communication architecture according to an embodiment of the present disclosure.
Considering the influence of factors such as hardware equipment, network bandwidth and transmission rate, communication among computing nodes of the distributed training system often becomes a bottleneck, and training performance is severely restricted. In this case, the module focuses on designing a communication architecture that combines local decentralized with a global centralized architecture. As shown in fig. 3.
Fig. 3 shows a schematic diagram of a distributed system communication architecture designed by the present embodiment. Specifically, the present embodiment provides a local decentralized architecture and a global centralized architecture. On one hand, a local decentralized framework is adopted in the same group formed by the homogeneous nodes; on the other hand, a global centralized architecture is adopted among different groups.
The main advantages of this architecture are: firstly, the homogeneous nodes have small differences in processing capacity, computing performance and the like, so that the method is suitable for executing the synchronous updating strategy, and the synchronous updating strategy has high execution efficiency; and secondly, the difference between different groups is large, so that a global centralized framework capable of executing an asynchronous updating strategy is designed, and the synchronization and the average of global model parameters of each local group training result are realized.
A data/model input module.
The data/model input module is mainly used for completing the input task of a data set and a model to be processed, processing the input data/model into a format required by a distributed training system, and performing operations such as noise removal, standardization and the like for direct reading and calling of a subsequent training module.
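A minimal sketch of such input processing, assuming plain numeric samples; the specific denoising choice (percentile clipping), the normalization constant and the function name prepare_inputs are illustrative assumptions rather than the module's prescribed behavior.

```python
import numpy as np

def prepare_inputs(samples, num_groups):
    data = np.asarray(samples, dtype=np.float64)
    # Denoising: clip extreme outliers (a simple stand-in for noise removal).
    low, high = np.percentile(data, [1, 99], axis=0)
    data = np.clip(data, low, high)
    # Standardization: zero mean and unit variance per feature.
    data = (data - data.mean(axis=0)) / (data.std(axis=0) + 1e-8)
    # Shard into one partition per computing node group for distributed training.
    return np.array_split(data, num_groups)
```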
Referring to fig. 4, fig. 4 is a schematic diagram of a parallel training architecture according to an embodiment of the present disclosure.
The heterogeneous hybrid parallel distributed training scheme module is shown in fig. 4. The embodiment designs a distributed training method combining global centralization and local decentralization. Specifically, after the computing nodes in the cluster are processed by the clustering grouping module, different computing nodes are divided into different groups. Each group is connected to a particular server node, there is no direct connection between different groups, and the server nodes are directly connected with each other. The computing nodes in the same group adopt a synchronous updating strategy, and different groups adopt an asynchronous updating strategy.
In order to improve the efficiency with which the server nodes perform asynchronous global operations, the present embodiment sets a Buffer for each server node, and the size of the Buffer is dynamically adjusted according to the number of corresponding groups. For example, in fig. 4, Group1 (group 1) is connected to the server node Server1, and the size of Buffer1 at Server1 is 1. Group2 and Group3 are connected to the server node Server2, and the size of Buffer2 at Server2 is 2.
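The buffer rule can be sketched as follows, mirroring the Buffer1 = 1 and Buffer2 = 2 example above; the function name allocate_buffers and the dictionary layout are assumptions.

```python
from collections import defaultdict

def allocate_buffers(group_to_server):
    # One gradient slot per group connected to each server node.
    sizes = defaultdict(int)
    for group, server in group_to_server.items():
        sizes[server] += 1
    return {server: [None] * size for server, size in sizes.items()}

buffers = allocate_buffers({"Group1": "Server1", "Group2": "Server2", "Group3": "Server2"})
# buffers == {"Server1": [None], "Server2": [None, None]}
```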
In summary, the heterogeneous hybrid parallel distributed training scheme includes: the server node workflow and the computing node workflow are two parts.
The server node workflow may include:
inputting: the total iteration number T, the threshold value Q and the learning rate eta.
And (3) outputting: global parameter W (t + 1).
Step1, when the iteration time t =0, setting a Buffer1 at each Server node Server1 as an empty set, and making a local summary parameter Server1 (W) =0 at the Server 1.
Step2, when the iteration time t = m, the Server node Server1 obtains the average gradients Gj and Gk generated by all the groups (e.g., Groupj, Groupk) connected thereto, and calculates the "global gradient" G1 of these groups, i.e., G1 = (1/2)*(Gj + Gk). The global parameters are updated based on the gradient G1, as follows:
W(t+1)=W(t)-η*G1。
and step 3, sending the global parameter W (t + 1) generated at the moment t = m to the corresponding groups Gj and Gk.
Step 4, if the iteration gap Gap = |t1 - t2| between any two Server nodes Server1 and Server2 is greater than the threshold Q, global synchronization between the Server1 and the Server2 is performed to obtain a global gradient G = (1/2)*(G1 + G2), and the global parameter is updated accordingly, as shown in the following formula:
W(t+1)=W(t)-η* G。
where G1 denotes the gradient cached in the Buffer of the Server1 (Server 1), and G2 denotes the gradient cached in the Buffer of the Server2 (Server 2).
Step 5, once t = T, the server node workflow ends.
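The server-node workflow of Steps 1 to 5 can be sketched as below, under explicit assumptions: get_group_gradients(t) stands in for receiving the per-group average gradients at iteration t, and get_peer_state() stands in for querying another server node's iteration count and buffered gradient for the Step 4 check. Both hooks are hypothetical and not APIs from the disclosure.

```python
def server_workflow(get_group_gradients, T, Q, eta, W, get_peer_state=None):
    buffer = []                                # Step 1: the buffer is empty at t = 0
    for t in range(T):                         # Step 5: stop once t reaches T
        grads = get_group_gradients(t)         # Step 2: gradients Gj, Gk, ... from connected groups
        G1 = sum(grads) / len(grads)           # this server's "global gradient" G1
        W = W - eta * G1                       # W(t+1) = W(t) - eta * G1
        buffer = list(grads)                   # cache the group gradients in the buffer
        # Step 3 (not modeled here): send W(t+1) back to the connected groups.
        if get_peer_state is not None:         # Step 4: asynchronous peer synchronization
            peer_t, G2 = get_peer_state()
            if abs(t - peer_t) > Q:            # iteration gap exceeds the threshold Q
                G = 0.5 * (G1 + G2)
                W = W - eta * G                # W(t+1) = W(t) - eta * G
    return W
```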
The computing node workflow may include:
step1, for each Group1, it is assumed that Group1 includes a computing node w1, a computing node w2, and a computing node w3. When t = m, group1 acquires a global parameter W (t) sent by a server node corresponding to the global parameter W (t), and each computing node iteratively computes a new gradient based on W (t) and a local data sample:
node w1 calculates the gradient g1: g1 = ∇f(W(t); ξ1), where ξ1 represents the sample data of node w1;
node w2 calculates the gradient g2: g2 = ∇f(W(t); ξ2), where ξ2 represents the sample data of node w2;
node w3 calculates the gradient g3: g3 = ∇f(W(t); ξ3), where ξ3 represents the sample data of node w3.
The average gradient of the above 3 nodes is G1 = (1/3)*(g1 + g2 + g3);
and 2, sending the new gradient G1 to the corresponding server node.
And Step 3, repeating Step1 and Step 2 until t = T.
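A minimal group-side sketch of Steps 1 to 3: each node computes its gradient on its own samples and the group average G1 is returned for sending to the server node. Here grad_fn stands in for ∇f(W(t); ξ), and the toy quadratic loss in the usage example is an assumption used only to make the sketch runnable.

```python
import numpy as np

def group_step(W, node_samples, grad_fn):
    # Step 1: each node in the group computes g_i = grad_fn(W, xi_i) on its local samples.
    grads = [grad_fn(W, xi) for xi in node_samples]
    # Average over the group, e.g. G1 = (1/3)*(g1 + g2 + g3) for three nodes.
    G1 = sum(grads) / len(grads)
    return G1  # Step 2: G1 is what gets sent to the corresponding server node

# Toy usage with f(W; xi) = 0.5*(W - xi)^2, so grad_fn(W, xi) = W - xi.
samples = [np.array([1.0]), np.array([2.0]), np.array([3.0])]
G1 = group_step(np.array([0.0]), samples, grad_fn=lambda W, xi: W - xi)
```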
And finally, a result output module.
And the result output module is responsible for outputting the final result of the distributed training system and presenting the final result to the user in a visual mode. The user can further modify or adjust the training scheme according to the information, so that the system improvement is facilitated.
The embodiment provides a mixed distributed training method for fusing a centralization framework and a decentralization framework and adopting a synchronous updating strategy and an asynchronous updating strategy based on a grouping (clustering) idea. For this reason, 6 main functional modules are designed: the system comprises a heterogeneous computing node cluster information input module, a grouping module based on heterogeneous computing node clustering, a distributed communication architecture design module based on grouping information, a data/model input module, a heterogeneous hybrid parallel distributed training scheme module and a training result output module. Each module takes its own role and completes the training task cooperatively.
According to the similarity index definition provided by the embodiment, grouping or clustering of heterogeneous computing nodes is realized, so that similar equipment is divided into the same group, and various expenses caused by difference among the equipment are reduced;
meanwhile, a packet-based distributed communication architecture is designed and a flexible synchronous updating strategy is adopted. Aiming at the computing nodes in the same group, a decentralized framework is adopted and a synchronous updating strategy is executed, so that the computing precision and efficiency of the homogeneous nodes are ensured; aiming at the computing nodes of different groups, a centralized framework is adopted and an asynchronous updating strategy is executed, so that the resource utilization rate of the heterogeneous nodes is ensured, and resource idling and waste are avoided.
Therefore, in the embodiment, all the computing nodes are grouped to obtain a plurality of computing node groups, and then a differentiated communication architecture and a data updating strategy are executed between the computing nodes in each computing node group and each computing node group, so that the reduction of training efficiency caused by the difference between different computing nodes is avoided, the efficiency of the training process of the distributed model is improved, and the resources of the computing nodes are effectively utilized.
The following further describes a distributed training method for managing compute nodes according to another embodiment.
In this embodiment, the method may include:
the client sends the model and the data to be trained to the server so that the server can obtain the node information of each computing node; grouping all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types; setting a local decentralized communication architecture for the computing nodes in each computing node group, and setting a global centralized communication architecture between each computing node group; performing distributed model training in a plurality of computing node groups based on the input model and data to obtain and return a training result; the method comprises the following steps that a synchronous updating strategy is adopted among computing nodes in each computing node group, and an asynchronous updating strategy is adopted among the computing node groups;
and the client acquires the training result and displays the training result.
Therefore, in the embodiment, all the computing nodes are grouped to obtain a plurality of computing node groups, and then a differentiated communication architecture and a data updating strategy are executed between the computing nodes in each computing node group and each computing node group, so that the reduction of training efficiency caused by the difference between different computing nodes is avoided, the efficiency of the training process of the distributed model is improved, and the resources of the computing nodes are effectively utilized.
The following further describes a distributed training method for managing compute nodes according to another embodiment.
In this embodiment, the method may include:
the server acquires node information of each computing node; grouping all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types; setting a local decentralized communication architecture for the computing nodes in each computing node group, and setting a global centralized communication architecture between each computing node group; performing distributed model training in a plurality of computing node groups based on a model and data input by a client to obtain a training result; the method comprises the following steps that a synchronous updating strategy is adopted among computing nodes in each computing node group, and an asynchronous updating strategy is adopted among the computing node groups;
and the client displays the training result.
Therefore, in the embodiment, all the computing nodes are grouped to obtain a plurality of computing node groups, and then a differentiated communication architecture and a data updating strategy are executed between the computing nodes in each computing node group and each computing node group, so that the reduction of training efficiency caused by the difference between different computing nodes is avoided, the efficiency of the training process of the distributed model is improved, and the resources of the computing nodes are effectively utilized.
In the following, the distributed trained computing node management apparatus provided in the embodiment of the present application is introduced, and the below-described distributed trained computing node management apparatus and the above-described distributed trained computing node management method may be referred to in correspondence with each other.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a distributed training computing node management apparatus according to an embodiment of the present disclosure.
In this embodiment, the apparatus may include:
a node information obtaining module 100, configured to obtain node information of each computing node;
a node grouping module 200, configured to group all computing nodes based on node information of each computing node to obtain multiple computing node groups of different types;
a communication architecture setting module 300, configured to set a local decentralized communication architecture for the compute nodes in each compute node group, and set a global centralized communication architecture between each compute node group;
a model training module 400, configured to perform distributed model training in multiple computing node groups based on an input model and data to obtain a training result; and the computing nodes in each computing node group adopt a synchronous updating strategy, and the computing nodes in each computing node group adopt an asynchronous updating strategy.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a server provided in the embodiment of the present application, where the server may include:
a memory for storing a computer program;
and the processor is used for realizing the steps of the distributed training computing node management method when executing the computer program.
As shown in fig. 6, which is a schematic diagram of a composition structure of a server, the server may include: a processor 10, a memory 2, a communication interface 12 and a communication bus 13. The processor 10, the memory 2 and the communication interface 12 all communicate with each other via a communication bus 13.
In the embodiment of the present application, the processor 10 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a digital signal processor, a field-programmable gate array (FPGA) or other programmable logic device.
The processor 10 may call a program stored in the memory 2, and in particular, the processor 10 may perform the operations in an embodiment of the distributed training computing node management method.
The memory 2 is used for storing one or more programs, the program may include program codes, the program codes include computer operation instructions, in this embodiment, the memory 2 stores at least the program for implementing the following functions:
acquiring node information of each computing node;
grouping all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types;
setting a local decentralized communication architecture for the computing nodes in each computing node group, and setting a global centralized communication architecture between each computing node group;
performing distributed model training in a plurality of computing node groups based on the input model and data to obtain a training result; and the computing nodes in each computing node group adopt a synchronous updating strategy, and the computing nodes in each computing node group adopt an asynchronous updating strategy.
In one possible implementation, the memory 2 may include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created during use.
Further, the memory 2 may comprise high speed random access memory, and may also comprise non-volatile memory, such as at least one disk storage device or other volatile solid state storage device.
The communication interface 12 may be an interface of a communication module for connecting with other devices or systems.
Of course, it should be noted that the structure shown in fig. 6 does not constitute a limitation on the server in the embodiment of the present application, and in practical applications, the server may include more or less components than those shown in fig. 6, or some components may be combined.
The present application further provides a computer-readable storage medium, on which a computer program is stored, and when being executed by a processor, the computer program can implement the steps of any one of the above-mentioned distributed training computing node management methods.
The computer-readable storage medium may include: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
For the introduction of the computer-readable storage medium provided in the present application, please refer to the above method embodiments, which are not described herein again.
The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed in the embodiment corresponds to the method disclosed in the embodiment, so that the description is simple, and the relevant points can be referred to the description of the method part.
Those of skill in the art will further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The present application provides a computing node management method, apparatus, server, and computer-readable storage medium for distributed training. The principles and embodiments of the present application are described herein using specific examples, which are intended only to help understand the method of the present application and its core idea. It should be noted that those skilled in the art may make several improvements and modifications to the present application without departing from the principle of the present application, and such improvements and modifications also fall within the scope of the claims of the present application.

Claims (17)

1. A computing node management method for distributed training, characterized by comprising the following steps:
acquiring node information of each computing node;
grouping all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types;
setting a local decentralized communication architecture for the computing nodes in each computing node group, and setting a global centralized communication architecture between each computing node group;
and carrying out distributed model training in the plurality of computing node groups based on the input model and data to obtain a training result.
2. The method according to claim 1, wherein a synchronous update policy is applied between the computing nodes in each of the computing node groups, and an asynchronous update policy is applied between the computing node groups.
3. The method for managing computing nodes according to claim 1, wherein obtaining node information of each computing node comprises:
when a newly accessed computing node exists, acquiring node information of the newly accessed computing node; wherein the node information includes: hardware information, current load running state information, network connection and bandwidth conditions among the computing nodes;
and recording the node information in a database.
4. The method according to claim 1, wherein grouping all the computing nodes into groups based on the node information of each computing node to obtain a plurality of computing node groups of different types comprises:
performing similarity calculation on each computing node based on the node information of each computing node to obtain the similarity between each computing node;
and clustering all the computing nodes based on the similarity between each computing node to obtain a plurality of computing node groups.
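Claim 4 leaves the clustering procedure open. Purely as an illustrative sketch under that assumption (not taken from this application), the following Python snippet groups nodes with a simple union-find over a pairwise similarity matrix, merging any two nodes whose similarity reaches a threshold:

```python
from typing import Dict, List

def cluster_by_similarity(sim: List[List[float]], threshold: float = 0.8) -> List[List[int]]:
    """Union-find grouping: nodes whose pairwise similarity is at least the
    threshold end up in the same computing node group."""
    n = len(sim)
    parent = list(range(n))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    def union(i: int, j: int) -> None:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                union(i, j)

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

# Example: three nodes, where nodes 0 and 1 are highly similar.
sim = [[1.0, 0.9, 0.2],
       [0.9, 1.0, 0.1],
       [0.2, 0.1, 1.0]]
print(cluster_by_similarity(sim))   # [[0, 1], [2]]
```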
5. The method for managing computing nodes according to claim 4, wherein performing similarity calculation on each computing node based on the node information of each computing node to obtain a similarity between each computing node includes:
calculating the firmware similarity between each computing node based on the firmware information of each computing node;
calculating the network structure similarity between each computing node based on the network information of each computing node;
calculating the load similarity of each computing node based on the load information of each computing node;
and determining the similarity between each computing node based on the firmware similarity, the network structure similarity and the load similarity between each computing node.
6. The computing node management method of claim 5, wherein computing the firmware similarity between each computing node based on the firmware information of each computing node comprises:
calculating a hardware index for each of the compute nodes based on the firmware information for each of the compute nodes;
and calculating Euclidean distance between hardware indexes among each computing node, and taking the Euclidean distance as the similarity of the firmware among each computing node.
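For claim 6, a minimal sketch of the Euclidean-distance computation might look as follows; the composition of the hardware index vector shown here is a hypothetical example, not a definition from this application:

```python
import math
from typing import Sequence

def firmware_similarity(index_a: Sequence[float], index_b: Sequence[float]) -> float:
    """Euclidean distance between the hardware index vectors of two computing
    nodes, taken directly as the firmware similarity (smaller means more alike)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(index_a, index_b)))

# Hypothetical hardware indexes: (GPU count, memory in hundreds of GB, clock in GHz).
node_a = (8, 2.56, 1.5)
node_b = (8, 5.12, 1.4)
print(firmware_similarity(node_a, node_b))   # about 2.56
```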
7. The method for managing computing nodes according to claim 5, wherein calculating the network structure similarity between each computing node based on the network information of each computing node comprises:
calculating a network address distance and a network neighbor index between each computing node based on the network information of each computing node;
and taking the network address distance and the network neighbor index between each computing node as the network structure similarity between each computing node.
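Claim 7 does not specify how the network address distance or the network neighbor index is computed. The sketch below shows one hypothetical concretization (the highest differing bit of the IPv4 addresses as the address distance, and the Jaccard overlap of neighbor sets as the neighbor index); both choices are illustrative assumptions only:

```python
import ipaddress
from typing import Set

def address_distance(ip_a: str, ip_b: str) -> int:
    """Position of the highest bit in which the two IPv4 addresses differ:
    0 means identical addresses, larger values mean the nodes sit further
    apart in the address hierarchy."""
    xor = int(ipaddress.IPv4Address(ip_a)) ^ int(ipaddress.IPv4Address(ip_b))
    return xor.bit_length()

def neighbor_index(neighbors_a: Set[str], neighbors_b: Set[str]) -> float:
    """Jaccard overlap of the two nodes' directly connected neighbors."""
    if not neighbors_a and not neighbors_b:
        return 1.0
    return len(neighbors_a & neighbors_b) / len(neighbors_a | neighbors_b)

print(address_distance("10.0.1.5", "10.0.1.9"))            # 4 (same subnet, low bits differ)
print(neighbor_index({"n1", "n2", "n3"}, {"n2", "n3"}))    # 2/3, roughly 0.67
```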
8. The method of claim 5, wherein computing the load similarity of each computing node based on the load information of each computing node comprises:
calculating an equipment load condition index and a network bandwidth condition index of each computing node based on the load information of each computing node;
and taking the equipment load condition index and the network bandwidth condition index as the load similarity of the computing node.
9. The method for managing the computing nodes according to claim 6, wherein determining the similarity between each of the computing nodes based on the firmware similarity, the network structure similarity and the load similarity between each of the computing nodes comprises:
and performing weighted calculation on the firmware similarity, the network structure similarity and the load similarity between each of the computing nodes to obtain the similarity between each of the computing nodes.
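A one-line weighted combination in the spirit of claim 9 is sketched below; the specific weight values are illustrative and are not fixed by this application:

```python
def combined_similarity(firmware: float, network: float, load: float,
                        weights: tuple = (0.5, 0.3, 0.2)) -> float:
    """Weighted sum of the three component similarities between two computing nodes."""
    w_fw, w_net, w_load = weights
    return w_fw * firmware + w_net * network + w_load * load

print(round(combined_similarity(0.9, 0.8, 0.6), 2))   # 0.5*0.9 + 0.3*0.8 + 0.2*0.6 = 0.81
```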
10. The method of claim 1, wherein performing distributed model training in the plurality of computing node groups based on the input model and data to obtain a training result comprises:
processing the input model and data based on a distributed training format to obtain distributed training data and a distributed training model;
and performing distributed model training based on a synchronous updating strategy between the computing nodes in each computing node group, an asynchronous updating strategy between the computing node groups, and the distributed training data and model, to obtain the training result.
11. The method of claim 10, wherein processing the input model and data based on the format of distributed training to obtain the data and model of distributed training comprises:
and carrying out denoising processing and standardization processing on the input model and the input data based on the format of the distributed training to obtain the data and the model of the distributed training.
12. The method according to claim 10, wherein the asynchronous updating policy between each of the computing node groups comprises:
and executing an asynchronous updating strategy among each computing node group based on a preset buffer zone.
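Claim 12 only requires that the asynchronous inter-group update go through a preset buffer. A minimal, purely illustrative sketch using a bounded queue and threads (the names and the scalar "parameter" are hypothetical simplifications, not this application's implementation) could look like this:

```python
import queue
import threading

# Preset bounded buffer: each computing node group pushes its locally
# aggregated update here; the global side drains it asynchronously.
update_buffer = queue.Queue(maxsize=8)
global_param = 0.0

def group_worker(group_id: str, local_update: float) -> None:
    """A group finishes its synchronous intra-group step and posts the result."""
    update_buffer.put((group_id, local_update))   # blocks only if the buffer is full

def global_aggregator(num_updates: int) -> None:
    """Apply updates as they arrive, without waiting for all groups."""
    global global_param
    for _ in range(num_updates):
        group_id, update = update_buffer.get()
        global_param += update
        print(f"applied update from {group_id}, param = {global_param:.2f}")

workers = [threading.Thread(target=group_worker, args=(f"group{i}", 0.1 * (i + 1)))
           for i in range(3)]
aggregator = threading.Thread(target=global_aggregator, args=(3,))
aggregator.start()
for w in workers:
    w.start()
for w in workers:
    w.join()
aggregator.join()
```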
13. A computing node management method for distributed training, characterized by comprising the following steps:
a client sends a model to be trained and data to a server, so that the server acquires node information of each computing node, groups all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types, sets a local decentralized communication architecture for the computing nodes in each computing node group and a global centralized communication architecture between the computing node groups, and performs distributed model training in the plurality of computing node groups based on the input model and data to obtain and return a training result;
and the client acquires the training result and displays the training result.
14. A computing node management method for distributed training, characterized by comprising the following steps:
a server acquires node information of each computing node, groups all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types, sets a local decentralized communication architecture for the computing nodes in each computing node group and a global centralized communication architecture between the computing node groups, and performs distributed model training in the plurality of computing node groups based on the model and data input by a client to obtain a training result;
and the client displays the training result.
15. A computing node management apparatus for distributed training, characterized by comprising:
the node information acquisition module is used for acquiring the node information of each computing node;
the node grouping module is used for grouping all the computing nodes based on the node information of each computing node to obtain a plurality of computing node groups of different types;
the communication architecture setting module is used for setting a local decentralized communication architecture for the computing nodes in each computing node group and setting a global centralized communication architecture between each computing node group;
and the model training module is used for carrying out distributed model training in the plurality of computing node groups based on the input model and data to obtain a training result.
16. A server, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the computing node management method of any one of claims 1 to 12 when executing the computer program.
17. A computer-readable storage medium, characterized in that a computer program is stored thereon, and the computer program, when executed by a processor, implements the steps of the computing node management method according to any one of claims 1 to 12.
CN202310180801.XA 2023-03-01 2023-03-01 Distributed training computing node management method and related device Pending CN115865607A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310180801.XA CN115865607A (en) 2023-03-01 2023-03-01 Distributed training computing node management method and related device

Publications (1)

Publication Number Publication Date
CN115865607A (en) 2023-03-28

Family

ID=85659392

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310180801.XA Pending CN115865607A (en) 2023-03-01 2023-03-01 Distributed training computing node management method and related device

Country Status (1)

Country Link
CN (1) CN115865607A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020107351A1 (en) * 2018-11-29 2020-06-04 袁振南 Model training method and nodes thereof, network and storage device
CN110969198A (en) * 2019-11-24 2020-04-07 广东浪潮大数据研究有限公司 Distributed training method, device, equipment and storage medium for deep learning model
CN111813858A (en) * 2020-07-10 2020-10-23 电子科技大学 Distributed neural network hybrid synchronous training method based on self-organizing grouping of computing nodes
CN115081620A (en) * 2022-06-20 2022-09-20 上海电力大学 Acceleration distributed training method based on packet asynchronous parallel strategy

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116542324A (en) * 2023-07-06 2023-08-04 之江实验室 Distributed asynchronous protocol method and device for intelligent computing
CN116542324B (en) * 2023-07-06 2023-10-10 之江实验室 Distributed asynchronous protocol method and device for intelligent computing


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230328