CN112541584A - Deep neural network model parallel mode selection method - Google Patents
- Publication number
- CN112541584A (application CN201910897718.8A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- data
- network model
- model
- computing nodes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a deep neural network model parallel mode selection method comprising the following steps: S1, calculate the total data volume of the whole neural network model; S2, judge whether the total data volume obtained in S1 exceeds the total available memory of a single computing node used for training; if not, execute S3; if so, execute S4; S3, select the data parallel mode; S4, segment the network layers of the neural network model and obtain, from the segmentation result, the number of computing nodes the model requires; if the number of computing nodes in the input parameters is less than twice the number of nodes required by the model segmentation, execute S5; otherwise execute S6; S5, select the model parallel mode; S6, select a hybrid parallel mode combining data parallelism and model parallelism. By collecting and analyzing information on the model parameters, hyper-parameters and data volume, the invention automatically selects the distributed parallel mode and ensures high parallel performance.
Description
Technical Field
The invention relates to a method for selecting the parallel mode of a deep neural network model, and belongs to the technical field of deep learning.
Background
In distributed training under the data parallel mode, each computing node stores a replica of the model and processes a different part of the data set; this training method requires combining the results of the worker nodes and synchronizing model parameters between nodes. In distributed training under the model parallel mode, different network layers of the neural network model, or different parameters within the same layer, are assigned to different computing nodes, each of which is responsible for training a different part of the model. In the hybrid parallel mode, both model parallelism and data parallelism are present within the set of computing nodes performing distributed training; for example, model parallelism may be used within a group of nodes and data parallelism across the groups.
In recent years, with the development of deep learning, a wide variety of deep neural network models has emerged, and network depth has grown from a few layers to hundreds of layers. Although deeper networks greatly improve accuracy, they contain ever more parameters and take ever longer to train, which has become a major obstacle to the rapid development and wide application of deep learning. Training very large neural networks on very large data sets is infeasible on a single computing node and requires distributed parallel scaling. The main distributed scaling modes for deep learning are data parallelism, model parallelism and hybrid parallelism; selecting a suitable mode among them, however, remains an open problem for those skilled in the art.
Disclosure of Invention
The invention aims to provide a deep neural network model parallel mode selection method that automatically selects the distributed parallel mode by collecting and analyzing information on model parameters, hyper-parameters and data volume, ensuring high parallel performance.
In order to achieve this purpose, the invention adopts the following technical scheme: a deep neural network model parallel mode selection method in which the input parameters of the artificial intelligence training task comprise a neural network model file, the number of computing nodes and the size of a single training sample, the neural network model file containing the batch_size, the number of model parameters and the data type;
the parallel mode selection method comprises the following steps:
S1, a distributed scaling component in the artificial intelligence framework calculates the parameter data volume of the whole neural network model from the number of parameters and the data type, and calculates the input data volume from the single-sample size in the input parameters and the batch_size in the model file; the sum of the parameter data volume and the input data volume is the total data volume of the neural network model;
S2, the distributed scaling component judges whether the total data volume obtained in S1 exceeds the total available memory of a single computing node used for training; if not, S3 is executed; if so, S4 is executed;
S3, the data parallel mode is selected: according to the number of computing nodes in the input parameters, the distributed scaling component divides the training samples into as many parts as there are computing nodes; each node trains on its own sample data, gradient data is exchanged between nodes, and training is completed jointly;
S4, the network layers of the neural network model are segmented into several parts, the model parameters of each part being placed on one computing node, and the number of computing nodes the model requires is obtained from the segmentation result; if the number of computing nodes in the input parameters is less than twice the number of nodes required by the segmentation, S5 is executed, otherwise S6. The segmentation proceeds as follows: starting from the first network layer, the largest possible run of consecutive layers whose combined data volume does not exceed a node's total available memory is taken as one part; if the data volume of a single network layer exceeds a node's total available memory, that layer itself is divided into several parts according to the available memory;
S5, the model parallel mode is selected: the distributed scaling component assigns the parameters of each segmented part of the network layers to a different computing node;
S6, a hybrid parallel mode combining data parallelism and model parallelism is selected: all computing nodes are grouped, each group containing as many nodes as the model segmentation requires; model parallelism is used within each group, with intermediate data transmitted between the nodes of a group, and data parallelism is used across the node groups, with gradient data transmitted between groups.
Owing to the above technical scheme, the invention has the following advantages over the prior art:
The deep neural network model parallel mode selection method solves the problem of automatically choosing a parallel mode when different types of neural network models are scaled out for distributed training, as well as the problem of network-layer segmentation under model parallelism, without manual intervention by the user; by collecting and analyzing information on model parameters, hyper-parameters and data volume, it automatically selects the distributed parallel mode and ensures high parallel performance.
Drawings
FIG. 1 is a schematic diagram of a data parallel mode;
FIG. 2 is a schematic diagram of the model parallel mode;
FIG. 3 is a schematic diagram of a hybrid parallel mode;
FIG. 4 is a flow chart of the deep neural network model parallel mode selection method of the present invention.
Detailed Description
Embodiment: a deep neural network model parallel mode selection method in which the input parameters of the artificial intelligence training task comprise a neural network model file, the number of computing nodes and the size of a single training sample, the neural network model file containing the batch_size, the number of model parameters and the data type;
the parallel mode selection method comprises the following steps:
S1, a distributed scaling component in the artificial intelligence framework calculates the parameter data volume of the whole neural network model from the number of parameters and the data type, and calculates the input data volume from the single-sample size in the input parameters and the batch_size in the model file; the sum of the parameter data volume and the input data volume is the total data volume of the neural network model;
S2, the distributed scaling component judges whether the total data volume obtained in S1 exceeds the total available memory of a single computing node used for training (the node's total available memory can be obtained through a system interface); if not, S3 is executed; if so, S4 is executed;
S3, the data parallel mode is selected: according to the number of computing nodes in the input parameters, the distributed scaling component divides the training samples into as many parts as there are computing nodes; each node trains on its own sample data, gradient data is exchanged between nodes, and training is completed jointly;
S4, the network layers of the neural network model are segmented into several parts, the model parameters of each part being placed on one computing node, and the number of computing nodes the model requires is obtained from the segmentation result; if the number of computing nodes in the input parameters is less than twice the number of nodes required by the segmentation, S5 is executed, otherwise S6. The segmentation proceeds as follows: starting from the first network layer, the largest possible run of consecutive layers whose combined data volume does not exceed a node's total available memory is taken as one part; if the data volume of a single network layer exceeds a node's total available memory, that layer itself is divided into several parts according to the available memory;
S5, the model parallel mode is selected: the distributed scaling component assigns the parameters of each segmented part of the network layers to a different computing node;
S6, a hybrid parallel mode combining data parallelism and model parallelism is selected: all computing nodes are grouped, each group containing as many nodes as the model segmentation requires; model parallelism is used within each group, with intermediate data transmitted between the nodes of a group, and data parallelism is used across the node groups, with gradient data transmitted between groups.
The embodiment is explained further below:
The adaptive parallel mode selection method provided by the invention solves the problem of choosing a parallel mode when a deep neural network model is scaled out for distributed training: according to the type of the network model, the parameter size, the training data volume and the batch_size, it adaptively selects a suitable mode among data parallelism, model parallelism and hybrid parallelism to obtain better acceleration, and it also provides a network-layer segmentation method for the model parallel and hybrid parallel cases that distributes the model parameters over different nodes.
First, the parameter volume of the whole neural network model is measured. If the parameter data volume exceeds the total available memory of a single computing node, the model parallel mode is selected and the model is segmented by network layer; in this embodiment, the layers are divided into several parts of approximately equal execution time, according to the execution time of each layer.
Second, if the model parameters do not exceed the available memory of a single computing node, the data parallel mode is selected first, and the data volume to be distributed is then computed from the batch_size. If the sum of the data volume and the model parameters exceeds the available memory of a single node, the parameters of the layer with the most parameters are divided into two parts assigned to two computing nodes; those two nodes use hybrid parallelism combining data parallel and model parallel, and data parallelism is used between the node groups formed by each pair of nodes. If the parameter and data volumes still exceed node memory after the layer with the most parameters has been split, the remaining layers with larger parameter counts continue to be split.
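The fallback just described, repeatedly splitting the layer with the most parameters until every part fits, might be sketched as below. This is a simplified reading of the embodiment: it assumes each part must hold one (possibly split) layer plus its share of the batch data, and all names are hypothetical.

```python
# Hypothetical sketch of the iterative largest-layer split; a simplified
# interpretation of the embodiment, with assumed names and byte units.

def split_until_fit(layer_bytes, data_bytes_per_part, node_memory):
    """Split the largest layer in two until each part fits on a node."""
    # Assumed precondition: the per-part batch data alone fits in memory.
    assert data_bytes_per_part < node_memory
    layers = list(layer_bytes)
    while max(layers) + data_bytes_per_part > node_memory:
        i = layers.index(max(layers))
        half = layers[i] // 2
        layers[i:i + 1] = [layers[i] - half, half]  # split largest in two
    return layers
```

With layers of 10 and 4 units, 3 units of batch data per part, and 8 units of node memory, the 10-unit layer is split once into two 5-unit halves, after which every part fits.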
The method for selecting the parallel mode of a deep neural network model described above solves the problem of automatically choosing a parallel mode when different types of neural network models are scaled out, as well as the problem of network-layer segmentation under model parallelism, without manual intervention by the user; by collecting and analyzing information on model parameters, hyper-parameters and data volume, it automatically selects the distributed parallel mode and ensures high parallel performance.
To facilitate a better understanding of the invention, the terms used herein are briefly explained as follows:
Data parallel: different computing nodes hold copies of the same model, each node is assigned different data, and the computing results of all nodes are then combined in some manner.
Model parallel: different computing nodes are responsible for different parts of the network model and jointly train the same batch of data; intermediate data produced during computation must be transmitted between nodes.
batch_size: the number of samples selected for one training step of a deep learning model.
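As a quick illustration of the total-data-volume calculation in step S1, consider a 25-million-parameter float32 model trained with batch_size 32 and 0.5 MiB samples. The numbers are chosen here for illustration only and do not come from the patent.

```python
# Illustrative S1 arithmetic; the figures are made up for this example.

num_params = 25_000_000
param_bytes = num_params * 4             # float32 = 4 bytes -> 100 MB
input_bytes = 32 * 512 * 1024            # batch_size * sample size -> ~16.8 MB
total_bytes = param_bytes + input_bytes  # compared against node memory in S2
```

If a computing node has, say, 64 MB of available memory, this total exceeds it and the method proceeds to the segmentation of step S4; with 256 MB available, the data parallel mode of step S3 is chosen.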
The above embodiment merely illustrates the technical ideas and features of the present invention; its purpose is to enable those skilled in the art to understand and implement the invention, not to limit its scope of protection. All equivalent changes and modifications made according to the spirit of the present invention shall fall within its scope of protection.
Claims (1)
1. A deep neural network model parallel mode selection method, characterized in that: the input parameters of the artificial intelligence training task comprise a neural network model file, the number of computing nodes and the size of a single training sample, the neural network model file containing the batch_size, the number of model parameters and the data type;
the parallel mode selection method comprises the following steps:
S1, a distributed scaling component in the artificial intelligence framework calculates the parameter data volume of the whole neural network model from the number of parameters and the data type, and calculates the input data volume from the single-sample size in the input parameters and the batch_size in the model file; the sum of the parameter data volume and the input data volume is the total data volume of the neural network model;
S2, the distributed scaling component judges whether the total data volume obtained in S1 exceeds the total available memory of a single computing node used for training; if not, S3 is executed; if so, S4 is executed;
S3, the data parallel mode is selected: according to the number of computing nodes in the input parameters, the distributed scaling component divides the training samples into as many parts as there are computing nodes; each node trains on its own sample data, gradient data is exchanged between nodes, and training is completed jointly;
S4, the network layers of the neural network model are segmented into several parts, the model parameters of each part being placed on one computing node, and the number of computing nodes the model requires is obtained from the segmentation result; if the number of computing nodes in the input parameters is less than twice the number of nodes required by the segmentation, S5 is executed, otherwise S6. The segmentation proceeds as follows: starting from the first network layer, the largest possible run of consecutive layers whose combined data volume does not exceed a node's total available memory is taken as one part; if the data volume of a single network layer exceeds a node's total available memory, that layer itself is divided into several parts according to the available memory;
S5, the model parallel mode is selected: the distributed scaling component assigns the parameters of each segmented part of the network layers to a different computing node;
S6, a hybrid parallel mode combining data parallelism and model parallelism is selected: all computing nodes are grouped, each group containing as many nodes as the model segmentation requires; model parallelism is used within each group, with intermediate data transmitted between the nodes of a group, and data parallelism is used across the node groups, with gradient data transmitted between groups.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910897718.8A CN112541584B (en) | 2019-09-23 | 2019-09-23 | Deep neural network model parallel mode selection method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910897718.8A CN112541584B (en) | 2019-09-23 | 2019-09-23 | Deep neural network model parallel mode selection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112541584A true CN112541584A (en) | 2021-03-23 |
CN112541584B CN112541584B (en) | 2022-10-04 |
Family
ID=75012944
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910897718.8A Active CN112541584B (en) | 2019-09-23 | 2019-09-23 | Deep neural network model parallel mode selection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112541584B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113177632A (en) * | 2021-04-13 | 2021-07-27 | 支付宝(杭州)信息技术有限公司 | Model training method, device and equipment based on pipeline parallelism |
CN114565105A (en) * | 2022-03-02 | 2022-05-31 | 北京百度网讯科技有限公司 | Data processing method and deep learning model training method and device |
CN115061825A (en) * | 2022-08-09 | 2022-09-16 | 深圳致星科技有限公司 | Heterogeneous computing system and method for private computing, private data and federal learning |
CN116991560A (en) * | 2023-09-25 | 2023-11-03 | 粤港澳大湾区数字经济研究院(福田) | Parallel scheduling method, device, equipment and storage medium for language model |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109032671A (en) * | 2018-06-25 | 2018-12-18 | 电子科技大学 | A kind of distributed deep learning method and system based on data parallel strategy |
US20190188570A1 (en) * | 2017-12-20 | 2019-06-20 | Fujitsu Limited | Methods and apparatus for model parallelism in artificial neural networks |
- 2019-09-23: application CN201910897718.8A filed (CN); granted as CN112541584B, status active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190188570A1 (en) * | 2017-12-20 | 2019-06-20 | Fujitsu Limited | Methods and apparatus for model parallelism in artificial neural networks |
CN109032671A (en) * | 2018-06-25 | 2018-12-18 | 电子科技大学 | A kind of distributed deep learning method and system based on data parallel strategy |
Non-Patent Citations (2)
Title |
---|
HISAO ISHIBUCHI et al.: "Parallel Distributed Hybrid Fuzzy GBML Models With Rule Set Migration and Training Data Rotation", IEEE Transactions on Fuzzy Systems |
YANG Yuanfei et al.: "Research on Deep Convolutional Network Design Based on Parallelism and Slicing", Microelectronics & Computer |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113177632A (en) * | 2021-04-13 | 2021-07-27 | 支付宝(杭州)信息技术有限公司 | Model training method, device and equipment based on pipeline parallelism |
CN113177632B (en) * | 2021-04-13 | 2022-10-14 | 支付宝(杭州)信息技术有限公司 | Model training method, device and equipment based on pipeline parallelism |
CN114565105A (en) * | 2022-03-02 | 2022-05-31 | 北京百度网讯科技有限公司 | Data processing method and deep learning model training method and device |
CN115061825A (en) * | 2022-08-09 | 2022-09-16 | 深圳致星科技有限公司 | Heterogeneous computing system and method for private computing, private data and federal learning |
CN115061825B (en) * | 2022-08-09 | 2022-11-18 | 深圳致星科技有限公司 | Heterogeneous computing system and method for private computing, private data and federal learning |
CN116991560A (en) * | 2023-09-25 | 2023-11-03 | 粤港澳大湾区数字经济研究院(福田) | Parallel scheduling method, device, equipment and storage medium for language model |
CN116991560B (en) * | 2023-09-25 | 2024-04-16 | 粤港澳大湾区数字经济研究院(福田) | Parallel scheduling method, device, equipment and storage medium for language model |
Also Published As
Publication number | Publication date |
---|---|
CN112541584B (en) | 2022-10-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112541584B (en) | Deep neural network model parallel mode selection method | |
CN111242282B (en) | Deep learning model training acceleration method based on end edge cloud cooperation | |
CN105550323B (en) | Load balance prediction method and prediction analyzer for distributed database | |
CN108122027A (en) | A kind of training method of neural network model, device and chip | |
CN111064633B (en) | Cloud-edge cooperative power information communication equipment automated testing resource allocation method | |
CN108122032A (en) | A kind of neural network model training method, device, chip and system | |
CN107368891A (en) | A kind of compression method and device of deep learning model | |
CN110516325A (en) | A kind of CAE automation simulation analysis method and system | |
CN106095812A (en) | Intelligent test paper generation method based on similarity measurement | |
CN110705029A (en) | Flow field prediction method of oscillating flapping wing energy acquisition system based on transfer learning | |
CN109657794B (en) | Instruction queue-based distributed deep neural network performance modeling method | |
CN109818792B (en) | Controller based on second-order linear system time-varying coupling complex dynamic network model | |
CN109815855B (en) | Electronic equipment automatic test method and system based on machine learning | |
CN106250933A (en) | Method, system and the FPGA processor of data clusters based on FPGA | |
CN106204597A (en) | A kind of based on from the VS dividing method walking the Weakly supervised study of formula | |
CN113449878B (en) | Data distributed incremental learning method, system, equipment and storage medium | |
CN101399708A (en) | Method and device for establishing network performance model | |
CN112948123B (en) | Spark-based grid hydrological model distributed computing method | |
CN111695701B (en) | System for realizing data set construction processing based on federal learning and construction generation method thereof | |
CN117201308A (en) | Network resource allocation method, system, storage medium and electronic equipment | |
CN110610140A (en) | Training method, device and equipment of face recognition model and readable storage medium | |
CN113516163B (en) | Vehicle classification model compression method, device and storage medium based on network pruning | |
CN115293367A (en) | Mixed federal learning method of scheduling model under small sample unbalanced data constraint | |
CN114238106A (en) | Test time prediction method and device, electronic device and storage medium | |
CN108074240A (en) | Recognition methods, identification device, computer readable storage medium and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||