CN112912837A - Neural network compiling method, device, equipment, storage medium and program product

Neural network compiling method, device, equipment, storage medium and program product

Info

Publication number
CN112912837A
CN112912837A
Authority
CN
China
Prior art keywords
information
compiling
grouping
neural network
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201880098337.7A
Other languages
Chinese (zh)
Other versions
CN112912837B (en)
Inventor
蒋国跃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bitmain Technologies Inc
Original Assignee
Bitmain Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bitmain Technologies Inc filed Critical Bitmain Technologies Inc
Publication of CN112912837A publication Critical patent/CN112912837A/en
Application granted granted Critical
Publication of CN112912837B publication Critical patent/CN112912837B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F 7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F 7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F 7/483 Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G06F 7/487 Multiplying; Dividing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Nonlinear Science (AREA)
  • Stored Programmes (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A neural network compiling method, apparatus, device, storage medium and program product. The method comprises: acquiring intermediate representation information corresponding to the neural network, and determining grouping information according to the intermediate representation information (101); and compiling the groups of the neural network online according to the grouping information (102). Because the intermediate representation information is determined in advance according to the layer grouping result of the neural network and therefore contains the layer grouping information of the network, the grouping information can be determined from the acquired intermediate representation information and the neural network compiled on that basis, so the network can be compiled according to the grouping information even when tensor dimensions differ between layers.

Description

Neural network compiling method, device, equipment, storage medium and program product

Technical Field
The present application relates to the field of neural networks, and for example, to a neural network compiling method, apparatus, device, storage medium, and program product.
Background
In recent years, the achievements of deep learning in image recognition, speech recognition and similar tasks have made artificial intelligence one of the most active fields. The core of deep learning is the neural network; to reach high recognition accuracy, networks have grown deeper in the number of layers, which in turn places greater demands on computational power.
To accommodate the high computational demands of neural networks, various neural network processors (also referred to as AI chips) have been proposed. To run a neural network, it must be mapped onto such a dedicated processor for execution, and compiling the neural network is a crucial part of this process.
In the prior art, processor instructions can be generated from a neural network, but this approach supports only a single tensor dimension within the network. To support multiple tensor dimensions, the tensor dimensions of the data transmitted in the network must be determined in advance and instructions compiled for each tensor dimension separately. The prior-art compiling method is therefore limited and inconvenient in practice.
The above background is provided only to aid understanding of the present application and does not constitute an acknowledgement or admission that any of the matter referred to forms part of the common general knowledge relevant to the present application.
Disclosure of Invention
The embodiment of the disclosure provides an online compiling method for a neural network, comprising:
acquiring intermediate representation information corresponding to a neural network, wherein the intermediate representation information is determined in advance according to a result of performing layer grouping on the neural network;
determining grouping information of the neural network according to the intermediate representation information;
and compiling the groups of the neural network online according to the grouping information.
The embodiment of the present disclosure further provides an online compiling apparatus for a neural network, comprising:
an acquisition module, configured to acquire intermediate representation information corresponding to the neural network, wherein the intermediate representation information is determined in advance according to a result of performing layer grouping on the neural network;
a first determining module, configured to determine grouping information of the neural network according to the intermediate representation information;
and a compiling module, configured to compile the groups of the neural network online according to the grouping information.
The embodiment of the disclosure also provides a computer comprising the above neural network online compiling device.
The embodiment of the present disclosure further provides a computer-readable storage medium storing computer-executable instructions configured to execute the above neural network online compiling method.
The embodiment of the present disclosure also provides a computer program product comprising a computer program stored on a computer-readable storage medium, the computer program including program instructions which, when executed by a computer, cause the computer to execute the above neural network online compiling method.
An embodiment of the present disclosure further provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor, which when executed by the at least one processor, cause the at least one processor to perform the neural network online compilation method described above.
The neural network compiling method, apparatus, device, storage medium and program product provided by the disclosure comprise: acquiring intermediate representation information corresponding to a neural network, the intermediate representation information being determined in advance according to the result of layer grouping the network; determining the grouping information of the neural network according to the intermediate representation information; and compiling the groups of the neural network online according to the grouping information. Because the intermediate representation information is determined in advance from the layer grouping result and therefore carries the layer grouping information of the network, the grouping information can be derived from the acquired intermediate representation information and the neural network compiled on that basis, so the network can be compiled according to the grouping information even when tensor dimensions differ between layers.
Drawings
One or more embodiments are illustrated by way of example in the accompanying figures, which do not constitute a limitation on the embodiments; elements with the same reference numerals denote like elements, and:
FIG. 1 is a flow chart illustrating a neural network online compilation method in accordance with an exemplary embodiment of the present invention;
FIG. 2 is a flowchart illustrating a neural network online compilation method according to another exemplary embodiment of the present invention;
FIG. 2A is a flowchart illustrating online compilation of a group based on segmented data according to an exemplary embodiment of the present invention;
FIG. 3 is a block diagram illustrating an online neural network compiling apparatus according to an exemplary embodiment of the present invention;
FIG. 4 is a block diagram illustrating an online neural network compiling apparatus according to another exemplary embodiment of the present invention;
FIG. 5 is a block diagram illustrating an electronic device according to an exemplary embodiment of the present invention.
Detailed Description
So that the features and elements of the disclosed embodiments can be understood in detail, reference is made below to the embodiments, some of which are illustrated in the appended drawings. In the following description, numerous details are set forth for purposes of explanation in order to provide a thorough understanding of the disclosed embodiments; however, one or more embodiments may be practiced without these details. In other instances, well-known structures and devices are shown in simplified form to keep the drawings uncluttered.
The method provided by this embodiment compiles a neural network based on a grouping of the network's layers performed in advance. The specific grouping method may be an existing one from the prior art; some layer grouping considerations are described in this embodiment, but any method that groups the layers of the neural network can be used, and this embodiment does not limit it.
Some neural network processors employ local storage that can be managed by software: the computing layers of the neural network are deployed by software to compute in the local storage so as to achieve high performance. Local storage here means the internal storage of the processor. In order to place the layers of the neural network in local storage for calculation as much as possible, avoiding high-overhead global storage accesses, and to utilize the local storage more efficiently, the layers of the neural network can be grouped and fused, and the layers then run in units of groups. The method provided by this embodiment compiles the neural network based on this layer grouping result so as to support data of different tensor dimensions.
Fig. 1 is a flowchart illustrating an online neural network compiling method according to an exemplary embodiment of the present invention.
As shown in fig. 1, the online neural network compiling method provided in this embodiment includes:
step 101, obtaining intermediate representation information corresponding to the neural network, and determining grouping information according to the intermediate representation information.
The intermediate representation information is determined in advance according to the result of performing layer grouping on the neural network.
Specifically, in the method provided in this embodiment, layer grouping may be performed on the neural network in advance, and the intermediate representation information determined according to the grouping result. The layer grouping may be executed by a grouping device while the method provided by this embodiment is executed by a compiling device; the grouping device and the compiling device may be the same device or different devices.
Further, a storage unit for storing the intermediate representation information may be provided, and the compiling device may read the intermediate representation information from it.
In practical application, the intermediate representation information may carry an identifier corresponding to the neural network, so that the intermediate representation information for the neural network to be compiled can be obtained according to that identifier.
The intermediate representation information may carry several kinds of layer grouping information of the neural network. The layer grouping information may include: the number of layer groups (for example, the layers of the network are divided into 50 groups); the input data tensor dimensions, output data tensor dimensions and identifiers of the input and output data of the whole network, where a tensor is a multidimensional data storage form; the layers contained in each group; and the input and output data tensor dimensions and data identifiers corresponding to each layer.
Specifically, if the amount of data handled by the neural network is large, the tensor dimensions of the data transmitted between layers can be further cut during layer grouping so that the data can be processed group by group. In that case the intermediate representation information may also include inter-layer tensor dimension cutting information, as well as time step information: which layers are calculated at each time step, which tensors need to be transferred from external storage to local storage, and so on.
The format protocol of the intermediate representation information may also be agreed in advance; for example, the top level of the information describes the neural network, the second level describes each group in the network, and the third level describes each layer within each group.
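For illustration only, the three-level format just described might be modelled as nested records. The following is a minimal sketch in Python; the names NetworkIR, GroupIR and LayerIR and all field names are hypothetical and are not taken from the patent text.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class LayerIR:
    """Third level: one layer inside a group."""
    name: str
    input_ids: List[str]      # identifiers of the layer's input data
    output_ids: List[str]     # identifiers of the layer's output data
    input_dims: List[int]     # input data tensor dimensions
    output_dims: List[int]    # output data tensor dimensions

@dataclass
class GroupIR:
    """Second level: one layer group of the neural network."""
    group_id: int
    layers: List[LayerIR]
    # optional inter-layer tensor dimension cutting information
    cut_info: Dict[str, List[int]] = field(default_factory=dict)
    # time step -> actions at that step (layer computations, data transfers)
    time_steps: Dict[int, List[str]] = field(default_factory=dict)

@dataclass
class NetworkIR:
    """Top level: the whole neural network."""
    network_id: str           # identifier used to look the IR up
    num_groups: int           # e.g. the layers are divided into 50 groups
    input_dims: List[int]     # input data tensor dimensions of the network
    output_dims: List[int]    # output data tensor dimensions of the network
    groups: List[GroupIR]
```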
Information can be extracted from the intermediate representation information to determine the grouping information of the neural network: for example, how many groups the layers of the network are divided into, which layers belong to each group, which layers run at each time step, and which data tensors are transmitted. Specifically, the information required for compilation may be extracted from the intermediate representation information as the grouping information, and the grouping information may include the information corresponding to each group.
Step 102, compiling the groups of the neural network online according to the grouping information.
Further, the actions performed at each time step, for example which layer is calculated at that step, may be determined according to the grouping information, and the corresponding layer calculation performed at the corresponding time step.
In practical application, the grouping information may further include the input and output data tensors of each layer, so the input tensors corresponding to the layers can be acquired from this input/output information and each group of the neural network calculated on that basis. Even if the tensor dimensions transmitted between layers differ, the method provided by this embodiment can process the inter-layer tensors according to the intermediate representation information, without compiling the neural network for each inter-layer tensor dimension in advance.
In the compiling process, the groups may be compiled one by one: for example, the first group is compiled first and then the second. Each group may be compiled based on the grouping information.
Further, the grouping information may include data transfer information tied to time steps; for example, if the output tensor M of layer A is to be written to external storage at the 10th time step, the output tensor of layer A can be written to external storage based on this information.
Specifically, the control processor may determine the specific execution instructions according to the grouping information and send them to the neural network processor, which executes the corresponding instructions, thereby running the neural network.
Furthermore, the control processor and the neural network processor may be arranged in the same neural network processor chip and be responsible for different functions, with the chip arranged in the compiling device.
In practical application, if the grouping information further includes inter-layer tensor dimension cutting information, the control processor may also generate corresponding data cutting instructions according to this information, so that the neural network processor cuts the inter-layer tensors accordingly.
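Putting steps 101 and 102 together, the per-group, per-time-step flow described above might look like the following sketch. It reuses the hypothetical NetworkIR structure from the earlier example; build_instruction and send stand in for whatever interface the control processor actually exposes, which the patent does not specify.

```python
def compile_network_online(ir: NetworkIR, control_processor) -> None:
    """Sketch of step 102: compile the network group by group,
    walking the time steps recorded in the grouping information."""
    for group in ir.groups:                    # first group, then second, ...
        for t in sorted(group.time_steps):     # actions in time-step order
            for action in group.time_steps[t]:
                # hypothetical calls: turn one action (a layer computation
                # or a tensor transfer) into an execution instruction and
                # hand it to the neural network processor
                instruction = control_processor.build_instruction(action)
                control_processor.send(instruction)
```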
The neural network online compiling method provided by this embodiment comprises: acquiring intermediate representation information corresponding to a neural network, the intermediate representation information being determined in advance according to the result of layer grouping the network; determining the grouping information of the neural network according to the intermediate representation information; and compiling the groups of the neural network online according to the grouping information. Because the intermediate representation information is determined in advance from the layer grouping result and therefore carries the layer grouping information of the network, the grouping information can be derived from it and the neural network compiled on that basis, so the network can be compiled according to the grouping information even when tensor dimensions differ between layers.
Fig. 2 is a flowchart illustrating an online neural network compiling method according to another exemplary embodiment of the present invention.
As shown in fig. 2, the method for online compiling a neural network provided in this embodiment includes:
step 201, obtaining intermediate representation information corresponding to the neural network, and determining grouping information of the neural network according to the intermediate representation information.
The intermediate representation information is determined in advance according to the result of performing layer grouping on the neural network.
The specific principle and implementation of step 201 are similar to those of step 101, and are not described herein again.
Specifically, the grouping information includes input data tensor information, output data tensor information, and preset tensor information of the data processed by each group.
The input data tensor information may include the input tensor information corresponding to each group as well as that of each layer; likewise, the output data tensor information may include the output tensor information corresponding to each group as well as that of each layer.
Step 202, determining grouped input data tensor information and preset tensor information according to the grouped information.
Specifically, in the compiling process the groups may be compiled in order, with the group being processed taken as the current group. The data information corresponding to the current group is acquired from the grouping information, and the input data tensor information and the preset tensor information of the current group are determined.
The input data of the current group may be determined from the input data tensor information; for example, if data A is input into the current group, an instruction to acquire data A from external storage may be generated and sent to the neural network processor, which performs the action.
The preset tensor information refers to the largest data dimensions the current group can process at a time, and may specifically cover two dimensions: the batch size and the height slice. The preset tensor information may also be cutting information for the input data tensor, for example how many cuts to make along the batch-size and height dimensions.
Step 203, segmenting the group's input data according to the preset tensor information and the input data tensor information to obtain segmented data.
Further, the input data tensor information includes an identifier of the input data, and the preset tensor information includes a preset tensor dimension. The input data of the current group can be determined from the input data identifier, and the data dimensions processed by the group at a time are determined by the preset tensor dimension.
In practical application, the input data corresponding to the current group is determined from the data identifier and segmented according to the preset tensor dimension, so that the dimensions of each segment are smaller than or equal to the preset tensor dimension. The input data can be cut along the two dimensions of batch size and height slice so that the cut size matches the preset tensor dimension. If the preset tensor information is instead cutting information for the input data tensor, the tensor dimensions of the input data are cut directly according to that cutting information.
Specifically, the preset tensor dimension of each group's input data is determined when the neural network is layer-grouped: for a group whose input data tensor dimensions equal the preset tensor dimension, the group can compute on that input and the storage unit in the processor can hold the data generated by the computation. If the input data dimensions are larger than the preset tensor dimension, the data generated during the group's actual computation would exceed the storage space in the processor and normal computation would be impossible.
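As an illustration of step 203, the sketch below cuts an input tensor along the batch and height dimensions so that every segment is no larger than the preset tensor dimensions. The NCHW layout and the NumPy API are assumptions for the example; the patent does not fix a concrete data layout.

```python
import numpy as np

def segment_input(data: np.ndarray, max_batch: int, max_height: int) -> list:
    """Cut `data` (layout N, C, H, W) along batch (N) and height (H)
    so that each segment's dimensions are <= the preset tensor dims."""
    n, _, h, _ = data.shape
    segments = []
    for n0 in range(0, n, max_batch):          # batch-size dimension
        for h0 in range(0, h, max_height):     # height-slice dimension
            segments.append(data[n0:n0 + max_batch, :, h0:h0 + max_height, :])
    return segments

# e.g. a (4, 3, 224, 224) input with max_batch=2, max_height=112
# yields 4 segments, each of shape (2, 3, 112, 224)
```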
Step 204, compiling the group online according to the segmented data.
Further, an instruction to acquire the segmented data may be generated and sent to the neural network processor, together with instructions to compute the group piece by piece from the segmented data. The neural network processor reads the segmented data according to these instructions and calculates the current group on it.
FIG. 2A is a flowchart illustrating online compilation of a group based on segmented data according to an exemplary embodiment of the present invention.
In practical application, the grouping information further includes global tensor information and time step information.
At this time, as shown in fig. 2A, step 204 further includes:
step 2041, one of the segmented data is read according to a preset reading rule.
The rule for reading the segmented data may be preset; for example, if the input data is divided into 4 parts along the height dimension, the first part may be read first, then the second, and so on up to the fourth.
Specifically, when the input data is divided along several dimensions, the segmented data can be read according to the association of the segments across those dimensions. For example, if the input data is divided into 4 parts in total, cut along both batch size and data height, the four segments can be denoted (batch, height) = (1,1), (1,2), (2,1), (2,2) and read in the order (1,1), then (1,2), then (2,1), and finally (2,2).
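The fixed reading order in this example can be produced mechanically. The sketch below is one hypothetical realization in which the batch index varies slowest and the height index fastest, matching (1,1), (1,2), (2,1), (2,2).

```python
from itertools import product

def read_schedule(batch_parts: int, height_parts: int) -> list:
    """Preset reading rule (an assumption): iterate height within batch."""
    return list(product(range(1, batch_parts + 1),
                        range(1, height_parts + 1)))

assert read_schedule(2, 2) == [(1, 1), (1, 2), (2, 1), (2, 2)]
```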
Step 2042, determining the step size and the processing intercept point for processing the segmented data according to the group's global tensor information.
Further, the grouping information may include global tensor information for each group, and the global tensor information of the current group may be acquired from it.
In practical applications, neural network layers such as convolutional and pooling layers have a convolution kernel size, a data padding size and a kernel movement stride. When a group processes segmented data, the data is actually handled by the several layers the group contains, and a layer does not consume a segment in one pass but scans it repeatedly, each scan covering the kernel size; after one scan, the start position of the next scan is determined by the kernel movement stride. The kernel sizes, padding sizes and stride parameters of the layers in a group are synthesized to determine the group's overall kernel size, padding size and stride, and these group-level parameters are the global tensor information. The grouping information may include the global tensor information corresponding to each group.
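The patent states that the per-layer parameters are synthesized into group-level values but does not give the combination rule. One standard choice, offered here only as a plausible reading, is the usual receptive-field composition for stacked convolution/pooling layers:

```python
def fuse_conv_params(layers: list) -> tuple:
    """Combine per-layer (kernel, stride, padding) tuples, listed in
    execution order, into one group-level (kernel, stride, padding).
    One spatial dimension only, for brevity; this composition rule is
    an assumption, not quoted from the patent."""
    k_eff, s_eff, p_eff = 1, 1, 0
    for k, s, p in layers:
        k_eff += (k - 1) * s_eff   # receptive field grows by k-1 input steps
        p_eff += p * s_eff         # padding as seen at the group boundary
        s_eff *= s                 # strides multiply through the stack
    return k_eff, s_eff, p_eff

# two 3x3 stride-1 convolutions scan like a single 5x5 stride-1 window
assert fuse_conv_params([(3, 1, 0), (3, 1, 0)]) == (5, 1, 0)
```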
When processing of a piece of segmented data starts, the current time step can be denoted t, and the step size and processing intercept point of the first pass over the data are determined from the global tensor information: that is, how much data the first pass covers and where it stops. From that intercept point and the global tensor information, the data and intercept point of the next pass are determined in turn.
Step 2043, determining the partition sub-data according to the step size and the processing intercept point, and generating instructions according to the time step information in the grouping information and the partition sub-data.
The partition sub-data is determined from the step size and the processing intercept point. For the first piece, the data between the start of the segment and the intercept point is taken as the partition sub-data; for each later piece, the data within one step size after the previous processing intercept point is taken.
In the method provided in this embodiment, the grouping information may further include the time step information corresponding to each group, i.e. the specific actions to execute at each time step, for example transferring data or performing a layer calculation.
The time step information for step t+1 of the current group can then be acquired and an action instruction generated for it: a data transfer instruction if data is to be moved, or an instruction to perform a layer calculation on the partition sub-data if layer calculation information is included. The generated instructions may be sent to the neural network processor for execution.
Specifically, in step 2043 the current time step is incremented by 1 and the action instruction corresponding to that time step is generated, which may be a data transfer instruction or a layer calculation instruction. At some time step a layer in the group may process the partition sub-data and output other data, and other layers in the group may take that output as input for further processing. All time steps corresponding to the group are therefore traversed until the time step information of the current group is exhausted, at which point the group has finished processing the partition sub-data.
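Steps 2042-2043 can be pictured as walking one segment with a window: the first piece of partition sub-data runs from the segment start to the first processing intercept point, and each later piece covers one step beyond the previous intercept point. A hypothetical one-dimensional sketch:

```python
def sub_segments(seg_len: int, first_window: int, step: int):
    """Yield (begin, end) index pairs for the partition sub-data of one
    segment. `first_window` and `step` stand for the values derived
    from the global tensor information; a 1-D assumption for brevity."""
    end = min(first_window, seg_len)
    yield 0, end                               # first partition sub-data
    while end < seg_len:                       # intercept not yet at the tail
        begin, end = end, min(end + step, seg_len)
        yield begin, end                       # step range after the intercept

# e.g. a segment of length 10, first window 4, step 3 yields
# (0, 4), (4, 7), (7, 10)
```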
Step 2044, determining whether the segmented data has been fully processed according to the processing intercept point.
After the instructions have been generated for the partition sub-data, it can be judged whether the processing intercept point corresponding to the sub-data lies at the tail of the segmented data. If so, the currently acquired segmented data is considered fully processed; if not, it is determined that the currently acquired segmented data has not been fully processed.
If not, continue to step 2042 until the currently acquired segmentation data is processed.
If the processing of the currently acquired segmentation data is completed, step 2045 is executed.
Step 2045, determining whether all the segmented data have been processed. If so, the current group can be considered compiled and the next group can be compiled; otherwise, step 2041 is executed until all the segmented data are processed.
Each piece of segmented data may have a corresponding identifier, and processed pieces may be marked, so that whether any unprocessed segmented data remains can be determined by comparing the marked pieces against all the pieces.
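One simple realization of this bookkeeping, assuming each segment carries an identifier, is to keep a set of processed identifiers and compare it with the full set:

```python
def all_segments_processed(processed: set, all_ids: set) -> bool:
    """Step 2045 check: every segment identifier has been marked."""
    return processed == all_ids

all_ids = {(1, 1), (1, 2), (2, 1), (2, 2)}
processed = set()
processed.add((1, 1))          # mark a segment once it is processed
print(all_segments_processed(processed, all_ids))  # False
```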
Based on steps 2041-2045, a group can be compiled online according to the input data tensor information, the preset tensor information, the global tensor information and the time step information. Steps 202-204 may be performed for each group in the neural network to compile every group.
Optionally, in the method provided by this embodiment, the intermediate representation information may further include the number of groups N of the neural network.
In that case, the method provided in this embodiment may further include, after step 201:
Step 205, determining the number of groups whose compilation is complete, and determining from this number and the group count whether all groups have been compiled.
If not, step 202 is executed again, i.e. the step of compiling the groups of the neural network online according to the grouping information continues, specifically by compiling the next group.
If so, all the groups have been compiled and the compiling process may end.
In actual application, a compiling identifier may be set to determine whether all the groups are compiled.
Before step 202, the compiling identifier may be initialized to 0.
The compiling identifier marks the number of groups already compiled, which is initially 0.
After step 204, specifically after it is determined in step 2045 that all the segmented data have been processed, 1 is added to the compiling identifier, and the identifier is compared with the value N. If they are consistent, the compiling process ends; if not, step 202 is executed again, and after step 204 finishes, 1 is again added to the compiling identifier and the comparison with N repeated, until the identifier equals N. Based on the method provided by this embodiment, all the groups in the neural network are compiled and none is omitted.
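Steps 205-208 amount to a counter loop over the groups. A minimal sketch, again reusing the hypothetical NetworkIR structure and assuming num_groups equals the length of the group list:

```python
def compile_all_groups(ir: NetworkIR, compile_one_group) -> None:
    """Sketch of steps 205-208: count compiled groups and compare
    the compiling identifier with the group number N."""
    compiled = 0                           # compiling identifier initialized to 0
    while compiled != ir.num_groups:       # compare identifier with N
        compile_one_group(ir.groups[compiled])
        compiled += 1                      # add 1 after each group is compiled
    # identifier == N: all groups compiled, none omitted
```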
FIG. 3 is a block diagram illustrating an online neural network compiling apparatus according to an exemplary embodiment of the present invention.
As shown in FIG. 3, the neural network online compiling device provided in this embodiment includes:
an obtaining module 31, configured to obtain intermediate representation information corresponding to a neural network, where the intermediate representation information is determined according to a grouping result by performing layer grouping on the neural network in advance;
a first determining module 32, configured to determine grouping information of the neural network according to the intermediate representation information;
and the compiling module 33 is configured to compile the groups of the neural network online according to the grouping information.
The online neural network compiling device provided by this embodiment comprises an acquisition module for acquiring intermediate representation information corresponding to a neural network, the intermediate representation information being determined in advance according to the result of layer grouping the network; a first determining module for determining the grouping information of the neural network according to the intermediate representation information; and a compiling module for compiling the groups of the neural network online according to the grouping information. In the apparatus provided by this embodiment, the intermediate representation information is determined in advance from the layer grouping result and therefore carries the layer grouping information of the network; the grouping information can be derived from the acquired intermediate representation information and the neural network compiled on that basis, so the network can be compiled according to the grouping information even when tensor dimensions differ between layers.
The specific principle and implementation of the neural network online compiling device provided by this embodiment are similar to those of the embodiment shown in fig. 1, and are not described herein again.
FIG. 4 is a block diagram illustrating an online neural network compiling apparatus according to another exemplary embodiment of the present invention.
As shown in FIG. 4, on the basis of the above embodiment, in the neural network online compiling apparatus provided by this embodiment the grouping information includes input data tensor information, output data tensor information, and preset tensor information of the data processed by each group;
the compiling module 33 includes:
a first determining unit 331 configured to determine, according to the grouping information, tensor information of input data of the group and preset tensor information;
a dividing unit 332, configured to segment the group's input data according to the preset tensor information and the input data tensor information to obtain segmented data;
and a compiling unit 333, configured to compile the group online according to the segmented data.
Optionally, the input data tensor information includes an input data identifier; the preset tensor information comprises preset tensor dimensionality;
the dividing unit 332 is specifically configured to:
determining input data of the group according to the input data identification;
and segmenting the input data according to the preset tensor dimension to obtain segmented data, wherein the dimension of the segmented data is smaller than or equal to the preset tensor dimension.
Optionally, the grouping information further includes global tensor information and time step information;
the compiling unit 333 is specifically configured to:
reading one piece of the segmented data according to a preset reading rule;
determining the step size and the processing intercept point for processing the segmented data according to the group's global tensor information;
determining the partition sub-data according to the step size and the processing intercept point, and generating instructions according to the time step information in the grouping information and the partition sub-data;
and judging whether the segmented data has been fully processed according to the processing intercept point; if not, continuing to execute the step of determining the step size and the processing intercept point according to the group's global tensor information.
Optionally, if the compiling unit 333 determines that the segmented data has been fully processed, the compiling module 33 further includes a second determining unit 334 configured to determine whether all the segmented data have been processed; if not, the compiling unit 333 continues to execute the step of reading one piece of the segmented data according to the preset reading rule.
Optionally, the intermediate representation information includes the number of groups of the neural network;
the device further comprises:
a second determining module 34, configured to determine the number of groups whose compilation is complete and to determine from this number and the group count whether all groups have been compiled; if not, the compiling module 33 continues to execute the step of compiling the groups of the neural network online according to the grouping information.
The second determining module 34 is specifically configured to:
initializing a compiling identifier to be 0;
after the compiling module 33 compiles a group of the neural network online according to the grouping information, the second determining module 34 obtains a new compiling identifier by adding 1 to the compiling identifier;
the second determining module 34 determining whether all groups have been compiled according to the compiling count and the group count includes:
comparing the compiling identifier with the number of groups; if they are consistent, determining that all groups have been compiled.
The embodiment of the disclosure also provides a computer, which comprises the neural network online compiling device.
The embodiment of the present disclosure further provides a computer-readable storage medium, in which computer-executable instructions are stored, and the computer-executable instructions are configured to execute the neural network online compiling method.
The disclosed embodiments also provide a computer program product comprising a computer program stored on a computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform the above neural network online compilation method.
The computer-readable storage medium described above may be a transitory computer-readable storage medium or a non-transitory computer-readable storage medium.
FIG. 5 is a block diagram illustrating an electronic device according to an exemplary embodiment of the present invention.
As shown in FIG. 5, the electronic device provided in this embodiment includes:
at least one processor 50 (one processor 50 is taken as an example in FIG. 5) and a memory 51, and may further include a communication interface 52 and a bus 53. The processor 50, the communication interface 52 and the memory 51 may communicate with each other via the bus 53. The communication interface 52 may be used for information transfer. The processor 50 may call logic instructions in the memory 51 to perform the neural network online compiling method of the above embodiments.
In addition, the logic instructions in the memory 51 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium.
The memory 51 is a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, such as program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 50 executes the software program, the instructions and the modules stored in the memory 51, thereby executing the functional application and the data processing, that is, implementing the neural network online compiling method in the above method embodiment.
The memory 51 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal device, and the like. Further, the memory 51 may include a high-speed random access memory, and may also include a nonvolatile memory.
The technical solution of the embodiments of the present disclosure may be embodied in the form of a software product, where the computer software product is stored in a storage medium and includes one or more instructions to enable a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method of the embodiments of the present disclosure. And the aforementioned storage medium may be a non-transitory storage medium comprising: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes, and may also be a transient storage medium.
Although the terms "first," "second," etc. may be used in this application to describe various elements, these elements should not be limited by these terms; the terms are only used to distinguish one element from another. For example, a first element could be termed a second element and, similarly, a second element could be termed a first element, without changing the meaning of the description, so long as all occurrences of the "first element" are renamed consistently and all occurrences of the "second element" are renamed consistently. The first and second elements are both elements but may not be the same element.
The words used in this application are words of description only and do not limit the claims. As used in the description of the embodiments and the claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application encompasses any and all possible combinations of one or more of the associated listed items. Furthermore, the terms "comprises" and/or "comprising," when used in this application, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The various aspects, implementations, or features of the described embodiments can be used alone or in any combination. Aspects of the described embodiments may be implemented by software, hardware, or a combination of software and hardware. The described embodiments may also be embodied by a computer-readable medium having computer-readable code stored thereon, the computer-readable code comprising instructions executable by at least one computing device. The computer readable medium can be associated with any data storage device that can store data which can be read by a computer system. Exemplary computer readable media can include read-only memory, random-access memory, CD-ROMs, HDDs, DVDs, magnetic tape, and optical data storage devices, among others. The computer readable medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
The above description of the technology may refer to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration embodiments in which the described embodiments may be practiced. These embodiments, while described in sufficient detail to enable those skilled in the art to practice them, are non-limiting; other embodiments may be utilized and changes may be made without departing from the scope of the described embodiments. For example, the order of operations described in a flowchart is non-limiting, and thus the order of two or more operations illustrated in and described in accordance with the flowchart may be altered in accordance with several embodiments. As another example, in several embodiments, one or more operations illustrated in and described with respect to the flowcharts are optional or may be eliminated. Additionally, certain steps or functions may be added to the disclosed embodiments, or two or more steps may be permuted in order. All such variations are considered to be encompassed by the disclosed embodiments and the claims.
Additionally, terminology is used in the foregoing description of the technology to provide a thorough understanding of the described embodiments. However, no unnecessary detail is required to implement the described embodiments. Accordingly, the foregoing description of the embodiments has been presented for purposes of illustration and description. The embodiments presented in the foregoing description and the examples disclosed in accordance with these embodiments are provided solely to add context and aid in the understanding of the described embodiments. The above description is not intended to be exhaustive or to limit the described embodiments to the precise form disclosed. Many modifications, alternative uses, and variations are possible in light of the above teaching. In some instances, well known process steps have not been described in detail in order to avoid unnecessarily obscuring the described embodiments.

Claims (18)

  1. An online neural network compiling method, comprising:
    acquiring intermediate representation information corresponding to a neural network, wherein the intermediate representation information is determined in advance according to a result of performing layer grouping on the neural network;
    determining grouping information of the neural network according to the intermediate representation information;
    and compiling the groups of the neural network online according to the grouping information.
  2. The method according to claim 1, wherein the grouping information includes input data tensor information, output data tensor information, and preset tensor information of the data processed by the group;
    the compiling the groups of the neural network online according to the grouping information comprises:
    determining the input data tensor information and the preset tensor information of the group according to the grouping information;
    segmenting the group's input data according to the preset tensor information and the input data tensor information to obtain segmented data;
    and compiling the group online according to the segmented data.
  3. The method of claim 2, wherein the input data tensor information comprises an input data identification; the preset tensor information comprises preset tensor dimensionality;
    the segmenting the group's input data according to the preset tensor information and the input data tensor information to obtain segmented data comprises:
    determining input data of the group according to the input data identification;
    and segmenting the input data according to the preset tensor dimension to obtain segmented data, wherein the dimension of the segmented data is smaller than or equal to the preset tensor dimension.
  4. The method of claim 2, wherein the grouping information further comprises: global tensor information, time step information;
    the compiling the group online according to the segmented data comprises:
    reading one piece of the segmented data according to a preset reading rule;
    determining a step size and a processing intercept point for processing the segmented data according to the group's global tensor information;
    determining partition sub-data according to the step size and the processing intercept point, and generating instructions according to the time step information in the grouping information and the partition sub-data;
    and judging whether the segmented data has been fully processed according to the processing intercept point; if not, continuing to execute the step of determining the step size and the processing intercept point according to the group's global tensor information.
  5. The method according to claim 4, wherein if it is determined that the segmented data has been fully processed, it is determined whether all the segmented data have been processed; if not, the step of reading one piece of the segmented data according to the preset reading rule continues to be executed.
  6. The method of claim 1, wherein the intermediate representation information includes a number of groups of the neural network;
    the method further comprises:
    determining the number of groups whose compilation is complete, and determining from this number and the number of groups whether all groups have been compiled; if not, continuing to execute the step of compiling the groups of the neural network online according to the grouping information.
  7. The method of claim 6,
    the determining the number of groups whose compilation is complete comprises:
    initializing a compiling identifier to 0;
    after a group of the neural network is compiled online according to the grouping information, obtaining a new compiling identifier by adding 1 to the compiling identifier;
    the determining whether all groups have been compiled comprises:
    comparing the compiling identifier with the number of groups; if they are consistent, determining that all groups have been compiled.
  8. An online neural network compiling device, comprising:
    the acquisition module is used for acquiring intermediate representation information corresponding to the neural network, wherein the intermediate representation information is determined in advance according to a result of performing layer grouping on the neural network;
    a first determining module, configured to determine grouping information of the neural network according to the intermediate representation information;
    and the compiling module is used for compiling the groups of the neural network online according to the grouping information.
  9. The apparatus according to claim 8, wherein the grouping information includes input data tensor information, output data tensor information, and preset tensor information of the data processed by the group;
    the compiling module comprises:
    a first determining unit configured to determine, according to the grouping information, input data tensor information and preset tensor information of the group;
    the dividing unit is used for segmenting the group's input data according to the preset tensor information and the input data tensor information to obtain segmented data;
    and the compiling unit is used for compiling the group online according to the segmented data.
  10. The apparatus of claim 9, wherein the input data tensor information comprises an input data identification; the preset tensor information comprises preset tensor dimensionality;
    the segmentation unit is specifically configured to:
    determining input data of the group according to the input data identification;
    and segmenting the input data according to the preset tensor dimension to obtain segmented data, wherein the dimension of the segmented data is smaller than or equal to the preset tensor dimension.
  11. The apparatus of claim 9, wherein the grouping information further comprises: global tensor information, time step information;
    the compiling unit is specifically configured to:
    reading one piece of the segmented data according to a preset reading rule;
    determining a step size and a processing intercept point for processing the segmented data according to the group's global tensor information;
    determining partition sub-data according to the step size and the processing intercept point, and generating instructions according to the time step information in the grouping information and the partition sub-data;
    and judging whether the segmented data has been fully processed according to the processing intercept point; if not, continuing to execute the step of determining the step size and the processing intercept point according to the group's global tensor information.
  12. The apparatus of claim 11, wherein if the compiling unit determines that the segmented data has been fully processed, the compiling module further includes a second determining unit configured to determine whether all the segmented data have been processed; if not, the compiling unit continues to perform the step of reading one piece of the segmented data according to the preset reading rule.
  13. The apparatus of claim 8, wherein the intermediate representation information comprises a number of groups of the neural network;
    the device further comprises:
    and the second determining module is used for determining the compiling quantity after compiling is finished, determining whether all the groups are compiled according to the compiling quantity and the grouping quantity, and if not, continuing to execute the step of compiling the groups of the neural network on line according to the grouping information by the compiling module.
  14. The apparatus of claim 13, wherein the second determining module is specifically configured to:
    initialize a compiling identifier to 0; and
    after the compiling module compiles a group of the neural network online, add 1 to the compiling identifier to obtain a new compiling identifier;
    wherein determining, by the second determining module, whether all the groups have been compiled according to the compiling quantity and the grouping quantity comprises:
    comparing the compiling identifier with the grouping quantity, and if they are equal, determining that all the groups have been compiled.
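Claims 13 and 14 reduce to a counter that starts at 0, is incremented after each online compile, and is compared with the grouping quantity. A stateful sketch follows; the class and method names are hypothetical:

```python
class SecondDeterminingModule:
    def __init__(self, grouping_quantity: int):
        self.grouping_quantity = grouping_quantity
        self.compiling_identifier = 0          # claim 14: initialized to 0

    def on_group_compiled(self) -> None:
        self.compiling_identifier += 1         # add 1 after each online compile

    def all_compiled(self) -> bool:
        # Equal identifier and grouping quantity: every group is compiled.
        return self.compiling_identifier == self.grouping_quantity

# Drive the compile loop until every group has been compiled.
tracker = SecondDeterminingModule(grouping_quantity=3)
while not tracker.all_compiled():
    tracker.on_group_compiled()                # stands in for one online compile
```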
  15. A computer comprising the apparatus of any one of claims 8-14.
  16. An electronic device, comprising:
    at least one processor; and
    a memory communicatively coupled to the at least one processor; wherein,
    the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 1-7.
  17. A computer-readable storage medium having stored thereon computer-executable instructions which, when executed, perform the method of any one of claims 1-7.
  18. A computer program product, characterized in that the computer program product comprises a computer program stored on a computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to carry out the method of any one of claims 1-7.
CN201880098337.7A 2018-11-08 2018-11-08 Neural network compiling method, device, equipment, storage medium and program product Active CN112912837B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/114543 WO2020093304A1 (en) 2018-11-08 2018-11-08 Method, apparatus, and device for compiling neural network, storage medium, and program product

Publications (2)

Publication Number Publication Date
CN112912837A true CN112912837A (en) 2021-06-04
CN112912837B CN112912837B (en) 2024-02-13

Family

ID=70611242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880098337.7A Active CN112912837B (en) 2018-11-08 2018-11-08 Neural network compiling method, device, equipment, storage medium and program product

Country Status (2)

Country Link
CN (1) CN112912837B (en)
WO (1) WO2020093304A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114428616A (en) * 2022-04-01 2022-05-03 北京清微智能信息技术有限公司 Method for optimizing replacement cost in neural network compiling stage

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114385867A (en) * 2020-10-16 2022-04-22 中科寒武纪科技股份有限公司 Apparatus, method and computer program product for processing multidimensional data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106355244A (en) * 2016-08-30 2017-01-25 深圳市诺比邻科技有限公司 CNN (convolutional neural network) construction method and system
CN106547605A (en) * 2016-09-29 2017-03-29 乐视控股(北京)有限公司 Compiled code sending method and compiled code dispensing device
CN107103113A (en) * 2017-03-23 2017-08-29 中国科学院计算技术研究所 Towards the Automation Design method, device and the optimization method of neural network processor
CN107918794A (en) * 2017-11-15 2018-04-17 中国科学院计算技术研究所 Neural network processor based on computing array
CN108363559A (en) * 2018-02-13 2018-08-03 北京旷视科技有限公司 Multiplication processing method, equipment and the computer-readable medium of neural network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106127297B (en) * 2016-06-02 2019-07-12 中国科学院自动化研究所 The acceleration of depth convolutional neural networks based on tensor resolution and compression method
CN106650922B (en) * 2016-09-29 2019-05-03 清华大学 Hardware neural network conversion method, computing device, software and hardware cooperative system
CN107239315B (en) * 2017-04-11 2019-11-15 赛灵思公司 Programming model towards neural network heterogeneous computing platforms

Also Published As

Publication number Publication date
CN112912837B (en) 2024-02-13
WO2020093304A1 (en) 2020-05-14

Similar Documents

Publication Publication Date Title
CN110298035B (en) Word vector definition method, device, equipment and storage medium based on artificial intelligence
US11677686B2 (en) Packet forwarding method, apparatus, device, and system
JP6352958B2 (en) Graph index search device and operation method of graph index search device
CN103617226B (en) A kind of matching regular expressions method and device
KR102207408B1 (en) Method, apparatus and computer readable medium for image processing
US12026607B1 (en) Memory operation for systolic array
CN112912837A (en) Neural network compiling method, device, equipment, storage medium and program product
CN114968612B (en) Data processing method, system and related equipment
KR20210014561A (en) Method and apparatus for extracting image data in parallel from multiple convolution windows, device, and computer-readable storage medium
CN108875914B (en) Method and device for preprocessing and post-processing neural network data
CN115344805A (en) Material auditing method, computing equipment and storage medium
KR102305575B1 (en) Method and system for highlighting similar areas using similarity between images
CN108897858B (en) Distributed cluster index fragmentation evaluation method and device and electronic equipment
CN113361567B (en) Image processing method, device, electronic equipment and storage medium
CN110502975B (en) Batch processing system for pedestrian re-identification
CN109213972B (en) Method, device, equipment and computer storage medium for determining document similarity
CN113377998A (en) Data loading method and device, electronic equipment and storage medium
CN117196015A (en) Operator execution method, device, electronic equipment and storage medium
CN117313166A (en) Data filling method, device, computer equipment and storage medium
CN110019295B (en) Database retrieval method, device, system and storage medium
CN112955906B (en) Neural network layer grouping method, device, equipment, storage medium and program product
CN114387588A (en) Character recognition method and device, electronic equipment and storage medium
CN109934037B (en) Two-dimensional code image finding method, positioning method, server and storage medium
CN113840169A (en) Video processing method and device, computing equipment and storage medium
CN113537392A (en) Similar image identification method and device, computing equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant