CN111656370A - Detection method and verification platform of accelerator - Google Patents

Detection method and verification platform of accelerator Download PDF

Info

Publication number
CN111656370A
CN111656370A (application number CN201980009150.XA)
Authority
CN
China
Prior art keywords
node
neural network
nodes
determining
current node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201980009150.XA
Other languages
Chinese (zh)
Inventor
王耀杰
林蔓虹
陈琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SZ DJI Technology Co Ltd
Original Assignee
SZ DJI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SZ DJI Technology Co Ltd filed Critical SZ DJI Technology Co Ltd
Publication of CN111656370A publication Critical patent/CN111656370A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS › G06 COMPUTING; CALCULATING OR COUNTING › G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00 Computing arrangements based on biological models › G06N3/02 Neural networks
        • G06N3/04 Architecture, e.g. interconnection topology
        • G06N3/045 Combinations of networks
        • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons › G06N3/063 using electronic means
        • G06N3/08 Learning methods
    • G PHYSICS › G06 COMPUTING; CALCULATING OR COUNTING › G06F ELECTRIC DIGITAL DATA PROCESSING › G06F18/00 Pattern recognition › G06F18/20 Analysing › G06F18/24 Classification techniques › G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

A detection method and a verification platform for an accelerator are provided. The detection method comprises the following steps: generating at least one target neural network (110); translating the at least one target neural network into neural network instructions (120); inputting the neural network instructions into an accelerator and into a software model matched with the accelerator for execution, and determining the differences between the output results (130); and determining, according to the differences between the output results of the neural network instructions, the instructions that behave abnormally while the accelerator runs (140). With the generated at least one target neural network, the performance of the accelerator can be detected effectively.

Description

Detection method and verification platform of accelerator
Copyright declaration
The disclosure of this patent document contains material which is subject to copyright protection. The copyright is owned by the copyright owner. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the official patent files or records of the patent and trademark office, but otherwise reserves all copyright rights.
Technical Field
The present application relates to the field of neural network technology, and more particularly, to a detection method and a verification platform for an accelerator.
Background
After the neural network is generated, the neural network generally needs to be loaded on an accelerator to be operated in order to perform data processing by using the neural network. The performance of the accelerator may directly affect the subsequent data processing effect by using the neural network, and therefore, how to better detect the performance of the accelerator is a problem to be solved.
Disclosure of Invention
The present application provides an accelerator detection method, a neural network generation method, a data processing method, and related apparatus, in order to detect accelerators more effectively.
In a first aspect, a method for detecting an accelerator is provided, the method including: generating at least one target neural network; translating the at least one target neural network into neural network instructions; inputting the neural network instructions into an accelerator and into a software model matched with the accelerator for execution, and determining the differences between their output results; and determining, according to those differences, the instructions that behave abnormally while the accelerator runs.
With the generated at least one target neural network, the performance of the accelerator can be detected effectively.
Optionally, generating at least one target neural network comprises: a plurality of target neural networks are generated.
When a plurality of target neural networks are generated, different neural networks can be used to test the accelerator, enabling more thorough performance detection.
Optionally, the target neural network is a convolutional neural network.
It should be understood that the target neural network generated in the present application may be other types of neural networks besides the convolutional neural network, for example, a feedforward neural network, a recurrent neural network, and the like.
In certain implementations of the first aspect, generating at least one target neural network includes: determining the number of generations of a target neural network, and the node types and node counts of the nodes in each generation, where the target neural network is any one of the at least one target neural network; determining, according to a preset node connection requirement, a target connection mode connecting all nodes in the target neural network; and generating the target neural network according to the target connection mode.
By determining the number of generations, node counts and node types of the target neural network to be generated, and then producing the target connection mode from the preset node connection requirements, the target neural network can be generated flexibly and conveniently in many varieties. In turn, generating various types of neural networks allows the performance of the accelerator to be detected more thoroughly.
In some implementations of the first aspect, determining a target connection mode for connecting all nodes in the target neural network according to a preset node connection requirement includes: determining a candidate father node of a current node according to the node connection requirement, wherein the current node and the candidate father node meet the node connection requirement; selecting an actual father node of the current node from the candidate father nodes; and determining the connection relation between the current node and the actual father node of the current node so as to finally generate a target connection mode.
The candidate parent node may also be referred to as a candidate node of the parent node.
In certain implementations of the first aspect, determining a candidate parent node of the current node based on the node connection requirement includes determining the candidate parent node according to at least one of the following connection relations: when the node type of the current node is Concat or Eltwise, the current node has multiple parent nodes, and the number of its parent nodes is less than or equal to the number of its candidate parent nodes; when the node type of the parent node of the current node is Active, the node type of the current node is a type other than Active; when the node type of the parent node of the current node is Global Pooling, the node type of the current node is Global Pooling; when the node type of the parent node of the current node is FC, the node type of the current node is FC or Concat; when the node type of the parent node of the current node is Conv, Eltwise, Pooling, or Concat, the node type of the current node may be any one of Conv, Eltwise, Pooling, Active, Global Pooling, Concat, and FC.
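As a rough illustration, the parent/child type constraints listed above can be encoded as a lookup table. This is a hypothetical sketch: the table layout and function name are assumptions, and the type names follow the document's wording.

```python
# Hypothetical encoding of the connection relations described in the text.
# Keys are parent-node types; values are the node types a child may take.
ANY_TYPE = {"Conv", "Eltwise", "Pooling", "Active", "Global Pooling", "Concat", "FC"}

ALLOWED_CHILD_TYPES = {
    "Active": ANY_TYPE - {"Active"},       # child must be a type other than Active
    "Global Pooling": {"Global Pooling"},  # as stated in the text
    "FC": {"FC", "Concat"},
    "Conv": ANY_TYPE,
    "Eltwise": ANY_TYPE,
    "Pooling": ANY_TYPE,
    "Concat": ANY_TYPE,
}

def may_connect(parent_type: str, child_type: str) -> bool:
    """Return True if a node of child_type may take a parent of parent_type."""
    return child_type in ALLOWED_CHILD_TYPES.get(parent_type, set())
```

A generator could call `may_connect` while collecting the candidate parent nodes of each current node, keeping only parents whose type admits the current node's type.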
In some implementations of the first aspect, selecting the actual parent node of the current node from the candidate parent nodes includes: determining the probability of each node in the candidate father nodes as the actual father node of the current node according to the probability density function; and determining the actual father node of the current node from the candidate father nodes according to the probability that each node in the candidate father nodes is taken as the actual father node of the current node.
In some implementations of the first aspect, determining the actual parent node of the current node from the candidate parent nodes according to a probability that each of the candidate parent nodes is the actual parent node of the current node includes: and determining the node, which is used as the actual father node of the current node and has the probability higher than the preset probability value, in the candidate father nodes as the actual father node of the current node.
In certain implementations of the first aspect, the method further comprises: and adjusting the probability of each node in the candidate parent nodes as the actual parent node of the current node according to the expectation and the variance of the probability density function.
By adjusting the expectation and variance of the probability density function, the width and depth of the target neural network can be adjusted, so that a target neural network whose depth and width meet the requirements can be generated.
In particular, the expectation and variance of the probability density function may be set according to the depth and width required of the target neural network to be generated.
In general, the larger the variance of the probability density function, the greater the probability that a node in a neighboring generation is selected, and the narrower and deeper the resulting network becomes.
In certain implementations of the first aspect, the probability density function is a gaussian function.
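The generation-distance weighting described above can be sketched as follows. This is a minimal illustration, assuming the Gaussian is centred on the immediately preceding generation; the exact centring and parameterisation are not specified by the document, and the function names are assumptions.

```python
import math
import random

def parent_generation_weights(current_gen: int, sigma: float) -> list:
    """Gaussian weights over earlier generations g < current_gen.

    The weight peaks at the generation just before the current one and
    decays with distance, so nearer generations are more likely parents.
    """
    weights = []
    for g in range(current_gen):
        d = (current_gen - 1) - g  # distance from the nearest earlier generation
        weights.append(math.exp(-(d ** 2) / (2 * sigma ** 2)))
    total = sum(weights)
    return [w / total for w in weights]

def pick_parent_generation(current_gen: int, sigma: float, rng: random.Random) -> int:
    """Sample the generation from which an actual parent node is drawn."""
    probs = parent_generation_weights(current_gen, sigma)
    return rng.choices(range(current_gen), weights=probs, k=1)[0]
```

Cross-generation connections fall out naturally: any generation with nonzero weight can be sampled, not only the immediately preceding one.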
In certain implementations of the first aspect, the generating the target neural network according to the target connection mode includes: determining an effective target connection relation from the target connection relation according to a preset node effective connection relation; and generating a target neural network according to the effective target connection relation.
In certain implementations of the first aspect, the node effective connection relationship comprises at least one of the following: when the node type of the current node is Eltwise, the channel counts of the current node's multiple inputs must be consistent; when the node type of the current node is FC or Global Pooling, only nodes of types other than FC, Global Pooling and Active can be connected after the current node.
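The Eltwise channel-consistency rule above can be checked with a small helper. This is an illustrative sketch; the function and parameter names are assumptions.

```python
def eltwise_inputs_valid(node_type: str, input_channels: list) -> bool:
    """For an Eltwise node, every input must carry the same channel count.

    Nodes of other types are not constrained by this rule.
    """
    if node_type != "Eltwise":
        return True
    return len(set(input_channels)) <= 1
```

A generator would apply such checks when filtering the target connection relations down to the effective ones, discarding candidate connections that violate them.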
In some implementations of the first aspect, determining the number of generations of the target neural network to be generated, and the node types and node counts of each generation, includes: determining the number of generations, node types and node counts according to the computation requirements on the target neural network.
The computation requirement on the target neural network can be a requirement on the amount of computation: when the required amount of computation is small, the target neural network can be given fewer generations, each with fewer nodes; when the required amount of computation is large, the network can be given more generations, each with more nodes.
The computation requirement can also be the complexity of the computation: when the complexity is low, the target neural network can be given fewer generations, each with fewer nodes; when the complexity is high, more generations can be set, each with more nodes.
In a second aspect, a method for generating a neural network is provided, the method including: determining the number of generations of a target neural network to be generated, and the node types and node counts of each generation; determining, according to a preset node connection requirement, a target connection mode connecting all nodes in the target neural network; and generating the target neural network according to the target connection mode.
By determining the number of generations, node counts and node types of the target neural network to be generated, and then producing the target connection mode from the preset node connection requirements, the target neural network can be generated, so that various types of neural networks can be generated more flexibly and conveniently.
In a third aspect, a data processing method is provided, which includes: determining the number of generations of a target neural network to be generated, and the node types and node counts of each generation; determining, according to a preset node connection requirement, a target connection mode connecting all nodes in the target neural network; generating the target neural network according to the target connection mode; and processing data with the target neural network.
By determining the number of generations, node counts and node types of the target neural network to be generated, and then producing the target connection mode from the preset node connection requirements, various types of neural networks can be generated more flexibly and conveniently, and a specific neural network can then be used to process the corresponding data in a more targeted manner.
It is to be understood that the specific manner of generating the target neural network and the definition and interpretation of the relevant information in the second and third aspects of the present application can be found in the relevant context of the first aspect described above.
In a fourth aspect, a validation platform for an accelerator is provided, the validation platform comprising: a memory for storing code; at least one processor configured to execute code stored in memory to perform the following operations: generating at least one target neural network; translating the at least one target neural network into neural network instructions; respectively inputting the neural network instruction into an accelerator and a software model matched with the accelerator for execution, and determining the difference of output results of the neural network instruction; and determining an abnormal instruction in the running process of the accelerator according to the difference of the output results of the neural network instructions.
In a fifth aspect, an apparatus for generating a neural network is provided, including: a memory for storing code; and at least one processor configured to execute the code stored in the memory to: determine the number of generations of a target neural network to be generated, and the node types and node counts of each generation; determine, according to a preset node connection requirement, a target connection mode connecting the nodes in the target neural network; and generate the target neural network according to the target connection mode.
In a sixth aspect, a data processing apparatus is provided, including: a memory for storing code; and at least one processor configured to execute the code stored in the memory to: determine the number of generations of a target neural network to be generated, and the node types and node counts of each generation; determine, according to a preset node connection requirement, a target connection mode connecting all nodes in the target neural network; generate the target neural network according to the target connection mode; and process data with the target neural network.
In a seventh aspect, a computer-readable storage medium is provided, on which instructions for performing any one of the methods of the first, second and third aspects are stored.
In an eighth aspect, there is provided a computer program product comprising instructions for performing the method of any one of the first, second and third aspects.
Drawings
FIG. 1 is a schematic diagram of a neural network architecture;
FIG. 2 is a schematic flow chart diagram of a method of detecting an accelerator according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a neural network generation process according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the determined number of generations, per-generation node counts, and node types of a target neural network;
FIG. 5 is a schematic diagram of one possible node connection relationship of a neural network;
FIG. 6 is a schematic diagram of one possible node connection relationship for a neural network;
FIG. 7 is a schematic diagram of one possible node connection relationship for a neural network;
FIG. 8 is a schematic diagram of one possible node connection relationship for a neural network;
FIG. 9 is a schematic diagram of a neural network generation process according to an embodiment of the present application;
FIG. 10 is a schematic block diagram of a validation platform of an accelerator of an embodiment of the present application;
FIG. 11 is a schematic block diagram of an apparatus for generating a neural network of an embodiment of the present application;
fig. 12 is a schematic block diagram of a data processing apparatus according to an embodiment of the present application.
Detailed Description
The embodiments of the present application will be described in detail below with reference to the accompanying drawings.
For better understanding of the embodiments of the present application, the structure of the neural network and related information of the neural network in the embodiments of the present application are described below with reference to fig. 1.
FIG. 1 is a schematic diagram of a neural network architecture.
It should be understood that the neural network in fig. 1 may be a convolutional neural network, or may be another type of neural network, and the present application is not limited thereto.
In fig. 1, the structure of the neural network mainly includes three parts: node, generation, and tree.
In fig. 1, the neural network includes nodes 1 to 9, which together form nodes of 0 th generation to 4 th generation, and each generation includes the following nodes:
generation 0: a node 1;
generation 1: node 2, node 3, node 4;
generation 2: node 5, node 6;
generation 3: node 7, node 8;
generation 4: node 9.
As shown in fig. 1, a node of a previous generation may serve as a parent node of a next generation, and a node of the next generation may serve as a child node of the previous generation. For example, generation 1 to generation 4 nodes may be child nodes of generation 0 nodes, and generation 1 nodes may be parent nodes of generation 2 to generation 4 nodes.
As shown in fig. 1, the nodes in the 0 th generation to the 4 th generation collectively constitute a tree of the neural network.
The information about the nodes, generations and trees is described in detail below.
Each node is used to describe a computation layer (e.g., a convolutional layer), and the information contained in each node and the meaning of the corresponding information are as follows:
node_header: the header information of the node;
the header information comprises sequence, gen_id and node_id, where sequence is the overall serial number of the node, gen_id is the generation index number (the index of the generation the node belongs to), and node_id is the node's index within that generation;
parent_num: the number of parent nodes; parent_num equals 2 for nodes of type Concat/Eltwise and 1 for nodes of other types;
parent[]: the parent nodes of the node; their number equals parent_num;
node_type: the node type; for example, the node type here can include Input/Eltwise/Concat/Conv/Pool/Relu/Prelu/InnerProduct/Global Pooling, etc.;
node_name: the node name;
top[]: the node names of the node's top nodes, where a top node is a child node of this node;
bottom[]: the node names of the node's bottom nodes, where a bottom node is a parent node of this node; the number of bottom nodes equals parent_num;
if_n/c/h/w[]: the batch number, channel count, width and height of each input of the node, where the number of inputs equals parent_num;
of_n/c/h/w: the batch number, channel count, width and height of the node's output.
A generation is used to organize at least one node. If a generation contains multiple nodes, nodes in the same generation cannot be connected to each other, and a node in the current generation can only connect to nodes in generations whose gen_id is smaller than the current generation's (that is, cross-generation connection is supported). The information contained in a generation, and its meaning, is as follows:
gen_id: the generation index number;
node_num: the number of nodes contained in the generation; node_num is less than or equal to the maximum width of the neural network;
nodes: instances of the nodes contained in the generation;
node_tq[]: the type of each node contained in the generation.
A tree (tree) is used to organize generations and describe the connection relationships of all nodes in a network. The information contained in the tree and the corresponding information have the following meanings:
gen_num: the number of generations contained in the tree; gen_num is less than or equal to the maximum depth of the network;
gens[]: instances of the generations contained in the tree; the number of entries in gens[] equals gen_num.
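The node/generation/tree bookkeeping described above might be organised roughly as follows. The class and field names mirror the document's terms, but this structure is an assumption for illustration, not the patent's implementation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    sequence: int            # overall serial number of the node
    gen_id: int              # index of the generation the node belongs to
    node_id: int             # index of the node within its generation
    node_type: str           # e.g. "Conv", "Concat", "Eltwise", ...
    node_name: str = ""
    bottom: List[str] = field(default_factory=list)  # parent node names
    top: List[str] = field(default_factory=list)     # child node names

    @property
    def parent_num(self) -> int:
        # Concat/Eltwise nodes take 2 parents; other types take 1
        return 2 if self.node_type in ("Concat", "Eltwise") else 1

@dataclass
class Generation:
    gen_id: int
    nodes: List[Node] = field(default_factory=list)

    @property
    def node_num(self) -> int:
        return len(self.nodes)

@dataclass
class Tree:
    gens: List[Generation] = field(default_factory=list)

    @property
    def gen_num(self) -> int:
        return len(self.gens)
```

With this layout, the width and depth limits read directly as `generation.node_num <= max_width` and `tree.gen_num <= max_depth`.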
It should be understood that the neural network structure described above with reference to fig. 1 is only one possible structure of the neural network in the embodiment of the present application, and the neural network in the embodiment of the present application may also be other structures.
One possible structure of the neural network in the embodiment of the present application is briefly described above with reference to fig. 1, and the detection method of the accelerator in the embodiment of the present application is described in detail below with reference to fig. 2.
Fig. 2 is a schematic flow chart of a detection method of an accelerator according to an embodiment of the present application. The method shown in fig. 2 may be performed by an electronic device or a server, where the electronic device may be a mobile terminal (e.g., a smart phone), a computer, a personal digital assistant, a wearable device, an in-vehicle device, an internet of things device, or other processor-containing device. The method shown in fig. 2 includes steps 110 to 140, which are described in detail below.
110. Generate at least one target neural network.
Optionally, the at least one target neural network is a plurality of target neural networks.
The target neural network may be a convolutional neural network, or another type of neural network, for example a feed-forward neural network, a recurrent neural network, or the like.
120. Translate the at least one target neural network into neural network instructions.
It should be understood that step 120 is performed in order to load the at least one target neural network into the accelerator or the software model, and before loading the at least one target neural network into the accelerator or the software model, the at least one target neural network generally needs to be translated into an instruction which can be executed by the accelerator or the software model.
130. Input the neural network instructions into the accelerator and into the software model matched with the accelerator for execution, and determine the differences between their output results.
It should be appreciated that the software model matched to the accelerator is a model that simulates the accelerator's operational behavior and serves as a reference against which the accelerator's performance is compared.
Assuming that inputting the neural network instructions into the accelerator yields a first output result and inputting them into the software model yields a second output result, the difference between the output results of the neural network instructions can be obtained by comparing the first output result with the second output result.
140. Determine, according to the differences between the output results of the neural network instructions, the instructions that behave abnormally during accelerator operation.
In step 140, when the output results differ, the accelerator instruction corresponding to that output can be regarded as an instruction that behaves abnormally during accelerator operation. Identifying such instructions helps locate problems in the accelerator, so that its design can be improved or modified, thereby improving its performance.
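Steps 110 to 140 can be sketched as a simple comparison loop. The `run_on_accelerator` and `run_on_model` callables below are hypothetical stand-ins for the real accelerator and software-model interfaces, which the document does not specify.

```python
def detect_abnormal_instructions(instructions, run_on_accelerator, run_on_model):
    """Return the instructions whose accelerator output diverges from the model's.

    Hypothetical sketch of steps 130-140: execute the same instruction stream
    on the accelerator and on its matching software model, then flag every
    instruction whose two outputs differ.
    """
    abnormal = []
    for instr in instructions:
        first_out = run_on_accelerator(instr)  # step 130: run on the accelerator
        second_out = run_on_model(instr)       # step 130: run on the software model
        if first_out != second_out:            # step 140: a difference marks an anomaly
            abnormal.append(instr)
    return abnormal
```

The returned list localizes the faulty instructions, which is the input to the debugging and design-improvement work described above.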
With the generated at least one target neural network, the performance of the accelerator can be detected effectively.
Furthermore, when a plurality of target neural networks are generated, different neural networks can be used to test the accelerator, enabling more thorough performance detection.
There are various implementations of generating the at least one target neural network in step 110, and the process of generating the at least one target neural network in step 110 is described in detail below with reference to fig. 3.
Fig. 3 is a schematic diagram of a generation process of a neural network according to an embodiment of the present application.
The process shown in fig. 3 includes steps 210 through 230, which are described in detail below.
210. Determine the number of generations of the target neural network, and the node types and node counts of each generation.
The target neural network determined in step 210 may be any one of the at least one target neural network in step 110.
Specifically, in step 210, the number of generations of the target neural network may be determined randomly, and then the node type and node count of each generation may be determined randomly.
For example, as shown in fig. 1, the number of generations of the target neural network may be randomly determined to be 5 (the neural network in fig. 1 has 5 generations).
Additionally, in step 210, the number of generations of the target neural network can be determined within a range of values (e.g., a depth range of the neural network). For example, the number of generations can be randomly determined to be 12 within the range [10, 20].
After the number of generations of the target neural network is determined, the node types of each generation can be chosen from all available node types, and the node count of each generation can be determined within a certain numerical range (for example, the width range of the neural network).
It should be understood that, in step 210, the number of generations of the target neural network, and the node types and node counts of each generation, can also be set according to specific computation requirements.
For example, if the neural network is used to perform some simple operations, a smaller number of generations and a smaller number of nodes may be set for the target neural network, whereas if the neural network is used to perform some very complex operations, a larger number of generations and a larger number of nodes may be set for the target neural network.
Alternatively, the node type of each generation can be determined from the available node types Input/Eltwise/Concat/Conv/Pool/Relu/Prelu/InnerProduct/Global Pooling.
For example, taking the neural network shown in fig. 1 as an example, after the number of generations of the target neural network is randomly determined to be 5, the node counts of generations 0 to 4 may be randomly determined to be 1, 3, 2, 2, and 1, respectively.
When determining the node type and the node number of each generation, the node type of each generation may be determined first, the node number of each generation may be determined first, and the node type and the node number of each generation may be determined simultaneously (the present application does not limit the order of determining the node type of each generation of nodes and the node number of each generation of nodes).
It should be understood that, when determining the node types and the number of nodes for each generation, the number of nodes in each generation may be greater than or equal to the number of node types in that generation (that is, the number of node types in each generation is less than or equal to the number of nodes in that generation).
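The random determination in step 210 can be sketched minimally as follows. This is an illustrative assumption, not the patent's actual implementation: the function name, the depth range [10, 20], and the width range [1, 10] are example values taken from the surrounding text.

```python
import random

# Available node types listed in the text; the 0th generation is
# forced to a single Input node per connection condition (1).
NODE_TYPES = ["Input", "Eltwise", "Concat", "Conv", "Pool",
              "Relu", "Prelu", "InnerProduct", "GlobalPooling"]

def random_structure(depth_range=(10, 20), width_range=(1, 10), seed=None):
    """Randomly pick a generation count, then a node count and node
    types for each generation (hypothetical sketch of step 210)."""
    rng = random.Random(seed)
    num_generations = rng.randint(*depth_range)   # e.g. 12 within [10, 20]
    structure = []
    for gen in range(num_generations):
        if gen == 0:
            structure.append(["Input"])           # first generation is Input
            continue
        width = rng.randint(*width_range)         # nodes in this generation
        types = [rng.choice(NODE_TYPES[1:]) for _ in range(width)]
        structure.append(types)
    return structure

spec = random_structure(seed=0)
print(len(spec), spec[0])
```

Since the number of nodes equals the number of drawn type labels, the constraint that each generation's node count is at least its number of distinct node types holds automatically.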
The number of generations of the target neural network determined in step 210, and the node types and node numbers of each generation, will be described below with reference to the drawings.
For example, as shown in fig. 4, the number of generations of the target neural network determined in step 210 is 4 (comprising the 0th generation to the 3rd generation), and the numbers of nodes included in the 0th generation to the 3rd generation are specifically as follows:
the number of the nodes of the 0 th generation of nodes is 1;
the number of the nodes of the 1 st generation of nodes is 3;
the number of the nodes of the 2 nd generation of nodes is 2;
the number of nodes of the 3 rd generation node is 1.
The node types included in the 0 th generation to the 3 rd generation are specifically as follows:
the node type of the 0 th generation node is Input;
the node types of the 1st generation nodes comprise FC, Eltwise and Global Pooling;
the node types of the 2 nd generation nodes are Concat and FC;
the node type of the 3 rd generation node is Eltwise.
220. Determining the connection relationship of each node in the target neural network according to a preset node connection requirement.
The node connection requirement may be a rule that enables the generated neural network to satisfy normal usage requirements. The node connection requirement may be set in advance; specifically, it may be set according to experience and the requirements of the neural network to be generated.
It should be understood that the connection relationships between the nodes in the target neural network determined in step 220 according to the node connection requirements may be various, and after the connection relationships are obtained, one connection relationship may be (arbitrarily) selected from the various connection relationships as the final connection relationship.
Optionally, the node connection requirement may include at least one of the following conditions:
(1) the node type of the first generation node is an Input (Input) type;
(2) when the node type of the current node is Concat or Eltwise, the number of parent nodes of the current node is less than or equal to the number of candidate parent nodes of the current node;
(3) the connection between the node type of the current node and the parent node of the current node satisfies the relationship shown in table 1.
TABLE 1
Table 1 shows node types of parent nodes that can be connected when the current node is a different node type, where Y indicates connectable and N indicates non-connectable.
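Table 1 itself appears only as an image in the publication, but the parent/child compatibility it expresses is restated in claim 4. The following lookup is a hypothetical encoding reconstructed from that claim text (every name and the table entries are assumptions for illustration), showing how such a compatibility table could be consulted:

```python
# All node types considered in the (reconstructed) compatibility table.
ALL_TYPES = {"Conv", "Eltwise", "Pooling", "Active", "GlobalPooling", "Concat", "FC"}

# parent type -> set of node types allowed as its child,
# following the rules restated in claim 4 of this publication.
ALLOWED_CHILDREN = {
    "Active": ALL_TYPES - {"Active"},      # Active parent: any non-Active child
    "GlobalPooling": {"GlobalPooling"},    # GlobalPooling parent: only GlobalPooling
    "FC": {"FC", "Concat"},                # FC parent: FC or Concat child
    "Conv": ALL_TYPES,                     # Conv/Eltwise/Pooling/Concat parents:
    "Eltwise": ALL_TYPES,                  # any child type is allowed
    "Pooling": ALL_TYPES,
    "Concat": ALL_TYPES,
}

def can_connect(parent_type, child_type):
    """Return True when child_type may be connected under parent_type."""
    return child_type in ALLOWED_CHILDREN.get(parent_type, set())

assert can_connect("Conv", "FC")
assert not can_connect("FC", "Conv")
```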
It should be understood that, in the above step 220, various node connection relationships may be obtained, and before the step 230 is executed, the validity of the various node connection relationships may be determined, and the step 230 may be executed after a valid node connection relationship is selected.
Specifically, when checking the validity of the candidate node connection relationships, it may be determined whether each connection relationship satisfies the following condition (4) and condition (5); the connection relationships satisfying both condition (4) and condition (5) are determined to be valid node connection relationships, and step 230 is executed according to a valid node connection relationship.
(4) The channel numbers of a plurality of inputs of the Eltwise type node are kept consistent;
(5) a node of the FC type or the Global Pooling type cannot be connected to a subsequent node of any type other than the FC, Global Pooling, and act types.
Specifically, the node type of a node immediately following an FC-type or Global Pooling-type node, and of any node in a later generation that follows it, can only be FC, Global Pooling, or act.
For example, in the neural network structure shown in fig. 4, the node type of node 6 is Eltwise and the channel numbers of both of its inputs are 1, so the input channel numbers at both ends of node 6 satisfy condition (4). By contrast, for node 11 in fig. 5, which is also of the Eltwise type, the channel number of its left input is 2 while that of its right input is 1; since the two are inconsistent, condition (4) is not satisfied.
Therefore, the connection relationship shown in fig. 5 does not satisfy condition (4), and when the node connection relationships determined in step 220 include the invalid connection relationship shown in fig. 5, that connection relationship needs to be excluded.
For another example, in the neural network shown in fig. 6, the node type of node 1 is FC and the node type of node 2 is Relu. Since the node type of node 1 is FC, only nodes of the FC, Global Pooling, and act types can be connected after node 1, so the connection between node 1 and node 2 does not satisfy condition (5). In addition, the node type of node 3 is Global Pooling and the node type of node 4 is Prelu; node 3 can likewise only be connected to nodes of the FC, Global Pooling, and act types, so the connection between node 3 and node 4 does not satisfy condition (5).
Therefore, the connection relationship shown in fig. 6 does not satisfy condition (5), and when the node connection relationships determined in step 220 include the invalid connection relationship shown in fig. 6, that connection relationship is excluded.
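Conditions (4) and (5) can be expressed as a simple validity check. The Node structure and the "act" label below are assumptions for illustration only, not the patent's data model:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    node_type: str
    parents: list = field(default_factory=list)
    out_channels: int = 1

def connections_valid(nodes):
    """Check conditions (4) and (5) over a list of connected nodes."""
    for node in nodes:
        # condition (4): all inputs of an Eltwise node must have the
        # same number of channels
        if node.node_type == "Eltwise":
            channels = {p.out_channels for p in node.parents}
            if len(channels) > 1:
                return False
        # condition (5): an FC or GlobalPooling node may only be
        # followed by FC, GlobalPooling, or act nodes
        for parent in node.parents:
            if parent.node_type in ("FC", "GlobalPooling"):
                if node.node_type not in ("FC", "GlobalPooling", "act"):
                    return False
    return True

a = Node("Conv", out_channels=2)
b = Node("Conv", out_channels=1)
bad = Node("Eltwise", parents=[a, b])   # mismatched input channels, as in fig. 5
assert connections_valid([a, b, bad]) is False
```

A connection relationship failing this check would be excluded before step 230, as the text describes for fig. 5 and fig. 6.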
In addition, in step 220, when determining the parent node of a node, there may be a plurality of candidate nodes. In this case, as long as conditions (1) to (5) above are satisfied, all of the candidate nodes may serve as candidate parent nodes of the current node (also referred to as candidate nodes of the parent node); which node among the candidate parent nodes is selected as the actual parent node of the current node may then be determined according to a probability density function.
Alternatively, the probability density function may be a Gaussian function, since a Gaussian function as a whole conforms to the basic requirement that closer generations be selected with higher probability. Specifically, the expected value of the Gaussian function may be set to the generation index value minus 1; the expected value does not otherwise affect the control of the network morphology. By adjusting the variance of the Gaussian function, its shape can be controlled, and thereby the probability of the nodes in each generation being selected. In general, the larger the variance of the Gaussian function, the greater the probability that a node in a neighboring generation is selected, the deeper the network becomes, and the narrower its width.
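One way to realize this weighting, offered as an assumed sketch rather than the patent's method, is to evaluate a Gaussian centred on the previous generation (mean = current generation index − 1) at each earlier generation and normalize; the variance then controls how much probability mass leaks to more distant generations:

```python
import math

def generation_weights(current_gen, sigma=1.0):
    """Normalized selection probability for each earlier generation,
    using a Gaussian with mean = current_gen - 1 (illustrative)."""
    mean = current_gen - 1
    weights = {}
    for gen in range(current_gen):   # parents come from earlier generations only
        w = math.exp(-((gen - mean) ** 2) / (2 * sigma ** 2))
        weights[gen] = w
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

probs = generation_weights(4, sigma=1.0)
# the previous generation (index 3) receives the highest probability
assert max(probs, key=probs.get) == 3
```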
230. Generating the target neural network according to the target connection relationship.
In step 230, after the connection relationship of each node is determined, the target neural network may be constructed according to the connection relationships, or a prototxt file (containing the connection relationship of each node in the target neural network) may be output, so that the prototxt file can subsequently be input to a configuration tool and translated into neural network instructions for execution by an accelerator.
It should be understood that when the target neural network is generated according to the connection relationships, the node internal parameters of each node also need to be determined, wherein the types and number of the node internal parameters of each node are related to its node type.
For example, for a node of the Conv type, of_h needs to satisfy equation (1).
of_h=(if_h[0]+2×pad_h–(dilation_h×(kernel_h-1)+1))/stride_h+1 (1)
For a node of Pool type, of _ h should satisfy equation (2).
of_h=(if_h[0]+2×pad_h–Pool_size)/stride_h+1 (2)
In equations (1) and (2) above, of_h represents the height of the feature map output by the node, if_h represents the height of the feature map input to the node, pad_h is the number of rows of elements padded onto the node's input feature map for the calculation (usually 0), dilation_h represents the number of elements interpolated within the node's input feature map (dilation_h is greater than 0; a value of 1 leaves the kernel unchanged), kernel_h represents the size of the convolution kernel in the convolution operation, stride_h represents the step size by which the convolution kernel or pooling window slides in the height direction, and Pool_size represents the size of the window in the pooling process.
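Equations (1) and (2) can be sketched as small helper functions. The parameter names follow the text; the defaults (pad_h = 0, dilation_h = 1, stride_h = 1) and the use of floor division are illustrative conventions, not specified by the publication:

```python
def conv_out_height(if_h, kernel_h, pad_h=0, dilation_h=1, stride_h=1):
    # equation (1): of_h = (if_h + 2*pad_h - (dilation_h*(kernel_h-1)+1)) / stride_h + 1
    return (if_h + 2 * pad_h - (dilation_h * (kernel_h - 1) + 1)) // stride_h + 1

def pool_out_height(if_h, pool_size, pad_h=0, stride_h=1):
    # equation (2): of_h = (if_h + 2*pad_h - pool_size) / stride_h + 1
    return (if_h + 2 * pad_h - pool_size) // stride_h + 1

assert conv_out_height(32, kernel_h=3, pad_h=1) == 32   # 3x3 "same" convolution
assert pool_out_height(32, pool_size=2, stride_h=2) == 16
```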
For a node of the Concat type, of_c (the number of output channels) is equal to the sum of the individual if_c values (the input channel numbers), and for a node of the Eltwise type, of_c should be consistent with each if_c.
In addition, when determining the intra-node parameters of each node, the following condition a needs to be satisfied.
Condition a: the feature map output by the parent node is equal in size to the feature map input by the child node.
Since the feature map output by the parent node is the feature map input by the child node, the size of the feature map output by the parent node must be consistent with the size of the feature map input by the child node.
The generation of the target neural network according to the determined connection relationship in step 230 is described below with reference to fig. 7 and 8.
For example, when the number of generations of the target neural network determined in step 210, and the node types and node numbers of each generation, are as shown in fig. 4, the node connection relationships obtained in step 220 are as shown in fig. 7 and fig. 8.
Next, the node connection relationships shown in fig. 7 and 8 are analyzed according to the above conditions (4) and (5). As a result of analysis, the node connection relationships shown in fig. 7 and 8 both satisfy the condition (4), but in fig. 7, the connection of the node 3 and the node 6 does not satisfy the condition (5). Since fig. 8 satisfies the condition (5) in addition to the condition (4), it can be determined that the node connection relationship shown in fig. 8 is a valid node connection relationship, and then, in step 230, the neural network can be constructed according to the node connection relationship shown in fig. 8.
In order to better understand the flow of the neural network generation method according to the embodiment of the present application, a detailed description is given below with reference to fig. 9 of a specific implementation flow of the neural network generation process according to the embodiment of the present application.
Fig. 9 is a schematic diagram of a generation process of a neural network according to an embodiment of the present application. The process shown in fig. 9 may be performed by an electronic device (the definition and explanation of the electronic device may refer to relevant contents in the method shown in fig. 2), and the process shown in fig. 9 includes steps 1001 to 1011, which are described in detail below.
1001. Start.
Step 1001 represents the start of the generation of the neural network.
1002. Randomly generate the number of generations of the neural network.
It should be appreciated that in step 1002, a value may be randomly selected within a certain range of values as the number of generations of the neural network.
1003. Randomly generate the number of nodes and the node types of each generation.
In step 1003, the number of nodes of each generation may be randomly generated within a certain network-width range. For example, if the width of the neural network cannot exceed 10, a value can be arbitrarily selected between 1 and 10 as the number of nodes of each generation.
When the node type of each node is randomly generated, it can be drawn from all available node types.
Step 1002 and step 1003 here correspond to step 210 above, and the relevant definitions and explanations for step 210 above apply equally to step 1002 and step 1003, and step 1002 and step 1003 will not be described in detail here in order to avoid repetition.
1004. Instantiate each node according to its node type.
Specifically, in step 1004, each node in each generation may be instantiated according to the node type of each generation and the node number of each generation, that is, a node instance in each generation is determined according to the node type of each generation and the node number of each generation, where one node may correspond to one instance or multiple instances.
It should be understood that the nodes herein are more biased toward a logical concept, whereas a node instance is an entity upon which the node actually depends, on which various data processing tasks of the node can be performed.
1005. Configure the header information and the parent node number of each node.
Configuring the header information (node_header) of each node means generating, for each node, its overall node sequence number (sequence), its generation index number (gen_id), and its node index number within the generation (node_id).
For example, the generation index numbers (gen _ id) of the respective generations may be generated in the order from top to bottom, the total sequence numbers (sequence) of the respective nodes may be generated in the order from 0 th generation to N th generation (N is the number of the last generation of the neural network), and the node index numbers (node _ id) of the respective nodes in the generation may be generated in each generation in a certain order.
Wherein sequence represents the serial number of the node in the whole neural network.
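The header assignment in step 1005 can be sketched as follows; the dictionary representation and function name are illustrative assumptions, not the patent's data structure:

```python
def assign_headers(structure):
    """structure: list of generations, each a list of node types.
    Assigns sequence (index in the whole network, counted from the 0th
    generation onward), gen_id (generation index, top to bottom), and
    node_id (index within the generation)."""
    headers = []
    sequence = 0
    for gen_id, generation in enumerate(structure):
        for node_id, node_type in enumerate(generation):
            headers.append({"sequence": sequence,
                            "gen_id": gen_id,
                            "node_id": node_id,
                            "type": node_type})
            sequence += 1
    return headers

# the fig. 4 structure: 1, 3, 2, 1 nodes in generations 0-3
h = assign_headers([["Input"],
                    ["FC", "Eltwise", "GlobalPooling"],
                    ["Concat", "FC"],
                    ["Eltwise"]])
assert h[4] == {"sequence": 4, "gen_id": 2, "node_id": 0, "type": "Concat"}
```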
1006. Calculate the candidate nodes for the parent node of each node.
Specifically, in step 1006, candidate nodes for the parent node of the current node are computed to facilitate subsequent selection of the parent node from the candidate nodes.
In step 1006, candidate parent nodes may be selected from the preceding generations for each node, proceeding layer by layer starting from the bottom-most layer.
It should be appreciated that in selecting a candidate parent node for a current node, the candidate parent node for the current node may not only be from a previous generation of the current node, but may also be from all generations prior to the current node.
When determining the candidate parent nodes of each node, they may be selected according to a certain node connection requirement (which may be one or more of conditions (1) to (3) above), and the nodes in the preceding generations that satisfy the node connection requirement are taken as candidate parent nodes of the current node.
For example, as shown in FIG. 4, node 5 and node 6 in generation 2 may be selected as candidate parent nodes for node 7 in generation 3.
In addition, in step 1006, after the candidate parent nodes of a node are determined, a probability density function may be used to calculate the probability of each candidate parent node being the parent node of the current node, and the nodes whose probability is greater than a certain value may be taken as the actual parent nodes of the current node.
It should be understood that the number of the candidate parent nodes may be plural, and the number of the parent nodes selected from the candidate parent nodes may be one or plural. In addition, the parent node selected from the candidate parent nodes is the actual parent node of the current node.
For example, suppose a node has 6 candidate parent nodes, and the probabilities of these 6 candidates being the parent node of the current node, as calculated by the probability density function, are 70%, 60%, 65%, 45%, 40%, and 30%, respectively. Then the candidate parent nodes corresponding to the 70%, 60%, and 65% probabilities may be determined as the actual parent nodes of the current node (one or more candidate parent nodes may be selected as actual parent nodes).
Alternatively, in the above example, only the candidate parent node with the highest probability may be taken as the actual parent node of the current node (i.e., the candidate parent node with the 70% probability).
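Both selection policies from this example (keep all candidates above a threshold, or keep only the most probable one) can be sketched in a few lines; the function name, the threshold value, and the probability values are taken from or assumed for this example only:

```python
def pick_parents(candidates, threshold=None):
    """candidates: {node_name: probability}. With a threshold, keep all
    candidates whose probability exceeds it; otherwise keep only the
    single most probable candidate."""
    if threshold is not None:
        return [n for n, p in candidates.items() if p > threshold]
    best = max(candidates, key=candidates.get)
    return [best]

# the six candidate-parent probabilities from the example in the text
probs = {"n1": 0.70, "n2": 0.60, "n3": 0.65, "n4": 0.45, "n5": 0.40, "n6": 0.30}
assert pick_parents(probs) == ["n1"]                         # single highest
assert sorted(pick_parents(probs, threshold=0.5)) == ["n1", "n2", "n3"]
```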
The probability density function may be a gaussian function.
1007. Randomly select an actual parent node of the current node for connection.
In the above step 1006, after selecting the actual parent node of the current node from the candidate parent nodes, if the number of the actual parent nodes is multiple, the parent node may be arbitrarily or randomly selected from the actual parent nodes for connection.
1008. It is determined whether the current connection is valid.
In step 1008, whether the currently existing connection is valid is determined. In specific execution, each connection relationship may be checked against conditions (4) and (5) above: a connection relationship satisfying both condition (4) and condition (5) is valid, and a connection relationship failing either condition is invalid.
When the connection is determined to be valid, step 1009 is performed, and when the connection is determined to be invalid, step 1006 is continuously performed.
1009. The nodes are connected.
In step 1009, the nodes may be connected according to the effective connection relationship determined in step 1008.
It is understood that after step 1009, step 1009a may also be performed.
1009a. Determine the node internal parameters of each node.
When determining the node internal parameters of each node, the node internal parameters of each node may be determined according to the above formula (1), formula (2) and the constraint of the condition a.
1010. Print a prototxt file.
The prototxt file contains the connection relation of each node in the neural network to be generated, and after the prototxt file is generated, the neural network can be conveniently constructed or generated according to the prototxt file.
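As a rough illustration of how such a file might be emitted: the `layer`/`name`/`type`/`bottom`/`top` field names follow Caffe's prototxt convention, while the node-list format and function name below are assumptions for this sketch:

```python
def to_prototxt(nodes):
    """nodes: list of (name, type, [parent names]) tuples.
    Emits a minimal Caffe-style prototxt text where each parent
    connection becomes a 'bottom' entry."""
    lines = []
    for name, node_type, parents in nodes:
        lines.append("layer {")
        lines.append('  name: "%s"' % name)
        lines.append('  type: "%s"' % node_type)
        for p in parents:
            lines.append('  bottom: "%s"' % p)   # connection to a parent node
        lines.append('  top: "%s"' % name)
        lines.append("}")
    return "\n".join(lines)

net = [("data", "Input", []),
       ("fc1", "InnerProduct", ["data"])]
text = to_prototxt(net)
assert 'bottom: "data"' in text and text.count("layer {") == 2
```

A configuration tool could then translate such a description into neural network instructions for the accelerator, as the text describes.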
1011. End.
Step 1011 represents the end of the neural network generation process.
The detection method of the accelerator according to the embodiment of the present application is described in detail above with reference to fig. 1 to 9.
In fact, the present application may also protect a method for generating a neural network, which specifically includes: determining the number of generations of a target neural network to be generated, and the node types and node numbers of the nodes of each generation of the target neural network; determining a target connection mode for connecting the nodes in the target neural network according to a preset node connection requirement; and generating the target neural network according to the target connection mode.
In the present application, the number of generations, node numbers, and node types of the target neural network to be generated are determined, and the target connection mode is then generated in combination with the preset node connection requirement, so that the target neural network can finally be generated; in this way, various types of neural networks can be generated more flexibly and conveniently.
The generated target neural network can be used for processing data, and therefore the present application may also protect a data processing method, which includes: determining the number of generations of a target neural network to be generated, and the node types and node numbers of the nodes of each generation of the target neural network; determining a target connection mode for connecting the nodes in the target neural network according to a preset node connection requirement; generating the target neural network according to the target connection mode; and processing data by using the target neural network.
In the present application, the number of generations, node numbers, and node types of the target neural network to be generated are determined, and the target connection mode is then generated in combination with the preset node connection requirement, so that the target neural network can finally be generated; in this way, various types of neural networks can be generated more flexibly and conveniently, and a specific neural network can further be used to process the corresponding data in a more targeted manner.
Optionally, the processing data by using the target neural network includes: acquiring input data; and carrying out data processing on the input data by adopting a target neural network to obtain output data.
The input data may be data that needs to be processed by a neural network, and further, the input data may be data that needs to be processed by the neural network in the field of artificial intelligence.
For example, the input data may be image data to be processed, and the output data may be a classification result or a recognition result of an image. For another example, the input data may be voice data to be recognized, and the output result may be a voice recognition result.
It should be understood that, the specific manner of generating the neural network and the definition and explanation of the related information in the above neural network generating method and data processing method can be referred to the relevant contents of the neural network generating process (for example, the relevant contents shown in fig. 2) above.
The verification platform of the accelerator according to the embodiment of the present application is described below with reference to fig. 10, it should be understood that the verification platform of the accelerator shown in fig. 10 is capable of performing the steps of the detection method of the accelerator according to the embodiment of the present application, and the repeated description is appropriately omitted when fig. 10 is introduced below.
FIG. 10 is a schematic block diagram of a validation platform of an accelerator of an embodiment of the application. The validation platform 2000 of the accelerator shown in fig. 10 includes:
a memory 2001 for storing codes;
at least one processor 2002 that executes code stored in the memory to:
generating at least one target neural network;
translating the at least one target neural network into neural network instructions;
inputting the neural network instruction into an accelerator and a software model matched with the accelerator respectively for execution, and determining the difference of the output results of the neural network instruction;
and determining an abnormal instruction in the running process of the accelerator according to the difference of the output results of the neural network instructions.
It is to be appreciated that only one processor 2002 is shown in FIG. 10 for ease of illustration, and in fact, the validation platform 2000 shown in FIG. 10 can contain one or more processors 2002.
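The loop carried out by the platform (generate networks, translate them to instructions, run the instructions on both the accelerator and its software model, and flag mismatches) can be sketched conceptually as follows. Both executors here are stand-in functions; a real platform would drive the accelerator hardware or its simulation, and the instruction representation is an assumption:

```python
def find_abnormal_instructions(instructions, run_on_accelerator, run_on_model):
    """Differential test: an instruction whose accelerator output differs
    from the software (golden) model output is flagged as abnormal."""
    abnormal = []
    for instr in instructions:
        if run_on_accelerator(instr) != run_on_model(instr):
            abnormal.append(instr)   # output mismatch -> suspect instruction
    return abnormal

# stand-ins: a golden model and an accelerator with one injected fault
golden = lambda instr: instr * 2
buggy_hw = lambda instr: instr * 2 if instr != 3 else 0

assert find_abnormal_instructions([1, 2, 3, 4], buggy_hw, golden) == [3]
```

When the two executors agree on every instruction, the list is empty, which is the expected result for a correctly functioning accelerator.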
Fig. 11 is a schematic block diagram of an apparatus for generating a neural network according to an embodiment of the present application. It should be understood that the apparatus 3000 shown in fig. 11 is capable of performing the steps of the method for generating a neural network according to the embodiment of the present application, and the apparatus 3000 shown in fig. 11 includes:
a memory 3001 for storing codes;
at least one processor 3002 configured to execute code stored in said memory to perform the following:
generating at least one target neural network;
translating the at least one target neural network into neural network instructions;
inputting the neural network instruction into an accelerator and a software model matched with the accelerator respectively for execution, and determining the difference of the output results of the neural network instruction;
and determining an abnormal instruction in the running process of the accelerator according to the difference of the output results of the neural network instructions.
It is to be appreciated that only one processor 3002 is shown in fig. 11 for ease of illustration, and in fact, the apparatus 3000 shown in fig. 11 may contain one or more processors 3002.
Fig. 12 is a schematic block diagram of a data processing apparatus according to an embodiment of the present application. It should be understood that the apparatus 4000 shown in fig. 12 is capable of performing the steps of the data processing method according to the embodiment of the present application, and the apparatus 4000 shown in fig. 12 includes:
a memory 4001 for storing codes;
at least one processor 4002 configured to execute the code stored in the memory to perform the following:
determining the number of generations of a target neural network to be generated, and node types and node numbers of nodes of all generations of the target neural network;
determining a target connection mode for connecting all nodes in the target neural network according to a preset node connection requirement;
generating the target neural network according to the target connection mode;
and processing data by adopting the target neural network.
It is to be appreciated that only one processor 4002 is shown in fig. 12 for ease of illustration, and in fact, the apparatus 4000 shown in fig. 12 may contain one or more processors 4002.
The verification platform 2000, the apparatus 3000, and the apparatus 4000 of the accelerator may be specifically an electronic device or a server, where the electronic device may be a mobile terminal (e.g., a smart phone), a computer, a personal digital assistant, a wearable device, an in-vehicle device, an internet of things device, and other devices including a processor.
In the above embodiments, all or part of the implementation may be realized by software, hardware, firmware or any other combination. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, cause the processes or functions described in accordance with the embodiments of the invention to occur, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a Digital Video Disk (DVD)), or a semiconductor medium (e.g., a Solid State Disk (SSD)), among others.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (44)

1. A method of detecting an accelerator, comprising:
generating at least one target neural network;
translating the at least one target neural network into neural network instructions;
inputting the neural network instruction into an accelerator and a software model matched with the accelerator respectively for execution, and determining the difference of the output results of the neural network instruction;
and determining an abnormal instruction in the running process of the accelerator according to the difference of the output results of the neural network instructions.
2. The method of claim 1, wherein the generating at least one target neural network comprises:
determining the number of generations of the target neural network, and node types and node numbers of nodes of all generations of the target neural network, wherein the target neural network is any one of the at least one target neural network;
determining a target connection mode for connecting all nodes in the target neural network according to a preset node connection requirement;
and generating the target neural network according to the target connection mode.
3. The method of claim 2, wherein determining a target connection mode for connecting all nodes in the target neural network according to a preset node connection requirement comprises:
determining a candidate parent node of the current node according to the node connection requirement, wherein the current node and the candidate parent node meet the node connection requirement;
selecting an actual parent node of the current node from the candidate parent nodes;
and determining a connection relationship between the current node and the actual parent node of the current node, so as to finally generate the target connection mode.
4. The method of claim 3, wherein said determining a candidate parent node for the current node based on the node connectivity requirements comprises:
determining a candidate parent node of the current node according to at least one of the following connection relationships:
when the node type of the current node is Concat or Eltwise, the current node has a plurality of parent nodes, and the number of parent nodes of the current node is less than or equal to the number of candidate parent nodes of the current node;
when the node type of the parent node of the current node is Active, the node type of the current node is a type other than Active;
when the node type of the parent node of the current node is GlobalPooling, the node type of the current node is GlobalPooling;
when the node type of the parent node of the current node is FC, the node type of the current node is FC or Concat;
when the node type of the parent node of the current node is any one of Conv, Eltwise, Pooling, and Concat, the node type of the current node may be any one of Conv, Eltwise, Pooling, Active, GlobalPooling, Concat, and FC.
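The per-type connection rules in claim 4 amount to a lookup table mapping a parent's node type to the child types it may legally feed. The sketch below transcribes that table from the claim text; the helper name and the `(node_id, node_type)` representation are assumptions for illustration.

```python
# Which child node types each parent node type may legally connect to,
# transcribed from the rules in claim 4.
ALL_TYPES = {"Conv", "Eltwise", "Pooling", "Active", "GlobalPooling", "Concat", "FC"}

ALLOWED_CHILDREN = {
    "Conv": ALL_TYPES,
    "Eltwise": ALL_TYPES,
    "Pooling": ALL_TYPES,
    "Concat": ALL_TYPES,
    "Active": ALL_TYPES - {"Active"},          # anything except another Active
    "FC": {"FC", "Concat"},
    "GlobalPooling": {"GlobalPooling"},        # as stated in the claim
}

def candidate_parents(current_type, earlier_nodes):
    """earlier_nodes: list of (node_id, node_type) already placed in the graph.
    Returns the ids of nodes whose type may legally parent current_type."""
    return [nid for nid, ptype in earlier_nodes
            if current_type in ALLOWED_CHILDREN[ptype]]

nodes = [(0, "Conv"), (1, "Active"), (2, "FC")]
print(candidate_parents("Conv", nodes))  # → [0, 1]: FC may not feed a Conv node
```

For a Concat or Eltwise node, several ids would then be drawn from this candidate list, subject to the count constraint in the claim.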
5. The method of claim 3 or 4, wherein said selecting the actual parent node of the current node from the candidate parent nodes comprises:
determining, according to a probability density function, the probability of each node in the candidate parent nodes being the actual parent node of the current node;
and determining the actual parent node of the current node from the candidate parent nodes according to the probability of each node in the candidate parent nodes being the actual parent node of the current node.
6. The method of claim 5, wherein determining the actual parent node of the current node from the candidate parent nodes based on the probability that each of the candidate parent nodes is the actual parent node of the current node comprises:
and determining a node in the candidate parent nodes whose probability of being the actual parent node of the current node is higher than a preset probability value as the actual parent node of the current node.
7. The method of claim 5 or 6, further comprising:
and adjusting the probability of each node in the candidate parent nodes being the actual parent node of the current node according to the expectation and the variance of the probability density function.
8. The method of any one of claims 5-7, wherein the probability density function is a Gaussian function.
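Claims 5–8 together describe scoring each candidate parent with a Gaussian probability density, keeping candidates above a preset threshold (claim 6), and tuning the selection by shifting the function's expectation and variance (claim 7). The sketch below assumes the density is evaluated on the index distance between candidate and current node; the claims only require a Gaussian density with adjustable mean and variance, so that choice of input is an assumption.

```python
import math

def gaussian_pdf(x, mean, var):
    # Standard Gaussian probability density with given expectation and variance.
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def select_parents(candidate_ids, current_id, mean=1.0, var=1.0, threshold=0.1):
    # Score each candidate by how far it sits from the current node (assumed
    # input to the density), then keep those above the preset probability value.
    probs = {cid: gaussian_pdf(current_id - cid, mean, var) for cid in candidate_ids}
    # Per claim 7, changing `mean`/`var` reshapes these probabilities and thus
    # biases the generator toward nearer or farther parents.
    return [cid for cid, p in probs.items() if p > threshold]

print(select_parents([0, 1, 2, 3], current_id=4))  # → [2, 3]
```

With the defaults, nodes one or two positions back score above the threshold, while distant nodes fall below it, giving the generated networks mostly local connections with occasional skips.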
9. The method of any one of claims 1-8, wherein the generating the target neural network according to the target connection mode comprises:
determining a valid target connection relationship from the target connection relationships according to a preset valid node connection relationship;
and generating the target neural network according to the valid target connection relationship.
10. The method of claim 9, wherein the valid node connection relationship comprises at least one of the following:
when the node type of the current node is Eltwise, the channel numbers of the plurality of inputs of the current node are consistent;
and when the node type of the current node is FC or GlobalPooling, the current node cannot be connected with nodes of types other than the FC, GlobalPooling, and Active types.
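The validity rules in claim 10 are simple predicates that a generated connection must pass before it is kept. The two checks below encode them directly; the function names and the channel-count representation are illustrative assumptions.

```python
def eltwise_inputs_valid(input_channel_counts):
    # Claim 10, first rule: an Eltwise node's inputs must all have the
    # same number of channels (element-wise ops need matching shapes).
    return len(set(input_channel_counts)) == 1

def fc_or_gp_child_valid(child_type):
    # Claim 10, second rule: an FC or GlobalPooling node may only feed
    # FC, GlobalPooling, or Active nodes (its output is no longer a
    # spatial feature map).
    return child_type in {"FC", "GlobalPooling", "Active"}

print(eltwise_inputs_valid([64, 64]))      # matching channels: valid
print(eltwise_inputs_valid([64, 128]))     # mismatch: connection discarded
print(fc_or_gp_child_valid("FC"))          # allowed successor
print(fc_or_gp_child_valid("Conv"))        # disallowed successor
```

Connections failing either predicate are dropped from the target connection relationships, and the network is generated from the remainder, as claim 9 describes.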
11. The method of any one of claims 1-10, wherein the determining the number of generations of the target neural network to be generated, and the node types and numbers of nodes of each generation of the target neural network, comprises:
and determining the number of generations of the target neural network, and the node types and numbers of nodes of each generation of the target neural network, according to the operational requirements on the target neural network.
12. A method for generating a neural network, comprising:
determining the number of generations of a target neural network to be generated, and the node types and numbers of nodes of each generation of the target neural network;
determining a target connection mode for connecting all nodes in the target neural network according to a preset node connection requirement;
and generating the target neural network according to the target connection mode.
13. The method of claim 12, wherein determining a target connection mode for connecting all nodes in the target neural network according to a preset node connection requirement comprises:
determining a candidate parent node of the current node according to the node connection requirement, wherein the current node and the candidate parent node meet the node connection requirement;
selecting an actual parent node of the current node from the candidate parent nodes;
and determining a connection relationship between the current node and the actual parent node of the current node, so as to finally generate the target connection mode.
14. The method of claim 13, wherein said determining a candidate parent node for the current node based on the node connectivity requirements comprises:
determining a candidate parent node of the current node according to at least one of the following connection relationships:
when the node type of the current node is Concat or Eltwise, the current node has a plurality of parent nodes, and the number of parent nodes of the current node is less than or equal to the number of candidate parent nodes of the current node;
when the node type of the parent node of the current node is Active, the node type of the current node is a type other than Active;
when the node type of the parent node of the current node is GlobalPooling, the node type of the current node is GlobalPooling;
when the node type of the parent node of the current node is FC, the node type of the current node is FC or Concat;
when the node type of the parent node of the current node is any one of Conv, Eltwise, Pooling, and Concat, the node type of the current node may be any one of Conv, Eltwise, Pooling, Active, GlobalPooling, Concat, and FC.
15. The method of claim 13 or 14, wherein said selecting the actual parent node of the current node from the candidate parent nodes comprises:
determining, according to a probability density function, the probability of each node in the candidate parent nodes being the actual parent node of the current node;
and determining the actual parent node of the current node from the candidate parent nodes according to the probability of each node in the candidate parent nodes being the actual parent node of the current node.
16. The method of claim 15, wherein determining the actual parent node of the current node from the candidate parent nodes based on the probability that each of the candidate parent nodes is the actual parent node of the current node comprises:
and determining a node in the candidate parent nodes whose probability of being the actual parent node of the current node is higher than a preset probability value as the actual parent node of the current node.
17. The method of claim 15 or 16, wherein the method further comprises:
and adjusting the probability of each node in the candidate parent nodes being the actual parent node of the current node according to the expectation and the variance of the probability density function.
18. The method of any one of claims 15-17, wherein the probability density function is a Gaussian function.
19. The method of any one of claims 12-18, wherein the generating the target neural network according to the target connection mode comprises:
determining a valid target connection relationship from the target connection relationships according to a preset valid node connection relationship;
and generating the target neural network according to the valid target connection relationship.
20. The method of claim 19, wherein the valid node connection relationship comprises at least one of the following:
when the node type of the current node is Eltwise, the channel numbers of the plurality of inputs of the current node are consistent;
and when the node type of the current node is FC or GlobalPooling, the current node cannot be connected with nodes of types other than the FC, GlobalPooling, and Active types.
21. The method of any one of claims 12-20, wherein the determining the number of generations of the target neural network to be generated, and the node types and numbers of nodes of each generation of the target neural network, comprises:
and determining the number of generations of the target neural network, and the node types and numbers of nodes of each generation of the target neural network, according to the operational requirements on the target neural network.
22. A data processing method, comprising:
determining the number of generations of a target neural network to be generated, and the node types and numbers of nodes of each generation of the target neural network;
determining a target connection mode for connecting all nodes in the target neural network according to a preset node connection requirement;
generating the target neural network according to the target connection mode;
and processing data by adopting the target neural network.
23. A validation platform for an accelerator, comprising:
a memory for storing code;
at least one processor configured to execute code stored in the memory to perform the following:
generating at least one target neural network;
translating the at least one target neural network into neural network instructions;
inputting the neural network instructions into an accelerator and into a software model matched with the accelerator, respectively, for execution, and determining a difference between the output results of the neural network instructions;
and determining an abnormal instruction in the running process of the accelerator according to the difference between the output results of the neural network instructions.
24. The validation platform of claim 23, wherein the generating at least one target neural network comprises:
determining the number of generations of the target neural network, and the node types and numbers of nodes of each generation of the target neural network, wherein the target neural network is any one of the at least one target neural network;
determining a target connection mode for connecting all nodes in the target neural network according to a preset node connection requirement;
and generating the target neural network according to the target connection mode.
25. The verification platform of claim 24, wherein determining a target connection mode for connecting all nodes in the target neural network according to preset node connection requirements comprises:
determining a candidate parent node of the current node according to the node connection requirement, wherein the current node and the candidate parent node meet the node connection requirement;
selecting an actual parent node of the current node from the candidate parent nodes;
and determining a connection relationship between the current node and the actual parent node of the current node, so as to finally generate the target connection mode.
26. The verification platform of claim 25, wherein said determining a candidate parent node for the current node based on the node connectivity requirements comprises:
determining a candidate parent node of the current node according to at least one of the following connection relationships:
when the node type of the current node is Concat or Eltwise, the current node has a plurality of parent nodes, and the number of parent nodes of the current node is less than or equal to the number of candidate parent nodes of the current node;
when the node type of the parent node of the current node is Active, the node type of the current node is a type other than Active;
when the node type of the parent node of the current node is GlobalPooling, the node type of the current node is GlobalPooling;
when the node type of the parent node of the current node is FC, the node type of the current node is FC or Concat;
when the node type of the parent node of the current node is any one of Conv, Eltwise, Pooling, and Concat, the node type of the current node may be any one of Conv, Eltwise, Pooling, Active, GlobalPooling, Concat, and FC.
27. The verification platform of claim 25 or 26, wherein said selecting the actual parent node of the current node from the candidate parent nodes comprises:
determining, according to a probability density function, the probability of each node in the candidate parent nodes being the actual parent node of the current node;
and determining the actual parent node of the current node from the candidate parent nodes according to the probability of each node in the candidate parent nodes being the actual parent node of the current node.
28. The verification platform of claim 27, wherein determining the actual parent node of the current node from the candidate parent nodes based on the probability that each of the candidate parent nodes is the actual parent node of the current node comprises:
and determining a node in the candidate parent nodes whose probability of being the actual parent node of the current node is higher than a preset probability value as the actual parent node of the current node.
29. The verification platform of claim 27 or 28, wherein the verification platform further comprises:
and adjusting the probability of each node in the candidate parent nodes being the actual parent node of the current node according to the expectation and the variance of the probability density function.
30. The verification platform of any one of claims 27-29, wherein the probability density function is a Gaussian function.
31. The validation platform of any of claims 23-30, wherein the generating the target neural network according to the target connection mode comprises:
determining a valid target connection relationship from the target connection relationships according to a preset valid node connection relationship;
and generating the target neural network according to the valid target connection relationship.
32. The verification platform of claim 31, wherein the valid node connection relationship comprises at least one of the following:
when the node type of the current node is Eltwise, the channel numbers of the plurality of inputs of the current node are consistent;
and when the node type of the current node is FC or GlobalPooling, the current node cannot be connected with nodes of types other than the FC, GlobalPooling, and Active types.
33. The validation platform of any of claims 23-32, wherein the determining the number of generations of the target neural network to be generated, and the node types and numbers of nodes of each generation of the target neural network, comprises:
and determining the number of generations of the target neural network, and the node types and numbers of nodes of each generation of the target neural network, according to the operational requirements on the target neural network.
34. An apparatus for generating a neural network, comprising:
a memory for storing code;
at least one processor configured to execute code stored in the memory to perform the following:
determining the number of generations of a target neural network to be generated, and the node types and numbers of nodes of each generation of the target neural network;
determining a target connection mode for connecting all nodes in the target neural network according to a preset node connection requirement;
and generating the target neural network according to the target connection mode.
35. The apparatus of claim 34, wherein the determining a target connection mode for connecting all nodes in the target neural network according to a preset node connection requirement comprises:
determining a candidate parent node of the current node according to the node connection requirement, wherein the current node and the candidate parent node meet the node connection requirement;
selecting an actual parent node of the current node from the candidate parent nodes;
and determining a connection relationship between the current node and the actual parent node of the current node, so as to finally generate the target connection mode.
36. The apparatus as recited in claim 35, wherein said determining a candidate parent node for said current node based on said node connectivity requirements comprises:
determining a candidate parent node of the current node according to at least one of the following connection relationships:
when the node type of the current node is Concat or Eltwise, the current node has a plurality of parent nodes, and the number of parent nodes of the current node is less than or equal to the number of candidate parent nodes of the current node;
when the node type of the parent node of the current node is Active, the node type of the current node is a type other than Active;
when the node type of the parent node of the current node is GlobalPooling, the node type of the current node is GlobalPooling;
when the node type of the parent node of the current node is FC, the node type of the current node is FC or Concat;
when the node type of the parent node of the current node is any one of Conv, Eltwise, Pooling, and Concat, the node type of the current node may be any one of Conv, Eltwise, Pooling, Active, GlobalPooling, Concat, and FC.
37. The apparatus of claim 35 or 36, wherein said selecting the actual parent node of the current node from the candidate parent nodes comprises:
determining, according to a probability density function, the probability of each node in the candidate parent nodes being the actual parent node of the current node;
and determining the actual parent node of the current node from the candidate parent nodes according to the probability of each node in the candidate parent nodes being the actual parent node of the current node.
38. The apparatus of claim 37, wherein determining the actual parent node of the current node from the candidate parent nodes based on the probability that each of the candidate parent nodes is the actual parent node of the current node comprises:
and determining a node in the candidate parent nodes whose probability of being the actual parent node of the current node is higher than a preset probability value as the actual parent node of the current node.
39. The apparatus of claim 37 or 38, wherein the apparatus further comprises:
and adjusting the probability of each node in the candidate parent nodes being the actual parent node of the current node according to the expectation and the variance of the probability density function.
40. The apparatus of any one of claims 37-39, wherein the probability density function is a Gaussian function.
41. The apparatus of any one of claims 34-40, wherein the generating the target neural network according to the target connection mode comprises:
determining a valid target connection relationship from the target connection relationships according to a preset valid node connection relationship;
and generating the target neural network according to the valid target connection relationship.
42. The apparatus of claim 41, wherein the valid node connection relationship comprises at least one of the following:
when the node type of the current node is Eltwise, the channel numbers of the plurality of inputs of the current node are consistent;
and when the node type of the current node is FC or GlobalPooling, the current node cannot be connected with nodes of types other than the FC, GlobalPooling, and Active types.
43. The apparatus of any one of claims 34-42, wherein the determining the number of generations of the target neural network to be generated, and the node types and numbers of nodes of each generation of the target neural network, comprises:
and determining the number of generations of the target neural network, and the node types and numbers of nodes of each generation of the target neural network, according to the operational requirements on the target neural network.
44. A data processing apparatus, comprising:
a memory for storing code;
at least one processor configured to execute code stored in the memory to perform the following:
determining the number of generations of a target neural network to be generated, and the node types and numbers of nodes of each generation of the target neural network;
determining a target connection mode for connecting all nodes in the target neural network according to a preset node connection requirement;
generating the target neural network according to the target connection mode;
and processing data by adopting the target neural network.
CN201980009150.XA 2019-04-18 2019-04-18 Detection method and verification platform of accelerator Pending CN111656370A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/083225 WO2020211037A1 (en) 2019-04-18 2019-04-18 Accelerator test method and verification platform

Publications (1)

Publication Number Publication Date
CN111656370A true CN111656370A (en) 2020-09-11

Family

ID=72348949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980009150.XA Pending CN111656370A (en) 2019-04-18 2019-04-18 Detection method and verification platform of accelerator

Country Status (2)

Country Link
CN (1) CN111656370A (en)
WO (1) WO2020211037A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106933713A (en) * 2015-12-30 2017-07-07 北京国睿中数科技股份有限公司 The verification method and checking system of hardware accelerator
CN107025317A (en) * 2015-10-07 2017-08-08 阿尔特拉公司 Method and apparatus for implementing the layer on convolutional neural networks accelerator
US20180060724A1 (en) * 2016-08-25 2018-03-01 Microsoft Technology Licensing, Llc Network Morphism
US20180114117A1 (en) * 2016-10-21 2018-04-26 International Business Machines Corporation Accelerate deep neural network in an fpga
US20180189638A1 (en) * 2016-12-31 2018-07-05 Intel Corporation Hardware accelerator template and design framework for implementing recurrent neural networks
CN109358993A (en) * 2018-09-26 2019-02-19 中科物栖(北京)科技有限责任公司 The processing method and processing device of deep neural network accelerator failure
CN109635949A (en) * 2018-12-31 2019-04-16 浙江新铭智能科技有限公司 A kind of neural network generation method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2676441C (en) * 2006-02-03 2015-11-24 Recherche 2000 Inc. Intelligent monitoring system and method for building predictive models and detecting anomalies
US8108328B2 (en) * 2008-07-17 2012-01-31 Tokyo Electron Limited Neural network based hermite interpolator for scatterometry parameter estimation
CN104751228B (en) * 2013-12-31 2018-04-27 科大讯飞股份有限公司 Construction method and system for the deep neural network of speech recognition

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107025317A (en) * 2015-10-07 2017-08-08 阿尔特拉公司 Method and apparatus for implementing the layer on convolutional neural networks accelerator
CN106933713A (en) * 2015-12-30 2017-07-07 北京国睿中数科技股份有限公司 The verification method and checking system of hardware accelerator
US20180060724A1 (en) * 2016-08-25 2018-03-01 Microsoft Technology Licensing, Llc Network Morphism
US20180114117A1 (en) * 2016-10-21 2018-04-26 International Business Machines Corporation Accelerate deep neural network in an fpga
US20180189638A1 (en) * 2016-12-31 2018-07-05 Intel Corporation Hardware accelerator template and design framework for implementing recurrent neural networks
CN109358993A (en) * 2018-09-26 2019-02-19 中科物栖(北京)科技有限责任公司 The processing method and processing device of deep neural network accelerator failure
CN109635949A (en) * 2018-12-31 2019-04-16 浙江新铭智能科技有限公司 A kind of neural network generation method and device

Also Published As

Publication number Publication date
WO2020211037A1 (en) 2020-10-22

Similar Documents

Publication Publication Date Title
US9002658B2 (en) Identifying components of a network having high importance for network integrity
CN108171663B (en) Image filling system of convolutional neural network based on feature map nearest neighbor replacement
CN106033425A (en) A data processing device and a data processing method
US11461656B2 (en) Genetic programming for partial layers of a deep learning model
CN110796652A (en) Image processing method, computer device, and storage medium
CN110135428A (en) Image segmentation processing method and device
CN115906927B (en) Data access analysis method and system based on artificial intelligence and cloud platform
KR102188115B1 (en) Electronic device capable of selecting a biomarker to be used in cancer prognosis prediction based on generative adversarial networks and operating method thereof
CN116822927A (en) Business process optimization method, device and storage medium
CN111656370A (en) Detection method and verification platform of accelerator
CN116646002A (en) Multi-non-coding RNA and disease association prediction method, device, equipment and medium
CN114297585B (en) Method and device for ordering important nodes in social network and computer equipment
CN116168403A (en) Medical data classification model training method, classification method, device and related medium
CN115883172A (en) Anomaly monitoring method and device, computer equipment and storage medium
CN113782092A (en) Method and device for generating life prediction model and storage medium
CN117114087B (en) Fault prediction method, computer device, and readable storage medium
WO2021189209A1 (en) Testing method and verification platform for accelerator
CN115601550B (en) Model determination method, model determination device, computer equipment and computer readable storage medium
CN113342524B (en) Operational architecture reliability analysis method, device, equipment and medium
US20240005160A1 (en) Methods and systems for optimizing a peak memory usage of an artificial neural network graph
CN117077739A (en) Model data processing method, device, equipment and medium
CN116306887A (en) Model optimization method, device, computer equipment and storage medium
CN115495497A (en) Data classification method based on decision tree pruning reinforced association rule
CN117174319A (en) Sepsis time sequence prediction method and system based on knowledge graph
CN117709410A (en) Neural network light weight method, electronic device and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200911