WO2021189209A1 - Accelerator detection method and verification platform - Google Patents

Accelerator detection method and verification platform

Info

Publication number
WO2021189209A1
WO2021189209A1 (PCT application PCT/CN2020/080742)
Authority
WO
WIPO (PCT)
Prior art keywords
node
neural network
nodes
target neural network
Prior art date
Application number
PCT/CN2020/080742
Other languages
English (en)
French (fr)
Inventor
王耀杰
林蔓虹
陈琳
Original Assignee
深圳市大疆创新科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2020/080742
Publication of WO2021189209A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Definitions

  • This application relates to the technical field of neural networks, and more specifically, to an accelerator detection method and verification platform.
  • After a neural network is generated, it generally must be loaded onto an accelerator to run before it can be used for data processing.
  • The performance of the accelerator can therefore directly affect subsequent data processing with the neural network, so how to better detect the performance of the accelerator is a problem that needs to be solved.
  • This application provides an accelerator detection method, a neural network generation method, a data processing method, a network description file generation method, and related devices to better perform accelerator detection.
  • An accelerator detection method includes: generating at least one target neural network according to a network description file, where the network description file records the network structure of the target neural network; translating the at least one target neural network into neural network instructions; inputting the neural network instructions respectively into the accelerator and a software model matching the accelerator for execution, and determining the difference in the output results of the neural network instructions; and determining, according to that difference, the instructions that behave abnormally during operation of the accelerator.
  • generating at least one target neural network includes: generating multiple target neural networks.
  • When multiple target neural networks are generated, different networks can be used to exercise the accelerator, so its performance can be detected more thoroughly.
  • the above-mentioned target neural network is a convolutional neural network.
  • The target neural network generated in this application can also be a type of neural network other than a convolutional neural network, for example a feedforward neural network or a recurrent neural network.
  • Generating at least one target neural network includes: obtaining and parsing at least one network description file, each of which includes the node types and node connection relationships of the nodes of all generations of the target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset parameter constraints between nodes; and generating the target neural network according to its structure and node parameters.
  • The corresponding target neural network is generated by parsing the network description file, and through the network description file the structure of the target neural network can be customized, modified, and reused, so that a target neural network with a clear structure can be generated multiple times.
  • The network description file is a text file whose first line is a comment describing the optional node types; the nth line (for integer n greater than 1) describes the node information of all nodes of the (n-1)th generation of the target neural network, where the node information includes node types and node connection relationships.
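The excerpt does not show a concrete description file, so the following is a purely hypothetical sketch of the layout just described: a first-line comment listing node types, and one line per generation. The `Type:parents` token syntax and the `generations_of` helper are invented for illustration only.

```python
# Hypothetical *.dscp content: the first line is a comment listing the
# optional node types, and the nth line (n > 1) lists all nodes of
# generation n-1. The "Type:parents" token syntax is illustrative only.
DESCRIPTION = """\
# node types: Input Conv Active Pooling GlobalPooling FC Concat Eltwise
Input
Conv:0 Conv:0
Concat:1,2
"""

def generations_of(text):
    """Split the sketch format into per-generation token lists, skipping
    blank lines and the first-line comment."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return [ln.split() for ln in lines if not ln.startswith("#")]

gens = generations_of(DESCRIPTION)
```

Because the file is plain text, a structure like this can be edited by hand, versioned, and reused to regenerate the same network repeatedly, which is the reuse property the text emphasizes.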
  • Generating the structure of the target neural network according to the node types and node connection relationships includes: instantiating each node according to its node type; determining the parent node of each node according to the node connection relationships; and connecting each node with its corresponding parent node to generate the structure of the target neural network.
  • The process of generating the network description file includes: determining the number of generations of the target neural network and the node types and numbers of nodes of all its generations; determining, according to preset node connection requirements, a target connection mode connecting all nodes in the target neural network; and generating the network description file according to the number of generations, the node types and numbers of nodes of each generation, and the target connection mode.
  • In this way, the network description file that serves as the basis for neural network generation can be produced more flexibly and conveniently, making it easier to generate various types of neural networks; further, when multiple types of neural networks are generated, the performance of the accelerator can be tested more thoroughly.
  • Determining, according to the preset node connection requirements, the target connection mode connecting all nodes in the target neural network includes: determining candidate parent nodes of the current node according to the node connection requirements, where the current node and the candidate parent nodes meet those requirements; selecting the actual parent node of the current node from the candidate parent nodes; and determining the connection relationship between the current node and its actual parent node to finally generate the target connection mode.
  • the above-mentioned candidate parent node may also be referred to as a candidate node of the parent node.
  • Determining the candidate parent nodes of the current node according to the node connection requirements includes determining them according to at least one of the following connection relationships: when the node type of the current node is Concat (concatenation layer) or Eltwise (element-wise operation layer), the current node has multiple parent nodes, and the number of parent nodes is less than or equal to the number of candidate parent nodes; when the node type of the parent node is Active (activation layer), the node type of the current node is a type other than Active; when the node type of the parent node is Global Pooling (global pooling layer), the node type of the current node is Global Pooling; when the node type of the parent node is FC (fully connected layer), the node type of the current node is FC or Concat.
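The type-compatibility rules above can be sketched as a candidate-parent filter. This is an illustrative reading of the rules, not the patent's implementation; the node-type strings are assumptions about spelling, and the multi-parent count rule for Concat/Eltwise is applied later, when actual parents are chosen.

```python
def may_connect(parent_type, child_type):
    """Sketch of the connection requirements listed above; the node-type
    strings are assumptions about how the types might be spelled."""
    if parent_type == "Active" and child_type == "Active":
        return False  # an Active parent must feed a non-Active node
    if parent_type == "GlobalPooling" and child_type != "GlobalPooling":
        return False  # a Global Pooling parent only feeds Global Pooling
    if parent_type == "FC" and child_type not in ("FC", "Concat"):
        return False  # an FC parent only feeds FC or Concat nodes
    return True

def candidate_parents(child_type, earlier_nodes):
    """Filter earlier-generation nodes down to valid candidate parents."""
    return [n for n in earlier_nodes if may_connect(n["node_t"], child_type)]

earlier = [{"node_t": t} for t in ("Conv", "Active", "FC", "GlobalPooling")]
```

For example, a Concat node can take Conv, Active, or FC candidates, while an Active node can only take the Conv candidate under these rules.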
  • Selecting the actual parent node of the current node from the candidate parent nodes includes: determining, according to a probability density function, the probability that each candidate parent node is the actual parent node of the current node; and determining the actual parent node from the candidate parent nodes according to those probabilities.
  • Determining the actual parent node of the current node from the candidate parent nodes according to those probabilities includes: determining a node whose probability of being the actual parent node is greater than a preset probability value as the actual parent node of the current node.
  • the above method further includes: adjusting the probability of each of the candidate parent nodes as the actual parent node of the current node according to the expectation and variance of the probability density function.
  • the width and depth of the target neural network can be adjusted, so that the target neural network whose depth and width meet the requirements can be generated.
  • the expectation and variance of the probability density function can be adjusted according to the requirements of the depth and width of the target neural network to be generated.
  • The greater the variance of the probability density function, the greater the probability that nodes in the neighboring generation are selected, and the narrower and deeper the resulting network.
  • the aforementioned probability density function is a Gaussian function.
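A minimal sketch of this selection scheme, assuming the Gaussian is evaluated over the generation distance between a candidate parent and the current node (the text only says a probability density function such as a Gaussian is used, with its expectation and variance tuned to steer depth and width; everything else here is an illustrative assumption):

```python
import math

def parent_probabilities(candidate_gen_ids, child_gen_id, mean=1.0, var=1.0):
    """Weight each candidate parent with a Gaussian over its generation
    distance to the child, then normalise the weights to probabilities."""
    weights = [
        math.exp(-((child_gen_id - gid - mean) ** 2) / (2 * var))
        for gid in candidate_gen_ids
    ]
    total = sum(weights)
    return [w / total for w in weights]

def actual_parents(candidates, probs, threshold):
    """Per the text, candidates whose probability exceeds a preset value
    become actual parents of the current node."""
    return [c for c, p in zip(candidates, probs) if p > threshold]

# A node in generation 3 choosing among candidates in generations 0-2:
probs = parent_probabilities([0, 1, 2], child_gen_id=3, mean=1.0, var=0.5)
```

With the mean placed at distance 1, adjacent-generation candidates dominate, producing chain-like (deep, narrow) networks; shifting the mean or widening the variance spreads probability onto more distant generations and encourages cross-generation connections.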
  • Generating the target neural network according to the target connection mode includes: determining the effective target connection relationships from the target connection relationships according to preset effective connection relationships of nodes; and generating the target neural network according to the effective target connection relationships.
  • The effective connection relationships of nodes include at least one of the following: when the node type of the current node is Eltwise, the channel numbers of the current node's multiple inputs must be the same; when the node type of the current node is FC or Global Pooling, the current node can only connect to nodes of types other than FC, Global Pooling, and Active.
  • Determining the number of generations of the target neural network to be generated, as well as the node types and numbers of nodes of all its generations, includes determining them according to the operational requirements of the target neural network.
  • The operational requirement of the target neural network can be a computation-amount (size) requirement: when the required amount of computation is small, fewer generations and fewer nodes per generation can be set for the target neural network; when it is large, more generations and more nodes per generation can be set.
  • The operational requirement can also be computational complexity: when the complexity is low, fewer generations and fewer nodes per generation can be set for the target neural network; when it is high, more generations and more nodes per generation can be set.
  • The process of generating the network description file includes: determining the node types and node connection relationships of the nodes of all generations of the target neural network according to the structure type of the accelerator; and generating the network description file according to those node types and node connection relationships.
  • Determining the node types and node connection relationships of the nodes of all generations of the target neural network according to the structure type of the accelerator includes: determining the number of sub-networks in the target neural network; determining the node type of each node in each sub-network according to the types of the accelerator's arithmetic units; determining the node connection relationships within each sub-network, and determining the node connection relationships between sub-networks according to the relationships within each sub-network; and determining the node types and node connection relationships of the nodes of all generations according to the node types within each sub-network, the connection relationships within each sub-network, and the connection relationships between sub-networks.
  • Determining the node type of each node in a sub-network according to the types of the accelerator's arithmetic units includes: selecting at least one of the arithmetic-unit types as the node type of a node in the sub-network, with each node in the sub-network receiving a different type; and repeating the previous step until the node type of every node in the sub-network is determined.
  • Determining the node connection relationships within each sub-network, and determining the node connection relationships between sub-networks according to them, includes: selecting a node as the input node of the sub-network; determining, randomly or by traversal, the parent node of each node in the sub-network other than the input node; determining the output nodes of each sub-network; and determining, randomly or by traversal, the node connection relationships between sub-networks according to the input and output nodes of the sub-networks.
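The sub-network wiring steps above can be sketched as follows. This is one possible reading under stated assumptions: node 0 serves as the input node, each later node picks a random earlier node as its parent, nodes never used as a parent become outputs, and sub-networks are chained input-to-output; the patent also allows traversal instead of random choice and other inter-sub-network schemes.

```python
import random

def build_subnetwork(node_types, rng):
    """Sketch of the intra-sub-network wiring described above: node 0 is
    the input node, every later node takes a random earlier node as its
    parent, and nodes never used as a parent become the outputs."""
    nodes = [{"node_t": t, "bottom": []} for t in node_types]
    for i in range(1, len(nodes)):
        nodes[i]["bottom"].append(rng.randrange(i))
    used_as_parent = {p for n in nodes for p in n["bottom"]}
    outputs = [i for i in range(len(nodes)) if i not in used_as_parent]
    return nodes, outputs

def chain_subnetworks(subnets):
    """One possible inter-sub-network scheme: the input node of
    sub-network i+1 takes all output nodes of sub-network i as parents."""
    return [
        {"from_subnet": i, "from_nodes": subnets[i][1],
         "to_subnet": i + 1, "to_node": 0}
        for i in range(len(subnets) - 1)
    ]

rng = random.Random(0)
subnets = [build_subnetwork(["Input", "Conv", "Pooling"], rng),
           build_subnetwork(["Conv", "FC"], rng)]
links = chain_subnetworks(subnets)
```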
  • A method for generating a neural network includes: obtaining and parsing at least one network description file, each of which includes the node types and node connection relationships of the nodes of all generations of the target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset parameter constraints between nodes; and generating the target neural network according to its structure and node parameters.
  • the corresponding target neural network is generated by parsing the network description file, and the structure of the target neural network can be customized, modified, and reused through the network description file to generate a target neural network with a clear structure multiple times.
  • A data processing method includes: obtaining and parsing at least one network description file, each of which includes the node types and node connection relationships of the nodes of all generations of the target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset parameter constraints between nodes; generating the target neural network according to its structure and node parameters; and using the target neural network for data processing.
  • The corresponding target neural network is generated by parsing the network description file, and through the network description file its structure can be customized, modified, and reused, so that a target neural network with a clear structure can be generated multiple times and a specific neural network can be used to process the corresponding data.
  • A method for generating a network description file includes: determining the number of generations of the target neural network and the node types and numbers of nodes of all its generations; determining, according to preset node connection requirements, the target connection mode connecting all nodes in the target neural network; and generating the network description file according to the number of generations, the node types and numbers of nodes of each generation, and the target connection mode.
  • In this way, the network description file that serves as the basis for neural network generation can be produced more flexibly and conveniently, making it easier to generate various types of neural networks; further, when multiple types of neural networks are generated, the performance of the accelerator can be tested more thoroughly.
  • A method for generating a network description file includes: determining the node types and node connection relationships of the nodes of all generations of the target neural network according to the structure type of the accelerator; and generating the network description file according to those node types and node connection relationships.
  • An accelerator verification platform includes: a memory for storing code; and at least one processor for executing the code stored in the memory to perform the following operations: generating at least one target neural network according to a network description file, where the network description file records the network structure of the target neural network; translating the at least one target neural network into neural network instructions; inputting the neural network instructions respectively into the accelerator and a software model matching the accelerator for execution, and determining the difference in the output results of the neural network instructions; and determining, according to that difference, the instructions that behave abnormally during operation of the accelerator.
  • A device for generating a neural network includes: a memory for storing code; and at least one processor for executing the code stored in the memory to perform the following operations: obtaining and parsing at least one network description file, each of which includes the node types and node connection relationships of the nodes of all generations of the target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset parameter constraints between nodes; and generating the target neural network according to its structure and node parameters.
  • A data processing device includes: a memory configured to store code; and at least one processor configured to execute the code stored in the memory to perform the following operations: obtaining and parsing at least one network description file, each of which includes the node types and node connection relationships of the nodes of all generations of the target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset parameter constraints between nodes; generating the target neural network according to its structure and node parameters; and using the target neural network for data processing.
  • A device for generating a network description file includes: a memory for storing code; and at least one processor for executing the code stored in the memory to perform the following operations: determining the number of generations of the target neural network and the node types and numbers of nodes of all its generations; determining, according to preset node connection requirements, the target connection mode connecting all nodes in the target neural network; and generating the network description file according to the number of generations, the node types and numbers of nodes of each generation, and the target connection mode.
  • A device for generating a network description file includes: a memory for storing code; and at least one processor for executing the code stored in the memory to perform the following operations: determining the node types and node connection relationships of the nodes of all generations of the target neural network according to the structure type of the accelerator; and generating the network description file according to those node types and node connection relationships.
  • FIG. 1 is a schematic diagram of a neural network structure.
  • FIG. 2 is a schematic flowchart of an accelerator detection method according to an embodiment of the present application.
  • FIG. 3 is a flowchart of a neural network generation process according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a network structure described by a network description file in an embodiment of the present application.
  • FIG. 5 is a flowchart of a process of generating a target neural network in an embodiment of the present application.
  • FIG. 6 is a flowchart of a method for generating a network description file in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the determined number of generations of the target neural network, the number of nodes in each generation, and the node types.
  • FIG. 8 is a schematic diagram of a possible node connection relationship of a neural network.
  • FIG. 9 is a schematic diagram of a possible node connection relationship of a neural network.
  • FIG. 10 is a schematic diagram of a possible node connection relationship of a neural network.
  • FIG. 11 is a schematic diagram of a possible node connection relationship of a neural network.
  • FIG. 12 is a schematic diagram of the process of generating a network description file in the embodiment shown in FIG. 6.
  • FIG. 13 is a flowchart of another process of generating a network description file according to an embodiment of the present application.
  • FIG. 14 is a flowchart of determining the structure of the target neural network according to the structure type of the accelerator in an embodiment of the present application.
  • FIG. 15 is a schematic diagram of node type selection within a sub-network.
  • FIG. 16 is a flowchart for determining the connection relationships between nodes.
  • FIG. 17 is a schematic diagram of the process of determining the connection relationships of the sub-network shown in FIG. 15.
  • FIG. 18 is a schematic block diagram of an accelerator verification platform according to an embodiment of the present application.
  • FIG. 19 is a schematic block diagram of an apparatus for generating a neural network according to an embodiment of the present application.
  • FIG. 20 is a schematic block diagram of a data processing device according to an embodiment of the present application.
  • FIG. 21 is a schematic block diagram of an apparatus for generating a network description file according to an embodiment of the present application.
  • FIG. 22 is a schematic block diagram of an apparatus for generating a network description file according to an embodiment of the present application.
  • Figure 1 is a schematic diagram of the neural network structure.
  • the neural network in FIG. 1 can be either a convolutional neural network or other types of neural networks, which is not limited in this application.
  • the structure of the neural network mainly includes three parts: node (node), generation (generation) and tree (tree).
  • the neural network includes nodes 1 to 9, which together form the nodes from the 0th to the 4th generation.
  • The nodes included in each generation are as follows:
  • the 0th generation: node 1;
  • the 1st generation: node 2, node 3, node 4;
  • the 2nd generation: node 5, node 6;
  • the 3rd generation: node 7, node 8;
  • the 4th generation: node 9.
  • the node of the previous generation can be used as the parent node of the node of the subsequent generation, and the node of the subsequent generation can be used as the child node of the node of the previous generation.
  • For example, nodes in the 1st to 4th generations can serve as child nodes of the 0th-generation node, and 1st-generation nodes can serve as parent nodes of nodes in the 2nd to 4th generations.
  • the above-mentioned nodes in the 0th to 4th generations together constitute the tree of the neural network.
  • Each node is used to describe a computing layer (for example, a convolutional layer).
  • the information contained in each node and the meaning of the corresponding information are as follows:
  • node_header: the header information of the node.
  • The header information includes sequence, gen_id, and node_id, where sequence is the overall sequence number of the node, gen_id is the index of the generation the node belongs to, and node_id is the index of the node within that generation.
  • node_t: the node type; for example, Input (input layer), Eltwise (element-wise operation layer), Concat (concatenation layer), etc.
  • node_name: the name of the node.
  • top: the node name of the node's top node, where the top node is a child node of the node.
  • bottom[]: the node names of the node's bottom nodes, where the bottom nodes are the parent nodes of the node and the number of bottom nodes is parent_num.
  • if_n/c/h/w[]: the batch number, channel number, height, and width of each of the node's inputs, where the number of inputs equals parent_num.
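The node fields above can be collected into a small data structure. This is a sketch of one possible in-memory representation, not the patent's implementation; field names follow the text, while the types and defaults are assumptions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class NodeHeader:
    sequence: int  # overall sequence number of the node
    gen_id: int    # index of the generation the node belongs to
    node_id: int   # index of the node within its generation

@dataclass
class Node:
    node_header: NodeHeader
    node_t: str        # node type, e.g. "Input", "Eltwise", "Concat"
    node_name: str
    top: str = ""      # name of the top (child) node
    bottom: List[str] = field(default_factory=list)  # parent node names
    if_n: List[int] = field(default_factory=list)    # batch per input
    if_c: List[int] = field(default_factory=list)    # channels per input
    if_h: List[int] = field(default_factory=list)    # height per input
    if_w: List[int] = field(default_factory=list)    # width per input

    @property
    def parent_num(self) -> int:
        # the number of bottom (parent) nodes is parent_num
        return len(self.bottom)

concat = Node(NodeHeader(sequence=4, gen_id=2, node_id=0),
              node_t="Concat", node_name="concat0",
              bottom=["conv0", "conv1"],
              if_n=[1, 1], if_c=[16, 16], if_h=[32, 32], if_w=[32, 32])
```

Note how the per-input shape arrays (`if_n/c/h/w`) stay in lockstep with `bottom`: each parent contributes one input, so all four lists have `parent_num` entries.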
  • A generation is used to organize at least one node. If a generation contains multiple nodes, nodes in the same generation cannot connect to each other; nodes in the current generation can only connect to nodes in generations whose gen_id is smaller than that of the current generation (that is, cross-generation connections are supported).
  • the information contained in the generation and the meaning of the corresponding information are as follows:
  • gen_id: the index number of the generation.
  • node_num: the number of nodes contained in the generation; node_num is less than or equal to the maximum width of the neural network.
  • nodes: the instances of the nodes included in the generation.
  • node_tq[]: the type of each node included in the generation.
  • Trees are used to organize multiple generations and describe the connection relationships of all nodes in the network.
  • the information contained in the tree and the meaning of the corresponding information are as follows:
  • gen_num: the number of generations contained in the tree; gen_num is less than or equal to the maximum depth of the network.
  • gens[]: the instances of the generations contained in the tree; the number of entries in gens[] equals gen_num.
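The generation and tree constraints above reduce to two simple checks, sketched below. The numeric width and depth limits are illustrative assumptions; the text only says node_num is bounded by the network's maximum width and gen_num by its maximum depth.

```python
# Illustrative bounds; the text only says node_num is bounded by the
# network's maximum width and gen_num by its maximum depth.
MAX_WIDTH = 8
MAX_DEPTH = 16

def connection_allowed(parent_gen_id, child_gen_id):
    """Nodes in the same generation may not connect to each other; a node
    may only take parents from generations with a strictly smaller gen_id
    (cross-generation connections are allowed)."""
    return parent_gen_id < child_gen_id

def valid_tree(generations):
    """Check a tree, given as a list of per-generation node counts."""
    return (len(generations) <= MAX_DEPTH
            and all(node_num <= MAX_WIDTH for node_num in generations))
```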
  • The neural network structure introduced above in conjunction with FIG. 1 is only one possible structure of the neural network in the embodiments of the present application; the neural network may also have other structures, and the specific structure and form of the network are not limited.
  • Fig. 2 is a schematic flowchart of an accelerator detection method according to an embodiment of the present application.
  • The method shown in FIG. 2 can be executed by an electronic device or a server, where the electronic device can be any device containing a processor, for example a mobile terminal (such as a smartphone), a computer, a personal digital assistant, a wearable device, a vehicle-mounted device, or an Internet of Things device.
  • the accelerator detection method 200 may include:
  • Step S1: generate at least one target neural network according to the network description file, where the network description file records the network structure of the target neural network.
  • Step S2: translate the at least one target neural network into neural network instructions.
  • Step S3: input the neural network instructions respectively into the accelerator and the software model matching the accelerator for execution, and determine the difference in the output results of the neural network instructions.
  • Step S4: determine the instructions that behave abnormally during operation of the accelerator according to the difference in the output results.
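Steps S3 and S4 can be condensed into the following sketch. The accelerator and software model here are trivial stand-ins (the fault injected into the multiply path is invented purely to show how a differing output exposes an abnormal instruction); a real platform would drive the hardware and its simulation model instead.

```python
def find_abnormal_instructions(instructions, run_on_accelerator, run_on_model):
    """Steps S3-S4: execute each neural-network instruction on both the
    accelerator and its matching software model, and flag instructions
    whose output results differ as abnormal."""
    return [instr for instr in instructions
            if run_on_accelerator(instr) != run_on_model(instr)]

# Hypothetical stand-ins: the software model computes the reference result,
# while the "accelerator" has an injected fault in its multiply path.
def model(instr):
    return instr["a"] * instr["b"] if instr["op"] == "mul" else instr["a"] + instr["b"]

def accel(instr):
    result = model(instr)
    return result + 1 if instr["op"] == "mul" else result  # injected fault

instrs = [{"op": "add", "a": 2, "b": 3}, {"op": "mul", "a": 2, "b": 3}]
abnormal = find_abnormal_instructions(instrs, accel, model)
```

Only the multiply instruction is flagged, which is exactly the localisation property the method relies on: the differing instruction points at the faulty part of the accelerator design.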
  • In step S1, at least one target neural network is generated according to the network description file, where the network description file records the network structure of the target neural network.
  • the aforementioned at least one target neural network is a plurality of target neural networks.
  • The target neural network may be a convolutional neural network, or another type of neural network, for example a feedforward neural network or a recurrent neural network.
  • the foregoing network description file may be, for example, a text file, and the file type is, for example, *.dscp.
  • the file type of the network description file can also be other types that can be edited, and the present disclosure does not impose special restrictions on this.
  • In step S2, the at least one target neural network is translated into neural network instructions.
  • To load the at least one target neural network into the accelerator or the software model for execution, it is generally necessary to first translate it into instructions that the accelerator or the software model can execute.
  • In step S3, the neural network instructions are respectively input into the accelerator and the software model matching the accelerator for execution, and the difference in the output results of the neural network instructions is determined.
  • the above-mentioned software model matched with the accelerator may be a software model for comparing the performance of the accelerator, and the software model may simulate the operation behavior of the accelerator.
  • Specifically, the neural network instructions are input into the accelerator to obtain a first output result and into the software model to obtain a second output result, and the difference in the output results of the neural network instructions can be obtained by comparing the first output result with the second output result.
  • In step S4, the instructions that behave abnormally during operation of the accelerator are determined according to the difference in the output results of the neural network instructions.
  • Specifically, when the output results differ, the accelerator instruction corresponding to those output results can be regarded as an instruction that behaves abnormally during operation of the accelerator.
  • After the abnormal instruction is determined, it can be used to locate problems in the accelerator and to further improve or amend the accelerator's design, thereby improving the accelerator's performance.
  • In this application, multiple target neural networks with clear structures can be generated, so targeted performance detection of the accelerator can be performed effectively; further, when multiple target neural networks are generated, different networks can be used to detect the performance of the accelerator, achieving better performance detection.
  • There are many ways to implement the generation of at least one target neural network according to the network description file in step S1.
  • the method of generating at least one target neural network according to the network description file in step S1 will be described in detail below with reference to FIG. 3.
  • Fig. 3 is a flowchart of a neural network generation method according to an embodiment of the present application.
  • the method 300 for generating a neural network may include:
  • Step S11 Obtain and parse at least one network description file.
  • Each network description file includes node types and node connection relationships of nodes of all generations of a target neural network.
  • step S12 the structure of the target neural network is generated according to the node type and the connection relationship of the nodes.
  • Step S13 generating node parameters of the target neural network according to preset parameter constraint conditions between nodes.
  • Step S14 Generate a target neural network according to the structure and node parameters of the target neural network.
  • step S11 at least one network description file is acquired and parsed.
  • Each network description file includes node types and node connection relationships of nodes of all generations of a target neural network.
  • the above-mentioned target neural network can be any kind of neural network.
  • Each description file can describe a target neural network.
  • multiple network description files can be obtained and parsed to generate multiple target neural networks with clear structures and random parameters.
• the format of the network description file may be, for example: the first line includes comments, which are used to describe the optional node types; the nth line (n is an integer greater than 1) describes the node information of all nodes of the (n-1)-th generation of the target neural network, where the node information includes the node type and the node connection relationship.
  • the optional node types include but are not limited to the following types:
  • the information of a single node may be described in the following format, for example:
  • NodeType is the node type of the current node, which is selected from the effective node types in the first row.
• for the first generation, the node type is generally the Input type (that is, the first-generation node of the target neural network is set as an Input node);
  • PGnIDX_PNnIDX is the parent node information of the current node, that is, the node connection relationship of the current node.
  • a node can have multiple parent nodes. For a current node, multiple parent node information can be described (the Input type node has no parent node);
• PGnIDX and PGmIDX respectively represent the generation indices of the n-th parent node and the m-th parent node of the current node; the generation index of a parent node is always smaller than the generation index of the current node;
  • PNnIDX and PNmIDX respectively represent the node index of the n-th parent node and the m-th parent node in the generation.
  • Each line in the network description file (that is, each generation in the target neural network) can describe multiple nodes, and the node description information of each node is separated by a single space character.
• the 0th generation of the target neural network includes 1 node, namely node 1, and the node type of node 1 is Input; the first generation includes three nodes: node 2, node 3, and node 4.
  • the node types are ReLU, Conv, and PReLU respectively.
  • the parent node of node 2 is the 0th node of generation 0, namely node 1.
• the parent nodes of node 3 and node 4 are the same as that of node 2, namely the Input node; the second generation includes 1 node, namely node 5, whose node type is Pooling and whose parent node is the 0th node of generation 1, namely node 2; the third generation includes 1 node, namely node 6, whose node type is Eltwise; this node connects two parent nodes: the 0th node of the second generation, namely node 5, and the 1st node of the first generation, namely node 3.
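The description-file layout above can be parsed with a short routine like the following. The exact token syntax (e.g. `Conv_0_0` meaning node type followed by parent generation/node index pairs) is an assumption for illustration; the patent only fixes the general layout of comment line, generations, and space-separated node descriptions.

```python
# Minimal parser sketch for a dscp-style description text. Token format
# "NodeType_PGIDX_PNIDX[_PGIDX_PNIDX...]" is assumed here, following the
# PGnIDX/PNnIDX naming in the text.

def parse_dscp(text):
    """Return, per generation, a list of (node_type, [(parent_gen, parent_node), ...])."""
    lines = text.strip().splitlines()
    generations = []
    for line in lines[1:]:                 # line 0 holds comments on valid types
        nodes = []
        for token in line.split(" "):      # nodes are separated by single spaces
            parts = token.split("_")
            node_type = parts[0]
            # remaining parts come in (parent_gen, parent_node) pairs
            parents = [(int(parts[i]), int(parts[i + 1]))
                       for i in range(1, len(parts), 2)]
            nodes.append((node_type, parents))
        generations.append(nodes)
    return generations

# The worked example: Input; ReLU/Conv/PReLU all fed by generation 0 node 0;
# Pooling fed by generation 1 node 0; Eltwise fed by (gen 2, node 0) and (gen 1, node 1).
demo = """# valid types: Input Conv ReLU PReLU Pooling Eltwise
Input
ReLU_0_0 Conv_0_0 PReLU_0_0
Pooling_1_0
Eltwise_2_0_1_1"""
net = parse_dscp(demo)
print(len(net))    # 4 generations
print(net[3][0])   # ('Eltwise', [(2, 0), (1, 1)])
```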
• the network description file can be edited and set by developers according to their needs, so that various neural networks with determined structures and random parameters can be customized.
  • step S12 the structure of the target neural network is generated according to the node type and the connection relationship of the nodes.
  • step S12 may mainly include: instantiating each node according to the node type; determining the parent node of each node according to the node connection relationship; connecting each node and the parent node corresponding to each node to generate the structure of the target neural network.
• the generated structure of the target neural network can be used to verify the rationality of the connection relationships of the target neural network, or to output a target neural network that can be applied to other purposes.
  • the above-mentioned prototxt file may also be output directly according to the node type and the node connection relationship, without outputting the target neural network.
  • step S13 node parameters of the target neural network are generated according to preset parameter constraint conditions between nodes.
• the intra-node parameter types of each node, the number of intra-node parameters, and the intra-node parameter values are related to the node type.
  • of_h represents the height of the node output feature map
  • if_h represents the height of the node input feature map
• pad_h represents the number of rows of elements filled onto the node input feature map for the convenience of calculation, usually filled with 0
• dilation_h represents the number of elements interpolated into the node input feature map (dilation_h is greater than 0); the interpolated value is 0
  • kernel_h represents the size of the convolution kernel during convolution operation
• stride_h represents the step size of the convolution kernel or the pooling window sliding in the height direction
  • Pool_size represents the size of the window during pooling processing.
  • Condition A The size of the feature map output by the parent node is equal to the size of the feature map input by the child node.
  • the size of the output feature map in the parent node must be the same as the size of the feature map input by the child node.
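Condition A ties together the quantities defined above. The patent's own formula (1) is not reproduced in this excerpt, so the sketch below uses the standard convolution output-size arithmetic (as in Caffe, where `dilation_h = 1` means no dilation) as an assumed reconstruction; it may differ in convention from the patent's definition of dilation_h.

```python
# Assumed reconstruction of the output-size relation behind condition A,
# using standard convolution arithmetic.

def conv_out_h(if_h, pad_h, dilation_h, kernel_h, stride_h):
    """Output feature-map height of_h of a convolution node."""
    effective_kernel = dilation_h * (kernel_h - 1) + 1   # kernel after dilation
    return (if_h + 2 * pad_h - effective_kernel) // stride_h + 1

def sizes_match(parent_of_h, child_if_h):
    """Condition A: the parent's output size must equal the child's input size."""
    return parent_of_h == child_if_h

# 224-high input, 3x3 kernel, pad 1, stride 1, no dilation -> size preserved
of_h = conv_out_h(224, 1, 1, 3, 1)
print(of_h)                     # 224
print(sizes_match(of_h, 224))   # True
```

With this relation, the node parameters generated in step S13 can be checked generation by generation: each child's if_h must equal its parent's of_h.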
  • Fig. 5 is a flowchart of a process of generating a target neural network in an embodiment of the present disclosure.
  • the process shown in FIG. 5 can be executed by an electronic device (for the definition and explanation of the electronic device, please refer to the related content in the method shown in FIG. 2).
• the process 500 shown in FIG. 5 includes steps S501 to S510, which are described in detail below.
  • Step S501 start.
  • Step S501 represents starting to generate a neural network.
  • Step S502 Obtain and parse the description file in the dscp format.
• if the parsing is successful, go to step S503; if the parsing fails, return a parsing-failure message and go to step S510 to end the neural network generation process.
  • Step S503 Obtain the number of nodes of each generation recorded in the dscp file, the node type of each node in each generation, and the node connection relationship.
  • step S503 the node type and node connection relationship of each node, that is, the description information of the parent node of each node, can be obtained according to the order of description in the dscp file.
  • Step S504 instantiate each node according to the node type.
• each node in each generation can be instantiated according to the node types and the number of nodes in that generation; that is, the node instances in each generation are determined according to the node types and node counts of the generation, where one node can correspond to one instance or to multiple instances.
• the node here is more of a logical concept, while the node instance is the entity on which the node actually depends; the various data-processing tasks of the node are executed on this entity.
  • Step S505 Configure the header information of each node and the number of parent nodes.
  • Configuring the header information of each node is to generate the total sequence number of each node (sequence), the generation index number (gen_id) and the node index number (node_id) in the generation.
• the generation index number (gen_id) of each generation and the total sequence number (sequence) of each node can be generated in top-to-bottom order; within each generation, the node index number (node_id) of each node is generated in a certain order.
  • sequence represents the sequence number of the node in the entire neural network.
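The header numbering just described can be sketched as follows; the dict layout, the function name, and the `nodes_per_generation` input are illustrative choices, not details from the patent.

```python
# Sketch of step S505: assign each node a global sequence number (sequence),
# its generation index (gen_id), and its index within the generation (node_id),
# walking the generations top to bottom.

def make_headers(nodes_per_generation):
    headers = []
    sequence = 0
    for gen_id, count in enumerate(nodes_per_generation):  # top-to-bottom order
        for node_id in range(count):
            headers.append({"sequence": sequence,
                            "gen_id": gen_id,
                            "node_id": node_id})
            sequence += 1
    return headers

# The worked example network: generations of 1, 3, 1 and 1 nodes (6 nodes total).
hdrs = make_headers([1, 3, 1, 1])
print(hdrs[4])  # {'sequence': 4, 'gen_id': 2, 'node_id': 0}
```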
  • Step S506 it is determined whether the current connection is valid.
  • step S506 it is necessary to determine whether the currently existing connection is valid.
• each connection relationship can be judged according to the preceding conditions (4) and (5): a connection relationship that satisfies both condition (4) and condition (5) is a valid connection relationship, and a connection relationship that fails either condition (4) or condition (5) is an invalid connection relationship.
• when the connection is determined to be valid, step S507 is executed.
• when the connection is determined to be invalid, step S510 is entered to end the network generation process.
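A validity check in the spirit of step S506 can be sketched as below. Condition (4) requires an Eltwise node's inputs to carry equal channel counts, and condition (5) only allows FC, Global Pooling, or activation-type nodes after an FC or Global Pooling node. The type names, the field layout, and the assumed activation set (`Relu`/`Prelu`) are illustrative, not taken from the patent.

```python
# Hedged sketch of the connection-validity check (conditions (4) and (5)).

ACT_TYPES = {"Relu", "Prelu"}                       # assumed activation types
AFTER_FC_OK = {"FC", "GlobalPooling"} | ACT_TYPES   # allowed after FC / Global Pooling

def connection_valid(child_type, parent_types, parent_channels):
    # condition (4): an Eltwise node's inputs must agree on channel count
    if child_type == "Eltwise" and len(set(parent_channels)) > 1:
        return False
    # condition (5): FC / Global Pooling may only feed FC, Global Pooling, or act
    for p in parent_types:
        if p in {"FC", "GlobalPooling"} and child_type not in AFTER_FC_OK:
            return False
    return True

print(connection_valid("Eltwise", ["Conv", "Conv"], [1, 1]))   # True
print(connection_valid("Eltwise", ["Conv", "Conv"], [2, 1]))   # False, fails (4)
print(connection_valid("Conv", ["FC"], [1]))                   # False, fails (5)
```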
  • Step S507 Connect each node.
  • each node and its parent node may be connected according to the parent node description information of each node recorded in the dscp file.
  • step S508 the intra-node parameters of each node are randomly generated.
  • the intra-node parameters of each node can be determined according to the above formula (1), formula (2) and the constraints of condition A.
  • step S509 the prototype file in the prototxt format is printed.
  • the prototxt file contains the connection relationship of each node in the neural network to be generated. After the prototxt file is generated, it is convenient to construct or generate the neural network according to the prototxt file.
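Printing the prototxt prototype file (step S509) can be sketched minimally as below. Only the layer name/type/bottom fields are emitted; the real prototxt format carries many more fields (shapes, layer parameters), so this is illustrative only and the node tuple layout is an assumption.

```python
# Minimal sketch of step S509: print a prototxt-style layer list from the
# connected node list of the generated network.

def to_prototxt(nodes):
    """nodes: list of (name, type, [parent names]); returns a prototxt-like string."""
    blocks = []
    for name, ntype, parents in nodes:
        lines = ["layer {", f'  name: "{name}"', f'  type: "{ntype}"']
        lines += [f'  bottom: "{p}"' for p in parents]   # one bottom per parent
        lines.append("}")
        blocks.append("\n".join(lines))
    return "\n".join(blocks)

txt = to_prototxt([("node1", "Input", []),
                   ("node2", "ReLU", ["node1"])])
print('bottom: "node1"' in txt)  # True
```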
  • Step S510 end.
  • Step S510 represents the end of the neural network generation process.
  • Fig. 6 is a flowchart of a method for generating a network description file in an embodiment of the present disclosure.
  • a method 600 for generating a network description file indicated in FIGS. 2 to 5 may include:
• Step S61 Determine the number of generations of the target neural network, and the node types and the numbers of nodes of all generations of the target neural network.
  • Step S62 Determine a target connection mode for connecting all nodes in the target neural network according to a preset node connection requirement.
  • Step S63 Generate the network description file according to the generation number of the target neural network, the node type of each generation of nodes, the number of nodes, and the target connection mode.
• step S61 determine the number of generations of the target neural network, and the node types and the numbers of nodes of all generations of the target neural network.
  • the target neural network determined in the above step S61 may be any one of the at least one target neural network in the above step S1.
  • step S61 the generation number of the target neural network can be randomly determined first, and then the node type and the number of nodes of each generation of nodes can be randomly determined.
• the number of generations of the target neural network can be randomly determined to be 5 (the number of generations of the neural network in FIG. 1 is 5).
• the number of generations of the target neural network can also be determined within a certain numerical range (for example, the depth range of the neural network).
• for example, the number of generations of the target neural network can be randomly determined to be 12 within the range of values [10, 20].
• the node type of each generation of nodes can be determined from all available node types; when determining the number of nodes in each generation, it can be determined within a certain range of values (for example, the width range of the neural network).
• step S61 the number of generations of the target neural network, as well as the node type and number of nodes of each generation, can also be set according to specific (operational) requirements.
  • the node type of each generation can be determined according to the available node types of Input/Eltwise/Concat/Conv/Pool/Relu/Prelu/Innerproduct/Global Pooling.
• the number of nodes from the 0th to the 4th generation can be randomly determined to be 1, 3, 2, 2, and 1 respectively.
• the number of nodes in each generation can be greater than or equal to the number of node types in that generation (that is, the number of node types in each generation is less than or equal to the number of nodes in the generation).
• the number of generations of the target neural network determined in step S61 is 4 (including the 0th to 3rd generations), and the numbers of nodes included in the 0th to 3rd generations are specifically as follows:
  • the number of nodes of the 0th generation node is 1;
  • the number of nodes of the first generation node is 3;
  • the number of nodes of the second generation node is 2;
  • the number of nodes of the 3rd generation node is 1.
  • the node types included in the 0th to 3rd generations are as follows:
  • the node type of the 0th generation node is Input
• the node types of the first generation nodes include FC, Eltwise and Global Pooling;
  • the node types of the second generation nodes are Concat and FC;
  • the node type of the 3rd generation node is Eltwise.
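The example determined in step S61 above can be written out as a plain data structure, one hypothetical way to carry the generation count, node counts, and node types forward into step S62 (the dict layout is illustrative, not from the patent):

```python
# The step-S61 example as data: 4 generations, node counts 1/3/2/1, and the
# node types listed for each generation.

example_spec = {
    "generations": 4,                       # generations 0 through 3
    "nodes_per_generation": [1, 3, 2, 1],
    "node_types": [
        ["Input"],
        ["FC", "Eltwise", "GlobalPooling"],
        ["Concat", "FC"],
        ["Eltwise"],
    ],
}

# sanity check: each generation has at least as many nodes as distinct node types
for count, types in zip(example_spec["nodes_per_generation"],
                        example_spec["node_types"]):
    assert count >= len(set(types))
print(example_spec["generations"])  # 4
```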
  • step S62 a target connection mode for connecting all nodes in the target neural network is determined according to preset node connection requirements.
  • the above-mentioned node connection requirements may be rules that can meet the normal use requirements of the neural network.
  • the node connection requirements may be preset. Specifically, the node connection requirements may be set through experience and the requirements of the neural network to be generated.
• there can be multiple connection relationships between the nodes of the target neural network that satisfy the node connection requirements; after multiple connection relationships are obtained, one of them can be selected (arbitrarily) as the final connection relationship.
  • the aforementioned node connection requirements may include at least one of the following conditions:
  • the node type of the first generation node is Input type
  • Table 1 shows the node types of the parent nodes that can be connected when the current node is of different node types, where Y indicates that it can be connected, and N indicates that it cannot be connected.
• before step S63 is performed, the validity of the multiple node connection relationships can be judged and a valid node connection relationship selected from them.
  • step S63 is executed according to these valid node connection relationships.
• nodes after an FC type or Global Pooling type node (including the node immediately following the current node and nodes located after the current node in subsequent generations) cannot be of types other than the FC, Global Pooling, and act types.
  • the node type of the node immediately following the FC type and Global Pooling type node, and the node after the FC type and Global Pooling type node in the subsequent generation can only be FC, Global Pooling or act type.
• the node type of node 6 is Eltwise, and the numbers of input channels at both ends of node 6 are both 1, so node 6 meets the above condition (4); however, for node 11 of the same Eltwise type, the number of input channels on its left is 2 and the number on its right is 1. The number of input channels on the left of node 11 is inconsistent with the number on the right, which does not meet the above condition (4).
  • connection relationship shown in FIG. 8 does not meet the aforementioned condition (4).
• should the connection relationships determined in step S62 include the invalid connection relationship shown in FIG. 8, that connection relationship needs to be excluded.
  • the node type of node 1 is FC
• the node type of node 2 is Relu; since the node type of node 1 is FC, the nodes connected after node 1 can only be of the FC, Global Pooling, or act types.
  • the node type of node 3 is Global Pooling
  • the node type of node 4 is Prelu
• node 3 can only be connected to nodes of the FC, Global Pooling, and act types
• accordingly, the connection relationship between node 3 and node 4 does not satisfy the above condition (5).
  • connection relationship shown in FIG. 9 does not meet the aforementioned condition (5).
  • the connection relationship determined in step S62 include the invalid connection relationship shown in FIG. 9, the connection relationship should be excluded.
• step S62 when determining the parent node of a node, there may be multiple candidate nodes; as long as the above conditions (1) to (5) are met, a node can be used as a candidate parent node of the current node (also called a candidate node of the parent node). Which nodes are selected from the candidate parent nodes as the actual parent nodes of the current node can be determined according to a probability density function.
• the above-mentioned probability density function may be a Gaussian function, since the Gaussian function as a whole meets the basic requirement that closer generations are selected with higher probability; specifically, the expected value of the Gaussian function may be kept consistent with the current generation index value minus 1, and the expected value of the Gaussian function does not affect the control of the network form.
• by adjusting the variance of the Gaussian function, the shape of the Gaussian function can be controlled, thereby controlling the probability that the nodes in each generation are selected.
• the greater the variance of the Gaussian function, the greater the probability that the nodes in the neighboring generations will be selected, the deeper the depth, and the narrower the width of the network.
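The Gaussian weighting just described can be sketched as follows. The exact form and normalization of the patent's probability density function are not given, so this is an assumed reconstruction: the mean is fixed at the current generation index minus 1, and the variance is the knob that shifts weight between near and far generations.

```python
# Assumed sketch of the Gaussian parent-selection weighting: for a node in
# generation `current_gen`, each earlier generation g gets a weight peaking
# at g = current_gen - 1.
import math

def generation_weights(current_gen, sigma):
    """Unnormalized Gaussian weight for each earlier generation 0..current_gen-1."""
    mean = current_gen - 1                      # the closest generation is favored
    return [math.exp(-((g - mean) ** 2) / (2 * sigma ** 2))
            for g in range(current_gen)]

narrow = generation_weights(5, 0.5)   # small variance: almost only generation 4
wide = generation_weights(5, 3.0)     # large variance: earlier generations too
print(round(narrow[0], 6), round(narrow[4], 6))
print(round(wide[0], 3), round(wide[4], 3))
```

With a small variance nearly all weight sits on the immediately preceding generation, while a large variance spreads weight across earlier generations as well, which is how the variance shapes the selection probabilities per generation.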
  • Fig. 10 and Fig. 11 are schematic diagrams of possible node connection relationships of the neural network, respectively.
• when the number of generations of the target neural network determined in step S61, as well as the node types and numbers of nodes in each generation, are as shown in FIG. 7, step S62 can be performed on this basis to obtain the node connection relationships shown in FIG. 10 and FIG. 11.
• the node connection relationships shown in FIG. 10 and FIG. 11 are analyzed. Through analysis, it is found that both satisfy condition (4), but in FIG. 10 the connection between node 3 and node 6 does not meet the above condition (5), while FIG. 11 satisfies condition (5) in addition to condition (4). Therefore, it can be determined that the node connection relationship shown in FIG. 11 is an effective node connection relationship, that is, the target connection mode of the target neural network.
  • step S63 a network description file is generated according to the generation number of the target neural network, the node type of each generation of nodes, the number of nodes, and the target connection mode.
• a network description file that can be loaded and parsed can be generated according to the determined number of generations of the target neural network, the node type of each generation of nodes, the number of nodes, and the target connection mode; the network description file may be, for example, a text file, and the file type may be, for example, a dscp file.
  • FIG. 12 is a schematic diagram of the process of generating a network description file in the embodiment shown in FIG. 6.
  • the process shown in FIG. 12 can be executed by an electronic device (for the definition and explanation of the electronic device, please refer to the relevant content in the method shown in FIG. 2).
• the process shown in FIG. 12 includes steps S1201 to S1211, which are described in detail below.
  • Step S1201 start.
  • Step S1201 represents starting to generate the network description file.
• step S1202 the number of generations of the neural network is randomly generated.
• a value can be randomly selected within a certain value range as the number of generations of the neural network.
  • step S1203 the number of nodes of each generation and the node type of nodes of each generation are randomly generated.
  • the number of nodes of each generation can be randomly generated within a certain network width range. For example, the width of the neural network cannot exceed 10, so you can choose a value between 1 and 10 as the number of nodes in each generation.
  • the node type of each node can be randomly generated from all available node types.
  • Step S1202 and step S1203 here are equivalent to step S61 above.
  • the relevant definitions and explanations of step S61 above are also applicable to step S1202 and step S1203.
  • step S1202 and step S1203 are not described in detail here.
  • Step S1204 Determine the header information of each node and the number of parent nodes.
  • Determining the header information (node_header) of each node is to generate the total sequence number (sequence) of each node, the generation index number (gen_id) and the node index number (node_id) in the generation.
• the generation index number (gen_id) of each generation and the total sequence number (sequence) of each node can be generated in top-to-bottom order; within each generation, the node index number (node_id) of each node is generated in a certain order.
  • sequence represents the sequence number of the node in the entire neural network.
  • Step S1205 Calculate candidate nodes of the parent node of each node.
  • step S1205 the candidate node of the parent node of the current node is to be calculated, so that the parent node can be subsequently selected from the candidate node.
• when determining the candidate parent nodes for each node, they can be selected according to certain node connection requirements (the node connection requirements can be one or more of the above conditions (1) to (3)); that is, the nodes in previous generations that meet the node connection requirements are regarded as the candidate parent nodes of the current node. For example, as shown in FIG. 7, node 5 and node 6 in the second generation can be selected as candidate parent nodes of node 7 in the third generation.
• step S1205 once the candidate parent nodes of a node are determined, the probability density function can be used to calculate the probability that each candidate is the parent node of the current node, and the nodes with a probability greater than a certain value are taken as the candidate parent nodes of the current node.
  • the number of the above-mentioned candidate parent nodes may be multiple, and the number of parent nodes selected from the candidate parent nodes may be one or multiple.
  • the parent node selected from the candidate parent nodes is the actual parent node of the current node.
• suppose a node has 6 candidate parent nodes, and the probabilities of these 6 candidates being parent nodes of the current node, as calculated by the probability density function, are 70%, 60%, 65%, 45%, 40%, and 30%. Then the candidate parent nodes corresponding to the probabilities of 70%, 60%, and 65% can be determined as the actual parent nodes of the current node (one or more candidate parent nodes can be selected as the actual parent nodes of the current node).
  • only the candidate parent node with the highest corresponding probability may be used as the actual parent node of the current node (that is, the candidate parent node corresponding to a probability of 70% is used as the actual parent node of the current node).
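The two selection rules just described can be sketched together. The 50% threshold below is an assumption chosen to reproduce the 70%/60%/65% example; the function name and list layout are likewise illustrative.

```python
# Sketch of selecting actual parent nodes from candidates: either keep every
# candidate whose probability exceeds a threshold, or keep only the single
# most probable candidate.

def select_parents(candidates, probabilities, threshold=0.5, top_only=False):
    if top_only:
        best = max(range(len(candidates)), key=lambda i: probabilities[i])
        return [candidates[best]]
    return [c for c, p in zip(candidates, probabilities) if p > threshold]

cands = ["n1", "n2", "n3", "n4", "n5", "n6"]
probs = [0.70, 0.60, 0.65, 0.45, 0.40, 0.30]
print(select_parents(cands, probs))                 # ['n1', 'n2', 'n3']
print(select_parents(cands, probs, top_only=True))  # ['n1']
```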
  • the aforementioned probability density function may specifically be a Gaussian function.
  • Step S1206 Randomly select the actual parent node of the current node to record the connection relationship.
• after the actual parent nodes of the current node are selected from the candidate parent nodes in step S1205, if there are multiple actual parent nodes, the parent node to record can be selected arbitrarily or randomly from among them.
  • step S1207 it is determined whether the current connection is valid.
• each connection relationship can be judged according to the above condition (4) and condition (5): a connection relationship that meets both conditions (4) and (5) is a valid connection relationship, and a connection relationship that does not satisfy either condition (4) or condition (5) is an invalid connection relationship.
• when the connection is determined to be valid, step S1208 is executed; when the connection is determined to be invalid, the process returns to step S1205.
  • Step S1208 Generate a network description file according to the generation number of the target neural network, the node type of each generation of nodes, the number of nodes, and the target connection mode.
  • the network description file may be, for example, a text file, and the file type is, for example, a dscp file.
  • Step S1209 represents the end of the process of generating the network description file.
  • Fig. 13 is a flowchart of another method for generating a network description file in an embodiment of the present disclosure.
  • a method 1300 for generating a network description file indicated in FIG. 2 to FIG. 5 may include:
  • Step S131 Determine the node types and node connection relationships of the nodes of all generations of the target neural network according to the structure type of the accelerator.
  • Step S132 Generate a network description file according to the node type and node connection relationship.
• unlike the embodiments shown in FIG. 6 to FIG. 12, where the network description file records a randomly generated network, this step can customize the structure of the target neural network according to the structure of the accelerator under test, so that the accelerator can be tested in a more targeted manner.
  • FIG. 14 is a flowchart of determining the structure of the target neural network according to the structure type of the accelerator in an embodiment of the present disclosure.
  • step S131 may include:
  • Step S1311 Determine the number of sub-networks in the target neural network.
  • Step S1312 Determine the node type of the node in each sub-network according to the type of the arithmetic unit of the accelerator.
  • Step S1313 Determine the node connection relationship within each sub-network, and determine the node connection relationship between the sub-networks according to the node connection relationship within each sub-network.
  • Step S1314 Determine the node type and node connection relationship of the nodes of all generations of the target neural network according to the node type of the node within each sub-network, the node connection relationship within each sub-network, and the node connection relationship between the sub-networks.
• each sub-step of step S131 will be described in detail below.
  • step S1311 the number of sub-networks in the target neural network is determined.
  • the number of sub-networks in the target neural network can be randomly determined.
  • the number of sub-networks of the target neural network can also be determined within a certain numerical range (for example, the depth range of the neural network).
  • the number of sub-networks can be randomly determined to be 12 within the range of [10, 20].
  • the number of sub-networks of the target neural network can also be set according to specific (operational) requirements.
  • step S1312 the node type of each node in the sub-network is determined according to the type of the arithmetic unit of the accelerator.
  • Convolutional neural network accelerators generally include arithmetic units such as CONV/POOLING/ELTWIS/ACTIVE, which can perform calculations in parallel.
• step S1312 tries to divide nodes of different types into one sub-network; that is, different node types are set up within a sub-network so that multiple arithmetic units can operate in parallel within the sub-network, improving the detection efficiency of the accelerator.
  • At least one of the computing unit types of the accelerator can be selected as the node type of a node of the sub-network, and the node type of each node in a sub-network is different. Repeat this process to determine the node type of the nodes inside each sub-network.
  • Figure 15 is a schematic diagram of node type selection within the sub-network.
• the sub-network 151 includes randomly selected nodes of the three types POOLING, ELTWIS, and CONV; the sub-network 152 includes randomly selected nodes of the two types CONV and POOLING; the sub-network 153 includes randomly selected nodes of the CONV and ACTIVE types.
  • a node should be set in a sub-network as the input node of the target neural network, and the node type should be set to INPUT.
• the INPUT type node can belong to the same sub-network as other types of nodes, or a separate sub-network can be set up for it.
  • the node type of the node in each sub-network can also be determined by sequential permutation and combination, etc., which is not particularly limited in the present disclosure.
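The random per-sub-network type selection of step S1312 can be sketched as below. The seeded RNG and the function name are illustrative choices for reproducibility; the key property shown is that the types inside any one sub-network are pairwise distinct, so the corresponding arithmetic units can run in parallel.

```python
# Sketch of step S1312: for each sub-network, sample a subset of the
# accelerator's arithmetic-unit types without repetition.
import random

UNIT_TYPES = ["CONV", "POOLING", "ELTWIS", "ACTIVE"]

def pick_subnetwork_types(num_subnets, rng):
    subnets = []
    for _ in range(num_subnets):
        k = rng.randint(1, len(UNIT_TYPES))        # how many distinct types to use
        subnets.append(rng.sample(UNIT_TYPES, k))  # sample without repetition
    return subnets

rng = random.Random(0)   # seeded only for a reproducible illustration
subnets = pick_subnetwork_types(3, rng)
for types in subnets:
    assert len(types) == len(set(types))  # no duplicate type inside a sub-network
print(all(t in UNIT_TYPES for types in subnets for t in types))  # True
```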
  • step S1313 the node connection relationship within each of the sub-networks is determined, and the node connection relationship between the sub-networks is determined according to the node connection relationship within each of the sub-networks.
  • Fig. 16 is a flowchart for determining the connection relationship between nodes.
  • the process of determining node connection relationships within the sub-networks and node connection relationships between sub-networks in step S1313 may include:
  • step S13131 a node is selected as an input node of a sub-network.
  • Step S13132 Determine the parent node of each node in the sub-network except the input node by random or traversal.
  • Step S13133 Determine the output node of each sub-network.
  • Step S13134 According to the input nodes and output nodes of the sub-networks, the node connection relationship between the sub-networks is determined in a random or traversal manner.
  • a node in each sub-network can be randomly selected as the input node of the sub-network.
• unlike the input node of the target neural network, whose node type is INPUT, the input node of each sub-network can be of multiple types, such as the aforementioned CONV, POOLING, ELTWIS, and ACTIVE node types.
• a target neural network can be configured with only one INPUT type node. If the INPUT type node belongs to a sub-network together with other nodes, the input node of that sub-network defaults to the INPUT type node; if the INPUT type node belongs to a sub-network alone, the input node and output node of that sub-network are both the INPUT type node.
  • In step S13132, the parent node of each node in the sub-network other than the input node is determined randomly or by traversal.
  • the parent nodes can be determined one node at a time; except for the input node, each node's parent node belongs to the same sub-network.
  • FIG. 17 is a schematic diagram of the process of determining the connection relationship of the sub-network shown in FIG. 15.
  • the node 1 in the sub-network 151 may be determined as the input node first, and then the parent node of the node 2 may be determined.
  • the parent node of node 2 can be determined randomly or by traversal. For example, the parent node of node 2 can be set to node 1, or it can be set to node 3. If the parent node of node 2 is set to node 1, the parent node of node 3 can randomly be set to node 2 or node 1; if the parent node of node 2 is set to node 3, the parent node of node 3 must be set to node 1 so that it connects to the input node of the sub-network.
  • the input node of the sub-network 152 is node 4, so the parent node of node 5 is naturally set to node 4. Likewise, the input node of the sub-network 153 is node 6, so the parent node of node 7 is naturally set to node 6.
  • step S13133 the output node of each of the sub-networks is determined.
  • if a sub-network contains a single non-input node, that node is automatically set as the output node; if a sub-network has multiple nodes, the output node of the sub-network can be determined according to the node connection relationships within the sub-network. For example, in the sub-network 151 shown in FIG. 17, node 3 can be set as an output node. When node 2 and node 3 are both connected to input node 1, both node 2 and node 3 can be set as output nodes of the sub-network, or one of them can be selected as the output node of the sub-network.
  • Step S13134 According to the input node and the output node of the sub-network, determine the node connection relationship between the sub-networks in a random or traversal manner.
  • the input nodes and output nodes of each sub-network can be connected randomly or by traversal. Taking FIG. 17 as an example, it can first be determined that the input node of the sub-network 151 is connected to input node 0 of the neural network (that is, the INPUT-type node); then that the input node of the sub-network 152 is connected to any output node of the sub-network 151; and finally that the input node of the sub-network 153 is connected to an output node of the sub-network 152, thereby forming the node connection relationships between the sub-networks.
  • a sub-network may have two or more output nodes, and each output node may also be connected to the input nodes of two or more sub-networks.
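  • The random wiring described in steps S13131 to S13134 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the helper names `wire_subnetwork` and `chain_subnetworks`, and the choice of treating every leaf as an output node, are assumptions made for the example.

```python
import random

def wire_subnetwork(node_ids, rng):
    """Randomly pick an input node, then give every other node a parent
    chosen from the nodes already wired in, so the sub-network stays
    connected (hypothetical helper, not from the original disclosure)."""
    nodes = list(node_ids)
    rng.shuffle(nodes)
    input_node, rest = nodes[0], nodes[1:]
    edges = []            # (parent, child) pairs inside the sub-network
    wired = [input_node]
    for child in rest:
        parent = rng.choice(wired)   # random choice; traversal also works
        edges.append((parent, child))
        wired.append(child)
    # any node may serve as an output; here every leaf is an output node
    parent_set = {p for p, _ in edges}
    outputs = [n for n in nodes if n not in parent_set] or [input_node]
    return input_node, outputs, edges

def chain_subnetworks(subnets, rng):
    """Connect each sub-network's input node to a randomly chosen output
    node of the previous sub-network."""
    links = []
    for prev, cur in zip(subnets, subnets[1:]):
        _, prev_outputs, _ = prev
        cur_input, _, _ = cur
        links.append((rng.choice(prev_outputs), cur_input))
    return links
```

A traversal-based variant would simply enumerate all admissible parents instead of sampling one.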
  • In step S1314, the node types and connection relationships of the nodes of all generations of the target neural network are determined according to the node types of the nodes in each sub-network, the node connection relationships within each sub-network, and the node connection relationships between the sub-networks.
  • through step S1312 and steps S13131 to S13134, the node type and connection mode of every node in the entire target neural network can be determined.
  • the preset specification can be set by the customizing personnel of the target neural network.
  • the node types and node connection relationships of all generations of the target neural network can be recorded for subsequent generation of the network description file.
  • the parent node can be set to transmit the result of each step to its child node in real time, so that the child node can compute while the parent node performs its next operation, instead of passing the result to the child node only after the parent node's operation is complete. In this way, the arithmetic units can run in parallel when the accelerator operates on the internal nodes of a sub-network, improving the accelerator's operating efficiency.
  • this application can also protect a neural network generation method.
  • the neural network generation method specifically includes: obtaining and parsing at least one network description file, each network description file including the node types and node connection relationships of the nodes of all generations of one target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset inter-node parameter constraints; and generating the target neural network according to its structure and node parameters.
  • the corresponding target neural network is generated by parsing the network description file, and the structure of the target neural network can be customized, modified, and reused through the network description file to generate a target neural network with a clear structure multiple times.
  • the target neural network generated above can be used to process data. Therefore, this application can also protect a data processing method, including: obtaining and parsing at least one network description file, each network description file including the node types and node connection relationships of the nodes of all generations of one target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset inter-node parameter constraints; generating the target neural network according to its structure and node parameters; and using the target neural network for data processing.
  • by parsing the network description file to generate the corresponding target neural network, the structure of the target neural network can be customized, modified, and reused through the network description file, so that target neural networks with well-defined structures can be generated repeatedly and specific neural networks can be used in a more targeted way to process the corresponding data.
  • using the target neural network to perform data processing includes: obtaining input data; using the target neural network to perform data processing on the input data to obtain output data.
  • the aforementioned input data may be data that needs to be processed by a neural network, and further, the input data may be data that needs to be processed by a neural network in the field of artificial intelligence.
  • the above-mentioned input data may be image data to be processed, and the above-mentioned output data may be a classification result or a recognition result of the image.
  • the input data may also be voice data to be recognized, and the output result may be a voice recognition result.
  • FIG. 18 is a schematic block diagram of a verification platform of an accelerator according to an embodiment of the present application.
  • the verification platform 1800 of the accelerator shown in FIG. 18 includes:
  • the memory 1801 is used to store codes
  • At least one processor 1802 is configured to execute codes stored in the memory to perform the following operations:
  • generating at least one target neural network according to a network description file, where the network description file records the network structure of the target neural network; translating the at least one target neural network into neural network instructions; inputting the neural network instructions into an accelerator and into a software model matching the accelerator for execution, and determining the differences in the output results of the neural network instructions;
  • determining, according to the differences in the output results of the neural network instructions, the instructions that behave abnormally during the operation of the accelerator.
  • It should be understood that, for convenience of presentation, only one processor 1802 is shown in FIG. 18. In fact, the verification platform 1800 shown in FIG. 18 may include one or more processors 1802.
  • FIG. 19 is a schematic block diagram of a device for generating a neural network according to an embodiment of the present application. It should be understood that the device 1900 shown in FIG. 19 can execute each step of the method for generating a neural network in the embodiment shown in FIG. 2 of the present application, and the device 1900 shown in FIG. 19 includes:
  • the memory 1901 is used to store codes
  • At least one processor 1902 is configured to execute codes stored in the memory to perform the following operations:
  • obtaining and parsing at least one network description file, each network description file including the node types and node connection relationships of the nodes of all generations of one target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset inter-node parameter constraints; and generating the target neural network according to its structure and node parameters.
  • It should be understood that, for convenience of presentation, only one processor 1902 is shown in FIG. 19. In fact, the apparatus 1900 shown in FIG. 19 may include one or more processors 1902.
  • FIG. 20 is a schematic block diagram of a data processing device according to an embodiment of the present application. It should be understood that the device 2000 shown in FIG. 20 can execute each step of the data processing method of the embodiment shown in FIG. 2 of the present application, and the device 2000 shown in FIG. 20 includes:
  • the memory 2001 is used to store codes
  • At least one processor 2002 is configured to execute codes stored in the memory to perform the following operations:
  • obtaining and parsing at least one network description file, each network description file including the node types and node connection relationships of the nodes of all generations of one target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset inter-node parameter constraints; generating the target neural network according to its structure and node parameters; and
  • using the target neural network for data processing.
  • It should be understood that, for convenience of presentation, only one processor 2002 is shown in FIG. 20. In fact, the apparatus 2000 shown in FIG. 20 may include one or more processors 2002.
  • FIG. 21 is a schematic block diagram of an apparatus for generating a network description file according to an embodiment of the present application. It should be understood that the device 2100 shown in FIG. 21 can execute each step of the data processing method of the embodiment shown in FIG. 6 of this application, and the device 2100 shown in FIG. 21 includes:
  • the memory 2101 is used to store codes
  • At least one processor 2102 is configured to execute codes stored in the memory to perform the following operations:
  • determining the number of generations of the target neural network and the node types and node counts of the nodes of all generations; determining, according to preset node connection requirements, a target connection mode connecting all nodes of the target neural network; and generating the network description file according to the number of generations of the target neural network, the node types and node counts of each generation, and the target connection mode.
  • It should be understood that, for convenience of presentation, only one processor 2102 is shown in FIG. 21. In fact, the apparatus 2100 shown in FIG. 21 may include one or more processors 2102.
  • FIG. 22 is a schematic block diagram of an apparatus for generating a network description file according to an embodiment of the present application. It should be understood that the device 2200 shown in FIG. 22 can execute each step of the data processing method of the embodiment shown in FIG. 13 of this application, and the device 2200 shown in FIG. 22 includes:
  • the memory 2201 is used to store codes
  • At least one processor 2202 is configured to execute codes stored in the memory to perform the following operations:
  • determining the node types and node connection relationships of the nodes of all generations of the target neural network according to the structure type of the accelerator; and generating the network description file according to the node types and node connection relationships.
  • It should be understood that, for convenience of presentation, only one processor 2202 is shown in FIG. 22. In fact, the apparatus 2200 shown in FIG. 22 may include one or more processors 2202.
  • the verification platform 1800 of the accelerator and the devices 1900 to 2200 mentioned above may specifically be electronic equipment or servers, where the electronic equipment may be a mobile terminal (for example, a smartphone), a computer, a personal digital assistant, a wearable device, a vehicle-mounted device, an Internet of Things device, or another device that includes a processor.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)), etc.
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the accelerator detection method, neural network generation method, data processing method, network description file generation method, and related devices provided by the embodiments of the present disclosure can customize the structure of the neural network used to detect the accelerator and improve the detection efficiency of the accelerator.


Abstract

A detection method and verification platform for an accelerator. The detection method includes: generating at least one target neural network according to a network description file (S1); translating the at least one target neural network into neural network instructions (S2); inputting the neural network instructions into an accelerator and into a software model matching the accelerator for execution, and determining the differences in the output results of the neural network instructions (S3); and determining, according to the differences in the output results of the neural network instructions, the instructions that behave abnormally during the operation of the accelerator (S4). By generating at least one target neural network from a network description file, the method can produce multiple target neural networks with well-defined structures and thus perform effective, targeted performance testing of the accelerator.

Description

Accelerator detection method and verification platform
Copyright notice
The content disclosed in this patent document contains material subject to copyright protection. The copyright belongs to the copyright owner. The copyright owner does not object to anyone reproducing this patent document or this patent disclosure as it appears in the official records and files of the Patent and Trademark Office.
Technical field
This application relates to the technical field of neural networks and, more specifically, to an accelerator detection method and verification platform.
Background
After a neural network is generated, it generally needs to be loaded onto an accelerator to run in order to use it for data processing. The performance of the accelerator may directly affect the subsequent effect of data processing with the neural network; therefore, how to better test the performance of an accelerator is a problem that needs to be solved.
Summary of the invention
This application provides an accelerator detection method, a neural network generation method, a data processing method, a network description file generation method, and related devices, for better accelerator testing.
In a first aspect, an accelerator detection method is provided, the method including: generating at least one target neural network according to a network description file, where the network description file records the network structure of the target neural network; translating the at least one target neural network into neural network instructions; inputting the neural network instructions into an accelerator and into a software model matching the accelerator for execution, and determining the differences in the output results of the neural network instructions; and determining, according to the differences in the output results of the neural network instructions, the instructions that behave abnormally during the operation of the accelerator.
In this application, by generating at least one target neural network from a network description file, multiple target neural networks with well-defined structures can be generated, enabling effective, targeted performance testing of the accelerator.
Optionally, generating at least one target neural network includes: generating multiple target neural networks.
When multiple target neural networks are generated, different neural networks can be used to test the performance of the accelerator, thereby achieving better performance testing of the accelerator.
Optionally, the above target neural network is a convolutional neural network.
It should be understood that the target neural network generated in this application may be a convolutional neural network or another type of neural network, for example a feedforward neural network, a recurrent neural network, and so on.
In some implementations of the first aspect, generating at least one target neural network includes: obtaining and parsing at least one network description file, each network description file including the node types and node connection relationships of the nodes of all generations of one target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset inter-node parameter constraints; and generating the target neural network according to its structure and node parameters.
In this application, by parsing a network description file to generate the corresponding target neural network, the structure of the target neural network can be customized, modified, and reused through the network description file, so that target neural networks with well-defined structures can be generated repeatedly.
In some implementations of the first aspect, the network description file is a text file; the first line of the network description file contains a comment describing the available node types, and the n-th line of the network description file (n being an integer greater than 1) describes the node information of all nodes of generation n-1 of the target neural network, the node information including node type and node connection relationship.
In some implementations of the first aspect, generating the structure of the target neural network according to the node types and node connection relationships includes: instantiating each node according to its node type; determining the parent nodes of each node according to the node connection relationships; and connecting each node with its corresponding parent nodes to generate the structure of the target neural network.
In some implementations of the first aspect, the generation of the network description file includes: determining the number of generations of the target neural network and the node types and node counts of the nodes of all generations; determining, according to preset node connection requirements, a target connection mode connecting all nodes of the target neural network; and generating the network description file according to the number of generations of the target neural network, the node types and node counts of each generation, and the target connection mode.
In this application, by first determining the number of generations, node counts, and node types of the target neural network to be generated, and then generating the target connection mode in combination with preset node connection requirements, the network description file on which neural network generation is based can ultimately be produced, allowing many types of neural networks to be generated more flexibly and conveniently. Further, once many types of neural networks have been generated, the performance of the accelerator can be tested more thoroughly.
In some implementations of the first aspect, determining, according to preset node connection requirements, the target connection mode connecting all nodes of the target neural network includes: determining the candidate parent nodes of the current node according to the node connection requirements, where the current node and the candidate parent nodes satisfy the node connection requirements; selecting the actual parent nodes of the current node from the candidate parent nodes; and determining the connection relationship between the current node and its actual parent nodes, so as to finally generate the target connection mode.
The above candidate parent nodes may also be called candidate nodes for the parent node.
[Corrected under Rule 91, 26.05.2020]
In some implementations of the first aspect, determining the candidate parent nodes of the current node according to the node connection requirements includes determining them according to at least one of the following relationships: when the node type of the current node is Concat or Eltwise, the current node has multiple parent nodes, and the number of the current node's parent nodes is less than or equal to the number of its candidate parent nodes; when the node type of the current node's parent is Active (activation layer), the node type of the current node is a type other than Active; when the node type of the current node's parent is Global Pooling (global pooling layer), the node type of the current node is Global Pooling; when the node type of the current node's parent is FC (fully connected layer), the node type of the current node is FC or Concat (concatenation layer); when the node type of the current node's parent is Conv (convolution layer), Eltwise (element-wise operation layer), Pooling (pooling layer), or Concat, the node type of the current node may be any one of Conv, Eltwise, Pooling, Active, Global Pooling, Concat, and FC.
In some implementations of the first aspect, selecting the actual parent nodes of the current node from the candidate parent nodes includes: determining, according to a probability density function, the probability of each candidate parent node being an actual parent node of the current node; and determining the actual parent nodes of the current node from the candidate parent nodes according to those probabilities.
In some implementations of the first aspect, determining the actual parent nodes of the current node according to those probabilities includes: determining, as actual parent nodes of the current node, the candidate parent nodes whose probability of being an actual parent node is greater than a preset probability value.
In some implementations of the first aspect, the above method further includes: adjusting, according to the expectation and variance of the probability density function, the probability of each candidate parent node being an actual parent node of the current node.
By adjusting the expectation and variance of the probability density function, the width and depth of the target neural network can be adjusted, so that a target neural network whose depth and width meet the requirements can be generated.
Specifically, the expectation and variance of the probability density function can be adjusted according to the depth and width requirements of the target neural network to be generated.
Generally speaking, the larger the variance of the probability density function, the higher the probability that nodes in neighboring generations are selected, the narrower the network's width becomes, and the deeper its depth becomes.
In some implementations of the first aspect, the above probability density function is a Gaussian function.
In some implementations of the first aspect, generating the target neural network according to the target connection mode includes: determining valid target connection relationships from the target connection relationships according to preset valid node connection relationships; and generating the target neural network according to the valid target connection relationships.
In some implementations of the first aspect, the valid node connection relationships include at least one of the following: when the node type of the current node is Eltwise, the channel counts of the current node's multiple inputs are consistent; when the node type of the current node is FC or Global Pooling, only nodes of the FC, Global Pooling, and act types can be connected after the current node.
In some implementations of the first aspect, determining the number of generations of the target neural network to be generated and the node types and node counts of the nodes of all generations includes: determining the number of generations and the node types and node counts of all generations according to the computational requirements placed on the target neural network.
The above computational requirement on the target neural network may be a requirement on the amount (size) of computation: when the required amount of computation is small, fewer generations and fewer nodes per generation can be set for the target neural network; when the required amount of computation is large, more generations and more nodes per generation can be set.
The above computational requirement on the target neural network may also be the complexity of the computation: when the computational complexity is low, fewer generations and fewer nodes per generation can be set; when the complexity is high, more generations and more nodes per generation can be set.
In some implementations of the first aspect, the generation of the network description file includes: determining the node types and node connection relationships of the nodes of all generations of the target neural network according to the structure type of the accelerator; and generating the network description file according to the node types and node connection relationships.
In this application, by determining the structure of the target neural network according to the structure type of the accelerator, targeted testing can be better performed on the accelerator under test, improving the efficiency and effectiveness of accelerator testing.
In some implementations of the first aspect, determining the node types and node connection relationships of the nodes of all generations of the target neural network according to the structure type of the accelerator includes: determining the number of sub-networks in the target neural network; determining the node types of the nodes inside each sub-network according to the kinds of arithmetic units of the accelerator; determining the node connection relationships inside each sub-network, and determining the node connection relationships between the sub-networks according to the node connection relationships inside each sub-network; and determining the node types and node connection relationships of the nodes of all generations of the target neural network according to the node types inside each sub-network, the node connection relationships inside each sub-network, and the node connection relationships between the sub-networks.
In some implementations of the first aspect, determining the node types of the nodes inside each sub-network according to the kinds of arithmetic units of the accelerator includes: selecting at least one of the accelerator's kinds of arithmetic units as the node types of the nodes of one sub-network, where the node types of the nodes within one sub-network differ from one another; and repeating the previous step until the node types of the nodes inside every sub-network are determined.
In some implementations of the first aspect, determining the node connection relationships inside each sub-network and determining the node connection relationships between the sub-networks according to the node connection relationships inside each sub-network includes: selecting one node as the input node of a sub-network; determining, randomly or by traversal, the parent node of every node in the sub-network other than the input node; determining the output nodes of each sub-network; and determining, randomly or by traversal, the node connection relationships between the sub-networks according to their input nodes and output nodes.
In a second aspect, a neural network generation method is provided, the method including: obtaining and parsing at least one network description file, each network description file including the node types and node connection relationships of the nodes of all generations of one target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset inter-node parameter constraints; and generating the target neural network according to its structure and node parameters.
In this application, by parsing the network description file to generate the corresponding target neural network, the structure of the target neural network can be customized, modified, and reused through the network description file, so that target neural networks with well-defined structures can be generated repeatedly.
In a third aspect, a data processing method is provided, the method including: obtaining and parsing at least one network description file, each network description file including the node types and node connection relationships of the nodes of all generations of one target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset inter-node parameter constraints; generating the target neural network according to its structure and node parameters; and using the target neural network for data processing.
In this application, by parsing the network description file to generate the corresponding target neural network, the structure of the target neural network can be customized, modified, and reused through the network description file, so that target neural networks with well-defined structures can be generated repeatedly and specific neural networks can be used in a more targeted way to process the corresponding data.
In a fourth aspect, a method for generating a network description file is provided, the method including: determining the number of generations of the target neural network and the node types and node counts of the nodes of all generations; determining, according to preset node connection requirements, a target connection mode connecting all nodes of the target neural network; and generating the network description file according to the number of generations of the target neural network, the node types and node counts of each generation, and the target connection mode.
In this application, by first determining the number of generations, node counts, and node types of the target neural network to be generated, and then generating the target connection mode in combination with preset node connection requirements, the network description file on which neural network generation is based can ultimately be produced, allowing many types of neural networks to be generated more flexibly and conveniently. Further, once many types of neural networks have been generated, the performance of the accelerator can be tested more thoroughly.
In a fifth aspect, a method for generating a network description file is provided, the method including: determining the node types and node connection relationships of the nodes of all generations of the target neural network according to the structure type of the accelerator; and generating the network description file according to the node types and node connection relationships.
In this application, by determining the structure of the target neural network according to the structure type of the accelerator, targeted testing can be better performed on the accelerator under test, improving the efficiency and effectiveness of accelerator testing.
It should be understood that the specific ways of generating the target neural network in the second to fifth aspects of this application, and the definitions and explanations of related information, can be found in the relevant content of the first aspect above.
In a sixth aspect, an accelerator verification platform is provided, the verification platform including: a memory for storing code; and at least one processor configured to execute the code stored in the memory to perform the following operations: generating at least one target neural network according to a network description file, where the network description file records the network structure of the target neural network; translating the at least one target neural network into neural network instructions; inputting the neural network instructions into an accelerator and into a software model matching the accelerator for execution, and determining the differences in the output results of the neural network instructions; and determining, according to the differences in the output results of the neural network instructions, the instructions that behave abnormally during the operation of the accelerator.
In a seventh aspect, a neural network generation device is provided, including: a memory for storing code; and at least one processor configured to execute the code stored in the memory to perform the following operations: obtaining and parsing at least one network description file, each network description file including the node types and node connection relationships of the nodes of all generations of one target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset inter-node parameter constraints; and generating the target neural network according to its structure and node parameters.
In an eighth aspect, a data processing device is provided, including: a memory for storing code; and at least one processor configured to execute the code stored in the memory to perform the following operations: obtaining and parsing at least one network description file, each network description file including the node types and node connection relationships of the nodes of all generations of one target neural network; generating the structure of the target neural network according to the node types and node connection relationships; generating the node parameters of the target neural network according to preset inter-node parameter constraints; generating the target neural network according to its structure and node parameters; and using the target neural network for data processing.
In a ninth aspect, a device for generating a network description file is provided, including: a memory for storing code; and at least one processor configured to execute the code stored in the memory to perform the following operations: determining the number of generations of the target neural network and the node types and node counts of the nodes of all generations; determining, according to preset node connection requirements, a target connection mode connecting all nodes of the target neural network; and generating the network description file according to the number of generations of the target neural network, the node types and node counts of each generation, and the target connection mode.
In a tenth aspect, a device for generating a network description file is provided, including: a memory for storing code; and at least one processor configured to execute the code stored in the memory to perform the following operations: determining the node types and node connection relationships of the nodes of all generations of the target neural network according to the structure type of the accelerator; and generating the network description file according to the node types and node connection relationships.
Brief description of the drawings
FIG. 1 is a schematic diagram of a neural network structure;
FIG. 2 is a schematic flowchart of the accelerator detection method of an embodiment of this application;
FIG. 3 is a flowchart of the neural network generation process of an embodiment of this application;
FIG. 4 is a schematic diagram of a network structure described by the network description file of an embodiment of this application;
FIG. 5 is a flowchart of the process of generating the target neural network in an embodiment of this disclosure;
FIG. 6 is a flowchart of a method for generating a network description file in an embodiment of this disclosure;
FIG. 7 is a schematic diagram of the determined number of generations of a target neural network and the node counts and node types of each generation;
FIG. 8 is a schematic diagram of a possible node connection relationship of a neural network;
FIG. 9 is a schematic diagram of a possible node connection relationship of a neural network;
FIG. 10 is a schematic diagram of a possible node connection relationship of a neural network;
FIG. 11 is a schematic diagram of a possible node connection relationship of a neural network;
FIG. 12 is a schematic diagram of the network description file generation process of the embodiment shown in FIG. 6;
FIG. 13 is a flowchart of another network description file generation process of an embodiment of this application;
FIG. 14 is a flowchart of determining the target neural network structure according to the structure type of the accelerator in an embodiment of this application;
FIG. 15 is a schematic diagram of selecting the node types inside sub-networks;
FIG. 16 is a flowchart of determining node connection relationships;
FIG. 17 is a schematic diagram of the process of determining the connection relationships of the sub-networks shown in FIG. 15;
FIG. 18 is a schematic block diagram of the accelerator verification platform of an embodiment of this application;
FIG. 19 is a schematic block diagram of the neural network generation device of an embodiment of this application;
FIG. 20 is a schematic block diagram of the data processing device of an embodiment of this application;
FIG. 21 is a schematic block diagram of the network description file generation device of an embodiment of this application;
FIG. 22 is a schematic block diagram of the network description file generation device of an embodiment of this application.
Detailed description
The embodiments of this application are described in detail below with reference to the accompanying drawings.
For a better understanding of the embodiments of this application, the structure of the neural network in the embodiments and related information are first described with reference to FIG. 1.
FIG. 1 is a schematic diagram of a neural network structure.
It should be understood that the neural network in FIG. 1 may be a convolutional neural network or another type of neural network; this application does not limit this.
In FIG. 1, the structure of the neural network mainly includes three parts: nodes, generations, and a tree.
In FIG. 1, the neural network includes node 1 to node 9, which together make up the nodes of generation 0 to generation 4. The nodes contained in each generation are as follows:
Generation 0: node 1;
Generation 1: node 2, node 3, node 4;
Generation 2: node 5, node 6;
Generation 3: node 7, node 8;
Generation 4: node 9.
As shown in FIG. 1, nodes of earlier generations can serve as parent nodes of nodes of later generations, and nodes of later generations can serve as child nodes of nodes of earlier generations. For example, the nodes of generations 1 to 4 can be child nodes of the generation-0 node, and generation-1 nodes can be parent nodes of generation-2 to generation-4 nodes.
As shown in FIG. 1, the nodes of generations 0 to 4 above together constitute the tree of the neural network.
The related information of nodes, generations, and the tree is introduced in detail below.
Each node describes one computational layer (for example, a convolution layer). The information contained in each node, and the meaning of that information, is as follows:
node_header: the node's header information;
The above header information includes sequence, gen_id, and node_id, where sequence is the node's overall serial number, gen_id is the generation index (the index of the generation the node belongs to), and node_id is the node index within the generation;
parent_num: the number of parent nodes (of the node); for Concat/Eltwise-type nodes, parent_num ≥ 2, and for other node types, parent_num = 1;
parents[]: the parent nodes (of the node); the number of parent nodes equals parent_num;
[Corrected under Rule 91, 26.05.2020]
node_t: the node type; for example, the node type here may include Input (input layer) / Eltwise (element-wise operation layer) / Concat (concatenation layer) /
[Corrected under Rule 91, 26.05.2020]
Conv (convolution layer) / Pool (pooling layer) / Relu (activation layer) / Prelu (activation layer) / innerproduct (fully connected layer) / Global Pooling (global pooling layer), etc.;
node_name: the node's name;
top: the node name of the node's top node, where the top node is a child node of this node;
bottom[]: the node names of the node's bottom nodes, where a bottom node is a parent node of this node and the number of bottom nodes is parent_num;
if_n/c/h/w[]: the batch number, channel count, width, and height of each of the node's inputs, where the number of the node's inputs equals parent_num;
of_n/c/h/w: the batch number, channel count, width, and height of the node's output.
A generation organizes at least one node. If a generation contains multiple nodes, the nodes within the same generation cannot connect to one another; a node in the current generation can only connect to nodes in generations whose gen_id is smaller than the current generation's gen_id (that is, cross-generation connections are supported). The information contained in a generation, and its meaning, is as follows:
gen_id: the generation index;
node_num: the number of nodes in the generation; node_num is less than or equal to the maximum width of the neural network;
nodes: the instances of the nodes contained in the generation;
node_tq[]: the types of the nodes contained in the generation.
A tree organizes multiple generations and describes the connection relationships of all nodes in the network. The information contained in the tree, and its meaning, is as follows:
gen_num: the number of generations in the tree; gen_num is less than or equal to the maximum depth of the network;
gens[]: the instances of the generations in the tree; the number of gens[] equals gen_num.
It should be understood that the neural network structure introduced above with reference to FIG. 1 is only one possible structure of the neural network in the embodiments of this application; the neural network of the embodiments may also have other structures, and this application does not limit the specific structure and form of the neural networks involved.
One possible structure of the neural network in the embodiments of this application has been briefly introduced above with reference to FIG. 1; the accelerator detection method of the embodiments of this application is described in detail below with reference to FIG. 2.
FIG. 2 is a schematic flowchart of the accelerator detection method of an embodiment of this application. The method shown in FIG. 2 may be executed by an electronic device or a server, where the electronic device may be a mobile terminal (for example, a smartphone), a computer, a personal digital assistant, a wearable device, a vehicle-mounted device, an Internet of Things device, or another device that includes a processor.
Referring to FIG. 2, the accelerator detection method 200 may include:
Step S1: generate at least one target neural network according to a network description file, where the network description file records the network structure of the target neural network.
Step S2: translate the at least one target neural network into neural network instructions.
Step S3: input the neural network instructions into an accelerator and into a software model matching the accelerator for execution, and determine the differences in the output results of the neural network instructions.
Step S4: determine, according to the differences in the output results of the neural network instructions, the instructions that behave abnormally during the operation of the accelerator.
These steps are described in detail below.
In step S1, at least one target neural network is generated according to the network description file, where the network description file records the network structure of the target neural network.
Optionally, the above at least one target neural network is multiple target neural networks.
The above target neural network may be a convolutional neural network, or another type of neural network besides a convolutional neural network, for example a feedforward neural network, a recurrent neural network, and so on.
The above network description file may, for example, be a text file, with a file type such as *.dscp. The file type of the network description file may also be any other type that can be edited; this disclosure places no special limitation on this.
In step S2, the at least one target neural network is translated into neural network instructions.
It should be understood that step S2 serves to load the at least one target neural network onto the accelerator or into the software model for execution; before loading onto the accelerator or software model, the at least one target neural network generally needs to be translated into instructions that the accelerator or software model can execute.
In step S3, the neural network instructions are input into the accelerator and into the software model matching the accelerator for execution, and the differences in the output results of the neural network instructions are determined.
It should be understood that the above software model matching the accelerator may be a software model used for comparison against the accelerator's performance; the software model can simulate the computational behavior of the accelerator.
Suppose the neural network instructions input to the accelerator yield a first output result, and the same neural network instructions input to the software model yield a second output result. By comparing the first output result with the second output result, the differences in the output results of the neural network instructions can be obtained.
In step S4, the instructions that behave abnormally during the operation of the accelerator are determined according to the differences in the output results of the neural network instructions.
In step S4, when a difference appears in an output result, the accelerator instruction corresponding to that output result can be identified as an instruction that behaved abnormally during the accelerator's operation. Identifying the instructions that behave abnormally during operation helps locate the accelerator's problems and further improve or correct the accelerator's design, thereby improving the accelerator's performance.
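A minimal sketch of the comparison in steps S3 and S4: run the same instruction stream on the accelerator and on the software model, then flag the instructions whose outputs differ. The function name and the scalar outputs are illustrative assumptions; real outputs would be tensors and the comparison may need a tolerance.

```python
def find_abnormal_instructions(accel_outputs, model_outputs, tol=0.0):
    """Compare the per-instruction outputs of the accelerator with those of
    the matching software model and return the indices of instructions
    whose results differ by more than the tolerance."""
    abnormal = []
    for idx, (a, m) in enumerate(zip(accel_outputs, model_outputs)):
        if abs(a - m) > tol:
            abnormal.append(idx)
    return abnormal
```

The returned indices point at the instructions to examine when locating problems in the accelerator design.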
In this application, by generating at least one target neural network from a network description file, multiple target neural networks with well-defined structures can be generated, enabling effective, targeted performance testing of the accelerator. Further, when multiple target neural networks are generated, different neural networks can be used to test the accelerator's performance, thereby achieving better performance testing of the accelerator.
There are many ways to implement the generation of at least one target neural network from the network description file in step S1; a method for generating at least one target neural network from the network description file in step S1 is described in detail below with reference to FIG. 3.
FIG. 3 is a flowchart of the neural network generation method of an embodiment of this application.
Referring to FIG. 3, the neural network generation method 300 may include:
Step S11: obtain and parse at least one network description file, each network description file including the node types and node connection relationships of the nodes of all generations of one target neural network.
Step S12: generate the structure of the target neural network according to the node types and node connection relationships.
Step S13: generate the node parameters of the target neural network according to preset inter-node parameter constraints.
Step S14: generate the target neural network according to its structure and node parameters.
These steps are described in detail below.
In step S11, at least one network description file is obtained and parsed, each network description file including the node types and node connection relationships of the nodes of all generations of one target neural network.
The above target neural network may be any kind of neural network. Each description file can describe one target neural network; correspondingly, multiple network description files can be obtained and parsed to generate multiple target neural networks with well-defined structures and random parameters.
In one embodiment of this disclosure, the format of the network description file may, for example, be: the first line contains a comment describing the available node types; the n-th line (n being an integer greater than 1) describes the node information of all nodes of generation n-1 of the target neural network, the node information including node type and node connection relationship.
The available node types include, but are not limited to, the following:
ReLu/PReLU/Leaky/Input/Convolution/InnerProduct/Deconvolution/Pooling/GlobalPooling/Concat/Eltwise/Scale.
In one embodiment, the information of a single node may, for example, be described in the following format:
NodeType:PGnIDX_PNnIDX,PGmIDX_PNmIDX…
where NodeType is the node type of the current node, selected from the valid node types in the first line; when n = 2, the node type is generally the Input type (that is, the first-generation node of the target neural network is set as the Input node);
PGnIDX_PNnIDX is the parent-node information of the current node, that is, the current node's node connection relationship; one node may have multiple parent nodes, and for one current node multiple parent-node entries can be described (an Input-type node has no parent node);
PGnIDX and PGmIDX respectively denote the generation indices of the current node's n-th and m-th parent nodes; a parent node's generation index is smaller than the current node's generation index;
PNnIDX and PNmIDX respectively denote the node indices, within their own generations, of the n-th and m-th parent nodes.
Each line of the network description file (that is, each generation of the target neural network) can describe multiple nodes, with the node descriptions separated by a single space character.
As an example, the following dscp file can describe the network structure shown in FIG. 4:
//valid node type:
ReLu,PReLU,Leaky,Input,Convolution,InnerProduct,Deconvolution,Pooling,GlobalPooling,Concat,Eltwise,Scale
Input
ReLU:0_0 Convolution:0_0 PReLU:0_0
Pooling:1_0
Eltwise:2_0,1_1
In the structure shown in FIG. 4, generation 0 of the target neural network includes one node, node 1, whose node type is Input. Generation 1 includes three nodes: node 2, node 3, and node 4, with node types ReLU, Conv, and PReLU respectively; the parent node of node 2 is the 0th node of generation 0, that is, node 1, and the parent nodes of node 3 and node 4 are the same as node 2's, namely the Input node. Generation 2 includes one node, node 5, of type Pooling, whose parent node is the 0th node of generation 1, that is, node 2. Generation 3 includes one node, node 6, of type Eltwise, which connects to two parent nodes: the 0th node of generation 2, that is, node 5, and the 1st node of generation 1, that is, node 3.
The network description file can be edited and set up by developers according to their needs, so that neural networks with definite structures and random parameters can be customized on demand.
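As an illustration of the format above, a small parser for such a dscp file might look like the following. This is a hypothetical sketch, not the configuration tool from the disclosure; it assumes comment lines start with `//` and node descriptions within a line are separated by single spaces.

```python
def parse_dscp(text):
    """Parse dscp text into a list of generations; each generation is a
    list of (node_type, parents) where parents is a list of
    (generation_index, node_index) pairs, per the format described above."""
    generations = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('//'):
            continue  # skip blank lines and the comment listing node types
        nodes = []
        for item in line.split():
            if ':' in item:
                node_type, parent_str = item.split(':', 1)
                parents = [tuple(int(x) for x in p.split('_'))
                           for p in parent_str.split(',')]
            else:
                node_type, parents = item, []   # e.g. the Input node
            nodes.append((node_type, parents))
        generations.append(nodes)
    return generations
```

For the example file above, the parser would yield `Input` as generation 0 and an `Eltwise` node with parents `(2, 0)` and `(1, 1)` as the last generation.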
In step S12, the structure of the target neural network is generated according to the node types and node connection relationships.
After parsing the network description file, the structure of the target neural network can be constructed according to the node types and node connection relationships of each node recorded in the network description file, or a prototxt file can be output (the file containing the connection relationships of each node in the target neural network) so that the prototxt file can later be fed to a configuration tool and translated into neural network instructions for the accelerator to execute.
In one embodiment, step S12 may mainly include: instantiating each node according to its node type; determining each node's parent nodes according to the node connection relationships; and connecting each node with its corresponding parent nodes to generate the structure of the target neural network.
In this embodiment of the disclosure, the generated structure of the target neural network can be used to verify the reasonableness of the target neural network's connection relationships, or to output a target neural network applicable to other purposes. In some other embodiments, the above prototxt file can also be output directly from the node types and node connection relationships, without outputting the target neural network itself.
In step S13, the node parameters of the target neural network are generated according to preset inter-node parameter constraints.
After the structure of the target neural network is determined, the intra-node parameters of each node still need to be determined, where the types of intra-node parameters, the number of intra-node parameters, and the intra-node parameters themselves are related to the node type.
For example, for a Conv-type node, of_h must satisfy formula (1):
of_h = (if_h[0] + 2 × pad_h − (dilation_h × (kernel_h − 1) + 1)) / stride_h + 1    (1)
For a Pool-type node, of_h must satisfy formula (2):
of_h = (if_h[0] + 2 × pad_h − Pool_size) / stride_h + 1    (2)
In formulas (1) and (2), of_h denotes the height of the node's output feature map; if_h denotes the height of the node's input feature map; pad_h is the number of rows of elements padded onto the node's input feature map for ease of computation, usually with zeros; dilation_h denotes the number of elements interpolated into the node's input feature map (dilation_h is greater than 0), with the interpolated value usually 0; kernel_h denotes the size of the convolution kernel; stride_h denotes the stride with which the convolution kernel or pooling window slides in the height direction; and Pool_size denotes the window size used in pooling.
For a Concat-type node, of_c equals the sum of all the if_c values; for an Eltwise-type node, of_c must be consistent with the size of each if_c.
In addition, when determining each node's intra-node parameters, the following condition A must also be satisfied.
Condition A: the size of the feature map output by a parent node equals the size of the feature map input by its child node.
Since the feature map output by the parent node is exactly the feature map input by the child node, the size of the parent node's output feature map must match the size of the child node's input feature map.
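Formulas (1) and (2) can be checked with a short helper. Integer division is used here on the assumption that the parameters are chosen so the output height comes out as a whole number of rows; the function names are illustrative.

```python
def conv_out_h(if_h, pad_h, dilation_h, kernel_h, stride_h):
    """Output feature-map height of a Conv node, per formula (1)."""
    return (if_h + 2 * pad_h - (dilation_h * (kernel_h - 1) + 1)) // stride_h + 1

def pool_out_h(if_h, pad_h, pool_size, stride_h):
    """Output feature-map height of a Pool node, per formula (2)."""
    return (if_h + 2 * pad_h - pool_size) // stride_h + 1
```

For example, a 3×3 convolution with pad_h = 1, dilation_h = 1, and stride_h = 1 preserves the input height, while a 2×2 pooling window with stride 2 halves it.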
In step S14, the target neural network is generated according to its structure and node parameters.
FIG. 5 is a flowchart of the process of generating the target neural network in an embodiment of this disclosure.
The process shown in FIG. 5 may be executed by an electronic device (for the definition and explanation of the electronic device, see the related content of the method shown in FIG. 2). The process 500 shown in FIG. 5 includes steps S501 to S510, which are described in detail below.
Step S501: start.
Step S501 indicates the start of neural network generation.
[Corrected under Rule 91, 26.05.2020]
Step S502: obtain and parse a description file in dscp format.
If parsing succeeds, proceed to step S503; if parsing fails, return a parse-failure message and proceed to step S510 to end the neural network generation process.
Step S503: obtain the number of nodes in each generation, and the node type and node connection relationship of each node in each generation, as recorded in the dscp file.
In step S503, the node type and node connection relationship of each node, that is, each node's parent-node description, can be obtained in the order recorded in the dscp file.
Step S504: instantiate each node according to its node type.
Specifically, in step S504, each node in each generation can be instantiated according to the node types and node counts of each generation; that is, the node instances of each generation are determined from the node types and node counts of each generation, where one node may correspond to one instance or to multiple instances.
It should be understood that a node here is more of a logical concept, while a node instance is the entity on which the node actually relies and on which the node's various data processing tasks can be executed.
Step S505: configure each node's header information and number of parent nodes.
Configuring each node's header information means generating each node's overall serial number (sequence), generation index (gen_id), and node index within the generation (node_id).
For example, the generation indices (gen_id) of the generations can be generated in order from top to bottom; the overall serial numbers (sequence) of the nodes can be generated in order from generation 0 to generation N (N being the number of the last generation of the neural network); and within each generation, each node's node index within the generation (node_id) can be generated in a certain order.
Here, sequence denotes the serial number of a node within the entire neural network.
Step S506: determine whether the current connections are valid.
In step S506, it must be determined whether the currently existing connections are valid. In concrete execution, each connection relationship can be judged against conditions (4) and (5) described later: a connection relationship satisfying both conditions (4) and (5) is a valid connection relationship, and a connection relationship failing either condition (4) or condition (5) is an invalid connection relationship.
When the connections are determined to be valid, step S507 is executed; when a connection is determined to be invalid, an invalid-connection message is returned and the process proceeds to step S510 to end network generation.
Step S507: connect the nodes.
In step S507, each node can be connected with its parent nodes according to the parent-node descriptions of each node recorded in the dscp file.
Step S508: randomly generate each node's intra-node parameters.
When determining each node's intra-node parameters, they can be determined under the constraints of formula (1), formula (2), and condition A above.
[Corrected under Rule 91, 26.05.2020]
Step S509: print a prototype file in prototxt format.
The prototxt file contains the connection relationships of each node of the neural network to be generated; once the prototxt file is generated, the neural network can subsequently be constructed or generated from the prototxt file.
Step S510: end.
Step S510 indicates that the neural network generation process has ended.
FIG. 6 is a flowchart of a method for generating a network description file in an embodiment of this disclosure.
Referring to FIG. 6, in this embodiment of the disclosure, a generation method 600 for the network description file referred to in FIGS. 2 to 5 may include:
Step S61: determine the number of generations of the target neural network, and the node types and node counts of the nodes of all generations of the target neural network.
Step S62: determine, according to preset node connection requirements, a target connection mode connecting all nodes of the target neural network.
Step S63: generate the network description file according to the number of generations of the target neural network, the node types and node counts of each generation, and the target connection mode.
The steps of method 600 are explained in detail below.
In step S61, the number of generations of the target neural network and the node types and node counts of the nodes of all generations are determined.
The target neural network determined in step S61 may be any one of the at least one target neural network in step S1 above.
Specifically, in step S61, the number of generations of the target neural network can first be determined randomly, and then the node types and node counts of each generation can be determined randomly.
For example, as shown in FIG. 1, the number of generations of the target neural network can be randomly determined to be 5 (the neural network in FIG. 1 has 5 generations).
In addition, in step S61, the number of generations of the target neural network can be determined within a certain numeric range (for example, the depth range of the neural network). For example, the number of generations can be randomly determined to be 12 within the range [10, 20].
After the number of generations of the target neural network is determined, the node types of each generation can be determined from all available node types; when determining the node counts of each generation, they can be determined within a certain numeric range (for example, the width range of the neural network).
It should be understood that, in step S61, the number of generations of the target neural network and the node types and node counts of each generation can also be set according to specific (computational) needs.
For example, if the neural network is to be used for some simple computations, the target neural network can be given fewer generations and fewer nodes per generation; if the neural network is to be used for very complex computations, the target neural network can be given more generations and more nodes per generation.
Optionally, the node types of each generation can be determined from the available node types Input/Eltwise/Concat/Conv/Pool/Relu/Prelu/Innerproduct/Global Pooling.
For example, taking the neural network shown in FIG. 1 as an example, after randomly determining that the target neural network has 5 generations, the node counts of generations 0 to 4 can be randomly determined to be 1, 3, 2, 2, and 1 respectively.
When determining the node types and node counts of each generation, the node types of each generation can be determined first, the node counts of each generation can be determined first, or both can be determined at the same time (this application does not limit the order in which the node types and node counts of each generation are determined).
It should be understood that, when determining the node types and node counts of each generation, the node count of each generation can be greater than or equal to the number of node types of that generation (that is, the number of node types of each generation does not exceed that generation's node count).
The number of generations of the target neural network determined in step S61, and the node types and node counts of each generation, are explained below with reference to the drawings.
For example, as shown in FIG. 7, the number of generations of the target neural network determined in step S61 is 4 (generations 0 to 3), and the node counts of generations 0 to 3 are as follows:
Generation 0: 1 node;
Generation 1: 3 nodes;
Generation 2: 2 nodes;
Generation 3: 1 node.
The node types contained in generations 0 to 3 are as follows:
Generation 0: Input;
Generation 1: FC, Eltwise, and Global Pooling;
Generation 2: Concat and FC;
Generation 3: Eltwise.
在步骤S62,根据预设的节点连接要求确定连接所述目标神经网络中所有节点的目标连接方式。
上述节点连接要求可以是能够满足神经网络正常使用要求的规则,该节点连接要求可以是预先设置好的,具体地,可以通过经验和要生成的神经网络的需求来设定节点连接要求。
应理解,在步骤S62中根据节点连接要求确定的目标神经网络中各个节点之间的连接关系可以有多种,在获取了多种连接关系之后可以从该多种连接关系中(任意)选择一种连接关系作为最终的连接关系。
可选地,上述节点连接要求可以包括下列条件中的至少一种:
(1)第一代节点的节点类型为输入(Input)类型;
(2)当前节点的节点类型为Concat或者Eltwise时,当前节点的父节点个数小于或者等于该父节点的候选节点个数;
(3)当前节点的节点类型与当前节点的父节点之间的连接满足表1所示的关系。
表1
Figure PCTCN2020080742-appb-000001
Figure PCTCN2020080742-appb-000002
表1示出了当前节点为不同的节点类型时能够连接的父节点的节点类型,其中,Y表示可以连接,N表示不能连接。
应理解,在上述步骤S62中可以得到多种节点连接关系,在执行步骤S63之前,可以对该多种节点连接关系的有效性进行判断,从中选择出有效的节点连接关系之后再执行步骤S63。
具体地,在检查多种节点连接关系的有效性时,可以判断这些节点连接关系是否满足下面的条件(4)和条件(5),并将这些节点连接关系中满足条件(4)和条件(5)的节点连接关系确定为有效的节点连接关系,并根据这些有效的节点连接关系执行步骤S63。
(4)Eltwise类型的节点多个输入的通道数要保持一致;
(5)FC类型和Global Pooling类型的节点之后(包括紧跟着当前节点之后的节点,以及后面代中位于当前节点之后的节点)不能连接FC、Global Pooling和act类型之外的其它类型节点。
具体地,FC类型和Global Pooling类型的节点之后紧跟着的后面的节点,以及后面代中位于FC类型和Global Pooling类型的节点之后的节点的节点类型只能是FC、Global Pooling或act类型。
例如,在图7所示的神经网络结构中,节点6的节点类型为Eltwise,节点6两个输入的通道数均为1,节点6两端的输入通道数满足上述条件(4),但是,对于同为Eltwise类型的节点11来说,节点11左侧的输入通道数为2,右侧的输入通道数为1,节点11左侧的输入通道数和右侧的输入通道数不一致,不满足上述条件(4)。
因此,图8所示的连接关系不符合上述条件(4),当步骤220中确定出的多种节点连接关系包含图8所示的无效连接关系时,需要将该连接关系排除掉。
再如,在图9所示的神经网络中,节点1的节点类型为FC,节点2的节点类型为Relu,由于节点1的节点类型为FC,节点1后面只能连接节点类型为FC、Global Pooling和act的节点,节点1与节点2的连接关系不满足上述条件(5);另外,节点3的节点类型为Global Pooling,节点4的节点类型为Prelu,节点3只能连接节点类型为FC、Global Pooling和act的节点,节点3与节点4的连接关系不满足上述条件(5)。
因此,图9所示的连接关系不符合上述条件(5),当步骤S62中确定出的多种节点连接关系包含图9所示的无效连接关系时,要将该连接关系排除掉。
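上述条件(4)和条件(5)的有效性检查可以写成如下示意函数。其中节点、父节点与通道数所采用的数据结构,以及act类节点的具体归类(Relu/Prelu)均为示例性假设;条件(5)按正文"包括后面代中位于当前节点之后的节点"的要求做了传递性检查:

```python
from collections import deque

def connection_valid(nodes, parents, channels):
    # nodes: {节点id: 节点类型}; parents: {节点id: [父节点id]};
    # channels: {节点id: 输出通道数}(均为示例性数据结构)
    act_types = {"Relu", "Prelu"}                  # act类节点归类为假设
    allowed = {"FC", "Global Pooling"} | act_types
    children = {}
    for nid, ps in parents.items():
        for p in ps:
            children.setdefault(p, []).append(nid)
    for nid, ntype in nodes.items():
        # 条件(4):Eltwise类型节点多个输入的通道数要保持一致
        if ntype == "Eltwise":
            if len({channels[p] for p in parents.get(nid, [])}) > 1:
                return False
        # 条件(5):FC/Global Pooling之后(含后续各代)只能出现FC、Global Pooling或act类节点
        if ntype in ("FC", "Global Pooling"):
            q = deque(children.get(nid, []))
            while q:
                d = q.popleft()
                if nodes[d] not in allowed:
                    return False
                q.extend(children.get(d, []))
    return True
```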
另外,在步骤S62中,在确定一个节点的父节点时,可能会存在多个候选节点。此时,只要满足上述条件(1)至条件(5)的节点均可以作为当前节点的候选父节点(也可以称为父节点的候选节点),但是具体从候选父节点中选择哪些节点作为当前节点的实际父节点可以根据概率密度函数来确定。
可选地,上述概率密度函数可以是高斯函数(Gaussian function),高斯函数整体符合越接近的代被选中的概率越高的基本要求。具体地,高斯函数的期望值可以与当前代索引值减1保持一致,期望值本身不影响网络形态的控制;通过调整高斯函数的方差,可以控制高斯函数的形态,从而控制各代中的节点被选中的概率。一般来说,高斯函数的方差越大,邻近代中的节点被选中的概率越大,网络的深度会变得越深,宽度会变得越窄。
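按高斯函数为当前代的节点计算各个在前代被选为父代的概率,可以用如下示意代码表达(期望值取当前代索引减1,与正文一致;sigma的具体取值为示例性假设):

```python
import math

def parent_generation_weights(cur_gen, sigma=1.0):
    # 期望值mu取cur_gen - 1,使上一代被选中的概率最高;
    # 方差(sigma**2)控制高斯函数的形态,从而控制各代被选中的概率
    mu = cur_gen - 1
    w = [math.exp(-((g - mu) ** 2) / (2 * sigma ** 2)) for g in range(cur_gen)]
    total = sum(w)
    return [x / total for x in w]   # 归一化为概率

w1 = parent_generation_weights(4, sigma=1.0)   # 第4代节点选择第0~3代父代的概率
w3 = parent_generation_weights(4, sigma=3.0)   # 方差更大时,较远的代被选中的概率上升
```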
图10和图11分别是神经网络的可能的节点连接关系的示意图。
当步骤S61中确定出来的目标神经网络的代数,以及各代的节点类型和节点个数如图7所示的情况时,在此基础上,继续执行步骤S62得到的节点连接关系如图10和图11所示。
接下来,根据上述条件(4)和(5)对图10和图11所示的节点连接关系进行分析。通过分析得知,图10和图11所示节点连接关系均满足条件(4),但是,在图10中,节点3与节点6连接不符合上述条件(5)。而图11除了满足上述条件(4)之外,还满足条件(5),因此,可以确定图11所示的节点连接关系是有效的节点连接关系,即确定了该目标神经网络的目标连接方式。
在步骤S63,根据目标神经网络的代数、每代节点的节点类型、节点个数以及目标连接方式生成网络描述文件。
在步骤S63中,在确定了各个节点的连接关系之后,就可以根据上述确定的目标神经网络的代数、每代节点的节点类型、节点个数以及目标连接方式生成可供加载并解析的网络描述文件,该网络描述文件例如可以为文本文件,文件类型例如为dscp文件。
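按正文的描述(文本文件,首行为描述可选节点类型的注释,第n行记录第n-1代所有节点的节点类型和节点连接关系),网络描述文件的生成可以写成如下示意实现。文件中各字段的具体书写格式(如"类型(父代.父索引)")为示例性假设:

```python
import os
import tempfile

def write_dscp(path, structure, parents):
    # structure: [[各代节点类型]]; parents: {(代, 代内索引): [(父代, 父索引), ...]}
    # 首行为注释,描述可选节点类型;此后每行记录一代的所有节点
    lines = ["# node types: " + "/".join(sorted({t for g in structure for t in g}))]
    for gen, types in enumerate(structure):
        items = []
        for idx, t in enumerate(types):
            ps = parents.get((gen, idx), [])
            items.append("%s(%s)" % (t, ",".join("%d.%d" % p for p in ps)))
        lines.append(" ".join(items))
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

# 示例:三代网络,Eltwise节点有两个父节点
structure = [["Input"], ["Conv", "Relu"], ["Eltwise"]]
parents = {(1, 0): [(0, 0)], (1, 1): [(0, 0)], (2, 0): [(1, 0), (1, 1)]}
path = os.path.join(tempfile.mkdtemp(), "net.dscp")
write_dscp(path, structure, parents)
```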
为了更好地理解本申请实施例的网络描述文件的生成方法的流程,下面结合图12对本申请实施例的网络描述文件的生成过程的具体执行流程1200进行详细的介绍。
图12是图6所示实施例的网络描述文件的生成过程的示意图。图12所示的过程可以由电子设备(该电子设备的限定和解释可参见图2所示的方法中的相关内容)执行,图12所示的过程包括步骤S1201至步骤S1209,下面分别对这些步骤进行详细的描述。
步骤S1201,开始。
步骤S1201表示开始生成网络描述文件。
步骤S1202,随机生成神经网络的代数。
应理解,在步骤S1202,可以在一定的数值范围内随机选择一个数值作为神经网络的代数。
步骤S1203,随机生成各代节点的个数和各代节点的节点类型。
在步骤S1203中,可以在一定的网络宽度的范围内随机生成各代节点的个数。例如,若神经网络的宽度不能超过10,那么可以分别在1到10之间任意选择一个数值作为各代节点的个数。
而在随机生成各个节点的节点类型时,可以从所有可用的节点类型中随机生成各个节点的节点类型。
这里的步骤S1202和步骤S1203相当于上文中的步骤S61,上文中对步骤S61的相关限定和解释同样适用于步骤S1202和步骤S1203,为了避免重复,这里不再详细描述步骤S1202和步骤S1203。
步骤S1204,确定各个节点的头信息和父节点个数。
确定各个节点的头信息(node_header)也就是要生成各个节点的节点总序列号(sequence),代索引号(gen_id)和代中的节点索引号(node_id)。
例如,可以按照从上到下的顺序生成各代的代索引号(gen_id),按照第0代到第N(N为神经网络的最后一代的编号)代的顺序生成各个节点的总序列号(sequence),在每代中再按照一定的顺序生成各个节点在代中的节点索引号(node_id)。
其中,sequence表示整个神经网络中的节点的序列号。
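上述头信息的生成可以写成如下示意代码(编号顺序按第0代到第N代、代内从0开始;字段名sequence/gen_id/node_id取自正文,所用数据结构为示例性假设):

```python
def build_node_headers(gen_sizes):
    # gen_sizes: 各代的节点个数列表
    # 按第0代到第N代的顺序生成节点总序列号sequence,
    # 代索引号gen_id与代中的节点索引号node_id
    headers, seq = [], 0
    for gen_id, n in enumerate(gen_sizes):
        for node_id in range(n):
            headers.append({"sequence": seq, "gen_id": gen_id, "node_id": node_id})
            seq += 1
    return headers

hs = build_node_headers([1, 3, 2, 1])   # 对应一个四代网络的各代节点个数
```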
步骤S1205,计算各个节点的父节点的候选节点。
具体地,在步骤S1205中,要计算当前节点的父节点的候选节点,以便于后续从该候选节点中选择出父节点。可以从最底层开始,逐层为每一层中的每个节点从前面的代中选择出候选父节点。应理解,在为当前节点选择候选父节点时,当前节点的候选父节点不仅可以来自于当前节点的上一代,也可以来源于当前节点之前的所有代。
在为每一个节点确定候选父节点时,可以按照一定的节点连接要求(该节点连接要求可以是上文中的条件(1)至条件(3)中的一种或者多种)来选择候选父节点,将上一代中满足节点连接要求的节点作为当前节点的候选父节点。例如,如图7所示,可以选择第2代中的节点5和节点6作为第3代中的节点7的候选父节点。
另外,在步骤S1205中,当确定了节点的候选父节点之后,可以采用概率密度函数来计算候选父节点中的每个节点作为当前节点的父节点的概率,并将其中概率大于一定数值的节点保留为当前节点的候选父节点。
应理解,上述候选父节点的个数可以是多个,从候选父节点中选出的父节点的个数可以是一个也可以是多个。另外,从候选父节点中选择出来的父节点是当前节点的实际父节点。
例如,某个节点有6个候选父节点,通过概率密度函数计算,这6个候选父节点作为当前节点的父节点的概率分别为70%、60%、65%、45%、40%和30%。那么,可以将概率分别为70%、60%、65%对应的候选父节点确定为当前节点的实际父节点(可以选择一个或者多个候选父节点作为当前节点的实际父节点)。
在上述例子中,也可以只将对应概率最大的候选父节点作为当前节点的实际父节点(也就是将概率为70%对应的候选父节点作为当前节点的实际父节点)。
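从候选父节点中确定实际父节点的上述两种方式(按阈值选取概率较高的多个候选,或者只取概率最大的一个)可以写成如下示意代码(阈值取0.5为示例性假设):

```python
def pick_parents(cand_probs, threshold=0.5, max_only=False):
    # cand_probs: {候选父节点: 作为父节点的概率}
    if max_only:
        # 只将对应概率最大的候选父节点作为实际父节点
        return [max(cand_probs, key=cand_probs.get)]
    # 将概率大于阈值的候选父节点均确定为实际父节点
    return [n for n, p in cand_probs.items() if p > threshold]

probs = {"n1": 0.70, "n2": 0.60, "n3": 0.65, "n4": 0.45, "n5": 0.40, "n6": 0.30}
```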
上述概率密度函数具体可以是高斯函数。
步骤S1206,随机挑选当前节点的实际父节点并记录连接关系。
在上述步骤S1205中选择出当前节点的实际父节点之后,如果实际父节点的数量为多个,那么,就可以从这些实际父节点中任意或者随机选择父节点进行记录。
步骤S1207,确定当前连接是否有效。
在步骤S1207中,可以根据上述条件(4)和条件(5)对每一个连接关系进行判断,满足条件(4)和(5)的连接关系为有效连接关系,不满足条件(4)和条件(5)中的任意一个条件的连接关系为无效连接关系。
当确定连接有效时,执行步骤S1208,当确定连接无效时,继续执行步骤S1205。
步骤S1208,根据目标神经网络的代数、每代节点的节点类型、节点个数以及目标连接方式生成网络描述文件。
网络描述文件例如可以为文本文件,文件类型例如为dscp文件。
步骤S1209,结束。
步骤S1209表示网络描述文件的生成过程结束。
在图6~图12所示实施例中,通过生成结构随机、参数随机的目标神经网络结构,并记录在网络描述文件中,可以通过该网络描述文件的加载生成多种神经网络结构,能够极大地扩大神经网络的数量和种类。在用于加速器检测场景时,可以通过大量多种不同结构的神经网络对加速器进行更好地验证。此外,由于上述目标神经网络的结构记录在网络描述文件中,可以通过对网络描述文件的复制并修改生成多种结构相似的神经网络,并实现这些神经网络的复用。
图13是本公开实施例中的另一个网络描述文件的生成方法的流程图。
参考图13,在本公开实施例中,图2~图5中指出的网络描述文件的一种生成方法1300可以包括:
步骤S131,根据加速器的结构类型确定目标神经网络所有代的节点的节点类型以及节点连接关系。
步骤S132,根据节点类型和节点连接关系生成网络描述文件。
下面对图13所示实施例的步骤进行详细解释。
在步骤S131,不同于图6~图12所示实施例中网络描述文件记录的是随机生成的网络,本步骤可以根据被测加速器的结构定制目标神经网络的结构,以更有针对性地对加速器进行测试。
图14是本公开一个实施例中根据加速器的结构类型确定目标神经网络结构的流程图。
参考图14,在一个实施例中,步骤S131可以包括:
步骤S1311,确定目标神经网络中的子网络的数量。
步骤S1312,依据加速器的运算单元种类确定每个子网络内部的节点的节点类型。
步骤S1313,确定每个子网络内部的节点连接关系,并根据每个子网络内部的节点连接关系确定子网络之间的节点连接关系。
步骤S1314,根据每个子网络内部的节点的节点类型、每个子网络内部的节点连接关系以及子网络之间的节点连接关系确定目标神经网络所有代的节点的节点类型以及节点连接关系。
下面,对步骤S131的各子步骤进行详细说明。
在步骤S1311,确定目标神经网络中的子网络的数量。
可以随机确定目标神经网络中的子网络的数量。此外,也可以在一定的数值范围(例如,神经网络的深度范围)内来确定目标神经网络的子网络的数量。例如,可以在数值[10,20]范围内随机确定子网络的数量为12。此外,也可以根据具体的(运算)需求来设置目标神经网络的子网络的数量。
在步骤S1312,依据加速器的运算单元种类确定每个所述子网络内部的节点的节点类型。
卷积神经网络加速器一般包括CONV/POOLING/ELTWIS/ACTIVE等运算单元,这些运算单元可以并行进行计算。为了提高加速器的运算效率,可以让尽量多的运算单元同时参与运算。为此,在步骤S1312中,尽量将不同类型的节点划分到一个子网络中。即,尽量在一个子网络中设置不同类型的节点,以利于在该子网络范围内多个运算单元可以并行运算,提高加速器的检测效率。
例如,在确定一个子网络内的节点的节点类型时,可以选取加速器的运算单元种类中的至少一个作为一个该子网络的节点的节点类型,并使一个子网络中各节点的节点类型不同。重复执行该过程,可以确定每个子网络内部的节点的节点类型。
图15是子网络内部节点类型选取的示意图。在图15所示实施例中,加速器的运算单元有CONV、POOLING、ELTWIS、ACTIVE四种,各子网络内部的节点数量小于等于四。子网络151中包括随机选取的POOLING、ELTWIS、CONV三种类型的节点,每种节点只有一个,以便于该子网络范围内运算单元并行运行;子网络152中包括随机选取的CONV、POOLING两种类型的节点;子网络153中包括随机选取的CONV、ACTIVE两种类型的节点。
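依据加速器的运算单元种类为每个子网络随机选取节点类型的过程,可以写成如下示意代码(运算单元种类取正文列举的四种;每个子网络内各节点类型通过无放回抽样保证互不相同):

```python
import random

def subnetwork_node_types(num_subnets,
                          units=("CONV", "POOLING", "ELTWIS", "ACTIVE"),
                          seed=None):
    # 为num_subnets个子网络各自随机选取节点类型,
    # 子网络内的节点数量不超过运算单元种类数,且类型互不相同
    rng = random.Random(seed)
    subnets = []
    for _ in range(num_subnets):
        k = rng.randint(1, len(units))        # 该子网络的节点个数
        subnets.append(rng.sample(units, k))  # 无放回抽样,保证类型不重复
    return subnets

nets = subnetwork_node_types(3, seed=0)
```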
在此步骤需要注意的是,应在一个子网络中设置一个节点作为目标神经网络的输入节点,并将节点类型设置为INPUT,该INPUT类型节点既可以与其他类型的节点同属于一个子网络,也可以单独成立一个子网络。
上述过程仅为示例,还可以通过顺次排列组合等方式确定各子网络中的节点的节点类型,本公开对此不作特殊限制。
在步骤S1313,确定每个所述子网络内部的节点连接关系,并根据所述每个所述子网络内部的节点连接关系确定所述子网络之间的节点连接关系。
图16是确定节点连接关系的流程图。
参考图16,步骤S1313中确定子网络内部的节点连接关系以及子网络之间的节点连接关系的过程可以包括:
步骤S13131,选取一个节点作为一个子网络的输入节点。
步骤S13132,通过随机或遍历方式确定子网络内部除输入节点之外的每个节点的父节点。
步骤S13133,确定每个子网络的输出节点。
步骤S13134,根据子网络的输入节点和输出节点,通过随机或遍历方式确定子网络之间的节点连接关系。
下面,对图16所示各步骤进行详细说明。
在步骤S13131,可以在每个子网络中随机选取一个节点作为该子网络的输入节点。
可以理解的是,不同于目标神经网络的输入节点的节点类型为INPUT,各子网络的输入节点的类型可以为多种,例如上文提到的CONV、POOLING、ELTWIS、ACTIVE等各种节点。但是,需要注意的是,一个目标神经网络只能配置一个INPUT类型的节点。如果该INPUT类型的节点与其他节点同属于某一个子网络,则该子网络的输入节点默认为该INPUT类型的节点;如果该INPUT类型的节点单独属于一个子网络,则该子网络的输入节点和输出节点均为该INPUT类型节点。
在步骤S13132,通过随机或遍历方式确定所述子网络内部除输入节点之外的每个节点的父节点。
可以在每个子网络内部,逐个节点确定该节点的父节点。除输入节点外,各节点的父节点均属于同一个子网络。
图17是图15所示子网络的连接关系确定过程的示意图。
在图17所示实施例中,可以首先确定子网络151中的节点1为输入节点,然后确定节点2的父节点。可以通过随机或遍历方式确定节点2的父节点。例如,既可以将节点2的父节点设置为节点1,也可以设置为节点3。如果将节点2的父节点设置为节点1,则可以将节点3的父节点随机设置为节点2或节点1;如果将节点2的父节点设置为节点3,则必须将节点3的父节点设置为节点1,以连接该子网络的输入节点。
接下来,可以确定子网络152的输入节点为节点4,则自然设置节点5的父节点为节点4。确定子网络153的输入节点为节点6,则自然设置节点7的父节点为节点6。
在步骤S13133,确定每个所述子网络的输出节点。
如果一个子网络仅有两个节点,则自动将非输入节点设置为输出节点;如果一个子网络有多个节点,则可以根据子网络内部的节点连接关系确定子网络的输出节点。例如在图17所示的子网络151中,可以设置节点3为输出节点。当节点2和节点3均仅连接输入节点1时,可以将节点2和节点3均设置为该子网络的输出节点,或者选取其中一个作为该子网络的输出节点。
步骤S13134,根据所述子网络的所述输入节点和所述输出节点,通过随机或遍历方式确定所述子网络之间的节点连接关系。
可以通过随机或遍历的方式连接各子网络的输入节点和输出节点。以图17为例,可以首先确定子网络151的输入节点连接神经网络的输入节点0(即INPUT类型节点);然后确定子网络152的输入节点连接子网络151的任意一个输出节点;接下来确定子网络153的输入节点连接子网络152的输出节点,以形成子网络之间的节点连接关系。
可以理解的是,一个子网络可以具有两个或两个以上的输出节点,每个输出节点也可以连接两个或两个以上的子网络的输入节点。
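以图17所示的顺次连接为例,子网络之间按输入/输出节点建立连接的过程可以写成如下示意代码。此处仅示意"每个子网络的输入节点随机连接前一个子网络的某个输出节点"这一简单情形(多输出、多分支的连接同理可扩展),所用数据结构为示例性假设:

```python
import random

def chain_subnetworks(subnet_io, seed=None):
    # subnet_io: [(输入节点, [输出节点, ...]), ...],按子网络顺序排列
    # 第一个子网络的输入节点连接目标神经网络的INPUT节点,
    # 其后每个子网络的输入节点随机连接前一个子网络的某个输出节点
    rng = random.Random(seed)
    edges = [("INPUT", subnet_io[0][0])]
    for i in range(1, len(subnet_io)):
        prev_outputs = subnet_io[i - 1][1]
        edges.append((rng.choice(prev_outputs), subnet_io[i][0]))
    return edges

io = [("n1", ["n3"]), ("n4", ["n5"]), ("n6", ["n7"])]
edges = chain_subnetworks(io, seed=0)
```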
在步骤S1314,根据每个所述子网络内部的节点的节点类型、每个所述子网络内部的节点连接关系以及所述子网络之间的节点连接关系确定所述目标神经网络所有代的节点的节点类型以及节点连接关系。
经过步骤S1312和步骤S13131~步骤S13134,即可确定整个目标神经网络中各节点的节点类型和节点连接方式。
此时,可以自动验证该节点连接方式是否符合预设规范,该预设规范可以由目标神经网络的定制人员自行设置。在该节点连接方式符合预设规范时,可以记录该目标神经网络所有代的节点类型以及节点连接关系,供后续生成网络描述文件。
需要解释的是,虽然在一个子网络内部节点具有依赖关系,但是可以在目标神经网络翻译成神经网络指令时,设置父节点将每一步运算结果实时传递给子节点,以供子节点在父节点进行下一步运算时同时计算,而无需在父节点的运算完成后才将运算结果传递给子节点,实现加速器在对一个子网络内部节点进行运算时各运算单元能够并行运行,提高加速器的运行效率。
上文结合图1至图17对本申请实施例的加速器的检测方法进行了详细的描述。
事实上,本申请还可以保护一种神经网络的生成方法。
神经网络的生成方法具体包括:获取并解析至少一个所述网络描述文件,每个所述网络描述文件包括一个所述目标神经网络的所有代的节点的节点类型和节点连接关系;根据所述节点类型和所述节点连接关系生成所述目标神经网络的结构;根据预设节点间参数约束条件生成所述目标神经网络的节点参数;根据所述目标神经网络的结构和所述节点参数生成所述目标神经网络。
本申请中,通过解析网络描述文件生成对应的目标神经网络,可以通过网络描述文件对目标神经网络的结构进行定制、修改和复用,以多次生成结构明确的目标神经网络。
上述生成的目标神经网络可以用于对数据进行处理,因此,本申请还可以保护一种数据处理方法,包括:获取并解析至少一个所述网络描述文件,每个所述网络描述文件包括一个所述目标神经网络的所有代的节点的节点类型和节点连接关系;根据所述节点类型和所述节点连接关系生成所述目标神经网络的结构;根据预设节点间参数约束条件生成所述目标神经网络的节点参数;根据所述目标神经网络的结构和所述节点参数生成所述目标神经网络;采用目标神经网络进行数据处理。
本申请中,通过解析网络描述文件生成对应的目标神经网络,可以通过网络描述文件对目标神经网络的结构进行定制、修改和复用,以多次生成结构明确的目标神经网络,进而能够更有针对性的采用特定的神经网络对相应的数据进行数据处理。
可选地,上述采用目标神经网络进行数据处理,包括:获取输入数据;采用目标神经网络对输入数据进行数据处理,得到输出数据。上述输入数据可以是需要采用神经网络进行处理的数据,进一步的,该输入数据可以是人工智能领域内需要采用神经网络进行处理的数据。例如,上述输入数据可以是待处理的图像数据,上述输出数据可以是图像的分类结果或者识别结果。再如,上述输入数据也可以是待识别的语音数据,上述输出结果可以是语音识别结果。
应理解,上述神经网络的生成方法和数据处理方法中的神经网络的生成的具体方式以及对相关信息的限定和解释可以参见上文中神经网络的生成过程的相关内容(例如,图2所示的相关内容)。
下面结合图18对本申请实施例的加速器的验证平台进行描述,应理解,图18所示的加速器的验证平台能够执行本申请图2~图17实施例的加速器的检测方法的各个步骤,下面在介绍图18时适当省略重复的描述。
图18是本申请实施例加速器的验证平台的示意性框图。图18所示的加速器的验证平台1800包括:
存储器1801,用于存储代码;
至少一个处理器1802,用于执行所述存储器中存储的代码,以执行如下操作:
根据网络描述文件生成至少一个目标神经网络,其中,所述网络描述文件记录所述目标神经网络的网络结构;
将所述至少一个目标神经网络翻译成神经网络指令;将所述神经网络指令分别输入到加速器以及与所述加速器匹配的软件模型中执行,并确定所述神经网络指令的输出结果的差异;
根据所述神经网络指令的输出结果的差异确定加速器运行过程中出现异常的指令。
应理解,图18中为了方便表示,仅示出了一个处理器1802,事实上,图18所示的验证平台1800可以包含一个或者多个处理器1802。
图19是本申请实施例的一种生成神经网络的装置的示意性框图。应理解,图19所示的装置1900能够执行本申请图2所示实施例的生成神经网络的方法的各个步骤,图19所示的装置1900包括:
存储器1901,用于存储代码;
至少一个处理器1902,用于执行所述存储器中存储的代码,以执行如下操作:
获取并解析至少一个所述网络描述文件,每个所述网络描述文件包括一个所述目标神经网络的所有代的节点的节点类型和节点连接关系;根据所述节点类型和所述节点连接关系生成所述目标神经网络的结构;根据预设节点间参数约束条件生成所述目标神经网络的节点参数;根据所述目标神经网络的结构和所述节点参数生成所述目标神经网络。
应理解,图19中为了方便表示,仅示出了一个处理器1902,事实上,图19所示的装置1900可以包含一个或者多个处理器1902。
图20是本申请实施例的数据处理装置的示意性框图。应理解,图20所示的装置2000能够执行本申请图2所示实施例的数据处理方法的各个步骤,图20所示的装置2000包括:
存储器2001,用于存储代码;
至少一个处理器2002,用于执行所述存储器中存储的代码,以执行如下操作:
获取并解析至少一个所述网络描述文件,每个所述网络描述文件包括一个所述目标神经网络的所有代的节点的节点类型和节点连接关系;
根据所述节点类型和所述节点连接关系生成所述目标神经网络的结构;
根据预设节点间参数约束条件生成所述目标神经网络的节点参数;
根据所述目标神经网络的结构和所述节点参数生成目标神经网络;
采用所述目标神经网络进行数据处理。
应理解,图20中为了方便表示,仅示出了一个处理器2002,事实上,图20所示的装置2000可以包含一个或者多个处理器2002。
图21是本申请实施例的网络描述文件的生成装置的示意性框图。应理解,图21所示的装置2100能够执行本申请图6所示实施例的数据处理方法的各个步骤,图21所示的装置2100包括:
存储器2101,用于存储代码;
至少一个处理器2102,用于执行所述存储器中存储的代码,以执行如下操作:
确定所述目标神经网络的代数,以及所述目标神经网络所有代的节点的节点类型和节点个数;
根据预设的节点连接要求确定连接所述目标神经网络中所有节点的目标连接方式;
根据所述目标神经网络的代数、每代节点的节点类型、节点个数以及所述目标连接方式生成所述网络描述文件。
应理解,图21中为了方便表示,仅示出了一个处理器2102,事实上,图21所示的装置2100可以包含一个或者多个处理器2102。
图22是本申请实施例的网络描述文件的生成装置的示意性框图。应理解,图22所示的装置2200能够执行本申请图13所示实施例的数据处理方法的各个步骤,图22所示的装置2200包括:
存储器2201,用于存储代码;
至少一个处理器2202,用于执行所述存储器中存储的代码,以执行如下操作:
根据加速器的结构类型确定所述目标神经网络所有代的节点的节点类型以及节点连接关系;
根据所述节点类型和所述节点连接关系生成所述网络描述文件。
应理解,图22中为了方便表示,仅示出了一个处理器2202,事实上,图22所示的装置2200可以包含一个或者多个处理器2202。
上述加速器的验证平台1800、装置1900~装置2200具体可以是电子设备或者服务器,这里的电子设备可以是移动终端(例如,智能手机),电脑,个人数字助理,可穿戴设备,车载设备,物联网设备等包含处理器的设备。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其他任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本发明实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如数字视频光盘(digital video disc,DVD))、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。
工业实用性
本公开实施例提供的加速器的检测方法、神经网络的生成方法、数据处理方法、网络描述文件的生成方法以及相关装置,可以定制用于检测加速器的神经网络的结构,提高加速器的检测效率。

Claims (43)

  1. 一种加速器的检测方法,其特征在于,包括:
    根据网络描述文件生成至少一个目标神经网络,其中,所述网络描述文件记录所述目标神经网络的网络结构;
    将所述至少一个目标神经网络翻译成神经网络指令;
    将所述神经网络指令分别输入到加速器以及与所述加速器匹配的软件模型中执行,并确定所述神经网络指令的输出结果的差异;
    根据所述神经网络指令的输出结果的差异确定加速器运行过程中出现异常的指令。
  2. 如权利要求1所述的方法,其特征在于,所述根据网络描述文件生成至少一个目标神经网络,包括:
    获取并解析至少一个所述网络描述文件,每个所述网络描述文件包括一个所述目标神经网络的所有代的节点的节点类型和节点连接关系;
    根据所述节点类型和所述节点连接关系生成所述目标神经网络的结构;
    根据预设节点间参数约束条件生成所述目标神经网络的节点参数;
    根据所述目标神经网络的结构和所述节点参数生成所述目标神经网络。
  3. 如权利要求1或2所述的方法,其特征在于,所述网络描述文件为文本文件,所述网络描述文件的第一行包括注释,用于描述可选节点类型;所述网络描述文件的第n行(n为大于1的整数)描述所述目标神经网络中第n-1代的所有节点的节点信息,所述节点信息包括节点类型和节点连接关系;其中,n为整数。
  4. 如权利要求1或2所述的方法,其特征在于,所述根据所述节点类型和所述节点连接关系生成所述目标神经网络的结构包括:
    根据所述节点类型例化各节点;
    根据所述节点连接关系确定所述各节点的父节点;
    连接各节点及所述各节点对应的父节点以生成所述目标神经网络的结构。
  5. 如权利要求1或2所述的方法,其特征在于,所述网络描述文件的生成过程包括:
    确定所述目标神经网络的代数,以及所述目标神经网络所有代的节点的节点类型和节点个数;
    根据预设的节点连接要求确定连接所述目标神经网络中所有节点的目标连接方式;
    根据所述目标神经网络的代数、每代节点的节点类型、节点个数以及所述目标连接方式生成所述网络描述文件。
  6. 如权利要求5所述的方法,其特征在于,所述根据预设的节点连接要求确定连接所述目标神经网络中所有节点的目标连接方式,包括:
    根据所述节点连接要求确定当前节点的候选父节点,其中,所述当前节点和所述候选父节点满足所述节点连接要求;
    从所述候选父节点中选择出所述当前节点的实际父节点;
    确定所述当前节点与所述当前节点的实际父节点之间的连接关系,以最终生成所述目标连接方式。
  7. 如权利要求6所述的方法,其特征在于,所述根据所述节点连接要求确定所述当前节点的候选父节点,包括:
    根据以下连接关系中的至少一种,确定所述当前节点的候选父节点;
    在当前节点的节点类型为Concat或Eltwise时,所述当前节点的父节点个数为多个,且所述当前节点的父节点个数小于或者等于所述当前节点的候选父节点个数;
    在所述当前节点的父节点的节点类型为Active时,所述当前节点的节点类型为Active之外的类型;
    在所述当前节点的父节点的节点类型为Global Pooling时,所述当前节点的节点类型为Global Pooling;
    在所述当前节点的父节点的节点类型为FC时,所述当前节点的节点类型为FC或者Concat;
    在所述当前节点的父节点的节点类型为Conv、Eltwise、Pooling以及Concat时,所述当前节点的节点类型可以为Conv、Eltwise、Pooling、Active、Global Pooling、Concat以及FC中的任意一种。
  8. 如权利要求5-7中任一项所述的方法,其特征在于,根据所述目标神经网络的代数、每代节点的节点类型、节点个数以及所述目标连接方式生成所述网络描述文件,包括:
    根据预设的节点有效连接关系,从所述目标连接关系中确定出有效目标连接关系;
    根据所述有效目标连接关系生成所述目标神经网络。
  9. 如权利要求8所述的方法,其特征在于,所述节点有效连接关系包括下列关系中的至少一种:
    在所述当前节点的节点类型为Eltwise时,所述当前节点的多个输入的通道数保持一致;
    在所述当前节点的节点类型为FC或者Global Pooling时,所述当前节点之后不能连接FC、Global Pooling和act类型之外的其它类型节点。
  10. 如权利要求5-9中任一项所述的方法,其特征在于,所述确定待生成的目标神经网络的代数,以及所述目标神经网络所有代的节点的节点类型和节点个数,包括:
    根据对所述目标神经网络的运算要求确定所述目标神经网络的代数,以及所述目标神经网络所有代的节点的节点类型和节点个数。
  11. 如权利要求1或2所述的方法,其特征在于,所述网络描述文件的生成过程包括:
    根据加速器的结构类型确定所述目标神经网络所有代的节点的节点类型以及节点连接关系;
    根据所述节点类型和所述节点连接关系生成所述网络描述文件。
  12. 如权利要求11所述的方法,其特征在于,所述根据加速器的结构类型确定所述目标神经网络所有代的节点的节点类型以及节点连接关系包括:
    确定所述目标神经网络中的子网络的数量;
    依据加速器的运算单元种类确定每个所述子网络内部的节点的节点类型;
    确定每个所述子网络内部的节点连接关系,并根据所述每个所述子网络内部的节点连接关系确定所述子网络之间的节点连接关系;
    根据每个所述子网络内部的节点的节点类型、每个所述子网络内部的节点连接关系以及所述子网络之间的节点连接关系确定所述目标神经网络所有代的节点的节点类型以及节点连接关系。
  13. 如权利要求12所述的方法,其特征在于,所述依据加速器的运算单元种类确定每个所述子网络内部的节点的节点类型包括:
    选取所述加速器的运算单元种类中的至少一个作为一个所述子网络的节点的节点类型,其中一个所述子网络中各节点的节点类型不同;
    重复上一步骤直至确定每个所述子网络内部的节点的节点类型。
  14. 如权利要求12或13所述的方法,其特征在于,所述确定每个所述子网络内部的节点连接关系,并根据所述每个所述子网络内部的节点连接关系确定所述子网络之间的节点连接关系包括:
    选取一个节点作为一个所述子网络的输入节点;
    通过随机或遍历方式确定所述子网络内部除所述输入节点之外的每个节点的父节点;
    确定每个所述子网络的输出节点;
    根据所述子网络的所述输入节点和所述输出节点,通过随机或遍历方式确定所述子网络之间的节点连接关系。
  15. 一种神经网络的生成方法,其特征在于,包括:
    获取并解析至少一个所述网络描述文件,每个所述网络描述文件包括一个所述目标神经网络的所有代的节点的节点类型和节点连接关系;
    根据所述节点类型和所述节点连接关系生成所述目标神经网络的结构;
    根据预设节点间参数约束条件生成所述目标神经网络的节点参数;
    根据所述目标神经网络的结构和所述节点参数生成所述目标神经网络。
  16. 如权利要求15所述的方法,其特征在于,所述网络描述文件为文本文件,所述网络描述文件的第一行包括注释,用于描述可选节点类型;所述网络描述文件的第n行(n为大于1的整数)描述所述目标神经网络中第n-1代的所有节点的节点信息,所述节点信息包括节点类型和节点连接关系;其中,n为整数。
  17. 如权利要求15所述的方法,其特征在于,所述根据所述节点类型和所述节点连接关系生成所述目标神经网络的结构包括:
    根据所述节点类型例化各节点;
    根据所述节点连接关系确定所述各节点的父节点;
    连接各节点及所述各节点对应的父节点以生成所述目标神经网络的结构。
  18. 如权利要求15所述的方法,其特征在于,所述网络描述文件的生成过程包括:
    确定所述目标神经网络的代数,以及所述目标神经网络所有代的节点的节点类型和节点个数;
    根据预设的节点连接要求确定连接所述目标神经网络中所有节点的目标连接方式;
    根据所述目标神经网络的代数、每代节点的节点类型、节点个数以及所述目标连接方式生成所述网络描述文件。
  19. 如权利要求18所述的方法,其特征在于,所述根据预设的节点连接要求确定连接所述目标神经网络中所有节点的目标连接方式,包括:
    根据所述节点连接要求确定当前节点的候选父节点,其中,所述当前节点和所述候选父节点满足所述节点连接要求;
    从所述候选父节点中选择出所述当前节点的实际父节点;
    确定所述当前节点与所述当前节点的实际父节点之间的连接关系,以最终生成所述目标连接方式。
  20. 如权利要求19所述的方法,其特征在于,所述根据所述节点连接要求确定所述当前节点的候选父节点,包括:
    根据以下连接关系中的至少一种,确定所述当前节点的候选父节点;
    在当前节点的节点类型为Concat或Eltwise时,所述当前节点的父节点个数为多个,且所述当前节点的父节点个数小于或者等于所述当前节点的候选父节点个数;
    在所述当前节点的父节点的节点类型为Active时,所述当前节点的节点类型为Active之外的类型;
    在所述当前节点的父节点的节点类型为Global Pooling时,所述当前节点的节点类型为Global Pooling;
    在所述当前节点的父节点的节点类型为FC时,所述当前节点的节点类型为FC或者Concat;
    在所述当前节点的父节点的节点类型为Conv、Eltwise、Pooling以及Concat时,所述当前节点的节点类型可以为Conv、Eltwise、Pooling、Active、Global Pooling、Concat以及FC中的任意一种。
  21. 如权利要求18-20中任一项所述的方法,其特征在于,根据所述目标神经网络的代数、每代节点的节点类型、节点个数以及所述目标连接方式生成所述网络描述文件,包括:
    根据预设的节点有效连接关系,从所述目标连接关系中确定出有效目标连接关系;
    根据所述有效目标连接关系生成所述目标神经网络。
  22. 如权利要求21所述的方法,其特征在于,所述节点有效连接关系包括下列关系中的至少一种:
    在所述当前节点的节点类型为Eltwise时,所述当前节点的多个输入的通道数保持一致;
    在所述当前节点的节点类型为FC或者Global Pooling时,所述当前节点之后不能连接FC、Global Pooling和act类型之外的其它类型节点。
  23. 如权利要求18-22中任一项所述的方法,其特征在于,所述确定待生成的目标神经网络的代数,以及所述目标神经网络所有代的节点的节点类型和节点个数,包括:
    根据对所述目标神经网络的运算要求确定所述目标神经网络的代数,以及所述目标神经网络所有代的节点的节点类型和节点个数。
  24. 如权利要求15所述的方法,其特征在于,所述网络描述文件的生成过程包括:
    根据加速器的结构类型确定所述目标神经网络所有代的节点的节点类型以及节点连接关系;
    根据所述节点类型和所述节点连接关系生成所述网络描述文件。
  25. 如权利要求24所述的方法,其特征在于,所述根据加速器的结构类型确定所述目标神经网络所有代的节点的节点类型以及节点连接关系包括:
    确定所述目标神经网络中的子网络的数量;
    依据加速器的运算单元种类确定每个所述子网络内部的节点的节点类型;
    确定每个所述子网络内部的节点连接关系,并根据所述每个所述子网络内部的节点连接关系确定所述子网络之间的节点连接关系;
    根据每个所述子网络内部的节点的节点类型、每个所述子网络内部的节点连接关系以及所述子网络之间的节点连接关系确定所述目标神经网络所有代的节点的节点类型以及节点连接关系。
  26. 如权利要求25所述的方法,其特征在于,所述依据加速器的运算单元种类确定每个所述子网络内部的节点的节点类型包括:
    选取所述加速器的运算单元种类中的至少一个作为一个所述子网络的节点的节点类型,其中一个所述子网络中各节点的节点类型不同;
    重复上一步骤直至确定每个所述子网络内部的节点的节点类型。
  27. 如权利要求25或26所述的方法,其特征在于,所述确定每个所述子网络内部的节点连接关系,并根据所述每个所述子网络内部的节点连接关系确定所述子网络之间的节点连接关系包括:
    选取一个节点作为一个所述子网络的输入节点;
    通过随机或遍历方式确定所述子网络内部除所述输入节点之外的每个节点的父节点;
    确定每个所述子网络的输出节点;
    根据所述子网络的所述输入节点和所述输出节点,通过随机或遍历方式确定所述子网络之间的节点连接关系。
  28. 一种数据处理方法,其特征在于,包括:
    获取并解析至少一个所述网络描述文件,每个所述网络描述文件包括一个所述目标神经网络的所有代的节点的节点类型和节点连接关系;
    根据所述节点类型和所述节点连接关系生成所述目标神经网络的结构;
    根据预设节点间参数约束条件生成所述目标神经网络的节点参数;
    根据所述目标神经网络的结构和所述节点参数生成目标神经网络;
    采用所述目标神经网络进行数据处理。
  29. 一种网络描述文件的生成方法,其特征在于,包括:
    确定所述目标神经网络的代数,以及所述目标神经网络所有代的节点的节点类型和节点个数;
    根据预设的节点连接要求确定连接所述目标神经网络中所有节点的目标连接方式;
    根据所述目标神经网络的代数、每代节点的节点类型、节点个数以及所述目标连接方式生成所述网络描述文件。
  30. 如权利要求29所述的方法,其特征在于,所述根据预设的节点连接要求确定连接所述目标神经网络中所有节点的目标连接方式,包括:
    根据所述节点连接要求确定当前节点的候选父节点,其中,所述当前节点和所述候选父节点满足所述节点连接要求;
    从所述候选父节点中选择出所述当前节点的实际父节点;
    确定所述当前节点与所述当前节点的实际父节点之间的连接关系,以最终生成所述目标连接方式。
  31. 如权利要求30所述的方法,其特征在于,所述根据所述节点连接要求确定所述当前节点的候选父节点,包括:
    根据以下连接关系中的至少一种,确定所述当前节点的候选父节点;
    在当前节点的节点类型为Concat或Eltwise时,所述当前节点的父节点个数为多个,且所述当前节点的父节点个数小于或者等于所述当前节点的候选父节点个数;
    在所述当前节点的父节点的节点类型为Active时,所述当前节点的节点类型为Active之外的类型;
    在所述当前节点的父节点的节点类型为Global Pooling时,所述当前节点的节点类型为Global Pooling;
    在所述当前节点的父节点的节点类型为FC时,所述当前节点的节点类型为FC或者Concat;
    在所述当前节点的父节点的节点类型为Conv、Eltwise、Pooling以及Concat时,所述当前节点的节点类型可以为Conv、Eltwise、Pooling、Active、Global Pooling、Concat以及FC中的任意一种。
  32. 如权利要求29-31中任一项所述的方法,其特征在于,根据所述目标神经网络的代数、每代节点的节点类型、节点个数以及所述目标连接方式生成所述网络描述文件,包括:
    根据预设的节点有效连接关系,从所述目标连接关系中确定出有效目标连接关系;
    根据所述有效目标连接关系生成所述目标神经网络。
  33. 如权利要求32所述的方法,其特征在于,所述节点有效连接关系包括下列关系中的至少一种:
    在所述当前节点的节点类型为Eltwise时,所述当前节点的多个输入的通道数保持一致;
    在所述当前节点的节点类型为FC或者Global Pooling时,所述当前节点之后不能连接FC、Global Pooling和act类型之外的其它类型节点。
  34. 如权利要求29-33中任一项所述的方法,其特征在于,所述确定待生成的目标神经网络的代数,以及所述目标神经网络所有代的节点的节点类型和节点个数,包括:
    根据对所述目标神经网络的运算要求确定所述目标神经网络的代数,以及所述目标神经网络所有代的节点的节点类型和节点个数。
  35. 一种网络描述文件的生成方法,其特征在于,包括:
    根据加速器的结构类型确定所述目标神经网络所有代的节点的节点类型以及节点连接关系;
    根据所述节点类型和所述节点连接关系生成所述网络描述文件。
  36. 如权利要求35所述的方法,其特征在于,所述根据加速器的结构类型确定所述目标神经网络所有代的节点的节点类型以及节点连接关系包括:
    确定所述目标神经网络中的子网络的数量;
    依据加速器的运算单元种类确定每个所述子网络内部的节点的节点类型;
    确定每个所述子网络内部的节点连接关系,并根据所述每个所述子网络内部的节点连接关系确定所述子网络之间的节点连接关系;
    根据每个所述子网络内部的节点的节点类型、每个所述子网络内部的节点连接关系以及所述子网络之间的节点连接关系确定所述目标神经网络所有代的节点的节点类型以及节点连接关系。
  37. 如权利要求36所述的方法,其特征在于,所述依据加速器的运算单元种类确定每个所述子网络内部的节点的节点类型包括:
    选取所述加速器的运算单元种类中的至少一个作为一个所述子网络的节点的节点类型,其中一个所述子网络中各节点的节点类型不同;
    重复上一步骤直至确定每个所述子网络内部的节点的节点类型。
  38. 如权利要求36或37所述的方法,其特征在于,所述确定每个所述子网络内部的节点连接关系,并根据所述每个所述子网络内部的节点连接关系确定所述子网络之间的节点连接关系包括:
    选取一个节点作为一个所述子网络的输入节点;
    通过随机或遍历方式确定所述子网络内部除所述输入节点之外的每个节点的父节点;
    确定每个所述子网络的输出节点;
    根据所述子网络的所述输入节点和所述输出节点,通过随机或遍历方式确定所述子网络之间的节点连接关系。
  39. 一种加速器的验证平台,其特征在于,包括:
    存储器,用于存储代码;
    至少一个处理器,用于执行所述存储器中存储的代码,以执行如权利要求1-14任一项所述的方法。
  40. 一种神经网络的生成装置,其特征在于,包括:
    存储器,用于存储代码;
    至少一个处理器,用于执行所述存储器中存储的代码,以执行如权利要求15-27任一项所述的方法。
  41. 一种数据处理装置,其特征在于,包括:
    存储器,用于存储代码;
    至少一个处理器,用于执行所述存储器中存储的代码,以执行如下操作:
    获取并解析至少一个所述网络描述文件,每个所述网络描述文件包括一个所述目标神经网络的所有代的节点的节点类型和节点连接关系;
    根据所述节点类型和所述节点连接关系生成所述目标神经网络的结构;
    根据预设节点间参数约束条件生成所述目标神经网络的节点参数;
    根据所述目标神经网络的结构和所述节点参数生成目标神经网络;
    采用所述目标神经网络进行数据处理。
  42. 一种网络描述文件的生成装置,其特征在于,包括:
    存储器,用于存储代码;
    至少一个处理器,用于执行所述存储器中存储的代码,以执行如权利要求29-34任一项所述的方法。
  43. 一种网络描述文件的生成装置,其特征在于,包括:
    存储器,用于存储代码;
    至少一个处理器,用于执行所述存储器中存储的代码,以执行如权利要求35-38任一项所述的方法。
PCT/CN2020/080742 2020-03-23 2020-03-23 加速器的检测方法和验证平台 WO2021189209A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/080742 WO2021189209A1 (zh) 2020-03-23 2020-03-23 加速器的检测方法和验证平台

Publications (1)

Publication Number Publication Date
WO2021189209A1 true WO2021189209A1 (zh) 2021-09-30

Family

ID=77890828

Country Status (1)

Country Link
WO (1) WO2021189209A1 (zh)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951961A (zh) * 2017-02-24 2017-07-14 清华大学 一种粗粒度可重构的卷积神经网络加速器及系统
US20180114117A1 (en) * 2016-10-21 2018-04-26 International Business Machines Corporation Accelerate deep neural network in an fpga
CN108280514A (zh) * 2018-01-05 2018-07-13 中国科学技术大学 基于fpga的稀疏神经网络加速系统和设计方法
CN108389183A (zh) * 2018-01-24 2018-08-10 上海交通大学 肺部结节检测神经网络加速器及其控制方法
CN108537328A (zh) * 2018-04-13 2018-09-14 众安信息技术服务有限公司 用于可视化构建神经网络的方法
CN108734270A (zh) * 2018-03-23 2018-11-02 中国科学院计算技术研究所 一种兼容型神经网络加速器及数据处理方法
CN109635949A (zh) * 2018-12-31 2019-04-16 浙江新铭智能科技有限公司 一种神经网络生成方法与装置


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20927786

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20927786

Country of ref document: EP

Kind code of ref document: A1