CN111582474B - Neural network structure detection method, training method and training device of structure detection model - Google Patents

Neural network structure detection method, training method and training device of structure detection model

Info

Publication number
CN111582474B
Authority
CN
China
Prior art keywords
network
layer
training
sample
completion time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010331224.6A
Other languages
Chinese (zh)
Other versions
CN111582474A (en
Inventor
赵恒�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Wuqi Nanjing Technology Co ltd
Original Assignee
Zhongke Wuqi Nanjing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Wuqi Nanjing Technology Co ltd filed Critical Zhongke Wuqi Nanjing Technology Co ltd
Priority to CN202010331224.6A priority Critical patent/CN111582474B/en
Publication of CN111582474A publication Critical patent/CN111582474A/en
Application granted granted Critical
Publication of CN111582474B publication Critical patent/CN111582474B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the invention relate to a neural network structure detection method, and to a training method and a training device for a structure detection model. The neural network structure detection method comprises the following steps: obtaining the calculation completion time of each layer of a reference network when the reference network runs on a multi-core platform; inputting the calculation completion time of each layer into a trained structure detection model to obtain the structure information of a target network running on the same multi-core platform as the reference network; and reconstructing a topological structure according to the structure information of the target network to obtain the structure of the target network. The structure of the target network can therefore be detected at lower cost, and the detection result has higher robustness and accuracy.

Description

Neural network structure detection method, training method and training device of structure detection model
Technical Field
The embodiment of the invention relates to the technical field of neural networks, in particular to a neural network structure detection method, a training method of a structure detection model and a training device of the structure detection model.
Background
In recent years, neural networks have been widely used in fields such as image recognition, language recognition, and malware detection. However, this wide use has been accompanied by illegal activity; for example, hackers carry out attacks by implanting a neural network on a platform. How to counter such attacks has therefore become particularly important.
To counter such attacks, the structure of the neural network implanted by the hacker can be obtained, so that the input data and training data of that neural network can be inferred and effective adversarial samples can be generated, thereby degrading the performance of the neural network until it produces erroneous prediction results. The key to this approach is how to accurately obtain the structure of the neural network.
Disclosure of Invention
In view of the above, in order to solve the above technical problems or some of the technical problems, embodiments of the present invention provide a neural network structure detection method, a training method of a structure detection model, and a device thereof.
In a first aspect, an embodiment of the present invention provides a method for constructing a structure detection model, where the method includes:
generating a sample network set;
obtaining a training sample set according to the sample network set, wherein each training sample takes the calculation completion time of each layer of a reference network running on the same multi-core platform with the sample network as an input value and the structural information of the sample network as a label value;
and training the training sample set by using a set algorithm to obtain a structure detection model, wherein the structure detection model takes the calculation completion time of each layer of the reference network as an input value and takes the structure information of a target network running on the same multi-core platform with the reference network as an output value.
In one possible embodiment, the sample networks in the sample network set are constructed by:
combining the structural parameters in each set of candidate parameters according to the set combination rule to obtain a plurality of groups of layer structural parameters;
when a sample network is constructed, selecting a group of layer structure parameters from the plurality of groups of layer structure parameters for each layer of the sample network, and constructing the layer according to the selected layer structure parameters to obtain the sample network.
In one possible embodiment, the obtaining a training sample set from a sample network set includes:
for each sample network in the sample network set, obtaining the calculation completion time of each layer of a reference network running on the same multi-core platform with the sample network, and obtaining the layer number of the sample network, the layer structure of each layer and the layer position; and taking the number of layers, the layer structure of each layer and the layer positions as the structure information.
In one possible embodiment, the structure detection model includes:
a depth prediction network, which takes the calculation completion time of each layer of the reference network as an input value and the number of layers of the target network running on the same multi-core platform as the reference network as an output value; and
a layer structure estimation network, which comprises a layer position detection network and a layer structure detection network, wherein the layer position detection network takes the calculation completion time of each layer of the reference network as an input value and takes the layer position of each layer of the target network running on the same multi-core platform as the reference network as an output value; the layer structure detection network takes the calculation completion time of each layer of the reference network as an input value and takes the layer structure of each layer of the target network running on the same multi-core platform as the reference network as an output value.
In one possible implementation, training the training sample set by using a set algorithm to obtain a depth prediction network in a structure detection model includes:
training the training sample set by using a supervised learning algorithm, and performing back propagation on the depth prediction network obtained by training according to a first loss function until a set first training stop condition is met; the first loss function represents the mean square error between the number of layers of the sample network and the output value of the depth prediction network.
In one possible embodiment, obtaining the layer position of each layer of the sample network includes:
for each layer of the sample network, when completion of the calculation of that layer is detected, determining the layer position of the layer currently being calculated by the reference network, and taking that layer position as the layer position of the layer in the sample network;
training the training sample set by using a set algorithm to obtain a layer position detection network in a structure detection model, wherein the method comprises the following steps:
training the training sample set by using a supervised learning algorithm, and performing back propagation on the layer position detection network obtained by training according to a second loss function until a set second training stop condition is met; the second loss function represents the cross entropy between the layer positions in the sample network and the output values of the layer position detection network.
In one possible implementation manner, training the training sample set by using a set algorithm to obtain a layer structure detection network in a structure detection model includes:
training the training sample set by using a supervised learning algorithm, and performing back propagation on the layer structure detection network obtained by training according to a third loss function until a set third training stop condition is met; the third loss function represents the cross entropy between the layer structures in the sample network and the output values of the layer structure detection network.
In a second aspect, an embodiment of the present invention provides a neural network structure detection method, where the method includes:
obtaining the calculation completion time of each layer of the reference network when the target network and the reference network run on the multi-core platform;
inputting the calculation completion time of each layer into a trained structure detection model to obtain the structure information of a target network running on the same multi-core platform with the reference network;
and reconstructing a topological structure according to the structural information of the target network to obtain the structure of the target network.
In one possible implementation manner, the reconstructing the topology according to the structure information of the target network to obtain the structure of the target network includes:
determining the layer structure of each layer of the target network in the structure information according to the layer position of the layer in the structure information;
and carrying out topology structure reconstruction according to the layer position and the layer structure of each layer of the target network to obtain the structure of the target network.
In a third aspect, an embodiment of the present invention provides a device for constructing a structure detection model, where the device includes:
a first generation module for generating a sample network set;
The second generation module is used for obtaining a training sample set according to the sample network set, wherein each training sample takes the calculation completion time of each layer of a reference network running on the same multi-core platform with the sample network as an input value and takes the structural information of the sample network as a label value;
the training module is used for training the training sample set by using a set algorithm to obtain a structure detection model, wherein the structure detection model takes the calculation completion time of each layer of the reference network as an input value and takes the structure information of a target network running on the same multi-core platform as the reference network as an output value.
In one possible implementation, generating the sample network by the first generation module includes:
combining the structural parameters in each set of candidate parameters according to the set combination rule to obtain a plurality of groups of layer structural parameters;
when a sample network is constructed, selecting a group of layer structure parameters from the plurality of groups of layer structure parameters for each layer of the sample network, and constructing the layer according to the selected layer structure parameters to obtain the sample network.
In a possible implementation manner, the second generating module obtains a training sample set according to a sample network set, including:
For each sample network in the sample network set, obtaining the calculation completion time of each layer of a reference network running on the same multi-core platform with the sample network, and obtaining the layer number of the sample network, the layer structure of each layer and the layer position; and taking the number of layers, the layer structure of each layer and the layer positions as the structure information.
In a possible embodiment, the structure detection model includes:
a depth prediction network, which takes the calculation completion time of each layer of the reference network as an input value and the number of layers of the target network running on the same multi-core platform as the reference network as an output value; and
a layer structure estimation network, which comprises a layer position detection network and a layer structure detection network, wherein the layer position detection network takes the calculation completion time of each layer of the reference network as an input value and takes the layer position of each layer of the target network running on the same multi-core platform as the reference network as an output value; the layer structure detection network takes the calculation completion time of each layer of the reference network as an input value and takes the layer structure of each layer of the target network running on the same multi-core platform as the reference network as an output value.
In one possible implementation manner, the training module trains the training sample set by using a set algorithm to obtain a depth prediction network in a structure detection model, including:
training the training sample set by using a supervised learning algorithm, and performing back propagation on the depth prediction network obtained by training according to a first loss function until a set first training stop condition is met; the first loss function represents the mean square error between the number of layers of the sample network and the output value of the depth prediction network.
In a possible implementation manner, the second generating module obtains a layer position of each layer of the sample network, including:
for each layer of the sample network, determining a layer position of a layer currently calculated by the reference network when the layer calculation is detected to be completed; determining a layer position of a layer currently calculated by the reference network as a layer position of the layer in the sample network;
the training module trains the training sample set by using a set algorithm to obtain a layer position detection network in the structure detection model, and the training module comprises the following steps:
training the training sample set by using a supervised learning algorithm, and performing back propagation on the layer position detection network obtained by training according to a second loss function until a set second training stop condition is met; the second loss function represents the cross entropy between the layer positions in the sample network and the output values of the layer position detection network.
In one possible implementation manner, the training module trains the training sample set by using a set algorithm to obtain a layer structure detection network in a structure detection model, including:
training the training sample set by using a supervised learning algorithm, and performing back propagation on the layer structure detection network obtained by training according to a third loss function until a set third training stop condition is met; the third loss function represents the cross entropy between the layer structures in the sample network and the output values of the layer structure detection network.
In a fourth aspect, an embodiment of the present invention provides a neural network structure detection apparatus, including:
the time obtaining module is used for obtaining the calculation completion time of each layer of the reference network when the target network and the reference network run on the multi-core platform;
the model input module is used for inputting the calculation completion time of each layer into a trained structure detection model to obtain the structure information of the target network;
and the structure reconstruction module is used for reconstructing a topological structure according to the structure information of the target network to obtain the structure of the target network.
In a possible implementation manner, the structure reconstruction module performs topology structure reconstruction according to the structure information of the target network obtained by the model input module, to obtain the structure of the target network, and includes:
Determining the layer structure of each layer of the target network in the structure information according to the layer position of the layer in the structure information;
and carrying out topology structure reconstruction according to the layer position and the layer structure of each layer of the target network to obtain the structure of the target network.
In a fifth aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus;
the processor, the communication interface and the memory communicate with each other through the communication bus;
the memory is used for storing a computer program;
the processor is configured to execute a computer program stored in the memory, where the step of the neural network structure detection method or the training method of the structure detection model provided by the embodiment of the invention is implemented when the processor executes the computer program.
In a sixth aspect, an embodiment of the present invention provides a computer readable storage medium, where a computer program is stored, where the computer program when executed by a processor implements the steps of the neural network structure detection method or the training method of the structure detection model provided by the embodiment of the present invention.
According to the method provided by the embodiments of the invention, a sample network set is generated and a training sample set is obtained from it, wherein each training sample takes the calculation completion time of each layer of a reference network running on the same multi-core platform as the sample network as an input value and the structure information of the sample network as a label value; the training sample set is then trained by using a set algorithm to obtain a structure detection model capable of predicting the structure information of the target network.
Further, the calculation completion time of each layer of the reference network is obtained when the reference network runs on the multi-core platform and is input into the trained structure detection model, so that the structure information of the target network running on the same multi-core platform as the reference network is obtained; the topological structure is then reconstructed according to that structure information to obtain the structure of the target network. In this way, the structure of the target network can be detected at lower cost, and the detection result has higher robustness and accuracy.
Drawings
FIG. 1 is a flowchart of an embodiment of a method for constructing a structure detection model according to an embodiment of the present invention;
FIG. 2 is a flowchart of an embodiment of a neural network structure detection method according to an embodiment of the present invention;
FIG. 3 is a block diagram of an embodiment of a device for constructing a structure detection model according to an embodiment of the present invention;
FIG. 4 is a block diagram of an embodiment of a neural network structure detection device according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
For the purpose of facilitating an understanding of the embodiments of the present invention, reference will now be made to the following description of specific embodiments, taken in conjunction with the accompanying drawings, which are not intended to limit the embodiments of the invention.
Currently, with the continuous development of cloud platforms, deploying neural networks on remote cloud platforms has become a mainstream trend, and most cloud platforms use multi-core architectures, such as the Cloud TPU that Google deploys in the cloud. In such an architecture, only one neural network model is allowed to run on one processor, and different neural network models are deployed on different processors. However, because memory is shared, information leaks through a memory timing side channel: when multiple processors access memory, they compete with each other and affect each other's memory latency, so private information on other processor cores can be inferred from the observed changes in latency. The present invention is based on this observation.
For ease of understanding, the method for constructing the structure detection model according to the present invention is described below with specific embodiments:
Referring to FIG. 1, a flowchart of an embodiment of a method for constructing a structure detection model according to an embodiment of the present invention is shown, which includes the following steps:
Step 101: A sample network set is generated.
It should be appreciated that the sample network set includes a plurality of sample networks. As an embodiment, the user may preset a selectable range for each type of neural network structural parameter (hereinafter referred to as a structural parameter), and a candidate parameter set may be obtained from the range set by the user. For example, if the preset selectable range of the structural parameter for network nodes is 1-5, the candidate parameter set corresponding to that structural parameter is {1, 2, 3, 4, 5}.
In this embodiment, the structural parameters in each candidate parameter set may be combined according to a set combination rule, for example a random-selection rule or a full-permutation rule, to obtain multiple groups of layer structure parameters. Then, when a sample network is constructed, for each layer of the sample network a group of layer structure parameters can be randomly selected from the multiple groups of layer structure parameters, and the layer is constructed according to the selected layer structure parameters, so that the sample network is obtained.
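The generation procedure described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the parameter names (`kernel_size`, `channels`, `stride`), their ranges, and the helper function names are all hypothetical, and the full-permutation rule is used as the combination rule.

```python
import itertools
import random

# Hypothetical candidate parameter sets; names and ranges are illustrative only.
CANDIDATES = {
    "kernel_size": [1, 3, 5],
    "channels": [16, 32],
    "stride": [1, 2],
}

def layer_parameter_groups(candidates):
    """Full-permutation rule: one group per combination of candidate values."""
    names = sorted(candidates)
    value_lists = [candidates[name] for name in names]
    return [dict(zip(names, values)) for values in itertools.product(*value_lists)]

def build_sample_network(groups, num_layers, rng):
    """Randomly select one group of layer structure parameters for each layer."""
    return [rng.choice(groups) for _ in range(num_layers)]

def generate_sample_network_set(candidates, count, max_layers, seed=0):
    """Build `count` sample networks, each with 1..max_layers layers."""
    rng = random.Random(seed)
    groups = layer_parameter_groups(candidates)
    return [build_sample_network(groups, rng.randint(1, max_layers), rng)
            for _ in range(count)]

groups = layer_parameter_groups(CANDIDATES)
networks = generate_sample_network_set(CANDIDATES, count=4, max_layers=6)
```

Each sample network here is just a list of parameter dictionaries; turning such a description into a runnable network depends on the framework in use and is outside this sketch.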
Step 102: a training sample set is obtained from the sample network set.
In the embodiment of the invention, each training sample takes the calculation completion time of each layer of a reference network running on the same multi-core platform as a sample network as an input value and takes the structural information of the sample network as a label value. Generally, the structural information referred to herein may include the number of layers, the layer position, and the layer structure.
Based on the above, in the embodiment of the invention, the reference network and the sample network can be deployed on the same multi-core platform and controlled to run on it at the same time. After the reference network finishes its calculation, the calculation completion time of each layer of the reference network can be obtained and is denoted $T = \{t_1, t_2, \ldots, t_n\}$, where $t_1$ is the calculation completion time of the first layer of the reference network, $t_2$ is the calculation completion time of the second layer of the reference network, and so on, and $t_n$ is the calculation completion time of the n-th layer of the reference network.
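As a rough illustration of how the per-layer completion times T might be recorded, the sketch below runs a list of layer callables in order and timestamps the end of each one. It is an assumption-laden toy: the function names and the stand-in layers are hypothetical, everything runs in a single process, and no target network is competing for memory, so the recorded values show only the data layout of T, not the side-channel effect.

```python
import time

def run_reference_network(layers, data):
    """Run the layers in order and record T = [t_1, ..., t_n].

    Each t_i is the time elapsed from the start of the run until layer i
    finishes, measured with a monotonic clock.
    """
    start = time.perf_counter()
    completion_times = []
    for layer in layers:
        data = layer(data)
        completion_times.append(time.perf_counter() - start)
    return data, completion_times

def make_copy_layer(offset):
    """Hypothetical stand-in layer that streams over a buffer once."""
    def layer(buffer):
        return [value + offset for value in buffer]
    return layer

reference_layers = [make_copy_layer(i) for i in range(1, 6)]
output, T = run_reference_network(reference_layers, [0] * 10000)
```

Because each entry is measured from the same start point, T is non-decreasing, which is what later allows an event time to be mapped to the reference layer in progress.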
It should be noted that, in the embodiment of the present invention, the reference network is designed as a multi-layer neural network so as to ensure that its calculation completion time is longer than that of the target network. The target network is the neural network whose structure is to be detected, such as a neural network implanted on a platform by a hacker to launch an attack. In addition, each layer of the reference network has the following characteristic: over the whole calculation process of the layer, memory access accounts for a large proportion of the total time, that is, the time spent on memory access is longer than the time spent on pure computation. This design captures as much as possible of the memory access interference caused by the target network.
Step 103: The training sample set is trained by using a set algorithm to obtain a structure detection model.
Firstly, in the embodiment of the present invention, the structure detection model uses the calculation completion time of each layer of the reference network as an input value, and uses the structure information of the target network running on the same multi-core platform as the reference network as an output value, that is, the structure of the neural network can be detected by applying the structure detection model in the embodiment of the present invention.
As an embodiment, the structure detection model may include two parts, namely a depth prediction network and a layer structure estimation network. The depth prediction network takes the calculation completion time of each layer of the reference network as an input value and the number of layers of the target network running on the same multi-core platform as the reference network as an output value; that is, the number of layers of a neural network can be detected by applying the depth prediction network. The layer structure estimation network takes the calculation completion time of each layer of the reference network as an input value and the layer position and layer structure of each layer of the target network running on the same multi-core platform as the reference network as an output value; that is, the layer position and layer structure of each layer of a neural network can be detected by applying the layer structure estimation network. The training process of the two parts is described below:
Depth prediction network:
As an embodiment, the depth prediction network may be implemented based on a multi-layer perceptron (Multilayer Perceptron, abbreviated as MLP). In practice, the training sample set may be trained using a supervised learning algorithm, and back propagation is performed on the depth prediction network obtained by training according to the first loss function until a set first training stop condition is met. The first loss function represents the mean square error between the number of layers of the sample network and the output value of the depth prediction network; in other words, it represents the mean square error between the true number of layers of the sample network and the number of layers predicted by the depth prediction network. Correspondingly, the first training stop condition may be that the mean square error is less than a set mean square error threshold.
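The first loss function and the first training stop condition can be illustrated with a deliberately reduced sketch. To stay dependency-free, a single linear unit stands in for the multi-layer perceptron, and the input is one hypothetical standardized timing feature per sample rather than the full vector T; only the mean square error loss and the threshold-based stop condition correspond to the text above. All names and values are assumptions.

```python
def mse(predictions, targets):
    """Mean square error between predicted and true layer counts."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def train_depth_predictor(xs, layer_counts, lr=0.1, mse_threshold=1e-4,
                          max_epochs=1000):
    """Gradient descent on MSE; stops once the loss drops below the threshold."""
    w, b = 0.0, 0.0
    n = len(xs)
    for epoch in range(max_epochs):
        preds = [w * x + b for x in xs]
        loss = mse(preds, layer_counts)
        if loss < mse_threshold:  # first training stop condition
            return w, b, loss, epoch
        # Gradients of the MSE with respect to w and b.
        grad_w = 2.0 / n * sum((p - t) * x for p, t, x in zip(preds, layer_counts, xs))
        grad_b = 2.0 / n * sum(p - t for p, t in zip(preds, layer_counts))
        w -= lr * grad_w
        b -= lr * grad_b
    preds = [w * x + b for x in xs]
    return w, b, mse(preds, layer_counts), max_epochs

# Hypothetical standardized timing feature and true layer counts.
xs = [-1.5, -0.5, 0.5, 1.5]
layer_counts = [2, 3, 4, 5]
w, b, final_loss, epochs_used = train_depth_predictor(xs, layer_counts)
predicted_layers = round(w * 0.5 + b)
```

Since the network outputs a real number, the predicted layer count is obtained here by rounding; whether the original method rounds or uses another rule is not stated in the text.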
Layer structure estimation network:
As one embodiment, the layer structure estimation network includes a layer position detection network and a layer structure detection network. The layer position detection network takes the calculation completion time of each layer of the reference network as an input value and the layer position of each layer of the target network running on the same multi-core platform as the reference network as an output value; that is, the layer position of each layer of a neural network can be detected by applying the layer position detection network. The layer structure detection network takes the calculation completion time of each layer of the reference network as an input value and the layer structure of each layer of the target network running on the same multi-core platform as the reference network as an output value; that is, the layer structure of each layer of a neural network can be detected by applying the layer structure detection network.
As an embodiment, the layer position detection network is implemented based on a long short-term memory network (Long Short Term Memory Network, LSTM for short). In practice, the training sample set may be trained using a supervised learning algorithm, and back propagation is performed on the layer position detection network obtained by training according to the second loss function until a set second training stop condition is met. The second loss function represents the cross entropy between the layer positions in the sample network and the output values of the layer position detection network; in other words, it represents the cross entropy between the true layer positions in the sample network and the layer positions predicted by the layer position detection network.
It should be added that the true layer position of the layer in the sample network can be obtained by: for each layer of the sample network, when the layer calculation is detected to be completed, determining a layer position of the layer currently calculated by the reference network, and determining the layer position of the layer currently calculated by the reference network as the layer position of the layer in the sample network.
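One way to realize this labeling step, assuming that completion timestamps of both networks are available on a common clock, is to look up each sample-network completion time in the reference network's completion times T. The function name and the example values below are hypothetical, and positions are 0-based.

```python
import bisect

def label_layer_positions(reference_times, sample_times):
    """Map each sample-layer completion time to the reference layer in progress.

    reference_times: non-decreasing completion times t_1..t_n of the reference
    network. A sample layer finishing at time s is labeled with the 0-based
    index of the first reference layer that has not yet completed at s.
    """
    positions = []
    for s in sample_times:
        index = bisect.bisect_right(reference_times, s)
        if index >= len(reference_times):
            # The reference network is assumed to run longer than the sample network.
            raise ValueError("sample layer finished after the reference network")
        positions.append(index)
    return positions

T_ref = [1.0, 2.0, 3.5, 5.0]        # hypothetical reference completion times
sample_done = [0.4, 2.2, 3.1, 4.9]  # hypothetical sample-layer completion times
positions = label_layer_positions(T_ref, sample_done)
```

In this example two sample layers finish while the same reference layer is running, so they receive the same layer position label.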
In addition, the layer positions predicted by the layer position detection network are obtained as follows: all layer positions predicted by the layer position detection network are sorted in descending order, and the first M layer positions are taken as the predicted layer positions. Here, M is a set natural number.
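One possible reading of this selection step is a top-M ranking over per-position scores. The sketch below assumes the network emits one score per candidate layer position and that sorting is done on those scores; both the function name `top_m_positions` and the example scores are hypothetical.

```python
def top_m_positions(scores, m):
    """Return the indices of the m highest-scoring candidate layer positions."""
    if m < 0:
        raise ValueError("m must be a natural number")
    # Rank candidate positions by score, largest first; ties keep index order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:m]

# Hypothetical scores for five candidate positions, with M = 3.
scores = [0.05, 0.40, 0.10, 0.30, 0.15]
selected = top_m_positions(scores, 3)
```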
As one embodiment, the layer structure detection network is implemented based on an MLP. In practice, the layer structure detection network may be trained on the training sample set using a supervised learning algorithm, with back propagation performed according to a third loss function until a set third training stop condition is met. The third loss function represents the cross entropy between the layer structures in the sample network and the output values of the layer structure detection network; in other words, the cross entropy between the true layer structures in the sample network and the layer structures predicted by the layer structure detection network. It should be noted that, in this embodiment, for each layer of the sample network, the actual layer position of that layer is used to select the corresponding value from all layer structures predicted by the layer structure detection network, that is, to take out the predicted layer structure corresponding to that layer.
As can be seen from the above embodiments, a sample network set is generated and a training sample set is obtained from it, where each training sample takes the calculation completion time of each layer of a reference network running on the same multi-core platform as the sample network as an input value and the structure information of the sample network as a label value. The training sample set is then trained with a set algorithm to obtain a structure detection model capable of predicting the structure information of the target network.
The following describes a neural network structure detection method according to the present invention in specific embodiments:
Referring to fig. 2, a flowchart of an embodiment of the neural network structure detection method according to an embodiment of the present invention includes the following steps:
Step 201: Obtain the calculation completion time of each layer of the reference network when it runs on the multi-core platform.
Step 202: Input the calculation completion time of each layer into the trained structure detection model to obtain the structure information of the target network running on the same multi-core platform as the reference network.
Step 203: Perform topology structure reconstruction according to the structure information of the target network to obtain the structure of the target network.
Steps 201 to 203 are collectively described below:
In the embodiment of the invention, the reference network and the target network are deployed on the same multi-core platform and are controlled to run on it at the same time. After the reference network finishes its calculation, the calculation completion time T of each layer of the reference network can be obtained.
As can be seen from the embodiment of fig. 1, inputting T into the trained structure detection model yields the structure information of the target network. Finally, the topology is reconstructed from the detected layer structures, that is, from the structure information of the target network output by the structure detection model, according to the characteristic of an Add layer in the target network, namely that the structures of its two input layers must be identical. This yields the structure of the target network model.
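The Add-layer constraint can be illustrated with a small search for compatible inputs. The sketch below assumes that "identical structure" can be approximated by identical output shape, that one input of each Add layer is the layer immediately before it, and that layers are described by dictionaries with the hypothetical keys `kind` and `out_shape`; none of these details are given in the text.

```python
def add_layer_candidates(layers):
    """For each Add layer, list earlier layers that could be its second input.

    layers: list of dicts with 'kind' (str) and 'out_shape' (tuple), in layer order.
    Returns a dict mapping Add-layer index -> list of candidate input indices.
    """
    candidates = {}
    for k, layer in enumerate(layers):
        if layer["kind"] != "add" or k == 0:
            continue
        # Assumption: the layer directly before the Add layer is its first input.
        main_shape = layers[k - 1]["out_shape"]
        # The second input must match the first, per the Add-layer constraint.
        candidates[k] = [
            j for j in range(k - 1) if layers[j]["out_shape"] == main_shape
        ]
    return candidates

# Hypothetical detected layer structures.
layers = [
    {"kind": "conv", "out_shape": (16, 32, 32)},
    {"kind": "conv", "out_shape": (32, 16, 16)},
    {"kind": "conv", "out_shape": (32, 16, 16)},
    {"kind": "add", "out_shape": (32, 16, 16)},
]
skip_candidates = add_layer_candidates(layers)
```

When more than one candidate remains for an Add layer, some additional rule would be needed to pick the skip connection; the text does not describe one, so the sketch stops at listing candidates.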
Similar to the description in step 103, when the topology structure of the target network is reconstructed, each layer position output by the layer position detection network, that is, each layer position in the structure information, is used as an index to determine the corresponding layer structure among the layer structures output by the layer structure detection network; in other words, layer positions are matched with layer structures. The topology structure of the target network is then reconstructed from the matched layer positions and layer structures to obtain the structure of the target network.
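The index-based matching described above can be sketched as follows. The function names `match_structures` and `rebuild_sequential`, the string labels for layer structures, and the simplification to a purely sequential chain (ignoring Add-layer branches) are assumptions for illustration.

```python
def match_structures(layer_positions, structures_by_position):
    """Pair each detected layer position with the structure predicted at that index."""
    return [(pos, structures_by_position[pos]) for pos in layer_positions]

def rebuild_sequential(matched):
    """Order matched layers by position to form a simple sequential chain."""
    ordered = sorted(matched, key=lambda item: item[0])
    return [structure for _, structure in ordered]

# Hypothetical outputs: one predicted structure per time step of the reference
# network, and the positions at which target layers were detected.
structures_by_position = ["conv3x3", "none", "conv1x1", "none", "fc"]
layer_positions = [4, 0, 2]

matched = match_structures(layer_positions, structures_by_position)
rebuilt = rebuild_sequential(matched)
```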
According to this embodiment, the calculation completion time of each layer of the reference network when running on the multi-core platform is obtained and input into the trained structure detection model to obtain the structure information of the target network running on the same multi-core platform as the reference network; the topology structure is then reconstructed according to that structure information to obtain the structure of the target network. The structure of the target network can thus be detected at lower cost, and the detection result has higher robustness and accuracy.
Referring to fig. 3, a block diagram of an embodiment of a device for constructing a structure detection model according to an embodiment of the present invention includes: a first generation module 31, a second generation module 32, and a training module 33.
Wherein, the first generating module 31 is configured to generate a sample network set;
a second generating module 32, configured to obtain a training sample set according to the sample network set, where each training sample takes the calculation completion time of each layer of a reference network running on the same multi-core platform as the sample network as an input value and the structure information of the sample network as a label value;
the training module 33 is configured to train the training sample set by using a set algorithm to obtain a structure detection model, where the structure detection model uses a calculation completion time of each layer of the reference network as an input value and uses structure information of a target network running on the same multi-core platform as the reference network as an output value.
In a possible implementation manner, the first generating module 31 generates a sample network including:
combining the structural parameters in each set of candidate parameters according to the set combination rule to obtain a plurality of groups of layer structural parameters;
when a sample network is constructed, selecting a group of layer structure parameters from the plurality of groups of layer structure parameters for each layer of the sample network, and constructing the layer according to the selected layer structure parameters to obtain the sample network.
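The two steps above can be sketched with a Cartesian product over candidate parameter sets followed by a per-layer random choice. The parameter names, their values, and the use of a full Cartesian product as the "set combination rule" are assumptions for illustration; the text does not fix the rule or the parameters.

```python
import itertools
import random

# Hypothetical candidate parameter sets for a convolution-like layer.
candidate_sets = {
    "kernel_size": [1, 3, 5],
    "out_channels": [16, 32],
    "stride": [1, 2],
}

def combine_layer_parameters(candidates):
    """Combine every candidate value of every parameter into layer parameter groups."""
    names = list(candidates)
    value_lists = [candidates[name] for name in names]
    return [dict(zip(names, values)) for values in itertools.product(*value_lists)]

def build_sample_network(layer_params, num_layers, rng):
    """Pick one group of layer structure parameters for each layer of a sample network."""
    return [rng.choice(layer_params) for _ in range(num_layers)]

layer_params = combine_layer_parameters(candidate_sets)
sample_network = build_sample_network(layer_params, num_layers=4, rng=random.Random(0))
```

Repeating `build_sample_network` with different depths and seeds would give a sample network set; a stricter combination rule could filter `layer_params` before sampling.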
In a possible implementation manner, the second generating module 32 obtains a training sample set according to a sample network set, including:
for each sample network in the sample network set, obtaining the calculation completion time of each layer of a reference network running on the same multi-core platform with the sample network, and obtaining the layer number of the sample network, the layer structure of each layer and the layer position; and taking the number of layers, the layer structure of each layer and the layer positions as the structure information.
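A training sample as described above pairs the reference network's timing trace with the sample network's structure information. The sketch below shows one possible in-memory layout; the class names, field names, and string encoding of layer structures are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class StructureInfo:
    num_layers: int               # number of layers of the sample network
    layer_structures: List[str]   # layer structure of each layer
    layer_positions: List[int]    # layer position of each layer

@dataclass
class TrainingSample:
    ref_completion_times: List[float]  # input value
    label: StructureInfo               # label value

def make_training_sample(ref_completion_times, layer_structures, layer_positions):
    """Bundle one timing trace with the structure information of one sample network."""
    if len(layer_structures) != len(layer_positions):
        raise ValueError("each layer needs both a structure and a position")
    info = StructureInfo(
        num_layers=len(layer_structures),
        layer_structures=list(layer_structures),
        layer_positions=list(layer_positions),
    )
    return TrainingSample(list(ref_completion_times), info)

# Hypothetical measurements for a two-layer sample network.
sample = make_training_sample([1.0, 2.1, 3.3], ["conv3x3", "fc"], [0, 2])
```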
In a possible embodiment, the structure detection model includes:
a depth prediction network, which takes the calculation completion time of each layer of the reference network as an input value and the number of layers of the target network running on the same multi-core platform as the reference network as an output value; and
a layer structure estimation network, which comprises a layer position detection network and a layer structure detection network, wherein the layer position detection network takes the calculation completion time of each layer of the reference network as an input value and takes the layer position of each layer of a target network running on the same multi-core platform as the reference network as an output value; the layer structure detection network takes the calculation completion time of each layer of the reference network as an input value and takes the layer structure of each layer of the target network running on the same multi-core platform as the reference network as an output value.
In a possible implementation manner, the training module 33 trains the training sample set by using a set algorithm to obtain a depth prediction network in the structure detection model, including:
training the training sample set by using a supervised learning algorithm, and carrying out back propagation on the depth prediction network obtained by training according to a first loss function until a set first training stop condition is met; the first loss function represents the mean square error between the number of layers of the sample network and the output value of the depth prediction network.
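As a small worked example of the first loss function, the sketch below computes the mean square error between true layer counts and predicted depths. The function name `depth_mse` and the example values are assumptions for illustration.

```python
def depth_mse(predicted_depths, true_depths):
    """Mean square error between predicted depths and true layer counts."""
    if len(predicted_depths) != len(true_depths) or not true_depths:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(predicted_depths, true_depths)]
    return sum(squared) / len(squared)

# Hypothetical batch: predictions of 9.5 and 12.0 layers against 10 and 12 layers.
loss = depth_mse([9.5, 12.0], [10, 12])
```

Here the first prediction is off by half a layer and the second is exact, so the loss is (0.25 + 0) / 2 = 0.125.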
In a possible implementation manner, the second generating module 32 obtains the layer position of each layer of the sample network, including:
for each layer of the sample network, determining a layer position of a layer currently calculated by the reference network when the layer calculation is detected to be completed; determining a layer position of a layer currently calculated by the reference network as a layer position of the layer in the sample network;
the training module 33 trains the training sample set by using a set algorithm to obtain a layer position detection network in the structure detection model, including:
training the training sample set by using a supervised learning algorithm, and carrying out back propagation on the layer position detection network obtained by training according to a second loss function until a set second training stop condition is met; the second loss function represents the cross entropy between the layer positions in the sample network and the output values of the layer position detection network.
In a possible implementation manner, the training module 33 trains the training sample set by using a set algorithm to obtain a layer structure detection network in the structure detection model, including:
training the training sample set by using a supervised learning algorithm, and carrying out back propagation on the layer structure detection network obtained by training according to a third loss function until a set third training stop condition is met; the third loss function represents the cross entropy between the layer structures in the sample network and the output values of the layer structure detection network.
Referring to fig. 4, a block diagram of an embodiment of a neural network structure detecting device according to an embodiment of the present invention includes: a time obtaining module 41, a model input module 42, and a structure reconstruction module 43.
The time obtaining module 41 is configured to obtain a calculation completion time of each layer of the reference network when the target network and the reference network run on the multi-core platform;
a model input module 42, configured to input the calculation completion time of each layer to a trained structure detection model, so as to obtain structure information of the target network;
a structure reconstruction module 43, configured to perform topology structure reconstruction according to the structure information of the target network obtained by the model input module, so as to obtain the structure of the target network.
In a possible implementation manner, the structure reconstructing module 43 performs topology structure reconstruction according to the structure information of the target network obtained by the model input module 42, to obtain the structure of the target network, including:
determining the layer structure of each layer of the target network in the structure information according to the layer position of the layer in the structure information;
and carrying out topology structure reconstruction according to the layer position and the layer structure of each layer of the target network to obtain the structure of the target network.
With continued reference to fig. 5, the present application also provides an electronic device including a processor 501, a communication interface 502, a memory 503, and a communication bus 504.
Wherein the processor 501, the communication interface 502, and the memory 503 communicate with each other through the communication bus 504;
a memory 503 for storing a computer program;
the processor 501 is configured to execute a computer program stored in the memory 503, and when the processor 501 executes the computer program, the steps of the neural network structure detection method or the training method of the structure detection model provided by the embodiment of the present application are implemented.
The present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the steps of the neural network structure detection method or the training method of the structure detection model provided by the embodiment of the present application.
Those of skill in the art would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of function in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The foregoing description of the embodiments has been provided for the purpose of illustrating the general principles of the invention and is not meant to limit the scope of the invention or to limit the invention to the particular embodiments; any modifications, equivalents, improvements, etc. that fall within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (12)

1. A method of constructing a structure detection model, the method comprising:
generating a sample network set;
obtaining a training sample set according to the sample network set, wherein each training sample takes the calculation completion time of each layer of a reference network running on the same multi-core platform with the sample network as an input value and the structural information of the sample network as a label value;
training the training sample set by using a set algorithm to obtain a structure detection model, wherein the structure detection model takes the calculation completion time of each layer of the reference network as an input value and takes the structure information of a target network running on the same multi-core platform with the reference network as an output value;
the obtaining a training sample set according to the sample network set comprises:
for each sample network in the sample network set, obtaining the calculation completion time of each layer of a reference network running on the same multi-core platform with the sample network, and obtaining the layer number of the sample network, the layer structure of each layer and the layer position; taking the number of layers, the layer structure of each layer and the layer position as the structure information;
the structure detection model comprises:
the depth prediction network takes the calculation completion time of each layer of the reference network as an input value and the layer number of a target network running on the same multi-core platform with the reference network as an output value;
The layer structure estimation network comprises a layer position detection network and a layer structure detection network, wherein the layer position detection network takes the calculation completion time of each layer of the reference network as an input value and takes the layer position of each layer of a target network running on the same multi-core platform with the reference network as an output value; the layer structure detection network takes the calculation completion time of each layer of the reference network as an input value and takes the layer structure of each layer of the target network running on the same multi-core platform with the reference network as an output value;
training the training sample set by using a set algorithm to obtain a depth prediction network in a structure detection model, wherein the training sample set comprises the following steps:
training the training sample set by using a supervised learning algorithm, and carrying out back propagation on the depth prediction network obtained by training according to a first loss function until a set first training stop condition is met; the first loss function represents the mean square error between the number of layers of the sample network and the output value of the depth prediction network;
the obtaining the layer position of each layer of the sample network comprises the following steps:
for each layer of the sample network, determining a layer position of a layer currently calculated by the reference network when the layer calculation is detected to be completed; determining a layer position of a layer currently calculated by the reference network as a layer position of the layer in the sample network;
Training the training sample set by using a set algorithm to obtain a layer position detection network in a structure detection model, wherein the method comprises the following steps:
training the training sample set by using a supervised learning algorithm, and carrying out back propagation on the layer position detection network obtained by training according to a second loss function until a set second training stop condition is met; the second loss function represents the cross entropy between the layer positions in the sample network and the output values of the layer position detection network.
2. The method of claim 1, wherein the sample networks in the set of sample networks are constructed by:
combining the structural parameters in each set of candidate parameters according to the set combination rule to obtain a plurality of groups of layer structural parameters;
when a sample network is constructed, selecting a group of layer structure parameters from the plurality of groups of layer structure parameters for each layer of the sample network, and constructing the layer according to the selected layer structure parameters to obtain the sample network.
3. The method of claim 1, wherein training the training sample set with a set algorithm results in a structure detection network in a structure detection model, comprising:
training the training sample set by using a supervised learning algorithm, and carrying out back propagation on the layer structure detection network obtained by training according to a third loss function until a set third training stop condition is met; the third loss function represents the cross entropy between the layer structures in the sample network and the output values of the layer structure detection network.
4. A method of detecting a neural network structure based on the structure detection model constructed by the structure detection model construction method according to any one of claims 1 to 3, characterized by comprising:
obtaining the calculation completion time of each layer of the reference network when the target network and the reference network run on the multi-core platform;
inputting the calculation completion time of each layer into a trained structure detection model to obtain the structure information of the target network;
and reconstructing a topological structure according to the structural information of the target network to obtain the structure of the target network.
5. The method of claim 4, wherein the performing topology reconstruction according to the structure information of the target network to obtain the structure of the target network comprises:
determining the layer structure of each layer of the target network in the structure information according to the layer position of the layer in the structure information;
And carrying out topology structure reconstruction according to the layer position and the layer structure of each layer of the target network to obtain the structure of the target network.
6. A device for constructing a structure detection model, the device comprising:
a first generation module for generating a sample network set;
the second generation module is used for obtaining a training sample set according to the sample network set, wherein each training sample takes the calculation completion time of each layer of a reference network running on the same multi-core platform with the sample network as an input value and takes the structural information of the sample network as a label value;
the training module is used for training the training sample set by using a set algorithm to obtain a structure detection model, wherein the structure detection model takes the calculation completion time of each layer of the reference network as an input value and takes the structure information of a target network running on the same multi-core platform with the reference network as an output value;
the second generating module obtains a training sample set according to the sample network set, including:
for each sample network in the sample network set, obtaining the calculation completion time of each layer of a reference network running on the same multi-core platform with the sample network, and obtaining the layer number of the sample network, the layer structure of each layer and the layer position; taking the number of layers, the layer structure of each layer and the layer position as the structure information;
The structure detection model comprises:
the depth prediction network takes the calculation completion time of each layer of the reference network as an input value and the layer number of a target network running on the same multi-core platform with the reference network as an output value;
the layer structure estimation network comprises a layer position detection network and a layer structure detection network, wherein the layer position detection network takes the calculation completion time of each layer of the reference network as an input value and takes the layer position of each layer of a target network running on the same multi-core platform with the reference network as an output value; the layer structure detection network takes the calculation completion time of each layer of the reference network as an input value and takes the layer structure of each layer of the target network running on the same multi-core platform with the reference network as an output value;
the training module trains the training sample set by using a set algorithm to obtain a depth prediction network in a structure detection model, and the training module comprises the following steps:
training the training sample set by using a supervised learning algorithm, and carrying out back propagation on the depth prediction network obtained by training according to a first loss function until a set first training stop condition is met; the first loss function represents the mean square error between the number of layers of the sample network and the output value of the depth prediction network;
The second generation module obtains a layer position of each layer of the sample network, including:
for each layer of the sample network, determining a layer position of a layer currently calculated by the reference network when the layer calculation is detected to be completed; determining a layer position of a layer currently calculated by the reference network as a layer position of the layer in the sample network;
the training module trains the training sample set by using a set algorithm to obtain a layer position detection network in the structure detection model, and the training module comprises the following steps:
training the training sample set by using a supervised learning algorithm, and carrying out back propagation on the layer position detection network obtained by training according to a second loss function until a set second training stop condition is met; the second loss function represents the cross entropy between the layer positions in the sample network and the output values of the layer position detection network.
7. The apparatus of claim 6, wherein the first generation module generates a sample network comprising:
combining the structural parameters in each set of candidate parameters according to the set combination rule to obtain a plurality of groups of layer structural parameters;
when a sample network is constructed, selecting a group of layer structure parameters from the plurality of groups of layer structure parameters for each layer of the sample network, and constructing the layer according to the selected layer structure parameters to obtain the sample network.
8. The apparatus of claim 7, wherein the training module trains the training sample set using a set algorithm to obtain a structure detection network in a structure detection model, comprising:
training the training sample set by using a supervised learning algorithm, and carrying out back propagation on the layer structure detection network obtained by training according to a third loss function until a set third training stop condition is met; the third loss function represents the cross entropy between the layer structures in the sample network and the output values of the layer structure detection network.
9. A neural network structure detection device, the device comprising:
the time obtaining module is used for obtaining the calculation completion time of each layer of the reference network when the target network and the reference network run on the multi-core platform;
the model input module is used for inputting the calculation completion time of each layer into a trained structure detection model to obtain the structure information of the target network;
wherein the trained structural detection model is constructed according to the structural detection model construction device of any one of claims 6-8;
and the structure reconstruction module is used for reconstructing a topological structure according to the structure information of the target network obtained by the model input module to obtain the structure of the target network.
10. The apparatus of claim 9, wherein the structure reconstruction module performs topology reconstruction according to the structure information of the target network obtained by the model input module to obtain the structure of the target network, and includes:
determining the layer structure of each layer of the target network in the structure information according to the layer position of the layer in the structure information;
and carrying out topology structure reconstruction according to the layer position and the layer structure of each layer of the target network to obtain the structure of the target network.
11. An electronic device comprising a processor, a communication interface, a memory, and a communication bus;
the processor, the communication interface and the memory communicate with each other through the communication bus;
the memory is used for storing a computer program;
the processor being adapted to execute a computer program stored on the memory, the processor implementing the steps of the method according to any one of claims 1-3 or 4-5 when the computer program is executed.
12. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed by a processor, implements the steps of the method of any of claims 1-3 or 4-5.
CN202010331224.6A 2020-04-24 2020-04-24 Neural network structure detection method, training method and training device of structure detection model Active CN111582474B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010331224.6A CN111582474B (en) 2020-04-24 2020-04-24 Neural network structure detection method, training method and training device of structure detection model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010331224.6A CN111582474B (en) 2020-04-24 2020-04-24 Neural network structure detection method, training method and training device of structure detection model

Publications (2)

Publication Number Publication Date
CN111582474A CN111582474A (en) 2020-08-25
CN111582474B true CN111582474B (en) 2023-08-25

Family

ID=72111740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010331224.6A Active CN111582474B (en) 2020-04-24 2020-04-24 Neural network structure detection method, training method and training device of structure detection model

Country Status (1)

Country Link
CN (1) CN111582474B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109658418A (en) * 2018-10-31 2019-04-19 百度在线网络技术(北京)有限公司 Learning method, device and the electronic equipment of scene structure
CN110097103A (en) * 2019-04-22 2019-08-06 西安电子科技大学 Based on the semi-supervision image classification method for generating confrontation network
CN110245491A (en) * 2019-06-11 2019-09-17 合肥宜拾惠网络科技有限公司 The determination method, apparatus and memory and processor of network attack type
CN110266673A (en) * 2019-06-11 2019-09-20 合肥宜拾惠网络科技有限公司 Security strategy optimized treatment method and device based on big data
CN110909877A (en) * 2019-11-29 2020-03-24 百度在线网络技术(北京)有限公司 Neural network model structure searching method and device, electronic equipment and storage medium
CN111027060A (en) * 2019-12-17 2020-04-17 电子科技大学 Knowledge distillation-based neural network black box attack type defense method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229647A (en) * 2017-08-18 2018-06-29 北京市商汤科技开发有限公司 The generation method and device of neural network structure, electronic equipment, storage medium

Also Published As

Publication number Publication date
CN111582474A (en) 2020-08-25

Similar Documents

Publication Publication Date Title
CN113408743B (en) Method and device for generating federated model, electronic equipment and storage medium
US20190354868A1 (en) Multi-task neural networks with task-specific paths
JP7233807B2 (en) Computer-implemented method, computer system, and computer program for simulating uncertainty in artificial neural networks
KR102172277B1 (en) Dual deep neural network
CN109922032B (en) Method, device, equipment and storage medium for determining risk of logging in account
CN109120462B (en) Method and device for predicting opportunistic network link and readable storage medium
CN112116090B (en) Neural network structure searching method and device, computer equipment and storage medium
EP3635637A1 (en) Pre-training system for self-learning agent in virtualized environment
CN110046706B (en) Model generation method and device and server
KR20190031318A (en) Domain Separation Neural Networks
JP2022095659A (en) Neural architecture search
CN108154197A (en) Realize the method and device that image labeling is verified in virtual scene
WO2019006541A1 (en) System and method for automatic building of learning machines using learning machines
CN115545334B (en) Land utilization type prediction method and device, electronic equipment and storage medium
Zhang et al. Identification of concrete surface damage based on probabilistic deep learning of images
Qu et al. Improving the reliability for confidence estimation
CN111582474B (en) Neural network structure detection method, training method and training device of structure detection model
CN113487019A (en) Circuit fault diagnosis method and device, computer equipment and storage medium
CN117235742A (en) Intelligent penetration test method and system based on deep reinforcement learning
KR102255470B1 (en) Method and apparatus for artificial neural network
Lu et al. Ranking attack graphs with graph neural networks
CN109756494B (en) Negative sample transformation method and device
CN114510592A (en) Image classification method and device, electronic equipment and storage medium
RU2718409C1 (en) System for recovery of rock sample three-dimensional structure
CN117891566B (en) Reliability evaluation method, device, equipment, medium and product of intelligent software

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Building 613A, Building 5, Qilin Artificial Intelligence Industrial Park, No. 266 Chuangyan Road, Qilin Technology Innovation Park, Nanjing City, Jiangsu Province, 211135

Applicant after: Zhongke Wuqi (Nanjing) Technology Co.,Ltd.

Address before: Room 1248, 12/F, Research Complex Building, Institute of Computing Technology, Chinese Academy of Sciences, No. 6, South Road, Haidian District, Beijing 100086

Applicant before: JEEJIO (BEIJING) TECHNOLOGY Co.,Ltd.

GR01 Patent grant