CN114065308A - Gate-level hardware Trojan horse positioning method and system based on deep learning - Google Patents
- Publication number: CN114065308A (application CN202111412498.9A)
- Authority: CN (China)
- Prior art keywords: path, submodule, positioning, paths, gate
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
  - G06—COMPUTING; CALCULATING OR COUNTING
    - G06F—ELECTRIC DIGITAL DATA PROCESSING
      - G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
        - G06F21/70—Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
          - G06F21/71—Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information
      - G06F30/00—Computer-aided design [CAD]
        - G06F30/30—Circuit design
          - G06F30/32—Circuit design at the digital level
    - G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N3/00—Computing arrangements based on biological models
        - G06N3/02—Neural networks
          - G06N3/04—Architecture, e.g. interconnection topology
            - G06N3/045—Combinations of networks
          - G06N3/08—Learning methods
Abstract
The invention relates to a gate-level hardware Trojan horse positioning method and system based on deep learning. The method first obtains seven open gate-level netlist files to form a training set and a test set; preprocessing then converts each netlist file into path statements using a depth-first search algorithm, completing path generation; a TextCNN model is constructed and trained for detection and positioning; the path set of the test set is input into the model to obtain a pre-detection result; the paths of the pre-detection result are divided and virtual positioning coordinates are constructed to obtain a short path set SL for positioning; finally, SL is input into the TextCNN model to obtain the positioning result P. The invention enables quick and effective evaluation of the security of an integrated circuit and even the discovery and targeting of threats.
Description
Technical Field
The invention relates to the field of computer hardware protection and system-on-chip security, in particular to a gate-level hardware Trojan horse positioning method and system based on deep learning.
Background
Integrated Circuits (ICs) are the core components of computer hardware, and their design and manufacture are complex. To reduce costs, many manufacturers outsource part of the IC manufacturing process to so-called third-party vendors, which undoubtedly introduces a significant threat to hardware security. A Hardware Trojan (HT) is a small piece of circuitry that an attacker inserts into the original IC layout for some malicious purpose. An HT may be inserted at any stage of IC manufacture, with security threats including altered circuit functionality, information leakage, denial of service, and so on. Current studies on HT detection can be roughly divided into pre-silicon detection, which performs security checks before the IC chip is finished, and post-silicon detection, which performs them after the chip is finished. Pre-silicon detection clearly reduces cost more, achieving a win-win between security and profit. Pre-silicon detection is mainly performed in the design stage of the IC; since the gate level is the last link of the design stage, it is very effective to detect HTs at the gate level.
In IC design, the IC is described at several levels of abstraction, which are, from high to low: system level, algorithm level, register-transfer level, gate level, and transistor level. Gate-level detection is a common static detection approach that explores new Trojan detection methods by analyzing the logic structure of the circuit through the gate-level netlist. The key to detecting HTs at the gate level is obtaining the netlist file describing that level, i.e., the gate-level netlist, which describes the interconnection relationships between circuit elements comprising logic gates or other elements at the same level as logic gates. To date, many efforts have proposed methods for the prevention and detection of HTs at the gate level. The most common approach mines HT features from the gate-level netlist and feeds those features into a deep learning model for feature learning, so as to detect HTs effectively. Numerous detection studies have achieved considerable results, but remaining at the detection stage alone cannot truly resist HTs; finding the specific location of an HT is a prerequisite for combating it more precisely. However, studies on locating HTs remain scarce.
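A gate-level netlist's interconnection structure can be pictured as a directed graph over logic gates, with the wire nets mediating the links. The following is a minimal, hypothetical sketch; the gate names, gate types, and the `build_gate_graph` helper are all illustrative, not from the patent:

```python
# Hypothetical sketch: a tiny gate-level netlist as a graph.
# Each gate drives an output net; any gate whose inputs read that net
# becomes a successor of the driving gate.

netlist = [
    # (gate_name, gate_type, input_nets, output_net)
    ("U1", "AND2", ["a", "b"], "n1"),
    ("U2", "OR2",  ["n1", "c"], "n2"),
    ("U3", "INV",  ["n2"], "y"),
]

def build_gate_graph(gates):
    """Map each net to its driver, then link driver gate -> reader gates."""
    driver_of = {out: name for name, _, _, out in gates}
    graph = {name: [] for name, _, _, _ in gates}
    for name, _, inputs, _ in gates:
        for net in inputs:
            if net in driver_of:              # nets without a driver are primary inputs
                graph[driver_of[net]].append(name)
    return graph

g = build_gate_graph(netlist)
print(g)  # {'U1': ['U2'], 'U2': ['U3'], 'U3': []}
```

Each net's driver is linked to every gate that reads the net, which is exactly the interconnection relation the netlist describes.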
Disclosure of Invention
In view of the above, the present invention provides a gate-level hardware trojan positioning method and system based on deep learning, which can position a hardware trojan at the gate level.
In order to achieve the purpose, the invention adopts the following technical scheme:
a gate-level hardware Trojan horse positioning method based on deep learning comprises the following steps:
step A: acquiring seven public gate-level netlist files, and dividing the data set with the leave-one-out method to obtain a training set Tr and a test set Ts;
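Step A's leave-one-out split over the seven netlists can be sketched as follows; the file names are placeholders, not the patent's actual benchmarks:

```python
# Minimal sketch of a leave-one-out split over seven netlist files.
# File names are hypothetical placeholders.

netlists = [f"design_{i}.v" for i in range(7)]

def leave_one_out(files):
    """Yield (train_set, test_set) pairs, holding out one file per round."""
    for i, held_out in enumerate(files):
        train = files[:i] + files[i + 1:]
        yield train, [held_out]

splits = list(leave_one_out(netlists))
print(len(splits))   # 7 rounds, one per held-out netlist
print(splits[0][1])  # ['design_0.v']
```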
step B: preprocessing the gate-level netlist files of the training set Tr and the test set Ts obtained in step A, and applying a depth-first search algorithm to obtain the path sets of the training set Tr and the test set Ts;
step C: constructing and initializing a TextCNN model for detecting and positioning HT, and training it on the path set of the training set Tr obtained in step B;
step D: inputting the path set of the test set Ts obtained in step B into the TextCNN model trained in step C to obtain a pre-detection result;
step E: dividing the paths of the pre-detection result obtained in step D and constructing virtual positioning coordinates to obtain a short path set SL for positioning;
step F: inputting the short path set SL obtained in step E into the TextCNN model trained in step D to obtain a positioning result P.
Further, the step B is specifically as follows:
step B1: traversing the netlist file with a depth-first search algorithm, using the wire nets as intermediaries, to obtain a tree graph G representing the interconnection relations of the different logic gates;
step B2: based on the tree graph G obtained in step B1, restoring the situation of the real circuit and obtaining a plurality of unlabeled paths, which are then combined into the unlabeled path set of the netlist;
step B3: performing steps B1 and B2 on the gate-level netlist files of the training set Tr and the test set Ts obtained in step A, finally obtaining the unlabeled path sets of the training set Tr and the test set Ts;
step B4: labeling the unlabeled paths obtained in step B3 based on the information in the gate-level netlists of the training set Tr and the test set Ts obtained in step A, obtaining the labeled path sets of the training set Tr and the test set Ts.
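Steps B1 and B2 amount to a depth-first traversal of the gate graph that enumerates root-to-leaf paths. A minimal sketch, assuming a toy graph and the hypothetical `dfs_paths` helper:

```python
# Sketch of steps B1-B2: depth-first enumeration of root-to-leaf paths
# ("path statements") in a gate graph. Graph and gate names are illustrative.

graph = {"U1": ["U2", "U3"], "U2": ["U4"], "U3": [], "U4": []}

def dfs_paths(graph, node, path=None):
    """Return every path from `node` down to a gate with no successors."""
    path = (path or []) + [node]
    if not graph[node]:
        return [path]
    paths = []
    for succ in graph[node]:
        paths.extend(dfs_paths(graph, succ, path))
    return paths

paths = dfs_paths(graph, "U1")
print(paths)  # [['U1', 'U2', 'U4'], ['U1', 'U3']]
```

Each returned path is one unlabeled path statement; labeling (step B4) would then mark paths that touch known Trojan gates.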
further, the step C specifically includes:
step C1: path set of training set Tr obtained in step BGenerating a vocabulary table so that the TextCNN model can extract features;
step C2: constructing and initializing a TextCNN model;
step C3: based on the path set of the training set Tr obtained in the step BThe TextCNN model can learn the characteristics of the Trojan path and the Trojan-free path respectively to complete the training of the model.
Further, the step C1 is specifically:
step C11: first converting the path set of the training set Tr obtained in step B into text content;
step C12: reading the words one by one and calculating the frequency of each word based on the text content obtained in the step C11;
step C13: according to the frequency of the words, marking a sequence number for each word from high to low to finish the vectorization representation of the words;
step C14: and packaging the words and the corresponding serial numbers into a dictionary type, writing the dictionary type into a vocabulary file, and finishing the generation of the vocabulary.
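Steps C11 to C14 can be sketched as follows; the sample path text and the `build_vocab` helper are illustrative assumptions:

```python
from collections import Counter

# Sketch of steps C11-C14: count token frequencies in the path text,
# rank tokens from most to least frequent, and package token -> index
# as a vocabulary dictionary. Sample paths are illustrative.

paths = ["AND2 OR2 INV", "AND2 INV", "AND2 OR2"]

def build_vocab(path_texts):
    freq = Counter(tok for line in path_texts for tok in line.split())
    # Most frequent token gets the smallest sequence number (step C13);
    # ties are broken alphabetically.
    ranked = sorted(freq, key=lambda t: (-freq[t], t))
    return {tok: idx for idx, tok in enumerate(ranked)}

vocab = build_vocab(paths)
print(vocab)  # {'AND2': 0, 'INV': 1, 'OR2': 2}
```

The resulting dictionary would be written to a vocabulary file so path statements can be converted to integer sequences for the TextCNN.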
Further, the step D specifically includes:
step D1: based on the TextCNN model trained in step C, adding a storage operation to the last fully connected layer of the model, so that the pre-detection result can be recorded;
step D2: inputting the path set of the test set Ts into the TextCNN model trained in step C to obtain a preliminary result set {P_TP, P_FP, P_TN, P_FN}, where P_TP is the set of correctly identified Trojan paths, P_FP is the set of Trojan-free paths identified as containing Trojans, P_TN is the set of correctly identified Trojan-free paths, and P_FN is the set of Trojan paths identified as Trojan-free;
step D3: from the preliminary result set {P_TP, P_FP, P_TN, P_FN} obtained in step D2, selecting the set of correctly identified Trojan paths P_TP as the pre-detection result.
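Step D2's partition of predictions into {P_TP, P_FP, P_TN, P_FN} can be sketched as below; the path identifiers and labels are made-up placeholders:

```python
# Sketch of steps D2-D3: partition test predictions into the four sets
# {P_TP, P_FP, P_TN, P_FN} and keep only P_TP for localization.

results = [
    # (path_id, true_label, predicted_label); 1 = Trojan path
    ("p1", 1, 1), ("p2", 0, 1), ("p3", 0, 0), ("p4", 1, 0), ("p5", 1, 1),
]

def partition(results):
    buckets = {"TP": [], "FP": [], "TN": [], "FN": []}
    for pid, truth, pred in results:
        key = ("T" if truth == pred else "F") + ("P" if pred == 1 else "N")
        buckets[key].append(pid)
    return buckets

b = partition(results)
print(b["TP"])  # ['p1', 'p5'] -> the pre-detection result fed to step E
```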
Further, the step E specifically includes:
step E1: numbering the paths in the pre-detection result obtained in step D to obtain the original long path set LL = {LL_i}, 1 ≤ i ≤ TP, for positioning, where TP is the number of paths contained in the set of correctly identified Trojan paths P_TP from step D2;
step E2: setting the division length cutlen, dividing the long path LL_i sequentially into groups of cutlen logic gates to obtain a plurality of short paths, and setting virtual positioning coordinates for the short paths;
step E3: performing step E2 on each long path in the set LL obtained in step E1 to obtain the short path set SL and the virtual positioning coordinate set, completing path division and virtual positioning coordinate construction.
Further, the step E2 specifically includes:
step E21: setting the division length cutlen;
step E22: for the long path LL_i, calculating the number of short paths num_i that can be generated after it is divided, as num_i = ⌈length_i / cutlen⌉, where length_i denotes the length of the long path LL_i;
step E23: dividing the long path LL_i sequentially into groups of cutlen logic gates to obtain a plurality of short paths, where j is the index of a short path, indicating that the j-th short path was divided from the long path LL_i;
step E24: according to the results of steps E22 and E23, setting a virtual positioning coordinate for each short path to record the possible Trojan location; the coordinate records the index i of the original long path LL_i and the index t_i of the division, t_i being the t-th division of the original long path LL_i;
step E25: repeating step E24 until the virtual positioning coordinates of all num_i short paths have been set.
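Steps E21 to E25 can be sketched as follows. The ceiling division for num_i and the (i, t) form of the virtual coordinate are assumptions consistent with the description, since the published text omits the exact formulas:

```python
import math

# Sketch of step E2: divide a long Trojan path into groups of `cutlen`
# gates and attach a virtual coordinate (i, t) recording which long path
# and which segment each short path came from. The (i, t) coordinate
# form is an assumption, not the patent's exact formula.

def divide(long_path, i, cutlen):
    num = math.ceil(len(long_path) / cutlen)   # num_i = ceil(length_i / cutlen)
    shorts = []
    for t in range(num):
        seg = long_path[t * cutlen:(t + 1) * cutlen]
        shorts.append({"gates": seg, "coord": (i, t + 1)})
    return shorts

ll1 = ["U1", "U2", "U3", "U4", "U5"]
shorts = divide(ll1, i=1, cutlen=2)
print(len(shorts))         # 3 = ceil(5 / 2)
print(shorts[2]["coord"])  # (1, 3): 3rd segment of long path 1
```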
Further, the step F specifically includes:
step F1: inputting a path from the short path set SL into the TextCNN model trained in step D and predicting its classification result;
step F2: if the prediction result output by the TextCNN model is a Trojan path, recording the corresponding virtual positioning coordinate in the positioning result P;
step F3: repeating steps F1 and F2 until all short paths have been processed, and outputting the final positioning result P to finish positioning.
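Step F's localization loop can be sketched as below; `classify` is a stand-in for the trained TextCNN, and the short paths are illustrative:

```python
# Sketch of step F: run each short path through the classifier and
# collect the virtual coordinates of those predicted to contain a Trojan.

short_paths = [
    {"gates": ["U1", "U2"], "coord": (1, 1)},
    {"gates": ["T1", "T2"], "coord": (1, 2)},   # pretend Trojan segment
    {"gates": ["U5"],       "coord": (1, 3)},
]

def classify(gates):
    """Placeholder for the trained TextCNN: flags gates named 'T*'."""
    return any(g.startswith("T") for g in gates)

def localize(shorts, model):
    return [sp["coord"] for sp in shorts if model(sp["gates"])]

P = localize(short_paths, classify)
print(P)  # [(1, 2)] -> the positioning result
```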
A deep learning based gate level hardware trojan horse positioning system comprising:
the path generation module is used for generating path statements representing circuit routing, and comprises a search submodule, a temporary path submodule and a label submodule; first, the input gate-level netlist files of the training set Tr and the test set Ts are preprocessed, and the search submodule performs a depth-first search on them to obtain a tree graph G representing the interconnection relations of the different logic gates; the temporary path submodule then generates the unlabeled path sets of the training set Tr and the test set Ts; finally, the label submodule labels the unlabeled paths to generate the labeled path sets of the training set Tr and the test set Ts;
the model generation module is used for constructing and training the TextCNN model, and comprises a vectorization submodule, a model construction submodule and a model training submodule; first, the vectorization submodule generates a vocabulary file from the path set of the training set Tr produced by the label submodule; the model construction submodule then constructs and initializes the TextCNN model; finally, the model training submodule inputs the path set into the model and completes its training;
the pre-detection module is used for obtaining the pre-detection result of the test set Ts, and comprises a storage submodule, a pre-detection submodule and an output submodule; first, the storage submodule adds a storage operation to the last fully connected layer of the TextCNN model constructed by the model construction submodule, so that the pre-detection result can be recorded; the pre-detection submodule then pre-detects the paths in the path set to obtain the preliminary result set {P_TP, P_FP, P_TN, P_FN}, taking the set of correctly identified Trojan paths P_TP as the pre-detection result, which the output submodule outputs;
the path division module is used for dividing the result paths output by the output submodule into short paths and narrowing the positioning range, and comprises a sequencing submodule, a division submodule and a quasi-coordinate submodule; the paths of the pre-detection result P_TP output by the output submodule are numbered by the sequencing submodule, divided into a plurality of short paths by the division submodule, and finally the quasi-coordinate submodule sets a virtual positioning coordinate for each short path;
the positioning module is used for positioning the Trojan, and comprises a loading submodule and an output submodule; a short path is loaded into the TextCNN model trained by the model generation module through the loading submodule, the predicted result is output through the output submodule, the paths predicted to contain a Trojan are then selected, and their corresponding virtual positioning coordinates are output to finish positioning.
Compared with the prior art, the invention has the following beneficial effects:
1. the invention realizes the detection of hardware Trojans by applying convolutional neural networks to text classification;
2. the invention converts hardware Trojan detection into a binary classification problem, letting a convolutional neural network learn the contextual features of circuit path statements and autonomously discover the characteristics of Trojan paths and Trojan-free paths in order to classify them. On the basis of detection, it further explores hardware Trojan positioning: a path segmentation technique is applied to the positioning problem, dividing the long paths in a circuit into several short paths so as to narrow the positioning range of the hardware Trojan;
3. the invention achieves further positioning work on the basis of detection, breaking with past practice in which hardware Trojans could only be coarsely located from images of the manufactured integrated circuit; it realizes positioning at the gate level, resisting hardware Trojans more effectively from the design stage of the integrated circuit;
4. the invention can be used in integrated circuit security detection systems to evaluate the security of an integrated circuit and even to discover and locate threats, so that designers can take measures against them.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
fig. 2 is a schematic diagram of a system structure according to an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
Referring to fig. 1, the present invention provides a gate-level hardware Trojan horse positioning method based on deep learning, which includes the following steps:
step A: firstly acquiring seven public gate-level netlist files, and dividing the data set with the leave-one-out method to obtain a training set Tr and a test set Ts;
step B: preprocessing the gate-level netlist files of the training set Tr and the test set Ts obtained in step A, and applying a depth-first search algorithm to obtain the path sets of the training set Tr and the test set Ts, completing path generation;
step B1: traversing the netlist file with a depth-first search algorithm, using the wire nets as intermediaries, to obtain a tree graph G representing the interconnection relations of the different logic gates;
step B2: based on the tree graph G obtained in step B1, restoring the situation of the real circuit and obtaining a plurality of unlabeled paths, which are then combined into the unlabeled path set of the netlist;
step B3: performing steps B1 and B2 on the gate-level netlist files of the training set Tr and the test set Ts obtained in step A, finally obtaining the unlabeled path sets of the training set Tr and the test set Ts;
step B4: labeling the unlabeled paths obtained in step B3 based on the information in the gate-level netlists of the training set Tr and the test set Ts obtained in step A, obtaining the labeled path sets of the training set Tr and the test set Ts.
step C: constructing and initializing a TextCNN model for detecting and positioning HT, inputting the path set of the training set Tr obtained in step B, and completing the construction and training of the model;
step C1: generating a vocabulary from the path set of the training set Tr obtained in step B, so that the TextCNN model can extract features;
step C11: first converting the path set of the training set Tr obtained in step B into text content;
step C12: reading the words one by one and calculating the frequency of each word based on the text content obtained in step C11;
step C13: according to the frequency of the words, assigning each word a sequence number from high to low, completing the vectorized representation of the words;
step C14: packaging the words and their corresponding sequence numbers into a dictionary type and writing it into a vocabulary file, finishing the generation of the vocabulary.
step C2: constructing and initializing the TextCNN model;
step C3: based on the path set of the training set Tr obtained in step B, letting the TextCNN model learn the characteristics of Trojan paths and Trojan-free paths respectively, completing the training of the model.
step D: inputting the path set of the test set Ts obtained in step B into the TextCNN model trained in step C to obtain a pre-detection result;
step D1: based on the TextCNN model trained in step C, adding a storage operation to the last fully connected layer of the model, so that the pre-detection result can be recorded;
step D2: inputting the path set of the test set Ts into the TextCNN model trained in step C to obtain a preliminary result set {P_TP, P_FP, P_TN, P_FN}, where P_TP is the set of correctly identified Trojan paths, P_FP is the set of Trojan-free paths identified as containing Trojans, P_TN is the set of correctly identified Trojan-free paths, and P_FN is the set of Trojan paths identified as Trojan-free;
step D3: from the preliminary result set {P_TP, P_FP, P_TN, P_FN} obtained in step D2, selecting only the set of correctly identified Trojan paths P_TP as the pre-detection result for subsequent positioning.
Step E: d, dividing paths of the pre-detection result obtained in the step D and constructing virtual positioning coordinates to obtain a short path set SL for positioning;
step E1: numbering the paths in the pre-detection result obtained in step D to obtain the original long path set LL = {LL_i}, 1 ≤ i ≤ TP, for positioning, where TP is the number of paths contained in the set of correctly identified Trojan paths P_TP from step D2;
step E2: setting the division length cutlen, dividing the long path LL_i sequentially into groups of cutlen logic gates to obtain a plurality of short paths, and setting virtual positioning coordinates for the short paths;
step E21: setting the division length cutlen;
step E22: for the long path LL_i, calculating the number of short paths num_i that can be generated after it is divided, as num_i = ⌈length_i / cutlen⌉, where length_i denotes the length of the long path LL_i;
step E23: dividing the long path LL_i sequentially into groups of cutlen logic gates to obtain a plurality of short paths, where j is the index of a short path, indicating that the j-th short path was divided from the long path LL_i;
step E24: according to the results of steps E22 and E23, setting a virtual positioning coordinate for each short path to record the possible Trojan location; the coordinate records the index i of the original long path LL_i and the index t_i of the division, t_i being the t-th division of the original long path LL_i;
step E25: repeating step E24 until the virtual positioning coordinates of all num_i short paths have been set.
Step E3: and E2 is executed on each original long path set LL obtained in the step E1 to obtain a short path set SL and a virtual positioning coordinate set, and path division and virtual positioning coordinate construction are completed.
Step F: and E, inputting the short path set SL obtained in the step E into the textCNN model trained in the step D to obtain a positioning result P.
Step F1: one path in the short path set SLInputting the textCNN model trained in the step D, and predicting the classification result;
step F2: if the prediction result output by the TextCNN model is that the Trojan path exists, the corresponding virtual positioning coordinate is usedRecording the positioning result P;
step F3: and F1 and step F2 are repeated until all the short paths execute the operation, and the final positioning result P is output to finish positioning.
The invention also provides a gate-level hardware Trojan horse positioning system based on deep learning, which comprises:
the path generation module is used for generating path statements representing circuit routing, and comprises a search submodule, a temporary path submodule and a label submodule; first, the input gate-level netlist files of the training set Tr and the test set Ts are preprocessed, and the search submodule performs a depth-first search on them to obtain a tree graph G representing the interconnection relations of the different logic gates; the temporary path submodule then generates the unlabeled path sets of the training set Tr and the test set Ts; finally, the label submodule labels the unlabeled paths to generate the labeled path sets of the training set Tr and the test set Ts;
the model generation module is used for constructing and training the TextCNN model, and comprises a vectorization submodule, a model construction submodule and a model training submodule; first, the vectorization submodule generates a vocabulary file from the path set of the training set Tr produced by the label submodule; the model construction submodule then constructs and initializes the TextCNN model; finally, the model training submodule inputs the path set into the model and completes its training;
the pre-detection module is used for obtaining the pre-detection result of the test set Ts, and comprises a storage submodule, a pre-detection submodule and an output submodule; first, the storage submodule adds a storage operation to the last fully connected layer of the TextCNN model constructed by the model construction submodule, so that the pre-detection result can be recorded; the pre-detection submodule then pre-detects the paths in the path set to obtain the preliminary result set {P_TP, P_FP, P_TN, P_FN}, taking the set of correctly identified Trojan paths P_TP as the pre-detection result, which the output submodule outputs;
the path division module is used for dividing the result paths output by the output submodule into short paths and narrowing the positioning range, and comprises a sequencing submodule, a division submodule and a quasi-coordinate submodule; the paths of the pre-detection result P_TP output by the output submodule are numbered by the sequencing submodule, divided into a plurality of short paths by the division submodule, and finally the quasi-coordinate submodule sets a virtual positioning coordinate for each short path;
the positioning module is used for positioning the Trojan, and comprises a loading submodule and an output submodule; a short path is loaded into the TextCNN model trained by the model generation module through the loading submodule, the predicted result is output through the output submodule, the paths predicted to contain a Trojan are then selected, and their corresponding virtual positioning coordinates are output to finish positioning.
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.
Claims (9)
1. A gate-level hardware Trojan horse positioning method based on deep learning is characterized by comprising the following steps:
step A: acquiring seven public gate-level netlist files, and dividing the data set with the leave-one-out method to obtain a training set Tr and a test set Ts;
step B: preprocessing the gate-level netlist files of the training set Tr and the test set Ts obtained in step A, and applying a depth-first search algorithm to obtain the path sets of the training set Tr and the test set Ts;
step C: constructing and initializing a TextCNN model for detecting and positioning HT, and training it on the path set of the training set Tr obtained in step B;
step D: inputting the path set of the test set Ts obtained in step B into the TextCNN model trained in step C;
step E: d, dividing paths of the pre-detection result obtained in the step D and constructing virtual positioning coordinates to obtain a short path set SL for positioning;
step F: and E, inputting the short path set SL obtained in the step E into the textCNN model trained in the step D to obtain a positioning result P.
2. The deep learning-based gate-level hardware Trojan horse positioning method according to claim 1, wherein the step B is specifically as follows:
step B1: traversing the netlist file by using a depth-first search algorithm, and taking a wire network as an intermediary to obtain a tree graph G representing the interconnection relation of different logic gates;
step B2: based on the tree diagram G obtained in the step B1, the situation of the real circuit can be restored and a plurality of label-free paths can be obtained, and then the label-free paths are combined into a label-free path set of the netlist;
step B3: performing steps B1 and B2 on the gate-level netlist files of the training set Tr and the test set Ts obtained in step A, finally obtaining the unlabeled path sets of the training set Tr and the test set Ts.
3. The deep learning-based gate-level hardware Trojan horse positioning method according to claim 1, wherein step C specifically comprises:
step C1: generating a vocabulary from the path set of the training set Tr obtained in step B, so that the TextCNN model can extract features;
step C2: constructing and initializing the TextCNN model.
4. The deep learning-based gate-level hardware Trojan horse positioning method of claim 3, wherein step C1 specifically comprises:
step C11: converting the path set of the training set Tr obtained in step B into text content;
step C12: based on the text content obtained in step C11, reading the words one by one and counting the frequency of each word;
step C13: assigning each word a sequence number in descending order of frequency, completing the vectorized representation of the words;
step C14: packaging the words and their corresponding sequence numbers into a dictionary, and writing the dictionary into a vocabulary file, completing vocabulary generation.
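The frequency-ranked vocabulary of steps C11-C14 can be sketched in a few lines. A minimal illustration (function and variable names are assumptions; tie-breaking on equal frequencies is an added convention the patent does not specify):

```python
# Illustrative sketch of steps C11-C14: count word frequencies in the path
# text, then assign sequence numbers from most to least frequent.
from collections import Counter

def build_vocab(path_texts):
    """Rank words by descending frequency and map each to an integer id."""
    freq = Counter(word for line in path_texts for word in line.split())
    # Sort high-to-low by frequency; break ties alphabetically for determinism.
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word: idx for idx, (word, _) in enumerate(ranked)}

# Toy path text: 'nand2' and 'inv' occur twice, 'nor2' once.
vocab = build_vocab(["nand2 inv nand2", "inv nor2"])
```

Writing `vocab` out as the dictionary-typed vocabulary file then completes step C14.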
5. The deep learning-based gate-level hardware Trojan horse positioning method of claim 1, wherein step D specifically comprises:
step D1: based on the TextCNN model trained in step C, adding a storage operation to the last fully connected layer of the model so that the pre-detection result can be recorded;
step D2: inputting the path set of the test set Ts into the TextCNN model trained in step C to obtain the initial detection result set {P_TP, P_FP, P_TN, P_FN}, wherein P_TP is the set of correctly identified Trojan paths, P_FP is the set of Trojan-free paths identified as containing Trojans, P_TN is the set of correctly identified Trojan-free paths, and P_FN is the set of Trojan paths identified as Trojan-free;
step D3: from the initial detection result set {P_TP, P_FP, P_TN, P_FN} obtained in step D2, selecting the set P_TP of correctly identified Trojan paths as the pre-detection result.
6. The deep learning-based gate-level hardware Trojan horse positioning method of claim 1, wherein step E specifically comprises:
step E1: numbering the paths in the pre-detection result obtained in step D to obtain the original long path set LL = {LL_i | i = 1, 2, ..., TP} for positioning, wherein TP is the number of paths contained in the set P_TP of correctly identified Trojan paths from step D2;
step E2: setting the division length cutlen, dividing each long path LL_i sequentially into groups of cutlen logic gates to obtain a plurality of short paths, and setting virtual positioning coordinates for the short paths;
step E3: performing step E2 on each long path in the set LL obtained in step E1 to obtain the short path set SL and the virtual positioning coordinate set, completing path division and virtual positioning coordinate construction.
7. The deep learning-based gate-level hardware Trojan horse positioning method of claim 6, wherein step E2 specifically comprises:
step E21: setting the division length cutlen;
step E22: for the long path LL_i, calculating the number num_i of short paths that can be generated after it is divided, with the formula num_i = ⌈length_i / cutlen⌉, wherein length_i denotes the length of the long path LL_i;
step E23: dividing the long path LL_i sequentially into groups of cutlen logic gates to obtain a plurality of short paths SL_i^j, wherein j is the index of the short path, indicating that the j-th short path is divided from the long path LL_i;
step E24: according to the results of step E22 and step E23, setting a virtual positioning coordinate for each short path SL_i^j to record the possible Trojan location, the coordinate being determined by the partition index t_i, wherein t_i is the t-th partition of the original long path LL_i;
step E25: repeating step E24 until virtual positioning coordinates have been set for all num_i short paths.
8. The deep learning-based gate-level hardware Trojan horse positioning method of claim 1, wherein step F specifically comprises:
step F1: inputting a path from the short path set SL into the TextCNN model trained in step D and predicting its classification result;
step F2: if the prediction result output by the TextCNN model is a Trojan path, recording the corresponding virtual positioning coordinate in the positioning result P;
step F3: repeating step F1 and step F2 until all short paths have been processed, then outputting the final positioning result P to complete the positioning.
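The positioning loop of steps F1-F3 reduces to filtering short paths through the classifier. A minimal sketch, where `classify` stands in for the trained TextCNN model (the toy classifier and all names are assumptions for illustration):

```python
# Illustrative sketch of steps F1-F3: run each short path through the
# classifier and record the virtual coordinates of every path predicted
# to contain a Trojan. `classify` is a stand-in for the trained TextCNN.

def locate(short_paths, coords, classify):
    """Return the coordinates of every short path flagged as a Trojan path."""
    return [c for p, c in zip(short_paths, coords) if classify(p)]

# Toy classifier: flag any segment containing the (hypothetical) gate "trj".
classify = lambda path: "trj" in path
P = locate([["g0", "g1"], ["trj", "g2"]], [(0, 0, 1), (0, 2, 3)], classify)
```

The returned coordinate list P is the final positioning result: each entry points at one cutlen-sized region of an original long path.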
9. A gate-level hardware Trojan horse positioning system based on deep learning, comprising:
a path generation module: used for generating path statements representing circuit wiring, comprising a search submodule, a temporary path submodule and a label submodule; firstly, the input gate-level netlist files of the training set Tr and the test set Ts are preprocessed, and the search submodule performs a depth-first search on them to obtain a tree graph G representing the interconnection relations of the different logic gates; the temporary path submodule then generates the unlabeled path sets of the training set Tr and the test set Ts; finally, the label submodule labels the unlabeled paths to generate the labeled path sets of the training set Tr and the test set Ts;
a model generation module: used for constructing and training the TextCNN model, comprising a vectorization submodule, a model construction submodule and a model training submodule; firstly, the vectorization submodule generates a vocabulary file from the path set of the training set Tr produced by the label submodule; the model construction submodule then constructs and initializes the TextCNN model; finally, the model training submodule inputs the path set into the model and completes its training;
a pre-detection module: used for obtaining the pre-detection result of the test set Ts, comprising a storage-adding submodule, a pre-detection submodule and an output submodule; firstly, the storage-adding submodule adds a storage operation to the last fully connected layer of the TextCNN model constructed by the model construction submodule so that the pre-detection result can be recorded; the pre-detection submodule then pre-detects the paths in the path set to obtain the initial detection result set {P_TP, P_FP, P_TN, P_FN}; finally, the set P_TP of correctly identified Trojan paths is taken as the pre-detection result and output by the output submodule;
a path division module: used for dividing the result paths output by the pre-detection module into short paths to narrow the positioning range, comprising a sequencing submodule, a division submodule and a quasi-coordinate submodule; the sequencing submodule numbers the paths in the pre-detection result P_TP, the division submodule divides them into a plurality of short paths, and the quasi-coordinate submodule sets a virtual positioning coordinate for each short path;
a positioning module: used for completing the positioning of the Trojan, comprising a loading submodule and an output submodule; firstly, the loading submodule loads the short paths into the TextCNN model trained by the model generation module, and the output submodule outputs the prediction results; then the paths predicted as Trojan paths are selected and their corresponding virtual positioning coordinates are output, completing the positioning.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111412498.9A CN114065308A (en) | 2021-11-25 | 2021-11-25 | Gate-level hardware Trojan horse positioning method and system based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111412498.9A CN114065308A (en) | 2021-11-25 | 2021-11-25 | Gate-level hardware Trojan horse positioning method and system based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114065308A true CN114065308A (en) | 2022-02-18 |
Family
ID=80276358
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111412498.9A Pending CN114065308A (en) | 2021-11-25 | 2021-11-25 | Gate-level hardware Trojan horse positioning method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114065308A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109684834A (en) * | 2018-12-21 | 2019-04-26 | 福州大学 | A kind of gate leve hardware Trojan horse recognition method based on XGBoost |
US20190272375A1 (en) * | 2019-03-28 | 2019-09-05 | Intel Corporation | Trust model for malware classification |
CN113486347A (en) * | 2021-06-30 | 2021-10-08 | 福州大学 | Deep learning hardware Trojan horse detection method based on semantic understanding |
CN113591084A (en) * | 2021-07-26 | 2021-11-02 | 福州大学 | Method and system for identifying transform malicious chip based on circuit path statement |
- 2021-11-25 CN CN202111412498.9A patent/CN114065308A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109684834A (en) * | 2018-12-21 | 2019-04-26 | 福州大学 | A kind of gate leve hardware Trojan horse recognition method based on XGBoost |
US20190272375A1 (en) * | 2019-03-28 | 2019-09-05 | Intel Corporation | Trust model for malware classification |
CN113486347A (en) * | 2021-06-30 | 2021-10-08 | 福州大学 | Deep learning hardware Trojan horse detection method based on semantic understanding |
CN113591084A (en) * | 2021-07-26 | 2021-11-02 | 福州大学 | Method and system for identifying transform malicious chip based on circuit path statement |
Non-Patent Citations (1)
Title |
---|
LIU Zhiqiang; ZHANG Mingjin; CHI Yuan; LI Yunsong: "A Hardware Trojan Detection Algorithm Based on Deep Learning", Journal of Xidian University, no. 06, pages 37 - 45 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chefer et al. | Transformer interpretability beyond attention visualization | |
Baly et al. | We can detect your bias: Predicting the political ideology of news articles | |
Pang et al. | Predicting vulnerable software components through deep neural network | |
Yasaei et al. | Gnn4tj: Graph neural networks for hardware trojan detection at register transfer level | |
CN112232058B (en) | False news identification method and system based on deep learning three-layer semantic extraction framework | |
CN112215004A (en) | Application method in extraction of text entities of military equipment based on transfer learning | |
CN107168992A (en) | Article sorting technique and device, equipment and computer-readable recording medium based on artificial intelligence | |
CN113742733B (en) | Method and device for extracting trigger words of reading and understanding vulnerability event and identifying vulnerability type | |
CN109960727A (en) | For the individual privacy information automatic testing method and system of non-structured text | |
CN109144879B (en) | Test analysis method and device | |
US9703658B2 (en) | Identifying failure mechanisms based on a population of scan diagnostic reports | |
Azriel et al. | SoK: An overview of algorithmic methods in IC reverse engineering | |
Cruz et al. | On document representations for detection of biased news articles | |
Gong et al. | Zero-shot relation classification from side information | |
CN116150757A (en) | Intelligent contract unknown vulnerability detection method based on CNN-LSTM multi-classification model | |
CN114239083A (en) | Efficient state register identification method based on graph neural network | |
Zhu et al. | Tag: Learning circuit spatial embedding from layouts | |
US6405351B1 (en) | System for verifying leaf-cell circuit properties | |
CN112417147A (en) | Method and device for selecting training samples | |
CN116522334A (en) | RTL-level hardware Trojan detection method based on graph neural network and storage medium | |
CN111738290A (en) | Image detection method, model construction and training method, device, equipment and medium | |
CN114065308A (en) | Gate-level hardware Trojan horse positioning method and system based on deep learning | |
Rematska et al. | A survey on reverse engineering of technical diagrams | |
CN115982388A (en) | Case quality control map establishing method, case document quality testing method, case quality control map establishing equipment and storage medium | |
CN113836297B (en) | Training method and device for text emotion analysis model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||