CN114065308A - Gate-level hardware Trojan horse positioning method and system based on deep learning - Google Patents


Info

Publication number
CN114065308A
Authority
CN
China
Prior art keywords
path
submodule
positioning
paths
gate
Prior art date
Legal status
Pending
Application number
CN202111412498.9A
Other languages
Chinese (zh)
Inventor
董晨
张媛媛
许熠
黄槟鸿
黄小刚
Current Assignee
Fuzhou University
Original Assignee
Fuzhou University
Priority date
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN202111412498.9A
Publication of CN114065308A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/32 Circuit design at the digital level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Geometry (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a gate-level hardware Trojan horse positioning method and system based on deep learning. The method first obtains seven public gate-level netlist files to form a training set and a test set; the netlist files are then preprocessed and converted into path statements by a depth-first search algorithm, completing path generation; a TextCNN model is then constructed and trained for detection and positioning; the path set of the test set is input into the model to obtain a pre-detection result; the paths of the pre-detection result are divided and virtual positioning coordinates are constructed to obtain a short path set SL for positioning; finally, SL is input into the TextCNN model to obtain a positioning result P. The invention realizes quick and effective evaluation of the security performance of an integrated circuit, and even the discovery and localization of threats.

Description

Gate-level hardware Trojan horse positioning method and system based on deep learning
Technical Field
The invention relates to the field of computer hardware protection and system-on-chip security, in particular to a gate-level hardware Trojan horse positioning method and system based on deep learning.
Background
Integrated Circuits (ICs) are the core components of computer hardware, and their design and manufacture are complex. To reduce costs, many manufacturers choose to outsource part of the IC manufacturing process to so-called third-party vendors, which undoubtedly introduces significant threats to hardware security. A Hardware Trojan (HT) is a small piece of circuitry that an attacker inserts into the original IC layout for some malicious purpose. An HT may be inserted at any stage of IC manufacture, and its security threats include changing circuit functionality, leaking information, denial of service, and so on. Current studies on HT detection can be roughly divided into pre-silicon detection and post-silicon detection: pre-silicon detection performs security checks before the IC chip is fabricated, while post-silicon detection performs them after fabrication. Pre-silicon detection reduces cost more, achieving a better balance between security and profit. Pre-silicon detection is mainly performed in the design stage of the IC; the gate level is the last link of the design stage, so detecting HTs at the gate level is very effective.
In IC design, the circuit is described at several levels of abstraction, from high to low: system level, algorithm level, register transfer level, gate level, and transistor level. Gate-level detection is a common static detection approach that explores new Trojan detection methods by analyzing the logic structure of the circuit through its gate-level netlist. The key to detecting HTs at the gate level is the netlist file describing this level, i.e., the gate-level netlist, which describes the interconnection relationships between circuit elements such as logic gates and other elements at the same level. To date, many works have proposed methods for preventing and detecting HTs at the gate level. The most common approach is to mine HT features from the gate-level netlist and feed them into a deep learning model for feature learning, so as to detect HTs effectively. Numerous detection studies have achieved considerable results, but detection alone cannot truly resist HTs: finding the specific location of an HT is a prerequisite for combating it more precisely. However, studies on locating HTs are still scarce.
Disclosure of Invention
In view of the above, the present invention provides a gate-level hardware trojan positioning method and system based on deep learning, which can position a hardware trojan at the gate level.
In order to achieve the purpose, the invention adopts the following technical scheme:
a gate-level hardware Trojan horse positioning method based on deep learning comprises the following steps:
Step A: acquiring seven public gate-level netlist files, and dividing the data set by using the leave-one-out method to obtain a training set Tr and a test set Ts;
Step B: preprocessing the gate-level netlist files of the training set Tr and the test set Ts obtained in the step A, and combining a depth-first search algorithm to obtain the path set of the training set Tr and the path set of the test set Ts;
And C: constructing and initializing a TextCNN model for detecting and positioning HT, and based on the path set of the training set Tr obtained in the step B
Figure BDA0003374690440000023
Training;
Step D: inputting the path set of the test set Ts obtained in the step B into the TextCNN model trained in the step C to obtain the pre-detection result;
Step E: dividing the paths of the pre-detection result obtained in the step D and constructing virtual positioning coordinates to obtain a short path set SL for positioning;
Step F: inputting the short path set SL obtained in the step E into the TextCNN model trained in the step D to obtain a positioning result P.
Further, the step B is specifically as follows:
step B1: traversing the netlist file by using a depth-first search algorithm, and taking a wire network as an intermediary to obtain a tree graph G representing the interconnection relation of different logic gates;
step B2: based on the tree diagram G obtained in the step B1, the situation of the real circuit can be restored and a plurality of label-free paths can be obtained, and then the label-free paths are combined into a label-free path set of the netlist;
Step B3: performing the operations of the step B1 and the step B2 on the gate-level netlist files of the training set Tr and the test set Ts obtained in the step A, finally obtaining the unlabeled path sets of the training set Tr and the test set Ts;
Step B4: labeling the unlabeled paths obtained in the step B3 based on the information of the gate-level netlists of the training set Tr and the test set Ts obtained in the step A, to obtain the labeled path sets of the training set Tr and the test set Ts.
further, the step C specifically includes:
Step C1: generating a vocabulary from the path set of the training set Tr obtained in the step B, so that the TextCNN model can extract features;
step C2: constructing and initializing a TextCNN model;
Step C3: based on the path set of the training set Tr obtained in the step B, the TextCNN model learns the characteristics of Trojan paths and Trojan-free paths respectively, completing the training of the model.
Further, the step C1 is specifically:
Step C11: firstly, converting the path set of the training set Tr obtained in the step B into text content;
step C12: reading the words one by one and calculating the frequency of each word based on the text content obtained in the step C11;
step C13: according to the frequency of the words, marking a sequence number for each word from high to low to finish the vectorization representation of the words;
step C14: and packaging the words and the corresponding serial numbers into a dictionary type, writing the dictionary type into a vocabulary file, and finishing the generation of the vocabulary.
Further, the step D specifically includes:
Step D1: based on the TextCNN model trained in the step C, adding a storage operation to the last fully connected layer of the model so as to conveniently record the pre-detection result;
Step D2: inputting the path set of the test set Ts into the TextCNN model trained in the step C to obtain a preliminary detection result set {P_TP, P_FP, P_TN, P_FN}, wherein P_TP is the set of correctly identified Trojan paths, P_FP is the set of Trojan-free paths identified as containing a Trojan, P_TN is the set of correctly identified Trojan-free paths, and P_FN is the set of Trojan paths identified as Trojan-free;
Step D3: based on the preliminary detection result set {P_TP, P_FP, P_TN, P_FN} obtained in the step D2, selecting the set P_TP of correctly identified Trojan paths as the pre-detection result.
Further, the step E specifically includes:
Step E1: numbering the paths in the pre-detection result obtained in the step D to obtain an original long path set LL = {LL_i | 1 ≤ i ≤ TP} for positioning, wherein TP is the number of paths contained in the set P_TP of correctly identified Trojan paths from the step D2;
Step E2: setting the division length cutlen, sequentially dividing the long path LL_i into groups of cutlen logic gates to obtain a plurality of short paths, and setting virtual positioning coordinates for the short paths;
Step E3: performing the step E2 on each long path in the original long path set LL obtained in the step E1 to obtain the short path set SL and a virtual positioning coordinate set, completing path division and virtual positioning coordinate construction.
Further, the step E2 specifically includes:
Step E21: setting the division length cutlen;
Step E22: for the long path LL_i, calculating the number num_i of short paths that can be generated after it is divided, according to the formula:
num_i = ⌈length_i / cutlen⌉
wherein length_i denotes the length of the long path LL_i;
Step E23: sequentially dividing the long path LL_i into groups of cutlen logic gates to obtain a plurality of short paths, wherein j is the index of a short path, indicating that the j-th short path is divided from the long path LL_i;
Step E24: according to the results of the step E22 and the step E23, setting a virtual positioning coordinate for each short path to record the possible Trojan location, the coordinate being calculated from t_i, wherein t_i denotes the t-th division of the original long path LL_i;
Step E25: repeating the step E24 until the virtual positioning coordinates of all num_i short paths have been set.
Further, the step F specifically includes:
Step F1: inputting a path in the short path set SL into the TextCNN model trained in the step D, and predicting its classification result;
Step F2: if the prediction result output by the TextCNN model is a Trojan path, recording the corresponding virtual positioning coordinate in the positioning result P;
Step F3: repeating the step F1 and the step F2 until all the short paths have been processed, and outputting the final positioning result P to complete positioning.
A gate-level hardware Trojan horse positioning system based on deep learning, comprising:
the path generation module is used for generating path statements representing circuit routing and comprises a search submodule, a temporary path submodule and a label submodule; firstly, the input gate-level netlist files of the training set Tr and the test set Ts are preprocessed, and a depth-first search is performed on them by the search submodule to obtain a tree diagram G representing the interconnection relations of different logic gates; the temporary path submodule then generates the unlabeled path sets of the training set Tr and the test set Ts; finally, the label submodule labels the unlabeled paths to generate the labeled path sets of the training set Tr and the test set Ts;
the model generation module is used for constructing and training a TextCNN model and comprises a vectorization submodule, a model construction submodule and a model training submodule; firstly, a vocabulary file is generated by the vectorization submodule from the path set of the training set Tr generated by the label submodule, a TextCNN model is then constructed and initialized by the model construction submodule, and finally the path set is input by the model training submodule to complete the training of the model;
the pre-detection module is used for obtaining the pre-detection result of the test set Ts and comprises a storage-adding submodule, a pre-detection submodule and an output submodule; firstly, a storage operation is added by the storage-adding submodule to the last fully connected layer of the TextCNN model constructed by the model construction submodule so as to record the pre-detection result; the pre-detection submodule then pre-detects the paths in the path set of the test set Ts to obtain the preliminary detection result set {P_TP, P_FP, P_TN, P_FN}; finally, the set P_TP of correctly identified Trojan paths is taken as the pre-detection result and output by the output submodule;
the path dividing module is used for dividing the result paths output by the output submodule into short paths to narrow the positioning range and comprises a sequencing submodule, a dividing submodule and a quasi-coordinate submodule; the paths in the pre-detection result P_TP output by the output submodule are numbered by the sequencing submodule, then divided into a plurality of short paths by the dividing submodule, and finally a virtual positioning coordinate is set for each short path by the quasi-coordinate submodule;
the positioning module is used for positioning the Trojan horse and comprises a loading submodule and an output submodule; firstly, a short path is loaded by the loading submodule into the TextCNN model trained by the model generation module, and the predicted result is output by the output submodule; the paths predicted to be Trojan paths are then selected and their corresponding virtual positioning coordinates are output, completing positioning.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention realizes the detection of hardware Trojans by applying a convolutional neural network to text classification;
2. the invention converts the detection problem of the hardware Trojan horse into a two-classification problem, and enables the convolutional neural network to learn the context characteristics of circuit path statements and autonomously explore the characteristics of a Trojan horse path and a Trojan-free path so as to classify. Secondly, on the basis of detection, the positioning of the hardware trojan is explored, the path segmentation technology is considered to be applied to the positioning problem, and the long path in the circuit is divided into a plurality of short paths, so that the positioning range of the hardware trojan is narrowed;
3. The invention can carry out further positioning on the basis of detection, breaking through the previous situation in which hardware Trojans could only be coarsely located from images of the manufactured integrated circuit; it realizes positioning at the gate level and thus resists hardware Trojans more effectively from the design stage of the integrated circuit;
4. The invention can be used in integrated circuit security detection systems to evaluate the security performance of an integrated circuit and even to discover and locate threats, so that designers can take measures against such threats.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
fig. 2 is a schematic diagram of a system structure according to an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
Referring to fig. 1, the present invention provides a gate-level hardware Trojan horse positioning method based on deep learning, which includes the following steps:
step A: firstly, acquiring seven public gate-level netlist files, and dividing a data set by using a leave-one-out method to obtain a training set Tr and a test set Ts;
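The data split of step A can be illustrated with a minimal Python sketch of a leave-one-out division over seven netlist files; the file names below are hypothetical placeholders, not the benchmarks used by the invention.

```python
# Minimal sketch of the leave-one-out split over seven gate-level netlists.
# The file names below are hypothetical placeholders.
netlists = [f"benchmark_{k}.v" for k in range(1, 8)]  # seven public netlist files

def leave_one_out_splits(files):
    """Yield (training set Tr, test set Ts) pairs: each file is the test set once."""
    for i, test_file in enumerate(files):
        train_files = files[:i] + files[i + 1:]   # the remaining six netlists
        yield train_files, [test_file]

for fold, (Tr, Ts) in enumerate(leave_one_out_splits(netlists)):
    print(f"fold {fold}: train on {len(Tr)} netlists, test on {Ts[0]}")
```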
and B: b, preprocessing the gate-level netlist files of the training set Tr and the test set Ts obtained in the step A, and combining a depth-first search algorithm to obtain a path set of the training set Tr and the test set Ts
Figure BDA0003374690440000091
And
Figure BDA0003374690440000092
completing the generation of the path;
Step B1: traversing the netlist file by using a depth-first search algorithm, taking the wire nets as intermediaries to obtain a tree diagram G representing the interconnection relations of different logic gates;
step B2: based on the tree diagram G obtained in the step B1, the situation of the real circuit can be restored and a plurality of label-free paths can be obtained, and then the label-free paths are combined into a label-free path set of the netlist;
Step B3: performing the operations of the step B1 and the step B2 on the gate-level netlist files of the training set Tr and the test set Ts obtained in the step A, finally obtaining the unlabeled path sets of the training set Tr and the test set Ts;
Step B4: labeling the unlabeled paths obtained in the step B3 based on the information of the gate-level netlists of the training set Tr and the test set Ts obtained in the step A, to obtain the labeled path sets of the training set Tr and the test set Ts.
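The path generation of steps B1 to B4 can be sketched as follows. This is a simplified illustration rather than the patented implementation: the pre-parsed netlist dictionary, the gate and net names, and the rule that a path is labeled as a Trojan path when it contains a known Trojan gate are all assumptions made for the example, and Verilog parsing is omitted.

```python
# Sketch of path generation by depth-first search over a gate-level netlist.
# The netlist is assumed to be pre-parsed into a dict:
#   gates = {gate_name: {"type": "AND2", "inputs": [net, ...], "output": net}}
# The netlist format and labeling rule are assumptions for this example.

def build_fanout(gates):
    """Map each wire net to the gates that read it (the net acts as intermediary)."""
    fanout = {}
    for name, g in gates.items():
        for net in g["inputs"]:
            fanout.setdefault(net, []).append(name)
    return fanout

def generate_paths(gates, primary_inputs, trojan_gates=frozenset()):
    """Return (path_statement, label) pairs; label 1 if the path touches a Trojan gate."""
    fanout = build_fanout(gates)
    paths = []

    def dfs(gate, trail):
        trail = trail + [gate]
        successors = fanout.get(gates[gate]["output"], [])
        if not successors:                       # reached a primary output / leaf
            statement = " ".join(gates[g]["type"] for g in trail)
            label = int(any(g in trojan_gates for g in trail))
            paths.append((statement, label))
            return
        for nxt in successors:
            if nxt not in trail:                 # avoid cycles through feedback nets
                dfs(nxt, trail)

    start_gates = {g for net in primary_inputs for g in fanout.get(net, [])}
    for gate in sorted(start_gates):
        dfs(gate, [])
    return paths

# Tiny hypothetical example: IN1 -> AND2 -> XOR2 -> OUT
gates = {
    "u1": {"type": "AND2", "inputs": ["IN1", "IN2"], "output": "n1"},
    "u2": {"type": "XOR2", "inputs": ["n1", "IN3"], "output": "OUT"},
}
print(generate_paths(gates, primary_inputs=["IN1", "IN2", "IN3"]))
```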
Step C: constructing and initializing a TextCNN model for detecting and positioning HT, and inputting the path set of the training set Tr obtained in the step B to complete the construction and training of the model;
Step C1: generating a vocabulary from the path set of the training set Tr obtained in the step B, so that the TextCNN model can extract features;
Step C11: firstly, converting the path set of the training set Tr obtained in the step B into text content;
Step C12: reading the words one by one and calculating the frequency of each word based on the text content obtained in the step C11;
step C13: according to the frequency of the words, marking a sequence number for each word from high to low to finish the vectorization representation of the words;
step C14: and packaging the words and the corresponding serial numbers into a dictionary type, writing the dictionary type into a vocabulary file, and finishing the generation of the vocabulary.
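A minimal sketch of the vocabulary generation in steps C11 to C14 is given below; the JSON file format and the choice to start numbering at 1 are assumptions, not requirements of the invention.

```python
# Sketch of vocabulary generation from the path statements (steps C11-C14).
# Each word is ranked by frequency and assigned a sequence number; the
# word -> number dictionary is written to a vocabulary file (name assumed).
import json
from collections import Counter

def build_vocabulary(path_statements, vocab_file="vocab.json"):
    counts = Counter()
    for statement in path_statements:        # steps C11/C12: read words, count frequency
        counts.update(statement.split())
    vocab = {word: idx for idx, (word, _) in
             enumerate(counts.most_common(), start=1)}   # step C13: number by frequency
    with open(vocab_file, "w") as f:         # step C14: write the dictionary to a file
        json.dump(vocab, f)
    return vocab

def vectorize(statement, vocab, unknown=0):
    """Turn one path statement into a sequence of word numbers."""
    return [vocab.get(word, unknown) for word in statement.split()]

vocab = build_vocabulary(["AND2 XOR2 DFF", "AND2 NOR2 DFF"])
print(vocab, vectorize("AND2 DFF", vocab))
```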
Step C2: constructing and initializing a TextCNN model;
Step C3: based on the path set of the training set Tr obtained in the step B, the TextCNN model learns the characteristics of Trojan paths and Trojan-free paths respectively, completing the training of the model.
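The TextCNN construction and training of steps C2 and C3 can be sketched in PyTorch as follows. The patent does not fix a framework or hyperparameters, so the embedding size, kernel sizes, filter count and optimizer below are assumptions.

```python
# Sketch of a TextCNN classifier for Trojan / Trojan-free path statements.
# PyTorch is used here; embedding size, kernel sizes and filter counts are
# assumptions, since the patent does not fix these hyperparameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, num_filters=64,
                 kernel_sizes=(2, 3, 4), num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes])
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, x):                     # x: (batch, sequence of word numbers)
        e = self.embedding(x).transpose(1, 2) # (batch, embed_dim, seq_len)
        pooled = [F.relu(conv(e)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))  # logits: Trojan vs. Trojan-free

# One hypothetical training step on padded word-number sequences.
model = TextCNN(vocab_size=100)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.randint(1, 100, (8, 20))        # 8 paths, 20 gates each (toy data)
labels = torch.randint(0, 2, (8,))
loss = F.cross_entropy(model(batch), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```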
Step D: inputting the path set of the test set Ts obtained in the step B into the TextCNN model trained in the step C to obtain a pre-detection result;
Step D1: based on the TextCNN model trained in the step C, adding a storage operation to the last fully connected layer of the model so as to conveniently record the pre-detection result;
Step D2: inputting the path set of the test set Ts into the TextCNN model trained in the step C to obtain a preliminary detection result set {P_TP, P_FP, P_TN, P_FN}, wherein P_TP is the set of correctly identified Trojan paths, P_FP is the set of Trojan-free paths identified as containing a Trojan, P_TN is the set of correctly identified Trojan-free paths, and P_FN is the set of Trojan paths identified as Trojan-free;
Step D3: based on the preliminary detection result set {P_TP, P_FP, P_TN, P_FN} obtained in the step D2, selecting only the set P_TP of correctly identified Trojan paths therein as the pre-detection result for subsequent positioning.
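The pre-detection of step D can be sketched as follows, assuming the label convention 1 = Trojan path and 0 = Trojan-free path; the encode() helper that turns a path statement into a padded tensor of word numbers is a hypothetical stand-in.

```python
# Sketch of the pre-detection step: partition the test paths into
# {P_TP, P_FP, P_TN, P_FN} and keep P_TP for positioning.
# Assumes prediction/label 1 = Trojan path, 0 = Trojan-free path.
import torch

def pre_detect(model, test_paths, encode):
    """test_paths: list of (path_statement, true_label); encode(): words -> tensor."""
    buckets = {"P_TP": [], "P_FP": [], "P_TN": [], "P_FN": []}
    model.eval()
    with torch.no_grad():
        for statement, label in test_paths:
            pred = model(encode(statement)).argmax(dim=1).item()
            if pred == 1 and label == 1:
                buckets["P_TP"].append(statement)   # correctly identified Trojan path
            elif pred == 1 and label == 0:
                buckets["P_FP"].append(statement)   # Trojan-free path flagged as Trojan
            elif pred == 0 and label == 0:
                buckets["P_TN"].append(statement)   # correctly identified Trojan-free path
            else:
                buckets["P_FN"].append(statement)   # missed Trojan path
    return buckets, buckets["P_TP"]                 # P_TP is the pre-detection result
```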
Step E: d, dividing paths of the pre-detection result obtained in the step D and constructing virtual positioning coordinates to obtain a short path set SL for positioning;
Step E1: numbering the paths in the pre-detection result obtained in the step D to obtain an original long path set LL = {LL_i | 1 ≤ i ≤ TP} for positioning, wherein TP is the number of paths contained in the set P_TP of correctly identified Trojan paths from the step D2;
Step E2: setting the division length cutlen, sequentially dividing the long path LL_i into groups of cutlen logic gates to obtain a plurality of short paths, and setting virtual positioning coordinates for the short paths;
Step E21: setting the division length cutlen;
Step E22: for the long path LL_i, calculating the number num_i of short paths that can be generated after it is divided, according to the formula:
num_i = ⌈length_i / cutlen⌉
wherein length_i denotes the length of the long path LL_i;
Step E23: sequentially dividing the long path LL_i into groups of cutlen logic gates to obtain a plurality of short paths, wherein j is the index of a short path, indicating that the j-th short path is divided from the long path LL_i;
Step E24: according to the results of the step E22 and the step E23, setting a virtual positioning coordinate for each short path to record the possible Trojan location, the coordinate being calculated from t_i, wherein t_i denotes the t-th division of the original long path LL_i;
Step E25: repeating the step E24 until the virtual positioning coordinates of all num_i short paths have been set.
Step E3: performing the step E2 on each long path in the original long path set LL obtained in the step E1 to obtain the short path set SL and a virtual positioning coordinate set, completing path division and virtual positioning coordinate construction.
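The path division of step E can be sketched as follows. Because the patent's exact coordinate formula is only reproduced as a figure, the (long-path index i, division index t) pair used as the virtual positioning coordinate here is an assumption that follows the description of t_i; the ceiling-based count of short paths is likewise an inference from the division into groups of cutlen gates.

```python
# Sketch of step E: divide each pre-detected long path into groups of
# cutlen logic gates and attach a virtual positioning coordinate.
# The (long-path index i, division index t) coordinate is an assumption,
# since the patent's exact coordinate formula is only shown as a figure.
import math

def divide_paths(long_paths, cutlen):
    """long_paths: list of path statements (P_TP); returns (short paths SL, coords)."""
    short_paths, coords = [], []
    for i, statement in enumerate(long_paths, start=1):   # step E1: number the long paths
        gates = statement.split()
        num_i = math.ceil(len(gates) / cutlen)             # step E22: number of short paths
        for t in range(1, num_i + 1):                      # step E23: t-th division of LL_i
            group = gates[(t - 1) * cutlen : t * cutlen]
            short_paths.append(" ".join(group))
            coords.append((i, t))                          # step E24: virtual coordinate
    return short_paths, coords

SL, C = divide_paths(["AND2 XOR2 DFF NOR2 INV"], cutlen=2)
print(SL)   # ['AND2 XOR2', 'DFF NOR2', 'INV']
print(C)    # [(1, 1), (1, 2), (1, 3)]
```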
Step F: inputting the short path set SL obtained in the step E into the TextCNN model trained in the step D to obtain a positioning result P.
Step F1: one path in the short path set SL
Figure BDA0003374690440000122
Inputting the textCNN model trained in the step D, and predicting the classification result;
Step F2: if the prediction result output by the TextCNN model is a Trojan path, recording the corresponding virtual positioning coordinate in the positioning result P;
Step F3: repeating the step F1 and the step F2 until all the short paths have been processed, and outputting the final positioning result P to complete positioning.
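The positioning loop of step F can be sketched as follows, under the same assumptions as the previous sketches (prediction 1 = Trojan path, hypothetical encode() helper).

```python
# Sketch of step F: classify every short path and record the virtual
# coordinates of those predicted as Trojan paths (prediction 1 = Trojan).
import torch

def locate(model, short_paths, coords, encode):
    """Return the positioning result P: coordinates of short paths predicted as Trojan."""
    P = []
    model.eval()
    with torch.no_grad():
        for statement, coord in zip(short_paths, coords):
            pred = model(encode(statement)).argmax(dim=1).item()
            if pred == 1:                 # predicted to contain a Trojan
                P.append(coord)           # keep its virtual positioning coordinate
    return P
```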
The invention also provides a gate-level hardware Trojan horse positioning system based on deep learning, which comprises:
the path generation module is used for generating path statements representing circuit routing and comprises a search submodule, a temporary path submodule and a label submodule; firstly, the input gate-level netlist files of the training set Tr and the test set Ts are preprocessed, and a depth-first search is performed on them by the search submodule to obtain a tree diagram G representing the interconnection relations of different logic gates; the temporary path submodule then generates the unlabeled path sets of the training set Tr and the test set Ts; finally, the label submodule labels the unlabeled paths to generate the labeled path sets of the training set Tr and the test set Ts;
the model generation module is used for constructing and training a TextCNN model and comprises a vectorization submodule, a model construction submodule and a model training submodule; firstly, a vocabulary file is generated by the vectorization submodule from the path set of the training set Tr generated by the label submodule, a TextCNN model is then constructed and initialized by the model construction submodule, and finally the path set is input by the model training submodule to complete the training of the model;
the pre-detection module is used for obtaining the pre-detection result of the test set Ts and comprises a storage-adding submodule, a pre-detection submodule and an output submodule; firstly, a storage operation is added by the storage-adding submodule to the last fully connected layer of the TextCNN model constructed by the model construction submodule so as to record the pre-detection result; the pre-detection submodule then pre-detects the paths in the path set of the test set Ts to obtain the preliminary detection result set {P_TP, P_FP, P_TN, P_FN}; finally, the set P_TP of correctly identified Trojan paths is taken as the pre-detection result and output by the output submodule;
the path dividing module is used for dividing the result paths output by the output submodule into short paths to narrow the positioning range and comprises a sequencing submodule, a dividing submodule and a quasi-coordinate submodule; the paths in the pre-detection result P_TP output by the output submodule are numbered by the sequencing submodule, then divided into a plurality of short paths by the dividing submodule, and finally a virtual positioning coordinate is set for each short path by the quasi-coordinate submodule;
the positioning module is used for positioning the Trojan horse and comprises a loading submodule and an output submodule; firstly, a short path is loaded by the loading submodule into the TextCNN model trained by the model generation module, and the predicted result is output by the output submodule; the paths predicted to be Trojan paths are then selected and their corresponding virtual positioning coordinates are output, completing positioning.
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.

Claims (9)

1. A gate-level hardware Trojan horse positioning method based on deep learning is characterized by comprising the following steps:
Step A: acquiring seven public gate-level netlist files, and dividing the data set by using the leave-one-out method to obtain a training set Tr and a test set Ts;
and B: b, preprocessing the gate-level netlist files of the training set Tr and the test set Ts obtained in the step A, and combining a depth-first search algorithm to obtain a training set Tr path set
Figure FDA0003374690430000011
And path set of test set Ts
Figure FDA0003374690430000012
And C: constructing and initializing a TextCNN model for detecting and positioning HT, and based on the path set of the training set Tr obtained in the step B
Figure FDA0003374690430000013
Training;
Step D: inputting the path set of the test set Ts obtained in the step B into the TextCNN model trained in the step C;
Step E: dividing the paths of the pre-detection result obtained in the step D and constructing virtual positioning coordinates to obtain a short path set SL for positioning;
Step F: inputting the short path set SL obtained in the step E into the TextCNN model trained in the step D to obtain a positioning result P.
2. The deep learning-based gate-level hardware Trojan horse positioning method according to claim 1, wherein the step B is specifically as follows:
step B1: traversing the netlist file by using a depth-first search algorithm, and taking a wire network as an intermediary to obtain a tree graph G representing the interconnection relation of different logic gates;
step B2: based on the tree diagram G obtained in the step B1, the situation of the real circuit can be restored and a plurality of label-free paths can be obtained, and then the label-free paths are combined into a label-free path set of the netlist;
Step B3: performing the operations of the step B1 and the step B2 on the gate-level netlist files of the training set Tr and the test set Ts obtained in the step A, finally obtaining the unlabeled path sets of the training set Tr and the test set Ts;
Step B4: labeling the unlabeled paths obtained in the step B3 based on the information of the gate-level netlists of the training set Tr and the test set Ts obtained in the step A, to obtain the labeled path sets of the training set Tr and the test set Ts.
3. the deep learning based gate-level hardware Trojan horse positioning method according to claim 1, wherein the step C is specifically as follows:
Step C1: generating a vocabulary from the path set of the training set Tr obtained in the step B, so that the TextCNN model can extract features;
step C2: constructing and initializing a TextCNN model;
Step C3: based on the path set of the training set Tr obtained in the step B, the TextCNN model learns the characteristics of Trojan paths and Trojan-free paths respectively, completing the training of the model.
4. The deep learning based gate-level hardware Trojan horse positioning method of claim 3, wherein: the step C1 specifically includes:
Step C11: firstly, converting the path set of the training set Tr obtained in the step B into text content;
step C12: reading the words one by one and calculating the frequency of each word based on the text content obtained in the step C11;
step C13: according to the frequency of the words, marking a sequence number for each word from high to low to finish the vectorization representation of the words;
step C14: and packaging the words and the corresponding serial numbers into a dictionary type, writing the dictionary type into a vocabulary file, and finishing the generation of the vocabulary.
5. The deep learning based gate-level hardware Trojan horse positioning method of claim 1, wherein: the step D is specifically as follows:
Step D1: based on the TextCNN model trained in the step C, adding a storage operation to the last fully connected layer of the model so as to conveniently record the pre-detection result;
Step D2: inputting the path set of the test set Ts into the TextCNN model trained in the step C to obtain a preliminary detection result set {P_TP, P_FP, P_TN, P_FN}, wherein P_TP is the set of correctly identified Trojan paths, P_FP is the set of Trojan-free paths identified as containing a Trojan, P_TN is the set of correctly identified Trojan-free paths, and P_FN is the set of Trojan paths identified as Trojan-free;
Step D3: based on the preliminary detection result set {P_TP, P_FP, P_TN, P_FN} obtained in the step D2, selecting the set P_TP of correctly identified Trojan paths as the pre-detection result.
6. The deep learning based gate-level hardware Trojan horse positioning method of claim 1, wherein: the step E specifically comprises the following steps:
Step E1: numbering the paths in the pre-detection result obtained in the step D to obtain an original long path set LL = {LL_i | 1 ≤ i ≤ TP} for positioning, wherein TP is the number of paths contained in the set P_TP of correctly identified Trojan paths from the step D2;
Step E2: setting the division length cutlen, sequentially dividing the long path LL_i into groups of cutlen logic gates to obtain a plurality of short paths, and setting virtual positioning coordinates for the short paths;
Step E3: performing the step E2 on each long path in the original long path set LL obtained in the step E1 to obtain the short path set SL and a virtual positioning coordinate set, completing path division and virtual positioning coordinate construction.
7. The deep learning based gate-level hardware Trojan horse positioning method of claim 6, wherein: the step E2 specifically includes:
Step E21: setting the division length cutlen;
Step E22: for the long path LL_i, calculating the number num_i of short paths that can be generated after it is divided, according to the formula:
num_i = ⌈length_i / cutlen⌉
wherein length_i denotes the length of the long path LL_i;
Step E23: sequentially dividing the long path LL_i into groups of cutlen logic gates to obtain a plurality of short paths, wherein j is the index of a short path, indicating that the j-th short path is divided from the long path LL_i;
Step E24: according to the results of the step E22 and the step E23, setting a virtual positioning coordinate for each short path to record the possible Trojan location, the coordinate being calculated from t_i, wherein t_i denotes the t-th division of the original long path LL_i;
Step E25: repeating the step E24 until the virtual positioning coordinates of all num_i short paths have been set.
8. The deep learning based gate-level hardware Trojan horse positioning method of claim 1, wherein: the step F specifically comprises the following steps:
Step F1: inputting a path in the short path set SL into the TextCNN model trained in the step D, and predicting its classification result;
Step F2: if the prediction result output by the TextCNN model is a Trojan path, recording the corresponding virtual positioning coordinate in the positioning result P;
Step F3: repeating the step F1 and the step F2 until all the short paths have been processed, and outputting the final positioning result P to complete positioning.
9. A gate-level hardware trojan positioning system based on deep learning, comprising:
a path generation module: used for generating path statements representing circuit routing, and comprising a search submodule, a temporary path submodule and a label submodule; firstly, the input gate-level netlist files of the training set Tr and the test set Ts are preprocessed, and a depth-first search is performed on them by the search submodule to obtain a tree diagram G representing the interconnection relations of different logic gates; the temporary path submodule then generates the unlabeled path sets of the training set Tr and the test set Ts; finally, the label submodule labels the unlabeled paths to generate the labeled path sets of the training set Tr and the test set Ts;
a model generation module: used for constructing and training a TextCNN model, and comprising a vectorization submodule, a model construction submodule and a model training submodule; firstly, a vocabulary file is generated by the vectorization submodule from the path set of the training set Tr generated by the label submodule, a TextCNN model is then constructed and initialized by the model construction submodule, and finally the path set is input by the model training submodule to complete the training of the model;
a pre-detection module: used for obtaining the pre-detection result of the test set Ts, and comprising a storage-adding submodule, a pre-detection submodule and an output submodule; firstly, a storage operation is added by the storage-adding submodule to the last fully connected layer of the TextCNN model constructed by the model construction submodule so as to record the pre-detection result; the pre-detection submodule then pre-detects the paths in the path set of the test set Ts to obtain the preliminary detection result set {P_TP, P_FP, P_TN, P_FN}; finally, the set P_TP of correctly identified Trojan paths is taken as the pre-detection result and output by the output submodule;
a path division module: used for dividing the result paths output by the output submodule into short paths to narrow the positioning range, and comprising a sequencing submodule, a dividing submodule and a quasi-coordinate submodule; the paths in the pre-detection result P_TP output by the output submodule are numbered by the sequencing submodule, then divided into a plurality of short paths by the dividing submodule, and finally a virtual positioning coordinate is set for each short path by the quasi-coordinate submodule;
a positioning module: used for completing the positioning of the Trojan horse, and comprising a loading submodule and an output submodule; firstly, a short path is loaded by the loading submodule into the TextCNN model trained by the model generation module, and the predicted result is output by the output submodule; the paths predicted to be Trojan paths are then selected and their corresponding virtual positioning coordinates are output, completing positioning.
CN202111412498.9A 2021-11-25 2021-11-25 Gate-level hardware Trojan horse positioning method and system based on deep learning Pending CN114065308A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111412498.9A CN114065308A (en) 2021-11-25 2021-11-25 Gate-level hardware Trojan horse positioning method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111412498.9A CN114065308A (en) 2021-11-25 2021-11-25 Gate-level hardware Trojan horse positioning method and system based on deep learning

Publications (1)

Publication Number Publication Date
CN114065308A true CN114065308A (en) 2022-02-18

Family

ID=80276358

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111412498.9A Pending CN114065308A (en) 2021-11-25 2021-11-25 Gate-level hardware Trojan horse positioning method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN114065308A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109684834A (en) * 2018-12-21 2019-04-26 福州大学 A kind of gate leve hardware Trojan horse recognition method based on XGBoost
US20190272375A1 (en) * 2019-03-28 2019-09-05 Intel Corporation Trust model for malware classification
CN113486347A (en) * 2021-06-30 2021-10-08 福州大学 Deep learning hardware Trojan horse detection method based on semantic understanding
CN113591084A (en) * 2021-07-26 2021-11-02 福州大学 Method and system for identifying transform malicious chip based on circuit path statement


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘志强; 张铭津; 池源; 李云松: "A hardware Trojan detection algorithm based on deep learning" (一种深度学习的硬件木马检测算法), 西安电子科技大学学报 (Journal of Xidian University), no. 06, pages 37-45 *

Similar Documents

Publication Publication Date Title
Chefer et al. Transformer interpretability beyond attention visualization
Baly et al. We can detect your bias: Predicting the political ideology of news articles
Pang et al. Predicting vulnerable software components through deep neural network
Yasaei et al. Gnn4tj: Graph neural networks for hardware trojan detection at register transfer level
CN112232058B (en) False news identification method and system based on deep learning three-layer semantic extraction framework
CN112215004A (en) Application method in extraction of text entities of military equipment based on transfer learning
CN107168992A (en) Article sorting technique and device, equipment and computer-readable recording medium based on artificial intelligence
CN113742733B (en) Method and device for extracting trigger words of reading and understanding vulnerability event and identifying vulnerability type
CN109960727A (en) For the individual privacy information automatic testing method and system of non-structured text
CN109144879B (en) Test analysis method and device
US9703658B2 (en) Identifying failure mechanisms based on a population of scan diagnostic reports
Azriel et al. SoK: An overview of algorithmic methods in IC reverse engineering
Cruz et al. On document representations for detection of biased news articles
Gong et al. Zero-shot relation classification from side information
CN116150757A (en) Intelligent contract unknown vulnerability detection method based on CNN-LSTM multi-classification model
CN114239083A (en) Efficient state register identification method based on graph neural network
Zhu et al. Tag: Learning circuit spatial embedding from layouts
US6405351B1 (en) System for verifying leaf-cell circuit properties
CN112417147A (en) Method and device for selecting training samples
CN116522334A (en) RTL-level hardware Trojan detection method based on graph neural network and storage medium
CN111738290A (en) Image detection method, model construction and training method, device, equipment and medium
CN114065308A (en) Gate-level hardware Trojan horse positioning method and system based on deep learning
Rematska et al. A survey on reverse engineering of technical diagrams
CN115982388A (en) Case quality control map establishing method, case document quality testing method, case quality control map establishing equipment and storage medium
CN113836297B (en) Training method and device for text emotion analysis model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination