CN113449303A - Intelligent contract vulnerability detection method and system based on teacher-student network model - Google Patents

Info

Publication number
CN113449303A
Authority
CN
China
Prior art keywords
intelligent contract
vulnerability detection
teacher
data set
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110719356.0A
Other languages
Chinese (zh)
Other versions
CN113449303B (en
Inventor
黄步添
焦颖颖
罗春凤
刘振广
何钦铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Yunxiang Network Technology Co Ltd
Original Assignee
Hangzhou Yunxiang Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Yunxiang Network Technology Co Ltd
Priority to CN202110719356.0A
Publication of CN113449303A
Application granted
Publication of CN113449303B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Finance (AREA)
  • Computing Systems (AREA)
  • Accounting & Taxation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Virology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an intelligent contract vulnerability detection method and system based on a graph neural network. A teacher network is established and used as a soft target to supervise a student network, so that the student network approaches the teacher network's performance in detecting intelligent contract vulnerabilities; training the student network yields a lightweight intelligent contract vulnerability detection model. Feature vectors of the semantic graph and of the binary code control flow graph are extracted together and used as input to a vulnerability judgment graph neural network, which determines whether the intelligent contract under verification contains vulnerabilities. In addition, the feature vector obtained from the binary code control flow graph can be converted into the feature vector corresponding to the semantic graph.

Description

Intelligent contract vulnerability detection method and system based on teacher-student network model
Technical Field
The invention belongs to the technical field of blockchain intelligent contract security vulnerability detection, and particularly relates to an intelligent contract vulnerability detection method and system based on a teacher-student network model.
Background
An intelligent contract (more commonly called a smart contract) is a program running on a blockchain that defines a set of automatically executable contract rules in the form of code. As blockchain technology matures, intelligent contracts, one of its core technologies, are being widely used.
In a blockchain system, intelligent contracts play an important role in value transfer, and any security vulnerability in a contract may cause a huge loss of value. For example, the 2016 "The DAO" contract security vulnerability resulted in the theft of 3.6 million Ether; the Parity multi-signature wallet security vulnerability in 2017 caused a loss of $1.52 million; and the BEC (BeautyChain) token contract security vulnerability in 2018 caused its 9 billion dollar market value to be wiped out almost instantly. Therefore, attention needs to be paid to the problem of detecting security vulnerabilities in intelligent contracts.
Existing intelligent contract vulnerability detection methods fall mainly into two types: strict rules manually defined by experts, and intelligent detection by neural networks. Expert-defined rules are labor intensive, hard to extend, inflexible, and easily circumvented. Existing neural network detection generally takes only one of the semantic graph and the bytecode control flow graph as the input of the neural network, so missed reports (false negatives) and false reports (false positives) often occur.
Disclosure of Invention
In view of the above, the invention provides an intelligent contract vulnerability detection method based on a teacher-student network model, wherein the teacher-student network model is obtained through graph neural network training. A complex heavyweight teacher network is set up to serve as a soft target that supervises a student network, so that the student network approaches the teacher network's intelligent contract vulnerability detection performance, and training the student network forms a lightweight intelligent contract vulnerability detection model. Feature vectors of the semantic graph and of the binary code control flow graph are extracted at the same time, that is, the semantic vector and the control flow graph vector both serve as feature vectors for vulnerability judgment, and conversion of the control flow graph vector to the semantic vector is realized.
The method, in which a teacher network is established as a soft target to supervise a student network so that it approaches the teacher network's intelligent contract vulnerability detection performance, and the student network is trained into a lightweight intelligent contract vulnerability detection model, specifically comprises the following steps:
respectively acquiring an intelligent contract source code data set and an intelligent contract binary code data set;
preprocessing an intelligent contract source code data set to obtain a semantic vector data set; preprocessing an intelligent contract binary code data set to obtain a control flow graph vector data set;
establishing an intelligent contract vulnerability detection teacher network and an intelligent contract vulnerability detection student network based on a graph neural network, training both networks based on the semantic vector and the control flow graph vector to obtain an intelligent contract vulnerability detection teacher network model, and using the teacher network model to guide the training that yields the intelligent contract vulnerability detection student network model, thereby forming a teacher-student network lightweight model;
the teacher-student network lightweight model detects intelligent contract vulnerabilities based on the student network model.
The intelligent contract binary code is a code form of an intelligent contract actually deployed on a blockchain and is obtained by compiling the intelligent contract source code.
Further, the preprocessing is performed on the intelligent contract source code data set to obtain a semantic vector data set, and the specific steps are as follows:
converting the intelligent contract source code data set into a semantic graph data set through a source code semantic extraction module;
converting the semantic graph data set into a semantic vector data set through a semantic conversion graph neural network module, which specifically comprises the following steps: the semantic feature extraction unit extracts n feature values from the semantic graph; the semantic conversion graph neural network unit converts the n feature values output by the semantic feature extraction unit into a semantic vector.
Further, the semantic graph acquisition steps are as follows:
generating an intelligent contract function analysis result from the data set by using a function analysis tool CodeSensor;
constructing an abstract syntax tree based on the generated function analysis result, and taking the type symbol, the function interface and the syntax as abstract syntax tree nodes;
and performing a depth-first traversal of the abstract syntax tree, taking the function interface nodes and the syntax nodes in the abstract syntax tree as the nodes of the semantic graph, and extracting control flow edges, data flow edges and return value edges from the abstract syntax tree to connect these nodes, thereby converting the abstract syntax tree into the semantic graph.
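The traversal described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the `AstNode` class, its `kind` labels ("type", "function", "syntax") and the sample contract fragment are hypothetical, the output format of CodeSensor is not reproduced, and only control flow edges are generated (data flow and return value edges would be appended to the same edge list in the same tuple form).

```python
from dataclasses import dataclass, field


@dataclass
class AstNode:
    kind: str   # hypothetical labels: "type", "function", or "syntax"
    name: str
    children: list = field(default_factory=list)


def build_semantic_graph(root):
    """Depth-first walk that keeps function and syntax nodes as graph nodes.

    Each kept node is linked to its nearest kept ancestor by a control flow
    edge; type-symbol nodes stay in the tree but are not graph nodes.
    """
    nodes, edges = [], []

    def visit(node, kept_parent):
        current = kept_parent
        if node.kind in ("function", "syntax"):
            nodes.append(node.name)
            if kept_parent is not None:
                edges.append((kept_parent, node.name, "control_flow"))
            current = node.name
        for child in node.children:
            visit(child, current)

    visit(root, None)
    return nodes, edges


# Hypothetical AST for a small withdraw function.
ast = AstNode("function", "withdraw", [
    AstNode("type", "uint256"),
    AstNode("syntax", "require", [AstNode("syntax", "balance_check")]),
    AstNode("syntax", "call.value"),
])
nodes, edges = build_semantic_graph(ast)
```

Keeping edges as `(source, target, edge_type)` tuples lets the three edge types share one list, which is convenient when the graph is later handed to a feature extraction step.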
Further, preprocessing the intelligent contract binary code data set to obtain a control flow graph vector data set, and the specific steps are as follows:
converting the intelligent contract binary code data set into a control flow graph data set through a binary code control flow graph conversion module;
converting the control flow graph data set into a control flow graph vector data set through a control flow graph conversion graph neural network module, which specifically comprises the following steps: the control flow graph feature extraction unit extracts m feature values from the control flow graph; the control flow graph conversion graph neural network unit converts the m feature values output by the control flow graph feature extraction unit into a control flow graph vector.
Further, training the intelligent contract vulnerability detection teacher network and the intelligent contract vulnerability detection student network based on the semantic vector and the control flow graph vector specifically comprises the following steps:
extracting semantic vectors and control flow graph vectors that correspond one to one from the semantic vector data set and the control flow graph vector data set as input data of the intelligent contract vulnerability detection teacher network, and training the intelligent contract vulnerability detection teacher network to obtain an intelligent contract vulnerability detection teacher network model; wherein:
the semantic vector input to the intelligent contract vulnerability detection teacher network is passed through that network in training to output a feature vector A;
the control flow graph vector input to the intelligent contract vulnerability detection teacher network is passed through that network in training to output a feature vector B; the correlation between feature vector A and feature vector B is trained through a fully connected neural network to obtain a mapping relation model between feature vector A and feature vector B;
the control flow graph vector data set is the input data of the intelligent contract vulnerability detection student network, and the intelligent contract vulnerability detection student network is trained to obtain an intelligent contract vulnerability detection student network model;
and based on a knowledge distillation algorithm, the intelligent contract vulnerability detection teacher network model is used as a soft target to guide the training of the intelligent contract vulnerability detection student network, so that the teacher network model participates in the parameter tuning process of the student network.
Further, obtaining the mapping relation model between feature vector A and feature vector B specifically includes:
feature vector A can be mapped into a vector B' through the mapping relation model, the vector B' has the same dimension as feature vector B, and the vector B' comprises the training features contained in feature vector B;
feature vector B can be mapped into a vector A' through the mapping relation model, the vector A' has the same dimension as feature vector A, and the vector A' comprises the training features contained in feature vector A.
In the method for obtaining the lightweight intelligent contract vulnerability detection student model, the teacher network is used as a soft target to build the student network, and a knowledge distillation algorithm is used so that the teacher network participates in the parameter tuning process of the student network. The specific implementation steps comprise:
calling the trained teacher model and outputting the weighted sum P_t of its output layer;
training the student model with the loss function L_KD(W_s) below, wherein L_KD represents the loss function, W_s represents the loss value, H represents the cross entropy, Y_true represents the difference from the correct result, P_s represents the output of the student network, β represents an adjustable parameter, and P_t represents the output of the teacher network;
Figure BDA0003136373700000051
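The loss formula itself survives only as an image reference, so its exact form is not recoverable from this text. A commonly used distillation loss that is consistent with the symbols listed above is $L_{KD}(W_s) = H(Y_{true}, P_s) + \beta\, H(P_t, P_s)$; the sketch below assumes that form, and both the additive weighting and the use of plain softmax outputs without a temperature are assumptions rather than the patent's stated formula. All function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted, eps=1e-12):
    """H(target, predicted) for two discrete distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


def kd_loss(y_true, student_logits, teacher_logits, beta=0.5):
    """Assumed form: hard-label loss plus beta times the soft-target loss."""
    p_s = softmax(student_logits)   # P_s: student output
    p_t = softmax(teacher_logits)   # P_t: teacher output used as soft target
    return cross_entropy(y_true, p_s) + beta * cross_entropy(p_t, p_s)


# Two-class example: class 1 means "vulnerable".
y_true = [0.0, 1.0]
loss_far = kd_loss(y_true, [2.0, -1.0], [-1.0, 3.0])    # student disagrees
loss_near = kd_loss(y_true, [-1.0, 3.0], [-1.0, 3.0])   # student matches teacher
```

With β set to 0 the function reduces to ordinary cross-entropy training, which makes the teacher's contribution easy to isolate when tuning.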
the intelligent contract vulnerability detection teacher network model is used as a soft target to guide the training of the intelligent contract vulnerability detection student network on the basis of a knowledge distillation algorithm, so that the teacher network model participates in the parameter tuning process of the student network.
An intelligent contract vulnerability detection system based on a teacher-student network model specifically comprises: a memory, a processor, and a computer program that is stored in the memory and can run on the processor, wherein the processor executes the computer program to run the teacher network and the student network of the intelligent contract vulnerability detection system, and the system comprises a data acquisition module, a data processing module, a model construction and training module and a model establishment module;
the data acquisition module is used for respectively acquiring an intelligent contract source code data set and an intelligent contract binary code data set;
the data processing module is used for processing the intelligent contract source code data set to obtain a semantic vector data set; processing the intelligent contract binary code data set to obtain a control flow graph vector data set;
the model building and training module is used for building an intelligent contract vulnerability detection teacher network and an intelligent contract vulnerability detection student network, training both networks based on semantic vectors and control flow graph vectors to obtain the intelligent contract vulnerability detection teacher network model, and then using that model to guide the training that yields the intelligent contract vulnerability detection student network model;
the model establishing module is used for training to obtain a teacher-student network lightweight model and detecting intelligent contract vulnerabilities based on the student model.
The invention provides an intelligent contract vulnerability detection method based on a graph neural network that combines semantic analysis and control flow graph analysis for the first time, taking the semantic vector and the control flow graph vector together as input to a vulnerability judgment graph neural network. The semantic vector can be obtained by mapping the control flow graph vector through a fully connected network, which solves the problem that the semantic vector cannot be obtained when the intelligent contract source code is missing, greatly reduces the missed reports and false reports of existing intelligent contract vulnerability detection schemes, and improves vulnerability detection accuracy. In addition, the teacher-student network lets the lightweight student network approach the performance of the complex heavyweight teacher network, giving the method good universality and practical value.
Drawings
FIG. 1 is a flow chart of an intelligent contract vulnerability detection method based on a graph neural network;
FIG. 2 is a flow chart of teacher network detecting intelligent contract vulnerabilities in an intelligent contract vulnerability detection method based on a graph neural network;
FIG. 3 is a flow chart of student network intelligent contract vulnerability detection in an intelligent contract vulnerability detection method based on a graph neural network;
FIG. 4 is a system diagram of an intelligent contract vulnerability detection method based on a graph neural network.
Detailed Description
In order to clearly illustrate the present invention and make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, so that those skilled in the art can implement the technical solutions in reference to the description text. The technology of the present invention will be described in detail below with reference to the accompanying drawings in conjunction with specific embodiments.
A complex heavyweight teacher network is built to serve as a soft target that supervises a student network, so that the student network approaches the teacher network's intelligent contract vulnerability detection performance, and a lightweight intelligent contract vulnerability detection model is formed through student network training. Feature vectors of the semantic graph and of the binary code control flow graph are extracted simultaneously, that is, the semantic vector and the control flow graph vector both serve as feature vectors for vulnerability judgment, and conversion of the control flow graph vector to the semantic vector is achieved.
Fig. 1 is a flowchart of an intelligent contract vulnerability detection method based on a graph neural network. With reference to Fig. 1, the following describes how a complex heavyweight teacher network is set up as a soft target to supervise a student network so that it approaches the teacher network's vulnerability detection performance, and how student network training forms a lightweight intelligent contract vulnerability detection model. The method specifically includes:
(a) collecting Ethereum intelligent contract source code and intelligent contract binary code over the network with a crawler to construct a data set;
(b) preprocessing the data set and converting it into graph structures according to the input requirements of a graph neural network to form a semantic graph and a control flow graph, and converting the semantic graph and the control flow graph into vectors suitable as input;
(c) constructing an intelligent contract vulnerability detection teacher-student network, and passing the input vectors obtained from the processed data set into the teacher network to obtain a heavyweight, complex intelligent contract vulnerability detection teacher model;
(d) taking the complex intelligent contract vulnerability detection teacher model as a soft target through transfer learning, and performing supervised training of the student network;
(e) training to obtain a teacher-student network lightweight intelligent contract vulnerability detection student model, and detecting intelligent contract vulnerabilities with that model.
Further, the intelligent contract binary code is a code form of an intelligent contract actually deployed on a blockchain, and is a machine code after the intelligent contract source code is compiled.
Further, the method for forming the semantic graph specifically includes:
generating an intelligent contract function analysis result by using a function analysis tool CodeSensor;
constructing an abstract syntax tree based on the generated function analysis result, and taking the type symbol, the function interface and the syntax as nodes of the tree;
and performing a depth-first traversal of the abstract syntax tree, taking the interface function nodes and syntax nodes in the tree as nodes of a semantic graph, adding control flow edges, data flow edges and fallback edges, and thereby converting the abstract syntax tree into the semantic graph.
Further, the method for forming a control flow graph specifically includes:
the disassembling unit is used for converting the intelligent contract binary code into more easily understood assembly language by using an Ethereum decompiler portal;
dividing the instructions of the intelligent contract into basic blocks based on the assembly language, thereby obtaining the nodes of the control flow graph;
adding a directed edge (A_i, A_j) between any two connected basic blocks A_i and A_j, thereby obtaining the complete binary code control flow graph.
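The block splitting and edge construction above can be sketched as follows. This is an illustrative simplification, not the patent's disassembling unit: instructions are assumed to arrive already disassembled as `(pc, opcode, operand)` tuples, the function names are hypothetical, and jump targets are resolved only when the target is pushed immediately before the jump (real EVM bytecode often needs stack analysis to resolve targets).

```python
TERMINATORS = {"JUMP", "JUMPI", "STOP", "RETURN", "REVERT"}


def split_basic_blocks(instructions):
    """Group (pc, opcode, operand) tuples into basic blocks keyed by start pc."""
    blocks, current = {}, []
    for pc, op, arg in instructions:
        if op == "JUMPDEST" and current:   # a jump target starts a new block
            blocks[current[0][0]] = current
            current = []
        current.append((pc, op, arg))
        if op in TERMINATORS:              # a branch or halt ends the block
            blocks[current[0][0]] = current
            current = []
    if current:
        blocks[current[0][0]] = current
    return blocks


def build_edges(blocks):
    """Add a directed edge (A_i, A_j) for each resolved jump or fall-through."""
    starts = sorted(blocks)
    edges = set()
    for index, start in enumerate(starts):
        block = blocks[start]
        last_op = block[-1][1]
        following = starts[index + 1] if index + 1 < len(starts) else None
        if last_op in ("JUMP", "JUMPI") and len(block) >= 2:
            _, prev_op, prev_arg = block[-2]
            # Simplification: only targets pushed right before the jump.
            if prev_op.startswith("PUSH") and prev_arg in blocks:
                edges.add((start, prev_arg))
        if (last_op == "JUMPI" or last_op not in TERMINATORS) and following is not None:
            edges.add((start, following))   # fall-through to the next block
    return edges


# Hypothetical disassembly: a conditional jump over a STOP.
program = [
    (0, "PUSH1", 1), (2, "PUSH1", 8), (4, "JUMPI", None),
    (5, "PUSH1", 0), (7, "STOP", None),
    (8, "JUMPDEST", None), (9, "STOP", None),
]
blocks = split_basic_blocks(program)
edges = build_edges(blocks)
```

Keying blocks by their starting program counter makes a pushed jump target directly usable as a node identifier, so no separate lookup table is needed.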
Fig. 2 is a flowchart illustrating a teacher network detecting an intelligent contract vulnerability in an intelligent contract vulnerability detection method based on a graph neural network. The following describes in detail a specific implementation method of the intelligent contract vulnerability detection method of the teacher network with reference to fig. 2, including the following steps:
step 1, collecting an intelligent contract source code data set and a corresponding binary code data set by using a crawler through a network, preprocessing the data sets, and removing invalid data;
step 2, converting the intelligent contract source code data set into a semantic graph data set through a source code semantic extraction module;
step 3, converting the semantic graph data set into a semantic vector data set through a semantic conversion graph neural network module;
step 4, converting the intelligent contract binary code data set into a control flow graph data set through a binary code control flow graph conversion module;
step 5, converting the control flow graph data set into a control flow graph vector data set through a control flow graph conversion graph neural network module;
step 6, extracting one-to-one corresponding semantic vector and control flow graph vector from the semantic vector data set and the control flow graph vector data set as the input of a teacher network vulnerability judgment module, and training a teacher network; the trained teacher network can detect the intelligent contract vulnerability.
Further, the method for converting a control flow graph into a control flow graph vector specifically includes:
a control flow graph feature extraction unit extracts m feature values from the control flow graph;
and the control flow graph conversion graph neural network unit converts the m characteristic values output by the control flow graph characteristic extraction unit into control flow graph vectors.
Further, the method for converting the semantic graph into the semantic vector specifically includes:
a semantic feature extraction unit extracts n feature values from the semantic graph;
and the semantic conversion graph neural network unit converts the n feature values output by the semantic feature extraction unit into a semantic vector.
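The text does not specify the architecture of the conversion graph neural network units. As a rough illustration of how per-node feature values and graph edges can be turned into one fixed-length vector, the sketch below runs a few rounds of untrained neighbour averaging followed by a sum readout. It stands in for the learned units only conceptually: the 0.5/0.5 mixing weights, the number of rounds and the function name are all assumptions, and a real graph neural network would use trained weight matrices and nonlinearities.

```python
def graph_to_vector(features, edges, rounds=2):
    """Collapse a graph into one vector.

    features: {node: [float, ...]} with equal-length feature lists.
    edges: iterable of (source, target) pairs; messages flow source -> target.
    """
    incoming = {node: [] for node in features}
    for src, dst in edges:
        incoming[dst].append(src)

    dim = len(next(iter(features.values())))
    state = {node: list(vec) for node, vec in features.items()}

    for _ in range(rounds):
        new_state = {}
        for node, vec in state.items():
            messages = [state[src] for src in incoming[node]]
            if messages:
                mean = [sum(m[k] for m in messages) / len(messages) for k in range(dim)]
            else:
                mean = [0.0] * dim
            # Assumed fixed mixing of own state and neighbour mean.
            new_state[node] = [0.5 * vec[k] + 0.5 * mean[k] for k in range(dim)]
        state = new_state

    # Sum readout: graph-level vector with the same length as one node's features.
    return [sum(vec[k] for vec in state.values()) for k in range(dim)]


vector = graph_to_vector({"a": [1.0, 0.0], "b": [0.0, 1.0]}, [("a", "b")], rounds=1)
```

The same routine applies to either graph type, since it depends only on node features and directed edges.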
Fig. 3 is a flowchart illustrating a student network detecting an intelligent contract vulnerability in an intelligent contract vulnerability detection method based on a graph neural network. The following describes in detail a specific implementation method of the intelligent contract vulnerability detection method of the student network with reference to fig. 3, including the following steps:
step a, collecting an intelligent contract binary code data set by using a crawler through a network, preprocessing the data set, and removing invalid data;
b, converting the intelligent contract binary code data set into a control flow graph data set through a binary code control flow graph conversion module;
c, converting the control flow graph data set into a control flow graph vector data set through a control flow graph conversion graph neural network module;
step d, converting the control flow graph vector data set into a semantic vector data set through a fully connected network module;
step e, using the teacher network as a soft target to build the student network, and using a knowledge distillation algorithm so that the teacher network participates in the parameter tuning process of the student network;
step f, extracting semantic vectors and control flow graph vectors that correspond one to one from the semantic vector data set and the control flow graph vector data set as input of the student network vulnerability judgment module, and training the student network; the trained student network can detect intelligent contract vulnerabilities, and its performance approaches that of the teacher network.
Further, the method for obtaining the fully connected network in step d specifically includes:
(1) collecting an intelligent contract source code data set and a corresponding binary code data set by using a crawler through a network, preprocessing the data sets, and removing invalid data;
(2) converting the intelligent contract source code data set into a semantic graph data set through a source code semantic extraction module;
(3) converting the semantic graph data set into a semantic vector data set through a semantic conversion graph neural network module;
(4) converting the intelligent contract binary code data set into a control flow graph data set through a binary code control flow graph conversion module;
(5) converting the control flow graph data set into a control flow graph vector data set through a control flow graph conversion neural network module;
(6) extracting semantic vectors and control flow graph vectors that correspond one to one from the semantic vector data set and the control flow graph vector data set, and feeding them into a fully connected network for training;
(7) after training on a large amount of data, obtaining a fully connected network that establishes a mapping relation between the semantic vector and the control flow graph vector;
(8) inputting a control flow graph vector into the trained fully connected network to obtain the corresponding semantic vector.
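Steps (6) to (8) amount to supervised regression from control flow graph vectors to semantic vectors. The sketch below illustrates the idea with a single linear layer trained by stochastic gradient descent on squared error. It is a simplification under stated assumptions: the patent's fully connected network presumably has more layers and nonlinear activations, the paired vectors here are synthetic, and all names and hyperparameters are illustrative.

```python
import random


def predict(weights, bias, x):
    """Apply one fully connected (linear) layer to input vector x."""
    return [sum(w * xj for w, xj in zip(row, x)) + b for row, b in zip(weights, bias)]


def train_linear_map(cfg_vectors, sem_vectors, lr=0.1, epochs=2000, seed=0):
    """Fit weights so that predict(cfg_vector) approximates the paired semantic vector."""
    rng = random.Random(seed)
    m, n = len(cfg_vectors[0]), len(sem_vectors[0])
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(m)] for _ in range(n)]
    bias = [0.0] * n
    for _ in range(epochs):
        for x, y in zip(cfg_vectors, sem_vectors):
            pred = predict(weights, bias, x)
            for i in range(n):
                err = pred[i] - y[i]            # gradient of 0.5 * err**2
                for j in range(m):
                    weights[i][j] -= lr * err * x[j]
                bias[i] -= lr * err
    return weights, bias


# Synthetic pairs: 2-dimensional CFG vectors and 3-dimensional semantic vectors.
cfg_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]
sem_vectors = [[2 * a, b, a - b] for a, b in cfg_vectors]

weights, bias = train_linear_map(cfg_vectors, sem_vectors)
recovered = predict(weights, bias, [0.3, 0.7])   # semantic vector for an unseen CFG vector
```

Because the two vector types may have different dimensions (m and n in the text), the weight matrix is n by m; once trained, only bytecode-derived vectors are needed at inference time.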
Further, in step e, the teacher network is used as a soft target to build the student network, and a knowledge distillation algorithm is used so that the teacher network participates in the parameter tuning process of the student network, which specifically includes:
calling the trained teacher model and outputting the weighted sum P_t of its output layer;
training the student model with the loss function L_KD(W_s) below, wherein L_KD represents the loss function, W_s represents the loss value, H represents the cross entropy, Y_true represents the difference from the correct result, P_s represents the output of the student network, β represents an adjustable parameter, and P_t represents the output of the teacher network;
Figure BDA0003136373700000111
Fig. 4 is a system diagram of an intelligent contract vulnerability detection method based on a graph neural network. The system components of the intelligent contract vulnerability detection method based on a graph neural network are described below with reference to Fig. 4.
As another embodiment, the present invention further provides an intelligent contract vulnerability detection system based on a graph neural network, comprising: a memory, a processor, and a computer program that is stored in the memory and can run on the processor, wherein the processor executes the computer program to run the teacher network and the student network of the intelligent contract vulnerability detection system, and the system comprises a data acquisition module, a data processing module, a model construction and training module and a model establishment module;
the data acquisition module is used for respectively acquiring an intelligent contract source code data set and an intelligent contract binary code data set;
the data processing module is used for processing the intelligent contract source code data set to obtain a semantic vector data set; processing the intelligent contract binary code data set to obtain a control flow graph vector data set;
the model construction and training module is used for building an intelligent contract vulnerability detection teacher network and an intelligent contract vulnerability detection student network, training each of them based on semantic vectors and control flow graph vectors to obtain the intelligent contract vulnerability detection teacher network model, and then using the teacher network model to guide training to obtain the intelligent contract vulnerability detection student network model;
the model establishment module is used for training to obtain the teacher-student network lightweight model and for detecting intelligent contract vulnerabilities based on the student model.
The embodiments described above are presented to enable a person having ordinary skill in the art to make and use the invention. It will be readily apparent to those skilled in the art that various modifications may be made to the above embodiments, and that the generic principles defined herein may be applied to other embodiments without inventive effort. Therefore, the present invention is not limited to the above embodiments, and improvements and modifications made by those skilled in the art on the basis of this disclosure shall fall within the protection scope of the present invention.

Claims (10)

1. An intelligent contract vulnerability detection method based on a teacher-student network model is characterized by comprising the following steps:
respectively acquiring an intelligent contract source code data set and an intelligent contract binary code data set;
preprocessing an intelligent contract source code data set to obtain a semantic vector data set; preprocessing an intelligent contract binary code data set to obtain a control flow graph vector data set;
establishing an intelligent contract vulnerability detection teacher network and an intelligent contract vulnerability detection student network based on a graph neural network, training the two networks based on semantic vectors and control flow graph vectors to obtain an intelligent contract vulnerability detection teacher network model, and using the teacher network model to guide training to obtain the intelligent contract vulnerability detection student network model, thereby forming a teacher-student network lightweight model;
detecting intelligent contract vulnerabilities with the teacher-student network lightweight model, based on the student network model.
2. The teacher-student network model-based intelligent contract vulnerability detection method according to claim 1, wherein the preprocessing is performed on an intelligent contract source code data set to obtain a semantic vector data set, and the specific steps are as follows:
converting the intelligent contract source code data set into a semantic graph data set through a source code semantic extraction module;
converting the semantic graph data set into a semantic vector data set through a semantic conversion graph neural network module, which specifically comprises: a semantic feature extraction unit extracts n feature values from the semantic graph; and the n feature values output by the semantic feature extraction unit are converted into semantic vectors through a semantic conversion graph neural network unit.
3. The teacher-student network model-based intelligent contract vulnerability detection method according to claim 2, wherein the semantic graph acquisition steps are as follows:
generating an intelligent contract function analysis result from the data set by using a function analysis tool CodeSensor;
constructing an abstract syntax tree based on the generated function analysis result, and taking the type symbol, the function interface and the syntax as abstract syntax tree nodes;
and traversing the abstract syntax tree depth-first, taking the function interface nodes and the syntax nodes in the abstract syntax tree as the nodes of the semantic graph, wherein control flow edges, data flow edges and return value edges are extracted from the abstract syntax tree to connect these nodes, thereby converting the abstract syntax tree into the semantic graph.
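The conversion in claim 3 can be illustrated with a toy example. The sketch below assumes a hand-built abstract syntax tree and a hand-written list of candidate edges; the claim does not specify how the edges are derived, so the class `AstNode`, the kind labels and the example contract fragment are all hypothetical. It shows only the filtering step: type symbol nodes are dropped, function interface and syntax nodes are kept, and the three edge types connect the kept nodes.

```python
from dataclasses import dataclass, field

# Node and edge kinds named in the claim; the string labels are hypothetical.
KEPT_KINDS = {"function_interface", "syntax"}
EDGE_KINDS = {"control_flow", "data_flow", "return_value"}


@dataclass
class AstNode:
    name: str
    kind: str  # "type_symbol", "function_interface" or "syntax"
    children: list = field(default_factory=list)


def depth_first(node):
    """Yield AST nodes in pre-order (depth-first) traversal."""
    yield node
    for child in node.children:
        yield from depth_first(child)


def to_semantic_graph(root, candidate_edges):
    """Keep function-interface and syntax nodes plus typed edges between them."""
    nodes = [n.name for n in depth_first(root) if n.kind in KEPT_KINDS]
    kept = set(nodes)
    edges = [
        (src, dst, kind)
        for src, dst, kind in candidate_edges
        if kind in EDGE_KINDS and src in kept and dst in kept
    ]
    return nodes, edges


# Toy tree for a hypothetical withdraw function.
ast_root = AstNode("withdraw", "function_interface", [
    AstNode("uint256", "type_symbol"),
    AstNode("require_balance", "syntax"),
    AstNode("call_value", "syntax"),
    AstNode("update_balance", "syntax"),
])
candidate_edges = [
    ("withdraw", "require_balance", "control_flow"),
    ("require_balance", "call_value", "control_flow"),
    ("call_value", "update_balance", "data_flow"),
    ("call_value", "withdraw", "return_value"),
    ("uint256", "call_value", "data_flow"),  # dropped: type-symbol endpoint
]
nodes, edges = to_semantic_graph(ast_root, candidate_edges)
```

In a real pipeline the candidate edges would come from the function analysis result rather than from a literal list.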
4. The teacher-student network model-based intelligent contract vulnerability detection method according to claim 1, wherein the intelligent contract binary code data set is preprocessed to obtain a control flow graph vector data set, and the specific steps are as follows:
converting the intelligent contract binary code data set into a control flow graph data set through a binary code control flow graph conversion module;
converting the control flow graph data set into a control flow graph vector data set through a control flow graph conversion graph neural network module, which specifically comprises: a control flow graph feature extraction unit extracts m feature values from the control flow graph; and the m feature values output by the control flow graph feature extraction unit are converted into control flow graph vectors through a control flow graph conversion graph neural network unit.
5. The teacher-student network model-based intelligent contract vulnerability detection method of claim 4, wherein the step of forming the control flow graph is as follows:
converting, by a disassembling unit, the intelligent contract binary code into easily understood assembly language using an Ethereum decompiler portal;
dividing the instruction of the intelligent contract into basic blocks based on the assembly language, thereby obtaining the nodes of the control flow graph;
adding a directed edge (A_i, A_j) between any two connected basic blocks A_i and A_j, thereby obtaining the binary code control flow graph.
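The steps of claim 5 can be sketched on a short EVM-style instruction listing. This is an illustration under stated assumptions, not the patented implementation: instructions are given as hypothetical (pc, opcode, operand) tuples, a block starts at JUMPDEST and ends at a terminating opcode, and a jump target is resolved only when a PUSH immediately precedes the jump. Real bytecode often needs stack analysis to resolve targets.

```python
# Opcodes that end a basic block in this simplified model.
TERMINATORS = {"JUMP", "JUMPI", "STOP", "RETURN", "REVERT", "INVALID", "SELFDESTRUCT"}


def split_basic_blocks(instructions):
    """Split a list of (pc, opcode, operand) tuples into basic blocks."""
    blocks, current = [], []
    for ins in instructions:
        _pc, op, _arg = ins
        if op == "JUMPDEST" and current:
            blocks.append(current)  # a jump target starts a new block
            current = []
        current.append(ins)
        if op in TERMINATORS:
            blocks.append(current)  # a terminator closes the block
            current = []
    if current:
        blocks.append(current)
    return blocks


def build_cfg(instructions):
    """Return (blocks, edges), where edges are directed pairs (i, j) of block indices."""
    blocks = split_basic_blocks(instructions)
    start_of = {block[0][0]: i for i, block in enumerate(blocks)}
    edges = set()
    for i, block in enumerate(blocks):
        _last_pc, last_op, _ = block[-1]
        if last_op in ("JUMP", "JUMPI") and len(block) >= 2:
            _prev_pc, prev_op, prev_arg = block[-2]
            # Assumption: the target is the operand of the PUSH right before the jump.
            if prev_op.startswith("PUSH") and prev_arg in start_of:
                edges.add((i, start_of[prev_arg]))
        falls_through = last_op == "JUMPI" or last_op not in TERMINATORS
        if falls_through and i + 1 < len(blocks):
            edges.add((i, i + 1))
    return blocks, sorted(edges)


program = [
    (0, "PUSH1", 1),        # condition
    (2, "PUSH1", 8),        # jump destination
    (4, "JUMPI", None),     # conditional jump to pc 8
    (5, "PUSH1", 0),
    (7, "STOP", None),
    (8, "JUMPDEST", None),
    (9, "STOP", None),
]
blocks, edges = build_cfg(program)
```

Here the first block has two successors: the fall-through block and the block at the jump destination, which corresponds to the directed edges (A_i, A_j) of the claim.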
6. The teacher-student network model-based intelligent contract vulnerability detection method according to claim 1, wherein the intelligent contract vulnerability detection teacher network and the intelligent contract vulnerability detection student network are trained based on semantic vectors and control flow graph vectors, and the specific steps are as follows:
extracting, from the semantic vector data set and the control flow graph vector data set respectively, semantic vectors and control flow graph vectors in one-to-one correspondence as input data of the intelligent contract vulnerability detection teacher network, and training the teacher network to obtain the intelligent contract vulnerability detection teacher network model; wherein the semantic vector input is trained through the teacher network to output a feature vector A; the control flow graph vector input is trained through the teacher network to output a feature vector B; and the correlation between the feature vector A and the feature vector B is trained through a fully-connected neural network to obtain a mapping relation model of the correlation between the feature vector A and the feature vector B;
taking the control flow graph vector data set as input data of the intelligent contract vulnerability detection student network, and training the student network to obtain the intelligent contract vulnerability detection student network model.
7. The teacher-student network model-based intelligent contract vulnerability detection method according to claim 6, wherein the obtaining of the mapping relationship model of the association between the feature vector A and the feature vector B specifically comprises:
the feature vector A is mapped to a vector B' through the mapping relation model, wherein the vector B' has the same dimension as the feature vector B and contains the training features contained in the feature vector B;
the feature vector B is mapped to a vector A' through the mapping relation model, wherein the vector A' has the same dimension as the feature vector A and contains the training features contained in the feature vector A.
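The dimension relationship in claim 7 can be made concrete with a small sketch. It assumes the mapping relation model is realized as two single fully-connected layers, one per direction, trained by plain gradient descent on squared error; the patent states only that a fully-connected neural network is used, so the layer structure, the training rule, the vector sizes and all names below are illustrative assumptions.

```python
import random


def make_layer(in_dim, out_dim, seed=0):
    """Create a dense layer as (weights, bias) with small random weights."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def forward(layer, x):
    """Compute W x + b for one input vector."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def sgd_step(layer, x, target, lr=0.1):
    """One gradient step on squared error; returns the mean squared error before the update."""
    weights, bias = layer
    err = [p - t for p, t in zip(forward(layer, x), target)]
    for j, e in enumerate(err):
        for i, xi in enumerate(x):
            weights[j][i] -= lr * e * xi
        bias[j] -= lr * e
    return sum(e * e for e in err) / len(err)


feature_a = [0.2, -0.4, 0.6, 0.1]  # stands in for feature vector A (dimension 4)
feature_b = [0.5, -0.3, 0.8]       # stands in for feature vector B (dimension 3)
a_to_b = make_layer(4, 3, seed=1)
b_to_a = make_layer(3, 4, seed=2)
for _ in range(300):
    sgd_step(a_to_b, feature_a, feature_b)
    sgd_step(b_to_a, feature_b, feature_a)
b_prime = forward(a_to_b, feature_a)  # B' has the dimension of B
a_prime = forward(b_to_a, feature_b)  # A' has the dimension of A
```

With a single training pair the layers simply memorize the mapping; the point of the sketch is only that B' matches B in dimension and A' matches A, as the claim requires.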
8. The teacher-student network model-based intelligent contract vulnerability detection method according to claim 6, wherein a knowledge distillation algorithm is used to enable a teacher network to participate in a parameter adjustment process of a student network, and the steps are as follows:
calling the trained teacher model and outputting the weighted sum P_t of its output layer;
training the student model with the following loss function L_KD(W_s), wherein L_KD represents the loss function, W_s represents the loss value, H represents cross entropy, Y_true represents the correct result (the label against which the difference is measured), P_s represents the output of the student network, β represents an adjustable parameter, and P_t represents the output of the teacher network:
Figure FDA0003136373690000041
9. The teacher-student network model-based intelligent contract vulnerability detection method according to claim 6, wherein the intelligent contract vulnerability detection teacher network model is used as a soft target to guide training of the intelligent contract vulnerability detection student network based on a knowledge distillation algorithm, so that the intelligent contract vulnerability detection teacher network model participates in the parameter adjustment process of the intelligent contract vulnerability detection student network.
10. An intelligent contract vulnerability detection system based on a teacher-student network model is characterized by comprising a data acquisition module, a data processing module, a model construction and training module and a model establishment module;
the data acquisition module is used for respectively acquiring an intelligent contract source code data set and an intelligent contract binary code data set;
the data processing module is used for processing the intelligent contract source code data set to obtain a semantic vector data set; processing the intelligent contract binary code data set to obtain a control flow graph vector data set;
the model construction and training module is used for building an intelligent contract vulnerability detection teacher network and an intelligent contract vulnerability detection student network, training each of them based on semantic vectors and control flow graph vectors to obtain the intelligent contract vulnerability detection teacher network model, and then using the teacher network model to guide training to obtain the intelligent contract vulnerability detection student network model;
the model establishment module is used for training to obtain the teacher-student network lightweight model and for detecting intelligent contract vulnerabilities based on the student model.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110719356.0A CN113449303B (en) 2021-06-28 2021-06-28 Intelligent contract vulnerability detection method and system based on teacher-student network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110719356.0A CN113449303B (en) 2021-06-28 2021-06-28 Intelligent contract vulnerability detection method and system based on teacher-student network model

Publications (2)

Publication Number Publication Date
CN113449303A (en) 2021-09-28
CN113449303B (en) 2022-11-11

Family

ID=77813250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110719356.0A Active CN113449303B (en) 2021-06-28 2021-06-28 Intelligent contract vulnerability detection method and system based on teacher-student network model

Country Status (1)

Country Link
CN (1) CN113449303B (en)

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130091539A1 (en) * 2011-10-11 2013-04-11 Honeywell International Inc. System and method for insider threat detection
CN108108622A (en) * 2017-12-13 2018-06-01 上海交通大学 Leakage location based on depth convolutional network and controlling stream graph
US20190042745A1 (en) * 2017-12-28 2019-02-07 Intel Corporation Deep learning on execution trace data for exploit detection
CN109933991A (en) * 2019-03-20 2019-06-25 杭州拜思科技有限公司 A kind of method, apparatus of intelligence contract Hole Detection
CN109948345A (en) * 2019-03-20 2019-06-28 杭州拜思科技有限公司 A kind of method, the system of intelligence contract Hole Detection
CN109977682A (en) * 2019-04-01 2019-07-05 中山大学 A kind of block chain intelligence contract leak detection method and device based on deep learning
CN110175454A (en) * 2019-04-19 2019-08-27 肖银皓 A kind of intelligent contract safety loophole mining method and system based on artificial intelligence
CN110543419A (en) * 2019-08-28 2019-12-06 杭州趣链科技有限公司 intelligent contract code vulnerability detection method based on deep learning technology
CN111159012A (en) * 2019-12-10 2020-05-15 中国科学院深圳先进技术研究院 Intelligent contract vulnerability detection method based on deep learning
CN111259394A (en) * 2020-01-15 2020-06-09 中山大学 Fine-grained source code vulnerability detection method based on graph neural network
CN111488582A (en) * 2020-04-01 2020-08-04 杭州云象网络技术有限公司 Intelligent contract reentry vulnerability detection method based on graph neural network
US20200272813A1 (en) * 2019-02-21 2020-08-27 Tata Consultancy Services Limited Hand detection in first person view
CN111639344A (en) * 2020-07-31 2020-09-08 中国人民解放军国防科技大学 Vulnerability detection method and device based on neural network
CN112035842A (en) * 2020-08-17 2020-12-04 杭州云象网络技术有限公司 Intelligent contract vulnerability detection interpretability method based on codec
US20210110047A1 (en) * 2019-10-15 2021-04-15 Anchain.ai Inc. Continuous vulnerability management system for blockchain smart contract based digital asset using sandbox and artificial intelligence

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
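Anh Viet Phan; Minh Le Nguyen; Lam Thu Bui: "Convolutional Neural Networks over Control Flow Graphs for Software Defect Prediction", 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI) *
Ni Yuandong et al.: "A Survey of Research on Smart Contract Security Vulnerabilities", Journal of Cyber Security (《信息安全学报》) *
Chen Zhaoxuan et al.: "An Intelligent Vulnerability Detection System Based on Abstract Syntax Trees", Journal of Cyber Security (《信息安全学报》) *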

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113904844A (en) * 2021-10-08 2022-01-07 浙江工商大学 Intelligent contract vulnerability detection method based on cross-modal teacher-student network
CN113904844B (en) * 2021-10-08 2023-09-12 浙江工商大学 Intelligent contract vulnerability detection method based on cross-mode teacher-student network
CN115033896A (en) * 2022-08-15 2022-09-09 鹏城实验室 Method, device, system and medium for detecting Ethereum intelligent contract vulnerability
CN115033896B (en) * 2022-08-15 2022-11-08 鹏城实验室 Method, device, system and medium for detecting Ethereum intelligent contract vulnerability
CN116578988A (en) * 2023-05-23 2023-08-11 海南大学 Vulnerability detection method and device of intelligent contract and storage medium
CN116578988B (en) * 2023-05-23 2024-01-23 海南大学 Vulnerability detection method and device of intelligent contract and storage medium

Also Published As

Publication number Publication date
CN113449303B (en) 2022-11-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant