CN113486357A - Intelligent contract security detection method based on static analysis and deep learning - Google Patents

Info

Publication number
CN113486357A
Authority
CN
China
Prior art keywords
matrix
abstract
solidity
source program
deep learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110766768.XA
Other languages
Chinese (zh)
Other versions
CN113486357B (en
Inventor
周福才
罗熙霖
焦梓
孙劲桐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China filed Critical Northeastern University China
Priority to CN202110766768.XA priority Critical patent/CN113486357B/en
Publication of CN113486357A publication Critical patent/CN113486357A/en
Application granted granted Critical
Publication of CN113486357B publication Critical patent/CN113486357B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577Assessing vulnerabilities and evaluating computer system security
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562Static detection
    • G06F21/563Static detection by source code analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/42Syntactic analysis
    • G06F8/425Lexical analysis
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses a smart contract security detection method based on static analysis and deep learning, and relates to the technical field of blockchain smart contract security. The method comprises: performing static analysis on the Solidity source program of a smart contract to obtain the graph structure of the source program; extracting abstract facts from the graph structure; building, according to the abstract facts of the Solidity source program, a deep learning model for classifying vulnerabilities in the source program, the model comprising an input module, an attention module, a residual connection module, and an output module; constructing a training data set; training the deep learning model with the training data set; and using the trained deep learning model to perform vulnerability detection on an input smart contract and output the security detection result for its Solidity source program. The method analyzes the behavior of the Solidity source program fairly comprehensively and improves the accuracy of its security detection.

Figure 202110766768

Description

Smart contract security detection method based on static analysis and deep learning
Technical Field
The invention relates to the technical field of blockchain smart contract security, and in particular to a smart contract security detection method based on static analysis and deep learning.
Background
A smart contract is a special protocol deployed on a blockchain. Buterin recognized that decentralized computing is applicable beyond transactions and designed the Ethereum blockchain, which supports the execution of smart contracts. A smart contract contains code functions that cover trading, decision making, and sending Ether. Smart contracts have proven useful in many areas, including securities, communications, banking, and medicine. However, smart contracts are transparent: every participant can view a contract's source code. They are also immutable once deployed, so a contract cannot be patched promptly after a bug is found, and losses can only be limited by measures such as suspending transactions or forking. If a smart contract is not checked for security problems, it cannot be repaired in time; its normal functions are affected and its users' interests may be harmed, with serious consequences. Examples include the DAO attack, in which an anonymous hacker exploited a reentrancy vulnerability in a smart contract to steal 3.6 million Ether; the Parity incident, in which an attacker found a timestamp vulnerability in the smart contract code library and destroyed the library by exploiting inconsistent timestamps, causing a loss of 150 million dollars; and a malicious contract incident, in which five hackers released 34,000 problematic smart contracts, congesting Ethereum and triggering an abnormal chain reaction in which Ether worth 4.4 million dollars was stolen.
Despite such a severe security threat, there is currently no good general means of detecting smart contract vulnerabilities, and smart contract security still depends mainly on the security expertise of contract developers and on code audits based on expert experience. An effective scheme for automatically detecting smart contract security is therefore urgently needed. Existing automatic security detection has the following problems: 1. smart contract code cannot be analyzed with full coverage; 2. the false alarm rate of security detection is high; and 3. only specific attacks are considered, so other attacks are not easily detected.
Disclosure of Invention
To address the defects of the prior art, the invention provides a smart contract security detection method based on static analysis and deep learning, with the aim of solving the problem of smart contract security detection.
The technical scheme of the invention is as follows:
1. A smart contract security detection method based on static analysis and deep learning, characterized by comprising the following steps:
Step 1: performing static analysis on the smart contract Solidity source program to obtain a graph structure of the Solidity source program; the static analysis comprises lexical analysis and syntactic analysis; the graph structure comprises an abstract syntax tree (AST) and a control flow graph (CFG);
Step 2: extracting abstract facts from the graph structure of the Solidity source program obtained in step 1;
Step 3: building, according to the abstract facts of the Solidity source program obtained in step 2, a deep learning model for vulnerability classification of the Solidity source program, the model comprising an input module, an attention module, a residual connection module, and an output module;
Step 4: constructing a training data set for the deep learning model;
Step 5: training the deep learning model using the training data set;
Step 6: performing vulnerability detection on an input smart contract using the trained deep learning model, and outputting the security detection result of the smart contract Solidity source program.
Further, in the smart contract security detection method based on static analysis and deep learning, step 1 specifically includes the following steps:
Step 1.1: preprocessing the smart contract Solidity source program by deleting all content irrelevant to its security detection;
Step 1.2: importing the source code files referenced by import statements into the preprocessed smart contract Solidity source program to obtain the complete Solidity source program;
Step 1.3: converting the complete Solidity source program into an abstract syntax tree using an ANTLR parser;
Step 1.4: constructing the control flow graph (CFG) of the Solidity source program from the abstract syntax tree.
Further, in the smart contract security detection method based on static analysis and deep learning, step 1.3 specifically includes the following steps:
Step 1.3.1: performing lexical analysis on the complete Solidity source program using the ANTLR parser, and labeling the words in the Solidity source program according to predefined word attribute categories, to obtain for each program statement a word sequence with word attribute labels;
Step 1.3.2: performing syntactic analysis with the ANTLR parser on the word sequence that lexical analysis generated for each program statement, and determining the syntactic structure of each program statement according to predefined grammar rules; the syntactic structures comprise a contract structure, a function structure, a variable structure, an expression structure, and a statement control flow structure;
Step 1.3.3: converting the Solidity source program into an abstract syntax tree using the ANTLR parser according to the syntactic structure of each program statement.
Further, in the smart contract security detection method based on static analysis and deep learning, the word attribute categories include keyword <keyword>, visibility qualifier <qualifier>, variable data type <variabletype>, identifier <identifier>, operator <operator>, and constant <constant>.
Further, in the smart contract security detection method based on static analysis and deep learning, the predefined grammar rules are as follows:
a) Contract ::= "contract" <identifier> "{" [ContractBlock] "}";
b) ContractBlock ::= [Function] | [Variable];
c) Function ::= "function" <identifier> "(" [Variable] ")" <qualifier> [keyword] [Return] [";" | Block];
d) Variable ::= <variabletype> <qualifier> <identifier> ["=" Expression] ";";
e) Expression ::= Functioncall | <identifier> | Expression <operator> | Expression <operator> Expression | <identifier> <operator> <constant>;
f) Functioncall ::= Expression "(" Variable ")";
g) Block ::= "{" Statement "}";
h) Statement ::= IfStatement | WhileStatment | ForStatement | Variable | Expression | Block | "break" | "continue" | Return;
i) IfStatement ::= "if" "(" Expression ")" Block ["else" Block];
j) WhileStatment ::= "while" "(" Expression ")" Block;
k) ForStatement ::= "for" "(" [Variable] ";" [Expression] ";" [Expression] ")" Block;
l) Return ::= "return" [Expression].
Further, in the smart contract security detection method based on static analysis and deep learning, step 1.4 specifically includes the following steps:
Step 1.4.1: constructing basic blocks from the Block nodes in the abstract syntax tree (AST) for the program statements that belong to a statement control flow structure, recording the statement number StmtId of each statement in each basic block, and recording the incoming and outgoing edges of each basic block;
Step 1.4.2: connecting the basic blocks: two basic blocks are connected when an outgoing edge of one equals an incoming edge of the other, and the jump condition of a basic block is recorded when it has more than one outgoing edge;
Step 1.4.3: recording the variable numbers VarId and the assignment operations Assign in the program statements using static single assignment form, in which each variable is assigned only once and a variable that is assigned a second time is renamed.
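The block-connection rule of steps 1.4.1 and 1.4.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BasicBlock` class, the edge labels `e1` to `e4`, and the function name `connect_blocks` are hypothetical and do not come from the patent, which does not specify how edges are represented.

```python
from dataclasses import dataclass, field


@dataclass
class BasicBlock:
    # Hypothetical container for one basic block (names are assumptions).
    block_id: str
    stmt_ids: list
    in_edges: set = field(default_factory=set)
    # Maps each outgoing edge label to its jump condition (None if unconditional).
    out_edges: dict = field(default_factory=dict)


def connect_blocks(blocks):
    """Return (source, target, condition) triples for matching edges."""
    paths = []
    for src in blocks:
        # A jump condition is only recorded when a block has several outgoing edges.
        branching = len(src.out_edges) > 1
        for edge, condition in src.out_edges.items():
            for dst in blocks:
                if edge in dst.in_edges:
                    paths.append((src.block_id, dst.block_id,
                                  condition if branching else None))
    return paths


# An if/else shape: B00 branches to B01 or B02, and both fall through to B03.
blocks = [
    BasicBlock("B00", ["S00"], out_edges={"e1": "x > 0", "e2": "!(x > 0)"}),
    BasicBlock("B01", ["S01"], in_edges={"e1"}, out_edges={"e3": None}),
    BasicBlock("B02", ["S02"], in_edges={"e2"}, out_edges={"e4": None}),
    BasicBlock("B03", ["S03"], in_edges={"e3", "e4"}),
]
paths = connect_blocks(blocks)
```

The quadratic search over blocks keeps the sketch short; a real implementation would more likely index blocks by incoming edge.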
Further, in the smart contract security detection method based on static analysis and deep learning, the abstract facts, which contain all control flow information, data information, and function information of the smart contract, are written in the Datalog language and have the following structure:
predicate(arg1, ..., argn)
where the predicate is defined according to the structure of the Solidity source program and covers data types, function types, expression structures, and control flow structures; arg1, ..., argn are the other parameters related to the content of the specific Solidity program statement.
Further, in the smart contract security detection method based on static analysis and deep learning, the abstract facts are extracted from the graph structure of the Solidity source program obtained in step 1 by traversing that graph structure and extracting the abstract facts of the Solidity source program by keyword matching.
Further, in the smart contract security detection method based on static analysis and deep learning, step 3 specifically includes the following steps:
Step 3.1: building the input module: representing the abstract facts obtained in step 2 with a 0-1 coding matrix X, applying word embedding and position embedding separately to the abstract facts represented by X, and splicing the matrix obtained from word embedding with the matrix obtained from position embedding to obtain the matrix E, which is the input of the attention module;
Step 3.2: building the attention module, which specifically includes the following steps:
Step 3.2.1: obtaining the Q matrix, K matrix, and V matrix of the abstract facts from the matrix E through three separate linear transformations, and obtaining the attention coefficient matrix A of the abstract facts according to formula (4);
$A = QK^{T}$ (4)
where the Q matrix is the Query matrix of the abstract facts and consists of the Query vectors corresponding to each word of each abstract fact; the K matrix is the Key matrix of the abstract facts and consists of the Key vectors corresponding to each word of each abstract fact; the V matrix is the Value matrix of the abstract facts and consists of the Value vectors corresponding to each word of each abstract fact;
Step 3.2.2: updating the element values of the V matrix according to formula (5), using the attention coefficient matrix A of the abstract facts, to obtain the updated matrix V′;
$V' = \mathrm{softmax}\left(\frac{A}{\sqrt{d_k}}\right)V$ (5)
where $\sqrt{d_k}$ is the arithmetic square root of $d_k$, the dimension of the K matrix, and softmax is the activation function;
Step 3.2.3: applying layer normalization to the matrix V′ of the attention module so that its elements are normalized, which accelerates convergence and keeps the feature distribution stable; the normalized matrix is denoted V″;
Step 3.3: building the residual connection module, whose matrix calculation formula is:
$Z = H(E) = E + F(E) = E + V''$ (9)
where the matrix E is the input of the attention module; V″ is the output of the attention module; Z is the output of the residual connection module; F is the residual function; in the attention module the mapping H(E) → Z is obtained through back propagation, and without the residual connection module, F(E) → 0;
Step 3.4: building the output module, which outputs the probability of each vulnerability that may exist in the abstract facts; the specific steps are as follows:
Step 3.4.1: defining the vulnerability category output formula shown in formula (10), which outputs the vulnerability category result for the abstract facts of the smart contract;
$P_k = \mathrm{softmax}(\mathrm{Linear}(Z))$ (10)
where Linear denotes a linear function that applies one linear transformation to the matrix Z, and $P_k$ is the probability value of each vulnerability type;
Step 3.4.2: constructing the loss function of the deep learning model so that the model acquires the ability to classify vulnerabilities.
Further, in the smart contract security detection method based on static analysis and deep learning, the loss function is the multi-class cross-entropy loss function shown in formula (11):
$\mathrm{Loss}_1 = -\sum_k y_k \log(P_k)$ (11)
where $y_k$ is the one-hot encoded label corresponding to the abstract fact, and k denotes the vulnerability category corresponding to the abstract fact.
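Formulas (9) to (11) can be traced numerically with a small standard-library Python sketch. Several points are assumptions made only for illustration: the rows of Z are averaged into one vector before the linear layer (the text does not say how the matrix Z is reduced to per-class scores), the weights `w` and biases `b` are arbitrary example values, and the function names are hypothetical.

```python
import math


def residual(e, v2):
    """Formula (9): Z = E + V'' computed element-wise."""
    return [[a + b for a, b in zip(e_row, v_row)] for e_row, v_row in zip(e, v2)]


def softmax(xs):
    peak = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [v / total for v in exps]


def predict(z, w, b):
    """Formula (10): P = softmax(Linear(Z)).

    Assumption: rows of Z are mean-pooled into one vector first.
    w holds one weight vector per vulnerability class, b one bias per class.
    """
    pooled = [sum(col) / len(z) for col in zip(*z)]
    logits = [sum(p * wj for p, wj in zip(pooled, w_k)) + b_k
              for w_k, b_k in zip(w, b)]
    return softmax(logits)


def cross_entropy(probs, one_hot):
    """Formula (11): Loss = -sum_k y_k * log(P_k)."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y)


z = residual([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
probs = predict(z, w=[[1.0, 0.0], [0.0, 0.0]], b=[0.0, 0.0])
loss = cross_entropy(probs, one_hot=[1, 0])
```

Because the label is one-hot, the loss reduces to the negative log-probability of the correct class, which is why the sum skips entries where `y` is zero.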
Compared with the prior art, the invention has the following beneficial effects:
1. The behavior of the smart contract Solidity source program can be analyzed comprehensively. Security detection of a smart contract first requires a comprehensive analysis of code behavior. The method analyzes the abstract syntax tree and control flow graph of the Solidity source program and then abstracts the graph structure into a fact representation, so that the abstract facts cover the code behavior more fully, effectively represent the semantic features of the program, and support the subsequent deep learning model.
2. The extensibility of security detection for the smart contract Solidity source program is enhanced. Traditional security detection methods are mainly based on predefined rules and focus only on known security vulnerabilities. The deep learning model used by the method is not limited to specific vulnerabilities: it can be trained on a supplemented training set to detect additional kinds of vulnerabilities, so it is easy to extend. For previously unknown vulnerabilities, the method gains the ability to detect them simply by retraining the model, so it extends better than traditional security detection methods.
3. The accuracy of security detection for the smart contract Solidity source program is improved. The method combines static analysis with deep learning to perform security detection on smart contracts, improves on existing deep learning models, and adds an attention module that learns the key information in the abstract facts. On the basis of an improved vectorized representation of the abstract facts, this effectively raises the accuracy of security detection classification and lowers the rate of missed vulnerabilities.
Drawings
FIG. 1 is a schematic flow chart of the smart contract security detection method based on static analysis and deep learning according to the present invention;
FIG. 2 is a diagram of an abstract syntax tree of example code in an embodiment of the present invention;
FIG. 3 is a diagram of a deep learning model architecture in an embodiment of the present invention;
FIG. 4 is a schematic diagram of an attention module according to an embodiment of the invention.
Detailed Description
The following describes embodiments of the invention in detail with reference to the accompanying drawings. The following examples are intended only to illustrate the invention, not to limit its scope.
FIG. 1 is a schematic flow chart of the smart contract security detection method based on static analysis and deep learning according to the present invention. The method includes the following steps:
Step 1: performing static analysis on the smart contract Solidity source program to obtain a graph structure of the Solidity source program; the static analysis comprises lexical analysis and syntactic analysis; the graph structure includes an Abstract Syntax Tree (AST) and a Control Flow Graph (CFG).
Step 1.1: preprocessing the smart contract Solidity source program by deleting all content irrelevant to its security detection;
In a preferred embodiment, preprocessing the smart contract Solidity source program includes deleting single-line comments "//", multi-line comments "/* … */", spaces " ", line breaks "\n", and all other content irrelevant to the security detection of the Solidity source program.
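A minimal Python sketch of this preprocessing step follows. It is an assumption-laden illustration, not the patented implementation: the function name is hypothetical, whitespace runs are collapsed to a single space rather than deleted outright (so that adjacent words such as `uint x` do not merge), and comment markers inside string literals are not handled.

```python
import re


def strip_comments_and_whitespace(source: str) -> str:
    """Remove // and /* */ comments, then collapse whitespace runs."""
    # Remove block comments first so their contents are never seen as line comments.
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)
    # Remove single-line comments up to (but not including) the line break.
    source = re.sub(r"//[^\n]*", " ", source)
    # Collapse spaces, tabs, and line breaks into single spaces.
    return re.sub(r"\s+", " ", source).strip()


cleaned = strip_comments_and_whitespace(
    "contract C { // storage\n uint x; /* multi\n line */ }"
)
```

A production preprocessor would tokenize string literals before stripping comments; the regular expressions here are only meant to show the order of operations.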
Step 1.2: importing the source code files referenced by import statements into the preprocessed smart contract Solidity source program to obtain the complete Solidity source program.
Step 1.3: converting the complete Solidity source program into an abstract syntax tree using an ANTLR parser.
Step 1.3.1: performing lexical analysis on the complete Solidity source program using the ANTLR parser, and labeling the words in the Solidity source program according to predefined word attribute categories, to obtain for each program statement a word sequence with word attribute labels;
The word attribute categories include keyword <keyword>, visibility qualifier <qualifier>, variable data type <variabletype>, identifier <identifier>, operator <operator>, and constant <constant>.
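The labeling in step 1.3.1 can be illustrated with a small regular-expression tokenizer in Python. This is a stand-in for the ANTLR lexer, not a use of ANTLR itself. The word lists are short illustrative subsets chosen here, and the extra `punct` label for braces, parentheses, and semicolons is an assumption, since the six categories above do not name punctuation.

```python
import re

# Illustrative subsets only; a real lexer would cover the full Solidity language.
KEYWORDS = {"contract", "function", "if", "else", "while", "for",
            "return", "break", "continue"}
QUALIFIERS = {"public", "private", "internal", "external"}
VARIABLE_TYPES = {"uint", "int", "bool", "address", "string"}

TOKEN_RE = re.compile(
    r"(?P<constant>\d+)"
    r"|(?P<word>[A-Za-z_]\w*)"
    r"|(?P<operator>==|!=|<=|>=|&&|\|\||[-+*/=<>!])"
    r"|(?P<punct>[{}();,])"
)


def tag_tokens(source):
    """Return (category, text) pairs for each recognized word."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "word":
            # Words are split into the patent's categories by lookup.
            if text in KEYWORDS:
                kind = "keyword"
            elif text in QUALIFIERS:
                kind = "qualifier"
            elif text in VARIABLE_TYPES:
                kind = "variabletype"
            else:
                kind = "identifier"
        tokens.append((kind, text))
    return tokens


tokens = tag_tokens("uint public storedData = 1;")
```

Whitespace and unrecognized characters are skipped silently because `finditer` only yields matches, which is acceptable for a sketch but would hide errors in a real lexer.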
Step 1.3.2: performing syntactic analysis with the ANTLR parser on the word sequence that lexical analysis generated for each program statement, and determining the syntactic structure of each program statement according to predefined grammar rules; the syntactic structures comprise a contract structure, a function structure, a variable structure, an expression structure, and a statement control flow structure;
In a preferred embodiment, the grammar rules, predefined in BNF (Backus-Naur Form) according to the properties of the language, are as follows:
a) Contract ::= "contract" <identifier> "{" [ContractBlock] "}";
b) ContractBlock ::= [Function] | [Variable];
c) Function ::= "function" <identifier> "(" [Variable] ")" <qualifier> [keyword] [Return] [";" | Block];
d) Variable ::= <variabletype> <qualifier> <identifier> ["=" Expression] ";";
e) Expression ::= Functioncall | <identifier> | Expression <operator> | Expression <operator> Expression | <identifier> <operator> <constant>;
f) Functioncall ::= Expression "(" Variable ")";
g) Block ::= "{" Statement "}";
h) Statement ::= IfStatement | WhileStatment | ForStatement | Variable | Expression | Block | "break" | "continue" | Return;
i) IfStatement ::= "if" "(" Expression ")" Block ["else" Block];
j) WhileStatment ::= "while" "(" Expression ")" Block;
k) ForStatement ::= "for" "(" [Variable] ";" [Expression] ";" [Expression] ")" Block;
l) Return ::= "return" [Expression].
Step 1.3.3: converting the Solidity source program into an abstract syntax tree using the ANTLR parser according to the syntactic structure of each program statement;
For example, the code shown below is converted by the ANTLR parser into the abstract syntax tree shown in FIG. 2.
Figure BDA0003151086430000071
Step 1.4: constructing the control flow graph (CFG) of the Solidity source program from the abstract syntax tree, with the following specific steps:
Step 1.4.1: constructing basic blocks from the Block nodes in the abstract syntax tree (AST) for the program statements that belong to a statement control flow structure, recording the statement number StmtId of each statement in each basic block, and recording the incoming and outgoing edges of each basic block;
Step 1.4.2: connecting the basic blocks: two basic blocks are connected when an outgoing edge of one equals an incoming edge of the other, and the jump condition of a basic block is recorded when it has more than one outgoing edge;
Step 1.4.3: recording the variable numbers VarId and the assignment operations Assign in the program statements using static single assignment form (SSA form), in which each variable is assigned only once and a variable that is assigned a second time is renamed.
For example, for the assignment operations "x = 1; y = x + 1; x = y;", the static single assignment form is "x1 = 1; y = x1 + 1; x2 = y;". Recording the assignment operations of variables in static single assignment form facilitates the subsequent analysis of abstract facts.
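The renaming in this example can be reproduced with a short Python sketch. It handles straight-line code only, with no branches and therefore no phi functions, and it follows the example's convention that only variables assigned more than once receive a numeric suffix. The function name `to_ssa` and the (target, expression) input format are assumptions made for illustration.

```python
import re
from collections import Counter


def to_ssa(statements):
    """Rename variables in a list of (target, expression) assignments."""
    assign_counts = Counter(target for target, _ in statements)
    version = {}  # variable name -> latest version number

    def current(name):
        # Variables that were never versioned keep their original name.
        return f"{name}{version[name]}" if name in version else name

    result = []
    for target, expr in statements:
        # Rewrite uses first: they refer to the versions live before this assignment.
        new_expr = re.sub(r"[A-Za-z_]\w*", lambda m: current(m.group()), expr)
        # Only variables assigned more than once are renamed.
        if assign_counts[target] > 1:
            version[target] = version.get(target, 0) + 1
        result.append((current(target), new_expr))
    return result


ssa = to_ssa([("x", "1"), ("y", "x + 1"), ("x", "y")])
```

Applied to the three assignments above, the sketch yields `x1 = 1`, `y = x1 + 1`, and `x2 = y`, matching the form given in the text.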
Step 2: extracting abstract facts from the graph structure of the Solidity source program obtained in step 1; specifically, traversing the graph structure of the Solidity source program and extracting its abstract facts by keyword matching.
The abstract facts are written in the Datalog language and contain all control flow information, data information, and function information of the smart contract; this information constitutes the key features related to security vulnerabilities;
In a preferred embodiment, the abstract facts are structured as follows:
predicate(arg1, ..., argn)
where predicate is the name of the corresponding predicate defined according to the structure of the Solidity source program, and arg1, ..., argn are the other parameters related to the content of the specific Solidity program statement.
In the preferred embodiment, there are four kinds of predicate names: data type, function type, expression structure, and control flow structure. The specific predicate names and parameters are defined as follows:
Traverse all nodes of the AST of the Solidity source program. For a data-type operation node Variable, the predicate name is defined as VarDecl; for a function-type operation node Function, the predicate name is FunDecl; for a function call node Functioncall in an expression structure, the predicate name is FunCall, except that calls to special functions, namely the address-related functions call, delegatecall, send, and transfer and the error handling functions revert, assert, and require, keep the original function name as the predicate name. The parameters are the related statement number, the variable number, and the parameters of all leaf nodes corresponding to the node.
Traverse the control flow graph of the Solidity source program. For an assignment operation Assign between variables, the predicate name is VarAss and the parameters are the corresponding statement number StmtId and the related variable numbers VarId; for statements in the same basic block, the predicate name is Block and the parameters are the basic block number BlockId and the statement number StmtId; when a path exists between basic blocks, the predicate name is BlockPath and the parameters are the corresponding basic block numbers BlockId.
For example, the abstract facts extracted by traversing the graph structure generated from the example code in step 1.3.3 are as follows:
VarDecl(StmtId='S00',VarId='V00',variabletype='uint',identifier='storedData')
Block(BlockId='B00',StmtId='S00')
FunDecl(identifier='set',VarId='V01',qualifier='public')
VarDecl(VarId='V01',variabletype='uint',identifier='x')
Block(BlockId='B01',StmtId='S01')
VarAss(StmtId='S01',VarId='V00',VarId='V01')
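The mapping from graph nodes to facts can be sketched in Python as follows. The node dictionaries are a hypothetical stand-in for AST and CFG nodes (the patent does not define a node data structure), and the helper names `fact` and `extract_facts` are assumptions. Parameters are passed as ordered (name, value) pairs because a fact such as VarAss repeats the VarId key.

```python
# Predicate names taken from the text; the node kinds are keys for the lookup.
PREDICATE_BY_NODE = {
    "Variable": "VarDecl",
    "Function": "FunDecl",
    "Functioncall": "FunCall",
    "Assign": "VarAss",
}
# Calls to these functions keep the function's own name as the predicate.
SPECIAL_FUNCTIONS = {"call", "delegatecall", "send", "transfer",
                     "revert", "assert", "require"}


def fact(predicate, *args):
    """Render one fact in the same textual form as the examples above."""
    rendered = ",".join(f"{name}='{value}'" for name, value in args)
    return f"{predicate}({rendered})"


def extract_facts(nodes):
    facts = []
    for node in nodes:
        name = node.get("name")
        if node["kind"] == "Functioncall" and name in SPECIAL_FUNCTIONS:
            predicate = name
        else:
            predicate = PREDICATE_BY_NODE[node["kind"]]
        facts.append(fact(predicate, *node["args"]))
    return facts


nodes = [
    {"kind": "Variable",
     "args": [("StmtId", "S00"), ("VarId", "V00"),
              ("variabletype", "uint"), ("identifier", "storedData")]},
    {"kind": "Assign",
     "args": [("StmtId", "S01"), ("VarId", "V00"), ("VarId", "V01")]},
    {"kind": "Functioncall", "name": "send", "args": [("StmtId", "S02")]},
]
facts = extract_facts(nodes)
```

The third node is an invented example of a special function call; it is included only to show the name-preserving rule and does not appear in the patent's example.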
Step 3: building, according to the abstract facts of the Solidity source program obtained in step 2, a deep learning model for vulnerability classification of the Solidity source program;
In a preferred embodiment, the deep learning model is designed on the basis of the Transformer model and, as shown in FIG. 3, includes four modules: an input module, an attention module, a residual connection module, and an output module. The deep learning model is built as follows:
Step 3.1: building the input module: the abstract facts obtained in step 2 are vectorized, with the input abstract facts represented by a 0-1 coding matrix X. Because X is too sparse, it is reduced in dimension by applying word embedding and position embedding to the abstract facts it represents; the matrix obtained after dimension reduction is the input required by the attention module. The specific steps are as follows:
Step 3.1.1: performing word embedding on the abstract facts represented by the 0-1 coding matrix X according to formula (1) to obtain the word matrix X′:
$X'_{l \times d} = \tanh\left(X_{l \times v} W_1\right)$ (1)
where $W_1$ is the parameter matrix to be trained in the input module; l is the number of rows of the longest abstract fact among the abstract facts corresponding to different Solidity source programs; v is the vocabulary size of the abstract facts; and d is the dimension of a word after dimension reduction.
Step 3.1.2: performing position embedding on the abstract facts represented by the 0-1 coding matrix X;
So that the deep learning model can better capture the position information of the abstract facts, the input module introduces a position coding mechanism for the abstract facts, namely position embedding.
In the preferred embodiment, the position information of each statement in the abstract facts is represented by a matrix P, and an activation function is applied to P according to formula (2) to obtain the position coding matrix P′:
$P'_{l \times d} = \tanh\left(P_{l \times d}\right)$ (2)
The matrix P is randomly initialized before training; after training, the position coding matrix P′, formed by the position vectors corresponding to each position, is obtained.
Step 3.1.3: for the abstract facts of a smart contract, the position coding matrix P′ and the word matrix X′ are spliced according to formula (3) to obtain the matrix E, which is the input of the attention module.
$E = \mathrm{Concat}(X', P')$ (3)
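Steps 3.1.2 and 3.1.3 can be sketched as follows. The text says only that P′ and X′ are spliced; joining them row by row along the feature axis (so that E has l rows) is an assumption, as are the function names and the stand-in values for X′.

```python
import math
import random

def position_embedding(P):
    """Formula (2): P' = tanh(P), applied element-wise."""
    return [[math.tanh(value) for value in row] for row in P]

def concat_features(X_embedded, P_embedded):
    """Assumed reading of formula (3): join each row of X' with the matching row of P'."""
    if len(X_embedded) != len(P_embedded):
        raise ValueError("X' and P' must have the same number of rows")
    return [x_row + p_row for x_row, p_row in zip(X_embedded, P_embedded)]

rng = random.Random(0)
l, d = 4, 3
P = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(l)]          # random init, trained later
X_embedded = [[0.1 * (i + j) for j in range(d)] for i in range(l)]   # stand-in for X'
E = concat_features(X_embedded, position_embedding(P))
print(len(E), len(E[0]))
```

Under this reading each row of E carries a word vector followed by the vector for that row's position.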
Step 3.2: Build the attention module; a schematic diagram of the attention module is shown in Fig. 4;
The attention module is the core of the deep learning model. Its attention mechanism computes attention coefficients between the words of the abstract facts, so that the vector corresponding to each word of each abstract fact also carries information from the vectors of the other words, and the key information in the abstract facts is captured better. The attention mechanism works by obtaining the attention coefficient between each word and the other words of the abstract facts through matrix multiplication.
In a preferred embodiment, the specific steps of building the attention module are as follows:
Step 3.2.1: Compute the attention coefficients between the words of the abstract facts to obtain the attention coefficient matrix of the abstract facts;
In the preferred embodiment, the attention coefficients are computed in a way similar to BERT and involve three matrices: the Q matrix, the K matrix and the V matrix. The Q matrix is the Query matrix of the abstract facts and consists of the query vectors corresponding to each word of each abstract fact; the K matrix is the Key matrix of the abstract facts and consists of the key vectors corresponding to each word of each abstract fact; the V matrix is the Value matrix of the abstract facts and consists of the value vectors corresponding to each word of each abstract fact. The three matrices are each obtained from the matrix E by a linear transformation; they take random values in the initial state and acquire representational meaning after training.
The attention coefficient matrix of the abstract facts is obtained according to formula (4):
$A = QK^{T}$ (4)
Step 3.2.2: Update the element values of the V matrix according to the attention coefficient matrix A of the abstract facts to obtain the updated V matrix V′;
In a preferred embodiment, after the attention coefficient matrix A is obtained, the element values of the V matrix are updated according to formula (5) to obtain the updated V matrix V′.
$V' = \mathrm{softmax}\left(\frac{A}{\sqrt{d_k}}\right) V$ (5)
where $d_k$ denotes the arithmetic sum of squares of the K matrix. Dividing by it in formula (5) scales the products, which grow quadratically after the multiplication, back to their original magnitude, and reduces the jitter of gradient updates during back propagation. Softmax is an activation function; by adding a nonlinear transformation it strengthens the representational capacity of the matrix V′.
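Steps 3.2.1 and 3.2.2 can be sketched in plain Python. This sketch assumes formula (5) has the usual scaled dot-product form V′ = softmax(A / sqrt(d_k)) V and takes d_k to be the length of one key vector, as in the standard Transformer; the text itself describes d_k differently, so treat that choice as an assumption. The weight matrices `Wq`, `Wk` and `Wv` are hypothetical names for the three linear transformations of E.

```python
import math

def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def softmax_rows(M):
    """Row-wise softmax; the row maximum is subtracted for numerical stability."""
    result = []
    for row in M:
        peak = max(row)
        exps = [math.exp(value - peak) for value in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result

def attention(E, Wq, Wk, Wv):
    """Return (V', attention weights) for input E and three projection matrices."""
    Q, K, V = matmul(E, Wq), matmul(E, Wk), matmul(E, Wv)
    A = matmul(Q, transpose(K))                  # formula (4): A = Q K^T
    scale = math.sqrt(len(K[0]))                 # assumption: d_k = key vector length
    weights = softmax_rows([[a / scale for a in row] for row in A])
    return matmul(weights, V), weights           # assumed form of formula (5)

E = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
V_updated, weights = attention(E, identity, identity, identity)
print(len(V_updated), len(V_updated[0]))
```

Each row of `weights` sums to 1, so every row of V′ is a weighted average of the value vectors of all words.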
Step 3.2.3: Add a layer normalization mechanism to the matrix V′ of the attention module so that the elements of V′ are better normalized, which accelerates convergence and keeps the feature distribution stable;
The layer normalization mechanism takes the inputs of all dimensions of the matrix V′ into account, computes their mean and variance, and then transforms the input of every dimension with the same normalization operation. The mean of all elements of the matrix V′ is given by:
$\mu^{(v)} = \frac{1}{n^{(v)}} \sum_{i=1}^{n^{(v)}} v_i$ (6)
The variance of all elements of the matrix V′ is given by:
$\left(\sigma^{(v)}\right)^2 = \frac{1}{n^{(v)}} \sum_{i=1}^{n^{(v)}} \left(v_i - \mu^{(v)}\right)^2$ (7)
where $n^{(v)}$ is the number of elements in V′, $\mu^{(v)}$ is the mean, $\left(\sigma^{(v)}\right)^2$ is the variance, and $\sigma^{(v)}$ is the standard deviation. Each element $v_i$ of the matrix V′ is normalized according to formula (8):
$v_i' = \frac{v_i - \mu^{(v)}}{\sigma^{(v)}}$ (8)
In the above formula, $v_i'$ is the normalized value of each element $v_i$ of the matrix V′.
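The normalization described above can be sketched as follows, assuming the mean and variance are taken over every element of V′ and each element is then shifted by the mean and divided by the standard deviation. The small `eps` term is an added numerical guard against a zero variance and is not mentioned in the text.

```python
import math

def layer_normalize(V, eps=1e-8):
    """Normalize all elements of V with one shared mean and standard deviation."""
    values = [v for row in V for v in row]
    n = len(values)
    mean = sum(values) / n                                  # mean over all elements
    variance = sum((v - mean) ** 2 for v in values) / n     # variance over all elements
    std = math.sqrt(variance + eps)                         # eps is an assumption
    return [[(v - mean) / std for v in row] for row in V]

V_prime = [[1.0, 2.0], [3.0, 4.0]]
V_normalized = layer_normalize(V_prime)
print(V_normalized)
```

After this step the elements have a mean of about 0 and a variance of about 1, whatever the scale of the attention output was.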
Step 3.3: Build the residual connection module;
In the preferred embodiment, the vocabulary of the source input (the abstract facts) of the deep learning model is very small, so the attention module may over-capture the relationships between words; the residual connection module is added to mitigate this problem to some extent.
In a preferred embodiment, the matrix calculation formula of the residual connection module is as follows:
$Z = H(E) = E + F(E) = E + V''$ (9)
where the matrix E is the input of the attention module and V″ is the output of the attention module; adding these two matrices gives the output Z of the residual connection module. F is the residual function: in the attention module, back propagation yields a mapping H(E) → Z, and without the residual connection module F(E) → 0.
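Formula (9) is an element-wise sum, which only works when the attention output has the same shape as its input. A minimal sketch, with the function name `residual_connection` chosen for illustration:

```python
def residual_connection(E, V_out):
    """Formula (9): Z = E + V'' (element-wise); both matrices must share one shape."""
    if len(E) != len(V_out) or any(len(e) != len(v) for e, v in zip(E, V_out)):
        raise ValueError("E and V'' must have the same shape")
    return [[e + v for e, v in zip(e_row, v_row)] for e_row, v_row in zip(E, V_out)]

E = [[1.0, 2.0], [3.0, 4.0]]
V_out = [[0.5, -2.0], [0.0, 1.0]]
Z = residual_connection(E, V_out)
print(Z)  # prints [[1.5, 0.0], [3.0, 5.0]]
```

If the attention output is all zeros, Z equals E, so the input always passes through unchanged alongside whatever the attention module adds.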
Step 3.4: Build the output module;
The output module outputs the probability that the abstract facts contain a vulnerability, and the loss function is used to maximize the security vulnerability detection capability of the deep learning model.
In a preferred embodiment, the specific steps of building the output module are as follows:
Step 3.4.1: Define the vulnerability category output formula shown in formula (10), which outputs the vulnerability category result for the abstract facts of the smart contract.
$P_k = \mathrm{softmax}(\mathrm{Linear}(Z))$ (10)
where Linear denotes a linear function, that is, one linear transformation applied to the matrix Z; softmax is an activation function; and $P_k$ is the probability value of each vulnerability type.
Step 3.4.2: Construct the loss function of the deep learning model, which gives the model its vulnerability classification capability. The loss function is the multi-class cross-entropy loss shown in formula (11).
$\mathrm{Loss}_1 = -\sum_k y_k \log(P_k)$ (11)
where $y_k$ is the one-hot encoded label corresponding to the abstract facts and k denotes the vulnerability category corresponding to the abstract facts.
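Formulas (10) and (11) can be sketched in plain Python. The text does not say how the matrix Z is reduced to one score per category, so this sketch assumes Z has already been flattened or pooled into a vector `z`; the weights, the bias and the choice of four categories are illustrative values only.

```python
import math

def linear(z, W, b):
    """One linear transformation: z (length m) times W (m x k) plus bias b (length k)."""
    return [sum(zi * W[i][j] for i, zi in enumerate(z)) + b[j] for j in range(len(b))]

def softmax(scores):
    """Turn k scores into k probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y, p):
    """Formula (11): -sum_k y_k log(P_k); terms with y_k = 0 contribute nothing."""
    return -sum(yk * math.log(pk) for yk, pk in zip(y, p) if yk)

z = [0.5, -1.0, 2.0]                       # assumed pooled form of Z
W = [[0.2, -0.1, 0.0, 0.3],
     [0.0, 0.4, -0.2, 0.1],
     [0.5, 0.0, 0.1, -0.3]]                # illustrative weights, k = 4 categories
b = [0.0, 0.0, 0.0, 0.0]
P = softmax(linear(z, W, b))               # formula (10)
y = [1, 0, 0, 0]                           # one-hot label for the first category
loss = cross_entropy(y, P)
print(P, loss)
```

With a one-hot label the loss reduces to the negative log of the probability assigned to the true category, so it falls toward 0 as that probability approaches 1.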
Step 4: Construct the training data set of the deep learning model;
Vulnerability detection can be regarded as a multi-class classification problem in machine learning. Because classification is supervised learning, it requires data (Solidity programs) and labels for the data (vulnerability types). Constructing the training data set therefore consists of collecting the data and labeling each item with its category.
In the preferred embodiment, a total of 1500 program files of real-world Ethereum smart contracts are first collected. These 1500 program files are then manually labeled according to the smart contract vulnerability definitions of the SWC Registry to build the training data set of the deep learning model. The SWC Registry is currently the mainstream standard library for annotating smart contract vulnerabilities. It is built by Ethereum security personnel and developers in the Smart Contract Security organization. The registry provides a classification of Ethereum smart contract security vulnerabilities, partial test cases, and the consequences caused by the vulnerabilities. The number and proportion of vulnerabilities of each category in the training data set are shown in Table 2.
Table 2. Number and proportion of vulnerabilities by category
Vulnerability category | Number | Proportion
Reentrancy vulnerability | 1014 | 67.6%
Timestamp dependency vulnerability | 715 | 46.7%
Infinite loop vulnerability | 326 | 21.7%
No vulnerability | 293 | 19.5%
Step 5: Train the deep learning model with the training data set.
In a preferred embodiment, training is divided into two steps. The first is pre-training (pre-train), whose aim is to make the loss function value of the deep learning model drop rapidly. The second is fine-tuning (Finetune Train), whose aim is to further improve the security detection capability of the deep learning model. Combining pre-training with fine-tuning gives the deep learning model better robustness and extensibility.
In a preferred embodiment, a Jupyter Notebook platform with GPU resources is used for pre-training and fine-tuning the deep learning model. During pre-training, the batch size is set to 16, the number of epochs to 80, and Adam is chosen as the optimizer; pre-training stops and fine-tuning begins once the loss value stabilizes at 1. During fine-tuning, the batch size is set to 4, the number of epochs to 20, and SGD is chosen as the optimizer; fine-tuning stops once the loss value stabilizes at 0.1. After pre-training and fine-tuning, the deep learning model is able to classify vulnerabilities in smart contracts.
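The two-phase schedule can be written down as configuration plus a stopping rule. The batch sizes, epoch limits, optimizers and loss targets below come from the text; the `Phase` class, the `has_stabilized` rule (the last few losses are close together and near the target) and its `window` and `tolerance` parameters are assumptions, since the text does not define when a loss counts as stable.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Phase:
    name: str
    batch_size: int
    max_epochs: int
    optimizer: str
    stop_loss: float

PRETRAIN = Phase("pre-train", batch_size=16, max_epochs=80, optimizer="Adam", stop_loss=1.0)
FINETUNE = Phase("fine-tune", batch_size=4, max_epochs=20, optimizer="SGD", stop_loss=0.1)

def has_stabilized(losses, target, window=3, tolerance=0.05):
    """Assumed rule: the last `window` losses vary little and sit at or near the target."""
    if len(losses) < window:
        return False
    recent = losses[-window:]
    return max(recent) - min(recent) <= tolerance and max(recent) <= target + tolerance

def epochs_until_stop(phase, epoch_losses, window=3, tolerance=0.05):
    """Count epochs run before the phase stops, given one loss value per epoch."""
    history = []
    for epoch, loss in enumerate(epoch_losses, start=1):
        if epoch > phase.max_epochs:
            break
        history.append(loss)
        if has_stabilized(history, phase.stop_loss, window, tolerance):
            return epoch
    return len(history)

# Made-up loss curve, used only to exercise the stopping rule.
print(epochs_until_stop(PRETRAIN, [3.0, 2.0, 1.4, 1.03, 1.01, 1.0, 0.99]))  # prints 6
```

In a real run the loss values would come from the training loop rather than a fixed list, and the epoch limit acts as a fallback when the loss never settles.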
Step 6: Use the trained deep learning model to detect vulnerabilities in the input smart contract and output the security detection result for the smart contract's Solidity source program;
The trained deep learning model is used to detect vulnerabilities in the smart contract. The output is a probability value for each vulnerability type: if the output probability is greater than or equal to 0.5, the smart contract is considered to contain that vulnerability; if it is less than 0.5, it is not. In this way the security of smart contracts can be checked effectively and automatically.
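The decision rule in step 6 amounts to a threshold test on each output probability. A minimal sketch; the function name, the category labels and the probability values are illustrative assumptions:

```python
def detect_vulnerabilities(probabilities, threshold=0.5):
    """Return the categories whose probability is greater than or equal to the threshold."""
    return [name for name, p in probabilities.items() if p >= threshold]

# Made-up model output for one contract.
example = {
    "reentrancy": 0.72,
    "timestamp dependency": 0.18,
    "infinite loop": 0.07,
    "no vulnerability": 0.03,
}
print(detect_vulnerabilities(example))  # prints ['reentrancy']
```

A probability of exactly 0.5 counts as a detection, matching the "greater than or equal to" wording in the text.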
It is to be understood that the above-described embodiments are only a few embodiments of the present invention, and not all embodiments. The above examples are only for explaining the present invention and do not constitute a limitation to the scope of protection of the present invention. All other embodiments, which can be derived by those skilled in the art from the above-described embodiments without any creative effort, namely all modifications, equivalents, improvements and the like made within the spirit and principle of the present application, fall within the protection scope of the present invention claimed.

Claims (10)

1. A smart contract security detection method based on static analysis and deep learning, characterized by comprising the following steps:
Step 1: perform static analysis on the Solidity source program of a smart contract to obtain the graph structure of the Solidity source program, wherein the static analysis comprises lexical analysis and syntax analysis, and the graph structure comprises an abstract syntax tree (AST) and a control flow graph (CFG);
Step 2: extract abstract facts from the graph structure of the Solidity source program obtained in step 1;
Step 3: based on the abstract facts of the Solidity source program obtained in step 2, build a deep learning model for classifying vulnerabilities in the Solidity source program, the deep learning model comprising an input module, an attention module, a residual connection module and an output module;
Step 4: construct a training data set for the deep learning model;
Step 5: train the deep learning model with the training data set;
Step 6: use the trained deep learning model to detect vulnerabilities in an input smart contract and output the security detection result for the smart contract's Solidity source program.
2. The smart contract security detection method based on static analysis and deep learning according to claim 1, characterized in that step 1 specifically comprises the following steps:
Step 1.1: preprocess the Solidity source program of the smart contract, deleting all content irrelevant to the security detection of the Solidity source program;
Step 1.2: import into the preprocessed Solidity source program the source code files referenced by its import statements to obtain the complete Solidity source program;
Step 1.3: for the complete Solidity source program, use the ANTLR parser to convert the Solidity source program into an abstract syntax tree;
Step 1.4: construct the control flow graph (CFG) of the Solidity source program from the abstract syntax tree.
3. The smart contract security detection method based on static analysis and deep learning according to claim 2, characterized in that step 1.3 specifically comprises the following steps:
Step 1.3.1: use the ANTLR parser to perform lexical analysis on the complete Solidity source program, annotating the attribute of each word in the Solidity source program according to predefined word attribute categories, to obtain for each program statement a word sequence annotated with word attributes;
Step 1.3.2: for the word sequence of each program statement produced by lexical analysis, use the ANTLR parser to perform syntax analysis and determine the syntactic structure of each program statement according to predefined grammar rules, the syntactic structures comprising contract structure, function structure, variable structure, expression structure and statement control flow structure;
Step 1.3.3: according to the syntactic structure of each program statement, use the ANTLR parser to convert the Solidity source program into an abstract syntax tree.
4. The smart contract security detection method based on static analysis and deep learning according to claim 3, characterized in that the word attribute categories comprise keyword <keyword>, visibility qualifier <qualifier>, variable data type <variabletype>, identifier <identifier>, operator <operator> and constant <constant>.
5. The smart contract security detection method based on static analysis and deep learning according to claim 3, characterized in that the predefined grammar rules are as follows:
a) Contract ::= "contract" <identifier> "{" [contractBlock] "}";
b) ContractBlock ::= [Function] | [Variable];
c) Function ::= "function" <identifier> "(" [Variable] ")" <qualifier> [keyword] [Return] [";" | Block];
d) Variable ::= <variabletype> <qualifier> <identifier> ["=" Expression] ";";
e) Expression ::= Functioncall | <identifier> | Expression <operator> | Expression <operator> Expression | <identifier> <operator> <constant>;
f) Functioncall ::= Expression "(" Variable ")";
g) Block ::= "{" Statement "}";
h) Statement ::= IfStatement | WhileStatment | ForStatement | Variable | Expression | Block | "break" | "continue" | Return;
i) IfStatement ::= "if" "(" Expression ")" Block ["else" Block];
j) WhileStatment ::= "while" "(" Expression ")" Block;
k) ForStatement ::= "for" "(" [Variable] ";" [Expression] ";" [Expression] ")" Block;
l) Return ::= "return" [Expression].
6. The smart contract security detection method based on static analysis and deep learning according to claim 2, characterized in that step 1.4 specifically comprises the following steps:
Step 1.4.1: from the program statements that belong to statement control flow structures, construct the different basic blocks (Block) according to the Block nodes in the abstract syntax tree (AST), recording the statement number StmtId of every statement in each basic block and the incoming and outgoing edges of each basic block;
Step 1.4.2: connect the basic blocks: when an outgoing edge of one basic block equals an incoming edge of another basic block, the two basic blocks are connected; when a basic block has more than one outgoing edge, record the jump condition of that basic block;
Step 1.4.3: use static single assignment form to record the variable number VarId and the assignment operation Assign of the variables in the program statements, that is, each variable is assigned only once, and a variable that is assigned a second time is renamed.
7. The smart contract security detection method based on static analysis and deep learning according to claim 1, characterized in that the abstract facts contain all the control flow information, data information and function information of the smart contract and are written in the Datalog language, the abstract facts having the following structure:
predicate(arg1, ..., argn)
where predicate is the predicate name defined according to the structure of the Solidity source program, covering data types, function types, expression structures and control flow structures; and arg1, ..., argn are the other parameters related to the content of the specific Solidity program statement.
8. The smart contract security detection method based on static analysis and deep learning according to claim 1 or 7, characterized in that the abstract facts are extracted from the graph structure of the Solidity source program obtained in step 1 by traversing the graph structure of the Solidity source program and extracting the abstract facts of the Solidity source program by keyword matching.
9. The smart contract security detection method based on static analysis and deep learning according to claim 1, characterized in that step 3 specifically comprises the following steps:
Step 3.1: build the input module: represent the abstract facts obtained in step 2 with a 0-1 encoding matrix X, apply word embedding and position embedding separately to the abstract facts represented by the 0-1 encoding matrix X, and concatenate the matrix obtained from word embedding with the matrix obtained from position embedding to form the matrix E, which serves as the input of the attention module;
Step 3.2: build the attention module, specifically comprising the following steps:
Step 3.2.1: obtain the Q matrix, K matrix and V matrix of the abstract facts from the matrix E by three separate linear transformations, and obtain the attention coefficient matrix A of the abstract facts according to formula (4);
$A = QK^{T}$ (4)
where the Q matrix is the Query matrix of the abstract facts and consists of the query vectors corresponding to each word of each abstract fact; the K matrix is the Key matrix of the abstract facts and consists of the key vectors corresponding to each word of each abstract fact; and the V matrix is the Value matrix of the abstract facts and consists of the value vectors corresponding to each word of each abstract fact;
Step 3.2.2: according to the attention coefficient matrix A of the abstract facts, update the element values of the V matrix according to formula (5) to obtain the updated V matrix V′;
$V' = \mathrm{softmax}\left(\frac{A}{\sqrt{d_k}}\right) V$ (5)
where $d_k$ denotes the arithmetic sum of squares of the K matrix, and the softmax function is an activation function;
Step 3.2.3: add a layer normalization mechanism to the matrix V′ of the attention module so that the elements of V′ are better normalized, which accelerates convergence and keeps the feature distribution stable;
Step 3.3: build the residual connection module, whose matrix calculation formula is as follows:
$Z = H(E) = E + F(E) = E + V''$ (9)
where the matrix E is the input of the attention module; V″ is the output of the attention module; Z is the output of the residual connection module; and F is the residual function: in the attention module, back propagation yields a mapping H(E) → Z, and without the residual connection module F(E) → 0;
Step 3.4: build the output module to output the probability that the abstract facts contain a vulnerability, the specific steps being as follows:
Step 3.4.1: define the vulnerability category output formula shown in formula (10), which outputs the vulnerability category result for the abstract facts of the smart contract;
$P_k = \mathrm{softmax}(\mathrm{Linear}(Z))$ (10)
where Linear denotes a linear function that applies one linear transformation to the matrix Z, and $P_k$ is the probability value of each vulnerability type;
Step 3.4.2: construct the loss function of the deep learning model so that the model is able to classify vulnerabilities.
10. The smart contract security detection method based on static analysis and deep learning according to claim 9, characterized in that the loss function is the multi-class cross-entropy loss function shown in formula (11):
$\mathrm{Loss}_1 = -\sum_k y_k \log(P_k)$ (11)
where $y_k$ is the one-hot encoded label corresponding to the abstract facts, and k denotes the vulnerability category corresponding to the abstract facts.
CN202110766768.XA 2021-07-07 2021-07-07 Intelligent contract security detection method based on static analysis and deep learning Expired - Fee Related CN113486357B (en)


Publications (2)

Publication Number Publication Date
CN113486357A (en) 2021-10-08
CN113486357B (en) 2024-02-13

Family

ID=77941656


Publication number Publication date
CN113486357B (en) 2024-02-13

Similar Documents

Publication Publication Date Title
CN113486357B (en) Intelligent contract security detection method based on static analysis and deep learning
CN113360915B (en) Smart contract multi-vulnerability detection method and system based on source code graph representation learning
CN111428044B (en) Method, device, equipment and storage medium for acquiring supervision and identification results in multiple modes
CN111191002B (en) Neural code searching method and device based on hierarchical embedding
CN112035841B (en) Intelligent contract vulnerability detection method based on expert rules and serialization modeling
CN104809069A (en) Source node loophole detection method based on integrated neural network
CN117725592A (en) A smart contract vulnerability detection method based on directed graph attention network
CN117252261A (en) Knowledge graph construction method, electronic equipment and storage medium
CN112699375A (en) Block chain intelligent contract security vulnerability detection method based on network embedded similarity
Zhang et al. SVScanner: Detecting smart contract vulnerabilities via deep semantic extraction
CN118427842B (en) LLM-based SAST vulnerability rapid analysis method, device and equipment
CN113626826B (en) Smart contract security detection methods, systems, equipment, terminals and applications
CN111045670A (en) Method and device for identifying multiplexing relationship between binary code and source code
CN112784279A (en) Software product safety risk assessment method based on dependency library version information
CN111881446A (en) Method and device for identifying malicious code in industrial Internet
CN117688560A (en) Semantic analysis-oriented intelligent detection method for malicious software
CN116595537A (en) A Vulnerability Detection Method for Generative Smart Contracts Based on Multimodal Features
CN119293192A (en) Intelligent dialogue method and related device based on user authority separation
CN118709191A (en) A source code vulnerability detection and positioning method, device, equipment and storage medium
CN116628695A (en) Vulnerability mining method and device based on multi-task learning
CN113971283B (en) A method and device for detecting malicious applications based on features
US20250045393A1 (en) Binary file malware detection with structure aware machine learning
CN118395450A (en) Vulnerability detection model training method, detection method, device, equipment and medium
CN118013440A (en) An abnormal detection method for personal sensitive information desensitization operation based on event graph
CN116415251B (en) A vulnerability impact range reasoning method and system based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20240213