CN113486357B

CN113486357B - Smart contract security detection method based on static analysis and deep learning

Info

Publication number: CN113486357B
Application number: CN202110766768.XA
Authority: CN
Inventors: 周福才; 罗熙霖; 焦梓; 孙劲桐
Original assignee: 东北大学
Priority date: 2021-07-07
Filing date: 2021-07-07
Publication date: 2024-02-13
Anticipated expiration: 2041-07-07
Also published as: CN113486357A

Abstract

The invention discloses a smart contract security detection method based on static analysis and deep learning, and relates to the technical field of blockchain smart contract security. The method comprises: performing static analysis on a smart contract Solidity source program to obtain a graph structure of the program; extracting abstract facts from the graph structure; building, according to the abstract facts, a deep learning model that classifies vulnerabilities in the Solidity source program, the model comprising an input module, an attention module, a residual connection module and an output module; constructing a training data set; training the deep learning model with the training data set; and performing vulnerability detection on input smart contracts with the trained model and outputting a security detection result for the Solidity source program. The method can comprehensively analyze the behavior of a smart contract Solidity source program and improve the accuracy of its security detection.

Description

Smart contract security detection method based on static analysis and deep learning

Technical Field

The invention relates to the technical field of blockchain smart contract security, and in particular to a smart contract security detection method based on static analysis and deep learning.

Background

A smart contract is a special protocol deployed on a blockchain. Buterin recognized that decentralized computation is applicable beyond transactions and designed the Ethereum blockchain, which supports the execution of smart contracts. A smart contract contains code functions whose purposes include trading, decision making and sending ether. Smart contracts have proven suitable for many applications, including securities, communications, banking and healthcare. However, smart contracts are transparent: participants can view a contract's source code. Moreover, a smart contract cannot be changed once deployed, so the software cannot be updated promptly after a vulnerability is discovered, and losses can be reduced only by measures such as suspending transactions or forking. If a smart contract does not undergo security detection, its vulnerabilities cannot be repaired in time, which impairs normal use of the contract and can even harm the interests of its users with serious consequences. Examples include the DAO attack, in which anonymous hackers exploited a reentrancy vulnerability in a smart contract to steal 3.6 million ether; the Parity incident, in which an attacker found a timestamp vulnerability in the smart contract code library and exploited inconsistent timestamps to destroy the library, causing a loss of 150 million dollars; and a malicious contract incident, in which five hackers maliciously deployed 34,000 flawed smart contracts, triggering an abnormal chain reaction on Ethereum and leading to the theft of ether worth 4.4 million dollars.
Despite such severe security threats, there is currently no good general-purpose means of detecting smart contract vulnerabilities, and smart contract security still depends largely on the security skills of contract developers and on code audits based on expert experience. An effective scheme for automatically detecting smart contract security is therefore needed. Existing automated security detection has the following problems: (1) it cannot analyze smart contract code with full coverage; (2) its false positive rate is high; (3) it targets only specific attacks and is not easily extended to detect other attacks.

Disclosure of Invention

To address the shortcomings of the prior art, the invention provides a smart contract security detection method based on static analysis and deep learning, which aims to solve the problem of smart contract security detection.

The technical scheme of the invention is as follows:

1. A smart contract security detection method based on static analysis and deep learning, characterized by comprising the following steps:

Step 1: performing static analysis on the smart contract Solidity source program to obtain a graph structure of the program; the static analysis comprises lexical analysis and syntax analysis; the graph structure comprises an abstract syntax tree (AST) and a control flow graph (CFG);

Step 2: extracting abstract facts from the graph structure of the Solidity source program obtained in step 1;

Step 3: building, according to the abstract facts of the Solidity source program obtained in step 2, a deep learning model for classifying vulnerabilities in the Solidity source program, the deep learning model comprising an input module, an attention module, a residual connection module and an output module;

Step 4: constructing a training data set for the deep learning model;

Step 5: training the deep learning model with the training data set;

Step 6: performing vulnerability detection on input smart contracts with the trained deep learning model, and outputting a security detection result for the smart contract Solidity source program.

Further, according to the smart contract security detection method based on static analysis and deep learning, step 1 specifically includes the following steps:

Step 1.1: preprocessing the smart contract Solidity source program by deleting all content irrelevant to the security detection of the Solidity source program;

Step 1.2: importing the source code files referenced by import statements into the preprocessed smart contract Solidity source program to obtain a complete Solidity source program;

Step 1.3: converting the complete Solidity source program into an abstract syntax tree by using an ANTLR parser;

Step 1.4: constructing a control flow graph (CFG) of the Solidity source program from the abstract syntax tree.

Further, according to the smart contract security detection method based on static analysis and deep learning, step 1.3 specifically includes the following steps:

Step 1.3.1: performing lexical analysis on the complete Solidity source program by using the ANTLR parser, and labeling the words of the Solidity source program with attributes according to predefined word attribute categories, to obtain, for each program statement, a word sequence annotated with word attributes;

Step 1.3.2: performing syntax analysis, by using the ANTLR parser, on the word sequence produced by lexical analysis for each program statement, and determining the syntax structure of each program statement according to predefined grammar rules; the syntax structures comprise a contract structure, a function structure, a variable structure, an expression structure and a statement control flow structure;

Step 1.3.3: converting the Solidity source program into an abstract syntax tree by using the ANTLR parser according to the syntax structure of each program statement.

Further, according to the smart contract security detection method based on static analysis and deep learning, the word attribute categories include a keyword <keyword>, a visibility qualifier <qualifier>, a variable data type <variabletype>, an identifier <identifier>, an operator <operator> and a constant <constant>.

Further, according to the smart contract security detection method based on static analysis and deep learning, the predefined grammar rules are as follows:

a)Contract::＝”contract”<identifier>”{”[contractBlock]”}”；

b)ContractBlock::＝[Function]|[Variable]；

c)Function::＝”function”<identifier>”(”[Variable]”)”<qualifier>[keyword][Return][”；”|Block]；

d)Variable::＝<variabletype><qualifier><identifier>[”＝”Expression]”；”；

e)Expression::＝Functioncall|<identifier>|Expression<operator>|Expression<operator>Expression|<identifier><operator><constant>；

f)Functioncall::＝Expression”(”Variable”)”；

g)Block::＝”{”Statement”}”；

i)IfStatement::＝”if””(”Expression”)”Block[”else”Block]；

j)WhileStatment::＝”while””(”Expression”)”Block；

k)ForStatement::＝”for””(”[Variable]”；”[Expression]”；”[Expression]”)”Block；

l)Return::＝”return”[Expression]。

Further, according to the smart contract security detection method based on static analysis and deep learning, step 1.4 specifically includes the following steps:

Step 1.4.1: constructing the individual basic blocks (Block) from the Block nodes of the abstract syntax tree (AST), using the program statements that belong to a statement control flow structure; recording the statement number StmtId of every statement in each basic block; and recording the in-edges and out-edges of each basic block;

Step 1.4.2: connecting the basic blocks: when an out-edge of one basic block is equal to an in-edge of another basic block, the two basic blocks are connected; when a basic block has more than one out-edge, the jump condition of that basic block is recorded;

Step 1.4.3: recording the variable number VarId and the assignment operation Assign of each variable in the program statements in static single assignment form, that is, each variable is assigned only once, and a variable that is assigned a second time is given a modified variable name.
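Steps 1.4.1 and 1.4.2 can be illustrated with a minimal Python sketch. The `BasicBlock` class, the edge labels and the functions `connect_blocks` and `branching_blocks` are hypothetical names chosen for illustration. The sketch assumes that every edge carries a label and that two blocks are linked when an out-edge label of one equals an in-edge label of the other; it is not the patented implementation.

```python
from dataclasses import dataclass, field


@dataclass
class BasicBlock:
    block_id: str                                  # e.g. "B00"
    stmt_ids: list                                 # statement numbers (StmtId) in the block
    in_edges: set = field(default_factory=set)     # labels of incoming edges
    out_edges: set = field(default_factory=set)    # labels of outgoing edges
    jump_condition: str = ""                       # recorded only for branching blocks


def connect_blocks(blocks):
    """Return (source, target) pairs whose out-edge and in-edge labels match."""
    paths = []
    for src in blocks:
        for dst in blocks:
            if src is not dst and src.out_edges & dst.in_edges:
                paths.append((src.block_id, dst.block_id))
    return paths


def branching_blocks(blocks):
    """Blocks with more than one out-edge, for which a jump condition is recorded."""
    return [b.block_id for b in blocks if len(b.out_edges) > 1]


# Hypothetical CFG for: if (x > 0) { ... } else { ... }
blocks = [
    BasicBlock("B00", ["S00"], out_edges={"e1", "e2"}, jump_condition="x > 0"),
    BasicBlock("B01", ["S01"], in_edges={"e1"}),
    BasicBlock("B02", ["S02"], in_edges={"e2"}),
]
paths = connect_blocks(blocks)
```

Here `B00` branches to `B01` and `B02`, so it is the only block that carries a jump condition.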

Further, according to the smart contract security detection method based on static analysis and deep learning, the abstract facts contain all control flow information, data information and function information of the smart contract and are written in the Datalog language; an abstract fact has the following structural form:

predicate(arg1, …, argn)

where predicate is the predicate name defined according to the structure of the Solidity source program, covering data types, function types, expression structures and control flow structures; arg1, …, argn are the further parameters related to the content of the specific Solidity program statement.

Further, according to the smart contract security detection method based on static analysis and deep learning, the abstract facts are extracted from the graph structure of the Solidity source program obtained in step 1 as follows: the graph structure of the Solidity source program is traversed, and the abstract facts of the Solidity source program are extracted by keyword matching.

Further, according to the smart contract security detection method based on static analysis and deep learning, step 3 specifically includes the following steps:

Step 3.1: building an input module: the abstract facts obtained in step 2 are represented by a 0-1 coding matrix X; word embedding and position embedding are each applied to the abstract facts represented by X; and the matrix E, obtained by splicing the matrix produced by word embedding with the matrix produced by position embedding, is used as the input of the attention module;

Step 3.2: building an attention module, which specifically includes the following steps:

Step 3.2.1: obtaining the Q matrix, the K matrix and the V matrix of the abstract facts from the E matrix through three linear transformations, and obtaining the attention coefficient matrix A of the abstract facts according to formula (4);

$$A = QK^{T} \tag{4}$$

where the Q matrix is the Query matrix of the abstract facts and consists of the Query vectors corresponding to each word of each abstract fact; the K matrix is the Key matrix of the abstract facts and consists of the Key vectors corresponding to each word of each abstract fact; the V matrix is the Value matrix of the abstract facts and consists of the Value vectors corresponding to each word of each abstract fact;

Step 3.2.2: updating the element values of the V matrix according to the attention coefficient matrix A of the abstract facts and formula (5) to obtain the updated V matrix V';

$$V' = \mathrm{softmax}\left(\frac{A}{\sqrt{d_k}}\right)V \tag{5}$$

where $\sqrt{d_k}$ denotes the arithmetic square root of the dimension $d_k$ of the K matrix, and softmax is the activation function;

Step 3.2.3: applying a layer normalization mechanism to the matrix V' of the attention module to obtain the matrix V″, so that the elements of the matrix are better normalized, convergence is accelerated and the stability of the feature distribution is ensured;

Step 3.3: building a residual connection module, whose matrix calculation formula is as follows:

$$Z = H(E) = E + F(E) = E + V'' \tag{9}$$

where the matrix E is the input of the attention module; V″ is the output of the attention module; Z is the output of the residual connection module; F is the residual function; in the attention module, the mapping H(E) → Z is obtained through back-propagation, and if there is no residual connection module, F(E) → 0;

Step 3.4: building an output module that outputs the probability of each possible vulnerability for the abstract facts, which specifically includes the following steps:

Step 3.4.1: defining the vulnerability class output formula shown in formula (10), which outputs the vulnerability class result for the abstract facts of the smart contract;

$$P_k = \mathrm{softmax}(\mathrm{Linear}(Z)) \tag{10}$$

where Linear denotes a linear function that applies one linear transformation to the matrix Z, and $P_k$ is the probability value of each vulnerability type;

Step 3.4.2: constructing the loss function of the deep learning model, through which the model acquires the ability to classify vulnerabilities.

Further, according to the smart contract security detection method based on static analysis and deep learning, the loss function is the multi-class cross-entropy loss function shown in formula (11):

$$Loss_1 = -\sum_k y_k \log(P_k) \tag{11}$$

where $y_k$ denotes the one-hot encoded label corresponding to the abstract facts, and k indexes the vulnerability class corresponding to the abstract facts.

Compared with the prior art, the invention has the following beneficial effects:

1. The behavior of the smart contract Solidity source program can be analyzed comprehensively. Security detection of a smart contract first requires a comprehensive analysis of its code behavior. The method first derives the abstract syntax tree and the control flow graph of the smart contract Solidity source program and then abstracts this graph structure into a fact representation. The abstract facts cover the code behavior more comprehensively, represent the semantic features of the program effectively, and provide support for the deep learning model.

2. The extensibility of security detection for smart contract Solidity source programs is enhanced. Traditional security detection methods rely mainly on predefined rules and address only known security vulnerabilities. The deep learning model used by the method is not limited to specific security vulnerabilities: the model can be trained on a supplemented training set to detect multiple kinds of security vulnerability, so it is easy to extend. For previously unknown vulnerabilities, the method acquires the ability to detect them simply by retraining the model, which gives it better extensibility than traditional security detection methods.

3. The accuracy of security detection for smart contract Solidity source programs is improved. The method combines static analysis and deep learning to perform security detection on smart contracts and improves an existing deep learning model by adding an attention module that learns the key information in the abstract facts. On the basis of an improved vectorized representation of the abstract facts, this effectively raises the accuracy of security detection classification and lowers the false positive rate for security vulnerabilities.

Drawings

FIG. 1 is a flow chart of the smart contract security detection method based on static analysis and deep learning of the present invention;

FIG. 2 is an abstract syntax tree diagram of the example code in an embodiment of the invention;

FIG. 3 is a schematic diagram of the deep learning model structure in an embodiment of the invention;

FIG. 4 is a schematic diagram of the attention module in an embodiment of the invention.

Detailed Description

The present invention is described in further detail below with reference to the drawings and specific embodiments. The following examples only illustrate the present invention and do not limit its scope.

FIG. 1 is a flow chart of the smart contract security detection method based on static analysis and deep learning of the present invention, which comprises the following steps:

Step 1: performing static analysis on the smart contract Solidity source program to obtain a graph structure of the program; the static analysis comprises lexical analysis and syntax analysis; the graph structure includes an abstract syntax tree (Abstract Syntax Tree, AST) and a control flow graph (Control Flow Graph, CFG).

In a preferred embodiment, the preprocessing of the smart contract Solidity source program deletes single-line comments ("// …"), multi-line comments ("/* … */"), spaces (" "), line breaks ("\n") and all other content irrelevant to the security detection of the Solidity source program.
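The preprocessing can be sketched in Python as follows. The function name `preprocess` and the sample contract `SimpleStorage` are hypothetical. The sketch also makes one deliberate simplification, stated here as an assumption: runs of whitespace are collapsed to a single space rather than deleted outright, so that adjacent tokens such as `uint storedData` remain separable for the later lexical analysis. String literals that contain `//` are not handled.

```python
import re

# One pattern for both comment styles; scanning left to right means that
# whichever comment opener appears first wins.
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)


def preprocess(source: str) -> str:
    """Remove comments, then collapse spaces and line breaks."""
    without_comments = COMMENT_RE.sub(" ", source)
    return re.sub(r"\s+", " ", without_comments).strip()


example = """
// stores a single value
contract SimpleStorage {
    uint storedData; /* state
                        variable */
}
"""
cleaned = preprocess(example)
```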

Step 1.2: importing the source code files referenced by import statements into the preprocessed smart contract Solidity source program to obtain a complete Solidity source program.

Step 1.3: converting the complete Solidity source program into an abstract syntax tree by using an ANTLR parser.

The word attribute categories include a keyword <keyword>, a visibility qualifier <qualifier>, a variable data type <variabletype>, an identifier <identifier>, an operator <operator> and a constant <constant>.
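The labeling of words with attribute categories can be sketched with a small regular-expression tokenizer. This is a stand-in for illustration, not the ANTLR lexer the method uses. The word lists are deliberately incomplete, the names `classify` and `tokenize` are hypothetical, and the extra `punctuation` category for separators is an assumption that does not appear in the category list above.

```python
import re

# Small, illustrative word lists; a real lexer would cover the full language.
KEYWORDS = {"contract", "function", "if", "else", "while", "for", "return", "returns"}
QUALIFIERS = {"public", "private", "internal", "external"}
VARIABLE_TYPES = {"uint", "int", "bool", "address", "string"}
PUNCTUATION = set("{}();,.")

LEXEME_RE = re.compile(r"\d+|[A-Za-z_]\w*|==|!=|<=|>=|&&|\|\||[-+*/%=<>!]|[{}();,.]")


def classify(lexeme: str) -> str:
    """Map one lexeme to its word attribute category."""
    if lexeme in KEYWORDS:
        return "keyword"
    if lexeme in QUALIFIERS:
        return "qualifier"
    if lexeme in VARIABLE_TYPES:
        return "variabletype"
    if lexeme.isdigit():
        return "constant"
    if lexeme[0].isalpha() or lexeme[0] == "_":
        return "identifier"
    if lexeme in PUNCTUATION:
        return "punctuation"  # extra category assumed here for separators
    return "operator"


def tokenize(statement: str):
    """Return the word sequence of a statement with attribute labels."""
    return [(classify(lx), lx) for lx in LEXEME_RE.findall(statement)]


tokens = tokenize("uint public storedData = 1;")
```

The resulting sequence follows the order `<variabletype><qualifier><identifier>` that the Variable grammar rule expects.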

In a preferred embodiment, the grammar rules are predefined in BNF (Backus-Naur Form) according to the characteristics of the Solidity language, as follows:

m)Contract::＝”contract”<identifier>”{”[contractBlock]”}”；

n)ContractBlock::＝[Function]|[Variable]；

o)Function::＝”function”<identifier>”(”[Variable]”)”<qualifier>[keyword][Return][”；”|Block]；

p)Variable::＝<variabletype><qualifier><identifier>[”＝”Expression]”；”；

q)Expression::＝Functioncall|<identifier>|Expression<operator>|Expression<operator>Expression|<identifier><operator><constant>；

r)Functioncall::＝Expression”(”Variable”)”；

s)Block::＝”{”Statement”}”；

u)IfStatement::＝”if””(”Expression”)”Block[”else”Block]；

v)WhileStatment::＝”while””(”Expression”)”Block；

w)ForStatement::＝”for””(”[Variable]”；”[Expression]”；”[Expression]”)”Block；

x)Return::＝”return”[Expression]。

Step 1.3.3: converting the Solidity source program into an abstract syntax tree by using the ANTLR parser according to the syntax structure of each program statement.

For example, the ANTLR parser converts the example code of this embodiment into the abstract syntax tree shown in FIG. 2.

Step 1.4: constructing a control flow graph (CFG) of the Solidity source program from the abstract syntax tree, with the following specific steps:

Step 1.4.3: recording the variable number VarId and the assignment operation Assign of each variable in the program statements in static single assignment form (SSA form), that is, each variable is assigned only once, and a variable that is assigned a second time is given a modified variable name.

For example, the assignment sequence "x=1; y=x+1; x=y;" has the static single assignment form "x1=1; y=x1+1; x2=y;". Recording variable assignments in static single assignment form facilitates the subsequent analysis of abstract facts.
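The renaming in the example above can be reproduced with a short Python sketch. It handles only straight-line statements of the form `name=expression`; branches, which would require merging versions, are out of scope. The function name `to_ssa` is hypothetical, and the choice to number only variables that are assigned more than once is inferred from the example, where `y` keeps its name.

```python
import re
from collections import Counter


def to_ssa(statements):
    """Rename variables so that each name is assigned exactly once."""
    targets = [s.split("=", 1)[0].strip() for s in statements]
    assign_counts = Counter(targets)
    version = {}   # latest version number of each renamed variable
    result = []
    for stmt in statements:
        lhs, rhs = (part.strip() for part in stmt.split("=", 1))

        # Uses on the right-hand side refer to the latest version so far.
        def current(match):
            name = match.group(0)
            return f"{name}{version[name]}" if name in version else name

        rhs = re.sub(r"[A-Za-z_]\w*", current, rhs)
        # Only variables assigned more than once receive a numeric suffix.
        if assign_counts[lhs] > 1:
            version[lhs] = version.get(lhs, 0) + 1
            lhs = f"{lhs}{version[lhs]}"
        result.append(f"{lhs}={rhs}")
    return result


ssa = to_ssa(["x=1", "y=x+1", "x=y"])
```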

Step 2: extracting abstract facts from the graph structure of the Solidity source program obtained in step 1; specifically, the graph structure of the Solidity source program is traversed and the abstract facts of the Solidity source program are extracted by keyword matching.

The abstract facts are written in the Datalog language and contain all control flow information, data information and function information of the smart contract; this information constitutes the key features related to security vulnerabilities.

In a preferred embodiment, an abstract fact is structured as follows:

predicate(arg1, …, argn)

Here, predicate is the predicate name defined according to the Solidity structure, and arg1, …, argn are the further parameters related to the content of the specific Solidity program statement.

In the preferred embodiment, there are four kinds of predicate name: data type, function type, expression structure and control flow structure. The predicate names and parameters are defined as follows:

Traverse all nodes of the AST of the Solidity source program. For a data-type node Variable, the predicate name is VarDecl; for a function-type node Function, the predicate name is FunDecl; for a function call node Functioncall in an expression structure, the predicate name is FunCall. Calls to special functions, which comprise the address-related functions call, callcode, delegatecall, send and transfer and the error-handling functions revert, assert and require, use the original name of the special function as the predicate name. The parameters are the statement number, the variable number and the parameters of all leaf nodes corresponding to the node.

Traverse the control flow graph of the Solidity source program. For an assignment operation Assign between variables, the predicate name is VarAss, and its parameters are the corresponding statement number StmtId and the variable numbers VarId involved. For statements in the same basic block, the predicate name is Block, and its parameters are the basic block number BlockId and the statement number StmtId. When a path exists between basic blocks, the predicate name is BlockPath, and its parameters are the corresponding basic block numbers BlockId.

For example, the abstract facts extracted by traversing the graph structure generated by the example code in step 1.3.2 are shown below:

VarDecl(StmtId＝'S00',VarId＝'V00',variabletype＝'uint',identifier＝'storedData')

Block(BlockId＝'B00',StmtId＝'S00')

FunDecl(identifier＝'set',VarId＝'V01',qualifier＝'public')

VarDecl(VarId＝'V01',variabletype＝'uint',identifier＝'x')

Block(BlockId＝'B01',StmtId＝'S01')

VarAss(StmtId＝'S01',VarId＝'V00',VarId＝'V01')
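How facts of this shape might be emitted while traversing the graph structure can be sketched in Python. The node dictionaries, the `kind` field, the mapping `PREDICATE_BY_KIND` and the functions `fact` and `extract_facts` are hypothetical; the sketch only shows keyword matching from node kind to predicate name and uses ASCII `=` in its output.

```python
def fact(predicate: str, **args) -> str:
    """Format one Datalog-style fact as predicate(name='value',...)."""
    body = ",".join(f"{name}='{value}'" for name, value in args.items())
    return f"{predicate}({body})"


# Hypothetical AST nodes, reduced to the fields needed for fact extraction.
ast_nodes = [
    {"kind": "Variable", "StmtId": "S00", "VarId": "V00",
     "variabletype": "uint", "identifier": "storedData"},
    {"kind": "Function", "identifier": "set", "VarId": "V01", "qualifier": "public"},
]

# Keyword matching: the node kind selects the predicate name.
PREDICATE_BY_KIND = {
    "Variable": "VarDecl",
    "Function": "FunDecl",
    "Functioncall": "FunCall",
}


def extract_facts(nodes):
    """Emit one fact per node, using every field except the node kind."""
    facts = []
    for node in nodes:
        args = {k: v for k, v in node.items() if k != "kind"}
        facts.append(fact(PREDICATE_BY_KIND[node["kind"]], **args))
    return facts


facts = extract_facts(ast_nodes)
```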

Step 3: building, according to the abstract facts of the Solidity source program obtained in step 2, a deep learning model for classifying vulnerabilities in the Solidity source program.

In a preferred embodiment, the deep learning model is designed on the basis of the Transformer architecture and, as shown in FIG. 3, comprises four modules: an input module, an attention module, a residual connection module and an output module. The model is built as follows:

Step 3.1: building an input module: the abstract facts obtained in step 2 undergo vectorization preprocessing and are represented by a 0-1 coding matrix X. Because X is too sparse, dimension reduction is applied, namely word embedding and position embedding of the abstract facts represented by X; the matrix obtained after dimension reduction is the input required by the attention module. Specifically:

Step 3.1.1: performing word embedding on the abstract facts represented by the 0-1 coding matrix X according to formula (1) to obtain the word matrix X':

$$X'_{l \times d} = \tanh(X_{l \times v} W_1) \tag{1}$$

where $W_1$ is the parameter matrix to be trained in the input module; l is the number of lines of the longest set of abstract facts among those corresponding to the different Solidity source programs; v is the vocabulary size of the abstract facts; and d is the word dimension after dimension reduction.

Step 3.1.2: performing position embedding on the abstract facts represented by the 0-1 coding matrix X.

To ensure that the deep learning model can better capture the position information of the abstract facts, the input module introduces a position coding mechanism for the abstract facts, namely position embedding.

In a preferred embodiment, the position information of each statement in the abstract facts is represented by a matrix P, and an activation function is applied to P according to formula (2) to obtain the position coding matrix P':

$$P'_{l \times d} = \tanh(P_{l \times d}) \tag{2}$$

The matrix P is randomly initialized before training; after training, the position coding matrix P', formed by the position vector of each position, is obtained.

Step 3.1.3: for the abstract facts of a smart contract, the position coding matrix P' is spliced with the word matrix X' according to formula (3) to obtain the E matrix, which serves as the input of the attention module.
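The input module can be sketched in plain Python on toy sizes. The weights here are random placeholders rather than trained parameters, and the helper names are hypothetical. Because formula (3) is not reproduced in this text, the sketch assumes that "splicing" means column-wise concatenation of X' and P', which yields an l × 2d matrix; element-wise addition would be another possible reading.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def tanh_matrix(m):
    """Apply tanh element-wise."""
    return [[math.tanh(x) for x in row] for row in m]


def one_hot(indices, vocab_size):
    """Build the 0-1 coding matrix: one row per word index."""
    return [[1.0 if j == i else 0.0 for j in range(vocab_size)] for i in indices]


random.seed(0)
l, v, d = 3, 5, 2                      # toy sizes: 3 lines, vocabulary of 5, dimension 2
X = one_hot([0, 3, 1], v)              # 0-1 coding matrix X (l x v)
W1 = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(v)]  # placeholder for trained W1
P = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(l)]   # randomly initialized positions

X_prime = tanh_matrix(matmul(X, W1))   # formula (1): word matrix X' (l x d)
P_prime = tanh_matrix(P)               # formula (2): position coding matrix P' (l x d)

# Assumption: "splicing" is column-wise concatenation, giving an l x 2d matrix E.
E = [x_row + p_row for x_row, p_row in zip(X_prime, P_prime)]
```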

Step 3.2: building an attention module; a schematic diagram of the attention module is shown in FIG. 4.

The attention module is the core of the deep learning model. Through its attention mechanism, the attention coefficients between the words of the abstract facts can be calculated, so that the vector of each word of each abstract fact incorporates information from the vectors of the other words and the key information in the abstract facts is captured better. The principle of the attention mechanism is that the attention coefficients between each word and the other words in the abstract facts are obtained by matrix multiplication.

In a preferred embodiment, the specific steps for building the attention module are as follows:

Step 3.2.1: calculating the attention coefficients between the words of the abstract facts to obtain the attention coefficient matrix of the abstract facts.

The attention coefficients in the preferred embodiment are calculated in a manner similar to BERT, involving three matrices: the Q matrix, the K matrix and the V matrix. The Q matrix is the Query matrix of the abstract facts and consists of the Query vectors corresponding to each word of each abstract fact; the K matrix is the Key matrix of the abstract facts and consists of the Key vectors corresponding to each word of each abstract fact; the V matrix is the Value matrix of the abstract facts and consists of the Value vectors corresponding to each word of each abstract fact. The three matrices take random values in the initial state and are each obtained from the E matrix through one of three linear transformations; after training, the values of Q, K and V carry representational meaning.

The attention coefficient matrix of the abstract facts is obtained according to formula (4):

$$A = QK^{T} \tag{4}$$

Step 3.2.2: updating the element values of the V matrix according to the attention coefficient matrix A of the abstract facts to obtain the updated V matrix V'.

In a preferred embodiment, after the attention coefficient matrix A is obtained, the element values of the V matrix are updated according to formula (5) to obtain the updated V matrix V':

$$V' = \mathrm{softmax}\left(\frac{A}{\sqrt{d_k}}\right)V \tag{5}$$

where $\sqrt{d_k}$ denotes the arithmetic square root of the dimension $d_k$ of the K matrix. Dividing by it rescales the values that the matrix product in formula (5) has inflated back to their original magnitude, which reduces the jitter of the gradient update values during back-propagation. Softmax is an activation function; applying it adds a nonlinear transformation that enhances the representational power of the V' matrix.
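Steps 3.2.1 and 3.2.2 can be sketched in plain Python, assuming the standard scaled dot-product form V' = softmax(QK^T / √d_k)·V, which matches the definitions of A, d_k and softmax given here. The weight matrices `Wq`, `Wk` and `Wv` and all function names are hypothetical, and identity matrices stand in for trained weights.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Apply softmax to every row of a matrix."""
    out = []
    for row in m:
        peak = max(row)                          # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def attention(E, Wq, Wk, Wv):
    Q, K, V = matmul(E, Wq), matmul(E, Wk), matmul(E, Wv)  # three linear transformations of E
    A = matmul(Q, transpose(K))                             # formula (4): A = Q K^T
    d_k = len(K[0])                                         # dimension of the Key vectors
    scaled = [[a / math.sqrt(d_k) for a in row] for row in A]
    weights = softmax_rows(scaled)
    return matmul(weights, V), weights                      # V' and the attention weights


E = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]     # toy input: 3 words, dimension 2
identity = [[1.0, 0.0], [0.0, 1.0]]          # placeholder weights instead of trained ones
V_prime, weights = attention(E, identity, identity, identity)
```

Each row of `weights` sums to 1, so every row of `V_prime` is a weighted average of the Value vectors of all words.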

The layer normalization mechanism considers the inputs of all dimensions of the matrix V', calculates their mean and variance, and then transforms the input of every dimension with the same normalization operation. The mean of all elements of the V' matrix is given by formula (6):

$$\mu^{(v)} = \frac{1}{n^{(v)}} \sum_{i=1}^{n^{(v)}} v_i \tag{6}$$

The variance of all elements of the V' matrix is given by formula (7):

$$\left(\sigma^{(v)}\right)^2 = \frac{1}{n^{(v)}} \sum_{i=1}^{n^{(v)}} \left(v_i - \mu^{(v)}\right)^2 \tag{7}$$

where $n^{(v)}$ is the number of elements in V', $\mu^{(v)}$ is the mean, $(\sigma^{(v)})^2$ is the variance and $\sigma^{(v)}$ is the standard deviation. Each element $v_i$ of the matrix V' is normalized according to formula (8):

$$v_i' = \frac{v_i - \mu^{(v)}}{\sigma^{(v)}} \tag{8}$$

where $v_i'$ is the normalized value of the element $v_i$ of the matrix V'.
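The normalization can be sketched in Python under the assumption that it has the plain form v_i' = (v_i − μ) / σ, computed over all elements of V'. The function name `layer_normalize` is hypothetical. Practical implementations usually add a small constant to σ to avoid division by zero when all elements are equal; that refinement is omitted here.

```python
import math


def layer_normalize(V_prime):
    """Normalize all elements of V' to zero mean and unit variance."""
    elements = [v for row in V_prime for v in row]
    n = len(elements)
    mean = sum(elements) / n
    variance = sum((v - mean) ** 2 for v in elements) / n
    std = math.sqrt(variance)
    return [[(v - mean) / std for v in row] for row in V_prime]


V_prime = [[1.0, 2.0], [3.0, 4.0]]
V_norm = layer_normalize(V_prime)
```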

Step 3.3: building a residual connection module.

Because the vocabulary of the source input (the abstract facts) of the deep learning model in the preferred embodiment is small, the attention module may over-capture the relations between words; adding the residual connection module mitigates this problem to some extent.

In a preferred embodiment, the matrix calculation formula of the residual connection module is as follows:

$$Z = H(E) = E + F(E) = E + V'' \tag{9}$$

where the matrix E is the input of the attention module and V″ is the output of the attention module; adding these two matrices yields the output Z of the residual connection module. F is the residual function; in the attention module, the mapping H(E) → Z is obtained by back-propagation, and if there is no residual connection module, F(E) → 0.

Step 3.4: building an output module.

The output module outputs the probability of each possible vulnerability for the abstract facts and, through the loss function, maximizes the security vulnerability detection capability of the deep learning model.

In a preferred embodiment, the specific steps for building the output module are as follows:

Step 3.4.1: defining the vulnerability class output formula shown in formula (10), which outputs the vulnerability class result for the abstract facts of the smart contract.

$$P_k = \mathrm{softmax}(\mathrm{Linear}(Z)) \tag{10}$$

where Linear denotes a linear function, that is, one linear transformation of the matrix Z; softmax is the activation function; and $P_k$ is the probability value of each vulnerability type.

Step 3.4.2: constructing the loss function of the deep learning model, through which the model acquires the ability to classify vulnerabilities; the loss function is the multi-class cross-entropy loss function shown in formula (11).

$$Loss_1 = -\sum_k y_k \log(P_k) \tag{11}$$
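Formulas (10) and (11) can be sketched in plain Python. How the matrix Z is reduced to a single feature vector before the linear layer is not specified in this text, so the sketch simply assumes an already pooled or flattened vector `z`. The zero weights are placeholders chosen so that the result is easy to check by hand, and the four classes are an assumption for illustration.

```python
import math


def softmax(logits):
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(z, weights, bias):
    """Linear transformation of a feature vector z: one output per class."""
    return [sum(w * x for w, x in zip(row, z)) + b for row, b in zip(weights, bias)]


def cross_entropy(y, p):
    """Formula (11): Loss = -sum_k y_k * log(P_k) for a one-hot label y."""
    return -sum(yk * math.log(pk) for yk, pk in zip(y, p) if yk > 0)


z = [0.5, -1.0, 2.0]                         # assumed: Z pooled or flattened into one vector
weights = [[0.0, 0.0, 0.0] for _ in range(4)]  # 4 classes; zero weights give uniform probabilities
bias = [0.0] * 4
P = softmax(linear(z, weights, bias))        # formula (10)
y = [0, 1, 0, 0]                             # one-hot label of the true class
loss = cross_entropy(y, P)
```

With uniform probabilities over four classes the loss equals log 4, the value an untrained classifier would be expected to start near.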

Step 4: constructing a training data set for the deep learning model.

The vulnerability detection problem can be regarded as a multi-class classification problem in machine learning. Since classification is a form of supervised learning, both data (Solidity programs) and labels for the data (vulnerability types) are required. Constructing the training data set for the deep learning model therefore consists of acquiring the data and labeling the data with its type.

In the preferred embodiment, a total of 1500 program files of real Ethereum smart contracts are first collected. These 1500 program files are then manually labeled according to the SWC Registry definitions of smart contract vulnerabilities to construct the training data set of the deep learning model. SWC Registry is currently the mainstream standard library for annotating smart contract vulnerabilities. It is maintained by Ethereum security practitioners and developers in the Smart Contract Security organization. The registry provides a classification of security vulnerabilities, test cases for some Ethereum smart contracts, and the consequences of the vulnerabilities. The number and proportion of vulnerabilities of each class in the training data set are shown in Table 2.

Table 2. Vulnerability counts and proportions

Vulnerability class	Quantity	Proportion
Reentrancy vulnerability	1014	67.6%
Timestamp dependency vulnerability	715	46.7%
Infinite loop vulnerability	326	21.7%
No vulnerability	293	19.5%

Step 5: training the deep learning model by using the training data set.

In a preferred embodiment, training of the deep learning model is split into two stages. The first stage is pre-training, whose aim is to make the value of the loss function of the deep learning model drop rapidly. The second stage is fine-tuning training (Finetune Train), which further improves the security detection capability of the deep learning model. Combining pre-training with fine-tuning training gives the deep learning model better robustness and extensibility.

In a preferred embodiment, pre-training and fine-tuning training of the deep learning model are performed on a Jupyter Notebook platform with GPU resources. During pre-training, the batch size is set to 16, the number of epochs to 80, and the optimizer to Adam; pre-training stops when the loss value stabilizes at 1, and fine-tuning training begins. During fine-tuning training, the batch size is set to 4, the number of epochs to 20, and the optimizer to SGD; fine-tuning training stops when the loss value stabilizes at 0.1. After pre-training and fine-tuning training, the deep learning model is able to classify vulnerabilities in smart contracts.
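The two-stage schedule can be summarized as a configuration sketch. The batch sizes, epoch counts, optimizers, and loss targets come from the embodiment above; everything else is an assumption made for illustration, in particular the names `StageConfig` and `should_stop`, the `window` and `tolerance` parameters, and the reading of "the loss value stabilizes" as "the last few epoch losses vary little and sit at or near the target".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageConfig:
    name: str
    batch_size: int
    max_epochs: int
    optimizer: str
    loss_target: float


# Values taken from the embodiment; the structure itself is hypothetical.
PRETRAIN = StageConfig("pretrain", batch_size=16, max_epochs=80,
                       optimizer="Adam", loss_target=1.0)
FINETUNE = StageConfig("finetune", batch_size=4, max_epochs=20,
                       optimizer="SGD", loss_target=0.1)


def should_stop(loss_history, cfg, window=3, tolerance=0.05):
    """Assumed stopping rule: stop when the epoch budget is used up, or when
    the last `window` losses differ by less than `tolerance` and the latest
    loss is within `tolerance` of the stage's target (or below it)."""
    if len(loss_history) >= cfg.max_epochs:
        return True
    if len(loss_history) < window:
        return False
    recent = loss_history[-window:]
    stable = max(recent) - min(recent) < tolerance
    return stable and recent[-1] <= cfg.loss_target + tolerance
```

A training loop would append the loss after each epoch and move from `PRETRAIN` to `FINETUNE` once `should_stop` returns true for the first stage.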

Step 6: performing vulnerability detection on the input smart contract by using the trained deep learning model, and outputting the security detection result of the smart contract Solidity source program;

Vulnerability detection is performed on the smart contract by using the trained deep learning model, and the probability value of each vulnerability type is output as the result. If an output probability value is greater than or equal to 0.5, the smart contract is considered to have that vulnerability; if it is less than 0.5, the smart contract is considered not to have it. The method can thus detect the security of smart contracts effectively and automatically.
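The decision rule of step 6 can be sketched as follows. The 0.5 threshold comes from the text; the function name `detect`, the dictionary input format, and the example probabilities are hypothetical.

```python
THRESHOLD = 0.5  # decision threshold stated in step 6


def detect(probabilities, threshold=THRESHOLD):
    """Return the vulnerability types whose probability is >= threshold.

    probabilities -- mapping from vulnerability type to model output P_k
    """
    return [name for name, p in probabilities.items() if p >= threshold]


# Hypothetical model output for one contract (not measured data).
report = detect({
    "reentrancy": 0.82,
    "timestamp dependency": 0.10,
    "infinite loop": 0.05,
})
```

An empty list means that no vulnerability type reached the threshold for the contract.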

It should be apparent that the above-described embodiments are merely some, but not all, embodiments of the present invention. The above examples are only for explaining the present invention and do not limit the scope of the present invention. Based on the above embodiments, all other embodiments, i.e. all modifications, equivalents and improvements made within the spirit and principles of the present application, which are obtained by persons skilled in the art without making creative efforts are within the scope of the present invention claimed.

Claims

step 4: constructing a training data set of the deep learning model;

step 5: training the deep learning model by utilizing the training data set;

the step 3 specifically comprises the following steps:

A = QK^T (4)

wherein d_k represents the arithmetic sum of squares of the K matrix; the softmax function is the activation function;

Z = H(E) = E + F(E) = E + V″ (9)

P_k = softmax(Linear(Z)) (10)

step 3.4.2: constructing a loss function of the deep learning model, so that the model has vulnerability classification capability;

the loss function is a multi-class cross entropy loss function shown in formula (11):

Loss₁ = -∑_k y_k log(P_k) (11)

2. The intelligent contract security detection method based on static analysis and deep learning as claimed in claim 1, wherein the step 1 specifically includes the following steps:

3. The intelligent contract security detection method based on static analysis and deep learning as claimed in claim 2, wherein the step 1.3 specifically includes the following steps:

4. The intelligent contract security detection method based on static analysis and deep learning as claimed in claim 3, wherein the word attribute categories include keyword <keyword>, visibility qualifier <qualifier>, variable data type <variable>, identifier <operator> and constant.

5. The intelligent contract security detection method based on static analysis and deep learning of claim 3, wherein the predefined grammar rules are as follows:

a) Contract ::= "contract" <identifier> "{" [contractBlock] "}";

b) ContractBlock ::= [Function] | [Variable];

f) Functioncall ::= Expression "(" Variable ")";

g) Block ::= "{" Statement "}";

i) IfStatement ::= "if" "(" Expression ")" Block ["else" Block];

j) WhileStatment ::= "while" "(" Expression ")" Block;

l) Return ::= "return" [Expression].

6. The intelligent contract security detection method based on static analysis and deep learning as claimed in claim 2, wherein the step 1.4 specifically includes the following steps:

7. The intelligent contract security detection method based on static analysis and deep learning as claimed in claim 1, wherein the abstract facts include all control flow information, data information and function information of the smart contract and are written in the Datalog language, and the abstract facts have the following structural form:

8. The intelligent contract security detection method based on static analysis and deep learning according to claim 1 or 7, wherein the method for extracting abstract facts from the graph structure of the Solidity source program obtained in step 1 is as follows: traversing the graph structure of the Solidity source program, and extracting the abstract facts of the Solidity source program by keyword matching.