CN115659176A - Training method of intelligent contract vulnerability detection model and related equipment - Google Patents

Publication number
CN115659176A
Authority
CN
China
Prior art keywords
vulnerability detection
intelligent contract
graph structure
code
detection model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211260813.5A
Other languages
Chinese (zh)
Inventor
胡军
沙金
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University filed Critical Hunan University
Priority to CN202211260813.5A priority Critical patent/CN115659176A/en
Publication of CN115659176A publication Critical patent/CN115659176A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a training method of an intelligent contract vulnerability detection model and related equipment, and the accuracy of model detection is improved. The method comprises the following steps: determining a vulnerability detection result corresponding to the intelligent contract source code data set; constructing a code attribute graph structure corresponding to each group of intelligent contract source codes in the intelligent contract source code data set; performing data enhancement on the code attribute graph structure; inputting the code attribute graph structure subjected to data enhancement and the vulnerability detection result into an initial vulnerability detection model to obtain an output result; adjusting a loss function of the initial vulnerability detection model according to the output result and the vulnerability detection result; and determining the initial vulnerability detection model reaching the preset iteration condition as the intelligent contract vulnerability detection model.

Description

Training method of intelligent contract vulnerability detection model and related equipment
[ technical field ]
The invention belongs to the field of blockchain, and particularly relates to a training method of an intelligent contract vulnerability detection model and related equipment.
[ background of the invention ]
Blockchain is a distributed data management technology that achieves decentralization based on data encryption, timestamps and a distributed consensus mechanism, and it is traceable, tamper-resistant and highly available. An intelligent contract (smart contract) is a piece of code that runs on a blockchain; the logic of the code defines the content of the contract. Compared with a traditional contract, an intelligent contract gains several advantages from the blockchain platform: first, its execution does not depend on third parties, but is automated and decentralized; second, the contract itself cannot be tampered with; third, the contract is stored on the blockchain platform, every blockchain node keeps a copy of it, and it is visible to everyone, which guarantees the transparency of contract execution. With the development of blockchain technology, more and more developers have noticed these advantages, and intelligent contract technology has been applied in many fields, including finance, artwork trading and venture capital.
Intelligent contracts are becoming more prevalent as more and more problems are given decentralized solutions through blockchain technology, and billions of dollars are currently exchanged through this technology every day. However, because a deployed contract is irreversible and tamper-proof, a contract that contains a vulnerability and is attacked after deployment can cause huge economic losses.
Current intelligent contract vulnerability detection methods mainly revolve around formal verification, symbolic execution and static analysis. For example, Oyente, Manticore and Mythril are all intelligent contract vulnerability detection tools developed on the idea of symbolic execution. These three kinds of methods rely on expert-defined rules or patterns to detect vulnerabilities. First, manually defined rules are error-prone; with expert rules alone, complex vulnerabilities or patterns are easily missed (false negatives) or misreported (false positives), and an attacker can easily bypass the rules to carry out an attack. Second, as the number of intelligent contracts increases, the types of vulnerabilities also increase, and a small number of experts cannot screen all vulnerabilities in order to design correct rules.
Recently, deep learning methods have been applied to intelligent contract vulnerability detection, for example by modeling the source code as a control flow graph, or by processing the source code sequentially with an LSTM. However, these methods treat the source code or operation codes as a text sequence rather than as semantic blocks, or fail to highlight the key variables in the code, so the syntactic and semantic information they capture is insufficient. That is, the existing methods capture only a shallow text structure, so the invention applies a graph structure to better express syntactic and semantic information.
[ summary of the invention ]
The invention provides a training method of an intelligent contract vulnerability detection model and related equipment, which can improve the accuracy of model detection.
A first aspect of the present invention provides a training method of an intelligent contract vulnerability detection model, which comprises the following steps:
determining a vulnerability detection result corresponding to the intelligent contract source code data set;
constructing a code attribute graph structure corresponding to each group of intelligent contract source codes in the intelligent contract source code data set;
performing data enhancement on the code attribute graph structure;
inputting the code attribute graph structure subjected to data enhancement and the vulnerability detection result into an initial vulnerability detection model to obtain an output result;
adjusting a loss function of the initial vulnerability detection model according to the output result and the vulnerability detection result;
and determining the initial vulnerability detection model reaching the preset iteration condition as the intelligent contract vulnerability detection model.
A second aspect of the present invention provides a training apparatus for an intelligent contract vulnerability detection model, including:
the determining unit is used for determining a vulnerability detection result corresponding to the intelligent contract source code data set;
the construction unit is used for constructing a code attribute graph structure corresponding to each group of intelligent contract source codes in the intelligent contract source code data set;
the data enhancement unit is used for performing data enhancement on the code attribute graph structure;
the model training unit is used for inputting the code attribute graph structure subjected to data enhancement and the vulnerability detection result into an initial vulnerability detection model to obtain an output result;
the adjusting unit is used for adjusting a loss function of the initial vulnerability detection model according to the output result and the vulnerability detection result;
and the model determining unit is used for determining the initial vulnerability detection model reaching the preset iteration condition as the intelligent contract vulnerability detection model.
A third aspect of the present invention provides a computer device comprising at least one connected processor, a memory and a transceiver, wherein the memory is used for storing program codes, and the processor is used for calling the program codes in the memory to execute the steps of the training method of the intelligent contract vulnerability detection model according to the first aspect.
A fourth aspect of the present invention provides a computer storage medium comprising instructions which, when executed on a computer, cause the computer to perform the steps of the method for training an intelligent contract vulnerability detection model according to any of the above aspects.
Compared with the related art, the embodiments provided by the invention extract the syntactic and semantic information of the intelligent contract source code better by introducing a graph structure. By introducing the idea of contrastive learning and adopting methods such as data enhancement and pulling positive samples closer together, the accuracy of model detection is improved, and by adopting a deep learning method, the degree of automation of intelligent contract detection is improved.
[ description of the drawings ]
FIG. 1 is a schematic flow chart of a training method of an intelligent contract vulnerability detection model according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a constructed code attribute graph structure according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a virtual structure of a training apparatus of an intelligent contract vulnerability detection model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a hardware structure of a server according to an embodiment of the present invention.
[ detailed description ]
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
The following describes a training method of the intelligent contract vulnerability detection model from the perspective of a training device of the intelligent contract vulnerability detection model, where the training device of the intelligent contract vulnerability detection model may be a server or a service unit in the server, and is not particularly limited.
Referring to fig. 1 in combination, fig. 1 is a schematic flowchart of a training method of an intelligent contract vulnerability detection model according to an embodiment of the present invention, including:
101. Determine a vulnerability detection result corresponding to the intelligent contract source code data set.
In this embodiment, the training device of the intelligent contract vulnerability detection model may determine a vulnerability detection result corresponding to an intelligent contract source code data set. Specifically, the training device acquires intelligent contract source code from the Ethereum and VNT Chain platforms to form the data set, and then inputs each acquired contract into the Mythril and Oyente intelligent contract vulnerability detection tools for separate detection. A contract in which a vulnerability is detected is marked "1" and a contract without a vulnerability is marked "0"; contracts on which the two tools disagree are audited manually to ensure that the vulnerability labels are accurate.
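The labeling rule in step 101 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `label_contract`, the boolean verdict inputs and the sample file names are hypothetical, and it assumes that a label is assigned automatically only when the two tools agree.

```python
def label_contract(mythril_flags_vuln: bool, oyente_flags_vuln: bool):
    """Return (label, needs_manual_audit) for one contract.

    label is 1 when both tools report a vulnerability, 0 when neither does,
    and None when the tools disagree and the contract must be audited by hand.
    """
    if mythril_flags_vuln == oyente_flags_vuln:
        return (1 if mythril_flags_vuln else 0), False
    return None, True


# Hypothetical tool verdicts for three contracts: (Mythril, Oyente)
verdicts = {
    "A.sol": (True, True),
    "B.sol": (False, False),
    "C.sol": (True, False),
}
labels = {name: label_contract(m, o) for name, (m, o) in verdicts.items()}
```

Contracts that come back with `needs_manual_audit` set to True would be routed to a human reviewer before entering the training set.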
102. Construct a code attribute graph structure corresponding to each group of intelligent contract source codes in the intelligent contract source code data set.
In this embodiment, the training device of the intelligent contract vulnerability detection model may construct a code attribute graph structure corresponding to each group of intelligent contract source codes in the intelligent contract source code data set. The code attribute graph structure comprises nodes and edges. The training device can construct the nodes of a target code attribute graph structure corresponding to a target intelligent contract source code according to the differing importance of the program elements in the target intelligent contract source code, where the target intelligent contract source code is any one intelligent contract source code in the intelligent contract source code data set, and the nodes of the target code attribute graph structure comprise main nodes, common nodes and fallback nodes. It then constructs the edges of the target code attribute graph structure according to the relationships among the nodes, where the edges comprise control flow edges, data flow edges, forward edges and rollback edges. The construction of the code attribute graph structure is described in detail below:
First, the nodes of the code attribute graph structure are constructed. According to the differing importance of the program elements in the intelligent contract source code, the program elements are divided into three types of nodes, namely main nodes, common nodes and fallback nodes:
The main nodes represent critical calls and important variables, and the main functions are built-in functions. The invention mainly targets reentrancy vulnerabilities and infinite loop vulnerabilities in intelligent contract source code. For a reentrancy vulnerability, the main nodes are the withdraw function (the withdraw function is responsible for transferring funds and is the key to detecting a reentrancy vulnerability, because it may be called illegally and repeatedly when the vulnerability is exploited), the variables corresponding to user balances, and the variables that can directly affect user balances. For an infinite loop vulnerability, the main nodes are mainly all loop statements (e.g., for and while), loop condition variables, and self-calls.
The common nodes are calls and variables that play an auxiliary role in detecting a vulnerability; in short, calls and variables that are not main nodes are constructed as common nodes. For example, variables indirectly related to an infinite loop are common nodes.
The fallback node represents the fallback function of a virtual attack contract, and it can interact with the function under test. The fallback function is a special design in intelligent contracts and is the cause of many security vulnerabilities.
Next, the edges of the code attribute graph structure are constructed to model the relationships between the nodes. The edges comprise control flow edges, data flow edges, forward edges and rollback edges. The feature of an edge is extracted as a tuple (Vs, Ve, O, T), where Vs represents the start node of the edge, Ve represents the end node of the edge, O represents the execution order of the edge, and T represents the type of the edge. A control flow edge captures the control semantics of the code, mainly conditional statements and security handling statements such as if, for, assert and require. A data flow edge tracks the use of variables and involves variable access and modification. A forward edge represents the natural execution order from one node to the next. Rollback edges model the fallback mechanism of intelligent contracts and come in two kinds: the first calls the fallback node, and the second points from the fallback node to the function under test.
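The node types and the edge tuple (Vs, Ve, O, T) described above could be represented with a small data structure such as the following. This is only a sketch: the class names, field names and the three sample edges are hypothetical and are not taken from the patent or from FIG. 2.

```python
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    MAIN = "main"
    COMMON = "common"
    FALLBACK = "fallback"


class EdgeType(Enum):
    CONTROL_FLOW = "control_flow"
    DATA_FLOW = "data_flow"
    FORWARD = "forward"
    ROLLBACK = "rollback"


@dataclass(frozen=True)
class Node:
    name: str
    kind: NodeKind


@dataclass(frozen=True)
class Edge:
    vs: str          # start node of the edge (Vs)
    ve: str          # end node of the edge (Ve)
    order: int       # execution order of the edge (O)
    etype: EdgeType  # type of the edge (T)

    def as_tuple(self):
        """Return the edge feature as the tuple (Vs, Ve, O, T)."""
        return (self.vs, self.ve, self.order, self.etype.value)


# Illustrative graph fragment with made-up names and ordering.
nodes = [
    Node("withdraw", NodeKind.MAIN),
    Node("amount", NodeKind.COMMON),
    Node("fallback", NodeKind.FALLBACK),
]
edges = [
    Edge("withdraw", "amount", 1, EdgeType.CONTROL_FLOW),
    Edge("withdraw", "fallback", 2, EdgeType.ROLLBACK),  # first kind: calls the fallback node
    Edge("fallback", "withdraw", 3, EdgeType.ROLLBACK),  # second kind: points back to the tested function
]
edge_features = [e.as_tuple() for e in edges]
```

Keeping the edge type as an enum makes it straightforward to encode T as a one-hot feature later, although the patent does not state how the tuple is encoded.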
Referring to fig. 2, fig. 2 is a schematic diagram of a constructed code attribute graph structure provided in an embodiment of the present invention. It shows a section of intelligent contract code extracted into a graph structure. The code contains a reentrancy vulnerability and, according to the relationships between the elements and calls in the code, is divided into three main nodes C1, C2 and C3, a common node N1, and a fallback node F. Three kinds of edges are shown in the figure: thick lines are control flow edges e1, e2, e4, e6, e7, e8, e9, thin lines are data flow edges e3, e4, e5, e8, and dotted lines are fallback edges e10, e11.
The function serves as a main node, C1. amount, the input parameter of the function, serves as a common node, N1; the parameter is checked by an if statement in the function body, which forms a control flow relationship.
In Credit[msg.sender] -= amount, the amount node accesses the data of the Credit node and modifies it, which forms a data flow relationship.
msg.sender.call.value(amount) may trigger a fallback function, so the value call node and the amount node form a fallback relationship; at the same time, because this code is called inside the withdraw function, the fallback function and the withdraw node also form a fallback relationship.
103. Perform data enhancement on the code attribute graph structure.
In this embodiment, after obtaining the code attribute graph structure corresponding to each group of intelligent contract source codes in the intelligent contract source code data set, the training device of the intelligent contract vulnerability detection model may perform data enhancement on the code attribute graph structure, that is, slightly transform the graph structure to obtain different representations of it. The invention uses two data enhancement methods: masking node features and masking the adjacency matrix. Masking node features removes some features of the nodes with a certain probability, and masking the adjacency matrix removes some edges between nodes with a certain probability. The details are as follows:
The features corresponding to the nodes of the code attribute graph structure are discarded according to a first preset probability to obtain a first enhanced sample. Masking node features means that, during training of the deep learning network, node features are temporarily discarded with a certain probability; that is, the node features are sampled to obtain the first enhanced sample.
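A node feature mask of this kind might look like the sketch below. It assumes that each feature entry is dropped independently and that a dropped entry is replaced by 0.0; the patent does not specify either detail, and the function name `mask_node_features` is hypothetical.

```python
import random


def mask_node_features(features, p, rng):
    """Return a copy of the node feature matrix with entries zeroed.

    features: list of per-node feature lists.
    p: probability of discarding each entry (the "first preset probability").
    rng: a random.Random instance, passed in so the result is reproducible.
    """
    return [[0.0 if rng.random() < p else x for x in row] for row in features]


features = [[0.2, 1.0, 0.5], [0.7, 0.1, 0.9]]
first_enhanced = mask_node_features(features, p=0.15, rng=random.Random(0))
```

The original matrix is left untouched, so the same graph can be masked again with a different random draw in the next training iteration.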
Entries of the adjacency matrix of the code attribute graph structure are discarded according to a second preset probability to obtain a second enhanced sample. Masking the adjacency matrix means that, during training of the deep learning network, edges between nodes are temporarily discarded with a certain probability. Specifically, a matrix A2 with the same number of rows and columns as the adjacency matrix A is generated with values between 0 and 1, and the probability is set to a default of P = 0.15. Values in A2 less than 0.15 are set to 0 and the remaining values are set to 1; multiplying the updated A2 element-wise with A then sets about 15 percent of the 1s in A to 0.
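The adjacency matrix mask can be sketched directly from this description. The sketch assumes that A2 is drawn uniformly at random, that values of A2 at or above P become 1, and that the multiplication is element-wise; the function name `mask_adjacency` is hypothetical.

```python
import random


def mask_adjacency(adj, p=0.15, rng=None):
    """Return a copy of adjacency matrix adj with edges dropped with probability p."""
    rng = rng or random.Random()
    # A2: random matrix with the same shape as adj, values in [0, 1)
    a2 = [[rng.random() for _ in row] for row in adj]
    # Threshold A2: values below p become 0, the rest become 1
    keep = [[0 if v < p else 1 for v in row] for row in a2]
    # Element-wise product removes the edges whose keep entry is 0
    return [[a * k for a, k in zip(arow, krow)] for arow, krow in zip(adj, keep)]


adj = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]
second_enhanced = mask_adjacency(adj, p=0.15, rng=random.Random(0))
```

Because each direction (i, j) is thresholded independently, a directed edge can be dropped without dropping its reverse, which suits graphs whose edges carry an execution order.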
It should be noted that, the vulnerability detection result corresponding to the intelligent contract source code data set may be determined through step 101, and data enhancement may be performed on the code attribute graph structure through steps 102 to 103, however, there is no restriction on the execution order between step 101 and steps 102 to 103, and step 101 may be executed first, or steps 102 to 103 may be executed first, or executed at the same time, which is not specifically limited.
104. Input the code attribute graph structure subjected to data enhancement and the vulnerability detection result into the initial vulnerability detection model to obtain an output result.
In this embodiment, the training apparatus of the intelligent contract vulnerability detection model inputs the data-enhanced code attribute graph structure and the vulnerability detection result into the initial vulnerability detection model to obtain an output result. That is, it inputs the data-enhanced samples (a sample is a data-enhanced code attribute graph structure, namely the first enhanced sample or the second enhanced sample) and the labels (a label is the vulnerability detection result corresponding to each enhanced sample) into the initial vulnerability detection model and obtains two output results, output1 and output2, which are different outputs for the same sample. For example, output1 may be [0.91, 0.09] and output2 may be [0.89, 0.11], both indicating a reentrancy vulnerability: output1 means that the probability that the code has a reentrancy vulnerability is 91% and the probability that it does not is 9%, and output2 means that the probability that the code has a reentrancy vulnerability is 89% and the probability that it does not is 11%.
105. Adjust the loss function of the initial vulnerability detection model according to the output result and the vulnerability detection result until a preset iteration termination condition is reached.
In this embodiment, after the training device of the intelligent contract vulnerability detection model obtains the first output result and the second output result, it evaluates their similarity through the loss function and adds a contrastive loss to reduce the distance between the first output result and the second output result. Two methods are provided here. The first reduces the loss by using a weighted average of the Euclidean distance and the Manhattan distance. The second pulls similar samples closer together in the dimensional space, that is, it makes the diagonal elements of the cross-correlation matrix of the first output result and the second output result as close to 1 as possible and the off-diagonal elements as close to 0 as possible, so that the two outputs are as equal as possible in the same dimension. The details are as follows:
1. Reducing the distance by minimizing a weighted average of the Euclidean distance and the Manhattan distance.
The training device of the intelligent contract vulnerability detection model adjusts the loss function of the initial vulnerability detection model according to the output result and the vulnerability detection result, which comprises the following steps:
calculating the Euclidean distance and the Manhattan distance between the first output result and the second output result;
and adjusting the loss function according to the Euclidean distance, the Manhattan distance and the vulnerability detection result.
The training device of the intelligent contract vulnerability detection model can calculate the Euclidean distance and the Manhattan distance between the first output result and the second output result, that is, obtain the Euclidean loss and the Manhattan loss, and then weight and scale the two losses to obtain the final loss. In other words, adjusting the loss function according to the Euclidean distance, the Manhattan distance and the vulnerability detection result serves to reduce the distance between the first output result and the second output result.
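As a worked illustration of the distance term, the sketch below combines the two distances for the example outputs [0.91, 0.09] and [0.89, 0.11] given earlier. The equal weights of 0.5 and the function names are assumptions; the patent does not state the weights, and the classification term computed against the label is not shown here.

```python
import math


def euclidean(u, v):
    """Euclidean (L2) distance between two output vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    """Manhattan (L1) distance between two output vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def distance_loss(out1, out2, w_euclid=0.5, w_manhattan=0.5):
    """Weighted combination of the two distances (weights are illustrative)."""
    return w_euclid * euclidean(out1, out2) + w_manhattan * manhattan(out1, out2)


output1 = [0.91, 0.09]
output2 = [0.89, 0.11]
loss = distance_loss(output1, output2)  # about 0.0341
```

For these two outputs the Euclidean distance is about 0.0283 and the Manhattan distance is 0.04, so the equally weighted loss is about 0.0341; driving it toward zero makes the two views of the same contract agree.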
2. Pulling similar samples closer together in the dimensional space.
The training device of the intelligent contract vulnerability detection model adjusts the loss function of the initial vulnerability detection model according to the output result and the vulnerability detection result, and comprises the following steps:
determining a cross-correlation matrix of the first output result and the second output result;
and adjusting the loss function based on the cross-correlation matrix and the vulnerability detection result.
That is, the two enhanced samples are input into the same graph convolutional neural network, the cross-correlation matrix of the two output results is calculated, and redundancy is reduced through a loss function: the cross-correlation matrix is made as close to the identity matrix as possible, that is, the diagonal elements are made as close to 1 as possible and the off-diagonal elements as close to 0 as possible. As a result, the same-dimension components of the two feature vectors extracted after the augmented versions of the same sample are input into the network are very similar, and the redundancy between different-dimension components is minimized. A loss function class, BarlowTwinLoss, is defined here; for a batch:
$$\mathcal{L}_{BT} = \sum_{i} \left(1 - c_{ii}\right)^{2} + \lambda \sum_{i} \sum_{j \neq i} c_{ij}^{2}$$
The first term in the above formula is the invariance term, the second term is the redundancy reduction term, and λ is a positive constant that weighs the importance of the first term against the second term in the loss function. C is the cross-correlation matrix (namely, a dimension similarity matrix), which can be regarded as the intersection of the first output result and the second output result. The invariance term drives the diagonal elements of C toward 1, so that the embeddings of different augmented versions of the same sample are invariant. The redundancy reduction term reduces redundancy by driving the off-diagonal elements toward 0. i and j are the feature dimensions of the network output,
$$c_{ij} = \sum_{b} z^{(1)}_{b,i}\, z^{(2)}_{b,j}$$
is the sum, over the samples b in the batch, of the products of the ith dimension of the feature vector of the first output result, $z^{(1)}_{b,i}$, and the jth dimension of the feature vector of the second output result, $z^{(2)}_{b,j}$; that is, $c_{ij}$ represents the similarity of dimension i and dimension j.
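The cross-correlation loss described above can be sketched in plain Python as follows. Several details are assumptions rather than statements from the patent: each feature dimension is mean-centered and scaled to unit norm over the batch so that a diagonal element can reach 1, the default λ of 0.005 is only illustrative, and the function names are hypothetical.

```python
import math


def _normalize_columns(z):
    """Mean-center each feature dimension over the batch and scale it to unit norm.

    z is a batch x dim list of lists; the result is dim x batch.
    """
    n, d = len(z), len(z[0])
    cols = []
    for k in range(d):
        col = [z[b][k] for b in range(n)]
        mean = sum(col) / n
        centered = [x - mean for x in col]
        norm = math.sqrt(sum(x * x for x in centered)) or 1.0  # avoid division by zero
        cols.append([x / norm for x in centered])
    return cols


def cross_correlation(z1, z2):
    """c[i][j]: sum over the batch of dimension i of z1 times dimension j of z2."""
    a, b = _normalize_columns(z1), _normalize_columns(z2)
    return [[sum(x * y for x, y in zip(ai, bj)) for bj in b] for ai in a]


def barlow_twins_loss(z1, z2, lam=0.005):
    """Invariance term plus lam times the redundancy reduction term."""
    c = cross_correlation(z1, z2)
    d = len(c)
    invariance = sum((1.0 - c[i][i]) ** 2 for i in range(d))
    redundancy = sum(c[i][j] ** 2 for i in range(d) for j in range(d) if i != j)
    return invariance + lam * redundancy


# Two identical views with uncorrelated dimensions give a loss of (almost) zero.
z = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
loss_same = barlow_twins_loss(z, z)
```

In a real training setup the same computation would be written with tensor operations so that gradients can flow back into the graph neural network; the list-based version is only meant to make the two terms explicit.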
It should be noted that the above describes reducing the loss either by using the Euclidean distance plus the Manhattan distance or by pulling similar samples closer together in the dimensional space; in actual use, both can be used together to reduce the loss.
In addition, the loss function reduces the distance between the first output result (output1) and the second output result (output2) as much as possible. output1 and output2 are positive samples. Taking a reentrancy vulnerability as an example, both outputs can detect the vulnerability (although the detection probability may not yet be high), that is, the two output results have an intersecting part; adding the contrastive loss makes this intersecting part larger and larger, which further increases the probability of detecting the vulnerability.
In one embodiment, the training device of the intelligent contract vulnerability detection model further performs the following steps:
constructing a to-be-detected code attribute graph structure corresponding to the to-be-detected intelligent contract source code;
carrying out data enhancement on the attribute graph structure of the code to be detected;
and inputting the code attribute graph structure to be detected after data enhancement into an intelligent contract vulnerability detection model so as to detect whether a vulnerability exists in the code attribute graph structure to be detected.
In this embodiment, after training of the intelligent contract vulnerability detection model is completed, the training device of the intelligent contract vulnerability detection model inputs the graph structure data of the intelligent contract test set into the trained graph neural network model that incorporates contrastive learning, and the model automatically determines whether the input intelligent contract contains a reentrancy vulnerability or an infinite loop vulnerability. Two groups of experiments were performed. The first group uses the node feature mask in the data enhancement stage and, in the stage where the contrastive loss is added to reduce the distance, reduces the loss by using both the Euclidean distance plus the Manhattan distance and the pulling of similar samples closer together in the dimensional space. The second group uses the adjacency matrix mask in the data enhancement stage and reduces the loss in the same two ways. The two groups of experiments are called NC-GCN and AM-GCN respectively. The invention is compared below with ten other methods, and the results are divided into three groups: the first group consists of automatic detection tools; the second group consists of the models obtained by the training method of the intelligent contract vulnerability detection model provided by the invention; and DR-GCN in the third group is a network model published at IJCAI in 2020. The experiments of the invention currently outperform the other ten methods. The specific detection results are shown in Table 1:
TABLE 1
Figure BDA0003890899500000091
In summary, in the embodiments provided by the present invention, the syntactic and semantic information of the intelligent contract source code is extracted better by introducing a graph structure. By introducing the idea of contrastive learning, methods such as data enhancement and pulling positive samples closer together are adopted to improve the accuracy of model detection. And the degree of automation of intelligent contract detection is improved by adopting a deep learning method.
The invention has been described above from the perspective of the training method of the intelligent contract vulnerability detection model; it is described below from the perspective of the training device of the intelligent contract vulnerability detection model.
Referring to fig. 3, fig. 3 is a schematic view of a virtual structure of a training apparatus of an intelligent contract vulnerability detection model according to an embodiment of the present invention, where the training apparatus 300 of the intelligent contract vulnerability detection model includes:
a determining unit 301, configured to determine a vulnerability detection result corresponding to an intelligent contract source code data set;
a constructing unit 302, configured to construct a code attribute graph structure corresponding to each group of intelligent contract source codes in the intelligent contract source code data set;
a data enhancement unit 303, configured to perform data enhancement on the code attribute graph structure;
the model training unit 304 is configured to input the code attribute graph structure subjected to data enhancement and the vulnerability detection result into an initial vulnerability detection model to obtain an output result;
an adjusting unit 305, configured to adjust a loss function of the initial vulnerability detection model according to the output result and the vulnerability detection result;
a model determining unit 306, configured to determine the initial vulnerability detection model reaching the preset iteration condition as the intelligent contract vulnerability detection model.
In one possible design, the data enhancement unit 303 is specifically configured to:
discarding the features corresponding to the nodes of the code attribute graph structure according to a first preset probability to obtain a first enhanced sample;
or,
discarding entries of the adjacency matrix of the code attribute graph structure according to a second preset probability to obtain a second enhanced sample.
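The two enhancement options above can be illustrated with a minimal sketch. It assumes that "discarding node features" means zeroing a node's whole feature vector and that "discarding the adjacency matrix" means removing individual edges; the text does not fix either detail. The function names, the probability value 0.2 and the toy graph are hypothetical.

```python
import random


def drop_node_features(features, p, rng):
    """Zero out each node's feature vector with probability p (assumed reading)."""
    out = []
    for row in features:
        if rng.random() < p:
            out.append([0.0] * len(row))  # this node's features are discarded
        else:
            out.append(list(row))         # this node's features are kept (copied)
    return out


def drop_edges(adjacency, p, rng):
    """Remove each existing edge independently with probability p (assumed reading)."""
    return [[0 if (v != 0 and rng.random() < p) else v for v in row]
            for row in adjacency]


# Hypothetical toy graph: 3 nodes, 2 features per node.
rng = random.Random(0)
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
adjacency = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

first_sample = drop_node_features(features, 0.2, rng)   # first preset probability
second_sample = drop_edges(adjacency, 0.2, rng)         # second preset probability
```

Both functions return new lists and leave the original graph untouched, so the same source graph can be enhanced repeatedly to produce different views.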
In one possible design, the output result includes a first output result and a second output result, and the model training unit 304 is specifically configured to:
calculating the Euclidean distance and the Manhattan distance between the first output result and the second output result;
and adjusting the loss function according to the Euclidean distance, the Manhattan distance and the vulnerability detection result.
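As an illustration of the distance terms, the sketch below reads them as the Euclidean (L2) and Manhattan (L1) distances between the two output vectors. How the distances enter the loss is not specified in the text; adding them to a supervised loss with a single weight is an assumption made only for this example, and `combined_loss` and `weight` are hypothetical names.

```python
import math


def euclidean_distance(a, b):
    """L2 distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """L1 distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def combined_loss(first_output, second_output, supervised_loss, weight=0.5):
    """Hypothetical combination: supervised loss plus a weighted pull term.

    The pull term shrinks as the outputs of the two enhanced samples
    (a positive pair) move closer together.
    """
    pull = (euclidean_distance(first_output, second_output)
            + manhattan_distance(first_output, second_output))
    return supervised_loss + weight * pull


# Example with made-up output vectors and a made-up supervised loss value.
loss = combined_loss([0.0, 0.0], [3.0, 4.0], supervised_loss=1.0, weight=0.5)
```

Minimizing the pull term is one way to realize the idea of drawing positive samples closer; the supervised term ties the representation to the vulnerability detection result.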
In one possible design, the model training unit 304 is further specifically configured to:
determining a cross-correlation matrix of the first output result and the second output result;
and adjusting the loss function based on the cross-correlation matrix and the vulnerability detection result.
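The cross-correlation design can be sketched as follows. The text does not give a formula; this example assumes a redundancy-reduction objective in the style of the Barlow Twins paper listed among the non-patent citations, where the cross-correlation matrix of the two outputs is pushed toward the identity matrix. The function names and the off-diagonal weight are hypothetical.

```python
import math


def standardize(batch):
    """Center each output dimension over the batch and scale it to unit norm."""
    n, d = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in batch]
    # A constant dimension has zero norm; fall back to 1.0 to avoid division by zero.
    norms = [math.sqrt(sum(row[j] ** 2 for row in centered)) or 1.0
             for j in range(d)]
    return [[row[j] / norms[j] for j in range(d)] for row in centered]


def cross_correlation(first_outputs, second_outputs):
    """d x d matrix: entry (i, j) correlates dimension i of view 1 with j of view 2."""
    a, b = standardize(first_outputs), standardize(second_outputs)
    d = len(a[0])
    return [[sum(ra[i] * rb[j] for ra, rb in zip(a, b)) for j in range(d)]
            for i in range(d)]


def redundancy_reduction_loss(c, off_diag_weight=0.005):
    """Penalize diagonal entries far from 1 and off-diagonal entries far from 0."""
    d = len(c)
    on_diag = sum((1.0 - c[i][i]) ** 2 for i in range(d))
    off_diag = sum(c[i][j] ** 2 for i in range(d) for j in range(d) if i != j)
    return on_diag + off_diag_weight * off_diag


# Made-up batch of 4 outputs with 2 dimensions, used for both views.
outputs = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
c = cross_correlation(outputs, outputs)
contrastive_term = redundancy_reduction_loss(c)
```

In this sketch the term would be added to a supervised loss computed from the vulnerability detection result; that combination is likewise an assumption.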
In one possible design, the building unit 302 is specifically configured to:
constructing nodes of a target code attribute graph structure corresponding to a target intelligent contract source code according to the differing degrees of importance of the elements in the target intelligent contract source code, wherein the target intelligent contract source code is any one intelligent contract source code in the intelligent contract source code data set, and the nodes of the target code attribute graph structure comprise at least one of a main node, a common node and a fallback node;
and constructing edges of the target code attribute graph structure according to the relationships among the nodes, wherein the edges of the target code attribute graph structure comprise at least one of control flow edges, data flow edges, forward edges or fallback edges.
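A minimal data-structure sketch of such a graph is given below, using the three node types and four edge types named above. The class and member names, the example node names and the choice of which source elements map to which node or edge type are all hypothetical; the text does not specify the mapping rules.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    MAIN = "main"
    COMMON = "common"
    FALLBACK = "fallback"


class EdgeKind(Enum):
    CONTROL_FLOW = "control_flow"
    DATA_FLOW = "data_flow"
    FORWARD = "forward"
    FALLBACK = "fallback"  # the edge type rendered as rollback/fallback in the text


@dataclass
class CodeAttributeGraph:
    nodes: list = field(default_factory=list)  # (name, NodeKind) pairs
    edges: list = field(default_factory=list)  # (source index, target index, EdgeKind)

    def add_node(self, name, kind):
        """Append a node and return its index."""
        self.nodes.append((name, kind))
        return len(self.nodes) - 1

    def add_edge(self, src, dst, kind):
        """Append a directed, typed edge between two node indices."""
        self.edges.append((src, dst, kind))

    def adjacency_matrix(self):
        """Untyped directed adjacency matrix, e.g. as input to data enhancement."""
        n = len(self.nodes)
        matrix = [[0] * n for _ in range(n)]
        for src, dst, _ in self.edges:
            matrix[src][dst] = 1
        return matrix


# Hypothetical graph for a contract with a withdraw function and a fallback function.
graph = CodeAttributeGraph()
withdraw = graph.add_node("withdraw", NodeKind.MAIN)
balance = graph.add_node("balance", NodeKind.COMMON)
fallback = graph.add_node("fallback", NodeKind.FALLBACK)
graph.add_edge(withdraw, balance, EdgeKind.DATA_FLOW)
graph.add_edge(withdraw, fallback, EdgeKind.FORWARD)
graph.add_edge(fallback, withdraw, EdgeKind.FALLBACK)
```

Keeping the edge type on each edge while exposing a plain adjacency matrix lets later steps choose between typed and untyped views of the same graph.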
In one possible design, the determining unit 301 is further configured to:
constructing a to-be-detected code attribute graph structure corresponding to the to-be-detected intelligent contract source code;
performing data enhancement on the to-be-detected code attribute graph structure;
and inputting the code attribute graph structure to be detected after data enhancement into the intelligent contract vulnerability detection model so as to detect whether a vulnerability exists in the code attribute graph structure to be detected.
Fig. 4 is a schematic structural diagram of the server of the present invention. As shown in fig. 4, the server 400 of this embodiment includes at least one processor 401, at least one network interface 404 or other user interface 403, a memory 405, and at least one communication bus 402. The server 400 optionally includes the user interface 403, which may comprise a display, a keyboard or a pointing device. The memory 405 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory. The memory 405 stores execution instructions; when the server 400 runs, the processor 401 communicates with the memory 405 and calls the instructions stored in the memory 405 to execute the training method of the intelligent contract vulnerability detection model. An operating system 406 contains various programs for implementing basic services and for handling hardware-dependent tasks.
The server provided by the embodiment of the invention can execute the technical scheme of the embodiment of the training method of the intelligent contract vulnerability detection model, the realization principle and the technical effect are similar, and the details are not repeated here.
An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a computer, it implements the method flow related to the training device of the intelligent contract vulnerability detection model in any of the above method embodiments. Correspondingly, the computer may be the training device of the intelligent contract vulnerability detection model.
An embodiment of the present invention further provides a computer program, or a computer program product including a computer program; when the computer program is executed on a computer, the computer implements the method flow related to the training device of the intelligent contract vulnerability detection model in any of the above method embodiments. Correspondingly, the computer may be the training device of the intelligent contract vulnerability detection model.
The above-described embodiment corresponding to fig. 1 may be implemented in whole or in part by software, hardware, firmware or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product.
The above-mentioned embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A training method of an intelligent contract vulnerability detection model is characterized by comprising the following steps:
determining a vulnerability detection result corresponding to an intelligent contract source code data set;
constructing a code attribute graph structure corresponding to each group of intelligent contract source codes in the intelligent contract source code data set;
performing data enhancement on the code attribute graph structure;
inputting the code attribute graph structure subjected to data enhancement and the vulnerability detection result into an initial vulnerability detection model to obtain an output result;
adjusting a loss function of the initial vulnerability detection model according to the output result and the vulnerability detection result;
and determining the initial vulnerability detection model reaching the preset iteration condition as the intelligent contract vulnerability detection model.
2. The method of claim 1, wherein the performing data enhancement on the code attribute graph structure comprises:
discarding the features corresponding to the nodes of the code attribute graph structure according to a first preset probability to obtain a first enhanced sample;
or,
discarding entries of the adjacency matrix of the code attribute graph structure according to a second preset probability to obtain a second enhanced sample.
3. The method according to claim 1 or 2, wherein the output result comprises a first output result and a second output result, and the adjusting the loss function of the initial vulnerability detection model according to the output result and the vulnerability detection result comprises:
calculating the Euclidean distance and the Manhattan distance between the first output result and the second output result;
and adjusting the loss function according to the Euclidean distance, the Manhattan distance and the vulnerability detection result.
4. The method according to claim 1 or 2, wherein the adjusting the loss function of the initial vulnerability detection model according to the output result and the vulnerability detection result comprises:
determining a cross-correlation matrix of the first output result and the second output result;
and adjusting the loss function based on the cross-correlation matrix and the vulnerability detection result.
5. The method according to claim 1 or 2, wherein the constructing a code attribute graph structure corresponding to each group of intelligent contract source code in the intelligent contract source code data set based on the vulnerability detection result comprises:
constructing nodes of a target code attribute graph structure corresponding to a target intelligent contract source code according to the differing degrees of importance of the elements in the target intelligent contract source code, wherein the target intelligent contract source code is any one intelligent contract source code in the intelligent contract source code data set, and the nodes of the target code attribute graph structure comprise main nodes, common nodes and fallback nodes;
and constructing the edges of the target code attribute graph structure according to the relationships among the nodes, wherein the edges of the target code attribute graph structure comprise a control flow edge, a data flow edge, a forward edge and a fallback edge.
6. The method according to claim 1 or 2, characterized in that the method further comprises:
constructing a to-be-detected code attribute graph structure corresponding to the to-be-detected intelligent contract source code;
performing data enhancement on the to-be-detected code attribute graph structure;
and inputting the code attribute graph structure to be detected after data enhancement into the intelligent contract vulnerability detection model so as to detect whether a vulnerability exists in the code attribute graph structure to be detected.
7. A training device of an intelligent contract vulnerability detection model, characterized by comprising:
a determining unit, configured to determine a vulnerability detection result corresponding to an intelligent contract source code data set;
a constructing unit, configured to construct a code attribute graph structure corresponding to each group of intelligent contract source codes in the intelligent contract source code data set;
a data enhancement unit, configured to perform data enhancement on the code attribute graph structure;
a model training unit, configured to input the code attribute graph structure subjected to data enhancement and the vulnerability detection result into an initial vulnerability detection model to obtain an output result;
an adjusting unit, configured to adjust a loss function of the initial vulnerability detection model according to the output result and the vulnerability detection result;
and a model determining unit, configured to determine the initial vulnerability detection model reaching the preset iteration condition as the intelligent contract vulnerability detection model.
8. The apparatus according to claim 7, wherein the data enhancement unit is specifically configured to:
discarding the features corresponding to the nodes of the code attribute graph structure according to a first preset probability to obtain a first enhanced sample;
or,
discarding entries of the adjacency matrix of the code attribute graph structure according to a second preset probability to obtain a second enhanced sample.
9. A computer device, comprising:
at least one processor, a memory, and a transceiver that are connected to one another, wherein the memory is configured to store program code, and the processor is configured to invoke the program code in the memory to perform the steps of the training method of the intelligent contract vulnerability detection model according to any of claims 1 to 6.
10. A computer storage medium, comprising:
instructions which, when run on a computer, cause the computer to perform the steps of the method of training an intelligent contract vulnerability detection model according to any of claims 1 to 6.
CN202211260813.5A 2022-10-14 2022-10-14 Training method of intelligent contract vulnerability detection model and related equipment Pending CN115659176A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211260813.5A CN115659176A (en) 2022-10-14 2022-10-14 Training method of intelligent contract vulnerability detection model and related equipment

Publications (1)

Publication Number Publication Date
CN115659176A true CN115659176A (en) 2023-01-31

Family

ID=84987065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211260813.5A Pending CN115659176A (en) 2022-10-14 2022-10-14 Training method of intelligent contract vulnerability detection model and related equipment

Country Status (1)

Country Link
CN (1) CN115659176A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109657615A (en) * 2018-12-19 2019-04-19 腾讯科技(深圳)有限公司 A kind of training method of target detection, device and terminal device
CN110674869A (en) * 2019-09-23 2020-01-10 腾讯科技(深圳)有限公司 Classification processing and graph convolution neural network model training method and device
CN111488582A (en) * 2020-04-01 2020-08-04 杭州云象网络技术有限公司 Intelligent contract reentry vulnerability detection method based on graph neural network
CN112925977A (en) * 2021-02-26 2021-06-08 中国科学技术大学 Recommendation method based on self-supervision graph representation learning
CN113360915A (en) * 2021-06-09 2021-09-07 扬州大学 Intelligent contract multi-vulnerability detection method and system based on source code graph representation learning
CN113919418A (en) * 2021-09-17 2022-01-11 中国电子科技集团公司第三十六研究所 Classification model training method and device based on small samples and electronic equipment
CN114328898A (en) * 2021-12-28 2022-04-12 广州华多网络科技有限公司 Text abstract generating method and device, equipment, medium and product thereof
CN114462045A (en) * 2021-12-31 2022-05-10 国网浙江省电力有限公司物资分公司 Intelligent contract vulnerability detection method
CN114612316A (en) * 2022-01-28 2022-06-10 之江实验室 Method and device for removing rain from nuclear prediction network image

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
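Jure Zbontar et al.: "Barlow Twins: Self-Supervised Learning via Redundancy Reduction", https://arxiv.org/pdf/2103.03230v2.pdf, 3 May 2021 (2021-05-03) *
Duan Yanan: "Research on a Software Vulnerability Detection Method Based on Code Property Graphs and Graph Convolutional Neural Networks", Wanfang *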

Similar Documents

Publication Publication Date Title
CN113360915B (en) Intelligent contract multi-vulnerability detection method and system based on source code diagram representation learning
CN114547611A (en) Intelligent contract Ponzi scheme detection method and system based on multi-modal characteristics
CN113434699B (en) Pre-training method, computer device and storage medium for BERT model for text matching
CN113486357A (en) Intelligent contract security detection method based on static analysis and deep learning
CN113127933B (en) Intelligent contract Ponzi scheme detection method and system based on graph matching network
Xu et al. On a utilitarian approach to privacy preserving text generation
CN112085091A (en) Artificial intelligence-based short text matching method, device, equipment and storage medium
CN113298152A (en) Model training method and device, terminal equipment and computer readable storage medium
CN115017511A (en) Source code vulnerability detection method and device and storage medium
CN115455382A (en) Semantic comparison method and device for binary function codes
CN117972732B (en) Intelligent contract vulnerability detection method and system based on multi-feature fusion
CN116340952A (en) Intelligent contract vulnerability detection method based on operation code program dependency graph
CN113312058B (en) Similarity analysis method for intelligent contract binary function
Song et al. BinMLM: Binary authorship verification with flow-aware mixture-of-shared language model
CN112132269B (en) Model processing method, device, equipment and storage medium
CN117725592A (en) Intelligent contract vulnerability detection method based on directed graph annotation network
CN113158630A (en) Text editing image method, storage medium, electronic device and system
CN115659176A (en) Training method of intelligent contract vulnerability detection model and related equipment
Huang et al. Deep Smart Contract Intent Detection
Zhen et al. DA-GNN: A smart contract vulnerability detection method based on Dual Attention Graph Neural Network
CN113468884A (en) Chinese event trigger word extraction method and device
CN117556425B (en) Intelligent contract vulnerability detection method, system and equipment based on graph neural network
CN118333132B (en) Emotion recognition model training method, emotion recognition method and related equipment
KR102557800B1 (en) Device and method for constructing differentially private decision trees
Blanco-Castañeda et al. Applied Stochastic Modeling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination