CN113177107B - Intelligent contract similarity detection method based on syntax tree matching - Google Patents

Intelligent contract similarity detection method based on syntax tree matching

Publication number
CN113177107B
CN113177107B CN202110569353.3A CN202110569353A CN113177107B CN 113177107 B CN113177107 B CN 113177107B CN 202110569353 A CN202110569353 A CN 202110569353A CN 113177107 B CN113177107 B CN 113177107B
Authority
CN
China
Prior art keywords
syntax tree
similarity
intelligent contract
vector
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110569353.3A
Other languages
Chinese (zh)
Other versions
CN113177107A (en)
Inventor
刘振广
徐小俊
钱鹏
刘灵凤
武思凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Rendui Network Co ltd
Original Assignee
Zhejiang Gongshang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Gongshang University
Priority to CN202110569353.3A
Publication of CN113177107A
Application granted
Publication of CN113177107B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an intelligent contract similarity detection method based on syntax tree matching. The method captures the syntax information of an intelligent contract with an abstract syntax tree extraction tool, obtains the semantic information of each node in a syntax tree through the attention mechanism of an encoder, and extracts a highly semantic feature vector for each syntax tree of the contract. The feature vectors are fed to a similarity calculation function to obtain similarity values between different syntax trees, and an averaging step yields the similarity detection result for two segments of intelligent contract source code. Compared with traditional code clone detection methods, the method detects similarity more accurately, explains its result down to individual code lines, and has good generality and practical value.

Description

Intelligent contract similarity detection method based on syntax tree matching
Technical Field
The invention belongs to the technical field of program similarity detection, and particularly relates to an intelligent contract similarity detection method based on syntax tree matching.
Background
In recent years, to improve software development efficiency, more and more developers have adopted code reuse techniques, such as reusing existing program code, general software frameworks, and common design patterns. However, blindly reusing existing program code can cause many problems: it may add extra cost to a project, expose the software to vulnerability risks, and easily infringe software copyright.
Code similarity detection, also called code clone detection, is one of the effective techniques for checking code reuse; it determines whether two programs contain identical or similar code fragments. According to the degree of similarity, code similarity is generally divided into four types: (1) identical program code; (2) code that is fully reused apart from changes to whitespace, comments, or variable and function names; (3) code that is slightly modified on the basis of type (2); (4) code that is implemented differently but is semantically or functionally the same. Traditional detection methods usually consider similarity only at the syntax level and therefore reach only types (1) and (2), whereas existing similarity detection methods combine lexical, syntactic, semantic, and other multi-dimensional analyses to detect similarity at the levels of types (3) and (4).
Intelligent contract similarity detection is code clone detection for blockchain intelligent contracts. An intelligent contract is program code written in a Turing-complete language and is irreversible and immutable, that is, a contract cannot be modified or updated once it is deployed. If an intelligent contract has a vulnerability, derivative contracts cloned from it may have the same vulnerability. It is therefore necessary to study similarity detection methods for intelligent contract code, which can effectively prevent contract vulnerabilities from propagating and thereby further improve the reliability and security of intelligent contracts.
A method based on syntax tree matching can effectively solve the problem of intelligent contract similarity detection. It converts intelligent contract code into abstract syntax trees (ASTs), splits each AST into several syntax trees to obtain the corresponding syntax tree sequence, and uses a similarity detection algorithm to compute the contract similarity matrix of two different syntax tree sequences. The method detects intelligent contract similarity efficiently and accurately, can explain the similarity down to individual code lines, and is forward-looking and of reference value.
Disclosure of Invention
In view of the above, the invention provides an intelligent contract similarity detection method based on syntax tree matching, which can realize intelligent contract source code similarity detection at the semantic level.
An intelligent contract similarity detection method based on syntax tree matching comprises the following steps:
(1) constructing an abstract syntax tree: taking Ethereum intelligent contracts as the research object, extracting an abstract syntax tree from the intelligent contract source code by using a syntax tree extraction tool;
(2) constructing a syntax tree sequence: taking intelligent contract code segments $Z_1$ and $Z_2$ as the contract clone pair to be tested, obtaining the abstract syntax trees $F_1$ and $F_2$ corresponding to $Z_1$ and $Z_2$ by using the syntax tree extraction tool, splitting $F_1$ and $F_2$ by statement, and traversing them in preorder to obtain the syntax tree sequences $S_1$ and $S_2$;
(3) syntax tree feature extraction: constructing a syntax tree encoder based on the Attention mechanism and extracting the feature vector corresponding to each syntax tree in the syntax tree sequences $S_1$ and $S_2$, thereby obtaining the feature vector set $P_{S_1} = \{p_1, \dots, p_m\}$ of the syntax tree sequence $S_1$ and the feature vector set $P_{S_2} = \{p_1, \dots, p_k\}$ of the syntax tree sequence $S_2$, with each vector in $\mathbb{R}^n$, where $n$ denotes the dimension of the vectors, and $m$ and $k$ denote the number of syntax trees in $S_1$ and $S_2$, respectively;
(4) similarity calculation: using the Pearson similarity algorithm to compute the similarity between each vector in the feature vector set $P_{S_1}$ of $S_1$ and each vector in the feature vector set $P_{S_2}$ of $S_2$, obtaining a contract similarity matrix $T_{m \times k}$, where the value of the element in row $i$ and column $j$ of $T_{m \times k}$ represents the similarity between the $i$-th syntax tree in $S_1$ and the $j$-th syntax tree in $S_2$;
(5) contract similarity detection: setting thresholds $a_1$ and $a_2$, keeping the elements of the matrix $T_{m \times k}$ that are higher than $a_1$ unchanged and setting the elements that are lower than $a_1$ to zero, and computing the average value $M$ of all non-zero elements in the matrix, which is the similarity of intelligent contract code segments $Z_1$ and $Z_2$; then comparing $M$ with $a_2$ to judge whether contract code segments $Z_1$ and $Z_2$ are similar;
(6) interpretability analysis: if the element in row $i$ and column $j$ of the matrix $T_{m \times k}$ is the maximum element of the matrix, the $i$-th syntax tree in $S_1$ and the $j$-th syntax tree in $S_2$ have the highest similarity, so the specific code lines in which contract code segments $Z_1$ and $Z_2$ are similar can be located.
Further, the specific implementation of step (2) is as follows: first, a syntax tree extraction tool is used to extract the abstract syntax trees $F_1$ and $F_2$ from intelligent contract code segments $Z_1$ and $Z_2$; then $F_1$ and $F_2$ are split at the statement level, and preorder traversal yields the syntax tree sequences $S_1 = \{f_i \in F_1 \mid f_1, \dots, f_m\}$ and $S_2 = \{f_j \in F_2 \mid f_1, \dots, f_k\}$, where each syntax tree corresponds to one statement in the intelligent contract, that is, $Z_1$ and $Z_2$ contain $m$ and $k$ statements respectively, and $i$ and $j$ are natural numbers with $1 \le i \le m$ and $1 \le j \le k$;
in particular, for nested statements a set of independent nodes $N_s = \{\text{block}, \text{body}\}$ needs to be defined, where block splits the header and the body of a nested statement and body is used for method declarations; the syntax tree rooted at node $s$ consists of $s$ and all of its descendant nodes $D(s)$, and if a path between nodes $s$ and $d$ passes through a node $n$, then $d$ is a statement contained in the body of $s$, where $d \in D(s)$ and $n \in N_s$.
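The statement-level split described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: the `Node` class, the `STATEMENTS` set of statement node kinds, and the toy AST are hypothetical, and the sketch reads the block/body rule as cutting each statement's tree at block and body nodes so that nested statements become syntax trees of their own.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    kind: str
    children: List["Node"] = field(default_factory=list)

SEPARATORS = {"block", "body"}          # the independent nodes Ns
STATEMENTS = {                          # hypothetical statement node kinds
    "VariableDeclaration", "FunctionDefinition",
    "IfStatement", "ExpressionStatement",
}

def statement_tree(s: Node) -> Node:
    """Copy the tree rooted at s, dropping everything below a separator node."""
    kept = [statement_tree(c) for c in s.children if c.kind not in SEPARATORS]
    return Node(s.kind, kept)

def split(root: Node) -> List[Node]:
    """Preorder traversal that emits one syntax tree per statement node."""
    sequence: List[Node] = []

    def visit(node: Node) -> None:
        if node.kind in STATEMENTS:
            sequence.append(statement_tree(node))
        for child in node.children:     # parent first, then children: preorder
            visit(child)

    visit(root)
    return sequence

# Toy AST: a state variable and a function whose body holds a nested if statement.
ast = Node("SourceUnit", [
    Node("VariableDeclaration", [Node("uint"), Node("balance")]),
    Node("FunctionDefinition", [
        Node("withdraw"),
        Node("body", [
            Node("IfStatement", [
                Node("BinaryOperation"),
                Node("block", [Node("ExpressionStatement", [Node("FunctionCall")])]),
            ]),
        ]),
    ]),
])

trees = split(ast)
print([t.kind for t in trees])
```

In this toy input the function header, the if header, and the inner expression statement each end up in a separate syntax tree, which is what lets a later similarity score be traced back to a single statement.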
Further, the specific implementation manner of the step (3) is as follows:
first, a word2vec tool converts every node of the syntax tree to be encoded into a corresponding vector representation (if the syntax tree represents a variable-definition statement, a node may be the type definition of the corresponding variable), giving a vector sequence $X = \{x_1, \dots, x_m\}$ that serves as the input of the syntax tree encoder, where $m$ here is the number of nodes in the syntax tree;
then, an Attention-based syntax tree encoder is constructed to learn the semantic relations among the vectors in the sequence $X$, and multi-layer iterative learning yields the semantic vector sequence $Y = \{y_1, \dots, y_m\}$ corresponding to the input vector sequence $X$;
finally, all vectors in the sequence $Y$ are fed into a convolution pooling layer to generate the feature vector corresponding to the syntax tree.
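The three encoder stages above (node vectors, attention layers, convolution pooling) can be illustrated with NumPy. This is only a sketch under stated assumptions: the random `embedding` table stands in for trained word2vec vectors, all weight matrices are random rather than learned, the attention is single-head scaled dot-product attention, and the "convolution pooling layer" is approximated by a kernel-size-1 convolution with ReLU followed by max pooling. None of these choices is specified by the source.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """One scaled dot-product self-attention layer over the node vectors."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    weights = softmax(Q @ K.T / np.sqrt(K.shape[1]))    # (m, m), rows sum to 1
    return weights @ V                                   # (m, n)

def encode_tree(tokens, embedding, layers, W_conv):
    """Map the node tokens of one syntax tree to a single n-dimensional vector."""
    X = np.stack([embedding[t] for t in tokens])         # initial vectors X, (m, n)
    Y = X
    for Wq, Wk, Wv in layers:                            # multi-layer iteration
        Y = self_attention(Y, Wq, Wk, Wv)
    H = np.maximum(Y @ W_conv, 0.0)                      # kernel-size-1 convolution + ReLU
    return H.max(axis=0)                                 # max pooling over nodes, (n,)

rng = np.random.default_rng(0)
n = 64                                                   # vector dimension
vocab = ["FunctionDefinition", "withdraw", "public", "uint", "amount"]
embedding = {tok: rng.normal(size=n) for tok in vocab}   # stand-in for word2vec
layers = [tuple(rng.normal(scale=n ** -0.5, size=(n, n)) for _ in range(3))
          for _ in range(2)]
W_conv = rng.normal(scale=n ** -0.5, size=(n, n))

p_short = encode_tree(["uint", "amount"], embedding, layers, W_conv)
p_long = encode_tree(vocab, embedding, layers, W_conv)
print(p_short.shape, p_long.shape)
```

Pooling over the node axis is what makes trees with different numbers of nodes comparable: both calls return a vector of the same dimension `n`.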
Further, the specific implementation of step (4) is as follows: each vector $p_i$ in the feature vector set $P_{S_1}$ of $S_1$ and each vector $p_j$ in the feature vector set $P_{S_2}$ of $S_2$ are substituted into the following similarity calculation function to obtain the similarity value $t_{ij}$ of $p_i$ and $p_j$, that is, $\mathrm{sim}(p_i, p_j)$:
$$t_{ij} = \mathrm{sim}(p_i, p_j) = \frac{\sum_{t=1}^{n} (p_{it} - \bar{p}_i)(p_{jt} - \bar{p}_j)}{\sqrt{\sum_{t=1}^{n} (p_{it} - \bar{p}_i)^2}\,\sqrt{\sum_{t=1}^{n} (p_{jt} - \bar{p}_j)^2}}$$
where $p_{it}$ is the value of the $t$-th element of the vector $p_i$, $\bar{p}_i$ is the average of the elements of $p_i$, $p_{jt}$ is the value of the $t$-th element of the vector $p_j$, $\bar{p}_j$ is the average of the elements of $p_j$, and $t_{ij}$ is the element in row $i$ and column $j$ of the matrix $T_{m \times k}$, which represents the similarity between the $i$-th syntax tree in the syntax tree sequence $S_1$ of code segment $Z_1$ and the $j$-th syntax tree in the syntax tree sequence $S_2$ of code segment $Z_2$.
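The Pearson similarity and the resulting similarity matrix can be written in a few lines of plain Python. The function names and the toy three-dimensional vectors below are hypothetical, and returning 0.0 for a constant vector (zero variance, where the correlation is undefined) is an assumption the source does not address.

```python
from math import sqrt

def pearson(p, q):
    """Pearson correlation between two equal-length vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same dimension")
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    num = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    den = (sqrt(sum((a - mean_p) ** 2 for a in p))
           * sqrt(sum((b - mean_q) ** 2 for b in q)))
    if den == 0:
        return 0.0   # assumption: treat an undefined correlation as no similarity
    return num / den

def similarity_matrix(P1, P2):
    """T[i][j] is the similarity of the i-th vector of P1 and the j-th of P2."""
    return [[pearson(p, q) for q in P2] for p in P1]

# Toy feature vectors: m = 2 syntax trees against k = 3 syntax trees.
P1 = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
P2 = [[2.0, 4.0, 6.0], [1.0, 0.0, 1.0], [6.0, 4.0, 2.0]]
T = similarity_matrix(P1, P2)
for row in T:
    print([round(t, 2) for t in row])
```

Because the Pearson coefficient subtracts the mean and divides by the norm, it ignores shifts and positive rescaling of a vector: `[1, 2, 3]` and `[2, 4, 6]` score approximately 1, while `[1, 2, 3]` and `[6, 4, 2]` score approximately -1.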
Further, the specific implementation manner of the step (5) is as follows:
first, thresholds $a_1$ and $a_2$ are set, where $a_1$ is used to filter out the elements of the contract similarity matrix with lower similarity values and $a_2$ is used to judge whether the two code segments are similar;
then, the elements of the matrix $T_{m \times k}$ that are higher than $a_1$ are kept unchanged, the elements that are lower than $a_1$ are set to zero, and the average value $M$ of all non-zero elements in the matrix is computed, which is the similarity of intelligent contract code segments $Z_1$ and $Z_2$;
finally, $M$ is compared with $a_2$ to judge the similarity of $Z_1$ and $Z_2$: if $M \ge a_2$, $Z_1$ and $Z_2$ are similar; otherwise, $Z_1$ and $Z_2$ are not similar.
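The filtering and averaging of step (5) can be sketched as follows. The function names and the toy matrix are hypothetical. Two details are assumptions because the source leaves them open: an element exactly equal to $a_1$ is kept, and a matrix with no surviving element yields a similarity of 0.0.

```python
def filter_matrix(T, a1):
    """Keep elements that reach a1 and set the rest to zero."""
    return [[t if t >= a1 else 0.0 for t in row] for row in T]

def mean_nonzero(T):
    """Average of all non-zero elements; 0.0 if none survive (assumption)."""
    values = [t for row in T for t in row if t != 0.0]
    return sum(values) / len(values) if values else 0.0

def are_similar(T, a1, a2):
    """Return the contract similarity M and the decision M >= a2."""
    M = mean_nonzero(filter_matrix(T, a1))
    return M, M >= a2

# Toy 2 x 3 similarity matrix; thresholds chosen for illustration.
T = [[1.0, 0.2, 0.1],
     [0.3, 0.9, 0.8]]
M, similar = are_similar(T, a1=0.75, a2=0.8)
print(round(M, 3), similar)
```

Filtering before averaging keeps the many low-scoring, unrelated statement pairs from diluting $M$, so the score reflects only the statement pairs that actually match.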
Further, the specific implementation of step (6) is as follows: first, by comparing the element values of the matrix $T_{m \times k}$, the elements with higher values are identified together with their positions in the matrix; specifically, the value in row $i$ and column $j$ of the matrix $T_{m \times k}$ represents the similarity value of the $i$-th syntax tree in $S_1$ and the $j$-th syntax tree in $S_2$, and if this value is greater than a set threshold, the $i$-th statement in intelligent contract code segment $Z_1$ and the $j$-th statement in $Z_2$ are highly similar, so the similarity can be located at specific code lines in the intelligent contracts, which gives the intelligent contract similarity detection its interpretability.
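Locating the highly similar statement pairs amounts to scanning the matrix for entries above a threshold and reporting their positions. The sketch below is illustrative only: `top_matches` is a hypothetical helper, the indices are reported 1-based to match the "$i$-th statement" wording, and the toy matrix is made up.

```python
def top_matches(T, threshold):
    """Return (i, j, similarity) for entries above threshold, best first.

    i and j are 1-based statement indices in the first and second contract.
    """
    hits = [(i + 1, j + 1, t)
            for i, row in enumerate(T)
            for j, t in enumerate(row)
            if t > threshold]
    return sorted(hits, key=lambda hit: hit[2], reverse=True)

T = [[1.0, 0.2, 0.1],
     [0.3, 0.9, 0.8]]
matches = top_matches(T, threshold=0.75)
for i, j, t in matches:
    print(f"statement {i} of Z1 ~ statement {j} of Z2 (similarity {t:.2f})")
```

Since each syntax tree corresponds to one statement, a statement index can then be mapped to a source line by whatever line information the AST extraction tool records for that statement.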
The intelligent contract similarity detection method based on syntax tree matching effectively solves the problem of similarity detection for Ethereum intelligent contracts. Compared with traditional code clone detection methods, the method of the invention detects similarity more accurately, explains its result down to individual code lines, and has good generality and practical value. Its main beneficial technical effects and innovations lie in the following four aspects:
1. The intelligent contract syntax tree construction method of the invention captures intelligent contract syntax information through an abstract syntax tree extraction tool and subdivides that information at the statement level, so the similarity of two code segments can be compared more accurately.
2. The syntax tree encoder based on the Attention mechanism extracts highly semantic vector representations from the contract syntax trees, which improves the accuracy and efficiency of similarity detection.
3. The invention quantifies the similarity between the code lines of different contracts, which makes the intelligent contract similarity detection interpretable and gives it reliable reference value.
4. The intelligent contract similarity detection method based on syntax tree matching has good extensibility and reference value.
Drawings
Fig. 1 is a schematic flow chart of an intelligent contract similarity detection method based on syntax tree matching according to the present invention.
FIG. 2 is a flow diagram illustrating splitting an abstract syntax tree into syntax trees according to the present invention.
FIG. 3 is a schematic diagram of the coding of an Attention-based encoder according to the present invention.
FIG. 4 is a diagram of a simulation for detecting intelligent contract similarity according to an embodiment of the present invention.
Detailed Description
In order to more specifically describe the present invention, the following detailed description is provided for the technical solution of the present invention with reference to the accompanying drawings and the specific embodiments.
The method captures intelligent contract syntax information with an abstract syntax tree extraction tool, obtains the semantic information of each node in a syntax tree through the attention mechanism of an encoder, and extracts a highly semantic feature vector for each syntax tree of the intelligent contract. The feature vectors are fed to a similarity calculation function to obtain similarity values between different syntax trees, and an averaging step yields the similarity detection result for two segments of intelligent contract source code. The overall flow is shown in Fig. 1.
As shown in Fig. 2, the method of the present invention for splitting an intelligent contract abstract syntax tree (AST) into syntax trees can be summarized as follows: the AST is split by statement and traversed in preorder to obtain a syntax tree sequence $S$, so that each syntax tree corresponds to one statement in the source code. For nested statements, a set of independent nodes $N_s = \{\text{block}, \text{body}\}$ is defined, where block splits the header and the body of a nested statement (for example, the nesting relation between syntax tree f4 and syntax tree f5) and body is used for method declarations. A syntax tree is defined as follows: the syntax tree rooted at $s$ consists of $s$ and all of its descendant nodes $D(s)$ (where $s \in S$). A descendant node is defined as follows: if a path between $s$ and $d$ passes through $n$, then node $d$ is contained in some statement in the body of $s$ (where $d \in D(s)$ and $n \in N_s$).
Taking the FunStatement syntax tree in Fig. 2 as an example, the syntax tree encoder based on the Attention mechanism of the present invention proceeds as follows: (1) all nodes of the syntax tree are converted into initial vector representations by using a word2vec tool, giving a sequence $X = \{x_1, \dots, x_m\}$ (where $x_i$ is the vector representation of the $i$-th node); (2) the initial vectors $X$ of the syntax tree are fed into an Attention network, which uses the Attention mechanism to learn the semantic relations among the vectors and thereby extracts the semantic relations among the nodes of the syntax tree; (3) step (2) is iterated several times, and the learning result is normalized by a Softmax layer to obtain the semantic vector representation $Y = \{y_1, \dots, y_m\}$ corresponding to the initial vectors (where $y_i$ corresponds to $x_i$ in order), which contains not only syntactic information but also semantic information (the semantic vector referred to in this invention is defined as follows: because the code corresponding to a syntax tree node may have different meanings in different code lines, the initial vector of the node is converted into a semantic vector that takes its context of use into account; for example, public appears in both line 3 and line 4 of the code segment in Fig. 2, where the public in line 3 denotes a variable type and the public in line 4 denotes a function type); (4) $Y$ is converted into the final syntax tree vector $p_i$ by a convolution pooling layer.
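The point about public having different meanings in different lines can be made concrete with a toy experiment: the same initial vector, passed through self-attention together with different neighbouring nodes, comes out as different contextual vectors. The sketch below is an illustration under assumptions, not the encoder of the invention: the embeddings are random, the attention is parameter-free (queries, keys, and values are all the input vectors), and the two token sequences are invented stand-ins for a variable declaration and a function definition.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attend(X):
    """Parameter-free self-attention (Q = K = V = X), for illustration only."""
    return softmax(X @ X.T / np.sqrt(X.shape[1])) @ X

rng = np.random.default_rng(0)
tokens = ["uint", "public", "balance", "function", "withdraw"]
emb = {tok: rng.normal(size=8) for tok in tokens}     # one initial vector per token

variable_stmt = ["uint", "public", "balance"]         # public in a variable declaration
function_stmt = ["function", "withdraw", "public"]    # public in a function definition

y_var = attend(np.stack([emb[t] for t in variable_stmt]))[1]
y_fun = attend(np.stack([emb[t] for t in function_stmt]))[2]

# Both occurrences start from the same initial vector emb["public"],
# but their contextual vectors differ because the neighbours differ.
print(np.allclose(y_var, y_fun))
```

Each output row is a weighted average of all node vectors in the statement, so the representation of a token necessarily absorbs information from the statement it appears in.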
The following embodiment takes the intelligent contract similarity detection shown in fig. 4 as an example, and the specific detection flow is as follows:
(1) First, a syntax tree extraction tool converts intelligent contracts A and B into abstract syntax trees $F_1$ and $F_2$, respectively.
(2) As shown in Fig. 2, $F_1$ and $F_2$ are split by statement, and preorder traversal yields the syntax tree sequences $S_1 = \{f_i \in F_1 \mid f_1, \dots, f_8\}$ and $S_2 = \{f_i \in F_2 \mid f_1, \dots, f_{10}\}$. In this example, contracts A and B contain 8 and 10 statements respectively, so the sequence $S_1$ contains 8 syntax trees and $S_2$ contains 10 syntax trees.
(3) As shown in Fig. 3, the syntax tree encoder based on the Attention mechanism generates the feature vector corresponding to each syntax tree $f_i$ in the syntax tree sequences $S_1$ and $S_2$, giving the feature vector set $P_{S_1} = \{p_1, \dots, p_8\}$ of the syntax tree sequence $S_1$ and the feature vector set $P_{S_2} = \{p_1, \dots, p_{10}\}$ of the syntax tree sequence $S_2$, with each vector in $\mathbb{R}^{64}$ (the dimension of all feature vectors in this example is 64).
(4) Each vector $p_i$ in the feature vector set $P_{S_1}$ of $S_1$ and each vector $p_j$ in the feature vector set $P_{S_2}$ of $S_2$ are substituted in turn into the similarity calculation function to obtain the similarity value $t_{ij}$ of $p_i$ and $p_j$, giving the contract similarity matrix $T_{8 \times 10}$; a threshold $a_1$ is set, and the elements of $T_{8 \times 10}$ whose values are lower than $a_1$ are filtered out by element-wise comparison. The specific implementation process is as follows:
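4.1 The Pearson similarity calculation function is used to compute the similarity between the syntax tree vectors in the feature vector set $P_{S_1}$ of $S_1$ and those in the feature vector set $P_{S_2}$ of $S_2$:
$$t_{ij} = \mathrm{sim}(p_i, p_j) = \frac{\sum_{t=1}^{n} (p_{it} - \bar{p}_i)(p_{jt} - \bar{p}_j)}{\sqrt{\sum_{t=1}^{n} (p_{it} - \bar{p}_i)^2}\,\sqrt{\sum_{t=1}^{n} (p_{jt} - \bar{p}_j)^2}}$$
This finally gives the contract similarity matrix $T_{8 \times 10}$, whose element in row $i$ and column $j$ is the similarity value of the $i$-th syntax tree in the sequence $S_1$ and the $j$-th syntax tree in $S_2$.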
4.2 The threshold is set to $a_1 = 0.75$; the elements of $T_{8 \times 10}$ whose values are higher than $a_1$ are kept unchanged, and the elements whose values are lower than $a_1$ are set to zero.
(5) A threshold $a_2$ is set, the average of the non-zero elements of the contract similarity matrix $T_{8 \times 10}$ is computed to obtain the similarity value $M$ of intelligent contracts A and B, and $M$ is compared with $a_2$ to judge whether intelligent contracts A and B are similar. The specific implementation process is as follows:
5.1 The average value $M$ of the non-zero elements of the matrix $T_{8 \times 10}$ is computed; $M$ is the similarity of intelligent contracts A and B.
5.2 The threshold is set to $a_2 = 0.8$; if $M \ge a_2$, that is, the similarity value of contracts A and B is not lower than 0.8, contracts A and B are semantically similar; otherwise, contracts A and B are not semantically similar.
5.3 In this example the similarity value $M$ is greater than 0.8, which indicates that contracts A and B are semantically similar.
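For reference, the filtering and decision rule of this embodiment can be written compactly as follows, with $a_1 = 0.75$ and $a_2 = 0.8$. The symbol $\tilde{t}_{ij}$ for a filtered element is introduced here only for illustration, and keeping an element exactly equal to $a_1$ is an assumption, since the text only covers values higher or lower than $a_1$.

```latex
\tilde{t}_{ij} =
\begin{cases}
t_{ij}, & t_{ij} \ge a_1 \\
0,      & t_{ij} < a_1
\end{cases}
\qquad
M = \frac{\sum_{i=1}^{8} \sum_{j=1}^{10} \tilde{t}_{ij}}
         {\left| \{ (i, j) : \tilde{t}_{ij} \ne 0 \} \right|}
\qquad
\text{A and B are similar} \iff M \ge a_2
```

Because every surviving element is at least $a_1$, $M$ is never below 0.75 in this embodiment whenever at least one element survives, so the comparison with $a_2 = 0.8$ is decided by how strongly the retained statement pairs match.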
(6) The elements of $T_{8 \times 10}$ are analyzed further: if the similarity value of the element in row $i$ and column $j$ of $T_{8 \times 10}$ is high, the $i$-th statement in contract A and the $j$-th statement in contract B are highly similar, which reflects the interpretability analysis of the similarity detection method. The specific implementation process is as follows:
6.1 Comparing the element values of the contract similarity matrix $T_{8 \times 10}$ yields the elements with higher similarity values and their positions in the matrix.
6.2 In this example, the first and second statements of contracts A and B (similarity value 1.00) are the most similar, so the similar code can be located at lines 1 and 2.
The embodiments described above are presented to enable a person of ordinary skill in the art to understand and use the invention. Those skilled in the art can readily make various modifications to the above embodiments and apply the general principles described herein to other embodiments without inventive effort. Therefore, the present invention is not limited to the above embodiments, and improvements and modifications made by those skilled in the art on the basis of this disclosure fall within the protection scope of the present invention.

Claims (1)

1. An intelligent contract similarity detection method based on syntax tree matching comprises the following steps:
(1) constructing an abstract syntax tree: taking Ethereum intelligent contracts as the research object, extracting an abstract syntax tree from the intelligent contract source code by using a syntax tree extraction tool;
(2) constructing a syntax tree sequence: taking intelligent contract code segments $Z_1$ and $Z_2$ as the contract clone pair to be tested, obtaining the abstract syntax trees $F_1$ and $F_2$ corresponding to $Z_1$ and $Z_2$ by using the syntax tree extraction tool, splitting $F_1$ and $F_2$ by statement, and traversing them in preorder to obtain the syntax tree sequences $S_1$ and $S_2$, the specific implementation being as follows: first, a syntax tree extraction tool is used to extract the abstract syntax trees $F_1$ and $F_2$ from intelligent contract code segments $Z_1$ and $Z_2$; then $F_1$ and $F_2$ are split at the statement level, and preorder traversal yields the syntax tree sequences $S_1 = \{f_i \in F_1 \mid f_1, \dots, f_m\}$ and $S_2 = \{f_j \in F_2 \mid f_1, \dots, f_k\}$, where each syntax tree corresponds to one statement in the intelligent contract, that is, $Z_1$ and $Z_2$ contain $m$ and $k$ statements respectively, and $i$ and $j$ are natural numbers with $1 \le i \le m$ and $1 \le j \le k$;
in particular, for nested statements a set of independent nodes $N_s = \{\text{block}, \text{body}\}$ needs to be defined, where block splits the header and the body of a nested statement and body is used for method declarations; the syntax tree rooted at node $s$ consists of $s$ and all of its descendant nodes $D(s)$, and if a path between $s$ and $d$ passes through a node $n$, then $d$ is a statement contained in the body of $s$, where $d \in D(s)$ and $n \in N_s$;
(3) syntax tree feature extraction: constructing a syntax tree encoder based on the Attention mechanism and extracting the feature vector corresponding to each syntax tree in the syntax tree sequences $S_1$ and $S_2$, thereby obtaining the feature vector set $P_{S_1} = \{p_1, \dots, p_m\}$ of the syntax tree sequence $S_1$ and the feature vector set $P_{S_2} = \{p_1, \dots, p_k\}$ of the syntax tree sequence $S_2$, with each vector in $\mathbb{R}^n$, where $n$ denotes the dimension of the vectors, and $m$ and $k$ denote the number of syntax trees in $S_1$ and $S_2$, respectively, the specific implementation being as follows:
first, a word2vec tool converts every node of the syntax tree to be encoded into a corresponding vector representation, giving a vector sequence $X = \{x_1, \dots, x_m\}$ that serves as the input of the syntax tree encoder, where $m$ here is the number of nodes in the syntax tree;
then, an Attention-based syntax tree encoder is constructed to learn the semantic relations among the vectors in the sequence $X$, and multi-layer iterative learning yields the semantic vector sequence $Y = \{y_1, \dots, y_m\}$ corresponding to the input vector sequence $X$;
finally, all vectors in the sequence $Y$ are fed into a convolution pooling layer to generate the feature vector corresponding to the syntax tree;
(4) similarity calculation: using the Pearson similarity algorithm to compute the similarity between each vector in the feature vector set $P_{S_1}$ of $S_1$ and each vector in the feature vector set $P_{S_2}$ of $S_2$, obtaining a contract similarity matrix $T_{m \times k}$, where the value of the element in row $i$ and column $j$ of $T_{m \times k}$ represents the similarity between the $i$-th syntax tree in $S_1$ and the $j$-th syntax tree in $S_2$, the specific implementation being as follows: each vector $p_i$ in $P_{S_1}$ and each vector $p_j$ in $P_{S_2}$ are substituted into the following similarity calculation function to obtain the similarity value $t_{ij}$ of $p_i$ and $p_j$, that is, $\mathrm{sim}(p_i, p_j)$:
$$t_{ij} = \mathrm{sim}(p_i, p_j) = \frac{\sum_{t=1}^{n} (p_{it} - \bar{p}_i)(p_{jt} - \bar{p}_j)}{\sqrt{\sum_{t=1}^{n} (p_{it} - \bar{p}_i)^2}\,\sqrt{\sum_{t=1}^{n} (p_{jt} - \bar{p}_j)^2}}$$
where $p_{it}$ is the value of the $t$-th element of the vector $p_i$, $\bar{p}_i$ is the average of the elements of $p_i$, $p_{jt}$ is the value of the $t$-th element of the vector $p_j$, $\bar{p}_j$ is the average of the elements of $p_j$, and $t_{ij}$ is the element in row $i$ and column $j$ of the matrix $T_{m \times k}$, which represents the similarity between the $i$-th syntax tree in the syntax tree sequence $S_1$ of code segment $Z_1$ and the $j$-th syntax tree in the syntax tree sequence $S_2$ of code segment $Z_2$;
(5) contract similarity detection: setting thresholds $a_1$ and $a_2$, keeping the elements of the matrix $T_{m \times k}$ that are higher than $a_1$ unchanged and setting the elements that are lower than $a_1$ to zero, and computing the average value $M$ of all non-zero elements in the matrix, which is the similarity of intelligent contract code segments $Z_1$ and $Z_2$; then comparing $M$ with $a_2$ to judge whether contract code segments $Z_1$ and $Z_2$ are similar, the specific implementation being as follows:
first, thresholds $a_1$ and $a_2$ are set, where $a_1$ is used to filter out the elements of the contract similarity matrix with lower similarity values and $a_2$ is used to judge whether the two code segments are similar;
then, the elements of the matrix $T_{m \times k}$ that are higher than $a_1$ are kept unchanged, the elements that are lower than $a_1$ are set to zero, and the average value $M$ of all non-zero elements in the matrix is computed, which is the similarity of intelligent contract code segments $Z_1$ and $Z_2$;
finally, $M$ is compared with $a_2$ to judge the similarity of $Z_1$ and $Z_2$: if $M \ge a_2$, $Z_1$ and $Z_2$ are similar; otherwise, $Z_1$ and $Z_2$ are not similar;
(6) interpretability analysis: if the element in row $i$ and column $j$ of the matrix $T_{m \times k}$ is the maximum element of the matrix, the $i$-th syntax tree in $S_1$ and the $j$-th syntax tree in $S_2$ have the highest similarity, so the specific code lines in which contract code segments $Z_1$ and $Z_2$ are similar can be located, the specific implementation being as follows: first, by comparing the element values of the matrix $T_{m \times k}$, the elements with higher values are identified together with their positions in the matrix; specifically, the value in row $i$ and column $j$ of the matrix $T_{m \times k}$ represents the similarity value of the $i$-th syntax tree in $S_1$ and the $j$-th syntax tree in $S_2$, and if this value is greater than a set threshold, the $i$-th statement in intelligent contract code segment $Z_1$ and the $j$-th statement in $Z_2$ are highly similar, so the similarity can be located at specific code lines in the intelligent contracts, which gives the intelligent contract similarity detection its interpretability.
CN202110569353.3A 2021-05-25 2021-05-25 Intelligent contract similarity detection method based on syntax tree matching Active CN113177107B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110569353.3A CN113177107B (en) 2021-05-25 2021-05-25 Intelligent contract similarity detection method based on syntax tree matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110569353.3A CN113177107B (en) 2021-05-25 2021-05-25 Intelligent contract similarity detection method based on syntax tree matching

Publications (2)

Publication Number Publication Date
CN113177107A CN113177107A (en) 2021-07-27
CN113177107B true CN113177107B (en) 2022-05-27

Family

ID=76929930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110569353.3A Active CN113177107B (en) 2021-05-25 2021-05-25 Intelligent contract similarity detection method based on syntax tree matching

Country Status (1)

Country Link
CN (1) CN113177107B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113312268A (en) * 2021-07-29 2021-08-27 北京航空航天大学 Intelligent contract code similarity detection method
CN114201406B (en) * 2021-12-16 2024-02-02 中国电信股份有限公司 Code detection method, system, equipment and storage medium based on open source component

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110162750A (en) * 2019-01-24 2019-08-23 腾讯科技(深圳)有限公司 Text similarity detection method, electronic equipment and computer readable storage medium
CN111898360A (en) * 2019-07-26 2020-11-06 创新先进技术有限公司 Text similarity detection method and device based on block chain and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11190520B2 (en) * 2018-11-20 2021-11-30 Microsoft Technology Licensing, Llc Blockchain smart contracts for digital asset access

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection; Xiaojun Xu et al.; Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security; 20171231; full text *
SGX-based blockchain transaction privacy and security protection method; Fan Junsong et al.; Journal of Applied Sciences (《应用科学学报》); 20210131; full text *

Also Published As

Publication number Publication date
CN113177107A (en) 2021-07-27

Similar Documents

Publication Publication Date Title
CN107516041B (en) WebShell detection method and system based on deep neural network
CN108446540B (en) Program code plagiarism type detection method and system based on source code multi-label graph neural network
CN110941716B (en) Automatic construction method of information security knowledge graph based on deep learning
WO2019233112A1 (en) Vectorized representation method for software source codes
CN113177107B (en) Intelligent contract similarity detection method based on syntax tree matching
CN113761221B (en) Knowledge graph entity alignment method based on graph neural network
CN113010209A (en) Binary code similarity comparison technology for resisting compiling difference
CN114201406B (en) Code detection method, system, equipment and storage medium based on open source component
CN113297580B (en) Code semantic analysis-based electric power information system safety protection method and device
CN113901474B (en) Vulnerability detection method based on function-level code similarity
CN114547619B (en) Vulnerability restoration system and restoration method based on tree
CN114064117A (en) Code clone detection method and system based on byte code and neural network
CN115617395A (en) Intelligent contract similarity detection method fusing global and local features
CN114611115A (en) Software source code vulnerability detection method based on mixed graph neural network
CN113449303A (en) Intelligent contract vulnerability detection method and system based on teacher-student network model
CN113591465A (en) Method and device for identifying multidimensional IoC entity based on correlation enhancement network threat intelligence
CN115658846A (en) Intelligent search method and device suitable for open-source software supply chain
CN116302089A (en) Picture similarity-based code clone detection method, system and storage medium
CN114780103B (en) Semantic code clone detection method based on graph matching network
CN115878177A (en) Code clone detection method and system
CN116738963A (en) Deep learning code plagiarism detection method based on multi-head attention mechanism
CN115422541A (en) Intelligent contract code clone detection method based on AST multi-dimensional feature fusion
CN115859307A (en) Similar vulnerability detection method based on tree attention and weighted graph matching
CN115185728A (en) Software system architecture recovery method based on graph node embedding
CN116628695A (en) Vulnerability discovery method and device based on multitask learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right
Effective date of registration: 20231213
Address after: No. 811 Xingbo Third Road, Chengdong Street, Boxing County, Binzhou City, Shandong Province, 256500
Patentee after: Shandong Rendui Network Co.,Ltd.
Address before: 310018, No. 18 Jiao Tong Street, Xiasha Higher Education Park, Hangzhou, Zhejiang
Patentee before: Zhejiang Gongshang University