CN113268732A - Solidity smart contract similarity detection method and system - Google Patents

Solidity smart contract similarity detection method and system

Info

Publication number
CN113268732A
Authority
CN
China
Prior art keywords
similarity
basic block
smart contract
negative sample
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110420735.XA
Other languages
Chinese (zh)
Other versions
CN113268732B (en)
Inventor
祝迪
庞建民
周鑫
岳峰
王军
李明亮
王其涵
韩文杰
刘光明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Engineering University of PLA Strategic Support Force
Priority to CN202110420735.XA
Publication of CN113268732A
Application granted
Publication of CN113268732B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/563 Static detection by source code analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/565 Static detection by checking file integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a Solidity smart contract similarity detection method and system. Ethereum smart contract source code of different versions is collected; each contract is compiled with and without compiler optimization to obtain two types of intermediate representation; and the basic blocks of the intermediate representations are labeled to obtain similar basic block pairs, each consisting of a basic block and its similar basic block, which form a smart contract similarity data set. Basic block sequence vectors are generated through vector space embedding, and differentiated negative samples and hard negative samples of the basic blocks are obtained through a natural language processing model. A triplet network model is trained and optimized using the basic blocks in the similarity data set as anchors, the similar basic blocks as positive samples, and the differentiated and/or hard negative samples as negative samples, and similarity detection is performed on a target smart contract using the trained and optimized triplet network model. The method improves similarity detection accuracy and is suitable for large-scale smart contract vulnerability mining, malicious contract detection, contract upgrade security detection, and the like.

Description

Solidity smart contract similarity detection method and system
Technical Field
The invention belongs to the technical field of smart contract auditing and analysis, and in particular relates to a Solidity smart contract similarity detection method and system.
Background
As healthcare, the Internet of Things, financial information and other fields have developed rapidly, the application of blockchain technology in these fields has gradually gained acceptance. Ethereum is currently the largest blockchain: it has the most active users and the most smart contracts running on it. Smart contracts are decentralized applications on a blockchain and can be used to carry out transparent, tamper-proof, traceable messaging and monetary transactions without third-party certification. However, Ethereum smart contracts are programs like any other and face problems such as malicious contract deployment, code vulnerabilities, and security threats introduced by contract upgrades.
Currently, there are three main approaches to security auditing of smart contracts: static analysis, dynamic analysis, and machine learning. Dynamic analysis records the running state of the software through means such as simulation and fuzzing, and builds a similarity matching model on that running state. Its false-positive rate is low, but it is resource-intensive and time-consuming, so it is poorly suited to the growing number of smart contracts. Static analysis generally builds a similarity matching or vulnerability detection model by analyzing the syntax and semantics of the target program, combined with data flow and control flow analysis, without running the program. It executes quickly, but it cannot obtain the runtime context of the contract, so its false-positive rate is high. In recent years, with the rapid development of artificial intelligence, machine learning models have been applied to contract security auditing. A model is pre-trained on a large-scale software data set to obtain a machine learning model suited to smart contract security auditing; the model can then audit unknown smart contracts, using less time and fewer resources while maintaining accuracy. However, most existing machine-learning-based methods for smart contract security auditing operate at the source code level and at method granularity, so the model cannot effectively extract the semantics of the smart contract and accuracy is low.
As the use of blockchain technology has expanded rapidly, the number of smart contracts has grown explosively. Contract code is reused more and more frequently, and smart contract similarity detection is increasingly important: vulnerability search, malicious contract detection, contract upgrade threat detection and the like can all be built on it. At present, similarity detection for smart contracts lacks data set support. In addition, the many versions of the Solidity development language and of its compiler make security auditing and similarity analysis of smart contracts more difficult.
Disclosure of Invention
The invention therefore provides a Solidity smart contract similarity detection method and system that measures the similarity of smart contracts across versions at basic block granularity and at the intermediate representation level, improves the accuracy of similarity detection, and is suitable for large-scale smart contract vulnerability mining, malicious contract detection, contract upgrade security detection, and the like.
According to the design scheme provided by the invention, a Solidity smart contract similarity detection method is provided, comprising the following:
collecting Ethereum smart contract source code of different versions, compiling it with and without compiler optimization to obtain two types of intermediate representation, and labeling the basic blocks of the two types of intermediate representation to obtain similar basic block pairs, each consisting of a basic block and its similar basic block, which form a smart contract similarity data set;
generating basic block sequence vectors for the similarity data set through vector space embedding, and obtaining differentiated negative samples and hard negative samples of the basic blocks through a natural language processing model;
constructing a triplet network model for measuring smart contract similarity, training and optimizing the triplet network model using the basic blocks in the similarity data set as anchors, the similar basic blocks as positive samples, and the differentiated and/or hard negative samples as negative samples, and performing similarity detection on a target smart contract using the trained and optimized triplet network model.
Further, in the Solidity smart contract similarity detection method of the invention, the basic block labeling first normalizes the instructions in the intermediate representation and generates a basic block sequence according to the basic block definition; the basic blocks and their similar basic blocks in the intermediate representations are then matched according to the basic block labels and collected to obtain the similarity data set.
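The normalization and basic block splitting step can be sketched roughly as follows. The patent does not specify the intermediate representation or the normalization rules, so the EVM-style mnemonics, the `CONST` placeholder, and the function names `normalize` and `split_basic_blocks` are assumptions made purely for illustration.

```python
import re

# Assumption: instructions are EVM-style mnemonics, one per list entry.
# The patent does not name the intermediate representation it uses.
BLOCK_ENDERS = {"JUMP", "JUMPI", "STOP", "RETURN", "REVERT", "INVALID", "SELFDESTRUCT"}
BLOCK_STARTER = "JUMPDEST"


def normalize(instruction):
    """Replace concrete hex immediates with a placeholder token (assumed rule)."""
    return re.sub(r"0x[0-9a-fA-F]+", "CONST", instruction.strip())


def split_basic_blocks(instructions):
    """Split a linear instruction list into basic blocks.

    A block ends after a control-transfer instruction, and a new block
    begins at every jump target.
    """
    blocks, current = [], []
    for raw in instructions:
        ins = normalize(raw)
        if not ins:
            continue  # skip blank lines
        opcode = ins.split()[0]
        if opcode == BLOCK_STARTER and current:
            blocks.append(current)
            current = []
        current.append(ins)
        if opcode in BLOCK_ENDERS:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


program = [
    "PUSH1 0x80", "PUSH1 0x40", "MSTORE", "PUSH2 0x0010", "JUMPI",
    "JUMPDEST", "PUSH1 0x00", "DUP1", "REVERT",
]
blocks = split_basic_blocks(program)
```

Replacing immediates with a placeholder is one plausible way to make blocks from different compiler versions comparable; the actual rules shown in FIG. 4 may differ.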
Further, in the Solidity smart contract similarity detection method of the invention, the vector space embedding first converts the basic blocks in the data set into tokens and then uses word2vec to embed them into a vector space, generating the basic block sequence vectors.
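As a rough illustration of the tokenization step, the sketch below flattens a basic block into tokens and builds the (center, context) pairs that a skip-gram style word2vec model would train on. The patent does not state which word2vec variant, window size, or vector dimension is used; the whitespace tokenizer, the skip-gram choice, and the `window` parameter are assumptions. Actual embedding training would be done with a word2vec implementation rather than this helper.

```python
def tokenize_block(block):
    """Flatten a basic block (list of instruction strings) into tokens."""
    return [tok for ins in block for tok in ins.split()]


def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs within a symmetric window.

    These pairs are the training examples of a skip-gram word2vec model
    (an assumed variant; the patent only says "word2vec").
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


block = ["PUSH1 CONST", "DUP1", "REVERT"]
tokens = tokenize_block(block)
pairs = skipgram_pairs(tokens, window=1)
```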
Further, in the Solidity smart contract similarity detection method of the invention, the natural language processing model is a Transformer model; the Transformer model is used to obtain the vector space distance that serves as the similarity measure between basic blocks, and candidate basic blocks are sorted by similarity from high to low to obtain the differentiated negative samples and hard negative samples.
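The ranking-based selection of negatives can be sketched as below: candidates are sampled, sorted by similarity to the anchor from high to low, and the head of the ordering is taken as the hard negative while the tail is taken as the differentiated negative. Cosine similarity over plain Python lists stands in for the Transformer-derived distance, and `mine_negatives` and its parameters are hypothetical names; the default pool of 100 follows the number given in the embodiment.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mine_negatives(anchor_vec, candidates, sample_size=100, rng=None):
    """Return (hard_negative_id, differentiated_negative_id).

    `candidates` maps block id -> embedding vector and is assumed to
    exclude the anchor and its positive sample.
    """
    if not candidates:
        raise ValueError("no candidate blocks to sample from")
    rng = rng or random.Random(0)
    ids = sorted(candidates)
    sampled = rng.sample(ids, min(sample_size, len(ids)))
    ranked = sorted(
        sampled,
        key=lambda i: cosine_similarity(anchor_vec, candidates[i]),
        reverse=True,  # most similar first
    )
    return ranked[0], ranked[-1]


anchor = [1.0, 0.0]
pool = {"b1": [0.9, 0.1], "b2": [0.0, 1.0], "b3": [-1.0, 0.0]}
hard_id, differentiated_id = mine_negatives(anchor, pool)
```

The hard negative is the block the model is most likely to confuse with the anchor, while the tail of the ranking supplies a clearly dissimilar block.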
Further, in the Solidity smart contract similarity detection method of the invention, for each basic block in the data set, the natural language processing model generates the corresponding differentiated negative samples and hard negative samples in a preset ratio.
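A minimal sketch of combining negatives in a preset ratio follows. The embodiment gives 3:1 as an example ratio; reading it as differentiated : hard (the order in which the two sample types are named) is an interpretation, and `build_triplets` is a hypothetical helper.

```python
def build_triplets(anchor, positive, differentiated, hard, ratio=(3, 1)):
    """Combine one (anchor, positive) pair with negatives in a preset ratio.

    `ratio` is (number of differentiated negatives, number of hard negatives).
    Returns a list of (anchor, positive, negative, kind) tuples.
    """
    n_diff, n_hard = ratio
    if len(differentiated) < n_diff or len(hard) < n_hard:
        raise ValueError("not enough negatives for the requested ratio")
    negatives = [(n, "differentiated") for n in differentiated[:n_diff]]
    negatives += [(n, "hard") for n in hard[:n_hard]]
    return [(anchor, positive, neg, kind) for neg, kind in negatives]


triplets = build_triplets("B1", "B1_opt", ["d1", "d2", "d3"], ["h1"])
```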
Further, in the Solidity smart contract similarity detection method of the invention, the triplet network model is built from Transformer models, and the training error is computed using a triplet loss function.
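For reference, the commonly used form of the triplet loss is max(0, d(a, p) - d(a, n) + margin), where a, p, and n are the embeddings of the anchor, positive, and negative samples. The sketch below implements that standard form; the Euclidean distance and the margin of 1.0 are assumptions, since the patent does not state which distance or margin it uses.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: zero once the negative is farther from the
    anchor than the positive by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])   # negative is far away
tight = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 1.5])  # negative is close
```

A hard negative sits close to the anchor and therefore yields a non-zero loss, which is what drives the model to separate near-duplicates.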
Further, in the Solidity smart contract similarity detection method of the invention, the adaptive moment estimation (Adam) optimizer is used to adaptively learn the model parameters during training and optimization of the triplet network model.
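A single Adam update can be sketched as follows. The learning rate of 3e-5 is the value given in the embodiment; the `beta1`, `beta2`, and `eps` defaults are the values commonly used with Adam and are assumptions here, as the patent does not list them. In practice a deep learning framework's optimizer would be used instead of this helper.

```python
import math


def adam_step(params, grads, m, v, t, lr=3e-5, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update and return (new_params, new_m, new_v).

    `m` and `v` are the running first and second moment estimates;
    `t` is the 1-based step count used for bias correction.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        m_hat = m_i / (1 - beta1 ** t)  # bias-corrected first moment
        v_hat = v_i / (1 - beta2 ** t)  # bias-corrected second moment
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


params, m, v = adam_step([1.0], [2.0], [0.0], [0.0], t=1)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate.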
Further, the invention also provides a Solidity smart contract similarity detection system, comprising a data collection module, a sample construction module, and a training detection module, wherein:
the data collection module is used for collecting Ethereum smart contract source code of different versions, compiling it with and without compiler optimization to obtain two types of intermediate representation, and labeling the basic blocks of the two types of intermediate representation to obtain similar basic block pairs, each consisting of a basic block and its similar basic block, which form a smart contract similarity data set;
the sample construction module is used for generating basic block sequence vectors for the similarity data set through vector space embedding, and obtaining differentiated negative samples and hard negative samples of the basic blocks through a natural language processing model;
the training detection module is used for constructing a triplet network model for measuring smart contract similarity, training and optimizing the triplet network model using the basic blocks in the similarity data set as anchors, the similar basic blocks as positive samples, and the differentiated and/or hard negative samples as negative samples, and performing similarity detection on a target smart contract using the trained and optimized triplet network model.
The invention has the following beneficial effects:
The invention trains on data at the intermediate representation level and at basic block granularity, so it can measure the similarity of Solidity code across compiler versions with higher accuracy. Similar basic blocks under optimization are obtained by basic block labeling, which fills the gap in data sets for similarity research at the smart contract intermediate representation level. Smart contract basic blocks are embedded into a vector space using pre-training, negative samples are then drawn with the improved sampling scheme, and the samples are fed into a triplet network suited to smart contract similarity measurement, which produces good measurement results. The method can be applied to large-scale smart contract vulnerability mining, malicious contract detection, contract upgrade security detection and the like, and has good application prospects.
Description of the drawings:
FIG. 1 is a schematic flow chart of the Solidity smart contract similarity detection method in an embodiment;
FIG. 2 is a flow diagram of the similarity detection algorithm in an embodiment;
FIG. 3 is an example of similar basic block pairs in the smart contract intermediate representation, generated by compilation and basic block labeling, in an embodiment;
FIG. 4 is a schematic diagram of the normalized representation in an embodiment;
FIG. 5 is a schematic diagram of the Transformer structure and the machine learning model structure in an embodiment.
Detailed description:
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and technical solutions.
An embodiment of the invention provides a Solidity smart contract similarity detection method which, as shown in FIG. 1, comprises the following:
S101, collecting Ethereum smart contract source code of different versions, compiling it with and without compiler optimization to obtain two types of intermediate representation, and labeling the basic blocks of the two types of intermediate representation to obtain similar basic block pairs, each consisting of a basic block and its similar basic block, which form a smart contract similarity data set;
S102, generating basic block sequence vectors for the similarity data set through vector space embedding, and obtaining differentiated negative samples and hard negative samples of the basic blocks through a natural language processing model;
S103, constructing a triplet network model for measuring smart contract similarity, training and optimizing the triplet network model using the basic blocks in the similarity data set as anchors, the similar basic blocks as positive samples, and the differentiated and/or hard negative samples as negative samples, and performing similarity detection on a target smart contract using the trained and optimized triplet network model.
Training on data at the intermediate representation level and at basic block granularity makes it possible to measure the similarity of Solidity code across compiler versions with higher accuracy. Similar basic blocks under optimization are obtained by basic block labeling, which fills the gap in data sets for similarity research at the smart contract intermediate representation level. Smart contract basic blocks are embedded into a vector space using pre-training, negative samples are then drawn with the improved sampling scheme, and the samples are fed into a triplet network suited to smart contract similarity measurement, which produces good measurement results. The method can be applied to large-scale smart contract vulnerability mining, malicious contract detection, contract upgrade security detection, and the like.
Referring to the algorithm shown in FIG. 2, the algorithm has three purposes.
The first purpose is to provide a data set for smart contract similarity detection at the intermediate representation level. The data set contains smart contract basic blocks and their corresponding similar basic blocks across different versions; it fills the gap in data sets for Ethereum smart contract similarity research and provides training data for similarity detection with a machine learning model. Specifically: a certain number of Ethereum smart contracts of different versions are collected; each collected contract is compiled once with optimization options and once without, producing an optimized intermediate representation file and an unoptimized intermediate representation file; referring to FIG. 3 and FIG. 4, the instructions in the intermediate representation files are normalized and a basic block sequence is generated according to the basic block definition; and each basic block in the files is matched with its corresponding similar basic block (the positive sample) according to the basic block labels, and the matches are collected into the data set.
The second purpose is to provide a method for embedding smart contract basic blocks into a vector space using a pre-trained model and, on that basis, to improve the way negative samples are drawn so that the machine learning model performs better. Specifically: the basic block sequences in the data set generated for the first purpose are converted into tokens and embedded into a vector space using word2vec; referring to FIG. 5, the embedded basic block sequence vectors are fed into the natural language processing model Transformer for pre-training, which yields the vector space distance between unoptimized basic blocks, i.e. a similarity measure for unoptimized basic blocks; a basic block B1 is chosen and 100 smart contract basic blocks are randomly sampled; the pre-trained model measures the similarity between each of the 100 sampled basic blocks and B1; the 100 basic blocks are sorted by similarity from high to low; and the basic block at the tail of the ordering is taken as the differentiated negative sample and the basic block at the head as the hard negative sample. In this way a hard negative sample is obtained while the negative samples remain well differentiated, which helps the subsequent model train better. For the basic blocks in the data set generated for the first purpose, corresponding differentiated negative samples and hard negative samples are generated in a preset ratio, for example 3:1.
The third purpose is to construct a neural network model suitable for smart contract similarity detection. Specifically: the generated positive samples (the similar basic blocks from the first purpose), the negative samples (the differentiated or hard negative samples generated for the second purpose), and the basic blocks themselves are fed into a triplet network built from Transformers, and the trained model finally produces the smart contract similarity measurement by inference. In the implementation, the trained model is dynamically quantized to reduce its memory footprint and energy consumption and to shorten inference time. An Adam optimizer can be used during training, with a learning rate of 3e-5, a batch size of 32, 10 epochs, and a dropout ratio of 0.1. The hardware environment used for training with these hyperparameters was: CPU: Intel Xeon; graphics card: Nvidia RTX 2080 Ti × 4; memory: 192 GB; hard disk: 2 TB SSD. A more capable hardware environment is preferable where available.
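The training settings stated in this embodiment can be gathered into a small configuration object, sketched below. The numeric values are those given in the text; the class name, field names, and the `steps_per_epoch` helper are hypothetical, and reading the 3:1 ratio as differentiated : hard is an interpretation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters quoted in the embodiment (field names are illustrative)."""
    optimizer: str = "adam"
    learning_rate: float = 3e-5
    batch_size: int = 32
    epochs: int = 10
    dropout: float = 0.1
    negative_pool_size: int = 100   # basic blocks randomly sampled per anchor
    negative_ratio: tuple = (3, 1)  # assumed order: differentiated : hard

    def steps_per_epoch(self, num_triplets):
        """Number of batches needed to cover all triplets once (ceiling division)."""
        return -(-num_triplets // self.batch_size)


cfg = TrainingConfig()
steps = cfg.steps_per_epoch(100)
```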
After an unknown smart contract is compiled to obtain its intermediate representation and the basic block sequences of that representation are embedded into the vector space, the model of the invention can infer its similarity to known smart contracts.
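The patent does not say how block-level similarities are combined into a contract-level score. The sketch below shows one possible aggregation, offered only as an assumption: for each basic block of the unknown contract, take its best cosine match among the known contract's blocks, then average those best matches. The vectors are plain Python lists standing in for the embeddings produced by the trained model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def contract_similarity(unknown_blocks, known_blocks):
    """Average, over the unknown contract's blocks, of the best cosine
    match found in the known contract (assumed aggregation rule)."""
    if not unknown_blocks or not known_blocks:
        return 0.0
    best = [max(cosine(u, k) for k in known_blocks) for u in unknown_blocks]
    return sum(best) / len(best)


unknown = [[1.0, 0.0], [0.0, 1.0]]
identical = contract_similarity(unknown, [[1.0, 0.0], [0.0, 1.0]])
partial = contract_similarity(unknown, [[1.0, 0.0]])
```

Note that this score is asymmetric: swapping the two contracts can change the result, so a symmetric variant may be preferable depending on the use case.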
Vulnerability detection is one application of similarity detection. In a test, 100 randomly selected unknown smart contracts were compared for similarity with smart contracts known to contain vulnerabilities. The results show that unknown contracts highly similar to a known vulnerable contract have the same vulnerability, with an accuracy of up to 97%.
Further, based on the foregoing method, an embodiment of the present invention also provides a Solidity smart contract similarity detection system, comprising a data collection module, a sample construction module, and a training detection module, wherein:
the data collection module is used for collecting Ethereum smart contract source code of different versions, compiling it with and without compiler optimization to obtain two types of intermediate representation, and labeling the basic blocks of the two types of intermediate representation to obtain similar basic block pairs, each consisting of a basic block and its similar basic block, which form a smart contract similarity data set;
the sample construction module is used for generating basic block sequence vectors for the similarity data set through vector space embedding, and obtaining differentiated negative samples and hard negative samples of the basic blocks through a natural language processing model;
the training detection module is used for constructing a triplet network model for measuring smart contract similarity, training and optimizing the triplet network model using the basic blocks in the similarity data set as anchors, the similar basic blocks as positive samples, and the differentiated and/or hard negative samples as negative samples, and performing similarity detection on a target smart contract using the trained and optimized triplet network model.
Unless specifically stated otherwise, the relative steps, numerical expressions, and values of the components and steps set forth in these embodiments do not limit the scope of the present invention.
Based on the foregoing system, an embodiment of the present invention further provides a server, including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described above.
Based on the system, the embodiment of the invention further provides a computer readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method.
The device provided by the embodiment of the present invention has the same implementation principle and technical effect as the system embodiment, and for the sake of brief description, reference may be made to the corresponding content in the system embodiment for the part where the device embodiment is not mentioned.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing system embodiments, and are not described herein again.
In all examples shown and described herein, any particular value should be construed as merely exemplary, and not as a limitation, and thus other examples of example embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical division, and other divisions are possible in an actual implementation; a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some communication interface, and may be electrical, mechanical, or of another form.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention. They illustrate the technical solutions of the present invention and do not limit them, and the protection scope of the present invention is not limited to them. Although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the technical scope disclosed herein, anyone skilled in the art can still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes, or substitutions do not depart from the spirit and scope of the embodiments of the present invention and should be construed as falling within its protection scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for detecting similarity of intelligent contracts based on identity is characterized by comprising the following steps:
collecting intelligent contract source codes of different versions of the ethereals, respectively obtaining two types of intermediate representations through compiling and compiling optimization, and obtaining similar basic block pairs consisting of basic blocks and similar basic blocks through marking the two types of intermediate representations to form an intelligent contract similarity data set;
generating a basic block sequence vector by embedding a vector space aiming at the similarity data set, and acquiring a differential negative sample and a hard negative sample of a basic block through a natural language processing model;
constructing a triple network model for carrying out similarity measurement on the intelligent contract, training and optimizing the triple network model by taking a basic block in a similarity data set as an anchor and a similar basic block as a positive sample and taking a differential negative sample and/or a hard negative sample as a negative sample, and carrying out similarity detection on the target intelligent contract based on the training and optimized triple network model.
2. The Solidity smart contract similarity detection method according to claim 1, wherein, in the basic block marking, the instructions in the intermediate representations are first normalized and basic block sequences are generated according to the basic block definition; the basic blocks and similar basic blocks in the intermediate representations are then matched and aggregated according to the basic block marks to obtain the similarity data set.
3. The Solidity smart contract similarity detection method according to claim 1 or 2, wherein, in the vector space embedding, the basic blocks in the data set are first converted into tokens, and word2vec is then used for vector space embedding to generate the basic block sequence vectors.
4. The Solidity smart contract similarity detection method according to claim 1, wherein the natural language processing model is a Transformer model, the Transformer model is used to obtain vector space distances that measure the similarity between basic blocks, and the similarities are sorted from high to low to obtain the differential negative samples and the hard negative samples.
5. The Solidity smart contract similarity detection method according to claim 1 or 4, wherein, for each basic block in the data set, the corresponding differential negative samples and hard negative samples are generated by the natural language processing model according to a preset proportion.
6. The Solidity smart contract similarity detection method according to claim 1, wherein the triplet network model is built from Transformer models, and the training error is computed using a triplet loss function.
7. The Solidity smart contract similarity detection method according to claim 1, wherein, in training and optimizing the triplet network model, an Adam optimizer is used to optimize the model parameters.
8. A Solidity smart contract similarity detection system, comprising: a data collection module, a sample construction module and a training detection module, wherein,
the data collection module is used for collecting source code of different versions of Ethereum smart contracts, obtaining two types of intermediate representation through compilation and through optimized compilation, respectively, and marking the two types of intermediate representation to obtain similar basic block pairs, each consisting of a basic block and its similar basic block, thereby forming a smart contract similarity data set;
the sample construction module is used for generating basic block sequence vectors for the similarity data set through vector space embedding, and obtaining differential negative samples and hard negative samples of the basic blocks through a natural language processing model;
the training detection module is used for constructing a triplet network model for measuring smart contract similarity, training and optimizing the triplet network model with a basic block in the similarity data set as the anchor, its similar basic block as the positive sample, and a differential negative sample and/or a hard negative sample as the negative sample, and performing similarity detection on a target smart contract based on the trained and optimized triplet network model.
9. A server, comprising: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to perform the method of any one of claims 1 to 7.
10. A computer-readable medium, on which a computer program for execution by a processor is stored, the computer program being adapted to perform the method of any of claims 1 to 7.
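
For illustration only, the negative-sample selection and the triplet loss recited in the method claims can be pictured with the minimal Python sketch below. It is a sketch under stated assumptions, not the claimed implementation: plain Euclidean distance on toy two-dimensional vectors stands in for the distance obtained from the Transformer model, the helper names (`euclidean`, `select_negatives`, `triplet_loss`) and the parameters `total`, `hard_ratio` and `margin` are hypothetical, and it assumes that hard negatives are the candidates closest to the anchor while differential negatives are the farthest ones.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def select_negatives(anchor, candidates, total=2, hard_ratio=0.5):
    """Split negative candidates into hard and differential negatives.

    Candidates are ranked from most similar (smallest distance) to least
    similar. `hard_ratio` is a hypothetical stand-in for a preset proportion.
    """
    ranked = sorted(candidates, key=lambda c: euclidean(anchor, c))
    total = min(total, len(ranked))
    n_hard = int(total * hard_ratio)
    n_diff = total - n_hard
    hard = ranked[:n_hard]                        # closest to the anchor
    differential = ranked[len(ranked) - n_diff:]  # farthest from the anchor
    return hard, differential


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)


# Toy embeddings (assumed values, not data from the patent).
anchor = [0.0, 0.0]
positive = [0.1, 0.0]
candidates = [[0.5, 0.0], [3.0, 4.0], [1.0, 1.0]]

hard, differential = select_negatives(anchor, candidates)
loss_hard = triplet_loss(anchor, positive, hard[0])
loss_diff = triplet_loss(anchor, positive, differential[0])
print(loss_hard, loss_diff)
```

In this toy setting the hard negative lies within the margin and therefore produces a positive loss, whereas the differential negative is already far from the anchor and contributes zero loss, which is the usual motivation for mixing both kinds of negative sample during training.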
CN202110420735.XA 2021-04-19 2021-04-19 Method and system for detecting similarity of Solidity smart contracts Active CN113268732B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110420735.XA CN113268732B (en) 2021-04-19 2021-04-19 Method and system for detecting similarity of Solidity smart contracts

Publications (2)

Publication Number Publication Date
CN113268732A CN113268732A (en) 2021-08-17
CN113268732B CN113268732B (en) 2022-12-20

Family

ID=77228987

Country Status (1)

Country Link
CN (1) CN113268732B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9659248B1 (en) * 2016-01-19 2017-05-23 International Business Machines Corporation Machine learning and training a computer-implemented neural network to retrieve semantically equivalent questions using hybrid in-memory representations
CN110737899A (en) * 2019-09-24 2020-01-31 暨南大学 Machine learning-based smart contract security vulnerability detection method
CN110782346A (en) * 2019-10-09 2020-02-11 山东科技大学 Smart contract classification method based on keyword feature extraction and attention
CN110993044A (en) * 2019-11-28 2020-04-10 周口师范学院 Lightweight dynamic autonomous cross-chain interaction method for medical consortium chain
CN111125716A (en) * 2019-12-19 2020-05-08 中国人民大学 Method and device for detecting Ethereum smart contract vulnerabilities
US20200193064A1 (en) * 2019-05-15 2020-06-18 Alibaba Group Holding Limited Blockchain-based copyright distribution
CN111309305A (en) * 2020-02-12 2020-06-19 扬州大学 Smart contract-oriented automatic code recommendation method, system, computer equipment and storage medium
CN112115472A (en) * 2020-08-12 2020-12-22 北京智融云河科技有限公司 Data management and control oriented smart contract code checking method and system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
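Zhipeng Gao et al.: "SmartEmbed: A Tool for Clone and Bug Detection in Smart Contracts through Structural Code Embedding", 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME) *
Ni Yuandong et al.: "A Survey of Research on Smart Contract Security Vulnerabilities", Journal of Cyber Security *
Liu Yunxia et al.: "Research on a Loosely Coupled Model for On-Chain Upgrading of Smart Contracts", Application Research of Computers *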

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113689188A (en) * 2021-08-23 2021-11-23 交通银行股份有限公司 Decentralized information management system and method based on Ethereum smart contracts
CN116028596A (en) * 2023-03-27 2023-04-28 云筑信息科技(成都)有限公司 Method for realizing entity matching blocking
CN116028596B (en) * 2023-03-27 2023-08-18 云筑信息科技(成都)有限公司 Method for realizing entity matching blocking

Also Published As

Publication number Publication date
CN113268732B (en) 2022-12-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant