CN112579463B - Solidity intelligent contract-oriented defect prediction method - Google Patents

Solidity intelligent contract-oriented defect prediction method

Info

Publication number
CN112579463B
Authority
CN
China
Prior art keywords
defect
solidity
prediction
information
regression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011562073.1A
Other languages
Chinese (zh)
Other versions
CN112579463A (en
Inventor
杨慧文
崔展齐
贾明华
刘秀磊
刘建宾
郑丽伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dabu Technology Beijing Co ltd
Original Assignee
Dabu Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dabu Technology Beijing Co ltd
Priority to CN202011562073.1A
Publication of CN112579463A
Application granted
Publication of CN112579463B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G06F11/3608 Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Factory Administration (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a defect prediction method for Solidity smart contracts, applied in the technical field of software defect prediction. First, metrics of code modules are extracted from Solidity source code and each code module is labeled with its number of defects, thereby constructing a defect prediction dataset. Then, to address the class imbalance problem in the Solidity defect prediction dataset, the data are preprocessed with an oversampling method. Finally, a defect number prediction model and a defect tendency prediction model are constructed separately, and the performance of the models is evaluated. By combining the metric set with Solidity smart contract defect detection results to construct a Solidity smart contract defect prediction dataset, the invention describes the characteristics of Solidity smart contracts better and, based on this dataset, verifies the performance differences of different models on the defect number prediction and defect tendency prediction problems respectively.

Description

Solidity intelligent contract-oriented defect prediction method
Technical Field
The invention relates to the technical field of software defect prediction, and in particular to a defect prediction method for Solidity smart contracts.
Background
Blockchain is the core supporting technology of digital cryptocurrency systems, of which Bitcoin is the representative example. The core advantage of blockchain technology is decentralization, which offers a solution to the high cost, low efficiency, and insecure data storage found in centralized institutions. Research on and application of blockchain technology are growing explosively; government departments, financial institutions, technology companies, and capital markets are all exploring ways to solve practical problems with it.
A smart contract is a core component of the blockchain: a digital protocol that encodes contract terms as algorithms and programs, runs on the blockchain, and executes automatically according to its rules. Smart contracts were first proposed in 1994 and have attracted wide attention with the advent of blockchain technology. Writing smart contracts makes more complex applications possible, thereby extending the functions of the blockchain. At present, smart contracts play a role in traditional financial assets, asset management in social systems, contract management, and similar areas, for example equity crowdfunding and voting agreements built on smart contracts.
While extending the functions of the blockchain, smart contracts also bring potential security risks, and defects in smart contracts can cause huge property losses. For example, in November 2017 the Parity wallet was attacked, freezing Ether worth about 285 million US dollars; in 2016, about 3 million Ether were illegally transferred from TheDAO, then the largest crowdfunding project. Unlike traditional software, a smart contract is very difficult to patch after deployment, so quality assurance techniques for smart contracts have received wide attention in industry and academia.
Software defect prediction is an effective complement to defect detection. By analyzing software code or the development process, designing defect-related metrics, and applying methods such as machine learning, it predicts the defect tendency or the number of defects of software modules. The prediction results can be used to optimize the allocation of defect detection resources, or to judge whether the system has been tested sufficiently, which serves as a basis for deciding whether the software can be delivered and thus promotes software quality.
To our knowledge, however, defect prediction has not yet been studied in the field of smart contracts. Applying software defect prediction techniques to smart contracts faces the following challenges:
No defect prediction dataset for smart contracts exists.
Existing metric sets focus on code complexity and the characteristics of object-oriented programs, whereas smart contracts are a new kind of program that handles monetary transactions, and no dedicated metric set currently describes their characteristics.
Therefore, how to provide a defect prediction method for Solidity smart contracts is a problem that needs to be solved by those skilled in the art.
Disclosure of Invention
In view of this, the present invention provides a defect prediction method for Solidity smart contracts. It combines a metric set with Solidity defect detection results to construct a Solidity smart contract defect prediction dataset, so that the characteristics of Solidity smart contracts can be better described, and, based on this dataset, verifies the performance differences of different models on the defect number prediction and defect tendency prediction problems respectively. For the defect tendency prediction problem, it further analyzes whether processing the class-imbalanced dataset with an oversampling technique improves prediction performance.
In order to achieve the above object, the present invention provides the following technical solutions:
A defect prediction method for Solidity smart contracts comprises the following specific steps:
extracting metric information from the source code, performing defect detection on the source code to obtain Solidity defect detection information, and combining the two kinds of information per contract/library to form a defect dataset;
and making predictions for the Solidity smart contracts to be predicted by using a regression model and a classification model.
Preferably, in the above defect prediction method for Solidity smart contracts, the metric information includes: the functions, methods, variable types, and attributes of Solidity smart contracts, and Solidity language constraints.
Preferably, in the above defect prediction method for Solidity smart contracts, the specific step of extracting the metric information includes: combining the CKBD metric information, which covers object-oriented features and code complexity, with the SC-Sol metric information, which covers the functions, methods, variable types, and attributes of Solidity smart contracts and Solidity language constraints, to obtain the CKBD-SC-Sol metric set.
Preferably, in the above defect prediction method for Solidity smart contracts, the specific step of obtaining defect information includes: according to the Solidity smart contract defect detection information, organizing the detection results into the number of defects of each type contained in each contract/library; for the defect number dataset, the defect number of each contract/library is the sum of the defect numbers of all types; for the defect tendency dataset, the number of defects of each type is binarized, i.e., a contract/library whose number of defects is greater than 1 is labeled 1, and otherwise 0.
Preferably, in the above defect prediction method for Solidity smart contracts, the defect number of Solidity smart contracts is predicted by using a regression model; the regression model is one of linear regression, Bayesian ridge regression, decision tree regression, random forest regression, K-nearest-neighbor regression, gradient boosting regression, support vector machine regression, and the like.
Preferably, in the above defect prediction method for Solidity smart contracts, the defect tendency of Solidity smart contracts is predicted by using a classification model; the classification model is one of a Bernoulli Bayes classifier, a Gaussian Bayes classifier, a K-nearest-neighbor classifier, a decision tree classifier, a random forest classifier, a support vector machine classifier, and the like.
Compared with the prior art, the defect prediction method for Solidity smart contracts provided by the invention first extracts the metrics of code modules from Solidity source code and labels each code module with its number of defects, thereby constructing a defect prediction dataset; then, to address the class imbalance problem in the Solidity defect prediction dataset, it preprocesses the data with an oversampling method; finally, it constructs a defect number prediction model and a defect tendency prediction model separately and evaluates the performance of the models. The invention combines Solidity defect detection information to construct a Solidity smart contract defect prediction dataset, can better describe the characteristics of Solidity smart contracts, and, based on this dataset, verifies the performance differences of different models on the defect number prediction and defect tendency prediction problems respectively.
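The oversampling step can be illustrated with a minimal sketch. The text does not say which oversampling method is used, so plain random oversampling (duplicating minority-class rows) is assumed here; the function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows of this class form the pool to draw copies from.
        pool = [row for row, label in zip(rows, labels) if label == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Made-up metric rows: four defect-free units and one defective unit.
X = [[1, 0], [2, 1], [3, 0], [4, 1], [9, 5]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

In practice such resampling is usually applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.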
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
A defect prediction method for Solidity smart contracts, as shown in FIG. 1, comprises the following specific steps:
extracting metric information from the source code, performing defect detection on the source code to obtain defect detection information, and combining the two kinds of information per contract/library to form a defect prediction dataset;
and making predictions for the Solidity smart contracts to be predicted by using a regression model and a classification model.
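The per-contract/library combination in the first step amounts to a keyed join. The sketch below is illustrative only: the key layout (file name plus contract/library name), the field names, and the choice to give units without any reported defect a count of zero are all assumptions, not details stated in the text.

```python
def build_defect_dataset(metrics, defects):
    """Join metric rows with defect counts on a (file, unit name) key,
    where a unit is one contract or library."""
    dataset = []
    for key, metric_row in sorted(metrics.items()):
        row = {"file": key[0], "unit": key[1]}
        row.update(metric_row)
        # Assumption: a unit absent from the defect report has zero defects.
        row["defect_count"] = defects.get(key, 0)
        dataset.append(row)
    return dataset


# Hypothetical metric values and defect counts for two units in one file.
metrics = {
    ("Token.sol", "Token"): {"functions": 5, "payable_functions": 1},
    ("Token.sol", "SafeMath"): {"functions": 4, "payable_functions": 0},
}
defects = {("Token.sol", "Token"): 3}

dataset = build_defect_dataset(metrics, defects)
for row in dataset:
    print(row)
```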
Specifically, first, the CKBD-SC-Sol metric information is extracted from the source code and combined with the Solidity defect detection results to form the Solidity smart contract defect dataset.
Second, for the defect number prediction problem of Solidity smart contracts, 7 regression models are applied: linear regression, Bayesian ridge regression, decision tree regression, random forest regression, K-nearest-neighbor regression, gradient boosting regression, and support vector machine regression.
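A minimal sketch of how the seven regression model families could be compared, assuming scikit-learn is available. The text names no library, hyperparameters, or evaluation measure; the estimator settings, the made-up toy data, and the use of mean absolute error are all assumptions.

```python
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import BayesianRidge, LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


def make_regressors(seed=0):
    """Return one estimator per regression model family named in the text."""
    return {
        "linear regression": LinearRegression(),
        "Bayesian ridge regression": BayesianRidge(),
        "decision tree regression": DecisionTreeRegressor(random_state=seed),
        "random forest regression": RandomForestRegressor(
            n_estimators=50, random_state=seed
        ),
        "K-nearest-neighbor regression": KNeighborsRegressor(n_neighbors=3),
        "gradient boosting regression": GradientBoostingRegressor(random_state=seed),
        "support vector machine regression": SVR(),
    }


# Toy data (made up): each row is [functions, payable_functions] for one
# contract/library; the target is its total number of defects.
X_train = [[1, 0], [2, 0], [3, 1], [4, 1], [6, 2], [8, 3], [9, 3], [12, 5]]
y_train = [0, 0, 1, 1, 2, 3, 3, 5]
X_test = [[2, 1], [7, 2], [10, 4]]
y_test = [1, 2, 4]

mae = {}
for name, model in make_regressors().items():
    model.fit(X_train, y_train)
    mae[name] = mean_absolute_error(y_test, model.predict(X_test))

for name, score in sorted(mae.items(), key=lambda item: item[1]):
    print(f"{name}: MAE = {score:.2f}")
```

With a dataset this small the ranking is meaningless; the point is only the shape of the comparison loop.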
Third, for the defect tendency prediction problem of Solidity smart contracts, 6 classification models are applied: a Bernoulli Bayes classifier, a Gaussian Bayes classifier, a K-nearest-neighbor classifier, a decision tree classifier, a random forest classifier, and a support vector machine classifier.
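The six classifier families can be compared in the same way. This sketch again assumes scikit-learn, whose `BernoulliNB` and `GaussianNB` are naive Bayes implementations; whether the text means exactly these variants is an assumption, as are the toy data and the choice of F1 as the evaluation measure.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.naive_bayes import BernoulliNB, GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_classifiers(seed=0):
    """Return one estimator per classifier family named in the text."""
    return {
        "Bernoulli Bayes classifier": BernoulliNB(),
        "Gaussian Bayes classifier": GaussianNB(),
        "K-nearest-neighbor classifier": KNeighborsClassifier(n_neighbors=3),
        "decision tree classifier": DecisionTreeClassifier(random_state=seed),
        "random forest classifier": RandomForestClassifier(
            n_estimators=50, random_state=seed
        ),
        "support vector machine classifier": SVC(),
    }


# Toy data (made up): metric rows with a binary defect tendency label.
X_train = [[1, 0], [2, 0], [3, 1], [4, 1], [6, 2], [8, 3], [9, 3], [12, 5]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
X_test = [[2, 1], [7, 2], [10, 4]]
y_test = [0, 1, 1]

f1 = {}
for name, model in make_classifiers().items():
    model.fit(X_train, y_train)
    f1[name] = f1_score(y_test, model.predict(X_test), zero_division=0)

for name, score in sorted(f1.items(), key=lambda item: -item[1]):
    print(f"{name}: F1 = {score:.2f}")
```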
Further, the metric information includes: object-oriented features, code complexity, the functions, methods, variable types, and attributes of Solidity smart contracts, and Solidity language constraints.
To further optimize the technical solution, the specific step of extracting the metric information is as follows: the CKBD metric information, which covers object-oriented features and code complexity, is combined with the SC-Sol metric information, which covers the functions, methods, variable types, and attributes of Solidity smart contracts and Solidity language constraints, to obtain the CKBD-SC-Sol metric set.
Further, since no defect prediction dataset for Solidity smart contracts exists yet, to construct a defect prediction model the source code of Solidity smart contracts is first obtained from Xblock and Etherscan; the CKBD-SC-Sol metric set is then extracted from the Solidity smart contracts using the AST analysis tool solidity-parser-antlr; and the extracted CKBD-SC-Sol metric information is combined with the corresponding defect detection information to construct the Solidity defect prediction dataset.
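To give a feel for per-contract/library metric extraction, here is a deliberately crude sketch. It is not the AST-based extraction with solidity-parser-antlr that the text describes: it uses regular expressions and brace matching from the Python standard library, ignores comments and string literals, and the four metric names are hypothetical stand-ins, since the text does not enumerate the CKBD-SC-Sol metrics.

```python
import re

# Matches the header of a contract or library up to its opening brace.
UNIT_RE = re.compile(r"\b(contract|library)\s+([A-Za-z_]\w*)[^{;]*\{")


def split_units(source):
    """Yield (kind, name, body) for each contract/library block,
    locating the end of the block by counting braces."""
    for match in UNIT_RE.finditer(source):
        depth, i = 1, match.end()
        while i < len(source) and depth > 0:
            if source[i] == "{":
                depth += 1
            elif source[i] == "}":
                depth -= 1
            i += 1
        yield match.group(1), match.group(2), source[match.end():i - 1]


def extract_metrics(source):
    """Count a few keyword-based features per unit (hypothetical metrics)."""
    metrics = {}
    for kind, name, body in split_units(source):
        metrics[name] = {
            "kind": kind,
            "functions": len(re.findall(r"\bfunction\b", body)),
            "payable": len(re.findall(r"\bpayable\b", body)),
            "modifiers": len(re.findall(r"\bmodifier\s+\w+", body)),
            "events": len(re.findall(r"\bevent\s+\w+", body)),
        }
    return metrics


SOURCE = """
pragma solidity ^0.6.0;

library SafeMath {
    function add(uint a, uint b) internal pure returns (uint) {
        return a + b;
    }
}

contract Wallet is Ownable {
    event Deposited(address from, uint amount);
    modifier onlyOwner() { require(msg.sender == owner); _; }
    function deposit() public payable {
        emit Deposited(msg.sender, msg.value);
    }
    function withdraw(uint amount) public onlyOwner {
        msg.sender.transfer(amount);
    }
}
"""

unit_metrics = extract_metrics(SOURCE)
for name, row in unit_metrics.items():
    print(name, row)
```

A real extractor would walk the AST so that keywords inside comments or strings are not miscounted and so that inheritance and type information are available.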
To further optimize the technical solution, the specific steps of obtaining defect information are as follows: a blockchain smart contract detection platform takes Solidity source code or an Ethereum contract address as input and, after analysis and detection, outputs the defect types and the corresponding code line numbers; the defect report output by the platform is organized, by defect type, into the number of defects of each type contained in each contract/library; for the defect number dataset, the defect number of each contract/library is the sum of the defect numbers of all types; for the defect tendency dataset, the number of defects of each type is binarized, i.e., a contract/library whose number of defects is greater than 1 is labeled 1, and otherwise 0.
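The labeling rules above can be sketched as follows. The report layout (unit name, defect type, line number) and the defect type names are hypothetical. The default cutoff follows the "greater than 1" wording literally; since the text does not make clear whether "at least 1" is meant, the cutoff is exposed as a parameter.

```python
from collections import Counter, defaultdict


def count_defects(report):
    """Group (unit, defect_type, line) findings into per-unit counts by type."""
    per_unit = defaultdict(Counter)
    for unit, defect_type, _line in report:
        per_unit[unit][defect_type] += 1
    return per_unit


def number_label(type_counts):
    """Defect number label: the sum of the counts over all defect types."""
    return sum(type_counts.values())


def tendency_labels(type_counts, defect_types, threshold=1):
    """Binarize each type's count: 1 if the count exceeds the threshold, else 0."""
    return {t: int(type_counts.get(t, 0) > threshold) for t in defect_types}


# Hypothetical detector output: (contract/library, defect type, code line).
report = [
    ("Wallet", "reentrancy", 12),
    ("Wallet", "reentrancy", 30),
    ("Wallet", "timestamp-dependence", 44),
    ("SafeMath", "integer-overflow", 7),
]
defect_types = ["reentrancy", "timestamp-dependence", "integer-overflow"]

per_unit = count_defects(report)
wallet_total = number_label(per_unit["Wallet"])
wallet_labels = tendency_labels(per_unit["Wallet"], defect_types)
print(wallet_total, wallet_labels)
```

Calling `tendency_labels(..., threshold=0)` instead marks a type as present as soon as one defect of that type is reported.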
In this specification, the embodiments are described progressively; each embodiment focuses on its differences from the others, and identical or similar parts of the embodiments may be understood by reference to one another. Since the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief, and the relevant details can be found in the description of the method.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (4)

1. A defect prediction method for Solidity smart contracts, characterized by comprising the following specific steps:
extracting metric information from the source code, obtaining defect information, and combining the two to form a defect dataset; the metric information includes: object-oriented features, code complexity, the functions, methods, variable types, and attributes of Solidity smart contracts, and Solidity language constraints; the specific steps of obtaining defect information are as follows: according to the Solidity defect detection information, organizing the detection results, by defect type, into the number of defects of each type contained in each contract/library; for the defect number dataset, the defect number of each contract/library is the sum of the defect numbers of all types; for the defect tendency dataset, binarizing the defect number of each type, namely labeling a contract/library whose defect number is greater than 1 as 1, and otherwise as 0; obtaining the source code of Solidity smart contracts from Xblock and Etherscan, extracting the CKBD-SC-Sol metric set from the Solidity smart contracts by using the AST analysis tool solidity-parser-antlr, and combining the extracted CKBD-SC-Sol metric information with the corresponding defect detection information to construct a Solidity defect prediction dataset;
and making predictions for the Solidity smart contracts to be predicted by using a regression model and a classification model.
2. The defect prediction method for Solidity smart contracts according to claim 1, wherein the specific step of extracting the metric information is as follows: the CKBD metric information, which covers object-oriented features and code complexity, is combined with the SC-Sol metric information, which covers the functions, methods, variable types, and attributes of Solidity smart contracts and Solidity language constraints, to obtain the CKBD-SC-Sol metric set.
3. The defect prediction method for Solidity smart contracts according to claim 1, wherein the defect number of Solidity smart contracts is predicted by using a regression model; the regression model is one of linear regression, Bayesian ridge regression, decision tree regression, random forest regression, and K-nearest-neighbor regression.
4. The defect prediction method for Solidity smart contracts according to claim 1, wherein the defect tendency of Solidity smart contracts is predicted by using a classification model; the classification model is one of a Bernoulli Bayes classifier, a Gaussian Bayes classifier, a K-nearest-neighbor classifier, a decision tree classifier, a random forest classifier, and a support vector machine classifier.
CN202011562073.1A 2020-12-25 2020-12-25 Solidity intelligent contract-oriented defect prediction method Active CN112579463B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011562073.1A CN112579463B (en) 2020-12-25 2020-12-25 Solidity intelligent contract-oriented defect prediction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011562073.1A CN112579463B (en) 2020-12-25 2020-12-25 Solidity intelligent contract-oriented defect prediction method

Publications (2)

Publication Number Publication Date
CN112579463A (en) 2021-03-30
CN112579463B (en) 2024-05-24

Family

ID=75139676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011562073.1A Active CN112579463B (en) 2020-12-25 2020-12-25 Solidity intelligent contract-oriented defect prediction method

Country Status (1)

Country Link
CN (1) CN112579463B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114331396A (en) * 2021-12-28 2022-04-12 中国科学技术大学 Automatic protocol security attribute extraction method and system for Ethereum smart contract
CN114510431B (en) * 2022-04-20 2022-07-05 武汉理工大学 Workload-aware intelligent contract defect prediction method, system and equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106201871A (en) * 2016-06-30 2016-12-07 重庆大学 Software defect prediction method based on cost-sensitive semi-supervised learning
CN108664402A (en) * 2018-05-14 2018-10-16 北京航空航天大学 A defect prediction method based on software network feature learning
CN109977682A (en) * 2019-04-01 2019-07-05 中山大学 Blockchain smart contract vulnerability detection method and device based on deep learning
CN110543419A (en) * 2019-08-28 2019-12-06 杭州趣链科技有限公司 Smart contract code vulnerability detection method based on deep learning technology
CN111240993A (en) * 2020-01-20 2020-06-05 北京航空航天大学 Software defect prediction method based on module dependency graph
CN111339535A (en) * 2020-02-17 2020-06-26 扬州大学 Vulnerability prediction method and system for intelligent contract codes, computer equipment and storage medium
CN111506504A (en) * 2020-04-13 2020-08-07 扬州大学 Software development process measurement-based software security defect prediction method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020018921A1 (en) * 2018-07-20 2020-01-23 Coral Protocol Blockchain transaction safety using smart contracts

Also Published As

Publication number Publication date
CN112579463A (en) 2021-03-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220920

Address after: Room 3006, Building 2, Tianchang Garden, No. 34 Beiyuan Road, Chaoyang District, Beijing 100000

Applicant after: Dabu Technology (Beijing) Co.,Ltd.

Address before: 100192 Beijing city Haidian District Qinghe small Camp Road No. 12

Applicant before: BEIJING INFORMATION SCIENCE AND TECHNOLOGY University

GR01 Patent grant