CN117725594A - Multiple composite detection method, device, equipment and storage medium of intelligent contract - Google Patents

Info

Publication number
CN117725594A
Authority
CN
China
Prior art keywords
vulnerability
detection
initial
list
detector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311765424.2A
Other languages
Chinese (zh)
Inventor
蔡承均
康嘉文
黄育城
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
City University Of Hong Kong Dongguan Preparatory
Original Assignee
City University Of Hong Kong Dongguan Preparatory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure relates to a multiple composite detection method, apparatus, device and storage medium for intelligent contracts, the method comprising: acquiring an intelligent contract to be detected; detecting the intelligent contract to be detected with each of a plurality of detection tools to obtain a detection result file generated by each detection tool; performing data integration processing on the plurality of detection result files to obtain an initial vulnerability list; and verifying the initial vulnerability list with a large language model to obtain a target vulnerability list. By detecting the intelligent contract with a plurality of detection tools, the method addresses the technical problems that a single detection tool is strongly biased toward particular vulnerability types and that its vulnerability detection coverage is low. The large language model verifies the detection results output by the detection tools and, owing to its broad coverage, can further reduce the false positive rate and the false negative rate of vulnerability detection, thereby improving the security of the intelligent contract.

Description

Multiple composite detection method, device, equipment and storage medium of intelligent contract
Technical Field
The disclosure relates to the field of computer technology, and in particular, to a multiple composite detection method, device, equipment and storage medium for intelligent contracts.
Background
With the rapid development of fields such as healthcare, the Internet of Things, and financial information, the application of blockchain technology in these fields is gradually gaining acceptance. As a decentralized application on the blockchain, an intelligent contract can complete transparent, tamper-proof, and traceable information transfer and monetary transactions without third-party certification. However, intelligent contracts currently also face problems such as malicious attacks and code vulnerabilities.
In the related art, in order to improve the security of intelligent contracts and discover potential code vulnerabilities, intelligent contract detection tools are generally used to perform code detection on intelligent contracts. Each detection tool comprises a plurality of detectors, and each detector focuses on its own vulnerability type. However, detection tools differ considerably from one another, are strongly biased toward particular vulnerability types, and rely on expert-defined rules to detect code vulnerabilities. As the number and complexity of intelligent contracts increase, these tools suffer from high false positive and false negative rates, so developers cannot discover potential vulnerabilities in intelligent contracts in time, which further increases the security risk of intelligent contracts.
Disclosure of Invention
According to a first aspect of the present disclosure, there is provided a multiple composite detection method of an intelligent contract, including:
acquiring an intelligent contract to be detected;
detecting the intelligent contract to be detected by adopting a plurality of detection tools respectively to obtain detection result files respectively generated by each detection tool;
carrying out data integration processing on a plurality of detection result files to obtain an initial vulnerability list;
and verifying the initial vulnerability list by adopting a large language model to obtain a target vulnerability list.
According to a second aspect of the present disclosure, there is provided a multiple composite detection apparatus of an intelligent contract, comprising:
the data acquisition module is used for acquiring the intelligent contract to be detected;
the data processing module is used for respectively detecting the intelligent contracts to be detected by adopting a plurality of detection tools to obtain detection result files respectively generated by each detection tool;
the data processing module is further used for carrying out data integration processing on the plurality of detection result files to obtain an initial vulnerability list;
the data processing module is further used for verifying the initial vulnerability list by adopting a large language model to obtain a target vulnerability list.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
a processor; and
a memory storing a program;
wherein the program comprises instructions which, when executed by the processor, cause the processor to perform the method according to an exemplary embodiment of the present disclosure.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform a method according to an exemplary embodiment of the present disclosure.
According to the one or more technical solutions provided by the embodiments of the present disclosure, an intelligent contract to be detected is acquired, and a plurality of detection tools are each used to detect it, yielding a detection result file generated by each detection tool. Data integration processing is performed on the plurality of detection result files to obtain an initial vulnerability list, and the initial vulnerability list is verified with a large language model to obtain a target vulnerability list. By detecting the intelligent contract with a plurality of detection tools, the method addresses the technical problems that a single detection tool is strongly biased toward particular vulnerability types and that its vulnerability detection coverage is low. Verifying the detection results output by the detection tools with the large language model can further reduce the false positive rate and the false negative rate of vulnerability detection, thereby improving the security of the intelligent contract.
Drawings
Further details, features and advantages of the present disclosure are disclosed in the following description of exemplary embodiments, with reference to the following drawings, wherein:
FIG. 1 is an overall block diagram of a multiple composite detection method for smart contracts according to an exemplary embodiment of the present disclosure;
FIG. 2 is a flow chart of a multiple composite detection method for an intelligent contract according to an exemplary embodiment of the present disclosure;
FIG. 3 is a flow chart of a detection module provided by an example of the present disclosure;
FIG. 4 is a flow chart of a verification module provided by an example of the present disclosure;
FIG. 5 is a flow diagram of a method of multiple composite detection of an intelligent contract according to an exemplary embodiment of the present disclosure;
FIG. 6 is a schematic block diagram of functional blocks of a multiple composite detection device of an intelligent contract according to an exemplary embodiment of the present disclosure;
FIG. 7 is a schematic block diagram of a chip provided by an example of the present disclosure;
FIG. 8 is a block diagram illustrating an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure have been shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but are provided to provide a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "comprising" and variations thereof as used herein are open-ended, i.e., "including, but not limited to". The term "based on" means "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below. It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that the modifiers "one" and "a plurality of" in this disclosure are illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
It will be appreciated that prior to using the technical solutions disclosed in the embodiments of the present disclosure, the user should be informed and authorized of the type, usage range, usage scenario, etc. of the personal information related to the present disclosure in an appropriate manner according to the relevant legal regulations.
For example, in response to receiving an active request from a user, prompt information is sent to the user to state explicitly that the requested operation will require obtaining and using the user's personal information. The user can then decide autonomously, based on the prompt information, whether to provide personal information to the software or hardware, such as an electronic device, an application program, a server, or a storage medium, that performs the operations of the technical solution of the present disclosure.
As an alternative but non-limiting implementation, in response to receiving an active request from a user, the manner in which the prompt information is sent to the user may be, for example, a popup, in which the prompt information may be presented in a text manner. In addition, a selection control for the user to select to provide personal information to the electronic device in a 'consent' or 'disagreement' manner can be carried in the popup window. It will be appreciated that the above-described notification and user authorization process is merely illustrative and not limiting of the implementations of the present disclosure, and that other ways of satisfying relevant legal regulations may be applied to the implementations of the present disclosure.
Before describing embodiments of the present disclosure, the following definitions are first provided for the relative terms involved in the embodiments of the present disclosure:
intelligent contract: an intelligent contract is an automated contract or program that executes on a blockchain that contains predefined rules and logic for automatically performing operations defined in the contract when certain conditions are met. Intelligent contracts are a key feature of blockchain technology that enables programmable, non-tamperable contract execution in a decentralized manner.
Retrieval-augmented generation (RAG) method: RAG is a natural language processing technique that combines information retrieval with a generative model. It aims to improve the performance of the generative model on specific tasks by retrieving relevant information from a large knowledge base and integrating that information into the generative model.
With the rapid development of fields such as healthcare, the Internet of Things, and financial information, the application of blockchain technology in these fields is gradually gaining acceptance. As a decentralized application on the blockchain, an intelligent contract can complete transparent, tamper-proof, and traceable information transfer and monetary transactions without third-party certification. However, intelligent contracts currently also face problems such as malicious attacks and code vulnerabilities.
In the related art, in order to improve the security of intelligent contracts and discover potential code vulnerabilities, intelligent contract detection tools are generally used to perform code detection on intelligent contracts. However, these detection tools rely on expert-defined rules to detect code vulnerabilities, and as the number and complexity of intelligent contracts increase, they suffer from high false positive and false negative rates. In addition, the detection reports output by different detection tools have inconsistent formats and poor readability, so developers cannot discover potential vulnerabilities in intelligent contracts in time, which further increases the security risk of intelligent contracts.
Accordingly, in order to solve the above-mentioned problems, the embodiments of the present disclosure first provide a multiple composite detection method of an intelligent contract. Illustratively, FIG. 1 is an overall frame diagram of a multiple composite detection method of an intelligent contract according to an exemplary embodiment of the present disclosure. As shown in FIG. 1, the multiple composite detection method 100 of an intelligent contract may include a detection module 110, a verification module 120, and a report generation module 130. First, a plurality of detection tools are integrated into the detection module 110, where they are applied in turn to the intelligent contract to be detected over multiple rounds, yielding a plurality of detection result files. Data analysis processing is performed on the plurality of detection result files to obtain an initial vulnerability list. Subsequently, in the verification module 120, the initial vulnerability list is verified with a large language model to obtain a verified vulnerability list and corresponding modification suggestions. Finally, in the report generation module 130, the large language model rewrites the vulnerability list and the modification suggestions in plain natural language to improve their readability.
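As a non-limiting illustration, the three-module flow described above may be sketched in Python as follows. The function names, the `Finding` record, and the stand-in stages are hypothetical and chosen only for this sketch; the disclosure does not prescribe any particular interface.

```python
from typing import Callable, Dict, List

# Hypothetical aliases; the disclosure does not define concrete data types.
Finding = Dict[str, str]
DetectionTool = Callable[[str], List[Finding]]


def run_pipeline(
    source: str,
    tools: List[DetectionTool],
    integrate: Callable[[List[List[Finding]]], List[Finding]],
    verify: Callable[[str, List[Finding]], List[Finding]],
    report: Callable[[List[Finding]], str],
) -> str:
    """Detection module -> verification module -> report generation module."""
    result_files = [tool(source) for tool in tools]  # one result set per tool
    initial_list = integrate(result_files)           # initial vulnerability list
    verified_list = verify(source, initial_list)     # stands in for LLM verification
    return report(verified_list)                     # readable report text


# Stand-in stages so that the sketch runs end to end.
def tool_a(src: str) -> List[Finding]:
    return [{"type": "reentrancy", "detector": "A1"}]


def tool_b(src: str) -> List[Finding]:
    return [{"type": "timestamp dependency", "detector": "L1"}]


def integrate(files: List[List[Finding]]) -> List[Finding]:
    return [finding for file in files for finding in file]


def verify(src: str, findings: List[Finding]) -> List[Finding]:
    # Pretend only the reentrancy finding survives verification.
    return [f for f in findings if f["type"] == "reentrancy"]


def report(findings: List[Finding]) -> str:
    return "; ".join(f"{f['type']} ({f['detector']})" for f in findings)


summary = run_pipeline("contract C {}", [tool_a, tool_b], integrate, verify, report)
print(summary)
```

Passing the stages in as callables keeps each module replaceable, which matches the modular layout of FIG. 1.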
Illustratively, FIG. 2 is a flow chart of a multiple composite detection method of an exemplary smart contract provided by the present disclosure. As shown in fig. 2, the multiple composite detection method of the smart contract may include the steps of:
Step S210: classifying the detectors of the plurality of detection tools in the detection module.
Illustratively, although each detection tool comprises a plurality of detectors, detection tools differ considerably from one another and are strongly biased toward particular vulnerability types. Building several detection tools, such as Slither, Mythril, and Securify, into the detection module in advance mitigates these differences and lets the detection process cover a wider range of vulnerability types, which makes the detection of potential risks in intelligent contracts more comprehensive.
The detection module also illustratively provides a detection tool insertion function for expanding the detection tools within the detection module. The user is allowed to add a new detection tool into the detection module, so that the user can select and integrate the customized detection module suitable for the intelligent contract to be detected according to the specific requirement of the user, thereby better adapting to the specific development environment and safety requirement and coping with the continuous emergence of new loopholes and threats in the intelligent contract field.
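One possible way to realize such a detection tool insertion function is a simple registry, sketched below in Python. The registry, the decorator, and the tool names are hypothetical illustrations, and the tool bodies are placeholders rather than calls to any real detection tool.

```python
from typing import Callable, Dict, List

Finding = Dict[str, str]
ToolRunner = Callable[[str], List[Finding]]

# Hypothetical registry of detection tools available to the detection module.
TOOL_REGISTRY: Dict[str, ToolRunner] = {}


def register_tool(name: str) -> Callable[[ToolRunner], ToolRunner]:
    """Decorator that adds a detection tool to the registry under `name`."""
    def decorator(runner: ToolRunner) -> ToolRunner:
        if name in TOOL_REGISTRY:
            raise ValueError(f"tool {name!r} is already registered")
        TOOL_REGISTRY[name] = runner
        return runner
    return decorator


@register_tool("tool_a")
def run_tool_a(source: str) -> List[Finding]:
    # Placeholder for a built-in tool; a real runner would invoke the tool itself.
    return [{"detector": "A1", "description": "placeholder finding"}]


@register_tool("custom_tool")
def run_custom_tool(source: str) -> List[Finding]:
    # A user-added tool plugs in without changes to the detection module.
    return []


def run_all(source: str) -> Dict[str, List[Finding]]:
    """Run every registered tool and collect results keyed by tool name."""
    return {name: runner(source) for name, runner in TOOL_REGISTRY.items()}


results = run_all("contract C {}")
print(sorted(results))
```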
Because the emphasis points of the plurality of detectors in the detection tools are different, specific vulnerabilities aimed at are different, and in order to facilitate subsequent analysis and comparison of detection result files generated by different detection tools, classification processing can be performed on all detectors in all detection tools in the detection module.
For example, the detectors may be classified according to the type of vulnerability the detectors are interested in. The vulnerability type of interest to the detector is matched to the vulnerability type that has been determined to be categorized. If the detector is focused on a particular type of vulnerability, it is classified into a category corresponding to that vulnerability type.
In this embodiment, the vulnerability types can be classified into the following 14 categories: reentrancy attacks, integer overflows, access control, exception operations, denial of service, type mismatch, Ether freeze, short address attacks, Ether loss, call stack overflows, tx.origin misuse, timestamp dependencies, block parameter dependencies, and transaction order dependencies. These 14 vulnerability types and their corresponding detectors are classified and integrated to obtain a classification set a. The classification set a may include the detector names and the corresponding vulnerability type information. For example, the detectors corresponding to the reentrancy attack vulnerability are A1, A2, and A3, and the reentrancy attack vulnerability is numbered 1; the detectors corresponding to the integer overflow vulnerability are B1 and B2, and the integer overflow vulnerability is numbered 2; and so on, up to the transaction order dependency vulnerability, whose detectors are N1, N2, N3, and N4 and which is numbered 14. The classification set a can then be expressed as: {1, A1, A2, A3}, {2, B1, B2}, ..., {14, N1, N2, N3, N4}.
In practical application, the types of the 14 types of loopholes can be increased or decreased according to the intelligent contracts to be detected, so that a user can customize a detection module applicable to the intelligent contracts to be detected according to requirements.
Detectors that do not target any of the above 14 vulnerability types may each be placed in a category of their own.
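By way of example only, the classification of detectors by the vulnerability type they target may be sketched in Python as follows. The detector names A1 to N4 follow the example above; the detector X1, which targets none of the 14 types, and the mapping `DETECTOR_FOCUS` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Vulnerability type number targeted by each detector (None = none of the 14 types).
DETECTOR_FOCUS: Dict[str, Optional[int]] = {
    "A1": 1, "A2": 1, "A3": 1,            # type 1
    "B1": 2, "B2": 2,                     # type 2
    "N1": 14, "N2": 14, "N3": 14, "N4": 14,  # type 14
    "X1": None,                           # hypothetical detector outside the 14 types
}


def classify(focus: Dict[str, Optional[int]]) -> Tuple[Dict[int, List[str]], List[str]]:
    """Group detectors by vulnerability type number; collect the rest separately."""
    classification: Dict[int, List[str]] = {}
    standalone: List[str] = []
    for detector, type_no in focus.items():
        if type_no is None:
            standalone.append(detector)   # categorized on its own
        else:
            classification.setdefault(type_no, []).append(detector)
    return classification, standalone


classification_set, standalone_detectors = classify(DETECTOR_FOCUS)
print(classification_set)
print(standalone_detectors)
```

Keying the set by type number makes it straightforward to add or remove vulnerability types later.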
The detection tools are built into the detection module, and a new detection tool expansion inlet is provided, so that the complexity of subsequent expansion of the new detection tools is reduced, the maintenance cost of the detection module is reduced, the characteristics of each detection tool are reserved to the greatest extent, and the coverage range of vulnerability detection is enlarged.
Based on the method, the plurality of detectors in the detection module are classified, so that the type of the vulnerability focused by each detector can be known more clearly, and more accurate and detailed information is provided for users. The performance and coverage of the different detectors can also be intuitively compared. By comparing the vulnerability results of the same category, the advantages and disadvantages of each detector and the applicable scene can be better evaluated. In addition, the new detection tool expansion portal can improve the expansibility of the system, so that a user can select and integrate a customized detection module suitable for the intelligent contract to be detected according to the specific requirements of the user.
Step S220: performing unified data processing on the plurality of detectors.
First, the source code of the intelligent contract to be detected is checked by each of the detection tools in the detection module in turn, yielding a detection result file for each detection tool. A detection result file may include the names of the detectors triggered in the detection tool, the vulnerability descriptions, and the locations where the vulnerabilities occur. To facilitate unified data processing, the detection result files may all be converted to JSON format.
Then, the detectors that detected vulnerabilities are identified from the detection result files and processed uniformly. For a detector that forms its own category, the detector name, the targeted vulnerability type, the description of the vulnerability, and the modification suggestion may be saved to the initial vulnerability list. For a detector in the classification set a, the detector name, the vulnerability type, the description of the vulnerability, and the modification suggestion are saved, according to the 14 vulnerability types, in the detection result file corresponding to that vulnerability type.
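As an illustrative sketch only, converting tool-specific results into a unified JSON record format might look as follows in Python. The two raw result layouts and all field names are invented for this example and do not reproduce the output schema of any real detection tool.

```python
import json
from typing import Any, Dict, List

# Hypothetical raw outputs; each real tool uses its own schema.
raw_tool_a = {"results": [{"check": "A1", "msg": "external call before state update", "line": 42}]}
raw_tool_b = {"issues": [{"title": "B1", "description": "arithmetic may overflow", "lineno": 17}]}


def normalize_tool_a(raw: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Map tool A's fields onto the unified record layout."""
    return [
        {"detector": r["check"], "description": r["msg"], "location": r["line"]}
        for r in raw["results"]
    ]


def normalize_tool_b(raw: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Map tool B's fields onto the same unified record layout."""
    return [
        {"detector": r["title"], "description": r["description"], "location": r["lineno"]}
        for r in raw["issues"]
    ]


unified = normalize_tool_a(raw_tool_a) + normalize_tool_b(raw_tool_b)
print(json.dumps(unified, indent=2))
```

Each record carries the three items named above: the triggered detector, the vulnerability description, and the location.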
Different detectors have different strengths in detecting vulnerabilities because they employ different static analysis techniques and rule sets. Detecting the intelligent contract with a plurality of detection tools therefore allows the results to be cross-validated and combined, which improves detection accuracy and reduces false positives and false negatives.
For example, fig. 3 is a flowchart of a detection module provided by an example of the present disclosure, where, as shown in fig. 3, the detection module includes a plurality of detection tools, such as a detection tool a, a detection tool B, and a detection tool C, and performs unified data processing and preferential processing sequentially according to a detection result file output by each detection tool, so as to obtain an initial vulnerability list.
Step S230: preferentially processing the plurality of detectors of the same vulnerability type to obtain an initial vulnerability list.
First, for the detectors in the classification set a, the confidence of the detectors corresponding to the 14 vulnerability types may be assigned.
Specifically, the confidence assignment may be performed by:
(1) obtaining the accuracy of the detector by a tool developer: the accuracy of the detector may be obtained directly from the tool developer and used as a confidence level for the corresponding detector. This approach relies on the developer's objective assessment of its tool performance, providing a confidence value for each detector that is based on expertise.
(2) Obtaining the accuracy of the corresponding detector through the data set self-test: preparing an intelligent contract data set with known loopholes, detecting and testing intelligent contracts in the data set by using a detector, and assigning the accuracy obtained by the test as the confidence of the corresponding detector.
(3) Dynamically adjusting the confidence of the detectors through machine learning: an objective function may be set, and the confidence of each detector adjusted based on its output. The confidence of a detector that detects the vulnerability is increased, and the confidence of a detector that does not detect it is decreased, so as to minimize the output value of the objective function and thereby improve the accuracy of the detection module.
Specifically, the objective function can be expressed by the following formula (1):
$$Z = 1 - [x_1, x_2, \ldots, x_n] \cdot [y_1, y_2, \ldots, y_n]^{T} \qquad (1)$$
where $Z$ is the output value of the objective function; $x_i$ ($i = 1, \ldots, n$) is the triggering state of each detector, which is 1 for a detector that detects the vulnerability and -1 for a detector that does not; $y_i$ is the confidence of each detector; and $n$ is the number of detectors contained in each vulnerability type.
In the confidence assignment process, in order to ensure that the matrix dimension calculation is correct, if a certain vulnerability type does not have a corresponding detector, the trigger state and the confidence are both defaulted to 0.
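The objective function of formula (1) and one possible confidence adjustment may be sketched in Python as follows. The disclosure does not specify an update rule; the fixed step size and the clamping of confidences to [0, 1] in `adjust_confidence` are assumptions made only for this sketch. Because the derivative of Z with respect to each confidence is the negated triggering state, moving each confidence in the direction of its triggering state lowers Z, which matches the raise/lower behavior described in (3).

```python
from typing import List


def objective(x: List[int], y: List[float]) -> float:
    """Formula (1): Z = 1 - x . y for one vulnerability type."""
    if len(x) != len(y):
        raise ValueError("trigger states and confidences must have equal length")
    return 1.0 - sum(xi * yi for xi, yi in zip(x, y))


def adjust_confidence(x: List[int], y: List[float], step: float = 0.05) -> List[float]:
    """Assumed update rule: shift each confidence by `step` toward lower Z.

    Triggered detectors (x = 1) gain confidence and untriggered ones (x = -1)
    lose it; padded entries (x = 0) stay unchanged. Results are clamped to [0, 1].
    """
    return [min(1.0, max(0.0, yi + step * xi)) for xi, yi in zip(x, y)]


x = [-1, 1, 1]          # trigger states of three detectors
y = [0.3, 0.3, 0.4]     # their current confidences
z_before = objective(x, y)
y_adjusted = adjust_confidence(x, y)
z_after = objective(x, y_adjusted)
print(z_before, z_after)
```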
After the triggering state and the confidence coefficient corresponding to each detector are obtained, the detector in the same vulnerability type starts to be preferentially processed.
Illustratively, the preferential treatment may be represented by the following formula (2):
$$Q = [x_1, x_2, \ldots, x_n] \cdot [y_1, y_2, \ldots, y_n]^{T} \qquad (2)$$
where $Q$ is the output value of the preferential processing; $x_i$ ($i = 1, \ldots, n$) is the triggering state of each detector, which is 1 for a detector that detects the vulnerability and -1 for a detector that does not; $y_i$ is the confidence of each detector; and $n$ is the number of detectors contained in each vulnerability type.
Whether the vulnerability exists is judged by the sign of the output value of formula (2). If the output value is positive, the vulnerability exists, and the detector names in the vulnerability type, the targeted vulnerability type, the description of the vulnerability, and the modification suggestions are saved to the initial vulnerability list. If the output value is negative, the vulnerability does not exist, and this information is not saved to the initial vulnerability list.
And performing preferential treatment on each vulnerability type in sequence to obtain an initial vulnerability list.
Taking the reentrancy attack vulnerability type as an example, assume that it comprises three detectors A1, A2, and A3 whose triggering states are -1, 1, and 1 and whose confidences are 0.3, 0.3, and 0.4, respectively. The preferential processing output value is then [-1, 1, 1]·[0.3, 0.3, 0.4]^T = -0.3 + 0.3 + 0.4 = 0.4. The output value is positive, so a reentrancy attack vulnerability exists in the intelligent contract to be detected.
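The worked example above, extended to a loop over vulnerability types, may be sketched in Python as follows. The data for type 1 are taken from the example; the data for type 2 are hypothetical. The description only covers positive and negative output values, so treating an output of exactly zero as "not present" is an assumption of this sketch.

```python
from typing import Any, Dict, List


def preferential_output(x: List[int], y: List[float]) -> float:
    """Formula (2): Q = x . y for one vulnerability type."""
    return sum(xi * yi for xi, yi in zip(x, y))


per_type: Dict[int, Dict[str, Any]] = {
    # Type 1 (reentrancy attack): values from the worked example.
    1: {"detectors": ["A1", "A2", "A3"], "x": [-1, 1, 1], "y": [0.3, 0.3, 0.4]},
    # Type 2: hypothetical case in which no detector is triggered.
    2: {"detectors": ["B1", "B2"], "x": [-1, -1], "y": [0.5, 0.5]},
}

initial_vulnerability_list: List[Dict[str, Any]] = []
for type_no, data in per_type.items():
    q = preferential_output(data["x"], data["y"])
    if q > 0:  # positive output: the vulnerability is taken to exist
        triggered = [d for d, xi in zip(data["detectors"], data["x"]) if xi == 1]
        initial_vulnerability_list.append(
            {"type": type_no, "detectors": triggered, "q": q}
        )

print(initial_vulnerability_list)
```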
Based on the method, by adopting different confidence assignment modes, the professional experience of a tool developer, the accuracy of actual test data and the dynamic adjustment of machine learning are comprehensively utilized, so that the confidence of each detector is more comprehensively and multi-angularly estimated. And by adopting a preferential treatment strategy, the performances and the confidence degrees of a plurality of detectors under the same vulnerability type are comprehensively considered to output the result with the highest confidence degree, so that the accuracy of judging the vulnerability existence is improved.
Step S240: verifying the initial vulnerability list in the verification module to obtain a verified vulnerability list.
Illustratively, FIG. 4 is a flowchart of a verification module provided by an example of the present disclosure. As shown in FIG. 4, the initial vulnerability list and the source code of the intelligent contract to be detected are input into the large language model. Using RAG, the large language model retrieves relevant information from a preset knowledge base a and a preset knowledge base b, integrates and learns from the retrieved information to generate attack contracts, and performs simulated attacks on the source code of the intelligent contract to be detected to obtain a verified vulnerability list and modification suggestions.
In the verification module, verifying the initial vulnerability list through a large language model, specifically comprising the steps of generating an attack contract by means of an RAG auxiliary large language model, then performing simulated attack on the intelligent contract to be verified through the large language model, obtaining a simulated attack result according to a preset attack establishment basis, verifying each vulnerability in the initial vulnerability list one by one, and obtaining a verified vulnerability list. In this embodiment, the large language model may use gpt-3.5-turbo-0613.
To ensure the quality of the simulated attack, knowledge base a and knowledge base b may be maintained in advance.
The knowledge base a may include intelligent contract source code exhibiting each of the above 14 vulnerability types together with the corresponding attack contract source code, with more than 10 intelligent contracts and corresponding attack contracts for each vulnerability type. Knowledge base a supplies relevant information to the large language model through the RAG method while attack contracts are generated, reducing the errors the large language model makes in generating them.
The knowledge base b may include attack establishment basis corresponding to the 14 vulnerability types. The knowledge base b is used for assisting the large language model in providing a judging basis for attack establishment when the source code contract and the attack contract are executed virtually.
By maintaining these two knowledge bases, the necessary background knowledge and validation criteria may be provided for the simulated attack, helping to improve the quality of attack contract generation and reduce potentially misleading information.
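The one-by-one verification described above may be outlined in Python as follows. This is a structural sketch only: `generate_attack` and `simulate` are stand-ins for the large language model steps, the `criteria` mapping stands in for the attack establishment basis held in knowledge base b, and the field `attacker_balance_delta` is a hypothetical success indicator.

```python
from typing import Any, Callable, Dict, List

Vulnerability = Dict[str, str]
Outcome = Dict[str, Any]


def verify_initial_list(
    source: str,
    initial_list: List[Vulnerability],
    generate_attack: Callable[[str, Vulnerability], str],
    simulate: Callable[[str, str], Outcome],
    criteria: Dict[str, Callable[[Outcome], bool]],
) -> List[Vulnerability]:
    """Keep only vulnerabilities whose simulated attack meets its criterion."""
    verified: List[Vulnerability] = []
    for vuln in initial_list:
        attack = generate_attack(source, vuln)       # RAG-assisted generation step
        outcome = simulate(source, attack)           # simulated attack step
        attack_succeeded = criteria.get(vuln["type"])  # basis from knowledge base b
        if attack_succeeded is not None and attack_succeeded(outcome):
            verified.append(vuln)
    return verified


# Stand-ins so that the sketch runs without a language model.
def fake_generate_attack(source: str, vuln: Vulnerability) -> str:
    return f"// attack contract for {vuln['type']}"


def fake_simulate(source: str, attack: str) -> Outcome:
    return {"attacker_balance_delta": 1 if "reentrancy" in attack else 0}


criteria = {
    "reentrancy": lambda o: o["attacker_balance_delta"] > 0,
    "timestamp dependency": lambda o: o["attacker_balance_delta"] > 0,
}
initial_list = [{"type": "reentrancy"}, {"type": "timestamp dependency"}]
verified_list = verify_initial_list(
    "contract C {}", initial_list, fake_generate_attack, fake_simulate, criteria
)
print(verified_list)
```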
For example, to facilitate subsequent generation of attack contracts, the content in knowledge base a may be partitioned and each data block converted into a corresponding vector embedded block.
In an alternative manner, the partitioning process may be performed according to the vulnerability type, such that each data chunk is used to express a specific vulnerability type, facilitating retrieval of large language models.
In an alternative manner, the partitioning process may be performed according to intelligent contracts, where partitioning of each intelligent contract is considered, and each data block represents an independent intelligent contract, which helps the large language model to better understand the structure and vulnerability of each intelligent contract.
In an alternative manner, the partitioning may be performed according to attack scenarios, and since the knowledge base a includes attack contracts, the partitioning may be considered according to each attack scenario in the attack contracts. After partitioning, each data block represents a specific attack scenario, which helps the large language model understand the association between attacks and vulnerabilities.
The partitioning rules can be combined or adjusted according to specific situations, and the proper partitioning strategy is selected so that the vector embedding block can better capture information in the knowledge base a, and more targeted input is provided for a large language model.
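The partitioning and embedding described above can be sketched as follows. This is a minimal illustration only: it combines the "by vulnerability type" and "by intelligent contract" strategies by producing one block per contract and tagging it with its vulnerability type. The names `Chunk`, `toy_embed`, and `chunk_knowledge_base` are hypothetical, and `toy_embed` is a hashed bag-of-words stand-in for whatever embedding model an implementation would actually use.

```python
import hashlib
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    vuln_type: str   # vulnerability type the block expresses
    text: str        # contract source plus its attack contract
    vector: list     # embedding of `text`


def toy_embed(text, dim=64):
    """Hashed bag-of-words embedding (assumption: stand-in for a real model)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def chunk_knowledge_base(entries):
    """Turn each knowledge base a entry into one tagged, embedded block."""
    chunks = []
    for entry in entries:
        text = (
            f"Vulnerability: {entry['vuln_type']}\n"
            f"Contract:\n{entry['contract_source']}\n"
            f"Attack:\n{entry['attack_source']}"
        )
        chunks.append(Chunk(entry["vuln_type"], text, toy_embed(text)))
    return chunks


entries = [
    {"vuln_type": "reentrancy", "contract_source": "contract Bank { }",
     "attack_source": "contract Attack { }"},
    {"vuln_type": "integer overflow", "contract_source": "contract Token { }",
     "attack_source": "contract Overflow { }"},
]
chunks = chunk_knowledge_base(entries)
```

Splitting by attack scenario instead would only change how `entries` is prepared; the embedding step stays the same.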
Illustratively, after knowledge base a and knowledge base b are maintained, generation of attack instructions may begin.
The attack instruction template may be: "You are a smart contract vulnerability verification assistant that performs a question-answering task. Use the content given in Context below as a reference to answer the question shown in Question. If you do not know how to answer, say 'I do not know'.
Question: {question template}
Context: {content in the knowledge base that is highly similar to the question context}"
Question template: "For the {vulnerability} in this smart contract, generate the corresponding vulnerability attack contract. Smart contract source code: {intelligent contract source code}"
Here, { } indicates content that needs to be replaced according to the scenario. {vulnerability} enumerates, in successive verification rounds, the vulnerabilities present in the initial vulnerability list. {content in the knowledge base that is highly similar to the question context} may be obtained by a retriever component, which semantically matches the current round's question from the question template against the knowledge base a and retrieves from the knowledge base a the vector embedding blocks corresponding to the vulnerability type mentioned in the question template, the intelligent contract source codes having that vulnerability, and the corresponding attack contracts.
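Filling the template for one verification round can be sketched as below. The function name `build_attack_instruction` and the `---` separator between retrieved blocks are assumptions made for illustration; the retrieved blocks are passed in as plain strings so the sketch does not depend on any particular retriever.

```python
PROMPT_TEMPLATE = (
    "You are a smart contract vulnerability verification assistant that "
    "performs a question-answering task. Use the content given in Context "
    "below as a reference to answer the question shown in Question. "
    "If you do not know how to answer, say 'I do not know'.\n"
    "Question: {question}\n"
    "Context: {context}"
)

QUESTION_TEMPLATE = (
    "For the {vulnerability} in this smart contract, generate the "
    "corresponding vulnerability attack contract. "
    "Smart contract source code: {source}"
)


def build_attack_instruction(vulnerability, source, retrieved_blocks):
    """Fill both templates for one vulnerability of the initial list."""
    question = QUESTION_TEMPLATE.format(vulnerability=vulnerability, source=source)
    # Separator between retrieved blocks is an arbitrary choice.
    context = "\n---\n".join(retrieved_blocks)
    return PROMPT_TEMPLATE.format(question=question, context=context)


instruction = build_attack_instruction(
    "reentrancy vulnerability",
    "contract Bank { }",
    ["block about reentrancy", "block with an attack contract"],
)
```

Because `str.format` only parses the template and not the substituted values, Solidity braces inside the contract source do not need escaping.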
Further, the generated attack instruction is input into the large language model, and an attack contract is obtained.
Then, the intelligent contract to be verified is subjected to simulation attack through the large language model.
Specifically, two accounts existing on the blockchain are simulated through a large language model, intelligent contract source codes to be detected are deployed in an account a, generated attack contracts are deployed in an account b, and the basis for establishment of attack is set. And finally, according to the establishment basis of the attack, acquiring an output result of the large language model, and judging whether the attack is successful or not.
If the attack succeeds, the corresponding vulnerability exists in the intelligent contract to be detected, and the corresponding information is retained in the initial vulnerability list; if the attack fails, the corresponding vulnerability does not exist in the intelligent contract to be detected, and the corresponding information needs to be removed from the vulnerability list.
In this step, each vulnerability in the vulnerability list is traversed, the corresponding attack contract is generated in turn, and the corresponding virtual attack result is obtained. After every vulnerability in the vulnerability list has been traversed, a verified vulnerability list and modification suggestions are obtained.
For a detection tool newly added to the detection module, the following information needs to be supplemented to ensure that the large language model can verify the detection result file output by that tool. The supplementary information may include:
(1) If the vulnerability type of interest of a detector in the newly added detection tool already exists in the classification set a, the name and confidence of the detector are added to the corresponding vulnerability type.
(2) If the vulnerability type of interest of a detector in the newly added detection tool does not exist in the classification set a, the vulnerability type is added to the classification set a, and the name and confidence of the detector are added under that vulnerability type.
(3) If the vulnerability type of interest of a detector in the newly added detection tool does not exist in the classification set a, 10 intelligent contract source codes having that vulnerability type and the corresponding attack contracts need to be added to the knowledge base a, and the attack establishment basis of that vulnerability type needs to be added to the knowledge base b.
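The three supplementation rules can be sketched as a single registration routine. The dictionary layouts for the classification set a and the two knowledge bases, and the name `register_detector`, are assumptions chosen for illustration.

```python
def register_detector(classification, kb_a, kb_b, name, vuln_type, confidence,
                      examples=None, attack_basis=None, min_examples=10):
    """Register one detector of a newly added detection tool.

    classification: vuln_type -> {detector name: confidence}
    kb_a:           vuln_type -> list of (contract source, attack source)
    kb_b:           vuln_type -> attack establishment basis
    """
    if vuln_type not in classification:
        # Rules (2) and (3): a new type needs examples and a basis first.
        if not examples or len(examples) < min_examples:
            raise ValueError(f"need {min_examples} contract/attack pairs for {vuln_type!r}")
        if not attack_basis:
            raise ValueError(f"need an attack establishment basis for {vuln_type!r}")
        classification[vuln_type] = {}
        kb_a[vuln_type] = list(examples)
        kb_b[vuln_type] = attack_basis
    # Rule (1), and the final part of rule (2): record name and confidence.
    classification[vuln_type][name] = confidence


classification = {"reentrancy": {"detector_a": 0.9}}
kb_a = {"reentrancy": []}
kb_b = {"reentrancy": "attacker balance increases"}
register_detector(classification, kb_a, kb_b, "detector_b", "reentrancy", 0.7)
```

Rejecting a new vulnerability type that arrives without examples keeps the knowledge bases and the classification set consistent, so the verification stage never meets a type it has no reference material for.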
Step S250: perform language optimization on the vulnerability list and the modification suggestions in the report module to obtain a target detection report.
In the report generation module, the large language model rewrites the verified vulnerability list and the modification suggestions in plain natural language to obtain the target detection report. Plain-language rewriting improves the readability of the vulnerability list and the modification suggestions and helps non-professional readers understand the vulnerability information more easily, thereby improving the efficiency and accuracy of the vulnerability repair process.
Based on the above embodiment, in order to facilitate the use of the user, the man-machine interaction interface provided by the present disclosure may further include: the system comprises a detection tool management module, a file uploading module and a report browsing module.
The detection tool management module is used for managing and updating the plurality of detection tools built into the detection module. It may also provide a detection tool insertion function for adding new detection tools to meet evolving security requirements.
The file uploading module is used for uploading, into the knowledge base a, intelligent contract source codes of different vulnerability types and the corresponding attack contract source codes. It is also used for uploading, into the knowledge base b, the attack establishment basis corresponding to different vulnerability types, and for uploading the source code of the intelligent contract to be detected.
The report browsing module is used for displaying the target detection report and presenting the vulnerability list and the modification suggestions.
According to one or more technical schemes provided by the exemplary embodiments of the present disclosure, by integrating a plurality of detection tools, multiple rounds of comprehensive detection on intelligent contracts are realized, so that the comprehensiveness and accuracy of vulnerability detection are improved. And the initial vulnerability list is obtained by carrying out data analysis processing on the plurality of detection result files, so that the comprehensive consideration of the output of different detectors is facilitated, and the robustness of vulnerability identification is improved. In the verification module, a large language model is adopted for vulnerability verification, so that vulnerabilities can be more comprehensively analyzed and understood, and a more accurate verification result is provided. Finally, colloquially processing the vulnerability list and the modification suggestion through the large language model, and improving the readability of the vulnerability list and the modification suggestion.
Therefore, by combining a plurality of vulnerability detection tools with a large language model, the multiple composite detection method of the intelligent contract provided by the exemplary embodiment of the disclosure can solve the technical problems of low vulnerability coverage of a single detection tool and low vulnerability detection accuracy of a large language model.
Based on the foregoing embodiments, fig. 5 is a flow schematic diagram of a multiple composite detection method of an intelligent contract according to an exemplary embodiment of the present disclosure, and as shown in fig. 5, the method may include the following steps:
Step S510: acquire the intelligent contract to be detected.
In an embodiment, the source code of the smart contract to be detected needs to be acquired for subsequent multiple composite detection.
Step S520: detect the intelligent contract to be detected with a plurality of detection tools respectively, to obtain the detection result file generated by each detection tool.
In the embodiment, although each detection tool includes a plurality of detectors, different detection tools differ considerably and each is strongly biased toward particular vulnerability types. A plurality of detection tools, such as Slither, Mythril and Securify, may therefore be built in in advance, so as to cover a wider range of vulnerability types and detect the potential risks of the intelligent contract more comprehensively.
In this embodiment, a detection tool insertion function is also provided for adding a new detection tool. The user can select and integrate a plurality of detection tools suitable for the intelligent contract to be detected according to specific requirements, so as to better adapt to the specific development environment and security requirements as new vulnerabilities and threats continue to emerge in the field of intelligent contracts.
The intelligent contract to be detected is detected by the plurality of detection tools in sequence to obtain the detection result file corresponding to each detection tool. The detection result file may include the name of the triggered detector in the detection tool, the vulnerability description, the vulnerability type, the location where the vulnerability occurred, and the modification suggestion.
Step S530: perform data integration processing on the plurality of detection result files to obtain an initial vulnerability list.
Because the formats of the detection result files generated by the detection tools are not uniform, in order to facilitate the verification of the subsequent large language model, the data integration processing needs to be carried out on the detection result files. The data integration process may include unified data processing and preferential processing. The unified data processing is used for classifying the detection result files according to the vulnerability types, and the preferential processing is used for judging the detection result according to the accuracy of the detector in each vulnerability type.
Step S540: verify the initial vulnerability list with a large language model to obtain a target vulnerability list.
In an embodiment, the initial vulnerability list and the intelligent contract to be detected are input into the large language model as input data. The large language model retrieves related information from a preset database in the RAG manner, integrates and learns from the retrieved information to generate an attack contract, and performs a simulated attack on the intelligent contract to be detected to obtain a simulated attack result. Whether the vulnerability exists is judged according to the simulated attack result, thereby verifying the accuracy of the initial vulnerability list. For a vulnerability that exists, its related information in the initial vulnerability list is retained. For a vulnerability that does not exist, its related information is removed from the initial vulnerability list. Each vulnerability in the initial vulnerability list is verified in turn, and the target vulnerability list is finally obtained.
Based on this, in the exemplary embodiment of the present disclosure, the intelligent contract to be detected is acquired and detected with a plurality of detection tools respectively, to obtain the detection result file generated by each detection tool. Data integration processing is performed on the plurality of detection result files to obtain an initial vulnerability list, and the initial vulnerability list is verified with a large language model to obtain a target vulnerability list. The multiple composite detection method of the intelligent contract provided by the disclosure detects the intelligent contract to be detected with a plurality of detection tools, which solves the technical problems that a single detection tool is strongly biased toward particular vulnerability types and that its vulnerability detection coverage is low. Verifying the detection results output by the detection tools with the large language model further reduces the false alarm rate and the missing report rate of vulnerabilities, and improves the security of the intelligent contract.
Based on the above embodiment, in still another embodiment provided in the present disclosure, the detecting tool includes a plurality of detectors, and the step S520 may specifically include:
acquiring vulnerability types focused by a plurality of detectors respectively;
classifying the plurality of detectors according to the vulnerability type;
Detecting intelligent contracts to be detected by adopting a plurality of detection tools respectively to obtain detection result files respectively generated by each detection tool;
and respectively storing the plurality of detection result files into the vulnerability types corresponding to the detectors.
In the embodiment, because the emphasis points of the plurality of detectors in the detection tools are different, specific vulnerabilities aimed at are different, and in order to facilitate subsequent analysis and comparison of detection result files generated by different detection tools, classification processing can be performed on all detectors in all detection tools in the detection module.
In particular, all of the detectors in the plurality of detection tools built-in advance and newly developed, and the type of vulnerability that each detector is interested in, may be acquired. And classifying the detectors according to the vulnerability types, and classifying the detectors focusing on the same vulnerability type into the same category.
The intelligent contract to be detected is detected with the plurality of detection tools respectively, to obtain the detection result file generated by each detection tool. The detection result file may include the name of the triggered detector in the detection tool, the vulnerability description, and the location where the vulnerability occurred. To facilitate unified data processing of the detection result files, their format may be unified into JSON. Each detection result file is then classified into the vulnerability type corresponding to its detector.
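Unifying tool outputs into one JSON shape and filing them under the detector's vulnerability type might look like the sketch below. The raw record and its field names (`check`, `msg`, `where`) are invented for illustration and do not reflect the real output format of any detection tool; `normalize_finding` and `group_by_vulnerability_type` are hypothetical helpers.

```python
import json
from collections import defaultdict


def normalize_finding(tool, raw, field_map):
    """Map one tool-specific record onto the unified schema."""
    return {
        "tool": tool,
        "detector": raw[field_map["detector"]],
        "description": raw[field_map["description"]],
        "location": raw[field_map["location"]],
    }


def group_by_vulnerability_type(findings, detector_types, fallback="second"):
    """File each finding under the vulnerability type of its detector."""
    grouped = defaultdict(list)
    for finding in findings:
        grouped[detector_types.get(finding["detector"], fallback)].append(finding)
    return dict(grouped)


# Hypothetical raw record from a hypothetical tool.
raw_record = {"check": "reentrancy-check",
              "msg": "external call before state update",
              "where": "Bank.sol:42"}
finding = normalize_finding(
    "tool_a", raw_record,
    {"detector": "check", "description": "msg", "location": "where"},
)
grouped = group_by_vulnerability_type([finding], {"reentrancy-check": "reentrancy"})
print(json.dumps(grouped, indent=2))
```

A per-tool `field_map` keeps the normalization logic identical for every tool, so adding a new detection tool only requires describing where its fields live.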
Based on this, different detectors have different advantages in detecting vulnerabilities because they employ different static analysis techniques and rule sets. Therefore, a plurality of detection tools are adopted to detect the intelligent contracts to be detected, the detection accuracy can be improved through cross verification and comprehensive results, and false alarms and missing reports are reduced.
Based on the foregoing embodiments, in still another embodiment provided by the present disclosure, classifying the plurality of detectors by the vulnerability type may specifically include:
presetting a plurality of first vulnerability types;
classifying the corresponding detector into the corresponding first vulnerability type under the condition that the vulnerability type focused by the detector is matched with the first vulnerability type;
in the event that the vulnerability type of interest of the detector does not match the first vulnerability type, the corresponding detector is classified into a second vulnerability type.
In an embodiment, a plurality of first vulnerability types may be preset according to common vulnerability types. In this embodiment, the first vulnerability types may include: reentrancy attack, integer overflow, access control, exception operation, denial of service, type mismatch, Ether freezing, short address attack, Ether loss, call stack overflow, tx.origin misuse, timestamp dependency, block parameter dependency, and transaction order dependency.
In practical application, the first vulnerability type can be increased or decreased according to the intelligent contracts to be detected, so that the user can customize the first vulnerability type applicable to the intelligent contracts to be detected according to the requirements.
The vulnerability type of interest to the detector is matched to the first vulnerability type. If the vulnerability type of interest to the detector is in the first vulnerability type, the detector is classified into the corresponding vulnerability type. If the vulnerability type of interest to the detector is not in the first vulnerability type, the detector is classified separately into a second vulnerability type. Each detector is traversed and all detectors are classified.
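The matching rule above reduces to a membership test. In this sketch only three of the first vulnerability types are listed for brevity, and the detector names and the bucket key `"second"` are assumptions.

```python
# Subset of the preset first vulnerability types, for brevity.
FIRST_TYPES = {"reentrancy attack", "integer overflow", "access control"}


def classify_detectors(detectors, first_types):
    """detectors: detector name -> vulnerability type of interest."""
    classes = {vuln_type: [] for vuln_type in first_types}
    classes["second"] = []  # everything that matches no first type
    for name, vuln_type in detectors.items():
        key = vuln_type if vuln_type in first_types else "second"
        classes[key].append(name)
    return classes


detectors = {
    "detector_a": "reentrancy attack",
    "detector_b": "reentrancy attack",
    "detector_c": "naming convention",
}
classes = classify_detectors(detectors, FIRST_TYPES)
```

Here `detector_a` and `detector_b` land in the same category and can later be compared against each other, while `detector_c` falls into the second vulnerability type.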
Based on the method, the plurality of detectors are classified, so that the type of the vulnerability focused by each detector can be known more clearly, and more accurate and detailed information is provided for users. The performance and coverage of the different detectors can also be intuitively compared. By comparing the vulnerability results of the same category, the advantages and disadvantages of each detector and the applicable scene can be better evaluated.
Based on the above embodiment, in yet another embodiment provided in the present disclosure, the step S530 may specifically include:
acquiring trigger state and confidence information of each detector in the first vulnerability type; wherein, under the condition that the detector detects the loophole, the trigger state is assigned positive, and under the condition that the detector does not detect the loophole, the trigger state is assigned negative;
Calculating a preferential treatment output value corresponding to each first vulnerability type based on the triggering state and the confidence information;
under the condition that the preferred processing output value is positive, storing the detection result file in the corresponding first vulnerability type into an initial vulnerability list;
and obtaining a detection result file in the second vulnerability type, and storing the detection result file in the second vulnerability type to an initial vulnerability list.
In an embodiment, the data integration process may include unified data processing and preferential processing.
Specifically, unified data processing may include:
The detectors that detected vulnerabilities are identified from the detection result files. For a detector in the second vulnerability type, its detection result file is stored directly in the initial vulnerability list. For a detector in the first vulnerability type, preferential treatment is required to decide whether its detection result file is placed in the initial vulnerability list.
Specifically, the preferential treatment may include:
The trigger state and the confidence information of each detector in the first vulnerability type are acquired. The trigger state indicates whether the detector detected the vulnerability during detection. The trigger state may be assigned 1 if the detector detects the vulnerability and -1 if the detector does not detect the vulnerability. The confidence information indicates the accuracy of the detector in detecting the vulnerability.
After the triggering state and the confidence information corresponding to each detector are obtained, the detector corresponding to each vulnerability type in the first vulnerability type is subjected to preferential treatment, and a preferential treatment output value is obtained.
Illustratively, the preferential treatment may be represented by the above formula (2). Whether the vulnerability exists is judged from the sign of the preferential treatment output value. If the output value is positive, the vulnerability exists, and the detection result file corresponding to the vulnerability type is stored in the initial vulnerability list. If the output value is negative, the vulnerability does not exist, and the detection result file corresponding to the vulnerability type is not stored in the initial vulnerability list.
And performing preferential treatment on each vulnerability type in sequence to obtain an initial vulnerability list.
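A sketch of the preferential treatment follows. Formula (2) is not reproduced in this passage, so the code assumes, purely for illustration, that the output value is the confidence-weighted sum of the trigger states; the function names and the sample confidences are likewise hypothetical.

```python
def preferential_output(triggers, confidences):
    """Assumed form of formula (2): sum of trigger state times confidence."""
    return sum(x * y for x, y in zip(triggers, confidences))


def build_initial_list(first_type_results, second_type_findings):
    """first_type_results: vuln_type -> (triggers, confidences, findings)."""
    # Second-type findings are stored without preferential treatment.
    initial = list(second_type_findings)
    for vuln_type, (triggers, confidences, findings) in first_type_results.items():
        # Keep a first-type finding only when the output value is positive.
        if preferential_output(triggers, confidences) > 0:
            initial.extend(findings)
    return initial


first_type_results = {
    # Two of three detectors fired, and the confident one agrees: kept.
    "reentrancy attack": ([1, -1, 1], [0.9, 0.6, 0.5], ["reentrancy finding"]),
    # Only the less reliable detector fired: dropped.
    "integer overflow": ([1, -1], [0.4, 0.7], ["overflow finding"]),
}
initial = build_initial_list(first_type_results, ["second-type finding"])
```

Under this assumed form, a single high-confidence detector can outvote several low-confidence ones, which matches the stated aim of weighting detectors by their accuracy per vulnerability type.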
Based on the method, unified data processing and preferential processing strategies are adopted, the performance and the confidence of a plurality of detectors under the same vulnerability type are comprehensively considered, and the result with the highest confidence is output, so that the accuracy of judging the existence of the vulnerability is improved.
Based on the foregoing embodiments, in still another embodiment provided by the present disclosure, obtaining confidence information of each detector in the first vulnerability type may specifically include:
acquiring the detection accuracy of official release and/or taking the detection accuracy obtained through the data set test as initial confidence information;
Adjusting the initial confidence coefficient information by adopting an objective function to obtain adjusted confidence coefficient information;
wherein adjusting the initial confidence information using the objective function includes:
for the detector with positive triggering state, the corresponding initial confidence information is improved, and for the detector with negative triggering state, the corresponding initial confidence information is reduced;
the objective function includes:
$$Z = 1 - [x_1, x_2, \ldots, x_n] \cdot [y'_1, y'_2, \ldots, y'_n]^{T}$$
where $Z$ is the output value of the objective function; $x_1, \ldots, x_n$ are the trigger states of the detectors, the trigger state being 1 for a detector that detects the vulnerability and -1 for a detector that does not; $y'_1, \ldots, y'_n$ are the initial confidence information of the detectors; and $n$ is the number of detectors corresponding to each first vulnerability type.
In an embodiment, obtaining the detection accuracy of the official release may include: and directly acquiring the accuracy of the detector from a tool developer, and taking the accuracy as initial confidence information of the corresponding detector. This approach relies on the developer's objective assessment of its tool performance, providing a confidence level for each detector based on expertise.
In an embodiment, the detection accuracy obtained by the data set test may include: preparing an intelligent contract data set with known vulnerabilities, detecting and testing intelligent contracts in the data set by using a detector, and assigning accuracy obtained by the testing as initial confidence information of the corresponding detector.
In the assignment process of the initial confidence information, in order to ensure that the calculation of the dimension of the subsequent matrix is correct, if a certain vulnerability type does not have a corresponding detector, the trigger state and the initial confidence information are both defaulted to 0.
In an embodiment, after the initial confidence information is obtained, the initial confidence information corresponding to the detector may be dynamically adjusted through machine learning, so as to obtain the adjusted confidence information.
Specifically, an objective function may be set, and initial confidence information for each detector is adjusted according to the output of the objective function. For the detector detecting the loopholes, the initial confidence information is heightened; and for the detector which does not detect the loopholes, reducing the initial confidence information to minimize the output value of the objective function, thereby improving the accuracy of the adjusted confidence information.
Illustratively, the objective function may be represented by the following equation (3):
$$Z = 1 - [x_1, x_2, \ldots, x_n] \cdot [y'_1, y'_2, \ldots, y'_n]^{T} \qquad (3)$$
where $Z$ is the output value of the objective function; $x_1, \ldots, x_n$ are the trigger states of the detectors, the trigger state being 1 for a detector that detects the vulnerability and -1 for a detector that does not; $y'_1, \ldots, y'_n$ are the initial confidence information of the detectors; and $n$ is the number of detectors corresponding to each first vulnerability type.
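The objective function and one possible adjustment step can be sketched as follows. The disclosure only states that confidence is raised for triggered detectors and lowered for untriggered ones so as to minimize Z; the gradient-style update, the learning rate `lr`, and the clipping to [0, 1] are assumptions added here. Since the partial derivative of Z with respect to each confidence equals the negated trigger state, stepping each confidence by `lr` times its trigger state moves Z downward and reproduces the raise/lower rule.

```python
def objective(triggers, confidences):
    """Z = 1 - x . y'^T for one first vulnerability type."""
    return 1.0 - sum(x * y for x, y in zip(triggers, confidences))


def adjust_confidences(triggers, confidences, lr=0.05):
    """One descent step on Z: raise triggered detectors, lower the rest.

    lr and the [0, 1] clipping are illustrative assumptions.
    """
    return [min(1.0, max(0.0, y + lr * x)) for x, y in zip(triggers, confidences)]


triggers = [1, -1]
initial_confidences = [0.8, 0.6]
z_before = objective(triggers, initial_confidences)
adjusted = adjust_confidences(triggers, initial_confidences)
z_after = objective(triggers, adjusted)
```

A vulnerability type with no detector, whose trigger state and confidence both default to 0, contributes nothing to the dot product and is left unchanged by the update.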
Based on the method, by adopting different confidence information assignment modes, the professional experience of a tool developer, the accuracy of actual test data and the dynamic adjustment of machine learning are comprehensively utilized, so that the confidence information of each detector is more comprehensively and multi-angularly estimated.
Based on the above embodiment, in yet another embodiment provided in the present disclosure, the step S540 may specifically include:
the large language model generates attack contracts corresponding to each vulnerability in the initial vulnerability list through a retrieval enhancement generation RAG mode;
in a large language model, an attack contract is used for launching simulation attack on an intelligent contract to be detected, and a simulation attack result is obtained;
under the condition that the simulation attack is established, a detection result file of the corresponding vulnerability in the initial vulnerability list is reserved;
under the condition that the simulation attack is not established, removing a detection result file of the corresponding vulnerability in the initial vulnerability list;
and traversing and verifying each vulnerability in the initial vulnerability list to obtain a target vulnerability list.
In an embodiment, a knowledge base a may be preset, in which the intelligent contract source codes corresponding to each vulnerability type and the corresponding attack contract source codes are maintained, with more than 10 intelligent contract source codes, each paired with its attack contract source code, for every vulnerability type. The knowledge base a is used for providing relevant information to the large language model by means of the RAG method while the attack contract is generated, reducing the errors the large language model makes in generating the attack contract.
To facilitate subsequent generation of attack contracts, the content in knowledge base a may be partitioned and each data block converted to a corresponding vector embedded block.
In an alternative manner, the partitioning process may be performed according to the vulnerability type, such that each data block expresses one specific vulnerability type, which facilitates retrieval by the large language model.
In an alternative manner, the partitioning process may be performed according to intelligent contracts, such that each data block represents one independent intelligent contract, which helps the large language model to better understand the structure and vulnerability of each intelligent contract.
In an alternative manner, the partitioning may be performed according to attack scenarios, and since the knowledge base a includes attack contracts, the partitioning may be considered according to each attack scenario in the attack contracts. After partitioning, each data block represents a specific attack scenario, which helps the large language model understand the association between attacks and vulnerabilities.
The partitioning rules can be combined or adjusted according to specific situations, and the proper partitioning strategy is selected so that the vector embedding block can better capture information in the knowledge base a, and more targeted input is provided for a large language model.
Illustratively, after maintaining the knowledge base a, generation of attack instructions may begin.
The attack instruction template may be: "You are a smart contract vulnerability verification assistant that performs a question-answering task. Use the content given in Context below as a reference to answer the question shown in Question. If you do not know how to answer, say 'I do not know'.
Question: {question template}
Context: {content in the knowledge base that is highly similar to the question context}"
Question template: "For the {vulnerability} in this smart contract, generate the corresponding vulnerability attack contract. Smart contract source code: {intelligent contract source code}"
Here, { } indicates content that needs to be replaced according to the scenario. {vulnerability} enumerates, in successive verification rounds, the vulnerabilities present in the initial vulnerability list. {content in the knowledge base that is highly similar to the question context} may be obtained by a retriever component, which semantically matches the current round's question from the question template against the knowledge base a and retrieves from the knowledge base a the vector embedding blocks corresponding to the vulnerability type mentioned in the question template, the intelligent contract source codes having that vulnerability, and the corresponding attack contracts.
And inputting the generated attack instruction into a large language model, searching related information from a preset knowledge base a by the large language model through a RAG mode, integrating and learning the searched information to generate an attack contract, and performing simulated attack on intelligent contract source codes to be detected to obtain a simulated attack result. In this embodiment, the large language model may use gpt-3.5-turbo-0613.
If the attack is established, the existence of the corresponding loopholes in the intelligent contract to be detected is indicated, and the detection result file of the corresponding loopholes in the initial loopholes list is reserved; if the attack is not established, the fact that the corresponding loopholes do not exist in the intelligent contract to be detected is indicated, and the detection result files of the corresponding loopholes in the initial loopholes list need to be removed.
The large language model traverses each vulnerability in the initial vulnerability list, generates the corresponding attack contract in turn, performs a simulated attack on the intelligent contract to be detected, and verifies the vulnerabilities one by one to obtain the verified vulnerability list.
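The traversal can be sketched as a filter over the initial vulnerability list. The two callables stand in for the large language model calls and are stubbed here; their names, signatures, and the finding layout are assumptions, and no real model API is invoked.

```python
def verify_vulnerabilities(initial_list, contract_source,
                           generate_attack, simulate_attack):
    """Keep only the findings whose simulated attack is established."""
    verified = []
    for finding in initial_list:
        attack = generate_attack(finding, contract_source)
        if simulate_attack(contract_source, attack, finding["vuln_type"]):
            verified.append(finding)  # attack established: retain
        # otherwise the finding is removed by not copying it over
    return verified


# Stubs standing in for the large language model (hypothetical behaviour).
def fake_generate_attack(finding, contract_source):
    return f"// attack contract for {finding['vuln_type']}"


def fake_simulate_attack(contract_source, attack, vuln_type):
    return vuln_type == "reentrancy attack"


initial_list = [
    {"vuln_type": "reentrancy attack", "location": "Bank.sol:42"},
    {"vuln_type": "integer overflow", "location": "Bank.sol:10"},
]
target_list = verify_vulnerabilities(
    initial_list, "contract Bank { }", fake_generate_attack, fake_simulate_attack
)
```

Passing the model calls in as parameters keeps the traversal logic testable without network access and lets a different model be swapped in later.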
Based on this, by maintaining knowledge base a, the necessary background knowledge and validation criteria can be provided for the simulated attack, helping to improve the quality of attack contract generation and reduce potentially misleading information. The partitioning rules can be combined or adjusted according to specific situations, and the proper partitioning strategy is selected so that the vector embedding block can better capture information in the knowledge base a, and more targeted input is provided for a large language model.
Based on the above embodiment, in still another embodiment provided by the present disclosure, in a large language model, an attack contract is used to launch a simulation attack on an intelligent contract to be detected, so as to obtain a simulation attack result, which may specifically include:
acquiring attack establishment basis corresponding to each vulnerability in the initial vulnerability list;
creating a first account and a second account in the large language model;
deploying intelligent contracts to be detected in a first account, and sequentially deploying attack contracts corresponding to each vulnerability in an initial vulnerability list in a second account;
and launching the simulation attack on the first account through the second account, and obtaining a simulation attack result based on the attack establishment basis.
In an embodiment, a knowledge base b may be preset, and attack establishment basis corresponding to each vulnerability type is maintained in the knowledge base b. The knowledge base b is used for assisting the large language model in providing a judging basis for attack establishment when the source code contract and the attack contract are executed virtually. The attack establishment basis corresponding to each vulnerability can be obtained from the knowledge base b in a RAG mode.
In an embodiment, a first account and a second account existing on a blockchain are created through a large language model, source codes of intelligent contracts to be detected are deployed in the first account, generated attack contracts are deployed in the second account, and the basis for attack establishment is set.
And launching the simulation attack to the first account through the second account, acquiring a simulation attack result of the large language model according to the attack establishment basis, and judging whether the attack is established or not.
Based on the above, maintaining knowledge base b gives the large language model a basis for judging whether an attack is established when it virtually executes the source code contract and the attack contract. Simulating the attack on the intelligent contract to be detected within the large language model exploits the wide coverage of the model, so that more of the source code of the intelligent contract to be detected is covered, vulnerabilities in it are found, and the missed-report rate of vulnerabilities is reduced.
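One way to picture the two-account simulation is as a structured request handed to the large language model. The sketch below is an assumption-laden illustration only: `KNOWLEDGE_BASE_B` is an in-memory dictionary standing in for knowledge base b (a real system would retrieve the basis via RAG), and `SimulationRequest`, `build_request`, the prompt wording, and the sample criterion are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical stand-in for knowledge base b: vulnerability type -> attack
# establishment basis. The sample criterion is illustrative only.
KNOWLEDGE_BASE_B: Dict[str, str] = {
    "reentrancy": "The attacker account ends with more funds than it deposited.",
}


@dataclass
class SimulationRequest:
    contract_source: str      # deployed in the first account
    attack_source: str        # deployed in the second account
    establishment_basis: str  # criterion for judging the attack

    def to_prompt(self) -> str:
        """Render the request as text for the large language model."""
        return (
            "Create two accounts, A and B.\n"
            f"Deploy this contract from account A:\n{self.contract_source}\n"
            f"Deploy this attack contract from account B:\n{self.attack_source}\n"
            "Simulate account B attacking account A.\n"
            f"The attack is established if: {self.establishment_basis}\n"
            "Answer ESTABLISHED or NOT ESTABLISHED."
        )


def build_request(vuln_type: str, contract_source: str, attack_source: str) -> SimulationRequest:
    """Look up the establishment basis and assemble the request."""
    basis = KNOWLEDGE_BASE_B.get(vuln_type)
    if basis is None:
        raise ValueError(f"no attack establishment basis for {vuln_type!r}")
    return SimulationRequest(contract_source, attack_source, basis)


request = build_request(
    "reentrancy", "contract Bank { /* ... */ }", "contract Attack { /* ... */ }"
)
prompt = request.to_prompt()
```

Failing loudly when no basis is found avoids asking the model to judge an attack without a criterion.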
Based on the foregoing embodiment, in still another embodiment provided by the present disclosure, traversing each vulnerability in the initial vulnerability list to obtain the target vulnerability list may specifically include:
traversing each vulnerability in the initial vulnerability list to obtain a verified vulnerability list;
and performing natural language processing on the verified vulnerability list by adopting a large language model to obtain a target vulnerability list.
In this embodiment, the large language model performs a simulation attack for each vulnerability in the initial vulnerability list in turn and verifies the vulnerabilities one by one to obtain a verified vulnerability list. The large language model may also generate modification suggestions for each vulnerability in the verified vulnerability list through RAG. Furthermore, the large language model can rewrite the verified vulnerability list and the modification suggestions in plain, colloquial natural language to obtain the target vulnerability list.
In this step, the colloquial rewriting improves the readability of the vulnerability list and the modification suggestions and helps non-professional readers understand the vulnerability information more easily, thereby improving the efficiency and accuracy of the vulnerability repair process.
According to one or more technical schemes provided by the exemplary embodiments of the present disclosure, integrating a plurality of detection tools realizes multiple rounds of comprehensive detection on intelligent contracts, which improves the comprehensiveness and accuracy of vulnerability detection. The initial vulnerability list is obtained by performing data integration processing on the plurality of detection result files, which allows the outputs of different detectors to be considered together and improves the robustness of vulnerability identification. In the verification module, a large language model is adopted for vulnerability verification, so that vulnerabilities can be analyzed and understood more comprehensively and a more accurate verification result is provided. Finally, the large language model rewrites the vulnerability list and the modification suggestions in colloquial language, which improves their readability.
Therefore, by combining a plurality of vulnerability detection tools with a large language model, the multiple composite detection method of the intelligent contract provided by the exemplary embodiment of the disclosure can solve the technical problems of the low vulnerability coverage of a single detection tool and the low vulnerability detection accuracy of a large language model.
The foregoing description of the embodiments of the present disclosure has been presented primarily in terms of methods. It will be appreciated that, in order to implement the above-mentioned functions, the apparatus corresponding to the method of the exemplary embodiment of the present disclosure includes corresponding hardware structures and/or software modules that perform the respective functions. Those of skill in the art will readily appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or combinations of hardware and computer software. Whether a function is implemented as hardware or computer software driven hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The embodiments of the present disclosure may divide functional units of a server according to the above method examples, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated modules may be implemented in hardware or in software functional modules. It should be noted that, in the embodiment of the present disclosure, the division of the modules is merely a logic function division, and other division manners may be implemented in actual practice.
Where each functional module is divided according to its corresponding function, exemplary embodiments of the present disclosure provide a multiple composite detection device of an intelligent contract, which may be a server or a chip applied to the server. Fig. 6 is a schematic block diagram of functional modules of a multiple composite detection device of an intelligent contract according to an exemplary embodiment of the present disclosure. As shown in fig. 6, the multiple composite detection device 600 of the intelligent contract includes:
a data acquisition module 610, configured to acquire an intelligent contract to be detected;
the data processing module 620 is configured to detect the intelligent contract to be detected by using a plurality of detection tools, so as to obtain a detection result file generated by each detection tool;
the data processing module 620 is further configured to perform data integration processing on the plurality of detection result files to obtain an initial vulnerability list;
the data processing module 620 is further configured to verify the initial vulnerability list by using a large language model, and obtain a target vulnerability list.
In yet another embodiment provided by the present disclosure, the detection tool comprises a plurality of detectors; the data processing module 620 is further configured to obtain a plurality of detectors and vulnerability types focused by the plurality of detectors respectively; classifying the plurality of detectors according to the vulnerability type; detecting the intelligent contracts to be detected by adopting the detection tools respectively to obtain detection result files generated by each detection tool respectively; and respectively storing the detection result files into the vulnerability types corresponding to the detectors.
In yet another embodiment provided in the present disclosure, the data processing module 620 is further configured to preset a plurality of first vulnerability types; classifying the corresponding detector into the corresponding first vulnerability type under the condition that the vulnerability type focused by the detector is matched with the first vulnerability type; in the event that the vulnerability type of interest of the detector does not match the first vulnerability type, classifying the corresponding detector into a second vulnerability type.
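The classification rule above can be sketched as a grouping step. This is a minimal illustration under stated assumptions: detectors are represented as hypothetical `(name, focused_type)` pairs, the detector names are invented, and `SECOND_TYPE` is an assumed label for the second vulnerability type.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

SECOND_TYPE = "other"  # hypothetical label for the second vulnerability type


def classify_detectors(
    detectors: Iterable[Tuple[str, str]],
    first_types: Iterable[str],
) -> Dict[str, List[str]]:
    """Group detector names by preset first type, else the second type."""
    known = set(first_types)
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, focus in detectors:
        # A detector whose focus matches a preset first type joins that
        # type; every other detector falls into the second type.
        groups[focus if focus in known else SECOND_TYPE].append(name)
    return dict(groups)


# Toy usage with invented detector names.
groups = classify_detectors(
    [
        ("tool_a.reentrancy", "reentrancy"),
        ("tool_b.reentrancy", "reentrancy"),
        ("tool_a.naming", "naming-convention"),
    ],
    first_types=["reentrancy", "integer-overflow"],
)
```

Here the two reentrancy detectors land in the same first type, so their outputs can later be weighed against each other, while the unmatched detector is kept separately.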
In yet another embodiment provided in the present disclosure, the data processing module 620 is further configured to obtain trigger status and confidence information of each detector in the first vulnerability type; wherein, when the detector detects the vulnerability, the trigger state is assigned positive, and when the detector does not detect the vulnerability, the trigger state is assigned negative; calculating a preferred processing output value corresponding to each first vulnerability type based on the triggering state and the confidence information; under the condition that the preferred processing output value is positive, storing the detection result file in the corresponding first vulnerability type into an initial vulnerability list; and obtaining a detection result file in the second vulnerability type, and storing the detection result file in the second vulnerability type to an initial vulnerability list.
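The integration step can be sketched as follows. The exact formula for the preferred processing output value is not given in this passage, so the sketch assumes a confidence-weighted signed vote (the sum of trigger state times confidence); that formula, the `(trigger, confidence, report)` tuple layout, the file names, and the choice to store only reports from triggered detectors are all illustrative assumptions.

```python
from typing import Dict, List, Tuple

# (trigger state, confidence, report file) for one detector; hypothetical layout.
DetectorResult = Tuple[int, float, str]


def preferred_output(results: List[DetectorResult]) -> float:
    """Assumed formula: sum of trigger state (+1 or -1) times confidence."""
    return sum(trigger * confidence for trigger, confidence, _ in results)


def build_initial_list(
    first_type_results: Dict[str, List[DetectorResult]],
    second_type_reports: List[str],
) -> List[str]:
    """Collect report files for the initial vulnerability list."""
    initial: List[str] = []
    for results in first_type_results.values():
        if preferred_output(results) > 0:
            # Assumption: only detectors that triggered have a report to store.
            initial.extend(report for trigger, _, report in results if trigger == 1)
    # Second-type results are stored without voting.
    initial.extend(second_type_reports)
    return initial


initial_list = build_initial_list(
    {
        "reentrancy": [(1, 0.9, "a.json"), (-1, 0.6, "b.json")],
        "integer-overflow": [(1, 0.4, "c.json"), (-1, 0.7, "d.json")],
    },
    second_type_reports=["e.json"],
)
```

In this toy input the reentrancy vote is positive and the overflow vote is negative, so only the reentrancy report and the second-type report reach the list.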
In yet another embodiment provided in the present disclosure, the data processing module 620 is further configured to use the officially published detection accuracy and/or the detection accuracy obtained through data set testing as the initial confidence information; adjusting the initial confidence information by adopting an objective function to obtain the adjusted confidence information; wherein said adjusting said initial confidence information using an objective function comprises: for a detector whose trigger state is positive, the corresponding initial confidence information is raised, and for a detector whose trigger state is negative, the corresponding initial confidence information is lowered;
the objective function includes:
$Z = 1 - [x_1, x_2, \dots, x_n] \cdot [y'_1, y'_2, \dots, y'_n]^{T}$
wherein $Z$ is the output value of the objective function; $x_n$ is the trigger state of each detector, which is 1 for a detector that detects the vulnerability and -1 for a detector that does not; $y'_n$ is the initial confidence information of each detector; and $n$ is the number of detectors corresponding to each first vulnerability type.
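As a worked example, the objective function can be evaluated directly from the trigger states and confidences. The sketch below only computes Z; how the confidences are actually adjusted is not specified here, and the numeric values are invented for illustration.

```python
from typing import Sequence


def objective(triggers: Sequence[int], confidences: Sequence[float]) -> float:
    """Z = 1 - x . y', with x_i in {1, -1} and y'_i the detector confidence."""
    if len(triggers) != len(confidences):
        raise ValueError("one confidence value is required per detector")
    return 1.0 - sum(x * y for x, y in zip(triggers, confidences))


triggers = [1, 1, -1]            # two detectors fired, one did not
initial = [0.5, 0.25, 0.25]      # invented initial confidences
z_initial = objective(triggers, initial)

# Raising confidence for triggered detectors and lowering it for the
# non-triggered one both increase the dot product, so Z decreases.
adjusted = [0.75, 0.25, 0.0]
z_adjusted = objective(triggers, adjusted)
```

With these values Z falls from 0.5 to 0.0, which is consistent with the stated direction of adjustment: both kinds of change reduce Z.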
In yet another embodiment provided in the present disclosure, the data processing module 620 is further configured to generate, by the large language model through retrieval-augmented generation (RAG), attack contracts corresponding to each vulnerability in the initial vulnerability list; in the large language model, using the attack contract to launch a simulation attack on the intelligent contract to be detected, and obtaining a simulation attack result; under the condition that the simulation attack is established, retaining the detection result file of the corresponding vulnerability in the initial vulnerability list; under the condition that the simulation attack is not established, removing the detection result file of the corresponding vulnerability in the initial vulnerability list; and traversing and verifying each vulnerability in the initial vulnerability list to obtain the target vulnerability list.
In another embodiment provided in the present disclosure, the data processing module 620 is further configured to obtain an attack establishment basis corresponding to each vulnerability in the initial vulnerability list; creating a first account and a second account in the large language model; deploying the intelligent contract to be detected in the first account, and sequentially deploying attack contracts corresponding to each vulnerability in the initial vulnerability list in the second account; and launching a simulation attack on the first account through the second account, and obtaining a simulation attack result based on the attack establishment basis.
In yet another embodiment provided in the present disclosure, the data processing module 620 is further configured to traverse and verify each vulnerability in the initial vulnerability list to obtain a verified vulnerability list; and carrying out natural language processing on the verified vulnerability list by adopting a large language model to obtain the target vulnerability list.
Fig. 7 is a schematic block diagram of a chip provided by an exemplary embodiment of the present disclosure. As shown in fig. 7, the chip 700 includes one or more processors 701 and a communication interface 702. The communication interface 702 may support a server in performing the data transceiving steps of the method described above, and the processor 701 may support a server in performing the data processing steps of the method described above.
Optionally, as shown in fig. 7, the chip 700 further includes a memory 703, where the memory 703 may include a read only memory and a random access memory, and provides operating instructions and data to the processor. A portion of the memory may also include non-volatile random access memory (non-volatile random access memory, NVRAM).
In some embodiments, as shown in fig. 7, the processor 701 performs the corresponding operation by invoking operating instructions stored in the memory (the operating instructions may be stored in an operating system). The processor 701 controls the processing operations of any one of the terminal devices, and may also be referred to as a central processing unit (CPU). The memory 703 may include read only memory and random access memory and provides instructions and data to the processor. A portion of the memory 703 may also include NVRAM. For example, the processor, the communication interface, and the memory are coupled together by a bus system, which may include a power bus, a control bus, a status signal bus, and the like in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 704 in fig. 7.
The method disclosed by the embodiments of the disclosure can be applied to a processor or implemented by the processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the disclosure. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present disclosure may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, flash memory, read only memory, programmable read only memory, electrically erasable programmable memory, or registers. The storage medium is located in a memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
The exemplary embodiments of the present disclosure also provide an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor. The memory stores a computer program executable by the at least one processor for causing the electronic device to perform a method according to embodiments of the present disclosure when executed by the at least one processor.
The present disclosure also provides a non-transitory computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor of a computer, is for causing the computer to perform a method according to an embodiment of the present disclosure.
The present disclosure also provides a computer program product comprising a computer program, wherein the computer program, when executed by a processor of a computer, is for causing the computer to perform a method according to embodiments of the disclosure.
Fig. 8 is a block diagram illustrating an electronic device according to an exemplary embodiment of the present disclosure. Referring to fig. 8, a block diagram of an electronic device 800 that may be a server or a client of the present disclosure, which is an example of a hardware device that may be applied to aspects of the present disclosure, will now be described. Electronic devices are intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the electronic device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the electronic device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Various components in electronic device 800 are connected to I/O interface 805, including: an input unit 806, an output unit 807, a storage unit 808, and a communication unit 809. The input unit 806 may be any type of device capable of inputting information to the electronic device 800, and the input unit 806 may receive input numeric or character information and generate key signal inputs related to user settings and/or function controls of the electronic device. The output unit 807 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators, and/or printers. The storage unit 808 may include, but is not limited to, magnetic disks and optical disks. The communication unit 809 allows the electronic device 800 to exchange information/data with other devices over computer networks, such as the internet, and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers and/or chipsets, such as Bluetooth devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.
The computing unit 801 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, or the like. The computing unit 801 performs the various methods and processes described above. Each of the methods described above may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 800 via the ROM 802 and/or the communication unit 809.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
As used in this disclosure, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer program or instructions are loaded and executed on a computer, the processes or functions described by the embodiments of the present disclosure are performed in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, a terminal, a user equipment, or other programmable apparatus. The computer program or instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another computer readable storage medium; for example, the computer program or instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means. The computer readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The usable medium may be a magnetic medium, e.g., a floppy disk, hard disk, or magnetic tape; an optical medium, e.g., a digital video disc (DVD); or a semiconductor medium, e.g., a solid state drive (SSD).
Although the present disclosure has been described in connection with specific features and embodiments thereof, it will be apparent that various modifications and combinations thereof can be made without departing from the spirit and scope of the disclosure. Accordingly, the specification and drawings are merely exemplary illustrations of the present disclosure as defined in the appended claims and are considered to cover any and all modifications, variations, combinations, or equivalents within the scope of the disclosure. It will be apparent to those skilled in the art that various modifications and variations can be made to the present disclosure without departing from the spirit or scope of the disclosure. Thus, the present disclosure is intended to include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (11)

1. A multiple composite detection method of an intelligent contract, the method comprising:
acquiring an intelligent contract to be detected;
detecting the intelligent contract to be detected by adopting a plurality of detection tools respectively to obtain detection result files respectively generated by each detection tool;
carrying out data integration processing on a plurality of detection result files to obtain an initial vulnerability list;
and verifying the initial vulnerability list by adopting a large language model to obtain a target vulnerability list.
2. The method of claim 1, wherein the detection tool comprises a plurality of detectors; and the detecting the intelligent contract to be detected by adopting a plurality of detection tools respectively to obtain detection result files respectively generated by each detection tool comprises:
acquiring the vulnerability types on which the plurality of detectors respectively focus;
classifying the plurality of detectors according to the vulnerability type;
detecting the intelligent contracts to be detected by adopting the detection tools respectively to obtain detection result files generated by each detection tool respectively;
and respectively storing the detection result files into the vulnerability types corresponding to the detectors.
3. The method of claim 2, wherein the classifying the plurality of detectors by vulnerability type comprises:
presetting a plurality of first vulnerability types;
classifying the corresponding detector into the corresponding first vulnerability type under the condition that the vulnerability type focused by the detector is matched with the first vulnerability type;
in the event that the vulnerability type of interest of the detector does not match the first vulnerability type, classifying the corresponding detector into a second vulnerability type.
4. The method of claim 3, wherein the performing data integration processing on the plurality of detection result files to obtain an initial vulnerability list includes:
acquiring trigger state and confidence information of each detector in the first vulnerability type; wherein, when the detector detects the vulnerability, the trigger state is assigned positive, and when the detector does not detect the vulnerability, the trigger state is assigned negative;
calculating a preferred processing output value corresponding to each first vulnerability type based on the triggering state and the confidence information;
under the condition that the preferred processing output value is positive, storing the detection result file in the corresponding first vulnerability type into an initial vulnerability list;
and obtaining a detection result file in the second vulnerability type, and storing the detection result file in the second vulnerability type to an initial vulnerability list.
5. The method of claim 4, wherein the obtaining confidence information for each detector in the first vulnerability type comprises:
taking the officially published detection accuracy and/or the detection accuracy obtained through data set testing as initial confidence information;
adjusting the initial confidence information by adopting an objective function to obtain the adjusted confidence information;
wherein said adjusting said initial confidence information using an objective function comprises:
for the detector with positive triggering state, the corresponding initial confidence information is improved, and for the detector with negative triggering state, the corresponding initial confidence information is reduced;
the objective function includes:
$Z = 1 - [x_1, x_2, \dots, x_n] \cdot [y'_1, y'_2, \dots, y'_n]^{T}$
wherein $Z$ is the output value of the objective function; $x_n$ is the trigger state of each detector, which is 1 for a detector that detects the vulnerability and -1 for a detector that does not; $y'_n$ is the initial confidence information of each detector; and $n$ is the number of detectors corresponding to each first vulnerability type.
6. The method of claim 1, wherein validating the initial vulnerability list using a large language model to obtain a target vulnerability list comprises:
generating, by the large language model through retrieval-augmented generation (RAG), attack contracts corresponding to each vulnerability in the initial vulnerability list;
in the large language model, using the attack contract to launch a simulation attack on the intelligent contract to be detected, and obtaining a simulation attack result;
under the condition that the simulation attack is established, retaining the detection result file of the corresponding vulnerability in the initial vulnerability list;
under the condition that the simulation attack is not established, removing the detection result file of the corresponding vulnerability in the initial vulnerability list;
and traversing and verifying each vulnerability in the initial vulnerability list to obtain the target vulnerability list.
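The traversal in claim 6 can be sketched as a simple filter loop. In this sketch, `generate_attack` and `simulate_attack` are hypothetical callables standing in for the RAG-based attack contract generation and the simulated attack; their names and signatures are assumptions, and the dictionary record shape for a vulnerability is likewise illustrative.

```python
from typing import Callable, Dict, List

Vulnerability = Dict[str, str]  # hypothetical record shape, e.g. {"type": ..., "location": ...}


def verify_vulnerabilities(
    initial_list: List[Vulnerability],
    generate_attack: Callable[[Vulnerability], str],
    simulate_attack: Callable[[str], bool],
) -> List[Vulnerability]:
    """Keep only the vulnerabilities whose simulated attack is established."""
    target_list = []
    for vuln in initial_list:
        attack_contract = generate_attack(vuln)  # stand-in for the RAG step
        if simulate_attack(attack_contract):     # True means the attack is established
            target_list.append(vuln)             # retain the finding
        # otherwise the finding is dropped as a likely false positive
    return target_list
```

Passing the two steps in as callables keeps the loop independent of any particular model or simulation backend.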
7. The method of claim 6, wherein the launching, in the large language model, a simulated attack on the smart contract to be detected using the attack contract, obtaining a simulated attack result, comprises:
acquiring an attack establishment basis corresponding to each vulnerability in the initial vulnerability list;
creating a first account and a second account in the large language model;
deploying the intelligent contracts to be detected in the first account, and sequentially deploying attack contracts corresponding to each vulnerability in the initial vulnerability list in the second account;
and launching a simulated attack on the first account through the second account, and obtaining a simulated attack result based on the attack establishment basis.
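A toy model of the two-account simulation in claim 7 might look like the following. It does not execute real contract code: the `Account` class, the `drain` attack, and the `funds_moved` basis are all hypothetical stand-ins, with the attack establishment basis modeled as a predicate over balance snapshots taken before and after the attack.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Snapshot = Dict[str, int]  # balances keyed by role: "first" (target) and "second" (attacker)


@dataclass
class Account:
    name: str
    balance: int


def simulate(first: Account, second: Account,
             attack: Callable[[Account, Account], None],
             basis: Callable[[Snapshot, Snapshot], bool]) -> bool:
    """Run the attack from the second account and judge it against the basis."""
    before = {"first": first.balance, "second": second.balance}
    attack(first, second)
    after = {"first": first.balance, "second": second.balance}
    return basis(before, after)


def drain(first: Account, second: Account) -> None:
    """Hypothetical attack: move every unit from the target to the attacker."""
    second.balance += first.balance
    first.balance = 0


def funds_moved(before: Snapshot, after: Snapshot) -> bool:
    """Hypothetical basis: the attacker gained funds and the target lost funds."""
    return after["second"] > before["second"] and after["first"] < before["first"]
```

Calling `simulate(Account("target", 100), Account("attacker", 0), drain, funds_moved)` reports the attack as established, while an attack that changes nothing does not satisfy the basis.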
8. The method of claim 6, wherein the traversing and verifying each vulnerability in the initial vulnerability list to obtain the target vulnerability list comprises:
traversing and verifying each vulnerability in the initial vulnerability list to obtain a verified vulnerability list;
and carrying out natural language processing on the verified vulnerability list by adopting a large language model to obtain the target vulnerability list.
9. A multiple composite detection device of an intelligent contract, comprising:
the data acquisition module is used for acquiring the intelligent contract to be detected;
the data processing module is used for respectively detecting the intelligent contracts to be detected by adopting a plurality of detection tools to obtain detection result files respectively generated by each detection tool;
the data processing module is further used for carrying out data integration processing on the plurality of detection result files to obtain an initial vulnerability list;
the data processing module is further used for verifying the initial vulnerability list by adopting a large language model to obtain a target vulnerability list.
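The module flow of claim 9 can be sketched end to end as below. The `Finding` tuple shape, the detector and verifier callables, and the choice to integrate results by de-duplicating identical findings are assumptions for illustration only; the integration described elsewhere in the claims (confidence weighting) is not reproduced here.

```python
from typing import Callable, Dict, List, Tuple

Finding = Tuple[str, int]                  # (vulnerability type, source line), hypothetical shape
Detector = Callable[[str], List[Finding]]  # takes contract source, returns its findings


def run_pipeline(source: str,
                 detectors: Dict[str, Detector],
                 verify: Callable[[str, Finding], bool]) -> List[Finding]:
    """Detect with every tool, integrate the results, then verify each finding."""
    # One result set per detection tool, standing in for the detection result files.
    results = {name: tool(source) for name, tool in detectors.items()}
    # Assumed integration step: merge all findings and drop exact duplicates.
    initial = sorted({f for findings in results.values() for f in findings})
    # Verification step: keep only findings the verifier confirms.
    return [f for f in initial if verify(source, f)]
```

A finding reported by several tools appears once in the initial list, so the verifier is invoked once per distinct finding.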
10. An electronic device, comprising:
a processor; and
a memory storing a program;
wherein the program comprises instructions which, when executed by the processor, cause the processor to perform the method according to any of claims 1-8.
11. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-8.
CN202311765424.2A 2023-12-21 2023-12-21 Multiple composite detection method, device, equipment and storage medium of intelligent contract Pending CN117725594A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311765424.2A CN117725594A (en) 2023-12-21 2023-12-21 Multiple composite detection method, device, equipment and storage medium of intelligent contract

Publications (1)

Publication Number Publication Date
CN117725594A (en) 2024-03-19

Family

ID=90199567

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311765424.2A Pending CN117725594A (en) 2023-12-21 2023-12-21 Multiple composite detection method, device, equipment and storage medium of intelligent contract

Country Status (1)

Country Link
CN (1) CN117725594A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination