CN115658549A - Formal verification method for source code - Google Patents

Publication number
CN115658549A
Authority
CN
China
Prior art keywords
protocol
function
source code
code
generator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211570303.8A
Other languages
Chinese (zh)
Other versions
CN115658549B (en)
Inventor
赵永望
章喆
姚历智
赵健宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Wangan Technology Co ltd
Original Assignee
Zhejiang Wangan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Wangan Technology Co ltd
Priority to CN202211570303.8A
Publication of CN115658549A
Application granted
Publication of CN115658549B
Legal status: Active
Anticipated expiration

Abstract

The invention provides a formal verification method for source code that uses an input generator, a symbolic executor, a specification generator, a specification matcher, a formal verification specification library, a specification learner, a specification filter, a natural language converter, a code verifier and a verification report generator. The input generator takes the source code as input and generates a plurality of code files in which variable and function parameter values are randomized. The symbolic executor reads the code files and produces a file set in which each file contains the input state, the executed function and the output state of a single function. The specification generator automatically derives an original specification from the source code, and the specification learner corrects and supplements that specification by model learning. The flow is highly automated, which lowers the threshold for developers and verification personnel who would otherwise write formal specifications by hand, reduces the cost of learning and applying formal languages, and raises the degree of verification automation.

Description

Formal verification method for source code
Technical Field
The invention relates to the technical field of formal verification, in particular to a formal verification method for a source code.
Background
Software correctness is assured by three kinds of techniques: software testing, code analysis and formal verification. Of these, formal verification is currently the most common.
However, formal verification has a high learning cost, and each programming language has several functional specification languages, so functional specifications are rarely reused. That is, when new software has similar functions, the functional specifications written previously cannot be used directly, and the differences between implementations of the same function in different programming languages cannot be analyzed quickly.
Disclosure of Invention
(I) Technical problem to be solved
In view of the shortcomings of the prior art, the invention provides a formal verification method for source code that solves the problems described in the background.
(II) Technical solution
To achieve the above purpose, the invention is realized by the following technical solution: a formal verification method for source code, using an input generator, a symbolic executor, a specification generator, a specification matcher, a formal verification specification library, a specification learner, a specification filter, a natural language converter, a code verifier and a verification report generator. The input generator takes the source code as input and generates a plurality of code files in which variable and function parameter values are randomized. The symbolic executor reads the code files and produces a file set, each file containing the input state, the executed function and the output state of a single function. The specification generator reads the file set and generates an original formal specification for each function. The specification matcher receives the original specification set, retrieves from the formal verification specification library the formal verification specifications that conform to the original specifications, and generates a set of specifications with similar names and properties in order to expand the original specification set. The specification learner takes the expanded specification set as input, adjusts and screens it with a model learning algorithm, and outputs the adjusted specification set. Requirement documents and interface documents are converted into specification documents by the natural language converter and passed, together with the generated specification set, to the specification filter, which screens out the final specification for each function. The code verifier formally verifies all source code against its formal specifications and checks whether they are consistent, and a verification report is generated by the verification report generator.
The input generator is used for receiving and verifying the source code, decomposing the source code and generating a plurality of code function files, wherein each code file is a set of readable computer language instructions.
Preferably, the symbol executor is configured to receive the generated code file, parse the code file to obtain a set of code functions, where a single set includes an input state, an execution function, and an output state of a certain function, and package the set in a file manner and transmit the package to the simulation learner.
Preferably, the specification generator is configured to preprocess the functions, generate a list of functions to be verified, and initialize a function specification for each function in the list to generate an original specification.
Preferably, the formal verification conventions library is based on a set of detected formal conventions, including formal descriptions of different language grammar rules.
Preferably, the protocol matcher is configured to match the original protocol generated in the previous step with protocols in a formalized protocol library, and generate different types of compliant protocol sets.
Preferably, the specification learner is configured to accept the specification set, accurately and reliably learn the source code through the specification set, and automatically generate an original formalized specification corresponding to each function in the source code.
Preferably, the natural language converter is configured to receive a requirement document, an interface document, and the like, convert the requirement document and the interface document into a requirement specification document and an interface specification document described by using a formal language, and transmit the specification document to the specification filter.
Preferably, the specification filter is configured to filter and judge a specification corresponding to each function, and transmit the set to the code verifier.
Preferably, the code verifier is used for verifying whether all functions and corresponding specifications are in compliance.
(III) Advantageous effects
The invention provides a formal verification method for source code, with the following beneficial effects:
1. The specification generator automatically processes the source code to produce the original specification, and the specification learner corrects and supplements it by model learning. The flow is highly automated, which lowers the threshold for developers and verification personnel who would otherwise write formal specifications by hand, reduces the cost of learning and applying formal languages, and raises the degree of verification automation.
2. The scheme is applicable to a variety of programming languages, converts programming languages into formal specifications, and is therefore highly general.
Drawings
FIG. 1 is a flow chart of a method of formally verifying source code in accordance with the present invention;
FIG. 2 is a block diagram of the learning process of the specification learner.
Detailed Description
An embodiment of the invention provides a formal verification method for source code, using an input generator, a symbolic executor, a specification generator, a specification matcher, a formal verification specification library, a specification learner, a specification filter, a natural language converter, a code verifier and a verification report generator. The input generator takes the source code as input and generates a plurality of code files in which variable and function parameter values are randomized. The symbolic executor reads the code files and produces a file set, each file containing the input state, the executed function and the output state of a single function. The specification generator reads the file set and generates an original formal specification for each function. The specification matcher receives the original specification set, retrieves from the formal verification specification library the formal verification specifications that conform to the original specifications, and generates a set of specifications with similar names and properties in order to expand the original specification set. The specification learner takes the expanded specification set as input, adjusts it with a model learning algorithm, and outputs the adjusted specification set. Requirement documents and interface documents are converted into specification documents by the natural language converter and passed, together with the generated specification set, to the specification filter, which screens out the final specification for each function. The code verifier formally verifies all source code against its formal specifications and checks whether they are consistent, and the verification report generator generates the final verification report.
The input generator is used for receiving and verifying a source code, decomposing the source code, generating a plurality of code function files, converting each function into a plurality of code files which only execute one program entry function of the function, adjusting global variables and input parameter types, and transmitting the generated code files to the symbol executor, wherein each code file is a set of readable computer language instructions.
The symbol executor is used for receiving the generated code file, analyzing the code file to obtain a set of code functions, packaging the set in a file mode and transmitting the packaged set to the simulation learner, wherein a single set comprises an input state, an execution function and an output state of a certain function.
The protocol generator is used for preprocessing functions, generating a function list to be verified, and initializing function protocols of each function in the list to generate an original protocol.
The formal verification conventions library is based on a set of detected formal conventions, including formal descriptions of different language grammar rules.
And the protocol matcher is used for matching the original protocol generated in the last step with protocols in the formalized protocol library and generating different types of protocol sets.
The specification learner is used for receiving the specification set, accurately and reliably learning the source code through the specification set, and automatically generating an original formalized specification corresponding to each function in the source code.
The natural language converter is used for receiving the requirement document, the interface document and the like, converting the requirement document and the interface document into a requirement specification document and an interface specification document which are described by using a formal language, and transmitting the specification document to the specification filter.
The protocol filter is used for filtering and judging the protocol corresponding to each function, and transmitting the set to the code verifier.
The code verifier is used for verifying whether all functions and corresponding specifications are consistent.
The method comprises the steps of automatically initializing and matching a source code protocol by establishing a formal rule model library, calibrating the formal protocol by combining natural/semi-formal requirements and design documents, and then automatically verifying and reporting the protocol by a formal verifier. The invention is realized by the following technical scheme:
First, the source code file to be verified is read and fed to the input generator. For each function in the source code file, several code files that execute only that function are generated. In the generated files, the original global variables are set to random values, and the function's input parameters are given random values of their types.
Specifically, the generator receives the source code, describes the syntax of the programming language in BNF and its extension EBNF, and converts the code into a syntax tree. All function definition statements are located in the syntax tree, and for each function the name, return value type, parameter types and any other information needed to call it (e.g., class name, package name) are determined. The code block invoked by the program entry point in the source code file is then deleted and replaced with a code block that calls the function.
That code block should contain the following: the construction of the return variable and the parameters, where the return variable only needs to be initialized and the parameters are randomly assigned; and the function call, in which the return value of the call is assigned to the previously constructed return variable.
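The first step can be illustrated with a minimal sketch. It is an assumption-laden illustration rather than the patented input generator: it targets Python source, uses the standard `ast` module in place of a BNF/EBNF-driven parser, and randomizes only integer globals and integer arguments. The names `find_functions`, `make_harness` and `SOURCE` are hypothetical. `ast.unparse` requires Python 3.9 or later.

```python
import ast
import random

# Hypothetical input: one global, one function, and original entry-point code.
SOURCE = """
LIMIT = 10

def clamp(x, lo):
    return max(lo, min(x, LIMIT))

print(clamp(3, 0))
"""

def find_functions(source: str):
    """Return (name, parameter names) for every top-level function definition."""
    tree = ast.parse(source)
    return [(node.name, [a.arg for a in node.args.args])
            for node in tree.body if isinstance(node, ast.FunctionDef)]

def make_harness(source: str, name: str, params, rng: random.Random) -> str:
    """Build one code file whose only entry code is a call to `name`."""
    tree = ast.parse(source)
    kept = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.Import, ast.ImportFrom)):
            kept.append(node)
        elif (isinstance(node, ast.Assign)
              and isinstance(node.value, ast.Constant)
              and type(node.value.value) is int):
            # Integer global variable: replace its value with a random one.
            node.value = ast.Constant(value=rng.randint(-100, 100))
            kept.append(node)
        # Any other top-level statement (the original entry code) is dropped.
    tree.body = kept
    args = ", ".join(str(rng.randint(-100, 100)) for _ in params)
    # Return variable is initialized first, then assigned the call result.
    call_block = f"result = None\nresult = {name}({args})\n"
    return ast.unparse(ast.fix_missing_locations(tree)) + "\n" + call_block

if __name__ == "__main__":
    for fn_name, fn_params in find_functions(SOURCE):
        print(make_harness(SOURCE, fn_name, fn_params, random.Random(0)))
```

Calling `make_harness` several times with different random seeds yields the "plurality of code files" for one function, each differing only in the randomized global and parameter values.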
Second, the code files output above are fed to the symbolic executor, which produces the source code function file set (each entry containing the input state, the executed function and the output state of a function). Symbolic execution is virtual execution carried out on an abstract program state, based on source code semantics defined in advance; executing a program is a process of program state transitions. The content of a single collection package is: the executed file, an identifier of the executed function (typically the function name plus the line number of the code in the executed file), the program state before the function executes, and the program state after the function executes (or a derived loop invariant if a loop exists).
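The shape of a single collection package can be sketched as a small record type. This is only an illustration of the four fields listed above: the sketch runs the function concretely instead of symbolically, and the names `ExecutionRecord`, `record_call` and `double` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

@dataclass(frozen=True)
class ExecutionRecord:
    file: str                      # the executed file
    function_id: str               # function name + line number in that file
    input_state: Dict[str, Any]    # program state before the call
    output_state: Dict[str, Any]   # program state after the call

def record_call(file: str, func: Callable, args: Dict[str, Any]) -> ExecutionRecord:
    """Run `func` once (concretely, as a stand-in for symbolic execution)
    and capture the state before and after the call."""
    line = func.__code__.co_firstlineno
    result = func(**args)
    return ExecutionRecord(
        file=file,
        function_id=f"{func.__name__}:{line}",
        input_state=dict(args),
        output_state={"return": result},
    )

def double(x):
    return 2 * x

if __name__ == "__main__":
    print(record_call("demo.py", double, {"x": 4}))
```

A real symbolic executor would store path conditions and symbolic values (or a loop invariant) in the two state fields rather than concrete values.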
Third, the function file set is fed to the specification generator, which generates an original formal specification for each function in the source code. The original formal specification comprises a precondition and a postcondition. The precondition and the postcondition have a similar structure: each contains arguments and the constraints that those arguments should satisfy. A constraint on an argument should be an expression that returns a Boolean value.
The specification generator preprocesses the source program (function set) to obtain a source program free of preprocessing directives, generates a list of functions to be verified, and performs the same function specification initialization on each function in the list:
1. Acquire all global variables used by the function:
a. acquire all global variables in the source program;
b. extract the operand variables in all expressions of the target function in the source program;
c. delete the global variables that are not used as operand variables, and store the remainder as the list of global variables used by the function.
2. Acquire the arguments of the precondition: the "target function parameters" and the "global variables used by the function" are set as arguments.
3. Set the initial precondition:
a. each argument of the source program should have a type; if the type is undetermined, it is determined from the other expressions in which the argument takes part and from the argument types at the positions where the function is called;
b. each argument should be constrained with respect to its "undetermined type value" and its "allowed type value".
4. Acquire the arguments of the postcondition: the "target function return argument" and the "global variables used by the function" are set as arguments.
5. Set the initial postcondition:
a. each argument of the source program should have a type; if the type is undetermined, it is determined from the other expressions in which the argument takes part and from the argument types at the positions where the function is called;
b. each argument should be constrained with respect to its undetermined and allowed type values, and the result of the preceding symbolic execution is recorded as "input state" implies "output state".
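The initialization steps above can be sketched for Python source as follows. This is a simplified illustration under stated assumptions: module-level assignments stand in for "global variables", type inference is omitted, and the initial conditions are the trivial constraint `True`. The names `module_globals`, `initial_spec` and `SOURCE` are hypothetical.

```python
import ast

# Hypothetical input program with one used and one unused global.
SOURCE = "RATE = 3\nUNUSED = 7\n\ndef scale(x):\n    return x * RATE\n"

def module_globals(tree: ast.Module) -> set:
    """Step 1a: names assigned at module level."""
    names = set()
    for node in tree.body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names

def initial_spec(source: str, func_name: str) -> dict:
    tree = ast.parse(source)
    func = next(n for n in tree.body
                if isinstance(n, ast.FunctionDef) and n.name == func_name)
    params = [a.arg for a in func.args.args]
    # Step 1b: every variable name appearing in the function's expressions.
    used = {n.id for n in ast.walk(func) if isinstance(n, ast.Name)}
    # Step 1c: keep only the globals the function actually uses.
    used_globals = sorted((module_globals(tree) & used) - set(params))
    return {
        "function": func_name,
        "pre_args": params + used_globals,        # step 2
        "pre": "True",                            # step 3 (placeholder constraint)
        "post_args": ["return"] + used_globals,   # step 4
        "post": "True",                           # step 5 (placeholder constraint)
    }

if __name__ == "__main__":
    print(initial_spec(SOURCE, "scale"))
```

The placeholder constraints would later be tightened by the matcher and the learner; the sketch only shows how the argument lists of the two conditions are assembled.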
Fourth, the original specification set of the function set is read, and the formal verification specifications that conform to the original specifications generated in the previous step are retrieved from the formal verification specification library. Starting from the original specifications, the specification matcher uses the library to expand them into a specification set whose members have a high textual matching degree and whose preconditions and postconditions intersect with those of the originals.
The specification matcher first matches specifications with similar names in the specification library according to the function name. Then, by set operations, it finds among the (several) similarly named specifications those whose preconditions and postconditions intersect with the initial specification. Finally, the specification set with the highest overall matching degree is merged with the original specification and output.
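A minimal sketch of this matching step follows. The representation is an assumption: each specification is a dictionary whose conditions are sets of constraint strings, name similarity is measured with `difflib.SequenceMatcher`, and the "overall matching degree" is taken to be name similarity plus the number of shared constraints. The names `match_and_merge`, `ORIGINAL` and `LIBRARY`, and the threshold value, are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical original specification and specification library.
ORIGINAL = {"name": "abs_value", "pre": {"x: int"}, "post": {"result: int"}}
LIBRARY = [
    {"name": "absolute_value", "pre": {"x: int"},
     "post": {"result: int", "result >= 0"}},
    {"name": "sort_list", "pre": {"xs: list"}, "post": {"result is sorted"}},
]

def name_similarity(a: str, b: str) -> float:
    """Textual similarity of two specification names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_and_merge(original: dict, library: list, threshold: float = 0.5) -> dict:
    best, best_score = None, 0.0
    for spec in library:
        sim = name_similarity(original["name"], spec["name"])
        if sim < threshold:
            continue  # name is not similar enough
        shared = (original["pre"] & spec["pre"]) | (original["post"] & spec["post"])
        if not shared:
            continue  # conditions do not intersect with the original
        score = sim + len(shared)
        if score > best_score:
            best, best_score = spec, score
    if best is None:
        return original  # nothing matched; keep the original specification
    # Merge the best match into the original specification.
    return {"name": original["name"],
            "pre": original["pre"] | best["pre"],
            "post": original["post"] | best["post"]}

if __name__ == "__main__":
    print(match_and_merge(ORIGINAL, LIBRARY))
```

In this example the library entry `absolute_value` contributes the extra postcondition `result >= 0`, while `sort_list` is rejected because neither its name nor its conditions match.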
And fifthly, inputting the expanded protocol set and the original protocol into a protocol learner to learn the function protocol and output the protocol of the function.
Function specification learning is applied only to functions that do not call any function lacking a function specification. If a function called inside another function has no function specification, the function specification of that called function is learned first.
The goal of learning is to obtain a possible function specification for the function.
In one embodiment, the questioning module in the learner is referred to as the student and the answering module as the teacher. Learning is the process of gradually obtaining more accurate preconditions and postconditions through the student's repeated queries and the teacher's answers.
As shown in fig. 2, the specific steps are as follows:
1. The student always holds a precondition and a postcondition, combines them into a function specification, and asks the teacher whether it is a function specification of the function.
2. The teacher checks the specification with a program verifier, a device that takes a specification and the source code as input and returns a verification result: true if the specification is consistent with the source code, and false if it is not:
a. If it is not consistent, a counterexample is returned. The student deletes this case from the precondition and postcondition according to the counterexample. If errors are encountered several times in a row, the student constructs the smallest set containing those counterexamples and deletes it; if the specification is still inconsistent after several attempts, the student returns to the most recent consistent precondition and postcondition.
b. If it is consistent, an example is returned based on an operation in the source code that uses the function. The student adds this case to the precondition and postcondition according to the example; if the answer is correct several times in a row, the student constructs the smallest set containing those examples and adds it.
Learning is repeated in this way until a corresponding original formal specification set has been generated for each function in the source code.
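The student and teacher exchange can be sketched as a small loop. The sketch makes strong simplifying assumptions: the precondition is an explicit finite set of allowed inputs, the postcondition is fixed, the teacher is a brute-force checker standing in for a program verifier, and only the counterexample branch (2a) is shown. The names `teacher`, `learn_precondition` and `inverse_scaled` are hypothetical.

```python
def teacher(func, pre, post):
    """Answer the student's query: is `post` satisfied for every input in `pre`?
    Returns (True, None) or (False, counterexample)."""
    for x in sorted(pre):
        try:
            ok = post(x, func(x))
        except Exception:
            ok = False  # a crash also counts as a violation
        if not ok:
            return False, x
    return True, None

def learn_precondition(func, post, domain, max_rounds=1000):
    """Student: start from the widest precondition and delete counterexamples."""
    pre = set(domain)
    for _ in range(max_rounds):
        consistent, counterexample = teacher(func, pre, post)
        if consistent:
            return pre
        pre.discard(counterexample)
    raise RuntimeError("no consistent precondition found within max_rounds")

def inverse_scaled(x):
    return 10 // x  # fails for x == 0

if __name__ == "__main__":
    learned = learn_precondition(
        inverse_scaled,
        post=lambda x, r: isinstance(r, int),
        domain=range(-3, 4),
    )
    print(sorted(learned))
```

Here the teacher's single counterexample, `x == 0`, is removed from the candidate precondition, after which the specification is accepted. A fuller sketch would also widen the conditions from positive examples, as in branch 2b.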
And sixthly, after the requirement document and the interface document are converted into the standard document through the natural language converter, the standard document and the generated protocol set are transmitted to the protocol filter together, and the protocol corresponding to each function is screened out.
As a further preliminary step in the specification screening stage, the natural language converter extracts requirement names from the requirement document and finds those whose meaning is similar to an original specification name. The requirement description corresponding to each such name is analyzed as natural language: nouns are converted into free arguments and verbs into functions. The corresponding intermediate verification language (IVL, such as Boogie) code is generated and then verified with the intersection specification as its precondition and postcondition. If verification passes, the requirement is marked as a probable requirement; if not, it is marked as a low-probability requirement. The same operation is performed on the interface document, except that interface names are extracted. The requirement name and the interface name are added to the specification's fields.
And seventhly, selecting all source codes and formal conventions thereof, putting the source codes and the formal conventions into a code verifier for formal verification, and verifying whether all functions and the corresponding conventions are consistent or not.
Eighth, the verification results are entered into a report: the verification report generator generates the verification report from the verification results, recording for each function whether it is consistent with its specification.
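The last two steps can be illustrated with a small sketch. It is not a formal verifier: checking each function against its specification on a finite list of sample inputs is used here as a stand-in, and the report format is invented for illustration. The names `verify`, `make_report`, `FUNCTIONS`, `SPECS` and `SAMPLES` are hypothetical.

```python
# Hypothetical functions under verification and their (pre, post) specifications.
FUNCTIONS = {
    "square": lambda x: x * x,
    "negate": lambda x: -x,
}
SPECS = {
    "square": (lambda x: True, lambda x, r: r >= 0),
    "negate": (lambda x: True, lambda x, r: r >= 0),  # deliberately too strong
}
SAMPLES = list(range(-2, 3))

def verify(functions, specs, samples):
    """Check every function against its specification on the sample inputs."""
    results = {}
    for name, func in functions.items():
        pre, post = specs[name]
        results[name] = all(post(x, func(x)) for x in samples if pre(x))
    return results

def make_report(results):
    """Turn the per-function results into a plain-text verification report."""
    lines = [f"{name}: {'consistent' if ok else 'INCONSISTENT'}"
             for name, ok in sorted(results.items())]
    lines.append(f"summary: {sum(results.values())}/{len(results)} functions consistent")
    return "\n".join(lines)

if __name__ == "__main__":
    print(make_report(verify(FUNCTIONS, SPECS, SAMPLES)))
```

In a real pipeline the `verify` step would call a program verifier, and the report would carry the counterexamples for inconsistent functions.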
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A method of formally verifying source code, comprising: the system comprises an input generator, a symbol executor, a protocol generator, a protocol matcher, a formal verification protocol library, a protocol learner, a protocol filter, a natural language converter, a code verifier and a verification report generator, wherein the input generator takes a source code as input and generates a plurality of code files for randomizing variable and function parameter values;
the symbol executor is used for reading the code files and analyzing a file set, wherein each file comprises an input state, an execution function and an output state of a single function;
the protocol generator is used for reading the file set and generating an original formalized protocol corresponding to each function;
after the original protocol set is received by the protocol matcher, calling a formal verification protocol which conforms to the original protocol generated above from the formal verification protocol library, and generating a protocol set with similar name and property through the protocol matcher so as to expand the original protocol set;
the protocol learner inputs the expanded protocol set, adjusts and screens the protocol set by using a model learning algorithm, and outputs an adjusted protocol;
the requirement document and the interface document are converted into standard documents by the natural language converter and then transmitted, together with the generated protocol set, to the protocol filter, which screens out the final protocol corresponding to each function;
the code verifier is used for performing formal verification on all source codes and formal conventions thereof, verifying whether the source codes and the formal conventions conform to each other or not, and generating a verification report through the verification report generator.
2. A method of formally verifying source code according to claim 1, wherein: the input generator is used for receiving and verifying the source code, decomposing the source code and generating a plurality of code function files, wherein each code file is a set of readable computer language instructions.
3. A method of formally validating source code according to claim 2, wherein: the symbol executor is used for receiving the generated code files, analyzing the code files to obtain a set of code functions, packing the set in a file mode and transmitting the packed set to the simulation learner, wherein a single set comprises an input state, an execution function and an output state of a certain function.
4. A method of formally verifying source code according to claim 1, wherein: the protocol generator is used for preprocessing functions, generating a function list to be verified, initializing function protocols of each function in the list and generating an original protocol.
5. A method of formally validating source code according to claim 1, wherein: the formalized verification conventions library is based on a set of detected formalized conventions, including formalized descriptions of different language grammar rules.
6. A method of formally validating source code according to claim 1, wherein: the protocol matcher is used for matching the original protocol generated in the last step with protocols in a formalized protocol library and generating different types of protocol sets.
7. A method of formally verifying source code according to claim 1, wherein: the protocol learner is used for receiving the protocol set, accurately and reliably learning the source code through the protocol set and automatically generating the original formalized protocol corresponding to each function in the source code.
8. A method of formally validating source code according to claim 1, wherein: the natural language converter is used for receiving a requirement document, an interface document and the like, converting the requirement document and the interface document into a requirement specification document and an interface specification document which are described by using a formal language, and transmitting the specification document to the specification filter.
9. A method of formally validating source code according to claim 1, wherein: the protocol filter is used for filtering and judging the protocol corresponding to each function and transmitting the set to the code verifier.
10. A method of formally validating source code according to claim 9, wherein: the code verifier is used for verifying whether all functions and corresponding specifications are consistent.
CN202211570303.8A 2022-12-08 2022-12-08 Formal verification method for source code Active CN115658549B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211570303.8A CN115658549B (en) 2022-12-08 2022-12-08 Formal verification method for source code

Publications (2)

Publication Number Publication Date
CN115658549A (en) 2023-01-31
CN115658549B (en) 2023-03-07

Family

ID=85019679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211570303.8A Active CN115658549B (en) 2022-12-08 2022-12-08 Formal verification method for source code

Country Status (1)

Country Link
CN (1) CN115658549B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116756000A (en) * 2023-05-24 2023-09-15 浙江望安科技有限公司 Method for continuously integrating combined form verification

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6591403B1 (en) * 2000-10-02 2003-07-08 Hewlett-Packard Development Company, L.P. System and method for specifying hardware description language assertions targeting a diverse set of verification tools
CN108985073A (en) * 2018-07-18 2018-12-11 成都链安科技有限公司 A highly automated smart contract formal verification system and method
WO2020046981A1 (en) * 2018-08-28 2020-03-05 Amazon Technologies, Inc. Automated code verification service and infrastructure therefor
CN110989997A (en) * 2019-12-04 2020-04-10 电子科技大学 Formal verification method based on theorem verification
CN112685315A (en) * 2021-01-05 2021-04-20 电子科技大学 C-source code-oriented automatic formal verification tool and method
CN115310095A (en) * 2022-08-08 2022-11-08 成都链安科技有限公司 Block chain intelligent contract mixed formal verification method and system
CN115357492A (en) * 2022-08-19 2022-11-18 浙江大学 Formal verification method and device for Java software

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
胡凯; 白晓敏; 高灵超; 董爱强: "Formal verification method for smart contracts" *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116756000A (en) * 2023-05-24 2023-09-15 浙江望安科技有限公司 Method for continuously integrating combined form verification
CN116756000B (en) * 2023-05-24 2024-02-06 浙江望安科技有限公司 Method for continuously integrating combined form verification

Also Published As

Publication number Publication date
CN115658549B (en) 2023-03-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant