CN110989997A

CN110989997A - Formal verification method based on theorem verification

Info

Publication number: CN110989997A
Application number: CN201911225125.3A
Authority: CN
Inventors: 杨霞; 郭文生; 瞿元; 李南铮; 黄一; 钱智成; 潘文睿; 高扬; 张冯博; 卢秀台; 熊宇
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2019-12-04
Filing date: 2019-12-04
Publication date: 2020-04-10

Abstract

The invention discloses a formal verification method based on theorem proving. The method is applied to the field of secure operating systems and addresses the security problems of existing operating systems. It comprises the following steps: refactoring the source code, formally modeling the function, describing the function as a theorem, and finally carrying out the formal proof. The invention adopts semi-automatic, human-computer interactive proof and uses the isomorphism between type systems and logic to turn the construction of a proof into the writing of a program, so that checking the correctness of a proof becomes a type-checking problem.

Description

Formal verification method based on theorem verification

Technical Field

The invention belongs to the field of computer operating systems, and in particular relates to formal verification technology for secure operating systems.

Background

In 1946 the world's first electronic digital computer, ENIAC, was born. Its appearance laid the foundation for the development of electronic computers and marked the birth of the first generation of vacuum-tube computers. Subsequently, with the development of transistor technology, of small- and large-scale integrated circuits, and of very-large-scale integrated circuits, computers went through several generations of upgrades, from second-generation transistor computers to today's fourth-generation large-scale integrated circuit computers. Computers continue to develop and, from the viewpoint of practical application, are progressing in directions such as miniaturization and intelligence. With the development of the internet, computers have become ubiquitous, from everyday devices such as mobile phones, personal computers, and household appliances to military and critical systems such as aerospace systems, nuclear power systems, and missile launch systems. Their functions are increasingly powerful, and their influence is enormous.

While enjoying the benefits of the rapid growth of computers and the internet, we also have to bear the hazards that come with them. Because computer systems are ubiquitous, once problems arise in these systems, our own rights and interests are harmed immediately. For example, the Bitcoin ransomware "WannaCry", which raged through more than 150 countries and regions in 2017, exploited the Windows vulnerability "EternalBlue" and spread through email, Trojan programs, and other channels; it caused serious harm to many industries and ultimately losses of 8 billion US dollars. The following year, an integer overflow vulnerability hit the BEC contract platform in the blockchain industry: because the source code was not protected by the SafeMath library, the overflow bypassed the code's checks, tens of thousands of tokens were lost, and the final losses reached billions of dollars. The consequences of each of these security attacks were very serious. Of course, the organizations and people concerned have proposed many countermeasures to such threats, such as promptly releasing vulnerability patches, updating virus databases, and deploying antivirus software. However, these measures cannot solve the problem at its root, and they usually arrive only after the dangerous event has occurred. To solve security problems fundamentally, the security of the operating system itself must be improved so that it can cope with many unknown security threats; how to develop safe and reliable software and hardware systems is the fundamental problem.

According to the CCF report on the development of computer science and technology in China, formal methods are receiving more and more attention in the computer science and technology community at home and abroad. Many international standardization organizations have listed formal methods as a required technical means for safety-critical systems. Among the software safety integrity levels SIL1 to SIL4, the two highest levels, SIL3 and SIL4, require the use of formal methods. In the IT industry, Microsoft has developed a series of formal verification tools to analyze the correctness of many pieces of software; Facebook widely uses Infer, a verification tool based on separation logic, in the development of its Android applications; and in China, the Huaqi corporation has established a formal verification team that uses a theorem proving assistant to verify the correctness and security of an operating system kernel.

In recent years, code-level formal verification of real systems has fallen mainly into two categories: operating system kernel verification and compiler verification. Operating system kernel verification is represented by the formal proof of the seL4 microkernel, completed abroad by the NICTA laboratory. By establishing a machine model, it proves that for each state transition of the abstract specification, the executable specification produces a uniquely corresponding state transition. The formally proved seL4 kernel can effectively defend against problems such as buffer overflows, null pointer dereferences, pointer type errors, and memory leaks. In China, the main work is the verification of μC/OS-II by a team from the University of Science and Technology of China. On the compiler side, the best-known work is the verification of CompCert by INRIA and its follow-up work. The compiler comes with a mathematical, machine-checked proof that the behavior of the generated executable code conforms exactly to the semantics of the source program. C code compiled by this compiler is therefore free of compiler-introduced errors; that is, the executable code generated by the compiler is proved to behave exactly as specified by the semantics of the source C program.

Disclosure of Invention

To solve the above technical problems, the invention adopts a formal verification method based on theorem proving, thereby addressing at the root the security hazards present in secure systems.

The technical scheme adopted by the invention is as follows: a formal verification method based on theorem proving, comprising the following steps: refactoring the source code, formally modeling the function, describing the function as a theorem, and finally carrying out the formal proof;

the source code refactoring specifically includes:

a1, sorting out the call relationships of the functions, and importing all functions and data types called by the function to be proved into one folder;

a2, refactoring the data types in the function that the CompCert C compiler cannot recognize or cannot convert completely;

a3, separating out the complex structures in the code;

a4, deleting the functions in the code that affect neither the function logic nor the variable state;

the formal modeling of the function specifically comprises: formally modeling the function automatically by means of the CompCert C compiler and the VST tool;

describing the function as a theorem specifically comprises: expressing the theorem in the Hoare logic structure, such that the resulting formally described theorem reflects the state change of each part before and after the function is executed;

the formal proof specifically comprises: carrying out a deductive proof of the formally described theorem using proof tactics.

Further, the formally described theorem conforms to the specification format of the VST.

Further, the process of describing the function as a theorem is as follows:

a1, binding the theorem description to the formal model of the original function through DECLARE, which is followed by the name of the function's formal model;

a2, listing, with a WITH statement, the variables that may appear during the whole execution of the function, and specifying their types;

a3, the precondition, whose internal structure is "PROP () LOCAL () SEP ()": PROP () represents the constraints to be satisfied before the function is executed; LOCAL () represents the local or global variables that must exist before the function is executed; SEP () is a spatial predicate representing the actual initial value at the address of an operated variable before the function is executed;

a4, the postcondition, whose internal structure is "PROP'() LOCAL'() SEP'()": PROP'() represents the constraints to be satisfied after the function is executed; LOCAL'() represents the return value of the function; SEP'() is a spatial predicate representing the concrete value at the address of an operated variable after the function is executed.

Further, the proof tactics include the rewrite tactic, the apply tactic, the forward tactic, and the entailer tactic. The rewrite tactic replaces the target variable in the current goal with the required variable from a premise; the apply tactic replaces the target variable in the current goal with the required variable from a premise and simplifies the goal being proved; the forward tactic advances the Hoare logic proof forward through the program; the entailer tactic carries out the derivation of entailment (implication) statements.

The invention has the following beneficial effects: the formal verification method based on theorem proving adopts semi-automatic, human-computer interactive proof and uses the isomorphism between type systems and logic to turn the construction of a proof into the writing of a program, so that checking the correctness of a proof becomes a type-checking problem.

Drawings

FIG. 1 is a block diagram of the formal proof process provided by an embodiment of the present invention;

FIG. 2 is a schematic diagram of the structure of the theorem description process provided by an embodiment of the present invention.

Detailed Description

To help those skilled in the art understand the technical content of the present invention, the invention is further explained below with reference to FIGS. 1 and 2.

Formal verification commonly takes two forms. One infers whether the model specification of a system satisfies a property specification; here the model specification tends to be operational, while the property is usually declarative. The other infers whether one model specification of the system stands in a refinement or equivalence relation to another. The reasoning process provides a set of static methods for predicting system behavior: during development, the user can state expected properties of the system behavior, or conjectures about the relation between different abstractions, and formal verification mechanically proves or refutes these properties or conjectures, thereby increasing the user's confidence in the specification and the system.

1 Model checking

Model checking is a formal method for proving precisely that a system works correctly with respect to a preset goal. It is an automatic technique for checking finite-state reactive systems. The main idea is to describe the given system and its properties with finite-state models and then use an efficient search procedure (the model checker) to determine whether the system model meets the original design requirements (the system properties). To guarantee that the search terminates, the state space of the model is usually restricted to be finite. In model checking, various formal specification languages can serve as the modeling language, and temporal logic is generally used as the property description language. Model checking is an automated verification method that, when the model does not satisfy a property, produces a counterexample to help locate and fix the problem in the model. However, model checking suffers from the state space explosion problem, so it may leave system defects undetected and cannot prove that the system is completely correct.
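The search idea described above can be sketched in C as a breadth-first exploration of a small finite state space. This is only an illustrative toy under stated assumptions: the 16-state system, the transition relation in successor, and the function name check_never_reaches are all hypothetical and are not part of the invention.

```c
#include <assert.h>

#define NUM_STATES 16   /* finite state space: states 0 .. 15 */

/* Hypothetical transition relation: every state has two successors. */
static int successor(int state, int choice)
{
    return choice == 0 ? (state + 2) % NUM_STATES
                       : (state * 2) % NUM_STATES;
}

/* Breadth-first search over all states reachable from init
 * (init and bad are assumed to lie in 0 .. NUM_STATES - 1).
 * Property checked: "the state bad is never reached".
 * Returns -1 if the property holds. Otherwise writes a shortest
 * counterexample path to trace and returns its number of states. */
int check_never_reaches(int init, int bad, int trace[NUM_STATES])
{
    int visited[NUM_STATES] = {0};
    int parent[NUM_STATES];
    int queue[NUM_STATES];
    int head = 0, tail = 0;

    visited[init] = 1;
    parent[init] = -1;
    queue[tail++] = init;

    while (head < tail) {
        int s = queue[head++];
        int choice;
        if (s == bad) {
            int length = 0, pos, t;
            /* First count the path length, then fill trace back to front. */
            for (t = s; t != -1; t = parent[t]) length++;
            pos = length;
            for (t = s; t != -1; t = parent[t]) trace[--pos] = t;
            return length;
        }
        for (choice = 0; choice < 2; choice++) {
            int next = successor(s, choice);
            if (!visited[next]) {   /* each state is enqueued at most once */
                visited[next] = 1;
                parent[next] = s;
                queue[tail++] = next;
            }
        }
    }
    return -1;
}
```

Starting from state 0 in this toy system, state 5 is never reached, whereas state 12 is reached and the checker reports the counterexample path 0, 2, 4, 6, 12. The search terminates only because the state space is finite, which is exactly the restriction noted above.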

2 Theorem proving

Theorem proving can handle infinite state spaces and thus overcomes the state space explosion problem. Formal verification based on theorem proving treats the statement "the system satisfies its specification" as a logical proposition and proves this proposition by deductive reasoning using a set of inference rules.

According to the proof style and the degree of automation, verification systems based on theorem proving can be divided into automated verification based on automated theorem provers and semi-automated verification based on human-computer interaction.

2.1 Automated verification based on automated theorem provers

Commonly used program verifiers based on automated theorem proving include Why3 and Smallfoot. Most of them are built on a specific program logic: given a program and its specification, the verifier automatically decides which axiom or rule of the program logic applies to each statement of the program and generates the corresponding verification conditions as proof obligations.

However, many problems in automated theorem proving are undecidable, and each theorem prover has its own limitations, so both the expressiveness of specifications and the properties that can be proved are restricted. To achieve automatic proof, the property to be proved and the code to be verified often have to be rewritten, sometimes even sacrificing the property to be verified or the functionality of the code for the sake of automation. This approach is therefore not suitable for the present verification work.

2.2 Semi-automated proof based on human-computer interaction

Many proof assistants, such as Coq, which is used in this work, exploit the isomorphism between type systems and logic to turn the construction of a proof into the writing of a program, so that checking the correctness of a proof becomes a type-checking problem.
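As a minimal illustration of this correspondence, the following sketch is written in Lean rather than Coq purely for illustration, and the theorem name is hypothetical. The proposition is a type, the proof is an ordinary function of that type, and accepting the proof amounts to type-checking the function:

```lean
-- Proposition: from a proof of p ∧ q one can build a proof of q ∧ p.
-- The proof term is a function; checking the proof is type checking.
theorem and_swap_sketch (p q : Prop) : p ∧ q → q ∧ p :=
  fun h => ⟨h.right, h.left⟩
```

If the body were wrong, for example returning the pair in the original order, the type checker would reject it, which is the sense in which proof checking reduces to type checking.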

Although semi-automated verification requires a great deal of manual work to construct proofs, it does not sacrifice the expressiveness of specifications and code; in particular, programs can be specified in a highly expressive logic. The proof itself has an explicit representation in the machine, and its correctness can be checked automatically. There is therefore no need to rely on the correctness of an automated theorem prover's algorithms, and the verification conclusion is more reliable.

The proof method adopted by the invention is a semi-automated theorem proving method that mainly uses the CompCert C compiler, the VST tool, and the Coq proof assistant:

The CompCert C compiler is a formally verified compiler developed by INRIA, the French research institute. It comes with a mathematical, machine-checked proof that the behavior of the generated executable code conforms exactly to the semantics of the source program. C code compiled by this compiler is therefore free of compiler-introduced errors; that is, the executable code generated by the compiler is proved to behave exactly as specified by the semantics of the source C program.

The CompCert C compiler architecture comprises three parts:

The first part parses, type-checks, and preprocesses the code, that is, it converts the C code into an abstract syntax tree of the CompCert C language. In this part, some constructs that the CompCert C compiler does not support are expanded away (for example, block-scoped local variables are renamed and lifted to function-local scope), while other unsupported constructs are rejected (for example, variable function declarations). This part is not formally proved. However, because CompCert C is a subset of C, the compiler can output the generated CompCert C code in C concrete syntax, so the result of this conversion can be inspected manually. In addition, most static analysis and program verification tools for C operate on a simplified C language similar to CompCert C, so errors that this part might introduce can be detected by running the analysis or program verification directly on the CompCert C form.

The second part converts the CompCert C abstract syntax tree into an assembly-language abstract syntax tree; this part is proved correct in the Coq tool. It consists of 16 passes and uses 10 intermediate languages.

All intermediate languages of the second part are given formal semantics, and each transformation is proved to preserve semantics.

The third part is assembling and linking: the assembly-language abstract syntax tree generated by the second part is printed in concrete assembly syntax, and the system assembler and linker are then invoked to produce the object files and the executable file, respectively.

The VST (Verified Software Toolchain) is essentially a toolchain comprising a static analysis tool for checking program assertions, an optimizing compiler for compiling programs into machine language, and a semantic system and library that provide an environment for program verification. This verified toolchain ensures that assertions claimed at the top of the toolchain really hold for the machine-language program running in the operating system context. The language and program logic used with this toolchain to verify C programs is called Verifiable C, a language and program logic for reasoning about the functional correctness of C programs. The language is a subset of CompCert Clight: an abstract representation of C in which side effects and memory loads are separated out of expressions. The program logic is a higher-order separation logic, a kind of Hoare logic, which better supports reasoning about pointer data structures, function pointers, and data abstraction.

Coq is a formal proof management system designed for developing mathematical proofs, and in particular for writing formal specifications and programs and for verifying that the programs are correct with respect to those specifications. It provides a specification language called "Gallina", which can express program structure, program properties, and the proofs of those properties. All logical judgments in this language are reduced to typing judgments, and the correctness of a proof is established by a type-checking algorithm, which is the core of the whole system. In addition, to support human-computer interaction, Coq provides CoqIDE, an interactive proof environment; in interactive theorem proving, a formal proof is completed through the cooperation of the user and the computer. Proofs of different programs can use this tool to build the proof process.

As shown in FIG. 1, the method of the present invention comprises the following four steps:

1. Code refactoring operations

Because our formalization tools do not support, or do not fully support, some of the syntax of the C language, and because simplifying the function logic simplifies the proof process, some functions and data types need to be refactored.

The main contents of the code refactoring part are as follows:

Import work: sort out the call relationships of the functions and import all functions and data types called by the function to be proved into one folder.

Because each proof targets a single function, all functions and data types called by each function must be sorted out separately, ensuring that the collected code compiles successfully.

This part of the work involves many conditional macro definitions: #if … #else selects a definition according to the value of some attribute, which mainly depends on the specific configuration (system platform, system word size, and so on). To determine which definition is selected, in practice we test the conditional definition by adding #error and use the resulting output to judge which macro definition the code selects in actual execution.
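The #error probing technique can be sketched as follows. The macro names CONFIG_BITS, PROBE_BRANCH, and WORD_BITS and the typedef are hypothetical stand-ins for a real platform setting; they are not taken from the Xen sources.

```c
#include <assert.h>

/* Hypothetical configuration macro standing in for a platform setting. */
#ifndef CONFIG_BITS
#define CONFIG_BITS 64
#endif

#if CONFIG_BITS == 64
/* To find out whether this branch is the one being compiled, define
 * PROBE_BRANCH temporarily (for example with -DPROBE_BRANCH). Compilation
 * then stops here, and the message shows which branch is active. */
#ifdef PROBE_BRANCH
#error "probe: the 64-bit branch is selected"
#endif
typedef unsigned long long machine_word_t;
#define WORD_BITS 64
#else
#ifdef PROBE_BRANCH
#error "probe: the 32-bit branch is selected"
#endif
typedef unsigned int machine_word_t;
#define WORD_BITS 32
#endif

/* Returns the word size chosen by the conditional definition above. */
int selected_word_bits(void)
{
    return WORD_BITS;
}
```

Compiled normally, the file builds and the 64-bit definitions are used. Compiled with PROBE_BRANCH defined, the build fails at the #error line of the active branch, which reveals the selection without running any code.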

In addition, because the Xen code contains implementations for several platforms, it contains many functions of the same name for different platforms; here the platform is uniformly chosen to be 64-bit ARM (arm64).

Code statement refactoring: refactor the data types and statements in the function that the CompCert C compiler cannot recognize or cannot convert completely, such as goto statements.

The CompCert compiler cannot recognize some statement structures, so in order to compile successfully, certain structures must be refactored without changing their behavior. For example, for a goto statement, the jump target must be considered during code analysis: a goto that jumps to a location inside the function can be replaced by if … else … statements, whereas a goto that jumps outside the function must be analyzed carefully, adding a function call if necessary. In the functions proved here, all goto statements jump to a location inside the function. An example of a goto structure is as follows:

The function allocates a scheduler, and "goto found" jumps to the label found; it can be refactored as follows:
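The original listings are not reproduced in this text. As a stand-in, the following hypothetical sketch (not the Xen source; the struct, the table, and both function names are invented for illustration) shows a lookup that jumps to a label found inside the function, followed by an equivalent version in which the jump is replaced by a flag and an if … else statement:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scheduler table; names and fields are illustrative only. */
struct scheduler { int sched_id; };

static struct scheduler schedulers[] = { {1}, {5}, {7} };
#define NUM_SCHEDULERS ((int)(sizeof(schedulers) / sizeof(schedulers[0])))

/* Original style: jump to a label inside the same function. */
struct scheduler *find_scheduler_goto(int sched_id)
{
    int i;
    for (i = 0; i < NUM_SCHEDULERS; i++) {
        if (schedulers[i].sched_id == sched_id)
            goto found;
    }
    return NULL;
found:
    return &schedulers[i];
}

/* Refactored style: the jump is replaced by a flag and if ... else. */
struct scheduler *find_scheduler_refactored(int sched_id)
{
    int i;
    int found = 0;
    int index = 0;
    for (i = 0; i < NUM_SCHEDULERS; i++) {
        if (found == 0 && schedulers[i].sched_id == sched_id) {
            found = 1;      /* remember the first match only */
            index = i;
        }
    }
    if (found) {
        return &schedulers[index];
    } else {
        return NULL;
    }
}
```

Both versions return the same entry for every input, which is the condition stated above: the refactoring must not change the behavior of the structure.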

Separation of complex structures: separate out the complex structures in the code, that is, wrap each complex structure into a new function, which simplifies the function logic and in turn the proof process.

Separating out some of the complex operations in a function is a proof technique. If a function contains complex data types, the proof process becomes tedious and complicated, and the proof tactics may even fail to apply. To solve this problem, the complex data types can be handled by a separate function that is proved independently. In the final function, the complex operation is then just a call to that separate function: when the final function is proved, the complex operation need not be proved again, only the function call relationship, which simplifies the proof process. The following is the operation for a typical function:

The function records the currently running vcpu and contains pointer operations on nested complex data structures, "d.vcpu = v->vcpu_id" and "d.domain = v->domain->domain_id". Wrapping these pointer operations into functions simplifies the code:

When carrying out the proof, the 3 separated functions are proved individually, and the 3 proofs are then referenced in trace_confidence_running, so no proof needs to be written there for the pointer structures.
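A minimal sketch of this separation, assuming simplified stand-ins for the structs involved. The type layouts and the names trace_record, get_vcpu_id, get_domain_id, record_running_vcpu, and demo_record_is_correct are hypothetical, and only two helpers are shown:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical stand-ins for the structs involved. */
struct domain { uint16_t domain_id; };
struct vcpu   { uint16_t vcpu_id; struct domain *domain; };

/* Record that holds the traced values. */
struct trace_record { uint16_t vcpu; uint16_t domain; };

/* Helper 1: isolates the access v->vcpu_id. */
uint16_t get_vcpu_id(const struct vcpu *v)
{
    return v->vcpu_id;
}

/* Helper 2: isolates the nested access v->domain->domain_id. */
uint16_t get_domain_id(const struct vcpu *v)
{
    return v->domain->domain_id;
}

/* The caller now contains only function calls, so its proof can cite
 * the helpers' proofs instead of reasoning about nested pointers. */
struct trace_record record_running_vcpu(const struct vcpu *v)
{
    struct trace_record d;
    d.vcpu = get_vcpu_id(v);
    d.domain = get_domain_id(v);
    return d;
}

/* Small self-check: builds one vcpu and confirms both fields are copied. */
int demo_record_is_correct(void)
{
    struct domain dom = { 3 };
    struct vcpu v = { 7, &dom };
    struct trace_record r = record_running_vcpu(&v);
    return r.vcpu == 7 && r.domain == 3;
}
```

The design choice is that each helper carries exactly one pointer access, so each helper's specification stays small and the caller's specification mentions only the helpers.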

Other operations: delete the functions in the code that affect neither the function logic nor the variable state.

Some functions contain trace-recording functions and assertion functions. These do not affect the state of the function's variables at any point during execution; they merely record variable-related information or output assertions, so this code can be deleted in practice. In addition, the proof tactics of the CompCert C compiler and the VST proof tool for mutexes are incomplete, so mutex-related problems cannot be proved for the time being.

2. Formal modeling of functions

The formal modeling work formally describes the function of the module. This part of the work is mainly done by modeling the function automatically with the CompCert C compiler and the VST tool. Automatic modeling by these two tools improves the efficiency of the proof process, but it is not flexible enough, and some specific structures in the function still have to be described by hand.

For a struct type, VST internally uses the definition "Tstruct_name noattr", but this type definition cannot express the internal structure of a struct variable. A struct type can be understood as a collection of members of different types, so when describing a struct type we must consider how to represent the types of its members and then use the struct name as the name of this collection of variables.

The above code shows some of the member variables of the vcpu struct type. Before describing the structure of this struct, it is necessary to introduce how VST represents the simple types of the C language, as shown in Table 1 below.

Table 1. Mapping between C simple types and VST library types

C basic type	Corresponding type in the VST library
unsigned int	tuint
unsigned short	tushort
unsigned long	tulong
unsigned long long	tulong
int	tint
signed long	tlong
signed long long	tlong
struct a{…}	Tstruct_a noattr
char	tschar

The table shows how some simple types are represented in the VST library. This representation can be used for the types of function parameters, intermediate variables, and return values: for example, a parameter of type int is denoted tint in the function model, and a return type of unsigned int is denoted tuint. When specific operations are involved, however, other means of description may be needed. For example, for arithmetic operations in a C program, integers in Coq are represented by the type Z, and addition, subtraction, multiplication, and division are all computed over Z. Throughout the process we therefore represent such variables as Z, and only when finally expressing the type of the value stored at a given address do we use an auxiliary function to convert Z to the required type.
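The gap between arithmetic over Z and arithmetic on a fixed-width C type can be sketched in C itself. In this sketch int64_t merely stands in for Coq's unbounded Z (an assumption that holds only while values stay small), and the names z_t, z_add, z_to_uint32, and c_add are hypothetical. The conversion helper is similar in spirit to the Int.repr function of CompCert's integer library, which maps a Z to a machine integer.

```c
#include <assert.h>
#include <stdint.h>

/* Mathematical model: int64_t stands in for Coq's unbounded Z.
 * Assumption: the values used here are small enough not to overflow it. */
typedef int64_t z_t;

/* Addition as it is reasoned about in the model: no wrap-around. */
z_t z_add(z_t a, z_t b)
{
    return a + b;
}

/* Conversion helper: maps a model integer into the 32-bit unsigned range,
 * that is, reduction modulo 2^32, for the value stored at an address. */
uint32_t z_to_uint32(z_t z)
{
    const z_t modulus = (z_t)1 << 32;
    z_t r = z % modulus;
    if (r < 0)
        r += modulus;   /* C's % can be negative for negative operands */
    return (uint32_t)r;
}

/* The C operation being modeled: unsigned addition wraps modulo 2^32. */
uint32_t c_add(uint32_t a, uint32_t b)
{
    return a + b;
}
```

In the model, 4294967295 + 1 is simply 4294967296; only after conversion does it become 0, matching what the C code stores. Keeping the two steps apart is what lets the proof reason with ordinary integer arithmetic.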

When describing a struct, what is actually expressed is the values of its member variables, and it is the struct that is operated on when the function runs, so integer members can be represented by the type Z. In the above example, the two members of struct type, "struct domain" and "struct vcpu_runstate_info", require special treatment.

A pointer to a struct type nested inside a struct is an indirect reference to that struct: its value is the address of another struct, and that address is itself an integer. The type is therefore equivalent to "Z -> struct"; the address can stand for the struct variable, and subsequent operations on the struct locate it through this address.

Definition domain_pool:ZMap.t domain_abs.

This code is a collective abstraction of all domain structs. Once domain_pool is defined, the members of the struct at a given address can be accessed and operated on through ZMap.

A struct variable nested inside a struct is in essence a direct reference: all of its members are stored in memory at the variable's address, so the description of such a variable is simply another struct type definition.

Finally, the above struct can be formally described as:

When a specific struct value is represented elsewhere, for example a struct with id 1, domain address 2, and runstate (1,2,3), it is written "Vcpu 1 2 (1,2,3) _ _ _", where the underscores indicate that is_urgent, pause_flags, and processor may be any integer.

3. Theorem description

The theorem description is the key to the proof: the correctness of the theorem description determines the correctness of the whole proof process. The described theorem must conform to the specification format of the VST, that is, it must be expressed in the Hoare logic structure and must reflect the state change of each part before and after the function is executed.

The structure of a theorem description as specified by the VST tool is as follows:

In the above specification, DECLARE binds the theorem description to the formal model of the original function and is followed by the name of the function's formal model. The WITH clause uses Coq syntax to list all variables that may appear in the precondition and postcondition. PRE and POST represent the precondition and postcondition of the function, respectively; each has the internal structure "PROP () LOCAL () SEP ()". PROP describes the constraints on the function, stated as propositions that do not depend on the program state; LOCAL describes local values with their types and binds them to concrete variables; SEP is the spatial predicate, stating which concrete type and concrete value exist at a concrete address. This structure is described in more detail below.

From the above specification, a schematic diagram of the structure of the theorem description process can be obtained, as shown in FIG. 2 (all theorem names carry the suffix spec):

When describing a theorem, the theorem description is first bound to the formal model of the original function through DECLARE, which is followed by the name of the function's formal model, for example A; that is, the theorem description is declared to be for model A.

The WITH statement lists the variables that may occur during the whole execution of the function, including function parameters, global variables, and so on, and specifies their types. Note that the variables listed by WITH are the formalized counterparts of the function's original variables; in later steps they must be bound to the original function variables. The statement indicates which variables' states may change during the execution of the function.

The PRE part represents the state of the function before execution, the main part of the PRE part is 'PROP () SEAP ()', namely, the ① part in the graph, before the PRE part enters the main body, parameter arguments are required to be listed first and are bound WITH formal description of argument types, the argument names are consistent WITH the argument names in the formal model bound by theorems, the argument names are proof tools, and an underline is added before the original function parameter name to represent the PRE part, namely, constraint conditions which should be met before the function is executed, generally including range constraints to be met by the listed variables of the WITH, type constraints, identity constraints required for simplifying the proof, and the like, the 'LOCAL ()' in the PRE part represents LOCAL variables or global variables which the function must have before execution, such as the argument, the formal description of the variables is bound WITH the variable names in the formal model, the 'SEP ()' in the PRE part represents a space, represents an initial value before the function is executed, and represents a certain actual operation address structure data, and the like.

The POST part, the counterpart of PRE, represents the state after the function is executed. Its main body is again "PROP () LOCAL () SEP ()", that is, part ② in the figure, and the return type of the function must be given after POST.

Because our proof proceeds forward, that is, the postcondition is derived from the precondition, PRE can be used as a premise to assist the proof. What must finally be proved is whether the state after the function is executed is consistent with the expected state. The state after execution is determined by the formal model of the function, whereas the expected state requires us to describe the relevant theorem according to the intended functionality of the function. If the two are consistent, the function is correct; if they are not, either the program or the formal specification is at fault.

In addition, each function is given two theorem descriptions: one describes the abstract logic of the function, and the other describes its concrete logic. The abstract logic is the simplified logic of the function and does not describe the complex structures inside it. During the proof, we first prove the correctness of the program logic and then prove that the concrete logic is consistent with the abstract logic. The advantage is that, once the concrete and abstract logic of a function have been proved consistent, any other function that calls it can use its abstract logic in place of its concrete logic in its own proof, which simplifies the proof process.

4. Theorem proving

Theorem proving uses proof strategies (tactics) to derive proofs of the formally described theorems. Our proofs use forward reasoning, that is, the derivation follows the execution order of the function. During the proof, the propositions in the PROP of the PRE part of the concrete proposition are all placed in the proof context as premises, while the propositions in the PROP of the POST part are placed in the POSTCONDITION structure. After the current statement has been proved, POSTCONDITION releases the next statement as the proof goal and removes that statement from the POSTCONDITION structure.

The whole proof approach is based on Hoare logic and separation logic, so the final result of the reasoning process is to establish whether { P } c { Q } holds. Because the proof process depends on the statements of the function itself, the next section presents the proof process for a specific function; this section mainly introduces the important proof strategies used in the proof process.
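As a rough illustration of what the triple { P } c { Q } asserts, the following Python sketch checks a triple by brute force over a small finite set of states. This is not how VST works (VST reasons symbolically over all states); the command `abs_command`, the variable `x`, and the chosen ranges are hypothetical and serve only to make the meaning of the triple concrete.

```python
def triple_holds(pre, command, post, states):
    """Check {pre} command {post} on every state in a finite set of states."""
    for state in states:
        if pre(state):
            final = command(dict(state))  # run the command on a copy of the state
            if not post(final):
                return False
    return True


def abs_command(state):
    """Hypothetical command c, standing for: if (x < 0) x = -x;"""
    if state["x"] < 0:
        state["x"] = -state["x"]
    return state


# Finite test universe (an assumption of this sketch, not of the method).
states = [{"x": v} for v in range(-10, 11)]

pre = lambda s: -5 <= s["x"] <= 5        # P
good_post = lambda s: 0 <= s["x"] <= 5   # a Q that follows from P
bad_post = lambda s: 0 < s["x"] <= 5     # too strong: fails when x == 0

print(triple_holds(pre, abs_command, good_post, states))
print(triple_holds(pre, abs_command, bad_post, states))
```

The second call fails because of the single state x = 0, which mirrors the situation described above: a mismatch indicates that either the program or the specification is wrong.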

The rewrite strategy replaces a target variable in the current goal with the required variable from a premise. For example, given the premise n = m, the rewrite strategy changes the variable n in the proof goal to m. The direction of rewriting can be indicated by adding an arrow symbol: with "rewrite <-", the variable m in the proof goal is changed to n.
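The effect of rewriting in either direction can be mimicked on a toy term representation. The sketch below is only an analogy in Python, not Coq: goals are nested tuples, a premise n = m is the pair `("n", "m")`, and the names `substitute` and `rewrite` are hypothetical helpers introduced for illustration.

```python
def substitute(term, old, new):
    """Replace every occurrence of `old` inside a nested-tuple term by `new`."""
    if term == old:
        return new
    if isinstance(term, tuple):
        return tuple(substitute(sub, old, new) for sub in term)
    return term


def rewrite(goal, premise, backwards=False):
    """Rewrite `goal` with the premise lhs = rhs.

    Forwards replaces lhs by rhs; backwards (like an arrow pointing left)
    replaces rhs by lhs.
    """
    lhs, rhs = premise
    if backwards:
        return substitute(goal, rhs, lhs)
    return substitute(goal, lhs, rhs)


premise = ("n", "m")                     # stands for the premise n = m
goal = ("le", "n", ("add", "m", 1))      # stands for the goal n <= m + 1

print(rewrite(goal, premise))                  # n replaced by m
print(rewrite(goal, premise, backwards=True))  # m replaced by n
```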

The apply strategy is also a replacement strategy, but unlike rewrite it simplifies the goal after the replacement. For example, for the proof goal "n = m" with the proposition m = n available as a premise, rewrite reduces the goal to m = m, whereas apply additionally simplifies the goal after the replacement and thereby proves it.

The forward strategy advances the Hoare logic proof in execution order. For example, for the Hoare triple { P } i = 0; more { R }, the sequence rule must first be applied, which requires an intermediate assertion Q. For statements such as assignment, return, break, and continue, the forward strategy derives Q automatically by applying a proof rule in strongest-postcondition style. In addition, the forward strategy has different forms for different C statements: forward_if is used to derive if … else … statements; forward_while is used to derive while loops; and forward_call is used when a function call is involved.
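The idea of deriving the intermediate assertion Q automatically can be sketched in Python by representing an assertion as the finite set of states that satisfy it. This is a simplified model under stated assumptions, not the VST implementation: `freeze` and `forward_assign` are hypothetical helpers, and the statement used for `more` (i = i + n) is invented for the example.

```python
def freeze(state):
    """Make a state (dict of variable values) hashable so it can live in a set."""
    return frozenset(state.items())


def forward_assign(assertion, var, expr):
    """Strongest postcondition of `var = expr`: all states reachable from `assertion`."""
    result = set()
    for frozen in assertion:
        state = dict(frozen)
        state[var] = expr(state)  # expr is evaluated in the state before the assignment
        result.add(freeze(state))
    return result


# P: i is any value in -2..2 and n == 3 (a hypothetical precondition).
P = {freeze({"i": v, "n": 3}) for v in range(-2, 3)}

# Sequence rule for {P} i = 0; more {R}: Q is computed, not guessed.
Q = forward_assign(P, "i", lambda s: 0)

# Hypothetical stand-in for `more`: i = i + n.
R = forward_assign(Q, "i", lambda s: s["i"] + s["n"])

print(sorted(dict(s).items()) for s in Q)
print(len(Q), len(R))
```

Five different starting states collapse into a single state after `i = 0`, which is exactly the information the derived assertion Q carries forward to the rest of the proof.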

The entailer strategy completes the derivation of implication statements. An implication statement means "for two propositions P and Q, if P holds then Q holds"; it is written P → Q, where P is the antecedent and Q is the consequent. In a program, this statement means that any state satisfying P must also satisfy Q. In VST it is written "ENTAIL Δ, P |-- Q".

"Δ" represents the global type context and provides additional constraints on the state. The entailer strategy mainly uses the contents of "Δ", combined with the preconditions, to decide whether a state that satisfies P also satisfies Q; if so, the derivation is complete.
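The role of the type context in an entailment can be illustrated with a small Python sketch that enumerates all states allowed by the context and checks that P implies Q on each of them. This is a finite-state analogy only; the function `entails`, the variable `x`, and the 0..255 range are assumptions made for the example and are not part of VST.

```python
from itertools import product


def entails(delta, p, q):
    """Return True if every state typed by `delta` that satisfies p also satisfies q.

    `delta` maps each variable name to the finite range of values its type allows.
    """
    names = sorted(delta)
    for values in product(*(delta[name] for name in names)):
        state = dict(zip(names, values))
        if p(state) and not q(state):
            return False
    return True


# Hypothetical context: x is an unsigned 8-bit value.
delta = {"x": range(0, 256)}

multiple_of_4 = lambda s: s["x"] % 4 == 0
even = lambda s: s["x"] % 2 == 0
non_negative = lambda s: s["x"] >= 0
anything = lambda s: True

print(entails(delta, multiple_of_4, even))    # P is stronger than Q
print(entails(delta, even, multiple_of_4))    # fails, e.g. for x == 2
print(entails(delta, anything, non_negative)) # holds only because of the context
```

The last check shows the point made above: the consequent follows not from P itself but from the constraint that the context places on the state.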

The invention is further illustrated below using the proof process of a specific function as an example:

Formal proof of the rtsc2_vcpu_insert function in a secure operating system based on the MILS architecture

According to the fourth section, the reconstructed code of the function is as follows:

In this code, variables that involve pointers, for example "svc->sdom__elem", have their values fetched by "get_" functions and assigned to new variables; the new variables are then used in the conditions in place of the pointer structures. In addition, the three conditions in the second if statement are split into three separate if statements so that they can be proved one at a time.

In the VST environment, the formal model of program A can be obtained by executing the command "clightgen -normalize A.c". The resulting formal model of the target function can be found in the attachment; each model is named by the function name plus "f". The structures are further described abstractly as follows:

Here, "Inductive" introduces an inductive definition, which is used to describe the structure abstractly; the content in parentheses lists the internal members of the structure. "pool" denotes a domain, which can be understood as a collection of structures of the same type in which a structure can be found by its address.

When writing the theorem description, we first analyze the conditions that must hold before the function is executed, namely the preconditions:

1. The formal parameters lie within the range required by their types.

2. The other functions it calls have already been proved correct, that is, they satisfy functional correctness.

3. Conditions that always hold during the proof can be listed to simplify the proof.

From these 3 preconditions, the "PROP ()" statement in the PRE can be obtained.

Analyzing the functional requirements yields the abstract functional description of the function:

The "rtsc2_vcpu_insert" function inserts a vcpu of the transfer domain into the scheduler run queue. It does so mainly by calling the "__runq_insert" function and the "list_add_tail" function. In the abstract functional description, "match … with …" corresponds to the "if … else …" statement in the source code, and each name ending in "_abs" denotes the abstract functional description of another function. Apart from the called functions, this function involves no state change, so its functional description relies on the functional descriptions of the other functions being accurate.

The functionality of the "__runq_insert" function is described as follows:

When the attribute "virq_flags" of the vcpu is greater than zero, the vcpu is in the interrupt state and the interrupt-queue insertion function is executed. The corresponding functional description is "match Zgt_bool vf 0 with", where "vf" is obtained through the function "get_vcpu_virq_flag", whose functional description is as follows:

The lookup is performed through "zmap.get". Combined with the earlier abstract description "vcpu_abs" of the vcpu structure type, this yields the internal member variable "avirq_flag", and the values of the vcpu domain "vpool" and of "avirq_flag" are output.
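The lookup pattern described here, finding a structure in a domain by its address and then reading a member, can be sketched in Python with a dictionary standing in for the domain. This is an illustrative model only: the class `VcpuAbs`, the default record, the example addresses, and the helper `in_interrupt_state` are assumptions of the sketch; only the names "vpool", "avirq_flag", and "get_vcpu_virq_flag" come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VcpuAbs:
    """Abstract vcpu record; only the avirq_flag member is modeled here."""
    avirq_flag: int


# Assumed default returned for addresses that are not in the domain.
DEFAULT_VCPU = VcpuAbs(avirq_flag=0)


def get_vcpu_virq_flag_abs(vpool, addr):
    """Find the vcpu stored at `addr` in the domain and return (vpool, avirq_flag)."""
    vcpu = vpool.get(addr, DEFAULT_VCPU)
    return vpool, vcpu.avirq_flag


def in_interrupt_state(vpool, addr):
    """Python counterpart of the test `Zgt_bool vf 0`: True when the flag exceeds zero."""
    _, vf = get_vcpu_virq_flag_abs(vpool, addr)
    return vf > 0


# Hypothetical domain with two vcpus at made-up addresses.
vpool = {0x1000: VcpuAbs(avirq_flag=1), 0x2000: VcpuAbs(avirq_flag=0)}

print(in_interrupt_state(vpool, 0x1000))
print(in_interrupt_state(vpool, 0x2000))
```

Returning the unchanged domain together with the flag reflects that a getter reads state without modifying it, which is what lets its abstract description be reused by callers.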

The abstract functional descriptions of the other "get_…" functions follow the same pattern: find the concrete structure in the domain, then obtain the values of its member variables. Through the conditional branch, the "__runq_insert" function executes the "irqq_insert" function, which contains a loop structure whose functional abstraction must be described separately:

"Fixpoint" introduces a recursive definition. In Coq, a recursive definition must be structurally decreasing, so a recursion condition, namely "match n with", is added to the loop. When the loop description is called, a sufficiently large value is passed as n, so that the actual number of iterations is always smaller than n and the recursion condition does not affect the original loop. The original loop finds the position of the first vcpu in the interrupt queue whose priority is lower than that of svc, so the final output is that vcpu position. With this loop description, we obtain the abstract functional description "irqq_insert_abs":

"iter'" returns, at the end of the loop, the position of the vcpu whose priority is lower than that of svc, and svc is inserted in front of it by the "list_add_tail" function.
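The technique of bounding a loop with a decreasing counter n can be sketched in Python as follows. This is a minimal model under stated assumptions, not the Coq definition from the attachment: the queue is a plain list of priorities, a larger number is assumed to mean a higher priority, and the names `first_lower_priority` and `insert_before_lower` are hypothetical.

```python
def first_lower_priority(fuel, queue, svc_prio, pos=0):
    """Return the index of the first entry with priority lower than svc_prio.

    `fuel` plays the role of n: it decreases on every recursive call, so the
    recursion always terminates even if the search condition never fires.
    """
    if fuel == 0:
        return pos  # fuel exhausted; only reached if fuel was chosen too small
    if pos >= len(queue) or queue[pos] < svc_prio:
        return pos
    return first_lower_priority(fuel - 1, queue, svc_prio, pos + 1)


def insert_before_lower(queue, svc_prio, fuel=100):
    """Insert svc_prio in front of the first lower-priority entry."""
    pos = first_lower_priority(fuel, queue, svc_prio)
    return queue[:pos] + [svc_prio] + queue[pos:]


queue = [9, 7, 5, 3]  # hypothetical interrupt queue, highest priority first

print(first_lower_priority(100, queue, 6))
print(insert_before_lower(queue, 6))
print(first_lower_priority(1, queue, 6))  # too little fuel cuts the search short
```

The last call shows why a large value must be passed: with too little fuel the recursion stops before the real loop would, so the abstraction no longer matches the original loop.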

The above is the functional description of "rtsc2_vcpu_insert". The output after the function is executed should be consistent with this functional description, so the theorem description of the function can finally be obtained:

The "PROP" in the PRE includes all the preconditions of the functions it calls, which can be found in the attachment. The other parts of the PRE are as described in the fourth section. The "PROP" and "lpool" in the POST describe the change of the list structure domain after the function is executed: because the function adds a new vcpu to the vcpu list, the state of the list domain changes, and this change is consistent with the functional description, that is, the new "lpool" is the one given by "rtsc2_vcpu_insert_abs". Note also that the "SEP ()" part of the theorem description is empty: the called functions have already been proved correct, the spatial structures involved were proved correct in the course of those calls, and the function involves no other spatial structures. Because "SEP" is empty and this is the final function, no abstract theorem needs to be described for the proof.

Finally, the proof of the theorem is obtained:

Part of the proof of this function is listed here. "start_function" marks the start of the proof and states the Hoare triple to be proved. "forward_call ()" is the proof strategy for a function call in C; its parentheses contain the variables of the "WITH" statement in the theorem description of the called function. "forward_if ()" is the proof strategy for the if structure; its parentheses contain a "PROP (P) LOCAL (Q) SEP (R)" structure, that is, the state condition to be satisfied when the function executes the if statement. The source code checks whether the vcpu is in the idle state: if it is, the function exits directly; otherwise, the subsequent statements are executed. Correspondingly, if the list domain is idle it is not modified, and if it is not idle the next statement is executed; this is the content of "PROP". The temporary variables required by this if statement are listed in "LOCAL".

After this strategy is applied, Coq produces 3 branches: the first two correspond to the two outcomes of the if condition, and the third covers the subsequent program statements. Finally, "Qed." marks the end of the proof. In Coq, if the function is correct, the proof can be carried through to "Qed."; otherwise the derivation cannot be completed and stops partway. Once all proof goals have been proved, the upper-right pane shows that no goals remain.

It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims

1. A formal verification method based on theorem proving, characterized by comprising the following steps: reconstructing the source code, performing formal modeling of a function, performing theorem description of the function, and finally performing formal proof;

the source code reconstruction specifically includes:

a3, separating complex structures in the code;

2. The formal verification method based on theorem proving according to claim 1, wherein the formally described theorem satisfies the specification of VST.

3. The formal verification method based on theorem proving according to claim 2, wherein the process of theorem description of the function is:

a4, the postcondition, whose internal structure is "PROP' () LOCAL' () SEP' ()", wherein PROP' () represents the constraints to be satisfied after the function is executed; LOCAL' () represents the return value of the function; and SEP' () is a spatial predicate representing the specific values at the addresses of the operated variables after the function is executed.

4. The formal verification method based on theorem proving according to claim 3, wherein the proof strategies include a rewrite strategy, an apply strategy, a forward strategy, and an entailer strategy; the rewrite strategy replaces a target variable in the current goal with the required variable from a premise; the apply strategy replaces a target variable in the current goal with the required variable from a premise and simplifies the proved goal; the forward strategy is used to advance the Hoare logic proof in execution order; and the entailer strategy is used to complete the derivation of implication statements.