CN116932389A - Solver defect detection method based on large pre-training language model - Google Patents

Solver defect detection method based on large pre-training language model

Info

Publication number
CN116932389A
CN116932389A · Application CN202310869091.1A
Authority
CN
China
Prior art keywords
solver
model
smt
defects
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310869091.1A
Other languages
Chinese (zh)
Inventor
杨已彪
孙茂林
许沂聪
卢红敏
周毓明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CN202310869091.1A
Publication of CN116932389A
Legal status: Pending

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3684Test management for test design, e.g. generating new test cases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3688Test management for test execution, e.g. scheduling of test suites
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a solver defect detection method based on a large pre-trained language model. The method comprises the following main steps: first, augmenting the formulas in solver test benchmarks and the formulas that triggered historical defects to obtain a training set; second, performing customized training of the pre-trained large model on this training set with a "retraining-fine tuning" framework so that it can generate solver test inputs; and finally, generating solver test cases with the trained model and verifying multiple solvers by differential testing. The method addresses two key challenges in solver defect detection: efficiently generating test cases and generating diverse test inputs. The proposed "retraining-fine tuning" framework lets a pre-trained large language model learn the knowledge contained in solver test benchmarks and historical defect cases, so that it can generate legal, effective test inputs with high fault-revealing power. The invention thus offers a new solution for solver defect detection.

Description

Solver defect detection method based on large pre-training language model
Technical Field
The invention relates to the field of software defect detection, and in particular to defect detection for SMT solvers: a defect detection method for SMT solvers based on a large pre-trained language model.
Background
An SMT (Satisfiability Modulo Theories) solver is an automated reasoning tool used to check the satisfiability of logical formulas. SMT solvers are applied in many important areas, including software verification, test case generation, and program synthesis. However, SMT solvers contain hidden defects, and the erroneous results they cause can have serious consequences in these areas. It is therefore important to ensure the reliability and robustness of SMT solvers. While many testing methods have been proposed for SMT solvers, generating effective test formulas that test them comprehensively remains an important challenge. To address this problem, the invention proposes retraining and fine-tuning a Large Pre-trained Language Model (LLM) so that the model can generate a large number of test cases, i.e., formulas to be solved, that can be fed to an SMT solver. A large pre-trained language model is a deep-learning-based natural language processing model that learns language representations through self-supervised learning on a large corpus. Large language models typically adopt the Transformer architecture and learn representations of input sequences through multi-layer self-attention. The advent of large pre-trained language models greatly improved performance on natural language processing tasks, because such models can be fine-tuned on top of large-scale pre-training, avoiding training from scratch. Large pre-trained language models have shown excellent results on a variety of natural language processing and code-related tasks.
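As a minimal illustration of what "checking satisfiability" means, the propositional core of the problem can be brute-forced in a few lines. This is a sketch only: real SMT solvers decide rich theories (integers, reals, strings) with dedicated decision procedures, not enumeration, and `is_satisfiable` and the sample formula are illustrative names, not part of the invention.

```python
from itertools import product

def is_satisfiable(formula, variables):
    """Brute-force satisfiability check for a propositional formula.

    `formula` maps an assignment (a dict of booleans) to a boolean.
    Returns (True, satisfying_assignment) or (False, None).
    """
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            return True, assignment
    return False, None

# (x or y) and (not x or not y): satisfiable, e.g. x=False, y=True
sat, model = is_satisfiable(
    lambda a: (a["x"] or a["y"]) and (not a["x"] or not a["y"]), ["x", "y"])
```

A "sat" verdict comes with a model (an assignment); an SMT solver returning a model that does not actually satisfy the formula is exactly the "invalid model" defect class discussed later.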
The invention applies a large pre-trained language model to test case generation for SMT solvers, customizing, retraining, and fine-tuning the pre-trained model through the proposed "retraining-fine tuning" framework so that it is suited to generating solver test inputs and thereby detecting defects in solvers.
Disclosure of Invention
The invention provides an SMT solver defect detection method based on a pre-trained language model, aiming to solve the key problem that high-quality, diverse test inputs are lacking in SMT solver testing. The method collects different types of training data (normal SMT formulas and cases that triggered historical defects), augments them with dedicated data enhancement techniques, and customizes and retrains a pre-trained language model with the "retraining-fine tuning" framework so that it can generate legal and effective SMT formulas; finally, the SMT formulas generated by the model are used as test inputs to detect defects in SMT solvers and thereby improve solver reliability and quality.
In order to detect defects in a solver, the invention discloses an SMT solver defect detection method based on a pre-trained language model, which specifically comprises the following steps:
step 1, collecting a data set for model training;
step 2, augmenting the data set collected in step 1 with a diversity-oriented mutation technique and a semantics-preserving mutation technique;
step 3, training the pre-trained language model on the enhanced data set with the "retraining-fine tuning" framework;
step 4, generating SMT formulas with the trained model and, after instantiation, using them as test inputs;
and step 5, solving the generated test inputs with different SMT solvers and recording their outputs; inconsistent results from different solvers on the same test input, as well as solver crashes or memory errors, are regarded as potential defects.
In step 1, two types of SMT formulas are considered when collecting the training data set: formulas from solver test benchmarks, and formulas that triggered solver defects. The benchmark formulas are high-quality, semantically correct SMT formulas representative of the target domain; they contain rich formula-related knowledge from which the model can learn the syntax and semantics of SMT formulas. The historical defect-triggering cases contain the key elements that trigger solver defects, helping the model learn how to generate formulas that effectively trigger them.
In step 2, to obtain better model performance, the model must be given a diverse, high-quality training data set, so the data collected in step 1 is augmented. The augmentation techniques include a diversity-oriented mutation technique and a semantics-preserving mutation technique. For benchmark formulas, the diversity-oriented mutation performs sub-formula mutation and operator mutation, increasing the diversity of the training set as much as possible. The semantics-preserving mutation aims to increase the diversity of the data set while preserving, as much as possible, the defect-triggering ability of the defect-triggering cases. Applying the two mutation strategies to the two types of data respectively yields a higher-quality data set for training the model.
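The operator-mutation part of the diversity-oriented technique can be sketched as follows. The operator groups and the whitespace tokenizer are illustrative assumptions: the patent does not specify its mutation tables, only that operators are mutated while keeping the formula syntactically valid.

```python
import random

# Hypothetical groups of interchangeable operators (illustrative only).
OPERATOR_GROUPS = [
    {"and", "or", "xor"},          # Boolean connectives
    {"+", "-", "*"},               # arithmetic operators
    {"<", "<=", ">", ">=", "="},   # comparison operators
]

def mutate_operator(formula, rng=random):
    """Diversity-oriented mutation: swap one operator for another from the
    same group, producing a syntactically valid variant of the formula."""
    tokens = formula.replace("(", " ( ").replace(")", " ) ").split()
    candidates = [i for i, t in enumerate(tokens)
                  for g in OPERATOR_GROUPS if t in g]
    if not candidates:
        return formula  # nothing to mutate
    i = rng.choice(candidates)
    group = next(g for g in OPERATOR_GROUPS if tokens[i] in g)
    tokens[i] = rng.choice(sorted(group - {tokens[i]}))
    # Re-assemble with SMT-LIB-style parentheses
    return " ".join(tokens).replace("( ", "(").replace(" )", ")")

mutant = mutate_operator("(assert (and (> x 0) (< y 10)))", random.Random(0))
```

Sub-formula mutation would operate analogously on whole parenthesized subtrees instead of single operator tokens; the semantics-preserving variant would restrict itself to rewrites that leave satisfiability unchanged.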
In step 3, the pre-trained model is first retrained on the augmented benchmark formulas, giving it the ability to generate legal SMT formulas. The retrained model is then fine-tuned on the augmented historical defect-triggering formulas, so that it can generate formulas that easily trigger solver defects. During fine-tuning, only the parameters of the model's two fully connected layers are updated.
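The constraint that fine-tuning updates only the two fully connected layers can be expressed as a parameter mask; the parameter names (`fc1`, `fc2`, `embed`, etc.) below are hypothetical, since the patent does not name the layers. In a PyTorch setting one would equivalently set `requires_grad = False` on every parameter outside those two layers.

```python
def trainable_mask(param_names, tuned_layers=("fc1", "fc2")):
    """Mark which parameters the fine-tuning stage updates.

    Parameters outside the two fully connected layers stay frozen,
    preserving the knowledge acquired during retraining.
    """
    def layer_of(name):            # "decoder.fc1.weight" -> "fc1"
        parts = name.split(".")
        return parts[-2] if len(parts) > 1 else parts[0]
    return {name: layer_of(name) in tuned_layers for name in param_names}

params = ["embed.weight", "attn.query.weight", "decoder.fc1.weight",
          "decoder.fc1.bias", "decoder.fc2.weight"]
mask = trainable_mask(params)
```

Freezing most parameters keeps fine-tuning cheap and prevents the small set of defect-triggering formulas from overwriting the SMT-LIB syntax learned in the retraining stage.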
In step 4, the trained model is used to generate SMT formulas, whose outputs are processed and instantiated into legal test inputs.
In step 5, the instantiated test cases are solved with different SMT solvers, and the results returned by the solvers are compared to check for discrepancies, thereby finding defects. The defects this method can detect fall into three main types: 1) soundness defects: the solver gives an erroneous verdict on the satisfiability of a formula; 2) invalid model defects: the solver gives the correct satisfiability verdict but provides an erroneous model, i.e., a solution that does not satisfy the formula; 3) crashes: an assertion violation or other error causes the solver to terminate abnormally. Whenever a potential defect is found, the input that triggered it (the test case and the corresponding command) and the corresponding output are saved for later review.
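The three defect types can be told apart mechanically from the solvers' outputs. The sketch below assumes a simplified result record per solver; the `status` and `model_ok` field names are illustrative, not the patent's data format.

```python
def classify(results):
    """Differential check over solver outputs for one test input.

    `results` maps a solver name to a dict with keys:
      'status'   -- 'sat', 'unsat', 'unknown', or 'crash'
      'model_ok' -- for 'sat', whether the returned model satisfies the formula
    """
    statuses = {r["status"] for r in results.values()}
    if "crash" in statuses:
        return "crash"                  # assertion violation / abnormal exit
    decided = statuses - {"unknown"}
    if len(decided) > 1:
        return "soundness-defect"       # solvers disagree on satisfiability
    if any(r["status"] == "sat" and not r.get("model_ok", True)
           for r in results.values()):
        return "invalid-model-defect"   # correct verdict, wrong model
    return "consistent"

verdict = classify({"first": {"status": "sat", "model_ok": True},
                    "second": {"status": "unsat"}})
```

Note that invalid-model defects are detectable even with a single solver, since the returned model can be checked against the formula itself; soundness defects, by contrast, need a disagreeing reference.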
The invention uses a pre-trained large language model as a generator of solver test inputs, performing data augmentation and customized training of the model with the "retraining-fine tuning" framework so that it can generate effective SMT solver test inputs and thereby detect defects in solvers. Using the pre-trained model, the invention can generate a large number of diverse test formulas, addressing the lack of effective test inputs in solver testing and effectively improving solver quality and reliability.
Beneficial effects: the method can effectively detect SMT solver defects. By using a large pre-trained language model as a test input generator, it addresses the key challenge that legal and effective test inputs are lacking in SMT solver testing, providing a new solution for solver defect detection.
Drawings
The foregoing and other advantages of the invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings.
FIG. 1 is a flow chart of a solver defect detection method based on a large pre-trained language model.
FIG. 2 is a flow chart for enhancing a data set and training a model.
FIG. 3 is a flow chart of differential testing of different SMT solvers and collection of the corresponding results.
Detailed Description
For a clearer description of the objects, technical solutions, and advantages of the present invention, the invention is described in further detail below with reference to the accompanying drawings.
FIG. 1 shows the flowchart of the invention's SMT solver defect detection method empowered by historical defect cases; it comprises 5 steps, as follows:
step 1, collecting a data set for model training, wherein the data set comprises formulas in a solver test benchmark and solver historical defect use cases;
step 2, enhancing the collected data set, using diversity-oriented mutation and semantics-preserving mutation;
step 3, training the model, including two stages of retraining and fine tuning;
step 4, generating test input by using the obtained model, and instantiating the test input;
step 5, the inconsistent results obtained by different SMT solvers for the same test input and the crashes or memory errors generated by the solvers are considered as potential defects.
FIG. 2 shows a flow chart of step 2 and step 3 for constructing a dataset and training a model from the SMT formulas collected in step 1, as follows:
Step 3-1: for data enhancement, the benchmark formulas undergo diversity-oriented mutation, including sub-formula mutation and operator mutation, while the historical defect-triggering formulas undergo semantics-preserving mutation. When this step is complete, proceed to step 3-2.
Step 3-2: preprocess the enhanced formulas and filter out those unsuitable as training data. Then retrain the model with the enhanced benchmark formulas as the training set. When this step is complete, proceed to step 3-3.
Step 3-3: fine-tune the model trained in step 3-2 with the enhanced historical defect-triggering formulas to obtain the final test case generator.
FIG. 3 shows the flowchart of step 5, which performs differential analysis over different SMT solvers on the test inputs generated in step 4, as follows:
Step 5-1: solve the same test input with multiple SMT solvers; the two solvers under test are denoted the first SMT solver and the second SMT solver. After solving, proceed to step 5-2.
Step 5-2: record the outputs of the different solvers, including but not limited to the following information: the solving result for the test formula (returned on standard output) and solver error information (invalid model reports or abort signals). Then proceed to step 5-3.
Step 5-3: perform differential analysis on the solver outputs obtained in step 5-2. If the solvers' results are identical and there is no anomaly information, end this round and process the next test input starting from step 5-1. Otherwise, go to step 5-4.
Step 5-4: record the information relevant to the solver defect, including but not limited to: the test input that triggered the defect, the commands used to invoke the solvers, and the information returned by the different solvers. Then end this round and process the next test input starting from step 5-1.
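Steps 5-1 through 5-4 amount to one solve-record-compare-report round. In this sketch the solvers are stand-in callables (real use would invoke Z3, cvc5, etc. as subprocesses), and the `(status, info)` output shape is an assumption, not the patent's format.

```python
def differential_round(solvers, test_input):
    """One round of steps 5-1 to 5-4.

    5-1/5-2: run every solver on the same input and record its output;
    5-3: compare the recorded results;
    5-4: if they disagree (or a solver errored), return a defect record
         for later review; otherwise return None and move on.
    """
    outputs = {name: solve(test_input) for name, solve in solvers.items()}
    statuses = {status for status, _info in outputs.values()}
    if len(statuses) == 1 and "error" not in statuses:
        return None                                   # consistent: no defect
    return {"input": test_input, "outputs": outputs}  # step 5-4 record

# Stand-in solvers that disagree on the same formula.
report = differential_round(
    {"first": lambda f: ("sat", "x = 1"), "second": lambda f: ("unsat", "")},
    "(assert (> x 0))",
)
```

The saved record intentionally keeps both the triggering input and every solver's raw output, since reproducing and minimizing the case later requires the exact input-output pair.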
The invention provides an SMT solver defect detection method empowered by historical defect cases, together with ways of realizing the technical scheme. The foregoing is only a preferred embodiment of the present invention. It should be noted that modifications and adaptations may occur to those skilled in the art without departing from the principles of the invention and are intended to fall within its scope. Components not explicitly described in this embodiment can be implemented with the prior art.

Claims (6)

1. A solver defect detection method based on a large pre-trained language model, characterized in that the solver is tested using the large pre-trained language model as a test case generator: new legal and effective test inputs are generated, and defects in the SMT solver are detected by differential verification. The SMT solvers used in the differential verification are typically different implementations, but the same solver may also be used with different solving options; in other words, the method supports cross-verification of multiple solvers as well as self-verification of a single solver. The method mainly comprises the following steps:
1) Model training dataset construction:
two types of SMT formulas are considered when collecting the training data set: formulas from solver test benchmarks, and formulas that triggered solver defects. The benchmark formulas are high-quality, semantically correct SMT formulas representative of the target domain; they contain rich formula-related knowledge from which the model can learn the syntax and semantics of SMT formulas. The historical defect-triggering cases contain the key elements that trigger solver defects, helping the model learn how to generate formulas that effectively trigger them.
2) Data enhancement:
for the model to achieve better performance, it must be given a diverse, high-quality training data set, so the data collected in step 1 is augmented. The augmentation techniques include a diversity-oriented mutation technique and a semantics-preserving mutation technique. For benchmark formulas, the diversity-oriented mutation performs sub-formula mutation and operator mutation, increasing the diversity of the training set as much as possible. The semantics-preserving mutation aims to increase the diversity of the data set while preserving, as much as possible, the defect-triggering ability of the defect-triggering cases. Applying the two mutation strategies to the two types of data respectively yields a higher-quality data set for training the model.
3) "retraining-fine tuning" of a pre-trained model:
first, the pre-trained model is retrained on the augmented benchmark formulas so that it can generate legal SMT formulas. The retrained model is then fine-tuned on the augmented historical defect-triggering formulas, so that it can generate formulas that easily trigger solver defects. During fine-tuning, only the parameters of the model's two fully connected layers are updated.
4) Differential verification compares the outputs of different SMT solvers:
the solution results of the different SMT solvers are compared to check whether they differ, thereby finding defects in the solvers. The defects this method can detect fall into three main types: 1) soundness defects: the solver gives an erroneous verdict on the satisfiability of a formula; 2) invalid model defects: the solver gives the correct satisfiability verdict but provides an erroneous model, i.e., a solution that does not satisfy the formula; 3) crashes: an assertion violation or other error causes the SMT solver to terminate abnormally. If any of the three types of defects is found, the solver is determined to have a defect.
2. The method for detecting defects in a solver based on a large pre-trained language model according to claim 1, characterized in that in step 1) the method supports collecting SMT formulas from test benchmarks and defect cases from the defect trackers of commonly used SMT solvers. The test benchmark here is the official benchmark provided by SMT-LIB, and the commonly used SMT solvers include Z3, cvc5, and Yices 2. The method can also be extended to use SMT formulas from other sources and defect cases of other SMT solvers as training data.
3. The method for detecting defects in a solver based on a large pre-trained language model according to claim 1, wherein in step 2) the method defines diversity-oriented mutation and semantics-preserving mutation for data enhancement. The semantics-preserving mutation is implemented with functionality of the solver Z3, and the data enhancement techniques used by the method can be replaced with other strategies.
4. The method for detecting defects in a solver based on a large pre-trained language model according to claim 1, wherein in step 3) the pre-trained language model used in the method is GPT-2. The model may be replaced by other pre-trained models, such as GPT-3.5, GPT-4, BERT, etc.
5. The method for detecting defects in a solver based on a large pre-trained language model according to claim 1, wherein in step 3) the method trains the model with the "retraining-fine tuning" framework to obtain a customized model serving as the solver test case generator. Other training strategies may also be employed to train the model.
6. The method for detecting defects in a solver based on a large pre-trained language model according to claim 1, wherein in step 4) the method mainly detects three types of defects in the solver: soundness defects, invalid model defects, and crashes. Other types of solver defects, such as completeness defects and performance defects, may also be targeted.
CN202310869091.1A 2023-07-14 2023-07-14 Solver defect detection method based on large pre-training language model Pending CN116932389A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310869091.1A CN116932389A (en) 2023-07-14 2023-07-14 Solver defect detection method based on large pre-training language model


Publications (1)

Publication Number Publication Date
CN116932389A true CN116932389A (en) 2023-10-24

Family

ID=88390448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310869091.1A Pending CN116932389A (en) 2023-07-14 2023-07-14 Solver defect detection method based on large pre-training language model

Country Status (1)

Country Link
CN (1) CN116932389A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117130943A (en) * 2023-10-26 2023-11-28 北京一平方科技有限公司 Test case generation and operation and maintenance data analysis method based on large language model
CN117130943B (en) * 2023-10-26 2024-02-20 北京一平方科技有限公司 Test case generation and operation and maintenance data analysis method based on large language model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination