CN117707948A - Training method and device for test case generation model - Google Patents

Training method and device for test case generation model

Info

Publication number
CN117707948A
Authority
CN
China
Prior art keywords
code
training
test case
test
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311700286.XA
Other languages
Chinese (zh)
Inventor
张剑飞
周海莲
赵红兵
蔡文婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202311700286.XA
Publication of CN117707948A
Legal status: Pending

Landscapes

  • Stored Programmes (AREA)

Abstract

The embodiments of this specification provide a training method and a training apparatus for a test case generation model. The training method includes: parsing a code file to obtain structured code, and extracting the context code of a unit under test from the structured code; parsing the test case file of the code file, and extracting from the parsing result the test case that has a mapping relationship with the context code; assembling the context code and the test case to obtain a training sample; and storing the training sample in a training sample set, so that a pre-trained model can be trained on the training sample set to obtain the test case generation model.

Description

Training method and device for test case generation model
Technical Field
The present document relates to the technical field of code testing, and in particular, to a training method and apparatus for a test case generation model.
Background
As software development grows in scale and software architectures grow in complexity, automated software testing has become widespread. An important part of automated software testing is the automatic generation of test cases. A test case is a description of a test task for a specific software product; it embodies the test scheme, method, technique, and strategy, and its contents include the test target, input data, test steps, expected results, and so on. As test case generation tools continue to multiply, more and more test cases can be generated during software testing. However, software testing is often constrained by test time and test cost, so how to generate high-quality test cases has become a focus in automated software testing scenarios.
Disclosure of Invention
One or more embodiments of the present specification provide a training method for a test case generation model, including: parsing a code file to obtain structured code, and extracting the context code of a unit under test from the structured code; parsing the test case file of the code file, and extracting from the parsing result the test case that has a mapping relationship with the context code; assembling the context code and the test case to obtain a training sample; and storing the training sample in a training sample set, so as to train a pre-trained model on the training sample set to obtain the test case generation model.
One or more embodiments of the present specification provide a training apparatus for a test case generation model, including: a code file parsing module configured to parse a code file to obtain structured code and to extract the context code of a unit under test from the structured code; a test case parsing module configured to parse the test case file of the code file and to extract from the parsing result the test case that has a mapping relationship with the context code; a training sample assembly module configured to assemble the context code and the test case to obtain a training sample; and a training sample storage module configured to store the training sample in a training sample set, so as to train a pre-trained model on the training sample set to obtain the test case generation model.
One or more embodiments of the present specification provide a training device for a test case generation model, including: a processor; and a memory configured to store computer-executable instructions that, when executed, cause the processor to: parse a code file to obtain structured code, and extract the context code of a unit under test from the structured code; parse the test case file of the code file, and extract from the parsing result the test case that has a mapping relationship with the context code; assemble the context code and the test case to obtain a training sample; and store the training sample in a training sample set, so as to train a pre-trained model on the training sample set to obtain the test case generation model.
One or more embodiments of the present specification provide a storage medium storing computer-executable instructions that, when executed by a processor, implement the following: parsing a code file to obtain structured code, and extracting the context code of a unit under test from the structured code; parsing the test case file of the code file, and extracting from the parsing result the test case that has a mapping relationship with the context code; assembling the context code and the test case to obtain a training sample; and storing the training sample in a training sample set, so as to train a pre-trained model on the training sample set to obtain the test case generation model.
Drawings
To describe one or more embodiments of the present specification, or the solutions of the prior art, more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some of the embodiments of the present specification; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of an implementation environment of a training method for a test case generation model according to one or more embodiments of the present specification;
FIG. 2 is a process flow diagram of a training method for a test case generation model according to one or more embodiments of the present specification;
FIG. 3 is a process flow diagram of a training method for a test case generation model applied to a code testing scenario according to one or more embodiments of the present specification;
FIG. 4 is a schematic diagram of an embodiment of a training apparatus for a test case generation model according to one or more embodiments of the present specification;
FIG. 5 is a schematic diagram of a training device for a test case generation model according to one or more embodiments of the present specification.
Detailed Description
To help a person skilled in the art better understand the technical solutions in one or more embodiments of the present specification, these technical solutions are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present specification. All other embodiments obtained from one or more embodiments of the present specification without inventive effort shall fall within the scope of protection of this document.
The training method for a test case generation model provided in one or more embodiments of the present specification is applicable to an implementation environment for test case generation. Referring to FIG. 1, the implementation environment includes at least:
a pre-trained model 101 for generating test cases, a sample library 102 for storing the training sample set, and a training system 103 for training the pre-trained model. In addition, the implementation environment may further include a code repository 104 that stores code files. The test case files of the code files may also be stored in the code repository 104, or may be generated by inputting the code files into a test case generation component 105.
In this implementation environment, training of the test case generation model starts from a code file and the test case file of that code file. On one hand, the code file is parsed to obtain structured code, and the context code of the unit under test is extracted from the structured code. On the other hand, the test case file of the code file is parsed, and the test case that has a mapping relationship with the context code is extracted from the parsing result. The context code and the test case are then assembled into a training sample, and the assembled training sample is stored in a training sample set, which may be kept in the sample library 102. During model training, the training system 103 reads training samples of the training sample set from the sample library 102 and uses them to train the pre-trained model 101, obtaining a test case generation model that can generate corresponding test cases for input code.
The code files may be read from the code repository 104 by the training system 103. The test case files of the code files may likewise be read from the code repository 104 by the training system 103, or may be generated by inputting the code files into the test case generation component 105.
One or more embodiments of the training method for a test case generation model provided in the present specification are as follows:
Referring to FIG. 2, the training method for a test case generation model provided in this embodiment includes steps S202 to S208.
Step S202, parsing the code file to obtain structured code, and extracting the context code of the unit under test from the structured code.
The code file in this embodiment is a file containing source code, such as a text file containing source code. The code file may be obtained from a code repository: to make acquisition more efficient, it may be obtained from a code repository of an open-source code platform, or it may be read from a private code repository.
When code files are obtained from a code repository of an open-source code platform, the validity of the obtained code files affects the validity of the training samples later assembled from them. To improve that validity, in an optional implementation provided in this embodiment, code files are extracted, according to the code configuration of the code repositories of the open-source code platform, from a code repository whose code configuration meets a preset condition; alternatively, the code files whose code configuration meets the preset condition are extracted from a code repository of the open-source code platform.
The code configuration of a code repository includes access record information for the code files in the repository (such as the access or favorite count of the code files), the framework format of the test case files of the code files in the repository, code integration information of the repository, and/or code coverage. Correspondingly, the code configuration meets the preset condition when the access record information of the code files in the repository is greater than a preset access record threshold (for example, the access or favorite count is greater than a preset threshold), or the framework format of the test case files is a standard framework format, or the code integration information meets a specific requirement, or the code coverage falls within a preset code coverage range.
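The preset-condition check described above can be sketched as follows. This is a minimal Python illustration, not the applicant's implementation: the `RepoConfig` fields, the threshold values, and the set of standard frameworks are all hypothetical, since the source gives no concrete values. Because the source joins the conditions with "or", the sketch accepts a repository when any one condition holds.

```python
from dataclasses import dataclass


@dataclass
class RepoConfig:
    """Hypothetical summary of one repository's code configuration."""
    name: str
    favorite_count: int    # access record information (e.g. favorites)
    test_framework: str    # framework format of the test case files
    has_integration: bool  # stand-in for "code integration information"
    coverage: float        # code coverage as a fraction, 0.0 to 1.0


# Illustrative values only; the source does not specify any thresholds.
MIN_FAVORITES = 100
STANDARD_FRAMEWORKS = {"junit", "pytest"}
COVERAGE_RANGE = (0.5, 1.0)


def meets_preset_condition(cfg: RepoConfig) -> bool:
    """Return True if at least one of the preset conditions holds."""
    low, high = COVERAGE_RANGE
    return (
        cfg.favorite_count > MIN_FAVORITES
        or cfg.test_framework.lower() in STANDARD_FRAMEWORKS
        or cfg.has_integration
        or low <= cfg.coverage <= high
    )


def select_repositories(configs):
    """Keep only the repositories whose configuration qualifies."""
    return [cfg for cfg in configs if meets_preset_condition(cfg)]
```

A stricter pipeline could require all conditions at once by replacing `or` with `and`; the source leaves that choice open.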
In addition, code files in different programming languages may be obtained, so that training samples in different programming languages are assembled from them. After the pre-trained model is trained with training samples in different programming languages, the resulting test case generation model can generate corresponding test cases for code under test in different programming languages.
In a specific implementation, to improve parsing efficiency, a code parser may be used to perform structural parsing on the code file. Structural parsing turns the code file into structured code, and the context code of the unit under test is extracted from the structured code.
The unit under test is the code unit that is tested during software testing; it may be a method in the code file, or a class or other object in the code file. The context code of the unit under test includes the execution code of the unit under test, and may also contain definition code associated with the definition of the unit under test and/or call code associated with the other code units that the unit under test calls.
Taking the Java language as an example, when the unit under test is a method (the method under test), its context code includes the method body of the method under test, its method definition code, and its method call code. The method definition code may be the definition of the class in which the method under test is located, member variable definitions, and/or constructor definitions associated with the method under test. The method call code may be the code of the other private methods of the class that the method under test calls.
In an optional implementation provided in this embodiment, parsing the code file to obtain structured code and extracting the context code of the unit under test from the structured code includes:
performing structural parsing on the code file with a code parser, and taking the resulting code syntax tree as the structured code;
extracting the execution code of the unit under test from the structured code, extracting the definition code and/or the call code of the unit under test, and taking the extracted execution code together with the definition code and/or the call code as the context code of the unit under test.
For example, the code file is structurally parsed by an AST (Abstract Syntax Tree) parser to obtain the abstract syntax tree of the code file. When the unit under test is a method, the method body of the method under test is extracted from the abstract syntax tree; the class definition, member variable definitions, and constructor definitions associated with the method under test are extracted from the abstract syntax tree as its method definition code; and the code of the other private methods of the class that the method under test calls is extracted from the abstract syntax tree as its method call code. Together, these form the context code of the current method under test.
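The extraction steps above can be illustrated with a short sketch. The source describes Java; this sketch instead uses Python's standard `ast` module on a small Python class, purely as a stand-in. The sample class `Indexer`, the function `extract_context`, the dictionary keys, and the convention that a leading underscore marks a private method are all assumptions made for illustration.

```python
import ast

# Hypothetical code file containing the unit under test, init_index().
SOURCE = '''
class Indexer:
    batch_size = 10

    def __init__(self, client):
        self.client = client

    def _connect(self):
        return self.client.ping()

    def _unused(self):
        return None

    def init_index(self):
        if self._connect():
            return self.batch_size
        return 0
'''


def extract_context(source, class_name, method_name):
    """Collect execution, definition, and call code for one method."""
    tree = ast.parse(source)  # the syntax tree is the "structured code"
    cls = next(
        n for n in ast.walk(tree)
        if isinstance(n, ast.ClassDef) and n.name == class_name
    )
    methods = {n.name: n for n in cls.body if isinstance(n, ast.FunctionDef)}
    target = methods[method_name]

    # Names of methods invoked as self.<name>(...) in the method under test.
    called = {
        node.func.attr
        for node in ast.walk(target)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and isinstance(node.func.value, ast.Name)
        and node.func.value.id == "self"
    }

    def seg(node):
        return ast.get_source_segment(source, node)

    return {
        "class_definition": seg(cls).splitlines()[0],
        "member_variables": [
            seg(n) for n in cls.body
            if isinstance(n, (ast.Assign, ast.AnnAssign))
        ],
        "constructor": seg(methods["__init__"]) if "__init__" in methods else None,
        "execution_code": seg(target),
        # Only private methods (underscore prefix) that the target calls.
        "call_code": [
            seg(methods[name]) for name in sorted(called)
            if name in methods and name.startswith("_")
        ],
    }
```

Calling `extract_context(SOURCE, "Indexer", "init_index")` gathers the body of `init_index`, the class header, the `batch_size` member, the constructor, and the called helper `_connect`, while leaving out `_unused`, which the method under test never calls.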
For the method under test initSolrIndex() in the Java language, the context code of initSolrIndex() extracted from the abstract syntax tree is exemplified as follows:
Step S204, parsing the test case file of the code file, and extracting from the parsing result the test case that has a mapping relationship with the context code.
The test case file of a code file is the file containing the test cases for the source code in that code file. When the code file is extracted, according to the code configuration of the code repositories of the open-source code platform, from a code repository whose code configuration meets the preset condition, the test case file of the code file may further be extracted from that repository. Alternatively, when a code file whose code configuration meets the preset condition is extracted from a code repository of the open-source code platform, the test case file of that code file may further be extracted.
Further, for a code file read from a private code repository, if no test case file has yet been generated for it, a corresponding test case file may be generated by inputting the code file into the test case generation component. Similarly, if a code file extracted from a code repository of the open-source code platform does not yet have a test case file, it may also be input into the test case generation component to generate the corresponding test case file.
In a specific implementation, to improve the efficiency of parsing the test case file, a code parser may be used to perform structural parsing on the test case file. Structural parsing turns the test case file into a structured test case, and the test case of the unit under test is extracted from the structured test case.
In an optional implementation provided in this embodiment, parsing the test case file of the code file and extracting from the parsing result the test case that has a mapping relationship with the context code includes:
performing structural parsing on the test case file to obtain a use case syntax tree of the test case file;
determining, according to the identification information of the unit under test contained in the context code, the use case segment in the use case syntax tree that contains the identification information;
extracting the test case of the unit under test from the use case segment.
For example, the test case file of the code file is structurally parsed by an AST (Abstract Syntax Tree) parser to obtain the abstract syntax tree of the test case file. The classes that contain the method name of the method under test are first located in the abstract syntax tree of the test case file, and the method test cases of the method under test are then determined and extracted from the located classes.
For the method under test initSolrIndex() in the Java language, a specific example of the test case of initSolrIndex() extracted from the abstract syntax tree of the test case file is as follows:
Step S206, assembling the context code and the test case to obtain a training sample.
In this embodiment, the test case generation model is obtained by training on the basis of a pre-trained model. The pre-trained model is a natural language model obtained by pre-training; it may use a neural network architecture with a large number of parameters, and may be a pre-trained large language model (Large Language Model, LLM), such as ChatGPT (Chat Generative Pre-trained Transformer) or any of various open-source large language models.
A pre-trained model can be fine-tuned for a specific task to obtain a processing model that handles that task and so meets the requirements of a particular field or task. Here, the pre-trained model is further trained in the field of test case generation to obtain a test case generation model capable of generating test cases.
When the pre-trained model is a large language model, a training sample is assembled by combining the context code of the unit under test with the test case of the unit under test. Specifically, a training sample for a large language model consists of a prompt (Prompt) and an answer (Answer): during assembly, the context code of the unit under test serves as the prompt and the test case of the unit under test serves as the answer.
Step S208, storing the training sample in a training sample set, so as to train the pre-trained model on the training sample set to obtain the test case generation model.
The training sample obtained by assembling the context code and the test case of the unit under test is stored in a training sample set, and the pre-trained model is trained on the training samples in that set to obtain the test case generation model. During training, the context code of the unit under test contained in a training sample is used as the training input, and the test case of the unit under test contained in the training sample is used as the sample label, for supervised training of the pre-trained model. Through supervised training, the model learns the mapping relationship between the context code of a unit under test and its test case, and on that basis acquires the ability to generate test cases.
In a specific implementation, to improve the training effect of the test case generation model, the training samples in the training sample set are filtered on the basis of the test cases they contain. Filtering improves the quality of the training samples and helps the pre-trained model learn more effective knowledge from higher-quality samples, so that the test cases generated by the test case generation model are of higher quality.
In an optional implementation provided in this embodiment, the training samples are filtered according to whether their test cases can be compiled. Specifically, compile detection is performed on the test case contained in each training sample in the training sample set, and the training samples whose test cases fail compile detection are deleted from the training sample set.
Compile detection checks whether a test case can be compiled and executed. A test case that cannot be compiled and executed cannot be used for software testing. If the pre-trained model learns from training samples containing such test cases, the test cases generated by the trained test case generation model may themselves fail to compile and execute. Training samples whose test cases cannot be compiled and executed therefore need to be deleted from the training sample set.
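A minimal sketch of compile detection, assuming the test cases are Python source and that each training sample is a dictionary with an `answer` field (both assumptions). It uses Python's built-in `compile()`, which only verifies that the code compiles; it does not execute the test, so it covers only the first half of "compiled and executed". For Java test cases, an equivalent check would invoke a Java compiler instead.

```python
def passes_compile_check(test_case):
    """Return True if the test case source compiles without error."""
    try:
        compile(test_case, "<test_case>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


def filter_compilable(samples):
    """Delete training samples whose test case fails compile detection."""
    return [s for s in samples if passes_compile_check(s["answer"])]


samples = [
    {"prompt": "p1", "answer": "def test_ok():\n    assert 1 + 1 == 2\n"},
    {"prompt": "p2", "answer": "def test_bad(:\n    pass\n"},
]
kept = filter_compilable(samples)
```

Here `kept` retains only the first sample, because the second test case has a malformed parameter list.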
In another optional implementation provided in this embodiment, the training samples are filtered according to the use case specification of the test cases they contain. Training samples whose test cases do not meet the use case specification are deleted from the training sample set, so that model training uses only training samples that meet the specification and the trained test case generation model generates test cases that meet the specification.
Filtering by use case specification is implemented as follows: integrity detection is performed on the test case contained in each training sample, and the training samples whose test cases fail integrity detection are deleted from the training sample set.
In practical applications, a unit under test may contain only one execution link (execution path) during execution, or it may contain two or more. For example, a unit under test containing a conditional statement has two execution links, and a unit under test containing a selection statement has multiple execution links. For a unit under test with multiple execution links, each execution link needs to be tested separately to ensure that testing is complete and comprehensive, and testing different execution links requires different test cases. Therefore, when filtering training samples, the training sample set may also be filtered according to whether the test cases cover every execution link of the unit under test: training samples whose test cases do not cover every execution link are deleted from the training sample set.
Specifically, in an optional implementation provided in this embodiment, the training sample set is filtered according to whether the test cases cover every execution link of the unit under test, as follows:
parsing the context code contained in each training sample in the training sample set to obtain the execution links of that context code;
screening out, from the training samples, the candidate training samples that have multiple execution links;
detecting whether the test cases contained in each candidate training sample include a test case for every execution link, and if not, deleting that candidate training sample from the training sample set.
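The three steps above can be sketched roughly as follows. The source does not say how execution links are derived or how a test case is matched to a link, so this sketch substitutes two crude proxies, both assumptions: the number of execution links is approximated as one plus the number of `if` branch points in the context code, and coverage is approximated by requiring at least as many test functions as execution links. Real path coverage would require running the tests under a coverage tool.

```python
import ast


def count_execution_links(context_code):
    """Approximate link count: one base path plus one per branch point."""
    tree = ast.parse(context_code)
    branches = sum(
        isinstance(n, (ast.If, ast.IfExp)) for n in ast.walk(tree)
    )
    return 1 + branches


def count_test_functions(test_case):
    """Count functions whose name starts with 'test' in the test case."""
    tree = ast.parse(test_case)
    return sum(
        isinstance(n, ast.FunctionDef) and n.name.startswith("test")
        for n in ast.walk(tree)
    )


def filter_by_link_coverage(samples):
    """Drop multi-link samples that have fewer tests than links (proxy)."""
    kept = []
    for s in samples:
        links = count_execution_links(s["prompt"])
        is_candidate = links > 1  # only multi-link samples are checked
        if is_candidate and count_test_functions(s["answer"]) < links:
            continue
        kept.append(s)
    return kept


CODE = "def sign(x):\n    if x < 0:\n        return -1\n    return 1\n"
FULL = (
    "def test_neg():\n    assert sign(-1) == -1\n\n"
    "def test_pos():\n    assert sign(1) == 1\n"
)
PARTIAL = "def test_pos():\n    assert sign(1) == 1\n"
```

With these inputs, a sample pairing `CODE` with `FULL` is kept, while a sample pairing `CODE` with `PARTIAL` is removed because only one of the two approximate links has a test.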
In addition, in a specific implementation, the filtering approaches above may instead be applied as sample detection on the training sample assembled in step S206: if sample detection passes, the assembled training sample is stored in the training sample set; if sample detection fails, the assembled training sample is deleted.
For example, compile detection is performed on the test case contained in the assembled training sample. If compile detection fails, the assembled training sample is deleted; if compile detection passes, step S208 is executed and the assembled training sample is stored in the training sample set.
For another example, use case specification detection is performed on the test case contained in the assembled training sample. If use case specification detection fails, the assembled training sample is deleted; if it passes, step S208 is executed and the assembled training sample is stored in the training sample set.
For another example, the context code contained in the assembled training sample is parsed to obtain its execution links. If there are multiple execution links, it is detected whether the test cases contained in the assembled training sample include a test case for every execution link. If not, the assembled training sample is deleted; if so, step S208 is executed and the assembled training sample is stored in the training sample set.
It should be noted that any two, or all three, of the above ways of performing sample detection on the training sample assembled in step S206 may be combined. For example, the context code contained in the assembled training sample is parsed to obtain its execution links; if there are multiple execution links, it is detected whether the test cases contained in the assembled training sample include a test case for every execution link, and if not, the assembled training sample is deleted. If so, use case specification detection is further performed on the test cases contained in the assembled training sample: if use case specification detection fails, the assembled training sample is deleted; if it passes, step S208 is executed and the assembled training sample is stored in the training sample set.
In this embodiment, when the pre-trained model is trained to obtain the test case generation model, the context code of the unit under test contained in a training sample is used as the training input, and the test case of the unit under test contained in the training sample is used as the sample label, for supervised training of the pre-trained model. Through supervised training, the model learns the mapping relationship between the context code of a unit under test and its test case, and on that basis acquires the ability to generate test cases.
Specifically, in an optional implementation provided in this embodiment, training the pre-trained model on the training sample set to obtain the test case generation model includes:
inputting the training samples in the training sample set into the pre-trained model for test case generation to obtain output test cases;
determining a training loss based on the output test cases and the test cases contained in the training samples, and adjusting the parameters of the pre-trained model based on the training loss.
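The source does not name a loss function. A common choice for supervised fine-tuning of a language model is token-level cross-entropy between the model's predicted token distribution and the reference test case tokens, and the toy sketch below assumes that choice. To stay self-contained it treats the per-position logits themselves as the "parameters" and applies one plain gradient-descent step; a real system would backpropagate through the model's weights with a deep learning framework.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def training_loss(logits_per_step, target_ids):
    """Mean negative log-likelihood of the reference tokens."""
    probs = [softmax(step) for step in logits_per_step]
    nll = [-math.log(p[t]) for p, t in zip(probs, target_ids)]
    return sum(nll) / len(nll)


def adjust_parameters(logits_per_step, target_ids, lr=0.5):
    """One gradient step; d(loss)/d(logit) = (softmax - one_hot) / n."""
    n = len(target_ids)
    updated = []
    for logits, target in zip(logits_per_step, target_ids):
        probs = softmax(logits)
        grad = [
            (p - (1.0 if i == target else 0.0)) / n
            for i, p in enumerate(probs)
        ]
        updated.append([x - lr * g for x, g in zip(logits, grad)])
    return updated


# Toy vocabulary of 3 tokens; the reference test case is tokens [2, 0].
logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
targets = [2, 0]
loss_before = training_loss(logits, targets)
loss_after = training_loss(adjust_parameters(logits, targets), targets)
```

After the update, the logits of the reference tokens rise relative to the others, so `loss_after` is lower than `loss_before`.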
In addition, to improve the training effect so that the trained test case generation model can generate high-quality test cases, the training effect may be tested during training. The test result is used to determine which types of training sample have a negative effect on training, and training samples of those types are deleted from the training sample set so that they do not affect the subsequent training process.
In an optional implementation manner provided in this embodiment, training the pre-training model according to the training sample set to obtain a test case generating model includes:
inputting each test sample contained in a test sample set into the pre-training model to perform test case generation to obtain test cases of each test sample;
clustering the test cases of each test sample to obtain at least one test case cluster, and determining a negative cluster in the at least one test case cluster;
deleting the training samples corresponding to the negative clusters from the training sample set, so as to train the intermediate model according to the training sample set after deletion to obtain the test case generation model. The intermediate model is the model obtained after training the pre-training model with part of the training samples in the training sample set.
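The clustering-based deletion above can be sketched as follows. The embodiment does not specify the clustering algorithm, the criterion for a negative cluster, or how a cluster is mapped back to training samples, so everything here is an assumption: greedy clustering by token-set (Jaccard) similarity, a cluster counted as negative when none of its generated test cases contains an assertion, and a hypothetical `type` field shared by test samples and training samples.

```python
def jaccard(a, b):
    """Token-set similarity between two generated test cases."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def cluster_cases(cases, threshold=0.5):
    """Greedy clustering: a case joins the first cluster whose first member is similar enough."""
    clusters = []
    for idx, text in enumerate(cases):
        for members in clusters:
            if jaccard(cases[members[0]], text) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters

def is_negative(cluster, cases):
    """Assumed criterion: no generated case in the cluster contains an assertion."""
    return all("assert" not in cases[i] for i in cluster)

def prune_training_set(training_set, test_samples, generated_cases):
    """Delete training samples whose (hypothetical) type matches a negative cluster."""
    clusters = cluster_cases(generated_cases)
    negative_types = {
        test_samples[i]["type"]
        for c in clusters if is_negative(c, generated_cases)
        for i in c
    }
    return [s for s in training_set if s["type"] not in negative_types]

# Toy data: outputs of the intermediate model for three test samples.
test_samples = [{"type": "getter"}, {"type": "getter"}, {"type": "arithmetic"}]
generated = [
    "def test_get(): obj.get_name()",
    "def test_get(): obj.get_age()",
    "def test_add(): assert add(1, 2) == 3",
]
training_set = [{"id": 1, "type": "getter"}, {"id": 2, "type": "arithmetic"}]
clusters = cluster_cases(generated)
pruned = prune_training_set(training_set, test_samples, generated)
```

In this toy run the two assertion-free cases fall into one cluster, that cluster is treated as negative, and the training sample of the matching type is removed before training continues on the intermediate model.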
In summary, the training method of the test case generation model provided in this embodiment starts from a code file and a test case file of the code file. On the one hand, the code file is analyzed to obtain a structured code, and the context code of the tested unit is extracted from the structured code; on the other hand, the test case file of the code file is analyzed, and the test case having a mapping relationship with the context code is extracted from the analysis result. The context code and the test case are then assembled to obtain a training sample, the assembled training sample is stored in a training sample set, and the pre-training model is trained according to the training samples in the training sample set to obtain the test case generation model. In the training process, the context code of the tested unit contained in the training sample is used as the training input and the test case of the tested unit contained in the training sample is used as the sample label to perform supervised training on the pre-training model, so that the mapping relationship between the context code of the tested unit and the test case of the tested unit is learned, and the resulting test case generation model has the capability of generating the corresponding test case for a tested unit.
Further, the training samples in the training sample set are filtered on the basis of the test cases they contain, which improves the quality of the training samples and the training effect of the pre-training model, so that the trained test case generation model is capable of generating test cases of higher quality.
The training method of the test case generation model provided in this embodiment is further described below, taking its application to a code test scenario as an example. Referring to fig. 3, the training method of the test case generation model applied to the code test scenario specifically includes the following steps.
Step S302, reading a code file in a code warehouse of the open source code platform, and reading a test case file of the code file in the code warehouse.
Step S304, carrying out structural analysis on the code file through an AST analyzer to obtain a code grammar tree.
Step S306, extracting the context code of the tested method from the code grammar tree.
Step S308, carrying out structural analysis on the test case file to obtain a case grammar tree of the test case file.
Step S310, extracting the test case of the tested method from the case grammar tree according to the method name contained in the context code of the tested method.
Step S312, the context code of the tested method and the test case of the tested method are assembled to obtain a training sample.
Step S314, storing the training samples into a training sample set to train the pre-training model according to the training sample set to obtain a test case generation model.
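Steps S304 to S314 can be sketched with Python's standard `ast` module standing in for the AST analyzer. This is only an illustration under stated assumptions: the in-memory strings `CODE_FILE` and `TEST_FILE` replace the files read from a code warehouse in step S302, the helper names are hypothetical, and a test case is treated as belonging to the tested method when it directly calls a function of that name, which is one possible reading of matching "according to the method name".

```python
import ast

CODE_FILE = '''
def add(a, b):
    return a + b
'''

TEST_FILE = '''
def test_add():
    assert add(1, 2) == 3

def test_other():
    assert True
'''

def extract_context(code_src, method_name):
    """S304-S306: parse the code file and return the source of the tested method."""
    tree = ast.parse(code_src)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == method_name:
            return ast.get_source_segment(code_src, node)
    return None

def extract_cases(test_src, method_name):
    """S308-S310: parse the test case file and keep the tests that call the tested method."""
    tree = ast.parse(test_src)
    cases = []
    for node in tree.body:
        if not (isinstance(node, ast.FunctionDef) and node.name.startswith("test_")):
            continue
        calls = [n for n in ast.walk(node)
                 if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)]
        if any(c.func.id == method_name for c in calls):
            cases.append(ast.get_source_segment(test_src, node))
    return cases

def assemble(code_src, test_src, method_name):
    """S312: pair the context code with its test cases."""
    context = extract_context(code_src, method_name)
    cases = extract_cases(test_src, method_name)
    if context is None or not cases:
        return None
    return {"context": context, "test_case": "\n\n".join(cases)}

training_set = []  # S314: the training sample set
sample = assemble(CODE_FILE, TEST_FILE, "add")
if sample is not None:
    training_set.append(sample)
```

Here the context code is only the body of the tested method; the definition code and calling code mentioned elsewhere in this specification are left out to keep the sketch short.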
In addition, after step S312, any one of the following four implementations of sample detection may be used to perform sample detection on the training sample obtained by assembly:
Implementation one:
performing compilation detection on the test cases contained in the assembled training sample, and deleting the assembled training sample if the compilation detection fails; if the compilation detection passes, executing step S314 to store the assembled training sample into the training sample set;
Implementation two:
performing case specification detection on the test cases contained in the assembled training sample, and deleting the assembled training sample if the case specification detection fails; if the case specification detection passes, executing step S314 to store the assembled training sample into the training sample set;
Implementation three:
analyzing the context code contained in the assembled training sample to obtain the execution links of the context code; if there are multiple execution links, detecting whether the test cases contained in the assembled training sample include a test case for every execution link, and if not, deleting the assembled training sample; if so, executing step S314 to store the assembled training sample into the training sample set;
Implementation four:
analyzing the context code contained in the assembled training sample to obtain the execution links of the context code; if there are multiple execution links, detecting whether the test cases contained in the assembled training sample include a test case for every execution link, and if not, deleting the assembled training sample; if so, further performing case specification detection on the test cases contained in the assembled training sample, and deleting the assembled training sample if the case specification detection fails; if the case specification detection passes, executing step S314 to store the assembled training sample into the training sample set.
An embodiment of a training apparatus for a test case generation model provided in this specification is as follows:
The foregoing embodiments provide a training method of a test case generation model. Correspondingly, a training apparatus for a test case generation model is also provided, which is described below with reference to the accompanying drawings.
Referring to fig. 4, a schematic diagram of an embodiment of the training apparatus for a test case generation model provided in this embodiment is shown.
Since the apparatus embodiments correspond to the method embodiments, they are described relatively simply; for relevant details, refer to the corresponding descriptions of the method embodiments provided above. The apparatus embodiments described below are merely illustrative.
This embodiment provides a training apparatus for a test case generation model, which comprises:
a code file parsing module 402, configured to parse the code file to obtain a structured code, and extract a context code of the unit under test from the structured code;
the test case analysis module 404 is configured to analyze the test case file of the code file, and extract the test case having a mapping relation with the context code from the analysis result;
a training sample assembling module 406 configured to assemble the context code and the test case to obtain a training sample;
The training sample storage module 408 is configured to store the training samples in a training sample set, so as to train the pre-training model according to the training sample set to obtain a test case generation model.
An embodiment of a training device for a test case generation model provided in this specification is as follows:
Corresponding to the training method of the test case generation model described above and based on the same technical concept, one or more embodiments of this specification further provide a training device for a test case generation model, which is used to execute the training method of the test case generation model provided above. Fig. 5 is a schematic structural diagram of the training device for a test case generation model provided by one or more embodiments of this specification.
The training device for a test case generation model provided in this embodiment is as follows:
As shown in fig. 5, the training device for the test case generation model may vary considerably depending on configuration or performance, and may include one or more processors 501 and a memory 502, where the memory 502 may store one or more applications or data. The memory 502 may be transient storage or persistent storage. An application stored in the memory 502 may include one or more modules (not shown in the figure), each of which may include a series of computer-executable instructions for the training device of the test case generation model. Still further, the processor 501 may be configured to communicate with the memory 502 and execute, on the training device of the test case generation model, the series of computer-executable instructions in the memory 502. The training device of the test case generation model may also include one or more power supplies 503, one or more wired or wireless network interfaces 504, one or more input/output interfaces 505, one or more keyboards 506, and the like.
In a particular embodiment, the training device for a test case generation model includes a memory and one or more programs, wherein the one or more programs are stored in the memory, the one or more programs may include one or more modules, each module may include a series of computer-executable instructions for the training device of the test case generation model, and the one or more programs, configured to be executed by one or more processors, include computer-executable instructions for:
analyzing the code file to obtain a structured code, and extracting the context code of the unit to be tested from the structured code;
analyzing the test case file of the code file, and extracting the test case with a mapping relation with the context code from the analysis result;
assembling the context code and the test case to obtain a training sample;
and storing the training samples into a training sample set so as to train the pre-training model according to the training sample set to obtain a test case generation model.
An embodiment of a storage medium provided in the present specification is as follows:
Corresponding to the training method of the test case generation model described above and based on the same technical concept, one or more embodiments of this specification further provide a storage medium.
The storage medium provided in this embodiment is configured to store computer executable instructions that, when executed by a processor, implement the following flow:
analyzing the code file to obtain a structured code, and extracting the context code of the unit to be tested from the structured code;
analyzing the test case file of the code file, and extracting the test case with a mapping relation with the context code from the analysis result;
assembling the context code and the test case to obtain a training sample;
and storing the training samples into a training sample set so as to train the pre-training model according to the training sample set to obtain a test case generation model.
It should be noted that the embodiment of the storage medium and the embodiment of the training method of the test case generation model in this specification are based on the same inventive concept, so the specific implementation of this embodiment may refer to the implementation of the corresponding method described above, and repeated descriptions are omitted.
In this specification, the embodiments are described in a progressive manner. The same or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. For example, the apparatus embodiment, the device embodiment, and the storage medium embodiment are all similar to the method embodiment and are therefore described relatively simply; for relevant content, refer to the corresponding description of the method embodiment.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). With the development of technology, however, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (e.g., a field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer "integrates" a digital system onto a single PLD by programming it, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language), of which VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logic method flow can be readily obtained simply by logic-programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of such controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, it is entirely possible to logic-program the method steps so that the controller achieves the same functionality in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing various functions may also be regarded as structures within the hardware component. Or even, the means for performing the various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each unit may be implemented in the same piece or pieces of software and/or hardware when implementing the embodiments of the present specification.
One skilled in the relevant art will recognize that one or more embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
One or more embodiments of the present specification may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. One or more embodiments of the specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The foregoing description is by way of example only and is not intended to limit the present disclosure. Various modifications and changes may occur to those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. that fall within the spirit and principles of the present document are intended to be included within the scope of the claims of the present document.

Claims (12)

1. A training method of a test case generation model comprises the following steps:
analyzing the code file to obtain a structured code, and extracting the context code of the unit to be tested from the structured code;
analyzing the test case file of the code file, and extracting the test case with a mapping relation with the context code from the analysis result;
assembling the context code and the test case to obtain a training sample;
and storing the training samples into a training sample set so as to train the pre-training model according to the training sample set to obtain a test case generation model.
2. The training method of the test case generation model according to claim 1, wherein the analyzing the code file to obtain a structured code, and extracting the context code of the unit to be tested from the structured code comprises:
carrying out structural analysis on the code file through a code analyzer, and taking a code grammar tree obtained by analysis as the structured code;
and extracting the execution code of the tested unit from the structured code, and extracting the definition code and/or the calling code of the tested unit.
3. The training method of test case generation model according to claim 1, wherein the analyzing the test case file of the code file, extracting the test case having a mapping relation with the context code from the analysis result, comprises:
carrying out structural analysis on the test case file to obtain a case grammar tree of the test case file;
determining a use case segment containing the identification information in the use case grammar tree according to the identification information of the unit under test contained in the context code;
and extracting the test cases of the tested unit from the case fragments.
4. The training method of the test case generation model according to claim 1, further comprising, before the step of analyzing the code file to obtain a structured code and extracting the context code of the unit to be tested from the structured code is performed:
extracting the code file from a code warehouse of which the code configuration meets preset conditions according to the code configuration of the code warehouse of the open source code platform;
and extracting the test case file of the code file from a code warehouse of which the code configuration meets the preset condition.
5. The training method of the test case generation model according to claim 1, further comprising, before the step of analyzing the code file to obtain a structured code and extracting the context code of the unit to be tested from the structured code is performed:
reading the code file in a private code repository;
and inputting the code file into a test case generating component to generate a test case file of the code file.
6. The training method of a test case generation model according to claim 1, further comprising:
compiling and detecting test cases contained in each training sample in the training sample set, and deleting training samples corresponding to test cases which are not passed by compiling and detecting from the training sample set;
and/or,
and carrying out case specification detection on the test cases contained in each training sample, and deleting the training samples corresponding to the test cases which are failed in the case specification detection from the training sample set.
7. The training method of a test case generation model according to claim 1, further comprising:
analyzing the context codes contained in each training sample in the training sample set to obtain an execution link of the context codes contained in each training sample;
screening candidate training samples with a plurality of execution links in each training sample;
and detecting whether the test cases contained in each candidate training sample include a test case for every execution link, and if not, deleting that candidate training sample from the training sample set.
8. The training method of the test case generation model according to claim 1, wherein training the pre-training model according to the training sample set to obtain the test case generation model comprises:
inputting training samples in the training sample set into the pre-training model to generate test cases to obtain output test cases;
and determining training loss based on the output test cases and test cases contained in training samples in the training sample set, and performing parameter adjustment on the pre-training model based on the training loss.
9. The training method of the test case generation model according to claim 1, wherein the training the pre-training model according to the training sample set includes:
inputting each test sample contained in a test sample set into the pre-training model to perform test case generation to obtain test cases of each test sample;
clustering the test cases of each test sample to obtain at least one test case cluster, and determining a negative cluster in the at least one test case cluster;
deleting the training samples corresponding to the negative clusters in the training sample set, so as to train the intermediate model according to the deleted training sample set to obtain the test case generation model.
10. A training apparatus for a test case generation model, comprising:
the code file analysis module is configured to analyze the code file to obtain a structured code, and extract the context code of the unit to be tested from the structured code;
the test case analysis module is configured to analyze the test case file of the code file and extract the test case with a mapping relation with the context code from the analysis result;
the training sample assembling module is configured to assemble the context code and the test case to obtain a training sample;
and the training sample storage module is configured to store the training samples into a training sample set so as to train the pre-training model according to the training sample set to obtain a test case generation model.
11. A training device for a test case generation model, comprising:
a processor; and a memory configured to store computer-executable instructions that, when executed, cause the processor to:
analyzing the code file to obtain a structured code, and extracting the context code of the unit to be tested from the structured code;
analyzing the test case file of the code file, and extracting the test case with a mapping relation with the context code from the analysis result;
assembling the context code and the test case to obtain a training sample;
and storing the training samples into a training sample set so as to train the pre-training model according to the training sample set to obtain a test case generation model.
12. A storage medium storing computer-executable instructions that when executed by a processor implement the following:
analyzing the code file to obtain a structured code, and extracting the context code of the unit to be tested from the structured code;
analyzing the test case file of the code file, and extracting the test case with a mapping relation with the context code from the analysis result;
assembling the context code and the test case to obtain a training sample;
and storing the training samples into a training sample set so as to train the pre-training model according to the training sample set to obtain a test case generation model.
CN202311700286.XA 2023-12-11 2023-12-11 Training method and device for test case generation model Pending CN117707948A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311700286.XA CN117707948A (en) 2023-12-11 2023-12-11 Training method and device for test case generation model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311700286.XA CN117707948A (en) 2023-12-11 2023-12-11 Training method and device for test case generation model

Publications (1)

Publication Number Publication Date
CN117707948A true CN117707948A (en) 2024-03-15

Family

ID=90158212

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311700286.XA Pending CN117707948A (en) 2023-12-11 2023-12-11 Training method and device for test case generation model

Country Status (1)

Country Link
CN (1) CN117707948A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118012781A (en) * 2024-04-08 2024-05-10 腾讯科技(深圳)有限公司 Model training method, related method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN107808098B (en) Model safety detection method and device and electronic equipment
CN113221555B (en) Keyword recognition method, device and equipment based on multitasking model
CN111144126A (en) Training method of semantic analysis model, semantic analysis method and device
CN117707948A (en) Training method and device for test case generation model
CN112417093B (en) Model training method and device
CN110569428A (en) Recommendation model construction method, device and equipment
CN117216271A (en) Article text processing method, device and equipment
CN116186330B (en) Video deduplication method and device based on multi-mode learning
CN116185617A (en) Task processing method and device
CN115292196A (en) User interface testing method and device, electronic equipment and readable storage medium
CN115391015A (en) Batch processing method and device based on test framework, electronic equipment and medium
CN111242195B (en) Model, insurance wind control model training method and device and electronic equipment
CN112287130A (en) Searching method, device and equipment for graphic questions
CN113821437B (en) Page test method, device, equipment and medium
CN117035695B (en) Information early warning method and device, readable storage medium and electronic equipment
CN117992600B (en) Service execution method and device, storage medium and electronic equipment
CN111325195B (en) Text recognition method and device and electronic equipment
CN117369783B (en) Training method and device for security code generation model
CN115599891B (en) Method, device and equipment for determining abnormal dialogue data and readable storage medium
CN116933087A (en) Training method and device for intention detection model
CN110674495B (en) Detection method, device and equipment for group border crossing access
CN117421214A (en) Batch counting method, device, electronic equipment and computer readable storage medium
CN115033485A (en) Big data automatic testing method and device, electronic equipment and storage medium
CN117828360A (en) Model training method, model training device, model code generating device, storage medium and storage medium
CN116824580A (en) Image processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination