CN111444076B - Recommendation method and device for test case steps based on machine learning model - Google Patents

Info

Publication number: CN111444076B (application CN201811647595.4A)
Authority: CN (China)
Prior art keywords: test case, function, data, machine learning model
Legal status: Active (the listed status is an assumption, not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN111444076A
Inventors: 杨本芊, 李珂
Original and current assignee: 3600 Technology Group Co., Ltd.
Priority to: CN201811647595.4A
Publication of application: CN111444076A
Publication of grant: CN111444076B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/36 Preventing errors by testing or debugging software
    • G06F 11/3668 Software testing
    • G06F 11/3672 Test management
    • G06F 11/3684 Test management for test design, e.g. generating new test cases
    • G06F 11/3688 Test management for test execution, e.g. scheduling of test suites


Abstract

The invention provides a method and device for recommending test case steps based on a machine learning model. In this scheme, preprocessed historical test case data are formatted in units of n consecutive case steps to obtain pieces of specified format data; in each piece, the first n-1 steps serve as training data and the nth step as the label for training the constructed machine learning model, and the trained model is then used to recommend test case steps. This combines artificial intelligence (AI) with testing: when a test case is written, candidate case steps are automatically recommended for the tester to choose from, with no need to pick functions from the function library manually, which effectively improves case-writing efficiency, raises testing efficiency, and reduces testing cost. In particular, when a long short-term memory (LSTM) network model is used for the recommendation, model precision improves significantly, and the recommendation accuracy for test case steps can reach 83%.

Description

Recommendation method and device for test case steps based on machine learning model
Technical Field
The invention relates to the technical field of software testing, and in particular to a machine-learning-model-based recommendation method and apparatus for test case steps, a computer storage medium, and a computing device.
Background
Software testing is a key link in software quality assurance and an indispensable part of the software life cycle. During testing, many test cases must be written according to the characteristics of the software, and managing these test cases effectively helps testers improve testing efficiency and reduce testing cost.
In the prior art, test cases are written on a test case management platform by creating a new case under a project and selecting functions from a function library, according to the purpose of the case, as the individual steps of the case. This requires picking out the desired function from a library containing thousands of functions. Although the library is divided by project and the functions are sorted, each selection still takes several seconds, so completing one test case by manually choosing its steps may take 2-3 minutes, which is inefficient.
A technique is therefore needed that, when test cases are being written, automatically recommends several candidate case steps for the tester to choose from, without manual selection from the function library, so as to improve case-writing efficiency.
Disclosure of Invention
In view of the foregoing, the present invention has been developed to provide a machine-learning-model-based recommendation method for test case steps, together with a corresponding apparatus, computer storage medium, and computing device, which overcome or at least partially solve the above problems.
According to one aspect of the embodiments of the present invention, there is provided a machine-learning-model-based recommendation method for test case steps, including:
preprocessing historical test case data to obtain preprocessed test case data, wherein each piece of preprocessed test case data comprises the sequentially arranged identifiers of all steps of a test case, and each step corresponds to a function in a function library of a test case management platform;
for each piece of the preprocessed test case data, taking a step in that piece as a starting point and combining it with the n-1 steps that immediately follow it to form one piece of specified format data, thereby obtaining a plurality of pieces of specified format data, wherein each piece of specified format data comprises n steps arranged in execution order, the first n-1 steps serve as training data, the nth step serves as the label, and n is an integer not less than 3;
constructing a machine learning model, and training the machine learning model with the training data and the labels;
inputting the n-1 steps of a test case that immediately precede the step to be recommended into the trained machine learning model to obtain the recommended test case step.
Optionally, the machine learning model includes an N-gram model, a continuous bag-of-words (CBOW) model, or a long short-term memory (LSTM) network model.
Optionally, when the machine learning model is an LSTM model, constructing the machine learning model includes:
constructing the LSTM model with a Keras Sequential model;
wherein the LSTM model comprises a word-vector embedding layer, a bidirectional LSTM layer, and a fully connected layer;
the fully connected layer uses a softmax activation function;
and the LSTM model uses categorical cross-entropy as its loss function.
Optionally, preprocessing the historical test case data includes:
sorting the functions in the function library by their original function identifiers;
and mapping the sorted functions onto a contiguous space, so that each function's mapped identifier serves as the identifier of the step corresponding to that function.
Optionally, before sorting the functions in the function library by their original identifiers, the method further includes:
deduplicating the functions in the function library by functionality, and renumbering the steps in the historical test case data that correspond to the deduplicated functions.
Optionally, after obtaining the plurality of pieces of specified format data, the method further comprises:
one-hot encoding all labels in the plurality of pieces of specified format data.
Optionally, before training the machine learning model with the training data and the labels, the method further comprises:
dividing the plurality of specified format data into a training set and a testing set according to a specified proportion;
training the machine learning model using the training data and the labels, comprising:
and training the machine learning model by using training data and labels in the training set.
Optionally, after training the machine learning model using the training data and the labels, the method further comprises:
and determining the recommendation accuracy of the trained machine learning model using the test set.
Optionally, inputting the n-1 steps of a test case that immediately precede the step to be recommended into the trained machine learning model to obtain the recommended test case step includes:
inputting the n-1 steps of the test case that immediately precede the step to be recommended into the trained machine learning model;
obtaining, from the input n-1 preceding steps, the probability of each function in the function library appearing as the recommended step;
and selecting functions from the function library as recommended test case steps according to the probabilities.
Optionally, selecting functions from the function library as recommended test case steps according to the probabilities includes:
sorting the functions in the function library by probability;
and selecting a specified number of top-ranked functions as the recommended test case steps.
Optionally, selecting functions from the function library as recommended test case steps according to the probabilities includes:
comparing the probability of each function in the function library with a preset probability threshold;
and selecting the functions whose probability is not lower than the preset probability threshold as the recommended test case steps.
According to another aspect of the embodiments of the present invention, there is also provided a machine-learning-model-based recommendation apparatus for test case steps, including:
a data preprocessing module adapted to preprocess historical test case data to obtain preprocessed test case data, wherein each piece of preprocessed test case data comprises the sequentially arranged identifiers of all steps of a test case, and each step corresponds to a function in a function library of the test case management platform;
a training data generation module adapted to, for each piece of the preprocessed test case data, take a step in that piece as a starting point and combine it with the n-1 steps that immediately follow it to form one piece of specified format data, thereby obtaining a plurality of pieces of specified format data, wherein each piece of specified format data comprises n steps arranged in execution order, the first n-1 steps serve as training data, the nth step serves as the label, and n is an integer not less than 3;
a recommendation model training module adapted to construct a machine learning model and train the machine learning model with the training data and the labels; and
a case step recommendation module adapted to input the n-1 steps of a test case that immediately precede the step to be recommended into the trained machine learning model to obtain the recommended test case step.
Optionally, the machine learning model includes an N-gram model, a continuous bag-of-words (CBOW) model, or a long short-term memory (LSTM) network model.
Optionally, when the machine learning model is an LSTM model, the recommendation model training module is further adapted to:
constructing the LSTM model with a Keras Sequential model;
wherein the LSTM model comprises a word-vector embedding layer, a bidirectional LSTM layer, and a fully connected layer;
the fully connected layer uses a softmax activation function;
and the LSTM model uses categorical cross-entropy as its loss function.
Optionally, the data preprocessing module is further adapted to:
sorting the functions in the function library by their original function identifiers;
and mapping the sorted functions onto a contiguous space, so that each function's mapped identifier serves as the identifier of the step corresponding to that function.
Optionally, the data preprocessing module is further adapted to:
before sorting the functions in the function library by their original identifiers, deduplicating the functions in the function library by functionality, and renumbering the steps in the historical test case data that correspond to the deduplicated functions.
Optionally, the training data generation module is further adapted to:
after the plurality of pieces of specified format data are obtained, one-hot encoding all labels in the plurality of pieces of specified format data.
Optionally, the recommendation model training module is further adapted to:
dividing the plurality of specified format data into a training set and a testing set according to a specified proportion;
and training the machine learning model by using training data and labels in the training set.
Optionally, the recommendation model training module is further adapted to:
after training the machine learning model with the training data and labels, determining the recommendation accuracy of the trained machine learning model using the test set.
Optionally, the use case step recommendation module is further adapted to:
inputting the n-1 steps of the test case that immediately precede the step to be recommended into the trained machine learning model;
obtaining, from the input n-1 preceding steps, the probability of each function in the function library appearing as the recommended step;
and selecting functions from the function library as recommended test case steps according to the probabilities.
Optionally, the use case step recommendation module is further adapted to:
sorting the functions in the function library by probability;
and selecting a specified number of top-ranked functions as the recommended test case steps.
Optionally, the use case step recommendation module is further adapted to:
comparing the probability of each function in the function library with a preset probability threshold;
and selecting the functions whose probability is not lower than the preset probability threshold as the recommended test case steps.
According to yet another aspect of the embodiments of the present invention, there is also provided a computer storage medium storing computer program code which, when run on a computing device, causes the computing device to perform the machine-learning-model-based recommendation method for test case steps according to any of the above.
According to yet another aspect of an embodiment of the present invention, there is also provided a computing device including:
a processor; and
a memory storing computer program code;
the computer program code, when executed by the processor, causing the computing device to perform the machine-learning-model-based recommendation method for test case steps according to any of the above.
According to the machine-learning-model-based recommendation method and apparatus for test case steps, the preprocessed historical test case data are formatted in units of n consecutive case steps to obtain pieces of specified format data; in each piece, the first n-1 steps serve as training data and the nth step as the label for training the constructed machine learning model, and the trained model is then used to recommend test case steps. This combines artificial intelligence (AI) with testing: when a test case is written, candidate case steps are automatically recommended for the tester to choose from, with no need to pick functions from the function library, which effectively improves case-writing efficiency, further raises testing efficiency, and reduces testing cost.
Furthermore, recommending test case steps with a long short-term memory (LSTM) network model can significantly improve model precision. In tests, the recommendation accuracy for case steps with the LSTM model reached 83%.
The foregoing is only an overview of the technical solutions of the present invention. So that the technical means of the invention may be understood more clearly and implemented in accordance with the content of the specification, and so that the above and other objects, features, and advantages of the invention become more readily apparent, specific embodiments of the invention are set forth below.
The above and additional objects, advantages, and features of the present invention will become apparent to those skilled in the art from the following detailed description of specific embodiments of the invention, read in conjunction with the accompanying drawings.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a flow chart of a recommended method of test case steps based on a machine learning model according to an embodiment of the invention;
FIG. 2 is a flow chart of a recommended method of test case steps based on a machine learning model according to another embodiment of the invention; and
FIG. 3 is a schematic diagram of a recommendation device for testing case steps based on a machine learning model according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The inventors have found that in the prior art, writing a test case requires the tester to select a function from a function library for each step of the case. The existing way to write test cases on a test case management platform is to create a new case under a project and, according to the purpose of the case, select functions from the function library as its steps through operations such as select-all and import. This manual selection of case steps is inefficient and increases labor cost.
To solve these technical problems, an embodiment of the invention provides a machine-learning-model-based recommendation method for test case steps. FIG. 1 shows a flow chart of such a method according to an embodiment of the invention. Referring to FIG. 1, the method may include at least the following steps S202 to S208.
Step S202: historical test case data are preprocessed to obtain preprocessed test case data, wherein each piece of preprocessed test case data comprises the sequentially arranged identifiers of all steps of a test case, and each step corresponds to a function in a function library of a test case management platform.
Step S204: for each piece of the preprocessed test case data, a step in that piece is taken as a starting point and combined with the n-1 steps that immediately follow it to form one piece of specified format data, thereby obtaining a plurality of pieces of specified format data, wherein each piece comprises n steps arranged in execution order, the first n-1 steps serve as training data, the nth step serves as the label, and n is an integer not less than 3.
Step S206: a machine learning model is constructed and trained with the training data and labels.
Step S208: the n-1 steps of a test case that immediately precede the step to be recommended are input into the trained machine learning model to obtain the recommended test case step.
According to this machine-learning-model-based recommendation method for test case steps, the preprocessed historical test case data are formatted in units of n consecutive case steps to obtain pieces of specified format data; in each piece, the first n-1 steps serve as training data and the nth step as the label for training the constructed machine learning model, and the trained model is then used to recommend test case steps. This combines artificial intelligence (AI) with testing: when a test case is written, candidate case steps are automatically recommended for the tester to choose from, with no need to pick functions from the function library manually, which effectively improves case-writing efficiency, further raises testing efficiency, and reduces testing cost.
In step S202 above, the historical test case data are preprocessed to facilitate the subsequent data formatting.
In an alternative embodiment of the invention, the historical test case data may be preprocessed as follows:
All functions in the function library of the test case management platform are first sorted by their original identifiers (e.g., func_id). All sorted functions are then mapped onto a contiguous space, and each function's mapped identifier (id) is used as the identifier of the corresponding step in each piece of preprocessed test case data.
Through this sorting and mapping, functions whose func_ids are discontinuous in the original library obtain contiguous mapped ids, and each step of a test case can be uniquely represented by the mapped id of its function.
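A minimal sketch of this sort-and-map preprocessing follows; the function names (`remap_function_ids`, `encode_case_steps`) are illustrative assumptions, not taken from the original implementation:

```python
# Illustrative sketch of the sort-and-map preprocessing described above.
# All names here are assumptions; the original implementation is not shown.

def remap_function_ids(original_ids):
    """Sort the library's original func_ids and map them onto a
    contiguous 0..N-1 id space."""
    return {orig: new for new, orig in enumerate(sorted(set(original_ids)))}

def encode_case_steps(steps, id_map):
    """Rewrite one test case's steps using the contiguous mapped ids."""
    return [id_map[step] for step in steps]

# Sparse original func_ids become a contiguous range after mapping:
id_map = remap_function_ids([35936, 52046, 3770, 69847])
print(id_map)                                           # {3770: 0, 35936: 1, 52046: 2, 69847: 3}
print(encode_case_steps([3770, 35936, 69847], id_map))  # [0, 1, 3]
```

With this mapping in place, every step of every case is a small dense integer, which is what the later one-hot encoding and embedding layer rely on.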
In practical applications, the preprocessed historical test case data of the test case management platform can be obtained from a MySQL database. Several pieces of the obtained preprocessed test case data are listed below as an example:
12015:[35936,52046,3770,69847,63913,78438,3770,88668,3770,49182]
12014:[35936,52046,3770,69847,63913,78438,3770,88668,3770,49189]
22489:[52046,82568,51742,69847,35918,35918,2647,46454,83066,78616]
12013:[35936,52046,3770,69847,63913,78438,3770,88668,3770,88685,3770,82450]
As shown above, in each piece of preprocessed test case data, the value to the left of the colon is the unique identifier (id) of the test case, and the values to the right of the colon are all of its steps, each step uniquely represented by the mapped id of a function in the function library.
The functions in the function library of the test case management platform are stored by project, and some functions overlap in functionality.
To make the functions unique, and thereby simplify the preprocessing and recommendation operations, in an alternative embodiment of the invention the functions in the function library may additionally be deduplicated by functionality before being sorted by their original identifiers, and the steps in the historical test case data that correspond to the deduplicated functions may be renumbered.
In step S204, each piece of preprocessed historical test case data is formatted in units of n consecutive case steps to obtain a plurality of pieces of specified format data. This operation rests on the assumption that a step of a test case is related to the n-1 steps that precede it. Balancing model recommendation accuracy against time complexity, the value of n is preferably in the range of 4-6.
In practical applications, each piece of preprocessed historical test case data can be formatted in whatever manner is appropriate to obtain the pieces of specified format data.
Taking the first piece of preprocessed test case data listed above as an example, and assuming n=4, each of the steps from the first through the fourth-from-last may be selected in turn, from front to back, as a starting point and combined with the 3 steps that follow it to form one piece of specified format data, yielding the following 7 pieces:
[35936,52046,3770,69847],
[52046,3770,69847,63913],
[3770,69847,63913,78438],
[69847,63913,78438,3770],
[63913,78438,3770,88668],
[78438,3770,88668,3770],
[3770,88668,3770,49182].
Alternatively, starting points may be selected from front to back with 3 steps skipped between them, each combined with the 3 steps that follow it, yielding the following 2 pieces of specified format data:
[35936,52046,3770,69847],
[63913,78438,3770,88668].
Or starting points may be selected from front to back with 1 step skipped between them, each combined with the 3 steps that follow it, yielding the following 4 pieces of specified format data:
[35936,52046,3770,69847],
[3770,69847,63913,78438],
[63913,78438,3770,88668],
[3770,88668,3770,49182].
Starting points may also be selected from back to front, beginning at the fourth-from-last step, with 3 steps skipped between them, each combined with the 3 steps that follow it, yielding the following 2 pieces of specified format data:
[3770,88668,3770,49182],
[3770,69847,63913,78438].
It should be noted that the above ways of formatting the preprocessed historical test case data are merely exemplary; the present invention is not limited thereto.
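The sliding-window formatting illustrated above can be sketched as follows (a hedged example: `make_windows` and its `stride` parameter are assumed names; stride values 1, 2, and 4 reproduce the 7-, 4-, and 2-window listings respectively):

```python
def make_windows(steps, n=4, stride=1):
    """Slide a window of n consecutive steps over one preprocessed test
    case; stride controls how far apart the starting points are."""
    return [steps[i:i + n] for i in range(0, len(steps) - n + 1, stride)]

# First piece of preprocessed test case data from the example above:
case = [35936, 52046, 3770, 69847, 63913, 78438, 3770, 88668, 3770, 49182]
windows = make_windows(case, n=4, stride=1)  # the 7-window listing
train_data = [w[:-1] for w in windows]       # first n-1 steps per window
labels = [w[-1] for w in windows]            # nth step per window
print(len(windows))                          # 7
print(train_data[0], labels[0])              # [35936, 52046, 3770] 69847
```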
In an alternative embodiment of the present invention, after step S204 is performed to obtain the plurality of pieces of specified format data, a step of one-hot encoding all labels in those pieces may further be included.
One-hot encoding, also known as one-bit-effective encoding, uses an N-bit state register to encode N states; each state has its own register bit, and only one bit is valid at any time. One-hot encoding can therefore handle discrete, non-continuous numerical features.
Because the label values in the embodiment of the invention are sparsely and randomly distributed in space, one-hot encoding them makes the labels suitable for the machine learning algorithm and avoids degrading model performance. Moreover, sorting the functions and mapping them onto a contiguous space avoids the dimensionality explosion that one-hot encoding the labels might otherwise cause, improving data-processing efficiency.
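A minimal, dependency-free sketch of the one-hot encoding applied to the labels (equivalent in effect to the `np_utils.to_categorical` call used in the Keras snippet later in this document; the function name `one_hot` is an assumption):

```python
def one_hot(labels, num_classes):
    """Encode each dense integer label as a vector with a single 1 at
    the label's index and 0 everywhere else."""
    return [[1.0 if j == label else 0.0 for j in range(num_classes)]
            for label in labels]

# Three labels over a toy 5-function library:
encoded = one_hot([0, 3, 1], num_classes=5)
print(encoded[1])  # [0.0, 0.0, 0.0, 1.0, 0.0]
```

Note that the vector length equals the library size, which is why the earlier remapping to a contiguous id space matters: without it, the vectors would have to span the sparse original func_id range.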
In step S206 above, a machine learning model is constructed and trained with the training data and labels obtained from the formatting.
Since a test case can be regarded as a time-ordered sequence, classical language models from natural language processing can be employed to build the machine learning model for case step recommendation. The machine learning model mentioned here may include an N-gram model, a CBOW (Continuous Bag-of-Words) model, an LSTM (Long Short-Term Memory) network model, or the like.
The N-gram model is a probabilistic language model based on an (n-1)-order Markov chain; it infers sentence structure from the probability of n words occurring together and is widely used in statistics-based natural language processing and sequence analysis. The CBOW model predicts the current word from its context. The LSTM model is a recurrent neural network suited to processing and predicting significant events separated by relatively long intervals and delays in a time series.
When the machine learning model is an LSTM model, it may be constructed with Keras's Sequential model. Keras is a high-level neural network API (Application Programming Interface) written in pure Python that is highly modular, minimalist, fast, and extensible. The constructed LSTM model mainly comprises a word-vector embedding layer, a bidirectional LSTM layer, and a fully connected layer; the fully connected layer uses a softmax activation function, and the loss function chosen for the model is categorical cross-entropy.
An example code snippet that builds the LSTM model is listed below (the commented imports show the Keras 2.x API it relies on):
# import numpy as np
# from keras.models import Sequential
# from keras.layers import Embedding, Bidirectional, LSTM, Dense
# from keras.utils import np_utils
train_data = np.reshape(self.train_data, (len(self.train_data), self.windowsize))
train_label = np_utils.to_categorical(self.train_label, num_classes=len(self.func_index_dict))
model = Sequential()
model.add(Embedding(input_dim=len(self.func_index_dict), output_dim=32, input_length=self.windowsize))
model.add(Bidirectional(LSTM(16)))
model.add(Dense(len(self.func_index_dict), activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train_data, train_label, epochs=self.epoch, batch_size=1000, verbose=2)
model.save(self.model_path)
In an alternative embodiment of the present invention, before the constructed machine learning model is trained with the training data and labels in the pieces of specified format data, a step of dividing those pieces into a training set and a test set by a specified ratio may be included; the constructed model is then trained with the training data and labels in the training set. The specified ratio may be chosen according to the volume of training data and the required model accuracy, for example in the range 6:1 to 9:1.
Preferably, when the pieces of specified format data are obtained by sequentially selecting starting steps within each piece of preprocessed historical test case data, the order of the resulting pieces can be shuffled before dividing them into the training set and the test set, so that the split is more uniform and reasonable.
Further, the test set may be used to determine the recommendation accuracy of the trained machine learning model, in order to tune and optimize the model for case step recommendation.
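The shuffle-then-split procedure can be sketched as follows (a hedged illustration: `split_dataset` is an assumed name, and the 0.875 default corresponds to a 7:1 split, one point within the 6:1 to 9:1 range mentioned above):

```python
import random

def split_dataset(samples, train_ratio=0.875, seed=0):
    """Shuffle the specified-format samples to break up per-case ordering,
    then split them into a training set and a test set by ratio
    (0.875 corresponds to a 7:1 train/test split)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

data = [[i, i + 1, i + 2, i + 3] for i in range(16)]  # 16 toy windows
train_set, test_set = split_dataset(data)
print(len(train_set), len(test_set))  # 14 2
```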
In step S208, the trained machine learning model is used to recommend the step of the test case.
In an alternative embodiment of the present invention, step S208 may be further implemented as follows:
First, the n-1 steps of the test case that immediately precede the step to be recommended are input into the trained machine learning model.
Then, from these n-1 preceding steps, the probability of each function in the function library appearing as the recommended step is obtained; specifically, each function's probability can be found by calling the machine learning model's predict function.
Finally, functions are selected from the function library as recommended test case steps according to the probabilities thus obtained.
Further, selecting functions according to probability may be done in either of the following two ways.
Mode one
First, all functions in the function library are sorted by the probability with which they appear as a recommended step. Then, a specified number of the highest-ranked functions are selected as recommended test case steps. For example, the top 5 functions (i.e., the functions whose probabilities rank in the top 5) may be selected as recommended steps of the test case.
Mode two
The probability of each function in the function library appearing as the recommended step is compared with a preset probability threshold, and the functions whose probability is not lower than the threshold are selected as recommended test case steps. The preset probability threshold mentioned here may be set according to the actual requirements of the application.
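The two selection modes described above can be sketched as follows; the function names and the example threshold value are illustrative assumptions, not taken from the patent:

```python
def select_top_k(probs, k=5):
    """Mode one: indices of the k functions with the highest probability."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]

def select_by_threshold(probs, threshold=0.1):
    """Mode two: indices of all functions whose probability is not lower
    than the preset threshold (0.1 is an assumed example value)."""
    return [i for i, p in enumerate(probs) if p >= threshold]
```

Mode one always yields a fixed number of candidates; mode two yields a variable number that depends on how confident the model is.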
An example of implementation code that performs use case step recommendation on the above test set data, so as to determine the recommendation accuracy of the trained machine learning model, is listed below:
As shown in the code, the first n-1 steps of each piece of data in the test set are input as test data into the trained machine learning model, the predict function is called to obtain the probability of each function in the function library appearing as a recommended step, the functions are sorted by probability, and the top 4 most probable functions are selected as recommended steps.
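A minimal pure-Python sketch of such a top-k evaluation (the function names and the stub predict function below are illustrative assumptions; in the patent's setting `predict_fn` would be the predict function of the trained model) might look like:

```python
def top_k_accuracy(predict_fn, test_data, test_labels, k=4):
    """Fraction of test samples whose true next step appears among the k
    functions the model ranks most probable (k=4, as in the text)."""
    hits = 0
    for x, label in zip(test_data, test_labels):
        probs = predict_fn([x])[0]  # one probability per function in the library
        order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        if label in order[:k]:
            hits += 1
    return hits / len(test_data)
```

Counting a recommendation as correct when the true next step is anywhere in the top k matches how the recommendations are actually presented to the tester.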
Comparing the modeling time complexity and model accuracy of the N-gram, CBOW, and LSTM models: the time complexity of the N-gram model grows exponentially as the number of dependent steps (i.e., the value of n) increases; the CBOW model has the lowest time complexity, with each iteration taking a short time (about 1 s); the LSTM model has moderate time complexity, taking about 5 s per iteration when accelerated with a graphics processing unit (GPU). In terms of model accuracy, the LSTM model is significantly better than the N-gram and CBOW models; in testing, the recommendation accuracy of the LSTM model reached 83%.
Having described various implementations of the individual links of the embodiment shown in fig. 2, the implementation process of the machine-learning-model-based recommendation method for test case steps of the present invention is described in detail below through a specific embodiment.
FIG. 2 is a flow chart of a method for recommending test case steps based on a machine learning model according to an embodiment of the invention. Referring to fig. 2, the method may include the following steps S302 to S316.
Step S302, the functions in the function library of the test case management platform are de-duplicated, and the steps in the historical test case data that correspond to the removed duplicate functions are renumbered.
Step S304, sorting the functions in the function library according to the original identifiers of the functions, and mapping the sorted functions to a continuous space.
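Steps S302 and S304 can be sketched together as follows. The function name `build_func_index` is an illustrative assumption, and for simplicity this sketch de-duplicates by identifier, whereas the patent de-duplicates by the functions themselves:

```python
def build_func_index(original_ids):
    """De-duplicate function ids, sort them by original identifier, and map
    them onto a contiguous 0..k-1 index space (the "continuous space")."""
    unique_sorted = sorted(set(original_ids))
    func_index = {fid: i for i, fid in enumerate(unique_sorted)}   # original id -> index
    index_func = {i: fid for fid, i in func_index.items()}         # index -> original id
    return func_index, index_func
```

Mapping onto a contiguous index space is what lets the softmax output layer have exactly one unit per function in the library.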
Step S306, all historical test case data of the test case management platform are acquired from a MySQL database, wherein each piece of historical test case data comprises all the sequentially arranged steps of a test case, and each step corresponds to a function in the function library and is identified by the id to which that function is mapped.
Step S308, for each piece of historical test case data, each step in the piece of data is taken in turn as a starting point, and that step together with the n-1 consecutive steps after it forms one piece of specified format data, yielding a plurality of pieces of specified format data; each piece of specified format data comprises n steps arranged in execution order, the first n-1 steps serve as training data, the nth step serves as the label, and n is an integer not less than 3.
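The windowing of step S308 can be sketched as follows; the function name `make_windows` is a hypothetical choice:

```python
def make_windows(case_steps, n=3):
    """Slide a window of n consecutive steps over one preprocessed case;
    the first n-1 steps are the training input, the n-th is the label."""
    samples = []
    for start in range(len(case_steps) - n + 1):
        window = case_steps[start:start + n]
        samples.append((window[:-1], window[-1]))
    return samples
```

Each case of length L thus contributes L-n+1 training samples, one per starting point.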
Step S310, one-hot encoding is performed on all labels in the plurality of pieces of specified format data.
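One-hot encoding as in step S310 can be sketched as follows; in practice Keras's `to_categorical` utility could be used, and this plain-Python version is purely illustrative:

```python
def one_hot(labels, num_classes):
    """Encode integer step labels as one-hot vectors, one position per
    function in the mapped 0..num_classes-1 index space."""
    return [[1.0 if i == label else 0.0 for i in range(num_classes)]
            for label in labels]
```

The encoded labels match the shape of the softmax output layer, which is what the categorical cross-entropy loss expects.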
Step S312, an LSTM model is constructed using the Keras Sequential model and trained with the training data and labels.
In this embodiment, the constructed LSTM model includes a word vector embedding layer, a bidirectional LSTM layer, and a full connection layer. The fully connected layer employs a softmax activation function. The LSTM model uses a multi-class cross entropy function as the loss function.
Step S314, the n-1 steps of the test case immediately preceding the step to be recommended are input into the trained LSTM model, and the predict function is called to obtain the probability of each function in the function library appearing as a recommended step.
Step S316, the functions in the function library are ranked by probability, and a specified number of the highest-ranked functions are selected as recommended test case steps.
The embodiment of the invention can accurately recommend test case steps to testers and provides effective assistance in writing test cases.
Based on the same inventive concept, the embodiment of the invention also provides a recommendation device for the test case step based on the machine learning model, which is used for supporting the recommendation method for the test case step based on the machine learning model provided by any one embodiment or combination thereof. FIG. 3 is a schematic diagram of a recommendation device for testing case steps based on a machine learning model according to an embodiment of the present invention. Referring to fig. 3, the apparatus may include at least: a data preprocessing module 410, a training data generation module 420, a recommendation model training module 430, and a use case step recommendation module 440.
The functions of the components of the machine-learning-model-based recommendation device for test case steps according to the embodiment of the invention, and the connections between them, are described below:
the data preprocessing module 410 is adapted to preprocess the historical test case data to obtain preprocessed test case data, wherein each piece of preprocessed test case data includes identifiers of all steps of a sequential arrangement of test cases, and each step corresponds to a function in the function library of the test case management platform.
The training data generating module 420 is connected to the data preprocessing module 410 and is adapted to obtain a plurality of pieces of specified format data by, for each piece of preprocessed test case data, taking a step in the piece of data as a starting point and forming specified format data from that step and the n-1 consecutive steps after it, wherein each piece of specified format data comprises n steps arranged in execution order, the first n-1 steps serve as training data, the nth step serves as the label, and n is an integer not less than 3.
The recommendation model training module 430, coupled to the training data generation module 420, is adapted to construct a machine learning model and train the machine learning model using the training data and the labels.
The case step recommending module 440 is connected to the recommendation model training module 430 and is adapted to input the n-1 steps of the test case immediately preceding the step to be recommended into the trained machine learning model to obtain the recommended step of the test case.
In an alternative embodiment of the present invention, the machine learning model mentioned above may comprise an N-gram model, a continuous bag-of-words (CBOW) model, or a long short-term memory (LSTM) network model.
In an alternative embodiment of the present invention, when the machine learning model is an LSTM model, the recommendation model training module 430 is further adapted to:
constructing an LSTM model by using a keras sequential model;
the LSTM model comprises a word vector embedding layer, a bidirectional LSTM layer and a full connection layer;
the full connection layer adopts a softmax activation function;
the LSTM model uses a multi-class cross entropy function as the loss function.
In an alternative embodiment of the invention, the data preprocessing module 410 is further adapted to:
sequencing the functions in the function library according to the original identifiers of the functions;
the ordered functions are mapped to a continuous space, so that the identification mapped by each function is used as the identification of the step corresponding to the function.
In an alternative embodiment of the invention, the data preprocessing module 410 is further adapted to:
before the functions in the function library are sorted by their original identifiers, the functions in the function library are de-duplicated, and the steps in the historical test case data that correspond to the removed duplicate functions are renumbered.
In an alternative embodiment of the invention, the training data generation module 420 is further adapted to:
after the plurality of pieces of specified format data are obtained, one-hot encoding is performed on all labels in the plurality of pieces of specified format data.
In an alternative embodiment of the invention, the recommendation model training module 430 is further adapted to:
dividing a plurality of specified format data into a training set and a testing set according to a specified proportion;
training the machine learning model by using training data and labels in the training set.
In an alternative embodiment of the invention, the recommendation model training module 430 is further adapted to:
after training the machine learning model with training data and labels in the training set, determining a recommendation accuracy of the trained machine learning model with the testing set.
In an alternative embodiment of the invention, the use case step recommendation module 440 is further adapted to:
inputting the n-1 steps of the test case immediately preceding the step to be recommended into the trained machine learning model;
according to the input adjacent n-1 steps before the step to be recommended, obtaining the probability that each function in the function library appears as a recommended step;
and selecting the function in the function library as the recommended test case according to the probability.
Further, in an alternative embodiment, the use case step recommendation module 440 is further adapted to:
sequencing the functions in the function library according to the probability;
and selecting the designated number of functions ranked in front as recommended test cases.
In another alternative embodiment, the use case step recommendation module 440 is further adapted to:
comparing the probability of the function in the function library with a preset probability threshold;
and selecting a function with probability not lower than a preset probability threshold as the recommended test case.
Based on the same inventive concept, the embodiment of the invention also provides a computer storage medium. The computer storage medium stores computer program code which, when run on a computing device, causes the computing device to perform a recommended method of machine learning model based test case steps according to any one or a combination of the above embodiments.
Based on the same inventive concept, the embodiment of the invention also provides a computing device. The computing device may include:
a processor; and
a memory storing computer program code;
the computer program code, when executed by a processor, causes the computing device to perform the recommended method of machine learning model based test case steps according to any one or a combination of the embodiments described above.
According to any one of the optional embodiments or the combination of multiple optional embodiments, the following beneficial effects can be achieved according to the embodiment of the invention:
According to the recommendation method and device for test case steps based on a machine learning model provided by the embodiments of the invention, the preprocessed historical test case data are formatted in units of n consecutive case steps to obtain specified format data; the first n-1 steps in each piece of specified format data serve as training data and the nth step serves as the label for training the constructed machine learning model, and the trained model is then used to recommend test case steps. This combines artificial intelligence (AI) with testing: while writing a test case, case steps are automatically recommended for the tester to choose from, without the tester having to search the function library, which effectively improves the efficiency of writing test cases, thereby improving test efficiency and reducing test cost.
Furthermore, recommending test case steps with a long short-term memory (LSTM) network model can significantly improve model accuracy; in testing, the recommendation accuracy of case steps using the LSTM model reached 83%.
It will be clear to those skilled in the art that the specific working procedures of the above-described systems, devices and units may refer to the corresponding procedures in the foregoing method embodiments, and are not repeated herein for brevity.
In addition, each functional unit in the embodiments of the present invention may be physically independent, two or more functional units may be integrated together, or all functional units may be integrated in one processing unit. The integrated functional units may be implemented in hardware or in software or firmware.
Those of ordinary skill in the art will appreciate that: the integrated functional units, if implemented in software and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in essence or in whole or in part in the form of a software product stored in a storage medium, comprising instructions for causing a computing device (e.g., a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present invention when the instructions are executed. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a read-only memory (ROM), a random-access memory (RAM), a magnetic disk, or an optical disk, etc.
Alternatively, all or part of the steps of implementing the foregoing method embodiments may be implemented by hardware (such as a personal computer, a server, or a computing device such as a network device) associated with program instructions, where the program instructions may be stored on a computer-readable storage medium, and where the program instructions, when executed by a processor of the computing device, perform all or part of the steps of the method according to the embodiments of the present invention.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all technical features thereof can be replaced by others within the spirit and principle of the present invention; such modifications and substitutions do not depart from the scope of the invention.

Claims (20)

1. A recommendation method of test case steps based on a machine learning model comprises the following steps:
preprocessing historical test case data to obtain preprocessed test case data, wherein each piece of preprocessed test case data comprises identifiers of all steps of a test case in sequence arrangement, and each step corresponds to a function in a function library of a test case management platform;
For each piece of data in the preprocessed test case data, taking a step in the piece of data as a starting point, and taking n-1 steps in succession after the step and the mode of forming specified format data by the step to obtain a plurality of specified format data, wherein each specified format data comprises n steps arranged according to an execution sequence, the first n-1 steps are taken as training data, the nth step is taken as a label, and n is an integer not less than 3;
constructing a machine learning model, and training the machine learning model by utilizing the training data and the label;
inputting n-1 adjacent steps of the test case of the step to be recommended, which are adjacent to the step to be recommended, into the trained machine learning model to obtain the probability that each function in the function library appears as a recommended step;
selecting a function in the function library as a recommended test case according to the probability;
the preprocessing of the historical test case data comprises the following steps:
sequencing the functions in the function library according to the original function identifiers;
mapping the ordered functions to a continuous space, so that the mapped identification of each function is used as the identification of the step corresponding to the function.
2. The method of claim 1, wherein the machine learning model comprises an N-gram model, a continuous bag-of-words (CBOW) model, or a long short-term memory (LSTM) network model.
3. The method of claim 2, wherein when the machine learning model is an LSTM model, constructing a machine learning model comprises:
constructing an LSTM model by using a keras sequential model;
the LSTM model comprises a word vector embedding layer, a bidirectional LSTM layer and a full connection layer;
the full connection layer adopts a softmax activation function;
the LSTM model employs a multi-class cross entropy function as a loss function.
4. A method according to claim 3, wherein before ordering the functions in the function library according to their original identities, further comprising:
and performing duplicate removal on the functions in the function library according to the functions, and renumbering steps corresponding to the duplicate-removed functions in the historical test case data.
5. The method of claim 4, wherein after obtaining the plurality of specified format data, further comprising:
and performing one-hot encoding on all labels in the plurality of specified format data.
6. The method of claim 5, wherein prior to training the machine learning model with the training data and the labels, further comprising:
Dividing the plurality of specified format data into a training set and a testing set according to a specified proportion;
training the machine learning model using the training data and the labels, comprising:
and training the machine learning model by using training data and labels in the training set.
7. The method of claim 6, wherein after training the machine learning model with the training data and the labels, further comprising:
and determining the recommended accuracy of the trained machine learning model by using the test set.
8. The method of claim 7, wherein selecting the function in the function library as the recommended test case according to the probability comprises:
sorting the functions in the function library according to the probability;
and selecting the designated number of functions ranked in front as recommended test cases.
9. The method of claim 8, wherein selecting the function in the function library as the recommended test case according to the probability comprises:
comparing the probability of the function in the function library with a preset probability threshold;
And selecting a function with probability not lower than the preset probability threshold as the recommended test case.
10. A recommendation device for test case steps based on a machine learning model, comprising:
the data preprocessing module is suitable for preprocessing the historical test case data to obtain preprocessed test case data, wherein each piece of preprocessed test case data comprises identifiers of all steps of a test case in sequence arrangement, and each step corresponds to a function in a function library of the test case management platform;
the training data generation module is suitable for obtaining a plurality of pieces of specified format data by taking a step in the preprocessed test case data as a starting point and taking n-1 steps in succession after the step and a mode of forming specified format data by the step, wherein each piece of specified format data comprises n steps arranged according to an execution sequence, the first n-1 steps are used as training data, the nth step is used as a label, and n is an integer not less than 3;
the recommendation model training module is suitable for constructing a machine learning model and training the machine learning model by utilizing the training data and the label; and
The case step recommending module is suitable for inputting n-1 adjacent steps of the test case of the step to be recommended, which are adjacent to the step to be recommended, into the trained machine learning model to obtain the probability that each function in the function library appears as a recommended step; selecting a function in the function library as a recommended test case according to the probability;
the data preprocessing module is further adapted to sort the functions in the function library according to the original function identifiers; and map the sorted functions to a continuous space, so that the mapped identifier of each function serves as the identifier of the step corresponding to that function.
11. The apparatus of claim 10, wherein the machine learning model comprises an N-gram model, a continuous bag-of-words (CBOW) model, or a long short-term memory (LSTM) network model.
12. The apparatus of claim 11, wherein when the machine learning model is an LSTM model, the recommendation model training module is further adapted to:
constructing an LSTM model by using a keras sequential model;
the LSTM model comprises a word vector embedding layer, a bidirectional LSTM layer and a full connection layer;
the full connection layer adopts a softmax activation function;
The LSTM model employs a multi-class cross entropy function as a loss function.
13. The apparatus of claim 12, wherein the data preprocessing module is further adapted to:
before sorting the functions in the function library according to the original function identifiers, performing de-duplication on the functions in the function library according to the functions, and renumbering steps corresponding to the de-duplicated functions in the historical test case data.
14. The apparatus of claim 13, wherein the training data generation module is further adapted to:
after the plurality of specified format data are obtained, one-hot encoding is performed on all labels in the plurality of specified format data.
15. The apparatus of claim 14, wherein the recommendation model training module is further adapted to:
dividing the plurality of specified format data into a training set and a testing set according to a specified proportion;
and training the machine learning model by using training data and labels in the training set.
16. The apparatus of claim 15, wherein the recommendation model training module is further adapted to:
after training the machine learning model using the training data and the labels, determining a recommended accuracy of the trained machine learning model using the test set.
17. The apparatus of claim 16, wherein the use case step recommendation module is further adapted to:
sorting the functions in the function library according to the probability;
and selecting the designated number of functions ranked in front as recommended test cases.
18. The apparatus of claim 17, wherein the use case step recommendation module is further adapted to:
comparing the probability of the function in the function library with a preset probability threshold;
and selecting a function with probability not lower than the preset probability threshold as the recommended test case.
19. A computer storage medium storing computer program code which, when run on a computing device, causes the computing device to perform the recommended method of machine learning model based test case steps of any of claims 1-9.
20. A computing device, comprising:
a processor; and
a memory storing computer program code;
the computer program code, when executed by the processor, causes the computing device to perform the recommended method of test case steps based on a machine learning model according to any one of claims 1-9.
CN201811647595.4A 2018-12-29 2018-12-29 Recommendation method and device for test case steps based on machine learning model Active CN111444076B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811647595.4A CN111444076B (en) 2018-12-29 2018-12-29 Recommendation method and device for test case steps based on machine learning model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811647595.4A CN111444076B (en) 2018-12-29 2018-12-29 Recommendation method and device for test case steps based on machine learning model

Publications (2)

Publication Number Publication Date
CN111444076A CN111444076A (en) 2020-07-24
CN111444076B true CN111444076B (en) 2024-04-05

Family

ID=71626593

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811647595.4A Active CN111444076B (en) 2018-12-29 2018-12-29 Recommendation method and device for test case steps based on machine learning model

Country Status (1)

Country Link
CN (1) CN111444076B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11954019B2 (en) 2022-02-04 2024-04-09 Optum, Inc. Machine learning techniques for automated software testing configuration management
CN116303088A (en) * 2023-04-17 2023-06-23 南京航空航天大学 Test case ordering method based on deep neural network cross entropy loss
CN117093501B (en) * 2023-09-25 2024-03-12 哈尔滨航天恒星数据系统科技有限公司 Test case recommendation method based on pre-training model, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105893256A (en) * 2016-03-30 2016-08-24 西北工业大学 Software failure positioning method based on machine learning algorithm
CN108153658A (en) * 2016-12-02 2018-06-12 富士通株式会社 The method and apparatus of models of priority training method and determining priorities of test cases
CN108228469A (en) * 2018-02-23 2018-06-29 科大讯飞股份有限公司 test case selection method and device
CN108304324A (en) * 2018-01-22 2018-07-20 百度在线网络技术(北京)有限公司 Method for generating test case, device, equipment and storage medium
CN108491326A (en) * 2018-03-21 2018-09-04 重庆金融资产交易所有限责任公司 Behavioral test recombination method, device and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7496549B2 (en) * 2005-05-26 2009-02-24 Yahoo! Inc. Matching pursuit approach to sparse Gaussian process regression
US10838848B2 (en) * 2017-06-01 2020-11-17 Royal Bank Of Canada System and method for test generation

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105893256A (en) * 2016-03-30 2016-08-24 西北工业大学 Software failure positioning method based on machine learning algorithm
CN108153658A (en) * 2016-12-02 2018-06-12 富士通株式会社 The method and apparatus of models of priority training method and determining priorities of test cases
CN108304324A (en) * 2018-01-22 2018-07-20 百度在线网络技术(北京)有限公司 Method for generating test case, device, equipment and storage medium
CN108228469A (en) * 2018-02-23 2018-06-29 科大讯飞股份有限公司 test case selection method and device
CN108491326A (en) * 2018-03-21 2018-09-04 重庆金融资产交易所有限责任公司 Behavioral test recombination method, device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A systematic review of software fault prediction studies; Cagatay Catal et al.; Expert Systems with Applications; Vol. 36, No. 4; 7346-7354 *
Research on the Application of Artificial Intelligence in Automatic Test Case Generation; Zhang Bo; China Masters' Theses Full-text Database, Information Science and Technology; No. 11; I138-91 *

Also Published As

Publication number Publication date
CN111444076A (en) 2020-07-24

Similar Documents

Publication Publication Date Title
CN111444076B (en) Recommendation method and device for test case steps based on machine learning model
Mahabadi et al. Perfect: Prompt-free and efficient few-shot learning with language models
KR102315984B1 (en) Event prediction device, prediction model generator and event prediction program
Boehmke Data wrangling with R
US8180715B2 (en) Systems and methods for collaborative filtering using collaborative inductive transfer
CN115392237B (en) Emotion analysis model training method, device, equipment and storage medium
CN112667775A (en) Keyword prompt-based retrieval method and device, electronic equipment and storage medium
CN112906361A (en) Text data labeling method and device, electronic equipment and storage medium
CN111045670B (en) Method and device for identifying multiplexing relationship between binary code and source code
CN110837730B (en) Method and device for determining unknown entity vocabulary
Ens et al. Quantifying musical style: Ranking symbolic music based on similarity to a style
CN113159630B (en) Method for maintaining calculation formula in laboratory information management system
Amorim et al. A new word embedding approach to evaluate potential fixes for automated program repair
Buchholz conversive hidden non-Markovian models
JPWO2007132564A1 (en) Data processing apparatus and method
WO2014144779A1 (en) Systems and methods for abductive learning of quantized stochastic processes
Sakkas et al. Seq2Parse: neurosymbolic parse error repair
CN114757154B (en) Job generation method, device and equipment based on deep learning and storage medium
WO2022230226A1 (en) A meta-learning data augmentation framework
CN115495085A (en) Generation method and device based on deep learning fine-grained code template
JP2023117513A (en) Learning program, learning method, and information processing apparatus
CN103326731B (en) A kind of Hidden Markov correlated source coded method encoded based on distributed arithmetic
Jacobsen et al. Optimal size-performance tradeoffs: Weighing PoS tagger models
Da et al. IMPROVING WEBPAGE ACCESS PREDICTIONS BASED ON SEQUENCE PREDICTION AND PAGERANK ALGORITHM.
Clark et al. Perceptron training for a wide-coverage lexicalized-grammar parser

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
TA01 Transfer of patent application right

Effective date of registration: 20240311

Address after: Room 03, 2nd Floor, Building A, No. 20 Haitai Avenue, Huayuan Industrial Zone (Huanwai), Binhai New Area, Tianjin, 300450

Applicant after: 3600 Technology Group Co.,Ltd.

Country or region after: China

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Applicant before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Country or region before: China

GR01 Patent grant
GR01 Patent grant