CN111625468A - Test case duplicate removal method and device

Info

Publication number: CN111625468A
Application number: CN202010505902.6A
Authority: CN
Inventors: 李刘强
Original assignee: Bank of China Ltd
Current assignee: Bank of China Ltd
Priority date: 2020-06-05
Filing date: 2020-06-05
Publication date: 2020-09-04
Anticipated expiration: 2040-06-05
Also published as: CN111625468B

Abstract

The method comprises: extracting the characteristic values of each test case; determining, for each test case pair, the similarity of the characteristic values of the two test cases in the pair; taking each test case pair whose similarity is larger than a preset similarity threshold as a test case pair to be processed; and carrying out duplicate removal on the test cases in the plurality of test case pairs to be processed, so that the test cases are deduplicated automatically. Because the characteristic values of all the test cases are extracted and the similarity is determined for every test case pair, all the test cases are processed, no test case is omitted, and the accuracy of the duplicate removal is improved.

Description

Test case duplicate removal method and device

Technical Field

The present application relates to the field of testing technologies, and in particular, to a test case deduplication method and device.

Background

Testing a large business system may require a large number of test cases. To produce that many test cases, different testers are needed to write them. The test cases written by different testers may duplicate one another, which wastes test resources.

Therefore, test cases that duplicate one another need to be deduplicated. At present, test cases are generally deduplicated manually, which is both inaccurate and inefficient.

Disclosure of Invention

In order to solve the above technical problem, embodiments of the present application provide a test case deduplication method and device, so as to improve the efficiency and accuracy of test case deduplication. The technical solution is as follows:

a test case deduplication method, comprising:

extracting characteristic values of each test case;

respectively determining the similarity of the characteristic values of two test cases in each test case pair, wherein the test case pairs are obtained by selecting any two test cases from a plurality of test cases;

taking the test case pair to which the characteristic value with the similarity larger than a preset similarity threshold belongs as a test case pair to be processed;

and carrying out duplicate removal on the test cases in the plurality of test case pairs to be processed.

Preferably, the extracting the feature values of the test cases includes:

extracting characteristic values of each test case;

performing word segmentation on the characteristic value of each test case to obtain at least one keyword;

the determining the similarity of the characteristic values of the two test cases in each test case pair respectively comprises the following steps:

and determining the similarity of the characteristic values of the two test cases in each test case pair based on the keywords obtained by segmenting the characteristic values of the test cases respectively.

Preferably, the determining the similarity between the feature values of the two test cases in each test case pair based on the keywords obtained by segmenting the feature values of each test case respectively includes:

respectively determining similar keyword pairs in keywords of two test cases in each test case pair, wherein the similar keyword pairs consist of first keywords and second keywords, the text similarity of the first keywords and the second keywords is greater than a set text similarity threshold, the first keywords are keywords of one test case in the two test cases, and the second keywords are keywords of the other test case in the two test cases;

and counting the number of the similar keyword pairs, and determining the similarity of the characteristic values of the two test cases based on the number of the similar keyword pairs.

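Preferably, the determining the similarity between the feature values of the two test cases in each test case pair based on the keywords obtained by segmenting the feature values of each test case respectively includes:

respectively counting the occurrence frequency of each first keyword in each test case pair in a first test case and the occurrence frequency of each second keyword in the test case pair in a second test case, wherein the first test case and the second test case form the test case pair, the first keyword is the keyword of the first test case, and the second keyword is the keyword of the second test case;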

and determining the similarity of the characteristic values of the two test cases respectively based on the keywords of the two test cases in each test case pair, the occurrence frequency of each first keyword in each test case pair in the first test case, and the occurrence frequency of each second keyword in each test case pair in the second test case.

Preferably, the determining the similarity of the feature values of the two test cases based on the keywords of the two test cases in each test case pair, the occurrence frequency of each first keyword in each test case pair in the first test case, and the occurrence frequency of each second keyword in the test case pair in the second test case respectively includes:

and determining the similarity of the characteristic values of the two test cases by using a cosine similarity algorithm based on the keywords of the two test cases in each test case pair, the occurrence frequency of each first keyword in each test case pair in the first test case and the occurrence frequency of each second keyword in each test case pair in the second test case.

A test case deduplication apparatus, comprising:

the extraction module is used for extracting the characteristic value of each test case;

the first determining module is used for respectively determining the similarity of the characteristic values of two test cases in each test case pair, wherein the test case pair is obtained by selecting any two test cases from a plurality of test cases;

the second determination module is used for taking the test case pair to which the characteristic value with the similarity larger than the preset similarity threshold belongs as a test case pair to be processed;

and the duplication removing module is used for carrying out duplication removal on the test cases in the plurality of test case pairs to be processed.

Preferably, the extraction module is specifically configured to:

extracting characteristic values of each test case;

the first determining module is specifically configured to:

Preferably, the first determining module is specifically configured to:

Compared with the prior art, the beneficial effect of this application is:

In the present application, the characteristic values of each test case are extracted, the similarity of the characteristic values of the two test cases in each test case pair is determined, each test case pair whose similarity is larger than a preset similarity threshold is taken as a test case pair to be processed, and the test cases in the plurality of test case pairs to be processed are subjected to duplicate removal, so that the test cases are deduplicated automatically. Because the characteristic values of all the test cases are extracted and the similarity is determined for every test case pair, all the test cases are processed, no test case is omitted, and the accuracy of the duplicate removal is improved.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without inventive labor.

FIG. 1 is a flowchart of an embodiment 1 of a test case deduplication method provided in the present application;

FIG. 2 is a flowchart of an embodiment 2 of a test case deduplication method provided by the present application;

FIG. 3 is a flowchart of an embodiment 3 of a test case deduplication method provided in the present application;

FIG. 4 is a flowchart of an embodiment 4 of a test case deduplication method provided in the present application;

FIG. 5 is a schematic structural diagram of a test case deduplication device provided in the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The embodiment of the application discloses a test case duplicate removal method, which comprises the following steps: extracting characteristic values of each test case; respectively determining the similarity of the characteristic values of two test cases in each test case pair, wherein the test case pairs are obtained by selecting any two test cases from a plurality of test cases; taking the test case pair to which the characteristic value with the similarity larger than a preset similarity threshold belongs as a test case pair to be processed; and carrying out duplicate removal on the test cases in the plurality of test case pairs to be processed. In this way, both the efficiency and the accuracy of duplicate removal can be improved.

A test case deduplication method disclosed in the embodiments of the present application is described next. FIG. 1 is a flowchart of embodiment 1 of the test case deduplication method provided in the present application, which may include the following steps:

Step S11, extracting the characteristic values of the test cases.

The characteristic values of the test case may include, but are not limited to: any one or more of characteristic values representing the function module to which the test case belongs, characteristic values describing the test case, characteristic values representing operation steps of the test case, and characteristic values representing expected results of execution of the test case.

And step S12, respectively determining the similarity of the characteristic values of two test cases in each test case pair, wherein the test case pair is obtained by selecting any two test cases from the plurality of test cases.

After the characteristic values of the test cases are extracted, pairwise combination can be performed on the test cases to obtain a plurality of test case pairs, and the similarity of the characteristic values of the two test cases in each test case pair is respectively determined. Specifically, the similarity of the feature values of the two test cases in each test case pair can be determined by using a cosine similarity algorithm.

The similarity of the characteristic values of the two test cases in each test case pair can be used as the similarity of the two test cases in each test case pair.
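The pairwise combination described in step S12 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the names `pairwise_similarity` and `token_overlap`, the test case identifiers, and the sample text are all assumptions, and the token-overlap score is only a placeholder for whatever similarity measure is actually chosen (for example cosine similarity).

```python
from itertools import combinations
from typing import Callable, Dict, Tuple


def pairwise_similarity(
    features: Dict[str, str],
    similarity: Callable[[str, str], float],
) -> Dict[Tuple[str, str], float]:
    """Score every unordered pair of test cases exactly once."""
    return {
        (id_a, id_b): similarity(features[id_a], features[id_b])
        for id_a, id_b in combinations(sorted(features), 2)
    }


def token_overlap(a: str, b: str) -> float:
    """Placeholder similarity (assumed): ratio of shared to total distinct tokens."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


# Hypothetical characteristic values keyed by test case id.
features = {
    "TC1": "login with valid password",
    "TC2": "login with valid password",
    "TC3": "transfer funds between accounts",
}
scores = pairwise_similarity(features, token_overlap)
```

For n test cases this produces n(n-1)/2 pairs, so every test case is compared with every other one.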

And step S13, taking the test case pair to which the characteristic value with the similarity larger than the preset similarity threshold belongs as the test case pair to be processed.

The preset similarity threshold may be set as needed, and is not limited in this embodiment.
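Step S13 amounts to a strict threshold filter over the scored pairs. The sketch below assumes the scores are held in a dictionary keyed by pairs of test case ids; the function name, the value 0.8, and the sample scores are illustrative assumptions, since the source leaves the threshold open.

```python
from typing import Dict, List, Tuple

SIMILARITY_THRESHOLD = 0.8  # assumed value; the source does not fix one


def select_pairs_to_process(
    scores: Dict[Tuple[str, str], float],
    threshold: float = SIMILARITY_THRESHOLD,
) -> List[Tuple[str, str]]:
    """Keep only pairs whose similarity is strictly larger than the threshold."""
    return [pair for pair, score in scores.items() if score > threshold]


# Hypothetical similarity scores for three test case pairs.
scores = {("TC1", "TC2"): 0.95, ("TC1", "TC3"): 0.80, ("TC2", "TC3"): 0.10}
pending = select_pairs_to_process(scores)
```

Note that a pair scoring exactly the threshold is not selected, because the text requires the similarity to be larger than the threshold.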

And step S14, carrying out duplicate removal on the test cases in the plurality of test case pairs to be processed.

The deduplication of the test cases in the plurality of test case pairs to be processed may include:

S141, removing one of the two test cases in each test case pair to be processed, to obtain a first test case set;

S142, if a first test case subset exists in the first test case set, selecting one test case from the first test case subset to retain, wherein the first test case subset comprises at least 2 test cases, and the at least 2 test cases are the same.
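One possible reading of S141 and S142 is sketched below: the test case retained from each pair goes into the first test case set, and a test case that was retained from several pairs is then kept only once. The function name `deduplicate`, the choice to drop the second member of each pair, and the sample ids are assumptions made for illustration.

```python
from typing import List, Tuple


def deduplicate(pairs: List[Tuple[str, str]]) -> List[str]:
    """Return the test cases retained from the pairs to be processed."""
    # S141: remove one test case from each pair (here, arbitrarily, the second)
    # and collect the retained ones into the first test case set.
    first_set = [kept for kept, _removed in pairs]
    # S142: the same test case may have been retained from several pairs;
    # keep a single copy while preserving the original order.
    return list(dict.fromkeys(first_set))


# Hypothetical pairs to be processed.
pairs = [("TC1", "TC2"), ("TC1", "TC4"), ("TC5", "TC6")]
retained = deduplicate(pairs)
```

In this sketch a test case retained from one pair is not checked against test cases removed from other pairs, so chains of similar test cases may need additional handling in practice.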

As another alternative embodiment of the present application, FIG. 2 is a schematic flowchart of embodiment 2 of the test case deduplication method provided by the present application. This embodiment mainly relates to a refinement of the test case deduplication method described in embodiment 1. As shown in FIG. 2, the method may include, but is not limited to, the following steps:

Step S21, extracting the characteristic values of the test cases.

In this embodiment, the characteristic values of the test case may include, but are not limited to: any one or more of characteristic values representing the function module to which the test case belongs, characteristic values describing the test case, characteristic values representing operation steps of the test case, and characteristic values representing expected results of execution of the test case.

And step S22, performing word segmentation on the characteristic value of each test case to obtain at least one keyword.

Steps S21-S22 are a specific implementation of step S11 in embodiment 1.

In this embodiment, the characteristic value of each test case may be segmented using a Python word segmentation library to obtain at least one keyword.
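The segmentation step can be illustrated with a small stand-in tokenizer. The source does not name the segmentation library, so the sketch below uses only the standard library and splits on non-word characters; for Chinese text a dedicated segmenter (for example `jieba.lcut` from the third-party jieba package) could be substituted, which is an assumption rather than something the source states. The function name `segment` and the sample sentence are hypothetical.

```python
import re
from typing import List


def segment(feature_value: str) -> List[str]:
    """Stand-in word segmentation: lowercase, then split on non-word characters."""
    return [token for token in re.split(r"\W+", feature_value.lower()) if token]


# Hypothetical characteristic value describing the operation steps of a test case.
keywords = segment("Log in with a valid password, then check the balance.")
```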

And step S23, determining the similarity of the characteristic values of the two test cases in each test case pair based on the keywords obtained by segmenting the characteristic values of the test cases respectively.

Step S23 is a specific implementation of step S12 in embodiment 1.

And step S24, taking the test case pair to which the characteristic value with the similarity larger than the preset similarity threshold belongs as the test case pair to be processed.

And step S25, carrying out duplicate removal on the test cases in the plurality of test case pairs to be processed.

The detailed procedures of steps S24-S25 can be found in the related descriptions of steps S13-S14 in embodiment 1, and are not repeated herein.

In this embodiment, the keywords are obtained by segmenting the feature values, and the similarity of the feature values is determined based on the keywords, so that the complexity of determining the similarity of the feature values can be reduced, the efficiency of determining the similarity of the feature values is improved, and the deduplication efficiency is further improved.

As another alternative embodiment of the present application, FIG. 3 is a schematic flowchart of embodiment 3 of the test case deduplication method provided by the present application. This embodiment mainly relates to a refinement of the test case deduplication method described in embodiment 2. As shown in FIG. 3, the method may include, but is not limited to, the following steps:

Step S31, extracting the characteristic values of the test cases.

And step S32, performing word segmentation on the characteristic value of each test case to obtain at least one keyword.

The detailed procedures of steps S31-S32 can be referred to the related descriptions of steps S21-S22 in embodiment 2, and are not described herein again.

And step S33, respectively determining similar keyword pairs in the keywords of the two test cases in each test case pair.

In this embodiment, the similar keyword pair is composed of a first keyword and a second keyword, the text similarity between the first keyword and the second keyword is greater than a set text similarity threshold, the first keyword is a keyword of one of the two test cases, and the second keyword is a keyword of the other of the two test cases.

In this embodiment, the process of determining the similar keyword pair may include: and calculating the text similarity of the first keyword and the second keyword, judging whether the text similarity of the first keyword and the second keyword is greater than a set text similarity threshold, and if so, combining the first keyword and the second keyword into a similar keyword pair.
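The determination of similar keyword pairs can be sketched as a nested comparison of the two keyword lists. The source does not say how text similarity is computed; the sketch assumes the ratio from Python's `difflib.SequenceMatcher`, and the threshold of 0.8, the function name, and the sample keywords are likewise assumptions.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

TEXT_SIMILARITY_THRESHOLD = 0.8  # assumed value; the source leaves it open


def similar_keyword_pairs(
    first_keywords: List[str],
    second_keywords: List[str],
    threshold: float = TEXT_SIMILARITY_THRESHOLD,
) -> List[Tuple[str, str]]:
    """Pair each first keyword with every second keyword that is similar enough."""
    pairs = []
    for first in first_keywords:
        for second in second_keywords:
            # ratio() is in [0, 1]; 1.0 means the two strings are identical.
            if SequenceMatcher(None, first, second).ratio() > threshold:
                pairs.append((first, second))
    return pairs


# Hypothetical keywords of the two test cases in one pair.
pairs = similar_keyword_pairs(
    ["login", "password"],
    ["login", "passwords", "balance"],
)
```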

And step S34, counting the number of the similar keyword pairs, and determining the similarity of the characteristic values of the two test cases based on the number of the similar keyword pairs.

In this embodiment, a correspondence between the number of similar keyword pairs and the similarity of the characteristic values of the test cases may be set in advance. After the number of similar keyword pairs is counted, the similarity corresponding to that number is looked up in the correspondence, and the similarity found is used as the similarity of the characteristic values of the two test cases.

Steps S33-S34 are a specific implementation of step S23 in embodiment 2.
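Such a correspondence could be held as a small lookup table. The sketch below is illustrative only: the source does not give the table's contents, so the count boundaries, the similarity values, and the names `CORRESPONDENCE` and `similarity_from_pair_count` are all assumptions.

```python
from typing import List, Tuple

# Hypothetical correspondence, ordered from the highest count boundary down:
# (minimum number of similar keyword pairs, similarity of the characteristic values)
CORRESPONDENCE: List[Tuple[int, float]] = [
    (10, 0.9),
    (5, 0.7),
    (2, 0.4),
    (0, 0.0),
]


def similarity_from_pair_count(
    count: int,
    table: List[Tuple[int, float]] = CORRESPONDENCE,
) -> float:
    """Look up the similarity that corresponds to a number of similar keyword pairs."""
    for minimum, similarity in table:
        if count >= minimum:
            return similarity
    return 0.0


similarity = similarity_from_pair_count(6)
```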

Step S35, taking the test case pair to which the characteristic value with the similarity larger than the preset similarity threshold belongs as a test case pair to be processed;

and step S36, carrying out duplicate removal on the test cases in the plurality of test case pairs to be processed.

The detailed procedures of steps S35-S36 can be seen in steps S24-S25 of embodiment 2, and are not repeated herein.

As another alternative embodiment of the present application, FIG. 4 is a schematic flowchart of embodiment 4 of the test case deduplication method provided by the present application. This embodiment mainly relates to a refinement of the test case deduplication method described in embodiment 2. As shown in FIG. 4, the method may include, but is not limited to, the following steps:

Step S41, extracting the characteristic values of the test cases.

And step S42, performing word segmentation on the characteristic value of each test case to obtain at least one keyword.

And step S43, respectively counting the occurrence frequency of each first keyword in each test case pair in the first test case and the occurrence frequency of each second keyword in the test case pair in the second test case.

The first test case and the second test case form the test case pair, the first keywords are keywords of the first test case, and the second keywords are keywords of the second test case.

Step S44, determining similarity of characteristic values of two test cases respectively based on the keywords of the two test cases in each test case pair, the occurrence frequency of each first keyword in each test case pair in the first test case, and the occurrence frequency of each second keyword in the test case pair in the second test case.

In this embodiment, for each test case pair, the first keywords of the first test case and the number of times each first keyword appears in the first test case may be formed into a first vector, and the second keywords of the second test case and the number of times each second keyword appears in the second test case may be formed into a second vector. The similarity between the first vector and the second vector is then calculated and used as the similarity between the characteristic values of the first test case and the second test case.

In this embodiment, the similarity of the feature values of the two test cases may be determined by using a cosine similarity algorithm. Specifically, the similarity of the first vector and the second vector may be calculated using a cosine similarity algorithm.
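A minimal sketch of the keyword-frequency vectors and their cosine similarity follows. The source does not say how the two vectors are aligned; the sketch assumes both are indexed by the union of the two keyword sets, with a count of zero for a keyword that a test case does not contain. The function name and sample keywords are hypothetical.

```python
import math
from collections import Counter
from typing import List


def cosine_similarity(first_keywords: List[str], second_keywords: List[str]) -> float:
    """Cosine similarity of the keyword-frequency vectors of two test cases."""
    first_counts = Counter(first_keywords)
    second_counts = Counter(second_keywords)
    # Assumed alignment: one vector position per keyword in either test case.
    vocabulary = sorted(set(first_counts) | set(second_counts))
    first_vector = [first_counts[word] for word in vocabulary]
    second_vector = [second_counts[word] for word in vocabulary]
    dot = sum(a * b for a, b in zip(first_vector, second_vector))
    norm = math.sqrt(sum(a * a for a in first_vector)) * math.sqrt(
        sum(b * b for b in second_vector)
    )
    return dot / norm if norm else 0.0


# Hypothetical keyword lists of the two test cases in one pair.
similarity = cosine_similarity(
    ["login", "login", "password"],
    ["login", "password"],
)
```

Here the vectors are (2, 1) and (1, 1) over the vocabulary (login, password), so the result is 3 / sqrt(10), roughly 0.949.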

Steps S43-S44 are a specific implementation of step S23 in embodiment 2.

Step S45, taking the test case pair to which the characteristic value with the similarity larger than the preset similarity threshold belongs as a test case pair to be processed;

and step S46, carrying out duplicate removal on the test cases in the plurality of test case pairs to be processed.

The detailed procedures of steps S45-S46 can be seen in steps S24-S25 of embodiment 2, and are not repeated herein.

In this embodiment, the similarity of the feature values of the two test cases is determined based on the number of the similar keyword pairs and the number of times that the keywords in each similar keyword pair appear in the keywords of the two test cases, so that the accuracy of determining the similarity of the feature values can be improved.

The test case deduplication device provided in the present application is described below, and the test case deduplication device described below and the test case deduplication method described above may be referred to in correspondence with each other.

Referring to fig. 5, the test case deduplication apparatus includes: an extraction module 11, a first determination module 12, a second determination module 13 and a deduplication module 14.

The extraction module 11 is used for extracting the characteristic values of the test cases;

the first determining module 12 is configured to determine similarity between feature values of two test cases in each test case pair, where the test case pair is obtained by combining any two test cases selected from the plurality of test cases;

a second determining module 13, configured to use the test case pair to which the feature value with the similarity greater than the preset similarity threshold belongs as a test case pair to be processed;

and the deduplication module 14 is configured to deduplicate the test cases in the plurality of test case pairs to be processed.
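The division into four modules can be sketched as a single class whose steps mirror modules 11 to 14. This is an illustrative arrangement under stated assumptions, not the device of FIG. 5: the class name `CaseDeduplicator`, the pluggable `extract` and `similarity` callables, the exact-match similarity, and the rule of dropping the second member of each pair are all hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, List


class CaseDeduplicator:
    """Chains the four modules: extract, score pairs, select pairs, deduplicate."""

    def __init__(
        self,
        extract: Callable[[dict], str],
        similarity: Callable[[str, str], float],
        threshold: float,
    ) -> None:
        self.extract = extract
        self.similarity = similarity
        self.threshold = threshold

    def run(self, test_cases: Dict[str, dict]) -> List[str]:
        # Extraction module 11: one characteristic value per test case.
        features = {cid: self.extract(case) for cid, case in test_cases.items()}
        # First determining module 12: similarity of every test case pair.
        scores = {
            (a, b): self.similarity(features[a], features[b])
            for a, b in combinations(sorted(features), 2)
        }
        # Second determining module 13: pairs above the preset threshold.
        pending = [pair for pair, score in scores.items() if score > self.threshold]
        # Deduplication module 14 (assumed rule): drop the second case of each pair.
        removed = {second for _first, second in pending}
        return [cid for cid in sorted(test_cases) if cid not in removed]


# Hypothetical test cases and an exact-match similarity for demonstration.
test_cases = {
    "TC1": {"description": "login with valid password"},
    "TC2": {"description": "login with valid password"},
    "TC3": {"description": "transfer funds"},
}
deduplicator = CaseDeduplicator(
    extract=lambda case: case["description"],
    similarity=lambda a, b: 1.0 if a == b else 0.0,
    threshold=0.8,
)
survivors = deduplicator.run(test_cases)
```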

In this embodiment, the extracting module 11 may specifically be configured to:

extracting characteristic values of each test case;

accordingly, the first determining module 12 may be specifically configured to:

In this embodiment, the first determining module 12 may be specifically configured to:

It should be noted that the description of each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to each other. Since the device embodiment is basically similar to the method embodiment, its description is relatively brief, and for the relevant points, reference may be made to the corresponding description of the method embodiment.

Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

For convenience of description, the above device is described as being divided into various units by function, and the units are described separately. Of course, the functionality of the units may be implemented in one or more pieces of software and/or hardware when implementing the present application.

From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions of the present application may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments of the present application.

The test case deduplication method and device provided by the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present application, and the description of the above embodiments is only intended to help in understanding the method and the core idea of the present application. Meanwhile, those skilled in the art may, according to the idea of the present application, vary the specific implementation and the scope of application. In summary, the content of this specification should not be construed as a limitation to the present application.

Claims

1. A method for duplicate removal of test cases, comprising:

extracting characteristic values of each test case;

2. The method of claim 1, wherein the extracting feature values of the respective test cases comprises:

extracting characteristic values of each test case;

3. The method according to claim 2, wherein determining similarity of the feature values of the two test cases in each test case pair based on the keywords obtained by segmenting the feature values of each test case respectively comprises:

4. The method according to claim 2, wherein determining similarity of the feature values of the two test cases in each test case pair based on the keywords obtained by segmenting the feature values of each test case respectively comprises:

5. The method of claim 4, wherein determining similarity of feature values of the two test cases based on the keywords of the two test cases in each test case pair, the number of times that each first keyword in each test case pair appears in a first test case, and the number of times that each second keyword in the test case pair appears in a second test case respectively comprises:

6. A test case deduplication apparatus, comprising:

7. The apparatus according to claim 6, wherein the extraction module is specifically configured to:

extracting characteristic values of each test case;

the first determining module is specifically configured to:

8. The apparatus of claim 7, wherein the first determining module is specifically configured to:

9. The apparatus of claim 7, wherein the first determining module is specifically configured to:

10. The apparatus of claim 9, wherein the first determining module is specifically configured to: