CN111625468B

CN111625468B - Test case duplicate removal method and device

Info

Publication number: CN111625468B
Application number: CN202010505902.6A
Authority: CN
Inventors: 李刘强
Original assignee: Bank of China Ltd
Current assignee: Bank of China Ltd
Priority date: 2020-06-05
Filing date: 2020-06-05
Publication date: 2024-04-16
Anticipated expiration: 2040-06-05
Also published as: CN111625468A

Abstract

The application provides a test case deduplication method and device. The method extracts the feature values of each test case, determines the similarity of the feature values of the two test cases in each test case pair, takes each test case pair whose feature-value similarity is greater than a preset similarity threshold as a test case pair to be processed, and deduplicates the test cases in the plurality of test case pairs to be processed, thereby achieving automatic deduplication of test cases. Because the feature values of every test case are extracted and the similarity is determined for every test case pair, all test cases are processed, no test case is omitted, and the accuracy of deduplication is improved.

Description

Test case duplicate removal method and device

Technical Field

The present disclosure relates to the field of testing technologies, and in particular, to a test case deduplication method and apparatus.

Background

A large business system may require a large number of test cases. To produce enough test cases, different testers have to write them, and the test cases written by different testers may overlap, which wastes test resources.

Therefore, duplicated test cases need to be deduplicated. At present, test case deduplication is generally performed manually, which is both inaccurate and inefficient.

Disclosure of Invention

In order to solve the above technical problem, embodiments of the present application provide a test case deduplication method and apparatus that improve the efficiency and accuracy of test case deduplication. The technical solution is as follows:

a test case deduplication method, comprising:

extracting characteristic values of each test case;

respectively determining the similarity of the characteristic values of the two test cases in each test case pair, wherein each test case pair is formed by combining any two test cases selected from a plurality of test cases;

taking each test case pair whose characteristic-value similarity is greater than a preset similarity threshold as a test case pair to be processed;

and de-duplicating the test cases in the plurality of test case pairs to be processed.

Preferably, the extracting the feature value of each test case includes:

extracting characteristic values of each test case;

performing word segmentation on the characteristic values of each test case to obtain at least one keyword;

the determining the similarity of the feature values of the two test cases in each test case pair comprises the following steps:

and determining the similarity of the characteristic values of the two test cases in each test case pair based on the keywords obtained by segmenting the characteristic values of each test case.

Preferably, the determining the similarity of the feature values of the two test cases in each test case pair based on the keywords obtained by word segmentation of the feature values of each test case includes:

determining similar keyword pairs in keywords of two test cases in each test case pair respectively, wherein the similar keyword pairs consist of a first keyword and a second keyword, the text similarity of the first keyword and the second keyword is larger than a set text similarity threshold, the first keyword is a keyword of one of the two test cases, and the second keyword is a keyword of the other of the two test cases;

and counting the number of the similar keyword pairs, and determining the similarity of the characteristic values of the two test cases based on the number of the similar keyword pairs.

Preferably, the determining the similarity of the feature values of the two test cases in each test case pair based on the keywords obtained by word segmentation of the feature values of each test case includes:

respectively counting, for each test case pair, the number of times each first keyword appears in a first test case and the number of times each second keyword appears in a second test case, wherein the first test case and the second test case form the test case pair, the first keyword is a keyword of the first test case, and the second keyword is a keyword of the second test case;

and determining the similarity of the feature values of the two test cases based on the keywords of the two test cases in each test case pair, the frequency of occurrence of each first keyword in each test case pair in the first test case, and the frequency of occurrence of each second keyword in each test case pair in the second test case.

Preferably, the determining the similarity of the feature values of the two test cases based on the keywords of the two test cases in each test case pair, the number of times each first keyword in each test case pair appears in the first test case, and the number of times each second keyword in each test case pair appears in the second test case, includes:

and determining the similarity of the characteristic values of the two test cases by using a cosine similarity algorithm based on the keywords of the two test cases in each test case pair, the frequency of occurrence of each first keyword in each test case pair in the first test case, and the frequency of occurrence of each second keyword in each test case pair in the second test case.

A test case deduplication apparatus comprising:

the extraction module is used for extracting the characteristic values of each test case;

the first determining module is used for respectively determining the similarity of the characteristic values of the two test cases in each test case pair, wherein each test case pair is formed by combining any two test cases selected from a plurality of test cases;

the second determining module is used for taking each test case pair whose characteristic-value similarity is greater than a preset similarity threshold as a test case pair to be processed;

and the de-duplication module is used for de-duplicating the test cases in the plurality of test case pairs to be processed.

Preferably, the extraction module is specifically configured to:

extracting characteristic values of each test case;

the first determining module is specifically configured to:

Preferably, the first determining module is specifically configured to:

Compared with the prior art, the beneficial effects of this application are:

In the method, the feature values of each test case are extracted, the similarity of the feature values of the two test cases in each test case pair is determined, each test case pair whose feature-value similarity is greater than the preset similarity threshold is taken as a test case pair to be processed, and the test cases in the plurality of test case pairs to be processed are deduplicated, thereby achieving automatic deduplication of test cases. Because the feature values of every test case are extracted and the similarity is determined for every test case pair, all test cases are processed, no test case is omitted, and the accuracy of deduplication is improved.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort to a person skilled in the art.

Fig. 1 is a flow chart of embodiment 1 of the test case deduplication method provided herein;

Fig. 2 is a flow chart of embodiment 2 of the test case deduplication method provided herein;

Fig. 3 is a flow chart of embodiment 3 of the test case deduplication method provided herein;

Fig. 4 is a flow chart of embodiment 4 of the test case deduplication method provided herein;

Fig. 5 is a schematic structural diagram of the test case deduplication apparatus provided in the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.

The embodiment of the application discloses a test case deduplication method, which comprises the following steps: extracting characteristic values of each test case; respectively determining the similarity of the characteristic values of the two test cases in each test case pair, wherein each test case pair is formed by combining any two test cases selected from a plurality of test cases; taking each test case pair whose characteristic-value similarity is greater than a preset similarity threshold as a test case pair to be processed; and de-duplicating the test cases in the plurality of test case pairs to be processed. The application can thereby improve the efficiency and accuracy of deduplication.

Next, the test case deduplication method disclosed in the embodiments of the present application is described. Fig. 1 is a flow chart of embodiment 1 of the test case deduplication method provided herein, which may include the following steps:

Step S11, extracting characteristic values of each test case.

The characteristic values of the test cases may include, but are not limited to: any one or more of a feature value characterizing a functional module to which the test case belongs, a feature value describing the test case, a feature value characterizing an operational step of the test case, and a feature value characterizing an expected result of execution of the test case.
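As a non-limiting illustration of step S11, the sketch below treats a test case as a record with four text fields and joins the selected fields into one feature value. The record type `CaseRecord`, its field names, and the helper `extract_feature_value` are hypothetical names chosen for this example; the application does not prescribe how test cases are stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseRecord:
    """Hypothetical test case record; the field names are assumptions."""
    case_id: str
    module: str       # functional module the test case belongs to
    description: str  # description of the test case
    steps: str        # operation steps
    expected: str     # expected execution result


def extract_feature_value(case, fields=("module", "description", "steps", "expected")):
    """Join any one or more of the four fields into a single text feature value."""
    return " ".join(getattr(case, name) for name in fields)


case = CaseRecord(
    case_id="TC-1",
    module="login",
    description="valid login",
    steps="enter user and password",
    expected="home page shown",
)
feature_value = extract_feature_value(case)
```

Passing a shorter `fields` tuple, for example `("module",)`, restricts the feature value to a subset of the four kinds of information.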

Step S12, respectively determining the similarity of the characteristic values of the two test cases in each test case pair, wherein each test case pair is formed by combining any two test cases selected from a plurality of test cases.

After extracting the characteristic values of each test case, the plurality of test cases can be combined in pairs to obtain a plurality of test case pairs, and the similarity of the characteristic values of two test cases in each test case pair is respectively determined. Specifically, the similarity of the feature values of two test cases in each test case pair can be determined by using a cosine similarity algorithm.

The similarity of the feature values of the two test cases in each test case pair can be used as the similarity of the two test cases in each test case pair.

Step S13, taking each test case pair whose characteristic-value similarity is greater than a preset similarity threshold as a test case pair to be processed.

The preset similarity threshold may be set as needed, and is not limited in this embodiment.
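Steps S12 and S13 can be sketched as follows, as one possible reading rather than a definitive implementation. The function `pairs_to_process` is a hypothetical name; it accepts any similarity function. The `word_overlap` function is only a toy stand-in (shared words divided by all distinct words) so that the example runs on its own, and it is not the cosine similarity mentioned above.

```python
from itertools import combinations


def word_overlap(first_feature, second_feature):
    """Toy stand-in similarity: shared words divided by all distinct words."""
    first_words = set(first_feature.split())
    second_words = set(second_feature.split())
    if not first_words and not second_words:
        return 0.0
    return len(first_words & second_words) / len(first_words | second_words)


def pairs_to_process(features, similarity, threshold):
    """Combine test cases pairwise and keep pairs whose similarity exceeds the threshold.

    features maps a test case id to its feature value.
    """
    selected = []
    for (first_id, first_feature), (second_id, second_feature) in combinations(features.items(), 2):
        # Strictly greater than the preset threshold, as in step S13.
        if similarity(first_feature, second_feature) > threshold:
            selected.append((first_id, second_id))
    return selected


features = {
    "TC-1": "open login page and sign in",
    "TC-2": "open login page and sign in",
    "TC-3": "export monthly report",
}
candidate_pairs = pairs_to_process(features, word_overlap, threshold=0.8)
```

With n test cases this examines n(n-1)/2 pairs, which is what allows every test case to be compared with every other one.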

Step S14, de-duplicating the test cases in the plurality of test case pairs to be processed.

De-duplicating the test cases in the plurality of test case pairs to be processed may include:

S141, respectively removing one of the two test cases in each test case pair to be processed to obtain a first test case set;

S142, if a first test case subset exists in the first test case set, selecting one test case from the first test case subset to retain, wherein the first test case subset comprises at least 2 test cases, and the at least 2 test cases are the same.
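A minimal sketch of S141 and S142 is given below, assuming that each pair is a tuple of test case identifiers and that the first member of each pair is the one retained. Which member is removed is an assumption of this example; the application does not specify it. The function name `deduplicate_pairs` is hypothetical.

```python
def deduplicate_pairs(pairs_to_process):
    """Apply S141 then S142 to a list of (first_id, second_id) pairs."""
    # S141: remove one test case of each pair (here the second) and collect
    # the retained ones into the first test case set.
    first_set = [kept for kept, _removed in pairs_to_process]
    # S142: the same test case may have been retained from several pairs;
    # keep a single copy of each, preserving the original order.
    return list(dict.fromkeys(first_set))


retained = deduplicate_pairs([("TC-1", "TC-2"), ("TC-1", "TC-5"), ("TC-7", "TC-8")])
```

Here `TC-1` is retained from two pairs, so S142 collapses the two identical entries into one. Test cases that appear in no pair to be processed are outside the scope of this sketch.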

As another alternative embodiment of the present application, fig. 2 is a flow chart of embodiment 2 of the test case deduplication method provided herein. This embodiment mainly refines the test case deduplication method described in embodiment 1 above. As shown in fig. 2, the method may include, but is not limited to, the following steps:

Step S21, extracting characteristic values of each test case.

In this embodiment, the feature values of the test cases may include, but are not limited to: any one or more of a feature value characterizing a functional module to which the test case belongs, a feature value describing the test case, a feature value characterizing an operational step of the test case, and a feature value characterizing an expected result of execution of the test case.

Step S22, segmenting the characteristic values of each test case to obtain at least one keyword.

Steps S21-S22 are a specific implementation of step S11 in embodiment 1.

In this embodiment, the feature value of each test case may be segmented using a Python word segmentation library to obtain at least one keyword.
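The application does not name the word segmentation library. As a stand-in that runs with the standard library only, the sketch below splits a feature value into lowercase word tokens with a regular expression; the function name `segment` is hypothetical. For text without spaces between words, such as Chinese, a dedicated segmentation library would be needed in place of this regular expression.

```python
import re


def segment(feature_value):
    """Split a feature value into lowercase keywords (regex stand-in for a segmentation library)."""
    return re.findall(r"\w+", feature_value.lower())


keywords = segment("Open the Login page, then log in.")
```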

Step S23, determining the similarity of the characteristic values of the two test cases in each test case pair based on the keywords obtained by segmenting the characteristic values of each test case.

Step S23 is a specific implementation of step S12 in embodiment 1.

Step S24, taking each test case pair whose characteristic-value similarity is greater than a preset similarity threshold as a test case pair to be processed.

Step S25, de-duplicating the test cases in the plurality of test case pairs to be processed.

For the detailed procedure of steps S24-S25, refer to the description of steps S13-S14 in embodiment 1; it is not repeated here.

In this embodiment, the keyword is obtained by word segmentation of the feature value, and the similarity of the feature value is determined based on the keyword, so that the complexity of determining the similarity of the feature value can be reduced, the efficiency of determining the similarity of the feature value is improved, and further the duplicate removal efficiency is improved.

As another alternative embodiment of the present application, fig. 3 is a flow chart of embodiment 3 of the test case deduplication method provided herein. This embodiment mainly refines the test case deduplication method described in embodiment 2 above. As shown in fig. 3, the method may include, but is not limited to, the following steps:

Step S31, extracting characteristic values of each test case.

Step S32, segmenting the characteristic values of each test case to obtain at least one keyword.

For the detailed procedure of steps S31-S32, refer to the description of steps S21-S22 in embodiment 2; it is not repeated here.

Step S33, determining similar keyword pairs in keywords of two test cases in each test case pair respectively.

In this embodiment, the pair of similar keywords is composed of a first keyword and a second keyword, where the text similarity between the first keyword and the second keyword is greater than a set text similarity threshold, the first keyword is a keyword of one of the two test cases, and the second keyword is a keyword of the other of the two test cases.

In this embodiment, the determining process of the similar keyword pairs may include: and calculating the text similarity of the first keyword and the second keyword, judging whether the text similarity of the first keyword and the second keyword is larger than a set text similarity threshold, and if so, forming a similar keyword pair by the first keyword and the second keyword.

Step S34, counting the number of the similar keyword pairs, and determining the similarity of the characteristic values of the two test cases based on the number of the similar keyword pairs.

In this embodiment, a correspondence between the number of similar keyword pairs and the similarity of the feature values of the test cases may be set in advance. After the number of similar keyword pairs is counted, the similarity corresponding to that number is looked up in the correspondence, and the similarity found is used as the similarity of the feature values of the two test cases.
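Steps S33 and S34 might be sketched as below. The text similarity measure is not specified in the application; this example assumes `difflib.SequenceMatcher.ratio()` from the Python standard library. The correspondence table `CORRESPONDENCE`, its values, and the function names are hypothetical and would be set as needed.

```python
from difflib import SequenceMatcher


def similar_keyword_pairs(first_keywords, second_keywords, text_threshold):
    """S33: pair each first keyword with each second keyword whose text similarity exceeds the threshold."""
    pairs = []
    for first in first_keywords:
        for second in second_keywords:
            if SequenceMatcher(None, first, second).ratio() > text_threshold:
                pairs.append((first, second))
    return pairs


# Hypothetical correspondence: (minimum number of similar pairs, similarity),
# listed from the largest minimum to the smallest.
CORRESPONDENCE = [(5, 1.0), (3, 0.8), (1, 0.5), (0, 0.0)]


def similarity_from_count(count, table=CORRESPONDENCE):
    """S34: look up the similarity that corresponds to the number of similar keyword pairs."""
    for minimum_count, similarity in table:
        if count >= minimum_count:
            return similarity
    return 0.0


pairs = similar_keyword_pairs(["login", "button"], ["login", "buttons", "report"], 0.8)
similarity = similarity_from_count(len(pairs))
```

In this example "login" matches "login" exactly and "button" is close to "buttons", so two similar keyword pairs are found and the table maps that count to a similarity of 0.5.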

Steps S33-S34 are a specific implementation of step S23 in embodiment 2.

Step S35, taking each test case pair whose characteristic-value similarity is greater than a preset similarity threshold as a test case pair to be processed.

Step S36, de-duplicating the test cases in the plurality of test case pairs to be processed.

For the detailed procedure of steps S35-S36, refer to steps S24-S25 in embodiment 2; it is not repeated here.

As another alternative embodiment of the present application, fig. 4 is a flow chart of embodiment 4 of the test case deduplication method provided herein. This embodiment mainly refines the test case deduplication method described in embodiment 2 above. As shown in fig. 4, the method may include, but is not limited to, the following steps:

Step S41, extracting characteristic values of each test case.

Step S42, segmenting the characteristic values of each test case to obtain at least one keyword.

Step S43, respectively counting, for each test case pair, the number of times each first keyword appears in the first test case and the number of times each second keyword appears in the second test case.

The first test case and the second test case form the test case pair, the first keyword is a keyword of the first test case, and the second keyword is a keyword of the second test case.

Step S44, determining the similarity of the feature values of the two test cases based on the keywords of the two test cases in each test case pair, the number of times that each first keyword in each test case pair appears in the first test case, and the number of times that each second keyword in each test case pair appears in the second test case.

In this embodiment, the first keyword of the first test case and the number of times the first keyword appears in the first test case in each test case pair may be formed into a first vector, the second keyword of the second test case and the number of times the second keyword appears in the second test case may be formed into a second vector, the similarity between the first vector and the second vector may be calculated, and the similarity between the first vector and the second vector may be used as the similarity between the feature values of the first test case and the second test case.

In this embodiment, the similarity of the feature values of the two test cases may be determined using a cosine similarity algorithm. Specifically, the similarity of the first vector and the second vector may be calculated using a cosine similarity algorithm.
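As a non-limiting sketch of steps S43 and S44, the example below counts keyword occurrences with `collections.Counter`, treats the counts as sparse vectors over the union of keywords, and computes their cosine similarity. The function name `cosine_similarity` and the choice to return 0.0 for an empty keyword list are assumptions of this example.

```python
import math
from collections import Counter


def cosine_similarity(first_keywords, second_keywords):
    """Cosine similarity of the keyword-count vectors of two test cases."""
    first_counts = Counter(first_keywords)    # S43: occurrences in the first test case
    second_counts = Counter(second_keywords)  # S43: occurrences in the second test case
    shared = first_counts.keys() & second_counts.keys()
    dot = sum(first_counts[word] * second_counts[word] for word in shared)
    first_norm = math.sqrt(sum(count * count for count in first_counts.values()))
    second_norm = math.sqrt(sum(count * count for count in second_counts.values()))
    if first_norm == 0 or second_norm == 0:
        return 0.0  # assumption: an empty keyword list is similar to nothing
    return dot / (first_norm * second_norm)


score = cosine_similarity(["login", "login", "page"], ["login", "page", "page"])
```

For these two keyword lists the count vectors over (login, page) are (2, 1) and (1, 2), so the dot product is 4, each norm is the square root of 5, and the similarity is 4/5.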

Steps S43-S44 are a specific implementation of step S23 in embodiment 2.

Step S45, taking each test case pair whose characteristic-value similarity is greater than a preset similarity threshold as a test case pair to be processed.

Step S46, de-duplicating the test cases in the plurality of test case pairs to be processed.

For the detailed procedure of steps S45-S46, refer to steps S24-S25 in embodiment 2; it is not repeated here.

In this embodiment, the similarity of the feature values of the two test cases is determined based on the keywords of the two test cases and the number of times each keyword appears in its test case, so that the accuracy of determining the similarity of the feature values can be improved.

Next, the test case deduplication apparatus provided in the present application is described. The apparatus described below and the test case deduplication method described above may be referred to in correspondence with each other.

Referring to fig. 5, the test case deduplication apparatus includes: the device comprises an extraction module 11, a first determination module 12, a second determination module 13 and a deduplication module 14.

An extracting module 11, configured to extract feature values of each test case;

a first determining module 12, configured to respectively determine the similarity of the feature values of the two test cases in each test case pair, where each test case pair is formed by combining any two test cases selected from a plurality of test cases;

a second determining module 13, configured to take each test case pair whose feature-value similarity is greater than a preset similarity threshold as a test case pair to be processed;

and the deduplication module 14 is used for deduplicating the test cases in the plurality of the to-be-processed test case pairs.

In this embodiment, the extracting module 11 may specifically be configured to:

extracting characteristic values of each test case;

accordingly, the first determining module 12 may specifically be configured to:

In this embodiment, the first determining module 12 may specifically be configured to:

It should be noted that each embodiment emphasizes its differences from the other embodiments; for parts that are the same or similar, the embodiments may be referred to one another. The apparatus embodiments are described relatively briefly because they are substantially similar to the method embodiments, and reference is made to the description of the method embodiments for the relevant points.

Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present application.

From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general purpose hardware platform. Based on such understanding, the technical solutions of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the methods described in the embodiments or some parts of the embodiments of the present application.

The foregoing has described in detail the test case deduplication method and apparatus provided herein. Specific examples are used to illustrate the principles and implementations of the present application, and the above description of the embodiments is provided only to assist in understanding the method of the present application and its core idea. Meanwhile, those skilled in the art may, in accordance with the idea of the present application, make changes to the specific implementations and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims

1. A test case deduplication method, comprising:

extracting characteristic values of each test case; the characteristic value is any one or more of a characteristic value characterizing a functional module to which the test case belongs, a characteristic value characterizing a description of the test case, a characteristic value characterizing an operation step of the test case, and a characteristic value characterizing an expected execution result of the test case;

de-duplicating the test cases in the plurality of test case pairs to be processed, including: respectively removing one of the two test cases in each test case pair to be processed to obtain a first test case set, and if a first test case subset exists in the first test case set, selecting one test case in the first test case subset to retain, wherein the first test case subset comprises at least 2 test cases, and the at least 2 test cases are the same;

the extracting the characteristic value of each test case comprises the following steps:

extracting characteristic values of each test case;

performing word segmentation on the characteristic values of each test case to obtain at least one keyword, wherein the characteristic values of each test case are segmented based on a Python word segmentation library to obtain the at least one keyword;

2. The method according to claim 1, wherein the determining the similarity of the feature values of two test cases in each test case pair based on the keywords obtained by word segmentation of the feature values of each test case, respectively, comprises:

3. The method according to claim 1, wherein the determining the similarity of the feature values of two test cases in each test case pair based on the keywords obtained by word segmentation of the feature values of each test case, respectively, comprises:

4. The method of claim 3, wherein determining the similarity of the feature values of the two test cases based on the keywords of the two test cases in each of the test cases, the number of times each first keyword in each of the test cases appears in the first test case, and the number of times each second keyword in each of the test cases appears in the second test case, respectively, comprises:

5. A test case deduplication apparatus, comprising:

the extraction module is used for extracting the characteristic values of each test case; the characteristic value is any one or more of a characteristic value characterizing a functional module to which the test case belongs, a characteristic value characterizing a description of the test case, a characteristic value characterizing an operation step of the test case, and a characteristic value characterizing an expected execution result of the test case;

the de-duplication module is configured to de-duplicate the test cases in the plurality of test case pairs to be processed, including: respectively removing one of the two test cases in each test case pair to be processed to obtain a first test case set, and if a first test case subset exists in the first test case set, selecting one test case in the first test case subset to retain, wherein the first test case subset comprises at least 2 test cases, and the at least 2 test cases are the same;

the extraction module is specifically configured to:

extracting characteristic values of each test case;

performing word segmentation on the characteristic values of each test case to obtain at least one keyword, wherein the characteristic values of each test case are segmented based on a Python word segmentation library to obtain the at least one keyword;

the first determining module is specifically configured to:

6. The apparatus of claim 5, wherein the first determining module is specifically configured to:

7. The apparatus of claim 5, wherein the first determining module is specifically configured to:

8. The apparatus of claim 7, wherein the first determining module is specifically configured to: