CN112988567B

CN112988567B - Crowdsourcing test automated evaluation method and device

Info

Publication number: CN112988567B
Application number: CN202110104907.2A
Authority: CN
Inventors: 杨鹏; 赵聚雪; 曾哲军; 詹增荣
Original assignee: Guangzhou Panyu Polytechnic
Current assignee: Guangzhou Panyu Polytechnic
Priority date: 2021-01-26
Filing date: 2021-01-26
Publication date: 2022-02-15
Anticipated expiration: 2041-01-26
Also published as: CN112988567A

Abstract

The invention discloses an automated evaluation method and device for crowdsourcing testing. The automated evaluation method includes: converting each standard test case in a standard test case set into a first keyword set, and converting each test case to be evaluated in a to-be-evaluated test case set into a second keyword set; obtaining the intersection of the first keyword set and the second keyword set, and mapping each test case to be evaluated to a unique corresponding standard test case according to the intersection; obtaining the union of the first keyword set and the corresponding second keyword set, and calculating the score of each standard test case according to the union; and calculating the average value of the total score, and evaluating the test case set to be evaluated according to the average value. The invention thereby addresses the technical problem that the prior art cannot accurately evaluate crowdsourcing testing, and realizes accurate automated evaluation of crowdsourcing testing.

Description

Crowdsourcing test automated evaluation method and device

Technical Field

The invention relates to the technical field of software testing, in particular to a crowdsourcing test automated evaluation method and device.

Background

In the internet era, software of every kind is being developed, and all of it contains defects to some degree. The purpose of software testing is to discover as many of these defects as possible before a system goes online, so that they can be corrected in time and their impact reduced. In traditional software testing, a small number of testers must run a large number of tests on one piece of software; this consumes a great deal of their energy, takes a long time, and is often limited in scope. Crowdsourcing testing was developed to address these shortcomings of traditional testing: long duration, heavy consumption of human and material resources, and limited testing scope. Crowdsourcing testing is low in cost: the test task is published on a crowdsourcing platform, and users of the platform take part for a small reward, which effectively reduces the cost of testing. Registered users of a crowdsourcing platform are distributed around the world, and the internet access devices, operating systems, and programming languages they use differ, so for the same test task they can uncover problems, such as compatibility problems, that arise in a wider range of real environments. Collecting the feedback from a crowdsourcing test therefore yields a large number of possible defects, and the software can be improved in a targeted manner. However, the low cost of crowdsourcing testing also means that the skill level of participants is uneven, and some testers submit low-quality test reports in order to obtain the reward quickly. As a result, the test information collected for a task is of poor quality and is difficult to use for targeted improvement of software quality.

The inventors of the present application found that, in the prior art, large numbers of test cases are evaluated manually. Because the number of test cases is large and there is no specific quantitative index for the credibility of crowdsourcing test participants, crowdsourcing tests cannot be evaluated accurately.

Disclosure of Invention

The invention provides a crowdsourcing test automatic evaluation method and device, which are used for solving the technical problem that a large number of test cases in crowdsourcing tests cannot be accurately evaluated through manual means in the prior art, so that accurate automatic evaluation of the crowdsourcing tests is realized.

The first embodiment of the invention provides a crowdsourcing test automated evaluation method, which comprises the following steps:

converting each standard test case in the standard test case set into a first keyword set, and converting each test case to be evaluated in the test case set to be evaluated into a second keyword set;

acquiring an intersection of a first keyword set corresponding to each standard test case and the second keyword set of each test case to be evaluated, and mapping each test case to be evaluated to the only corresponding standard test case according to the intersection;

acquiring a union set of the first keyword set of each standard test case and the corresponding second keyword set of the test case to be evaluated, and calculating to obtain the score of each standard test case according to the union set;

and calculating the total score of all the standard test cases according to the score of each standard test case, calculating the average value of the total score, and evaluating the test case set to be evaluated according to the average value.

Further, each standard test case in the standard test case set is converted into a first keyword set, and each test case to be evaluated in the test case set to be evaluated is converted into a second keyword set, which specifically includes:

converting each standard test case in the standard test case set into a first keyword set by adopting a natural language processing technology; and converting each test case to be evaluated in the test case set to be evaluated into a second keyword set by adopting a natural language processing technology.

Further, mapping each test case to be evaluated to the unique corresponding standard test case according to the intersection specifically includes:

and acquiring the maximum intersection of each first keyword set and each second keyword set, and mapping the test case to be evaluated corresponding to the maximum intersection to a unique corresponding standard test case.

Further, the obtaining a score of each standard test case according to the union specifically includes:

and calculating the coverage rate of the union set and the first keyword set of the standard test case, and converting the coverage rate into the score of each corresponding standard test case according to a preset conversion rule.
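The steps above can be summarized in set notation. This is only a sketch under assumed symbols, none of which appear in the original: K^S_j is the first keyword set of standard test case TSj, K^C_i is the second keyword set of test case TCi, M(i) is the index of the standard test case that TCi is mapped to, and f stands for the preset conversion rule, which the text does not specify. Following the detailed description, the "union" is read here as the union of the intersections between a standard test case and the test cases mapped to it.

```latex
\begin{align*}
M(i) &= \arg\max_{1 \le j \le m} \left| K^{S}_{j} \cap K^{C}_{i} \right|, & i &= 1, \dots, n \\
U_{k} &= \bigcup_{i \,:\, M(i) = k} \left( K^{S}_{k} \cap K^{C}_{i} \right) \\
c_{k} &= \frac{\left| U_{k} \right|}{\left| K^{S}_{k} \right|}, \qquad s_{k} = f(c_{k}) \\
\bar{s} &= \frac{1}{m} \sum_{k=1}^{m} s_{k}
\end{align*}
```

Here c_k is the coverage rate of standard test case TSk, s_k is its score, and the average over all m standard test cases is the value used to evaluate the test case set.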

A second embodiment of the present invention provides a crowdsourcing test automated evaluation device, including: the device comprises a conversion module, a mapping module, a calculation module and an evaluation module;

the conversion module is used for converting each standard test case in the standard test case set into a first keyword set and converting each test case to be evaluated in the test case set to be evaluated into a second keyword set;

the mapping module is used for acquiring an intersection of a first keyword set corresponding to each standard test case and the second keyword set of each test case to be evaluated, and mapping each test case to be evaluated to the only corresponding standard test case according to the intersection;

the calculation module is used for acquiring a union set of the first keyword set of each standard test case and the corresponding second keyword set of the test case to be evaluated, and calculating the score of each standard test case according to the union set;

and the evaluation module is used for calculating the total score of all the standard test cases according to the score of each standard test case, calculating the average value of the total score, and evaluating the test case set to be evaluated according to the average value.

Further, the conversion module comprises means for:

Further, the mapping module includes instructions for:

Further, the computing module includes instructions for:

The invention provides a crowdsourcing test automatic evaluation method and device, aiming at solving the technical problem that in the prior art, a large number of test cases are evaluated by a manual means, so that crowdsourcing tests cannot be evaluated accurately.

Drawings

FIG. 1 is a schematic flow chart of an automated evaluation method for crowdsourcing test according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of obtaining the keywords of a test case by using a natural language processing technique according to an embodiment of the present invention;

FIG. 3 is a process diagram of mapping test cases to be evaluated to standard test cases according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of obtaining the coverage of a standard test case according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of obtaining the score of each standard test case according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of an automated evaluation apparatus for crowdsourcing test according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

In the description of the present application, it is to be understood that the terms "first", "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless otherwise specified.

In the description of the present application, it is to be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "coupled," and "connected" are to be construed broadly, e.g., as meaning a fixed connection, a removable connection, or an integral connection; the connection may be mechanical or electrical; elements may be connected directly or indirectly through intervening media, or two elements may communicate internally. The specific meaning of the above terms in the present application can be understood in a specific case by those of ordinary skill in the art.

Referring to fig. 1-5, a first embodiment of the invention provides an automated evaluation method for crowdsourcing test as shown in fig. 1, comprising:

S1, converting each standard test case in the standard test case set into a first keyword set, and converting each test case to be evaluated in the test case set to be evaluated into a second keyword set;

S2, acquiring the intersection of the first keyword set corresponding to each standard test case and the second keyword set of each test case to be evaluated, and mapping each test case to be evaluated to the unique corresponding standard test case according to the intersection;

S3, acquiring a union set of the first keyword set of each standard test case and the corresponding second keyword set of the test case to be evaluated, and calculating the score of each standard test case according to the union set;

S4, calculating the total score of all the standard test cases according to the score of each standard test case, calculating the average value of the total score, and evaluating the test case set to be evaluated according to the average value.

In the embodiment of the invention, the score of each standard test case is calculated and all the scores are summed to obtain a total score. The total score is divided by the total number of standard test cases in the standard test case set to obtain an average score, and the test case set to be evaluated is evaluated according to this average score: when the average score is higher than a preset threshold score, the test case set to be evaluated is concluded to be reliable.
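The averaging and threshold check described above could be sketched as follows. The function name, the identifiers, and the threshold value are hypothetical, and treating a standard test case with no score as contributing 0 is an assumption of this sketch, not something the text states.

```python
from typing import Mapping, Sequence, Tuple


def evaluate_test_case_set(
    scores: Mapping[str, float],
    standard_ids: Sequence[str],
    threshold: float,
) -> Tuple[float, bool]:
    """Return (average score, reliable?) for a test case set to be evaluated."""
    if not standard_ids:
        raise ValueError("the standard test case set must not be empty")
    # Sum the per-standard-case scores; a standard case without a score
    # contributes 0 (assumption of this sketch).
    total = sum(scores.get(ts_id, 0.0) for ts_id in standard_ids)
    # Divide by the total number of standard test cases, not by the number
    # of cases that happened to receive a score.
    average = total / len(standard_ids)
    return average, average > threshold


average, reliable = evaluate_test_case_set(
    {"TS1": 80.0, "TS2": 100.0}, ["TS1", "TS2", "TS3"], threshold=50.0
)
print(average, reliable)  # 60.0 True
```

Dividing by the size of the whole standard set means that standard test cases nobody covered pull the average down, which matches the description of dividing the total score by the total number of standard test cases.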

The embodiment of the invention converts each standard test case in the standard test case set into a first keyword set, converts each test case to be evaluated in the test case set to be evaluated into a second keyword set, maps each test case to be evaluated to the unique corresponding standard test case according to the intersection of the first keyword set and the second keyword set, calculates the score of each standard test case according to the mapping relation, and evaluates the test case set to be evaluated by averaging the total score of all the standard test cases. In other words, each test case to be evaluated is mapped to the standard test case with which it shares the largest number of keywords, and the test case set to be evaluated is evaluated by calculating the average score of the standard test cases. Because the evaluation is based on the standard test case data, the test case set to be evaluated can be evaluated automatically and accurately, and the accuracy and reliability of crowdsourcing test evaluation can be improved.

As a specific implementation manner of the embodiment of the present invention, converting each standard test case in a standard test case set into a first keyword set, and converting each test case to be evaluated in a test case set to be evaluated into a second keyword set, specifically includes:

and converting each standard test case in the standard test case set into a first keyword set by adopting a natural language processing technology, and converting each test case to be evaluated in the test case set to be evaluated into a second keyword set by adopting the natural language processing technology.

In the embodiment of the invention, for a crowdsourced test object, a standard test case set {TS1, TS2, …, TSm} is obtained, in which each standard test case is denoted TSj; the test case set to be evaluated is {TC1, TC2, …, TCn}, in which each test case to be evaluated is denoted TCi. Using a natural language processing technology, each standard test case is converted into a first keyword set {TSjKW1, TSjKW2, …, TSjKWp} (j = 1, 2, …, m), and each test case to be evaluated is converted into a second keyword set {TCiKW1, TCiKW2, …, TCiKWq} (i = 1, 2, …, n). FIG. 2 is a schematic diagram of obtaining the keywords of a test case by using a natural language processing technology according to an embodiment of the present invention. In the figure, "Testcase Standard" represents a standard test case; "Testcase Standard Keywords set" represents the first keyword set of a standard test case; "Testcase" represents a test case to be evaluated; "Testcase Keywords set" represents the second keyword set of the test case to be evaluated; and "NLP" denotes natural language processing.
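The text does not say which natural language processing technique is used. As a minimal stand-in, the conversion of a test case into a keyword set might be sketched with plain tokenization and stop-word removal; the function name, the stop-word list, and the sample test case texts are all hypothetical.

```python
import re
from typing import FrozenSet

# Hypothetical stop-word list; a real pipeline would more likely rely on a
# proper NLP toolkit (word segmentation, part-of-speech filtering, and so on).
STOP_WORDS = frozenset(
    {"a", "an", "and", "the", "to", "of", "in", "on", "is", "with", "then"}
)


def to_keyword_set(test_case_text: str) -> FrozenSet[str]:
    """Convert the text of one test case into a set of lowercase keywords."""
    tokens = re.findall(r"[a-z0-9]+", test_case_text.lower())
    return frozenset(t for t in tokens if t not in STOP_WORDS)


# First keyword set (standard test case) and second keyword set (test case
# to be evaluated).
standard_keywords = to_keyword_set(
    "Enter a valid username and password, then click Login"
)
candidate_keywords = to_keyword_set("Click login with the valid password")
print(sorted(standard_keywords & candidate_keywords))
# ['click', 'login', 'password', 'valid']
```

Using sets rather than lists keeps the later intersection and union steps to single operators and ignores repeated words within one test case.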

As a specific implementation manner of the embodiment of the present invention, mapping each test case to be evaluated to a unique corresponding standard test case according to an intersection includes:

and acquiring the maximum intersection of each first keyword set and each second keyword set, and mapping the test case to be evaluated corresponding to the maximum intersection to the only corresponding standard test case.

In the embodiment of the invention, the intersection of each second keyword set {TCiKW1, TCiKW2, …, TCiKWq} (i = 1, 2, …, n) with each first keyword set {TSjKW1, TSjKW2, …, TSjKWp} (j = 1, 2, …, m) is taken, giving m × n intersections in total. From these intersections, the standard test case TSk that shares the largest number of keywords with each test case to be evaluated is found, and each test case to be evaluated is mapped to that unique corresponding standard test case, for example: TC1 → TS2, TC2 → TS2, TC3 → TS1, TC4 → TS3, …, TCn → TSm. Because each test case to be evaluated is mapped to a unique corresponding standard test case and the evaluation is carried out according to this mapping relation, the accuracy and reliability of evaluating the test cases to be evaluated can be effectively improved. FIG. 3 is a process diagram of mapping test cases to be evaluated to standard test cases according to an embodiment of the present invention. In the figure, "Testcase Standard Keywords" represents the keywords of the standard test case; "Keywords count of associations" represents the number of shared keywords in the intersection; and "TC-TS mapping" represents the mapping from the test case to be evaluated to the standard test case.
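The maximum-intersection mapping could be sketched as below. The function name and sample data are hypothetical. The text does not say how ties are broken or what happens when a test case shares no keyword with any standard test case; here the first standard test case with the largest overlap wins, and a test case with no overlap is left unmapped (None). Both choices are assumptions.

```python
from typing import Dict, FrozenSet, Mapping, Optional


def map_to_standard(
    tc_keywords: Mapping[str, FrozenSet[str]],
    ts_keywords: Mapping[str, FrozenSet[str]],
) -> Dict[str, Optional[str]]:
    """Map each test case to the standard test case with the largest intersection."""
    mapping: Dict[str, Optional[str]] = {}
    for tc_id, tc_kw in tc_keywords.items():
        best_id, best_size = None, 0
        for ts_id, ts_kw in ts_keywords.items():
            shared = len(tc_kw & ts_kw)  # size of one of the m x n intersections
            if shared > best_size:  # strict ">" keeps the first best on ties
                best_id, best_size = ts_id, shared
        mapping[tc_id] = best_id
    return mapping


ts_keywords = {
    "TS1": frozenset({"login", "username", "password"}),
    "TS2": frozenset({"search", "query", "results"}),
}
tc_keywords = {
    "TC1": frozenset({"login", "password"}),
    "TC2": frozenset({"search", "results", "page"}),
    "TC3": frozenset({"logout"}),
}
mapping = map_to_standard(tc_keywords, ts_keywords)
print(mapping)  # {'TC1': 'TS1', 'TC2': 'TS2', 'TC3': None}
```

Several test cases may map to the same standard test case, as in the TC1 → TS2, TC2 → TS2 example in the text; the mapping is unique only in the direction from test case to standard test case.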

As a specific implementation manner of the embodiment of the present invention, obtaining the score of each standard test case according to the union specifically includes:

In the embodiment of the present invention, for each standard test case TSk, the intersections TSk ∩ TCm, TSk ∩ TCp, TSk ∩ TCq, … of its first keyword set with the second keyword sets of all the test cases to be evaluated {TCm, TCp, TCq, …} mapped to it are merged, that is, their union is taken, to obtain a third keyword set {TCmKWx, …, TCpKWy, …, TCqKWz, …}. The number of keywords in the third keyword set is divided by the number of keywords in the first keyword set {TSkKW1, TSkKW2, …, TSkKWp} converted from the standard test case TSk to obtain the coverage rate of the first keyword set of the standard test case. The coverage rate is converted into the score of the standard test case according to a preset conversion rule, and the scores of all standard test cases are calculated in this way. FIG. 4 is a schematic diagram of obtaining the coverage of a standard test case according to an embodiment of the present invention, where "TC-TS Keywords" represents the keywords of the test cases to be evaluated that are mapped to the standard test case. FIG. 5 is a schematic diagram of obtaining the score of each standard test case according to an embodiment of the present invention, where "Testcase Standard Keywords count" represents the number of keywords of the standard test case, and "TC-TS Keywords score" represents the keyword score of the test cases to be evaluated that are mapped to the standard test case.
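A small sketch of the coverage and score calculation follows. The function names and sample keywords are hypothetical, and the conversion rule shown (coverage scaled to a 0 to 100 score) is only a placeholder, since the preset conversion rule is not given in the text.

```python
from typing import FrozenSet, Iterable, Set


def coverage(ts_kw: FrozenSet[str], mapped_tc_kws: Iterable[FrozenSet[str]]) -> float:
    """Coverage rate of one standard test case by the test cases mapped to it."""
    if not ts_kw:
        raise ValueError("a standard test case needs at least one keyword")
    covered: Set[str] = set()  # the "third keyword set"
    for tc_kw in mapped_tc_kws:
        covered |= ts_kw & tc_kw  # union of the individual intersections
    return len(covered) / len(ts_kw)


def coverage_to_score(cov: float) -> float:
    """Placeholder conversion rule (assumption): scale coverage to 0..100."""
    return round(cov * 100.0, 1)


ts_k = frozenset({"login", "username", "password", "click"})
mapped = [
    frozenset({"login", "click", "button"}),
    frozenset({"password", "login"}),
]
cov = coverage(ts_k, mapped)
score = coverage_to_score(cov)
print(cov, score)  # 0.75 75.0
```

Because the covered keywords are accumulated in a set, a keyword reported by several mapped test cases is counted once, so the coverage rate cannot exceed 1.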

The embodiment of the invention has the following beneficial effects:

The embodiment of the invention converts each standard test case in a standard test case set into a first keyword set, converts each test case to be evaluated in the test case set to be evaluated into a second keyword set, maps each test case to be evaluated to a unique corresponding standard test case according to the intersection of the first keyword set and the second keyword set, calculates the score of each standard test case according to the mapping relation, and evaluates the test case set to be evaluated according to the average value of the total score of all the standard test cases. Each test case to be evaluated is thus mapped to the standard test case with which it shares the most keywords, and the evaluation is based on the standard test case data. In this way the test case set to be evaluated can be evaluated automatically and accurately, and the accuracy and reliability of crowdsourcing test evaluation can be improved.

Referring to fig. 6, a second embodiment of the present invention provides an automatic evaluation apparatus for crowdsourcing test shown in fig. 6, comprising: a conversion module 10, a mapping module 20, a calculation module 30 and an evaluation module 40;

the conversion module 10 is configured to convert each standard test case in the standard test case set into a first keyword set, and convert each test case to be evaluated in the test case set to be evaluated into a second keyword set;

the mapping module 20 is configured to obtain an intersection of the first keyword set corresponding to each standard test case and the second keyword set of each test case to be evaluated, and map each test case to be evaluated to a unique corresponding standard test case according to the intersection;

the calculation module 30 is configured to obtain a union set of the first keyword set of each standard test case and the corresponding second keyword set of the test case to be evaluated, and calculate a score of each standard test case according to the union set;

and the evaluation module 40 is configured to calculate a total score of all the standard test cases according to the score of each standard test case, calculate an average value of the total scores, and evaluate the test case set to be evaluated according to the average value.
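One possible way to picture the four-module device in software is as four small classes wired together. This is only an illustrative sketch: all class names, method names, the regex-based keyword extraction, the 0 to 100 score rule, and the threshold are assumptions, not details from the text.

```python
import re
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Set, Tuple

KeywordSets = Dict[str, FrozenSet[str]]


class ConversionModule:
    def convert(self, cases: Dict[str, str]) -> KeywordSets:
        # Stand-in for the NLP step: lowercase word tokens as keywords.
        return {cid: frozenset(re.findall(r"[a-z0-9]+", text.lower()))
                for cid, text in cases.items()}


class MappingModule:
    def map(self, tc: KeywordSets, ts: KeywordSets) -> Dict[str, Optional[str]]:
        result: Dict[str, Optional[str]] = {}
        for tc_id, tc_kw in tc.items():
            # Standard case with the largest intersection; first one wins ties.
            best = max(ts, key=lambda ts_id: len(tc_kw & ts[ts_id]), default=None)
            result[tc_id] = best if best is not None and tc_kw & ts[best] else None
        return result


class CalculationModule:
    def score(self, tc: KeywordSets, ts: KeywordSets,
              mapping: Dict[str, Optional[str]]) -> Dict[str, float]:
        scores: Dict[str, float] = {}
        for ts_id, ts_kw in ts.items():
            covered: Set[str] = set()
            for tc_id, target in mapping.items():
                if target == ts_id:
                    covered |= ts_kw & tc[tc_id]
            # Placeholder conversion rule: coverage rate scaled to 0..100.
            scores[ts_id] = 100.0 * len(covered) / len(ts_kw) if ts_kw else 0.0
        return scores


class EvaluationModule:
    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def evaluate(self, scores: Dict[str, float]) -> Tuple[float, bool]:
        if not scores:
            raise ValueError("no standard test cases to average over")
        average = sum(scores.values()) / len(scores)
        return average, average > self.threshold


@dataclass
class EvaluationDevice:
    conversion: ConversionModule
    mapping: MappingModule
    calculation: CalculationModule
    evaluation: EvaluationModule

    def run(self, standard: Dict[str, str], candidates: Dict[str, str]) -> Tuple[float, bool]:
        ts = self.conversion.convert(standard)
        tc = self.conversion.convert(candidates)
        tc_to_ts = self.mapping.map(tc, ts)
        return self.evaluation.evaluate(self.calculation.score(tc, ts, tc_to_ts))


device = EvaluationDevice(ConversionModule(), MappingModule(),
                          CalculationModule(), EvaluationModule(threshold=60.0))
average, reliable = device.run(
    {"TS1": "login username password", "TS2": "search query"},
    {"TC1": "login password", "TC2": "search query results"},
)
print(round(average, 2), reliable)  # 83.33 True
```

Keeping each module behind its own class mirrors the conversion, mapping, calculation, and evaluation split of the device, so any one step (for example the keyword extraction) could be swapped out without touching the others.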

As a specific implementation manner of the embodiment of the present invention, the conversion module 10 includes:

In the embodiment of the invention, for a test object, a standard test case set {TS1, TS2, …, TSm} is obtained, in which each standard test case is denoted TSj; the test case set to be evaluated is {TC1, TC2, …, TCn}, in which each test case to be evaluated is denoted TCi. Using a natural language processing technology, each standard test case is converted into a first keyword set {TSjKW1, TSjKW2, …, TSjKWp} (j = 1, 2, …, m), and each test case to be evaluated is converted into a second keyword set {TCiKW1, TCiKW2, …, TCiKWq} (i = 1, 2, …, n). FIG. 2 is a schematic diagram of obtaining the keywords of a test case by using a natural language processing technology according to an embodiment of the present invention. In the figure, "Testcase Standard" represents a standard test case; "Testcase Standard Keywords set" represents the first keyword set of a standard test case; "Testcase" represents a test case to be evaluated; "Testcase Keywords set" represents the second keyword set of the test case to be evaluated; and "NLP" denotes natural language processing.

As a specific implementation manner of the embodiment of the present invention, the mapping module 20 includes:

In the embodiment of the invention, the intersection of each second keyword set {TCiKW1, TCiKW2, …, TCiKWq} (i = 1, 2, …, n) with each first keyword set {TSjKW1, TSjKW2, …, TSjKWp} (j = 1, 2, …, m) is taken, giving m × n intersections in total. From these intersections, the standard test case TSk that shares the largest number of keywords with each test case to be evaluated is found, and each test case to be evaluated is mapped to that unique corresponding standard test case, for example: TC1 → TS2, TC2 → TS2, TC3 → TS1, TC4 → TS3, …, TCn → TSm. Because each test case to be evaluated is mapped to a unique corresponding standard test case and the evaluation is carried out according to this mapping relation, the accuracy and reliability of evaluating the test cases to be evaluated can be effectively improved. FIG. 3 is a process diagram of mapping test cases to be evaluated to standard test cases according to an embodiment of the present invention. In the figure, "Testcase Standard Keywords" represents the keywords of the standard test case; "Keywords count of associations" represents the number of shared keywords in the intersection; and "TC-TS mapping" represents the mapping from the test case to be evaluated to the standard test case.

As a specific implementation manner of the embodiment of the present invention, the calculating module 30 includes:

In the embodiment of the present invention, for each standard test case TSk, the intersections TSk ∩ TCm, TSk ∩ TCp, TSk ∩ TCq, … of its first keyword set with the second keyword sets of all the test cases to be evaluated {TCm, TCp, TCq, …} mapped to it are merged, that is, their union is taken, to obtain a third keyword set {TCmKWx, …, TCpKWy, …, TCqKWz, …}. The number of keywords in the third keyword set is divided by the number of keywords in the first keyword set {TSkKW1, TSkKW2, …, TSkKWp} converted from the standard test case TSk to obtain the coverage rate of the first keyword set of the standard test case. The coverage rate is converted into the score of the standard test case according to a preset conversion rule, and the scores of all standard test cases are calculated in this way. FIG. 4 is a schematic diagram of obtaining the coverage of a standard test case according to an embodiment of the present invention, where "TC-TS Keywords" represents the keywords of the test cases to be evaluated that are mapped to the standard test case. FIG. 5 is a schematic diagram of obtaining the score of each standard test case according to an embodiment of the present invention, where "Testcase Standard Keywords count" represents the number of keywords of the standard test case, and "TC-TS Keywords score" represents the keyword score of the test cases to be evaluated that are mapped to the standard test case.

The embodiment of the invention has the following beneficial effects:

The foregoing is a preferred embodiment of the present invention, and it should be noted that it would be apparent to those skilled in the art that various modifications and enhancements can be made without departing from the principles of the invention, and such modifications and enhancements are also considered to be within the scope of the invention.

Claims

1. An automated evaluation method for crowdsourcing tests, comprising:

acquiring a union set of the first keyword set of each standard test case and the corresponding second keyword set of the test case to be evaluated, and calculating to obtain the score of each standard test case according to the union set; calculating the score of each standard test case according to the union set, wherein the calculation specifically comprises the following steps: calculating the coverage rate of the union set and the first keyword set of the standard test case, and converting the coverage rate into the score of each corresponding standard test case according to a preset conversion rule;

calculating the total score of all the standard test cases according to the score of each standard test case, calculating the average value of the total score, and evaluating the test case set to be evaluated according to the average value; calculating the score of each standard test case, accumulating and summing all the scores to obtain a total score, and dividing the total score by the total number of the standard test cases in the standard test case set to obtain an average score.

2. The automated evaluation method for crowdsourcing test according to claim 1, wherein converting each standard test case in the standard test case set into a first keyword set, and converting each test case in the test case set to be evaluated into a second keyword set, specifically comprises:

3. The automated evaluation method for crowdsourcing test according to claim 1, wherein the mapping each test case to be evaluated to the uniquely corresponding standard test case according to the intersection specifically comprises:

4. An automated crowdsourcing test evaluation device, comprising: the device comprises a conversion module, a mapping module, a calculation module and an evaluation module;

the calculation module is used for acquiring a union set of the first keyword set of each standard test case and the corresponding second keyword set of the test case to be evaluated, and calculating the score of each standard test case according to the union set; the method is specifically used for: calculating the coverage rate of the union set and the first keyword set of the standard test case, and converting the coverage rate into the score of each corresponding standard test case according to a preset conversion rule;

the evaluation module is used for calculating the total score of all the standard test cases according to the score of each standard test case, calculating the average value of the total score, and evaluating the test case set to be evaluated according to the average value; the method is specifically used for calculating the score of each standard test case, accumulating and summing all the scores to obtain a total score, and dividing the total score by the total number of the standard test cases in the standard test case set to obtain an average score.

5. The crowdsourcing test automated evaluation device of claim 4, wherein the conversion module comprises instructions to:

6. The crowdsourcing test automated evaluation device of claim 4, wherein the mapping module comprises means for: