CN113032253B - Test data feature extraction method, test method and related device

Test data feature extraction method, test method and related device

Info

Publication number
CN113032253B
CN113032253B (Application CN202110292100.6A)
Authority
CN
China
Prior art keywords
test data
test
original corpus
short
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110292100.6A
Other languages
Chinese (zh)
Other versions
CN113032253A (en)
Inventor
陈振坤
张伟杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huya Technology Co Ltd
Original Assignee
Guangzhou Huya Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huya Technology Co Ltd
Priority to CN202110292100.6A
Publication of CN113032253A
Application granted
Publication of CN113032253B
Active legal status
Anticipated expiration

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/36: Preventing errors by testing or debugging software
    • G06F11/3668: Software testing
    • G06F11/3672: Test management
    • G06F11/368: Test management for test version control, e.g. updating test cases to a new software version
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/36: Preventing errors by testing or debugging software
    • G06F11/3668: Software testing
    • G06F11/3672: Test management
    • G06F11/3684: Test management for test design, e.g. generating new test cases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a test data feature extraction method, a test method, and a related device. The test data feature extraction method comprises: acquiring original corpus short sentences from a test case; extracting keywords from the original corpus short sentences to form a keyword set; adjusting the keyword set with a pre-trained word embedding model to obtain a vectorized feature sentence corresponding to each original corpus short sentence; and encrypting the vectorized feature sentence with a preset encryption algorithm to obtain the test data feature corresponding to the original corpus short sentence. This scheme avoids redundant construction of test data.

Description

Test data feature extraction method, test method and related device
Technical Field
The present application relates to the field of software testing technologies, and in particular, to a method for extracting features of test data, a test method, and a related device.
Background
Because Chinese natural language can describe the same thing in many different ways, and text written by different people varies considerably, structured test cases written in Chinese natural language by different people will differ. This leads to redundant construction of test data, excessive repeated labor, and low reuse.
An existing scheme uses sentence vectors: a large Chinese text corpus is used as training material, a data model is obtained through unsupervised learning, the model is then used to cluster the phrases in the structured test case, and similar phrases are mapped to a unique test data ID. This solution has two significant problems. First, it depends on a large Chinese text corpus as training material, but a large body of structured test case text is usually hard to obtain for a new product. Second, the accuracy of the clustering-based association is low, generally below 50%. Providing a test data feature extraction method that achieves high accuracy without requiring large amounts of training material has therefore become a valuable topic.
Disclosure of Invention
The technical problem mainly solved by the present application is to provide a test data feature extraction method, a test method, and a related device that can avoid redundant construction of test data.
In order to solve the above problems, a first aspect of the present application provides a test data feature extraction method, the extraction method comprising: acquiring original corpus short sentences in a test case; extracting keywords of the original corpus short sentences to form a keyword set; adjusting the keyword set by adopting a pre-trained word embedding model to obtain a vectorization characteristic sentence corresponding to the original corpus short sentence; and encrypting the vectorization feature sentences by adopting a preset encryption algorithm to obtain the test data features corresponding to the original corpus short sentences.
In order to solve the above problems, a second aspect of the present application provides a test method comprising: extracting test data features from all original corpus short sentences in the test case by using a test data feature extraction method to obtain test data features corresponding to each original corpus short sentence; establishing an index value according to the test data characteristics corresponding to each original corpus short sentence so as to classify all the original corpus short sentences in the test case according to different test data characteristics; testing all original corpus short sentences in the test case according to the classification result; the test data feature extraction method is the test data feature extraction method of the first aspect.
In order to solve the above problem, a third aspect of the present application provides an extraction device for testing data features, including: the corpus acquisition module is used for acquiring original corpus short sentences in the test cases; the keyword extraction module is used for extracting keywords of the original corpus short sentences to form a keyword set; the vectorization module is used for adjusting the keyword set by adopting a pre-trained word embedding model to obtain vectorization feature sentences corresponding to the original corpus short sentences; and the encryption module is used for encrypting the vectorized feature sentences by adopting a preset encryption algorithm to obtain the test data features corresponding to the original corpus short sentences.
In order to solve the above problems, a fourth aspect of the present application provides a test apparatus, comprising: the feature extraction module is used for extracting test data features of all original corpus short sentences in the test case by using a test data feature extraction method to obtain test data features corresponding to each original corpus short sentence; the classification module is used for establishing an index value according to the test data characteristics corresponding to each original corpus short sentence so as to classify all the original corpus short sentences in the test case according to different test data characteristics; the testing module is used for testing all original corpus short sentences in the test case according to the classification result; the test data feature extraction method is the test data feature extraction method of the first aspect.
In order to solve the above-mentioned problems, a fifth aspect of the present application provides an electronic device, including a memory and a processor coupled to each other, where the processor is configured to execute program instructions stored in the memory, to implement the test data feature extraction method of the first aspect, or the test method of the second aspect.
In order to solve the above-described problems, a sixth aspect of the present application provides a computer-readable storage medium having stored thereon program instructions which, when executed by a processor, implement the test data feature extraction method of the above-described first aspect, or the test method of the above-described second aspect.
The beneficial effects of the application are as follows: in the test data feature extraction method, keywords of the original corpus short sentences are extracted to form a keyword set, a pre-trained word embedding model is used to adjust the keyword set to obtain vectorized feature sentences corresponding to the original corpus short sentences, and a preset encryption algorithm is then used to encrypt the vectorized feature sentences to obtain the test data features corresponding to the original corpus short sentences. Because original corpus short sentences that are worded differently but share the same actual semantics are mapped to the same test data feature, redundant construction of test data is avoided. In addition, the test method of the application performs software testing with the test data obtained by the test data feature extraction method, thereby achieving low repeated labor and high reuse.
Drawings
FIG. 1 is a flow chart of an embodiment of a method for extracting test data features according to the present application;
FIG. 2 is a flowchart of an embodiment of step S12 in FIG. 1;
FIG. 3 is a flowchart illustrating an embodiment of step S13 in FIG. 1;
FIG. 4 is a flow chart of an embodiment of the testing method of the present application;
FIG. 5 is a schematic diagram of an embodiment of an apparatus for extracting test data features according to the present application;
FIG. 6 is a schematic diagram of a testing apparatus according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a frame of an embodiment of an electronic device of the present application;
FIG. 8 is a schematic diagram of a frame of one embodiment of a computer-readable storage medium of the present application.
Detailed Description
The following describes embodiments of the present application in detail with reference to the drawings.
In the following description, for purposes of explanation and not limitation, specific details such as particular system architectures, interfaces, and techniques are set forth in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" is herein merely an association relationship describing an associated object, meaning that there may be three relationships, e.g., a and/or B, may represent: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship. Further, "a plurality" herein means two or more than two.
Referring to fig. 1, fig. 1 is a flowchart illustrating an embodiment of a test data feature extraction method according to the present application. Specifically, the test data feature extraction method in this embodiment may include the following steps:
Step S11: obtaining the original corpus short sentences in the test case.
Software testing is usually performed by carefully selecting a batch of test data to form test cases according to the specifications of each stage of software development and the internal structure of the program, using the test cases to drive the program under test, observing the execution results, verifying whether the obtained results match the expected results, and then making corresponding adjustments. In one embodiment, the test cases may be structured test cases, i.e. semi-formal test case documents written in Chinese natural language under UML (Unified Modeling Language) and BNF (Backus-Naur Form) constraints. To ensure test coverage, test cases generally contain large-scale, scientifically sampled corpus data, which can be collected while testing the program under test. Because Chinese natural language can describe the same thing in many ways, and text written by different people varies considerably, this corpus data contains repeated descriptions. The original corpus short sentences in the initial test case therefore need to be obtained so that features of the initial test case can be extracted and redundant construction of test data avoided.
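For illustration only, a minimal Python sketch of obtaining original corpus short sentences from a test case document follows; the embodiment does not fix a file format, so the assumption of one test step per line, the delimiter set, and the function name extract_phrases are hypothetical.

```python
# Minimal sketch of step S11: pull candidate original corpus short sentences out of
# a structured test case document (format assumed: one step per line, phrases
# separated by common Chinese or ASCII delimiters).
import re

def extract_phrases(test_case_text: str) -> list[str]:
    phrases = []
    for line in test_case_text.splitlines():
        for phrase in re.split(r"[，,；;。]", line):   # split each step on phrase delimiters
            phrase = phrase.strip()
            if phrase:
                phrases.append(phrase)
    return phrases
```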
Step S12: extracting keywords from the original corpus short sentences to form a keyword set.
In natural language processing, the same thing can be described in many ways, and the corpus contains many modifier words. In most cases these modifiers contribute little to the description; the words that carry the main meaning are the keywords of the original corpus short sentence. By extracting the keywords of the original corpus short sentence, a keyword set can be formed that reflects the thing the original corpus short sentence describes.
Referring to fig. 2, fig. 2 is a flowchart illustrating an embodiment of step S12 in fig. 1. In an embodiment, the step S12 may specifically include:
Step S121: performing word segmentation on the original corpus short sentences to obtain an original word set.
Specifically, the original corpus short sentence contains the requirement for testing the target program. Word segmentation is performed on this test target information to obtain the individual words it contains, yielding an original word set composed of the words obtained after segmentation. When segmenting the original corpus short sentence, algorithms such as jieba (Chinese word segmentation), SnowNLP (Chinese text processing), or THULAC (Chinese lexical analysis) may be used, or any other algorithm with a word segmentation function, so that the original corpus short sentence is split into individual words.
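As an illustration of this step, the following sketch segments a corpus phrase with jieba, one of the algorithms named above; the example phrase is invented here and is not taken from the embodiment.

```python
# Minimal sketch of step S121: word segmentation with jieba (pip install jieba).
import jieba

phrase = "判断用户是否注册成为高级用户"   # hypothetical original corpus short sentence
original_words = jieba.lcut(phrase)       # split the phrase into individual words
print(original_words)
```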
Step S122: filtering the original word set with a preset processing rule to obtain the keyword set.
It can be understood that after word segmentation, the original word set may contain words that express tone or modification and carry no specific meaning. A preset processing rule is used to filter out these meaningless words; the remaining words reflect the thing the original corpus short sentence describes, yielding the corresponding keyword set.
Specifically, the preset processing rule includes at least one of a stop-word removal rule, a punctuation removal rule, and a digit removal rule. After word segmentation, the resulting original word set is filtered. The stop-word removal rule means that a stop-word bank is built and any segment of the original word set that appears in the stop-word bank is filtered out; the punctuation removal rule filters out punctuation marks, suffix marks, and the like; the digit removal rule filters out digits. Filtering out useless words that carry no actual meaning improves the accuracy of test data feature extraction.
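A minimal sketch of this filtering step follows; the stop-word bank shown is a small illustrative placeholder, not the stop-word bank of the embodiment.

```python
# Minimal sketch of step S122: filter the original word set with the three preset
# processing rules described above (stop words, punctuation, digits).
import string

STOP_WORDS = {"的", "了", "是否", "成为"}              # placeholder stop-word bank
CN_PUNCT = set("，。！？；：、“”‘’（）《》")              # common Chinese punctuation

def filter_words(original_words: list[str]) -> list[str]:
    keywords = []
    for w in original_words:
        if w in STOP_WORDS:                            # stop-word removal rule
            continue
        if all(c in string.punctuation or c in CN_PUNCT for c in w):  # punctuation removal rule
            continue
        if w.isdigit():                                # digit removal rule
            continue
        keywords.append(w)
    return keywords
```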
Step S13: adjusting the keyword set with a pre-trained word embedding model to obtain the vectorized feature sentences corresponding to the original corpus short sentences.
Through training, a word embedding model can vectorize all words, so that the relationships between words can be quantified and measured and the connections between words can be mined. Word vectors output by a word embedding model can therefore be used for many natural language processing tasks, such as clustering, synonym finding, and part-of-speech analysis. In the present application, a pre-trained word embedding model is used to adjust the keyword set, yielding a vectorized feature sentence corresponding to the original corpus short sentence. Specifically, the pre-trained word embedding model vectorizes each keyword, producing multidimensional vector data, i.e. the keyword's word vector; according to the word characteristics of the keywords in the original corpus short sentence, each keyword in the keyword set is vectorized to obtain a word vector set, and the original corpus short sentence is converted into a vectorized feature sentence according to this word vector set.
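As an illustration, word vectors for the keyword set can be looked up in a pre-trained embedding as in the following sketch; the use of gensim and the file name embeddings.kv are assumptions, since the embodiment only requires some pre-trained word embedding model.

```python
# Minimal sketch of keyword vectorization with a pre-trained embedding.
from gensim.models import KeyedVectors

wv = KeyedVectors.load("embeddings.kv")    # hypothetical pre-trained word embedding model

def to_word_vectors(keywords: list[str]) -> dict:
    # Return the word vector of every keyword present in the embedding vocabulary.
    return {w: wv[w] for w in keywords if w in wv}
```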
Referring to fig. 3, fig. 3 is a flowchart illustrating an embodiment of step S13 in fig. 1. In an embodiment, the step S13 may specifically include:
Step S131: inputting the keyword set into the pre-trained word embedding model to obtain word vectors.
Step S132: replacing each ontology word with its nearest adjacent word in the embedding space to obtain the vectorized feature sentence corresponding to the original corpus short sentence.
Each keyword is vectorized to obtain its word vector, yielding the word vector set corresponding to each original corpus short sentence. Word vectors are output by the word embedding model, and the words closest to each other in the embedding space have the highest similarity. By replacing each ontology word with its nearest adjacent word in the embedding space, the resulting vectorized feature sentences corresponding to the original corpus short sentences flatten the differences between structured text test cases written in Chinese natural language by different people.
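A rough sketch of steps S131 and S132 follows, reusing the wv object from the previous sketch. The embodiment does not spell out how mutually nearest words are collapsed to a single representative, so this is only one plausible reading; the "|" separator mirrors the feature sentences shown in the application scenario below and is likewise an assumption.

```python
# Sketch of step S132: replace each ontology word with its nearest neighbour in the
# embedding space and join the result into a vectorized feature sentence.
def to_feature_sentence(keywords: list[str]) -> str:
    canonical = []
    for w in keywords:
        if w in wv:
            nearest, _score = wv.most_similar(w, topn=1)[0]   # closest word in the embedding space
            canonical.append(nearest)
        else:
            canonical.append(w)                               # keep out-of-vocabulary words as-is
    return "|".join(canonical)
```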
Step S14: encrypting the vectorized feature sentences with a preset encryption algorithm to obtain the test data features corresponding to the original corpus short sentences.
It can be understood that encrypting the vectorized feature sentence with a preset encryption algorithm yields a unique feature value corresponding to the vectorized feature sentence, namely the test data feature corresponding to the original corpus short sentence. Because each vectorized feature sentence corresponds to a unique feature value, the vectorized feature sentences corresponding to the original corpus short sentences can be compared and classified by this feature value, so that several originally different original corpus short sentences are associated with the same test data feature. This flattens the differences between test cases and establishes a unique mapping between test data features and original corpus short sentences that are worded differently but share the same actual semantics, thereby avoiding redundant construction of test data.
In one embodiment, the preset encryption algorithm is a message digest algorithm. A message digest algorithm takes input of arbitrary length and produces a fixed-length, pseudo-random output. Generally, different input messages produce different digests, while the same input always produces the same output. This is exactly the property a good message digest algorithm has: when the input changes, the output changes. The vectorized feature sentences are therefore encrypted with a message digest algorithm to obtain the test data features corresponding to the original corpus short sentences; the digests of two merely similar vectorized feature sentences are not similar and may differ greatly, so whether originally different original corpus short sentences describe the same thing can be judged accurately from the obtained test data features.
Specifically, the message digest algorithm may be Message-Digest Algorithm 5 (MD5), in which case the test data feature corresponding to the original corpus short sentence is an MD5 value. For data of any length, the computed MD5 value has a fixed length; it is easy to compute the MD5 value from the original data, and any modification of the original data, even of a single byte, yields a completely different MD5 value. The original corpus short sentences in the test case can therefore be classified by their MD5 values, so that several originally different original corpus short sentences are associated with the same MD5 value. This flattens the differences between test cases and establishes a unique mapping between MD5 values and original corpus short sentences that are worded differently but share the same actual semantics, avoiding redundant construction of test data.
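A minimal sketch of this hashing step with Python's standard hashlib module follows; the helper name feature_id is hypothetical.

```python
# Sketch of step S14 with MD5 as the message digest algorithm.
import hashlib

def feature_id(feature_sentence: str) -> str:
    # Identical feature sentences always yield the same MD5 value; changing even a
    # single character yields a completely different digest.
    return hashlib.md5(feature_sentence.encode("utf-8")).hexdigest()
```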
Referring to fig. 4, fig. 4 is a flow chart of an embodiment of the testing method of the present application. The test method in this embodiment may include the steps of:
Step S41: extracting test data features from all the original corpus short sentences in the test case using a test data feature extraction method to obtain the test data feature corresponding to each original corpus short sentence. The test data feature extraction method is any one of the test data feature extraction methods described above.
The testing method is applied to a testing device, which may be a terminal or a server. The terminal may be a smart phone, a tablet computer, a computer, or the like. The server may be a single server or a server cluster consisting of several servers. The user may install the software program to be tested on the test terminal or upload it to the server, and the test terminal or server tests the software program to be tested using the testing method provided by the application.
Step S42: establishing an index value according to the test data feature corresponding to each original corpus short sentence, so as to classify all the original corpus short sentences in the test case by test data feature.
Step S43: testing all the original corpus short sentences in the test case according to the classification result.
Specifically, before the software program to be tested is tested, a corresponding test case can be created. However, because Chinese natural language can describe the same thing in many ways, and text written by different people varies considerably, the test case often contains several different original corpus short sentences that describe the same thing. Testing the software program with such a test case results in high repeated labor and low reuse. Therefore, the test data feature extraction method described above is used to extract test data features from all the original corpus short sentences in the test case to obtain the test data feature corresponding to each original corpus short sentence; an index value is then established from the test data feature corresponding to each original corpus short sentence, so that all the original corpus short sentences in the test case are classified by test data feature, and all the original corpus short sentences in the test case are tested according to the classification result, achieving low repeated labor and high reuse.
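A minimal sketch of steps S41 to S43, chaining the helpers from the earlier sketches, is given below; the run_test callback stands in for the concrete test automation script bound to each test data feature and is hypothetical.

```python
# Sketch of the test method: extract a test data feature for every original corpus
# short sentence, index the sentences by that feature, and test each group once.
from collections import defaultdict
import jieba

def classify_and_test(phrases: list[str], run_test) -> None:
    groups: dict[str, list[str]] = defaultdict(list)
    for phrase in phrases:
        keywords = filter_words(jieba.lcut(phrase))   # steps S121-S122
        sentence = to_feature_sentence(keywords)      # step S13
        groups[feature_id(sentence)].append(phrase)   # step S14 gives the index value
    for md5_value, same_meaning_phrases in groups.items():
        # Sentences sharing one MD5 value describe the same thing; test them once.
        run_test(md5_value, same_meaning_phrases)
```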
In the test data feature extraction method described above, a keyword set is formed by extracting the keywords of the original corpus short sentences, the keyword set is adjusted with a pre-trained word embedding model to obtain the vectorized feature sentences corresponding to the original corpus short sentences, and the vectorized feature sentences are then encrypted with a preset encryption algorithm to obtain the test data features corresponding to the original corpus short sentences. Original corpus short sentences that are worded differently but share the same actual semantics are thus mapped to the same test data feature, avoiding redundant construction of test data. In addition, the test method of the application performs software testing with the test data obtained by the test data feature extraction method, thereby achieving low repeated labor and high reuse.
In one application scenario, several original corpus short sentences are obtained: original corpus short sentence 1 asks whether a user has registered as a mind user, original corpus short sentence 2 asks whether a deer companion-play user has registered as a mind user, and original corpus short sentence 3 asks whether the user has registered as the mind user. Word segmentation is first performed on each original corpus short sentence to obtain the corresponding original word set: segmenting original corpus short sentence 1 yields words such as "user", "whether", "registered", and "mind user"; segmenting original corpus short sentence 2 yields "deer companion-play user", "whether", "registered", and "mind user"; and segmenting original corpus short sentence 3 yields the same words as sentence 1. Each original word set is then filtered to remove useless words, giving the keyword sets: sentence 1 yields "user | registered | mind user", sentence 2 yields "deer companion-play user | registered | mind user", and sentence 3 yields "user | registered | mind user". Because "deer companion-play user" is close to "user" in the embedding space, performing ontology word replacement on the keyword set of each original corpus short sentence yields the same vectorized feature sentence for all three: "deer companion-play user | registration | mind user". Since the vectorized feature sentences corresponding to the three original corpus short sentences are identical, encrypting them with Message-Digest Algorithm 5 necessarily yields the same test data feature. The three originally different original corpus short sentences are thus processed into the same test data feature, namely the MD5 value computed from the feature sentence "deer companion-play user | registration | mind user", and this MD5 value is used to establish an index value, so that the three original corpus short sentences are associated with the same test data feature; the classified test data features are then implemented by a concrete test automation script. In this way the differences between structured text test cases written in Chinese natural language by different people are flattened, a unique mapping is established between the test data features and Chinese phrases that are worded differently but share the same actual semantics, redundant construction of test data is avoided, and low repeated labor and high reuse are achieved.
Referring to fig. 5, fig. 5 is a schematic diagram of a frame of an apparatus for extracting test data features according to an embodiment of the application. The test data feature extraction device 50 includes: the corpus acquisition module 500 is used for acquiring original corpus short sentences in the test cases; the keyword extraction module 502 is configured to extract keywords of the original corpus short sentence, so as to form a keyword set; the vectorization module 504 is configured to adjust the keyword set by using a pre-trained word embedding model, so as to obtain a vectorized feature sentence corresponding to the original corpus short sentence; and the encryption module 506 is configured to encrypt the vectorized feature sentence by using a preset encryption algorithm, so as to obtain a test data feature corresponding to the original corpus short sentence.
In some embodiments, the keyword extraction module 502 extracts keywords of the original corpus short sentence to form a keyword set by: performing word segmentation on the original corpus short sentence to obtain an original word set; and filtering the original word set with a preset processing rule to obtain the keyword set.
In some embodiments, the vectorization module 504 adjusts the keyword set with the pre-trained word embedding model to obtain the vectorized feature sentence corresponding to the original corpus short sentence by: inputting the keyword set into the pre-trained word embedding model to obtain word vectors; and replacing each ontology word with its nearest adjacent word in the embedding space to obtain the vectorized feature sentence corresponding to the original corpus short sentence.
Referring to fig. 6, fig. 6 is a schematic diagram of a frame of an embodiment of a testing apparatus according to the present application. The test device 60 includes: the feature extraction module 600, configured to extract test data features from all the original corpus short sentences in the test case using a test data feature extraction method, so as to obtain the test data feature corresponding to each original corpus short sentence, the test data feature extraction method being any one of the test data feature extraction methods described above; the classification module 602, configured to establish an index value according to the test data feature corresponding to each original corpus short sentence, so as to classify all the original corpus short sentences in the test case by test data feature; and the testing module 604, configured to test all the original corpus short sentences in the test case according to the classification result.
Referring to fig. 7, fig. 7 is a schematic diagram of a frame of an electronic device according to an embodiment of the application. The electronic device 70 comprises a memory 71 and a processor 72 coupled to each other, the processor 72 being adapted to execute program instructions stored in the memory 71 to implement the steps of any one of the test data feature extraction method embodiments described above, or the steps of any one of the test method embodiments described above. In one particular implementation scenario, the electronic device 70 may include, but is not limited to, a microcomputer or a server.
Specifically, the processor 72 is configured to control itself and the memory 71 to implement the steps of any of the test data feature extraction method embodiments described above, or the steps of any of the test method embodiments described above. The processor 72 may also be referred to as a CPU (Central Processing Unit). The processor 72 may be an integrated circuit chip with signal processing capabilities. The processor 72 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 72 may be implemented jointly by integrated circuit chips.
Referring to fig. 8, fig. 8 is a schematic diagram illustrating a frame of an embodiment of a computer readable storage medium according to the present application. The computer readable storage medium 80 stores program instructions 800 that can be executed by a processor, the program instructions 800 being configured to implement the steps of any of the test data feature extraction method embodiments described above, or the steps of any of the test method embodiments described above.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over several network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Claims (8)

1. A method of testing, the method comprising:
extracting test data features from all original corpus short sentences in the test case by using a test data feature extraction method to obtain test data features corresponding to each original corpus short sentence;
establishing an index value according to the test data characteristics corresponding to each original corpus short sentence so as to classify all the original corpus short sentences in the test case according to different test data characteristics;
testing all original corpus short sentences in the test case according to the classification result;
the test data feature extraction method comprises the following steps:
acquiring original corpus short sentences in a test case;
extracting keywords of the original corpus short sentences to form a keyword set;
adjusting the keyword set by adopting a pre-trained word embedding model to obtain a vectorization characteristic sentence corresponding to the original corpus short sentence;
and encrypting the vectorization feature sentences by adopting a preset encryption algorithm to obtain the test data features corresponding to the original corpus short sentences.
2. The method according to claim 1, wherein the extracting keywords of the original corpus short sentence to form a keyword set includes:
word segmentation processing is carried out on the original corpus short sentences to obtain an original word set;
and filtering the original word set by adopting a preset processing rule to obtain the keyword set.
3. The test method of claim 2, wherein the preset processing rule includes at least one of a stop-word removal rule, a punctuation removal rule, and a digit removal rule.
4. The method according to claim 1, wherein the adjusting the keyword set by using the pre-trained word embedding model to obtain the vectorized feature sentence corresponding to the original corpus short sentence includes:
inputting the keyword set into the pre-trained word embedding model to obtain a word vector;
replacing each ontology word with its nearest adjacent word in the embedding space to obtain the vectorized feature sentence corresponding to the original corpus short sentence.
5. The test method according to claim 1, wherein the test case is a structured test case; and/or the preset encryption algorithm is a message digest algorithm.
6. A test device, comprising:
The feature extraction module is used for extracting test data features of all original corpus short sentences in the test case by using a test data feature extraction method to obtain test data features corresponding to each original corpus short sentence; wherein the test data feature extraction method is the test data feature extraction method of any one of claims 1 to 5;
the classification module is used for establishing an index value according to the test data characteristics corresponding to each original corpus short sentence so as to classify all the original corpus short sentences in the test case according to different test data characteristics;
and the testing module is used for testing all the original corpus short sentences in the test case according to the classification result.
7. An electronic device comprising a memory and a processor coupled to each other, the processor being configured to execute program instructions stored in the memory to implement the test method of any one of claims 1 to 5.
8. A computer readable storage medium having stored thereon program instructions, which when executed by a processor implement the test method of any of claims 1 to 5.
CN202110292100.6A 2021-03-18 2021-03-18 Test data feature extraction method, test method and related device Active CN113032253B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110292100.6A CN113032253B (en) 2021-03-18 2021-03-18 Test data feature extraction method, test method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110292100.6A CN113032253B (en) 2021-03-18 2021-03-18 Test data feature extraction method, test method and related device

Publications (2)

Publication Number Publication Date
CN113032253A CN113032253A (en) 2021-06-25
CN113032253B true CN113032253B (en) 2024-04-19

Family

ID=76471531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110292100.6A Active CN113032253B (en) 2021-03-18 2021-03-18 Test data feature extraction method, test method and related device

Country Status (1)

Country Link
CN (1) CN113032253B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113672522B (en) * 2021-10-25 2022-02-08 腾讯科技(深圳)有限公司 Test resource compression method and related equipment
CN114020643B (en) * 2021-11-29 2023-01-20 中国银行股份有限公司 Knowledge base testing method and device
CN116610592B (en) * 2023-07-20 2023-09-19 青岛大学 Customizable software test evaluation method and system based on natural language processing technology

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100005083A1 (en) * 2008-07-01 2010-01-07 Xerox Corporation Frequency based keyword extraction method and system using a statistical measure
CN101859273A (en) * 2009-04-07 2010-10-13 西门子(中国)有限公司 Method and device for generating test cases
CN107085581A (en) * 2016-02-16 2017-08-22 腾讯科技(深圳)有限公司 Short text classification method and device
CN109933779A (en) * 2017-12-18 2019-06-25 苏宁云商集团股份有限公司 User's intension recognizing method and system
CN111026671A (en) * 2019-12-16 2020-04-17 腾讯科技(深圳)有限公司 Test case set construction method and test method based on test case set
CN112163419A (en) * 2020-09-23 2021-01-01 南方电网数字电网研究院有限公司 Text emotion recognition method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN113032253A (en) 2021-06-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant