CN113591998A

CN113591998A - Method, device, equipment and storage medium for training and using classification model

Info

Publication number: CN113591998A
Application number: CN202110887260.5A
Authority: CN
Inventors: 张敬来; 贾欢欢
Original assignee: Shanghai Pudong Development Bank Co Ltd
Current assignee: Shanghai Pudong Development Bank Co Ltd
Priority date: 2021-08-03
Filing date: 2021-08-03
Publication date: 2021-11-02

Abstract

The invention discloses a method, a device, equipment and a storage medium for training and using a classification model, and belongs to the technical field of software automation test and natural language processing. The method comprises the following steps: inputting a sample text into a feature model to obtain word vectors of words in the sample text; processing word vectors of words, disturbance vectors of words and position characteristics of words through an initial model to obtain a classification prediction result of the sample text; wherein the initial model comprises at least one two-layer bidirectional LSTM, an attention mechanism layer and a classification layer; determining a target classification loss value according to the classification prediction result of the sample text and the supervision data of the sample text; and training the initial model according to the target classification loss value to obtain a target classification model. By the technical scheme, software test results can be automatically classified, the labor force of testers in retesting is reduced, and the software development and test efficiency and quality are improved.

Description

Method, device, equipment and storage medium for training and using classification model

Technical Field

The invention relates to the technical field of software automation test and natural language processing, in particular to a method, a device, equipment and a storage medium for training and using a classification model.

Background

With the popularization of the DevOps concept in Internet companies, continuous integration and continuous testing also become the mainstream trend of project testing at present, and various defects and risks related to latest software development are analyzed through continuous automatic testing, so that rapid feedback is provided for developers, and the project delivery quality is improved.

In the process of software automatic testing, testers can design automatic testing cases of related functions, the cases are executed through a testing platform, success and failure information of testing results are fed back, and whether the failure information causes are defects of software functions or not is judged manually. The automatic test usually carries out repeated regression test, the repeated test depends on a large amount of labor cost, the retest and the analysis of test results are carried out, and the subjective experience has errors and influences the test quality and the test efficiency.

At present, a platform manually adds error information into an information base each time based on a test result to gradually expand an error information resource base, but the method is not predictive, and newly encountered error information cannot be judged. Therefore, how to automatically classify the software test results has important practical value for software test and software development.

Disclosure of Invention

The invention provides a method, a device, equipment and a storage medium for training and using a classification model so as to realize automatic classification of software test results.

In a first aspect, an embodiment of the present invention provides a method for training a classification model, including:

inputting a sample text into a feature model to obtain word vectors of words in the sample text; wherein the feature model comprises a two-layer bidirectional LSTM; the sample text is a historical software test result, and the type of the historical software test result comprises an environmental problem, a data problem, a business problem, a code problem and a case problem;

processing word vectors of words, disturbance vectors of words and position characteristics of words through an initial model to obtain a classification prediction result of the sample text; wherein the initial model comprises at least one two-layer bidirectional LSTM, an attention mechanism layer and a classification layer;

determining a target classification loss value according to the classification prediction result of the sample text and the supervision data of the sample text;

and carrying out countermeasure training on the initial model according to the target classification loss value to obtain a target classification model.

In a second aspect, an embodiment of the present invention further provides a method for using a classification model, including:

determining word vectors of words in the target text; the target file is a software test result to be classified;

inputting word vectors of words in a target text into a target classification model to obtain a classification prediction result of the target text; wherein the target classification model is obtained by a training method of the classification model according to any one of claims 1 to 7.

In a third aspect, an embodiment of the present invention further provides a training apparatus for a classification model, including:

the word vector determining module is used for inputting the sample text into the feature model to obtain word vectors of words in the sample text; wherein the feature model comprises a two-layer bidirectional LSTM; the sample text is a historical software test result, and the type of the historical software test result comprises an environmental problem, a data problem, a business problem, a code problem and a case problem;

the prediction result determining module is used for processing word vectors of words, disturbance vectors of words and position characteristics of the words through the initial model to obtain a classification prediction result of the sample text; wherein the initial model comprises at least one two-layer bidirectional LSTM, an attention mechanism layer and a classification layer;

the classification loss value determining module is used for determining a target classification loss value according to the classification prediction result of the sample text and the supervision data of the sample text;

and the classification model determining module is used for carrying out countermeasure training on the initial model according to the target classification loss value to obtain a target classification model.

In a fourth aspect, an embodiment of the present invention further provides a device for using a classification model, including:

the word vector determining module is used for determining word vectors of words in the target text; the target file is a software test result to be classified;

the prediction result determining module is used for inputting word vectors of words in the target text into the target classification model to obtain a classification prediction result of the target text; the target classification model is obtained by the training method of the classification model provided in the first aspect.

In a fifth aspect, an embodiment of the present invention further provides an electronic device, including:

one or more processors;

a memory for storing one or more programs;

when executed by the one or more processors, cause the one or more processors to implement a method of training a classification model as provided in the first aspect, or to implement a method of using a classification model as provided in the second aspect.

In a sixth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements a training method of the classification model provided in the first aspect, or implements a using method of the classification model provided in the second aspect.

According to the technical scheme of the embodiment of the invention, word vectors of words in a sample text are obtained by inputting the sample text into a feature model, then the word vectors of the words, disturbance vectors of the words and position features of the words are processed through an initial model to obtain a classification prediction result of the sample text, a target classification loss value is determined according to the classification prediction result of the sample text and supervision data of the sample text, and then countermeasure training is carried out on the initial model according to the target classification loss value to obtain a target classification model. According to the technical scheme, the word vectors of the words, the disturbance vectors of the words and the position characteristics of the words are input into the model together, long-distance deep text characteristic information can be learned, and the classification accuracy of the text is improved; meanwhile, software test results can be automatically classified, the labor force of retest of testers is reduced, and the software development and test efficiency and quality are improved.

Drawings

FIG. 1A is a flowchart of a method for training a classification model according to an embodiment of the present invention;

FIG. 1B is a diagram of a single LSTM unit according to an embodiment of the present invention;

FIG. 2A is a flowchart of a method for training a classification model according to a second embodiment of the present invention;

FIG. 2B is a model diagram of a feature model and a classification model according to a second embodiment of the present invention;

FIG. 3 is a flowchart of a method for using a classification model according to a third embodiment of the present invention;

fig. 4 is a schematic structural diagram of a training apparatus for classification models according to a fourth embodiment of the present invention;

fig. 5 is a schematic structural diagram of an apparatus for using a classification model according to a fifth embodiment of the present invention;

fig. 6 is a schematic structural diagram of an electronic device according to a sixth embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

Example one

Fig. 1A is a flowchart of a method for training a classification model according to an embodiment of the present invention, where the method is applicable to a situation where a software measurement result is predicted and determined, and the method may be executed by a device for training a classification model, where the device may be implemented by software and/or hardware, and may be integrated in an electronic device, such as a server, that carries a function of training a classification model.

As shown in fig. 1A, the method may specifically include:

s110, inputting the sample text into the feature model to obtain word vectors of words in the sample text; the sample text is a historical software test result, and the types of the historical software test result comprise an environmental problem, a data problem, a business problem, a code problem and a case problem.

In this embodiment, the sample text is a historical software test result, may be short text data of a software development error code of the processed mobile software, and may also be short text data prompted by a failure result of the processed automated testing platform; further, the sample text can be divided into five categories of environmental problems, data problems, business problems, code problems, and case problems.

The environment problems include network abnormality, system problems, service failures and the like, such as network connection failure, background system return abnormality, proxy service call abnormality, communication timeout and the like; the data problems comprise format problems, time abnormity, data errors and the like, for example, the data problems are not in the transaction time currently, the collection account number does not conform to the required format, the dynamic password is input wrongly, the mobile phone number or the password is invalid and the like; the business problem is a business logic error, for example, the signing state is abnormal and cannot be modified, the product cannot perform the transaction at present, and the account is not passed risk verification and is not supported temporarily; code problems are program development problems that result in test failures, such as An error occurred, database table insertion errors! Com.csii.pe.communication.except; the case problem is that errors in case design result in execution failures, such as inability to locate text boxes, checkpoint "fixes" do not exist, inability to locate pictures, and the like.

In this embodiment, the feature model may be a charCNN model or a word2vec model. Correspondingly, the sample text can be processed based on the charCNN model to obtain word vectors of words in the sample text. Further, the sample text can be processed based on the word2vec model, and word vectors of words in the sample text are obtained.

And S120, processing the word vectors of the words, the disturbance vectors of the words and the position characteristics of the words through the initial model to obtain a classification prediction result of the sample text.

Here, the term "disturbance vector" refers to a vector of an interfering word corresponding to a word. Optionally, the interfering word corresponding to the word may be determined according to the word, and then the interfering word is processed according to a charCNN model or a word2vec model, so as to obtain a word vector of the interfering word, that is, a disturbance vector of the word.

Optionally, the expectation and variance of the sample text may be determined according to the word vector of the word and the word frequency of the word, and the perturbation vector of the word may be determined according to the expectation and variance of the sample text and the word. Specifically, the expectation and variance of the sample text are determined according to the word vector and the word frequency of each word in the sample text, and may be determined by the following formula:

wherein, w_kWord vector representing the kth word in the sample text, f_kRepresents the word frequency of the kth word in all text samples, e (w) represents the expectation of sample text, var (w) represents the variance of sample text, and N represents the number of words in each sample text.

Furthermore, the perturbation vector of the word is determined according to the expectation and variance of the sample text and the word, and may be determined by the following formula, for example:

wherein, w'_kPerturbation vector representing the k-th word, w_kA word vector representing the k-th word.

It can be understood that the training performance of the model can be improved by adding the disturbance vector of the word to carry out the countermeasure training, and then the defense capability of the model to the interference sample is improved.

The position characteristics refer to the position characteristics of each word in the words, and optionally, the words in the words can be coded according to the length of the words to obtain the position characteristics of the words. Specifically, the length of each word is determined, if the length is 1, the part is a single word, and the part is identified by the number "0"; if the length is greater than 1, the part is a word, the first word of the word is identified by the number "1", the last word by the number "3", and the remainder by the number "2". For example, if a word is a length of 1, the location characteristic of the word is 0; if the word is the word with the length of 2, the position characteristic of the word is 13; if it is a word of length 4, the positional characteristic of the word is 1223.

It can be understood that introducing the position feature into the word vector can provide richer feature information to improve the classification accuracy.

It should be noted that the word vector of the word, the perturbation vector of the word, and the vector dimension of the position feature of the word are the same.

The initial model includes at least two layers of Long Short-Term Memory (LSTM), attention mechanism layer and classification layer. Wherein the attention mechanism layer is to weight the feature representation of the bidirectional LSTM output; the classification layer is used for classifying the weighted feature representation; the bidirectional LSTM is used for predicting words, and in order to predict the words more accurately in combination with context, the bidirectional LSTM is respectively a forward LSTM and a backward LSTM, wherein the forward LSTM is used for predicting the context according to the context, namely the k < th > word is predicted by using the k < th > words; backward LSTM is used to predict the context from the context, i.e., using the words after k to predict the k-th word.

The forward LSTM layer and the backward LSTM layer both comprise two layers of LSTMs, so that the problem that a single layer of LSTM lacks the following semantic information is solved; further, each LSTM layer corresponds to an output layer that can predict the complete past and future context information at each time in the text sequence. As shown in FIG. 1B, LSTM has three control gates designed to implement a controllable memory neural unit. As shown in formula (1), the forgetting gate determines which information output at the last moment needs to be discarded, and outputs a value between 0 and 1 by using a Sigmoid function, wherein 1 represents complete retention, and 0 represents complete rejection; as shown in equations (2) and (3), the input gate determines which information is stored in the memory unit, and includes two parts: one part is a Sigmoid layer, which decides what value is to be updated; the other part is a tanh layer, and a new candidate value vector is created; as shown in equation (4), the forgetting gate obtains the information f that needs to be discarded_kThe input gate obtains the value i to be updated_kThen, the cell state is changed from C_k-1Update to C_k(ii) a As shown in the formulas (5) and (6), the output gate determines which information can be output, the Sigmoid function is used for regularizing the information into a weight value between 0 and 1, the tanh function is used for transforming the memory value into a numerical value between-1 and 1, and then the numerical value is compared with the output value o of the output gate_kThe product is multiplied to obtain the final output expression.

f_k＝σ(W_f·[h_k-1,w_k]+b_f) (1)

i_k＝σ(W_i·[h_k-1,w_k]+b_i) (2)

o_k＝σ(W_o·[h_k-1,w_k]+b_o) (5)

h_k＝o_k*tanh(C_k) (6)

Wherein, w_kExpressed as a word vector input at time k, h_k-1Expressed as the output of LSTM at time k-1, σ represents the Sigmoid function, W_f、W_i、W_o、W_cRespectively a forgetting gate, an input gate, an output gate, an updated weight vector, b_f、b_i、b_c、b_oIs its corresponding offset vector. Under the action of three control door switches, the LSTM can solve the problem of long-distance dependence and solve the problems of gradient disappearance and gradient explosion to a certain extent. Output vector of forward LSTM

And the output vector of the inverse LSTM

Is calculated in a manner similar to that of a single LSTM, with the final output vector being a combination of the forward output vector and the backward output vector.

Optionally, the word vectors of the words, the perturbation vectors of the words, and the position features of the words are combined and spliced, and input into the bidirectional LSTM to obtain feature representations of the words in the sample text, and then the feature representations of the words in the sample text are input into the attention mechanism layer, and different weights are assigned to the feature representations of the words in the sample text at the attention mechanism layer to obtain the optimized feature representation. The optimized feature representation can be determined, for example, by the following formula:

u_k＝tanh(W_sh_k+b_s)

wherein h is_kIs a characteristic representation of time k, W, obtained by bi-directional LSTM learning_s、b_s、u_sIs a parameter of the attention mechanism; u. of_kIs h_kHidden layer representation obtained by a simple neural network layer; a is_kIs h_kThe scoring weight of the contribution degree of the corresponding feature words to the distinguishing text categories; y is_kAnd the feature vector of the final text information is the optimized feature representation.

It can be understood that compared with the traditional method of directly summing up the updated output vectors corresponding to each moment of the LSTM layer to obtain an average value, the default is that each feature vocabulary has the same contribution degree for distinguishing the text categories; according to the method, the unimportant words are set to be smaller in weight ratio, and the words with strong category distinguishing capability are larger in weight ratio, so that an attention mechanism is introduced, different proportional weights are divided for regions with different contribution amounts of the current tasks, the expression of the words with the category distinguishing capability is enhanced, the influence of redundant features on text classification is weakened, information which is more important to classification is screened out from a large amount of information, and the accuracy of classification is improved.

And then inputting the optimized feature representation into a classification layer for text classification. Wherein the classification layer can adopt a Softmax classifier as follows:

wherein x is input text, theta is parameter learned by model training, y is classification mark, and p (y) is finally output by the model_i| x, θ) the category for which the probability value is greatest, i ∈ {1, 2, 3, …, n }, n representing the number of categories of text.

And S130, determining a target classification loss value according to the classification prediction result of the sample text and the supervision data of the sample text.

In this embodiment, a cross entropy loss function is adopted, and a target classification loss value is determined according to a classification prediction result of a sample text and supervision data of the sample text.

And S140, performing countermeasure training on the initial model according to the target classification loss value to obtain a target classification model.

In this embodiment, the initial model is continuously subjected to iterative optimization according to the target classification loss value, until the target classification loss value reaches the minimum value, the iterative optimization is stopped, and the model when the iteration is stopped is used as the target classification model.

On the basis of the technical scheme, because the original text often contains a large amount of noise including information such as non-text data, labels and special symbols, in order to save storage space, optimize operation efficiency and accurately represent subsequent word segmentation and word vectors, effective filtering is performed to eliminate interference on text data information mining, and as an optional mode of the embodiment of the invention, the original text can be processed by adopting a data filtering algorithm and/or a word segmentation algorithm to obtain a sample text.

For example, the useless information in the original text can be removed by adopting a regular expression matching mode, the regular expression in the text uses a re module in Python, and the symbols in the original text are removed by adopting a re.sub () function.

For another example, a Jieba open source library can be selected to perform word segmentation on the original text, and the Jieba word segmentation system supports three word segmentation modes: precision mode, full mode, and search engine mode. Because the full mode has the word ambiguity problem, the search engine mode has large calculation amount and occupies more resources, and the accurate mode can segment the text most accurately, segment the original text into word sequences with independent semantic information and is suitable for the text analysis task. Therefore, the invention selects the accurate mode of the Jieba word segmentation system to perform word segmentation processing on the filtered original text according to the actual business requirements, the key function is Jieba.

For another example, in order to save storage space and improve calculation efficiency, word-stop processing is performed after word segmentation preprocessing of the text, and words or characters which have high frequency but no practical significance and have no influence on the text analysis result are filtered. The invention eliminates the words such as number words, prepositions, auxiliary words and the like which are useless for the emotion analysis of the text by utilizing the work-in-the-air stop word list and the Baidu stop word list.

It can be understood that invalid data and redundant data in the original text are filtered, and the original text is subjected to normalized processing, so that the storage space can be saved, the calculation efficiency can be improved, and the classification accuracy of the subsequent text can be further improved.

On the basis of the above embodiment, as an optional way of the embodiment of the present invention, the visualization of the text classification result and the implementation of the functional interface may be further performed, the text classification result obtained from the data processing layer is received, the classification result is visualized according to the category, the classification function is packaged as an interface device, and the execution result of the corresponding module can be directly obtained by calling the interface for subsequent use and integration.

Optionally, for the category distribution visualization, statistics of distribution conditions, such as frequency, and the like, of each text in each classification category is performed, and the statistical result is visualized by using a histogram and a pie chart. Specifically, the characteristics and the future trend of data can be visually represented by visualizing the text classification result, and decision support is provided for a user. After the text is classified, the classification result is stored in a marked text corpus by structured data, the classification result is inquired from the corpus, a visual chart is drawn by utilizing an echarts.

Optionally, for the functional interface, the parameter acquisition related function of the interface is firstly realized, then the data reading and query related functions are realized, and finally the format of the returned data is specified, so that the data output related function is realized. Specifically, the text classification result output interface receives a text character string input by a user and adds a parameter to indicate the type of input content. If the user inputs a text character string, the text character string is converted into word vectors of words in the text by the method provided by the embodiment, then the probability distribution of each category is obtained by using the classification model, the output probability distribution is a vector in a character string array form, and the vector is converted into a character string array and then output. In order to call the functional interface and unify the data structure in other parts subsequently, the interface adopts HyperText Transfer Protocol (http) to send a cross-domain request to acquire data, and the acquired data is packaged into a json form and then output.

Example two

Fig. 2A is a flowchart of a training method of a classification model according to a second embodiment of the present invention, which is optimized based on the second embodiment and provides an alternative implementation, and fig. 2B is a model diagram of a feature model and a classification model according to the second embodiment of the present invention, where the feature model includes a two-layer bi-directional LSTM, the classification model includes a two-layer bi-directional LSTM, an attention mechanism layer and a classification layer, where the classification layer employs a softmax classifier, and optionally, a sample text sequence { x } of an input layer is input into the classification model₁，x₂，x₃，…，x_kInputting the words into the feature model to obtain word vectors (w) of the words₁，w₂，w₃，…，w_kDetermining a perturbation vector corresponding to each word vector; then inputting the word vector and the position feature into the bidirectional LSTM, inputting the output result of the bidirectional LSTM into an attention mechanism layer, and assigning a weight { a ] to the output result of the bidirectional LSTM by the attention mechanism layer₁，a₂，a₃，…，a_kAnd inputting the output result of the attention mechanism layer into a classification layer for classification. Optionally, as shown in fig. 2A, the method may specifically include:

s210, inputting the sample text into the feature model to obtain word vectors of words in the sample text; wherein the feature model comprises a two-layer bidirectional LSTM; the sample text is a historical software test result, and the types of the historical software test result comprise an environmental problem, a data problem, a business problem, a code problem and a case problem.

In this embodiment, optionally, the sample text may be input into the feature model to obtain word vectors of words in the sample text; as shown in fig. 2B, the feature model includes two layers of bidirectional LSTM, namely forward LSTM and backward LSTM, wherein the forward LSTM is used for predicting the context according to the context, i.e. using the k-1 words to predict the k-th word; backward LSTM is used to predict the context from the context, i.e., using the words after k to predict the k-th word. For example, for a piece of text { x ] with a total length N₁，x₂，…，x_k，…，x_N}，x_kRepresenting the word currently to be predicted in the text, as follows:

wherein, formula (7) is a backward language model, formula (8) is a forward language model, the forward language model and the backward language model are combined by the bidirectional LSTM to maximize the likelihood function of the bidirectional language model, and formula (9) is the likelihood function of the bidirectional language model:

wherein

And

parameters of the forward LSTM layer and the backward LSTM layer.

It should be noted that the feature model is trained in advance by using a Wiki encyclopedia Chinese corpus as a reference lexicon; and when the word vectors of the words in the sample text are determined, the sample text is input into the feature model after fine tuning through fine tuning of the feature model, so that the word vectors of the words in the sample text are obtained.

It can be understood that the dynamic word vector corresponding to the context text information, i.e. the word vector of the word, can be obtained by processing the sample text through the pre-trained bi-directional LSTM.

And S220, processing the word vectors of the words, the disturbance vectors of the words and the position characteristics of the words through the initial model to obtain a classification prediction result of the sample text.

As shown in fig. 2B, the initial model includes a two-layer bidirectional LSTM, an attention mechanism layer and a classification layer, wherein the two-layer bidirectional LSTM is a forward LSTM and a backward LSTM, respectively, wherein the forward LSTM is used for predicting the context according to the context, i.e., using the k-1 words to predict the kth word; backward LSTM is used to predict the context from the context, i.e., using words after k to predict the kth word; the attention mechanism layer is used to weight the feature representation of the bi-directional LSTM output; the classification layer is used to classify the weighted feature representations.

In this embodiment, the word vectors and the position features are processed through the initial model to obtain a first classification prediction result of the sample text. Specifically, the word vectors and the position features are correspondingly combined and then input into the initial model to obtain a first classification prediction result of the sample text.

Meanwhile, the disturbance vector and the position feature are processed through the initial model, and a second classification prediction result of the sample text is obtained. Specifically, the disturbance vector and the position feature are correspondingly combined and then input into the initial model, so that a second classification prediction result of the sample text is obtained.

And S230, determining a target classification loss value according to the classification prediction result of the sample text and the supervision data of the sample text.

In this embodiment, based on the cross entropy loss function, a first classification loss value is determined according to the first classification prediction result and the supervision data, a second classification loss value is determined according to the second classification prediction result and the supervision data, a target classification loss value is further determined according to the first classification loss value and the second classification loss value, and specifically, an average value of the first classification loss value and the second classification loss value is used as the target classification loss value. This can be determined, for example, by the following equation:

loss₁＝-logp(y|w_k)

loss_adv＝-logp(y|w_k')

therein, loss₁Is the first classification loss value, loss, of the model when the input layer is not disturbed_advIs the second classification loss value of the model after the disturbance of the embedding layer, loss₁And loss_advAnd averaging to calculate the target classification loss.

It can be understood that the thought of the antagonistic training is adopted, and the error sample (namely the disturbance sample) is generated and added into the training process, so that the defense capability of the antagonistic sample is improved, and the generalization capability and the robustness of the model are better.

And S240, performing countermeasure training on the initial model according to the target classification loss value to obtain a target classification model.

EXAMPLE III

Fig. 3 is a flowchart of a method for using a classification model according to a third embodiment of the present invention, where this embodiment is applicable to a case where a software test result is predicted and judged, and the method may be executed by a device for using a classification model, where the device may be implemented by software and/or hardware, and may be integrated in an electronic device, such as a server, that carries a function of using the classification model.

As shown in fig. 3, the method may specifically include:

s310, determining word vectors of words in the target text; the target file is a software test result to be classified.

The target text refers to a software test result text which needs to be predicted and judged, namely a software test result to be classified.

In this embodiment, the target text may be input into a pre-trained feature model to obtain word vectors of words in the target text. Wherein the feature model may comprise a two-layer bi-directional LSTM.

And S320, inputting the word vectors of the words in the target text into the target classification model to obtain a classification prediction result of the target text.

The target classification model is obtained by the training method of the classification model provided by any one of the above embodiments.

In this embodiment, word vectors of words in the target text are input into the target classification model, and are processed by the target classification model to obtain a classification prediction result of the target text.

According to the technical method, the word vectors of the words in the target text are determined, and then the word vectors of the words in the target text are input into the target classification model, so that the classification prediction result of the target text is obtained. According to the technical scheme, the software test results can be automatically classified, the labor force of retesting of testers is reduced, and the software development and test efficiency and quality are further improved.

Example four

Fig. 4 is a schematic structural diagram of a training apparatus for a classification model according to a fourth embodiment of the present invention, which is applicable to a situation where a software measurement result is predicted and determined, and which may be implemented in a software and/or hardware manner and may be integrated into an electronic device, such as a server, that carries a training function of a classification model.

As shown in fig. 4, the apparatus may specifically include a word vector determination module 410, a prediction result determination module 420, a classification loss value determination module 430, and a classification model determination module 440, wherein,

a word vector determining module 410, configured to input the sample text into the feature model to obtain word vectors of words in the sample text; wherein the feature model comprises a two-layer bidirectional LSTM; the sample text is a historical software test result, and the types of the historical software test result comprise an environmental problem, a data problem, a business problem, a code problem and a case problem;

the prediction result determining module 420 is configured to process the word vectors of the words, the disturbance vectors of the words, and the position features of the words through the initial model to obtain a classification prediction result of the sample text; wherein the initial model comprises at least two layers of bidirectional LSTM, an attention mechanism layer and a classification layer;

a classification loss value determining module 430, configured to determine a target classification loss value according to a classification prediction result of the sample text and supervision data of the sample text;

and the classification model determining module 440 is configured to perform countermeasure training on the initial model according to the target classification loss value to obtain a target classification model.

Further, the prediction result determining module 420 is specifically configured to:

processing the word vectors and the position characteristics through an initial model to obtain a first classification prediction result of the sample text;

and processing the disturbance vector and the position characteristic through the initial model to obtain a second classification prediction result of the sample text.

Further, the classification loss value determining module 430 is specifically configured to:

determining a first classification loss value according to the first classification prediction result and the supervision data;

determining a second classification loss value according to the second classification prediction result and the supervision data;

and determining a target classification loss value according to the first classification loss value and the second classification loss value.

Further, the apparatus further includes a perturbation vector determination module, which is specifically configured to:

determining expectation and variance of a sample text according to the word vector and the word frequency of the word;

and determining the perturbation vector of the word according to the expectation and the variance of the sample text and the word.

Further, the apparatus further includes a location feature determination module, which is specifically configured to:

and coding the characters in the words according to the length of the words to obtain the position characteristics of the words.

Further, the apparatus further includes a sample text determination module, which is specifically configured to:

and processing the original text by adopting a data filtering algorithm and/or a word segmentation algorithm to obtain a sample text.

The training device of the classification model can execute the training method of the classification model provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.

EXAMPLE five

Fig. 5 is a schematic structural diagram of a device for using a classification model according to a fifth embodiment of the present invention, where this embodiment is applicable to predicting and determining a software test result, and the device may be implemented in a software and/or hardware manner, and may be integrated in an electronic device, such as a server, that carries a use function of the classification model.

As shown in fig. 5, the apparatus may specifically include a word vector determination module 510 and a prediction result determination module 520, wherein,

a word vector determination module 510, configured to determine word vectors of words in the target text; the target file is a software test result to be classified;

a prediction result determining module 520, configured to input word vectors of words in the target text into the target classification model, so as to obtain a classification prediction result of the target text; the target classification model is obtained by the training method of the classification model provided by any one of the embodiments.

The using device of the classification model can execute the using method of the classification model provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the executing method.

EXAMPLE six

Fig. 6 is a schematic structural diagram of an electronic device according to a sixth embodiment of the present invention, and fig. 6 shows a block diagram of an exemplary device suitable for implementing the embodiment of the present invention. The device shown in fig. 6 is only an example and should not bring any limitation to the function and the scope of use of the embodiments of the present invention.

As shown in FIG. 6, electronic device 12 is embodied in the form of a general purpose computing device. The components of electronic device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.

Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, micro-channel architecture (MAC) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.

Electronic device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by electronic device 12 and includes both volatile and nonvolatile media, removable and non-removable media.

The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)30 and/or cache memory (cache 32). The electronic device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 6, and commonly referred to as a "hard drive"). Although not shown in FIG. 6, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. System memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.

A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in system memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of embodiments described herein.

Electronic device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with electronic device 12, and/or with any devices (e.g., network card, modem, etc.) that enable electronic device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, the electronic device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via the network adapter 20. As shown, the network adapter 20 communicates with other modules of the electronic device 12 via the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with electronic device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

The processing unit 16 executes programs stored in the system memory 28 to execute various functional applications and data processing, such as a training method of the classification model or a use method of the classification model provided by the embodiment of the present invention.

EXAMPLE seven

The seventh embodiment of the present invention further provides a computer-readable storage medium, on which a computer program (or referred to as computer-executable instructions) is stored, where the computer program is used for executing the training method of the classification model or the using method of the classification model provided in the embodiments of the present invention when the computer program is executed by a processor.

Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for embodiments of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C + +, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the embodiments of the present invention have been described in more detail through the above embodiments, the embodiments of the present invention are not limited to the above embodiments, and many other equivalent embodiments may be included without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

1. A training method of a classification model is characterized by comprising the following steps:

2. The method of claim 1, wherein the processing, through the initial model, the word vector of a word, the perturbation vector of a word, and the position feature of a word to obtain the classification prediction result of the sample text comprises:

processing the word vector and the position feature through the initial model to obtain a first classification prediction result of the sample text;

3. The method of claim 2, wherein determining a target classification loss value based on the classification prediction result of the sample text and the supervised data of the sample text comprises:

4. The method of claim 1, further comprising:

determining expectation and variance of a sample text according to the word vector of the word and the word frequency of the word;

determining a perturbation vector for the word based on the expectation and variance of the sample text and the word.

5. The method of claim 1, further comprising:

6. The method of claim 1, further comprising:

and processing the original text by adopting a data filtering algorithm and/or a word segmentation algorithm to obtain the sample text.

7. A method for using a classification model, comprising:

inputting word vectors of words in a target text into a target classification model to obtain a classification prediction result of the target text; wherein the target classification model is obtained by the training method of the classification model according to any one of claims 1 to 7.

8. A training device for classification models, comprising:

9. An apparatus for using a classification model, comprising:

the prediction result determining module is used for inputting word vectors of words in the target text into the target classification model to obtain a classification prediction result of the target text; wherein the target classification model is obtained by a training method of the classification model according to any one of claims 1 to 7.

10. An electronic device, comprising:

one or more processors;

a memory for storing one or more programs;

when executed by the one or more processors, cause the one or more processors to implement a method of training the classification model of any one of claims 1-6, or a method of using the classification model of claim 7.

11. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out a method of training a classification model according to any one of claims 1 to 6 or a method of using a classification model according to claim 7.