CN115994225B - Text classification method and device, storage medium and electronic equipment

Info

Publication number
CN115994225B
Authority
CN
China
Prior art keywords
sample
preset
target
candidate
parameter vector
Prior art date
Legal status
Active
Application number
CN202310273838.7A
Other languages
Chinese (zh)
Other versions
CN115994225A (en)
Inventor
苏海波
李霖枫
杜晓梦
刘译璟
Current Assignee
Beijing Percent Technology Group Co ltd
Original Assignee
Beijing Percent Technology Group Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Percent Technology Group Co ltd
Priority to CN202310273838.7A
Publication of CN115994225A
Application granted
Publication of CN115994225B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The disclosure relates to a text classification method and device, a storage medium and electronic equipment, in the technical field of computers. The method comprises the following steps: acquiring a target text; obtaining target input data according to the target text and a target classification template, wherein the target classification template comprises a target parameter vector and a target natural language template, the target parameter vector is obtained by training a first preset network model according to first training sample data, the first training sample data is sample data marked with categories, and the first preset network model comprises a preset parameter vector and a preset classification model; and inputting the target input data into a preset target text classification model to obtain the target text category output by the target text classification model, wherein the target text classification model is obtained by training a second preset network model according to second training sample data, the second training sample data is sample data of unlabeled categories, and the second preset network model comprises the target parameter vector and the preset classification model.

Description

Text classification method and device, storage medium and electronic equipment
Technical Field
The disclosure relates to the field of computer technology, and in particular to a text classification method and device, a storage medium and electronic equipment.
Background
Aiming at the multi-classification problem, the existing typical small-sample learning methods include the natural language template method PET (Pattern-Exploiting Training) and the parameter vector template method P-Tuning, both of which train a corresponding model on a labeled sample data set. The natural language template method PET requires manual template construction, and different templates differ considerably in effectiveness, while the template learned by the parameter vector template method P-Tuning lacks interpretability. In addition, both methods perform model training only on the labeled sample data set and cannot make full use of a large amount of unlabeled sample data.
Disclosure of Invention
The disclosure aims to provide a text classification method and device, a storage medium and electronic equipment, so as to improve the accuracy of text classification.
According to a first aspect of embodiments of the present disclosure, there is provided a method of classifying text, the method comprising:
acquiring a target text;
obtaining target input data according to the target text and a target classification template, wherein the target classification template comprises a target parameter vector and a target natural language template, the target parameter vector is obtained by training a first preset network model according to first training sample data, the first training sample data is sample data marked with categories, and the first preset network model comprises a preset parameter vector and a preset classification model;
inputting the target input data into a preset target text classification model to obtain a target text class output by the target text classification model, wherein the target text classification model is obtained by training a second preset network model according to second training sample data, the second training sample data is sample data of unlabeled classes, and the second preset network model comprises the target parameter vector and the preset classification model.
Optionally, the first training sample data includes at least one preset classification template; the target parameter vector and the target text classification model are determined by:
training the first preset network model according to the first training sample data aiming at each preset classification template to obtain candidate parameter vectors corresponding to the preset classification templates;
training a standby network model corresponding to the candidate parameter vector according to the second training sample data aiming at each candidate parameter vector to obtain a candidate text classification model corresponding to the candidate parameter vector, wherein the standby network model comprises the candidate parameter vector and the preset classification model;
and determining the target parameter vector and the target text classification model from the candidate parameter vector and the candidate text classification model according to a preset verification data set, wherein the preset verification data set comprises a sample verification text and a sample verification category corresponding to the sample verification text.
Optionally, the first training sample data includes first sample input data and a first sample category corresponding to the first sample input data; training the first preset network model according to the first training sample data, and obtaining candidate parameter vectors corresponding to the preset classification templates comprises the following steps:
training the first preset network model according to the first sample input data and the first sample category to obtain the candidate parameter vector.
Optionally, the first sample input data includes the preset classification template and a first sample text, and the preset classification template includes a preset parameter vector and a preset natural language template; training the first preset network model according to the first sample input data and the first sample category, and obtaining the candidate parameter vector includes:
obtaining the first sample input data according to the first sample text and the preset classification template;
and taking the first sample input data as the input of the first preset network model, taking the first sample class as the output of the first preset network model, and training the first preset network model to obtain the candidate parameter vector.
Optionally, the second training sample data includes second sample input data and second sample output data corresponding to the second sample input data; training the standby network model corresponding to the candidate parameter vector according to the second training sample data, and obtaining a candidate text classification model corresponding to the candidate parameter vector includes:
and training the standby network model according to the second sample input data and the second sample output data aiming at each candidate parameter vector to obtain the candidate text classification model.
Optionally, the second sample input data includes a candidate classification template and a second sample text, the candidate classification template including the candidate parameter vector and the preset natural language template; the second sample output data is a text extracted from a preset sample text, and the second sample text is a text obtained after the second sample output data is extracted from the preset sample text; training the standby network model according to the second sample input data and the second sample output data to obtain the candidate text classification model comprises the following steps:
obtaining the second sample input data according to the second sample text and the candidate classification template;
and taking the second sample input data as the input of the standby network model, taking the second sample output data as the output of the standby network model, and training the standby network model to obtain the candidate text classification model.
Optionally, the determining the target parameter vector and the target text classification model from the candidate parameter vector and the candidate text classification model according to a preset verification data set includes:
for each candidate parameter vector, obtaining verification input data according to the sample verification text and the candidate classification template corresponding to the candidate parameter vector;
taking the verification input data as the input of the candidate text classification model corresponding to the candidate parameter vector, and obtaining a target verification category output by the candidate text classification model;
determining the classification accuracy of each candidate network model according to the target verification category and the sample verification category, wherein the candidate network model comprises the candidate parameter vector and the candidate text classification model corresponding to the candidate parameter vector;
and taking the candidate parameter vector in the candidate network model with the highest classification accuracy as the target parameter vector, and taking the candidate text classification model in the candidate network model with the highest classification accuracy as the target text classification model.
According to a second aspect of embodiments of the present disclosure, there is provided a text classification apparatus, the apparatus comprising:
the acquisition module is used for acquiring the target text;
the input module is used for obtaining target input data according to the target text and the target classification template, the target classification template comprises a target parameter vector and a target natural language template, the target parameter vector is obtained by training a first preset network model according to first training sample data, the first training sample data is sample data marked with categories, and the first preset network model comprises a preset parameter vector and a preset classification model;
the classification module is used for inputting the target input data into a preset target text classification model to obtain a target text class output by the target text classification model, the target text classification model is obtained by training a second preset network model according to second training sample data, the second training sample data is sample data of unlabeled classes, and the second preset network model comprises the target parameter vector and the preset classification model.
Optionally, the first training sample data includes at least one preset classification template; the target parameter vector and the target text classification model are determined by:
training the first preset network model according to the first training sample data aiming at each preset classification template to obtain candidate parameter vectors corresponding to the preset classification templates;
training a standby network model corresponding to the candidate parameter vector according to the second training sample data aiming at each candidate parameter vector to obtain a candidate text classification model corresponding to the candidate parameter vector, wherein the standby network model comprises the candidate parameter vector and the preset classification model;
and determining the target parameter vector and the target text classification model from the candidate parameter vector and the candidate text classification model according to a preset verification data set, wherein the preset verification data set comprises a sample verification text and a sample verification category corresponding to the sample verification text.
Optionally, the first training sample data includes first sample input data and a first sample category corresponding to the first sample input data; training the first preset network model according to the first training sample data, and obtaining candidate parameter vectors corresponding to the preset classification templates comprises the following steps:
training the first preset network model according to the first sample input data and the first sample category to obtain the candidate parameter vector.
Optionally, the first sample input data includes the preset classification template and a first sample text, and the preset classification template includes a preset parameter vector and a preset natural language template; training the first preset network model according to the first sample input data and the first sample category, and obtaining the candidate parameter vector includes:
obtaining the first sample input data according to the first sample text and the preset classification template;
and taking the first sample input data as the input of the first preset network model, taking the first sample class as the output of the first preset network model, and training the first preset network model to obtain the candidate parameter vector.
Optionally, the second training sample data includes second sample input data and second sample output data corresponding to the second sample input data; training the standby network model corresponding to the candidate parameter vector according to the second training sample data, and obtaining a candidate text classification model corresponding to the candidate parameter vector includes:
and training the standby network model according to the second sample input data and the second sample output data aiming at each candidate parameter vector to obtain the candidate text classification model.
Optionally, the second sample input data includes a candidate classification template and a second sample text, the candidate classification template including the candidate parameter vector and the preset natural language template; the second sample output data is a text extracted from a preset sample text, and the second sample text is a text obtained after the second sample output data is extracted from the preset sample text; training the standby network model according to the second sample input data and the second sample output data to obtain the candidate text classification model comprises the following steps:
obtaining the second sample input data according to the second sample text and the candidate classification templates;
and taking the second sample input data as the input of the standby network model, taking the second sample output data as the output of the standby network model, and training the standby network model to obtain the candidate text classification model.
Optionally, the determining the target parameter vector and the target text classification model from the candidate parameter vector and the candidate text classification model according to a preset verification data set includes:
for each candidate parameter vector, obtaining verification input data according to the sample verification text and the candidate classification template corresponding to the candidate parameter vector;
taking the verification input data as the input of the candidate text classification model corresponding to the candidate parameter vector, and obtaining a target verification category output by the candidate text classification model;
determining the classification accuracy of each candidate network model according to the target verification category and the sample verification category, wherein the candidate network model comprises the candidate parameter vector and the candidate text classification model corresponding to the candidate parameter vector;
and taking the candidate parameter vector in the candidate network model with the highest classification accuracy as the target parameter vector, and taking the candidate text classification model in the candidate network model with the highest classification accuracy as the target text classification model.
According to a third aspect of embodiments of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method described in the first aspect of the present disclosure.
According to a fourth aspect of embodiments of the present disclosure, there is provided an electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement the steps of the method in the first aspect of the disclosure.
Through the above technical solution, the present disclosure first obtains a target text, obtains target input data according to the target text and a target classification template comprising a target parameter vector and a target natural language template, and then inputs the target input data into a preset target text classification model to obtain the target text category output by the target text classification model. The target parameter vector is obtained by training a first preset network model according to first training sample data, where the first training sample data is sample data marked with categories, and the first preset network model comprises a preset parameter vector and a preset classification model. The target text classification model is obtained by training a second preset network model according to second training sample data, where the second training sample data is sample data of unlabeled categories, and the second preset network model comprises the target parameter vector and the preset classification model. In this way, the target text category corresponding to the target text is determined through the pre-trained target parameter vector and target text classification model; the approach fuses the natural language template method and the parameter vector template method, makes full use of a large amount of unlabeled training data, and can obtain more accurate text classification results.
Additional features and advantages of the present disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification; they illustrate the disclosure and, together with the description, serve to explain the disclosure without limiting it.
Fig. 1 is a flow chart illustrating a method of classifying text according to an exemplary embodiment.
Fig. 2 is a flow chart illustrating a method of determining a target parameter vector and a target text classification model according to an exemplary embodiment.
Fig. 3 is a flow chart illustrating a method of determining candidate parameter vectors according to an exemplary embodiment.
Fig. 4 is a flow chart illustrating a method of determining a candidate text classification model according to an exemplary embodiment.
Fig. 5 is a flow chart illustrating a method of determining a target parameter vector and a target text classification model according to an exemplary embodiment.
Fig. 6 is a block diagram illustrating a text classification apparatus according to an exemplary embodiment.
Fig. 7 is a block diagram of an electronic device according to an exemplary embodiment.
Detailed Description
Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating and illustrating the disclosure, are not intended to limit the disclosure.
Before introducing the text classification method and device, storage medium and electronic equipment of the present disclosure, an application scenario related to an embodiment of the disclosure is first described. Consider the multi-classification of complaint texts: the input is a user's complaint text, such as "someone parks illegally" or "garbage is visible everywhere in the neighborhood", and the output is the category of the complaint text; there may be many categories, including "illegal parking", "green-area garbage", "street lamp not working", and so on. At present, for this multi-classification problem, the existing typical small-sample learning methods include the natural language template method PET and the parameter vector template method P-Tuning, which train a corresponding model on a labeled sample data set. The natural language template method PET combines a manually constructed natural language template with the MLM (Masked Language Model) head of a BERT (Bidirectional Encoder Representations from Transformers) model, converting the task into a cloze-style fill-in-the-blank problem for small-sample learning; however, the template must be constructed manually, and different templates differ considerably in effectiveness. The parameter vector template method P-Tuning automatically learns the best template by using the representations of unused tokens in the pre-trained model; it learns the template parameters automatically, but the learned template lacks interpretability and cannot be produced from human prior knowledge.
Fig. 1 is a flow chart illustrating a text classification method according to an exemplary embodiment. As shown in Fig. 1, the method may include the following steps.
Step 101, acquiring a target text.
Step 102, obtaining target input data according to the target text and the target classification template. The target classification template comprises a target parameter vector and a target natural language template, the target parameter vector is obtained by training a first preset network model according to first training sample data, the first training sample data is sample data marked with categories, and the first preset network model comprises a preset parameter vector and a preset classification model.
For example, the target text may first be obtained from a complaint text submitted by a user, and then the target text and the target classification template are concatenated to obtain the target input data. The target classification template may include a target parameter vector and a target natural language template. For example, the target classification template may be "[u1][u2]...[un] This is [M1][M2][M3][M4] information.", where [u1][u2]...[un] is the target parameter vector, "This is [M1][M2][M3][M4] information." is the target natural language template, and [M1][M2][M3][M4] represents the masked (MASK) text. In this way, the parameter vector of the parameter vector template is fused with the natural language template to obtain the target classification template, so that human prior knowledge can be utilized while template parameters are automatically learned from a small amount of labeled data. In some embodiments, the target parameter vector may be obtained by training a first preset network model according to first training sample data, where the first training sample data is sample data marked with categories, and the first preset network model may include a preset parameter vector and a preset classification model; the preset classification model may be, for example, a BERT model.
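As an informal illustration (not part of the patent), the following Python sketch shows this concatenation step under the example template above; the function name build_target_input, the slot counts and the placeholder notation are our own assumptions.

    # Sketch: build target input data by concatenating the target classification
    # template (parameter-vector slots + natural language template) with the text.
    def build_target_input(target_text: str, n_prompt: int = 6, n_mask: int = 4) -> str:
        prompt_slots = "".join(f"[u{j}]" for j in range(1, n_prompt + 1))  # target parameter vector slots
        template = prompt_slots + "这是" + "[MASK]" * n_mask + "信息。"      # "This is [M1]..[M4] information."
        return template + target_text

    # "Someone parks illegally in a certain neighborhood."
    print(build_target_input("某小区有人乱停车。"))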
Step 103, inputting the target input data into a preset target text classification model to obtain the target text category output by the target text classification model. The target text classification model is obtained by training a second preset network model according to second training sample data, the second training sample data is sample data of unlabeled categories, and the second preset network model comprises the target parameter vector and the preset classification model.
For example, after the target input data is obtained, it may be input into the preset target text classification model to obtain the target text category output by that model. In some embodiments, the target text classification model may be obtained by training a second preset network model according to the second training sample data, where the second preset network model may include the target parameter vector and the preset classification model; that is, the second preset network model is the model obtained by replacing the preset parameter vector in the first preset network model with the target parameter vector. The second training sample data can be sample data of unlabeled categories; training on such unlabeled sample data fine-tunes the parameters of the preset classification model, achieving a semi-supervised learning effect.
In summary, the present disclosure first obtains a target text, obtains target input data according to the target text and a target classification template comprising a target parameter vector and a target natural language template, and then inputs the target input data into a preset target text classification model to obtain the target text category output by the target text classification model. The target parameter vector is obtained by training a first preset network model according to first training sample data, where the first training sample data is sample data marked with categories, and the first preset network model comprises a preset parameter vector and a preset classification model. The target text classification model is obtained by training a second preset network model according to second training sample data, where the second training sample data is sample data of unlabeled categories, and the second preset network model comprises the target parameter vector and the preset classification model. In this way, the target text category corresponding to the target text is determined through the pre-trained target parameter vector and target text classification model; the approach fuses the natural language template method and the parameter vector template method, makes full use of a large amount of unlabeled training data, and can obtain more accurate text classification results.
Fig. 2 is a flowchart illustrating a method of determining the target parameter vector and the target text classification model according to an exemplary embodiment. As shown in Fig. 2, the target parameter vector and the target text classification model are determined through the following steps.
Step 201, training a first preset network model according to the first training sample data for each preset classification template to obtain candidate parameter vectors corresponding to the preset classification templates.
Step 202, training a standby network model corresponding to each candidate parameter vector according to the second training sample data to obtain a candidate text classification model corresponding to the candidate parameter vector, wherein the standby network model comprises the candidate parameter vector and the preset classification model.
Step 203, determining a target parameter vector and a target text classification model from the candidate parameter vectors and the candidate text classification models according to a preset verification data set, wherein the preset verification data set comprises a sample verification text and a sample verification category corresponding to the sample verification text.
By way of example, the first training sample data may include at least one preset classification template, where each preset classification template includes a preset parameter vector and a preset natural language template. It can be understood that a preset classification template is a template obtained by fusing the preset parameter vector with a preset natural language template, and the preset natural language templates of the different preset classification templates may differ. For example, the preset classification templates may include: "[u1][u2]...[un] This is [M1][M2][M3][M4] information.", "[u1][u2]...[un] belongs to the [M1][M2][M3][M4] category." and "[u1][u2]...[un] falls under the [M1][M2][M3][M4] classification.", where [u1][u2]...[un] is the preset parameter vector, and "This is [M1][M2][M3][M4] information.", "belongs to the [M1][M2][M3][M4] category." and "falls under the [M1][M2][M3][M4] classification." are the preset natural language templates.
For each preset classification template, the first preset network model is trained according to the first training sample data to obtain the candidate parameter vector corresponding to that preset classification template. Each candidate parameter vector together with the preset classification model can form a standby network model; when there are P preset classification templates, P candidate parameter vectors can be obtained by training, and correspondingly there are P standby network models Model_1, Model_2, ..., Model_P, which include the second preset network model.
For each candidate parameter vector, the standby network model corresponding to that candidate parameter vector is trained according to the second training sample data to obtain the candidate text classification model corresponding to the candidate parameter vector. Each candidate parameter vector and its corresponding candidate text classification model may form a candidate network model; when there are P standby network models Model_1, Model_2, ..., Model_P, P candidate network models Model^h_1, Model^h_2, ..., Model^h_P can be obtained. Since the first training sample data used to train the first preset network model is data marked with categories, while the second training sample data used to train the standby network models is data of unlabeled categories, the candidate network models produced by training achieve a semi-supervised learning effect, so that the target text classification model obtained by training has higher classification accuracy.
Finally, according to a preset verification data set, the target parameter vector and the target text classification model with the highest classification accuracy are determined from the candidate parameter vectors and the candidate text classification models. The preset verification data set may include a sample verification text and a sample verification category corresponding to the sample verification text. In some embodiments, a target network model with the highest classification accuracy may be determined from the candidate network models according to the preset verification data set; the candidate parameter vector included in the target network model is then taken as the target parameter vector, and the candidate text classification model included in the target network model is taken as the target text classification model.
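Schematically, the whole two-stage procedure described above might be organized as in the following Python sketch, where train_stage1, train_stage2 and accuracy are hypothetical helper names standing in for the training and verification procedures detailed below:

    # Schematic of the search over the P preset classification templates:
    # stage 1 learns a candidate parameter vector per template on labeled data,
    # stage 2 fine-tunes a candidate text classification model on unlabeled data,
    # and the best (u, classifier) pair on the verification set is selected.
    candidates = []
    for template in preset_classification_templates:      # P preset templates
        u = train_stage1(template, labeled_samples)       # candidate parameter vector
        clf = train_stage2(u, unlabeled_samples)          # candidate text classification model
        candidates.append((u, clf))                       # candidate network model
    target_u, target_clf = max(candidates, key=lambda c: accuracy(c, verification_set))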
In one embodiment, one implementation of step 201 may be: training a first preset network model according to the first sample input data and the first sample category to obtain candidate parameter vectors.
Fig. 3 is a flowchart illustrating a method of determining candidate parameter vectors according to an exemplary embodiment, and step 201 may be implemented by the following steps, as shown in fig. 3.
Step 2011, obtaining first sample input data according to the first sample text and a preset classification template.
Step 2012, taking the first sample input data as input of a first preset network model, taking the first sample category as output of the first preset network model, and training the first preset network model to obtain candidate parameter vectors.
For example, the first training sample data may include first sample input data and the first sample category corresponding to the first sample input data, and the first sample input data may include a preset classification template and a first sample text. The preset classification template and the first sample text can be concatenated to obtain the first sample input data. Taking the preset classification template "[u1][u2]...[un] This is [M1][M2][M3][M4] information." and the first sample text "Someone parks illegally in a certain neighborhood." as an example, where [u1][u2]...[un] is the preset parameter vector and "This is [M1][M2][M3][M4] information." is the preset natural language template, the first sample input data is "[u1][u2]...[un] This is [M1][M2][M3][M4] information. Someone parks illegally in a certain neighborhood." The first sample category corresponding to this first sample input data may be "illegal parking", that is, the text corresponding to [M1][M2][M3][M4] is "illegal parking".
The first sample input data is then taken as the input of the first preset network model and the first sample category as the output of the first preset network model, and the first preset network model is trained according to a preset first loss function to obtain the candidate parameter vector. In one embodiment, when training the first preset network model, the PyTorch framework may be employed, using the Adam optimizer with a learning rate of 1e-4. The first training sample data can be loaded and batched according to a preset batch_size, giving M batches in total, where batch_size may be 8, i.e., 8 samples are fed to the training program at a time, and the preset parameter vector may be initialized with random values.
Taking the preset classification model as a BERT model as an example, the preset parameter vector can be processed with an LSTM (Long Short-Term Memory) network to obtain the hidden state vectors h_j corresponding to the preset parameter vector. A batch of first sample input data is then input into the embedding layer of the BERT model and converted into vector form, where each preset parameter token [u_j] is replaced by its hidden state vector h_j. The vectors output by the embedding layer are input into the MLM head of the BERT model to obtain the predicted category of the first sample text. According to the preset first loss function, the loss value corresponding to the batch of data can be obtained, and the updated parameter vector u can be solved for according to PyTorch's Adam optimizer and the learning rate setting, thereby updating the preset parameter vector. The first loss function may, for example, be given by Equation 1.
$$L_1 = -\sum_{(x,y) \in D_{\mathrm{train1}}} \log P\left(y \mid \mathrm{LSTM}(u),\, x\right) \qquad \text{(Equation 1)}$$
where L_1 is the first loss function, u = ([u_1], [u_2], ..., [u_n]) is the preset parameter vector, x is the first sample text, y is the first sample category, D_train1 is the first training sample data, and LSTM(u) are the hidden state vectors h_j obtained after the preset parameter vector is processed by the LSTM.
In other embodiments, the above training step may be repeated with each of the M batches of data as training data, so that the parameter u is updated M times and one epoch is completed, i.e., one full training pass over the first training sample data is performed on the first preset network model. Following the above procedure, training may be performed for epoch_size = 50 epochs, updating the parameter u many times and thereby obtaining the candidate parameter vector.
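To make this first training stage concrete, the following is a minimal PyTorch sketch under stated assumptions: a HuggingFace bert-base-chinese masked language model stands in for the preset classification model; the template string, the 4-character verbalizer "违规停车" ("illegal parking"), N_PROMPT and the helper names are illustrative; and batching and the epoch loop are omitted. It is a sketch of the technique, not the patent's implementation.

    import torch
    import torch.nn as nn
    from transformers import BertForMaskedLM, BertTokenizerFast

    N_PROMPT = 6  # length n of the preset parameter vector [u_1]..[u_n] (assumed)
    tok = BertTokenizerFast.from_pretrained("bert-base-chinese")
    bert = BertForMaskedLM.from_pretrained("bert-base-chinese")
    for p in bert.parameters():          # stage 1: only the parameter vector is trained
        p.requires_grad_(False)

    class PromptEncoder(nn.Module):
        """LSTM over the prompt slots, producing the hidden state vectors h_j = LSTM(u)."""
        def __init__(self, n: int, hidden: int):
            super().__init__()
            self.emb = nn.Embedding(n, hidden)  # random initial values of [u_j]
            self.lstm = nn.LSTM(hidden, hidden // 2, batch_first=True, bidirectional=True)
        def forward(self) -> torch.Tensor:
            h, _ = self.lstm(self.emb.weight.unsqueeze(0))
            return h.squeeze(0)                 # (n, hidden), one h_j per prompt slot

    prompt = PromptEncoder(N_PROMPT, bert.config.hidden_size)
    opt = torch.optim.Adam(prompt.parameters(), lr=1e-4)

    def encode(text: str, category: str | None = None):
        """Tokenize '<prompt slots> 这是[MASK]x4信息。<text>'; label the masks with a 4-char category."""
        ids = tok("这是" + tok.mask_token * 4 + "信息。" + text, return_tensors="pt")["input_ids"][0]
        input_ids = torch.cat([ids[:1], torch.full((N_PROMPT,), tok.pad_token_id), ids[1:]])
        labels = torch.full_like(input_ids, -100)          # -100 positions are ignored by the MLM loss
        if category is not None:
            mask_pos = (input_ids == tok.mask_token_id).nonzero(as_tuple=True)[0]
            labels[mask_pos] = torch.tensor(tok.convert_tokens_to_ids(list(category)))
        return input_ids.unsqueeze(0), labels.unsqueeze(0)

    # One training step on one labeled sample (looped over the M batches in practice):
    input_ids, labels = encode("某小区有人乱停车。", "违规停车")    # complaint text / "illegal parking"
    embeds = bert.get_input_embeddings()(input_ids)
    embeds[0, 1:1 + N_PROMPT] = prompt()                   # replace the [u_j] slots with h_j
    loss = bert(inputs_embeds=embeds, labels=labels).loss  # Equation 1: cross-entropy at the masks
    opt.zero_grad(); loss.backward(); opt.step()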
In another embodiment, one implementation of step 202 may be: training, for each candidate parameter vector, the standby network model according to the second sample input data and the second sample output data to obtain the candidate text classification model.
FIG. 4 is a flowchart illustrating a method of determining a candidate text classification model according to an exemplary embodiment, as shown in FIG. 4, step 202 may be implemented by the following steps.
Step 2021, obtaining second sample input data according to the second sample text and the candidate classification template.
Step 2022, taking the second sample input data as the input of the standby network model and the second sample output data as the output of the standby network model, and training the standby network model to obtain the candidate text classification model.
For example, the second training sample data may include second sample input data and second sample output data corresponding to the second sample input data, where the second sample input data may include a candidate classification template and second sample text, and the candidate classification template may include a candidate parameter vector and a preset natural language template, and it may be understood that the candidate classification template is a template obtained by replacing a preset parameter vector in the preset classification template with the candidate parameter vector.
The candidate classification template and the second sample text can be concatenated to obtain the second sample input data; the second sample input data is then used as the input of the standby network model and the second sample output data as its output, and the standby network model is trained to obtain the candidate text classification model. The second sample output data may be a text span extracted from a preset sample text, and the second sample text may be the text obtained after the second sample output data is extracted from the preset sample text. Taking the preset sample text "A certain neighborhood has an outage and requests restoration as soon as possible." as an example, "as soon as possible" can be extracted as the second sample output data, and "A certain neighborhood has an outage and requests restoration [M1][M2]." is the second sample text.
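A small Python sketch of how one such unlabeled training pair could be built; the assumption that a short span is extracted at a random position and replaced by mask tokens, and the helper name make_stage2_pair, are ours:

    import random

    def make_stage2_pair(text: str, span_len: int = 2):
        """Extract a random span as the second sample output data and mask it in the text."""
        start = random.randrange(0, len(text) - span_len + 1)
        span = text[start:start + span_len]                 # e.g. "尽快" ("as soon as possible")
        masked = text[:start] + "[MASK]" * span_len + text[start + span_len:]
        return masked, span                                 # (second sample text, second sample output data)

    masked, span = make_stage2_pair("某小区停电了，要求尽快恢复。")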
In one embodiment, when training the second preset network model, the PyTorch framework may be employed, using the Adam optimizer with a learning rate of 1e-4. The second training sample data can be loaded and batched according to batch_size, giving N batches in total, where batch_size may be 32, i.e., 32 samples are fed to the training program at a time, and the parameters para^i_bert of the preset classification model are updated through training to obtain the candidate text classification model.
Taking the preset classification model as a BERT model as an example, for each batch of second training sample data, the loss value corresponding to the batch can be obtained according to a preset second loss function, and the updated parameters para^i_bert can be solved for according to PyTorch's Adam optimizer and the learning rate setting, thereby updating the parameters para^i_bert. The second loss function may, for example, be given by Equation 2.
$$L_2 = -\sum_{(z,w) \in D_{\mathrm{train2}}} \log P\left(w \mid \mathrm{LSTM}(u_i),\, z\right) \qquad \text{(Equation 2)}$$
where L_2 is the second loss function, z is the second sample text, w is the second sample output data, D_train2 is the second training sample data, u_i is the candidate parameter vector, and LSTM(u_i) is the vector obtained after the candidate parameter vector is processed by the LSTM.
In other embodiments, the above training step may be repeated with each of the N batches of data as training data, so that the parameters para^i_bert are updated N times and one epoch is completed, i.e., one full training pass over the second training sample data is performed on the standby network model. Following the above procedure, training may be performed for epoch_size = 50 epochs, updating para^i_bert many times, and the final para^i_bert is taken as the parameters of the preset classification model, thereby obtaining the candidate text classification model. In this way, the parameters para^i_bert of the preset classification model are fine-tuned with the unlabeled second training sample data, making full use of a large amount of unlabeled sample data and achieving a semi-supervised learning effect.
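Continuing the earlier sketches (same tok, bert, prompt, N_PROMPT and make_stage2_pair), one stage-2 update could look as follows. The decision to place labels only on the span's [MASK] positions, leaving the template's own masks ignored at -100, is our assumption about the setup, not something the patent states.

    def stage2_encode(masked_text: str, span: str):
        """Candidate template + second sample text; labels only at the span's [MASK] positions."""
        ids = tok("这是" + tok.mask_token * 4 + "信息。" + masked_text, return_tensors="pt")["input_ids"][0]
        input_ids = torch.cat([ids[:1], torch.full((N_PROMPT,), tok.pad_token_id), ids[1:]])
        labels = torch.full_like(input_ids, -100)
        mask_pos = (input_ids == tok.mask_token_id).nonzero(as_tuple=True)[0]
        labels[mask_pos[-len(span):]] = torch.tensor(tok.convert_tokens_to_ids(list(span)))
        return input_ids.unsqueeze(0), labels.unsqueeze(0)

    bert.train()
    for p in bert.parameters():
        p.requires_grad_(True)                    # stage 2: fine-tune para_bert
    opt2 = torch.optim.Adam(bert.parameters(), lr=1e-4)

    masked, span = make_stage2_pair("某小区停电了，要求尽快恢复。")
    input_ids, labels = stage2_encode(masked, span)
    embeds = bert.get_input_embeddings()(input_ids)
    embeds[0, 1:1 + N_PROMPT] = prompt().detach() # candidate parameter vector u_i stays fixed
    loss = bert(inputs_embeds=embeds, labels=labels).loss   # Equation 2
    opt2.zero_grad(); loss.backward(); opt2.step()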
Fig. 5 is a flowchart illustrating a method of determining a target parameter vector and a target text classification model according to an exemplary embodiment, and step 203 may be implemented by the following steps, as shown in fig. 5.
Step 2031, for each candidate parameter vector, obtaining verification input data according to the sample verification text and the candidate classification template corresponding to the candidate parameter vector.
Step 2032, taking the verification input data as the input of the candidate text classification model corresponding to the candidate parameter vector, and obtaining the target verification category output by the candidate text classification model.
Step 2033, determining the classification accuracy of each candidate network model according to the target verification category and the sample verification category, where the candidate network models include candidate parameter vectors and candidate text classification models corresponding to the candidate parameter vectors.
Step 2034, taking the candidate parameter vector in the candidate network model with the highest classification accuracy as the target parameter vector, and taking the candidate text classification model in the candidate network model with the highest classification accuracy as the target text classification model.
For example, after obtaining at least one candidate parameter vector and the candidate text classification model corresponding to each candidate parameter vector, the target parameter vector and the target text classification model with the highest classification accuracy may be determined from them. If there is only one candidate parameter vector and candidate text classification model, they may be taken directly as the target parameter vector and the target text classification model. If there are a plurality of candidate parameter vectors and candidate text classification models, the target parameter vector and the target text classification model can be determined according to the classification accuracy of each candidate network model, where each candidate network model comprises one candidate parameter vector and the candidate text classification model corresponding to that candidate parameter vector.
In some embodiments, for each candidate network model, first, a sample verification text and a candidate classification template corresponding to a candidate parameter vector in the candidate network model may be spliced to obtain verification input data. And then taking the verification input data as the input of the candidate text classification model corresponding to the candidate parameter vector to obtain the target verification category output by the candidate text classification model. And further determining the classification accuracy corresponding to each candidate network model according to the matching degree of the target verification category and the sample verification category. The higher the matching degree of the target verification category and the sample verification category is, the higher the classification accuracy of the candidate network model corresponding to the target verification category is, and correspondingly, the lower the matching degree of the target verification category and the sample verification category is, the lower the classification accuracy of the candidate network model corresponding to the target verification category is. After determining the classification accuracy of each candidate network model, the candidate network model with the highest classification accuracy may be taken as the target network model. And simultaneously, taking the candidate parameter vector in the target network model as the target parameter vector, and taking the candidate text classification model in the target network model as the target text classification model.
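The selection step itself is a simple argmax over verification accuracy; the following self-contained Python sketch uses illustrative names (CandidateNetworkModel, select_target) rather than anything defined by the patent.

    from dataclasses import dataclass
    from typing import Callable, List, Tuple

    @dataclass
    class CandidateNetworkModel:
        parameter_vector: object             # candidate parameter vector u_i
        classify: Callable[[str], str]       # candidate text classification model: text -> category

    def select_target(candidates: List[CandidateNetworkModel],
                      verification_set: List[Tuple[str, str]]) -> CandidateNetworkModel:
        """Return the candidate network model with the highest verification accuracy."""
        def accuracy(c: CandidateNetworkModel) -> float:
            hits = sum(c.classify(text) == category for text, category in verification_set)
            return hits / len(verification_set)
        return max(candidates, key=accuracy)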
In summary, the present disclosure first obtains a target text, obtains target input data according to the target text and a target classification template comprising a target parameter vector and a target natural language template, and then inputs the target input data into a preset target text classification model to obtain the target text category output by the target text classification model. The target parameter vector is obtained by training a first preset network model according to first training sample data, where the first training sample data is sample data marked with categories, and the first preset network model comprises a preset parameter vector and a preset classification model. The target text classification model is obtained by training a second preset network model according to second training sample data, where the second training sample data is sample data of unlabeled categories, and the second preset network model comprises the target parameter vector and the preset classification model. In this way, the target text category corresponding to the target text is determined through the pre-trained target parameter vector and target text classification model; the approach fuses the natural language template method and the parameter vector template method, makes full use of a large amount of unlabeled training data, and can obtain more accurate text classification results.
Fig. 6 is a block diagram of a text classification apparatus according to an exemplary embodiment, and as shown in fig. 6, the apparatus 300 may include the following modules.
The obtaining module 301 is configured to obtain a target text.
The input module 302 is used for obtaining target input data according to the target text and the target classification template. The target classification template comprises a target parameter vector and a target natural language template, the target parameter vector is obtained by training a first preset network model according to first training sample data, the first training sample data is sample data marked with categories, and the first preset network model comprises a preset parameter vector and a preset classification model.
The classification module 303 is configured to input the target input data into a preset target text classification model, so as to obtain a target text category output by the target text classification model. The target text classification model is obtained by training a second preset network model according to second training sample data, the second training sample data is sample data of unlabeled categories, and the second preset network model comprises a target parameter vector and a preset classification model.
In one embodiment, the first training sample data comprises at least one preset classification template. The target parameter vector and the target text classification model are determined in the following manner.
The first preset network model is trained according to the first training sample data for each preset classification template to obtain the candidate parameter vector corresponding to the preset classification template.
For each candidate parameter vector, a standby network model corresponding to the candidate parameter vector is trained according to the second training sample data to obtain a candidate text classification model corresponding to the candidate parameter vector, wherein the standby network model comprises the candidate parameter vector and the preset classification model.
A target parameter vector and a target text classification model are determined from the candidate parameter vectors and the candidate text classification models according to a preset verification data set, wherein the preset verification data set comprises a sample verification text and a sample verification category corresponding to the sample verification text.
In another embodiment, the first training sample data includes first sample input data and a first sample class corresponding to the first sample input data. Training a first preset network model according to the first training sample data, and obtaining candidate parameter vectors corresponding to a preset classification template comprises the following steps: training a first preset network model according to the first sample input data and the first sample category to obtain candidate parameter vectors.
In another embodiment, the first sample input data includes a preset classification template and a first sample text, the preset classification template including a preset parameter vector and a preset natural language template. The first preset network model is trained according to the first sample input data and the first sample category to obtain candidate parameter vectors as follows.
First sample input data is obtained according to the first sample text and a preset classification template.
The first sample input data is taken as the input of the first preset network model and the first sample category as the output of the first preset network model, and the first preset network model is trained to obtain candidate parameter vectors.
In another embodiment, the second training sample data includes second sample input data and second sample output data corresponding to the second sample input data. Training the standby network model corresponding to the candidate parameter vector according to the second training sample data to obtain the candidate text classification model corresponding to the candidate parameter vector includes: training, for each candidate parameter vector, the standby network model according to the second sample input data and the second sample output data to obtain a candidate text classification model.
In another embodiment, the second sample input data includes a candidate classification template and a second sample text, the candidate classification template including a candidate parameter vector and a preset natural language template. The second sample output data is a text extracted from a preset sample text, and the second sample text is the text obtained after the second sample output data is extracted from the preset sample text. The standby network model is trained according to the second sample input data and the second sample output data to obtain a candidate text classification model as follows.
Second sample input data is obtained according to the second sample text and the candidate classification template.
The second sample input data is taken as the input of the standby network model and the second sample output data as the output of the standby network model, and the standby network model is trained to obtain a candidate text classification model.
In another embodiment, determining the target parameter vector and the target text classification model from the candidate parameter vector and the candidate text classification model according to the preset verification data set includes the following steps.
For each candidate parameter vector, verification input data is obtained according to the sample verification text and the candidate classification template corresponding to the candidate parameter vector.
The verification input data is taken as the input of the candidate text classification model corresponding to the candidate parameter vector to obtain the target verification category output by the candidate text classification model.
The classification accuracy of each candidate network model is determined according to the target verification category and the sample verification category, wherein the candidate network models comprise candidate parameter vectors and the candidate text classification models corresponding to the candidate parameter vectors.
The candidate parameter vector in the candidate network model with the highest classification accuracy is taken as the target parameter vector, and the candidate text classification model in that candidate network model is taken as the target text classification model.
The specific manner in which the various modules perform operations in the apparatus of the above embodiments has been described in detail in the embodiments of the method and will not be repeated here.
In summary, the present disclosure first obtains a target text, obtains target input data according to the target text and a target classification template comprising a target parameter vector and a target natural language template, and then inputs the target input data into a preset target text classification model to obtain the target text category output by the target text classification model. The target parameter vector is obtained by training a first preset network model according to first training sample data, where the first training sample data is sample data marked with categories, and the first preset network model comprises a preset parameter vector and a preset classification model. The target text classification model is obtained by training a second preset network model according to second training sample data, where the second training sample data is sample data of unlabeled categories, and the second preset network model comprises the target parameter vector and the preset classification model. In this way, the target text category corresponding to the target text is determined through the pre-trained target parameter vector and target text classification model; the approach fuses the natural language template method and the parameter vector template method, makes full use of a large amount of unlabeled training data, and can obtain more accurate text classification results.
Fig. 7 is a block diagram of an electronic device according to an exemplary embodiment. For example, the electronic device 400 may be provided as a server. Referring to Fig. 7, the electronic device 400 includes a processor 422 (of which there may be one or more) and a memory 432 for storing a computer program executable by the processor 422. The computer program stored in the memory 432 may include one or more modules, each corresponding to a set of instructions. The processor 422 may be configured to execute the computer program to perform the text classification method described above.
The electronic device 400 may further include a power supply component 426 and a communication component 450. The power supply component 426 may be configured to perform power management of the electronic device 400, and the communication component 450 may be configured to enable wired or wireless communication of the electronic device 400. The electronic device 400 may also include an input/output interface 458, and may operate based on an operating system stored in the memory 432.
In another exemplary embodiment, a computer-readable storage medium comprising program instructions is also provided; when executed by a processor, the program instructions implement the steps of the text classification method described above. For example, the non-transitory computer-readable storage medium may be the memory 432 described above, which includes program instructions executable by the processor 422 of the electronic device 400 to perform the text classification method described above.
In another exemplary embodiment, a computer program product is also provided, comprising a computer program executable by a programmable apparatus, the computer program having code portions for performing the above-described text classification method when executed by the programmable apparatus.
The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings, but the present disclosure is not limited to the specific details of the above embodiments. Various simple modifications may be made to the technical solutions of the present disclosure within the scope of its technical concept, and all such simple modifications fall within the protection scope of the present disclosure.
In addition, the specific features described in the foregoing embodiments may be combined in any suitable manner; to avoid unnecessary repetition, the various possible combinations are not described further.
Moreover, any combination of the various embodiments of the present disclosure is possible as long as it does not depart from the spirit of the present disclosure, and such combinations should likewise be regarded as content disclosed herein.

Claims (8)

1. A method of classifying text, the method comprising:
acquiring a target text;
obtaining target input data according to the target text and a target classification template, wherein the target classification template comprises a target parameter vector and a target natural language template, the target parameter vector is obtained by training a first preset network model according to first training sample data, the first training sample data is sample data marked with categories, and the first preset network model comprises a preset parameter vector and a preset classification model;
inputting the target input data into a preset target text classification model to obtain a target text category output by the target text classification model, wherein the target text classification model is obtained by training a second preset network model according to second training sample data, the second training sample data is sample data of unlabeled categories, and the second preset network model comprises the target parameter vector and the preset classification model;
the first training sample data comprises at least one preset classification template, wherein the preset classification template comprises a preset parameter vector and a preset natural language template; the target parameter vector and the target text classification model are determined by:
for each preset classification template, training the first preset network model according to the first training sample data to obtain a candidate parameter vector corresponding to the preset classification template;
for each candidate parameter vector, training a standby network model corresponding to the candidate parameter vector according to the second training sample data to obtain a candidate text classification model corresponding to the candidate parameter vector, wherein the standby network model comprises the candidate parameter vector and the preset classification model;
determining the target parameter vector and the target text classification model from the candidate parameter vector and the candidate text classification model according to a preset verification data set, wherein the preset verification data set comprises a sample verification text and a sample verification category corresponding to the sample verification text;
the second training sample data comprises second sample input data and second sample output data corresponding to the second sample input data; training the standby network model corresponding to the candidate parameter vector according to the second training sample data to obtain the candidate text classification model corresponding to the candidate parameter vector comprises:
for each candidate parameter vector, training the standby network model according to the second sample input data and the second sample output data to obtain the candidate text classification model.
2. The method of claim 1, wherein the first training sample data comprises first sample input data and a first sample category to which the first sample input data corresponds; training the first preset network model according to the first training sample data to obtain the candidate parameter vector corresponding to the preset classification template comprises the following steps:
training the first preset network model according to the first sample input data and the first sample category to obtain the candidate parameter vector.
3. The method of claim 2, wherein the first sample input data comprises the preset classification template and a first sample text; training the first preset network model according to the first sample input data and the first sample category to obtain the candidate parameter vector comprises:
obtaining the first sample input data according to the first sample text and the preset classification template;
and taking the first sample input data as the input of the first preset network model, taking the first sample category as the output of the first preset network model, and training the first preset network model to obtain the candidate parameter vector.
4. The method of claim 1, wherein the second sample input data comprises a candidate classification template and a second sample text, the candidate classification template comprising the candidate parameter vector and the preset natural language template; the second sample output data is a text extracted from a preset sample text, and the second sample text is a text obtained after the second sample output data is extracted from the preset sample text; training the standby network model according to the second sample input data and the second sample output data to obtain the candidate text classification model comprises the following steps:
obtaining the second sample input data according to the second sample text and the candidate classification templates;
and taking the second sample input data as the input of the standby network model, taking the second sample output data as the output of the standby network model, and training the standby network model to obtain the candidate text classification model.
5. The method of claim 4, wherein determining the target parameter vector and the target text classification model from the candidate parameter vector and the candidate text classification model according to the preset verification data set comprises:
for each candidate parameter vector, obtaining verification input data according to the sample verification text and the candidate classification template corresponding to the candidate parameter vector;
taking the verification input data as the input of the candidate text classification model corresponding to the candidate parameter vector, and obtaining a target verification category output by the candidate text classification model;
determining the classification accuracy of each candidate network model according to the target verification category and the sample verification category, wherein the candidate network model comprises the candidate parameter vector and the candidate text classification model corresponding to the candidate parameter vector;
and taking the candidate parameter vector in the candidate network model with the highest classification accuracy as the target parameter vector, and taking the candidate text classification model in the candidate network model with the highest classification accuracy as the target text classification model.
6. A text classification apparatus, the apparatus comprising:
the acquisition module is used for acquiring the target text;
the input module is used for obtaining target input data according to the target text and a target classification template, wherein the target classification template comprises a target parameter vector and a target natural language template, the target parameter vector is obtained by training a first preset network model according to first training sample data, the first training sample data is sample data marked with categories, and the first preset network model comprises a preset parameter vector and a preset classification model;
the classification module is used for inputting the target input data into a preset target text classification model to obtain a target text category output by the target text classification model, wherein the target text classification model is obtained by training a second preset network model according to second training sample data, the second training sample data is sample data of unlabeled categories, and the second preset network model comprises the target parameter vector and the preset classification model;
the first training sample data comprises at least one preset classification template, wherein the preset classification template comprises a preset parameter vector and a preset natural language template; the target parameter vector and the target text classification model are determined by:
for each preset classification template, training the first preset network model according to the first training sample data to obtain a candidate parameter vector corresponding to the preset classification template;
for each candidate parameter vector, training a standby network model corresponding to the candidate parameter vector according to the second training sample data to obtain a candidate text classification model corresponding to the candidate parameter vector, wherein the standby network model comprises the candidate parameter vector and the preset classification model;
determining the target parameter vector and the target text classification model from the candidate parameter vector and the candidate text classification model according to a preset verification data set, wherein the preset verification data set comprises a sample verification text and a sample verification category corresponding to the sample verification text;
the second training sample data comprises second sample input data and second sample output data corresponding to the second sample input data; training the standby network model corresponding to the candidate parameter vector according to the second training sample data to obtain the candidate text classification model corresponding to the candidate parameter vector comprises:
for each candidate parameter vector, training the standby network model according to the second sample input data and the second sample output data to obtain the candidate text classification model.
7. A non-transitory computer readable storage medium, having stored thereon a computer program, characterized in that the program when executed by a processor implements the steps of the method according to any of claims 1-5.
8. An electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement the steps of the method of any one of claims 1-5.
CN202310273838.7A 2023-03-20 2023-03-20 Text classification method and device, storage medium and electronic equipment Active CN115994225B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310273838.7A CN115994225B (en) 2023-03-20 2023-03-20 Text classification method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN115994225A CN115994225A (en) 2023-04-21
CN115994225B true CN115994225B (en) 2023-06-27

Family

ID=85992278

Country Status (1)

Country Link
CN (1) CN115994225B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116738298B (en) * 2023-08-16 2023-11-24 杭州同花顺数据开发有限公司 Text classification method, system and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020199591A1 (en) * 2019-03-29 2020-10-08 平安科技(深圳)有限公司 Text categorization model training method, apparatus, computer device, and storage medium
CN112966712A (en) * 2021-02-01 2021-06-15 北京三快在线科技有限公司 Language model training method and device, electronic equipment and computer readable medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11847414B2 (en) * 2020-04-24 2023-12-19 Deepmind Technologies Limited Robustness to adversarial behavior for text classification models
CN113901807A (en) * 2021-08-30 2022-01-07 重庆德莱哲企业管理咨询有限责任公司 Clinical medicine entity recognition method and clinical test knowledge mining method
CN113688244A (en) * 2021-08-31 2021-11-23 中国平安人寿保险股份有限公司 Text classification method, system, device and storage medium based on neural network
CN114896395A (en) * 2022-04-26 2022-08-12 阿里巴巴(中国)有限公司 Language model fine-tuning method, text classification method, device and equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant