CN113326379B - Text classification prediction method, device, equipment and storage medium - Google Patents

Text classification prediction method, device, equipment and storage medium

Info

Publication number
CN113326379B
CN113326379B
Authority
CN
China
Prior art keywords
model
classification
training
sentence
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110734767.7A
Other languages
Chinese (zh)
Other versions
CN113326379A (en)
Inventor
刘广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd filed Critical Ping An Life Insurance Company of China Ltd
Priority to CN202110734767.7A
Publication of CN113326379A
Application granted
Publication of CN113326379B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods


Abstract

The application relates to the technical field of artificial intelligence and discloses a text classification prediction method, device, equipment and storage medium. The method includes: acquiring target text data; inputting the target text data into a target text classification model for text classification prediction, wherein the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method; and obtaining the target text classification prediction result output by the target text classification model. A target text classification model with excellent generalization capability is thereby determined, improving the success rate of text classification prediction on the target text data.

Description

Text classification prediction method, device, equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence, and in particular, to a text classification prediction method, apparatus, device, and storage medium.
Background
Overfitting is one of the major problems encountered when applying machine learning techniques to text classification. It arises because training a text classification model requires labeled text: classifying news emotion for an organization G, for example, requires acquiring the organization's historical news data and labeling it with emotion. The inventors found that not all emotional expressions are well represented in the existing labeled text. In fact, many new emotional expressions in wide use are not fully reflected, so the trained text classification model generalizes poorly, and a text classification model with poor generalization may fail at classification prediction when it faces a new application scenario.
Disclosure of Invention
The application mainly aims to provide a text classification prediction method, device, equipment and storage medium, in order to solve the technical problem in the prior art that, when a text classification model is trained with labeled text, too few training samples leave many widely used new emotional expressions insufficiently reflected, so that the trained text classification model generalizes poorly.
In order to achieve the above object, the present application provides a text classification prediction method, which includes:
Acquiring target text data;
inputting the target text data into a target text classification model for text classification prediction, wherein the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method;
and obtaining a target text classification prediction result output by the target text classification model.
Further, before the step of inputting the target text data into the target text classification model for text classification prediction, the method further includes:
Acquiring a first classification training sample set, wherein each first classification training sample in the first classification training sample set comprises: a first sentence sample text and first sentence authenticity calibration data;
Performing countermeasure training on the generation sub-model and the discrimination sub-model according to the first classification training sample set by adopting an iterative optimization training method, and taking the generation sub-model after the countermeasure training as a sentence generation model, wherein the generation sub-model is a model based on a recurrent neural network, and the discrimination sub-model is a model based on a fully connected network or a convolutional neural network;
Acquiring a plurality of language fragments to be predicted;
respectively inputting each language fragment to be predicted in the plurality of language fragments to be predicted into the sentence generating model to generate sentences by adopting a prediction and splicing iteration method, so as to obtain a plurality of generated sentence texts;
Acquiring classification calibration data corresponding to each generated sentence text in the generated sentence texts, and generating samples according to the generated sentence texts and the classification calibration data to obtain generated sentence samples;
Acquiring a plurality of classification training samples to be expanded, and taking the plurality of classification training samples to be expanded and the plurality of generated sentence samples as a second classification training sample set;
Training a text classification initial model according to the second classification training sample set by adopting an MLM training method, and taking the text classification initial model after training as the target text classification model, wherein the text classification initial model is a model obtained based on the Bert model, the full connection layer and the Softmax activation function.
Further, the step of training the text classification initial model according to the second classification training sample set by adopting an MLM training method and taking the text classification initial model after training as the target text classification model includes:
Extracting a second classification training sample from the second classification training sample set as a target classification training sample;
Inputting a sample text to be classified and trained of the target classification training sample into an embedding layer of the text classification initial model for marking analysis and character adjustment by adopting a fixed character length to obtain an adjusted sample text;
Training the text classification initial model according to the adjusted sample text and the classification calibration data of the target classification training sample by adopting the MLM training method;
repeating the step of extracting a second classification training sample from the second classification training sample set as a target classification training sample until a classification training convergence condition is met;
And taking the text classification initial model meeting the classification training convergence condition as the target text classification model.
Further, the step of performing the countermeasure training on the generation sub-model and the discrimination sub-model according to the first classification training sample set by using the iterative optimization training method, and taking the generation sub-model after the countermeasure training as the sentence generation model includes:
Acquiring a first classification training sample from the first classification training sample set as a classification training sample to be subjected to countermeasure training;
adopting a sentence iteration generation method that iteratively appends each predicted character and then predicts the next character, and generating a sentence according to the generation sub-model and the first sentence sample text of the classification training sample to be subjected to countermeasure training, to obtain a generated sentence to be processed;
Acquiring a generated sentence calibration symbol, and generating a sample according to the generated sentence calibration symbol and the generated sentence to be processed to obtain a discrimination and classification training sample;
Respectively inputting the first sentence sample text of the classification training sample to be subjected to countermeasure training and the second sentence sample text of the discrimination classification training sample into the discrimination sub-model to predict the authenticity probability, so as to obtain a first authenticity probability predicted value corresponding to the classification training sample to be subjected to countermeasure training and a second authenticity probability predicted value corresponding to the discrimination classification training sample;
adopting an iterative optimization training method, and performing countermeasure training on the generation sub-model and the discrimination sub-model according to the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be subjected to countermeasure training, and the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample;
repeating the step of acquiring a first classification training sample from the first classification training sample set as a classification training sample to be subjected to countermeasure training until a countermeasure training convergence condition is met, and taking the generation sub-model meeting the countermeasure training convergence condition as the sentence generation model.
Further, the step of adopting the sentence iteration generation method that iteratively appends each predicted character and then predicts the next character, and generating a sentence according to the generation sub-model and the first sentence sample text of the classification training sample to be subjected to countermeasure training to obtain the generated sentence to be processed includes:
extracting a language fragment from the first sentence sample text of the classification training sample to be subjected to countermeasure training by adopting a preset language fragment extraction rule and a mode of extracting from the beginning, and taking the language fragment as a language fragment to be subjected to prediction splicing;
And through the generation sub-model, adopting the sentence iteration generation method that iteratively appends each predicted character and then predicts the next character, and generating a sentence according to the language fragment to be predicted and spliced, to obtain the generated sentence to be processed.
Further, the step of, through the generation sub-model, adopting the sentence iteration generation method that iteratively appends each predicted character and then predicts the next character, and generating a sentence according to the language fragment to be predicted and spliced to obtain the generated sentence to be processed, includes the following steps:
Taking the language fragments to be predicted and spliced as texts to be predicted;
Inputting the text to be predicted into the generation submodel to predict the next character to obtain a character predicted value to be spliced;
sequentially splicing the text to be predicted and the character predicted value to be spliced to obtain a spliced text;
taking the spliced text as the text to be predicted;
Repeatedly executing the step of inputting the text to be predicted into the generation sub-model to predict the next character to obtain a character predicted value to be spliced until the number of characters of the text to be predicted reaches a character prediction convergence condition;
and taking the text to be predicted as the generated sentence to be processed.
Further, the step of adopting an iterative optimization training method and performing the countermeasure training on the generation sub-model and the discrimination sub-model according to the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be subjected to countermeasure training, and the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample, includes:
Obtaining a model identifier to be optimized, and taking the identifier of the generation sub-model as the model identifier to be optimized when the model identifier to be optimized is empty;
When the model identifier to be optimized is the identifier of the generation sub-model: calculating a loss value according to the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be subjected to countermeasure training to obtain a first loss value of the generation sub-model, and updating the parameters of the generation sub-model according to the first loss value; calculating a loss value according to the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample to obtain a second loss value of the generation sub-model, and updating the parameters of the generation sub-model according to the second loss value; judging whether the first loss value and the second loss value reach a first convergence condition, or whether the number of iterations of the generation sub-model reaches a second convergence condition; and when the first loss value and the second loss value reach the first convergence condition, or the number of iterations of the generation sub-model reaches the second convergence condition, taking the identifier of the discrimination sub-model as the model identifier to be optimized;
When the model identifier to be optimized is the identifier of the discrimination sub-model: calculating a loss value according to the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be subjected to countermeasure training to obtain a third loss value of the discrimination sub-model, and updating the parameters of the discrimination sub-model according to the third loss value; calculating a loss value according to the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample to obtain a fourth loss value of the discrimination sub-model, and updating the parameters of the discrimination sub-model according to the fourth loss value; judging whether the third loss value and the fourth loss value reach a third convergence condition, or whether the number of iterations of the discrimination sub-model reaches a fourth convergence condition; and when the third loss value and the fourth loss value reach the third convergence condition, or the number of iterations of the discrimination sub-model reaches the fourth convergence condition, taking the identifier of the generation sub-model as the model identifier to be optimized.
The application also provides a text classification prediction device, which comprises:
The data acquisition module is used for acquiring target text data;
The text classification prediction module is used for inputting the target text data into a target text classification model to perform text classification prediction, wherein the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method;
and the target text classification prediction result determining module is used for acquiring a target text classification prediction result output by the target text classification model.
The application also proposes a computer device comprising a memory storing a computer program and a processor implementing the steps of any of the methods described above when the processor executes the computer program.
The application also proposes a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the method of any of the above.
The text classification prediction method, device, equipment and storage medium acquire target text data; input the target text data into a target text classification model for text classification prediction, wherein the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method; and obtain the target text classification prediction result output by the target text classification model. Training samples of various emotions are rapidly expanded by means of the generation sub-model and the discrimination sub-model, and the Bert model, the full connection layer and the Softmax activation function are trained on the expanded training samples together with labeled texts determined from real data, yielding a target text classification model with excellent generalization capability and improving the success rate of text classification prediction on the target text data.
Drawings
FIG. 1 is a flow chart of a text classification prediction method according to an embodiment of the application;
FIG. 2 is a block diagram schematically illustrating a text classification predicting apparatus according to an embodiment of the present application;
Fig. 3 is a schematic block diagram of a computer device according to an embodiment of the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
Referring to fig. 1, in an embodiment of the present application, there is provided a text classification prediction method, including:
s1: acquiring target text data;
s2: inputting the target text data into a target text classification model for text classification prediction, wherein the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method;
s3: and obtaining a target text classification prediction result output by the target text classification model.
This embodiment acquires target text data; inputs the target text data into a target text classification model for text classification prediction, wherein the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method; and obtains the target text classification prediction result output by the target text classification model. Training samples of various emotions are rapidly expanded by means of the generation sub-model and the discrimination sub-model, and the Bert model, the full connection layer and the Softmax activation function are trained on the expanded training samples together with samples obtained from labeled texts determined from real data, yielding a target text classification model with excellent generalization capability and improving the success rate of text classification prediction on the target text data.
For S1, target text data input by the user may be obtained, target text data may be obtained from a database, or target text data may be obtained from a third party application system.
The target text data is text data for which a text classification prediction is required. The target text data may be a sentence.
And S2, inputting the target text data into a target text classification model to conduct text classification prediction, and outputting a text classification prediction result by the target text classification model.
The method performs countermeasure training with the generation sub-model and the discrimination sub-model, and takes the generation sub-model after countermeasure training as the sentence generation model, so that the sentence generation model can generate sentences as realistic as the labeled texts determined from real data; it then generates sentences from language fragments using a prediction-and-splicing iteration method and the sentence generation model, determines generated sentence samples from the generated sentences, combines the generated sentence samples with samples obtained from labeled texts determined from real data into a second classification training sample set, and trains the Bert model, the full connection layer and the Softmax activation function with the second classification training sample set, obtaining a target text classification model with excellent generalization capability and improving the success rate of text classification prediction on the target text data.
The Bert model is a pre-trained language characterization model.
The full connection layer integrates the features extracted by the preceding layers to realize the function of a classifier; each of its nodes is connected to all nodes of the previous layer.
The Softmax activation function is a normalization function used in multi-class classification.
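As an illustration of how these three components fit together, below is a minimal PyTorch sketch of such a classifier: a pretrained Bert encoder followed by a full connection layer and a Softmax over class probabilities. The checkpoint name bert-base-chinese and the class count of 3 are assumptions made for illustration, not values fixed by the application.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class TextClassifier(nn.Module):
    def __init__(self, num_classes: int = 3, bert_name: str = "bert-base-chinese"):
        super().__init__()
        # pretrained language characterization model
        self.bert = BertModel.from_pretrained(bert_name)
        # full connection layer: each output node connects to every Bert hidden unit
        self.fc = nn.Linear(self.bert.config.hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls_vec = out.last_hidden_state[:, 0]  # representation at the start character
        logits = self.fc(cls_vec)
        return torch.softmax(logits, dim=-1)   # normalized probability vector
```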
For S3, the text classification prediction result output by the target text classification model is taken as the target text classification prediction result corresponding to the target text data. That is, the target text classification prediction result is a probability vector, and each probability in it corresponds to one classification label.
Optionally, after the step of obtaining the target text classification prediction result output by the target text classification model, the method further includes: finding out the maximum value from the target text classification prediction result to obtain target probability; and taking the classification label corresponding to the target probability as a target classification label of the target text data.
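Continuing the hypothetical sketch above, selecting the target classification label amounts to taking the maximum of the probability vector; the label names here are placeholders:

```python
labels = ["positive", "neutral", "negative"]  # hypothetical classification labels
# model: a trained TextClassifier from the sketch above, batch of one sentence
probs = model(input_ids, attention_mask)      # probability vector
target_probability, idx = probs.max(dim=-1)   # maximum value = target probability
target_label = labels[idx.item()]             # target classification label
```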
In one embodiment, before the step of inputting the target text data into the target text classification model for text classification prediction, the method further includes:
s21: acquiring a first classification training sample set, wherein each first classification training sample in the first classification training sample set comprises: a first sentence sample text and first sentence authenticity calibration data;
S22: performing countermeasure training on the generating sub-model and the judging sub-model according to the first classification training sample set by adopting an iterative optimization training method, and taking the generating sub-model after the countermeasure training as a sentence generating model, wherein the generating sub-model is a model obtained based on a circulating neural network, and the judging sub-model is a model obtained based on a fully connected network or a convolutional neural network;
s23: acquiring a plurality of language fragments to be predicted;
S24: respectively inputting each language fragment to be predicted in the plurality of language fragments to be predicted into the sentence generating model to generate sentences by adopting a prediction and splicing iteration method, so as to obtain a plurality of generated sentence texts;
S25: acquiring classification calibration data corresponding to each generated sentence text in the generated sentence texts, and generating samples according to the generated sentence texts and the classification calibration data to obtain generated sentence samples;
S26: acquiring a plurality of classification training samples to be expanded, and taking the plurality of classification training samples to be expanded and the plurality of generated sentence samples as a second classification training sample set;
S27: training a text classification initial model according to the second classification training sample set by adopting an MLM training method, and taking the text classification initial model after training as the target text classification model, wherein the text classification initial model is a model obtained based on the Bert model, the full connection layer and the Softmax activation function.
This embodiment performs countermeasure training with the generation sub-model and the discrimination sub-model and takes the generation sub-model after countermeasure training as the sentence generation model, so that the sentence generation model can generate sentences as realistic as the labeled texts determined from real data; it then generates sentences from language fragments using a prediction-and-splicing iteration method and the sentence generation model, determines generated sentence samples from the generated sentences, combines the generated sentence samples with samples obtained from labeled texts determined from real data into a second classification training sample set, and trains the Bert model, the full connection layer and the Softmax activation function with this set. The Bert model, the full connection layer and the Softmax activation function are thus trained on the expanded training samples together with the samples obtained from labeled texts determined from real data, obtaining a target text classification model with excellent generalization capability and improving the success rate of text classification prediction on the target text data.
For S21, the first classification training sample set input by the user may be obtained, or the first classification training sample set may be obtained from the database, or the first classification training sample set may be obtained from the third party application system.
The first classification training sample set comprises a plurality of first classification training samples. The first sentence authenticity calibration data is set to the authentic sentence calibration symbol; that is, the first sentence authenticity calibration data of all the first classification training samples is set to the authentic sentence calibration symbol. For example, if the authentic sentence calibration symbol is 1, the first sentence authenticity calibration data of all the first classification training samples is 1; this is not specifically limited here.
The first sentence sample text is a sentence.
For S22, the first n characters of the first sentence sample text of each first classification training sample in the first classification training sample set are input into the generation sub-model to predict the (n+1)th character; the (n+1)th character is spliced after the n characters to obtain n+1 characters; the n+1 characters are then input into the generation sub-model to predict the (n+2)th character, and the (n+2)th character is spliced after the n+1 characters to obtain n+2 characters; prediction and splicing continue in a loop until an end condition is reached, yielding a plurality of generated sentences to be processed. Sample generation is then carried out according to the generated sentence calibration symbol and each generated sentence to be processed to obtain discrimination training samples; an iterative optimization training method is adopted to perform countermeasure training on the generation sub-model and the discrimination sub-model according to the first classification training sample set and each discrimination training sample, and the generation sub-model after countermeasure training is taken as the sentence generation model.
When the generation sub-model and the discrimination sub-model are countermeasure trained according to the first classification training sample set and each discrimination training sample, the parameters of the discrimination sub-model are kept unchanged while the parameters of the generation sub-model are updated; then the parameters of the generation sub-model are kept unchanged while the parameters of the discrimination sub-model are updated; and this alternation is repeated until the countermeasure training convergence condition is reached.
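Below is a minimal PyTorch sketch of this alternating optimization, with toy stand-in networks so the loop actually runs; in the application the generation sub-model is a recurrent network over characters and the discrimination sub-model a fully connected or convolutional network, and the loss terms here follow the standard adversarial objective rather than an exact formula fixed by the application:

```python
import torch
import torch.nn as nn

# Toy stand-ins so the loop is runnable; shapes are illustrative assumptions.
generator = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 64))
discriminator = nn.Sequential(nn.Linear(64, 32), nn.ReLU(),
                              nn.Linear(32, 1), nn.Sigmoid())

bce = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

for step in range(100):
    real_batch = torch.randn(8, 64)             # placeholder for encoded real sentences
    fake_batch = generator(torch.randn(8, 16))  # placeholder for generated sentences
    real_y = torch.ones(8, 1)                   # authenticity calibration: real = 1
    fake_y = torch.zeros(8, 1)                  # authenticity calibration: generated = 0

    # Update the generation sub-model; only its parameters are stepped.
    g_opt.zero_grad()
    g_loss = bce(discriminator(fake_batch), real_y)  # rewarded for fooling the discriminator
    g_loss.backward()
    g_opt.step()

    # Update the discrimination sub-model; the generator output is detached.
    d_opt.zero_grad()
    d_loss = bce(discriminator(real_batch), real_y) + \
             bce(discriminator(fake_batch.detach()), fake_y)
    d_loss.backward()
    d_opt.step()
```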
For S23, a plurality of language fragments to be predicted input by the user may be obtained, a plurality of language fragments to be predicted may be obtained from the database, and a plurality of language fragments to be predicted may be obtained from the third party application system.
Each of the plurality of language fragments to be predicted includes one or more words. It is to be understood that the words in the language segment to be predicted may be continuous words in a sentence or discontinuous words in a sentence, which is not limited herein specifically.
For S24: acquire a preset sentence length; take a target language fragment to be predicted as the language fragment to be processed, where the target language fragment to be predicted is any one of the language fragments to be predicted; input the language fragment to be processed into the sentence generation model to predict the next character, obtaining the character to be processed; splice the character to be processed after the language fragment to be processed to obtain a spliced language fragment; take the spliced language fragment as the language fragment to be processed; repeat the step of inputting the language fragment to be processed into the sentence generation model to predict the next character until the number of characters of the language fragment to be processed equals the preset sentence length; and take the language fragment to be processed as the generated sentence text corresponding to the target language fragment to be predicted.
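A minimal sketch of this predict-and-splice loop follows, with a small character-level GRU standing in for the sentence generation model; the vocabulary, network sizes and preset sentence length are assumptions for illustration:

```python
import torch
import torch.nn as nn

vocab = list("我喜欢北京的风景。")                  # tiny illustrative character vocabulary
char2id = {c: i for i, c in enumerate(vocab)}

class CharRNN(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, ids):
        h, _ = self.rnn(self.emb(ids))
        return self.out(h[:, -1])                 # logits for the next character

def generate(model, fragment: str, preset_len: int = 10) -> str:
    text = fragment                               # language fragment to be processed
    while len(text) < preset_len:                 # stop at the preset sentence length
        ids = torch.tensor([[char2id[c] for c in text]])
        next_id = model(ids).argmax(dim=-1).item()  # predict the next character
        text = text + vocab[next_id]              # splice it after the fragment
    return text

model = CharRNN(len(vocab))
print(generate(model, "我喜欢"))                  # untrained, so output is arbitrary
```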
For S25, the classification calibration data corresponding to each of the plurality of generated sentence texts may be obtained from the database, the classification calibration data corresponding to each of the plurality of generated sentence texts may be obtained from the third party application system, and the classification calibration data corresponding to each of the plurality of generated sentence texts input by the user may be obtained.
And taking the target generated sentence text as a sample text of a generated sentence sample corresponding to the target generated sentence text and to be trained in a classification mode, and taking the classification calibration data corresponding to the target generated sentence text as the classification calibration data of the generated sentence sample corresponding to the target generated sentence text, wherein the target generated sentence text is any one of the generated sentence texts.
For S26, a plurality of classification training samples to be expanded input by the user may be obtained, a plurality of classification training samples to be expanded may be obtained from the database, and a plurality of classification training samples to be expanded may be obtained from the third party application system.
And taking the plurality of classification training samples to be expanded and the plurality of generated sentence samples as one set, and taking the set as the second classification training sample set.
Each of the plurality of class training samples to be expanded includes: sample text to be classified and training, and classification calibration data, wherein the sample text to be classified and training and the classification calibration data are arranged in a one-to-one correspondence manner.
For S27, the MLM training method is adopted: 15% of the characters in a sentence are randomly selected for masking, and of the selected characters, 80% are actually replaced with the mask character, 10% are replaced with a random character, and 10% are left unchanged.
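A sketch of this 15% / 80-10-10 rule over a list of token ids is shown below; the mask id, the vocabulary size and the -100 ignore-index convention for unselected positions are assumptions for illustration:

```python
import random

MASK_ID = 103        # [MASK] id in many Bert vocabularies; an assumption here
VOCAB_SIZE = 21128   # vocabulary size of bert-base-chinese; an assumption here

def mlm_mask(token_ids):
    masked = list(token_ids)
    labels = [-100] * len(token_ids)      # -100 marks positions excluded from the MLM loss
    for i in range(len(token_ids)):
        if random.random() < 0.15:        # select 15% of the characters
            labels[i] = token_ids[i]      # the model must recover the original character
            r = random.random()
            if r < 0.8:                   # 80% of selected: replace with the mask character
                masked[i] = MASK_ID
            elif r < 0.9:                 # 10% of selected: replace with a random character
                masked[i] = random.randrange(VOCAB_SIZE)
            # remaining 10% of selected: keep the original character
    return masked, labels
```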
The specific steps of training the text classification initial model according to the second classification training sample set by adopting the MLM training method are detailed in the embodiment below and are not repeated here.
In one embodiment, the step of training the text classification initial model according to the second classification training sample set by using the MLM training method and using the text classification initial model after training as the target text classification model includes:
S271: extracting a second classification training sample from the second classification training sample set as a target classification training sample;
S272: inputting a sample text to be classified and trained of the target classification training sample into an embedding layer of the text classification initial model for marking analysis and character adjustment by adopting a fixed character length to obtain an adjusted sample text;
S273: training the text classification initial model according to the adjusted sample text and the classification calibration data of the target classification training sample by adopting the MLM training method;
s274: repeating the step of extracting a second classification training sample from the second classification training sample set as a target classification training sample until a classification training convergence condition is met;
s275, the text classification initial model meeting the classification training convergence condition is used as the target text classification model.
In the embodiment, the sample text to be classified and trained is firstly subjected to marking analysis and character adjustment by adopting a fixed character length, and the text classification initial model is trained by adopting the adjusted sample text, so that parallel training of the model is facilitated, and the training efficiency of the model is improved.
For S271, one second classification training sample is sequentially extracted from the second classification training sample set, and the extracted second classification training sample is used as the target classification training sample.
And S272, inputting the sample text to be classified and trained of the target classification training sample into an embedding layer of the text classification initial model for marking analysis, and then carrying out character adjustment by adopting a fixed character length according to the result of marking analysis, so as to obtain an adjusted sample text. That is, the number of characters of the adjusted sample text is the same as the fixed character length.
Mark analysis means adding a start character and a stop character to the sample text to be classified and trained of the target classification training sample. For example, mark analysis of the sample text "I like the scenery of Beijing" yields "<CLS> I like the scenery of Beijing <SEP>", where "<CLS>" is the start character and "<SEP>" is the stop character; this is not specifically limited here.
Optionally, the fixed character length is set to 128; this is an example and is not specifically limited here.
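As an illustration, the mark analysis and fixed-length adjustment can be reproduced with the HuggingFace tokenizer, which adds the start and stop characters (written [CLS] and [SEP] in that library) and pads or truncates to the fixed character length; the checkpoint name is an assumption:

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
enc = tokenizer(
    "我喜欢北京的风景",     # sample text to be classified and trained
    padding="max_length",   # pad up to the fixed character length
    truncation=True,        # or truncate down to it
    max_length=128,         # the fixed character length discussed above
    return_tensors="pt",
)
# enc["input_ids"] now has exactly 128 positions: [CLS] ... [SEP] [PAD] ...
```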
For S273, the MLM training method is adopted, and a stochastic gradient descent algorithm is used for parameter optimization when training the text classification initial model according to the adjusted sample text and the classification calibration data of the target classification training sample.
For S274, S271 to S274 are repeatedly performed until the classification training convergence condition is satisfied, thereby obtaining a text classification initial model having excellent generalization ability.
Classification training convergence condition: the loss value of the text classification initial model reaches the loss value convergence condition, or the number of training iterations of the text classification initial model reaches the classification training count convergence condition.
The loss value convergence condition means that the change between the loss values of the text classification initial model calculated in two adjacent iterations satisfies the Lipschitz condition.
The classification training count convergence condition is a specific numerical value.
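One illustrative reading of these two conditions is the check below: training stops when two adjacent loss values differ by at most a small constant, or when a fixed iteration budget is reached; both thresholds are assumptions, not values fixed by the application:

```python
def converged(prev_loss: float, curr_loss: float, iteration: int,
              max_delta: float = 1e-4, max_iterations: int = 10000) -> bool:
    # loss value convergence: adjacent loss values differ by at most max_delta
    loss_stable = abs(curr_loss - prev_loss) <= max_delta
    # count convergence: the classification training iteration budget is used up
    budget_reached = iteration >= max_iterations
    return loss_stable or budget_reached
```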
For S275, the text classification initial model satisfying the classification training convergence condition has already met an expected training target, and thus the text classification initial model satisfying the classification training convergence condition may be taken as the target text classification model.
In one embodiment, the step of performing the countermeasure training on the generation sub-model and the discrimination sub-model according to the first classification training sample set by using the iterative optimization training method, and taking the generation sub-model after the countermeasure training as the sentence generation model includes:
S221: acquiring a first classification training sample from the first classification training sample set as a classification training sample to be subjected to countermeasure training;
s222: adopting the sentence iteration generation method that iteratively appends each predicted character and then predicts the next character, and generating a sentence according to the generation sub-model and the first sentence sample text of the classification training sample to be subjected to countermeasure training, to obtain a generated sentence to be processed;
s223: acquiring a generated sentence calibration symbol, and generating a sample according to the generated sentence calibration symbol and the generated sentence to be processed to obtain a discrimination and classification training sample;
S224: respectively inputting a first sentence sample text of the classification training sample to be subjected to countermeasure training and a second subsampled text of the discrimination classification training sample into the discrimination subsampled model to predict the authenticity probability, so as to obtain a first authenticity probability predicted value corresponding to the classification training sample to be subjected to countermeasure training and a second authenticity probability predicted value corresponding to the discrimination classification training sample;
s225: adopting an iterative optimization training method, and performing countermeasure training on the generation sub-model and the discrimination sub-model according to the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be subjected to countermeasure training, and the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample;
S226: repeating the step of acquiring a first classification training sample from the first classification training sample set as a classification training sample to be subjected to countermeasure training until a countermeasure training convergence condition is met, and taking the generation sub-model meeting the countermeasure training convergence condition as the sentence generation model.
In this embodiment, an iterative optimization training method is adopted, and countermeasure training is performed on the generation sub-model and the discrimination sub-model according to the first classification training sample set, so that the resulting sentence generation model can generate sentences as realistic as the labeled texts determined from real data.
For S221, a first classification training sample is sequentially obtained from the first classification training sample set, and the obtained first classification training sample is used as the classification training sample to be subjected to countermeasure training.
For S222, the first n characters of the first sentence sample text of the classification training sample to be subjected to countermeasure training are input into the generation sub-model to predict the (n+1)th character; the (n+1)th character is spliced after the n characters to obtain n+1 characters; the n+1 characters are then input into the generation sub-model to predict the (n+2)th character, and the (n+2)th character is spliced after the n+1 characters to obtain n+2 characters; prediction and splicing continue in a loop until an end condition is reached, yielding the generated sentence to be processed corresponding to the first sentence sample text of the classification training sample to be subjected to countermeasure training.
For S223, the generated sentence calibration symbol input by the user may be obtained, the generated sentence calibration symbol may be obtained from the database, the generated sentence calibration symbol may be obtained from the third party application system, or the generated sentence calibration symbol may be written into a program for implementing the present application.
The generated sentence to be processed is used as the second sentence sample text of the discrimination classification training sample, and the generated sentence calibration symbol is used as the second sentence authenticity calibration data of the discrimination classification training sample. That is, each generated sentence to be processed corresponds to one discrimination classification training sample.
For example, if the generated sentence calibration symbol is 0, the second sentence authenticity calibration data of the classification training sample is determined to be 0, which is not specifically limited herein.
For S224, the first sentence sample text of the classification training sample to be subjected to countermeasure training is input into the discrimination sub-model to predict the authenticity probability, obtaining the first authenticity probability predicted value corresponding to the classification training sample to be subjected to countermeasure training; and the second sentence sample text of the discrimination classification training sample is input into the discrimination sub-model to predict the authenticity probability, obtaining the second authenticity probability predicted value corresponding to the discrimination classification training sample.
For S225, countermeasure training is performed on the generation sub-model and the discrimination sub-model according to the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be subjected to countermeasure training, and the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample, by alternately keeping the parameters of the discrimination sub-model unchanged while updating the parameters of the generation sub-model, and keeping the parameters of the generation sub-model unchanged while updating the parameters of the discrimination sub-model.
For S226, S221 to S226 are repeatedly executed until an countermeasure training convergence condition is satisfied, and the generation sub-model satisfying the countermeasure training convergence condition is taken as the sentence generation model.
The countermeasure training convergence condition includes: the first loss value and the second loss value of the generation sub-model reach the first convergence condition and the third loss value and the fourth loss value of the discrimination sub-model reach the third convergence condition, or the number of countermeasure training iterations reaches a fifth convergence condition.
When the parameters of the generation sub-model need to be updated, a loss value is calculated according to the first authenticity probability predicted value and the first sentence authenticity calibration data corresponding to the classification training sample to be subjected to countermeasure training, obtaining the first loss value of the generation sub-model, and a loss value is calculated according to the second authenticity probability predicted value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample, obtaining the second loss value of the generation sub-model. When the parameters of the discrimination sub-model need to be updated, a loss value is calculated according to the first authenticity probability predicted value and the first sentence authenticity calibration data corresponding to the classification training sample to be subjected to countermeasure training, obtaining the third loss value of the discrimination sub-model, and a loss value is calculated according to the second authenticity probability predicted value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample, obtaining the fourth loss value of the discrimination sub-model.
That is, when the first loss value and the second loss value of the generation sub-model reach the first convergence condition at the same time and the third loss value and the fourth loss value of the discrimination sub-model reach the third convergence condition at the same time, or the number of countermeasure training iterations reaches the fifth convergence condition, the generation sub-model is determined to be the sentence generation model.
The first convergence condition means that the change between the first loss values of the generation sub-model calculated in two adjacent iterations satisfies the Lipschitz condition (the Lipschitz continuity condition), and the change between the second loss values of the generation sub-model calculated in two adjacent iterations satisfies the Lipschitz condition.
The third convergence condition means that the change between the third loss values of the discrimination sub-model calculated in two adjacent iterations satisfies the Lipschitz condition, and the change between the fourth loss values of the discrimination sub-model calculated in two adjacent iterations satisfies the Lipschitz condition.
The number of countermeasure training iterations refers to the number of times a loss value of the generation sub-model or of the discrimination sub-model is calculated; that is, each calculation of the generation sub-model's loss values (the first and second loss values) or of the discrimination sub-model's loss values (the third and fourth loss values) increases the iteration count by 1.
The fifth convergence condition is a specific value.
In one embodiment, the step of generating a sentence according to the generation sub-model and the first sentence sample text of the classification training sample to be subjected to countermeasure training to obtain the generated sentence to be processed includes:
S2221: extracting a language fragment from the first sentence sample text of the classification training sample to be subjected to countermeasure training by adopting a preset language fragment extraction rule and a mode of extracting from the beginning, and taking the language fragment as a language fragment to be subjected to prediction splicing;
S2222: and iteratively adding predicted characters into a sentence iteration generating method for predicting the next character through the generating sub-model, and generating sentences according to the language fragments to be predicted and spliced to obtain the generated sentences to be processed.
In this embodiment, a language fragment is first extracted from the first sentence sample text of the classification training sample to be subjected to countermeasure training, using a preset language fragment extraction rule and extracting from the beginning of the text, and the extracted language fragment is then used to produce the generated sentence to be processed, providing a foundation for rapidly generating the discrimination classification training samples.
For S2221, a language fragment is extracted from the beginning of the first sentence sample text of the classification training sample to be subjected to countermeasure training, and the extracted language fragment is taken as the language fragment to be predicted and spliced, where the number of characters in the language fragment to be predicted and spliced equals the number of characters specified by the preset language fragment extraction rule.
For example, if the first sentence sample text of the classification training sample to be subjected to countermeasure training is "I like the scenery of Beijing", the language fragment "I like" is extracted from it using the preset language fragment extraction rule and extraction from the beginning, and the extracted language fragment "I like" is taken as the language fragment to be predicted and spliced; this is not specifically limited here.
For S2222, through the generation sub-model, sentence generation is performed on the language fragment to be predicted and spliced using the sentence iteration generation method that iteratively appends each predicted character and then predicts the next character, thereby completing the language fragment to be predicted and spliced into a full sentence.
In one embodiment, the step of, through the generation sub-model, adopting the sentence iteration generation method that iteratively appends each predicted character and then predicts the next character, and generating a sentence according to the language fragment to be predicted and spliced to obtain the generated sentence to be processed, includes:
S22221: taking the language fragments to be predicted and spliced as texts to be predicted;
S22222: inputting the text to be predicted into the generation submodel to predict the next character to obtain a character predicted value to be spliced;
S22223: sequentially splicing the text to be predicted and the character predicted value to be spliced to obtain a spliced text;
s22224: taking the spliced text as the text to be predicted;
S22225: repeatedly executing the step of inputting the text to be predicted into the generation sub-model to predict the next character to obtain a character predicted value to be spliced until the number of characters of the text to be predicted reaches a character prediction convergence condition;
s22226: and taking the text to be predicted as the generated sentence to be processed.
This embodiment adopts, through the generation sub-model, the sentence iteration generation method that iteratively appends each predicted character and then predicts the next character, generating a sentence from the language fragment to be predicted and spliced, thereby realizing automatic sentence generation.
And for S22222, inputting the text to be predicted into the generation submodel to predict the next character, and taking the predicted character as a character predicted value to be spliced.
And for S22223, splicing the character predicted value to be spliced behind the text to be predicted, and taking the spliced data as the spliced text.
For S22224, the spliced text is taken as the text to be predicted, so as to be taken as the basis of the next prediction.
For S22225, S22222 to S22225 are repeatedly executed until the number of characters of the text to be predicted reaches the character prediction convergence condition.
Optionally, the character prediction convergence condition is: subtract the number of characters of the text to be predicted from the number of characters of the first sentence sample text of the classification training sample to be subjected to countermeasure training to obtain a character count difference to be analyzed; iteration ends when the character count difference to be analyzed equals a preset character count difference, which may be equal to 0 or greater than 0.
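Read literally, this condition can be checked as follows; the preset character count difference of 0 is an assumption:

```python
def reached_char_convergence(sample_text: str, predicted_text: str,
                             preset_diff: int = 0) -> bool:
    # iteration ends when len(sample) - len(predicted) equals the preset difference
    return len(sample_text) - len(predicted_text) == preset_diff
```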
For S22226, the text to be predicted that reaches the character prediction convergence condition is taken as the generated sentence to be processed.
In an embodiment, the step of performing adversarial training on the generation sub-model and the discrimination sub-model by the iterative optimization training method, according to the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be adversarially trained, and the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample, includes:
S2241: obtaining the model identifier to be optimized, and, when the model identifier to be optimized is empty, taking the identifier of the generation sub-model as the model identifier to be optimized;
S2242: when the model identifier to be optimized is the identifier of the generation sub-model: calculating a loss value from the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be adversarially trained to obtain a first loss value of the generation sub-model, and updating the parameters of the generation sub-model according to the first loss value; calculating a loss value from the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample to obtain a second loss value of the generation sub-model, and updating the parameters of the generation sub-model according to the second loss value; judging whether the first loss value and the second loss value reach a first convergence condition, or whether the number of iterations of the generation sub-model reaches a second convergence condition; and, when the first loss value and the second loss value reach the first convergence condition or the number of iterations of the generation sub-model reaches the second convergence condition, taking the identifier of the discrimination sub-model as the model identifier to be optimized;
S2243: when the model identifier to be optimized is the identifier of the discrimination sub-model: calculating a loss value from the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be adversarially trained to obtain a third loss value of the discrimination sub-model, and updating the parameters of the discrimination sub-model according to the third loss value; calculating a loss value from the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample to obtain a fourth loss value of the discrimination sub-model, and updating the parameters of the discrimination sub-model according to the fourth loss value; judging whether the third loss value and the fourth loss value reach a third convergence condition, or whether the number of iterations of the discrimination sub-model reaches a fourth convergence condition; and, when the third loss value and the fourth loss value reach the third convergence condition or the number of iterations of the discrimination sub-model reaches the fourth convergence condition, taking the identifier of the generation sub-model as the model identifier to be optimized.
In this embodiment, adversarial training is performed repeatedly on the generation sub-model and the discrimination sub-model according to the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be adversarially trained, and the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample, alternating between keeping the parameters of the discrimination sub-model unchanged while updating the parameters of the generation sub-model, and keeping the parameters of the generation sub-model unchanged while updating the parameters of the discrimination sub-model, which improves the generalization capability of the generation sub-model.
For S2241, the model identifier to be optimized may be obtained from a database.
When the model identifier to be optimized is empty, adversarial training has not yet been performed, so the identifier of the generation sub-model is taken as the model identifier to be optimized, and the parameter update mode of "keeping the parameters of the discrimination sub-model unchanged and updating the parameters of the generation sub-model" is carried out first.
For S2242, when the model identifier to be optimized is the identifier of the generation sub-model, the parameter update mode of "keeping the parameters of the discrimination sub-model unchanged and updating the parameters of the generation sub-model" is required at this time.
When the first loss value and the second loss value reach the first convergence condition, or the number of iterations of the generation sub-model reaches the second convergence condition, continuing to update the parameters of the generation sub-model can no longer improve its effect; the identifier of the discrimination sub-model is therefore taken as the model identifier to be optimized, so that the adversarial training switches to the parameter update mode of "keeping the parameters of the generation sub-model unchanged and updating the parameters of the discrimination sub-model", thereby achieving iterative optimization.
The loss values of the generation sub-model may be calculated with a cross-entropy loss function.
The second convergence condition is a specific preset value for the number of iterations of the generation sub-model.
For S2243, when the model identifier to be optimized is the identifier of the discrimination sub-model, the parameter update mode of "keeping the parameters of the generation sub-model unchanged and updating the parameters of the discrimination sub-model" is carried out at this time.
When the third loss value and the fourth loss value reach the third convergence condition, or the number of iterations of the discrimination sub-model reaches the fourth convergence condition, continuing to update the parameters of the discrimination sub-model can no longer improve its effect; the identifier of the generation sub-model is therefore taken as the model identifier to be optimized, so that the adversarial training switches back to the parameter update mode of "keeping the parameters of the discrimination sub-model unchanged and updating the parameters of the generation sub-model", thereby achieving iterative optimization.
The loss values of the discrimination sub-model may be calculated with a cross-entropy loss function.
The fourth convergence condition is a specific preset value for the number of iterations of the discrimination sub-model.
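For illustration, the identifier-driven alternation of S2241 to S2243 can be sketched as follows. This is a schematic Python sketch using PyTorch's binary cross-entropy for the authenticity labels; the threshold and iteration-cap values are assumptions, the two separate per-loss parameter updates described above are folded into one combined step for brevity, and a real character-level generator would additionally need a technique such as policy gradients, since sampling discrete characters is not differentiable.

```python
import torch.nn.functional as F

LOSS_THRESHOLD = 0.05   # assumed first/third convergence condition
MAX_ITERATIONS = 1000   # assumed second/fourth convergence condition

def next_model_id(model_id, loss_a, loss_b, iterations):
    """S2241: decide which sub-model the next update should optimize."""
    if model_id is None:             # adversarial training has not started,
        return "generation"          # so begin by updating the generation sub-model
    if loss_a + loss_b < LOSS_THRESHOLD or iterations >= MAX_ITERATIONS:
        # Further updates would no longer improve the current sub-model,
        # so hand optimization over to the other sub-model.
        return "discrimination" if model_id == "generation" else "generation"
    return model_id

def update_step(model_id, g_optimizer, d_optimizer,
                first_pred, first_label, second_pred, second_label):
    """S2242/S2243: one parameter update with cross-entropy losses.

    Only the optimizer of the sub-model being optimized is stepped, so
    the other sub-model's parameters stay unchanged."""
    loss_a = F.binary_cross_entropy(first_pred, first_label)
    loss_b = F.binary_cross_entropy(second_pred, second_label)
    optimizer = g_optimizer if model_id == "generation" else d_optimizer
    optimizer.zero_grad()
    (loss_a + loss_b).backward()
    optimizer.step()
    return loss_a.item(), loss_b.item()
```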
Referring to fig. 2, the present application also proposes a text classification prediction apparatus, the apparatus comprising:
a data acquisition module 100 for acquiring target text data;
The text classification prediction module 200 is configured to input the target text data into a target text classification model for text classification prediction, where the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method;
and the target text classification prediction result determining module 300 is used for obtaining the target text classification prediction result output by the target text classification model.
This embodiment acquires target text data, inputs the target text data into a target text classification model for text classification prediction, where the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method, and obtains the target text classification prediction result output by the target text classification model. Training samples of various emotions are rapidly expanded by means of the generation sub-model and the discrimination sub-model, and the Bert model, the full connection layer and the Softmax activation function are trained with the expanded training samples together with labeled texts determined from real data, yielding a target text classification model with good generalization capability and thereby improving the success rate of text classification prediction on target text data.
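For illustration, the prediction path through the text classification prediction module 200 can be sketched as follows, assuming the Hugging Face transformers library; the checkpoint name and the number of classes are illustrative assumptions, and the MLM-based training itself is omitted here.

```python
import torch
from transformers import BertModel, BertTokenizer

class TextClassifier(torch.nn.Module):
    """Sketch of the target text classification model: a Bert model,
    one full connection layer, and a Softmax activation function."""
    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        self.fc = torch.nn.Linear(self.bert.config.hidden_size, num_classes)

    def forward(self, **encoded):
        pooled = self.bert(**encoded).pooler_output    # sentence representation
        return torch.softmax(self.fc(pooled), dim=-1)  # class probabilities

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = TextClassifier()
inputs = tokenizer("the target text data", return_tensors="pt")
probs = model(**inputs)                 # text classification prediction
prediction = probs.argmax(dim=-1)       # target text classification prediction result
```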
Referring to fig. 3, in an embodiment of the present application there is further provided a computer device, which may be a server; its internal structure may be as shown in fig. 3. The computer device includes a processor, a memory, a network interface and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store data involved in the text classification prediction method. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a text classification prediction method comprising the following steps: acquiring target text data; inputting the target text data into a target text classification model for text classification prediction, wherein the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method; and obtaining the target text classification prediction result output by the target text classification model.
This embodiment acquires target text data, inputs the target text data into a target text classification model for text classification prediction, where the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method, and obtains the target text classification prediction result output by the target text classification model. Training samples of various emotions are rapidly expanded by means of the generation sub-model and the discrimination sub-model, and the Bert model, the full connection layer and the Softmax activation function are trained with the expanded training samples together with labeled texts determined from real data, yielding a target text classification model with good generalization capability and thereby improving the success rate of text classification prediction on target text data.
An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements a text classification prediction method comprising the following steps: acquiring target text data; inputting the target text data into a target text classification model for text classification prediction, wherein the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method; and obtaining the target text classification prediction result output by the target text classification model.
The text classification prediction method thus executed acquires target text data, inputs the target text data into a target text classification model for text classification prediction, where the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method, and obtains the target text classification prediction result output by the target text classification model. Training samples of various emotions are rapidly expanded by means of the generation sub-model and the discrimination sub-model, and the Bert model, the full connection layer and the Softmax activation function are trained with the expanded training samples together with labeled texts determined from real data, yielding a target text classification model with good generalization capability and thereby improving the success rate of text classification prediction on target text data.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-volatile computer-readable storage medium; when executed, the program may comprise the flows of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided by the present application may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM), among others.
It should be noted that, in this document, the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article or method that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such a process, apparatus, article or method. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of other identical elements in the process, apparatus, article or method that comprises the element.
The foregoing description covers only preferred embodiments of the present application and is not intended to limit its scope of patent protection; all equivalent structures or equivalent processes made using the description and drawings of the present application, whether applied directly or indirectly in other related technical fields, are likewise included within the scope of patent protection of the application.

Claims (7)

1. A text classification prediction method, the method comprising:
acquiring target text data;
inputting the target text data into a target text classification model for text classification prediction, wherein the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method;
obtaining a target text classification prediction result output by the target text classification model;
wherein, before the step of inputting the target text data into a target text classification model for text classification prediction, the method further comprises:
acquiring a first classification training sample set, wherein each first classification training sample in the first classification training sample set comprises a first sentence sample text and first sentence authenticity calibration data;
performing adversarial training on the generation sub-model and the discrimination sub-model according to the first classification training sample set by an iterative optimization training method, and taking the generation sub-model after the adversarial training as a sentence generation model, wherein the generation sub-model is a model based on a recurrent neural network and the discrimination sub-model is a model based on a fully connected network or a convolutional neural network;
acquiring a plurality of language fragments to be predicted;
inputting each of the plurality of language fragments to be predicted into the sentence generation model for sentence generation by a prediction-and-splicing iteration method, to obtain a plurality of generated sentence texts;
acquiring classification calibration data corresponding to each of the plurality of generated sentence texts, and generating samples from the generated sentence texts and the classification calibration data to obtain a plurality of generated sentence samples;
acquiring a plurality of classification training samples to be expanded, and taking the plurality of classification training samples to be expanded and the plurality of generated sentence samples as a second classification training sample set;
training a text classification initial model according to the second classification training sample set by the MLM training method, and taking the trained text classification initial model as the target text classification model, wherein the text classification initial model is a model based on the Bert model, the full connection layer and the Softmax activation function;
wherein the step of training the text classification initial model according to the second classification training sample set by the MLM training method and taking the trained text classification initial model as the target text classification model comprises:
extracting a second classification training sample from the second classification training sample set as a target classification training sample;
inputting the sample text to be classification-trained of the target classification training sample into an embedding layer of the text classification initial model for marking analysis and character adjustment with a fixed character length, to obtain an adjusted sample text;
training the text classification initial model according to the adjusted sample text and the classification calibration data of the target classification training sample by the MLM training method;
repeating the step of extracting a second classification training sample from the second classification training sample set as a target classification training sample until a classification training convergence condition is met;
taking the text classification initial model that meets the classification training convergence condition as the target text classification model;
wherein the step of performing adversarial training on the generation sub-model and the discrimination sub-model according to the first classification training sample set by the iterative optimization training method and taking the generation sub-model after the adversarial training as a sentence generation model comprises:
acquiring a first classification training sample from the first classification training sample set as a classification training sample to be adversarially trained;
performing sentence generation according to the generation sub-model and the first sentence sample text of the classification training sample to be adversarially trained, by a sentence iteration generation method that iteratively adds each predicted character to the input for the next character prediction, to obtain a generated sentence to be processed;
acquiring a generated sentence calibration symbol, and generating a sample from the generated sentence calibration symbol and the generated sentence to be processed to obtain a discrimination classification training sample, wherein the generated sentence calibration symbol serves as second sentence authenticity calibration data of the discrimination classification training sample;
inputting the first sentence sample text of the classification training sample to be adversarially trained and the second sentence sample text of the discrimination classification training sample respectively into the discrimination sub-model for authenticity probability prediction, to obtain a first authenticity probability prediction value corresponding to the classification training sample to be adversarially trained and a second authenticity probability prediction value corresponding to the discrimination classification training sample;
performing adversarial training on the generation sub-model and the discrimination sub-model by the iterative optimization training method, according to the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be adversarially trained, and the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample;
repeating the step of acquiring a first classification training sample from the first classification training sample set as a classification training sample to be adversarially trained until an adversarial training convergence condition is met, and taking the generation sub-model that meets the adversarial training convergence condition as the sentence generation model.
2. The text classification prediction method according to claim 1, wherein the step of performing sentence generation according to the generation sub-model and the first sentence sample text of the classification training sample to be adversarially trained, by the sentence iteration generation method that iteratively adds each predicted character to the input for the next character prediction, to obtain a generated sentence to be processed, comprises:
extracting a language fragment from the beginning of the first sentence sample text of the classification training sample to be adversarially trained by a preset language fragment extraction rule, and taking the language fragment as a language fragment to be predicted and spliced;
performing sentence generation according to the language fragment to be predicted and spliced, through the generation sub-model and by the sentence iteration generation method that iteratively adds each predicted character to the input for the next character prediction, to obtain the generated sentence to be processed.
3. The text classification prediction method according to claim 2, wherein the step of performing sentence generation according to the language fragment to be predicted and spliced, through the generation sub-model and by the sentence iteration generation method that iteratively adds each predicted character to the input for the next character prediction, to obtain the generated sentence to be processed, comprises:
taking the language fragment to be predicted and spliced as a text to be predicted;
inputting the text to be predicted into the generation sub-model for next character prediction, to obtain a character prediction value to be spliced;
splicing the text to be predicted and the character prediction value to be spliced in order, to obtain a spliced text;
taking the spliced text as the text to be predicted;
repeating the step of inputting the text to be predicted into the generation sub-model for next character prediction to obtain a character prediction value to be spliced, until the number of characters of the text to be predicted reaches a character prediction convergence condition;
taking the text to be predicted as the generated sentence to be processed.
4. The text classification prediction method according to claim 1, wherein the step of performing adversarial training on the generation sub-model and the discrimination sub-model by the iterative optimization training method, according to the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be adversarially trained, and the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample, comprises:
obtaining the model identifier to be optimized, and, when the model identifier to be optimized is empty, taking the identifier of the generation sub-model as the model identifier to be optimized;
when the model identifier to be optimized is the identifier of the generation sub-model: calculating a loss value from the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be adversarially trained to obtain a first loss value of the generation sub-model, and updating the parameters of the generation sub-model according to the first loss value; calculating a loss value from the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample to obtain a second loss value of the generation sub-model, and updating the parameters of the generation sub-model according to the second loss value; judging whether the first loss value and the second loss value reach a first convergence condition, or whether the number of iterations of the generation sub-model reaches a second convergence condition; and, when the first loss value and the second loss value reach the first convergence condition or the number of iterations of the generation sub-model reaches the second convergence condition, taking the identifier of the discrimination sub-model as the model identifier to be optimized;
when the model identifier to be optimized is the identifier of the discrimination sub-model: calculating a loss value from the first authenticity probability prediction value and the first sentence authenticity calibration data corresponding to the classification training sample to be adversarially trained to obtain a third loss value of the discrimination sub-model, and updating the parameters of the discrimination sub-model according to the third loss value; calculating a loss value from the second authenticity probability prediction value and the second sentence authenticity calibration data corresponding to the discrimination classification training sample to obtain a fourth loss value of the discrimination sub-model, and updating the parameters of the discrimination sub-model according to the fourth loss value; judging whether the third loss value and the fourth loss value reach a third convergence condition, or whether the number of iterations of the discrimination sub-model reaches a fourth convergence condition; and, when the third loss value and the fourth loss value reach the third convergence condition or the number of iterations of the discrimination sub-model reaches the fourth convergence condition, taking the identifier of the generation sub-model as the model identifier to be optimized.
5. A text classification prediction device for performing the method of any of claims 1-4, the device comprising:
The data acquisition module is used for acquiring target text data;
The text classification prediction module is used for inputting the target text data into a target text classification model for text classification prediction, wherein the target text classification model is a model obtained by training according to a generation sub-model, a discrimination sub-model, a Bert model, a full connection layer, a Softmax activation function and an MLM training method;
and the target text classification prediction result determining module is used for acquiring a target text classification prediction result output by the target text classification model.
6. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 4 when the computer program is executed.
7. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 4.