CN115510232A - Text sentence classification method and classification device, electronic equipment and storage medium


Info

Publication number
CN115510232A
Authority
CN
China
Prior art keywords
text
sample
sentence
feature
positive sample
Legal status
Pending
Application number
CN202211201108.8A
Other languages
Chinese (zh)
Inventor
欧阳升 (Ouyang Sheng)
王健宗 (Wang Jianzong)
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202211201108.8A
Publication of CN115510232A


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/205 - Parsing
    • G06F40/216 - Parsing using statistical methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application provides a text sentence classification method and device, electronic equipment and a storage medium, and belongs to the technical field of artificial intelligence. The method comprises the following steps: inputting each sample pair data of the training sample set into an initial text sentence classification model, and respectively extracting the features of the sample sentences to obtain a first text feature and a second text feature; performing feature constraint on the first text feature to update the first text feature; performing text classification processing on the first text features to obtain a plurality of sample prediction probability values; numerically comparing the plurality of sample prediction probability values to determine a target sample label; adjusting model parameters of the initial text statement classification model according to the positive sample label and the target sample label to obtain a target text statement classification model; and classifying the obtained initial text sentences through a target text sentence classification model to obtain target classes. The method and the device for classifying the text sentences can improve the accuracy of text sentence classification.

Description

Text sentence classification method and classification device, electronic equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a text sentence classification method and classification apparatus, an electronic device, and a storage medium.
Background
Currently, the text classification task is a fundamental and important task in natural language processing, with a great number of applications in industrial scenarios such as negative emotion recognition and intention recognition. However, in actual industrial scenarios, problems such as a small amount of labeled data, unbalanced sample distribution, and weak relevance between text content and labels cause existing text classification models to perform poorly. Therefore, how to improve the accuracy of text sentence classification in few-sample scenarios has become a technical problem to be solved urgently.
Disclosure of Invention
The embodiments of the present application mainly aim to provide a text sentence classification method and classification device, an electronic device and a storage medium, so as to improve the accuracy with which a model classifies text sentences.
In order to achieve the above object, a first aspect of an embodiment of the present application provides a text sentence classification method, where the method includes:
obtaining a training sample set, wherein the training sample set comprises a plurality of sample pair data, and each sample pair data comprises a positive sample statement, a positive sample label corresponding to the positive sample statement, a negative sample statement and a negative sample label corresponding to the negative sample statement;
acquiring an initial text sentence classification model, wherein the initial text sentence classification model comprises a pre-training sub-model, a feature constraint sub-model and a text classification sub-model;
inputting the positive sample sentences and the negative sample sentences of each sample pair data into the initial text sentence classification model, and respectively performing feature extraction on the positive sample sentences and the negative sample sentences through the pre-training submodels to obtain first text features of the positive sample sentences and second text features of the negative sample sentences;
performing feature constraint on the first text feature through the feature constraint sub-model and the second text feature to update the first text feature;
performing text classification processing on the first text features through the text classification submodel to obtain a sample prediction probability value of each class label of the positive sample sentence;
numerically comparing the sample prediction probability values of the positive sample statement to determine a target sample label of the positive sample statement;
adjusting model parameters of the initial text statement classification model according to the positive sample label and the target sample label of the positive sample statement, and continuing to train the adjusted initial text statement classification model based on the training sample set until a model loss value of the initial text statement classification model meets a preset training end condition to obtain a target text statement classification model;
and acquiring an initial text sentence to be classified, and classifying the initial text sentence through the target text sentence classification model to obtain a target category.
In some embodiments, after the text classification processing is performed on the first text feature through the text classification submodel to obtain a sample prediction probability value that the positive sample sentence belongs to each class label, the method further includes:
performing feature constraint calculation on the first text feature, the second text feature, the positive sample label and the negative sample label through the feature constraint submodel to obtain a comparison loss value;
obtaining a cross entropy loss value according to the positive sample label and the sample prediction probability value;
and obtaining a model loss value according to the contrast loss value and the cross entropy loss value.
In some embodiments, the pre-training submodel includes a feature coding process and a self-attention process, the inputting the positive sample sentence and the negative sample sentence of each sample pair data into the initial text sentence classification model, and performing feature extraction on the positive sample sentence and the negative sample sentence through the pre-training submodel to obtain a first text feature of the positive sample sentence and a second text feature of the negative sample sentence, including:
inputting the positive sample statements and the negative sample statements of each of the sample pair data to the initial text statement classification model;
respectively carrying out the feature coding processing on each text word in the positive sample sentence and each text word in the negative sample sentence to obtain a positive sample word feature corresponding to the positive sample sentence and a negative sample word feature corresponding to the negative sample sentence;
and respectively carrying out the self-attention processing on all the positive sample word features and all the negative sample word features to obtain first text features of the positive sample sentences and second text features of the negative sample sentences.
In some embodiments, prior to said inputting said positive sample statements and said negative sample statements of each said sample pair data to said initial text statement classification model, said method further comprises:
respectively carrying out length comparison on the positive sample statement and the negative sample statement according to a preset text length threshold, when the text length of the positive sample statement is smaller than the text length threshold, carrying out zero filling operation on the positive sample statement according to the text length threshold, and updating the positive sample statement until the text length of the positive sample statement is equal to the text length threshold;
and when the text length of the negative sample sentence is smaller than the text length threshold, carrying out zero filling operation on the negative sample sentence according to the text length threshold, and updating the negative sample sentence until the text length of the negative sample sentence is equal to the text length threshold.
In some embodiments, the performing, by the feature constraint sub-model, a feature constraint calculation on the first text feature, the second text feature, the positive exemplar label, and the negative exemplar label to obtain a contrast loss value includes:
comparing the labels of the positive sample label and the negative sample label according to a preset indication function, and determining a contrast coefficient;
and performing feature constraint calculation according to the first text feature, the second text feature and the contrast coefficient to obtain a contrast loss value.
In some embodiments, said deriving a cross entropy loss value from said positive sample label and said sample prediction probability value comprises:
acquiring the number of text words of the positive sample sentence;
and calculating the text word number, the positive sample label and the sample prediction probability value according to a preset cross entropy loss function to obtain a cross entropy loss value.
In order to achieve the above object, a second aspect of the embodiments of the present application provides a text sentence classification apparatus, including:
a sample set obtaining module, configured to obtain a training sample set, where the training sample set includes multiple sample pair data, and each sample pair data includes a positive sample statement, a positive sample label corresponding to the positive sample statement, a negative sample statement, and a negative sample label corresponding to the negative sample statement;
the initial model obtaining module is used for obtaining an initial text sentence classification model, and the initial text sentence classification model comprises a pre-training sub-model, a characteristic constraint sub-model and a text classification sub-model;
the feature extraction module is used for inputting the positive sample sentences and the negative sample sentences of each sample pair data into the initial text sentence classification model, and respectively performing feature extraction on the positive sample sentences and the negative sample sentences through the pre-training sub-model to obtain first text features of the positive sample sentences and second text features of the negative sample sentences;
the feature constraint module is used for performing feature constraint on the first text feature through the feature constraint sub-model and the second text feature so as to update the first text feature;
the sample text classification module is used for carrying out text classification processing on the first text features through the text classification submodel to obtain a sample prediction probability value of each class label of the positive sample sentence;
the numerical value comparison module is used for carrying out numerical value comparison on the plurality of sample prediction probability values of the positive sample statement and determining a target sample label of the positive sample statement;
a target model building module, configured to adjust model parameters of the initial text sentence classification model according to the positive sample labels and the target sample labels of the positive sample sentences, and continue to train the adjusted initial text sentence classification model based on the training sample set until a model loss value of the initial text sentence classification model meets a preset training end condition, so as to obtain a target text sentence classification model;
and the target text classification module is used for acquiring the initial text sentences to be classified and classifying the initial text sentences through the target text sentence classification model to obtain target classes.
In order to achieve the above object, a third aspect of the embodiments of the present application provides an electronic device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the method of the first aspect when executing the computer program.
To achieve the above object, a fourth aspect of the embodiments of the present application proposes a storage medium, which is a computer-readable storage medium, and the computer-readable storage medium stores a computer program, and the computer program implements the method of the first aspect when executed by a processor.
According to the text sentence classification method and device, the electronic device and the storage medium, positive and negative samples are combined and the features are constrained, so that text sentences to be recognized can be classified with high precision even under the conditions of few samples and weak relevance between text sentences and labels. Firstly, a target text statement classification model is constructed. Specifically, a training sample set is obtained, the training sample set comprises a plurality of sample pair data, and each sample pair data comprises a positive sample statement, a positive sample label corresponding to the positive sample statement, a negative sample statement and a negative sample label corresponding to the negative sample statement. An initial text sentence classification model is obtained, wherein the initial text sentence classification model comprises a pre-training sub-model, a feature constraint sub-model and a text classification sub-model. Then, the positive sample sentences and negative sample sentences of each sample pair data are input into the initial text sentence classification model, and feature extraction is performed on the positive sample sentences and the negative sample sentences respectively through the pre-training sub-model to obtain first text features of the positive sample sentences and second text features of the negative sample sentences. Next, feature constraint is performed on the first text feature through the feature constraint sub-model and the second text feature to update the first text feature, and text classification processing is performed on the first text features through the text classification sub-model to obtain a sample prediction probability value for each class label of the positive sample sentence. The plurality of sample prediction probability values obtained for the positive sample statement are numerically compared to determine the target sample label of the positive sample statement. Finally, model parameters of the initial text sentence classification model are adjusted according to the positive sample label and the target sample label of the positive sample sentence, and the adjusted initial text sentence classification model continues to be trained on the training sample set until the model loss value of the initial text sentence classification model meets a preset training end condition, so as to obtain the target text sentence classification model. An initial text sentence to be classified is then acquired and classified through the target text sentence classification model to obtain the target category. In this way, the accuracy of the model in classifying text sentences can be improved.
Drawings
Fig. 1 is a first flowchart of a text sentence classification method provided in an embodiment of the present application;
fig. 2 is a flowchart of step S103 in fig. 1;
FIG. 3 is a second flowchart of a text sentence classification method provided by an embodiment of the present application;
FIG. 4 is a third flowchart of a text sentence classification method provided in an embodiment of the present application;
fig. 5 is a flowchart of step S401 in fig. 4;
FIG. 6 is a flowchart of step S402 in FIG. 4;
fig. 7 is a schematic structural diagram of a text sentence classification apparatus according to an embodiment of the present application;
fig. 8 is a schematic hardware structure diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and are not intended to limit the application.
It should be noted that although functional blocks are partitioned in a schematic diagram of an apparatus and a logical order is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the partitioning of blocks in the apparatus or the order in the flowchart. The terms first, second and the like in the description and in the claims, and the drawings described above, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
First, several terms referred to in the present application are explained:
artificial Intelligence (AI): is a new technical science for researching and developing theories, methods, technologies and application systems for simulating, extending and expanding human intelligence; artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and produces a new intelligent machine that can react in a manner similar to human intelligence, and research in this field includes robotics, language recognition, image recognition, natural language processing, and expert systems, among others. The artificial intelligence can simulate the information process of human consciousness and thinking. Artificial intelligence is also a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results.
Natural Language Processing (NLP): NLP uses computer to process, understand and use human language (such as chinese, english, etc.), and belongs to a branch of artificial intelligence, which is a cross discipline between computer science and linguistics, also commonly called computational linguistics. Natural language processing includes parsing, semantic analysis, discourse understanding, and the like. Natural language processing is commonly used in the technical fields of machine translation, character recognition of handwriting and print, speech recognition and text-to-speech conversion, information intention recognition, information extraction and filtering, text classification and clustering, public opinion analysis and viewpoint mining, and relates to data mining, machine learning, knowledge acquisition, knowledge engineering, artificial intelligence research, linguistic research related to language calculation and the like related to language processing.
Unified Language Model (Unified Language Model, UniLM model): a pre-trained language model that can be fine-tuned for both natural language understanding and generation tasks. The model is pre-trained using three types of language modeling tasks: unidirectional, bidirectional, and sequence-to-sequence prediction. It uses a shared Transformer network and employs specific self-attention masks to control the context that the prediction conditions on.
Metric Learning (Metric Learning): a metric (or distance function) is a function that defines the distance between elements in a set, and a set with a metric is called a metric space. Metric learning is commonly referred to as similarity learning, and distance metric learning is used to measure the degree of similarity between samples.
BERT (Bidirectional Encoder Representations from Transformers) model: a model built on the Transformer that further increases the generalization capability of word vector models and fully describes character-level, word-level, sentence-level and even inter-sentence relationship features. BERT uses three embeddings, namely Token Embeddings, Segment Embeddings and Position Embeddings. Token Embeddings are the word vectors; the first token is the [CLS] mark, which can be used for the subsequent classification task. Segment Embeddings are used to distinguish two sentences, because pre-training performs not only language modeling but also classification tasks that take two sentences as input. Position Embeddings differ from the trigonometric (sinusoidal) position encodings of the original Transformer: BERT directly trains a Position Embedding to preserve position information, randomly initializing a vector at each position and learning it during model training so that each embedding contains position information. The token, segment and position embeddings are then combined by element-wise addition.
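As an illustration of how these three embeddings combine, the following is a minimal PyTorch sketch (the module and its dimension defaults are illustrative assumptions, not part of this application):

```python
import torch
import torch.nn as nn

class BertStyleEmbeddings(nn.Module):
    """Sketch of BERT-style input embeddings: three learned tables
    (token, segment, position) combined by element-wise addition."""
    def __init__(self, vocab_size=30522, max_len=512, num_segments=2, dim=768):
        super().__init__()
        self.token = nn.Embedding(vocab_size, dim)
        self.segment = nn.Embedding(num_segments, dim)
        self.position = nn.Embedding(max_len, dim)  # learned, not sinusoidal

    def forward(self, token_ids, segment_ids):
        # token_ids, segment_ids: (batch, seq_len)
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        return (self.token(token_ids)
                + self.segment(segment_ids)
                + self.position(positions))  # broadcast over the batch
```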
Contrastive Loss function (Contrastive Loss): mainly used for dimensionality reduction, in the sense that after originally similar samples undergo dimensionality reduction (feature extraction), the two samples remain similar in the feature space, while originally dissimilar samples remain dissimilar. The loss function also expresses how well a pair of samples match.
Cross-entropy Loss function (Cross-Entropy Loss): the most common loss function in classification. Cross entropy measures the difference between two probability distributions, here the difference between the distribution learned by the model and the true distribution.
Currently, the text classification task is a fundamental and important task in natural language processing, with a great number of applications in industrial scenarios such as negative emotion recognition and intent recognition. With the advent of pre-trained models such as BERT, existing solutions tend to complete the classification directly through a pre-trained model. However, in actual industrial scenarios, problems such as a small amount of labeled data, unbalanced sample distribution, and weak relevance between text content and labels cause existing text classification models to perform poorly. Therefore, how to improve the accuracy of text sentence classification in few-sample scenarios has become a technical problem to be solved urgently.
Based on this, the embodiment of the application provides a text sentence classification method and classification device, an electronic device and a storage medium, and aims to improve the accuracy of a model for classifying text sentences.
The text sentence classification method and classification device, the electronic device, and the storage medium provided in the embodiments of the present application are specifically described in the following embodiments, and first, the text sentence classification method in the embodiments of the present application is described.
The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
The artificial intelligence base technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The embodiment of the application provides a text sentence classification method, and relates to the technical field of artificial intelligence. The text sentence classification method provided by the embodiment of the application can be applied to a terminal, a server side and software running in the terminal or the server side. In some embodiments, the terminal may be a smartphone, tablet, laptop, desktop computer, or the like; the server side can be configured as an independent physical server, or a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, cloud functions, cloud storage, network service, cloud communication, middleware service, domain name service, security service, content Delivery Network (CDN) and a big data and artificial intelligence platform; the software may be an application or the like that implements a text sentence classification method, but is not limited to the above form.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Fig. 1 is an optional flowchart of a text sentence classification method provided in an embodiment of the present application, and the method in fig. 1 may include, but is not limited to, step S101 to step S108.
Step S101, a training sample set is obtained, the training sample set comprises a plurality of sample pair data, and each sample pair data comprises a positive sample statement, a positive sample label corresponding to the positive sample statement, a negative sample statement and a negative sample label corresponding to the negative sample statement;
step S102, obtaining an initial text sentence classification model, wherein the initial text sentence classification model comprises a pre-training sub-model, a characteristic constraint sub-model and a text classification sub-model;
step S103, inputting positive sample sentences and negative sample sentences of each sample pair data into an initial text sentence classification model, and respectively performing feature extraction on the positive sample sentences and the negative sample sentences through a pre-training sub-model to obtain first text features of the positive sample sentences and second text features of the negative sample sentences;
step S104, performing feature constraint on the first text feature through the feature constraint sub-model and the second text feature to update the first text feature;
step S105, carrying out text classification processing on the first text characteristics through a text classification sub-model to obtain a sample prediction probability value of a positive sample sentence belonging to each class label;
step S106, performing numerical comparison on the prediction probability values of a plurality of samples of the positive sample sentences to determine target sample labels of the positive sample sentences;
step S107, model parameters of the initial text sentence classification model are adjusted according to the positive sample labels and the target sample labels of the positive sample sentences, and the adjusted initial text sentence classification model continues to be trained on the basis of the training sample set until model loss values of the initial text sentence classification model meet preset training end conditions, so that the target text sentence classification model is obtained;
step S108, obtaining an initial text sentence to be classified, and classifying the initial text sentence through the target text sentence classification model to obtain a target category.
In steps S101 to S108 illustrated in the embodiment of the present application, by combining positive and negative samples and applying feature constraints, text sentences to be recognized can be classified with high precision even when the sample size is small and the relevance between text sentences and labels is weak. Firstly, a target text statement classification model is constructed. Specifically, a training sample set is obtained, the training sample set comprises a plurality of sample pair data, and each sample pair data comprises a positive sample statement, a positive sample label corresponding to the positive sample statement, a negative sample statement and a negative sample label corresponding to the negative sample statement. An initial text sentence classification model is obtained, wherein the initial text sentence classification model comprises a pre-training sub-model, a feature constraint sub-model and a text classification sub-model. Then, the positive sample sentences and negative sample sentences of each sample pair data are input into the initial text sentence classification model, and feature extraction is performed on the positive sample sentences and the negative sample sentences respectively through the pre-training sub-model to obtain first text features of the positive sample sentences and second text features of the negative sample sentences. Next, feature constraint is performed on the first text feature through the feature constraint sub-model and the second text feature to update the first text feature, and text classification processing is performed on the first text features through the text classification sub-model to obtain a sample prediction probability value for each class label of the positive sample sentence. The plurality of sample prediction probability values obtained for the positive sample statement are numerically compared to determine the target sample label of the positive sample statement. Finally, model parameters of the initial text sentence classification model are adjusted according to the positive sample label and the target sample label of the positive sample sentence, and the adjusted initial text sentence classification model continues to be trained on the training sample set until the model loss value of the initial text sentence classification model meets a preset training end condition, so as to obtain the target text sentence classification model. An initial text sentence to be classified is then acquired and classified through the target text sentence classification model to obtain the target category. In this way, the accuracy of the model in classifying text sentences can be improved.
In step S101 in some embodiments, the positive sample statement and the negative sample statement in the sample pair data may be sample statements obtained from the same document, or sample statements obtained from different documents, that is, the embodiment of the present application does not limit the relationship between the sample statements in the sample pair data. The positive and negative exemplar labels may be specified by a technician or extracted from the corresponding exemplar sentences, without limitation. The positive exemplar labels are used for representing the classification of the positive exemplar sentences, and the negative exemplar labels are used for representing the classification of the negative exemplar sentences.
Illustratively, x represents a positive sample statement, y represents a negative sample statement, then one sample pair data may be represented as (x, y), the training sample set may be represented as ((x 1, y 1), (x 2, y 2),.., (xm, ym)), and m represents the number of sample pair data. For example, when a positive sample sentence is extracted from a plurality of sentences of a movie comment text, it can be found that the positive sample sentence is "this movie looks nice", and the positive sample label of the positive sample sentence is "nice". Negative example sentences are extracted from a plurality of sentences of film evaluation texts of the film, and the negative example sentences are 'the film is really ugly', and the negative example labels of the negative example sentences are 'ugly'. Therefore, the extracted positive sample statement and negative sample statement and their corresponding labels may form a sample pair data.
It should be noted that, in order to solve the problems of a small number of labeled samples, unbalanced sample distribution, and unobvious association between text content and labels in the existing application scenario, when label type data in a training sample set is distributed more evenly, the training samples may be extended by cross-pairing the obtained positive sample sentences and negative sample sentences.
It should be noted that when the label category data in the training sample set is not distributed uniformly (for example, in a security monitoring problem, most samples are normal people and available negative samples are rare), training a simple, high-accuracy binary classification model on the full set of samples yields results severely biased towards the normal population, causing the model to fail. In order to balance the sample pair data of different label categories, a full sample-pair matching mode can be adopted for a training sample set with a small labeled sample amount. For example, in the binary case, if the number of positive sample sentences is n1 and the number of negative sample sentences is n2, the total number of samples in the training sample set is n1+n2, and each sample sentence can be matched with all sample sentences except itself, generating (n1+n2)(n1+n2-1)/2 sample pair data in total. This makes the labels more diversified, enriches the training sample set, and improves the accuracy of model training.
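The following is a minimal Python sketch of this full sample-pair matching (the function and variable names are illustrative assumptions):

```python
from itertools import combinations

def build_sample_pairs(sentences, labels):
    """Match every labeled sentence with every other one, yielding
    n*(n-1)/2 sample pair data items for n = n1 + n2 sentences."""
    samples = list(zip(sentences, labels))
    return [((sx, lx), (sy, ly))
            for (sx, lx), (sy, ly) in combinations(samples, 2)]

pairs = build_sample_pairs(
    ["这部电影看起来很好看", "这部电影真的很难看", "剧情紧凑", "演技糟糕"],
    ["好看", "难看", "好看", "难看"])
assert len(pairs) == 4 * 3 // 2  # 6 sample pair data
```

For four labeled sentences this yields six sample pairs, so even a very small labeled set produces a quadratically larger training sample set.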
In step S102 of some embodiments, in order to improve accuracy of text sentence classification, an initial text sentence classification model is obtained, where the initial text sentence classification model includes a pre-training sub-model, a feature constraint sub-model, and a text classification sub-model.
It should be noted that the pre-training submodel is used for extracting text information of an input text sentence, the feature constraint submodel is used for constraining and adjusting the representation of the text sentence features based on metric learning of the extracted text sentence features, and the text classification submodel is used for forming a final classification result according to the obtained features.
It should be noted that the initial text sentence classification model may be a BERT model, a UniLM model, an ELECTRA model, or the like.
In step S103 of some embodiments, the pre-training submodel is configured to perform text information extraction on an input text statement, input a positive sample statement and a negative sample statement of each sample pair data into the initial text statement classification model, and perform feature extraction on the positive sample statement and the negative sample statement respectively through the pre-training submodel to obtain a first text feature of the positive sample statement and a second text feature of the negative sample statement, where the first text feature and the second text feature are used to represent deep semantic information features of corresponding sample statements.
Referring to fig. 2, in some embodiments, the pre-training submodel includes a feature encoding process and a self-attention process, and step S103 may include, but is not limited to, step S201 to step S203:
step S201, inputting positive sample sentences and negative sample sentences of each sample pair data into an initial text sentence classification model;
step S202, respectively carrying out feature coding processing on each text word in the positive sample sentence and each text word in the negative sample sentence to obtain a positive sample word feature corresponding to the positive sample sentence and a negative sample word feature corresponding to the negative sample sentence;
step S203, performing self-attention processing on all positive sample word features and all negative sample word features respectively to obtain first text features of the positive sample sentences and second text features of the negative sample sentences.
In steps S201 to S202 of some embodiments, after a positive sample sentence and a negative sample sentence of each sample pair data are input to an initial text sentence classification model, each text word in the positive sample sentence is subjected to feature coding processing by a pre-training sub-model, so as to obtain a positive sample word feature corresponding to the positive sample sentence; and carrying out feature coding processing on each text word in the negative sample sentence through the pre-training sub-model to obtain the negative sample word features corresponding to the negative sample sentence.
It should be noted that, when the BERT model is used as the initial text sentence classification model, each text word in the positive sample sentence is numbered through the BERT vocabulary, and each number is subjected to feature coding processing to obtain the positive sample word feature corresponding to each sentence word in the positive sample sentence, that is, the token feature vector corresponding to each sentence word. Similarly, each text word in the negative sample sentence is numbered through the BERT vocabulary and each number is subjected to feature coding processing to obtain the negative sample word feature, i.e., the token feature vector, corresponding to each sentence word in the negative sample sentence; the token feature vector represents the feature meaning of the corresponding sentence word.
In step S203 of some embodiments, in order to extract a feature vector having deep information while reducing the data amount, self-attention processing is performed on all positive sample word features and all negative sample word features through a self-attention mechanism of a pre-training sub-model, so as to obtain first text features of a positive sample sentence and second text features of a negative sample sentence, where the first text features and the second text features are used to represent text features obtained by performing deep fusion on feature information between text words and words through the self-attention mechanism.
It should be noted that the first text feature is obtained by summing the token feature vectors of all positive sample word features of the corresponding positive sample sentence, and the second text feature is obtained by summing the token feature vectors of all negative sample word features of the corresponding negative sample sentence.
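The following sketch shows this extraction step using the Hugging Face transformers interface (the library choice and checkpoint name are assumptions; the application only requires a BERT-style encoder whose token vectors are summed into a sentence-level feature):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")

def extract_text_feature(sentence: str) -> torch.Tensor:
    """Number each text word via the BERT vocabulary, run feature coding
    and self-attention inside the encoder, then sum the token vectors."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = encoder(**inputs)
    # outputs.last_hidden_state: (1, seq_len, 768) token feature vectors
    return outputs.last_hidden_state.sum(dim=1).squeeze(0)

f_x = extract_text_feature("这部电影看起来很好看")  # first text feature
f_y = extract_text_feature("这部电影真的很难看")    # second text feature
```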
Referring to fig. 3, in some embodiments, before step S103, the method for classifying text sentences provided in the embodiments of the present application may further include, but is not limited to, step S301 to step S302:
step S301, respectively comparing the lengths of the positive sample sentence and the negative sample sentence according to a preset text length threshold, when the text length of the positive sample sentence is smaller than the text length threshold, performing zero filling operation on the positive sample sentence according to the text length threshold, and updating the positive sample sentence until the text length of the positive sample sentence is equal to the text length threshold;
step S302, when the text length of the negative sample sentence is smaller than the text length threshold, zero filling operation is carried out on the negative sample sentence according to the text length threshold, and the negative sample sentence is updated until the text length of the negative sample sentence is equal to the text length threshold.
In step S301 of some embodiments, in order to improve the training efficiency of the model, text length adjustment is performed on the positive sample sentences and negative sample sentences input in one batch of the initial text sentence classification model, so as to ensure that the text lengths are consistent. Specifically, length comparison is performed on the positive sample sentence and the negative sample sentence respectively according to a preset text length threshold. When the text length of the positive sample sentence is smaller than the text length threshold, a zero filling operation is performed on the positive sample sentence according to the text length threshold, that is, zeros are appended after the original sample sentence until the text length of the positive sample sentence equals the text length threshold, and the extended sample sentence is used as the new positive sample sentence.
In step S302 of some embodiments, similarly, when the text length of the negative sample sentence is smaller than the text length threshold, a zero filling operation is performed on the negative sample sentence according to the text length threshold, that is, zeros are appended after the original sample sentence until the text length of the negative sample sentence equals the text length threshold, and the extended sample sentence is used as the new negative sample sentence.
It should be noted that the preset text length threshold may be set according to the maximum length of the positive sample sentence and the negative sample sentence in the training sample set, or may be set manually, and is not limited specifically herein.
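A minimal sketch of the zero filling operation, assuming the sample sentences have already been converted to token id lists (names are illustrative):

```python
def pad_to_threshold(token_ids, length_threshold):
    """Append zeros after the original sequence until its length
    equals the preset text length threshold."""
    if len(token_ids) < length_threshold:
        token_ids = token_ids + [0] * (length_threshold - len(token_ids))
    return token_ids

# e.g. take the threshold as the longest sentence in the batch
batch = [[101, 2769, 102], [101, 2769, 4263, 872, 102]]
threshold = max(len(ids) for ids in batch)
batch = [pad_to_threshold(ids, threshold) for ids in batch]  # all length 5
```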
In step S104 of some embodiments, since it is expected that the text feature vectors between sample sentences of the same category are more similar, i.e., feature distances are closer, and the text feature vectors between different categories are more distant, i.e., feature distances are larger, the distribution of text features is continuously adjusted in the training process. Specifically, the feature constraint sub-model is used for constraining and adjusting the representation of the text sentence features based on metric learning of the extracted text sentence features, and the feature constraint is performed on the first text features through the feature constraint sub-model and the second text features so as to update the first text features.
In step S105 of some embodiments, after the first text feature of the positive sample sentence is obtained through updating, the text classification processing is performed on the first text feature through the text classification sub-model, so as to obtain a sample prediction probability value of the positive sample sentence belonging to each class label.
It should be noted that the text classification submodel may be a model constructed by using a fully-connected neural network, and the model is combined with the softmax activation function to obtain the sample prediction probability value of each class label to which the positive sample statement belongs.
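A minimal sketch of such a classification head (the 768-dimensional feature and the single linear layer are illustrative assumptions):

```python
import torch.nn as nn

class TextClassifierHead(nn.Module):
    """Fully-connected network over the first text feature, followed by
    softmax to yield a sample prediction probability per class label."""
    def __init__(self, feature_dim=768, num_classes=2):
        super().__init__()
        self.fc = nn.Linear(feature_dim, num_classes)

    def forward(self, text_feature):
        logits = self.fc(text_feature)
        return logits.softmax(dim=-1)  # sample prediction probability values
```

In practice the softmax is often folded into the cross entropy loss during training; it is shown explicitly here to match the description above.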
Referring to fig. 4, in some embodiments, after step S105, the method for classifying text sentences provided in the embodiments of the present application may further include, but is not limited to, step S401 to step S403:
step S401, performing characteristic constraint calculation on the first text characteristic, the second text characteristic, the positive sample label and the negative sample label through a characteristic constraint sub-model to obtain a comparison loss value;
step S402, obtaining a cross entropy loss value according to the positive sample label and the sample prediction probability value;
and S403, obtaining a model loss value according to the contrast loss value and the cross entropy loss value.
In steps S401 to S403 of some embodiments, in the feature constraint submodel of the embodiment of the present application, feature constraint calculation is performed on the first text feature, the second text feature, the positive sample label, and the negative sample label through a comparison loss function to obtain a comparison loss value, so as to guide the text features of the same category to be closer, and the text features of different categories to be further away, so that the model can effectively learn the corresponding distribution even when the relevance between the text content and the label is not obvious. And obtaining a cross entropy loss value according to the positive sample label and the sample prediction probability value, so that a model loss value is obtained according to the comparison loss value and the cross entropy loss value, a depth measurement learning method and a cross entropy loss method are combined, the distribution condition of text sentences and token feature vectors is continuously adjusted, and the training capacity of the model is improved.
Referring to fig. 5, in some embodiments, step S401 may include, but is not limited to include, step S501 to step S502:
step S501, comparing labels of the positive sample label and the negative sample label according to a preset indication function, and determining a contrast coefficient;
and S502, performing feature constraint calculation according to the first text feature, the second text feature and the contrast coefficient to obtain a contrast loss value.
In steps S501 and S502 of some embodiments, in order to perform feature constraint calculation on the first text feature, the second text feature, the positive sample label and the negative sample label through a contrastive loss function, label comparison is first performed on the positive sample label and the negative sample label according to a preset indicator function 1{·} to determine a contrast coefficient. That is, the contrast coefficient takes the values 0 and 1: when the positive sample label and the negative sample label are the same, the contrast coefficient is 1; when they differ, the contrast coefficient is 0. Feature constraint calculation is then performed according to the first text feature, the second text feature and the contrast coefficient, as shown in formula (1), to obtain the contrast loss value L_CL:
L_CL = 1{L_x = L_y}·‖f_x - f_y‖² + 1{L_x ≠ L_y}·max(0, a - ‖f_x - f_y‖)²  (1)
wherein f_x represents the first text feature, f_y represents the second text feature, L_x represents the positive sample label, L_y represents the negative sample label, and a represents a preset constant.
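A PyTorch sketch of formula (1), assuming the standard contrastive-loss form shown above, with Euclidean feature distance and margin constant a:

```python
import torch

def contrastive_loss(f_x, f_y, label_x, label_y, a=1.0):
    """Formula (1): pull same-label features together, push
    different-label features at least a margin `a` apart."""
    d = torch.norm(f_x - f_y, p=2)    # feature distance ||f_x - f_y||
    same = float(label_x == label_y)  # indicator function 1{L_x = L_y}
    return same * d.pow(2) + (1.0 - same) * torch.clamp(a - d, min=0).pow(2)
```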
Referring to fig. 6, in some embodiments, step S402 may include, but is not limited to, step S601 to step S602:
step S601, acquiring the number of text words of a positive sample sentence;
step S602, calculating the number of text words, the positive sample label, and the sample prediction probability value according to a preset cross entropy loss function, to obtain a cross entropy loss value.
In steps S601 to S602 of some embodiments, to compute the cross entropy loss value, the number of text words of the positive sample sentence is first obtained, and then the number of text words, the positive sample label and the sample prediction probability value are calculated according to the preset cross entropy loss function shown in formula (2), so as to obtain the cross entropy loss value L_CEL:
L_CEL = -∑_m y·log(p)  (2)
wherein m represents the number of text words of the sample sentence, y represents the positive sample label of each text sentence, and p represents the corresponding sample prediction probability value.
It should be noted that the contrast loss value L_CL and the cross entropy loss value L_CEL are summed to obtain the total model loss value Loss_sum.
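A sketch of assembling the total loss, reusing contrastive_loss from the sketch above (tensor shapes and the use of log-probabilities are assumptions):

```python
import torch
import torch.nn.functional as F

def model_loss(probs, target, f_x, f_y, label_x, label_y, a=1.0):
    """Loss_sum = L_CEL + L_CL: cross entropy on the predicted class
    probabilities plus the contrastive feature constraint."""
    # probs: (batch, num_classes) softmax outputs; target: (batch,) class ids
    l_cel = F.nll_loss(torch.log(probs), target)
    l_cl = contrastive_loss(f_x, f_y, label_x, label_y, a)
    return l_cel + l_cl
```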
In step S106 of some embodiments, after obtaining the sample prediction probability values of positive sample sentences belonging to each class label, the sample prediction probability values of multiple class labels are compared, and the target sample label to which the positive sample sentences belong is determined.
In step S107 of some embodiments, model parameters of the initial text statement classification model are adjusted according to the positive sample labels and the target sample labels of the positive sample statements, that is, according to the similarity accuracy between the positive sample labels and the target sample labels. The adjusted initial text sentence classification model then continues to be trained on the training sample set until the model loss value of the initial text sentence classification model meets the preset training end condition, at which point the performance of the initial text sentence classification model can be considered to meet the requirement, and the target text sentence classification model is determined according to the model parameters and network structure of the initial text sentence classification model.
It should be noted that the preset training end condition may be when the total model loss value is smaller than a preset loss value threshold, or when the obtained similarity accuracy of the target sample label and the positive sample label is greater than or equal to a preset accuracy threshold.
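A sketch of a training loop implementing the loss-threshold end condition, where compute_loss is a hypothetical helper returning Loss_sum for one batch (the optimizer and hyperparameters are assumptions):

```python
import torch

def train(model, train_loader, loss_threshold=0.05, max_epochs=50):
    """Adjust model parameters until the model loss value meets the
    preset training end condition."""
    optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for batch in train_loader:
            optimizer.zero_grad()
            loss = model.compute_loss(batch)  # hypothetical: returns Loss_sum
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        if epoch_loss / len(train_loader) < loss_threshold:
            break  # training end condition met
    return model
```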
In step S108 of some embodiments, in a specific application scenario, the application scenario includes a client device and a server device, where the client device is configured to send an initial text sentence to be classified, which is obtained by the client device, to the server device, and the server device is configured to execute the text sentence classification method provided in the embodiments of the present application, so as to classify the initial text sentence through a target text sentence classification model after the initial text sentence to be classified, which is sent by the client device, is obtained by identification, so as to obtain a target category.
It should be noted that, when a user needs to obtain information related to an initial text sentence to be recognized by determining a named entity included in the initial text sentence to be classified, the user may input the initial text sentence to be classified in a text input field to be classified provided on a client device. And then, after the client device obtains the initial text sentence to be classified input by the user, the initial text sentence is sent to the server device.
It should be noted that the text sentence classification method in the embodiment of the present application may be applied to, for example, emotion classification for movie reviews, or positive and negative direction judgment for APP usage reviews, and the like, that is, after a certain movie review is obtained, the emotion classification category of the user for a movie is determined by using the text sentence classification method in the embodiment of the present application, or after a certain usage review for APP is obtained, the approval degree classification category of the user for APP is determined by using the text sentence classification method in the embodiment of the present application.
According to the text sentence classification method provided by the embodiment of the application, a target text sentence classification model is first established so that high-precision text sentence classification can be performed on sentences to be recognized even when the sample amount is small and the relevance between text sentences and labels is weak. A training sample set is obtained by collecting sample pair data, and each sample pair data comprises a positive sample sentence, a positive sample label corresponding to the positive sample sentence, a negative sample sentence and a negative sample label corresponding to the negative sample sentence. Then, in the training process of the model, an initial text sentence classification model is obtained, and the initial text sentence classification model comprises a pre-training sub-model, a feature constraint sub-model and a text classification sub-model. Feature coding processing is performed on each text word in the positive sample sentence and each text word in the negative sample sentence through the pre-training sub-model to obtain the positive sample word features corresponding to the positive sample sentence and the negative sample word features corresponding to the negative sample sentence. Self-attention processing is performed on all the positive sample word features and all the negative sample word features respectively to obtain the first text features of the positive sample sentences and the second text features of the negative sample sentences. Then, feature constraint is performed on the first text feature through the feature constraint sub-model and the second text feature to update the first text feature. Text classification processing is performed on the first text features through the text classification sub-model to obtain a sample prediction probability value for each class label of the positive sample sentence. The plurality of sample prediction probability values obtained for the positive sample statement are numerically compared to determine the target sample label of the positive sample statement. Feature constraint calculation is performed on the first text feature, the second text feature, the positive sample label and the negative sample label through the feature constraint sub-model to obtain a contrast loss value; a cross entropy loss value is obtained according to the positive sample label and the sample prediction probability value, and a model loss value is obtained according to the contrast loss value and the cross entropy loss value. Finally, the model parameters of the initial text sentence classification model are adjusted according to the positive sample label and the target sample label of the positive sample sentence, and the adjusted initial text sentence classification model continues to be trained on the training sample set until the model loss value of the initial text sentence classification model meets a preset training end condition, so as to obtain the target text sentence classification model. An initial text sentence to be classified is acquired and classified through the target text sentence classification model to obtain the target category. In this way, the accuracy of the model in classifying text sentences can be improved.
According to the embodiment of the present application, obtaining the training sample set by collecting sample pair data effectively expands the training samples, and the number of training samples in each category can be flexibly controlled so as to keep the samples balanced. By introducing deep metric learning, a constraint is added to the text features, guiding the features of sample sentences of the same category closer together and pushing the features of sentences of different categories apart, so that the model can effectively learn the corresponding distribution even when the correlation between the text content and the label is weak. Meanwhile, the contrastive loss function of deep metric learning is combined with the cross entropy loss function to continuously adjust the distribution of the sentence and token feature vectors, guiding training toward a high-precision target text sentence classification model and improving the accuracy of text sentence classification.
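Again purely as an illustrative sketch under stated assumptions (the margin value, the equal weighting of the two loss terms, and the use of Euclidean distance are all assumptions, not values fixed by the disclosure), the combined constraint described here might look as follows:

    import torch
    import torch.nn.functional as F

    def contrastive_loss(feat_pos, feat_neg, label_pos, label_neg, margin=1.0):
        # Indicator-style coefficient: 1 when the paired sentences share a label, else 0.
        same = (label_pos == label_neg).float()
        dist = F.pairwise_distance(feat_pos, feat_neg)
        # Same-class pairs are pulled together; different-class pairs are pushed
        # apart until they are at least `margin` away (margin is an assumption).
        return (same * dist.pow(2) + (1.0 - same) * F.relu(margin - dist).pow(2)).mean()

    def model_loss(feat_pos, feat_neg, label_pos, label_neg, probs_pos):
        contrast = contrastive_loss(feat_pos, feat_neg, label_pos, label_neg)
        ce = F.nll_loss(torch.log(probs_pos + 1e-12), label_pos)  # cross entropy on predicted probabilities
        return contrast + ce  # equal weighting is an assumption, not taken from the disclosure

Backpropagating model_loss and stepping an optimizer would then play the role of adjusting the model parameters until the preset training end condition is met.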
Referring to fig. 7, an embodiment of the present application further provides a text sentence classification device, which can implement the text sentence classification method, and the device includes:
a sample set obtaining module 701, configured to obtain a training sample set, where the training sample set includes multiple sample pair data, and each sample pair data includes a positive sample statement, a positive sample label corresponding to the positive sample statement, a negative sample statement, and a negative sample label corresponding to the negative sample statement;
an initial model obtaining module 702, configured to obtain an initial text sentence classification model, where the initial text sentence classification model includes a pre-training sub-model, a feature constraint sub-model, and a text classification sub-model;
the feature extraction module 703 is configured to input the positive sample sentences and the negative sample sentences of each sample pair data into the initial text sentence classification model, and perform feature extraction on the positive sample sentences and the negative sample sentences respectively through the pre-training sub-model, so as to obtain first text features of the positive sample sentences and second text features of the negative sample sentences;
a feature constraint module 704, configured to perform feature constraint on the first text feature through the feature constraint sub-model and the second text feature to update the first text feature;
the sample text classification module 705 is configured to perform text classification processing on the first text feature through the text classification submodel to obtain a sample prediction probability value of each class label to which the positive sample statement belongs;
a numerical comparison module 706, configured to perform numerical comparison on the multiple sample prediction probability values of the positive sample statement, and determine a target sample label of the positive sample statement;
a target model building module 707, configured to adjust model parameters of the initial text sentence classification model according to the positive sample labels and the target sample labels of the positive sample sentences, and continue to train the adjusted initial text sentence classification model based on the training sample set until a model loss value of the initial text sentence classification model meets a preset training end condition, so as to obtain a target text sentence classification model;
and the target text classification module 708 is configured to obtain an initial text sentence to be classified, and classify the initial text sentence through the target text sentence classification model to obtain a target category.
The specific implementation of the text sentence classification apparatus is substantially the same as the specific implementation of the text sentence classification method, and is not described herein again.
The embodiment of the present application also provides an electronic device, which includes a memory and a processor, where the memory stores a computer program and the processor implements the text sentence classification method when executing the computer program. The electronic device may be any intelligent terminal, including a tablet computer, a vehicle-mounted computer, and the like.
Referring to fig. 8, fig. 8 illustrates a hardware structure of an electronic device according to another embodiment, where the electronic device includes:
the processor 801 may be implemented by a general-purpose Central Processing Unit (CPU), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solution provided by the embodiments of the present application;
the memory 802 may be implemented in the form of a Read-Only Memory (ROM), a static storage device, a dynamic storage device, or a Random Access Memory (RAM). The memory 802 may store an operating system and other application programs; when the technical solution provided by the embodiments of the present application is implemented by software or firmware, the relevant program code is stored in the memory 802 and invoked by the processor 801 to execute the text sentence classification method of the embodiments of the present application;
an input/output interface 803 for realizing information input and output;
the communication interface 804 is configured to implement communication interaction between this device and other devices, and may implement communication in a wired manner (e.g., USB, network cable) or in a wireless manner (e.g., mobile network, Wi-Fi, Bluetooth);
a bus 805 that transfers information between the various components of the device (e.g., the processor 801, memory 802, input/output interfaces 803, and communication interface 804);
wherein the processor 801, the memory 802, the input/output interface 803 and the communication interface 804 are communicatively connected to each other within the device via a bus 805.
The embodiment of the present application further provides a storage medium, which is a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method for classifying text sentences is implemented.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The embodiments described above are intended to illustrate the technical solutions of the embodiments of the present application more clearly and do not constitute a limitation on those technical solutions. Those skilled in the art will appreciate that, as technology evolves and new application scenarios emerge, the technical solutions provided in the embodiments of the present application remain applicable to similar technical problems.
It will be appreciated by those skilled in the art that the embodiments shown in the figures are not intended to limit the embodiments of the present application and may include more or fewer steps than those shown, or some of the steps may be combined, or different steps may be included.
The above-described apparatus embodiments are merely illustrative, where the units described as separate components may or may not be physically separate; that is, they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
One of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the application and the above-described figures, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" for describing an association relationship of associated objects, indicating that there may be three relationships, e.g., "a and/or B" may indicate: only A, only B and both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of the singular or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the above-described division of units is only one type of division of logical functions, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part thereof contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the software product is stored in a storage medium and includes multiple instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing programs, such as a USB flash drive, a portable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings, and the scope of the claims of the embodiments of the present application is not limited thereby. Any modifications, equivalents, and improvements that may occur to those skilled in the art without departing from the scope and spirit of the embodiments of the present application are intended to be within the scope of the claims of the embodiments of the present application.

Claims (10)

1. A method for classifying a text statement, the method comprising:
obtaining a training sample set, wherein the training sample set comprises a plurality of sample pair data, and each sample pair data comprises a positive sample statement, a positive sample label corresponding to the positive sample statement, a negative sample statement and a negative sample label corresponding to the negative sample statement;
acquiring an initial text sentence classification model, wherein the initial text sentence classification model comprises a pre-training sub-model, a feature constraint sub-model and a text classification sub-model;
inputting the positive sample statement and the negative sample statement of each sample pair data into the initial text statement classification model, and respectively performing feature extraction on the positive sample statement and the negative sample statement through the pre-training sub-model to obtain a first text feature of the positive sample statement and a second text feature of the negative sample statement;
performing feature constraint on the first text feature through the feature constraint sub-model and the second text feature to update the first text feature;
performing text classification processing on the first text features through the text classification submodel to obtain a sample prediction probability value of each class label of the positive sample sentence;
performing numerical comparison on a plurality of sample prediction probability values of the positive sample statement to determine a target sample label of the positive sample statement;
adjusting model parameters of the initial text statement classification model according to the positive sample label and the target sample label of the positive sample statement, and continuing to train the adjusted initial text statement classification model based on the training sample set until a model loss value of the initial text statement classification model meets a preset training end condition to obtain a target text statement classification model;
and acquiring an initial text sentence to be classified, and classifying the initial text sentence through the target text sentence classification model to obtain a target category.
2. The method of claim 1, wherein after the text classification processing of the first text feature by the text classification submodel to obtain the sample prediction probability value of each class label of the positive sample sentence, the method further comprises:
performing feature constraint calculation on the first text feature, the second text feature, the positive sample label and the negative sample label through the feature constraint submodel to obtain a comparison loss value;
obtaining a cross entropy loss value according to the positive sample label and the sample prediction probability value;
and obtaining a model loss value according to the contrast loss value and the cross entropy loss value.
3. The method of claim 2, wherein the pre-training submodel comprises a feature coding process and a self-attention process, the inputting the positive sample sentences and the negative sample sentences of each sample pair data into the initial text sentence classification model, and performing feature extraction on the positive sample sentences and the negative sample sentences respectively through the pre-training submodel to obtain first text features of the positive sample sentences and second text features of the negative sample sentences comprises:
inputting the positive sample statements and the negative sample statements of each of the sample pair data to the initial text statement classification model;
respectively carrying out the feature coding processing on each text word in the positive sample sentence and each text word in the negative sample sentence to obtain a positive sample word feature corresponding to the positive sample sentence and a negative sample word feature corresponding to the negative sample sentence;
and respectively carrying out the self-attention processing on all the positive sample word features and all the negative sample word features to obtain first text features of the positive sample sentences and second text features of the negative sample sentences.
4. The method of claim 3, wherein prior to the inputting the positive sample statements and the negative sample statements for each of the sample pair data to the initial textual statement classification model, the method further comprises:
respectively comparing the lengths of the positive sample sentence and the negative sample sentence with a preset text length threshold; when the text length of the positive sample sentence is smaller than the text length threshold, performing a zero filling operation on the positive sample sentence according to the text length threshold, and updating the positive sample sentence until the text length of the positive sample sentence is equal to the text length threshold;
and when the text length of the negative sample sentence is smaller than the text length threshold, performing a zero filling operation on the negative sample sentence according to the text length threshold, and updating the negative sample sentence until the text length of the negative sample sentence is equal to the text length threshold.
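(Purely as a hypothetical illustration of the zero filling described in this claim, not part of the claim itself; the threshold value and token ids below are assumptions:)

    def pad_to_threshold(token_ids, length_threshold):
        # Zero-fill until the sentence length equals the threshold; sentences
        # already at or above the threshold are returned unchanged here.
        if len(token_ids) < length_threshold:
            token_ids = token_ids + [0] * (length_threshold - len(token_ids))
        return token_ids

    padded = pad_to_threshold([101, 2769, 1962, 102], length_threshold=8)  # 4 tokens + 4 zeros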
5. The method of claim 2, wherein performing feature constraint computation on the first text feature, the second text feature, the positive exemplar label, and the negative exemplar label via the feature constraint submodel to obtain a contrast loss value comprises:
comparing the labels of the positive sample label and the negative sample label according to a preset indication function, and determining a contrast coefficient;
and performing feature constraint calculation according to the first text feature, the second text feature and the contrast coefficient to obtain a contrast loss value.
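(One hypothetical reading of the indicator function in this claim, mirroring the `same` mask in the earlier loss sketch; the exact form is an assumption:)

    def contrast_coefficient(positive_label: int, negative_label: int) -> int:
        # Assumed indicator form: 1 when the pair shares a label, 0 otherwise.
        return 1 if positive_label == negative_label else 0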
6. The method of any one of claims 2 to 5, wherein the deriving a cross entropy loss value from the positive sample label and the sample prediction probability value comprises:
acquiring the number of text words of the positive sample sentence;
and calculating the text word number, the positive sample label and the sample prediction probability value according to a preset cross entropy loss function to obtain a cross entropy loss value.
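(One plausible reading of this calculation, offered only as an illustration; treating the text word count as a normalizer is an assumption, since the claim does not fix the exact formula:)

    import math

    def cross_entropy_loss(num_words: int, positive_label: int, probs: list) -> float:
        # Assumed form: negative log-probability of the true label,
        # scaled by the sentence's text word count.
        return -math.log(probs[positive_label]) / num_words

    loss = cross_entropy_loss(num_words=5, positive_label=1, probs=[0.2, 0.8])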
7. An apparatus for classifying a text sentence, the apparatus comprising:
a sample set obtaining module, configured to obtain a training sample set, where the training sample set includes multiple sample pair data, and each sample pair data includes a positive sample statement, a positive sample label corresponding to the positive sample statement, a negative sample statement, and a negative sample label corresponding to the negative sample statement;
the initial model obtaining module is used for obtaining an initial text sentence classification model which comprises a pre-training submodel, a characteristic constraint submodel and a text classification submodel;
the feature extraction module is used for inputting the positive sample sentences and the negative sample sentences of each sample pair data into the initial text sentence classification model, and performing feature extraction on the positive sample sentences and the negative sample sentences through the pre-training sub-model to obtain first text features of the positive sample sentences and second text features of the negative sample sentences;
the feature constraint module is used for performing feature constraint on the first text feature through the feature constraint sub-model and the second text feature so as to update the first text feature;
the sample text classification module is used for carrying out text classification processing on the first text characteristics through the text classification submodel to obtain a sample prediction probability value of the positive sample sentence belonging to each class label;
the numerical value comparison module is used for carrying out numerical value comparison on the plurality of sample prediction probability values of the positive sample statement and determining a target sample label of the positive sample statement;
a target model building module, configured to adjust model parameters of the initial text sentence classification model according to the positive sample labels and the target sample labels of the positive sample sentences, and continue to train the adjusted initial text sentence classification model based on the training sample set until a model loss value of the initial text sentence classification model meets a preset training end condition, so as to obtain a target text sentence classification model;
and the target text classification module is used for acquiring the initial text sentences to be classified and classifying the initial text sentences through the target text sentence classification model to obtain target classes.
8. The apparatus of claim 7, wherein after the sample text classification module performs text classification processing on the first text feature through the text classification submodel to obtain the sample prediction probability value of each class label to which the positive sample sentence belongs, the apparatus further comprises:
a comparison loss value obtaining module, configured to perform feature constraint calculation on the first text feature, the second text feature, the positive sample label, and the negative sample label through the feature constraint submodel, to obtain a comparison loss value;
the cross entropy loss value acquisition module is used for obtaining a cross entropy loss value according to the positive sample label and the sample prediction probability value;
and the model loss value acquisition module is used for obtaining a model loss value according to the contrast loss value and the cross entropy loss value.
9. An electronic device, comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the method for classifying a text sentence according to any one of claims 1 to 6 when executing the computer program.
10. A storage medium being a computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements a method of classifying text sentences as claimed in any one of claims 1 to 6.
CN202211201108.8A 2022-09-28 2022-09-28 Text sentence classification method and classification device, electronic equipment and storage medium Pending CN115510232A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211201108.8A CN115510232A (en) 2022-09-28 2022-09-28 Text sentence classification method and classification device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211201108.8A CN115510232A (en) 2022-09-28 2022-09-28 Text sentence classification method and classification device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115510232A true CN115510232A (en) 2022-12-23

Family

ID=84508014

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211201108.8A Pending CN115510232A (en) 2022-09-28 2022-09-28 Text sentence classification method and classification device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115510232A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117236329A (en) * 2023-11-15 2023-12-15 阿里巴巴达摩院(北京)科技有限公司 Text classification method and device and related equipment
CN117236329B (en) * 2023-11-15 2024-02-06 阿里巴巴达摩院(北京)科技有限公司 Text classification method and device and related equipment
CN117574146A (en) * 2023-11-15 2024-02-20 广州方舟信息科技有限公司 Text classification labeling method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN113792818B (en) Intention classification method and device, electronic equipment and computer readable storage medium
CN115510232A (en) Text sentence classification method and classification device, electronic equipment and storage medium
CN113887215A (en) Text similarity calculation method and device, electronic equipment and storage medium
CN114358007A (en) Multi-label identification method and device, electronic equipment and storage medium
CN114722069A (en) Language conversion method and device, electronic equipment and storage medium
CN114359810A (en) Video abstract generation method and device, electronic equipment and storage medium
CN114298287A (en) Knowledge distillation-based prediction method and device, electronic equipment and storage medium
CN114240552A (en) Product recommendation method, device, equipment and medium based on deep clustering algorithm
CN114897060A (en) Training method and device of sample classification model, and sample classification method and device
CN114926039A (en) Risk assessment method, risk assessment device, electronic device, and storage medium
CN113849661A (en) Entity embedded data extraction method and device, electronic equipment and storage medium
CN116050352A (en) Text encoding method and device, computer equipment and storage medium
CN114841146A (en) Text abstract generation method and device, electronic equipment and storage medium
CN113221553A (en) Text processing method, device and equipment and readable storage medium
CN114492661A (en) Text data classification method and device, computer equipment and storage medium
CN114637847A (en) Model training method, text classification method and device, equipment and medium
CN115796141A (en) Text data enhancement method and device, electronic equipment and storage medium
CN116432705A (en) Text generation model construction method, text generation device, equipment and medium
CN114398903B (en) Intention recognition method, device, electronic equipment and storage medium
CN114722774A (en) Data compression method and device, electronic equipment and storage medium
CN115238143A (en) Query statement generation method and device, model training method, equipment and medium
CN114936274A (en) Model training method, dialogue generating device, dialogue training equipment and storage medium
CN115033674A (en) Question-answer matching method, question-answer matching device, electronic equipment and storage medium
CN114611529A (en) Intention recognition method and device, electronic equipment and storage medium
CN114492437A (en) Keyword recognition method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination