WO2023134083A1 - Text-based sentiment classification method and apparatus, computer device, and storage medium - Google Patents

Text-based sentiment classification method and apparatus, computer device, and storage medium

Info

Publication number
WO2023134083A1
Authority
WO
WIPO (PCT)
Prior art keywords
emotion
text data
data
word segmentation
emotional
Prior art date
Application number
PCT/CN2022/090673
Other languages
English (en)
French (fr)
Inventor
舒畅
陈又新
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2023134083A1 publication Critical patent/WO2023134083A1/zh

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/08 - Learning methods

Definitions

  • The present application relates to the field of artificial intelligence, and in particular to a text-based sentiment classification method and apparatus, a computer device, and a storage medium.
  • An embodiment of the present application proposes a text-based sentiment classification method in which the word-segmented text data includes emotional features used to characterize emotion categories.
  • An embodiment of the present application proposes a text-based sentiment classification apparatus, including:
  • an acquisition module, used to obtain the original text data to be classified;
  • a word segmentation module, used to perform word segmentation on the original text data to obtain word-segmented text data, where the word-segmented text data includes emotional features used to characterize emotion categories;
  • an enhancement module, used to perform data augmentation on the word-segmented text data to obtain emotion positive example pairs corresponding to the word-segmented text data, where each emotion positive example pair includes the emotional features;
  • a learning module, used to perform contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector;
  • a classification module, used to perform emotion classification according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features.
  • An embodiment of the present application provides a computer device including a memory and a processor, where a program is stored in the memory, and when the program is executed by the processor, the processor carries out a text-based sentiment classification method in which the word-segmented text data includes emotional features used to characterize emotion categories.
  • An embodiment of the present application provides a storage medium, which is a computer-readable storage medium storing computer-executable instructions; the instructions are used to make a computer execute a text-based sentiment classification method in which the word-segmented text data includes emotional features used to characterize emotion categories.
  • The text-based sentiment classification method and apparatus, computer device, and storage medium proposed in the embodiments of the present application perform contrastive learning on emotion positive example pairs in combination with a contrastive learning model and carry out emotion classification after the emotion embedding vectors have been obtained; this can solve the problem of unevenly distributed training data and improve the accuracy of sentiment classification.
  • Fig. 1 is the first flowchart of the text-based sentiment classification method provided by an embodiment of the present application;
  • Fig. 2 is the flowchart of step S300 in Fig. 1;
  • Fig. 3 is the flowchart of step S340 in Fig. 2;
  • Fig. 4 is the second flowchart of the text-based sentiment classification method provided by an embodiment of the present application;
  • Fig. 5 is the flowchart of step S500 in Fig. 1;
  • Fig. 6 is the flowchart of step S550 in Fig. 5;
  • Fig. 7 is a block diagram of the module structure of the text-based sentiment classification apparatus provided by an embodiment of the present application;
  • Fig. 8 is a schematic diagram of the hardware structure of the computer device provided by an embodiment of the present application.
  • Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. As a branch of computer science, artificial intelligence attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence; research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems. Artificial intelligence can simulate the information processes of human consciousness and thinking; it is also a theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • Natural language processing (NLP) uses computers to process, understand, and apply human languages (such as Chinese and English). NLP is a branch of artificial intelligence and an interdisciplinary field between computer science and linguistics, often called computational linguistics. It includes syntactic analysis, semantic analysis, and text understanding, and is often used in technical fields such as machine translation, recognition of handwritten and printed characters, speech recognition and text-to-speech conversion, information retrieval, information extraction and filtering, text classification and clustering, and public opinion analysis and opinion mining. It involves data mining, machine learning, knowledge acquisition, knowledge engineering, and artificial intelligence research related to language processing, as well as linguistic research related to language computing.
  • Text sentiment analysis, also known as opinion mining or tendency analysis, is, simply put, the process of analyzing, processing, summarizing, and reasoning over subjective texts that carry emotional color. The Internet (blogs, forums, and social review services such as Dianping) generates a large volume of user-contributed comments on people, events, products, and other subjects. These comments express various emotional colors and tendencies, such as joy, anger, sorrow, happiness, criticism, and praise; potential users can browse these subjective comments to understand public opinion on a given event or product.
  • A recurrent neural network (RNN) is a class of recursive neural networks that takes sequence data as input, recurses along the direction in which the sequence evolves, and connects all nodes (recurrent units) in a chain; the bidirectional RNN (Bi-RNN) and the Long Short-Term Memory network (LSTM) are common recurrent neural networks. Recurrent neural networks have memory, share parameters, and are Turing complete, which gives them certain advantages in learning the nonlinear characteristics of sequences. They are applied in natural language processing (NLP), for example in speech recognition, language modeling, and machine translation, and are also used in various kinds of time series forecasting. A recurrent neural network combined with a convolutional neural network (CNN) can also handle computer vision problems involving sequence input.
  • The BERT (Bidirectional Encoder Representations from Transformers) model is an autoencoding language model pre-trained with two tasks. The first task trains the language model in a masked language model (MaskLM) fashion: when a sentence is input, some words to be predicted are chosen at random and replaced with a special symbol, and the model then learns, from the given labels, which words should fill those positions (a toy illustration follows). The second task adds a sentence-level continuity prediction task on top of the bidirectional language model, that is, predicting whether two texts input to BERT are continuous; introducing this task helps the model learn the relationships between continuous text fragments.
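As a toy illustration of the MaskLM idea only (the tokens and masking rate here are hypothetical, and real BERT masks word pieces with additional rules), random tokens can be replaced with a special symbol:

```python
import random

random.seed(0)

# Toy MaskLM-style masking: replace roughly 15% of tokens with a special symbol.
tokens = ["the", "photo", "effect", "is", "first", "class"]
masked = [t if random.random() > 0.15 else "[MASK]" for t in tokens]
print(masked)  # the model is trained to predict the original word at each "[MASK]"
```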
  • Contrastive learning is a kind of self-supervised learning: it does not rely on manually labeled category information but directly uses the data itself as the supervision signal. It is a method of teaching a deep learning model which things are similar and which are dissimilar; with contrastive methods, a machine learning model can be trained to distinguish between similar and dissimilar images. Self-supervised learning in the image domain falls into two types, generative and discriminative, and contrastive learning is a typical discriminative approach. Its core idea is to automatically construct similar and dissimilar instances, that is, positive and negative samples, and to learn to compare them in feature space so that similar instances are pulled closer together while dissimilar instances are pushed farther apart and become more distinct. A representation learned this way can perform downstream tasks and be fine-tuned on a smaller labeled data set, realizing an unsupervised model learning process. The guiding principle of contrastive learning is to automatically construct similar and dissimilar instances and learn a model under which similar instances end up relatively close in the projection space while dissimilar instances end up relatively far apart.
  • The batch size is a hyperparameter that defines the number of samples to process before the internal model parameters are updated. The training data set can be divided into one or more batches: when all training samples form a single batch, the learning algorithm is called batch gradient descent; when the batch is one sample, it is called stochastic gradient descent; and when the batch size is more than one sample but less than the size of the training data set, it is called mini-batch gradient descent. In short, the batch size is the number of samples processed between model updates.
  • Data augmentation is mainly used to prevent overfitting and to make the most of a small data set: it increases the amount of training data, improves the generalization ability of the model, and, by adding noise data, improves the model's robustness. Data augmentation can be divided into two categories, offline and online. Offline augmentation processes the data set directly, so the amount of data becomes the augmentation factor times the size of the original data set; it is often used when the data set is small. Online augmentation transforms each batch after it is obtained, for example by rotation, translation, or flipping; since some data sets cannot accommodate linear growth, online augmentation is often used for larger data sets, and many machine learning frameworks already support it and can use the GPU to optimize the computation.
  • A multilayer perceptron (MLP) is a feed-forward artificial neural network model that maps multiple input data sets onto a single output data set.
  • Dropout is a technique to prevent model overfitting: during the training of a deep learning network, neural network units are temporarily dropped from the network with a certain probability. This makes the model more robust, because it can no longer depend too heavily on particular local features (any local feature may be dropped).
  • Runtime refers to the state of a program that is running (or being executed): when you open a program and run it on your computer, that program is in its runtime. In some programming languages, certain reusable programs or instances are packaged or rebuilt into "runtime libraries", whose instances can be linked or invoked by any running program. In Java, the Runtime class encapsulates the runtime environment, and every Java application has one instance of the Runtime class, which connects the application to its running environment. A Runtime object cannot be instantiated, and an application cannot create its own instance of the Runtime class, but a reference to the current Runtime object can be obtained through the getRuntime method; once this reference is obtained, the methods of the Runtime object can be called to control the state and behavior of the Java virtual machine.
  • An embedding is a kind of vector representation: an object, which can be a word, a commodity, a movie, and so on, is represented by a low-dimensional vector, with the property that objects whose vectors are close to each other have related meanings. For example, embedding(Avengers) will be close to embedding(Iron Man) but far from embedding(Gone with the Wind). An embedding is essentially a mapping from a semantic space to a vector space that preserves, as far as possible, the relationships the original samples have in the semantic space. Because an embedding can encode an object with a low-dimensional vector while retaining its meaning, it is widely used in machine learning: when building a model, objects are encoded as low-dimensional dense vectors and then passed to a DNN, which improves efficiency (a minimal sketch follows).
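The distance property described above can be made concrete with a minimal PyTorch sketch; the vocabulary size, dimension, and indices are hypothetical, and the vectors only become meaningful after training:

```python
import torch
import torch.nn as nn

# A small lookup table: 5 hypothetical objects, each an 8-dimensional dense vector.
embedding = nn.Embedding(num_embeddings=5, embedding_dim=8)

avengers = embedding(torch.tensor(0))  # e.g. embedding(Avengers)
iron_man = embedding(torch.tensor(1))  # e.g. embedding(Iron Man)

# After training, related objects should show a high cosine similarity.
print(torch.cosine_similarity(avengers, iron_man, dim=0).item())
```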
  • A word embedding model can convert the content of a request text into a vector representation.
  • An encoder converts an input sequence into a fixed-length vector; a decoder converts the previously generated fixed vector into an output sequence. The input sequence can be text, speech, images, or video; the output sequence can be text or images.
  • The jieba tokenizer, also known as the "stutter" tokenizer, is an open-source word segmenter. Chinese word segmentation is a basic step in Chinese text processing and a basic module of Chinese human-computer natural language interaction.
  • Chinese text must be segmented into words before further analysis, and the jieba segmenter is commonly used for this. The jieba segmentation algorithm uses a prefix dictionary to scan the word graph efficiently and generates a directed acyclic graph (DAG) of all the words that the Chinese characters in a sentence could possibly form; dynamic programming is then used to find the maximum-probability path, that is, the most likely segmentation based on word frequency. Jieba supports three segmentation modes, illustrated in the sketch below: precise mode, which tries to cut the sentence into the most precise segmentation and is suitable for text analysis; full mode, which scans out all the words that can be formed in the sentence and is very fast but cannot resolve ambiguity; and search engine mode, which starts from precise mode and further splits long words to improve recall, making it suitable for search engine indexing.
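A minimal sketch of the three modes, assuming the jieba package is installed (the sample sentence is arbitrary):

```python
import jieba

sentence = "拍照效果一流,我很喜欢这个刚出的紫色"

# Precise mode: the most accurate segmentation, suited to text analysis.
print("/".join(jieba.cut(sentence, cut_all=False)))

# Full mode: scans out every word that can be formed; fast, but ambiguous.
print("/".join(jieba.cut(sentence, cut_all=True)))

# Search engine mode: precise mode plus re-splitting of long words for recall.
print("/".join(jieba.cut_for_search(sentence)))
```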
  • The Analyzer tokenizer is a component specialized in word segmentation. It generally includes three parts: character filters, a tokenizer, and token filters. Character filters pre-process the original text, for example removing HTML or special characters; the tokenizer splits the text into words according to rules; and token filters post-process the segmented words, including lowercasing, deleting stop words, and adding synonyms.
  • The TextCNN model applies convolutional neural networks to text classification tasks. The core strength of a CNN is that it can capture local correlations; for text, a CNN can extract key information similar to n-grams in sentences. TextCNN uses multiple convolution kernels of different sizes to extract key information from a sentence (similar to a multi-window n-gram model), uses max pooling to select the most influential high-dimensional classification features, extracts deep text features through a fully connected layer with dropout, and finally connects to a softmax layer for classification.
  • The softmax classifier is the generalization of the logistic regression classifier to multiple classes; its output is the probability of belonging to each of the different categories.
  • The general principle of backpropagation is as follows: the training data is fed into the input layer of the neural network, passes through the hidden layers, and finally reaches the output layer, which outputs a result. Since the network's output differs from the actual result, the error between the estimated value and the actual value is computed and propagated backwards from the output layer through the hidden layers until it reaches the input layer; during backpropagation, the parameter values are adjusted according to the error, and this process is iterated until convergence.
  • NCE loss (Noise-Contrastive Estimation loss): assume X is a sample drawn from real data (or a corpus) that obeys a reference probability density function P(d), while a noise sample Y obeys a probability density function P(n). Noise-contrastive estimation (NCE) distinguishes the two types of samples by learning a classifier, and in doing so can learn the properties of the data from the model.
  • Gradient descent is an iterative method that can be used to solve least squares problems (both linear and nonlinear). When solving for the model parameters of a machine learning algorithm, that is, in unconstrained optimization, gradient descent is one of the most commonly used methods (another common method is least squares). Gradient descent iterates step by step to obtain the minimized loss function and the corresponding model parameter values; conversely, to find the maximum of a loss function, gradient ascent is used (a toy illustration follows). Two variants have been developed from the basic method: stochastic gradient descent and batch gradient descent.
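As a toy illustration of the iterative update (the objective, learning rate, and iteration count are arbitrary):

```python
# Minimize the toy least-squares objective f(w) = (w - 3)^2 by gradient descent.
w, lr = 0.0, 0.1
for _ in range(100):
    grad = 2 * (w - 3)  # df/dw
    w -= lr * grad      # step against the gradient
print(w)                # approaches the minimizer w = 3
```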
  • Max pooling takes the point with the largest value in the local receptive field. The commonly used pooling methods are max pooling and mean pooling. The error of feature extraction comes mainly from two sources: on one hand, the limited neighborhood size increases the variance of the estimated value; on the other hand, errors in the convolutional layer parameters shift the estimated mean. Mean pooling reduces the first kind of error and retains more of the image's background information, while max pooling reduces the second kind of error and retains more texture information. The max pooling kernel size is generally 2×2; a very large input may call for 4×4, but choosing a larger shape significantly reduces the size of the signal and may cause excessive information loss. Generally, non-overlapping pooling windows perform best.
  • Embodiments of the present application provide a text-based sentiment classification method and apparatus, a computer device, and a storage medium, which can improve the accuracy of text sentiment classification. They are described through the following embodiments, starting with the text-based sentiment classification method.
  • The text-based sentiment classification method provided in the embodiments of the present application relates to the field of artificial intelligence. It can be applied to a terminal or to a server, or run as software on a terminal or server. The terminal can be a smartphone, a tablet computer, a notebook computer, a desktop computer, or a smartwatch. The server side can be configured as an independent physical server, as a server cluster or distributed system composed of multiple physical servers, or as a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDNs, and big data and artificial intelligence platforms. The software may be an application that implements a text-based sentiment classification method, but it is not limited to the above forms.
  • The embodiments of the present application can be used in many general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices.
  • This application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
  • program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • the application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote computer storage media including storage devices.
  • In some embodiments, the text-based sentiment classification method includes, but is not limited to, steps S100 to S500.
  • Step S100: obtaining the original text data to be classified;
  • Step S200: performing word segmentation on the original text data to obtain word-segmented text data;
  • Step S300: performing data augmentation on the word-segmented text data to obtain emotion positive example pairs corresponding to the word-segmented text data;
  • Step S400: performing contrastive learning on the emotion positive example pairs with a pre-trained contrastive learning model to obtain emotion embedding vectors;
  • Step S500: performing emotion classification according to the emotion embedding vectors to obtain the target emotion category corresponding to the emotional features.
  • In step S100 of some embodiments, the original text data to be classified can be obtained according to actual business requirements. For example, users' comments on e-commerce platforms are generally used as a reference for purchasing decisions, and analyzing them provides actual data support for product optimization. Original text data in the e-commerce field can be obtained by crawling user comment data from multiple e-commerce platforms (for example, comments on mobile phone purchases), such as "the photo effect is first-class, I really like the purple color that just came out".
  • In step S200 of some embodiments, when processing text data, in order to determine whether a sentence contains words that appear in the sentiment dictionary, the sentence must first be accurately cut into words, that is, automatically segmented. In practical applications, word segmentation tools such as the Analyzer or the jieba segmenter can be used on the original text data. The principle of segmenting the original text data with the jieba tokenizer is as follows: first, a pre-stored dictionary is loaded to generate a trie tree. For example, segmenting "running very smoothly, photo effect is first-class, I really like this purple color that just came out, and the appearance is very beautiful" can yield the word-segmented text data "running/very/smoothly, photo/effect/first-class, this model/just came out/purple/like it very much, appearance/very beautiful".
  • In addition, a preset emotional feature dictionary can be loaded. The emotional feature dictionary is a core part of text mining: it includes multiple words and their corresponding emotional features, where an emotional feature characterizes an emotion category. In practical applications, emotion categories are usually divided into positive, negative, and neutral; for example, in the e-commerce field, emotional features representing the positive category include "like", "very good", and "convenient". Using such a dictionary can improve the accuracy of subsequent emotion classification; a toy lookup is sketched below.
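A toy lookup under a hypothetical miniature dictionary (a real system would load a much larger, domain-specific lexicon):

```python
# Hypothetical emotional feature dictionary for the e-commerce domain.
EMOTION_DICT = {
    "like": "positive",
    "very good": "positive",
    "convenient": "positive",
    "broken": "negative",
    "ordinary": "neutral",
}

def tag_tokens(tokens):
    """Return (token, emotion category) pairs for tokens found in the dictionary."""
    return [(t, EMOTION_DICT[t]) for t in tokens if t in EMOTION_DICT]

print(tag_tokens(["running", "smooth", "like", "convenient"]))
# [('like', 'positive'), ('convenient', 'positive')]
```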
  • In step S300 of some embodiments, the word-segmented text data is subjected to data augmentation to obtain the emotion positive example pairs corresponding to it. An emotion positive example pair is data that has the same emotional features as the word-segmented text data but different text content. Constructing emotion positive example pairs expands the amount of data available for model training under each emotional feature, making the training result more accurate. Emotion negative example pairs, by contrast, are data whose emotional features differ from those of the word-segmented text data.
  • Data augmentation can be understood simply as the process of generating a large amount of data from a small amount of data. A successful neural network has a large number of parameters, and making these parameters work correctly requires a large amount of training data; in reality there is rarely that much data, so data augmentation is needed. Word-segmented text data can be augmented in the following two ways. The first mainly processes the feature representation of the original text, for example injecting random noise into the representation layer to obtain an enhanced text representation, which can be decoded into augmented text or used directly to train the model. The second mainly transforms the words of the original text, for example through synonym replacement or deletion; most studies improve the augmentation effect by introducing various external resources. Through data augmentation, the amount of training data can be increased, the generalization ability of the model improved, noise data added, the robustness of the model improved, and the problem of insufficient or imbalanced data alleviated.
  • In step S400 of some embodiments, a pre-trained contrastive learning model performs contrastive learning on the emotion positive example pairs to obtain an emotion embedding vector. Specifically, the emotion positive example pair is input into the contrastive learning model, which calculates its similarity; the loss function of the model is then calculated from the similarity of the emotion positive example pair and the similarity of the emotion negative example pair to obtain a loss value; finally, gradient descent is used to backpropagate the loss value to the word embedding matrix used to construct word vectors in the contrastive learning model, modifying its matrix parameters and thereby yielding the emotion embedding vector.
  • The emotion embedding vector obtained from the contrastive learning model is then used for emotion classification. With the usual models it is difficult to distinguish neutral from positive and neutral from negative, mainly because most of the training data is neutral and there is little positive and negative data; such imbalance can also lead to model collapse, which means that the diversity of the model's representations is low. Contrastive learning alleviates both problems.
  • In step S500 of some embodiments, emotion classification is performed according to the emotion embedding vector to obtain the target emotion category corresponding to the emotional features. Specifically, the emotion embedding vector can be input into the TextCNN model, and the target emotion category is obtained from the TextCNN model's output.
  • In some embodiments, step S300 specifically includes, but is not limited to, steps S310 to S340.
  • Step S310: copying the word-segmented text data to obtain copied text data;
  • Step S320: performing a first data augmentation process on the word-segmented text data to obtain a first encoding vector;
  • Step S330: performing a second data augmentation process on the copied text data to obtain a second encoding vector;
  • Step S340: obtaining an emotion positive example pair from the first encoding vector and the second encoding vector.
  • In step S310 of some embodiments, assuming a certain piece of word-segmented text data is x, x is copied to obtain copied text data x'; the emotional features of the word-segmented text data x and of the copied text data x' are the same. The contrastive learning model includes an embedding layer. Specifically, the word-segmented text data x is input to the embedding layer and augmented through the dropout encoder of that layer to generate the first encoding vector h(x1); the copied text data x' is likewise input to the embedding layer and augmented through the dropout encoder to obtain the second encoding vector h(x2). The "first" and "second" data augmentation processes refer to the same augmentation operation: "first" and "second" merely distinguish the operations applied to the word-segmented text data and to the copied text data, and do not restrict the order of processing.
  • When training data is insufficient, the trained model is prone to overfitting. Overfitting is often encountered when training neural networks and manifests as a small loss and high prediction accuracy on the training data but a larger loss and lower accuracy on the test data. To mitigate this, the training data, that is, the word-segmented text data, can be augmented. The dropout encoder can be used as a trick for training deep neural networks: it reduces the interaction between feature detectors, where detector interaction means that some detectors can only function in dependence on others. Simply put, during forward propagation the activation of a given neuron is stopped with a certain probability p; since the model can then no longer depend too heavily on particular local features, its generalization becomes stronger.
  • The SimCSE (Simple Contrastive Learning of Sentence Embeddings) model uses self-supervised learning to improve the representation of sentences. Since SimCSE has no labeled data, it treats each sentence itself as its own similar sentence; in other words, SimCSE essentially trains a contrastive learning model with (self, self) as positive pairs and (self, others) as negative pairs, but doing this naively greatly reduces generalization. Therefore, data amplification methods can be used, namely using the dropout encoder directly as the data amplification: each piece of word-segmented text data x is passed through an encoder with dropout to obtain the first encoding vector h(x1), and the copied text data x', identical to x, is passed through the encoder with dropout again (with a new random dropout mask this time) to obtain the second encoding vector h(x2). The purpose is to make the two samples of an emotion positive example pair different, achieving data amplification through dropout. Alternatively, in-batch contrastive learning can be used to perform the augmentation within a batch, again so that the two samples of the emotion positive pair (the word-segmented text data and the copied text data) differ.
  • In step S340 of some embodiments, the emotion positive example pair z(x1) and z(x2) is obtained from the first encoding vector h(x1) and the second encoding vector h(x2).
  • In some embodiments, step S340 specifically includes, but is not limited to, steps S341 to S343.
  • Step S341: mapping the first encoding vector through a first multilayer perceptron to obtain first mapping data;
  • Step S342: mapping the second encoding vector through a second multilayer perceptron to obtain second mapping data;
  • Step S343: constructing an emotion positive example pair from the first mapping data and the second mapping data.
  • In steps S341 and S342 of some embodiments, the first encoding vector h(x1) and the second encoding vector h(x2) are mapped separately. Specifically, the fixed first multilayer perceptron in the contrastive learning model maps the first encoding vector to obtain the first mapping data z(x1), and the fixed second multilayer perceptron maps the second encoding vector to obtain the second mapping data z(x2). In step S343 of some embodiments, the first mapping data z(x1) and the second mapping data z(x2) are used as an emotion positive example pair.
  • Emotion negative example pairs can be constructed in the same way: text data is input to the embedding layer and augmented through its dropout encoder to generate a third encoding vector h(x3) and a fourth encoding vector h(x4); the first multilayer perceptron maps the third encoding vector h(x3) to obtain the third mapping data z(x3), and the second multilayer perceptron maps the fourth encoding vector h(x4) to obtain the fourth mapping data z(x4), where the third mapping data z(x3) and the fourth mapping data z(x4) are emotion negative example pair data. A sketch of the pair construction follows.
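The construction of h(x1), h(x2) and z(x1), z(x2) in steps S310 to S343 can be sketched in PyTorch as follows. This is a minimal sketch under the assumption of an embedding-plus-dropout encoder and two small MLP projection heads; all dimensions and module names are hypothetical. In SimCSE fashion, the two encoding vectors differ only because the dropout mask is resampled on each forward pass.

```python
import torch
import torch.nn as nn

class DropoutEncoder(nn.Module):
    """Embedding layer whose dropout serves as the data augmentation."""
    def __init__(self, vocab_size=1000, dim=128, p=0.1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.dropout = nn.Dropout(p)

    def forward(self, token_ids):
        # Mean-pool token embeddings into a sentence vector, then apply dropout.
        return self.dropout(self.embed(token_ids).mean(dim=0))

encoder = DropoutEncoder()  # train mode by default, so dropout is active
mlp1 = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))
mlp2 = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))

x = torch.randint(0, 1000, (12,))  # word-segmented text data x as token ids
x_copy = x.clone()                 # copied text data x'

h_x1 = encoder(x)       # first encoding vector h(x1)
h_x2 = encoder(x_copy)  # second encoding vector h(x2): a new dropout mask

z_x1, z_x2 = mlp1(h_x1), mlp2(h_x2)  # emotion positive example pair z(x1), z(x2)
```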
  • In some embodiments, the text-based sentiment classification method of the embodiments of the present application further includes constructing the contrastive learning model, which specifically includes, but is not limited to, steps S410 to S440.
  • Step S410: obtaining training samples;
  • Step S420: inputting the sample positive example pairs and the sample negative example pairs into the original learning model;
  • Step S430: calculating the loss function of the original learning model from the sample positive example pairs and the sample negative example pairs to obtain a loss value;
  • Step S440: updating the original learning model according to the loss value to obtain the contrastive learning model.
  • In step S410 of some embodiments, training samples for constructing the contrastive learning model are acquired. The training samples include sample positive example pairs and sample negative example pairs, which are constructed in the same way as the emotion positive example pairs and emotion negative example pairs described above and will not be described again here.
  • In step S430 of some embodiments, the loss function of the original learning model is calculated from the sample positive example pairs and the sample negative example pairs to obtain the loss value. The embodiment of the present application can use the NCE loss function; the specific loss function is shown in formula (1). The numerator of the loss function is the first similarity, corresponding to the sample positive example pair, and the denominator comprises the first similarity together with the second similarities of all the sample negative example pairs, where the first similarity and the second similarity can be calculated by formulas (2) to (4).
  • Here f(x)^T is the transpose of f(x); f(x) is the representation of the original sample (the word-segmented text data); f(x+) is the representation of the sample positive example; f(x_j) ranges over the sample negative examples; and the denominator term includes one sample positive example pair and N-1 sample negative example pairs. A plausible reconstruction of the loss is given below.
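Formulas (1) to (4) themselves are not reproduced in this text. Based on the description above (a dot-product score, the positive pair in the numerator, and the positive pair plus N-1 negative pairs in the denominator), the loss plausibly takes the standard InfoNCE/NCE form, with Score(u, v) = u^T v:

```latex
\mathcal{L} = -\log
  \frac{\exp\!\left( f(x)^{T} f(x^{+}) \right)}
       {\exp\!\left( f(x)^{T} f(x^{+}) \right)
        + \sum_{j=1}^{N-1} \exp\!\left( f(x)^{T} f(x_{j}) \right)}
```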
  • In some embodiments, step S440 includes, but is not limited to, the step of using the loss value as the quantity to backpropagate and adjusting the model parameters of the original learning model, so as to update the original learning model into the contrastive learning model. Specifically, partial derivatives of the loss function can be computed with respect to the input vectors, and the resulting partial derivative values backpropagated to adjust the model parameters of the original learning model. During training, the loss value must be minimized to achieve a good training effect: the NCE loss function is calculated and gradient descent is used to update the model parameters of the original learning model, thereby obtaining the contrastive learning model.
  • The similarity of a sample positive example pair is 1 when the two samples are alike and 0 when they are not; the purpose of contrastive training is to push the similarity of sample positive example pairs as close to 1 as possible, and at the same time to minimize the similarity of sample negative example pairs, pushing it as close to 0 as possible during training. The loss value of the loss function is minimized, and gradient descent is used for backpropagation according to the loss value: the model parameters of the original learning model are continuously updated along the gradient until a convergence condition is reached, yielding the optimized model parameters and hence the contrastive learning model. The similarity of a sample positive example pair must be much greater than that of a sample negative example pair, where x+ denotes data similar to the word-segmented text data x (sample positive example data) and x- denotes data dissimilar to x (sample negative example data); f(x+) is a sample positive example and f(x-) is a sample negative example. The Score function used to evaluate the similarity between two features is shown in formulas (3) and (4), where Score uses the dot product as the score function.
  • In some embodiments, step S500 specifically includes, but is not limited to, steps S510 to S550.
  • Step S510: obtaining a preset neural network model;
  • Step S520: performing feature extraction on the emotion embedding vector through the convolutional layer to obtain multiple convolutional feature vectors;
  • Step S530: performing max pooling on each convolutional feature vector through the pooling layer to obtain multiple pooled feature vectors;
  • Step S540: splicing the multiple pooled feature vectors through the fully connected layer to obtain a spliced feature vector;
  • Step S550: classifying the spliced feature vector with the classifier to obtain the target emotion category corresponding to the emotional features.
  • In step S510 of some embodiments, a preset neural network model is obtained; for example, a TextCNN model can be used, which includes a convolutional layer, a pooling layer, a fully connected layer, and a classifier (a sketch follows the step details below). In step S520, the convolutional layer of the TextCNN model comprises a plurality of convolutional blocks, which extract features from the emotion embedding vector to obtain a plurality of convolutional feature vectors. In step S530, max pooling is applied to each convolutional feature vector through the pooling layer of the TextCNN model to obtain multiple pooled feature vectors: convolution kernels of different sizes produce feature maps of different sizes, so a pooling function is applied to each feature map to make their dimensions the same. Max pooling is the most common choice; each convolution kernel then yields a single value, and applying max pooling across all convolution kernels yields the multiple pooled feature vectors. In step S540, the multiple pooled feature vectors are spliced through the fully connected layer of the TextCNN model: the pooled feature vectors obtained in step S530 are concatenated into the final, spliced feature vector, which is input into the classifier; dropout can be used here to prevent overfitting. In step S550, a classifier of the TextCNN model, such as a softmax classifier, classifies the spliced feature vector to obtain the target emotion category corresponding to the emotional features.
  • In some embodiments, step S550 specifically includes, but is not limited to, steps S551 and S552.
  • Step S551: classifying the spliced feature vector with the classifier to obtain a plurality of candidate emotion categories and an emotion probability value corresponding to each candidate emotion category;
  • Step S552: selecting the candidate emotion category with the highest emotion probability value as the target emotion category.
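Steps S510 to S552 can be sketched as the following minimal TextCNN head in PyTorch; the kernel sizes, channel count, and the assumption of three emotion categories (positive, negative, neutral) are illustrative choices, not values fixed by the application.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNNHead(nn.Module):
    def __init__(self, embed_dim=128, num_classes=3, kernel_sizes=(2, 3, 4), channels=64):
        super().__init__()
        # S520: convolution kernels of several sizes extract n-gram-like features.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, channels, k) for k in kernel_sizes]
        )
        self.dropout = nn.Dropout(0.5)
        # S540/S550: fully connected layer over the concatenated pooled features.
        self.fc = nn.Linear(channels * len(kernel_sizes), num_classes)

    def forward(self, emb):                      # emb: (batch, embed_dim, seq_len)
        feats = [F.relu(conv(emb)) for conv in self.convs]   # S520
        pooled = [f.max(dim=2).values for f in feats]        # S530: max pooling
        spliced = self.dropout(torch.cat(pooled, dim=1))     # S540: splicing
        return F.softmax(self.fc(spliced), dim=1)            # S550: probabilities

probs = TextCNNHead()(torch.randn(1, 128, 20))
target_category = probs.argmax(dim=1)            # S552: highest-probability class
```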
  • The text-based sentiment classification method proposed in the embodiments of the present application obtains the original text data to be classified and segments it to obtain multiple pieces of word-segmented text data, where the word-segmented text data includes emotional features characterizing emotion categories; performs data augmentation on the word-segmented text data to obtain the corresponding emotion positive example pairs, each of which also includes the emotional features; performs contrastive learning on the emotion positive example pairs through the pre-trained contrastive learning model to obtain emotion embedding vectors containing the emotional features; and then performs emotion classification according to the emotion embedding vectors to obtain the target emotion categories corresponding to the emotional features. By performing contrastive learning on the emotion positive example pairs in combination with the contrastive learning model and classifying only after the emotion embedding vectors have been obtained, the method can solve the problem of unevenly distributed training data and avoid model collapse, thereby improving the accuracy of sentiment classification.
  • An embodiment of the present application also provides a text-based sentiment classification apparatus comprising an acquisition module 610, a word segmentation module 620, an enhancement module 630, a learning module 640, and a classification module 650. Specifically, the acquisition module 610 obtains the original text data to be classified; the word segmentation module 620 performs word segmentation on the original text data to obtain word-segmented text data, which includes emotional features characterizing emotion categories; the enhancement module 630 performs data augmentation on the word-segmented text data to obtain the corresponding emotion positive example pairs, each of which includes the emotional features; the learning module 640 performs contrastive learning on the emotion positive example pairs through the pre-trained contrastive learning model to obtain emotion embedding vectors; and the classification module 650 performs emotion classification according to the emotion embedding vectors to obtain the target emotion categories corresponding to the emotional features. By combining the contrastive learning model in this way, the apparatus can solve the problem of unevenly distributed training data and avoid model collapse, thereby improving the accuracy of sentiment classification. The apparatus of this embodiment implements the text-based sentiment classification method of the above embodiments; its processing is the same as that of the method and will not be repeated here.
  • An embodiment of the present application also provides a computer device, including:
  • at least one processor; and
  • a memory storing instructions that are executed by the at least one processor, so that when the at least one processor executes the instructions, a text-based sentiment classification method is realized, the method comprising: obtaining original text data to be classified; performing word segmentation on the original text data to obtain word-segmented text data, where the word-segmented text data includes emotional features characterizing emotion categories; performing data augmentation on the word-segmented text data to obtain the corresponding emotion positive example pairs; performing contrastive learning on the emotion positive example pairs to obtain the emotion embedding vector; and performing emotion classification according to the emotion embedding vector to obtain the target emotion category corresponding to the emotional features.
  • The computer device includes a processor 710, a memory 720, an input/output interface 730, a communication interface 740, and a bus 750. The processor 710 may be implemented by a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and executes the relevant programs to realize the technical solutions provided by the embodiments of the present application. The memory 720 may be implemented as read-only memory (ROM), a static storage device, a dynamic storage device, or random access memory (RAM); it can store an operating system and other application programs, and when the technical solutions provided by the embodiments of this specification are implemented through software or firmware, the relevant program code is stored in the memory 720 and invoked by the processor 710 to execute the text-based sentiment classification method of the embodiments of the present application. The input/output interface 730 realizes information input and output. The communication interface 740 realizes communication between this device and other devices, either wired (e.g., USB or network cable) or wireless (e.g., mobile network, WiFi, Bluetooth). The bus 750 transfers information between the components of the device (e.g., the processor 710, the memory 720, the input/output interface 730, and the communication interface 740); the processor 710, the memory 720, the input/output interface 730, and the communication interface 740 are interconnected within the device through the bus 750.
  • An embodiment of the present application also provides a storage medium, which is a computer-readable storage medium storing computer-executable instructions; the computer-executable instructions cause a computer to execute a text-based sentiment classification method comprising: obtaining original text data to be classified; performing word segmentation on the original text data to obtain word-segmented text data, where the word-segmented text data includes emotional features characterizing emotion categories; performing data augmentation on the word-segmented text data to obtain the corresponding emotion positive example pairs; performing contrastive learning on the emotion positive example pairs to obtain the emotion embedding vector; and performing emotion classification according to the emotion embedding vector to obtain the target emotion category corresponding to the emotional features.
  • The computer-readable storage medium may be non-volatile or volatile. The memory, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs and non-transitory computer-executable programs. In addition, the memory may include high-speed random access memory as well as non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the processor via a network; examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • "At least one (item)" means one or more, and "multiple" means two or more. "And/or" describes the relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean that only A exists, that only B exists, or that both A and B exist, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the objects around it. "At least one of the following" and similar expressions refer to any combination of the listed items, including any combination of single or plural items; for example, "at least one of a, b, or c" can mean a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can each be single or multiple.
  • In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. The device embodiments described above are only illustrative; for example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. The mutual coupling, direct coupling, or communication connections shown or discussed may be realized through certain interfaces, and the indirect coupling or communication connections between devices or units may be electrical, mechanical, or in other forms. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units, and some or all of them can be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit; the integrated units can be implemented in the form of hardware or as software functional units.
  • If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. On this understanding, the technical solution of the present application, in essence, or the part contributing over the prior art, or the whole or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored on a storage medium and includes multiple instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or some of the steps of the methods described in the various embodiments of the present application. The aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This embodiment provides a text-based sentiment classification method and apparatus, a computer device, and a storage medium, belonging to the technical field of artificial intelligence. The text-based sentiment classification method includes: obtaining original text data to be classified, and performing word segmentation on the original text data to obtain multiple pieces of word-segmented text data; performing data augmentation on the word-segmented text data to obtain emotion positive example pairs corresponding to the word-segmented text data; performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain emotion embedding vectors containing emotional features; and then performing emotion classification according to the emotion embedding vectors to obtain target emotion categories corresponding to the emotional features. By performing contrastive learning on the emotion positive example pairs in combination with a contrastive learning model and carrying out emotion classification after the emotion embedding vectors have been obtained, the problem of unevenly distributed training data can be solved, thereby improving the accuracy of sentiment classification.

Description

Text-based sentiment classification method and apparatus, computer device, and storage medium
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on January 11, 2022, with application number 202210028278.4 and the invention title "Text-based sentiment classification method and apparatus, computer device, storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a text-based sentiment classification method and apparatus, a computer device, and a storage medium.
Background
With the development of computer technology, many businesses need to use computer technology for natural language processing, of which sentiment analysis is a common application. At present, recurrent neural networks (RNN) or BERT models are usually used for text sentiment classification.
Technical Problem
The following is a technical problem of the prior art recognized by the inventors: for these two kinds of models, the different kinds of training data are unevenly distributed during training, which affects the accuracy of sentiment classification. How to provide a text sentiment classification method that improves the accuracy of text sentiment classification has therefore become an urgent technical problem.
Technical Solution
In a first aspect, an embodiment of the present application proposes a text-based emotion classification method, including:
obtaining original text data to be classified;
performing word segmentation processing on the original text data to obtain word segmentation text data, where the word segmentation text data includes emotional features used to represent emotion categories;
performing data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data, where each of the emotion positive example pairs includes the emotional features;
performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector; and
performing emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features.
In a second aspect, an embodiment of the present application proposes a text-based emotion classification apparatus, including:
an obtaining module, configured to obtain original text data to be classified;
a word segmentation module, configured to perform word segmentation processing on the original text data to obtain word segmentation text data, where the word segmentation text data includes emotional features used to represent emotion categories;
an enhancement module, configured to perform data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data, where each of the emotion positive example pairs includes the emotional features;
a learning module, configured to perform contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector; and
a classification module, configured to perform emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features.
In a third aspect, an embodiment of the present application proposes a computer device, the computer device including a memory and a processor, where a program is stored in the memory, and when the program is executed by the processor, the processor is configured to perform a text-based emotion classification method, where the text-based emotion classification method includes:
obtaining original text data to be classified;
performing word segmentation processing on the original text data to obtain word segmentation text data, where the word segmentation text data includes emotional features used to represent emotion categories;
performing data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data, where each of the emotion positive example pairs includes the emotional features;
performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector; and
performing emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features.
In a fourth aspect, an embodiment of the present application proposes a storage medium, the storage medium being a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to cause a computer to perform a text-based emotion classification method, where the text-based emotion classification method includes:
obtaining original text data to be classified;
performing word segmentation processing on the original text data to obtain word segmentation text data, where the word segmentation text data includes emotional features used to represent emotion categories;
performing data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data, where each of the emotion positive example pairs includes the emotional features;
performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector; and
performing emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features.
Beneficial Effects
In the text-based emotion classification method and apparatus, computer device, and storage medium proposed in the embodiments of the present application, contrastive learning is performed on the emotion positive example pairs with a contrastive learning model, and emotion classification is carried out after the emotion embedding vector is obtained. This solves the problem of unbalanced distribution of training data, thereby improving the accuracy of emotion classification.
Brief Description of the Drawings
FIG. 1 is a first flowchart of a text-based emotion classification method provided by an embodiment of the present application;
FIG. 2 is a flowchart of step S300 in FIG. 1;
FIG. 3 is a flowchart of step S340 in FIG. 2;
FIG. 4 is a second flowchart of the text-based emotion classification method provided by an embodiment of the present application;
FIG. 5 is a flowchart of step S500 in FIG. 1;
FIG. 6 is a flowchart of step S550 in FIG. 5;
FIG. 7 is a block diagram of the module structure of a text-based emotion classification apparatus provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of the hardware structure of a computer device provided by an embodiment of the present application.
Embodiments of the Present Invention
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present application and are not intended to limit it.
It should be noted that although functional modules are divided in the schematic diagrams of the apparatus and a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed with a module division different from that in the apparatus, or in an order different from that in the flowcharts. The terms "first", "second", and the like in the specification, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of the present application. The terms used herein are only for the purpose of describing the embodiments of the present application and are not intended to limit the present application.
First, several terms involved in the present application are explained:
Artificial intelligence (AI): a new technical science that studies and develops theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence. AI is a branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can respond in a manner similar to human intelligence; research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems. AI can simulate the information processes of human consciousness and thinking; it is also the theory, method, technique, and application system of using digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
Natural language processing (NLP): NLP uses computers to process, understand, and apply human languages (such as Chinese and English). It is a branch of artificial intelligence and an interdisciplinary field of computer science and linguistics, often called computational linguistics. NLP includes syntactic analysis, semantic analysis, and discourse understanding, and is commonly used in fields such as machine translation, recognition of handwritten and printed characters, speech recognition and text-to-speech conversion, information retrieval, information extraction and filtering, text classification and clustering, and public opinion analysis and opinion mining. It involves data mining, machine learning, knowledge acquisition, knowledge engineering, and AI research related to language processing, as well as linguistic research related to language computing.
Text sentiment analysis: also known as opinion mining or tendency analysis; in short, the process of analyzing, processing, summarizing, and reasoning about subjective texts that carry emotional overtones. The Internet (such as blogs, forums, and social service networks such as Dianping) produces a large amount of user-contributed, valuable comment information about people, events, products, and so on. These comments express various emotional tendencies, such as joy, anger, sadness, happiness, criticism, and praise. Potential users can browse these subjective comments to understand public opinion on a given event or product.
Recurrent neural network (RNN): a class of recursive neural networks that takes sequence data as input, recurses along the direction in which the sequence evolves, and connects all nodes (recurrent units) in a chain. Bidirectional RNNs (Bi-RNN) and long short-term memory networks (LSTM) are common recurrent neural networks. RNNs have memory, share parameters, and are Turing complete, so they have certain advantages when learning the nonlinear characteristics of sequences. They are applied in natural language processing (NLP), for example speech recognition, language modeling, and machine translation, and are also used for various kinds of time-series forecasting. RNNs built with convolutional neural networks (CNN) can handle computer vision problems involving sequence input.
BERT (Bidirectional Encoder Representations from Transformers) model: an autoencoding language model pre-trained with two designed tasks. The first task trains the language model in the masked language model (MaskLM) manner: when a sentence is input, some words to be predicted are selected at random and replaced with a special symbol, and the model then learns to fill in those positions according to the given labels. The second task adds a sentence-level continuity prediction task on top of the bidirectional language model, namely predicting whether the two pieces of text input into BERT are continuous; introducing this task helps the model learn the relationship between continuous text segments.
Contrastive learning: a kind of self-supervised learning that does not rely on manually annotated category labels and directly uses the data itself as supervision. Contrastive learning is a method for teaching deep learning models which things are similar and which are different; with it, a machine learning model can be trained to distinguish between similar and dissimilar images. Self-supervised learning in the image domain falls into two types, generative and discriminative; contrastive learning is a typical case of discriminative self-supervised learning. Its core idea is to automatically construct similar and dissimilar instances, i.e., positive and negative samples, and learn to contrast them in feature space, so that similar instances are pulled closer together while dissimilar instances are pushed farther apart and become more distinct. The model representation obtained through such a learning process can then be used for downstream tasks and fine-tuned on a small labeled dataset, realizing an unsupervised model learning process. The guiding principle is: by automatically constructing similar and dissimilar instances, learn a model under which similar instances are close in the projection space while dissimilar instances are comparatively far apart.
Batch: the batch size is a hyperparameter that defines the number of samples to be processed before the internal model parameters are updated, i.e., it controls the number of training samples used between parameter updates. A training dataset can be divided into one or more batches. When all training samples form one batch, the learning algorithm is called batch gradient descent; when the batch is a single sample, it is called stochastic gradient descent; when the batch size is more than one sample but smaller than the training dataset, it is called mini-batch gradient descent.
Data augmentation: mainly used to prevent overfitting and to optimize a dataset when it is small. Augmentation increases the amount of training data, improves the generalization ability of the model, adds noisy data, and improves the robustness of the model. It falls into two categories, offline and online. Offline augmentation processes the dataset directly, so that the amount of data becomes the augmentation factor times the size of the original dataset; it is often used when the dataset is very small. Online augmentation is applied after a batch of data is obtained, performing corresponding transformations such as rotation, translation, and flipping on that batch; since some datasets cannot accept linear growth, online augmentation is commonly used for larger datasets, and many machine learning frameworks already support it with GPU-accelerated computation.
Multilayer perceptron (MLP): a feed-forward artificial neural network model that maps multiple input datasets onto a single output dataset.
Dropout: a technique for preventing model overfitting. During the training of a deep learning network, neural network units are temporarily dropped from the network with a certain probability, which makes the model more robust because it will not rely too heavily on particular local features (since those local features may be dropped).
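To make the dropout behavior above concrete, here is a minimal sketch; PyTorch is assumed purely for illustration, as the embodiments do not prescribe a framework:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)  # each unit is zeroed with probability 0.5 during training
drop.train()              # dropout is only active in training mode

x = torch.ones(1, 8)
print(drop(x))  # some units zeroed, survivors scaled by 1/(1-p) = 2.0
print(drop(x))  # a different random mask on the second pass

drop.eval()
print(drop(x))  # identity at inference time
```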
Runtime: the state of a program while it is running (or being executed); that is, when you open a program and make it run on the computer, that program is at runtime. In some programming languages, certain reusable programs or instances are packaged or rebuilt into "runtime libraries", whose instances can be linked or invoked by any program while they run. The Runtime class encapsulates the runtime environment; every Java application has a single Runtime class instance that connects the application with its running environment. A Runtime object generally cannot be instantiated, and an application cannot create its own Runtime class instance, but a reference to the current Runtime object can be obtained through the getRuntime method; once such a reference is obtained, its methods can be called to control the state and behavior of the Java virtual machine.
Embedding: a vector representation, meaning that an object (a word, a product, a movie, and so on) is represented by a low-dimensional vector. The property of an embedding vector is that objects whose vectors are close in distance have similar meanings; for example, the distance between embedding(The Avengers) and embedding(Iron Man) will be very small, while the distance between embedding(The Avengers) and embedding(Gone with the Wind) will be larger. An embedding is essentially a mapping from semantic space to vector space that preserves, as far as possible, the relationships the original samples have in the semantic space; for example, two semantically close words are also close in the vector space. Embeddings can encode an object with a low-dimensional vector while retaining its meaning; they are commonly applied in machine learning, where objects are encoded as low-dimensional dense vectors before being passed to a DNN to improve efficiency.
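As a minimal illustration of an embedding lookup (the vocabulary, indices, and dimensions below are invented for the example; PyTorch is assumed):

```python
import torch
import torch.nn as nn

vocab = {"喜欢": 0, "很好": 1, "方便": 2}  # toy vocabulary; indices are illustrative
emb = nn.Embedding(num_embeddings=len(vocab), embedding_dim=4)

ids = torch.tensor([vocab["喜欢"], vocab["方便"]])
vectors = emb(ids)    # one dense 4-dimensional vector per word
print(vectors.shape)  # torch.Size([2, 4])
```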
Word embedding model: converts the content of a request text into a vector representation.
Encoder/decoder: encoding converts an input sequence into a fixed-length vector; decoding converts the previously generated fixed vector back into an output sequence. The input sequence can be text, speech, an image, or video; the output sequence can be text or an image.
jieba tokenizer: also called the "jieba" (stammer) segmenter, an open-source word segmentation tool. Chinese word segmentation is a basic step in Chinese text processing and a fundamental module of Chinese human-computer natural language interaction; segmentation is usually needed before Chinese natural language processing, and jieba is commonly used for it. The jieba algorithm performs efficient word-graph scanning based on a prefix dictionary to generate a directed acyclic graph (DAG) of all possible word formations of the Chinese characters in a sentence, then uses dynamic programming to find the maximum-probability path and the maximum segmentation combination based on word frequency; for unregistered words, an HMM model based on the word-forming capability of Chinese characters is used together with the Viterbi algorithm. jieba supports three segmentation modes: precise mode, which tries to cut the sentence most precisely and is suitable for text analysis; full mode, which scans out all words in the sentence that can form words and is very fast but cannot resolve ambiguity; and search-engine mode, which further splits long words on the basis of precise mode to improve recall and is suitable for search-engine segmentation.
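A short sketch of the three jieba modes described above, assuming the open-source jieba package is installed; the sample sentence is taken from the review examples used later in this description:

```python
import jieba

sentence = "拍照效果一流，外形很漂亮"

# Precise mode: the most accurate segmentation, suited to text analysis.
print(jieba.lcut(sentence))
# Full mode: scans out every word that could be formed; fast but ambiguous.
print(jieba.lcut(sentence, cut_all=True))
# Search-engine mode: re-splits long words to improve recall.
print(jieba.lcut_for_search(sentence))
```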
Analyzer tokenizer: a component dedicated to tokenization, generally consisting of three parts: Character Filters, a Tokenizer (which splits text into words according to rules), and Token Filters. Character Filters are mainly used to process the raw text, for example removing HTML and special characters; the Tokenizer splits text into words according to rules; Token Filters process the split words, including lowercasing, removing stopwords, and adding synonyms.
TextCNN model: applies convolutional neural networks to the text classification task. The key point of a CNN is that it can capture local correlations; for text classification specifically, a CNN can extract n-gram-like key information from sentences. TextCNN uses convolution kernels of several different sizes to extract key information in a sentence (similar to a multi-window n-gram model), uses max-pooling to select the most influential high-dimensional classification features, then uses a fully connected layer with dropout to extract deep text features, and finally applies softmax for classification.
Softmax classifier: the generalization of the logistic regression classifier to multiple classes; it outputs probability values of belonging to the different classes.
Backpropagation: roughly, the training set data is input to the input layer of the neural network, passes through the hidden layers, and finally reaches the output layer, which outputs a result. Since the output of the network differs from the actual result, the error between the estimated and actual values is computed and propagated backward from the output layer through the hidden layers until it reaches the input layer; during backpropagation, the values of the various parameters are adjusted according to the error. This process is iterated until convergence.
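The backpropagation loop described above can be sketched as follows; this is a toy single-parameter-vector example in PyTorch, assumed only for illustration:

```python
import torch

w = torch.randn(3, requires_grad=True)  # a parameter vector to be learned
x = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor(10.0)

for _ in range(100):
    y = (w * x).sum()            # forward pass through the "network"
    loss = (y - target) ** 2     # error between estimate and actual value
    loss.backward()              # propagate the error back to w
    with torch.no_grad():
        w -= 0.01 * w.grad       # adjust parameters along the gradient
        w.grad.zero_()           # clear accumulated gradients for the next step
```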
NCE loss (Noise-Contrastive Estimation loss): suppose X is a sample drawn from the real data (or corpus), obeying a relatively referable probability density function P(d), and the noise sample Y obeys a probability density function P(n). Noise-contrastive estimation (NCE) learns a classifier to distinguish these two kinds of samples, and the properties of the data can be learned from the model.
Gradient descent: a type of iterative method that can be used to solve least-squares problems (both linear and nonlinear). When solving for the model parameters of machine learning algorithms, i.e., unconstrained optimization problems, gradient descent is one of the most commonly adopted methods; another common method is least squares. When minimizing a loss function, gradient descent can be used to iteratively solve step by step, obtaining the minimized loss function and the model parameter values. Conversely, if the maximum of the loss function needs to be solved, gradient ascent is used for iteration. In machine learning, two variants have been developed on the basis of basic gradient descent: stochastic gradient descent and batch gradient descent.
Max-pooling: taking the point with the largest value in the local receptive field. Common pooling methods are max-pooling and mean-pooling. According to the relevant theory, the error of feature extraction comes mainly from two sources: an increase in the variance of the estimate caused by the limited neighborhood size, and a shift in the estimated mean caused by errors in the convolution-layer parameters. Generally speaking, mean-pooling reduces the first kind of error and retains more background information of the image, while max-pooling reduces the second kind of error and retains more texture information. The max-pooling kernel size is generally 2x2; very large inputs may warrant 4x4. However, choosing a larger shape significantly reduces the size of the signal and may cause excessive information loss. Usually, non-overlapping pooling windows perform best.
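A minimal sketch of max-pooling on a feature map (PyTorch assumed; the values are invented):

```python
import torch
import torch.nn.functional as F

# One feature map per convolution kernel; max-pooling keeps the strongest response.
feature_map = torch.tensor([[[0.1, 0.9, 0.3, 0.5]]])  # shape (batch, channels, length)
print(F.max_pool1d(feature_map, kernel_size=2))       # 2-wide windows -> [[[0.9, 0.5]]]
print(feature_map.max(dim=2).values)                  # global max over time -> [[0.9]]
```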
With the development of computer technology, many services require natural language processing, of which sentiment analysis is a common application. At present, a recurrent neural network (RNN) or a BERT model is usually used for text emotion classification; since the different kinds of training data for these two models are unevenly distributed during training, the accuracy of emotion classification is low.
On this basis, the embodiments of the present application provide a text-based emotion classification method and apparatus, a computer device, and a storage medium, which can improve the accuracy of text emotion classification.
The embodiments of the present application provide a text-based emotion classification method and apparatus, a computer device, and a storage medium, which are specifically described through the following embodiments; the text-based emotion classification method in the embodiments of the present application is described first.
The text-based emotion classification method provided in the embodiments of the present application relates to the field of artificial intelligence. The method may be applied to a terminal, to a server side, or to software running on a terminal or server side. In some embodiments, the terminal may be a smartphone, tablet computer, laptop, desktop computer, smart watch, or the like; the server side may be configured as an independent physical server, as a server cluster or distributed system composed of multiple physical servers, or as a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms; the software may be an application implementing the text-based emotion classification method, but is not limited to the above forms.
The embodiments of the present application can be used in numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices. The application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform specific tasks or implement specific abstract data types. The application may also be practiced in distributed computing environments, where tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in local and remote computer storage media, including storage devices.
Referring to FIG. 1, the text-based emotion classification method according to the embodiment of the first aspect of the present application includes, but is not limited to, steps S100 to S500.
Step S100, obtaining original text data to be classified;
Step S200, performing word segmentation processing on the original text data to obtain word segmentation text data;
Step S300, performing data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data;
Step S400, performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector;
Step S500, performing emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features.
In step S100 of some embodiments, the original text data to be classified can be obtained according to actual business requirements. For example, in the e-commerce field, the comment information of users on an e-commerce platform is generally taken as a reference for purchase decisions, so the users' comments need to be emotionally classified and analyzed; mining user emotions provides practical data support for product selection and for optimizing platform operation. Specifically, the original text data in the e-commerce field can be obtained by crawling user comment data from multiple e-commerce platforms (for example, comments on purchased mobile phones). The collected original text data may be, for example: "Runs very smoothly, the camera is first-class, I really like this newly released purple, the appearance is beautiful", "Fast charging, top-notch, great quality, compact and convenient, runs fast", "The appearance is simple and elegant, easy to store", "The logistics are powerful; order on the same day and receive the phone the next day", and "It felt light in the hand at first; once used to it, it feels very light and handy, and the size is just right".
In step S200 of some embodiments, when processing the text data, in order to judge whether a sentence contains words from the emotion dictionary, the sentence needs to be cut accurately into individual words, i.e., automatic word segmentation. In practical applications, segmentation tools such as the Analyzer tokenizer and the jieba tokenizer can be used to segment the original text data. Specifically, the principle of segmenting the original text data with jieba is as follows. First, a pre-stored dictionary is loaded to generate a trie. Then, given a sentence to be segmented (each sentence in the original text data), regular expressions are used to obtain continuous Chinese and English characters, which are cut into a list of phrases; for each phrase, the DAG (dictionary lookup) and dynamic programming are used to obtain the maximum-probability path. Characters in the DAG that are not found in the dictionary are combined into a new fragment phrase, which is segmented using an HMM (hidden Markov model), i.e., recognizing new words outside the dictionary. For example, segmenting "运行非常流畅，拍照效果一流，这款刚出来的紫色非常喜欢，外形很漂亮" may yield the word segmentation text data "运行/非常/流畅，拍照/效果/一流，这款/刚出来/的/紫色/非常喜欢，外形/很漂亮".
After the original text data is segmented to obtain the word segmentation text data, a preset emotion feature dictionary can be loaded. Generally speaking, the emotion feature dictionary is the core of text mining; it includes multiple words and the emotional features corresponding to those words, where the emotional features are used to represent emotion categories. In practical applications, emotion categories are usually divided into positive, negative, and neutral; for example, in the e-commerce field, emotional features representing the positive category include "喜欢" (like), "很好" (very good), and "方便" (convenient). This improves the accuracy of subsequent emotion classification.
In step S300 of some embodiments, data enhancement processing is performed on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data. An emotion positive example pair refers to data with the same emotional features as the word segmentation text data but different text content; the embodiments of the present application construct emotion positive example pairs to expand the amount of training data under that emotional feature, making the training results more accurate. An emotion negative example pair refers to data whose emotional features differ from those of the word segmentation text data. Data enhancement can be simply understood as generating a large amount of data from a small amount: successful neural networks generally have a large number of parameters, and making these parameters work correctly requires a large amount of training data, which in practice is often unavailable, hence the need for data augmentation. In practical applications, the word segmentation text data can be enhanced in the following two ways. The first way mainly processes the feature representation of the original text, for example injecting random noise at the representation layer, to obtain an enhanced text representation; the enhanced representation can then be decoded into enhanced text or used directly to train the model. The second way mainly performs operations such as synonym replacement or deletion on the words in the original text, and most research improves the enhancement effect by introducing various external resources. Enhancing the word segmentation text data increases the amount of training data, improves the generalization ability of the model, adds noisy data, improves the robustness of the model, and can solve the problems of insufficient or unbalanced data.
In step S400 of some embodiments, contrastive learning is performed on the emotion positive example pairs through the pre-trained contrastive learning model to obtain the emotion embedding vector. Specifically, the emotion positive example pairs are input into the contrastive learning model, which computes their similarity; the loss function of the contrastive learning model is computed from the similarity of the emotion positive example pairs and the similarity of the emotion negative example pairs to obtain a loss value, and gradient descent is then used to backpropagate the loss value to the word embedding matrix used to construct word vectors in the contrastive learning model, modifying its matrix parameters and thereby obtaining the emotion embedding vector. The emotion embedding vector is obtained by the contrastive learning model, after which emotion classification is performed on it. With an ordinary classification model it is difficult to distinguish neutral from positive and neutral from negative, mainly because most of the training data is neutral while positive and negative data are scarce. To address this, performing contrastive learning on the emotion positive example pairs and emotion negative example pairs with a contrastive learning model can solve the problem of unbalanced data distribution as well as the problem of model collapse, where model collapse refers to low diversity of the model.
In step S500 of some embodiments, emotion classification processing is performed according to the emotion embedding vector to obtain the target emotion category corresponding to the emotional features. In practical applications, the emotion embedding vector can be input into a TextCNN model, and the target emotion category corresponding to the emotional features is obtained from the output of the TextCNN model.
In some embodiments, as shown in FIG. 2, step S300 specifically includes, but is not limited to, steps S310 to S340.
Step S310, copying the word segmentation text data to obtain copied text data;
Step S320, performing first data enhancement processing on the word segmentation text data to obtain a first encoding vector;
Step S330, performing second data enhancement processing on the copied text data to obtain a second encoding vector;
Step S340, obtaining the emotion positive example pair according to the first encoding vector and the second encoding vector.
In step S310 of some embodiments, suppose a piece of word segmentation text data is x; copying x yields the copied text data x', where the emotional features of the word segmentation text data x and of the copied text data x' are the same.
In steps S320 and S330 of some embodiments, the contrastive learning model includes an embedding layer. Specifically, the word segmentation text data x is input into the embedding layer, and data enhancement is performed on it through the dropout encoder of the embedding layer to generate the first encoding vector h(x1). The copied text data x' is input into the embedding layer, and data enhancement is performed on it through the dropout encoder of the embedding layer to obtain the second encoding vector h(x2). It should be noted that both the first data enhancement processing and the second data enhancement processing denote data enhancement operations; "first" and "second" are merely used to distinguish the enhancement operations performed on the word segmentation text data and on the copied text data, respectively, and do not limit the order of the enhancement processing.
It should be noted that in a machine learning model, if the model has too many parameters and too few training samples, the trained model is prone to overfitting. Overfitting is often encountered when training neural networks and manifests as a small loss and high prediction accuracy on the training data but a large loss and low prediction accuracy on the test data. To avoid overfitting, the training data, i.e., the word segmentation text data, can be enhanced; adding a dropout encoder effectively alleviates overfitting and achieves a regularization effect to a certain extent. The dropout encoder can be chosen as a trick for training deep neural networks. In each training batch, ignoring half of the feature detectors (setting half of the hidden-layer node values to 0) significantly reduces overfitting; this reduces the interaction between feature detectors (hidden-layer nodes), where detector interaction means that some detectors can only function by relying on other detectors. Simply put, during forward propagation the activation of a neuron is stopped with a certain probability p; because the model then cannot rely too heavily on particular local features, it generalizes better.
In practical applications, a SimCSE (Simple Contrastive Learning of Sentence Embeddings) model can be used to perform data enhancement on the word segmentation text data. SimCSE uses self-supervised learning to improve sentence representation; since SimCSE has no labeled data, it treats each sentence itself as its own similar sentence. In other words, SimCSE essentially trains the contrastive learning model with (self, self) as the positive example pair and (self, other) as the negative example, but this can greatly weaken generalization. In the embodiments of the present application, on the basis of SimCSE, data augmentation means can be used, namely directly using the dropout encoder as the data augmentation. Specifically, each piece of word segmentation text data x is enhanced through the encoder with dropout to obtain the first encoding vector h(x1), and then the copied text data x', identical to x, is passed again through the encoder with dropout (this time with another random dropout mask) to obtain the second encoding vector h(x2). The purpose is to make the two samples of the emotion positive example pair differ, thereby achieving data augmentation via dropout. In addition, at the contrastive learning stage, an in-batch contrastive learning method can be adopted, performing data enhancement within the batch so that the two samples of the emotion positive example pair (the word segmentation text data and the copied text data) differ.
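The dropout-as-augmentation idea can be sketched as follows. The TinyEncoder below is a hypothetical stand-in for the embedding layer's dropout encoder, not the exact encoder of the embodiments; PyTorch and all dimensions are assumptions for illustration:

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Hypothetical stand-in: any encoder that contains dropout works the same way."""
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(),
            nn.Dropout(p=0.1),          # the dropout that provides the augmentation
            nn.Linear(dim, dim))
    def forward(self, x):
        return self.net(x)

enc = TinyEncoder()
enc.train()                             # keep dropout active

x = torch.randn(4, 16)                  # a batch of (already embedded) segmented texts
h1 = enc(x)                             # first pass:  h(x1)
h2 = enc(x)                             # second pass on the copy: h(x2), new dropout mask
# (h1[i], h2[i]) form the positive pair for sample i; within the batch,
# (h1[i], h2[j]) with j != i can serve as negatives.
```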
In step S340 of some embodiments, the emotion positive example pair z(x1) and z(x2) is obtained according to the first encoding vector h(x1) and the second encoding vector h(x2).
In some embodiments, as shown in FIG. 3, step S340 specifically includes, but is not limited to, steps S341 to S343.
Step S341, mapping the first encoding vector through a first multilayer perceptron to obtain first mapping data;
Step S342, mapping the second encoding vector through a second multilayer perceptron to obtain second mapping data;
Step S343, constructing the emotion positive example pair according to the first mapping data and the second mapping data.
In steps S341 and S342 of some embodiments, mapping processing is performed on the first encoding vector h(x1) and the second encoding vector h(x2) respectively. Specifically, the first encoding vector is mapped through the fixed first multilayer perceptron in the contrastive learning model to obtain the first mapping data z(x1), and the second encoding vector is mapped through the fixed second multilayer perceptron in the contrastive learning model to obtain the second mapping data z(x2).
In step S343 of some embodiments, the first mapping data z(x1) and the second mapping data z(x2) are taken as the emotion positive example pair.
In addition, in some embodiments, emotion negative example pairs also need to be constructed; their specific construction can refer to the way the emotion positive example pairs are constructed in steps S310 to S340 and steps S341 to S343. It should be noted that, unlike the emotion positive example pair, the emotion negative example pair requires selecting data whose emotional features are inconsistent with those of the word segmentation text data x as the text source data x_other. The word segmentation text data x is input into the embedding layer, and data enhancement is performed through the dropout encoder of the embedding layer to generate a third encoding vector h(x3). The text source data x_other is input into the embedding layer, and data enhancement is performed through the dropout encoder of the embedding layer to obtain a fourth encoding vector h(x4). Then the third encoding vector h(x3) is mapped through the first multilayer perceptron to obtain third mapping data z(x3), and the fourth encoding vector h(x4) is mapped through the second multilayer perceptron to obtain fourth mapping data z(x4), where the third mapping data z(x3) and the fourth mapping data z(x4) constitute the emotion negative example pair data.
In some embodiments, as shown in FIG. 4, before step S400 the text-based emotion classification method of the embodiments of the present application further includes constructing the contrastive learning model, which specifically includes, but is not limited to, steps S410 to S440.
Step S410, obtaining training samples;
Step S420, inputting the sample positive example pairs and the sample negative example pairs into an original learning model;
Step S430, computing the loss function of the original learning model according to the sample positive example pairs and the sample negative example pairs to obtain a loss value;
Step S440, updating the original learning model according to the loss value to obtain the contrastive learning model.
In step S410 of some embodiments, training samples used to construct the contrastive learning model are obtained, where the training samples include sample positive example pairs and sample negative example pairs. The specific construction process of the sample positive example pairs and sample negative example pairs is the same as that of the emotion positive example pairs and emotion negative example pairs described above, and is not repeated here.
In steps S430 and S440 of some embodiments, the loss function of the original learning model is computed according to the sample positive example pairs and the sample negative example pairs to obtain a loss value. Specifically, in the embodiments of the present application the NCE loss function can be adopted, as shown in formula (1). It should be noted that the numerator of the loss function is the first similarity corresponding to the sample positive example pair, and the denominator is the first similarity together with the second similarities of all sample negative example pairs, where the first similarity and the second similarity can be computed through formulas (2) to (4).
L = -log[ exp(f(x)^T f(x+)) / ( exp(f(x)^T f(x+)) + Σ_{j=1}^{N-1} exp(f(x)^T f(x_j)) ) ]    Formula (1)
where f(x)^T is the transpose of f(x), f(x) is the original sample (the word segmentation text data), f(x+) is the sample positive example, f(x_j) is a sample negative example, and the denominator includes one sample positive example pair and N-1 sample negative example pairs.
In some embodiments, step S440 includes, but is not limited to, the step of taking the loss value as the backpropagation quantity and adjusting the model parameters of the original learning model to update the original learning model and obtain the contrastive learning model. In the embodiments of the present application, partial derivatives with respect to the input vector can be computed according to the loss function, and the obtained partial derivative values are used as the backpropagation quantity to adjust the model parameters of the original learning model. After the loss value corresponding to the original learning model is computed, the loss value also needs to be minimized to achieve a better training effect. Specifically, by maximizing the first similarity and minimizing the second similarity, the loss value of the loss function can be minimized; the NCE loss function is computed and the model parameters of the original learning model are updated using gradient descent, thereby obtaining the contrastive learning model.
Specifically, after the sample positive example pairs and sample negative example pairs are constructed, the similarity between the elements of each sample positive example pair needs to be maximized; in other words, the similarity is 1 if the pair is similar and 0 if dissimilar, and the goal of contrastive training is to bring the similarity of sample positive example pairs as close to 1 as possible. In addition, the similarity of sample negative example pairs needs to be minimized; during training of the contrastive learning model, it should be brought as close to 0 as possible. By adjusting the first similarity and the second similarity, the loss value of the loss function is minimized; according to this loss value, backpropagation is carried out using gradient descent, i.e., the model parameters of the original learning model are continuously updated along the gradient, and after certain convergence conditions are reached, the optimized model parameters are obtained, so that the original learning model is updated to obtain the contrastive learning model.
It should be noted that the first similarity of a sample positive example pair and the second similarity of a sample negative example pair need to satisfy the condition of formula (2):
Score(f(x), f(x+)) >> Score(f(x), f(x-))    Formula (2)
That is, the similarity of a sample positive example pair needs to be much greater than that of a sample negative example pair. Here x+ refers to data similar to the word segmentation text data x, i.e., sample positive example pair data; x- refers to data dissimilar to x, i.e., sample negative example pair data; f(x+) is the sample positive example and f(x-) is the sample negative example.
Further, the metric function Score used to evaluate the similarity between two features is shown in formulas (3) and (4), where Score uses the dot product as the score function.
Score(f(x), f(x+)) = f(x)^T f(x+)    Formula (3)
Score(f(x), f(x-)) = f(x)^T f(x-)    Formula (4)
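Under the in-batch construction described earlier, formula (1) with the dot-product score of formulas (3) and (4) can be sketched as follows; this is a minimal PyTorch version in which the batch size and dimensions are arbitrary assumptions:

```python
import torch
import torch.nn.functional as F

def nce_loss(z1, z2):
    """Formula (1): for row i, (z1[i], z2[i]) is the positive pair; the other
    rows of z2 act as the N-1 negatives. Scores are the dot products of
    formulas (3) and (4)."""
    scores = z1 @ z2.T                      # pairwise scores f(x)^T f(x')
    labels = torch.arange(z1.size(0))       # the positive sits on the diagonal
    return F.cross_entropy(scores, labels)  # -log(exp(pos) / sum(exp(all)))

z1 = torch.randn(8, 16, requires_grad=True)  # stand-ins for z(x1) of a batch
z2 = torch.randn(8, 16, requires_grad=True)  # stand-ins for z(x2)
loss = nce_loss(z1, z2)
loss.backward()  # gradients would flow back to the embedding matrix (step S440)
```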
In some embodiments, as shown in FIG. 5, step S500 specifically includes, but is not limited to, steps S510 to S550.
Step S510, obtaining a preset neural network model;
Step S520, performing feature extraction on the emotion embedding vector through the convolution layer to obtain multiple convolution feature vectors;
Step S530, performing max-pooling on each convolution feature vector through the pooling layer to obtain multiple pooled feature vectors;
Step S540, concatenating the multiple pooled feature vectors through the fully connected layer to obtain a concatenated feature vector;
Step S550, classifying the concatenated feature vector through the classifier to obtain the target emotion category corresponding to the emotional features.
In step S510 of some embodiments, a preset neural network model is obtained. In the embodiments of the present application, a TextCNN model can be adopted, where the TextCNN model includes a convolution layer, a pooling layer, a fully connected layer, and a classifier.
In step S520 of some embodiments, the convolution layer of the TextCNN model includes multiple convolution blocks, through which feature extraction is performed on the emotion embedding vector to obtain multiple convolution feature vectors.
In step S530 of some embodiments, max-pooling is performed on each convolution feature vector through the pooling layer of the TextCNN model to obtain multiple pooled feature vectors. It should be noted that convolution kernels of different sizes yield feature maps of different sizes, so a pooling function is applied to each feature map to make their dimensions the same. The most commonly used is max-pooling, so that each convolution kernel yields a single feature value; applying max-pooling to all convolution kernels yields multiple pooled feature vectors.
In step S540 of some embodiments, the multiple pooled feature vectors are concatenated through the fully connected layer of the TextCNN model to obtain the concatenated feature vector. Specifically, the multiple pooled feature vectors obtained in step S530 are cascaded to obtain the final feature vector, i.e., the concatenated feature vector, which is then input into the classifier for further classification. In this process, dropout can be used to prevent overfitting.
In step S550 of some embodiments, the concatenated feature vector is classified through the classifier of the TextCNN model, for example a softmax classifier, to obtain the target emotion category corresponding to the emotional features.
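Steps S510 to S550 can be sketched as the following TextCNN skeleton; the kernel sizes, filter counts, and dimensions are illustrative assumptions, not values fixed by the embodiments, and PyTorch is assumed:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    """Sketch of steps S510-S550; all hyperparameters are illustrative."""
    def __init__(self, emb_dim=128, n_classes=3, kernel_sizes=(2, 3, 4), n_filters=64):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, k) for k in kernel_sizes])  # convolution layer (S520)
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(n_filters * len(kernel_sizes), n_classes)  # fully connected layer

    def forward(self, x):      # x: (batch, seq_len, emb_dim) emotion embedding vectors
        x = x.transpose(1, 2)  # -> (batch, emb_dim, seq_len) for Conv1d
        feats = [F.relu(c(x)).max(dim=2).values for c in self.convs]  # max-pooling (S530)
        cat = self.dropout(torch.cat(feats, dim=1))                   # concatenation (S540)
        return self.fc(cat)    # logits; softmax classification follows (S550)

logits = TextCNN()(torch.randn(4, 32, 128))
probs = F.softmax(logits, dim=1)  # per-class emotion probability values
```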
In some embodiments, as shown in FIG. 6, step S550 specifically includes, but is not limited to, steps S551 and S552.
Step S551, classifying the concatenated feature vector through the classifier to obtain multiple candidate emotion categories and the emotion probability value corresponding to each candidate emotion category;
Step S552, obtaining the candidate emotion category with the highest emotion probability value as the target emotion category.
In steps S551 and S552 of some embodiments, the concatenated feature vector is classified through the softmax classifier to obtain multiple candidate emotion categories and the emotion probability value corresponding to each candidate emotion category. Specifically, for the candidate emotion categories obtained by the softmax classifier, y=0 denotes negative, y=1 denotes neutral, and y=2 denotes positive, each with its corresponding emotion probability value; the candidate emotion category with the highest emotion probability value is taken as the target emotion category, thereby completing the text-based emotion classification process.
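A toy illustration of steps S551 and S552 (the probability values are invented):

```python
import torch

probs = torch.tensor([0.15, 0.25, 0.60])               # softmax output for one text
labels = {0: "negative", 1: "neutral", 2: "positive"}  # y = 0/1/2 as described above
target = labels[int(probs.argmax())]                   # highest-probability candidate
print(target)                                          # -> "positive"
```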
In the text-based emotion classification method proposed in the embodiments of the present application, the original text data to be classified is obtained and segmented to obtain multiple pieces of word segmentation text data, where the word segmentation text data includes emotional features used to represent emotion categories; data enhancement processing is performed on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data, where each emotion positive example pair likewise includes the emotional features; contrastive learning is performed on the emotion positive example pairs through the pre-trained contrastive learning model to obtain an emotion embedding vector containing the emotional features; and emotion classification processing is then performed according to the emotion embedding vector to obtain the target emotion category corresponding to the emotional features. By performing contrastive learning on the emotion positive example pairs with a contrastive learning model and carrying out emotion classification after the emotion embedding vector is obtained, the embodiments of the present application can solve the problem of unbalanced distribution of training data and avoid the problem of model collapse, thereby improving the accuracy of emotion classification.
An embodiment of the present application also provides a text-based emotion classification apparatus, as shown in FIG. 7, which can implement the above text-based emotion classification method. The apparatus includes: an obtaining module 610, a word segmentation module 620, an enhancement module 630, a learning module 640, and a classification module 650. Specifically, the obtaining module 610 is used to obtain original text data to be classified; the word segmentation module 620 is used to perform word segmentation processing on the original text data to obtain word segmentation text data, where the word segmentation text data includes emotional features used to represent emotion categories; the enhancement module 630 is used to perform data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data, where each emotion positive example pair includes the emotional features; the learning module 640 is used to perform contrastive learning on the emotion positive example pairs through the pre-trained contrastive learning model to obtain an emotion embedding vector; and the classification module 650 is used to perform emotion classification processing according to the emotion embedding vector to obtain the target emotion category corresponding to the emotional features. By performing contrastive learning on the emotion positive example pairs with a contrastive learning model and carrying out emotion classification after the emotion embedding vector is obtained, the embodiments of the present application can solve the problem of unbalanced distribution of training data and avoid the problem of model collapse, thereby improving the accuracy of emotion classification.
The text-based emotion classification apparatus of the embodiments of the present application is used to execute the text-based emotion classification method in the above embodiments; its specific processing is the same as that of the text-based emotion classification method in the above embodiments and is not repeated here.
An embodiment of the present application also provides a computer device, including:
at least one processor, and
a memory communicatively connected to the at least one processor; where
the memory stores instructions, and the instructions are executed by the at least one processor so that, when executing the instructions, the at least one processor implements a text-based emotion classification method, where the text-based emotion classification method includes:
obtaining original text data to be classified;
performing word segmentation processing on the original text data to obtain word segmentation text data, where the word segmentation text data includes emotional features used to represent emotion categories;
performing data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data, where each emotion positive example pair includes the emotional features;
performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector; and
performing emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features.
The hardware structure of the computer device is described in detail below with reference to FIG. 8. The computer device includes: a processor 710, a memory 720, an input/output interface 730, a communication interface 740, and a bus 750.
The processor 710 may be implemented as a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is used to execute related programs to implement the technical solutions provided in the embodiments of the present application;
The memory 720 may be implemented in the form of a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 720 may store an operating system and other application programs. When the technical solutions provided in the embodiments of this specification are implemented through software or firmware, the related program code is stored in the memory 720 and invoked by the processor 710 to execute the text-based emotion classification method of the embodiments of the present application;
The input/output interface 730 is used to implement information input and output;
The communication interface 740 is used to implement communication interaction between this device and other devices; communication can be implemented in a wired manner (for example USB or network cable) or in a wireless manner (for example mobile network, WiFi, or Bluetooth); and
The bus 750 transmits information between the components of the device (for example the processor 710, the memory 720, the input/output interface 730, and the communication interface 740);
where the processor 710, the memory 720, the input/output interface 730, and the communication interface 740 are communicatively connected to one another inside the device through the bus 750.
An embodiment of the present application also provides a storage medium, the storage medium being a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to cause a computer to execute a text-based emotion classification method, where the text-based emotion classification method includes:
obtaining original text data to be classified;
performing word segmentation processing on the original text data to obtain word segmentation text data, where the word segmentation text data includes emotional features used to represent emotion categories;
performing data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data, where each emotion positive example pair includes the emotional features;
performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector; and
performing emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features.
The computer-readable storage medium may be non-volatile or volatile. As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs and non-transitory computer-executable programs. In addition, the memory may include high-speed random access memory as well as non-transitory memory, for example at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory optionally includes memory remotely located with respect to the processor, and these remote memories can be connected to the processor through a network. Examples of the above network include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The embodiments described in the embodiments of the present application are intended to explain the technical solutions of the embodiments of the present application more clearly and do not constitute a limitation on them; those skilled in the art will know that, with the evolution of technology and the emergence of new application scenarios, the technical solutions provided in the embodiments of the present application are equally applicable to similar technical problems.
Those skilled in the art can understand that the technical solutions shown in FIG. 1 to FIG. 6 do not constitute a limitation on the embodiments of the present application, and may include more or fewer steps than shown, combine certain steps, or use different steps.
The apparatus embodiments described above are merely illustrative, where the units described as separate components may or may not be physically separate, i.e., they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art can understand that all or some of the steps of the methods and the functional modules/units of the systems and devices disclosed above can be implemented as software, firmware, hardware, and appropriate combinations thereof.
The terms "first", "second", "third", "fourth", and the like (if present) in the specification of the present application and the above drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data used in this way is interchangeable where appropriate, so that the embodiments of the present application described here can be implemented in an order other than that illustrated or described here. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device comprising a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units not clearly listed or inherent to the process, method, product, or device.
It should be understood that in the present application, "at least one (item)" means one or more, and "multiple" means two or more. "And/or" is used to describe the association relationship of associated objects and indicates that three relationships can exist; for example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B can be singular or plural. The character "/" generally indicates an "or" relationship between the preceding and following associated objects. "At least one of the following" or similar expressions refers to any combination of these items, including any combination of a single item or plural items. For example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can be single or multiple.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of the units is only a logical functional division, and in actual implementation there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connections shown or discussed may be implemented through some interfaces; the indirect coupling or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit. The above integrated units may be implemented in the form of hardware or in the form of software functional units.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes multiple instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The preferred embodiments of the embodiments of the present application have been described above with reference to the accompanying drawings, which does not thereby limit the scope of rights of the embodiments of the present application. Any modification, equivalent replacement, or improvement made by those skilled in the art without departing from the scope and essence of the embodiments of the present application shall fall within the scope of rights of the embodiments of the present application.

Claims (20)

  1. A text-based emotion classification method, comprising:
    obtaining original text data to be classified;
    performing word segmentation processing on the original text data to obtain word segmentation text data, wherein the word segmentation text data comprises emotional features used to represent emotion categories;
    performing data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data, wherein each of the emotion positive example pairs comprises the emotional features;
    performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector; and
    performing emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features.
  2. The method according to claim 1, wherein the performing data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data comprises:
    copying the word segmentation text data to obtain copied text data;
    performing first data enhancement processing on the word segmentation text data to obtain a first encoding vector;
    performing second data enhancement processing on the copied text data to obtain a second encoding vector; and
    obtaining the emotion positive example pair according to the first encoding vector and the second encoding vector.
  3. The method according to claim 2, wherein the obtaining the emotion positive example pair according to the first encoding vector and the second encoding vector comprises:
    mapping the first encoding vector through a first multilayer perceptron to obtain first mapping data;
    mapping the second encoding vector through a second multilayer perceptron to obtain second mapping data; and
    constructing the emotion positive example pair according to the first mapping data and the second mapping data.
  4. The method according to claim 1, wherein before the performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector, the method further comprises constructing the contrastive learning model, which specifically comprises:
    obtaining training samples, the training samples comprising sample positive example pairs and sample negative example pairs;
    inputting the sample positive example pairs and the sample negative example pairs into an original learning model;
    computing a loss function of the original learning model according to the sample positive example pairs and the sample negative example pairs to obtain a loss value; and
    updating the original learning model according to the loss value to obtain the contrastive learning model.
  5. The method according to claim 4, wherein the updating the original learning model according to the loss value to obtain the contrastive learning model comprises:
    taking the loss value as a backpropagation quantity and adjusting model parameters of the original learning model to update the original learning model and obtain the contrastive learning model.
  6. The method according to any one of claims 1 to 5, wherein the performing emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features comprises:
    obtaining a preset neural network model, wherein the neural network model comprises a convolution layer, a pooling layer, a fully connected layer, and a classifier;
    performing feature extraction on the emotion embedding vector through the convolution layer to obtain multiple convolution feature vectors;
    performing max-pooling on each of the convolution feature vectors through the pooling layer to obtain multiple pooled feature vectors;
    concatenating the multiple pooled feature vectors through the fully connected layer to obtain a concatenated feature vector; and
    classifying the concatenated feature vector through the classifier to obtain the target emotion category corresponding to the emotional features.
  7. The method according to claim 6, wherein the classifying the concatenated feature vector through the classifier to obtain the target emotion category corresponding to the emotional features comprises:
    classifying the concatenated feature vector through the classifier to obtain multiple candidate emotion categories and an emotion probability value corresponding to each of the candidate emotion categories; and
    obtaining the candidate emotion category with the highest emotion probability value as the target emotion category.
  8. A text-based emotion classification apparatus, comprising:
    an obtaining module, configured to obtain original text data to be classified;
    a word segmentation module, configured to perform word segmentation processing on the original text data to obtain word segmentation text data, wherein the word segmentation text data comprises emotional features used to represent emotion categories;
    an enhancement module, configured to perform data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data, wherein each of the emotion positive example pairs comprises the emotional features;
    a learning module, configured to perform contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector; and
    a classification module, configured to perform emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features.
  9. A computer device, comprising a memory and a processor, wherein a program is stored in the memory, and when the program is executed by the processor, the processor is configured to perform a text-based emotion classification method, wherein the text-based emotion classification method comprises:
    obtaining original text data to be classified;
    performing word segmentation processing on the original text data to obtain word segmentation text data, wherein the word segmentation text data comprises emotional features used to represent emotion categories;
    performing data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data, wherein each of the emotion positive example pairs comprises the emotional features;
    performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector; and
    performing emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features.
  10. The computer device according to claim 9, wherein the performing data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data comprises:
    copying the word segmentation text data to obtain copied text data;
    performing first data enhancement processing on the word segmentation text data to obtain a first encoding vector;
    performing second data enhancement processing on the copied text data to obtain a second encoding vector; and
    obtaining the emotion positive example pair according to the first encoding vector and the second encoding vector.
  11. The computer device according to claim 10, wherein the obtaining the emotion positive example pair according to the first encoding vector and the second encoding vector comprises:
    mapping the first encoding vector through a first multilayer perceptron to obtain first mapping data;
    mapping the second encoding vector through a second multilayer perceptron to obtain second mapping data; and
    constructing the emotion positive example pair according to the first mapping data and the second mapping data.
  12. The computer device according to claim 9, wherein before the performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector, the method further comprises constructing the contrastive learning model, which specifically comprises:
    obtaining training samples, the training samples comprising sample positive example pairs and sample negative example pairs;
    inputting the sample positive example pairs and the sample negative example pairs into an original learning model;
    computing a loss function of the original learning model according to the sample positive example pairs and the sample negative example pairs to obtain a loss value; and
    updating the original learning model according to the loss value to obtain the contrastive learning model.
  13. The computer device according to claim 12, wherein the updating the original learning model according to the loss value to obtain the contrastive learning model comprises:
    taking the loss value as a backpropagation quantity and adjusting model parameters of the original learning model to update the original learning model and obtain the contrastive learning model.
  14. The computer device according to any one of claims 9 to 13, wherein the performing emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features comprises:
    obtaining a preset neural network model, wherein the neural network model comprises a convolution layer, a pooling layer, a fully connected layer, and a classifier;
    performing feature extraction on the emotion embedding vector through the convolution layer to obtain multiple convolution feature vectors;
    performing max-pooling on each of the convolution feature vectors through the pooling layer to obtain multiple pooled feature vectors;
    concatenating the multiple pooled feature vectors through the fully connected layer to obtain a concatenated feature vector; and
    classifying the concatenated feature vector through the classifier to obtain the target emotion category corresponding to the emotional features.
  15. A storage medium, the storage medium being a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by a computer, the computer is configured to perform a text-based emotion classification method, wherein the text-based emotion classification method comprises:
    obtaining original text data to be classified;
    performing word segmentation processing on the original text data to obtain word segmentation text data, wherein the word segmentation text data comprises emotional features used to represent emotion categories;
    performing data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data, wherein each of the emotion positive example pairs comprises the emotional features;
    performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector; and
    performing emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features.
  16. The storage medium according to claim 15, wherein the performing data enhancement processing on the word segmentation text data to obtain emotion positive example pairs corresponding to the word segmentation text data comprises:
    copying the word segmentation text data to obtain copied text data;
    performing first data enhancement processing on the word segmentation text data to obtain a first encoding vector;
    performing second data enhancement processing on the copied text data to obtain a second encoding vector; and
    obtaining the emotion positive example pair according to the first encoding vector and the second encoding vector.
  17. The storage medium according to claim 16, wherein the obtaining the emotion positive example pair according to the first encoding vector and the second encoding vector comprises:
    mapping the first encoding vector through a first multilayer perceptron to obtain first mapping data;
    mapping the second encoding vector through a second multilayer perceptron to obtain second mapping data; and
    constructing the emotion positive example pair according to the first mapping data and the second mapping data.
  18. The storage medium according to claim 15, wherein before the performing contrastive learning on the emotion positive example pairs through a pre-trained contrastive learning model to obtain an emotion embedding vector, the method further comprises constructing the contrastive learning model, which specifically comprises:
    obtaining training samples, the training samples comprising sample positive example pairs and sample negative example pairs;
    inputting the sample positive example pairs and the sample negative example pairs into an original learning model;
    computing a loss function of the original learning model according to the sample positive example pairs and the sample negative example pairs to obtain a loss value; and
    updating the original learning model according to the loss value to obtain the contrastive learning model.
  19. The storage medium according to claim 18, wherein the updating the original learning model according to the loss value to obtain the contrastive learning model comprises:
    taking the loss value as a backpropagation quantity and adjusting model parameters of the original learning model to update the original learning model and obtain the contrastive learning model.
  20. The storage medium according to any one of claims 15 to 19, wherein the performing emotion classification processing according to the emotion embedding vector to obtain a target emotion category corresponding to the emotional features comprises:
    obtaining a preset neural network model, wherein the neural network model comprises a convolution layer, a pooling layer, a fully connected layer, and a classifier;
    performing feature extraction on the emotion embedding vector through the convolution layer to obtain multiple convolution feature vectors;
    performing max-pooling on each of the convolution feature vectors through the pooling layer to obtain multiple pooled feature vectors;
    concatenating the multiple pooled feature vectors through the fully connected layer to obtain a concatenated feature vector; and
    classifying the concatenated feature vector through the classifier to obtain the target emotion category corresponding to the emotional features.
PCT/CN2022/090673 2022-01-11 2022-04-29 Text-based emotion classification method and apparatus, computer device, and storage medium WO2023134083A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210028278.4A CN114358201A (zh) 2022-01-11 2022-01-11 基于文本的情感分类方法和装置、计算机设备、存储介质
CN202210028278.4 2022-01-11

Publications (1)

Publication Number Publication Date
WO2023134083A1 true WO2023134083A1 (zh) 2023-07-20

Family

ID=81108993

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/090673 WO2023134083A1 (zh) 2022-01-11 2022-04-29 基于文本的情感分类方法和装置、计算机设备、存储介质

Country Status (2)

Country Link
CN (1) CN114358201A (zh)
WO (1) WO2023134083A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116756326A (zh) * 2023-08-18 2023-09-15 杭州光云科技股份有限公司 情感和非情感文本特征分析判断方法、装置及电子设备
CN117132004A (zh) * 2023-10-27 2023-11-28 四川省建筑设计研究院有限公司 基于神经网络的公共场所人流密度预测方法、系统及设备

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114358201A (zh) * 2022-01-11 2022-04-15 平安科技(深圳)有限公司 基于文本的情感分类方法和装置、计算机设备、存储介质
CN115544260B (zh) * 2022-12-05 2023-04-25 湖南工商大学 用于文本情感分析的对比优化编解码方法

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109684478A (zh) * 2018-12-18 2019-04-26 腾讯科技(深圳)有限公司 分类模型训练方法、分类方法及装置、设备和介质
US20200159826A1 (en) * 2018-11-19 2020-05-21 Genesys Telecommunications Laboratories, Inc. Method and System for Sentiment Analysis
CN111339305A (zh) * 2020-03-20 2020-06-26 北京中科模识科技有限公司 文本分类方法、装置、电子设备及存储介质
CN111858945A (zh) * 2020-08-05 2020-10-30 上海哈蜂信息科技有限公司 基于深度学习的评论文本方面级情感分类方法及系统
CN113343712A (zh) * 2021-06-29 2021-09-03 安徽大学 一种基于异质图的社交文本情感倾向分析方法及系统
CN113792818A (zh) * 2021-10-18 2021-12-14 平安科技(深圳)有限公司 意图分类方法、装置、电子设备及计算机可读存储介质
CN114358201A (zh) * 2022-01-11 2022-04-15 平安科技(深圳)有限公司 基于文本的情感分类方法和装置、计算机设备、存储介质

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116756326A (zh) * 2023-08-18 2023-09-15 杭州光云科技股份有限公司 情感和非情感文本特征分析判断方法、装置及电子设备
CN116756326B (zh) * 2023-08-18 2023-11-24 杭州光云科技股份有限公司 情感和非情感文本特征分析判断方法、装置及电子设备
CN117132004A (zh) * 2023-10-27 2023-11-28 四川省建筑设计研究院有限公司 基于神经网络的公共场所人流密度预测方法、系统及设备
CN117132004B (zh) * 2023-10-27 2024-02-09 四川省建筑设计研究院有限公司 基于神经网络的公共场所人流密度预测方法、系统及设备

Also Published As

Publication number Publication date
CN114358201A (zh) 2022-04-15

Similar Documents

Publication Publication Date Title
US20230100376A1 (en) Text sentence processing method and apparatus, computer device, and storage medium
CN111368996B (zh) 可传递自然语言表示的重新训练投影网络
CN108920622B (zh) 一种意图识别的训练方法、训练装置和识别装置
WO2023065544A1 (zh) 意图分类方法、装置、电子设备及计算机可读存储介质
US20220245365A1 (en) Translation method and apparatus based on multimodal machine learning, device, and storage medium
US20220050967A1 (en) Extracting definitions from documents utilizing definition-labeling-dependent machine learning background
WO2023134083A1 (zh) 基于文本的情感分类方法和装置、计算机设备、存储介质
CN111291195B (zh) 一种数据处理方法、装置、终端及可读存储介质
CN111680159A (zh) 数据处理方法、装置及电子设备
CN113591483A (zh) 一种基于序列标注的文档级事件论元抽取方法
KR102379660B1 (ko) 딥러닝 기반 의미역 분석을 활용하는 방법
CN111159485A (zh) 尾实体链接方法、装置、服务器及存储介质
WO2023108993A1 (zh) 基于深度聚类算法的产品推荐方法、装置、设备及介质
CN113987147A (zh) 样本处理方法及装置
CN113255320A (zh) 基于句法树和图注意力机制的实体关系抽取方法及装置
CN109271636B (zh) 词嵌入模型的训练方法及装置
CN111581392B (zh) 一种基于语句通顺度的自动作文评分计算方法
CN116661805B (zh) 代码表示的生成方法和装置、存储介质及电子设备
CN111145914B (zh) 一种确定肺癌临床病种库文本实体的方法及装置
CN116258137A (zh) 文本纠错方法、装置、设备和存储介质
CN115759119A (zh) 一种金融文本情感分析方法、系统、介质和设备
CN108875024B (zh) 文本分类方法、系统、可读存储介质及电子设备
CN113627550A (zh) 一种基于多模态融合的图文情感分析方法
CN113486143A (zh) 一种基于多层级文本表示及模型融合的用户画像生成方法
CN110377753B (zh) 基于关系触发词与gru模型的关系抽取方法及装置