CN114722206A - Extremely short text classification method based on keyword screening and attention mechanism - Google Patents

Extremely short text classification method based on keyword screening and attention mechanism

Info

Publication number
CN114722206A
Authority
CN
China
Prior art keywords
word
short text
concept
representation
attention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210419204.3A
Other languages
Chinese (zh)
Inventor
朱毅
周鑫柯
李云
强继朋
袁运浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangzhou University
Original Assignee
Yangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yangzhou University filed Critical Yangzhou University
Priority to CN202210419204.3A priority Critical patent/CN114722206A/en
Publication of CN114722206A publication Critical patent/CN114722206A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G06F 16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an extremely short text classification method based on keyword screening and an attention mechanism, which comprises the following steps: (1) designing and implementing a keyword screening algorithm, and introducing additional knowledge through a knowledge graph to optimize the feature representation of the extremely short text; (2) obtaining the representation of the extremely short text through an attention-based bidirectional long short-term memory model (Attention-based BiLSTM); (3) constructing two attention mechanisms over the additional knowledge to learn the more important and relevant knowledge; (4) finally, combining the extremely short text representation with the additional knowledge and classifying the extremely short text data set with a softmax classifier to obtain the classification result. The invention improves representation learning and feature extraction, improves the accuracy of data set classification, and has high robustness and practicability.

Description

Extremely short text classification method based on keyword screening and attention mechanism
Technical Field
The invention relates to the fields of data mining and natural language processing, and in particular to an extremely short text classification method based on keyword screening and an attention mechanism.
Background
Through an attention mechanism, a model can focus on the most important words in a sentence and thereby capture its most important semantic information.
With the rapid development of web services, more and more short texts are generated on platforms such as Twitter and microblogs and are used in many fields. In recent years there has been a strong demand for processing short texts, which has attracted extensive attention and research. Most existing short text classification methods can be roughly divided, according to how features are learned, into methods based on the text's own resources and methods based on external resources. Self-resource based methods expand the feature space with rules or statistics hidden in the current short text; external-resource based methods expand the feature space with additional external knowledge.
Although both types of short text classification methods perform well on short texts, they cannot achieve the expected effect on extremely short texts. This is mainly because an extremely short text is shorter than a conventional short text, and in extremely short text classification even one or two keywords can determine the final classification result. Moreover, because of its length, an extremely short text often contains few features and does not provide sufficient word co-occurrence. Finally, words in extremely short texts are highly ambiguous, which can affect the classification result.
Short text classification methods based on external resources do not consider that one or two keywords can determine the final result in extremely short text classification; they usually introduce the concepts of every word in the short text to enrich its features, which brings in a large number of irrelevant or useless concepts and thus a large amount of noise that affects the final classification result. However, introducing the concepts of only one keyword in an extremely short text also causes problems: a single word can map to many concepts with no relevance among them, and introducing all of them likewise affects the final classification result.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an extremely short text classification method based on keyword screening and an attention mechanism, which finds the keywords that determine classification in an extremely short text through keyword screening and selects the useful concepts through an attention mechanism, so as to optimize the feature representation vector and improve the classification accuracy on extremely short text data sets.
The purpose of the invention is realized as follows: a method for classifying extremely short texts based on keyword screening and attention mechanism comprises the following steps:
1) designing and implementing a keyword screening algorithm, and introducing additional knowledge through a knowledge graph to optimize the feature representation of the extremely short text;
2) obtaining the representation of the extremely short text through a bidirectional long short-term memory model with an attention mechanism;
3) constructing two attention mechanisms for extra knowledge to learn more important and relevant knowledge;
4) finally, the extremely short text representation is combined with additional knowledge, a softmax classifier is used for classifying the extremely short text data set, and a classification result is obtained.
Further, the step 1) specifically includes:
1.1) selecting keywords from the input with the Rake keyword extraction algorithm: the extremely short text is split into several phrases by separators, and these phrases are taken as candidates for the finally extracted keyword; each phrase is split into words by spaces, each word is given a score, the score of each phrase is obtained by accumulating its word scores, and the phrase with the highest score is finally selected as the keyword; the word score formula is:
wordScore = wordDegree(w) / wordFrequency(w)
where wordDegree(w) is the degree of word w (incremented by 1 for every word co-occurring with w inside a phrase) and wordFrequency(w) is the total number of occurrences of w;
1.2) introducing related concepts of the keyword using the knowledge graph: the concepts of the keyword are obtained from the knowledge graph base as additional knowledge by querying the API of the knowledge graph base, and the retrieved concepts are combined into a concept set.
Further, the step 2) specifically includes:
2.1) word- and character-level embedding: the input extremely short texts are represented as {(x_1,y_1),(x_2,y_2),…,(x_n,y_n)}, where n is the number of texts, y_i ∈ {1,2,…,c}, and c is the number of labels; feature representation learning adopts both word-level and character-level embedding: a convolutional neural network is used to obtain the character embedding of each word, word embedding is obtained by word2vec, the dimensions of the word vector and the character vector are both d/2, and finally the word vector and the character vector are concatenated to obtain a d-dimensional word representation;
2.2) representation of the extremely short text: the word representations obtained in step 2.1) are regarded as a sequence of d-dimensional word vectors (x_1, x_2, …, x_n), where n is the length of the extremely short text; the word vector sequence is fed into the Attention-based BiLSTM to obtain the corresponding representation; the BiLSTM contains a forward network and a backward network for processing the extremely short text, as shown in equations (1) and (2):
h_t^f = LSTM_f(x_t)  (1)
h_t^b = LSTM_b(x_t)  (2)
each forward state h_t^f and backward state h_t^b are then concatenated into the hidden state h_t; all the h_t are collected into H ∈ R^{n×2u}, as shown in equation (3):
H = (h_1, h_2, …, h_n)  (3)
where u is the number of hidden units in each direction of the BiLSTM and n is the number of word vectors; the attention weight is then calculated by equation (4):
α_i = softmax(w_1^T f(W_1 h_i + b_1))  (4)
where α_i is the attention weight of the i-th word, f is the activation function of the network, softmax normalizes the weight of each word, W_1 is a weight matrix, w_1 ∈ R^{d_a} is a weight vector, d_a is a hyperparameter, b_1 is a bias vector, and h_i is the hidden state of the i-th word;
finally, the weighted sum of the h_i gives the representation z_s of the extremely short text, as shown in equation (5):
z_s = Σ_{i=1}^{n} α_i h_i  (5)
further, the step 3) specifically includes:
3.1) constructing a first concept attention mechanism: the concept set obtained in step 1.2) is embedded at the word and character level to obtain d-dimensional concept vectors (c_1, c_2, …, c_m), where m is the number of concepts; the first concept attention mechanism calculates the semantic similarity between the i-th concept and the extremely short text representation z_s, as shown in equation (6):
β_i = softmax(w_2^T f(W_2[c_i; z_s] + b_2))  (6)
where β_i is the attention weight of the i-th concept with respect to the extremely short text, f is the activation function of the network, W_2 is a weight matrix, w_2 ∈ R^{d_b} is a weight vector, d_b is a hyperparameter, and b_2 is a bias vector;
3.2) constructing a second concept attention mechanism: the second concept attention mechanism calculates the importance of each concept to the whole concept set, as shown in equation (7):
δ_i = softmax(w_3^T f(W_3 c_i + b_3))  (7)
where δ_i is the attention weight of the i-th concept with respect to the concept set, f is the activation function of the network, W_3 is a weight matrix, w_3 ∈ R^{d_c} is a weight vector, d_c is a hyperparameter, and b_3 is a bias vector;
3.3) combining the two concept attention weights: β_i and δ_i are combined by equation (8) to obtain the final attention weight:
μ_i = softmax(λβ_i + (1-λ)δ_i)  (8)
where μ_i is the final concept weight of the i-th concept and λ is a weighting parameter that adjusts the importance of the two attention weights;
3.4) concept representation: the final concept weights μ_i obtained in step 3.3) and the concept vectors (c_1, c_2, …, c_m) obtained in step 3.1) are weighted and summed according to equation (9) to obtain the concept representation z_c:
z_c = Σ_{i=1}^{m} μ_i c_i  (9)
where c_i is the concept vector of the i-th concept.
Further, the step 4) specifically includes:
4.1) combining the extremely short text representation with the additional knowledge: the extremely short text representation z_s obtained in step 2.2) and the concept representation z_c obtained in step 3.4) are combined to obtain the output z, which is fed into a fully connected layer;
4.2) training a softmax classifier to classify the extremely short text data set, which is divided into training and test data; the softmax classifier is:
P(z) = softmax(Wz + b)  (10)
where W and b are the weight matrix and bias of the classifier; the output z_train of the training data set within the output z of step 4.1) and the class labels y of the known training data set are substituted into (10) to train the classifier;
4.3) classifying the test extremely short text data set with the trained classifier: the output z_test of the test data set within the output z of step 4.1) is substituted into the classifier trained by equation (10) to obtain the classification result T_test of the test extremely short text data set, as shown in equation (11):
T_test = argmax P(z_test)  (11).
By adopting the above technical scheme, compared with the prior art, the invention has the following beneficial effects: 1) the invention provides a hybrid extremely short text classification method that enriches the semantic information of extremely short texts by combining knowledge from an external knowledge graph; 2) the invention introduces the Attention-based BiLSTM principle to assign a different weight to each word in the extremely short text and thus strengthen the role of keywords in classification, addressing the fact that one or two keywords in an extremely short text can determine the classification result; 3) a keyword screening method is proposed to find the most critical word in the extremely short text and obtain its related concepts, so that the concepts of all words need not be introduced for an extremely short text; 4) two concept attention mechanisms are proposed to introduce useful concepts and reduce the influence of noise; the method effectively improves representation learning and feature extraction, improves the accuracy of data set classification, and has high robustness and practicability.
Drawings
FIG. 1 is a general block diagram of the proposed method of the present invention.
FIG. 2 is a structural diagram of the concept representation in the present invention.
Detailed Description
Fig. 1 shows an extremely short text classification method based on keyword screening and an attention mechanism, which comprises the following steps: 1) designing and implementing a keyword screening algorithm, and introducing additional knowledge through a knowledge graph to optimize the feature representation of the extremely short text; 2) constructing an attention-based bidirectional long short-term memory model (Attention-based BiLSTM) and inputting the extremely short text to obtain its representation; 3) constructing two attention mechanisms for the additional knowledge, and inputting the additional knowledge obtained in step 1) to obtain the concept representation; 4) finally, combining the extremely short text representation with the additional knowledge, training a softmax classifier on the extremely short text data set, and obtaining the classification result.
The method comprises the following steps:
1) designing and implementing a keyword screening algorithm, and introducing additional knowledge through a knowledge graph to optimize the feature representation of the extremely short text;
1.1) selecting keywords by a Rake keyword extraction method;
Keywords are selected from the input with the Rake keyword extraction algorithm. Considering the length of the extremely short text, the text is split into several phrases by separators such as "and", "the" and "of", and these phrases are taken as candidates for the finally extracted keyword; each phrase is split into words by spaces, each word is given a score, the score of each phrase is obtained by accumulating its word scores, and the phrase with the highest score is finally selected as the keyword; the word score formula is:
wordScore = wordDegree(w) / wordFrequency(w)
where wordDegree(w) is the degree of word w (incremented by 1 for every word co-occurring with w inside a phrase) and wordFrequency(w) is the total number of occurrences of w;
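By way of illustration, a minimal Python sketch of this scoring step follows (the stop-word list, tokenization, and function names are simplified assumptions, not the patented implementation):

import re
from collections import defaultdict

STOP_WORDS = {"and", "the", "of"}  # assumed separator words; a full RAKE stop list is larger

def rake_keyword(text):
    # Split the extremely short text into candidate phrases at stop words.
    tokens = re.findall(r"[a-zA-Z]+", text.lower())
    phrases, current = [], []
    for tok in tokens:
        if tok in STOP_WORDS:
            if current:
                phrases.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        phrases.append(current)

    # wordDegree(w): co-occurrence degree inside phrases; wordFrequency(w): total occurrences.
    degree, freq = defaultdict(int), defaultdict(int)
    for phrase in phrases:
        for w in phrase:
            freq[w] += 1
            degree[w] += len(phrase)  # each word co-occurring with w in the phrase (including w) adds 1

    # wordScore = wordDegree(w) / wordFrequency(w); phrase score = sum of its word scores.
    def phrase_score(phrase):
        return sum(degree[w] / freq[w] for w in phrase)

    return " ".join(max(phrases, key=phrase_score)) if phrases else ""

print(rake_keyword("the effect of keyword screening and attention"))  # -> "keyword screening"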
1.2) introducing related concepts of keywords by using a knowledge graph; acquiring the concept of the keyword from the knowledge graph base as additional knowledge;
The concepts of the keyword are retrieved through the API of the knowledge graph base, e.g. 'https://concept.research.microsoft.com/api/Concept/ScoreByProb?instance=<keyword>&topK=10', where instance in the interface is set to the selected keyword and topK is the number of concepts desired; the retrieved concepts are combined into a concept set;
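A minimal Python sketch of such a lookup follows, assuming the ScoreByProb endpoint reconstructed above is reachable and returns a JSON object of concept scores (the function name and response handling are illustrative assumptions):

import requests

def get_concepts(keyword, top_k=10):
    """Query the knowledge graph API for the concepts of one keyword."""
    url = "https://concept.research.microsoft.com/api/Concept/ScoreByProb"
    resp = requests.get(url, params={"instance": keyword, "topK": top_k}, timeout=10)
    resp.raise_for_status()
    # Assumed response format: a JSON object mapping concept -> probability score.
    return list(resp.json().keys())

concept_set = get_concepts("microsoft", top_k=10)  # e.g. ['company', 'vendor', ...]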
2) inputting the extremely short text into an attention-based bidirectional long short-term memory model (Attention-based BiLSTM) to obtain the representation z_s of the extremely short text;
2.1) performing word- and character-level embedding of the input extremely short text;
First, word- and character-level embedding is performed following the idea of FIG. 1. An extremely short text of length n is input as a word sequence, and the data is expressed as {(x_1,y_1),(x_2,y_2),…,(x_n,y_n)}, where n is the number of texts, y_i ∈ {1,2,…,c}, and c is the number of labels. Feature representation learning adopts both word-level and character-level embedding; word embedding maps each word to a high-dimensional vector space. A convolutional neural network (CNN) is used to obtain the character embedding of each word: the character embeddings of a word are treated as a one-dimensional input to the CNN, and the embedding size is the input channel size of the CNN. Word embedding is obtained through word2vec; the dimensions of the word vector and the character vector are both d/2, and finally the word vector and the character vector are concatenated to obtain a d-dimensional word representation;
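A minimal PyTorch-style sketch of this word- and character-level embedding follows (the layer sizes, vocabulary handling, and max-pooling over characters are assumptions used only for illustration):

import torch
import torch.nn as nn

class WordCharEmbedding(nn.Module):
    """Concatenate a word2vec word vector (d/2) with a char-CNN vector (d/2) into a d-dim representation."""
    def __init__(self, word_vocab, char_vocab, d=200, char_dim=30, kernel=3):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, d // 2)   # would be initialised from word2vec in practice
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        # 1-D convolution over a word's characters; channel size = character embedding size
        self.char_cnn = nn.Conv1d(char_dim, d // 2, kernel_size=kernel, padding=1)

    def forward(self, word_ids, char_ids):
        # word_ids: (n,)  char_ids: (n, max_chars)
        w = self.word_emb(word_ids)                          # (n, d/2) word vectors
        c = self.char_emb(char_ids).transpose(1, 2)          # (n, char_dim, max_chars)
        c = torch.max(self.char_cnn(c), dim=2).values        # max-pool over characters -> (n, d/2)
        return torch.cat([w, c], dim=-1)                     # (n, d) word representation

emb = WordCharEmbedding(word_vocab=5000, char_vocab=60)
x = emb(torch.randint(0, 5000, (7,)), torch.randint(0, 60, (7, 12)))  # 7 words -> (7, 200)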
2.2) constructing an attention-based bidirectional long short-term memory model (Attention-based BiLSTM) to obtain the representation z_s of the extremely short text.
The word representations obtained in step 2.1) are regarded as a sequence of d-dimensional word vectors (x_1, x_2, …, x_n), where n is the length of the extremely short text; the word vector sequence is fed into the Attention-based BiLSTM to obtain the corresponding representation; the BiLSTM contains a forward network and a backward network for processing the extremely short text, as shown in equations (1) and (2):
h_t^f = LSTM_f(x_t)  (1)
h_t^b = LSTM_b(x_t)  (2)
each forward state h_t^f and backward state h_t^b are then concatenated into the hidden state h_t; all the h_t are collected into H ∈ R^{n×2u}, as shown in equation (3):
H = (h_1, h_2, …, h_n)  (3)
where u is the number of hidden units in each direction of the BiLSTM and n is the number of word vectors; the attention weight is then calculated by equation (4):
α_i = softmax(w_1^T f(W_1 h_i + b_1))  (4)
where α_i is the attention weight of the i-th word, f is the activation function of the network, softmax normalizes the weight of each word, W_1 is a weight matrix, w_1 ∈ R^{d_a} is a weight vector, d_a is a hyperparameter, b_1 is a bias vector, and h_i is the hidden state of the i-th word;
finally, the weighted sum of the h_i gives the representation z_s of the extremely short text, as shown in equation (5):
z_s = Σ_{i=1}^{n} α_i h_i  (5)
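A minimal PyTorch-style sketch of this Attention-based BiLSTM encoder follows (the choice of tanh as the activation f, the layer sizes, and batching are assumptions for illustration):

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionBiLSTM(nn.Module):
    """Encode a word-vector sequence with a BiLSTM and pool it with word-level attention (eqs. (1)-(5))."""
    def __init__(self, d=200, u=128, d_a=64):
        super().__init__()
        self.bilstm = nn.LSTM(d, u, bidirectional=True, batch_first=True)
        self.W1 = nn.Linear(2 * u, d_a)           # weight matrix W_1 and bias b_1
        self.w1 = nn.Linear(d_a, 1, bias=False)   # weight vector w_1

    def forward(self, x):
        # x: (batch, n, d); H: (batch, n, 2u) -- forward and backward states concatenated
        H, _ = self.bilstm(x)
        scores = self.w1(torch.tanh(self.W1(H))).squeeze(-1)   # eq. (4) before softmax
        alpha = F.softmax(scores, dim=1)                        # attention weight of each word
        z_s = torch.sum(alpha.unsqueeze(-1) * H, dim=1)         # eq. (5): weighted sum -> (batch, 2u)
        return z_s, alpha

enc = AttentionBiLSTM()
z_s, alpha = enc(torch.randn(4, 10, 200))   # 4 texts of 10 words -> z_s: (4, 256)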
3) constructing two attention mechanisms for the additional knowledge, and inputting the additional knowledge obtained in step 1) to obtain the concept representation z_c;
3.1) constructing the first concept attention mechanism to obtain the attention weight β_i.
As shown in FIG. 2, the concept set obtained in step 1.2) is embedded at the word and character level in the same manner as in step 2.1) to obtain d-dimensional concept vectors (c_1, c_2, …, c_m), where m is the number of concepts; the first concept attention mechanism calculates the semantic similarity between the i-th concept and the extremely short text representation z_s, as shown in equation (6):
β_i = softmax(w_2^T f(W_2[c_i; z_s] + b_2))  (6)
where β_i is the attention weight of the i-th concept with respect to the extremely short text, f is the activation function of the network, W_2 is a weight matrix, w_2 ∈ R^{d_b} is a weight vector, d_b is a hyperparameter, and b_2 is a bias vector;
3.2) constructing the second concept attention mechanism to obtain the attention weight δ_i.
The second concept attention mechanism calculates the importance of each concept to the whole concept set, as shown in equation (7):
δ_i = softmax(w_3^T f(W_3 c_i + b_3))  (7)
where δ_i is the attention weight of the i-th concept with respect to the concept set, f is the activation function of the network, W_3 is a weight matrix, w_3 ∈ R^{d_c} is a weight vector, d_c is a hyperparameter, and b_3 is a bias vector;
3.3) combining the two concept attention weights to obtain the attention weight μ_i.
β_i and δ_i are combined by equation (8) to obtain the final attention weight:
μ_i = softmax(λβ_i + (1-λ)δ_i)  (8)
where μ_i is the final concept weight of the i-th concept and λ is a weighting parameter that adjusts the importance of the two attention weights;
3.4) obtaining the concept representation z_c.
The final concept weights μ_i obtained in step 3.3) and the concept vectors (c_1, c_2, …, c_m) obtained in step 3.1) are weighted and summed according to equation (9) to obtain the concept representation z_c:
z_c = Σ_{i=1}^{m} μ_i c_i  (9)
where c_i is the concept vector of the i-th concept.
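A minimal PyTorch-style sketch of the two concept attention mechanisms and their combination follows (the concatenation [c_i; z_s] in the first mechanism, tanh as f, and the layer sizes are assumptions for illustration):

import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptAttention(nn.Module):
    """Combine concept-to-text and concept-to-set attention into the concept representation z_c."""
    def __init__(self, d=200, z_dim=256, d_b=64, d_c=64, lam=0.5):
        super().__init__()
        self.lam = lam                               # weighting parameter lambda in eq. (8)
        self.W2 = nn.Linear(d + z_dim, d_b)          # first mechanism: concept vs. short text z_s
        self.w2 = nn.Linear(d_b, 1, bias=False)
        self.W3 = nn.Linear(d, d_c)                  # second mechanism: concept vs. concept set
        self.w3 = nn.Linear(d_c, 1, bias=False)

    def forward(self, C, z_s):
        # C: (m, d) concept vectors; z_s: (z_dim,) extremely short text representation
        m = C.size(0)
        pair = torch.cat([C, z_s.unsqueeze(0).expand(m, -1)], dim=-1)
        beta = F.softmax(self.w2(torch.tanh(self.W2(pair))).squeeze(-1), dim=0)   # eq. (6)
        delta = F.softmax(self.w3(torch.tanh(self.W3(C))).squeeze(-1), dim=0)     # eq. (7)
        mu = F.softmax(self.lam * beta + (1 - self.lam) * delta, dim=0)           # eq. (8)
        return torch.sum(mu.unsqueeze(-1) * C, dim=0)                             # eq. (9): z_c, shape (d,)

att = ConceptAttention()
z_c = att(torch.randn(10, 200), torch.randn(256))   # 10 concepts -> z_c: (200,)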
4) Finally, combining the extremely short text representation with additional knowledge, classifying the extremely short text data set by using a softmax classifier, and obtaining a classification result;
4.1) combining the extremely short text representation with the additional knowledge:
the extremely short text representation z_s obtained in step 2.2) and the concept representation z_c obtained in step 3.4) are combined to obtain the output z, which is fed into a fully connected layer;
4.2) training a softmax classifier to classify the extremely short text data set, which is divided into training and test data; the softmax classifier is:
P(z) = softmax(Wz + b)  (10)
where W and b are the weight matrix and bias of the classifier; the output z_train of the training data set within the output z of step 4.1) and the class labels y of the known training data set are substituted into (10) to train the classifier;
4.3) classifying the test extremely short text data set with the trained classifier: the output z_test of the test data set within the output z of step 4.1) is substituted into the classifier trained by equation (10) to obtain the classification result T_test of the test extremely short text data set, as shown in equation (11):
T_test = argmax P(z_test)  (11).
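A minimal PyTorch-style sketch of step 4 follows: combining z_s and z_c, passing them through a fully connected layer, and training and applying the softmax classifier of equations (10) and (11) (the concatenation-based combination, layer sizes, and cross-entropy training loop are assumptions for illustration):

import torch
import torch.nn as nn
import torch.nn.functional as F

class ShortTextClassifier(nn.Module):
    """Fully connected layer + softmax classifier over the combined representation z = [z_s; z_c]."""
    def __init__(self, z_s_dim=256, z_c_dim=200, hidden=128, num_classes=8):
        super().__init__()
        self.fc = nn.Linear(z_s_dim + z_c_dim, hidden)
        self.out = nn.Linear(hidden, num_classes)      # eq. (10): P(z) = softmax(Wz + b)

    def forward(self, z_s, z_c):
        z = torch.cat([z_s, z_c], dim=-1)               # combine text representation with extra knowledge
        return self.out(torch.relu(self.fc(z)))         # logits; softmax is applied in the loss / at test time

model = ShortTextClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Training on (z_s_train, z_c_train, y_train): cross-entropy implements the softmax of eq. (10).
z_s_train, z_c_train = torch.randn(32, 256), torch.randn(32, 200)
y_train = torch.randint(0, 8, (32,))
optimizer.zero_grad()
loss = F.cross_entropy(model(z_s_train, z_c_train), y_train)
loss.backward()
optimizer.step()

# Testing, eq. (11): T_test = argmax P(z_test)
with torch.no_grad():
    z_s_test, z_c_test = torch.randn(5, 256), torch.randn(5, 200)
    T_test = model(z_s_test, z_c_test).argmax(dim=-1)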
The invention provides an extremely short text classification method based on keyword screening and an attention mechanism, so that the keywords that determine classification in an extremely short text can be found through keyword screening, and the useful concepts can be selected through the attention mechanism, thereby optimizing the feature representation vector and improving the classification accuracy on extremely short text data sets.
The present invention is not limited to the above-mentioned embodiments, and based on the technical solutions disclosed in the present invention, those skilled in the art can make some substitutions and modifications to some technical features without creative efforts according to the disclosed technical contents, and these substitutions and modifications are all within the protection scope of the present invention.

Claims (5)

1. A method for classifying extremely short texts based on keyword screening and attention mechanism is characterized by comprising the following steps:
1) designing and implementing a keyword screening algorithm, and introducing additional knowledge through a knowledge graph to optimize the feature representation of the extremely short text;
2) obtaining the representation of the extremely short text through a bidirectional long short-term memory model with an attention mechanism;
3) constructing two attention mechanisms for extra knowledge to learn more important and relevant knowledge;
4) finally, the extremely short text representation is combined with additional knowledge, a softmax classifier is used for classifying the extremely short text data set, and a classification result is obtained.
2. The method for classifying very short texts based on keyword screening and attention mechanism as claimed in claim 1, wherein the step 1) specifically comprises:
1.1) selecting keywords from the input with the Rake keyword extraction algorithm: the extremely short text is split into several phrases by separators, and these phrases are taken as candidates for the finally extracted keyword; each phrase is split into words by spaces, each word is given a score, the score of each phrase is obtained by accumulating its word scores, and the phrase with the highest score is finally selected as the keyword; the word score formula is:
wordScore = wordDegree(w) / wordFrequency(w)
where wordDegree(w) is the degree of word w (incremented by 1 for every word co-occurring with w inside a phrase) and wordFrequency(w) is the total number of occurrences of w;
1.2) introducing related concepts of the keyword using the knowledge graph: the concepts of the keyword are obtained from the knowledge graph base as additional knowledge by querying the API of the knowledge graph base, and the retrieved concepts are combined into a concept set.
3. The method for classifying very short texts based on keyword screening and attention mechanism as claimed in claim 2, wherein said step 2) specifically comprises:
2.1) word- and character-level embedding: the input extremely short texts are represented as {(x_1,y_1),(x_2,y_2),…,(x_n,y_n)}, where n is the number of texts, y_i ∈ {1,2,…,c}, and c is the number of labels; feature representation learning adopts both word-level and character-level embedding: a convolutional neural network is used to obtain the character embedding of each word, word embedding is obtained by word2vec, the dimensions of the word vector and the character vector are both d/2, and finally the word vector and the character vector are concatenated to obtain a d-dimensional word representation;
2.2) representation of the extremely short text: the word representations obtained in step 2.1) are regarded as a sequence of d-dimensional word vectors (x_1, x_2, …, x_n), where n is the length of the extremely short text; the word vector sequence is fed into the Attention-based BiLSTM to obtain the corresponding representation; the BiLSTM contains a forward network and a backward network for processing the extremely short text, as shown in equations (1) and (2):
h_t^f = LSTM_f(x_t)  (1)
h_t^b = LSTM_b(x_t)  (2)
each forward state h_t^f and backward state h_t^b are then concatenated into the hidden state h_t; all the h_t are collected into H ∈ R^{n×2u}, as shown in equation (3):
H = (h_1, h_2, …, h_n)  (3)
where u is the number of hidden units in each direction of the BiLSTM and n is the number of word vectors; the attention weight is then calculated by equation (4):
α_i = softmax(w_1^T f(W_1 h_i + b_1))  (4)
where α_i is the attention weight of the i-th word, f is the activation function of the network, softmax normalizes the weight of each word, W_1 is a weight matrix, w_1 ∈ R^{d_a} is a weight vector, d_a is a hyperparameter, b_1 is a bias vector, and h_i is the hidden state of the i-th word;
finally, the weighted sum of the h_i gives the representation z_s of the extremely short text, as shown in equation (5):
z_s = Σ_{i=1}^{n} α_i h_i  (5)
4. the method for classifying very short texts based on keyword screening and attention mechanism as claimed in claim 3, wherein said step 3) specifically comprises:
3.1) constructing a first concept attention mechanism: the concept set obtained in step 1.2) is embedded at the word and character level to obtain d-dimensional concept vectors (c_1, c_2, …, c_m), where m is the number of concepts; the first concept attention mechanism calculates the semantic similarity between the i-th concept and the extremely short text representation z_s, as shown in equation (6):
β_i = softmax(w_2^T f(W_2[c_i; z_s] + b_2))  (6)
where β_i is the attention weight of the i-th concept with respect to the extremely short text, f is the activation function of the network, W_2 is a weight matrix, w_2 ∈ R^{d_b} is a weight vector, d_b is a hyperparameter, and b_2 is a bias vector;
3.2) constructing a second concept attention mechanism: the second concept attention mechanism calculates the importance of each concept to the whole concept set, as shown in equation (7):
δ_i = softmax(w_3^T f(W_3 c_i + b_3))  (7)
where δ_i is the attention weight of the i-th concept with respect to the concept set, f is the activation function of the network, W_3 is a weight matrix, w_3 ∈ R^{d_c} is a weight vector, d_c is a hyperparameter, and b_3 is a bias vector;
3.3) combining the two concept attention weights: β_i and δ_i are combined by equation (8) to obtain the final attention weight:
μ_i = softmax(λβ_i + (1-λ)δ_i)  (8)
where μ_i is the final concept weight of the i-th concept and λ is a weighting parameter that adjusts the importance of the two attention weights;
3.4) concept representation: the final concept weights μ_i obtained in step 3.3) and the concept vectors (c_1, c_2, …, c_m) obtained in step 3.1) are weighted and summed according to equation (9) to obtain the concept representation z_c:
z_c = Σ_{i=1}^{m} μ_i c_i  (9)
where c_i is the concept vector of the i-th concept.
5. The method for classifying very short texts based on keyword screening and attention mechanism as claimed in claim 4, wherein the step 4) specifically comprises:
4.1) combining the extremely short text representation with the additional knowledge: the extremely short text representation z_s obtained in step 2.2) and the concept representation z_c obtained in step 3.4) are combined to obtain the output z, which is fed into a fully connected layer;
4.2) training a softmax classifier to classify the extremely short text data set, which is divided into training and test data; the softmax classifier is:
P(z) = softmax(Wz + b)  (10)
where W and b are the weight matrix and bias of the classifier; the output z_train of the training data set within the output z of step 4.1) and the class labels y of the known training data set are substituted into (10) to train the classifier;
4.3) classifying the test extremely short text data set with the trained classifier: the output z_test of the test data set within the output z of step 4.1) is substituted into the classifier trained by equation (10) to obtain the classification result T_test of the test extremely short text data set, as shown in equation (11):
T_test = argmax P(z_test)  (11).
CN202210419204.3A 2022-04-20 2022-04-20 Extremely short text classification method based on keyword screening and attention mechanism Pending CN114722206A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210419204.3A CN114722206A (en) 2022-04-20 2022-04-20 Extremely short text classification method based on keyword screening and attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210419204.3A CN114722206A (en) 2022-04-20 2022-04-20 Extremely short text classification method based on keyword screening and attention mechanism

Publications (1)

Publication Number Publication Date
CN114722206A true CN114722206A (en) 2022-07-08

Family

ID=82246679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210419204.3A Pending CN114722206A (en) 2022-04-20 2022-04-20 Extremely short text classification method based on keyword screening and attention mechanism

Country Status (1)

Country Link
CN (1) CN114722206A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117786092A (en) * 2024-02-27 2024-03-29 成都晓多科技有限公司 Commodity comment key phrase extraction method and system
CN117786092B (en) * 2024-02-27 2024-05-14 成都晓多科技有限公司 Commodity comment key phrase extraction method and system

Similar Documents

Publication Publication Date Title
CN110717047B (en) Web service classification method based on graph convolution neural network
Awasthi et al. Natural language processing (NLP) based text summarization-a survey
US8027977B2 (en) Recommending content using discriminatively trained document similarity
CN113268995B (en) Chinese academy keyword extraction method, device and storage medium
Sebastiani Classification of text, automatic
CN111611361A (en) Intelligent reading, understanding, question answering system of extraction type machine
Kmail et al. An automatic online recruitment system based on exploiting multiple semantic resources and concept-relatedness measures
CN110674252A (en) High-precision semantic search system for judicial domain
US11874862B2 (en) Community question-answer website answer sorting method and system combined with active learning
KR20060045786A (en) Verifying relevance between keywords and web site contents
CN110134799B (en) BM25 algorithm-based text corpus construction and optimization method
Suleiman et al. Deep learning based extractive text summarization: approaches, datasets and evaluation measures
CN114997288A (en) Design resource association method
Khalid et al. Topic detection from conversational dialogue corpus with parallel dirichlet allocation model and elbow method
Baboo et al. Sentiment analysis and automatic emotion detection analysis of twitter using machine learning classifiers
CN114722206A (en) Extremely short text classification method based on keyword screening and attention mechanism
CN115098690B (en) Multi-data document classification method and system based on cluster analysis
CN117057346A (en) Domain keyword extraction method based on weighted textRank and K-means
Zhang et al. Text information classification method based on secondly fuzzy clustering algorithm
Abalorio et al. Extended Max-Occurrence with Normalized Non-Occurrence as MONO Term Weighting Modification to Improve Text Classification
KR20070118154A (en) Information processing device and method, and program recording medium
Rezaei et al. Hierarchical three-module method of text classification in web big data
Hao Naive Bayesian Prediction of Japanese Annotated Corpus for Textual Semantic Word Formation Classification
Beumer Evaluation of Text Document Clustering using k-Means
Shahine et al. Hybrid Feature Selection Approach for Arabic Named Entity Recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination