CN110083676B - Short text-based field dynamic tracking method - Google Patents


Info

Publication number
CN110083676B
CN110083676B
Authority
CN
China
Prior art keywords
word
neural network
network model
short text
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910322267.5A
Other languages
Chinese (zh)
Other versions
CN110083676A (en)
Inventor
郭贵冰 (Guo Guibing)
李昂 (Li Ang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China
Priority to CN201910322267.5A
Publication of CN110083676A
Application granted
Publication of CN110083676B
Legal status: Active
Anticipated expiration



Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/31 - Indexing; Data structures therefor; Storage structures
    • G06F 16/313 - Selection or weighting of terms for indexing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/33 - Querying
    • G06F 16/3331 - Query processing
    • G06F 16/3332 - Query translation
    • G06F 16/3335 - Syntactic pre-processing, e.g. stopword elimination, stemming
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/33 - Querying
    • G06F 16/335 - Filtering based on additional data, e.g. user or group profiles
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a short text-based field dynamic tracking method, which belongs to the fields of recommendation systems and natural language processing and comprises the following steps: 1.1) data acquisition; 1.2) data preprocessing; 1.3) establishing a word embedding neural network model; 1.4) establishing a convolutional neural network model; 2.1) short text recommendation; 2.2) saving user feedback. The method fully and accurately mines the domain topic features of short texts, improving the accuracy of dynamic field tracking; it extends current state-of-the-art text recommendation schemes by scoring the latest short texts and presenting them to the user as the dynamics of the field; and it can acquire data from static web pages as well as collect short texts from dynamic data streams, storing them as data for text recommendation.

Description

Short text-based field dynamic tracking method
Technical Field
The invention belongs to the field of recommendation systems and natural language processing, and particularly relates to a field dynamic tracking method based on short texts.
Background
The dynamic tracking method derives from, and improves upon, two existing technologies: text recommendation and web crawling.
Traditional text recommendation technology is a content-based recommendation method: it analyzes the latent topic feature vector of a text and recommends texts according to their similarity to the text files to be recommended. The prior art includes a personalized Web text recommendation method that recommends texts whose influence on the user exceeds a threshold, and a Sina Weibo event recommendation method that calculates the similarity between the user model and the event vector through an improved cosine similarity (cosine similarity) algorithm. However, these prior arts cannot perform relevance (relevance) scoring within a given field and thus lack generality.
Crawler (crawler) technology is used to collect short text information, extract the text or other related information from it, and store it in a database. It is the infrastructure of information retrieval applications and an important means of data acquisition. The prior art includes a web crawler method that extracts short text information from web pages by parsing the Document Object Model (DOM) of the HyperText Markup Language (HTML), and a crawler technique that automatically acquires large volumes of microblog information through various crawling methods. However, these prior arts cannot both acquire data from static web pages and collect short texts from dynamic data streams to store data for text recommendation.
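By way of illustration of the DOM-parsing extraction described above, the following minimal sketch pulls visible text out of an HTML page using Python's standard html.parser; the set of tags treated as noise is an assumption, and this is not the patented crawler itself:

```python
from html.parser import HTMLParser

class ShortTextExtractor(HTMLParser):
    """Collects visible text nodes, skipping script/style noise."""
    NOISE_TAGS = {"script", "style", "nav", "footer"}  # assumed noise tags

    def __init__(self):
        super().__init__()
        self.in_noise = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.NOISE_TAGS:
            self.in_noise += 1

    def handle_endtag(self, tag):
        if tag in self.NOISE_TAGS and self.in_noise > 0:
            self.in_noise -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.in_noise:
            self.chunks.append(text)

extractor = ShortTextExtractor()
extractor.feed("<html><body><p>Deep learning for short text</p><script>x()</script></body></html>")
print(extractor.chunks)  # ['Deep learning for short text']
```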
Disclosure of Invention
To address the above technical problems, the invention provides a short text-based field dynamic tracking method, which comprises the following steps:
step 1, performing model training, specifically:
step 1.1, data acquisition:
continuously and directionally acquiring short texts on internet media according to keywords in a specific field specified by a user, and storing the short texts in a local database;
the short text comprises words, publication time and topic labels;
step 1.2, preprocessing data;
removing noise content from the short text; the noise content comprises stop words, emoticons and web links;
intercepting content of a fixed length from the short text after the noise content is removed; if the short text is shorter than this length, it is padded with "<PAD>" tokens to fill it up;
assigning a unique integer to each distinct word appearing in the short texts as an identifier to distinguish the words;
converting each document into the sequence of these integers as the vector representation of the document (see the sketch below);
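A minimal sketch of this preprocessing is given below; the fixed length of 30 tokens, the stop-word list and the regular expressions are illustrative assumptions rather than values fixed by the method:

```python
import re

STOPWORDS = {"the", "a", "an", "of", "to", "is"}  # illustrative stop-word list
FIXED_LEN = 30                                     # assumed fixed length
PAD = "<PAD>"

word2id = {PAD: 0}  # shared vocabulary: word -> unique integer identifier

def preprocess(text: str) -> list[int]:
    # Remove web links and symbolic noise content.
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^\w\s]", " ", text)
    # Tokenize and drop stop words.
    tokens = [w for w in text.lower().split() if w not in STOPWORDS]
    # Truncate to the fixed length, padding with <PAD> if too short.
    tokens = (tokens + [PAD] * FIXED_LEN)[:FIXED_LEN]
    # Map each word to its unique integer identifier.
    return [word2id.setdefault(w, len(word2id)) for w in tokens]

print(preprocess("The latest advance of deep learning http://t.co/x"))
```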
step 1.3, establishing a word embedding neural network model;
training the word embedding neural network model by taking the words appearing in the short texts and their context information as input and output respectively, with the output of the intermediate (projection) layer of the network taken as the word vector representation of the target word, specifically:
a word embedding neural network model is established to embed the information of the context words of a target word into the vector representation of the target word, combining the mathematical meaning of the embedding vector with the linguistic meaning of the target word;
the one-hot vector of the target word is used as the input of the word embedding neural network model;
the one-hot vector of a context word of the target word is used as the output of the word embedding neural network model;
the output of the projection layer of the word embedding neural network model is the projection of the target word, namely a word vector that contains the context information and represents the meaning of the target word in its context;
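Before the weight matrices are introduced, the following sketch illustrates how such (input, output) training pairs can be generated with a context window of two words on each side, as used below; the tokenized text is an illustrative assumption:

```python
# Build (target word, context word) training pairs from a tokenized short text
# with a window of w_{t-2}, w_{t-1}, w_{t+1}, w_{t+2} around each target w_t.
tokens = ["deep", "learning", "for", "short", "text", "tracking"]

pairs = []
for t, target in enumerate(tokens):
    for off in (-2, -1, 1, 2):
        if 0 <= t + off < len(tokens):
            pairs.append((target, tokens[t + off]))

print(pairs[:4])  # [('deep', 'learning'), ('deep', 'for'), ('learning', 'deep'), ...]
```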
The neural network weights between the input layer and the projection layer of the word embedding neural network model are represented by a $V \times N$ matrix $W = (w_{i,j})$ ($1 \le i \le V$, $1 \le j \le N$);
$W$ is randomly initialized, and the computation result $h_t$ of the projection layer is a $1 \times N$ vector, i.e.

$h_t = x_{w_t} \cdot W$

where $V$ is the total number of words in the lexicon, $N$ is a hyper-parameter giving the number of neurons in the projection layer, and $w_t$ is the target word; $x_{w_t} = (x_{w_t,1}, x_{w_t,2}, \ldots, x_{w_t,V})$ is the input of the neural network model, namely the one-hot vector of the target word: exactly one component of $x_{w_t}$ equals 1 and the rest are 0;
The neural network weights between the projection layer and the output layer are represented by an $N \times V$ matrix $W' = (w'_{i,j})$ ($1 \le i \le N$, $1 \le j \le V$);
$W'$ is randomly initialized, and the computation result $o_t$ of the output layer is a $1 \times V$ vector, i.e.

$o_t = h_t \cdot W'$

The context words of the target word $w_t$ are $w_I \in \{w_{t-2}, w_{t-1}, w_{t+1}, w_{t+2}\}$, with one-hot vectors $x_I \in \{x_{w_{t-2}}, x_{w_{t-1}}, x_{w_{t+1}}, x_{w_{t+2}}\}$;
A softmax classifier is attached to the output-layer result $o_t = [o_1, o_2, \ldots, o_V]$ to approximate the one-hot vector $x_I$ of a context word $w_I$ of the target word $w_t$, giving the output $u = (u_1, u_2, \ldots, u_V)$ of the neural network model, namely:

$u_j = \exp(o_j) / \sum_{k=1}^{V} \exp(o_k), \quad 1 \le j \le V$
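A small numerical sketch of this forward pass, assuming a toy vocabulary ($V = 5$) and projection width ($N = 3$), might look as follows; it is illustrative, not the patented implementation:

```python
import numpy as np

V, N = 5, 3                      # vocabulary size, projection-layer width
rng = np.random.default_rng(0)
W = rng.normal(size=(V, N))      # input -> projection weights, random init
Wp = rng.normal(size=(N, V))     # projection -> output weights (W')

def forward(t: int):
    x = np.zeros(V); x[t] = 1.0      # one-hot vector of the target word w_t
    h = x @ W                        # projection-layer result h_t (1 x N)
    o = h @ Wp                       # output-layer result o_t (1 x V)
    u = np.exp(o) / np.exp(o).sum()  # softmax over the vocabulary
    return h, u

h, u = forward(t=2)
print(u.sum())  # ~1.0: u is a distribution over the V words
```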
step 1.4, establishing a convolutional neural network model;
in the convolutional neural network model, the words in the short text are first converted into word vectors as local features, while the user-specified keywords are converted into word vectors as global features; the two groups of features pass through the convolutional layer and the pooling layer of the model respectively, and finally the score corresponding to each short text is calculated through a multilayer perceptron and a softmax activation function, specifically:
for words and phrases
Figure GDA0003228908340000034
Length of composition LwAny word w in the short textkThe output of the projection layer of the word-embedded neural network model obtained by step 1.3, the word-embedding map S of its N-dimensional vectorkThe result is that
Sk=word2vec(wk)
For words and phrases
Figure GDA0003228908340000035
Performs this process, resulting in a matrix S:
Figure GDA0003228908340000041
wherein N is the number of neurons in a projection layer of the word embedding neural network model and is also the dimension of a word vector output by the projection layer;
in a similar manner, the subject label characteristic H is obtained as
Figure GDA0003228908340000042
Wherein L isHThe number of the subject labels contained in the short text;
for the matrix S, the convolutional layer filter is defined
Figure GDA0003228908340000043
Is composed of
Figure GDA0003228908340000044
Wherein, the matrix is
Figure GDA0003228908340000045
Is hyper-parametric and performs random initializationMelting; s calculation result C of convolution layerSIs composed of
Figure GDA0003228908340000046
To CSMaximum pooling (max pooling) is performed per row to obtain a matrix PS
Figure GDA0003228908340000047
Similarly, for matrix H, the convolutional layer filter is defined
Figure GDA0003228908340000051
Is composed of
Figure GDA0003228908340000052
Wherein, the matrix is
Figure GDA0003228908340000053
If the parameter is a hyper-parameter, carrying out random initialization; h calculation result C of convolution layerHIs composed of
Figure GDA0003228908340000054
To CHMaking maximum pooling of each row to obtain a matrix PH
Figure GDA0003228908340000055
Will PSAnd PHAre respectively flattened (flattened) and connected to have a length of
Figure GDA0003228908340000056
Vector of (2)
Figure GDA0003228908340000057
Figure GDA0003228908340000058
Figure GDA0003228908340000059
Figure GDA00032289083400000510
And according to PfAnd between logistic regression layer neural networks
Figure GDA00032289083400000511
The dimension weight matrix M calculates an output o ' ═ of (o ') of the logistic regression layer '1,o′2):
Figure GDA00032289083400000512
Calculating the output value of the softmax activating function:
Figure GDA0003228908340000061
wherein
Figure GDA0003228908340000062
I.e. the fraction of the final output (score) and satisfies
Figure GDA0003228908340000063
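The following sketch traces this forward computation with toy dimensions; the filter shapes and window size are assumptions, since the patent leaves them as hyper-parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
N, Lw, LH = 8, 30, 3                    # embedding dim, words per text, topic labels
n_filt = 4                              # assumed number of filters

S = rng.normal(size=(Lw, N))            # word-embedding matrix of the short text
H = rng.normal(size=(LH, N))            # topic-label feature matrix
FS = rng.normal(size=(n_filt, 3, N))    # filters for S, window of 3 words (assumed)
FH = rng.normal(size=(n_filt, 1, N))    # filters for H, window of 1 label (assumed)

def conv_maxpool(X, F):
    n_f, w_len, _ = F.shape
    steps = X.shape[0] - w_len + 1
    # C[i, l]: response of filter i at window position l
    C = np.array([[np.sum(F[i] * X[l:l + w_len]) for l in range(steps)]
                  for i in range(n_f)])
    return C.max(axis=1)                # max pooling over each row

Pf = np.concatenate([conv_maxpool(S, FS), conv_maxpool(H, FH)])  # flatten + concat
M = rng.normal(size=(Pf.size, 2))       # logistic-regression weight matrix
o = Pf @ M
u = np.exp(o) / np.exp(o).sum()         # softmax; u[0] is the score
print(round(float(u.sum()), 6), float(u[0]))
```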
Step 2, text recommendation is carried out, specifically:
step 2.1, short text recommendation is performed according to the scores obtained in step 1.4, and the several short texts with the highest scores are recommended to the user as the dynamic progress of the field (see the sketch below);
Unlike step 1.4, step 2.1 processes only data from the last week that did not participate in training, and performs no back-propagation error correction; at run time, step 2.1 screens out, according to the user-specified number $N_{top}$, the $N_{top}$ short texts with the highest scores as the dynamics of the field;
step 2.2, the user gives feedback on the recommended short texts according to their degree of relevance, which is recorded in the local database to correct the model and improve its performance;
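A minimal sketch of step 2.1, in which a stand-in scoring function replaces the trained convolutional neural network of step 1.4, could be:

```python
import heapq

def recommend(texts, score, n_top):
    # Inference only: no back-propagation error correction is performed here.
    return heapq.nlargest(n_top, texts, key=score)

texts = ["graph neural nets for recsys", "cat pictures", "new BERT variant"]
fake_score = {"graph neural nets for recsys": 0.91,   # stand-in for the CNN score
              "cat pictures": 0.05,
              "new BERT variant": 0.77}
print(recommend(texts, fake_score.get, n_top=2))
```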
the training method of the word embedding neural network model in the step 1.3 comprises the following steps:
the training goal of the word embedding neural network model is to reduce
Figure GDA0003228908340000064
And wIUnique heat vector x ofI=(xI,1,xI,2,...,xI,V) Difference therebetween
Figure GDA0003228908340000065
Wherein wI∈{wt-2,wt-1,wt+1,wt+2And is provided with
Figure GDA0003228908340000067
For a weight matrix W ' of dimension N × V from projection layer to output layer, the gradient δ W ' of j (1 ≦ i ≦ N,1 ≦ j ≦ V) for any i, j according to the chain rule 'i,jIs equal to
Figure GDA0003228908340000066
Wherein
Figure GDA0003228908340000071
Figure GDA0003228908340000072
Figure GDA0003228908340000073
Therefore, it is not only easy to use
Figure GDA0003228908340000074
And according to gradient delta W'i,jWhile using the first moment calculated in the last execution of the training process
Figure GDA0003228908340000075
And second moment
Figure GDA0003228908340000076
Or first moment
Figure GDA0003228908340000077
And second moment
Figure GDA0003228908340000078
Is calculated as the set of weights W'i,jFirst moment estimation of
Figure GDA0003228908340000079
And second moment estimation
Figure GDA00032289083400000710
Figure GDA00032289083400000711
Figure GDA00032289083400000712
Wherein beta is1And beta2The super-parameter is used for controlling the change of the first moment estimation value and the second moment estimation value;
to pair
Figure GDA00032289083400000713
And
Figure GDA00032289083400000714
as a correction to
Figure GDA00032289083400000715
And
Figure GDA00032289083400000716
make it
Figure GDA00032289083400000717
And
Figure GDA00032289083400000718
approximated as an unbiased estimate:
Figure GDA0003228908340000081
Figure GDA0003228908340000082
wherein step is the number of times of the step completed, and W 'is updated'i,jIs composed of
Figure GDA0003228908340000083
Where α is a hyper-parameter, called learning rate (learning rate), and is controlled by'i,jThe update ratio of (2); e is defined as a decimal fraction greater than and approaching 0, e.g. 10-8For preventing division by zero in the score;
for a V x N dimensional weight matrix W from the input layer to the projection layer, according to the chain rule,for any j, i (1 ≦ j ≦ V,1 ≦ i ≦ N), its gradient δ Wj,iIs equal to
Figure GDA0003228908340000084
Wherein
Figure GDA0003228908340000085
Has been previously calculated; remaining value
Figure GDA0003228908340000086
Figure GDA0003228908340000087
And calculating a first moment estimate based on the gradient
Figure GDA0003228908340000088
And second moment estimation
Figure GDA0003228908340000089
Figure GDA00032289083400000810
Figure GDA00032289083400000811
Computing
Figure GDA00032289083400000812
And
Figure GDA00032289083400000813
approximate unbiased estimation:
Figure GDA0003228908340000091
Figure GDA0003228908340000092
and update the weight Wj,i
Figure GDA0003228908340000093
Each time the step is finished, the word embedding neural network model is trained once, so that the weight is updated;
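The moment-based update described above is the Adam rule; a compact sketch, using the commonly cited default hyper-parameters as assumptions, is:

```python
import numpy as np

def adam_step(w, grad, m, v, step, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One update of weights w from gradient grad, carrying moments m and v."""
    m = beta1 * m + (1 - beta1) * grad           # first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2      # second moment estimate
    m_hat = m / (1 - beta1 ** step)              # corrected, approx. unbiased
    v_hat = v / (1 - beta2 ** step)
    w = w - alpha * m_hat / (np.sqrt(v_hat) + eps)  # eps prevents division by zero
    return w, m, v

w = np.zeros(3); m = np.zeros(3); v = np.zeros(3)
w, m, v = adam_step(w, grad=np.array([0.1, -0.2, 0.0]), m=m, v=v, step=1)
print(w)
```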
the training method of the convolutional neural network model in step 1.4 specifically comprises the following steps:
the training target of the convolutional neural network model is reduced u '═ u'1,u′2) And actual degree of correlation a '═ a'1,a′2) The difference between them; wherein a'1Is [0,1 ]]Is LwDegree of correlation with domain to which keyword belongs, u'1The greater the correlation, the stronger the correlation, a'2=1-a′1
Is provided with
E′=E′1+E′2
And is provided with
E′j=-(a′jlogu′j+(1-a′j)(1-logu′j))
Wherein j is more than or equal to 1 and less than or equal to 2; for PfTo o' s
Figure GDA0003228908340000096
Dimension weight matrix M, according to the chain rule, for any i, j (M:)
Figure GDA0003228908340000097
J is more than or equal to 1 and less than or equal to 2) and the gradient delta M thereofi,jIs composed of
Figure GDA0003228908340000094
Wherein
Figure GDA0003228908340000095
Figure GDA0003228908340000101
Figure GDA0003228908340000102
Therefore, it is not only easy to use
Figure GDA0003228908340000103
And calculating a first moment estimate based on the gradient
Figure GDA00032289083400001012
And second moment estimation
Figure GDA00032289083400001013
Figure GDA0003228908340000104
Figure GDA0003228908340000105
Computing
Figure GDA0003228908340000106
And
Figure GDA0003228908340000107
approximate unbiased estimation:
Figure GDA0003228908340000108
Figure GDA0003228908340000109
and update the weight Mi,j
Figure GDA00032289083400001010
Filter with theme tag feature H updated later
Figure GDA00032289083400001011
Figure GDA0003228908340000111
Has a gradient of
Figure GDA0003228908340000112
Wherein
Figure GDA0003228908340000113
Figure GDA0003228908340000114
Figure GDA0003228908340000115
Therefore, it is not only easy to use
Figure GDA0003228908340000116
And calculate its first orderMoment estimation
Figure GDA0003228908340000117
And second moment estimation
Figure GDA0003228908340000118
Figure GDA0003228908340000119
Figure GDA00032289083400001110
Correction to approximate unbiased estimation:
Figure GDA00032289083400001111
Figure GDA00032289083400001112
and update
Figure GDA00032289083400001113
Figure GDA0003228908340000121
Filter for input matrix S
Figure GDA0003228908340000122
Has a gradient of
Figure GDA0003228908340000123
Wherein
Figure GDA0003228908340000124
Figure GDA0003228908340000125
Figure GDA0003228908340000126
Therefore, it is not only easy to use
Figure GDA0003228908340000127
And calculates an first moment estimate thereof
Figure GDA0003228908340000128
And second moment estimation
Figure GDA0003228908340000129
Figure GDA00032289083400001210
Figure GDA00032289083400001211
Correction to approximate unbiased estimation:
Figure GDA0003228908340000131
Figure GDA0003228908340000132
and update
Figure GDA0003228908340000133
Figure GDA0003228908340000134
And training the convolutional neural network model once every time the step is completed, so as to update the weight.
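A minimal numerical sketch of the logistic-regression-layer gradient $\delta M_{i,j} = (u'_j - a'_j) P_{f,i}$, with toy values as assumptions, is:

```python
import numpy as np

# Toy values (assumptions) to illustrate the gradient of the logistic layer.
Pf = np.array([0.4, -1.2, 0.7])   # flattened pooled features P_f
u = np.array([0.8, 0.2])          # softmax output (u'_1, u'_2)
a = np.array([1.0, 0.0])          # actual degree of correlation (a'_1, a'_2)

# The chain rule collapses to: delta M[i, j] = (u'_j - a'_j) * P_f[i]
dM = np.outer(Pf, u - a)
print(dM.shape)  # (|P_f|, 2), matching the weight matrix M
```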
The invention has the beneficial effects that:
compared with the conventional text recommendation method, the method can fully and accurately mine the domain theme characteristics of the short text, and improves the accuracy of domain dynamic tracking; the current excellent text recommendation scheme is extended, the latest short text is scored, and the latest short text is dynamically presented to a user as a field; the data can be acquired from static web pages, and short texts can be acquired from dynamic data streams to make data storage for text recommendation.
The invention has reasonable design, easy realization and good practical value.
Drawings
FIG. 1 is a schematic diagram of the process for removing noise content from a short text in an embodiment of the present invention;
FIG. 2 is a schematic diagram of the word embedding neural network model in an embodiment of the present invention;
FIG. 3 is a schematic diagram of the training method of the convolutional neural network model and of the short text recommendation in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention provides a short text-based field dynamic tracking method, which mainly adopts a recommendation technique based on a convolutional neural network and a crawler technique oriented to short texts, and comprises two parts: model training and text recommendation.
The model training part comprises the following steps:
step 1.1, data acquisition;
continuously and directionally acquiring short texts on internet media according to keywords in a specific field specified by a user, and storing the short texts in a local database;
the short text comprises words, publication time and topic labels (hashtag/category);
the short texts come from tweets, microblog posts, paper abstracts and user posts on network platforms such as Twitter, arXiv and Reddit;
the topic labels are specified by the user;
step 1.2, preprocessing data;
removing noise content from the short text, the process being shown in FIG. 1; the noise content comprises stop words, emoticons and web links, and the underlined content in the figure indicates stop words;
intercepting content of a fixed length from the short text after the noise content is removed; if the short text is shorter than this length, it is padded with "<PAD>" tokens to fill it up;
assigning a unique integer to each distinct word appearing in the short texts as an identifier to distinguish the words;
converting each document into the sequence of these integers as the vector representation of the document;
step 1.3, establishing a word embedding neural network model;
training the word embedding (word embedding) neural network model by taking the words appearing in the short texts and their context information as input and output respectively, with the output of the intermediate (projection) layer of the network taken as the word vector representation of the target word, as shown in FIG. 2, specifically:
a word embedding neural network model is established to embed the information of the context words of a target word into the vector representation of the target word, combining the mathematical meaning of the embedding vector with the linguistic meaning of the target word;
the one-hot vector of the target word is used as the input of the word embedding neural network model;
the one-hot vector of a context word of the target word is used as the output of the word embedding neural network model;
the output of the projection layer of the word embedding neural network model is the projection of the target word, namely a word vector that contains the context information and represents the meaning of the target word in its context;
The neural network weights between the input layer and the projection layer of the word embedding neural network model are represented by a $V \times N$ matrix $W = (w_{i,j})$ ($1 \le i \le V$, $1 \le j \le N$);
$W$ is randomly initialized, and the computation result $h_t$ of the projection layer is a $1 \times N$ vector, i.e.

$h_t = x_{w_t} \cdot W$

where $V$ is the total number of words in the lexicon, $N$ is a hyper-parameter giving the number of neurons in the projection layer, and $w_t$ is the target word; $x_{w_t} = (x_{w_t,1}, x_{w_t,2}, \ldots, x_{w_t,V})$ is the input of the neural network model, namely the one-hot vector of the target word: exactly one component of $x_{w_t}$ equals 1 and the rest are 0;
The neural network weights between the projection layer and the output layer are represented by an $N \times V$ matrix $W' = (w'_{i,j})$ ($1 \le i \le N$, $1 \le j \le V$);
$W'$ is randomly initialized, and the computation result $o_t$ of the output layer is a $1 \times V$ vector, i.e.

$o_t = h_t \cdot W'$

The context words of the target word $w_t$ are $w_I \in \{w_{t-2}, w_{t-1}, w_{t+1}, w_{t+2}\}$, with one-hot vectors $x_I \in \{x_{w_{t-2}}, x_{w_{t-1}}, x_{w_{t+1}}, x_{w_{t+2}}\}$;
A softmax classifier is attached to the output-layer result $o_t = [o_1, o_2, \ldots, o_V]$ to approximate the one-hot vector $x_I$ of a context word $w_I$ of the target word $w_t$, giving the output $u = (u_1, u_2, \ldots, u_V)$ of the neural network model, namely:

$u_j = \exp(o_j) / \sum_{k=1}^{V} \exp(o_k), \quad 1 \le j \le V$
the training method of the word embedding neural network model in the step 1.3 comprises the following steps:
the training goal of the word embedding neural network model is to reduce
Figure GDA0003228908340000157
And wIUnique heat vector x ofI=(xI,1,xI,2,...,xI,V) Difference therebetween
Figure GDA0003228908340000158
Wherein wI∈{wt-2,wt-1,wt+1,wt+2And is provided with
Figure GDA00032289083400001612
For a weight matrix W ' of dimension N × V from projection layer to output layer, the gradient δ W ' of j (1 ≦ i ≦ N,1 ≦ j ≦ V) for any i, j according to the chain rule 'i,jIs equal to
Figure GDA0003228908340000161
Wherein
Figure GDA0003228908340000162
Figure GDA0003228908340000163
Figure GDA0003228908340000164
Therefore, it is not only easy to use
Figure GDA0003228908340000165
And according to gradient delta W'i,jWhile using the first moment calculated in the last execution of the training process
Figure GDA0003228908340000166
And second moment
Figure GDA0003228908340000167
Or first moment
Figure GDA0003228908340000168
And second moment
Figure GDA0003228908340000169
Is calculated for the group ofWeight W'i,jFirst moment estimation of
Figure GDA00032289083400001610
And second moment estimation
Figure GDA00032289083400001611
Figure GDA0003228908340000171
Figure GDA0003228908340000172
Wherein beta is1And beta2The super-parameter is used for controlling the change of the first moment estimation value and the second moment estimation value;
to pair
Figure GDA0003228908340000173
And
Figure GDA0003228908340000174
as a correction to
Figure GDA0003228908340000175
And
Figure GDA0003228908340000176
make it
Figure GDA0003228908340000177
And
Figure GDA0003228908340000178
approximated as an unbiased estimate:
Figure GDA0003228908340000179
Figure GDA00032289083400001710
wherein step is the number of times of the step completed, and W 'is updated'i,jIs composed of
Figure GDA00032289083400001711
Where α is a hyper-parameter, called learning rate (learning rate), and is controlled by'i,jThe update ratio of (2); e is defined as a decimal fraction greater than and approaching 0, e.g. 10-8For preventing division by zero in the score;
for the V multiplied by N dimensional weight matrix W from the input layer to the projection layer, i (j is more than or equal to 1 and less than or equal to V, i is more than or equal to 1 and less than or equal to N) is arbitrarily set, and the gradient delta W is set according to the chain rulej,iIs equal to
Figure GDA00032289083400001712
Wherein
Figure GDA00032289083400001713
(1. ltoreq. k. ltoreq.V) has been calculated before; remaining value
Figure GDA00032289083400001714
Figure GDA00032289083400001715
And calculating a first moment estimate based on the gradient
Figure GDA0003228908340000181
And second moment estimation
Figure GDA0003228908340000182
Figure GDA0003228908340000183
Figure GDA0003228908340000184
Computing
Figure GDA0003228908340000185
And
Figure GDA0003228908340000186
approximate unbiased estimation:
Figure GDA0003228908340000187
Figure GDA0003228908340000188
and update the weight Wj,i
Figure GDA0003228908340000189
Each time the step is finished, the word embedding neural network model is trained once, so that the weight is updated;
step 1.4, establishing a convolutional neural network model;
In the convolutional neural network (Convolutional Neural Network) model, the words in the short text are first converted into word vectors as local features, while the user-specified keywords are converted into word vectors as global features; the two groups of features pass through the convolutional layer (convolution layer) and the pooling layer (max-pooling layer) of the model respectively, and finally the score corresponding to each short text is calculated through a multilayer perceptron (multilayer perceptron) and a softmax activation function, specifically:
for theBy words and phrases
Figure GDA00032289083400001810
Length of composition LwAny word w in the short textkThe output of the projection layer of the word-embedded neural network model obtained by step 1.3, the word-embedding map S of its N-dimensional vectorkThe result is that
Sk=word2vec(wk)
For words and phrases
Figure GDA00032289083400001811
Performs this process, resulting in a matrix S:
Figure GDA0003228908340000191
wherein N is the number of neurons in a projection layer of the word embedding neural network model and is also the dimension of a word vector output by the projection layer;
in a similar manner, the subject label characteristic H is obtained as
Figure GDA0003228908340000192
Wherein L isHThe number of the subject labels contained in the short text;
for the matrix S, the convolutional layer filter is defined
Figure GDA0003228908340000193
Is composed of
Figure GDA0003228908340000194
Wherein, the matrix is
Figure GDA0003228908340000195
If the parameter is a hyper-parameter, carrying out random initialization; s calculation result C of convolution layerSIs composed of
Figure GDA0003228908340000196
To CSMaximum pooling (max pooling) is performed per row to obtain a matrix PS
Figure GDA0003228908340000197
Similarly, for matrix H, the convolutional layer filter is defined
Figure GDA0003228908340000201
Is composed of
Figure GDA0003228908340000202
Wherein, the matrix is
Figure GDA0003228908340000203
If the parameter is a hyper-parameter, carrying out random initialization; h calculation result C of convolution layerHIs composed of
Figure GDA0003228908340000204
To CHMaking maximum pooling of each row to obtain a matrix PH
Figure GDA0003228908340000205
Will PSAnd PHAre respectively flattened (flattened) and connected to have a length of
Figure GDA0003228908340000206
Vector of (2)
Figure GDA0003228908340000207
Figure GDA0003228908340000208
Figure GDA0003228908340000209
Figure GDA00032289083400002010
And according to PfAnd between logistic regression layers
Figure GDA00032289083400002011
The dimension weight matrix M calculates an output o ' ═ of (o ') of the logistic regression layer '1,o′2):
Figure GDA00032289083400002012
Calculating the output value of the softmax activating function:
Figure GDA0003228908340000211
wherein
Figure GDA0003228908340000212
I.e. the fraction of the final output (score) and satisfies
Figure GDA0003228908340000213
The training method of the convolutional neural network model in step 1.4 is shown in FIG. 3 and is specifically:
The training target of the convolutional neural network model is to reduce the difference between $u' = (u'_1, u'_2)$ and the actual degree of correlation $a' = (a'_1, a'_2)$, where $a'_1 \in [0, 1]$ is the degree of correlation between the short text and the field to which the keywords belong; the larger $a'_1$, the stronger the correlation, and $a'_2 = 1 - a'_1$;
Let

$E' = E'_1 + E'_2$

with

$E'_j = -(a'_j \log u'_j + (1 - a'_j) \log(1 - u'_j))$

where $1 \le j \le 2$; for the $|P_f| \times 2$ weight matrix $M$ from $P_f$ to $o'$, according to the chain rule, for any $i, j$ ($1 \le i \le |P_f|$, $1 \le j \le 2$) the gradient $\delta M_{i,j}$ is

$\delta M_{i,j} = \partial E' / \partial M_{i,j} = (\partial E' / \partial u'_j) \cdot (\partial u'_j / \partial o'_j) \cdot (\partial o'_j / \partial M_{i,j})$

where $\partial o'_j / \partial M_{i,j} = P_{f,i}$; therefore

$\delta M_{i,j} = (u'_j - a'_j) \, P_{f,i}$

The first moment estimate $m^{M}_{i,j}$ and second moment estimate $v^{M}_{i,j}$ are calculated from the gradient:

$m^{M}_{i,j} \leftarrow \beta_1 m^{M}_{i,j} + (1 - \beta_1) \, \delta M_{i,j}$
$v^{M}_{i,j} \leftarrow \beta_2 v^{M}_{i,j} + (1 - \beta_2) \, (\delta M_{i,j})^2$

corrected to approximate unbiased estimates:

$\hat{m}^{M}_{i,j} = m^{M}_{i,j} / (1 - \beta_1^{step})$, $\hat{v}^{M}_{i,j} = v^{M}_{i,j} / (1 - \beta_2^{step})$

and the weight $M_{i,j}$ is updated:

$M_{i,j} \leftarrow M_{i,j} - \alpha \, \hat{m}^{M}_{i,j} / (\sqrt{\hat{v}^{M}_{i,j}} + \epsilon)$

After $M$ is updated, the gradient of the filter $F^H$ of the topic label feature $H$,

$\delta F^H = \partial E' / \partial F^H$

is obtained by back-propagating the error through the logistic regression layer, the flattening operation, the max-pooling layer and the convolutional layer according to the chain rule; its first moment estimate $m^{F^H}$ and second moment estimate $v^{F^H}$ are calculated:

$m^{F^H} \leftarrow \beta_1 m^{F^H} + (1 - \beta_1) \, \delta F^H$
$v^{F^H} \leftarrow \beta_2 v^{F^H} + (1 - \beta_2) \, (\delta F^H)^2$

corrected to approximate unbiased estimates:

$\hat{m}^{F^H} = m^{F^H} / (1 - \beta_1^{step})$, $\hat{v}^{F^H} = v^{F^H} / (1 - \beta_2^{step})$

and $F^H$ is updated:

$F^H \leftarrow F^H - \alpha \, \hat{m}^{F^H} / (\sqrt{\hat{v}^{F^H}} + \epsilon)$

For the filter $F^S$ of the input matrix $S$, the gradient is likewise

$\delta F^S = \partial E' / \partial F^S$

and its first moment estimate $m^{F^S}$ and second moment estimate $v^{F^S}$ are calculated:

$m^{F^S} \leftarrow \beta_1 m^{F^S} + (1 - \beta_1) \, \delta F^S$
$v^{F^S} \leftarrow \beta_2 v^{F^S} + (1 - \beta_2) \, (\delta F^S)^2$

corrected to approximate unbiased estimates:

$\hat{m}^{F^S} = m^{F^S} / (1 - \beta_1^{step})$, $\hat{v}^{F^S} = v^{F^S} / (1 - \beta_2^{step})$

and $F^S$ is updated:

$F^S \leftarrow F^S - \alpha \, \hat{m}^{F^S} / (\sqrt{\hat{v}^{F^S}} + \epsilon)$

Each time this step is completed, the convolutional neural network model has been trained once, and the weights are thereby updated;
the text recommendation part comprises the following steps:
step 2.1, as shown in FIG. 3, short text recommendation is performed according to the scores obtained in step 1.4, and the several short texts with the highest scores are recommended to the user as the dynamic progress of the field;
Unlike step 1.4, step 2.1 processes only data from the last week that did not participate in training, and performs no back-propagation error correction; at run time, step 2.1 screens out, according to the user-specified number $N_{top}$, the $N_{top}$ short texts with the highest scores as the dynamics of the field;
step 2.2, the user gives feedback on the recommended short texts according to their degree of relevance, which is recorded in the local database to correct the model and improve its performance;
in this embodiment, the feedback is N recommended by the usertopThe short text bars are scored between 1 and 5.
As its practical application shows, the scheme of the invention greatly improves the tracking of the dynamics of a specific field within massive information, and the method can be applied to any field simply by changing the field keywords.

Claims (3)

1. A field dynamic tracking method based on short texts is characterized by comprising the following steps:
step 1, performing model training, specifically:
step 1.1, data acquisition:
continuously and directionally acquiring short texts on internet media according to keywords in a specific field specified by a user, and storing the short texts in a local database;
the short text comprises words, publication time and topic labels;
step 1.2, preprocessing data;
removing noise content from the short text; the noise content comprises stop words, emoticons and web links;
intercepting content of a fixed length from the short text after the noise content is removed; if the short text is shorter than this length, it is padded with "<PAD>" tokens to fill it up;
assigning a unique integer to each distinct word appearing in the short texts as an identifier to distinguish the words;
converting each document into the sequence of these integers as the vector representation of the document;
step 1.3, establishing a word embedding neural network model;
training the word embedding neural network model by taking the words appearing in the short texts and their context information as input and output respectively, with the output of the intermediate (projection) layer of the network taken as the word vector representation of the target word, specifically:
a word embedding neural network model is established to embed the information of the context words of a target word into the vector representation of the target word, combining the mathematical meaning of the embedding vector with the linguistic meaning of the target word;
the one-hot vector of the target word is used as the input of the word embedding neural network model;
the one-hot vector of a context word of the target word is used as the output of the word embedding neural network model;
the output of the projection layer of the word embedding neural network model is the projection of the target word, namely a word vector that contains the context information and represents the meaning of the target word in its context;
The neural network weights between the input layer and the projection layer of the word embedding neural network model are represented by a $V \times N$ matrix $W = (w_{i,j})$ ($1 \le i \le V$, $1 \le j \le N$);
$W$ is randomly initialized, and the computation result $h_t$ of the projection layer is a $1 \times N$ vector, i.e.

$h_t = x_{w_t} \cdot W$

where $V$ is the total number of words in the lexicon, $N$ is a hyper-parameter giving the number of neurons in the projection layer, and $w_t$ is the target word; $x_{w_t} = (x_{w_t,1}, x_{w_t,2}, \ldots, x_{w_t,V})$ is the input of the neural network model, namely the one-hot vector of the target word: exactly one component of $x_{w_t}$ equals 1 and the rest are 0;
The neural network weights between the projection layer and the output layer are represented by an $N \times V$ matrix $W' = (w'_{i,j})$ ($1 \le i \le N$, $1 \le j \le V$);
$W'$ is randomly initialized, and the computation result $o_t$ of the output layer is a $1 \times V$ vector, i.e.

$o_t = h_t \cdot W'$

The context words of the target word $w_t$ are $w_I \in \{w_{t-2}, w_{t-1}, w_{t+1}, w_{t+2}\}$, with one-hot vectors $x_I \in \{x_{w_{t-2}}, x_{w_{t-1}}, x_{w_{t+1}}, x_{w_{t+2}}\}$;
A softmax classifier is attached to the output-layer result $o_t = [o_1, o_2, \ldots, o_V]$ to approximate the one-hot vector $x_I$ of a context word $w_I$ of the target word $w_t$, giving the output $u = (u_1, u_2, \ldots, u_V)$ of the neural network model, namely:

$u_j = \exp(o_j) / \sum_{k=1}^{V} \exp(o_k), \quad 1 \le j \le V$
step 1.4, establishing a convolutional neural network model;
in the convolutional neural network model, the words in the short text are first converted into word vectors as local features, while the user-specified keywords are converted into word vectors as global features; the two groups of features pass through the convolutional layer and the pooling layer of the model respectively, and finally the score corresponding to each short text is calculated through a multilayer perceptron and a softmax activation function, specifically:
for the word w1,w2,...,
Figure FDA0003286402010000029
Length of composition LwAny word w in the short textkThe output of the projection layer of the word-embedded neural network model obtained by step 1.3, the word-embedding map S of its N-dimensional vectorkThe result is that
Sk=word2vec(wk)
For word w1,w2,...,
Figure FDA0003286402010000031
Performs this process, resulting in a matrix S:
Figure FDA0003286402010000032
wherein N is the number of neurons in a projection layer of the word embedding neural network model and is also the dimension of a word vector output by the projection layer;
in a similar manner, the subject label characteristic H is obtained as
Figure FDA0003286402010000033
Wherein L isHThe number of the subject labels contained in the short text;
for the matrix S, the convolutional layer filter is defined
Figure FDA0003286402010000034
Is composed of
Figure FDA0003286402010000035
Of (2) matrixWherein
Figure FDA0003286402010000036
If the parameter is a hyper-parameter, carrying out random initialization; s calculation result C of convolution layerSIs composed of
Figure FDA0003286402010000037
To CSMaximum pooling (max pooling) is performed per row to obtain a matrix PS
Figure FDA0003286402010000041
Similarly, for matrix H, the convolutional layer filter is defined
Figure FDA0003286402010000042
Is composed of
Figure FDA0003286402010000043
Wherein, the matrix is
Figure FDA0003286402010000044
If the parameter is a hyper-parameter, carrying out random initialization; h calculation result C of convolution layerHIs composed of
Figure FDA0003286402010000045
To CHMaking maximum pooling of each row to obtain a matrix PH
Figure FDA0003286402010000046
Will PSAnd PHAre respectively flattened (flattened) and connected to have a length of
Figure FDA0003286402010000047
Vector of (2)
Figure FDA0003286402010000048
Figure FDA0003286402010000049
Figure FDA00032864020100000410
Figure FDA00032864020100000411
And according to PfAnd between logistic regression layer neural networks
Figure FDA0003286402010000051
The dimension weight matrix M calculates an output o ' ═ of (o ') of the logistic regression layer '1,o′2):
Figure FDA0003286402010000052
Calculating the output value of the softmax activating function:
Figure FDA0003286402010000053
wherein
Figure FDA0003286402010000054
I.e. the fraction of the final output (score) and satisfies
Figure FDA0003286402010000055
Step 2, text recommendation is carried out, specifically:
step 2.1, short text recommendation is performed according to the scores obtained in step 1.4, and the several short texts with the highest scores are recommended to the user as the dynamic progress of the field;
Unlike step 1.4, step 2.1 processes only data from the last week that did not participate in training, and performs no back-propagation error correction; at run time, step 2.1 screens out, according to the user-specified number $N_{top}$, the $N_{top}$ short texts with the highest scores as the dynamics of the field;
and step 2.2, the user gives feedback on the recommended short texts according to their degree of relevance, which is recorded in the local database.
2. The short text-based field dynamic tracking method according to claim 1, wherein the training method of the word embedding neural network model in step 1.3 is:
the training goal of the word embedding neural network model is to reduce
Figure FDA0003286402010000056
And wIUnique heat vector x ofI=(xI,1,xI,2,...,xI,V) Difference therebetween
Figure FDA0003286402010000057
Wherein wI∈{wt-2,wt-1,wt+1,wt+2And is provided with
Figure FDA0003286402010000058
NxV-dimensional weight moment for projection layer to output layerMatrix W ', according to the chain rule, for any i, j (1 ≦ i ≦ N,1 ≦ j ≦ V), the gradient delta W'i,jIs equal to
Figure FDA0003286402010000061
Wherein
Figure FDA0003286402010000062
Figure FDA0003286402010000063
Figure FDA0003286402010000064
Therefore, it is not only easy to use
Figure FDA0003286402010000065
And according to gradient delta W'i,jWhile using the first moment calculated in the last execution of the training process
Figure FDA0003286402010000066
And second moment
Figure FDA0003286402010000067
Or first moment
Figure FDA0003286402010000068
And second moment
Figure FDA0003286402010000069
Is calculated as the set of weights W'i,jFirst moment estimation of
Figure FDA00032864020100000610
And second moment estimation
Figure FDA00032864020100000611
Figure FDA0003286402010000071
Figure FDA0003286402010000072
Wherein beta is1And beta2The super-parameter is used for controlling the change of the first moment estimation value and the second moment estimation value;
to pair
Figure FDA0003286402010000073
And
Figure FDA0003286402010000074
as a correction to
Figure FDA0003286402010000075
And
Figure FDA0003286402010000076
make it
Figure FDA0003286402010000077
And
Figure FDA0003286402010000078
approximated as an unbiased estimate:
Figure FDA0003286402010000079
Figure FDA00032864020100000710
wherein step is the number of times of the step completed, and W 'is updated'i,jIs composed of
Figure FDA00032864020100000711
Where α is a hyper-parameter, called learning rate (learning rate), and is controlled by'i,jThe update ratio of (2); e is defined as a fraction greater than and approaching 0 to prevent division by zero in the fraction;
for the V multiplied by N dimensional weight matrix W from the input layer to the projection layer, i (j is more than or equal to 1 and less than or equal to V, i is more than or equal to 1 and less than or equal to N) is arbitrarily set, and the gradient delta W is set according to the chain rulej,iIs equal to
Figure FDA00032864020100000712
Wherein
Figure FDA00032864020100000713
Has been previously calculated; remaining value
Figure FDA00032864020100000714
Figure FDA00032864020100000715
And calculating a first moment estimate based on the gradient
Figure FDA0003286402010000081
And second moment estimation
Figure FDA0003286402010000082
Figure FDA0003286402010000083
Figure FDA0003286402010000084
Computing
Figure FDA0003286402010000085
And
Figure FDA0003286402010000086
approximate unbiased estimation:
Figure FDA0003286402010000087
Figure FDA0003286402010000088
and update the weight Wj,i
Figure FDA00032864020100000812
And training the word embedding neural network model once every time the step is completed, so as to update the weight.
3. The short text-based field dynamic tracking method according to claim 1, wherein the training method of the convolutional neural network model in step 1.4 specifically comprises:
the training target of the convolutional neural network model is reduced u '═ u'1,u′2) And actual degree of correlation a '═ a'1,a′2) The difference between them; wherein a'1Is [0,1 ]]Is LwDegree of correlation with domain to which keyword belongs, u'1The greater the correlation, the stronger the correlation, a'2=1-a′1
Is provided with
E′=E′1+E′2
And is provided with
E′j=-(a′jlogu′j+(1-a′j)(1-logu′j))
Wherein j is more than or equal to 1 and less than or equal to 2; for PfTo o' s
Figure FDA0003286402010000089
Dimension weight matrix M, according to the chain rule, for arbitrary
Figure FDA00032864020100000810
Gradient δ M thereofi,jIs composed of
Figure FDA00032864020100000811
Wherein
Figure FDA0003286402010000091
Figure FDA0003286402010000092
Figure FDA0003286402010000093
Therefore, it is not only easy to use
Figure FDA0003286402010000094
And calculating a first moment estimate based on the gradient
Figure FDA0003286402010000095
And second moment estimation
Figure FDA0003286402010000096
Figure FDA0003286402010000097
Figure FDA0003286402010000098
Computing
Figure FDA0003286402010000099
And
Figure FDA00032864020100000910
approximate unbiased estimation:
Figure FDA00032864020100000911
Figure FDA00032864020100000912
and update the weight Mi,j
Figure FDA00032864020100000913
Filter with theme tag feature H updated later
Figure FDA0003286402010000101
Figure FDA0003286402010000102
Has a gradient of
Figure FDA0003286402010000103
Wherein
Figure FDA0003286402010000104
Figure FDA0003286402010000105
Figure FDA0003286402010000106
Therefore, it is not only easy to use
Figure FDA0003286402010000107
And calculates an first moment estimate thereof
Figure FDA0003286402010000108
And second moment estimation
Figure FDA0003286402010000109
Figure FDA00032864020100001010
Figure FDA00032864020100001011
Correction to approximate unbiased estimation:
Figure FDA0003286402010000111
Figure FDA0003286402010000112
and update
Figure FDA0003286402010000113
Figure FDA0003286402010000114
Filter for input matrix S
Figure FDA0003286402010000115
Figure FDA0003286402010000116
Has a gradient of
Figure FDA0003286402010000117
Wherein
Figure FDA0003286402010000118
Figure FDA0003286402010000119
Figure FDA00032864020100001110
Therefore, it is not only easy to use
Figure FDA00032864020100001111
And calculates an first moment estimate thereof
Figure FDA00032864020100001112
And second moment estimation
Figure FDA00032864020100001113
Figure FDA0003286402010000121
Figure FDA0003286402010000122
Correction to approximate unbiased estimation:
Figure FDA0003286402010000123
Figure FDA0003286402010000124
and update
Figure FDA0003286402010000125
Figure FDA0003286402010000126
And training the convolutional neural network model once every time the step is completed, so as to update the weight.
CN201910322267.5A 2019-04-22 2019-04-22 Short text-based field dynamic tracking method Active CN110083676B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910322267.5A CN110083676B (en) 2019-04-22 2019-04-22 Short text-based field dynamic tracking method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910322267.5A CN110083676B (en) 2019-04-22 2019-04-22 Short text-based field dynamic tracking method

Publications (2)

Publication Number Publication Date
CN110083676A CN110083676A (en) 2019-08-02
CN110083676B true CN110083676B (en) 2021-12-03

Family

ID=67415993

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910322267.5A Active CN110083676B (en) 2019-04-22 2019-04-22 Short text-based field dynamic tracking method

Country Status (1)

Country Link
CN (1) CN110083676B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795937A (en) * 2019-09-25 2020-02-14 卓尔智联(武汉)研究院有限公司 Information processing method, device and storage medium
CN111460105B (en) * 2020-04-02 2023-08-29 清华大学 Topic mining method, system, equipment and storage medium based on short text

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834747A (en) * 2015-05-25 2015-08-12 中国科学院自动化研究所 (Institute of Automation, Chinese Academy of Sciences) Short text classification method based on convolutional neural network
CN107562729A (en) * 2017-09-14 2018-01-09 云南大学 (Yunnan University) Party-building document representation method based on neural network and topic enhancement
CN107656990A (en) * 2017-09-14 2018-02-02 中山大学 (Sun Yat-sen University) Text classification method based on feature information of both words and characters

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170308790A1 (en) * 2016-04-21 2017-10-26 International Business Machines Corporation Text classification by ranking with convolutional neural networks

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834747A (en) * 2015-05-25 2015-08-12 中国科学院自动化研究所 (Institute of Automation, Chinese Academy of Sciences) Short text classification method based on convolutional neural network
CN107562729A (en) * 2017-09-14 2018-01-09 云南大学 (Yunnan University) Party-building document representation method based on neural network and topic enhancement
CN107656990A (en) * 2017-09-14 2018-02-02 中山大学 (Sun Yat-sen University) Text classification method based on feature information of both words and characters

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Supervised Deep Polylingual Topic Modeling for Scholarly Information Recommendations; Daniel Ramage et al.; ICPRAM 2018 - 7th International Conference on Pattern Recognition Applications and Methods; 2018-12-31 *
一种基于深度学习与Labeled_LDA的文本分类方法 (A text classification method based on deep learning and Labeled-LDA); 庞宇明 (Pang Yuming); 中国优秀硕士学位论文全文数据库 信息科技辑 (China Masters' Theses Full-text Database, Information Science and Technology); 2018-03-15 *
基于主题增强卷积神经网络的用户兴趣识别 (User interest recognition based on topic-enhanced convolutional neural networks); 杜雨萌 等 (Du Yumeng et al.); 《计算机研究与发展》 (Journal of Computer Research and Development); 2018-01-31 *

Also Published As

Publication number Publication date
CN110083676A (en) 2019-08-02

Similar Documents

Publication Publication Date Title
CN109992648B (en) Deep text matching method and device based on word migration learning
CN107330049B (en) News popularity estimation method and system
CN110019685B (en) Deep text matching method and device based on sequencing learning
CN114565104A (en) Language model pre-training method, result recommendation method and related device
CN110516245A (en) Fine granularity sentiment analysis method, apparatus, computer equipment and storage medium
CN111881334A (en) Keyword-to-enterprise retrieval method based on semi-supervised learning
CN107688870B (en) Text stream input-based hierarchical factor visualization analysis method and device for deep neural network
WO2021139415A1 (en) Data processing method and apparatus, computer readable storage medium, and electronic device
US20220318317A1 (en) Method for disambiguating between authors with same name on basis of network representation and semantic representation
WO2023272748A1 (en) Academic accurate recommendation-oriented heterogeneous scientific research information integration method and system
CN110046223B (en) Film evaluation emotion analysis method based on improved convolutional neural network model
CN113822776B (en) Course recommendation method, device, equipment and storage medium
CN110083676B (en) Short text-based field dynamic tracking method
CN110889282A (en) Text emotion analysis method based on deep learning
CN114328834A (en) Model distillation method and system and text retrieval method
CN110019653A (en) A kind of the social content characterizing method and system of fusing text and label network
CN109086463A (en) A kind of Ask-Answer Community label recommendation method based on region convolutional neural networks
CN113934835B (en) Retrieval type reply dialogue method and system combining keywords and semantic understanding representation
CN110874392B (en) Text network information fusion embedding method based on depth bidirectional attention mechanism
CN110569355B (en) Viewpoint target extraction and target emotion classification combined method and system based on word blocks
CN111914553A (en) Financial information negative subject judgment method based on machine learning
CN114492451A (en) Text matching method and device, electronic equipment and computer readable storage medium
CN113722439A (en) Cross-domain emotion classification method and system based on antagonism type alignment network
CN114282528A (en) Keyword extraction method, device, equipment and storage medium
CN112966096A (en) Cloud service discovery method based on multi-task learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant