CN117236330A - Method for enhancing topic diversity based on mutual information and an adversarial neural network - Google Patents
Method for enhancing topic diversity based on mutual information and an adversarial neural network
- Publication number
- CN117236330A CN117236330A CN202311524544.3A CN202311524544A CN117236330A CN 117236330 A CN117236330 A CN 117236330A CN 202311524544 A CN202311524544 A CN 202311524544A CN 117236330 A CN117236330 A CN 117236330A
- Authority
- CN
- China
- Prior art keywords
- distribution
- text
- topic
- layer
- real
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention belongs to the technical field of natural language processing and discloses a method for enhancing topic diversity based on mutual information and an adversarial neural network, comprising the following steps: preprocess the words in the corpus to obtain real text-word distributions; feed randomly sampled corpus texts to an encoder to generate real text-topic distribution vectors; form real distribution pairs from the real text-word distributions and their topic distributions, and form negative-sample distribution pairs by random within-batch shuffling; feed fake text-topic distributions randomly sampled from a Dirichlet distribution to a generator, which converts them into fake text-word distribution vectors; generate topic words during adversarial training using the real and fake distribution pairs; and train with the discriminator loss function and a regularization loss that maximizes mutual information as the objectives. The invention models text topics, mines high-quality topics, and integrates a mutual-information-maximization technique into the adversarial neural topic modeling process, enhancing topic diversity and yielding higher topic coherence and diversity scores.
Description
Technical Field
The invention belongs to the technical field of natural language processing, and particularly relates to a method for enhancing topic diversity based on mutual information and an adversarial neural network.
Background
Topic models are an important tool for text mining: they uncover hidden information in a corpus and are widely applied in scenarios such as topic aggregation, information extraction from unstructured text, and feature selection. Among them, latent Dirichlet allocation (LDA) is the most representative model for inferring the topic distribution of a text. However, because the model is complex to solve and difficult to adjust, researchers must design a dedicated theoretical method for each variant, which hinders subsequent topic modeling at the application level.
To remedy the shortcomings of traditional topic models, and building on the rapid development of generative neural networks in recent years, neural topic models have attracted wide attention and intensive study in the text mining and natural language processing communities; for example, the adversarial neural topic model and the bidirectional adversarial neural topic model were proposed based on adversarial training. These models use the Dirichlet distribution as the prior over the topic space, and their encoder and generator produce more realistic data distributions and more accurate topic representations, but they ignore valuable information between the generated and real data distributions, resulting in insufficient topic diversity.
Disclosure of Invention
To solve the above technical problems, the invention provides a method for enhancing topic diversity based on mutual information and an adversarial neural network, which makes the implicit topic information in a text obey a Dirichlet distribution and integrates a mutual-information-maximization mechanism into an adversarial neural topic modeling framework, improving the diversity of the topics the model mines.
In order to achieve the above purpose, the invention is realized by the following technical scheme:
the invention relates to a method for enhancing theme diversity based on mutual information and an antagonistic neural network, which comprises the following steps:
S1: preprocess the online text of a social platform to obtain real texts, and represent each real text as a real text-word distribution vector using a bag-of-words model;
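As a concrete illustration of step S1, the sketch below builds normalized bag-of-words vectors; the function name, toy corpus, and vocabulary are invented for illustration, since the patent does not specify an implementation:

```python
import numpy as np

def bow_vectors(docs, vocab):
    """Represent each preprocessed document as a normalized word-count
    vector over the vocabulary (each row is a text-word distribution)."""
    index = {w: i for i, w in enumerate(vocab)}
    mat = np.zeros((len(docs), len(vocab)))
    for n, doc in enumerate(docs):
        for tok in doc:
            if tok in index:
                mat[n, index[tok]] += 1.0
    # normalize counts so each row sums to 1 (guard against empty docs)
    row_sums = mat.sum(axis=1, keepdims=True)
    return mat / np.maximum(row_sums, 1.0)

docs = [["topic", "model", "topic"], ["neural", "network"]]
vocab = ["topic", "model", "neural", "network"]
d_real = bow_vectors(docs, vocab)   # shape (2, 4), rows sum to 1
```

Each row of `d_real` is one real text-word distribution vector d_r fed to the encoder in step S2.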
S2: place a number of real text-word distribution vectors in one batch as the input of an encoder to obtain real text-topic distribution vectors; form real distribution pairs from the real text-word distribution vectors and their corresponding topic distributions, and form negative-sample distribution pairs by pairing within-batch shuffled real text-word distribution vectors with the real text-topic distributions;
S3: randomly sample a topic vector from a Dirichlet distribution as a fake text-topic distribution, input it to a generator to obtain a fake text-word distribution vector, and combine the two into a fake distribution pair;
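The fake text-topic distributions of step S3 can be drawn directly from a Dirichlet prior; the sketch below assumes illustrative values for the topic count, batch size, and concentration parameter, which the patent leaves unspecified:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 5                                  # number of topics (illustrative)
batch = 8                              # batch size (illustrative)
alpha = np.full(K, 0.1)                # Dirichlet concentration parameter (assumed)

# each row is a fake text-topic distribution on the K-simplex
theta_fake = rng.dirichlet(alpha, size=batch)
```

Each row of `theta_fake` sums to 1 and is passed to the generator to produce a fake text-word distribution.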
S4: use the real and fake distribution pairs as the input of the adversarial generation network and the real and negative-sample pairs as the input of the statistical network; during adversarial training, train the encoder and the generator with the signals produced by the adversary, taking the regularization loss that maximizes mutual information as part of the training objective.
S5: to approximate the earth mover's distance and the Jensen-Shannon divergence between two high-dimensional distributions during training, repeatedly optimize and iterate the training objective during adversarial training until the loss function converges.
The invention is further characterized in that the encoder E in step 2 learns the mapping from real text-word distribution vectors to real text-topic distribution vectors and comprises a V-dimensional text-word distribution layer, an S-dimensional semantic (implicit-representation) layer and a K-dimensional text-topic distribution layer, specifically as follows:
S2.1, represent the real text of step 1 with the bag-of-words model, randomly sample a V-dimensional text-word distribution representation d_r as input, and let the encoder E map it into the S-dimensional implicit semantic space and then onto the K-dimensional text-topic distribution layer:
θ_r = softmax(W_e2 · BN(LeakyReLU(W_e1 · d_r + b_e1)) + b_e2),
where W_e1 is the weight matrix from the text-word distribution layer to the semantic (implicit-representation) layer, b_e1 its bias term, s the negative-slope parameter of the LeakyReLU activation function, BN batch normalization, W_e2 the weight matrix from the semantic layer to the text-topic distribution layer, b_e2 its bias term, and θ_r the text-topic distribution corresponding to the real text, whose k-th component θ_r,k indicates the proportion of the k-th topic in the real text;
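The encoder forward pass above can be sketched numerically as follows; this is a minimal inference-style sketch with random weights (the weight names, dimensions, and the simplified batch normalization without learned scale/shift are assumptions, not the patent's trained parameters):

```python
import numpy as np

def leaky_relu(x, s=0.2):
    # s is the negative-slope parameter of the LeakyReLU activation
    return np.where(x > 0, x, s * x)

def batch_norm(x, eps=1e-5):
    # per-feature normalization over the batch (no learned scale/shift)
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def encoder(d_r, W1, b1, W2, b2):
    """V-dim word distribution -> S-dim hidden -> K-dim topic distribution."""
    h = batch_norm(leaky_relu(d_r @ W1 + b1))
    return softmax(h @ W2 + b2)

rng = np.random.default_rng(1)
V, S, K, N = 10, 6, 3, 4               # illustrative layer sizes and batch size
theta_r = encoder(rng.random((N, V)),
                  rng.standard_normal((V, S)), np.zeros(S),
                  rng.standard_normal((S, K)), np.zeros(K))
```

The softmax output guarantees each row of `theta_r` is a valid topic distribution.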
S2.2, then splice the real V-dimensional word distribution vector and the real K-dimensional topic distribution vector into a real distribution pair p_r = (d_r, θ_r); denote the within-batch shuffled real text-word distribution vector by d̃_r, so that mismatched topic and word distributions within the batch form a negative-sample distribution pair p_neg = (d̃_r, θ_r).
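The pair construction of S2.2 amounts to concatenation plus an in-batch shuffle; in the sketch below a cyclic shift stands in for the shuffle (it guarantees every word distribution is re-paired with a different text's topic distribution), and the toy distributions are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, V, K = 4, 10, 3
d_r = rng.dirichlet(np.ones(V), size=N)       # real text-word distributions
theta_r = rng.dirichlet(np.ones(K), size=N)   # matched text-topic distributions

# within-batch shuffle: a cyclic shift re-pairs every row with another text
d_shuffled = np.roll(d_r, 1, axis=0)

pos_pairs = np.concatenate([d_r, theta_r], axis=1)        # real (matched) pairs
neg_pairs = np.concatenate([d_shuffled, theta_r], axis=1) # mismatched negative pairs
```

Both pair matrices are (V + K)-dimensional per row, matching the joint-distribution layer of the discriminator and statistical network described below.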
The generator G in step 3 learns the mapping from text-topic distributions to text-word distributions and comprises a K-dimensional text-topic distribution layer, an S-dimensional semantic (implicit-representation) layer and a V-dimensional text-word distribution layer; a Dirichlet distribution with parameter α is used as the prior of the fake text-topic distribution θ_f, obtained by the following formula:
p(θ_f | α) = (1 / B(α)) ∏_{k=1}^{K} θ_{f,k}^{α_k − 1},
where p(θ_f | α) is the probability density of the Dirichlet distribution, K the number of topics of the model, α the topic parameter, and θ_f represents the proportion of each topic in the fake text.
S3.1, the generator G first maps the fake text-topic distribution θ_f into the S-dimensional semantic (implicit-representation) layer and then maps the resulting S-dimensional implicit semantic space onto the V-dimensional text-word distribution layer:
d_f = softmax(W_g2 · BN(LeakyReLU(W_g1 · θ_f + b_g1)) + b_g2),
where W_g1 is the weight matrix from the text-topic distribution layer to the semantic (implicit-representation) layer, b_g1 its bias term, s the negative-slope parameter of the LeakyReLU activation function, BN batch normalization, W_g2 the weight matrix from the semantic layer to the text-word distribution layer, b_g2 its bias term, and d_f the fake text-word distribution, whose v-th component indicates the proportion of the v-th word in the fake text;
s3-2 then distributes the pseudo-text-topicAnd pseudo text-word distribution->Splice into false distribution pairs。
S4.1, in step 4, the real distribution pair p_r and the fake distribution pair p_f are regarded as random samples drawn from two (V + K)-dimensional joint distributions P_r and P_f, each combining a V-dimensional word-distribution component with a K-dimensional Dirichlet topic-distribution component. Training the adversarial generation network drives the fake joint distribution P_f to approximate the real joint distribution P_r; the statistical network uses the real sample pairs p_r and the negative-sample pairs p_neg to estimate the mutual information between the text-word distribution space and the text-topic distribution space and maximizes it to promote topic diversity. When training is complete, the encoder E and the generator G realize the bidirectional mapping between text-topic distributions and text-word distributions together with the internal mutual-information-maximization relation, specifically as follows:
s4.2 discriminatorIs composed of three layers of full-connection networks including one->+/>A combination distribution layer of dimensions +.>Semantic-implicit representation layer of dimension, one output layer. In true distribution pair->And pseudo distribution pair->For inputting and outputting +.>To judge the transfusionThe method is to enter the true or false of the distribution pair, and adopts the following formula:
wherein,for bulldozer distance>For the output signal of the arbiter, a value close to 1 indicates that the arbiter is more prone to discriminate it as true, and vice versa;
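Numerically, the adversarial signal is just a score gap between real and fake pairs. A minimal sketch, assuming a WGAN-style critic (the function name and the toy scores are invented; the critic loss below is the negative of the gap L_D, i.e. the quantity the critic minimizes):

```python
import numpy as np

def critic_loss(scores_real, scores_fake):
    """Negative real-fake score gap: minimizing this maximizes
    L_D = E[D(real)] - E[D(fake)], which approximates the earth
    mover's distance between the real and fake pair distributions."""
    return np.mean(scores_fake) - np.mean(scores_real)

# illustrative scores: the critic rates real pairs higher than fake ones
loss = critic_loss(np.array([0.9, 0.8, 0.95]), np.array([0.1, 0.2, 0.15]))
```

A negative loss here means the critic already separates real from fake pairs.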
s4.3 statistical networkComprising a global arbiter->And maximizing the mutual information loss function, global arbiter +.>Comprises a->+/>A combination distribution layer of dimensions +.>Semantic-implicit representation layer of dimension, one output layer. Statistical network->For calculating the true sample pair +.>And negative sample pair->Mutual information between them and output +.>The method adopts the following formula:
wherein,representation->Activating function->Input representing an activation function->And->Representing the true data distribution of the text-word distribution layer and the true distribution of the text-topic distribution layer, respectively,/->And->Representing distribution pairs of lot size->Is in the same batch (batch) and +.>Non-matching real text-word distribution.
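The estimator above is the softplus-based Jensen-Shannon lower bound on mutual information (as used in DeepInfoMax-style objectives). A minimal numeric sketch, assuming the statistics network's scores on matched and mismatched pairs are already given:

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def mi_jsd_estimate(t_pos, t_neg):
    """Jensen-Shannon mutual-information lower bound:
    t_pos are statistics-network scores on matched (d, theta) pairs,
    t_neg the scores on in-batch mismatched (negative) pairs."""
    return np.mean(-softplus(-t_pos)) - np.mean(softplus(t_neg))

# well-separated scores -> high estimate; indistinguishable scores -> low estimate
hi = mi_jsd_estimate(np.array([4.0, 5.0]), np.array([-4.0, -5.0]))
lo = mi_jsd_estimate(np.array([0.0, 0.0]), np.array([0.0, 0.0]))
```

When the network cannot tell matched from mismatched pairs (all scores 0), the bound collapses to −2·log 2; training pushes the estimate upward.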
S4.4, the final training objective of the model jointly optimizes the adversarial loss L_D and the mutual-information regularization term I_est: the discriminator maximizes L_D while the encoder and generator minimize it, and I_est is maximized throughout training.
the step 5 specifically comprises the following steps:
step 5-1, loading the dataset including text data, vocabulary, and word vectors
Step 5-2, build GeneratorEncoder->Discriminator->(mutual information) statistical network->The model is optimized by constructing an optimizer;
Step 5-3, feed the real distribution pairs p_r and the fake distribution pairs p_f to the discriminator D as input; its output signal D_out guides the encoder E and the generator G, thereby mining the topics in the text.
Step 5-4, counting the networkUse of true sample pairs->And negative sample distribution pair->Mutual information between the text-word distribution and the text topic distribution space is estimated for input and maximized to promote topic diversity.
Step 5-5, performing random gradient descent optimization according to the loss function of the discriminator and the regularized mutual information loss function, and updating parameters of the encoder and the decoder, namely:
step 5-6, repeating step 5-3, step 5-4 and step 5-5 until convergence.
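Steps 5-3 through 5-5 combine the two signals into one training value per iteration. A hedged numeric sketch of that combination (the weighting coefficient `lam` and the function names are assumptions; the patent does not state how the two losses are weighted):

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def total_loss(scores_real, scores_fake, t_pos, t_neg, lam=1.0):
    """Combined training signal for the encoder/generator update:
    adversarial critic term plus a weighted MI regularizer.
    Maximizing MI means subtracting its estimate from the loss."""
    critic = np.mean(scores_fake) - np.mean(scores_real)          # adversarial term
    mi = np.mean(-softplus(-t_pos)) - np.mean(softplus(t_neg))    # MI lower bound
    return critic - lam * mi

loss = total_loss(np.array([1.0]), np.array([0.0]),
                  np.array([2.0]), np.array([-2.0]))
```

In the full model this scalar would be computed on network outputs each iteration and minimized by stochastic gradient descent until convergence (step 5-6).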
The beneficial effects of the invention are as follows: through the mutual-information-maximization mechanism, the invention helps the topic model learn richer and more diverse topic representations; maximizing the mutual information among different words in the text encourages the model to organize related words into topics that are more coherent and better differentiated. By optimizing the objective function with maximized mutual information, the model better adapts to task demands, improving its performance in tasks such as generation, classification and clustering. Experiments on the 20 Newsgroups dataset show that, compared with other methods, the invention attains higher CP, CV, CA, NPMI and UT scores, and the quality of the mined topics is significantly improved.
Drawings
Fig. 1 is a model diagram of the present invention.
Fig. 2 is a specific training flow chart of the present invention.
Detailed Description
Embodiments of the invention are disclosed in the drawings, and for purposes of explanation, numerous practical details are set forth in the following description. However, it should be understood that these practical details are not to be taken as limiting the invention. That is, in some embodiments of the invention, these practical details are unnecessary.
As shown in figs. 1-2, the present invention is a method for enhancing topic diversity based on mutual information and an adversarial neural network, specifically comprising the steps of:
step 1, preprocessing an online text of a social platform to obtain a real text, and representing the real text sample into a real text-word distribution vector by using a word bag model method.
And 2, taking the real text-word distribution vector in the step 1 as input of an encoder to obtain mapping of the real text-theme distribution vector, forming a real distribution pair by the real text-word distribution vector and the theme distribution, and forming a negative sample distribution pair by the real text-word distribution vector in a batch and the real text-theme distribution vector.
Encoder in step 2Training the mapping relation of the real text-word distribution vector to the real text-topic distribution vector, comprising +.>A dimension text-word distribution layer,/->Dimension semantics-implicit presentation layer and +.>The dimension text-theme distributing layer specifically comprises the following steps:
step 2-1, using the real text in the step 1 to represent by using a bag-of-word model, and randomly sampling to obtainVitamin text-word distribution representation->As input, encoder->Map it to +.>Dimension implicit semantic space, and then get +.>Dimension implicit semantic space mapping to +.>Vitamin text-topic distribution layer:
wherein,and->Weight matrix for text-word distribution layer to semantic-implicit representation layer, +.>Bias term of weight matrix for text-word distribution layer to semantic-implicit representation layer, +.>For the parameters of the LeakyReLU activation function,for batch normalization, ++>Weight matrix for semantic-implicit representation layer to text-topic distribution layer, +.>Bias items for semantic-implicit presentation layer to text-topic distribution layer, +.>Is a text-topic distribution corresponding to real text and is the firstWei->Indicate->The weight of the individual subjects in the real text.
In the present example, the encoder network dimension is-/>-/>Wherein->For the word vector dimension,/->For semantic hidden layer dimension, ++>Is the topic vector dimension.
Step 2-2, followed by the actualWired distribution vector and true->The dimension theme distribution vectors are spliced into a real distribution pair +.>The within-batch disruption of the real text-word distribution vector is expressed as +.>The topic distribution and word distribution which are not matched in the batch form a negative sample distribution pair +.>。
Step 3, the generator G learns the mapping from text-topic distributions to text-word distributions and comprises a K-dimensional text-topic distribution layer, an S-dimensional semantic (implicit-representation) layer and a V-dimensional text-word distribution layer; a Dirichlet distribution with parameter α is used as the prior of the fake text-topic distribution θ_f, obtained by the following formula:
p(θ_f | α) = (1 / B(α)) ∏_{k=1}^{K} θ_{f,k}^{α_k − 1},
where p(θ_f | α) is the probability density of the Dirichlet distribution, K the number of topics of the model, α the topic parameter, and θ_f represents the proportion of each topic in the fake text.
Step 3-1, the generator G first maps the fake text-topic distribution θ_f into the S-dimensional semantic (implicit-representation) layer and then maps the resulting S-dimensional implicit semantic space onto the V-dimensional text-word distribution layer:
d_f = softmax(W_g2 · BN(LeakyReLU(W_g1 · θ_f + b_g1)) + b_g2),
where W_g1 is the weight matrix from the text-topic distribution layer to the semantic (implicit-representation) layer, b_g1 its bias term, s the negative-slope parameter of the LeakyReLU activation function, BN batch normalization, W_g2 the weight matrix from the semantic layer to the text-word distribution layer, b_g2 its bias term, and d_f the fake text-word distribution, whose v-th component indicates the weight of the v-th word in the fake text.
In the present example, the generator network G has dimensions K-S-V, where K is the topic-vector dimension, S the semantic hidden-layer dimension and V the word-vector dimension.
Step 3-2, then splice the fake text-topic distribution θ_f and the fake text-word distribution d_f into a fake distribution pair p_f = (d_f, θ_f).
In step 4, the real distribution pair p_r and the fake distribution pair p_f are regarded as random samples drawn from two (V + K)-dimensional joint distributions P_r and P_f, each combining a V-dimensional word-distribution component with a K-dimensional Dirichlet topic-distribution component. The training goal of the adversarial generation network is to make the fake distribution pair p_f approximate the real distribution pair p_r; the statistical network uses the real distribution pairs p_r and the negative-sample pairs p_neg to estimate the mutual information between the text-word distribution space and the text-topic distribution space and maximizes it to promote topic diversity. When training is complete, the encoder E and the generator G realize the bidirectional mapping between text-topic distributions and text-word distributions together with the internal mutual-information-maximization relation, specifically as follows:
Step 4-1, the discriminator D consists of a three-layer fully connected network: a (V + K)-dimensional joint distribution layer, an S-dimensional semantic (implicit-representation) layer and an output layer. It takes the real distribution pair p_r and the fake distribution pair p_f as input and outputs D_out to judge whether an input distribution pair is real or fake, using the following formula:
L_D = E_{p_r}[D(p_r)] − E_{p_f}[D(p_f)],
where L_D approximates the earth mover's distance and D_out is the output signal of the discriminator: a value close to 1 indicates that the discriminator tends to judge the input pair as real, and vice versa.
Step 4-2, the statistical network T comprises a global discriminator T_g and a mutual-information-maximization loss function; the global discriminator T_g consists of a (V + K)-dimensional joint distribution layer, an S-dimensional semantic (implicit-representation) layer and an output layer. The statistical network T computes the mutual information between the real sample pairs p_r and the negative-sample pairs p_neg and outputs the estimate I_est, using the following formula:
I_est = E_{p_r}[−sp(−T(d_r, θ_r))] − E_{p_neg}[sp(T(d̃_r, θ_r))],  sp(x) = log(1 + e^x),
where sp denotes the softplus activation function, x the input of the activation function, d_r and θ_r respectively the real data distribution of the text-word distribution layer and the real distribution of the text-topic distribution layer, N the batch size of distribution pairs, and d̃_r a real text-word distribution in the same batch that does not match θ_r.
In summary, the final training objective of the model jointly optimizes the adversarial loss L_D and the mutual-information regularization term I_est: the discriminator maximizes L_D while the encoder and generator minimize it, and I_est is maximized throughout training.
and 5, in order to approximate the bulldozer distance and the jensen-shannon distance between two high latitude distributions during training, repeatedly optimizing and iterating the training target in the countermeasure training process until the loss function converges.
Step 5-1, loading a data set comprising text data, a vocabulary and word vectors;
Step 5-2, build the generator G, the encoder E, the discriminator D and the statistical network T, and construct an optimizer to optimize the model;
Step 5-3, feed the real distribution pairs p_r and the fake distribution pairs p_f to the discriminator D as input; its output signal D_out guides the encoder E and the generator G, thereby mining the topics in the text.
Step 5-4, the statistical network T takes the real sample pairs p_r and the negative-sample distribution pairs p_neg as input, estimates the mutual information between the text-word distribution space and the text-topic distribution space, and maximizes it to promote topic diversity.
Step 5-5, perform stochastic gradient descent according to the loss function of the discriminator and the regularized mutual-information loss function, and update the parameters of the encoder and the generator;
step 5-6, repeating step 5-3, step 5-4 and step 5-5 until convergence.
The method for enhancing topic diversity based on mutual information and an adversarial neural network improves the correlation between the topic distribution and the word distribution, and enhances topic diversity, by maximizing the mutual information between the two.
The invention provides an adversarial neural network method for enhancing the diversity of topic models. Topic coherence was tested on the 20 Newsgroups dataset under five topic-number settings [20, 30, 50, 75, 100]. The average topic-coherence values measured for the method are: C_P 0.273, CA 0.206, UCI 0.139, NPMI 0.052 and UT 0.761, all higher than the comparison experiments, in which the best scores are CP 0.260, CA 0.158, UCI 0.09, NPMI 0.047 and UT 0.732.
Through the mutual-information-maximization mechanism, the invention helps the topic model learn richer and more diverse topic representations; maximizing the mutual information among different words in the text encourages the model to organize related words into topics that are more coherent and better differentiated. By optimizing the objective function with maximized mutual information, the model better adapts to task demands, improving its performance in tasks such as generation, classification and clustering.
The foregoing description is only illustrative of the invention and is not to be construed as limiting the invention. Various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, or the like, which is within the spirit and principles of the present invention, should be included in the scope of the claims of the present invention.
Claims (5)
1. A method for enhancing topic diversity based on mutual information and an adversarial neural network, characterized in that the method for enhancing topic diversity comprises the following steps:
Step 1, preprocess the online text of a social platform to obtain real texts, and represent each real text as a real text-word distribution vector using a bag-of-words model;
Step 2, place a plurality of the real text-word distribution vectors of step 1 in the same batch as the input of an encoder to obtain real text-topic distribution vectors; form real distribution pairs from the real text-word distribution vectors and their corresponding topic distributions, and splice within-batch shuffled real text-word distribution vectors with the real text-topic distribution vectors to form negative-sample distribution pairs;
Step 3, randomly sample a topic vector from a Dirichlet distribution as a fake text-topic distribution, input it into a generator to obtain a fake text-word distribution vector, and form a fake distribution pair from the fake text-word distribution vector and the fake text-topic distribution;
Step 4, the discriminator receives the real distribution pairs obtained in step 2 and the fake distribution pairs generated in step 3 as its input and computes the losses of the real and fake distribution pairs to distinguish the real distribution pairs from the generated data distribution pairs; a statistical network is introduced that receives the real distribution pairs and the negative-sample distribution pairs as input and computes the mutual information between them; the regularized mutual-information loss is added to the discriminator loss to increase the discriminator's perception capability and improve the quality and diversity of the generated samples;
Step 5, use adversarial training to approximately estimate the earth mover's distance between the real and fake distribution pairs and the Jensen-Shannon divergence between the real and negative-sample distribution pairs, and optimize the training objective and iterate the model through adversarial training until the loss function converges.
2. The method for enhancing topic diversity based on mutual information and an adversarial neural network according to claim 1, characterized in that the encoder E in step 2 learns the mapping from real text-word distribution vectors to real text-topic distribution vectors and comprises a V-dimensional text-word distribution layer, an S-dimensional semantic (implicit-representation) layer and a K-dimensional text-topic distribution layer, specifically as follows:
Step 2-1, represent the real text of step 1 with the bag-of-words model, randomly sample a V-dimensional text-word distribution representation d_r as input, and let the encoder E map it into the S-dimensional implicit semantic space and then onto the K-dimensional text-topic distribution layer, using the following formula:
θ_r = softmax(W_e2 · BN(LeakyReLU(W_e1 · d_r + b_e1)) + b_e2),
where W_e1 is the weight matrix from the text-word distribution layer to the semantic (implicit-representation) layer, b_e1 its bias term, s the negative-slope parameter of the LeakyReLU activation function, BN batch normalization, W_e2 the weight matrix from the semantic layer to the text-topic distribution layer, b_e2 its bias term, and θ_r the text-topic distribution corresponding to the real text, whose k-th component θ_r,k indicates the proportion of the k-th topic in the real text;
Step 2-2, then splice the real V-dimensional word distribution vector and the real K-dimensional topic distribution vector into a real distribution pair p_r = (d_r, θ_r); denote the within-batch shuffled real text-word distribution vector by d̃_r, so that mismatched topic and word distributions within the batch form a negative-sample distribution pair p_neg = (d̃_r, θ_r).
3. A method of enhancing topic diversity based on mutual information and antagonistic neural networks according to claim 2, characterized in that: step 3 generatorGenerating a mapping relationship of text-topic distribution to text-word distribution, comprising +.>A dimension text-topic distribution layer, & lt + & gt>Dimension semantics-implicit presentation layer and +.>A dimension text-word distribution layer, the usage parameter is +.>Dirichlet distribution as pseudo-text-topic distribution +.>Is obtained by using the following formula:
,
wherein the parameters areFor the probability density of the dirichlet distribution, the topic +.>For the subject parameters of the model, +.>Representing the probability that each word in the text belongs to each topic;
step 3-1, the generator $G$ first transforms the pseudo text-topic distribution $\theta_f$ into the $S$-dimensional semantic-implicit representation layer, and then maps the obtained $S$-dimensional implicit semantic representation to the $V$-dimensional text-word distribution layer:

$d_f = \mathrm{softmax}\big(W_{sd}\,\mathrm{BN}(\mathrm{LeakyReLU}_s(W_{ts}\,\theta_f + b_{ts})) + b_{sd}\big)$,

wherein $W_{ts}$ is the weight matrix from the text-topic distribution layer to the semantic-implicit representation layer, $b_{ts}$ is its bias term, $s$ is the parameter of the LeakyReLU activation function, $\mathrm{BN}(\cdot)$ is batch normalization, $W_{sd}$ is the weight matrix from the semantic-implicit representation layer to the text-word distribution layer, $b_{sd}$ is its bias term, and $d_f$ is the pseudo text-word distribution corresponding to the pseudo topic distribution, whose $v$-th component $d_{f,v}$ indicates the proportion of the $v$-th word in the pseudo text;

step 3-2, the pseudo text-topic distribution $\theta_f$ and the pseudo text-word distribution $d_f$ are then concatenated to form the pseudo distribution pair $p_f = [\theta_f; d_f]$.
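Steps 3 to 3-2 can be sketched as follows (again a minimal NumPy illustration; the Dirichlet concentration value, the random weights, and the omission of batch normalization are assumptions made for brevity):

```python
import numpy as np

def leaky_relu(x, slope=0.1):
    return np.where(x > 0, x, slope * x)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
V, S, K, B = 50, 16, 5, 8
alpha = np.full(K, 0.5)                 # Dirichlet concentration (assumed value)

# Step 3: sample pseudo text-topic distributions from the Dirichlet prior
theta_f = rng.dirichlet(alpha, size=B)  # (B, K), each row sums to 1

# Step 3-1: map topic -> semantic -> word distribution (batch norm omitted here)
W_ts, b_ts = rng.normal(0, 0.1, (K, S)), np.zeros(S)
W_sd, b_sd = rng.normal(0, 0.1, (S, V)), np.zeros(V)
d_f = softmax(leaky_relu(theta_f @ W_ts + b_ts) @ W_sd + b_sd)   # (B, V)

# Step 3-2: concatenate into the pseudo distribution pair
p_fake = np.concatenate([theta_f, d_f], axis=1)                  # (B, K + V)
```

A small concentration (here 0.5 per topic) yields sparse, peaked topic proportions, which is the usual motivation for a Dirichlet prior in topic models.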
4. The method for enhancing theme diversity based on mutual information and antagonistic neural networks according to claim 3, characterized in that: in step 4 the real distribution pair $p_r$ and the pseudo distribution pair $p_f$ are regarded as random samples drawn from two $(K+V)$-dimensional joint distributions $P_r$ and $P_f$, where $P_r$ and $P_f$ are each a joint distribution formed by a $K$-dimensional Dirichlet distribution pair and a $V$-dimensional Dirichlet distribution pair; the training goal of the discriminator $D$ is to make the pseudo distribution pair $p_f$ approximate the real distribution pair $p_r$; the statistical network $T$ uses the real distribution pair $p_r$ and the negative sample distribution pair $p_n$ to estimate the mutual information between the text-word distribution space and the text-topic distribution space and maximizes it to promote topic diversity; when training is completed, the encoder $E$ and the generator $G$ have obtained the bidirectional mapping relation between the text-topic distribution and the text-word distribution together with the internal mutual-information maximization relation, comprising the following steps:
step 4-1, the discriminator $D$ consists of a three-layer fully connected network, specifically a $(K+V)$-dimensional joint distribution layer, an $S$-dimensional semantic-implicit representation layer and an output layer; it takes the real distribution pair $p_r$ and the pseudo distribution pair $p_f$ as input and outputs $D_{out}$ to judge whether the input distribution pair is real or pseudo, using the following formula:

$L_D = \mathbb{E}_{p_r \sim P_r}[D(p_r)] - \mathbb{E}_{p_f \sim P_f}[D(p_f)]$,

wherein $L_D$ estimates the earth mover's (Wasserstein) distance between $P_r$ and $P_f$, and $D_{out} = D(\cdot)$ is the output signal of the discriminator: a value close to 1 indicates that the discriminator is more prone to judge the input pair as real, and vice versa;
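The discriminator scoring and the empirical critic loss of step 4-1 can be sketched as follows (a minimal NumPy illustration with random stand-in pairs; the layer sizes, random weights, and the absence of batch normalization and of any Lipschitz constraint are simplifying assumptions):

```python
import numpy as np

def leaky_relu(x, slope=0.1):
    return np.where(x > 0, x, slope * x)

def critic(pairs, W1, b1, w2, b2):
    """Score (K+V)-dim distribution pairs; higher means 'more real'."""
    h = leaky_relu(pairs @ W1 + b1)   # S-dim semantic-implicit layer
    return h @ w2 + b2                # scalar score per pair (output layer)

rng = np.random.default_rng(2)
KV, S, B = 55, 16, 8                  # joint dim K+V, semantic dim, batch size
W1, b1 = rng.normal(0, 0.1, (KV, S)), np.zeros(S)
w2, b2 = rng.normal(0, 0.1, S), 0.0

p_real = rng.dirichlet(np.ones(KV), size=B)   # stand-in for real pairs
p_fake = rng.dirichlet(np.ones(KV), size=B)   # stand-in for pseudo pairs

# Empirical loss: mean score on real minus mean score on fake. The critic
# maximizes this quantity; the encoder/generator act to minimize it.
L_D = critic(p_real, W1, b1, w2, b2).mean() - critic(p_fake, W1, b1, w2, b2).mean()
```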
step 4-2, the statistical network $T$ comprises a global discriminator $T_g$ and the mutual-information maximization loss function; the global discriminator $T_g$ comprises a $(K+V)$-dimensional joint distribution layer, an $S$-dimensional semantic-implicit representation layer and an output layer; the statistical network $T$ is used to calculate the mutual information between the real sample pair $p_r$ and the negative sample pair $p_n$ and outputs $I(\theta_r; d_r)$, using the following formulas:

$I(\theta_r; d_r) = \mathbb{E}_{p_r}\big[-\mathrm{sp}(-T_g(\theta_r, d_r))\big] - \mathbb{E}_{p_n}\big[\mathrm{sp}(T_g(\theta_r, \tilde{d}_r))\big]$,

$\mathrm{sp}(x) = \log(1 + e^{x})$,

wherein $\mathrm{sp}(\cdot)$ denotes the softplus activation function, $x$ denotes the input of the activation function, $d_r$ and $\theta_r$ respectively denote the real data distribution of the text-word distribution layer and the real distribution of the text-topic distribution layer, and $\tilde{d}_r$ is a real text-word distribution in the same batch that does not match $\theta_r$;
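The softplus-based mutual-information estimate of step 4-2 can be sketched as follows (the statistics-network scores are replaced here by synthetic values for matched and mismatched pairs; only the estimator formula itself follows the claim):

```python
import numpy as np

def softplus(x):
    # sp(x) = log(1 + e^x), computed stably via log-add-exp
    return np.logaddexp(0.0, x)

def mi_lower_bound(scores_pos, scores_neg):
    """Jensen-Shannon-style mutual-information estimate from statistics-network
    scores: E_pos[-sp(-T)] - E_neg[sp(T)]; larger = more estimated dependence."""
    return (-softplus(-scores_pos)).mean() - softplus(scores_neg).mean()

rng = np.random.default_rng(3)
# Synthetic scores the statistics network might assign to matched (positive)
# and mismatched (negative) (topic, word) distribution pairs
scores_pos = rng.normal(2.0, 0.5, 64)    # matched pairs scored high
scores_neg = rng.normal(-2.0, 0.5, 64)   # mismatched pairs scored low
mi = mi_lower_bound(scores_pos, scores_neg)
```

When the network cannot tell matched from mismatched pairs (all scores zero), the estimate bottoms out at $-2\log 2$; well-separated scores push it upward, which is what maximizing the bound rewards.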
step 4-3, the final training target of the model is as follows:

$\min_{E,G,T_g}\,\max_{D}\;\big(L_D - \lambda\, I(\theta_r; d_r)\big)$,

where $\lambda > 0$ weights the regularizing mutual-information term: the discriminator $D$ maximizes the earth mover's distance estimate $L_D$, while the encoder $E$, generator $G$ and statistical network $T_g$ minimize the objective, i.e. maximize the mutual information.
5. The method for enhancing theme diversity based on mutual information and antagonistic neural networks according to claim 4, characterized in that step 5 specifically comprises the following steps:
step 5-1, loading a data set comprising text data, a vocabulary and word vectors;
step 5-2, construct the encoder $E$, the generator $G$, the discriminator $D$ and the statistical network $T$, and construct an optimizer to optimize the model;

step 5-3, take the real distribution pair $p_r$ and the pseudo distribution pair $p_f$ as the input of the discriminator $D$; its output signal $D_{out}$ guides the encoder $E$ and the generator $G$, thereby mining the topics in the text;

step 5-4, the statistical network $T$ takes the real sample pair $p_r$ and the negative sample distribution pair $p_n$ as input, estimates the mutual information between the text-word distribution space and the text-topic distribution space, and maximizes it to promote topic diversity;

step 5-5, perform stochastic gradient descent optimization according to the loss function of the discriminator and the regularized mutual-information loss function, and update the parameters of the encoder and the generator (decoder), namely:

$\Theta_{E,G} \leftarrow \Theta_{E,G} - \eta\,\nabla_{\Theta_{E,G}}\big(L_D - \lambda\, I(\theta_r; d_r)\big)$,

where $\eta$ is the learning rate;
step 5-6, repeating the steps 5-3 to 5-5 until convergence.
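The alternating update loop of steps 5-3 to 5-6 can be illustrated on a deliberately tiny scalar adversarial game (this is not the patented model: there is no encoder and no mutual-information term; the values of `r`, the learning rates, and the damping factor are illustrative assumptions):

```python
import numpy as np

# A scalar "generator" statistic g chases a real statistic r under a linear
# critic with weight a. Critic loss: L_D = a * (r - g). The critic ascends
# L_D, the generator descends it, repeated until convergence (step 5-6).
r = 1.5            # statistic of the "real" distribution
g, a = 0.0, 0.0    # generator statistic and critic weight
lr_d, lr_g, decay = 0.05, 0.05, 0.99

for _ in range(2000):
    # critic step (cf. 5-3): ascend L_D in a; `decay` damps the oscillation
    a = decay * a + lr_d * (r - g)
    # generator step (cf. 5-5): descend L_D in g (dL_D/dg = -a)
    g = g + lr_g * a
# after training, g has converged toward the real statistic r
```

The damping factor plays the role a real optimizer's momentum/decay plays in stabilizing adversarial training; without it this two-player game oscillates indefinitely around the equilibrium.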
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311524544.3A CN117236330B (en) | 2023-11-16 | 2023-11-16 | Mutual information and antagonistic neural network based method for enhancing theme diversity |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117236330A true CN117236330A (en) | 2023-12-15 |
CN117236330B CN117236330B (en) | 2024-01-26 |
Family
ID=89095326
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311524544.3A Active CN117236330B (en) | 2023-11-16 | 2023-11-16 | Mutual information and antagonistic neural network based method for enhancing theme diversity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117236330B (en) |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108009187A (en) * | 2017-02-20 | 2018-05-08 | 南京航空航天大学 | A kind of short text Topics Crawling method for strengthening Text Representation |
CN110134786A (en) * | 2019-05-14 | 2019-08-16 | 南京大学 | A kind of short text classification method based on theme term vector and convolutional neural networks |
CN110442781A (en) * | 2019-06-28 | 2019-11-12 | 武汉大学 | It is a kind of based on generate confrontation network to grade ranked items recommended method |
CN110532378A (en) * | 2019-05-13 | 2019-12-03 | 南京大学 | A kind of short text aspect extracting method based on topic model |
CN110941721A (en) * | 2019-09-28 | 2020-03-31 | 国家计算机网络与信息安全管理中心 | Short text topic mining method and system based on variational self-coding topic model |
US20200372225A1 (en) * | 2019-05-22 | 2020-11-26 | Royal Bank Of Canada | System and method for controllable machine text generation architecture |
CN112100317A (en) * | 2020-09-24 | 2020-12-18 | 南京邮电大学 | Feature keyword extraction method based on theme semantic perception |
CN112597769A (en) * | 2020-12-15 | 2021-04-02 | 中山大学 | Short text topic identification method based on Dirichlet variational self-encoder |
US20210209416A1 (en) * | 2020-03-20 | 2021-07-08 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for generating event theme |
CN115099188A (en) * | 2022-06-22 | 2022-09-23 | 南京邮电大学 | Topic mining method based on word embedding and generating type neural network |
CN115828931A (en) * | 2023-02-09 | 2023-03-21 | 中南大学 | Chinese and English semantic similarity calculation method for paragraph-level text |
CN115878882A (en) * | 2021-09-26 | 2023-03-31 | 微软技术许可有限责任公司 | Hierarchical representation learning of user interests |
US11640493B1 (en) * | 2022-06-03 | 2023-05-02 | Actionpower Corp. | Method for dialogue summarization with word graphs |
CN116467443A (en) * | 2023-04-17 | 2023-07-21 | 西安理工大学 | Topic identification-based online public opinion text classification method |
CN116583880A (en) * | 2020-09-29 | 2023-08-11 | 通用电气精准医疗有限责任公司 | Multimodal image processing technique for training image data generation and use thereof for developing a unimodal image inference model |
Non-Patent Citations (5)
Title |
---|
CHING-SHENG LIN et al.: "Generative Adversarial Network for Joint Headline and Summary Generation", IEEE, vol. 10, page 90745 * |
WU Shaokang et al.: "Research on Embedded Topic Models Based on Deep Learning", Computer Knowledge and Technology, vol. 18, no. 28, page 7 * |
XIA Jiali; CAO Zhonghua; PENG Wenzhong; ZHANG Shousheng: "Text Topic Modeling with Skip-Gram Structure and Word Embedding Characteristics", Journal of Chinese Computer Systems, vol. 41, no. 07, page 1400 * |
MENG Xiangfu et al.: "A Survey of Personalized News Recommendation Methods", Journal of Frontiers of Computer Science and Technology, vol. 17, no. 12, page 2840 * |
ZHANG Junsan et al.: "Medical Image Report Generation Based on Diversified Label Matrix", Computer Science, page 1 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117808104A (en) * | 2024-02-29 | 2024-04-02 | 南京邮电大学 | Viewpoint mining method based on self-supervision expression learning and oriented to hot topics |
CN117808104B (en) * | 2024-02-29 | 2024-04-30 | 南京邮电大学 | Viewpoint mining method based on self-supervision expression learning and oriented to hot topics |
Also Published As
Publication number | Publication date |
---|---|
CN117236330B (en) | 2024-01-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||