CN117236330A - Method for enhancing topic diversity based on mutual information and adversarial neural networks - Google Patents

Method for enhancing topic diversity based on mutual information and adversarial neural networks


Publication number
CN117236330A
Authority
CN
China
Prior art keywords
distribution
text
topic
layer
real
Prior art date
Legal status
Granted
Application number
CN202311524544.3A
Other languages
Chinese (zh)
Other versions
CN117236330B (en)
Inventor
王睿
郝仁
刘星
黄海平
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202311524544.3A
Publication of CN117236330A
Application granted
Publication of CN117236330B
Status: Active
Anticipated expiration

Landscapes

  • Machine Translation (AREA)

Abstract

The invention belongs to the technical field of natural language processing and discloses a method for enhancing topic diversity based on mutual information and an adversarial neural network, which comprises the following steps: words in the corpus are preprocessed to serve as the real text-word distributions; randomly sampled corpus documents are used as the input of an encoder to generate real text-topic distribution vectors; distribution pairs are formed from the real text-word distributions and the topic distributions, and randomly perturbed in-batch pairings are used as negative sample distribution pairs; a fake text-topic distribution randomly sampled from a Dirichlet distribution is input to a generator and converted into fake text-word distribution vectors; topic words are generated in the adversarial training process using the real and fake distribution pairs; training targets the discriminator loss function and a regularization loss that maximizes mutual information. The invention models text topics, mines high-quality topics, and integrates the mutual-information-maximization technique into the adversarial neural topic modeling process, enhancing topic diversity and achieving higher topic coherence and diversity indexes.

Description

Method for enhancing topic diversity based on mutual information and adversarial neural networks
Technical Field
The invention belongs to the technical field of natural language processing, and particularly relates to a method for enhancing topic diversity based on mutual information and an adversarial neural network.
Background
The topic model is an important tool for text mining that uncovers hidden information in a corpus, and it is widely applied in scenarios such as topic aggregation, information extraction from unstructured text, and feature selection. Among topic models, latent Dirichlet allocation (LDA) is the most representative model for inferring the topic distribution of text. However, the model is complex to solve and difficult to adjust, so researchers must design corresponding theoretical methods for it, which is not conducive to subsequent topic modeling at the application level.
To remedy the deficiencies of traditional topic models, and building on the rapid development of generative neural networks in recent years, neural topic models have attracted attention and intensive study from many scholars in the fields of text mining and natural language processing; for example, an adversarial-neural topic model and a bidirectional adversarial-neural topic model have been proposed based on adversarial training. These models use the Dirichlet distribution as the prior distribution of the topic space, and their encoders and generators produce more realistic data distributions and more accurate topic representations, but they ignore valuable information between the generated data distribution and the real data distribution, resulting in insufficient topic diversity.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a method for enhancing topic diversity based on mutual information and an adversarial neural network, which makes the implicit topic information in text obey a Dirichlet distribution and, under an adversarial neural topic modeling framework, integrates a mutual-information-maximization mechanism to improve the diversity of the topics mined by the model.
In order to achieve the above purpose, the invention is realized by the following technical scheme:
the invention relates to a method for enhancing theme diversity based on mutual information and an antagonistic neural network, which comprises the following steps:
S1: carry out data preprocessing on the online text of the social platform to obtain real text, and represent the real text as real text-word distribution vectors by using the bag-of-words model;
S2: place a plurality of real text-word distribution vectors in a batch as the input of the encoder to obtain real text-topic distribution vectors, form real distribution pairs from the real text-word distribution vectors and their corresponding topic distributions, and form negative sample distribution pairs by shuffling the real text-word distribution vectors within the batch and pairing them with the real text-topic distributions;
S3: randomly sample a topic vector from the Dirichlet distribution as the fake text-topic distribution, input it into the generator to obtain a fake text-word distribution vector, and combine the fake text-word distribution vector with the fake text-topic distribution to form a fake distribution pair;
S4: the real distribution pairs and the fake distribution pairs are used as the input of the adversarial generation network, and the real distribution pairs and the negative sample pairs are used as the input of the statistical network; in the adversarial training process, the encoder and the generator are trained through the adversarially generated signals, and the model is trained with the goal of maximizing the mutual-information regularization loss.
S5: in order to approximate the earth mover's distance and the Jensen-Shannon distance between two high-dimensional distributions during training, the training objective is repeatedly optimized and iterated in the adversarial training process until the loss function converges.
The invention is further improved in that the encoder E in step 2 learns the mapping from the real text-word distribution vectors to the real text-topic distribution vectors and comprises a V-dimensional text-word distribution layer, an S-dimensional semantic-implicit representation layer and a K-dimensional text-topic distribution layer, specifically as follows:
S2.1: the real text from step 1 is represented with the bag-of-words model, and a randomly sampled V-dimensional text-word distribution d_r is taken as input; the encoder E maps it into the S-dimensional implicit semantic space and then maps the S-dimensional implicit semantic space to the K-dimensional text-topic distribution layer:

h_e = LeakyReLU_a(BN(W_1·d_r + b_1)),
θ_r = softmax(BN(W_2·h_e + b_2)),
where W_1 and b_1 are the weight matrix and bias term from the text-word distribution layer to the semantic-implicit representation layer, a is the slope parameter of the LeakyReLU activation function, BN(·) denotes batch normalization, W_2 and b_2 are the weight matrix and bias term from the semantic-implicit representation layer to the text-topic distribution layer, and θ_r is the text-topic distribution corresponding to the real text, whose k-th component θ_{r,k} indicates the proportion of the k-th topic in the real text;
S2.2: the real V-dimensional word distribution vector and the real K-dimensional topic distribution vector are then concatenated into the real distribution pair p_r = [d_r; θ_r]; the real text-word distribution vector shuffled within the batch is denoted d̃_r, and the mismatched in-batch topic and word distributions form the negative sample distribution pair p_n = [d̃_r; θ_r].
The generator G in step 3 learns the mapping from text-topic distribution to text-word distribution and comprises a K-dimensional text-topic distribution layer, an S-dimensional semantic-implicit representation layer and a V-dimensional text-word distribution layer; a Dirichlet distribution with parameter α is used as the prior of the fake text-topic distribution θ_f, which is obtained by using the following formula:

p(θ_f | α) = ( Γ(Σ_{k=1..K} α_k) / Π_{k=1..K} Γ(α_k) ) · Π_{k=1..K} θ_{f,k}^{α_k − 1},
where α is the parameter of the Dirichlet probability density, K is the topic number of the model, and θ_f represents the proportion of each topic in the fake text.
S3.1: the generator G first maps the fake text-topic distribution θ_f to the S-dimensional semantic-implicit representation layer, and then maps the obtained S-dimensional implicit semantic space to the V-dimensional text-word distribution layer:

h_g = LeakyReLU_a(BN(W_3·θ_f + b_3)),
d_f = softmax(BN(W_4·h_g + b_4)),
where W_3 and b_3 are the weight matrix and bias term from the text-topic distribution layer to the semantic-implicit representation layer, a is the slope parameter of the LeakyReLU activation function, BN(·) denotes batch normalization, W_4 and b_4 are the weight matrix and bias term from the semantic-implicit representation layer to the text-word distribution layer, and d_f is the fake text-word distribution, whose v-th component indicates the probability of the v-th vocabulary word in the fake text;
S3.2: the fake text-topic distribution θ_f and the fake text-word distribution d_f are then concatenated into the fake distribution pair p_f = [d_f; θ_f].
S4.1: the real distribution pair p_r and the fake distribution pair p_f in step 4 are regarded as random samples drawn from two (V+K)-dimensional joint distributions P_r and P_f, each the joint of a V-dimensional Dirichlet-distributed word distribution and a K-dimensional Dirichlet-distributed topic distribution; the adversarial generation network is trained to make the fake joint distribution P_f approximate the real joint distribution P_r; the statistical network uses the real sample pairs p_r and the negative sample pairs p_n to estimate the mutual information between the text-word distribution space and the text-topic distribution space and maximizes it to promote topic diversity; when training is complete, the encoder E and the generator G realize the bidirectional mapping between text-topic distribution and text-word distribution together with the internal mutual-information-maximization relation, specifically as follows:
S4.2: the discriminator D consists of a three-layer fully connected network comprising a (V+K)-dimensional joint distribution layer, an S-dimensional semantic-implicit representation layer and an output layer; taking the real distribution pair p_r and the fake distribution pair p_f as input, it outputs a signal used to judge whether the input distribution pair is real or fake, using the following formula:

L_D = 𝔼_{p_f∼P_f}[D(p_f)] − 𝔼_{p_r∼P_r}[D(p_r)],
where L_D approximates the earth mover's distance between the two joint distributions and D(·) is the output signal of the discriminator; a value close to 1 indicates that the discriminator tends to judge the input as real, and vice versa;
S4.3: the statistical network T comprises a global discriminator T_ω and the mutual-information-maximization loss function; the global discriminator T_ω contains a (V+K)-dimensional joint distribution layer, an S-dimensional semantic-implicit representation layer and an output layer; the statistical network T is used to compute the mutual information between the real sample pairs p_r and the negative sample pairs p_n and to output I(d; θ), using the following formula:

I(d; θ) = 𝔼_P[−sp(−T_ω(d_r, θ_r))] − 𝔼_Ñ[sp(T_ω(d̃_r, θ_r))],
where sp(x) = log(1 + e^x) denotes the softplus activation function and x denotes its input, d_r and θ_r denote the real data distribution of the text-word distribution layer and the real distribution of the text-topic distribution layer respectively, P and Ñ denote the distribution pairs of batch size B, and d̃_r is a real text-word distribution in the same batch that does not match θ_r.
S4.4: the final training objective of the model is as follows:

L = L_D − λ·I(d; θ),

where λ is a balance coefficient: the discriminator is trained on L_D while the encoder and generator are trained adversarially against it, and the mutual-information term is maximized as a regularizer.
the step 5 specifically comprises the following steps:
Step 5-1, load the dataset, including the text data, the vocabulary and the word vectors;
Step 5-2, build GeneratorEncoder->Discriminator->(mutual information) statistical network->The model is optimized by constructing an optimizer;
Step 5-3, the real distribution pairs p_r and the fake distribution pairs p_f are used as the input of the discriminator D, whose output signal guides the encoder E and the generator G and thereby mines the topics in the text.
Step 5-4, the statistical network T takes the real sample pairs p_r and the negative sample distribution pairs p_n as input, estimates the mutual information between the text-word distribution space and the text-topic distribution space, and maximizes it to promote topic diversity.
Step 5-5, perform stochastic gradient descent optimization according to the discriminator loss function and the regularized mutual-information loss function, and update the parameters of the encoder and the generator, namely:

φ ← φ − η·∇_φ(L_D − λ·I(d; θ)),

for each trainable parameter φ, where η is the learning rate.
Step 5-6, repeat step 5-3, step 5-4 and step 5-5 until convergence.
The beneficial effects of the invention are as follows: through the mutual-information-maximization mechanism, the invention helps the topic model learn richer and more diversified topic representations; maximizing the mutual information among different words in the text prompts the model to organize related words into topics that are more consistent and better differentiated. By optimizing the objective function with maximized mutual information, the model adapts better to task demands, improving its performance in tasks such as generation, classification and clustering. Related experiments on the 20Newsgroups dataset show that, compared with other methods, the method achieves higher CP, CV, CA, NPMI and UT indexes, and the quality of the mined topics is significantly improved.
Drawings
Fig. 1 is a model diagram of the present invention.
Fig. 2 is a specific training flow chart of the present invention.
Detailed Description
Embodiments of the invention are illustrated in the drawings, and for purposes of explanation numerous practical details are set forth in the following description. However, these practical details should not be taken as limiting the invention; that is, in some embodiments of the invention these practical details are unnecessary.
As shown in figs. 1-2, the invention is a method for enhancing topic diversity based on mutual information and an adversarial neural network, specifically comprising the following steps:
Step 1, preprocess the online text of a social platform to obtain real text, and represent each real text sample as a real text-word distribution vector by using the bag-of-words model.
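To make step 1 concrete, the following minimal sketch (an illustration, not the patent's implementation; the toy corpus `docs` and the scikit-learn vectorizer settings are assumptions) turns preprocessed documents into normalized text-word distribution vectors:

```python
# Sketch of step 1: bag-of-words counts normalized into text-word distributions.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the match ended in a late draw", "new phone models were released today"]  # toy corpus

vectorizer = CountVectorizer(lowercase=True, stop_words="english")
counts = vectorizer.fit_transform(docs).toarray().astype(np.float64)  # (N, V) term counts

d_r = counts / counts.sum(axis=1, keepdims=True)  # each row sums to 1: one word distribution per text
V = d_r.shape[1]                                  # vocabulary size, i.e. the word-distribution dimension
```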
Step 2, take the real text-word distribution vectors from step 1 as the input of the encoder to obtain the mapping to real text-topic distribution vectors; form real distribution pairs from the real text-word distribution vectors and the topic distributions, and form negative sample distribution pairs from the in-batch shuffled real text-word distribution vectors and the real text-topic distribution vectors.
The encoder E in step 2 learns the mapping from the real text-word distribution vectors to the real text-topic distribution vectors and comprises a V-dimensional text-word distribution layer, an S-dimensional semantic-implicit representation layer and a K-dimensional text-topic distribution layer, specifically as follows:
Step 2-1, the real text from step 1 is represented with the bag-of-words model, and a randomly sampled V-dimensional text-word distribution d_r is taken as input; the encoder E maps it into the S-dimensional implicit semantic space and then maps the S-dimensional implicit semantic space to the K-dimensional text-topic distribution layer:

h_e = LeakyReLU_a(BN(W_1·d_r + b_1)),
θ_r = softmax(BN(W_2·h_e + b_2)),
where W_1 and b_1 are the weight matrix and bias term from the text-word distribution layer to the semantic-implicit representation layer, a is the slope parameter of the LeakyReLU activation function, BN(·) denotes batch normalization, W_2 and b_2 are the weight matrix and bias term from the semantic-implicit representation layer to the text-topic distribution layer, and θ_r is the text-topic distribution corresponding to the real text, whose k-th component θ_{r,k} indicates the weight of the k-th topic in the real text.
In the present example, the encoder network dimensions are V-S-K, where V is the word-vector dimension, S is the semantic hidden-layer dimension and K is the topic-vector dimension.
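A minimal PyTorch sketch of an encoder with this V-S-K layout follows; it is an assumed reading of the description above (the LeakyReLU slope, the layer names and the use of nn.BatchNorm1d are illustrative choices, not the patent's exact implementation):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps a V-dim text-word distribution to a K-dim text-topic distribution."""
    def __init__(self, V: int, S: int, K: int, leak: float = 0.2):
        super().__init__()
        self.word_to_sem = nn.Linear(V, S)   # W_1, b_1: word layer -> semantic layer
        self.bn_sem = nn.BatchNorm1d(S)
        self.act = nn.LeakyReLU(leak)        # LeakyReLU with slope parameter a
        self.sem_to_topic = nn.Linear(S, K)  # W_2, b_2: semantic layer -> topic layer
        self.bn_topic = nn.BatchNorm1d(K)

    def forward(self, d_r: torch.Tensor) -> torch.Tensor:
        h = self.act(self.bn_sem(self.word_to_sem(d_r)))                  # V -> S
        return torch.softmax(self.bn_topic(self.sem_to_topic(h)), dim=1)  # S -> K, rows sum to 1
```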
Step 2-2, followed by the actualWired distribution vector and true->The dimension theme distribution vectors are spliced into a real distribution pair +.>The within-batch disruption of the real text-word distribution vector is expressed as +.>The topic distribution and word distribution which are not matched in the batch form a negative sample distribution pair +.>
Step 3, generator in step 3Generating a mapping relationship of text-topic distribution to text-word distribution, comprising +.>A dimension text-topic distribution layer, & lt + & gt>Dimension semantics-implicit presentation layer and +.>A dimension text-word distribution layer, the usage parameter is +.>Dirichlet distribution as pseudo-text-topic distribution +.>Is obtained by using the following formula:
where α is the parameter of the Dirichlet probability density, K is the topic number of the model, and θ_f represents the proportion of each topic in the fake text.
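Sampling the fake text-topic distributions of step 3 might look like the following; the symmetric concentration value and the batch size are assumed hyperparameters, not values fixed by the patent:

```python
import torch

B, K = 64, 50                  # batch size and topic number (illustrative)
alpha = torch.full((K,), 0.1)  # Dirichlet parameter (assumed symmetric value)
theta_f = torch.distributions.Dirichlet(alpha).sample((B,))  # (B, K); each row sums to 1
```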
Step 3-1, generatorPseudo-text-topic distribution is first +.>Switch to->The dimension semantics-implicit presentation layer, the obtained +.>Dimension implicit semantic space mapping to +.>A dimension text-word distribution layer:
where W_3 and b_3 are the weight matrix and bias term from the text-topic distribution layer to the semantic-implicit representation layer, a is the slope parameter of the LeakyReLU activation function, BN(·) denotes batch normalization, W_4 and b_4 are the weight matrix and bias term from the semantic-implicit representation layer to the text-word distribution layer, and d_f is the fake text-word distribution, whose v-th component indicates the probability of the v-th vocabulary word in the fake text.
In the present example, the generator network G has dimensions K-S-V, where K is the topic-vector dimension, S is the semantic hidden-layer dimension and V is the word-vector dimension.
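Mirroring the encoder, a generator sketch with this K-S-V layout (again an assumed reading, not the patent's exact code):

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a K-dim fake text-topic distribution to a V-dim text-word distribution."""
    def __init__(self, K: int, S: int, V: int, leak: float = 0.2):
        super().__init__()
        self.topic_to_sem = nn.Linear(K, S)  # W_3, b_3: topic layer -> semantic layer
        self.bn_sem = nn.BatchNorm1d(S)
        self.act = nn.LeakyReLU(leak)
        self.sem_to_word = nn.Linear(S, V)   # W_4, b_4: semantic layer -> word layer
        self.bn_word = nn.BatchNorm1d(V)

    def forward(self, theta_f: torch.Tensor) -> torch.Tensor:
        h = self.act(self.bn_sem(self.topic_to_sem(theta_f)))           # K -> S
        return torch.softmax(self.bn_word(self.sem_to_word(h)), dim=1)  # S -> V, rows sum to 1
```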
Step 3-2, then distributing the pseudo-text-subjectAnd pseudo text-word distribution->Splice into false distribution pairs
The real distribution pair p_r and the fake distribution pair p_f in step 4 are regarded as random samples drawn from two (V+K)-dimensional joint distributions P_r and P_f, each the joint of a V-dimensional Dirichlet-distributed word distribution and a K-dimensional Dirichlet-distributed topic distribution. The training goal of the adversarial generation network is to make the fake distribution pair approximate the real distribution pair; the statistical network uses the real distribution pairs p_r and the negative sample pairs p_n to estimate the mutual information between the text-word distribution space and the text-topic distribution space and maximizes it to promote topic diversity. When training is complete, the encoder E and the generator G realize the bidirectional mapping between text-topic distribution and text-word distribution together with the internal mutual-information-maximization relation, specifically as follows:
Step 4-1, the discriminator D consists of a three-layer fully connected network comprising a (V+K)-dimensional joint distribution layer, an S-dimensional semantic-implicit representation layer and an output layer; taking the real distribution pair p_r and the fake distribution pair p_f as input, it outputs a signal used to judge whether the input distribution pair is real or fake, using the following formula:

L_D = 𝔼_{p_f∼P_f}[D(p_f)] − 𝔼_{p_r∼P_r}[D(p_r)],
where L_D approximates the earth mover's distance between the two joint distributions and D(·) is the output signal of the discriminator; a value close to 1 indicates that the discriminator tends to judge the input as real, and vice versa;
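A sketch of the discriminator and its loss follows. The WGAN-style critic objective below is one standard way to approximate the earth mover's distance named above; reading the formula this way is our assumption:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Scores (V+K)-dim joint distribution pairs; higher = judged more real."""
    def __init__(self, V: int, K: int, S: int, leak: float = 0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(V + K, S),  # joint distribution layer -> semantic layer
            nn.LeakyReLU(leak),
            nn.Linear(S, 1),      # output layer
        )

    def forward(self, pair: torch.Tensor) -> torch.Tensor:
        return self.net(pair).squeeze(1)

def d_loss(D: Discriminator, p_real: torch.Tensor, p_fake: torch.Tensor) -> torch.Tensor:
    # Critic objective approximating the earth mover's distance between the
    # fake and real joint pair distributions; the discriminator minimizes this.
    return D(p_fake).mean() - D(p_real).mean()
```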
Step 4-2, the statistical network T comprises a global discriminator T_ω and the mutual-information-maximization loss function; the global discriminator T_ω contains a (V+K)-dimensional joint distribution layer, an S-dimensional semantic-implicit representation layer and an output layer; the statistical network T is used to compute the mutual information between the real sample pairs p_r and the negative sample pairs p_n and to output I(d; θ), using the following formula:

I(d; θ) = 𝔼_P[−sp(−T_ω(d_r, θ_r))] − 𝔼_Ñ[sp(T_ω(d̃_r, θ_r))],
where sp(x) = log(1 + e^x) denotes the softplus activation function and x denotes its input, d_r and θ_r denote the real data distribution of the text-word distribution layer and the real distribution of the text-topic distribution layer respectively, P and Ñ denote the distribution pairs of batch size B, and d̃_r is a real text-word distribution in the same batch that does not match θ_r.
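The estimator below follows the Jensen-Shannon-style mutual-information lower bound that matches the softplus (sp) activation named above; the network shape and the helper name `mi_estimate` are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StatNet(nn.Module):
    """Global discriminator T: scores (word-distribution, topic-distribution) pairs."""
    def __init__(self, V: int, K: int, S: int, leak: float = 0.2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(V + K, S), nn.LeakyReLU(leak), nn.Linear(S, 1))

    def forward(self, d: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([d, theta], dim=1)).squeeze(1)

def mi_estimate(T: StatNet, d_r, theta_r, d_shuf) -> torch.Tensor:
    """Jensen-Shannon MI lower bound: E_P[-sp(-T)] - E_N[sp(T)]."""
    pos = -F.softplus(-T(d_r, theta_r)).mean()   # matched (real) pairs
    neg = F.softplus(T(d_shuf, theta_r)).mean()  # in-batch mismatched pairs
    return pos - neg
```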
In summary, the final training objective of the model is as follows:

L = L_D − λ·I(d; θ),

where the balance coefficient λ weights the mutual-information regularization term against the adversarial loss.
Step 5, in order to approximate the earth mover's distance and the Jensen-Shannon distance between two high-dimensional distributions during training, the training objective is repeatedly optimized and iterated in the adversarial training process until the loss function converges.
Step 5-1, loading a data set comprising text data, a vocabulary and word vectors;
Step 5-2, build the generator G, the encoder E, the discriminator D and the statistical network T, and construct an optimizer to optimize the model;
Step 5-3, the real distribution pairs p_r and the fake distribution pairs p_f are used as the input of the discriminator D, whose output signal guides the encoder E and the generator G and thereby mines the topics in the text.
Step 5-4, counting networkUse of true sample pairs->And negative sample distribution pair->Estimating for input the mutual between text-word distribution and text topic distribution spaceInformation and maximize it to promote topic diversity.
Step 5-5, performing random gradient descent optimization according to the loss function of the discriminator and the regularized mutual information loss function, and updating parameters of the encoder and the decoder, namely:
Step 5-6, repeat step 5-3, step 5-4 and step 5-5 until convergence.
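Putting the pieces together, a compressed training-loop sketch for steps 5-1 to 5-6. It assumes the illustrative modules defined in the earlier sketches, picks Adam for the unspecified optimizer, and substitutes a random stand-in batch where real data loading would go; all hyperparameter values are assumptions:

```python
import torch

# Assumes Encoder, Generator, Discriminator, StatNet, build_pairs, d_loss and
# mi_estimate from the sketches above. Hyperparameters below are assumptions.
V, S, K, B, lam = 2000, 100, 50, 64, 1.0
E, G = Encoder(V, S, K), Generator(K, S, V)
D, T = Discriminator(V, K, S), StatNet(V, K, S)
alpha = torch.full((K,), 0.1)

opt_D = torch.optim.Adam(D.parameters(), lr=1e-4)
opt_EGT = torch.optim.Adam(
    list(E.parameters()) + list(G.parameters()) + list(T.parameters()), lr=1e-4)

def load_batch() -> torch.Tensor:
    """Stand-in for step 5-1: replace with real text-word distribution batches."""
    d = torch.rand(B, V)
    return d / d.sum(dim=1, keepdim=True)

for it in range(10000):  # step 5-6: repeat until the losses converge
    d_r = load_batch()
    theta_f = torch.distributions.Dirichlet(alpha).sample((B,))

    # Step 5-3: the discriminator learns to separate real pairs from fake pairs.
    p_real, _ = build_pairs(d_r, E(d_r).detach())
    p_fake = torch.cat([G(theta_f).detach(), theta_f], dim=1)
    loss_D = d_loss(D, p_real, p_fake)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Steps 5-4 / 5-5: encoder and generator fool the discriminator while the
    # statistical network's mutual-information estimate is maximized.
    theta_r = E(d_r)
    p_real, _ = build_pairs(d_r, theta_r)
    p_fake = torch.cat([G(theta_f), theta_f], dim=1)
    mi = mi_estimate(T, d_r, theta_r, d_r[torch.randperm(B)])
    loss_EGT = -d_loss(D, p_real, p_fake) - lam * mi
    opt_EGT.zero_grad(); loss_EGT.backward(); opt_EGT.step()
```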
The method for enhancing topic diversity based on mutual information and an adversarial neural network improves the correlation between the topic distribution and the word distribution, and enhances topic diversity, by maximizing the mutual information between the two.
The invention provides an adversarial neural network method for enhancing the diversity of topic models. Topic coherence was tested on the 20Newsgroups dataset under five topic-number settings [20, 30, 50, 75, 100]. The average topic coherence values measured for the method are: C_P of 0.273, C_A of 0.206, UCI of 0.139, NPMI of 0.052 and UT of 0.761, each higher than the comparison experiments, in which the highest indexes are C_P of 0.260, C_A of 0.158, UCI of 0.09, NPMI of 0.047 and UT of 0.732.
Through the mutual-information-maximization mechanism, the invention helps the topic model learn richer and more diversified topic representations; maximizing the mutual information among different words in the text prompts the model to organize related words into topics that are more consistent and better differentiated. By optimizing the objective function with maximized mutual information, the model adapts better to task demands, improving its performance in tasks such as generation, classification and clustering.
The foregoing description is merely illustrative of the invention and is not to be construed as limiting it. Various modifications and variations of the invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement or the like made within the spirit and principles of the invention shall be included in the scope of the claims of the invention.

Claims (5)

1. A method for enhancing topic diversity based on mutual information and an adversarial neural network, characterized in that the method comprises the following steps:
step 1, carrying out data preprocessing on the online texts of a social platform to obtain real texts, and representing the real texts as real text-word distribution vectors by using the bag-of-words model;
step 2, placing a plurality of the real text-word distribution vectors from step 1 in the same batch as the input of an encoder to obtain real text-topic distribution vectors, forming real distribution pairs from the real text-word distribution vectors and the corresponding topic distributions, and splicing the in-batch shuffled real text-word distribution vectors with the real text-topic distribution vectors to form negative sample distribution pairs;
step 3, randomly sampling a topic vector from the Dirichlet distribution as the fake text-topic distribution, inputting it into a generator, and forming a fake distribution pair from the resulting fake text-word distribution vector and the fake text-topic distribution;
step 4, the discriminator receives the real distribution pairs obtained in step 2 and the fake distribution pairs generated in step 3 as its input and computes the losses of the real distribution pairs and the fake distribution pairs so as to distinguish the real distribution pairs from the generated data distribution pairs; a statistical network is introduced, which receives the real distribution pairs and the negative sample distribution pairs as input and computes the mutual information between them, and the regularized mutual-information loss is added to the discriminator loss so as to increase the perception capability of the discriminator and improve the quality and diversity of the generated samples;
step 5, using adversarial training to approximately estimate, during training, the earth mover's distance between the real distribution pairs and the fake distribution pairs and the Jensen-Shannon distance between the real distribution pairs and the negative sample distribution pairs, and optimizing the training objective and iterating the model through adversarial training until the loss function converges.
2. The method for enhancing topic diversity based on mutual information and an adversarial neural network according to claim 1, characterized in that: the encoder E in step 2 learns the mapping from the real text-word distribution vectors to the real text-topic distribution vectors and comprises a V-dimensional text-word distribution layer, an S-dimensional semantic-implicit representation layer and a K-dimensional text-topic distribution layer, specifically as follows:
step 2-1, the real text from step 1 is represented with the bag-of-words model, and a randomly sampled V-dimensional text-word distribution d_r is taken as input; the encoder E maps it into the S-dimensional implicit semantic space and then maps the S-dimensional implicit semantic space to the K-dimensional text-topic distribution layer, using the following formulas:

h_e = LeakyReLU_a(BN(W_1·d_r + b_1)),
θ_r = softmax(BN(W_2·h_e + b_2)),
where W_1 and b_1 are the weight matrix and bias term from the text-word distribution layer to the semantic-implicit representation layer, a is the slope parameter of the LeakyReLU activation function, BN(·) denotes batch normalization, W_2 and b_2 are the weight matrix and bias term from the semantic-implicit representation layer to the text-topic distribution layer, and θ_r is the text-topic distribution corresponding to the real text, whose k-th component θ_{r,k} indicates the proportion of the k-th topic in the real text;
step 2-2, the real V-dimensional word distribution vector and the real K-dimensional topic distribution vector are then concatenated into the real distribution pair p_r = [d_r; θ_r]; the in-batch shuffled real text-word distribution vector is denoted d̃_r, and the mismatched in-batch topic and word distributions form the negative sample distribution pair p_n = [d̃_r; θ_r];
3. The method for enhancing topic diversity based on mutual information and an adversarial neural network according to claim 2, characterized in that: the generator G in step 3 learns the mapping from text-topic distribution to text-word distribution and comprises a K-dimensional text-topic distribution layer, an S-dimensional semantic-implicit representation layer and a V-dimensional text-word distribution layer; a Dirichlet distribution with parameter α is used as the prior of the fake text-topic distribution θ_f, which is obtained by using the following formula:

p(θ_f | α) = ( Γ(Σ_{k=1..K} α_k) / Π_{k=1..K} Γ(α_k) ) · Π_{k=1..K} θ_{f,k}^{α_k − 1},
where α is the parameter of the Dirichlet probability density, K is the topic number of the model, and θ_f represents the proportion of each topic in the fake text;
step 3-1, the generator G first maps the fake text-topic distribution θ_f to the S-dimensional semantic-implicit representation layer, and then maps the obtained S-dimensional implicit semantic space to the V-dimensional text-word distribution layer:

h_g = LeakyReLU_a(BN(W_3·θ_f + b_3)),
d_f = softmax(BN(W_4·h_g + b_4)),
where W_3 and b_3 are the weight matrix and bias term from the text-topic distribution layer to the semantic-implicit representation layer, a is the slope parameter of the LeakyReLU activation function, BN(·) denotes batch normalization, W_4 and b_4 are the weight matrix and bias term from the semantic-implicit representation layer to the text-word distribution layer, and d_f is the fake text-word distribution, whose v-th component indicates the probability of the v-th vocabulary word in the fake text;
step 3-2, the fake text-topic distribution θ_f and the fake text-word distribution d_f are then concatenated to form the fake distribution pair p_f = [d_f; θ_f];
4. The method for enhancing topic diversity based on mutual information and an adversarial neural network according to claim 3, characterized in that: the real distribution pair p_r and the fake distribution pair p_f in step 4 are regarded as random samples drawn from two (V+K)-dimensional joint distributions P_r and P_f, each the joint of a V-dimensional Dirichlet-distributed word distribution and a K-dimensional Dirichlet-distributed topic distribution; the training goal of the discriminator D is to make the fake distribution pair approximate the real distribution pair; the statistical network T uses the real distribution pairs p_r and the negative sample distribution pairs p_n to estimate the mutual information between the text-word distribution space and the text-topic distribution space and maximizes it to promote topic diversity; when training is complete, the encoder E and the generator G obtain the bidirectional mapping between text-topic distribution and text-word distribution and the internal mutual-information-maximization relation, comprising the following steps:
Step 4-1, discriminatorConsists of three layers of fully connected networks, wherein the three layers of fully connected networks are specifically one +.>+/>A combination distribution layer of dimensions +.>Semantic-implicit representation layer of dimension, an output layer, to truly divideCloth pair->And pseudo distribution pair->For inputting and outputting +.>To judge the true or false of the input distribution pair, the method adopts the following formula:
where L_D approximates the earth mover's distance between the two joint distributions and D(·) is the output signal of the discriminator; a value close to 1 indicates that the discriminator tends to judge the input as real, and vice versa;
step 4-2, the statistical network T comprises a global discriminator T_ω and the mutual-information-maximization loss function; the global discriminator T_ω comprises a (V+K)-dimensional joint distribution layer, an S-dimensional semantic-implicit representation layer and an output layer; the statistical network T is used to compute the mutual information between the real sample pairs p_r and the negative sample pairs p_n and to output I(d; θ), using the following formula:

I(d; θ) = 𝔼_P[−sp(−T_ω(d_r, θ_r))] − 𝔼_Ñ[sp(T_ω(d̃_r, θ_r))],
where sp(x) = log(1 + e^x) denotes the softplus activation function and x denotes its input, d_r and θ_r denote the real data distribution of the text-word distribution layer and the real distribution of the text-topic distribution layer respectively, and d̃_r is a real text-word distribution in the same batch that does not match θ_r;
step 4-3, the final training objective of the model is as follows:

L = L_D − λ·I(d; θ),

where λ is a balance coefficient weighting the mutual-information regularization term against the adversarial loss.
5. The method for enhancing topic diversity based on mutual information and an adversarial neural network according to claim 4, characterized in that step 5 specifically comprises the following steps:
step 5-1, loading a data set comprising text data, a vocabulary and word vectors;
step 5-2, constructing the encoder E, the generator G, the discriminator D and the statistical network T, and constructing an optimizer to optimize the model;
step 5-3, the real distribution pairs p_r and the fake distribution pairs p_f are used as the input of the discriminator D, whose output signal guides the encoder E and the generator G, thereby mining the topics in the text;
step 5-4, the statistical network T takes the real sample pairs p_r and the negative sample distribution pairs p_n as input, estimates the mutual information between the text-word distribution space and the text-topic distribution space, and maximizes it to promote topic diversity;
step 5-5, performing stochastic gradient descent optimization according to the discriminator loss function and the regularized mutual-information loss function, and updating the parameters of the encoder and the generator, namely:

φ ← φ − η·∇_φ(L_D − λ·I(d; θ)),

for each trainable parameter φ, where η is the learning rate;
step 5-6, repeating the steps 5-3 to 5-5 until convergence.
CN202311524544.3A 2023-11-16 2023-11-16 Method for enhancing topic diversity based on mutual information and adversarial neural networks Active CN117236330B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311524544.3A CN117236330B (en) 2023-11-16 2023-11-16 Method for enhancing topic diversity based on mutual information and adversarial neural networks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311524544.3A CN117236330B (en) 2023-11-16 2023-11-16 Method for enhancing topic diversity based on mutual information and adversarial neural networks

Publications (2)

Publication Number Publication Date
CN117236330A true CN117236330A (en) 2023-12-15
CN117236330B CN117236330B (en) 2024-01-26

Family

ID=89095326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311524544.3A Active CN117236330B (en) 2023-11-16 2023-11-16 Method for enhancing topic diversity based on mutual information and adversarial neural networks

Country Status (1)

Country Link
CN (1) CN117236330B (en)



Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009187A (en) * 2017-02-20 2018-05-08 南京航空航天大学 A kind of short text Topics Crawling method for strengthening Text Representation
CN110532378A (en) * 2019-05-13 2019-12-03 南京大学 A kind of short text aspect extracting method based on topic model
CN110134786A (en) * 2019-05-14 2019-08-16 南京大学 A kind of short text classification method based on theme term vector and convolutional neural networks
US20200372225A1 (en) * 2019-05-22 2020-11-26 Royal Bank Of Canada System and method for controllable machine text generation architecture
CN110442781A (en) * 2019-06-28 2019-11-12 武汉大学 It is a kind of based on generate confrontation network to grade ranked items recommended method
CN110941721A (en) * 2019-09-28 2020-03-31 国家计算机网络与信息安全管理中心 Short text topic mining method and system based on variational self-coding topic model
US20210209416A1 (en) * 2020-03-20 2021-07-08 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for generating event theme
CN112100317A (en) * 2020-09-24 2020-12-18 南京邮电大学 Feature keyword extraction method based on theme semantic perception
CN116583880A (en) * 2020-09-29 2023-08-11 通用电气精准医疗有限责任公司 Multimodal image processing technique for training image data generation and use thereof for developing a unimodal image inference model
CN112597769A (en) * 2020-12-15 2021-04-02 中山大学 Short text topic identification method based on Dirichlet variational self-encoder
CN115878882A (en) * 2021-09-26 2023-03-31 微软技术许可有限责任公司 Hierarchical representation learning of user interests
US11640493B1 (en) * 2022-06-03 2023-05-02 Actionpower Corp. Method for dialogue summarization with word graphs
CN115099188A (en) * 2022-06-22 2022-09-23 南京邮电大学 Topic mining method based on word embedding and generating type neural network
CN115828931A (en) * 2023-02-09 2023-03-21 中南大学 Chinese and English semantic similarity calculation method for paragraph-level text
CN116467443A (en) * 2023-04-17 2023-07-21 西安理工大学 Topic identification-based online public opinion text classification method

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
CHING-SHENG LIN et al.: "Generative Adversarial Network for Joint Headline and Summary Generation", IEEE, vol. 10, p. 90745
吴少康 et al.: "Research on embedded topic models based on deep learning" (基于深度学习的嵌入式主题模型研究), 电脑知识与技术 (Computer Knowledge and Technology), vol. 18, no. 28, p. 7
夏家莉, 曹中华, 彭文忠, 张守胜: "Text topic modeling with Skip-Gram structure and word-embedding characteristics" (Skip-Gram结构和词嵌入特性的文本主题建模), 小型微型计算机系统 (Journal of Chinese Computer Systems), vol. 41, no. 7, p. 1400
孟祥福 et al.: "A survey of personalized news recommendation methods" (个性化新闻推荐方法研究综述), 计算机科学与探索 (Journal of Frontiers of Computer Science and Technology), vol. 17, no. 12, p. 2840
张俊三 et al.: "Medical image report generation based on a diversified label matrix" (基于多样化标签矩阵的医学影像报告生成), 计算机科学 (Computer Science), p. 1

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117808104A (en) * 2024-02-29 2024-04-02 南京邮电大学 Viewpoint mining method based on self-supervision expression learning and oriented to hot topics
CN117808104B (en) * 2024-02-29 2024-04-30 南京邮电大学 Viewpoint mining method based on self-supervision expression learning and oriented to hot topics

Also Published As

Publication number Publication date
CN117236330B (en) 2024-01-26

Similar Documents

Publication Publication Date Title
CN105975573B (en) A kind of file classification method based on KNN
CN104866810B (en) A kind of face identification method of depth convolutional neural networks
CN109992779B (en) Emotion analysis method, device, equipment and storage medium based on CNN
CN110472817A (en) A kind of XGBoost of combination deep neural network integrates credit evaluation system and its method
CN106649275A (en) Relation extraction method based on part-of-speech information and convolutional neural network
CN109255340A (en) It is a kind of to merge a variety of face identification methods for improving VGG network
CN108920445A (en) A kind of name entity recognition method and device based on Bi-LSTM-CRF model
CN117236330B (en) Mutual information and antagonistic neural network based method for enhancing theme diversity
CN109344759A A kind of relatives' recognition methods based on angle loss neural network
CN104572786A (en) Visualized optimization processing method and device for random forest classification model
CN110111848A (en) A kind of human cyclin expressing gene recognition methods based on RNN-CNN neural network fusion algorithm
CN108804595B (en) Short text representation method based on word2vec
CN101968853A (en) Improved immune algorithm based expression recognition method for optimizing support vector machine parameters
CN102662931A (en) Semantic role labeling method based on synergetic neural network
CN114169442A (en) Remote sensing image small sample scene classification method based on double prototype network
CN116467443A (en) Topic identification-based online public opinion text classification method
CN108520201A (en) A kind of robust human face recognition methods returned based on weighted blend norm
Zhang et al. Performance comparisons of Bi-LSTM and Bi-GRU networks in Chinese word segmentation
CN112489689B (en) Cross-database voice emotion recognition method and device based on multi-scale difference countermeasure
CN108509840B (en) Hyperspectral remote sensing image waveband selection method based on quantum memory optimization mechanism
CN109409231A (en) Multiple features fusion sign Language Recognition Method based on adaptive hidden Markov
CN109783586A Waterborne troops' comment detection system and method based on cluster resampling
Liu [Retracted] Art Painting Image Classification Based on Neural Network
Zhang et al. Improved deep learning model text classification
Li et al. Research on dual channel news headline classification based on ERNIE pre-training model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant