CN117236330B - Method for enhancing topic diversity based on mutual information and adversarial neural networks - Google Patents

Method for enhancing topic diversity based on mutual information and adversarial neural networks

Info

Publication number
CN117236330B
CN117236330B
Authority
CN
China
Prior art keywords
distribution
text
topic
real
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311524544.3A
Other languages
Chinese (zh)
Other versions
CN117236330A (en)
Inventor
王睿
郝仁
刘星
黄海平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202311524544.3A priority Critical patent/CN117236330B/en
Publication of CN117236330A publication Critical patent/CN117236330A/en
Application granted granted Critical
Publication of CN117236330B publication Critical patent/CN117236330B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention belongs to the technical field of natural language processing and discloses a method for enhancing topic diversity based on mutual information and an adversarial neural network, comprising the following steps: preprocessing the words in the corpus to obtain real text-word distributions; using randomly sampled corpus texts as encoder input to generate real text-topic distribution vectors; pairing real text-word distributions with their topic distributions to form real distribution pairs, and shuffling within the batch to form negative sample distribution pairs; sampling fake text-topic distributions from a Dirichlet distribution and converting them with a generator into fake text-word distribution vectors; generating topic words during adversarial training using the real and fake distribution pairs; and training with the objective of the discriminator loss function plus a regularization loss that maximizes mutual information. The invention models text topics, mines high-quality topics, integrates a mutual-information-maximization technique into the adversarial neural topic modeling process, enhances topic diversity, and achieves higher topic coherence and diversity scores.

Description

Method for enhancing topic diversity based on mutual information and adversarial neural networks
Technical Field
The invention belongs to the technical field of natural language processing, and particularly relates to a method for enhancing topic diversity based on mutual information and an adversarial neural network.
Background
The topic model is an important tool for text mining that uncovers hidden information in a corpus, with wide application in scenarios such as topic aggregation, information extraction from unstructured text, and feature selection. Latent Dirichlet allocation (LDA) is the most representative such model for inferring the topic distribution of text. However, because the model is complex to solve and difficult to adjust and extend, researchers must design a corresponding theoretical method for each variant, which hinders subsequent topic modeling at the application level.
To address the shortcomings of traditional topic models, and building on the rapid development of generative neural networks in recent years, neural topic models have attracted the attention of many scholars in the fields of text mining and natural language processing and have been studied intensively; for example, an adversarial neural topic model and a bidirectional adversarial neural topic model have been proposed based on adversarial training. These models use the Dirichlet distribution as the prior over the topic space, and their encoders and generators produce more realistic data distributions and more accurate topic representations, but they ignore valuable information between the generated data distribution and the real data distribution, resulting in insufficient topic diversity.
Disclosure of Invention
To solve the above technical problems, the invention provides a method for enhancing topic diversity based on mutual information and an adversarial neural network, which makes the implicit topic information in text obey a Dirichlet distribution and integrates a mutual-information-maximization mechanism into an adversarial neural topic modeling framework to improve the diversity of the topics mined by the model.
In order to achieve the above purpose, the invention is realized by the following technical scheme:
the invention relates to a method for enhancing theme diversity based on mutual information and an antagonistic neural network, which comprises the following steps:
S1: perform data preprocessing on online social-platform text to obtain real text, and represent the real text as real text-word distribution vectors using a bag-of-words model;
S2: place a number of real text-word distribution vectors in a batch as input to an encoder to obtain real text-topic distribution vectors; form real distribution pairs from the real text-word distribution vectors and their corresponding topic distributions, and form negative sample distribution pairs by shuffling the real text-word distribution vectors within the batch against the real text-topic distributions;
S3: randomly sample a topic vector from a Dirichlet distribution as a fake text-topic distribution, input it into a generator to obtain a fake text-word distribution vector, and combine the two into a fake distribution pair;
S4: use the real and fake distribution pairs as input to the adversarial generation network and the real and negative sample pairs as input to the statistical network; during adversarial training, train the encoder and generator with the signals produced by the adversary, with the discriminator loss regularized by the mutual-information-maximization term as the training objective.
S5: to approximate the earth mover's (Wasserstein) distance and the Jensen-Shannon divergence between the two high-dimensional distributions during training, repeatedly optimize and iterate the training objective during adversarial training until the loss function converges.
The invention is further improved in that: the encoder E in step 2 trains the mapping from the real text-word distribution vectors to the real text-topic distribution vectors, and comprises a V-dimensional text-word distribution layer, an S-dimensional semantic-implicit representation layer, and a K-dimensional text-topic distribution layer; the specific steps are:
S2.1: represent the real text from step 1 with the bag-of-words model and randomly sample a V-dimensional text-word distribution representation $x_r$ as input; the encoder E maps it to the S-dimensional implicit semantic space and then maps the resulting S-dimensional implicit semantic space to the K-dimensional text-topic distribution layer:

$$h_e = \mathrm{BN}\big(\mathrm{LR}(W_e^{(1)} x_r + b_e^{(1)})\big), \qquad \theta_r = \mathrm{softmax}\big(W_e^{(2)} h_e + b_e^{(2)}\big)$$

where $W_e^{(1)}$ is the weight matrix and $b_e^{(1)}$ the bias term from the text-word distribution layer to the semantic-implicit representation layer, LR is the LeakyReLU activation function, BN(·) is batch normalization, $W_e^{(2)}$ is the weight matrix and $b_e^{(2)}$ the bias term from the semantic-implicit representation layer to the text-topic distribution layer, and $\theta_r$ is the text-topic distribution corresponding to the real text, whose $k$-th dimension $\theta_r^{(k)}$, $k \in \{1,2,\dots,K\}$, indicates the proportion of the $k$-th topic in the real text;
S2.2: then splice the real V-dimensional word distribution vector and the real K-dimensional topic distribution vector into a real distribution pair $p_r = (x_r, \theta_r)$, denote the within-batch shuffled real text-word distribution vector as $\tilde{x}_r$, and form negative sample distribution pairs $p_n = (\tilde{x}_r, \theta_r)$ from the mismatched topic and word distributions within the batch.
The generator G in step 3 generates the mapping from text-topic distributions to text-word distributions, and comprises a K-dimensional text-topic distribution layer, an S-dimensional semantic-implicit representation layer, and a V-dimensional text-word distribution layer; a Dirichlet distribution with parameter $\alpha$ is used as the prior for the fake text-topic distribution $\theta_f$, obtained using the following formula:

$$p(\theta_f \mid \alpha) = \frac{\Gamma\big(\sum_{k=1}^{K}\alpha_k\big)}{\prod_{k=1}^{K}\Gamma(\alpha_k)} \prod_{k=1}^{K} \theta_{f,k}^{\,\alpha_k - 1}$$

where $\alpha$ is the parameter of the probability density of the Dirichlet distribution, $k \in \{1,2,\dots,K\}$ indexes the topic parameters of the model, and $\theta_{f,k}$ represents the proportion of the $k$-th topic in the fake text.
S3.1: the generator G first converts the fake text-topic distribution $\theta_f$ to the S-dimensional semantic-implicit representation layer, then maps the obtained S-dimensional implicit semantic space to the V-dimensional text-word distribution layer:

$$h_g = \mathrm{BN}\big(\mathrm{LR}(W_g^{(1)} \theta_f + b_g^{(1)})\big), \qquad x_f = \mathrm{softmax}\big(W_g^{(2)} h_g + b_g^{(2)}\big)$$

where $W_g^{(1)}$ is the weight matrix and $b_g^{(1)}$ the bias term from the text-topic distribution layer to the semantic-implicit representation layer, LR is the LeakyReLU activation function, BN(·) is batch normalization, $W_g^{(2)}$ is the weight matrix and $b_g^{(2)}$ the bias term from the semantic-implicit representation layer to the text-word distribution layer, and $x_f$ is the fake text-word distribution, whose $v$-th dimension represents the probability of the $v$-th vocabulary word in the fake text;
S3.2: then splice the fake text-topic distribution $\theta_f$ and the fake text-word distribution $x_f$ into a fake distribution pair $p_f = (x_f, \theta_f)$.
S4.1: the real distribution pairs $p_r$ and fake distribution pairs $p_f$ in step 4 are regarded as random samples drawn from two $(K+V)$-dimensional joint distributions $\mathbb{P}_r$ and $\mathbb{P}_f$, each a joint distribution composed of a K-dimensional Dirichlet distribution and a V-dimensional Dirichlet distribution. Training the adversarial generation network makes the fake joint distribution $\mathbb{P}_f$ approximate the real joint distribution $\mathbb{P}_r$; the statistical network uses real sample pairs $p_r$ and negative sample pairs $p_n$ to estimate the mutual information between the text-word distribution space and the text-topic distribution space and maximizes it to promote topic diversity. When training is complete, the encoder E and the generator G realize the bidirectional mapping relation and the internal mutual-information-maximization relation between the text-topic distributions and the text-word distributions; the specific steps are:
s4.2 discriminatorIs composed of three layers of full-connection networks including one->+/>A combination distribution layer of dimensions +.>Semantic-implicit representation layer of dimension, one output layer. In true distribution pair->And pseudo distribution pair->For inputting and outputting +.>To judge the true or false of the input distribution pair, the method adopts the following formula:
wherein,for bulldozer distance>For the output signal of the arbiter, a value close to 1 indicates that the arbiter is more prone to discriminate it as true, and vice versa;
s4.3 statistical networkComprising a global arbiter->And maximizing the mutual information loss function, global arbiter +.>Comprises a->+/>A combination distribution layer of dimensions +.>Semantic-implicit representation layer of dimension, one output layer. Statistical network->For calculating the true sample pair +.>And negative sample pair->Mutual trust between each otherOutput->The method adopts the following formula:
wherein,representation->Activating function->Input representing an activation function->And->Representing the true data distribution of the text-word distribution layer and the true distribution of the text-topic distribution layer, respectively,/->And->Representing distribution pairs of lot size->Is in the same batch (batch) and +.>Non-matching real text-word distribution.
S4.4: the final training objective of the model is:

$$\mathcal{L} = \mathcal{L}_D - \lambda\,\hat{I}(x_r; \theta_r)$$

where $\mathcal{L}_D$ is the adversarial discriminator loss of S4.2, $\hat{I}(x_r; \theta_r) = S_{out}$ is the mutual-information estimate of S4.3, and $\lambda$ is the weight of the mutual-information regularization term.
the step 5 specifically comprises the following steps:
step 5-1, loading the dataset including text data, vocabulary, and word vectors
Step 5-2, build GeneratorEncoder->Discriminator->(mutual information) statistical network->The model is optimized by constructing an optimizer;
step 5-3, true distribution pairsAnd pseudo-distribution pair->As a discriminator +.>Input, its output signal +.>Can guide encoder->And generator->And thereby mine the topics in the text.
Step 5-4, counting the networkBy using realitySample pair->And negative sample distribution pair->Mutual information between the text-word distribution and the text topic distribution space is estimated for input and maximized to promote topic diversity.
Step 5-5, performing random gradient descent optimization according to the loss function of the discriminator and the regularized mutual information loss function, and updating parameters of the encoder and the decoder, namely:
step 5-6, repeating step 5-3, step 5-4 and step 5-5 until convergence.
The beneficial effects of the invention are as follows: through the mutual-information-maximization mechanism, the invention helps the topic model learn richer and more diverse topic representations; maximizing the mutual information among different words in the text encourages the model to organize related words into more coherent and better-differentiated topics. By optimizing the objective function with maximized mutual information, the model adapts better to task demands, improving its performance in tasks such as generation, classification, and clustering. Experiments on the 20Newsgroups dataset show that, compared with other methods, the method achieves higher C_P, C_V, C_A, NPMI, and UT scores, significantly improving the quality of the mined topics.
Drawings
Fig. 1 is a model diagram of the present invention.
Fig. 2 is a specific training flow chart of the present invention.
Detailed Description
Embodiments of the invention are disclosed in the drawings, and for purposes of explanation, numerous practical details are set forth in the following description. However, it should be understood that these practical details are not to be taken as limiting the invention. That is, in some embodiments of the invention, these practical details are unnecessary.
As shown in figs. 1-2, the present invention is a method for enhancing topic diversity based on mutual information and adversarial neural networks, specifically comprising the steps of:
step 1, preprocessing an online text of a social platform to obtain a real text, and representing the real text sample into a real text-word distribution vector by using a word bag model method.
Step 2, take the real text-word distribution vectors from step 1 as input to the encoder to obtain the mapping to real text-topic distribution vectors; form real distribution pairs from the real text-word distribution vectors and their topic distributions, and form negative sample distribution pairs from within-batch shuffled real text-word distribution vectors and real text-topic distribution vectors.
The encoder E in step 2 trains the mapping from the real text-word distribution vectors to the real text-topic distribution vectors, and comprises a V-dimensional text-word distribution layer, an S-dimensional semantic-implicit representation layer, and a K-dimensional text-topic distribution layer; the specific steps are:
Step 2-1, represent the real text from step 1 with the bag-of-words model and randomly sample a V-dimensional text-word distribution representation $x_r$ as input; the encoder E maps it to the S-dimensional implicit semantic space and then maps the resulting S-dimensional implicit semantic space to the K-dimensional text-topic distribution layer:

$$h_e = \mathrm{BN}\big(\mathrm{LR}(W_e^{(1)} x_r + b_e^{(1)})\big), \qquad \theta_r = \mathrm{softmax}\big(W_e^{(2)} h_e + b_e^{(2)}\big)$$

where $W_e^{(1)}$ is the weight matrix and $b_e^{(1)}$ the bias term from the text-word distribution layer to the semantic-implicit representation layer, LR is the LeakyReLU activation function, BN(·) is batch normalization, $W_e^{(2)}$ is the weight matrix and $b_e^{(2)}$ the bias term from the semantic-implicit representation layer to the text-topic distribution layer, and $\theta_r$ is the text-topic distribution corresponding to the real text, whose $k$-th dimension $\theta_r^{(k)}$, $k \in \{1,2,\dots,K\}$, indicates the proportion of the $k$-th topic in the real text.
In the present example, the encoder network dimensions are V-S-K, where V is the word-vector dimension, S is the semantic hidden-layer dimension, and K is the topic-vector dimension.
Step 2-2, then splice the real V-dimensional word distribution vector and the real K-dimensional topic distribution vector into a real distribution pair $p_r = (x_r, \theta_r)$, denote the within-batch shuffled real text-word distribution vector as $\tilde{x}_r$, and form negative sample distribution pairs $p_n = (\tilde{x}_r, \theta_r)$ from the mismatched topic and word distributions within the batch.
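A minimal PyTorch sketch of the V-S-K encoder of steps 2-1 and 2-2 might look as follows; the softmax output layer and the LeakyReLU slope of 0.2 are assumptions chosen so that $\theta_r$ lies on the topic simplex, and all module names are illustrative.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """V -> S -> K encoder: text-word distribution to text-topic distribution.
    Sketch of steps 2-1/2-2; softmax output and slope 0.2 are assumed."""
    def __init__(self, V: int, S: int, K: int):
        super().__init__()
        self.hidden = nn.Linear(V, S)   # W_e1, b_e1
        self.bn = nn.BatchNorm1d(S)     # BN(.)
        self.act = nn.LeakyReLU(0.2)    # LR
        self.topic = nn.Linear(S, K)    # W_e2, b_e2

    def forward(self, x_r: torch.Tensor) -> torch.Tensor:
        h = self.act(self.bn(self.hidden(x_r)))
        return torch.softmax(self.topic(h), dim=1)  # theta_r

# Real pairs p_r = (x_r, theta_r); negative pairs p_n shuffle x_r in the batch
# so that word and topic distributions are mismatched.
def make_pairs(x_r: torch.Tensor, theta_r: torch.Tensor):
    perm = torch.randperm(x_r.size(0))
    p_real = torch.cat([x_r, theta_r], dim=1)       # (B, V+K)
    p_neg = torch.cat([x_r[perm], theta_r], dim=1)  # mismatched pair
    return p_real, p_neg
```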
Step 3, the generator G in step 3 generates the mapping from text-topic distributions to text-word distributions, and comprises a K-dimensional text-topic distribution layer, an S-dimensional semantic-implicit representation layer, and a V-dimensional text-word distribution layer; a Dirichlet distribution with parameter $\alpha$ is used as the prior for the fake text-topic distribution $\theta_f$, obtained using the following formula:

$$p(\theta_f \mid \alpha) = \frac{\Gamma\big(\sum_{k=1}^{K}\alpha_k\big)}{\prod_{k=1}^{K}\Gamma(\alpha_k)} \prod_{k=1}^{K} \theta_{f,k}^{\,\alpha_k - 1}$$

where $\alpha$ is the parameter of the probability density of the Dirichlet distribution, $k \in \{1,2,\dots,K\}$ indexes the topic parameters of the model, and $\theta_{f,k}$ represents the proportion of the $k$-th topic in the fake text.
Step 3-1, the generator G first converts the fake text-topic distribution $\theta_f$ to the S-dimensional semantic-implicit representation layer by the following transformation, and maps the obtained S-dimensional implicit semantic space to the V-dimensional text-word distribution layer:

$$h_g = \mathrm{BN}\big(\mathrm{LR}(W_g^{(1)} \theta_f + b_g^{(1)})\big), \qquad x_f = \mathrm{softmax}\big(W_g^{(2)} h_g + b_g^{(2)}\big)$$

where $W_g^{(1)}$ is the weight matrix and $b_g^{(1)}$ the bias term from the text-topic distribution layer to the semantic-implicit representation layer, LR is the LeakyReLU activation function, BN(·) is batch normalization, $W_g^{(2)}$ is the weight matrix and $b_g^{(2)}$ the bias term from the semantic-implicit representation layer to the text-word distribution layer, and $x_f$ is the fake text-word distribution, whose $v$-th dimension represents the probability of the $v$-th vocabulary word in the fake text.
In the present example, the generator network dimensions are K-S-V, where K is the topic-vector dimension, S is the semantic hidden-layer dimension, and V is the word-vector dimension.
Step 3-2, then splice the fake text-topic distribution $\theta_f$ and the fake text-word distribution $x_f$ into a fake distribution pair $p_f = (x_f, \theta_f)$.
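The generator and its Dirichlet prior (steps 3 through 3-2) admit a similar sketch; the concentration value alpha = 0.1 and the softmax output layer are assumptions, not values fixed by the patent.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """K -> S -> V generator: fake text-topic distribution to fake text-word
    distribution (steps 3-1/3-2). Softmax output is assumed so that x_f is
    a distribution over the V vocabulary words."""
    def __init__(self, K: int, S: int, V: int):
        super().__init__()
        self.hidden = nn.Linear(K, S)  # W_g1, b_g1
        self.bn = nn.BatchNorm1d(S)
        self.act = nn.LeakyReLU(0.2)
        self.word = nn.Linear(S, V)    # W_g2, b_g2

    def forward(self, theta_f: torch.Tensor) -> torch.Tensor:
        h = self.act(self.bn(self.hidden(theta_f)))
        return torch.softmax(self.word(h), dim=1)  # x_f

# Sample fake text-topic distributions from the Dirichlet prior; alpha = 0.1
# is an illustrative concentration parameter.
def sample_fake_pairs(generator: Generator, batch: int, K: int,
                      alpha: float = 0.1) -> torch.Tensor:
    prior = torch.distributions.Dirichlet(torch.full((K,), alpha))
    theta_f = prior.sample((batch,))
    x_f = generator(theta_f)
    return torch.cat([x_f, theta_f], dim=1)  # fake pair p_f, shape (B, V+K)
```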
The real distribution pairs $p_r$ and fake distribution pairs $p_f$ in step 4 are regarded as random samples drawn from two $(K+V)$-dimensional joint distributions $\mathbb{P}_r$ and $\mathbb{P}_f$, each a joint distribution composed of a K-dimensional Dirichlet distribution and a V-dimensional Dirichlet distribution. The training goal is to make the fake distribution pair $\mathbb{P}_f$ approximate the real distribution pair $\mathbb{P}_r$; the statistical network uses real distribution pairs $p_r$ and negative sample pairs $p_n$ to estimate the mutual information between the text-word distribution space and the text-topic distribution space and maximizes it to promote topic diversity. When training is complete, the encoder E and the generator G realize the bidirectional mapping relation and the internal mutual-information-maximization relation between the text-topic distributions and the text-word distributions; the specific steps are:
step 4-1, discriminatorIs composed of three layers of full-connection networks including one->+/>A combination distribution layer of dimensions +.>Semantic-implicit representation layer of dimension, one output layer. In true distribution pair->And pseudo distribution pair->For inputting and outputting +.>To judge the true or false of the input distribution pair, the method adopts the following formula:
wherein,distance to bulldozer,/>For the output signal of the arbiter, a value close to 1 indicates that the arbiter is more prone to discriminate it as true, and vice versa;
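One possible reading of the step 4-1 discriminator as a WGAN-style critic is sketched below; the unbounded linear output and the note about enforcing the Lipschitz constraint are assumptions layered on the earth mover's distance formulation, and the layer width is illustrative.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Three-layer critic over (V+K)-dimensional distribution pairs (step 4-1).
    The linear (unbounded) output follows the Wasserstein reading above."""
    def __init__(self, V: int, K: int, S: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(V + K, S),
            nn.LeakyReLU(0.2),
            nn.Linear(S, 1),
        )

    def forward(self, pair: torch.Tensor) -> torch.Tensor:
        return self.net(pair).squeeze(1)  # D_out

# Assumed WGAN-style critic loss approximating the earth mover's distance;
# in practice a gradient penalty or weight clipping would be added to
# enforce the 1-Lipschitz constraint on D.
def d_loss(D: Discriminator, p_real: torch.Tensor, p_fake: torch.Tensor):
    return D(p_fake).mean() - D(p_real).mean()
```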
step 4-2, counting the networkComprising a global arbiter->And maximizing mutual information loss function, global arbiterComprises a->+/>A combination distribution layer of dimensions +.>Semantic-implicit representation layer of dimension, an output layer, statistical networkFor calculating the true sample pair +.>And negative sample pair->Mutual information between them and output +.>The method adopts the following formula:
wherein,representation->Activating function->Input representing an activation function->And->Representing the true data distribution of the text-word distribution layer and the true distribution of the text-topic distribution layer, respectively,/->And->Representing distribution pairs of lot size->Is in the same batch (batch) and +.>Non-matching real text-word distribution.
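The statistical network and the softplus-based mutual-information estimate of step 4-2 could be sketched as follows; the estimator mirrors the Jensen-Shannon-style lower bound in the formula above, while the network width is an illustrative choice.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StatisticsNetwork(nn.Module):
    """Global discriminator D' of the statistical network H (step 4-2):
    scores (V+K)-dimensional joint pairs for the MI estimate."""
    def __init__(self, V: int, K: int, S: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(V + K, S),
            nn.LeakyReLU(0.2),
            nn.Linear(S, 1),
        )

    def forward(self, pair: torch.Tensor) -> torch.Tensor:
        return self.net(pair).squeeze(1)

# Softplus-based MI lower bound matching the step 4-2 formula: matched pairs
# are pulled up, shuffled (negative) pairs are pushed down; maximizing the
# returned value maximizes the mutual-information bound.
def mi_estimate(H: StatisticsNetwork, p_real: torch.Tensor, p_neg: torch.Tensor):
    pos = -F.softplus(-H(p_real)).mean()  # E[-sp(-D'(x_r, theta_r))]
    neg = F.softplus(H(p_neg)).mean()     # E[ sp( D'(x~_r, theta_r))]
    return pos - neg
```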
In summary, the final training objective of the model is:

$$\mathcal{L} = \mathcal{L}_D - \lambda\,\hat{I}(x_r; \theta_r)$$

where $\mathcal{L}_D$ is the adversarial discriminator loss of step 4-1, $\hat{I}(x_r; \theta_r) = S_{out}$ is the mutual-information estimate of step 4-2, and $\lambda$ is the weight of the mutual-information regularization term.
and 5, in order to approximate the bulldozer distance and the jensen-shannon distance between two high latitude distributions during training, repeatedly optimizing and iterating the training target in the countermeasure training process until the loss function converges.
Step 5-1, load the dataset, including text data, the vocabulary, and word vectors;
Step 5-2, build the generator G, the encoder E, the discriminator D, and the statistical network H, and construct an optimizer to optimize the model;
Step 5-3, feed the real distribution pairs $p_r$ and fake distribution pairs $p_f$ to the discriminator D as input; its output signal $D_{out}$ guides the learning of the encoder E and the generator G and thereby mines the topics in the text;
Step 5-4, the statistical network H takes real sample pairs $p_r$ and negative sample distribution pairs $p_n$ as input, estimates the mutual information between the text-word distribution and text-topic distribution spaces, and maximizes it to promote topic diversity;
Step 5-5, perform stochastic gradient descent optimization according to the discriminator loss function and the regularized mutual-information loss function, and update the parameters of the encoder and the generator;
Step 5-6, repeat step 5-3, step 5-4, and step 5-5 until convergence.
By maximizing the mutual information between the topic distribution and the word distribution, the method for enhancing topic diversity based on mutual information and adversarial neural networks improves the correlation between the two distributions and enhances topic diversity.
The invention provides an adversarial neural network method for enhancing the diversity of topic models. Topic coherence was tested on the 20Newsgroups dataset under 5 topic-number settings [20, 30, 50, 75, 100]. The average topic coherence values measured for the method are: C_P 0.273, CA 0.206, UCI 0.139, NPMI 0.052, and UT 0.761, all higher than those of the comparison experiments, in which the best scores are C_P 0.260, CA 0.158, UCI 0.09, NPMI 0.047, and UT 0.732.
Through the mutual-information-maximization mechanism, the invention helps the topic model learn richer and more diverse topic representations; maximizing the mutual information among different words in the text encourages the model to organize related words into more coherent and better-differentiated topics. By optimizing the objective function with maximized mutual information, the model adapts better to task demands, improving its performance in tasks such as generation, classification, and clustering.
The foregoing description is merely illustrative of the invention and is not to be construed as limiting it. Various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall be included in the scope of the claims of the present invention.

Claims (4)

1. A method for enhancing topic diversity based on mutual information and adversarial neural networks, characterized in that the method comprises the following steps:
step 1, performing data preprocessing on online texts of a social platform to obtain real texts, and representing the real texts as real text-word distribution vectors using a bag-of-words model;
step 2, placing a plurality of the real text-word distribution vectors from step 1 in the same batch as input to an encoder to obtain real text-topic distribution vectors, forming real distribution pairs from the real text-word distribution vectors and the corresponding topic distributions, and splicing the within-batch shuffled real text-word distribution vectors with the real text-topic distribution vectors to form negative sample distribution pairs;
step 3, randomly sampling a topic vector from a Dirichlet distribution as a fake text-topic distribution, inputting it into a generator to obtain a fake text-word distribution vector, and forming a fake distribution pair from the fake text-word distribution vector and the fake text-topic distribution;
step 4, the discriminator receiving the real distribution pairs obtained in step 2 and the fake distribution pairs generated in step 3 as input and computing the losses of the real and fake distribution pairs to distinguish the real data distribution pairs from the generated data distribution pairs, and introducing a statistical network, wherein the statistical network receives the real distribution pairs and the negative sample distribution pairs as input and computes the mutual information between them, the regularization loss of the mutual information being added to the loss of the discriminator;
step 5, using adversarial training to approximate the earth mover's (Wasserstein) distance between the real distribution pairs and the fake distribution pairs and the Jensen-Shannon divergence between the real distribution pairs and the negative sample distribution pairs, and optimizing the objective and iterating the model through adversarial training until the loss function converges, specifically comprising the following steps:
step 5-1, loading a data set comprising text data, a vocabulary and word vectors;
step 5-2, constructing an encoder E, a generator G, a discriminator D and a statistical network H model, and constructing an optimizer to optimize the model;
step 5-3, feeding the real distribution pairs $p_r$ and fake distribution pairs $p_f$ to the discriminator D as input, whose output signal $D_{out}$ during adversarial training guides the learning of the encoder E and the generator G so as to mine out the topics in the text;
step 5-4, the statistical network H taking real sample pairs $p_r$ and negative sample distribution pairs $p_n$ as input, estimating the mutual information between the text-word distribution and text-topic distribution spaces, and maximizing it to promote topic diversity;
step 5-5, performing stochastic gradient descent optimization according to the discriminator loss function and the regularized mutual-information loss function, and updating the parameters of the encoder and the generator;
step 5-6, repeating the steps 5-3 to 5-5 until convergence.
2. The method for enhancing topic diversity based on mutual information and adversarial neural networks according to claim 1, characterized in that: the encoder E in step 2 trains the mapping from the real text-word distribution vectors to the real text-topic distribution vectors, and comprises a V-dimensional text-word distribution layer, an S-dimensional semantic-implicit representation layer, and a K-dimensional text-topic distribution layer, specifically comprising the steps of:
step 2-1, representing the real text from step 1 with the bag-of-words model and randomly sampling a V-dimensional text-word distribution representation $x_r$ as input; the encoder E maps it to an S-dimensional latent semantic space, and then maps the resulting S-dimensional latent semantic space to a K-dimensional text-topic distribution layer, using the following formula:

$$h_e = \mathrm{BN}\big(\mathrm{LR}(W_e^{(1)} x_r + b_e^{(1)})\big), \qquad \theta_r = \mathrm{softmax}\big(W_e^{(2)} h_e + b_e^{(2)}\big)$$

wherein $W_e^{(1)}$ is the weight matrix and $b_e^{(1)}$ the bias term from the text-word distribution layer to the semantic-implicit representation layer, LR is the LeakyReLU activation function, BN(·) is batch normalization, $W_e^{(2)}$ is the weight matrix and $b_e^{(2)}$ the bias term from the semantic-implicit representation layer to the text-topic distribution layer, and $\theta_r$ is the text-topic distribution corresponding to the real text, whose $k$-th dimension $\theta_r^{(k)}$, $k \in \{1,2,\dots,K\}$, represents the proportion of the $k$-th topic in the real text;
step 2-2, then splicing the real V-dimensional word distribution vector and the real K-dimensional topic distribution vector into a real distribution pair $p_r = (x_r, \theta_r)$, representing the within-batch shuffled real text-word distribution vector as $\tilde{x}_r$, and forming negative sample distribution pairs $p_n = (\tilde{x}_r, \theta_r)$ from the mismatched topic and word distributions within the batch.
3. The method for enhancing topic diversity based on mutual information and adversarial neural networks according to claim 2, characterized in that: in step 3, the generator G generates the mapping from text-topic distributions to text-word distributions, and comprises a K-dimensional text-topic distribution layer, an S-dimensional semantic-implicit representation layer, and a V-dimensional text-word distribution layer; a Dirichlet distribution with parameter $\alpha$ is used as the prior for the fake text-topic distribution $\theta_f$, obtained using the following formula:

$$p(\theta_f \mid \alpha) = \frac{\Gamma\big(\sum_{k=1}^{K}\alpha_k\big)}{\prod_{k=1}^{K}\Gamma(\alpha_k)} \prod_{k=1}^{K} \theta_{f,k}^{\,\alpha_k - 1}$$

wherein $\alpha$ is the parameter of the probability density of the Dirichlet distribution, $k \in \{1,2,\dots,K\}$ indexes the topic parameters of the method for enhancing topic diversity, and $\theta_{f,k}$ represents the proportion of the $k$-th topic in the fake text;
step 3-1, the generator G first converts the fake text-topic distribution $\theta_f$ to the S-dimensional semantic-implicit representation layer by the following transformation, and maps the obtained S-dimensional implicit semantic space to the V-dimensional text-word distribution layer:

$$h_g = \mathrm{BN}\big(\mathrm{LR}(W_g^{(1)} \theta_f + b_g^{(1)})\big), \qquad x_f = \mathrm{softmax}\big(W_g^{(2)} h_g + b_g^{(2)}\big)$$

wherein $W_g^{(1)}$ is the weight matrix and $b_g^{(1)}$ the bias term from the text-topic distribution layer to the semantic-implicit representation layer, LR is the LeakyReLU activation function, BN(·) is batch normalization, $W_g^{(2)}$ is the weight matrix and $b_g^{(2)}$ the bias term from the semantic-implicit representation layer to the text-word distribution layer, and $x_f$ is the fake text-word distribution, whose $v$-th dimension represents the probability of the $v$-th vocabulary word in the fake text;
step 3-2, then splicing the fake text-topic distribution $\theta_f$ and the fake text-word distribution $x_f$ to form a fake distribution pair $p_f = (x_f, \theta_f)$.
4. The method for enhancing topic diversity based on mutual information and adversarial neural networks according to claim 3, characterized in that: the real distribution pairs $p_r$ and fake distribution pairs $p_f$ in step 4 are regarded as random samples drawn from two $(K+V)$-dimensional joint distributions $\mathbb{P}_r$ and $\mathbb{P}_f$, each a joint distribution composed of a K-dimensional Dirichlet distribution and a V-dimensional Dirichlet distribution; the training goal of the discriminator D is to make the fake distribution $\mathbb{P}_f$ approximate the real distribution $\mathbb{P}_r$; the statistical network H uses real distribution pairs $p_r$ and negative sample distribution pairs $p_n$ to estimate the mutual information between the text-word distribution space and the text-topic distribution space and maximizes it to improve topic diversity; when training is complete, the encoder E and the generator G obtain the bidirectional mapping relation and the internal mutual-information-maximization relation between the text-topic distributions and the text-word distributions, comprising the following steps of:
step 4-1, the discriminator D is composed of a three-layer fully connected network, specifically a $(V+K)$-dimensional joint distribution layer, an S-dimensional semantic-implicit representation layer, and an output layer; with the real distribution pairs $p_r$ and fake distribution pairs $p_f$ as input, it outputs $D_{out}$ to judge whether an input distribution pair is real or fake, using the following formula:

$$W(\mathbb{P}_r, \mathbb{P}_f) = \sup_{\|D\|_L \le 1} \; \mathbb{E}_{p_r \sim \mathbb{P}_r}[D(p_r)] - \mathbb{E}_{p_f \sim \mathbb{P}_f}[D(p_f)]$$

wherein W is the earth mover's (Wasserstein) distance and D(·) is the output signal of the discriminator; a value close to 1 indicates that the discriminator tends to judge the input pair as real, and vice versa;
step 4-2, the statistical network H comprises a global discriminator D′ and a maximized mutual-information loss function; the global discriminator D′ comprises a $(V+K)$-dimensional joint distribution layer, an S-dimensional semantic-implicit representation layer, and an output layer; the statistical network H is used to calculate the mutual information between real sample pairs $p_r$ and negative sample pairs $p_n$ and to output $S_{out}$, using the following formula:

$$S_{out} = \mathbb{E}_{p_r}\!\left[-\mathrm{sp}\big(-D'(x_r, \theta_r)\big)\right] - \mathbb{E}_{p_n}\!\left[\mathrm{sp}\big(D'(\tilde{x}_r, \theta_r)\big)\right]$$

$$\mathrm{sp}(x) = \log(1 + e^{x})$$

where sp(·) represents the softplus activation function, x represents the input of the activation function, $x_r$ and $\theta_r$ represent the real data distribution of the text-word distribution layer and the real distribution of the text-topic distribution layer respectively, and $\tilde{x}_r$ is a real text-word distribution in the same batch that does not match $\theta_r$;
step 4-3, the final training objective of the model is:

$$\mathcal{L} = \mathcal{L}_D - \lambda\,\hat{I}(x_r; \theta_r)$$

where $\mathcal{L}_D$ is the adversarial discriminator loss of step 4-1, $\hat{I}(x_r; \theta_r) = S_{out}$ is the mutual-information estimate of step 4-2, and $\lambda$ is the weight of the mutual-information regularization term.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311524544.3A CN117236330B (en) 2023-11-16 2023-11-16 Method for enhancing topic diversity based on mutual information and adversarial neural networks


Publications (2)

Publication Number Publication Date
CN117236330A (en) 2023-12-15
CN117236330B (en) 2024-01-26

Family

ID=89095326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311524544.3A Active CN117236330B (en) 2023-11-16 2023-11-16 Method for enhancing topic diversity based on mutual information and adversarial neural networks

Country Status (1)

Country Link
CN (1) CN117236330B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant