CN112148863B - Generation type dialogue abstract method integrated with common knowledge - Google Patents
- Publication number
- CN112148863B (application CN202011104023.9A)
- Authority
- CN
- China
- Prior art keywords
- dialogue
- node
- knowledge
- representation
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F16/3329—Information retrieval; Querying; Query formulation; Natural language query formulation or dialogue systems
- G06F40/295—Handling natural language data; Natural language analysis; Recognition of textual entities; Named entity recognition
- G06N3/044—Computing arrangements based on specific computational models; Neural networks; Architecture; Recurrent networks, e.g. Hopfield networks
- G06N3/045—Computing arrangements based on specific computational models; Neural networks; Architecture; Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Human Computer Interaction (AREA)
- Databases & Information Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Machine Translation (AREA)
Abstract
A generative dialogue summarization method incorporating commonsense knowledge, belonging to the field of natural language processing. The invention addresses the inaccuracy and low abstractiveness of generated dialogue summaries caused by existing generative summarization methods not exploiting commonsense knowledge. The method comprises: acquiring the commonsense knowledge base ConceptNet and the dialogue summarization dataset SAMSum; introducing tuple knowledge into the SAMSum dataset using ConceptNet to construct a heterogeneous dialogue graph; constructing a heterogeneous dialogue neural network model; and training that model, so that the trained model generates the final summary of a given dialogue. The invention is applied to dialogue summary generation.
Description
Technical Field
The invention relates to the field of natural language processing, and in particular to a generative dialogue summarization method incorporating commonsense knowledge.
Background
Automatic text summarization based on natural language processing (Automatic Summarization) [1] (title: Constructing literature abstracts by computer: techniques and prospects; author: Chris D. Paice; year: 1990; published in Information Processing & Management) means, given a text record of a multi-person conversation, generating a short textual description containing the key information of the conversation; FIG. 1 shows a multi-person conversation and its corresponding reference summary.
For dialogue summarization, existing work has mostly focused on the generative (abstractive) approach, i.e. allowing the final summary to contain novel words and phrases not present in the original text. For example, Liu et al. [2] (title: Automatic dialogue summary generation for customer service; author: Chunyi Liu; year: 2019; published in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining) adopt a multi-step generation scheme for the customer-service dialogue summarization task; Liu et al. [3] (author: Zhengyuan Liu; year: 2019; published as an arXiv preprint) integrate topic information to model the conversation for the doctor-patient dialogue summarization task and generate the final summary; and Ganesh et al. [4] (title: Abstractive summarization of spoken and written conversation; author: Prakhar Ganesh; year: 2019; published as an arXiv preprint) use the discourse structure of the conversation as a rule to remove useless sentences before generating the summary. Recently, tasks such as dialogue response generation [5] (title: Commonsense knowledge aware conversation generation with graph attention; author: Hao Zhou; year: 2018; published in IJCAI) and dialogue context modeling [6] (title: Multi-task pre-training for multi-role dialogue representation learning; author: Tianyi Wang; year: 2020; published in AAAI) have shown that, although existing neural summarization models possess strong learning ability, they ignore the use of commonsense knowledge. On the one hand, the model therefore cannot understand the dialogue text well and generates summaries of low quality; on the other hand, the lack of commonsense knowledge leads to generated summaries of low abstractiveness.
Incorporating explicit commonsense knowledge can help a model complete its task better: a dialogue summarizer fused with commonsense knowledge can grasp the high-level meaning behind the conversation, and the knowledge can also serve as a bridge between incoherent sentences, helping the model understand the conversation better. Existing dialogue summarization systems, however, have overlooked the use of commonsense knowledge.
Commonsense knowledge can help a dialogue summarization system generate a higher-quality summary. As shown in FIG. 1, from "pick up" and "ride" it can be inferred that Tom expects Bob to give him a lift, and introducing the explicit commonsense knowledge "ride" helps generate a better dialogue summary. After commonsense knowledge is merged in, the three types of data in a dialogue (speakers, sentences and commonsense knowledge) can be modeled with a heterogeneous graph neural network, from which the final summary is generated.
Disclosure of Invention
The invention aims to solve the inaccuracy and low abstractiveness of generated dialogue summaries caused by existing generative dialogue summarization methods not exploiting commonsense knowledge, and proposes a generative dialogue summarization method incorporating commonsense knowledge.
A generative dialogue summarization method incorporating commonsense knowledge comprises the following steps:
Step 1: acquire the commonsense knowledge base ConceptNet and the dialogue summarization dataset SAMSum. The commonsense knowledge contained in ConceptNet exists in the form of tuples, i.e. a piece of tuple knowledge is expressed as:
R = (h, r, t, w),
where R denotes a piece of tuple knowledge; h denotes the head entity; r denotes the relation; t denotes the tail entity; and w denotes a weight expressing the confidence of the relation. The knowledge R states that head entity h and tail entity t have relation r with weight w.
The dialogue summarization dataset SAMSum is divided into three parts: training, development and test.
Step 2: introduce tuple knowledge into the dialogue summarization dataset SAMSum using the acquired commonsense knowledge base ConceptNet, and construct a heterogeneous dialogue graph.
Step 3: construct the heterogeneous dialogue neural network model, comprising a node encoder, a graph encoder and a decoder:
Step 3.1: construct the node encoder, obtaining node initialization representations and word initialization representations with a bidirectional long short-term memory network;
Step 3.2: construct the graph encoder, updating the node representations with a heterogeneous graph neural network, adding node position encoding information and updating the word representations;
Step 3.3: construct the decoder.
Step 4: train the heterogeneous dialogue neural network model constructed in Step 3, and generate the final dialogue summary from a dialogue through the trained model.
Advantageous effects
A dialogue summarizer incorporating commonsense knowledge can help the model understand the high-level meaning behind the conversation;
the commonsense knowledge can also serve as a bridge between incoherent sentences, helping the model understand the conversation better;
by introducing commonsense knowledge, the model can be helped to generate more abstractive and better-generalized summaries.
The invention introduces commonsense knowledge into the dialogue summarization task, models the three types of data in a dialogue (speakers, sentences and commonsense knowledge) as a heterogeneous dialogue graph, and models the whole graph with a heterogeneous graph neural network.
The whole model adopts a graph-to-sequence framework to generate the final dialogue summary, addressing the neglect of commonsense knowledge in existing generative dialogue summarization. In experiments, the method generated more abstractive and more accurate summaries that better condensed the dialogue content, demonstrating its effectiveness, and it achieved better results than existing methods on the ROUGE evaluation metric.
ROUGE is a recall-based similarity measure, a family of metrics for evaluating automatic summarization and machine translation that examines the adequacy and faithfulness of the output; higher values are better. ROUGE-1, ROUGE-2 and ROUGE-L are computed over unigrams, bigrams and the longest common subsequence, respectively.
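As a rough illustration (not the official ROUGE toolkit, and the function name is illustrative), ROUGE-1 recall can be sketched as the clipped unigram overlap between candidate and reference, divided by the number of reference unigrams:

```python
from collections import Counter

def rouge_1_recall(candidate, reference):
    """Toy ROUGE-1 recall: overlapping unigram count (clipped per word)
    divided by the number of unigrams in the reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    return overlap / max(sum(ref.values()), 1)

# 4 of the 5 reference unigrams appear in the candidate -> 0.8
score = rouge_1_recall("lara will pick up the cake", "lara will pick it up")
```

ROUGE-2 follows the same recipe over bigrams, and ROUGE-L uses the longest common subsequence instead of n-gram overlap.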
Drawings
FIG. 1 is a diagram of a multi-person conversation and its corresponding reference summary;
FIG. 2 is an example of a dialogue-summary pair from the SAMSum dialogue summarization dataset;
FIG. 3 is an example of a SAMSum dataset dialogue-summary pair;
FIG. 4 shows related knowledge triples obtained from ConceptNet;
FIG. 5 is a sentence-knowledge graph constructed by the present invention;
FIG. 6 is a speaker-sentence graph constructed by the present invention;
FIG. 7 is a heterogeneous dialogue graph constructed by the present invention;
FIG. 8 is a schematic diagram of the model of the invention, showing (a) heterogeneous dialogue graph construction, (b) the node encoder, (c) the graph encoder, and (d) the decoder.
Detailed Description
Embodiment 1: this embodiment is a generative dialogue summarization method incorporating commonsense knowledge, comprising the following steps.
Step 1: acquire the large-scale commonsense knowledge base ConceptNet and the dialogue summarization dataset SAMSum.
Step 1.1: acquire the large-scale commonsense knowledge base ConceptNet:
The large-scale commonsense knowledge base ConceptNet is obtained from http://conceptnet.io; the commonsense knowledge it contains exists in the form of tuples, i.e. a piece of tuple knowledge can be expressed as:
R = (h, r, t, w),
where R denotes a piece of tuple knowledge; h denotes the head entity; r denotes the relation; t denotes the tail entity; and w denotes a weight expressing the confidence of the relation. The knowledge R states that head entity h and tail entity t have relation r with weight w. For example, R = (call, related, contact, 10) means that "call" stands in the relation "related" to "contact" with weight 10. Commonsense knowledge in tuple form is available at scale via the website http://conceptnet.io.
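The tuple form R = (h, r, t, w) can be sketched with a small record type; the field and type names below are illustrative, not from the patent:

```python
from collections import namedtuple

# A piece of tuple knowledge R = (h, r, t, w): head entity, relation,
# tail entity, and a weight w expressing the confidence of the relation.
Knowledge = namedtuple("Knowledge", ["h", "r", "t", "w"])

# The example from the text: "call" is related to "contact" with weight 10.
R = Knowledge(h="call", r="related", t="contact", w=10)
```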
Step 1.2: acquire the dialogue summarization dataset SAMSum:
The dialogue summarization dataset SAMSum can be obtained from https://arxiv.org/abs/1911.12237; it is divided into three parts (training, development and test) containing 14732, 818 and 819 dialogues respectively, a fixed split that is the uniform standard. The dataset mainly covers topics such as chit-chat among participants, and each conversation has a corresponding reference summary; FIG. 2 shows an example from this dataset.
introducing tuple knowledge into a dialogue summary data set SAMSum by using the obtained large-scale common sense knowledge base conceptNet, and constructing a heterogeneous dialogue graph;
constructing a dialogue heterogeneous neural network model; the model mainly comprises three parts: the node encoder, the graph encoder and the decoder are three parts;
step three, constructing a node encoder, and acquiring node initialization representation by using a bidirectional long-short time neural network (Bi-LSTM)And word initialization representations(whereinAndall updated in step three or two);
step three, constructing a graph encoder, updating node representation by utilizing a heterogeneous graph neural network, and adding node position coding information and updating word representation
Thirdly, constructing a decoder, and generating an abstract by using a unidirectional long-short time memory network (LSTM) decoder;
step four: and training the dialogue heterogeneous neural network model constructed in the third step, and generating a final dialogue abstract from a dialogue through the trained dialogue heterogeneous neural network model.
Embodiment 2: in Step 2, the acquired large-scale commonsense knowledge base ConceptNet is used to introduce tuple knowledge into the dialogue summarization dataset SAMSum and construct a heterogeneous dialogue graph; the specific process is as follows.
Step 2.1: acquire dialogue-related knowledge. For a dialogue, the method first acquires a series of related tuple knowledge from ConceptNet according to the words in the dialogue and eliminates noise knowledge, finally obtaining the set of tuple knowledge related to the given dialogue, as shown in FIG. 4.
Step 2.2: construct the sentence-knowledge graph:
For the related tuple knowledge acquired in Step 2.1, if there are sentences A and B, a word a belonging to sentence A and a word b belonging to sentence B, and the tail entities of the knowledge related to a and b coincide, then both sentences A and B are connected to that shared tail entity, yielding the sentence-knowledge graph. For example, in FIG. 5, sentence A is "Do you have Betty's number?" and sentence B is "Did Lao last call her?"; the words a and b are "number" and "call" respectively; given the related tuple knowledge (number, place, phonebook) and (call, related, phonebook), sentences A and B are connected to the entity "phonebook".
The commonsense knowledge obtained in this way suffers from redundancy and repetition, so the tuple knowledge also needs to be simplified; simplification introduces cleaner, higher-quality commonsense knowledge:
(1) if sentences A and B are connected to several entities, the one whose relations have the highest average weight is selected; for example, as shown in FIG. 5, the average weight of "phonebook" is greater than that of "date", so "phonebook" is selected;
(2) if different sentences are each connected to the same entity, those entities are merged into one; for example, as shown in FIG. 5, the two "contact" entities are merged into a single "contact" entity.
step two, constructing a speaker-sentence subgraph:
establishing an edge relation between the speaker and the sentence according to the 'one sentence of the speaker' to obtain a speaker-sentence subgraph, as shown in fig. 6;
step two, fusing a sentence-knowledge graph and a speaker-sentence graph:
in the sentence-knowledge graph and the speaker-sentence subgraph, the sentence parts are the same, so the sentence parts are merged, and the sentence-knowledge graph and the speaker-sentence subgraph are fused into a final heterogeneous dialogue graph; the constructed heterogeneous dialogue graph has two edges between a speaker and a sentence, namely a 'speak-by' edge from the speaker to the sentence and a 'rev-speak-by' edge from the sentence to the speaker; there are two kinds of edges between sentences and knowledge, from knowledge to "knock-by" edge of a sentence, there are three kinds of nodes from the constructed heterogeneous dialogue graph: speaker, sentence, general knowledge.
Other steps and parameters are the same as those in the first embodiment.
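The graph construction of Embodiment 2 can be sketched in code. This is a minimal illustration under assumptions: the containers, function name and sample data are all hypothetical, not from the patent:

```python
# Hypothetical sketch of the heterogeneous dialogue graph: three node
# types (speaker, sentence, knowledge) and four edge types, stored as a
# node-type map plus a list of (source, edge_type, target) triples.
def build_dialogue_graph(utterances, sentence_knowledge):
    """utterances: list of (speaker, sentence_id) pairs;
    sentence_knowledge: dict mapping sentence_id -> knowledge entities."""
    nodes, edges = {}, []
    for speaker, sent in utterances:
        nodes[speaker] = "speaker"
        nodes[sent] = "sentence"
        edges.append((speaker, "speak-by", sent))      # speaker -> sentence
        edges.append((sent, "rev-speak-by", speaker))  # sentence -> speaker
    for sent, entities in sentence_knowledge.items():
        for ent in entities:
            nodes[ent] = "knowledge"
            edges.append((ent, "know-by", sent))       # knowledge -> sentence
            edges.append((sent, "rev-know-by", ent))   # sentence -> knowledge
    return nodes, edges

nodes, edges = build_dialogue_graph(
    [("Tom", "s1"), ("Bob", "s2")],
    {"s1": ["phonebook"], "s2": ["phonebook"]},  # shared entity bridges s1, s2
)
```

A shared knowledge node such as "phonebook" ends up connected to both sentences, which is exactly how the knowledge acts as a bridge between otherwise incoherent utterances.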
Embodiment 3: this embodiment differs from Embodiments 1 and 2 in Step 3.1, constructing the node encoder and obtaining the node initialization representations h_{v_i}^0 and word initialization representations h_{w_{i,n}}^0 with a bidirectional long short-term memory network (Bi-LSTM); the specific process is as follows.
In the heterogeneous dialogue graph of Step 2, each node v_i contains |v_i| words, with word sequence (w_{i,1}, ..., w_{i,|v_i|}), where w_{i,n} denotes the n-th word of node v_i and n ∈ [1, |v_i|]. A Bi-LSTM is applied to the word sequence, generating a forward hidden-state sequence (fh_{i,1}, ..., fh_{i,|v_i|}) and a backward hidden-state sequence (bh_{i,1}, ..., bh_{i,|v_i|}), where the forward hidden state fh_{i,n} = LSTM_fwd(x_n, fh_{i,n-1}), the backward hidden state bh_{i,n} = LSTM_bwd(x_n, bh_{i,n+1}), and x_n denotes the word vector representation of w_{i,n}. The last forward hidden state is spliced with the first backward hidden state to obtain the node initialization representation h_{v_i}^0 = [fh_{i,|v_i|}; bh_{i,1}], where [;] denotes vector splicing; at the same time the initialization representation of each word in the node is obtained as h_{w_{i,n}}^0 = [fh_{i,n}; bh_{i,n}], as shown in FIG. 8.
Other steps and parameters are the same as those in the first or second embodiment.
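The splicing logic of the node encoder can be illustrated with a toy bidirectional encoder. As an assumption made only to keep the sketch self-contained, a simple tanh recurrent cell with fixed scalar weights stands in for the Bi-LSTM cell:

```python
import math

def rnn_step(x, h, w_x=0.5, w_h=0.3):
    # Simplified recurrent cell standing in for an LSTM cell:
    # elementwise h' = tanh(w_x * x + w_h * h).
    return [math.tanh(w_x * xi + w_h * hi) for xi, hi in zip(x, h)]

def encode_node(word_vectors):
    """Run the word sequence forwards and backwards. The node representation
    splices the last forward state with the first backward state; each word
    representation splices the forward and backward states at its position."""
    dim = len(word_vectors[0])
    fwd, h = [], [0.0] * dim
    for x in word_vectors:
        h = rnn_step(x, h)
        fwd.append(h)
    bwd, h = [None] * len(word_vectors), [0.0] * dim
    for n in range(len(word_vectors) - 1, -1, -1):
        h = rnn_step(word_vectors[n], h)
        bwd[n] = h
    node_repr = fwd[-1] + bwd[0]                    # [fh_last ; bh_first]
    word_reprs = [f + b for f, b in zip(fwd, bwd)]  # [fh_n ; bh_n]
    return node_repr, word_reprs

node_repr, word_reprs = encode_node([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```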
Embodiment 4: this embodiment differs from Embodiments 1 to 3 in Step 3.2, constructing the graph encoder, updating node representations with a heterogeneous graph neural network, adding node position encoding information and updating the word representations h_{w_{i,n}}; the specific process is as follows.
Given a target node t, obtain its neighbor nodes s ∈ N(t), where N(t) denotes the neighbor set of t and s denotes one of the neighbors; a given edge e = (s, t) points from neighbor node s to target node t. Define:
(1) the node type mapping function:
τ: V → A,
where τ denotes the node type mapping function, v ∈ V denotes a given node, V denotes the node set and A denotes the set of node types; the heterogeneous dialogue graph constructed in Step 2 contains three node types: speaker, sentence and commonsense knowledge;
(2) the edge type mapping function:
φ: E → B,
where φ denotes the edge type mapping function, e ∈ E denotes a given edge, E denotes the edge set and B denotes the set of edge types.
The heterogeneous dialogue graph constructed in Step 2 contains four edge types: speak-by, rev-speak-by, know-by and rev-know-by. For a given edge e = (s, t), s and t each possess a representation h_s^{l-1} and h_t^{l-1} from the previous layer. First h_s^{l-1} and h_t^{l-1} are mapped to a key and a query:
k_s^l = K-Linear_{τ(s)}^l(h_s^{l-1}), q_t^l = Q-Linear_{τ(t)}^l(h_t^{l-1}),
where K-Linear_{τ(s)}^l and Q-Linear_{τ(t)}^l denote type- and layer-specific mapping functions, l denotes the l-th layer of the graph network, k_s^l is the key representation of neighbor node s at layer l, and q_t^l is the query representation of node t at layer l. The attention weight is then
α(s, e, t) = k_s^l W_{φ(e)}^l (q_t^l)^T,
where W_{φ(e)}^l denotes a learnable parameter matrix depending on the layer and the edge type, T denotes transposition, and α(s, e, t) denotes the weight between k_s^l and q_t^l.
After the weight between each neighbor node s and the target node t is obtained, all weights are normalized:
ATT^l(s, e, t) = Softmax_{s∈N(t)}(α(s, e, t)),
where Softmax is the normalization function and ATT^l(s, e, t) is the final normalized score. The message of each neighbor is
Msg^l(s, e, t) = M-Linear_{τ(s)}^l(h_s^{l-1}),
where M-Linear_{τ(s)}^l is a type- and layer-specific mapping function and Msg^l(s, e, t) is the message representation of neighbor node s at layer l.
When the type of target node t is not a sentence node, the normalized scores ATT^l(s, e, t) serve as weights for summing the message vectors Msg^l(s, e, t):
z_t^l = ⊕_{s∈N(t)} ATT^l(s, e, t) · Msg^l(s, e, t),
where ⊕ denotes summation, · denotes multiplication, and z_t^l is a representation fusing all neighbors of t.
When the type of target node t is a sentence node, information fusion distinguishes the neighbor type τ(s):
z_t^l = [⊕_{s_k} ATT^l(s_k, e, t) · Msg^l(s_k, e, t); ⊕_{s_s} ATT^l(s_s, e, t) · Msg^l(s_s, e, t)],
where τ(s) denotes the neighbor type, s_k denotes neighbors of type knowledge and s_s denotes neighbors of type speaker; the two cases are mapped separately and combined to obtain z_t^l. It is then mapped to the updated node representation:
h_t^l = Sigmoid(A-Linear_{τ(t)}^l(z_t^l)),
where Sigmoid denotes the activation function and A-Linear_{τ(t)}^l denotes a type- and layer-specific mapping function.
Next, position information is fused into the updated node representation: each node v_i is associated with a position p_{v_i}. For speaker nodes and knowledge nodes the position p_{v_i} is set to 0; for sentence nodes, p_{v_i} is the position of the sentence in the dialogue, i.e. the sentence is the p_{v_i}-th sentence. A position vector matrix W_pos is set so that each position p_{v_i} has a corresponding vector representation W_pos[p_{v_i}], which is fused into the node representation to give the updated representation:
h_{v_i} = h_{v_i}^L + W_pos[p_{v_i}].
Finally, the updated node representation h_{v_i} is spliced with the corresponding initialization word representation h_{w_{i,n}}^0 and mapped to give the updated word representation:
h_{w_{i,n}} = F-Linear([h_{v_i}; h_{w_{i,n}}^0]),
where F-Linear() denotes a mapping function and [;] denotes vector splicing.
Other steps and parameters are the same as those in one of the first to third embodiments.
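The neighbor-aggregation step of the graph encoder can be sketched in miniature. As simplifying assumptions, the type- and layer-specific linear maps are omitted and the score k W q^T is reduced to a plain dot product; all names and numbers are illustrative:

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(target_query, neighbors):
    """neighbors: list of (key_vector, message_vector) pairs for one target.
    Scores are key . query dot products (standing in for k W q^T); the
    normalized scores weight the summed message vectors, as in the text."""
    scores = [sum(k * q for k, q in zip(key, target_query))
              for key, _ in neighbors]
    att = softmax(scores)
    dim = len(neighbors[0][1])
    fused = [sum(a * msg[d] for a, (_, msg) in zip(att, neighbors))
             for d in range(dim)]
    return att, fused

# One target node with two neighbors; the first neighbor's key aligns
# better with the query, so it receives the larger attention weight.
att, fused = aggregate([1.0, 0.0],
                       [([2.0, 0.0], [1.0, 1.0]), ([0.0, 2.0], [3.0, -1.0])])
```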
Embodiment 5: this embodiment differs from Embodiments 1 to 4 in the construction of the decoder in Step 3.3:
After the updated word representations h_{w_{i,n}} are obtained, the mean s_0 of the representations of all words is computed, s_0 = mean_{v_i∈G, n}(h_{w_{i,n}}), where G denotes the set of all nodes in the heterogeneous dialogue graph; s_0 is assigned to the cell state and hidden state of the decoder to initialize the decoder's initial state. At each decoding step, a context vector c_t is computed from the decoder state s_t:
e_{i,n}^t = s_t^T W_a h_{w_{i,n}} (11)
a_t = Softmax(e_t) (12)
c_t = Σ_{i,n} a_{i,n}^t h_{w_{i,n}} (13)
where W_a denotes a learnable parameter and h_{w_{i,n}} is the updated word representation; T denotes transposition; e_{i,n}^t is the unnormalized weight of the n-th word of the i-th node; s_t is the decoder state at time t; a_t are the weights after normalization; e_t are the weights before normalization; c_t is the context vector representation; and a_{i,n}^t is the normalized weight of the n-th word of the i-th node.
From the context vector c_t and the decoder state s_t at time t, the probability P_vocab of generating each word in the vocabulary is computed:
P_vocab(w) = Softmax(V'(V[s_t; c_t] + b) + b') (14)
where V', V, b, b' are learnable parameters; [s_t; c_t] denotes the splicing of s_t and c_t; Softmax is the normalization function; and P_vocab(w) denotes the probability of generating word w.
Besides generating words from the vocabulary, the model also allows words to be copied from the original text. First, the probability p_gen of generating a word is computed:
p_gen = Sigmoid(w_c^T c_t + w_s^T s_t + w_x^T x_t + b_ptr) (15)
where w_c, w_s, w_x and b_ptr are learnable parameters; Sigmoid is the activation function; p_gen denotes the probability of generating a word; 1 - p_gen denotes the probability of copying from the original text; w_c^T, w_s^T and w_x^T are the transposes of w_c, w_s and w_x; and x_t is the word vector of the word input to the decoder at time t.
Thus, for a word w, the probability of generation from the vocabulary and the probability of copying from the original text are considered together, and the final probability is given by equation (16):
P(w) = p_gen · P_vocab(w) + (1 - p_gen) · Σ_{i,n: w_{i,n}=w} a_{i,n}^t (16)
According to equation (16), at each decoding step the decoder selects the word with the highest probability as output.
Other steps and parameters are the same as in one of the first to fourth embodiments.
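The copy mechanism of equations (14) to (16) can be sketched as follows; the toy distributions and words are invented purely for illustration:

```python
def final_distribution(p_gen, p_vocab, attention, source_words):
    """Mix the generation distribution with the copy distribution:
    P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention mass on
    the source positions holding w (equation (16) in the text)."""
    final = {w: p_gen * p for w, p in p_vocab.items()}
    for a, w in zip(attention, source_words):
        final[w] = final.get(w, 0.0) + (1 - p_gen) * a
    return final

# Toy decoding step: vocabulary probabilities plus attention over a
# 3-word source; "Lara" is out-of-vocabulary but can still be copied.
dist = final_distribution(
    p_gen=0.6,
    p_vocab={"ride": 0.5, "party": 0.5},
    attention=[0.7, 0.2, 0.1],
    source_words=["Lara", "ride", "cake"],
)
best = max(dist, key=dist.get)
```

Note how the out-of-vocabulary source word "Lara" still receives probability mass through the copy term, which is the point of the pointer mechanism.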
Embodiment 6: this embodiment differs from Embodiments 1 to 5 in Step 4, training the heterogeneous dialogue neural network model constructed in Step 3 and generating the final summary of a dialogue through the trained model; the specific process is as follows.
The heterogeneous dialogue neural network model is trained by maximum likelihood estimation on the training part of the SAMSum dataset; at each decoding step of the decoder, the cross-entropy loss is computed from the word probability predicted by equation (16) and the reference word:
L = -Σ_{t=1}^{T} log P(y_t^*) (17)
where y_1^* is the first word of the reference summary; y_T^* is the last word of the reference summary; y_t^* is the reference word to be predicted at time t; and L is the cross-entropy loss function.
The model is trained according to equation (17), the best model is selected on the development part of the SAMSum dataset, and finally the trained model generates the final dialogue summaries for the test part of the SAMSum dataset according to equation (16).
Other steps and parameters are the same as those in one of the first to fifth embodiments.
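The training objective of equation (17) can be sketched as a sum of negative log-probabilities of the reference words; the step distributions below are invented for illustration:

```python
import math

def sequence_nll(step_distributions, reference):
    """Cross-entropy (negative log-likelihood) of the reference summary:
    at each decoding step, take -log of the probability the model assigned
    to the reference word, and sum over steps (equation (17) in the text)."""
    loss = 0.0
    for dist, word in zip(step_distributions, reference):
        loss += -math.log(dist.get(word, 1e-12))  # epsilon guards log(0)
    return loss

# Two toy decoding steps; the second step is fully confident.
steps = [{"lara": 0.5, "gary": 0.5}, {"will": 1.0}]
loss = sequence_nll(steps, ["lara", "will"])  # = -log(0.5) - log(1.0)
```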
Embodiment 7: this embodiment differs from Embodiments 1 to 6 in the method of eliminating noise knowledge:
(1) a piece of tuple knowledge is excluded if its weight w is lower than 1;
(2) a piece of tuple knowledge is excluded if its relation r belongs to: antonym, etymologically related, etymologically derived from, distinct from, or not desired.
Other steps and parameters are the same as those in one of the first to sixth embodiments.
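The two filtering rules above can be sketched as a predicate over tuples; the relation spellings in the excluded set are assumed ConceptNet-style names, not taken verbatim from the patent:

```python
# Hypothetical filter for noisy tuples, following the two rules above:
# drop low-confidence tuples (w < 1) and tuples whose relation falls in
# an excluded set (assumed ConceptNet-style relation names).
EXCLUDED_RELATIONS = {"Antonym", "EtymologicallyRelatedTo",
                      "EtymologicallyDerivedFrom", "DistinctFrom",
                      "NotDesires"}

def keep_tuple(h, r, t, w):
    return w >= 1 and r not in EXCLUDED_RELATIONS

kept = [k for k in [("call", "RelatedTo", "contact", 10),   # kept
                    ("good", "Antonym", "bad", 5),          # excluded relation
                    ("call", "RelatedTo", "shout", 0.5)]    # weight too low
        if keep_tuple(*k)]
```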
Embodiment 8: this embodiment differs from Embodiments 1 to 7 in the process of simplifying tuple knowledge:
(1) if sentences A and B are connected to several entities, the one whose relations have the highest average weight is selected; for example, as shown in FIG. 5, the average weight of "phonebook" is greater than that of "date", so "phonebook" is selected;
(2) if different sentences are each connected to the same entity, those entities are merged into one; for example, as shown in FIG. 5, the two "contact" entities are merged into a single "contact" entity.
Other steps and parameters are the same as those in one of the first to seventh embodiments.
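Rule (1) can be sketched as selecting the entity with the highest average relation weight; the weights below are invented for illustration:

```python
# Hypothetical sketch of simplification rule (1): when a sentence pair is
# connected to several candidate entities, keep the entity whose relations
# have the highest average weight. Input maps entity -> relation weights.
def select_entity(candidates):
    return max(candidates,
               key=lambda ent: sum(candidates[ent]) / len(candidates[ent]))

# As in the FIG. 5 example: "phonebook" outweighs "date" on average.
chosen = select_entity({"phonebook": [6.0, 8.0], "date": [2.0, 4.0]})
```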
Examples
Example 1:
The invention implements the proposed model and compares it with a current baseline model and the reference summary.
(1) Summary generated by the baseline model:
Gary and Lara will meet at 5pm for Tom's bday party.
(2) Summary generated by the model of the invention:
Gary and Lara are going to Tom's birthday party at 5pm.Lara will pick up the cake.
(3) Reference summary:
It’s Tom's birthday.Lara and Gary will come to Tom's place about 5pm to prepare everything.Gary has already paid for the cake Lara will pick it.
From the above example it can be seen that the model of the invention generates results closer to the reference summary, and that introducing commonsense knowledge helps it understand the dialogue information better.
The present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof, and it is therefore intended that all such changes and modifications be considered as within the spirit and scope of the appended claims.
Claims (9)
1. A method for generating a dialog abstract merged with common knowledge is characterized by comprising the following steps:
step one, acquiring the common-sense knowledge base ConceptNet and the dialogue summary data set SAMSum; the common-sense knowledge contained therein exists in the form of tuples, i.e. tuple knowledge, expressed as:
R = (h, r, t, w),
wherein R denotes a piece of tuple knowledge; h denotes the head entity; r denotes the relation; t denotes the tail entity; w denotes a weight representing the confidence of the relation; the knowledge R states that the head entity h and the tail entity t stand in relation r with weight w;
the dialogue abstract data set SAMSum is divided into three parts, namely training, development and testing;
step two, introducing tuple knowledge into the dialogue summary data set SAMSum by using the acquired common-sense knowledge base ConceptNet, and constructing a heterogeneous dialogue graph;
step three, constructing a dialogue heterogeneous neural network model; the dialogue heterogeneous neural network model comprises a node encoder, a graph encoder and a decoder;
step 3.1, constructing a node encoder, and obtaining node initialization representations h_{v_i} and word initialization representations h_{w_{i,n}} using a bidirectional long short-term memory (Bi-LSTM) network;
step 3.2, constructing a graph encoder, updating the node representations with a heterogeneous graph neural network, adding node position-encoding information, and updating the word representations h'_{w_{i,n}};
step 3.3, constructing a decoder;
and step four, training the dialogue heterogeneous neural network model constructed in the step three, and generating a final dialogue abstract from a section of dialogue through the trained dialogue heterogeneous neural network model.
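The tuple-knowledge format R = (h, r, t, w) from step one can be illustrated with a minimal Python sketch (not part of the patent; the example entity names and weight are hypothetical, not actual ConceptNet values):

```python
from typing import NamedTuple

class KnowledgeTuple(NamedTuple):
    """One piece of tuple knowledge R = (h, r, t, w)."""
    head: str      # head entity h
    relation: str  # relation r
    tail: str      # tail entity t
    weight: float  # weight w, the confidence of the relation

# Illustrative tuple: "phone" stands in a RelatedTo relation with "phonebook".
r = KnowledgeTuple(head="phone", relation="RelatedTo", tail="phonebook", weight=2.0)
```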
2. The method for generating dialogue summary merged with general knowledge according to claim 1, wherein in the second step, the obtained general knowledge base ConceptNet is used to introduce tuple knowledge to the dialogue summary data set SAMSum to construct a heterogeneous dialogue graph; the specific process is as follows:
step 2.1, for a dialogue, retrieving from ConceptNet the tuple knowledge related to the words in the dialogue and eliminating noise knowledge, to obtain a tuple knowledge set related to the given dialogue;
step 2.2, assuming that sentence A and sentence B appear in the tuple knowledge acquired in step 2.1, with word a ∈ A and word b ∈ B, simplifying the tuple knowledge: if the tail entities t retrieved for a and b coincide, connecting sentences A and B to that tail entity t, thereby obtaining a sentence-knowledge graph;
step 2.3, establishing edge relations between each speaker and the sentences that the speaker utters, to obtain a speaker-sentence graph;
step 2.4, fusing the sentence-knowledge graph and the speaker-sentence graph into a heterogeneous dialogue graph; the heterogeneous dialogue graph has two kinds of edges between a speaker and a sentence, namely a "speak-by" edge from the speaker to the sentence and a "rev-speak-by" edge from the sentence to the speaker; there are two kinds of edges between a sentence and tuple knowledge, namely a "know-by" edge from the knowledge to the sentence and a "rev-know-by" edge from the sentence to the knowledge; the heterogeneous dialogue graph contains three types of nodes, namely speakers, sentences, and common-sense knowledge.
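The heterogeneous graph of step 2.4 — three node types, four edge types, paired forward/reverse edges — can be sketched as a small Python structure (a simplification for illustration only; the class name and storage layout are this sketch's own choices, not the patent's):

```python
from collections import defaultdict

class DialogueGraph:
    """Toy heterogeneous dialogue graph: typed nodes, typed directed edges."""
    NODE_TYPES = ("speaker", "sentence", "knowledge")
    EDGE_TYPES = ("speak-by", "rev-speak-by", "know-by", "rev-know-by")

    def __init__(self):
        self.node_type = {}             # node id -> node type
        self.edges = defaultdict(list)  # edge type -> list of (src, dst)

    def add_node(self, nid, ntype):
        assert ntype in self.NODE_TYPES
        self.node_type[nid] = ntype

    def add_speaker_edge(self, speaker, sentence):
        # each speaker-sentence pair gets a forward and a reverse edge
        self.edges["speak-by"].append((speaker, sentence))
        self.edges["rev-speak-by"].append((sentence, speaker))

    def add_knowledge_edge(self, knowledge, sentence):
        self.edges["know-by"].append((knowledge, sentence))
        self.edges["rev-know-by"].append((sentence, knowledge))

g = DialogueGraph()
g.add_node("Lara", "speaker")
g.add_node("s1", "sentence")
g.add_node("cake", "knowledge")
g.add_speaker_edge("Lara", "s1")
g.add_knowledge_edge("cake", "s1")
```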
3. The method for generating a dialogue summary incorporating common-sense knowledge according to claim 2, wherein step 3.1 constructs the node encoder and obtains the node initialization representations h_{v_i} and the word initialization representations h_{w_{i,n}} using a bidirectional LSTM; the specific process is as follows:
for the constructed heterogeneous dialogue graph, each node v_i contains |v_i| words, with word sequence (w_{i,1}, ..., w_{i,|v_i|}), where w_{i,n} denotes the n-th word of node v_i, n ∈ [1, |v_i|]; the bidirectional LSTM runs over the word sequence and generates a forward hidden-state sequence (→h_{i,1}, ..., →h_{i,|v_i|}) and a backward hidden-state sequence (←h_{i,1}, ..., ←h_{i,|v_i|}), where the forward hidden state →h_{i,n} = LSTM(→h_{i,n-1}, x_n), the backward hidden state ←h_{i,n} = LSTM(←h_{i,n+1}, x_n), and x_n is the word-vector representation of w_{i,n}; the last representation of the forward hidden states is concatenated with the first representation of the backward hidden states to obtain the node initialization representation h_{v_i} = [→h_{i,|v_i|} ; ←h_{i,1}], where [·;·] denotes vector concatenation; similarly, the initialization representation of each word in the node is obtained as h_{w_{i,n}} = [→h_{i,n} ; ←h_{i,n}].
4. The method for generating a dialogue summary incorporating common-sense knowledge according to claim 1 or 2, wherein step 3.2 constructs the graph encoder, updates the node representations with a heterogeneous graph neural network, adds node position-encoding information, and updates the word representations h'_{w_{i,n}}; the specific process is as follows:
given a target node t and a neighbor node s ∈ N(t), where N(t) denotes the set of neighbor nodes of t and s denotes one neighbor node, and given an edge e = (s, t) pointing from the neighbor node s to the target node t, define:
(1) the node type mapping function:
τ(v): V → 𝒜,
wherein τ denotes the node type mapping function; v denotes a given node; V denotes the node set; 𝒜 denotes the set of node types; the heterogeneous dialogue graph constructed in step two contains three node types: speaker, sentence, and common-sense knowledge;
(2) the edge relation type mapping function:
φ(e): E → ℛ,
wherein φ denotes the edge type mapping function; e denotes a given edge; E denotes the edge set; ℛ denotes the set of edge types;
in the heterogeneous dialogue graph there are four types of edges in total: speak-by, rev-speak-by, know-by, and rev-know-by; for a given edge e = (s, t), s and t each possess a representation H^{(l-1)}[s] and H^{(l-1)}[t] from the previous layer, which are mapped as
K^{(l)}(s) = K-Linear^{(l)}_{τ(s)}(H^{(l-1)}[s]),
Q^{(l)}(t) = Q-Linear^{(l)}_{τ(t)}(H^{(l-1)}[t]),
wherein K-Linear^{(l)} and Q-Linear^{(l)} denote mapping functions related to the layer number, and the subscripts τ(s), τ(t) indicate that they are related to the node type; l denotes the l-th layer of the network; K^{(l)}(s) denotes the key representation of the neighbor node s at layer l; Q^{(l)}(t) denotes the query representation of the node t at layer l;
α(s, e, t) = K^{(l)}(s) · W^{ATT,(l)}_{φ(e)} · Q^{(l)}(t)^T,
wherein W^{ATT,(l)}_{φ(e)} denotes a learnable parameter related to the layer number and the edge type; T denotes transposition; α(s, e, t) denotes the weight between K^{(l)}(s) and Q^{(l)}(t);
after the weight between each neighbor node s and the target node t is obtained, all weights are normalized:
ATT^{(l)}(s, e, t) = Softmax_{∀s∈N(t)}(α(s, e, t)),
wherein Softmax is the normalization function and ATT^{(l)}(s, e, t) is the final normalized score;
when the type of the target node t is not a sentence node, the normalized scores ATT^{(l)}(s, e, t) are used as weights for the message vectors Msg^{(l)}(s, e, t), which are summed to obtain
H̃^{(l)}[t] = ⊕_{s∈N(t)} ( ATT^{(l)}(s, e, t) · Msg^{(l)}(s, e, t) ),
wherein ⊕ denotes summation and · denotes multiplication; H̃^{(l)}[t] is a representation fusing all neighbor nodes of t;
when the type of the target node t is a sentence node, the neighbor nodes are distinguished by type for information fusion: the aggregate over knowledge-type neighbors s_k (equation (6)) and the aggregate over speaker-type neighbors s_s are computed separately and fused into H̃^{(l)}[t], wherein τ(s) denotes the type of a neighbor node, s_k denotes a neighbor node whose type is knowledge, and s_s denotes a neighbor node whose type is speaker;
the node representation is then updated as
H^{(l)}[t] = A-Linear^{(l)}_{τ(t)}( Sigmoid( H̃^{(l)}[t] ) ),
wherein Sigmoid denotes the activation function and A-Linear^{(l)}_{τ(t)} denotes a mapping function related to the node type and the layer number;
position information is integrated into the updated node representations: each node v_i is associated with a position p_{v_i}; for speaker nodes and knowledge nodes the position is p_{v_i} = 0; for sentence nodes, p_{v_i} is the position of the sentence in the dialogue, i.e. which sentence it is;
a position vector matrix W_pos is set, so that for each position p_{v_i} its corresponding vector representation e_{p_{v_i}} can be obtained; e_{p_{v_i}} is merged into the node representation to obtain the updated representation:
H[v_i] = H^{(l)}[v_i] + e_{p_{v_i}};
the updated node representation H[v_i] is concatenated with the corresponding word initialization representation h_{w_{i,n}} and mapped to obtain the updated word representation:
h'_{w_{i,n}} = F_Linear([H[v_i] ; h_{w_{i,n}}]),
wherein F_Linear() denotes a mapping function and [·;·] denotes vector concatenation.
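The attention-then-weighted-sum pattern of claim 4 can be sketched in plain Python (a simplification: the score here is a plain key·query dot product, omitting the per-edge-type learnable parameter W^{ATT} of the claim, and the message vectors are given directly rather than computed from node features):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [x / s for x in e]

def aggregate(target_q, neighbors):
    """neighbors: list of (key_vec, msg_vec) pairs for each neighbor node.
    Scores are key·query dot products; the normalized scores weight the
    summed message vectors, mirroring H~[t] = sum(ATT(s,e,t) * Msg(s,e,t))."""
    scores = [sum(k * q for k, q in zip(key, target_q)) for key, _ in neighbors]
    att = softmax(scores)
    dim = len(neighbors[0][1])
    fused = [sum(att[j] * neighbors[j][1][i] for j in range(len(neighbors)))
             for i in range(dim)]
    return att, fused

# Two neighbors; the first key aligns with the query, so it gets more weight.
att, fused = aggregate([1.0, 0.0],
                       [([1.0, 0.0], [2.0, 0.0]),
                        ([0.0, 1.0], [0.0, 2.0])])
```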
5. The method for generating a dialogue summary incorporating common-sense knowledge according to claim 4, wherein step 3.3 constructs the decoder; the specific process is as follows:
after the updated word representations h'_{w_{i,n}} are obtained, the mean s_0 of the representations of all words is computed:
s_0 = (1/N) Σ_{v_i∈G} Σ_{n=1}^{|v_i|} h'_{w_{i,n}}, with N = Σ_{v_i∈G} |v_i|,
wherein G denotes the set of all nodes in the heterogeneous dialogue graph;
s_0 is assigned to the cell state and the hidden state of the decoder to initialize the decoder; at each decoding step, an attention mechanism computes a context vector c_t from the decoder state s_t:
e^t_{i,n} = (h'_{w_{i,n}})^T · W_a · s_t (11)
a^t = Softmax(e^t) (12)
c_t = Σ_i Σ_n a^t_{i,n} · h'_{w_{i,n}} (13)
wherein W_a denotes a learnable parameter; h'_{w_{i,n}} is the updated word representation; T denotes transposition; e^t_{i,n} is the unnormalized weight of the n-th word of the i-th node; s_t is the decoder state at time t; a^t is the weight vector after normalization and e^t the weight vector before normalization; c_t is the context vector representation; a^t_{i,n} is the normalized weight of the n-th word of the i-th node;
from the context vector c_t and the decoder state s_t at time t, the probability P_vocab of generating each word in the vocabulary is computed:
P_vocab(w) = Softmax(V'(V[s_t ; c_t] + b) + b') (14)
wherein V', V, b, b' are learnable parameters; [s_t ; c_t] denotes the concatenation of s_t and c_t; Softmax is the normalization function; P_vocab(w) denotes the probability of generating the word w;
in addition to generating words from the vocabulary, copying words from the source text is allowed; first, the probability p_gen of generating a word is computed:
p_gen = Sigmoid(w_c^T c_t + w_s^T s_t + w_x^T x_t + b_ptr) (15)
wherein w_c, w_s, w_x and b_ptr are learnable parameters; Sigmoid is the activation function; p_gen denotes the probability of generating a word, and 1 - p_gen denotes the probability of copying from the source text; w_c^T, w_s^T, w_x^T denote the transposes of w_c, w_s, w_x; x_t is the word vector of the decoder input word at time t;
the final probability is given by equation (16):
P(w) = p_gen · P_vocab(w) + (1 - p_gen) · Σ_{(i,n): w_{i,n}=w} a^t_{i,n} (16)
according to equation (16), at each decoding step the decoder selects the word with the highest probability as output.
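The mixture of generation and copying in equation (16) can be sketched directly (a simplification with toy probabilities; the vocabulary, attention weights, and p_gen value below are illustrative, not model outputs):

```python
def final_distribution(p_vocab, attn, src_words, p_gen):
    """Eq. (16) sketch: P(w) = p_gen * P_vocab(w)
                              + (1 - p_gen) * sum of attention on copies of w.

    p_vocab:   dict word -> generation probability (sums to 1)
    attn:      normalized attention weights over source positions (sums to 1)
    src_words: the word at each source position
    p_gen:     probability of generating rather than copying"""
    p = {w: p_gen * pv for w, pv in p_vocab.items()}
    for a, w in zip(attn, src_words):
        p[w] = p.get(w, 0.0) + (1.0 - p_gen) * a
    return p

p = final_distribution(p_vocab={"cake": 0.5, "party": 0.5},
                       attn=[0.7, 0.3], src_words=["cake", "Lara"],
                       p_gen=0.8)
# "Lara" is out-of-vocabulary but still receives copy probability.
```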
6. The method for generating dialogue summary fused with general knowledge according to claim 5, wherein the step four trains the dialogue heterogeneous neural network model constructed in the step three, and generates the final dialogue summary from a dialogue by the trained dialogue heterogeneous neural network model; the specific process is as follows:
using maximum likelihood estimation, the dialogue heterogeneous neural network model is trained on the training portion of the SAMSum data set; at each decoding step, the cross-entropy loss is computed from the word probabilities predicted according to equation (16) and the reference words:
L = - Σ_{t=1}^{T} log P(w*_t) (17)
wherein w*_1 is the first word of the reference summary; w*_T is the last word of the reference summary; w*_t is the reference-summary word to be predicted at time t; L is the cross-entropy loss function;
the dialogue-heterogeneous neural network model is trained according to equation (17), the best model is selected using the development part of the SAMSum data set, and finally the final dialogue summary is generated using the trained dialogue-heterogeneous neural network model according to equation (16) for the test part of the SAMSum data set.
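The loss of equation (17) reduces to summing negative log-probabilities over decoding steps, as in this sketch (the probabilities are illustrative placeholders, not model outputs):

```python
import math

def summary_nll(step_probs):
    """Eq. (17) sketch: cross-entropy over decoding steps, where
    step_probs[t] is the model probability P(w*_t) assigned to the
    reference word at step t."""
    return -sum(math.log(p) for p in step_probs)

# -(ln 0.5 + ln 0.25) = ln 8
loss = summary_nll([0.5, 0.25])
```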
7. The method for generating a dialogue summary incorporating common-sense knowledge according to claim 2, wherein the method for eliminating noise knowledge is:
(1) when the weight w of a piece of tuple knowledge is lower than 1, the tuple knowledge is excluded;
(2) when the relation r of a piece of tuple knowledge is an antonym, etymologically-related, etymologically-derived-from, distinct-from, or not-desires relation, the knowledge is excluded.
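The two filtering rules of claim 7 can be combined into one predicate (a sketch; the relation identifiers are this sketch's approximations of the corresponding ConceptNet relation names, which are an assumption here):

```python
# Assumed relation identifiers for the excluded relation kinds.
EXCLUDED_RELATIONS = {"Antonym", "EtymologicallyRelatedTo",
                      "EtymologicallyDerivedFrom", "DistinctFrom", "NotDesires"}

def keep_tuple(head, relation, tail, weight):
    """Rule (1): drop tuples with weight w < 1.
    Rule (2): drop tuples whose relation r is in the excluded set."""
    return weight >= 1.0 and relation not in EXCLUDED_RELATIONS

kept = keep_tuple("cake", "RelatedTo", "birthday", 2.0)      # passes both rules
low_weight = keep_tuple("cake", "RelatedTo", "birthday", 0.5)  # fails rule (1)
bad_relation = keep_tuple("cake", "Antonym", "pie", 3.0)       # fails rule (2)
```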
8. The method for generating dialogue summary merged with common sense knowledge according to claim 2, wherein the process of simplifying tuple knowledge comprises:
(1) if sentences A and B are connected by a plurality of entities, the one whose edge relations have the highest average weight is selected;
(2) if different sentence pairs are connected to entities with the same name, all same-named entities are merged into one entity.
9. The method for generating a dialogue summary incorporating common-sense knowledge according to claim 1, wherein the numbers of dialogues in the training, development and test portions of SAMSum are 14732, 818 and 819, respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011104023.9A CN112148863B (en) | 2020-10-15 | 2020-10-15 | Generation type dialogue abstract method integrated with common knowledge |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112148863A CN112148863A (en) | 2020-12-29 |
CN112148863B true CN112148863B (en) | 2022-07-01 |
Family
ID=73952047
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011104023.9A Active CN112148863B (en) | 2020-10-15 | 2020-10-15 | Generation type dialogue abstract method integrated with common knowledge |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112148863B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112765344B (en) * | 2021-01-12 | 2022-07-08 | 哈尔滨工业大学 | Method, device and storage medium for generating meeting abstract based on meeting record |
CN112818113A (en) * | 2021-01-26 | 2021-05-18 | 山西三友和智慧信息技术股份有限公司 | Automatic text summarization method based on heteromorphic graph network |
CN113204627B (en) * | 2021-05-13 | 2022-08-23 | 哈尔滨工业大学 | Dialog summary generation system using DialloGPT as feature annotator |
CN113553804A (en) * | 2021-07-15 | 2021-10-26 | 重庆邮电大学 | Single document text summarization system based on heterogeneous graph transform |
CN114328956B (en) * | 2021-12-23 | 2023-02-28 | 北京百度网讯科技有限公司 | Text information determination method and device, electronic equipment and storage medium |
CN114580439B (en) * | 2022-02-22 | 2023-04-18 | 北京百度网讯科技有限公司 | Translation model training method, translation device, translation equipment and storage medium |
CN114626368B (en) * | 2022-03-18 | 2023-06-09 | 中国电子科技集团公司第十研究所 | Method and system for acquiring rule common sense knowledge in vertical field |
CN115905513B (en) * | 2023-02-22 | 2023-07-14 | 中国科学技术大学 | Dialogue abstracting method based on denoising type question and answer |
CN116541505B (en) * | 2023-07-05 | 2023-09-19 | 华东交通大学 | Dialogue abstract generation method based on self-adaptive dialogue segmentation |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110348016A (en) * | 2019-07-15 | 2019-10-18 | 昆明理工大学 | Text snippet generation method based on sentence association attention mechanism |
CN110609891A (en) * | 2019-09-18 | 2019-12-24 | 合肥工业大学 | Visual dialog generation method based on context awareness graph neural network |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2595541A1 (en) * | 2007-07-26 | 2009-01-26 | Hamid Htami-Hanza | Assisted knowledge discovery and publication system and method |
US10114148B2 (en) * | 2013-10-02 | 2018-10-30 | Nec Corporation | Heterogeneous log analysis |
US10055486B1 (en) * | 2014-08-05 | 2018-08-21 | Hrl Laboratories, Llc | System and method for real world event summarization with microblog data |
CN107403375A (en) * | 2017-04-19 | 2017-11-28 | 北京文因互联科技有限公司 | A kind of listed company's bulletin classification and abstraction generating method based on deep learning |
CN108763333B (en) * | 2018-05-11 | 2022-05-17 | 北京航空航天大学 | Social media-based event map construction method |
CN109344391B (en) * | 2018-08-23 | 2022-10-21 | 昆明理工大学 | Multi-feature fusion Chinese news text abstract generation method based on neural network |
US10885281B2 (en) * | 2018-12-06 | 2021-01-05 | International Business Machines Corporation | Natural language document summarization using hyperbolic embeddings |
CN110929024B (en) * | 2019-12-10 | 2021-07-02 | 哈尔滨工业大学 | Extraction type text abstract generation method based on multi-model fusion |
CN111026861B (en) * | 2019-12-10 | 2023-07-04 | 腾讯科技(深圳)有限公司 | Text abstract generation method, training device, training equipment and medium |
CN111339754B (en) * | 2020-03-04 | 2022-06-21 | 昆明理工大学 | Case public opinion abstract generation method based on case element sentence association graph convolution |
CN111460132B (en) * | 2020-03-10 | 2021-08-10 | 哈尔滨工业大学 | Generation type conference abstract method based on graph convolution neural network |
CN111460135B (en) * | 2020-03-31 | 2023-11-07 | 北京百度网讯科技有限公司 | Method and device for generating text abstract |
CN111639176B (en) * | 2020-05-29 | 2022-07-01 | 厦门大学 | Real-time event summarization method based on consistency monitoring |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||