CN112148863B - Generative dialogue summarization method incorporating commonsense knowledge - Google Patents

Generative dialogue summarization method incorporating commonsense knowledge

Info

Publication number
CN112148863B
CN112148863B (application CN202011104023.9A)
Authority
CN
China
Prior art keywords: dialogue, node, knowledge, representation, word
Prior art date
Legal status
Active
Application number
CN202011104023.9A
Other languages
Chinese (zh)
Other versions
CN112148863A (en)
Inventor
冯骁骋
冯夏冲
秦兵
刘挺
Current Assignee
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN202011104023.9A
Publication of CN112148863A
Application granted
Publication of CN112148863B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33: Querying
    • G06F 16/332: Query formulation
    • G06F 16/3329: Natural language query formulation or dialogue systems
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/279: Recognition of textual entities
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295: Named entity recognition
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/044: Recurrent networks, e.g. Hopfield networks
    • G06N 3/045: Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

A generative dialogue summarization method incorporating commonsense knowledge, belonging to the field of natural language processing. The invention addresses the inaccurate and insufficiently abstractive dialogue summaries produced by existing methods that generate summaries without using commonsense knowledge. The method comprises the following steps: acquiring the commonsense knowledge base ConceptNet and the dialogue summarization dataset SAMSum; introducing tuple knowledge into the SAMSum dataset using the acquired ConceptNet to construct a heterogeneous dialogue graph; constructing a dialogue heterogeneous neural network model; and training the model constructed in step three, then generating the final dialogue summary from a dialogue through the trained model. The invention is applied to the generation of dialogue summaries.

Description

Generative dialogue summarization method incorporating commonsense knowledge
Technical Field
The invention relates to the field of natural language processing, and in particular to a generative dialogue summarization method incorporating commonsense knowledge.
Background
Automatic text summarization based on natural language processing (Automatic Summarization) [1] [title: Constructing literature abstracts by computer: techniques and prospects, author: Chris D. Paice, year: 1990, published in Information Processing & Management] means that, given a text record of a multi-person dialogue, a short text description containing the key information of the dialogue is generated; FIG. 1 shows a multi-person dialogue and its corresponding reference summary.
For dialogue summarization, most existing work has focused on the generative (abstractive) approach, i.e., allowing the final summary to contain novel words and phrases not present in the original text. For example, Liu et al. [2] [title: Automatic dialogue summary generation for customer service, author: Chunyi Liu, year: 2019, published in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining] adopt a multi-step generation scheme for the customer-service dialogue summarization task; Liu et al. [3] [author: Zhengyuan Liu, year: 2019, published as an arXiv preprint] integrate topic information to model the conversation for the doctor-patient dialogue summarization task and generate the final summary; and Ganesh et al. [4] [title: Abstractive summarization of spoken and written conversations, author: Prakhar Ganesh, year: 2019, published as an arXiv preprint] use the discourse structure of the conversation as a rule to remove useless sentences before generating the dialogue summary. Recently, tasks such as dialogue response generation [5] [title: Commonsense knowledge aware conversation generation with graph attention, author: Hao Zhou, year: 2018, published in IJCAI] and dialogue context modeling [6] [title: Multi-task pretraining for multi-role dialogue representation learning, author: Tianyi Wang, year: 2020, published in AAAI] have shown that, although existing neural-network-based summarization models have strong learning capacity, they ignore commonsense knowledge: on the one hand, the model cannot understand the dialogue text well and generates summaries of low quality; on the other hand, the lack of commonsense knowledge makes the generated summaries insufficiently abstractive. Integrating explicit commonsense knowledge can help the model complete the task better: a dialogue summary that incorporates commonsense knowledge can help the model understand the high-level meaning behind the dialogue, and commonsense knowledge can also serve as a bridge between incoherent sentences, helping the model better understand the conversation. However, existing dialogue summarization systems have overlooked the use of commonsense knowledge.
Commonsense knowledge can help a dialogue summarization system generate a higher-quality summary. As shown in FIG. 1, from expressions such as "pick up" it can be inferred that Tom hopes Bob will give him a ride, and introducing the explicit commonsense knowledge "give a ride" helps the model generate a better dialogue summary. After commonsense knowledge is merged in, the three types of data, namely speakers, sentences and commonsense knowledge, can be modeled with a heterogeneous graph neural network so as to generate the final summary.
Disclosure of Invention
The invention aims to solve the problems of inaccurate and insufficiently abstractive summaries caused by existing generative dialogue summarization methods not utilizing commonsense knowledge, and proposes a generative dialogue summarization method incorporating commonsense knowledge.
A generative dialogue summarization method incorporating commonsense knowledge comprises the following steps:

Step one, acquiring the commonsense knowledge base ConceptNet and the dialogue summarization dataset SAMSum; the commonsense knowledge contained therein exists in the form of tuples, i.e., a piece of tuple knowledge is expressed as:

R = (h, r, t, w),

where R denotes a piece of tuple knowledge; h denotes the head entity; r denotes the relation; t denotes the tail entity; and w denotes a weight representing the confidence of the relation; the knowledge R states that head entity h and tail entity t stand in relation r with weight w;

the dialogue summarization dataset SAMSum is divided into three parts: training, development and test;

Step two, introducing tuple knowledge into the dialogue summarization dataset SAMSum using the acquired commonsense knowledge base ConceptNet, and constructing a heterogeneous dialogue graph;

Step three, constructing a dialogue heterogeneous neural network model comprising a node encoder, a graph encoder and a decoder;

Step 3.1, constructing the node encoder, and obtaining the node initialization representations $v_i^0$ and the word initialization representations $h_{i,n}^0$ using a bidirectional long short-term memory network;

Step 3.2, constructing the graph encoder, updating the node representations with a heterogeneous graph neural network, adding node position encoding information, and updating the word representations $h_{i,n}$;

Step 3.3, constructing the decoder;

Step four, training the dialogue heterogeneous neural network model constructed in step three, and generating the final dialogue summary from a dialogue through the trained model.
Advantageous effects

A dialogue summary incorporating commonsense knowledge can help the model understand the high-level meaning behind the conversation;

commonsense knowledge can serve as a bridge between incoherent sentences, helping the model better understand the conversation;

by introducing commonsense knowledge, the model can be helped to generate more abstractive and more general summaries.

The invention introduces commonsense knowledge into the dialogue summarization task, models the three types of data in a dialogue, namely speakers, sentences and commonsense knowledge, as a heterogeneous dialogue graph, and models the whole heterogeneous dialogue graph with a heterogeneous graph neural network.

The whole model adopts a graph-to-sequence framework to generate the final dialogue summary, which addresses the problem that existing generative dialogue summarization ignores commonsense knowledge. In experiments, the summaries generated by the method are more abstractive and more accurate and summarize the dialogue content better, demonstrating the effectiveness of the method; it achieves better results on the ROUGE evaluation metric than existing methods.

ROUGE is a recall-oriented similarity measure, a set of metrics for evaluating automatic summarization and machine translation that examines the adequacy and faithfulness of the output; higher values are better. ROUGE-1, ROUGE-2 and ROUGE-L are computed over unigrams, bigrams and the longest common subsequence, respectively.
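For instance, the three ROUGE variants can be computed with the open-source Python package rouge-score; the package choice and the example strings are illustrative assumptions and are not part of the invention's experiments:

```python
# Minimal sketch: score a generated summary against a reference with
# ROUGE-1/2/L using the rouge-score package (pip install rouge-score).
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
reference = "Lara and Gary will come to Tom's place about 5pm."
generated = "Gary and Lara are going to Tom's birthday party at 5pm."

scores = scorer.score(reference, generated)
for name, score in scores.items():
    # Each entry carries precision, recall and F1; ROUGE is recall-oriented.
    print(f"{name}: recall={score.recall:.3f} f1={score.fmeasure:.3f}")
```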
Drawings
FIG. 1 is a diagram of a multi-person dialogue and its corresponding reference summary;
FIG. 2 is an example of a dialogue-summary pair from the SAMSum dialogue summarization dataset;
FIG. 3 is an example of a SAMSum dataset dialogue-summary pair;
FIG. 4 shows related knowledge tuples obtained from ConceptNet;
FIG. 5 is a sentence-knowledge graph constructed in accordance with the present invention;
FIG. 6 is a speaker-sentence graph constructed in accordance with the present invention;
FIG. 7 is a heterogeneous dialogue graph constructed in accordance with the present invention;
FIG. 8 is a schematic diagram of the model of the invention, in which (a) heterogeneous dialogue graph construction, (b) node encoder, (c) graph encoder, and (d) decoder.
Detailed Description
The first embodiment: this embodiment is a generative dialogue summarization method incorporating commonsense knowledge, which comprises the following steps:

Step one: acquire the large-scale commonsense knowledge base ConceptNet and the dialogue summarization dataset SAMSum.

Step 1.1, acquiring the large-scale commonsense knowledge base ConceptNet:

The large-scale commonsense knowledge base ConceptNet is obtained from http://conceptnet.io; the commonsense knowledge contained in it exists in the form of tuples, i.e., a piece of tuple knowledge can be expressed as:

R = (h, r, t, w),

where R denotes a piece of tuple knowledge; h denotes the head entity; r denotes the relation; t denotes the tail entity; and w denotes a weight representing the confidence of the relation; the knowledge R states that head entity h and tail entity t stand in relation r with weight w. For example, R = (call, RelatedTo, contact, 10) means that "call" and "contact" stand in the relation "RelatedTo" with weight 10; commonsense knowledge in tuple form is available at scale via the website http://conceptnet.io.
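For illustration, such tuples can be retrieved through the public ConceptNet REST API; the endpoint and JSON field names below follow the public api.conceptnet.io service and are an assumption of this sketch rather than part of the claimed method:

```python
# Hedged sketch: query the public ConceptNet REST API for tuple
# knowledge (h, r, t, w) related to a dialogue word.
import requests

def related_tuples(word, limit=20):
    url = f"https://api.conceptnet.io/c/en/{word}"
    edges = requests.get(url, params={"limit": limit}).json().get("edges", [])
    tuples = []
    for edge in edges:
        h = edge["start"]["label"]   # head entity
        r = edge["rel"]["label"]     # relation
        t = edge["end"]["label"]     # tail entity
        w = edge["weight"]           # confidence of the relation
        tuples.append((h, r, t, w))
    return tuples

print(related_tuples("call")[:3])    # e.g. ('call', 'RelatedTo', ...)
```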
Step 1.2, acquiring the dialogue summarization dataset SAMSum:

The dialogue summarization dataset SAMSum can be obtained from https://arxiv.org/abs/1911.12237; it is divided into three parts, training, development and test, whose sizes are 14732, 818 and 819 respectively, a fixed split under a unified standard. The dataset mainly covers topics such as small talk among participants, and each dialogue has a corresponding reference summary; FIG. 2 shows an example of this dataset.

Step two, introducing tuple knowledge into the dialogue summarization dataset SAMSum using the obtained large-scale commonsense knowledge base ConceptNet, and constructing a heterogeneous dialogue graph.

Step three, constructing the dialogue heterogeneous neural network model; the model comprises three parts: a node encoder, a graph encoder and a decoder.

Step 3.1, constructing the node encoder, and obtaining the node initialization representations $v_i^0$ and the word initialization representations $h_{i,n}^0$ using a bidirectional long short-term memory network (Bi-LSTM) (both $v_i^0$ and $h_{i,n}^0$ are updated in step 3.2);

Step 3.2, constructing the graph encoder, updating the node representations with a heterogeneous graph neural network, adding node position encoding information, and updating the word representations $h_{i,n}$;

Step 3.3, constructing the decoder, and generating the summary with a unidirectional long short-term memory (LSTM) decoder.

Step four: training the dialogue heterogeneous neural network model constructed in step three, and generating the final dialogue summary from a dialogue through the trained model.
The second embodiment: this embodiment differs from the first embodiment in that step two introduces tuple knowledge into the dialogue summarization dataset SAMSum using the obtained large-scale commonsense knowledge base ConceptNet and constructs a heterogeneous dialogue graph; the specific process is as follows:

Step 2.1, acquiring dialogue-related knowledge: for a dialogue, the method first retrieves a series of related tuple knowledge from ConceptNet according to the words in the dialogue and eliminates noise knowledge, finally obtaining a tuple knowledge set related to the given dialogue, as shown in FIG. 4.

Step 2.2, constructing the sentence-knowledge graph:

For the related tuple knowledge acquired in step 2.1, if there are a sentence A and a sentence B with word a ∈ A and word b ∈ B, and the tail entities t of the knowledge related to a and b coincide, then sentences A and B are both connected to that tail entity, yielding the sentence-knowledge graph. For example, in FIG. 5, sentence A is "do you have Betty's number" and sentence B mentions that Larry called her last; the words a and b are "number" and "call" respectively; given the related tuple knowledge (number, AtLocation, phonebook) and (call, RelatedTo, phonebook), sentences A and B are connected to the entity "phonebook".

The commonsense knowledge obtained this way suffers from redundancy and repetition, so the invention also simplifies the tuple knowledge; by simplifying the knowledge, cleaner and higher-quality commonsense knowledge can be introduced:

(1) if sentences A and B are connected by multiple entities, the entity whose relations have the highest average weight is selected; for example, as shown in FIG. 5, the average weight of "phonebook" is greater than that of "date", so "phonebook" is selected;

(2) if different sentences are connected to entities with the same name, those entities are merged into one; for example, as shown in FIG. 5, the two "contact" entities are merged into a single "contact" entity.

Step 2.3, constructing the speaker-sentence subgraph:

An edge is established between a speaker and each sentence that the speaker utters, yielding the speaker-sentence subgraph, as shown in FIG. 6.

Step 2.4, fusing the sentence-knowledge graph and the speaker-sentence subgraph:

The sentence nodes are identical in the sentence-knowledge graph and the speaker-sentence subgraph, so the sentence parts are merged, fusing the two graphs into the final heterogeneous dialogue graph. The constructed heterogeneous dialogue graph has two kinds of edges between speakers and sentences, namely the "speak-by" edge from speaker to sentence and the "rev-speak-by" edge from sentence to speaker, and two kinds of edges between sentences and knowledge, namely the "knock-by" edge from knowledge to sentence and the "rev-knock-by" edge from sentence to knowledge. The constructed heterogeneous dialogue graph thus has three kinds of nodes: speaker, sentence and commonsense knowledge.
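As an illustration only, the graph assembled in steps 2.1 to 2.4 can be sketched with plain Python structures; the node identifiers and helper functions are hypothetical, not the patent's implementation:

```python
# Minimal sketch of the heterogeneous dialogue graph: three node types
# (speaker, sentence, knowledge) and the four edge types named above.
from collections import defaultdict

nodes = {}                      # node_id -> {"type": ..., "text": ...}
edges = defaultdict(list)       # edge_type -> list of (src_id, dst_id)

def add_node(node_id, node_type, text):
    nodes[node_id] = {"type": node_type, "text": text}

def link_speaker(speaker_id, sentence_id):
    # speaker-sentence subgraph: one edge per direction
    edges["speak-by"].append((speaker_id, sentence_id))
    edges["rev-speak-by"].append((sentence_id, speaker_id))

def link_knowledge(entity_id, sentence_id):
    # sentence-knowledge subgraph, likewise bidirectional
    edges["knock-by"].append((entity_id, sentence_id))
    edges["rev-knock-by"].append((sentence_id, entity_id))

add_node("spk:Hannah", "speaker", "Hannah")
add_node("utt:0", "sentence", "do you have Betty's number")
add_node("kn:phonebook", "knowledge", "phonebook")
link_speaker("spk:Hannah", "utt:0")
link_knowledge("kn:phonebook", "utt:0")
```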
Other steps and parameters are the same as those in the first embodiment.
The third embodiment: this embodiment differs from the first or second embodiment in that step 3.1 constructs the node encoder and obtains the node initialization representations $v_i^0$ and the word initialization representations $h_{i,n}^0$ using a bidirectional long short-term memory network (Bi-LSTM); the specific process is as follows:

For the heterogeneous dialogue graph proposed in step two, each node $v_i$ contains $|v_i|$ words, with word sequence $\{w_{i,1}, w_{i,2}, \ldots, w_{i,|v_i|}\}$, where $w_{i,n}$ denotes the $n$-th word of node $v_i$, $n \in [1, |v_i|]$. A bidirectional long short-term memory network (Bi-LSTM) is applied to the word sequence to generate the forward hidden sequence $\{\overrightarrow{h}_{i,1}, \ldots, \overrightarrow{h}_{i,|v_i|}\}$ and the backward hidden sequence $\{\overleftarrow{h}_{i,1}, \ldots, \overleftarrow{h}_{i,|v_i|}\}$, where the forward hidden state is $\overrightarrow{h}_{i,n} = \mathrm{LSTM}(x_n, \overrightarrow{h}_{i,n-1})$, the backward hidden state is $\overleftarrow{h}_{i,n} = \mathrm{LSTM}(x_n, \overleftarrow{h}_{i,n+1})$, and $x_n$ denotes the word vector representation of $w_{i,n}$. The last hidden representation of the forward sequence and the first hidden representation of the backward sequence are concatenated to obtain the initialization representation of the node, $v_i^0 = [\overrightarrow{h}_{i,|v_i|}; \overleftarrow{h}_{i,1}]$, where $[\cdot;\cdot]$ denotes vector concatenation; at the same time, the initialization representation of each word in the node is obtained as $h_{i,n}^0 = [\overrightarrow{h}_{i,n}; \overleftarrow{h}_{i,n}]$, as shown in FIG. 8.
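A minimal PyTorch sketch of this node encoder follows; the layer sizes, batching and module names are illustrative assumptions rather than the patent's implementation:

```python
# Hedged sketch of the Bi-LSTM node encoder: the node vector v_i^0 is
# [last forward state; first backward state]; the per-word vectors
# h_{i,n}^0 are the position-wise concatenations of both directions.
import torch
import torch.nn as nn

class NodeEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.hid_dim = hid_dim
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hid_dim,
                              batch_first=True, bidirectional=True)

    def forward(self, word_ids):                 # word_ids: (1, |v_i|)
        x = self.embed(word_ids)                 # (1, |v_i|, emb_dim)
        h, _ = self.bilstm(x)                    # (1, |v_i|, 2*hid_dim)
        fwd, bwd = h[..., :self.hid_dim], h[..., self.hid_dim:]
        v0 = torch.cat([fwd[:, -1], bwd[:, 0]], dim=-1)   # v_i^0
        return v0, h                             # h holds every h_{i,n}^0

enc = NodeEncoder(vocab_size=30000)
v0, h0 = enc(torch.tensor([[4, 17, 9]]))         # a three-word node
print(v0.shape, h0.shape)                        # (1, 512) (1, 3, 512)
```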
Other steps and parameters are the same as those in the first or second embodiment.
The fourth embodiment: this embodiment differs from the first to third embodiments in that step 3.2 constructs the graph encoder, updates the node representations with a heterogeneous graph neural network, adds node position encoding information, and updates the word representations $h_{i,n}$; the specific process is as follows:

Given a target node $t$, its neighbor nodes $s \in N(t)$ are obtained, where $N(t)$ denotes the neighbor node set of $t$ and $s$ denotes one of the neighbors. Given an edge $e = (s, t)$, denoting an edge pointing from neighbor node $s$ to target node $t$, the following are defined:

(1) the node type mapping function:

$\tau(v): V \rightarrow \mathcal{A}$,

where $\tau$ denotes the node type mapping function, $v$ denotes a given node, $V$ denotes the node set, and $\mathcal{A}$ denotes the set of node types; the heterogeneous dialogue graph constructed in step two contains three node types: speaker, sentence and commonsense knowledge;

(2) the edge relation type mapping function:

$\phi(e): E \rightarrow \mathcal{R}$,

where $\phi$ denotes the edge type mapping function, $e$ denotes a given edge, $E$ denotes the edge set, and $\mathcal{R}$ denotes the set of edge types.

The heterogeneous dialogue graph constructed in step two contains four edge types in total: speak-by, rev-speak-by, knock-by and rev-knock-by. For a given edge $e = (s, t)$, $s$ and $t$ each possess a representation from the previous layer, $H^{(l-1)}[s]$ and $H^{(l-1)}[t]$. First, $H^{(l-1)}[s]$ and $H^{(l-1)}[t]$ are mapped to

$K^{(l)}(s) = \text{K-Linear}^{(l)}_{\tau(s)}(H^{(l-1)}[s])$ and $Q^{(l)}(t) = \text{Q-Linear}^{(l)}_{\tau(t)}(H^{(l-1)}[t])$,

where $\text{K-Linear}^{(l)}$ and $\text{Q-Linear}^{(l)}$ denote mapping functions related to the layer number and the node type, $l$ denotes the $l$-th layer of the graph network, $K^{(l)}(s)$ denotes the key representation of neighbor node $s$ at layer $l$, and $Q^{(l)}(t)$ denotes the query representation of node $t$ at layer $l$.

Then the weight between $K^{(l)}(s)$ and $Q^{(l)}(t)$ is computed:

$\alpha(s, e, t) = K^{(l)}(s)\, W^{ATT}_{\phi(e)}\, Q^{(l)}(t)^{\mathsf{T}}$    (1)

where $W^{ATT}_{\phi(e)}$ denotes a learnable parameter related to the layer number and the edge type, $\mathsf{T}$ denotes transposition, and $\alpha(s, e, t)$ denotes the weight between $K^{(l)}(s)$ and $Q^{(l)}(t)$.

After the weight between each neighbor node $s$ and the target node $t$ is obtained, all weights are normalized:

$ATT^{(l)}(s, e, t) = \underset{\forall s \in N(t)}{\text{Softmax}}\left(\alpha(s, e, t)\right)$    (2)

where Softmax is the normalization function, and $ATT^{(l)}(s, e, t)$ is the final normalized score.

The representation $H^{(l-1)}[s]$ of each neighbor node $s$ is mapped as:

$M^{(l)}(s) = \text{M-Linear}^{(l)}_{\tau(s)}(H^{(l-1)}[s])$    (3)

where $\text{M-Linear}^{(l)}_{\tau(s)}$ is a mapping function related to the node type and the layer number, and $M^{(l)}(s)$ is the message representation of neighbor node $s$ at layer $l$.

After $M^{(l)}(s)$ is obtained, the final message vector is computed:

$Msg^{(l)}(s, e, t) = M^{(l)}(s)\, W^{MSG}_{\phi(e)}$    (4)

where $W^{MSG}_{\phi(e)}$ is a learnable parameter related to the edge type and the layer number.

When the type of the target node $t$ is not a sentence node, the invention uses the normalized scores $ATT^{(l)}(s, e, t)$ as weights to sum the message vectors $Msg^{(l)}(s, e, t)$, obtaining $\widetilde{H}^{(l)}[t]$:

$\widetilde{H}^{(l)}[t] = \bigoplus_{\forall s \in N(t)} \left( ATT^{(l)}(s, e, t) \cdot Msg^{(l)}(s, e, t) \right)$    (5)

where $\bigoplus$ denotes summation, $\cdot$ denotes multiplication, and $\widetilde{H}^{(l)}[t]$ is the representation fusing all neighbor nodes of $t$.

When the type of the target node $t$ is a sentence node, the invention performs information fusion separately by neighbor node type, obtaining $\widetilde{H}^{(l)}[t]$:

$\widetilde{H}^{(l)}_{k}[t] = \bigoplus_{\forall s_k \in N(t)} \left( ATT^{(l)}(s_k, e, t) \cdot Msg^{(l)}(s_k, e, t) \right)$    (6)

$\widetilde{H}^{(l)}_{s}[t] = \bigoplus_{\forall s_s \in N(t)} \left( ATT^{(l)}(s_s, e, t) \cdot Msg^{(l)}(s_s, e, t) \right)$    (7)

$\widetilde{H}^{(l)}[t] = \left[\widetilde{H}^{(l)}_{k}[t]; \widetilde{H}^{(l)}_{s}[t]\right]$    (8)

where $\tau(s)$ denotes the type of a neighbor node, $s_k$ denotes a neighbor node whose type is knowledge, and $s_s$ denotes a neighbor node whose type is speaker.

After the two cases are mapped respectively to obtain $\widetilde{H}^{(l)}[t]$, the invention maps it to $H^{(l)}[t]$ as the updated node representation:

$H^{(l)}[t] = \text{Sigmoid}\left(\text{A-Linear}^{(l)}_{\tau(t)}\left(\widetilde{H}^{(l)}[t]\right)\right)$    (9)

where Sigmoid denotes the activation function, and $\text{A-Linear}^{(l)}_{\tau(t)}$ denotes a mapping function related to the node type and the layer number.

Next, position information is fused into the updated node representation $H^{(l)}[v_i]$: each node $v_i$ is associated with a position $p_{v_i}$. For speaker nodes and knowledge nodes, the position $p_{v_i} = 0$; for sentence nodes, $p_{v_i}$ is the position of the sentence in the dialogue, i.e., which sentence it is. The invention sets up a position vector matrix $W_{pos}$; for each position $p_{v_i}$, its corresponding vector representation $P_{v_i} = W_{pos}[p_{v_i}]$ can be obtained. Fusing $P_{v_i}$ into $H^{(l)}[v_i]$ yields the updated representation:

$H^{(l)}[v_i] = H^{(l)}[v_i] + P_{v_i}$.

Finally, the updated node representation $H^{(l)}[v_i]$ is concatenated with the corresponding word initialization representation $h_{i,n}^0$ and mapped to obtain the updated word representation:

$h_{i,n} = \text{F-Linear}\left(\left[H^{(l)}[v_i]; h_{i,n}^0\right]\right)$,

where $\text{F-Linear}()$ denotes a mapping function and $[\cdot;\cdot]$ denotes vector concatenation.
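For illustration, one layer of this heterogeneous graph encoder can be sketched in PyTorch as follows; it is a simplified single-head rendering of equations (1) to (5) and (9) under assumed dimensions, and it omits the per-type split of equations (6) to (8) for brevity:

```python
# Hedged sketch of one heterogeneous-graph layer: per-node-type K/Q/M/A
# linear maps and per-edge-type W_ATT / W_MSG matrices.
import torch
import torch.nn as nn

NODE_TYPES = ["speaker", "sentence", "knowledge"]
EDGE_TYPES = ["speak-by", "rev-speak-by", "knock-by", "rev-knock-by"]

class HeteroGraphLayer(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.k = nn.ModuleDict({t: nn.Linear(dim, dim) for t in NODE_TYPES})
        self.q = nn.ModuleDict({t: nn.Linear(dim, dim) for t in NODE_TYPES})
        self.m = nn.ModuleDict({t: nn.Linear(dim, dim) for t in NODE_TYPES})
        self.a = nn.ModuleDict({t: nn.Linear(dim, dim) for t in NODE_TYPES})
        self.w_att = nn.ParameterDict(
            {e: nn.Parameter(torch.eye(dim)) for e in EDGE_TYPES})
        self.w_msg = nn.ParameterDict(
            {e: nn.Parameter(torch.eye(dim)) for e in EDGE_TYPES})

    def forward(self, h_t, tau_t, neighbors):
        # neighbors: list of (h_s, tau_s, edge_type) for every s in N(t)
        q = self.q[tau_t](h_t)                               # Q(t)
        scores, msgs = [], []
        for h_s, tau_s, e in neighbors:
            k = self.k[tau_s](h_s)                           # K(s)
            scores.append(k @ self.w_att[e] @ q)             # eq. (1)
            msgs.append(self.m[tau_s](h_s) @ self.w_msg[e])  # eqs. (3)-(4)
        att = torch.softmax(torch.stack(scores), dim=0)      # eq. (2)
        fused = (att.unsqueeze(-1) * torch.stack(msgs)).sum(0)  # eq. (5)
        return torch.sigmoid(self.a[tau_t](fused))           # eq. (9)

layer = HeteroGraphLayer(dim=8)
nbrs = [(torch.randn(8), "speaker", "speak-by"),
        (torch.randn(8), "knowledge", "knock-by")]
print(layer(torch.randn(8), "sentence", nbrs).shape)   # torch.Size([8])
```

In the full model, a sentence-type target node would aggregate its knowledge neighbors and speaker neighbors separately and concatenate the results, per equations (6) to (8), before the final mapping of equation (9).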
Other steps and parameters are the same as those in one of the first to third embodiments.
The fifth embodiment: this embodiment differs from the first to fourth embodiments in that step 3.3 constructs the decoder; the specific process is as follows:

After the updated word representations $h_{i,n}$ are obtained, the mean $s_0$ of the representations of all words is computed:

$s_0 = \frac{1}{\sum_{v_i \in G} |v_i|} \sum_{v_i \in G} \sum_{n=1}^{|v_i|} h_{i,n}$    (10)

where $G$ denotes the set of all nodes in the heterogeneous dialogue graph; $s_0$ is assigned to the cell state and the hidden state of the decoder to initialize the decoder's initial state. At each decoding step, a context vector $c_t$ is computed from the decoder state $s_t$ using the attention mechanism:

$e_{i,n}^t = s_t^{\mathsf{T}} W_a h_{i,n}$    (11)

$a^t = \text{Softmax}(e^t)$    (12)

$c_t = \sum_{v_i \in G} \sum_{n=1}^{|v_i|} a_{i,n}^t h_{i,n}$    (13)

where $W_a$ denotes a learnable parameter; $h_{i,n}$ is the updated word representation; $\mathsf{T}$ denotes transposition; $e_{i,n}^t$ is the unnormalized weight of the $n$-th word of the $i$-th node; $s_t$ is the decoder state at time $t$; $a^t$ is the normalized weight; $e^t$ is the weight before normalization; $c_t$ is the context vector representation; and $a_{i,n}^t$ is the normalized weight of the $n$-th word of the $i$-th node.

From the context vector $c_t$ and the decoder state $s_t$ at time $t$, the probability $P_{vocab}$ of generating each word in the vocabulary is computed:

$P_{vocab}(w) = \text{Softmax}(V'(V[s_t; c_t] + b) + b')$    (14)

where $V'$, $V$, $b$, $b'$ are learnable parameters; $[s_t; c_t]$ denotes the concatenation of $s_t$ and $c_t$; Softmax is the normalization function; and $P_{vocab}(w)$ denotes the probability of generating word $w$.

In addition to generating words from the vocabulary, the model also allows words to be copied from the original text. First, the probability $p_{gen}$ of generating a word is computed:

$p_{gen} = \text{Sigmoid}(w_c^{\mathsf{T}} c_t + w_s^{\mathsf{T}} s_t + w_x^{\mathsf{T}} x_t + b_{ptr})$    (15)

where $w_c$, $w_s$, $w_x$ and $b_{ptr}$ are learnable parameters; Sigmoid is the activation function; $p_{gen}$ denotes the probability of generating a word; $1 - p_{gen}$ denotes the probability of copying from the original text; $w_c^{\mathsf{T}}$, $w_s^{\mathsf{T}}$ and $w_x^{\mathsf{T}}$ are the transposes of $w_c$, $w_s$ and $w_x$; and $x_t$ is the word vector of the decoder input word at time $t$.

Therefore, for a word $w$, the probability of generating it from the vocabulary and the probability of copying it from the original text are considered together, and the final probability is given by equation (16):

$P(w) = p_{gen} P_{vocab}(w) + (1 - p_{gen}) \sum_{i,n:\, w_{i,n} = w} a_{i,n}^t$    (16)

where $a_{i,n}^t$ is the normalized weight of the $n$-th word of the $i$-th node.

According to equation (16), the decoder selects the word with the highest probability as output at each decoding step.
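The interplay of equations (14) to (16) can be illustrated with the following sketch; the tensor names mirror the symbols above, and the vocabulary indexing scheme is an assumption of the illustration:

```python
# Hedged sketch of the copy mechanism of eq. (16): mix the vocabulary
# distribution with attention mass copied from the source positions.
import torch

def final_distribution(p_vocab, p_gen, attn, src_ids, vocab_size):
    """p_vocab: (V,) softmax over the vocabulary, eq. (14)
    p_gen:    scalar in (0, 1), eq. (15)
    attn:     (N,) normalized attention over source words, eq. (12)
    src_ids:  (N,) vocabulary id of each source word w_{i,n}"""
    p = p_gen * p_vocab                       # generation part of eq. (16)
    copy = torch.zeros(vocab_size)
    # accumulate attention of every source position holding the same word
    copy.scatter_add_(0, src_ids, attn)
    return p + (1.0 - p_gen) * copy           # eq. (16)

p = final_distribution(torch.softmax(torch.randn(50), dim=0),
                       torch.tensor(0.7),
                       torch.softmax(torch.randn(4), dim=0),
                       torch.tensor([3, 8, 8, 21]),
                       vocab_size=50)
print(int(p.argmax()))   # word id chosen at this decoding step
```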
Other steps and parameters are the same as in one of the first to fourth embodiments.
The sixth embodiment: this embodiment differs from the first to fifth embodiments in that step four trains the dialogue heterogeneous neural network model constructed in step three and generates the final dialogue summary from a dialogue through the trained model; the specific process is as follows:

Using maximum likelihood estimation, the dialogue heterogeneous neural network model is trained on the training portion of the SAMSum dataset, and at each decoding step the cross-entropy loss is computed from the word probabilities predicted according to equation (16) and the reference words.

For a dialogue D, given the reference summary $Y = \{y_1^*, y_2^*, \ldots, y_T^*\}$, the training objective is to minimize equation (17):

$L = -\sum_{t=1}^{T} \log P(y_t^*)$    (17)

where $y_1^*$ is the first word of the reference summary; $y_T^*$ is the last word of the reference summary; $y_t^*$ is the reference word to be predicted at time $t$; and $L$ is the cross-entropy loss function.

The dialogue heterogeneous neural network model is trained according to equation (17), the best model is selected on the development portion of the SAMSum dataset, and finally, for the test portion of the SAMSum dataset, the final dialogue summary is generated by the trained model according to equation (16).
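The objective of equation (17) is the negative log-likelihood of the reference words; a minimal sketch, assuming each step's distribution P(w) has already been produced by equation (16):

```python
# Hedged sketch of the training objective of eq. (17): cross-entropy of
# the reference words under the predicted distributions.
import torch

def summary_loss(step_distributions, reference_ids):
    # step_distributions: list of T tensors of shape (V,), each from eq. (16)
    # reference_ids: the gold word ids y*_1 ... y*_T
    nll = [-torch.log(dist[y] + 1e-12)        # small eps guards log(0)
           for dist, y in zip(step_distributions, reference_ids)]
    return torch.stack(nll).sum()             # eq. (17)

dists = [torch.softmax(torch.randn(50), dim=0) for _ in range(3)]
print(summary_loss(dists, [4, 11, 7]))
```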
Other steps and parameters are the same as those in one of the first to fifth embodiments.
The seventh embodiment: this embodiment differs from the first to sixth embodiments in that the method for eliminating noise knowledge comprises:

(1) if the weight w of a piece of tuple knowledge is lower than 1, that knowledge is excluded;

(2) if the relation r of a piece of tuple knowledge denotes an antonym relation, etymological relatedness, etymological derivation, distinctness, or a negated desire, that knowledge is excluded.
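These two rules amount to a simple filter over the retrieved tuples; in the sketch below the excluded-relation set uses the ConceptNet relation labels that the translated glosses above appear to correspond to, which is an assumption rather than the patent's exact list:

```python
# Hedged sketch of the noise-elimination rules: drop low-confidence
# tuples and tuples whose relation is in the excluded set (assumed
# ConceptNet labels matching the glosses above).
EXCLUDED_RELATIONS = {"Antonym", "EtymologicallyRelatedTo",
                      "EtymologicallyDerivedFrom", "DistinctFrom",
                      "NotDesires"}

def keep_tuple(knowledge):
    h, r, t, w = knowledge
    if w < 1:                        # rule (1): confidence too low
        return False
    if r in EXCLUDED_RELATIONS:      # rule (2): unhelpful relation type
        return False
    return True

tuples = [("call", "RelatedTo", "contact", 10),
          ("hot", "Antonym", "cold", 2.8)]
print([k for k in tuples if keep_tuple(k)])   # keeps only the first
```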
Other steps and parameters are the same as those in one of the first to sixth embodiments.
The eighth embodiment: this embodiment differs from the first to seventh embodiments in that the process of simplifying tuple knowledge comprises:

(1) if sentences A and B are connected by multiple entities, the entity whose relations have the highest average weight is selected; for example, as shown in FIG. 5, the average weight of "phonebook" is greater than that of "date", so "phonebook" is selected;

(2) if different sentences are connected to entities with the same name, those entities are merged into one; for example, as shown in FIG. 5, the two "contact" entities are merged into a single "contact" entity.
Other steps and parameters are the same as those in one of the first to seventh embodiments.
Examples
Example 1:

The proposed model is implemented and compared with a current baseline model and the reference summary.
(1) Summary generated by the baseline model:
Gary and Lara will meet at 5pm for Tom's bday party.
(2) Summary generated by the model of the invention:
Gary and Lara are going to Tom's birthday party at 5pm.Lara will pick up the cake.
(3) Reference summary:

It's Tom's birthday. Lara and Gary will come to Tom's place about 5pm to prepare everything. Gary has already paid for the cake Lara will pick it.

The example above shows that the model of the invention generates results closer to the reference summary; by introducing commonsense knowledge, the dialogue information is understood better.
The present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof, and it is therefore intended that all such changes and modifications be considered as within the spirit and scope of the appended claims.

Claims (9)

1. A generative dialogue summarization method incorporating commonsense knowledge, characterized by comprising the following steps:

step one, acquiring the commonsense knowledge base ConceptNet and the dialogue summarization dataset SAMSum; the commonsense knowledge contained therein exists in the form of tuples, i.e., a piece of tuple knowledge is expressed as:

R = (h, r, t, w),

wherein R denotes a piece of tuple knowledge; h denotes the head entity; r denotes the relation; t denotes the tail entity; and w denotes a weight representing the confidence of the relation; the knowledge R states that head entity h and tail entity t stand in relation r with weight w;

the dialogue summarization dataset SAMSum is divided into three parts: training, development and test;

step two, introducing tuple knowledge into the dialogue summarization dataset SAMSum using the acquired commonsense knowledge base ConceptNet, and constructing a heterogeneous dialogue graph;

step three, constructing a dialogue heterogeneous neural network model, the dialogue heterogeneous neural network model comprising a node encoder, a graph encoder and a decoder;

step 3.1, constructing the node encoder, and obtaining the node initialization representations $v_i^0$ and the word initialization representations $h_{i,n}^0$ using a bidirectional long short-term memory network;

step 3.2, constructing the graph encoder, updating the node representations with a heterogeneous graph neural network, adding node position encoding information, and updating the word representations $h_{i,n}$;

step 3.3, constructing the decoder;

and step four, training the dialogue heterogeneous neural network model constructed in step three, and generating the final dialogue summary from a dialogue through the trained model.
2. The generative dialogue summarization method incorporating commonsense knowledge according to claim 1, wherein in step two the acquired commonsense knowledge base ConceptNet is used to introduce tuple knowledge into the dialogue summarization dataset SAMSum and construct a heterogeneous dialogue graph; the specific process is as follows:

step 2.1, for a dialogue, retrieving related tuple knowledge from ConceptNet according to the words in the dialogue and eliminating noise knowledge, obtaining a tuple knowledge set related to the given dialogue;

step 2.2, for the related tuple knowledge acquired in step 2.1, supposing there are a sentence A and a sentence B with word a ∈ A and word b ∈ B, simplifying the tuple knowledge, and, if the tail entities t of the knowledge related to a and b coincide, connecting sentences A and B to that tail entity t, obtaining the sentence-knowledge graph;

step 2.3, establishing an edge between each speaker and each sentence uttered by that speaker, obtaining the speaker-sentence subgraph;

step 2.4, fusing the sentence-knowledge graph and the speaker-sentence subgraph into the heterogeneous dialogue graph; the heterogeneous dialogue graph has two kinds of edges between speakers and sentences, namely the "speak-by" edge from speaker to sentence and the "rev-speak-by" edge from sentence to speaker, and two kinds of edges between sentences and tuple knowledge, namely the "knock-by" edge from knowledge to sentence and the "rev-knock-by" edge from sentence to knowledge; the heterogeneous dialogue graph has three types of nodes, namely speakers, sentences and commonsense knowledge.
3. The generative dialogue summarization method incorporating commonsense knowledge according to claim 2, wherein step 3.1 constructs the node encoder and obtains the node initialization representations $v_i^0$ and the word initialization representations $h_{i,n}^0$ using a bidirectional long short-term memory network; the specific process is as follows:

for the constructed heterogeneous dialogue graph, each node $v_i$ contains $|v_i|$ words, with word sequence $\{w_{i,1}, \ldots, w_{i,|v_i|}\}$, wherein $w_{i,n}$ denotes the $n$-th word of node $v_i$, $n \in [1, |v_i|]$; the bidirectional long short-term memory network is applied to the word sequence to generate the forward hidden sequence $\{\overrightarrow{h}_{i,1}, \ldots, \overrightarrow{h}_{i,|v_i|}\}$ and the backward hidden sequence $\{\overleftarrow{h}_{i,1}, \ldots, \overleftarrow{h}_{i,|v_i|}\}$, wherein the forward hidden state is $\overrightarrow{h}_{i,n} = \mathrm{LSTM}(x_n, \overrightarrow{h}_{i,n-1})$, the backward hidden state is $\overleftarrow{h}_{i,n} = \mathrm{LSTM}(x_n, \overleftarrow{h}_{i,n+1})$, and $x_n$ is the word vector representation of $w_{i,n}$; the last hidden representation of the forward sequence and the first hidden representation of the backward sequence are concatenated to obtain the initialization representation of the node, $v_i^0 = [\overrightarrow{h}_{i,|v_i|}; \overleftarrow{h}_{i,1}]$, wherein $[\cdot;\cdot]$ denotes vector concatenation; the initialization representation of each word in the node is obtained as $h_{i,n}^0 = [\overrightarrow{h}_{i,n}; \overleftarrow{h}_{i,n}]$.
4. The generative dialogue summarization method incorporating commonsense knowledge according to claim 1 or 2, wherein step 3.2 constructs the graph encoder, updates the node representations with a heterogeneous graph neural network, adds node position encoding information, and updates the word representations $h_{i,n}$; the specific process is as follows:

given a target node $t$, its neighbor nodes $s \in N(t)$ are obtained, wherein $N(t)$ denotes the neighbor node set of $t$ and $s$ denotes one of the neighbors; given an edge $e = (s, t)$, denoting an edge pointing from neighbor node $s$ to target node $t$, the following are defined:

(1) the node type mapping function:

$\tau(v): V \rightarrow \mathcal{A}$,

wherein $\tau$ denotes the node type mapping function, $v$ denotes a given node, $V$ denotes the node set, and $\mathcal{A}$ denotes the set of node types; the heterogeneous dialogue graph constructed in step two contains three node types: speaker, sentence and commonsense knowledge;

(2) the edge relation type mapping function:

$\phi(e): E \rightarrow \mathcal{R}$,

wherein $\phi$ denotes the edge type mapping function, $e$ denotes a given edge, $E$ denotes the edge set, and $\mathcal{R}$ denotes the set of edge types;

the heterogeneous dialogue graph contains four edge types in total: speak-by, rev-speak-by, knock-by and rev-knock-by; for a given edge $e = (s, t)$, $s$ and $t$ each possess a representation from the previous layer, $H^{(l-1)}[s]$ and $H^{(l-1)}[t]$, which are mapped to

$K^{(l)}(s) = \text{K-Linear}^{(l)}_{\tau(s)}(H^{(l-1)}[s])$ and $Q^{(l)}(t) = \text{Q-Linear}^{(l)}_{\tau(t)}(H^{(l-1)}[t])$,

wherein $\text{K-Linear}^{(l)}$ and $\text{Q-Linear}^{(l)}$ denote mapping functions related to the layer number and the node type, $l$ denotes the $l$-th layer of the network, $K^{(l)}(s)$ denotes the key representation of neighbor node $s$ at layer $l$, and $Q^{(l)}(t)$ denotes the query representation of node $t$ at layer $l$;

the weight between $K^{(l)}(s)$ and $Q^{(l)}(t)$ is computed:

$\alpha(s, e, t) = K^{(l)}(s)\, W^{ATT}_{\phi(e)}\, Q^{(l)}(t)^{\mathsf{T}}$    (1)

wherein $W^{ATT}_{\phi(e)}$ denotes a learnable parameter related to the layer number and the edge type, $\mathsf{T}$ denotes transposition, and $\alpha(s, e, t)$ denotes the weight between $K^{(l)}(s)$ and $Q^{(l)}(t)$;

after the weight between each neighbor node $s$ and the target node $t$ is obtained, all weights are normalized:

$ATT^{(l)}(s, e, t) = \underset{\forall s \in N(t)}{\text{Softmax}}\left(\alpha(s, e, t)\right)$    (2)

wherein Softmax is the normalization function, and $ATT^{(l)}(s, e, t)$ is the final normalized score;

the representation $H^{(l-1)}[s]$ of each neighbor node $s$ is mapped as:

$M^{(l)}(s) = \text{M-Linear}^{(l)}_{\tau(s)}(H^{(l-1)}[s])$    (3)

wherein $\text{M-Linear}^{(l)}_{\tau(s)}$ is a mapping function related to the node type and the layer number;

after $M^{(l)}(s)$ is obtained, the final message vector is computed:

$Msg^{(l)}(s, e, t) = M^{(l)}(s)\, W^{MSG}_{\phi(e)}$    (4)

wherein $W^{MSG}_{\phi(e)}$ is a learnable parameter related to the edge type and the layer number;

when the type of the target node $t$ is not a sentence node, the normalized scores $ATT^{(l)}(s, e, t)$ are used as weights to sum the message vectors $Msg^{(l)}(s, e, t)$, obtaining $\widetilde{H}^{(l)}[t]$:

$\widetilde{H}^{(l)}[t] = \bigoplus_{\forall s \in N(t)} \left( ATT^{(l)}(s, e, t) \cdot Msg^{(l)}(s, e, t) \right)$    (5)

wherein $\bigoplus$ denotes summation, $\cdot$ denotes multiplication, and $\widetilde{H}^{(l)}[t]$ is the representation fusing all neighbor nodes of $t$;

when the type of the target node $t$ is a sentence node, information fusion is performed separately by neighbor node type, obtaining $\widetilde{H}^{(l)}[t]$:

$\widetilde{H}^{(l)}_{k}[t] = \bigoplus_{\forall s_k \in N(t)} \left( ATT^{(l)}(s_k, e, t) \cdot Msg^{(l)}(s_k, e, t) \right)$    (6)

$\widetilde{H}^{(l)}_{s}[t] = \bigoplus_{\forall s_s \in N(t)} \left( ATT^{(l)}(s_s, e, t) \cdot Msg^{(l)}(s_s, e, t) \right)$    (7)

$\widetilde{H}^{(l)}[t] = \left[\widetilde{H}^{(l)}_{k}[t]; \widetilde{H}^{(l)}_{s}[t]\right]$    (8)

wherein $\tau(s)$ in equation (6) denotes the type of a neighbor node, $s_k$ denotes a neighbor node whose type is knowledge, and $s_s$ denotes a neighbor node whose type is speaker;

after $\widetilde{H}^{(l)}[t]$ is obtained, it is mapped to $H^{(l)}[t]$ as the updated node representation:

$H^{(l)}[t] = \text{Sigmoid}\left(\text{A-Linear}^{(l)}_{\tau(t)}\left(\widetilde{H}^{(l)}[t]\right)\right)$    (9)

wherein Sigmoid denotes the activation function, and $\text{A-Linear}^{(l)}_{\tau(t)}$ denotes a mapping function related to the node type and the layer number;

position information is fused into the updated node representation $H^{(l)}[v_i]$: each node $v_i$ is associated with a position $p_{v_i}$; for speaker nodes and knowledge nodes, the position $p_{v_i} = 0$; for sentence nodes, $p_{v_i}$ is the position of the sentence in the dialogue, i.e., which sentence it is;

a position vector matrix $W_{pos}$ is set up; for each position $p_{v_i}$, its corresponding vector representation $P_{v_i} = W_{pos}[p_{v_i}]$ can be obtained, and fusing $P_{v_i}$ into $H^{(l)}[v_i]$ yields the updated representation:

$H^{(l)}[v_i] = H^{(l)}[v_i] + P_{v_i}$;

the updated node representation $H^{(l)}[v_i]$ is concatenated with the corresponding word initialization representation $h_{i,n}^0$ and mapped to obtain the updated word representation:

$h_{i,n} = \text{F-Linear}\left(\left[H^{(l)}[v_i]; h_{i,n}^0\right]\right)$,

wherein $\text{F-Linear}()$ denotes a mapping function and $[\cdot;\cdot]$ denotes vector concatenation.
5. The generative dialogue summarization method incorporating commonsense knowledge according to claim 4, wherein step 3.3 constructs the decoder; the specific process is as follows:

after the updated word representations $h_{i,n}$ are obtained, the mean $s_0$ of the representations of all words is computed:

$s_0 = \frac{1}{\sum_{v_i \in G} |v_i|} \sum_{v_i \in G} \sum_{n=1}^{|v_i|} h_{i,n}$    (10)

wherein $G$ denotes the set of all nodes in the heterogeneous dialogue graph;

$s_0$ is assigned to the cell state and the hidden state of the decoder to initialize the decoder's initial state; at each decoding step, the attention mechanism is used to compute a context vector $c_t$ from the decoder state $s_t$:

$e_{i,n}^t = s_t^{\mathsf{T}} W_a h_{i,n}$    (11)

$a^t = \text{Softmax}(e^t)$    (12)

$c_t = \sum_{v_i \in G} \sum_{n=1}^{|v_i|} a_{i,n}^t h_{i,n}$    (13)

wherein $W_a$ denotes a learnable parameter; $h_{i,n}$ is the updated word representation; $\mathsf{T}$ denotes transposition; $e_{i,n}^t$ is the unnormalized weight of the $n$-th word of the $i$-th node; $s_t$ is the decoder state at time $t$; $a^t$ is the normalized weight; $e^t$ is the weight before normalization; $c_t$ is the context vector representation; and $a_{i,n}^t$ is the normalized weight of the $n$-th word of the $i$-th node;

from the context vector $c_t$ and the decoder state $s_t$ at time $t$, the probability $P_{vocab}$ of generating each word in the vocabulary is computed:

$P_{vocab}(w) = \text{Softmax}(V'(V[s_t; c_t] + b) + b')$    (14)

wherein $V'$, $V$, $b$, $b'$ are learnable parameters; $[s_t; c_t]$ denotes the concatenation of $s_t$ and $c_t$; Softmax is the normalization function; and $P_{vocab}(w)$ denotes the probability of generating word $w$;

in addition to generating words from the vocabulary, copying words from the original text is allowed; first, the probability $p_{gen}$ of generating a word is computed:

$p_{gen} = \text{Sigmoid}(w_c^{\mathsf{T}} c_t + w_s^{\mathsf{T}} s_t + w_x^{\mathsf{T}} x_t + b_{ptr})$    (15)

wherein $w_c$, $w_s$, $w_x$ and $b_{ptr}$ are learnable parameters; Sigmoid is the activation function; $p_{gen}$ denotes the probability of generating a word; $1 - p_{gen}$ denotes the probability of copying from the original text; $w_c^{\mathsf{T}}$, $w_s^{\mathsf{T}}$ and $w_x^{\mathsf{T}}$ are the transposes of $w_c$, $w_s$ and $w_x$; and $x_t$ is the word vector of the decoder input word at time $t$;

the final probability is given by equation (16):

$P(w) = p_{gen} P_{vocab}(w) + (1 - p_{gen}) \sum_{i,n:\, w_{i,n} = w} a_{i,n}^t$    (16)

wherein $a_{i,n}^t$ is the normalized weight of the $n$-th word of the $i$-th node;

according to equation (16), the word with the highest probability is selected as output at each decoding step.
6. The generative dialogue summarization method incorporating commonsense knowledge according to claim 5, wherein step four trains the dialogue heterogeneous neural network model constructed in step three and generates the final dialogue summary from a dialogue through the trained model; the specific process is as follows:

using maximum likelihood estimation, the dialogue heterogeneous neural network model is trained on the training portion of the SAMSum dataset, and at each decoding step the cross-entropy loss is computed from the predicted word probabilities and the reference words according to equation (16);

for a dialogue D, given the reference summary $Y = \{y_1^*, y_2^*, \ldots, y_T^*\}$, the training objective is to minimize equation (17):

$L = -\sum_{t=1}^{T} \log P(y_t^*)$    (17)

wherein $y_1^*$ is the first word of the reference summary; $y_T^*$ is the last word of the reference summary; $y_t^*$ is the reference word to be predicted at time $t$; and $L$ is the cross-entropy loss function;

the dialogue heterogeneous neural network model is trained according to equation (17), the best model is selected using the development portion of the SAMSum dataset, and finally, for the test portion of the SAMSum dataset, the final dialogue summary is generated by the trained model according to equation (16).
7. The generative dialogue summarization method incorporating commonsense knowledge according to claim 2, wherein the method for eliminating noise knowledge comprises:

(1) when the weight w of a piece of tuple knowledge is lower than 1, excluding that knowledge;

(2) when the relation r of a piece of tuple knowledge denotes an antonym relation, etymological relatedness, etymological derivation, distinctness, or a negated desire, excluding that knowledge.
8. The generative dialogue summarization method incorporating commonsense knowledge according to claim 2, wherein the process of simplifying tuple knowledge comprises:

(1) if sentence A and sentence B are connected by multiple entities, selecting the entity whose edge relations have the highest average weight;

(2) if different sentence pairs are connected to entities with the same name, merging all entities with the same name into one entity.
9. The generative dialogue summarization method incorporating commonsense knowledge according to claim 1, wherein the numbers of training, development and test examples in SAMSum are 14732, 818 and 819, respectively.
CN202011104023.9A 2020-10-15 2020-10-15 Generative dialogue summarization method incorporating commonsense knowledge Active CN112148863B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011104023.9A CN112148863B (en) 2020-10-15 2020-10-15 Generative dialogue summarization method incorporating commonsense knowledge

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011104023.9A CN112148863B (en) 2020-10-15 2020-10-15 Generative dialogue summarization method incorporating commonsense knowledge

Publications (2)

Publication Number Publication Date
CN112148863A CN112148863A (en) 2020-12-29
CN112148863B true CN112148863B (en) 2022-07-01

Family

ID=73952047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011104023.9A Active CN112148863B (en) 2020-10-15 2020-10-15 Generative dialogue summarization method incorporating commonsense knowledge

Country Status (1)

Country Link
CN (1) CN112148863B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112765344B (en) * 2021-01-12 2022-07-08 哈尔滨工业大学 Method, device and storage medium for generating meeting abstract based on meeting record
CN112818113A (en) * 2021-01-26 2021-05-18 山西三友和智慧信息技术股份有限公司 Automatic text summarization method based on heteromorphic graph network
CN113204627B (en) * 2021-05-13 2022-08-23 哈尔滨工业大学 Dialog summary generation system using DialloGPT as feature annotator
CN113553804A (en) * 2021-07-15 2021-10-26 重庆邮电大学 Single document text summarization system based on heterogeneous graph transform
CN114328956B (en) * 2021-12-23 2023-02-28 北京百度网讯科技有限公司 Text information determination method and device, electronic equipment and storage medium
CN114580439B (en) * 2022-02-22 2023-04-18 北京百度网讯科技有限公司 Translation model training method, translation device, translation equipment and storage medium
CN114626368B (en) * 2022-03-18 2023-06-09 中国电子科技集团公司第十研究所 Method and system for acquiring rule common sense knowledge in vertical field
CN115905513B (en) * 2023-02-22 2023-07-14 中国科学技术大学 Dialogue abstracting method based on denoising type question and answer
CN116541505B (en) * 2023-07-05 2023-09-19 华东交通大学 Dialogue abstract generation method based on self-adaptive dialogue segmentation

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110348016A (en) * 2019-07-15 2019-10-18 昆明理工大学 Text snippet generation method based on sentence association attention mechanism
CN110609891A (en) * 2019-09-18 2019-12-24 合肥工业大学 Visual dialog generation method based on context awareness graph neural network

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2595541A1 (en) * 2007-07-26 2009-01-26 Hamid Htami-Hanza Assisted knowledge discovery and publication system and method
US10114148B2 (en) * 2013-10-02 2018-10-30 Nec Corporation Heterogeneous log analysis
US10055486B1 (en) * 2014-08-05 2018-08-21 Hrl Laboratories, Llc System and method for real world event summarization with microblog data
CN107403375A (en) * 2017-04-19 2017-11-28 北京文因互联科技有限公司 A kind of listed company's bulletin classification and abstraction generating method based on deep learning
CN108763333B (en) * 2018-05-11 2022-05-17 北京航空航天大学 Social media-based event map construction method
CN109344391B (en) * 2018-08-23 2022-10-21 昆明理工大学 Multi-feature fusion Chinese news text abstract generation method based on neural network
US10885281B2 (en) * 2018-12-06 2021-01-05 International Business Machines Corporation Natural language document summarization using hyperbolic embeddings
CN110929024B (en) * 2019-12-10 2021-07-02 哈尔滨工业大学 Extraction type text abstract generation method based on multi-model fusion
CN111026861B (en) * 2019-12-10 2023-07-04 腾讯科技(深圳)有限公司 Text abstract generation method, training device, training equipment and medium
CN111339754B (en) * 2020-03-04 2022-06-21 昆明理工大学 Case public opinion abstract generation method based on case element sentence association graph convolution
CN111460132B (en) * 2020-03-10 2021-08-10 哈尔滨工业大学 Generation type conference abstract method based on graph convolution neural network
CN111460135B (en) * 2020-03-31 2023-11-07 北京百度网讯科技有限公司 Method and device for generating text abstract
CN111639176B (en) * 2020-05-29 2022-07-01 厦门大学 Real-time event summarization method based on consistency monitoring

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110348016A (en) * 2019-07-15 2019-10-18 昆明理工大学 Text snippet generation method based on sentence association attention mechanism
CN110609891A (en) * 2019-09-18 2019-12-24 合肥工业大学 Visual dialog generation method based on context awareness graph neural network

Also Published As

Publication number Publication date
CN112148863A (en) 2020-12-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant