WO2020237479A1 - Method, apparatus, device, and storage medium for generating a real-time event summary (实时事件摘要的生成方法、装置、设备及存储介质) - Google Patents


Info

Publication number
WO2020237479A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
representation
event
knowledge
user query
Prior art date
Application number
PCT/CN2019/088630
Other languages
English (en)
French (fr)
Inventor
杨敏
曲强
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院 filed Critical 中国科学院深圳先进技术研究院
Priority to PCT/CN2019/088630 priority Critical patent/WO2020237479A1/zh
Publication of WO2020237479A1 publication Critical patent/WO2020237479A1/zh

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis

Definitions

  • the invention belongs to the field of computer technology, and in particular relates to a method, device, equipment and storage medium for generating a real-time event summary.
  • Event Summarization is a very challenging task in the field of Natural Language Processing (NLP).
  • the purpose of the task is to generate an informative text summary for a given text stream, to update the summary in real time as the event dynamically evolves, and to provide text summaries of the events that people are interested in.
  • in the prior art, text summaries are generated by parsing the text with static summarization methods, and only simple updates are made to the summaries.
  • a static summarization method can generate only one summary at a time; it cannot infer how an event evolves over time or update the summary in real time when new information appears, so it is not suitable for large-scale dynamic streaming-media applications.
  • Real-time Event Summarization aims to generate a series of text summaries from a large number of real-time text streams, which can accurately describe the events of interest to users.
  • Real-time event summaries are generally used in news and social media scenarios, and have very broad application prospects.
  • streaming-media applications include Twitter, which can provide users with push services for summaries of currently popular tweets or tweets the user is interested in.
  • this is also a very challenging task.
  • news texts are usually written by professional reporters or writers, with complete sentences and grammatical structures, and the extracted abstracts are of good quality.
  • social media texts are usually short, with many spelling errors and grammatical errors, as well as many popular words and sentences on the Internet, which greatly hinders the summarization of social media texts.
  • the purpose of the present invention is to provide a method, device, equipment, and storage medium for generating a real-time event summary, aiming to solve the prior-art problems of insufficient real-time event summary information, high redundancy, and poor real-time summarization performance.
  • the present invention provides a method for generating a real-time event summary.
  • the method includes the following steps:
  • the interactive learning text representation of the event text and the interactive learning text representation of the user query text are generated.
  • the specific text representation of the event text is input into a trained multi-task joint training model to generate a real-time event summary of the text stream.
  • the multi-task joint training model includes a real-time event summary task model and a correlation prediction task model.
  • the present invention provides a device for generating a real-time event summary, the device comprising:
  • a text receiving module for receiving a text stream and user query text, the text stream including event text sorted by time;
  • a knowledge-aware representation generation module configured to generate a knowledge-aware text representation of the event text and a knowledge-aware text representation of the user query text according to the event text, the user query text, and a preset knowledge base;
  • an interactive representation generating module for generating the interactive learning text representation of the event text and the interactive learning text representation of the user query text, based on the knowledge-aware text representation of the event text, the knowledge-aware text representation of the user query text, and the trained interactive multi-head attention network;
  • a specific representation generating module for generating a specific text representation of the event text based on the interactive learning text representation of the event text, the interactive learning text representation of the user query text, and the trained dynamic memory network;
  • the real-time summary generation module is used to input the specific text representation of the event text into a trained multi-task joint training model to generate a real-time event summary of the text stream.
  • the multi-task joint training model includes a real-time event summary task model and a relevance prediction task model.
  • the present invention also provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the computer program, the processor implements the steps of the above-mentioned method for generating a real-time event summary.
  • the present invention also provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the steps of the above-mentioned method for generating a real-time event summary are implemented.
  • the present invention receives a text stream and a user query text, the text stream including event text sorted by time; generates knowledge-aware text representations of the event text and the user query text according to a knowledge base; generates interactive learning text representations of the event text and the user query text based on these knowledge-aware representations and an interactive multi-head attention network; generates a specific text representation of the event text based on these interactive learning representations and a dynamic memory network; and inputs the specific text representation into the multi-task joint training model to obtain a real-time event summary.
  • the content of the real-time event summary is effectively enriched, the text representations are better learned through interactive learning and the attention mechanism, and the redundancy of the real-time event summary is effectively reduced through the dynamic memory network.
  • the multi-task joint training model realizes the joint processing of the real-time event summary task and the correlation prediction task, improves the performance of the real-time event summary, and effectively generates real-time event summaries.
  • FIG. 1 is an implementation flowchart of a method for generating a real-time event summary provided by Embodiment 1 of the present invention
  • FIG. 2 is an implementation flowchart of a method for generating a real-time event summary provided by Embodiment 2 of the present invention;
  • FIG. 3 is a schematic structural diagram of an apparatus for generating a real-time event summary provided by Embodiment 3 of the present invention.
  • FIG. 4 is a schematic diagram of a preferred structure of an apparatus for generating a real-time event summary provided in Embodiment 3 of the present invention.
  • FIG. 5 is a schematic diagram of the structure of a computer device according to Embodiment 4 of the present invention.
  • Fig. 1 shows the implementation process of the method for generating a real-time event summary provided in the first embodiment of the present invention.
  • Only the parts related to the embodiment of the present invention are shown, which are detailed as follows:
  • step S101 a text stream and a user query text are received, and the text stream includes event text sorted by time.
  • event texts (such as social media texts) can be collected in real time from the network, and the event texts collected under these different timestamps form a text stream.
  • the user query text is the keyword text entered by the user. Each event text and user query text includes multiple words.
  • step S102 a knowledge-aware text representation of the event text and a knowledge-aware text representation of the user query text are generated according to the event text, the user query text and the preset knowledge base.
  • a knowledge base contains a large amount of knowledge, such as a Microsoft knowledge base or some knowledge bases built on Wikipedia.
  • the knowledge base is used in representing the event text and the user query text, effectively improving the richness of the real-time event summary.
  • the knowledge-aware text representation of the event text includes the initial context representation and initial knowledge representation of the event text
  • the knowledge-aware text representation of the user query text includes the initial context representation and the initial knowledge representation of the user query text.
  • in step S103, based on the knowledge-aware text representation of the event text, the knowledge-aware text representation of the user query text, and the trained interactive multi-head attention network, the interactive learning text representation of the event text and the interactive learning text representation of the user query text are generated.
  • specifically, an interactive multi-head attention network is constructed and trained in advance. The knowledge-aware text representations of the event text and the user query text are input into the trained network to obtain the attention matrix of each event text, and the interactive learning text representation of the event text is calculated from this attention matrix and the knowledge-aware text representation of the event text. Similarly, inputting the knowledge-aware text representations of the event text and the user query text into the network yields the attention matrix of the user query text, and the interactive learning text representation of the user query text is calculated from this attention matrix and the knowledge-aware text representation of the user query text.
  • the user query text participates in the calculation of the event text's attention matrix, and the event text participates in the calculation of the user query text's attention matrix.
  • the interactive multi-head attention network thus realizes interactive learning between the event text and the user query text, effectively capturing the interactive information between them and improving the quality of the text representations of both.
  • step S104 a specific text representation of the event text is generated based on the interactive learning text representation of the event text, the interactive learning text representation of the user query text and the trained dynamic memory network.
  • the dynamic memory network is used to memorize the past event text, and adjust the current attention according to the memory content to avoid a large amount of redundant content in the real-time event summary.
  • the dynamic memory network also includes a recurrent network for updating the memory content, which obtains the memory content of the event text under the current timestamp from the memory content of the event text under the previous timestamp and the interactive learning text representation of the event text under the current timestamp.
  • step S105 the specific text representation of the event text is input into the trained multi-task joint training model to generate a real-time event summary of the text stream.
  • the multi-task joint training model includes a real-time event summary task model and a correlation prediction task model.
  • the specific text representation of each event text in the text stream is input into the trained multi-task joint training model. The correlation prediction task model in the multi-task joint training model calculates, for the specific text representation of each event text, a relevance label relative to the user query text; the real-time event summary task model determines the text action of each event text in the text stream and generates the real-time event summary of the text stream based on these text actions.
  • the predicted relevance labels include highly relevant, relevant, and irrelevant, and the text actions include push and no push.
  • when the text action is push, the specific text representation of the event text is pushed to the real-time event summary.
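The push/no-push mechanism just described can be sketched as a simple loop. This is a minimal illustration only; the function and variable names are ours, not the patent's:

```python
def update_summary(summary, event_texts, actions):
    """Maintain the running real-time event summary: an event text is
    appended only when its predicted text action is 'push' (actions come
    from the real-time event summary task model); 'no push' is skipped."""
    for text, action in zip(event_texts, actions):
        if action == "push":
            summary.append(text)
    return summary
```

For example, `update_summary([], ["t1", "t2", "t3"], ["push", "no push", "push"])` keeps only the pushed texts in the summary.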
  • in this embodiment, the knowledge-aware text representations of the event text and the user query text are generated from the knowledge base; interactive learning over these knowledge-aware representations through the interactive multi-head attention network produces the interactive learning text representations of the event text and the user query text; these interactive learning representations are processed through a dynamic memory network to generate the specific text representation of the event text; and the specific text representation is input into the multi-task joint training model to generate the real-time event summary of the text stream. This effectively improves the content richness and performance of the real-time event summary, reduces its redundancy, and further improves the generation effect of the real-time event summary.
  • Figure 2 shows the implementation process of the method for generating a real-time event summary provided in the second embodiment of the present invention.
  • Only the parts related to the embodiment of the present invention are shown, which are described in detail as follows:
  • step S201 a text stream and a user query text are received, and the text stream includes event text sorted by time.
  • each event text in the text stream under a given timestamp is composed of l words (the timestamp suffix is omitted here to simplify the mathematical notation), and the user query text is composed of n words.
  • step S202 the initial context representation of the event text is obtained by extracting the hidden state of the word in the event text, and the initial context representation of the user query text is obtained by extracting the hidden state of the word in the user query text.
  • each word in the event text and each word in the user query text are respectively mapped to a low-dimensional word embedding vector through a preset word embedding layer.
  • the low-dimensional word embedding vector of each word in the event text is input into a first gated recurrent unit (GRU) to calculate the hidden state of each word in the event text, and the embedding of each word in the user query text is input into a second GRU in the same way.
  • the first gated recurrent unit and the second gated recurrent unit are mutually independent gated recurrent units.
  • the calculation formula for the hidden state of a word through the gated recurrent unit is:
  • h_k = GRU(h_{k-1}, v_k), where v_k represents the low-dimensional word embedding vector of the k-th word, h_k represents the hidden state of the k-th word, and h_{k-1} represents the hidden state of the (k-1)-th word.
  • the hidden states of all words in the event text are combined into the initial context representation of the event text
  • the hidden states of all words in the user query text are combined into the initial context representation of the user query text.
  • the initial context representation of the event text is expressed as H^d = [h_1^d, h_2^d, ..., h_l^d], and the initial context representation of the user query text is expressed as H^q = [h_1^q, h_2^q, ..., h_n^q], where h_i^d is the hidden state of the i-th word in the event text and h_j^q is the hidden state of the j-th word in the user query text.
  • step S203 an initial knowledge representation of the event text is generated according to the initial context representation, attention mechanism and knowledge base of the event text, and initial knowledge of the query text is generated according to the initial context representation, attention mechanism and knowledge base of the user query text Said.
  • for each word in the event text and each word in the user query text, a candidate entity set composed of a preset number of embedded entities is selected from the knowledge base; the candidate entity set of the k-th word is expressed as E_k = {e_k1, e_k2, ..., e_kN}, where N is the total number of embedded entities.
  • the knowledge representation of each word in the event text is learned by embedding the corresponding candidate entity set in the knowledge base, and the learning process can be expressed as:
  • e_ki is the i-th embedded entity in the candidate entity set of the k-th word in the event text
  • a_ki is the context-guided attention weight of e_ki
  • avg(·) is the average pooling operation
  • W_kb and W_c are the trained weight matrices
  • b_kb is the bias value.
  • the initial knowledge representation of the event text is formed from the knowledge representations of all words in the event text.
  • that is, from the knowledge base and the initial context representation of the event text, the initial knowledge representation of the event text can be obtained; similarly, from the knowledge base and the initial context representation of the user query text, the initial knowledge representation of the user query text can be obtained.
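The context-guided attention over a word's candidate entity set can be sketched as follows. The patent's exact scoring formula is not reproduced in this text, so the dot-product score between the word's hidden state h_k and each entity embedding e_ki is an assumed stand-in (and it assumes entity embeddings share the hidden-state dimension); what the sketch preserves is the structure described above: softmax attention weights a_ki and an attention-weighted sum of entity embeddings:

```python
import math

def knowledge_representation(h_k, candidate_entities):
    """Attend over the candidate entity set of one word.
    Assumed score: dot(h_k, e_ki). The weights a_ki are a softmax over
    the scores, and the word's knowledge representation is the
    a_ki-weighted sum of the entity embeddings."""
    scores = [sum(h * e for h, e in zip(h_k, e_ki))
              for e_ki in candidate_entities]
    mx = max(scores)
    exps = [math.exp(s - mx) for s in scores]
    z = sum(exps)
    a = [x / z for x in exps]  # context-guided attention weights a_ki
    dim = len(h_k)
    k_rep = [sum(a[i] * candidate_entities[i][d]
                 for i in range(len(candidate_entities)))
             for d in range(dim)]
    return k_rep, a
```

Entities that align with the word's context receive higher weight, so the resulting knowledge representation is biased toward contextually relevant knowledge.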
  • in step S204, the knowledge-aware text representation of the event text is obtained by combining the initial context representation of the event text and the initial knowledge representation of the event text, and the knowledge-aware text representation of the user query text is obtained by combining the initial context representation of the user query text and the initial knowledge representation of the user query text.
  • step S205 the knowledge-aware text representation of the event text and the knowledge-aware text representation of the user query text are input into the interactive multi-head attention network, and the attention matrix of the event text and the attention matrix of the user query text are calculated.
  • the attention matrix of the event text is expressed as A = [A_1, A_2, ..., A_l], where A_i is the i-th row of the attention matrix A of the event text, avg(·) is the average pooling operation, att(·) is the attention function, and U^(1) and W^(1) are the weight matrices trained for the interactive multi-head attention network.
  • the attention matrix of the user query text is expressed analogously as B = [B_1, B_2, ..., B_n], where B_i is the i-th row of the attention matrix B of the user query text.
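The two-way interaction can be illustrated with a simplified cross-attention sketch. The patent's attention function with trained matrices U^(1) and W^(1) is not shown in this text, so a plain dot-product score stands in for it; what the sketch keeps is the interaction itself: the query words participate in the event text's attention matrix A, and the event words participate in the query's attention matrix B:

```python
import math

def softmax(xs):
    mx = max(xs)
    exps = [math.exp(x - mx) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def interactive_attention(event_rep, query_rep):
    """event_rep: l knowledge-aware word vectors of the event text;
    query_rep: n vectors of the user query text.
    Row A_i attends over the query words for event word i;
    row B_j attends over the event words for query word j."""
    A = [softmax([dot(e, q) for q in query_rep]) for e in event_rep]
    B = [softmax([dot(q, e) for e in event_rep]) for q in query_rep]
    return A, B
```

Each row of A and B is a probability distribution, so downstream steps can use the rows to form attention-weighted mixtures of the representations.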
  • in step S206, the interactive learning text representation of the event text is calculated from the attention matrix and the knowledge-aware text representation of the event text, and the interactive learning text representation of the user query text is calculated from the attention matrix and the knowledge-aware text representation of the user query text.
  • step S207 a specific text representation of the event text is generated based on the interactive learning text representation of the event text, the interactive learning text representation of the user query text and the trained dynamic memory network.
  • each time stamp in the text stream is equivalent to each step of the dynamic memory network.
  • the memory content of the timestamp preceding the current timestamp is obtained, and this memory content, the interactive learning text representation of the event text under the current timestamp, and the interactive learning text representation of the user query text are input into the dynamic memory network; the attention mechanism in the network then calculates the specific text representation of the event text under the current timestamp.
  • the specific text representation of the event text is computed as an attention-weighted combination of its word representations, emb_t = Σ_j w_tj · o_tj^d, where:
  • emb_t is the specific text representation of the event text at timestamp t
  • o_tj^d is the interactive learning text representation of the j-th word in the event text at timestamp t
  • the attention weight w_tj is computed by a feed-forward neural network, with W_a, U_a, V_a as its weight matrices and b_a as its bias term
  • m_{t-1} is the memory content corresponding to the event text at timestamp t-1.
  • the memory content of the event text under the current timestamp is calculated from the memory content under the previous timestamp and the specific text representation of the event text under the current timestamp, so that the memory content of the event text is generated in timestamp order.
  • the memory content is stored in the dynamic memory network.
  • specifically, the memory content of the event text under the current timestamp is calculated by a third gated recurrent unit from the memory content under the previous timestamp and the specific text representation under the current timestamp, that is, m_t = GRU(m_{t-1}, emb_t).
  • the memory content under the initial timestamp is the interactive learning text representation corresponding to the last word in the user query text.
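Following the same GRU recurrence used for the word encoders, the memory update m_t = GRU(m_{t-1}, emb_t) can be sketched as below. Toy scalar gate weights stand in for the trained matrices of the third gated recurrent unit, and the initial memory is set to the last-word representation of the user query, as described above:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def memory_step(m_prev, emb_t, g):
    """One dynamic-memory update: m_t = GRU(m_{t-1}, emb_t)."""
    z = [sigmoid(g["wz"] * e + g["uz"] * m) for e, m in zip(emb_t, m_prev)]
    r = [sigmoid(g["wr"] * e + g["ur"] * m) for e, m in zip(emb_t, m_prev)]
    cand = [math.tanh(g["wh"] * e + g["uh"] * ri * m)
            for e, m, ri in zip(emb_t, m_prev, r)]
    return [(1 - zi) * m + zi * c for zi, m, c in zip(z, m_prev, cand)]

def run_memory(query_last_word_rep, specific_reps, g):
    """Generate the memory contents in timestamp order; the initial memory
    m_0 is the interactive representation of the query's last word."""
    m = list(query_last_word_rep)
    memories = []
    for emb_t in specific_reps:
        m = memory_step(m, emb_t, g)
        memories.append(m)
    return memories
```

Because the memory carries forward what has already been summarized, later attention steps can discount repeated content, which is how the network reduces redundancy.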
  • step S208 the specific text representation of the event text is input into the trained multi-task joint training model to generate a real-time event summary of the text stream.
  • the multi-task joint training model includes a real-time event summary task model and a correlation prediction task model.
  • in the training process of the multi-task joint training model, the objective function of the correlation prediction task model is a cross-entropy loss.
  • the weight matrices of the correlation prediction task are learned in a supervised way. The training data set consists of triples in which d_t and q_t are the event text and user query text under timestamp t and the third element is the true relevance label of d_t relative to q_t. Training minimizes the objective function, that is, the cross entropy between the predicted relevance label and the true relevance label.
  • the objective function of the real-time event summary task model can be expressed as:
  • the expected reward is a typical delayed reward
  • r(·) is the reward function
  • a coefficient controls the balance between the functions EG(·) and nCG(·)
  • for the policy function, an independent function approximator with parameter θ in the stochastic policy gradient algorithm is used to approximate the stochastic policy π_θ
  • b_s is the offset (baseline) value
  • a_t ∈ {0, 1} is a text action
  • s_t = emb_t.
  • a reinforcement learning algorithm is used to optimize the objective function of the real-time event summary task model.
  • specifically, a policy gradient algorithm is used as the reinforcement learning algorithm for optimizing the real-time event summary task model, improving its training effect.
  • the multi-task joint training model can be expressed as:
  • L = λ_1 L_1 + λ_2 L_2
  • L_1 is the objective function of the correlation prediction task model
  • L_2 is the objective function of the real-time event summary task model
  • λ_1 and λ_2 are the weight coefficients of L_1 and L_2, respectively
  • training the multi-task joint training model means that the correlation prediction task model and the real-time event summary task model are trained synchronously, taking into account the interdependence of the correlation prediction task and the real-time event summary task and effectively improving the generation effect of the real-time event summary.
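The joint objective L = λ_1 L_1 + λ_2 L_2 can be sketched numerically: L_1 as the cross entropy of the three-class relevance prediction, and L_2 as a REINFORCE-style surrogate -(r - b_s)·log π(a_t|s_t). The λ values and the reward/baseline numbers below are illustrative only, not values from the patent:

```python
import math

def cross_entropy(pred_probs, true_idx):
    """L1: supervised loss over the three relevance classes
    (highly relevant / relevant / irrelevant)."""
    return -math.log(pred_probs[true_idx])

def policy_gradient_surrogate(action_prob, reward, baseline):
    """L2: REINFORCE-style surrogate, -(r - b_s) * log pi(a_t | s_t);
    minimizing it increases the log-probability of rewarded actions."""
    return -(reward - baseline) * math.log(action_prob)

def joint_loss(l1, l2, lam1, lam2):
    """L = lambda_1 * L1 + lambda_2 * L2: both tasks trained synchronously."""
    return lam1 * l1 + lam2 * l2
```

For example, `joint_loss(cross_entropy([0.8, 0.15, 0.05], 0), policy_gradient_surrogate(0.9, 1.0, 0.2), 0.5, 0.5)` combines one supervised step and one reinforcement step into a single scalar objective.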
  • in this embodiment, the knowledge-aware text representations of the event text and the user query text are generated from the knowledge base; interactive learning over these knowledge-aware representations through the interactive multi-head attention network produces the interactive learning text representations of the event text and the user query text; these interactive learning representations are processed through a dynamic memory network to generate the specific text representation of the event text; and the specific text representation is input into the multi-task joint training model to generate the real-time event summary of the text stream. This effectively improves the content richness and performance of the real-time event summary, reduces its redundancy, and further improves the generation effect of the real-time event summary.
  • Fig. 3 shows the structure of the device for generating a real-time event summary provided in the third embodiment of the present invention.
  • only the parts related to the embodiment of the present invention are shown, including:
  • the text receiving module 31 is used to receive a text stream and user query text, the text stream includes event text sorted by time;
  • the knowledge-aware representation generating module 32 is used to generate the knowledge-aware text representation of the event text and the knowledge-aware text representation of the user query text according to the event text, the user query text and the preset knowledge base;
  • the interactive representation generating module 33 is used to generate the interactive learning text representation of the event text and the interactive learning text representation of the user query text, based on the knowledge-aware text representation of the event text, the knowledge-aware text representation of the user query text, and the trained interactive multi-head attention network;
  • the specific representation generation module 34 is used to generate a specific text representation of the event text based on the interactive learning text representation of the event text, the interactive learning text representation of the user query text and the trained dynamic memory network;
  • the real-time summary generation module 35 is used to input the specific text representation of the event text into the trained multi-task joint training model to generate a real-time event summary of the text stream.
  • the multi-task joint training model includes a real-time event summary task model and a correlation prediction task model.
  • the knowledge-aware representation generating module 32 includes:
  • the context generation module 321 is configured to obtain the initial context representation of the event text by extracting the hidden state of words in the event text, and obtain the initial context representation of the user query text by extracting the hidden state of the words in the user query text;
  • the initial knowledge representation generating module 322 is used to generate the initial knowledge representation of the event text based on the initial context representation, attention mechanism, and knowledge base of the event text, and to generate the initial knowledge representation of the user query text based on the initial context representation, attention mechanism, and knowledge base of the user query text;
  • the knowledge-aware representation combination module 323 is used to combine the initial context representation of the event text and the initial knowledge representation of the event text to obtain the knowledge-aware text representation of the event text, and combine the initial context representation of the user query text and the initial knowledge representation of the user query text Get the knowledge-aware text representation of the user's query text.
  • the interactive representation generating module 33 includes:
  • the attention matrix calculation module is used to input the knowledge-aware text representation of the event text and the knowledge-aware text representation of the user query text into the interactive multi-head attention network, and calculate the attention matrix of the event text and the attention matrix of the user query text;
  • the interactive representation generation sub-module is used to calculate the interactive learning text representation of the event text according to the attention matrix and the knowledge-aware text representation of the event text, and to calculate the interactive learning text representation of the user query text according to the attention matrix and the knowledge-aware text representation of the user query text.
  • the specific representation generating module 34 includes:
  • the memory content acquisition module is used to acquire the memory content of the event text under the previous timestamp in the text stream;
  • the specific representation generation sub-module is used to input the memory content of the event text at the previous timestamp, the interactive learning text representation of the event text at the current timestamp, and the interactive learning text representation of the user query text into the dynamic memory network to obtain the specific text representation of the event text at the current timestamp.
  • the specific representation generating module 34 further includes:
  • the memory content calculation module is used to calculate the memory content of the event text under the current time stamp based on the specific text representation of the event text under the current time stamp and the memory content of the event text under the previous time stamp.
  • the device for generating a real-time event summary further includes:
  • the training module is used to obtain training data, and simultaneously train the real-time event summary task and the correlation prediction task according to the training data.
  • the real-time event summary task is trained by a policy gradient algorithm, and the correlation prediction task is trained in a supervised manner.
  • in this embodiment, the knowledge-aware text representations of the event text and the user query text are generated from the knowledge base; interactive learning over these knowledge-aware representations through the interactive multi-head attention network produces the interactive learning text representations of the event text and the user query text; these interactive learning representations are processed through a dynamic memory network to generate the specific text representation of the event text; and the specific text representation is input into the multi-task joint training model to generate the real-time event summary of the text stream. This effectively improves the content richness and performance of the real-time event summary, reduces its redundancy, and further improves the generation effect of the real-time event summary.
  • for the implementation of each unit of the apparatus for generating a real-time event summary, reference may be made to the detailed description of the corresponding steps in the first and second embodiments, which will not be repeated here.
  • each unit of the device for generating a real-time event summary can be implemented by a corresponding hardware or software unit.
  • each unit can be an independent software or hardware unit, or the units can be integrated into one software and hardware unit; this is not intended to limit the invention.
  • FIG. 5 shows the structure of the computer device provided in the fourth embodiment of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown.
  • the computer device 5 in the embodiment of the present invention includes a processor 50, a memory 51, and a computer program 52 stored in the memory 51 and running on the processor 50.
  • when executing the computer program 52, the processor 50 implements the steps in the foregoing method embodiments, such as steps S101 to S105 shown in FIG. 1.
  • alternatively, when the processor 50 executes the computer program 52, the functions of the units in the above-mentioned device embodiments, such as the functions of the units 31 to 35 shown in FIG. 3, are realized.
  • the knowledge-aware text representations of the event text and the user query text are generated with the help of the knowledge base, and these knowledge-aware representations are interactively learned through the interactive multi-head attention network to generate the interactive learning text representations of the event text and the user query text.
  • these interactive learning text representations are processed through a dynamic memory network to generate a specific text representation of the event text.
  • the specific text representation of the event text is input into the multi-task joint training model to generate a real-time event summary of the text stream. This effectively improves the informativeness and performance of the real-time event summary, reduces its redundancy, and further improves the quality of the generated summary.
  • a computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in the foregoing method embodiment are implemented, for example steps S101 to S105 shown in FIG. 1.
  • alternatively, when the computer program is executed by the processor, the functions of the units in the above-mentioned device embodiment are realized, for example the functions of units 31 to 35 shown in FIG. 3.
  • the knowledge-aware text representations of the event text and the user query text are generated with the help of the knowledge base, and these knowledge-aware representations are interactively learned through the interactive multi-head attention network to generate the interactive learning text representations of the event text and the user query text.
  • these interactive learning text representations are processed through a dynamic memory network to generate a specific text representation of the event text.
  • the specific text representation of the event text is input into the multi-task joint training model to generate a real-time event summary of the text stream. This effectively improves the informativeness and performance of the real-time event summary, reduces its redundancy, and further improves the quality of the generated summary.
  • the computer-readable storage medium in the embodiment of the present invention may include any entity or device or recording medium capable of carrying computer program code, such as ROM/RAM, magnetic disk, optical disk, flash memory and other memories.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The present invention is applicable to the field of computer technology and provides a method, apparatus, device and storage medium for generating real-time event summaries. The method includes: receiving a text stream and a user query text; generating knowledge-aware text representations of the event texts in the text stream and of the user query text based on a knowledge base; generating interactive learning text representations of the event texts and the user query text based on the generated knowledge-aware text representations and an interactive multi-head attention network; generating specific text representations of the event texts based on the generated interactive learning text representations and a dynamic memory network; and inputting the specific text representations into a multi-task joint training model to generate a real-time event summary of the text stream. This effectively enriches the content of the real-time event summary, reduces its redundancy, and improves the quality of the generated summary.

Description

Method, Apparatus, Device and Storage Medium for Generating Real-Time Event Summaries — Technical Field
The present invention belongs to the field of computer technology, and in particular relates to a method, apparatus, device and storage medium for generating real-time event summaries.
Background Art
With the rapid development of streaming media applications, the amount of information on the Internet has grown explosively. While people use streaming media applications to obtain information, the sheer volume and complexity of that information often prevents them from finding what interests them most, causing considerable annoyance and unnecessary waste of time. Event Summarization is a very challenging task in the field of Natural Language Processing (NLP). Its goal is to generate informative text summaries for a given text stream, update those summaries in real time as events evolve, and provide people with summaries of the events they are interested in. However, existing work on event summarization focuses mainly on news articles, generating summaries by parsing texts with static summarization methods and performing only simple updates. A static summarization method can generate only one summary at a time; it cannot infer how an event evolves over time or update the summary in real time when new information appears, let alone serve large-scale dynamic streaming media applications.
Real-time Event Summarization aims to generate, from a large volume of real-time text streams, a series of text summaries that accurately describe the events a user is interested in. Real-time event summarization is typically used in news and social media scenarios and has broad application prospects; for example, some streaming media applications, including Twitter, can push to users summaries of currently trending tweets or of tweets the user is interested in. At the same time, it is a very challenging task. First, news texts are usually written by professional journalists or writers, with complete sentences and grammatical structures, so the extracted summaries are of good quality; social media texts, by contrast, are usually short and contain many spelling and grammatical errors as well as Internet slang, which greatly hinders their summarization. Second, compared with static news summarization, summaries of social media texts must be generated over a dynamic text stream along the time axis. In addition, since the amount of information in online text streams keeps increasing, generating event summaries with static methods becomes prohibitively expensive and cannot stay up to date in real time.
Known studies have proposed establishing and maintaining appropriate push-update thresholds to achieve optimal push results, using locally optimal learning to select or skip texts in the text stream, and formulating real-time pushing of a text stream as a sequential decision problem solved with a neural-network-based reinforcement learning (NNRL) algorithm for real-time decisions. Although these studies have achieved some success, methods for generating real-time event summaries still need improvement. First, the informativeness of real-time event summarization systems needs to be enhanced. Second, existing research tends to generate only highly relevant real-time event summaries while ignoring their non-redundancy, which severely degrades performance and may push multiple duplicate texts to users. Third, most methods treat relevance prediction and real-time event summarization as sequential steps, or merely use the relevance prediction score as a feature of the summarization model, resulting in poor summarization performance.
Summary of the Invention
The purpose of the present invention is to provide a method, apparatus, device and storage medium for generating real-time event summaries, aiming to solve the problems in the prior art that real-time event summaries are insufficiently informative, highly redundant and poorly performing.
In one aspect, the present invention provides a method for generating a real-time event summary, the method comprising the following steps:
receiving a text stream and a user query text, the text stream comprising event texts ordered by time;
generating a knowledge-aware text representation of the event text and a knowledge-aware text representation of the user query text based on the event text, the user query text and a preset knowledge base;
generating an interactive learning text representation of the event text and an interactive learning text representation of the user query text based on the knowledge-aware text representation of the event text, the knowledge-aware text representation of the user query text and a trained interactive multi-head attention network;
generating a specific text representation of the event text based on the interactive learning text representation of the event text, the interactive learning text representation of the user query text and a trained dynamic memory network;
inputting the specific text representation of the event text into a trained multi-task joint training model to generate a real-time event summary of the text stream, the multi-task joint training model comprising a real-time event summarization task model and a relevance prediction task model.
In another aspect, the present invention provides an apparatus for generating a real-time event summary, the apparatus comprising:
a text receiving module, configured to receive a text stream and a user query text, the text stream comprising event texts ordered by time;
a knowledge-aware representation generating module, configured to generate the knowledge-aware text representation of the event text and the knowledge-aware text representation of the user query text based on the event text, the user query text and a preset knowledge base;
an interactive representation generating module, configured to generate the interactive learning text representation of the event text and the interactive learning text representation of the user query text based on the knowledge-aware text representation of the event text, the knowledge-aware text representation of the user query text and a trained interactive multi-head attention network;
a specific representation generating module, configured to generate the specific text representation of the event text based on the interactive learning text representation of the event text, the interactive learning text representation of the user query text and a trained dynamic memory network; and
a real-time summary generating module, configured to input the specific text representation of the event text into a trained multi-task joint training model to generate the real-time event summary of the text stream, the multi-task joint training model comprising a real-time event summarization task model and a relevance prediction task model.
In another aspect, the present invention further provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above method for generating a real-time event summary.
In another aspect, the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above method for generating a real-time event summary.
The present invention receives a text stream and a user query text, the text stream comprising event texts ordered by time; generates knowledge-aware text representations of the event text and the user query text based on a knowledge base; generates interactive learning text representations of the event text and the user query text based on these knowledge-aware text representations and an interactive multi-head attention network; generates a specific text representation of the event text based on these interactive learning text representations and a dynamic memory network; and inputs the specific text representation into a multi-task joint training model to obtain the real-time event summary. The knowledge base thereby effectively enriches the content of the real-time event summary; interactive learning and the attention mechanism yield better text representations; the dynamic memory network effectively reduces the redundancy of the summary; and the multi-task joint training model jointly handles the real-time event summarization task and the relevance prediction task, improving the performance of the summary and thus the overall quality of the generated real-time event summaries.
Brief Description of the Drawings
FIG. 1 is a flowchart of the implementation of the method for generating a real-time event summary provided in Embodiment 1 of the present invention;
FIG. 2 is a flowchart of the implementation of the method for generating a real-time event summary provided in Embodiment 2 of the present invention;
FIG. 3 is a schematic structural diagram of the apparatus for generating a real-time event summary provided in Embodiment 3 of the present invention;
FIG. 4 is a schematic diagram of a preferred structure of the apparatus for generating a real-time event summary provided in Embodiment 3 of the present invention; and
FIG. 5 is a schematic structural diagram of the computer device provided in Embodiment 4 of the present invention.
Detailed Description of the Embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present invention and not to limit it.
The specific implementation of the present invention is described in detail below in conjunction with specific embodiments:
Embodiment 1:
FIG. 1 shows the implementation flow of the method for generating a real-time event summary provided in Embodiment 1 of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown, detailed as follows:
In step S101, a text stream and a user query text are received, the text stream comprising event texts ordered by time.
The present invention is applicable to data processing platforms or to data processing devices such as computers and servers. Event texts (for example, social media texts) can be collected in real time from the network, and the event texts collected at different timestamps constitute the text stream. The user query text is keyword text entered by the user. Each event text and the user query text contain a plurality of words.
In step S102, the knowledge-aware text representation of the event text and the knowledge-aware text representation of the user query text are generated based on the event text, the user query text and the preset knowledge base.
In an embodiment of the present invention, a knowledge base (KB) contains a large amount of knowledge, for example the Microsoft knowledge base or knowledge bases built from Wikipedia. Representing the event text and the user query text with the help of a knowledge base can effectively enrich the real-time event summary.
In an embodiment of the present invention, the knowledge-aware text representation of the event text comprises the initial context representation and the initial knowledge representation of the event text, and the knowledge-aware text representation of the user query text comprises the initial context representation and the initial knowledge representation of the user query text. After the text stream and the user query text are obtained, the initial context representation of the event text is derived from the words of the event text, and the initial knowledge representation of the event text is derived from the knowledge base and an attention mechanism. Likewise, the initial context representation of the user query text is derived from the words of the user query text, and the initial knowledge representation of the user query text is derived from the knowledge base and the attention mechanism.
In step S103, the interactive learning text representation of the event text and the interactive learning text representation of the user query text are generated based on the knowledge-aware text representation of the event text, the knowledge-aware text representation of the user query text and the trained interactive multi-head attention network.
In an embodiment of the present invention, the interactive multi-head attention network is constructed and trained in advance. The knowledge-aware text representations of the event text and of the user query text are input into the trained interactive multi-head attention network to obtain the attention matrix of each event text, and the interactive learning text representation of the event text is computed from the attention matrix of the event text and the knowledge-aware text representation of the event text. Similarly, the attention matrix of the user query text is obtained by inputting the knowledge-aware text representations of the event text and of the user query text into the interactive multi-head attention network, and the interactive learning text representation of the user query text is computed from the attention matrix of the user query text and the knowledge-aware text representation of the user query text.
In an embodiment of the present invention, the user query text participates in computing the attention matrix of the event text, and the event text participates in computing the attention matrix of the user query text. The interactive multi-head attention network thus realizes interactive learning between the event text and the user query text, effectively capturing the interaction information between them and improving the quality of their text representations.
In step S104, the specific text representation of the event text is generated based on the interactive learning text representation of the event text, the interactive learning text representation of the user query text and the trained dynamic memory network.
In an embodiment of the present invention, the dynamic memory network is used to memorize past event texts and to adjust the current attention according to the memorized content, so as to prevent a large amount of redundant content from appearing in the real-time event summary. Besides memorizing past event texts, adjusting the current attention, and generating from the interactive learning text representation and the attention a specific text representation with low overlap with the memorized content, the dynamic memory network also comprises a recurrent network for updating the memory content; this recurrent network derives the memory content of the event text at the current timestamp from the memory content of the event text at the previous timestamp and the interactive learning text representation of the event text at the current timestamp.
In step S105, the specific text representation of the event text is input into the trained multi-task joint training model to generate the real-time event summary of the text stream; the multi-task joint training model comprises a real-time event summarization task model and a relevance prediction task model.
In an embodiment of the present invention, the specific text representation of each event text in the text stream is input into the trained multi-task joint training model; the relevance prediction task model computes the relevance label of the specific text representation of each event text with respect to the user query text, the real-time event summarization task model determines the text action of each event text in the text stream, and the real-time event summary of the text stream is generated according to the text action of each event text. The predicted relevance labels include highly relevant, relevant and irrelevant; the text actions include push and not-push; when the text action is push, the specific text representation of the event text is pushed into the real-time event summary.
In the embodiment of the present invention, knowledge-aware text representations of the event text and the user query text are generated with the help of the knowledge base; these knowledge-aware representations are interactively learned through the interactive multi-head attention network to generate interactive learning text representations of the event text and the user query text; these interactive learning representations are processed through the dynamic memory network to generate the specific text representation of the event text; and the specific text representation of the event text is input into the multi-task joint training model to generate the real-time event summary of the text stream. This effectively improves the informativeness and performance of the real-time event summary, reduces its redundancy, and thus improves the quality of the generated summary.
Embodiment 2:
FIG. 2 shows the implementation flow of the method for generating a real-time event summary provided in Embodiment 2 of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown, detailed as follows:
In step S201, a text stream and a user query text are received, the text stream comprising event texts ordered by time.
In an embodiment of the present invention, the text stream can be expressed as D = {d_1, d_2, …, d_t, …, d_T}, where T denotes the total number of event texts in the text stream and d_t is the event text at the t-th timestamp in the text stream. Each text in the text stream, d = {w_1^d, w_2^d, …, w_l^d}, consists of l words (the time subscript of the text is omitted here to simplify the mathematical notation). The user query text can be expressed as q = {w_1^q, w_2^q, …, w_n^q} and consists of n words.
In step S202, the initial context representation of the event text is obtained by extracting the hidden states of the words in the event text, and the initial context representation of the user query text is obtained by extracting the hidden states of the words in the user query text.
In an embodiment of the present invention, each word in the event text and each word in the user query text is mapped to a low-dimensional word embedding vector through a preset word embedding layer. The low-dimensional word embedding vector of each word in the event text is input into a first gated recurrent unit (GRU) to compute the hidden state of each word in the event text, and the low-dimensional word embedding vector of each word in the user query text is input into a second gated recurrent unit to compute the hidden state of each word in the user query text. The first gated recurrent unit and the second gated recurrent unit are mutually independent gated recurrent units.
Preferably, the hidden state of a word is computed by the gated recurrent unit as:
h_k = GRU(h_{k-1}, v_k), where v_k denotes the low-dimensional word embedding vector of the k-th word, h_k denotes the hidden state of the k-th word, and h_{k-1} denotes the hidden state of the (k-1)-th word.
In an embodiment of the present invention, the hidden states of all words in the event text are combined into the initial context representation of the event text H^d = [h_1^d, h_2^d, …, h_l^d], and the hidden states of all words in the user query text are combined into the initial context representation of the user query text H^q = [h_1^q, h_2^q, …, h_n^q], where h_i^d is the hidden state of the i-th word w_i^d in the event text and h_j^q is the hidden state of the j-th word w_j^q in the user query text.
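The hidden-state recurrence h_k = GRU(h_{k-1}, v_k) above can be sketched as follows. This is a minimal illustration, not the trained encoder of the embodiment: the gate weight matrices (toy identity matrices here) and the two-dimensional embeddings are assumptions chosen only to make the arithmetic visible.

```python
import math

def gru_cell(h_prev, v, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: h_k = GRU(h_{k-1}, v_k). Vectors are plain lists;
    weight matrices are lists of rows."""
    def matvec(W, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in W]
    def add(a, b):
        return [x + y for x, y in zip(a, b)]
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    z = [sigmoid(x) for x in add(matvec(Wz, v), matvec(Uz, h_prev))]  # update gate
    r = [sigmoid(x) for x in add(matvec(Wr, v), matvec(Ur, h_prev))]  # reset gate
    h_tilde = [math.tanh(x) for x in add(matvec(Wh, v),
               matvec(Uh, [ri * hi for ri, hi in zip(r, h_prev)]))]   # candidate state
    return [(1 - zi) * hi + zi * hti for zi, hi, hti in zip(z, h_prev, h_tilde)]

def encode(word_embeddings, dim):
    """Run the GRU over a word sequence; returns H = [h_1, ..., h_l]."""
    I = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]  # toy weights
    h, H = [0.0] * dim, []
    for v in word_embeddings:
        h = gru_cell(h, v, I, I, I, I, I, I)
        H.append(h)
    return H

H = encode([[0.1, 0.2], [0.3, -0.1]], dim=2)
print(len(H))  # one hidden state per word → 2
```

A trained model would learn the six gate matrices per GRU; the event text and the user query text each get their own, mutually independent, set.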
In step S203, the initial knowledge representation of the event text is generated according to the initial context representation of the event text, an attention mechanism and the knowledge base, and the initial knowledge representation of the query text is generated according to the initial context representation of the user query text, the attention mechanism and the knowledge base.
In an embodiment of the present invention, for each word in the event text and each word in the user query text, a candidate entity set consisting of a preset number of entity embeddings is selected from the knowledge base. The candidate entity set is denoted e_k = {e_k1, e_k2, …, e_kN}, where N is the total number of entity embeddings, e_k is the candidate entity set of the k-th word (k = 1, 2, …, l when the word belongs to the event text, and k = 1, 2, …, n when the word belongs to the user query text), and d_kb is the dimension of the candidate entities in the knowledge base.
In an embodiment of the present invention, the knowledge representation of each word in the event text is learned from the embeddings of the corresponding candidate entity set in the knowledge base. The learning process can be expressed as the attention-weighted sum m_k = Σ_i a_ki·e_ki, where m_k is the knowledge representation of the k-th word in the event text, e_ki is the i-th entity embedding in the candidate entity set of the k-th word, and a_ki is the context-guided attention weight of e_ki: a_ki = softmax(ρ(e_ki, μ(H^d))), with ρ(e_ki, μ(H^d)) = tanh(W_kb·e_ki + W_c·μ(H^d) + b_kb), where μ is the mean-pooling operation, W_kb and W_c are trained weight matrices, and b_kb is a bias value. The knowledge representations of all words in the event text constitute the initial knowledge representation of the event text M^d = [m_1^d, m_2^d, …, m_l^d]. Thus, the initial knowledge representation of the event text is obtained through the context-guided attention mechanism, the knowledge base and the initial context representation of the event text. Similarly, the initial knowledge representation of the user query text M^q is obtained through the context-guided attention mechanism, the knowledge base and the initial context representation of the user query text; for details, refer to the learning process of the initial knowledge representation of the event text, which is not repeated here.
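The context-guided attention over a candidate entity set can be sketched as below. It is a simplified illustration: scalar weights w_kb and w_c stand in for the trained matrices W_kb and W_c, and the toy entity embeddings are assumptions.

```python
import math

def mean_pool(H):
    """μ(H): column-wise mean of a list of hidden-state vectors."""
    n = len(H)
    return [sum(col) / n for col in zip(*H)]

def knowledge_repr(candidates, ctx, w_kb, w_c, b_kb):
    """m_k = Σ_i a_ki · e_ki, with a_ki = softmax_i(tanh(w_kb·e_ki + w_c·ctx + b_kb)).
    Scalar weights stand in for the trained matrices W_kb, W_c."""
    scores = [math.tanh(w_kb * sum(e) + w_c * sum(ctx) + b_kb) for e in candidates]
    mx = max(scores)
    exp = [math.exp(s - mx) for s in scores]
    a = [x / sum(exp) for x in exp]              # attention weights over entities
    dim = len(candidates[0])
    return [sum(a[i] * candidates[i][d] for i in range(len(candidates)))
            for d in range(dim)]

H_d = [[0.1, 0.2], [0.3, -0.1]]                  # initial context representation
entities = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # candidate entity set e_k
m_k = knowledge_repr(entities, mean_pool(H_d), w_kb=0.8, w_c=0.5, b_kb=0.0)
print(len(m_k))  # knowledge representation has the entity-embedding dimension → 2
```

Because the attention weights are a softmax, m_k is always a convex combination of the candidate entity embeddings, pulled toward the entities most compatible with the pooled context μ(H^d).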
In step S204, the knowledge-aware text representation of the event text is obtained by combining the initial context representation of the event text and the initial knowledge representation of the event text, and the knowledge-aware text representation of the user query text is obtained by combining the initial context representation of the user query text and the initial knowledge representation of the user query text.
In an embodiment of the present invention, the knowledge-aware text representation of the event text is the combination Z^d of H^d and M^d, and the knowledge-aware text representation of the user query text is the combination Z^q of H^q and M^q.
In step S205, the knowledge-aware text representation of the event text and the knowledge-aware text representation of the user query text are input into the interactive multi-head attention network, and the attention matrix of the event text and the attention matrix of the user query text are computed.
Preferably, the attention matrix of the event text is computed as:
A = [A_1, A_2, …, A_l]
A_i = softmax(ρ(Z_i^d, μ(Z^q)))
where μ is the mean-pooling operation, Z_i^d is the knowledge-aware text representation of the i-th word in the event text, A_i is the i-th row of the attention matrix A of the event text, ρ is the attention function with ρ(x, y) = tanh(U^(1)·x + W^(1)·y), and U^(1) and W^(1) are trained weight matrices of the interactive multi-head attention network.
Preferably, the attention matrix of the user query text is computed as:
B = [B_1, B_2, …, B_l]
B_i = softmax(ρ(Z_i^q, μ(Z^d)))
where B_i is the i-th row of the attention matrix B of the user query text.
In step S206, the interactive learning text representation of the event text is computed according to the attention matrix and the knowledge-aware text representation of the event text, and the interactive learning text representation of the user query text is computed according to the attention matrix and the knowledge-aware text representation of the user query text.
In an embodiment of the present invention, the interactive learning text representation of the event text is computed as o^d = A·Z^d, and the interactive learning text representation of the user query text as o^q = B·Z^q.
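A single-head simplification of the interactive attention of steps S205–S206 can be sketched as follows. Dot-product compatibility scores stand in for the trained attention function ρ with matrices U^(1) and W^(1), so the numbers are illustrative only; the query participates in the event-side attention through its mean-pooled representation.

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [x / s for x in e]

def mean_pool(Z):
    n = len(Z)
    return [sum(col) / n for col in zip(*Z)]

def interactive_repr(Z_d, Z_q):
    """Build a query-guided attention matrix A over the event words, then
    o_d = A · Z_d. A single head with dot-product scores stands in for the
    trained interactive multi-head attention network."""
    q = mean_pool(Z_q)
    # one attention row per event word: compatibility of word pair (i, j),
    # biased by the pooled query representation μ(Z_q)
    A = [softmax([sum(zi_k * zj_k for zi_k, zj_k in zip(z_i, z_j)) +
                  sum(zj_k * q_k for zj_k, q_k in zip(z_j, q))
                  for z_j in Z_d])
         for z_i in Z_d]
    dim = len(Z_d[0])
    o_d = [[sum(A[i][j] * Z_d[j][d] for j in range(len(Z_d))) for d in range(dim)]
           for i in range(len(Z_d))]
    return A, o_d

Z_d = [[0.2, 0.1], [0.0, 0.4], [0.3, 0.3]]   # knowledge-aware event-text repr
Z_q = [[0.5, 0.0], [0.1, 0.2]]               # knowledge-aware query repr
A, o_d = interactive_repr(Z_d, Z_q)
print(len(A), len(o_d))  # l attention rows, l interactive word reprs → 3 3
```

The symmetric computation with the roles of Z_d and Z_q swapped would yield B and o^q.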
In step S207, the specific text representation of the event text is generated based on the interactive learning text representation of the event text, the interactive learning text representation of the user query text and the trained dynamic memory network.
In an embodiment of the present invention, since the text stream comprises event texts ordered by time, each timestamp in the text stream corresponds to one step of the dynamic memory network. For each timestamp in turn, the memory content of the previous timestamp is obtained; this memory content, the interactive learning text representation of the event text at the current timestamp, and the user query text are input into the dynamic memory network, and the specific text representation of the event text at the current timestamp is computed through the attention mechanism of the dynamic memory network.
Preferably, the specific text representation of the event text is computed as the attention-weighted sum emb_t = Σ_j w_tj·o_tj^d, where emb_t is the specific text representation of the event text at timestamp t, o_tj^d is the interactive learning text representation of the j-th word of the event text at timestamp t, the attention function w_tj is a feed-forward neural network, δ is a function for flattening a matrix into vector form, W_a, U_a and V_a are the weight matrices in w_tj, b_a is the bias term in w_tj, and m_{t-1} is the memory content of the event text at timestamp t-1.
Preferably, the memory content corresponding to the event text at the current timestamp is computed according to the memory content corresponding to the event text at the previous timestamp and the specific text representation of the event text at the current timestamp, so that the memory contents corresponding to the event texts are generated in timestamp order and stored in the dynamic memory network.
Further preferably, the memory content corresponding to the event text at the current timestamp is computed through a third gated recurrent unit according to the memory content corresponding to the event text at the previous timestamp and the specific text representation of the event text at the current timestamp: m_t = GRU(emb_t, m_{t-1}), where the memory content at the initial timestamp is the interactive learning text representation o_n^q of the last word of the user query text.
In step S208, the specific text representation of the event text is input into the trained multi-task joint training model to generate the real-time event summary of the text stream; the multi-task joint training model comprises a real-time event summarization task model and a relevance prediction task model.
In an embodiment of the present invention, during the training of the multi-task joint training model, the objective function of the relevance prediction task model can be expressed as the cross-entropy
L_1 = −Σ_t Σ_{k=1}^{K} I{y_t^rel = k}·log p(ŷ_t^rel = k)
where the predicted distribution and its input are the outputs of the softmax layer and the fully connected layer of the relevance prediction task model respectively, ŷ_t^rel is the relevance label predicted for the specific text representation emb_t with respect to the user query text, and the weight matrices of the relevance prediction layers are trained during the training process. K is the number of relevance label classes; for example, K = 3 when the relevance labels include highly relevant, relevant and irrelevant. I{·} is an indicator function with I{true} = 1 and I{false} = 0. The weight matrices of the relevance prediction task are learned in a supervised manner on the training dataset {(d_t, q_t, y_t^rel)}, where d_t and q_t are respectively the event text and the user query text at timestamp t in the training dataset, and y_t^rel is the true relevance label of d_t with respect to q_t. Training minimizes this objective function, i.e. the cross-entropy between the predicted and the true relevance labels.
In an embodiment of the present invention, the objective function of the real-time event summarization task model can be expressed as the expected reward under the policy, L_2 = E_{π_θ}[R_T], where R_T = r(a_{1:T}) = λ·EG(a_{1:T}) + (1−λ)·nCG(a_{1:T}) is the expected reward computed from the predicted relevance labels, representing the score between the given text stream and the global action sequence a_{1:T} that generates the real-time event summary. Since the reward cannot be obtained before the final global action sequence is available, this expected reward is a typical delayed reward; r(·) is the reward function, and λ is a coefficient controlling the balance between the functions EG(·) and nCG(·). π_θ is the policy function; here an independent function approximator with parameters θ from the stochastic policy gradient algorithm is used to approximate the stochastic policy π_θ, where W_s is the weight matrix to be learned in the policy function, b_s is a bias value, a_t ∈ {0, 1} is the text action (a_t = 1 means the specific text representation of the event text at timestamp t is pushed into the real-time event summary, a_t = 0 means it is not pushed), and s_t = emb_t. During training, a reinforcement learning algorithm is used to optimize the objective function of the real-time event summarization task model; preferably, the policy gradient algorithm is used as the reinforcement learning algorithm, to improve the training effect of the real-time event summarization task model.
In an embodiment of the present invention, the multi-task joint training model can be expressed as L = γ_1·L_1 + γ_2·L_2, where L_1 is the objective function of the relevance prediction task model, L_2 is the objective function of the real-time event summarization task model, and γ_1 and γ_2 are the weight coefficients of L_1 and L_2 respectively. Training the multi-task joint training model means training the relevance prediction task model and the real-time event summarization task model synchronously, fully taking into account the interdependence between the relevance prediction task and the real-time event summarization task and effectively improving the quality of the generated real-time event summaries.
In the embodiment of the present invention, knowledge-aware text representations of the event text and the user query text are generated with the help of the knowledge base; these knowledge-aware representations are interactively learned through the interactive multi-head attention network to generate interactive learning text representations of the event text and the user query text; these interactive learning representations are processed through the dynamic memory network to generate the specific text representation of the event text; and the specific text representation of the event text is input into the multi-task joint training model to generate the real-time event summary of the text stream. This effectively improves the informativeness and performance of the real-time event summary, reduces its redundancy, and thus improves the quality of the generated summary.
Embodiment 3:
FIG. 3 shows the structure of the apparatus for generating a real-time event summary provided in Embodiment 3 of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown, including:
a text receiving module 31, configured to receive a text stream and a user query text, the text stream comprising event texts ordered by time;
a knowledge-aware representation generating module 32, configured to generate the knowledge-aware text representation of the event text and the knowledge-aware text representation of the user query text based on the event text, the user query text and a preset knowledge base;
an interactive representation generating module 33, configured to generate the interactive learning text representation of the event text and the interactive learning text representation of the user query text based on the knowledge-aware text representations of the event text and of the user query text and a trained interactive multi-head attention network;
a specific representation generating module 34, configured to generate the specific text representation of the event text based on the interactive learning text representations of the event text and of the user query text and a trained dynamic memory network; and
a real-time summary generating module 35, configured to input the specific text representation of the event text into a trained multi-task joint training model to generate the real-time event summary of the text stream, the multi-task joint training model comprising a real-time event summarization task model and a relevance prediction task model.
Preferably, as shown in FIG. 4, the knowledge-aware representation generating module 32 comprises:
a context generating module 321, configured to obtain the initial context representation of the event text by extracting the hidden states of the words in the event text, and to obtain the initial context representation of the user query text by extracting the hidden states of the words in the user query text;
an initial knowledge representation generating module 322, configured to generate the initial knowledge representation of the event text according to its initial context representation, an attention mechanism and the knowledge base, and to generate the initial knowledge representation of the query text according to its initial context representation, the attention mechanism and the knowledge base; and
a knowledge-aware representation combining module 323, configured to combine the initial context representation and the initial knowledge representation of the event text into the knowledge-aware text representation of the event text, and to combine the initial context representation and the initial knowledge representation of the user query text into the knowledge-aware text representation of the user query text.
Preferably, the interactive representation generating module 33 comprises:
an attention matrix computing module, configured to input the knowledge-aware text representations of the event text and of the user query text into the interactive multi-head attention network and to compute the attention matrix of the event text and the attention matrix of the user query text; and
an interactive representation generating sub-module, configured to compute the interactive learning text representation of the event text according to its attention matrix and knowledge-aware text representation, and to compute the interactive learning text representation of the user query text according to its attention matrix and knowledge-aware text representation.
Preferably, the specific representation generating module 34 comprises:
a memory content obtaining module, configured to obtain the memory content of the event text at the previous timestamp in the text stream; and
a specific representation generating sub-module, configured to input the memory content of the event text at the previous timestamp, the interactive learning text representation of the event text at the current timestamp and the interactive learning text representation of the user query text into the dynamic memory network to obtain the specific text representation of the event text at the current timestamp.
Preferably, the specific representation generating module 34 further comprises:
a memory content computing module, configured to compute the memory content of the event text at the current timestamp according to the specific text representation of the event text at the current timestamp and the memory content of the event text at the previous timestamp.
Preferably, the apparatus for generating a real-time event summary further comprises:
a training module, configured to obtain training data and to train the real-time event summarization task and the relevance prediction task simultaneously according to the training data, the real-time event summarization task being trained with a policy gradient algorithm and the relevance prediction task being trained in a supervised manner.
In the embodiment of the present invention, knowledge-aware text representations of the event text and the user query text are generated with the help of the knowledge base; these knowledge-aware representations are interactively learned through the interactive multi-head attention network to generate interactive learning text representations of the event text and the user query text; these interactive learning representations are processed through the dynamic memory network to generate the specific text representation of the event text; and the specific text representation of the event text is input into the multi-task joint training model to generate the real-time event summary of the text stream. This effectively improves the informativeness and performance of the real-time event summary, reduces its redundancy, and thus improves the quality of the generated summary.
For the implementation of each unit of the apparatus for generating a real-time event summary, reference may be made to the detailed description of the corresponding steps in Embodiment 1 and Embodiment 2, which is not repeated here.
Each unit of the apparatus for generating a real-time event summary may be implemented by a corresponding hardware or software unit; the units may be independent software or hardware units, or may be integrated into one software and hardware unit, which is not intended to limit the present invention.
Embodiment 4:
FIG. 5 shows the structure of the computer device provided in Embodiment 4 of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown.
The computer device 5 of the embodiment of the present invention comprises a processor 50, a memory 51, and a computer program 52 stored in the memory 51 and executable on the processor 50. When executing the computer program 52, the processor 50 implements the steps in the foregoing method embodiments, for example steps S101 to S105 shown in FIG. 1; alternatively, when executing the computer program 52, the processor 50 realizes the functions of the units in the foregoing apparatus embodiments, for example the functions of units 31 to 35 shown in FIG. 3.
In the embodiment of the present invention, knowledge-aware text representations of the event text and the user query text are generated with the help of the knowledge base; these knowledge-aware representations are interactively learned through the interactive multi-head attention network to generate interactive learning text representations of the event text and the user query text; these interactive learning representations are processed through the dynamic memory network to generate the specific text representation of the event text; and the specific text representation of the event text is input into the multi-task joint training model to generate the real-time event summary of the text stream. This effectively improves the informativeness and performance of the real-time event summary, reduces its redundancy, and thus improves the quality of the generated summary.
Embodiment 5:
An embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps in the foregoing method embodiments, for example steps S101 to S105 shown in FIG. 1; alternatively, when executed by a processor, the computer program realizes the functions of the units in the foregoing apparatus embodiments, for example the functions of units 31 to 35 shown in FIG. 3.
In the embodiment of the present invention, knowledge-aware text representations of the event text and the user query text are generated with the help of the knowledge base; these knowledge-aware representations are interactively learned through the interactive multi-head attention network to generate interactive learning text representations of the event text and the user query text; these interactive learning representations are processed through the dynamic memory network to generate the specific text representation of the event text; and the specific text representation of the event text is input into the multi-task joint training model to generate the real-time event summary of the text stream. This effectively improves the informativeness and performance of the real-time event summary, reduces its redundancy, and thus improves the quality of the generated summary.
The computer-readable storage medium of the embodiment of the present invention may include any entity or device capable of carrying computer program code, or a recording medium, for example a ROM/RAM, magnetic disk, optical disk, flash memory or other memory.
The above are merely preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

  1. A method for generating a real-time event summary, characterized in that the method comprises the following steps:
    receiving a text stream and a user query text, the text stream comprising event texts ordered by time;
    generating a knowledge-aware text representation of the event text and a knowledge-aware text representation of the user query text based on the event text, the user query text and a preset knowledge base;
    generating an interactive learning text representation of the event text and an interactive learning text representation of the user query text based on the knowledge-aware text representation of the event text, the knowledge-aware text representation of the user query text and a trained interactive multi-head attention network;
    generating a specific text representation of the event text based on the interactive learning text representation of the event text, the interactive learning text representation of the user query text and a trained dynamic memory network;
    inputting the specific text representation of the event text into a trained multi-task joint training model to generate a real-time event summary of the text stream, the multi-task joint training model comprising a real-time event summarization task model and a relevance prediction task model.
  2. The method according to claim 1, characterized in that the step of generating the knowledge-aware text representation of the event text and the knowledge-aware text representation of the user query text comprises:
    obtaining the initial context representation of the event text by extracting the hidden states of the words in the event text, and obtaining the initial context representation of the user query text by extracting the hidden states of the words in the user query text;
    generating the initial knowledge representation of the event text according to the initial context representation of the event text, an attention mechanism and the knowledge base, and generating the initial knowledge representation of the query text according to the initial context representation of the user query text, the attention mechanism and the knowledge base;
    combining the initial context representation of the event text and the initial knowledge representation of the event text into the knowledge-aware text representation of the event text, and combining the initial context representation of the user query text and the initial knowledge representation of the user query text into the knowledge-aware text representation of the user query text.
  3. The method according to claim 1, characterized in that the step of generating the interactive learning text representation of the event text and the interactive learning text representation of the user query text comprises:
    inputting the knowledge-aware text representation of the event text and the knowledge-aware text representation of the user query text into the interactive multi-head attention network, and computing the attention matrix of the event text and the attention matrix of the user query text;
    computing the interactive learning text representation of the event text according to the attention matrix and the knowledge-aware text representation of the event text, and computing the interactive learning text representation of the user query text according to the attention matrix and the knowledge-aware text representation of the user query text.
  4. The method according to claim 1, characterized in that the step of generating the specific text representation of the event text comprises:
    obtaining the memory content of the event text at the previous timestamp in the text stream;
    inputting the memory content of the event text at the previous timestamp, the interactive learning text representation of the event text at the current timestamp and the interactive learning text representation of the user query text into the dynamic memory network to obtain the specific text representation of the event text at the current timestamp.
  5. The method according to claim 4, characterized in that the step of generating the specific text representation of the event text further comprises:
    computing the memory content of the event text at the current timestamp according to the specific text representation of the event text at the current timestamp and the memory content of the event text at the previous timestamp.
  6. The method according to claim 1, characterized in that, before the step of receiving the text stream and the user query text, the method further comprises:
    obtaining training data, and training the real-time event summarization task and the relevance prediction task simultaneously according to the training data, the real-time event summarization task being trained with a policy gradient algorithm and the relevance prediction task being trained in a supervised manner.
  7. An apparatus for generating a real-time event summary, characterized in that the apparatus comprises:
    a text receiving module, configured to receive a text stream and a user query text, the text stream comprising event texts ordered by time;
    a knowledge-aware representation generating module, configured to generate the knowledge-aware text representation of the event text and the knowledge-aware text representation of the user query text based on the event text, the user query text and a preset knowledge base;
    an interactive representation generating module, configured to generate the interactive learning text representation of the event text and the interactive learning text representation of the user query text based on the knowledge-aware text representation of the event text, the knowledge-aware text representation of the user query text and a trained interactive multi-head attention network;
    a specific representation generating module, configured to generate the specific text representation of the event text based on the interactive learning text representation of the event text, the interactive learning text representation of the user query text and a trained dynamic memory network; and
    a real-time summary generating module, configured to input the specific text representation of the event text into a trained multi-task joint training model to generate the real-time event summary of the text stream, the multi-task joint training model comprising a real-time event summarization task model and a relevance prediction task model.
  8. The apparatus according to claim 7, characterized in that the knowledge-aware representation generating module comprises:
    a context generating module, configured to obtain the initial context representation of the event text by extracting the hidden states of the words in the event text, and to obtain the initial context representation of the user query text by extracting the hidden states of the words in the user query text;
    an initial knowledge representation generating module, configured to generate the initial knowledge representation of the event text according to the initial context representation of the event text, an attention mechanism and the knowledge base, and to generate the initial knowledge representation of the query text according to the initial context representation of the user query text, the attention mechanism and the knowledge base; and
    a knowledge-aware representation combining module, configured to combine the initial context representation of the event text and the initial knowledge representation of the event text into the knowledge-aware text representation of the event text, and to combine the initial context representation of the user query text and the initial knowledge representation of the user query text into the knowledge-aware text representation of the user query text.
  9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 6.
  10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 6.
PCT/CN2019/088630 2019-05-27 2019-05-27 Method, apparatus, device and storage medium for generating real-time event summaries WO2020237479A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/088630 WO2020237479A1 (zh) Method, apparatus, device and storage medium for generating real-time event summaries

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/088630 WO2020237479A1 (zh) Method, apparatus, device and storage medium for generating real-time event summaries

Publications (1)

Publication Number Publication Date
WO2020237479A1 true WO2020237479A1 (zh) 2020-12-03

Family

ID=73553569

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/088630 WO2020237479A1 (zh) 2019-05-27 2019-05-27 实时事件摘要的生成方法、装置、设备及存储介质

Country Status (1)

Country Link
WO (1) WO2020237479A1 (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106484767A (zh) * 2016-09-08 2017-03-08 Institute of Information Engineering, Chinese Academy of Sciences A cross-media event extraction method
CN108763211A (zh) * 2018-05-23 2018-11-06 Institute of Automation, Chinese Academy of Sciences Automatic summarization method and system incorporating entailed knowledge
CN109344391A (zh) * 2018-08-23 2019-02-15 Kunming University of Science and Technology Neural-network-based multi-feature fusion method for generating Chinese news text summaries

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106484767A (zh) * 2016-09-08 2017-03-08 Institute of Information Engineering, Chinese Academy of Sciences A cross-media event extraction method
CN108763211A (zh) * 2018-05-23 2018-11-06 Institute of Automation, Chinese Academy of Sciences Automatic summarization method and system incorporating entailed knowledge
CN109344391A (zh) * 2018-08-23 2019-02-15 Kunming University of Science and Technology Neural-network-based multi-feature fusion method for generating Chinese news text summaries

Similar Documents

Publication Publication Date Title
CN110297885B (zh) Method, apparatus, device and storage medium for generating real-time event summaries
WO2019114512A1 (zh) Method and apparatus for customer service, electronic device, and computer-readable storage medium
Zhang et al. Cross-domain recommendation with semantic correlation in tagging systems
Cai et al. Intelligent question answering in restricted domains using deep learning and question pair matching
RU2720074C2 (ru) Способ и система создания векторов аннотации для документа
CN103886047A (zh) 面向流式数据的分布式在线推荐方法
CN108833933A (zh) 一种使用支持向量机推荐视频流量的方法及系统
CN111563158A (zh) 文本排序方法、排序装置、服务器和计算机可读存储介质
Xiao et al. User behavior prediction of social hotspots based on multimessage interaction and neural network
CN115510226A (zh) 一种基于图神经网络的情感分类方法
Liu et al. Heterogeneous relational graph neural networks with adaptive objective for end-to-end task-oriented dialogue
Yan et al. Response selection from unstructured documents for human-computer conversation systems
CN112231554A (zh) 一种搜索推荐词生成方法、装置、存储介质和计算机设备
Pulikottil et al. Onet–a temporal meta embedding network for mooc dropout prediction
Garg et al. Reinforced approximate exploratory data analysis
WO2020237479A1 (zh) Method, apparatus, device and storage medium for generating real-time event summaries
Wang College physical education and training in big data: a big data mining and analysis system
Yao et al. Scalable algorithms for CQA post voting prediction
Xu et al. A novel data-to-text generation model with transformer planning and a wasserstein auto-encoder
Zeng et al. Multi-aspect attentive text representations for simple question answering over knowledge base
CN113849641A (zh) Knowledge distillation method and system for cross-domain hierarchical relations
Li et al. Personalized education resource recommendation method based on deep learning in intelligent educational robot environments
Du et al. Employ Multimodal Machine Learning for Content Quality Analysis
Matsuda et al. Benchmark for Personalized Federated Learning
Wang et al. MOOC resources recommendation based on heterogeneous information network

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19931483

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19931483

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 090622)
