WO2023124648A1 - Text summary generation method and apparatus, device, and storage medium - Google Patents

Text summary generation method and apparatus, device, and storage medium

Info

Publication number
WO2023124648A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
target
features
segment
target text
Prior art date
Application number
PCT/CN2022/133167
Other languages
English (en)
French (fr)
Inventor
高建清
戚婷
闫莉
孙境廷
Original Assignee
科大讯飞股份有限公司 (iFLYTEK Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 科大讯飞股份有限公司 (iFLYTEK Co., Ltd.)
Publication of WO2023124648A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/335Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34Browsing; Visualisation therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/194Calculation of difference between files

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular to a method, device, equipment and storage medium for generating a text summary.
  • Text summary generation refers to the content extraction of long texts, so as to extract information that can represent the core content of the text. Text summary can help people grasp the text content more directly and effectively.
  • Automatic text summarization technology can be divided into extractive summarization and generative (abstractive) summarization according to how the summary is produced. Extractive summarization extracts words or sentences from the original text to form the summary, so all summary content comes from the original text; generative summarization may produce new words and phrases that are not in the original text: it first performs semantic understanding of the text content and then, based on that semantics, generates a passage that summarizes the given text.
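As a non-authoritative illustration of the extractive approach described above (the function and the word-frequency scoring heuristic below are hypothetical stand-ins, not the patent's method), sentences can be scored by how frequent their words are across the document and the top-scoring ones kept:

```python
# Toy extractive summarizer: score each sentence by the document-wide
# frequency of its words, keep the k best, and return them in original order.
from collections import Counter

def extractive_summary(sentences, k=1):
    # Document-level word frequencies serve as a crude importance signal.
    freq = Counter(w for s in sentences for w in s.lower().split())
    def score(s):
        words = s.lower().split()
        return sum(freq[w] for w in words) / max(len(words), 1)
    # Keep the k highest-scoring sentences, restored to document order.
    top = sorted(sorted(sentences, key=score, reverse=True)[:k],
                 key=sentences.index)
    return top

sents = ["The product launch is planned for March.",
         "Lunch was served at noon.",
         "The March launch plan covers product marketing."]
print(extractive_summary(sents, k=1))
```

A generative summarizer would instead encode the sentences and decode new text, which is what the model described later in this document does.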
  • the content of a target text for which a summary needs to be generated is often multifaceted, and different people may be interested in different aspects of it; therefore, different people have different requirements for the summary of the same target text.
  • this application proposes a text summary generation method, apparatus, device, and storage medium.
  • text summaries meeting different user needs can be generated for the same target text.
  • a method for generating a text summary, comprising:
  • acquiring a target text and a reference text, wherein the reference text is determined based on the content of the target text that the user pays attention to;
  • an apparatus for generating a text summary, comprising:
  • a data acquisition unit configured to acquire target text and reference text, wherein the reference text is determined based on the content of the target text concerned by the user;
  • a summary generation unit configured to perform summary generation on the target text based on locating the content associated with the reference text in the target text, so as to obtain a target text summary corresponding to the reference text.
  • a device for generating a text summary, comprising:
  • the memory is connected to the processor and is configured to store a program;
  • the processor is configured to implement the above-mentioned text summary generation method by running the program in the memory.
  • a storage medium on which a computer program is stored; when the computer program is run by a processor, the above-mentioned text summary generation method is implemented.
  • the text summary generation method proposed by this application uses the reference text as a reference when generating a summary for the target text: by locating the content associated with the reference text in the target text, summary generation is performed on the target text, and a target text summary corresponding to the reference text is obtained.
  • when the method generates a summary for the target text, the content of the target text and the content in the target text associated with the reference text jointly determine the resulting text summary.
  • this text summary generation method can generate text summaries that meet different user needs for the same target text.
  • Fig. 1 is a schematic flow chart of a method for generating a text summary provided in the embodiment of the present application
  • Fig. 2 is a schematic structural diagram of the article interactive semantic retrieval model provided by the embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a semantic retrieval model based on retrieval candidate information enhancement provided by an embodiment of the present application
  • FIG. 4 is a schematic structural diagram of a text summary generation model based on an attention mechanism provided in an embodiment of the present application
  • FIG. 5 is a schematic structural diagram of another text summary generation model based on the attention mechanism provided by the embodiment of the present application.
  • Fig. 6 is a schematic structural diagram of a word-sentence-chapter level information coding model provided by the embodiment of the present application.
  • Fig. 7 is a schematic diagram of word segmentation feature extraction provided by the embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of an information fusion model provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a text summary generation device provided by an embodiment of the present application.
  • Fig. 10 is a schematic structural diagram of a device for generating a text summary provided by an embodiment of the present application.
  • the technical solution of the embodiment of the present application is applicable to the application scenario of generating a text summary.
  • a text summary consistent with the user's focus can be generated, thereby meeting the text summary needs of different users.
  • the above-mentioned application scenarios for generating text summaries specifically refer to scenarios in which summary content needs to be generated, including but not limited to the generation of meeting minutes, the generation of document summaries, and the extraction of news highlights.
  • the content of the meeting is usually multi-faceted, and the content that different participants are concerned about is usually different.
  • the heads of the company's design department, product department, and marketing department who participated in the seminar each cared about different aspects of the content.
  • the design department pays more attention to the improvement of product design schemes
  • the product department pays more attention to product definition and R&D planning
  • the marketing department pays more attention to the market positioning of new products. Therefore, the content of meeting minutes required by different departments is different.
  • a conventional text summary generation scheme, whether extractive or generative, can only perform technical processing on the text itself to determine its main content, and therefore cannot generate text summaries with different emphases for different concerns.
  • therefore, the embodiment of the present application proposes a text summary generation scheme that refers to the target text content the user is concerned about when generating a summary for the target text, so that different text summaries can be generated for different concerns, satisfying different users' personalized needs for summary content.
  • the technical solution of the embodiment of the application is applicable not only to generating a summary of text to obtain a summary in text form, but also to generating a summary of speech to obtain a summary in text or voice form, or to generating a summary of text and obtaining the summary in audio form.
  • the embodiment of the present application proposes a method for generating a text summary, as shown in Figure 1, the method includes:
  • the above-mentioned target text refers to the text for which a summary needs to be generated; the target text may be text of any content and in any language, obtained by any means.
  • the target text can be directly obtained text, such as academic literature, news releases, or books, or text obtained by speech recognition, such as the recognition result of a conference recording or the recognition result of a speaker's speech.
  • any form of data content can be converted into a text form, so as to serve as the above-mentioned target text, and the summary of the target text can be generated through subsequent processing.
  • the above reference text is determined based on the target text content that the user pays attention to.
  • the reference text can represent the user's interest in or focus on the target text content, and at the same time represents the user's requirements for the content of the generated summary; it provides a reference for generating the summary of the target text, so that a target text summary can be generated that meets the user's concerns or contains content the user is interested in.
  • the reference text can be input by the user, or can be preset and determined before generating the text summary.
  • the specific form of the reference text may be a fixed sentence pattern, a key word or a phrase, or a short text sentence or a text paragraph, or even a logical combination of multiple texts.
  • the reference text can be a combination of retrieval conditions, such as related to A and related to B (denoted as A&B), or related to A or related to B (denoted as A|B).
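The patent does not specify how such condition combinations are evaluated; as a minimal hypothetical sketch (using plain substring containment as a stand-in for the real semantic matching described later), the logic of A&B and A|B conditions could look like:

```python
# Hypothetical evaluator for compound retrieval conditions: "A&B" requires
# both sub-conditions to match a fragment, "A|B" requires either.
# Substring containment stands in for semantic matching.
def matches(condition, fragment):
    if "&" in condition:
        return all(matches(c, fragment) for c in condition.split("&"))
    if "|" in condition:
        return any(matches(c, fragment) for c in condition.split("|"))
    return condition.strip().lower() in fragment.lower()

frag = "product positioning should consider subjective experience"
print(matches("product positioning & subjective experience", frag))
print(matches("market share | product positioning", frag))
```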
  • for example, in the scenario of generating meeting minutes, speech recognition is performed on the meeting recording to obtain the text corresponding to the recording, which serves as the above-mentioned target text.
  • meanwhile, the reference text input by the user, representing the meeting content the user is interested in or concerned about, is obtained.
  • the reference text may be a phrase or short sentence summarized by the user based on the meeting content, or a simple meeting record recorded by the user at the meeting place, or a keyword, phrase, retrieval condition, etc. determined by the user based on the desired meeting minutes.
  • after the target text and the reference text are obtained, the subsequent summary generation process can be performed: summary generation is performed on the above target text to obtain a target text summary that meets the user's concerns or contains the content the user is interested in.
  • the above-mentioned "obtaining the target text and reference text" may mean obtaining the original texts of the target text and the reference text and then performing feature extraction on them to obtain the features of the target text and of the reference text for the subsequent summary generation; alternatively, the features of the target text and the reference text may be acquired directly for the subsequent summary generation.
  • the above-mentioned associated content of the reference text refers to text content related to the reference text; for example, text content whose similarity to the reference text exceeds a set similarity threshold, or text content that is semantically similar or related to the reference text, can all serve as the associated content of the reference text.
  • specifically, the embodiment of the present application compares the reference text with each text segment of the target text in turn and determines their textual or semantic similarity, thereby determining the relevance between the reference text and each target text segment and thus locating and identifying the content in the target text associated with the reference text.
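As a hedged sketch of the similarity-threshold variant mentioned above (the bag-of-words cosine measure and the 0.2 threshold are illustrative choices, not values prescribed by the patent), locating associated content could look like:

```python
# Illustrative "associated content" locator: compare the reference text with
# each target-text fragment using bag-of-words cosine similarity and keep
# fragments whose similarity exceeds a threshold.
import math
from collections import Counter

def cosine(a, b):
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def locate_associated(reference, fragments, threshold=0.2):
    return [f for f in fragments if cosine(reference, f) > threshold]

fragments = ["product positioning should consider young people",
             "the meeting ended at five"]
print(locate_associated("product positioning", fragments))
```

The patent's actual implementation measures semantic similarity with a trained retrieval model, described later; this lexical version only illustrates the thresholding idea.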
  • the content of the target text associated with the reference text is more valuable for generating a text summary corresponding to the reference text: because the associated content contains text information related to the reference text, focusing on it when generating the summary of the target text means the finally generated summary contains more information associated with the reference text, so that the generated target text summary matches the reference text.
  • therefore, when generating the summary of the target text, the embodiment of the present application focuses on the content of the target text associated with the reference text, supplemented by the other content of the target text, so that the proportion of associated content in the final target text summary is higher and the final summary is more strongly correlated with the reference text; that is, a target text summary corresponding to the reference text is obtained.
  • specifically, the embodiment of the present application locates the text fragments related to the reference text in the target text and, on that basis, performs summary generation on the full text of the target text, so as to obtain a target text summary corresponding to the reference text.
  • the aforementioned text fragments may be text sentences, text segments, or text phrases.
  • text fragments related to the reference text are identified in the target text through methods such as textual comparison and semantic comparison. For example, as long as the textual or semantic similarity between a fragment of the target text and the reference text is non-zero, the fragment can be considered valuable for generating the target text summary corresponding to the reference text, and is therefore determined to be a fragment related to the reference text.
  • on this basis, the full-text content of the target text is processed to generate a summary; that is, a text summary is generated for the full-text content of the target text.
  • during this summary generation, text fragments related to the reference text contribute more to the generated target text summary than fragments unrelated to the reference text, so that the generated summary contains more information from the fragments related to the reference text.
  • furthermore, each text fragment related to the reference text can be assigned a different contribution degree according to its degree of relevance to the reference text, so that the finally generated target text summary has a higher degree of relevance to the reference text.
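One hypothetical way to turn per-fragment relevance scores into contribution degrees is a softmax normalization; the function name and temperature parameter below are illustrative assumptions, not part of the patent:

```python
# Toy contribution weighting: softmax over relevance scores, so that more
# relevant fragments receive larger (but still normalized) weights.
import math

def contribution_weights(relevance_scores, temperature=1.0):
    exps = [math.exp(s / temperature) for s in relevance_scores]
    z = sum(exps)
    return [e / z for e in exps]

w = contribution_weights([0.9, 0.1, 0.0])
print([round(x, 3) for x in w])
```

The temperature controls how sharply the weights concentrate on the most relevant fragment.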
  • the original text of a certain meeting contains a total of 200 paragraphs of conversation text, such as the following text of the meeting (due to the long content of the original text of the meeting, for the sake of simplicity, ellipses are used to represent the omitted parts of the original text of the meeting):
  • Paragraph 1 Today we will mainly discuss the launch plan of new scanning pen products.
  • Paragraph 2 It is planned to launch a four-month touring exhibition in March 2019 to carry out a technological innovation day.
  • Paragraph 16 The following is a comprehensive analysis of product trends to help the product stand out from the fierce market. We will select 14 cities and actively look for self-media cooperation. The cooperation content includes...
  • Paragraph 68 Young people prefer designs with a cool appearance. We must take this part of the market seriously and find an entry point. Strengthen the publicity effect...
  • Paragraph 79 Now there is a demo that you can take a look at. Before the release of Double Eleven, we conducted a more systematic test on the product. The correct rate on the test set can be as high as 98%, which is 30% higher than that of competing products, which is enough to form a generation gap.
  • Paragraph 88 The test set data test has no effect. Because if we want to completely compare with competing products, we actually have to compare the same things. If it is the same episode, different people feel differently.
  • Paragraph 88 Product positioning should take young people into consideration, add this part of the design...
  • Paragraph 162 The interface design also needs reasonable interaction, considering the connection of button properties and jumps... By the way, we should also consider the user's subjective experience for the test problem mentioned above, and we need to set up some comparisons of subjective experience. For some special cases, such as smoothness, multi-line selection, etc., feel the effect of using the product.
  • Paragraph 200 This is the end of today's meeting, and all departments should pay attention to cooperation.
  • in the above example, the text content related to the retrieval condition "subjective experience of product effect" input by user A is: "By the way, we should also consider the user's subjective experience for the test problem mentioned above, and we need to set up some comparisons of subjective experience." and "For some special cases, such as smoothness, multi-line selection, etc., feel the effect of using the product."
  • summary generation is then performed on the original text of the meeting, finally producing the meeting summary: "The effect of the new scanning pen product is not only based on the data results; some subjective experience evaluation schemes also need to be considered."
  • the finally obtained meeting minutes match the retrieval condition "subjective experience of product effect" input by user A; that is, they are meeting minutes representing information relevant to the subjective experience of product effects. Therefore, by locating the text content related to user A's retrieval condition in the original text of the meeting, the technical solution of the embodiment of the present application can generate meeting minutes corresponding to that retrieval condition, thereby meeting user A's meeting-minutes needs.
  • the search condition input by user B is "subjective experience of product effect & product positioning"
  • the original text of the meeting is used as the target text
  • the search condition input by user B is used as the reference text.
  • the text content related to the retrieval condition "subjective experience of product effect & product positioning" input by user B is: "Product positioning should take young people into consideration, and this part of the design should be added.", "By the way, we should also consider the user's subjective experience for the test problem mentioned above, and we need to set up some comparisons of subjective experience." and "For some special cases, such as smoothness, multi-line selection, etc., feel the effect of using the product."
  • summary generation is then performed on the original text of the meeting, finally producing the following meeting minutes:
  • the new product positioning of the scanning pen should consider that young people prefer cool-looking designs.
  • the use effect is not only based on the data results, but also needs to consider some subjective experience solutions.
  • the finally obtained meeting minutes match the retrieval condition "subjective experience of product effect & product positioning" input by user B; that is, they are meeting minutes representing information relevant to both the subjective experience of product effects and product positioning. Therefore, by locating the text content related to user B's retrieval condition in the original text of the meeting, the technical solution of the embodiment of the present application can generate meeting minutes corresponding to that retrieval condition, thereby meeting user B's meeting-minutes needs.
  • the text summary generation method proposed in the embodiment of the present application uses the reference text as a reference when generating a summary for the target text: by locating the content associated with the reference text in the target text, summary generation is performed on the target text, and the target text summary corresponding to the reference text is obtained.
  • the embodiment of the present application pre-trains a text summary generation model based on the attention mechanism, which generates the summary of the target text from the target text and the reference text and obtains the target text summary corresponding to the reference text.
  • the attention-based text summary generation model is trained on pre-collected parallel data of target text, reference text, and target text summary. For example, a large amount of parallel data consisting of original meeting text, user retrieval conditions, and meeting-minutes text is collected in advance and, after data preprocessing, used to train the model.
  • the original meeting text data can be collected from meeting audio and obtained by voice transcription; of course, meeting text data can also be collected directly, such as the full meeting transcript compiled by a stenographer.
  • the retrieval conditions are not limited to fixed sentence patterns from fixed templates: users may also input keywords or phrases they care about, short meeting notes, or even a logical combination of multiple sub-conditions, such as related to A and related to B (denoted as A&B), or related to A or related to B (denoted as A|B).
  • the specific source or specific content of the retrieval conditions is not limited.
  • the meeting minutes are intended to be a highly condensed, highly generalized summary of the meeting content.
  • data preprocessing first splits the original meeting text into sentences. The sentence splitting may segment clauses or whole sentences by punctuation marks, or may split according to a fixed-size word window or a sliding window; this application does not impose a specific sentence-splitting method, and here whole sentences are segmented by punctuation marks. Secondly, the sentence-split meeting text is processed into input sequences of words.
  • the text word segmentation can use existing technology, which will not be detailed here.
  • for example, after word segmentation, the original meeting sentence "The next key work is a follow-up of the Spring Fair." becomes a token sequence such as "the next / key / work / is / a / follow-up / of / the Spring Fair" (segmentation is performed on the original Chinese sentence, so this English rendering is only illustrative).
  • the retrieval condition is treated directly as plain text, and its preprocessing only needs to convert the condition text into an input sequence of words; if the retrieval condition is a compound of multiple sub-conditions, each sub-condition is converted into a word sequence. For the corresponding meeting minutes, preprocessing likewise converts the text data into a word sequence.
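The preprocessing steps above can be sketched as follows; the regular expression and the whitespace tokenizer are illustrative stand-ins, since real Chinese word segmentation would use an existing segmentation tool, as the document notes:

```python
# Sketch of the described preprocessing: split a transcript into sentences on
# end-of-sentence punctuation, then turn each sentence into a token sequence.
import re

def split_sentences(text):
    # Split right after sentence-final punctuation (Chinese or Western),
    # absorbing any following whitespace; drop empty leftovers.
    parts = re.split(r"(?<=[。！？.!?])\s*", text)
    return [p for p in parts if p]

def tokenize(sentence):
    # Whitespace tokenization as a placeholder for a word segmenter.
    return sentence.split()

doc = "Today we discuss the launch plan. It is planned for March 2019."
sentences = split_sentences(doc)
print(sentences)
print(tokenize(sentences[0]))
```

A fixed or sliding word window, the alternative splitting strategy mentioned above, would instead chunk the token stream into equal-length spans.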
  • the attention-based text summary generation model is a sequential output model: each decoding step outputs one word token, and the output tokens are finally combined to obtain the summary text.
  • during training, the features of the word sequence of the original meeting text and the features of the word sequence of the retrieval condition are obtained and input into the attention-based text summary generation model. The model loss is determined by comparing the word sequence of the meeting minutes output by the model with the word sequence of the pre-collected reference meeting minutes, and the model parameters are updated based on this loss, so that the trained model can take the features of the target text and the reference text as input and generate a target text summary corresponding to the reference text.
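The training signal described above, comparing the model's output word sequence against the reference minutes, can be illustrated with a toy cross-entropy computation (the distributions and token indices are made up for illustration; the real model's loss is computed over its full vocabulary):

```python
# Toy per-step cross-entropy between predicted token distributions and the
# reference minutes' token ids, averaged over the output sequence.
import math

def cross_entropy(predicted_dists, target_ids):
    # predicted_dists: one probability distribution per decoding step.
    # target_ids: index of the reference token at each step.
    return -sum(math.log(dist[t])
                for dist, t in zip(predicted_dists, target_ids)) / len(target_ids)

preds = [[0.7, 0.2, 0.1],   # step 1: reference token is index 0
         [0.1, 0.8, 0.1]]   # step 2: reference token is index 1
loss = cross_entropy(preds, [0, 1])
print(round(loss, 4))
```

Gradient descent on this loss is what "modifying the model parameters based on the model loss" amounts to in practice.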
  • next, the specific processing of the text summary generation method proposed in the embodiment of the present application is introduced, taking the generation of meeting minutes satisfying different users' retrieval conditions as an example.
  • the original text of the meeting is used to represent the above-mentioned target text
  • the retrieval conditions or retrieval sub-conditions input by the user are used to represent the above-mentioned reference text
  • specifically, generating a text summary for the original meeting text that matches the retrieval conditions input by the user illustrates the specific processing of the technical solution of the embodiment of the present application: locating the content associated with the reference text in the target text and generating the target text summary corresponding to the reference text.
  • when obtaining the target text and the reference text, their original texts may be obtained and feature extraction then performed on them for the subsequent summary generation processing.
  • the features of the target text and the features of the reference text can also be obtained directly, and used for subsequent text summary generation processing.
  • A1 By determining the correlation between each text segment in the target text and the reference text, locate the text segment related to the reference text from the target text.
  • the above-mentioned text segment may be text content of any granularity such as a text sentence, a text segment, or a text phrase.
  • the target text is divided into text sentences, and the divided text sentences are used as the above-mentioned text fragments.
  • the target text may be divided into text sentences according to its punctuation, or based on a fixed-size word window or a sliding window; this application does not impose a specific sentence-splitting method.
  • sentences are divided according to the punctuation in the target text.
  • the correlation between each text segment of the target text and the reference text is determined through a semantic measurement; that is, by comparing the semantic similarity between the reference text and each text segment of the target text, the relevance of the reference text to each segment is determined.
  • the features of the target text and the features of the reference text are obtained respectively. Then, according to the features of each text segment in the target text and the features of the reference text, the correlation between each text segment in the target text and the reference text is respectively determined. According to the correlation between the reference text and each text segment in the target text, the text segment related to the reference text can be located from the target text. For example, in the target text, a text segment whose correlation degree with the reference text is not zero is a text segment related to the reference text.
  • the features of the target text can be obtained by extracting features from each of its text segments and then combining the per-segment features; thus, based on the features of each text segment, both the overall features of the target text and the features of its individual text fragments can be determined.
  • the features of the target text are determined by extracting the features of each text segment of the target text
  • the features of the reference text are determined by extracting the features of each text segment of the reference text.
  • for a user retrieval sub-condition, determining its semantic similarity to each sentence of the original meeting text can be realized through the above-described process of determining the semantic similarity between the reference text and a text segment of the target text.
  • the embodiment of the present application builds a semantic fuzzy retrieval model that extracts the semantic similarity score between each sub-condition of the user's retrieval condition and each sentence of the original meeting text; that is, it determines the relevance between the reference text and each text segment of the target text.
  • the semantic fuzzy retrieval model takes a retrieval sub-condition text and the original meeting text as input and, for that sub-condition, outputs each sentence of the original meeting text together with its semantic similarity score. Based on this model, after the original meeting text and the text sequence of a user retrieval sub-condition are input, sentences 1 to n of the original meeting text (n being the total number of sentences in the original meeting text) and their semantic similarity scores are obtained: for retrieval sub-condition A, the semantic similarity scores with the n meeting sentences form one score vector, and for retrieval sub-condition B, the scores with the n meeting sentences form another.
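The input/output contract just described can be sketched as a score matrix with one similarity score per (sub-condition, sentence) pair; the toy word-overlap scorer below is an illustrative placeholder for the trained encoder:

```python
# Sketch of the semantic fuzzy retrieval output: for each retrieval
# sub-condition, a vector of similarity scores, one per meeting sentence.
def score_matrix(sub_conditions, sentences, scorer):
    return {c: [scorer(c, s) for s in sentences] for c in sub_conditions}

def toy_scorer(cond, sent):
    # Fraction of sub-condition words present in the sentence (placeholder).
    cw = cond.lower().split()
    sw = set(sent.lower().split())
    return sum(w in sw for w in cw) / len(cw)

m = score_matrix(["product positioning"],
                 ["product positioning for young people", "meeting adjourned"],
                 toy_scorer)
print(m)
```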
• the embodiment of the present application proposes two semantic fuzzy retrieval model frameworks: the chapter-interactive semantic retrieval model and the semantic retrieval model based on retrieval-candidate information enhancement, both used to measure the semantic similarity score between the retrieval sub-condition (reference text) and the sentences of the original meeting text (target text).
  • the text interactive semantic retrieval model includes a word encoder, a sentence encoder, and an interactive module between the retrieval sub-condition text and the conference text.
• from the context encoding vectors of the words in each sentence, the word vector representation of [CLS] is extracted and used as the sentence encoding of the transcribed text and as the sentence encoding of the retrieval sub-condition, representing the information of the entire sentence.
• the sentence encoding of the transcribed text is modeled by a two-layer Transformer sentence encoder, which introduces context information into the current sentence and lets the current sentence inherit information omitted from its context, yielding a more accurate sentence representation.
• the retrieval sub-condition and conference text interaction module is composed of an attention structure in which the retrieval sub-condition sentence encoding serves as the query q, and
• the conference transcribed sentence encodings serve as K and V, so that the content information of the conference is fused into the sentence encoding q of the retrieval sub-condition.
• the text content can better supplement the information omitted in the retrieval sub-condition; at the same time, the encoding q gains a global view of the content of the whole meeting, which helps it select better retrieval results from the transcribed text.
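The Q/K/V interaction described above can be sketched as scaled dot-product attention pooling, where the retrieval sub-condition sentence encoding serves as the query q and the conference sentence encodings serve as both keys and values. The scaled dot-product form and the toy vector dimensions are assumptions for illustration; the patent does not fix the attention variant here.

```python
import math

def attention_pool(q, keys, values):
    # Score each conference sentence encoding against the sub-condition
    # encoding q (scaled dot product), softmax the scores, and return
    # the weighted sum of the value vectors: q fused with meeting content.
    scale = math.sqrt(len(q))
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
```

With equal keys and values, the output is a convex combination of the sentence encodings, weighted toward sentences most similar to q.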
• the semantic retrieval model based on retrieval-candidate information enhancement modifies the interaction between the retrieval sub-conditions and the text sentences. Specifically, after obtaining the retrieval sub-condition sentence encoding and the sentence encoding of each text sentence of the original meeting text, the similarity between each text sentence and the retrieval sub-condition is calculated from these encodings; then, according to those similarities, the N text sentences with the highest similarity to the reference text are selected from the original meeting text; finally, an attention-based interaction is performed between the sentence encodings of the selected N text sentences and the retrieval sub-condition sentence encoding, yielding an information-enriched retrieval sub-condition sentence encoding.
• computing the cosine distance yields the Top-N preliminary matching retrieval results most similar to the retrieval sub-condition; that is, according to the similarity between each text sentence in the original meeting text and the retrieval sub-condition, the N text sentences most similar to the retrieval sub-condition are selected from the original meeting text.
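The Top-N preliminary matching step can be sketched as follows, assuming the sentences and the sub-condition have already been encoded as vectors (cosine similarity stands in for the "cosine distance" the text mentions):

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors; 0.0 for a zero vector.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_n_matches(cond_vec, sent_vecs, n):
    # Rank every conference sentence vector by similarity to the
    # retrieval sub-condition vector; keep the N best (index, score) pairs.
    scored = sorted(enumerate(cosine(cond_vec, v) for v in sent_vecs),
                    key=lambda p: p[1], reverse=True)
    return scored[:n]
```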
• the model can be guided to select results similar to the preliminary high-quality retrieval results, so as to avoid final retrieval results whose semantic correlations differ excessively, which would degrade the user experience.
• the embodiment of the present application fuses the BM25 scheme with the output of the semantic retrieval model. Specifically, based on the encoding of each text sentence in the original meeting text and the encoding of the retrieval sub-condition, the BM25 algorithm computes the semantic similarity between each text sentence and the retrieval sub-condition.
• the semantic similarity scores between each text sentence in the original meeting text and the retrieval sub-condition output by the above semantic fuzzy retrieval model, and the semantic similarities computed by the BM25 algorithm, are fused (for example by weighted fusion) to obtain the fused semantic similarity score between each text sentence and the retrieval sub-condition, that is, the fused correlation between each text sentence in the original conference text and the retrieval sub-condition.
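A minimal sketch of the weighted fusion, assuming the BM25 scores are first normalized by their maximum and then mixed with the model scores using a weight `alpha`. Both the max-normalization and `alpha` are assumptions; the patent only says "weighted fusion".

```python
def fuse_scores(model_scores, bm25_scores, alpha=0.5):
    # Bring BM25 scores into a 0-1 range, then take a convex
    # combination with the semantic-model scores.
    mx = max(bm25_scores)
    bm25_norm = [s / mx if mx else 0.0 for s in bm25_scores]
    return [alpha * m + (1 - alpha) * b
            for m, b in zip(model_scores, bm25_norm)]
```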
• the position distribution of the sentences related to the retrieval sub-condition within the original meeting text is constrained; that is, the correlation between each text sentence and the retrieval sub-condition is corrected according to that sentence's position in the conference text.
• the farther a text sentence lies from the top-ranked sentences, the higher the penalty applied to its correlation with the retrieval sub-condition; this penalty determines the correction applied across the conference text.
• the embodiment of the present application assumes that the Top-2 semantic similarity scores between the text sentences of the original meeting text and the retrieval sub-condition are relatively accurate; the similarity scores of the other text sentences are penalized according to their distance from the Top-2. Since the Top-2 contains two sentences, each other sentence is penalized using only the smaller of its distances to the two Top-2 sentences; its penalized semantic similarity score is the original score minus the distance penalty score.
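The Top-2 distance penalty might look like the following sketch, where the penalty is `beta` times the sentence-index distance to the nearer of the two Top-2 sentences. Both `beta` and the use of index distance are assumptions; the patent does not fix the penalty function.

```python
def penalize_by_distance(scores, beta=0.1):
    # Keep the Top-2 scores unchanged; subtract beta * (distance to
    # the nearer Top-2 sentence) from every other sentence's score.
    order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    top2 = order[:2]
    out = []
    for i, s in enumerate(scores):
        if i in top2:
            out.append(s)
        else:
            out.append(s - beta * min(abs(i - t) for t in top2))
    return out
```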
• the embodiment of the present application also performs abnormal-sentence filtering on the text sentences identified from the original meeting text as related to the retrieval sub-conditions, that is, it further refines the similarity scores between the conference text sentences and the retrieval sub-conditions.
• according to the correlation between each of the text sentences of the third quantity and the retrieval sub-condition, a text sentence whose correlation with the retrieval sub-condition is greater than the first correlation threshold, or whose correlation is greater than the second correlation threshold and whose standardized correlation with the retrieval sub-condition is greater than the third correlation threshold, is selected as a text sentence related to the retrieval sub-condition;
  • the first correlation threshold is greater than the second correlation threshold, and the second correlation threshold is larger than the third correlation threshold.
  • the normalized score score_norm is calculated as follows:
• the final relevant text sentence selection strategy is: for text sentence i in the original conference text, its similarity score with the retrieval sub-condition must satisfy (score_i > t_1), or (score_i > t_2 and score_norm_i > t_3).
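The selection strategy above can be sketched as follows. `score_norm` is assumed here to be each score divided by the maximum score; the patent's exact normalization formula is not reproduced in this text.

```python
def select_relevant(scores, t1, t2, t3):
    # Keep sentence i if score_i > t1, or if score_i > t2 and its
    # normalized score exceeds t3 (score_norm assumed = score / max).
    mx = max(scores)
    keep = []
    for i, s in enumerate(scores):
        norm = s / mx if mx else 0.0
        if s > t1 or (s > t2 and norm > t3):
            keep.append(i)
    return keep
```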
  • text sentences related to the retrieval condition or retrieval sub-condition can be located from the original text of the meeting.
• the sentences related to the retrieval condition or the retrieval sub-condition can also be output to the user, so that the user can apply or know which text content in the original meeting text is related to the retrieval condition or the retrieval sub-condition.
  • the semantic similarity between the retrieval sub-conditions and each text sentence in the conference text can be respectively determined, and the text sentences related to the retrieval sub-conditions can be identified from the text of the original text of the conference.
  • the embodiment determines its relevance with the retrieval condition through the following processing, that is, determines its similarity score with the retrieval condition:
  • the degree of correlation between the text sentence and each retrieval sub-condition is determined.
• based on the features of the text sentence and the features of each retrieval sub-condition, the similarity score between the text sentence and each retrieval sub-condition in the retrieval condition is calculated.
• the correlations between the text sentence and each retrieval sub-condition are fused to determine the correlation between the text sentence and the retrieval condition.
  • each retrieval sub-condition in the retrieval condition has a clear logical relationship.
• based on the above-mentioned logical relationship, after determining the similarity scores between the text sentence and each retrieval sub-condition, the embodiment of the present application logically combines those similarity scores according to the logical relationship between the retrieval sub-conditions, to determine the similarity score between the text sentence and the retrieval condition as a whole.
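One plausible reading of the logical combination, assuming a fuzzy-logic interpretation in which AND keeps the weaker score and OR keeps the stronger one. The patent does not spell out the exact combination rule, so this is purely illustrative.

```python
def combine_subscores(sub_scores, ops):
    # Fold the per-sub-condition scores left to right:
    # "AND" -> min (both must match), "OR" -> max (either may match).
    result = sub_scores[0]
    for op, s in zip(ops, sub_scores[1:]):
        result = min(result, s) if op == "AND" else max(result, s)
    return result
```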
• the embodiment of the present application also normalizes the similarity scores between each text sentence in the original meeting text and the retrieval condition determined in the above manner, mapping each score into 0-1, so as to express the correlation between the original text sentences and the retrieval condition more intuitively and make the correlations of different text sentences comparable.
• for the 1st to nth text sentences in the original meeting text, their correlations with the retrieval conditions are represented by p_1, p_2, …, p_n.
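The 0-1 normalization could be realized, for example, with min-max scaling; this is one plausible choice, since the text only requires that the final scores lie between 0 and 1.

```python
def normalize_01(scores):
    # Min-max scaling into [0, 1]; if all scores are equal, map to 1.0.
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]
```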
• a summary of the full text of the target text can be generated according to the text fragments related to the reference text, obtaining the target text minutes corresponding to the reference text.
  • a text summary is generated, and the main content of the obtained target text summary is related to the reference text content.
• the contribution of each related text segment to the generation of the target text summary is set, so that a text segment with higher correlation to the reference text contributes more to the generated summary.
  • the embodiment of the present application comprehensively considers the correlation between each text segment in the target text and the reference text, and generates a summary of the full text of the target text. That is, by performing the following steps A21-A22, the full-text content of the target text is generated as a summary:
• the higher the correlation between a text segment in the target text and the reference text, the greater that segment's contribution to generating the text summary corresponding to the reference text; on this basis, the contribution of each text fragment in the target text to generating the text summary corresponding to the reference text is determined.
  • the embodiment of the present application adopts a text summary generation model based on an attention mechanism to generate a text summary of a target text.
  • the text summary generation model based on the attention mechanism can determine the contribution of each text segment in the target text to the generation of the text summary corresponding to the reference text according to the correlation between each text segment in the target text and the reference text, and then, A text summary of the target text is generated based on the contribution.
  • the text summary generation model is a text decoding model based on the attention mechanism
  • the model can obtain a text summary decoding result that meets the requirements by adjusting the attention coefficient of each text segment of the input target text.
  • the content of the finally decoded text summary can be changed.
  • the text summary generation model based on the attention mechanism is a time series output model
• the attention coefficient of each text segment in the target text for decoding the text summary at the current moment may also be related to the target text summary content already output before the current moment.
• according to the preceding decoding results, the model can determine the attention coefficient distribution over the text segments of the target text when decoding the summary at the current moment, that is, assign the appropriate attention coefficient to each text segment of the input target text; the model can therefore be used to determine the attention coefficient of each text segment in the target text when it generates the target text summary.
• since the ultimate goal of the embodiment of the present application is to generate a text summary corresponding to the reference text, determining only the attention coefficients of the target text segments is not enough to make the generated summary correspond to the reference text. Therefore, the embodiment also combines the attention coefficient of each text segment in the target text with the correlation between that segment and the reference text, to determine each segment's contribution to generating the text summary corresponding to the reference text.
• the attention coefficient of each text fragment in the target text for generating the target text summary is multiplied by the correlation between that text fragment and the reference text; the products corresponding to all segments are then normalized, and the final normalized value of each text segment is taken as that segment's contribution to generating the text summary corresponding to the reference text.
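The product-then-normalize computation described above can be sketched directly:

```python
def contributions(attn_coeffs, relevances):
    # Multiply each segment's attention coefficient by its relevance to
    # the reference text, then renormalize so the contributions sum to 1.
    products = [a * p for a, p in zip(attn_coeffs, relevances)]
    total = sum(products)
    if total == 0:
        return [0.0 for _ in products]
    return [x / total for x in products]
```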
• A22. At least according to the contribution of each text segment in the target text to generating the text summary corresponding to the reference text, summary generation is performed on the full text of the target text to obtain the target text minutes corresponding to the reference text.
  • the above-mentioned attention mechanism-based text summary generation model first generates text summary decoding features according to the characteristics of the target text and the contribution of each text segment in the target text to the generation of a text summary corresponding to the reference text; and then , performing text summary decoding processing according to the text summary decoding feature to generate a text summary of the target text.
• as described above, the embodiment of the present application determines the features of the target text by acquiring the features of each of its text segments, and the contribution of each text segment to generating the text summary differs. Therefore, the text summary decoding features are generated according to the features of each text segment of the target text and the contribution of each text segment to generating the text summary corresponding to the reference text.
  • the above-mentioned text summary decoding feature is decoded within the preset dictionary range to obtain a decoding result.
  • the model decodes the full text of the target text, and then the summary of the target text corresponding to the reference text can be obtained.
  • Figure 4 shows the structure of the attention mechanism-based text minutes generation model, and its decoding process to generate target text minutes.
• the word sequence of the meeting minutes text already decoded by the model is expressed as y_1, y_2, …, y_{t-1};
• the sentence hidden layer features of each sentence in the original meeting text (assuming there are n sentences) are given; t represents the current decoding moment.
• their correlations with the retrieval conditions are represented by p_1, p_2, …, p_n.
• the historically decoded meeting minutes word sequence y_1, y_2, …, y_{t-1} passes through the model's decoding hidden layer feature expression module, which yields the hidden layer state feature d_t at the current decoding moment.
• for a given user retrieval condition, the decoding hidden layer feature expression module takes the meeting minutes word sequence already decoded by the model as input and outputs the hidden layer state feature at the current decoding moment.
• the network structure of the decoding hidden layer feature expression module may use the decoder part of the Transformer scheme or a unidirectional LSTM structure.
• at the current decoding moment, the attention coefficient of the hidden layer state feature d_t with respect to the sentence hidden layer feature of the jth text sentence in the original meeting text is determined.
• from that attention coefficient and the correlation p_j between the jth text sentence and the retrieval conditions, the contribution of the jth text sentence to generating the meeting minutes is calculated,
• and the text minutes decoding features are generated. The specific calculation process of the above processing is as follows:
  • Attention() represents the calculation function of the attention mechanism, which can use self-attention and additive attention.
• the decoding end thus fully considers, across all text sentences of the original meeting text, the influence of information related to the retrieval condition content on the generation of the meeting minutes, optimizing and refining the original attention coefficients into the text minutes decoding feature.
• this enables the decoder to use the semantic fuzzy retrieval features to guide relevant content selection, attending with different degrees to the sentence hidden layer features of the original meeting text when forming the context vector representation.
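Steps like those above, combining decoder-state attention with the relevance scores p_j into a relevance-aware context vector, might be sketched as follows. Dot-product scoring is an assumption; the text permits self-attention or additive attention.

```python
import math

def minutes_decoding_feature(d_t, sent_feats, relevances):
    # 1. Attention of the decoder state d_t over sentence hidden features.
    scores = [sum(d * h for d, h in zip(d_t, hj)) for hj in sent_feats]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    attn = [e / sum(exps) for e in exps]
    # 2. Reweight by each sentence's relevance p_j and renormalize.
    weighted = [a * p for a, p in zip(attn, relevances)]
    z = sum(weighted)
    contrib = [w / z for w in weighted] if z else weighted
    # 3. Relevance-aware context vector over the sentence features.
    dim = len(sent_feats[0])
    return [sum(c * h[d] for c, h in zip(contrib, sent_feats))
            for d in range(dim)]
```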
• the word prediction module in Figure 4 takes the text minutes decoding feature as input, computes the word output probability distribution over the dictionary, and outputs the word corresponding to the current decoding moment.
• the network structure of the word prediction module can use a linear layer followed by a nonlinear activation function layer, and the decoding algorithm can use the beam search algorithm.
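A toy beam search over a fixed table of per-step word log-probabilities; a real decoder conditions each step on the prefix decoded so far, so this is only a sketch of the search procedure itself.

```python
def beam_search(step_logprobs, beam_width=2):
    # step_logprobs[t][w] = log P(word w at step t) for this toy setting.
    # Keep the beam_width highest-scoring prefixes after each step.
    beams = [([], 0.0)]
    for logps in step_logprobs:
        candidates = [(seq + [w], score + lp)
                      for seq, score in beams
                      for w, lp in enumerate(logps)]
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_width]
    return beams[0][0]  # best word sequence found
```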
  • the model determines the decoding features of the text minutes based on the original text of the meeting at each moment, and decodes the output, so that the meeting minutes corresponding to the original text of the meeting and meeting the retrieval conditions can be obtained.
• since the final target text summary generated by the embodiment of the present application must correspond to the reference text, in addition to clarifying the contribution of each text segment in the target text to generating the summary corresponding to the reference text (so that the final summary contains the target text content related to the reference text), the embodiment of the present application also directly uses the features of the reference text to generate the text summary of the target text, thereby further improving the relevance of the generated target text minutes to the reference text.
• when generating the text summary decoding features, the embodiment of the present application generates them according to the features of the target text, the features of the reference text, and the contribution of each text segment in the target text to generating the text summary corresponding to the reference text.
• the features of the retrieval condition are also input into the model, for example the word hidden layer features of each word of the retrieval condition.
  • the model is input so that when the model performs decoding processing, it can refer to the retrieval condition to decode the text content of the original text of the meeting, thereby improving the correlation between the decoding result and the retrieval condition.
• the sentence hidden layer features of each sentence in the original conference text, the correlations p_1, p_2, …, p_n of each sentence with the retrieval conditions, and the word hidden layer features of each word in the retrieval conditions are input into the model's decoding-end and original-text interactive attention module, so that the model can determine the attention coefficient of each text sentence of the original meeting text and the contribution of each sentence to generating the meeting minutes corresponding to the retrieval conditions.
  • the retrieval conditions can be used as a reference, so that the final decoded meeting minutes are more relevant to the retrieval conditions, and the meeting minutes are prevented from deviating from the retrieval conditions.
  • the embodiment of the present application proposes an attention mechanism-based text summary generation model structure as shown in Figure 5.
• compared with the model above, in addition to the decoding end's interaction with the original text, a decoding-end and retrieval interactive attention module is added.
  • the decoding end and retrieval interactive attention module mainly realize the interaction between the state features of the model hidden layer and the retrieval condition features, and generate reference decoding features.
• the reference decoding features then interact with the features of the original meeting text through the decoding-end and original-text interactive attention module.
  • the features of the retrieval conditions and the state features of the hidden layer are input into the decoding end of the model and the retrieval interactive attention module, so that the features of the retrieval conditions are fused with the state features of the hidden layer of the model, and the reference Decode features.
  • the features of the reference text are determined by the features of each text segment of the reference text, and the impact of each text segment of the reference text on the generation of the target text summary corresponding to the reference text is also different.
  • the key entity words in the reference text can express the semantics of the reference text to a large extent, so it has a greater reference value for generating the target text summary corresponding to the reference text, while the non-entity words in the reference text, such as tone Words, modifiers, etc., have relatively less reference value for generating the target text summary corresponding to the reference text.
  • the embodiment of the present application refers to the scheme of determining the contribution of each text segment in the target text to the generation of the text summary corresponding to the reference text introduced in the above-mentioned embodiment.
  • the text segment in the reference text may be text content of any granularity such as words, phrases, sentences, and text paragraphs.
  • the embodiment of the present application determines the features of the reference text by acquiring the features of each text segment of the reference text, and the contribution of each text segment of the reference text to the generation of the text summary corresponding to the reference text is different. Therefore, the present application The embodiment generates reference decoding features according to the features of each text segment of the reference text and the contribution of each text segment in the reference text to generating a text summary corresponding to the reference text.
• the hidden layer state feature d_t at the current decoding moment and the word hidden layer features of each word in the retrieval condition are input into the decoding-end and retrieval interactive attention module.
  • This module uses the attention mechanism to first determine the contribution of each word in the retrieval condition to the generation of the text summary corresponding to the retrieval condition, and then according to the characteristics of each word in the retrieval condition and the retrieval condition The contribution of each word in the condition to the generation of the text summary corresponding to the retrieval condition is used to generate reference decoding features.
  • the specific calculation process is as follows:
• m_1 + m_2 represents the total number of words of the two retrieval sub-conditions included in the retrieval condition.
• Attention() represents the calculation function of the attention mechanism, which can use self-attention or additive attention. The attention coefficient assigned by the decoding hidden layer state feature at the current decoding moment to the word hidden layer feature of the i-th word in the retrieval condition also indicates the contribution of the i-th word in the retrieval condition to generating the text summary corresponding to the retrieval condition.
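A one-dimensional sketch of additive (Bahdanau-style) attention, one of the two attention variants the text names. Real models use learned weight matrices; the scalar parameters `w_q`, `w_k`, `v` here are stand-ins for illustration.

```python
import math

def additive_attention_weights(query, keys, w_q=1.0, w_k=1.0, v=1.0):
    # Score each key with v * tanh(w_q*query + w_k*key), then softmax
    # the scores into attention weights that sum to 1.
    scores = [v * math.tanh(w_q * query + w_k * k) for k in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```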
• the reference decoding features, the features of each text sentence in the original conference text, and the correlation between each text sentence and the retrieval conditions are input into the decoding-end and original-text interactive attention module, so that through attention interaction the model can determine the contribution of each text sentence of the original meeting text to generating the text summary corresponding to the retrieval condition, as well as the text summary decoding features.
  • the specific calculation process is as follows:
• Attention() represents the calculation function of the attention mechanism, which can use self-attention or additive attention. At the current decoding moment, the attention coefficient of the sentence hidden layer feature of the jth sentence in the original meeting text, computed from the context vector after the decoder-retrieval interaction, is the attention the decoder pays to the jth text sentence of the original meeting text when generating the text summary corresponding to the retrieval condition; p_j represents the similarity between the retrieval condition and the jth sentence in the original meeting text.
• the embodiment of the present application considers not only the contribution of each text segment of the target text to generating the target text summary corresponding to the reference text, but also the contribution of each text segment of the reference text to that summary. This ensures that the whole summary generation process can select both the relevant target text content and the relevant reference text content, improving the relevance of the final target text summary to the reference text and to the reference text's associated content in the target text; that is, it makes the final generated target text summary correspond to the reference text.
  • the overall features of the target text and the overall features of the reference text can be directly input into the above-mentioned text summary generation model based on the attention mechanism to obtain the reference decoding features and text summary decoding features.
  • Both the training process of the model and the specific processing process of the model can be executed with reference to the introduction of the above-mentioned embodiments.
  • the names of the various processing modules of the text summary generation model based on the attention mechanism shown in Figure 4 and Figure 5 above are named in conjunction with specific processing objects.
  • the name of each processing module can be adaptively changed according to the actual processing object.
  • the embodiment of the present application does not limit the name of each processing module of the above-mentioned attention mechanism-based text summary generation model, but mainly introduces the functions and processing content of each processing module, so as to specifically introduce the attention mechanism-based text summary The process of generating the model and the functions implemented.
  • the specific implementations of measuring the correlation between each text segment of the target text and the reference text, and generating a text summary of the target text to obtain a target text summary corresponding to the reference text are respectively introduced.
• processing text is essentially processing the features of the text; that is, all of the text processing included in the text summary generation method proposed in the embodiment of the present application in essence operates on text features. Therefore, the accuracy of text features directly affects the accuracy of text summary generation.
  • the embodiment of the present application will give an example to describe the manner of acquiring the features of the target text and the features of the reference text.
  • the text features can be obtained by encoding the text with an encoder.
  • the word-level encoding encoder structure can be used to obtain the encoding features of the target text and the reference text.
  • the target text that needs to generate the minutes is usually a long text.
  • the main feature of the meeting text is its long text length.
  • a one-hour meeting may contain 10,000-20,000 words.
  • the embodiment of the present application determines the features of the target text by acquiring the features of each text segment in the target text, and, by acquiring the features of each text segment in the reference text , to determine the characteristics of the reference text.
• feature extraction for each text segment is realized by performing the following steps B1-B4:
  • B1. Perform text segment division processing on the target text, and determine each text segment included in the target text.
  • the embodiment of the present application constructs a word-sentence-chapter level information coding model to extract the sentence-level hidden layer features and chapter-level hidden layer features of the fusion context information of the target text, that is, to extract the features of each text segment of the target text , and the overall features of the target text.
  • the target text is first divided into text segments, and word segmentation is performed on the divided text segments, and each word segment included in each text segment is determined.
  • dividing the target text into text segments may mean dividing the target text into text sentences, for example splitting the text into sentences according to punctuation marks, or extracting text segments by sliding a window of a fixed number of words over the target text. Word segmentation of a text segment can be achieved with an existing word segmentation algorithm, which will not be described in detail in this embodiment of the present application.
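The two segmentation strategies just mentioned, punctuation-based sentence splitting and a fixed-size sliding word window, can be sketched roughly as follows (the window and stride sizes are arbitrary illustrative values):

```python
import re

def split_sentences(text):
    """Split text into sentences at terminal punctuation (CJK or ASCII)."""
    parts = re.split(r"(?<=[。！？.!?])", text)
    return [p.strip() for p in parts if p.strip()]

def sliding_segments(words, window=4, stride=2):
    """Slide a fixed-size word window over the text; stride < window
    makes adjacent segments overlap."""
    last_start = max(len(words) - window, 0)
    return [words[i:i + window] for i in range(0, last_start + 1, stride)]

sents = split_sentences("会议开始。首先讨论产品设计!然后是市场定位?")
words = "the meeting covered design plans and market positioning today".split()
wins = sliding_segments(words, window=4, stride=2)
```

Either granularity can then feed the per-segment feature extraction of steps B1-B4.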
  • the above-mentioned word-sentence-chapter level information coding model can be used to extract the text fragment features and discourse features of the target text.
  • the word-sentence-chapter level information coding model includes a word hidden layer feature expression module, a sentence representation extraction module, a sentence hidden layer feature expression module and a discourse feature extraction module.
  • the original text of the conference is used to represent the target text
  • each text sentence in the original text of the conference is used to represent each text segment in the target text.
  • the above-mentioned word hidden layer feature expression module means that for each text sentence in the original text of the meeting, the word representation of each word is input, and the word hidden layer feature fused with the context information of the current sentence is output.
  • the network structure of the word hidden layer feature expression module can use the encoder part of the Transformer scheme, a bidirectional LSTM, or a similar structure. Assuming the original text of the meeting has been divided into sentences, n sentences in total, the sequence of words contained in the n-th sentence can be written as w_{n,1}, w_{n,2}, …, w_{n,m_n}, where n indexes the n-th sentence of the original conference text and m_n represents the total number of words contained in the n-th sentence.
  • the above-mentioned sentence representation extraction module compresses the word representations of the multiple words in the input sequence to obtain a sentence representation vector; for example, all word hidden layer features of the first sentence of the original conference text are compressed, and the sentence representation vector of the first sentence is obtained as s_1.
  • the sentence representation vectors of the first to the n-th sentence of the conference text can be written as the sequence s_1, s_2, …, s_n.
  • the embodiment of the present application does not limit the network structure of the sentence representation extraction module, and techniques such as attention mechanism or pooling may be used.
  • the above-mentioned sentence hidden layer feature expression module takes as input all the sentence representation vectors of the original text of the conference and outputs the sentence hidden layer features fused with the context information surrounding each sentence. Similar to the above-mentioned word hidden layer feature expression module, the network structure of the sentence hidden layer feature expression module can use the encoder part of the Transformer scheme or a bidirectional LSTM structure; its output represents the sentence hidden layer features of the n sentences of the conference text after fusing context information.
  • the above-mentioned discourse feature extraction module is similar to the above-mentioned sentence representation extraction module, which compresses the sentence hidden layer feature representations of multiple sentences in the input sequence to obtain a text representation vector.
  • the sentence hidden layer features of the first to the n-th sentence of the conference original text are expressed as a sequence; after passing through the discourse feature extraction module, the discourse feature u of the original conference text is obtained.
  • the embodiment of the present application does not limit the network structure of the text feature extraction module, and techniques such as attention mechanism or pooling may be used.
  • the features of each text sentence of the conference original text, and the discourse features of the conference original text (that is, the overall features of the conference original text), are thereby obtained.
  • the features of the text sentences of the original meeting text, the features of the word segments contained in the text sentences, and the discourse features of the original meeting text are all features that incorporate context information. Therefore, the above conference original text feature extraction scheme of the embodiment of the present application can better capture long-distance dependency information in long conference texts and obtain more accurate conference text features.
  • the above-mentioned word-sentence-chapter level information coding model can also omit the sentence hidden layer feature expression module and directly use the sentence representation vectors s_1, s_2, …, s_n of the text sentences output by the sentence representation extraction module as the sentence-level features.
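As a rough, non-authoritative illustration of the data flow in this word-sentence-chapter pipeline: the sketch below substitutes random word vectors and mean pooling for the trained Transformer/BiLSTM encoders and attention pooling the text describes; only the hierarchy (word features, then sentence vectors s_1…s_n, then chapter feature u) is faithful to the description.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed_words(sentence, dim=8):
    """Stand-in word representations; a real system would use a trained
    word hidden layer feature expression module here."""
    return rng.standard_normal((len(sentence), dim))

def sentence_vectors(sentences, dim=8):
    """Sentence representation extraction: compress each sentence's word
    vectors into one vector (mean pooling stands in for attention)."""
    return np.stack([embed_words(sent, dim).mean(axis=0) for sent in sentences])

def chapter_vector(sent_vecs):
    """Discourse feature extraction: compress sentence vectors into the
    overall document feature u."""
    return sent_vecs.mean(axis=0)

sentences = [["we", "discussed", "design"], ["market", "positioning", "was", "next"]]
s = sentence_vectors(sentences)  # s_1 ... s_n
u = chapter_vector(s)            # chapter feature u
```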
  • the overall features of the reference text are determined by extracting features of each text segment of the reference text.
  • the text segment of the reference text may be text content of any granularity such as words, phrases, text sentences, and text segments in the reference text.
  • the features of the reference text are determined by extracting features of each word segment of the reference text.
  • word segmentation processing is first performed on the reference text, for example, by using a word segmentation model or a word segmentation algorithm to perform word segmentation on the reference text, and determine each word segment included in the reference text. Then, the word segmentation features of the fusion context information of each word segment included in the reference text are respectively extracted, and the overall features of the reference text are obtained by combining the word segment features of each word segment included in the reference text.
  • the text sentences contained in the reference texts are merged or screened and integrated into one reference text, and then word segmentation, word segmentation feature extraction, and reference text feature extraction are performed on that reference text.
  • the word segmentation features of each word segmentation of the reference text are extracted through the word hidden layer feature expression module as shown in FIG. 7 .
  • the retrieval word sequence is input into the above-mentioned word hidden layer feature expression module, which outputs the word hidden layer features fused with the context information of the retrieval condition, that is, the word segmentation features, fused with context information, of each word segment included in the retrieval condition are obtained.
  • the word sequences of one or more retrieval sub-conditions are input into the above-mentioned word hidden layer feature expression module in the manner shown in Table 2 below, and the module correspondingly outputs the word hidden layer features, fused with context information, of the single or multiple retrieval conditions.
  • the network structure of the above-mentioned word hidden layer feature expression module can use the encoder part model or bidirectional LSTM structure under the Transformer scheme.
  • the hidden layer features of the words in the retrieval condition are thus obtained; the extraction process for another retrieval condition is similar: as shown in the figure, another input sequence has a total of m_2 words, and the hidden layer feature representation of each of those words is obtained in the same way.
  • the word segmentation features of each retrieval sub-condition can be obtained separately, and finally, the word segmentation features of each word segmentation contained in the retrieval sub-conditions are spliced according to the order of word segmentation, and the overall feature of the retrieval condition can be obtained.
  • the embodiment of the present application further performs fusion processing on the features of each text segment of the target text and the features of each text segment of the reference text, to obtain target text features fused with reference text features, and/or reference text features fused with target text features.
  • the embodiment of the present application integrates the features of the reference text into the features of the target text, and/or integrates the features of the target text into the features of the reference text, so that the features of the reference text and/or the target text not only include their own characteristics but also include the characteristics of the other.
  • the features of the reference text are integrated into the features of the target text, and at the same time, the features of the target text are integrated into the features of the reference text.
  • the discourse features of the target text and/or the features of each text segment of the target text may be integrated into the features of each text segment of the reference text, or integrated into the overall features of the reference text.
  • the above discourse features of the target text are determined according to the features of each text segment of the target text.
  • the discourse features of the target text are first determined according to the features of each text segment of the target text. Then, the discourse features of the target text and the features of each text segment of the target text are integrated into the features of each text segment of the reference text; in addition, the features of each text segment of the reference text are integrated into the features of each text segment of the target text. Finally, the features of each text segment of the reference text fused with the discourse features and text segment features of the target text, and the features of each text segment of the target text fused with the features of each text segment of the reference text, are obtained.
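A hedged sketch of this mutual fusion, replacing the model's trained layers with a single dot-product attention pass in each direction; the shapes and the concatenation scheme are illustrative assumptions, not details fixed by the text:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, others):
    """For each query vector, attend over the other text's segment vectors
    and append the attended context, fusing the two feature sets."""
    weights = softmax(queries @ others.T, axis=-1)  # (nq, nk), rows sum to 1
    context = weights @ others                      # (nq, d)
    return np.concatenate([queries, context], axis=-1)

rng = np.random.default_rng(1)
target_feats = rng.standard_normal((5, 8))  # features of 5 target-text segments
ref_feats = rng.standard_normal((3, 8))     # features of 3 reference-text segments

target_fused = cross_attend(target_feats, ref_feats)  # target fused with reference
ref_fused = cross_attend(ref_feats, target_feats)     # reference fused with target
```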
  • the target text is still represented by the above-mentioned conference original text
  • the reference text is represented by the user retrieval condition.
  • the embodiment of the present application fully considers the information fusion of the user retrieval condition and the conference original text: when extracting the hidden layer features of each sentence in the final conference original text, relevant retrieval condition information is fused in; at the same time, when extracting the hidden layer features of each word in the end user's retrieval condition, the original conference text information is also fused in.
  • the embodiment of the present application builds an information fusion model, which is used to realize the information fusion of the user's retrieval conditions and the original text of the meeting.
  • the information fusion model includes a word hidden layer feature expression module, a word feature extraction module and an information mutual fusion module.
  • the word feature extraction module takes as input the discourse hidden layer feature u of the original text of the conference and the hidden layer feature representations of the words of the retrieval condition, and outputs the hidden layer features of the words of the retrieval condition after fusing the original document information.
  • the word feature extraction module adopts a recursive network structure: the discourse hidden layer feature u of the original conference text serves as the initial state representation, the hidden layer features of the words of the retrieval condition are input, and the hidden layer features of each retrieval-condition word fused with the original conference text information are obtained recursively.
  • the hidden layer feature of each word in the retrieval condition is calculated recursively as h_t = RNN(q_t, h_{t-1}), with the initial state h_0 = u, where q_t denotes the hidden layer feature of the t-th word of the retrieval condition.
  • the above-mentioned recursive network structure can adopt structures such as LSTM or GRU.
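A minimal sketch of this recursive fusion, with an untrained tanh recurrence standing in for the LSTM/GRU cell; the point being illustrated is only that the document feature u seeds the initial hidden state, so every step's output carries original-document information:

```python
import numpy as np

def recursive_fuse(u, word_feats, seed=2):
    """Run a recurrent cell over the retrieval-condition word features with
    the document feature u as initial state: h_t = tanh(W x_t + U h_{t-1}),
    h_0 = u (weights are random here; a real model learns them)."""
    rng = np.random.default_rng(seed)
    d = u.shape[0]
    W = rng.standard_normal((d, d)) * 0.1  # input weights
    U = rng.standard_normal((d, d)) * 0.1  # recurrent weights
    h, outputs = u, []
    for x in word_feats:
        h = np.tanh(W @ x + U @ h)
        outputs.append(h)
    return np.stack(outputs)

rng = np.random.default_rng(3)
u = rng.standard_normal(8)       # discourse feature of the conference text
q = rng.standard_normal((4, 8))  # hidden features of 4 retrieval-condition words
fused = recursive_fuse(u, q)     # document-aware word features
```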
  • when the retrieval condition is a combination of multiple retrieval sub-conditions, the embodiment of the present application extracts, for each sub-condition, the hidden layer features of its words fused with the original conference document information; the calculation process is consistent with the above process.
  • the information mutual fusion module: after the hidden layer features of the retrieval-condition words fused with the original conference text have been obtained, these features and the sentence hidden layer features of the text sentences of the original conference text are input, and the module outputs the hidden layer features of the retrieval-condition words after further fusing the original sentence information, as well as the sentence hidden layer features of the sentences of the original conference text fused with the information of the relevant retrieval terms.
  • the information fusion module can use self-attention mechanism or bidirectional LSTM and other structures.
  • when the embodiment of the present application extracts the features of the target text and the reference text, it can not only extract context-fused features of each text segment of the target text and of the reference text, but also realize the fusion of target text features and reference text features, which makes the feature information of the target text and the reference text richer and is more conducive to generating a text summary of the target text corresponding to the reference text.
  • the text summary generation process can be performed with the features of the target text extracted in the above manner, or with the features of the target text and the features of the reference text combined; since the features of the reference text are integrated into the target text features, the generated target text summary can be related to the reference text.
  • the features of each text segment of the target text and the features of each text segment of the reference text, extracted in the above manner, are input into the attention-mechanism-based text summary generation model shown in Figure 5, which generates a target text summary corresponding to the reference text.
  • the embodiment of the present application also proposes a text summary generation device, as shown in FIG. 9, the device includes:
  • a data acquisition unit 100 configured to acquire target text and reference text, wherein the reference text is determined based on the content of the target text concerned by the user;
  • the summary generating unit 110 is configured to generate a summary of the target text based on locating the associated content of the reference text from the target text, to obtain a summary of the target text corresponding to the reference text.
  • performing summary generation processing on the target text to obtain a target text summary corresponding to the reference text including:
  • a summary is generated for the full text of the target text to obtain a summary of the target text corresponding to the reference text.
  • the full-text content of the target text is processed to generate a summary, to obtain the target text summary corresponding to the reference text, including:
  • the full-text content of the target text is processed to generate a summary, to obtain a target text summary corresponding to the reference text, including:
  • according to the correlation between each text segment in the target text and the reference text, the contribution of each text segment in the target text to generating a text summary corresponding to the reference text is determined
  • determining the correlation between each text segment in the target text and the reference text includes:
  • the correlation between each text fragment in the target text and the reference text is respectively determined.
  • the summary generation unit 110 is also used to:
  • an interactive operation based on the attention mechanism is performed to obtain the reference text features after information improvement.
  • an attention mechanism-based interactive operation is performed on the features of each text segment in the target text and the features of the reference text to obtain the reference text features after information improvement, including:
  • an interactive operation based on the attention mechanism is performed to obtain the reference text features after information improvement.
  • the correlation with the reference text is determined through the following processes:
  • according to the features of the text fragment and the features of each reference text, the relevance between the text fragment and each reference text is determined
  • the correlation between the text segment and each reference text is fused to determine the correlation between the text segment and the reference text.
  • the summary generation unit 110 is also used to:
  • for each text fragment in the target text, the semantic similarity between the text fragment and the reference text is calculated and determined by the BM25 algorithm
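The BM25 similarity measurement mentioned here can be sketched in pure Python; the toy segments and the parameter defaults k1=1.5, b=0.75 are common illustrative choices, not values specified by this text:

```python
import math
from collections import Counter

def bm25_scores(query_terms, segments, k1=1.5, b=0.75):
    """Score each text segment against the query terms with BM25."""
    n = len(segments)
    avgdl = sum(len(seg) for seg in segments) / n
    df = Counter()                       # document frequency of each term
    for seg in segments:
        for term in set(seg):
            df[term] += 1
    scores = []
    for seg in segments:
        tf = Counter(seg)                # term frequency within this segment
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(seg) / avgdl))
        scores.append(score)
    return scores

segments = [["product", "design", "plan"],
            ["market", "positioning", "strategy"],
            ["design", "review", "design"]]
scores = bm25_scores(["design"], segments)  # one relevance score per segment
```

A segment that never mentions a query term scores zero; repeated mentions raise the score with diminishing returns.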
  • the summary generation unit 110 is also used to:
  • the correlation degree between each text segment in the target text and the reference text is corrected.
  • the correlation degree between each text segment in the target text and the reference text is corrected, including:
  • the summary generation unit 110 is also used to:
  • from the selected third number of text segments, a text segment whose correlation with the reference text is greater than the first correlation threshold, or whose correlation with the reference text is greater than the second correlation threshold and whose standardized correlation with the reference text is greater than the third correlation threshold, is selected as a text segment related to the reference text;
  • the first correlation threshold is greater than the second correlation threshold, and the second correlation threshold is greater than the third correlation threshold.
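A sketch of the two-branch selection rule above, under the assumption (ours, for illustration) that "standardized correlation" means a segment's share of the total correlation mass, and with hypothetical threshold values satisfying t1 > t2 > t3:

```python
def select_relevant(rels, t1=0.8, t2=0.5, t3=0.1):
    """Keep segment i if its correlation exceeds t1, or exceeds t2 while its
    normalized correlation (share of the total) exceeds t3."""
    total = sum(rels) or 1.0
    return [i for i, r in enumerate(rels)
            if r > t1 or (r > t2 and r / total > t3)]

idx = select_relevant([0.9, 0.6, 0.2, 0.55])  # indices of related segments
```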
  • determining the contribution of each text segment in the target text to generating a text summary corresponding to the reference text includes :
  • based on the degree to which each text segment in the target text is used for generating the text summary of the target text, the contribution of each text segment in the target text to generating a text summary corresponding to the reference text is determined.
  • the full-text content of the target text is processed to generate a summary, to obtain the target text summary corresponding to the reference text, which includes:
  • a text summary of the target text is generated according to the text summary decoding feature.
  • generating text summary decoding features, including:
  • generating text summary decoding features, including:
  • a text summary decoding feature is generated according to the reference decoding feature, the feature of the target text, and the contribution of each text segment in the target text to generating a text summary corresponding to the reference text.
  • the summary generation unit 110 is also used to:
  • a reference decoding feature is generated.
  • generating reference decoding features includes:
  • according to the features of each text segment of the reference text, and the contribution of each text segment in the reference text to generating a text summary corresponding to the reference text, a reference decoding feature is generated
  • a text summary decoding feature including:
  • according to the features of each text segment of the target text, and the contribution of each text segment in the target text to generating a text summary corresponding to the reference text, a text summary decoding feature is generated.
  • generating text summary decoding features includes:
  • generating the target text summary, including:
  • the features of the target text and the features of the reference text are input into the pre-trained text summary generation model based on the attention mechanism, so that the text summary generation model based on the attention mechanism is based on the association of locating the reference text from the target text content, performing summary generation processing on the target text to obtain a target text summary corresponding to the reference text.
  • the features of the target text are obtained by acquiring the features of each text segment in the target text; the features of the reference text are obtained by acquiring the features of each text segment of the reference text.
  • acquiring the features of each text segment in the target text includes:
  • word segmentation processing is performed respectively for each text segment in the target text, and each word segment contained in each text segment is determined;
  • the features of each text segment are determined according to the word segmentation features of the fusion context information of each word segment included in each text segment.
  • acquiring the features of each text segment in the target text further includes:
  • the features of each text segment are fused and encoded to obtain the text segment features of the fused context information of each text segment.
  • acquiring the features of each text segment of the reference text includes:
  • the word segmentation features of the fusion context information of each word segment included in the reference text are respectively extracted.
  • when the number of texts in the reference text is greater than 1, before performing word segmentation processing on the reference text and determining each word segment contained in the reference text, the method further includes:
  • the reference texts are merged or screened.
  • it also includes:
  • feature fusion processing is performed on the features of each text segment of the target text and the features of each text segment of the reference text to obtain reference text features fused with target text features, including:
  • the discourse features of the target text are determined according to the features of each text segment of the target text.
  • feature fusion processing is performed on the features of each text segment of the target text and the features of each text segment of the reference text to obtain target text features fused with reference text features and reference text features fused with target text features, including:
  • according to the features of each text fragment of the target text, the discourse features of the target text are determined
  • the features of each text segment of the reference text and the features of each text segment of the target text are subjected to feature fusion processing, and target text features fused with reference text features and reference text features fused with target text features are obtained.
  • Another embodiment of the present application also proposes a text summary generation device, as shown in FIG. 10 , the device includes:
  • the memory 200 is connected to the processor 210 for storing programs
  • the processor 210 is configured to execute the program stored in the memory 200 to implement the method for generating a text summary disclosed in any of the above embodiments.
  • the above text summary generation device may further include: a bus, a communication interface 220 , an input device 230 and an output device 240 .
  • the processor 210, the memory 200, the communication interface 220, the input device 230 and the output device 240 are connected to each other through a bus, where:
  • a bus may include a pathway that carries information between various components of a computer system.
  • the processor 210 can be a general-purpose processor, such as a general-purpose central processing unit (CPU) or a microprocessor, or an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling execution of the program of the present invention. It can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the processor 210 may include a main processor, and may also include a baseband chip, a modem, and the like.
  • the program for executing the technical solution of the present invention is stored in the memory 200, and an operating system and other key services may also be stored.
  • the program may include program code, and the program code includes computer operation instructions.
  • the memory 200 may include read-only memory (ROM) or other types of static storage devices that can store static information and instructions, random access memory (RAM) or other types of dynamic storage devices that can store information and instructions, disk storage, flash memory, and the like.
  • the input device 230 may include a device for receiving data and information input by a user, such as a keyboard, a mouse, a camera, a scanner, a light pen, a voice input device, a touch screen, a pedometer or a gravity sensor, and the like.
  • Output devices 240 may include devices that allow information to be output to a user, such as a display screen, printer, speakers, and the like.
  • Communication interface 220 may include the use of any transceiver or the like to communicate with other devices or communication networks, such as Ethernet, radio access network (RAN), wireless local area network (WLAN), and the like.
  • the processor 210 executes the program stored in the memory 200, and calls other devices, which can be used to implement each step of any method for generating a text summary provided by the above-mentioned embodiments of the present application.
  • Another embodiment of the present application also provides a storage medium, on which a computer program is stored.
  • when the computer program is run by a processor, each step of the text summary generation methods provided in the above-mentioned embodiments of the present application can be realized.
  • each embodiment in this specification is described in a progressive manner, and each embodiment focuses on the differences from other embodiments.
  • for the same and similar parts among the embodiments, reference may be made to each other.
  • since the device embodiments correspond to the method embodiments, their description is relatively simple; for related parts, reference may be made to the description of the method embodiments.
  • modules and submodules in the devices and terminals in the various embodiments of the present application can be combined, divided and deleted according to actual needs.
  • the disclosed terminal, device and method may be implemented in other ways.
  • the terminal embodiments described above are only illustrative.
  • the division of modules or sub-modules is only a logical function division; in actual implementation, there may be other division methods. For example, multiple sub-modules or modules can be combined or integrated into another module, or some features can be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical or other forms.
  • modules or sub-modules described as separate components may or may not be physically separated, and components presented as modules or sub-modules may or may not be physical modules or sub-modules; that is, they may be located in one place, or distributed over multiple network modules or sub-modules. Part or all of the modules or sub-modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional module or submodule in each embodiment of the present application may be integrated into one processing module, each module or submodule may exist separately physically, or two or more modules or submodules may be integrated into one module.
  • the above-mentioned integrated modules or sub-modules can be implemented in the form of hardware or in the form of software function modules or sub-modules.
  • the steps of the methods or algorithms described in conjunction with the embodiments disclosed herein may be directly implemented by hardware, software units executed by a processor, or a combination of both.
  • the software unit can be placed in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other known form of storage medium.

Abstract

This application proposes a text summary generation method, apparatus, device and storage medium. The method includes: acquiring a target text and a reference text, where the reference text is determined based on the target text content that the user is concerned with; and, based on locating the associated content of the reference text in the target text, performing summary generation processing on the target text to obtain a target text summary corresponding to the reference text. With this text summary generation method, even for the same target text, when the reference texts differ, summary generation with different emphases can be performed on the target text by locating the text content related to each reference text, yielding a target text summary corresponding to each reference text. Therefore, the method can generate, for the same target text, text summaries that meet the needs of different users.

Description

A text summary generation method, apparatus, device and storage medium
This application claims priority to Chinese patent application No. 202111667181.X, filed with the China Patent Office on December 30, 2021 and entitled "Text summary generation method, apparatus, device and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of artificial intelligence, and in particular to a text summary generation method, apparatus, device and storage medium.
Background
Text summary generation refers to extracting content from a long text so as to distill information that characterizes the core content of the text; a text summary can help people grasp the content of a text more directly and effectively.
Conventional text summary generation schemes are usually based on automatic text summarization techniques, which extract key points from a text and form a condensed text. Automatic text summarization can be divided, according to how the summary is produced, into extractive summarization and abstractive summarization. Extractive summarization forms a summary by extracting words or sentences verbatim from the original text, so the summary content comes entirely from the original; abstractive summarization allows new words, and phrases absent from the original text, to be generated to compose the summary: the text content is first semantically understood, and a passage is then generated based on that semantics to summarize the given text.
Usually, the content of a target text for which a summary needs to be generated covers multiple aspects, and different people may be interested in different aspects; therefore, different people's needs for a summary of the same target text differ.
However, current text summary generation schemes, whether extractive or abstractive, cannot generate summaries with different content for the needs of different users, and thus cannot satisfy different users' summary generation needs for the same target text.
Summary
In view of the above state of the art, this application proposes a text summary generation method, apparatus, device and storage medium. By implementing the technical solution of this application, text summaries meeting the needs of different users can be generated for the same target text.
To achieve the above object, this application proposes the following technical solutions:
A text summary generation method, including:
acquiring a target text and a reference text, where the reference text is determined based on the target text content that the user is concerned with;
based on locating the associated content of the reference text in the target text, performing summary generation processing on the target text to obtain a target text summary corresponding to the reference text.
A text summary generation apparatus, including:
a data acquisition unit configured to acquire a target text and a reference text, where the reference text is determined based on the target text content that the user is concerned with;
a summary generation unit configured to, based on locating the associated content of the reference text in the target text, perform summary generation processing on the target text to obtain a target text summary corresponding to the reference text.
A text summary generation device, including:
a memory and a processor;
the memory is connected to the processor and is configured to store a program;
the processor is configured to implement the above text summary generation method by running the program in the memory.
A storage medium, on which a computer program is stored, where the computer program, when run by a processor, implements the above text summary generation method.
In the text summary generation method proposed in this application, when a text summary is generated for a target text, a reference text serves as the reference for summary generation: by locating the associated content of the reference text in the target text, summary generation processing is performed on the target text to obtain a target text summary corresponding to the reference text. When generating the text summary for the target text, this method jointly applies the content of the target text and the associated content of the reference text within the target text to determine the text summary of the target text. With this text summary generation method, even for the same target text, when the reference texts differ, summary generation with different emphases can be performed on the target text by locating the text content related to each reference text, yielding a target text summary corresponding to each reference text. Therefore, the method can generate, for the same target text, text summaries that meet the needs of different users.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are merely embodiments of this application, and those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a schematic flowchart of a text summary generation method provided by an embodiment of this application;
Fig. 2 is a schematic structural diagram of a chapter-interaction semantic retrieval model provided by an embodiment of this application;
Fig. 3 is a schematic structural diagram of a semantic retrieval model based on retrieval-candidate information enhancement provided by an embodiment of this application;
Fig. 4 is a schematic structural diagram of an attention-mechanism-based text summary generation model provided by an embodiment of this application;
Fig. 5 is a schematic structural diagram of another attention-mechanism-based text summary generation model provided by an embodiment of this application;
Fig. 6 is a schematic structural diagram of a word-sentence-chapter level information coding model provided by an embodiment of this application;
Fig. 7 is a schematic diagram of word segmentation feature extraction provided by an embodiment of this application;
Fig. 8 is a schematic structural diagram of an information fusion model provided by an embodiment of this application;
Fig. 9 is a schematic structural diagram of a text summary generation apparatus provided by an embodiment of this application;
Fig. 10 is a schematic structural diagram of a text summary generation device provided by an embodiment of this application.
Detailed Description
The technical solutions of the embodiments of this application are applicable to application scenarios of generating text summaries. With the technical solutions of the embodiments of this application, text summaries consistent with users' points of interest can be generated, thereby meeting the summary needs of different users.
The above application scenarios of generating text summaries specifically refer to scenarios where summary content needs to be generated, including but not limited to specific scenarios such as meeting minutes generation, literature abstract generation, and news key-point extraction.
文本纪要生成,是指对长篇文本进行内容提取,从而提炼出能够表征文本核心内容的信息,文本纪要可以帮助人们更加直接、有效地把握文本内容。
常规的文本纪要生成方案通常是基于文本自动摘要技术,从文本中提取要点并形成概括性的文本。文本自动摘要技术按照产生摘要的方式可以划分为抽取式摘要和生成式摘要。抽取式摘要是从原始文本中原封不动地抽取单词或句子来形成一个摘要,摘要内容全部来源于原文;而生成式摘要允许生成新的词语以及原文本中没有的短语来组成摘要,生成摘要时首先对文本内容进行语义理解,基于语义生成一段话来对给定的文本进行概括。
通常,需要生成文本纪要的目标文本的内容是多方面的,而不同的人员可能对不同方面的内容感兴趣,因此,不同人员对相同目标文本的文本纪要需求是不一样的。
例如,在会议场景中,会议内容通常是多方面的,而不同的参会人员所关心的内容通常是不一样的。比如,一场关于新产品策划的研讨会,同时参会的公司设计部、产品部以及市场部等负责人各自关心不同方面的内容。如设计部,更多关注的是产品设计方案的完善,产品部更多关注的是产品定义及研发规划,市场部更多关注的是新产品的市场定位。因此,不同部门所需的会议纪要的内容不同。
但是,常规的文本纪要生成方案,无论是抽取式摘要还是生成式摘要,由于其只能是对待生成摘要的文本进行技术处理,从而确定文本的主要内容,均不能针对不同的关注点生成侧重点不同的文本纪要。
基于上述技术现状,本申请实施例提出一种文本纪要生成方案,该方案能够参考用户所关注的目标文本内容,对目标文本生成文本纪要,从而可以针对不同的关注点生成不同的文本纪要,满足不同用户对于文本纪要内容的个性化需求。
在本申请后续的各项实施例中,以会议纪要生成为例,介绍本申请实施例技术方案的具体处理内容,当本申请实施例技术方案应用于其他场景时,其具体执行过程可以参照本申请各项实施例的介绍。需要说明的是,本申请实施例技术方案不仅适用于对文本进行纪要生成,从而得到文本形式的纪要,还适用 于对语音进行纪要生成,得到文本或语音形式的纪要,或者对文本进行纪要生成得到语音形式的纪要。当针对非文本形式的数据进行纪要生成,或者生成非文本形式的纪要时,可以通过将非文本形式的数据转换为文本数据,或者将生成的文本形式的纪要转换为非文本形式而实现,其中的纪要生成主要处理,依然可以参照本申请实施例的介绍。
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请实施例提出一种文本纪要生成方法,参见图1所示,该方法包括:
S101、获取目标文本以及参考文本。
其中,上述的目标文本,是指需要生成纪要的文本,该目标文本可以是通过任意途径获取的任意内容、任意语种的文本。具体而言,该目标文本可以是直接获取的文本,例如学术文献、新闻稿件、书籍等,或者是由语音识别得到的文本,例如对会议录音进行语音识别得到的识别结果文本、对演讲人的演讲语音进行识别得到的识别结果文本等。理论上,任意形式的数据内容,均可以转换为文本形式,从而作为上述的目标文本,通过后续处理实现对该目标文本的纪要生成。
上述的参考文本,基于用户所关注的目标文本内容而确定。具体而言,该参考文本,可以表征用户对目标文本内容的感兴趣内容或关注点,同时表征用户对生成的文本纪要内容的需求,其用于对生成上述目标文本的文本纪要提供参考,以便能够生成符合用户关注点或包含用户感兴趣内容的目标文本纪要。
该参考文本,可以由用户输入,或者在执行文本纪要生成之前预先设置确定。该参考文本的具体形式,可以是固定的句式,也可以是关键词或短语的形式,或者是简短的文本句或文本段的形式,甚至可以是多种文本的逻辑组合。例如,参考文本可以是一些检索条件的组合,如A相关且B相关(记作A&B)、A相关或B相关(记作A||B),或者A相关但B不相关(记作A-B)等,甚至是更为复杂的条件组合,如A相关且B相关但C不相关(记作{A&B}-C),其中A,B,C的具体形式也不做要求,可以是短语、关键词、语句或其他形 式。
例如,在会议纪要生成场景中,对会议录音进行语音识别处理,得到与会议录音对应的文本,该文本作为上述的目标文本。同时,获取用户输入的表征其感兴趣或关注的会议内容的参考文本。该参考文本,可以是用户基于会议内容概括的短语、短句,或者是用户在会场记录的简单会议记录,或者是用户基于希望得到的会议纪要内容而确定的关键词、短语、检索条件等。当获取与会议对应的目标文本以及用户输入的参考文本时,即可执行后续的文本纪要生成处理,对上述的目标文本进行纪要生成处理,得到符合用户关注点,或者包含用户感兴趣内容的目标文本纪要。
对于处理设备来说,其对文本进行处理时,实际上是对文本特征进行处理。因此,上述的“获取目标文本以及参考文本”,可以是获取目标文本以及参考文本的原文,然后针对获取的目标文本以及参考文本原文进行特征提取,获取目标文本的特征以及参考文本的特征,用于后续的文本纪要生成处理;或者,也可以是直接获取目标文本以及参考文本的特征,用于后续的文本纪要生成处理。
S102、基于从所述目标文本中定位所述参考文本的关联内容,对所述目标文本进行纪要生成处理,得到与所述参考文本对应的目标文本纪要。
其中,上述的参考文本的关联内容,是指与参考文本相关的文本内容,例如与参考文本的相似度大于设定相似度阈值的文本内容,或者与参考文本的语义相似或相关的文本内容,均可作为参考文本的关联内容。
示例性的,本申请实施例将参考文本与目标文本的各个文本片段依次进行比对,确定参考文本与目标文本的各文本片段的文本相似度或语义相似度,从而确定参考文本与目标文本片段的相关性,实现从目标文本中定位、识别出与参考文本相关的关联内容。
可以理解,目标文本中的参考文本关联内容,对于生成与参考文本对应的文本纪要的作用价值更大。因为参考文本关联内容中包含与参考文本相关的文本信息,如果在对目标文本进行文本纪要生成时,能够重点考虑这些参考文本的关联内容,就能够使得最终生成的文本纪要包含更多的参考文本关联信息,从而使得生成的目标文本纪要与参考文本相匹配。
基于上述思想,本申请实施例在生成目标文本的文本纪要时,以目标文本中的参考文本关联内容为主,以目标文本中的其他内容为辅,生成目标文本的文本纪要,使得与参考文本的关联内容在最终生成的目标文本纪要中的占比更高,从而使得最终生成的目标文本纪要与参考文本的相关性更高,即,得到与参考文本对应的目标文本纪要。
作为一种示例性的实施方式,本申请实施例基于从目标文本中定位与参考文本相关的文本片段,对该目标文本的全文内容进行纪要生成处理,从而得到与参考文本对应的目标文本纪要。
上述的文本片段,可以是文本句、文本段或者是文本短语等。本申请实施例通过文本比对、语义比对等方法,从目标文本中识别与参考文本相关的文本片段。比如,只要目标文本中的文本片段与参考文本的文本相似度或语义相似度不为0,则可以认为其对于生成与参考文本对应的目标文本纪要有价值,因此将其确定为与参考文本相关的文本片段。
然后,结合与参考文本相关的文本片段的定位结果,对目标文本的全文内容进行纪要生成处理,即对目标文本的全文内容生成文本纪要。在生成纪要过程中,设置与参考文本相关的文本片段对于生成目标文本纪要的贡献度高于与参考文本不相关的文本片段对于生成目标文本纪要的贡献度,从而使得生成的目标文本纪要包含更多的参考文本相关文本片段信息。进一步的,还可以针对与参考文本相关的各个文本片段,根据其与参考文本的相关度大小,为其设置不同的贡献度,从而使得最终生成的目标文本纪要与参考文本的相关度更高。
例如,假设某场会议的会议原文共包含200段会话文本,例如以下会议原文所示(由于会议原文内容较长,为了表述简便,采用省略号代表省略的部分会议原文内容):
会议原文:
段落1:今天我们主要讨论下关于扫描笔新产品的上市计划。
段落2:计划2019年3月启动为期约四个月的巡展,开展科技创新日。
…………
段落16:下面将对产品趋势进行全面分析,帮助产品从激烈的市场中脱颖而出。我们将选取14个城市,积极寻找自媒体合作,合作内容包括……
…………
段落68:年轻人更喜欢外表很酷的设计,我们要认真对待这一部分市场,找到切入点。加强宣传效果……
…………
段落79:现在有个demo可以看一下。双十一发布前,我们对产品进行了较为系统的测试。测试集上正确率可以高达98%,超过竞品相对30%,足以形成代差。
…………
段落88:测试集数据测试,是没有测出效果的。因为要跟竞品完全对比的话,其实我们要在相同的事情真上作对比的。如果是同样一个集,不同的人感受是不同的。
…………
段落89:产品定位应该更多考虑年轻人,把这部分设计加进去……
…………
段落162:界面设计也需要交互合理,考虑按键属性和跳转等衔接……对了,前面说的测试问题,我们也该考虑用户主观体验这一项,需要设置一些主观体验的对比。对一些特殊情况,比如顺滑度,多行跨选等,感受下产品使用效果。
…………
段落200:今天会议就到这,后面各部门注意配合。
将上述的会议原文作为目标文本,假设用户A输入的检索条件为“产品效果主观体验”,则将用户A输入的检索条件作为参考文本,针对确定的目标文本和参考文本,通过执行本申请实施例提出的文本纪要生成方法的处理,可以确定,在上述的会议原文中,与用户A输入的检索条件“产品效果主观体验”相关的文本内容为“对了,前面说的测试问题,我们也该考虑用户主观体验这一项,需要设置一些主观体验的对比。”以及“对一些特殊情况,比如顺滑度,多行跨选等,感受下产品使用效果。”通过从会议原文中定位出与检索条件“产品效果主观体验”相关的文本内容,对该会议原文进行纪要生成处理,最终得到会议纪要“扫描笔新产品的效果不仅仅看数据结果,还需要考虑一些主观体 验的方案”。可见,最终得到的会议纪要,与用户A输入的检索条件“产品效果主观体验”相匹配,即该会议纪要是表征产品效果主观体验的相关信息的会议纪要。因此,采用本申请实施例技术方案,能够通过从会议原文中定位与用户A输入的检索条件相关的文本内容,生成与用户A输入的检索条件相对应的会议纪要,从而满足用户A的会议纪要需求。
又例如,针对上述的会议原文,假设用户B输入的检索条件为“产品效果主观体验&产品定位”,则将该会议原文作为目标文本,将用户B输入的检索条件作为参考文本,通过执行本申请实施例提出的文本纪要生成方法的处理,可以确定,在上述的会议原文中,与用户B输入的检索条件“产品效果主观体验&产品定位”相关的文本内容为“产品定位应该更多考虑年轻人,把这部分设计加进去。”、“对了,前面说的测试问题,我们也该考虑用户主观体验这一项,需要设置一些主观体验的对比。”以及“对一些特殊情况,比如顺滑度,多行跨选等,感受下产品使用效果。”通过从会议原文中定位出与检索条件“产品效果主观体验&产品定位”相关的文本内容,对该会议原文进行纪要生成处理,最终得到会议纪要“扫描笔新产品定位应该考虑年轻人更喜欢外表酷的设计,使用效果不仅仅看数据结果,还需要考虑一些主观体验的方案”。可见,最终得到的会议纪要,与用户B输入的检索条件“产品效果主观体验&产品定位”相匹配,即该会议纪要是表征产品效果主观体验和产品定位的相关信息的会议纪要。因此,采用本申请实施例技术方案,能够通过从会议原文中定位与用户B输入的检索条件相关的文本内容,生成与用户B输入的检索条件相对应的会议纪要,从而满足用户B的会议纪要需求。
可见,采用本申请实施例提出的文本纪要生成方法,即便是针对相同的会议原文,当用户输入的检索条件不同时,能够分别针对不同的用户检索条件生成与用户检索条件相匹配的文本纪要,从而满足不同用户对会议内容的需求。
综上所述,本申请实施例提出的文本纪要生成方法,在对目标文本生成文本纪要时,以参考文本作为生成文本纪要的参考,通过从目标文本中定位参考文本的关联内容,对该目标文本进行纪要生成处理,得到与参考文本相对应的目标文本纪要。该方法在对目标文本生成文本纪要时,将目标文本内容,以及 目标文本中的参考文本关联内容联合应用,共同用于确定目标文本的文本纪要。采用该文本纪要生成方法,即便是针对相同的目标文本,当参考文本不同时,能够通过从目标文本中定位与参考文本相关的文本内容,对目标文本进行不同侧重点的文本纪要生成处理,从而得到与参考文本对应的目标文本纪要。因此,该方法能够针对同一目标文本,生成满足不同用户需求的文本纪要。
作为一种示例性的实施方式,本申请实施例预先训练基于注意力机制的文本纪要生成模型,用于根据目标文本以及参考文本,对目标文本进行文本纪要生成处理,得到与参考文本对应的目标文本纪要。
该基于注意力机制的文本纪要生成模型,基于预先收集的目标文本-参考文本-目标文本纪要平行数据训练得到。比如,预先收集大量的会议原文-用户检索条件-会议纪要文本的平行数据,并进行数据预处理后,用于对该模型进行训练。
所述会议原文数据,可以收集会议音频数据,进行语音转写后得到相应文本数据;当然,也可以直接收集会议文本数据,如速记员对会议整理的原始全文稿等。所述检索条件,形式不仅仅局限于固定模板的固定句式,还支持用户输入自己关心的关键词或短语,或是简短的会议记录,甚至是多种子条件的逻辑组合,如A相关且B相关(记作A&B),A相关或B相关(记作A||B)以及A相关但B不相关(记作A-B),甚至是更为复杂的条件组合,如A相关且B相关但C不相关(记作{A&B}-C),其中A,B,C的具体形式也不做要求,可以是短语、关键词、语句或其他形式。在本申请实施例中,不限定检索条件的具体来源或具体内容。所述会议纪要,旨在高度凝练和检索条件高度相关的原长篇幅的会议文本,覆盖并总结检索条件相关的会议重点内容。
所述数据预处理,首先是对会议原文文本进行分句处理。所述分句,可以采用按标点符号进行子句或整句的切分,也可根据固定字数窗,滑窗进行分句,本案对分句具体方法不做具体要求,本案采用以标点符号,按照整句切分的方式,对会议原文文本进行分句处理;其次,是将分句后的会议原文文本数据处理成分词的输入序列形式,文本分词可以用现有技术,在此不再详述,如会议原文文本句“接下来的重点工作是春交会的一个跟进。”分词后结果“接下来/ 的/重点/工作/是/春交会/的/一个/跟进。”。对于用户检索条件,若检索条件仅为单个条件,则直接将检索条件视为纯文本,所述预处理仅需将检索条件文本数据处理成分词的输入序列形式,若检索条件为多个子条件的复合,则分别将各个子条件处理成分词后的序列形式。对于对应的会议纪要,所述预处理是将文本数据处理成分词的序列形式。
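上述按标点整句切分与分词的预处理步骤,可用如下最小示意说明(分词此处以字符粒度近似,仅作说明,实际可替换为任意中文分词工具):

```python
import re

def split_sentences(text):
    # 按句末标点(。!?;)整句切分,并保留标点
    parts = re.split(r'(?<=[。!?;])', text)
    return [p.strip() for p in parts if p.strip()]

def tokenize(sentence):
    # 示意性分词:此处按字符粒度近似,实际可用任意分词器替换
    return list(sentence)

doc = "接下来的重点工作是春交会的一个跟进。计划2019年3月启动为期约四个月的巡展!"
sents = split_sentences(doc)
tokens = tokenize(sents[0])
```

切分得到的每个句子再经分词处理,即得到后续模型所需的分词输入序列。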
该基于注意力机制的文本纪要生成模型,为时序输出模型,即每次解码输出一个分词,最终得到的各个分词可以组合得到纪要文本。在训练时,分别获取会议原文的分词序列的特征,以及检索条件的分词序列的特征,将会议原文分词序列的特征和检索条件的分词序列的特征,输入该基于注意力机制的文本纪要生成模型,并通过将模型输出的会议纪要分词序列与预先处理得到的会议纪要分词序列进行比对确定模型损失,基于该模型损失对模型参数进行修正,使得该基于注意力机制的文本纪要生成模型,能够以目标文本和参考文本的特征为输入,生成与参考文本对应的目标文本纪要。
下面,以生成满足不同用户检索条件的会议纪要为例,对本申请实施例提出的文本纪要生成方法的具体处理过程进行介绍。在下文实施例中,以会议原文,代表上述的目标文本,以用户输入的检索条件或检索子条件,代表上述的参考文本,通过介绍对会议原文生成文本纪要,得到与用户输入的检索条件相符的文本纪要的过程,展示本申请实施例基于从目标文本中定位参考文本的关联内容,生成与参考文本相对应的目标文本纪要的技术方案的具体处理过程。
首先,对于“获取目标文本以及参考文本”,如上文所示,可以是获取目标文本和参考文本的原文,然后对其进行特征提取,用于后续的文本纪要生成处理。或者,也可以是直接获取目标文本的特征以及参考文本的特征,并用于后续的文本纪要生成处理。
具体的目标文本特征和参考文本特征的提取过程,可以参照后续实施例的详细介绍。
然后,关于本申请实施例提出的文本纪要生成方法中的“基于从所述目标文本中定位所述参考文本的关联内容,对所述目标文本进行纪要生成处理,得到与所述参考文本对应的目标文本纪要”这一处理步骤,作为一种示例性的实 施方式,可以通过如下步骤A1-A2实现:
A1、通过确定目标文本中的各个文本片段与参考文本的相关度,从所述目标文本中定位出与所述参考文本相关的文本片段。
具体的,上述的文本片段,可以是文本句、文本段或者文本短语等任意粒度的文本内容。本申请实施例对目标文本进行文本句划分,以划分得到的各个文本句作为上述的文本片段。其中,对目标文本进行文本句划分,可以是按照目标文本的标点,对其进行文本句划分,或者是根据固定字数窗,滑窗进行分句,本案对分句具体方法不做具体要求。在本申请实施例中,根据目标文本中的标点,对其进行语句划分。
然后,将参考文本与划分得到的各个目标文本句进行文本比对或语义比对,确定参考文本与各个目标文本句的相关度。在本申请实施例中,通过语义度量,确定目标文本中的各个文本片段与参考文本的相关度,即,通过比对参考文本与目标文本中的各个文本片段之间的语义相似度,确定参考文本与目标文本中的各个文本片段的相关度。
示例性的,先分别获取目标文本的特征,以及参考文本的特征。然后,根据目标文本中的各个文本片段的特征以及参考文本的特征,分别确定目标文本中的各个文本片段与参考文本的相关度。根据参考文本与目标文本中的各个文本片段的相关度,即可从目标文本中定位出与参考文本相关的文本片段。例如,在目标文本中,与参考文本的相关度不为零的文本片段,即为与参考文本相关的文本片段。
其中,目标文本的特征,可以是通过对目标文本中的各个文本片段提取特征,然后由各个文本片段的特征组合得到目标文本的特征,由此,基于目标文本中的各个文本片段的特征,可以分别确定目标文本的整体特征,以及目标文本的各个文本片段的特征。或者,也可以直接对目标文本整体进行特征提取,得到目标文本的特征,然后,根据目标文本的各个文本片段在目标文本中的位置,从目标文本特征中截取相应位置的特征,得到各个文本片段的特征。
在本申请实施例中,通过提取目标文本的各个文本片段的特征,确定目标文本的特征,以及,通过提取参考文本的各个文本片段的特征,确定参考文本的特征,具体的特征提取处理过程将在后续实施例中介绍。
作为一种可选的实施方式,例如,在会议纪要生成的应用场景下,假设用户输入了多个检索子条件,则针对每个检索子条件,分别确定其与会议原文中的每句文本的语义相似度。此时,针对每个检索子条件,确定其与会议原文中的每句文本的语义相似度,均可以通过上述的确定目标文本的文本片段与参考文本的语义相似度的处理过程实现。
示例性的,本申请实施例构建语义模糊检索模型,提取用户检索条件的每个子条件与会议原文中的每句文本的语义相似度得分,也就是确定参考文本与目标文本中的各个文本片段的相关度。
所述语义模糊检索模型,以检索子条件文本及会议原文文本为输入,输出对于某条检索子条件文本,会议原文文本中每句文本和其语义相似度得分。基于上述的语义模糊检索模型,将会议原文及用户检索子条件文本序列输入到语义模糊检索模型后,即可得到对于某条用户检索子条件,会议原文中1到n(n为会议原文句子总数)个句子与其的语义相似度得分。对于检索子条件A,其与会议原文的n个句子的语义相似度得分用 score_1^A, score_2^A, …, score_n^A 表示;对于检索子条件B,其与会议原文的n个句子的语义相似度得分用 score_1^B, score_2^B, …, score_n^B 表示。
本申请实施例提出两种语义模糊检索模型框架:篇章交互语义检索模型,以及基于检索候选信息增强的语义检索模型,用于度量检索子条件(参考文本)与会议原文(目标文本)句子的语义相似度得分。
如图2所示,篇章交互语义检索模型包括词编码器、句编码器,以及检索子条件文本与会议文本交互模块。其中,词编码器采用BERT预训练模型,首先将会议转写文本句子S以及检索子条件Q输入到词编码器中(其中,Q={w_1, w_2, …, w_n}表示检索子条件Q所包含的n个分词,S_j={w_{j,1}, w_{j,2}, …, w_{j,m}}表示第j个会议文本句子所包含的m个分词),获取句子中每个词的上下文编码向量,抽取[CLS]位置的词向量表征 s_j 以及 q,分别作为转写文本句子编码以及检索子条件句子编码,用以表征整个句子的信息。
然后,转写文本句子编码通过两层Transformer句编码器建模,将上下文信息引入当前句子,补全当前句子承接上下文所省略的信息,从而获得更准确的句子表征 h_j^s。
其次,通过检索子条件文本与会议文本交互模块,对会议原文的各个文本句的句子编码,以及检索子条件的句子编码,进行基于注意力机制的交互运算,得到信息完善后的检索子条件句子编码。检索子条件文本与会议文本交互模块由attention结构构成,将检索子条件句子编码 q 作为attention机制的询问Q,将会议转写句子编码作为K和V,从而将会议内容信息融入检索子条件句子编码q中。在检索子条件较为简短,或者描述模糊等信息不全的情况下,会议文本内容可以较好地补充检索子条件中省略的信息;同时编码q拥有了全场会议内容的全局视野,对于其在转写文本中选择更优的检索结果更为有利。
最后,将检索子条件句子编码q与每个会议文本句子编码 h_j^s 做拼接,生成最终的交互向量 S_j = [q; h_j^s; q·h_j^s; q−h_j^s],输入到输出层预测该句与检索子条件的语义相似度得分。上式中,q·h_j^s 表示检索子条件与文本句子j的点积运算,代表两者相似程度;q−h_j^s 表示检索子条件与文本句子j的信息差,通过多个视角的比较能够获得更全面的相似性判断信息。
相比于篇章交互语义检索模型,基于检索候选信息增强的语义检索模型在检索子条件与文本句子交互方式上进行了修改。具体来说,在获得检索子条件句子编码 q,以及会议原文的各个文本句的句子编码 h_1^s, h_2^s, …, h_n^s 之后,根据检索子条件的句子编码以及各个文本句的句子编码,计算会议原文的各个文本句与检索子条件的相似度;然后,根据会议原文的各个文本句与检索子条件的相似度,从会议原文中选出与参考文本的相似度最高的N个文本句;最后,对选出的N个文本句的句子编码,以及检索子条件的句子编码,进行基于注意力机制的交互运算,得到信息完善后的检索子条件句子编码。
例如,通过和会议文本句子编码 h_j^s 计算余弦距离,可以获得与检索子条件最为相似的TopN个初步匹配的检索结果,也就是,根据会议原文文本中的各个文本句与检索子条件的相似度,从会议原文文本中选出与检索子条件相似度最高的N个文本句。如图3中所示,r=1表示选中的TopN个初步匹配句子,r=0表示其他相关性较低的句子。
然后,检索子条件句子编码 q 与TopN个高质量匹配结果 h_{top1}^s, …, h_{topN}^s 通过attention进行交互,更新检索子条件的编码向量为q。在N设置较小(如2)时,N个检索候选与检索子条件相关的置信度较高,可以较为准确地补充检索子条件信息;同时,通过将质量较高的N个检索作为提示信息,可以引导模型将与初步高质量检索结果相似的结果选择出来,避免最终选择的检索结果之间语义相关性相差过大,降低用户体验。
考虑到BM25在精确匹配场景上的优势,本申请实施例将BM25方案与语义检索模型的输出结果相融合。具体来说,根据会议原文中的各个文本句子的编码以及检索子条件的编码,通过BM25算法计算确定会议原文中的各个文本句子与检索子条件的语义相似度。然后,对上述的语义模糊检索模型输出的会议原文中的各个文本句子与检索子条件的语义相似度得分,以及通过BM25算法计算确定的会议原文中的各个文本句子与检索子条件的语义相似度进行融合处理,例如进行加权融合,得到融合后的会议原文中的各个文本句子与检索子条件的语义相似度得分,即得到融合后的会议原文中的各个文本句子与检索子条件的相关度。
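上述将语义检索得分与BM25得分加权融合的处理,可用如下最小示意说明(权重 w 为假设值,实际取值可依据效果调整):

```python
def fuse_scores(semantic, bm25, w=0.7):
    # 加权融合语义检索得分与BM25得分;w 为语义得分权重(假设值)
    return [w * s + (1 - w) * b for s, b in zip(semantic, bm25)]

fused = fuse_scores([0.8, 0.2], [0.5, 0.9])
```

融合后的得分即作为各文本句与检索子条件的相关度,参与后续的惩罚与筛选处理。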
另外,考虑到与检索子条件相关的句子在会议文本中的跨度过大时不符合实际检索子条件描述内容较为集中的事实,本申请实施例对于与检索子条件相关的会议原文句子的位置分布的跨度进行约束,也就是,根据会议文本中的各个文本句子在会议文本中的位置分布,对会议文本中的各个文本句子与检索子条件的相关度进行修正。
本申请实施例先从会议文本中的各个文本句中,选出与检索子条件的相关度最高的第二数量的文本片段;
然后,按照会议文本中的其它文本句与选出的第二数量的文本句的距离越大,则对其它文本句与检索子条件的相关度的惩罚度越高的规则,确定对会议文本中的其它文本句与检索子条件的相关度的惩罚度;并根据对会议文本中的其它文本句与检索子条件的相关度的惩罚度,对会议文本中的其它文本句与检索子条件的相关度进行惩罚。
具体来说,本申请实施例设定,融合后的会议原文中的各个文本句子与检索子条件的语义相似度得分的Top2是相对准确的,会议原文中的其他文本句子与检索子条件的语义相似度得分应当按照与Top2的距离大小进行相应惩罚。由于Top2有两个句子,对其他文本句子与检索子条件的语义相似度得分进行惩罚时,应只按其与Top2两个句子中距离较小者计算;基于上述处理,会议原文中的其他文本句子与检索子条件的语义相似度得分为原得分减去距离惩罚分。
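上述基于Top2位置的距离惩罚,可用如下最小示意说明(惩罚函数的具体形式原文未给出,此处假设按距离线性惩罚,系数 lam 为假设值):

```python
def apply_distance_penalty(scores, lam=0.01):
    # scores: 各文本句与检索子条件的融合相似度得分;lam 为假设的距离惩罚系数
    top2 = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:2]
    out = []
    for i, s in enumerate(scores):
        if i in top2:
            out.append(s)  # Top2 句子视为相对准确,不做惩罚
        else:
            dist = min(abs(i - j) for j in top2)  # 只按与Top2中较近者的距离惩罚
            out.append(s - lam * dist)            # 原得分减去距离惩罚分
    return out

penalized = apply_distance_penalty([0.9, 0.2, 0.8, 0.2, 0.2])
```

距离Top2句子越远的句子被扣除的惩罚分越多,体现了“检索子条件描述内容在原文中较为集中”的先验。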
更进一步的,本申请实施例还对从会议原文中识别出的与检索子条件相关的文本句进行异常文本句滤除处理,即,更进一步地精确会议文本句与检索子条件的相似度得分。首先,根据会议文本中的各个文本句与检索子条件的相关度,从会议文本中选出与检索子条件的相关度最高的第三数量的文本句;然后,根据选出的第三数量的文本句中的各个文本句与检索子条件的相关度,从所述第三数量的文本句中,选出与检索子条件的相关度大于第一相关度阈值,或者与检索子条件的相关度大于第二相关度阈值并且与检索子条件的标准化相关度大于第三相关度阈值的文本句,作为与检索子条件相关的文本句;
其中,所述第一相关度阈值大于所述第二相关度阈值,所述第二相关度阈值大于所述第三相关度阈值。
具体而言,在从会议文本中抽取出与检索子条件的相关度TopK的文本句子后,将非停用词数目小于等于1的文本句子删除以过滤低信息量句子。如果剩余的文本句子与检索子条件的相似度得分score小于t1(如t1=0.6),则认为该文本句子的可信度较低,考虑将该句子删除;如果剩余的文本句子与检索子条件的相似度得分score都比较低,例如都小于t1,则说明检索子条件内容较为困难,如过于简略或者概括性较强,因此将调低门限t2(如t2=0.3),并进一步查看剩余的文本句子与检索子条件的标准化得分score_norm,以及设定门限t3(如t3=0.2)。最后,只保留与检索子条件的相似度得分score大于t1,或者与检索子条件的相似度得分大于t2并且与检索子条件的标准化相似度得分大于t3的文本句子,作为与检索子条件相关的文本句子。
具体来说,标准化得分score_norm由各句得分做最大-最小归一化得到,计算如下:
score_max = max_i(score_i)
score_min = min_i(score_i)
score_norm_i = (score_i − score_min) / (score_max − score_min)
最终的相关文本句子挑选策略为,对于会议原文中的文本句子i,其与检索子条件的相似度得分需要满足 (score_i > t_1) || ((score_i > t_2) && (score_norm_i > t_3))。
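上述基于门限 t1、t2、t3 的文本句挑选策略,可用如下最小示意实现(score_norm 的具体计算原文以图片形式给出,此处假设采用最大-最小归一化,仅作说明):

```python
def min_max_norm(scores):
    # 最大-最小归一化(假设的 score_norm 计算方式)
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def select_relevant(scores, t1=0.6, t2=0.3, t3=0.2):
    # 保留 score>t1,或 score>t2 且标准化得分>t3 的句子下标
    norm = min_max_norm(scores)
    return [i for i, s in enumerate(scores) if s > t1 or (s > t2 and norm[i] > t3)]

idx = select_relevant([0.7, 0.35, 0.1, 0.25])
```

该策略在整体得分偏低(检索子条件过于简略或概括性较强)时,通过放宽门限并参考标准化得分,避免漏掉可信的相关句。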
通过上述的处理,可以从会议原文中定位出与检索条件或检索子条件相关的文本句。该与检索条件或检索子条件相关的文本句,也可以输出给用户,使用户可以应用或知晓在会议原文中,有哪些与检索条件或检索子条件相关的文本内容。
基于上述处理,可以分别确定检索子条件与会议文本中的各个文本句的语义相似度,还可以从会议原文的文本中,识别出与检索子条件相关的文本句子。
当检索子条件有多个时,需要对各个检索子条件与会议文本中的各个文本句的语义相似度得分进行整合,确定检索条件整体上与会议文本中的各个文本句的语义相似度得分。
具体而言,当检索条件包含多个检索子条件(即相当于参考文本的文本数量大于1,其中,文本的数量可以根据文本句、文本段等粒度的文本的数量而确定)时,本申请实施例针对会议文本中的每个文本句,通过如下处理确定其与检索条件的相关度,也就是确定其与检索条件的相似度得分:
首先,根据该文本句的特征,以及各个检索子条件的特征,确定该文本句与各个检索子条件的相关度。
具体而言,参照申请上述实施例的介绍,可以针对该文本句,基于该文本句的特征和各个检索子条件的特征,分别计算确定该文本句与检索条件中的各个检索子条件的相似度得分。
然后,根据各条检索子条件之间的关系,对该文本句与各个检索子条件的相关度进行融合处理,确定该文本句与检索条件的相关度。
具体的,当用户设置多个检索子条件构成完整的检索条件时,通常是通过对多个检索子条件进行逻辑组合得到检索条件。因此,检索条件中的各个检索子条件之间具有明确的逻辑关系。
基于上述的逻辑关系,本申请实施例在分别确定该文本句与各个检索子条件之间的相似度得分后,按照各个检索子条件之间的逻辑关系,对该文本句与各个检索子条件之间的相似度得分进行逻辑组合,从而确定该文本句与检索条件整体的相似度得分。
上述的对该文本句与各个检索子条件之间的相似度得分进行逻辑组合,可以参见表1所示:
表1:检索子条件相似度得分的逻辑组合规则(原文以图示形式给出)
根据表1所示,对于会议原文中的第i个文本句,假设其与检索子条件A的相似度得分为 score_i^A,其与检索子条件B的相似度得分为 score_i^B,则按照表1所示的组合规则,可相应得到该文本句与检索条件A&B的相似度得分、与检索条件A||B的相似度得分,以及与检索条件A-B的相似度得分。
进一步的,本申请实施例还对通过上述方式确定的会议原文中的各个文本句与检索条件的相似度得分进行归一化处理,使得会议原文的各个文本句与检索条件的相似度得分处于0-1之间,以便于更加直观地表示会议原文的文本句与检索条件的相关度,并且使不同文本句与检索条件的相关度具有可比性。经过上述处理后,对于会议原文中的第1到n个文本句子,其与检索条件的相关度用 p_1, p_2, …, p_n 表示。
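多个检索子条件得分的逻辑组合,可用如下最小示意说明(表1的具体融合规则原文以图示给出,此处假设 A&B 取较小值、A||B 取较大值、A-B 取 min(score_A, 1−score_B),为常见的模糊逻辑式组合,仅作说明):

```python
def combine_subscores(score_a, score_b, op):
    # 假设的逻辑组合规则:A&B 取较小值,A||B 取较大值;
    # A-B 表示“与A相关但与B不相关”,取 min(score_A, 1-score_B)
    if op == "&":
        return [min(a, b) for a, b in zip(score_a, score_b)]
    if op == "||":
        return [max(a, b) for a, b in zip(score_a, score_b)]
    if op == "-":
        return [min(a, 1.0 - b) for a, b in zip(score_a, score_b)]
    raise ValueError("unsupported op: " + op)

and_scores = combine_subscores([0.9, 0.2], [0.5, 0.8], "&")
or_scores = combine_subscores([0.9, 0.2], [0.5, 0.8], "||")
```

对更复杂的复合条件(如 {A&B}-C),可将上述组合嵌套应用。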
A2、至少基于所述目标文本中的与所述参考文本相关的各个文本片段与所述参考文本的相关度,对所述目标文本的全文内容进行纪要生成处理,得到与所述参考文本对应的目标文本纪要。
具体的,当从目标文本中定位出与参考文本相关的各个文本片段时,即可根据与参考文本相关的各个文本片段,对目标文本的全文内容进行纪要生成处理,得到与参考文本对应的目标文本纪要。
例如,以目标文本中的与参考文本相关的各个文本片段的内容为主,以目标文本中的其他文本内容为辅,生成文本纪要,得到的目标文本纪要中的主要内容是与参考文本相关的内容。
或者,根据目标文本中与参考文本相关的各个文本片段与参考文本的相关度,对各个相关文本片段对于生成目标文本纪要的贡献度进行设置,使得与参考文本的相关度越高的文本片段,对于生成目标文本纪要的贡献度越高,从而使得最终生成的目标文本纪要中,所包含的与参考文本相关的文本段内容所占的比例与参考文本的相关度成正比。
作为一种优选的实施方式,本申请实施例综合考虑目标文本中的各个文本片段与参考文本的相关度,对目标文本的全文内容进行纪要生成处理。即,通过执行如下步骤A21-A22,对目标文本的全文内容进行纪要生成:
A21、根据目标文本中的各个文本片段与参考文本的相关度,确定所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度。
具体的,本申请实施例按照目标文本中的文本片段与参考文本的相关度越高,则目标文本中的文本片段对于生成与参考文本对应的文本纪要的贡献度越大的规则,分别确定目标文本中的各个文本片段对于生成与参考文本对应的文本纪要的贡献度。
作为一种示例性的实施方式,本申请实施例采用基于注意力机制的文本纪要生成模型来生成目标文本的文本纪要。该基于注意力机制的文本纪要生成模型,能够根据目标文本中的各个文本片段与参考文本的相关度,确定目标文本中的各个文本片段对于生成与参考文本对应的文本纪要的贡献度,进而,基于该贡献度生成目标文本的文本纪要。
由于该文本纪要生成模型是基于注意力机制的文本解码模型,因此,该模型能够通过调整对输入的目标文本的各个文本片段的注意力系数,得到满足需求的文本纪要解码结果。当解码过程对不同文本片段的注意力不同时,可以使得最终解码得到的文本纪要的内容发生变化。由于该基于注意力机制的文本纪要生成模型是时序输出模型,因此,当前时刻的文本纪要解码对于目标文本中的各个文本片段的注意力系数,还可能与当前时刻之前已经输出的目标文本纪要内容有关。
通过训练,该模型能够根据前序的解码结果,确定当前时刻文本纪要解码对目标文本的各个文本片段的注意力系数分布,也就是对输入的目标文本的各个文本片段分配正确的注意力系数。因此,通过该模型能够确定其在生成目标文本的文本纪要时,对于目标文本中的各个文本片段的注意力系数。
然后,根据生成目标文本的文本纪要对于目标文本中的各个文本片段的注意力系数,以及目标文本中的各个文本片段与参考文本的相关度,确定目标文本中的各个文本片段对于生成与参考文本对应的文本纪要的贡献度。
由于本申请实施例的最终目的是生成与参考文本对应的文本纪要,因此,只确定文本纪要生成对目标文本的各个文本片段的注意力系数,还不足以使得最终生成的目标文本纪要与参考文本相对应。为了使得最终生成的目标文本纪要与参考文本相对应,本申请实施例还将生成目标文本的文本纪要对于目标文本中的各个文本片段的注意力系数,与目标文本中的各个文本片段与参考文本的相关度,进行结合,共同用于确定目标文本中的各个文本片段对于生成与参考文本对应的文本纪要的贡献度。
示例性的,将生成目标文本的文本纪要对于目标文本中的各个文本片段的注意力系数,与目标文本中的各个文本片段与参考文本的相关度,进行相乘运算,然后再将与各个文本片段对应的乘积结果进行归一化处理,最终得到的与各个文本片段对应的归一化值,作为目标文本中的各个文本片段对于生成参考文本对应的文本纪要的贡献度。
以上的贡献度确定方案,也可以参见后续实施例对于生成目标文本纪要的具体过程的举例说明。
A22、至少根据所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,对所述目标文本的全文内容进行纪要生成处理,得到与所述参考文本对应的目标文本纪要。
具体的,上述的基于注意力机制的文本纪要生成模型,首先根据目标文本的特征,以及目标文本中的各个文本片段对于生成与参考文本对应的文本纪要的贡献度,生成文本纪要解码特征;然后,在根据该文本纪要解码特征,进行文本纪要解码处理,生成目标文本的文本纪要。
由于本申请实施例是通过获取目标文本的各个文本片段的特征,进而确定目标文本的特征,因此,目标文本的各个文本片段的特征是预先明确的。而目标文本各个文本片段对于生成文本纪要的贡献度不同,因此,可以直接根据目标文本的各个文本片段的特征,以及目标文本中的各个文本片段对于生成与参 考文本对应的文本纪要的贡献度,生成文本纪要解码特征。
进而,对上述的文本纪要解码特征在预设的字典范围内进行解码,得到解码结果。按照上述方法使模型对目标文本全文进行解码,即可得到与参考文本对应的目标文本纪要。
仍以对会议原文生成符合用户检索条件的会议纪要为例,图4示出了基于注意力机制的文本纪要生成模型的结构,以及其解码生成目标文本纪要的处理过程。
假设对于用户某条检索条件,该模型历史已解出的会议纪要文本词序列表示为 y_1, y_2, …, y_{t-1},会议原文中每句(假设共有n句)文本的句隐层特征为 h_1^s, h_2^s, …, h_n^s,t表示当前解码时刻;对于会议原文中的第1到n个文本句子,其与检索条件的相关度用 p_1, p_2, …, p_n 表示。将会议原文中每句文本的句隐层特征 h_1^s, h_2^s, …, h_n^s,以及每句文本与检索条件的相关度 p_1, p_2, …, p_n,输入该模型,具体是输入该模型的解码端与原文交互注意力模块。
历史已解出的会议纪要文本词序列y 1,y 2,…,y t-1经过该模型的解码隐层特征表达模块后,得到当前解码时刻的隐层状态特征为d t。该解码隐层特征表达模块,可以输入对于用户某条检索条件,利用该模型已解出的历史会议纪要文本词序列,输出当前解码时刻的隐层状态特征。所述解码隐层特征表达模块的网络结构可利用Transformer方案下的decoder部分编码模型或单向LSTM等结构。
在解码端与原文交互注意力模块中,首先基于attention机制,确定在当前解码时刻,隐层状态特征 d_t 对会议原文中第j个文本句的句隐层特征的注意力系数 β_t^j;然后,根据注意力系数 β_t^j,以及会议原文中的第j个文本句与检索条件的相关度 p_j,计算确定会议原文中的第j个文本句对于会议纪要生成的贡献度 α_t^j;最后,根据会议原文中的各个文本句的句隐层特征,以及会议原文中的各个文本句对于会议纪要生成的贡献度,生成文本纪要解码特征 c_t。以上处理的具体计算过程如下:
β_t^j = Attention(d_t, h_j^s)
α_t^j = (β_t^j · p_j) / Σ_{k=1}^{n} (β_t^k · p_k)
c_t = Σ_{j=1}^{n} α_t^j · h_j^s
其中,j=1,2,…,n,表示会议原文中的n个文本句。Attention()表示注意力机制计算函数,可采用self-attention及加性attention等方式。在本申请实施例中,解码端充分考虑在会议原文所有文本句中,和检索条件内容相关的信息对会议纪要生成的影响,优化改进原注意力系数为 α_t^j;文本纪要解码特征 c_t 为解码端在语义模糊检索特征指导下做相关内容选择的、关注会议原文句隐层特征程度不同的上下文向量表示。由 α_t^j 与 c_t 的计算过程可以看出,若会议原文中的某个句子和检索条件的相关度越高,即检索匹配特征值越大,则该句对应优化后的注意力系数值越大,对最终的文本纪要解码特征 c_t 的贡献程度则越大,保证了本申请实施例提出的基于注意力机制的会议纪要生成模型具备选择和检索相关原文内容的能力。
图4中的出词预测模块,输入文本纪要解码特征 c_t,计算分布在词典大小上的出词概率,输出当前解码时刻对应的词。所述出词预测模块的网络结构可利用线性层接非线性激活函数层,解码算法可用beam search算法。依照上述算法,模型在每一时刻分别确定基于会议原文的文本纪要解码特征,并且进行解码输出,从而可以得到与会议原文对应、并且符合检索条件的会议纪要。
作为一种优选的实施方式,由于本申请实施例最终生成的目标文本纪要,是需要与参考文本相对应的,因此,除了明确目标文本中的各个文本片段对于生成与参考文本对应的文本纪要的贡献度,从而使得最终生成的目标文本纪要包含与参考文本相关的目标文本内容之外,本申请实施例还直接将参考文本的特征,用于生成目标文本的文本纪要,从而进一步地提高生成的目标文本纪要与参考文本的相关性。
基于上述思想,本申请实施例在生成文本纪要解码特征时,根据目标文本的特征、参考文本的特征,以及目标文本中的各个文本片段对于生成与参考文本对应的文本纪要的贡献度,生成文本纪要解码特征。
示例性的,在图4所示的基于注意力机制的文本纪要生成模型中,除了将会议原文中每句文本的句隐层特征 h_1^s, h_2^s, …, h_n^s,以及每句文本与检索条件的相关度 p_1, p_2, …, p_n 输入该模型之外,还将检索条件的特征输入该模型,例如将检索条件的每个词的词隐层特征 h_1^q, h_2^q, …, h_m^q 输入该模型,使得该模型进行解码处理时,能够参考检索条件对会议原文的文本内容进行解码,从而提高解码结果与检索条件的相关性。
例如,将会议原文中每句文本的句隐层特征 h_1^s, h_2^s, …, h_n^s、每句文本与检索条件的相关度 p_1, p_2, …, p_n,以及检索条件的每个词的词隐层特征 h_1^q, h_2^q, …, h_m^q,输入该模型的解码端与原文交互注意力模块,使得该模型在确定对会议原文的各个文本句的注意力系数、确定会议原文的各个文本句对生成与检索条件对应的会议纪要的贡献度,以及生成文本纪要解码特征时,均能够以检索条件为参考,从而使得最终解码得到的会议纪要与检索条件的相关度更高,避免会议纪要脱离检索条件。
作为一种可选的实施方式,本申请实施例提出如图5所示的基于注意力机制的文本纪要生成模型结构,该模型相对于图4所示的模型结构,在解码端与原文交互注意力模块之前,增加了解码端与检索交互注意力模块,在该解码端与检索交互注意力模块,主要实现模型隐层状态特征与检索条件特征的交互,生成参考解码特征,该参考解码特征,再通过解码端与原文交互注意力模块实现与会议原文特征的交互。
基于图5所示的模型结构,在生成文本纪要解码特征时,先根据参考文本的特征,生成参考解码特征;然后再根据参考解码特征、目标文本的特征,以及目标文本中的各个文本片段对于生成参考文本对应的文本纪要的贡献度,生成文本纪要解码特征。
具体而言,参见图5所示,将检索条件的特征,以及隐层状态特征,输入模型的解码端与检索交互注意力模块,使得检索条件的特征与模型隐层状态特征相融合,得到参考解码特征。
更进一步的,本申请实施例通过参考文本的各个文本片段的特征,来确定参考文本的特征,而参考文本的各个文本片段,对于生成与参考文本对应的目标文本纪要的影响,也是不同的。例如,参考文本中的关键实体词,能够很大 程度上表达参考文本的语义,因此其对于生成与参考文本对应的目标文本纪要的参考价值更大,而参考文本中的非实体词,例如语气词、修饰词等,对于生成与参考文本对应的目标文本纪要的参考价值相对更小。因此,本申请实施例参照上述实施例介绍的确定目标文本中的各个文本片段对于生成与参考文本对应的文本纪要的贡献度的方案,确定参考文本中的各个文本片段对于生成与参考文本对应的文本纪要的贡献度。其中,参考文本中的文本片段,可以是词、短语、语句、文本段落等任意粒度的文本内容。
当明确了参考文本中的各个文本片段对于生成与参考文本对应的文本纪要的贡献度后,根据参考文本的特征,以及参考文本中的各个文本片段对于生成与参考文本对应的文本纪要的贡献度,生成参考解码特征。
具体而言,本申请实施例通过获取参考文本的各个文本片段的特征而确定参考文本的特征,而参考文本的各个文本片段对于生成与参考文本对应的文本纪要的贡献度不同,因此,本申请实施例根据参考文本的各个文本片段的特征,以及参考文本中的各个文本片段对于生成与参考文本对应的文本纪要的贡献度,生成参考解码特征。
继续以上述的对会议原文生成符合用户检索条件的会议纪要为例,借助如图5所示的基于注意力机制的文本纪要生成模型,假设检索条件的每个词的词隐层特征为 h_1^q, h_2^q, …, h_{m1+m2}^q。将当前解码时刻隐层状态特征 d_t,以及检索条件每个词的词隐层特征 h_i^q,输入解码端与检索交互注意力模块。该模块利用attention机制,先确定检索条件中的每个词对于生成与检索条件对应的文本纪要的贡献度,然后根据检索条件的每个词的特征以及检索条件中的每个词对于生成与检索条件对应的文本纪要的贡献度,生成参考解码特征。具体计算过程如下:
γ_t^i = Attention(d_t, h_i^q)
c_t^q = Σ_{i=1}^{m1+m2} γ_t^i · h_i^q
其中,i=1,2,…,m1+m2,表示检索条件中的m1+m2个词,m1+m2表示检索条件包含的两个检索子条件的词的数量之和。Attention()表示注意力机制计算函数,可采用self-attention及加性attention等方式。γ_t^i 表示当前解码时刻,解码隐层状态特征对检索条件中第i个词的词隐层特征的注意力系数,该系数也表示检索条件中的第i个词对于生成与检索条件相对应的文本纪要的贡献度。c_t^q 为解码端与检索交互后获得的、关注检索条件词隐层特征程度不同的上下文向量表示,在本申请实施例中,将其命名为参考解码特征。可以理解,本申请实施例技术方案在解码端充分考虑了检索条件中信息对会议纪要生成的贡献度。
参见图5所示,在得到参考解码特征 c_t^q 后,将该参考解码特征、会议原文的各个文本句的特征,以及会议原文中的各个文本句与检索条件的相关度,输入解码端与原文交互注意力模块,使该模型通过注意力交互运算,确定会议原文中的各个文本句对于生成与检索条件对应的文本纪要的贡献度,以及文本纪要解码特征。具体计算过程如下:
β_t^j = Attention(c_t^q, h_j^s)
α_t^j = (β_t^j · p_j) / Σ_{k=1}^{n} (β_t^k · p_k)
c_t = Σ_{j=1}^{n} α_t^j · h_j^s
其中,c_t^q 为解码端在第t时刻与检索交互后的上下文向量,也就是参考解码特征;h_j^s 为会议原文中第j句话的句隐层特征,j=1,2,…,n。Attention()表示注意力机制计算函数,可采用self-attention及加性attention等方式。β_t^j 表示当前解码时刻,解码端与检索交互后的上下文向量对会议原文文本中第j句话的句隐层特征的注意力系数,也就是解码端在生成与检索条件对应的文本纪要时对会议原文中的第j个文本句的注意力系数。p_j 表示检索条件与会议原文中第j句文本的相似度。
综合上述实施例介绍可见,本申请实施例在解码文本纪要时,不仅考虑了目标文本的各个文本片段对于生成与参考文本对应的目标文本纪要的贡献度,还考虑了参考文本的各个文本片段对于生成与参考文本对应的目标文本纪要的贡献度。从而保证了整个目标文本纪要生成过程具备了选择和检索相关的目 标文本内容和参考文本内容的能力,从而提高了最终生成的目标文本纪要与参考文本和目标文本中的参考文本关联内容的相关度,即使得最终生成的目标文本纪要与参考文本相对应。
需要说明的是,上述实施例中,借助目标文本的各个文本片段的特征,以及参考文本的各个文本片段的特征,说明参考解码特征和文本纪要解码特征的获取过程,以及最终解码得到目标文本纪要的处理过程。在实际实施本申请实施例时,可以直接将目标文本的整体特征和参考文本的整体特征,输入上述的基于注意力机制的文本纪要生成模型,获取参考解码特征和文本纪要解码特征,此时,对该模型的训练过程和模型的具体处理过程,均可以参照上述实施例的介绍而执行。
另外,上述的图4和图5所示的基于注意力机制的文本纪要生成模型的各个处理模块的名称,是结合具体的处理对象而命名的,当实际处理的目标文本和参考文本为其他类型的文本,而非会议原文和检索条件时,可以根据实际处理对象而对各个处理模块的名称进行适应性更改。本申请实施例并不限定上述的基于注意力机制的文本纪要生成模型的各个处理模块的名称,而主要是介绍各个处理模块的功能和处理内容,从而具体地介绍该基于注意力机制的文本纪要生成模型的处理过程和所实现的功能。
在上文各实施例中,分别介绍了度量目标文本的各个文本片段与参考文本的相关度,以及对目标文本进行文本纪要生成处理得到与参考文本对应的目标文本纪要的具体实施方式。由于对于计算机设备来说,其对文本进行处理,本质上均是对文本的特征进行处理,也就是说,本申请实施例所提出的文本纪要生成方法中所包含的对文本进行的处理内容,本质上均是对文本的特征进行的处理。因此,文本特征的准确与否,将直接影响对文本进行纪要生成处理的准确度。下面,本申请实施例将对目标文本的特征和参考文本的特征的获取方式,进行示例说明。
通常情况下,采用编码器对文本进行编码,即可得到文本特征。例如采用词级编码encoder结构,可以获取目标文本和参考文本的编码特征。但是,需要生成纪要的目标文本,通常是篇幅较长的文本,例如会议文本的主要特点就是其文本长度较长,一个时长一小时的会议,可能包含1-2万个词。而如果借助常规的词级编码器获取会议文本的特征,则将耗费大量内存,同时也无法很好地捕捉长距离依赖信息,导致提取的文本特征不准确或不完整。这也导致常规的纪要生成方法,只能针对篇幅较短的文章如新闻、邮件以及轮次较少的人际对话等场景进行摘要生成,无法胜任长篇幅的目标文本的纪要生成任务。
为了提高文本特征提取的效果,作为示例性的实施方式,本申请实施例通过获取目标文本中的各个文本片段的特征,确定目标文本的特征,以及,通过获取参考文本中的各个文本片段的特征,确定参考文本的特征。
其中,在获取目标文本中的各个文本片段的特征时,通过执行如下步骤B1-B4实现对各个文本片段的特征提取:
B1、对目标文本进行文本片段划分处理,确定目标文本包含的各个文本片段。
B2、对于目标文本中的各个文本片段,分别进行分词处理,确定各个文本片段包含的各个分词。
B3、分别提取各个文本片段包含的各个分词的融合上下文信息的分词特征。
B4、根据各个文本片段包含的各个分词的融合上下文信息的分词特征,确定各个文本片段的特征。
具体的,本申请实施例构建词-句-篇章层级信息编码模型,来提取目标文本的融合上下文信息的句级隐层特征和篇章级隐层特征,也就是提取目标文本的各个文本片段的特征,和目标文本的整体特征。
本申请实施例先对目标文本进行文本片段划分,以及对划分的文本片段进行分词,确定各个文本片段包含的各个分词。其中,对目标文本进行文本片段划分,可以是对目标文本进行文本句划分,例如按照标点符号进行文本句划分,或者借助固定字数的滑窗在目标文本上滑动提取文本片段。对文本片段进行分词,可以采用现有的分词算法实现,本申请实施例不再详细介绍。基于上述的文本片段划分和分词处理后,针对目标文本的各个文本片段包含的分词,即可借助上述的词-句-篇章层级信息编码模型来提取目标文本的文本片段特征和 篇章特征。
参见图6所示,该词-句-篇章层级信息编码模型,包括词隐层特征表达模块、句子表示提取模块、句隐层特征表达模块以及篇章特征提取模块。下面,以会议原文表示目标文本,以会议原文中的各个文本句表示目标文本中的各个文本片段,以提取会议原文的各个文本句的特征和会议原文篇章特征为例,介绍提取目标文本的各个文本片段的特征和目标文本的整体特征的处理过程。
上述的词隐层特征表达模块,是指对于会议原文文本中的每一文本句,输入每个词的词表示,输出融合当前句上下文信息的词隐层特征。所述词隐层特征表达模块的网络结构可利用Transformer方案下的encoder部分模型或双向LSTM等结构。假设会议原文经过分句后,共有n句话,第n句话包含的词序列为 w_{n,1}, w_{n,2}, …, w_{n,m_n},其中,m_n表示第n句话中包含的词的总数。h_{1,1}^w, h_{1,2}^w, …, h_{1,m1}^w 表示会议原文文本中第1个句子中的每个词融合当前句上下文信息后的词隐层特征,m1表示第1句话中共有m1个词。同理,h_{2,1}^w, …, h_{2,m2}^w 表示会议原文文本中第2个句子中的m2个词的融合当前句上下文信息后的词隐层特征,h_{n,1}^w, …, h_{n,m_n}^w 表示会议原文文本中第n个句子中的m_n个词的融合当前句上下文信息后的词隐层特征。
上述的句子表示提取模块,是将输入序列中多个词的词表示进行压缩,得到句子表示向量。会议原文文本的第1个句子所有的词隐层特征 h_{1,1}^w, …, h_{1,m1}^w 经过句子表示提取模块后,得到第1句的句子表示向量 s_1。依次类推,会议文本中第1句至第n句的句子表示向量可表示为序列 s_1, s_2, …, s_n。本申请实施例对句子表示提取模块的网络结构不作限定,可采用注意力机制或池化等技术。
上述的句隐层特征表达模块,是指输入会议原文文本所有的句子表示向量,输出融合当前句上下文信息的句子隐层特征。和上述的词隐层特征表达模块类似,所述句隐层特征表达模块的网络结构,可利用Transformer方案下的encoder部分模型或双向LSTM等结构。h_1^s, h_2^s, …, h_n^s 表示会议文本中n个句子的融合上下文信息后的句隐层特征。
上述的篇章特征提取模块,和上述的句子表示提取模块类似,是将输入序列中多句话的句隐层特征表示进行压缩,得到篇章表示向量。会议原文文本的第1句至第n句的句隐层特征序列 h_1^s, h_2^s, …, h_n^s 经过篇章特征提取模块后,得到会议原文篇章特征u。本申请实施例对篇章特征提取模块的网络结构不作限定,可采用注意力机制或池化等技术。
通过上述的词-句-篇章层级信息编码模型,可以分别获取会议原文的每个文本句的特征,以及会议原文的篇章特征,也就是会议原文的整体特征。并且,会议原文的文本句的特征、文本句所包含的分词的特征,以及会议原文的篇章特征,均是融合了上下文信息的特征,因此,本申请实施例上述的会议原文文本特征提取方案,能够更好地捕捉长篇幅的会议文本中的长距离依赖信息,得到更加准确的会议文本特征。
作为一种可选的实施方式,上述的词-句-篇章层级信息编码模型,也可以省略其中的句隐层特征表达模块,直接将句子表示提取模块输出的各个文本句的句子表示向量 s_1, s_2, …, s_n,作为会议文本的各个文本句的特征,以及,根据各个文本句的特征 s_1, s_2, …, s_n,确定会议文本的篇章特征u。
在获取参考文本中的各个文本片段的特征时,通过执行如下步骤C1-C2实现对各个文本片段的特征提取:
C1、对参考文本进行分词处理,确定参考文本包含的各个分词。
C2、分别提取参考文本包含的各个分词的融合上下文信息的分词特征。
具体的,本申请实施例通过提取参考文本的各个文本片段的特征,确定参考文本的整体特征。其中,参考文本的文本片段,可以是参考文本中的词、短语、文本句、文本段等任意粒度的文本内容。在本申请实施例中,通过提取参考文本的各个分词的特征,确定参考文本的特征。
因此,首先对参考文本进行分词处理,例如通过分词模型或分词算法,对参考文本进行分词,确定参考文本包含的各个分词。然后,分别提取参考文本包含的各个分词的融合上下文信息的分词特征,以及,根据参考文本包含的各个分词的分词特征,组合得到参考文本的整体特征。
进一步的,如果参考文本包含的文本数量大于1,即参考文本中包含多个文本句,则本申请实施例先根据各个参考文本之间的关系,对参考文本中包含的各个文本句进行合并或筛选处理后,将其整合为一个参考文本,然后对该参考文本再进行分词、分词特征提取以及参考文本特征提取处理。
示例性的,本申请实施例通过如图7所示的词隐层特征表达模块,提取参考文本的各个分词的分词特征。
以提取用户在获取会议原文的会议纪要时输入的检索条件的特征为例。若用户检索条件为单个条件,则对用户检索条件进行分词后,将检索词序列输入上述的词隐层特征表达模块,输出融合检索条件上下文信息的词隐层特征,即得到检索条件包含的各个分词的融合上下文信息的分词特征。
若用户检索条件为多个检索子条件的组合,则按照如下表2所示的方法,将单个或多个检索子条件的词序列,输入上述的词隐层特征表达模块,该词隐层特征表达模块对应输出单个或多个检索条件的融合上下文信息的词隐层特征。
表2
复合条件 输入说明 输入序列个数
A&B 将A和B子条件序列拼接为一个序列输入 1
A||B 将A和B子条件序列分别输入 2
A-B 仅输入A子条件序列 1
按照表2所示的检索条件合并或筛选思想,对于更复杂的检索子条件复合情况,如{A&B}-C,通过上述方法可处理成一个文本序列,即子条件A和B拼接后的文本序列输入。
上述的词隐层特征表达模块的网络结构可利用Transformer方案下的encoder部分模型或双向LSTM等结构。如图7中所示,假设某个检索条件中的一个输入序列,共有m1个词,分别为 w_1, w_2, …, w_{m1},将该词序列输入上述的词隐层特征表达模块后,得到检索条件每个词的隐层特征表示 h_1^q, h_2^q, …, h_{m1}^q。特别地,若检索条件按所述方法拆解后有多个序列,提取过程类似,如图中另一个输入序列,共有m2个词,分别为 w_1', w_2', …, w_{m2}',将该词序列输入上述的词隐层特征表达模块后,得到每个词的隐层特征表示 h_{m1+1}^q, …, h_{m1+m2}^q。按照上述方法,可分别获取每个检索子条件的分词特征,最后,将检索子条件包含的各个分词的分词特征按照分词顺序进行拼接,即可得到检索条件的整体特征。
通过上述处理分别确定目标文本的各个文本片段的特征,以及参考文本的各个文本片段的特征后,本申请实施例还对目标文本的各个文本片段的特征以及参考文本的各个文本片段的特征进行特征融合处理,得到融合参考文本特征的目标文本特征,和/或融合目标文本特征的参考文本特征。
即,本申请实施例将参考文本的特征,融入目标文本特征中,和/或,将目标文本的特征,融入参考文本特征中,从而使得参考文本和/或目标文本的特征中,不仅包含自身特征,还包括对方的特征。
在本申请实施例中,将参考文本的特征融入目标文本特征中,同时,将目标文本的特征融入参考文本的特征中。在实际实施本申请实施例技术方案时,可以根据本申请实施例的介绍,选择将其中一方的特征融入另一方。
其中,将目标文本特征融入参考文本特征时,可以是将目标文本的篇章特征和/或目标文本的各个文本片段的特征,融入参考文本的各个文本片段的特征,或者融入参考文本的整体特征。上述的目标文本的篇章特征,根据目标文本的各个文本片段的特征而确定。
在本申请实施例中,先根据目标文本的各个文本片段的特征,确定目标文本的篇章特征。然后,将目标文本的篇章特征,以及目标文本的各个文本片段的特征,分别融入参考文本的各个文本片段的特征;另外,将参考文本的各个文本片段的特征,融入目标文本的各个文本片段的特征中。最终,得到的参考文本的各个文本片段的特征中融合了目标文本的篇章特征和各个文本片段特征,得到的目标文本的各个文本片段的特征中融合了参考文本的各个文本片段的特征。
示例性的,仍以上述的会议原文表示目标文本,以用户检索条件表示参考文本,本申请实施例充分考虑用户检索条件和会议原文的信息融合,即提取最终的会议原文中每个句子隐层特征时,将融合相关检索条件信息,与此同时,提取最终用户检索条件中每个词隐层特征时,也将融合会议原文信息。
本申请实施例构建信息融合模型,用于实现用户检索条件和会议原文的信息融合。参见图8所示,该信息融合模型包括词隐层特征表达模块、词特征提取模块以及信息相互融合模块。
上述的词隐层特征表达模块的功能和处理过程,可参见图7所述的词隐层特征表达模块的功能介绍。
所述词特征提取模块,输入会议原文篇章级隐层特征u以及检索条件每个词的词隐层特征表示 h_1^q, h_2^q, …, h_m^q,输出融合原文篇章信息后的检索条件每个词的隐层特征 g_1^q, g_2^q, …, g_m^q。在本申请实施例中,该词特征提取模块采用递归的网络结构,会议原文篇章级隐层特征u为初始状态表示,检索条件每个词隐层特征为输入,递归获得融合会议原文篇章信息后的检索条件中每一个词的隐层特征,计算过程如下所示:
g_1^q = RNN(u, h_1^q)
g_2^q = RNN(g_1^q, h_2^q)
...
g_m^q = RNN(g_{m-1}^q, h_m^q)
上述的递归的网络结构RNN(),可采用LSTM或GRU等结构。
特别地,若检索条件为多个检索子条件的复合,则按上述处理后,有多个词隐层特征序列。如图8所示,对于另一个词隐层特征表示序列,本申请实施例提取融合会议原文篇章信息后的检索条件中每一个词的隐层特征的计算过程和上述过程一致,最终得到相应的融合后词隐层特征序列。
所述信息相互融合模块,在获得融合会议原文篇章信息后的检索条件中每一个词的隐层特征后,输入该特征以及会议原文每个文本句的句隐层特征 h_1^s, h_2^s, …, h_n^s,输出进一步融合原文句信息后的检索条件每个词的隐层特征,以及融合相关检索条件词信息的会议原文中每句话的句隐层特征。
该信息相互融合模块,可利用self-attention机制或双向LSTM等结构。
通过上述介绍可见,本申请实施例在对目标文本和参考文本进行特征提取时,不仅能够提取目标文本和参考文本的融合了上下文信息的各个文本片段的特征,还实现了目标文本特征和参考文本特征的融合,使得目标文本和参考文本的特征信息更丰富,更加有利于通过对目标文本进行文本纪要生成处理,得到与参考文本对应的目标文本纪要。
例如,基于通过上述方式提取的目标文本的特征,进行文本纪要生成处理,或者,将通过上述方式提取的目标文本的特征和参考文本的特征进行组合应 用,进行文本纪要生成处理,由于目标文本的特征中融入了参考文本的特征,因此,能够使得生成的目标文本纪要与参考文本相关。
本申请实施例通过将按照上述方式提取的目标文本的各个文本片段的特征,以及参考文本的各个文本片段的特征,输入如图5所示的基于注意力机制的文本纪要生成模型,生成与参考文本对应的目标文本纪要,具体的纪要生成过程,请参见上述实施例的介绍。
与上述的文本纪要生成方法相对应的,本申请实施例还提出一种文本纪要生成装置,参见图9所示,该装置包括:
数据获取单元100,用于获取目标文本以及参考文本,其中,所述参考文本基于用户所关注的目标文本内容而确定;
纪要生成单元110,用于基于从所述目标文本中定位所述参考文本的关联内容,对所述目标文本进行纪要生成处理,得到与所述参考文本对应的目标文本纪要。
作为一种可选的实施方式,基于从所述目标文本中定位所述参考文本的关联内容,对所述目标文本进行纪要生成处理,得到与所述参考文本对应的目标文本纪要,包括:
基于从所述目标文本中定位与所述参考文本相关的文本片段,对所述目标文本的全文内容进行纪要生成处理,得到与所述参考文本对应的目标文本纪要。
作为一种可选的实施方式,基于从所述目标文本中定位与所述参考文本相关的文本片段,对所述目标文本的全文内容进行纪要生成处理,得到与所述参考文本对应的目标文本纪要,包括:
通过确定目标文本中的各个文本片段与参考文本的相关度,从所述目标文本中定位出与所述参考文本相关的文本片段;
至少基于所述目标文本中的与所述参考文本相关的各个文本片段与所述参考文本的相关度,对所述目标文本的全文内容进行纪要生成处理,得到与所 述参考文本对应的目标文本纪要。
作为一种可选的实施方式,至少基于所述目标文本中的与所述参考文本相关的各个文本片段与所述参考文本的相关度,对所述目标文本的全文内容进行纪要生成处理,得到与所述参考文本对应的目标文本纪要,包括:
根据目标文本中的各个文本片段与参考文本的相关度,确定所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度;
至少根据所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,对所述目标文本的全文内容进行纪要生成处理,得到与所述参考文本对应的目标文本纪要。
作为一种可选的实施方式,确定目标文本中的各个文本片段与参考文本的相关度,包括:
分别获取目标文本的特征以及参考文本的特征;
根据目标文本中的各个文本片段的特征,以及参考文本的特征,分别确定目标文本中的各个文本片段与参考文本的相关度。
作为一种可选的实施方式,所述纪要生成单元110还用于:
对目标文本中的各个文本片段的特征,以及参考文本的特征,进行基于注意力机制的交互运算,得到信息完善后的参考文本特征。
作为一种可选的实施方式,对目标文本中的各个文本片段的特征,以及参考文本的特征,进行基于注意力机制的交互运算,得到信息完善后的参考文本特征,包括:
根据目标文本中的各个文本片段的特征,以及参考文本的特征,计算确定目标文本中的各个文本片段与参考文本的相似度;
根据目标文本中的各个文本片段与参考文本的相似度,从目标文本中选出与参考文本的相似度最高的第一数量的文本片段;
对从目标文本中选出的第一数量的文本片段的特征,以及参考文本的特 征,进行基于注意力机制的交互运算,得到信息完善后的参考文本特征。
作为一种可选的实施方式,当参考文本的文本数量大于1时,根据目标文本中的各个文本片段的特征,以及参考文本的特征,分别确定目标文本中的各个文本片段与参考文本的相关度,包括:
对于目标文本中的各个文本片段,分别通过如下处理确定其与参考文本的相关度:
根据该文本片段的特征,以及各条参考文本的特征,确定该文本片段与各条参考文本的相关度;
根据各条参考文本之间的关系,对该文本片段与各条参考文本的相关度进行融合处理,确定该文本片段与参考文本的相关度。
作为一种可选的实施方式,所述纪要生成单元110还用于:
根据目标文本中的各个文本片段的特征以及参考文本的特征,通过BM25算法计算确定目标文本中的各个文本片段与参考文本的语义相似度;
对目标文本中的各个文本片段与参考文本的相关度,以及目标文本中的各个文本片段与参考文本的语义相似度进行融合处理,得到融合后的目标文本中的各个文本片段与参考文本的相关度。
作为一种可选的实施方式,所述纪要生成单元110还用于:
根据目标文本中的各个文本片段在目标文本中的位置分布,对目标文本中的各个文本片段与参考文本的相关度进行修正。
作为一种可选的实施方式,根据目标文本中的各个文本片段在目标文本中的位置分布,对目标文本中的各个文本片段与参考文本的相关度进行修正,包括:
从目标文本中的各个文本片段中,选出与参考文本的相关度最高的第二数量的文本片段;
按照目标文本中的其它文本片段与选出的第二数量的文本片段的距离越 大,则对其它文本片段与参考文本的相关度的惩罚度越高的规则,确定对目标文本中的其它文本片段与参考文本的相关度的惩罚度;
根据对目标文本中的其它文本片段与参考文本的相关度的惩罚度,对目标文本中的其它文本片段与参考文本的相关度进行惩罚。
作为一种可选的实施方式,所述纪要生成单元110还用于:
根据目标文本中的各个文本片段与参考文本的相关度,从目标文本中选出与参考文本的相关度最高的第三数量的文本片段;
根据选出的第三数量的文本片段中的各个文本片段与参考文本的相关度,从所述第三数量的文本片段中,选出与参考文本的相关度大于第一相关度阈值,或者与参考文本的相关度大于第二相关度阈值并且与参考文本的标准化相关度大于第三相关度阈值的文本片段,作为与参考文本相关的文本片段;
其中,所述第一相关度阈值大于所述第二相关度阈值,所述第二相关度阈值大于所述第三相关度阈值。
作为一种可选的实施方式,根据目标文本中的各个文本片段与参考文本的相关度,确定所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,包括:
确定生成所述目标文本的文本纪要对于所述目标文本中的各个文本片段的注意力系数;
根据生成所述目标文本的文本纪要对于所述目标文本中的各个文本片段的注意力系数,以及所述目标文本中的各个文本片段与参考文本的相关度,确定所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度。
作为一种可选的实施方式,至少根据所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,对所述目标文本的全文内容进行纪要生成处理,得到与所述参考文本对应的目标文本纪要,包括:
至少根据所述目标文本的特征,以及所述目标文本中的各个文本片段对于 生成与所述参考文本对应的文本纪要的贡献度,生成文本纪要解码特征;
根据所述文本纪要解码特征,生成所述目标文本的文本纪要。
作为一种可选的实施方式,至少根据所述目标文本的特征,以及所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,生成文本纪要解码特征,包括:
至少根据所述目标文本的特征、所述参考文本的特征,以及所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,生成文本纪要解码特征。
作为一种可选的实施方式,至少根据所述目标文本的特征、所述参考文本的特征,以及所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,生成文本纪要解码特征,包括:
至少根据所述参考文本的特征,生成参考解码特征;
根据所述参考解码特征、所述目标文本的特征,以及所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,生成文本纪要解码特征。
作为一种可选的实施方式,所述纪要生成单元110还用于:
确定所述参考文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度;
至少根据所述参考文本的特征,生成参考解码特征,包括:
根据所述参考文本的特征,以及所述参考文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,生成参考解码特征。
作为一种可选的实施方式,根据所述参考文本的特征,以及所述参考文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,生成参考解码特征,包括:
根据所述参考文本的各个文本片段的特征,以及所述参考文本中的各个文 本片段对于生成与所述参考文本对应的文本纪要的贡献度,生成参考解码特征;
根据所述参考解码特征、所述目标文本的特征,以及所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,生成文本纪要解码特征,包括:
根据所述参考解码特征、所述目标文本的各个文本片段的特征,以及所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,生成文本纪要解码特征。
作为一种可选的实施方式,根据所述目标文本的特征,以及所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,生成文本纪要解码特征,包括:
根据所述目标文本中的各个文本片段的特征,以及所述目标文本中的各个文本片段对于生成与所述参考文本对应的文本纪要的贡献度,生成文本纪要解码特征。
作为一种可选的实施方式,获取目标文本以及参考文本,基于从所述目标文本中定位所述参考文本的关联内容,对所述目标文本进行纪要生成处理,得到与所述参考文本对应的目标文本纪要,包括:
获取目标文本的特征,以及参考文本的特征;
将目标文本的特征和参考文本的特征输入预先训练的基于注意力机制的文本纪要生成模型,使所述基于注意力机制的文本纪要生成模型基于从所述目标文本中定位所述参考文本的关联内容,对所述目标文本进行纪要生成处理,得到与所述参考文本对应的目标文本纪要。
作为一种可选的实施方式,目标文本的特征,通过获取目标文本中的各个文本片段的特征而得到;参考文本的特征,通过获取参考文本的各个文本片段的特征而得到。
作为一种可选的实施方式,获取目标文本中的各个文本片段的特征,包括:
对目标文本进行文本片段划分处理,确定目标文本包含的各个文本片段;
对于目标文本中的各个文本片段,分别进行分词处理,确定各个文本片段包含的各个分词;
分别提取各个文本片段包含的各个分词的融合上下文信息的分词特征;
根据各个文本片段包含的各个分词的融合上下文信息的分词特征,确定各个文本片段的特征。
作为一种可选的实施方式,所述获取目标文本中的各个文本片段的特征,还包括:
对各个文本片段的特征进行融合编码处理,得到各个文本片段的融合上下文信息的文本片段特征。
作为一种可选的实施方式,获取参考文本的各个文本片段的特征,包括:
对参考文本进行分词处理,确定参考文本包含的各个分词;
分别提取参考文本包含的各个分词的融合上下文信息的分词特征。
作为一种可选的实施方式,当参考文本的文本数量大于1时,在对参考文本进行分词处理,确定参考文本包含的各个分词之前,还包括:
根据各条参考文本之间的关系,对各条参考文本进行合并或筛选处理。
作为一种可选的实施方式,还包括:
对目标文本的各个文本片段的特征以及参考文本的各个文本片段的特征,进行特征融合处理,得到融合参考文本特征的目标文本特征,和/或融合目标文本特征的参考文本特征。
作为一种可选的实施方式,对目标文本的各个文本片段的特征以及参考文本的各个文本片段的特征,进行特征融合处理,得到融合目标文本特征的参考文本特征,包括:
将目标文本的篇章特征和/或目标文本的各个文本片段的特征,与参考文本的各个文本片段的特征进行特征融合处理,得到融合目标文本特征的参考文本特征;
其中,目标文本的篇章特征根据目标文本的各个文本片段的特征而确定。
作为一种可选的实施方式,对目标文本的各个文本片段的特征以及参考文本的各个文本片段的特征,进行特征融合处理,得到融合参考文本特征的目标文本特征和融合目标文本特征的参考文本特征,包括:
根据目标文本的各个文本片段的特征,确定目标文本的篇章特征;
将目标文本的篇章特征,与参考文本的各个文本片段的特征进行特征融合处理,得到参考文本的各个文本片段的融合目标文本篇章特征的文本片段特征;
将参考文本的各个文本片段的融合目标文本篇章特征的文本片段特征,与目标文本的各个文本片段的特征进行特征融合处理,得到融合参考文本特征的目标文本特征和融合目标文本特征的参考文本特征。
Specifically, for the detailed operation of each part in the above embodiments of the text summary generation apparatus, please refer to the corresponding processing steps in the above embodiments of the text summary generation method, which are not repeated here.
Another embodiment of the present application further provides a text summary generation device. Referring to FIG. 10, the device includes:
a memory 200 and a processor 210;
wherein the memory 200 is connected to the processor 210 and is configured to store a program;
and the processor 210 is configured to implement the text summary generation method disclosed in any of the above embodiments by running the program stored in the memory 200.
Specifically, the text summary generation device may further include: a bus, a communication interface 220, an input device 230, and an output device 240.
The processor 210, the memory 200, the communication interface 220, the input device 230, and the output device 240 are connected to one another through the bus, wherein:
the bus may include a pathway for transferring information between the components of the computer system.
The processor 210 may be a general-purpose processor, such as a general-purpose central processing unit (CPU) or a microprocessor; an application-specific integrated circuit (ASIC); or one or more integrated circuits for controlling the execution of the program of the present solution. It may also be a digital signal processor (DSP), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The processor 210 may include a main processor, and may further include a baseband chip, a modem, and the like.
The memory 200 stores a program for executing the technical solution of the present application, and may also store an operating system and other key services. Specifically, the program may include program code, and the program code includes computer operation instructions. More specifically, the memory 200 may include a read-only memory (ROM), other types of static storage devices capable of storing static information and instructions, a random access memory (RAM), other types of dynamic storage devices capable of storing information and instructions, disk storage, flash memory, and the like.
The input device 230 may include means for receiving data and information input by a user, such as a keyboard, a mouse, a camera, a scanner, a light pen, a voice input device, a touch screen, a pedometer, or a gravity sensor.
The output device 240 may include means for outputting information to a user, such as a display screen, a printer, or a loudspeaker.
The communication interface 220 may include means using any transceiver-like device for communicating with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
The processor 210 executes the program stored in the memory 200 and invokes the other devices, so as to implement the steps of any one of the text summary generation methods provided by the above embodiments of the present application.
Another embodiment of the present application further provides a storage medium storing a computer program which, when run by a processor, implements the steps of any one of the text summary generation methods provided by the above embodiments of the present application.
Specifically, for the detailed operation of each part of the above text summary generation device, as well as the detailed processing performed when the computer program on the above storage medium is run by a processor, please refer to the embodiments of the text summary generation method described above, which are not repeated here.
As for the foregoing method embodiments, for simplicity of description they are each expressed as a series of action combinations; however, those skilled in the art should be aware that the present application is not limited by the described order of actions, since according to the present application some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also be aware that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
It should be noted that the embodiments in this specification are described in a progressive manner, each embodiment focusing on its differences from the other embodiments; for the same or similar parts, the embodiments may be referred to one another. Since the apparatus embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, refer to the description of the method embodiments.
The steps in the methods of the embodiments of the present application may be reordered, combined, and deleted according to actual needs, and the technical features described in the embodiments may be replaced or combined.
The modules and sub-modules in the apparatuses and terminals of the embodiments of the present application may be combined, divided, and deleted according to actual needs.
In the several embodiments provided in the present application, it should be understood that the disclosed terminal, apparatus, and method may be implemented in other ways. For example, the terminal embodiments described above are merely illustrative; for instance, the division into modules or sub-modules is merely a division by logical function, and other divisions are possible in actual implementation: multiple sub-modules or modules may be combined or integrated into another module, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or modules, and may be electrical, mechanical, or in other forms.
The modules or sub-modules described as separate components may or may not be physically separate, and the components presented as modules or sub-modules may or may not be physical modules or sub-modules; that is, they may be located in one place or distributed over multiple network modules or sub-modules. Some or all of the modules or sub-modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional modules or sub-modules in the embodiments of the present application may be integrated into one processing module, or each module or sub-module may exist physically on its own, or two or more modules or sub-modules may be integrated into one module. The integrated modules or sub-modules may be implemented in the form of hardware or in the form of software functional modules or sub-modules.
Those skilled in the art may further realize that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of the examples have been described above in general terms according to function. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present application.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software unit executed by a processor, or in a combination of the two. The software unit may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
Finally, it should also be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device comprising that element.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (31)

  1. A text summary generation method, characterized by comprising:
    acquiring a target text and a reference text, wherein the reference text is determined based on content of the target text that a user is concerned with;
    performing summary generation processing on the target text based on locating, in the target text, content associated with the reference text, to obtain a target text summary corresponding to the reference text.
  2. The method according to claim 1, wherein performing summary generation processing on the target text based on locating, in the target text, the content associated with the reference text, to obtain the target text summary corresponding to the reference text, comprises:
    performing summary generation processing on the full content of the target text based on locating, in the target text, text segments related to the reference text, to obtain the target text summary corresponding to the reference text.
  3. The method according to claim 2, wherein performing summary generation processing on the full content of the target text based on locating, in the target text, the text segments related to the reference text, to obtain the target text summary corresponding to the reference text, comprises:
    locating the text segments related to the reference text in the target text by determining the relevance of each text segment in the target text to the reference text;
    performing summary generation processing on the full content of the target text at least based on the relevance, to the reference text, of each reference-related text segment in the target text, to obtain the target text summary corresponding to the reference text.
  4. The method according to claim 3, wherein performing summary generation processing on the full content of the target text at least based on the relevance, to the reference text, of each reference-related text segment in the target text, to obtain the target text summary corresponding to the reference text, comprises:
    determining, according to the relevance of each text segment in the target text to the reference text, the contribution of each text segment in the target text to generating the text summary corresponding to the reference text;
    performing summary generation processing on the full content of the target text at least according to the contribution of each text segment in the target text to generating the text summary corresponding to the reference text, to obtain the target text summary corresponding to the reference text.
  5. The method according to claim 3, wherein determining the relevance of each text segment in the target text to the reference text comprises:
    separately acquiring features of the target text and features of the reference text;
    determining the relevance of each text segment in the target text to the reference text according to the features of each text segment in the target text and the features of the reference text.
  6. The method according to claim 5, wherein, after the features of the target text and the features of the reference text are separately acquired, the method further comprises:
    performing an attention-based interaction operation on the features of each text segment in the target text and the features of the reference text, to obtain information-enriched reference text features.
  7. The method according to claim 6, wherein performing the attention-based interaction operation on the features of each text segment in the target text and the features of the reference text, to obtain the information-enriched reference text features, comprises:
    computing the similarity of each text segment in the target text to the reference text according to the features of each text segment in the target text and the features of the reference text;
    selecting, from the target text and according to the similarity of each text segment in the target text to the reference text, a first number of text segments with the highest similarity to the reference text;
    performing an attention-based interaction operation on the features of the first number of text segments selected from the target text and the features of the reference text, to obtain the information-enriched reference text features.
  8. The method according to claim 5, wherein, when the number of reference texts is greater than 1, determining the relevance of each text segment in the target text to the reference texts according to the features of each text segment in the target text and the features of the reference texts comprises:
    for each text segment in the target text, determining its relevance to the reference texts through the following processing:
    determining the relevance of the text segment to each reference text according to the features of the text segment and the features of each reference text;
    fusing the relevance of the text segment to each reference text according to the relationship between the reference texts, to determine the relevance of the text segment to the reference texts.
  9. The method according to claim 5, wherein the method further comprises:
    computing the semantic similarity of each text segment in the target text to the reference text through the BM25 algorithm, according to the features of each text segment in the target text and the features of the reference text;
    fusing the relevance of each text segment in the target text to the reference text with the semantic similarity of each text segment in the target text to the reference text, to obtain a fused relevance of each text segment in the target text to the reference text.
  10. The method according to claim 9, wherein the method further comprises:
    correcting the relevance of each text segment in the target text to the reference text according to the position distribution of the text segments within the target text.
  11. The method according to claim 10, wherein correcting the relevance of each text segment in the target text to the reference text according to the position distribution of the text segments within the target text comprises:
    selecting, from the text segments in the target text, a second number of text segments with the highest relevance to the reference text;
    determining penalties on the relevance of the other text segments in the target text to the reference text, according to the rule that the greater the distance of an other text segment from the selected second number of text segments, the higher the penalty on its relevance to the reference text;
    penalising the relevance of the other text segments in the target text to the reference text according to the determined penalties.
  12. The method according to claim 11, wherein the method further comprises:
    selecting, from the target text and according to the relevance of each text segment in the target text to the reference text, a third number of text segments with the highest relevance to the reference text;
    selecting, from the third number of text segments and according to the relevance of each of them to the reference text, the text segments whose relevance to the reference text is greater than a first relevance threshold, or whose relevance to the reference text is greater than a second relevance threshold and whose normalised relevance to the reference text is greater than a third relevance threshold, as the text segments related to the reference text;
    wherein the first relevance threshold is greater than the second relevance threshold, and the second relevance threshold is greater than the third relevance threshold.
  13. The method according to claim 4, wherein determining, according to the relevance of each text segment in the target text to the reference text, the contribution of each text segment in the target text to generating the text summary corresponding to the reference text comprises:
    determining attention coefficients of the generation of the text summary of the target text over the text segments in the target text;
    determining the contribution of each text segment in the target text to generating the text summary corresponding to the reference text, according to those attention coefficients and the relevance of each text segment in the target text to the reference text.
  14. The method according to claim 4, wherein performing summary generation processing on the full content of the target text at least according to the contribution of each text segment in the target text to generating the text summary corresponding to the reference text, to obtain the target text summary corresponding to the reference text, comprises:
    generating text summary decoding features at least according to the features of the target text and the contribution of each text segment in the target text to generating the text summary corresponding to the reference text;
    generating the text summary of the target text according to the text summary decoding features.
  15. The method according to claim 14, wherein generating the text summary decoding features at least according to the features of the target text and the contribution of each text segment in the target text to generating the text summary corresponding to the reference text comprises:
    generating the text summary decoding features at least according to the features of the target text, the features of the reference text, and the contribution of each text segment in the target text to generating the text summary corresponding to the reference text.
  16. The method according to claim 15, wherein generating the text summary decoding features at least according to the features of the target text, the features of the reference text, and the contribution of each text segment in the target text to generating the text summary corresponding to the reference text comprises:
    generating reference decoding features at least according to the features of the reference text;
    generating the text summary decoding features according to the reference decoding features, the features of the target text, and the contribution of each text segment in the target text to generating the text summary corresponding to the reference text.
  17. The method according to claim 16, wherein the method further comprises:
    determining the contribution of each text segment in the reference text to generating the text summary corresponding to the reference text;
    and generating the reference decoding features at least according to the features of the reference text comprises:
    generating the reference decoding features according to the features of the reference text and the contribution of each text segment in the reference text to generating the text summary corresponding to the reference text.
  18. The method according to claim 17, wherein generating the reference decoding features according to the features of the reference text and the contribution of each text segment in the reference text to generating the text summary corresponding to the reference text comprises:
    generating the reference decoding features according to the features of each text segment of the reference text and the contribution of each text segment in the reference text to generating the text summary corresponding to the reference text;
    and generating the text summary decoding features according to the reference decoding features, the features of the target text, and the contribution of each text segment in the target text to generating the text summary corresponding to the reference text comprises:
    generating the text summary decoding features according to the reference decoding features, the features of each text segment of the target text, and the contribution of each text segment in the target text to generating the text summary corresponding to the reference text.
  19. The method according to claim 14, wherein generating the text summary decoding features according to the features of the target text and the contribution of each text segment in the target text to generating the text summary corresponding to the reference text comprises:
    generating the text summary decoding features according to the features of each text segment in the target text and the contribution of each text segment in the target text to generating the text summary corresponding to the reference text.
  20. The method according to claim 1, wherein acquiring the target text and the reference text, and performing summary generation processing on the target text based on locating, in the target text, the content associated with the reference text, to obtain the target text summary corresponding to the reference text, comprises:
    acquiring features of the target text and features of the reference text;
    inputting the features of the target text and the features of the reference text into a pre-trained attention-based text summary generation model, so that the attention-based text summary generation model performs summary generation processing on the target text based on locating, in the target text, the content associated with the reference text, to obtain the target text summary corresponding to the reference text.
  21. The method according to any one of claims 1 to 20, wherein the features of the target text are obtained by acquiring the features of each text segment in the target text, and the features of the reference text are obtained by acquiring the features of each text segment of the reference text.
  22. The method according to claim 21, wherein acquiring the features of each text segment in the target text comprises:
    performing text segment division on the target text to determine the text segments contained in the target text;
    performing word segmentation on each text segment in the target text to determine the tokens contained in each text segment;
    extracting, for each token contained in each text segment, a token feature fused with context information;
    determining the features of each text segment according to the context-fused token features of the tokens it contains.
  23. The method according to claim 22, further comprising:
    performing fusion encoding on the features of the text segments to obtain, for each text segment, a text segment feature fused with context information.
  24. The method according to claim 21, wherein acquiring the features of each text segment of the reference text comprises:
    performing word segmentation on the reference text to determine the tokens contained in the reference text;
    extracting, for each token contained in the reference text, a token feature fused with context information.
  25. The method according to claim 24, wherein, when the number of reference texts is greater than 1, before the word segmentation of the reference texts to determine the tokens they contain, the method further comprises:
    merging or filtering the reference texts according to the relationship between them.
  26. The method according to claim 21, wherein the method further comprises:
    performing feature fusion on the features of the text segments of the target text and the features of the text segments of the reference text, to obtain target text features fused with reference text features and/or reference text features fused with target text features.
  27. The method according to claim 26, wherein performing feature fusion on the features of the text segments of the target text and the features of the text segments of the reference text, to obtain the reference text features fused with target text features, comprises:
    performing feature fusion on discourse features of the target text and/or the features of the text segments of the target text, together with the features of the text segments of the reference text, to obtain the reference text features fused with target text features;
    wherein the discourse features of the target text are determined according to the features of the text segments of the target text.
  28. The method according to claim 26, wherein performing feature fusion on the features of the text segments of the target text and the features of the text segments of the reference text, to obtain the target text features fused with reference text features and the reference text features fused with target text features, comprises:
    determining the discourse features of the target text according to the features of its text segments;
    performing feature fusion on the discourse features of the target text and the features of the text segments of the reference text, to obtain, for each text segment of the reference text, a text segment feature fused with the target text discourse features;
    performing feature fusion on those fused text segment features of the reference text and the features of the text segments of the target text, to obtain the target text features fused with reference text features and the reference text features fused with target text features.
  29. A text summary generation apparatus, characterized by comprising:
    a data acquisition unit configured to acquire a target text and a reference text, wherein the reference text is determined based on content of the target text that a user is concerned with;
    a summary generation unit configured to perform summary generation processing on the target text based on locating, in the target text, content associated with the reference text, to obtain a target text summary corresponding to the reference text.
  30. A text summary generation device, characterized by comprising:
    a memory and a processor;
    wherein the memory is connected to the processor and is configured to store a program;
    and the processor is configured to implement the text summary generation method according to any one of claims 1 to 28 by running the program in the memory.
  31. A storage medium, characterized in that a computer program is stored on the storage medium, and when the computer program is run by a processor, the text summary generation method according to any one of claims 1 to 28 is implemented.
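Purely as an illustration of the relevance pipeline described in claims 9 to 11 (fusing a model relevance score with a BM25-style semantic similarity, then penalising segments that lie far from the top-ranked ones) — every constant, weight, and function name below is an assumption, not part of the claimed subject matter:

```python
def fuse_relevance(model_rel, bm25_sim, alpha=0.5):
    # Claim 9: combine the model relevance with a BM25 semantic similarity
    # via a simple convex combination (alpha is an assumed fusion weight).
    return [alpha * m + (1 - alpha) * b for m, b in zip(model_rel, bm25_sim)]

def position_penalty(relevance, top_k=2, decay=0.1):
    # Claims 10-11: keep the top_k most relevant segments untouched, and
    # penalise the other segments more heavily the farther they sit from
    # the nearest top-ranked segment in the target text.
    order = sorted(range(len(relevance)), key=lambda i: relevance[i], reverse=True)
    top = set(order[:top_k])
    corrected = []
    for i, r in enumerate(relevance):
        if i in top:
            corrected.append(r)
        else:
            dist = min(abs(i - t) for t in top)
            corrected.append(r * max(0.0, 1.0 - decay * dist))
    return corrected
```

Threshold selection as in claim 12 would then be applied to the corrected scores to pick the final reference-related segments.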
PCT/CN2022/133167 2021-12-30 2022-11-21 Text summary generation method, apparatus, device and storage medium WO2023124648A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111667181.X 2021-12-30
CN202111667181.XA CN114328899A (zh) Text summary generation method, apparatus, device and storage medium

Publications (1)

Publication Number Publication Date
WO2023124648A1 true WO2023124648A1 (zh) 2023-07-06

Family

ID=81021718

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/133167 WO2023124648A1 (zh) 2021-12-30 2022-11-21 一种文本纪要生成方法、装置、设备及存储介质

Country Status (2)

Country Link
CN (1) CN114328899A (zh)
WO (1) WO2023124648A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114281948A (zh) * 2021-12-30 2022-04-05 安徽听见科技有限公司 Summary determination method and related device
CN114328899A (zh) * 2021-12-30 2022-04-12 科大讯飞股份有限公司 Text summary generation method, apparatus, device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111723205A (zh) * 2020-06-18 2020-09-29 中国银行股份有限公司 Conference minutes processing method and apparatus, and conference minutes processing device
CN112861510A (zh) * 2021-02-08 2021-05-28 北京字跳网络技术有限公司 Minutes processing method, apparatus, device and storage medium
US20210375289A1 (en) * 2020-05-29 2021-12-02 Microsoft Technology Licensing, Llc Automated meeting minutes generator
CN113806554A (zh) * 2021-09-14 2021-12-17 上海云思智慧信息技术有限公司 Knowledge graph construction method for massive conference texts
CN114328899A (zh) * 2021-12-30 2022-04-12 科大讯飞股份有限公司 Text summary generation method, apparatus, device and storage medium


Also Published As

Publication number Publication date
CN114328899A (zh) 2022-04-12

Similar Documents

Publication Publication Date Title
US11615799B2 (en) Automated meeting minutes generator
US11594221B2 (en) Transcription generation from multiple speech recognition systems
US11145312B2 (en) Switching between speech recognition systems
WO2021232725A1 (zh) 基于语音交互的信息核实方法、装置、设备和计算机存储介质
US10672383B1 (en) Training speech recognition systems using word sequences
US20220122587A1 (en) Training of speech recognition systems
US11545156B2 (en) Automated meeting minutes generation service
WO2023124648A1 (zh) 一种文本纪要生成方法、装置、设备及存储介质
CN115238101B (zh) 一种面向多类型知识库的多引擎智能问答系统
Hahn et al. Comparing stochastic approaches to spoken language understanding in multiple languages
US20180197548A1 (en) System and method for diarization of speech, automated generation of transcripts, and automatic information extraction
US6484136B1 (en) Language model adaptation via network of similar users
CN110717031A (zh) 一种智能会议纪要生成方法和系统
US20160163318A1 (en) Metadata extraction of non-transcribed video and audio streams
JPWO2005122144A1 (ja) 音声認識装置、音声認識方法、及びプログラム
WO2020077825A1 (zh) 论坛社区应用管理方法、装置、设备及可读存储介质
CN107480152A (zh) 一种音频分析及检索方法和系统
CN115827854B (zh) 语音摘要生成模型训练方法、语音摘要生成方法及装置
WO2023124647A1 (zh) 一种纪要确定方法及其相关设备
WO2023035529A1 (zh) 基于意图识别的信息智能查询方法、装置、设备及介质
CN115831117A (zh) 实体识别方法、装置、计算机设备和存储介质
Wang Mandarin spoken document retrieval based on syllable lattice matching
Komatani et al. Efficient dialogue strategy to find users’ intended items from information query results
US11934439B1 (en) Similar cases retrieval in real time for call center agents
CN112820274B (zh) 一种语音信息识别校正方法和系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22913906

Country of ref document: EP

Kind code of ref document: A1