CN113688231A - Abstract extraction method and apparatus for answer text, electronic device, and medium


Info

Publication number: CN113688231A
Application number: CN202110881696.3A
Authority: CN (China)
Legal status: Pending
Prior art keywords: text, answer text, model, answer, length
Other languages: Chinese (zh)
Inventors: 花新宇, 代文, 陈帅
Applicant/Assignee: Beijing Xiaomi Mobile Software Co Ltd; Beijing Xiaomi Pinecone Electronic Co Ltd

Classifications

    • G06F 16/345: Information retrieval of unstructured textual data; browsing/visualisation; summarisation for human users
    • G06F 16/3334: Information retrieval of unstructured textual data; querying; query processing; query translation; selection or weighting of terms from queries, including natural language queries
    • G06F 16/35: Information retrieval of unstructured textual data; clustering; classification
    • G06F 40/205: Handling natural language data; natural language analysis; parsing
    • G06N 3/044: Computing arrangements based on biological models; neural networks; architecture; recurrent networks, e.g. Hopfield networks
    • G06N 3/08: Computing arrangements based on biological models; neural networks; learning methods

Abstract

An embodiment of the application provides a method, an apparatus, an electronic device, and a medium for extracting a summary of an answer text. The method includes: acquiring a first answer text with a first text length; determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text; and processing the first answer text with the target model to obtain a second answer text with a second text length, where the second text length is shorter than the first text length and the second answer text has the same meaning as the first answer text.

Description

Abstract extraction method and apparatus for answer text, electronic device, and medium
Technical Field
The present application relates to the field of natural language processing technologies, and in particular, to a method and an apparatus for extracting an abstract of an answer text, an electronic device, and a medium.
Background
With the popularization of the internet, people increasingly search the internet for answers to their questions. However, the internet returns a large amount of matching answer information, so a user needs considerable time to browse it and identify the effective answer, which leads to a poor user experience.
Therefore, extracting a summary of the answer information becomes important.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide a method and an apparatus for extracting an abstract of an answer text, an electronic device, and a storage medium.
According to a first aspect of the embodiments of the present disclosure, there is provided a method for extracting a summary of an answer text, including:
acquiring a first answer text with a first text length;
determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text;
and processing the first answer text by using the target model to obtain a second answer text with a second text length, wherein the second text length is shorter than the first text length, and the second answer text has the same meaning as the first answer text.
In one embodiment, the candidate models include at least one of:
a generative model for generating the second answer text based on the content of the first answer text;
an extraction model, for extracting at least one keyword and/or key sentence already existing in the first answer text to form the second answer text;
and a comprehensive model, which comprises an extraction model and a generative model arranged in sequence and is used for forming the second answer text after the first answer text is processed by the extraction model and then by the generative model.
In one embodiment, the determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text includes at least one of:
if the text type of the first answer text is a first text type and the length of the first text is within a first interval range, determining the target model as the generative model;
if the first answer text is of a first text type and the length of the first text is within a second interval range, determining that the target model is the extraction model;
if the first answer text is of a first text type and the length of the first text is within a third interval range, determining that the target model is the comprehensive model;
wherein the minimum value of the second interval range is greater than or equal to the maximum value of the first interval range, and the minimum value of the third interval range is greater than or equal to the maximum value of the second interval range.
In one embodiment, the determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text further includes:
and if the first answer text is of a second text type or a third text type and the first text length is within a fourth interval range, determining that the target model is the extraction model.
In one embodiment, the generative model is configured to generate the second answer text based on the content of the first answer text, and includes:
performing word segmentation processing on the first answer text, and determining a word segmentation position in the first answer text;
inserting a predetermined separator at a word segmentation position in the first answer text;
and inputting the first answer text inserted with the separator into the generative model to obtain the second answer text.
In one embodiment, the generative model is a language model based on Bidirectional Encoder Representations from Transformers (BERT).
In one embodiment, the extraction model extracts at least one keyword and/or key sentence already existing in the first answer text to form the second answer text, and this includes:
splitting the first answer text into N sentences, and selecting M sentences from the N sentences, wherein M is not more than N;
determining candidate keywords of the M sentences;
sorting the importance of the candidate keywords;
selecting a preset number of candidate keywords with high importance as keywords;
and forming the second answer text based on the keywords.
According to a second aspect of the embodiments of the present disclosure, there is provided an apparatus for extracting a summary of an answer text, including:
the acquisition module is used for acquiring a first answer text with a first text length;
the model determining module is used for determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text;
and the processing module is used for processing the first answer text by using the target model to obtain a second answer text with a second text length, wherein the second text length is shorter than the first text length, and the second answer text has the same subject meaning as the first answer text.
In one embodiment, the candidate models include at least one of:
a generative model for generating the second answer text based on the content of the first answer text;
an extraction model, which is used for extracting at least one keyword and/or key sentence existing in the first answer text to form a second answer text;
and a comprehensive model, which comprises an extraction model and a generative model arranged in sequence and is used for forming the second answer text after the first answer text is processed by the extraction model and then by the generative model.
In one embodiment, the generative model comprises:
the processing unit is used for performing word segmentation processing on the first answer text and determining the word segmentation position in the first answer text;
a separator insertion unit, for inserting a predetermined separator at a word segmentation position in the first answer text;
and the generating unit is used for inputting the first answer text inserted with the separator into the generative model to obtain the second answer text.
In one embodiment, the generative model is a language model based on Bidirectional Encoder Representations from Transformers (BERT).
In one embodiment, the extraction model comprises:
the selecting unit is used for splitting the first answer text into N sentences and selecting M sentences from the N sentences, wherein M is not more than N;
a candidate keyword determining unit for determining candidate keywords of the M sentences;
the sorting unit is used for sorting the importance of the candidate keywords;
the selection unit is used for selecting a preset number of candidate keywords with high importance as keywords;
and the word connection unit is used for forming the second answer text based on the keywords.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic device including the apparatus of the second aspect.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing executable instructions for causing a processor to implement the method of the first aspect when executed.
According to the method, apparatus, electronic device, and storage medium for extracting a summary of an answer text provided by the embodiments of the application, different candidate models are used to process first answer texts of different text types and/or text lengths, and the resulting second answer text serves as the summary of the answer text. This improves the accuracy of summary extraction and improves the efficiency and experience with which a user obtains information.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of embodiments of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the embodiments of the disclosure.
Fig. 1 is a flowchart illustrating a first method for abstracting an abstract of an answer text according to an exemplary embodiment;
FIG. 2 is a flowchart illustrating a second method for abstracting a summary of answer text, according to an exemplary embodiment;
fig. 3 is a flowchart illustrating a third method for abstracting a summary of answer text according to an exemplary embodiment;
fig. 4 is a flowchart illustrating a fourth method for abstracting a summary of answer text according to an exemplary embodiment;
fig. 5 is a flowchart illustrating a fifth method for abstracting a summary of answer text according to an exemplary embodiment;
fig. 6 is a flowchart illustrating a sixth method for abstracting a summary of answer text according to an exemplary embodiment;
FIG. 7 is a schematic diagram illustrating an extraction model according to an exemplary embodiment;
FIG. 8 is a diagram illustrating a generative model in accordance with an exemplary embodiment;
FIG. 9 is a diagram illustrating an integrated model in accordance with an exemplary embodiment;
FIG. 10 is a schematic diagram of another generative model shown in accordance with an exemplary embodiment;
fig. 11 is a schematic structural diagram illustrating a first answer text abstract extracting apparatus according to an exemplary embodiment;
fig. 12 is a schematic structural diagram illustrating a second answer text abstract extracting apparatus according to an exemplary embodiment;
fig. 13 is a schematic structural diagram illustrating a third apparatus for extracting a summary of an answer text according to an exemplary embodiment;
fig. 14 is a schematic structural diagram illustrating a fourth answer text abstract extracting apparatus according to an exemplary embodiment;
fig. 15 is a block diagram illustrating a structure of an apparatus for extracting a summary of an answer text according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with embodiments of the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the disclosed embodiments, as detailed in the appended claims.
The terminology used in the embodiments of the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the present disclosure. As used in the disclosed embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information in the embodiments of the present disclosure, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of embodiments of the present disclosure. The word "if" as used herein may be interpreted as "upon", "when", or "in response to a determination", depending on the context.
As shown in fig. 1, the present exemplary embodiment provides a method for extracting a summary of an answer text, including:
step S101: acquiring a first answer text with a first text length;
step S103: determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text;
step S104: and processing the first answer text by using the target model to obtain a second answer text with a second text length, wherein the second text length is shorter than the first text length, and the second answer text has the same meaning as the first answer text.
In this embodiment, the method for abstracting the abstract of the answer text may be performed by an abstract abstracting device of the answer text, which may be integrated in an electronic device (e.g., a server), and may be implemented in hardware and/or software.
In this embodiment, the first answer text may be answer information provided by the interactive system based on the question information, where the interactive system may be an intelligent interactive system such as an intelligent terminal, a platform, an application, and a client capable of providing an intelligent interactive service, for example, an intelligent sound box, an intelligent video sound box, an intelligent story machine, an intelligent interactive platform, an intelligent interactive application, a search engine, a question and answer system, and the like. For example, the interactive system performs matching through a preset question-answer library based on the input question information, and takes the answer with the highest matching degree as the answer information corresponding to the question information.
As shown in fig. 2, before step S103, the method further includes:
step S102: determining a text type of the first answer text.
Here, the text type may be determined according to the first answer text and/or the question information corresponding to the first answer text, and the text type includes a fact type text, a reason type text, a method type text, and the like.
In some possible embodiments, a question library may be established in advance, the sentence similarity between the question information corresponding to the first answer text and all question sentences in the question library is calculated, and finally, the question type corresponding to the question sentence with the highest similarity is used as the text type of the first answer text.
For example, different types of question statements are predefined in the question bank, and the question statements may include fact question statements, cause question statements, and method question statements. The fact question statements may include: "what", "who", "what is it called", "where", "what does it include", etc.; the cause question statements may include: "why", "what is the reason", etc.; the method question statements may include: "how", "method", "manner", etc.
It is to be understood that the contents included in the various types of question sentences given above are only some examples of building a question bank, and may be defined according to the actual application field and scene, etc., when applied specifically.
In other possible embodiments, the answer library may be pre-established, sentence similarity between the first answer text and all answer sentences in the answer library is calculated, and finally the answer type corresponding to the answer sentence with the highest similarity is used as the text type of the first answer text.
For example, different types of answer sentences are predefined in the answer library, and the answer sentences may include: a factual answer sentence, a causal answer sentence, and a methodological answer sentence, wherein the factual answer sentence may include: "(person, place, or entity) is", "(person, place, or entity) is inclusive, and the like; the causal answer sentence may include: "reason is," "because," etc. represent words of reason; the method type answer sentence may include: words such as "method is," "manner is," "step is," etc. that indicate a manner.
It is understood that the contents included in the various types of answer sentences given above are only some examples of the construction of the answer library, and may be defined according to the actual application field and scenario in a specific application.
It is to be understood that the text type of the first answer text may also be determined in other ways according to the needs of the practical application, and the embodiment is not limited thereto.
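As a non-authoritative illustration only, the sketch below shows one way such a similarity-based type decision could look; the tiny question bank, the TF-IDF cosine similarity measure, and the function name classify_text_type are assumptions rather than details taken from this disclosure.

```python
# Hypothetical sketch: determine the text type of a first answer text by comparing
# its question to a small question bank with TF-IDF cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

QUESTION_BANK = {  # assumed example entries, loosely following the description above
    "fact": ["what is", "who is", "where is", "what does it include"],
    "cause": ["why", "what is the reason"],
    "method": ["how to do it", "what is the method", "what are the steps"],
}

def classify_text_type(question: str) -> str:
    """Return the question type whose bank sentence is most similar to `question`."""
    sentences, labels = [], []
    for label, examples in QUESTION_BANK.items():
        sentences.extend(examples)
        labels.extend([label] * len(examples))
    vectorizer = TfidfVectorizer().fit(sentences + [question])
    similarities = cosine_similarity(
        vectorizer.transform([question]), vectorizer.transform(sentences)
    )[0]
    return labels[int(similarities.argmax())]

print(classify_text_type("why does a typhoon form"))  # expected: "cause"
```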
After determining the text type of the first answer text, in step S103, a target model is determined from a plurality of candidate models according to the text type and/or the first text length of the first answer text.
Here, since the text type reflects the constituent features of the text to a certain extent, and meanwhile, on the premise of the same target model, the text length directly determines the complexity of model processing, the target model for extracting the answer text is selected in consideration of at least one dimension of the text type and the text length.
In the present embodiment, a candidate model may be a model implementing a text summarization algorithm, for example, an extraction model and/or a generative model; the extraction models include models implementing the Lead-3 algorithm, the TextRank algorithm, the LDA algorithm, the CNN algorithm, and/or the BERT algorithm, and the generative models include models implementing the LSTM algorithm, the ConvS2S algorithm, and/or the UniLM algorithm.
In some possible embodiments, the target model may be determined from a plurality of candidate models directly according to a text type of the first answer text.
For example, for the fact type text or the cause type text, since the subject meaning of the fact or cause to be stated is mainly concentrated in the first few sentences or the last few sentences of the first answer text, when the target model is determined from the candidate models, a model that mainly uses the first few sentences or the last few sentences of the first answer text as the second answer text is determined. For instance, if the determined target model is the extraction model, the extraction model processes the first answer text, and the obtained second answer text consists of the first few sentences or the last few sentences of the first answer text.
For another example, in the case of a method type text, the meaning of the subject of the method to be stated is dispersed throughout the first answer text, and therefore, when the target model is determined from the candidate models, a model in which the keywords/sentences of the first answer text are summarized as the second answer text is determined. For example, the determined target model is a generative model, the first answer text is processed by the generative model, and the obtained second answer text is a summary of the meaning of the subject matter of the first answer text.
In other possible embodiments, the target model may be determined from a plurality of candidate models directly according to the first text length of the first answer text.
For example, if the length of the first text is smaller than the first preset threshold, when the target model is determined from the candidate models, it is determined that the keyword/sentence of the first answer text is summarized into the model of the second answer text. For example, the determined target model is a generative model, the first answer text is processed by the generative model, and the obtained second answer text is a summary of the meaning of the subject matter of the first answer text.
For another example, if the first text length is greater than the first preset threshold, when the target model is determined from the candidate models, a model that mainly extracts keywords/sentences from the first answer text to form the second answer text is determined. If the determined target model is the extraction model, the extraction model processes the first answer text, and the obtained second answer text is composed of the extracted keywords/sentences.
In still other possible embodiments, the target model may be determined from a plurality of candidate models in combination with a text type and a text length of the first answer text.
For example, for method-type texts with a first text length greater than a first preset threshold, when the target model is determined from the candidate models, first, a first model for extracting keywords/sentences from the first answer text is determined, an intermediate answer text is generated through the first model processing, and then, a second model for summarizing the intermediate answer text into a second answer text is determined. Finally, the target model comprises a first model and a second model which are arranged in sequence.
It will be appreciated that the above are only examples of determining the target model from a plurality of candidate models according to text type and/or text length, and that in a specific application, a suitable target model may be selected according to the actual application requirements.
And after the target model is determined, processing the first answer text by using the target model to obtain a second answer text with a second text length.
Namely, the first answer text is used as the input of the target model, and the second answer text is output after the target model is processed. The second answer text is an abstract of the first answer text, so that the second text length of the second answer text is shorter than the first text length of the first answer text, and the second answer text has the same meaning as the first answer text.
According to the above method for extracting a summary of an answer text, different candidate models are used to process first answer texts of different text types and/or text lengths, and the resulting second answer text is used as the summary of the first answer text, which improves the accuracy of summary extraction and improves the efficiency and experience with which a user obtains information.
In some possible embodiments, before processing the first answer text using the target model in step S103, the method further includes:
and filtering the first answer text.
Specifically, the noise information included in the first answer text is filtered, where the noise information may include: hypertext markup language tags, scrambled characters, punctuation, and/or spoken words and sentences, and the like. By filtering the first answer text, the accuracy of the extracted abstract can be effectively improved.
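A minimal sketch of such a filtering step is given below, assuming simple regular-expression rules; the patterns and the function name filter_answer_text are illustrative assumptions.

```python
# Illustrative pre-filtering of the first answer text: strip hypertext markup
# language tags, entities, and repeated punctuation before summary extraction.
import re

def filter_answer_text(text: str) -> str:
    text = re.sub(r"<[^>]+>", "", text)              # drop HTML tags
    text = re.sub(r"&[a-zA-Z]+;|&#\d+;", " ", text)  # drop HTML entities
    text = re.sub(r"[!?。！？]{2,}", lambda m: m.group(0)[0], text)  # collapse repeated punctuation
    text = re.sub(r"\s+", " ", text)                 # collapse whitespace
    return text.strip()

print(filter_answer_text("<p>A typhoon is&nbsp;one kind of tropical cyclone!!!</p>"))
```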
In some possible embodiments, the candidate models include at least one of:
a generative model for generating the second answer text based on the content of the first answer text; for example, the generative model rewrites and reforms the first answer text based on an understanding of the semantics of the first answer text, generating a more concise and more general second answer text.
an extraction model, for extracting at least one keyword and/or key sentence already existing in the first answer text to form the second answer text; for example, the extraction model directly selects important phrases or sentences (i.e., keywords or key sentences) from the first answer text and combines these keywords and/or key sentences to form the second answer text.
And a comprehensive model, which comprises an extraction model and a generative model arranged in sequence and is used for forming the second answer text after the first answer text is processed by the extraction model and then by the generative model. For example, the first answer text is first processed by the extraction model to obtain an intermediate answer text, and the intermediate answer text is then processed by the generative model to obtain the second answer text.
In this embodiment, a third text length of the intermediate answer text is not less than a second preset threshold, where the third text length is K times the second text length, and K is greater than 1.
In some possible embodiments, if the determined target model is a generative model and the second answer text is generated based on the generative model, the method further comprises:
judging whether the semantics of the second answer text and the first answer text are consistent by using a semantic matching model;
and if not, processing the first answer text by using an extraction model to obtain a third answer text, and outputting the third answer text.
In some possible embodiments, when determining the target model from the plurality of candidate models, at least one of the following ways may be employed:
if the text type of the first answer text is a first text type and the length of the first text is within a first interval range, determining the target model as the generative model;
if the first answer text is of a first text type and the length of the first text is within a second interval range, determining that the target model is the extraction model;
if the first answer text is of a first text type and the length of the first text is within a third interval range, determining that the target model is the comprehensive model;
wherein the minimum value of the second interval range is greater than or equal to the maximum value of the first interval range, and the minimum value of the third interval range is greater than or equal to the maximum value of the second interval range.
In this embodiment, when the text type of the first answer text is the first text type, a suitable candidate model is adopted as the target model according to the first text length of the first answer text.
Here, the first, second, and third section ranges are divided based on the size of the text length, and the target model is determined according to the divided section ranges. For example, the first interval range is not more than 300 words, the second interval range is more than 300 words and less than 1000 words, and the third interval range is not less than 1000 words.
The first text type may be factual text or causal text, preferably factual text.
Specifically, for example, when the first answer text is a fact type text, and the length of the first text does not exceed 300 words, the target model is determined to be a generative model, and the generative model rewrites and recombines the first answer text based on understanding of the semantics of the first answer text to generate a more concise and more general second answer text.
For another example, when the first answer text is a factual text and the length of the first text is not less than 1000 characters, the target model is determined to be a comprehensive model, the first answer text is processed by an extraction model to obtain an intermediate answer text, and then the intermediate answer text is processed by a generation model to obtain a second answer text.
In some possible embodiments, if the first answer text is of the second text type or the third text type and the first text length is within a fourth interval range, the target model is determined to be the extraction model.
In this embodiment, when the text type of the first answer text is the second text type or the third text type, a suitable candidate model is adopted as the target model according to the first text length of the first answer text.
Here, the fourth, fifth, and sixth section ranges are divided based on the size of the text length, and the target model is determined according to the divided section ranges. Wherein the minimum value of the fourth interval range is greater than or equal to the maximum value of the fifth interval range, and the minimum value of the sixth interval range is greater than or equal to the maximum value of the fourth interval range. For example, the fifth interval range is not more than 500 words, the fourth interval range is more than 500 words and less than 1000 words, and the sixth interval range is not less than 1000 words.
It is to be understood that the fourth, fifth and sixth section ranges divided here are not directly associated with the first, second and third section ranges divided for the first text type.
The second text type and the third text type may be cause type text or method type text.
Specifically, for example, when the first answer text is a method-type text, and the length of the first text is greater than 500 words and less than 1000 words, the target model is determined to be an extraction model, the extraction model directly selects important phrases or sentences (i.e., keywords or key sentences) from the first answer text, and combines the keywords and/or key sentences to form the second answer text.
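The selection rule described above can be pictured with the following sketch; the threshold values 300, 500, and 1000 follow the example interval boundaries given in this text, the string labels stand in for actual model objects, and the fallback branch for lengths outside the fourth interval range is an assumption.

```python
# Sketch of the type/length dispatch described above; not a definitive implementation.
def choose_target_model(text_type: str, text_length: int) -> str:
    if text_type == "fact":                  # first text type
        if text_length <= 300:               # first interval range
            return "generative_model"
        if text_length < 1000:               # second interval range
            return "extraction_model"
        return "comprehensive_model"         # third interval range
    # second or third text type (cause type or method type text)
    if 500 < text_length < 1000:             # fourth interval range
        return "extraction_model"
    return "extraction_model"                # fallback outside the fourth range (assumption)
```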
In some possible embodiments, as shown in fig. 3, the generating model forms a second answer text based on the first answer text, including:
step S201: performing word segmentation processing on the first answer text, and determining a word segmentation position in the first answer text;
step S202: inserting a predetermined separator at a word segmentation position in the first answer text;
step S203: and inputting the first answer text inserted with the separator into the generative model to obtain the second answer text.
In step S201, word segmentation is performed on the first answer text, and the word segmentation positions in the first answer text may be recorded; one word lies between two adjacent word segmentation positions. Meanwhile, following the BERT convention, a [CLS] symbol may be placed at the beginning of a sentence in the first answer text, a [SEP] symbol may be placed between two adjacent sentences, and a [SEP] symbol may be placed at the end of the first answer text.
For example, the first answer text is "typhoon belongs to one of tropical cyclones, which are low-pressure vortices that occur on tropical or sub-tropical ocean surfaces, a powerful and deep" tropical weather system ". … …, wherein wind forces of up to 12 or more levels are collectively referred to as typhoons. ", then by word segmentation of the above first answer text," [ CLS ] typhoon/belonging/tropical/cyclonic/one/tropical/cyclonic/yes/occur/tropical/subtropical/ocean/up/low pressure/vortex/yes/one/powerful/but/deep/tropical/weather/system/… …/wherein/wind/up/12 class/or/up/general/mean/typhoon [ SEP ] ", wherein"/"can be used to determine the word segmentation position, the symbol [ CLS ] or [ SEP ] is comprised with a word segmentation between"/", a participle is also included between two adjacent "/".
In step S202, the separator is a symbol used to divide two adjacent words in a sentence. The specific form of the separator may vary and may be set according to the actual situation; for example, the separator may be "/" or "[SEW]", among others. One word lies between two adjacent separators; a word carries semantic information, the characters within a word are strongly correlated with each other, and the correlation between different words is weak.
If the separator is "SEW" ", the example sentence is changed into" [ CLS ] typhoon [ SEW ] belongs to [ SEW ] of [ SEW ] tropical [ SEW ] cyclone [ SEW ], a [ SEW ] tropical [ SEW ] cyclone [ SEW ] is that [ SEW ] occurs [ SEW ] and [ SEW ] low-pressure [ SEW ] vortex [ SEW ] of [ SEW ] on [ SEW ] tropical [ SEW ] or [ SEW ] subtropical [ SEW ] ocean surface [ SEW ] is [ SEW ] strong [ SEW ] and [ SEW ] deep thickness [ SEW ] tropical [ SEW ] weather [ SEW ] system [ SEW ] … … [ SEW ] wherein [ SEW ] wind [ SEW ] reaches [ SEW ]12 level [ SEW ] or is more than SEW ] SEW.
In step S203, the first answer text with the separator inserted therein is input into the generative model, and the second answer text is obtained.
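The following sketch illustrates steps S201 to S203 under stated assumptions: the jieba tokenizer stands in for an unspecified word segmentation tool, and the sentence-splitting rule is illustrative.

```python
# Sketch of steps S201-S203: segment the first answer text, insert a separator at
# each word segmentation position, and add the BERT-style [CLS]/[SEP] symbols.
import re
import jieba  # assumed tokenizer; pip install jieba

def insert_separators(first_answer_text: str, sep: str = "[SEW]") -> str:
    sentences = [s for s in re.split(r"(?<=[。！？])", first_answer_text) if s.strip()]
    pieces = ["[CLS]"]
    for sentence in sentences:
        words = jieba.lcut(sentence)   # word segmentation of one sentence
        pieces.append(sep.join(words))
        pieces.append("[SEP]")         # [SEP] between sentences and at the end
    return "".join(pieces)

# The resulting string, e.g. "[CLS]台风[SEW]属于[SEW]...[SEP]", is then fed to the
# generative model to obtain the second answer text.
```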
In some possible embodiments, the generative model is a language model based on Bidirectional Encoder Representations from Transformers (BERT).
In this embodiment, the BERT-based language model models the first answer text with the attention mechanism: it computes the correlation between each word in the first answer text and all the words in the first answer text. Because word-to-word correlation reflects, to some extent, the relevance and importance of different words in the first answer text, the model uses these correlations to adjust the importance (or weight) of each word and finally obtains a new representation for each word.
Finally, the second answer text corresponding to the first answer text is determined based on the new representation of each word, which improves the accuracy of the second answer text.
In some possible embodiments, the extraction model extracts at least one keyword and/or key sentence already existing in the first answer text to form the second answer text; as shown in fig. 4, this includes:
step S301: splitting the first answer text into N sentences, and selecting M sentences from the N sentences, wherein M is not more than N;
step S302: determining candidate keywords of the M sentences;
step S303: sorting the importance of the candidate keywords;
step S304: selecting a preset number of candidate keywords with high importance as keywords;
step S305: and forming the second answer text based on the keywords.
In this embodiment, by directly extracting partial sentences (M sentences) from the N sentences of the first answer text for further processing, the computational complexity of the entire model processing can be effectively reduced. Here, the extraction may be performed in such a manner that the first M sentences of the first answer text, the last M sentences of the first answer text, and the like are extracted.
In step S302, candidate keywords may be determined by a method of extracting keywords in the related art.
After the candidate keywords are determined, an importance value is determined for each candidate keyword, and the importance value is used for representing the importance of the candidate keywords. Here, the importance value of the candidate keyword may be determined according to the occurrence frequency of the candidate keyword in the first answer text, and the higher the occurrence frequency is, the larger the importance value is, the higher the corresponding importance degree is; or, determining according to the corresponding relation between the preset keywords and the importance value.
And finally, sorting the importance of the candidate keywords according to the importance value, selecting a preset number of candidate keywords with high importance as the keywords, and forming a second answer text based on the determined keywords.
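A minimal sketch of steps S301 to S305 follows, assuming occurrence frequency in the first answer text as the importance value (one of the two options mentioned above); the jieba tokenizer, the sentence-splitting punctuation, and the parameter names are assumptions.

```python
# Sketch of steps S301-S305, ranking candidate keywords by their frequency
# in the first answer text.
import re
from collections import Counter

import jieba  # assumed tokenizer; pip install jieba

def extract_summary(first_answer_text: str, m: int = 3, top_k: int = 5) -> str:
    # S301: split the text into N sentences and keep the first M of them
    sentences = [s for s in re.split(r"[。！？；]", first_answer_text) if s.strip()]
    selected = sentences[:m]
    # S302: candidate keywords of the M sentences (single characters skipped)
    candidates = [w for s in selected for w in jieba.lcut(s) if len(w) > 1]
    # S303/S304: rank candidates by frequency in the whole text, keep the top k
    frequency = Counter(jieba.lcut(first_answer_text))
    keywords = sorted(set(candidates), key=lambda w: frequency[w], reverse=True)[:top_k]
    # S305: form the second answer text from the selected keywords
    return "、".join(keywords)
```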
One specific example is provided below in connection with any of the embodiments described above:
the basic flow of the embodiment of the present disclosure is shown in fig. 5 and 6 when applied to an open domain answer scenario in an intelligent question-answering:
a1: after the question answering system acquires question information (query) of a user, retrieving a final answer (namely, a first answer text) from a library;
a2: for answers with longer lengths, judging the text type of the answers through a text analysis module (namely, a model determining module) and judging which abstract module (namely, a target model) the answers are applicable to;
a3: sending the query and the first answer text to an abstract module, and processing to obtain an abstract result (namely, a second answer text);
a4: if the abstract module is a generative model, a check module (such as a semantic matching model) is further used for judging whether the semantics of the second answer text are consistent with the semantics of the first answer text, if not, the abstract model is used for processing to obtain a third answer text, and the third answer text is used as the abstract of the first answer text.
Here, the text analysis module is configured to analyze the text type and text length of the first answer text; the query and the first answer text are analyzed by the text analysis module to obtain information such as the type and length of the answer text. By analyzing the query and the first answer text, the main question types can be obtained: what-is (i.e., fact type text), why (i.e., reason type text), and how-to (i.e., method type text). Using this information, first answer texts of different text types and text lengths are routed to different summarization modules. Experimental comparison shows that the following scheme is preferred: for fact type text with a length between 200 and 500 characters, the extraction model is used; for fact type text with a length of 100 to 300 characters, the generative model is used; for fact type text longer than 1000 characters, the comprehensive model is used. Reason type text and method type text are mainly paragraph texts without line breaks; for these two types with a text length between 300 and 1000 characters, the extraction model is used.
The abstract module is used for extracting a second answer text based on the first answer text and comprises a generating model, an extracting model and a comprehensive model.
For the extraction model, the first problem is the choice of sentence granularity. In this embodiment, commas, periods, semicolons, and the like are selected as sentence breaks, so that text segments with continuous semantics and short lengths can be extracted as far as possible.
For fact type text, in order to ensure semantic consistency, the extraction module fuses the Lead-3 algorithm with a TextRank algorithm that uses bm25 and word vectors as weights, as shown in fig. 7. Meanwhile, to ensure that the key core sentence is not lost, a longest common subsequence matching algorithm is used to match similar segments between the query and the first answer text, and the position of the core sentence is determined from the matched segment.
Here, the related algorithms are described as follows:
The Lead-3 algorithm selects the first three sentences according to the chosen sentence granularity. Fact type text has a high information density, and its key information is generally concentrated in the first few segments of the text, with supplementary information following;
The TextRank algorithm is a graph-based ranking algorithm for text: it splits the text into several sentences, builds a graph model, and scores the important components of the text through a voting mechanism. The extraction results of TextRank are dispersed, so sentences distributed across the answer text can be extracted;
The longest common subsequence matching algorithm ensures that the key core sentence is not lost after extraction.
For the generative model, the main problems it faces include: the training data is small; out-of-vocabulary words appear in the generated text; and the generated text may not stay faithful to the original text. Meanwhile, the first answer text covers many categories such as daily life and science and technology, and the writing style of each category differs.
Therefore, to handle these fine-grained categories more effectively, a pre-trained BERT model is used as the initialization model. Meanwhile, to address the fact that the original BERT model is not suitable for generative tasks, the attention matrix of the BERT model is modified to support generation. Thus, by labeling a small amount of data for each category, a better summary can be generated in each category.
Here, BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained model that performs self-supervised pre-training on upstream tasks with a large corpus; downstream natural language processing tasks can then exploit this prior linguistic knowledge to improve their results. The original BERT cannot be used for generation tasks, but UniLM modifies the attention mask matrix so that a single BERT model can perform Seq2Seq tasks.
As shown in fig. 8, ordinary bidirectional attention uses the concatenation shown on the left side of fig. 8, and using such bidirectional attention for a generation task causes information leakage. Since only the "today's weather" part really needs to be predicted, the mask information of the "sunny day" part can be removed, yielding the attention concatenation shown on the right side of fig. 8. As a result, the attention over the input part is bidirectional and the attention over the output part is unidirectional, which satisfies the Seq2Seq requirement. This follows the idea, proposed by UniLM, that a single BERT model can complete a Seq2Seq task: as long as a mask of this shape is added, the pre-trained weights can be reused and fine-tuned for the generative task without modifying the model architecture.
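To make the mask shape concrete, the following NumPy sketch builds an attention mask in which the input part is bidirectional and the output part is unidirectional, in the spirit of UniLM-style Seq2Seq fine-tuning; it is an illustration of the idea, not the exact matrix used in fig. 8.

```python
# Seq2Seq attention mask sketch: bidirectional over the input segment, causal over
# the output segment. mask[i, j] == 1 means position i may attend to position j.
import numpy as np

def seq2seq_attention_mask(input_len: int, output_len: int) -> np.ndarray:
    total = input_len + output_len
    mask = np.zeros((total, total), dtype=np.int32)
    mask[:, :input_len] = 1              # every position may see the input part
    for i in range(input_len, total):    # output positions see only earlier output
        mask[i, input_len : i + 1] = 1
    return mask

# e.g. 2 input tokens ("today's weather") and 2 output tokens ("sunny day"):
print(seq2seq_attention_mask(2, 2))
```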
Meanwhile, to address out-of-vocabulary words and semantic inconsistency, a sequence labeling model is used to assist generation: some segments are copied from the original text into the generated summary (i.e., the second answer text).
When labeling the data, the common part of the summary and the original text is computed with the longest common subsequence; the portion that overlaps between the summary and the article is the segment that needs to be kept. The first word of such a segment is labeled B, the remaining words of the segment are labeled I, and words that do not appear in the summary are labeled O. For example, if the article is "intelligent question and answer summary" and the summary is "question and answer summary", the article tokens are labeled O, B, I, I, and the rest are labeled O.
In the training stage, this sequence prediction task is added; the copy mechanism based on sequence labeling ensures that the summary stays faithful to the original text and avoids errors in specialized terms.
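A small sketch of the labeling step is shown below: characters of the article that belong to the longest common subsequence with the summary are tagged B or I, the rest O. Character-level tokens are used here for simplicity, whereas the example above labels word-level tokens.

```python
# Sketch of the B/I/O labeling: article positions on the longest common
# subsequence with the summary are kept (B at a segment start, I inside), others O.
def lcs_bio_labels(article: str, summary: str) -> list:
    dp = [[0] * (len(summary) + 1) for _ in range(len(article) + 1)]
    for i in range(1, len(article) + 1):
        for j in range(1, len(summary) + 1):
            if article[i - 1] == summary[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    keep = set()
    i, j = len(article), len(summary)
    while i > 0 and j > 0:               # backtrack through the LCS table
        if article[i - 1] == summary[j - 1]:
            keep.add(i - 1)
            i, j = i - 1, j - 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    labels = []
    for pos in range(len(article)):
        if pos not in keep:
            labels.append("O")
        elif pos - 1 in keep:
            labels.append("I")
        else:
            labels.append("B")
    return labels

print(lcs_bio_labels("智能问答摘要", "问答摘要"))  # ['O', 'O', 'B', 'I', 'I', 'I']
```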
The comprehensive model, as shown in fig. 9, can be applied to fact type text whose length is greater than 1000 characters. Such answer texts are long, their key information is scattered across the article, and using the extraction model or the generative model alone does not work well.
For such text, the extraction model is first used to obtain a short text (the intermediate answer text), for example within 500 characters, and then the generative model processes the short text to obtain the second answer text. The model may ultimately give one of two results: the checking module (a semantic matching model) is applied to the intermediate answer text and the second answer text; if the matching degree between the second answer text and the first answer text does not satisfy the preset condition, the intermediate answer text obtained by the extraction model is used as the final result; if the matching degree satisfies the preset condition, the second answer text is used as the final result.
The verification module is used for verifying the matching degree between the second answer text and the first answer text. In order to ensure that the generated second answer text is faithful to the original first answer text, the second answer text is verified through a verification module to ensure the matching degree of the finally generated abstract. Here, the check module may be a semantic matching model.
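The overall comprehensive-model flow with verification can be sketched as follows; the three model callables and the threshold value are placeholders, not APIs defined by this disclosure.

```python
# Schematic flow: extract an intermediate answer text, generate a candidate summary
# from it, and keep the generated summary only if the semantic-matching check
# against the first answer text passes.
from typing import Callable

def comprehensive_summary(
    first_answer_text: str,
    extraction_model: Callable[[str], str],
    generative_model: Callable[[str], str],
    semantic_match: Callable[[str, str], float],
    threshold: float = 0.8,   # assumed value for the "preset condition"
) -> str:
    intermediate = extraction_model(first_answer_text)  # short text, e.g. within 500 characters
    second_answer = generative_model(intermediate)
    if semantic_match(second_answer, first_answer_text) >= threshold:
        return second_answer  # faithful enough: keep the generated summary
    return intermediate       # otherwise fall back to the extractive result
```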
As shown in fig. 10, the embodiment of the present application uses the pre-trained BERT model, which achieves relatively good results on the downstream task with only a small amount of labeled data. The generated summary (the second answer text) and the original article (the first answer text) are concatenated and fed to the pre-trained BERT model as a machine inference task; attention is computed over the original article and the generated sentence to obtain a contextual representation.
As shown in fig. 11, the present exemplary embodiment further provides an apparatus 10 for extracting a summary of an answer text, including:
an obtaining module 110, configured to obtain a first answer text with a first text length;
a model determining module 120, configured to determine a target model from the multiple candidate models 140 according to a text type and/or a first text length of the first answer text;
a processing module 130, configured to process the first answer text by using the target model to obtain a second answer text with a second text length, where the second text length is shorter than the first text length, and the second answer text and the first answer text have the same subject meaning.
In some possible embodiments, as shown in fig. 12, the candidate models 140 include at least one of:
a generating model 1401 for generating the second answer text based on the content of the first answer text;
an extraction model 1402, extracting at least one keyword and/or key sentence already existing in the first answer text to form the second answer text;
and a comprehensive model 1403, which comprises the extraction model and the generative model arranged in sequence and is used for forming the second answer text after the first answer text is processed by the extraction model and then by the generative model.
In some possible embodiments, as shown in fig. 13, the generative model 1401 includes:
a processing unit 14011, configured to perform word segmentation processing on the first answer text, and determine a word segmentation position in the first answer text;
a separator insertion unit 14012, for inserting a predetermined separator at a word segmentation position in the first answer text;
the generating unit 14013 is configured to input the first answer text into which the separator is inserted into the generative model, so as to obtain the second answer text.
In some possible embodiments, the generative model 1401 is a language model based on Bidirectional Encoder Representations from Transformers (BERT).
In some possible embodiments, as shown in fig. 14, the extraction model 1402 includes:
a selecting unit 14021, configured to split the first answer text into N sentences and select M sentences from the N sentences, where M is not greater than N;
a candidate keyword determining unit 14022 for determining candidate keywords of the M sentences;
a ranking unit 14023, configured to rank the importance of the candidate keywords;
a selecting unit 14024, configured to select a preset number of candidate keywords with high importance as keywords;
a conjunction unit 14025, configured to form the second answer text based on the keyword.
The present exemplary embodiment also provides an electronic device, which includes the apparatus according to any one of the above embodiments.
The present exemplary embodiment also provides a computer-readable storage medium, which stores executable instructions for causing a processor to implement the method according to any one of the above embodiments when executed.
The computer-readable storage medium may be: a storage medium such as a removable storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and various media capable of storing program codes may be selected as a non-transitory storage medium.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
In an exemplary embodiment, each module/Unit in the Device may be implemented by one or more Central Processing Units (CPUs), Graphics Processing Units (GPUs), Baseband Processors (BPs), Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, Micro Controllers (MCUs), microprocessors (microprocessors), or other electronic components, for performing the aforementioned methods.
Fig. 15 is a block diagram illustrating an electronic device apparatus 800 according to an example embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 15, the apparatus 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the apparatus 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power components 806 provide power to the various components of device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor assembly 814 may detect the open/closed status of the device 800 and the relative positioning of components, such as the display and keypad of the device 800. The sensor assembly 814 may also detect a change in the position of the device 800 or of a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in the temperature of the device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The apparatus 800 may access a wireless network based on a communication standard, such as WiFi, 4G or 5G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, executable by the processor 820 of the device 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the embodiments of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosed embodiments being indicated by the following claims.
It is to be understood that the disclosed embodiments are not limited to the precise arrangements described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the embodiments of the present disclosure is limited only by the appended claims.

Claims (14)

1. A method for extracting an abstract of an answer text, characterized by comprising the following steps:
acquiring a first answer text with a first text length;
determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text;
and processing the first answer text by using the target model to obtain a second answer text with a second text length, wherein the second text length is shorter than the first text length, and the second answer text has the same meaning as the first answer text.
2. The method of claim 1, wherein the candidate models comprise at least one of:
a generative model for generating the second answer text based on the content of the first answer text;
an extraction model, which is used for extracting at least one keyword and/or key sentence existing in the first answer text to form the second answer text;
and a comprehensive model, which comprises an extraction model and a generative model arranged in sequence and is used for forming the second answer text after the first answer text is processed by the extraction model and the generative model in sequence.
3. The method for extracting an abstract of an answer text as claimed in claim 2, wherein the determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text comprises at least one of:
if the text type of the first answer text is a first text type and the first text length is within a first interval range, determining that the target model is the generative model;
if the first answer text is of the first text type and the first text length is within a second interval range, determining that the target model is the extraction model;
if the first answer text is of the first text type and the first text length is within a third interval range, determining that the target model is the comprehensive model;
wherein the minimum value of the second interval range is greater than or equal to the maximum value of the first interval range, and the minimum value of the third interval range is greater than or equal to the maximum value of the second interval range.
4. The method for extracting an abstract of an answer text as claimed in claim 2, wherein the determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text further comprises:
if the first answer text is of a second text type or a third text type and the first text length is within a fourth interval range, determining that the target model is the extraction model.
5. The method for extracting an abstract of an answer text as claimed in claim 2, wherein the generating, by the generative model, the second answer text based on the content of the first answer text comprises:
performing word segmentation processing on the first answer text, and determining word segmentation positions in the first answer text;
inserting a predetermined separator at each word segmentation position in the first answer text;
and inputting the first answer text with the separators inserted into the generative model to obtain the second answer text.
6. The method of claim 5, wherein the generative model is a language model based on Bidirectional Encoder Representations from Transformers (BERT).
7. The method for extracting an abstract of an answer text as claimed in claim 2, wherein the extracting, by the extraction model, at least one keyword and/or key sentence already existing in the first answer text to form the second answer text comprises:
splitting the first answer text into N sentences, and selecting M sentences from the N sentences, wherein M is not more than N;
determining candidate keywords of the M sentences;
sorting the candidate keywords by importance;
selecting a preset number of candidate keywords with the highest importance as keywords;
and forming the second answer text based on the keywords.
8. An apparatus for extracting an abstract of an answer text, characterized by comprising:
the acquisition module is used for acquiring a first answer text with a first text length;
the model determining module is used for determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text;
and the processing module is used for processing the first answer text by using the target model to obtain a second answer text with a second text length, wherein the second text length is shorter than the first text length, and the second answer text has the same meaning as the first answer text.
9. The apparatus for extracting an abstract of an answer text as claimed in claim 8, wherein the candidate models comprise at least one of:
a generative model for generating the second answer text based on the content of the first answer text;
an extraction model, which is used for extracting at least one keyword and/or key sentence existing in the first answer text to form the second answer text;
and a comprehensive model, which comprises an extraction model and a generative model arranged in sequence and is used for forming the second answer text after the first answer text is processed by the extraction model and the generative model in sequence.
10. The apparatus for extracting an abstract of an answer text as claimed in claim 9, wherein the generative model comprises:
a processing unit, which is used for performing word segmentation processing on the first answer text and determining the word segmentation positions in the first answer text;
a separator insertion unit, which is used for inserting a predetermined separator at each word segmentation position in the first answer text;
and a generating unit, which is used for inputting the first answer text with the separators inserted into the generative model to obtain the second answer text.
11. The apparatus for extracting an abstract of an answer text as claimed in claim 9, wherein the generative model is a language model based on Bidirectional Encoder Representations from Transformers (BERT).
12. The apparatus for extracting an abstract of an answer text as claimed in claim 9, wherein the extraction model comprises:
a selecting unit, which is used for splitting the first answer text into N sentences and selecting M sentences from the N sentences, wherein M is not more than N;
a candidate keyword determining unit, which is used for determining candidate keywords of the M sentences;
a sorting unit, which is used for sorting the candidate keywords by importance;
a keyword selection unit, which is used for selecting a preset number of candidate keywords with the highest importance as keywords;
and a word connection unit, which is used for forming the second answer text based on the keywords.
13. An electronic device, characterized in that it comprises the apparatus of any of claims 8 to 12.
14. A computer-readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to implement the method of any one of claims 1 to 7.
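For readers approaching the claims from an implementation angle, the following sketch illustrates, in ordinary Python, the model-routing logic of claims 1 to 4 and a toy stand-in for the extraction model of claim 7. It is not the patented implementation: the interval boundaries, the function names (choose_target_model, extractive_summary, summarize), and the frequency-based importance ranking are placeholder assumptions chosen only for illustration, and a real system would substitute trained models, for example a BERT-based generator for the generative path of claims 5 and 6.

```python
# Illustrative sketch only. All names and thresholds are hypothetical and are not
# taken from the patent; trained models would replace the placeholders below.
from collections import Counter
import re

# Hypothetical length intervals (in characters) for a "first text type".
# The claims only require that the intervals be ordered, not these values.
FIRST_TYPE_INTERVALS = {
    "generative": (0, 200),          # first interval  -> generative model
    "extraction": (200, 800),        # second interval -> extraction model
    "comprehensive": (800, 10_000),  # third interval  -> extraction then generation
}

def choose_target_model(text: str, text_type: str) -> str:
    """Pick a candidate model from the text type and first text length (claims 1, 3, 4)."""
    length = len(text)
    if text_type == "first":
        for name, (lo, hi) in FIRST_TYPE_INTERVALS.items():
            if lo <= length < hi:
                return name
    # Claim 4: other text types fall back to the extraction model.
    return "extraction"

def extractive_summary(text: str, max_sentences: int = 3, max_keywords: int = 5) -> str:
    """Toy extraction model (claim 7): split into sentences, rank candidate words
    by frequency as a stand-in for importance, and keep the top-scoring sentences."""
    sentences = [s.strip() for s in re.split(r"[。.!?！？]", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    keywords = {w for w, _ in Counter(words).most_common(max_keywords)}
    # Score each sentence by how many top keywords it contains.
    scored = sorted(sentences, key=lambda s: -sum(k in s.lower() for k in keywords))
    return " ".join(scored[:max_sentences])

def summarize(text: str, text_type: str = "first") -> str:
    model = choose_target_model(text, text_type)
    if model == "extraction":
        return extractive_summary(text)
    if model == "generative":
        # Placeholder: a real system would word-segment the text, insert separators,
        # and feed it to a BERT-style generator (claims 5-6).
        return extractive_summary(text, max_sentences=1)
    # "comprehensive": extraction followed by generation (claim 2).
    return extractive_summary(extractive_summary(text))

if __name__ == "__main__":
    answer = ("The battery lasts about two days under normal use. Heavy gaming or video "
              "recording drains the battery faster. Enabling the power-saving mode and "
              "lowering screen brightness extends battery life noticeably.")
    print(summarize(answer))
```

The point the sketch makes is the structural one claimed above: the text type and the first text length alone decide which candidate model processes the first answer text, and the comprehensive path simply chains the extraction and generation steps.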
CN202110881696.3A 2021-08-02 2021-08-02 Abstract extraction method and device of answer text, electronic equipment and medium Pending CN113688231A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110881696.3A CN113688231A (en) 2021-08-02 2021-08-02 Abstract extraction method and device of answer text, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN113688231A true CN113688231A (en) 2021-11-23

Family

ID=78578579

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110881696.3A Pending CN113688231A (en) 2021-08-02 2021-08-02 Abstract extraction method and device of answer text, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN113688231A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280112A (en) * 2017-06-22 2018-07-13 腾讯科技(深圳)有限公司 Abstraction generating method, device and computer equipment
CN109960790A (en) * 2017-12-25 2019-07-02 北京国双科技有限公司 Abstraction generating method and device
CN108681574A (en) * 2018-05-07 2018-10-19 中国科学院合肥物质科学研究院 A kind of non-true class quiz answers selection method and system based on text snippet
CN111026861A (en) * 2019-12-10 2020-04-17 腾讯科技(深圳)有限公司 Text abstract generation method, text abstract training method, text abstract generation device, text abstract training device, text abstract equipment and text abstract training medium
WO2021135910A1 (en) * 2020-06-24 2021-07-08 平安科技(深圳)有限公司 Machine reading comprehension-based information extraction method and related device
CN111858913A (en) * 2020-07-08 2020-10-30 北京嘀嘀无限科技发展有限公司 Method and system for automatically generating text abstract
CN112541109A (en) * 2020-12-22 2021-03-23 北京百度网讯科技有限公司 Answer abstract extraction method and device, electronic equipment, readable medium and product

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
余正涛; 毛存礼; 邓锦辉; 章程; 郭剑毅: "Answer extraction method for Chinese question answering system based on pattern learning", Journal of Jilin University (Engineering and Technology Edition), no. 01, pages 142-147 *
刘海静: "Research on extraction algorithms for answer-related sentences in machine reading comprehension software", Software Engineering, no. 10, pages 14-16 *
王帅; 赵翔; 李博; 葛斌; 汤大权: "TP-AS: a two-stage automatic summarization method for long texts", Journal of Chinese Information Processing, no. 06, pages 71-79 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114691858A (en) * 2022-03-15 2022-07-01 电子科技大学 Improved UNILM abstract generation method
CN114691858B (en) * 2022-03-15 2023-10-03 电子科技大学 Improved UNILM abstract generation method
CN117669512A (en) * 2024-02-01 2024-03-08 腾讯科技(深圳)有限公司 Answer generation method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
US11500917B2 (en) Providing a summary of a multimedia document in a session
CN107832286B (en) Intelligent interaction method, equipment and storage medium
WO2021088510A1 (en) Video classification method and apparatus, computer, and readable storage medium
CN109196496B (en) Unknown word predictor and content integrated translator
JP6667504B2 (en) Orphan utterance detection system and method
US20160247068A1 (en) System and method for automatic question answering
CN109508400B (en) Method for generating image-text abstract
US11836183B2 (en) Digital image classification and annotation
CN112104919B (en) Content title generation method, device, equipment and computer readable storage medium based on neural network
JP6361351B2 (en) Method, program and computing system for ranking spoken words
US9652452B2 (en) Method and system for constructing a language model
WO2018045646A1 (en) Artificial intelligence-based method and device for human-machine interaction
US11138387B2 (en) Human emotion detection
CN108345612A (en) A kind of question processing method and device, a kind of device for issue handling
CN109710732B (en) Information query method, device, storage medium and electronic equipment
CN113688231A (en) Abstract extraction method and device of answer text, electronic equipment and medium
CN108345625B (en) Information mining method and device for information mining
CN114328838A (en) Event extraction method and device, electronic equipment and readable storage medium
CN107424612B (en) Processing method, apparatus and machine-readable medium
US11361759B2 (en) Methods and systems for automatic generation and convergence of keywords and/or keyphrases from a media
CN108268443B (en) Method and device for determining topic point transfer and acquiring reply text
CN112328793A (en) Comment text data processing method and device and storage medium
CN116127062A (en) Training method of pre-training language model, text emotion classification method and device
KR102446305B1 (en) Method and apparatus for sentiment analysis service including highlighting function
CN114417827A (en) Text context processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination