CN113204956A - Multi-model training method, abstract segmentation method, text segmentation method and text segmentation device

Info

Publication number
CN113204956A
Authority
CN
China
Prior art keywords
training, abstract, text, model, sentences
Legal status
Granted
Application number
CN202110762240.5A
Other languages
Chinese (zh)
Other versions
CN113204956B (en)
Inventor
蒋志燕
吕少领
黄石磊
程刚
Current Assignee
Shenzhen Raisound Technology Co ltd
Original Assignee
Shenzhen Raisound Technology Co ltd
Application filed by Shenzhen Raisound Technology Co ltd
Priority to CN202110762240.5A
Publication of CN113204956A
Application granted
Publication of CN113204956B
Current legal status: Active

Classifications

    • G06F40/205: Handling natural language data; natural language analysis; parsing
    • G06F40/216: Parsing using statistical methods
    • G06F16/345: Information retrieval of unstructured textual data; browsing and visualisation; summarisation for human users
    • G06F16/35: Information retrieval of unstructured textual data; clustering; classification
    • G06F40/126: Handling natural language data; text processing; use of codes for handling textual entities; character encoding


Abstract

The application relates to the technical field of artificial intelligence, and discloses a multi-model training method, which comprises the following steps: dividing the text in the training text set into single sentences to obtain a training single sentence set; extracting the characteristics of the training single sentence set to obtain a training single sentence vector set; extracting paragraph coding features and abstract coding features of training single sentences in the training single sentence vector set; and performing first training on the pre-constructed text segmentation model by using the training single sentence vector set and the abstract coding features, and performing second training on the pre-constructed text abstract extraction model by using the training single sentence vector set and the paragraph coding features to obtain a standard text segmentation model and a standard text abstract extraction model. In addition, the application also relates to an abstract extraction method, a text segmentation method, a device, equipment and a storage medium. The method and the device can improve both the model accuracy and the obtaining efficiency of the text segmentation model and the abstract extraction model obtained by training.

Description

Multi-model training method, abstract segmentation method, text segmentation method and text segmentation device
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a multi-model training method, an abstract segmentation method, a text segmentation method, an apparatus, an electronic device, and a storage medium.
Background
Text segmentation and abstract extraction are common processing modes for text processing, and in the prior art, a text abstract extraction model and a text segmentation model are usually obtained by training respectively. Specifically, the existing technology adopted for the text segmentation task is to use a Jaccard similarity analyzer to find the distance between consecutive sentences, and if the distance between them is less than a given value, the consecutive sentences are grouped into a paragraph. However, the segmentation method only performs segmentation from a single dimension of similarity between sentences, which easily results in inaccurate segmentation results.
The text abstract extraction model used in existing text abstract extraction methods usually considers only the position of a sentence and extracts the first few sentences of a text to form the abstract. This approach is relatively suitable for news texts, so the application range of such a model is narrow, and the extraction result is often inaccurate for texts other than the specific types (such as news texts) the approach fits. In summary, in the prior art, neither the obtaining efficiency nor the model accuracy of the text segmentation model and the text abstract extraction model is high.
Disclosure of Invention
In order to solve the technical problems or at least partially solve the technical problems, the application provides a multi-model training method, a summary segmentation method, a text segmentation method, a device, an electronic device and a storage medium.
In a first aspect, the present application provides a multi-model training method, including:
acquiring a training text set, and dividing texts in the training text set into single sentences to obtain a training single sentence set;
performing feature extraction on the training single sentence set through a preset feature extraction model to obtain a training single sentence vector set;
extracting paragraph coding features and abstract coding features of the training single sentences in the training single sentence vector set;
and carrying out first training on a pre-constructed text segmentation model by using the training single sentence vector set and the abstract coding features of the training single sentences in the training single sentence vector set, and carrying out second training on a pre-constructed text abstract extraction model by using the training single sentence vector set and the paragraph coding features of the training single sentences in the training single sentence vector set to obtain a standard text segmentation model and a standard text abstract extraction model.
Optionally, the performing feature extraction on the training single sentence set through a preset feature extraction model to obtain a training single sentence vector set includes:
extracting the mark embedding characteristics, distinguishing embedding characteristics and position embedding characteristics of the training single sentences in the training single sentence set through a first extraction network of a preset characteristic extraction model;
combining the mark embedding characteristics, the distinguishing embedding characteristics and the position embedding characteristics of the training single sentence and inputting the combination into a second extraction network of the characteristic extraction model to obtain a single sentence vector of the training single sentence;
and summarizing all the obtained single sentence vectors to obtain the training single sentence vector set.
Optionally, the feature extraction model is a BERT model.
Optionally, the performing, by using the training single sentence vector set and the paragraph coding features of the training single sentences in the training single sentence vector set, a second training on the pre-constructed text abstract extraction model includes:
inputting the training single sentence vector set and paragraph coding features of the training single sentences in the training single sentence vector set to a pre-constructed document feature extraction model to obtain an abstract training single sentence vector set;
and inputting the abstract training single sentence vector set into the text abstract extraction model for second training.
Optionally, after obtaining the training single sentence set, the method further includes:
and deleting stop words, tone words and repeated words in the training single sentence set.
In a second aspect, the present application provides a method for abstracting a summary, the method including:
acquiring a text to be processed;
and inputting the text to be processed into a standard text abstract extraction model for abstract extraction to obtain an abstract extraction result, wherein the standard text abstract extraction model is obtained by training by adopting the multi-model training method in the first aspect.
In a third aspect, the present application provides a text segmentation method, including:
acquiring a text to be processed;
and inputting the text to be processed into a standard text segmentation model for text segmentation to obtain a text segmentation result, wherein the standard text segmentation model is obtained by training by adopting the multi-model training method in the first aspect.
In a fourth aspect, the present application provides a multi-model training apparatus, the apparatus comprising:
the training text acquisition module is used for acquiring a training text set, dividing texts in the training text set into single sentences to obtain a training single sentence set;
the first feature extraction module is used for extracting features of the training single sentence set through a preset feature extraction model to obtain a training single sentence vector set;
the second characteristic extraction module is used for extracting paragraph coding characteristics and abstract coding characteristics of the training single sentences in the training single sentence vector set;
and the training module is used for carrying out first training on the pre-constructed text segmentation model by utilizing the training single sentence vector set and the abstract coding characteristics of the training single sentences in the training single sentence vector set, and carrying out second training on the pre-constructed text abstract extraction model by utilizing the training single sentence vector set and the paragraph coding characteristics of the training single sentences in the training single sentence vector set to obtain a standard text segmentation model and a standard text abstract extraction model.
In a fifth aspect, the present application provides a digest extraction apparatus, including:
the abstract text acquisition module is used for acquiring a text to be processed;
and the abstract extraction module is used for inputting the text to be processed into a standard text abstract extraction model for abstract extraction to obtain an abstract extraction result, wherein the standard text abstract extraction model is obtained by training by adopting the multi-model training device in the fourth aspect.
In a sixth aspect, the present application provides a text segmentation apparatus, comprising:
the segmented text acquisition module is used for acquiring a text to be processed;
and the text segmentation module is used for inputting the text to be processed into a standard text segmentation model for text segmentation to obtain a text segmentation result, wherein the standard text segmentation model is obtained by training by adopting the multi-model training device according to any one of the fourth aspect.
In a seventh aspect, an electronic device is provided, which includes a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
a processor, configured to implement the steps of the multi-model training method according to any one of the embodiments of the first aspect, or implement the steps of the abstract extraction method according to the second aspect, or implement the steps of the text segmentation method according to the third aspect when executing the program stored in the memory.
In an eighth aspect, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, implements the steps of the multi-model training method according to any one of the embodiments of the first aspect, or implements the steps of the abstract extraction method according to the second aspect, or implements the steps of the text segmentation method according to the third aspect.
Compared with the prior art, the technical scheme provided by the embodiment of the application has the following advantages:
according to the multi-model training method, the abstract segmentation method, the text segmentation method, the device, the electronic equipment and the computer readable storage medium, the text abstract extraction model and the text segmentation model can be trained simultaneously by using the training single sentence set obtained by the training text set, the abstract coding features of the training single sentences are used during the training of the text segmentation model, and the paragraph coding features are used during the training of the text abstract extraction model, so that the segmentation model obtained by training can be learned by combining the abstract coding features during the training of the segmentation model, the accuracy of the segmentation model obtained by training can be improved, the abstract extraction model can be learned by combining the paragraph coding features during the training of the abstract extraction model, and the accuracy of the abstract extraction model obtained by training can be improved. Therefore, the embodiment of the invention can not only improve the acquisition efficiency of the text segmentation model and the abstract extraction model obtained by training, but also improve the model accuracy of the text segmentation model and the abstract extraction model obtained by training.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
Fig. 1 is a schematic flowchart of a multi-model training method according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of a method for abstracting a summary according to an embodiment of the present application;
fig. 3 is a schematic flowchart of a text segmentation method according to an embodiment of the present application;
FIG. 4 is a block diagram of a multi-model training apparatus according to an embodiment of the present disclosure;
fig. 5 is a schematic block diagram of a summary extraction apparatus according to an embodiment of the present disclosure;
FIG. 6 is a block diagram of a text segmentation apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 is a schematic flowchart of a multi-model training method according to an embodiment of the present disclosure. In this embodiment, the multi-model training method includes:
s11, obtaining a training text set, and dividing the text in the training text set into single sentences to obtain a training single sentence set.
In this embodiment, the training text set is the data used for model training. Specifically, the training text set is text-type data, and may be Chinese text, English text, and the like.
In this embodiment, the training text set includes a plurality of paragraph texts, where the paragraph text data may be text data crawled from a local source or from the network. Each paragraph text includes a plurality of single sentences together with paragraph tags and abstract tags for those single sentences; the paragraph tags and the abstract tags mark whether a sentence is a paragraph boundary and whether it belongs to the abstract, respectively.
For example, each sentence $sent_i$ carries two labels, $l_i^{seg}$ and $l_i^{sum}$, where $l_i^{seg}$ indicates whether the i-th sentence is a paragraph boundary and $l_i^{sum}$ indicates whether the i-th sentence is included in the abstract, i.e., whether it is an abstract sentence. Specifically, $l_i^{seg}$ and $l_i^{sum}$ can each be 0 or 1: $l_i^{seg}=0$ means the i-th sentence is not a paragraph boundary, and $l_i^{seg}=1$ means it is; $l_i^{sum}=0$ means the i-th sentence is not an abstract sentence, and $l_i^{sum}=1$ means it is.
In this embodiment, when the training text set is Chinese, the text in the training text set may be divided into single sentences by punctuation marks. For example, when a sentence-ending mark such as a period, an exclamation mark, or a question mark is recognized, the text before that mark is extracted as one sentence.
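As an illustrative sketch (not part of the patent text), this punctuation-based splitting might look like the following in Python, using the three Chinese sentence-ending marks named above:

```python
import re

def split_into_sentences(text: str) -> list[str]:
    # Split after each Chinese sentence-ending mark (period, exclamation
    # mark, question mark), keeping the mark attached to its sentence.
    parts = re.split(r"(?<=[。！？])", text)
    return [s.strip() for s in parts if s.strip()]

# Two sentences, ended by a period and a question mark respectively.
print(split_into_sentences("今天天气很好。你吃饭了吗？"))
```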
Further, after obtaining the training single sentence set, the method may further include:
and deleting stop words, tone words and repeated words in the training single sentence set.
Specifically, stop words are used herein to refer to common words (e.g., "and," "the," etc.), numbers, punctuation marks and other special characters, etc.
In this embodiment, deleting this information from the training single sentence set simplifies the single sentences, which improves the efficiency of model training while maintaining training accuracy.
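A minimal sketch of this cleaning step follows; the stop-word list is an assumption for illustration, since the patent does not enumerate one:

```python
# Illustrative stop-word / tone-word list; the patent does not enumerate one.
STOP_WORDS = {"的", "了", "啊", "吗", "and", "the"}

def clean_sentence(tokens: list[str]) -> list[str]:
    # Drop stop words and tone words, then collapse consecutive repeats.
    kept = [t for t in tokens if t not in STOP_WORDS]
    return [t for i, t in enumerate(kept) if i == 0 or t != kept[i - 1]]

print(clean_sentence(["我", "的", "模型", "模型", "很", "很", "好", "了"]))
# ['我', '模型', '很', '好']
```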
And S12, performing feature extraction on the training single sentence set through a preset feature extraction model to obtain a training single sentence vector set.
In this embodiment, the feature extraction of the training single sentence set may include performing word segmentation on the training single sentence in the training single sentence set to obtain word segmentation representation of the sentence, and performing feature extraction on the word segmentation representation of the sentence through a preset feature extraction model.
Specifically, a word segmenter (Tokenizer) may be employed to tokenize each sentence and obtain all the words that appear in it, with [CLS] added at the beginning and [SEP] added at the end of each sentence.
In this embodiment, the feature extraction model is a model capable of performing sentence vector extraction, for example, the feature extraction model is an ELMo model.
Preferably, the feature extraction model is a BERT model. BERT consists of multiple bidirectional Transformer encoder layers and is capable of capturing bidirectional context information.
For example, all the sentences $sent_i$ with the delimiters [CLS] and [SEP] are input into the BERT model as sequences. Since the input sequence length is not fixed, a sequence length MAXLEN is set in advance: if an input sequence is shorter than MAXLEN, it is padded with a padding character to ensure uniform length; if it exceeds MAXLEN, the sequence is truncated and input in several parts (e.g., twice).
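A hedged sketch of this tokenization, padding, and truncation step using the Hugging Face tokenizer API; the checkpoint name bert-base-chinese and the MAXLEN value are assumptions:

```python
from transformers import BertTokenizer

MAXLEN = 128  # preset sequence length; the concrete value is an assumption

# "bert-base-chinese" is an illustrative checkpoint, not named in the patent.
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")

# The tokenizer inserts [CLS] / [SEP] and pads or truncates every
# sequence to the preset length MAXLEN.
encoded = tokenizer(
    ["今天天气很好。", "你吃饭了吗？"],
    padding="max_length",
    truncation=True,
    max_length=MAXLEN,
    return_tensors="pt",
)
print(encoded["input_ids"].shape)  # torch.Size([2, 128])
```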
Preferably, the performing feature extraction on the training single sentence set through a preset feature extraction model to obtain a training single sentence vector set includes:
extracting the mark embedding characteristics, distinguishing embedding characteristics and position embedding characteristics of the training single sentences in the training single sentence set through a first extraction network of a preset characteristic extraction model;
combining the mark embedding characteristics, the distinguishing embedding characteristics and the position embedding characteristics of the training single sentence and inputting the combination into a second extraction network of the characteristic extraction model to obtain a single sentence vector of the training single sentence;
and summarizing all the obtained single sentence vectors to obtain the training single sentence vector set.
In this embodiment, the feature extraction model includes a plurality of extraction networks; for example, the second extraction network is the last several (for example, three) Transformer layers in the BERT model, and the first extraction network is the other Transformer layers in the BERT model.
In this embodiment, the mark embedding feature is a word vector of a word in a single sentence, the distinguishing embedding feature is used to distinguish the single sentence from an adjacent sentence into different sentences, and the position embedding feature is used to identify an absolute position of the word in the single sentence.
For example, in the distinguishing embedding features, $E_A$ and $E_B$ are used to distinguish two different sentences; that is, for the sentence $sent_i$, the distinguishing embedding feature is $E_A$ if $i$ is odd and $E_B$ otherwise. For example, for five input sentences $sent_1, sent_2, sent_3, sent_4, sent_5$, the corresponding distinguishing embedding features are $\{E_A, E_B, E_A, E_B, E_A\}$.
For example, the mark embedding feature is Token Embedding, the distinguishing embedding feature is Segment Embedding, and the position embedding feature is Position Embedding, so the representation of a sentence is E = Token Embedding + Segment Embedding + Position Embedding. E is passed through the multiple Transformer layers of BERT to obtain the contextual embedding of each sentence, and the contextual embedding of each sentence is passed through the last Transformer layer of BERT to obtain the vector representation of the single sentence, i.e., the single sentence vector; all the single sentence vectors are then summarized to obtain the training single sentence vector set.
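A sketch of obtaining a single sentence vector from BERT along these lines; taking the final-layer [CLS] embedding as the sentence vector is one plausible reading of this step, and the checkpoint name is an assumption:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint
model = BertModel.from_pretrained("bert-base-chinese")

def sentence_vector(sentence: str) -> torch.Tensor:
    # Token, segment, and position embeddings are summed inside BERT
    # (E = Token + Segment + Position); the last Transformer layer's
    # output at the [CLS] position serves as the single sentence vector.
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state[0, 0]

print(sentence_vector("今天天气很好。").shape)  # torch.Size([768])
```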
Preferably, in an embodiment of the present invention, after obtaining the training single sentence vector set, the method further includes:
and dividing the training single sentence vector set into a training set, a verification set and a test set. The training set is used for training the model, the verification set is used for verifying the model, and the test set is used for testing the model.
Specifically, M% of data in the training single sentence vector set may be randomly taken as a training set, N% of texts may be taken as a verification set, and the last (100-M-N)% of texts may be taken as a test set, where the values of M and N may be preset.
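A minimal sketch of this split, with illustrative values M=80 and N=10 (the patent leaves M and N as presets):

```python
import random

def split_dataset(items, m=80, n=10, seed=42):
    # M% training set, N% verification set, (100 - M - N)% test set.
    items = list(items)
    random.Random(seed).shuffle(items)
    m_end = len(items) * m // 100
    n_end = m_end + len(items) * n // 100
    return items[:m_end], items[m_end:n_end], items[n_end:]

train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))  # 800 100 100
```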
And S13, extracting paragraph coding features and abstract coding features of the training single sentences in the training single sentence vector set.
In this embodiment, the paragraph coding feature of a training single sentence represents the paragraph feature of the training single sentence within the entire training text, and may specifically include the position of the single sentence in its paragraph. For example, the paragraph coding feature of a certain single sentence is N/M, where M represents the number of sentences in the paragraph to which the single sentence belongs (M differs for different paragraphs) and N indicates that the current single sentence is the N-th sentence in that paragraph.
The abstract coding feature of a single sentence indicates the abstract feature of the single sentence within a paragraph, and may specifically include whether the single sentence is an abstract sentence and its position among the abstract sentences. For example, the abstract coding feature of a single sentence is K/P, where P represents the number of abstract sentences and K indicates that the single sentence is the K-th sentence in the abstract; if the current sentence is not an abstract sentence, the feature is represented as 0/0.
In this embodiment, the paragraph coding features and the abstract coding features of each single sentence may be pre-labeled, and then the paragraph coding features and the abstract coding features of the pre-labeled single sentences are directly extracted when the model is trained.
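A sketch of deriving the N/M and K/P coding features from pre-labeled paragraphs and abstract sentences, following the conventions described above (function names are illustrative):

```python
def paragraph_features(paragraphs: list[list[str]]) -> list[str]:
    # N/M: the sentence is the N-th of the M sentences in its paragraph.
    feats = []
    for para in paragraphs:
        m = len(para)
        feats.extend(f"{n}/{m}" for n in range(1, m + 1))
    return feats

def abstract_features(sentences: list[str], abstract: list[str]) -> list[str]:
    # K/P: the sentence is the K-th of the P abstract sentences;
    # "0/0" if the sentence is not part of the abstract.
    p = len(abstract)
    rank = {s: k for k, s in enumerate(abstract, start=1)}
    return [f"{rank[s]}/{p}" if s in rank else "0/0" for s in sentences]

print(paragraph_features([["s1", "s2"], ["s3", "s4", "s5"]]))
# ['1/2', '2/2', '1/3', '2/3', '3/3']
```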
S14, performing first training on the pre-constructed text segmentation model by using the training single sentence vector set and the abstract coding features of the training single sentences in the training single sentence vector set, and performing second training on the pre-constructed text abstract extraction model by using the training single sentence vector set and the paragraph coding features of the training single sentences in the training single sentence vector set to obtain a standard text segmentation model and a standard text abstract extraction model.
For example, let the single sentence vector of a single sentence in the training single sentence vector set be $T_i$. If the abstract coding feature of the single sentence is K/P, $T_i$ is concatenated with K/P to obtain $T_i'$, and $T_i'$ is then input into the text segmentation model for the first training.
In this embodiment, the text segmentation model is a linear classifier.
For example, the text segmentation model is:
$\hat{y}_i^{seg} = \sigma(W^{\mathrm{T}} T_i' + b)$

where $\hat{y}_i^{seg}$ denotes the value of the predicted label of the single sentence, specifically the predicted value of the single sentence being a paragraph boundary, $\sigma$ is the sigmoid activation function, $W^{\mathrm{T}}$ denotes the transpose of a preset parameter vector, $T_i'$ denotes the vector representation of the input sentence $sent_i$ (i.e., the combination of the single sentence vector and the abstract coding features of the single sentence), and $b$ denotes a preset bias.
In this embodiment, the first training learns the values of $W$ and $b$; the text segmentation model containing the learned values of $W$ and $b$ is the trained standard text segmentation model.

Furthermore, during the first training, the model can be trained on the training set drawn from the training single sentence vector set and then verified and corrected on the verification set. Specifically, the value of the predicted label $\hat{y}_i^{seg}$ can be compared with the actual label value $l_i^{seg}$ to determine whether the paragraph boundary prediction for each single sentence is accurate, and the proportion of accurate predictions over all single sentences is then counted to obtain the trained standard text segmentation model. If the accuracy is lower than a preset accuracy, the text segmentation model is trained again until a model whose accuracy reaches the preset accuracy is obtained.
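A minimal PyTorch sketch of this linear segmentation classifier and its first training with binary cross-entropy; the feature dimension, optimizer, and learning rate are assumptions:

```python
import torch
import torch.nn as nn

DIM = 769  # 768-dim sentence vector plus one abstract-coding scalar; assumed

class SegmentationClassifier(nn.Module):
    # Linear classifier: y_hat = sigmoid(W^T x + b)
    def __init__(self, dim: int = DIM):
        super().__init__()
        self.linear = nn.Linear(dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.linear(x)).squeeze(-1)

model = SegmentationClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

x = torch.randn(32, DIM)                 # stand-in batch of [T_i; K/P] inputs
y = torch.randint(0, 2, (32,)).float()   # paragraph-boundary labels
for _ in range(10):                      # a few illustrative training steps
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```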
Similarly, for example, let the single sentence vector of a single sentence in the training single sentence vector set be $T_i$. If the paragraph coding feature of the single sentence is N/M, $T_i$ is concatenated with N/M to obtain $T_i''$, and $T_i''$ is then input into the text abstract extraction model for the second training.
In this embodiment, the text summarization extraction model may also be a linear classifier.
For example, the text summarization extraction model is:
$\hat{y}_i^{sum} = \sigma(W_o T_i'' + b_o)$

where $\hat{y}_i^{sum}$ represents the abstract prediction label value of the single sentence, $\sigma$ is the sigmoid activation function, $T_i''$ represents the vector representation of the input sentence $sent_i$ (i.e., the combination of the single sentence vector and the paragraph coding features of the single sentence), and $W_o$ and $b_o$ are the parameters of the linear classifier.
In this embodiment, the second training learns the values of $W_o$ and $b_o$; the text abstract extraction model containing the learned values of $W_o$ and $b_o$ is the trained standard text abstract extraction model.

Furthermore, during the second training, the model can be trained on the training set drawn from the training single sentence vector set and then verified and corrected on the verification set. Specifically, the value of the predicted label $\hat{y}_i^{sum}$ can be compared with the actual label value $l_i^{sum}$ to determine whether the abstract prediction for each single sentence is accurate, and the proportion of accurate abstract predictions over all single sentences is then counted to obtain the trained standard text abstract extraction model. If the accuracy is lower than a preset accuracy, the text abstract extraction model is trained again until a model whose accuracy reaches the preset accuracy is obtained.
Further, many combinations of $W$ and $b$ are generated during the training and verification processes; the best 3 combinations of $W$ and $b$ (namely, the ones whose label predictions are most often correct) are selected on the verification set, and these 3 combinations of $W$ and $b$ are retained.
Further, the performing a second training on the pre-constructed text abstract extraction model by using the training single sentence vector set and the paragraph coding features of the training single sentences in the training single sentence vector set includes:
inputting the training single sentence vector set and paragraph coding features of the training single sentences in the training single sentence vector set to a pre-constructed document feature extraction model to obtain an abstract training single sentence vector set;
and inputting the abstract training single sentence vector set into the text abstract extraction model for second training.
In this embodiment, the document feature extraction model may be a linear classification model, a Long Short-Term Memory (LSTM) network model, or a Transformer model.
Specifically, a function that combines the single sentence vector of a single sentence with its paragraph coding features may be used first, namely:

$h^0 = f(T)$

where $h^0$ denotes the initial input vector and $T$ denotes the vector representation of the sentence; the paragraph coding feature N/M may be placed immediately after the [CLS] delimiter of the sentence, separated from the sentence content with "@".

For example, $h^0$ is input into a Transformer model, and features at the document level are extracted using the Multi-Head Attention mechanism and Layer Normalization in the Transformer model, as the following formulas show, where $\mathrm{LN}$ denotes the layer normalization operation, $\mathrm{MHAtt}$ denotes the multi-head attention mechanism, $\mathrm{FFN}$ denotes a feed-forward network, and $l$ denotes the depth in the layer stack of the Transformer model (e.g., the network layers of the multiple encoders and decoders in the Transformer model), i.e., $l$ is an indication of the layer depth. After $h^0$ is input into the Transformer model, multi-head attention and layer normalization are applied layer by layer to obtain the final sentence vector:

$\tilde{h}^l = \mathrm{LN}(h^{l-1} + \mathrm{MHAtt}(h^{l-1}))$

$h^l = \mathrm{LN}(\tilde{h}^l + \mathrm{FFN}(\tilde{h}^l))$

where $h^l$ denotes the vector representation output by the $l$-th layer and $\tilde{h}^l$ denotes the vector output by the $(l-1)$-th layer after the multi-head attention and Layer Normalization operations of the $l$-th layer; $h^L$ denotes the vector representation of the sentences output by the final ($L$-th) layer, i.e., after information extraction by an $L$-layer Transformer model, the final vector of a certain sentence $i$ is expressed as $h_i^L$.
Then, the vector representation of the sentence is input into the text abstract extraction model for binary classification. Specifically, the text abstract extraction model may be:

$\hat{y}_i^{sum} = \sigma(W_o h_i^L + b_o)$

where $\hat{y}_i^{sum}$ represents the abstract prediction label value of the single sentence, $\sigma$ is the sigmoid activation function, $h_i^L$ represents the document-level vector representation of the input sentence $sent_i$ (i.e., the combination of the single sentence vector and the paragraph coding features of the single sentence, after document-level feature extraction), and $W_o$ and $b_o$ are the parameters of the linear classifier.
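A hedged PyTorch sketch of the document feature extraction model followed by the binary classifier, using the built-in Transformer encoder as a stand-in for the $L$-layer stack described above; all hyperparameters are assumptions:

```python
import torch
import torch.nn as nn

class SummaryExtractor(nn.Module):
    # An L-layer Transformer encoder over sentence vectors (document-level
    # feature extraction), followed by y_hat = sigmoid(W_o * h_i^L + b_o).
    def __init__(self, dim: int = 768, layers: int = 2, heads: int = 8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)
        self.classifier = nn.Linear(dim, 1)

    def forward(self, h0: torch.Tensor) -> torch.Tensor:
        # h0: (batch, num_sentences, dim) initial sentence vectors
        hL = self.encoder(h0)
        return torch.sigmoid(self.classifier(hL)).squeeze(-1)

model = SummaryExtractor()
scores = model(torch.randn(1, 12, 768))  # abstract probability per sentence
print(scores.shape)  # torch.Size([1, 12])
```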
According to the method and the device, the text abstract extraction model and the text segmentation model are trained simultaneously using the training single sentence set obtained from the training text set. The abstract coding features of the training single sentences are used when training the text segmentation model, and the paragraph coding features are used when training the text abstract extraction model, so that the segmentation model can learn from the abstract coding features during training, which improves the accuracy of the trained segmentation model, and the abstract extraction model can learn from the paragraph coding features during training, which improves the accuracy of the trained abstract extraction model. Therefore, the embodiment of the invention can improve not only the obtaining efficiency but also the model accuracy of the text segmentation model and the abstract extraction model obtained by training.
Fig. 2 is a flowchart illustrating a method for abstracting a summary according to an embodiment of the present disclosure. In this embodiment, the digest extraction method includes:
and S21, acquiring the text to be processed.
In this embodiment, the text to be processed may be input by a user or obtained from a local or network database, and it may include a plurality of single sentences.
The text to be processed can be Chinese or in another language.
And S22, inputting the text to be processed into a standard text abstract extraction model for abstract extraction to obtain an abstract extraction result.
In this embodiment, the standard text abstract extraction model is obtained by training by using the multi-model training method described in the foregoing method embodiment.
In this embodiment, a single-sentence dividing process may be performed on the text to be processed, that is, the text to be processed is divided into single sentences, then the probability that a single sentence in the text to be processed is a summary sentence is identified by using the standard text summary extraction model, and a sentence with the probability greater than the preset probability is determined as the summary sentence.
Further, the abstract extraction result includes an abstract sentence identified as an abstract and an abstract probability of the abstract sentence, and after the abstract extraction result is obtained, the method further includes:
determining the number of the abstracts for generating the abstract paragraphs, and sequencing the abstract sentences according to the abstract probability;
judging whether at least two abstract sentences with similarity greater than preset similarity exist in the plurality of abstract sentences from front to back according to the sequence, wherein the total number of the plurality of abstract sentences is equal to the number of the abstract sentences;
if at least two abstract sentences with the similarity greater than the preset similarity exist, reserving any one of the at least two abstract sentences as a target abstract sentence, deleting abstract sentences except the target abstract sentence, and executing the operation of judging whether at least two abstract sentences with the similarity greater than the preset similarity exist in the plurality of abstract sentences from front to back according to the sequence;
and if at least two abstract sentences with the similarity larger than the preset similarity do not exist, forming the abstract sentences into abstract paragraphs.
In this embodiment, the abstract paragraph is a paragraph composed of abstract sentences. A plurality of sentences are selected from the abstract extraction result to form the abstract paragraph, which makes the generated abstract both concise and accurate.
For example, suppose the number of abstracts for generating an abstract paragraph is determined to be 8, and 12 sentences have a probability greater than the preset probability, i.e., there are 12 abstract sentences. The 12 sentences are sorted in descending order of abstract probability, and it is judged whether any two of the first 8 sentences are highly similar. Specifically, it may be judged whether the number of similar or repeated words in the two sentences is greater than a preset number of words (for example, greater than 3); if so, the two sentences are determined to be similar sentences, and one of them is selected and deleted. Preferably, the sentence with the lower abstract probability is deleted and the sentence with the higher abstract probability is retained. It is then judged whether the 9th sentence in the original order is highly similar to any of the 7 sentences remaining after the deletion. If not, the 7 remaining sentences and the 9th sentence form the abstract paragraph; if the 9th sentence in the original order is highly similar to any of the 7 remaining sentences, the 9th sentence is deleted, and the 10th sentence in the original order is compared for similarity with the 7 sentences, and so on, until an abstract paragraph without similar sentences is obtained. This reduces redundant information in the abstract paragraph, making it more refined and the information it contains more complete.
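A sketch of this de-duplication procedure, using shared-word count as the similarity test; the threshold values are assumptions:

```python
def build_abstract(ranked: list[str], k: int = 8, max_shared: int = 3) -> list[str]:
    # ranked: abstract sentences sorted by abstract probability, high to low.
    # Greedily keep a sentence only if it shares at most max_shared words
    # with every sentence kept so far, stopping once k sentences remain.
    def similar(a: str, b: str) -> bool:
        return len(set(a.split()) & set(b.split())) > max_shared

    kept: list[str] = []
    for sent in ranked:
        if all(not similar(sent, other) for other in kept):
            kept.append(sent)
        if len(kept) == k:
            break
    return kept
```

Because higher-probability sentences are visited first, the higher-probability member of any similar pair is the one retained, matching the preference stated above.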
In this embodiment, since the standard text abstract extraction model is obtained by training by using the multi-model training method in the foregoing method embodiment, the method embodiment of the present invention can accurately judge the abstract sentence of the text to be processed, obtain an accurate abstract extraction result, and obtain a brief and accurate abstract paragraph.
Fig. 3 is a schematic flowchart of a text segmentation method according to an embodiment of the present application. In this embodiment, the text segmentation method includes:
and S31, acquiring the text to be processed.
In this embodiment, the text to be processed may be input by a user or may be a text to be segmented obtained from a local or network database, and it may include a plurality of single sentences.
The text to be processed can be Chinese or in another language.
And S32, inputting the text to be processed into a standard text segmentation model for text segmentation to obtain a text segmentation result.
In this embodiment, the standard text segmentation model is obtained by training using the multi-model training method described in the foregoing method embodiment.
In this embodiment, the text to be processed may first be divided into single sentences; the labels of the paragraph boundaries of the different single sentences in the text to be processed are then identified by the standard text segmentation model, and the paragraph boundary labels determine which single sentences belong to the same paragraph and which do not, thereby determining the segmentation result of the text to be processed.
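A sketch of assembling the segmentation result from the predicted per-sentence boundary labels, assuming a label of 1 closes the current paragraph:

```python
def group_paragraphs(sentences: list[str], boundary: list[int]) -> list[list[str]]:
    # boundary[i] == 1 means sentence i is predicted as a paragraph
    # boundary (assumed here to close the current paragraph).
    paragraphs, current = [], []
    for sent, is_boundary in zip(sentences, boundary):
        current.append(sent)
        if is_boundary:
            paragraphs.append(current)
            current = []
    if current:
        paragraphs.append(current)
    return paragraphs

print(group_paragraphs(["s1", "s2", "s3", "s4"], [0, 1, 0, 1]))
# [['s1', 's2'], ['s3', 's4']]
```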
In this embodiment, since the standard text segmentation model is obtained by training using the multi-model training method described in the foregoing method embodiment, the standard text segmentation model can accurately segment the text to be processed, so as to obtain an accurate segmentation result.
As shown in fig. 4, an embodiment of the present application provides a block diagram of a multi-model training apparatus 40, where the multi-model training apparatus 40 includes: a training text acquisition module 41, a first feature extraction module 42, a second feature extraction module 43, and a training module 44.
The training text acquisition module 41 is configured to acquire a training text set, and divide texts in the training text set into single sentences to obtain a training single sentence set;
the first feature extraction module 42 is configured to perform feature extraction on the training single sentence set through a preset feature extraction model to obtain a training single sentence vector set;
the second feature extraction module 43 is configured to extract paragraph coding features and abstract coding features of a training single sentence in the training single sentence vector set;
the training module 44 is configured to perform first training on a pre-constructed text segmentation model by using the training single sentence vector set and the abstract coding features of the training single sentences in the training single sentence vector set, and perform second training on a pre-constructed text abstract extraction model by using the training single sentence vector set and the paragraph coding features of the training single sentences in the training single sentence vector set, so as to obtain a standard text segmentation model and a standard text abstract extraction model.
In detail, when the modules in the multi-model training device 40 in the embodiment of the present application are used, the same technical means as the multi-model training method described in fig. 1 above are adopted, and the same technical effects can be produced, which is not described herein again.
As shown in fig. 5, an embodiment of the present application provides a block diagram of a summary extraction apparatus 50, where the summary extraction apparatus 50 includes: a summary text acquisition module 51 and a summary extraction module 52.
The abstract text acquisition module 51 is configured to acquire a text to be processed;
the abstract extraction module 52 is configured to input the text to be processed into a standard text abstract extraction model for abstract extraction, so as to obtain an abstract extraction result, where the standard text abstract extraction model is obtained by training using the multi-model training apparatus described in the foregoing apparatus embodiment.
In detail, when the modules in the abstract extracting apparatus 50 in the embodiment of the present application are used, the same technical means as the abstract extracting method described in fig. 2 above is adopted, and the same technical effects can be produced, which is not described herein again.
As shown in fig. 6, an embodiment of the present application provides a block diagram of a text segmentation apparatus 60, where the text segmentation apparatus 60 includes: a segmented text acquisition module 61 and a text segmentation module 62.
The segmented text acquisition module 61 is configured to acquire a text to be processed;
the text segmentation module 62 is configured to input the text to be processed into a standard text segmentation model for text segmentation, so as to obtain a text segmentation result, where the standard text segmentation model is obtained by training using the multi-model training apparatus described in the foregoing apparatus embodiment.
In detail, when the modules in the text segmentation apparatus 60 in the embodiment of the present application are used, the same technical means as the text segmentation method described in fig. 3 above are adopted, and the same technical effects can be produced, which is not described herein again.
As shown in fig. 7, an electronic device according to an embodiment of the present application includes a processor 111, a communication interface 112, a memory 113, and a communication bus 114, where the processor 111, the communication interface 112, and the memory 113 complete communication with each other through the communication bus 114.
The memory 113 stores a computer program.
In an embodiment of the present application, the processor 111 is configured to, when executing the program stored in the memory 113, implement the multi-model training method provided in any one of the foregoing method embodiments, or implement the abstract extraction method provided in any one of the foregoing method embodiments, or implement the text segmentation method provided in any one of the foregoing method embodiments.
The multi-model training method comprises the following steps:
acquiring a training text set, and dividing texts in the training text set into single sentences to obtain a training single sentence set;
performing feature extraction on the training single sentence set through a preset feature extraction model to obtain a training single sentence vector set;
extracting paragraph coding features and abstract coding features of the training single sentences in the training single sentence vector set;
and carrying out first training on a pre-constructed text segmentation model by using the training single sentence vector set and the abstract coding features of the training single sentences in the training single sentence vector set, and carrying out second training on a pre-constructed text abstract extraction model by using the training single sentence vector set and the paragraph coding features of the training single sentences in the training single sentence vector set to obtain a standard text segmentation model and a standard text abstract extraction model.
The abstract extraction method comprises the following steps:
acquiring a text to be processed;
and inputting the text to be processed into a standard text abstract extraction model for abstract extraction to obtain an abstract extraction result, wherein the standard text abstract extraction model is obtained by training by adopting the multi-model training method of any one of the method embodiments.
The text segmentation method comprises the following steps:
acquiring a text to be processed;
and inputting the text to be processed into a standard text segmentation model for text segmentation to obtain a text segmentation result, wherein the standard text segmentation model is obtained by training by adopting the multi-model training method of any one of the method embodiments.
The communication bus 114 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus 114 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface 112 is used for communication between the above-described electronic apparatus and other apparatuses.
The memory 113 may include a Random Access Memory (RAM), and may also include a non-volatile memory (non-volatile memory), such as at least one disk memory. Optionally, the memory 113 may also be at least one storage device located remotely from the processor 111.
The processor 111 may be a general-purpose processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like; the integrated circuit may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic device, or discrete hardware components.
Embodiments of the present application further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the multi-model training method as provided in any of the foregoing method embodiments, or implements the steps of the summarization extraction method as provided in any of the foregoing method embodiments, or implements the steps of the text segmentation method as provided in any of the foregoing method embodiments.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions according to the embodiments of the present application are all or partially generated when the computer program instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (ssd)), among others.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (13)

1. A method of multi-model training, the method comprising:
acquiring a training text set, and dividing texts in the training text set into single sentences to obtain a training single sentence set;
performing feature extraction on the training single sentence set through a preset feature extraction model to obtain a training single sentence vector set;
extracting paragraph coding features and abstract coding features of the training single sentences in the training single sentence vector set;
and carrying out first training on a pre-constructed text segmentation model by using the training single sentence vector set and the abstract coding features of the training single sentences in the training single sentence vector set, and carrying out second training on a pre-constructed text abstract extraction model by using the training single sentence vector set and the paragraph coding features of the training single sentences in the training single sentence vector set to obtain a standard text segmentation model and a standard text abstract extraction model.
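For illustration only, the training flow recited in claim 1 can be sketched in Python as follows. Every identifier (split_into_sentences, feature_extractor, paragraph_coding, abstract_coding, the fit interfaces) is a hypothetical placeholder assumed for this sketch and is not taken from the patent; the sketch merely mirrors the four claimed steps under those assumptions.

import re
from typing import List

def split_into_sentences(text: str) -> List[str]:
    # Step 1 helper: divide a text into single sentences (naive punctuation split).
    return [s for s in re.split(r"(?<=[。！？.!?])\s*", text) if s]

def paragraph_coding(vector):
    # Placeholder for deriving a paragraph coding feature from a sentence vector.
    return vector

def abstract_coding(vector):
    # Placeholder for deriving an abstract coding feature from a sentence vector.
    return vector

def train_multi_model(training_texts, feature_extractor,
                      segmentation_model, abstract_model):
    # Step 1: divide every text in the training text set into single sentences.
    sentences = [s for t in training_texts for s in split_into_sentences(t)]
    # Step 2: map each single sentence to a vector with the preset feature
    # extraction model (claim 3 names a BERT model).
    vectors = [feature_extractor.encode(s) for s in sentences]
    # Step 3: extract paragraph coding features and abstract coding features
    # of each training single sentence (placeholders above).
    paragraph_feats = [paragraph_coding(v) for v in vectors]
    abstract_feats = [abstract_coding(v) for v in vectors]
    # Step 4: first training pairs the segmentation model with the abstract
    # coding features; second training pairs the abstract extraction model
    # with the paragraph coding features, as recited in claim 1.
    segmentation_model.fit(vectors, abstract_feats)
    abstract_model.fit(vectors, paragraph_feats)
    return segmentation_model, abstract_model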
2. The method of claim 1, wherein the performing feature extraction on the training single sentence set through a preset feature extraction model to obtain a training single sentence vector set comprises:
extracting the mark embedding features, distinguishing embedding features and position embedding features of the training single sentences in the training single sentence set through a first extraction network of a preset feature extraction model;
combining the mark embedding features, the distinguishing embedding features and the position embedding features of each training single sentence, and inputting the combination into a second extraction network of the feature extraction model to obtain a single sentence vector of the training single sentence;
and summarizing all the obtained single sentence vectors to obtain the training single sentence vector set.
3. The method of claim 2, wherein the feature extraction model is a BERT model.
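As a rough illustration of claims 2 and 3, the three embedding features can be combined the way BERT combines them: summing token (mark), segment (distinguishing) and position embeddings before a Transformer encoder. The PyTorch sketch below assumes hypothetical hyperparameters (vocabulary size, hidden width, layer count); none of them come from the patent.

import torch
import torch.nn as nn

class SentenceEncoder(nn.Module):
    def __init__(self, vocab_size=21128, hidden=768, max_len=512, n_layers=2):
        super().__init__()
        # "First extraction network": the three embedding tables.
        self.token_emb = nn.Embedding(vocab_size, hidden)    # mark embedding
        self.segment_emb = nn.Embedding(2, hidden)           # distinguishing embedding
        self.position_emb = nn.Embedding(max_len, hidden)    # position embedding
        # "Second extraction network": a small Transformer encoder.
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=12,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, token_ids, segment_ids):
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        # Combine the three embedding features by summation, as in BERT.
        x = (self.token_emb(token_ids)
             + self.segment_emb(segment_ids)
             + self.position_emb(positions))
        h = self.encoder(x)
        # Take the first token's hidden state as the single sentence vector.
        return h[:, 0]

Given token_ids and segment_ids tensors of shape (batch, sequence), SentenceEncoder()(token_ids, segment_ids) would return one vector per sentence in this sketch.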
4. The method of claim 1, wherein the second training of the pre-constructed text abstract extraction model using the training single sentence vector set and the paragraph coding features of the training single sentences in the training single sentence vector set comprises:
inputting the training single sentence vector set and paragraph coding features of the training single sentences in the training single sentence vector set to a pre-constructed document feature extraction model to obtain an abstract training single sentence vector set;
and inputting the abstract training single sentence vector set into the text abstract extraction model for second training.
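To show where the document feature extraction model of claim 4 sits in the pipeline, here is a minimal continuation of the earlier sketch; the transform and fit interfaces are assumptions, not the patented implementation.

def second_training(sentence_vectors, paragraph_feats,
                    document_feature_model, abstract_model):
    # Pass the sentence vectors and their paragraph coding features through
    # the pre-constructed document feature extraction model to obtain the
    # abstract training single sentence vector set.
    abstract_vectors = document_feature_model.transform(sentence_vectors,
                                                        paragraph_feats)
    # Input that vector set into the text abstract extraction model for the
    # second training.
    abstract_model.fit(abstract_vectors)
    return abstract_model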
5. The method of claim 1, wherein, after obtaining the training single sentence set, the method further comprises:
deleting stop words, modal (tone) words and repeated words from the training single sentence set.
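The cleanup step of claim 5 could look like the sketch below, where the stop-word and modal-word sets are tiny placeholders standing in for whatever word lists an implementation would actually use.

STOP_WORDS = {"的", "了", "和", "a", "an", "the"}   # placeholder stop words
MODAL_WORDS = {"吗", "呢", "吧", "啊"}               # placeholder modal (tone) words

def clean_sentence(tokens):
    # Remove stop words, modal words and repeated words from one tokenized sentence.
    seen, kept = set(), []
    for tok in tokens:
        if tok in STOP_WORDS or tok in MODAL_WORDS or tok in seen:
            continue
        seen.add(tok)
        kept.append(tok)
    return kept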
6. An abstract extraction method, the method comprising:
acquiring a text to be processed;
inputting the text to be processed into a standard text abstract extraction model for abstract extraction to obtain an abstract extraction result, wherein the standard text abstract extraction model is trained using the multi-model training method according to any one of claims 1 to 5.
7. The method of claim 6, wherein the abstract extraction result includes abstract sentences identified as abstracts and an abstract probability for each abstract sentence, and after the abstract extraction result is obtained, the method further comprises:
determining an abstract number for generating an abstract paragraph, and sorting the abstract sentences by abstract probability;
judging, from front to back in the sorted order, whether at least two abstract sentences whose similarity is greater than a preset similarity exist among a plurality of abstract sentences, wherein the total number of the plurality of abstract sentences is equal to the abstract number;
if at least two abstract sentences whose similarity is greater than the preset similarity exist, retaining any one of the at least two abstract sentences as a target abstract sentence, deleting the abstract sentences other than the target abstract sentence, and performing the judging operation again;
and if no such pair of abstract sentences exists, composing the abstract sentences into the abstract paragraph.
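The deduplication loop of claim 7 can be illustrated as follows. The character-overlap similarity is a stand-in chosen for the sketch, since the claim leaves the similarity measure and the preset threshold open.

def similarity(a: str, b: str) -> float:
    # Stand-in similarity: character-set Jaccard overlap between two sentences.
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 0.0

def build_abstract_paragraph(candidates, abstract_number, preset_similarity=0.8):
    # candidates: (sentence, abstract_probability) pairs from the extraction model.
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    kept = [sentence for sentence, _ in ranked[:abstract_number]]
    changed = True
    while changed:  # repeat the judging operation after every deletion
        changed = False
        for i in range(len(kept)):
            for j in range(i + 1, len(kept)):
                if similarity(kept[i], kept[j]) > preset_similarity:
                    del kept[j]  # retain one sentence of the similar pair
                    changed = True
                    break
            if changed:
                break
    # Compose the remaining sentences into the abstract paragraph.
    return " ".join(kept)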
8. A method of text segmentation, the method comprising:
acquiring a text to be processed;
inputting the text to be processed into a standard text segmentation model for text segmentation to obtain a text segmentation result, wherein the standard text segmentation model is trained using the multi-model training method according to any one of claims 1 to 5.
9. A multi-model training apparatus, the apparatus comprising:
the training text acquisition module is used for acquiring a training text set and dividing texts in the training text set into single sentences to obtain a training single sentence set;
the first feature extraction module is used for extracting features of the training single sentence set through a preset feature extraction model to obtain a training single sentence vector set;
the second feature extraction module is used for extracting paragraph coding features and abstract coding features of the training single sentences in the training single sentence vector set;
and the training module is used for carrying out first training on the pre-constructed text segmentation model by utilizing the training single sentence vector set and the abstract coding features of the training single sentences in the training single sentence vector set, and carrying out second training on the pre-constructed text abstract extraction model by utilizing the training single sentence vector set and the paragraph coding features of the training single sentences in the training single sentence vector set, to obtain a standard text segmentation model and a standard text abstract extraction model.
10. An abstract extraction apparatus, the apparatus comprising:
the abstract text acquisition module is used for acquiring a text to be processed;
the abstract extraction module is used for inputting the text to be processed into a standard text abstract extraction model for abstract extraction to obtain an abstract extraction result, wherein the standard text abstract extraction model is trained by the multi-model training apparatus according to claim 9.
11. A text segmentation apparatus, characterized in that the apparatus comprises:
the segmented text acquisition module is used for acquiring a text to be processed;
the text segmentation module is used for inputting the text to be processed into a standard text segmentation model for text segmentation to obtain a text segmentation result, wherein the standard text segmentation model is trained by the multi-model training apparatus according to claim 9.
12. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
the memory is used for storing a computer program;
and the processor is used for implementing, when executing the program stored in the memory, the steps of the multi-model training method according to any one of claims 1 to 5, or the steps of the abstract extraction method according to claim 6 or 7, or the steps of the text segmentation method according to claim 8.
13. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the multi-model training method according to any one of claims 1 to 5, or the steps of the abstract extraction method according to claim 6 or 7, or the steps of the text segmentation method according to claim 8.
CN202110762240.5A 2021-07-06 2021-07-06 Multi-model training method, abstract segmentation method, text segmentation method and text segmentation device Active CN113204956B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110762240.5A CN113204956B (en) 2021-07-06 2021-07-06 Multi-model training method, abstract segmentation method, text segmentation method and text segmentation device

Publications (2)

Publication Number Publication Date
CN113204956A true CN113204956A (en) 2021-08-03
CN113204956B CN113204956B (en) 2021-10-08

Family

ID=77022760

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110762240.5A Active CN113204956B (en) 2021-07-06 2021-07-06 Multi-model training method, abstract segmentation method, text segmentation method and text segmentation device

Country Status (1)

Country Link
CN (1) CN113204956B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1894686A (en) * 2003-11-21 2007-01-10 皇家飞利浦电子股份有限公司 Text segmentation and topic annotation for document structuring
CN101446940A (en) * 2007-11-27 2009-06-03 北京大学 Method and device of automatically generating a summary for document set
CN103699525A (en) * 2014-01-03 2014-04-02 江苏金智教育信息技术有限公司 Method and device for automatically generating abstract on basis of multi-dimensional characteristics of text
CN106844340A (en) * 2017-01-10 2017-06-13 北京百度网讯科技有限公司 News in brief generation and display methods, apparatus and system based on artificial intelligence
CN109522402A (en) * 2018-10-22 2019-03-26 国家电网有限公司 A kind of abstract extraction method and storage medium based on power industry characteristic key words
CN109657051A (en) * 2018-11-30 2019-04-19 平安科技(深圳)有限公司 Text snippet generation method, device, computer equipment and storage medium
CN111191011A (en) * 2020-04-17 2020-05-22 郑州工程技术学院 Search matching method, device and equipment for text label and storage medium
CN111460789A (en) * 2020-05-15 2020-07-28 湖南工商大学 L STM sentence segmentation method, system and medium based on character embedding
CN111581341A (en) * 2020-04-21 2020-08-25 上海明略人工智能(集团)有限公司 Method for acquiring text abstract and language model generation method
CN111666764A (en) * 2020-06-02 2020-09-15 南京优慧信安科技有限公司 XLNET-based automatic summarization method and device
CN112507711A (en) * 2020-12-04 2021-03-16 南京擎盾信息科技有限公司 Text abstract extraction method and system
CN112836508A (en) * 2021-01-29 2021-05-25 平安科技(深圳)有限公司 Information extraction model training method and device, terminal equipment and storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113590963A (en) * 2021-08-04 2021-11-02 浙江新蓝网络传媒有限公司 Balanced text recommendation method
CN115098667A (en) * 2022-08-25 2022-09-23 北京聆心智能科技有限公司 Abstract generation method, device and equipment
CN115098667B (en) * 2022-08-25 2023-01-03 北京聆心智能科技有限公司 Abstract generation method, device and equipment

Also Published As

Publication number Publication date
CN113204956B (en) 2021-10-08

Similar Documents

Publication Publication Date Title
WO2023060795A1 (en) Automatic keyword extraction method and apparatus, and device and storage medium
WO2022105122A1 (en) Answer generation method and apparatus based on artificial intelligence, and computer device and medium
CN106951422B (en) Webpage training method and device, and search intention identification method and device
CN109858039B (en) Text information identification method and identification device
CN107341143B (en) Sentence continuity judgment method and device and electronic equipment
CN113204956B (en) Multi-model training method, abstract segmentation method, text segmentation method and text segmentation device
CN112686022A (en) Method and device for detecting illegal corpus, computer equipment and storage medium
CN110083832B (en) Article reprint relation identification method, device, equipment and readable storage medium
CN114218945A (en) Entity identification method, device, server and storage medium
CN112214984A (en) Content plagiarism identification method, device, equipment and storage medium
CN113590810A (en) Abstract generation model training method, abstract generation device and electronic equipment
CN114722832A (en) Abstract extraction method, device, equipment and storage medium
US20220101060A1 (en) Text partitioning method, text classifying method, apparatus, device and storage medium
CN114416981A (en) Long text classification method, device, equipment and storage medium
CN114003725A (en) Information annotation model construction method and information annotation generation method
CN112699671B (en) Language labeling method, device, computer equipment and storage medium
CN113515593A (en) Topic detection method and device based on clustering model and computer equipment
CN111783424A (en) Text clause dividing method and device
CN113255319B (en) Model training method, text segmentation method, abstract extraction method and device
CN110874408B (en) Model training method, text recognition device and computing equipment
CN113011162B (en) Reference digestion method, device, electronic equipment and medium
CN114911936A (en) Model training and comment recognition method and device, electronic equipment and medium
CN115129843A (en) Dialog text abstract extraction method and device
CN114186140A (en) Social interaction information processing method and device, electronic equipment and storage medium
CN116029280A (en) Method, device, computing equipment and storage medium for extracting key information of document

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant