CN117909494B - Abstract consistency assessment model training method and device - Google Patents


Info

Publication number
CN117909494B
Authority
CN
China
Prior art keywords
training
sample
samples
abstract
negative
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410321411.4A
Other languages
Chinese (zh)
Other versions
CN117909494A (en)
Inventor
魏楚元
张森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Civil Engineering and Architecture
Original Assignee
Beijing University of Civil Engineering and Architecture
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Civil Engineering and Architecture filed Critical Beijing University of Civil Engineering and Architecture
Priority to CN202410321411.4A
Publication of CN117909494A
Application granted
Publication of CN117909494B

Landscapes

  • Machine Translation (AREA)

Abstract

The application provides a method and a device for training an abstract consistency evaluation model, and relates to the field of text processing. The abstract consistency evaluation model training method provided by the application comprises the following steps: acquiring a general abstract data set, and processing the general abstract data set based on a positive processing rule and a negative processing rule to form training samples; constructing mixed samples based on the general abstract data set, the positive training samples, the negative training samples and a manually annotated domain abstract data set; determining the enhancement mode of each sample in the actual training samples according to the source of the sample, and constructing enhanced actual training samples; masking the enhanced actual training samples and training the abstract consistency evaluation model on the masked samples, where the masking modes of the enhanced actual training samples differ between training rounds; and judging whether the abstract consistency evaluation model requires further training: if so, returning to the step of determining the training round of the language model; otherwise, the training of the abstract consistency assessment model is completed.

Description

Abstract consistency assessment model training method and device
Technical Field
The application relates to the field of text processing, in particular to a method and a device for training a summary consistency evaluation model.
Background
With the development of artificial intelligence technology in recent years, text summarization based on deep learning has advanced rapidly, and artificial intelligence technology represented by deep learning performs well on natural language processing tasks. However, the abstracts generated by such models often contain factual errors, which undermines readers' interest and trust and prevents text summarization technology from being more widely adopted.
Existing mainstream methods for assessing the factual consistency of text abstracts include unsupervised, supervised and weakly supervised approaches. However, owing to limitations such as domain differences between training data sets and the dependence on a large amount of manual labour to obtain large-scale annotated data, these methods suffer from low data quality and poor evaluation performance, and cannot meet the requirements of higher-level consistency evaluation tasks.
Disclosure of Invention
In view of the above, the application provides a training method and a training device for a summary consistency evaluation model, which can greatly improve the performance of the summary consistency evaluation model.
Specifically, the application is realized by the following technical scheme:
The first aspect of the application provides a method for training a summary consistency assessment model, which comprises the following steps:
Acquiring a general abstract data set, and processing the general abstract data set based on a positive processing rule and a negative processing rule to form a training sample; each abstract data in the general abstract data set comprises a plurality of abstract sentences, each abstract sentence comprises an abstract sentence and an original text corresponding to the abstract sentence, the training sample at least comprises a positive training sample and a negative training sample, and the content of the training sample is different from that of the general abstract data set;
Constructing a mixed sample based on the general abstract data set, the positive training sample, the negative training sample and the manually marked domain abstract data set; determining training rounds of a language model, and selecting actual training samples from the mixed samples according to the training rounds, wherein the actual training samples comprise the general abstract data set, positive training samples, negative training samples and manually marked field abstract data sets, the number of the actual training samples is smaller than that of the mixed samples, and the actual training samples corresponding to different training rounds are not identical;
Determining the enhancement mode of each sample in the actual training samples according to the sources of the samples in the actual training samples, and constructing enhanced actual training samples, wherein the difference between positive samples and negative samples in the enhanced actual training samples is larger than that of the actual training samples before enhancement;
masking the enhanced actual training samples, training the abstract consistency evaluation model based on the masked samples, wherein masking modes of the enhanced actual training samples of different training rounds are different;
Judging whether the abstract consistency evaluation model requires further training, and if so, returning to the step of determining the training round of the language model; otherwise, the training of the abstract consistency assessment model is completed.
The application provides a summary consistency assessment device, which comprises an acquisition module, a construction module, a processing module, a training module and a judging module; wherein,
The acquisition module is used for acquiring a general abstract data set, and processing the general abstract data set based on a positive processing rule and a negative processing rule to form a training sample; each abstract data in the general abstract data set comprises a plurality of abstract sentences, each abstract sentence comprises an abstract sentence and an original text corresponding to the abstract sentence, the training sample at least comprises a positive training sample and a negative training sample, and the content of the training sample is different from that of the general abstract data set;
The construction module is used for constructing a mixed sample based on the general abstract data set, the positive training sample, the negative training sample and the manually marked domain abstract data set; determining training rounds of a language model, and selecting actual training samples from the mixed samples according to the training rounds, wherein the actual training samples comprise the general abstract data set, positive training samples, negative training samples and manually marked field abstract data sets, the number of the actual training samples is smaller than that of the mixed samples, and the actual training samples corresponding to different training rounds are not identical;
The processing module is used for determining the enhancement mode of each sample in the actual training samples according to the sources of the samples in the actual training samples, and constructing enhanced actual training samples, wherein the difference between the positive example samples and the negative example samples in the enhanced actual training samples is larger than that of the actual training samples before enhancement;
the training module is used for masking the enhanced actual training samples and training the abstract consistency evaluation model based on the masked samples, wherein the masking modes of the enhanced actual training samples differ between training rounds;
The judging module is used for judging whether the abstract consistency evaluation model requires further training, and if so, returning to the step of determining the training round of the language model; otherwise, the training of the abstract consistency assessment model is completed.
According to the method and the device for training the abstract consistency assessment model, a general abstract data set is obtained and processed based on a positive processing rule and a negative processing rule to form training samples; mixed samples are constructed based on the general abstract data set, the positive training samples, the negative training samples and a manually annotated domain abstract data set; the enhancement mode of each sample in the mixed samples is determined according to the source of the sample, and enhanced actual training samples are constructed; the enhanced actual training samples are then masked and the abstract consistency assessment model is trained on the masked samples, with different masking modes applied to the enhanced actual training samples in different training rounds; finally, whether the abstract consistency evaluation model requires further training is judged, and if so, the step of determining the training round of the language model is returned to; otherwise, the training of the abstract consistency assessment model is completed. In this way, processing the general abstract data set through the positive processing rule and the negative processing rule to form training samples helps introduce sample diversity, so that the model adapts better to different semantics and structures; constructing mixed samples further enriches the model's training data and improves generalization, and constructing enhanced actual training samples further increases the diversity and complexity of the samples so as to better simulate real scenes. Overall, without the need for large-scale manual annotation, diverse weakly supervised and strongly supervised positive and negative samples are generated, and different grades are distinguished among the negative samples, which improves the comprehensiveness of the distribution of abstract consistency evaluation training samples and thereby improves the evaluation capability of the model. To further enlarge the gap between positive and negative samples, the application also performs an enhancement operation after constructing the rich positive and negative samples; the enlarged gap is then exploited through contrastive learning, so that evaluation and classification are completed better and the accuracy of the model is improved. In addition, the mask training method provided by the application masks only the samples of each batch, and the sample masking modes of different batches are different, which increases the diversity of the sample masks, further improves the evaluation capability of the model, and better improves its robustness. Whether the model has finished training is judged according to the performance indicators of the abstract consistency evaluation model or the trend across training rounds, ensuring that the model learns sufficiently during training. Finally, in the abstract consistency evaluation model, the application introduces contrastive learning between positive example samples and negative example samples, which improves the classification capability with a limited number of samples and thereby improves the performance of the model.
Drawings
FIG. 1 is a flowchart of a first embodiment of a training method for a summary consistency assessment model provided by the present application;
FIG. 2 is a schematic diagram of a method for constructing an enhanced actual training sample according to an exemplary embodiment of the present application;
FIG. 3 is a diagram illustrating a masking policy according to an exemplary embodiment of the present application;
FIG. 4 is a diagram of a summary consistency assessment model in accordance with an exemplary embodiment of the present application;
FIG. 5 is a block diagram of a fine-tuning model according to an exemplary embodiment of the present application;
fig. 6 is a schematic structural diagram of a first embodiment of the training device for the abstract consistency assessment model.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information and, similarly, second information may also be referred to as first information, without departing from the scope of the application. The word "if" as used herein may be interpreted as "when", "upon" or "in response to determining", depending on the context.
The application provides a method and a device for training a summary consistency evaluation model, which can greatly improve the performance of the summary consistency evaluation model.
According to the abstract consistency assessment model training method and device, positive processing rules and negative processing rules are applied to the general abstract data set, so that diverse training samples are generated, which can improve the generalization capability of the model and allow it to better adapt to abstract consistency evaluation tasks; and a text representation is established through the process of identifying entity structure relations and semantics in the general abstract data set, which can provide rich information about the text content for subsequent processing and training tasks. Specifically, a general abstract data set is obtained and processed based on the positive processing rule and the negative processing rule to form training samples; mixed samples are constructed based on the general abstract data set, the positive training samples, the negative training samples and the manually annotated domain abstract data set; the enhancement mode of each sample in the mixed samples is determined according to the source of the sample, and enhanced actual training samples are constructed; the enhanced actual training samples are then masked and the abstract consistency evaluation model is trained on the masked samples, with different masking modes applied in different training rounds; finally, whether the abstract consistency evaluation model requires further training is judged, and if so, the step of determining the training round of the language model is returned to; otherwise, the training of the abstract consistency assessment model is completed. In this way, processing the general abstract data set through the positive processing rule and the negative processing rule to form training samples helps introduce sample diversity, so that the model adapts better to different semantics and structures; constructing mixed samples further enriches the model's training data and improves generalization, and constructing enhanced actual training samples further increases the diversity and complexity of the samples so as to better simulate real scenes. In addition, mask training enables the abstract consistency evaluation model to handle cases where part of the information is missing, which better improves the robustness of the model; and whether the model has finished training is judged according to the performance indicators of the abstract consistency evaluation model or the trend across training rounds, ensuring that the model learns sufficiently during training.
Specific examples are given below to describe the technical solution of the present application in detail.
FIG. 1 is a flowchart of a training method for a summary consistency assessment model according to an embodiment of the present application. Referring to fig. 1, the method includes:
S101, acquiring a general abstract data set, and processing the general abstract data set based on a positive processing rule and a negative processing rule to form a training sample; each abstract data in the general abstract data set comprises a plurality of abstract sentences, each abstract sentence comprises an abstract sentence and an original text corresponding to the abstract sentence, the training sample at least comprises a positive training sample and a negative training sample, and the content of the training sample is different from that of the general abstract data set.
It should be noted that the general abstract data set may be selected according to actual needs, which is not limited in this embodiment. For example, in one embodiment, the CNN/Daily Mail summarization data set may be used as the general abstract data set. This data set comprises approximately 300,000 news articles collected from the CNN and Daily Mail news websites. Each piece of abstract data in the general abstract data set comprises a plurality of abstract sentences, each abstract sentence comprises the abstract sentence itself and the original text corresponding to it, and the positive processing rule and the negative processing rule process the abstract sentences and the original text at the same time.
After the general abstract data set is obtained, it is processed based on a positive processing rule and a negative processing rule to form training samples. Specifically, the positive processing rule and the negative processing rule generate training samples with the same semantics and with opposite semantics, respectively, at least through character replacement and character rearrangement. The positive processing rule at least comprises a random word deletion method, a word reordering method, a synonym replacement method and a redundancy discarding method; the negative processing rule at least comprises entity replacement, sentence antisense and evidence discarding methods.
In specific implementation, the processing the generic abstract data set based on the positive processing rule and the negative processing rule to form a training sample at least includes:
(1) And identifying entity structure relations and semantics in any piece of data in the general abstract data set.
It should be noted that the NER labeling tool of spaCy may be used to extract the entities in any piece of data in the general abstract data set. The extracted entities can be roughly divided into two categories, including named entities such as person, place and organization names. Further, the relationships between the entities are analysed; this may involve dependency syntax analysis or other relation extraction techniques to determine the grammatical and semantic relationships between entities.
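As an illustration of this identification step, the following Python sketch extracts named entities and simple head-dependency relations with spaCy. It is an assumption-laden example rather than the patent's reference implementation; the function name extract_entities_and_relations and the en_core_web_sm pipeline are illustrative choices.

```python
import spacy

# Assumes the English pipeline is installed: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def extract_entities_and_relations(text):
    """Identify named entities and simple grammatical relations between tokens."""
    doc = nlp(text)
    # Named entities: persons, places, organizations, etc.
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    # Dependency-based relations: (head token, dependency label, child token)
    relations = [(tok.head.text, tok.dep_, tok.text) for tok in doc if tok.dep_ != "punct"]
    return entities, relations

entities, relations = extract_entities_and_relations(
    "Scientists found that carbon dioxide concentration affects the Earth's climate."
)
print(entities)   # entity list depends on the pipeline version
print(relations)
```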
(2) And determining the processing entities of the positive processing rule and the negative processing rule according to the entity structure relation.
The positive processing rule includes a plurality of processing methods, and the negative processing rule also includes a plurality of processing methods. When the processing is carried out, any processing mode is selected for any piece of data, and the object to be processed in the data is determined according to the analysis result of the entity structure relationship; the processed object comprises one or more entities with entity relevance. For example, in order to realize sentence antisense, several predicative expressions need to be negated. The positive processing rule may comprise operations such as random noise, word reordering, synonym replacement and redundancy discarding, so as to enhance the diversity of the training samples; the negative processing rule may comprise operations such as entity replacement, sentence antisense and evidence discarding, which introduce interference and negative information.
Specifically, random noise randomly discards or repeats words in the original text and the abstract sentence with a preset probability. In this way the original data content is partially modified while the overall semantics remain unchanged, yielding a positive training sample that is similar to, and factually consistent with, the data in the general abstract data set. Preferably, the preset probability may be 15%, but is not limited thereto. In the implementation process, the spaCy tokenizer is used for word segmentation, words are dropped and/or repeated with 15% probability while all words are traversed, and finally the modified word list is concatenated to obtain the enhanced text. Both the original text and the abstract sentence are processed in this way.
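A minimal sketch of this random-noise rule, assuming spaCy tokenization and the 15% noise probability mentioned above; the helper name random_noise and the even split between dropping and repeating are illustrative assumptions.

```python
import random
import spacy

nlp = spacy.load("en_core_web_sm")

def random_noise(text, p=0.15, seed=None):
    """Randomly drop or repeat each token with probability p, leaving overall semantics unchanged."""
    rng = random.Random(seed)
    out = []
    for tok in nlp(text):
        if rng.random() < p:
            # Half of the noisy cases drop the token, the other half repeat it.
            if rng.random() < 0.5:
                continue
            out.extend([tok.text, tok.text])
        else:
            out.append(tok.text)
    return " ".join(out)

# Applied to both the original text and the abstract sentence.
noisy_summary = random_noise("The planets in the solar system move around the sun.", p=0.15)
```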
Word reordering randomly selects a small fraction of the words and scrambles their order, obtaining a positive training sample of the general abstract data set. In the specific implementation process, the spaCy tokenizer is used to segment the text, and a preset ratio of words is randomly selected and exchanged with other random words. Both the original text and the abstract sentence are processed; the preset ratio may be 5%, but is not limited thereto.
Synonym replacement replaces part of the words with synonyms to obtain a positive training sample of the data in the general abstract data set. Entity recognition is carried out using spaCy's part-of-speech tagging and NER tools; the text is segmented and tagged, the verbs, adjectives and adverbs in the text are taken as the candidate set according to the relations among the entities, and a preset proportion of words, across different entity types or within the same entity type, is selected from the candidate set for synonym exchange, wherein the preset proportion may be 15%, but is not limited thereto. The synonym list of a word is queried using the WordNet thesaurus provided through the Natural Language Toolkit (NLTK), and a word with the same part of speech as the original is then randomly selected for replacement. Likewise, the method is applied to both the original text and the abstract sentence.
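The following sketch illustrates one possible synonym-replacement implementation along these lines, assuming spaCy for POS tagging and NLTK's WordNet for synonym lookup; the helper name and sampling details are illustrative, not prescribed by the disclosure.

```python
import random
import spacy
from nltk.corpus import wordnet as wn   # requires: nltk.download("wordnet")

nlp = spacy.load("en_core_web_sm")
POS_MAP = {"VERB": wn.VERB, "ADJ": wn.ADJ, "ADV": wn.ADV}

def synonym_replace(text, ratio=0.15, seed=None):
    """Replace a preset ratio of verbs/adjectives/adverbs with WordNet synonyms of the same POS."""
    rng = random.Random(seed)
    doc = nlp(text)
    candidates = [i for i, tok in enumerate(doc) if tok.pos_ in POS_MAP]
    chosen = set(rng.sample(candidates, max(1, int(len(candidates) * ratio)))) if candidates else set()
    out = []
    for i, tok in enumerate(doc):
        if i in chosen:
            # Collect same-POS lemmas that differ from the original word.
            lemmas = {lem.name().replace("_", " ")
                      for syn in wn.synsets(tok.text, pos=POS_MAP[tok.pos_])
                      for lem in syn.lemmas() if lem.name().lower() != tok.text.lower()}
            out.append(rng.choice(sorted(lemmas)) if lemmas else tok.text)
        else:
            out.append(tok.text)
    return " ".join(out)
```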
Redundancy discarding first splits the original text into sentences using spaCy, then computes the TF-IDF feature vectors of all sentences with the gensim tool, then sorts the sentences of the original text according to the cosine distance between their feature vectors and that of the abstract sentence, and finally discards the preset ratio of least relevant sentences in the original text, wherein the preset ratio may be 15%, but is not limited thereto.
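A hedged sketch of the redundancy-discarding rule, assuming spaCy sentence splitting and gensim's TF-IDF and cosine-similarity utilities; the 15% drop ratio follows the embodiment, while the function name and remaining details are illustrative choices.

```python
import spacy
from gensim import corpora, models, similarities

nlp = spacy.load("en_core_web_sm")

def discard_redundant_sentences(original_text, summary_sentence, drop_ratio=0.15):
    """Drop the least summary-relevant sentences of the original text, ranked by TF-IDF cosine similarity."""
    sentences = [s.text.strip() for s in nlp(original_text).sents]
    tokenized = [[t.text.lower() for t in nlp(s) if not t.is_punct] for s in sentences]
    dictionary = corpora.Dictionary(tokenized)
    corpus = [dictionary.doc2bow(toks) for toks in tokenized]
    tfidf = models.TfidfModel(corpus)
    index = similarities.MatrixSimilarity(tfidf[corpus], num_features=len(dictionary))

    summary_bow = dictionary.doc2bow(
        [t.text.lower() for t in nlp(summary_sentence) if not t.is_punct])
    sims = index[tfidf[summary_bow]]            # cosine similarity of each sentence to the summary
    n_drop = int(len(sentences) * drop_ratio)
    drop = set(sims.argsort()[:n_drop])         # indices of the least relevant sentences
    return " ".join(s for i, s in enumerate(sentences) if i not in drop)
```

Discarding the most relevant sentences instead (as in the evidence-discarding rule described below) only changes the indices kept from the argsort ranking.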
Entity replacement may use spaCy's NER labeling tool to extract all of the entities mentioned in the abstract sentences and texts; the extracted entities can be roughly divided into two categories, including named entities such as person, place and organization names. In addition, when the negative abstract sentences are generated, their semantics are changed while the sentences are kept as consistent with linguistic rules as possible, so that during model training the model cannot judge factual consistency merely from whether a sentence is grammatical. In the replacement process, an entity randomly selected from the abstract sentence is replaced with an entity of the same type but different meaning taken from the original text, and the generated abstract is kept as close to natural text as possible.
Sentence antisense first extracts the verbs and adjectives in the abstract sentence, randomly selects one word, and replaces it with an antonym queried from WordNet. If no suitable antonym can be found in WordNet, the abstract sentence is scanned for auxiliary verbs, and "not"/"n't" is added after, or deleted from, a randomly chosen auxiliary verb according to grammar rules, forming an antisense sentence of the original abstract sentence.
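The sentence-antisense rule could be sketched as follows, assuming NLTK's WordNet antonym lookup with the auxiliary-verb fallback described above; the auxiliary list and helper name are illustrative assumptions.

```python
import random
import spacy
from nltk.corpus import wordnet as wn   # requires: nltk.download("wordnet")

nlp = spacy.load("en_core_web_sm")
AUXILIARIES = {"is", "are", "was", "were", "can", "could", "will", "would", "do", "does", "did"}

def make_antonym_sentence(summary_sentence, seed=None):
    """Negate a summary sentence by antonym substitution, falling back to inserting 'not' after an auxiliary."""
    rng = random.Random(seed)
    doc = nlp(summary_sentence)
    # 1) Try to replace a random verb/adjective with a WordNet antonym.
    candidates = []
    for i, tok in enumerate(doc):
        if tok.pos_ in ("VERB", "ADJ"):
            antonyms = [a.name().replace("_", " ")
                        for syn in wn.synsets(tok.text)
                        for lem in syn.lemmas()
                        for a in lem.antonyms()]
            if antonyms:
                candidates.append((i, antonyms))
    tokens = [t.text for t in doc]
    if candidates:
        i, antonyms = rng.choice(candidates)
        tokens[i] = rng.choice(antonyms)
        return " ".join(tokens)
    # 2) Fall back: insert "not" after the first auxiliary verb found.
    for i, t in enumerate(tokens):
        if t.lower() in AUXILIARIES:
            return " ".join(tokens[: i + 1] + ["not"] + tokens[i + 1 :])
    return summary_sentence  # no usable negation point found
```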
Evidence discarding first splits the original text into sentences, then computes the TF-IDF feature vectors of all sentences with the gensim tool, then sorts the sentences of the original text according to the cosine distance between their feature vectors and that of the abstract sentence, and finally discards the two most relevant sentences in the original text.
Based on the determined positive and negative processing rules, a particular entity in the summary dataset to which the rules are to be applied is selected, which may involve determining factors such as type, location, importance of the entity, etc.
(3) And determining a processing result based on the processing entity, the semantics of any piece of data, the positive processing rule and the negative processing rule.
Further semantic analysis of the selected processing entities, to understand the meaning of each processing entity in its context, may involve natural language processing techniques, word embedding models and the like. The processing entities are then operated on according to the positive processing rule and the negative processing rule to generate the processed results, which may include altering the textual representation of a processing entity, adjusting sentence structure, introducing new words, and so on. It should be noted that the determined processing result should still be semantically reasonable, so that the generated result is not incoherent or inconsistent with the grammar and context of natural language.
(4) And replacing the processing entity based on the processing result to generate a training sample.
The processing entities in the abstract data set are replaced with the processing results and combined with the remaining content to form the final training samples.
It should be noted that the generated training samples at least include a positive training sample and a negative training sample, and the content of the training samples is different from the content of the general abstract data set.
In combination with steps (1)-(4) above, for example, in one embodiment the general abstract data set contains the piece of data: "Scientists have discovered in research that the concentration of carbon dioxide in the atmosphere affects the Earth's climate change." At this time, the entity structure and semantics are identified, and entities such as "scientists", "carbon dioxide" and "Earth climate change" can be recognized. A positive processing rule is selected according to the entity structure relation, for example synonym replacement; in this example the processing entity is "scientists", and according to the positive processing rule "scientists" is replaced by "researchers" to obtain the processing result. The finally generated training sample may be: "Researchers have found in research that the concentration of carbon dioxide in the atmosphere affects the Earth's climate change."
For another example, in another embodiment the general abstract data set contains the piece of data: "It is widely recognized that the planets in the solar system move around the sun." At this time, the entity structure and semantics are identified, and entities such as "solar system", "planets" and "sun" can be recognized. A negative processing rule is adopted according to the entity structure relation, for example entity replacement; in this example the processing entity is "solar system", and according to the negative processing rule "solar system" is replaced by "the outer space of the earth" to obtain the processing result. The finally generated training sample may be: "It is widely recognized that the planets in the outer space of the earth move around the sun."
S102, constructing a mixed sample based on the general abstract data set, the positive training sample, the negative training sample and the manually marked domain abstract data set; determining training rounds of a language model, and selecting actual training samples from the mixed samples according to the training rounds, wherein the actual training samples comprise the general abstract data set, positive training samples, negative training samples and manually marked field abstract data sets, the number of the actual training samples is smaller than that of the mixed samples, and the actual training samples corresponding to different training rounds are not identical.
It should be noted that, the manually noted field abstract dataset refers to an abstract dataset generated by a manual annotation mode in a specific field, and this dataset generally includes abstract information for refining and summarizing document contents in text documents in a given field, and these abstracts are made according to understanding and judging after the manual annotator reads the documents. In this embodiment, the related field is the technical field of text abstract evaluation verification of natural language processing.
Specifically, in this embodiment, the small-scale manually annotated data set released with the QAGS method may be adopted for supervised fine-tuning. For example, in one embodiment, the authors publish the collected abstract sentences and the texts output by the language model on an online manual annotation platform for manual labeling; each sample is evaluated by three annotators on average, and the final labeled sample is obtained by voting. In addition, in this embodiment, the small manually annotated data set from the FactCC method may also be used as the evaluation data set.
Specifically, the actual training samples include the general abstract data set, the positive training samples, the negative training samples and the manually annotated domain abstract data set; the number of actual training samples is smaller than the number of mixed samples, and the actual training samples corresponding to different training rounds are not identical. Through this construction process, a rich sample library is built for model training: the number of samples is increased on the basis of the general abstract data set obtained in step S101, the weakly supervised positive and negative processed samples obtained with the corresponding data enhancement modes, and the strongly supervised manually annotated domain abstract data set, while excessive manual annotation is avoided and the annotation workload is reduced. After the construction of the strongly and weakly supervised samples is completed, the invention introduces a contrastive learning mechanism into the abstract consistency evaluation model to improve the effect of contrastive learning and thereby the classification capability of the model, and further strengthens the targeting of the samples to obtain positive and negative samples with a large gap.
S103, determining enhancement modes of all samples in the actual training samples according to sources of the samples in the actual training samples, and constructing enhanced actual training samples, wherein differences between positive examples and negative examples in the enhanced actual training samples are larger than those of the actual training samples before enhancement.
Fig. 2 is a schematic diagram of a method for constructing an enhanced actual training sample according to an exemplary embodiment of the present application, and referring to fig. 2, specifically, determining an enhancement mode of each sample in the actual training sample includes:
(1) And extracting any one sample in the actual training samples, and judging the source of the any one sample.
It should be noted that any one of the samples in the actual training samples may be extracted randomly or sequentially. The determination of the source of any sample may be accomplished by metadata, tags, or other identifiable information in the sample, for example, by examining the tag or file path of the sample to determine its source.
(2) And when the source of any sample is the general abstract data set and the manually marked domain abstract data set, carrying out forward enhancement processing and reverse enhancement processing on the samples in the actual training samples simultaneously to generate positive examples and difficult negative examples of the current turn.
The positive example samples are similar to the original data and factually consistent with it, while the difficult negative example samples are similar to the original data but factually inconsistent with it.
(3) If the source of any sample is the positive training sample, the positive training sample is directly used as the training sample of the round.
(4) If the source of any sample is the negative training sample, the negative training sample is directly used as a simple negative example sample of the round.
This means that the positive training samples and the negative training samples do not require further adjustment or enhancement. In the text processing field, simple negative examples are those which are easily classified as negative examples by the model correctly with respect to the difficult negative examples.
(5) The method comprises the steps of constructing an enhanced actual training sample of the round, wherein the enhanced actual training sample at least comprises a training sample of the round, a positive example sample of the round and a negative example sample of the round, and the negative example sample of the round at least comprises a simple negative example sample and a difficult negative example sample of the round.
On the one hand, by constructing the enhanced actual training samples of the round, it can be ensured that they contain the various samples required for this round of training, i.e. that the coverage of the model's training samples is wide and their diversity is sufficient, which improves the robustness of the language model during training, allows it to adapt better to different types of situations, and improves its generalization capability. On the other hand, the enhanced actual training samples provided by the invention are constructed in both the positive and negative directions through positive and negative enhancement, and the gap between the enhanced samples is larger than before enhancement, so that by enlarging this gap the model can better learn to distinguish the two types of samples, thereby improving the final evaluation capability of the model based on contrastive learning.
It should be noted that, after the actual training sample after the enhancement is constructed, the method further includes:
Optimizing and solving a characterization model based on a characterization model objective function, and increasing the distinction between a positive example sample of the round and a negative example sample of the round;
Wherein, the characterization model specifically comprises:

$$s\big(f(x), f(x^{+})\big) \gg s\big(f(x), f(x^{-})\big)$$

where f is the mapping function to be learned by the model, x represents any sample, $x^{+}$ represents a sample similar to x, $x^{-}$ represents a sample dissimilar to x, and s is a function measuring the similarity between the representations obtained by the mapping function;

The characterization model objective function is:

$$E = \mathbb{E}_{x,\,x^{+},\,x^{-}}\left[-\log\frac{e^{\,s(f(x),\,f(x^{+}))}}{e^{\,s(f(x),\,f(x^{+}))} + e^{\,s(f(x),\,f(x^{-}))}}\right]$$

wherein E represents the characterization model objective function.
After the positive and negative samples are obtained, the similarity between the samples is preferably evaluated with similarity metrics, so that the objective function can be optimized and solved, the gap between the two types of samples is further enlarged, and the classification and evaluation capability of the model is improved.
S104, masking the enhanced actual training samples, training the abstract consistency evaluation model based on the masked samples, wherein masking modes of the enhanced actual training samples of different training rounds are different, and masking modes of the same sample in the different training rounds are different.
It should be noted that, masking the enhanced actual training samples, and training the abstract consistency evaluation model based on the masked samples includes:
(1) And determining a first proportion, a second proportion and a third proportion based on the attribute of the enhanced actual training sample.
Fig. 3 is a schematic diagram of a mask strategy according to an exemplary embodiment of the present application. Referring to fig. 3, a network structure formed by stacking Transformer modules is used, together with a pre-trained BERT language model. The enhanced actual training samples are masked, and a first proportion, a second proportion and a third proportion are determined based on the attributes of the enhanced actual training samples, wherein the first proportion is the proportion of tokens in any sample that are replaced by [MASK], the second proportion is the proportion of tokens in any sample that are replaced by substitute words, and the third proportion is the proportion of tokens that remain unchanged; within any sample, the subwords that receive [MASK] and the subwords that receive replacement-word processing belong to the same word.
For example, in one embodiment the first proportion is 80%, the second proportion is 10%, and the third proportion is 10%. Specifically, in the masking process, a selected token is replaced by the [MASK] token with 80% probability, replaced by a random subword with 10% probability, and kept unchanged with 10% probability.
(2) And segmenting any sample, and determining the relation among all segmented words.
In the word segmentation process, most words are divided into a plurality of sub words by a word segmentation device, and specifically, sequential relationship, dependency relationship, synonymous relationship, near-sense relationship and the like may exist among the words.
(3) And determining a masking mode of any sample based on the relation among the first proportion, the second proportion and the third proportion and each word segmentation.
In combination with the above steps, the masking modes of the actual training samples after the enhancement of different training rounds are different, and the masking modes of the same sample in different training rounds are different, so that a specific training process can include:
(1) And marking the sub-words obtained by dividing any sample.
(2) And carrying out random shielding operation on the subwords in the same word.
(3) And determining target subwords to be predicted in the same word, wherein the subwords which are not masked are used as the target subwords.
(4) And inputting the samples subjected to the random masking operation into a model for training.
In the prior art, BERT randomly masks 15% of all subwords in each text sequence using the [MASK] token and then reconstructs the masked input. Although MLM helps obtain a bidirectional pre-trained model, this approach also leads to a mismatch between pre-training and fine-tuning, since the [MASK] token only exists during the pre-training phase and does not appear during fine-tuning. The mask training mode of the application processes the samples in the mask, replacement and unchanged modes and adopts whole-word masking, which improves the capability of the model. Meanwhile, during tokenization most words are divided into several subwords by the tokenizer; when subwords are randomly masked, often only some of the subwords belonging to a word are masked, so the model can predict the target subwords from the unmasked subwords of the same word, prompt information is retained, and the difficulty of the task is reduced.
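A minimal sketch of the whole-word dynamic masking described above, assuming a HuggingFace BERT tokenizer. The 15% selection rate and the 80/10/10 split follow the embodiment, while seeding by training round (so that the same sample is masked differently in different rounds) and the helper names are illustrative assumptions.

```python
import random
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

def whole_word_mask(text, epoch, mask_ratio=0.15, p_mask=0.8, p_replace=0.1):
    """Select whole words, then give each of their subwords [MASK] (80%), a random
    subword id (10%), or leave it unchanged (10%). Seeding with the training round
    makes the mask pattern differ between rounds for the same sample."""
    rng = random.Random(epoch)
    enc = tokenizer(text, return_tensors="pt")
    input_ids = enc["input_ids"][0].clone()
    word_ids = enc.word_ids()                        # maps each subword to its source word
    words = sorted({w for w in word_ids if w is not None})
    chosen = {w for w in words if rng.random() < mask_ratio}
    for pos, w in enumerate(word_ids):
        if w in chosen:
            r = rng.random()
            if r < p_mask:
                input_ids[pos] = tokenizer.mask_token_id
            elif r < p_mask + p_replace:
                input_ids[pos] = rng.randrange(tokenizer.vocab_size)
            # else: keep the original subword unchanged
    return input_ids

masked_round1 = whole_word_mask("The planets in the solar system move around the sun.", epoch=1)
masked_round2 = whole_word_mask("The planets in the solar system move around the sun.", epoch=2)
```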
It should be noted that, in the training process, the method provided in this embodiment dynamically processes the data of each batch, and specifically, the method is as follows:
$$\tilde{x} = H(x), \qquad \tilde{x}^{+} = H'(x^{+}), \qquad \tilde{x}^{-} = H''(x^{-})$$

where x represents any one sample, $x^{+}$ represents a sample similar to x, $x^{-}$ represents a sample dissimilar to x, and H, H' and H'' represent the masking patterns applied to the samples of different batches. All samples share the same BERT encoder. Specifically, the structure of the abstract consistency evaluation model shown in an exemplary embodiment of the present application is shown in fig. 4.
It should be further noted that training the summary-consistent evaluation model based on the masked samples further includes:
(1) Extracting the sub-word representation of each sub-word in the masked sample based on the encoder.
It should be noted that, extracting the representation of each subword in the masked sample from the output of the encoder may be an indexing operation for each subword's position in the encoder output to obtain the corresponding representation.
(2) And carrying out pooling operation on all the sub-word representations to obtain hidden layer representations.
Pooling operations of sub-word representations are performed to integrate representations of individual sub-words into a hidden layer representation of a fixed dimension for use in subsequent tasks, common pooling operations include average pooling and maximum pooling.
(3) And outputting the hidden layer representation to the contrastive learning nonlinear mapping layer, and extracting the correlation information of the masked samples, wherein the contrastive learning nonlinear mapping layer shares its parameters across all training samples of the round.
The hidden layer representation is output to a contrast learning nonlinear mapping layer, aiming at learning the correlation information of the masked samples (emphasizing the similarity and the difference between the samples) by means of contrast learning (Contrastive Learning).
Specifically, the hidden layer representation is output to the contrast learning nonlinear mapping layer, and the method for extracting the correlation information of the sample after masking is as follows:
$$z = g(h), \qquad z^{+} = g(h^{+}), \qquad z^{-} = g(h^{-})$$

wherein g denotes the nonlinear mapping layer, h, $h^{+}$ and $h^{-}$ are the hidden layer representations of the sample, its positive example and its negative example, and the parameters of g are shared among the three classes of samples.
(4) A learning loss function is calculated based on a comparison learning loss of the comparison learning nonlinear mapping layer and a first loss of the nonlinear mapping layer of the summary consistency assessment model to train the summary consistency assessment model.
Specifically, the method for calculating the learning loss function based on the contrastive learning loss of the contrastive learning nonlinear mapping layer and the first loss of the nonlinear mapping layer of the summary consistency evaluation model may be as follows:
$$\mathcal{L}_{cl} = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{\cos(z_i,\,z_i^{+})/\tau}}{e^{\cos(z_i,\,z_i^{+})/\tau} + e^{\cos(z_i,\,z_i^{-})/\tau}}$$

where N is the total number of samples in the batch during training and $\tau$ is a temperature hyperparameter used to scale the cosine similarity values.
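The encoder, pooling, shared projection layer and contrastive loss described in steps (1)-(4) could be sketched as follows in PyTorch. This is an illustrative approximation under stated assumptions (bert-base-uncased, mean pooling, an InfoNCE-style loss with one positive per anchor and in-batch negatives), not the patent's exact formulation.

```python
import torch
import torch.nn.functional as F
from torch import nn
from transformers import BertModel

class ContrastiveHead(nn.Module):
    """BERT encoder + mean pooling + shared nonlinear projection, roughly as sketched above."""
    def __init__(self, hidden=768, proj=128):
        super().__init__()
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        self.project = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, proj))

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (out * mask).sum(1) / mask.sum(1)   # mean pooling over subword representations
        return self.project(pooled)                  # shared nonlinear mapping layer

def contrastive_loss(z, z_pos, z_neg, tau=0.1):
    """InfoNCE-style loss: pull anchors toward positives, push away from negatives (cosine / temperature)."""
    sim_pos = F.cosine_similarity(z, z_pos) / tau                                   # (N,)
    sim_neg = F.cosine_similarity(z.unsqueeze(1), z_neg.unsqueeze(0), dim=-1) / tau  # (N, N)
    logits = torch.cat([sim_pos.unsqueeze(1), sim_neg], dim=1)
    labels = torch.zeros(z.size(0), dtype=torch.long, device=z.device)              # positive at index 0
    return F.cross_entropy(logits, labels)
```

In a full training step this contrastive term would be combined with the loss of the evaluation model's own nonlinear mapping (classification) layer, as described in step (4) above.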
S105, judging whether the abstract consistency evaluation model requires further training; if so, returning to the step of determining the training round of the language model; otherwise, the training of the abstract consistency assessment model is completed.
Judging whether the abstract consistency evaluation model is trained according to the calculated value of the learning loss function, if the difference value of the learning loss function values calculated in two adjacent iteration periods is smaller than a preset value, judging that the training is completed, otherwise, the training is not completed, and a new round of training is required to be re-entered, and the actual training sample corresponding to the round is re-selected.
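A minimal sketch of this stopping criterion; the threshold value and the helper names select_actual_training_samples and train_one_round are hypothetical.

```python
def training_finished(loss_history, threshold=1e-4):
    """Stop when the loss change between two adjacent training rounds falls below a preset value."""
    if len(loss_history) < 2:
        return False
    return abs(loss_history[-1] - loss_history[-2]) < threshold

# Illustrative loop: re-select the round's actual training samples until the criterion is met.
# while not training_finished(losses):
#     samples = select_actual_training_samples(mixed_samples, epoch)   # hypothetical helper
#     losses.append(train_one_round(model, samples))                   # hypothetical helper
#     epoch += 1
```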
Compared with the prior art, in which masking is completed uniformly over all samples before training, the method provided by the invention masks only the samples of each training round, so no long waiting time is needed and training is accelerated; meanwhile, the masking modes corresponding to different training rounds are different, which increases the diversity of the sample masks and improves the capability of the model with a limited number of samples and without complex masking-algorithm computation.
It should be noted that after the training of the summary consistency assessment model is completed, the method further includes:
(1) And discarding parameters of a nonlinear mapping layer of the abstract consistency evaluation model, wherein the abstract consistency evaluation model only maintains a contrast learning nonlinear mapping layer, and the contrast learning nonlinear mapping layer is connected with an output layer to construct a fine tuning model.
In this way, the summary consistency assessment model can be fine-tuned on the basis of the contrast learning nonlinear mapping layer so as to adapt to specific downstream tasks. Specifically, fig. 5 is a block diagram of a fine tuning model according to an exemplary embodiment of the present application, and referring to fig. 5, the fine tuning model includes an input layer, an encoder, a mapping layer, and an output layer.
(2) And performing two-stage parameter fine tuning on the fine tuning model based on the training sample generated by processing the positive processing rule and the negative processing rule and the manually marked domain abstract data set.
Specifically, in the first stage, the parameters of the fine-tuning model are fine-tuned based on the training samples generated by the positive processing rule and the negative processing rule; in the second stage, the parameters of the model fine-tuned in the first stage are further fine-tuned based on the manually annotated domain abstract data set. The two stages of parameter fine-tuning gradually improve the performance of the model: the first stage focuses on the model's adaptation to the generated samples, and the second stage further fine-tunes on data from the real domain so as to better fit the actual application scenario.
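A schematic sketch of this two-stage fine-tuning schedule; the epoch counts and the injected train_one_round helper are illustrative assumptions, not values from the disclosure.

```python
def two_stage_finetune(model, weak_samples, labeled_domain_samples,
                       train_one_round, epochs_stage1=3, epochs_stage2=3):
    """Two-stage fine-tuning: first on rule-generated (weakly supervised) samples,
    then on the manually annotated domain abstract data set."""
    for _ in range(epochs_stage1):          # stage 1: adapt to the generated positive/negative samples
        train_one_round(model, weak_samples)
    for _ in range(epochs_stage2):          # stage 2: adapt to the real domain data
        train_one_round(model, labeled_domain_samples)
    return model
```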
It should be further noted that after the training of the summary consistency assessment model is completed, the method further includes:
and acquiring original texts and abstract texts, and evaluating the relevance between the abstract texts and the original texts based on the abstract consistency evaluation model to obtain a relevance classification and recognition result.
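For illustration, a possible inference call is sketched below, assuming the fine-tuned model exposes a sequence-classification head over the (abstract, original text) pair; the function name and label order are assumptions.

```python
import torch

@torch.no_grad()
def assess_consistency(model, tokenizer, source_text, summary_text, device="cpu"):
    """Score whether an abstract is factually consistent with its source text."""
    enc = tokenizer(summary_text, source_text, truncation=True, return_tensors="pt").to(device)
    logits = model(**enc).logits                 # assumes a sequence-classification style output
    probs = logits.softmax(dim=-1)[0]
    return {"consistent": probs[1].item(), "inconsistent": probs[0].item()}  # label order assumed
```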
According to the abstract consistency assessment model training method provided by this embodiment, positive processing rules and negative processing rules are applied to the general abstract data set, so that diverse training samples are generated, which can improve the generalization capability of the model and allow it to better adapt to abstract consistency evaluation tasks. A text representation is established through the process of identifying the entity structure relations and semantics in the general abstract data set, which can provide rich information about the text content for subsequent processing and training tasks; each processing entity is replaced with the corresponding processing result to form multiple training sample sets, which helps the model better capture the patterns and rules of the abstract consistency task and improves its generalization capability; and the actual training samples are flexibly selected from the mixed samples, so that the language model can be trained effectively within a limited number of training rounds. Specifically, a general abstract data set is obtained and processed based on the positive processing rule and the negative processing rule to form training samples; mixed samples are constructed based on the general abstract data set, the positive training samples, the negative training samples and the manually annotated domain abstract data set; the enhancement mode of each sample in the mixed samples is determined according to the source of the sample, and enhanced actual training samples are constructed; the enhanced actual training samples are then masked and the abstract consistency evaluation model is trained on the masked samples, with different masking modes applied in different training rounds; finally, whether the abstract consistency evaluation model requires further training is judged, and if so, the step of determining the training round of the language model is returned to; otherwise, the training of the abstract consistency assessment model is completed. In this way, processing the general abstract data set through the positive processing rule and the negative processing rule to form training samples helps introduce sample diversity, so that the model adapts better to different semantics and structures; constructing mixed samples further enriches the model's training data and improves generalization, and constructing enhanced actual training samples further increases the diversity and complexity of the samples so as to better simulate real scenes. In addition, mask training enables the abstract consistency evaluation model to handle cases where part of the information is missing, which better improves the robustness of the model; and whether the model has finished training is judged according to the performance indicators of the abstract consistency evaluation model or the trend across training rounds, ensuring that the model learns sufficiently during training.
Corresponding to the embodiment of the method for training the abstract consistency assessment model, the application also provides an embodiment of a device for training the abstract consistency assessment model.
Fig. 6 is a schematic structural diagram of a first embodiment of a training device for a summary consistency assessment model according to the present application, please refer to fig. 6, where the device includes an obtaining module 610, a constructing module 620, a processing module 630, a training module 640, and a judging module 650; wherein,
The obtaining module 610 is configured to obtain a general abstract data set, and process the general abstract data set based on the positive processing rule and the negative processing rule to form a training sample; each abstract data in the general abstract data set comprises a plurality of abstract sentences, each abstract sentence comprises an abstract sentence and an original text corresponding to the abstract sentence, the training sample at least comprises a positive training sample and a negative training sample, and the content of the training sample is different from that of the general abstract data set;
The building module 620 is configured to build a hybrid sample based on the generic abstract dataset, the positive training sample, the negative training sample, and the artificially labeled domain abstract dataset; determining training rounds of a language model, and selecting actual training samples from the mixed samples according to the training rounds, wherein the actual training samples comprise the general abstract data set, positive training samples, negative training samples and manually marked field abstract data sets, the number of the actual training samples is smaller than that of the mixed samples, and the actual training samples corresponding to different training rounds are not identical;
the processing module 630 is configured to determine an enhancement mode of each sample in the actual training samples according to a source of the samples in the actual training samples, and construct an enhanced actual training sample;
The training module 640 is configured to mask the enhanced actual training samples, train the abstract consistency evaluation model based on the masked samples, and mask the enhanced actual training samples in different training rounds in different manners;
The judging module 650 is configured to judge whether the abstract consistency evaluation model requires further training, and if so, return to the step of determining the training round of the language model; otherwise, the training of the abstract consistency assessment model is completed.
The apparatus provided in this embodiment may be used to perform the steps of the method shown in fig. 1, and the implementation principle and implementation procedure are similar to those described above, and are not repeated here.
The implementation process of the functions and roles of each unit in the above device is specifically shown in the implementation process of the corresponding steps in the above method, and will not be described herein again.
For the device embodiments, reference is made to the description of the method embodiments for the relevant points, since they essentially correspond to the method embodiments. The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present application. Those of ordinary skill in the art will understand and implement the present application without undue burden.
The foregoing description of the preferred embodiments of the application is not intended to be limiting, but rather to enable any modification, equivalent replacement, improvement or the like to be made within the spirit and principles of the application.

Claims (9)

1. A method for training a summary consistency assessment model, the method comprising:
Acquiring a general abstract data set, and processing the general abstract data set based on a positive processing rule and a negative processing rule to form a training sample; each abstract data in the general abstract data set comprises a plurality of abstract sentences, each abstract sentence comprises an abstract sentence and an original text corresponding to the abstract sentence, the training sample at least comprises a positive training sample and a negative training sample, and the content of the training sample is different from that of the general abstract data set;
Constructing a mixed sample based on the general abstract data set, the positive training sample, the negative training sample and the manually marked domain abstract data set; determining the training round of a language model, and selecting actual training samples from the mixed sample according to the training round, wherein the actual training samples comprise samples from the general abstract data set, the positive training samples, the negative training samples and the manually marked domain abstract data set, the number of the actual training samples is smaller than the number of the mixed samples, and the actual training samples corresponding to different training rounds are not identical;
Determining the enhancement mode of each sample in the actual training samples according to the source of each sample, and constructing the enhanced actual training samples, wherein the difference between the positive example samples and the negative example samples in the enhanced actual training samples is larger than that in the actual training samples before enhancement; after the enhanced actual training samples are constructed, optimizing and solving a characterization model based on a characterization model objective function, so as to increase the distinction between the positive example samples and the negative example samples of the current round;
masking the enhanced actual training samples, training the abstract consistency evaluation model based on the masked samples, wherein masking modes of the enhanced actual training samples of different training rounds are different;
Judging whether the abstract consistency evaluation model requires further training; if so, returning to the step of determining the training round of the language model; otherwise, completing the training of the abstract consistency assessment model;
Masking the enhanced actual training samples, and training the abstract consistency evaluation model based on the masked samples comprises the following steps:
Determining a first proportion, a second proportion and a third proportion based on the attribute of the enhanced actual training sample;
Performing word segmentation on any sample, and determining the relation among the word segments;
determining the masking mode of any one sample based on the first proportion, the second proportion, the third proportion and the relation among the word segments;
wherein the first proportion is the proportion of word segments in the sample that are masked with [MASK], the second proportion is the proportion of word segments in the sample that are replaced with substitute words, the third proportion is the proportion of word segments that remain unchanged, and the subwords that are masked and the subwords that undergo word replacement in any one sample belong to the same word segment.
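One concrete reading of the masking step above is sketched below: the three proportions are applied at the level of whole word segments, so that all sub-words of a selected word receive the same treatment. The proportion values, the substitute vocabulary and this whole-word interpretation are assumptions for illustration.

```python
import random

def mask_sample(word_segments, first=0.15, second=0.10, third=0.75,
                substitute_vocab=("text", "summary", "model"), seed=0):
    """word_segments: a list of words, each given as its sub-word tokens,
    e.g. [["abstr", "##act"], ["consistency"]].

    first  -> proportion of word segments replaced by [MASK]
    second -> proportion of word segments replaced by substitute words
    third  -> proportion of word segments left unchanged
    Sub-words belonging to one word segment are treated together.
    """
    assert abs(first + second + third - 1.0) < 1e-6
    rng = random.Random(seed)
    masked = []
    for subwords in word_segments:
        r = rng.random()
        if r < first:                          # mask every sub-word of this word
            masked.append(["[MASK]"] * len(subwords))
        elif r < first + second:               # replace every sub-word with another word
            masked.append([rng.choice(substitute_vocab) for _ in subwords])
        else:                                  # keep the word segment unchanged
            masked.append(list(subwords))
    return masked
```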
2. The method according to claim 1, wherein the positive processing rule and the negative processing rule generate training samples with the same semantics and with opposite semantics, respectively, through at least character substitution and character rearrangement processing;
the processing the universal abstract data set based on the positive processing rule and the negative processing rule to form a training sample at least comprises:
identifying entity structure relations and semantics in any piece of data in the general abstract data set;
Determining processing entities of the positive processing rule and the negative processing rule according to the entity structure relation;
determining a processing result based on the processing entity, the semantics of any piece of data, the positive processing rule and the negative processing rule;
and replacing the processing entity based on the processing result to generate a training sample.
3. The method of claim 2, wherein the positive processing rules include at least random puncturing, word reordering, synonym replacement, and redundancy discarding methods; the negative processing rules include at least entity substitution, sentence antisense, and evidence discarding methods.
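To make the two rule families concrete, the sketch below implements one method from each list: synonym replacement as a positive rule and entity substitution as a negative rule. The rule tables, the entity pool and the single-substitution choice are illustrative assumptions; the patent's actual rule sets and entity recognizer are not specified here.

```python
import random

# Illustrative tables; stand-ins for the patent's rule sets.
SYNONYMS = {"increase": "rise", "decrease": "fall"}
ENTITY_POOL = ["Beijing", "Shanghai", "2023", "2024"]

def positive_synonym_replacement(tokens):
    """Positive rule: swap words for synonyms so the semantics stay the same."""
    return [SYNONYMS.get(t, t) for t in tokens]

def negative_entity_substitution(tokens, entities, seed=0):
    """Negative rule: replace one recognized entity with a different one,
    yielding a factually inconsistent sample."""
    rng = random.Random(seed)
    out = list(tokens)
    for i, t in enumerate(out):
        if t in entities:
            out[i] = rng.choice([e for e in ENTITY_POOL if e != t])
            break                              # one substitution already breaks consistency
    return out
```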
4. The method of claim 1, wherein determining the enhancement mode of each of the actual training samples comprises:
extracting any one sample from the actual training samples, and judging the source of that sample;
when the source of the sample is the general abstract data set or the manually marked domain abstract data set, carrying out forward enhancement processing and reverse enhancement processing on the sample simultaneously, so as to generate positive example samples and difficult negative example samples of the current round;
if the source of any sample is the positive training sample, directly taking the positive training sample as the training sample of the round;
If the source of any sample is the negative training sample, directly taking the negative training sample as a simple negative example sample of the round;
and constructing the enhanced actual training samples of the current round, wherein the enhanced actual training samples comprise at least the training samples of the current round, the positive example samples of the current round and the negative example samples of the current round, and the negative example samples of the current round comprise at least the simple negative example samples and the difficult negative example samples of the current round.
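The source-dependent dispatch in claim 4 can be sketched as below. The sample structure, the source labels and the two placeholder enhancement helpers are assumptions; in the patent the forward/reverse enhancement would use the positive and negative processing rules.

```python
def forward_enhance(sample):   # placeholder for a positive processing rule
    return {**sample, "label": 1}

def reverse_enhance(sample):   # placeholder for a negative processing rule
    return {**sample, "label": 0}

def enhance_round_samples(actual_samples):
    """Split this round's samples by source into training samples,
    positive examples, and (simple + hard) negative examples."""
    round_train, round_pos, round_neg = [], [], []
    for sample in actual_samples:
        src = sample["source"]
        if src in ("general", "domain"):              # general or manually labeled domain data
            round_pos.append(forward_enhance(sample))     # this round's positive example
            round_neg.append(reverse_enhance(sample))     # this round's hard negative example
        elif src == "positive":
            round_train.append(sample)                # used directly as a training sample
        elif src == "negative":
            round_neg.append(sample)                  # used directly as a simple negative example
    return round_train, round_pos, round_neg
```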
5. The method of claim 4, wherein after constructing the enhanced actual training samples, further comprising:
Optimizing and solving a characterization model based on a characterization model objective function;
Wherein, the characterization model is specifically:

$$s\bigl(f(x),\, f(x^{+})\bigr) \gg s\bigl(f(x),\, f(x^{-})\bigr)$$

wherein $f$ is a mapping function to be learned by the model, $x$ represents any sample, $x^{+}$ represents a sample similar to $x$, $x^{-}$ represents a sample dissimilar to $x$, and $s(\cdot,\cdot)$ is a function measuring the similarity between the characterizations obtained by the mapping function;
The characterization model objective function is:

$$E = \mathbb{E}_{x,\,x^{+},\,x^{-}}\Bigl[\, s\bigl(f(x), f(x^{+})\bigr) - s\bigl(f(x), f(x^{-})\bigr) \Bigr]$$

wherein E represents the characterization model objective function, which is maximized to enlarge the distinction between similar and dissimilar samples.
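A minimal PyTorch sketch of one plausible reading of this objective follows: maximize the gap between the similarity of an anchor to its positive example and its similarity to a negative example. Cosine similarity and the exact form of E are assumptions made for illustration.

```python
import torch.nn.functional as F

def characterization_objective(f, x, x_pos, x_neg):
    """f: the characterization (mapping) model; x, x_pos, x_neg: batches of
    anchor, similar and dissimilar samples.  Returns E, to be maximized so
    that positives and negatives become more distinguishable."""
    z, z_pos, z_neg = f(x), f(x_pos), f(x_neg)
    s_pos = F.cosine_similarity(z, z_pos, dim=-1)
    s_neg = F.cosine_similarity(z, z_neg, dim=-1)
    return (s_pos - s_neg).mean()
```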
6. The method of claim 1, wherein the training the digest consistency assessment model based on the masked samples comprises:
extracting the sub-word representation of each sub-word in the masked sample based on an encoder;
Pooling all the sub-word representations to obtain hidden layer representations;
outputting the hidden layer representation to a contrastive learning nonlinear mapping layer and extracting correlation information of the masked samples, wherein the contrastive learning nonlinear mapping layer shares its parameters across the various training samples input in the current round;
calculating a learning loss function based on the contrastive learning loss of the contrastive learning nonlinear mapping layer and a first loss of the nonlinear mapping layer of the abstract consistency assessment model, so as to train the abstract consistency assessment model.
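The forward pass described in this claim can be sketched as follows, assuming a Hugging-Face-style encoder whose output exposes last_hidden_state; the hidden sizes, mean pooling and the loss weighting are illustrative assumptions.

```python
import torch.nn as nn

class ConsistencyScorer(nn.Module):
    """Sub-word encoding, pooling, a shared contrastive projection head,
    and a task head whose losses are later combined."""

    def __init__(self, encoder, hidden=768, proj=128, num_classes=2):
        super().__init__()
        self.encoder = encoder                       # any sub-word encoder (e.g. BERT-like)
        self.contrastive_head = nn.Sequential(       # parameters shared by all samples in the round
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, proj))
        self.task_head = nn.Linear(hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        subword_repr = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        hidden_repr = subword_repr.mean(dim=1)       # pool the sub-word representations
        return self.contrastive_head(hidden_repr), self.task_head(hidden_repr)

def combined_loss(contrastive_loss, task_loss, alpha=0.5):
    """Weighted combination of the contrastive loss and the task head's
    first loss; the weighting scheme is an assumption."""
    return alpha * contrastive_loss + (1 - alpha) * task_loss
```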
7. The method of claim 6, wherein after the training of the summary consistency assessment model is completed, further comprising:
discarding the parameters of the nonlinear mapping layer of the abstract consistency evaluation model, wherein the abstract consistency evaluation model retains only the contrastive learning nonlinear mapping layer, and the contrastive learning nonlinear mapping layer is connected with an output layer to construct a fine-tuning model;
and performing two-stage parameter fine-tuning on the fine-tuning model based on the training samples generated by the positive processing rule and the negative processing rule and on the manually marked domain abstract data set.
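A sketch of this fine-tuning construction, reusing the scorer from the previous sketch: the old task head is discarded, the contrastive projection head is kept and connected to a fresh output layer. The two-stage schedule shown (output layer first, then all parameters) is one common interpretation, not a detail stated in the claim.

```python
import torch.nn as nn

class FineTuneModel(nn.Module):
    """Keeps the trained encoder and contrastive head, drops the old task
    head, and adds a new output layer for consistency classification."""

    def __init__(self, trained, proj=128, num_classes=2):
        super().__init__()
        self.encoder = trained.encoder
        self.contrastive_head = trained.contrastive_head   # retained
        self.output_layer = nn.Linear(proj, num_classes)    # newly attached output layer

    def forward(self, input_ids, attention_mask):
        h = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state.mean(dim=1)
        return self.output_layer(self.contrastive_head(h))

def two_stage_parameter_groups(model):
    """Return the parameter groups for a two-stage fine-tuning schedule:
    stage 1 trains only the output layer, stage 2 trains everything."""
    stage_1 = list(model.output_layer.parameters())
    stage_2 = list(model.parameters())
    return stage_1, stage_2
```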
8. The method of claim 1, wherein after the training of the summary consistency assessment model is completed, further comprising:
and acquiring an original text and an abstract text, and evaluating the relevance of the original text and the abstract text based on the abstract consistency evaluation model to obtain a relevance classification recognition result.
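Once trained, the model from the previous sketches could be applied as below. The paired tokenizer call and softmax readout assume a Hugging-Face-style tokenizer and the fine-tuning model sketched above; they are illustrative, not the patent's stated interface.

```python
import torch

def assess_consistency(model, tokenizer, original_text, abstract_text):
    """Score how consistent an abstract is with its original text and
    return class probabilities for the relevance classification result."""
    enc = tokenizer(original_text, abstract_text, return_tensors="pt",
                    truncation=True, padding=True)
    with torch.no_grad():
        logits = model(enc["input_ids"], enc["attention_mask"])
    return torch.softmax(logits, dim=-1)
```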
9. The abstract consistency assessment device is characterized by comprising an acquisition module, a construction module, a processing module, a training module and a judging module; wherein,
The acquisition module is used for acquiring a general abstract data set, and processing the general abstract data set based on a positive processing rule and a negative processing rule to form training samples; each piece of abstract data in the general abstract data set comprises a plurality of abstract sentence entries, each entry comprises an abstract sentence and the original text corresponding to that abstract sentence, the training samples comprise at least a positive training sample and a negative training sample, and the content of the training samples is different from that of the general abstract data set;
The construction module is used for constructing a mixed sample based on the general abstract data set, the positive training sample, the negative training sample and the manually marked domain abstract data set; determining the training round of a language model, and selecting actual training samples from the mixed sample according to the training round, wherein the actual training samples comprise samples from the general abstract data set, the positive training samples, the negative training samples and the manually marked domain abstract data set, the number of the actual training samples is smaller than the number of the mixed samples, and the actual training samples corresponding to different training rounds are not identical;
The processing module is used for determining the enhancement mode of each sample in the actual training samples according to the source of each sample, and constructing the enhanced actual training samples, wherein the difference between the positive example samples and the negative example samples in the enhanced actual training samples is larger than that in the actual training samples before enhancement; after the enhanced actual training samples are constructed, a characterization model is optimized and solved based on a characterization model objective function, so as to increase the distinction between the positive example samples and the negative example samples of the current round;
The training module is used for masking the enhanced actual training samples and training the abstract consistency evaluation model based on the masked samples, wherein the enhanced actual training samples are masked in a different manner in each training round;
The judging module is used for judging whether the abstract consistency evaluation model requires further training; if so, returning to the step of determining the training round of the language model; otherwise, completing the training of the abstract consistency assessment model;
Masking the enhanced actual training samples, and training the abstract consistency evaluation model based on the masked samples comprises the following steps:
Determining a first proportion, a second proportion and a third proportion based on the attribute of the enhanced actual training sample;
Performing word segmentation on any sample, and determining the relation among the word segments;
determining the masking mode of any one sample based on the first proportion, the second proportion, the third proportion and the relation among the word segments;
wherein the first proportion is the proportion of word segments in the sample that are masked with [MASK], the second proportion is the proportion of word segments in the sample that are replaced with substitute words, the third proportion is the proportion of word segments that remain unchanged, and the subwords that are masked and the subwords that undergo word replacement in any one sample belong to the same word segment.
CN202410321411.4A 2024-03-20 2024-03-20 Abstract consistency assessment model training method and device Active CN117909494B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410321411.4A CN117909494B (en) 2024-03-20 2024-03-20 Abstract consistency assessment model training method and device


Publications (2)

Publication Number Publication Date
CN117909494A CN117909494A (en) 2024-04-19
CN117909494B true CN117909494B (en) 2024-06-07

Family

ID=90687631

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410321411.4A Active CN117909494B (en) 2024-03-20 2024-03-20 Abstract consistency assessment model training method and device

Country Status (1)

Country Link
CN (1) CN117909494B (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111950269A (en) * 2020-08-21 2020-11-17 清华大学 Text statement processing method and device, computer equipment and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105069470A (en) * 2015-07-29 2015-11-18 腾讯科技(深圳)有限公司 Classification model training method and device
CN112883722A (en) * 2021-03-04 2021-06-01 中山大学 Distributed text summarization method based on cloud data center
CN114818729A (en) * 2022-04-28 2022-07-29 阳光保险集团股份有限公司 Method, device and medium for training semantic recognition model and searching sentence
CN117709339A (en) * 2023-12-15 2024-03-15 广州安思创信息技术有限公司 Language model enhancement method and system based on live broadcasting room user behavior network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Document-level relation extraction based on evidence sentences and graph convolutional networks; 马建红; Computer Engineering; 2023-08-31; Vol. 49, No. 8; full text *
A factual consistency evaluation model for text summarization with multiple attention mechanisms; 魏楚元; Computer Engineering and Applications; 2023-02-28; Vol. 59, No. 7; full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant