Detailed Description
So that the manner in which the features and elements of the disclosed embodiments can be understood in detail, a more particular description of the disclosed embodiments, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings. In the following description of the technology, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the disclosed embodiments. However, one or more embodiments may be practiced without these details. In other instances, well-known structures and devices may be shown in simplified form for simplicity of illustration.
The terms "first," "second," and the like in the description, in the claims, and in the above-described drawings of embodiments of the present disclosure are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that terms so used are interchangeable under appropriate circumstances, such that the embodiments of the present disclosure described herein can be implemented. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions.
The term "plurality" means two or more unless otherwise specified.
In the embodiments of the present disclosure, the character "/" indicates that the preceding and following objects are in an "or" relationship. For example, A/B represents: A or B.
The term "and/or" describes an associative relationship between objects and indicates that three relationships may exist. For example, "A and/or B" represents: A alone, B alone, or both A and B.
With reference to Fig. 1, an embodiment of the present disclosure provides a method for text generation quality evaluation, including:
step S101, acquiring a reference text and a generated text, wherein the generated text is acquired according to the reference text;
step S102, inputting the reference text and the generated text into a preset evaluation model to obtain an evaluation index, wherein the evaluation model is obtained according to a sample text with a topic similarity label and a generated sentence identification label;
and step S103, evaluating the text generation quality according to the evaluation index.
By adopting the method for evaluating the text generation quality, the evaluation model is trained through the sample text with the topic similarity label and the generated sentence identification label, so that the trained evaluation model can comprehensively evaluate the text generation quality through two aspects of the topic similarity and the generated sentence identification.
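The overall flow of steps S101 to S103 can be sketched as follows. The function and model names here are illustrative and not from the disclosure; the stand-in model simply returns fixed index values for demonstration.

```python
# A minimal sketch of steps S101-S103, assuming a hypothetical evaluation
# model that returns a topic similarity index and a generated sentence
# identification index for a (reference text, generated text) pair.
def evaluation_index(model, reference_text, generated_text):
    # Step S102: the evaluation model yields the two indices for the pair.
    topic_similarity_index, generated_sentence_index = model(reference_text, generated_text)
    # The two indices are combined by multiplication into one evaluation index.
    return topic_similarity_index * generated_sentence_index

# Stand-in for a trained evaluation model (fixed outputs, demonstration only).
fake_model = lambda ref, gen: (0.852, 0.831)
index = evaluation_index(fake_model, "reference text", "generated text")
```

The combination by multiplication follows the later paragraphs of this section; a real evaluation model would compute both indices from the text pair rather than return constants.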
Optionally, obtaining the evaluation model according to the sample text with the topic similarity label and the generated sentence identification label includes: obtaining a sample text; obtaining a topic similarity label of the sample text; obtaining a generated sentence identification label of the sample text; and training a preset neural network model by using the sample text with the topic similarity label and the generated sentence identification label to obtain the evaluation model.
Therefore, the evaluation model is obtained through collaborative training on the sample text with the topic similarity label and the generated sentence identification label, so that the trained evaluation model can comprehensively evaluate the text generation quality through topic similarity and generated sentence identification, which improves the reliability of text generation quality evaluation.
Optionally, the neural network model comprises a BERT (Bidirectional Encoder Representations from Transformers) pre-training model. In some embodiments, the neural network model is trained by using the sample text with the topic similarity label and the generated sentence identification label to obtain the evaluation model. Compared with a traditional statistical evaluation model, the evaluation model trained from the neural network model can analyze the topic features and the structural features of texts, and evaluating the topic similarity of texts through these topic and structural features makes the evaluation of topic similarity between texts more reliable.
Optionally, the sample text comprises a first text pair and a second text pair; obtaining the sample text includes: acquiring a reference sample text and a generated sample text, and generating the generated sample text according to the reference sample text; combining the reference sample text with a generated sample text corresponding to the reference sample text to obtain a first text pair; and combining the two different reference sample texts to obtain a second text pair.
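The construction of the two kinds of text pairs above can be sketched as follows. `generate_fn` is a hypothetical stand-in for the text generation model that produces a generated sample text from a reference sample text; all names are illustrative.

```python
from itertools import combinations

# A sketch of building the training sample pairs described above.
def build_sample_pairs(reference_texts, generate_fn):
    # First text pairs: each reference sample text combined with the
    # generated sample text produced from it.
    first_pairs = [(ref, generate_fn(ref)) for ref in reference_texts]
    # Second text pairs: combinations of two different reference sample texts.
    second_pairs = list(combinations(reference_texts, 2))
    return first_pairs, second_pairs

refs = ["review A", "review B", "review C"]
first, second = build_sample_pairs(refs, lambda ref: "generated from " + ref)
```

With three reference texts this yields three first pairs and three second pairs; the second pairs contain only distinct reference texts, as the disclosure requires.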
Optionally, the obtaining of the topic similarity label of the sample text includes: acquiring a first similarity of a reference sample text and a generated sample text in a first text pair, and acquiring a topic similarity label of the first text pair according to the first similarity; and acquiring a second similarity between the reference sample texts in the second text pair, and acquiring the topic similarity label of the second text pair according to the second similarity. Therefore, the sample texts have the topic similarity labels, the evaluation model trained on the sample texts with the topic similarity labels can generate topic similarity indexes based on the topic similarity labels, the topic similarity degree between the texts is evaluated according to the topic similarity indexes, and the accuracy of the evaluation model for evaluating the text generation quality is improved.
Optionally, obtaining the first similarity between the reference sample text and the generated sample text in the first text pair includes: inputting the reference sample text and the generated sample text into a preset topic model for topic analysis to obtain a first probability distribution vector and a second probability distribution vector of the reference sample text and the generated sample text, respectively, over a plurality of preset topics; acquiring a first KL distance (Kullback-Leibler divergence) between the first probability distribution vector and the second probability distribution vector; and determining the first KL distance as the first similarity.
Optionally, the first KL distance between the first probability distribution vector and the second probability distribution vector is obtained by calculating

D_KL(p‖q) = Σ_{i=1}^{n} p(x_i) · log( p(x_i) / q(x_i) )

wherein p(x_i) is the first probability distribution vector, q(x_i) is the second probability distribution vector, D_KL(p‖q) is the first KL distance between the first probability distribution vector p(x_i) and the second probability distribution vector q(x_i), n is the dimension of the first probability distribution vector p(x_i) and the second probability distribution vector q(x_i), and n is a positive integer.
Optionally, the smaller the numerical value of the first KL distance D_KL(p‖q) is, the higher the similarity between the first probability distribution vector p(x_i) and the second probability distribution vector q(x_i) is.
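The first KL distance above can be transcribed directly; `p` and `q` below are the first and second probability distribution vectors over the preset topics.

```python
import math

# The first KL distance D_KL(p‖q) over topic probability distribution
# vectors of equal dimension n.
def kl_distance(p, q):
    assert len(p) == len(q)  # both vectors have the same dimension n
    # Terms with p(x_i) = 0 contribute nothing to the sum.
    return sum(p_i * math.log(p_i / q_i) for p_i, q_i in zip(p, q) if p_i > 0)

# Identical topic distributions give a distance of 0 (highest similarity);
# diverging distributions give a larger distance (lower similarity).
d_same = kl_distance([0.5, 0.3, 0.2], [0.5, 0.3, 0.2])
d_diff = kl_distance([0.9, 0.05, 0.05], [0.1, 0.45, 0.45])
```

This illustrates the relation stated above: the smaller the KL distance, the higher the topic similarity between the two vectors.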
Optionally, obtaining the second similarity between the reference sample texts in the second text pair includes: inputting the two reference sample texts into the preset topic model for topic analysis to obtain a third probability distribution vector and a fourth probability distribution vector of the two reference sample texts, respectively, over the plurality of preset topics; acquiring a second KL distance between the third probability distribution vector and the fourth probability distribution vector; and determining the second KL distance as the second similarity.
Optionally, the second KL distance between the third probability distribution vector and the fourth probability distribution vector is obtained by calculating

D′_KL(p′‖q′) = Σ_{i=1}^{n′} p′(x_i) · log( p′(x_i) / q′(x_i) )

wherein p′(x_i) is the third probability distribution vector, q′(x_i) is the fourth probability distribution vector, D′_KL(p′‖q′) is the second KL distance between the third probability distribution vector p′(x_i) and the fourth probability distribution vector q′(x_i), n′ is the dimension of the third probability distribution vector p′(x_i) and the fourth probability distribution vector q′(x_i), and n′ is a positive integer.
Optionally, the smaller the numerical value of the second KL distance D′_KL(p′‖q′) is, the higher the similarity between the third probability distribution vector p′(x_i) and the fourth probability distribution vector q′(x_i) is.
Optionally, obtaining the topic similarity label of the first text pair according to the first similarity includes: determining the topic similarity label of the first text pair as topic-similar in a case where the first similarity satisfies a first preset condition; and determining the topic similarity label of the first text pair as topic-dissimilar in a case where the first similarity does not satisfy the first preset condition.
Optionally, in a case where the first similarity satisfies the first preset condition, determining the topic similarity label of the first text pair as topic-similar includes: in a case where the first similarity of the first text pair is smaller than or equal to a preset first threshold, determining that the topics of the reference sample text and the generated sample text in the first text pair are similar, and determining the topic similarity label of the first text pair as topic-similar.
Optionally, in a case where the first similarity does not satisfy the first preset condition, determining the topic similarity label of the first text pair as topic-dissimilar includes: in a case where the first similarity of the first text pair is greater than a preset second threshold, determining that the topics of the reference sample text and the generated sample text in the first text pair are not similar, and determining the topic similarity label of the first text pair as topic-dissimilar.
Optionally, obtaining the topic similarity label of the second text pair according to the second similarity includes: determining the topic similarity label of the second text pair as topic-similar in a case where the second similarity satisfies a second preset condition; and determining the topic similarity label of the second text pair as topic-dissimilar in a case where the second similarity does not satisfy the second preset condition.
Optionally, in a case where the second similarity satisfies the second preset condition, determining the topic similarity label of the second text pair as topic-similar includes: in a case where the second similarity of the second text pair is smaller than or equal to a preset third threshold, determining that the topics of the reference sample texts in the second text pair are similar, and determining the topic similarity label of the second text pair as topic-similar.
Optionally, in a case where the second similarity does not satisfy the second preset condition, determining the topic similarity label of the second text pair as topic-dissimilar includes: in a case where the second similarity of the second text pair is greater than a preset fourth threshold, determining that the topics of the reference sample texts in the second text pair are not similar, and determining the topic similarity label of the second text pair as topic-dissimilar.
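The threshold rule above can be sketched as a single labeling function. The threshold value 0.5 below is illustrative only; the disclosure requires some preset threshold but does not fix its value.

```python
# A sketch of assigning a topic similarity label from a KL distance; the
# threshold is a hypothetical value, not one specified by the disclosure.
def topic_similarity_label(kl_distance_value, threshold=0.5):
    # A smaller KL distance means the two topic distributions are closer.
    if kl_distance_value <= threshold:
        return "topic-similar"
    return "topic-dissimilar"

label_close = topic_similarity_label(0.12)
label_far = topic_similarity_label(1.80)
```

The same rule applies to both the first text pair (first similarity against the first and second thresholds) and the second text pair (second similarity against the third and fourth thresholds).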
Optionally, obtaining the generated sentence identification label of the sample text includes: determining the generated sentence identification label of the sample text as having a generated sentence in a case where the sample text includes a generated sample text; and determining the generated sentence identification label of the sample text as not having a generated sentence in a case where the sample text does not include a generated sample text. Therefore, the sample text has the generated sentence identification label, the evaluation model trained on the sample text with the generated sentence identification label can output a generated sentence identification index, the possibility that a text is identified as text generated by a model is evaluated according to the generated sentence identification index, and the accuracy of the evaluation model in evaluating text generation quality is improved.
Optionally, training a preset neural network model by using the sample text with the topic similarity label and the generated sentence identification label to obtain the evaluation model includes: obtaining a first loss value corresponding to the topic similarity label in the trained neural network model; obtaining a second loss value corresponding to the generated sentence identification label in the trained neural network model; and obtaining an overall loss value by calculating

L_all = L_dist + λ · L_topic

wherein L_all is the overall loss value, L_topic is the first loss value, L_dist is the second loss value, and λ is a hyperparameter with λ less than or equal to 1. Optionally, the larger λ is, the greater the influence of the topic similarity label on the trained evaluation model, and thus on the evaluation index when evaluating the text generation quality.
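The overall loss combination can be transcribed directly; `lam` stands for the hyperparameter λ, and the value 0.5 used below is illustrative only.

```python
# The overall loss L_all = L_dist + λ · L_topic, with λ ≤ 1. Larger λ
# weights the topic similarity objective more heavily during training.
def overall_loss(l_dist, l_topic, lam):
    assert lam <= 1  # the disclosure bounds the hyperparameter by 1
    return l_dist + lam * l_topic

l_all = overall_loss(0.4, 0.6, lam=0.5)
```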
Optionally, obtaining the first loss value corresponding to the topic similarity label in the trained neural network model includes: passing the reference sample text in the first text pair through the trained neural network model to obtain a first sample hidden layer output vector; passing the generated sample text in the first text pair through the trained neural network model to obtain a second sample hidden layer output vector; passing the two different reference sample texts in the second text pair through the trained neural network model to obtain a third sample hidden layer output vector and a fourth sample hidden layer output vector, respectively; taking an arithmetic mean of the combination of the first sample hidden layer output vector and the third sample hidden layer output vector to obtain a first sample arithmetic mean; taking an arithmetic mean of the combination of the second sample hidden layer output vector and the fourth sample hidden layer output vector to obtain a second sample arithmetic mean; transforming the first sample arithmetic mean to the output dimension corresponding to the topic similarity label through a multilayer perceptron (MLP) to obtain a fifth sample hidden layer output vector; transforming the second sample arithmetic mean to the output dimension corresponding to the topic similarity label through the MLP to obtain a sixth sample hidden layer output vector; splicing the fifth sample hidden layer output vector and the sixth sample hidden layer output vector to obtain a sample splicing vector; performing a linear transformation on the sample splicing vector through a fully connected layer (FC) to obtain a sample topic similarity feature corresponding to the sample splicing vector; performing binary classification on the sample topic similarity feature through a sigmoid function to obtain a sample topic similarity function corresponding to the topic similarity label; and calculating the loss of the sample topic similarity function according to a cross entropy loss function to obtain the first loss value.
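The first-loss pipeline above can be sketched with toy vectors. This is a heavily simplified sketch: the MLP and the fully connected layer are stood in for by a single fixed linear map over the spliced mean vectors, whereas a real model would learn these weights; all values are illustrative.

```python
import math

# Arithmetic mean of a list of equal-length hidden layer output vectors.
def mean_vector(vectors):
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Cross entropy loss for the binary topic similarity label (1 = similar).
def binary_cross_entropy(prediction, label):
    eps = 1e-12
    return -(label * math.log(prediction + eps)
             + (1 - label) * math.log(1.0 - prediction + eps))

def first_loss(hidden_vectors_a, hidden_vectors_b, label, weights, bias=0.0):
    # Mean-pool each group of hidden vectors, splice the two means, apply a
    # linear map (the toy FC layer), squash with sigmoid, and score with the
    # cross entropy loss against the topic similarity label.
    spliced = mean_vector(hidden_vectors_a) + mean_vector(hidden_vectors_b)
    logit = sum(w * f for w, f in zip(weights, spliced)) + bias
    return binary_cross_entropy(sigmoid(logit), label)

# Toy hidden vectors and weights, for demonstration only.
loss = first_loss(
    hidden_vectors_a=[[0.2, 0.4], [0.6, 0.8]],
    hidden_vectors_b=[[0.1, 0.3], [0.5, 0.7]],
    label=1,
    weights=[1.0, 1.0, 1.0, 1.0],
)
```

With these toy values the logit is positive and the label is 1, so the loss is small; a mismatched prediction and label would yield a larger loss, which is what drives training.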
Optionally, obtaining the second loss value corresponding to the generated sentence identification label in the trained neural network model includes: acquiring a sample semantic expression symbol generated by the trained neural network model at the input layer; passing the sample semantic expression symbol through the trained neural network model to obtain a sample semantic output vector; transforming the sample semantic output vector to the output dimension corresponding to the generated sentence identification label through the multilayer perceptron (MLP) to obtain a sample semantic vector; performing binary classification on the sample semantic vector through a sigmoid function to obtain a sample generated sentence identification function corresponding to the generated sentence identification label; and calculating the loss of the sample generated sentence identification function according to the cross entropy loss function to obtain the second loss value.
Optionally, training a preset neural network model by using the sample text with the topic similarity label and the generated sentence identification label to obtain the evaluation model includes: inputting the sample text with the topic similarity label and the generated sentence identification label into the preset neural network model for training, and recording the overall loss value of the model being trained in each preset period; acquiring the lowest value among the recorded overall loss values; in a case where the overall loss value of the model being trained is not lower than that lowest value for M consecutive preset periods, determining that the accuracy of the neural network model is no longer improving; and stopping the model training and determining the trained model as the evaluation model, wherein M is a positive integer. Optionally, M is greater than or equal to 10.
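The stopping rule above can be sketched as follows; the function name and the loss sequence are illustrative, and a real training loop would compute each period's overall loss rather than read it from a list.

```python
# A sketch of the early stopping rule: training stops once the overall loss
# has not fallen below its recorded minimum for M consecutive preset periods.
def stopping_period(overall_losses, m):
    lowest = float("inf")
    periods_without_improvement = 0
    for period, loss in enumerate(overall_losses):
        if loss < lowest:
            lowest = loss  # a new lowest overall loss value is recorded
            periods_without_improvement = 0
        else:
            periods_without_improvement += 1
            if periods_without_improvement >= m:
                return period  # accuracy no longer improves; stop here
    return len(overall_losses) - 1  # loss record ended before the rule fired

# With M = 3, the loss stops improving after the second period, so training
# stops three periods later.
stop = stopping_period([1.0, 0.8, 0.9, 0.9, 0.9], m=3)
```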
Optionally, inputting the reference text and the generated text into the preset evaluation model to obtain the evaluation index includes: passing the reference text through the evaluation model to obtain a first hidden layer output vector; passing the generated text through the evaluation model to obtain a second hidden layer output vector; obtaining a semantic expression symbol generated by the evaluation model at the input layer; passing the semantic expression symbol through the evaluation model to obtain a semantic output vector; taking an arithmetic mean of the first hidden layer output vector to obtain a first arithmetic mean; taking an arithmetic mean of the second hidden layer output vector to obtain a second arithmetic mean; transforming the first arithmetic mean to the output dimension corresponding to the topic similarity label through the multilayer perceptron (MLP) to obtain a third hidden layer output vector; transforming the second arithmetic mean to the output dimension corresponding to the topic similarity label through the MLP to obtain a fourth hidden layer output vector; splicing the third hidden layer output vector and the fourth hidden layer output vector to obtain a splicing vector; performing a linear transformation on the splicing vector through the fully connected layer (FC) to obtain a topic similarity feature corresponding to the splicing vector; performing binary classification on the topic similarity feature through a sigmoid function to obtain a topic similarity function corresponding to the topic similarity label; acquiring a first confidence of the topic similarity function on the positive example corresponding to the topic similarity label; determining the first confidence as a topic similarity index; transforming the semantic output vector to the output dimension corresponding to the generated sentence identification label through the MLP to obtain a semantic vector; performing binary classification on the semantic vector through a sigmoid function to obtain a generated sentence identification function corresponding to the generated sentence identification label; acquiring a second confidence of the generated sentence identification function on the positive example corresponding to the generated sentence identification label; determining the second confidence as a generated sentence identification index; and obtaining the evaluation index according to the topic similarity index and the generated sentence identification index.
In one embodiment, the reference text and the generated text are input to an evaluation model based on a BERT pre-training model, the evaluation model generating input data at the input layer, the input data comprising a [CLS] symbol, the reference text, and the generated text, wherein the [CLS] symbol represents the semantic features of the reference text and the generated text; a [CLS] output vector of the [CLS] symbol after passing through the evaluation model is acquired, and the [CLS] output vector is determined as the semantic output vector.
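The input sequence for the BERT-based evaluation model can be sketched as below. The use of [SEP] separators follows BERT's usual sentence-pair convention and is an assumption; the disclosure only states that the input data comprise the [CLS] symbol, the reference text, and the generated text.

```python
# A sketch of the input sequence for a BERT-style evaluation model. The
# [SEP] separators are assumed from BERT's standard sentence-pair format.
def build_input_sequence(reference_text, generated_text):
    return "[CLS] " + reference_text + " [SEP] " + generated_text + " [SEP]"

sequence = build_input_sequence("reference text", "generated text")
```

In a real BERT model the sequence would be tokenized, and the hidden state at the [CLS] position after the final layer would serve as the semantic output vector described above.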
Optionally, obtaining an evaluation index according to the topic similarity index and the generated sentence identification index includes: and obtaining an evaluation index by multiplying the topic similarity index and the generated sentence identification index.
Optionally, the evaluating the text generation quality according to the evaluation index includes: determining the text generation quality to be excellent in the case where the evaluation index is greater than or equal to a first set threshold; determining the text generation quality to be good under the condition that the evaluation index is greater than or equal to a second set threshold and smaller than a first set threshold; determining the text generation quality as poor in the case where the evaluation index is smaller than a second set threshold; wherein the second set threshold is less than the first set threshold. Optionally, the first set threshold is 0.7. Optionally, the second set threshold is 0.4.
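The grading rule of step S103 can be transcribed directly, using the optional threshold values 0.7 and 0.4 given above.

```python
# Grading the evaluation index per step S103. The default thresholds follow
# the optional values stated above; other preset values could be used.
def grade_quality(evaluation_index, first_threshold=0.7, second_threshold=0.4):
    assert second_threshold < first_threshold
    if evaluation_index >= first_threshold:
        return "excellent"
    if evaluation_index >= second_threshold:
        return "good"
    return "poor"

grades = [grade_quality(0.85), grade_quality(0.55), grade_quality(0.2)]
```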
In one embodiment, the reference text is: "This is a very cost-effective mobile phone. The price is affordable, the cost performance is extremely high, and it is suitable for everyone, especially for the elderly: the pages are clear and simple without being dazzling. The screen is clear and smooth, sliding and unlocking are convenient, and the sound quality is excellent. The colors are vivid. The pictures are outstanding; among thousand-yuan phones it ranks first." A generated text is obtained from the reference text through the text generation model to be evaluated, the generated text being: "This mobile phone has high cost performance. It is simple, affordable, and extremely cost-effective, suitable for all scenarios, and particularly suitable for the elderly; the screen pages are displayed clearly, sliding is smooth, unlocking is quick, and the sound quality is good. It takes the best pictures; among thousand-yuan phones it ranks first."
The reference text and the generated text are input into the evaluation model. The "topic-similar" value of the topic similarity label is determined as the positive example of the topic similarity function, so the higher the topic similarity between the reference text and the generated text, the higher the numerical value of the topic similarity index; the topic similarity index output by the evaluation model is 0.852. The "not having a generated sentence" value of the generated sentence identification label is determined as the positive example of the generated sentence identification function, so the lower the possibility that the generated text is identified as text generated by a model, the higher the numerical value of the generated sentence identification index; the generated sentence identification index output by the evaluation model is 0.831. The evaluation index obtained by multiplying the topic similarity index and the generated sentence identification index is 0.708, and the text generation quality of the text generation model to be evaluated is determined to be excellent.
As shown in Fig. 2, an apparatus for text generation quality evaluation according to an embodiment of the present disclosure includes a processor (processor) 100 and a memory (memory) 101 storing program instructions. Optionally, the apparatus may further include a communication interface (Communication Interface) 102 and a bus 103. The processor 100, the communication interface 102, and the memory 101 may communicate with each other via the bus 103. The communication interface 102 may be used for information transfer. The processor 100 may call the program instructions in the memory 101 to perform the method for text generation quality evaluation of the above-described embodiments.
Further, the program instructions in the memory 101 may be implemented in the form of software functional units and stored in a computer readable storage medium when sold or used as a stand-alone product.
The memory 101, which is a computer-readable storage medium, may be used for storing software programs, computer-executable programs, such as program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 100 executes functional applications and data processing, i.e., implements the method for text generation quality evaluation in the above-described embodiments, by executing program instructions/modules stored in the memory 101.
The memory 101 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal device, and the like. In addition, the memory 101 may include a high-speed random access memory, and may also include a nonvolatile memory.
By adopting the device for text generation quality evaluation provided by the embodiment of the disclosure, the evaluation model is trained through the sample text with the topic similarity label and the generated sentence identification label, so that the trained evaluation model can comprehensively evaluate the text generation quality through two aspects of the topic similarity and the generated sentence identification.
Compared with the conventional evaluation method, the evaluation method not only evaluates the topic similarity of the reference text and the corresponding generated text, but also considers whether the generated text is easy to be identified as the text generated by the model, and improves the reliability of text generation quality evaluation.
Optionally, the device comprises a computer, smartphone, tablet, or the like.
Embodiments of the present disclosure provide a computer-readable storage medium storing computer-executable instructions configured to perform the above-described method for text generation quality assessment.
The disclosed embodiments provide a computer program product comprising a computer program stored on a computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above-described method for text generation quality assessment.
The computer-readable storage medium described above may be a transitory computer-readable storage medium or a non-transitory computer-readable storage medium.
The technical solution of the embodiments of the present disclosure may be embodied in the form of a software product, where the computer software product is stored in a storage medium and includes one or more instructions to enable a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method of the embodiments of the present disclosure. The aforementioned storage medium may be a non-transitory storage medium, comprising: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes; it may also be a transient storage medium.
The above description and drawings sufficiently illustrate embodiments of the disclosure to enable those skilled in the art to practice them. Other embodiments may incorporate structural, logical, electrical, process, and other changes. The examples merely typify possible variations. Individual components and functions are optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in or substituted for those of others. Furthermore, the words used in the specification are words of description only and are not intended to limit the claims. As used in the description of the embodiments and the claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application is meant to encompass any and all possible combinations of one or more of the associated listed. Furthermore, the terms "comprises" and/or "comprising," when used in this application, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Without further limitation, an element defined by the phrase "comprising an …" does not exclude the presence of other like elements in a process, method or apparatus that comprises the element. In this document, each embodiment may be described with emphasis on differences from other embodiments, and the same and similar parts between the respective embodiments may be referred to each other. For methods, products, etc. of the embodiment disclosures, reference may be made to the description of the method section for relevance if it corresponds to the method section of the embodiment disclosure.
Those of skill in the art would appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software may depend upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments. It can be clearly understood by the skilled person that, for convenience and brevity of description, the specific working processes of the system, the apparatus and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments disclosed herein, the disclosed methods, products (including but not limited to devices, apparatuses, etc.) may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units may be merely a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to implement the present embodiment. In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. In the description corresponding to the flowcharts and block diagrams in the figures, operations or steps corresponding to different blocks may also occur in different orders than disclosed in the description, and sometimes there is no specific order between the different operations or steps. For example, two sequential operations or steps may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.