CN115796125A - Text generation method, model training method and device

Info

Publication number: CN115796125A (granted as CN115796125B)
Application number: CN202310078329.9A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: text, word, value, output text, inputting
Inventors: 耿瑞莹, 石翔, 李亮, 黎槟华, 李永彬
Applicant and assignee: Alibaba Damo Institute Hangzhou Technology Co Ltd
Legal status: Granted; Active

Abstract

The application provides a text generation method, a model training method, and a corresponding apparatus. The method comprises: obtaining a table to be processed; generating a plurality of character groups based on the table to be processed, wherein the plurality of character groups comprise text groups, and each text group contains, in a preset order, a key text in the table to be processed, a value word corresponding to the key text, the forward-order position of the value word in the value text, and the reverse-order position of the value word in the value text; inputting the character groups into an encoder for encoding to obtain a plurality of encoding vectors; inputting the plurality of encoding vectors into a text content extraction model for text content extraction to obtain a first output text; and inputting the plurality of encoding vectors and the first output text into a text splicing model for text splicing to obtain a target output text corresponding to the table to be processed. The method and apparatus can generate content based on the table to be processed and produce fluent text.

Description

Text generation method, model training method and device
Technical Field
The application relates to the technical field of computers, in particular to a text generation method, a model training method and a device.
Background
Table-to-text generation is a technology that produces a natural-language description of the information elements in a given structured table. It can help users quickly grasp the key information in a table and is widely applied in scenarios such as biography generation, weather broadcasting, and news event reporting. How to convert a table into high-quality text is therefore an important problem worthy of research.
In the related art, text is generated by processing a table with an end-to-end framework, but the fluency of the generated text cannot be guaranteed.
Disclosure of Invention
Aspects of the application provide a text generation method, a model training method, and an apparatus, so as to improve the fluency of text generated from a table.
A first aspect of an embodiment of the present application provides a text generation method, including: acquiring a table to be processed; generating a plurality of character groups based on the table to be processed, wherein the plurality of character groups comprise text groups, and each text group comprises, in a preset order, a key text in the table to be processed, a value word corresponding to the key text, a forward-order position of the value word in the value text, and a reverse-order position of the value word in the value text; inputting the plurality of character groups into an encoder for encoding to obtain a plurality of encoding vectors, wherein the encoding vectors correspond one to one with the character groups; inputting the plurality of encoding vectors into a text content extraction model for text content extraction to obtain a first output text, wherein the first output text comprises at least one value word in the table to be processed; and inputting the plurality of encoding vectors and the first output text into a text splicing model for text splicing to obtain a target output text corresponding to the table to be processed, wherein the target output text comprises the first output text and characters in a preset word stock.
A second aspect of the embodiments of the present application provides a text generation method, which is applied to a terminal device, and the text generation method includes: acquiring a table to be processed; sending the table to be processed to a server; and receiving a target output text sent by the server, wherein the target output text is determined by the server according to the text generation method of the first aspect.
A third aspect of the embodiments of the present application provides a text generation apparatus, including:
the acquisition module is used for acquiring a table to be processed;
a generating module, configured to generate a plurality of character groups based on the table to be processed, where the plurality of character groups include: the text groups comprise key texts in the table to be processed, value words corresponding to the key texts, forward-order positions of the value words in the value texts and reverse-order positions of the value words in the value texts according to a preset sequence;
the encoding module is used for inputting the character groups into the encoder for encoding processing to obtain a plurality of encoding vectors, and the encoding vectors correspond to the character groups one by one;
the extraction module is used for inputting the plurality of coding vectors into the text content extraction model for text content extraction to obtain a first output text, and the first output text comprises at least one value word in the table to be processed;
and the splicing module is used for inputting the plurality of coding vectors and the first output text into the text splicing model for text splicing to obtain a target output text corresponding to the to-be-processed table, wherein the target output text comprises the first output text and characters in a preset word stock.
A fourth aspect of the embodiments of the present application provides an electronic device, including: a processor, a memory and a computer program stored on the memory and executable on the processor, the processor implementing the text generation method according to the first or second aspect when executing the computer program.
A fifth aspect of embodiments of the present application provides a computer-readable storage medium storing a computer program, which, when executed by a processor, causes the processor to implement the text generation method according to the first aspect or the second aspect.
The method and the apparatus are applied to scenarios of generating content from tables: a table to be processed is obtained; a plurality of character groups are generated based on the table to be processed, the plurality of character groups comprising text groups, each text group containing, in a preset order, a key text in the table to be processed, a value word corresponding to the key text, the forward-order position of the value word in the value text, and the reverse-order position of the value word in the value text; the plurality of character groups are input into an encoder for encoding to obtain a plurality of encoding vectors, the encoding vectors corresponding one to one with the character groups; the plurality of encoding vectors are input into a text content extraction model for text content extraction to obtain a first output text, the first output text comprising at least one value word in the table to be processed; and the plurality of encoding vectors and the first output text are input into the text splicing model for text splicing to obtain a target output text corresponding to the table to be processed, the target output text comprising the first output text and characters in a preset word stock. Content can thus be generated based on the table to be processed, producing fluent text.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a diagram of an application scenario provided in an exemplary embodiment of the present application;
FIG. 2 is a flowchart illustrating steps of a method for generating text according to an exemplary embodiment of the present application;
FIG. 3 is a schematic diagram of a plurality of character sets provided by an exemplary embodiment of the present application;
FIG. 4 is a flowchart illustrating steps of another text generation method provided in an exemplary embodiment of the present application;
fig. 5 is a flowchart illustrating a text generation method according to an exemplary embodiment of the present application;
FIG. 6 is a diagram of a second word predictor provided in an exemplary embodiment of the present application;
FIG. 7 is a flowchart illustrating steps of a method for training a model provided in an exemplary embodiment of the present application;
fig. 8 is a block diagram illustrating a structure of a text generating apparatus according to an exemplary embodiment of the present application;
fig. 9 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only a few embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Generating word descriptions from the structured data in a table can be done in two ways: one is end-to-end generation based on an encoder-decoder framework, the other is a two-stage method. In both, autoregressive generation is slow and difficult to apply in practical scenarios. A non-autoregressive method can greatly improve decoding efficiency, but because all tokens are predicted in parallel in a single pass, it is difficult for a non-autoregressive method to guarantee the consistency and fluency of the generated text.
To address these problems, the present application designs a two-stage non-autoregressive prediction model (a text content extraction model and a text splicing model). The prediction of the second-stage text splicing model is built on the prediction result of the first stage, so the semantic dependencies between the characters predicted by the first-stage text content extraction model are captured, which greatly improves the fluency of the predicted target output text.
In the present embodiment, the execution device of the text generation method is not limited. Optionally, the text generation method may be implemented wholly by means of a cloud computing system; for example, it may be applied to a cloud server, running the various models with cloud resources. Apart from the cloud, the text generation method may also be applied to server-side devices such as a conventional server or a server array.
In addition, referring to fig. 1, an application scenario of the present application is illustrated. The terminal device 11 sends the table to be processed to the server 12, on which the models are deployed. The table to be processed is processed by the models in the server 12 to obtain a target output text describing the data in the table, such as "Li Wen is an actor and a model" in fig. 1, and the target output text is then sent to the terminal device 11 and provided to the user.
Fig. 1 is only an exemplary application scenario of the present application; the present application may also be applied to other text generation scenarios, which are not limited herein.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 2 is a flowchart illustrating steps of a text generation method according to an exemplary embodiment of the present application. As shown in fig. 2, the text generation method specifically includes the following steps:
s201, obtaining a table to be processed.
The table to be processed can be sent by the terminal device. A table is a visual mode of communication and a means of organizing and arranging data. Referring to fig. 1, the personal information table shown there is an example of a table to be processed.
S202, generating a plurality of character groups based on the table to be processed.
Wherein the plurality of character groups include text groups, and each text group contains, in a preset order, a key text in the table to be processed, a value word corresponding to the key text, the forward-order position of the value word in the value text, and the reverse-order position of the value word in the value text. In the present application, a key text and a value word constitute a key-value pair: the key text is the key of the key-value pair and is in text form; the value word is the value of the key-value pair and is in word form.
In the embodiment of the application, each text group is a character group and consists, in order, of a value word w, a key text f, a forward-order position P^+, and a reverse-order position P^-. A value word may be a word or a character. Each text group has a sequence number, such as r_i, so the table T to be processed can be represented as T = {r1, r2, r3, ..., rn}, where n is a positive integer, and each r_i = {w_i, f_i, P_i^+, P_i^-}.
Referring to fig. 3, the plurality of character groups in fig. 3 are generated from the table to be processed in fig. 1 and include 5 text groups, specifically r1{Li Wen, name, 1, 1}, r2{Wang Wu, spouse, 1, 1}, r3{actor, occupation, 1, 2}, r4{model, occupation, 2, 1}, and r5{22, age, 1, 1}. Consider the table to be processed together with the text groups r3 and r4. In the table to be processed, the key text is "occupation" and the value text corresponding to that key text is {actor, model}. The value word "actor" is first in forward order and second in reverse order within the value text, so its forward-order position is 1 and its reverse-order position is 2; likewise, "model" is second in forward order and first in reverse order, so its forward-order position is 2 and its reverse-order position is 1.
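As an illustration of this grouping step, the following Python sketch (not the patent's implementation; the field names are assumptions) builds the character groups from a key-value table:

    # Illustrative sketch of S202: building character groups from a key-value
    # table. Field names are hypothetical stand-ins for w_i, f_i, P_i^+, P_i^-.
    def build_character_groups(table):
        """table: dict mapping key text -> list of value words (the value text)."""
        groups = []
        for key_text, value_words in table.items():
            n = len(value_words)
            for idx, word in enumerate(value_words):
                groups.append({
                    "value_word": word,      # w_i
                    "key_text": key_text,    # f_i
                    "forward_pos": idx + 1,  # P_i^+, 1-based position in value text
                    "reverse_pos": n - idx,  # P_i^-, 1-based position from the end
                })
        return groups

    table = {"name": ["Li Wen"], "spouse": ["Wang Wu"],
             "occupation": ["actor", "model"], "age": ["22"]}
    print(build_character_groups(table))
    # e.g. {'value_word': 'actor', 'key_text': 'occupation',
    #       'forward_pos': 1, 'reverse_pos': 2}

For a single-word value text such as {Li Wen}, both positions are 1, matching the text groups shown in fig. 3.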
And S203, inputting the plurality of character groups into an encoder to perform encoding processing, and obtaining a plurality of encoding vectors.
Wherein, the code vector and the character group are in one-to-one correspondence.
In the embodiment of the present application, specifically, in the encoder, four trained embedding matrices are used to convert the text group r_i: one embedding matrix converts the value word w_i into a value vector e_{w_i}; one converts the key text f_i into a key vector e_{f_i}; one converts the forward-order position P_i^+ into a forward-order vector e_{P_i^+}; and one converts the reverse-order position P_i^- into a reverse-order vector e_{P_i^-}. These embedded vectors are then concatenated to obtain a concatenated vector [e_{w_i}; e_{f_i}; e_{P_i^+}; e_{P_i^-}], where ";" denotes concatenation of the embedded vectors. The concatenated vector is then linearly projected to obtain the projection vector e_i of each text group, as expressed by the following formula (1):

    e_i = W_e [e_{w_i}; e_{f_i}; e_{P_i^+}; e_{P_i^-}] + b_e    (1)

where W_e and b_e are pre-trained parameters (the exact symbols are rendered as images in the original publication). Further, a Transformer (a neural network model) is used in the encoder to encode the projection vectors into a context sequence representation H^e = {h_1^e, ..., h_n^e}, where H^e denotes the plurality of encoding vectors and h_i^e is one of the encoding vectors. The projection vector e_i and the encoding vector h_i^e correspond one to one, i.e., the encoding vector h_i^e corresponds to the character group r_i.
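The encoding step can be pictured with the following PyTorch sketch, a minimal rendering of formula (1) plus the Transformer encoding under assumed vocabulary sizes and dimensions (the patent publishes its exact parameters only as images):

    # Illustrative sketch of the encoder in S203; sizes and dimensions are
    # assumptions, not values from the patent.
    import torch
    import torch.nn as nn

    class TableEncoder(nn.Module):
        def __init__(self, vocab=10000, keys=200, max_pos=50, dim=128):
            super().__init__()
            self.word_emb = nn.Embedding(vocab, dim)    # w_i     -> e_{w_i}
            self.key_emb = nn.Embedding(keys, dim)      # f_i     -> e_{f_i}
            self.fwd_emb = nn.Embedding(max_pos, dim)   # P_i^+   -> e_{P_i^+}
            self.rev_emb = nn.Embedding(max_pos, dim)   # P_i^-   -> e_{P_i^-}
            self.proj = nn.Linear(4 * dim, dim)         # formula (1) projection
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8,
                                               batch_first=True)
            self.transformer = nn.TransformerEncoder(layer, num_layers=4)

        def forward(self, w, f, p_fwd, p_rev):
            # Concatenate the four embeddings, project to e_i, encode to H^e.
            e = torch.cat([self.word_emb(w), self.key_emb(f),
                           self.fwd_emb(p_fwd), self.rev_emb(p_rev)], dim=-1)
            return self.transformer(self.proj(e))       # one h_i^e per r_i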
And S204, inputting the plurality of coding vectors into the text content extraction model for text content extraction to obtain a first output text.
The first output text comprises at least one value word in the table to be processed, and the value words are distributed in the first output text in sequence.
In the embodiment of the present application, the text content extraction model is a pre-trained non-autoregressive model. The text content extraction model can extract the value words of the table to be processed that need to form the target output text. For example, referring to fig. 1 or 3, the value words of the table to be processed include: Li Wen, Wang Wu, actor, model, and 22. If the text content extraction model is trained for describing occupation, the plurality of encoding vectors are input into the text content extraction model for text content extraction, and the obtained first output text is {Li Wen, actor, model}. If the text content extraction model is trained for describing spouse, the obtained first output text is {Li Wen, Wang Wu}. If the text content extraction model is trained for describing age, the obtained first output text is {Li Wen, 22}. And if the text content extraction model is trained for describing spouse, occupation, and age, the obtained first output text is {Li Wen, Wang Wu, actor, model, 22}.
In the embodiment of the application, it can be understood that the text content extraction model extracts value words from the table to be processed and sorts the extracted value words in a certain order to obtain the first output text, so the first output text remains consistent with the text in the table to be processed.
And S205, inputting the plurality of coding vectors and the first output text into the text splicing model for text splicing to obtain a target output text corresponding to the table to be processed.
The target output text comprises the first output text and characters in a preset character library. In the embodiment of the present application, the text stitching model is also a pre-trained non-autoregressive model. The text stitching model may predict a connection word between any two value words in the first output text, where the connection word is a word in a preset lexicon.
Illustratively, the first output text is {Li Wen, actor, model}, and the target output text is {Li Wen is an actor and model}. The words "is", "an", and "and" in the target output text are obtained by prediction of the text splicing model; after the text splicing model combines them with the first output text, the target output text is obtained.
In the embodiment of the application, the text content is extracted in the text content extraction model, so that a first output text which keeps consistency with data in a to-be-processed table can be obtained. In addition, after the first output text is processed by the text splicing model, the fluency of the obtained target output text can be improved.
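For orientation, the two stages compose as in the following sketch (all function names are hypothetical stand-ins for the modules described above):

    # Illustrative composition of S201-S205; names are hypothetical.
    def generate_text(table, encoder, extractor, splicer):
        groups = build_character_groups(table)   # S202: character groups
        H = encoder(groups)                      # S203: encoding vectors H^e
        first_output = extractor(H)              # S204: value words only
        return splicer(H, first_output)          # S205: fluent target text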
Referring to fig. 4, a flowchart of steps of another text generation method provided in an exemplary embodiment of the present application is shown. As shown in fig. 4, the text generation method specifically includes the following steps:
s401, obtaining a table to be processed.
The specific implementation process of this step refers to S201, and is not described herein again.
S402, generating a plurality of character groups based on the table to be processed.
The specific implementation process of this step refers to S202, which is not described herein again.
And S403, inputting the plurality of character groups into an encoder for encoding to obtain a plurality of encoding vectors.
The specific implementation process of this step refers to S203. It should be added that the plurality of character groups further include a first identification character group and a second identification character group. The first identification character group and the second identification character group have the same format as the text groups, and each of them is itself a character group. Referring to fig. 3, the first identification character group may be r0 = {[BOS], [BOS]} and the second identification character group may be r(n+1) = {[EOS], [EOS]}.
In addition, the first identification character group is spliced before the text groups and the second identification character group after them, so the table to be processed is represented as T = {r0, r1, r2, r3, ..., rn, r(n+1)}. After T is processed by formula (1) and the Transformer encoding, the projection vector corresponding to the first identification character group is e_0 and that corresponding to the second identification character group is e_{n+1}, and the context sequence of the plurality of encoding vectors is represented as H^e = {h_0^e, h_1^e, ..., h_n^e, h_{n+1}^e}, where h_0^e is the encoding vector corresponding to the first identification character group and h_{n+1}^e is the encoding vector corresponding to the second identification character group.
S404, inputting the coding vectors corresponding to the first identification character group and the second identification character group into a decoder for decoding processing to obtain corresponding identification decoding data.
Referring to fig. 5, the text content extraction model includes: a decoder, a placeholder predictor, and a first word predictor. In fig. 5, the same name is the same module, i.e., all decoders in fig. 5 are one decoder.
In the embodiment of the application, the encoding vector h_0^e of the first identification character group and the encoding vector h_{n+1}^e of the second identification character group are combined to obtain an identification vector {h_0^e, h_{n+1}^e}. The identification vector is input into the decoder to obtain the corresponding identification decoding data, which comprise the decoded data z_0 of the first identification character (corresponding to the encoding vector h_0^e) and the decoded data z_{n+1} of the second identification character (corresponding to the encoding vector h_{n+1}^e). (The symbol z stands in here for notation that is rendered as images in the original publication.)
S405, the identifier decoding data are input into the placeholder predictor, the number of first placeholders between the first identifier character and the second identifier character is predicted in the placeholder predictor, and first placeholders with the number of the first placeholders are added between the first identifier character and the second identifier character to obtain a placeholder sequence.
Specifically, the first placeholder quantity is determined in the placeholder predictor using the following formula (2):

    pi^l = softmax(W_l [z_0; z_{n+1}]),    l = argmax(pi^l)    (2)

where softmax is the logistic regression function, W_l is a pre-trained projection matrix, pi^l denotes the probability distribution over the number of placeholders between the first identification character [BOS] and the second identification character [EOS], and l, the number of placeholders with the maximum probability, is the first placeholder quantity. (The original formula is rendered as an image; the form shown here is reconstructed from the surrounding description.) Then l first placeholders [PLH] are inserted between the first identification character [BOS] and the second identification character [EOS] to obtain a placeholder sequence y_1.

By way of example, referring to fig. 5, after {h_0^e, h_{n+1}^e} is processed by the decoder and the placeholder predictor, if the first placeholder quantity is 3, the placeholder sequence y_1 = {[BOS][PLH][PLH][PLH][EOS]} is obtained.
S406, inputting the placeholder sequence and the plurality of coded vectors into a decoder for decoding processing to obtain placeholder decoded data of the first placeholder.
Wherein the plurality of encoding vectors H^e and the placeholder sequence y_1 are input into the decoder, and the resulting placeholder decoding data of each first placeholder is denoted z_i^p. Illustratively, for y_1 = {[BOS][PLH][PLH][PLH][EOS]} in fig. 5, the placeholder decoding data corresponding to the first [PLH] is z_1^p, that corresponding to the second [PLH] is z_2^p, and that corresponding to the third [PLH] is z_3^p.
S407, inputting the placeholder decoding data and the plurality of coding vectors into a first word predictor, determining confidence scores between the placeholders and the coding vectors in the first word predictor, and replacing the first placeholder in the placeholder sequence with a value word with the highest confidence score aiming at the first placeholder to obtain a second output text.
Specifically, in the first word predictor, the confidence score between each first placeholder and each encoding vector is first determined using the following formula (3):

    pi^p_{i,j} = softmax_j((z_i^p)^T W_p h_j^e),    s_{i,j} = max(pi^p_{i,j})    (3)

where softmax is the logistic regression function, W_p is a pre-trained projection matrix, pi^p_{i,j} denotes the probability distribution of the confidence between the ith first placeholder [PLH] and the jth value word, and s_{i,j}, the confidence with the maximum probability, is the confidence between the ith first placeholder [PLH] and the jth value word. (As with formula (2), this form is reconstructed from the description; the original formula is rendered as an image.)

Illustratively, referring to fig. 5, there are 3 first placeholders and 5 value words (whose corresponding encoding vectors are h_1^e through h_5^e). The confidence between each first placeholder and each value word is calculated; the calculation process and results are shown in Table 1.

TABLE 1 (confidence scores between each first placeholder and each value word; rendered as an image in the original publication)
Further, in this embodiment of the application, after determining the confidence of each first placeholder and each value word, for each first placeholder, the value word with the highest confidence score is used to replace the first placeholder in the placeholder sequence, so as to obtain the second output text.
Illustratively, referring to Table 1, for the first placeholder [PLH], the highest confidence corresponds to the value word "Li Wen", so the first placeholder is replaced with "Li Wen". For the second first placeholder [PLH], the highest confidence corresponds to the value word "actor", so it is replaced with "actor". For the third first placeholder [PLH], the highest confidence corresponds to the value word "model", so it is replaced with "model". A second output text y_2 = {[BOS][Li Wen][actor][model][EOS]} is thereby obtained. Referring to fig. 5, after the placeholder sequence y_1 = {[BOS][PLH][PLH][PLH][EOS]} is processed by the decoder and the first word predictor, the second output text y_2 = {[BOS][Li Wen][actor][model][EOS]} is obtained.
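A minimal sketch of the first word predictor, under the reconstructed reading of formula (3) as a bilinear score between each placeholder state and each value-word encoding vector:

    # Illustrative sketch of S407; the bilinear form is an assumption based on
    # the reconstructed formula (3).
    import torch
    import torch.nn as nn

    class WordPredictor(nn.Module):
        def __init__(self, dim=128):
            super().__init__()
            self.W_p = nn.Linear(dim, dim, bias=False)

        def forward(self, z_plh, h_values, value_words):
            # z_plh: (num_placeholders, dim); h_values: (num_value_words, dim)
            scores = self.W_p(z_plh) @ h_values.T  # confidence logits
            pi = torch.softmax(scores, dim=-1)     # pi^p_{i,j}
            picks = pi.argmax(dim=-1)              # highest-confidence word
            return [value_words[j] for j in picks.tolist()]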
S408, determining the first output text according to the second output text.
In an alternative embodiment, the second output text is directly determined as the first output text for the subsequent text splicing model.
In another alternative embodiment, the text content extraction model further includes a second word predictor for determining the first output text based on the second output text, including: determining an embedded vector of a value word in the second output text; inputting the embedded vector, the placeholder decoding data and the plurality of coding vectors into a decoder for decoding processing to obtain first decoding data of median words in a second output text; and inputting the first decoding data and the plurality of coding vectors into a second word predictor, determining confidence scores between the value words and the coding vectors in the second output text in the second word predictor, and replacing corresponding characters in the second output text with the value words with the highest confidence scores aiming at the value words in the second output text to obtain a first output text.
Wherein, referring to FIG. 6, the placeholder decoding data z_1^p of the first [PLH], z_2^p of the second [PLH], and z_3^p of the third [PLH] are input into the first word predictor to obtain a second output text y_2 = {[BOS][Li Wen][actor][actor][EOS]}; it can be seen that the value word "actor" is repeated. The embedding vector of each value word in the second output text (e_1, e_3 and e_3) is then determined; the embedding vectors (e_1, e_3 and e_3), the placeholder decoding data (z_1^p, z_2^p, z_3^p), and the plurality of encoding vectors (H^e) are input into the decoder for decoding processing to obtain the first decoding data (h'_1, h'_2 and h'_3) of the value words in the second output text.
Further, in the embodiment of the present application, the text output by the second word predictor may be taken as the first output text. For example, in FIG. 6 the text output by the second word predictor is y'_2 = {[BOS][Li Wen][actor][model][EOS]}; the text y'_2 = {[BOS][Li Wen][actor][model][EOS]} may then be taken as the first output text.
In the embodiment of the application, the prediction mode of the second word predictor is the same as that of the first word predictor, but with a different projection matrix. Adding the second word predictor calibrates the result output by the first word predictor, so a more accurate first output text can be output.
In another optional embodiment, the text content extraction model further comprises a word deleter, used for determining the first output text from the second output text by: inputting the second output text and the plurality of encoding vectors into the decoder to obtain second decoding data corresponding to the second output text; and inputting the second decoding data into the word deleter, determining in the word deleter the correct rate of each value word in the second output text, and deleting the characters whose correct rate is less than the first threshold in the second output text to obtain the first output text.
In the embodiment of the application, referring to fig. 5, the second output text output by the first word predictor may be input into the word deleter for processing, or the text output by the second word predictor in fig. 6 may be input into the word deleter for processing.
Specifically, the second output text and the plurality of encoding vectors are input into the decoder to obtain the second decoding data corresponding to the second output text, i.e., the second decoding data corresponding to each value word in the second output text, denoted z_i^d. For example, the second decoding data corresponding to "Li Wen" is z_1^d, that corresponding to "actor" is z_2^d, and that corresponding to "model" is z_3^d.
Specifically, in the word deleter, the following formula (4) is used to determine the correct rate:

    pi^d_i = softmax(W_d z_i^d),    d_i = max(pi^d_i)    (4)

where softmax is the logistic regression function, W_d is a pre-trained projection matrix, pi^d_i denotes the probability distribution of the correct rate of the ith value word, and d_i is the correct rate of the ith value word. (Again, the original formula is rendered as an image; this form is reconstructed from the description.)
Further, if the threshold is set to 0.5, the value words whose correct rate is less than 0.5 in the second output text are deleted. Illustratively, referring to fig. 5, the second output text y_2 = {[BOS][Li Wen][actor][model][EOS]} is input into the word deleter. Since the correct rates of the value words [Li Wen], [actor], and [model] are all greater than 0.5, no value word in the second output text is deleted, and the first output text y_3 = {[BOS][Li Wen][actor][model][EOS]} output by the word deleter is the same as the second output text.
In the embodiment of the application, the word deleter can further improve the prediction accuracy of the first output text.
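A minimal sketch of the word deleter, reading formula (4) as a two-class (keep versus delete) softmax per value word; the two-class form is an assumption:

    # Illustrative sketch of the word deleter; the keep/delete head is an
    # assumed rendering of formula (4), with the 0.5 threshold from the text.
    import torch
    import torch.nn as nn

    class WordDeleter(nn.Module):
        def __init__(self, dim=128, threshold=0.5):
            super().__init__()
            self.W_d = nn.Linear(dim, 2)  # two classes: delete vs. keep
            self.threshold = threshold

        def forward(self, z_words, words):
            keep_prob = torch.softmax(self.W_d(z_words), dim=-1)[:, 1]  # d_i
            return [w for w, p in zip(words, keep_prob.tolist())
                    if p >= self.threshold]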
And S409, inputting the first output text and the plurality of coding vectors into a decoder for decoding to obtain third decoding data.
In an embodiment of the present application, the text stitching model includes: a decoder, a placeholder predictor, and a third word predictor. The decoder, the placeholder predictor and the word remover are shared by the text splicing model and the text content extraction model, so that the calculation cost and the memory occupation can be reduced.
In addition, the decoding principle of the decoder is referred to above, and is not described herein again.
S410, inputting the third decoded data into the placeholder predictor, predicting the number of second placeholders between adjacent value words in the first output text in the placeholder predictor, and adding second placeholders corresponding to the number of the second placeholders between the adjacent value words to obtain a character sequence.
In the embodiment of the application, the first identification character [ BOS ] and the second identification character [ EOS ] are also used as value words, and are used for predicting the second placeholder together with other value words. Wherein the second placeholder can be the same as the first placeholder.
Illustratively, referring to FIG. 5, for the first output text y_3 = {[BOS][Li Wen][actor][model][EOS]}, the placeholder predictor predicts the number of second placeholders between [BOS] and [Li Wen], e.g., 0; between [Li Wen] and [actor], e.g., 3; between [actor] and [model], e.g., 1; and between [model] and [EOS], e.g., 0. The corresponding placeholders are then filled in at the corresponding positions to obtain the character sequence y_4 = {[BOS][Li Wen][PLH][PLH][PLH][actor][PLH][model][EOS]}.
And S411, inputting the character sequence and the plurality of coding vectors into a decoder for decoding processing to obtain fourth decoding data.
And S412, inputting the fourth decoding data into a third word predictor, predicting filling characters corresponding to the placeholders in the word predictor, and replacing corresponding second placeholders in the character sequence with the filling characters to obtain a third output text.
Wherein the third word predictor is pre-trained and can predict the filling word of each placeholder. Referring to FIG. 5, for example, the first placeholder between [Li Wen] and [actor] corresponds to the word "is", the second to the word "one", and the third to the word "an"; the placeholder between [actor] and [model] corresponds to the word "and". The third output text output by the third word predictor is then y_5 = {[BOS][Li Wen][is][one][an][actor][and][model][EOS]}.
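The insertion-and-filling mechanics of S410 to S412 can be sketched as follows (the helpers are hypothetical; in the model, gap counts and filling words are predicted as described above):

    # Illustrative sketch of the splicing stage: insert the predicted number of
    # [PLH] tokens after each value word, then fill each [PLH] from the preset
    # word stock. fill_word is a hypothetical predictor callback.
    def splice(value_words, gap_counts, fill_word):
        # gap_counts[i]: predicted placeholder count after value_words[i]
        seq = []
        for word, gaps in zip(value_words, gap_counts + [0]):
            seq.append(word)
            seq.extend("[PLH]" for _ in range(gaps))
        return [fill_word(i, seq) if tok == "[PLH]" else tok
                for i, tok in enumerate(seq)]

For the example above, value_words = ["[BOS]", "Li Wen", "actor", "model", "[EOS]"] with gap_counts = [0, 3, 1, 0] yields the character sequence y_4 before filling.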
And S413, determining a target output text according to the third output text.
In one embodiment, the third output text may be directly determined as the target output text, i.e., the target output text is y_5.
In another embodiment, the text stitching model further comprises: the word deleter is used for determining a target output text according to the third output text and comprises the following steps: inputting the third output text and the plurality of coding vectors into a decoder to obtain fifth decoding data corresponding to the third output text; and inputting the fifth decoding data into a word deleter, determining the correct rate of each value word in the third output text in the word deleter, and deleting the characters with the correct rate smaller than a second threshold value in the third output text to obtain the target output text.
The specific content of the word deleter refers to the above S408, which is not described herein again.
Illustratively, referring to FIG. 5, the target output text output by the word deleter in the text splicing model is y_6 = {[BOS][Li Wen][is][an][actor][and][model][EOS]}, i.e., "Li Wen is an actor and model". That is, the word deleter determines that the correct rate corresponding to "one" is less than 0.5 and therefore deletes the word "one".
In the embodiment of the application, first, the text content extraction model and the text splicing model share some modules, which reduces the computing resources the models require. In addition, the text content extraction model and the text splicing model are both non-autoregressive, which greatly improves decoding efficiency. Moreover, the first output text extracted by the text content extraction model remains consistent with the table to be processed, and processing the first output text with the text splicing model improves the fluency of the target output text, so a target output text that is both consistent and fluent can be obtained.
Referring to fig. 7, a model training method provided by the present application specifically includes the following steps:
s701, obtaining a training sample and a label text corresponding to the training sample.
Wherein the training samples include a plurality of sample character groups, the plurality of sample character groups comprising sample text groups, and each sample text group contains, in a preset order, a sample value word in the sample table, the sample key text corresponding to the sample value word, the forward-order position of the sample value word in the sample value text, and the reverse-order position of the sample value word in the sample value text.
In this embodiment, the sample table may be obtained first, and the sample table is then processed to obtain the sample character groups; the format of the sample character groups refers to the above embodiments and is not described herein again. The label text is a natural-language text that corresponds to the training sample and is both consistent and fluent.
S702, inputting the plurality of sample character groups into an encoder for encoding processing to obtain a plurality of sample encoding vectors.
Wherein, the sample code vector and the sample character group are in one-to-one correspondence. In addition, the encoding manner of the encoder refers to the above embodiments, and is not described herein again.
And S703, inputting the multiple sample coding vectors into the text content extraction model for text content extraction to obtain a first prediction text.
S704, determining characters in the label text, which are the same as the sample value words, and obtaining a middle text.
Wherein, for example, the label text is "Li Wen is an actor and a model", and the sample value words include: Li Wen, Wang Wu, actor, model, and 22. "Li Wen", "actor", and "model" are determined to be the intermediate text.
S705, adjusting model parameters of the text content extraction model according to the first loss values of the first prediction text and the intermediate text.
In this embodiment of the present application, the model parameters of the text content extraction model may be first adjusted according to the first loss values of the first predicted text and the intermediate text until the first loss values of the first predicted text and the intermediate text are less than the first loss value threshold.
And S706, inputting the multiple sample coding vectors and the first prediction text into a text splicing model for text splicing to obtain a target prediction text.
And S707, adjusting the model parameters of the text content extraction model and the model parameters of the text splicing model according to the second loss values of the target predicted text and the label text.
In the embodiment of the application, the text content extraction model and the text splicing model can be trained simultaneously in a multi-task mode, with the loss function expressed as formula (5):

    L = λ L_1 + L_2    (5)

where L is the total loss value, λ is a preset coefficient, L_1 is the loss value corresponding to the text content extraction model, and L_2 is the loss value corresponding to the text splicing model.
Further, L_1 = L'_1 + L'_2 + L'_3 and L_2 = L''_1 + L''_2 + L''_3. Here L'_1 denotes the loss value corresponding to the placeholder predictor during processing by the text content extraction model, L'_2 the loss value corresponding to the first word predictor, and L'_3 the loss value corresponding to the word deleter; correspondingly, L''_1 denotes the loss value corresponding to the placeholder predictor during processing by the text splicing model, L''_2 the loss value corresponding to the third word predictor, and L''_3 the loss value corresponding to the word deleter.
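A minimal sketch of combining the component losses into the multi-task objective of formula (5), assuming each term has already been computed as a scalar tensor; λ = 0.5 is an arbitrary example value, not one from the patent:

    # Illustrative sketch of formula (5); lam is a preset coefficient.
    def total_loss(extract_losses, splice_losses, lam=0.5):
        # extract_losses / splice_losses: the (placeholder predictor,
        # word predictor, word deleter) loss terms of each stage
        L1 = sum(extract_losses)  # text content extraction model
        L2 = sum(splice_losses)   # text splicing model
        return lam * L1 + L2      # L = lambda * L1 + L2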
Specifically, the loss values are calculated with reference to formulas that are rendered as images in the original publication. In those formulas, T' denotes the sample table, P denotes the label sample, and a combined vector denotes the first identification character group and the second identification character group corresponding to the label text; this combined vector corresponds to the identification vector described above. The target number of placeholders is the number of words of the label text, which may be a set number.
Further, the word deleter predicts the correct rate d_i of each value word and deletes the value words whose correct rate is below the threshold; the loss value L'_3 is then calculated from the obtained result and the intermediate text.
Illustratively, if the label text has 5 words, the target placeholder quantity is 5. The identification decoding data {z_0, z_{n+1}} are input into the placeholder predictor to predict the number of placeholders, and the parameters of the placeholder predictor (the projection matrix) are adjusted based on the loss between the predicted number and the target quantity.
Further, that quantity of placeholders is filled between the first identification character and the second identification character, obtaining, for example, [BOS][PLH][PLH][PLH][PLH][PLH][EOS]. When this sequence is input into the first word predictor, the confidence between each placeholder and each value word in the label sample is determined, where s_{i,j} represents the confidence of the ith placeholder with the corresponding encoding vector. For example, if the value word corresponding to the first placeholder [PLH] is [Li Wen], the ground-truth confidence between the first placeholder [PLH] and [Li Wen] is set to 1 and the confidence of the other value words is set to 0, and the first word predictor is adjusted according to the prediction result and the ground truth.
In the embodiment of the present application, L''_1, L''_2, and L''_3 are determined with reference to the above. In addition, the part of the training process that obtains the prediction results may refer to the process of determining the target output text in the above embodiments, which is not described herein again.
In addition, the application also provides a text generation method, which is applied to the terminal equipment and comprises the following steps: acquiring a table to be processed; sending the table to be processed to a server; and receiving a target output text sent by the server, wherein the target output text is determined by the server according to the text generation method of the embodiment.
According to the embodiment of the application, the table to be processed can be obtained in the terminal equipment, and the table to be processed is sent to the server, so that the description text of the table to be processed can be obtained.
In addition, the present application also provides a text generation system, including:
the system comprises a cloud server and terminal equipment, wherein a pre-trained text content extraction model is deployed on the cloud server;
the terminal equipment is used for acquiring the table to be processed and sending the table to be processed to the server;
the cloud server is used for acquiring a table to be processed; generating a plurality of character groups based on the table to be processed, wherein the plurality of character groups comprise: the text groups comprise key texts in the table to be processed, value words corresponding to the key texts, forward-order positions of the value words in the value texts and reverse-order positions of the value words in the value texts according to a preset sequence; inputting a plurality of character groups into an encoder for encoding processing to obtain a plurality of encoding vectors, wherein the encoding vectors correspond to the character groups one by one; inputting the plurality of encoding vectors into a text content extraction model for text content extraction to obtain a first output text, wherein the first output text comprises at least one value word in a table to be processed; inputting the multiple coding vectors and the first output text into a text splicing model for text splicing to obtain a target output text corresponding to the to-be-processed table, wherein the target output text comprises the first output text and characters in a preset word stock;
and the terminal equipment is used for receiving the target output text sent by the server.
For the specific implementation process, reference is made to the above embodiments, which are not described herein again.
In the embodiment of the present application, in addition to providing a text generation method, there is also provided a text generation apparatus, as shown in fig. 8, the text generation apparatus 80 includes:
an obtaining module 81, configured to obtain a table to be processed;
a generating module 82, configured to generate a plurality of character groups based on the table to be processed, where the plurality of character groups include: the text groups comprise key texts in the table to be processed, value words corresponding to the key texts, forward-order positions of the value words in the value texts, and reverse-order positions of the value words in the value texts according to a preset sequence;
the encoding module 83 is configured to input the multiple character groups into an encoder to perform encoding processing, so as to obtain multiple encoding vectors, where the encoding vectors correspond to the character groups one to one;
the extracting module 84 is configured to input the multiple encoding vectors into the text content extracting model to perform text content extraction, so as to obtain a first output text, where the first output text includes at least one value word in the table to be processed;
and the splicing module 85 is configured to input the multiple encoding vectors and the first output text into the text splicing model to perform text splicing, so as to obtain a target output text corresponding to the to-be-processed table, where the target output text includes the first output text and characters in a preset word stock.
In an alternative embodiment, the plurality of character groups further include a first identification character group and a second identification character group, and the text content extraction model comprises a decoder, a placeholder predictor, and a first word predictor. The extraction module 84 is specifically configured to: input the encoding vectors corresponding to the first identification character group and the second identification character group into the decoder for decoding processing to obtain corresponding identification decoding data, wherein the identification decoding data comprise the decoded data of the first identification character and the decoded data of the second identification character; input the identification decoding data into the placeholder predictor, predict in the placeholder predictor the first placeholder quantity between the first identification character and the second identification character, and add that quantity of first placeholders between the first identification character and the second identification character to obtain a placeholder sequence; input the placeholder sequence and the plurality of encoding vectors into the decoder for decoding processing to obtain the placeholder decoding data of the first placeholders; input the placeholder decoding data and the plurality of encoding vectors into the first word predictor, determine in the first word predictor the confidence scores between the placeholders and the encoding vectors, and, for each first placeholder, replace the first placeholder in the placeholder sequence with the value word with the highest confidence score to obtain a second output text; and determine the first output text according to the second output text.
In an alternative embodiment, the text content extraction model further includes a second word predictor, and the extraction module 84 is configured to determine the first output text according to the second output text, specifically: determining an embedded vector of a value word in the second output text; inputting the embedded vector, the placeholder decoding data and the plurality of encoding vectors into a decoder for decoding processing to obtain first decoding data of a median word in a second output text; and inputting the first decoding data and the plurality of coding vectors into a second word predictor, determining confidence scores between the value words and the coding vectors in the second output text in the second word predictor, and replacing corresponding characters in the second output text by the value words with the highest confidence scores aiming at the value words in the second output text to obtain a first output text.
In an optional embodiment, the text content extraction model further comprises a word deleter, and the extraction module 84, in determining the first output text according to the second output text, is specifically configured to: input the second output text and the plurality of encoding vectors into the decoder to obtain second decoding data corresponding to the second output text; and input the second decoding data into the word deleter, determine in the word deleter the correct rate of each value word in the second output text, and delete the characters whose correct rate is less than the first threshold in the second output text to obtain the first output text.
In an alternative embodiment, the text stitching model comprises: decoder, placeholder predictor and third word predictor, concatenation module 85 is specifically configured to: inputting the first output text and the plurality of coding vectors into a decoder for decoding processing to obtain third decoding data; inputting third decoding data into a placeholder predictor, predicting the quantity of second placeholders between adjacent value words in the first output text in the placeholder predictor, and adding second placeholders corresponding to the quantity of the second placeholders between the adjacent value words to obtain a character sequence; inputting the character sequence and the plurality of coding vectors into a decoder for decoding to obtain fourth decoding data; inputting the fourth decoding data into a third word predictor, predicting filling characters corresponding to the placeholders in the word predictor, and replacing corresponding second placeholders in the character sequence with the filling characters to obtain a third output text; and determining a target output text according to the third output text.
In an optional embodiment, the text splicing model further comprises a word deleter, and the splicing module 85, in determining the target output text according to the third output text, is specifically configured to: input the third output text and the plurality of encoding vectors into the decoder to obtain fifth decoding data corresponding to the third output text; and input the fifth decoding data into the word deleter, determine in the word deleter the correct rate of each value word in the third output text, and delete the characters whose correct rate is less than the second threshold in the third output text to obtain the target output text.
In an optional embodiment, the system further includes a training module (not shown), specifically configured to: training a text content extraction model in the following way: obtaining a training sample and a label text corresponding to the training sample, wherein the training sample comprises a plurality of sample character sets, and the plurality of sample character sets comprise: the text group comprises sample value words in a sample table, one sample value word corresponding to the sample value word, positive sequence positions of the sample value words in the sample value texts, and negative sequence positions of the sample value words in the sample value texts according to a preset sequence; inputting a plurality of sample character groups into an encoder for encoding processing to obtain a plurality of sample encoding vectors, wherein the sample encoding vectors correspond to the sample character groups one by one; inputting a plurality of sample coding vectors into a text content extraction model for text content extraction to obtain a first prediction text; determining characters in the label text, which are the same as the sample value words, to obtain a middle text; and adjusting the model parameters of the text content extraction model according to the first loss values of the first predicted text and the intermediate text.
In an alternative embodiment, the training module (not shown) is further configured to: training a text content extraction model and a text splicing model in the following ways: inputting a plurality of sample coding vectors and a first prediction text into a text splicing model for text splicing to obtain a target prediction text; and adjusting the model parameters of the text content extraction model and the model parameters of the text splicing model according to the second loss values of the target prediction text and the label text.
The text generation device provided by the embodiment of the application can improve the fluency of the text generated based on the table data. For the specific implementation process, reference is made to the above method embodiments, which are not described herein again.
In addition, some of the flows described in the above embodiments and drawings include a plurality of operations in a specific order, but it should be clearly understood that these operations may be executed out of the order in which they appear herein or in parallel; the sequence numbers are used only to distinguish different operations and do not by themselves represent any order of execution. Additionally, the flows may include more or fewer operations, and those operations may be performed sequentially or in parallel. It should be noted that descriptions such as "first" and "second" herein are used to distinguish different messages, devices, modules, and the like; they do not represent a sequential order, nor do they require that "first" and "second" be of different types.
Fig. 9 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application. The electronic device is used to run the text generation method and/or the model training method described above. As shown in fig. 9, the electronic device includes: a memory 94 and a processor 95.
The memory 94 is used for storing computer programs and may be configured to store various other data to support operations on the electronic device. The memory 94 may be implemented as an Object Storage Service (OSS).
The memory 94 may be implemented by any type or combination of volatile and non-volatile storage devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
A processor 95, coupled to the memory 94, for executing computer programs in the memory 94 for: acquiring a table to be processed; generating a plurality of character groups based on the table to be processed, wherein the plurality of character groups comprise: the text groups comprise key texts in the table to be processed, value words corresponding to the key texts, forward-order positions of the value words in the value texts, and reverse-order positions of the value words in the value texts according to a preset sequence; inputting the character groups into an encoder for encoding to obtain a plurality of encoding vectors, wherein the encoding vectors correspond to the character groups one by one; inputting the plurality of encoding vectors into a text content extraction model for text content extraction to obtain a first output text, wherein the first output text comprises at least one value word in a table to be processed; and inputting the plurality of coding vectors and the first output text into a text splicing model for text splicing to obtain a target output text corresponding to the table to be processed, wherein the target output text comprises the first output text and characters in a preset word stock.
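To make the character-group construction concrete, a minimal Python sketch follows; the whitespace tokenization, 1-based positions and dict-shaped table are assumptions for illustration only:

```python
def build_character_groups(table):
    """Sketch: for each (key text, value text) cell, emit one group per
    value word with its forward-order and reverse-order position."""
    groups = []
    for key_text, value_text in table.items():
        words = value_text.split()  # assumed whitespace tokenization
        n = len(words)
        for i, word in enumerate(words):
            forward_pos = i + 1   # 1-based forward-order position
            reverse_pos = n - i   # 1-based reverse-order position
            groups.append((key_text, word, forward_pos, reverse_pos))
    return groups

# e.g. build_character_groups({"Name": "Marie Curie"}) ->
# [("Name", "Marie", 1, 2), ("Name", "Curie", 2, 1)]
```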
Further optionally, the plurality of character groups further comprises: a first identification character group and a second identification character group, and the text content extraction model comprises: a decoder, a placeholder predictor and a first word predictor, the processor 95 being specifically configured to: inputting the coding vectors corresponding to the first identification character group and the second identification character group into the decoder for decoding processing to obtain corresponding identification decoding data, wherein the identification decoding data comprises the decoding data of the first identification character and the decoding data of the second identification character; inputting the identification decoding data into the placeholder predictor, predicting the first placeholder quantity between the first identification character and the second identification character in the placeholder predictor, and adding first placeholders of the first placeholder quantity between the first identification character and the second identification character to obtain a placeholder sequence; inputting the placeholder sequence and the plurality of coding vectors into the decoder for decoding to obtain placeholder decoding data of the first placeholders; inputting the placeholder decoding data and the plurality of coding vectors into the first word predictor, determining confidence scores between the placeholders and the coding vectors in the first word predictor, and, for each first placeholder, replacing it in the placeholder sequence with the value word having the highest confidence score to obtain a second output text; and determining the first output text from the second output text.
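The extraction step can be sketched as follows; the identifier tokens, the tensor shapes, and the assumption that the decoder returns one state per placeholder are all hypothetical:

```python
import torch

def extract_content(encoding_vectors, id_encodings, value_words, decoder,
                    placeholder_predictor, first_word_predictor):
    """Sketch: predict the first placeholder quantity between the two
    identifier characters, then replace each placeholder with the value
    word whose coding vector has the highest confidence score."""
    id_decoding = decoder(id_encodings, encoding_vectors)
    k = int(placeholder_predictor(id_decoding))  # first placeholder quantity
    placeholder_seq = ["[BOS]"] + ["[PLH]"] * k + ["[EOS]"]
    # assumed to return one decoding state per placeholder
    plh_decoding = decoder(placeholder_seq, encoding_vectors)
    # confidence scores between placeholders and coding vectors: (k, m)
    scores = first_word_predictor(plh_decoding, encoding_vectors)
    best = scores.argmax(dim=-1)                 # best value word per slot
    return [value_words[j] for j in best.tolist()]  # second output text
```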
In an alternative embodiment, the text content extraction model further comprises a second word predictor, and the processor 95, when determining the first output text from the second output text, is specifically configured to: determining embedding vectors of the value words in the second output text; inputting the embedding vectors, the placeholder decoding data and the plurality of coding vectors into the decoder for decoding processing to obtain first decoding data of the value words in the second output text; and inputting the first decoding data and the plurality of coding vectors into the second word predictor, determining confidence scores between the value words in the second output text and the coding vectors in the second word predictor, and, for each value word in the second output text, replacing the corresponding character in the second output text with the value word having the highest confidence score to obtain the first output text.
In an optional embodiment, the text content extraction model further comprises a word deleter, and the processor 95, when determining the first output text according to the second output text, is specifically configured to: inputting the second output text and the plurality of coding vectors into the decoder to obtain second decoding data corresponding to the second output text; and inputting the second decoding data into the word deleter, determining the correct rate of the value words in the second output text in the word deleter, and deleting the characters with a correct rate smaller than the first threshold value in the second output text to obtain the first output text.
In an alternative embodiment, the text splicing model comprises: a decoder, a placeholder predictor and a third word predictor, the processor 95 being specifically configured to: inputting the first output text and the plurality of coding vectors into the decoder for decoding processing to obtain third decoding data; inputting the third decoding data into the placeholder predictor, predicting the quantity of second placeholders between adjacent value words in the first output text in the placeholder predictor, and adding second placeholders corresponding to that quantity between the adjacent value words to obtain a character sequence; inputting the character sequence and the plurality of coding vectors into the decoder for decoding processing to obtain fourth decoding data; inputting the fourth decoding data into the third word predictor, predicting the filling characters corresponding to the placeholders in the third word predictor, and replacing the corresponding second placeholders in the character sequence with the filling characters to obtain a third output text; and determining the target output text according to the third output text.
In an optional embodiment, the text splicing model further comprises: a word deleter, and the processor 95, when determining the target output text according to the third output text, is specifically configured to: inputting the third output text and the plurality of coding vectors into the decoder to obtain fifth decoding data corresponding to the third output text; and inputting the fifth decoding data into the word deleter, determining the correct rate of the value words in the third output text in the word deleter, and deleting the words with a correct rate smaller than a second threshold value in the third output text to obtain the target output text.
In an alternative embodiment, the system further includes a training module (not shown), specifically configured to train the text content extraction model in the following way: obtaining a training sample and a label text corresponding to the training sample, wherein the training sample comprises a plurality of sample character groups, and the plurality of sample character groups comprise a plurality of sample text groups, each sample text group comprising, according to a preset sequence, a sample key text in the sample table, a sample value word corresponding to the sample key text, the positive-order position of the sample value word in the sample value text to which it belongs, and the negative-order position of the sample value word in the sample value text; inputting the plurality of sample character groups into an encoder for encoding processing to obtain a plurality of sample encoding vectors, wherein the sample encoding vectors correspond to the sample character groups one by one; inputting the plurality of sample encoding vectors into the text content extraction model for text content extraction to obtain a first prediction text; determining the characters in the label text which are the same as the sample value words to obtain an intermediate text; and adjusting the model parameters of the text content extraction model according to a first loss value between the first prediction text and the intermediate text.
In an alternative embodiment, the training module (not shown) is further configured to train the text content extraction model and the text splicing model in the following way: inputting the plurality of sample coding vectors and the first prediction text into the text splicing model for text splicing to obtain a target prediction text; and adjusting the model parameters of the text content extraction model and the model parameters of the text splicing model according to a second loss value between the target prediction text and the label text.
In an alternative embodiment, the processor 95, coupled to the memory 94, is further configured to execute the computer program in the memory 94 for: acquiring a table to be processed; sending the table to be processed to a server; and receiving a target output text sent by the server, wherein the target output text is determined by the server according to the text generation method of the above embodiments.
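For illustration, the terminal-device flow could be a single HTTP round trip; the endpoint path, payload shape and use of the requests library are assumptions, since the embodiment does not fix a transport:

```python
import requests

def generate_text_via_server(table, server_url):
    """Sketch of the terminal-device side: send the table to be processed
    to the server and receive the target output text it returns."""
    resp = requests.post(f"{server_url}/generate", json={"table": table})
    resp.raise_for_status()
    return resp.json()["target_output_text"]  # hypothetical response field
```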
Further, as shown in fig. 9, the electronic device further includes: firewall 91, load balancer 92, communications component 96, power component 93, and other components. Only some of the components are schematically shown in fig. 9, and the electronic device is not meant to include only the components shown in fig. 9.
Accordingly, the present application also provides a computer readable storage medium storing a computer program, which when executed by a processor causes the processor to implement the steps of the method shown above.
Accordingly, embodiments of the present application also provide a computer program product, comprising computer programs/instructions which, when executed by a processor, cause the processor to implement the steps of the method described above.
The communication component of fig. 9 described above is configured to facilitate wired or wireless communication between the device in which the communication component is located and other devices. The device in which the communication component is located can access a wireless network based on a communication standard, such as WiFi, a 2G, 3G, 4G/LTE or 5G mobile communication network, or a combination thereof. In an exemplary embodiment, the communication component receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component further includes a Near Field Communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
The power supply assembly of fig. 9 described above provides power to the various components of the device in which the power supply assembly is located. The power components may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device in which the power component is located.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs and/or GPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read-Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (12)

1. A text generation method, comprising:
acquiring a table to be processed;
generating a plurality of character groups based on the table to be processed, wherein the plurality of character groups comprise a plurality of text groups, each text group comprising, according to a preset sequence, a key text in the table to be processed, a value word corresponding to the key text, the forward-order position of the value word in the value text to which it belongs, and the reverse-order position of the value word in the value text;
inputting the character groups into an encoder for encoding to obtain a plurality of encoding vectors, wherein the encoding vectors correspond to the character groups one by one;
inputting the plurality of encoding vectors into a text content extraction model for text content extraction to obtain a first output text, wherein the first output text comprises at least one value word in the table to be processed;
and inputting the plurality of coding vectors and the first output text into a text splicing model for text splicing to obtain a target output text corresponding to the table to be processed, wherein the target output text comprises the first output text and characters in a preset word stock.
2. The text generation method of claim 1, wherein the plurality of character groups further comprises: a first identification character group and a second identification character group, and the text content extraction model comprises: a decoder, a placeholder predictor and a first word predictor; and wherein the inputting the plurality of coding vectors into the text content extraction model for text content extraction to obtain a first output text comprises:
inputting the coding vectors corresponding to the first identification character group and the second identification character group into a decoder for decoding processing to obtain corresponding identification decoding data, wherein the identification decoding data comprises decoding data of the first identification character and decoding data of the second identification character;
inputting the identifier decoding data into a placeholder predictor, predicting a first placeholder quantity between the first identifier character and the second identifier character in the placeholder predictor, and adding a first placeholder with the first placeholder quantity between the first identifier character and the second identifier character to obtain a placeholder sequence;
inputting the placeholder sequence and the plurality of coding vectors into the decoder for decoding processing to obtain placeholder decoding data of a first placeholder;
inputting the placeholder decoding data and the plurality of coding vectors into a first word predictor, determining confidence scores between the placeholders and the coding vectors in the first word predictor, and, for each first placeholder, replacing it in the placeholder sequence with the value word having the highest confidence score, to obtain a second output text;
and determining the first output text according to the second output text.
3. The text generation method of claim 2, wherein the text content extraction model further comprises a second word predictor, and wherein determining the first output text from the second output text comprises:
determining embedding vectors of the value words in the second output text;
inputting the embedding vectors, the placeholder decoding data and the plurality of encoding vectors into the decoder for decoding processing to obtain first decoding data of the value words in the second output text;
and inputting the first decoding data and the plurality of coding vectors into the second word predictor, determining confidence scores between the value words in the second output text and the coding vectors in the second word predictor, and, for each value word in the second output text, replacing the corresponding character in the second output text with the value word having the highest confidence score, to obtain the first output text.
4. The text generation method of claim 2, wherein the text content extraction model further comprises: a word deleter, and wherein the determining the first output text from the second output text comprises:
inputting the second output text and the plurality of encoding vectors into the decoder to obtain second decoding data corresponding to the second output text;
and inputting the second decoding data into the word deleter, determining the correct rate of the value words in the second output text in the word deleter, and deleting the words with the correct rate smaller than a first threshold value in the second output text to obtain the first output text.
5. The text generation method of claim 4, wherein the text splicing model comprises: the decoder, the placeholder predictor and a third word predictor, and wherein the inputting the plurality of coding vectors and the first output text into a pre-trained text splicing model for text splicing to obtain a target output text comprises:
inputting the first output text and the plurality of coding vectors into the decoder for decoding processing to obtain third decoding data;
inputting the third decoded data into the placeholder predictor, predicting the quantity of second placeholders between adjacent value words in the first output text in the placeholder predictor, and adding second placeholders corresponding to the quantity of the second placeholders between the adjacent value words to obtain a character sequence;
inputting the character sequence and the plurality of coding vectors into the decoder for decoding processing to obtain fourth decoding data;
inputting the fourth decoding data into the third word predictor, predicting filling words corresponding to the placeholders in the third word predictor, and replacing the corresponding second placeholders in the character sequence with the filling words to obtain a third output text;
and determining the target output text according to the third output text.
6. The text generation method of claim 5, wherein the text splicing model further comprises: a word deleter, and wherein the determining the target output text according to the third output text comprises:
inputting the third output text and the plurality of encoding vectors into the decoder to obtain fifth decoding data corresponding to the third output text;
and inputting the fifth decoding data into the word deleter, determining the correct rate of the value words in the third output text in the word deleter, and deleting the characters with a correct rate smaller than a second threshold value in the third output text to obtain the target output text.
7. A method of model training, comprising:
obtaining a training sample and a label text corresponding to the training sample, wherein the training sample comprises a plurality of sample character groups, and the plurality of sample character groups comprise a plurality of sample text groups, each sample text group comprising, according to a preset sequence, a sample key text in a sample table, a sample value word corresponding to the sample key text, the positive-order position of the sample value word in the sample value text to which it belongs, and the negative-order position of the sample value word in the sample value text;
inputting the sample character groups into an encoder for encoding to obtain a plurality of sample encoding vectors, wherein the sample encoding vectors correspond to the sample character groups one by one;
inputting the plurality of sample coding vectors into a text content extraction model for text content extraction to obtain a first prediction text;
determining characters in the label text which are the same as the sample value words to obtain an intermediate text;
and adjusting model parameters of the text content extraction model according to a first loss value between the first prediction text and the intermediate text.
8. The model training method of claim 7, wherein the text content extraction model and the text splicing model are trained in the following manner:
inputting the plurality of sample coding vectors and the first prediction text into a text splicing model for text splicing to obtain a target prediction text;
and adjusting the model parameters of the text content extraction model and the model parameters of the text splicing model according to a second loss value between the target prediction text and the label text.
9. A text generation method is applied to terminal equipment, and comprises the following steps:
acquiring a table to be processed;
sending the table to be processed to a server;
receiving a target output text sent by the server, wherein the target output text is determined by the server according to the text generation method of any one of claims 1 to 6.
10. A text generation apparatus, comprising:
the acquisition module is used for acquiring a table to be processed;
a generating module, configured to generate a plurality of character groups based on the table to be processed, wherein the plurality of character groups comprise a plurality of text groups, each text group comprising, according to a preset sequence, a key text in the table to be processed, a value word corresponding to the key text, the forward-order position of the value word in the value text to which it belongs, and the reverse-order position of the value word in the value text;
the encoding module is used for inputting the character groups into an encoder for encoding processing to obtain a plurality of encoding vectors, and the encoding vectors correspond to the character groups one by one;
the extraction module is used for inputting the plurality of coding vectors into a text content extraction model to extract text content to obtain a first output text, and the first output text comprises at least one value word in the table to be processed;
and the splicing module is used for inputting the plurality of coding vectors and the first output text into a text splicing model for text splicing to obtain a target output text corresponding to the table to be processed, wherein the target output text comprises the first output text and characters in a preset word stock.
11. A text generation system, comprising:
the system comprises a cloud server and terminal equipment, wherein a pre-trained text content extraction model is deployed on the cloud server;
the terminal equipment is used for acquiring a table to be processed and sending the table to be processed to the server;
the cloud server is used for acquiring the table to be processed; generating a plurality of character groups based on the table to be processed, wherein the plurality of character groups comprise a plurality of text groups, each text group comprising, according to a preset sequence, a key text in the table to be processed, a value word corresponding to the key text, the forward-order position of the value word in the value text to which it belongs, and the reverse-order position of the value word in the value text; inputting the character groups into an encoder for encoding to obtain a plurality of encoding vectors, wherein the encoding vectors correspond to the character groups one by one; inputting the plurality of encoding vectors into the text content extraction model for text content extraction to obtain a first output text, wherein the first output text comprises at least one value word in the table to be processed; and inputting the plurality of coding vectors and the first output text into a text splicing model for text splicing to obtain a target output text corresponding to the table to be processed, wherein the target output text comprises the first output text and characters in a preset word stock;
and the terminal equipment is used for receiving the target output text sent by the server.
12. An electronic device, comprising: a processor, a memory and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the text generation method according to any one of claims 1 to 6 and 9, and/or the model training method according to claim 7 or 8.
CN202310078329.9A 2023-02-08 2023-02-08 Text generation method, model training method and device Active CN115796125B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310078329.9A CN115796125B (en) 2023-02-08 2023-02-08 Text generation method, model training method and device


Publications (2)

Publication Number Publication Date
CN115796125A true CN115796125A (en) 2023-03-14
CN115796125B CN115796125B (en) 2023-05-05

Family

ID=85430382

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310078329.9A Active CN115796125B (en) 2023-02-08 2023-02-08 Text generation method, model training method and device

Country Status (1)

Country Link
CN (1) CN115796125B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112069777A (en) * 2020-06-15 2020-12-11 北京理工大学 Two-stage data-to-text generation method based on skeleton
CN112329391A (en) * 2020-11-02 2021-02-05 上海明略人工智能(集团)有限公司 Target encoder generation method, target encoder generation device, electronic equipment and computer readable medium
US20210089588A1 (en) * 2019-09-24 2021-03-25 Salesforce.Com, Inc. System and Method for Automatic Task-Oriented Dialog System
CN112989834A (en) * 2021-04-15 2021-06-18 杭州一知智能科技有限公司 Named entity identification method and system based on flat grid enhanced linear converter
CN115017178A (en) * 2022-05-26 2022-09-06 阿里巴巴(中国)有限公司 Training method and device for data-to-text generation model


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李军毅 et al.: "ELMER: A Non-Autoregressive Pre-trained Language Model for Efficient and Effective Text Generation" *
陈钌: "A Neural Network-Based Inference Algorithm for Constrained Text Generation" *

Also Published As

Publication number Publication date
CN115796125B (en) 2023-05-05

Similar Documents

Publication Publication Date Title
CN111581229A (en) SQL statement generation method and device, computer equipment and storage medium
CN112784112B (en) Message verification method and device
US9824479B2 (en) Method of animating messages
CN116702723A (en) Training method, device and equipment for contract paragraph annotation model
CN113947095A (en) Multilingual text translation method and device, computer equipment and storage medium
CN114626360A (en) Data processing method and device and electronic equipment
CN114417022B (en) Model training method, data processing method and device
CN106782516B (en) Corpus classification method and apparatus
CN115796125A (en) Text generation method, model training method and device
CN116071590A (en) Model training method, system, computer device and storage medium
CN116088829A (en) Data processing method, device, storage medium and equipment
CN110019875A (en) The generation method and device of index file
CN111401005B (en) Text conversion method and device and readable storage medium
CN112926334A (en) Method and device for determining word expression vector and electronic equipment
CN113705552A (en) Text data processing method and device and related equipment
KR20220079029A (en) Method for providing automatic document-based multimedia content creation service
CN113688615A (en) Method, device and storage medium for generating field annotation and understanding character string
CN111401032A (en) Text processing method and device, computer equipment and storage medium
CN111741365B (en) Video composition data processing method, system, device and storage medium
CN117544822B (en) Video editing automation method and system
CN117235303A (en) Video pushing method, device and storage medium
CN117951340A (en) Vertex identification generation method, processing method, portrait generation method and equipment
CN117033585A (en) Method, system and equipment for processing multi-mode task and dialogue task
CN117216206A (en) Session processing method and device, electronic equipment and storage medium
CN117725220A (en) Method, server and storage medium for document characterization and document retrieval

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant