WO2023185896A1 - Text generation method and apparatus, and computer device and storage medium

Info

Publication number
WO2023185896A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
node
nodes
target
model
Application number
PCT/CN2023/084560
Other languages
French (fr)
Chinese (zh)
Inventor
黄斐 (HUANG Fei)
周浩 (ZHOU Hao)
黄民烈 (HUANG Minlie)
李航 (LI Hang)
Original Assignee
北京有竹居网络技术有限公司 (Beijing Youzhuju Network Technology Co., Ltd.)
清华大学 (Tsinghua University)
Application filed by 北京有竹居网络技术有限公司 (Beijing Youzhuju Network Technology Co., Ltd.) and 清华大学 (Tsinghua University)
Publication of WO2023185896A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/40: Processing or translation of natural language
    • G06F40/58: Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/29: Graphical models, e.g. Bayesian networks
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/047: Probabilistic or stochastic networks
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent

Definitions

  • The embodiments of the present disclosure relate to the technical field of natural language processing, and for example to a text generation method, apparatus, computer device, and storage medium.
  • Text generation technology is an important technology in the field of natural language processing. It uses given information together with a text generation model to generate text sequences that meet a specific goal. The text generation model is trained on sample data from the relevant application scenario (generative reading comprehension, human-computer dialogue, intelligent writing, machine translation, etc.), so that text generation can be achieved in different application scenarios.
  • Output delay refers to the time required from the moment the model receives its input to the moment it has fully generated the text output. When text is generated word by word, this output delay is linearly related to the sentence length of the generated text. If words are instead generated in parallel to reduce the delay, new problems are introduced: the produced text may contain consecutive repeated words, or the context may be incoherent.
  • Embodiments of the present disclosure provide a text generation method, apparatus, computer device, and storage medium, which reduce contextual incoherence and consecutive word repetition in the generated text and improve the quality of the generated text.
  • Embodiments of the present disclosure provide a text generation method, which includes: inputting the acquired original text into a trained text encoding model to obtain text feature information; and generating target text corresponding to the original text based on the text feature information in combination with a trained text decoding model;
  • wherein the text decoding model includes a text prediction layer, the node information of a set number of nodes included in the text prediction layer is determined by the text feature information, and the target words contained in the target text and the combination order of the target words are determined by the node information of the nodes and the topological structure between the nodes.
  • Embodiments of the present disclosure also provide a text generation apparatus, which includes:
  • an encoding execution module configured to input the acquired original text into a trained text encoding model to obtain text feature information;
  • a decoding execution module configured to generate target text corresponding to the original text based on the text feature information in combination with a trained text decoding model;
  • wherein the text decoding model includes a text prediction layer, the node information of a set number of nodes included in the text prediction layer is determined by the text feature information, and the target words contained in the target text and the combination order of the target words are determined by the node information of the nodes and the topological structure between the nodes.
  • Embodiments of the present disclosure also provide an electronic device, which includes:
  • one or more processors;
  • a storage device configured to store one or more programs;
  • wherein, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the text generation method provided by any embodiment of the present disclosure.
  • Embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the text generation method provided by any embodiment of the present disclosure is implemented.
  • Figure 1 is a schematic flowchart of a text generation method provided by an embodiment of the present disclosure;
  • Figure 1a shows an application example of a text generation model of the related art in a machine translation scenario;
  • Figure 1b shows a structural diagram of the text decoding model used in the text generation method provided by this embodiment;
  • Figure 1c shows an application example of the text generation model involved in this embodiment in a machine translation scenario;
  • Figure 2 is a schematic flowchart of a text generation method provided by an embodiment of the present disclosure;
  • Figure 2a shows a schematic diagram of part of the network structure of the text decoding model used in the text generation method provided by this embodiment;
  • Figure 2b shows an example diagram of calculating a node transition matrix in the text generation method provided by this embodiment;
  • Figure 2c shows an example diagram of the fully connected structure in the text prediction layer involved in the text generation method provided by this embodiment;
  • Figure 3 is a schematic structural diagram of a text generation apparatus provided by an embodiment of the present disclosure;
  • Figure 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • The term "include" and its variations are open-ended, i.e., "including but not limited to."
  • The term "based on" means "based at least in part on."
  • The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; and the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the description below.
  • Figure 1 is a schematic flowchart of a text generation method provided by an embodiment of the present disclosure. This embodiment is applicable to text generation situations. The method can be executed by a text generation apparatus, which can be implemented by software and/or hardware and can be configured in a terminal and/or a server to implement the text generation method in the embodiments of the present disclosure.
  • Figure 1a shows an application example of a text generation model of the related art in a machine translation scenario. The input text may be the Chinese sentence "I went to the cinema", and the purpose of the text generation model 11 of the related art is to generate the English text of that Chinese sentence. During training, the English output samples used may include multiple variants, such as "I went to the movie theater" and "I just went to the cinema". After training is completed, when actually performing English machine translation of "I went to the cinema", the model may mix words from the above output samples and output erroneous predicted text such as "I went to the theater".
  • This embodiment provides a text generation method that improves upon the text generation model of the related art by adding a text prediction layer. Through the nodes included in the added text prediction layer, high-quality generated text can be obtained.
  • The text generation method provided by this embodiment may include the following steps.
  • The text generation method provided by this embodiment is not limited to a particular application scenario. If text generation is required in a certain application scenario, training samples can be collected in that application scenario to train the text generation model.
  • Structurally, the text generation model can include two parts: a text encoding model and a text decoding model.
  • The original text is equivalent to the input text before text generation, and the content of the original text may differ across application scenarios.
  • For example, in the machine translation scenario, if Chinese-to-English translation is performed, the original text can be the Chinese text to be translated; if English-to-Chinese translation is performed, the original text can be the English text to be translated.
  • The text encoding model can be used to encode the original text to obtain the text feature information of the original text; the model structure of the text encoding model can directly reuse that of the text generation model in the related art.
  • The text encoding model can be trained on the sample data provided in different application scenarios, so that the output text feature information meets the text generation needs of the application scenario. For example, in the machine translation scenario, the output text feature information is mainly used to subsequently obtain the translated text corresponding to the original text.
  • The text feature information characterizes the feature information of the multiple words in the input original text and can be represented by a text feature matrix. The number of text feature vectors included in the text feature matrix is the same as the number of words contained in the original text. The text feature information can then be used as input data to the text decoding model, as sketched below.
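  • As a minimal sketch of this encoding step (hypothetical names; PyTorch is used here only for illustration, since the disclosure does not prescribe a specific framework), a standard Transformer encoder maps the original text to one feature vector per word:

```python
import torch
import torch.nn as nn

class TextEncoder(nn.Module):
    """Sketch of a text encoding model: original text in, text feature matrix out."""
    def __init__(self, vocab_size: int, d_model: int = 512, n_layers: int = 6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: [batch, num_words] -> feature matrix [batch, num_words, d_model];
        # positional encodings are omitted for brevity.
        return self.encoder(self.embed(token_ids))

encoder = TextEncoder(vocab_size=32000)
features = encoder(torch.randint(0, 32000, (1, 5)))  # a 5-word original text
print(features.shape)  # torch.Size([1, 5, 512]): one feature vector per word
```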
  • Next, generate target text corresponding to the original text based on the text feature information in combination with the trained text decoding model. The text decoding model includes a text prediction layer, the node information of a set number of nodes included in the text prediction layer is determined by the text feature information, and the target words contained in the target text and the combination order of the target words are determined by the node information of the set number of nodes and the topological structure between the nodes.
  • The text decoding model used in this step includes a text prediction layer, and the text prediction layer includes a certain number of nodes. Through the node information of the multiple nodes and the topological structure between the nodes, the target text of the original text can be effectively determined. It can be understood that the text decoding model in this embodiment is likewise trained on sample data provided in different application scenarios, so that the output target text meets the text generation requirements of the application scenario.
  • The text prediction layer contains a set number of nodes, all of which can be used to construct the graph required for text generation, and the node information of each node can be determined from the text feature information.
  • The specific value of the set number is greater than the number of words contained in the original text. It can serve as the graph size required for graph construction in the text prediction layer, or as the maximum possible length of the text to be generated; that is, the number of words contained in the text to be generated will not be greater than the set number.
  • The node information of the set number of nodes contained in the text prediction layer can be determined from the text feature information. For example, the text feature information can be combined with certain parameter information through full connection processing, and the relevant feature information of each word in the original text is finally mapped onto a node as that node's node information.
  • As for the generation logic of the target text, it needs to consider both the node information of the nodes in the text prediction layer and the topological structure between the nodes.
  • Analysis shows that the target text is likewise composed of individual words, and the words in the target text should have some relationship with the words in the original text. Through the text encoding model, text feature information representing the multiple words in the original text can be obtained. In the text decoding model of this embodiment, the text feature information can be converted, through basic decoding processing, into the node information of the multiple nodes included in the text prediction layer, which is equivalent to establishing an association between the multiple words in the original text and the multiple nodes in the text prediction layer.
  • The text decoding model provided in this embodiment can establish a correspondence between the multiple nodes and the words in the dictionary through the node information of the multiple nodes in the text prediction layer, so that each node corresponds to a best-matching word.
  • The text decoding model provided in this embodiment can also connect the multiple nodes in the text prediction layer according to certain connection conditions to form a topological structure between the nodes. Based on this topological structure, the connection relationships among the multiple nodes can be clearly determined. According to the learning parameters trained in the text prediction layer, combined with the topological structure between the nodes, the transition probability from one node to another connected node can be determined. Finally, based on the word corresponding to each node and the transition probabilities from each node to its connected nodes, target nodes can be selected from the multiple nodes.
  • The target words required to generate the target text are determined accordingly when the target nodes are selected. In addition, the combination order of the multiple target words in the generated target text can be determined by the connection relationships between the nodes represented by the topological structure between the nodes.
  • The text decoding model may include: a position information input layer, a basic decoding sub-model, and a text prediction layer;
  • the position information input layer includes a set number of node position parameters, and the set number is used to determine the number of nodes included in the text prediction layer;
  • the node information of the set number of nodes included in the text prediction layer is determined from the node position parameters and the text feature information, in combination with the basic decoding sub-model.
  • In addition to the text prediction layer, the text decoding model also includes a position information input layer and a basic decoding sub-model. In terms of structural connection, the output information of the position information input layer is passed to the basic decoding sub-model, and the information output by the basic decoding sub-model is passed to the multiple nodes in the text prediction layer.
  • The position information input layer can be understood as an information input layer that specifies the graph size required for generating the directed acyclic graph in the text prediction layer. The graph size specified by the position information input layer is actually the number of nodes required to construct the graph.
  • For example, the value of the graph size can be set to a multiple of the number of words contained in the original text. It can be seen that this number determines the number of nodes included in the text prediction layer; that is, the set number representing the number of nodes in the text prediction layer is determined by the graph size preset in the position information input layer. When the graph size is set to n, it is equivalent to determining that the number of nodes contained in the text prediction layer is n.
  • The node position parameters are used to represent the nodes and can be understood as the position parameters assigned to the nodes required for constructing the graph. Each node position parameter indicates the existence of a corresponding node in the text prediction layer. At the same time, the node position parameters are also among the learning parameters obtained through training in the text decoding model: through training iterations, the node position parameters are adjusted accordingly until stable parameter information is obtained at the end of training.
  • The text feature information output by the text encoding model and the node position parameters can serve as the input of the basic decoding sub-model in the text decoding model, and the basic decoding sub-model can output vector information whose count equals the number of nodes in the text prediction layer, to be used as the node information of the corresponding nodes.
  • The basic decoding sub-model can include a network structure with a self-attention mechanism and a cross-attention mechanism, which is equivalent to reusing the text decoding model in the text generation model of the related art.
  • Figure 1b shows a structural diagram of the text decoding model used in the text generation method provided by this embodiment. The text decoding model 12 includes an input layer with two different input branches. One input branch is the position information input layer 121 for inputting the graph size and node position information; the position information input layer 121 includes n determined node position parameters g. The other input branch is used to input the text feature information output by the text encoding model. The text decoding model 12 also includes a basic decoding sub-model 122 and a text prediction layer 123. The basic decoding sub-model 122 may include an m-layer network structure composed of a self-attention mechanism and a cross-attention mechanism; the text prediction layer 123 includes n nodes, the same number as the node position parameters. Finally, the output layer 124 of the text decoding model 12 outputs the target text of the original text. A minimal structural sketch follows.
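  • In the following sketch (hypothetical names, not the patented implementation), the n node position parameters g are learnable embeddings, the basic decoding sub-model is an m-layer Transformer decoder whose self-attention runs over the node positions and whose cross-attention runs over the text feature information, and its n output vectors become the node information of the n nodes in the text prediction layer:

```python
import torch
import torch.nn as nn

class TextDecoderBackbone(nn.Module):
    """Position information input layer + basic decoding sub-model (sketch)."""
    def __init__(self, d_model: int = 512, m_layers: int = 6, max_nodes: int = 1024):
        super().__init__()
        # node position parameters g_1..g_n: learning parameters of the model
        self.node_pos = nn.Parameter(torch.randn(max_nodes, d_model) * 0.02)
        layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=m_layers)

    def forward(self, text_features: torch.Tensor, graph_size: int) -> torch.Tensor:
        # graph_size n can be set to a multiple of the source length (see above)
        batch = text_features.size(0)
        g = self.node_pos[:graph_size].unsqueeze(0).expand(batch, -1, -1)
        # self-attention over node positions, cross-attention over text features
        return self.decoder(tgt=g, memory=text_features)  # [batch, n, d_model]

backbone = TextDecoderBackbone()
node_info = backbone(torch.randn(1, 5, 512), graph_size=5 * 8)  # e.g. n = 8 x words
print(node_info.shape)  # torch.Size([1, 40, 512]): node information of 40 nodes
```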
  • The text generation method realizes parallel determination of the node information of the multiple nodes in the added text prediction layer and parallel determination of the multiple target words in the generated text, reducing the text generation delay. At the same time, through the node information of the multiple nodes in the added text prediction layer, a one-to-one correspondence between the multiple words in the generated text and the matching nodes can be achieved, thereby better avoiding the occurrence of consecutively repeated words in the generated text. In addition, through the topological structure between the multiple nodes, the combination order of the multiple words in the generated text can be constrained, thereby ensuring the coherence of the context in the generated text, improving the generation quality of the generated text and ensuring text accuracy.
  • In an embodiment, the method further includes: performing learning parameter training on the constructed text decoding model based on a set loss function generation strategy, to obtain the trained text decoding model;
  • wherein the learning parameters include: the node position parameters involved in the position information input layer included in the text decoding model, the basic model parameters involved in the included basic decoding sub-model, and the node-related parameters involved in each node of the included text prediction layer.
  • This embodiment improves the network structure of the text decoding model, for example by adding a text prediction layer and using a value greater than the number of words contained in the text as the number of nodes, so that each node can correspond to one word in the output text.
  • Correspondingly, sample data improvements and loss function improvements are made in the training phase.
  • For the sample data improvement, this embodiment can use single sample data, that is, one input text corresponds to only one output text, forming one piece of sample data. For the loss function improvement, a loss function generation strategy is given that mainly considers the nodes in the text prediction layer added to the text decoding model. For example, this strategy can first consider the possible paths between the nodes and the generation probability of producing the output text through each constituted path, and then combine the generation probabilities of the multiple paths to generate the loss function.
  • Based on the determined loss function value, the learning parameters in the constructed text decoding model can be adjusted through backpropagation, and finally a text decoding model with higher accuracy can be obtained.
  • The learning parameters in the text decoding model can include the node position parameters in the position information input layer, the weight parameters involved in the basic decoding sub-model, and the node-related parameters set for the multiple nodes in the text prediction layer. The set node-related parameters can be used to determine the predicted nodes related to the generated text and to match the nodes to predicted words in the dictionary.
  • In an embodiment, performing learning parameter training on the constructed text decoding model to obtain the trained text decoding model includes the following steps a0 to e0:
  • a0. Obtain at least one set of sample data, where a set of sample data includes an original sample text and a corresponding single target sample text.
  • That is, a set of sample data in this embodiment may include one original sample text and one target sample text.
  • b0. For the current iteration, encode the original sample text in a set of sample data using the text encoding model and input the result into the current text decoding model.
  • The current iteration can be understood as either the first iteration or a training iteration to be executed in the iteration loop; the training logic executed in each iteration is the same. The current text decoding model can be understood as the text decoding model to be trained under the current iteration. The original sample text can be input into the trained text encoding model for encoding processing, and the encoded result is then input into the current text decoding model.
  • c0. For each text prediction path formed in the text prediction layer, determine the probability value that the predicted text generated based on that text prediction path is the target sample text.
  • The original sample text can be processed through the network structure included in the current text decoding model and the current parameter values of the learning parameters in that network structure. Based on the multiple nodes of the text prediction layer in the current text decoding model, multiple text prediction paths can be formed, and one predicted text can be generated through each text prediction path. The probability value that the predicted text is the target sample text can be determined as the probability value corresponding to generating the target sample text based on that text prediction path. This step is equivalent to part of the execution logic in the loss function generation strategy, and the determined probability values are used to determine the value of the loss function used in the current iteration.
  • In an embodiment, the text prediction paths are formed based on a set node combination algorithm in the text prediction layer. Illustratively, during the execution of this step, all paths formed by the connections between the multiple nodes can be used as text prediction paths. However, if all paths are directly selected as text prediction paths, more computing resources will be occupied during path calculation. For model training, this embodiment therefore considers using a dynamic programming algorithm over all path calculations to avoid repeated operations of the same logic, thereby saving computing resources and reducing training time. Alternatively, this embodiment can also consider using a certain algorithm to select only a part of the paths formed by the node connections as the text prediction paths.
  • d0. Substitute the determined probability values into a preset loss function generation formula to determine the current loss function value under the current iteration, and adjust the learning parameters in the current text decoding model through backpropagation based on the current loss function value.
  • In an embodiment, the loss function generation formula is expressed as taking the logarithm of the sum of the multiple probability values and negating the result of the logarithmic operation, that is, loss = -log(p1 + p2 + ... + pK), where pk is the probability value corresponding to the k-th text prediction path.
  • e0. Use the next iteration as the new current iteration and return to step b0 to continue execution until the iteration end condition is met, thereby obtaining the trained text decoding model.
  • In an embodiment, the iteration end condition may be that the current loss function value determined in the iteration logic falls within a set threshold range, or that the number of iterations reaches a set threshold.
  • Through the model training logic given in this embodiment, the inconsistent-label problem that occurs in training samples during the model training stage can be better avoided, so that each node in the text decoding model can correspond to a word appearing in the text to be generated. A sketch of the path-summed loss follows.
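  • The following is a minimal sketch of such a loss (hypothetical helper, written in log space for numerical stability; it assumes the convention that every text prediction path starts at the first node and ends at the last node). The dynamic program sums the probability of generating the target sample text over all paths without enumerating them:

```python
import torch

def dag_nll_loss(log_trans: torch.Tensor, log_emit: torch.Tensor,
                 target_ids: torch.Tensor) -> torch.Tensor:
    """log_trans: [n, n] log node transition matrix (upper-triangular DAG);
    log_emit: [n, vocab] log node-to-word matching probabilities;
    target_ids: [L] word ids of the target sample text.
    Returns -log(sum over all text prediction paths of the path probability)."""
    n, L = log_trans.size(0), target_ids.size(0)
    # f[u] = log prob of generating the first i target words on a path ending at node u
    f = torch.full((n,), float('-inf'))
    f[0] = log_emit[0, target_ids[0]]          # paths start at the first node
    for i in range(1, L):
        # advance one directed edge, then emit the i-th target word
        f = torch.logsumexp(f.unsqueeze(1) + log_trans, dim=0) + log_emit[:, target_ids[i]]
    return -f[n - 1]                           # paths must end at the last node
```

  • Because each step reuses the previous prefix probabilities f, repeated sub-computations over individual paths are avoided, which is the saving the dynamic programming algorithm described above is meant to provide.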
  • Figure 1c shows an application example of the text generation model involved in this embodiment in a machine translation scenario. The input text can likewise be the Chinese sentence "I went to the cinema", and the text generation model 13 used in this embodiment (which includes a text prediction layer) can generate the English text of that Chinese sentence. During training, the English text samples used may be only "I went to the movie theater" or only "I just went to the cinema". The multiple words in the English text sample each correspond to a processing node in the text generation model 13 (in Figure 1c, a processing node can be represented by the predicted word presented in the text). Equivalently, the text generation model 13 used in this embodiment can determine a best-matching word for each processing node and, according to the connection relationships between the processing nodes, determine from the multiple connection paths formed by the processing nodes a combination path of words that best fits the contextual relationship.
  • Figure 2 shows a schematic flowchart of a text generation method provided by an embodiment of the present disclosure. This embodiment is a refinement of the above embodiment. In this embodiment, generating target text corresponding to the original text based on the text feature information in combination with the trained text decoding model includes: inputting the text feature information and the node position parameters in the position information input layer into the basic decoding sub-model; obtaining the set number of initial text prediction vectors output by the basic decoding sub-model and using them as the node information of the set number of nodes in the text prediction layer; and constructing a directed acyclic graph based on the nodes, determining the topological structure between the nodes, and determining the target text of the original text in combination with the node information.
  • The text generation method provided by this embodiment includes the following steps:
  • S201. Input the acquired original text into the trained text encoding model to obtain text feature information.
  • In this embodiment, the text feature information may be a feature matrix containing the feature vectors corresponding to the multiple words in the original text.
  • The text decoding model includes a position information input layer. The position information input layer includes the position information (node position parameters) that represents, in the text decoding model, the nodes of the graph to be constructed, and it characterizes the size of the graph to be constructed (mainly through the number of node position parameters it includes).
  • S202. Input the text feature information and the node position parameters in the position information input layer into the basic decoding sub-model.
  • The text feature information and the node position parameters can be used as input information to the basic decoding sub-model in the text decoding model.
  • Figure 2a shows a schematic diagram of part of the network structure of the text decoding model used in the text generation method provided by this embodiment. It shows the position information input layer in the text decoding model and the basic decoding sub-model 20, where the position information input layer contains 9 (the graph size) node position parameters 21. The multiple node position parameters 21 and the text feature information 22 output by the text encoding model can be input into the basic decoding sub-model 20.
  • S203. Obtain the set number of initial text prediction vectors output by the basic decoding sub-model, and use them as the node information of the set number of nodes in the text prediction layer.
  • This step can obtain the processing information output by the basic decoding sub-model, and the processing information can include the set number of initial text prediction vectors, where the set number is the same as the number of node position parameters in the position information input layer. This step can also associate each initial text prediction vector obtained above with a node in the text prediction layer, as the node information of that node.
  • Figure 2a also shows the node set in the text prediction layer, which likewise contains 9 nodes 23. The multiple initial text prediction vectors output by the basic decoding sub-model 20 can correspond one-to-one with the nodes 23, serving as the node information of the multiple nodes 23.
  • S204. Construct a directed acyclic graph based on the nodes, determine the topological structure between the nodes, and determine the target text of the original text in combination with the node information.
  • The above steps are equivalent to assigning node information to the nodes in the text prediction layer, so that the nodes in the text prediction layer are associated with the actual original text. This step is equivalent to taking the text prediction layer as the execution subject, which mainly performs the subsequent text generation processing based on the node information of the multiple nodes, thereby generating the target text of the original text.
  • To generate the target text, this step needs to establish associations between the multiple nodes, and these associations can be achieved by constructing a graph. Considering that the text to be generated is directed and acyclic, this step can build a directed acyclic graph based on the multiple nodes.
  • In an embodiment, the execution logic of generating the target text corresponding to the original text based on the node information in this step can be described as: 1) establishing directed connections between the multiple nodes to form a directed acyclic graph, and determining, for each pair of connected nodes, the transition probability from the source node to the destination node, where the source node is the outgoing node of the directed connection between the two nodes and the destination node is the incoming node; 2) determining the predicted word corresponding to each node; 3) selecting the target words based on the transition probabilities between the nodes and the predicted words corresponding to the nodes, and finally combining the target words to form the target text according to the obtained combination order between the target words.
  • This embodiment provides an implementation of constructing a directed acyclic graph based on the multiple nodes, determining the topological structure between the nodes, and determining the target text of the original text in combination with the node information. The implementation includes the following steps a1 to c1:
  • a1. Construct a directed acyclic graph based on the node labels of the nodes in the text prediction layer, and obtain the topological structure between the nodes.
  • The directed acyclic graph is used to determine the connection relationships between the nodes. Taking into account the directedness of the constructed graph, this embodiment makes directed connections according to the node labels of the nodes. For example, assuming there are 9 nodes, following the node labels from small to large, node v1 establishes directed connections with nodes v2 to v9 respectively, node v2 can only establish directed connections with v3 to v9, and so on; the last node, v9, has no outgoing directed connections. After the directed acyclic graph is determined, the topological structure between the nodes is determined accordingly.
  • The topological structure between the nodes includes the connection relationships between each node and the other nodes. Based on these connection relationships, it can be known which nodes each node is connected to, and all existing connections are directed connections.
  • b1. Determine the node transition matrix corresponding to the text prediction layer based on the topological structure between the nodes and the node information of the nodes.
  • The row and column dimensions of the node transition matrix are both equal to the number of nodes included in the text prediction layer, and, considering the directedness of the node connections, the node transition matrix can be an upper triangular matrix. A valid element value in the node transition matrix indicates the existence of a directed connection between the node corresponding to its row and the node corresponding to its column, and is primarily the transition probability of the two nodes determined through the corresponding calculation logic.
  • In an embodiment, one implementation logic for the transition probability can be described as follows: for two connected nodes, obtain the node information of the two nodes, where the node information can be represented by feature vectors; the feature vectors representing the node information of the two nodes can then be multiplied together, and the resulting product, after normalization, can be used as the transition probability of the two nodes.
  • Another implementation logic can be described as follows: first obtain the node-related parameters set for the nodes in the text prediction layer, such as the first learning parameter and the second learning parameter, which are mainly used to determine the transition probability; these node-related parameters exist in the text decoding model and have fixed parameter values after the training of the text decoding model is completed. Then, for two connected nodes, the transition probability is determined based on the product of the node information and the node-related parameters.
  • For example, when node vi and node vj are connected, the calculation of the transition probability from node vi to node vj can be described as: determine the product of the initial text prediction vector (node information) of node vi and the first learning parameter (recorded as the first product); determine the product of the initial text prediction vector (node information) of node vj and the second learning parameter (recorded as the second product); normalize the product of the first product and the second product, and the normalized result can be regarded as the transition probability from node vi to node vj. Finally, the node transition matrix of the text prediction layer can be formed based on the transition probabilities.
  • In an embodiment, determining the node transition matrix corresponding to the text prediction layer based on the topological structure between the nodes and the node information of the nodes includes: for each node, determining the adjacent nodes that have directed connections with the node from the topological structure between the nodes; then, for each node, according to the node information of the node and of the corresponding adjacent node, the first learning parameter, and the second learning parameter, determining the transition probability from the node to the adjacent node in combination with the probability transition formula, where the first learning parameter and the second learning parameter are both node-related parameters corresponding to the node.
  • The probability transition formula can be expressed as:

    p(vi -> vj) = softmax( (Vi · W1)(Vj · W2)^T / sqrt(d) )

  • where softmax represents normalization; Vi and Vj respectively represent the node information vectors of node vi and node vj; W1 represents the first learning parameter related to the node; W2 represents the second learning parameter related to the node; d is a size determined in the construction phase of the text prediction layer; and p(vi -> vj) represents the transition probability from node vi to node vj.
  • In this way, the transition probability from each node to its adjacent nodes can be calculated, and the node transition matrix can be formed based on the transition probabilities. A sketch of this computation follows.
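  • In the following sketch (hypothetical names), the scores are masked so that only directed connections from lower to higher node labels are valid, and each row is normalized with softmax, matching the probability transition formula above:

```python
import torch
import torch.nn.functional as F

def node_transition_matrix(node_info: torch.Tensor,
                           W1: torch.Tensor, W2: torch.Tensor) -> torch.Tensor:
    """node_info: [n, d] node information vectors V; W1, W2: [d, d] first and
    second learning parameters. Returns the [n, n] node transition matrix E."""
    n, d = node_info.shape
    q = node_info @ W1                          # Vi W1 (first product)
    k = node_info @ W2                          # Vj W2 (second product)
    scores = (q @ k.T) / d ** 0.5               # scaled product of the two
    # only directed connections from lower to higher node labels exist
    mask = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(~mask, float('-inf'))
    E = F.softmax(scores, dim=-1)               # row i: p(vi -> vj)
    # the last node has no outgoing connections; clear its undefined row
    return torch.nan_to_num(E)

E = node_transition_matrix(torch.randn(9, 64), torch.randn(64, 64), torch.randn(64, 64))
print(E[0].sum())  # ~1.0: each row of transition probabilities sums to 1
```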
  • Figure 2b shows an example diagram of calculating a node transition matrix in the text generation method provided by this embodiment. In the example, the transition probabilities are calculated for the nodes included in the text prediction layer, and table E in Figure 2b represents the calculated node transition matrix. Figure 2b shows some of the node connections and the transition probabilities corresponding to those connections: for example, the transition probability from v1 to v2 is 0.3, and the transition probability from v1 to v3 is 0.7. It can be seen that, in the node transition matrix E, the transition probabilities in each row sum to 1.
  • c1. Determine the target text of the original text based on the node information of the nodes and the node transition matrix.
  • In this embodiment, after the node transition matrix is determined, the weights of the edges formed by the connections in the directed acyclic graph are determined accordingly. This embodiment can select a prediction path through a prediction-path selection strategy. Illustratively, one implementation of selecting the prediction path can be described as follows: along the node connections, with the outgoing node fixed, select the incoming node with the highest transition probability from that outgoing node, and use the edge between the two nodes as one edge of the prediction path; then repeat the above logic with the selected node as the new outgoing node, until all edges in the prediction path have been selected, thereby determining the target nodes that make up the prediction path.
  • This step can also determine the matching probability between each node and the words included in the dictionary based on the node information and the fully connected layer existing in the text prediction layer. The dictionary can be pre-created word list information; it contains the various words required for text generation, and each word can be represented in the form of a vector.
  • In the fully connected layer, the nodes of the preceding layer can be the nodes of the directed acyclic graph in this embodiment, and the nodes of the following layer can be the words in the dictionary. The full connection processing can be to calculate the matching probability from each node in the directed acyclic graph to each word node in the dictionary, and the calculation can be realized through full connection based on the node information of the node and the word vector of the word node, as sketched below.
  • In this way, the target word corresponding to each node in the prediction path can be determined, and the target text is finally formed by combining the multiple target words. It should be noted that, in this embodiment, the order of execution of determining the prediction path and determining the matching probabilities is not fixed; it is also possible to determine the prediction path after determining the matching probabilities, as long as the target text can be generated.
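  • A sketch of this matching computation follows (hypothetical names; the word table may be the trained third learning parameter or the word embeddings shared with the text encoding model, as discussed further below):

```python
import torch
import torch.nn.functional as F

def node_word_matching(node_info: torch.Tensor, word_table: torch.Tensor) -> torch.Tensor:
    """node_info: [n, d] node information; word_table: [vocab, d] word vectors.
    Returns [n, vocab]: one matching distribution over the dictionary per node."""
    logits = node_info @ word_table.T   # vector product of node info and word vectors
    return F.softmax(logits, dim=-1)    # normalize into matching probabilities

match_probs = node_word_matching(torch.randn(9, 64), torch.randn(32000, 64))
print(match_probs.shape, float(match_probs[0].sum()))  # (9, 32000), ~1.0
```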
  • This embodiment can also provide a detailed description of the above step c1 of "determining the target text of the original text based on the node information of the nodes and the node transition matrix". Based on the node transition matrix corresponding to the text prediction layer and the node information of the nodes, step c1 can be implemented by executing the logic of steps c11 to c13 provided in this embodiment.
  • c11. Determine the matching probabilities from the nodes to the words in the preset vocabulary through the fully connected layer in the text prediction layer.
  • In an embodiment, the text prediction layer also includes a fully connected structure. The fully connected structure can take the node information of the nodes in the directed acyclic graph as input information, and the following layer in the fully connected structure can be regarded as the word nodes formed by the words in the dictionary; the nodes in the directed acyclic graph and the word nodes in the dictionary can be connected through connecting lines. The connection weight of each connecting line in the fully connected structure can be the third learning parameter determined, through training of the text decoding model, for the connection between the corresponding node and the corresponding word.
  • Figure 2c shows an example diagram of the fully connected structure in the text prediction layer involved in the text generation method provided by this embodiment, namely a fully connected structure 24 that determines the predicted words associated with the nodes. Figure 2c also includes the result output layer, on which only the predicted words matching the nodes in the directed acyclic graph are displayed. For example, the word matching node v1 is "I", the word matching node v2 is "just", the word matching node v3 is "went", and so on.
  • The execution logic of this step can be described as follows: each node is connected to multiple words in the dictionary, and both the nodes and the words can represent their corresponding information through vectors. Therefore, for the matching probability of nodes and words: if the connection weights in the fully connected structure are re-determined in the text decoding model training stage, the third learning parameters obtained from training can be taken first, and the vector product of the corresponding node information and the third learning parameters can be determined; if the text decoding model training stage no longer adjusts the connection weights but directly shares the word features used by the text encoding model, the vector product of the corresponding node information and word information can be determined directly. Afterwards, the vector products of the node with respect to all words can be determined, and after normalization they can be used as the matching probabilities from the node to the words.
  • That is, the fully connected layer is built within the text prediction layer, contains a fully connected structure for matching probability processing, and can perform full connection processing on each node.
  • c12. Determine the predicted nodes and the corresponding target words according to the node transition matrix and the matching probabilities from the nodes to the words.
  • In an embodiment, a predicted node can be regarded as a key node, selected from among the nodes of the text prediction layer, on which the generation of the target text depends. Based on the matching probability corresponding to a predicted node, the predicted word matching that node can be determined, and the predicted word can be regarded as a target word contained in the target text.
  • In an embodiment, the predicted nodes can be obtained by first determining the prediction path based on the node transition matrix in the text prediction layer, and then determining the target word of each predicted node based on the matching probabilities from the nodes to the words. It is also possible to determine the predicted nodes and the target words jointly based on the node transition matrix and the matching probabilities from the nodes to the words, and to determine the prediction path from the predicted nodes, the prediction path being used to combine the target words to form the target text. It is also possible to first determine the predicted words corresponding to the nodes based on the matching probabilities from the nodes to the words, then determine the prediction path in the directed acyclic graph through a search algorithm, and finally select the target words required for text generation.
  • c13. Combine the target words according to the connection directions between the corresponding nodes to form the target text of the original text.
  • In an embodiment, the target words determined above are combined according to the connection directions between their corresponding nodes in the text prediction layer. The multiple target words can determine only one combination order, and the final target text can be obtained according to this combination order. The target text is equivalent to the result of performing text generation processing on the original text.
  • This embodiment also provides a refinement of the above step c12: for determining the predicted nodes and the corresponding target words according to the node transition matrix and the matching probabilities from the nodes to the words, one implementation can be described as follows:
  • First, at least one predicted node is determined according to the maximum transition probability corresponding to each node in the node transition matrix.
  • In an embodiment, the determination of predicted nodes by maximum transition probability first starts from the node corresponding to the starting node label; this node can be used as the first predicted node, and it is connected to its corresponding adjacent nodes. From the transition probabilities of the predicted node to its adjacent nodes, the maximum transition probability can be determined, and the adjacent node corresponding to the maximum transition probability is regarded as a new predicted node. After that, the maximum-transition-probability determination is performed again on the new predicted node to determine the next predicted node. Through the above logic, predicted nodes are determined in a loop until the last node is reached, and the last node is also used as the final predicted node. From this step, at least one predicted node can be obtained (in one case, the starting node is also the ending node).
  • Then, for each predicted node, the maximum matching probability is determined from the matching probabilities between the predicted node and the words, and the word corresponding to the maximum matching probability is determined as the target word.
  • In an embodiment, for each predicted node, the maximum matching probability can be determined from its matching probabilities, and the predicted word corresponding to the maximum matching probability is then obtained; this predicted word is equivalent to the target word corresponding to the predicted node. It can be seen that the order in which the predicted nodes were determined establishes a combination path for combining the target words, and this combination path can be used for the final target text generation. A sketch of this strategy follows.
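  • A sketch of this first strategy (hypothetical names; E and match_probs are the node transition matrix and node-to-word matching probabilities computed above):

```python
import numpy as np

def greedy_decode(E: np.ndarray, match_probs: np.ndarray, id2word: list) -> list:
    """Follow the maximum transition probability from the starting node to the
    last node, then read off each predicted node's best-matching word."""
    n = E.shape[0]
    path = [0]                                   # node with the starting label
    while path[-1] != n - 1:                     # loop until the last node
        path.append(int(E[path[-1]].argmax()))   # adjacent node with max transition prob
    # each predicted node contributes its maximum-matching-probability word
    return [id2word[int(match_probs[u].argmax())] for u in path]
```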
  • For determining the predicted nodes and target words, this embodiment also provides another implementation. It should be noted that, differently from the above implementation logic, this implementation simultaneously considers the influence of the transition probability in the node transition matrix and of the matching probability between node and word on the predicted node: it can multiply the transition probability by the matching probability and determine the predicted node based on the product.
  • In an embodiment, starting from the current node, which can be recorded as the first predicted node, the product of the transition probability to each adjacent node and the corresponding matching probability is computed; from the maximum product value, the corresponding matching probability and transition probability are known. The word corresponding to that matching probability is recorded as a target word, and the corresponding adjacent node is recorded as the next predicted node. This execution logic also performs loop processing in the order of the directed connections of the nodes, from which the predicted nodes and target words that meet the conditions can be determined. The above process of determining the predicted nodes is equivalent to determining the combination order for combining the target words. A sketch of this joint strategy follows.
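  • A sketch of this second, joint strategy (hypothetical names; it interprets the product as scoring each adjacent node by its transition probability times that node's best word matching probability):

```python
import numpy as np

def joint_greedy_decode(E: np.ndarray, match_probs: np.ndarray, id2word: list) -> list:
    """Step to the adjacent node maximizing transition prob x matching prob."""
    n = E.shape[0]
    best_word = match_probs.argmax(axis=1)       # best-matching word id per node
    best_p = match_probs.max(axis=1)             # its matching probability
    u = 0                                        # first predicted node
    words = [id2word[int(best_word[u])]]
    while u != n - 1:
        u = int((E[u] * best_p).argmax())        # max product over adjacent nodes
        words.append(id2word[int(best_word[u])])
    return words
```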
  • For determining the predicted nodes and target words, this embodiment also provides yet another implementation. Differently from the above two implementations, this implementation mainly considers the situation where different nodes may correspond to the same word, and is equivalent to proposing a target word determination method for this situation.
  • In an embodiment, the corresponding predicted words are first determined for the nodes in the text prediction layer, where the determination of predicted words is likewise implemented using the maximum-matching-probability logic. The purpose of this step is mainly to determine, for the nodes in the text prediction layer, candidate text generation paths based on the node label order, and to determine the transition probability of the edge between two nodes in a candidate text generation path based on the node transition matrix; then, through a path search algorithm combined with the predicted words, candidate prediction paths in which different nodes represent the same predicted word are merged and determined from the candidate text generation paths; and the prediction path with the highest weight is obtained from the candidate prediction paths. A sketch of this merged path search follows.
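  • A sketch of this third strategy (hypothetical names; one way to realize the described merging is a small beam search in which candidate paths that reach the same node with the same word sequence have their weights added):

```python
import numpy as np

def merged_path_search(E: np.ndarray, match_probs: np.ndarray,
                       id2word: list, beam: int = 8) -> list:
    """Different nodes may represent the same predicted word; merge such
    candidate paths and return the word sequence with the highest weight."""
    n = E.shape[0]
    word = match_probs.argmax(axis=1)            # predicted word id per node
    # state: (word id sequence, end node) -> accumulated path weight
    states = {((int(word[0]),), 0): float(match_probs[0, word[0]])}
    finished = {}                                # word sequence -> total weight
    while states:
        merged = {}
        for (seq, u), w in states.items():
            if u == n - 1:                       # path reached the last node
                finished[seq] = finished.get(seq, 0.0) + w
                continue
            for v in range(u + 1, n):            # directed connections only
                key = (seq + (int(word[v]),), v)
                step = w * E[u, v] * match_probs[v, word[v]]
                merged[key] = merged.get(key, 0.0) + step  # merge equal sequences
        # keep only the `beam` highest-weight candidate prediction paths
        states = dict(sorted(merged.items(), key=lambda kv: -kv[1])[:beam])
    best_seq = max(finished.items(), key=lambda kv: kv[1])[0]
    return [id2word[i] for i in best_seq]
```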
  • Among the three implementations above, the first has the fastest execution speed, but the quality of the generated text is relatively low; the second is in a moderate state in terms of both execution speed and text generation quality; the third has a relatively slow execution speed, but the quality of the generated text is relatively high. This embodiment can adopt, but is not limited to, the above methods, and the target text can be generated by choosing an appropriate predicted-node and target-word implementation according to the actual situation.
  • This embodiment provides a text generation method that refines the implementation process by which the text decoding model generates the target text. By adding a text prediction layer, graph nodes in the form of a directed acyclic graph are used to effectively determine the target words and predicted nodes, which ensures contextual coherence and avoids the continuous occurrence of repeated words in the generated text. Compared with the related art, this improves the quality of the generated text and ensures text accuracy.
  • Figure 3 is a schematic structural diagram of a text generation apparatus provided by an embodiment of the present disclosure. This embodiment is applicable to text generation situations. The apparatus can be implemented by software and/or hardware and can be configured in a terminal and/or a server to implement the text generation method in the embodiments of the present disclosure.
  • The apparatus may include: an encoding execution module 31 and a decoding execution module 32.
  • The encoding execution module 31 is configured to input the acquired original text into the trained text encoding model to obtain text feature information.
  • The decoding execution module 32 is configured to generate target text corresponding to the original text based on the text feature information in combination with the trained text decoding model.
  • The text decoding model includes a text prediction layer, the node information of a set number of nodes included in the text prediction layer is determined by the text feature information, and the target words contained in the target text and the combination order of the target words are determined by the node information of the nodes and the topological structure between the nodes.
  • This embodiment provides a text generation apparatus that realizes parallel determination of the node information of the nodes in the added text prediction layer and parallel determination of the target words in the generated text, reducing the text generation delay. At the same time, through the node information of the nodes in the added text prediction layer, a one-to-one correspondence between the words in the generated text and the matching nodes can be achieved, thereby better avoiding the occurrence of consecutively repeated words in the generated text. In addition, the topological structure between the nodes can constrain the combination order of the words in the generated text, thereby ensuring the coherence of the context in the generated text, improving the generation quality of the generated text and ensuring text accuracy.
  • In an embodiment, the text decoding model includes: a position information input layer, a basic decoding sub-model, and a text prediction layer;
  • the position information input layer includes a set number of node position parameters, and the set number is used to determine the number of nodes included in the text prediction layer;
  • the node information of the set number of nodes included in the text prediction layer is determined through the node position parameters and the text feature information, in combination with the basic decoding sub-model.
  • In an embodiment, the decoding execution module 32 includes:
  • an information input unit configured to input the text feature information and the node position parameters in the position information input layer into the basic decoding sub-model;
  • an initial vector output unit configured to obtain the set number of initial text prediction vectors output by the basic decoding sub-model, and use the set number of initial text prediction vectors as the node information of the set number of nodes in the text prediction layer;
  • a text generation unit configured to construct a directed acyclic graph based on the nodes, determine the topological structure between the nodes, and determine the target text of the original text in combination with the node information.
  • In an embodiment, the text generation unit includes:
  • a first execution unit configured to construct a directed acyclic graph based on the node labels of the nodes in the text prediction layer, and obtain the topological structure between the nodes;
  • a second execution unit configured to determine the node transition matrix corresponding to the text prediction layer based on the topological structure between the nodes and the node information of the nodes;
  • a third execution unit configured to determine the target text of the original text based on the node information of the nodes and the node transition matrix.
  • In an embodiment, the second execution unit is configured to:
  • for each node, determine the adjacent nodes having directed connections with the node from the topological structure between the nodes;
  • for each node, according to the node information of the node and of the corresponding adjacent node, the first learning parameter, and the second learning parameter, determine the transition probability from the node to the adjacent node in combination with the probability transition formula;
  • form the node transition matrix corresponding to the text prediction layer based on the transition probabilities.
  • In an embodiment, the third execution unit is configured to:
  • determine the matching probabilities from the nodes to the words in the preset vocabulary through the fully connected layer in the text prediction layer;
  • determine the predicted nodes and the corresponding target words according to the node transition matrix and the matching probabilities from the nodes to the words;
  • combine the target words according to the connection directions between the corresponding nodes to form the target text of the original text.
  • In an embodiment, the third execution unit performs the step of determining the predicted nodes and the corresponding target words based on the node transition matrix and the matching probabilities from the nodes to the words, which may be: determining at least one predicted node according to the maximum transition probability corresponding to each node in the node transition matrix; and determining, for each predicted node, the maximum matching probability from the matching probabilities between the predicted node and the words, with the word corresponding to the maximum matching probability determined as the target word.
  • In an embodiment, the third execution unit performs the step of determining the predicted nodes and the corresponding target words based on the node transition matrix and the matching probabilities from the nodes to the words, which may also be: taking the current node as the first predicted node, and, for the adjacent nodes of the current node, determining the maximum product of the transition probability and the matching probability, recording the corresponding word as a target word and the corresponding adjacent node as a new predicted node; the predicted node is then used as the new current node, and the selection operation on the adjacent nodes corresponding to the current node is re-executed until the loop end condition is reached.
  • In an embodiment, the third execution unit performs the step of determining the predicted nodes and the corresponding target words based on the node transition matrix and the matching probabilities from the nodes to the words, which may also be: determining the predicted words corresponding to the nodes based on the matching probabilities, determining candidate prediction paths through a path search algorithm in which different nodes representing the same predicted word are merged, and obtaining the prediction path with the highest weight to determine the predicted nodes and the target words.
  • In an embodiment, the apparatus may further include: a model training module configured to perform learning parameter training on the constructed text decoding model based on the set loss function generation strategy, to obtain the trained text decoding model;
  • wherein the learning parameters include: the node position parameters involved in the position information input layer included in the text decoding model, the basic model parameters involved in the included basic decoding sub-model, and the node-related parameters involved in the nodes of the included text prediction layer.
  • In an embodiment, the model training module can be configured to:
  • obtain at least one set of sample data, where a set of sample data includes an original sample text and a corresponding single target sample text;
  • for the current iteration, encode the original sample text in a set of sample data using the text encoding model and then input the result into the current text decoding model;
  • for each text prediction path formed in the text prediction layer, determine the probability value that the predicted text generated based on that text prediction path is the target sample text;
  • substitute the determined probability values into the preset loss function generation formula to determine the current loss function value, and, based on the current loss function value, adjust the learning parameters in the current text decoding model through backpropagation to obtain the text decoding model for the next iteration;
  • take the next iteration as the new current iteration and continue the learning parameter training until the iteration end condition is met, thereby obtaining the trained text decoding model.
  • the loss function generation formula is expressed as: taking the logarithm of the sum of the probability values, and taking the negative of the logarithm operation result.
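In symbols (notation ours, not the patent's), with $p_k$ denoting the probability of generating the target sample text through the $k$-th text prediction path, the described formula reads:

$$\mathcal{L} = -\log \sum_{k} p_k$$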
  • the above-mentioned device can execute the method provided by any embodiment of the present disclosure, and has corresponding functional modules and beneficial effects for executing the method.
  • FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • Terminal devices in embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (Personal Digital Assistants), PADs (tablets), PMPs (Portable Multimedia Players), and vehicle-mounted terminals (e.g., car navigation terminals), as well as fixed terminals such as digital TVs, desktop computers, etc.
  • the electronic device shown in FIG. 4 is only an example and should not impose any limitations on the functions and scope of use of the embodiments of the present disclosure.
  • the electronic device 40 may include a processing device (e.g., central processing unit, graphics processor, etc.) 41, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 42 or a program loaded from a storage device 48 into a random access memory (RAM) 43.
  • In the RAM 43, various programs and data required for the operation of the electronic device 40 are also stored.
  • the processing device 41, ROM 42 and RAM 43 are connected to each other via a bus 45.
  • An input/output (I/O) interface 44 is also connected to the bus 45.
  • the following devices may be connected to the I/O interface 44: input devices 46 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 47 including, for example, a liquid crystal display (LCD), speakers, vibrators, etc.; storage devices 48 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 49.
  • the communication device 49 may allow the electronic device 40 to communicate wirelessly or wiredly with other devices to exchange data.
  • While FIG. 4 illustrates the electronic device 40 with various means, it should be understood that it is not required to implement or provide all of the illustrated means; more or fewer means may alternatively be implemented or provided.
  • Embodiments of the present disclosure include a computer program product including a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 49, or from the storage device 48, or from the ROM 42.
  • When the computer program is executed by the processing device 41, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
  • the electronic device provided by the embodiments of the present disclosure belongs to the same inventive concept as the text generation method provided by the above embodiments; technical details not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same beneficial effects as the above embodiments.
  • Embodiments of the present disclosure provide a computer storage medium on which a computer program is stored.
  • when the program is executed by a processor, the text generation method provided in the above embodiments is implemented.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any suitable medium, including but not limited to: wire, optical cable, RF (radio frequency), etc., or any suitable combination of the above.
  • the client and server can communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication (e.g., a communication network) in any form or medium.
  • examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; it may also exist independently without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs;
  • when the above-mentioned one or more programs are executed by the electronic device, the electronic device is caused to execute the text generation method provided by the above embodiments.
  • Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as "C" or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each box in the flowchart or block diagram may represent a module, program segment, or part of code that contains one or more executable instructions for implementing specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures; for example, two blocks shown in succession may actually be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure can be implemented in software or hardware.
  • the name of the unit does not constitute a limitation on the unit itself under certain circumstances.
  • the first acquisition unit can also be described as "the unit that acquires at least two Internet Protocol addresses.”
  • For example, and without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • More specific examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • Example 1 provides a text generation method, the method including: inputting the acquired original text into a trained text encoding model to obtain text feature information; and generating, based on the text feature information and in combination with a trained text decoding model, the target text corresponding to the original text; wherein the text decoding model includes a text prediction layer, the text prediction layer includes a set number of nodes, the node information of the nodes is determined by the text feature information, and the target words contained in the target text and the combination order of the target words are determined by the node information of the nodes and the topological structure between nodes.
  • Example 2 provides the text generation method, wherein the text decoding model includes: a position information input layer, a basic decoding sub-model, and a text prediction layer; the position information input layer includes a set number of node position parameters, the set number is used to determine the number of nodes included in the text prediction layer, and the node information of the set number of nodes included in the text prediction layer is determined through the node position parameters and the text feature information, in combination with the basic decoding sub-model.
  • Example 3 provides the text generation method, wherein generating the target text corresponding to the original text based on the text feature information and the trained text decoding model includes: inputting the text feature information and the node position parameters in the position information input layer into the basic decoding sub-model; obtaining the set number of initial text prediction vectors output by the basic decoding sub-model, and using the set number of initial text prediction vectors as the node information of the set number of nodes in the text prediction layer; and constructing a directed acyclic graph based on the nodes, determining the topological structure between nodes, and determining the target text of the original text in combination with the node information.
  • Example 4 provides the text generation method, wherein constructing the directed acyclic graph based on the nodes, determining the topological structure between nodes, and determining the target text of the original text in combination with the node information includes: constructing a directed acyclic graph based on the node labels of the nodes in the text prediction layer to obtain the topological structure between nodes; determining the node transition matrix corresponding to the text prediction layer based on the topological structure between nodes and the node information of the nodes; and determining the target text of the original text based on the node information of the nodes and the node transition matrix.
  • Example 5 provides the text generation method, wherein determining the node transition matrix corresponding to the text prediction layer based on the topological structure between nodes and the node information of the nodes includes: for each node, determining the adjacent nodes with directed connections to the node from the topological structure between nodes; determining the transition probability from the node to the adjacent nodes according to the node information of the node and the adjacent nodes; and forming the node transition matrix corresponding to the text prediction layer based on the transition probabilities.
  • Example 6 provides the text generation method, wherein determining the target text of the original text based on the node information of the nodes and the node transition matrix includes: determining, according to the node information of the nodes, the matching probability of each node to the words in a preset vocabulary through the fully connected layer in the text prediction layer; determining the predicted nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities; and combining the target words to form the target text of the original text.
  • Example 7 provides the text generation method, wherein determining the predicted nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities includes: determining at least one predicted node according to the maximum transition probability corresponding to the node in the node transition matrix; and, for each predicted node, determining the maximum matching probability from the matching probabilities of the predicted node to the words, and determining the word corresponding to the maximum matching probability as the target word.
  • Example 8 provides the text generation method, wherein determining the predicted nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities includes: taking the node corresponding to the starting node label as the current node; obtaining the current transition probabilities from the current node to its adjacent nodes from the node transition matrix; determining the product values of the current transition probabilities and the corresponding node-to-word matching probabilities; selecting the maximum product value from the product values, taking the adjacent node and word associated with the maximum product value as the predicted node and the target word respectively, and adding the predicted node and target word association to a cache table; and taking the predicted node as the new current node and re-executing the selection operation on the adjacent nodes of the current node until a loop end condition is reached.
  • Example 9 provides the text generation method, wherein determining the predicted nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities includes: determining the corresponding maximum matching probability based on the node-to-word matching probabilities, and determining the word corresponding to the maximum matching probability as the predicted word of the corresponding node; determining the prediction path with the highest weight based on a preset path search algorithm, in combination with the node transition matrix and the predicted words of the nodes; and determining the predicted words corresponding to the predicted nodes in the prediction path as the corresponding target words.
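A possible sketch of this path-search variant follows, reusing the NumPy conventions of the earlier sketches; a simple beam search stands in for the preset path search algorithm (the embodiments do not fix a particular one), and it is assumed that at least one path reaches the end node.

```python
import numpy as np

def path_search_decode(trans, word_probs, vocab, beam=4, start=0, end=None):
    """Each node first commits to its best-matching word; a beam search then
    keeps the `beam` highest-weight partial paths through the DAG."""
    end = len(trans) - 1 if end is None else end
    best_word = word_probs.argmax(axis=1)             # predicted word per node
    word_score = word_probs.max(axis=1)               # its matching probability
    beams, finished = [(0.0, [start])], []
    while beams:
        candidates = []
        for weight, path in beams:
            i = path[-1]
            if i == end:
                finished.append((weight, path))       # path reached the end node
                continue
            for j in np.flatnonzero(trans[i]):        # adjacent nodes only
                w = weight + np.log(trans[i, j]) + np.log(word_score[j])
                candidates.append((w, path + [int(j)]))
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam]
    weight, path = max(finished, key=lambda c: c[0])  # highest-weight prediction path
    return [vocab[int(best_word[n])] for n in path[1:]]
```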
  • Example 10 provides the text generation method, which further includes: performing, based on a set loss function generation strategy, learning parameter training on the constructed text decoding model to obtain the trained text decoding model; wherein the learning parameters include: the node position parameters involved in the position information input layer included in the text decoding model, the basic model parameters involved in the included basic decoding sub-model, and the node-related parameters involved in the nodes of the included text prediction layer.
  • Example 11 provides the text generation method, wherein performing learning parameter training on the constructed text decoding model based on the set loss function generation strategy to obtain the trained text decoding model includes: obtaining at least one set of sample data, where a set of sample data includes an original sample text and a corresponding single target sample text; under the current iteration, encoding the original sample text in a set of sample data using the text encoding model and inputting it to the current text decoding model; determining, based on the current text decoding model, the probability values corresponding to generating the target sample text from the original sample text through the text prediction paths, where the text prediction paths are formed based on the nodes in the text prediction layer in combination with a set algorithm; determining the current loss function value based on the probability values combined with the loss function generation formula, and adjusting the learning parameters in the current text decoding model through backpropagation based on the current loss function value, to obtain the text decoding model for the next iteration; and taking the next iteration as the new current iteration and continuing the learning parameter training until the iteration end condition is met, obtaining the trained text decoding model.
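A hypothetical skeleton of this training procedure (PyTorch-style; `encoder`, `decoder`, `path_probabilities`, and the dataset format are placeholders we introduce for illustration, not the modules disclosed by the embodiments):

```python
import torch

def train_decoder(encoder, decoder, optimizer, sample_pairs, max_iters=100_000):
    """sample_pairs yields (original sample text, single target sample text)
    pairs; encoder is the trained text encoding model."""
    for it, (src, tgt) in enumerate(sample_pairs):
        feats = encoder(src)                      # text feature information
        node_info, trans = decoder(feats)         # node states + transition matrix
        p = decoder.path_probabilities(node_info, trans, tgt)  # one value per path
        loss = -torch.log(p.sum())                # loss function generation formula
        optimizer.zero_grad()
        loss.backward()                           # backpropagation adjusts the
        optimizer.step()                          # learning parameters
        if it + 1 >= max_iters:                   # one possible iteration-end condition
            break
    return decoder
```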
  • Example 12 provides the text generation method, wherein the loss function generation formula is expressed as: taking the logarithm of the sum of the probability values, and taking the negative of the logarithm operation result.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

Disclosed in the embodiments of the present disclosure are a text generation method and apparatus, and a computer device and a storage medium. The method comprises: inputting acquired original text into a trained text coding model, so as to obtain text feature information (S101); and on the basis of the text feature information and in combination with a trained text decoding model, generating target text corresponding to the original text (S102), wherein the text decoding model comprises a text prediction layer, node information of a set number of nodes included in the text prediction layer is determined by means of the text feature information, and target words comprised in the target text and a combination sequence of the target words are determined by means of the node information of the nodes and a topological structure between the nodes.

Description

A text generation method, device, computer equipment and storage medium
This application claims priority to the Chinese patent application No. 202210346397.4, filed with the China Patent Office on March 31, 2022, the entire content of which is incorporated herein by reference.
Technical Field
The embodiments of the present disclosure relate to the technical field of natural language processing, for example, to a text generation method, device, computer equipment and storage medium.
Background
Text generation technology is an important technology in the field of natural language processing. Text generation technology can use established information and text generation models to generate text sequences that meet specific goals. The text generation model used is trained based on sample data in different application scenarios (generative reading comprehension, human-computer dialogue, intelligent writing, machine translation, etc.), so that text generation in different application scenarios can be achieved.
Currently, a problem with the text generation models used in text generation implementations is that there will be a high output delay during the text generation process (output delay refers to the time delay required from the model receiving input to the model fully generating the text output), and this output delay is linearly related to the sentence length of the generated text. Alternatively, when solving the output delay problem, new problems are introduced; for example, the produced text may contain consecutive repeated words, or the context may be incoherent.
Summary
Embodiments of the present disclosure provide a text generation method, device, computer equipment, and storage medium, which reduce the contextual incoherence and continuous repetition of words in the generated text, and improve the quality of the generated text.
In a first aspect, embodiments of the present disclosure provide a text generation method, which includes:
inputting the acquired original text into a trained text encoding model to obtain text feature information;
generating, based on the text feature information and in combination with a trained text decoding model, a target text corresponding to the original text;
wherein the text decoding model includes a text prediction layer, the node information of a set number of nodes included in the text prediction layer is determined by the text feature information, and the target words contained in the target text and the combination order of the target words are determined by the node information of the nodes and the topological structure between nodes.
In a second aspect, embodiments of the present disclosure also provide a text generation apparatus, which includes:
an encoding execution module configured to input the acquired original text into a trained text encoding model to obtain text feature information;
a decoding execution module configured to generate, based on the text feature information and in combination with a trained text decoding model, a target text corresponding to the original text;
wherein the text decoding model includes a text prediction layer, the node information of a set number of nodes included in the text prediction layer is determined by the text feature information, and the target words contained in the target text and the combination order of the target words are determined by the node information of the nodes and the topological structure between nodes.
In a third aspect, embodiments of the present disclosure also provide an electronic device, which includes:
one or more processors;
a storage device configured to store one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the text generation method provided by any embodiment of the present disclosure.
In a fourth aspect, embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the text generation method provided by any embodiment of the present disclosure is implemented.
Brief Description of the Drawings
Figure 1 is a schematic flowchart of a text generation method provided by an embodiment of the present disclosure;
Figure 1a shows the application effect of a text generation model in the related art in a machine translation scenario;
Figure 1b shows the structure of the text decoding model used in the text generation method provided by this embodiment;
Figure 1c shows the application effect of the text generation model involved in this embodiment in a machine translation scenario;
Figure 2 is a schematic flowchart of a text generation method provided by an embodiment of the present disclosure;
Figure 2a is a schematic diagram of part of the network structure of the text decoding model used in the text generation method provided by this embodiment;
Figure 2b shows an example of calculating the node transition matrix in the text generation method provided by this embodiment;
Figure 2c shows an example of the fully connected structure within the text prediction layer involved in the text generation method provided by this embodiment;
Figure 3 is a schematic structural diagram of a text generation apparatus provided by an embodiment of the present disclosure;
Figure 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
It should be understood that the multiple steps described in the method implementations of the present disclosure may be executed in different orders and/or in parallel. Furthermore, method implementations may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this regard.
As used herein, the term "include" and its variations are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the description below.
It should be noted that concepts such as "first" and "second" mentioned in this disclosure are only used to distinguish different devices, modules, or units, and are not used to limit the order of, or the interdependence between, the functions performed by these devices, modules, or units. It should be noted that the modifiers "one" and "multiple" mentioned in this disclosure are illustrative and not restrictive; those skilled in the art will understand that, unless the context clearly indicates otherwise, they should be understood as "one or more".
The names of messages or information exchanged between multiple devices in the embodiments of the present disclosure are for illustrative purposes only and are not used to limit the scope of these messages or information.
Figure 1 is a schematic flowchart of a text generation method provided by an embodiment of the present disclosure. This embodiment is applicable to text generation scenarios. The method can be executed by a text generation apparatus, which can be implemented in software and/or hardware and can be configured in a terminal and/or server to implement the text generation method in the embodiments of the present disclosure.
It should be noted that text generation models in the related art are usually trained on sample data consisting of one input text and multiple output texts. After a conventional text generation model is trained in this form, in practical applications the generated target text may mix predicted words from different outputs, mainly because it cannot be distinguished which of the output texts used in the training phase a predicted word comes from, so the possible outputs of the predicted words contained in multiple output texts are mixed together, and the generation quality of the text cannot be guaranteed.
As an example, Figure 1a shows the application effect of a text generation model in the related art in a machine translation scenario. As shown in Figure 1a, the input text may be the Chinese sentence "我去电影院了" ("I went to the movie theater"). In the machine translation application scenario, the purpose of the text generation model 11 in the related art is to generate the English text of the above Chinese sentence. When training the text generation model 11 in the related art, multiple English output samples may be used, such as "I went to the movie theater" and "I just went to the cinema". After training is completed, when actually performing English machine translation of "我去电影院了", the words in the above output samples may be mixed in the output, producing the erroneous predicted text "I went went the the theater".
This embodiment provides a text generation method that improves on the text generation model in the related art by adding a text prediction layer; through the nodes included in the added text prediction layer, high-quality generated text can be obtained.
For example, as shown in Figure 1, the text generation method provided by this embodiment may include the following steps:
S101. Input the acquired original text into a trained text encoding model to obtain text feature information.
It should be noted that the text generation method provided by this embodiment is not limited to a particular application scenario; if text generation is required in a certain application scenario, training samples can be collected in that scenario to train the text generation model. Structurally, the text generation model can include two parts: a text encoding model and a text decoding model.
In this embodiment, the original text is equivalent to the input text before text generation, and the content of the original text may differ between application scenarios. For example, in a machine translation scenario, for Chinese-to-English translation the original text can be the Chinese text to be translated; for English-to-Chinese translation it can be the English text to be translated.
In this embodiment, the text encoding model can be used to encode the original text to obtain its text feature information. The model structure of the text encoding model can directly reuse the text encoding model of text generation models in the related art, and it can be trained on the sample data provided in different application scenarios, so that the output text feature information can meet the text generation requirements of each scenario. For example, in a machine translation application scenario, the output text feature information is mainly used to subsequently obtain the translated text corresponding to the original text.
In this embodiment, the text feature information is used to characterize the feature information of the multiple words in the originally input original text. The text feature information can be represented by a text feature matrix; generally, the number of text feature vectors included in the text feature matrix is the same as the number of words contained in the original text.
S102. Based on the text feature information and in combination with a trained text decoding model, generate the target text corresponding to the original text.
In this embodiment, after the text feature information output by the text encoding model is obtained through the above step, the text feature information can be input to the text decoding model as input data. In this embodiment, the text decoding model includes a text prediction layer, the node information of a set number of nodes included in the text prediction layer is determined by the text feature information, and the target words contained in the target text and the combination order of the target words are determined by the node information of the set number of nodes and the topological structure between nodes.
For example, compared with the text decoding model of text generation models in the related art, the text decoding model used in this step includes a text prediction layer, and the text prediction layer contains a certain number of nodes; through the node information of the multiple nodes and the topological structure between nodes, the target text of the original text can be determined effectively. It can be understood that the text decoding model in this embodiment is likewise trained on the sample data provided in different application scenarios, so that the output target text can meet the text generation requirements of each scenario.
Following the above description, the text prediction layer contains a set number of nodes; all the nodes can be used to construct the graph required for text generation, and the node information of each node can be determined from the text feature information. In this embodiment, the specific value of the set number is greater than the number of words contained in the original text; it can serve as the graph size required for graph construction in the text prediction layer, and also as the maximum possible predicted length of the text to be generated, i.e., the number of words contained in the text to be generated will not exceed the set number. The node information of the set number of nodes contained in the text prediction layer can be determined from the text feature information; for example, the text feature information can be combined with certain parameter information for fully connected processing, and finally the relevant feature information of the multiple words in the original text is mapped onto the nodes as their node information.
In this embodiment, the generation logic of the target text needs to consider the node information of the nodes in the text prediction layer as well as the topological structure between nodes. Analysis shows that the target text is likewise composed of individual words, and the words in the target text should be related in some way to the words in the original text. Through the above text encoding model of this embodiment, text feature information characterizing the multiple words in the original text can be obtained; afterwards, through the text decoding model of this embodiment, the text feature information can be converted, via basic decoding processing, into the node information of the multiple nodes included in the text prediction layer, which is equivalent to establishing an association between the multiple words in the original text and the multiple nodes in the text prediction layer.
For example, through the node information of the multiple nodes in the text prediction layer, the text decoding model provided by this embodiment can establish a correspondence between the nodes and the words in a dictionary, so that each node corresponds to one best-matching word.
In addition, the text decoding model provided by this embodiment can also connect the multiple nodes in the text prediction layer according to certain connection conditions to form the topological structure between nodes. Based on the formed topological structure, the connection relationships among the multiple nodes are clear. According to the trained learning parameters in the text prediction layer, combined with the topological structure between nodes, the transition probability from one node to another connected node can be determined. Finally, based on the word corresponding to each node and the transition probabilities from each node to the other connected nodes, target nodes can be selected from the multiple nodes; since nodes correspond one-to-one to words, selecting the target nodes also determines the target words required to generate the target text. Moreover, the combination order of the multiple target words in the generated target text can be determined by the inter-node connection relationships represented by the topological structure. Through the above logic, a target text that avoids consecutive repeated words and has a clear contextual relationship can be determined for the original text.
On the basis of this embodiment, the text decoding model is optimized; the text decoding model may include: a position information input layer, a basic decoding sub-model, and a text prediction layer;
wherein the position information input layer includes a set number of node position parameters, the set number determines the number of nodes included in the text prediction layer, and the node information of the set number of nodes included in the text prediction layer is determined through the node position parameters and the text feature information, in combination with the basic decoding sub-model.
In the above embodiment, in addition to the text prediction layer, the text decoding model also includes a position information input layer and a basic decoding sub-model; in terms of structural connection, the information output by the position information input layer is passed to the basic decoding sub-model, and the information output by the basic decoding sub-model is passed to the multiple nodes in the text prediction layer.
In this embodiment, the position information input layer can be understood as the information input layer that, in the text generation implementation, predicts the graph size required for the directed acyclic graph to be generated in the text prediction layer. The graph size predicted in this position information input layer is actually the number of nodes required to construct the graph, and its value can be set to a multiple of the number of words contained in the original text. It can be understood that this number of nodes determines the number of nodes included in the text prediction layer, i.e., the set number characterizing the number of nodes in the text prediction layer is effectively preset in this position information input layer; setting the graph size to n is equivalent to determining that the text prediction layer contains n nodes.
Following the above description, in the position information input layer, in addition to presetting the number of nodes contained in the text prediction layer, the position information of each node also needs to be preset. In this embodiment, node position parameters are used to characterize the position information of the nodes; a node position parameter can be understood as a position parameter assigned to a node required to construct the graph, and each node position parameter indicates that a corresponding node exists in the text prediction layer. At the same time, the node position parameters are also among the learning parameters obtained by training the text decoding model; through training iterations, the node position parameters can be adjusted accordingly until stable parameter information is obtained at the end of training.
In the specific implementation of text generation, the text feature information input from the text encoding model and the node position parameters can each be used as input to the basic decoding sub-model in the text decoding model, and the basic decoding sub-model can output as many vectors as there are nodes in the text prediction layer, each serving as the node information of the corresponding node. The basic decoding sub-model can include a self-attention network structure and a cross-attention network structure, which is equivalent to reusing the text decoding model of text generation models in the related art.
As an example, Figure 1b shows the structure of the text decoding model used in the text generation method provided by this embodiment. As shown in Figure 1b, the text decoding model 12 includes an input layer with two different input branches: one branch is the position information input layer 121 for inputting the graph size and node position information, which includes n determined node position parameters g; the other branch is used to input the text feature information output by the text encoding model. The text decoding model 12 also includes a basic decoding sub-model 122 and a text prediction layer 123; the basic decoding sub-model 122 can include an m-layer network structure composed of self-attention and cross-attention mechanisms, and the text prediction layer 123 includes n nodes, the same number as the node position parameters. Finally, the output layer 124 of the text decoding model 12 outputs the target text of the original text.
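A structural sketch of Figure 1b follows (assumptions ours: standard PyTorch modules stand in for the m-layer self-attention/cross-attention stack, and the hyperparameters are illustrative). The n learnable node position parameters g are fed as queries to the basic decoding sub-model, and its n output vectors become the node information of the text prediction layer.

```python
import torch
import torch.nn as nn

class TextDecoder(nn.Module):
    def __init__(self, n_nodes, d_model=512, n_layers=6, n_heads=8):
        super().__init__()
        # position information input layer: one learnable parameter g per node
        self.node_pos = nn.Parameter(torch.randn(n_nodes, d_model))
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        # basic decoding sub-model: m layers of self- plus cross-attention
        self.base = nn.TransformerDecoder(layer, n_layers)

    def forward(self, text_features):
        # text_features: (batch, src_len, d_model) from the text encoding model
        batch = text_features.size(0)
        queries = self.node_pos.unsqueeze(0).expand(batch, -1, -1)
        # each of the n outputs is the node information of one graph node
        return self.base(queries, memory=text_features)
```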
The text generation method provided by the embodiments of the present invention realizes the parallel determination of the node information of the multiple nodes in the added text prediction layer and the parallel determination of the multiple target words in the generated text, reducing the text generation delay. At the same time, through the node information of the multiple nodes in the added text prediction layer, a one-to-one correspondence between the words in the generated text and the matched nodes can be achieved, which better avoids the appearance of consecutive repeated words in the generated text. In addition, the inter-node topological structure of the multiple nodes can constrain the combination order of the words in the generated text, thereby ensuring the contextual coherence of the generated text, which improves the generation quality and ensures text accuracy.
In one embodiment, the method further includes:
performing, based on a set loss function generation strategy, learning parameter training on the constructed text decoding model to obtain the trained text decoding model;
wherein the learning parameters include: the node position parameters involved in the position information input layer included in the text decoding model, the basic model parameters involved in the included basic decoding sub-model, and the node-related parameters involved in each node of the included text prediction layer.
For the text generation model in the related art as shown in Figure 1a, another issue in the training phase is that the sample data participating in training contains one input text and multiple output texts, so label inconsistency exists in the training phase. For example, for the same input text there are multiple possible different output texts; in the model training phase, when a learning parameter at the same position is learned, its corresponding predicted word may come from different output texts, which makes training difficult.
Based on this, this embodiment on the one hand improves the network structure of the text decoding model, e.g., by adding the text prediction layer and using a number of nodes larger than the number of words contained in the text, so that each node can correspond to one word in the output text; on the other hand, it improves the sample data and the loss function in the training phase.
For the sample data improvement, this embodiment can use single-sample data, i.e., one input text corresponds to only one output text, forming one piece of sample data. For the loss function improvement, a loss function generation strategy is given, which mainly considers the nodes in the text prediction layer added to the text decoding model. For example, the strategy can first consider the paths that can be formed between nodes, consider the generation probability of producing the output text through each formed path, and then combine the generation probabilities of multiple paths to generate the loss function.
In one embodiment, through the determined loss function and sample data in the improved set form, the learning parameters in the created text decoding model can be adjusted through backpropagation, finally obtaining a text decoding model with high accuracy.
It can be understood that training the text decoding model is equivalent to adjusting the learning parameters included in the model. The learning parameters in the text decoding model can include the node position parameters in the position information input layer; they can also include the weight parameters involved in the basic decoding sub-model; and they can also include the node-related parameters set for the multiple nodes in the text prediction layer, which can be used to determine the predicted nodes related to the generated text and to match nodes to predicted words in the dictionary.
In one embodiment, performing learning parameter training on the constructed text decoding model based on the set loss function generation strategy to obtain the trained text decoding model includes the following steps.
a0. Obtain at least one set of sample data, where a set of sample data includes an original sample text and a corresponding single target sample text.
In this embodiment, multiple sets of sample data can be obtained so that different sample data are input in each training iteration. Compared with sample data in the related art, a set of sample data in this embodiment includes one original sample text and one target sample text.
b0. Under the current iteration, encode the original sample text in a set of sample data using the text encoding model and input it to the current text decoding model.
In this embodiment, the current iteration may be the first iteration or a subsequent training iteration in the iteration loop; the training logic executed in each iteration is the same. The current text decoding model can be understood as the text decoding model to be trained under the current iteration. In this step, the original sample text can first be input into the trained text encoding model for encoding processing and then input to the current text decoding model.
c0. Based on the current text decoding model, determine the probability values corresponding to generating the target sample text from the original sample text through multiple text prediction paths.
In this embodiment, the original sample text can be processed through the network structure included in the current text decoding model and the current values of the learning parameters in that network structure. The multiple nodes in the text prediction layer of the current text decoding model can form multiple text prediction paths, and a predicted text can be generated through each text prediction path. In this step, the probability that the predicted text is the target sample text can be determined as the probability value corresponding to generating the target sample text based on that text prediction path. This step corresponds to one of the execution logics of the loss function generation strategy, and the determined probability values are used to determine the loss function value for the current iteration.
The text prediction paths are formed based on the nodes in the text prediction layer in combination with a set algorithm. For example, in the execution of this step, all the paths formed by the connections between the multiple nodes can each be used as text prediction paths. If all paths are directly selected as text prediction paths, the path computation will occupy considerable computing resources during model training; this embodiment therefore considers using a dynamic programming algorithm in the computation over all paths to avoid repeated computation of the same logic, thereby saving computing resources and reducing training time. At the same time, this embodiment can also consider using a certain algorithm to select a subset of the paths formed by the node connections as the text prediction paths.
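The dynamic-programming idea can be sketched as follows: instead of enumerating paths, a forward table f[k][i] accumulates the total probability of emitting the first k+1 target words with the (k+1)-th word produced at node i. The assumption that paths begin at the start node and end at the end node is ours, made for illustration.

```python
import numpy as np

def target_probability(trans, word_probs, target_ids, start=0, end=None):
    """Sum, over all text prediction paths, of the probability of generating
    the target word sequence, computed without enumerating the paths.

    trans: (n, n) node transition matrix; word_probs: (n, V) node-to-word
    matching probabilities; target_ids: target sample text as word indices.
    """
    n = len(trans)
    end = n - 1 if end is None else end
    f = np.zeros((len(target_ids), n))
    f[0, start] = word_probs[start, target_ids[0]]
    for k in range(1, len(target_ids)):
        # reuse f[k-1] for every node instead of re-walking each path
        f[k] = (f[k - 1] @ trans) * word_probs[:, target_ids[k]]
    return f[-1, end]
```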
d0. Determine the current loss function value based on the probability values in combination with the loss function generation formula, and adjust the learning parameters in the current text decoding model through backpropagation based on the current loss function value, obtaining the text decoding model for the next iteration.
In this embodiment, the probability values determined above can be substituted into the preset loss function generation formula to obtain the current loss function value of the current iteration. The loss function generation formula is expressed as taking the logarithm of the sum of the multiple probability values and negating the logarithm, i.e., Loss = -log(P_1 + P_2 + ... + P_K), where P_k denotes the probability value of generating the target sample text along the k-th text prediction path.
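Concretely, the sum over all text prediction paths in this formula can be evaluated with the dynamic programming mentioned in step c0. The following Python sketch (illustrative names and shapes, not code from the patent) shows one such computation, assuming the node transition matrix E and the node-to-word matching probabilities P_word introduced later in this disclosure are available:

```python
import numpy as np

def dag_nll_loss(E, P_word, target_ids):
    # E: (L, L) upper-triangular node transition matrix
    # P_word: (L, V) node-to-word matching probabilities
    # target_ids: the m word indices of the target sample text
    L, m = E.shape[0], len(target_ids)
    f = np.zeros((L, m))                      # f[j, t]: mass of paths ending at node j
    f[0, 0] = P_word[0, target_ids[0]]        # every path starts at the first node
    for t in range(1, m):
        for j in range(1, L):
            # sum over all predecessors k < j that have emitted t words so far;
            # shared path prefixes are computed once, not once per path
            f[j, t] = P_word[j, target_ids[t]] * np.dot(E[:j, j], f[:j, t - 1])
    return -np.log(f[L - 1, m - 1] + 1e-12)   # paths must end at the last node
```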
e0. Take the next iteration as the new current iteration and return to step b0, continuing until the iteration end condition is met, thereby obtaining the trained text decoding model.
In one embodiment, the iteration end condition may be that the current loss function value determined in the iteration logic falls within a set threshold range, or that the number of iterations reaches a set threshold.
Through the model training logic given in this embodiment, label inconsistency in the training samples during the model training stage can be better avoided, so that each node in the text decoding model can correspond one-to-one with a word appearing in the text to be generated.
Illustratively, Figure 1c shows the application effect of the text generation model of this embodiment in a machine translation scenario. As shown in Figure 1c, the input text can again be the Chinese sentence "我去电影院了" ("I went to the movie theater"). In the machine translation application scenario, the text generation model 13 used in this embodiment (which contains the text prediction layer) can generate the English text of this Chinese sentence. When training the text generation model 13, the English text sample used may be only "I went to the movie theater" or only "I just went to the cinema". After training, each of the multiple words of the English text sample corresponds to a processing node in the text generation model 13 (in Figure 1c, a processing node can be represented by the word it presents in the predicted text). In effect, the text generation model 13 can determine the best-matching word for each processing node and, according to the connection relationships between the processing nodes, determine from the multiple connection paths formed by the processing nodes the one combined path that best fits the contextual relationships.
When the text generation model trained in this embodiment is actually used for English machine translation of "我去电影院了", only the words corresponding to the processing nodes on the combined path need to be selected and combined, thereby forming the outputtable target text; for example, based on one of the determined combined paths, the corresponding output text can be expressed as "I went to the movie theater". Compared with the erroneous text "I went went the the theater" output in Figure 1a, the text output in this embodiment avoids consecutive word repetition and ensures contextual coherence.
Figure 2 is a schematic flow chart of a text generation method provided by an embodiment of the present disclosure. This embodiment is a refinement of the above embodiment. In this embodiment, generating the target text corresponding to the original text based on the text feature information in combination with the trained text decoding model includes: inputting the text feature information and the node position parameters in the position information input layer into the basic decoding sub-model; obtaining the set number of initial text prediction vectors output by the basic decoding sub-model, and taking the set number of initial text prediction vectors respectively as the node information of the set number of nodes in the text prediction layer; and constructing a directed acyclic graph based on the nodes, determining the topological structure between the nodes, and determining the target text of the original text in combination with the node information.
As shown in Figure 2, the text generation method provided by this embodiment includes the following steps:
S201. Input the acquired original text into the trained text encoding model to obtain text feature information.
Illustratively, the text feature information may be a feature matrix containing the feature vectors corresponding to the multiple words of the original text.
S202. Input the text feature information and the node position parameters in the position information input layer into the basic decoding sub-model.
In this embodiment, the text decoding model contains a position information input layer. The position information input layer contains the position information (node position parameters) that represents, within the text decoding model, the nodes of the graph to be constructed, as well as the graph size of the graph to be constructed (characterized mainly by the number of node position parameters contained).
In this step, both the text feature information and the node position parameters can be taken as input information and fed into the basic decoding sub-model of the text decoding model.
Figure 2a is a schematic diagram of part of the network structure of the text decoding model used in the text generation method provided by this embodiment. As shown in Figure 2a, the position information input layer of the text decoding model and the basic decoding sub-model 20 are shown, where the position information input layer contains 9 node position parameters 21 (the graph size). The multiple node position parameters 21, together with the text feature information 22 output by the text encoding model, can be input into the basic decoding sub-model 20.
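As a rough sketch of this input arrangement, the following Python code assumes the basic decoding sub-model is a standard Transformer decoder (the disclosure does not mandate a particular architecture; all sizes and names here are illustrative):

```python
import torch
import torch.nn as nn

graph_size, d_model, src_len = 9, 512, 6           # 9 node position parameters
node_pos = nn.Embedding(graph_size, d_model)       # learnable node position parameters
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True), num_layers=6)

text_features = torch.randn(1, src_len, d_model)   # stand-in for the text encoding model output
queries = node_pos(torch.arange(graph_size)).unsqueeze(0)      # (1, 9, d_model)
initial_vectors = decoder(tgt=queries, memory=text_features)   # (1, 9, d_model)
# each of the 9 output vectors becomes the node information of one node (step S203)
```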
S203. Obtain the set number of initial text prediction vectors output by the basic decoding sub-model, and take the set number of initial text prediction vectors respectively as the node information of the set number of nodes in the text prediction layer.
In this step, the processing information output by the basic decoding sub-model can be obtained; the processing information can include the set number of initial text prediction vectors, where the set number equals the number of node position parameters in the position information input layer. This step can further associate each initial text prediction vector obtained above with a node of the text prediction layer, as the node information of that node.
Continuing with Figure 2a, it can be seen that Figure 2a also shows the node set of the text prediction layer, which likewise contains 9 nodes 23; the multiple initial text prediction vectors output by the basic decoding sub-model 20 can correspond one-to-one with the nodes 23, serving as the node information of the multiple nodes 23.
S204. Construct a directed acyclic graph based on the nodes, determine the topological structure between the nodes, and determine the target text of the original text in combination with the node information.
The above steps amount to assigning node information to the nodes of the text prediction layer, so that the nodes of the text prediction layer become associated with the actual original text.
This step takes the text prediction layer as the executing subject: it mainly performs the subsequent text generation processing based on the node information of the multiple nodes, thereby generating the target text of the original text.
The execution logic of this step can be analyzed as follows. After the multiple nodes are assigned node information, they are still individual nodes, with no associations yet existing between them. Considering that contextual associations exist between the multiple words of the text to be generated, and that those words are linked to the nodes of the text prediction layer, this step needs to establish associations between the multiple nodes; associations between multiple nodes can be realized by constructing a graph, and since the text to be generated is directed and acyclic, this step can construct a directed acyclic graph based on the multiple nodes.
Following this analysis, contextual associations must exist between the words of the text to be generated. Once it is determined that a node can represent a word, the contextual associations to be determined between words can be converted into associations between nodes, and the associations between nodes can be reflected by the weights of the edges formed when nodes are connected in the directed acyclic graph. This embodiment considers using the transition probability from one node to another to represent the weight of the edge formed by the two nodes. After the transition probabilities between nodes are determined, the higher the transition probability between two nodes, the stronger the association between them can be considered to be.
Based on the above analysis, the execution logic of generating the target text corresponding to the original text from the node information in this step can be described as: 1) establish directed connections between the multiple nodes to form a directed acyclic graph, and determine, for each pair of connected nodes, the transition probability from the source node to the target node, where the source node is the outgoing node of the directed connection between the two nodes and the target node is the incoming node; 2) determine the predicted word corresponding to each node; 3) select the target words according to the transition probabilities between nodes and the predicted words corresponding to the nodes, obtain the combination order of the target words, and finally combine the target words to form the target text.
For example, this embodiment provides one implementation of constructing a directed acyclic graph based on the multiple nodes, determining the topological structure between the nodes, and determining the target text of the original text in combination with the node information; the implementation includes the following steps a1 to c1.
a1. Construct a directed acyclic graph according to the node labels of the nodes in the text prediction layer, and obtain the topological structure between the nodes.
Illustratively, the construction of the directed acyclic graph is used to determine the connection relationships between nodes. To ensure that the constructed graph is directed, this embodiment makes directed connections based on the node labels of the nodes. For example, assuming there are 9 nodes ordered by node label from small to large, node v1 establishes directed connections to each of nodes v2 to v9, node v2 can only establish directed connections to v3 to v9, and so on; the last node, v9, makes no further directed connections. Once the directed acyclic graph is determined, the topological structure between the nodes is effectively also determined.
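In code form, this label-ordered topology is simply a strictly upper triangular adjacency matrix; a minimal sketch for the 9-node example (illustrative, not code from the patent):

```python
import numpy as np

n = 9                                                  # nodes v1 .. v9
adjacency = np.triu(np.ones((n, n), dtype=bool), k=1)  # directed edges to larger labels only
# v1 reaches v2..v9, v2 reaches v3..v9, and v9 has no outgoing connections
assert adjacency[0].sum() == 8 and adjacency[8].sum() == 0
```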
b1. Determine the node transition matrix corresponding to the text prediction layer according to the topological structure between the nodes and the node information of the nodes.
In this embodiment, the topological structure between nodes contains the connection relationships between each node and the other nodes; from these connection relationships it can be known which nodes each node is connected to, and the existing connections are directed. In this embodiment, the numbers of rows and columns of the node transition matrix both equal the number of nodes contained in the text prediction layer, and, given the directedness of the node connections, the node transition matrix can be an upper triangular matrix. A valid element value in the node transition matrix indicates that a directed connection exists between the node corresponding to its row and the node corresponding to its column, and it is essentially the transition probability between the two nodes determined through the corresponding calculation logic.
In this embodiment, one implementation logic for determining the transition probability between nodes can be described as follows: for two connected nodes, obtain the node information of the two nodes, where the node information can be represented by feature vectors; then multiply the feature vectors representing the node information of the two nodes, and the resulting product vector, after normalization, can serve as the transition probability between the two nodes.
Another implementation logic for determining the transition probability between nodes can be described as follows: first obtain the node-related parameters set for the nodes in the text prediction layer, such as a first learning parameter and a second learning parameter, which are mainly used for determining the transition probabilities; the node-related parameters of the nodes exist in the text decoding model and take fixed parameter values once the training of the text decoding model is completed. Then, for two connected nodes, the transition probability can be determined from the product vectors obtained by multiplying the node information with the node-related parameters.
For this implementation of determining the transition probability between two nodes from the node information combined with the node-related parameters, the following exemplary description is given. Taking node vi and node vj as an example, with node vi connected to node vj, the calculation of the transition probability from node vi to node vj can be described as: determine the product of the initial text prediction vector (node information) of node vi and the first learning parameter (denoted the first product); determine the product of the initial text prediction vector (node information) of node vj and the second learning parameter (denoted the second product); and normalize the product of the first product and the second product, where the normalized result can be regarded as the transition probability from node vi to node vj.
Based on the above description, it can be seen that after the transition probabilities between connected nodes are determined, the node transition matrix of the text prediction layer can be formed from these transition probabilities.
For example, in this embodiment, determining the node transition matrix corresponding to the text prediction layer according to the topological structure between the nodes and the node information of the nodes may include:
b11. For each node, determine from the topological structure between nodes the adjacent nodes to which the node has directed connections.
Through the directed acyclic graph constructed above, once the topological structure between nodes has been obtained, the other nodes having directed connections with a given node can easily be determined; these nodes can be regarded as the adjacent nodes of that node.
b12. Determine the transition probability from the node to each adjacent node according to the node information of the node and of the adjacent nodes.
Illustratively, in one implementation, the calculation of the transition probability p(vi->vj) from node vi to node vj can be described as: p(vi->vj) = softmax(Vi·Vj^T / √d), where softmax denotes normalization, d denotes the scale of the text prediction layer (d is determined in the construction stage), and Vi and Vj denote the node information vectors of node vi and node vj, respectively.
In another exemplary implementation, the implementation logic can be summarized as: for each node, determine the transition probability from the node to each adjacent node according to the node information of the node and of the corresponding adjacent node, the first learning parameter, and the second learning parameter, in combination with a probability transition formula, where the first learning parameter and the second learning parameter are both node-related parameters corresponding to the nodes. With reference to the above description, the probability transition formula can be expressed as:
p(vi->vj) = softmax((Vi·W1)(Vj·W2)^T / √d)
where, as above, softmax denotes normalization, d denotes the scale of the text prediction layer (d is determined in the construction stage), and Vi and Vj denote the node information vectors of node vi and node vj, respectively; in addition, W1 denotes the node-related first learning parameter, W2 denotes the node-related second learning parameter, and p(vi->vj) denotes the transition probability from node vi to node vj.
b13. Form the node transition matrix corresponding to the text prediction layer based on the transition probabilities.
It can be seen that through the above steps b12 and b13, the transition probabilities between each node and its adjacent nodes can be calculated, and the node transition matrix can be formed based on those transition probabilities.
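Assuming the probability transition formula above is used, forming the node transition matrix of steps b12 and b13 can be sketched as follows (V is the matrix of node information vectors; W1 and W2 are the first and second learning parameters; the names are illustrative):

```python
import numpy as np

def node_transition_matrix(V, W1, W2):
    # V: (n, d) node information vectors; W1, W2: (d, d) learning parameters
    n, d = V.shape
    scores = (V @ W1) @ (V @ W2).T / np.sqrt(d)   # (Vi W1)(Vj W2)^T / sqrt(d)
    E = np.zeros((n, n))
    for i in range(n - 1):                        # the last node has no successors
        row = np.exp(scores[i, i + 1:] - scores[i, i + 1:].max())
        E[i, i + 1:] = row / row.sum()            # softmax over adjacent nodes only
    return E                # upper triangular; non-final rows sum to 1 (cf. Figure 2b)
```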
Illustratively, Figure 2b shows one example of calculating the node transition matrix in the text generation method provided by this embodiment. As shown in Figure 2b, transition probabilities are calculated for the nodes included in the text prediction layer, and E in Figure 2b represents the calculated node transition matrix. It should be noted that Figure 2b shows some of the node connections and the transition probabilities corresponding to those connections, e.g., a transition probability of 0.3 from v1 to v2, a transition probability of 0.7 from v1 to v3, and so on. It can be seen that in the node transition matrix E, the transition probabilities of each row sum to 1.
c1. Determine the target text of the original text according to the node information of the nodes and the node transition matrix.
In this embodiment, once the node transition matrix has been determined, the weights of the edges formed by the connections in the directed acyclic graph are effectively determined, and this embodiment can select a prediction path through a prediction path selection strategy. Illustratively, one implementation of selecting the prediction path can be described as follows: following the direction of the node connections, with the outgoing node fixed, select the incoming node having the highest transition probability from the outgoing node, and take the edge between the two nodes as one edge of the prediction path; then repeat the above logic with a newly selected outgoing node, finally selecting all the edges of the prediction path and thereby also determining the target nodes that constitute the prediction path.
As shown in Figure 2b, through the above logic it can be determined that the prediction path is v1->v3->v4->v5->v6->v9, and the target nodes included are A = {v1, v3, v4, v5, v6, v9}.
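A sketch of this selection strategy, reusing the transition matrix E from the sketches above (illustrative code; applied to the matrix of Figure 2b it would return the node indices of the path v1->v3->v4->v5->v6->v9):

```python
def greedy_prediction_path(E):
    # from the first node, repeatedly follow the outgoing edge with the
    # highest transition probability until the last node is reached
    n = E.shape[0]
    path = [0]
    while path[-1] != n - 1:
        current = path[-1]
        path.append(current + 1 + int(E[current, current + 1:].argmax()))
    return path
```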
At the same time, based on the node information and the fully connected layer present in the text prediction layer, this step can determine the probability information between each node and the words included in the dictionary, where the dictionary can be pre-created vocabulary information that contains the various words needed for text generation, each word representable in vector form. In the fully connected structure of the text prediction layer, the preceding layer of the full connection can consist of the nodes of the directed acyclic graph of this embodiment, and the following layer can consist of the words of the dictionary. The fully connected processing can consist of calculating the matching probability from each node of the directed acyclic graph to each word node of the dictionary; the calculation can be realized through the full connection based on the node information of the node and the word vector of the word node.
After the prediction path and the node-to-word matching probabilities have been determined as above, the target word corresponding to each node of the prediction path can be determined, and the target text is finally formed by combining the multiple target words. It should be noted that this embodiment does not fix the execution order of determining the prediction path and determining the matching probabilities; the prediction path may equally be determined after the matching probabilities, as long as the generation of the target text can be completed.
On the basis of the above embodiment, this embodiment can further elaborate the above step c1, "determining the target text of the original text according to the node information of the nodes and the node transition matrix".
Illustratively, after the node transition matrix corresponding to the text prediction layer and the node information of the nodes have been obtained, this can be realized through the execution logic of steps c11 to c13 provided in this embodiment.
It should be noted that, in addition to the nodes needed for constructing the directed acyclic graph, the text prediction layer also includes a fully connected structure. The fully connected structure can take the node information of the nodes of the directed acyclic graph as input information; the next layer of the fully connected structure can be regarded as the word nodes formed by the words of the dictionary, and in the fully connected structure the nodes of the directed acyclic graph can be connected to the word nodes of the dictionary through connecting lines. The connection weight of each connecting line in the fully connected structure can be a third learning parameter determined, after the text decoding model is trained, for the connection between the respective node and word.
Figure 2c shows an example of the fully connected structure within the text prediction layer involved in the text generation method provided by this embodiment. As shown in Figure 2c, above the nodes presented in the directed acyclic graph there is a fully connected structure 24 that determines the predicted words associated with the nodes. It should be noted that Figure 2c also includes a result output layer, on which only the predicted words matching the nodes of the directed acyclic graph are displayed; for example, the word matching node v1 is "I", the word matching node v2 is "just", the word matching node v3 is "went", and so on.
c11. Determine, according to the node information of the nodes and through the fully connected layer in the text prediction layer, the matching probabilities from the nodes to the words of a preset vocabulary.
For the concrete implementation of this step, the execution logic can be described as each node being connected to the multiple words of the dictionary, where in this step both the nodes and the words can represent their respective information through vectors. Accordingly, for the matching probability between a node and a word: if the connection weights of the fully connected structure are determined anew in the training stage of the text decoding model, the third learning parameter obtained from training can first be acquired, and the vector product of the corresponding third learning parameter with the corresponding node information and word information then determined; if the training stage of the text decoding model does not re-determine the connection weights but instead directly shares the word features used by the text encoding model, the vector product of the corresponding node information and word information can be determined directly. Afterwards, the vector products of the node with respect to all the words can be determined and, after normalization, serve as the matching probabilities from the node to the words.
The fully connected layer is built within the text prediction layer and contains the fully connected structure used for the matching probability processing; in the fully connected structure, fully connected processing can be performed with respect to each node.
c12. Determine the prediction nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities.
In this embodiment, the prediction nodes can be regarded as the key nodes, selected from the nodes of the text prediction layer, on which the generation of the target text depends. Based on the matching probabilities corresponding to a prediction node, the predicted word matching that prediction node can be determined, and the predicted word can be regarded as a target word contained in the target text.
In this embodiment, the prediction nodes may be obtained by determining a prediction path in the text prediction layer based on the node transition matrix, with the target word of each prediction node then determined through the node-to-word matching probabilities; alternatively, the prediction nodes and target words may be determined jointly based on the node transition matrix and the node-to-word matching probabilities, with a prediction path determined from the prediction nodes and used to combine the target words into the target text; alternatively, the predicted word corresponding to each node may first be determined based on the node-to-word matching probabilities, a prediction path then determined in the directed acyclic graph through a search algorithm, and the target words needed for text generation finally selected.
c13. Combine the target words to form the target text of the original text.
In this step, the target words determined above are combined according to the connection directions between the corresponding nodes in the text prediction layer, where the multiple target words determine exactly one combination order, and the final target text can ultimately be obtained according to that combination order. This target text is the result of the text generation processing performed on the original text.
This embodiment provides a refinement of the above step c12. Illustratively, for determining the prediction nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities, this embodiment provides an implementation that can be described as follows:
Determine at least one prediction node according to the maximum transition probability corresponding to each node in the node transition matrix.
There is an order to determining the maximum transition probabilities of the nodes. Starting from the node corresponding to the initial node label, that node can be taken as the first prediction node. Among the transition probabilities corresponding to the connections between this prediction node and its adjacent nodes, the maximum transition probability of the prediction node can be determined, and the adjacent node corresponding to that maximum transition probability can be regarded as a new prediction node; the maximum transition probability determination can then be performed again on the new prediction node, thereby determining another new prediction node. Through this logic, prediction nodes can be determined in a loop until the last node is reached, and the last node can also serve as the final prediction node. This step thus obtains at least one prediction node (the case of exactly one being when the initial node is also the final node).
For each prediction node, determine the maximum matching probability among the matching probabilities from the prediction node to the words, and determine the word corresponding to that maximum matching probability as a target word.
For each prediction node determined above, once the matching probabilities from the prediction node to the words are known, the maximum matching probability can likewise be determined from among them, and the predicted word corresponding to that maximum matching probability obtained; this predicted word is the target word corresponding to the prediction node. It can be seen that, through the determination order of the prediction nodes, a combination path for combining the target words can be determined, and that combination path can be used for the final target text generation.
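Putting the two sub-steps together, this first implementation can be sketched as follows (building on greedy_prediction_path and the matrices from the earlier sketches; id_to_word is a hypothetical index-to-word mapping for the preset vocabulary):

```python
def decode_greedy(E, P_word, id_to_word):
    # 1) pick prediction nodes by maximum transition probability,
    # 2) pick for each prediction node the word with maximum matching probability
    path = greedy_prediction_path(E)
    return " ".join(id_to_word[int(P_word[i].argmax())] for i in path)
```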
Illustratively, for the refinement of the above step c12, this embodiment also provides another implementation. It should be noted that, in contrast with the implementation logic above, the implementation logic of this approach lies in considering simultaneously the influence on the prediction nodes of both the transition probabilities in the node transition matrix and the node-to-word matching probabilities: the transition probabilities can be multiplied by the matching probabilities, and the prediction nodes determined based on the products.
The steps of this implementation can be described as:
1) Take the node corresponding to the initial node label as the current node, where the current node can be recorded as the first prediction node.
2) Obtain from the node transition matrix the current transition probabilities from the current node to its adjacent nodes.
3) Determine the product values of the current transition probabilities with the matching probabilities corresponding to the current node and the words.
4) Select the maximum product value from among the product values, take the adjacent node and the word associated with the maximum product value as a prediction node and a target word respectively, and add the prediction node and the target word, in association, to a cache table.
Here, the matching probability and the transition probability corresponding to the maximum product value are known; taking the current node as reference, the word to which that matching probability corresponds can be identified and recorded as a target word, and the adjacent node to which that transition probability corresponds can be recorded as another prediction node.
5) Take the prediction node as the new current node and re-execute the selection operation for the current adjacent nodes corresponding to the current node, until the loop end condition is reached.
It can be seen that this execution logic likewise performs loop processing in the directed connection order of the nodes, whereby the prediction nodes and target words that satisfy the conditions can be determined.
Likewise, the above process of determining the prediction nodes effectively also determines the combination order adopted when combining the target words.
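A sketch of this second implementation, under one plausible reading of the product rule (each adjacent node is scored by its transition probability multiplied by its best word-matching probability; the cache list stands in for the cache table):

```python
def decode_joint_greedy(E, P_word, id_to_word):
    n = E.shape[0]
    current = 0
    cache = [(0, int(P_word[0].argmax()))]                # first prediction node and its word
    while current != n - 1:
        best_words = P_word[current + 1:].max(axis=1)     # best matching probability per adjacent node
        products = E[current, current + 1:] * best_words  # transition probability * matching probability
        current = current + 1 + int(products.argmax())    # adjacent node with the maximum product
        cache.append((current, int(P_word[current].argmax())))
    return " ".join(id_to_word[w] for _, w in cache)
```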
Illustratively, for the refinement of the above step c12, this embodiment further provides yet another implementation. In contrast with the two implementations above, this implementation mainly considers the situation in which different nodes may correspond to the same word; on the basis of that situation, this embodiment effectively proposes a corresponding way of determining the target words.
The steps of this implementation can be described as:
1) Based on the node-to-word matching probabilities, determine the corresponding maximum matching probability, and determine the word corresponding to the maximum matching probability as the predicted word of the corresponding node.
This step first determines the corresponding predicted word for each node of the text prediction layer, where the determination of the predicted words is likewise realized through the maximum-matching-probability logic.
2) Determine the prediction path with the highest weight according to a preset path search algorithm, in combination with the node transition matrix and the predicted words of the nodes.
The purpose of this step is mainly to determine candidate text generation paths over the nodes of the text prediction layer based on the node label order, and to determine, based on the node transition matrix, the transition probabilities of the edges between pairs of nodes on the candidate text generation paths; then, through the path search algorithm combined with the predicted words, candidate prediction paths in which different nodes represent the same predicted word are determined from the candidate text generation paths, and the prediction path with the highest weight is obtained from among the candidate prediction paths.
3) Determine the predicted words corresponding to the prediction nodes of that prediction path as the corresponding target words.
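The disclosure leaves the concrete path search algorithm open; one possible sketch is a Viterbi-style search that first assigns every node its best-matching predicted word and then finds the highest-weight path from the first to the last node, dropping edges whose endpoints would repeat the same predicted word (a simplification of merging such nodes):

```python
def decode_path_search(E, P_word, id_to_word):
    n = E.shape[0]
    words = P_word.argmax(axis=1)                 # step 1): predicted word per node
    best = [(-1.0, -1)] * n                       # (best path weight, predecessor) per node
    best[0] = (float(P_word[0, words[0]]), -1)
    for j in range(1, n):
        for i in range(j):                        # only edges toward larger labels exist
            if best[i][0] < 0 or words[i] == words[j]:
                continue                          # unreachable, or would repeat the same word
            w = best[i][0] * E[i, j] * P_word[j, words[j]]
            if w > best[j][0]:
                best[j] = (w, i)
    path, i = [], n - 1                           # backtrack from the last node
    while i != -1:
        path.append(i)
        i = best[i][1]
    return " ".join(id_to_word[int(words[i])] for i in reversed(path))
```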
Of the three implementations given above for determining the prediction nodes and target words, the first executes fastest, but the generation quality of the resulting text is relatively low; the second is intermediate in both execution speed and text generation quality; and the third executes relatively slowly, but the generation quality of the resulting text is relatively high. This embodiment can adopt the above approaches but is not limited to them; in an application scenario, a suitable implementation of the prediction nodes and target words can be chosen according to the actual situation to generate the target text.
The text generation method provided by this embodiment refines the implementation process by which the text decoding model generates the target text. Through the added text prediction layer, graph nodes in the form of a directed acyclic graph are used to effectively determine the target words and the prediction nodes, which ensures contextual relevance and avoids the consecutive occurrence of repeated words in the generated text. Compared with the related art, this improves the generation quality of the generated text and ensures text accuracy.
Figure 3 is a schematic structural diagram of a text generation apparatus provided by an embodiment of the present disclosure. This embodiment is applicable to text generation scenarios. The apparatus can be implemented by software and/or hardware and can be configured in a terminal and/or a server to implement the text generation method of the embodiments of the present disclosure. The apparatus may include: an encoding execution module 31 and a decoding execution module 32.
The encoding execution module 31 is configured to input the acquired original text into the trained text encoding model to obtain text feature information.
The decoding execution module 32 is configured to generate the target text corresponding to the original text based on the text feature information in combination with the trained text decoding model.
The text decoding model includes a text prediction layer; the node information of the set number of nodes included in the text prediction layer is determined through the text feature information, and the target words contained in the target text, as well as the combination order of the target words, are determined through the node information of the nodes and the topological structure between the nodes.
The text generation apparatus provided by this embodiment realizes the parallel determination of the node information of the nodes in the added text prediction layer and the parallel determination of the target words in the generated text, reducing the text generation delay. At the same time, through the node information of the nodes in the added text prediction layer, a one-to-one correspondence between the words of the generated text and their matched nodes can be realized, better avoiding the occurrence of consecutive repeated words in the generated text. In addition, the topological structure between the nodes can constrain the combination order of the words in the generated text, thereby ensuring the contextual relevance of the generated text, which improves the generation quality of the generated text and ensures text accuracy.
In one embodiment, the text decoding model includes: a position information input layer, a basic decoding sub-model, and a text prediction layer.
The position information input layer includes a set number of node position parameters, where the set number is used to decide the number of nodes contained in the text prediction layer.
The node information of the set number of nodes included in the text prediction layer is determined through the node position parameters and the text feature information, in combination with the basic decoding sub-model.
In one embodiment, the decoding execution module 32 includes:
an information input unit, configured to input the text feature information and the node position parameters in the position information input layer into the basic decoding sub-model;
an initial vector output unit, configured to obtain the set number of initial text prediction vectors output by the basic decoding sub-model and take the set number of initial text prediction vectors respectively as the node information of the set number of nodes in the text prediction layer; and
a text generation unit, configured to construct a directed acyclic graph based on the nodes, determine the topological structure between the nodes, and determine the target text of the original text in combination with the node information.
In one embodiment, the text generation unit includes:
a first execution unit, configured to construct a directed acyclic graph according to the node labels of the nodes in the text prediction layer and obtain the topological structure between the nodes;
a second execution unit, configured to determine the node transition matrix corresponding to the text prediction layer according to the topological structure between the nodes and the node information of the nodes; and
a third execution unit, configured to determine the target text of the original text according to the node information of the nodes and the node transition matrix.
In one embodiment, the second execution unit is configured to:
for each node, determine from the topological structure between nodes the adjacent nodes to which the node has directed connections;
determine the transition probability from the node to each adjacent node according to the node information of the node and of the adjacent nodes; and
form the node transition matrix corresponding to the text prediction layer based on the transition probabilities.
In one embodiment, the third execution unit is configured to:
determine, according to the node information of the nodes and through the fully connected layer in the text prediction layer, the matching probabilities from the nodes to the words of the preset vocabulary;
determine the prediction nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities; and
combine, based on the target words, to form the target text of the original text.
In one embodiment, the step in which the third execution unit determines the prediction nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities may be:
determine at least one prediction node according to the maximum transition probability corresponding to each node in the node transition matrix; and
for each prediction node, determine the maximum matching probability among the matching probabilities from the prediction node to the words, and determine the word corresponding to that maximum matching probability as a target word.
In one embodiment, the step in which the third execution unit determines the prediction nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities may also be:
take the node corresponding to the initial node label as the current node;
obtain from the node transition matrix the current transition probabilities from the current node to its adjacent nodes;
determine the product values of the current transition probabilities with the matching probabilities corresponding to the current node and the words;
select the maximum product value from among the product values, take the adjacent node and word associated with the maximum product value as a prediction node and a target word respectively, and add the prediction node and target word, in association, to a cache table; and
take the prediction node as the new current node and re-execute the selection operation for the current adjacent nodes corresponding to the current node, until the loop end condition is reached.
In one embodiment, the step in which the third execution unit determines the prediction nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities may also be:
based on the node-to-word matching probabilities, determine the corresponding maximum matching probability, and determine the word corresponding to the maximum matching probability as the predicted word of the corresponding node;
determine the prediction path with the highest weight according to a preset path search algorithm, in combination with the node transition matrix and the predicted words of the nodes; and
determine the predicted words corresponding to the prediction nodes of the prediction path as the corresponding target words.
In one embodiment, the apparatus may further include: a model training module, configured to perform learning parameter training on the constructed text decoding model based on a set loss function generation strategy to obtain the trained text decoding model;
wherein the learning parameters include: the node position parameters involved in the position information input layer included in the text decoding model, the basic model parameters involved in the included basic decoding sub-model, and the node-related parameters of the nodes in the included text prediction layer.
In one embodiment, the model training module can be configured to:
obtain at least one set of sample data, a set of sample data including one original sample text and the corresponding single target sample text;
in the current iteration, encode the original sample text of a set of sample data with the text encoding model and input the result into the current text decoding model;
based on the current text decoding model, determine the probability values corresponding to generating the target sample text from the original sample text through the text prediction paths, where the text prediction paths are formed from the nodes in the text prediction layer in combination with a set algorithm;
determine the current loss function value based on the probability values in combination with the loss function generation formula, and adjust the learning parameters in the current text decoding model through backpropagation based on the current loss function value, obtaining the text decoding model for the next iteration; and
take the next iteration as the new current iteration and continue the learning parameter training until the iteration end condition is met, obtaining the trained text decoding model.
For example, the loss function generation formula is expressed as: take the logarithm of the sum of the probability values and negate the result of the logarithm.
The above apparatus can execute the method provided by any embodiment of the present disclosure and has the functional modules and beneficial effects corresponding to executing that method.
It is worth noting that the multiple units and modules included in the above apparatus are divided only according to functional logic but are not limited to this division, as long as the corresponding functions can be realized; in addition, the specific names of the multiple functional units are only for the convenience of distinguishing them from one another and are not used to limit the protection scope of the embodiments of the present disclosure.
Figure 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure. Referring now to Figure 4, a schematic structural diagram of an electronic device 40 (for example, the terminal device or server in Figure 4) suitable for implementing embodiments of the present disclosure is shown. Terminal devices in embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (e.g., vehicle-mounted navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in Figure 4 is merely an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in Figure 4, the electronic device 40 may include a processing apparatus (e.g., a central processing unit, a graphics processor, etc.) 41, which can execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 42 or a program loaded from a storage apparatus 48 into a random access memory (RAM) 43. The RAM 43 also stores various programs and data required for the operation of the electronic device 40. The processing apparatus 41, the ROM 42, and the RAM 43 are connected to one another via a bus 45. An input/output (I/O) interface 44 is also connected to the bus 45.
Generally, the following apparatuses can be connected to the I/O interface 44: input apparatuses 46 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, and the like; output apparatuses 47 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, and the like; storage apparatuses 48 including, for example, a magnetic tape, a hard disk, and the like; and a communication apparatus 49. The communication apparatus 49 can allow the electronic device 40 to communicate wirelessly or by wire with other devices to exchange data. Although Figure 4 shows the electronic device 40 with various apparatuses, it should be understood that it is not required to implement or possess all of the apparatuses shown; more or fewer apparatuses may alternatively be implemented or provided.
According to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product, which includes a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such embodiments, the computer program may be downloaded and installed from a network via the communication device 49, or installed from the storage device 48, or installed from the ROM 42. When the computer program is executed by the processing device 41, the above-described functions defined in the method of the embodiments of the present disclosure are performed.
The names of the messages or information exchanged between multiple devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of these messages or information.
The electronic device provided by the embodiments of the present disclosure and the text generation method provided by the above embodiments belong to the same inventive concept. Technical details not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same beneficial effects as the above embodiments.
Embodiments of the present disclosure provide a computer storage medium on which a computer program is stored. When the program is executed by a processor, the text generation method provided by the above embodiments is implemented.
It should be noted that the computer-readable medium described above in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and the computer-readable signal medium may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any appropriate medium, including but not limited to: a wire, an optical cable, RF (radio frequency), etc., or any suitable combination of the above.
In some embodiments, the client and the server may communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication (for example, a communication network) in any form or medium. Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internet (for example, the Internet), and a peer-to-peer network (for example, an ad hoc peer-to-peer network), as well as any currently known or future-developed network.
The above computer-readable medium may be included in the above electronic device, or may exist independently without being assembled into the electronic device.
The above computer-readable medium carries one or more programs. When the one or more programs are executed by the electronic device, the electronic device is caused to: input the acquired original text into a trained text encoding model to obtain text feature information; and generate, based on the text feature information in combination with a trained text decoding model, target text corresponding to the original text.
Computer program code for executing the operations of the present disclosure may be written in one or more programming languages or a combination thereof. The above programming languages include, but are not limited to, object-oriented programming languages such as Java, Smalltalk, and C++, and also include conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, connected via the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the architecture, functions, and operations of possible implementations of the systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing a specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or may be implemented by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented in software or in hardware. The name of a unit does not constitute a limitation on the unit itself in some cases. For example, the first acquisition unit may also be described as "a unit that acquires at least two Internet Protocol addresses".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in combination with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the above. More specific examples of the machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
According to one or more embodiments of the present disclosure, [Example 1] provides a text generation method, the method including: inputting the acquired original text into a trained text encoding model to obtain text feature information; and generating, based on the text feature information in combination with a trained text decoding model, target text corresponding to the original text; wherein the text decoding model includes a text prediction layer, the node information of a set number of nodes included in the text prediction layer is determined by the text feature information, and the target words contained in the target text and the combination order of the target words are determined by the node information of the nodes and the topological structure between the nodes.
According to one or more embodiments of the present disclosure, [Example 2] provides a text generation method, wherein the text decoding model includes: a position information input layer, a basic decoding sub-model, and a text prediction layer; the position information input layer includes a set number of node position parameters, and the set number is used to determine the number of nodes included in the text prediction layer; and the node information of the set number of nodes included in the text prediction layer is determined by the node position parameters and the text feature information in combination with the basic decoding sub-model.
According to one or more embodiments of the present disclosure, [Example 3] provides a text generation method, wherein generating the target text corresponding to the original text based on the text feature information in combination with the trained text decoding model includes: inputting the text feature information and the node position parameters in the position information input layer into the basic decoding sub-model; obtaining the set number of initial text prediction vectors output by the basic decoding sub-model, and using the set number of initial text prediction vectors respectively as the node information of the set number of nodes in the text prediction layer; and constructing a directed acyclic graph based on the nodes, determining the topological structure between the nodes, and determining the target text of the original text in combination with the node information.
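The following minimal sketch illustrates this decoding pipeline with PyTorch. All module choices, sizes, and names (the encoder/decoder layers, num_nodes, hidden) are illustrative assumptions rather than identifiers fixed by the disclosure.

```python
# Minimal sketch of the Example 3 decoding pipeline (illustrative only).
import torch
import torch.nn as nn

hidden, num_nodes = 512, 32   # num_nodes is the "set number" of nodes

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True), num_layers=2)
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model=hidden, nhead=8, batch_first=True), num_layers=2)
node_pos = nn.Parameter(torch.randn(1, num_nodes, hidden))  # node position parameters

src = torch.randn(1, 10, hidden)      # stand-in for the embedded original text
text_features = encoder(src)          # text feature information

# The node position parameters act as decoder queries over the text features;
# the outputs are the initial text prediction vectors, used as node information.
node_info = decoder(node_pos, text_features)

# Directed acyclic graph over the nodes: an edge i -> j exists only for i < j,
# so every path moves forward through the node sequence (the topology).
adjacency = torch.triu(torch.ones(num_nodes, num_nodes, dtype=torch.bool), diagonal=1)
```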
According to one or more embodiments of the present disclosure, [Example 4] provides a text generation method, wherein constructing the directed acyclic graph based on the nodes, determining the topological structure between the nodes, and determining the target text of the original text in combination with the node information includes: constructing a directed acyclic graph according to the node labels of the nodes in the text prediction layer to obtain the topological structure between the nodes; determining a node transition matrix corresponding to the text prediction layer according to the topological structure between the nodes and the node information of the nodes; and determining the target text of the original text according to the node information of the nodes and the node transition matrix.
According to one or more embodiments of the present disclosure, [Example 5] provides a text generation method, wherein determining the node transition matrix corresponding to the text prediction layer according to the topological structure between the nodes and the node information of the nodes includes: for each node, determining the adjacent nodes to which the node is directionally connected from the topological structure between the nodes; determining the transition probability from the node to the adjacent nodes according to the node information of the node and the adjacent nodes; and forming the node transition matrix corresponding to the text prediction layer based on the transition probabilities.
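One plausible realization of these transition probabilities, sketched below, scores node pairs with a scaled dot product of their node information and normalizes over each node's outgoing DAG edges. The disclosure does not fix this exact parameterization; it is an assumption for illustration.

```python
# Hypothetical sketch: node transition matrix from node information.
import torch

def node_transition_matrix(node_info: torch.Tensor) -> torch.Tensor:
    """node_info: (num_nodes, hidden). Returns a (num_nodes, num_nodes) matrix
    whose entry (i, j) is the transition probability from node i to node j."""
    num_nodes, hidden = node_info.shape
    scores = node_info @ node_info.T / hidden ** 0.5
    # Keep only edges i -> j with i < j, matching the directed acyclic graph.
    mask = torch.triu(torch.ones(num_nodes, num_nodes, dtype=torch.bool), diagonal=1)
    probs = torch.softmax(scores.masked_fill(~mask, float("-inf")), dim=-1)
    return torch.nan_to_num(probs)  # final node has no successors -> zero row

transition = node_transition_matrix(torch.randn(8, 16))
```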
According to one or more embodiments of the present disclosure, [Example 6] provides a text generation method, wherein determining the target text of the original text according to the node information of the nodes and the node transition matrix includes: determining, according to the node information of the nodes, the matching probability of each node to the words in a preset vocabulary through a fully connected layer in the text prediction layer; determining the predicted nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities; and combining the target words to form the target text of the original text.
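A short sketch of the node-to-word matching step, under assumed sizes and names:

```python
# Sketch: a fully connected layer projects each node's information onto a
# preset vocabulary; a softmax yields per-node matching probabilities.
import torch
import torch.nn as nn

hidden, vocab_size = 16, 100
to_vocab = nn.Linear(hidden, vocab_size)   # the fully connected layer

node_info = torch.randn(8, hidden)         # node information of 8 nodes
match_probs = torch.softmax(to_vocab(node_info), dim=-1)
# match_probs[i, w]: probability that node i matches word w in the vocabulary.
```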
According to one or more embodiments of the present disclosure, [Example 7] provides a text generation method, wherein determining the predicted nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities includes: determining at least one predicted node according to the maximum transition probability corresponding to each node in the node transition matrix; and, for each predicted node, determining the maximum matching probability from the matching probabilities of the predicted node to the words, and determining the word corresponding to the maximum matching probability as the target word.
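A greedy sketch of this procedure, reusing the transition matrix and matching probabilities from the sketches above:

```python
# Greedy sketch: follow the maximum transition probability to the next
# predicted node, then emit that node's best-matching word.
import torch

def greedy_decode(transition: torch.Tensor, match_probs: torch.Tensor) -> list[int]:
    num_nodes = transition.shape[0]
    node, target_words = 0, []
    while node < num_nodes - 1:
        node = int(transition[node].argmax())                 # predicted node
        target_words.append(int(match_probs[node].argmax()))  # its target word
    return target_words
```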
According to one or more embodiments of the present disclosure, [Example 8] provides a text generation method, wherein determining the predicted nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities includes: taking the node corresponding to the starting node label as the current node; obtaining the current transition probabilities from the current node to its adjacent nodes from the node transition matrix; determining the product values of the current transition probabilities and the corresponding node-to-word matching probabilities; selecting the maximum product value from the product values, taking the adjacent node and the word associated with the maximum product value as the predicted node and the target word respectively, and adding the predicted node and the target word in association to a cache table; and taking the predicted node as the new current node and re-executing the selection operation for the current adjacent nodes corresponding to the current node until a loop end condition is reached.
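The claim language here is terse, so the sketch below follows one common reading (lookahead-style DAG decoding): jointly pick the adjacent node j and word w that maximize the product of the transition probability to j and j's word matching probability, caching the (node, word) pairs along the way. Treat this as an interpretation, not the only one the wording permits.

```python
# Hypothetical sketch of the cached stepping procedure.
import torch

def lookahead_decode(transition: torch.Tensor,
                     match_probs: torch.Tensor) -> list[tuple[int, int]]:
    num_nodes = transition.shape[0]
    cache: list[tuple[int, int]] = []     # the "cache table" of (node, word)
    current = 0                           # node with the starting node label
    while current < num_nodes - 1:        # loop end condition: final node reached
        # Products of transition probabilities and word matching probabilities.
        products = transition[current].unsqueeze(-1) * match_probs  # (nodes, vocab)
        flat = int(products.argmax())
        nxt, word = divmod(flat, match_probs.shape[-1])
        cache.append((nxt, word))         # predicted node and its target word
        current = nxt
    return cache
```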
According to one or more embodiments of the present disclosure, [Example 9] provides a text generation method, wherein determining the predicted nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities includes: determining the corresponding maximum matching probability based on the node-to-word matching probabilities, and determining the word corresponding to the maximum matching probability as the predicted word of the corresponding node; determining the prediction path with the highest weight according to a preset path search algorithm in combination with the node transition matrix and the predicted words of the nodes; and determining the predicted words corresponding to the predicted nodes in the prediction path as the corresponding target words.
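One concrete choice of preset path search is a Viterbi-style dynamic program over the DAG, sketched below; the disclosure leaves the algorithm open, so this is an assumption.

```python
# Sketch: highest-weight path through the DAG, scoring each step by the
# log transition probability plus the log of the node's best word probability.
import torch

def best_path(transition: torch.Tensor, match_probs: torch.Tensor) -> list[int]:
    num_nodes = transition.shape[0]
    word_score, pred_word = match_probs.max(dim=-1)   # each node's predicted word
    score = torch.full((num_nodes,), float("-inf"))
    score[0], back = 0.0, [-1] * num_nodes
    for i in range(num_nodes):                        # nodes in topological order
        for j in range(i + 1, num_nodes):
            s = score[i] + torch.log(transition[i, j] + 1e-9) + torch.log(word_score[j])
            if s > score[j]:
                score[j], back[j] = s, i
    path, node = [], num_nodes - 1                    # backtrack from the last node
    while node != -1:
        path.append(node)
        node = back[node]
    path.reverse()
    return [int(pred_word[n]) for n in path]          # target words along the path
```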
According to one or more embodiments of the present disclosure, [Example 10] provides a text generation method, further including: performing learning parameter training on the constructed text decoding model based on a set loss function generation strategy to obtain the trained text decoding model; wherein the learning parameters include: the node position parameters involved in the position information input layer included in the text decoding model, the basic model parameters involved in the included basic decoding sub-model, and the node-related parameters involved in the nodes of the included text prediction layer.
According to one or more embodiments of the present disclosure, [Example 11] provides a text generation method, wherein performing learning parameter training on the constructed text decoding model based on the set loss function generation strategy to obtain the trained text decoding model includes: obtaining at least one group of sample data, one group of sample data including one original sample text and a corresponding single target sample text; in the current iteration, encoding the original sample text in a group of sample data using the text encoding model and inputting the result into the current text decoding model; determining, based on the current text decoding model, the probability value corresponding to generating the target sample text from the original sample text through a text prediction path, wherein the text prediction path is formed based on the nodes in the text prediction layer in combination with a set algorithm; determining the current loss function value based on the probability value in combination with a loss function generation formula, and adjusting the learning parameters in the current text decoding model through backpropagation based on the current loss function value to obtain the text decoding model for the next iteration; and taking the next iteration as the new current iteration and continuing the learning parameter training until an iteration end condition is met, obtaining the trained text decoding model.
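Below is a sketch of one training iteration under these steps. For brevity the path probability is computed by brute-force enumeration of forward paths, where a practical implementation would use dynamic programming; all names are illustrative, and the transition matrix and matching probabilities are assumed to be produced by the current decoding model so that gradients reach its learning parameters.

```python
# Illustrative sketch of one training iteration (not the disclosure's code).
import itertools
import torch

def path_probability(transition, match_probs, target):
    """Sum of probabilities over all text prediction paths emitting `target`."""
    num_nodes, m = transition.shape[0], len(target)
    total = torch.zeros(())
    # Paths start at node 0 and visit m strictly increasing nodes (assumption).
    for rest in itertools.combinations(range(1, num_nodes), m - 1):
        path = (0,) + rest
        p = match_probs[0, target[0]]
        for (i, j), w in zip(zip(path, path[1:]), target[1:]):
            p = p * transition[i, j] * match_probs[j, w]
        total = total + p
    return total

def train_step(transition, match_probs, target, optimizer):
    # transition and match_probs are assumed to come from the current text
    # decoding model, so backpropagation reaches its learning parameters.
    prob = path_probability(transition, match_probs, target)
    loss = -torch.log(prob + 1e-9)   # the Example 12 loss function formula
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```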
According to one or more embodiments of the present disclosure, [Example 12] provides a text generation method, wherein the loss function generation formula is expressed as: taking the logarithm of the sum of the probability values, and negating the result of the logarithm operation.
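Writing $X$ for the original sample text, $Y=(y_1,\dots,y_m)$ for the target sample text, and $A=(a_1,\dots,a_m)$ for a text prediction path through the graph (notation assumed here for illustration, not fixed by the disclosure), this corresponds to the negative log-likelihood

$$\mathcal{L} = -\log \sum_{A} P(Y, A \mid X), \qquad P(Y, A \mid X) = \prod_{k=1}^{m-1} P(a_{k+1} \mid a_k, X)\,\prod_{k=1}^{m} P(y_k \mid a_k, X).$$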
Furthermore, although various operations are depicted in a specific order, this should not be understood as requiring that these operations be performed in the specific order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are contained in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features described in the context of separate embodiments may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented in multiple embodiments separately or in any suitable sub-combination.

Claims (15)

  1. A text generation method, comprising:
    inputting the acquired original text into a trained text encoding model to obtain text feature information;
    generating, based on the text feature information in combination with a trained text decoding model, target text corresponding to the original text;
    wherein the text decoding model comprises a text prediction layer, node information of a set number of nodes included in the text prediction layer is determined by the text feature information, and the target words contained in the target text and the combination order of the target words are determined by the node information of the nodes and the topological structure between the nodes.
  2. The method according to claim 1, wherein the text decoding model comprises: a position information input layer, a basic decoding sub-model, and a text prediction layer;
    the position information input layer comprises a set number of node position parameters, and the set number is used to determine the number of nodes included in the text prediction layer;
    the node information of the set number of nodes included in the text prediction layer is determined by the node position parameters and the text feature information in combination with the basic decoding sub-model.
  3. The method according to claim 2, wherein generating the target text corresponding to the original text based on the text feature information in combination with the trained text decoding model comprises:
    inputting the text feature information and the node position parameters in the position information input layer into the basic decoding sub-model;
    obtaining the set number of initial text prediction vectors output by the basic decoding sub-model, and using the set number of initial text prediction vectors respectively as the node information of the set number of nodes in the text prediction layer;
    constructing a directed acyclic graph based on the nodes, determining the topological structure between the nodes, and determining the target text of the original text in combination with the node information of the nodes.
  4. The method according to claim 3, wherein constructing the directed acyclic graph based on the nodes, determining the topological structure between the nodes, and determining the target text of the original text in combination with the node information of the nodes comprises:
    constructing a directed acyclic graph according to the node labels of the nodes in the text prediction layer to obtain the topological structure between the nodes;
    determining a node transition matrix corresponding to the text prediction layer according to the topological structure between the nodes and the node information of the nodes;
    determining the target text of the original text according to the node information of the nodes and the node transition matrix.
  5. The method according to claim 4, wherein determining the node transition matrix corresponding to the text prediction layer according to the topological structure between the nodes and the node information of the nodes comprises:
    for each node, determining the adjacent nodes to which the node is directionally connected from the topological structure between the nodes;
    determining the transition probability from each node to its adjacent nodes according to the node information of the node and the adjacent nodes;
    forming the node transition matrix corresponding to the text prediction layer based on the transition probabilities.
  6. The method according to claim 4, wherein determining the target text of the original text according to the node information of the nodes and the node transition matrix comprises:
    determining, according to the node information of the nodes, the matching probability of each node to the words in a preset vocabulary through a fully connected layer in the text prediction layer;
    determining predicted nodes and corresponding target words according to the node transition matrix and the node-to-word matching probabilities;
    combining the target words to form the target text of the original text.
  7. The method according to claim 6, wherein determining the predicted nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities comprises:
    determining at least one predicted node according to the maximum transition probability corresponding to each node in the node transition matrix;
    for each predicted node, determining the maximum matching probability from the matching probabilities of the predicted node to the words, and determining the word corresponding to the maximum matching probability as the target word.
  8. The method according to claim 6, wherein determining the predicted nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities comprises:
    taking the node corresponding to the starting node label as the current node;
    obtaining the current transition probabilities from the current node to its adjacent nodes from the node transition matrix;
    determining the product values of the current transition probabilities and the corresponding node-to-word matching probabilities;
    selecting the maximum product value from the product values, taking the adjacent node and the word associated with the maximum product value as the predicted node and the target word respectively, and adding the predicted node and the target word in association to a cache table;
    taking the predicted node as the new current node, and re-executing the selection operation for the current adjacent nodes corresponding to the current node until a loop end condition is reached.
  9. The method according to claim 6, wherein determining the predicted nodes and the corresponding target words according to the node transition matrix and the node-to-word matching probabilities comprises:
    determining the corresponding maximum matching probability based on the node-to-word matching probabilities, and determining the word corresponding to the maximum matching probability as the predicted word of the corresponding node;
    determining the prediction path with the highest weight according to a preset path search algorithm in combination with the node transition matrix and the predicted words of the nodes;
    determining the predicted words corresponding to the predicted nodes in the prediction path as the corresponding target words.
  10. The method according to any one of claims 1-9, further comprising:
    performing learning parameter training on the constructed text decoding model based on a set loss function generation strategy to obtain the trained text decoding model;
    wherein the learning parameters comprise: the node position parameters involved in the position information input layer included in the text decoding model, the basic model parameters involved in the included basic decoding sub-model, and the node-related parameters involved in the nodes of the included text prediction layer.
  11. The method according to claim 10, wherein performing learning parameter training on the constructed text decoding model based on the set loss function generation strategy to obtain the trained text decoding model comprises:
    obtaining at least one group of sample data, one group of sample data comprising one original sample text and a corresponding single target sample text;
    in the current iteration, encoding the original sample text in a group of sample data using the text encoding model and inputting the result into the current text decoding model;
    determining, based on the current text decoding model, the probability value corresponding to generating the target sample text from the original sample text through a text prediction path, wherein the text prediction path is formed based on the nodes in the text prediction layer in combination with a set algorithm;
    determining the current loss function value based on the probability value in combination with a loss function generation formula, and adjusting the learning parameters in the current text decoding model through backpropagation based on the current loss function value to obtain the text decoding model for the next iteration;
    taking the next iteration as the new current iteration, and continuing the learning parameter training until an iteration end condition is met, obtaining the trained text decoding model.
  12. The method according to claim 11, wherein the loss function generation formula is expressed as:
    taking the logarithm of the sum of the probability values, and negating the result of the logarithm operation.
  13. A text generation apparatus, comprising:
    an encoding execution module, configured to input the acquired original text into a trained text encoding model to obtain text feature information;
    a decoding execution module, configured to generate, based on the text feature information in combination with a trained text decoding model, target text corresponding to the original text;
    wherein the text decoding model comprises a text prediction layer, node information of a set number of nodes included in the text prediction layer is determined by the text feature information, and the target words contained in the target text and the combination order of the target words are determined by the node information of the nodes and the topological structure between the nodes.
  14. An electronic device, comprising:
    one or more processors;
    a storage device, configured to store one or more programs,
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the text generation method according to any one of claims 1-12.
  15. A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the text generation method according to any one of claims 1-12 is implemented.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210346397.4A CN114818746A (en) 2022-03-31 2022-03-31 Text generation method and device, computer equipment and storage medium
CN202210346397.4 2022-03-31

Publications (1)

Publication Number Publication Date
WO2023185896A1 (en)

Family

ID=82532962

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/084560 WO2023185896A1 (en) 2022-03-31 2023-03-29 Text generation method and apparatus, and computer device and storage medium

Country Status (2)

Country Link
CN (1) CN114818746A (en)
WO (1) WO2023185896A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114818746A (en) * 2022-03-31 2022-07-29 北京有竹居网络技术有限公司 Text generation method and device, computer equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021026760A (en) * 2019-07-31 2021-02-22 株式会社Nttドコモ Machine translation apparatus and method
CN113420569A (en) * 2021-06-22 2021-09-21 康键信息技术(深圳)有限公司 Code translation method, device, equipment and storage medium
CN113535939A (en) * 2020-04-17 2021-10-22 阿里巴巴集团控股有限公司 Text processing method and device, electronic equipment and computer readable storage medium
CN113761845A (en) * 2021-01-28 2021-12-07 北京沃东天骏信息技术有限公司 Text generation method and device, storage medium and electronic equipment
CN113947060A (en) * 2021-10-19 2022-01-18 北京有竹居网络技术有限公司 Text conversion method, device, medium and electronic equipment
CN114818746A (en) * 2022-03-31 2022-07-29 北京有竹居网络技术有限公司 Text generation method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN114818746A (en) 2022-07-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 23778261
    Country of ref document: EP
    Kind code of ref document: A1