WO2023071242A1

WO2023071242A1 - Text generation method and apparatus, and storage medium

Info

Publication number: WO2023071242A1
Application number: PCT/CN2022/100545
Authority: WO
Inventors: 王昕远; 郑少杰; 范增虎
Original assignee: 深圳前海微众银行股份有限公司
Priority date: 2021-11-01
Filing date: 2022-06-22
Publication date: 2023-05-04
Also published as: CN114118041A

Abstract

Embodiments of the present application disclose a text generation method and apparatus, and a storage medium. The method comprises: obtaining a text keyword from a text generation instruction when the text generation instruction is received, and determining a target text type corresponding to the text keyword; when a target template comprising the target text type exists in a template library, obtaining the target template from the template library, templates in the template library being text templates that are provided with text types; and finding the position of the target text type in the target template, and replacing field information corresponding to the target text type by using field information of the text keyword at the position to obtain a target text comprising the text keyword.

Description

A text generation method, device, and storage medium

Cross References to Related Applications

This application claims priority to Chinese patent application No. 202111284961.6, entitled "a text generation method and device, storage medium" and filed with the China Patent Office on November 1, 2021, the entire contents of which are incorporated herein by reference.

Technical Field

The present application relates to the technical field of artificial intelligence, in particular to a method and device for generating text, and a storage medium.

Background

With the development of Internet technology, a large amount of text information about various objects is pushed to users over the network every day, so that users can understand an object in depth from the text information and then handle the object accordingly.

In the prior art, when description information related to an object is obtained, a corresponding description template is searched for manually, and the description template and the description information are associated manually to obtain the corresponding text information, which reduces the intelligence of text information generation.

Summary of the Invention

In order to solve the above technical problem, the embodiments of the present application aim to provide a text generation method and device, and a storage medium, which can improve the intelligence of the text generation device when generating text information.

The technical solutions of the present application are implemented as follows:

An embodiment of the present application provides a text generation method, including:

In the case of receiving a text generation instruction, acquiring text keywords from the text generation instruction, and determining the target text type corresponding to the text keywords;

In the case that there is a target template containing the target text type in the template library, the target template is obtained from the template library; the template in the template library is a text template provided with a text type;

Find the position of the target text type in the target template, and replace the field information corresponding to the target text type with the field information of the text keyword at the position, to obtain the target text containing the text keyword.
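The three steps above can be illustrated with a minimal sketch. It assumes that a template marks each text type with an angle-bracket placeholder such as `<company name>`; the placeholder syntax, the function names `find_target_template` and `fill_template`, and the sample library are illustrative assumptions rather than part of the application.

```python
from typing import Optional


def find_target_template(library: list[str], target_types: list[str]) -> Optional[str]:
    """Return the first template that contains every target text type, else None."""
    for template in library:
        if all(f"<{t}>" in template for t in target_types):
            return template
    return None


def fill_template(template: str, keyword_by_type: dict[str, str]) -> str:
    """Replace each text-type placeholder with the keyword of that type."""
    text = template
    for text_type, keyword in keyword_by_type.items():
        placeholder = f"<{text_type}>"  # assumed placeholder syntax
        position = text.find(placeholder)  # position of the text type in the template
        if position == -1:
            continue
        text = text[:position] + keyword + text[position + len(placeholder):]
    return text


# Hypothetical template library and keywords, for illustration only.
library = [
    "Good news from <company name>!",
    "Come and get the <company name> <issued item>!",
]
keywords = {"company name": "bank", "issued item": "red envelope"}

template = find_target_template(library, list(keywords))
target_text = fill_template(template, keywords) if template else None
```

When no template in the library contains all target text types, `find_target_template` returns `None` and no replacement is attempted.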

An embodiment of the present application provides a text generation device, the device includes:

The obtaining part is configured to obtain text keywords from the text generation instruction in the case of receiving the text generation instruction, and, in the case that there is a target template containing the target text type in the template library, to obtain the target template from the template library; the template in the template library is a text template provided with a text type;

A determining part configured to determine a target text type corresponding to the text keyword;

The replacement part is configured to replace the field information corresponding to the target text type with the field information of the text keyword at the position, so as to obtain the target text containing the text keyword.

An embodiment of the present application provides a text generation device, the device including a memory, a processor, and a communication bus; the memory communicates with the processor through the communication bus and stores a text generation program executable by the processor; when the text generation program is executed, the processor performs the above text generation method.

An embodiment of the present application provides a storage medium on which a computer program is stored, which is applied to a text generation device. When the computer program is executed by a processor, the above text generation method is implemented.

Embodiments of the present application provide a text generation method and device, and a storage medium. The text generation method includes: in the case of receiving a text generation instruction, obtaining text keywords from the text generation instruction, and determining the target text type corresponding to the text keywords; if there is a target template containing the target text type in the template library, obtaining the target template from the template library, where the template in the template library is a text template provided with a text type; finding the position of the target text type in the target template, and replacing the field information corresponding to the target text type with the field information of the text keyword at the position to obtain the target text containing the text keyword. With the above implementation, when the text generation device receives the text generation instruction, it obtains the text keywords from the text generation instruction, searches the template library for a target template that includes the target text type corresponding to the text keywords, finds the position of the target text type in the target template, and replaces the field information corresponding to the target text type at this position with the field information of the text keyword, so as to obtain the target text containing the text keyword. The text information therefore does not need to be produced manually, which improves the intelligence of the text generation device when generating text information.

Brief Description of the Drawings

The accompanying drawings here are incorporated into the specification and constitute a part of the specification. These drawings show embodiments consistent with the application, and are used together with the description to describe the technical solution of the application.

Fig. 1 is a flow chart of a text generation method provided by the embodiment of the present application;

FIG. 2 is a schematic structural diagram of an exemplary BERT provided in the embodiment of the present application;

FIG. 3 is a schematic diagram of an exemplary supervised training BERT model provided by an embodiment of the present application;

Fig. 4 is a flow chart of an exemplary training BERT model provided by the embodiment of the present application;

FIG. 5 is an exemplary text template persistence flowchart provided by the embodiment of the present application;

FIG. 6 is a flow chart of an exemplary text generation method provided by an embodiment of the present application;

FIG. 7 is a schematic diagram of a seed stage and an automatic training stage of an exemplary text generation method provided by an embodiment of the present application;

FIG. 8 is a first structural diagram of a text generation device provided by an embodiment of the present application;

FIG. 9 is a second schematic diagram of the composition and structure of a text generation device provided by an embodiment of the present application.

Detailed Description

In order to understand the characteristics and technical contents of the embodiments of the present application in more detail, the implementation of the embodiments of the present application will be described in detail below in conjunction with the accompanying drawings. The attached drawings are only for reference and description, and are not intended to limit the embodiments of the present application.

Embodiment one

The embodiment of the present application provides a text generation method, and Fig. 1 is a flow chart of a text generation method provided in the embodiment of the present application. As shown in Fig. 1, the text generation method may include:

S101. When a text generation instruction is received, acquire text keywords from the text generation instruction, and determine a target text type corresponding to the text keywords.

A text generation method provided by an embodiment of the present application is applicable to a scenario where a target text is generated according to text keywords carried in a text generation instruction.

In the embodiment of the present application, the text generation device may be implemented in various forms. For example, the text generation device described in this application may include devices such as mobile phones, cameras, tablet computers, notebook computers, palmtop computers, personal digital assistants (Personal Digital Assistant, PDA), portable media players (Portable Media Player, PMP), navigation devices, wearable devices, smart bracelets, and pedometers, as well as devices such as digital TVs and desktop computers.

In the embodiment of the present application, the text generation instruction can be an instruction for generating marketing text; the text generation instruction can also be an instruction for generating advertising text; the text generation instruction can also be an instruction for generating other text; the specific text generation instruction can be determined according to the actual situation, which is not limited in this embodiment of the present application.

In the embodiment of the present application, the text generation device may include a display screen, and the text generation device may receive a text generation instruction from the display screen; the text generation device may also receive a text generation instruction from other devices; the text generation device may also receive the text generation instruction in other ways; the specific manner in which the text generation device receives the text generation instruction can be determined according to the actual situation, which is not limited in this embodiment of the present application.

In this embodiment of the present application, the text keyword may be information used to generate the target text corresponding to the text generation instruction.

In the embodiment of the present application, the number of text keywords can be one, two, or more; the specific number of text keywords can be determined according to the actual situation, which is not limited in this embodiment of the present application.

Exemplarily, the text keywords include bank, coupon, 10 yuan, January 1st to January 30th, movie viewing, card binding, etc.

In the embodiment of the present application, the number of target text types can be one, two, or more; the specific number of target text types can be determined according to the actual situation, which is not limited in this embodiment of the present application.

Exemplarily, the target text type can be a company name; the target text type can also be a product name; the target text type can also be a distribution item; the target text type can also be a numerical amount; the target text type can also be an activity time or an activity description; the specific target text type can be determined according to the actual situation, which is not limited in this embodiment of the present application.

It should be noted that there can be a one-to-one correspondence between text keywords and target text types, that is, one text keyword corresponds to one target text type; two text keywords can also correspond to one target text type; or multiple text keywords can correspond to one target text type; the specific correspondence between the text keywords and the target text type can be determined according to the actual situation, which is not limited in this embodiment of the present application.

In the embodiment of the present application, the process in which the text generation device determines the target text type corresponding to the text keyword includes: when the text generation instruction does not carry the target text type, the text generation device inputs the text keyword into the type recognition model to obtain the target text type; when the text generation instruction carries the target text type, the text generation device obtains the target text type from the text generation instruction.
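The two branches of this determination can be sketched as follows. The function name `determine_target_types`, the `carried_types` parameter, and the dictionary-based `toy_type_model` are assumptions made for illustration; the toy lookup merely stands in for a trained type recognition model.

```python
from typing import Callable, Optional, Sequence


def determine_target_types(
    keywords: Sequence[str],
    carried_types: Optional[Sequence[str]],
    type_model: Callable[[str], str],
) -> list[str]:
    """Use the types carried in the instruction if present, else ask the model."""
    if carried_types is not None:
        return list(carried_types)
    return [type_model(keyword) for keyword in keywords]


# Toy stand-in for a trained type recognition model (assumption, not a real classifier).
TOY_TYPES = {
    "bank": "company name",
    "coupon": "issued item",
    "10 yuan": "monetary value",
}


def toy_type_model(keyword: str) -> str:
    return TOY_TYPES.get(keyword, "unknown")


predicted = determine_target_types(["bank", "coupon"], None, toy_type_model)
carried = determine_target_types(["bank"], ["company name"], toy_type_model)
```

Passing the model as a callable keeps the branch logic independent of which classifier is actually used.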

It should be noted that the type recognition model can be a model configured in the text generation device; the type recognition model can also be a model that the text generation device obtains from other devices before it inputs the text keywords into the type recognition model; the type recognition model may also be a model obtained by the text generation device in other ways; the specific manner in which the text generation device obtains the type recognition model may be determined according to actual conditions, which is not limited in this embodiment of the present application.

In the embodiment of the present application, the type recognition model can be a text classification (FastText) model; the type recognition model can also be other models that can determine the text type according to text keywords; the specific type recognition model can be determined according to the actual situation , which is not limited in this embodiment of the present application.

In the embodiment of the present application, before the text generation device inputs the text keywords into the type recognition model to obtain the target text type, the text generation device also obtains a second sample keyword and a second sample text type; the text generation device uses the second sample keyword and the second sample text type to train an initial type recognition model to obtain the type recognition model.

In the embodiment of the present application, the second sample keyword can be a preset keyword; the second sample keyword can also be a keyword transmitted to the text generation device by other devices; the second sample keyword can also be a keyword that the text generation device receives through manual labeling; the specific manner in which the text generation device obtains the second sample keyword can be determined according to the actual situation, which is not limited in this embodiment of the present application.

In this embodiment of the present application, the second sample text type is the text type corresponding to the second sample keyword. The second sample text type can be a preset text type; the second sample text type can also be a text type transmitted to the text generation device by other devices; the second sample text type can also be a text type that the text generation device receives through manual labeling; the specific manner in which the text generation device obtains the second sample text type can be determined according to the actual situation, which is not limited in this embodiment of the present application.

In this embodiment of the present application, the text generation device may acquire the second sample keyword and the second sample text type only once.

Exemplarily, the second sample keywords include bank, coupon, 10 yuan, January 1 to January 30, movie watching, card binding and so on.

Exemplarily, the second sample text type includes: company name, product name, issued items, numerical value, activity time, activity description, etc.; the specific second sample text type can be determined according to the actual situation, which is not limited in this embodiment of the present application.

S102. If there is a target template including the target text type in the template library, acquire the target template from the template library; the templates in the template library are text templates set with the text type.

In the embodiment of the present application, after the text generation device determines the target text type corresponding to the text keyword, if there is a target template containing the target text type in the template library, the text generation device obtains the target template from the template library.

It should be noted that the templates in the template library are text templates with a text type.

In the embodiment of the present application, the number of text templates can be one, two, or more; the specific number of text templates can be determined according to the actual situation, which is not limited in this embodiment of the present application.

In the embodiment of the present application, before the text generation device obtains the target template from the template library, the text generation device also obtains a first sample text, and inputs the first sample text into the keyword recognition model to obtain the first sample keyword corresponding to the first sample text, the first sample type, and the first position of the first sample keyword in the first sample text; the text generation device inputs the first sample keyword into the text generation model to obtain a first output text; according to the first output text, the first sample text, the first sample keyword, the first sample type and the first position, a text template is obtained and added to the template library.

In the embodiment of the present application, the text generation device may obtain the first sample text every preset time period; the text generation device may also obtain the first sample text when receiving a sample text acquisition instruction; the text generation device may also obtain the first sample text in other ways; the specific manner in which the text generation device obtains the first sample text can be determined according to the actual situation, which is not limited in this embodiment of the present application.

It should be noted that the preset time period can be a time period configured in the text generation device; the preset time period can also be a time period received by the text generation device before it obtains the first sample text; the preset time period may also be a time period obtained by the text generation device in other ways; the specific manner in which the text generation device obtains the preset time period may be determined according to actual conditions, which is not limited in this embodiment of the present application.

It should also be noted that the preset time period can be one week; the preset time period can also be one month; the preset time period can also be one day; the specific preset time period can be determined according to the actual situation, which is not limited in this embodiment of the present application.

In this embodiment of the application, the keyword recognition model can be a model configured in the text generation device; the keyword recognition model can also be a model that the text generation device receives from other devices; the keyword recognition model can also be a model obtained by the text generation device in other ways; the specific manner in which the text generation device obtains the keyword recognition model may be determined according to actual conditions, which is not limited in this embodiment of the present application.

In the embodiment of the present application, the keyword recognition model can be a model obtained from a language representation model (Bidirectional Encoder Representations from Transformers, BERT) and a conditional random field model; the keyword recognition model can also be another model that can obtain, from a sample text, the sample keywords corresponding to the sample text, the sample types, and the positions of the sample keywords in the sample text; the specific keyword recognition model can be determined according to the actual situation, which is not limited in this embodiment of the present application.

In this embodiment of the application, the text generation model can be a model configured in the text generation device; the text generation model can also be a model that the text generation device receives from other devices; the text generation model can also be a model obtained by the text generation device in other ways; the specific manner in which the text generation device obtains the text generation model can be determined according to the actual situation, which is not limited in this embodiment of the present application.

In the embodiment of the present application, the text generation model can be the Fixed-Keywords BERT model; the text generation model can also be another model that can generate output text according to the text keywords; the specific text generation model can be determined according to the actual situation, which is not limited in this embodiment of the present application.

In the embodiment of the present application, the process in which the text generation device obtains the text template according to the first output text, the first sample text, the first sample keyword, the first sample type and the first position includes: the text generation device uses the keyword recognition model to determine the second position of the first sample keyword in the first output text; the text generation device replaces the first sample keyword at the second position in the first output text with the first sample type to obtain the first template; the text generation device replaces the first sample keyword at the first position in the first sample text with the first sample type to obtain the second template; the text generation device takes the first template and the second template as text templates.
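A minimal sketch of this template construction is given below. It assumes that each position is a (start, end) character offset pair and that a text type is written into the template as an angle-bracket placeholder; the helper `locate` uses a plain substring search as a stand-in for the keyword recognition model, and the removal of duplicate templates is an added assumption. All names and sample strings are hypothetical.

```python
def to_template(text: str, annotations: list[tuple[int, int, str]]) -> str:
    """Replace each annotated keyword span (start, end, sample_type) with its type."""
    result = text
    # Work from right to left so that earlier offsets stay valid after each replacement.
    for start, end, sample_type in sorted(annotations, reverse=True):
        result = result[:start] + f"<{sample_type}>" + result[end:]
    return result


def locate(text: str, keyword: str) -> tuple[int, int]:
    """Stand-in for the keyword recognition model: substring search for the span."""
    start = text.find(keyword)
    return start, start + len(keyword)


def build_text_templates(sample_text, first_annotations, output_text, second_annotations):
    first_template = to_template(output_text, second_annotations)   # from the first output text
    second_template = to_template(sample_text, first_annotations)   # from the first sample text
    # The two templates may coincide; keep one copy in that case (assumption).
    return list(dict.fromkeys([first_template, second_template]))


# Hypothetical sample data.
sample_text = "Come and get the bank red envelope!"
output_text = "The bank red envelope is waiting for you!"
typed_keywords = [("bank", "company name"), ("red envelope", "issued item")]

first_annotations = [(*locate(sample_text, k), t) for k, t in typed_keywords]
second_annotations = [(*locate(output_text, k), t) for k, t in typed_keywords]
templates = build_text_templates(sample_text, first_annotations, output_text, second_annotations)
```

Replacing spans from right to left avoids recomputing offsets after each substitution changes the string length.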

In the embodiment of the present application, to determine the second position of the first sample keyword in the first output text by using the keyword recognition model, the text generation device may input the first output text into the keyword recognition model and use the keyword recognition model to determine the second position of the first sample keyword in the first output text.

In this embodiment of the application, the first template and the second template can be the same; the first template and the second template can also be different; if there are multiple first templates and multiple second templates, some of the first templates and second templates may be the same and some may be different; the specifics can be determined according to the actual situation, which is not limited in this embodiment of the present application.

It should be noted that the first position and the second position may be the same; the first position and the second position may also be different; the specifics may be determined according to actual conditions, which is not limited in this embodiment of the present application.

In this embodiment of the present application, before the text generation device inputs the first sample text into the keyword recognition model to obtain the first sample keyword corresponding to the first sample text, the first sample type, and the first position of the first sample keyword in the first sample text, the text generation device also obtains a second sample text, together with the second sample keyword corresponding to the second sample text, the second sample type, and the third position of the second sample keyword in the second sample text; the text generation device uses the second sample keyword, the second sample type, the third position and the second sample text to train an initial keyword recognition model to obtain the keyword recognition model.

In this embodiment of the application, the text generation device is configured with a regular expression combining {marketing words} and {product/company name}. The text generation device can use the regular expression to obtain the second sample text from the full Internet data; the second sample keyword, the second sample type, and the third position of the second sample keyword in the second sample text are then marked in the second sample text by manual labeling and transmitted to the text generation device, at which point the text generation device has acquired the second sample keyword, the second sample type and the third position.

In this embodiment of the application, the marketing words are words related to financial marketing that are configured in the text generation device. The marketing words include: receiving, benefits, discounts, red envelopes, limited time, special prices, free shipping, recharge, coupons, members, voucher, blockbuster, good news, exclusive, exclusive, super value, special offer, gift, reward, exchange, activation, gift, subsidy, 11.11, 12.12, double 11, double 12, lottery, double 11, double 12, intimate, money-saving, discount, high-quality goods, free shipping, warm winter, exquisite, waiting for you to get it, spike, free, coupon, discount, gift, gift, gift, store celebration, limited, discount, exchange, good news, surprise, carnival, shocking, launch, event, special offer, special, special offer, incoming, wool, direct drop, money saving, subsidy, immediate discount, red envelope, limited time, points, online, shock, slow hands, [low less let reduction].*[interest rate interest price], [interest rate interest rate price].*[low less reduction reduction], full.*reduction, exclusive, unsecured, come to grab, come quickly, quickly, must-have, recharge, rebate, opening, latest.

In this embodiment of the application, the product/company name is a financial-related product and company name or abbreviation, represented by "{product/company name}".

Exemplarily, the regular expression combining {marketing word} and {product/company name} can be: {marketing word}.*{product/company name}; the regular expression combining {marketing word} and {product/company name} can also be: {product/company name}.*{marketing word}.
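As a rough illustration, the two orderings can be combined into one pattern with Python's `re` module. The function name `build_candidate_pattern` and the short word lists are assumptions; literal words are escaped with `re.escape`, so entries that are themselves patterns would need to be added unescaped.

```python
import re


def build_candidate_pattern(marketing_words, product_names):
    """Match text where a marketing word and a product/company name both occur, in either order."""
    marketing = "|".join(re.escape(word) for word in marketing_words)
    product = "|".join(re.escape(name) for name in product_names)
    return re.compile(
        rf"(?:{marketing}).*(?:{product})|(?:{product}).*(?:{marketing})"
    )


# Hypothetical, shortened word lists for illustration.
pattern = build_candidate_pattern(
    marketing_words=["coupon", "red envelope", "limited time"],
    product_names=["bank"],
)

candidates = [
    "Free coupon from the bank",
    "The bank launches a limited time offer",
    "The bank is open",
    "The weather is nice today",
]
selected = [text for text in candidates if pattern.search(text)]
```

Only texts that contain both a marketing word and a product/company name are kept as candidate sample texts.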

It should be noted that the first sample text may also be sample text obtained from the full Internet data every preset time period by using the regular expression.

In this embodiment of the application, if the second sample keyword is "bank", the corresponding second sample type is company name; if the second sample keyword is "coupon", the corresponding second sample type is issued item; if the second sample keyword is "10 yuan", the corresponding second sample type is monetary value; if the second sample keyword is "January 1 to January 30", the corresponding second sample type is activity time.

Exemplarily, if the second sample text is "The company is giving out benefits! 50 yuan red envelope, come and get it", then the first second sample keyword is "the company", its second sample type is company name, and its third position is (0, 2); the second second sample keyword is "50 yuan", its second sample type is amount value, and its third position is (7, 10); the third second sample keyword is "red envelope", its second sample type is distribution item, and its third position is (10, 12).

It should be noted that the third position may be a pair of starting and ending positions where the second sample keyword appears in the second sample text.

In the embodiment of the present application, after the text generation device determines the target text type corresponding to the text keyword, the text generation device determines at least two empty positions and at least two groups of characters corresponding to the at least two empty positions; the text generation device splices the at least two empty positions and the text keywords according to the at least two groups of characters to obtain splicing information; the text generation device inputs the splicing information into the text generation model to obtain at least two groups of target character information corresponding to the at least two empty positions; the text generation device adds the at least two groups of target character information to the at least two empty positions in the splicing information to obtain the target text.

It should be noted that the at least two empty positions correspond to the at least two groups of characters, that is, one empty position corresponds to one group of characters.

It should be noted that, when the number of text keywords is one, there is a first empty position to the left of the text keyword and a second empty position to its right. When the number of text keywords is two, there is a first empty position to the left of the first text keyword, a second empty position between the first text keyword and the second text keyword, and a third empty position to the right of the second text keyword. In general, when the number of text keywords is N, there is a first empty position to the left of the first text keyword, a second empty position between the first and second text keywords, and so on, an Nth empty position between the (N-1)th and Nth text keywords, and an (N+1)th empty position to the right of the Nth text keyword. That is, when the number of text keywords is N, the corresponding number of empty positions is N+1.
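The interleaving of N keywords with N+1 empty positions can be sketched as follows. The sketch assumes that each empty position is represented by a run of `[MASK]` tokens whose length is given per position; the token name, the function `build_splice`, and the treatment of each keyword as a single token are illustrative assumptions.

```python
MASK = "[MASK]"  # assumed placeholder token for one empty character slot


def build_splice(keywords: list[str], slot_lengths: list[int]) -> list[str]:
    """Interleave N keywords with N + 1 empty positions of the given lengths."""
    if len(slot_lengths) != len(keywords) + 1:
        raise ValueError("N keywords require N + 1 empty positions")
    tokens = [MASK] * slot_lengths[0]          # empty position left of the first keyword
    for keyword, length in zip(keywords, slot_lengths[1:]):
        tokens.append(keyword)
        tokens.extend([MASK] * length)         # empty position right of this keyword
    return tokens


splice = build_splice(["bank", "red envelope"], [3, 3, 3])
```

With two keywords and three slots of length 3, the result is three masks, "bank", three masks, "red envelope", and three masks.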

In the embodiment of the present application, the process in which the text generation device inputs the splicing information into the text generation model to obtain at least two groups of target character information corresponding to the at least two empty positions includes: the text generation device inputs the splicing information into the text generation model and uses the text generation model to obtain, by sampling, the first word in each of the at least two empty positions, that is, at least two groups of first characters; the at least two groups of first characters and the splicing information are then input into the text generation model, which is used to obtain, by sampling, at least two groups of second characters in the at least two empty positions; this continues until every word in the at least two empty positions has been obtained by sampling with the text generation model, that is, until the at least two groups of target character information are obtained.
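The round-by-round filling described above can be sketched with a stand-in sampler. The callable `sample_token(tokens, index)` is hypothetical and replaces the text generation model; in each round it is asked for the next unfilled word of every empty position, and the filled sequence is fed back in the following round. The `[MASK]` token and all function names are assumptions.

```python
MASK = "[MASK]"  # assumed placeholder token for one empty character slot


def slot_spans(tokens: list[str]) -> list[tuple[int, int]]:
    """Return (start, end) index pairs for each run of consecutive mask tokens."""
    spans, start = [], None
    for i, token in enumerate(tokens):
        if token == MASK and start is None:
            start = i
        elif token != MASK and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tokens)))
    return spans


def fill_slots(tokens: list[str], sample_token) -> list[str]:
    """Fill every empty position one word per round, feeding results back each round."""
    tokens = list(tokens)
    spans = slot_spans(tokens)
    longest = max((end - start for start, end in spans), default=0)
    for offset in range(longest):
        current = list(tokens)  # all slots in one round see the same model input
        for start, end in spans:
            index = start + offset
            if index < end:
                tokens[index] = sample_token(current, index)
    return tokens


# Deterministic stand-in for model sampling, used only to show the call order.
calls = []


def fake_sampler(current_tokens, index):
    calls.append(index)
    return f"w{index}"


filled = fill_slots([MASK, MASK, "bank", MASK, MASK, "red envelope", MASK, MASK], fake_sampler)
```

The recorded call order shows that the first word of every empty position is sampled before any second word.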

It should be noted that the structure of BERT is shown in Figure 2. BERT can be divided into three parts: a word vector conversion part, an encoding part, and a supervision part. When input text is received, the word vector conversion part first converts the input text into a word vector sequence (CLS, word 1, word 2, word 3, ..., word N); the encoding part then encodes the word vector sequence; finally, the supervision part determines the text category of the encoded input text. The encoding part is the main body of BERT. Its main function is to encode the N+1 input word vectors so that information can interact among all input vectors. The encoding part is composed of several layers of coding blocks. The first coding block in the encoding part encodes the word vector sequence to obtain the first coding sequence (E_CLS, E_1, E_2, E_3, ..., E_n) (the first coding sequence is the coding sequence closest to the word vector sequence in Fig. 3), and the last coding block in the encoding part obtains the coding output sequence (E_CLS, E_1, E_2, E_3, ..., E_n) (the coding output sequence is the coding sequence closest to the supervision part in Fig. 3). Finally, the supervision part includes the labels corresponding to the input text that are needed for supervised training of BERT. Figure 3 shows multi-category classification of the input text. In this case, it is only necessary to take the coding output sequence output by the encoding part and map it to the target category (text category), and then supervised training can be started. The supervision part can be adjusted according to the task goal, for example to perform named entity recognition, question answering, and so on.

In the embodiment of the present application, if the text generation model is the Fixed-Keywords BERT model, the function of the text generation model is to generate marketing copy (that is, target text) containing these template keywords according to the given template keywords.

In the embodiment of the present application, Fig. 3 shows the supervised training process in which "bank" and "red envelope" are input as sample keywords to generate the marketing copy (output text) "Come and get the bank red envelope!". First, word-vector conversion is performed on "bank", "red envelope" and the mask part to obtain a word-vector sequence. The word-vector sequence is encoded by the first coding block of the encoding part to obtain the first coding sequence (E_CLS, E_M, E_M, E_M, E_silver, E_line, E_M, E_M, E_M, E_red, E_packet, E_M, E_M, E_M), where "silver" and "line" are the two characters of "bank" (银行) and "red" and "packet" are the two characters of "red envelope" (红包). Encoding continues until the last coding block of the encoding part produces the encoded output sequence, and the supervision part then supervises the encoded output sequence to obtain the prediction result ("come and get", "---", "la!-").

Specifically: since the input contains only the two words "bank" and "red envelope", it cannot form a complete marketing copy. However, the words that make up the copy may appear in the vacancies that these two words form in sequence (to the left of "bank", between "bank" and "red envelope", and to the right of "red envelope"). It is therefore necessary to first set the maximum number of words, L_M, that may appear in these vacancies. L_M is set in two steps:

The first step: according to the sample text and the sample keywords, determine the number of words in each vacancy formed by the sample keywords in the sample text. For example, for the marketing copy "Come and get the bank red envelope!" and the sample keywords it contains:

Sample type: <company name>, sample keyword: bank, keyword position: (3, 5)

Sample type: <issued item>, sample keyword: red envelope, keyword position: (5, 7)

it can be determined that the three vacancies contain 3, 0 and 2 words respectively.

The second step: take the maximum of the word counts of all vacancies as L_M. Then L_M mask vectors (denoted M) are inserted into each vacancy formed by the sample keywords. For example, for the input "bank" and "red envelope", L_M mask vectors are inserted before "bank", between "bank" and "red envelope", and after "red envelope". As shown in Figure 3, assuming that L_M is 3, the word vector conversion part shows the final result including the masks.

It can be understood that, with the maximum value used as the number of words for every vacancy, there are enough positions (mask parts) to predict the target characters at each vacant position, which improves the accuracy of predicting the target character information.
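As a rough illustration of how the vacancy sizes and L_M follow from the keyword positions, the sketch below assumes that the original copy is 快来领银行红包啦! (an assumption, although it is consistent with the positions (3, 5) and (5, 7) and with the vacancy sizes 3, 0 and 2 given above). The helper name `slot_lengths` is hypothetical.

```python
from typing import List, Tuple


def slot_lengths(text: str, spans: List[Tuple[int, int]]) -> List[int]:
    """Character counts of the vacancies left of, between, and right of the keywords.

    spans are (start, end) keyword positions with an exclusive end index.
    """
    lengths, cursor = [], 0
    for start, end in sorted(spans):
        lengths.append(start - cursor)  # vacancy before this keyword
        cursor = end
    lengths.append(len(text) - cursor)  # vacancy after the last keyword
    return lengths


sample_text = "快来领银行红包啦!"   # assumed original of "Come and get the bank red envelope!"
keyword_spans = [(3, 5), (5, 7)]   # "bank" (银行) and "red envelope" (红包)

gaps = slot_lengths(sample_text, keyword_spans)
l_m = max(gaps)                    # maximum vacancy size, used as L_M
```

With several sample texts, L_M would be the maximum over the vacancy sizes of all of them.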

In the embodiment of the present application, the supervision part performs word supervision on the mask part. If the number of words at the corresponding position in the sample copy is less than L_M, supervision starts from the leftmost position of the mask part, and the supervision target for the remaining positions is "-" (see "---" in Figure 3), which means that no word exists at that position. Once the target characters for the mask part have been predicted, the final marketing copy is obtained by removing the "-" from the predicted target characters and combining the rest with the sample keywords.

Exemplarily, the process of text generation model training is shown in Figure 4:

S41. The text generation device acquires a second sample text and a second sample keyword corresponding to the second sample text.

S42. The text generation device constructs a word vector sequence by using the second sample keywords.

Exemplarily, for the marketing copy (the second sample text) "Come and get the bank red envelope!", and the second sample keywords contained therein:

Second sample type: <company name>, second sample keyword: bank, third position: (3, 5)

Second sample type: <issued item>, second sample keyword: red envelope, third position: (5, 7)

First, the second sample keywords are converted into word vectors from left to right, in the order in which they appear in the marketing copy, with each character converted directly into a 200-dimensional vector. The word vectors of the two keywords are spliced together to construct a sequence of four 200-dimensional word vectors. Then, mask vectors are filled into all vacancies formed by the two second sample keywords.

Exemplarily, if the number of characters L_M in each mask part is 3, then three 200-dimensional mask vectors, all initialized as all-zero vectors, are inserted at each vacant position. After insertion, a sequence of 200-dimensional vectors is obtained whose length is 3 (mask vector sequence) + 2 ("bank" vector sequence) + 3 (mask vector sequence) + 2 ("red envelope" vector sequence) + 3 (mask vector sequence) = 13. At this point, the word vector conversion part of the Fixed-Keywords BERT model is complete, that is, the word vector sequence is obtained.
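The construction of the masked input can be sketched at the token level as below. The sketch works on characters rather than on 200-dimensional vectors; the function name `build_masked_sequence` and the `[M]` token are hypothetical, and the keywords are written as 银行 and 红包.

```python
from typing import List

MASK = "[M]"  # stands for one all-zero mask vector in the description


def build_masked_sequence(keywords: List[str], l_m: int) -> List[str]:
    """Insert l_m mask tokens before, between and after the ordered keywords."""
    seq = [MASK] * l_m
    for keyword in keywords:
        seq.extend(keyword)        # one token per character of the keyword
        seq.extend([MASK] * l_m)   # vacancy that follows this keyword
    return seq


masked = build_masked_sequence(["银行", "红包"], l_m=3)
# 3 masks + 2 characters + 3 masks + 2 characters + 3 masks = 13 tokens
```

In a vector-level version, each keyword character would be looked up in an embedding table and each `[M]` would be a zero vector of the same dimension.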

S43. The text generation device constructs a training label according to the second sample text.

In the embodiment of this application, the training label represents the expected result after inputting the data into the model, that is, the real marketing copy. When constructing the word vector sequence, a mask is inserted for each vacancy, and it is necessary to ensure that the training label and the word vector sequence correspond to each word position.

Exemplarily, if the constructed vector sequence is [M, M, M, silver, line, M, M, M, red, packet, M, M, M], the corresponding training label is constructed as [quick, come, get, silver, line, -, -, -, red, packet, la, !, -], where each entry is one character of the copy and "-" indicates that there is no character at the corresponding position.
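A possible way to build such an aligned label is sketched below, again assuming the original copy is 快来领银行红包啦!. The helper `build_labels` is hypothetical; it pads each vacancy on the right with "-", which matches supervision starting from the leftmost mask position.

```python
from typing import List, Tuple

PAD = "-"  # label meaning "no character at this position"


def build_labels(text: str, spans: List[Tuple[int, int]], l_m: int) -> List[str]:
    """Training label aligned with the masked input: vacancies padded to l_m."""
    def padded(gap: str) -> List[str]:
        if len(gap) > l_m:
            raise ValueError("vacancy is longer than L_M")
        return list(gap) + [PAD] * (l_m - len(gap))

    labels, cursor = [], 0
    for start, end in sorted(spans):
        labels += padded(text[cursor:start])  # vacancy before the keyword
        labels += list(text[start:end])       # the keyword itself
        cursor = end
    labels += padded(text[cursor:])           # vacancy after the last keyword
    return labels


labels = build_labels("快来领银行红包啦!", [(3, 5), (5, 7)], l_m=3)
```

The result has the same length (13) as the masked input sequence, so every input position has exactly one label.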

S44. The text generation device inputs the word vector sequence into the encoding part of the initial text generation model to obtain an encoded output sequence.

S45. The text generation device trains the initial text generation model according to the encoded output sequence and the training labels, and obtains the text generation model.

It should be noted that after the text generating device obtains the coded output sequence, the text generating device maps each vector (except the CLS vector) in the coded output sequence to the word list set (including "-").

Exemplarily, inputting the word vector sequence [M, M, M, silver, line, M, M, M, red, packet, M, M, M] into the encoding part of the initial text generation model yields an encoded output sequence of 13 200-dimensional vectors. Each vector is multiplied by a trainable matrix (matrix shape: 200 × (vocabulary size + 1), where the extra 1 stands for "-") so as to map the vector onto the target vocabulary (including "-"). The cross entropy between the mapped vector sequence and the training label [quick, come, get, silver, line, -, -, -, red, packet, la, !, -] can then be determined, and gradient descent can be used to fine-tune and update the parameters of the initial text generation model. When the initial text generation model converges (that is, its parameters are no longer updated) or the maximum number of training steps is reached, the initial text generation model can be considered trained, thereby obtaining the text generation model.
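The mapping to the vocabulary and the cross-entropy loss can be illustrated with a small, dependency-free sketch. It is not the training code of the application: the function names are hypothetical, the weight matrix is a plain list of rows (hidden size × vocabulary size including "-"), and which positions are supervised is left as a parameter because the description mentions supervising the mask part.

```python
import math
from typing import List, Sequence


def project(vec: Sequence[float], weight: Sequence[Sequence[float]]) -> List[float]:
    """Multiply one encoded vector (hidden,) by a (hidden x vocab) matrix."""
    return [sum(v * w for v, w in zip(vec, column)) for column in zip(*weight)]


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-probability of the target index under softmax(logits)."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_z - logits[target]


def sequence_loss(encoded, weight, targets, supervised) -> float:
    """Mean cross entropy over the supervised positions of one sequence."""
    losses = [cross_entropy(project(encoded[i], weight), targets[i]) for i in supervised]
    return sum(losses) / len(losses)
```

In practice the gradient of this loss with respect to the matrix and the encoder parameters would be computed by a deep learning framework and applied by gradient descent.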

In the embodiment of this application, after model training the Fixed-Keywords BERT model acquires the following capability: given the input words "bank" and "red envelope", it outputs "come and get" to the left of "bank", "---" between "bank" and "red envelope", and "啦!-" to the right of "red envelope". Here "-" indicates that there is no character at that position. After the "-" are removed, the mask parts are spliced together with "bank" and "red envelope" in order to obtain the complete marketing copy: "Come and get the bank red envelope!". Since "bank" and "red envelope" are sample keywords, each with its corresponding sample type, the sample keywords in the generated marketing copy can further be replaced with the corresponding sample types, that is, "<company name>" replaces "bank" and "<issued item>" replaces "red envelope", giving the template: "Come and get <company name> <issued item>!". This completes the persistence of the template. Specifically, the flow chart of persisting a text template is shown in Figure 5:

S51. The text generation device acquires a first sample keyword.

In this embodiment of the present application, while acquiring the first sample keyword, the text generation device will also acquire the first sample type corresponding to the first sample keyword. Specifically, the text generation device may input the first sample text into the keyword recognition model to obtain the first sample keyword and the first sample type corresponding to the first sample text.

In the embodiment of the present application, the text generation device also needs the first sample keyword sequence. Exemplarily, the input first sample keyword form may be:

First sample type: <company name>, first sample keyword: bank

First sample type: <issued item>, first sample keyword: interest-free coupon

It should be noted that the input first sample keyword sequence is order-sensitive, that is, the order of the input first sample keyword sequence is consistent with the order in which it appears in the final generated marketing copy. In order to subsequently generate a text template, it is necessary to obtain the first sample type.

S52. The text generation device inputs the first sample keywords into the text generation model to obtain a first output text.

In the embodiment of this application, a maximum value L_M is set for the number of words that may appear in the vacancies formed in sequence by the two words "bank" and "interest-free coupon" (to the left of "bank", between "bank" and "interest-free coupon", and to the right of "interest-free coupon"). If L_M is 3, the word vector sequence that can be constructed is: [M, M, M, silver, line, M, M, M, free, interest, coupon, M, M, M]. The constructed word vector sequence is input into the encoding part of the Fixed-Keywords BERT model to obtain the encoded output sequence output by the last encoding layer (the last coding block). Each vector in the encoded output sequence (except E_CLS and the positions where the first sample keywords are located) is mapped to the vocabulary (including "-"), and the word with the largest probability value after the mapping is selected as the prediction for the current position.

Exemplarily, for the word vector sequence [M, M, M, silver, line, M, M, M, free, interest, coupon, M, M, M], the nine vectors at positions (0, 3), (5, 8) and (11, 14) are each mapped to the whole vocabulary (including "-"). After mapping, each character position has a numerical (probability) vector representing the likelihood of each word in the vocabulary (including "-"), and the word with the highest probability value is taken as the prediction for that position. When all nine positions have been predicted, combining them with the first sample keywords gives: [-, -, -, silver, line, big, amount, -, free, interest, coupon, enjoy, non, stop].

It should be noted that the predicted marketing copy (the first output text) is obtained by removing the "-" that represent non-existent characters: "Bank large-amount interest-free coupons, enjoy non-stop".
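The per-position selection and the removal of "-" can be sketched as follows. The function name `decode`, the `[M]` token and the tiny vocabulary in the usage note are hypothetical; keyword positions carry no probability vector because they are copied through unchanged.

```python
from typing import List, Optional, Sequence

MASK, PAD = "[M]", "-"


def decode(
    sequence: Sequence[str],
    probabilities: Sequence[Optional[Sequence[float]]],
    vocab: Sequence[str],
) -> str:
    """Pick the most probable word at each mask position and drop the "-" entries."""
    out: List[str] = []
    for token, probs in zip(sequence, probabilities):
        if token == MASK:
            best = max(range(len(vocab)), key=probs.__getitem__)  # argmax over the vocabulary
            token = vocab[best]
        if token != PAD:
            out.append(token)
    return "".join(out)
```

For example, with `vocab = ["-", "啦", "!"]`, the sequence `["红", "包", "[M]", "[M]", "[M]"]` and mask probabilities peaking at "啦", "!" and "-" in turn, `decode` returns the string 红包啦!.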

S53. The text generation device determines a second position of the first sample keyword in the first output text by using the keyword recognition model.

S54. The text generation device replaces the first sample keyword with the first sample type at the second position in the first output text to obtain a first template; and uses the first template as a text template.

In the embodiment of this application, the first sample keywords in the predicted marketing copy "Bank large-amount interest-free coupons, enjoy non-stop" are replaced with the corresponding first sample types, that is, "bank" is replaced by the first sample type "<company name>" and "interest-free coupon" is replaced by the first sample type "<issued item>". The final first template is obtained: "<company name> large-amount <issued item>, enjoy non-stop". The first template is stored, which completes the persistence of the marketing copy template.
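Turning a generated copy into a template amounts to replacing keyword spans with their types. The sketch below is illustrative only: `to_template` is a hypothetical helper, and the spans are assumed to come from the keyword recognition model (here they are written by hand for the earlier example 快来领银行红包啦!).

```python
from typing import List, Tuple


def to_template(text: str, typed_spans: List[Tuple[int, int, str]]) -> str:
    """Replace each (start, end) keyword span with its type placeholder.

    Spans are processed from right to left so that earlier indices stay valid
    while the string length changes.
    """
    for start, end, type_name in sorted(typed_spans, reverse=True):
        text = text[:start] + type_name + text[end:]
    return text


template = to_template(
    "快来领银行红包啦!",
    [(3, 5, "<company name>"), (5, 7, "<issued item>")],
)
# template == "快来领<company name><issued item>啦!"
```

Storing `template` in the template library corresponds to the persistence step of S54.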

S103. Find the position of the target text type in the target template, and replace the field information corresponding to the target text type with the field information of the text keyword at the position, so as to obtain the target text containing the text keyword.

In the embodiment of the present application, after the text generation device finds the target template containing the target text type in the template library, it searches for the position of the target text type in the target template and replaces the field information corresponding to the target text type at that position with the field information of the text keyword, to obtain the target text containing the text keyword.

It should be noted that the target text is the text corresponding to the text generation instruction.

Exemplarily, a schematic diagram of an exemplary text generation method is shown in Figure 6:

S61. In the case of receiving the text generation instruction, the text generation device acquires text keywords from the text generation instruction.

S62. In the case that the text generation instruction does not carry the target text type, the text generation device inputs text keywords into the type recognition model to obtain the target text type.

S63. In the case that the text generation instruction carries the target text type, the text generation device obtains the target text type from the text generation instruction.

S64. In the case that there is a target template including the target text type in the template library, the text generation device acquires the target template from the template library.

S65. The text generation device searches for the position of the target text type in the target template, and replaces the field information corresponding to the target text type with the field information of the text keyword at the position, to obtain the target text containing the text keyword.

S66. In the case that the template library does not contain the target template of the target text type, the text generation device determines at least two empty positions formed according to the text keywords and at least two groups of character quantities corresponding to the at least two empty positions.

S67. The text generating device splices at least two empty positions and keywords according to at least two sets of characters to obtain splicing information.

S68. The text generation device inputs the splicing information into the text generation model to obtain at least two sets of target character information corresponding to at least two empty positions.

S69. The text generation device adds at least two sets of target character information to at least two empty positions in the splicing information to obtain the target text.
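The branching in S61 to S69 can be summarized in a short sketch. Everything here is an assumption made for illustration: the function signature, the representation of the template library as a list of strings, and the two callbacks `recognise_type` (standing in for the type recognition model) and `generate_with_model` (standing in for S66 to S69).

```python
from typing import Callable, List, Optional


def generate_text(
    keywords: List[str],
    carried_types: Optional[List[str]],
    template_library: List[str],
    recognise_type: Callable[[str], str],
    generate_with_model: Callable[[List[str]], str],
) -> str:
    # S62 / S63: use the types carried by the instruction, else recognise them.
    types = carried_types or [recognise_type(k) for k in keywords]

    # S64 / S65: look for a template that contains every target text type,
    # then substitute each keyword at the position of its type.
    for template in template_library:
        if all(t in template for t in types):
            text = template
            for type_name, keyword in zip(types, keywords):
                text = text.replace(type_name, keyword, 1)
            return text

    # S66 to S69: no suitable template, fall back to the text generation model.
    return generate_with_model(keywords)
```

A library hit returns a filled template without calling the model; an empty or non-matching library delegates to the model callback.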

Exemplarily, an exemplary text generation method includes a seed stage and an automatic training stage, as shown in FIG. 7.

In the seed stage, the second sample text is obtained first and manually annotated to obtain the second sample keyword corresponding to the second sample text, the second sample type corresponding to the second sample text, and the third position of the second sample keyword in the second sample text. The second sample keyword, the second sample type, the third position and the second sample text are used to train the initial keyword recognition model to obtain the keyword recognition model (training the keyword recognition model). The initial type recognition model is trained using the second sample keyword and the second sample text type to obtain the type recognition model (training the type recognition model).

In the automatic training stage, the first sample text is obtained and input into the keyword recognition model to obtain the first sample keyword and the first sample type corresponding to the first sample text, and the first position of the first sample keyword in the first sample text (the keyword recognition model is used to annotate the first sample text). The first sample keyword is input into the text generation model to obtain the first output text, and the keyword recognition model is used to determine the second position of the first sample keyword in the first output text. At the second position in the first output text, the first sample keyword is replaced with the first sample type to obtain the first template; at the first position in the first sample text, the first sample keyword is replaced with the first sample type to obtain the second template. The first template and the second template are used as text templates (obtaining text templates) and added to the template library, so that when a text generation instruction is received, the target text containing the text keyword is obtained according to the text keyword in the text generation instruction and the target template in the template library.

It can be understood that, when the text generation device receives the text generation instruction, it obtains the text keyword from the text generation instruction, searches the template library for a target template that includes the target text type corresponding to the text keyword, finds the position of the target text type in the target template, and replaces the field information corresponding to the target text type at that position with the field information of the text keyword, so as to obtain the target text containing the text keyword. The text information does not need to be obtained manually, which improves the intelligence of the text generation device when generating text information.

Embodiment two

Based on the same inventive concept of Embodiment 1, the embodiment of the present application provides a text generating device 1 corresponding to a text generating method; FIG. Text generating device 1 may include:

The obtaining part 11 is configured to, in the case of receiving a text generation instruction, obtain text keywords from the text generation instruction; and, in the case that a target template containing the target text type exists in the template library, obtain the target template from the template library; the templates in the template library are text templates provided with text types;

The determining part 12 is configured to determine the target text type corresponding to the text keyword;

The replacement part 13 is configured to replace the field information corresponding to the target text type with the field information of the text keyword at the position, so as to obtain the target text containing the text keyword.

In some embodiments of the present application, the device further includes an input part and an adding part;

The obtaining part 11 is configured to obtain the first sample text;

The input part is configured to input the first sample text into the keyword recognition model to obtain the first sample keyword and the first sample type corresponding to the first sample text, and the first position of the first sample keyword in the first sample text; input the first sample keyword into the text generation model to obtain the first output text; and obtain the text template according to the first output text, the first sample text, the first sample keyword, the first sample type and the first position;

The adding part is configured to add the text template to the template library.

In some embodiments of the present application, the determining part 12 is configured to determine a second position of the first sample keyword in the first output text by using a keyword recognition model;

The replacement part 13 is configured to replace the first sample keyword with the first sample type at the second position in the first output text to obtain a first template; replace the first sample keyword with the first sample type at the first position in the first sample text to obtain a second template; and use the first template and the second template as the text template.

In some embodiments of the present application, the device further includes a training part;

The obtaining part 11 is configured to obtain the second sample text, the second sample keyword corresponding to the second sample text, the second sample type corresponding to the second sample text, and the third position of the second sample keyword in the second sample text;

The training part is configured to use the second sample keyword, the second sample type, the third position and the second sample text to train an initial keyword recognition model to obtain the keyword recognition model.

In some embodiments of the present application, the device further includes a splicing part;

The determining part 12 is configured to determine at least two empty positions formed according to the text keywords and at least two groups of character quantities corresponding to the at least two empty positions; the at least two empty positions correspond one-to-one to the at least two groups of character quantities;

The splicing part is configured to splice the at least two empty positions and the keyword according to the at least two groups of characters to obtain splicing information;

The input part is configured to input the splicing information into the text generation model to obtain at least two sets of target character information corresponding to the at least two empty positions;

The adding part is configured to add the at least two groups of target character information to the at least two empty positions in the splicing information to obtain the target text.

In some embodiments of the present application, the input part is configured to input the text keywords into the type recognition model to obtain the target text type when the target text type is not carried in the text generation instruction;

The obtaining part 11 is configured to obtain the target text type from the text generation instruction if the text generation instruction carries the target text type.

In some embodiments of the present application, the obtaining part 11 is configured to obtain a second sample keyword and a second sample text type;

The training part is configured to use the second sample keywords and the second sample text type to train an initial type recognition model to obtain the type recognition model.

It should be noted that, in practical applications, the above-mentioned obtaining part 11, determining part 12 and replacement part 13 can be implemented by the processor 14 on the text generation device 1, specifically by a CPU (Central Processing Unit), an MPU (Microprocessor Unit), a DSP (Digital Signal Processor), a Field Programmable Gate Array (FPGA) or the like; the above-mentioned data storage can be implemented by the memory 15 on the text generation device 1.

The embodiment of the present application also provides a text generating device 1. As shown in FIG. The processor 14 communicates, and the memory 15 stores a program executable by the processor 14. When the program is executed, the processor 14 executes the text generation method as described above.

In practical applications, the above-mentioned memory 15 can be a volatile memory, such as a random access memory (Random-Access Memory, RAM); or a non-volatile memory, such as a read-only memory (Read-Only Memory, ROM), flash memory, hard disk (Hard Disk Drive, HDD) or solid-state drive (Solid-State Drive, SSD); the memory 15 provides instructions and data to the processor 14.

An embodiment of the present application provides a computer-readable storage medium, on which a computer program is carried, and when the program is executed by the processor 14, the text generation method as described above is implemented.

Those skilled in the art should understand that the embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage, etc.) having computer-usable program code embodied therein.

The present application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the present application. It should be understood that each procedure and/or block in the flowchart and/or block diagram, and a combination of procedures and/or blocks in the flowchart and/or block diagram, can be realized by computer program instructions. These computer program instructions may be provided to a general purpose computer, special purpose computer, embedded processor, or processor of other programmable data processing equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing equipment produce an apparatus for realizing the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction device, and the instruction device realizes the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions can also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.

The above descriptions are only preferred embodiments of the present application, and are not intended to limit the protection scope of the present application.

Industrial Applicability

Embodiments of the present application provide a text generation method and device, and a storage medium. The text generation method includes: in the case of receiving a text generation instruction, obtaining text keywords from the text generation instruction, and determining the target text type corresponding to the text keywords; if there is a target template containing the target text type in the template library, obtaining the target template from the template library, where the templates in the template library are text templates provided with text types; finding the position of the target text type in the target template, and replacing the field information corresponding to the target text type with the field information of the text keyword at the position to obtain the target text containing the text keyword. With this scheme, when the text generation device receives the text generation instruction, it obtains the text keywords from the instruction, searches the template library for the target template that includes the target text type corresponding to the text keywords, finds the position of the target text type in the target template, and replaces the field information corresponding to the target text type at that position with the field information of the text keyword, so as to obtain the target text containing the text keyword. The text information does not need to be obtained manually, which improves the intelligence of the text generation device when generating text information.

Claims

A text generation method, the method comprising:

In the case of receiving a text generation instruction, acquiring text keywords from the text generation instruction, and determining the target text type corresponding to the text keywords;

In the case that there is a target template containing the target text type in the template library, the target template is obtained from the template library; the template in the template library is a text template provided with a text type;

Find the position of the target text type in the target template, and replace the field information corresponding to the target text type with the field information of the text keyword at the position, to obtain the target containing the text keyword text.
The method according to claim 1, wherein, before obtaining the target template from the template library, the method further comprises:

Obtaining a first sample text; and inputting the first sample text into the keyword recognition model to obtain the first sample keyword and the first sample type corresponding to the first sample text, and the first position of the first sample keyword in the first sample text;

Inputting the first sample keywords into the text generation model to obtain the first output text;

Obtaining the text template according to the first output text, the first sample text, the first sample keywords, the first sample type and the first position, and adding the text template to the template library.
The method according to claim 2, wherein said obtaining the text template according to said first output text, said first sample text, said first sample keywords, said first sample type and said first position comprises:

using a keyword recognition model to determine a second position of the first sample keyword in the first output text;

At the second position in the first output text, replace the first sample keyword with the first sample type to obtain a first template;

At the first position in the first sample text, using the first sample type to replace the first sample keyword to obtain a second template;

The first template and the second template are used as the text templates.
The method according to claim 2, wherein, before the first sample text is input into the keyword recognition model to obtain the first sample keyword and the first sample type corresponding to the first sample text, and the first position of the first sample keyword in the first sample text, the method further comprises:

Obtaining the second sample text, the second sample keyword corresponding to the second sample text, the second sample type corresponding to the second sample text, and the third position of the second sample keyword in the second sample text ;

An initial keyword recognition model is trained by using the second sample keyword, the second sample type, the third position and the second sample text to obtain the keyword recognition model.
The method according to claim 1, wherein, after determining the target text type corresponding to the text keyword, the method further comprises:

If the template library does not contain a target template of the target text type, determining at least two empty positions formed according to the text keywords and at least two groups of character quantities corresponding to the at least two empty positions; the at least two empty positions correspond to the at least two groups of character quantities;

splicing the at least two empty positions and the keyword according to the at least two groups of characters to obtain splicing information;

Inputting the splicing information into the text generation model to obtain at least two groups of target character information corresponding to the at least two empty positions;

Adding the at least two groups of target character information to the at least two empty positions in the splicing information to obtain the target text.
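The fallback path above can be sketched as follows. This is a minimal illustration under stated assumptions: empty positions are encoded as integers (their character amounts), one empty position is placed before the keyword and one after it, and `stub_model` is a canned stand-in for the text generation model, which the claims do not specify. All function names are hypothetical.

```python
from typing import Callable, List, Sequence, Union

# Hypothetical encoding: an int is an empty position holding that many characters,
# a str is literal text (the keyword).
Segment = Union[int, str]


def build_splicing_info(keyword: str, amounts: Sequence[int]) -> List[Segment]:
    """Put one empty position before and one after the keyword."""
    if len(amounts) != 2:
        raise ValueError("this sketch handles exactly two empty positions")
    return [amounts[0], keyword, amounts[1]]


def fill_blanks(splicing_info: List[Segment],
                model: Callable[[List[Segment]], List[str]]) -> str:
    """Ask the model for one string per empty position and insert each in place."""
    blank_count = sum(isinstance(s, int) for s in splicing_info)
    pieces = model(splicing_info)
    if len(pieces) != blank_count:
        raise ValueError("model must return one string per empty position")
    remaining = iter(pieces)
    return "".join(next(remaining) if isinstance(s, int) else s
                   for s in splicing_info)


def stub_model(splicing_info: List[Segment]) -> List[str]:
    """Stand-in for the text generation model: canned text cut to each character amount."""
    canned = iter(["Try ", " for a steady return."])
    return [next(canned)[:s] for s in splicing_info if isinstance(s, int)]


splicing_info = build_splicing_info("product A", [4, 21])
target_text = fill_blanks(splicing_info, stub_model)
```

Passing the model in as a callable keeps the splicing and filling logic independent of whichever generation model is actually used.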
The method according to claim 1, wherein said determination of the target text type corresponding to said text keyword comprises:

In the case that the text generation instruction does not carry the target text type, inputting the text keyword into a type recognition model to obtain the target text type;

If the target text type is carried in the text generation instruction, the target text type is obtained from the text generation instruction.
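The two branches above amount to a simple precedence rule, sketched below. The dictionary layout of the instruction (keys `"keyword"` and `"type"`) and the function names are assumptions for illustration; `recognize_type` stands in for the type recognition model.

```python
from typing import Callable, Mapping


def resolve_target_type(instruction: Mapping[str, str],
                        recognize_type: Callable[[str], str]) -> str:
    """Prefer a type carried in the instruction; otherwise ask the recognition model."""
    carried = instruction.get("type")
    if carried:
        return carried
    return recognize_type(instruction["keyword"])


def stub_recognizer(keyword: str) -> str:
    """Stand-in for the type recognition model (hypothetical fixed answer)."""
    return "PRODUCT"


with_type = resolve_target_type({"keyword": "Shenzhen", "type": "CITY"}, stub_recognizer)
without_type = resolve_target_type({"keyword": "product A"}, stub_recognizer)
```

In this sketch the model is consulted only when the instruction carries no type, so an explicitly supplied type is never overridden.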
The method according to claim 6, wherein, before said inputting said text keywords into a type recognition model and obtaining said target text type, said method further comprises:

Obtain a second sample keyword and a second sample text type;

An initial type recognition model is trained by using the second sample keywords and the second sample text type to obtain the type recognition model.
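The training step above can be illustrated with a deliberately small stand-in. The claims do not state what kind of model the type recognition model is; the sketch below swaps in a toy character-bigram overlap classifier purely to show the flow from (keyword, type) samples to a usable recognizer. The sample data and all names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def bigrams(text: str) -> List[str]:
    """Lowercased character bigrams of a keyword."""
    lowered = text.lower()
    return [lowered[i:i + 2] for i in range(len(lowered) - 1)]


def train(samples: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Accumulate one bigram count profile per text type."""
    profiles: Dict[str, Counter] = defaultdict(Counter)
    for keyword, text_type in samples:
        profiles[text_type].update(bigrams(keyword))
    return dict(profiles)


def predict(profiles: Dict[str, Counter], keyword: str) -> str:
    """Return the type whose profile overlaps most with the keyword's bigrams."""
    grams = bigrams(keyword)
    # sorted() makes tie-breaking deterministic (alphabetical order of types).
    return max(sorted(profiles),
               key=lambda text_type: sum(profiles[text_type][g] for g in grams))


# Hypothetical training samples: (sample keyword, sample text type).
samples = [
    ("savings account", "PRODUCT"),
    ("fixed deposit", "PRODUCT"),
    ("Shenzhen", "CITY"),
    ("Shanghai", "CITY"),
]
type_model = train(samples)
```

A call such as `predict(type_model, "deposit account")` then returns the type whose training keywords share the most bigrams with the input.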
A text generation device, said device comprising:

The obtaining part is configured to obtain a text keyword from a text generation instruction when the text generation instruction is received, and, when a target template containing the target text type exists in the template library, to obtain the target template from the template library; the templates in the template library are text templates provided with text types;

A determining part configured to determine a target text type corresponding to the text keyword;

The replacement part is configured to find the position of the target text type in the target template and, at the position, replace the field information corresponding to the target text type with the field information of the text keyword, so as to obtain the target text containing the text keyword.
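One way to picture how the three parts cooperate is the Python sketch below. It is a simplified illustration, not the claimed apparatus: the class name, the `"[TYPE]"` placeholder format, the dictionary-based template library, and the lookup table that stands in for type determination are all assumptions.

```python
from typing import Dict, List, Mapping, Optional


class TextGenerator:
    """Hypothetical grouping of the obtaining, determining and replacement parts."""

    def __init__(self, template_library: Dict[str, List[str]],
                 keyword_types: Dict[str, str]) -> None:
        self.template_library = template_library
        self.keyword_types = keyword_types  # stand-in for type determination

    def obtain_keyword(self, instruction: Mapping[str, str]) -> str:
        return instruction["keyword"]

    def determine_type(self, keyword: str) -> str:
        return self.keyword_types[keyword]

    def obtain_template(self, target_type: str) -> Optional[str]:
        templates = self.template_library.get(target_type)
        return templates[0] if templates else None

    def replace(self, template: str, target_type: str, keyword: str) -> str:
        placeholder = "[" + target_type + "]"
        position = template.find(placeholder)  # position of the type in the template
        if position < 0:
            raise ValueError("template does not contain the target text type")
        return template[:position] + keyword + template[position + len(placeholder):]

    def generate(self, instruction: Mapping[str, str]) -> Optional[str]:
        keyword = self.obtain_keyword(instruction)
        target_type = self.determine_type(keyword)
        template = self.obtain_template(target_type)
        if template is None:
            return None  # no matching template; a fallback path would apply here
        return self.replace(template, target_type, keyword)


generator = TextGenerator(
    template_library={"PRODUCT": ["Deposit [PRODUCT] offers a steady return."]},
    keyword_types={"product B": "PRODUCT", "Shenzhen": "CITY"},
)
generated = generator.generate({"keyword": "product B"})
```

Returning `None` when no template matches leaves room for a separate generation path without mixing it into the template logic.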
A text generation device, said device comprising:

a memory, a processor, and a communication bus, wherein the memory communicates with the processor through the communication bus, the memory stores a text generation program executable by the processor, and when the text generation program is executed, the processor performs the method according to any one of claims 1 to 7.
A storage medium, on which a computer program is stored, applied to a text generating device, and the computer program is executed by a processor to implement the method described in any one of claims 1 to 7.