CN117807961A - Training method and device of text generation model, medium and electronic equipment - Google Patents


Info

Publication number
CN117807961A
Authority
CN
China
Prior art keywords
text
title
determining
generation model
document
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410236982.8A
Other languages
Chinese (zh)
Other versions
CN117807961B (en)
Inventor
蔡京京
董波
柏洁明
葛俊
孔祥夫
周宏豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab
Priority to CN202410236982.8A
Priority claimed from CN202410236982.8A
Publication of CN117807961A
Application granted
Publication of CN117807961B
Active legal status
Anticipated expiration legal status

Classifications

    • Y — General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02 — Technologies or applications for mitigation or adaptation against climate change
    • Y02D — Climate change mitigation technologies in information and communication technologies [ICT], i.e. information and communication technologies aiming at the reduction of their own energy use
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval; Database Structures and File System Structures Therefor (AREA)

Abstract

The specification discloses a training method, apparatus, medium and electronic device for a text generation model, comprising the following steps: a collected document is determined, and for each title included in the document, the superior title of that title in the document is determined. A prompt text for the title is determined according to the superior title, which helps the text generation model take the superior title into account when generating the content under the title. The prompt text and the title are then input into a pre-trained initial text generation model, which determines an output text. The text corresponding to the title in the document is determined and taken as the target text. The initial text generation model is trained according to the target text and the output text to obtain the text generation model, improving the accuracy of the text the model generates.

Description

Training method and device of text generation model, medium and electronic equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a training method, apparatus, medium, and electronic device for a text generation model.
Background
With the continuous development of science and technology, natural language models are being applied ever more widely, particularly in the field of document writing.
At present, when writing a document, the relevant requirements are generally input into a trained model, and the document is generated by that model. However, because the model's language understanding ability is limited and it lacks creativity and flexibility, the documents it generates are of low quality. How to train a text generation model that generates high-quality documents is therefore an important problem.
Based on this, the present specification provides a training method of a text generation model.
Disclosure of Invention
The present disclosure provides a training method, device, medium and electronic device for a text generation model, so as to partially solve the foregoing problems in the prior art.
The technical scheme adopted in the specification is as follows:
the specification provides a training method of a text generation model, which comprises the following steps:
determining a collected document, wherein the document comprises a plurality of titles;
determining, for each title included in the document, a superior title of the title in the document;
determining a prompt text of the title according to the superior title;
inputting the prompt text and the title into a pre-trained initial text generation model, and determining an output text; wherein the initial text generation model is a large language model;
determining a text corresponding to the title in the document and taking the text as a target text;
training the initial text generation model according to the target text and the output text to obtain a text generation model; the text generation model is used for generating documents required by the user according to the text input by the user.
Optionally, the prompt text includes a contact prompt;
according to the superior title, determining the prompt text of the title specifically comprises the following steps:
splicing the upper-level titles;
and taking the spliced result as a contact prompt of the title.
Optionally, the prompt text includes a distinguishing prompt;
according to the superior title, determining the prompt text of the title specifically comprises the following steps:
determining a peer title of the title in the document;
and splicing the peer title, the specified text and the upper-level title to obtain a distinguishing prompt of the title.
Optionally, the method further comprises:
determining a text input by the user as a first text in response to a first input operation of the user;
inputting the first text into the text generation model to determine an output text;
determining a target document required by the user according to the output text;
and displaying the target document to the user.
Optionally, inputting the first text into the text generation model, and determining output text specifically includes:
determining literature materials;
determining a document matching the first text from the literature material and taking it as a reference;
the reference and the first text are input into the text generation model, and output text is determined.
Optionally, the method further comprises:
determining a text input by the user as a second text in response to a second input operation of the user;
inputting the second text into the text generation model, determining a document outline and displaying the document outline to the user; wherein the document outline comprises a plurality of titles;
determining a title selected by the user as a first title in response to a first selection operation of the user;
inputting the first title into the text generation model, and determining an output text corresponding to the first title;
determining a target document required by the user according to the output text corresponding to the first title;
and displaying the target document to the user.
Optionally, determining literature material specifically includes:
and responding to the uploading operation of the user, determining the data uploaded by the user and taking the data as literature material.
The specification provides a training device of a text generation model, comprising:
the first determining module is used for determining collected documents, wherein the documents comprise a plurality of titles;
a second determining module, configured to determine, for each title included in the document, a superior title of the title in the document;
the generation prompt module is used for determining the prompt text of the title according to the upper-level title;
the text generation module is used for inputting the prompt text and the title into a pre-trained initial text generation model and determining an output text; wherein the initial text generation model is a large language model;
the third determining module is used for determining a text corresponding to the title in the document and taking the text as a target text;
the training module is used for training the initial text generation model according to the target text and the output text to obtain a text generation model; the text generation model is used for generating documents required by the user according to the text input by the user.
The present specification provides a computer readable storage medium storing a computer program which when executed by a processor implements the training method of the text generation model described above.
The present specification provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the training method of the text generation model described above when executing the program.
At least one of the technical solutions adopted in this specification can achieve the following beneficial effects:
the training method of the text generation model provided by the specification determines collected documents, and determines the upper-level title of each title in the documents according to each title included in the documents. And determining the prompt text of the title according to the upper-level title. The prompt text and the headline are then input into a pre-trained initial text generation model, which determines the output text. And determining the text corresponding to the title in the text, and taking the text as a target text. Training the initial text generation model according to the target text and the output text to obtain a text generation model.
As can be seen from the above method, when training the text generation model, the present application determines the collected documents, and for each title included in the documents, determines the upper level title of the title in the document. The prompt text of the title is determined according to the upper-level title, which is helpful for the text generation model to be influenced by the upper-level title of the title when generating the content under the title. The prompt text and the headline are then input into a pre-trained initial text generation model, which determines the output text. And determining the text corresponding to the title in the text, and taking the text as a target text. Training the initial text generation model according to the target text and the output text to obtain a text generation model, and improving the accuracy of the text generated by the text generation model.
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification, illustrate the exemplary embodiments of the present specification and, together with their description, serve to explain the specification; they are not intended to limit the specification unduly. In the drawings:
FIG. 1 is a flow chart of a training method of a text generation model provided in the present specification;
FIG. 2 is a schematic illustration of a document provided in the present specification;
FIG. 3 is a schematic illustration of training of a text generation model provided in the present specification;
FIG. 4 is a schematic diagram of a process for generating a target document provided in the present specification;
FIG. 5 is a schematic diagram of a training device for a text generation model provided herein;
fig. 6 is a schematic structural diagram of an electronic device corresponding to fig. 1 provided in the present specification.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present specification will be clearly and completely described below with reference to specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.
Fig. 1 is a flow chart of a training method of a text generation model provided in the present specification, including the following steps:
s100: determining a collected document, wherein the document comprises a plurality of titles.
In this specification, the device for training the text generation model may determine the collected document. The device may be a server, or an electronic device such as a desktop computer or a notebook computer. For convenience of description, the training method of the text generation model provided in this specification is described below with the server as the execution subject.
In this specification, a document refers to an official document, i.e. a document with legal effect and a prescribed format that certain organizations or social groups form and use in official activities. The official document includes several titles; here, the titles are the titles of all levels included in the official document except for its name. There is a superior-subordinate relationship between titles: each title corresponds to one level, and a title of a higher level contains titles of lower levels. Each title included in the document has at least one superior title. Of course, besides the titles, the document also includes the document name and the content text corresponding to each title, that is, each title (at every level) has a corresponding content text.
In this specification, when one title contains another title, the former is a superior title of the latter, and the latter is a subordinate title of the former. Take two levels as an example, i.e., the document contains only titles of a first level and a second level; a title of the first level is a primary title, and a title of the second level is a secondary title. For each title included in the document, when the title is not contained in any other title, its level is determined to be the first level and it is determined to be a primary title, and the titles it contains are determined to be secondary titles. At the same time, the superior title of the primary title is determined to be the document name, and the superior title of each secondary title is determined to be that primary title.
For example, suppose the document name is "Advance traffic construction", and the document includes four primary titles: research background, research question, overall content, and summary. The research question further includes three secondary titles: traffic question, social question, and development question. The overall content also comprises four secondary titles: construction object, construction mode, specific content, and construction effect.
In this specification, within a document, apart from the document name, the server may determine the level corresponding to each title according to the containment relationship between the titles. When a title is not contained by any other title, that is, when it is not a subordinate title of another title, it is a title of level one, and its superior title is the document name. The title of the next level below a primary title is a secondary title (i.e., a title of level two), and the title of the next level below a secondary title is a tertiary title (i.e., a title of level three); the level corresponding to every title in the document is determined in the same way. The containment relationship between titles may be determined in advance by an annotator; specifically, the server may send the document to the annotator, who determines the containment relationship between the titles from the received document.
Of course, the server may also send the document directly to the annotator, who determines the level corresponding to each title included in the document from the received document and returns the levels; the server can then determine the level of each title in response to the annotator's level-input operation. In addition, the server can determine the level of each title included in the document according to a preset title template; specifically, the server may use any existing matching algorithm or machine learning model to determine, based on the preset title template, the level corresponding to each title.
In this specification, the server may collect and save documents of each region based on a crawler written in Python, or in any other existing manner. Meanwhile, in order to collect more documents, the server can periodically collect and store the documents of each region.
S102: for each title included in the document, determining an upper level title of the title in the document.
The server may determine, for each title included in the document, the superior title of the title in the document. Because the document includes several titles and there is a superior-subordinate relationship between them, each title in the document has a superior title, and the superior title of a primary title is the document name. Specifically, the level corresponding to each title in the document is determined; for each title, the level corresponding to that title is taken as the target level. Among the titles, the superior titles of the title are then determined according to the target level.
When determining the superior titles of the title according to the target level, the server may, when the target level is level one, determine the document name and take it as the superior title of the title. When the target level is not level one, the server may determine, according to the target level, the previous-level title of the title, whose level is the target level minus 1 (i.e., the difference between the target level and the level of the previous-level title is 1) and which contains the title. The server then judges whether the level of that previous-level title is level one; if so, it determines the document name and takes the document name together with the previous-level title as the superior titles of the title. If not, the previous-level title is taken as a first title, and the previous-level title of that first title is determined in the same way, until a previous-level title of level one is reached; the document name and all the first titles are then taken as the superior titles of the title.
Continuing with the above example, assume the title is the social question, which is a secondary title, so the level corresponding to the social question (i.e., the target level) is level two, not level one. The server may determine the previous-level title of the social question according to the target level, namely the research question, whose level is level one and which contains the social question, so the two have a containment relationship. Since the level of the research question is level one, the server can determine the document name, i.e. "Advance traffic construction", and take the document name and the previous-level title as the superior titles of the social question, i.e. "Advance traffic construction" and the research question.
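The level-by-level lookup described above can be sketched in Python. The `parent` mapping and the function name are illustrative assumptions; the patent does not prescribe a concrete data structure.

```python
def superior_titles(title, parent, document_name):
    """Walk up the containment relation one level at a time and return all
    superior titles of `title`; the chain always ends with the document
    name, which is the superior title of every level-one title."""
    chain = []
    current = parent.get(title)  # the previous-level title containing `title`
    while current is not None:
        chain.append(current)
        current = parent.get(current)
    chain.append(document_name)
    return chain

# Containment relation from the running example: the research question is a
# level-one title (no containing title) and contains the social question.
parent = {"research question": None, "social question": "research question"}

superior_titles("social question", parent, "Advance traffic construction")
# → ["research question", "Advance traffic construction"]
```

The level of a title also falls out of this walk: it is simply the length of the returned chain.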
In this specification, when determining the superior titles of each title in the document, the server may also determine, in response to a first operation of the annotator, the first set corresponding to each title included in the document. For each title, the first set corresponding to the title is determined, and the titles contained in that first set are taken as the superior titles of the title. The first operation represents the annotator uploading the first sets; the first set corresponding to a title contains all superior titles of that title, and may be constructed in advance by the annotator and uploaded to the server. Specifically, the server may send the document to the annotator, who determines each title included in the document from the received document, then determines, for each title, its superior titles, and generates the first set of the title from them. The annotator then uploads the first sets corresponding to the titles to the server. The server, in response to the annotator's first operation, determines the first set corresponding to each title included in the document, and for each title takes the titles contained in its first set as its superior titles.
Of course, the server may also determine, for each title, the superior titles of the title in the document by any other existing means, which is not specifically limited in this specification. In addition, the superior titles determined above are all of the title's superior titles, and the difference between the level of the title and the level of a determined superior title is not less than 1.
S104: and determining the prompt text of the title according to the upper-level title.
The server may determine the prompt text of the title according to the superior titles. When the title is a primary title, it has only one superior title, namely the document name. When the title is not a primary title, it has at least two superior titles, which include the document name.
In this specification, the superior titles of the title are all the other titles, or the document name, that contain the title. The server can take the superior titles of the title as the prompt text of the title, thereby adding a contact prompt for the text generation model. This helps the model be influenced by the superior titles when generating the content under the title, fully considers the connection between superior and subordinate titles, prevents the content generated under the title from deviating from the range covered by its superior titles, and improves the accuracy of the text generated by the model. Based on this, the prompt text includes a contact prompt; the contact prompt contains the superior titles and is used to prompt the text generation model to generate the content under the title based on the superior titles of the title. Specifically, the server may splice the superior titles and take the spliced result as the contact prompt of the title.
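A minimal sketch of the splicing step, assuming the superior titles are joined with a plain separator (the patent does not fix a concrete splicing format):

```python
def contact_prompt(superior_titles, sep=", "):
    # Splice the superior titles; the spliced result is the contact prompt
    # that reminds the model to generate content under the title based on
    # its superior titles.
    return sep.join(superior_titles)

contact_prompt(["Advance traffic construction", "research question"])
# → "Advance traffic construction, research question"
```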
In this specification, besides increasing the relatedness between titles, it is also necessary to increase the distinction between titles, so as to avoid influence between peer titles and fully consider their differences. Peer titles are titles that have the same level and are contained by the same superior title. For example, if a primary title includes two secondary titles, title 1 and title 2, then title 1 and title 2 are peer titles. Based on this, the prompt text may include a distinguishing prompt; the distinguishing prompt contains the superior titles and the peer titles, and is used to prompt the text generation model to generate the content under the title based on the superior titles and the peer titles of the title. Specifically, when determining the prompt text of the title according to the superior titles, the server may determine the peer titles of the title in the document, and splice the peer titles, a specified text and the superior titles to obtain the distinguishing prompt of the title. The specified text may be a preset negative text, such as "not" or "not identical", and is not specifically limited in this specification.
When determining the peer titles of the title in the document, the server may determine, in response to a second operation of the annotator, the second set corresponding to each title included in the document. The second set of the title is determined from these second sets, and the titles included in the second set of the title are taken as its peer titles. The second operation represents the annotator uploading the second sets; the second set of a title contains all peer titles of that title, and may be constructed in advance by the annotator and uploaded to the server. Specifically, the server may send the document to the annotator, who determines each title included in the document from the received document, determines, for each title, its peer titles, and generates the second set of the title. The annotator then uploads the second sets corresponding to the titles to the server. The server, in response to the annotator's second operation, determines the second set corresponding to each title included in the document, determines the second set of the title, and takes the titles contained in it as the peer titles of the title.
In addition, when determining the peer titles of the title in the document, the server may also take the level corresponding to the title as a standard level and the superior title of the title as a standard title. From the titles, the other titles whose level is the standard level, excluding the title itself, are determined. Then, among these other titles, those whose superior title is the standard title are determined as the peer titles of the title.
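The standard-level/standard-title lookup and the distinguishing prompt can be sketched as follows. The `levels` and `parent` mappings, the separator, and the choice of "not" as the specified text are all illustrative assumptions:

```python
def peer_titles(title, levels, parent):
    """Other titles with the same level (the standard level) and the same
    superior title (the standard title) as `title`."""
    return [t for t in levels
            if t != title
            and levels[t] == levels[title]
            and parent.get(t) == parent.get(title)]

def distinguishing_prompt(title, levels, parent, specified="not"):
    # Splice the peer titles, the specified (negative) text and the
    # directly containing superior title into the distinguishing prompt.
    peers = peer_titles(title, levels, parent)
    return " ".join(peers) + " " + specified + " " + str(parent.get(title))

# Running example: the research question contains three secondary titles.
levels = {"traffic question": 2, "social question": 2, "development question": 2}
parent = {t: "research question" for t in levels}

peer_titles("social question", levels, parent)
# → ["traffic question", "development question"]
```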
S106: inputting the prompt text and the title into a pre-trained initial text generation model, and determining an output text; wherein the initial text generation model is a large language model.
S108: and determining the text corresponding to the title in the public text and taking the text as a target text.
S110: training the initial text generation model according to the target text and the output text to obtain a text generation model; the text generation model is used for generating documents required by the user according to the text input by the user.
The server may input the prompt text and the title into the pre-trained initial text generation model to determine the output text. The server may then determine the text corresponding to the title in the document and take it as the target text, and then train the initial text generation model according to the target text and the output text to obtain the text generation model. The initial text generation model is a pre-trained large language model, which may be a GPT model. The text generation model is used to generate the document required by a user according to the text input by the user. The target text in step S108 is the text corresponding to the title itself, i.e., text that describes the title; it may be a paragraph, an abstract, or the like.
In this specification, under the document, each title has a corresponding content text. When the title contains other titles, i.e., when the title has subordinate titles, the content text corresponding to the title includes the text corresponding to the title itself, the subordinate titles of the title, and the texts corresponding to those subordinate titles, whereas the target text is only the text corresponding to the title itself. Therefore, in step S108, when the title contains other titles, the server determines the text corresponding to the title itself in the document and takes it as the target text; the text corresponding to the title itself is the text that describes the title.
However, when the title does not contain other titles, i.e., when the title has no subordinate title, the content text corresponding to the title consists only of the text corresponding to the title itself, and the target text can be the content text corresponding to the title. Therefore, in step S108, when the title does not contain other titles, the server determines the content text corresponding to the title in the document and takes it as the target text.
Continuing with the example in step S102 and referring to fig. 2, which is a schematic diagram of a document provided in this specification: assume the title is the overall content. Text 1 in fig. 2 is the text corresponding to the overall content itself. The overall content contains four other titles, namely construction object, construction mode, specific content and construction effect, i.e., the overall content has four subordinate titles. Text 2 in fig. 2 is the content text corresponding to the overall content; it includes the text corresponding to the overall content itself (i.e., text 1), the subordinate titles of the overall content, and the texts corresponding to those subordinate titles (e.g., text 3 and text 4). Since the target text is only the text corresponding to the title itself, and the overall content contains other titles, i.e., has subordinate titles, the server can determine the text corresponding to the overall content itself in the document and take it as the target text, namely text 1.
Further, assume the title is the construction object. Text 5 in fig. 2 is the text corresponding to the construction object itself, and is also the content text corresponding to the construction object. The construction object is a secondary title that contains no other titles, i.e., it has no subordinate title, so the server can determine the content text corresponding to the construction object in the document and take it as the target text, namely text 5.
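The target-text rule of step S108 reduces to a simple branch. Here `own_text` and `content_text` are hypothetical mappings standing in for the two kinds of text shown in fig. 2:

```python
def target_text(title, children, own_text, content_text):
    # When the title contains subordinate titles, only the text corresponding
    # to the title itself is the target text; otherwise the content text
    # (which then equals the title's own text) is the target text.
    if children.get(title):
        return own_text[title]
    return content_text[title]

# Fig. 2 example: the overall content has four subordinate titles, while
# the construction object has none.
children = {"overall content": ["construction object", "construction mode",
                                "specific content", "construction effect"]}
own_text = {"overall content": "text 1", "construction object": "text 5"}
content_text = {"overall content": "text 2", "construction object": "text 5"}

target_text("overall content", children, own_text, content_text)      # → "text 1"
target_text("construction object", children, own_text, content_text)  # → "text 5"
```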
In this specification, when training the text generation model, the server may take the prompt text obtained from a title in the document, together with the title, as a training sample, take the text corresponding to the title (i.e., the target text) as the label of the training sample, and train the initial text generation model based on the training samples and labels to obtain the text generation model. In step S106, the server may splice the prompt text and the title to obtain a first text, input the first text into the pre-trained initial text generation model, and determine the output text. In step S110, the server may train the initial text generation model with the goal of minimizing the difference between the target text and the output text, to obtain the text generation model.
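Assembling the training data under these steps amounts to pairing each spliced (prompt text, title) input with its target text as the label. The sketch below shows that assembly only; the actual fine-tuning objective (minimizing the difference between target and output text) depends on the chosen large language model and is not prescribed here:

```python
def build_training_samples(titles, prompt_text, target):
    """One sample per title: the input is the spliced prompt text and title,
    the label is the text corresponding to the title itself."""
    samples = []
    for title in titles:
        samples.append({
            "input": prompt_text[title] + " " + title,  # spliced first text fed to the model
            "label": target[title],                     # target text supervising the output
        })
    return samples

samples = build_training_samples(
    ["overall content"],
    {"overall content": "Advance traffic construction"},
    {"overall content": "text 1"},
)
# samples[0]["input"] → "Advance traffic construction overall content"
```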
As can be seen from the above method, when training the text generation model, the server can determine the collected documents and then, for each title contained in a document, determine the upper-level title of that title in the document. The prompt text of the title is determined according to the upper-level title, so that when generating the content under the title, the text generation model is influenced by the title's upper-level title, which prevents the generated content from deviating from the content range covered by that upper-level title. The prompt text and the title are then input into the pre-trained initial text generation model to determine the output text. The text corresponding to the title in the document is determined and used as the target text. Finally, the initial text generation model is trained according to the target text and the output text to obtain the text generation model, improving the accuracy of the text the model generates.
In this specification, when pre-training the initial text generation model, the server may determine the unstructured text in a pre-built document corpus, and then train the initial text generation model to be trained based on that unstructured text to obtain the initial text generation model. The document corpus is pre-constructed by the server and contains both structured and unstructured text. Structured text is formatted text carrying titles of each level, such as document text in XML format, while unstructured text has no format and contains only content, such as document text in TXT format. In addition, the documents collected in step S100 may be the structured texts in the document corpus.
In order to enable the initial text generation model to learn the characterization of documents, the server can train the initial text generation model to be trained based on the unstructured text in the document corpus to obtain the initial text generation model. The initial text generation model to be trained can be any existing base large model, such as an open-source dialogue language model supporting both Chinese and English. Such a base large model is trained on Chinese-English bilingual corpora containing a large number of tokens, supplemented by techniques such as supervised fine-tuning, self-feedback, and reinforcement learning from human feedback. It is a pre-trained model with billions of parameters that can already generate text conforming to human preferences.
In this specification, the server can train the initial text generation model to be trained in an unsupervised manner based on the unstructured text in the document corpus, to obtain the initial text generation model. Alternatively, the server can train the initial text generation model to be trained with two training tasks, a word mask prediction task and a sentence order prediction task, based on the unstructured text in the document corpus. Specifically, the server may train a tokenizer based on the unstructured text to create a vocabulary corresponding to the dataset, process the unstructured text according to the training tasks to obtain training samples and their corresponding labels, and then, based on the obtained training samples and labels, train the initial text generation model to be trained on the word mask prediction task and the sentence order prediction task.
If the training task is the word mask prediction task, assume the unstructured text is "At present, traffic congestion often occurs in the morning and evening rush hours." The token "congestion" can be masked according to the requirements of word mask prediction, yielding the training sample "At present, traffic [MASK] often occurs in the morning and evening rush hours." with the training label "congestion".
If the training task is the sentence order prediction task, assume the unstructured text is "At present, traffic congestion often occurs in the morning and evening rush hours. Traffic congestion increases the occurrence of traffic accidents." Processing it according to the requirements of the sentence order prediction task yields the training sample "At present, traffic congestion often occurs in the morning and evening rush hours. [SEP] Traffic congestion increases the occurrence of traffic accidents." with the training label "1", indicating that the latter sentence logically follows the former.
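A minimal sketch of constructing the two kinds of pre-training samples above; tokenization is simplified to a pre-split token list, which is an assumption for illustration.

```python
def make_mask_sample(tokens, mask_index, mask_token="[MASK]"):
    """Word mask prediction: hide one token and keep the original as the label."""
    masked = list(tokens)
    label = masked[mask_index]
    masked[mask_index] = mask_token
    return masked, label

def make_order_sample(sentence_a, sentence_b, in_order):
    """Sentence order prediction: label 1 when sentence_b logically follows sentence_a."""
    first, second = (sentence_a, sentence_b) if in_order else (sentence_b, sentence_a)
    return first + " [SEP] " + second, 1 if in_order else 0
```

For the congestion example, masking the token "congestion" yields the `[MASK]` sample with label "congestion", and joining the two sentences with `[SEP]` in their original order yields label 1.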
Based on this, as shown in fig. 3, fig. 3 is a schematic diagram of training the text generation model provided in this specification. The server may train the base large model based on the unstructured text in the document corpus to obtain the initial text generation model, and then train the initial text generation model based on the structured text in the document corpus to obtain the text generation model.
In this specification, when training the text generation model, in addition to using the prompt text obtained based on a title in a document together with that title as a training sample and the text corresponding to the title (i.e., the target text) as its label, the server may also use the document name as a training sample and each title contained in the document as its label, and train the text generation model based on the document name and the titles. In this way, the text generation model can generate not only the content corresponding to a title, but also the document outline corresponding to a document name, i.e., each title corresponding to the document name.
Likewise, when training the text generation model, the server may also use a title in the document together with the text corresponding to the title itself (i.e., the target text) as a training sample, and use the next-level titles of that title as labels; that is, the level of a title used as a label differs from the level of the title used as the training sample by exactly 1. Then, based on the training samples and their corresponding labels, the text generation model is trained, so that the model can generate not only the content corresponding to a title but also the next-level titles corresponding to it.
Specifically, the server may determine the collected documents and, for each title contained in a document, determine the text corresponding to the title in the document, that is, the target text, and then use the title and its corresponding text as a training sample. The server then determines the lower-level titles corresponding to the title, determines the labels corresponding to the training sample according to those lower-level titles, and trains the text generation model according to the training samples and labels. The labels are the next-level titles of the title, that is, those lower-level titles whose level differs from the title's level by 1. When determining the labels corresponding to the training sample from the lower-level titles, the server may, for each lower-level title, compute the difference between the level of the lower-level title and the level of the title, and when the difference is 1, use that lower-level title as a label for the training sample.
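The level-difference rule for choosing labels can be sketched as follows, assuming integer title levels with 1 as the top level (an assumption; the specification does not fix a numbering scheme).

```python
def next_level_titles(title_level: int, lower_titles: list) -> list:
    """Keep only those lower-level titles whose level exceeds the sample
    title's level by exactly 1; these serve as the training labels."""
    return [title for title, level in lower_titles if level - title_level == 1]
```

For a level-1 title whose lower-level titles sit at levels 2 and 3, only the level-2 titles are kept as labels.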
In this specification, after obtaining the text generation model, the server may determine, in response to a first input operation by the user, the text input by the user as the first text, input the first text into the text generation model to determine the output text, determine the target document required by the user according to the output text, and display the target document to the user. The first text may be a document title input by the user, which may be a title of the target document; the text generation model then generates the text corresponding to that title, i.e., the output text, according to the title contained in the first text.
Because the target document may contain a plurality of titles, the first text may be a single title corresponding to the target document input by the user, or a plurality of titles corresponding to the target document. Based on this, the server may determine each title corresponding to the target document according to the first text, input each title into the text generation model, and determine the output text corresponding to that title. The target document required by the user is then determined according to the titles and their corresponding output texts, and displayed to the user.
In addition, when inputting the first text into the text generation model to determine the output text, or when inputting a title into the text generation model to determine the output text corresponding to that title, the server may add a relatedness prompt to improve the accuracy of the text generated by the text generation model. Taking inputting the first text into the text generation model as an example, the server may determine the upper-level title of the first text according to the first text, use the upper-level title as the relatedness prompt, and input the relatedness prompt together with the first text into the text generation model to determine the output text.
In this specification, besides increasing the relatedness between titles, it is also necessary to increase the distinguishability between titles so as to avoid interference between peer titles. Again taking inputting the first text into the text generation model to determine the output text as an example, the server may determine the peer titles of the first text and the upper-level title of the first text, splice the peer titles, a designated text and the upper-level title to obtain the distinguishing prompt of the first text, and input the distinguishing prompt together with the first text into the text generation model to determine the output text. The upper-level title and the peer titles of the first text may be input by the user, or obtained by the server based on the document outline corresponding to the first text.
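Splicing the distinguishing prompt might look like the following sketch. The separator and ordering are assumptions, since the specification only states that the peer titles, the designated text and the upper-level title are spliced.

```python
def build_distinguishing_prompt(peer_titles, designated_text, upper_title):
    # Splice in order: peer titles, then the designated text, then the upper-level title.
    return "; ".join(peer_titles) + " " + designated_text + " " + upper_title
```

The designated text here would be a fixed phrase separating the peer titles from the upper-level title, e.g. a hypothetical connective such as "is distinct from, under".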
In this specification, in order that the text generated for a title has novelty and the text generation model can draw on the latest knowledge related to the title, when inputting the first text into the text generation model to determine the output text, the server may determine the literature materials, determine from them a document matching the first text to serve as a reference, and input the reference together with the first text into the text generation model to determine the output text. Specifically, the server may determine the literature materials, determine the document matching the first text from them using a text matching algorithm, use it as the reference, and input the reference and the first text into the text generation model to determine the output text.
Of course, the server may also determine the literature materials, determine the text features of the first text, and determine the text features corresponding to each document in the literature materials. For each document, the similarity between the document's text features and the first text's text features is determined. The document with the highest similarity is then determined from the literature materials according to these similarities and used as the reference, and the reference and the first text are input into the text generation model to determine the output text. When determining the text features of the first text, or the text features corresponding to each document in the literature materials, taking the first text as an example, the server may input the first text into a pre-trained text feature extraction layer to determine its text features; of course, the server may also use any existing text feature extraction layer.
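The highest-similarity selection can be sketched with cosine similarity over text feature vectors. Cosine similarity is one common choice, assumed here because the specification does not name a similarity measure.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_reference(first_text_features, document_features):
    """Index of the document whose text features are most similar to the first text's."""
    similarities = [cosine_similarity(first_text_features, f) for f in document_features]
    return max(range(len(similarities)), key=similarities.__getitem__)
```

The document at the returned index is then used as the reference and input into the text generation model together with the first text.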
The literature materials are document-related materials, and may be collected in advance by the server or uploaded by the user. Accordingly, when determining the literature materials, the server can determine the materials uploaded by the user in response to the user's upload operation and use them as the literature materials.
In this specification, the text generation model may generate, in addition to the text corresponding to a title, the outline corresponding to a document. Therefore, after obtaining the text generation model, the server may further determine, in response to a second input operation by the user, the text input by the user and use it as the second text. The server may input the second text into the text generation model, determine the document outline, and display it to the user. Then, in response to a first selection operation by the user, the title selected by the user is determined and used as the first title. The first title is input into the text generation model, and the output text corresponding to the first title is determined. The target document required by the user is then determined according to the output text corresponding to the first title and displayed to the user.
The second text is the name of the target document, and the document outline contains a plurality of titles, so the first title may be a single title selected by the user from the plurality of titles, or a plurality of titles selected by the user. In the latter case, when inputting the first title into the text generation model to determine the corresponding output text, the server may, for each first title, input that first title into the text generation model and determine its corresponding output text. The target document required by the user is then determined according to each first title and its corresponding output text, and displayed to the user.
In addition, when inputting the first title into the text generation model to determine the output text corresponding to the first title, the server may determine the upper-level title of the first title according to the document outline, use the upper-level title as the relatedness prompt, and input the relatedness prompt together with the first title into the text generation model to determine the output text corresponding to the first title. The server may also determine the peer titles of the first title according to the document outline, determine the upper-level title of the first title, splice the peer titles, the designated text and the upper-level title to obtain the distinguishing prompt of the first title, and then input the distinguishing prompt together with the first title into the text generation model to determine the output text corresponding to the first title.
In this specification, when the first title is a single title selected by the user, and the target document required by the user is determined according to the output text corresponding to the first title, the server may directly use the output text corresponding to the first title together with the document outline as the target document. Of course, after the output text corresponding to the first title is determined, the user may continue to select other titles contained in the document outline; the server then continues to determine the title newly selected by the user, treats it as a new first title, and determines its output text, until the output texts of all titles contained in the document outline are determined, whereupon the target document required by the user is determined according to the output texts of all the titles contained in the document outline.
Based on this, as shown in fig. 4, fig. 4 is a schematic diagram of a process of generating a target document provided in this specification. The server may determine the document name input by the user in response to a second input operation by the user, input the document name into the text generation model, determine the document outline of the target document, and display the document outline to the user. In response to a first selection operation by the user, the title selected by the user is determined and used as the first title. Then, the literature materials are determined, and from them a document matching the first title is determined as the reference. The reference and the first title are input into the text generation model, and the output text corresponding to the first title is determined. The target document required by the user is then determined according to the first title and its corresponding output text, and displayed to the user.
In this specification, after the second text is input into the text generation model to determine the document outline and the document outline is displayed to the user, the user may modify the titles in the document outline. Specifically, the server determines the modification information in response to the user's modification operation, updates the document outline according to the modification information, and displays the updated document outline to the user. The modification information includes a title to be modified and a target title, where the target title is the title obtained after modifying the title to be modified.
In this specification, the text generation model may generate, in addition to the text corresponding to a title, the next-level titles corresponding to a title. Therefore, after obtaining the text generation model, the server may determine, in response to a third input operation by the user, the text input by the user and use it as the third text, where the third text includes a document title and the text corresponding to that title. The server then inputs the third text into the text generation model, determines the next-level titles corresponding to the title in the third text, and displays them to the user. In response to a second selection operation by the user, the title selected by the user is determined and used as the second title. The second title is input into the text generation model, and the output text corresponding to the second title is determined. Then, the target document required by the user is determined according to the output text corresponding to the second title and displayed to the user.
The third text contains a title and the text corresponding to that title, and the output text is the text corresponding to the second title. The second title is a next-level title corresponding to the title in the third text. Since the title in the third text may correspond to a plurality of next-level titles, the second title may be a single title selected by the user from them, or a plurality of selected titles. In the latter case, when inputting the second title into the text generation model to determine the corresponding output text, the server may, for each second title, input that second title into the text generation model and determine its corresponding output text. The target document required by the user is then determined according to each second title and its corresponding output text, and displayed to the user. In addition, the process of inputting the second title into the text generation model to determine the output text corresponding to the second title is similar to the process of inputting the first title into the text generation model to determine the output text corresponding to the first title, and will not be described here again.
When determining the target document required by the user according to the second titles and the output texts corresponding to the second titles, the server may use the third text, the second titles and the output texts corresponding to the second titles as the target document required by the user.
In addition, the server may further use a second title and its corresponding output text as a new third text, input the new third text into the text generation model, determine the next-level titles corresponding to the title contained in the new third text (i.e., the second title), and display them to the user. The server then continues to respond to the user's second selection operation, determines the title selected by the user, and uses it as a new second title. The new second title is input into the text generation model and its corresponding output text is determined, and so on until an end condition is reached, whereupon the server determines the target document required by the user according to each second title and its corresponding output text. The end condition may be that the number of second titles reaches a specified value, where the specified value is preset.
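The iterative expansion above can be sketched as the following loop. `generate_titles`, `generate_text` and `select` stand in for the text generation model and the user's selection operation; all three names are assumptions for illustration.

```python
def expand_document(third_text, generate_titles, generate_text, select, limit):
    """Repeatedly turn the selected title plus its generated text into a new
    third text, until the number of second titles reaches the preset limit
    (the end condition) or no next-level titles remain."""
    collected = []                                # (second title, its output text)
    while len(collected) < limit:
        candidates = generate_titles(third_text)  # next-level titles from the model
        if not candidates:
            break
        title = select(candidates)                # user's second selection operation
        text = generate_text(title)
        collected.append((title, text))
        third_text = title + "\n" + text          # becomes the new third text
    return collected
```

The target document is then assembled from the original third text and the collected second titles with their output texts.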
In this specification, the server may collect document data during the application of the text generation model, where the document data may include target documents, document names, document outlines, titles, and the texts corresponding to the titles. Subsequently, the server can perform optimization training on the text generation model based on the document data, so that the text generation model can be updated iteratively and the accuracy of its output improved.
Specifically, the server may use the prompt text obtained based on a title in a target document in the document data, together with that title, as a training sample, and use the text corresponding to the title as its label. The server may also use the document name of a target document in the document data as a training sample and all titles contained in the target document as its labels. The server may further use a title in a target document in the document data together with the text corresponding to that title as a training sample, and use the next-level titles of that title as labels, that is, the level of a title used as a label differs from the level of the title used as the training sample by 1. The text generation model is then trained based on the training samples and their corresponding labels.
The foregoing is the method implemented by one or more embodiments of this specification. Based on the same idea, this specification further provides a corresponding training device for the text generation model, as shown in fig. 5.
Fig. 5 is a schematic diagram of a training device for a text generation model provided in the present specification, including:
a first determining module 200, configured to determine a collected document, where the document includes a plurality of titles;
A second determining module 202, configured to determine, for each title included in the document, a top level title of the title in the document;
a generating prompt module 204, configured to determine a prompt text of the title according to the upper level title;
a text generation module 206, configured to input the prompt text and the title into a pre-trained initial text generation model, and determine an output text; wherein the initial text generation model is a large language model;
a third determining module 208, configured to determine a text corresponding to the title in the document, and serve as a target text;
the training module 210 is configured to train the initial text generation model according to the target text and the output text, so as to obtain a text generation model; the text generation model is used for generating documents required by the user according to the text input by the user.
Optionally, the prompt text includes a relatedness prompt;
the generating prompt module 204 is specifically configured to splice the upper-level titles, and use the spliced result as the relatedness prompt of the title.
Optionally, the prompt text includes a distinguishing prompt;
the generating prompt module 204 is specifically configured to determine the peer titles of the title in the document, and splice the peer titles, the designated text and the upper-level title to obtain the distinguishing prompt of the title.
Optionally, the apparatus further comprises:
an application module 212, configured to determine, in response to a first input operation by the user, a text input by the user, and serve as a first text; inputting the first text into the text generation model to determine an output text; determining a target document required by the user according to the output text; and displaying the target document to the user.
Optionally, the application module 212 is specifically configured to determine the literature materials; determine, from the literature materials, a document matching the first text as a reference; and input the reference and the first text into the text generation model to determine the output text.
Optionally, the application module 212 is further configured to determine, in response to a second input operation by the user, the text input by the user as a second text; input the second text into the text generation model, determine the document outline, and display it to the user, where the document outline includes a plurality of titles; determine, in response to a first selection operation by the user, the title selected by the user as a first title; input the first title into the text generation model and determine the output text corresponding to the first title; determine the target document required by the user according to the output text corresponding to the first title; and display the target document to the user.
Optionally, the application module 212 is specifically configured to determine, in response to the user's upload operation, the materials uploaded by the user, and use them as the literature materials.
The present specification also provides a computer readable storage medium storing a computer program operable to perform a training method of a text generation model as provided in fig. 1 above.
The present specification also provides a schematic structural diagram, shown in fig. 6, of an electronic device corresponding to fig. 1. At the hardware level, as shown in fig. 6, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile storage, and may of course also include hardware required by other services. The processor reads the corresponding computer program from the non-volatile storage into the memory and then runs it to implement the training method of the text generation model described above with respect to fig. 1.
Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded from the present description, that is, the execution subject of the following processing flows is not limited to each logic unit, but may be hardware or logic devices.
In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be realized by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (e.g., a field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a PLD, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, nowadays, instead of manually manufacturing integrated circuit chips, such programming is mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the original code before compiling is also written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many kinds, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language), among which VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing the logic method flow can be readily obtained by merely slightly programming the method flow into an integrated circuit using several of the hardware description languages described above.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer readable program code, it is entirely possible to implement the same functionality by logically programming the method steps so that the controller takes the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be regarded as a hardware component, and the means included therein for performing various functions may also be regarded as structures within the hardware component. Or, the means for achieving various functions may even be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described with their functions divided into various units. Of course, when implementing this specification, the functions of the units may be implemented in one or more pieces of software and/or hardware.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
This specification may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. This specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including memory storage devices.
In this specification, the embodiments are described in a progressive manner; identical or similar parts of the embodiments may be referred to mutually, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, see the corresponding parts of the description of the method embodiments.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.

Claims (10)

1. A method of training a text generation model, comprising:
determining a collected document, wherein the document comprises a plurality of titles;
determining, for each title included in the document, a superior title of the title in the document;
determining a prompt text of the title according to the superior title;
inputting the prompt text and the title into a pre-trained initial text generation model, and determining an output text; wherein the initial text generation model is a large language model;
determining a text corresponding to the title in the public text and taking the text as a target text;
training the initial text generation model according to the target text and the output text to obtain a text generation model; the text generation model is used for generating documents required by the user according to the text input by the user.
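The data-construction steps of claim 1 (finding each title's superior title in the document, splicing those superior titles into a prompt, and pairing the prompt and title with a target text taken from the public text) can be sketched roughly as follows. The `(level, title)` document representation and all function names here are illustrative assumptions for demonstration only; the claim does not fix a concrete data format.

```python
# Illustrative sketch of claim 1's training-sample construction.
# The (level, title) document representation is an assumption.

def superior_titles(document, index):
    """Collect the chain of higher-level titles above document[index].

    `document` is a list of (level, title) tuples, where a smaller
    level number means a higher-ranking heading (1 = top level).
    """
    chain = []
    level, _ = document[index]
    for lv, title in reversed(document[:index]):
        if lv < level:          # a strictly higher-ranking heading
            chain.append(title)
            level = lv          # keep climbing toward level 1
    return list(reversed(chain))

def build_training_samples(document, target_texts):
    """Pair each title's prompt with its target text from the public text."""
    samples = []
    for i, (_, title) in enumerate(document):
        prompt = " > ".join(superior_titles(document, i))  # spliced superiors
        samples.append({"prompt": prompt, "title": title,
                        "target": target_texts.get(title, "")})
    return samples

doc = [(1, "Introduction"), (2, "Background"), (2, "Motivation"),
       (1, "Method"), (2, "Model Training")]
samples = build_training_samples(doc, {"Model Training": "We fine-tune ..."})
```

The resulting (prompt, title, target) triples would then drive supervised fine-tuning of the initial large language model, comparing its output text against the target text.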
2. The method of claim 1, wherein the prompt text comprises a contact prompt;
according to the superior title, determining the prompt text of the title specifically comprises the following steps:
splicing the superior titles;
and taking the spliced result as a contact prompt of the title.
3. The method of claim 1, wherein the prompt text comprises a distinguishing prompt;
according to the superior title, determining the prompt text of the title specifically comprises the following steps:
determining a peer title of the title in the document;
and splicing the peer title, the designated text, and the superior title to obtain a distinguishing prompt of the title.
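A distinguishing prompt as in claim 3 can be assembled by splicing the peer titles of the current title with a short separator text and the superior titles, so the model can tell sibling sections apart. The helper below is a hypothetical sketch; the concrete splicing order and the separator string are not fixed by the claim.

```python
# Hypothetical sketch of claim 3's distinguishing prompt. The default
# separator string and the joining delimiters are illustrative choices.

def distinguishing_prompt(peer_titles, superior_titles,
                          designated_text="is distinct from"):
    peers = ", ".join(peer_titles)          # splice the peer titles
    superiors = " > ".join(superior_titles) # splice the superior titles
    return f"{peers} {designated_text} {superiors}"

p = distinguishing_prompt(["Background", "Motivation"], ["Introduction"])
```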
4. The method of claim 1, wherein the method further comprises:
determining a text input by the user as a first text in response to a first input operation of the user;
inputting the first text into the text generation model to determine an output text;
determining a target document required by the user according to the output text;
and displaying the target document to the user.
5. The method of claim 4, wherein inputting the first text into the text generation model, determining output text, comprises:
determining literature materials;
determining, from the literature materials, a document matching the first text as a reference;
the reference and the first text are input into the text generation model, and output text is determined.
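The reference retrieval of claim 5 (finding the literature-material document that best matches the first text) can be approximated with a simple token-overlap score, as in the assumed sketch below; a production system would more likely use dense (embedding-based) retrieval, and all names here are illustrative.

```python
# Assumed sketch of claim 5: pick the literature document whose token
# overlap with the user's first text is largest; that document then
# accompanies the first text as input to the text generation model.

def best_reference(first_text, literature):
    query = set(first_text.lower().split())
    def overlap(doc):
        # count shared lowercase tokens between query and candidate
        return len(query & set(doc.lower().split()))
    return max(literature, key=overlap)

library = ["training large language models on documents",
           "image segmentation with convolutional networks",
           "prompt construction from document titles"]
ref = best_reference("prompt text from titles", library)
```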
6. The method of claim 1, wherein the method further comprises:
determining a text input by the user as a second text in response to a second input operation of the user;
inputting the second text into the text generation model, determining a document outline, and displaying the document outline to the user; wherein the document outline comprises a plurality of titles;
determining a title selected by the user as a first title in response to a first selection operation of the user;
inputting the first title into the text generation model, and determining an output text corresponding to the first title;
determining a target document required by the user according to the output text corresponding to the first title;
and displaying the target document to the user.
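The two-stage interaction of claim 6 (the model first returns a document outline, the user selects a title, and the model then expands the selected title into text) can be sketched as below. `fake_generate` is a pure placeholder standing in for the trained text generation model; its prompt convention and outputs are assumptions for illustration.

```python
# Sketch of claim 6's flow: outline generation, user title selection,
# then expansion of the selected title. `fake_generate` is a stub, not
# a real model interface.

def fake_generate(prompt):
    if prompt.startswith("OUTLINE:"):
        return ["Introduction", "Method", "Results"]
    return f"Section text for {prompt!r}"

def outline_then_expand(second_text, select):
    outline = fake_generate(f"OUTLINE: {second_text}")  # shown to the user
    first_title = select(outline)                       # user's selection
    return fake_generate(first_title)                   # target document text

text = outline_then_expand("survey of prompt methods",
                           lambda titles: titles[1])
```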
7. The method of claim 5, wherein determining literature material comprises:
and in response to an uploading operation of the user, determining the data uploaded by the user and taking the data as the literature materials.
8. An apparatus for training a text generation model, comprising:
a first determining module, configured to determine a collected document, wherein the document comprises a plurality of titles;
a second determining module, configured to determine, for each title included in the document, a superior title of the title in the document;
a prompt generation module, configured to determine a prompt text of the title according to the superior title;
a text generation module, configured to input the prompt text and the title into a pre-trained initial text generation model and determine an output text; wherein the initial text generation model is a large language model;
a third determining module, configured to determine a text corresponding to the title in the public text and take the text as a target text;
a training module, configured to train the initial text generation model according to the target text and the output text to obtain a text generation model; wherein the text generation model is used for generating a document required by a user according to text input by the user.
9. A computer readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-7.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of the preceding claims 1-7 when executing the program.
CN202410236982.8A 2024-03-01 Training method and device of text generation model, medium and electronic equipment Active CN117807961B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410236982.8A CN117807961B (en) 2024-03-01 Training method and device of text generation model, medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN117807961A true CN117807961A (en) 2024-04-02
CN117807961B CN117807961B (en) 2024-05-31

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210073257A1 (en) * 2019-09-09 2021-03-11 Syntexys Inc. Logical document structure identification
WO2022009253A1 (en) * 2020-07-06 2022-01-13 NEC Corporation Information processing device, information processing method, and recording medium
US20220092097A1 (en) * 2020-09-18 2022-03-24 Anurag Gupta Method for Extracting and Organizing Information from a Document
CN114492327A (en) * 2021-12-28 2022-05-13 中科曙光南京研究院有限公司 Intelligent writing method for official documents
KR20220093924A (en) * 2020-12-28 2022-07-05 한국과학기술원 Method and apparatus for extracting hashtags based on recurrent generation model with hashtag feedback
US20220365993A1 (en) * 2021-05-14 2022-11-17 Our Future Universe Classifying relevance of natural language text for topic-based notifications
CN116227474A (en) * 2023-05-09 2023-06-06 之江实验室 Method and device for generating countermeasure text, storage medium and electronic equipment
CN116306603A (en) * 2023-01-19 2023-06-23 支付宝(杭州)信息技术有限公司 Training method of title generation model, title generation method, device and medium
CN116484808A (en) * 2023-04-23 2023-07-25 北京方寸无忧科技发展有限公司 Method and device for generating controllable text for official document
CN116595138A (en) * 2023-05-05 2023-08-15 科大讯飞股份有限公司 Knowledge question-answering method, device, equipment and storage medium
CN116628198A (en) * 2023-05-08 2023-08-22 之江实验室 Training method and device of text generation model, medium and electronic equipment
CN117151893A (en) * 2023-09-19 2023-12-01 杭州星锐网讯科技有限公司 Investment strategy document generation method, device, equipment and storage medium
CN117272982A (en) * 2023-09-18 2023-12-22 蚂蚁区块链科技(上海)有限公司 Protocol text detection method and device based on large language model
CN117332282A (en) * 2023-11-29 2024-01-02 之江实验室 Knowledge graph-based event matching method and device
CN117573913A (en) * 2023-11-23 2024-02-20 摩尔线程智能科技(北京)有限责任公司 Prompt word expanding and writing method and device, storage medium and electronic equipment

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
CLAVIE B et al.: "Large Language Models in the Workplace: A Case Study on Prompt Engineering for Job Type Classification", Natural Language Processing and Information Systems: 28th International Conference on Applications of Natural Language to Information Systems, 14 September 2023 (2023-09-14) *
ZHOU Yuncheng et al.: "Agricultural text classification method based on NB and CHI values", Jiangsu Agricultural Sciences, no. 17, 25 September 2018 (2018-09-25) *
YUE Yifeng et al.: "A construction method for a BERT-based automatic text summarization model", Computer and Modernization, no. 01, 15 January 2020 (2020-01-15) *
WANG Liang: "MOOC course reconstruction based on multimodal knowledge graphs from a deep learning perspective", Modern Educational Technology, no. 10, 15 October 2018 (2018-10-15) *
CHE Lei et al.: "Topic classification with a text-structure-oriented hybrid hierarchical attention network", Journal of Chinese Information Processing, no. 05, 15 May 2019 (2019-05-15) *

Similar Documents

Publication Publication Date Title
CN116227474B (en) Method and device for generating countermeasure text, storage medium and electronic equipment
CN113221555B (en) Keyword recognition method, device and equipment based on multitasking model
CN111523289B (en) Text format generation method, device, equipment and readable medium
CN112417093B (en) Model training method and device
CN116628198A (en) Training method and device of text generation model, medium and electronic equipment
CN116186330B (en) Video deduplication method and device based on multi-mode learning
CN117807961B (en) Training method and device of text generation model, medium and electronic equipment
CN116662657A (en) Model training and information recommending method, device, storage medium and equipment
CN117807961A (en) Training method and device of text generation model, medium and electronic equipment
CN113887234A (en) Model training and recommending method and device
CN116501852B (en) Controllable dialogue model training method and device, storage medium and electronic equipment
CN117369783B (en) Training method and device for security code generation model
CN116795972B (en) Model training method and device, storage medium and electronic equipment
CN117271611B (en) Information retrieval method, device and equipment based on large model
CN115017899B (en) Abbreviation generation method, apparatus, device and storage medium
CN117033469B (en) Database retrieval method, device and equipment based on table semantic annotation
CN117520850A (en) Model training method and device, storage medium and electronic equipment
CN115017905A (en) Model training and information recommendation method and device
CN117076667A (en) Classification label determining method and device, storage medium and electronic equipment
CN117290587A (en) Method and device for searching interest point change information and computer equipment
CN117171346A (en) Entity linking method and device, storage medium and electronic equipment
CN117591217A (en) Information display method, device, equipment and storage medium
CN118069824A (en) Risk identification method and device, storage medium and electronic equipment
CN113642305A (en) Text generation method and device, storage medium and electronic equipment
CN117875522A (en) Method, device, storage medium and equipment for predicting event number

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant