CN117312497A - Method, device and equipment for generating instruction fine tuning data - Google Patents
- Publication number
- CN117312497A (application number CN202311278107.8A)
- Authority
- CN
- China
- Prior art keywords
- task
- instruction
- question
- types
- answer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/3334—Selection or weighting of terms from queries, including natural language queries
- G06F16/3329—Natural language query formulation or dialogue systems
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
- G06N3/08—Learning methods
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application discloses a method, a device, and equipment for generating instruction fine-tuning data, relates to the technical field of artificial intelligence, and can accurately meet task requirements in different fields, thereby improving the training effect of a model in the fine-tuning stage. The method comprises the following steps: acquiring a question-answer task sample in structured text form, constructing initial task instructions of different task types according to the question-answer task sample, adding instruction description restrictions matched with the task types into the initial task instructions to construct instruction instances of different task types, and finally combining the question-answer task sample with the instruction instances of the different task types respectively to obtain instruction fine-tuning data.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, and a device for generating instruction fine-tuning data.
Background
Large-scale language models are one application of deep learning; their goal is to generate high-quality, logically coherent natural language from large-scale training data and powerful computing power. To achieve this, in the pre-training phase the model is trained on large-scale generic text data and learns the basic structure and patterns of the language. In the fine-tuning stage, the model is further trained on smaller, more specific data sets. Since the data set of the fine-tuning stage is usually specific to a particular task or domain, for example medical text, legal text, or specific dialogue data, the fine-tuned model can better understand and generate the language of that domain, and can thus understand and perform specific language tasks such as question answering, text classification, summarization, collaborative writing, and keyword extraction.
In the related art, the fine-tuning stage uses large-scale instruction fine-tuning data to fine-tune the pre-trained language model, a process closely related to supervised fine-tuning and multi-task prompt training. Instruction fine-tuning data in question-and-answer form can be built by random combination through an instruction generation mechanism. However, although a large number of training examples are formatted by adding instructions, these examples mainly come from public natural language processing data sets; the resulting instruction fine-tuning data lacks diversity in task description, can hardly meet task requirements in different fields accurately, and has limited effect on improving the model.
Disclosure of Invention
In view of this, the present application provides a method, an apparatus, and a device for generating instruction fine-tuning data, mainly aiming to solve the problem that instruction fine-tuning data generated in the prior art lacks diversity in task description, can hardly meet task requirements in different fields accurately, and has limited effect on improving the model.
According to a first aspect of the present application, there is provided a method for generating instruction fine-tuning data, including:
acquiring a question-answer task sample in a structured text form;
constructing initial task instructions of different task types according to the question-answer task samples;
adding instruction description limits matched with the task types into the initial task instructions according to the task types, and constructing instruction instances of different task types;
and respectively combining the question-answer task samples with the instruction examples of different task types to obtain instruction fine-tuning data.
Further, the obtaining a question-answer task sample in the form of a structured text includes:
obtaining structured text data of different domain types;
and extracting content from the structured text data according to the field content contained in the structured text data of that domain type, to obtain a question-answer task sample in structured text form.
Further, the constructing initial task instructions of different task types according to the question-answer task samples includes:
selecting key fields of task description from the question-answering task sample according to field content contained in the question-answering task sample;
and constructing initial task instructions of different task types according to the key fields of the task description.
Further, the constructing initial task instructions of different task types according to the key fields of the task description includes:
setting question-answering tasks of different task types according to the key fields of the task description;
determining multiple description modes of the key fields on different task types;
and constructing initial task instructions of different task types according to various description modes of the key fields on different task types.
Further, before adding instruction description restrictions matched with the task types in the initial task instruction according to the task types, and constructing instruction instances of different task types, the method further comprises:
acquiring text limiting characteristics related to question-answer task samples in different task types in advance;
and determining the limiting information of the question-answer task samples of different task types on each text limiting feature according to the text limiting features.
Further, the adding instruction description limits matched with the task types into the initial task instruction according to the task types and constructing instruction instances of different task types includes:
determining target text limiting characteristics to be added to the initial task instruction according to the task type;
acquiring limiting information of the question-answering task sample on a target text limiting feature;
and adding the limiting information on the target text limiting feature into the initial task instruction in an instruction description mode, and constructing instruction instances of different task types.
Further, after the question-answer task samples are respectively combined with the instruction instances of the different task types to obtain instruction fine-tuning data, the method further includes:
and inputting the instruction fine-tuning data into the network model output by the pre-training stage, and fine-tuning the network model by using a pre-configured sequence loss function to obtain a fine-tuned network model.
According to a second aspect of the present application, there is provided an apparatus for generating instruction fine-tuning data, including:
the first acquisition unit is used for acquiring a question-answer task sample in a structured text form;
the construction unit is used for constructing initial task instructions of different task types according to the question-answer task samples;
the adding unit is used for adding instruction description limits matched with the task types into the initial task instruction according to the task types and constructing instruction instances of different task types;
and the combination unit is used for respectively combining the question-answer task samples with the instruction examples of different task types to obtain instruction fine-tuning data.
Further, the first obtaining unit is specifically configured to obtain structured text data of different domain types; and extracting the content of the structured text data according to the field content contained in the structured text data in the field type to obtain a question-answer task sample in the structured text form.
Further, the construction unit is specifically configured to select a key field of task description from the question-answer task sample according to field content included in the question-answer task sample; and constructing initial task instructions of different task types according to the key fields of the task description.
Further, the construction unit is specifically configured to set question-answer tasks of different task types according to the key fields of the task description; determining multiple description modes of key fields on different task types; and constructing initial task instructions of different task types according to various description modes of the key fields on different task types.
Further, the apparatus further comprises:
the second obtaining unit is used for obtaining text restriction characteristics related to question-answer task samples in different task types in advance before adding instruction description restrictions matched with the task types into the initial task instruction according to the task types and constructing instruction examples of different task types;
and the determining unit is used for determining the limit information of the question-answer task samples with different task types on each text limit feature according to the text limit features.
Further, the adding unit is specifically configured to determine a target text restriction feature to be added to the initial task instruction according to a task type; acquiring limiting information of the question-answering task sample on a target text limiting feature; and adding the limiting information on the target text limiting feature into the initial task instruction in an instruction description mode, and constructing instruction instances of different task types.
Further, the apparatus further comprises:
and the fine tuning unit is used for inputting the instruction fine tuning data into the network model output in the pre-training stage after the question-answer task samples are respectively combined with the instruction examples of different task types to obtain the instruction fine tuning data, and fine tuning the network model by utilizing a pre-configured sequence loss function to obtain a fine-tuned network model.
According to a third aspect of the present application, there is provided a computer device comprising a memory storing a computer program and a processor, the processor implementing the steps of the method of the first aspect described above when executing the computer program.
According to a fourth aspect of the present application, there is provided a readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the first aspect described above.
By means of the above technical scheme, compared with the prior art, in which instruction fine-tuning data in question-and-answer form is constructed by random combination through an instruction generation mechanism, the method, apparatus, and device for generating instruction fine-tuning data provided by the present application acquire question-answer task samples in structured text form, construct initial task instructions of different task types from those samples, then add instruction description restrictions matched with the task types into the initial task instructions to construct instruction instances of different task types, and finally combine the question-answer task samples with the instruction instances of the different task types respectively to obtain instruction fine-tuning data. The whole process starts from a question-answer task sample in structured text form and adds diversified description restrictions matched with the task types to the initial task instructions, yielding instruction fine-tuning data better suited to the fine-tuning stage; because this data contains more specialized task descriptions, it can accurately meet task requirements in different fields and improves the training effect of the model in the fine-tuning stage.
The above description is merely an overview of the technical solutions of the present application, which may be implemented in accordance with the content of the specification in order that the technical means of the present application may be more clearly understood. The objects, features, and advantages above, and others, will be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a flow chart illustrating a method for generating instruction fine-tuning data according to an embodiment of the present application;
FIG. 2 is a flow chart of step 101 of FIG. 1;
FIG. 3 is a flow chart of step 102 of FIG. 1;
FIG. 4 is a flow chart illustrating a method for generating instruction fine-tuning data according to another embodiment of the present application;
FIG. 5 is a flow chart of step 103 of FIG. 1;
FIG. 6 is a schematic structural diagram of a device for generating instruction fine-tuning data according to an embodiment of the present application;
FIG. 7 is a schematic diagram of the apparatus structure of a computer device according to an embodiment of the present application.
Detailed Description
The present disclosure will now be discussed with reference to several exemplary embodiments. It should be understood that these embodiments are discussed only to enable those of ordinary skill in the art to better understand and thus practice the teachings of the present invention, and are not meant to imply any limitation on the scope of the invention.
As used herein, the term "comprising" and variants thereof are to be interpreted as open-ended terms meaning "including but not limited to". The term "based on" is to be interpreted as "based at least in part on". The terms "one embodiment" and "an embodiment" are to be interpreted as "at least one embodiment". The term "another embodiment" is to be interpreted as "at least one other embodiment".
In the related art, the fine-tuning stage uses large-scale instruction fine-tuning data to fine-tune the pre-trained language model, a process closely related to supervised fine-tuning and multi-task prompt training. Instruction fine-tuning data in question-and-answer form can be built by random combination through an instruction generation mechanism. However, although a large number of training examples are formatted by adding instructions, these examples mainly come from public natural language processing data sets; the resulting instruction fine-tuning data lacks diversity in task description, can hardly meet task requirements in different fields accurately, and has limited effect on improving the model.
To solve this problem, the present embodiment provides a method for generating instruction fine-tuning data, as shown in FIG. 1. The method is applied to a server side of model training and includes the following steps:
101. And acquiring a question-answer task sample in a structured text form.
Typically, the language model training process requires a huge set of text data to be prepared as training samples; the text may be unstructured data from different fields such as web pages, databases, and news. To improve the readability, understandability, and operability of the text, after unstructured text data in different fields is acquired, a question-answer task sample in structured text form can be produced by rewriting the text according to specific formats and rules. Certain principles and methods need to be followed when writing the structured text: first, the structure and content of the question-and-answer sample are clearly determined; second, the text is divided according to its content so that its arrangement is clear; finally, the content is expressed in concise, clear language, and information can be presented as lists, tables, charts, and the like to improve readability and understandability.
Considering that the question-answer task samples obtained in structured text form differ across question-answer task types, specifically, the question-answer task types to be created, such as classification, keyword extraction, and reading comprehension, are first determined, and question-answer task samples in structured text form are then extracted from the unstructured text data according to the task type. For example, for a classification task, text content with multiple categories can be extracted from the unstructured text data as question-answer task samples; for a keyword-extraction task, paragraph content can be extracted from the unstructured text as question-answer task samples. The text content extracted differs by task type, and this is not limited here.
It should be noted that the question-answer task sample here is not a text formed by questions and answers; rather, a part of the text selected from the unstructured text data is used as the question-answer task sample. The sample itself contains no questions, but corresponding questions and answers can be designed according to its content.
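The sampling step above can be sketched minimally as follows. The field names, task-type labels, and paragraph-based splitting are illustrative assumptions, not prescribed by the patent:

```python
def extract_task_samples(raw_text: str, task_type: str) -> list[dict]:
    """Split unstructured text into paragraph-level question-answer task
    samples tagged with an assumed task type (e.g. "summary", "keyword")."""
    paragraphs = [p.strip() for p in raw_text.split("\n\n") if p.strip()]
    return [
        {"task_type": task_type, "paragraph_id": i, "content": para}
        for i, para in enumerate(paragraphs)
    ]

docs = "Deep learning models need data.\n\nFine-tuning adapts them to tasks."
print(extract_task_samples(docs, "summary"))
```

Each sample carries only source content, matching the note above that questions are designed later from the sample's content.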
The execution body of this embodiment may be a device or equipment for generating instruction fine-tuning data, which can be configured at the server responsible for generating the instruction fine-tuning data. Question-answer task samples in structured text form are acquired, question-answer structures for different task requirements are designed from those samples, and instruction fine-tuning data is then generated from the instruction instances constructed on those structures, so that the task requirements of different model-training scenarios are met and the effect of model training is improved.
102. And constructing initial task instructions of different task types according to the question-answer task samples.
It will be appreciated that constructing task instructions is important during the model fine-tuning phase: the task instructions help the model understand and learn the specific requirements of the question-answering task.
Considering the difference between the input and output formats of models: for a question-answering task, the input is usually a question and the output is the corresponding answer. After the question-answering task type is selected, a question representation is first defined, which may be natural language text or a structured form, for example converting the question into keywords or entities. An answer identification is then defined; the answer may be a single word, phrase, or sentence, or an entity or relation extracted from the text. Indexes for evaluating the model, such as accuracy and recall, are further defined. Finally, input and output formats acceptable to the model are extracted from the question-answer task samples according to the question representation and answer identification, giving initial task instructions of different task types. For example, taking a paper as the question-answer task sample and summary generation as the question-answering task: the task can define different forms of question expression for summary generation and define the answer as the text sentences of a paragraph. Correspondingly, the input and output formats acceptable for abstracting a summary from the paper through question expression and answer identification may be: "generate a summary of _", "generate a summary with _ as the topic", "generate a summary, the subject is _", "given the subject _, generate a summary", and the like, where "_" is the filling location of the paper title.
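The template idea above can be sketched as follows; the template wording and function name are paraphrased assumptions, with `{title}` standing in for the "_" filling location:

```python
# Illustrative summary-generation templates; each "{title}" slot mirrors
# the "_" placeholder where the paper title is filled in.
SUMMARY_TEMPLATES = [
    "Generate a summary of {title}",
    "Generate a summary with {title} as the topic",
    "Generate a summary; the subject is {title}",
    "Given the subject {title}, generate a summary",
]

def build_initial_instructions(title: str) -> list[str]:
    """Fill each template's slot to obtain initial task instructions."""
    return [t.format(title=title) for t in SUMMARY_TEMPLATES]

print(build_initial_instructions("Instruction Tuning for LLMs"))
```

Using several phrasings per task type is what gives the initial instructions their multiple description modes.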
103. And adding instruction description limits matched with the task types into the initial task instruction according to the task types, and constructing instruction instances of different task types.
It can be understood that different task types have different limitations in actual application scenarios; for example, a summary-generation question-answering task generally has field limitations, word-count limitations, language-style limitations, and the like. Here, limitation information matched with the task type can be obtained according to the task type, and after being formulated as an instruction description limitation it is added into the initial task instruction to construct the instruction instance of the corresponding task type.
The limitation information that may be added to the initial task instruction may include, but is not limited to, the following: subject (discipline) limitations, document-category (papers, journals, etc.) limitations, language-style limitations, similar-word substitutions, generated-word-count limitations, title-level limitations, and so on. For discipline-category limitations, an instruction description limitation of the discipline can be added into the initial task instruction, with the discipline category divided into a major-category limitation and a subdivision-field limitation. For document-type limitations, an instruction description limitation of the document type can be added into the initial task instruction, and paragraph content of the specified document type can be updated or generated according to the different writing characteristics of different document types (papers, journals, etc.). For language-style limitations, an instruction description limitation of language style can be added into the initial task instruction, and the format of the output content is adjusted by specifying the language style, for example: summarize the following content in the style of poetry, in the style of a fairy tale, in the style of a reference book, and so on. For word-count limitations, a word-count instruction description limitation can be added into the initial task instruction, mainly specifying upper and lower limits on the word count for generation-class or induction-class question-answering tasks, for example: summarize the following content in no more than 200 words. For title-level limitations, an instruction description limitation of the title level can be added into the initial task instruction, mainly for sub-title paragraph-generation and outline-generation question-answering tasks; the title level is attached to the title in the instruction fine-tuning data through special character identifiers in the structured data, which increases the complexity of the instruction fine-tuning data. For similar-word replacement, a word list can be constructed to replace keywords in the initial task instruction; for example, the descriptor "outline" in an outline-generation question-answering task can be replaced by: outline, schema, summary, and the like.
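A minimal sketch of appending such limitation clauses to an initial task instruction; the clause wording and the set of supported limits are assumptions for illustration only:

```python
def add_constraints(instruction: str, limits: dict) -> str:
    """Append instruction description limitations (word count, language
    style, document type) matched to the task type."""
    clauses = []
    if "max_words" in limits:
        clauses.append(f"in no more than {limits['max_words']} words")
    if "style" in limits:
        clauses.append(f"in the style of {limits['style']}")
    if "doc_type" in limits:
        clauses.append(f"treating the source as a {limits['doc_type']}")
    if not clauses:
        return instruction
    return instruction + ", " + ", ".join(clauses)

print(add_constraints("Summarize the following content",
                      {"max_words": 200, "style": "poetry"}))
```

An initial task instruction plus its added limitations together form one instruction instance of that task type.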
104. And respectively combining the question-answer task samples with the instruction examples of different task types to obtain instruction fine-tuning data.
Considering the differences among task types, a single question-answer task sample can hardly meet task requirements of different fields and lacks diversity of task description; therefore instruction instances of different task types are constructed. An instruction instance can comprise an instruction description in natural-language sentences or phrases that describes the task to be executed, which may be a specific action or an instruction target to be executed by the model. The instruction instances of different task types are then used as question-answer descriptions together with the question-answer task samples to form instruction fine-tuning data; the answer corresponding to a question may be the question-answer task sample itself or may be derived from it.
In an actual application scenario, a large number of instructions and operations are included in the instruction instances of different task types, and each instruction and operation corresponds to labeling information, such as instruction type, operation type, and instruction source. To facilitate selecting the corresponding answers in the instruction fine-tuning data, labeling information can be added to the question-answer task samples according to the instruction instances of different task types, so that the question-answer task samples are combined with the instruction instances of different task types using the labeling information as the basis of association, obtaining the instruction fine-tuning data.
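The combination step can be sketched as a join on shared labeling information; using `task_type` as the association key and the `instruction`/`input`/`output` record shape are assumptions, not specified by the patent:

```python
def combine(instances: list[dict], samples: list[dict]) -> list[dict]:
    """Pair each instruction instance with every task sample carrying the
    same labeling information, producing instruction fine-tuning records."""
    records = []
    for inst in instances:
        for sample in samples:
            if inst["task_type"] == sample["task_type"]:
                records.append({
                    "instruction": inst["instruction"],
                    "input": sample["content"],
                    "output": sample.get("answer", ""),
                })
    return records

instances = [{"task_type": "summary", "instruction": "Summarize the text"}]
samples = [{"task_type": "summary", "content": "Long text.", "answer": "Short."}]
print(combine(instances, samples))
```

The nested loop mirrors combining each sample with the instruction instances of each task type "respectively".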
Further, to enhance the diversity and generalization capability of the data, data-enhancement techniques such as sentence reorganization and random insertion may be used to adjust the question-answer task samples corresponding to the instruction instances, forming more diversified instruction fine-tuning data.
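As one hedged illustration of the sentence-reorganization idea (a toy period-based sentence splitter, not the patent's method):

```python
import random

def shuffle_sentences(text: str, seed: int = 0) -> str:
    """Create a reorganized variant of a sample by shuffling its sentences;
    the seed makes the augmentation reproducible."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    rng = random.Random(seed)
    rng.shuffle(sentences)
    return ". ".join(sentences) + "."

print(shuffle_sentences("First point. Second point. Third point.", seed=1))
```

The variant keeps the same sentence content while varying surface order, which is what adds diversity without changing the labeling information.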
Compared with the prior-art approach of randomly combining instruction fine-tuning data in question-answer form through an instruction generation mechanism, the method for generating instruction fine-tuning data provided by this embodiment first acquires question-answer task samples in structured text form, constructs initial task instructions of different task types from those samples, adds instruction description restrictions matched with each task type to the initial task instructions to construct instruction instances of different task types, and finally combines the question-answer task samples with the instruction instances of different task types, respectively, to obtain the instruction fine-tuning data. The whole process starts from question-answer task samples in structured text form and adds diversified description restrictions matched with the task types to the initial task instructions, extracting instruction fine-tuning data better suited to the fine-tuning stage. Because this instruction fine-tuning data contains more specialized task descriptions, it can accurately meet task requirements in different fields and improves the training effect of the model in the fine-tuning stage.
In the above embodiment, considering the diversity of the question-answer task samples across domain types, step 101 specifically includes the following steps, as shown in fig. 2:
201. Obtain structured text data of different domain types.
202. Extract content from the structured text data according to the field content contained in that domain type, obtaining question-answer task samples in structured text form.
It may be understood that a question-answer task in structured text form is essentially a text expression mode: unstructured text data is expressed as a question-answer task sample through a preset structured form. The specific structured form may be determined by the field of the text data. For example, for academic papers, a question-answer task sample in structured text form may be formed by selecting content such as the abstract, outline, and paragraphs from the unstructured text; for tool books, it may be formed by selecting content such as term interpretations and detailed flows.
The structured text data of different domain types may include, but is not limited to, paper structured text data and tool-book structured text data, and the field content extracted differs by domain type: for academic-paper structured text data, content such as the abstract, outline, and paragraphs is selected as the question-answer task sample, while for tool-book structured text data, content such as noun paraphrases and detailed flows is selected.
Further, considering the diversity of text data in the question-answer task samples, identical text data may be filtered out. In addition, to ensure the quality of the question-answer task samples, text data that is too long or too short may also be filtered out.
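The two filters above (exact-duplicate removal and length screening) can be sketched as a single pass. The length thresholds are illustrative assumptions; the patent does not specify values.

```python
# Sketch of the sample-quality filters: drop exact duplicates, then drop
# samples whose text is too short or too long. Thresholds are assumptions.

def filter_samples(samples, min_len=20, max_len=2000):
    seen, kept = set(), []
    for text in samples:
        if text in seen:                             # identical text data
            continue
        if not (min_len <= len(text) <= max_len):    # too short / too long
            continue
        seen.add(text)
        kept.append(text)
    return kept

data = ["ok " * 20, "ok " * 20, "tiny", "x" * 5000]
print(len(filter_samples(data)))  # → 1
```

Running the filters before instruction construction keeps low-quality samples from being multiplied across every instruction instance they would otherwise be paired with.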
In the above embodiment, in order to accurately construct the initial task instructions, step 102 specifically includes the following steps, as shown in fig. 3:
301. Select key fields for the task description from the question-answer task sample according to the field content it contains.
302. Construct initial task instructions of different task types from the key fields of the task description.
In this embodiment, the format design of the task instruction is an important factor affecting the generalization performance of a large-scale language model, and the task description is key to how the model understands the task. Based on the question-answer task samples, initial task instructions can be built for the various question-answer tasks of different data sets; constructing these initial task instructions helps the model better understand and solve specific question-answer tasks, improving its performance on them.
The key fields of the task description required by question-answer task samples of different task types during training differ: for example, the key fields of the abstract-generation task type include "abstract" and "generate", while the key fields of the paragraph-writing task type are "paragraph" and "paragraph writing"; the key fields of the task description are then used to construct initial task instructions of different types. When constructing initial task instructions of different task types from the key fields of the task description, question-answer tasks of different task types can be set according to the key fields, multiple description modes of the key fields for the different task types determined, and finally the initial task instructions of different task types constructed from those description modes. Taking the technical-flow interpretation task type as an example, initial task instructions may be constructed such as "What skills does _ require?", "What should be noted when performing _?", and "What are the reasons for and precautions of _?", where "_" is the fill-in slot for the technical flow name.
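Steps 301-302 can be sketched as a lookup from task type to its description modes. The template table is an illustrative assumption echoing the technical-flow examples above; the patent does not fix an exact template set.

```python
# Sketch of steps 301-302: several description modes (templates) per task
# type, from which the initial task instructions are constructed.
# "_" marks the fill-in slot; the template wording is an assumption.

TEMPLATES = {
    "technical_flow": [
        "What skills does _ require?",
        "What should be noted when performing _?",
        "What are the reasons for and precautions of _?",
    ],
    "abstract_generation": [
        "Generate an abstract for _.",
        "Summarize _ in an abstract.",
    ],
}

def initial_task_instructions(task_type):
    """Return the initial task instructions for one task type."""
    return TEMPLATES.get(task_type, [])

print(initial_task_instructions("technical_flow"))
```

Having several description modes per task type is what supplies the task-description diversity the method relies on.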
It may be appreciated that different initial task instructions can be generated for different task types. For question-answer task samples of the paper class, the initial question-answer task instructions may cover abstract generation, subtitle-paragraph generation, paragraph writing, outline generation, paragraph induction, classification induction, and the like; for question-answer task samples of the tool-book class, they may cover concept interpretation, technical-flow interpretation, and the like.
In the above embodiment, in order to add diversified scenario restriction information to the question-answer task text and generate complex, varied instruction instances for the different task instructions, before step 103 the method further includes the following steps, as shown in fig. 4:
401. Acquire, in advance, the text restriction features related to the question-answer task samples of different task types.
402. Determine, from the text restriction features, the restriction information of the question-answer task samples of different task types on each text restriction feature.
The text restriction features corresponding to different task types differ, and accordingly so does the restriction information added to the initial task instruction. For task types of the science-and-technology abstract class, the text restriction features related to the question-answer task sample mainly include the discipline field, the abstract word count, and the like; correspondingly, the restriction information of the question-answer task sample on the discipline field is determined to be the science-and-technology field, and its restriction information on the abstract word count to be within 300 words.
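Steps 401-402 can be sketched as a per-task-type feature list plus a lookup of the sample's values on those features. The feature names and values mirror the science-abstract example above and are otherwise assumptions.

```python
# Sketch of steps 401-402: which text restriction features each task type
# uses, and the restriction info one sample carries on those features.
# Feature names and values are illustrative assumptions.

RESTRICTION_FEATURES = {
    "abstract_generation": ["discipline_field", "word_limit"],
    "paragraph_induction": ["word_limit", "language_style"],
}

def restriction_info(sample, task_type):
    """Collect the sample's values for the features this task type uses."""
    return {f: sample[f] for f in RESTRICTION_FEATURES[task_type] if f in sample}

sample = {"discipline_field": "science and technology", "word_limit": 300}
print(restriction_info(sample, "abstract_generation"))
```

Separating the feature list (per task type) from the values (per sample) lets the same sample yield different restriction info under different task types.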
Accordingly, in the above embodiment, as shown in fig. 5, step 103 includes the following steps:
501. Determine, according to the task type, the target text restriction features to be added to the initial task instruction.
502. Acquire the restriction information of the question-answer task sample on the target text restriction features.
503. Add the restriction information on the target text restriction features to the initial task instruction in the form of an instruction description, constructing instruction instances of different task types.
It will be appreciated that the text restriction features to be added to the initial task instruction differ across task types, as does the corresponding restriction information on those features. For example, for the abstract-generation task type, the text restriction features added to the initial task instruction may include word count, field, and the like; for the induction task type, they may include word-count restriction, language style, and the like.
For example, for a question-answer task sample from an academic paper, taking the abstract-generation task type as an example, a complex task instruction is constructed as: "In the medical field, generate an abstract for the following question"; taking the paragraph-writing task type as an example: "Within 200 words, write the following paragraph"; taking the paragraph-induction task type as an example: "In a documentary style, summarize the following within 20 words". For a question-answer task sample from a tool book, taking the technical-flow interpretation task type as an example, a complex task instruction is constructed as: "Within 500 words, what does _ correspond to?".
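Step 503 itself reduces to composing the restriction phrases with the initial instruction. The prefix-style composition below mimics the "In the medical field, generate an abstract ..." examples above; it is one possible realization, not the patent's required format.

```python
# Sketch of step 503: add restriction info to an initial task instruction
# as an instruction description, producing a complex instruction instance.
# Prefix composition is an assumption mimicking the examples in the text.

def build_instance(initial_instruction, restrictions):
    """Prepend restriction phrases to the initial instruction."""
    prefix = ", ".join(restrictions)
    # lower-case the first letter so the joined sentence reads naturally
    return f"{prefix}, {initial_instruction[0].lower()}{initial_instruction[1:]}"

instance = build_instance(
    "Generate an abstract for the following question.",
    ["In the medical field"],
)
print(instance)  # → In the medical field, generate an abstract for the following question.
```

Because the restrictions are passed as a list, several features (field, word count, style) can be stacked onto one initial instruction to yield the "complex and varied" instances the embodiment describes.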
Further, after the instruction fine-tuning data is constructed, fine-tuning of the original language model with the instruction fine-tuning data begins. Correspondingly, after step 104, the method further includes the following step:
The instruction fine-tuning data is input into the network model output by the pre-training stage, and the network model is fine-tuned using a pre-configured sequence loss function, obtaining a fine-tuned network model.
Typically, the network model output by the pre-training stage is trained on large-scale unsupervised data and may perform poorly on specific tasks; the instruction fine-tuning data provides task guidance, improving the model's performance and enabling it to better adapt to specific task requirements. The instruction fine-tuning stage improves model performance by introducing human instructions: in this stage, the instruction fine-tuning data guides the pre-trained network model on the corresponding task types according to the instruction instances of different task types, so that through learning the model can generate correct outputs following the instructions, improving its performance and adaptability.
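The "pre-configured sequence loss function" mentioned above is not specified in the text; a common choice for this kind of fine-tuning is the mean token-level cross-entropy over the target sequence, sketched below. The tiny probability values are fabricated purely for demonstration.

```python
import math

# Illustrative token-level sequence loss: mean negative log-likelihood of
# the gold target tokens. This is one common choice; the patent does not
# name a specific loss, so this is an assumption.

def sequence_loss(token_probs):
    """token_probs: probability the model assigned to each gold target token."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

# Probabilities a hypothetical model assigns to 3 gold output tokens.
print(round(sequence_loss([0.9, 0.8, 0.95]), 4))  # → 0.1266
```

Averaging over the target tokens only (not the instruction tokens) is the usual way such a loss is applied in instruction fine-tuning, so the model is penalized for its answer rather than for reproducing the prompt.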
Illustratively, taking the outline-generation task type as an example, the input and output of the network model in the instruction fine-tuning stage are as follows:
Task id:0
Input: "Given the subject 'How far is Chinese contemporary art from us', generate an outline containing at most 3 primary titles."
Output: "1. Is Chinese contemporary art the 'new national quintessence'?\n2. Contemporary art has its specificity\n"
Task id:0
Input: the outline is generated, and the theme is designed for consumption motivations and product packaging of the college students with shallow analysis, and comprises 10 secondary titles at most. "
Output: "1. Consumption motivations\n2. College students' consumption motivations\n3. College students' consumption motivations and product packaging design strategies\n4. Summary\n1. Emotional motivation, covering both moods and emotions.\n2. Rational motivation.\n3. Patronage motivation.\n1. Grasp the demand and position correctly.\n2. Stimulate the purchase motivation.\n3. Follow fashion.\n4. Personalization.\n5. Emotional association.\n"
Illustratively, taking the subtitle-paragraph task type as an example, the input and output of the network model in the instruction fine-tuning stage are as follows:
task id:0
Input: "Generate a subtitle paragraph of about 300 words; the theme is the 'Outline of the 14th Five-Year Plan for National Economic and Social Development of Jingdezhen City and the Long-Range Objectives Through the Year 2035', and the subtitle is: Section 2, Developing emerging industries."
Output: "Seize frontier fields. Targeting subdivided fields such as flexible electronics, micro-nano optics, new-energy equipment, biotechnology, and life science, lay out future frontier industries. Gather strategic emerging industries such as high-end equipment manufacturing, new materials, and biomedicine; break through a batch of key core technologies, promote the industrialization of a batch of technological achievements, plan a batch of pilot demonstration projects, and create a batch of major application scenarios, improving the volume and proportion of the emerging industries, forming scale agglomeration and synergy effects, and cultivating a new engine for future development."
Illustratively, taking the concept-interpretation task type as an example, the input and output of the network model in the instruction fine-tuning stage are as follows:
task id:0
Input: "What does 'guzici' (drum lyric) mean?"
Output: "'Guzici' (drum lyric) is a form of talking-and-singing art of the Song dynasty. It is similar to bianwen (transformation texts); its pieces are short, so most have simple rhymes and plain narration, each performance running only about ten chapters, and the name derives from the drum accompaniment during singing."
Further, as a specific implementation of the methods of figs. 1-5, an embodiment of the present application provides a device for generating instruction fine-tuning data, as shown in fig. 6. The device includes: a first acquisition unit 61, a construction unit 62, an adding unit 63, and a combining unit 64.
A first obtaining unit 61, configured to obtain a question-answer task sample in a structured text form;
a construction unit 62, configured to construct initial task instructions of different task types according to the question-answer task samples;
an adding unit 63, configured to add instruction description restrictions matched with the task types in the initial task instruction according to the task types, and construct instruction instances of different task types;
and the combining unit 64 is configured to combine the question-answer task samples with the instruction instances of the different task types, respectively, to obtain instruction fine-tuning data.
Compared with the prior-art approach of randomly combining instruction fine-tuning data in question-answer form through an instruction generation mechanism, the device for generating instruction fine-tuning data provided by this embodiment acquires question-answer task samples in structured text form, constructs initial task instructions of different task types from those samples, adds instruction description restrictions matched with each task type to construct instruction instances of different task types, and finally combines the question-answer task samples with the instruction instances of different task types, respectively, to obtain the instruction fine-tuning data. The whole process starts from question-answer task samples in structured text form and adds diversified description restrictions matched with the task types to the initial task instructions, extracting instruction fine-tuning data better suited to the fine-tuning stage. Because this instruction fine-tuning data contains more specialized task descriptions, it can accurately meet task requirements in different fields and improves the training effect of the model in the fine-tuning stage.
In a specific application scenario, the first obtaining unit 61 is specifically configured to obtain structured text data of different domain types, and to extract content from the structured text data according to the field content contained in that domain type, obtaining question-answer task samples in structured text form.
In a specific application scenario, the construction unit 62 is specifically configured to select the key fields of the task description from the question-answer task sample according to the field content it contains, and to construct initial task instructions of different task types from those key fields.
In a specific application scenario, the construction unit 62 is further specifically configured to set question-answer tasks of different task types according to the key fields of the task description; determine multiple description modes of the key fields for the different task types; and construct initial task instructions of different task types from those description modes.
In a specific application scenario, the apparatus further includes:
the second obtaining unit is used for obtaining text restriction characteristics related to question-answer task samples in different task types in advance before adding instruction description restrictions matched with the task types into the initial task instruction according to the task types and constructing instruction examples of different task types;
And the determining unit is used for determining the limit information of the question-answer task samples with different task types on each text limit feature according to the text limit features.
In a specific application scenario, the adding unit 63 is specifically configured to determine, according to the task type, the target text restriction features to be added to the initial task instruction; acquire the restriction information of the question-answer task sample on the target text restriction features; and add the restriction information on the target text restriction features to the initial task instruction in the form of an instruction description, constructing instruction instances of different task types.
In a specific application scenario, the apparatus further includes:
And a fine-tuning unit, configured to input, after the question-answer task samples have been combined with the instruction instances of the different task types to obtain the instruction fine-tuning data, the instruction fine-tuning data into the network model output by the pre-training stage, and to fine-tune the network model using a pre-configured sequence loss function, obtaining a fine-tuned network model.
It should be noted that, for other descriptions of the functional units of the device for generating instruction fine-tuning data provided in this embodiment, reference may be made to the corresponding descriptions of figs. 1-5, which are not repeated here.
Based on the methods shown in figs. 1-5, an embodiment of the present application correspondingly further provides a storage medium on which a computer program is stored; when executed by a processor, the program implements the method for generating instruction fine-tuning data shown in figs. 1-5.
With such an understanding, the technical solution of the present application may be embodied in the form of a software product stored on a non-volatile storage medium (e.g., CD-ROM, USB flash disk, portable hard disk, etc.), including instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform the methods described in the various implementation scenarios of the present application.
Based on the methods shown in figs. 1-5 and the virtual device embodiment shown in fig. 6, to achieve the above objective, an embodiment of the present application further provides an entity device for generating instruction fine-tuning data, which may specifically be a computer, a smartphone, a tablet computer, a smartwatch, a server, or a network device. The entity device includes a storage medium and a processor; the storage medium stores a computer program; and the processor executes the computer program to implement the method for generating instruction fine-tuning data shown in figs. 1-5.
Optionally, the entity device may further include a user interface, a network interface, a camera, radio-frequency (RF) circuitry, sensors, audio circuitry, a Wi-Fi module, and the like. The user interface may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and may optionally also include a USB interface, a card-reader interface, and the like. The network interface may optionally include a standard wired interface, a wireless interface (e.g., a Wi-Fi interface), and the like.
In an exemplary embodiment, referring to fig. 7, the entity device includes a communication bus, a processor, a memory, a communication interface, an input/output interface, and a display device, and the functional units can communicate with one another through the bus. The memory stores a computer program, and the processor executes the program stored in the memory to perform the method for generating instruction fine-tuning data of the above embodiment.
It will be appreciated by those skilled in the art that the entity device structure for generating instruction fine-tuning data provided in this embodiment does not limit the entity device, which may include more or fewer components, combine certain components, or arrange the components differently.
The storage medium may also include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the above entity device for generating instruction fine-tuning data, supporting the execution of the information-handling program and other software and/or programs. The network communication module is used to realize communication among the components within the storage medium and with other hardware and software in the information-processing entity device.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by software plus a necessary general hardware platform, or by hardware. By applying the technical solution of the present application, compared with the currently existing approaches, diversified description restrictions matched with the task types are added to the initial task instructions to extract instruction fine-tuning data better suited to the fine-tuning stage; because this instruction fine-tuning data contains more specialized task descriptions, it can accurately meet task requirements in different fields and improves the training effect of the model in the fine-tuning stage.
Those skilled in the art will appreciate that the drawings are merely schematic illustrations of one preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to practice the present application. Those skilled in the art will appreciate that modules in an apparatus in an implementation scenario may be distributed in an apparatus in an implementation scenario according to an implementation scenario description, or that corresponding changes may be located in one or more apparatuses different from the implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The foregoing serial numbers of the embodiments are merely for description and do not represent the advantages or disadvantages of the implementation scenarios. The foregoing disclosure is merely a few specific implementation scenarios of the present application; however, the present application is not limited thereto, and any variation conceivable by a person skilled in the art shall fall within the protection scope of the present application.
Claims (10)
1. A method of generating instruction fine-tuning data, comprising:
acquiring a question-answer task sample in a structured text form;
constructing initial task instructions of different task types according to the question-answer task samples;
adding instruction description limits matched with the task types into the initial task instruction according to the task types, and constructing instruction instances of different task types;
and respectively combining the question-answer task samples with the instruction examples of different task types to obtain instruction fine-tuning data.
2. The method of claim 1, wherein the obtaining a sample of a question-answer task in the form of structured text comprises:
obtaining structured text data of different domain types;
and extracting the content of the structured text data according to the field content contained in the structured text data in the field type to obtain a question-answer task sample in the structured text form.
3. The method of claim 1, wherein constructing initial task instructions of different task types from the question-answer task samples comprises:
selecting key fields of task description from the question-answering task sample according to field content contained in the question-answering task sample;
and constructing initial task instructions of different task types according to the key fields of the task description.
4. A method according to claim 3, wherein constructing initial task instructions of different task types from key fields of the task description comprises:
setting question-answering tasks of different task types according to the key fields of the task description;
determining multiple description modes of key fields on different task types;
and constructing initial task instructions of different task types according to various description modes of the key fields on different task types.
5. The method of any of claims 1-4, wherein prior to said adding instruction description restrictions matching the task type in the initial task instruction according to task type, constructing instruction instances of different task types, the method further comprises:
Acquiring text limiting characteristics related to question-answer task samples in different task types in advance;
and determining the limiting information of the question-answer task samples of different task types on each text limiting feature according to the text limiting features.
6. The method of claim 5, wherein adding instruction description limits matching the task types in the initial task instruction according to the task types, and constructing instruction instances of different task types includes:
determining target text limiting characteristics to be added to the initial task instruction according to the task type;
acquiring limiting information of the question-answering task sample on a target text limiting feature;
and adding the limiting information on the target text limiting feature into the initial task instruction in an instruction description mode, and constructing instruction instances of different task types.
7. The method according to any one of claims 1-4, wherein after said combining the question-answer task samples with the instruction instances of the different task types, respectively, to obtain instruction fine-tuning data, the method further comprises:
and inputting the instruction fine tuning data into a network model output in a pre-training stage, and fine tuning the network model by utilizing a pre-configured sequence loss function to obtain a fine-tuned network model.
8. A device for generating instruction fine-tuning data, comprising:
the first acquisition unit is used for acquiring a question-answer task sample in a structured text form;
the construction unit is used for constructing initial task instructions of different task types according to the question-answer task samples;
the adding unit is used for adding instruction description limits matched with the task types into the initial task instruction according to the task types and constructing instruction instances of different task types;
and the combination unit is used for respectively combining the question-answer task samples with the instruction examples of different task types to obtain instruction fine-tuning data.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, carries out the steps of the method of generating instruction fine tuning data according to any one of claims 1 to 7.
10. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, realizes the steps of the method of generating instruction fine-tuning data according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311278107.8A CN117312497A (en) | 2023-09-28 | 2023-09-28 | Method, device and equipment for generating instruction fine tuning data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117312497A true CN117312497A (en) | 2023-12-29 |
Family
ID=89296790
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311278107.8A Pending CN117312497A (en) | 2023-09-28 | 2023-09-28 | Method, device and equipment for generating instruction fine tuning data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117312497A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||