CN117708307A

CN117708307A - Method and device for fusing micro-tuning and Adapter of large language model

Info

Publication number: CN117708307A
Application number: CN202410170139.4A
Authority: CN
Inventors: 王震; 高德宏; 马宇飞; 蔡晓妍; 杨黎斌
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2024-02-06
Filing date: 2024-02-06
Publication date: 2024-03-15
Anticipated expiration: 2044-02-06
Also published as: CN117708307B

Abstract

The invention discloses a large language model tuning and Adapter fusion method and device, and relates to the field of deep learning. The method is used for solving the problems of large consumption and poor data quality caused by the fact that the data collection is required to be manually carried out in the construction of the existing multi-mode data set. The method comprises the following steps: collecting a plurality of question-answer data sets and dialogue data sets from a set network platform; respectively performing LoRA-adapter fine tuning on the question-answer data set and the dialogue data set to sequentially obtain a question-answer large language model, a question-answer negative log-likelihood loss function, a dialogue large language model and a dialogue negative log-likelihood loss function; obtaining an ideal loss function, ideal fusion weight and a first ideal parameter of the question-answer data set and the dialogue data set in an ideal state; obtaining the optimal parameters of the question-answering LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters; and obtaining the general LoRA-adapter according to the optimal parameters of the question-answering LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters.

Description

Method and device for fusing micro-tuning and Adapter of large language model

Technical Field

The invention relates to the field of deep learning, in particular to a large language model fine tuning and Adapter fusion method and device.

Background

The training of the large language model has important scientific research and application value, can promote the performance of natural language processing tasks, improve the interactive experience of a dialogue system, and promote the universality of scientific research, technical innovation and artificial intelligence development. The large language model can learn rich language knowledge and grammar rules by training a massive corpus, so that the large language model has better performance in natural language processing tasks such as machine translation, text generation, text classification and the like. These models are able to understand and generate more accurate and fluent natural language. Large language models can be used to build intelligent dialog systems that provide more natural, accurate, and personalized replies by performing dialogues with users. The trained model can understand and generate human language, so that the requirements of users can be better met, and the interactive experience of a dialogue system is improved. Training a large language model requires processing massive amounts of data and huge computing resources, which is of great significance in promoting scientific research and technological innovation. In the process of training a large language model, many technical challenges, such as data processing, model design, training algorithms, etc., need to be solved, and the solution of these challenges has a positive driving effect on research and development in the related field.

The conventional scheme adopted by the current large language model training is as follows: and collecting a large amount of instruction fine tuning data, fusing the instruction fine tuning data to construct a large-scale data set, and fine tuning an open-source large language model on the data set. However, it seems impossible to fuse multiple data sets to construct one multi-functional data set, on the one hand, there is a possibility that contradictions may exist between different data sets, and it is difficult to evaluate the quality of data; on the other hand these datasets consist of examples of various specific tasks such as mathematics, coding, role playing, authoring, etc. If these datasets are blended and fine-tuned on this blended dataset, the performance of the large language model may be degraded or even severely degraded.

Disclosure of Invention

The embodiment of the invention provides a large language model tuning and Adapter fusion method and device, which can prevent the problem of performance degradation caused by conflict of different data sets in a semantic space.

The embodiment of the invention provides a large language model tuning and Adapter fusion method, which comprises the following steps:

collecting a plurality of question-answer data sets and dialogue data sets from a set network platform, and respectively performing LoRA-adapter fine tuning on the question-answer data sets and the dialogue data sets to sequentially obtain a question-answer large language model, a question-answer negative log-likelihood loss function, a dialogue large language model and a dialogue negative log-likelihood loss function;

obtaining an ideal loss function of the question-answer data set and the dialogue data set in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and initial fusion weights which are included based on fine adjustment of each LoRA-adapter, and obtaining an ideal fusion weight and a first ideal parameter which correspond to the ideal loss function according to the minimum value of the ideal loss function; wherein the first ideal parametric representation is added to all LoRA-adapters of the question-answer large language model and the dialogue large language model, respectively;

according to the ideal loss function, fine tuning is carried out on the question-answer LoRA-adapter corresponding to each question-answer data set and the dialogue LoRA-adapter corresponding to each dialogue data set, so as to respectively obtain the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters;

and obtaining the general LoRA-adapter according to the optimal parameters of the question-answering LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters.

Preferably, the performing LoRA-adapter fine tuning on the question-answer data set sequentially obtains a question-answer large language model and a question-answer negative log likelihood loss function, which specifically comprises:

training the question-answer data set to obtain a question-answer LoRA-adapter, and obtaining the question-answer large language model according to the question-answer LoRA-adapter and the question-answer data set;

obtaining the question-answer negative log-likelihood loss function according to the question-answer large language model and the token of the question-answer large language model;

the question-answer dataset, the question-answer large language model, and the question-answer negative log-likelihood loss function are as follows:

wherein,represent the firstA question-answer data set is provided for each question,represent the firstItem number of question-answer datasetThe information of the individual systems to be processed,represent the firstItem number of question-answer datasetA problem is that,represent the firstItem number of question-answer datasetThe reply is sent to the user in response to the user,representing question-answer datasetsIs provided for the length of (a),expressed in question and answer data setThe question and answer LoRA-adapter obtained by training is used,representation ofIs provided for the length of (a),representing large language model generationThe number of the token to be used in the process,a large language model is represented and the model is represented,the freeze parameters representing the large language model are displayed,representing a question-answer negative log-likelihood loss function.

Preferably, the performing the fine tuning of the dialogue data set to obtain a dialogue large language model and a dialogue negative log likelihood loss function in sequence specifically includes:

training the dialogue data set to obtain a dialogue LoRA-adapter, and obtaining a dialogue large language model according to the dialogue LoRA-adapter and the dialogue data set;

obtaining the dialogue negative log likelihood loss function according to the dialogue large language model and the token of the dialogue large language model;

the dialogue dataset, dialogue large language model, and the dialogue negative log likelihood loss function are as follows:

wherein,represent the firstThe number of sets of dialogue data,represent the firstIn wheel (C)The first dialogue data setThe number of queries to be made is,represent the firstIn wheel (C)The first dialogue data setThe reply is sent to the user in response to the user,representing a dialog data setIs provided for the length of (a),represented in a dialog datasetThe dialog LoRA-adapter obtained by the training,representing all the tokens belonging to the user query,the target mark is represented by a mark of the target,representing a dialog data setMiddle (f)The data contains the number of tokens and,representing a dialogue negative log-likelihood loss function,a large language model is represented and the model is represented,the freeze parameters representing the large language model.

Preferably, the ideal loss function is as follows:

the minimum of the ideal loss function is as follows:

wherein,representing the ideal loss function of the device,expressed in question and answer data setFine tuning the upper part to obtainIs used to determine the initial fusion weights of (1),represented in a dialog datasetFine tuning the upper part to obtainIs used to determine the initial fusion weights of (1),representing all of the first desired parameters,representing all of the ideal fusion weights that are to be used,a first desired parameter is indicated and,representing ideal fusion weights.

Preferably, the first ideal parameter is as follows:

the ideal fusion weights are as follows:

wherein,a first desired parameter is indicated and,expressed in question and answer data setThe first desired parameter obtained is fine-tuned up,represented in a dialog datasetThe first desired parameter obtained is fine-tuned up,representing the number of question-answer datasets,representing the number of sets of dialogue data,the ideal fusion weight is represented by the sum of the weights,representation ofIs used to determine the desired fusion weight of the (c),representation ofIs used for the optimal fusion weight of the (c).

Preferably, the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters are as follows:

wherein,the best parameters for the question-answer lorea-adapter are indicated,the best parameters for the dialog LoRA-adapter,the optimal fusion parameters are indicated to be the ones that are to be fused,representing a question-answer negative log-likelihood loss function,representing a dialogue negative log-likelihood loss function,expressed in question and answer data setThe initial fusion weights obtained are fine-tuned up,represented in a dialog datasetThe initial fusion weights obtained are fine-tuned up,represented in a dialog datasetThe resulting dialog LoRA-adapter is trained on.

The embodiment of the invention provides a large language model tuning and Adapter fusion device, which comprises the following components:

a first obtaining unit for collecting a plurality of question-answer data sets and dialogue data sets from the set network platform; respectively performing LoRA-adapter fine tuning on the question-answer data set and the dialogue data set to sequentially obtain a question-answer large language model, a question-answer negative log-likelihood loss function, a dialogue large language model and a dialogue negative log-likelihood loss function;

the second obtaining unit is used for obtaining an ideal loss function of the question-answer data set and the dialogue data set in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and initial fusion weights which are finely adjusted based on each LoRA-adapter, and obtaining ideal fusion weights and first ideal parameters which correspond to the ideal loss function according to the minimum value of the ideal loss function; wherein the first ideal parametric representation is added to all LoRA-adapters of the question-answer large language model and the dialogue large language model, respectively;

the third obtaining unit is used for carrying out fine adjustment on the question-answer LoRA-adapter corresponding to each question-answer data set and the dialogue LoRA-adapter corresponding to each dialogue data set according to the ideal loss function, so as to obtain the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters respectively;

and a fourth obtaining unit, configured to obtain a general LoRA-adapter according to the optimal parameter of the question-answering LoRA-adapter, the optimal parameter of the dialogue LoRA-adapter, and the optimal fusion parameter.

The embodiment of the invention provides a computer device, which comprises a memory and a processor, wherein the memory stores a computer program, and the computer program, when executed by the processor, enables the processor to execute the large language model refinement and Adapter fusion method described in any one of the above.

An embodiment of the present invention provides a computer readable storage medium storing a computer program, where the computer program when executed by a processor causes the processor to execute the large language model refinement and Adapter fusion method described in any one of the above.

The embodiment of the invention provides a large language model tuning and Adapter fusion method and device, wherein the method comprises the following steps: collecting a plurality of question-answer data sets and dialogue data sets from a set network platform; respectively performing LoRA-adapter fine tuning on the question-answer data set and the dialogue data set to sequentially obtain a question-answer large language model, a question-answer negative log-likelihood loss function, a dialogue large language model and a dialogue negative log-likelihood loss function; obtaining an ideal loss function of a question-answer data set and a dialogue data set in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and initial fusion weights which are included based on fine adjustment of each LoRA-adapter, and obtaining ideal fusion weights and first ideal parameters which correspond to the ideal loss function according to the minimum value of the ideal loss function; wherein the first ideal parametric representation is added to all LoRA-adapters of the question-answer large language model and the dialogue large language model, respectively; according to the ideal loss function, fine tuning is carried out on the question-answer LoRA-adapter corresponding to each question-answer data set and the dialogue LoRA-adapter corresponding to each dialogue data set, so as to respectively obtain the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters; and obtaining the general LoRA-adapter according to the optimal parameters of the question-answering LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters. According to the method, a plurality of instruction fine-tuning data sets are constructed, the consumption of a GPU (English: graphic Process Unit, chinese: graphic processor) is saved by utilizing a QLoRA quantization technology, a large language model training mode which saves the cost of computing resources and is high in quality is provided, and meanwhile, a multi-LoRA-adapter fusion mode based on Grid-Search (Chinese: parameter adjustment means) optimization is designed to fuse the trained LoRA-adapters. According to the method, the LoRA-adapter is fused, so that semantic space conflict caused by fusion of the data sets can be effectively avoided, and generalization performance of the large language model on a plurality of tasks is improved. The problem of performance degradation caused by conflict of different data sets in semantic space in the prior art is solved.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a large language model tuning and Adapter fusion method provided by an embodiment of the invention;

fig. 2 is a schematic structural diagram of a large language model tuning and Adapter fusion device according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Training a large language model not only aims at improving the performance of natural language processing tasks and improving the interactive experience of a dialogue system, but also has richer value and significance in scientific research and application fields.

Firstly, a large language model can learn rich language knowledge and grammar rules by training a massive corpus. The models can understand and generate more accurate and smoother natural language, and provide better performance for natural language processing tasks such as machine translation, text generation, text classification and the like. In the machine translation task, the large language model can more accurately understand the meaning of the source language and generate a more natural target language translation result. In a text generation task, a model can generate text content that is more logical and coherent. In the text classification task, the model can judge the type of the text more accurately, and the classification accuracy is improved.

Second, a large language model can be used to build an intelligent dialog system that provides a more natural, accurate and personalized reply by having a dialog with the user. This capability is very useful for chat robots, intelligent customer service, etc. in everyday life. The trained model can understand and generate human language, so that the requirements of users can be better met, and the interactive experience of a dialogue system is improved. The dialogue system can generate personalized replies through the model, so that the user can feel the same interaction experience as human dialogue, and the satisfaction degree of the user is enhanced.

In addition, training a large language model requires processing massive amounts of data and huge computing resources, which is of great significance in promoting scientific research and technological innovation. In training large language models, many technical challenges need to be addressed, such as data processing, model design, training algorithms, etc. The resolution of these challenges not only can drive the development of language models, but also helps in research and development of related fields. For example, by improving and optimizing the model, the efficiency and performance of the model can be improved, and technical support is provided for development and application of other natural language processing tasks.

Finally, training a large language model can provide intelligent natural language processing services and promote the development of the universality of artificial intelligence technology. These models can be applied to various fields such as education, medical treatment, finance, and the like. In the education field, the model can be used for assisting learning, intelligent answering and the like, and personalized learning resources and communication platforms are provided. In the medical field, the model can be used for assisting doctor diagnosis, intelligent medical record and the like, and the quality and efficiency of medical service are improved. In the financial field, the model can be used for intelligent customer service, risk management and the like, and more personalized and efficient financial services are provided.

In summary, training a large language model has important scientific research and application values, not only can improve the performance of natural language processing tasks and improve the interactive experience of a dialogue system, but also can promote the universality of scientific research, technical innovation and artificial intelligence development. By training a large language model, an intelligent natural language processing technology can be applied to various fields, and better intelligent service and solution can be provided for society.

Because of the conventional scheme adopted by the current large language model training, it seems impossible to combine multiple data sets to construct a multifunctional data set, and there is a possibility that contradictions may exist between different data sets, and it is difficult to evaluate the quality of data. Based on the above, the embodiment of the invention provides an efficient training method for constructing a high-quality and strong-capacity large language model. Cleaning and arranging a plurality of open source data on a Huggingface platform to obtain a plurality of different knowledge question-answer data sets and dialogue data sets, then independently training a LoRA-adapter on each data set by adopting QLoRA (Chinese: low-rank adaptation), and finally dynamically optimizing the fusion weights of the LoRA-adapters by utilizing Grid-Search.

FIG. 1 is a schematic flow chart of a large language model tuning and Adapter fusion method provided by an embodiment of the invention; as shown in fig. 1, the method comprises the steps of:

step 101, collecting a plurality of question and answer data sets and dialogue data sets from a set network platform; respectively performing LoRA-adapter fine tuning on the question-answer data set and the dialogue data set to sequentially obtain a question-answer large language model, a question-answer negative log-likelihood loss function, a dialogue large language model and a dialogue negative log-likelihood loss function;

step 102, obtaining ideal loss functions of a question-answer data set and a dialogue data set in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and initial fusion weights which are included based on fine adjustment of each LoRA-adapter, and obtaining ideal fusion weights and first ideal parameters which correspond to the ideal loss functions according to the minimum values of the ideal loss functions; wherein the first ideal parametric representation is added to all LoRA-adapters of the question-answer large language model and the dialogue large language model, respectively;

step 103, fine tuning is carried out on the question-answer LoRA-adapter corresponding to each question-answer data set and the dialogue LoRA-adapter corresponding to each dialogue data set according to the ideal loss function, so as to obtain the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters respectively;

and 104, obtaining the general LoRA-adapter according to the optimal parameters of the question-answering LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters.

It should be noted that, the embodiment of the invention provides a large language model tuning and Adapter fusion method, the execution main body of which is a processor.

In step 101, a plurality of question-answer data sets and dialogue data sets are collected from a setting network platform, where the setting network platform may be a Huggingface community, and in the embodiment of the present invention, the setting network platform is not specifically limited.

Specifically, after collecting a plurality of data sets from a set network platform, cleaning the plurality of data sets is required, and finally obtaining a plurality of question-answer data sets and a plurality of dialogue data sets after cleaning, in the embodiment of the present invention, the cleaning rule of the data sets is as follows:

1) Delete and ChatGPT (english: chat Generative Pre-trained Transformer) -3.5-Turbo, only the dialogue instance with GPT-4 is reserved; 2) Deleting the GPT-4 refused answer or directly interpreted dialog; 3) The deleted answer is GPT-4 null or GPT-4 missed answer dialogue; 4) Deleting dialogs containing toxic or illegal information; 5) Deleting a dialogue containing an OpenAI or ChatGPT typeface or replacing a dialogue containing an OpenAI or ChatGPT typeface with information ensuring that the model has the correct identity; 6) Deleting user questions with similarity to the reference questions being greater than 85%; 7) Lengthy dialog instances are separated into dialogs that match the model maximum context length.

Through the above situation rules, a plurality of question-answer data sets and a plurality of dialogue data sets required by the embodiment of the invention can be finally obtained.

In embodiments of the present invention, the question-answer dataset may be represented asThe session data set may be represented as。

Specifically, the question-answer dataset may be represented by formula (1):

（1）

wherein,representing the system information system message,a question query representing a user,representing a reply response of the artificial intelligence,represent the firstA question-answer data set is provided for each question,represent the firstItem number of question-answer datasetThe information of the individual systems to be processed,represent the firstItem number of question-answer datasetA problem is that,represent the firstItem number of question-answer datasetThe reply is sent to the user in response to the user,representing question-answer datasetsIs a length of (c).

In the embodiment of the invention, when the obtained question-answer data set is subjected to LoRA-adapter fine tuning, a specific instance is givenSystem messages of (3)And queriesThe large language model should learn to generate corresponding replies. The process can obtain a question and answer language model, which is as follows:

（2）

wherein,expressed in question and answer data setThe question and answer LoRA-adapter obtained by training is used,a large language model is represented and the model is represented,representation ofIs provided for the length of (a),representing large language model generationThe number of the token to be used in the process,the freeze parameters representing the large language model are displayed,indicating that the small scale is smaller thanAll of (3)。

Further, a question-answer negative log-likelihood loss function is obtained according to the question-answer large language model and the token of the question-answer large language model.

Wherein the question-answer negative log-likelihood loss function is as follows:

（3）

wherein,representing a question-answer negative log-likelihood loss function,represent the firstItem number of question-answer datasetThe information of the individual systems to be processed,represent the firstItem number of question-answer datasetA problem is that,represent the firstItem number of question-answer datasetAnd replies.

Accordingly, the dialog data set may be represented by formula (4):

（4）

wherein the session data set comprises a plurality of ownsA dialog instance of the wheel,represent the firstThe number of sets of dialogue data,represent the firstIn wheel (C)The first dialogue data setThe number of queries to be made is,represent the firstIn wheel (C)The first dialogue data setThe reply is sent to the user in response to the user,representing a dialog data setIs a length of (c).

In embodiments of the present invention, when the dialogue dataset is LoRA-adapter fine-tuned, the large language model will learn at a given firstDialog history and queries before roundPredicting each replyThis process results in a large dialog language model, which is shown below:

（5）

wherein,represented in a dialog datasetThe dialog LoRA-adapter obtained by the training,representing all the tokens belonging to the user query,the target mark is represented by a mark of the target,representing a dialog data setMiddle (f)The data contains the number of token.

Further, a dialogue negative log likelihood loss function is obtained according to the dialogue large language model and the token of the dialogue large language model. Wherein the dialogue negative log-likelihood loss function is as follows:

（6）

wherein,representing dialogue negative log-likelihood loss function，Representing a dialog data setMiddle (f)The data contains the number of tokens and,a large language model is represented and the model is represented,the freeze parameters representing the large language model are displayed,represented in a dialog datasetThe resulting dialog LoRA-adapter is trained on.

It should be noted that, in the embodiment of the present invention, for the fusion of the LoRA-adapters, a trainable weight is given to each loss function of the fine-tuning LoRA-adapter, and all loss functions of the LoRA-adapters with the fusion weights are fine-tuned, where the fusion weights can be expressed as:。

in step 102, an ideal loss function of the question-answer dataset and the dialogue dataset in an ideal state can be obtained according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and the initial fusion weights included in fine adjustment based on each LoRA-adapter, wherein the ideal loss function is as follows:

（7）

wherein,representing the ideal loss function of the device,representing a question-answer negative log-likelihood loss function,representing a dialogue negative log-likelihood loss function,expressed in question and answer data setFine tuning the upper part to obtainThe weight of the initial fusion is calculated,represented in a dialog datasetFine tuning the upper part to obtainIs used to determine the initial fusion weights of (a).

Further, an ideal fusion weight and a first ideal parameter corresponding to the ideal loss function are obtained according to the minimum value of the ideal loss function, and the minimum value of the ideal loss function is expressed by the following formula:

（8）

wherein,representing all ideal LORA-adapters,representing all ideal fusion weights, argmin represents the corresponding parameters when the following formula takes the minimum valueAndis a value of (a).

In the embodiment of the present invention, all the ideal LORA-adapters may also be referred to as all the first ideal parameters, where the first ideal parameters represent all the LORA-adapters added to the question-answer large language model and the dialogue-large language model, respectively, and the first ideal parameters are represented by the following formulas:

（9）

wherein,a first desired parameter is indicated and,expressed in question and answer data setThe first desired parameter obtained is fine-tuned up,represented in a dialog datasetThe first desired parameter obtained is fine-tuned up,representing the number of question-answer datasets,representing the number of dialog data sets.

In particular, the method comprises the steps of,ideal fusion weights representing all LoRA-adapters, i.e., representing question-answer LoRA-adapters and dialog LoRA-aThe ideal fusion weight for dapter, which can be expressed by the following formula:

（10）

wherein,the ideal fusion weight is represented by the sum of the weights,representation ofIs used to determine the desired fusion weight of the (c),representation ofIs used for the optimal fusion weight of the (c).

In step 103, fine tuning is performed on the question-answer LoRA-adapter corresponding to each question-answer dataset and the dialogue LoRA-adapter corresponding to each dialogue dataset according to the ideal loss function, so as to obtain the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters respectively.

In practical applications, the first ideal parameters and ideal fusion weights are trimmed sequentially for efficiency and simplicity. In the first stage, each LoRA-adapter (all first ideal parameters) on each question-answer data set or dialogue data set is fine-tuned, namely, for the formula (8), the formula (8) is split into two formulas shown below, and then the optimal parameters of the question-answer LoRA-adapter and the optimal parameters of the dialogue LoRA-adapter can be obtained, specifically as follows:

（11）

（12）

wherein,the best parameters for the question-answer lorea-adapter are indicated,the best parameters for the dialog LoRA-adapter,represented in a dialog datasetThe dialog LoRA-adapter obtained by the training,expressed in question and answer data setThe question and answer LoRA-adapter obtained by training is used,representing a question-answer negative log-likelihood loss function,representing a dialogue negative log-likelihood loss function.

Further, only fine tuning the ideal fusion weight in the second stage, and freezing the basic large language model and the first ideal parameters to obtain the optimal fusion parameters, wherein the optimal fusion parameters are specifically as follows:

（13）

wherein,the optimal fusion parameters are indicated to be the ones that are to be fused,expressed in question and answer data setFine tuning the upper part to obtainIs used to determine the initial fusion weights of (1),represented in a dialog datasetFine tuning the upper part to obtainIs used to determine the initial fusion weights of (1),representing a question-answer negative log-likelihood loss function,representing a dialogue negative log-likelihood loss function,representing ideal fusion weights.

It should be noted that in practical applications, when the number of question-answer datasets and dialogue datasets is small, some simple and fast algorithms may be used to optimize the ideal fusion weights.

In summary, the embodiment of the invention provides a large language model refinement and adaptation fusion method and device, and the method can effectively avoid semantic space conflict caused by data set fusion by fusing LoRA-adaptation, and simultaneously promote generalization performance of a large language model on a plurality of tasks. The problem of performance degradation caused by conflict of different data sets in semantic space in the prior art is solved.

Based on the same inventive concept, the embodiment of the invention provides a large language model tuning and Adapter fusion device, and because the principle of the device for solving the technical problem is similar to that of a large language model tuning and Adapter fusion method, the implementation of the device can be referred to the implementation of the method, and the repetition is omitted.

As shown in fig. 2, the apparatus mainly includes a first obtaining unit 201, a second obtaining unit 202, a third obtaining unit 203, and a fourth obtaining unit 204.

A first obtaining unit 201, configured to collect a plurality of question-answer data sets and dialogue data sets from a set network platform, and perform a lorea-adaptive fine tuning on the question-answer data sets and the dialogue data sets respectively, so as to obtain a question-answer large language model, a question-answer negative log likelihood loss function, a dialogue large language model and a dialogue negative log likelihood loss function in sequence;

a second obtaining unit 202, configured to obtain an ideal loss function of the question-answer data set and the dialogue data set in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function, and initial fusion weights included based on fine tuning of each LoRA-adapter, and obtain an ideal fusion weight and a first ideal parameter corresponding to the ideal loss function according to a minimum value of the ideal loss function; wherein the first ideal parametric representation is added to all LoRA-adapters of the question-answer large language model and the dialogue large language model, respectively;

a third obtaining unit 203, configured to fine tune a question-answer LoRA-adapter corresponding to each question-answer dataset and a dialogue LoRA-adapter corresponding to each dialogue dataset according to the ideal loss function, so as to obtain an optimal parameter of the question-answer LoRA-adapter, an optimal parameter of the dialogue LoRA-adapter, and an optimal fusion parameter respectively;

and a fourth obtaining unit 204, configured to obtain a general LoRA-adapter according to the optimal parameter of the question-answering LoRA-adapter, the optimal parameter of the dialogue LoRA-adapter, and the optimal fusion parameter.

It should be understood that the above large language model tuning and Adapter fusion device includes units that are logically divided only according to functions implemented by the device, and in practical application, the units may be overlapped or split. The functions implemented by the large language model tuning and Adapter fusion device provided in this embodiment are in one-to-one correspondence with the large language model tuning and Adapter fusion method provided in the foregoing embodiment, and the more detailed process flow implemented by the device is described in detail in the foregoing method embodiment one, which is not described in detail herein.

Another embodiment of the present invention also provides a computer apparatus, including: a processor and a memory; the memory is used for storing computer program codes, and the computer program codes comprise computer instructions; when the processor executes the computer instructions, the electronic device executes the steps of the large language model refinement and Adapter fusion method in the method flow shown in the method embodiment.

Another embodiment of the present invention further provides a computer readable storage medium, where computer instructions are stored, where the computer instructions, when executed on a computer device, cause the computer device to execute the steps of the large language model refinement and Adapter fusion method in the method flow shown in the foregoing method embodiment.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims

1. A large language model micro-tuning and Adapter fusion method is characterized by comprising the following steps:

2. The method of claim 1, wherein the performing a lore-adapter fine tuning on the question-answer dataset sequentially obtains a question-answer large language model and a question-answer negative log likelihood loss function, specifically comprising:

wherein,indicate->Individual question-answer data set,/->Indicate->The +.>Personal system information->Indicate->The +.>A problem of->Indicate->The +.>Reply (S) of (E)>Representing question-answer data set->Length of->Is expressed in question and answer data set->Question and answer LoRA-adapter obtained by training>Representing a large language model->Representation->Length of->Representing the generation of a large language model +.>Personal token,)>Freezing parameters representing large language model, +.>Representing a question-answer negative log-likelihood loss function.

3. The method of claim 1, wherein the performing the LoRA-adapter fine-tuning on the dialogue dataset sequentially obtains a dialogue large language model and a dialogue negative log likelihood loss function, specifically comprising:

wherein,indicate->Individual dialog data sets->Indicate->In wheel +.>The +.>Personal inquiry, inquiry>Indicate->In wheel +.>The +.>Reply (S) of (E)>Representing dialog data set +.>Length of->Expressed in dialog data set->The dialog LoRA-adapter obtained by the training, and (2)>Representing all tags belonging to the user query, +.>Indicating the target mark->Representing dialog data set +.>Middle->The data contains the number of tokens, < >>Representing a dialogue negative log-likelihood loss function, +.>Representing a large language model->The freeze parameters representing the large language model.

4. The method of claim 1, wherein the ideal loss function is as follows:

the minimum of the ideal loss function is as follows:

wherein,representing an ideal loss function, +.>Is expressed in question and answer data set->Upper trimming to get->Is used to determine the initial fusion weights of (1),expressed in dialog data set->Upper trimming to get->Is->Representing all of the first desired parameters,representing all ideal fusion weights, +.>Representing the first ideal parameter->Representing ideal fusion weights.

5. The method of claim 4, wherein the first desired parameter is as follows:

the ideal fusion weights are as follows:

wherein,representing the first ideal parameter->Is shown inQuestion and answer data set->The first desired parameter obtained is fine-tuned up,expressed in dialog data set->Fine tuning the first ideal parameter obtained, +.>Representing the number of question-answer datasets, +.>Representing the number of dialog data sets +.>Representing ideal fusion weights, ++>Representation->Is->Representation->Is used for the optimal fusion weight of the (c).

6. The method of claim 1, wherein the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter, and the optimal fusion parameters are as follows:

wherein,optimal parameter indicating question and answer LoRA-adapter,>optimal parameter indicating dialogue LoRA-adapter, < ->Representing the optimal fusion parameters->Representing a question-answer negative log-likelihood loss function, +.>Representing a dialogue negative log-likelihood loss function, +.>Is expressed in question and answer data set->Upper trimming to get->Is->Represented in a dialog datasetUpper trimming to get->Is->Is expressed in question and answer data set->Question and answer LoRA-adapter obtained by training>Expressed in dialog data set->The resulting dialog LoRA-adapter is trained on.

7. A large language model refinement and adaptation fusion apparatus, comprising:

the first obtaining unit is used for collecting a plurality of question-answer data sets and dialogue data sets from a set network platform, respectively performing LoRA-adapter fine tuning on the question-answer data sets and the dialogue data sets, and sequentially obtaining a question-answer large language model, a question-answer negative log likelihood loss function, a dialogue large language model and a dialogue negative log likelihood loss function;

the second obtaining unit is used for obtaining an ideal loss function of the question-answer data set and the dialogue data set in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and initial fusion weights which are included based on fine adjustment of each LoRA-adapter, and obtaining ideal fusion weights and first ideal parameters which correspond to the ideal loss function according to the minimum value of the ideal loss function; wherein the first ideal parametric representation is added to all LoRA-adapters of the question-answer large language model and the dialogue large language model, respectively;

8. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the large language model refinement and adaptation fusion method of any one of claims 1-6.

9. A computer readable storage medium, storing a computer program which, when executed by a processor, causes the processor to perform the large language model refinement and adaptation fusion method of any one of claims 1-6.