CN117689041A - Cloud integrated embedded large language model training method and language question-answering method - Google Patents


Info

Publication number
CN117689041A
CN117689041A
Authority
CN
China
Prior art keywords
model
trained
training
server
language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410108095.2A
Other languages
Chinese (zh)
Other versions
CN117689041B (en)
Inventor
陈浩
田聪
于斌
贺子轩
Current Assignee
Xi'an Dongjian Data Technology Co.,Ltd.
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University
Priority to CN202410108095.2A
Publication of CN117689041A
Application granted
Publication of CN117689041B
Legal status: Active
Anticipated expiration


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/33 - Querying
    • G06F 16/332 - Query formulation
    • G06F 16/3329 - Natural language query formulation or dialogue systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 - Computing arrangements using knowledge-based models
    • G06N 5/04 - Inference or reasoning models
    • G06N 5/041 - Abduction

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a cloud integrated embedded large language model training method and a language question-answering method. The training method is applied to an edge end and comprises the following steps: inputting a test sample into a first model to be trained, and obtaining the confusion degree and throughput rate of the first model to be trained through inference; when the confusion degree and throughput rate do not meet the user requirements, sending related information to at least one server in communication connection with the edge end, so that the server trains a second model to be trained by combining training samples with a regularized loss function and then issues the network parameters of the trained second model to be trained to the edge end; and returning to the step of obtaining the confusion degree and throughput rate of the first model to be trained through inference until they meet the user requirements, taking the corresponding first model to be trained as the language question-answering model. The training method provided by the invention achieves a balance between inference speed and accuracy, and the obtained language question-answering model provides more efficient and intelligent language processing capability.

Description

Cloud integrated embedded large language model training method and language question-answering method
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to a cloud integrated embedded large language model training method and a language question-answering method.
Background
Large language models are crucial in the field of natural language processing. A large language model is an artificial intelligence model based on deep learning technology with strong natural language processing capability: it can understand and generate natural language text and can be applied to automatic question-answering scenes. For example, after a user presents a question, the model can understand the semantics of the question and extract relevant information from a large amount of text data, thereby giving an accurate answer.
Because large language models face problems such as data privacy and information security, there is great value in deploying and applying large language models at the edge, and the demand for deploying large language models on mobile phones is also growing. However, large language models place very high requirements on the computing capability of the CPU (Central Processing Unit) and the memory size of the system; when a large language model is applied to an edge device such as a mobile phone, computing resources such as processor speed and memory capacity often severely limit inference computation, which directly affects the answer accuracy and answer speed of the large language model. Therefore, how to implement high-speed inference under such limited computing resources is a problem to be urgently solved by those skilled in the art.
To cope with this challenge, existing methods optimize the large language model to reduce the amount of computation required in the inference stage, for example by compression and quantization, cutting the number of parameters and computing operations to reduce the computing load and improve inference efficiency. In addition, lightweight models specifically designed for edge devices have been proposed; these reduce the size and complexity of the model while maintaining relatively high accuracy, thereby completing inference tasks faster under limited computational resources.
However, the above methods have limitations in solving the inference problem of the edge-side large language model: existing methods usually focus only on inference speed or only on accuracy, and lack the ability to adjust flexibly across different scenes. Meanwhile, because large language models are poorly compatible across different edge computing hardware platforms, balancing speed and accuracy entails a large amount of computation and low efficiency, making natural language processing difficult to realize.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a cloud integrated embedded large language model training method and a language question-answering method. The technical objects of the invention are achieved through the following technical solutions:
In a first aspect, the present invention provides a cloud integrated embedded large language model training method, applied to an edge end, comprising:
inputting a test sample into a first model to be trained, and obtaining the confusion degree PS and the throughput rate T of the first model to be trained through inference;
when the confusion degree PS and the throughput rate T do not meet the user requirements, sending related information to at least one server in communication connection with the edge end, so that the server trains a second model to be trained by combining training samples with a regularized loss function and issues the network parameters of the trained second model to be trained to the edge end; the test sample and the training samples are natural language sequences;
substituting the network parameters of the trained second model to be trained into the first model to be trained, and returning to the step of obtaining the confusion degree PS and the throughput rate T of the first model to be trained through inference until the confusion degree PS and the throughput rate T meet the user requirements, taking the first model to be trained that meets the user requirements as the trained language question-answering model.
In one embodiment of the invention, the regularization loss function is:
Regularized Loss = Standard Loss + λ · IL
where Regularized Loss represents the regularized loss function, Standard Loss represents the standard loss function, IL represents the objective function, and λ represents the hyper-parameter.
In one embodiment of the invention, the objective function is: IL = W_p · PS + W_t / T, where W_p and W_t represent the first weight coefficient and the second weight coefficient respectively, and W_p + W_t = 1.
In one embodiment of the invention, the standard loss function is:
Standard Loss = -(1/M) · Σ_{j=1}^{M} Σ_{k=1}^{K_j} log P(y_{j,k} | X_j, y_{j,<k}, θ)
where M represents the number of natural language sequences input into the second model to be trained, X_j represents the j-th natural language sequence input into the second model to be trained and x_{j,i} its i-th token, Y_j represents the desired output sequence corresponding to X_j, y_{j,k} represents the k-th token of Y_j, K_j represents the number of tokens contained in Y_j, θ represents the network parameters of the second model to be trained, y_{j,<k} represents the tokens in Y_j preceding the k-th token, and P(y_{j,k} | X_j, y_{j,<k}, θ) represents the conditional probability of y_{j,k} given X_j, y_{j,<k} and θ.
In one embodiment of the invention, the related information at least includes the initial value of the hyper-parameter λ, and the first weight coefficient W_p and second weight coefficient W_t determined in advance according to the user requirements and the characteristics of the edge end's own computing platform.
When the confusion degree PS and the throughput rate T do not meet the user requirements, the step of sending related information to at least one server in communication connection with the edge end, so that the server trains the second model to be trained by combining training samples with the regularized loss function and issues the trained network parameters of the second model to be trained to the edge end, includes:
when the confusion degree PS and the throughput rate T do not meet the user requirements, sending the initial value of the hyper-parameter λ, the first weight coefficient W_p and the second weight coefficient W_t to at least one server in communication connection with the edge end, so that the server substitutes the first weight coefficient W_p, the second weight coefficient W_t and the initial value of the hyper-parameter λ into the regularized loss function, trains the second model to be trained using the training samples and the regularized loss function, and then issues the trained network parameters of the second model to be trained to the edge end.
In one embodiment of the present invention, the server trains the second model to be trained according to the following steps:
inputting a preset number of training samples into the second model to be trained;
calculating the loss value of the current second model to be trained based on the regularized loss function;
judging whether the loss value converges; if not, adjusting the hyper-parameter λ in the regularized loss function and returning to the step of inputting the preset number of training samples into the second model to be trained; otherwise, ending the training and issuing the network parameters of the trained second model to be trained to the edge end.
In one embodiment of the present invention, the edge end sends the initial value of the hyper-parameter λ, the first weight coefficient W_p and the second weight coefficient W_t in the form of a message to at least one server in communication connection with itself.
In a second aspect, the present invention provides a cloud integrated embedded large language model training method, applied to a server, including:
receiving related information sent by an edge terminal, and establishing a regularization loss function according to the related information;
after a preset number of training samples are input into a second model to be trained, calculating a loss value of a regularized loss function;
judging whether the loss value converges; if not, adjusting the hyper-parameter λ in the regularized loss function and returning to the step of inputting the preset number of training samples into the second model to be trained; if yes, ending the training and issuing the network parameters of the trained second model to be trained to the edge end, so that the edge end substitutes the network parameters of the trained second model to be trained into the first model to be trained, obtains the confusion degree PS and the throughput rate T of the first model to be trained through inference, and, when the inferred confusion degree PS and throughput rate T meet the user requirements, takes the first model to be trained that meets the user requirements as the trained language question-answering model.
In one embodiment of the present invention, the server issues parameters of the trained second model to be trained to the edge in the form of a message.
In a third aspect, the present invention further provides a language question-answering method, including:
acquiring a question input by a user;
inputting the question into a language question-answering model trained by the cloud integrated embedded large language model training method according to the first aspect or the second aspect, and obtaining an answer to the question output in natural language form.
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention provides a cloud integrated embedded large language model training method, which can flexibly adjust training parameters of a model based on different requirements of a user on reasoning speed and accuracy, combines an optimization strategy of a training end and a reasoning end in a training process, and can realize the optimal balance between the reasoning speed and the accuracy.
(2) According to the invention, a self-adaptive parameter adjustment strategy is adopted, and through end-to-end self-adaptive parameter adjustment, the training parameters of the second model to be trained at the server end can be dynamically adjusted in the reasoning process, so that the parameter configuration and training progress of the training of the model at the server end can be adaptively guided according to the performance indexes of speed and accuracy under different edge reasoning scenes.
(3) The invention adopts an end-to-end real-time feedback mechanism, introduces a real-time feedback network protocol, monitors the reasoning speed and accuracy in real time in the edge reasoning process, and transmits related parameters to the cloud server through the real-time feedback network protocol, so that the cloud server can adjust training parameters in time to adapt to the speed and accuracy requirements of dynamic change.
(4) The language question-answering model obtained by training can be applied to various edge end scenes, such as an embedded system, mobile equipment, the Internet of things and the like, so that more efficient and intelligent language processing capability is provided for realizing natural language question-answering of an edge end.
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Drawings
FIG. 1 is a flowchart of a cloud-integrated embedded large language model training method applied to an edge according to an embodiment of the present invention;
FIG. 2 is a flowchart of a cloud integrated embedded large language model training method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a cloud integrated embedded large language model training method applied to a server according to an embodiment of the present invention;
fig. 4 is a schematic diagram of an application scenario of a cloud integrated embedded large language model training method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of another application scenario of the cloud integrated embedded large language model training method provided by the embodiment of the invention;
fig. 6 is a flowchart of a language question-answering method according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to specific examples, but embodiments of the present invention are not limited thereto.
It should be understood that the limited resources of edge devices force a large language model to balance inference speed and accuracy during inference: in general, high-speed inference is accompanied by some loss of accuracy, while high-accuracy inference sacrifices inference speed. Especially on different edge computing hardware platforms, the heterogeneity of computing resources also leads to differing demands on inference speed and accuracy.
To strike the best balance between speed and accuracy for the user experience, a large language model needs to be trained and deployed for each device's hardware configuration and performance. Although some inference optimization methods exist, they still have limitations when solving the inference problem of the edge-side large language model, such as focusing only on inference speed or only on accuracy, and lacking the ability to adjust flexibly across scenes. Meanwhile, inference of the edge-side large language model is performed on the edge device while training is performed on a server; this device-split training-inference scenario is extremely unfriendly to developers, who must manually connect and test on different ends. If there are dozens of high-, mid- and low-end mobile phone series, each series containing dozens of products, and each product iterating once every 3-6 months, a great deal of manpower is needed to manually tune the parameters of the cloud-trained large language model to adapt to phones of different models and performance parameters.
In view of the above, the invention provides a cloud integrated embedded large language model training method and a language question-answering method.
Fig. 1 is a schematic flowchart of the cloud integrated embedded large language model training method applied to the edge end according to an embodiment of the present invention. As shown in fig. 1, an embodiment of the present invention provides a cloud integrated embedded large language model training method, which is applied to an edge end and includes:
s11, inputting a test sample into a first model to be trained, and obtaining the confusion degree PS and the throughput rate T of the first model to be trained through reasoning;
s12, when the confusion degree PS and the throughput rate T do not meet the user requirements, related information is sent to at least one server in communication connection with the server, so that the server trains the second model to be trained by combining training samples and regularized loss functions, and network parameters of the trained second model to be trained are issued to the edge end; the test sample and the training sample are natural language sequences;
s13, substituting the network parameters of the second model to be trained after training into the first model to be trained, returning to the step of obtaining the confusion PS and the throughput T of the first model to be trained through reasoning until the confusion PS and the throughput T meet the user requirements, and taking the first model to be trained meeting the user requirements as a language question-answer model obtained through training.
It should be noted that, in this embodiment, the first model to be trained and the second model to be trained are large language models with the same network structure, and existing large language models, such as LLaMA, LLaMA2, etc., may be selectively used, which is not limited in this application.
According to the embodiment, training parameters of the large language model can be dynamically adjusted according to requirements of different scenes and edge computing hardware platforms, so that a language question-answering model with optimal reasoning speed and accuracy is obtained.
Optionally, the regularized loss function is:
Regularized Loss = Standard Loss + λ · IL
where Regularized Loss represents the regularized loss function, Standard Loss represents the standard loss function, IL represents the objective function, and λ represents the hyper-parameter.
Specifically, the regularized loss function is defined as the sum of the standard loss function Standard Loss and the λ-weighted information loss IL. The standard loss function Standard Loss is a non-negative real-valued function used to measure the degree of inconsistency between the predicted values and the true values of the second model to be trained; this embodiment uses a negative log-likelihood function. The objective function IL describes the inference accuracy and speed at the edge end. It should be noted that, by introducing the hyper-parameter λ for the objective function IL in the regularized loss function, the strength of the objective function IL can be controlled during training, so that IL is minimized in the training process.
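As a hedged illustration (not code from the patent itself), the composition of the regularized loss can be sketched as follows; the function and argument names are hypothetical:

```python
def regularized_loss(standard_loss: float, il: float, lam: float) -> float:
    """Sketch of Regularized Loss = Standard Loss + lambda * IL.

    lam is the hyper-parameter lambda controlling how strongly the
    objective function IL (edge inference accuracy/speed term) influences
    the server-side training loss.
    """
    return standard_loss + lam * il
```

During server-side training, lam would be tuned so that the model still converges while IL is driven down.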
Optionally, the objective function IL is: IL = W_p · PS + W_t / T, where W_p and W_t represent the first weight coefficient and the second weight coefficient respectively, and W_p + W_t = 1.
Specifically, PS (perplexity score) characterizes the inference accuracy of the first model to be trained at the edge end, and is defined in this embodiment as:
PS = exp( -(1/N) · Σ_{i=1}^{N} log P(x_i | x_{<i}, θ) )
where the test sample at the edge end is also a natural language sequence, x_i represents the i-th token in the natural language sequence input into the first model to be trained during inference, x_{<i} represents all tokens before the i-th token, N represents the length (the number of tokens) of the natural language sequence input into the first model to be trained during inference, and P(x_i | x_{<i}, θ) represents the conditional probability of the i-th token x_i given x_{<i} and the network parameters θ of the first model to be trained.
T (throughput) characterizes the inference speed of the first model to be trained at the edge end, and is defined as T = N_out / t_total, where N_out represents the total number of output tokens and t_total represents the total time from inputting the test sample into the first model to be trained until all outputs are obtained.
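For concreteness, the two edge-side metrics can be computed as in this sketch; the helper names are assumptions, but the formulas follow the standard definitions of perplexity and throughput given above:

```python
import math

def perplexity(token_log_probs):
    """PS: exponentiated average negative log-likelihood per token,
    i.e. exp(-(1/N) * sum_i log P(x_i | x_<i, theta))."""
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n)

def throughput(total_output_tokens, total_seconds):
    """T: total number of output tokens divided by total inference time."""
    return total_output_tokens / total_seconds
```

Lower perplexity indicates higher answer accuracy, while higher throughput indicates faster inference, which is why the two pull in opposite directions and must be traded off.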
In addition, the first weight coefficient W_p and the second weight coefficient W_t may be determined based on characteristics such as the user requirements and the edge computing platform, and are used for making a trade-off between the confusion degree PS, which characterizes accuracy, and the throughput rate T, which characterizes speed.
Further, the standard loss function is:
Standard Loss = -(1/M) · Σ_{j=1}^{M} Σ_{k=1}^{K_j} log P(y_{j,k} | X_j, y_{j,<k}, θ)
where M represents the number of natural language sequences input into the second model to be trained, X_j represents the j-th natural language sequence input into the second model to be trained and x_{j,i} its i-th token, Y_j represents the desired output sequence corresponding to X_j, y_{j,k} represents the k-th token of Y_j, K_j represents the number of tokens contained in Y_j, θ represents the network parameters of the second model to be trained, y_{j,<k} represents the tokens in Y_j preceding the k-th token, and P(y_{j,k} | X_j, y_{j,<k}, θ) represents the conditional probability of y_{j,k} given X_j, y_{j,<k} and θ.
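Under the same assumptions (a sketch with token-level log-probabilities supplied by the caller, not the patent's own implementation), the negative log-likelihood standard loss can be written as:

```python
import math

def standard_loss(batch_log_probs):
    """Batch-averaged negative log-likelihood:
    -(1/M) * sum_j sum_k log P(y_{j,k} | X_j, y_{j,<k}, theta),
    where batch_log_probs[j][k] is the log-probability of the k-th
    token of the j-th desired output sequence."""
    m = len(batch_log_probs)
    return -sum(lp for seq in batch_log_probs for lp in seq) / m
```

In a real training setup these log-probabilities would come from the second model's softmax outputs over the training batch.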
Fig. 2 is a flowchart of a cloud integrated embedded large language model training method according to an embodiment of the present invention. In this embodiment, the related information at least includes the initial value of the hyper-parameter λ, and the first weight coefficient W_p and second weight coefficient W_t determined in advance according to the user requirements and the characteristics of the edge end's own computing platform.
In step S12, when the confusion degree PS and the throughput rate T do not meet the user requirements, the step of sending related information to at least one server in communication connection with the edge end, so that the server trains the second model to be trained by combining the training samples with the regularized loss function and issues the trained network parameters of the second model to be trained to the edge end, includes:
when the confusion degree PS and the throughput rate T do not meet the user requirements, sending the initial value of the hyper-parameter λ, the first weight coefficient W_p and the second weight coefficient W_t to at least one server in communication connection with the edge end, so that the server substitutes the first weight coefficient W_p, the second weight coefficient W_t and the initial value of the hyper-parameter λ into the regularized loss function, trains the second model to be trained using the training samples and the regularized loss function, and then issues the trained network parameters of the second model to be trained to the edge end.
Specifically, referring to figs. 1-2, the edge end first obtains the user requirements, then inputs the test sample into the first model to be trained and obtains the confusion degree PS and throughput rate T of the first model to be trained through inference. The inferred confusion degree PS and throughput rate T are compared with the user requirements; if they are satisfied, the process ends, indicating that the first model to be trained is already a natural language question-answering model that can satisfy the user requirements. Otherwise, related information such as the first weight coefficient W_p, the second weight coefficient W_t and the initial value of the hyper-parameter λ is transmitted to the server in the form of a message; the server then constructs the regularized loss function from this related information, trains the second model to be trained using the training samples and the regularized loss function, adjusting the hyper-parameter λ during training, and after convergence issues the trained network parameters of the second model to be trained to the edge end. The edge end substitutes the received network parameters into the first model to be trained and performs inference again, until the inferred confusion degree PS and throughput rate T meet the user requirements, at which point the trained language question-answering model is obtained.
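The edge-side flow just described can be sketched roughly as follows; `infer_metrics`, `send_to_server` and `receive_params` are hypothetical stand-ins for the edge end's inference and messaging machinery, not APIs defined by the patent:

```python
def train_on_edge(model, test_samples, ps_target, t_target,
                  lam_init, w_p, w_t,
                  infer_metrics, send_to_server, receive_params):
    """Iterate until the inferred confusion degree PS and throughput T
    meet the user requirements (PS low enough, T high enough)."""
    while True:
        ps, t = infer_metrics(model, test_samples)
        if ps <= ps_target and t >= t_target:
            return model  # this model is the trained language QA model
        # requirements not met: ask a connected server to retrain
        send_to_server({"lambda_init": lam_init, "w_p": w_p, "w_t": w_t})
        model.load_params(receive_params())  # substitute the new parameters
```

The loop mirrors steps S11-S13: inference on the edge, retraining on the server, and substitution of the returned network parameters.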
Optionally, the server trains the second model to be trained according to the following steps:
inputting a preset number of training samples into the second model to be trained;
calculating the loss value of the current second model to be trained based on the regularized loss function;
judging whether the loss value converges; if not, adjusting the hyper-parameter λ in the regularized loss function and returning to the step of inputting the preset number of training samples into the second model to be trained; otherwise, ending the training and issuing the network parameters of the trained second model to be trained to the edge end.
In this embodiment, the hyper-parameter λ is an empirical parameter for controlling the influence of the objective function IL on the regularized loss function Regularized Loss. The larger λ is, the more the second model to be trained takes the objective function IL into account during training, but too large a λ may prevent the second model to be trained from converging or may cause over-fitting; conversely, too small a λ weakens the consideration of the objective function IL. A suitable λ can therefore be obtained through multiple experiments.
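One hedged way to realize the server-side loop, with the λ-adjustment rule left as an explicit assumption (the patent says only that λ is adjusted empirically when the loss fails to converge):

```python
def train_on_server(train_step, lam, tol=1e-3, decay=0.5, max_rounds=1000):
    """train_step(lam) runs one pass over the preset number of training
    samples and returns the regularized loss value; stop when the change
    in the loss falls below tol, shrinking lam if the loss grows."""
    prev = float("inf")
    for _ in range(max_rounds):
        loss = train_step(lam)
        if abs(prev - loss) < tol:
            return lam, loss        # converged: parameters go to the edge end
        if loss > prev:
            lam *= decay            # assumed heuristic: weaken IL's influence
        prev = loss
    return lam, prev
```

The halving rule on divergence is purely illustrative; any empirical tuning schedule for λ fits the described procedure.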
Optionally, the edge end sends the initial value of the hyper-parameter λ, the first weight coefficient W_p and the second weight coefficient W_t in the form of a message to at least one server in communication connection with itself. Besides this content, the message further includes: a message ID, a server name, an edge end ID, an edge end name, a hash value of the first model to be trained, and a timestamp. The message content is shown in Table 1:
TABLE 1
In table 1, "int", "string", "long" and "float" are data types, and represent integer, string, long integer and floating point types, respectively.
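A possible shape for the edge-to-server message of Table 1; the field names and the JSON encoding here are illustrative assumptions, not the wire format fixed by the patent:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class EdgeMessage:
    message_id: int     # int
    server_name: str    # string
    edge_id: int        # int
    edge_name: str      # string
    model_hash: str     # string: hash of the first model to be trained
    lambda_init: float  # float: initial value of the hyper-parameter
    w_p: float          # float: first weight coefficient
    w_t: float          # float: second weight coefficient
    timestamp: int      # long

def encode(msg: EdgeMessage) -> str:
    """Serialize the message for transmission to the server."""
    return json.dumps(asdict(msg))
```

The server would decode the same structure, read λ, W_p and W_t, and construct the regularized loss function from them.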
Fig. 3 is a flowchart of a cloud integrated embedded large language model training method applied to a server according to an embodiment of the present invention. As shown in fig. 3, an embodiment of the present invention provides a cloud integrated embedded large language model training method, which is applied to a server and includes:
s31, receiving related information sent by an edge terminal, and establishing a regularized loss function according to the related information;
s32, after a preset number of training samples are input into a second model to be trained, calculating a loss value of a regularized loss function;
s33, judging whether the loss value converges or not;
s34, if not, adjusting the super parameterAnd returning to the step of inputting the preset number of training samples into the second model to be trained;
s35, judging whether the loss value converges or not; if not, adjusting the super-parameters in the regularized loss functionAnd returning to the step of inputting the preset number of training samples into the second model to be trained; if yes, the training is finished, the network parameters of the second model to be trained after the training are issued to the edge end, so that the edge end substitutes the network parameters of the second model to be trained after the training into the first model to be trained, the confusion PS and the throughput rate T of the first model to be trained are obtained through reasoning,and when the inferred confusion degree PS and the inferred throughput rate T meet the user requirements, taking a first model to be trained which meets the user requirements as a language question-answering model obtained through training.
Specifically, the regularized loss function Regularized Loss is:
Regularized Loss = Standard Loss + λ · IL
where Standard Loss represents the standard loss function, IL represents the objective function, and λ represents the hyper-parameter.
Optionally, the objective function IL is: IL = W_p · PS + W_t / T, where W_p and W_t represent the first weight coefficient and the second weight coefficient respectively, and W_p + W_t = 1.
The standard loss function Standard Loss is:
Standard Loss = -(1/M) · Σ_{j=1}^{M} Σ_{k=1}^{K_j} log P(y_{j,k} | X_j, y_{j,<k}, θ)
where M represents the number of natural language sequences input into the second model to be trained, X_j represents the j-th input natural language sequence and x_{j,i} its i-th token, Y_j represents the desired output sequence corresponding to X_j, y_{j,k} represents the k-th token of Y_j, K_j represents the number of tokens contained in Y_j, θ represents the network parameters of the second model to be trained, y_{j,<k} represents the tokens in Y_j preceding the k-th token, and P(y_{j,k} | X_j, y_{j,<k}, θ) represents the conditional probability of y_{j,k} given X_j, y_{j,<k} and θ.
In this embodiment, the server may also send the network parameters of the trained second model to be trained to the edge in the form of a message.
The related information sent by the edge may include: the initial value of the hyper-parameter $\lambda$, and the first weight coefficient $W_p$ and second weight coefficient $W_t$ determined in advance according to the user requirements and the characteristics of the edge computing platform. The first weight coefficient $W_p$ and the second weight coefficient $W_t$ are used to balance the perplexity PS, which characterizes accuracy, against the throughput rate T, which characterizes speed; the initial value of the hyper-parameter $\lambda$ is a random number.
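The related information could be packaged as a simple message. The field names and the JSON encoding below are hypothetical, chosen only to illustrate the payload (a random initial λ plus the two weight coefficients, which sum to 1).

```python
import json
import random

def build_edge_message(w_p, w_t):
    """Hypothetical edge-to-server payload: a random initial value for
    the hyper-parameter lambda plus the pre-determined weight
    coefficients. Field names are illustrative, not from the patent."""
    assert abs(w_p + w_t - 1.0) < 1e-9, "weights must sum to 1"
    return json.dumps({
        "lambda_init": random.random(),  # random initial hyper-parameter
        "w_p": w_p,                      # weights perplexity PS (accuracy)
        "w_t": w_t,                      # weights throughput T (speed)
    })
```

On receipt, the server would substitute these three values into the regularized loss function before training begins.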
Fig. 4-5 are schematic diagrams of two application scenarios of the cloud integrated embedded large language model training method provided by the embodiment of the invention. As shown in fig. 4, for a smaller-scale natural language model that is expected to run on several different brands of mobile phones (edge terminals), an "economy mode" may be adopted, in which the edge terminals share the same server during training. In another application scenario with high performance requirements on the model, as shown in fig. 5, one edge terminal may perform collaborative training with multiple servers and finally select the model with the best performance from those obtained.
Fig. 6 is a flowchart of a language question-answering method according to an embodiment of the present invention. As shown in fig. 6, the embodiment of the present invention further provides a language question-answering method, including:
s61, acquiring a problem input by a user;
s62, inputting the questions into a language question-answering model trained by the cloud integrated training method to obtain answers to the questions output in a natural language form.
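Steps S61-S62 amount to a thin wrapper around the trained model. The sketch below uses a stand-in callable in place of the real language question-answering model.

```python
def answer_question(question, model):
    """S61: take the user's question; S62: feed it to the trained
    language question-answering model and return the natural-language
    answer. `model` is any callable mapping question -> answer."""
    return model(question)

# Stand-in model for the demo (the real one would be the trained
# first model running on the edge device).
def demo_model(question):
    return "Xi'an" if "capital of Shaanxi" in question else "I don't know."
```

In deployment, `model` would be the first model to be trained that met the user's perplexity and throughput requirements, running locally on the edge device.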
According to the above embodiments, the beneficial effects of the invention are as follows:
(1) The invention provides a cloud integrated embedded large language model training method, which can flexibly adjust training parameters of a model based on different requirements of a user on reasoning speed and accuracy, combines an optimization strategy of a training end and a reasoning end in a training process, and can realize the optimal balance between the reasoning speed and the accuracy.
(2) According to the invention, a self-adaptive parameter adjustment strategy is adopted, and through end-to-end self-adaptive parameter adjustment, the training parameters of the second model to be trained at the server end can be dynamically adjusted in the reasoning process, so that the parameter configuration and training progress of the training of the model at the server end can be adaptively guided according to the performance indexes of speed and accuracy under different edge reasoning scenes.
(3) The invention adopts an end-to-end real-time feedback mechanism, introduces a real-time feedback network protocol, monitors the reasoning speed and accuracy in real time in the edge reasoning process, and transmits related parameters to the cloud server through the real-time feedback network protocol, so that the cloud server can adjust training parameters in time to adapt to the speed and accuracy requirements of dynamic change.
(4) The language question-answering model obtained by training can be applied to various edge end scenes, such as an embedded system, mobile equipment, the Internet of things and the like, so that more efficient and intelligent language processing capability is provided for realizing natural language question-answering of an edge end.
In the description of the present invention, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
The description of the terms "one embodiment," "some embodiments," "example," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Further, one skilled in the art can engage and combine the different embodiments or examples described in this specification.
The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims (10)

1. The cloud integrated embedded large language model training method is characterized by being applied to an edge end and comprising the following steps of:
inputting a test sample into a first model to be trained, and obtaining the perplexity PS and the throughput rate T of the first model to be trained through reasoning;
when the perplexity PS and the throughput rate T do not meet the user requirements, sending related information to at least one server in communication connection with itself, so that the server trains a second model to be trained by combining training samples and a regularized loss function and issues the network parameters of the trained second model to be trained to the edge; the test sample and the training samples are natural language sequences;
substituting the network parameters of the trained second model to be trained into the first model to be trained, and returning to the step of obtaining the perplexity PS and the throughput rate T of the first model to be trained through reasoning, until the perplexity PS and the throughput rate T meet the user requirements, and taking the first model to be trained that meets the user requirements as the trained language question-answering model.
2. The cloud integrated embedded large language model training method of claim 1, wherein the regularized loss function is:

$$\text{Regularized Loss} = \text{Standard Loss} + \lambda \cdot IL$$

where Regularized Loss represents the regularized loss function, Standard Loss represents the standard loss function, IL represents the objective function, and $\lambda$ represents the hyper-parameter.
3. The cloud integrated embedded large language model training method of claim 2, wherein the objective function is: $IL = W_p \cdot PS + \frac{W_t}{T}$, wherein $W_p$ and $W_t$ represent the first weight coefficient and the second weight coefficient respectively, and $W_p + W_t = 1$.
4. The cloud integrated embedded large language model training method of claim 3, wherein the standard loss function is:

$$\text{Standard Loss} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{M_i}\log P\!\left(y_i^{k} \mid x_i, y_i^{<k}, \theta\right)$$

wherein $N$ represents the number of natural language sequences input to the second model to be trained, $x_i$ represents the $i$-th natural language sequence input to the second model to be trained, $x_i^{j}$ represents the $j$-th token of $x_i$, $y_i$ represents the desired output sequence corresponding to $x_i$, $y_i^{k}$ represents the $k$-th token of $y_i$, $M_i$ represents the number of tokens contained in $y_i$, $y_i^{<k}$ represents the first $k-1$ tokens of $y_i$, $\theta$ represents the network parameters of the second model to be trained, and $P(y_i^{k} \mid x_i, y_i^{<k}, \theta)$ represents the conditional probability of $y_i^{k}$ given $x_i$, $y_i^{<k}$ and $\theta$.
5. The cloud integrated embedded large language model training method of claim 1, wherein the related information at least comprises the initial value of the hyper-parameter $\lambda$, and the first weight coefficient $W_p$ and the second weight coefficient $W_t$ determined in advance according to the user requirements and the characteristics of the edge computing platform;
wherein, when the perplexity PS and the throughput rate T do not meet the user requirements, sending the related information to at least one server in communication connection with itself, so that the server trains the second model to be trained by combining the training samples and the regularized loss function and issues the network parameters of the trained second model to be trained to the edge, comprises:
when the perplexity PS and the throughput rate T do not meet the user requirements, sending the initial value of the hyper-parameter $\lambda$, the first weight coefficient $W_p$ and the second weight coefficient $W_t$ to at least one server in communication connection with itself, so that the server substitutes the first weight coefficient $W_p$, the second weight coefficient $W_t$ and the initial value of the hyper-parameter $\lambda$ into the regularized loss function, trains the second model to be trained by using the training samples and the regularized loss function, and then issues the network parameters of the trained second model to be trained to the edge.
6. The cloud integrated embedded large language model training method according to claim 5, wherein the server trains the second model to be trained according to the following steps:
inputting a preset number of training samples into the second model to be trained;
calculating a loss value of the current second model to be trained based on the regularized loss function;
judging whether the loss value converges; if not, adjusting the hyper-parameter $\lambda$ in the regularized loss function and returning to the step of inputting the training samples into the second model to be trained; otherwise, finishing the training and issuing the network parameters of the trained second model to be trained to the edge.
7. The cloud integrated embedded large language model training method according to claim 5, wherein the edge sends the initial value of the hyper-parameter $\lambda$, the first weight coefficient $W_p$ and the second weight coefficient $W_t$ to at least one server in communication connection with itself in the form of a message.
8. The cloud integrated embedded large language model training method is characterized by being applied to a server and comprising the following steps of:
receiving related information sent by an edge terminal, and establishing a regularization loss function according to the related information;
after a preset number of training samples are input into a second model to be trained, calculating a loss value of a regularized loss function;
judging whether the loss value converges; if not, adjusting the hyper-parameter $\lambda$ in the regularized loss function and returning to the step of inputting the preset number of training samples into the second model to be trained; if yes, finishing the training and issuing the network parameters of the trained second model to be trained to the edge, so that the edge substitutes the network parameters of the trained second model to be trained into the first model to be trained, obtains the perplexity PS and the throughput rate T of the first model to be trained through reasoning, and, when the inferred perplexity PS and throughput rate T meet the user requirements, takes the first model to be trained that meets the user requirements as the trained language question-answering model.
9. The cloud integrated training method of the embedded large language model according to claim 8, wherein the server issues parameters of the trained second model to be trained to the edge in the form of a message.
10. A method for language questions and answers, comprising:
acquiring a problem input by a user;
inputting the questions into a language question-answering model trained by the cloud integrated embedded large language model training method according to any one of claims 1-5 or 8-9 to obtain answers to the questions output in a natural language form.
CN202410108095.2A 2024-01-26 2024-01-26 Cloud integrated embedded large language model training method and language question-answering method Active CN117689041B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410108095.2A CN117689041B (en) 2024-01-26 2024-01-26 Cloud integrated embedded large language model training method and language question-answering method


Publications (2)

Publication Number Publication Date
CN117689041A true CN117689041A (en) 2024-03-12
CN117689041B CN117689041B (en) 2024-04-19

Family

ID=90137392

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410108095.2A Active CN117689041B (en) 2024-01-26 2024-01-26 Cloud integrated embedded large language model training method and language question-answering method

Country Status (1)

Country Link
CN (1) CN117689041B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111030861A (en) * 2019-12-11 2020-04-17 中移物联网有限公司 Edge calculation distributed model training method, terminal and network side equipment
CN111625361A (en) * 2020-05-26 2020-09-04 华东师范大学 Joint learning framework based on cooperation of cloud server and IoT (Internet of things) equipment
CN112990483A (en) * 2021-03-17 2021-06-18 北京理工大学 Large-scale edge machine learning training method based on probabilistic sampling
CN114265631A (en) * 2021-12-09 2022-04-01 浙江工业大学 Mobile edge calculation intelligent unloading method and device based on federal meta-learning
CN114626503A (en) * 2021-12-29 2022-06-14 亚信科技(中国)有限公司 Model training method, target detection method, device, electronic device and medium
WO2022174033A1 (en) * 2021-02-12 2022-08-18 Wyze Labs, Inc. Self-supervised collaborative approach to machine learning by models deployed on edge devices
WO2023273629A1 (en) * 2021-06-30 2023-01-05 华为技术有限公司 System and apparatus for configuring neural network model in edge server
CN116231860A (en) * 2023-03-02 2023-06-06 山东省计算中心(国家超级计算济南中心) Cloud edge end cooperation-based intelligent power load identification system, method and equipment
CN116341624A (en) * 2023-03-31 2023-06-27 华中科技大学 Edge-end cooperative deep learning calculation acceleration system and method
CN116363552A (en) * 2023-02-17 2023-06-30 北京理工大学 Real-time target detection method applied to edge equipment


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG ZHIGANG; WANG HAITAO; SHE QI; SHI XUESONG; ZHANG YIMIN: "Robot 4.0: Continual Learning and Spatio-Temporal Intelligence Supported by Edge Computing", Journal of Computer Research and Development, no. 09, 30 September 2020 (2020-09-30) *

Also Published As

Publication number Publication date
CN117689041B (en) 2024-04-19


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240524

Address after: Room 018, F1902, 19th Floor, Block 4-A, Xixian Financial Port, Fengdong New City Energy Jinmao District, Xixian New District, Xi'an City, Shaanxi Province, 712044

Patentee after: Xi'an Dongjian Data Technology Co.,Ltd.

Country or region after: China

Address before: 710071 Taibai South Road, Yanta District, Xi'an, Shaanxi Province, No. 2

Patentee before: XIDIAN University

Country or region before: China

TR01 Transfer of patent right