CN117668183A - Power-specific plug-in construction method and power equipment search question-answering method

Power-specific plug-in construction method and power equipment search question-answering method

Info

Publication number
CN117668183A
Authority
CN
China
Prior art keywords
model
retrieval
search
training
power
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311617146.6A
Other languages
Chinese (zh)
Inventor
宋博川
张赛
胡宇巍
周飞
梁潇
张强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Smart Grid Research Institute Co ltd
Original Assignee
State Grid Smart Grid Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Smart Grid Research Institute Co ltd filed Critical State Grid Smart Grid Research Institute Co ltd
Priority to CN202311617146.6A
Publication of CN117668183A
Legal status: Pending


Abstract

The invention relates to the technical field of artificial intelligence and discloses a power-specific plug-in construction method and a power equipment search question-answering method. The power-specific plug-in construction method comprises the following steps: acquiring a pre-training model and training data; initializing a retrieval model and a rearrangement model with the pre-training model; jointly training the retrieval model and the rearrangement model on the training data by a step-by-step iterative optimization method; and constructing the power-specific plug-in from the jointly trained retrieval model and rearrangement model. By jointly training the retrieval model and the rearrangement model with the step-by-step iterative optimization method, constructing the power-specific plug-in from the trained models, and combining the plug-in with a large language model in plug-in form, the invention supplements the large language model with power-domain expertise, combines traditional power-domain tasks with the large language model, and solves the problem that large language models lack professional knowledge of the power field.

Description

Power-specific plug-in construction method and power equipment search question-answering method
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a power-specific plug-in construction method and a power equipment search question-answering method.
Background
With the development of large-scale pre-training models, large language models have been applied in many fields, such as code generation, common-sense reply, and document generation. Through large-scale corpus pre-training followed by instruction learning, reinforcement learning, and similar methods, large language models align their answers with human answers and achieve good results in general domains. However, the power domain is a professional domain, and directly using a large language model for search question answering faces the following problem: as specialized expertise, knowledge of the power domain is difficult to collect during the pre-training stage, so large language models lack power-related knowledge. The prior art therefore suffers from the problem that large language models lack professional knowledge of the power field.
Disclosure of Invention
In view of the above, the invention provides a power-specific plug-in construction method and a power equipment search question-answering method, so as to solve the problem that large language models lack professional knowledge of the power field.
In a first aspect, the present invention provides a method for constructing a power-specific plug-in, where the power-specific plug-in is used to provide power domain expertise for a large language model, the method including: acquiring a pre-training model and training data; initializing a retrieval model and a rearrangement model by using the pre-training model; performing joint training on the retrieval model and the rearrangement model based on training data by adopting a step-by-step iterative optimization method; and constructing the power special plug-in by using the search model and the rearrangement model after the joint training.
In the embodiment of the invention, the retrieval model and the rearrangement model are jointly trained by a step-by-step iterative optimization method, the trained retrieval model and rearrangement model are used to construct the power-specific plug-in, and the power-specific plug-in is combined with the large language model in plug-in form. This supplements the large language model with professional knowledge of the power field, combines traditional power-field tasks with the large language model, and solves the problem in the related art that the large language model lacks professional knowledge of the power field.
In an alternative embodiment, the training data include a retrieval question and retrieval documents, and jointly training the retrieval model and the rearrangement model based on the training data by the step-by-step iterative optimization method includes: warm-starting the retrieval model on the training data and establishing index vectors corresponding to the retrieval documents; acquiring a negative sample set, wherein the negative sample set comprises a plurality of retrieval documents that have uncertain relevance to the retrieval question and are similar to the real document; when the retrieval model and the rearrangement model have not converged, sampling a plurality of negative samples from the negative sample set and updating parameters of the retrieval model using the plurality of negative samples; updating the index vectors corresponding to the retrieval documents and updating parameters of the rearrangement model using the plurality of negative samples; and ending training when the retrieval model and the rearrangement model converge.
In the embodiment of the invention, the joint training of the retrieval model and the rearrangement model is optimized by adopting a step-by-step iterative optimization method, so that the effects of improving the performance and generalization capability of the model are achieved.
In an alternative embodiment, updating parameters of the retrieval model using a plurality of negative samples includes: establishing an objective function for training the retrieval model using the KL divergence between the document selection probability distribution output by the retrieval model and the document selection probability distribution output by the rearrangement model; and training the retrieval model with the plurality of negative samples based on that objective function and updating the parameters of the retrieval model.
In the embodiment of the invention, the retrieval model is distilled through a fixed rearrangement model, so that the effects of improving the performance and generalization capability of the retrieval model are achieved.
In an alternative embodiment, updating parameters of the rearrangement model using a plurality of negative samples comprises: updating the parameters of the rearrangement model with the plurality of negative samples by maximizing the log likelihood function.
In the embodiment of the invention, the rearrangement model is optimized by fixing the retrieval model and adopting the maximized log likelihood function, so that the effects of improving the performance and generalization capability of the rearrangement model are achieved.
In a second aspect, the present invention provides a power equipment search question-answering method using the power-specific plug-in constructed according to the first aspect or any implementation manner corresponding to the first aspect, including: acquiring a retrieval question of a user and a large language model; calling the power-specific plug-in according to the retrieval question to obtain corresponding retrieval documents; and obtaining a question-answering result from the retrieval question and the retrieval documents using the large language model.
In the embodiment of the invention, the power-specific plug-in is combined with the large language model in plug-in form, so that the large language model can call the power-specific plug-in to acquire relevant knowledge while generating a question-answering result for the retrieval question, which alleviates the hallucination problem of the large language model and improves the accuracy of search question answering.
In an alternative embodiment, before invoking the power-specific plug-in to obtain the corresponding search document according to the search question, the method further comprises: disassembling the retrieval problem of the user to obtain a plurality of subtasks; and determining the execution sequence and the execution parameters of the plurality of subtasks according to the retrieval problem.
In the embodiment of the invention, the corresponding search document is obtained by disassembling the search problem of the user and calling the special plug-in for electric power according to a plurality of subtasks, thereby achieving the effect of improving the question-answer search efficiency and accuracy of the large language model.
In an alternative embodiment, the execution parameters include query content, and invoking the power-specific plug-in to obtain a corresponding search document according to the search question includes: and calling the power special plug-in to determine the retrieval documents corresponding to the plurality of subtasks according to the execution parameters and the execution sequence.
In the embodiment of the invention, the search documents corresponding to the plurality of subtasks are determined according to the execution parameters and the execution sequence, so that the aim of inquiring the search documents according to the execution sequence of the subtasks is fulfilled.
In an alternative embodiment, obtaining the question and answer result according to the search question and the search document by using the large language model includes: sequentially determining question and answer results corresponding to a plurality of subtasks according to the retrieval documents corresponding to the subtasks, the execution sequences of the subtasks and the execution parameters by using a large language model; and summarizing the question and answer results corresponding to the plurality of subtasks to obtain the question and answer results corresponding to the retrieval questions.
In the embodiment of the invention, the question and answer results corresponding to the plurality of subtasks are summarized to obtain the question and answer results corresponding to the user retrieval problem, thereby achieving the effect of enhancing the question and answer retrieval performance of the language model.
In a third aspect, the present invention provides a power-dedicated plug-in building apparatus for providing power domain expertise for a large language model, the apparatus comprising: the model and data acquisition module is used for acquiring a pre-training model and training data; the initialization module is used for initializing a retrieval model and a rearrangement model by using the pre-training model; the joint training module is used for carrying out joint training on the retrieval model and the rearrangement model based on training data by adopting a step-by-step iterative optimization method; and the plug-in construction module is used for constructing the power special plug-in by using the search model and the rearrangement model after the joint training.
In an alternative embodiment, the training data includes a search question and a search document, and the joint training module includes: an initialization unit, used for warm-starting the retrieval model on the training data and establishing an index vector corresponding to the search document; a negative-sample acquiring unit, configured to acquire a negative sample set including a plurality of search documents having uncertain relevance to the search question and being similar to the real document; a retrieval model training unit, used for sampling a plurality of negative samples from the negative sample set when the retrieval model and the rearrangement model have not converged, and updating parameters of the retrieval model using the plurality of negative samples; a rearrangement model training unit, used for updating the index vector corresponding to the search document and updating parameters of the rearrangement model using the plurality of negative samples; and an ending judgment unit, used for ending training when the retrieval model and the rearrangement model converge.
In an alternative embodiment, the retrieval model training unit comprises: the objective function building subunit is used for building an objective function for training the retrieval model by using the document selection probability distribution output by the retrieval model and the KL divergence of the document selection probability distribution output by the rearrangement model; and the retrieval model training subunit is used for training the retrieval model by using a plurality of negative samples based on the target function trained by the retrieval model and updating the parameters of the retrieval model.
In an alternative embodiment, the rearrangement model training unit comprises: a rearrangement model training subunit for updating parameters of the rearrangement model using a plurality of negative samples using a method that maximizes the log likelihood function.
In a fourth aspect, the present invention provides a power device search question-answering apparatus of a power-dedicated plug-in unit constructed by adopting the third aspect or any one of the corresponding embodiments thereof, the apparatus comprising: the problem and model acquisition module is used for acquiring the retrieval problem of the user and a large language model; the retrieval document acquisition module is used for calling the special power plug-in according to the retrieval problem to obtain a corresponding retrieval document; and the question and answer result obtaining module is used for obtaining a question and answer result according to the retrieval problem and the retrieval document by using the large language model.
In an alternative embodiment, the apparatus further comprises: the problem disassembly module is used for disassembling the retrieval problem of the user to obtain a plurality of subtasks; and the sequence and parameter determining module is used for determining the execution sequence and the execution parameters of the plurality of subtasks according to the retrieval problem.
In an alternative embodiment, the execution parameters include query content, and the retrieve document acquisition module includes: and the retrieval document determining unit is used for calling the power special plug-in to determine retrieval documents corresponding to the plurality of subtasks according to the execution parameters and the execution sequence.
In an alternative embodiment, the question and answer result obtaining module includes: the question and answer result determining unit is used for sequentially determining question and answer results corresponding to the plurality of subtasks according to the search documents corresponding to the plurality of subtasks, the execution sequence of the plurality of subtasks and the execution parameters by using the large language model; and the summarizing unit is used for summarizing the question and answer results corresponding to the plurality of subtasks to obtain the question and answer results corresponding to the retrieval problems.
In a fifth aspect, the present invention provides a computer device, comprising: a memory and a processor that are communicatively connected with each other, wherein the memory stores computer instructions, and the processor executes the computer instructions, thereby performing the power-dedicated plug-in construction method according to the first aspect or any embodiment corresponding thereto and the power equipment search question-answering method according to the second aspect or any embodiment corresponding thereto.
In a sixth aspect, the present invention provides a computer-readable storage medium having stored thereon computer instructions for causing a computer to execute the power-dedicated plug-in building method of the first aspect or any of the embodiments corresponding thereto and the power equipment search question-answering method of the second aspect or any of the embodiments corresponding thereto.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow diagram of a power-specific plug-in building method according to an embodiment of the invention;
FIG. 2 is a flow diagram of another power-specific plug-in building method according to an embodiment of the invention;
FIG. 3 is a flow chart of a power device search question and answer method according to an embodiment of the invention;
FIG. 4 is a flow chart of another power device search question and answer method according to an embodiment of the invention;
FIG. 5 is a block diagram of a power-specific plug-in building apparatus according to an embodiment of the present invention;
fig. 6 is a block diagram of a power equipment search question-answering apparatus according to an embodiment of the present invention;
fig. 7 is a schematic diagram of a hardware structure of a computer device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that in the description of the present invention, the terms "first," "second," and the like are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. The terms "mounted," "connected," "coupled," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; the two components can be directly connected or indirectly connected through an intermediate medium, or can be communicated inside the two components, or can be connected wirelessly or in a wired way. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
The power field is a professional field, and directly using a large language model for question-answer retrieval has the following problems: 1. as specialized expertise, knowledge of the power field is difficult to collect in the pre-training stage, so the large language model lacks power-related knowledge; 2. in power scenarios, which place high demands on reliability, the hallucination problem of the large language model is generally unacceptable.
According to an embodiment of the present invention, there is provided an embodiment of a power-specific plug-in construction method, it being noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order other than that illustrated herein.
In this embodiment, a method for constructing a power-dedicated plug-in is provided, where the power-dedicated plug-in is used to provide power domain expertise for a large language model, and the method for constructing a power-dedicated plug-in may be used for the mobile terminal, such as a central processing unit, a server, etc., and fig. 1 is a schematic flow diagram of the method for constructing a power-dedicated plug-in according to an embodiment of the present invention, as shown in fig. 1, where the flow includes the following steps:
Step S101, a pre-training model and training data are obtained. Optionally, the pre-training model is a deep learning model obtained by training on a large-scale data set, for example, a pre-training model such as BERT, GPT, ERNIE, and the training data is data used for further training the pre-training model on a current task (such as searching questions and answers), for example, professional knowledge in the electric power field.
Step S102, initializing a retrieval model and a rearrangement model by using the pre-training model. Optionally, the pre-trained model obtained in step S101, such as the ernie-3.0-base-zh model, is used to initialize a retrieval model for retrieving samples similar to the correct output and a rearrangement model for selecting the correct sample from the similar samples output by the retrieval model.
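As a non-authoritative illustration, the sketch below shows how a dual-encoder retrieval model and a cross-encoder rearrangement model could be initialized from a shared pre-trained encoder. The checkpoint identifier, the class names, and the use of the Hugging Face transformers library are assumptions for illustration only and are not specified by the patent.

```python
# Minimal sketch, assuming the Hugging Face transformers library and that the
# checkpoint id below stands in for the ernie-3.0-base-zh model named in the text.
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

CHECKPOINT = "nghuyong/ernie-3.0-base-zh"  # assumed checkpoint id, not from the patent

class DualEncoderRetriever(nn.Module):
    """Retrieval model G_theta: separate encoders for questions and documents."""
    def __init__(self, checkpoint: str = CHECKPOINT):
        super().__init__()
        self.q_encoder = AutoModel.from_pretrained(checkpoint)
        self.d_encoder = AutoModel.from_pretrained(checkpoint)

    def encode(self, encoder, inputs):
        # Use the [CLS] vector of the last layer as the sentence embedding.
        return encoder(**inputs).last_hidden_state[:, 0]

class CrossEncoderReranker(nn.Module):
    """Rearrangement model D_phi: scores a concatenated (question, document) pair."""
    def __init__(self, checkpoint: str = CHECKPOINT):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(checkpoint)
        # w_phi: maps the joint [CLS] vector to a single real-valued score.
        self.w = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, inputs):
        return self.w(self.encoder(**inputs).last_hidden_state[:, 0]).squeeze(-1)

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
retriever = DualEncoderRetriever()
reranker = CrossEncoderReranker()
```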
Step S103, jointly training the retrieval model and the rearrangement model based on the training data by a step-by-step iterative optimization method. Optionally, based on the training data obtained in step S101, step-by-step iterative optimization is adopted: the rearrangement model is fixed and the retrieval model is distilled through it, then the retrieval model is fixed and the rearrangement model is optimized, and the two models are jointly trained by alternating these two steps.
Step S104, constructing the power special plug-in by using the search model and the rearrangement model after the joint training. Optionally, the search model and the rearrangement model obtained after the combined training in the step S103 are used to construct a power special plug-in, and then the power special plug-in can be combined with the large language model in a plug-in mode, so that the purpose of providing power expertise for various tasks in the power field, such as searching of power technology standards, construction of a power equipment knowledge graph, grading of power equipment defects and the like, is achieved.
In the embodiment of the invention, the retrieval model and the rearrangement model are jointly trained by a step-by-step iterative optimization method, the trained retrieval model and rearrangement model are used to construct the power-specific plug-in, and the power-specific plug-in is combined with the large language model in plug-in form. This supplements the large language model with professional knowledge of the power field, combines traditional power-field tasks with the large language model, and solves the problem in the related art that the large language model lacks professional knowledge of the power field.
In this embodiment, a method for constructing a power-dedicated plug-in is provided, where the power-dedicated plug-in is used to provide power domain expertise for a large language model, and the power-dedicated plug-in construction method may be used in the above mobile terminal, such as a central processing unit, a server, etc., and fig. 2 is a schematic flow diagram of another power-dedicated plug-in construction method according to an embodiment of the present invention, as shown in fig. 2, where the flow includes the following steps:
Step S201, a pre-training model and training data are obtained. Please refer to step S101 in the embodiment shown in fig. 1 in detail, which is not described herein.
Step S202, initializing a retrieval model and a rearrangement model by using the pre-training model. Optionally, in the power-domain retrieval task, the power-specific plug-in needs to retrieve, from a database C, the document d related to the retrieval question (user question) q. The power-specific plug-in consists of the following two parts: a retrieval model (Dual-Encoder Retriever Module) $G_\theta$ and a rearrangement model (Cross-Encoder Ranker Module) $D_\phi$. For a user question q and a document d, the retrieval model $G_\theta$ and the rearrangement model $D_\phi$ compute the similarity scores $G_\theta(q,d)$ and $D_\phi(q,d)$ using formula (1) and formula (2), respectively:

$$G_\theta(q,d) = E_\theta(q)^{T} E_\theta(d) \qquad (1)$$

$$D_\phi(q,d) = w_\phi^{T} E_\phi([q,d]) \qquad (2)$$

where $E_\theta(\cdot)$ and $E_\phi(\cdot)$ denote the coding models that encode the input as a vector; $[q,d]$ denotes the concatenation of the user question and the document; $w_\phi$ is a parameter of $D_\phi$ responsible for mapping the encoded vector to a real number; and the superscript T denotes the transpose of a vector.
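The following minimal sketch evaluates the two similarity scores above; the embedding tensors are assumed to come from encoders such as those initialized in step S102, and the function names are illustrative rather than the patent's API.

```python
# Sketch of formulas (1) and (2), assuming 1-D torch tensors for the embeddings.
import torch

def retriever_score(e_q: torch.Tensor, e_d: torch.Tensor) -> torch.Tensor:
    # Formula (1): G_theta(q, d) = E_theta(q)^T E_theta(d), i.e. a dot product.
    return e_q @ e_d

def reranker_score(e_qd: torch.Tensor, w_phi: torch.Tensor) -> torch.Tensor:
    # Formula (2): D_phi(q, d) = w_phi^T E_phi([q, d]), mapping the joint encoding to a real number.
    return w_phi @ e_qd
```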
Step S203, jointly training the retrieval model and the rearrangement model based on the training data by a step-by-step iterative optimization method. Optionally, with a contrastive-learning training method, the objective function of the joint training of the retrieval model and the rearrangement model is given by formula (3), which expresses the probability that the rearrangement model successfully selects the correct document from the candidate set consisting of the correct document and the negative sample set retrieved from the database by the retrieval model $G_\theta$; $\theta$ and $\phi$ are the target parameters of the retrieval model and the rearrangement model, respectively. It should be noted that the retrieval model needs to retrieve documents of as high similarity as possible in order to fool the rearrangement model, while the rearrangement model needs to select the correct document from many similar documents. In order to optimize formula (3), the present embodiment optimizes the retrieval model and the rearrangement model by a step-by-step iterative optimization method. Specifically, step S203 includes:
Step S2031, warm-starting the retrieval model on the training data and establishing index vectors corresponding to the retrieval documents. Optionally, the training data include retrieval questions and retrieval documents (documents or samples); corresponding index vectors are established for the retrieval documents, and the retrieval model is pre-trained on the training data at a small scale. Warm-starting and index-vector building improve the convergence speed and the retrieval speed of the retrieval model.
Step S2032, a negative sample set is acquired, the negative sample set including a plurality of search documents having uncertain relevance to the search question and being similar to the real document. Optionally, the quality of the negative sample affects the model training effect, and the negative sample sampling should satisfy the following principles:
(1) Samples that are irrelevant to the user question, i.e., with low similarity to the question, should be sampled as little as possible.
(2) Documents with high relevance and similarity to real documents should also be sampled as little as possible, because such documents may be misjudged as true by the model, and the model cannot learn the distinction between positive and negative samples.
(3) Samples with uncertain relevance that are similar to real documents should be sampled as much as possible.
In summary, this embodiment adopts a simple and efficient sampling method. In the sampling process, the similarity of all positive and negative samples is computed with the retrieval model, and in order to increase the probability of sampling negative samples that satisfy principle (3), the sampling probability $p_i$ of the $i$-th document is computed with formula (4):

$$p_i \propto \exp\!\big(-\big|\,s(q,d_i) - s(q,d^{+}) + b\,\big|\big) \qquad (4)$$

where $\propto$ is the proportionality symbol; $\exp(\cdot)$ is the exponential function; $|\cdot|$ is the absolute-value function; $s(\cdot,\cdot)$ computes the similarity score, for example by cosine similarity or Euclidean distance; $d_i$ refers to the $i$-th document; $d^{+}$ is the correct document corresponding to the user question, i.e., the document that can answer the retrieval question $q$; $b$ is a bias, which is a hyperparameter; and the negative samples are drawn from the negative sample set. In order to reduce the computational cost, the present embodiment uses the $k$ samples with the highest similarity as the negative sample set and samples from it.
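A hedged sketch of the sampling step follows. Because the exact form of formula (4) is only partially legible in the source, the weighting below is an assumption consistent with the listed symbols (proportionality, exp, absolute value, similarity s, bias b); the helper names are illustrative.

```python
# Sample n negative documents from the k most similar candidates, preferring
# documents whose retriever similarity is close to the positive document's score.
import numpy as np

def sample_negatives(sim_to_query: np.ndarray, sim_positive: float,
                     b: float = 0.1, k: int = 100, n: int = 8, rng=None) -> np.ndarray:
    rng = rng or np.random.default_rng()
    top_k = np.argsort(-sim_to_query)[:k]          # keep the k highest-similarity documents
    logits = -np.abs(sim_to_query[top_k] - sim_positive + b)   # assumed form of formula (4)
    p = np.exp(logits - logits.max())
    p /= p.sum()                                   # turn proportional weights into probabilities
    return rng.choice(top_k, size=n, replace=False, p=p)
```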
In step S2033, when the retrieval model and the rearrangement model do not converge, a plurality of negative samples are sampled from the negative sample set, and the parameters of the retrieval model are updated using the plurality of negative samples. Specifically, step S2033 includes:
and a step a1, sampling a plurality of negative samples from the negative sample set when the retrieval model and the rearrangement model are not converged. Specifically, when the retrieval model and the rearrangement model do not converge, a plurality of negative samples are sampled according to step S2032.
Step a2, establishing an objective function for training the retrieval model using the KL divergence between the document selection probability distribution output by the retrieval model and the document selection probability distribution output by the rearrangement model. Specifically, the objective function for training the retrieval model is built with the KL divergence as in formula (5):

$$\mathcal{L}_{G_\theta} = \mathrm{KL}\Big( p_{\theta}\big(d \mid q, \mathcal{D}\big) \,\Big\|\, p_{\phi}\big(d \mid q, \mathcal{D}\big) \Big) \qquad (5)$$

where $\mathrm{KL}(\cdot\|\cdot)$ denotes the KL divergence, which measures the degree of difference between two probability distributions: the larger the KL divergence, the larger the difference between the two distributions, and the smaller the KL divergence, the smaller the difference; $p_{\theta}(d \mid q, \mathcal{D})$ is the document selection probability distribution output by the retrieval model given the user question $q$ and the document set $\mathcal{D}$; and $p_{\phi}(d \mid q, \mathcal{D})$ is the document selection probability distribution output by the rearrangement model given the user question $q$ and the document set $\mathcal{D}$.
Step a3, training the retrieval model with the plurality of negative samples based on the objective function for training the retrieval model, and updating the parameters of the retrieval model. Specifically, based on the objective function established in step a2, the retrieval model is trained on the plurality of negative samples obtained by sampling in step a1, and its parameters are updated.
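A minimal sketch of the distillation objective in steps a2 and a3, assuming the candidate scores have already been computed for each query; the tensor shapes and the function name are illustrative.

```python
# KL-divergence distillation of the (fixed) rearrangement model into the retrieval model.
import torch
import torch.nn.functional as F

def retriever_kl_loss(retriever_scores: torch.Tensor,
                      reranker_scores: torch.Tensor) -> torch.Tensor:
    """Both tensors have shape (batch, num_candidates): scores over d+ and the negatives."""
    p_theta = F.softmax(retriever_scores, dim=-1)                # retrieval model's selection distribution
    log_p_theta = F.log_softmax(retriever_scores, dim=-1)
    log_p_phi = F.log_softmax(reranker_scores.detach(), dim=-1)  # rearrangement model is fixed here
    kl = (p_theta * (log_p_theta - log_p_phi)).sum(dim=-1)       # KL(p_theta || p_phi) per query
    return kl.mean()
```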
Step S2034, updating the index vector corresponding to the search document, and updating the parameters of the rearrangement model using a plurality of negative samples. Specifically, step S2034 includes:
and b1, updating the index vector corresponding to the search document. In the training process of each round of retrieval model and rearrangement model, the document index vector needs to be updated to ensure that the latest retrieval model is always used for retrieval. For creating an index vector corresponding to a search document, the present embodiment encodes all documents with a document encoder (document encoder) of a search model, and saves them in a database through a vector search tool FAISS (Facebook AI Similarity Search).
Step b2, updating the parameters of the rearrangement model with the plurality of negative samples by maximizing the log likelihood function. Specifically, the rearrangement model is optimized by maximizing the log likelihood function as in formula (6):

$$\phi^{*} = \arg\max_{\phi} \; \sum_{(q,\,d)\in T} \log p_{\phi}\big(d \mid q, \mathcal{D}\big) \qquad (6)$$

where $\phi^{*}$ denotes the optimal rearrangement model; $\arg\max(\cdot)$ returns the parameter that maximizes the function; $\log(\cdot)$ is the logarithmic function; $T$ is the training data, in which $d$ is the correct document for the question $q$; and $p_{\phi}(d \mid q, \mathcal{D})$ is the probability that the rearrangement model outputs the document $d$ given the user question $q$ and the document set $\mathcal{D}$.
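A minimal sketch of step b2: maximizing the log likelihood of the correct document is implemented here as minimizing a cross-entropy over the candidate scores, with the convention (an assumption for illustration) that the correct document sits in column 0.

```python
# Negative log likelihood of the correct document under the rearrangement model's scores.
import torch
import torch.nn.functional as F

def reranker_nll_loss(reranker_scores: torch.Tensor) -> torch.Tensor:
    """reranker_scores: shape (batch, 1 + num_negatives); column 0 holds the correct document."""
    targets = torch.zeros(reranker_scores.size(0), dtype=torch.long,
                          device=reranker_scores.device)
    return F.cross_entropy(reranker_scores, targets)   # equals -log p_phi(d+ | q, candidate set)
```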
Step S2035, ending training when the retrieval model and the rearrangement model converge. Optionally, the above steps S2033 and S2034 are repeated until the retrieval model and the rearrangement model converge, the convergence condition being, for example, that the number of training epochs reaches an upper limit or that the loss no longer decreases.
Step S204, constructing the power special plug-in by using the search model and the rearrangement model after the joint training. Please refer to step S104 in the embodiment shown in fig. 1 in detail, which is not described herein.
In the embodiment of the invention, the joint training of the retrieval model and the rearrangement model is optimized by adopting a step-by-step iterative optimization method, namely, the retrieval model is distilled by a fixed rearrangement model, the rearrangement model is optimized by adopting a maximized log likelihood function, and the effect of improving the performance and generalization capability of the retrieval model and the rearrangement model is achieved, so that the purpose of improving the performance and generalization capability of the special plug-in unit for electric power generated by the retrieval model and the rearrangement model is realized, and the accuracy and reliability of the retrieval document are improved.
In an alternative embodiment, the algorithm steps of the step-by-step iterative optimization method for jointly training the retrieval model and the rearrangement model are shown in Table 1:
TABLE 1 Step-by-step iterative optimization algorithm
Specifically, the inputs are the retrieval model $G_\theta$, the rearrangement model $D_\phi$, the document set $\mathcal{D}$ and the training data T. First, the retrieval model $G_\theta$ and the rearrangement model $D_\phi$ are initialized with the pre-training model, the retrieval model $G_\theta$ is warm-started on the training data T, and index vectors are established on the document set $\mathcal{D}$; negative samples are then retrieved with $G_\theta$. While $G_\theta$ and $D_\phi$ have not converged, the following two phases are executed. For the retrieval model $G_\theta$, $k_1$ steps are executed in a loop: n negative samples are sampled according to formula (4) and the retrieval model parameters $\theta$ are updated according to formula (5), i.e., the rearrangement model is fixed and the retrieval model is distilled through it. The index vectors are then refreshed, and for the rearrangement model $D_\phi$, $k_2$ steps are executed in a loop: n negative samples are sampled according to formula (4) and the rearrangement model parameters $\phi$ are updated according to formula (6), i.e., the retrieval model is fixed during this phase and the rearrangement model is optimized by maximizing the log likelihood function. The above steps are repeated until the retrieval model $G_\theta$ and the rearrangement model $D_\phi$ converge.
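The outer loop of Table 1 can be condensed as the following sketch. The helper callables (encoding, index building, sampling and the two losses) are injected as parameters and correspond to the assumed sketches given earlier; the fixed iteration count stands in for the convergence test and is an illustrative simplification.

```python
# Step-by-step iterative optimization: alternate k1 retriever updates (reranker fixed)
# with k2 reranker updates (retriever fixed), refreshing the index vectors in between.
def joint_train(retriever, reranker, retriever_opt, reranker_opt, documents, train_data, *,
                encode_documents, build_index, sample_batch,
                retriever_kl_loss, reranker_nll_loss,
                outer_steps=10, k1=100, k2=100):
    index = build_index(encode_documents(retriever, documents))
    for _ in range(outer_steps):                        # until convergence in practice
        for _ in range(k1):                             # retriever phase: reranker fixed
            batch = sample_batch(train_data, index, retriever)      # formula (4) sampling
            loss = retriever_kl_loss(batch["retriever_scores"], batch["reranker_scores"])
            loss.backward(); retriever_opt.step(); retriever_opt.zero_grad()
        index = build_index(encode_documents(retriever, documents))  # refresh index vectors
        for _ in range(k2):                             # reranker phase: retriever fixed
            batch = sample_batch(train_data, index, retriever)
            loss = reranker_nll_loss(batch["reranker_scores"])        # log-likelihood objective
            loss.backward(); reranker_opt.step(); reranker_opt.zero_grad()
    return retriever, reranker
```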
In the embodiment of the invention, the retrieval model and the rearrangement model are jointly trained by a step-by-step iterative optimization method, the trained retrieval model and rearrangement model are used to construct the power-specific plug-in, and the power-specific plug-in is combined with the large language model in plug-in form. This supplements the large language model with professional knowledge of the power field, combines traditional power-field tasks with the large language model, and solves the problem in the related art that the large language model lacks professional knowledge of the power field.
In this embodiment, a method for searching and answering a power device by using the power-dedicated plug-in unit constructed in the first aspect or any implementation manner corresponding to the first aspect is provided, which may be used in the mobile terminal, such as a central processing unit, a server, etc., and fig. 3 is a schematic flow diagram of the method for searching and answering a power device according to an embodiment of the present invention, as shown in fig. 3, where the flow includes the following steps:
step S301, obtaining a retrieval problem of a user and a large language model. Alternatively, the large language model may be an existing large language model obtained through large-scale corpus pre-training, and the retrieval problem is a user problem that needs to input the large language model to generate a question-answer result.
Step S302, calling a power special plug-in according to the retrieval problem to obtain a corresponding retrieval document. Optionally, the search question of the user in step S301 is input into a power-dedicated plug-in, and a search document corresponding to the search question is obtained. It should be noted that, the power-dedicated plug-in may combine different types of tasks in the power domain with a large language model in a plug-in manner.
Step S303, obtaining a question and answer result according to the retrieval problem and the retrieval document by using the large language model. Optionally, the retrieval questions of the user and the retrieval documents are input into the large language model together to generate corresponding question and answer results, so that the purposes of improving the quality and effect of the question and answer results are achieved.
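As a non-authoritative illustration of steps S301 to S303, the sketch below retrieves documents through the power-specific plug-in and then prompts the large language model with the question and the retrieved context; the plugin and llm interfaces are assumptions, not the patent's API.

```python
# Retrieve-then-answer flow: the plug-in supplies power-domain documents, the LLM answers.
def answer_with_plugin(question: str, plugin, llm) -> str:
    documents = plugin.retrieve(question)              # power-specific plug-in call (assumed interface)
    context = "\n".join(documents)
    prompt = ("Answer the question using only the reference documents.\n"
              f"Documents:\n{context}\n\nQuestion: {question}\nAnswer:")
    return llm.generate(prompt)                        # assumed text-generation interface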
In the embodiment of the invention, the power-specific plug-in is combined with the large language model in plug-in form, so that the large language model can call the power-specific plug-in to acquire relevant knowledge while generating a question-answering result for the retrieval question, which alleviates the hallucination problem of the large language model and improves the accuracy of search question answering.
In this embodiment, a method for searching and answering a power device by using the power-dedicated plug-in unit constructed in the first aspect or any implementation manner corresponding to the first aspect is provided, which may be used in the mobile terminal, such as a central processing unit, a server, etc., and fig. 4 is a schematic flow diagram of the method for searching and answering a power device according to an embodiment of the present invention, as shown in fig. 4, where the flow includes the following steps:
Step S401, obtaining a retrieval problem of a user and a large language model. Please refer to step S301 in the embodiment shown in fig. 3 in detail, which is not described herein.
Step S402, disassembling the retrieval problem of the user to obtain a plurality of subtasks. Alternatively, in a real power operation environment, in order to provide accurate solutions to the retrieval problem of the user, a series of cumbersome task processes are often required. Therefore, the embodiment relies on the strong understanding capability of the large language model to deeply analyze the retrieval problem and disassemble the retrieval problem into a series of subtasks.
Step S403, determining the execution sequence and execution parameters of the plurality of subtasks according to the retrieval question. Optionally, the large language model is used to arrange a reasonable execution order and execution parameters for the plurality of subtasks. As an example, to better represent each subtask and facilitate its execution, the retrieval question is parsed into a task list in json format, for instance by instructing the large language model to perform the task parsing. Each subtask includes: a task name, a unique task number, task dependency numbers and task parameters. The task name is the name of the subtask plug-in, covering tasks such as power technology standard retrieval, power equipment knowledge graph construction and power equipment defect grading; the unique task number identifies and distinguishes each subtask; the task dependency numbers are the numbers of the preceding subtasks on which the current subtask depends; and the task parameters are the parameters for executing the subtask, such as the content of the query.
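The shape of such a json-format task list might look like the following; the field names, plug-in names and the example queries are assumptions for illustration, not the patent's exact schema.

```python
# Illustrative task list produced by the large language model's task parsing.
task_list = [
    {"task": "power_standard_retrieval",   # subtask plug-in name
     "id": 1,                              # unique serial number of the subtask
     "dep": [],                            # numbers of subtasks this one depends on
     "args": {"query": "insulation requirements for 110 kV transformers"}},
    {"task": "power_defect_grading",
     "id": 2,
     "dep": [1],                           # depends on the output of subtask 1
     "args": {"query": "grade the defect described by the retrieved standard"}},
]
```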
Step S404, calling the power special plug-in to obtain a corresponding search document according to the search problem, wherein the step S404 optionally comprises:
step S4041, the power special plug-in is called to determine the retrieval documents corresponding to the plurality of subtasks according to the execution parameters and the execution sequence. Specifically, the natural language processing experience accumulated in the traditional power field includes many power characteristic tasks, such as retrieval of power technology standards, construction of power equipment knowledge maps, grading of power equipment defects, and the like, and the embodiment combines a power special plug-in capable of realizing the tasks with a large language model by adopting a plug-in structure. By way of example, the large language model is used for sequentially judging whether any subtask needs to call the power special plug-in according to the execution sequence and the execution parameters of the subtasks, and if the subtask needs to call the power special plug-in, the large language model calls the power special plug-in to obtain the corresponding retrieval document.
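A hedged sketch of step S4041 follows: the subtasks are walked in execution order and the power-specific plug-in is called for those subtasks that need it; the routing decision and the object interfaces are assumptions for illustration.

```python
# Dispatch the subtasks in order and collect the retrieved documents per subtask id.
def run_subtasks(task_list, plugin, llm):
    retrieved = {}                                           # subtask id -> retrieved documents
    for task in sorted(task_list, key=lambda t: t["id"]):    # follow the execution order
        if llm.needs_plugin(task):                           # assumed plug-in routing decision
            retrieved[task["id"]] = plugin.retrieve(task["args"]["query"])
    return retrieved
```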
Step S405, obtaining a question and answer result according to the retrieval problem and the retrieval document by using the large language model. Optionally, the step S405 includes:
step S4051, sequentially determining the question and answer results corresponding to the plurality of subtasks by using the large language model according to the search document corresponding to the plurality of subtasks, the execution sequence of the plurality of subtasks and the execution parameters. Optionally, for any subtask, if a corresponding search document exists, the search document and the subtask are input into the large language model together, and the output of the large language model, namely, the question-answer result corresponding to any subtask is obtained. Optionally, when any subtask is executed according to the execution sequence, the question and answer result corresponding to any subtask may be saved, for example, using the args field as a parameter, and the question and answer result of the subtask may be saved in the database.
Step S4052, summarizing the question-answering results corresponding to the plurality of subtasks to obtain the question-answering result corresponding to the retrieval question. Optionally, after all the subtasks have been executed, the large language model gathers the execution results (question-answering results) of all the subtasks and then generates the question-answering result corresponding to the user's retrieval question on that basis. In particular, a model in the power field must answer accurately, since any wrong data or operation flow may cause a power accident; therefore, if the information returned by the tasks cannot resolve the user's retrieval question, the large language model feeds back to the user that the question cannot be answered based on the known information. Meanwhile, in order to make the answers of the large language model traceable and to help related personnel analyze the accuracy of the question-answering result, the execution result of each subtask is returned to the user in a structured form as a reference basis.
Compared with a general-purpose large language model, the embodiment of the invention can effectively improve the performance indicators of downstream tasks in the power field, while keeping the cost of migrating to different downstream power-field tasks low.
The embodiment also provides a device for constructing a plug-in unit special for electric power, which is used for realizing the embodiment and the preferred implementation manner of the first aspect, and the description is omitted. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
The present embodiment provides a power-dedicated plug-in building apparatus, as shown in fig. 5, including: a model and data acquisition module 501, configured to acquire a pre-training model and training data; an initialization module 502 for initializing the retrieval model and the rearrangement model using the pre-training model; a joint training module 503, configured to perform joint training on the search model and the rearrangement model based on training data by adopting a step-by-step iterative optimization method; plug-in building module 504 is configured to build a power-specific plug-in using the jointly trained retrieval model and the rearrangement model.
In an alternative embodiment, the training data includes a search question and a search document, and the joint training module includes: an initialization unit, used for warm-starting the retrieval model on the training data and establishing an index vector corresponding to the search document; a negative-sample acquiring unit, configured to acquire a negative sample set including a plurality of search documents having uncertain relevance to the search question and being similar to the real document; a retrieval model training unit, used for sampling a plurality of negative samples from the negative sample set when the retrieval model and the rearrangement model have not converged, and updating parameters of the retrieval model using the plurality of negative samples; a rearrangement model training unit, used for updating the index vector corresponding to the search document and updating parameters of the rearrangement model using the plurality of negative samples; and an ending judgment unit, used for ending training when the retrieval model and the rearrangement model converge.
In an alternative embodiment, the retrieval model training unit comprises: the objective function building subunit is used for building an objective function for training the retrieval model by using the document selection probability distribution output by the retrieval model and the KL divergence of the document selection probability distribution output by the rearrangement model; and the retrieval model training subunit is used for training the retrieval model by using a plurality of negative samples based on the target function trained by the retrieval model and updating the parameters of the retrieval model.
In an alternative embodiment, the rearrangement model training unit comprises: a rearrangement model training subunit for updating parameters of the rearrangement model using a plurality of negative samples using a method that maximizes the log likelihood function.
Further functional descriptions of the above respective modules and units are the same as those of the above corresponding embodiments, and are not repeated here.
The power-specific plug-in building apparatus in this embodiment is presented in the form of functional units, where a unit may be an ASIC (Application Specific Integrated Circuit) circuit, a processor and memory executing one or more software or firmware programs, and/or another device that can provide the above-described functionality.
The present embodiment provides a power equipment search question-answering device of a power-dedicated plug-in unit constructed by adopting the third aspect or any one of the corresponding embodiments thereof, as shown in fig. 6, including: a question and model acquisition module 601, configured to acquire a retrieval question of a user and a large language model; the retrieval document acquisition module 602 is configured to call a power-dedicated plug-in to obtain a corresponding retrieval document according to a retrieval problem; and a question and answer result obtaining module 603, configured to obtain a question and answer result according to the search question and the search document by using the large language model.
In an alternative embodiment, the apparatus further comprises: the problem disassembly module is used for disassembling the retrieval problem of the user to obtain a plurality of subtasks; and the sequence and parameter determining module is used for determining the execution sequence and the execution parameters of the plurality of subtasks according to the retrieval problem.
In an alternative embodiment, the execution parameters include query content, and the retrieve document acquisition module includes: and the retrieval document determining unit is used for calling the power special plug-in to determine retrieval documents corresponding to the plurality of subtasks according to the execution parameters and the execution sequence.
In an alternative embodiment, the question and answer result obtaining module includes: the question and answer result determining unit is used for sequentially determining question and answer results corresponding to the plurality of subtasks according to the search documents corresponding to the plurality of subtasks, the execution sequence of the plurality of subtasks and the execution parameters by using the large language model; and the summarizing unit is used for summarizing the question and answer results corresponding to the plurality of subtasks to obtain the question and answer results corresponding to the retrieval problems.
Further functional descriptions of the above respective modules and units are the same as those of the above corresponding embodiments, and are not repeated here.
The power equipment search question-answering apparatus in this embodiment is presented in the form of functional units, where a unit may be an ASIC (Application Specific Integrated Circuit) circuit, a processor and memory executing one or more software or firmware programs, and/or another device that can provide the above-described functions.
The embodiment of the invention also provides computer equipment, which is provided with the power special plug-in building device shown in the figure 5 and the power equipment search question-answering device shown in the figure 6.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a computer device according to an alternative embodiment of the present invention. As shown in fig. 7, the computer device includes: one or more processors 10, a memory 20, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are communicatively coupled to each other using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the computer device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In some alternative embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Likewise, multiple computer devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 10 is illustrated in fig. 7.
The processor 10 may be a central processor, a network processor, or a combination thereof. The processor 10 may further include a hardware chip, among others. The hardware chip may be an application specific integrated circuit, a programmable logic device, or a combination thereof. The programmable logic device may be a complex programmable logic device, a field programmable gate array, a general-purpose array logic, or any combination thereof.
Wherein the memory 20 stores instructions executable by the at least one processor 10 to cause the at least one processor 10 to perform a method for implementing the embodiments described above.
The memory 20 may include a storage program area that may store an operating system, at least one application program required for functions, and a storage data area; the storage data area may store data created according to the use of the computer device, etc. In addition, the memory 20 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some alternative embodiments, memory 20 may optionally include memory located remotely from processor 10, which may be connected to the computer device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Memory 20 may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as flash memory, hard disk, or solid state disk; the memory 20 may also comprise a combination of the above types of memories.
The computer device also includes a communication interface 30 for the computer device to communicate with other devices or communication networks.
The embodiments of the present invention also provide a computer-readable storage medium. The methods according to the embodiments of the present invention described above may be implemented in hardware or firmware, or may be implemented as computer code that can be recorded on a storage medium, or as computer code originally stored on a remote storage medium or a non-transitory machine-readable storage medium and downloaded over a network to be stored on a local storage medium, so that the methods described herein can be processed by software stored on a storage medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware. The storage medium can be a magnetic disk, an optical disk, a read-only memory, a random access memory, a flash memory, a hard disk, a solid state disk or the like; further, the storage medium may also comprise a combination of memories of the above kinds. It will be appreciated that a computer, processor, microprocessor controller or programmable hardware includes a storage element that can store or receive software or computer code which, when accessed and executed by the computer, processor or hardware, implements the methods illustrated in the above embodiments.
Although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope of the invention as defined by the appended claims.

Claims (18)

1. A method for constructing a power-specific plug-in, the power-specific plug-in being configured to provide power domain expertise for a large language model, the method comprising:
acquiring a pre-training model and training data;
initializing a retrieval model and a rearrangement model by using the pre-training model;
performing joint training on the retrieval model and the rearrangement model based on the training data by adopting a step-by-step iterative optimization method;
and constructing the power-specific plug-in by using the retrieval model and the rearrangement model after the joint training.
2. The power-specific plug-in construction method according to claim 1, wherein the training data includes retrieval questions and retrieval documents, and the joint training of the retrieval model and the rearrangement model based on the training data by adopting the step-by-step iterative optimization method includes:
warm-starting the retrieval model on the training data, and establishing index vectors corresponding to the retrieval documents;
acquiring a negative sample set, wherein the negative sample set includes a plurality of retrieval documents whose relevance to the retrieval question is uncertain and which are similar to the ground-truth document;
when the retrieval model and the rearrangement model have not converged, sampling a plurality of negative samples from the negative sample set, and updating parameters of the retrieval model by using the plurality of negative samples;
updating the index vectors corresponding to the retrieval documents, and updating parameters of the rearrangement model by using the plurality of negative samples;
and ending the training when the retrieval model and the rearrangement model converge.
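As an illustration of the step-by-step iterative optimization in claim 2, the sketch below warm-starts a retrieval model, builds index vectors, and then alternates between updating the retrieval model and the rearrangement model on sampled negatives. The encoder and reranker architectures, optimizers, and data shapes are assumptions made for illustration only; the claim fixes only the order of the steps.

```python
import random

import torch
import torch.nn.functional as F
from torch import nn


class TinyRetriever(nn.Module):
    """Stand-in dual encoder: maps a bag-of-words vector to a unit embedding."""
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.proj = nn.Linear(vocab, dim)

    def forward(self, x):
        return F.normalize(self.proj(x), dim=-1)


class TinyReranker(nn.Module):
    """Stand-in cross scorer: scores a (question, document) pair jointly."""
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.proj = nn.Linear(2 * vocab, dim)
        self.out = nn.Linear(dim, 1)

    def forward(self, q, d):
        h = torch.tanh(self.proj(torch.cat([q, d], dim=-1)))
        return self.out(h).squeeze(-1)


def build_index(retriever, docs):
    """Re-encode every retrieval document into an index vector."""
    with torch.no_grad():
        return retriever(docs)


def train_jointly(retriever, reranker, questions, positives, neg_pool, steps=100):
    opt_r = torch.optim.Adam(retriever.parameters(), lr=1e-3)
    opt_k = torch.optim.Adam(reranker.parameters(), lr=1e-3)
    index = build_index(retriever, positives)        # warm start: initial index vectors
    for step in range(steps):                        # fixed step count stands in for "until convergence"
        q = questions[step % len(questions)].unsqueeze(0)
        pos = positives[step % len(positives)].unsqueeze(0)
        negs = neg_pool[random.sample(range(len(neg_pool)), k=4)]   # sample negatives
        cands = torch.cat([pos, negs], dim=0)        # ground-truth document sits at index 0

        # Update the retriever toward the reranker's document distribution (claim 3).
        r_scores = (retriever(q) @ retriever(cands).T).squeeze(0)
        with torch.no_grad():
            k_scores = reranker(q.expand(len(cands), -1), cands)
        loss_r = F.kl_div(F.log_softmax(r_scores, dim=-1),
                          F.softmax(k_scores, dim=-1), reduction="batchmean")
        opt_r.zero_grad(); loss_r.backward(); opt_r.step()

        # Refresh the index vectors, then update the reranker by maximum likelihood (claim 4).
        index = build_index(retriever, positives)
        k_scores = reranker(q.expand(len(cands), -1), cands)
        loss_k = F.cross_entropy(k_scores.unsqueeze(0), torch.zeros(1, dtype=torch.long))
        opt_k.zero_grad(); loss_k.backward(); opt_k.step()
    return retriever, reranker, index
```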
3. The power-specific plug-in construction method according to claim 2, wherein the updating of the parameters of the retrieval model by using the plurality of negative samples includes:
establishing an objective function for training the retrieval model by using the KL divergence between the document selection probability distribution output by the retrieval model and the document selection probability distribution output by the rearrangement model;
and training the retrieval model by using the plurality of negative samples based on the objective function for training the retrieval model, and updating the parameters of the retrieval model.
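In one common reading of claim 3, the retrieval model's scores over the candidate documents (the ground-truth document plus the sampled negatives) are normalized into a selection distribution and pulled toward the rearrangement model's distribution; the direction of the divergence is not fixed by the claim, so the form below is only one possible instantiation.

$$
P_{\mathrm{ret}}(d \mid q) = \frac{\exp\big(s_{\mathrm{ret}}(q, d)\big)}{\sum_{d' \in \mathcal{C}(q)} \exp\big(s_{\mathrm{ret}}(q, d')\big)}, \qquad
\mathcal{L}_{\mathrm{ret}} = \mathrm{KL}\!\left(P_{\mathrm{rr}}(\cdot \mid q) \,\big\|\, P_{\mathrm{ret}}(\cdot \mid q)\right),
$$

where $\mathcal{C}(q)$ is the candidate set for question $q$, $s_{\mathrm{ret}}$ is the retrieval model's similarity score, and $P_{\mathrm{rr}}$ is the analogous softmax over the rearrangement model's scores.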
4. The power-specific plug-in construction method according to claim 2, wherein the updating of the parameters of the rearrangement model by using the plurality of negative samples includes:
updating the parameters of the rearrangement model with the plurality of negative samples by maximizing a log-likelihood function.
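Claim 4 only states that the rearrangement model is trained by maximizing a log-likelihood; a standard instantiation with one ground-truth document $d^{+}$ and sampled negatives $d^{-}$ is

$$
\mathcal{L}_{\mathrm{rr}} = -\log \frac{\exp\big(s_{\mathrm{rr}}(q, d^{+})\big)}{\exp\big(s_{\mathrm{rr}}(q, d^{+})\big) + \sum_{d^{-}} \exp\big(s_{\mathrm{rr}}(q, d^{-})\big)},
$$

which corresponds to the cross-entropy term used for the reranker update in the sketch after claim 2.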
5. A power equipment search question-answering method using the power-specific plug-in constructed by the power-specific plug-in construction method according to any one of claims 1 to 4, the method comprising:
acquiring a retrieval question of a user and a large language model;
calling the power-specific plug-in according to the retrieval question to obtain a corresponding retrieval document;
and obtaining a question-answer result according to the retrieval question and the retrieval document by using the large language model.
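The question-answering flow of claim 5 can be pictured with the sketch below; `PowerPlugin`, the prompt format, and the `llm` callable are hypothetical stand-ins, since the claim does not name a concrete interface.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PowerPlugin:
    """Hypothetical wrapper around the trained retrieval and rearrangement models."""
    retrieve: Callable[[str, int], List[str]]      # retrieval model + index lookup
    rerank: Callable[[str, List[str]], List[str]]  # rearrangement model ordering

    def __call__(self, question: str, top_k: int = 3) -> List[str]:
        candidates = self.retrieve(question, 20)          # coarse recall from the index
        return self.rerank(question, candidates)[:top_k]  # fine-grained reordering


def answer(question: str, plugin: PowerPlugin, llm: Callable[[str], str]) -> str:
    docs = plugin(question)  # call the power-specific plug-in for retrieval documents
    prompt = ("Answer the question using the documents below.\n"
              + "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
              + f"\nQuestion: {question}\nAnswer:")
    return llm(prompt)       # the large language model produces the question-answer result
```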
6. The power equipment search question-answering method according to claim 5, wherein before the calling of the power-specific plug-in according to the retrieval question to obtain the corresponding retrieval document, the method further comprises:
decomposing the retrieval question of the user to obtain a plurality of subtasks;
and determining an execution order and execution parameters of the plurality of subtasks according to the retrieval question.
7. The power equipment search question-answering method according to claim 6, wherein the execution parameters include query content, and the calling of the power-specific plug-in according to the retrieval question to obtain the corresponding retrieval document includes:
calling the power-specific plug-in to determine retrieval documents corresponding to the plurality of subtasks according to the execution parameters and the execution order.
8. The power equipment search question-answering method according to claim 7, wherein the obtaining of the question-answer result according to the retrieval question and the retrieval document by using the large language model includes:
sequentially determining question-answer results corresponding to the plurality of subtasks according to the retrieval documents corresponding to the subtasks, the execution order of the subtasks, and the execution parameters by using the large language model;
and summarizing the question-answer results corresponding to the plurality of subtasks to obtain the question-answer result corresponding to the retrieval question.
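For claims 6 to 8, one way to picture the subtask handling is the sketch below; the decomposer, the execution-parameter dictionary keys (`order`, `query`), and the prompts are illustrative assumptions rather than anything fixed by the claims.

```python
from typing import Callable, Dict, List


def answer_with_subtasks(question: str,
                         decompose: Callable[[str], List[Dict]],
                         plugin: Callable[[str], List[str]],
                         llm: Callable[[str], str]) -> str:
    # Decompose the user's retrieval question into ordered subtasks with parameters.
    subtasks = sorted(decompose(question), key=lambda t: t["order"])
    partial_answers: List[str] = []
    for task in subtasks:
        # Call the power-specific plug-in per subtask using its query content.
        docs = plugin(task["query"])
        context = "\n".join(docs)
        partial_answers.append(
            llm(f"Documents:\n{context}\nSubtask: {task['query']}\nAnswer:"))
    # Summarize the per-subtask answers into the final question-answer result.
    joined = "\n".join(f"- {a}" for a in partial_answers)
    return llm(f"Question: {question}\nPartial answers:\n{joined}\nFinal answer:")
```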
9. A power-specific plug-in construction apparatus, configured to provide power domain expertise for a large language model, the apparatus comprising:
the model and data acquisition module is used for acquiring a pre-training model and training data;
the initialization module is used for initializing a retrieval model and a rearrangement model by using the pre-training model;
the joint training module is used for carrying out joint training on the retrieval model and the rearrangement model based on the training data by adopting a step-by-step iterative optimization method;
and the plug-in construction module is used for constructing the power-specific plug-in by using the retrieval model and the rearrangement model after the joint training.
10. The power-specific plug-in construction apparatus according to claim 9, wherein the training data includes retrieval questions and retrieval documents, and the joint training module comprises:
the initialization unit is used for warm-starting the retrieval model on the training data and establishing index vectors corresponding to the retrieval documents;
the negative sample acquiring unit is used for acquiring a negative sample set, wherein the negative sample set includes a plurality of retrieval documents whose relevance to the retrieval question is uncertain and which are similar to the ground-truth document;
the retrieval model training unit is used for sampling a plurality of negative samples from the negative sample set when the retrieval model and the rearrangement model have not converged, and updating parameters of the retrieval model by using the plurality of negative samples;
the rearrangement model training unit is used for updating the index vectors corresponding to the retrieval documents and updating parameters of the rearrangement model by using the plurality of negative samples;
and the ending judgment unit is used for ending the training when the retrieval model and the rearrangement model converge.
11. The power-specific plug-in construction apparatus according to claim 10, wherein the retrieval model training unit includes:
the objective function building subunit is used for building an objective function for training the retrieval model by using the KL divergence between the document selection probability distribution output by the retrieval model and the document selection probability distribution output by the rearrangement model;
and the retrieval model training subunit is used for training the retrieval model by using the plurality of negative samples based on the objective function for training the retrieval model, and updating the parameters of the retrieval model.
12. The power-specific plug-in construction apparatus according to claim 10, wherein the rearrangement model training unit includes:
the rearrangement model training subunit is used for updating the parameters of the rearrangement model with the plurality of negative samples by maximizing a log-likelihood function.
13. A power equipment search question-answering apparatus employing the power-specific plug-in constructed by the power-specific plug-in construction apparatus according to any one of claims 9 to 12, the apparatus comprising:
the question and model acquisition module is used for acquiring a retrieval question of a user and a large language model;
the retrieval document acquisition module is used for calling the power-specific plug-in according to the retrieval question to obtain a corresponding retrieval document;
and the question-answer result obtaining module is used for obtaining a question-answer result according to the retrieval question and the retrieval document by using the large language model.
14. The power equipment search question-answering apparatus according to claim 13, wherein the apparatus further comprises:
the question decomposition module is used for decomposing the retrieval question of the user to obtain a plurality of subtasks;
and the order and parameter determining module is used for determining an execution order and execution parameters of the plurality of subtasks according to the retrieval question.
15. The power equipment search question-answering apparatus according to claim 14, wherein the execution parameters include query content, and the retrieval document acquisition module includes:
the retrieval document determining unit is used for calling the power-specific plug-in to determine retrieval documents corresponding to the plurality of subtasks according to the execution parameters and the execution order.
16. The power equipment search question-answering apparatus according to claim 15, wherein the question-answer result obtaining module includes:
the question-answer result determining unit is used for sequentially determining question-answer results corresponding to the plurality of subtasks according to the retrieval documents corresponding to the plurality of subtasks, the execution order of the plurality of subtasks, and the execution parameters by using the large language model;
and the summarizing unit is used for summarizing the question-answer results corresponding to the plurality of subtasks to obtain the question-answer result corresponding to the retrieval question.
17. A computer device, comprising:
a memory and a processor, the memory and the processor being communicatively connected to each other, the memory having stored therein computer instructions, and the processor executing the computer instructions to perform the power-specific plug-in construction method according to any one of claims 1 to 4 and the power equipment search question-answering method according to any one of claims 5 to 8.
18. A computer-readable storage medium having stored thereon computer instructions for causing a computer to execute the power-specific plug-in construction method according to any one of claims 1 to 4 and the power equipment search question-answering method according to any one of claims 5 to 8.
CN202311617146.6A 2023-11-29 2023-11-29 Plug-in construction special for electric power and electric power equipment search question-answering method Pending CN117668183A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311617146.6A CN117668183A (en) 2023-11-29 2023-11-29 Plug-in construction special for electric power and electric power equipment search question-answering method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311617146.6A CN117668183A (en) 2023-11-29 2023-11-29 Plug-in construction special for electric power and electric power equipment search question-answering method

Publications (1)

Publication Number Publication Date
CN117668183A (en) 2024-03-08

Family

ID=90070679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311617146.6A Pending CN117668183A (en) 2023-11-29 2023-11-29 Plug-in construction special for electric power and electric power equipment search question-answering method

Country Status (1)

Country Link
CN (1) CN117668183A (en)

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination