CN117891831A - Substation operation and maintenance intelligent question-answering method based on large model, related method and device


Info

Publication number
CN117891831A
Authority
CN
China
Prior art keywords
preset
questions
large model
power
answered
Legal status
Pending
Application number
CN202410076033.8A
Other languages
Chinese (zh)
Inventor
王鹏
鲁学昆
张荣涛
郑小锋
具学圆
韦立雷
Current Assignee
Beijing Washi Intelligent Technology Co ltd
Original Assignee
Beijing Washi Intelligent Technology Co ltd
Application filed by Beijing Washi Intelligent Technology Co ltd filed Critical Beijing Washi Intelligent Technology Co ltd
Priority to CN202410076033.8A priority Critical patent/CN117891831A/en
Publication of CN117891831A publication Critical patent/CN117891831A/en
Pending legal-status Critical Current


Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 - INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S - SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 - Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 - Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The application discloses a substation operation and maintenance intelligent question-answering method based on a large model, together with a related method and device. A question to be answered is filled into a first preset template, the filled first preset template is sent to a trained large model to obtain an intention number, and a corresponding preset answer mode is determined. If the preset answer mode is power knowledge, the question to be answered is filled into a second preset prompt template to obtain a special prompt question. If the preset answer mode is power policy, corresponding relevant fragments are retrieved through a pre-constructed knowledge base and filled into a third preset template to obtain a special prompt question. If the preset answer mode is power equipment state, a corresponding database table is determined according to the question to be answered, an SQL statement is obtained through the trained large model and executed to obtain a corresponding query result, and the query result is filled into a fifth preset prompt template to obtain a special prompt question. The special prompt question is sent to the trained large model to obtain the corresponding answer.

Description

Substation operation and maintenance intelligent question-answering method based on large model, related method and device
Technical Field
The application relates to a substation operation and maintenance intelligent question-answering method based on a large model, a related method and a related device.
Background
In substation operation and maintenance, complex professional questions often need to be answered. A traditional question-answering system mainly returns preset answers based on keyword matching, fixed rules, similarity retrieval and similar methods. To answer questions related to power policies, the traditional system requires consulting the corresponding documents; to answer questions related to business data, it requires writing and configuring SQL and then querying a database to obtain the corresponding answer, and each updated requirement needs further development and configuration.
Disclosure of Invention
In order to better solve professional problems arising in the operation and maintenance of transformer substations, the embodiments of the present application provide a substation operation and maintenance intelligent question-answering method based on a large model, together with a related method and device.
In a first aspect, an embodiment of the present application provides a substation operation and maintenance intelligent question-answering method based on a large model, where the method includes:
filling a to-be-answered question into a first preset template, and sending the filled first preset template to a trained large model to obtain an intention number;
Determining a corresponding preset answer mode according to the intention number; the preset answer mode comprises power knowledge, power policy and power equipment state;
if the preset answer mode is the electric power knowledge, filling the to-be-answered questions into a second preset template to obtain special prompt questions;
if the preset answer mode is the power policy, searching relevant fragments of the questions to be answered through a pre-constructed knowledge base; filling the questions to be answered and the relevant fragments into a third preset template to obtain a special prompt question;
if the preset answer mode is the power equipment state, determining a corresponding database table according to the questions to be answered; filling the questions to be answered and the database table into a fourth preset template, and sending the filled fourth preset template to a trained large model to obtain an SQL statement corresponding to the questions to be answered; executing the SQL statement to obtain a corresponding query result; filling the questions to be answered and the query results into a fifth preset template to obtain special prompt questions;
and sending the special prompt questions to the trained large model to obtain corresponding answers.
In an alternative implementation manner of the embodiment of the present application, the knowledge base is constructed in the following manner:
preprocessing a pre-collected power related document, and dividing the preprocessed power related document to obtain a plurality of document fragments;
converting each document fragment into a vector according to a word embedding method;
and storing each vector in a preset vector database to obtain the knowledge base.
In an optional implementation manner of the embodiment of the present application, the database table includes an equipment table and an alarm table; the database table is constructed by:
acquiring the names of all equipment in the operation and maintenance range of the transformer substation, and constructing the equipment table;
and acquiring the ID and the alarm name of each device in the operation and maintenance range of the transformer substation, and constructing the alarm table.
In an optional implementation manner of the embodiment of the present application, the trained large model is a trained ChatGLM-6B large model; the trained large model is obtained by:
encoding the pre-obtained preprocessed power knowledge and operation and maintenance overhaul data to obtain an encoded file;
training the pre-trained large model through the coding file and a preset trainer based on a preset parameter fine tuning method to obtain the trained large model.
In an optional implementation manner of the embodiment of the present application, the power knowledge and the operation and maintenance overhaul data after the pretreatment are obtained by the following manners:
data cleaning is carried out on the power knowledge and operation maintenance data which are obtained in advance, and the cleaned power knowledge and operation maintenance data are obtained;
and converting the cleaned power knowledge and operation maintenance data into a JSON format according to the form of matching the questions and answers, and obtaining the preprocessed power knowledge and operation maintenance data.
In a second aspect, an embodiment of the present application provides a method for constructing a substation operation and maintenance intelligent question-answering system based on a large model, where the method includes:
preprocessing a pre-collected power related document, and dividing the preprocessed power related document to obtain each segment;
according to a word embedding method, converting each segment into each vector and storing the vectors in a preset vector database to obtain a knowledge base;
and constructing a substation operation and maintenance intelligent question-answering system based on the trained large model, the knowledge base, the pre-constructed database table and each preset answer mode.
In a third aspect, an embodiment of the present application provides a substation operation and maintenance intelligent question-answering device based on a large model, where the device includes:
The number determining module is used for filling the questions to be answered into a first preset template, and sending the filled first preset template to the trained large model to obtain the intention number;
the mode determining module is used for determining a corresponding preset answer mode according to the intention number; the preset answer mode comprises power knowledge, power policy and power equipment state;
the first determining module is used for filling the to-be-answered questions into a second preset prompt template if the preset answer mode is electric power knowledge, so as to obtain special prompt questions;
the second determining module is used for searching relevant fragments of the questions to be answered through a pre-constructed knowledge base if the preset answer mode is a power policy; filling the questions to be answered and the relevant fragments into a third preset template to obtain a special prompt question;
the third determining module is used for determining a corresponding database table according to the to-be-answered question if the preset answer mode is the power equipment state; filling the questions to be answered and the database table into a fourth preset template, and sending the filled fourth preset template to a trained large model to obtain an SQL statement corresponding to the questions to be answered; executing the SQL statement to obtain a corresponding query result; filling the questions to be answered and the query results into a fifth preset template to obtain special prompt questions;
And the answer determining module is used for sending the special prompt questions to the trained large model to obtain corresponding answers.
In a fourth aspect, an embodiment of the present application provides a device for constructing a substation operation and maintenance intelligent question-answering system based on a large model, where the device includes:
the segmentation module is used for preprocessing the pre-collected power-related documents and segmenting the preprocessed power-related documents to obtain fragments;
the first construction module is used for converting each segment into each vector according to a word embedding method and storing the vectors in a preset vector database to obtain a knowledge base;
the second construction module is used for constructing the substation operation and maintenance intelligent question-answering system based on the trained large model, the knowledge base, the pre-constructed database table and each preset answer mode.
In a fifth aspect, embodiments of the present application provide a computer readable storage medium, on which a computer program is stored, where the program when executed by a processor implements a method for intelligent questioning and answering of operation and maintenance of a substation based on a large model as described above, and/or a method for constructing an intelligent questioning and answering system of operation and maintenance of a substation based on a large model.
In a sixth aspect, an embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the above-mentioned intelligent query and answer method for operation and maintenance of a transformer substation based on a large model and/or the method for constructing the intelligent query and answer system for operation and maintenance of a transformer substation based on a large model when executing the computer program.
In a seventh aspect, embodiments of the present application provide a computer program product containing instructions that, when executed on a computer device, cause the computer device to perform a large model-based substation operation and maintenance intelligent question-answering method and/or a large model-based substation operation and maintenance intelligent question-answering system construction method as described above.
In an eighth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to run a computer program or instructions to implement the above-mentioned intelligent query-answering method for operation and maintenance of a transformer substation based on a large model, and/or the method for constructing the intelligent query-answering system for operation and maintenance of a transformer substation based on a large model.
The beneficial effects of the technical scheme provided by the embodiment of the application at least comprise:
according to the substation operation and maintenance intelligent question-answering method based on the large model, the to-be-answered questions are filled into the first preset template, the filled first preset template is sent to the trained large model to obtain the intention numbers, so that the corresponding preset answer modes are determined, and then the answers to the to-be-answered questions are obtained based on the preset answer modes and the trained large model. The method uses a trained large model with professional power knowledge and operation maintenance knowledge, can more accurately understand the problem intention of a user, provides a targeted answer according to understanding, and does not depend on the traditional keyword matching or fixed rules only, thereby remarkably improving the intelligent level of operation maintenance question and answer of the transformer substation; by constructing a knowledge base and combining the understanding capability of a large model, effective and accurate answers can be provided for users in time, the tedious process of manually searching documents is avoided, and the workload of operation and maintenance personnel is remarkably reduced; by converting the user problem into SQL query service system data, and integrating and outputting the result by using a large model, the operation and maintenance service proposal can be provided based on the real-time state of the equipment, and compared with the traditional way of developing and configuring SQL, the flexibility is improved, and SQL configuration or code writing is not required each time, so that the operation and maintenance service proposal based on real-time data is provided more quickly; by providing efficient, accurate and timely intelligent question-answering service, the working efficiency of transformer substation operation and maintenance personnel and the accuracy of problem solving can be remarkably improved, and further user experience and satisfaction are improved. Meanwhile, the flexibility and the expandability of the method also provide more personalized service experience for users.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
The technical scheme of the present application is described in further detail below through the accompanying drawings and examples.
Drawings
The accompanying drawings are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification; they illustrate the application and, together with the embodiments of the application, serve to explain the application without constituting a limitation to it. In the drawings:
fig. 1 is a schematic step diagram of a substation operation and maintenance intelligent question-answering method based on a large model provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of a process for tuning a model using the LoRA method according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of the intelligent question-answering using the trained large model according to the embodiment of the present application;
fig. 4 is a schematic step diagram of a method for constructing a substation operation and maintenance intelligent question-answering system based on a large model according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a substation operation and maintenance intelligent question-answering device based on a large model according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a device for constructing a substation operation and maintenance intelligent question-answering system based on a large model according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted as "when", "upon", "in response to a determination" or "in response to detection", depending on the context. Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted, depending on the context, to mean "upon determination", "in response to a determination", "upon detection of the [described condition or event]" or "in response to detection of the [described condition or event]".
In addition, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
It should be understood that the sequence numbers of the steps in the following embodiments do not mean the order of execution, and the execution order of the processes should be determined by the functions and the internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
In order to illustrate the technical solution of the present application, the following description is made by specific examples.
The inventor found that in the prior art, the traditional question-answering system mainly returns preset answers based on keyword matching, fixed rules, similarity retrieval and similar methods and cannot return comprehensive, effective answers based on an understanding of the question; for questions related to power policies and system usage, the corresponding documents must be consulted, so answers cannot be obtained in real time, and the process of manually retrieving documents is tedious and increases the workload of operation and maintenance personnel; for answering questions related to business data, SQL must be written and configured and then executed against a database to obtain the corresponding answer, and each updated requirement needs further development and configuration, which is inflexible and inefficient. On this basis, the inventor made further developments and completed the present application, providing a substation operation and maintenance intelligent question-answering method based on a large model, a related method and a related device.
Example 1
The embodiment of the application provides a substation operation and maintenance intelligent question-answering method based on a large model, which is shown by referring to fig. 1 and comprises the following steps:
s101: and filling the questions to be answered into a first preset template, and sending the filled first preset template to the trained large model to obtain the intention number.
S102: determining a corresponding preset answer mode according to the intention number; the preset answer mode comprises power knowledge, power policy and power equipment state; if the preset answer mode is the power knowledge, S103 is executed, if the preset answer mode is the power policy, S104 is executed, and if the preset answer mode is the power equipment state, S105 is executed.
S103: and if the preset answer mode is the power knowledge, filling the to-be-answered questions into a second preset prompt template to obtain special prompt questions.
S104: if the preset answer mode is the power policy, searching relevant fragments of the questions to be answered through a pre-constructed knowledge base; and filling the questions to be answered and the relevant fragments into a third preset template to obtain the special prompt questions.
S105: if the preset answer mode is the power equipment state, determining a corresponding database table according to the questions to be answered; filling the questions to be answered and the database table into a fourth preset template, and sending the filled fourth preset template to a trained large model to obtain SQL sentences corresponding to the questions to be answered; executing the SQL sentence to obtain a corresponding query result; and filling the questions to be answered and the query results into a fifth preset template to obtain the special prompt questions.
S106: and sending the special prompt questions to the trained large model to obtain corresponding answers.
In the embodiment of the application, after the user proposes the question to be answered, the question to be answered is sent to the trained large model through the first predefined template of the preset template, and the intention of the question to be answered is understood by the trained large model. Wherein the first preset template is defined by:
A few-shot prompt is constructed using the task, the instructions and return examples; the first preset prompt template needs to contain the question to be answered and can be defined as follows:
prompt = f"""Your task is to select the appropriate template based on the user's question. Three templates are now available:
1. Knowledge of the electric power field, operation and maintenance problems and other general knowledge;
2. Power policy, power operation specification and system usage questions;
3. Real-time state data questions such as transformer substation equipment states, alarms and the like;
The user's question is: {query}
Please return only the intention number, such as 1, 2 or 3, and do not return other irrelevant information."""
In step S101, the question to be answered is filled into the {query} position in the first preset template to obtain the filled first preset template. The filled first preset template is sent to the trained large model, which returns the corresponding intention number after understanding the intention of the question; for example, after understanding that the question to be answered is a power policy question, the large model returns intention number 2.
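By way of illustration, this intent-recognition step may be sketched as follows, assuming a Python environment and the chat() interface of the fine-tuned ChatGLM-6B model described below in S1016 (the function and variable names are illustrative assumptions, not the patent's original code):
def classify_intent(model, tokenizer, query):
    # Fill the user question into the first preset template (few-shot prompt)
    prompt = f"""Your task is to select the appropriate template based on the user's question. Three templates are now available:
1. Knowledge of the electric power field, operation and maintenance problems and other general knowledge;
2. Power policy, power operation specification and system usage questions;
3. Real-time state data questions such as transformer substation equipment states, alarms and the like;
The user's question is: {query}
Please return only the intention number, such as 1, 2 or 3, and do not return other irrelevant information."""
    # Send the filled template to the trained large model and read back the intention number
    intent_number, _history = model.chat(tokenizer, prompt)
    return intent_number.strip()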
In the embodiment of the application, the trained large model can be a trained ChatGLM-6B large model; the trained large model may be obtained by:
encoding the pre-obtained preprocessed power knowledge and operation and maintenance overhaul data to obtain an encoded file;
training the pre-trained large model through the coding file and a preset trainer based on a preset parameter fine tuning method to obtain the trained large model.
In this embodiment, for the ChatGLM-6B large model, the large model is fine-tuned so that it possesses power knowledge and power operation and maintenance knowledge. Specifically, the fine-tuning can be performed as follows:
S1011: Data collection.
Text data related to power knowledge and power operation and maintenance are collected from sources such as professional textbooks, national standards, power specifications, equipment manuals, operation and maintenance manuals, and experience cases. These data should include various power concepts, technical terms, operation procedures, maintenance methods, etc., ensuring the comprehensiveness and diversity of contents.
S1012: and (5) preprocessing data.
In the step S1012, the power knowledge and the operation and maintenance data after the preprocessing are obtained by the following method:
And carrying out data cleaning on the power knowledge and operation maintenance overhaul data which are obtained in advance, and obtaining the cleaned power knowledge and operation maintenance overhaul data.
The data cleaning process mainly comprises the steps of sorting collected documents, removing repeated, irrelevant, outdated and erroneous information, ensuring consistency and accuracy of texts, and obtaining high-quality training data. Where consistency refers to the term unification and data being non-contradictory, the term unification is that the same concept or term should be expressed consistently in different documents, e.g., if a concept has multiple designations in different documents, then the term unification should be a standard designation when data cleansing is performed; the data is not contradictory, i.e. information between different documents or within the same document should not be contradictory, and if inconsistent information exists, verification and correction are required to ensure the accuracy of the data. The data cleaning can be performed manually.
And according to the form of matching the questions and the answers, converting the cleaned electric power knowledge and operation and maintenance overhaul data into a JSON format required by a large model to obtain the preprocessed electric power knowledge and operation and maintenance overhaul data. Examples of large model fine-tuning data formats are as follows:
The power knowledge is exemplified as follows:
{ "context": "What is the power system?\nAnswer:", "target": "The power system is a network consisting of power plants, transmission lines, transformer substations and distribution lines, and is used for generating, transmitting and distributing electric energy." }
{ "context": "What is high voltage electricity?\nAnswer:", "target": "High voltage electricity refers to electric energy with a voltage exceeding a certain value (usually 1000 V), and is used for long-distance power transmission and for supplying large-scale industrial equipment." }
The overhaul knowledge is exemplified as follows:
{ "context": "What are the possible reasons why a switch cannot close?\nAnswer:", "target": "The inability of a switch to close may be due to power failure, mechanical connection problems, control circuit failure, improper switch position, etc. These possibilities need to be investigated carefully during maintenance and resolved step by step to ensure normal operation of the switch." }
{ "context": "What are the possible causes of failure of the relay protection system in the power system?\nAnswer:", "target": "Relay protection system failure may be caused by various reasons such as equipment aging, power failure, program errors and communication failure. Maintenance personnel should locate and resolve faults of the relay protection system by means of equipment replacement, power supply inspection, software adjustment, communication testing and the like." }
S1013: and (5) fine-tuning strategy selection of a large model.
In the embodiment of the present application, a transfer learning method is used for fine-tuning the large model, which means that a large model obtained by pre-training on a large amount of general text can be used as the starting point of the fine-tuning task in the electric power domain. This transfer strategy can accelerate the training convergence of the model, improve the performance of the model and overcome the problem of data sparsity. Since the large model has already learned rich language knowledge from general text, good performance can be achieved with less power-domain data.
In the fine-tuning process, it is also important to select an appropriate fine-tuning method. There are two technical routes for fine-tuning large models:
The first is Full Fine Tuning, i.e., full training of all parameters. Full-parameter fine-tuning, while allowing the model parameters to be fully adjusted to suit the target task, is computationally and memory expensive.
The second is Parameter-Efficient Fine Tuning, which fine-tunes a small number of existing or additional model parameters while fixing most of the pre-trained language model (Pre-trained Language Model, PLM) parameters, thereby greatly reducing computation and storage costs while achieving performance comparable to full-parameter fine-tuning. Common parameter-efficient fine-tuning techniques include BitFit, Prefix Tuning, Prompt Tuning, P-Tuning, Adapter Tuning, LoRA, etc.
In the embodiment of the present application, the preset parameter fine-tuning method may be the LoRA fine-tuning method. LoRA is a fine-tuning method proposed in 2021, which is based on the assumption that the learned over-parameterized model actually resides on a low intrinsic dimension. The low-rank adaptation (LoRA) method assumes that the weight change during model adaptation also has a low "intrinsic rank". LoRA indirectly trains some dense layers of a neural network by optimizing rank-decomposition matrices of the dense layers' changes during adaptation, while keeping the pre-trained weights unchanged.
The procedure using the LoRA tuning model is illustrated in FIG. 2, and is described in detail below:
A bypass is added beside the original PLM, performing a dimension-reduction followed by a dimension-raising operation to simulate the so-called intrinsic rank.
The parameters of the PLM are fixed during training, and only the dimension-reduction matrix A and the dimension-raising matrix B are trained in the re-parameterization shown in Fig. 2. The input and output dimensions of the model are unchanged; at output, the parameters of BA are superimposed on the parameters of the PLM, and the superimposed model can answer questions involving domain-specific knowledge. The parameters of the PLM are obtained by pre-training with unsupervised learning on a large-scale corpus and capture the general structure and knowledge of language; "BA" refers to the parameter matrix obtained after fine-tuning training, which contains the information learned for a particular task or data set.
A is initialized with a random Gaussian distribution and B is initialized as a zero matrix, ensuring that the bypass matrix BA remains a zero matrix at the beginning of training.
Assuming that a pre-trained language model (for example, GPT-3) is to be fine-tuned on a downstream task (a specific application task performed by fine-tuning the model on the basis of pre-training, such as answering power-domain questions), the pre-trained model parameters need to be updated, which can be expressed as Formula 1 below:
W0 + ΔW    (Formula 1)
where W0 is the parameter of the initialized pre-trained model and ΔW is the parameter to be updated. With full-parameter fine-tuning, the number of parameters to be updated equals the number of parameters of W0 (for GPT-3, W0 contains about 175B parameters). It can be seen that full-parameter fine-tuning of a large language model is extremely costly.
For LoRA, only ΔW needs to be fine-tuned. Specifically, let the pre-trained weight matrix be W0 ∈ R^(d×k); the update of the pre-trained model parameters can then be expressed as Formula 2:
W0 + ΔW = W0 + BA,  B ∈ R^(d×r),  A ∈ R^(r×k)    (Formula 2)
where W0 is the parameter of the initialized pre-trained model; ΔW is the parameter to be updated; A is the dimension-reduction matrix; B is the dimension-raising matrix; BA is the bypass matrix; r is the rank, with r <= min(d, k); d is the number of rows of matrix B; and k is the number of columns of matrix A.
During LoRA training, W0 is fixed and only A and B are trainable parameters. In the forward pass, W0 and ΔW are multiplied by the same input x and the results are added to obtain the model output, expressed as Formula 3:
h = W0·x + ΔW·x = W0·x + BA·x    (Formula 3)
where h is the output of the model, combining the pre-trained model parameters and the fine-tuned parameters; W0 is the parameter of the initialized pre-trained model; ΔW is the parameter to be updated; x is the input; and BA is the bypass matrix.
Through the above process, the LoRA method can achieve fine-tuning of the pre-trained large model, and the fine-tuned large model parameters are obtained.
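The following PyTorch sketch illustrates the LoRA computation described above (a hypothetical wrapper for illustration, not the patent's code): the pre-trained weight W0 is frozen, A is initialized with a random Gaussian distribution, B with zeros, and the forward pass computes h = W0·x + BA·x.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base_linear: nn.Linear, r: int = 8):
        super().__init__()
        self.base = base_linear                           # holds the frozen pre-trained weight W0
        for p in self.base.parameters():
            p.requires_grad = False                       # W0 is fixed during training
        d, k = base_linear.out_features, base_linear.in_features
        self.A = nn.Parameter(torch.randn(r, k) * 0.01)   # dimension-reduction matrix, Gaussian init
        self.B = nn.Parameter(torch.zeros(d, r))          # dimension-raising matrix, zero init (BA = 0 at start)

    def forward(self, x):
        # h = W0*x + ΔW*x = W0*x + B(A*x)
        return self.base(x) + (x @ self.A.T) @ self.B.T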
S1014: and (5) data encoding.
The preprocessed power knowledge and operation and maintenance data are obtained through step S1012, and these data are encoded to obtain an encoded file. The encoding can be performed as follows:
Each piece of the preprocessed power knowledge and operation and maintenance data in JSON format, for example { "context": "What is the power system?\nAnswer:", "target": "The power system is a network formed by power plants, transmission lines, transformer substations and distribution lines, and is used for generating, transmitting and distributing electric energy." }, is read field by field; the context and the target are each encoded with the tokenizer of the transformers library to obtain a list of numerical representations. The two lists are then concatenated to obtain the encoding result of each piece of data, and the results are stored in the encoded file. The key code for this process is as follows:
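(The sketch below is an illustrative assumption of what such encoding code might look like, using the ChatGLM-6B tokenizer loaded through transformers; the model identifier, file name and variable names are assumptions, not the patent's original code.)
import json
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

def encode_file(jsonl_path="power_qa.jsonl", max_seq_length=512):
    for line in open(jsonl_path, encoding="utf-8"):
        example = json.loads(line)
        # Encode the context (question) and the target (answer) separately
        sample_ids = tokenizer.encode(example["context"], max_length=max_seq_length, truncation=True)
        target_ids = tokenizer.encode(example["target"], max_length=max_seq_length, truncation=True)
        # Concatenate the two lists of token ids to obtain the encoding result
        # (end-of-sequence tokens may need to be appended depending on the tokenizer)
        input_ids = sample_ids + target_ids
        yield {"input_ids": input_ids, "seq_len": len(sample_ids)}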
Through the above process, each piece of data is organized in the format { "input_ids": input_ids, "seq_len": len(sample_ids) }.
S1015: Fine-tuning the model. The specific fine-tuning process is as follows:
Initialization: the pre-trained ChatGLM-6B model is loaded and its specific parameters and configurations are set, such as enabling gradient checkpointing and allowing input gradients to be required.
Data preparation: the data set associated with a particular task is loaded and a function is defined for processing the data to ensure that the data can be properly processed by the model.
Parameter-efficient fine-tuning setting: a LoRA configuration object is defined, specifying parameters such as the task type, whether the model is in inference mode, and the rank of LoRA; this configuration is used to add the LoRA module to the model, realizing parameter-efficient fine-tuning.
Training preparation: the trainer object is initialized with the model, the data set, the training parameters and so on, and a TensorBoard callback is added for visualizing the training process.
Training is started: the training method of the trainer is invoked to begin training the model, during which the model will continually update its parameters via back propagation to minimize the loss function at a particular task.
Saving the model: after training, the fine-tuned model is saved for subsequent use.
The key codes of the fine tuning process are as follows:
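(The following is an illustrative sketch of such a fine-tuning flow, assuming the HuggingFace transformers and peft libraries; the class names, arguments and hyper-parameter values are assumptions rather than the patent's original code, and dataset and data_collator are assumed to have been prepared from the encoded file of step S1014.)
from transformers import AutoModel, TrainingArguments, Trainer
from transformers.integrations import TensorBoardCallback
from peft import LoraConfig, get_peft_model, TaskType

# Initialization: load the pre-trained ChatGLM-6B model and enable gradient checkpointing
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
model.gradient_checkpointing_enable()
model.enable_input_require_grads()

# Parameter-efficient fine-tuning setting: LoRA configuration and module injection
lora_config = LoraConfig(task_type=TaskType.CAUSAL_LM, inference_mode=False,
                         r=8, lora_alpha=32, lora_dropout=0.1)
model = get_peft_model(model, lora_config)

# Training preparation: trainer with the encoded dataset and a TensorBoard callback
training_args = TrainingArguments(output_dir="output", per_device_train_batch_size=1,
                                  num_train_epochs=3, learning_rate=1e-4, logging_steps=10)
trainer = Trainer(model=model, args=training_args, train_dataset=dataset,
                  data_collator=data_collator, callbacks=[TensorBoardCallback()])

# Start training, then save the fine-tuned (LoRA) weights
trainer.train()
model.save_pretrained("output/chatglm-6b-lora")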
s1016: the model is used.
In order to load the pre-trained ChatGLM-6B model, make its use of video memory more efficient and increase its operation speed, the half() method is used to convert the model parameters into half-precision floating-point numbers, and cuda() ensures that the model runs on the GPU. The previously fine-tuned parameters that match the model are loaded through from_pretrained(model, peft_model_id), and the float() method guarantees the use of full-precision floating-point numbers. The tokenizer plays a key role in processing user input and converting it into a format the model can understand, so the tokenizer pre-trained for "ChatGLM-6B" and matched to the model is selected; when the user provides a question (i.e., a query), the model processes it with the tokenizer and generates the corresponding response through the chat(tokenizer, query) method. The key code for this process is as follows:
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

model = AutoModel.from_pretrained(model_name, trust_remote_code=True, device_map="auto").half().cuda()
model = PeftModel.from_pretrained(model, peft_model_id).float()  # load the fine-tuned LoRA parameters
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
response, history = model.chat(tokenizer, query)
in the embodiment of the application, fine tuning of the ChatGLM-6B large model is realized through the steps S1011 to S1016, so that a trained large model is obtained, and the trained large model has power knowledge, including power operation and maintenance knowledge.
In the above step S102, the preset answer mode corresponding to the question to be answered is determined according to the intention number returned by the trained large model. The preset answer modes include power knowledge (power domain knowledge, operation and maintenance questions and other general knowledge), power policy (power policy knowledge, power operation specifications and system usage questions) and power equipment state (real-time state data questions such as substation equipment states and alarms). Because the association between the numbers and the preset answer modes is established in advance, the preset answer mode corresponding to each different intention can be invoked through the returned intention number. For example, if the returned intention number is 2, the intention corresponding to the question to be answered is "2. Power policy, power operation specification and system usage questions", and the preset answer mode corresponding to this intention is the power policy answer mode.
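The association between intention numbers and preset answer modes can be implemented, for example, as a simple mapping (a hypothetical sketch; the names are illustrative):
ANSWER_MODES = {
    "1": "power_knowledge",   # power domain and operation and maintenance knowledge
    "2": "power_policy",      # power policy, operation specifications, system usage
    "3": "equipment_state",   # real-time substation equipment state and alarms
}

def select_answer_mode(intent_number: str) -> str:
    # The large model is instructed to return only the number, e.g. "2"
    return ANSWER_MODES[intent_number.strip()]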
In this embodiment of the present application, as shown in Fig. 3, the question-answering procedure performed with the trained large model differs according to the intention number returned by the model, that is, according to the preset answer mode. The answer modes for power knowledge, power policy and power equipment state are described in turn below:
when the preset answer mode is the power knowledge, the above step S103 is executed, and the specific execution process is as follows:
when the user asks for power-related knowledge, that is, the preset answer mode is power knowledge, the to-be-answered questions are filled into a second preset template, wherein the second preset template can be defined as follows:
prompt = f"""Please combine your power knowledge and operation and maintenance experience to answer the following question: {query}"""
The question to be answered is passed into the {query} position of the second preset template to obtain the special prompt question (special Prompt).
When the preset answer mode is the power policy, the above step S104 is executed, and the specific execution process is as follows:
When the user asks about power policy or system usage questions, that is, the preset answer mode is the power policy, the relevant document fragment doc of the question to be answered is retrieved through the pre-constructed knowledge base. The knowledge base is constructed as follows:
First, documents related to power policy, operation specifications and system usage are collected, and the pre-collected power-related documents are then preprocessed. The preprocessing includes text cleaning (removing useless marks, formats and noise) and removal of stop words (words that are commonly used but carry no practical meaning), so as to obtain high-quality documents on power policy, operation specifications and system usage. For more efficient processing and retrieval, long texts need to be segmented into smaller fragments, such as sentences or paragraphs, which helps to capture the semantic information of the text more accurately in subsequent steps. Therefore, the preprocessed power-related documents are divided to obtain a plurality of document fragments.
Then, each document fragment is converted into a vector according to the word embedding method. Word embedding is a technique in which words or phrases are converted into vectors in a high-dimensional space, where similar words are also located close to each other in that space. Word embedding methods such as HuggingFace embeddings and sentence-transformer embeddings (SentenceTransformerEmbeddings) help capture deep semantic relationships of the text.
And finally, storing each vector in a preset vector database to obtain a knowledge base. The word vectors are stored here for subsequent quick retrieval and comparison; the preset vector database may be a Chroma database, which is a specialized vector database, providing efficient storage and retrieval functions, and using Chroma database storage, the stored vector may be quickly compared with new inputs in future queries to find the text segment most relevant to the input. The key codes of this process are as follows:
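(The following is an illustrative sketch of such knowledge-base construction and retrieval, assuming the LangChain text splitter, HuggingFace embeddings and Chroma vector database; the embedding model name and parameter values are assumptions, not the patent's original code.)
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

def build_knowledge_base(cleaned_documents, persist_dir="power_policy_kb"):
    # Segment the long preprocessed documents into smaller fragments
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    fragments = splitter.create_documents(cleaned_documents)
    # Convert each fragment into a vector with a word-embedding model and store it in Chroma
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
    return Chroma.from_documents(fragments, embeddings, persist_directory=persist_dir)

# Retrieval at question time: find the fragment most relevant to the user question
kb = build_knowledge_base(["...power policy and operation specification text..."])
doc = kb.similarity_search("user question about a power policy", k=1)[0].page_content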
After the relevant fragments of the question to be answered are obtained through knowledge base retrieval, the question to be answered and the relevant fragments are filled into a third preset template to obtain the special prompt question. The third preset template may be defined as follows:
prompt = f"""Please answer the user question in conjunction with the following document {doc}: {query}"""
The relevant fragment doc retrieved from the knowledge base and the question to be answered query are filled into the third preset template to obtain the special prompt question.
When the preset answer mode is the power equipment state, the above step S105 is executed, and the specific execution process is as follows:
when a user asks for real-time state data questions such as the state of substation equipment or an alarm, namely, a preset answering mode is the state of the power equipment, a corresponding database table is determined according to the questions to be answered. The database table at least comprises an equipment table and an alarm table; the database table may be constructed by:
The table structures that need to support queries, such as the equipment table and the alarm table, are predefined; then the names of all devices within the substation operation and maintenance scope are obtained to construct the equipment table, and the ID and alarm name of each device within the substation operation and maintenance scope are obtained to construct the alarm table.
The key codes of the process of constructing the database table are as follows:
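(As an illustration, the equipment table and alarm table could be created, for example, with sqlite3 as sketched below; the table and column names are assumptions, not the patent's original schema.)
import sqlite3

def build_database(db_path="substation.db"):
    conn = sqlite3.connect(db_path)
    cur = conn.cursor()
    # Equipment table: names of all devices within the substation operation and maintenance scope
    cur.execute("CREATE TABLE IF NOT EXISTS equipment (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    # Alarm table: device ID and alarm name for each device within the scope
    cur.execute("""CREATE TABLE IF NOT EXISTS alarm (
                       id INTEGER PRIMARY KEY,
                       equipment_id INTEGER REFERENCES equipment(id),
                       alarm_name TEXT NOT NULL)""")
    conn.commit()
    return conn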
After the database table corresponding to the question to be answered is determined, the question to be answered and the database table are filled into a fourth preset template. When the table structure schema is passed into the template, the large model is required not to generate insert, update or delete statements, so as to avoid damaging the business system data. Therefore, the fourth preset template can be defined as follows:
prompt = (f"""Generate an SQL statement to answer the user's question according to the following database table structure {schema}, without generating add, update or delete statements.
The user's question is: {query}
Please return only the generated query SQL statement and do not return other irrelevant information.
""")
The filled fourth preset template is sent to the trained large model; after receiving it, the large model uses its internal knowledge and reasoning capability, combined with the provided table structure information and query intention, to generate an executable SQL statement. After the large model returns the executable SQL statement, the corresponding query result can be obtained by executing the SQL statement.
And filling the questions to be answered and the query results into a fifth preset template to obtain the special prompt questions. Wherein, the fifth preset template may be defined as follows:
prompt = (f"""The user question is {query}, and the result of the database query for this question is {result}; please answer the user's question according to the database query result.""")
The query result retrieved from the database table and the question to be answered query are filled into the fifth preset template to obtain the special prompt question.
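Putting the power-equipment-state branch together, a hypothetical end-to-end sketch (the templates follow the definitions above; the function and variable names and the sqlite3 connection are assumptions for illustration) may look as follows:
def answer_equipment_state_question(model, tokenizer, conn, query, schema):
    # Fourth preset template: ask the model to generate a read-only SQL statement
    sql_prompt = (f"""Generate an SQL statement to answer the user's question according to the following database table structure {schema}, without generating add, update or delete statements.
The user's question is: {query}
Please return only the generated query SQL statement and do not return other irrelevant information.
""")
    sql, _ = model.chat(tokenizer, sql_prompt)
    # Execute the generated SQL statement against the business database
    result = conn.execute(sql.strip().rstrip(";")).fetchall()
    # Fifth preset template: let the model integrate the query result into the final answer
    answer_prompt = (f"""The user question is {query}, and the result of the database query for this question is {result}; please answer the user's question according to the database query result.""")
    answer, _ = model.chat(tokenizer, answer_prompt)
    return answer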
In the step S106, the corresponding special prompt questions obtained in the steps S103 to S105 under different preset answer modes are sent to the trained large model to obtain corresponding answers.
In the embodiment of the application, the large model learns knowledge and language structures, grammar rules and semantic relations in training data and has the understanding and generating capacity of natural language, so that when a problem is presented to the large model, the model analyzes the grammar and the semantics of the problem and tries to understand the intention of a user. The large model then uses the knowledge and language rules it trained to generate an answer and returns it to the user in natural language. At the same time, the model also considers the context and context of the questions to ensure answer consistency and rationality.
The trained large model acts as follows:
understanding the problem intent: the large model is first used to understand the intent of the problem posed by the user, which involves understanding and analyzing the natural language to determine if the user is asking for power domain expertise, knowledge base problems, or system real-time data problems;
knowledge integration and reasoning: the large model integrates knowledge of the power industry and the power operation and maintenance and repair knowledge in the fine tuning process, and related document information retrieved from a knowledge base is inferred to provide accurate answers;
generating a professional answer: based on the understanding of the user questions and the integration of the relevant knowledge, the large model can generate answers to the power domain professional questions;
converting user questions into SQL queries: for real-time data problems of the system, the large model can convert the user problems into SQL query sentences so as to retrieve real-time data from the service system;
integrating real-time data and providing advice: after the real-time data are acquired, the large model can be combined with the data to provide operation and maintenance suggestions, so that the flexibility and the real-time response capability of the system are improved.
The inputs of the trained large model are specifically as follows:
the user asks: questions posed by users in natural language form, which may relate to power domain expertise, knowledge base queries, or system real-time data queries;
Knowledge base documents and system real-time data: while not directly input to the large model, the large model indirectly uses this information in answering questions, knowledge base documents are used to provide background knowledge and answer support, and system real-time data is used to generate operational maintenance recommendations based on equipment status.
The output of the trained large model is specifically as follows:
the output of the large model is a structured answer to the user question. These answers may include an interpretation of the power domain expertise, an excerpt of relevant information in knowledge base documents, and query results of system real time data, as well as operational maintenance recommendations based on these data. The output is presented in natural language form, aimed at directly and effectively meeting the information needs of the user.
In the embodiment of the application, the special prompt questions corresponding to the power knowledge are transmitted to the trained large model, and the large model can be guided to answer the related questions of the power knowledge to obtain corresponding answers; transmitting the special prompt questions corresponding to the power policy to the trained large model, and guiding the large model to summarize the document to obtain corresponding answers; and transmitting the special prompt questions corresponding to the power equipment states to a trained large model, wherein the large model can be integrated and output according to the system real-time data and combining the power knowledge and the operation and maintenance knowledge to obtain corresponding answers.
In this embodiment of the present application, a flow of intelligent question answering using a trained large model is shown in fig. 3, when a user proposes a question, the question is first filled into a first preset prompt and then transmitted into a trained ChatGLM-6B large model, the intention of the question is understood through the large model, and the intention number is returned. When the large model judges that a user asks a knowledge question in the electric power field or operation and maintenance, the user question is transmitted into a template (a second preset template) corresponding to the question, and the large model is guided to answer; when judging that the user asks the power policy and the current system uses the problem, recalling the corresponding document through a pre-constructed knowledge base, transmitting the document and the user problem into a special template (a third preset template) for the problem, and generating an answer by the large model; when judging that the user asks the real-time state data of the transformer substation, converting the text into SQL through a special template (a fourth preset template), executing the SQL to obtain system data, transmitting the result into the large model through the fifth preset template, and integrating and outputting the returned result through the large model.
According to the substation operation and maintenance intelligent question-answering method based on the large model, the to-be-answered questions are filled into the first preset template, the filled first preset template is sent to the trained large model to obtain the intention numbers, so that the corresponding preset answer modes are determined, and then the answers to the to-be-answered questions are obtained based on the preset answer modes and the trained large model. The method uses a trained large model with professional power knowledge and operation maintenance knowledge, can more accurately understand the problem intention of a user, provides a targeted answer according to understanding, and does not depend on the traditional keyword matching or fixed rules only, thereby remarkably improving the intelligent level of operation maintenance question and answer of the transformer substation; by constructing a knowledge base and combining the understanding capability of a large model, effective and accurate answers can be provided for users in time, the tedious process of manually searching documents is avoided, and the workload of operation and maintenance personnel is remarkably reduced; by converting the user problem into SQL query service system data, and integrating and outputting the result by using a large model, the operation and maintenance service proposal can be provided based on the real-time state of the equipment, and compared with the traditional way of developing and configuring SQL, the flexibility is improved, and SQL configuration or code writing is not required each time, so that the operation and maintenance service proposal based on real-time data is provided more quickly; by providing efficient, accurate and timely intelligent question-answering service, the working efficiency of transformer substation operation and maintenance personnel and the accuracy of problem solving can be remarkably improved, and further user experience and satisfaction are improved. Meanwhile, the flexibility and the expandability of the method also provide more personalized service experience for users.
Example two
Based on the same inventive concept, the embodiment of the application also provides a method for constructing a substation operation and maintenance intelligent question-answering system based on a large model, and referring to fig. 4, the method comprises the following steps:
s201: preprocessing the pre-collected power related documents, and dividing the preprocessed power related documents to obtain fragments.
S202: and converting each segment into each vector according to a word embedding method, and storing the vectors in a preset vector database to obtain a knowledge base.
S203: and constructing a substation operation and maintenance intelligent question-answering system based on the trained large model, the knowledge base, the pre-constructed database table and each preset answer mode.
The specific implementation process of steps S201 to S202 may refer to the process of constructing the knowledge base in the first embodiment; in step S203, the specific training process of the large model may refer to the fine-tuning process of the ChatGLM-6B large model implemented in S1011 to S1016 in the first embodiment. Repeated content is not described again here.
In this embodiment, based on the trained large model, the knowledge base constructed in steps S201 to S202, the pre-constructed database table and each preset answer mode, the substation operation and maintenance intelligent question-answering system can be constructed. When the substation operation and maintenance intelligent question-answering system is used, the process of steps S101 to S106 in the first embodiment can be executed; repeated content is not described again here.
Example three
Based on the same inventive concept, the embodiment of the application also provides a substation operation and maintenance intelligent question-answering device based on a large model, and referring to fig. 5, the device comprises:
The number determining module 101 is configured to fill a question to be answered into a first preset template, and send the filled first preset template to a trained large model to obtain an intent number;

the mode determining module 102 is configured to determine a corresponding preset answer mode according to the intent number, the preset answer modes comprising power knowledge, power policy and power equipment state;

the first determining module 103 is configured to fill the question to be answered into a second preset prompt template if the preset answer mode is power knowledge, so as to obtain a special prompt question;

the second determining module 104 is configured to retrieve, if the preset answer mode is power policy, relevant fragments of the question to be answered through a pre-constructed knowledge base, and fill the question to be answered and the relevant fragments into a third preset template to obtain a special prompt question;

the third determining module 105 is configured to determine, if the preset answer mode is power equipment state, a corresponding database table according to the question to be answered; fill the question to be answered and the database table into a fourth preset template, and send the filled fourth preset template to the trained large model to obtain an SQL statement corresponding to the question to be answered; execute the SQL statement to obtain a corresponding query result; and fill the question to be answered and the query result into a fifth preset template to obtain a special prompt question;

and the answer determining module 106 is configured to send the special prompt question to the trained large model to obtain a corresponding answer.
The number determining module 101 may perform the process of step S101 in the first embodiment; the mode determining module 102 may perform the process of step S102 in the first embodiment; the first determining module 103 may perform the process of step S103 in the first embodiment; the second determining module 104 may perform the process of step S104 in the first embodiment; the third determining module 105 may perform the process of step S105 in the first embodiment; and the answer determining module 106 may perform the process of step S106 in the first embodiment. Where these steps coincide with the first embodiment, they are not described again here.
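To make the division of labour among modules 101 to 106 concrete, the following is a minimal sketch of the dispatch logic. The intent numbering, the prompt wording and the `llm_generate`, `retrieve_segments` and `run_sql` helpers are hypothetical placeholders, not the preset templates of the embodiments.

```python
INTENT_TO_MODE = {0: "power_knowledge", 1: "power_policy", 2: "equipment_state"}  # numbering assumed

def answer(question: str, llm_generate, retrieve_segments, run_sql, table_schema: str) -> str:
    # Module 101: fill the first preset template and let the large model classify the intent.
    intent = int(llm_generate(f"Classify the intent of this question and return only a number: {question}"))
    mode = INTENT_TO_MODE[intent]  # module 102: map the intent number to a preset answer mode

    if mode == "power_knowledge":          # module 103: second preset prompt template
        prompt = f"As a power domain expert, answer: {question}"
    elif mode == "power_policy":           # module 104: knowledge-base retrieval plus third preset template
        fragments = retrieve_segments(question)
        prompt = f"Answer the question using these policy excerpts:\n{fragments}\nQuestion: {question}"
    else:                                  # module 105: fourth and fifth preset templates with SQL execution
        sql = llm_generate(f"Given the table {table_schema}, write SQL for: {question}")
        result = run_sql(sql)
        prompt = f"Question: {question}\nQuery result: {result}\nAnswer with maintenance suggestions."

    return llm_generate(prompt)            # module 106: send the special prompt question to the model
```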
Example four
Based on the same inventive concept, the embodiment of the application also provides a device for constructing a substation operation and maintenance intelligent question-answering system based on a large model, and referring to fig. 6, the device comprises:
The segmentation module 201 is configured to preprocess the pre-collected power-related documents, and segment the preprocessed power-related documents to obtain segments;

the first construction module 202 is configured to convert each segment into a vector according to a word embedding method, and store the vectors in a preset vector database to obtain a knowledge base;

and the second construction module 203 is configured to construct the substation operation and maintenance intelligent question-answering system based on the trained large model, the knowledge base, the pre-constructed database tables and the preset answer modes.
The segmentation module 201 may perform the process of step S201 in the second embodiment; the first construction module 202 may perform the process of step S202 in the second embodiment; and the second construction module 203 may perform the process of step S203 in the second embodiment. Where these steps coincide with the second embodiment, they are not described again here.
Example five
Based on the same inventive concept, the embodiments of the present application further provide a computer-readable storage medium having a computer program stored thereon which, when executed by a processor, implements the large model-based substation operation and maintenance intelligent question-answering method described in the first embodiment, and/or the large model-based substation operation and maintenance intelligent question-answering system construction method described in the second embodiment.
Example six
Based on the same inventive concept, the embodiments of the present application further provide a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the large model-based substation operation and maintenance intelligent question-answering method described in the first embodiment and/or the large model-based substation operation and maintenance intelligent question-answering system construction method described in the second embodiment.
Example seven
Based on the same inventive concept, the embodiments of the present application further provide a computer program product containing instructions which, when run on a computer device, cause the computer device to perform the large model-based substation operation and maintenance intelligent question-answering method described in the first embodiment, and/or the large model-based substation operation and maintenance intelligent question-answering system construction method described in the second embodiment.
Example eight
Based on the same inventive concept, the embodiments of the present application further provide a chip comprising a processor and a communication interface, the communication interface being coupled to the processor, wherein the processor is configured to execute a computer program or instructions to implement the large model-based substation operation and maintenance intelligent question-answering method described in the first embodiment, and/or the large model-based substation operation and maintenance intelligent question-answering system construction method described in the second embodiment.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (10)

1. A substation operation and maintenance intelligent question-answering method based on a large model, characterized by comprising the following steps:
filling a question to be answered into a first preset template, and sending the filled first preset template to a trained large model to obtain an intent number;

determining a corresponding preset answer mode according to the intent number, wherein the preset answer modes comprise power knowledge, power policy and power equipment state;

if the preset answer mode is power knowledge, filling the question to be answered into a second preset template to obtain a special prompt question;

if the preset answer mode is power policy, retrieving relevant fragments of the question to be answered through a pre-constructed knowledge base, and filling the question to be answered and the relevant fragments into a third preset template to obtain a special prompt question;

if the preset answer mode is power equipment state, determining a corresponding database table according to the question to be answered; filling the question to be answered and the database table into a fourth preset template, and sending the filled fourth preset template to the trained large model to obtain an SQL statement corresponding to the question to be answered; executing the SQL statement to obtain a corresponding query result; and filling the question to be answered and the query result into a fifth preset template to obtain a special prompt question;

and sending the special prompt question to the trained large model to obtain a corresponding answer.
2. The method of claim 1, wherein the knowledge base is constructed by:
preprocessing a pre-collected power-related document, and segmenting the preprocessed power-related document to obtain a plurality of document fragments;
converting each document fragment into a vector according to a word embedding method;
and storing each vector in a preset vector database to obtain the knowledge base.
3. The method of claim 1, wherein the database tables include a device table and an alarm table, and the database tables are constructed by:
acquiring the names of all equipment in the operation and maintenance range of the transformer substation, and constructing the equipment table;
and acquiring the ID and the alarm name of each device in the operation and maintenance range of the transformer substation, and constructing the alarm table.
4. The method of claim 1, wherein the trained large model is a trained ChatGLM-6B large model; the trained large model is obtained by:
encoding the pre-obtained preprocessed power knowledge and operation and maintenance overhaul data to obtain an encoded file;
training the pre-trained large model through the encoded file and a preset trainer based on a preset parameter fine-tuning method to obtain the trained large model.
5. The method of claim 4, wherein the preprocessed power knowledge and operation and maintenance data are obtained by:

performing data cleaning on the pre-obtained power knowledge and operation and maintenance data to obtain cleaned power knowledge and operation and maintenance data;

and converting the cleaned power knowledge and operation and maintenance data into a JSON format in the form of question-answer pairs, to obtain the preprocessed power knowledge and operation and maintenance data.
6. A method for constructing a substation operation and maintenance intelligent question-answering system based on a large model, characterized by comprising the following steps:
preprocessing a pre-collected power-related document, and segmenting the preprocessed power-related document to obtain segments;
converting each segment into a vector according to a word embedding method, and storing the vectors in a preset vector database to obtain a knowledge base;
and constructing a substation operation and maintenance intelligent question-answering system based on the trained large model, the knowledge base, the pre-constructed database table and each preset answer mode.
7. A substation operation and maintenance intelligent question-answering device based on a large model, characterized by comprising:
a number determining module, configured to fill a question to be answered into a first preset template, and send the filled first preset template to a trained large model to obtain an intent number;

a mode determining module, configured to determine a corresponding preset answer mode according to the intent number, the preset answer modes comprising power knowledge, power policy and power equipment state;

a first determining module, configured to fill the question to be answered into a second preset prompt template if the preset answer mode is power knowledge, so as to obtain a special prompt question;

a second determining module, configured to retrieve relevant fragments of the question to be answered through a pre-constructed knowledge base if the preset answer mode is power policy, and fill the question to be answered and the relevant fragments into a third preset template to obtain a special prompt question;

a third determining module, configured to determine a corresponding database table according to the question to be answered if the preset answer mode is power equipment state; fill the question to be answered and the database table into a fourth preset template, and send the filled fourth preset template to the trained large model to obtain an SQL statement corresponding to the question to be answered; execute the SQL statement to obtain a corresponding query result; and fill the question to be answered and the query result into a fifth preset template to obtain a special prompt question;

and an answer determining module, configured to send the special prompt question to the trained large model to obtain a corresponding answer.
8. A device for constructing a substation operation and maintenance intelligent question-answering system based on a large model, characterized by comprising:
a segmentation module, configured to preprocess a pre-collected power-related document and segment the preprocessed power-related document to obtain segments;

a first construction module, configured to convert each segment into a vector according to a word embedding method and store the vectors in a preset vector database to obtain a knowledge base;

and a second construction module, configured to construct the substation operation and maintenance intelligent question-answering system based on the trained large model, the knowledge base, a pre-constructed database table and preset answer modes.
9. A computer-readable storage medium having stored therein a computer program which, when executed by a processor, causes the processor to perform the large model-based substation operation and maintenance intelligent question-answering method according to any one of claims 1 to 5, and/or the large model-based substation operation and maintenance intelligent question-answering system construction method according to claim 6.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the large model-based substation operation and maintenance intelligent question-answering method according to any one of claims 1 to 5, and/or the large model-based substation operation and maintenance intelligent question-answering system construction method according to claim 6.
CN202410076033.8A 2024-01-18 2024-01-18 Substation operation and maintenance intelligent question-answering method based on large model, related method and device Pending CN117891831A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410076033.8A CN117891831A (en) 2024-01-18 2024-01-18 Substation operation and maintenance intelligent question-answering method based on large model, related method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410076033.8A CN117891831A (en) 2024-01-18 2024-01-18 Substation operation and maintenance intelligent question-answering method based on large model, related method and device

Publications (1)

Publication Number Publication Date
CN117891831A 2024-04-16

Family

ID=90640679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410076033.8A Pending CN117891831A (en) 2024-01-18 2024-01-18 Substation operation and maintenance intelligent question-answering method based on large model, related method and device

Country Status (1)

Country Link
CN (1) CN117891831A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination