CN117764054B - Natural language understanding method and system based on automatic construction prompt engineering - Google Patents

Natural language understanding method and system based on automatic construction prompt engineering

Info

Publication number
CN117764054B
CN117764054B
Authority
CN
China
Prior art keywords
prompt
natural language
model
engineering
language understanding
Prior art date
Legal status
Active
Application number
CN202410170010.3A
Other languages
Chinese (zh)
Other versions
CN117764054A (en)
Inventor
叶展宏
韩咏
钟雨彤
林锐蓝
齐浩亮
孔蕾蕾
Current Assignee
Foshan University
Original Assignee
Foshan University
Priority date
Filing date
Publication date
Application filed by Foshan University filed Critical Foshan University
Priority to CN202410170010.3A priority Critical patent/CN117764054B/en
Publication of CN117764054A publication Critical patent/CN117764054A/en
Application granted granted Critical
Publication of CN117764054B publication Critical patent/CN117764054B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

A natural language understanding method and system based on automatic construction prompt engineering, belonging to the field of text data processing and character processing, is provided to solve the problem that soft-hard prompting techniques cannot automatically adjust the structure of the prompt engineering. The method trains a prompt engineering correction mechanism model on its own data to obtain a model that automatically constructs the prompt engineering. The correction mechanism model then corrects a preset prompt engineering by combining the data information of the natural language understanding task data set with the information of the preset prompt engineering. The natural language understanding task data set is combined with the corrected prompt engineering, and a pre-trained language model serving as the natural language understanding model completes the understanding task, yielding the performance on the natural language understanding task. In this way the prompt engineering can adjust both its content and its formal structure according to the natural language understanding task, so that it is better suited to the current task and the overall performance of the natural language understanding task is improved.

Description

Natural language understanding method and system based on automatic construction prompt engineering
Technical Field
The invention belongs to the technical field of text data processing and character processing, and particularly relates to a natural language understanding method and system based on automatic construction prompt engineering.
Background
A prompt is a piece of text designed for the task currently being fine-tuned; combined with the fine-tuning task data, it converts the current task into the pre-training task form of a masked language model. In general, a prompt refers to such a piece of content designed for the task at hand. A well-chosen prompt has a large influence on model accuracy and task performance. Designing the most appropriate prompt to further exploit the ability of the pre-trained language model, so that the language model solves downstream tasks, is a challenging problem.
Prompts can generally be divided into three categories: hard prompts (also known as discrete prompts), soft prompts (also known as continuous prompts), and soft-hard prompts (a combination of discrete and continuous prompts). Hard prompts are made up of words understandable to humans; discrete prompts are portable, flexible, and simple, but finding an appropriate hard prompt for a particular task is challenging. Soft prompts consist of outputs of the embedding layer of a language model, which are continuous and unintelligible to humans; they are built automatically without any manual intervention, and their trainable parameters can be learned from the training data of the fine-tuning task. Soft-hard prompts do not use only a learnable continuous prompt or only a discrete prompt, but select some tokens as discrete prompts on the basis of a continuous prompt; compared with a purely continuous prompt, soft-hard prompts reduce the number of parameters and avoid excessive manually handwritten prompts.
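For illustration only, the three prompt families can be sketched as short templates for a masked-language-model classification task; the template strings below are hypothetical examples written for this description, not ones taken from the invention (a minimal Python sketch):

# Illustrative sketch of the three prompt families described above.
# All template strings are hypothetical examples.

# Hard (discrete) prompt: human-readable tokens around the input x.
hard_template = "{x} . It was [MASK] ."            # e.g. label words: great / terrible

# Soft (continuous) prompt: trainable pseudo-tokens [P_i] whose embeddings
# are learned; they have no human-readable surface form.
soft_template = "[P0] [P1] [P2] {x} [P3] [MASK]"

# Soft-hard prompt: a continuous prompt in which a few discrete anchor
# tokens are kept to preserve the manually designed signal.
soft_hard_template = "[P0] [P1] It was {x} [P2] [MASK]"

print(hard_template.format(x="The movie was fantastic"))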
A hard prompt is a prompt written for the current fine-tuning task from human experience. Although one can intuitively understand its content, manual prompts are not the best solution for downstream tasks. Gradient-based search methods, for example, select the most appropriate prompt from a large number of manually designed candidates for the current downstream task. Hybrid prompt methods combine a series of manual prompts with additional layers designed for the downstream task; in such methods each prompt may be synthesized by additional layers, such as an attention layer or a convolution layer, but the prompts still need to be set manually.
The idea of soft prompting is to directly generate an optimal prompt for the current data through other modules. Studies (see Deng M, Wang J, Hsieh C P, et al. RLPrompt: Optimizing discrete text prompts with reinforcement learning [J]; Liu X, Gao Y, Bai Y, et al. PSP: Pre-trained soft prompts for few-shot abstractive summarization [J]) add continuous prompts to the input embedding so that the embedding carries continuous prompts, which allows prompts of greater length, but such methods may not even match plain fine-tuning when the number of model parameters is small.
Soft-hard prompts are typically built on soft prompts with certain hard prompts added. Such methods trade off the advantages of both approaches: they ensure a certain efficiency while also ensuring a certain performance. P-tuning is one soft-hard prompt method: a learnable continuous prompt is trained and adapted to the current downstream task. It presets several suboptimal discrete prompts, replaces the tokens in the discrete prompts with learnable continuous prompts, and then manually keeps certain discrete tokens unreplaced to guarantee some performance, thereby obtaining a performance improvement. P-tuning-v2 proposes using prefix tuning for natural language processing tasks, which avoids the need to train all parameters as in p-tuning. One study (see Dual context-guided continuous prompt tuning for few-shot learning [C]) proposes using additional downstream tasks to address the problem that, in p-tuning, the tokens encoded by the LSTM do not interact with each other. A similar study (see Zhang N, Li L, Chen X, et al. Differentiable prompt makes pre-trained language models better few-shot learners [J]) also introduces an additional downstream task to strengthen the relationships between tokens.
In summary, no existing soft-hard prompting technique proposes using a prompt engineering correction mechanism model to correct a preset prompt engineering according to the data of the natural language understanding task, so that the prompt engineering can adjust not only its content but also its formal structure according to the task, making it better suited to the current natural language understanding task. It is therefore necessary to provide a natural language understanding method and system based on automatic construction prompt engineering.
Disclosure of Invention
The invention aims to provide a natural language understanding method and a natural language understanding system based on automatic construction prompt engineering, which are used for solving the problem that the structure of the prompt engineering cannot be automatically adjusted in the existing soft-hard prompt technology.
To achieve the above object, according to an aspect of the present invention, there is provided a natural language understanding method based on automatic construction prompt engineering, the method comprising the steps of:
S100, dividing a data set to obtain a training set and a checking set, and dividing the training set into a natural language understanding data set and a prompting engineering correction mechanism data set; the natural language understanding data set is used for training a natural language understanding model, and the prompt engineering correcting mechanism data set is used for training a prompt engineering correcting mechanism model;
S200, setting the execution training times of the prompt engineering correction mechanism model and the training times of the natural language understanding model;
S300, setting up an experience pool, and training the prompt engineering correction mechanism model by using a reinforcement learning method until the training times set in the step S200 are reached, so as to obtain a trained prompt engineering correction mechanism model;
S400, correcting the prompt engineering by using the trained prompt engineering correction mechanism model, and inputting the corrected prompt engineering and the natural language understanding data set into the natural language understanding model to perform training to obtain a trained natural language understanding model;
S500, calculating the result of the trained natural language understanding model obtained in S400 on the check set, only storing the natural language understanding model with the best performance, judging whether the times of executing S400 meet the threshold value, and if not, jumping to S300.
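The overall control flow of steps S100 to S500 can be summarized by the following minimal Python sketch; the helper callables (train_corrector, train_nlu, evaluate) are hypothetical stand-ins for S300, S400, and the check-set evaluation of S500, not functions defined by the invention:

from typing import Callable

def run_training(
    train_corrector: Callable[[], None],  # S300: one RL phase of the correction model
    train_nlu: Callable[[], None],        # S400: one NLU training phase on corrected prompts
    evaluate: Callable[[], float],        # S500: score of the NLU model on the check set
    epochs: int,                          # threshold set in S200
) -> float:
    """Alternate corrector and NLU training, keeping only the best result (S500)."""
    best = float("-inf")
    for _ in range(epochs):
        train_corrector()
        train_nlu()
        score = evaluate()
        if score > best:  # only the best-performing natural language understanding model is kept
            best = score
    return best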
In S100, dividing the data set and further dividing the training set into a natural language understanding data set and a prompt engineering correction mechanism data set is specifically:
The data set consists of a training set and a check set, each of which is a set of several pieces of training data; one piece of training data is a two-tuple consisting of a longer text string, recorded as the text content, and a shorter label string, recorded as the category label;
A portion of the natural language understanding data set's training data serves as the full data, i.e., all the training data. All training data can be divided into two parts: one part is used for training the prompt engineering correction mechanism model, and the other part is used for training the natural language understanding model. Each of the two training sets can divide its training data into several portions; in natural language understanding one portion is called a batch, and batches are input into the natural language understanding model for training one at a time. Training the natural language understanding model once on all batches constitutes one epoch, i.e., one round; completing one epoch indicates that the natural language understanding model has finished one round of training, and the value of epoch indicates how many rounds of training are executed. The training set of the prompt engineering correction mechanism model can likewise be divided into several portions for training, following the same rule;
The training data of each data set is divided into a training set and a check set; that is, within one task, part of the training data belongs to the training set and part belongs to the check set. This division is made before any model is trained: the training data belonging to the training set of each task together form the overall training set, and the training data belonging to the check set of each task together form the overall check set. The training set data are used for training the natural language understanding model and the prompt engineering correction mechanism model. The check set data are used to let the natural language understanding model calculate and output a category label from the text content of the data, to check whether the output category label is correct, and to compute the accuracy; if the calculated and output category label is correct, this is called a correctly predicted output category label. The prompt engineering correction mechanism model does not use the check set; it performs training only on the training data of its corresponding training set;
The natural language understanding data set is denoted here $D_{nlu}$ and the prompt engineering correction mechanism data set is denoted $D_{pec}$, where $D_{pec}$ is δ batches in length. The prompt engineering correction mechanism model is the model that constructs and organizes the prompt engineering; δ indicates how many batches there are, and a batch means that several pieces of data packaged together are called one batch;
The check set data are used to let the natural language understanding model calculate the probabilities of the multi-label classification from the text content of the check set data, take the label with the highest probability as the result label, check whether the result label is correct, and compute the accuracy; a correct result label is called a correctly calculated and output category label.
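As a minimal sketch of the division in S100, assuming the data are (text content, category label) two-tuples and that the correction-mechanism subset takes the first δ batches (the function and variable names are hypothetical, not from the invention):

import random

def split_and_batch(training_data, delta, batch_size):
    """Split the training data into the correction-mechanism subset d_pec
    (delta batches) and the NLU subset d_nlu; each batch is a list of
    (text, label) two-tuples."""
    random.shuffle(training_data)
    n_pec = delta * batch_size
    d_pec = [training_data[i:i + batch_size] for i in range(0, n_pec, batch_size)]
    rest = training_data[n_pec:]
    d_nlu = [rest[i:i + batch_size] for i in range(0, len(rest), batch_size)]
    return d_pec, d_nlu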
According to the invention, the 7 natural language understanding data sets are BoolQ, CB, WiC, RTE, MultiRC, WSC and COPA respectively; their δ values are 7, 2, 3, 2, 11, 3 and 2, respectively, all adjusted according to the respective data set sizes;
The training times for setting up the prompt engineering correction mechanism model in S200 are specifically as follows:
Each batch of $D_{pec}$ is input into the prompt engineering correction mechanism model one by one to perform training; inputting all batches completes one training pass of the prompt engineering correction mechanism model, for δ training steps in total. For different $D_{pec}$, different α are set, so that after S300 below has been executed α times the method switches to S400 to train the natural language understanding model. Then, for $D_{nlu}$, different epochs are set, so that each batch of $D_{nlu}$ is input into the natural language understanding model one by one to perform training; inputting all batches completes one training pass of the natural language understanding model, for epoch passes in total.
According to the invention, the 7 natural language understanding data sets are BoolQ, CB, WiC, RTE, MultiRC, WSC and COPA respectively; their α values are 100, 1, 34, 15, 170, 1 and 5, respectively, all adjusted according to the respective data set sizes.
In S300, the setting up of the experience pool is specifically:
The experience pool is called the Memory Pool; the episodes it stores form a set of mutually distinct elements, and an episode is a triple storing rewards, actions, and states, used for training the prompt engineering correction mechanism model. The experience pool is initially empty, and its size is δ, meaning that it can store δ episodes;
The method for adding an episode to the experience pool is as follows: each element of the experience pool is an episode, and each episode contains one batch of rewards, one batch of actions, and one batch of states;
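The experience pool and its episodes can be represented by the following minimal Python sketch, under the structure described above (the class and field names are hypothetical):

from dataclasses import dataclass, field
from typing import List

@dataclass
class Episode:
    """One episode: per-token states, sampled actions, and rewards (S301-S305)."""
    states: list          # one state s_i per preset token
    actions: List[int]    # gamma_i in {0: delete, 1: replace, 2: keep}
    rewards: List[float]  # reward_i = -loss of the corrected template

@dataclass
class MemoryPool:
    """Fixed-capacity experience pool holding at most delta episodes."""
    capacity: int
    episodes: List[Episode] = field(default_factory=list)

    def add(self, ep: Episode) -> None:
        if not self.full():
            self.episodes.append(ep)

    def full(self) -> bool:
        return len(self.episodes) >= self.capacity

    def clear(self) -> None:  # S308: the pool is emptied after each update
        self.episodes.clear()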
The specific process of obtaining the rewards, actions, and states of one batch is as follows:
S301, a preset prompt engineering is set and combined with $D_{pec}$ to obtain $T' = \{[A_{0:i}], X, [A_{i+1:m}], y\}$, where $[A_{0:i}]$ and $[A_{i+1:m}]$ denote a series of words that have practical meaning and are understandable to humans; they form the preset tokens, i.e., the manually designed prompt engineering to be corrected. Here m denotes the length of the prompt engineering and i denotes the index of a token, with $0 \le i < m$; T' denotes a template, the input X is a piece of data from $D_{pec}$, and the template is composed of the preset tokens $[A_{0:i}]$ and $[A_{i+1:m}]$, the label y, and X;
S302, obtaining the states: T' is encoded with a fixed natural language understanding model $M_f$, where "fixed" means that the model does not perform training but can still compute over the input data. The fixed natural language understanding model $M_f$ is BERT-base-cased with a 12-layer Transformer. The encoding yields $h(T') = \{h([A_{0:i}]), h(X), h([A_{i+1:m}]), h(y)\}$, where $h(\cdot)$ denotes the last-layer output of the $M_f$ model and $\cdot$ denotes an arbitrary input;
A splicing (concatenation) function is set as $F([h([A_i]); h([y])]) = s_i$, where $F([h([A_i]); h([y])])$ denotes the spliced output, and $h([A_i])$ and $h([y])$ denote the hidden vector of a preset token and the hidden vector of the label, respectively; $s_i$ denotes a state and serves as the hidden-vector input of the prompt engineering correction mechanism model, the hidden vectors being outputs of the last layer of $M_f$. This yields the states of the prompt engineering correction mechanism model; m denotes the length of the prompt engineering, i denotes the index of a token, and $0 \le i < m$;
S303, obtaining the actions: the $s_i$ obtained in the previous step is sent to the prompt engineering correction mechanism model consisting of linear layers, and $\gamma_i \in \{0, 1, 2\}$ is then obtained by sampling according to the function $\pi_{W_C}(\gamma_i \mid s_i) = \mathrm{softmax}(\sigma(F(s_i))\,W_C)$;
where $s_i$ is the environment state, $\sigma(\cdot)$ is the ReLU activation function with $\cdot$ denoting an arbitrary input, and $W_C$ are the parameters of the prompt engineering correction mechanism model that are allowed to participate in training;
The action value $\gamma_i$ represents three different actions: when $\gamma_i = 0$, the prompt engineering correction mechanism model deletes the current token; when $\gamma_i = 1$, it replaces the current token; and when $\gamma_i = 2$, it keeps the current token. The sampling at this stage selects the action with the highest probability as the final action of the current prompt engineering correction mechanism model, thereby obtaining the actions, as sketched below;
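Under the formula of S303, the correction policy can be sketched as follows; the state dimension 1536 is an assumption (the concatenation of two 768-dimensional BERT-base hidden vectors), not a value stated by the invention:

import torch
import torch.nn as nn

class CorrectionPolicy(nn.Module):
    """Sketch of the correction-mechanism policy: ReLU on the state s_i,
    a linear layer W_C, then a softmax over the three actions."""
    def __init__(self, state_dim: int = 1536, n_actions: int = 3):
        super().__init__()
        self.w_c = nn.Linear(state_dim, n_actions)  # trainable parameters W_C

    def forward(self, s_i: torch.Tensor) -> torch.Tensor:
        return torch.softmax(self.w_c(torch.relu(s_i)), dim=-1)

policy = CorrectionPolicy()
s_i = torch.randn(1536)             # state: concatenation of h([A_i]) and h([y])
probs = policy(s_i)
gamma_i = int(torch.argmax(probs))  # 0 = delete, 1 = replace, 2 = keep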
S304, obtaining the rewards:
The template T' is operated on with the output $\gamma_i$ of the prompt engineering correction mechanism model to obtain the template:

$T'' = \{[P_{0:j}], [A_{0:i-\theta}], X, [A_{i+1:l}], [P_{j+1:k}], y\}$

where $[A_{0:i-\theta}]$ and $[A_{i+1:l}]$ denote the remaining tokens, θ denotes that a total of θ tokens have been deleted, l denotes the total length of the series of remaining preset tokens, k denotes the total length of the series of replaced tokens, i and j denote token positions with $0 \le i < l$ and $0 \le j < k$, and $[P_{0:j}]$ and $[P_{j+1:k}]$, the replaced tokens, denote the result of replacing $[A_{0:j}]$ and $[A_{j+1:k}]$ at the corresponding positions of T'. The series of replaced tokens of $T''$ are drawn from the tokens at the end of the vocabulary of the natural language understanding model, and the replaced tokens cannot be directly understood by humans;
The preset tokens $[A_{0:i-\theta}]$ and $[A_{i+1:l}]$ in the obtained $T'' = \{[P_{0:j}], [A_{0:i-\theta}], X, [A_{i+1:l}], [P_{j+1:k}], y\}$ are encoded with the embedding layer of the natural language understanding model M, where $e$ denotes the embedding layer of M, a pre-trained language model BERT-base-cased with a 12-layer Transformer. The replaced tokens $[P_{0:j}]$ and $[P_{j+1:k}]$ are sent to the encoder of the prompt engineering to generate the corresponding embeddings; a bidirectional long short-term memory network (LSTM) followed by a two-layer ReLU-activated multilayer perceptron (MLP) is chosen as the encoder of the prompt engineering. This yields the template $e(T'')$, expressed as $e(T'') = \{e([P_{0:j}]), e([A_{0:i-\theta}]), e(X), e([A_{i+1:l}]), e([P_{j+1:k}]), e(y)\}$, where $e(\cdot)$ denotes the output of the embedding layer and $\cdot$ denotes an arbitrary input;
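The prompt encoder described above (bidirectional LSTM plus a two-layer ReLU MLP) can be sketched as follows; the hidden sizes are assumptions matching BERT-base's 768-dimensional embeddings, and the class name is hypothetical:

import torch
import torch.nn as nn

class PromptEncoder(nn.Module):
    """Produces one embedding per replaced pseudo-token [P] via a
    bidirectional LSTM followed by a two-layer ReLU-activated MLP."""
    def __init__(self, n_prompt_tokens: int, dim: int = 768):
        super().__init__()
        self.embed = nn.Embedding(n_prompt_tokens, dim)
        self.lstm = nn.LSTM(dim, dim // 2, bidirectional=True, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))

    def forward(self) -> torch.Tensor:
        ids = torch.arange(self.embed.num_embeddings).unsqueeze(0)
        out, _ = self.lstm(self.embed(ids))  # shape (1, k, dim)
        return self.mlp(out).squeeze(0)      # one embedding per [P] token

encoder = PromptEncoder(n_prompt_tokens=6)
print(encoder().shape)  # torch.Size([6, 768])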
Using e (T') to understand model by natural language And combining the predictor model for calculating each tag probability to obtain a determiner model decider to calculate the target tag y, and then calculating a loss value between the calculated result and the target tag, namely WillNegative number of (i) as a reward, i.e.Obtaining rewards reward; i represents the number of words, and at the moment, i is equal to or more than 0 and less than m, and m is the total length of the prompt project;
S305, the obtained rewards, actions, and states are combined into an episode, and the episode is stored into the experience pool;
S306, jump to S301 until the experience pool is filled with episodes; a minimal collection loop is sketched below.
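A minimal collection loop for S301 to S306, reusing the MemoryPool, Episode, and CorrectionPolicy sketches above; encode_state and compute_reward are hypothetical helpers standing in for S302 (encoding by $M_f$ and the concatenation F) and S304 (the negative decider loss):

import torch

def fill_pool(pool, policy, batch, encode_state, compute_reward):
    """Collect episodes until the experience pool is full (S306)."""
    while not pool.full():
        for example in batch:                 # S301: template T' per example
            states = encode_state(example)    # S302: one state s_i per preset token
            actions = [int(torch.argmax(policy(s))) for s in states]  # S303
            rewards = compute_reward(example, actions)                # S304
            pool.add(Episode(states, actions, rewards))               # S305
            if pool.full():
                break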
In S300, training the prompt engineering correction mechanism model with the reinforcement learning method to obtain the model that automatically constructs the prompt engineering is specifically:
S307, the prompt engineering correction mechanism model is trained according to the formula

$\nabla_{W_C} J(W_C) = \sum_{n=1}^{N} \sum_{t=1}^{\delta} \sum_{i=1}^{m} \mathrm{reward}_i \, \nabla_{W_C} \log \pi_{W_C}(s_i, \gamma_i)$

and is then updated in combination with the episodes in the experience pool. Here N denotes the size of each batch and n indexes a piece of training data within N; the results are summed over the N pieces of training data in the batch, over the δ episodes belonging to each piece of training data, and over the m terms of each episode. δ denotes how many batches the experience pool can store, t indexes the episode within the experience pool, m denotes the total token length of the prompt engineering, i indexes the token, and $\mathrm{reward}_i$ denotes the reward corresponding to the i-th token. $\nabla_{W_C}$ denotes taking the partial derivative with respect to the prompt engineering correction mechanism model, $\pi_{W_C}(s_i, \gamma_i)$ denotes the prompt engineering correction mechanism model with inputs state $s_i$ and action $\gamma_i$, β denotes the learning rate, and $W_C$ denotes the parameters of the prompt engineering correction mechanism model. Finally, the gradient-ascent formula $W_C \leftarrow W_C + \beta \, \nabla_{W_C} J(W_C)$ updates the $W_C$ parameters according to the calculation result;
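The update of S307 corresponds to a REINFORCE-style policy gradient; a minimal sketch under the assumptions of the earlier sketches (policy(s) returns the action probabilities, and the optimizer's learning rate plays the role of β):

import torch

def policy_gradient_update(policy, optimizer, pool_episodes):
    """S307: accumulate reward_i * log pi(gamma_i | s_i) over the tokens of
    every episode in the pool, then ascend the gradient (implemented here
    by minimizing the negative objective)."""
    loss = torch.zeros(())
    for ep in pool_episodes:                   # t = 1 .. delta
        for s_i, gamma_i, reward_i in zip(ep.states, ep.actions, ep.rewards):
            log_prob = torch.log(policy(s_i)[gamma_i])
            loss = loss - reward_i * log_prob  # negative of the objective J
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                           # W_C <- W_C + beta * grad J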
According to the invention, the 7 natural language understanding data sets are BoolQ, CB, WiC, RTE, MultiRC, WSC and COPA respectively; their β is 2e-5;
S308, the experience pool is emptied;
S309, after training the prompt engineering correction mechanism model, judge whether it has been executed α times, i.e., whether the threshold is exceeded; if so, jump to S400, otherwise continue with S301.
In S400, using the trained prompt engineering correction mechanism model to correct the prompt engineering, combining it with the natural language understanding data set, and inputting the corrected prompt engineering and the natural language understanding data set into the natural language understanding model for training to obtain a trained natural language understanding model is specifically:
S401, similar to S301, a preset manual prompt engineering is set and combined with the natural language understanding training data to obtain $T' = \{[A_{0:i}], X, [A_{i+1:m}], y\}$; S301 uses the data of the prompt engineering correction mechanism model, whereas this step uses the natural language understanding training data;
S402, T' is corrected with the prompt engineering correction mechanism model to obtain $T'' = \{[P_{0:j}], [A_{0:i-\theta}], X, [A_{i+1:l}], [P_{j+1:k}], y\}$;
S403, similarly to the second part of S304, $e(T'') = \{e([P_{0:j}]), e([A_{0:i-\theta}]), e(X), e([A_{i+1:l}]), e([P_{j+1:k}]), e(y)\}$ is obtained; $e(T'')$ is then fed to the decider model formed by the natural language understanding model M and the predictor to compute the target label y, and a loss value is calculated between the computed result and the target label, obtaining $\mathcal{L}$, with $0 \le i < m$; S304 uses the data of the prompt engineering correction mechanism model, whereas this step uses the natural language understanding training data;
S404, the natural language understanding model is trained according to $\mathcal{L}$; one such iteration is sketched below.
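One S401-to-S404 iteration can be sketched as follows; build_template, apply_correction, and compute_loss are hypothetical helper callables standing in for S401, S402, and S403 (the loss is assumed to be a differentiable tensor):

def s400_step(nlu_model, corrector, batch, optimizer,
              build_template, apply_correction, compute_loss):
    """Correct the prompt engineering and train the NLU model on the result."""
    t_prime = build_template(batch)                   # S401: T' from NLU data
    t_dprime = apply_correction(corrector, t_prime)   # S402: corrected T''
    loss = compute_loss(nlu_model, t_dprime, batch)   # S403: decider loss vs. y
    optimizer.zero_grad()
    loss.backward()                                   # S404: train the NLU model
    optimizer.step()
    return loss.item()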
In S500, judging whether the threshold is satisfied and outputting the prediction result is specifically:
The data in the test set are used as test data: the natural language understanding model calculates, from the text content of each piece of test data, the category label with the maximum classification probability; the category label with the maximum probability is compared with the category label of the test data using evaluation indexes, and evaluation results are calculated according to the evaluation indexes. That is, the natural language understanding model calculates and outputs the corresponding category label for each piece of test data from its text content, the output labels are compared with the category labels of the test data, and evaluation results are obtained using the evaluation indexes, which include accuracy (ACC), F1, and EM;
The accuracy formula is:

$\mathrm{ACC} = \dfrac{TP + TN}{TP + TN + FP + FN}$

The formula of F1 is:

$F1 = \dfrac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$

wherein:

the precision formula is $\mathrm{Precision} = \dfrac{TP}{TP + FP}$;

the recall formula is $\mathrm{Recall} = \dfrac{TP}{TP + FN}$;

TP, true positives: the number of correctly predicted positive examples;

TN, true negatives: the number of correctly predicted negative examples;

FP, false positives: the number of incorrectly predicted positive examples;

FN, false negatives: the number of incorrectly predicted negative examples.

EM is commonly used for evaluating question-answering systems; it refers to the proportion of model predictions that exactly match the true answer, and is used to assess the question-answering tasks among the natural language understanding tasks. The EM formula is:

$\mathrm{EM} = \dfrac{\text{number of predictions in perfect agreement with the label}}{\text{total number of predictions}}$

that is, the ratio of calculated results in perfect agreement with the label;
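The evaluation indexes above reduce to a few lines of Python; a minimal sketch (the function names are illustrative, not from the invention):

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def f1(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def exact_match(predictions, labels):
    """EM: the fraction of predictions identical to the reference answer."""
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)

# Example: accuracy(tp=40, tn=45, fp=5, fn=10) == 0.85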
The currently obtained evaluation result is compared with the previously obtained evaluation result; if the current result is better than the previous one, the previously saved trained natural language understanding model is replaced, otherwise the currently obtained natural language understanding model is discarded;
wherein the BoolQ data set adopts ACC as its evaluation index, CB adopts ACC and F1, WiC adopts ACC, RTE adopts ACC, MultiRC adopts EM and F1a, WSC adopts ACC, and COPA adopts ACC;
Judging whether S400 satisfies the threshold, and otherwise jumping to S300, is specifically:
judging whether the number of training passes of the natural language understanding model in S400 exceeds the epoch number set for natural language understanding model training in S200; if not, jump to S300, where epoch denotes the number of passes;
and if the number of executions of S400 is satisfied, i.e., the threshold is reached, outputting the calculation result of the best trained natural language understanding model on the test set and exiting.
The beneficial technical effects of the invention are as follows:
The invention performs training with the data of the prompt engineering correction mechanism model to obtain a model that automatically constructs the prompt engineering; the prompt engineering correction mechanism model then corrects the preset prompt engineering by combining the data information of the natural language understanding task data set with the information of the preset prompt engineering; the natural language understanding task data set is combined with the corrected prompt engineering, and a pre-trained language model serving as the natural language understanding model completes the understanding task, yielding the performance on the natural language understanding task. The invention enables the prompt engineering to adjust not only its content but also its formal structure according to the natural language understanding task, so that the prompt engineering is better suited to the current task and the overall performance of the natural language understanding task is improved.
The method can automatically learn the content of the prompt engineering, after which the natural language understanding data and the prompt engineering together enter the fine-tuning stage, further narrowing the gap between the pre-training and fine-tuning tasks of the pre-trained language model, thereby improving the performance of the natural language understanding task and obtaining a natural language understanding model with improved performance.
The invention uses the prompt engineering correction mechanism model to correct the preset prompt engineering according to the data of the fine-tuning task, so that the prompt engineering can adjust not only its content but also its formal structure according to the natural language understanding task, making it better suited to the current task. Verification shows that the method of the invention improves over the most similar prior art, p-tuning, on the 7 data sets (see Table 1). The invention therefore solves the problem that existing soft-hard prompting techniques cannot automatically adjust the structure of the prompt engineering.
Drawings
The above and other features of the present invention will become more apparent from the detailed description of its embodiments given in conjunction with the accompanying drawings, in which like reference characters designate like or similar elements. The drawings in the following description are merely some examples of the invention, and other drawings may be obtained from them by those of ordinary skill in the art without inventive effort. FIG. 1 is a flow chart of the natural language understanding method based on automatic construction prompt engineering according to the present invention; FIG. 2 is a block diagram of the natural language understanding system based on automatic construction prompt engineering according to the present invention.
Detailed Description
The conception, specific structure, and technical effects produced by the present application will be clearly and completely described below with reference to the embodiments and the drawings to fully understand the objects, aspects, and effects of the present application. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other.
In the description of the present invention, "a number of" means one or more and "a plurality of" means two or more; "greater than", "less than", "exceeding", etc. are understood to exclude the stated number, while "above", "below", "within", etc. are understood to include it. The terms "first" and "second" are used only to distinguish technical features and should not be construed as indicating or implying relative importance, the number of the indicated technical features, or their precedence.
Referring to fig. 1, which is a flowchart of the natural language understanding method based on automatic construction prompt engineering according to the present invention, a natural language understanding method and system based on automatic construction prompt engineering according to an embodiment of the present invention are described below with reference to fig. 1.
The steps S100 to S500 of the method according to this embodiment, including the division of the data sets (S100), the setting of the training times (S200), the construction of the experience pool and the reinforcement-learning training of the prompt engineering correction mechanism model (S300, S301 to S309), the correction of the prompt engineering and the training of the natural language understanding model (S400, S401 to S404), and the evaluation and model selection on the check set (S500), are as described above in the disclosure of the invention.
The natural language understanding system based on automatic construction prompt engineering comprises a processor and a memory; when the processor executes the computer program, the steps of the embodiment of the natural language understanding method based on automatic construction prompt engineering described above are implemented. The system can run in computing devices such as desktop computers, notebook computers, palm computers, and cloud data centers, and the operable system may include, but is not limited to, a processor, a memory, and a server cluster.
An embodiment of the invention provides a natural language understanding method and system based on automatic construction prompt engineering. As shown in fig. 2, the system comprises a processor, a memory, and a computer program stored in the memory and executable on the processor; when executing the computer program, the processor implements the steps in the embodiment of the natural language understanding method based on automatic construction prompt engineering described above, and the processor executes the computer program to run in the units of the following system:
The data dividing unit is used for dividing the different data and providing training data for the prompt engineering correction mechanism model and the natural language understanding model;
The numerical value setting unit is used for setting the switching steps between natural language understanding training and prompt engineering correction mechanism model training;
The prompt engineering correction mechanism model training unit is used for training the prompt engineering correction mechanism model so that it can better correct the preset prompt engineering;
The prompt engineering correction unit is used for correcting the preset prompt engineering into a prompt engineering more suitable for the natural language understanding task;
The natural language understanding unit is used for training the natural language understanding model;
The test output unit is used for letting the natural language understanding model predict and output on the input text and for examining the performance.
In the implementation of the method for automatically constructing the prompt engineering, the following experimental data were recorded as Table 1. Table 1 compares the natural language understanding method based on automatic construction prompt engineering with p-tuning, the model most similar to the method (see X. Liu, Y. Zheng, Z. Du, M. Ding, Y. Qian, Z. Yang, and J. Tang, "GPT understands, too," AI Open, 2023).
The datasets used are BoolQ, MultiRC (question answering), CB, RTE (textual entailment), WiC (word in context), COPA (causal question answering), and WSC (coreference disambiguation). In the experiment, the rows for multiple prompt engineering correction mechanism models and for a single prompt engineering correction mechanism model report the results of the natural language understanding method based on automatic construction prompt engineering. As shown in Table 1, in the multiple-model setting each preset token is corrected by its own prompt engineering correction mechanism model, configured per dataset for BoolQ, CB, WiC, RTE, MultiRC, WSC, and COPA; their δ values are 7, 2, 3, 2, 11, 3, 2 and their α values are 100, 1, 34, 15, 170, 1, 5, respectively.
In the single prompt engineering correction mechanism model setting, the same correction mechanism model is used for every preset token, with δ and α set as above. "Randomly selected action" means the prompt engineering correction mechanism model randomly outputs one of 3 actions: replace, keep, or delete. "p-tuning" denotes the baseline being compared against; "all replaced tokens" means the correction mechanism model outputs the replace operation for every token, so all tokens are replaced; and "fine-tuning" denotes the performance of the natural language understanding model fine-tuned directly on the natural language understanding dataset.
TABLE 1
The results of the supplemental experiments are shown in Table 2:
Table 2 shows the effect on overall performance when the prompt engineering correction mechanism model receives different inputs, that is, when different hidden vectors are used as states; in other words, the effect of using different splicing functions in S302. In the "single prompt engineering correction mechanism model" setting:
"mask+anchor" (case-insensitive) indicates that the hidden vector of the target label and the hidden vector of the token currently being corrected are spliced together as the input to the prompt engineering correction mechanism model at the current token position.
"only anchor" (case-insensitive) indicates that only the hidden vector of the token currently being corrected is used as the input to the correction mechanism model at the current token position.
"cls+anchor" (case-insensitive) indicates that the hidden vector of the first token input to the natural language understanding model and the hidden vector of the token currently being corrected are spliced together as the input to the correction mechanism model at the current token position.
The "mask+anchor", "only anchor", and "cls+anchor" settings under the "multiple prompt engineering correction mechanism models" heading are analogous, except that each token has its own correction mechanism model performing the correction operation; a code sketch of these three variants follows below.
TABLE 2
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, or the like. A general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the natural language understanding system based on automatic construction prompt engineering, using various interfaces and lines to connect the sub-parts of the whole system.
The memory may be used to store the computer program and/or modules; the processor implements the various functions of the natural language understanding method based on automatic construction prompt engineering by running or executing the computer program and/or modules stored in the memory and invoking data stored in the memory. The memory may mainly include a program storage area and a data storage area: the program storage area may store an operating system and the application programs required for at least one function (such as a sound playing function or an image playing function); the data storage area may store data created according to use (such as audio data or a phonebook). In addition, the memory may include high-speed random access memory and non-volatile memory, such as a hard disk, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The invention provides a natural language understanding method and system based on automatic construction prompt engineering. A model that automatically constructs the prompt engineering is obtained by training on the data of the prompt engineering correction mechanism model. The prompt engineering correction mechanism model corrects the preset prompt engineering by combining the data information of the natural language understanding task dataset with the information of the preset prompt engineering; the dataset is then combined with the corrected prompt engineering, and a pre-trained language model serving as the natural language understanding model completes the understanding task, yielding the performance on the natural language understanding task. The prompt engineering can thus adjust both its content and its formal structure according to the natural language understanding task, making it better suited to the current task and improving the overall performance of the natural language understanding task.
Although the present invention has been described in considerable detail with reference to certain embodiments, it is not intended to be limited to any such detail or particular embodiment; the description is to be construed as covering the full intended scope of the invention, including equivalent modifications that may not be presently foreseen.

Claims (6)

1. A natural language understanding method based on automatic construction prompt engineering, the method comprising the steps of:
S100, dividing a data set to obtain a training set and a checking set, and dividing the training set into a natural language understanding data set and a prompting engineering correction mechanism data set; the natural language understanding data set is used for training a natural language understanding model, and the prompt engineering correcting mechanism data set is used for training a prompt engineering correcting mechanism model;
S200, setting the execution training times of the prompt engineering correction mechanism model and the training times of the natural language understanding model;
S300, setting up an experience pool, and training the prompt engineering correction mechanism model by using a reinforcement learning method until the training times set in the step S200 are reached, so as to obtain a trained prompt engineering correction mechanism model;
S400, correcting the prompt engineering by using the trained prompt engineering correction mechanism model, and inputting the corrected prompt engineering and the natural language understanding data set into the natural language understanding model to perform training to obtain a trained natural language understanding model;
S500, calculating the result of the trained natural language understanding model obtained in S400 on the check set, saving only the best-performing natural language understanding model, judging whether the number of executions of S400 meets the threshold, and if not, jumping to S300;
the specific process of setting up the experience pool in S300 is:
The experience pool is called the Memory Pool; the episodes it stores form a set of distinct elements, and each episode is a triplet of rewards, actions, and states used for training the prompt engineering correction mechanism model; the experience pool is initially empty, and its size is δ, meaning it can store δ episodes;
The method for adding an episode to the experience pool is as follows: each element in the experience pool is an episode, where each episode contains one batch of rewards, one batch of actions, and one batch of states;
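A minimal Python sketch of such an experience pool follows; the class and field names are illustrative assumptions, not the patent's:

```python
from dataclasses import dataclass, field

@dataclass
class Episode:
    rewards: list          # one reward per prompt token
    actions: list          # gamma values: 0 = delete, 1 = replace, 2 = keep
    states: list           # one state vector s_i per token

@dataclass
class MemoryPool:
    delta: int                                   # capacity: delta episodes
    episodes: list = field(default_factory=list)

    def add(self, ep: Episode) -> None:          # store one episode (S305)
        self.episodes.append(ep)

    def is_full(self) -> bool:                   # loop S301-S306 until full
        return len(self.episodes) >= self.delta

    def clear(self) -> None:                     # emptied after an update (S308)
        self.episodes.clear()
```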
the specific process of obtaining a reward, an action actor and a state in a batch is as follows:
S301, combining a preset prompt engineering template with the input to obtain T′ = {[A_0:i], X, [A_i+1:m], y}, where [A_0:i] and
[A_i+1:m] denote the preset tokens, a series of words that have practical meaning and can be understood by humans, constituting the manually designed prompt engineering to be corrected; m denotes the length of the prompt engineering, i numbers the tokens, and 0 ≤ i ≤ m; T′ denotes the template; the input X is a piece of data drawn from the prompt engineering correction mechanism dataset, and the template is composed of the preset tokens [A_0:i] and [A_i+1:m], the input X, and the label y; a sketch of this assembly follows below;
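As a minimal illustration of this assembly (all names and the example tokens are hypothetical; real templates are dataset-specific):

```python
def build_template(preset_front: list, x: list,
                   preset_back: list, y: str) -> list:
    """Assemble T' = {[A_0:i], X, [A_i+1:m], y} as one token sequence."""
    return preset_front + x + preset_back + [y]

# Hypothetical example: a manually designed prompt wrapped around an input
t_prime = build_template(["It", "was"],
                         ["the", "movie", "was", "great"],
                         ["."],
                         "[MASK]")
```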
S302, obtaining the state: T′ is encoded by a fixed natural language understanding model M_f, where "fixed" means that the model is not trained but can still compute over the input data; the fixed natural language understanding model M_f is a BERT-base-cased with 12 Transformer layers; the encoding yields h(T′) = {h([A_0:i]), h(X), h([A_i+1:m]), h(y)}, where h(·) denotes the last-layer output of the corresponding M_f model and · denotes an arbitrary input;
A splicing function is set as F([h([A_i]); h([y])]) = s_i, where F([h([A_i]); h([y])]) denotes the spliced output of h([A_i]) and h([y]), which are respectively the hidden vector of a preset token and the hidden vector of the label; s_i denotes the state, a hidden vector for the prompt engineering correction mechanism model taken from the last-layer output of M_f; this yields the state of the prompt engineering correction mechanism model, where m denotes the length of the prompt engineering, i numbers the tokens, and 0 ≤ i < m;
S303, obtaining the action: the s_i obtained in the previous step is fed to the prompt engineering correction mechanism model consisting of linear layers, and sampling is carried out according to the function π_{W_C}(γ_i | s_i) = softmax(σ(F(s_i)) W_C), giving the action γ_i, γ_i ∈ {0, 1, 2};
where s_i is the environment state, σ(·) is the ReLU activation function with · denoting an arbitrary input, and W_C is the trainable parameter matrix of the prompt engineering correction mechanism model;
The value of γ_i indicates one of three different actions: when γ_i = 0, the prompt engineering correction mechanism model deletes the current token; when γ_i = 1, it replaces the current token; and when γ_i = 2, it keeps the current token; sampling at this stage means selecting the action with the highest probability as the final action of the current prompt engineering correction mechanism model, giving the action; a sketch of this policy head follows below;
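A hedged PyTorch sketch of this policy head follows; whether the activation precedes or follows the linear map is our reading of the formula above, and the hidden size is an assumption:

```python
import torch
import torch.nn as nn

class CorrectionPolicy(nn.Module):
    """Policy head: pi_WC(gamma_i | s_i) = softmax(ReLU(F(s_i)) @ W_C)."""

    def __init__(self, state_dim: int, n_actions: int = 3):
        super().__init__()
        self.W_C = nn.Linear(state_dim, n_actions, bias=False)

    def forward(self, s_i: torch.Tensor) -> torch.Tensor:
        return torch.softmax(self.W_C(torch.relu(s_i)), dim=-1)

# Greedy "sampling": the highest-probability action becomes the final action.
policy = CorrectionPolicy(state_dim=2 * 768)  # e.g. two spliced BERT-base vectors
probs = policy(torch.randn(2 * 768))
gamma_i = int(probs.argmax())                 # 0 = delete, 1 = replace, 2 = keep
```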
S304, obtaining the reward:
The template T′ is operated on using the output γ_i of the prompt engineering correction mechanism model, giving the template:
T″ = {[P_0:j], [A_0:i-θ], X, [A_i+1:l], [P_j+1:k], y}, where [A_0:i-θ][A_i+1:l] denotes the remaining tokens and θ indicates that θ tokens in total have been deleted; l denotes the total length of the series of remaining preset tokens, k denotes the total length of the series of replaced tokens, and i, j also denote token positions, with 0 ≤ i < l and 0 ≤ j < k;
[P_0:j][P_j+1:k] denotes the result of replacing [A_0:j][A_j+1:k] at the corresponding positions of T′, i.e., the replaced tokens; the series of replaced tokens of T″ is made up of tokens taken from the end of the vocabulary of the natural language understanding model, and the replaced tokens cannot be directly understood by humans;
The remaining preset tokens [A_0:i-θ][A_i+1:l] in T″ = {[P_0:j], [A_0:i-θ], X, [A_i+1:l], [P_j+1:k], y} are encoded with the embedding layer of the natural language understanding model M, a pre-trained BERT-base-cased language model with 12 Transformer layers, where e(·) denotes the output of the embedding layer and · denotes an arbitrary input; the replaced tokens [P_0:j][P_j+1:k] are sent to the prompt encoder to generate the corresponding embeddings, where a bidirectional long short-term memory network (LSTM) together with a two-layer ReLU-activated multi-layer perceptron (MLP) is selected as the prompt encoder; this gives the template e(T″), expressed as e(T″) = {e([P_0:j]), e([A_0:i-θ]), e(X), e([A_i+1:l]), e([P_j+1:k]), e(y)}; a sketch of the prompt encoder follows below;
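The prompt encoder of this step can be sketched as follows (a minimal PyTorch illustration; the hidden sizes and class names are assumptions, and details beyond "bidirectional LSTM plus two-layer ReLU MLP" are not specified by the claim):

```python
import torch
import torch.nn as nn

class PromptEncoder(nn.Module):
    """BiLSTM followed by a two-layer ReLU MLP, producing embeddings
    for the replaced pseudo-tokens [P_0:j][P_j+1:k]."""

    def __init__(self, hidden: int = 768):
        super().__init__()
        self.lstm = nn.LSTM(hidden, hidden // 2, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(hidden, hidden),
                                 nn.ReLU(),
                                 nn.Linear(hidden, hidden))

    def forward(self, p_embed: torch.Tensor) -> torch.Tensor:
        # p_embed: (batch, num_pseudo_tokens, hidden)
        out, _ = self.lstm(p_embed)
        return self.mlp(out)

enc = PromptEncoder()
pseudo = torch.randn(1, 5, 768)   # 5 replaced tokens; shapes are hypothetical
print(enc(pseudo).shape)          # torch.Size([1, 5, 768])
```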
Using e(T″), the natural language understanding model M is combined with the predictor that computes each label's probability, forming the decider, which predicts the target label y; a loss value L is then calculated between the computed result and the target label, and the negative of L is taken as the reward, i.e., reward = −L, giving the reward; i numbers the tokens, here with 0 ≤ i < m, where m is the total length of the prompt engineering; a sketch of the reward computation follows below;
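As a sketch of the reward computation (assuming a cross-entropy loss; the claim specifies only that the reward is the negative of the loss L):

```python
import torch
import torch.nn.functional as F

def compute_reward(decider_logits: torch.Tensor, target: torch.Tensor) -> float:
    """Reward for one batch: the negative of the loss between the decider's
    prediction and the target label y. Cross-entropy is an assumed choice."""
    loss = F.cross_entropy(decider_logits, target)
    return -loss.item()
```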
S305, combining the rewards, actions, and states into an episode, and storing the episode in the experience pool;
S306, jumping to S301 until the experience pool is filled with episodes;
In S300, the prompt engineering correction mechanism model is trained by the reinforcement learning method until the number of training iterations set in step S200 is reached, giving the trained prompt engineering correction mechanism model; the specific process is as follows:
S307, training the prompt engineering correction mechanism model on the episodes in the experience pool and updating it according to the policy-gradient formula:
∇J(W_C) = (1/N) Σ_{n=1}^{N} (1/δ) Σ_{t=1}^{δ} (1/m) Σ_{i=1}^{m} reward_i · ∂ log π_{W_C}(γ_i | s_i) / ∂W_C
wherein: N denotes the size of each batch and n indexes the training data within the batch, so (1/N) Σ_n averages the N per-datum results; δ denotes how many episodes the experience pool can store and t indexes the episodes, so (1/δ) Σ_t averages over the δ episodes of one training datum; m denotes the total token length of the prompt engineering and i indexes the tokens, so (1/m) Σ_i averages over the m per-token results; reward_i denotes the reward corresponding to the i-th token; ∂/∂W_C denotes taking the partial derivative with respect to the parameters of the prompt engineering correction mechanism model; π_{W_C}(γ_i | s_i) denotes the prompt engineering correction mechanism model, whose inputs are the state s_i and the action γ_i; β denotes the learning rate and W_C the parameters of the prompt engineering correction mechanism model; finally,
the parameters are updated by gradient ascent according to the calculation result of the above formula: W_C ← W_C + β · ∇J(W_C);
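This update is in the REINFORCE family; below is a hedged PyTorch sketch, where `policy` maps a state to action probabilities (as in S303), `episodes` holds (states, actions, rewards) triples drawn from the experience pool, and the averaging over the batch dimension N is assumed to be folded into the stored per-token rewards:

```python
import torch

def reinforce_update(policy, optimizer, episodes, m: int) -> None:
    """One gradient step on W_C over all episodes in the pool. The learning
    rate beta lives inside `optimizer`; minimizing -reward * log pi is
    equivalent to gradient ascent on the objective J."""
    delta = len(episodes)                       # pool size: delta episodes
    loss = torch.zeros(())
    for states, actions, rewards in episodes:   # sum over the delta episodes
        for s_i, g_i, r_i in zip(states, actions, rewards):  # sum over tokens
            log_prob = torch.log(policy(s_i)[g_i])  # log pi_WC(gamma_i | s_i)
            loss = loss - r_i * log_prob
    loss = loss / (delta * m)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                            # W_C <- W_C + beta * grad J
```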
S308, emptying the experience pool;
S309, after training the prompt engineering correction mechanism model, judging whether the number of executions exceeds the threshold α; if so, jumping to S400, otherwise jumping to S301 to continue;
the specific implementation process of S400 is as follows:
S401, similar to S301, combining a preset manual prompt engineering template with the natural language understanding training data to obtain T′ = {[A_0:i], X, [A_i+1:m], y}; S301 uses the data of the prompt engineering correction mechanism model, whereas this step uses the natural language understanding training data;
S402, correcting T′ with the prompt engineering correction mechanism model to obtain:
T″ = {[P_0:j], [A_0:i-θ], X, [A_i+1:l], [P_j+1:k], y};
S403, similar to S304, obtaining:
e(T″) = {e([P_0:j]), e([A_0:i-θ]), e(X), e([A_i+1:l]), e([P_j+1:k]), e(y)}, and then computing the target label y from e(T″) using the decider formed by the natural language understanding model M and the predictor; a loss value L is then calculated between the computed result and the target label, giving L, with 0 ≤ i < m; S304 uses the data of the prompt engineering correction mechanism model, whereas this step uses the natural language understanding training data;
S404, training the natural language understanding model according to the loss value L obtained in S403.
2. The natural language understanding method based on automatic construction prompt engineering according to claim 1, wherein the method comprises the following steps: in the step S100 of the method,
The training set and the check set are sets composed of multiple pieces of training data, where each piece of training data is a two-element tuple of text content and a class label;
The natural language understanding dataset and the prompt engineering correction mechanism dataset are both drawn from the training set, the prompt engineering correction mechanism dataset being δ batches in length; the prompt engineering correction mechanism model is the model that constructs and organizes the prompt engineering; δ denotes the number of batches, where packaging multiple pieces of data together is called a batch;
The check set data are used to make the natural language understanding model compute and output multi-label classification probabilities from the text content of each check-set datum, take the label with the highest probability as the result label, and check whether the result label is correct in order to compute the accuracy, a correct result label being one whose computed and output class label matches the true label.
3. The natural language understanding method based on automatic construction prompt engineering according to claim 2, wherein: in S200 the process of the present invention,
Each batch of the prompt engineering correction mechanism dataset is input one by one into the prompt engineering correction mechanism model for training; inputting all batches completes one pass of training of the prompt engineering correction mechanism model, and δ batches are processed in total; for different datasets, different α are set, so that after α executions the following step S300 switches to S400 to train the natural language understanding model; different epochs are then set for the natural language understanding dataset, so that each of its batches is input one by one into the natural language understanding model for training; inputting all batches completes one pass of training of the natural language understanding model, and epoch passes are completed in total.
4. The natural language understanding method based on automatic construction prompt engineering according to claim 1, wherein in step S500, the evaluation-index results are computed on the check set and only the best-performing natural language understanding model is saved, specifically:
Using the data in the check set as test data, the natural language understanding model computes the class label with the maximum classification probability from the text content of each test datum; the class label with the maximum probability is compared with the class label in the test datum using an evaluation index, and the evaluation result is calculated accordingly; the evaluation indices comprise accuracy, F1, and EM;
the accuracy formula is:
ACC = (TP + TN) / (TP + TN + FP + FN)
The formula of F1 is:
F1 = 2 × precision × recall / (precision + recall)
Wherein:
the precision formula is: precision = TP / (TP + FP)
the recall formula is: recall = TP / (TP + FN)
TP denotes the true positives, i.e., the number of correctly predicted positive examples;
TN denotes the true negatives, i.e., the number of correctly predicted negative examples;
FP denotes the false positives, i.e., the number of incorrectly predicted positive examples;
FN denotes the false negatives, i.e., the number of incorrectly predicted negative examples;
EM is commonly used for evaluating question answering systems; it refers to the proportion of model predictions that exactly match the true answer, and is used to assess the question answering tasks within natural language understanding tasks;
The EM formula is: EM = (number of computed results in perfect agreement with the labels) / (total number of predictions), i.e., the proportion of computed results in perfect agreement with the labels;
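A small Python sketch of these three metrics follows; it is a plain restatement of the formulas above, with hypothetical function names:

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """ACC = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2 * precision * recall / (precision + recall)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def exact_match(predictions: list, labels: list) -> float:
    """EM: proportion of predictions in perfect agreement with the labels."""
    return sum(p == g for p, g in zip(predictions, labels)) / len(labels)
```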
comparing the currently obtained evaluation result with the previously obtained evaluation result; if the current result is better than the previous one, replacing the previously saved trained natural language understanding model, and otherwise discarding the currently obtained natural language understanding model;
judging whether S400 meets the threshold, specifically:
judging whether the number of times the natural language understanding model has been trained in S400 exceeds the number of epochs set for the natural language understanding model in S200, and if not, jumping to S300, where epoch denotes the number of training passes;
if the number of executions of S400 is satisfied, that is, the threshold is reached, outputting the calculation result of the best trained natural language understanding model on the test set and exiting.
5. A natural language understanding system based on automatic construction prompt engineering, characterized in that the system comprises: a processor, a memory, and a computer program stored in the memory and running on the processor, wherein the processor, when executing the computer program, implements the steps of the natural language understanding method based on automatic construction prompt engineering according to any one of claims 1 to 4, the system running on a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud data center.
6. A computer-readable storage medium, characterized by: the computer readable storage medium stores a computer program configured to implement the steps of a natural language understanding method based on automatic construction prompt engineering according to any one of claims 1 to 4 when called by a processor.
CN202410170010.3A 2024-02-06 2024-02-06 Natural language understanding method and system based on automatic construction prompt engineering Active CN117764054B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410170010.3A CN117764054B (en) 2024-02-06 2024-02-06 Natural language understanding method and system based on automatic construction prompt engineering

Publications (2)

Publication Number Publication Date
CN117764054A CN117764054A (en) 2024-03-26
CN117764054B true CN117764054B (en) 2024-06-21

Family

ID=90325966

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410170010.3A Active CN117764054B (en) 2024-02-06 2024-02-06 Natural language understanding method and system based on automatic construction prompt engineering

Country Status (1)

Country Link
CN (1) CN117764054B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116303966A (en) * 2023-03-27 2023-06-23 天津大学 Dialogue behavior recognition system based on prompt learning
CN116738994A (en) * 2023-04-24 2023-09-12 广西师范大学 Context-enhanced-based hinting fine-tuning relation extraction method

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113901799B (en) * 2021-12-07 2022-03-08 苏州浪潮智能科技有限公司 Model training method, text prediction method, model training device, text prediction device, electronic equipment and medium
US20230342552A1 (en) * 2022-04-25 2023-10-26 Salesforce, Inc. Systems and methods for contextualized and quantized soft prompts for natural language understanding
CN116861921A (en) * 2023-07-10 2023-10-10 厦门大学 Robot task analysis method and device based on large language model and readable medium
CN117236337B (en) * 2023-08-22 2024-07-16 北京工商大学 Method for generating natural language based on mixed prompt learning completion history knowledge graph
CN117151338B (en) * 2023-09-08 2024-05-28 安徽大学 Multi-unmanned aerial vehicle task planning method based on large language model
CN117217289A (en) * 2023-10-09 2023-12-12 北银金融科技有限责任公司 Banking industry large language model training method
CN117271567A (en) * 2023-10-10 2023-12-22 税友软件集团股份有限公司 Tax text conversion method, device and equipment based on large model and storage medium

Also Published As

Publication number Publication date
CN117764054A (en) 2024-03-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: No.18, Jiangwan 1st Road, Chancheng District, Foshan City, Guangdong Province 528011
Patentee after: Foshan University
Country or region after: China
Address before: No.18, Jiangwan 1st Road, Chancheng District, Foshan City, Guangdong Province 528011
Patentee before: FOSHAN University
Country or region before: China