CN113987209A - Natural language processing method and device based on knowledge-guided prefix fine tuning, computing equipment and storage medium - Google Patents

Natural language processing method and device based on knowledge-guided prefix fine tuning, computing equipment and storage medium

Info

Publication number
CN113987209A
Authority
CN
China
Prior art keywords
prefix
training
language model
words
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111300021.1A
Other languages
Chinese (zh)
Other versions
CN113987209B (en)
Inventor
陈华钧
陈想
张宁豫
李磊
谢辛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202111300021.1A priority Critical patent/CN113987209B/en
Publication of CN113987209A publication Critical patent/CN113987209A/en
Application granted granted Critical
Publication of CN113987209B publication Critical patent/CN113987209B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 - Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 - Ontology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/237 - Lexical tools
    • G06F40/242 - Dictionaries
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/088 - Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a natural language processing method and apparatus, a computing device, and a storage medium based on knowledge-guided prefix fine-tuning. The method first constructs prefix prompt words related to a downstream task and label words, obtained from a knowledge graph, related to the task categories; the embedded vectors of the prefix prompt words are then concatenated with the keys and values of the input text before the self-attention computation, so that the prefix prompt words and the input text are learned in close combination, and at the same time all label words are integrated to determine the learning label. In other words, ontology knowledge related to the task categories is used to guide the fine-tuning of the pre-trained language model, so that the fine-tuned pre-trained language model predicts the downstream task better and its prediction accuracy is improved. The downstream tasks include an emotion analysis task and a relation extraction task, and the pre-trained language model obtained with the corresponding method improves the accuracy of emotion analysis and relation extraction.

Description

Natural language processing method and device based on knowledge-guided prefix fine tuning, computing equipment and storage medium
Technical Field
The invention belongs to the field of natural language processing, and particularly relates to a natural language processing method and apparatus, a computing device, and a storage medium based on knowledge-guided prefix fine-tuning.
Background
A pre-trained model is a model obtained by training on a large reference dataset; large pre-trained language models such as BERT, GPT, and XLNet are obtained by pre-training on large amounts of corpora. Because the pre-trained model has undergone unsupervised learning on a large corpus, the knowledge in the corpus has already been transferred into the embeddings of the pre-trained model.
Fine-tuning is the main method for transferring pre-trained model (PTM) knowledge to downstream tasks, and the currently common fine-tuning methods all need to add a network structure for a specific task in order to adapt to that task. However, such fine-tuning methods have the following drawbacks: (1) low parameter efficiency: each downstream task has its own set of fine-tuned parameters; (2) the pre-training objective differs from the fine-tuning objective, so the generalization ability of the pre-trained model is poor; (3) unlike the network parameters already learned in the pre-training stage, the newly added parameters require a large amount of data to learn. These shortcomings lead to poor task performance on emotion analysis tasks, relation extraction tasks, and various classification tasks.
The prior patent document CN112100383A discloses a meta-knowledge fine-tuning method and platform for multi-task language models. Based on cross-domain typicality-score learning, the method obtains highly transferable common knowledge, namely meta-knowledge, across different datasets of similar tasks, makes the learning processes of similar tasks in different domains (corresponding to different datasets) associate with and reinforce one another, improves the fine-tuning effect of similar downstream tasks on datasets from different domains in language model applications, and improves the parameter initialization ability and generalization ability of a general-purpose language model for similar tasks. However, the method does not consider ontology knowledge and has a poor fine-tuning effect on downstream tasks.
Patent document CN113032559A discloses a language model fine-tuning method for text classification in low-resource agglutinative languages. It constructs a low-noise fine-tuning dataset through morphological analysis and stem extraction, fine-tunes a cross-lingual pre-trained model on that dataset, provides a meaningful and easy-to-use feature extractor for downstream text classification tasks, better selects relevant semantic and syntactic information from the pre-trained language model, and uses these features for the downstream text classification tasks. However, the method does not consider ontology knowledge and has a poor fine-tuning effect on downstream tasks.
Disclosure of Invention
In view of the foregoing, an object of the present invention is to provide a natural language processing method, apparatus, computing device, and storage medium based on knowledge-guided prefix fine-tuning, in which a pre-trained language model is fine-tuned under the guidance of prefix prompt words and ontology knowledge related to a downstream task, so as to improve the accuracy of the pre-trained language model's predictions on the downstream task.
In a first aspect, an embodiment provides a natural language processing method based on knowledge-guided prefix fine tuning, including the following steps:
constructing initial prefix prompt words according to a downstream task, and mapping the initial prefix prompt words through a function into as many embedded vectors as a pre-trained language model has layers, wherein the dimension of each embedded vector is twice that of the corresponding model layer;
linking each task category of the downstream task to a knowledge graph, and taking the words related to each task category in the knowledge graph as label words;
converting the downstream task into a masked-token prediction task for the pre-trained language model according to the prefix prompt words and the label words, and performing fine-tuning training on the pre-trained language model, which comprises: inputting a training text into the pre-trained language model; at each layer, splitting the embedded vector of the prefix prompt words into two parts with the same dimension as the corresponding model layer, concatenating the two parts with the keys and values of the training text respectively, and letting them participate in the self-attention computation; and, taking the weighted combination of all label words corresponding to each task category as the label, jointly optimizing the embedded vectors of the prefix prompt words, the parameters of the pre-trained language model, and the weights of the label words;
in application, inputting the text to be predicted and the embedded vectors of the prefix prompt words into the fine-tuned pre-trained language model, and computing the weighted combination of the predicted values of all label words with their corresponding weights as the prediction result.
Preferably, mapping the initial prefix prompt words through a function into as many embedded vectors as the pre-trained language model has layers comprises:
encoding the initial prefix prompt words into initial embedded vectors, and then applying a single function mapping to the initial embedded vectors to obtain as many embedded vectors as the pre-trained language model has layers.
Preferably, mapping the initial prefix prompt words through a function into as many embedded vectors as the pre-trained language model has layers comprises:
encoding the initial prefix prompt words into initial embedded vectors, and mapping the initial embedded vectors to each layer of the pre-trained language model by a multi-layer MLP to obtain the embedded vector corresponding to each layer.
Preferably, when the pre-trained language model undergoes fine-tuning training, the computation participating in self-attention is:
head_l = softmax( Q_l [P_K^l ; K_l]^T / √d ) [P_V^l ; V_l]
wherein l denotes the layer index, Q_l denotes the query, K_l denotes the key, V_l denotes the value, P_K^l denotes the part of the prefix prompt word embedded vector that is split out to be concatenated with the key, P_V^l denotes the part that is split out to be concatenated with the value, softmax(·) denotes the softmax function, [ ; ] denotes the concatenation operation, and d denotes the hidden dimension of the layer.
Preferably, the pre-trained language model comprises a BERT, RoBERTa, or GPT series model.
In one embodiment, the downstream task is an emotion analysis task, the corresponding initial prefix prompt words are "emotion analysis", and each task category of the emotion analysis task is linked to a financial-domain knowledge graph to retrieve words related to each task category as label words; the downstream task is then converted, according to the "emotion analysis" prefix and the label words, into a masked-token prediction task for the pre-trained language model, and the pre-trained language model is fine-tuned; finally, in application, the text to be predicted and the embedded vectors of the "emotion analysis" prefix are input into the fine-tuned pre-trained language model, and the weighted combination of the predicted values of all label words with their corresponding weights is computed and used as the emotion analysis prediction result.
In another embodiment, the downstream task is a relation extraction task, the corresponding initial prefix prompt words are "relation extraction", and each task category of the relation extraction task is linked to a medical-domain knowledge graph to retrieve words related to each task category as label words; the downstream task is then converted, according to the "relation extraction" prefix and the label words, into a masked-token prediction task for the pre-trained language model, and the pre-trained language model is fine-tuned; finally, in application, the text to be predicted and the embedded vectors of the "relation extraction" prefix are input into the fine-tuned pre-trained language model, and the weighted combination of the predicted values of all label words with their corresponding weights is computed and used as the relation extraction result.
In a second aspect, an embodiment provides a natural language processing apparatus based on knowledge-guided prefix fine tuning, including:
the prefix prompt word processing module, used for constructing initial prefix prompt words according to a downstream task and mapping them through a function into as many embedded vectors as the pre-trained language model has layers, wherein the dimension of each embedded vector is twice that of the corresponding model layer;
the label word processing module, used for linking each task category of the downstream task to a knowledge graph and taking the words related to each task category in the knowledge graph as label words;
the fine-tuning module, used for converting the downstream task into a masked-token prediction task for the pre-trained language model according to the prefix prompt words and the label words and performing fine-tuning training on the pre-trained language model, which comprises: inputting a training text into the pre-trained language model; at each layer, splitting the embedded vector of the prefix prompt words into two parts with the same dimension as the corresponding model layer, concatenating the two parts with the keys and values of the training text respectively, and letting them participate in the self-attention computation; and, taking the weighted combination of all label words corresponding to each task category as the label, jointly optimizing the embedded vectors of the prefix prompt words, the parameters of the pre-trained language model, and the weights of the label words;
and the application module, used for inputting the text to be predicted and the embedded vectors of the prefix prompt words into the fine-tuned pre-trained language model, and computing the weighted combination of the predicted values of all label words with their corresponding weights as the prediction result.
In a third aspect, an embodiment provides a computing device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the natural language processing method based on knowledge-guided prefix fine tuning described in the first aspect when executing the computer program.
In a fourth aspect, an embodiment provides a computer storage medium on which a computer program is stored, and when the computer program is executed, the natural language processing method based on knowledge-guided prefix fine-tuning of the first aspect is implemented.
Compared with the prior art, the invention has at least the following beneficial effects:
According to the technical solution provided by the embodiments, prefix prompt words related to the downstream task and label words, obtained from a knowledge graph, related to the task categories are first constructed; the embedded vectors of the prefix prompt words are then concatenated with the keys and values of the input text before the self-attention computation, so that the prefix prompt words and the input text are learned in close combination, and at the same time all label words are integrated to determine the learning label. In other words, ontology knowledge related to the task categories is used to guide the fine-tuning of the pre-trained language model, so that the fine-tuned pre-trained language model predicts the downstream task better and its prediction accuracy is improved. The downstream tasks include an emotion analysis task and a relation extraction task, and the pre-trained language model obtained with the corresponding method improves the accuracy of emotion analysis and relation extraction.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of a natural language processing method based on knowledge-guided prefix fine-tuning provided by an embodiment;
FIG. 2 is a schematic structural diagram of a natural language processing apparatus based on knowledge-guided prefix fine-tuning provided by an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Aiming at the problem that fine-tuned pre-trained language models give inaccurate results on the emotion analysis task and the relation extraction task, this embodiment provides a way of fine-tuning a pre-trained language model under the guidance of ontology knowledge and prefix prompt words related to the emotion analysis task and the relation extraction task; the pre-trained language model obtained through this fine-tuning approach can improve the task prediction results.
FIG. 1 is a flowchart of a natural language processing method based on knowledge-guided prefix fine-tuning provided by an embodiment. As shown in FIG. 1, the natural language processing method based on knowledge-guided prefix fine-tuning provided by the embodiment includes the following steps:
step 1, constructing an initial prefix cue word related to a downstream task, and mapping to obtain an embedded vector.
In an embodiment, the prefix hint words are phrases that are closely related to downstream tasks, and the phrases are text that is composed of at least one word. When the downstream task is an emotion analysis task for each certain text statement (e.g., today stock is all green, and is bad), the prefix hints are emotion analysis. When the downstream task is a relationship extraction task for each text sentence (e.g., external illumination may be effective to improve pain symptoms in patients with chronic pancreatitis), the prefix cue is relationship extraction. After the prefix cue words are initialized, mapping is carried out on the prefix cue words to obtain embedded vectors with the number being the same as the number of layers of the pre-training language model, and in order to realize that the embedded vectors are respectively combined with the key value and the value of each layer of the pre-training language model, the dimension of each embedded vector is required to be 2 times of the dimension of the corresponding model layer.
In an embodiment, the initial prefix prompt words may be encoded into initial embedded vectors, which are then mapped once by a function mapping to obtain as many embedded vectors as the pre-trained language model has layers. For example, if the pre-trained language model has 10 layers and each layer has size 5 × 768, a single mapping function can map directly to an embedded vector of size 5 × 768 × 10 × 2, where 10 indicates that 10 embedded vectors of size 5 × 768 × 2 are obtained and 2 indicates that the dimension is twice the per-layer size of 5 × 768.
In an embodiment, the initial prefix prompt words may instead be encoded into initial embedded vectors that are mapped to each layer of the pre-trained language model by a multi-layer MLP, so as to obtain the embedded vector corresponding to each layer. That is, the embedded vector of each layer is obtained after multiple mappings, and the dimension of the embedded vector of each layer is likewise ensured to be twice the size of that layer.
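The embodiments do not prescribe a concrete implementation of this mapping; the following is a minimal PyTorch sketch of the two variants described above (a single function mapping and a multi-layer MLP), assuming a hypothetical prefix length of 5, hidden size of 768 and 10 model layers. The module and parameter names are illustrative only and are not taken from the patent.

```python
import torch
import torch.nn as nn

class PrefixEncoder(nn.Module):
    """Minimal sketch: map initial prefix-prompt embeddings to one
    key/value-sized vector per model layer (dimension 2 x hidden)."""

    def __init__(self, prefix_len=5, hidden=768, n_layers=10, use_mlp=False):
        super().__init__()
        self.embedding = nn.Embedding(prefix_len, hidden)  # initial embedded vectors
        if use_mlp:
            # variant 2: a small MLP re-maps the initial vectors for every layer
            self.proj = nn.Sequential(
                nn.Linear(hidden, hidden),
                nn.Tanh(),
                nn.Linear(hidden, n_layers * 2 * hidden),
            )
        else:
            # variant 1: a single function (linear map) produces all layers at once
            self.proj = nn.Linear(hidden, n_layers * 2 * hidden)
        self.n_layers, self.hidden = n_layers, hidden

    def forward(self, prefix_ids):                      # (prefix_len,)
        x = self.embedding(prefix_ids)                  # (prefix_len, hidden)
        x = self.proj(x)                                 # (prefix_len, n_layers*2*hidden)
        # one (prefix_len, 2*hidden) vector per layer, later split into key/value parts
        return x.view(-1, self.n_layers, 2 * self.hidden).unbind(dim=1)

# usage (illustrative): per_layer_prefix = PrefixEncoder()(torch.arange(5))
```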
Step 2, link each task category of the downstream task to a knowledge graph, and take the words related to each task category in the knowledge graph as label words.
In an embodiment, the task categories are determined by the downstream task. For the emotion analysis task, the task categories include positive emotion, negative emotion, and so on. Positive emotion and negative emotion can be linked to financial-domain knowledge resources such as the HowNet sentiment dictionary to obtain words related to each category as label words; for example, words related to positive emotion such as "favorable", "excellent", and "good" can be obtained to form a label word set used to construct the supervised learning labels of the task. For the relation extraction task, the task categories include radiotherapy and so on. Radiotherapy can be linked to medical-domain knowledge graphs such as DiseasKG and Yidu-N7K to obtain text related to radiotherapy, for example "external irradiation can effectively relieve the pain symptoms of patients with chronic pancreatitis", from which the label words "external irradiation", "chronic pancreatitis", and "pain symptoms" are extracted.
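Purely as an illustration of the data this step produces, the sketch below shows what the label word sets and their initial weights might look like once retrieved from the knowledge graphs; the word lists (in particular the negative-emotion words) and the uniform weight initialization are assumptions, not values given by the patent.

```python
import torch

# Hypothetical label-word sets retrieved from knowledge graphs (illustrative only):
# the emotion categories from a sentiment dictionary, the relation category from a
# medical knowledge graph, as described in the embodiment above.
label_words = {
    "positive": ["favorable", "excellent", "good"],
    "negative": ["poor", "loss", "decline"],           # assumed example words
    "radiotherapy": ["external irradiation", "chronic pancreatitis", "pain symptoms"],
}

# One learnable weight per label word, here initialized uniformly per category.
label_weights = {
    cat: torch.nn.Parameter(torch.full((len(words),), 1.0 / len(words)))
    for cat, words in label_words.items()
}
```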
Step 3, convert the downstream task into a masked-token prediction task for the pre-trained language model according to the prefix prompt words and the label words, and perform fine-tuning training on the pre-trained language model.
In an embodiment, the pre-trained language model may be a BERT, RoBERTa, or GPT series model. These models map the input text to query, key, and value vectors, and all contain a self-attention mechanism for the self-attention computation.
In an embodiment, the fine-tuning training process comprises: inputting a training text into the pre-trained language model; at each layer, splitting the embedded vector of the prefix prompt words into two parts with the same dimension as the corresponding model layer, concatenating the two parts with the keys and values of the training text respectively, and letting them participate in the self-attention computation; and, taking the weighted combination of all label words corresponding to each task category as the label, jointly optimizing the embedded vectors of the prefix prompt words, the parameters of the pre-trained language model, and the weights of the label words.
In the l-th layer of the pre-trained language model, the representation X_l of the input text sequence is first mapped to the query/key/value vectors:
Q_l = X_l W_Q,  K_l = X_l W_K,  V_l = X_l W_V
wherein W_Q, W_K, and W_V are model parameters. The computation participating in self-attention is then:
head_l = softmax( Q_l [P_K^l ; K_l]^T / √d ) [P_V^l ; V_l]
wherein Q_l denotes the query, K_l denotes the key, V_l denotes the value, P_K^l denotes the part of the prefix prompt word embedded vector that is split out to be concatenated with the key, P_V^l denotes the part that is split out to be concatenated with the value, softmax(·) denotes the softmax function, [ ; ] denotes the concatenation operation, and d denotes the hidden dimension of the layer.
In an embodiment, a weight is initialized for each label word, and the label words are then combined according to these weights to obtain the training label. For example, when the weights of "external irradiation", "chronic pancreatitis", and "pain symptoms" are initialized to 0.2, 0.5, and 0.3 respectively, and the pre-trained language model performs the masked-token prediction task, that is, predicts the vocabulary at the masked position [MASK] of the input text sequence, the score of the category "radiotherapy" is the sum of the predicted values of these three label words weighted by 0.2, 0.5, and 0.3. Then, based on the learnable embedded vectors of the prefix prompt words and the learnable weight vector, the parameters of the pre-trained language model are fine-tuned on the sample data, and better performance of the pre-trained language model can be obtained.
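The following sketch illustrates, using the hypothetical label word sets introduced earlier, how the weighted label word score of a category could be computed from the masked-LM logits at the [MASK] position. It assumes a HuggingFace-style tokenizer, and the simplification that each label word maps to a single vocabulary token is an assumption of the sketch, not of the patent.

```python
import torch

def category_scores(mask_logits, tokenizer, label_words, label_weights):
    """Sketch: score each task category as the weighted sum of the predicted
    probabilities of its label words at the [MASK] position.

    mask_logits: (vocab_size,) masked-LM logits at the [MASK] token
    label_words / label_weights: as in the hypothetical sets above
    """
    probs = torch.softmax(mask_logits, dim=-1)
    scores = {}
    for cat, words in label_words.items():
        # simplification: each label word is assumed to map to one vocabulary id
        ids = [tokenizer.convert_tokens_to_ids(w) for w in words]
        scores[cat] = (label_weights[cat] * probs[ids]).sum()
    return scores
```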
Step 4, in application, input the text to be predicted and the embedded vectors of the prefix prompt words into the fine-tuned pre-trained language model, and compute the weighted combination of the predicted values of all label words with their corresponding weights as the prediction result.
In an embodiment, for the emotion analysis task, the text to be predicted and the embedded vectors of the prefix prompt words are input into the fine-tuned pre-trained language model, and the weighted combination of the predicted values of all label words with their corresponding weights is computed and used as the prediction result for that text.
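Reusing the hypothetical helpers sketched above, prediction at application time could then reduce to taking the category with the highest weighted score:

```python
# Illustrative inference step: mask_logits are the masked-LM logits produced by the
# fine-tuned model for the [MASK] position of the text to be predicted (not shown).
scores = category_scores(mask_logits, tokenizer, label_words, label_weights)
prediction = max(scores, key=lambda cat: scores[cat].item())
```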
The natural language processing method based on knowledge-guided prefix fine-tuning provided by this embodiment generates the multi-layer embedded vectors of the knowledge prefix prompt words and the label word set based on the downstream task description and an external knowledge base, and converts the downstream task into a masked-token prediction task.
In the natural language processing method based on knowledge-guided prefix fine-tuning provided by the above embodiment, the pre-trained language model is a neural network model that learns semantic information from a large-scale unlabeled corpus in an unsupervised manner; it is a complex learning model composed of multiple layers of neural networks and can capture the semantic information in text more accurately, thereby improving the accuracy of the model on downstream tasks.
In the natural language processing method based on knowledge-guided prefix fine-tuning provided by the embodiment, the knowledge-guided prefix fine-tuning technique can remarkably improve the accuracy and efficiency of downstream tasks and meet the requirements of different applications; the method is not limited to classification tasks in natural language processing and is also applicable to text generation tasks.
As shown in FIG. 2, the embodiment further provides a natural language processing apparatus 200 based on knowledge-guided prefix fine-tuning, comprising:
the prefix prompt word processing module 201, used for constructing initial prefix prompt words according to a downstream task and mapping them through a function into as many embedded vectors as the pre-trained language model has layers, wherein the dimension of each embedded vector is twice that of the corresponding model layer;
the label word processing module 202, used for linking each task category of the downstream task to a knowledge graph and taking the words related to each task category in the knowledge graph as label words;
the fine-tuning module 203, used for converting the downstream task into a masked-token prediction task for the pre-trained language model according to the prefix prompt words and the label words and performing fine-tuning training on the pre-trained language model, which comprises: inputting a training text into the pre-trained language model; at each layer, splitting the embedded vector of the prefix prompt words into two parts with the same dimension as the corresponding model layer, concatenating the two parts with the keys and values of the training text respectively, and letting them participate in the self-attention computation; and, taking the weighted combination of all label words corresponding to each task category as the label, jointly optimizing the embedded vectors of the prefix prompt words, the parameters of the pre-trained language model, and the weights of the label words;
and the application module 204, used for inputting the text to be predicted and the embedded vectors of the prefix prompt words into the fine-tuned pre-trained language model, and computing the weighted combination of the predicted values of all label words with their corresponding weights as the prediction result.
It should be noted that the division into the above functional modules is only given as an example for the natural language processing apparatus provided by the embodiment; in practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the terminal or server may be divided into different functional modules to complete all or part of the functions described above. In addition, the natural language processing apparatus provided by the embodiment belongs to the same concept as the natural language processing method embodiments; its specific implementation process is detailed in the method embodiments and is not repeated here.
Embodiments also provide a computing device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing a knowledge-guided prefix-fine-tuning-based natural language processing method when executing the computer program.
Embodiments also provide a computer storage medium having stored thereon a computer program that, when executed by a processor, implements the natural language processing method based on knowledge-guided prefix fine-tuning.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others.
The above-mentioned embodiments are intended to illustrate the technical solutions and advantages of the present invention, and it should be understood that the above-mentioned embodiments are only the most preferred embodiments of the present invention, and are not intended to limit the present invention, and any modifications, additions, equivalents, etc. made within the scope of the principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A natural language processing method based on knowledge-guided prefix fine tuning is characterized by comprising the following steps:
constructing initial prefix prompt words according to a downstream task, and mapping the initial prefix prompt words through a function into as many embedded vectors as a pre-trained language model has layers, wherein the dimension of each embedded vector is twice that of the corresponding model layer;
linking each task category of the downstream task to a knowledge graph, and taking the words related to each task category in the knowledge graph as label words;
converting the downstream task into a masked-token prediction task for the pre-trained language model according to the prefix prompt words and the label words, and performing fine-tuning training on the pre-trained language model, which comprises: inputting a training text into the pre-trained language model; at each layer, splitting the embedded vector of the prefix prompt words into two parts with the same dimension as the corresponding model layer, concatenating the two parts with the keys and values of the training text respectively, and letting them participate in the self-attention computation; and, taking the weighted combination of all label words corresponding to each task category as the label, jointly optimizing the embedded vectors of the prefix prompt words, the parameters of the pre-trained language model, and the weights of the label words;
in application, inputting the text to be predicted and the embedded vectors of the prefix prompt words into the fine-tuned pre-trained language model, and computing the weighted combination of the predicted values of all label words with their corresponding weights as the prediction result.
2. The natural language processing method based on knowledge-guided prefix fine-tuning according to claim 1, wherein mapping the initial prefix prompt words through a function into as many embedded vectors as the pre-trained language model has layers comprises:
encoding the initial prefix prompt words into initial embedded vectors, and then applying a single function mapping to the initial embedded vectors to obtain as many embedded vectors as the pre-trained language model has layers.
3. The natural language processing method based on knowledge-guided prefix fine-tuning according to claim 1, wherein mapping the initial prefix prompt words through a function into as many embedded vectors as the pre-trained language model has layers comprises:
encoding the initial prefix prompt words into initial embedded vectors, and mapping the initial embedded vectors to each layer of the pre-trained language model by a multi-layer MLP to obtain the embedded vector corresponding to each layer.
4. The natural language processing method based on knowledge-guided prefix fine-tuning according to claim 1, wherein, when the pre-trained language model undergoes fine-tuning training, the computation participating in self-attention is:
head_l = softmax( Q_l [P_K^l ; K_l]^T / √d ) [P_V^l ; V_l]
wherein l denotes the layer index, Q_l denotes the query, K_l denotes the key, V_l denotes the value, P_K^l denotes the part of the prefix prompt word embedded vector that is split out to be concatenated with the key, P_V^l denotes the part that is split out to be concatenated with the value, softmax(·) denotes the softmax function, [ ; ] denotes the concatenation operation, and d denotes the hidden dimension of the layer.
5. The natural language processing method based on knowledge-guided prefix fine-tuning according to claim 1, wherein the pre-trained language model comprises a BERT, RoBERTa, or GPT series model.
6. The natural language processing method based on knowledge-guided prefix fine-tuning according to claim 1, wherein the downstream task is an emotion analysis task, the corresponding initial prefix prompt words are "emotion analysis", and each task category of the emotion analysis task is linked to a financial-domain knowledge graph to retrieve words related to each task category as label words; the downstream task is then converted, according to the "emotion analysis" prefix and the label words, into a masked-token prediction task for the pre-trained language model, and the pre-trained language model is fine-tuned; finally, in application, the text to be predicted and the embedded vectors of the "emotion analysis" prefix are input into the fine-tuned pre-trained language model, and the weighted combination of the predicted values of all label words with their corresponding weights is computed and used as the emotion analysis prediction result.
7. The natural language processing method based on knowledge-guided prefix fine-tuning according to claim 1, wherein the downstream task is a relation extraction task, the corresponding initial prefix prompt words are "relation extraction", and each task category of the relation extraction task is linked to a medical-domain knowledge graph to retrieve words related to each task category as label words; the downstream task is then converted, according to the "relation extraction" prefix and the label words, into a masked-token prediction task for the pre-trained language model, and the pre-trained language model is fine-tuned; finally, in application, the text to be predicted and the embedded vectors of the "relation extraction" prefix are input into the fine-tuned pre-trained language model, and the weighted combination of the predicted values of all label words with their corresponding weights is computed and used as the relation extraction result.
8. A natural language processing apparatus based on knowledge-guided prefix fine-tuning, comprising:
a prefix prompt word processing module, used for constructing initial prefix prompt words according to a downstream task and mapping them through a function into as many embedded vectors as the pre-trained language model has layers, wherein the dimension of each embedded vector is twice that of the corresponding model layer;
a label word processing module, used for linking each task category of the downstream task to a knowledge graph and taking the words related to each task category in the knowledge graph as label words;
a fine-tuning module, used for converting the downstream task into a masked-token prediction task for the pre-trained language model according to the prefix prompt words and the label words and performing fine-tuning training on the pre-trained language model, which comprises: inputting a training text into the pre-trained language model; at each layer, splitting the embedded vector of the prefix prompt words into two parts with the same dimension as the corresponding model layer, concatenating the two parts with the keys and values of the training text respectively, and letting them participate in the self-attention computation; and, taking the weighted combination of all label words corresponding to each task category as the label, jointly optimizing the embedded vectors of the prefix prompt words, the parameters of the pre-trained language model, and the weights of the label words;
and an application module, used for inputting the text to be predicted and the embedded vectors of the prefix prompt words into the fine-tuned pre-trained language model, and computing the weighted combination of the predicted values of all label words with their corresponding weights as the prediction result.
9. A computing device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the natural language processing method based on knowledge-guided prefix fine-tuning of any one of claims 1-7 when executing the computer program.
10. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed, implements the natural language processing method based on knowledge-guided prefix fine-tuning of any one of claims 1-7.
CN202111300021.1A 2021-11-04 2021-11-04 Natural language processing method, device, computing equipment and storage medium based on knowledge-guided prefix fine adjustment Active CN113987209B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111300021.1A CN113987209B (en) 2021-11-04 2021-11-04 Natural language processing method, device, computing equipment and storage medium based on knowledge-guided prefix fine adjustment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111300021.1A CN113987209B (en) 2021-11-04 2021-11-04 Natural language processing method, device, computing equipment and storage medium based on knowledge-guided prefix fine adjustment

Publications (2)

Publication Number Publication Date
CN113987209A true CN113987209A (en) 2022-01-28
CN113987209B CN113987209B (en) 2024-05-24

Family

ID=79746414

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111300021.1A Active CN113987209B (en) 2021-11-04 2021-11-04 Natural language processing method, device, computing equipment and storage medium based on knowledge-guided prefix fine adjustment

Country Status (1)

Country Link
CN (1) CN113987209B (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612290A (en) * 2022-03-11 2022-06-10 北京百度网讯科技有限公司 Training method of image editing model and image editing method
CN114792097A (en) * 2022-05-14 2022-07-26 北京百度网讯科技有限公司 Method and device for determining prompt vector of pre-training model and electronic equipment
CN114862493A (en) * 2022-04-07 2022-08-05 北京中科深智科技有限公司 Generation model for generating personalized commodity description based on light-weight fine adjustment
CN114943211A (en) * 2022-07-25 2022-08-26 北京澜舟科技有限公司 Text generation method and system based on prefix and computer readable storage medium
CN115563283A (en) * 2022-10-20 2023-01-03 北京大学 Text classification method based on prompt learning
CN115640520A (en) * 2022-11-07 2023-01-24 北京百度网讯科技有限公司 Method, device and storage medium for pre-training cross-language cross-modal model
CN115906815A (en) * 2023-03-08 2023-04-04 北京语言大学 Error correction method and device for modifying one or more types of wrong sentences
CN116186200A (en) * 2023-01-19 2023-05-30 北京百度网讯科技有限公司 Model training method, device, electronic equipment and storage medium
CN116306917A (en) * 2023-05-17 2023-06-23 卡奥斯工业智能研究院(青岛)有限公司 Task processing method, device, equipment and computer storage medium
CN116737938A (en) * 2023-07-19 2023-09-12 人民网股份有限公司 Fine granularity emotion detection method and device based on fine tuning large model online data network
CN116861928A (en) * 2023-07-07 2023-10-10 北京中关村科金技术有限公司 Method, device, equipment and medium for generating instruction fine tuning data
CN116956835A (en) * 2023-09-15 2023-10-27 京华信息科技股份有限公司 Document generation method based on pre-training language model
CN117194637A (en) * 2023-09-18 2023-12-08 深圳市大数据研究院 Multi-level visual evaluation report generation method and device based on large language model
CN117216227A (en) * 2023-10-30 2023-12-12 广东烟草潮州市有限责任公司 Tobacco enterprise intelligent information question-answering method based on knowledge graph and large language model
CN117332419A (en) * 2023-11-29 2024-01-02 武汉大学 Malicious code classification method and device based on pre-training
CN117474084A (en) * 2023-12-25 2024-01-30 淘宝(中国)软件有限公司 Bidirectional iteration method, equipment and medium for pre-training model and downstream sequence task
WO2024031891A1 (en) * 2022-08-10 2024-02-15 浙江大学 Fine tuning method and apparatus for knowledge representation-disentangled classification model, and application
CN117875273A (en) * 2024-03-13 2024-04-12 中南大学 News abstract automatic generation method, device and medium based on large language model

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111444721A (en) * 2020-05-27 2020-07-24 南京大学 Chinese text key information extraction method based on pre-training language model
CN112100383A (en) * 2020-11-02 2020-12-18 之江实验室 Meta-knowledge fine tuning method and platform for multitask language model
US20210035556A1 (en) * 2019-08-02 2021-02-04 Babylon Partners Limited Fine-tuning language models for supervised learning tasks via dataset preprocessing
CN112699218A (en) * 2020-12-30 2021-04-23 成都数之联科技有限公司 Model establishing method and system, paragraph label obtaining method and medium
CN113033182A (en) * 2021-03-25 2021-06-25 网易(杭州)网络有限公司 Text creation auxiliary method and device and server
CN113468877A (en) * 2021-07-09 2021-10-01 浙江大学 Language model fine-tuning method and device, computing equipment and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210035556A1 (en) * 2019-08-02 2021-02-04 Babylon Partners Limited Fine-tuning language models for supervised learning tasks via dataset preprocessing
CN111444721A (en) * 2020-05-27 2020-07-24 南京大学 Chinese text key information extraction method based on pre-training language model
CN112100383A (en) * 2020-11-02 2020-12-18 之江实验室 Meta-knowledge fine tuning method and platform for multitask language model
CN112699218A (en) * 2020-12-30 2021-04-23 成都数之联科技有限公司 Model establishing method and system, paragraph label obtaining method and medium
CN113033182A (en) * 2021-03-25 2021-06-25 网易(杭州)网络有限公司 Text creation auxiliary method and device and server
CN113468877A (en) * 2021-07-09 2021-10-01 浙江大学 Language model fine-tuning method and device, computing equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
韩程程; 李磊; 刘婷婷; 高明: "Semantic Text Similarity Computation Methods", Journal of East China Normal University (Natural Science Edition), no. 05, 25 September 2020 (2020-09-25) *

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612290B (en) * 2022-03-11 2023-07-21 北京百度网讯科技有限公司 Training method of image editing model and image editing method
CN114612290A (en) * 2022-03-11 2022-06-10 北京百度网讯科技有限公司 Training method of image editing model and image editing method
CN114862493A (en) * 2022-04-07 2022-08-05 北京中科深智科技有限公司 Generation model for generating personalized commodity description based on light-weight fine adjustment
CN114792097A (en) * 2022-05-14 2022-07-26 北京百度网讯科技有限公司 Method and device for determining prompt vector of pre-training model and electronic equipment
CN114943211A (en) * 2022-07-25 2022-08-26 北京澜舟科技有限公司 Text generation method and system based on prefix and computer readable storage medium
WO2024031891A1 (en) * 2022-08-10 2024-02-15 浙江大学 Fine tuning method and apparatus for knowledge representation-disentangled classification model, and application
CN115563283A (en) * 2022-10-20 2023-01-03 北京大学 Text classification method based on prompt learning
CN115563283B (en) * 2022-10-20 2023-04-25 北京大学 Text classification method based on prompt learning
CN115640520A (en) * 2022-11-07 2023-01-24 北京百度网讯科技有限公司 Method, device and storage medium for pre-training cross-language cross-modal model
CN116186200B (en) * 2023-01-19 2024-02-09 北京百度网讯科技有限公司 Model training method, device, electronic equipment and storage medium
CN116186200A (en) * 2023-01-19 2023-05-30 北京百度网讯科技有限公司 Model training method, device, electronic equipment and storage medium
CN115906815A (en) * 2023-03-08 2023-04-04 北京语言大学 Error correction method and device for modifying one or more types of wrong sentences
CN116306917B (en) * 2023-05-17 2023-09-08 卡奥斯工业智能研究院(青岛)有限公司 Task processing method, device, equipment and computer storage medium
CN116306917A (en) * 2023-05-17 2023-06-23 卡奥斯工业智能研究院(青岛)有限公司 Task processing method, device, equipment and computer storage medium
CN116861928A (en) * 2023-07-07 2023-10-10 北京中关村科金技术有限公司 Method, device, equipment and medium for generating instruction fine tuning data
CN116861928B (en) * 2023-07-07 2023-11-17 北京中关村科金技术有限公司 Method, device, equipment and medium for generating instruction fine tuning data
CN116737938A (en) * 2023-07-19 2023-09-12 人民网股份有限公司 Fine granularity emotion detection method and device based on fine tuning large model online data network
CN116956835A (en) * 2023-09-15 2023-10-27 京华信息科技股份有限公司 Document generation method based on pre-training language model
CN116956835B (en) * 2023-09-15 2024-01-02 京华信息科技股份有限公司 Document generation method based on pre-training language model
CN117194637A (en) * 2023-09-18 2023-12-08 深圳市大数据研究院 Multi-level visual evaluation report generation method and device based on large language model
CN117194637B (en) * 2023-09-18 2024-04-30 深圳市大数据研究院 Multi-level visual evaluation report generation method and device based on large language model
CN117216227A (en) * 2023-10-30 2023-12-12 广东烟草潮州市有限责任公司 Tobacco enterprise intelligent information question-answering method based on knowledge graph and large language model
CN117216227B (en) * 2023-10-30 2024-04-16 广东烟草潮州市有限责任公司 Tobacco enterprise intelligent information question-answering method based on knowledge graph and large language model
CN117332419A (en) * 2023-11-29 2024-01-02 武汉大学 Malicious code classification method and device based on pre-training
CN117332419B (en) * 2023-11-29 2024-02-20 武汉大学 Malicious code classification method and device based on pre-training
CN117474084A (en) * 2023-12-25 2024-01-30 淘宝(中国)软件有限公司 Bidirectional iteration method, equipment and medium for pre-training model and downstream sequence task
CN117474084B (en) * 2023-12-25 2024-05-03 淘宝(中国)软件有限公司 Bidirectional iteration method, equipment and medium for pre-training model and downstream sequence task
CN117875273A (en) * 2024-03-13 2024-04-12 中南大学 News abstract automatic generation method, device and medium based on large language model
CN117875273B (en) * 2024-03-13 2024-05-28 中南大学 News abstract automatic generation method, device and medium based on large language model

Also Published As

Publication number Publication date
CN113987209B (en) 2024-05-24

Similar Documents

Publication Publication Date Title
CN113987209B (en) Natural language processing method, device, computing equipment and storage medium based on knowledge-guided prefix fine adjustment
CN111738003B (en) Named entity recognition model training method, named entity recognition method and medium
CN109325229B (en) Method for calculating text similarity by utilizing semantic information
CN110197279B (en) Transformation model training method, device, equipment and storage medium
CN113468877A (en) Language model fine-tuning method and device, computing equipment and storage medium
CN112905795A (en) Text intention classification method, device and readable medium
CN111062217A (en) Language information processing method and device, storage medium and electronic equipment
CN114676255A (en) Text processing method, device, equipment, storage medium and computer program product
CN112052318A (en) Semantic recognition method and device, computer equipment and storage medium
CN117149984B (en) Customization training method and device based on large model thinking chain
CN116992007B (en) Limiting question-answering system based on question intention understanding
CN110717021A (en) Input text and related device for obtaining artificial intelligence interview
CN115858750A (en) Power grid technical standard intelligent question-answering method and system based on natural language processing
CN112488111B (en) Indication expression understanding method based on multi-level expression guide attention network
CN114239599A (en) Method, system, equipment and medium for realizing machine reading understanding
CN112905750A (en) Generation method and device of optimization model
Yang et al. Task independent fine tuning for word embeddings
CN111813907A (en) Question and sentence intention identification method in natural language question-answering technology
CN116757195A (en) Implicit emotion recognition method based on prompt learning
CN113408267B (en) Word alignment performance improving method based on pre-training model
CN111401069A (en) Intention recognition method and intention recognition device for conversation text and terminal
Alwaneen et al. Stacked dynamic memory-coattention network for answering why-questions in Arabic
CN114239555A (en) Training method of keyword extraction model and related device
Khandait et al. Automatic question generation through word vector synchronization using lamma
Chakkarwar et al. A Review on BERT and Its Implementation in Various NLP Tasks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant