WO2024000966A1 - Optimization method for natural language model - Google Patents
Optimization method for natural language model
- Publication number
- WO2024000966A1 (PCT/CN2022/128623)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- loss
- vector
- encoder
- entity
- discriminator
- Prior art date
Classifications
- G06F16/35—Clustering; Classification (information retrieval of unstructured textual data)
- G06F40/126—Character encoding (use of codes for handling textual entities)
- G06F40/151—Transformation (use of codes for handling textual entities)
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F40/279—Recognition of textual entities
- G06F40/30—Semantic analysis
- G06N3/04—Architecture, e.g. interconnection topology (neural networks)
- G06N3/08—Learning methods (neural networks)
Definitions
- This application relates to the technical field of natural language processing, for example, to an optimization method for a natural language model.
- Natural language processing tasks such as aspect-based sentiment analysis (ABSA) aim to predict the sentiment polarity for a specific aspect term, while the relation extraction task aims to extract (predict) the relationship between two given entities in a given sentence.
- Understanding the meaning of aspect words and entities themselves is therefore important for sentiment prediction and relation prediction.
- General methods, however, often ignore the modeling of aspect words and entities themselves, resulting in an insufficient understanding of their meaning.
- the natural language model includes a main model, an enhancement module and a discriminator.
- the main model includes a first encoder and a second encoder.
- the method includes: obtaining the input sentence through the first encoder, encoding the sentence, and outputting the latent vector of each word in the sentence; and inputting the latent vector into the second encoder, the enhancement module and the discriminator to obtain the target result loss, the enhancement loss and the discriminator loss respectively;
- the overall loss is calculated from the target result loss, the enhancement loss and the discriminator loss through a preset first algorithm;
- the natural language model is initially optimized through the overall loss.
- the natural language model is a sentiment analysis model
- the enhancement module is a sentence restoration module
- the first encoder is a task-independent encoder
- the second encoder is a sentiment analysis encoder
- the main model also includes a supplementary learning encoder
- the target result loss is an emotional polarity loss
- the enhancement loss is a supplementary learning loss
- the latent vector and the intermediate result are input into the discriminator to obtain the discriminator loss.
- the natural language model is a relationship extraction model
- the enhancement module is an entity recognition learning module
- the first encoder is a shared encoder
- the second encoder is a relationship extraction encoder
- the The target result loss is a relationship type loss
- the enhancement loss is an entity label loss
- the latent vector is input into the relationship extraction encoder, the entity recognition learning module and the discriminator to obtain the relationship type loss, the entity label loss and the discriminator loss respectively.
- the natural language model includes a main model, an enhancement module and a discriminator.
- the main model includes a first encoder and a second encoder.
- the device includes:
- the first module is configured to obtain the input sentence through the first encoder, encode the sentence, and output the latent vector of each word in the sentence;
- the second module is configured to input the latent vector into the second encoder, the enhancement module and the discriminator to obtain the target result loss, enhancement loss and discriminator loss respectively;
- the third module is configured to calculate the overall loss through a preset first algorithm for the target result loss, the enhancement loss, and the discriminator loss;
- the fourth module is configured to perform preliminary optimization on the natural language model through the overall loss.
- This application also provides an electronic device, including a processor and a memory.
- when the computer program in the memory is executed by the processor, the above-mentioned optimization method for a natural language model is implemented.
- This application also provides a computer-readable storage medium that stores a computer program.
- when the computer program is executed by a processor, the above-mentioned optimization method for a natural language model is implemented.
- Figure 1 is a schematic diagram of the steps of an optimization method for a natural language model provided by an embodiment of the present application
- Figure 2 is a schematic diagram of the steps of another optimization method for a natural language model provided by an embodiment of the present application
- Figure 3 is a schematic structural diagram of a sentiment analysis model used in an optimization method for a natural language model provided by an embodiment of the present application;
- Figure 4 is a schematic diagram of the logical steps of an optimization method for a natural language model provided by an embodiment of the present application
- Figure 5 is a schematic diagram of steps for calculating emotional polarity loss in an optimization method for a natural language model provided by an embodiment of the present application
- Figure 6 is a schematic diagram of the steps for calculating supplementary learning loss in an optimization method for a natural language model provided by an embodiment of the present application;
- Figure 7 is a schematic diagram of the steps of another optimization method for a natural language model provided by an embodiment of the present application.
- Figure 8 is a schematic structural diagram of a relationship extraction model used in an optimization method for a natural language model provided by an embodiment of the present application
- Figure 9 is a schematic diagram of the logical steps of another optimization method for a natural language model provided by an embodiment of the present application.
- Figure 10 is a schematic diagram of the steps for calculating relationship type loss in an optimization method for a natural language model provided by an embodiment of the present application;
- Figure 11 is a schematic diagram of the steps for calculating entity recognition label loss in an optimization method for a natural language model provided by an embodiment of the present application;
- Figure 12 is a schematic diagram of secondary optimization of an optimization method for a natural language model provided by an embodiment of the present application.
- Figure 13 is a schematic structural diagram of an optimization device for a natural language model provided by an embodiment of the present application.
- Figure 14 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
- the embodiment of the present application provides an optimization method for a natural language model.
- the natural language model includes a main model, an enhancement module and a discriminator.
- the main model includes a first encoder and a second encoder. The method includes:
- the input sentence is obtained through the first encoder, the sentence is encoded, and the latent vector of each word in the sentence is output; the latent vector is input into the second encoder, the enhancement module and the discriminator to obtain the target result loss, the enhancement loss and the discriminator loss respectively; the overall loss is calculated from the target result loss, the enhancement loss and the discriminator loss through the preset first algorithm; and the natural language model is initially optimized through the overall loss.
- the enhancement module in this embodiment is used to enhance the main model's understanding and modeling capabilities of input objects, thereby enhancing model performance.
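- The steps above can be illustrated with a minimal PyTorch-style sketch of one preliminary-optimization step. The module and parameter names, the tensor shapes, and the use of a weighted sum as the "preset first algorithm" are assumptions made for illustration, not details taken from this application.

```python
import torch

def preliminary_optimization_step(first_encoder, second_encoder, enhancement_module,
                                  discriminator, optimizer, sentence, lam=1.0):
    """One preliminary-optimization step: encode, collect the three losses,
    combine them into the overall loss, and back-propagate."""
    # The first encoder outputs one latent vector per word of the input sentence.
    latent = first_encoder(sentence)                     # shape: (seq_len, hidden)

    # Each branch returns its own loss term.
    target_result_loss = second_encoder(latent)          # e.g. sentiment polarity / relation type loss
    enhancement_loss, intermediate = enhancement_module(latent)
    discriminator_loss = discriminator(latent, intermediate)

    # "Preset first algorithm": assumed here to be a weighted sum with weight `lam`.
    overall_loss = target_result_loss + lam * (enhancement_loss + discriminator_loss)

    optimizer.zero_grad()
    overall_loss.backward()
    optimizer.step()
    return overall_loss
```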
- the natural language model is a sentiment analysis model
- the enhancement module is a sentence restoration module
- the first encoder is a task-independent encoder
- the second encoder is a sentiment analysis encoder
- the main model also includes a supplementary learning encoder.
- the target result loss is the sentiment polarity loss
- the enhancement loss is the supplementary learning loss.
- the sentiment analysis model includes a main model 100 (The sentiment Classifier), a sentence restoration module 101 (re-constructed sentence) and a discriminator 102 (Discriminator).
- the main model 100 includes a task-independent encoder 1000 (task-free encoder), a sentiment analysis encoder 1001 (ABSA encoder) and a supplementary learning encoder 1002 (CL encoder). The method includes the following steps: obtain the input sentence X through the task-free encoder 1000 and encode it, obtaining the latent vector of each word in the sentence; input the latent vector into the supplementary learning encoder 1002 to obtain the supplementary learning latent vector.
- the vector representation of the aspect word is calculated from the supplementary learning latent vector through a vector algorithm.
- A represents the aspect word.
- input the latent vector and the vector representation of the aspect word into the sentiment analysis encoder 1001 to obtain the sentiment polarity loss L_SA; input the supplementary learning latent vector into the sentence restoration module 101 to obtain the intermediate result and the supplementary learning loss L_CL; input the latent vector and the intermediate result into the discriminator 102 to obtain the discriminator loss L_D; calculate the sentiment polarity loss L_SA, the supplementary learning loss L_CL and the discriminator loss L_D according to the preset first algorithm to obtain the overall loss L; and perform preliminary optimization of the sentiment analysis model through the overall loss L.
- the task-independent encoder 1000 and the supplementary learning encoder 1002 realize the learning of aspect words in the text, thereby enhancing the main model 100's understanding and modeling of aspect words and improving the performance of the sentiment analysis model on sentiment analysis tasks.
- obtaining the sentiment polarity loss L_SA includes the following steps: input the latent vector and the vector representation of the aspect word into the sentiment analysis encoder 1001 to obtain the sentiment analysis vector representation of the words and the sentiment analysis vector representation of the aspect word; calculate the latent vector, the sentiment analysis vector representation of the words and the sentiment analysis vector representation of the aspect word according to the preset second algorithm to obtain the sentiment polarity loss L_SA.
- the sentiment polarity loss L_SA characterizes the sentiment polarity prediction ability of the sentiment analysis model.
- during the preliminary optimization of the sentiment analysis model through the overall loss L, the sentiment polarity loss L_SA optimizes the model's sentiment polarity prediction ability, ensuring the reliability of the optimization method.
- the preset second algorithm includes the following steps: the task-independent sentence representation is calculated from the latent vector according to the preset vector algorithm; the sentiment analysis sentence representation is calculated from the sentiment analysis vector representation of the words and the sentiment analysis vector representation of the aspect word according to the preset vector algorithm; the task-independent sentence representation and the sentiment analysis sentence representation are concatenated to obtain an intermediate vector; the intermediate vector is input into the preset first fully connected layer 1003, and the output of the first fully connected layer 1003 is input into the first SoftMax classifier 1004 to obtain the predicted sentiment polarity; the predicted sentiment polarity is compared with the preset standard to calculate the sentiment polarity loss L_SA.
- the latent vector, the sentiment analysis vector representation of the words and the sentiment analysis vector representation of the aspect word are processed with the preset vector algorithm to obtain the task-independent sentence representation and the sentiment analysis sentence representation as intermediate variables, which reflect the vector representations of different positions of the input sentence; after processing, the sentiment polarity predicted by the main model 100 for the input sentence X is obtained as a scalar, and expressing the predicted sentiment polarity as a scalar facilitates the subsequent calculation of the sentiment polarity loss L_SA.
- the preset vector algorithm is the MaxPooling algorithm.
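- A minimal sketch of the second algorithm described above, assuming PyTorch: MaxPooling over the word dimension yields the two sentence representations, which are concatenated and passed through the first fully connected layer 1003 and the first SoftMax classifier 1004. The comparison with the preset standard is assumed here to be cross-entropy against a gold polarity label; all shapes and sizes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentimentPolarityHead(nn.Module):
    """Sketch of: MaxPooling -> concatenation -> fully connected layer 1003 -> SoftMax 1004."""

    def __init__(self, hidden_size, num_polarities=3):
        super().__init__()
        self.fc = nn.Linear(2 * hidden_size, num_polarities)   # first fully connected layer

    def forward(self, latent, sa_word_vectors, sa_aspect_vectors, gold_polarity):
        # MaxPooling over the word dimension gives a single sentence vector.
        task_free_sent = latent.max(dim=0).values                             # task-independent sentence repr.
        sa_sent = torch.cat([sa_word_vectors, sa_aspect_vectors]).max(dim=0).values
        intermediate = torch.cat([task_free_sent, sa_sent])                   # concatenated intermediate vector
        logits = self.fc(intermediate)                                        # -> first SoftMax classifier
        # Cross-entropy compares the predicted polarity with the preset standard (gold label).
        loss_sa = F.cross_entropy(logits.unsqueeze(0), gold_polarity.unsqueeze(0))
        return logits.softmax(-1), loss_sa
```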
- the sentence restoration module 101 includes a supplementary learning decoder (The Specific Decoder) 1010.
- obtaining the supplementary learning loss includes the following steps: input the supplementary learning latent vector into the supplementary learning decoder 1010 to reconstruct the input sentence X and obtain the predicted word;
- the intermediate result is obtained by comparing the predicted word with the word x_t in the input sentence X, and the supplementary learning loss L_CL is calculated according to the preset third algorithm.
- the supplementary learning decoder 1010 decodes and reconstructs the supplementary learning latent vector encoded by the supplementary learning encoder 1002 to obtain the predicted word (a scalar); expressing the predicted word as a scalar makes it easier to calculate the supplementary learning loss L_CL later.
- the preset third algorithm includes the following steps: calculate the negative log-likelihood loss of the word at each position, and sum the negative log-likelihood losses of the words at all positions to obtain the supplementary learning loss.
- the supplementary learning loss thus sums the negative log-likelihood losses over all positions, where n denotes the number of words in the input sentence X.
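- The formula itself is not reproduced in this text; a plausible reconstruction from the description above, writing \hat{P}(x_t) for the decoder's predicted probability of the original word x_t at position t (notation assumed here), is:

```latex
L_{CL} = -\sum_{t=1}^{n} \log \hat{P}(x_t)
```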
- obtaining the discriminator loss includes the following steps: obtain the target output of the discriminator 102 according to the latent vector and the intermediate result; calculate the target output according to the preset fourth algorithm to obtain the discriminator loss L_D.
- the adversarial-learning discriminator 102 effectively controls the degree to which the task-independent encoder 1000 and the supplementary learning encoder 1002 learn the supplementary learning task (the sentence reconstruction task), avoiding overfitting of these encoders to the supplementary learning task and ensuring that the main model 100 remains well fitted to the main task, the sentiment analysis task.
- the value of the target output is 0 or 1.
- the preset fourth algorithm includes the following steps: input the supplementary learning latent vector into the preset second fully connected layer 1020, and input the output of the second fully connected layer 1020 into the preset second SoftMax classifier 1021 to obtain a 2-dimensional vector for each word, where each dimension of the 2-dimensional vector corresponds to the distribution probability of the target output being 0 or 1 (for example, the predicted distribution probability on 0 is denoted as P(0|·));
- the discriminator loss is obtained based on the distribution probability.
- the discriminator loss is calculated based on the distribution probability of the target output of the discriminator 102 on 0 and 1.
- the value of the discriminator loss can be used to intuitively determine whether the supplementary learning task is overfitting.
- the process is simple, consumes few computing resources, and is highly efficient.
- for each word, the negative log-likelihood loss of the target output is calculated, and the losses of all words are summed to obtain the discriminator loss L_D.
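- The discriminator-loss formula is likewise not reproduced in this text; a plausible reconstruction, writing z_t ∈ {0, 1} for the target output of word w_t and P(z_t | w_t) for the SoftMax probability the discriminator assigns to it (notation assumed here), is:

```latex
L_{D} = -\sum_{t=1}^{n} \log P(z_t \mid w_t)
```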
- an adjustable control parameter is introduced.
- the control parameter is used to control the contribution of the sentence restoration module 101 and the adversarial-learning discriminator 102 to model training.
- the control parameter provides an adjustable option for the calculation of the overall loss L.
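- One plausible form of the preset first algorithm, assuming the control parameter (written λ here) scales the two auxiliary terms equally; the exact combination is not specified in this text:

```latex
L = L_{SA} + \lambda \left( L_{CL} + L_{D} \right)
```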
- the optimization method also includes the following step: after optimizing the sentiment analysis model through the overall loss L, a secondary optimization update is performed on the parameters in the sentiment analysis model by adjusting the parameters of some of its modules.
- the sentiment analysis model after secondary optimization can achieve better performance.
- the sentiment analysis model after secondary optimization is used in the same way as the baseline sentiment analysis model: it only requires sentences and entities as input and does not rely on additional input; compared with the baseline model, it enhances performance without incurring additional usage costs.
- the secondary optimization includes the following steps: the task-independent encoder 1000, the supplementary learning encoder 1002, the sentiment analysis encoder 1001, the preset vector algorithm, the first fully connected layer 1003, the second fully connected layer 1020 and the SoftMax classifiers are initialized with the parameters of the sentiment analysis model optimized through the overall loss L; the optimized sentiment polarity loss is calculated in the same way as in the preliminary optimization;
- according to the optimized sentiment polarity loss, the model parameters in the sentiment analysis encoder 1001, the first fully connected layer 1003 and the second fully connected layer 1020 are optimized and updated a second time through the back-propagation algorithm to obtain the final optimized sentiment analysis model.
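- A minimal sketch of the secondary optimization, assuming PyTorch: the preliminarily optimized parameters are kept as initialization, only the sentiment analysis encoder 1001 and the two fully connected layers 1003 and 1020 are updated, and the optimized sentiment polarity loss drives back-propagation. The attribute names (absa_encoder, fc1, fc2, sentiment_polarity_loss) are hypothetical.

```python
import torch

def secondary_optimization(model, train_batches, lr=1e-5):
    """Second-stage update: start from the preliminarily optimized parameters and
    update only the sentiment analysis encoder and the two fully connected layers."""
    # Freeze everything, then re-enable only the modules that are updated a second time.
    for p in model.parameters():
        p.requires_grad = False
    trainable = (list(model.absa_encoder.parameters())
                 + list(model.fc1.parameters())
                 + list(model.fc2.parameters()))
    for p in trainable:
        p.requires_grad = True

    optimizer = torch.optim.Adam(trainable, lr=lr)
    for batch in train_batches:
        loss_sa = model.sentiment_polarity_loss(batch)   # computed as in the preliminary optimization
        optimizer.zero_grad()
        loss_sa.backward()
        optimizer.step()
    return model
```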
- without this optimization method, the F value of the sentiment analysis model is 77.04; after optimization with this optimization method, the F value of the sentiment analysis model is 77.70. It can be seen that by introducing the sentence restoration module 101, the model's ability to model aspect words is enhanced, thereby improving the performance of the sentiment analysis model on sentiment analysis tasks.
- the natural language model can be an emotion analysis model.
- the emotion analysis model includes a main model, a sentence restoration module and a discriminator.
- the main model includes a task-independent encoder, a sentiment analysis encoder and a supplementary learning encoder, wherein the method includes the following steps: obtain the input sentence through the task-independent encoder, encode the sentence, and obtain the latent vector of each word in the sentence; input the latent vector into the supplementary learning encoder to obtain the supplementary learning latent vector, and calculate the supplementary learning latent vector through the vector algorithm to obtain the vector representation of the aspect word; input the latent vector and the vector representation of the aspect word into the sentiment analysis encoder to obtain the sentiment polarity loss; input the supplementary learning latent vector into the sentence restoration module to obtain the intermediate result and the supplementary learning loss; input the latent vector and the intermediate result into the discriminator to obtain the discriminator loss; calculate the sentiment polarity loss, the supplementary learning loss and the discriminator loss according to the preset first algorithm to obtain the overall loss; and preliminarily optimize the sentiment analysis model through the overall loss.
- the task-independent encoder and the supplementary learning encoder realize the learning of aspect words in the text, which enhances the main model's understanding and modeling of aspect words, thereby improving the sentiment analysis model's performance on sentiment analysis tasks.
- obtaining the sentiment polarity loss includes the following steps: input the latent vector and the vector representation of the aspect word into the sentiment analysis encoder to obtain the sentiment analysis vector representation of the words and the sentiment analysis vector representation of the aspect word; calculate the latent vector, the sentiment analysis vector representation of the words and the sentiment analysis vector representation of the aspect word according to the preset second algorithm to obtain the sentiment polarity loss. The sentiment polarity loss characterizes the sentiment polarity prediction ability of the sentiment analysis model; during the preliminary optimization of the sentiment analysis model through the overall loss, it optimizes this prediction ability and ensures the reliability of the optimization method.
- the preset second algorithm includes the following steps: calculate the latent vectors according to the preset vector algorithm to obtain the task-independent sentence representation; calculate the sentiment analysis vector representation of the words and the sentiment analysis vector representation of the aspect words according to the preset vector algorithm to obtain the sentiment analysis sentence representation; concatenate the task-independent sentence representation and the sentiment analysis sentence representation to obtain an intermediate vector; input the intermediate vector into the preset first fully connected layer, and input the output of the first fully connected layer into the preset SoftMax classifier to obtain the predicted sentiment polarity; compare the predicted sentiment polarity with the preset standard to calculate the sentiment polarity loss.
- the preset vector algorithm is used to calculate the latent vectors, the sentiment analysis vector representations of the words and the sentiment analysis vector representation of the aspect words, obtaining the task-independent sentence representation and the sentiment analysis sentence representation as intermediate variables, which reflect the vector representations of different positions of the input sentence in the sentiment analysis model.
- the sentiment polarity predicted by the main model for the input sentence is obtained as a scalar; expressing the predicted polarity as a scalar facilitates the subsequent calculation of the sentiment polarity loss.
- the sentence restoration module includes a supplementary learning decoder
- obtaining the supplementary learning loss includes the following steps: input the supplementary learning latent vector into the supplementary learning decoder to reconstruct the input sentence and obtain the predicted word; compare the predicted word with the word in the input sentence to obtain the intermediate result, and calculate the supplementary learning loss according to the preset third algorithm.
- the predicted word (a scalar) is obtained by decoding and reconstructing, through the supplementary learning decoder, the supplementary learning latent vector encoded by the supplementary learning encoder; expressing the predicted word as a scalar makes it easier to calculate the supplementary learning loss later.
- the preset third algorithm includes the following steps: calculate the negative log-likelihood loss of the word at each position, and sum the negative log-likelihood losses of the words at all positions to obtain the supplementary learning loss.
- obtaining the discriminator loss also includes the following steps: obtain the target output of the discriminator based on the latent vector and the intermediate result; and calculate the target output according to the preset fourth algorithm to obtain the discriminator loss.
- the adversarial-learning discriminator effectively controls the degree to which the task-independent encoder and the supplementary learning encoder learn the supplementary learning task (the sentence reconstruction task), avoiding their overfitting to the supplementary learning task and ensuring that the main model remains well fitted to the main task, the sentiment analysis task.
- the value of the target output is 0 or 1
- the preset fourth algorithm includes the following steps: input the supplementary learning latent vector into the preset second fully connected layer, input the output of the second fully connected layer into the preset second SoftMax classifier, and obtain a 2-dimensional vector for each word, where each dimension of the 2-dimensional vector corresponds to the distribution probability of the target output being 0 or 1; the discriminator loss is obtained based on the distribution probability. Calculating the discriminator loss from the distribution probability of the discriminator's target output on 0 and 1 allows one to judge intuitively, from the value of the discriminator loss, whether the supplementary learning task is overfitting; the process is simple, consumes few computing resources and is highly efficient.
- the control parameters are used to control the contribution of the sentence restoration module and the adversarial-learning discriminator to model training.
- the control parameters provide adjustable options for the calculation of the overall loss.
- the optimization method for the natural language model also includes the following step: after optimizing the sentiment analysis model through the overall loss, a secondary optimization update is performed on the parameters in the sentiment analysis model by adjusting the parameters of some of its modules. Understandably, by performing the secondary optimization update, the secondarily optimized sentiment analysis model can achieve better performance.
- the sentiment analysis model after secondary optimization is used in the same way as the baseline sentiment analysis model: it only requires sentences and entities as input and does not rely on additional input; compared with the baseline model, it enhances performance without incurring additional usage costs.
- the secondary optimization includes the following steps: the task-independent encoder, the supplementary learning encoder, the sentiment analysis encoder, the preset vector algorithm, the first and second fully connected layers and the SoftMax classifiers are initialized with the parameters of the sentiment analysis model optimized through the overall loss; the optimized sentiment polarity loss is calculated in the same way as in the preliminary optimization; based on the optimized sentiment polarity loss, the model parameters in the sentiment analysis encoder, the first fully connected layer and the second fully connected layer are optimized and updated a second time through the back-propagation algorithm to obtain the final optimized sentiment analysis model.
- the natural language model is a relation extraction model
- the enhancement module is an entity recognition learning module
- the first encoder is a shared encoder
- the second encoder is a relation extraction encoder
- the target result loss is a relation type loss
- the enhancement loss is the entity label loss.
- an optimization method for a natural language model includes the following steps: obtain the input sentence through the shared encoder 2000 (Shared Encoder) in the preset main model 200 (relationship extraction, RE), encode the sentence, and output the latent vector of each word in the sentence;
- the relationship extraction model includes a main model 200, an entity recognition learning module 201 (NER) and a discriminator 202 (The Discriminator).
- the main model 200 also includes a relationship extraction encoder 2001 (RE encoder); the latent vector is input into the relationship extraction encoder 2001, the entity recognition learning module 201 and the discriminator 202 to obtain the relationship type loss L_RE, the entity label loss L_NER and the discriminator loss L_D respectively; the relationship type loss L_RE, the entity label loss L_NER and the discriminator loss L_D are calculated by the preset first algorithm to obtain the overall loss L; and the relationship extraction model is initially optimized through the overall loss L.
- the shared encoder 2000 in the main model 200 realizes the learning of entities in the text, which enhances the main model 200's ability to model entities, thereby improving the performance of the relationship extraction model on relation extraction tasks.
- obtaining the relationship type loss by the main model 200 includes the following steps: input the latent vector into the relationship extraction encoder 2001 in the main model 200, where the relationship extraction encoder 2001 encodes the latent vector to obtain the relationship extraction latent vector of each word; process the latent vector and the relationship extraction latent vector of each word through the preset second algorithm to obtain the predicted relationship type; compare the predicted relationship type with the preset standard to obtain the relationship type loss L_RE.
- in this way, the relationship type currently predicted by the main model 200 is obtained, and comparing the predicted relationship type with the preset standard yields the relationship type loss L_RE, which characterizes the relationship-type prediction ability of the relationship extraction model.
- during the preliminary optimization of the relationship extraction model through the overall loss L, the relationship type loss L_RE optimizes the relationship prediction ability of the relationship extraction model, ensuring the reliability of the optimization method.
- the preset second algorithm includes the following steps: calculate the relationship extraction latent vectors through the preset vector algorithm to obtain the vector representation of the first entity, the vector representation of the second entity and the sentence representation of the relation extraction encoder 2001; at the same time, apply the preset vector algorithm to the latent vector to obtain the sentence representation of the shared encoder 2000; concatenate the vector representation of the first entity, the vector representation of the second entity, the sentence representation of the relation extraction encoder 2001 and the sentence representation of the shared encoder 2000 to obtain the intermediate vector o; the intermediate vector o passes through the preset first fully connected layer 2002 and is then sent to the first SoftMax classifier 2003 to obtain the predicted relationship type.
- by calculating the latent vector and the relationship extraction latent vector through the preset vector algorithm, the vector representation of the first entity, the vector representation of the second entity (E_1 and E_2 represent the two entities respectively), the sentence representation of the relation extraction encoder 2001 and the sentence representation of the shared encoder 2000 are obtained as intermediate variables.
- the above vector representations are concatenated to obtain the intermediate vector o; after classification and normalization through the first fully connected layer 2002 and the first SoftMax classifier 2003, the relationship type predicted by the main model 200 for the two entities is obtained as a scalar; expressing the predicted relationship type as a scalar facilitates the subsequent calculation of the relationship type loss L_RE.
- the preset vector algorithm is the MaxPooling algorithm.
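- A minimal sketch of this part of the second algorithm, assuming PyTorch: entity representations are obtained by MaxPooling the relation-extraction latent vectors over each entity's token span, sentence representations by MaxPooling over all tokens, and the concatenated intermediate vector o is passed through the first fully connected layer 2002 and the first SoftMax classifier 2003. Shapes, the span encoding and the number of relation types are assumptions.

```python
import torch
import torch.nn as nn

def max_pool(vectors):
    """MaxPooling over the word dimension: (num_words, hidden) -> (hidden,)."""
    return vectors.max(dim=0).values

class RelationTypeHead(nn.Module):
    """Sketch of: pooled entity/sentence representations -> concatenation ->
    fully connected layer 2002 -> SoftMax classifier 2003."""

    def __init__(self, hidden_size, num_relation_types):
        super().__init__()
        self.fc = nn.Linear(4 * hidden_size, num_relation_types)

    def forward(self, shared_latent, re_latent, e1_span, e2_span):
        e1 = max_pool(re_latent[e1_span[0]:e1_span[1]])   # vector representation of the first entity
        e2 = max_pool(re_latent[e2_span[0]:e2_span[1]])   # vector representation of the second entity
        sent_re = max_pool(re_latent)                      # sentence representation of the RE encoder
        sent_shared = max_pool(shared_latent)              # sentence representation of the shared encoder
        o = torch.cat([e1, e2, sent_re, sent_shared])      # intermediate vector o
        return self.fc(o).softmax(-1)                      # predicted relation-type distribution
```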
- the entity recognition learning module 201 includes an entity encoder 2010.
- obtaining the entity label loss by the entity recognition learning module 201 includes the following steps: input the latent vector into the entity encoder 2010 in the entity recognition learning module 201, where the entity encoder 2010 further encodes the latent vector to obtain the entity recognition latent vector of each word; convert the latent vector and the entity recognition latent vector of each word to obtain the predicted entity recognition label; compare the predicted entity recognition labels of all words with the preset standard to obtain the entity label loss L_NER.
- the entity encoder 2010 encodes the latent vectors to obtain the entity recognition latent vector of each word, and the latent vector and the entity recognition latent vector of each word are converted into a predicted entity recognition label describing the approximate position of the word within an entity (a scalar; a scalar here refers to a quantity without direction, whose content may be a number or a character). Expressing the predicted entity recognition label as a scalar makes it easier to calculate the entity recognition loss later.
- the conversion process includes the following steps: concatenate the latent vector of each word with its entity recognition latent vector, and send the concatenated vector into the second fully connected layer 2011 and the second SoftMax classifier 2012 to obtain the predicted entity recognition label. Understandably, the way the concatenated vector is classified and converted into a predicted entity recognition label through the second fully connected layer 2011 and the second SoftMax classifier 2012 is the same as the way the intermediate vector o is converted into the predicted relationship type in the main model 200; unifying the two methods makes it easier to establish, optimize and maintain the relationship extraction model.
- the preset standard is the actual entity label; the predicted entity recognition labels of all words are compared with the actual entity labels, and the comparison result is calculated through the cross-entropy loss function to obtain the entity label loss L_NER.
- in this way, the difference between the predicted values and the actual values, namely the entity label loss L_NER, is obtained.
- optimizing the relationship extraction model through the entity label loss L_NER enhances the model's understanding of the meaning of entities and improves its ability to predict entity labels.
- the actual entity label of each word is determined by the location of the entity input into the shared encoder 2000.
- if a word is not part of an entity, the entity label is the first label; if a word is at the beginning of an entity, the entity label is the second label; if a word is in the middle or end of an entity, the entity label is the third label.
- the first label is O
- the second label is B
- the third label is I.
- the input sentence is "An air force pilot is back", and the two input entities are "air force" and "pilot".
- the actual entity labels of the words in the input sentence are O B I B O O.
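- A small sketch that reproduces the O/B/I labels for the example above and computes the entity label loss with cross-entropy. Whitespace tokenization, the label-to-index mapping and the random stand-in logits are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

LABELS = {"O": 0, "B": 1, "I": 2}

def bio_labels(sentence, entities):
    """Assign O/B/I to each word: O outside any entity, B at an entity's first word,
    I for its middle/end words."""
    words = sentence.split()
    labels = ["O"] * len(words)
    for entity in entities:
        span = entity.split()
        for start in range(len(words) - len(span) + 1):
            if words[start:start + len(span)] == span:
                labels[start] = "B"
                for k in range(start + 1, start + len(span)):
                    labels[k] = "I"
                break
    return labels

labels = bio_labels("An air force pilot is back", ["air force", "pilot"])
print(labels)                                            # ['O', 'B', 'I', 'B', 'O', 'O']

# Entity label loss: cross-entropy between per-word label logits (random here,
# standing in for the entity recognition learning module's output) and the gold labels.
gold = torch.tensor([LABELS[l] for l in labels])
logits = torch.randn(len(gold), len(LABELS))
loss_ner = F.cross_entropy(logits, gold)
```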
- the entity recognition learning module 201 can be trained to find entities from text, thereby enhancing the relationship extraction model's ability to understand entities.
- obtaining the discriminator loss by the discriminator 202 includes the following steps: the discriminator 202 obtains the result of comparing the predicted entity recognition label with the preset standard to obtain the target output, whose value is 0 or 1;
- the latent vector is sent into the preset third fully connected layer 2020 and the third SoftMax classifier 2021 to obtain a 2-dimensional vector for each word;
- each dimension of the 2-dimensional vector corresponds to the distribution probability of the target output being 0 or 1 (for example, the predicted distribution probability on 0 is recorded as P(0|·));
- for each word, the negative log-likelihood loss of the target output is calculated, and the losses of all words are summed to obtain the discriminator loss L_D.
- the discriminator 202 effectively controls the degree to which the shared encoder 2000 learns the entity recognition task and avoids overfitting of the shared encoder 2000 to the entity recognition task. Without the discriminator loss provided by the discriminator 202, the overall loss would favour optimizing the entity recognition loss during the preliminary optimization, thereby harming the performance of the relationship extraction model on the main task, relation extraction. It can be seen that including the discriminator loss provided by the discriminator 202 in the preliminary optimization through the overall loss ensures that the main performance of the relationship extraction model is not incorrectly overridden.
- an adjustable control parameter is introduced, and the control parameter is used to control the contribution of the entity recognition learning module 201 and the adversarial-learning discriminator 202 to model training.
- the control parameter provides an adjustable option for the calculation of the overall loss L.
- the optimization method also includes the following step: after optimizing the relationship extraction model through the overall loss L, the parameters of some modules in the relationship extraction model are adjusted so that the model parameters of the relation extraction encoder 2001 and of all fully connected layers are optimized and updated a second time. Understandably, by performing the secondary optimization update, the secondarily optimized relationship extraction model can achieve better performance.
- the relationship extraction model after secondary optimization is used in the same way as the baseline relation extraction model: it only requires sentences and entities as input and does not rely on additional input; compared with the baseline model, it enhances performance without incurring additional usage costs.
- the secondary optimization update includes the following steps: the shared encoder 2000 and the relation extraction encoder 2001 in the main model 200, together with the first, second and third fully connected layers and the SoftMax classifiers, are initialized with the parameters of the relation extraction model optimized through the overall loss L;
- the model parameters of the relation extraction encoder 2001 and of the first, second and third fully connected layers are then optimized and updated a second time through the back-propagation algorithm to obtain the final optimized relation extraction model.
- without this optimization method, the F value of the relationship extraction model is 77.04; after optimization with this optimization method, the F value of the relationship extraction model is 77.70. It can be seen that by introducing the entity recognition learning module 201, the model's ability to model entities is enhanced, thereby improving the performance of the relationship extraction model on the relation extraction task.
- B corresponding to A means that B is associated with A, and B can be determined based on A.
- determining B based on A does not mean determining B only based on A.
- B can also be determined based on A and/or other information.
- the natural language model can be a relationship extraction model.
- the relationship extraction model includes a main model, an entity recognition learning module and a discriminator.
- the main model includes a shared encoder and a relation extraction encoder, and the method includes the following steps: obtain the input sentence through the shared encoder in the main model, encode the sentence, and output the latent vector of each word in the sentence; input the latent vector into the relation extraction encoder, the entity recognition learning module and the discriminator to obtain the relationship type loss, the entity label loss and the discriminator loss respectively; calculate the relationship type loss, the entity label loss and the discriminator loss through the preset first algorithm to obtain the overall loss; and preliminarily optimize the relation extraction model through the overall loss. By setting up the entity recognition learning module in the preliminary optimization, the shared encoder in the main model realizes the learning of entities in the text, enhancing the main model's ability to model entities and thus improving the relation extraction model's performance on the relation extraction task.
- obtaining the relationship type loss from the main model includes the following steps: input the latent vector into the relation extraction encoder in the main model, where the relation extraction encoder encodes it to obtain the relationship extraction latent vector of each word; process the latent vector and the relationship extraction latent vector of each word through the preset second algorithm to obtain the predicted relationship type; compare the predicted relationship type with the preset standard to obtain the relationship type loss.
- the latent vector and the encoded relationship extraction latent vector are processed through the second algorithm to obtain the relationship type currently predicted by the main model.
- the predicted relationship type is compared with the preset standard to obtain the relationship type loss, which characterizes the relationship-type prediction ability of the relation extraction model; during the preliminary optimization of the relation extraction model through the overall loss, the relationship type loss optimizes the model's relationship prediction ability, ensuring the reliability of the optimization method.
- the preset second algorithm includes the following steps: calculate the relationship extraction latent vector through the preset vector algorithm to obtain the vector representation of the first entity, the vector representation of the second entity and the sentence representation of the relation extraction encoder; apply the preset vector algorithm to the latent vector to obtain the sentence representation of the shared encoder; concatenate the vector representation of the first entity, the vector representation of the second entity, the sentence representation of the relation extraction encoder and the sentence representation of the shared encoder to obtain an intermediate vector; the intermediate vector is sent to the SoftMax classifier after passing through a fully connected layer to obtain the predicted relationship type.
- the latent vector and the relationship extraction latent vector are calculated through the preset vector algorithm, and the resulting vector representation of the first entity, vector representation of the second entity, sentence representation of the relation extraction encoder and sentence representation of the shared encoder are used as intermediate variables, which reflect the vector representations of different positions of the entities or the sentence in the relation extraction model.
- the above vector representations are concatenated to obtain the intermediate vector; after the fully connected layer and the SoftMax classifier, the relationship type predicted by the main model for the two entities is obtained as a scalar, which facilitates the subsequent calculation of the relationship type loss.
- the entity recognition learning module includes an entity encoder
- obtaining the entity label loss by the entity recognition learning module includes the following steps: input the latent vector into the entity encoder in the entity recognition learning module, where the entity encoder encodes the latent vector to obtain the entity recognition latent vector of each word; convert the latent vector and the entity recognition latent vector of each word to obtain the predicted entity recognition label; compare the predicted entity recognition labels of the words with the preset standard to obtain the entity label loss.
- the entity recognition latent vector of each word is obtained by encoding the latent vector with the entity encoder, and the latent vector and the entity recognition latent vector of each word are converted into a predicted entity recognition label (a scalar) describing the approximate position of the word within an entity. It can be seen that expressing the predicted entity recognition label as a scalar makes it easier to calculate the entity recognition loss later.
- the conversion process includes the following steps: concatenate the latent vector of each word with the entity recognition latent vector, and send the concatenated vector into a fully connected layer and a SoftMax classifier to obtain the predicted entity recognition label.
- the way the concatenated vector is classified and converted into a predicted entity recognition label through the fully connected layer and the SoftMax classifier is the same as the way the intermediate vector is converted into the predicted relationship type in the main model; unifying the two methods makes it easier to establish, optimize and maintain the relation extraction model.
- the preset standard is the actual entity label
- the predicted entity recognition labels of all words are compared with the actual entity labels, and the comparison results are calculated through the cross-entropy loss function to obtain the entity label loss.
- the actual entity label of each word is determined by the location of the entity in the input.
- if a word is not part of an entity, the entity label is the first label; if a word is the beginning of an entity, the entity label is the second label; if a word is in the middle or end of an entity, the entity label is the third label.
- obtaining the discriminator loss by the discriminator includes the following steps: the discriminator obtains the result of comparing the predicted entity recognition label with the preset standard to obtain the target output, whose value is 0 or 1; the latent vector is sent to a fully connected layer and a SoftMax classifier to obtain a 2-dimensional vector for each word, where each dimension of the 2-dimensional vector corresponds to the distribution probability of the target output being 0 or 1; the discriminator loss is obtained by calculation from the distribution probability.
- the setting of the discriminator effectively controls the learning degree of the shared encoder for the entity recognition task and avoids the overfitting of the shared encoder for the entity recognition task.
- the discriminator loss provided by the discriminator ensures that the main performance of the relationship extraction model is not incorrectly overridden during optimization through the overall loss.
- control parameters are introduced, and the control parameters are used to control the contribution of the entity recognition learning module and the adversarial-learning discriminator to model training.
- the control parameters provide adjustable options for the calculation of the overall loss.
- the optimization method for a natural language model provided by the embodiment of the present application also includes the following step: after optimizing the above relation extraction model through the overall loss, the parameters of some modules in the relation extraction model are adjusted so that the model parameters of the relation extraction encoder and of the fully connected layers undergo a secondary optimization and update.
- the secondary optimized relationship extraction model can achieve better performance.
- the relation extraction model after secondary optimization is used in the same way as the baseline relation extraction model: it only requires sentences and entities as input and does not rely on additional input; compared with the baseline model, it enhances performance without incurring additional usage costs.
- the embodiment of the present application also provides an optimization device for a natural language model.
- the natural language model includes a main model, an enhancement module and a discriminator.
- the main model includes a first encoder and a second encoder, and the device includes: a first module 610, configured to obtain an input sentence through the first encoder, encode the sentence, and output the latent vector of each word in the sentence; a second module 620, configured to input the latent vector into the second encoder, the enhancement module and the discriminator to obtain the target result loss, the enhancement loss and the discriminator loss respectively; a third module 630, configured to calculate the target result loss, the enhancement loss and the discriminator loss through a preset first algorithm to obtain the overall loss; and a fourth module 640, configured to perform preliminary optimization of the natural language model through the overall loss.
- the natural language model is a sentiment analysis model
- the enhancement module is a sentence restoration module
- the first encoder is a task-independent encoder
- the second encoder is a sentiment analysis encoder
- the main model also includes a supplementary learning encoder
- the target result loss is an emotional polarity loss
- the enhancement loss is a supplementary learning loss
- the second module 620 is set to:
- the latent vector is input into the supplementary learning encoder to obtain the supplementary learning latent vector, and the vector representation of the aspect word is calculated from the supplementary learning latent vector through the vector algorithm; the latent vector and the vector representation of the aspect word are input into the sentiment analysis encoder to obtain the sentiment polarity loss;
- the supplementary learning latent vector is input into the sentence restoration module to obtain the intermediate result and the supplementary learning loss; the latent vector and the intermediate result are input into the discriminator to obtain the discriminator loss.
- the second module 620 is configured as:
- the latent vector and the vector representation of the aspect word are input into the sentiment analysis encoder to obtain the sentiment analysis vector representation of the words and the sentiment analysis vector representation of the aspect word; the latent vector, the sentiment analysis vector representation of the words and the sentiment analysis vector representation of the aspect word are calculated according to a preset second algorithm to obtain the sentiment polarity loss.
- the preset second algorithm includes:
- the latent vector is calculated according to the preset vector algorithm to obtain a task-independent sentence representation; the sentiment analysis vector representation of the words and the sentiment analysis vector representation of the aspect word are calculated according to the preset vector algorithm to obtain the sentiment analysis sentence representation; the task-independent sentence representation and the sentiment analysis sentence representation are concatenated to obtain an intermediate vector; the intermediate vector is input into the preset first fully connected layer, and the output of the first fully connected layer is input into the preset SoftMax classifier to obtain the predicted sentiment polarity; the predicted sentiment polarity is compared with the preset standard to calculate the sentiment polarity loss.
- the sentence restoration module includes a supplementary learning decoder
- the second module 620 is configured as:
- the supplementary learning latent vector is input into the supplementary learning decoder to reconstruct the input sentence and obtain the predicted word; the predicted word is compared with the words in the input sentence to obtain the intermediate result, and the supplementary learning loss is calculated according to a preset third algorithm.
- the preset third algorithm includes: calculating the negative log-likelihood loss of the word at each position, and summing the negative log-likelihood losses of the words at all positions to obtain the supplementary learning loss.
- the second module 620 is configured as:
- the target output of the discriminator is obtained according to the latent vector and the intermediate result; the target output is calculated according to a preset fourth algorithm to obtain the discriminator loss.
- the value of the target output is 0 or 1
- the second module 620 is set to:
- adjustable control parameters are introduced, and the control parameters are used to control the contribution of the sentence restoration module and the discriminator of adversarial learning to model training.
- a fifth module is also included, configured as:
- a second optimization and update of the parameters in the sentiment analysis model is performed by adjusting the parameters of some modules in the sentiment analysis model.
- the fifth module is configured as:
- the task-independent encoder, the supplementary learning encoder, the sentiment analysis encoder, the preset vector algorithm, the first fully connected layer, the second fully connected layer and the SoftMax classifier are initialized with the parameters of the sentiment analysis model optimized through the overall loss; the optimized sentiment polarity loss is calculated in the same way as in the preliminary optimization; based on the optimized sentiment polarity loss, the model parameters in the sentiment analysis encoder, the first fully connected layer and the second fully connected layer are optimized and updated a second time through the back-propagation algorithm to obtain the final optimized sentiment analysis model.
- the natural language model is a relationship extraction model
- the enhancement module is an entity recognition learning module
- the first encoder is a shared encoder
- the second encoder is a relationship extraction encoder
- the target result loss is a relationship type loss
- the enhanced loss is an entity label loss
- the second module 620 is set to:
- the latent vector is input into the relationship extraction encoder, the entity recognition learning module and the discriminator to obtain the relationship type loss, the entity label loss and the discriminator loss respectively.
- the relationship type loss is obtained in the following manner:
- the latent vector is input into the relationship extraction encoder in the main model, and the relationship extraction encoder encodes the latent vector to obtain the relationship extraction latent vector of each word; the latent vector of each word and the relationship extraction latent vector are processed through a preset second algorithm to obtain a predicted relationship type; the predicted relationship type is compared with a preset standard to obtain the relationship type loss.
- the preset second algorithm includes:
- the relationship extraction latent vector is calculated through the preset vector algorithm to obtain the vector representation of the first entity, the vector representation of the second entity and the sentence representation of the relationship extraction encoder; the preset vector algorithm is applied to the latent vector to obtain the sentence representation of the shared encoder; the vector representation of the first entity, the vector representation of the second entity, the sentence representation of the relationship extraction encoder and the sentence representation of the shared encoder are concatenated to obtain an intermediate vector; the intermediate vector passes through the preset first fully connected layer and is then sent to the preset first SoftMax classifier to obtain the predicted relationship type.
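A minimal sketch of this second algorithm is shown below, again assuming mean pooling as the preset vector algorithm and entity spans given as word-index ranges; names are illustrative.

```python
import torch
import torch.nn.functional as F

def predict_relationship_type(rel_latent, shared_latent, e1_span, e2_span, fc1):
    """Hedged sketch of the preset second algorithm for relation extraction
    (assumption: mean pooling as the preset vector algorithm).

    rel_latent:       (seq_len, d) relationship extraction latent vectors of each word
    shared_latent:    (seq_len, d) latent vectors from the shared encoder
    e1_span, e2_span: (start, end) word-index ranges of the first and second entity
    fc1:              the first fully connected layer
    """
    e1_vec = rel_latent[e1_span[0]:e1_span[1]].mean(dim=0)  # vector representation of the first entity
    e2_vec = rel_latent[e2_span[0]:e2_span[1]].mean(dim=0)  # vector representation of the second entity
    rel_sentence = rel_latent.mean(dim=0)                   # sentence representation of the relationship extraction encoder
    shared_sentence = shared_latent.mean(dim=0)             # sentence representation of the shared encoder

    intermediate = torch.cat([e1_vec, e2_vec, rel_sentence, shared_sentence], dim=-1)
    probs = F.softmax(fc1(intermediate), dim=-1)            # first SoftMax classifier
    return probs.argmax(dim=-1), probs                      # predicted relationship type and its distribution
```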
- the entity recognition learning module includes an entity encoder, and the entity label loss is obtained as follows:
- the latent vector is input into the entity encoder in the entity recognition learning module, and the entity encoder encodes the latent vector to obtain the entity recognition latent vector of each word; the latent vector and the entity recognition latent vector of each word are converted to obtain the predicted entity recognition label; the predicted entity recognition labels of all words are compared with the preset standard to obtain the entity label loss.
- the second module 620 is configured as:
- the latent vector of each word is concatenated with the entity recognition latent vector, and the concatenated vector is sent to the preset second fully connected layer and the preset second SoftMax classifier to obtain the predicted entity recognition label.
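The per-word labelling step can be sketched as follows; the concatenation, second fully connected layer and second SoftMax classifier follow the description above, while tensor names are illustrative.

```python
import torch
import torch.nn.functional as F

def predict_entity_labels(shared_latent, entity_latent, fc2):
    """Sketch: concatenate each word's shared-encoder latent vector with its
    entity recognition latent vector, then apply the second fully connected
    layer and the second SoftMax classifier.

    shared_latent: (seq_len, d1)  latent vectors of each word
    entity_latent: (seq_len, d2)  entity recognition latent vectors
    """
    concatenated = torch.cat([shared_latent, entity_latent], dim=-1)  # (seq_len, d1 + d2)
    label_probs = F.softmax(fc2(concatenated), dim=-1)                # (seq_len, num_labels)
    return label_probs.argmax(dim=-1)                                 # predicted entity recognition labels
```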
- the second module 620 is configured as:
- the actual entity label of each word is determined by the position of the word relative to the entities in the input: if a word does not belong to any entity, the actual entity label of the word is the first label; if a word is at the beginning of an entity, the actual entity label of the word is the second label; if a word is in the middle or at the end of an entity, the actual entity label of the word is the third label.
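This labelling rule is analogous to a BIO-style scheme; a small sketch is given below, where the concrete label ids 0, 1 and 2 stand in for the first, second and third labels and are purely illustrative.

```python
def actual_entity_labels(seq_len, entity_spans):
    """Hedged sketch of the labelling rule: 0 (first label) for words outside
    any entity, 1 (second label) for the word at the beginning of an entity,
    and 2 (third label) for words in the middle or at the end of an entity.

    entity_spans: list of (start, end) word indices, end exclusive.
    """
    labels = [0] * seq_len
    for start, end in entity_spans:
        labels[start] = 1
        for i in range(start + 1, end):
            labels[i] = 2
    return labels
```

For example, actual_entity_labels(5, [(1, 3)]) returns [0, 1, 2, 0, 0].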
- the discriminator loss is obtained as follows:
- the discriminator obtains the result of comparing the predicted entity recognition label with the preset standard, and obtains a target output.
- the target output value is 0 or 1; the latent vector is sent to the preset third fully connected layer and the third SoftMax classifier, and a 2-dimensional vector is obtained for each word, where each dimension of the 2-dimensional vector corresponds to the distribution probability of the target output on 0 and 1; the discriminator loss is calculated from the distribution probability.
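The sketch below follows these steps; the averaged negative log-probability of the actual target is an assumed form for the final loss computation.

```python
import torch
import torch.nn.functional as F

def adversarial_discriminator_loss(shared_latent, targets, fc3):
    """Hedged sketch: the third fully connected layer and the third SoftMax
    classifier produce, for each word, a 2-dimensional distribution over the
    target output {0, 1}; the loss is assumed to be the averaged negative
    log-probability of the actual target.

    shared_latent: (seq_len, d)  latent vectors of each word
    targets:       (seq_len,)    long tensor of 0/1 target outputs per word
    """
    probs = F.softmax(fc3(shared_latent), dim=-1)   # (seq_len, 2) distribution probabilities on 0 and 1
    log_probs = torch.log(probs + 1e-12)            # numerical safety for the log
    return F.nll_loss(log_probs, targets)           # discriminator loss from the distribution probability
```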
- when calculating the overall loss, adjustable control parameters are introduced, and the control parameters are used to control the contribution of the entity recognition learning module and the discriminator of adversarial learning to model training.
- a sixth module is also included, configured as:
- the model parameters of the relationship extraction encoder and the fourth fully connected layer are adjusted to perform secondary optimization updates.
- the device provided by the embodiment of the present application can implement the method steps in the above method embodiment and has the same technical effect.
- this embodiment of the present application also provides an electronic device, including a processor 710 and a memory 720.
- a computer program is stored in the memory 720; when the computer program is executed by the processor 710, the above-mentioned optimization method for a natural language model is implemented.
- Embodiments of the present application also provide a computer-readable storage medium.
- a computer program is stored on the computer-readable storage medium.
- when the computer program is executed by a processor, the above-mentioned optimization method for a natural language model is implemented.
- references herein to "one embodiment" or "an embodiment" mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, appearances of "in one embodiment" or "in an embodiment" throughout this text are not necessarily referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments. Those skilled in the art should also understand that the embodiments described herein are optional embodiments, and that the actions and modules involved are not necessarily required by the present application.
- the sequence numbers of the above-mentioned processes do not imply an order of execution; the execution order of each process should be determined by its functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
- each block in the flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
- the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown one after another may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending upon the functionality involved.
- each block in the block diagram and/or flowchart illustration, and combinations of blocks in the block diagram and/or flowchart illustration, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Data Mining & Analysis (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Devices For Executing Special Programs (AREA)
Abstract
The present application relates to an optimization method for a natural language model. The natural language model comprises a main model, an enhancement module and a discriminator, the main model comprising a first encoder and a second encoder. The method comprises: acquiring an input sentence by means of the first encoder, encoding the sentence and outputting a latent vector for each word in the sentence; inputting the latent vectors into the second encoder, the enhancement module and the discriminator so as to obtain a target result loss, an enhancement loss and a discriminator loss, respectively; performing a calculation on the target result loss, the enhancement loss and the discriminator loss by means of a preset first algorithm so as to obtain an overall loss; and preliminarily optimizing the natural language model according to the overall loss.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210753408.0A CN114970857A (zh) | 2022-06-29 | 2022-06-29 | 一种用于关系抽取模型的优化方法 |
CN202210753408.0 | 2022-06-29 | ||
CN202210753407.6A CN115034228A (zh) | 2022-06-29 | 2022-06-29 | 一种用于情感分析模型的优化方法 |
CN202210753407.6 | 2022-06-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024000966A1 true WO2024000966A1 (fr) | 2024-01-04 |
Family
ID=89383923
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/128623 WO2024000966A1 (fr) | 2022-06-29 | 2022-10-31 | Procédé d'optimisation pour modèle de langage naturel |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024000966A1 (fr) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210271822A1 (en) * | 2020-02-28 | 2021-09-02 | Vingroup Joint Stock Company | Encoder, system and method for metaphor detection in natural language processing |
CN111368528A (zh) * | 2020-03-09 | 2020-07-03 | 西南交通大学 | 一种面向医学文本的实体关系联合抽取方法 |
CN113128229A (zh) * | 2021-04-14 | 2021-07-16 | 河海大学 | 一种中文实体关系联合抽取方法 |
CN114357155A (zh) * | 2021-11-29 | 2022-04-15 | 山东师范大学 | 面向自然语言的方面情感分析方法及系统 |
CN114626529A (zh) * | 2022-02-25 | 2022-06-14 | 华南理工大学 | 一种自然语言推理微调方法、系统、装置及存储介质 |
CN114970857A (zh) * | 2022-06-29 | 2022-08-30 | 苏州思萃人工智能研究所有限公司 | 一种用于关系抽取模型的优化方法 |
CN115034228A (zh) * | 2022-06-29 | 2022-09-09 | 苏州思萃人工智能研究所有限公司 | 一种用于情感分析模型的优化方法 |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117610562A (zh) * | 2024-01-23 | 2024-02-27 | 中国科学技术大学 | 一种结合组合范畴语法和多任务学习的关系抽取方法 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022037256A1 (fr) | Procédé et dispositif de traitement de phrase de texte, dispositif informatique et support d'enregistrement | |
CN109492113B (zh) | 一种面向软件缺陷知识的实体、关系联合抽取方法 | |
JP7291183B2 (ja) | モデルをトレーニングするための方法、装置、デバイス、媒体、およびプログラム製品 | |
CN109062910A (zh) | 基于深度神经网络的句子对齐方法 | |
WO2023065617A1 (fr) | Système et procédé d'extraction intermodale basés sur un modèle de pré-entrainement ainsi qu'un rappel et un classement | |
CN111382582A (zh) | 一种基于非自回归的神经机器翻译解码加速方法 | |
CN112784051A (zh) | 专利术语抽取方法 | |
WO2019154210A1 (fr) | Procédé et dispositif de traduction automatique, et support de stockage lisible par ordinateur | |
CN111984791B (zh) | 一种基于注意力机制的长文分类方法 | |
CN110083702B (zh) | 一种基于多任务学习的方面级别文本情感转换方法 | |
CN110059324A (zh) | 基于依存信息监督的神经网络机器翻译方法及装置 | |
CN110569505A (zh) | 一种文本输入方法及装置 | |
WO2024000966A1 (fr) | Procédé d'optimisation pour modèle de langage naturel | |
CN115759119B (zh) | 一种金融文本情感分析方法、系统、介质和设备 | |
CN115168541A (zh) | 基于框架语义映射和类型感知的篇章事件抽取方法及系统 | |
CN114020906A (zh) | 基于孪生神经网络的中文医疗文本信息匹配方法及系统 | |
CN114970857A (zh) | 一种用于关系抽取模型的优化方法 | |
CN115034228A (zh) | 一种用于情感分析模型的优化方法 | |
WO2022228127A1 (fr) | Procédé et appareil de traitement de texte d'élément, dispositif électronique et support de stockage | |
CN117036706A (zh) | 一种基于多模态对话语言模型的图像分割方法和系统 | |
CN117291265A (zh) | 一种基于文本大数据的知识图谱构建方法 | |
CN116258147A (zh) | 一种基于异构图卷积的多模态评论情感分析方法及系统 | |
CN116775862A (zh) | 融合情感词的Bi-LSTM的情感分类方法 | |
CN113392929A (zh) | 一种基于词嵌入与自编码器融合的生物序列特征提取方法 | |
CN112966524A (zh) | 基于多粒度孪生网络的中文句子语义匹配方法及系统 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22949049; Country of ref document: EP; Kind code of ref document: A1 |