WO2024021343A1 - Natural language processing method, computer device, readable storage medium, and program product - Google Patents


Info

Publication number
WO2024021343A1
WO2024021343A1 · PCT/CN2022/128622 · CN2022128622W
Authority
WO
WIPO (PCT)
Prior art keywords
entity
vector
representation
preset
word
Prior art date
Application number
PCT/CN2022/128622
Other languages
French (fr)
Chinese (zh)
Inventor
宋彦
田元贺
李世鹏
Original Assignee
苏州思萃人工智能研究所有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202210911109.5A external-priority patent/CN115270812A/en
Priority claimed from CN202210909700.7A external-priority patent/CN115329764A/en
Application filed by 苏州思萃人工智能研究所有限公司 filed Critical 苏州思萃人工智能研究所有限公司
Publication of WO2024021343A1 publication Critical patent/WO2024021343A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology

Definitions

  • This application relates to the technical field of natural language processing, for example, to a natural language processing method, device, readable storage medium and program product.
  • the proper name recognition task aims to extract nominal entities from a given sentence.
  • the relationship extraction task aims to extract (predict), given a sentence and two entities in it, the relationship between the two entities. Understanding the meaning of an entity itself is very important for predicting its relationships; however, general methods often neglect modeling the entity itself.
  • traditional relationship extraction models often face the problem of sparse entity data, making it difficult to correctly extract relationships between entities not encountered during training (that is, unlogged, or out-of-vocabulary, entities).
  • This application provides a natural language processing method, equipment, storage medium and program product to solve the problem that traditional natural language processing methods are difficult to correctly extract entities or relationships between entities that have not been encountered during training.
  • This application provides a natural language processing method, including:
  • the enhanced representation of each entity is transformed and processed to obtain the processing result.
  • obtaining an enhanced representation of each entity based on its latent vector includes:
  • the enhanced representation of each entity is converted and processed to obtain processing results, including:
  • the semantically enhanced latent vector of each entity is subjected to classification conversion processing to obtain the proper name entity label corresponding to each entity.
  • obtaining the latent vector of each entity in the input text includes:
  • Obtaining the enhanced representation of each entity based on the latent vector of each entity includes:
  • the enhanced representation of each entity is converted and processed to obtain processing results, including:
  • the intermediate vector is converted to obtain the predicted relationship type between text entities.
  • the computer device includes a processor, a memory, and a computer program stored on the memory.
  • the processor executes the above computer program to implement the above natural language processing method.
  • This application also provides a readable storage medium on which computer program instructions are stored. When the computer program instructions are executed, the above natural language processing method is implemented.
  • This application also provides a computer program product, which includes computer program instructions. When the computer program instructions are executed, the above-mentioned natural language processing method is implemented.
  • Figure 1 is a schematic flowchart of a natural language processing method provided by an embodiment of the present application
  • FIG. 2 is a schematic flowchart of another natural language processing method provided by an embodiment of the present application.
  • Figure 3 is a schematic structural diagram of a proper name recognition model provided by an embodiment of the present application.
  • Figure 4 is a schematic flowchart of obtaining semantically enhanced latent vectors provided by an embodiment of the present application.
  • Figure 5 is a schematic flowchart of obtaining an average vector provided by an embodiment of the present application.
  • Figure 6 is a schematic structural diagram of another proper name recognition model provided by an embodiment of the present application.
  • Figure 7 is a schematic flow chart of another natural language processing method provided by an embodiment of the present application.
  • Figure 8 is a schematic module diagram of a relationship extraction model provided by an embodiment of the present application.
  • Figure 9 is a schematic flowchart of obtaining semantic enhanced representation provided by an embodiment of the present application.
  • Figure 10 is a schematic flowchart of obtaining an intermediate vector provided by an embodiment of the present application.
  • Figure 11 is a schematic flowchart of obtaining semantically enhanced vector representation provided by an embodiment of the present application.
  • Figure 12 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • an embodiment of the present application provides a natural language processing method, which includes: obtaining input text and encoding the input text to obtain the latent vector of each entity in the input text; obtaining the enhanced representation of each entity according to the latent vector; and converting the enhanced representation of each entity to obtain the processing result.
  • an enhanced representation of the entity is obtained based on the latent vector of each entity.
  • Semantic enhancement can be performed on each entity to help understand the entity, thereby improving the understanding ability of the corresponding natural language processing model and improving model performance.
  • the enhanced representation of each entity is obtained based on the latent vector of each entity and a preset pre-trained word vector library composed of a large number of similar words.
  • the embodiment of the present application provides a natural language processing method for obtaining proper name entity tags, which is implemented through a proper name recognition model.
  • the proper name recognition model includes an encoder 10, a neural network 11 and a decoder 12.
  • the natural language processing method includes the following steps: obtaining the input text X through the encoder 10 and encoding the input text X to obtain the latent vector of each entity in the input text X;
  • obtaining the semantically enhanced latent vector of each entity from the word vector library;
  • the latent vector is input to the neural network 11, which contains a pre-trained word vector library composed of a large number of similar words, to obtain the semantically enhanced latent vector h′_i of each entity (for each entity in the input text X, the semantically enhanced latent vector h′_i enhances the semantic features of the word through its similar words); the semantically enhanced latent vector h′_i is then subjected to classification conversion processing by the decoder 12 to obtain the proper name entity label corresponding to each entity.
  • the neural network 11 in the method provided by this solution can use the meanings of the similar words of each entity in the input text X to enhance the semantic representation of the current word, thereby enhancing the proper name recognition model's understanding of the meaning of the current word. That is, if the proper name recognition model has not seen a word during training, the meanings of words similar to that word can help it understand the current word, thereby improving the proper name recognition model's ability to recognize entities not encountered during training.
  • the encoder 10 adopts Bidirectional Encoder Representation from Transformers (BERT).
  • BERT is used to encode the input text X to obtain the latent vector of each entity in the input text X.
  • the hidden vector of the i-th word x_i is recorded as h_i.
  • obtaining the semantically enhanced latent vector h′_i of each entity includes the following steps: finding one or more approximate words of each entity in the input text X according to a preset pre-trained word vector library; calculating at least one approximate word of each entity according to a preset first algorithm to obtain the average vector o_i of each entity; and concatenating the average vector o_i of each entity with the hidden vector h_i to obtain the semantically enhanced latent vector h′_i of each entity.
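The first step above, retrieving approximate words from a pre-trained word vector library, can be sketched as a nearest-neighbour lookup by cosine similarity. The tiny embedding table and the top-k cutoff below are illustrative assumptions, not the library or parameters used in the application.

```python
import numpy as np

# Hypothetical stand-in for a pre-trained word vector library
# (the application mentions libraries such as Tencent's 8 million word vectors).
word_vectors = {
    "factory": np.array([0.9, 0.1, 0.0]),
    "plant":   np.array([0.8, 0.2, 0.1]),
    "fruit":   np.array([0.1, 0.9, 0.2]),
    "apple":   np.array([0.2, 0.8, 0.3]),
}

def approximate_words(query_vec, library, top_k=2):
    """Return the top_k library words most similar to query_vec by cosine."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(library, key=lambda w: cos(query_vec, library[w]), reverse=True)
    return ranked[:top_k]

# A query vector close to "factory" should retrieve factory-like neighbours.
neighbours = approximate_words(np.array([0.85, 0.15, 0.05]), word_vectors)
```

The retrieved neighbours' vectors then feed the first algorithm that produces the average vector o_i.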
  • the pre-trained word vector library can fully cover the approximate words of each entity in the input text X, effectively enhancing the semantic representation of the word, thereby strengthening the named entity model's understanding of the meaning of the current word.
  • calculating at least one approximate word of each entity according to a preset first algorithm to obtain the average vector of each entity includes the following steps: mapping the approximate words into word vectors according to the preset word vector matrix; mapping the word vectors into key vectors and value vectors according to the preset second algorithm; and calculating the key vectors and value vectors of each entity according to the preset third algorithm to obtain the average vector of each entity.
  • Word vectors, key vectors and value vectors are all abstract concepts for calculating and understanding the attention mechanism, which facilitates the computer to understand and calculate the meaning of each entity and approximate words.
  • the neural network 11 includes a preset key matrix and a preset value matrix, and mapping word vectors into key vectors and value vectors according to the preset second algorithm includes the following steps: passing the key matrix and the word vector into the preset activation function to obtain the key vector; and passing the value matrix and the word vector into the preset activation function to obtain the value vector.
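The key/value mapping described above can be sketched as follows. The matrix shapes and random initialisation are assumptions for illustration, and ReLU is used as the preset activation function because it is the activation named elsewhere in the embodiment.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

# Illustrative preset key matrix and value matrix (shapes are assumptions).
rng = np.random.default_rng(0)
W_key = rng.normal(size=(4, 3))
W_value = rng.normal(size=(4, 3))

def to_key_value(word_vec):
    """Map an approximate word's vector into a key vector and a value vector
    by passing the matrix-vector products through the activation function."""
    key = relu(W_key @ word_vec)
    value = relu(W_value @ word_vec)
    return key, value

k, v = to_key_value(np.array([0.5, -0.2, 0.3]))
```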
  • Activation functions play a very important role in enabling the neural network 11 to learn and understand very complex, nonlinear functions. The activation function introduces nonlinear characteristics into the neural network 11; without an activation function, the output signal would be just a simple linear function, with limited ability to learn complex function mappings from the data. It can be seen that introducing the activation function into the neural network 11 improves its ability to process complex data and improves the recognition performance of the proper name recognition model.
  • calculating the key vector and value vector of each entity according to a preset third algorithm to obtain the average vector of each entity includes the following steps: calculating the weight p_i,j of each similar word according to the hidden vector h_i of each entity and the key vector k_i,j corresponding to the similar words of each entity; and calculating the average vector o_i of each entity according to the weight p_i,j and the value vector v_i,j.
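A minimal sketch of this third algorithm, assuming the weights p_i,j are a softmax over the inner products of h_i with each key vector (the application only states they are positive real numbers between 0 and 1):

```python
import numpy as np

def semantic_enhanced_vector(h_i, keys, values):
    """Weight each similar word by how well its key matches the hidden
    vector h_i (softmax over inner products), average the value vectors
    with those weights, and concatenate the result with h_i."""
    scores = np.array([float(np.dot(h_i, k)) for k in keys])
    exp = np.exp(scores - scores.max())          # numerically stable softmax
    p = exp / exp.sum()                          # weights p_i,j in (0, 1)
    o_i = sum(w * v for w, v in zip(p, values))  # average vector o_i
    return np.concatenate([o_i, h_i])            # h'_i = [o_i ; h_i]

h = np.array([1.0, 0.0])
keys = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
values = [np.array([2.0, 0.0]), np.array([0.0, 2.0])]
h_prime = semantic_enhanced_vector(h, keys, values)
```

The first key matches h exactly, so its value vector dominates the average.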
  • some similar words of each entity in the input text have different degrees of fit with the current context; if the importance of these similar words in the semantic representation of the current word is not differentiated, then the resulting average vector o_i, and the subsequent semantically enhanced latent vector h′_i obtained based on the average vector o_i, are not accurate enough for the meaning of the current word.
  • in this solution, the weights of the similar words are divided according to how well different similar words fit the current context, the average vector o_i of each entity is calculated based on these weights, and the subsequent semantically enhanced latent vector h′_i is obtained.
  • This undoubtedly improves the accuracy with which the semantically enhanced latent vector h′_i expresses word meaning, further enhancing the proper name recognition model's understanding of the input text X and improving the performance of the proper name recognition model.
  • the semantically enhanced latent vector of each entity is obtained based on the latent vector and a pre-trained word vector library composed of a large number of similar words, by inputting the latent vector into a preset key-value memory neural network.
  • Neural network 11 includes a key-value memory neural network, which enables a machine to accept input (e.g., questions, puzzles, tasks, etc.) and, in response, generate output (e.g., answers, solutions, responses to tasks, etc.) based on information from a knowledge source.
  • the key-value memory network model operates on symbolic memory structured into (key, value) pairs, which gives the proper name recognition model greater flexibility for encoding the input text X and helps bridge the gap between reading text directly and answering from a library of pre-trained word vectors.
  • Key-value memory networks are versatile in that they encode prior knowledge about the task at hand in key-value memories. For example, they can analyze documents, pre-trained word vector libraries, or pre-trained word vector libraries built using information extraction, and answer questions about them.
  • Entity recognition is regarded as a sequence labeling task, that is, predicting the entity recognition label of each input word.
  • the preset pre-trained word vector library may be, for example, Tencent's 8-million-word vector library.
  • the method is as follows: each approximate word of the i-th word is first mapped to its word vector e_i,j through the preset word vector matrix; the key vector and value vector are then obtained as k_i,j = ReLU(W_k · e_i,j) and v_i,j = ReLU(W_v · e_i,j), where · represents the product of a matrix and a vector, the calculation result is a vector, and ReLU is the activation function; the weight p_i,j = exp(h_i · k_i,j) / Σ_j exp(h_i · k_i,j) (a positive real number between 0 and 1) is used to calculate the product of the weight p_i,j and the vector v_i,j, giving the average vector o_i = Σ_j p_i,j · v_i,j; finally, the average vector o_i and h_i are concatenated in series to obtain the output of the key-value memory neural network: h′_i = [o_i ; h_i].
  • the proper name recognition model also includes a fully connected layer
  • the decoder 12 is a SoftMax classifier. Classifying and converting the semantically enhanced latent vectors includes the following steps: the semantically enhanced latent vectors are passed through the fully connected layer and then through the SoftMax classifier to obtain the proper name entity labels. The fully connected layer and SoftMax classifier can visualize the weight information of the different connections contained in the semantically enhanced latent vector and match it with the preset template, making it easier to predict the type of relationship between entities.
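A hedged sketch of this decoding step: a fully connected layer followed by a SoftMax over an illustrative, assumed tag set (the application does not enumerate its proper name entity labels, so the BIO-style labels and random weights below are stand-ins).

```python
import numpy as np

# Illustrative label inventory for sequence labelling (an assumption).
LABELS = ["B-ENT", "I-ENT", "O"]

rng = np.random.default_rng(1)
W_fc = rng.normal(size=(len(LABELS), 4))  # fully connected layer weights
b_fc = np.zeros(len(LABELS))

def classify(h_prime):
    """Pass a semantically enhanced latent vector through the fully
    connected layer, then a SoftMax, and return the arg-max label."""
    logits = W_fc @ h_prime + b_fc
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    return LABELS[int(np.argmax(probs))], probs

label, probs = classify(np.array([0.3, -0.1, 0.8, 0.2]))
```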
  • the embodiment of this application provides a natural language processing method, which is applied to the case of proper name recognition.
  • the method includes the following steps: obtaining the input text and encoding the input text to obtain the latent vector of each entity in the input text;
  • the semantically enhanced latent vector of each entity is obtained based on the latent vector and the pre-trained word vector library based on a large number of similar words; the semantically enhanced latent vector is subjected to classification conversion processing to obtain the proper name entity label corresponding to each entity.
  • Traditional proper name recognition models often face the problem of sparse data, making it difficult to correctly extract entities that are not encountered during training.
  • the neural network model in the proper name recognition method can use the meanings of the similar words of each entity in the input text to enhance the semantic representation of the current word, thereby enhancing the proper name recognition model's understanding of the meaning of the current word. That is, if the proper name recognition model has not seen a word during training, the meanings of words similar to that word can help it understand the current word, thereby improving the proper name recognition model's ability to recognize entities not encountered during training.
  • obtaining the semantically enhanced latent vector of each entity includes the following steps: finding one or more approximate words of each entity in the input text according to the preset pre-trained word vector library; calculating at least one approximate word of each entity according to the preset first algorithm to obtain the average vector of each entity; and concatenating the average vector and the latent vector to obtain the semantically enhanced latent vector.
  • the pre-trained word vector library can fully cover the approximate words of each entity in the input text, effectively enhancing the semantic representation of the word, thus strengthening the named entity model's understanding of the meaning of the current word.
  • calculating at least one approximate word of each entity according to a preset first algorithm to obtain the average vector of each entity includes the following steps: mapping the approximate words into word vectors according to the preset word vector matrix; mapping the word vectors into key vectors and value vectors according to the preset second algorithm; and calculating the key vectors and value vectors of each entity according to the preset third algorithm to obtain the average vector.
  • Word vectors, key vectors and value vectors are all abstract concepts for calculating and understanding the attention mechanism, which facilitates the computer to understand and calculate the meaning of each entity and approximate words.
  • mapping word vectors into key vectors and value vectors according to a preset second algorithm includes the following steps: passing the preset key matrix and the word vector into the preset activation function to obtain the key vector; and passing the preset value matrix and the word vector into the preset activation function to obtain the value vector.
  • Activation functions play a very important role in neural networks learning and understanding very complex and nonlinear functions. The activation function introduces nonlinear characteristics into the neural network. If there is no activation function, the output signal is just a simple linear function with less ability to learn complex function mapping from the data. It can be seen that introducing the activation function into the neural network improves the neural network's ability to process complex data and improves the recognition performance of the proper name recognition model.
  • calculating the key vector and value vector of each entity according to a preset third algorithm to obtain the average vector includes the following steps: calculating the weight of each similar word according to the hidden vector of each entity and the key vector corresponding to the similar words of each entity; and calculating the average vector of each entity based on the weights and value vectors.
  • some similar words of each entity in the input text have different adaptability to the current context. Therefore, if the importance of these similar words in the semantic representation of the current word is not divided, then The obtained average vector and the subsequent semantically enhanced latent vector obtained based on the average vector are not accurate enough for the meaning of the current word.
  • the weights of the similar words are divided according to the degree to which different similar words fit the current context, the average vector of each entity is calculated based on these weights, and the subsequent semantically enhanced latent vector is obtained.
  • This undoubtedly improves the accuracy with which the semantically enhanced latent vector expresses word meaning, thereby further enhancing the proper name recognition model's understanding of the input text and improving the performance of the proper name recognition model.
  • the classification conversion processing of the semantically enhanced latent vectors includes the following steps: the semantically enhanced latent vectors are passed through a preset fully connected layer and then sent to the preset SoftMax classifier to obtain the proper name entity tag.
  • the fully connected layer and SoftMax classifier can visualize the weight information of different connections contained in the semantically enhanced latent vector and match it with the preset template, making it easier to predict the type of relationship between entities.
  • the semantically enhanced latent vector of each entity is obtained based on the latent vector and a pre-trained word vector library composed of a large number of similar words, by inputting the latent vector into a preset key-value memory neural network.
  • the key-value memory network model operates on symbolic memory structured into (key, value) pairs, which gives the proper name recognition model greater flexibility for encoding input text and helps bridge the gap between reading text directly and answering from a library of pre-trained word vectors.
  • key-value memory networks have the versatility to analyze, for example, documents, pre-trained word vector libraries, or pre-trained word vector libraries built using information extraction, and Answer questions about them.
  • obtaining the hidden vector of each entity in the input text includes: obtaining the hidden vector of each given entity in the input text; said obtaining the enhanced representation of each entity according to the hidden vector of each entity includes: calculating the given hidden vector of each entity through a preset first algorithm to obtain the first entity vector representation and the second entity vector representation respectively corresponding to the given two entities; and processing the first entity vector representation and the second entity vector representation to obtain the first semantic enhanced representation and the second semantic enhanced representation.
  • said conversion processing of the enhanced representation of each entity to obtain a processing result includes: calculating the first semantic enhanced representation and the second semantic enhanced representation according to a preset second algorithm to obtain an intermediate vector; and converting the intermediate vector to obtain the predicted relationship type between text entities.
  • This embodiment of the present application provides a natural language processing method, which is used to extract the relationship between two given entities in the input text and is implemented through a relationship extraction model. The relationship extraction model includes an encoder 20, a decoder 21 and a semantic enhancement module 22 based on the attention mechanism.
  • the natural language processing method includes the following steps: obtaining the input text, passing the input text to the encoder 20, encoding the input text, and outputting the hidden vector of each entity given in the input text; calculating the hidden vector of each given entity through the preset first algorithm to obtain the first entity vector representation and the second entity vector representation corresponding to the given two entities; inputting the first entity vector representation and the second entity vector representation into the semantic enhancement module 22 to obtain the first semantic enhanced representation and the second semantic enhanced representation; calculating the first semantic enhanced representation and the second semantic enhanced representation through a preset second algorithm to obtain an intermediate vector; and converting and decoding the intermediate vector through the decoder 21 to obtain the predicted relationship type between text entities.
  • the semantic enhancement module 22 of the relationship extraction model performs semantic enhancement on each entity in the input text, using the semantics of entities similar to an entity to enhance the semantic representation of that entity, thereby strengthening the relationship extraction model's understanding of the current entity's semantics. That is, if the relationship extraction model has not seen an entity during training, the semantics of entities similar to it can help the model understand the current entity, thereby improving the relationship extraction model's ability to understand entities not encountered during training and, in turn, the performance of relationship extraction.
  • the first algorithm is the Max Pooling algorithm.
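A minimal sketch of Max Pooling over the hidden vectors of an entity's tokens, as the first algorithm; the sentence length, hidden vectors, and entity span below are made up for illustration.

```python
import numpy as np

def max_pool_entity(hidden_vectors, span):
    """Element-wise max over the hidden vectors h_i of the tokens that an
    entity spans, giving a single entity vector representation."""
    start, end = span  # token indices of the entity, end exclusive
    return np.max(np.stack(hidden_vectors[start:end]), axis=0)

# Hypothetical hidden vectors for a four-token sentence.
H = [np.array([0.1, 0.9]), np.array([0.4, 0.2]),
     np.array([0.3, 0.5]), np.array([0.8, 0.1])]
h_E1 = max_pool_entity(H, (0, 2))  # entity E1 covers tokens 0 and 1
```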
  • obtaining the first semantically enhanced representation and the second semantically enhanced representation includes the following steps: finding one or more approximate entities of each entity given in the input text according to a preset pre-trained word vector library; and calculating the approximate entities of the given two entities, the first entity vector representation and the second entity vector representation according to the preset third algorithm to obtain the first semantically enhanced representation and the second semantically enhanced representation respectively corresponding to the given two entities in the input text.
  • the pre-trained word vector library can fully cover the approximate entities of each entity in the input text, effectively enhancing the semantic representation of this entity, thereby strengthening the relationship extraction model's understanding of the current entity semantics.
  • calculating the first semantic enhanced representation and the second semantic enhanced representation according to a preset second algorithm to obtain an intermediate vector includes: concatenating the first entity vector representation with the first semantic enhanced representation to obtain the first enhanced vector representation; concatenating the second entity vector representation with the second semantic enhanced representation to obtain the second enhanced vector representation; and concatenating the first enhanced vector representation with the second enhanced vector representation to obtain the intermediate vector.
  • the first entity vector representation can represent the semantics of the first entity itself, and the first semantic enhanced representation can represent the semantics of similar entities.
  • the first enhanced vector representation obtained by concatenating the two combines the semantics of the entity and the similar entities. (The same is true for the second enhanced vector representation).
  • the semantics of the original entity are expanded, allowing the computer to better understand the semantics of the entity, and also facilitates subsequent calculation and processing.
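The concatenation scheme described above (o1 = [h_E1; a1], o2 = [h_E2; a2], intermediate vector o = [o1; o2]) can be sketched directly; the small vectors below are illustrative only.

```python
import numpy as np

def intermediate_vector(h_E1, a1, h_E2, a2):
    """Concatenate each entity vector with its semantic enhanced
    representation, then concatenate the two enhanced representations."""
    o1 = np.concatenate([h_E1, a1])  # first enhanced vector representation
    o2 = np.concatenate([h_E2, a2])  # second enhanced vector representation
    return np.concatenate([o1, o2])  # intermediate vector o

o = intermediate_vector(np.array([1.0, 2.0]), np.array([3.0, 4.0]),
                        np.array([5.0, 6.0]), np.array([7.0, 8.0]))
```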
  • calculating the approximate entities of the two given entities, the first entity vector representation and the second entity vector representation according to the preset third algorithm to obtain the first semantic enhanced representation and the second semantic enhanced representation respectively corresponding to the two given entities in the input text includes the following steps: mapping the approximate entities into word vectors through a preset word vector matrix; calculating the weight of each approximate entity using the first entity vector representation (or the second entity vector representation) and the word vectors of one or more approximate entities of the current entity; and calculating the first semantic enhanced representation (or the second semantic enhanced representation) based on the weights and the word vectors.
  • calculating weights from the semantics of different approximate entities identifies and exploits the differing importance of those entities, effectively avoiding the impact of potential noise among similar entities on model performance, thereby improving the performance of relationship extraction.
  • converting the intermediate vector includes the following steps: after passing the intermediate vector through a preset fully connected layer, it is sent to the SoftMax classifier to obtain the predicted relationship type.
  • the fully connected layer and SoftMax classifier can visualize the weight information of different connections contained in the semantically enhanced latent vector and match it with the preset template, making it easier to predict the type of relationship between entities.
  • the two given entities in the input are "Food Factory" and "Fruit Cans".
  • the model uses a standard encoder-decoder architecture.
  • Encoder 20 uses BERT and decoder 21 uses SoftMax;
  • the first step is to use BERT to encode the input text and obtain the latent vector of each entity, where the hidden vector of the i-th word x_i is recorded as h_i.
  • the Max Pooling algorithm is used to calculate the vector representations h_E1 and h_E2 of the two entities (E1 and E2 represent the two entities respectively).
  • h_E1 and h_E2 are sent to the semantic enhancement module 22 to obtain the semantic enhanced representations a_1 and a_2 corresponding to E1 and E2 respectively.
  • the fourth step is to concatenate h_E1 and a_1 to obtain the first enhanced vector representation o_1; similarly, the second enhanced vector representation o_2 can be obtained.
  • the fifth step is to concatenate o_1 and o_2 to obtain the intermediate vector o.
  • the sixth step: after o passes through a fully connected layer, it is sent to the SoftMax classifier to obtain the predicted relationship type.
  • the processing flow of the semantic enhancement module 22 is as follows:
  • the first two steps are to find the approximate entities of each given entity E_i in a preset entity vector library (such as Tencent's 8 million word vectors) and map them into word vectors e_i,j;
  • the third step is to use the vector representation h_Ei of E_i and e_i,j to calculate the weight p_i,j as follows: p_i,j = exp(h_Ei · e_i,j) / Σ_j exp(h_Ei · e_i,j), where · is used to calculate the inner product of vectors;
  • the fourth step is to calculate the average vector a_i of the word vectors e_i,j based on the weights p_i,j as follows: a_i = Σ_j p_i,j · e_i,j, where each weight p_i,j (a positive real number between 0 and 1) multiplies the word vector e_i,j.
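The weight and averaging computation of the semantic enhancement module can be sketched as follows, assuming a softmax form for the weights p_i,j (consistent with their being positive real numbers between 0 and 1 derived from inner products):

```python
import numpy as np

def semantic_enhancement(h_Ei, neighbour_vecs):
    """Weight each approximate entity's word vector e_{i,j} by the softmax
    of its inner product with the entity representation h_Ei, then return
    the weighted average a_i = sum_j p_{i,j} * e_{i,j}."""
    scores = np.array([float(np.dot(h_Ei, e)) for e in neighbour_vecs])
    exp = np.exp(scores - scores.max())  # numerically stable softmax
    p = exp / exp.sum()                  # weights p_{i,j} in (0, 1)
    return sum(w * e for w, e in zip(p, neighbour_vecs))

# An aligned neighbour should dominate an anti-aligned one.
h_E = np.array([1.0, 0.0])
E = [np.array([1.0, 0.0]), np.array([-1.0, 0.0])]
a = semantic_enhancement(h_E, E)
```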
  • converting the intermediate vector to obtain the predicted relationship type between text entities includes the following steps: passing the intermediate vector through a preset fully connected layer and then sending it to the SoftMax classifier to obtain the predicted relationship type.
  • the fully connected layer and SoftMax classifier can visualize the weight information of different connections contained in the semantically enhanced latent vector and match it with the preset template, making it easier to predict the type of relationship between entities.
  • the embodiment of this application provides a natural language processing method, which is applied to extracting the relationship between two given entities in the input text and is implemented through a relationship extraction model. The method includes: obtaining the input text and encoding it to obtain the hidden vector of each entity given in the input text; calculating the hidden vector of each given entity through a preset first algorithm to obtain the first entity vector representation and the second entity vector representation corresponding to the two given entities; processing the first entity vector representation and the second entity vector representation to obtain the first semantic enhanced representation and the second semantic enhanced representation; calculating an intermediate vector from the first semantic enhanced representation and the second semantic enhanced representation through a preset second algorithm; and converting the intermediate vector to obtain the predicted relationship type between text entities.
  • the semantic enhancement module of the relationship extraction model performs semantic enhancement on each entity in the input text, using the semantics of similar entities to enhance the semantic representation of the entity and thereby strengthening the relationship extraction model's understanding of the current entity's semantics. That is, if the relationship extraction model has not seen an entity during training, the semantics of entities similar to it can help the model understand the current entity, improving the model's ability to understand out-of-vocabulary entities and, in turn, the performance of relation extraction.
  • the embodiment of the present application provides a natural language processing method. Obtaining the first semantic enhanced representation and the second semantic enhanced representation includes the following steps: finding one or more approximate entities of each given entity in the input text according to a preset pre-trained word vector library; and calculating the approximate entities of the two given entities according to a preset third algorithm, together with the first entity vector representation and the second entity vector representation, to obtain the first semantic enhanced representation and the second semantic enhanced representation respectively corresponding to the two given entities in the input text.
  • the pre-trained word vector library can fully cover the approximate entities of each entity in the input text, effectively enhancing the semantic representation of this entity, thereby strengthening the relationship extraction model's understanding of the current entity semantics.
  • the embodiment of the present application provides a natural language processing method in which the first semantic enhanced representation and the second semantic enhanced representation are calculated according to a preset second algorithm to obtain an intermediate vector, including: concatenating the first entity vector representation with the first semantic enhanced representation to obtain the first enhanced vector representation; concatenating the second entity vector representation with the second semantic enhanced representation to obtain the second enhanced vector representation; and concatenating the first enhanced vector representation with the second enhanced vector representation to obtain the intermediate vector.
  • the first entity vector representation can represent the semantics of the first entity itself, and the first semantic enhanced representation can represent the semantics of similar entities.
  • the first enhanced vector obtained by concatenating the two therefore combines the semantics of the entity and of its similar entities (the same is true for the second enhanced vector representation), expanding the semantics of the original entity, allowing the computer to better understand the entity's semantics and facilitating subsequent calculation and processing.
  • the embodiment of the present application provides a natural language processing method in which the approximate entities of the two given entities are calculated according to the preset third algorithm, together with the first entity vector representation and the second entity vector representation, to obtain the first semantic enhanced representation and the second semantic enhanced representation respectively corresponding to the two given entities in the input text, including: mapping the approximate entities into word vectors through a preset word vector matrix; calculating the weight of each approximate entity from the first entity vector representation (or the second entity vector representation) and the word vectors of the one or more approximate entities of the current entity; and calculating the first semantic enhanced representation (or the second semantic enhanced representation) from the weights and the word vectors.
  • calculating weights for different approximate entities according to their semantics identifies and exploits their differing importance, effectively avoiding the impact of potential noise among similar entities on model performance and thereby improving the performance of relation extraction.
  • the embodiment of the present application provides a natural language processing method. Converting the intermediate vector to obtain the predicted relationship type between text entities includes the following steps: the intermediate vector is passed through a preset fully connected layer and then sent to the SoftMax classifier to obtain the predicted relationship type. The fully connected layer and SoftMax classifier can visualize the weight information of the different connections contained in the semantically enhanced latent vector and match it with a preset template, making it easier to predict the type of relationship between entities.
  • an embodiment of the present application also provides a computer device, including a processor 30, a memory 31, and a computer program stored on the memory 31.
  • the processor 30 executes the computer program to implement the above method.
  • the computer device has the same effects as the above-mentioned natural language processing method, which will not be described again here.
  • Embodiments of the present application also provide a readable storage medium on which computer program instructions are stored. When the computer program instructions are executed, the above method is implemented.
  • the readable storage medium has the same effect as the above-mentioned natural language processing method, which will not be described again here.
  • Embodiments of the present application also provide a program product.
  • the program product includes computer program instructions. When the computer program instructions are executed, the above method is implemented.
  • the program product has the same effects as the above-mentioned natural language processing method, and will not be described again here.
  • B corresponding to A means that B is associated with A and can be determined based on A. However, determining B based on A does not mean determining B based only on A; B can also be determined based on A and/or other information.
  • references herein to "one embodiment" or "an embodiment" mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Therefore, appearances of "in one embodiment" or "in an embodiment" in various places herein are not necessarily referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments. The embodiments described herein are all optional embodiments, and the actions and modules involved are not necessarily required by this application.
  • the size of the sequence numbers of the above-mentioned multiple processes does not necessarily indicate their order of execution. The execution order of the multiple processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • each block in the flowchart or block diagram may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown one after another may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending upon the functionality involved.
  • Each block in the block diagram and/or flowchart illustration, and combinations of blocks therein, may be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
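The relation-extraction flow enumerated in the points above (token latent vectors, max-pooled entity representations h_E1/h_E2, semantic enhancement a_1/a_2 via approximate-entity word vectors, concatenation into the intermediate vector o, and a fully connected layer with SoftMax) can be sketched end to end. This is an illustrative reconstruction, not the patented implementation: the random vectors stand in for BERT outputs, and `W`, `b`, the entity spans, and the approximate-entity vectors are hypothetical stand-ins for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden size (stand-in for a BERT hidden vector)

# Hidden vectors h_i for a 6-token sentence; entity spans E1=[1,3), E2=[4,6).
H = rng.normal(size=(6, d))
h_E1 = H[1:3].max(axis=0)  # max pooling over the entity's tokens
h_E2 = H[4:6].max(axis=0)

def semantic_enhancement(h_entity, approx_vecs):
    """Weight each approximate entity's word vector by the softmax of its
    inner product with the entity representation, then average (a_i)."""
    scores = approx_vecs @ h_entity       # h_Ei · e_{i,j}
    p = np.exp(scores - scores.max())
    p /= p.sum()                          # weights p_{i,j}, each in (0, 1)
    return p @ approx_vecs                # a_i = sum_j p_{i,j} * e_{i,j}

E1_approx = rng.normal(size=(3, d))       # word vectors of E1's approximate entities
E2_approx = rng.normal(size=(4, d))
a1 = semantic_enhancement(h_E1, E1_approx)
a2 = semantic_enhancement(h_E2, E2_approx)

o1 = np.concatenate([h_E1, a1])           # first enhanced vector representation
o2 = np.concatenate([h_E2, a2])           # second enhanced vector representation
o = np.concatenate([o1, o2])              # intermediate vector

# Fully connected layer + SoftMax over an assumed set of 5 relation types.
W = rng.normal(size=(5, o.size))
b = np.zeros(5)
logits = W @ o + b
probs = np.exp(logits - logits.max())
probs /= probs.sum()
predicted_relation = int(np.argmax(probs))
print(predicted_relation, float(probs.sum()))
```

The softmax here subtracts the maximum score before exponentiating, which is numerically equivalent to formula-style softmax but avoids overflow.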

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

A natural language processing method, a computer device, a readable storage medium, and a program product. The natural language processing method comprises: acquiring an input text and encoding same to obtain a latent vector of each entity in the input text; obtaining an enhanced representation of each entity according to the latent vector of each entity; and performing conversion processing on the enhanced representation of each entity to obtain a processing result.

Description

Natural language processing method, computer device, readable storage medium and program product
This application claims priority to Chinese patent application No. 202210911109.5, filed with the China Patent Office on July 29, 2022, and to Chinese patent application No. 202210909700.7, filed with the China Patent Office on July 29, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of natural language processing, and relates, for example, to a natural language processing method, device, readable storage medium and program product.
Background
Both the proper name recognition task and the relation extraction task belong to natural language processing. The proper name recognition task aims to extract named entities from a given sentence. In certain domains, such as social media, training data is severely insufficient because vocabulary usage changes rapidly. Traditional proper name recognition models therefore often face the problem of data sparsity and find it difficult to correctly extract entities not encountered during training. The relation extraction task aims to extract (predict), from a given sentence and two given entities, the relationship between those two entities. Understanding the meaning of an entity itself is very important for predicting its relationships; however, common methods often neglect modeling the entity itself. Moreover, due to insufficient training data, traditional relation extraction models often face the problem of sparse entity data and find it difficult to correctly extract relationships between entities not encountered during training (that is, out-of-vocabulary entities).
Summary of the Invention
This application provides a natural language processing method, device, storage medium and program product to solve the problem that traditional natural language processing methods have difficulty correctly extracting entities, or relationships between entities, that were not encountered during training.
This application provides a natural language processing method, including:
obtaining an input text and encoding the input text to obtain a latent vector of each entity in the input text;
obtaining an enhanced representation of each entity according to the latent vector of each entity; and
converting the enhanced representation of each entity to obtain a processing result.
In one embodiment, when the method is applied to proper name recognition, obtaining the enhanced representation of each entity according to the latent vector of each entity includes:
obtaining a semantically enhanced latent vector of each entity according to the latent vector of each entity and a pre-trained word vector library composed of a large number of similar words;
and converting the enhanced representation of each entity to obtain the processing result includes:
subjecting the semantically enhanced latent vector of each entity to classification conversion processing to obtain a proper name entity label corresponding to each entity.
In one embodiment, when the method is applied to extracting the relationship between two given entities in the input text, obtaining the latent vector of each entity in the input text includes:
obtaining the latent vector of each given entity;
obtaining the enhanced representation of each entity according to the latent vector of each entity includes:
calculating the latent vector of each given entity through a preset first algorithm to obtain a first entity vector representation and a second entity vector representation respectively corresponding to the two given entities, and processing the first entity vector representation and the second entity vector representation to obtain a first semantically enhanced representation and a second semantically enhanced representation;
and converting the enhanced representation of each entity to obtain the processing result includes:
calculating an intermediate vector from the first semantically enhanced representation and the second semantically enhanced representation according to a preset second algorithm; and
converting the intermediate vector to obtain a predicted relationship type between text entities.
This application further provides a computer device. The computer device includes a processor, a memory, and a computer program stored on the memory; the processor executes the computer program to implement the above natural language processing method.
This application further provides a readable storage medium on which computer program instructions are stored; when the computer program instructions are executed, the above natural language processing method is implemented.
This application further provides a computer program product, including computer program instructions; when the computer program instructions are executed, the above natural language processing method is implemented.
Brief Description of the Drawings
Figure 1 is a schematic flowchart of a natural language processing method provided by an embodiment of this application;
Figure 2 is a schematic flowchart of another natural language processing method provided by an embodiment of this application;
Figure 3 is a schematic structural diagram of a proper name recognition model provided by an embodiment of this application;
Figure 4 is a schematic flowchart of obtaining a semantically enhanced latent vector provided by an embodiment of this application;
Figure 5 is a schematic flowchart of obtaining an average vector provided by an embodiment of this application;
Figure 6 is a schematic structural diagram of another proper name recognition model provided by an embodiment of this application;
Figure 7 is a schematic flowchart of another natural language processing method provided by an embodiment of this application;
Figure 8 is a schematic module diagram of a relationship extraction model provided by an embodiment of this application;
Figure 9 is a schematic flowchart of obtaining a semantically enhanced representation provided by an embodiment of this application;
Figure 10 is a schematic flowchart of obtaining an intermediate vector provided by an embodiment of this application;
Figure 11 is a schematic flowchart of obtaining a semantically enhanced vector representation provided by an embodiment of this application;
Figure 12 is a schematic structural diagram of a computer device provided by an embodiment of this application.
Detailed Description
The present application is described below with reference to the accompanying drawings and implementation examples. The specific embodiments described here are intended only to explain the present application.
As shown in Figure 1, an embodiment of this application provides a natural language processing method, which includes: obtaining an input text and encoding the input text to obtain the latent vector of each entity in the input text; obtaining the enhanced representation of each entity according to the latent vector of each entity; and converting the enhanced representation of each entity to obtain a processing result.
In this embodiment, the enhanced representation of each entity is obtained according to its latent vector, which semantically enhances each entity and helps to understand it, thereby improving the understanding ability of the corresponding natural language processing model and improving model performance.
In one embodiment, when the above method is applied to proper name recognition, obtaining the enhanced representation of each entity according to its latent vector includes: obtaining the semantically enhanced latent vector of each entity according to its latent vector and a pre-trained word vector library composed of a large number of similar words; and converting the enhanced representation of each entity to obtain the processing result includes: subjecting the semantically enhanced latent vector of each entity to classification conversion processing to obtain the proper name entity label corresponding to each entity.
Referring to Figures 2 and 3, an embodiment of this application provides a natural language processing method for obtaining proper name entity labels, implemented through a proper name recognition model. The proper name recognition model includes an encoder 10, a neural network 11 and a decoder 12. The natural language processing method includes the following steps: obtaining an input text X through the encoder 10 and encoding the input text X to obtain the latent vector of each entity in the input text X; and obtaining the semantically enhanced latent vector of each entity according to the latent vector and a pre-trained word vector library composed of a large number of similar words. For example, the latent vector is input into the neural network 11, which contains a pre-trained word vector library composed of a large number of similar words, to obtain the semantically enhanced latent vector h′_i of each entity (for each entity in the input text X, the semantically enhanced latent vector h′_i enhances the semantic features of the word through similar words); the semantically enhanced latent vector h′_i is then subjected to the classification conversion processing of the decoder 12 to obtain the proper name entity label corresponding to each entity.
Traditional proper name recognition models often face the problem of sparse data, making it difficult to correctly extract entities not encountered during training. The neural network 11 in the method provided by this solution can use the meanings of similar words of each entity in the input text X to enhance the semantic representation of the current word, thereby strengthening the proper name recognition model's understanding of the current word's meaning. That is, if the proper name recognition model was not trained on a word, it can use the meanings of words similar to that word to help understand the current word, thereby improving the model's ability to recognize entities not seen during training.
In some embodiments, the encoder 10 adopts Bidirectional Encoder Representation from Transformers (BERT). Exemplarily, BERT is used to encode the input text X to obtain the latent vector of each entity in the input text X, where the latent vector of the i-th word x_i is denoted h_i.
Referring to Figure 4, in some embodiments, obtaining the semantically enhanced latent vector h′_i of each entity includes the following steps: finding one or more approximate words of each entity in the input text X according to a preset pre-trained word vector library; calculating the at least one approximate word of each entity according to a preset first algorithm to obtain the average vector o_i of each entity; and concatenating the average vector o_i of each entity with the latent vector h_i to obtain the semantically enhanced latent vector h′_i of each entity.
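The first of these steps, finding approximate words in a pre-trained library, is a nearest-neighbour lookup; the embodiment later describes using the cosine distance between vectors for this. A minimal sketch under that assumption, with a toy in-memory library (the words and 3-dimensional vectors are purely illustrative):

```python
import numpy as np

def nearest_words(query_vec, library, m=2):
    """Return the m library words closest to query_vec by cosine distance."""
    words = list(library)
    M = np.stack([library[w] for w in words])
    sims = (M @ query_vec) / (np.linalg.norm(M, axis=1) * np.linalg.norm(query_vec))
    order = np.argsort(-sims)  # larger cosine similarity = smaller cosine distance
    return [words[i] for i in order[:m]]

# Toy library standing in for a large pre-trained word vector library.
library = {
    "river": np.array([1.0, 0.1, 0.0]),
    "stream": np.array([0.9, 0.2, 0.0]),
    "bank": np.array([0.8, 0.0, 0.1]),
    "piano": np.array([0.0, 0.1, 1.0]),
}
print(nearest_words(np.array([1.0, 0.0, 0.0]), library, m=2))  # → ['river', 'bank']
```

A real system would replace the dictionary with the preset library (e.g. Tencent's word vectors) and an approximate-nearest-neighbour index for speed.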
The pre-trained word vector library can fully cover the approximate words of each entity in the input text X, effectively enhancing the semantic representation of the word and thereby strengthening the named entity model's understanding of the current word's meaning.
Referring to Figure 5, in some embodiments, calculating the at least one approximate word of each entity according to the preset first algorithm to obtain the average vector of each entity includes the following steps: mapping the approximate words into word vectors according to a preset word vector matrix; mapping the word vectors into key vectors and value vectors according to a preset second algorithm; and calculating the key vectors and value vectors of each entity according to a preset third algorithm to obtain the average vector of each entity. Word vectors, key vectors and value vectors are all abstractions for computing and understanding the attention mechanism, making it convenient for the computer to understand and compute the meaning of each entity and its approximate words.
In some embodiments, the neural network 11 includes a preset key matrix and a preset value matrix, and mapping the word vectors into key vectors and value vectors according to the preset second algorithm includes the following steps: passing the key matrix and a word vector into a preset activation function to obtain a key vector; and passing the value matrix and the word vector into the preset activation function to obtain a value vector. Activation functions play a very important role in enabling the neural network 11 to learn and understand highly complex and nonlinear functions. The activation function introduces nonlinearity into the neural network 11; without it, the output signal would merely be a simple linear function with less ability to learn complex function mappings from data. It can be seen that introducing the activation function into the neural network 11 improves its ability to process complex data and improves the recognition performance of the proper name recognition model.
In some embodiments, calculating the key vectors and value vectors of each entity according to the preset third algorithm to obtain the average vector of each entity includes the following steps: calculating the weight p_{i,j} of each similar word from the latent vector h_i of the entity and the key vector k_{i,j} corresponding to the similar word; and calculating the average vector o_i of the entity from the weights p_{i,j} and the value vectors v_{i,j}. In many contexts, the similar words of an entity in the input text X fit the current context to different degrees. Therefore, if the importance of these similar words in the semantic representation of the current word is not differentiated, the resulting average vector o_i, and the semantically enhanced latent vector h′_i subsequently obtained from it, will not represent the current word's meaning accurately enough. Dividing the weights of the similar words according to how well they fit the current context, and using these weights to calculate the average vector o_i and the subsequent semantically enhanced latent vector h′_i, undoubtedly improves the accuracy with which h′_i represents the word's meaning, further strengthening the proper name recognition model's understanding of the input text X and improving its performance.
In some embodiments, the semantically enhanced latent vector of each entity, obtained from the latent vector and a pre-trained word vector library composed of a large number of similar words, is obtained by inputting the latent vector into a preset key-value memory neural network. The neural network 11 includes a key-value memory neural network, which enables a machine to accept input (such as questions, puzzles, tasks, etc.) and, in response, generate output (such as answers, solutions, responses to tasks, etc.) based on information from a knowledge source. The key-value memory network model operates on symbolic memory structured as (key, value) pairs, which gives the proper name recognition model greater flexibility in encoding the input text X and helps bridge the gap between reading text directly and answering from a pre-trained word vector library. By encoding prior knowledge about the task at hand in key-value memories, key-value memory networks are versatile: for example, they can operate over documents, pre-trained word vector libraries, or pre-trained word vector libraries built using information extraction, and answer questions about them.
The method provided by the embodiments of this application is similar to the traditional approach in that entity recognition is treated as a sequence labeling task, that is, predicting the entity recognition label of each input word.
Referring to Figure 6, in the example shown, "张三" (Zhang San) is a person name (PER) and "北京海淀" (Beijing Haidian) is a place name (LOC), where "PER" and "LOC" denote label classes, short for "PERSON" and "LOCATION". Exemplarily, using a preset pre-trained word vector library (such as Tencent's 8 million word vectors), for each word x_i in the input text X, the m words closest to x_i under the cosine distance between vectors are found and recorded as s_{i,1}, …, s_{i,j}, …, s_{i,m}; using a preset word vector matrix, each similar word s_{i,j} is mapped to a word vector e_{i,j}; using the key matrix W_k and the value matrix W_v, the word vector e_{i,j} is mapped through an activation function to a key vector k_{i,j} and a value vector v_{i,j}, as follows:
k_{i,j} = ReLU(W_k · e_{i,j})
v_{i,j} = ReLU(W_v · e_{i,j})
where "·" denotes the product of a matrix and a vector, the result of which is a vector, and ReLU is an activation function. Using the hidden vector h_i obtained for x_i from the BERT encoder 10 and the key vectors k_{i,j}, the weights p_{i,j} are calculated as follows:
p_{i,j} = exp(h_i · k_{i,j}) / Σ_{j'=1}^{m} exp(h_i · k_{i,j'})    (1)
In formula (1), "·" computes the inner product of vectors, and the result is a scalar.
Based on p_{i,j}, the average vector o_i of the value vectors is calculated as follows:
o_i = Σ_{j=1}^{m} p_{i,j} · v_{i,j}
where "·" is used to compute the product of the weight p_{i,j} (a positive real number between 0 and 1) and the vector v_{i,j}. The average vector o_i is then concatenated with h_i,

[o_i ; h_i],

to obtain the output of the key-value memory neural network 11.
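The key-value memory computation above can be sketched numerically as follows. The dimensions and the random matrices standing in for W_k and W_v are placeholders chosen for illustration, not trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 4, 3                     # hidden size and number of similar words (illustrative)

h_i = rng.normal(size=d)        # hidden vector of word x_i from the BERT encoder
e = rng.normal(size=(m, d))     # word vectors e_{i,j} of the m similar words
W_k = rng.normal(size=(d, d))   # key matrix W_k (placeholder values)
W_v = rng.normal(size=(d, d))   # value matrix W_v (placeholder values)

relu = lambda x: np.maximum(x, 0.0)
k = relu(e @ W_k.T)             # key vectors:   k_{i,j} = ReLU(W_k · e_{i,j})
v = relu(e @ W_v.T)             # value vectors: v_{i,j} = ReLU(W_v · e_{i,j})

# Formula (1): weights p_{i,j} from the inner products h_i · k_{i,j}.
scores = k @ h_i
p = np.exp(scores) / np.exp(scores).sum()

# Weighted average o_i of the value vectors.
o_i = p @ v

# Concatenate o_i with h_i to form the output of the key-value memory network.
output = np.concatenate([o_i, h_i])
```

Each weight p_{i,j} lies between 0 and 1 and the weights sum to 1, so o_i is a convex combination of the value vectors, matching the description of the average vector above.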
In some embodiments, the proper name recognition model further includes a fully connected layer, and the decoder 12 is a SoftMax classifier. Classifying and converting the semantically enhanced latent vector includes the following steps: the semantically enhanced latent vector is passed through the fully connected layer and then fed into the SoftMax classifier to obtain the proper name entity label. The fully connected layer and the SoftMax classifier can visualize the weight information of the different connections contained in the semantically enhanced latent vector and match it against a preset template, making it easier to predict the relationship types between entities.
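A minimal sketch of this classification step, assuming a linear fully connected layer followed by SoftMax; the tag set and the weight values are placeholders for illustration:

```python
import numpy as np

def classify(s, W, b, labels):
    """Fully connected layer followed by SoftMax over the label set."""
    logits = W @ s + b
    probs = np.exp(logits - logits.max())   # numerically stabilized SoftMax
    probs /= probs.sum()
    return labels[int(np.argmax(probs))], probs

labels = ["O", "B-PER", "B-LOC"]            # illustrative tag set
rng = np.random.default_rng(1)
s = rng.normal(size=8)                      # semantically enhanced latent vector (toy)
W = rng.normal(size=(len(labels), 8))       # fully connected layer weights (toy)
b = np.zeros(len(labels))
tag, probs = classify(s, W, b, labels)
```

The predicted tag is the label with the largest SoftMax probability; during training the probabilities would instead feed a cross-entropy loss.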
Compared with the related art, the natural language processing method provided by the embodiments of this application achieves the following:
1. The embodiments of this application provide a natural language processing method applied to proper name recognition. The method includes the following steps: obtaining input text and encoding it to obtain the latent vector of each entity in the input text; obtaining the semantically enhanced latent vector of each entity from the latent vector and a pre-trained word vector library composed of a large number of similar words; and classifying and converting the semantically enhanced latent vector to obtain the proper name entity label corresponding to each entity. Traditional proper name recognition models often face the problem of sparse data and have difficulty correctly extracting entities not encountered during training. The neural network model in the proper name recognition method provided by this solution can use the meanings of words similar to each entity in the input text to enhance the semantic representation of the current word, thereby improving the proper name recognition model's understanding of the current word's meaning. That is, if the proper name recognition model has not seen a word during training, it can use the meanings of similar words to help understand the current word, thereby improving its ability to recognize entities not seen in training.
2. In the natural language processing method provided by the embodiments of this application, obtaining the semantically enhanced latent vector of each entity includes the following steps: finding one or more approximate words for each entity in the input text according to a preset pre-trained word vector library; calculating the average vector of each entity from its at least one approximate word according to a preset first algorithm; and concatenating the average vector with the latent vector to obtain the semantically enhanced latent vector. The pre-trained word vector library can fully cover the approximate words of each entity in the input text, effectively enhancing the semantic representation of the word and thus strengthening the named entity model's understanding of the current word's meaning.
3. In the natural language processing method provided by the embodiments of this application, calculating the average vector of each entity from its at least one approximate word according to the preset first algorithm includes the following steps: mapping the approximate words to word vectors according to a preset word vector matrix; mapping the word vectors to key vectors and value vectors according to a preset second algorithm; and calculating the average vector from the key vectors and value vectors of each entity according to a preset third algorithm. Word vectors, key vectors, and value vectors are all abstractions used to compute and understand the attention mechanism, making it easier for a computer to understand and compute the meaning of each entity and its approximate words.
4. In the natural language processing method provided by the embodiments of this application, mapping the word vectors to key vectors and value vectors according to the preset second algorithm includes the following steps: passing the preset key matrix and the word vector into a preset activation function to obtain the key vector; and passing the preset value matrix and the word vector into the preset activation function to obtain the value vector. Activation functions play a very important role in enabling neural networks to learn and understand highly complex, nonlinear functions. The activation function introduces nonlinearity into the neural network; without it, the output would merely be a simple linear function with far less capacity to learn complex function mappings from data. Introducing the activation function into the neural network therefore improves its ability to process complex data and improves the recognition performance of the proper name recognition model.
5. In the natural language processing method provided by the embodiments of this application, calculating the average vector from the key vectors and value vectors of each entity according to the preset third algorithm includes the following steps: calculating the weight of each similar word from the latent vector of each entity and the key vector corresponding to that similar word; and calculating the average vector of each entity from the weights and the value vectors. In many contexts, the similar words of an entity in the input text fit the current context to different degrees. If the importance of these similar words in the semantic representation of the current word were not differentiated, the resulting average vector, and the semantically enhanced latent vector subsequently derived from it, would not represent the meaning of the current word accurately enough. Dividing the weights of the similar words according to how well each fits the current context, and using these weights to calculate the average vector of each entity and the subsequent semantically enhanced latent vector, improves the accuracy with which the semantically enhanced latent vector represents word meaning, further strengthening the proper name recognition model's understanding of the input text and improving its performance.
6. In the natural language processing method provided by the embodiments of this application, classifying and converting the semantically enhanced latent vector includes the following steps: passing the semantically enhanced latent vector through a preset fully connected layer and then feeding it into a preset SoftMax classifier to obtain the proper name entity label. The fully connected layer and the SoftMax classifier can visualize the weight information of the different connections contained in the semantically enhanced latent vector and match it against a preset template, making it easier to predict the relationship types between entities.
7. In the natural language processing method provided by the embodiments of this application, the semantically enhanced latent vector of each entity is obtained from the latent vector and a pre-trained word vector library composed of a large number of similar words by feeding the latent vector into a preset key-value memory neural network. A key-value memory neural network enables a machine to accept an input (for example, a question, a puzzle, or a task) and, in response, generate an output (for example, an answer, a solution, or a response to the task) based on information from a knowledge source. The key-value memory network model operates on symbolic memory structured as (key, value) pairs, which gives the proper name recognition model greater flexibility in encoding the input text and helps to bridge the gap between reading text directly and answering from a pre-trained word vector library. By encoding prior knowledge about the task at hand in the key-value memory, a key-value memory network has the versatility to analyze, for example, documents, pre-trained word vector libraries, or pre-trained word vector libraries built using information extraction, and to answer questions about them.
In one embodiment, when the above method is applied to extracting the relationship between two given entities in the input text, obtaining the latent vector of each entity in the input text includes: obtaining the latent vector of each given entity. Obtaining the enhanced representation of each entity from its latent vector includes: calculating the latent vector of each given entity through a preset first algorithm to obtain a first entity vector representation and a second entity vector representation corresponding to the two given entities respectively; and processing the first entity vector representation and the second entity vector representation to obtain a first semantically enhanced representation and a second semantically enhanced representation. Converting the enhanced representation of each entity to obtain a processing result includes: calculating an intermediate vector from the first semantically enhanced representation and the second semantically enhanced representation according to a preset second algorithm; and converting the intermediate vector to obtain the predicted relationship type between the text entities.
Please refer to Figures 7 and 8. The embodiments of this application provide a natural language processing method applied to extracting the relationship between two given entities in the input text. It is implemented through a relation extraction model used to extract relationships between entities; the relation extraction model includes an encoder 20, a decoder 21, and a semantic enhancement module 22 based on the attention mechanism. The natural language processing method includes the following steps: obtaining the input text, passing it to the encoder 20 for encoding, and outputting the latent vector of each given entity in the input text; calculating the latent vector of each given entity through a preset first algorithm to obtain a first entity vector representation and a second entity vector representation corresponding to the two given entities respectively; inputting the first entity vector representation and the second entity vector representation into the semantic enhancement module 22 to obtain a first semantically enhanced representation and a second semantically enhanced representation; calculating an intermediate vector from the first semantically enhanced representation and the second semantically enhanced representation through a preset second algorithm; and converting the intermediate vector and decoding it through the decoder 21 to obtain the predicted relationship type between the text entities.
Due to insufficient training data, traditional relation extraction models often face the problem of sparse entity data and have difficulty correctly extracting entities not encountered during training. In the method provided by this solution, the semantic enhancement module 22 of the relation extraction model semantically enhances each entity in the input text, using the semantics of similar entities to enrich that entity's semantic representation and thereby strengthening the relation extraction model's understanding of the current entity's semantics. That is, if the relation extraction model has not seen an entity during training, it can use the semantics of similar entities to help understand the current entity, thereby improving its ability to understand out-of-vocabulary entities and, in turn, the performance of relation extraction.
In some embodiments, the first algorithm is the Max Pooling algorithm.
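Max Pooling here can be read as taking, for each dimension, the maximum over the hidden vectors of the words that make up an entity. A minimal sketch with toy three-dimensional vectors:

```python
import numpy as np

def max_pool(hidden_vectors):
    """Element-wise maximum over the hidden vectors of an entity's words."""
    return np.max(np.stack(hidden_vectors), axis=0)

# An entity spanning two words, each with a 3-dimensional hidden vector.
h1 = np.array([0.2, -1.0, 0.5])
h2 = np.array([-0.3, 0.4, 0.1])
h_E = max_pool([h1, h2])   # → array([0.2, 0.4, 0.5])
```

This yields a single fixed-size vector per entity regardless of how many words the entity spans.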
Please refer to Figure 9. In some embodiments, obtaining the first semantically enhanced representation and the second semantically enhanced representation includes the following steps: finding one or more approximate entities for each given entity in the input text according to a preset pre-trained word vector library; and calculating, from the approximate entities of the two given entities according to a preset third algorithm together with the first entity vector representation and the second entity vector representation, the first semantically enhanced representation and the second semantically enhanced representation corresponding to the two given entities in the input text.
The pre-trained word vector library can fully cover the approximate entities of each entity in the input text, effectively enhancing the semantic representation of the entity and thus strengthening the relation extraction model's understanding of the current entity's semantics.
Please refer to Figure 10. In some embodiments, the intermediate vector is obtained by calculating the first semantically enhanced representation and the second semantically enhanced representation according to the preset second algorithm as follows: concatenating the first entity vector representation with the first semantically enhanced representation to obtain a first enhanced vector representation; concatenating the second entity vector representation with the second semantically enhanced representation to obtain a second enhanced vector representation; and concatenating the first enhanced vector representation with the second enhanced vector representation to obtain the intermediate vector.
The first entity vector representation captures the semantics of the first entity itself, and the first semantically enhanced representation captures the semantics of similar entities. The first enhanced vector representation obtained by concatenating the two combines the semantics of the entity and its similar entities (and likewise for the second enhanced vector representation), extending the semantics of the original entity so that the computer can understand it better and subsequent computation is easier.
Please refer to Figure 11. In some embodiments, calculating the first and second semantically enhanced representations from the approximate entities of the two given entities, according to the preset third algorithm together with the first and second entity vector representations, includes the following steps: mapping the approximate entities to word vectors through a preset word vector matrix; calculating the weight of each approximate entity from the first entity vector representation (or the second entity vector representation) and the word vectors of the one or more approximate entities of the current entity; and calculating the first semantically enhanced representation (or the second semantically enhanced representation) from the weights and the word vectors.
Calculating weights over the semantics of the different approximate entities identifies and exploits their relative importance, effectively preventing potential noise among similar entities from degrading model performance and thereby improving relation extraction.
In some embodiments, classifying and converting the semantically enhanced latent vector includes the following steps: passing the intermediate vector through a preset fully connected layer and then feeding it into the SoftMax classifier to obtain the predicted relationship type. The fully connected layer and the SoftMax classifier can visualize the weight information of the different connections contained in the semantically enhanced latent vector and match it against a preset template, making it easier to predict the relationship types between entities.
Illustratively, please refer to Figure 6: the two given entities in the input are "食物工厂" (food factory) and "水果罐头" (canned fruit). The model uses a standard encoder-decoder architecture: the encoder 20 uses BERT, and the decoder 21 uses SoftMax.
In the first step, BERT is used to encode the input text to obtain the latent vector of each entity, where the hidden vector of the i-th word x_i is recorded as h_i.
In the second step, the Max Pooling algorithm is used to calculate the vector representations h_E1 and h_E2 of the two entities (E1 and E2 denote the two entities respectively), taking the element-wise maximum over the hidden vectors of the words that make up each entity:

h_E1 = MaxPooling({h_i : x_i ∈ E1})
h_E2 = MaxPooling({h_i : x_i ∈ E2})
In the third step, h_E1 and h_E2 are fed into the semantic enhancement module 22 to obtain the semantically enhanced representations a_1 and a_2 corresponding to E1 and E2 respectively.
In the fourth step, h_E1 is concatenated with a_1 to obtain the first enhanced vector representation:

o_1 = [h_E1 ; a_1]

Similarly, the second enhanced vector representation o_2 can be obtained.
In the fifth step, o_1 and o_2 are concatenated to obtain the intermediate vector:

o = [o_1 ; o_2]
In the sixth step, o is passed through a fully connected layer and then fed into the SoftMax classifier to obtain the predicted relationship type.
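The six steps above can be sketched end to end with placeholder vectors; in the real model h_E1 and h_E2 would come from Max Pooling over BERT hidden states and a_1, a_2 from the semantic enhancement module 22, and the relation label set below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4                                     # hidden size (illustrative)

# Step 2: entity representations (stand-ins for Max Pooling over BERT outputs).
h_E1, h_E2 = rng.normal(size=d), rng.normal(size=d)

# Step 3: semantically enhanced representations (stand-ins for module 22's output).
a_1, a_2 = rng.normal(size=d), rng.normal(size=d)

# Steps 4-5: concatenate into the enhanced representations and the intermediate vector.
o_1 = np.concatenate([h_E1, a_1])
o_2 = np.concatenate([h_E2, a_2])
o = np.concatenate([o_1, o_2])            # size 4d

# Step 6: fully connected layer + SoftMax over a placeholder set of relation types.
relations = ["produces", "located-in", "no-relation"]
W = rng.normal(size=(len(relations), 4 * d))
logits = W @ o
probs = np.exp(logits - logits.max())
probs /= probs.sum()
predicted = relations[int(np.argmax(probs))]
```

The concatenations are the only structural operations between the encoder and the classifier, which keeps the pipeline simple: all semantic enrichment happens inside module 22.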
The processing flow of the semantic enhancement module 22 is as follows:
In the first step, using a pre-trained entity vector library (for example, Tencent's 8-million-word vector library), for each entity E_i in the input text (where i = 1 or 2), the m entities closest to E_i by cosine distance between entity vectors are found and recorded as c_{i,1}, …, c_{i,j}, …, c_{i,m}.
In the second step, the word vector matrix is used to map c_{i,j} to the word vector e_{i,j}.
In the third step, the vector representation h_Ei of E_i and the word vectors e_{i,j} are used to calculate the weights p_{i,j} as follows:

p_{i,j} = exp(h_Ei · e_{i,j}) / Σ_{j'=1}^{m} exp(h_Ei · e_{i,j'})
where "·" is used to compute the inner product of vectors.
In the fourth step, based on the weights p_{i,j}, the average vector a_i of the word vectors e_{i,j} is calculated as follows:

a_i = Σ_{j=1}^{m} p_{i,j} · e_{i,j}
where "·" is used to compute the product of the weight (a positive real number between 0 and 1) and the word vector e_{i,j}.
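These four steps amount to attention over the embeddings of the similar entities. A minimal sketch with toy two-dimensional vectors (the vectors are invented for the example):

```python
import numpy as np

def semantic_enhancement(h_Ei, e):
    """Return the weighted average a_i of similar-entity word vectors e_{i,j},
    with weights p_{i,j} from inner products against the representation h_Ei."""
    scores = e @ h_Ei                          # inner products h_Ei · e_{i,j}
    p = np.exp(scores) / np.exp(scores).sum()  # weights p_{i,j}, summing to 1
    a_i = p @ e                                # a_i = sum_j p_{i,j} · e_{i,j}
    return a_i, p

h_Ei = np.array([1.0, 0.0])       # toy entity representation
e = np.array([[0.9, 0.1],         # similar entity close to h_Ei: large weight
              [-0.8, 0.5]])       # dissimilar entity: small weight
a_i, p = semantic_enhancement(h_Ei, e)
```

Because the weights come from inner products with the entity's own representation, embeddings that agree with the entity's context dominate a_i while dissimilar ones are suppressed, which is how the module limits noise from loosely related entities.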
In some embodiments, converting the intermediate vector to obtain the predicted relationship type between text entities includes the following steps: passing the intermediate vector through a preset fully connected layer and then feeding it into the SoftMax classifier to obtain the predicted relationship type. The fully connected layer and the SoftMax classifier can visualize the weight information of the different connections contained in the semantically enhanced latent vector and match it against a preset template, making it easier to predict the relationship types between entities.
Compared with the related art, the natural language processing method provided by this application achieves the following:
1. The embodiments of this application provide a natural language processing method applied to extracting the relationship between two given entities in the input text, implemented through a relation extraction model, including: obtaining the input text and encoding it to obtain the latent vector of each given entity in the input text; calculating the latent vector of each given entity through a preset first algorithm to obtain a first entity vector representation and a second entity vector representation corresponding to the two given entities respectively; processing the first entity vector representation and the second entity vector representation to obtain a first semantically enhanced representation and a second semantically enhanced representation; calculating an intermediate vector from the first semantically enhanced representation and the second semantically enhanced representation through a preset second algorithm; and converting the intermediate vector to obtain the predicted relationship type between the text entities. Due to insufficient training data, traditional relation extraction models often face the problem of sparse entity data and have difficulty correctly extracting entities not encountered during training. In the relation extraction method provided by this solution, the semantic enhancement module of the relation extraction model semantically enhances each entity in the input text, using the semantics of similar entities to enrich that entity's semantic representation and thereby strengthening the relation extraction model's understanding of the current entity's semantics. That is, if the relation extraction model has not seen an entity during training, it can use the semantics of similar entities to help understand the current entity, thereby improving its ability to understand out-of-vocabulary entities and, in turn, the performance of relation extraction.
2. In the natural language processing method provided by the embodiments of this application, obtaining the first semantically enhanced representation and the second semantically enhanced representation includes the following steps: finding one or more approximate entities for each given entity in the input text according to a preset pre-trained word vector library; and calculating, from the approximate entities of the two given entities according to a preset third algorithm together with the first and second entity vector representations, the first and second semantically enhanced representations corresponding to the two given entities in the input text. The pre-trained word vector library can fully cover the approximate entities of each entity in the input text, effectively enhancing the semantic representation of the entity and thus strengthening the relation extraction model's understanding of the current entity's semantics.
3. In the natural language processing method provided by the embodiments of this application, calculating the intermediate vector from the first and second semantically enhanced representations according to the preset second algorithm includes: concatenating the first entity vector representation with the first semantically enhanced representation to obtain a first enhanced vector representation; concatenating the second entity vector representation with the second semantically enhanced representation to obtain a second enhanced vector representation; and concatenating the first enhanced vector representation with the second enhanced vector representation to obtain the intermediate vector. The first entity vector representation captures the semantics of the first entity itself, and the first semantically enhanced representation captures the semantics of similar entities; the first enhanced vector representation obtained by concatenating the two combines the semantics of the entity and its similar entities (and likewise for the second enhanced vector representation), extending the semantics of the original entity so that the computer can understand it better and subsequent computation is easier.
4. In the natural language processing method provided by the embodiments of this application, calculating the first and second semantically enhanced representations from the approximate entities of the two given entities, according to the preset third algorithm together with the first and second entity vector representations, includes: mapping the approximate entities to word vectors through a preset word vector matrix; calculating the weight of each approximate entity from the first entity vector representation (or the second entity vector representation) and the word vectors of the one or more approximate entities of the current entity; and calculating the first semantically enhanced representation (or the second semantically enhanced representation) from the weights and the word vectors. Calculating weights over the semantics of the different approximate entities identifies and exploits their relative importance, effectively preventing potential noise among similar entities from degrading model performance and thereby improving relation extraction.
5. In a natural language processing method provided by an embodiment of the present application, converting the intermediate vector to obtain the predicted relationship type between text entities includes the following step: passing the intermediate vector through a preset fully connected layer and then feeding the result into a SoftMax classifier to obtain the predicted relationship type. The fully connected layer and the SoftMax classifier make visible the weight information of the different connections contained in the semantically enhanced latent vector and match it against preset templates, so that the relationship type between entities can be predicted more conveniently.
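A minimal sketch of this classification step, assuming a single dense layer with 5 hypothetical relation types; the weight matrix, bias, and dimensions are placeholders rather than the trained parameters described in the application:

```python
import numpy as np

def predict_relation(intermediate, weight_matrix, bias):
    """Fully connected layer followed by SoftMax; the index of the
    largest probability is taken as the predicted relationship type."""
    logits = weight_matrix @ intermediate + bias
    probs = np.exp(logits - logits.max())   # numerically stable SoftMax
    probs /= probs.sum()
    return int(np.argmax(probs)), probs

rng = np.random.default_rng(1)
v = rng.normal(size=16)                       # intermediate vector (toy)
W, b = rng.normal(size=(5, 16)), np.zeros(5)  # 5 hypothetical relation types
label, probs = predict_relation(v, W, b)
print(label, probs.shape)
```

In a trained system the rows of the weight matrix play the role of the "preset templates": each row scores how well the intermediate vector matches one relationship type.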
Referring to Figure 12, an embodiment of the present application further provides a computer device, including a processor 30, a memory 31, and a computer program stored on the memory 31; the processor 30 executes the computer program to implement the above method. The computer device has the same effects as the natural language processing method described above, which are not repeated here.
An embodiment of the present application further provides a readable storage medium on which computer program instructions are stored; when the computer program instructions are executed, the above method is implemented. The readable storage medium has the same effects as the natural language processing method described above, which are not repeated here.
An embodiment of the present application further provides a program product that includes computer program instructions; when the computer program instructions are executed, the above method is implemented. The program product has the same effects as the natural language processing method described above, which are not repeated here.
In the embodiments provided in this application, "B corresponding to A" means that B is associated with A, and B can be determined based on A. It should also be understood, however, that determining B based on A does not mean determining B based on A alone; B may also be determined based on A and/or other information.
Reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Therefore, appearances of "in one embodiment" or "in an embodiment" in various places herein do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. The embodiments described herein are all optional embodiments, and the actions and modules involved are not necessarily required by the present application.
In the various embodiments of the present application, the sequence numbers of the above processes do not necessarily indicate their order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
The flowcharts and block diagrams in the drawings of this application illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. Each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.

Claims (18)

  1. A natural language processing method, comprising:
    obtaining an input text and encoding the input text to obtain a latent vector of each entity in the input text;
    obtaining an enhanced representation of each entity according to the latent vector of the entity;
    converting the enhanced representation of each entity to obtain a processing result.
  2. The method according to claim 1, wherein, when the method is applied to proper name recognition, obtaining the enhanced representation of each entity according to the latent vector of the entity comprises:
    obtaining a semantically enhanced latent vector of each entity according to the latent vector of the entity and a pre-trained word vector library composed of similar words;
    and converting the enhanced representation of each entity to obtain the processing result comprises:
    subjecting the semantically enhanced latent vector of each entity to classification conversion to obtain a proper name entity label corresponding to the entity.
  3. The method according to claim 2, wherein obtaining the semantically enhanced latent vector of each entity according to the latent vector of the entity and the pre-trained word vector library composed of similar words comprises:
    obtaining at least one approximate word for each entity in the input text according to the pre-trained word vector library;
    computing an average vector of each entity from the at least one approximate word of the entity according to a preset first algorithm;
    concatenating the average vector of each entity with the latent vector of the entity to obtain the semantically enhanced latent vector of the entity.
  4. The method according to claim 3, wherein computing the average vector of each entity from the at least one approximate word of the entity according to the preset first algorithm comprises:
    mapping each approximate word to a word vector according to a preset word vector matrix;
    mapping the word vector to a key vector and a value vector according to a preset second algorithm;
    computing the average vector of each entity from the key vector and the value vector of the entity according to a preset third algorithm.
  5. The method according to claim 4, wherein mapping the word vector to the key vector and the value vector according to the preset second algorithm comprises:
    passing a preset key matrix and the word vector into a preset activation function to obtain the key vector;
    passing a preset value matrix and the word vector into a preset activation function to obtain the value vector.
  6. The method according to claim 4, wherein computing the average vector of each entity from the key vector and the value vector of the entity according to the preset third algorithm comprises:
    computing a weight of each similar word of each entity from the latent vector of the entity and the key vector corresponding to the similar word;
    computing the average vector of each entity from the weights and the value vectors of the similar words of the entity.
  7. The method according to claim 2, wherein subjecting the semantically enhanced latent vector of each entity to classification conversion to obtain the proper name entity label corresponding to the entity comprises:
    inputting the semantically enhanced latent vector of each entity into a preset fully connected layer to obtain an output of the preset fully connected layer, and inputting the output of the preset fully connected layer into a preset SoftMax classifier to obtain the proper name entity label of the entity.
  8. The method according to claim 2, wherein obtaining the semantically enhanced latent vector of each entity according to the latent vector and the pre-trained word vector library composed of similar words comprises:
    inputting the latent vector into a preset key-value memory neural network to obtain the semantically enhanced latent vector of each entity.
  9. The method according to claim 1, wherein, when the method is applied to extracting a relationship between two given entities in the input text, obtaining the latent vector of each entity in the input text comprises:
    obtaining the latent vector of each given entity;
    obtaining the enhanced representation of each entity according to the latent vector of the entity comprises:
    computing the latent vector of each given entity through a preset first algorithm to obtain a first entity vector representation and a second entity vector representation respectively corresponding to the two given entities;
    processing the first entity vector representation and the second entity vector representation to obtain a first semantic enhancement representation and a second semantic enhancement representation;
    and converting the enhanced representation of each entity to obtain the processing result comprises:
    computing an intermediate vector from the first semantic enhancement representation and the second semantic enhancement representation according to a preset second algorithm;
    converting the intermediate vector to obtain a predicted relationship type between text entities.
  10. The method according to claim 9, wherein processing the first entity vector representation and the second entity vector representation to obtain the first semantic enhancement representation and the second semantic enhancement representation comprises:
    obtaining at least one approximate entity for each given entity in the input text according to a preset pre-trained word vector library;
    computing the approximate entities of the two given entities according to a preset third algorithm, the first entity vector representation, and the second entity vector representation to obtain the first semantic enhancement representation and the second semantic enhancement representation respectively corresponding to the two given entities.
  11. The method according to claim 9, wherein computing the intermediate vector from the first semantic enhancement representation and the second semantic enhancement representation according to the preset second algorithm comprises:
    concatenating the first entity vector representation with the first semantic enhancement representation to obtain a first enhanced vector representation;
    concatenating the second entity vector representation with the second semantic enhancement representation to obtain a second enhanced vector representation;
    concatenating the first enhanced vector representation with the second enhanced vector representation to obtain the intermediate vector.
  12. The method according to claim 10, wherein computing the approximate entities of the two given entities according to the preset third algorithm, the first entity vector representation, and the second entity vector representation to obtain the first semantic enhancement representation and the second semantic enhancement representation respectively corresponding to the two given entities comprises:
    mapping the approximate entities to word vectors through a preset word vector matrix;
    computing the weight of each approximate entity from the first entity vector representation and the word vector of at least one approximate entity of the current entity, and computing the first semantic enhancement representation from the weights and the word vectors;
    computing the weight of each approximate entity from the second entity vector representation and the word vector of at least one approximate entity of the current entity, and computing the second semantic enhancement representation from the weights and the word vectors.
  13. The method according to claim 9, wherein the first algorithm is a Max Pooling algorithm.
  14. The method according to claim 9, wherein converting the intermediate vector to obtain the predicted relationship type between text entities comprises:
    inputting the intermediate vector into a preset fully connected layer to obtain an output of the preset fully connected layer, and inputting the output of the preset fully connected layer into a preset SoftMax classifier for classification to obtain the relationship type.
  15. The method according to claim 12, wherein the weight is calculated by the following formula:
    p_{i,j} = exp(h_{Ei} · e_{i,j}) / Σ_{j'=1}^{m} exp(h_{Ei} · e_{i,j'})
    where p_{i,j} denotes the weight, h_{Ei} denotes the first entity vector representation or the second entity vector representation, e_{i,j} denotes the word vector, i = 1 or 2, m denotes the number of approximate entities of an entity, and j ∈ (1, m).
  16. A computer device, comprising a processor, a memory, and a computer program stored on the memory, wherein the processor executes the computer program to implement the natural language processing method according to any one of claims 1-15.
  17. A readable storage medium storing computer program instructions which, when executed, implement the natural language processing method according to any one of claims 1-15.
  18. A program product, comprising computer program instructions which, when executed, implement the natural language processing method according to any one of claims 1-15.
PCT/CN2022/128622 2022-07-29 2022-10-31 Natural language processing method, computer device, readable storage medium, and program product WO2024021343A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202210911109.5A CN115270812A (en) 2022-07-29 2022-07-29 Relationship extraction method, computer device readable storage medium and program product
CN202210909700.7A CN115329764A (en) 2022-07-29 2022-07-29 Proper name recognition method, computer equipment, readable storage medium and program product
CN202210911109.5 2022-07-29
CN202210909700.7 2022-07-29

Publications (1)

Publication Number Publication Date
WO2024021343A1 true WO2024021343A1 (en) 2024-02-01

Family

ID=89705143

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/128622 WO2024021343A1 (en) 2022-07-29 2022-10-31 Natural language processing method, computer device, readable storage medium, and program product

Country Status (1)

Country Link
WO (1) WO2024021343A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111274394A (en) * 2020-01-16 2020-06-12 重庆邮电大学 Method, device and equipment for extracting entity relationship and storage medium
CN111291565A (en) * 2020-01-17 2020-06-16 创新工场(广州)人工智能研究有限公司 Method and device for named entity recognition
CN111597341A (en) * 2020-05-22 2020-08-28 北京慧闻科技(集团)有限公司 Document level relation extraction method, device, equipment and storage medium
CN113204618A (en) * 2021-04-30 2021-08-03 平安科技(深圳)有限公司 Information identification method, device and equipment based on semantic enhancement and storage medium
WO2022005188A1 (en) * 2020-07-01 2022-01-06 Samsung Electronics Co., Ltd. Entity recognition method, apparatus, electronic device and computer readable storage medium


Similar Documents

Publication Publication Date Title
WO2022022163A1 (en) Text classification model training method, device, apparatus, and storage medium
CN111488739A (en) Implicit discourse relation identification method based on multi-granularity generated image enhancement representation
CN111324696B (en) Entity extraction method, entity extraction model training method, device and equipment
WO2021174922A1 (en) Statement sentiment classification method and related device
CN114139551A (en) Method and device for training intention recognition model and method and device for recognizing intention
WO2023020522A1 (en) Methods for natural language processing and training natural language processing model, and device
CN116304748B (en) Text similarity calculation method, system, equipment and medium
CN113887229A (en) Address information identification method and device, computer equipment and storage medium
CN113282714B (en) Event detection method based on differential word vector representation
CN111368066B (en) Method, apparatus and computer readable storage medium for obtaining dialogue abstract
CN116579345B (en) Named entity recognition model training method, named entity recognition method and named entity recognition device
CN115759092A (en) Network threat information named entity identification method based on ALBERT
WO2023134085A1 (en) Question answer prediction method and prediction apparatus, electronic device, and storage medium
WO2022228127A1 (en) Element text processing method and apparatus, electronic device, and storage medium
CN115718792A (en) Sensitive information extraction method based on natural semantic processing and deep learning
CN115934883A (en) Entity relation joint extraction method based on semantic enhancement and multi-feature fusion
CN112256765A (en) Data mining method, system and computer readable storage medium
WO2024021343A1 (en) Natural language processing method, computer device, readable storage medium, and program product
CN115659242A (en) Multimode emotion classification method based on mode enhanced convolution graph
CN113157866B (en) Data analysis method, device, computer equipment and storage medium
CN114662499A (en) Text-based emotion recognition method, device, equipment and storage medium
CN115221284A (en) Text similarity calculation method and device, electronic equipment and storage medium
CN114925695A (en) Named entity identification method, system, equipment and storage medium
CN114722818A (en) Named entity recognition model based on anti-migration learning
CN113705197A (en) Fine-grained emotion analysis method based on position enhancement

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22952781

Country of ref document: EP

Kind code of ref document: A1