WO2024021343A1 - Natural language processing method, computer device, readable storage medium, and program product - Google Patents


Info

Publication number
WO2024021343A1
WO2024021343A1 · PCT/CN2022/128622 · CN2022128622W
Authority
WO
WIPO (PCT)
Prior art keywords
entity
vector
representation
preset
word
Prior art date
Application number
PCT/CN2022/128622
Other languages
French (fr)
Chinese (zh)
Inventor
宋彦
田元贺
李世鹏
Original Assignee
苏州思萃人工智能研究所有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202210911109.5A external-priority patent/CN115270812A/en
Priority claimed from CN202210909700.7A external-priority patent/CN115329764A/en
Application filed by 苏州思萃人工智能研究所有限公司 filed Critical 苏州思萃人工智能研究所有限公司
Publication of WO2024021343A1 publication Critical patent/WO2024021343A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology

Definitions

  • This application relates to the technical field of natural language processing, for example, to a natural language processing method, device, readable storage medium and program product.
  • the proper name recognition task aims to extract nominal entities from a given sentence.
  • the relationship extraction task aims to extract (predict), given a sentence and two entities in it, the relationship between the two entities. Understanding the meaning of an entity itself is very important for predicting its relationships; however, general methods often neglect modeling the entity itself.
  • traditional relationship extraction models often face the problem of sparse entity data, making it difficult to correctly extract relationships between entities not encountered during training (that is, unlogged, or out-of-vocabulary, entities).
  • This application provides a natural language processing method, equipment, storage medium and program product to solve the problem that traditional natural language processing methods are difficult to correctly extract entities or relationships between entities that have not been encountered during training.
  • This application provides a natural language processing method, including:
  • the enhanced representation of each entity is transformed and processed to obtain the processing result.
  • obtaining an enhanced representation of each entity based on its latent vector includes:
  • the enhanced representation of each entity is converted and processed to obtain processing results, including:
  • the semantically enhanced latent vector of each entity is subjected to classification conversion processing to obtain the proper name entity label corresponding to each entity.
  • obtaining the latent vector of each entity in the input text includes:
  • Obtaining the enhanced representation of each entity based on the latent vector of each entity includes:
  • the enhanced representation of each entity is converted and processed to obtain processing results, including:
  • the intermediate vector is converted to obtain the predicted relationship type between text entities.
  • the computer device includes a processor, a memory, and a computer program stored on the memory.
  • the processor executes the above computer program to implement the above natural language processing method.
  • This application also provides a readable storage medium on which computer program instructions are stored. When the computer program instructions are executed, the above natural language processing method is implemented.
  • This application also provides a computer program product, which includes computer program instructions. When the computer program instructions are executed, the above-mentioned natural language processing method is implemented.
  • Figure 1 is a schematic flowchart of a natural language processing method provided by an embodiment of the present application
  • FIG. 2 is a schematic flowchart of another natural language processing method provided by an embodiment of the present application.
  • Figure 3 is a schematic structural diagram of a proper name recognition model provided by an embodiment of the present application.
  • Figure 4 is a schematic flowchart of obtaining semantically enhanced latent vectors provided by an embodiment of the present application.
  • Figure 5 is a schematic flowchart of obtaining an average vector provided by an embodiment of the present application.
  • Figure 6 is a schematic structural diagram of another proper name recognition model provided by an embodiment of the present application.
  • Figure 7 is a schematic flow chart of another natural language processing method provided by an embodiment of the present application.
  • Figure 8 is a schematic module diagram of a relationship extraction model provided by an embodiment of the present application.
  • Figure 9 is a schematic flowchart of obtaining semantic enhanced representation provided by an embodiment of the present application.
  • Figure 10 is a schematic flowchart of obtaining an intermediate vector provided by an embodiment of the present application.
  • Figure 11 is a schematic flowchart of obtaining semantically enhanced vector representation provided by an embodiment of the present application.
  • Figure 12 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • an embodiment of the present application provides a natural language processing method, which includes: obtaining input text and encoding the input text to obtain the latent vector of each entity in the input text; obtaining the enhanced representation of each entity according to the latent vector; and converting the enhanced representation of each entity to obtain the processing result.
  • an enhanced representation of the entity is obtained based on the latent vector of each entity.
  • Semantic enhancement can be performed on each entity to help understand the entity, thereby improving the understanding ability of the corresponding natural language processing model and improving model performance.
  • the enhanced representation of each entity is obtained based on the latent vector of each entity and a preset pre-trained word vector library composed of a large number of similar words.
  • the embodiment of the present application provides a natural language processing method for obtaining proper name entity tags, which is implemented through a proper name recognition model.
  • the proper name recognition model includes an encoder 10, a neural network 11 and a decoder 12.
  • the natural language processing method includes the following steps: obtaining the input text X through the encoder 10 and encoding the input text X to obtain the latent vector of each entity in the input text X;
  • obtaining the semantically enhanced latent vector of each entity from the word vector library;
  • the latent vector is input to the neural network 11, which contains a pre-trained word vector library composed of a large number of similar words, to obtain the semantically enhanced latent vector h′_i of each entity (for each entity in the input text X, the semantically enhanced latent vector h′_i enhances the semantic features of the word through its similar words); the semantically enhanced latent vector h′_i is then subjected to classification conversion processing by the decoder 12 to obtain the proper name entity label corresponding to each entity.
  • the neural network 11 in the method provided by this solution can use the meanings of the similar words of each entity in the input text X to enhance the semantic representation of the current word, thereby enhancing the proper name recognition model's understanding of the meaning of the current word. That is, if the proper name recognition model has not seen a word during training, the meanings of words similar to that word can help it understand the current word, thereby improving the proper name recognition model's ability to recognize entities not encountered during training.
  • the encoder 10 adopts Bidirectional Encoder Representation from Transformers (BERT).
  • BERT is used to encode the input text X to obtain the latent vector of each entity in the input text X.
  • the hidden vector of the i-th word x_i is recorded as h_i.
  • obtaining the semantically enhanced latent vector h′_i of each entity includes the following steps: finding one or more approximate words of each entity in the input text X according to a preset pre-trained word vector library; calculating at least one approximate word of each entity according to a preset first algorithm to obtain the average vector o_i of each entity; and concatenating the average vector o_i of each entity with the hidden vector h_i to obtain the semantically enhanced latent vector h′_i of each entity.
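The first step above, retrieving approximate words from a pre-trained word vector library, can be sketched as a nearest-neighbour lookup by cosine similarity. The tiny embedding table and the top-k cutoff below are illustrative assumptions, not the library or parameters used in the application.

```python
import numpy as np

# Hypothetical stand-in for a pre-trained word vector library
# (the application mentions libraries such as Tencent's 8 million word vectors).
word_vectors = {
    "factory": np.array([0.9, 0.1, 0.0]),
    "plant":   np.array([0.8, 0.2, 0.1]),
    "fruit":   np.array([0.1, 0.9, 0.2]),
    "apple":   np.array([0.2, 0.8, 0.3]),
}

def approximate_words(query_vec, library, top_k=2):
    """Return the top_k library words most similar to query_vec by cosine."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(library, key=lambda w: cos(query_vec, library[w]), reverse=True)
    return ranked[:top_k]

# A query vector close to "factory" should retrieve factory-like neighbours.
neighbours = approximate_words(np.array([0.85, 0.15, 0.05]), word_vectors)
```

The retrieved neighbours' vectors then feed the first algorithm that produces the average vector o_i.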
  • the pre-trained word vector library can fully cover the approximate words of each entity in the input text X, effectively enhancing the semantic representation of the word, thereby strengthening the named entity model's understanding of the meaning of the current word.
  • calculating at least one approximate word of each entity according to a preset first algorithm to obtain the average vector of each entity includes the following steps: mapping the approximate words into word vectors according to the preset word vector matrix; mapping the word vectors into key vectors and value vectors according to the preset second algorithm; and calculating the key vectors and value vectors of each entity according to the preset third algorithm to obtain the average vector of each entity.
  • Word vectors, key vectors and value vectors are all abstract concepts for calculating and understanding the attention mechanism, which facilitates the computer to understand and calculate the meaning of each entity and approximate words.
  • the neural network 11 includes a preset key matrix and a preset value matrix, and mapping word vectors into key vectors and value vectors according to the preset second algorithm includes the following steps: passing the key matrix and the word vector into the preset activation function to obtain the key vector; and passing the value matrix and the word vector into the preset activation function to obtain the value vector.
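The key/value mapping described above can be sketched as follows. The matrix shapes and random initialisation are assumptions for illustration, and ReLU is used as the preset activation function because it is the activation named elsewhere in the embodiment.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

# Illustrative preset key matrix and value matrix (shapes are assumptions).
rng = np.random.default_rng(0)
W_key = rng.normal(size=(4, 3))
W_value = rng.normal(size=(4, 3))

def to_key_value(word_vec):
    """Map an approximate word's vector into a key vector and a value vector
    by passing the matrix-vector products through the activation function."""
    key = relu(W_key @ word_vec)
    value = relu(W_value @ word_vec)
    return key, value

k, v = to_key_value(np.array([0.5, -0.2, 0.3]))
```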
  • Activation functions play a very important role in enabling the neural network 11 to learn and understand very complex, nonlinear functions. The activation function introduces nonlinear characteristics into the neural network 11; without an activation function, the output signal would be just a simple linear function, with limited ability to learn complex function mappings from the data. It can be seen that introducing the activation function into the neural network 11 improves its ability to process complex data and improves the recognition performance of the proper name recognition model.
  • calculating the key vector and value vector of each entity according to a preset third algorithm to obtain the average vector of each entity includes the following steps: calculating the weight p_i,j of each similar word according to the hidden vector h_i of each entity and the key vector k_i,j corresponding to the similar words of each entity; and calculating the average vector o_i of each entity according to the weight p_i,j and the value vector v_i,j.
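A minimal sketch of this third algorithm, assuming the weights p_i,j are a softmax over the inner products of h_i with each key vector (the application only states they are positive real numbers between 0 and 1):

```python
import numpy as np

def semantic_enhanced_vector(h_i, keys, values):
    """Weight each similar word by how well its key matches the hidden
    vector h_i (softmax over inner products), average the value vectors
    with those weights, and concatenate the result with h_i."""
    scores = np.array([float(np.dot(h_i, k)) for k in keys])
    exp = np.exp(scores - scores.max())          # numerically stable softmax
    p = exp / exp.sum()                          # weights p_i,j in (0, 1)
    o_i = sum(w * v for w, v in zip(p, values))  # average vector o_i
    return np.concatenate([o_i, h_i])            # h'_i = [o_i ; h_i]

h = np.array([1.0, 0.0])
keys = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
values = [np.array([2.0, 0.0]), np.array([0.0, 2.0])]
h_prime = semantic_enhanced_vector(h, keys, values)
```

The first key matches h exactly, so its value vector dominates the average.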
  • some similar words of each entity in the input text have different degrees of fit with the current context; if the importance of these similar words in the semantic representation of the current word is not differentiated, then the resulting average vector o_i, and the subsequent semantically enhanced latent vector h′_i obtained based on the average vector o_i, are not accurate enough for the meaning of the current word.
  • in this solution, the weights of the similar words are divided according to how well different similar words fit the current context, the average vector o_i of each entity is calculated based on these weights, and the subsequent semantically enhanced latent vector h′_i is obtained.
  • This undoubtedly improves the accuracy with which the semantically enhanced latent vector h′_i expresses word meaning, further enhancing the proper name recognition model's understanding of the input text X and improving the performance of the proper name recognition model.
  • the semantically enhanced latent vector of each entity is obtained based on the latent vector and a pre-trained word vector library composed of a large number of similar words, by inputting the latent vector into a preset key-value memory neural network.
  • Neural network 11 includes a key-value memory neural network, which enables a machine to accept input (e.g., questions, puzzles, tasks, etc.) and, in response, generate output (e.g., answers, solutions, responses to tasks, etc.) based on information from a knowledge source.
  • the key-value memory network model operates on symbolic memory structured into (key, value) pairs, which gives the proper name recognition model greater flexibility for encoding the input text X and helps bridge the gap between reading text directly and answering from a library of pre-trained word vectors.
  • Key-value memory networks are versatile in that they encode prior knowledge about the task at hand in key-value memories. For example, they can analyze documents, pre-trained word vector libraries, or pre-trained word vector libraries built using information extraction, and answer questions about them.
  • Entity recognition is regarded as a sequence labeling task, that is, predicting the entity recognition label of each input word.
  • the preset pre-trained word vector library may be, for example, Tencent's 8-million-word vector library.
  • the method is as follows: each approximate word of the i-th word is first mapped to its word vector e_i,j through the preset word vector matrix; the key vector and value vector are then obtained as k_i,j = ReLU(W_k · e_i,j) and v_i,j = ReLU(W_v · e_i,j), where · represents the product of a matrix and a vector, the calculation result is a vector, and ReLU is the activation function; the weight p_i,j = exp(h_i · k_i,j) / Σ_j exp(h_i · k_i,j) (a positive real number between 0 and 1) is used to calculate the product of the weight p_i,j and the vector v_i,j, giving the average vector o_i = Σ_j p_i,j · v_i,j; finally, the average vector o_i and h_i are concatenated in series to obtain the output of the key-value memory neural network: h′_i = [o_i ; h_i].
  • the proper name recognition model also includes a fully connected layer
  • the decoder 12 is a SoftMax classifier. Classifying and converting the semantically enhanced latent vectors includes the following steps: the semantically enhanced latent vectors are passed through the fully connected layer and then through the SoftMax classifier to obtain the proper name entity labels. The fully connected layer and SoftMax classifier can visualize the weight information of the different connections contained in the semantically enhanced latent vector and match it with the preset template, making it easier to predict the type of relationship between entities.
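A hedged sketch of this decoding step: a fully connected layer followed by a SoftMax over an illustrative, assumed tag set (the application does not enumerate its proper name entity labels, so the BIO-style labels and random weights below are stand-ins).

```python
import numpy as np

# Illustrative label inventory for sequence labelling (an assumption).
LABELS = ["B-ENT", "I-ENT", "O"]

rng = np.random.default_rng(1)
W_fc = rng.normal(size=(len(LABELS), 4))  # fully connected layer weights
b_fc = np.zeros(len(LABELS))

def classify(h_prime):
    """Pass a semantically enhanced latent vector through the fully
    connected layer, then a SoftMax, and return the arg-max label."""
    logits = W_fc @ h_prime + b_fc
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    return LABELS[int(np.argmax(probs))], probs

label, probs = classify(np.array([0.3, -0.1, 0.8, 0.2]))
```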
  • the embodiment of this application provides a natural language processing method, which is applied to the case of proper name recognition.
  • the method includes the following steps: obtaining the input text and encoding the input text to obtain the latent vector of each entity in the input text;
  • the semantically enhanced latent vector of each entity is obtained based on the latent vector and the pre-trained word vector library based on a large number of similar words; the semantically enhanced latent vector is subjected to classification conversion processing to obtain the proper name entity label corresponding to each entity.
  • Traditional proper name recognition models often face the problem of sparse data, making it difficult to correctly extract entities that are not encountered during training.
  • the neural network model in the proper name recognition method can use the meanings of the similar words of each entity in the input text to enhance the semantic representation of the current word, thereby enhancing the proper name recognition model's understanding of the meaning of the current word. That is, if the proper name recognition model has not seen a word during training, the meanings of words similar to that word can help it understand the current word, thereby improving the proper name recognition model's ability to recognize entities not encountered during training.
  • obtaining the semantically enhanced latent vector of each entity includes the following steps: finding one or more approximate words of each entity in the input text according to the preset pre-trained word vector library; calculating at least one approximate word of each entity according to the preset first algorithm to obtain the average vector of each entity; and concatenating the average vector and the latent vector to obtain the semantically enhanced latent vector.
  • the pre-trained word vector library can fully cover the approximate words of each entity in the input text, effectively enhancing the semantic representation of the word, thus strengthening the named entity model's understanding of the meaning of the current word.
  • calculating at least one approximate word of each entity according to a preset first algorithm to obtain the average vector of each entity includes the following steps: mapping the approximate words into word vectors according to the preset word vector matrix; mapping the word vectors into key vectors and value vectors according to the preset second algorithm; and calculating the key vectors and value vectors of each entity according to the preset third algorithm to obtain the average vector.
  • Word vectors, key vectors and value vectors are all abstract concepts for calculating and understanding the attention mechanism, which facilitates the computer to understand and calculate the meaning of each entity and approximate words.
  • mapping word vectors into key vectors and value vectors according to a preset second algorithm includes the following steps: passing the preset key matrix and the word vector into the preset activation function to obtain the key vector; and passing the preset value matrix and the word vector into the preset activation function to obtain the value vector.
  • Activation functions play a very important role in neural networks learning and understanding very complex and nonlinear functions. The activation function introduces nonlinear characteristics into the neural network. If there is no activation function, the output signal is just a simple linear function with less ability to learn complex function mapping from the data. It can be seen that introducing the activation function into the neural network improves the neural network's ability to process complex data and improves the recognition performance of the proper name recognition model.
  • calculating the key vector and value vector of each entity according to a preset third algorithm to obtain the average vector includes the following steps: calculating the weight of each similar word according to the hidden vector of each entity and the key vector corresponding to the similar words of each entity; and calculating the average vector of each entity based on the weights and value vectors.
  • some similar words of each entity in the input text have different adaptability to the current context. Therefore, if the importance of these similar words in the semantic representation of the current word is not divided, then The obtained average vector and the subsequent semantically enhanced latent vector obtained based on the average vector are not accurate enough for the meaning of the current word.
  • the weights of the similar words are divided according to the degree to which different similar words fit the current context, the average vector of each entity is calculated based on these weights, and the subsequent semantically enhanced latent vector is obtained.
  • This undoubtedly improves the accuracy with which the semantically enhanced latent vector expresses word meaning, thereby further enhancing the proper name recognition model's understanding of the input text and improving the performance of the proper name recognition model.
  • the classification conversion processing of the semantically enhanced latent vectors includes the following steps: the semantically enhanced latent vectors are passed through a preset fully connected layer and then sent to the preset SoftMax classifier to obtain the proper name entity tag.
  • the fully connected layer and SoftMax classifier can visualize the weight information of different connections contained in the semantically enhanced latent vector and match it with the preset template, making it easier to predict the type of relationship between entities.
  • the semantically enhanced latent vector of each entity is obtained based on the latent vector and a pre-trained word vector library composed of a large number of similar words, by inputting the latent vector into a preset key-value memory neural network.
  • the key-value memory network model operates on symbolic memory structured into (key, value) pairs, which gives the proper name recognition model greater flexibility for encoding input text and helps bridge the gap between reading text directly and answering from a library of pre-trained word vectors.
  • key-value memory networks have the versatility to analyze, for example, documents, pre-trained word vector libraries, or pre-trained word vector libraries built using information extraction, and Answer questions about them.
  • obtaining the hidden vector of each entity in the input text includes: obtaining the hidden vector of each given entity in the input text; said obtaining the enhanced representation of each entity according to the hidden vector of each entity includes: calculating the given hidden vector of each entity through a preset first algorithm to obtain the first entity vector representation and the second entity vector representation respectively corresponding to the given two entities; and processing the first entity vector representation and the second entity vector representation to obtain the first semantic enhanced representation and the second semantic enhanced representation.
  • said conversion processing of the enhanced representation of each entity to obtain a processing result includes: calculating the first semantic enhanced representation and the second semantic enhanced representation according to a preset second algorithm to obtain an intermediate vector; and converting the intermediate vector to obtain the predicted relationship type between text entities.
  • This embodiment of the present application provides a natural language processing method, which is used to extract the relationship between two given entities in the input text and is implemented through a relationship extraction model. The relationship extraction model includes an encoder 20, a decoder 21 and a semantic enhancement module 22 based on the attention mechanism.
  • the natural language processing method includes the following steps: obtaining the input text, passing the input text to the encoder 20, encoding the input text, and outputting the hidden vector of each entity given in the input text; calculating the hidden vector of each given entity through the preset first algorithm to obtain the first entity vector representation and the second entity vector representation corresponding to the given two entities; inputting the first entity vector representation and the second entity vector representation into the semantic enhancement module 22 to obtain the first semantic enhanced representation and the second semantic enhanced representation; calculating the first semantic enhanced representation and the second semantic enhanced representation through a preset second algorithm to obtain an intermediate vector; and converting and decoding the intermediate vector through the decoder 21 to obtain the predicted relationship type between text entities.
  • the semantic enhancement module 22 of the relationship extraction model performs semantic enhancement on each entity in the input text, using the semantics of entities similar to an entity to enhance the semantic representation of that entity, thereby strengthening the relationship extraction model's understanding of the current entity's semantics. That is, if the relationship extraction model has not seen an entity during training, the semantics of entities similar to it can help the model understand the current entity, thereby improving the relationship extraction model's ability to understand entities not encountered during training and, in turn, the performance of relationship extraction.
  • the first algorithm is the Max Pooling algorithm.
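A minimal sketch of Max Pooling over the hidden vectors of an entity's tokens, as the first algorithm; the sentence length, hidden vectors, and entity span below are made up for illustration.

```python
import numpy as np

def max_pool_entity(hidden_vectors, span):
    """Element-wise max over the hidden vectors h_i of the tokens that an
    entity spans, giving a single entity vector representation."""
    start, end = span  # token indices of the entity, end exclusive
    return np.max(np.stack(hidden_vectors[start:end]), axis=0)

# Hypothetical hidden vectors for a four-token sentence.
H = [np.array([0.1, 0.9]), np.array([0.4, 0.2]),
     np.array([0.3, 0.5]), np.array([0.8, 0.1])]
h_E1 = max_pool_entity(H, (0, 2))  # entity E1 covers tokens 0 and 1
```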
  • obtaining the first semantically enhanced representation and the second semantically enhanced representation includes the following steps: finding one or more approximate entities of each entity given in the input text according to a preset pre-trained word vector library; and calculating the approximate entities of the given two entities, the first entity vector representation and the second entity vector representation according to the preset third algorithm to obtain the first semantically enhanced representation and the second semantically enhanced representation respectively corresponding to the given two entities in the input text.
  • the pre-trained word vector library can fully cover the approximate entities of each entity in the input text, effectively enhancing the semantic representation of this entity, thereby strengthening the relationship extraction model's understanding of the current entity semantics.
  • calculating the first semantic enhanced representation and the second semantic enhanced representation according to a preset second algorithm to obtain an intermediate vector includes: concatenating the first entity vector representation with the first semantic enhanced representation to obtain the first enhanced vector representation; concatenating the second entity vector representation with the second semantic enhanced representation to obtain the second enhanced vector representation; and concatenating the first enhanced vector representation with the second enhanced vector representation to obtain the intermediate vector.
  • the first entity vector representation can represent the semantics of the first entity itself, and the first semantic enhanced representation can represent the semantics of similar entities.
  • the first enhanced vector representation obtained by concatenating the two combines the semantics of the entity and the similar entities. (The same is true for the second enhanced vector representation).
  • the semantics of the original entity are expanded, allowing the computer to better understand the semantics of the entity, and also facilitates subsequent calculation and processing.
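The concatenation scheme described above (o1 = [h_E1; a1], o2 = [h_E2; a2], intermediate vector o = [o1; o2]) can be sketched directly; the small vectors below are illustrative only.

```python
import numpy as np

def intermediate_vector(h_E1, a1, h_E2, a2):
    """Concatenate each entity vector with its semantic enhanced
    representation, then concatenate the two enhanced representations."""
    o1 = np.concatenate([h_E1, a1])  # first enhanced vector representation
    o2 = np.concatenate([h_E2, a2])  # second enhanced vector representation
    return np.concatenate([o1, o2])  # intermediate vector o

o = intermediate_vector(np.array([1.0, 2.0]), np.array([3.0, 4.0]),
                        np.array([5.0, 6.0]), np.array([7.0, 8.0]))
```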
  • calculating the approximate entities of the two given entities, the first entity vector representation and the second entity vector representation according to the preset third algorithm to obtain the first semantic enhanced representation and the second semantic enhanced representation respectively corresponding to the two given entities in the input text includes the following steps: mapping the approximate entities into word vectors through a preset word vector matrix; calculating the weight of each approximate entity using the first entity vector representation (or the second entity vector representation) and the word vectors of one or more approximate entities of the current entity; and calculating the first semantic enhanced representation (or the second semantic enhanced representation) based on the weights and the word vectors.
  • calculating weights from the semantics of different approximate entities identifies and exploits the differing importance of those entities, effectively avoiding the impact of potential noise among similar entities on model performance, thereby improving the performance of relationship extraction.
  • converting the intermediate vector includes the following steps: after passing the intermediate vector through a preset fully connected layer, it is sent to the SoftMax classifier to obtain the predicted relationship type.
  • the fully connected layer and SoftMax classifier can visualize the weight information of different connections contained in the semantically enhanced latent vector and match it with the preset template, making it easier to predict the type of relationship between entities.
  • the two given entities in the input are "Food Factory" and "Fruit Cans".
  • the model uses a standard encoder-decoder architecture.
  • Encoder 20 uses BERT and decoder 21 uses SoftMax;
  • the first step is to use BERT to encode the input text and obtain the latent vector of each entity, where the hidden vector of the i-th word x_i is recorded as h_i.
  • the Max Pooling algorithm is used to calculate the vector representations h_E1 and h_E2 of the two entities (E1 and E2 represent the two entities respectively).
  • h_E1 and h_E2 are sent to the semantic enhancement module 22 to obtain the semantic enhanced representations a_1 and a_2 corresponding to E1 and E2 respectively.
  • the fourth step is to concatenate h_E1 and a_1 to obtain the first enhanced vector representation o_1; similarly, the second enhanced vector representation o_2 can be obtained.
  • the fifth step is to concatenate o_1 and o_2 to obtain the intermediate vector o.
  • the sixth step: after o passes through a fully connected layer, it is sent to the SoftMax classifier to obtain the predicted relationship type.
  • the processing flow of the semantic enhancement module 22 is as follows:
  • the first two steps are to find the approximate entities of each given entity E_i in a preset entity vector library (such as Tencent's 8 million word vectors) and map them into word vectors e_i,j;
  • the third step is to use the vector representation h_Ei of E_i and e_i,j to calculate the weight p_i,j as follows: p_i,j = exp(h_Ei · e_i,j) / Σ_j exp(h_Ei · e_i,j), where · is used to calculate the inner product of vectors;
  • the fourth step is to calculate the average vector a_i of the word vectors e_i,j based on the weights p_i,j as follows: a_i = Σ_j p_i,j · e_i,j, where each weight p_i,j (a positive real number between 0 and 1) multiplies the word vector e_i,j.
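The weight and averaging computation of the semantic enhancement module can be sketched as follows, assuming a softmax form for the weights p_i,j (consistent with their being positive real numbers between 0 and 1 derived from inner products):

```python
import numpy as np

def semantic_enhancement(h_Ei, neighbour_vecs):
    """Weight each approximate entity's word vector e_{i,j} by the softmax
    of its inner product with the entity representation h_Ei, then return
    the weighted average a_i = sum_j p_{i,j} * e_{i,j}."""
    scores = np.array([float(np.dot(h_Ei, e)) for e in neighbour_vecs])
    exp = np.exp(scores - scores.max())  # numerically stable softmax
    p = exp / exp.sum()                  # weights p_{i,j} in (0, 1)
    return sum(w * e for w, e in zip(p, neighbour_vecs))

# An aligned neighbour should dominate an anti-aligned one.
h_E = np.array([1.0, 0.0])
E = [np.array([1.0, 0.0]), np.array([-1.0, 0.0])]
a = semantic_enhancement(h_E, E)
```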
  • converting the intermediate vector to obtain the predicted relationship type between text entities includes the following steps: passing the intermediate vector through a preset fully connected layer and then sending it to the SoftMax classifier to obtain the predicted relationship type.
  • the fully connected layer and SoftMax classifier can visualize the weight information of different connections contained in the semantically enhanced latent vector and match it with the preset template, making it easier to predict the type of relationship between entities.
  • the embodiment of this application provides a natural language processing method, which is applied to extracting the relationship between two given entities in the input text and is implemented through a relationship extraction model. The method includes: obtaining the input text and encoding it to obtain the hidden vector of each entity given in the input text; calculating the hidden vector of each given entity through a preset first algorithm to obtain the first entity vector representation and the second entity vector representation corresponding to the two given entities; processing the first entity vector representation and the second entity vector representation to obtain the first semantic enhanced representation and the second semantic enhanced representation; calculating an intermediate vector from the first semantic enhanced representation and the second semantic enhanced representation through a preset second algorithm; and converting the intermediate vector to obtain the predicted relationship type between text entities.
  • the semantic enhancement module of the relationship extraction model performs semantic enhancement on each entity in the input text, using the semantics of similar entities to enhance the semantic representation of the entity and thereby strengthening the relationship extraction model's understanding of the current entity's semantics. That is, if the relationship extraction model has not seen an entity during training, the semantics of entities similar to it can help the model understand the current entity, improving the model's ability to understand out-of-vocabulary entities and, in turn, the performance of relation extraction.
  • the embodiment of the present application provides a natural language processing method. Obtaining the first semantic enhanced representation and the second semantic enhanced representation includes the following steps: finding one or more approximate entities of each given entity in the input text according to a preset pre-trained word vector library; and calculating the approximate entities of the two given entities according to a preset third algorithm, together with the first entity vector representation and the second entity vector representation, to obtain the first semantic enhanced representation and the second semantic enhanced representation respectively corresponding to the two given entities in the input text.
  • the pre-trained word vector library can fully cover the approximate entities of each entity in the input text, effectively enhancing the semantic representation of this entity, thereby strengthening the relationship extraction model's understanding of the current entity semantics.
  • the embodiment of the present application provides a natural language processing method in which the first semantic enhanced representation and the second semantic enhanced representation are calculated according to a preset second algorithm to obtain an intermediate vector, including: concatenating the first entity vector representation with the first semantic enhanced representation to obtain the first enhanced vector representation; concatenating the second entity vector representation with the second semantic enhanced representation to obtain the second enhanced vector representation; and concatenating the first enhanced vector representation with the second enhanced vector representation to obtain the intermediate vector.
  • the first entity vector representation can represent the semantics of the first entity itself, and the first semantic enhanced representation can represent the semantics of similar entities.
  • the first enhanced vector obtained by concatenating the two therefore combines the semantics of the entity and of its similar entities (the same is true for the second enhanced vector representation), expanding the semantics of the original entity, allowing the computer to better understand the entity's semantics and facilitating subsequent calculation and processing.
  • the embodiment of the present application provides a natural language processing method in which the approximate entities of the two given entities are calculated according to the preset third algorithm, together with the first entity vector representation and the second entity vector representation, to obtain the first semantic enhanced representation and the second semantic enhanced representation respectively corresponding to the two given entities in the input text, including: mapping the approximate entities into word vectors through a preset word vector matrix; calculating the weight of each approximate entity from the first entity vector representation (or the second entity vector representation) and the word vectors of the one or more approximate entities of the current entity; and calculating the first semantic enhanced representation (or the second semantic enhanced representation) from the weights and the word vectors.
  • calculating weights for different approximate entities according to their semantics identifies and exploits their differing importance, effectively avoiding the impact of potential noise among similar entities on model performance and thereby improving the performance of relation extraction.
  • the embodiment of the present application provides a natural language processing method. Converting the intermediate vector to obtain the predicted relationship type between text entities includes the following steps: the intermediate vector is passed through a preset fully connected layer and then sent to the SoftMax classifier to obtain the predicted relationship type. The fully connected layer and SoftMax classifier can visualize the weight information of the different connections contained in the semantically enhanced latent vector and match it with a preset template, making it easier to predict the type of relationship between entities.
  • an embodiment of the present application also provides a computer device, including a processor 30, a memory 31, and a computer program stored on the memory 31.
  • the processor 30 executes the computer program to implement the above method.
  • the computer device has the same effects as the above-mentioned natural language processing method, which will not be described again here.
  • Embodiments of the present application also provide a readable storage medium on which computer program instructions are stored. When the computer program instructions are executed, the above method is implemented.
  • the readable storage medium has the same effect as the above-mentioned natural language processing method, which will not be described again here.
  • Embodiments of the present application also provide a program product.
  • the program product includes computer program instructions. When the computer program instructions are executed, the above method is implemented.
  • the program product has the same effects as the above-mentioned natural language processing method, and will not be described again here.
  • B corresponding to A means that B is associated with A and can be determined based on A. However, determining B based on A does not mean determining B based only on A; B can also be determined based on A and/or other information.
  • references herein to "one embodiment" or "an embodiment" mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Therefore, appearances of "in one embodiment" or "in an embodiment" in various places herein are not necessarily referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments. The embodiments described herein are all optional embodiments, and the actions and modules involved are not necessarily required by this application.
  • the size of the sequence numbers of the above-mentioned multiple processes does not necessarily indicate their order of execution. The execution order of the multiple processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • each block in the flowchart or block diagram may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown one after another may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending upon the functionality involved.
  • Each block in the block diagram and/or flowchart illustration, and combinations of blocks therein, may be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
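The relation-extraction flow enumerated in the points above (token latent vectors, max-pooled entity representations h_E1/h_E2, semantic enhancement a_1/a_2 via approximate-entity word vectors, concatenation into the intermediate vector o, and a fully connected layer with SoftMax) can be sketched end to end. This is an illustrative reconstruction, not the patented implementation: the random vectors stand in for BERT outputs, and `W`, `b`, the entity spans, and the approximate-entity vectors are hypothetical stand-ins for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden size (stand-in for a BERT hidden vector)

# Hidden vectors h_i for a 6-token sentence; entity spans E1=[1,3), E2=[4,6).
H = rng.normal(size=(6, d))
h_E1 = H[1:3].max(axis=0)  # max pooling over the entity's tokens
h_E2 = H[4:6].max(axis=0)

def semantic_enhancement(h_entity, approx_vecs):
    """Weight each approximate entity's word vector by the softmax of its
    inner product with the entity representation, then average (a_i)."""
    scores = approx_vecs @ h_entity       # h_Ei · e_{i,j}
    p = np.exp(scores - scores.max())
    p /= p.sum()                          # weights p_{i,j}, each in (0, 1)
    return p @ approx_vecs                # a_i = sum_j p_{i,j} * e_{i,j}

E1_approx = rng.normal(size=(3, d))       # word vectors of E1's approximate entities
E2_approx = rng.normal(size=(4, d))
a1 = semantic_enhancement(h_E1, E1_approx)
a2 = semantic_enhancement(h_E2, E2_approx)

o1 = np.concatenate([h_E1, a1])           # first enhanced vector representation
o2 = np.concatenate([h_E2, a2])           # second enhanced vector representation
o = np.concatenate([o1, o2])              # intermediate vector

# Fully connected layer + SoftMax over an assumed set of 5 relation types.
W = rng.normal(size=(5, o.size))
b = np.zeros(5)
logits = W @ o + b
probs = np.exp(logits - logits.max())
probs /= probs.sum()
predicted_relation = int(np.argmax(probs))
print(predicted_relation, float(probs.sum()))
```

The softmax here subtracts the maximum score before exponentiating, which is numerically equivalent to formula-style softmax but avoids overflow.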

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

A natural language processing method, a computer device, a readable storage medium, and a program product. The natural language processing method comprises: acquiring an input text and encoding same to obtain a latent vector of each entity in the input text; obtaining an enhanced representation of each entity according to the latent vector of each entity; and performing conversion processing on the enhanced representation of each entity to obtain a processing result.

Description

Natural language processing method, computer device, readable storage medium and program product
This application claims priority to Chinese patent application No. 202210911109.5, filed with the China Patent Office on July 29, 2022, and to Chinese patent application No. 202210909700.7, filed with the China Patent Office on July 29, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of natural language processing, and relates, for example, to a natural language processing method, device, readable storage medium and program product.
Background
Both the proper name recognition task and the relation extraction task belong to natural language processing. The proper name recognition task aims to extract named entities from a given sentence. In certain domains, such as social media, training data is severely insufficient because vocabulary usage changes rapidly. Traditional proper name recognition models therefore often face the problem of data sparsity and find it difficult to correctly extract entities not encountered during training. The relation extraction task aims to extract (predict), from a given sentence and two given entities, the relationship between those two entities. Understanding the meaning of an entity itself is very important for predicting its relationships; however, common methods often neglect modeling the entity itself. Moreover, due to insufficient training data, traditional relation extraction models often face the problem of sparse entity data and find it difficult to correctly extract relationships between entities not encountered during training (that is, out-of-vocabulary entities).
Summary of the Invention
This application provides a natural language processing method, device, storage medium and program product to solve the problem that traditional natural language processing methods have difficulty correctly extracting entities, or relationships between entities, that were not encountered during training.
This application provides a natural language processing method, including:
obtaining an input text and encoding the input text to obtain a latent vector of each entity in the input text;
obtaining an enhanced representation of each entity according to the latent vector of each entity; and
converting the enhanced representation of each entity to obtain a processing result.
In one embodiment, when the method is applied to proper name recognition, obtaining the enhanced representation of each entity according to the latent vector of each entity includes:
obtaining a semantically enhanced latent vector of each entity according to the latent vector of each entity and a pre-trained word vector library composed of a large number of similar words;
and converting the enhanced representation of each entity to obtain the processing result includes:
subjecting the semantically enhanced latent vector of each entity to classification conversion processing to obtain a proper name entity label corresponding to each entity.
In one embodiment, when the method is applied to extracting the relationship between two given entities in the input text, obtaining the latent vector of each entity in the input text includes:
obtaining the latent vector of each given entity;
obtaining the enhanced representation of each entity according to the latent vector of each entity includes:
calculating the latent vector of each given entity through a preset first algorithm to obtain a first entity vector representation and a second entity vector representation respectively corresponding to the two given entities, and processing the first entity vector representation and the second entity vector representation to obtain a first semantically enhanced representation and a second semantically enhanced representation;
and converting the enhanced representation of each entity to obtain the processing result includes:
calculating an intermediate vector from the first semantically enhanced representation and the second semantically enhanced representation according to a preset second algorithm; and
converting the intermediate vector to obtain a predicted relationship type between text entities.
This application further provides a computer device. The computer device includes a processor, a memory, and a computer program stored on the memory; the processor executes the computer program to implement the above natural language processing method.
This application further provides a readable storage medium on which computer program instructions are stored; when the computer program instructions are executed, the above natural language processing method is implemented.
This application further provides a computer program product, including computer program instructions; when the computer program instructions are executed, the above natural language processing method is implemented.
Brief Description of the Drawings
Figure 1 is a schematic flowchart of a natural language processing method provided by an embodiment of this application;
Figure 2 is a schematic flowchart of another natural language processing method provided by an embodiment of this application;
Figure 3 is a schematic structural diagram of a proper name recognition model provided by an embodiment of this application;
Figure 4 is a schematic flowchart of obtaining a semantically enhanced latent vector provided by an embodiment of this application;
Figure 5 is a schematic flowchart of obtaining an average vector provided by an embodiment of this application;
Figure 6 is a schematic structural diagram of another proper name recognition model provided by an embodiment of this application;
Figure 7 is a schematic flowchart of another natural language processing method provided by an embodiment of this application;
Figure 8 is a schematic module diagram of a relationship extraction model provided by an embodiment of this application;
Figure 9 is a schematic flowchart of obtaining a semantically enhanced representation provided by an embodiment of this application;
Figure 10 is a schematic flowchart of obtaining an intermediate vector provided by an embodiment of this application;
Figure 11 is a schematic flowchart of obtaining a semantically enhanced vector representation provided by an embodiment of this application;
Figure 12 is a schematic structural diagram of a computer device provided by an embodiment of this application.
Detailed Description
The present application is described below with reference to the accompanying drawings and implementation examples. The specific embodiments described here are intended only to explain the present application.
As shown in Figure 1, an embodiment of this application provides a natural language processing method, which includes: obtaining an input text and encoding the input text to obtain the latent vector of each entity in the input text; obtaining the enhanced representation of each entity according to the latent vector of each entity; and converting the enhanced representation of each entity to obtain a processing result.
In this embodiment, the enhanced representation of each entity is obtained according to its latent vector, which semantically enhances each entity and helps to understand it, thereby improving the understanding ability of the corresponding natural language processing model and improving model performance.
In one embodiment, when the above method is applied to proper name recognition, obtaining the enhanced representation of each entity according to its latent vector includes: obtaining the semantically enhanced latent vector of each entity according to its latent vector and a pre-trained word vector library composed of a large number of similar words; and converting the enhanced representation of each entity to obtain the processing result includes: subjecting the semantically enhanced latent vector of each entity to classification conversion processing to obtain the proper name entity label corresponding to each entity.
Referring to Figures 2 and 3, an embodiment of this application provides a natural language processing method for obtaining proper name entity labels, implemented through a proper name recognition model. The proper name recognition model includes an encoder 10, a neural network 11 and a decoder 12. The natural language processing method includes the following steps: obtaining an input text X through the encoder 10 and encoding the input text X to obtain the latent vector of each entity in the input text X; and obtaining the semantically enhanced latent vector of each entity according to the latent vector and a pre-trained word vector library composed of a large number of similar words. For example, the latent vector is input into the neural network 11, which contains a pre-trained word vector library composed of a large number of similar words, to obtain the semantically enhanced latent vector h′_i of each entity (for each entity in the input text X, the semantically enhanced latent vector h′_i enhances the semantic features of the word through similar words); the semantically enhanced latent vector h′_i is then subjected to the classification conversion processing of the decoder 12 to obtain the proper name entity label corresponding to each entity.
Traditional proper name recognition models often face the problem of sparse data, making it difficult to correctly extract entities not encountered during training. The neural network 11 in the method provided by this solution can use the meanings of similar words of each entity in the input text X to enhance the semantic representation of the current word, thereby strengthening the proper name recognition model's understanding of the current word's meaning. That is, if the proper name recognition model was not trained on a word, it can use the meanings of words similar to that word to help understand the current word, thereby improving the model's ability to recognize entities not seen during training.
In some embodiments, the encoder 10 adopts Bidirectional Encoder Representation from Transformers (BERT). Exemplarily, BERT is used to encode the input text X to obtain the latent vector of each entity in the input text X, where the latent vector of the i-th word x_i is denoted h_i.
Referring to Figure 4, in some embodiments, obtaining the semantically enhanced latent vector h′_i of each entity includes the following steps: finding one or more approximate words of each entity in the input text X according to a preset pre-trained word vector library; calculating the at least one approximate word of each entity according to a preset first algorithm to obtain the average vector o_i of each entity; and concatenating the average vector o_i of each entity with the latent vector h_i to obtain the semantically enhanced latent vector h′_i of each entity.
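The first of these steps, finding approximate words in a pre-trained library, is a nearest-neighbour lookup; the embodiment later describes using the cosine distance between vectors for this. A minimal sketch under that assumption, with a toy in-memory library (the words and 3-dimensional vectors are purely illustrative):

```python
import numpy as np

def nearest_words(query_vec, library, m=2):
    """Return the m library words closest to query_vec by cosine distance."""
    words = list(library)
    M = np.stack([library[w] for w in words])
    sims = (M @ query_vec) / (np.linalg.norm(M, axis=1) * np.linalg.norm(query_vec))
    order = np.argsort(-sims)  # larger cosine similarity = smaller cosine distance
    return [words[i] for i in order[:m]]

# Toy library standing in for a large pre-trained word vector library.
library = {
    "river": np.array([1.0, 0.1, 0.0]),
    "stream": np.array([0.9, 0.2, 0.0]),
    "bank": np.array([0.8, 0.0, 0.1]),
    "piano": np.array([0.0, 0.1, 1.0]),
}
print(nearest_words(np.array([1.0, 0.0, 0.0]), library, m=2))  # → ['river', 'bank']
```

A real system would replace the dictionary with the preset library (e.g. Tencent's word vectors) and an approximate-nearest-neighbour index for speed.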
The pre-trained word vector library can fully cover the approximate words of each entity in the input text X, effectively enhancing the semantic representation of the word and thereby strengthening the named entity model's understanding of the current word's meaning.
Referring to Figure 5, in some embodiments, calculating the at least one approximate word of each entity according to the preset first algorithm to obtain the average vector of each entity includes the following steps: mapping the approximate words into word vectors according to a preset word vector matrix; mapping the word vectors into key vectors and value vectors according to a preset second algorithm; and calculating the key vectors and value vectors of each entity according to a preset third algorithm to obtain the average vector of each entity. Word vectors, key vectors and value vectors are all abstractions for computing and understanding the attention mechanism, making it convenient for the computer to understand and compute the meaning of each entity and its approximate words.
In some embodiments, the neural network 11 includes a preset key matrix and a preset value matrix, and mapping the word vectors into key vectors and value vectors according to the preset second algorithm includes the following steps: passing the key matrix and a word vector into a preset activation function to obtain a key vector; and passing the value matrix and the word vector into the preset activation function to obtain a value vector. Activation functions play a very important role in enabling the neural network 11 to learn and understand highly complex and nonlinear functions. The activation function introduces nonlinearity into the neural network 11; without it, the output signal would merely be a simple linear function with less ability to learn complex function mappings from data. It can be seen that introducing the activation function into the neural network 11 improves its ability to process complex data and improves the recognition performance of the proper name recognition model.
In some embodiments, calculating the key vectors and value vectors of each entity according to the preset third algorithm to obtain the average vector of each entity includes the following steps: calculating the weight p_{i,j} of each similar word from the latent vector h_i of the entity and the key vector k_{i,j} corresponding to the similar word; and calculating the average vector o_i of the entity from the weights p_{i,j} and the value vectors v_{i,j}. In many contexts, the similar words of an entity in the input text X fit the current context to different degrees. Therefore, if the importance of these similar words in the semantic representation of the current word is not differentiated, the resulting average vector o_i, and the semantically enhanced latent vector h′_i subsequently obtained from it, will not represent the current word's meaning accurately enough. Dividing the weights of the similar words according to how well they fit the current context, and using these weights to calculate the average vector o_i and the subsequent semantically enhanced latent vector h′_i, undoubtedly improves the accuracy with which h′_i represents the word's meaning, further strengthening the proper name recognition model's understanding of the input text X and improving its performance.
In some embodiments, the semantically enhanced latent vector of each entity, obtained from the latent vector and a pre-trained word vector library composed of a large number of similar words, is obtained by inputting the latent vector into a preset key-value memory neural network. The neural network 11 includes a key-value memory neural network, which enables a machine to accept input (such as questions, puzzles, tasks, etc.) and, in response, generate output (such as answers, solutions, responses to tasks, etc.) based on information from a knowledge source. The key-value memory network model operates on symbolic memory structured as (key, value) pairs, which gives the proper name recognition model greater flexibility in encoding the input text X and helps bridge the gap between reading text directly and answering from a pre-trained word vector library. By encoding prior knowledge about the task at hand in key-value memories, key-value memory networks are versatile: for example, they can operate over documents, pre-trained word vector libraries, or pre-trained word vector libraries built using information extraction, and answer questions about them.
The method provided by the embodiments of this application is similar to the traditional approach in that entity recognition is treated as a sequence labeling task, that is, predicting the entity recognition label of each input word.
Referring to Figure 6, in the example shown, "张三" (Zhang San) is a person name (PER) and "北京海淀" (Beijing Haidian) is a place name (LOC), where "PER" and "LOC" denote label classes, short for "PERSON" and "LOCATION". Exemplarily, using a preset pre-trained word vector library (such as Tencent's 8 million word vectors), for each word x_i in the input text X, the m words closest to x_i under the cosine distance between vectors are found and recorded as s_{i,1}, …, s_{i,j}, …, s_{i,m}; using a preset word vector matrix, each similar word s_{i,j} is mapped to a word vector e_{i,j}; using the key matrix W_k and the value matrix W_v, the word vector e_{i,j} is mapped through an activation function to a key vector k_{i,j} and a value vector v_{i,j}, as follows:
k_{i,j} = ReLU(W_k · e_{i,j})
v_{i,j} = ReLU(W_v · e_{i,j})
where "·" denotes the product of a matrix and a vector, the result of which is a vector, and ReLU is an activation function. Using the hidden vector h_i obtained for x_i from the BERT encoder 10 and the key vectors k_{i,j}, the weights p_{i,j} are calculated as follows:
p_{i,j} = exp(h_i · k_{i,j}) / Σ_{j'=1}^{m} exp(h_i · k_{i,j'})    (1)
In formula (1), "·" computes the inner product of vectors, and the result is a scalar.
Based on p_{i,j}, the average vector o_i of the value vectors is calculated as follows:
o_i = Σ_{j=1}^{m} p_{i,j} · v_{i,j}
where "·" is used to compute the product of the weight p_{i,j} (a positive real number between 0 and 1) and the vector v_{i,j}. The average vector o_i is then concatenated with h_i,

[o_i ; h_i],

to obtain the output of the key-value memory neural network 11.
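The key-value memory computation above can be sketched numerically as follows. The dimensions and the random matrices standing in for W_k and W_v are placeholders chosen for illustration, not trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 4, 3                     # hidden size and number of similar words (illustrative)

h_i = rng.normal(size=d)        # hidden vector of word x_i from the BERT encoder
e = rng.normal(size=(m, d))     # word vectors e_{i,j} of the m similar words
W_k = rng.normal(size=(d, d))   # key matrix W_k (placeholder values)
W_v = rng.normal(size=(d, d))   # value matrix W_v (placeholder values)

relu = lambda x: np.maximum(x, 0.0)
k = relu(e @ W_k.T)             # key vectors:   k_{i,j} = ReLU(W_k · e_{i,j})
v = relu(e @ W_v.T)             # value vectors: v_{i,j} = ReLU(W_v · e_{i,j})

# Formula (1): weights p_{i,j} from the inner products h_i · k_{i,j}.
scores = k @ h_i
p = np.exp(scores) / np.exp(scores).sum()

# Weighted average o_i of the value vectors.
o_i = p @ v

# Concatenate o_i with h_i to form the output of the key-value memory network.
output = np.concatenate([o_i, h_i])
```

Each weight p_{i,j} lies between 0 and 1 and the weights sum to 1, so o_i is a convex combination of the value vectors, matching the description of the average vector above.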
In some embodiments, the proper name recognition model further includes a fully connected layer, and the decoder 12 is a SoftMax classifier. Classifying and converting the semantically enhanced latent vector includes the following steps: the semantically enhanced latent vector is passed through the fully connected layer and then fed into the SoftMax classifier to obtain the proper name entity label. The fully connected layer and the SoftMax classifier can visualize the weight information of the different connections contained in the semantically enhanced latent vector and match it against a preset template, making it easier to predict the relationship types between entities.
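A minimal sketch of this classification step, assuming a linear fully connected layer followed by SoftMax; the tag set and the weight values are placeholders for illustration:

```python
import numpy as np

def classify(s, W, b, labels):
    """Fully connected layer followed by SoftMax over the label set."""
    logits = W @ s + b
    probs = np.exp(logits - logits.max())   # numerically stabilized SoftMax
    probs /= probs.sum()
    return labels[int(np.argmax(probs))], probs

labels = ["O", "B-PER", "B-LOC"]            # illustrative tag set
rng = np.random.default_rng(1)
s = rng.normal(size=8)                      # semantically enhanced latent vector (toy)
W = rng.normal(size=(len(labels), 8))       # fully connected layer weights (toy)
b = np.zeros(len(labels))
tag, probs = classify(s, W, b, labels)
```

The predicted tag is the label with the largest SoftMax probability; during training the probabilities would instead feed a cross-entropy loss.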
Compared with the related art, the natural language processing method provided by the embodiments of this application achieves the following:
1. The embodiments of this application provide a natural language processing method applied to proper name recognition. The method includes the following steps: obtaining input text and encoding it to obtain the latent vector of each entity in the input text; obtaining the semantically enhanced latent vector of each entity from the latent vector and a pre-trained word vector library composed of a large number of similar words; and classifying and converting the semantically enhanced latent vector to obtain the proper name entity label corresponding to each entity. Traditional proper name recognition models often face the problem of sparse data and have difficulty correctly extracting entities not encountered during training. The neural network model in the proper name recognition method provided by this solution can use the meanings of words similar to each entity in the input text to enhance the semantic representation of the current word, thereby improving the proper name recognition model's understanding of the current word's meaning. That is, if the proper name recognition model has not seen a word during training, it can use the meanings of similar words to help understand the current word, thereby improving its ability to recognize entities not seen in training.
2. In the natural language processing method provided by the embodiments of this application, obtaining the semantically enhanced latent vector of each entity includes the following steps: finding one or more approximate words for each entity in the input text according to a preset pre-trained word vector library; calculating the average vector of each entity from its at least one approximate word according to a preset first algorithm; and concatenating the average vector with the latent vector to obtain the semantically enhanced latent vector. The pre-trained word vector library can fully cover the approximate words of each entity in the input text, effectively enhancing the semantic representation of the word and thus strengthening the named entity model's understanding of the current word's meaning.
3. In the natural language processing method provided by the embodiments of this application, calculating the average vector of each entity from its at least one approximate word according to the preset first algorithm includes the following steps: mapping the approximate words to word vectors according to a preset word vector matrix; mapping the word vectors to key vectors and value vectors according to a preset second algorithm; and calculating the average vector from the key vectors and value vectors of each entity according to a preset third algorithm. Word vectors, key vectors, and value vectors are all abstractions used to compute and understand the attention mechanism, making it easier for a computer to understand and compute the meaning of each entity and its approximate words.
4. In the natural language processing method provided by the embodiments of this application, mapping the word vectors to key vectors and value vectors according to the preset second algorithm includes the following steps: passing the preset key matrix and the word vector into a preset activation function to obtain the key vector; and passing the preset value matrix and the word vector into the preset activation function to obtain the value vector. Activation functions play a very important role in enabling neural networks to learn and understand highly complex, nonlinear functions. The activation function introduces nonlinearity into the neural network; without it, the output would merely be a simple linear function with far less capacity to learn complex function mappings from data. Introducing the activation function into the neural network therefore improves its ability to process complex data and improves the recognition performance of the proper name recognition model.
5. In the natural language processing method provided by the embodiments of this application, calculating the average vector from the key vectors and value vectors of each entity according to the preset third algorithm includes the following steps: calculating the weight of each similar word from the latent vector of each entity and the key vector corresponding to that similar word; and calculating the average vector of each entity from the weights and the value vectors. In many contexts, the similar words of an entity in the input text fit the current context to different degrees. If the importance of these similar words in the semantic representation of the current word were not differentiated, the resulting average vector, and the semantically enhanced latent vector subsequently derived from it, would not represent the meaning of the current word accurately enough. Dividing the weights of the similar words according to how well each fits the current context, and using these weights to calculate the average vector of each entity and the subsequent semantically enhanced latent vector, improves the accuracy with which the semantically enhanced latent vector represents word meaning, further strengthening the proper name recognition model's understanding of the input text and improving its performance.
6. In the natural language processing method provided by the embodiments of this application, classifying and converting the semantically enhanced latent vector includes the following steps: passing the semantically enhanced latent vector through a preset fully connected layer and then feeding it into a preset SoftMax classifier to obtain the proper name entity label. The fully connected layer and the SoftMax classifier can visualize the weight information of the different connections contained in the semantically enhanced latent vector and match it against a preset template, making it easier to predict the relationship types between entities.
7. In the natural language processing method provided by the embodiments of this application, the semantically enhanced latent vector of each entity is obtained from the latent vector and a pre-trained word vector library composed of a large number of similar words by feeding the latent vector into a preset key-value memory neural network. A key-value memory neural network enables a machine to accept an input (for example, a question, a puzzle, or a task) and, in response, generate an output (for example, an answer, a solution, or a response to the task) based on information from a knowledge source. The key-value memory network model operates on symbolic memory structured as (key, value) pairs, which gives the proper name recognition model greater flexibility in encoding the input text and helps to bridge the gap between reading text directly and answering from a pre-trained word vector library. By encoding prior knowledge about the task at hand in the key-value memory, a key-value memory network has the versatility to analyze, for example, documents, pre-trained word vector libraries, or pre-trained word vector libraries built using information extraction, and to answer questions about them.
In one embodiment, when the above method is applied to extracting the relationship between two given entities in the input text, obtaining the latent vector of each entity in the input text includes: obtaining the latent vector of each given entity. Obtaining the enhanced representation of each entity from its latent vector includes: calculating the latent vector of each given entity through a preset first algorithm to obtain a first entity vector representation and a second entity vector representation corresponding to the two given entities respectively; and processing the first entity vector representation and the second entity vector representation to obtain a first semantically enhanced representation and a second semantically enhanced representation. Converting the enhanced representation of each entity to obtain a processing result includes: calculating an intermediate vector from the first semantically enhanced representation and the second semantically enhanced representation according to a preset second algorithm; and converting the intermediate vector to obtain the predicted relationship type between the text entities.
Please refer to Figures 7 and 8. The embodiments of this application provide a natural language processing method applied to extracting the relationship between two given entities in the input text. It is implemented through a relation extraction model used to extract relationships between entities; the relation extraction model includes an encoder 20, a decoder 21, and a semantic enhancement module 22 based on the attention mechanism. The natural language processing method includes the following steps: obtaining the input text, passing it to the encoder 20 for encoding, and outputting the latent vector of each given entity in the input text; calculating the latent vector of each given entity through a preset first algorithm to obtain a first entity vector representation and a second entity vector representation corresponding to the two given entities respectively; inputting the first entity vector representation and the second entity vector representation into the semantic enhancement module 22 to obtain a first semantically enhanced representation and a second semantically enhanced representation; calculating an intermediate vector from the first semantically enhanced representation and the second semantically enhanced representation through a preset second algorithm; and converting the intermediate vector and decoding it through the decoder 21 to obtain the predicted relationship type between the text entities.
Due to insufficient training data, traditional relation extraction models often face the problem of sparse entity data and have difficulty correctly extracting entities not encountered during training. In the method provided by this solution, the semantic enhancement module 22 of the relation extraction model semantically enhances each entity in the input text, using the semantics of similar entities to enrich that entity's semantic representation and thereby strengthening the relation extraction model's understanding of the current entity's semantics. That is, if the relation extraction model has not seen an entity during training, it can use the semantics of similar entities to help understand the current entity, thereby improving its ability to understand out-of-vocabulary entities and, in turn, the performance of relation extraction.
In some embodiments, the first algorithm is the Max Pooling algorithm.
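Max Pooling here can be read as taking, for each dimension, the maximum over the hidden vectors of the words that make up an entity. A minimal sketch with toy three-dimensional vectors:

```python
import numpy as np

def max_pool(hidden_vectors):
    """Element-wise maximum over the hidden vectors of an entity's words."""
    return np.max(np.stack(hidden_vectors), axis=0)

# An entity spanning two words, each with a 3-dimensional hidden vector.
h1 = np.array([0.2, -1.0, 0.5])
h2 = np.array([-0.3, 0.4, 0.1])
h_E = max_pool([h1, h2])   # → array([0.2, 0.4, 0.5])
```

This yields a single fixed-size vector per entity regardless of how many words the entity spans.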
Please refer to Figure 9. In some embodiments, obtaining the first semantically enhanced representation and the second semantically enhanced representation includes the following steps: finding one or more approximate entities for each given entity in the input text according to a preset pre-trained word vector library; and calculating, from the approximate entities of the two given entities according to a preset third algorithm together with the first entity vector representation and the second entity vector representation, the first semantically enhanced representation and the second semantically enhanced representation corresponding to the two given entities in the input text.
The pre-trained word vector library can fully cover the approximate entities of each entity in the input text, effectively enhancing the semantic representation of the entity and thus strengthening the relation extraction model's understanding of the current entity's semantics.
Please refer to Figure 10. In some embodiments, the intermediate vector is obtained by calculating the first semantically enhanced representation and the second semantically enhanced representation according to the preset second algorithm as follows: concatenating the first entity vector representation with the first semantically enhanced representation to obtain a first enhanced vector representation; concatenating the second entity vector representation with the second semantically enhanced representation to obtain a second enhanced vector representation; and concatenating the first enhanced vector representation with the second enhanced vector representation to obtain the intermediate vector.
The first entity vector representation captures the semantics of the first entity itself, and the first semantically enhanced representation captures the semantics of similar entities. The first enhanced vector representation obtained by concatenating the two combines the semantics of the entity and its similar entities (and likewise for the second enhanced vector representation), extending the semantics of the original entity so that the computer can understand it better and subsequent computation is easier.
Please refer to Figure 11. In some embodiments, calculating the first and second semantically enhanced representations from the approximate entities of the two given entities, according to the preset third algorithm together with the first and second entity vector representations, includes the following steps: mapping the approximate entities to word vectors through a preset word vector matrix; calculating the weight of each approximate entity from the first entity vector representation (or the second entity vector representation) and the word vectors of the one or more approximate entities of the current entity; and calculating the first semantically enhanced representation (or the second semantically enhanced representation) from the weights and the word vectors.
Calculating weights over the semantics of the different approximate entities identifies and exploits their relative importance, effectively preventing potential noise among similar entities from degrading model performance and thereby improving relation extraction.
In some embodiments, classifying and converting the semantically enhanced latent vector includes the following steps: passing the intermediate vector through a preset fully connected layer and then feeding it into the SoftMax classifier to obtain the predicted relationship type. The fully connected layer and the SoftMax classifier can visualize the weight information of the different connections contained in the semantically enhanced latent vector and match it against a preset template, making it easier to predict the relationship types between entities.
Illustratively, please refer to Figure 6: the two given entities in the input are "食物工厂" (food factory) and "水果罐头" (canned fruit). The model uses a standard encoder-decoder architecture: the encoder 20 uses BERT, and the decoder 21 uses SoftMax.
In the first step, BERT is used to encode the input text to obtain the latent vector of each entity, where the hidden vector of the i-th word x_i is recorded as h_i.
In the second step, the Max Pooling algorithm is used to calculate the vector representations h_E1 and h_E2 of the two entities (E1 and E2 denote the two entities respectively), taking the element-wise maximum over the hidden vectors of the words that make up each entity:

h_E1 = MaxPooling({h_i : x_i ∈ E1})
h_E2 = MaxPooling({h_i : x_i ∈ E2})
In the third step, h_E1 and h_E2 are fed into the semantic enhancement module 22 to obtain the semantically enhanced representations a_1 and a_2 corresponding to E1 and E2 respectively.
In the fourth step, h_E1 is concatenated with a_1 to obtain the first enhanced vector representation:

o_1 = [h_E1 ; a_1]

Similarly, the second enhanced vector representation o_2 can be obtained.
In the fifth step, o_1 and o_2 are concatenated to obtain the intermediate vector:

o = [o_1 ; o_2]
In the sixth step, o is passed through a fully connected layer and then fed into the SoftMax classifier to obtain the predicted relationship type.
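The six steps above can be sketched end to end with placeholder vectors; in the real model h_E1 and h_E2 would come from Max Pooling over BERT hidden states and a_1, a_2 from the semantic enhancement module 22, and the relation label set below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4                                     # hidden size (illustrative)

# Step 2: entity representations (stand-ins for Max Pooling over BERT outputs).
h_E1, h_E2 = rng.normal(size=d), rng.normal(size=d)

# Step 3: semantically enhanced representations (stand-ins for module 22's output).
a_1, a_2 = rng.normal(size=d), rng.normal(size=d)

# Steps 4-5: concatenate into the enhanced representations and the intermediate vector.
o_1 = np.concatenate([h_E1, a_1])
o_2 = np.concatenate([h_E2, a_2])
o = np.concatenate([o_1, o_2])            # size 4d

# Step 6: fully connected layer + SoftMax over a placeholder set of relation types.
relations = ["produces", "located-in", "no-relation"]
W = rng.normal(size=(len(relations), 4 * d))
logits = W @ o
probs = np.exp(logits - logits.max())
probs /= probs.sum()
predicted = relations[int(np.argmax(probs))]
```

The concatenations are the only structural operations between the encoder and the classifier, which keeps the pipeline simple: all semantic enrichment happens inside module 22.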
The processing flow of the semantic enhancement module 22 is as follows:
In the first step, using a pre-trained entity vector library (for example, Tencent's 8-million-word vector library), for each entity E_i in the input text (where i = 1 or 2), the m entities closest to E_i by cosine distance between entity vectors are found and recorded as c_{i,1}, …, c_{i,j}, …, c_{i,m}.
In the second step, the word vector matrix is used to map c_{i,j} to the word vector e_{i,j}.
In the third step, the vector representation h_Ei of E_i and the word vectors e_{i,j} are used to calculate the weights p_{i,j} as follows:

p_{i,j} = exp(h_Ei · e_{i,j}) / Σ_{j'=1}^{m} exp(h_Ei · e_{i,j'})
where "·" is used to compute the inner product of vectors.
In the fourth step, based on the weights p_{i,j}, the average vector a_i of the word vectors e_{i,j} is calculated as follows:

a_i = Σ_{j=1}^{m} p_{i,j} · e_{i,j}
where "·" is used to compute the product of the weight (a positive real number between 0 and 1) and the word vector e_{i,j}.
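These four steps amount to attention over the embeddings of the similar entities. A minimal sketch with toy two-dimensional vectors (the vectors are invented for the example):

```python
import numpy as np

def semantic_enhancement(h_Ei, e):
    """Return the weighted average a_i of similar-entity word vectors e_{i,j},
    with weights p_{i,j} from inner products against the representation h_Ei."""
    scores = e @ h_Ei                          # inner products h_Ei · e_{i,j}
    p = np.exp(scores) / np.exp(scores).sum()  # weights p_{i,j}, summing to 1
    a_i = p @ e                                # a_i = sum_j p_{i,j} · e_{i,j}
    return a_i, p

h_Ei = np.array([1.0, 0.0])       # toy entity representation
e = np.array([[0.9, 0.1],         # similar entity close to h_Ei: large weight
              [-0.8, 0.5]])       # dissimilar entity: small weight
a_i, p = semantic_enhancement(h_Ei, e)
```

Because the weights come from inner products with the entity's own representation, embeddings that agree with the entity's context dominate a_i while dissimilar ones are suppressed, which is how the module limits noise from loosely related entities.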
In some embodiments, converting the intermediate vector to obtain the predicted relationship type between text entities includes the following steps: passing the intermediate vector through a preset fully connected layer and then feeding it into the SoftMax classifier to obtain the predicted relationship type. The fully connected layer and the SoftMax classifier can visualize the weight information of the different connections contained in the semantically enhanced latent vector and match it against a preset template, making it easier to predict the relationship types between entities.
Compared with the related art, the natural language processing method provided by this application achieves the following:
1. The embodiments of this application provide a natural language processing method applied to extracting the relationship between two given entities in the input text, implemented through a relation extraction model, including: obtaining the input text and encoding it to obtain the latent vector of each given entity in the input text; calculating the latent vector of each given entity through a preset first algorithm to obtain a first entity vector representation and a second entity vector representation corresponding to the two given entities respectively; processing the first entity vector representation and the second entity vector representation to obtain a first semantically enhanced representation and a second semantically enhanced representation; calculating an intermediate vector from the first semantically enhanced representation and the second semantically enhanced representation through a preset second algorithm; and converting the intermediate vector to obtain the predicted relationship type between the text entities. Due to insufficient training data, traditional relation extraction models often face the problem of sparse entity data and have difficulty correctly extracting entities not encountered during training. In the relation extraction method provided by this solution, the semantic enhancement module of the relation extraction model semantically enhances each entity in the input text, using the semantics of similar entities to enrich that entity's semantic representation and thereby strengthening the relation extraction model's understanding of the current entity's semantics. That is, if the relation extraction model has not seen an entity during training, it can use the semantics of similar entities to help understand the current entity, thereby improving its ability to understand out-of-vocabulary entities and, in turn, the performance of relation extraction.
2. In the natural language processing method provided by the embodiments of this application, obtaining the first semantically enhanced representation and the second semantically enhanced representation includes the following steps: finding one or more approximate entities for each given entity in the input text according to a preset pre-trained word vector library; and calculating, from the approximate entities of the two given entities according to a preset third algorithm together with the first and second entity vector representations, the first and second semantically enhanced representations corresponding to the two given entities in the input text. The pre-trained word vector library can fully cover the approximate entities of each entity in the input text, effectively enhancing the semantic representation of the entity and thus strengthening the relation extraction model's understanding of the current entity's semantics.
3. In the natural language processing method provided by the embodiments of this application, calculating the intermediate vector from the first and second semantically enhanced representations according to the preset second algorithm includes: concatenating the first entity vector representation with the first semantically enhanced representation to obtain a first enhanced vector representation; concatenating the second entity vector representation with the second semantically enhanced representation to obtain a second enhanced vector representation; and concatenating the first enhanced vector representation with the second enhanced vector representation to obtain the intermediate vector. The first entity vector representation captures the semantics of the first entity itself, and the first semantically enhanced representation captures the semantics of similar entities; the first enhanced vector representation obtained by concatenating the two combines the semantics of the entity and its similar entities (and likewise for the second enhanced vector representation), extending the semantics of the original entity so that the computer can understand it better and subsequent computation is easier.
4. In the natural language processing method provided by the embodiments of this application, calculating the first and second semantically enhanced representations from the approximate entities of the two given entities, according to the preset third algorithm together with the first and second entity vector representations, includes: mapping the approximate entities to word vectors through a preset word vector matrix; calculating the weight of each approximate entity from the first entity vector representation (or the second entity vector representation) and the word vectors of the one or more approximate entities of the current entity; and calculating the first semantically enhanced representation (or the second semantically enhanced representation) from the weights and the word vectors. Calculating weights over the semantics of the different approximate entities identifies and exploits their relative importance, effectively preventing potential noise among similar entities from degrading model performance and thereby improving relation extraction.
5. In a natural language processing method provided by an embodiment of the present application, converting the intermediate vector to obtain the predicted relationship type between text entities includes the following step: passing the intermediate vector through a preset fully connected layer and then feeding the result into a SoftMax classifier to obtain the predicted relationship type. The fully connected layer and the SoftMax classifier make visible the weight information of the different connections contained in the semantically enhanced latent vector and match it against preset templates, so that the relationship type between entities can be predicted more conveniently.
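A minimal sketch of this classification step, assuming a single dense layer with 5 hypothetical relation types; the weight matrix, bias, and dimensions are placeholders rather than the trained parameters described in the application:

```python
import numpy as np

def predict_relation(intermediate, weight_matrix, bias):
    """Fully connected layer followed by SoftMax; the index of the
    largest probability is taken as the predicted relationship type."""
    logits = weight_matrix @ intermediate + bias
    probs = np.exp(logits - logits.max())   # numerically stable SoftMax
    probs /= probs.sum()
    return int(np.argmax(probs)), probs

rng = np.random.default_rng(1)
v = rng.normal(size=16)                       # intermediate vector (toy)
W, b = rng.normal(size=(5, 16)), np.zeros(5)  # 5 hypothetical relation types
label, probs = predict_relation(v, W, b)
print(label, probs.shape)
```

In a trained system the rows of the weight matrix play the role of the "preset templates": each row scores how well the intermediate vector matches one relationship type.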
Referring to Figure 12, an embodiment of the present application further provides a computer device, including a processor 30, a memory 31, and a computer program stored on the memory 31; the processor 30 executes the computer program to implement the above method. The computer device has the same effects as the natural language processing method described above, which are not repeated here.
An embodiment of the present application further provides a readable storage medium on which computer program instructions are stored; when the computer program instructions are executed, the above method is implemented. The readable storage medium has the same effects as the natural language processing method described above, which are not repeated here.
An embodiment of the present application further provides a program product that includes computer program instructions; when the computer program instructions are executed, the above method is implemented. The program product has the same effects as the natural language processing method described above, which are not repeated here.
In the embodiments provided in this application, "B corresponding to A" means that B is associated with A, and B can be determined based on A. It should also be understood, however, that determining B based on A does not mean determining B based on A alone; B may also be determined based on A and/or other information.
Reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Therefore, appearances of "in one embodiment" or "in an embodiment" in various places herein do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. The embodiments described herein are all optional embodiments, and the actions and modules involved are not necessarily required by the present application.
In the various embodiments of the present application, the sequence numbers of the above processes do not necessarily indicate their order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
The flowcharts and block diagrams in the drawings of this application illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. Each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.

Claims (18)

  1. A natural language processing method, comprising:
    obtaining an input text and encoding the input text to obtain a latent vector of each entity in the input text;
    obtaining an enhanced representation of each entity according to the latent vector of the entity;
    converting the enhanced representation of each entity to obtain a processing result.
  2. The method according to claim 1, wherein, when the method is applied to proper name recognition, obtaining the enhanced representation of each entity according to the latent vector of the entity comprises:
    obtaining a semantically enhanced latent vector of each entity according to the latent vector of the entity and a pre-trained word vector library composed of similar words;
    and converting the enhanced representation of each entity to obtain the processing result comprises:
    subjecting the semantically enhanced latent vector of each entity to classification conversion to obtain a proper name entity label corresponding to the entity.
  3. The method according to claim 2, wherein obtaining the semantically enhanced latent vector of each entity according to the latent vector of the entity and the pre-trained word vector library composed of similar words comprises:
    obtaining at least one approximate word for each entity in the input text according to the pre-trained word vector library;
    computing an average vector of each entity from the at least one approximate word of the entity according to a preset first algorithm;
    concatenating the average vector of each entity with the latent vector of the entity to obtain the semantically enhanced latent vector of the entity.
  4. The method according to claim 3, wherein computing the average vector of each entity from the at least one approximate word of the entity according to the preset first algorithm comprises:
    mapping each approximate word to a word vector according to a preset word vector matrix;
    mapping the word vector to a key vector and a value vector according to a preset second algorithm;
    computing the average vector of each entity from the key vector and the value vector of the entity according to a preset third algorithm.
  5. The method according to claim 4, wherein mapping the word vector to the key vector and the value vector according to the preset second algorithm comprises:
    passing a preset key matrix and the word vector into a preset activation function to obtain the key vector;
    passing a preset value matrix and the word vector into a preset activation function to obtain the value vector.
  6. The method according to claim 4, wherein computing the average vector of each entity from the key vector and the value vector of the entity according to the preset third algorithm comprises:
    computing a weight of each similar word of each entity from the latent vector of the entity and the key vector corresponding to the similar word;
    computing the average vector of each entity from the weights and the value vectors of the similar words of the entity.
  7. The method according to claim 2, wherein subjecting the semantically enhanced latent vector of each entity to classification conversion to obtain the proper name entity label corresponding to the entity comprises:
    inputting the semantically enhanced latent vector of each entity into a preset fully connected layer to obtain an output of the preset fully connected layer, and inputting the output of the preset fully connected layer into a preset SoftMax classifier to obtain the proper name entity label of the entity.
  8. The method according to claim 2, wherein obtaining the semantically enhanced latent vector of each entity according to the latent vector and the pre-trained word vector library composed of similar words comprises:
    inputting the latent vector into a preset key-value memory neural network to obtain the semantically enhanced latent vector of each entity.
  9. The method according to claim 1, wherein, when the method is applied to extracting a relationship between two given entities in the input text, obtaining the latent vector of each entity in the input text comprises:
    obtaining the latent vector of each given entity;
    obtaining the enhanced representation of each entity according to the latent vector of the entity comprises:
    computing the latent vector of each given entity through a preset first algorithm to obtain a first entity vector representation and a second entity vector representation respectively corresponding to the two given entities;
    processing the first entity vector representation and the second entity vector representation to obtain a first semantic enhancement representation and a second semantic enhancement representation;
    and converting the enhanced representation of each entity to obtain the processing result comprises:
    computing an intermediate vector from the first semantic enhancement representation and the second semantic enhancement representation according to a preset second algorithm;
    converting the intermediate vector to obtain a predicted relationship type between text entities.
  10. The method according to claim 9, wherein processing the first entity vector representation and the second entity vector representation to obtain the first semantic enhancement representation and the second semantic enhancement representation comprises:
    obtaining at least one approximate entity for each given entity in the input text according to a preset pre-trained word vector library;
    computing the approximate entities of the two given entities according to a preset third algorithm, the first entity vector representation, and the second entity vector representation to obtain the first semantic enhancement representation and the second semantic enhancement representation respectively corresponding to the two given entities.
  11. The method according to claim 9, wherein computing the intermediate vector from the first semantic enhancement representation and the second semantic enhancement representation according to the preset second algorithm comprises:
    concatenating the first entity vector representation with the first semantic enhancement representation to obtain a first enhanced vector representation;
    concatenating the second entity vector representation with the second semantic enhancement representation to obtain a second enhanced vector representation;
    concatenating the first enhanced vector representation with the second enhanced vector representation to obtain the intermediate vector.
  12. The method according to claim 10, wherein computing the approximate entities of the two given entities according to the preset third algorithm, the first entity vector representation, and the second entity vector representation to obtain the first semantic enhancement representation and the second semantic enhancement representation respectively corresponding to the two given entities comprises:
    mapping the approximate entities to word vectors through a preset word vector matrix;
    computing the weight of each approximate entity from the first entity vector representation and the word vector of at least one approximate entity of the current entity, and computing the first semantic enhancement representation from the weights and the word vectors;
    computing the weight of each approximate entity from the second entity vector representation and the word vector of at least one approximate entity of the current entity, and computing the second semantic enhancement representation from the weights and the word vectors.
  13. The method according to claim 9, wherein the first algorithm is a Max Pooling algorithm.
  14. The method according to claim 9, wherein converting the intermediate vector to obtain the predicted relationship type between text entities comprises:
    inputting the intermediate vector into a preset fully connected layer to obtain an output of the preset fully connected layer, and inputting the output of the preset fully connected layer into a preset SoftMax classifier for classification to obtain the relationship type.
  15. The method according to claim 12, wherein the weight is calculated by the following formula:
    p_{i,j} = exp(h_{Ei} · e_{i,j}) / Σ_{j'=1}^{m} exp(h_{Ei} · e_{i,j'})
    where p_{i,j} denotes the weight, h_{Ei} denotes the first entity vector representation or the second entity vector representation, e_{i,j} denotes the word vector, i = 1 or 2, m denotes the number of approximate entities of an entity, and j ∈ (1, m).
  16. A computer device, comprising a processor, a memory, and a computer program stored on the memory, wherein the processor executes the computer program to implement the natural language processing method according to any one of claims 1-15.
  17. A readable storage medium storing computer program instructions which, when executed, implement the natural language processing method according to any one of claims 1-15.
  18. A program product, comprising computer program instructions which, when executed, implement the natural language processing method according to any one of claims 1-15.
PCT/CN2022/128622 2022-07-29 2022-10-31 Natural language processing method, computer device, readable storage medium, and program product WO2024021343A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202210911109.5A CN115270812A (en) 2022-07-29 2022-07-29 Relationship extraction method, computer device readable storage medium and program product
CN202210909700.7A CN115329764A (en) 2022-07-29 2022-07-29 Proper name recognition method, computer equipment, readable storage medium and program product
CN202210911109.5 2022-07-29
CN202210909700.7 2022-07-29

Publications (1)

Publication Number Publication Date
WO2024021343A1 true WO2024021343A1 (en) 2024-02-01

Family

ID=89705143

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/128622 WO2024021343A1 (en) 2022-07-29 2022-10-31 Natural language processing method, computer device, readable storage medium, and program product

Country Status (1)

Country Link
WO (1) WO2024021343A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111274394A (en) * 2020-01-16 2020-06-12 重庆邮电大学 Method, device and equipment for extracting entity relationship and storage medium
CN111291565A (en) * 2020-01-17 2020-06-16 创新工场(广州)人工智能研究有限公司 Method and device for named entity recognition
CN111597341A (en) * 2020-05-22 2020-08-28 北京慧闻科技(集团)有限公司 Document level relation extraction method, device, equipment and storage medium
CN113204618A (en) * 2021-04-30 2021-08-03 平安科技(深圳)有限公司 Information identification method, device and equipment based on semantic enhancement and storage medium
WO2022005188A1 (en) * 2020-07-01 2022-01-06 Samsung Electronics Co., Ltd. Entity recognition method, apparatus, electronic device and computer readable storage medium


Similar Documents

Publication Publication Date Title
WO2022022163A1 (en) Text classification model training method, device, apparatus, and storage medium
CN111488739A (en) Implicit discourse relation identification method based on multi-granularity generated image enhancement representation
CN111324696B (en) Entity extraction method, entity extraction model training method, device and equipment
WO2021174922A1 (en) Statement sentiment classification method and related device
CN114139551A (en) Method and device for training intention recognition model and method and device for recognizing intention
WO2023020522A1 (en) Methods for natural language processing and training natural language processing model, and device
CN116304748B (en) Text similarity calculation method, system, equipment and medium
CN113887229A (en) Address information identification method and device, computer equipment and storage medium
CN113282714B (en) Event detection method based on differential word vector representation
CN111368066B (en) Method, apparatus and computer readable storage medium for obtaining dialogue abstract
CN116579345B (en) Named entity recognition model training method, named entity recognition method and named entity recognition device
CN115759092A (en) Network threat information named entity identification method based on ALBERT
WO2023134085A1 (en) Question answer prediction method and prediction apparatus, electronic device, and storage medium
WO2022228127A1 (en) Element text processing method and apparatus, electronic device, and storage medium
CN115718792A (en) Sensitive information extraction method based on natural semantic processing and deep learning
CN115934883A (en) Entity relation joint extraction method based on semantic enhancement and multi-feature fusion
CN112256765A (en) Data mining method, system and computer readable storage medium
WO2024021343A1 (en) Natural language processing method, computer device, readable storage medium, and program product
CN115659242A (en) Multimode emotion classification method based on mode enhanced convolution graph
CN113157866B (en) Data analysis method, device, computer equipment and storage medium
CN114662499A (en) Text-based emotion recognition method, device, equipment and storage medium
CN115221284A (en) Text similarity calculation method and device, electronic equipment and storage medium
CN114925695A (en) Named entity identification method, system, equipment and storage medium
CN114722818A (en) Named entity recognition model based on anti-migration learning
CN113705197A (en) Fine-grained emotion analysis method based on position enhancement

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22952781

Country of ref document: EP

Kind code of ref document: A1