CN114330342A - Named entity identification method, device and equipment - Google Patents

Info

Publication number
CN114330342A
Authority
CN
China
Prior art keywords
text
characteristic
target word
feature
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011081230.7A
Other languages
Chinese (zh)
Inventor
许璐
邴立东
陆巍
揭展明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Singapore University of Technology and Design
Original Assignee
Alibaba Group Holding Ltd
Singapore University of Technology and Design
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd, Singapore University of Technology and Design
Priority to CN202011081230.7A
Publication of CN114330342A

Landscapes

  • Machine Translation (AREA)

Abstract

The application discloses a named entity recognition method, device and equipment. The method comprises the following steps: determining a first text feature and a first text structure feature of a target word in a text to be processed; determining a second text structure feature of the target word entering a memory state, at least according to the first text structure feature of the target word and the second text structure feature of the previous word entering the memory state; determining a second text feature of the target word at least according to the first text feature and the second text structure feature of the target word; and determining the named entity type of the target word at least according to the second text feature of the target word. This processing better fuses the linear features and the structural features of the text, and can therefore effectively improve recognition accuracy.

Description

Named entity identification method, device and equipment
Technical Field
The application relates to the technical field of natural language processing, in particular to a named entity identification method, device and equipment.
Background
In e-commerce scenarios, named entity recognition can be used to analyze the key entity words in user searches and the entity words in commodity names, helping buyers locate the products they are searching for more accurately. Named entity recognition extracts named entities from unstructured text and classifies them; for example, from the commodity name "Hua Qi V30 cell phone", "Hua Qi" is extracted as the cell phone brand and "Rong Qi V30" as the cell phone model.
At present, a typical named entity recognition method extracts text features through a common neural network model (such as a bidirectional long short-term memory network, Bi-LSTM) and then inputs the extracted features into a conditional random field (CRF) model for sequence labeling, thereby determining which named entities appear in the text.
However, in the process of implementing the present invention, the inventors found that the above solution has at least the following problems: 1) the common model does not fuse text linear features and text structural features well, so the extracted features are poorly expressive and the recognition accuracy of named entities is limited; 2) under the influence of context, the same word has different meanings in different texts, yet a common neural network model has difficulty capturing dependencies between words that are far apart, loses information when the text is too long, and cannot fully capture the structural information present in a sentence, so the meaning of the target word within the whole text is difficult to extract. In summary, how to improve the recognition accuracy of named entities, and thereby improve search accuracy, is a problem that those skilled in the art need to solve.
Disclosure of Invention
The application provides a named entity identification method, which aims to solve the problem of low named entity identification accuracy in the prior art. The application additionally provides a named entity recognition device and equipment.
The application provides a named entity identification method, which comprises the following steps:
determining a first text characteristic and a first text structure characteristic of a target word in a text to be processed;
determining a second text structure characteristic of the target word entering a memory state at least according to the first text structure characteristic of the target word and the second text structure characteristic of the previous word entering the memory state;
determining a second text characteristic of the target word according to at least the first text characteristic and the second text structure characteristic of the target word;
and determining the named entity type of the target word according to at least the second text characteristic of the target word.
Optionally, the first text structure feature of the target word is determined through a structure information extraction module included in the named entity recognition model;
determining a second text structure characteristic of the target word entering a memory state at least according to a first text structure characteristic of the target word and a second text structure characteristic of the previous word entering the memory state through a characteristic memory sub-module included in a text characteristic determination module in the named entity recognition model;
determining a second text characteristic of the target word at least according to the first text characteristic and the second text structure characteristic of the target word through an output sub-module included by the text characteristic determining module;
and determining the named entity type of the target word at least according to the second text characteristic of the target word through a classifier included in the named entity recognition model.
Optionally, the method further includes:
determining a third text characteristic of the target word entering a memory state at least according to the second text characteristic of the fused text linear information and the text structure information of the previous word of the target word, the first text structure characteristic of the target word and the third text characteristic of the previous word entering the memory state through the characteristic memory sub-module, wherein the third text characteristic comprises the second text structure characteristic of the previous word entering the memory state;
and determining the second text characteristic of the target word through the output sub-module at least according to the first text characteristic of the target word, the second text characteristic of the previous word, the first text structure characteristic of the target word and the third text characteristic of the target word entering a memory state.
Optionally, the method further includes:
the feature memory submodule determines a second text structure feature of the target word entering a memory state at least according to the second text feature of the fused text linear information and the text structure information of the previous word of the target word and the first text structure feature of the target word through a feature memory controller;
and the feature memory submodule determines a third text feature of the target word entering the memory state at least according to the second text structure feature of the target word entering the memory state and the third text feature of the previous word entering the memory state through the feature memory.
Optionally, the method further includes:
the output sub-module determines a fourth text characteristic at least according to the first text characteristic and the first text structure characteristic of the target word and the second text characteristic of the previous word through the first output controller;
and the output sub-module determines a second text characteristic of the target word through a second output controller at least according to the fourth text characteristic and the third text characteristic of the target word entering the memory state.
Optionally, determining the first text structure feature of the target word through the structure information extraction module included in the named entity recognition model comprises the following steps:
determining word dependency relationships of the text, wherein the word dependency relationships comprise dependency relationship types;
determining a word dependency relationship matrix according to the word dependency relationship;
determining a third text structure characteristic of the target word according to the matrix, the word vector of the target word and the dependency relationship type vector;
and determining the first text structure characteristic of the target word according to the third text structure characteristic of the target word through a structure information extraction module based on the multilayer graph neural network.
Optionally, the determining, by the structural information extraction module based on the multilayer graph neural network, the first text structural feature of the target word according to the third text structural feature of the target word includes:
determining a fourth text structure characteristic of the target word according to the third text structure characteristic through the first layer graph neural network included by the structure information extraction module;
and determining the first text structure characteristic of the target word according to the fourth text structure characteristic through a second-layer graph neural network included by the structure information extraction module.
Optionally, the method further includes:
and determining, through a feature forgetting submodule included in the named entity recognition model, the features of the previous word that are not to be memorized, according to the second text feature of the previous word, the first text feature of the target word and the first text structure feature of the target word.
The application also provides a named entity identification model construction method, which comprises the following steps:
acquiring a corresponding relation set between the text and the named entity labeling data;
constructing a network structure of a named entity recognition model; the model comprises a structure information extraction module and a text feature determination module, wherein the text feature determination module comprises a feature memory submodule and an output submodule; the structure information extraction module is used for determining a first text structure feature of the target word; the feature memory submodule is used for determining a second text structure feature of the target word entering the memory state at least according to the first text structure feature of the target word and the second text structure feature of the previous word entering the memory state; the output submodule is used for determining a second text feature of the target word at least according to the first text feature and the second text structure feature of the target word;
and learning to obtain a named entity recognition model according to the corresponding relation set.
The present application further provides a named entity recognition apparatus, including:
the first feature determination unit is used for determining a first text feature and a first text structure feature of a target word in a text to be processed;
the text structure characteristic control unit is used for determining a second text structure characteristic of the target word entering the memory state at least according to the first text structure characteristic of the target word and the second text structure characteristic of the previous word entering the memory state;
the second characteristic determining unit is used for determining a second text characteristic of the target word at least according to the first text characteristic and the second text structure characteristic of the target word;
and the classification unit is used for determining the named entity type of the target word at least according to the second text characteristic of the target word.
The present application further provides a named entity recognition model building apparatus, including:
the training data acquisition unit is used for acquiring a corresponding relation set between the text and the named entity labeling data;
the network construction unit is used for constructing a network structure of the named entity recognition model; the model comprises a structure information extraction module and a text feature determination module, wherein the text feature determination module comprises a feature memory submodule and an output submodule; the structure information extraction module is used for determining a first text structure feature of the target word; the feature memory submodule is used for determining a second text structure feature of the target word entering the memory state at least according to the first text structure feature of the target word and the second text structure feature of the previous word entering the memory state; the output submodule is used for determining a second text feature of the target word at least according to the first text feature and the second text structure feature of the target word;
and the training unit is used for learning to obtain a named entity recognition model according to the corresponding relation set.
The present application also provides a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to perform the various methods described above.
The present application also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the various methods described above.
Compared with the prior art, the method has the following advantages:
the named entity recognition method provided by the embodiment of the application determines a first text characteristic and a first text structure characteristic of a target word in a text to be processed; determining a second text structure characteristic of the target word entering a memory state at least according to the first text structure characteristic of the target word and the second text structure characteristic of the previous word entering the memory state; determining a second text characteristic of the target word at least according to the first text characteristic of the target word and the second text structure characteristic of the target word entering a memory state; determining the named entity type of the target word at least according to the second text characteristic of the target word; by the processing mode, two characteristics of text linear characteristics and text structural characteristics are better fused; therefore, the recognition accuracy can be effectively improved.
According to the method for constructing the named entity recognition model provided by the embodiment of the application, a set of correspondences between texts and named entity labeling data is obtained; a network structure of the named entity recognition model is constructed; the model comprises a structure information extraction module and a text feature determination module, wherein the text feature determination module comprises a feature memory submodule and an output submodule; the structure information extraction module is used for determining a first text structure feature of the target word; the feature memory submodule is used for determining a second text structure feature of the target word entering the memory state at least according to the first text structure feature of the target word and the second text structure feature of the previous word entering the memory state; the output submodule is used for determining a second text feature of the target word at least according to the first text feature and the second text structure feature of the target word; a named entity recognition model is learned according to the correspondence set; this processing better fuses the text linear features and the text structural features, which strengthens the predictive capability of the model and can therefore effectively improve model accuracy.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a named entity recognition method provided by the present application;
FIG. 2 is a schematic diagram of a model structure of an embodiment of a named entity recognition method provided by the present application;
FIG. 3 is a schematic structural diagram of a text feature determination module according to an embodiment of a named entity recognition method provided by the present application;
FIG. 4 is a schematic diagram of word dependency relationships in an embodiment of a named entity recognition method provided by the present application;
FIG. 5 is a schematic diagram of a structural information extraction module based on a two-layer graph neural network according to an embodiment of a named entity recognition method provided by the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the application can be implemented in many ways other than those described herein, and those skilled in the art can make similar extensions without departing from the spirit of the application; the application is therefore not limited to the specific implementations disclosed below.
In the application, a named entity identification method and device, a named entity identification model construction method and device and electronic equipment are provided. Each of the schemes is described in detail in the following examples.
First embodiment
Please refer to fig. 1, which is a flowchart illustrating an embodiment of a named entity recognition method according to the present application. The execution subject of the method includes but is not limited to a server, and may be any device capable of implementing the method. In this embodiment, the method may include the steps of:
Step S101: determining a first text feature and a first text structure feature of a target word in a text to be processed.
The text to be processed may be a sentence composed of a plurality of words. Named Entity Recognition (NER), also called "proper name Recognition", refers to recognizing entities with specific meaning in text, mainly including names of people, places, organizations, proper nouns, etc.
For example, the text to be processed is: "ACM announced that the three creators of deep learning Yoshua Bengio, Yann LeCun, and Geoffrey Hinton received the Turing prize in 2019". The task of NER is to extract from this sentence "organization name: ACM", "person name: Yoshua Bengio, Yann LeCun, Geoffrey Hinton", "time: 2019" and "proper noun: Turing prize".
For example, in the e-commerce scene, when different users search commodities on the e-commerce platform through the client, the server performs named entity recognition on search terms specified by the users through the named entity recognition model, so that entities with specific meanings similar to regions, brands, trade names and the like can be accurately recognized, and user experience can be effectively improved. For another example, the server may also perform entity identification such as brand name, material, consumer group, region, commodity category, and the like on the commodity description in the commodity library through the named entity identification model, so as to recommend the interested commodity to the user.
The text to be processed can be subjected to named entity type prediction on each word through a named entity recognition model. The named entity recognition model includes a text feature determination module and a classifier. The text feature determination module is used for determining the feature (namely, a second text feature) of each word in the text to be processed; the classifier is used for determining the type of the named entity of the word according to the second text characteristic, so that the named entity existing in the text to be processed is obtained.
The text feature determination module may be a neural network-based model, and obtains a better word feature vector based on text information, that is, a second text feature, by fusing a linear text feature and a structural information feature of a target word.
The classifier may use a conditional random field (CRF) or another structure. Taking the conditional random field as an example, this layer mainly uses the conditional random field to capture the relationships between the tags corresponding to the words. Through the conditional random field, the globally optimal tags can be found according to the relationships between tags; that is, the optimal tag sequence is found by taking the mutual influence between tags into account. The named entity type of each word is thereby determined, and the named entities present in the text to be processed are obtained.
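The way a conditional random field layer selects a globally optimal tag sequence can be illustrated with a minimal Viterbi decoding sketch. This is not taken from the application: the function name `viterbi_decode`, the tag set and all scores below are hypothetical, and a trained model would supply the emission and transition scores.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag] is the per-word score of `tag` at position t;
    transitions[prev][cur] is the score of moving from tag `prev` to tag `cur`.
    """
    if not emissions:
        return []
    tags = list(emissions[0])
    # best[t][tag]: best total score of any path that ends in `tag` at position t
    best = [dict(emissions[0])]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical scores for a two-word text; "O" -> "I" is heavily penalised
# because an inside tag should not follow an outside tag.
tags = ["B", "I", "O"]
transitions = {p: {c: 0.0 for c in tags} for p in tags}
transitions["O"]["I"] = -10.0
emissions = [
    {"B": 1.0, "I": 0.0, "O": 1.2},
    {"B": 0.0, "I": 2.0, "O": 0.5},
]
decoded = viterbi_decode(emissions, transitions)
```

Choosing the best tag for each word independently would give "O" followed by "I" here; once the transition scores are taken into account, the decoder prefers "B" followed by "I", which is the kind of mutual influence between tags described above.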
The input data of the named entity recognition model comprises: the first text feature of each word, such as $X_{t-1}$, $X_t$, $X_{t+1}$ and $X_{t+2}$ in FIG. 2, and the first text structure feature of each word (its symbol is rendered as an image in the original publication, Figure BDA0002716553370000061). Assuming that the target word is the t-th word in the text to be processed, the first text feature of the target word is $X_t$, and the first text structure feature of the target word is denoted by the symbol shown in Figure BDA0002716553370000062 of the original publication.
The first text feature includes, but is not limited to: word vectors, character vectors, dependency relationship vectors and part-of-speech vectors. The first text structure feature includes, but is not limited to: word vectors, character vectors and dependency relationship vectors. The word vectors may be 100-dimensional GloVe vectors, the character vectors may be obtained by a bidirectional LSTM, and the dependency relationship vectors and part-of-speech vectors are randomly initialized.
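As a small illustration of how such per-word inputs might be assembled, the sketch below simply concatenates the component vectors. Concatenation is an assumption made here for illustration (the text only lists which vectors each feature includes), and the helper name `concat_features` and the toy dimensions are hypothetical.

```python
from typing import List, Sequence


def concat_features(*parts: Sequence[float]) -> List[float]:
    """Concatenate per-word vectors into a single feature vector."""
    feature: List[float] = []
    for part in parts:
        feature.extend(part)
    return feature


# Toy vectors with tiny, made-up dimensions
# (the text mentions 100-dimensional GloVe word vectors).
word_vec = [0.1, 0.2, 0.3]
char_vec = [0.5, 0.5]
dep_vec = [1.0]
pos_vec = [0.0, 1.0]

# First text feature: word, character, dependency and part-of-speech vectors.
x_t = concat_features(word_vec, char_vec, dep_vec, pos_vec)
# Input to the structure branch: word, character and dependency vectors.
structure_input = concat_features(word_vec, char_vec, dep_vec)
```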
As shown in FIG. 2, in this embodiment, the named entity recognition model may further include a structure information extraction module, configured to determine the first text structure feature of the target word (symbol rendered as an image in the original publication, Figure BDA0002716553370000071).
The structure information extraction module may use a graph neural network structure, or another network structure such as a recurrent neural network.
In this embodiment, the structure information extraction module based on the graph neural network determines the first text structure feature of the target word (Figure BDA0002716553370000072 in the original publication) according to the word dependency relationships in the text to be processed.
Because the graph neural network operates on the adjacency relation between word i and word j in the graph, the relation between distant words is not weakened by how far apart word i and word j are in the sentence. The graph neural network therefore uses the dependency structure so that its output carries the structural information of the text, and a graph neural network based on the dependency graph can better capture the relations between long-distance words.
The graph neural network operates on a graph with n nodes, whose structure can be represented as an n × n matrix A. For layer l of the graph neural network, the expression can be defined as:
[Equation rendered as an image in the original publication: Figure BDA0002716553370000073]
where i denotes the i-th node, A(i, j) indicates that node i is connected with node j, and $h_j^{(l-1)}$ and $h_i^{(l)}$ denote, respectively, the input vector (of node j) and the output vector (of node i) at layer l. W and b are learnable parameters.
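Since the layer expression itself is only available as an image, the following sketch assumes a common graph-convolution form that is consistent with the symbol definitions above, namely $h_i^{(l)} = \mathrm{ReLU}(\sum_j A(i,j)\, W h_j^{(l-1)} + b)$. The ReLU activation, the self-loops in the toy matrix and the function name `gnn_layer` are assumptions, not details confirmed by the text.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, v: Vector) -> Vector:
    """Multiply matrix W by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gnn_layer(A: Matrix, H: List[Vector], W: Matrix, b: Vector) -> List[Vector]:
    """One layer: h_i = ReLU(sum_j A[i][j] * (W @ h_j) + b)  (assumed form)."""
    outputs = []
    for i in range(len(H)):
        acc = list(b)
        for j in range(len(H)):
            if A[i][j]:
                message = matvec(W, H[j])
                acc = [a + A[i][j] * m for a, m in zip(acc, message)]
        outputs.append([max(0.0, x) for x in acc])
    return outputs


# Toy chain graph 0 - 1 - 2 with self-loops (the self-loops are an assumption).
A = [[1.0, 1.0, 0.0],
     [1.0, 1.0, 1.0],
     [0.0, 1.0, 1.0]]
H0 = [[1.0], [0.0], [0.0]]   # only node 0 carries a signal
W = [[1.0]]
b = [0.0]

H1 = gnn_layer(A, H0, W, b)  # after one layer the signal reaches node 1
H2 = gnn_layer(A, H1, W, b)  # after two layers it also reaches node 2
```

In this toy graph, node 2 is two edges away from node 0, so it only receives node 0's signal after the second layer.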
In one example, the determining a first text structure feature of the target word by the structure information extraction module included in the named entity recognition model may include the following sub-steps:
1) determining word dependencies for the text, the word dependencies including a dependency type.
The word dependency relationship of the text refers to the dependency relationship between words in the text. Word dependencies may be determined by prior art techniques and are therefore not described in detail herein.
As shown in FIG. 4, the text to be processed is "Precision Castparts Corp., Portland, begin trading with the symbol PCP.", wherein the word "Corp" depends on "Castparts", "Precision" and "Portland"; the word "Portland" depends on ","; the word "begin" depends on "pull", "trading", "" and "Corp"; the word "trading" depends on "with"; the word "with" depends on "symbol"; the word "symbol" depends on "the" and "PCP".
The word dependency relationship includes a dependency relationship type, which may be a grammatical relationship such as a verb-object relationship, a predicate relationship or a modifier relationship (which word describes which).
2) Determining a word dependency relationship matrix according to the word dependency relationships.
This step converts the word dependencies into a corresponding word dependency matrix A, where A (i, j) indicates that there is a dependency between word i and word j.
3) Determining a third text structure feature of the target word according to the matrix, the word vector of the target word and the dependency relationship type vector; the third text structure feature of the t-th target word is shown in FIG. 2 (symbol rendered as an image in the original publication, Figure BDA0002716553370000081).
4) Determining, through a structure information extraction module based on a multilayer graph neural network, the first text structure feature of the target word (Figure BDA0002716553370000082 in the original publication) according to the third text structure feature of the target word.
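Sub-step 2) above, converting word dependencies into a matrix A, can be sketched as follows. The arcs are those listed for the FIG. 4 example, with token spellings normalized here and the punctuation and unclear dependents omitted; treating the matrix as symmetric and the function name `dependency_matrix` are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def dependency_matrix(
    tokens: Sequence[str],
    arcs: Sequence[Tuple[str, str]],
    symmetric: bool = True,
) -> List[List[int]]:
    """Build A where A[i][j] = 1 if there is a dependency between word i and word j."""
    index = {token: k for k, token in enumerate(tokens)}
    n = len(tokens)
    matrix = [[0] * n for _ in range(n)]
    for head, dependent in arcs:
        i, j = index[head], index[dependent]
        matrix[i][j] = 1
        if symmetric:  # assumption: the relation is stored in both directions
            matrix[j][i] = 1
    return matrix


tokens = ["Precision", "Castparts", "Corp", "Portland", "begin",
          "trading", "with", "the", "symbol", "PCP"]
# (word, word it is listed as depending on) pairs from the FIG. 4 example.
arcs = [
    ("Corp", "Castparts"), ("Corp", "Precision"), ("Corp", "Portland"),
    ("begin", "trading"), ("begin", "Corp"),
    ("trading", "with"), ("with", "symbol"),
    ("symbol", "the"), ("symbol", "PCP"),
]
A = dependency_matrix(tokens, arcs)
```

This sketch relies on every token being unique within the sentence; a real implementation would index words by position rather than by spelling.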
From the above graph neural network expression, it can be seen that a single-layer graph neural network can only exchange information between the target node and its adjacent nodes, whereas a multilayer graph neural network can also exchange information with more distant nodes. In particular, a five-layer graph neural network can be used to capture five-level cascaded dependency relationships between words; for example, the word "Corp" in FIG. 4 is linked to "PCP" through a five-level cascade, so that long-distance word-to-word relationships can be captured.
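The five-level cascade can be checked with a short breadth-first search over the dependency arcs of the FIG. 4 example. The sketch assumes that each layer passes information across one dependency edge in either direction; the token spellings are normalized here and the function name `hop_distance` is hypothetical.

```python
from collections import deque
from typing import Dict, Optional, Sequence, Set, Tuple


def hop_distance(arcs: Sequence[Tuple[str, str]], source: str, target: str) -> Optional[int]:
    """Number of dependency edges on the shortest path between two words."""
    neighbours: Dict[str, Set[str]] = {}
    for head, dependent in arcs:
        neighbours.setdefault(head, set()).add(dependent)
        neighbours.setdefault(dependent, set()).add(head)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two words are not connected


# Dependency arcs listed for FIG. 4 (punctuation and unclear tokens omitted).
fig4_arcs = [
    ("Corp", "Castparts"), ("Corp", "Precision"), ("Corp", "Portland"),
    ("begin", "trading"), ("begin", "Corp"),
    ("trading", "with"), ("with", "symbol"),
    ("symbol", "the"), ("symbol", "PCP"),
]
layers_needed = hop_distance(fig4_arcs, "Corp", "PCP")
```

Under these assumptions the shortest path is Corp, begin, trading, with, symbol, PCP, that is, five edges, which matches the five layers mentioned above.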
However, although a five-layer graph neural network can capture the relationships between long-distance words, it also introduces more noise. Experiments show that a better recognition effect can be achieved by combining a two-layer graph neural network with the feature fusion method provided by the embodiment of the application. As shown in FIG. 4, with a two-layer graph neural network combined in this way, cascaded dependency relationships of two or more levels between long-distance words can be captured, such as the word "corp.
In another example, as shown in FIG. 5, determining the first text structure feature of the target word according to the third text structure feature of the target word through the structure information extraction module based on the multilayer graph neural network may include the following sub-steps: 1) determining, through the first-layer graph neural network included in the structure information extraction module, a fourth text structure feature of the target word according to the third text structure feature (the input vector; Figure BDA0002716553370000083 in the original publication); 2) determining, through the second-layer graph neural network included in the structure information extraction module, the first text structure feature of the target word (Figure BDA0002716553370000084 in the original publication) according to the fourth text structure feature.
After the first text feature and the first text structure feature of the target word are determined in step S101, the second text feature of the target word can be determined in a unique feature fusion manner provided in the embodiment of the present application.
Step S103: and determining the second text structure characteristic of the target word entering the memory state according to at least the first text structure characteristic of the target word and the second text structure characteristic of the previous word entering the memory state.
The named entity recognition model provided by the embodiment of the application differs from simply stacking a common graph neural network and an LSTM: it fuses the structural information and the text information of a word, that is, the first text structure feature and the first text feature of the target word, in a new way. The model of this embodiment introduces the structural information of each word (the first text structure feature; that is, the structural information of the current state is introduced at each state) and determines how much of that structural information is admitted (the second text structure feature of each word entering the memory state) according to how useful it is. With this design, the meaning of each word in the text to be processed (the second text feature of the target word) can be extracted effectively, so that the same word can have different meanings in different texts.
The previous word (also rendered as "the above words") may comprise a plurality of words that precede the target word in the text to be processed, and may or may not include the word immediately adjacent to the target word. In this embodiment, the text feature determination module in the named entity recognition model determines a second text structure feature for each word in the text to be processed, and the second text structure feature of the previous word entering the memory state may comprise the second text structure features, entering the memory state, of a plurality of preceding words. This step determines the second text structure feature of the target word entering the memory state, that is, the amount of the target word's structural information that is admitted, at least according to the first text structure feature of the target word and the second text structure feature of the previous word entering the memory state.
In one example, the second text structure feature of the target word entering the memory state is determined at least according to the first text structure feature of the target word and the second text structure feature of the previous word entering the memory state by a feature memory submodule included in a text feature determination module in the named entity recognition model.
The text feature determination module may use a bidirectional LSTM to fuse ordinary linear text features with structural information features (the first text structure feature of the target word), yielding a better word feature vector based on text information. The bidirectional LSTM is modified so that its feature memory submodule can decide how much structural information to admit according to how useful the structural information (the first text structure feature of the target word) is.
In this embodiment, the feature memory submodule determines the third text feature C_t of the target word entering the memory state (which may include the second text structure feature m_t of the target word entering the memory state) at least according to: the second text feature h_{t-1} of the previous word of the target word, which fuses text linear information and text structure information; the first text structure feature g_t of the target word; and the third text feature C_{t-1} of the above word entering the memory state (which includes the second text structure feature m_{t-1} of the above word entering the memory state). The third text feature comprises the second text structure feature entering the memory state, and may also comprise text linear features and the like entering the memory state. Text linear information is a common concept in the prior art and is not described in detail here.
As shown in FIG. 3, the feature memory submodule uses a feature memory controller (the newly designed m_t gate) to determine the second text structure feature m_t of the target word entering the memory state, at least according to the second text feature h_{t-1} of the previous word of the target word, which fuses text linear information and text structure information, and the first text structure feature g_t of the target word. The feature memory submodule then uses a feature memory to determine the third text feature C_t of the target word entering the memory state, at least according to the second text structure feature m_t of the target word entering the memory state and the third text feature C_{t-1} of the above word entering the memory state.
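By way of a non-limiting illustration, the feature memory controller and the memory update described above might be sketched as follows. The sketch assumes that the m_t gate is a sigmoid over a weighted sum of h_{t-1} and the first text structure feature of the target word (written g_t here), and that the memory state is updated additively as in an ordinary LSTM. Scalars stand in for vectors, and the parameter names u_m, q_m, b_m, the candidate value s_t, the retained share f_t and the term linear_part are hypothetical; none of them is taken from the embodiment.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function; squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def memory_gate(h_prev: float, g_t: float,
                u_m: float, q_m: float, b_m: float) -> float:
    """m_t: share of the structural information admitted to the memory state.

    u_m, q_m and b_m are hypothetical learned parameters.
    """
    return sigmoid(u_m * h_prev + q_m * g_t + b_m)


def update_memory(c_prev: float, f_t: float, m_t: float, s_t: float,
                  linear_part: float = 0.0) -> float:
    """C_t = f_t * C_{t-1} + (linear text contribution) + m_t * s_t.

    s_t is an assumed candidate value derived from the structure feature;
    linear_part stands in for text linear features entering the memory state.
    """
    return f_t * c_prev + linear_part + m_t * s_t


m_t = memory_gate(h_prev=0.2, g_t=0.9, u_m=0.5, q_m=1.5, b_m=-0.3)
c_t = update_memory(c_prev=1.0, f_t=0.5, m_t=m_t, s_t=0.4)
print(f"m_t={m_t:.3f}  C_t={c_t:.3f}")
```

Because m_t lies strictly between 0 and 1, structural information judged to be of little use contributes almost nothing to C_t, while useful structural information passes through nearly unchanged.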
After the second text structure feature of the target word entering the memory state has been determined, the next step determines the second text feature of the target word at least according to the first text feature and the second text structure feature of the target word.
Step S105: determining a second text characteristic of the target word according to at least the first text characteristic and the second text structure characteristic of the target word.
In the method provided by this embodiment, steps S103 and S105 together fuse the two kinds of features of the target word, the text linear feature and the text structure feature, more effectively. In this embodiment, the output submodule included in the text feature determination module determines the second text feature of the target word at least according to the first text feature and the second text structure feature of the target word.
As shown in FIG. 3, in a specific implementation, the output submodule may determine the second text feature h_t of the target word at least according to the first text feature X_t of the target word, the second text feature h_{t-1} of the previous word, the first text structure feature g_t of the target word, and the third text feature C_t of the target word entering the memory state.
As shown in FIG. 3, in a specific implementation, the output submodule uses a first output controller (the o_t gate) to determine a fourth text feature o_t at least according to the first text feature X_t and the first text structure feature g_t of the target word and the second text feature h_{t-1} of the previous word. The output submodule then uses a second output controller to determine the second text feature h_t of the target word at least according to the fourth text feature o_t and the third text feature C_t of the target word entering the memory state.
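As a hedged sketch of the two output controllers, the following assumes that the first controller is a sigmoid gate over the first text feature of the target word (x_t in the code), its first text structure feature (g_t) and h_{t-1}, and that the second controller scales a tanh of the memory state, as the output stage of a standard LSTM does. The embodiment does not state these formulas; the weights w_o, q_o, u_o and b_o are hypothetical, and scalars again stand in for vectors.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function; squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def output_gate(x_t: float, g_t: float, h_prev: float,
                w_o: float, q_o: float, u_o: float, b_o: float) -> float:
    """First output controller: the fourth text feature o_t, a value in (0, 1)."""
    return sigmoid(w_o * x_t + q_o * g_t + u_o * h_prev + b_o)


def hidden_state(o_t: float, c_t: float) -> float:
    """Second output controller (assumed form): h_t = o_t * tanh(C_t)."""
    return o_t * math.tanh(c_t)


o_t = output_gate(x_t=0.7, g_t=0.9, h_prev=0.2,
                  w_o=1.0, q_o=0.8, u_o=0.5, b_o=0.0)
h_t = hidden_state(o_t, c_t=0.82)
print(f"o_t={o_t:.3f}  h_t={h_t:.3f}")
```

Under these assumptions h_t is bounded in magnitude by o_t, so the output gate decides how much of the memorized information is exposed as the second text feature of the target word.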
As shown in FIG. 3, this embodiment may further use a feature forgetting submodule (the f_t gate) included in the named entity recognition model to determine which features of the above words are forgotten, according to the second text feature h_{t-1} of the previous word and the first text feature X_t and the first text structure feature g_t of the target word.
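For orientation, the gates described for FIG. 3 can be written out as a set of equations. This is a sketch under stated assumptions rather than the formulas of the embodiment: it writes g_t for the first text structure feature of the target word, takes every gate to be a sigmoid over an affine map of the inputs that the description attributes to it, and assumes an additive memory update. The weight matrices W, U, Q, the biases b, the candidate terms \tilde{c}_t and \tilde{s}_t, and the inputs assumed for the input gate i_t of FIG. 3 are notation introduced here.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f X_t + U_f h_{t-1} + Q_f g_t + b_f\right) \\
i_t &= \sigma\left(W_i X_t + U_i h_{t-1} + b_i\right) \\
m_t &= \sigma\left(U_m h_{t-1} + Q_m g_t + b_m\right) \\
o_t &= \sigma\left(W_o X_t + U_o h_{t-1} + Q_o g_t + b_o\right) \\
\tilde{c}_t &= \tanh\left(W_c X_t + U_c h_{t-1} + b_c\right) \\
\tilde{s}_t &= \tanh\left(U_s h_{t-1} + Q_s g_t + b_s\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{c}_t + m_t \odot \tilde{s}_t \\
h_t &= o_t \odot \tanh\left(C_t\right)
\end{aligned}
```

Read this way, the only departure from an ordinary LSTM cell is the extra term m_t \odot \tilde{s}_t in the memory update, together with the presence of g_t in the forget and output gates.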
With reference to FIG. 2, FIG. 3 and FIG. 4, the named entity recognition model provided in this embodiment builds on the original LSTM. It adds a graph-encoded representation (g_t) module, based on a two-layer graph neural network, which supplies the captured text structure feature g_t as input. To go with this newly added structural information extraction module, a new gate (the feature memory controller) is designed to control how much of the structural information g_t enters the memory state (cell state). In FIG. 3, f_t, i_t, m_t and o_t denote the output data of the forget gate, the input gate, the newly added gate (the feature memory controller) and the output gate, respectively. The forget gate discards some information from the previous state and retains only the information that may be valuable for later states. Because not all of the current text information is useful, the input gate controls how much information of the current state is admitted. The newly added feature memory controller is the counterpart of the input gate for structural information: because not all of the current structural information is useful, it controls how much structural information is admitted. The output gate controls how much of the memorized information represents the second text feature of the target word at the current position. The values of the forget gate and the output gate are determined by the input X_t of the target word, the structure feature g_t, and the hidden state h_{t-1} of the previous step. The resulting model can capture relations between distant words and effectively reduce noise, so that the linear features and the structural features of the text are better fused and the recognition accuracy is effectively improved.
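The following non-limiting sketch illustrates why a two-layer graph neural network over word dependencies lets a word receive information from words two dependency hops away. It is a generic, weight-free graph averaging layer, not the network of the embodiment: learned weight matrices and dependency relationship type vectors are omitted, the dependency matrix is assumed to be symmetric with self-loops, and the example sentence, edge list and function names are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def dependency_matrix(n: int, edges: List[Tuple[int, int]]) -> Matrix:
    """Word dependency matrix with self-loops; edges are (head, dependent) indices."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for head, dep in edges:
        a[head][dep] = 1.0
        a[dep][head] = 1.0  # assumption: treat dependencies as undirected
    return a


def graph_layer(a: Matrix, feats: Matrix) -> Matrix:
    """One layer: average the features of each word's neighbours, then apply ReLU."""
    dim = len(feats[0])
    out = []
    for row in a:
        degree = sum(row)
        mixed = [sum(row[j] * feats[j][k] for j in range(len(feats))) / degree
                 for k in range(dim)]
        out.append([max(0.0, v) for v in mixed])
    return out


def structure_features(a: Matrix, feats: Matrix) -> Matrix:
    """Two stacked layers, so each word sees words up to two hops away."""
    return graph_layer(a, graph_layer(a, feats))


# "Alice visited Paris": the verb (index 1) heads both other words.
a = dependency_matrix(3, [(1, 0), (1, 2)])
one_hot = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
g = structure_features(a, one_hot)
print(g[0])  # structure feature of "Alice"
```

After one layer the feature of "Alice" contains nothing from "Paris"; after the second layer it does, because the two words are linked through the verb.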
Step S107: determining the named entity type of the target word according to at least the second text characteristic of the target word.
After the meaning of the target word in the text to be processed (that is, the second text feature) has been determined, a classifier included in the named entity recognition model can determine the named entity type of the target word at least according to the second text feature of the target word.
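The embodiment only states that a classifier is used. As one possible illustration, the sketch below assumes a linear layer followed by a softmax over entity types; other classifiers (for example a conditional random field over the whole sentence) would also fit the description. The label set, weights and function names are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Turn arbitrary scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(h_t: List[float], weights: List[List[float]],
             biases: List[float], labels: List[str]) -> Tuple[str, float]:
    """Return the most probable entity type for one word and its probability."""
    scores = [sum(w * h for w, h in zip(row, h_t)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]


labels = ["O", "PER", "LOC"]                     # hypothetical label set
weights = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]   # one row of weights per label
biases = [0.0, 0.0, 0.0]
label, prob = classify([2.0, 0.5], weights, biases, labels)
print(label, round(prob, 3))
```

Here h_t plays the role of the second text feature of the target word, and the returned label is the named entity type assigned to that word.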
As can be seen from the foregoing embodiments, in the named entity recognition method provided in the embodiments of the present application, a first text feature and a first text structure feature of a target word in a text to be processed are determined; determining a second text structure characteristic of the target word entering a memory state at least according to the first text structure characteristic of the target word and the second text structure characteristic of the previous word entering the memory state; determining a second text characteristic of the target word at least according to the first text characteristic of the target word and the second text structure characteristic of the target word entering a memory state; determining the named entity type of the target word at least according to the second text characteristic of the target word; by the processing mode, two characteristics of text linear characteristics and text structural characteristics are better fused; therefore, the recognition accuracy can be effectively improved.
Second embodiment
In the above embodiment, a named entity recognition method is provided; correspondingly, the present application also provides a named entity recognition apparatus. The apparatus corresponds to the embodiment of the method described above. Parts of this embodiment that are the same as the first embodiment are not described again; please refer to the corresponding parts of the first embodiment.
The application provides a named entity recognition device, including:
the first feature determination unit is used for determining a first text feature and a first text structure feature of a target word in a text to be processed;
the text structure characteristic control unit is used for determining a second text structure characteristic of the target word entering the memory state at least according to the first text structure characteristic of the target word and the second text structure characteristic of the previous word entering the memory state;
the second characteristic determining unit is used for determining a second text characteristic of the target word at least according to the first text characteristic and the second text structure characteristic of the target word;
and the classification unit is used for determining the named entity type of the target word at least according to the second text characteristic of the target word.
Third embodiment
The application also provides an electronic device. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
An electronic device of the present embodiment includes: a processor and a memory; a memory for storing a program for implementing the named entity recognition method, the device performing the following steps after being powered on and running the program of the method by said processor: determining a first text characteristic and a first text structure characteristic of a target word in a text to be processed; determining a second text structure characteristic of the target word entering a memory state at least according to the first text structure characteristic of the target word and the second text structure characteristic of the previous word entering the memory state; determining a second text characteristic of the target word at least according to the first text characteristic of the target word and the second text structure characteristic of the target word entering a memory state; and determining the named entity type of the target word according to at least the second text characteristic of the target word.
Fourth embodiment
In the above embodiment, a named entity recognition method is provided; correspondingly, the present application also provides a named entity recognition model construction method. The execution subject of the method includes, but is not limited to, a server, and may be any device capable of implementing the method. The method corresponds to the embodiment of the method described above. Parts of this embodiment that are the same as the first embodiment are not described again; please refer to the corresponding parts of the first embodiment.
In this embodiment, the method may include the steps of:
Step 1: acquiring a corresponding relation set between the text and the named entity labeling data.
The text comprises sentences, and the set of corresponding relationships serves as a training data set of the model.
Step 2: constructing a network structure of a named entity recognition model. The model comprises a structure information extraction module and a text feature determination module, wherein the text feature determination module comprises a feature memory submodule and an output submodule. The structure information extraction module is used for determining a first text structure feature of the target word; the feature memory submodule is used for determining a second text structure feature of the target word entering the memory state at least according to the first text structure feature of the target word and the second text structure feature of the previous word entering the memory state; and the output submodule determines the second text feature of the target word at least according to the first text feature and the second text structure feature of the target word.
The network structure of the model corresponds to the model in the first embodiment, and details are described in the related description of the first embodiment, and are not repeated here.
Step 3: learning to obtain a named entity recognition model according to the corresponding relation set.
Training the model parameters according to the training data belongs to the mature prior art, and therefore, the details are not repeated here.
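As a hedged illustration of the corresponding relation set used as training data, the sketch below pairs each sentence with one named entity label per word. The BIO-style labels, the example sentences and the helper names are assumptions made for illustration; the embodiment does not prescribe a labeling scheme.

```python
from typing import List, Tuple

# One training example: the words of a sentence and one label per word.
Example = Tuple[List[str], List[str]]

training_set: List[Example] = [
    (["Alice", "visited", "Paris"], ["B-PER", "O", "B-LOC"]),
    (["Mary", "Smith", "works", "in", "Berlin"],
     ["B-PER", "I-PER", "O", "O", "B-LOC"]),
]


def validate(dataset: List[Example]) -> None:
    """Raise ValueError if any sentence has a different number of words and labels."""
    for words, tags in dataset:
        if len(words) != len(tags):
            raise ValueError(f"{len(words)} words but {len(tags)} labels: {words}")


def label_inventory(dataset: List[Example]) -> List[str]:
    """Sorted list of the distinct labels; fixes the output size of the classifier."""
    return sorted({tag for _, tags in dataset for tag in tags})


validate(training_set)
print(label_inventory(training_set))
```

A model built in step 2 would then be fitted to such pairs in step 3 by any conventional supervised training procedure.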
As can be seen from the foregoing embodiments, in the method for constructing a named entity recognition model provided in the embodiments of the present application, a corresponding relation set between a text and named entity labeling data is obtained; a network structure of a named entity recognition model is constructed, the model comprising a structure information extraction module and a text feature determination module, wherein the text feature determination module comprises a feature memory submodule and an output submodule; the structure information extraction module is used for determining a first text structure feature of the target word; the feature memory submodule is used for determining a second text structure feature of the target word entering the memory state at least according to the first text structure feature of the target word and the second text structure feature of the previous word entering the memory state; the output submodule determines a second text feature of the target word at least according to the first text feature and the second text structure feature of the target word; and a named entity recognition model is learned according to the corresponding relation set. This processing mode better fuses the two kinds of features, the text linear feature and the text structure feature, which strengthens the prediction capability of the model and can therefore effectively improve the model accuracy.
Fifth embodiment
In the foregoing embodiment, a named entity recognition model construction method is provided; correspondingly, the present application also provides a named entity recognition model construction device. The device corresponds to the embodiment of the method described above. Parts of this embodiment that are the same as the fourth embodiment are not described again; please refer to the corresponding parts of the fourth embodiment.
The application provides a named entity recognition model construction device, including:
the training data acquisition unit is used for acquiring a corresponding relation set between the text and the named entity labeling data;
the network construction unit is used for constructing a network structure of the named entity recognition model; the model comprises a structure information extraction module and a text feature determination module, wherein the text feature determination module comprises a feature memory submodule and an output submodule; the structure information extraction module is used for determining a first text structure feature of the target word; the feature memory submodule is used for determining a second text structure feature of the target word entering the memory state at least according to the first text structure feature of the target word and the second text structure feature of the previous word entering the memory state; the output submodule determines a second text feature of the target word at least according to the first text feature and the second text structure feature of the target word;
and the training unit is used for learning to obtain a named entity recognition model according to the corresponding relation set.
Sixth embodiment
The application also provides an electronic device. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
An electronic device of the present embodiment includes: a processor and a memory; the memory is used for storing a program for implementing the named entity recognition model construction method, and the device performs the following steps after being powered on and running the program of the method through the processor: acquiring a corresponding relation set between the text and the named entity labeling data; constructing a network structure of a named entity recognition model, the model comprising a structure information extraction module and a text feature determination module, wherein the text feature determination module comprises a feature memory submodule and an output submodule; the structure information extraction module is used for determining a first text structure feature of the target word; the feature memory submodule is used for determining a second text structure feature of the target word entering the memory state at least according to the first text structure feature of the target word and the second text structure feature of the previous word entering the memory state; the output submodule determines a second text feature of the target word at least according to the first text feature and the second text structure feature of the target word; and learning to obtain a named entity recognition model according to the corresponding relation set.
Although the present application has been described with reference to the preferred embodiments, it is not intended to limit the present application, and those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application, therefore, the scope of the present application should be determined by the claims that follow.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
1. Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
2. As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Claims (11)

1. A named entity recognition method, comprising:
determining a first text characteristic and a first text structure characteristic of a target word in a text to be processed;
determining a second text structure characteristic of the target word entering a memory state at least according to the first text structure characteristic of the target word and the second text structure characteristic of the previous word entering the memory state;
determining a second text characteristic of the target word according to at least the first text characteristic and the second text structure characteristic of the target word;
and determining the named entity type of the target word according to at least the second text characteristic of the target word.
2. The method of claim 1,
determining a first text structure characteristic of a target word through a structure information extraction module included in a named entity recognition model;
determining a second text structure characteristic of the target word entering a memory state at least according to a first text structure characteristic of the target word and a second text structure characteristic of the previous word entering the memory state through a characteristic memory sub-module included in a text characteristic determination module in the named entity recognition model;
determining a second text characteristic of the target word at least according to the first text characteristic and the second text structure characteristic of the target word through an output sub-module included by the text characteristic determining module;
and determining the named entity type of the target word at least according to the second text characteristic of the target word through a classifier included in the named entity recognition model.
3. The method of claim 2, further comprising:
determining a third text characteristic of the target word entering a memory state at least according to the second text characteristic of the fused text linear information and the text structure information of the previous word of the target word, the first text structure characteristic of the target word and the third text characteristic of the previous word entering the memory state through the characteristic memory sub-module, wherein the third text characteristic comprises the second text structure characteristic of the previous word entering the memory state;
and determining the second text characteristic of the target word through the output sub-module at least according to the first text characteristic of the target word, the second text characteristic of the previous word, the first text structure characteristic of the target word and the third text characteristic of the target word entering a memory state.
4. The method of claim 3, further comprising:
the feature memory submodule determines a second text structure feature of the target word entering a memory state at least according to the second text feature of the fused text linear information and the text structure information of the previous word of the target word and the first text structure feature of the target word through a feature memory controller;
and the feature memory submodule determines a third text feature of the target word entering the memory state at least according to the second text structure feature of the target word entering the memory state and the third text feature of the previous word entering the memory state through the feature memory.
5. The method of claim 3, further comprising:
the output sub-module determines a fourth text characteristic at least according to the first text characteristic and the first text structure characteristic of the target word and the second text characteristic of the previous word through the first output controller;
and the output sub-module determines a second text characteristic of the target word through a second output controller at least according to the fourth text characteristic and the third text characteristic of the target word entering the memory state.
6. The method of claim 2, wherein determining the first text structural feature of the target word by a structural information extraction module included in the named entity recognition model comprises:
determining word dependency relationships of the text, wherein the word dependency relationships comprise dependency relationship types;
determining a word dependency relationship matrix according to the word dependency relationship;
determining a third text structure characteristic of the target word according to the matrix, the word vector of the target word and the dependency relationship type vector;
and determining the first text structure characteristic of the target word according to the third text structure characteristic of the target word through a structure information extraction module based on the multilayer graph neural network.
7. The method according to claim 6, wherein determining the first text structure feature of the target word according to the third text structure feature of the target word by the structural information extraction module based on the multi-layer graph neural network comprises:
determining a fourth text structure characteristic of the target word according to the third text structure characteristic through the first layer graph neural network included by the structure information extraction module;
and determining the first text structure characteristic of the target word according to the fourth text structure characteristic through a second-layer graph neural network included by the structure information extraction module.
8. The method of claim 3, further comprising:
and determining the characteristic of the previous word that is forgotten according to the second text characteristic of the previous word, the first text characteristic of the target word and the first text structure characteristic through a characteristic forgetting submodule included in the named entity recognition model.
9. A named entity recognition model construction method is characterized by comprising the following steps:
acquiring a corresponding relation set between the text and the named entity labeling data;
constructing a network structure of a named entity recognition model; the model comprises a structure information extraction module and a text feature determination module, wherein the text feature determination module comprises a characteristic memory submodule and an output submodule; the structural information extraction module is used for determining a first text structural feature of the target word; the feature memory submodule is used for determining a second text structure feature of the target word entering the memory state at least according to the first text structure feature of the target word and the second text structure feature of the previous word entering the memory state; the output sub-module determines a second text characteristic of the target word at least according to the first text characteristic and the second text structure characteristic of the target word;
and learning to obtain a named entity recognition model according to the corresponding relation set.
10. A named entity recognition apparatus, comprising:
the first feature determination unit is used for determining a first text feature and a first text structure feature of a target word in a text to be processed;
the text structure characteristic control unit is used for determining a second text structure characteristic of the target word entering the memory state at least according to the first text structure characteristic of the target word and the second text structure characteristic of the previous word entering the memory state;
the second characteristic determining unit is used for determining a second text characteristic of the target word at least according to the first text characteristic and the second text structure characteristic of the target word;
and the classification unit is used for determining the named entity type of the target word at least according to the second text characteristic of the target word.
11. A named entity recognition model building apparatus, comprising:
the training data acquisition unit is used for acquiring a corresponding relation set between the text and the named entity labeling data;
the network construction unit is used for constructing a network structure of the named entity recognition model; the model comprises a structure information extraction module and a text feature determination module, wherein the text feature determination module comprises a characteristic memory submodule and an output submodule; the structural information extraction module is used for determining a first text structural feature of the target word; the feature memory submodule is used for determining a second text structure feature of the target word entering the memory state at least according to the first text structure feature of the target word and the second text structure feature of the previous word entering the memory state; the output sub-module determines a second text characteristic of the target word at least according to the first text characteristic and the second text structure characteristic of the target word;
and the training unit is used for learning to obtain a named entity recognition model according to the corresponding relation set.
CN202011081230.7A 2020-10-09 2020-10-09 Named entity identification method, device and equipment Pending CN114330342A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011081230.7A CN114330342A (en) 2020-10-09 2020-10-09 Named entity identification method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011081230.7A CN114330342A (en) 2020-10-09 2020-10-09 Named entity identification method, device and equipment

Publications (1)

Publication Number Publication Date
CN114330342A true CN114330342A (en) 2022-04-12

Family

ID=81032180

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011081230.7A Pending CN114330342A (en) 2020-10-09 2020-10-09 Named entity identification method, device and equipment

Country Status (1)

Country Link
CN (1) CN114330342A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination