CN115114924A - Named entity recognition method, device, computing equipment and storage medium - Google Patents

Named entity recognition method, device, computing equipment and storage medium

Info

Publication number
CN115114924A
CN115114924A (application CN202210691397.8A)
Authority
CN
China
Prior art keywords
model
entity
word
named entity
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210691397.8A
Other languages
Chinese (zh)
Inventor
唐光远
罗琴
李润静
张俊杰
熊琼
陈海波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gree Electric Appliances Inc of Zhuhai
Zhuhai Lianyun Technology Co Ltd
Original Assignee
Gree Electric Appliances Inc of Zhuhai
Zhuhai Lianyun Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gree Electric Appliances Inc of Zhuhai, Zhuhai Lianyun Technology Co Ltd filed Critical Gree Electric Appliances Inc of Zhuhai
Priority to CN202210691397.8A priority Critical patent/CN115114924A/en
Publication of CN115114924A publication Critical patent/CN115114924A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a named entity recognition method, a named entity recognition apparatus, a computing device and a storage medium. The method comprises the following steps: segmenting the text in target data information into words using an ELMo model in a trained named entity recognition model, and semantically encoding each of the resulting words to obtain a word vector for each word; for each word, extracting an entity feature sequence of the word from its word vector using a BiLSTM model in the trained named entity recognition model; learning the weight of each entity feature in the entity feature sequence using a SENet model in the trained named entity recognition model; and identifying the entities of the words in the target data information using a CRF model in the trained named entity recognition model, based on the entity feature sequences of the words and the weights of their entity features. The accuracy of named entity recognition is thereby improved.

Description

Named entity recognition method, device, computing equipment and storage medium
Technical Field
The present invention relates to the field of natural language processing technologies, and in particular to a named entity recognition method, apparatus, computing device and storage medium.
Background
Named entity recognition is a fundamental and important task in information extraction and natural language processing. In recent years, with the development of artificial intelligence, named entity recognition has found widespread application in daily life. For example, when a person wants to obtain the important information relevant to them from a piece of news or other text, named entity recognition technology can quickly retrieve the key information from that text. Named entity recognition automatically identifies entity information such as person names, organization names, place names and times. It underlies technologies such as event and relation extraction, is essential for acquiring semantic knowledge from text, and is significant for extracting information from unstructured data.
At present, Chinese named entity recognition performs worse than English named entity recognition, because recognizing entities in Chinese is considerably more difficult than in English. Early approaches were mainly rule-based and statistical, relying chiefly on lexical, syntactic and semantic rule templates set manually by linguists, and their recognition performance was not ideal.
Therefore, a new named entity recognition method is needed to improve the accuracy of Chinese named entity recognition.
Disclosure of Invention
The main purpose of the invention is to provide a named entity recognition method, apparatus, computing device and storage medium, so as to accurately recognize named entities in the text of data information.
The invention provides a named entity recognition method, which comprises the following steps: segmenting the text in target data information into words using an ELMo model in a trained named entity recognition model, and semantically encoding each of the resulting words to obtain a word vector for each word; for each word, extracting an entity feature sequence of the word from its word vector using a BiLSTM model in the trained named entity recognition model; learning the weight of each entity feature in the entity feature sequence using a SENet model in the trained named entity recognition model; and identifying the entities of the words in the target data information using a CRF model in the trained named entity recognition model, based on the entity feature sequences of the words and the weights of their entity features.
In one embodiment, the method further comprises training the named entity recognition model with a named entity recognition sample set, which comprises: training the ELMo model with a semantic coding sample set, wherein the semantic coding sample set comprises a plurality of semantic coding samples, and each semantic coding sample comprises sample data information and the word vectors of a plurality of words contained in its text.
In one embodiment, training the named entity recognition model with the named entity recognition sample set comprises: training the BiLSTM model with a feature extraction sample set, wherein the feature extraction sample set comprises a plurality of feature extraction samples, and each feature extraction sample comprises the word vector of a word and the entity feature sequence of the word in the context of the sample data information.
In one embodiment, training the named entity recognition model with the named entity recognition sample set comprises: training the SENet model with a feature weight sample set, wherein the feature weight sample set comprises a plurality of feature weight samples, and each feature weight sample comprises the entity feature sequence of a word in the context of the sample data information and the weight of each entity feature.
In one embodiment, training the named entity recognition model with the named entity recognition sample set comprises: training the CRF model with an entity screening sample set, wherein the entity screening sample set comprises a plurality of entity screening samples, and each entity screening sample comprises the entity feature sequence of a word in the context of the sample data information, the weight of each entity feature in that sequence, and the entities of a plurality of words in the sample data information.
In one embodiment, segmenting the text in the target data information with the ELMo model in the trained named entity recognition model and semantically encoding the resulting words to obtain a word vector for each word comprises: segmenting the text in the target data information into words with the ELMo model, semantically encoding each of the resulting words to obtain an initial word vector for each word, and training a dynamic semantic vector of each word's context to obtain the word vector of each word.
In one embodiment, identifying the entities of the words in the target data information with the CRF model in the trained named entity recognition model, based on the entity feature sequences of the words and the weights of their entity features, comprises: obtaining, with the CRF model, the joint probabilities of the word sequences in the target data information, and taking the entities of the words corresponding to the maximum joint probability as the entities of the words in the target data information.
The invention provides a named entity recognition apparatus, comprising: an ELMo module, configured to segment the text in target data information into words using an ELMo model in a trained named entity recognition model, and to semantically encode each of the resulting words to obtain a word vector for each word; a BiLSTM module, configured to extract, for each word, the entity feature sequence of the word from its word vector using a BiLSTM model in the trained named entity recognition model; a SENet module, configured to learn the weight of each entity feature in the entity feature sequence using a SENet model in the trained named entity recognition model; and a CRF module, configured to identify the entities of the words in the target data information using a CRF model in the trained named entity recognition model, based on the entity feature sequences of the words and the weights of their entity features.
The invention provides a computing device comprising a processor and a memory, the memory storing a computer program which, when executed by the processor, carries out the steps of the named entity recognition method as described above.
The present invention provides a storage medium storing a computer program which, when executed by a processor, carries out the steps of the named entity recognition method as described above.
The method provided by the invention combines a SENet model with the BiLSTM model, which improves the model's overall ability to represent words and thereby improves the accuracy of named entity recognition.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention rather than to limit it.
In the figure:
FIG. 1 is a flow diagram of a named entity recognition method according to an exemplary embodiment of the present application;
FIG. 2 is a schematic diagram of a named entity recognition apparatus according to an embodiment of the present application;
FIG. 3 is a diagram of a SENet model according to an embodiment of the present application.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
In the related art, machine learning methods have also been applied to named entity recognition, such as Conditional Random Fields (CRF), Support Vector Machines (SVM) and hidden Markov models; the present invention is an improvement on the BiLSTM deep learning approach.
Example One
This embodiment provides a named entity recognition method. FIG. 1 is a flowchart of a named entity recognition method according to an exemplary embodiment of the present application. As shown in FIG. 1, the method of this embodiment may comprise the following steps:
S100: the text in the target data information is segmented into words using the ELMo (Embeddings from Language Models) model in the trained named entity recognition model, and each of the resulting words is semantically encoded to obtain a word vector for each word.
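As an illustrative sketch only (not the patent's implementation), S100 could be wired up roughly as follows in Python; the jieba segmenter and the elmo_encode placeholder are assumptions of this sketch, standing in for whatever segmenter and ELMo encoder the patent's model uses.

    # Hypothetical sketch: segment Chinese text and produce one vector per word.
    # jieba provides the segmentation; elmo_encode is a placeholder that a real
    # contextual ELMo encoder (which sees the whole word sequence) would replace.
    import numpy as np
    import jieba  # third-party Chinese segmenter, assumed available

    def elmo_encode(words, dim=128):
        # Placeholder: deterministic pseudo-random vectors, one per word.
        rng = np.random.default_rng(abs(hash(tuple(words))) % (2 ** 32))
        return rng.normal(size=(len(words), dim)).astype("float32")

    text = "珠海格力电器股份有限公司发布了新产品"
    words = jieba.lcut(text)            # list of segmented words
    word_vectors = elmo_encode(words)   # shape: (num_words, dim)
    print(words, word_vectors.shape)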
S200: for each word, the entity feature sequence of the word is extracted from its word vector using the BiLSTM (Bi-directional Long Short-Term Memory) model in the trained named entity recognition model.
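A minimal PyTorch sketch of S200, under the assumption that the entity feature sequence is simply the BiLSTM's hidden states; the class name and dimensions are illustrative, not the patent's.

    # Minimal BiLSTM feature extractor (illustrative; dimensions are assumptions).
    import torch
    import torch.nn as nn

    class BiLSTMFeatures(nn.Module):
        def __init__(self, emb_dim=128, hidden_dim=64):
            super().__init__()
            # bidirectional=True gives forward and backward context per time step
            self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                                bidirectional=True)

        def forward(self, word_vectors):
            # word_vectors: (batch, seq_len, emb_dim)
            features, _ = self.lstm(word_vectors)
            return features  # (batch, seq_len, 2 * hidden_dim) entity features

    feats = BiLSTMFeatures()(torch.randn(1, 7, 128))
    print(feats.shape)  # torch.Size([1, 7, 128])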
S300: the weight of each entity feature in the entity feature sequence is learned using the SENet (Squeeze-and-Excitation Networks) model in the trained named entity recognition model.
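A minimal squeeze-and-excitation sketch matching the structure described for FIG. 3 (global pooling, two fully connected layers, Sigmoid); applying it across the BiLSTM feature channels, and the reduction ratio used, are assumptions of this sketch.

    # Minimal squeeze-and-excitation block over the feature (channel) dimension.
    import torch
    import torch.nn as nn

    class SEBlock(nn.Module):
        def __init__(self, channels=128, reduction=4):
            super().__init__()
            self.fc1 = nn.Linear(channels, channels // reduction)
            self.fc2 = nn.Linear(channels // reduction, channels)

        def forward(self, features):
            # features: (batch, seq_len, channels) from the BiLSTM
            squeezed = features.mean(dim=1)                  # global pooling over time
            weights = torch.sigmoid(self.fc2(torch.relu(self.fc1(squeezed))))
            return features * weights.unsqueeze(1), weights  # recalibrated features

    out, w = SEBlock()(torch.randn(1, 7, 128))
    print(out.shape, w.shape)  # (1, 7, 128) and (1, 128)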
S400: the entities of the words in the target data information are identified using the CRF (Conditional Random Field) model in the trained named entity recognition model, based on the entity feature sequences of the words and the weights of their entity features.
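As a toy illustration of the decoding in S400, the following Viterbi sketch finds the highest-scoring tag path; the emission scores, transition scores and tag set are made up for illustration and do not come from the patent.

    # Toy Viterbi decoding over per-word entity scores (illustrative only).
    import numpy as np

    def viterbi(emissions, transitions):
        # emissions: (seq_len, num_tags) scores; transitions: (num_tags, num_tags)
        seq_len, num_tags = emissions.shape
        score = emissions[0].copy()
        back = np.zeros((seq_len, num_tags), dtype=int)
        for t in range(1, seq_len):
            total = score[:, None] + transitions + emissions[t][None, :]
            back[t] = total.argmax(axis=0)
            score = total.max(axis=0)
        best = [int(score.argmax())]
        for t in range(seq_len - 1, 0, -1):
            best.append(int(back[t, best[-1]]))
        return best[::-1]

    tags = ["O", "B-ORG", "I-ORG"]
    emissions = np.array([[0.1, 2.0, 0.0], [0.2, 0.1, 1.5], [2.0, 0.1, 0.3]])
    transitions = np.zeros((3, 3))
    transitions[0, 2] = -5.0  # discourage an O -> I-ORG transition
    print([tags[i] for i in viterbi(emissions, transitions)])  # B-ORG I-ORG O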
Through the above steps, the SENet model learns the weight of each entity feature of the words in the text of the target data information, so the weights of important entity features can be increased and the weights of unimportant entity features reduced. This strengthens the representation of the words' entity features and improves the accuracy of named entity recognition.
The entity features in the entity feature sequence may be the possible named entities of a word or, in other cases, the features of a named entity of the word; those skilled in the art may choose as needed.
In one example, the method of this embodiment may further comprise training the named entity recognition model with a named entity recognition sample set, which comprises: training the ELMo model with a semantic coding sample set, wherein the semantic coding sample set comprises a plurality of semantic coding samples, and each semantic coding sample comprises sample data information and the word vectors of a plurality of words contained in its text.
In one example, training the named entity recognition model with the named entity recognition sample set may comprise: training the BiLSTM model with a feature extraction sample set, wherein the feature extraction sample set comprises a plurality of feature extraction samples, and each feature extraction sample comprises the word vector of a word and the entity feature sequence of the word in the context of the sample data information.
In one example, training the named entity recognition model with the named entity recognition sample set may comprise: training the SENet model with a feature weight sample set, wherein the feature weight sample set comprises a plurality of feature weight samples, and each feature weight sample comprises the entity feature sequence of a word in the context of the sample data information and the weight of each entity feature.
In one example, training the named entity recognition model with the named entity recognition sample set may comprise: training the CRF model with an entity screening sample set, wherein the entity screening sample set comprises a plurality of entity screening samples, and each entity screening sample comprises the entity feature sequence of a word in the context of the sample data information, the weight of each entity feature in that sequence, and the entities of a plurality of words in the sample data information.
In one example, segmenting the text in the target data information with the ELMo model in the trained named entity recognition model and semantically encoding the resulting words to obtain a word vector for each word may comprise: segmenting the text in the target data information into words with the ELMo model, semantically encoding each of the resulting words to obtain an initial word vector for each word, and training a dynamic semantic vector of each word's context to obtain the word vector of each word.
In one example, identifying the entities of the words in the target data information with the CRF model in the trained named entity recognition model, based on the entity feature sequences of the words and the weights of their entity features, may comprise: obtaining, with the CRF model, the joint probabilities of the word sequences in the target data information, and taking the entities of the words corresponding to the maximum joint probability as the entities of the words in the target data information.
Before a model is trained with its sample set, the sample set may be preprocessed; for example, useless or low-value information such as punctuation and spaces in the sample data information may be cleaned away. Those skilled in the art may choose the specific preprocessing as needed.
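As a small illustration of such preprocessing (the exact characters removed here are an arbitrary assumption, not the patent's list), a cleaning step might look like:

    # Illustrative cleaning: strip whitespace and a few punctuation marks.
    import re

    def clean(text):
        return re.sub(r"[\s，。！？、,.!?]", "", text)

    print(clean("格力电器， 成立于 1991 年。"))  # -> 格力电器成立于1991年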
In the named entity recognition method of this embodiment, the semantic representation capability of the ELMo model is first used to semantically encode the words in the text and obtain their word vectors; the word vectors are then fed into the BiLSTM model to extract the entity feature sequence of each word; SENet is used to compute the weight of each entity feature in each word's entity feature sequence; and finally the named entities of the words in the text are output by the CRF model. Combining a SENet model with the BiLSTM model improves the model's overall ability to represent words and thereby improves the accuracy of named entity recognition.
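To show how the pieces above could fit together, the following is a skeletal end-to-end wiring in PyTorch; the module names, dimensions, and the choice of emitting per-word tag scores for a separate CRF decoder are assumptions of this sketch, not the patent's implementation.

    # Skeletal end-to-end wiring (illustrative): ELMo-style embeddings are assumed
    # to be given; the model outputs per-word tag scores that a CRF would decode.
    import torch
    import torch.nn as nn

    class NerModel(nn.Module):
        def __init__(self, emb_dim=128, hidden_dim=64, num_tags=5, reduction=4):
            super().__init__()
            self.bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                                  bidirectional=True)
            channels = 2 * hidden_dim
            self.se = nn.Sequential(nn.Linear(channels, channels // reduction),
                                    nn.ReLU(),
                                    nn.Linear(channels // reduction, channels),
                                    nn.Sigmoid())
            self.emit = nn.Linear(channels, num_tags)  # emission scores for a CRF

        def forward(self, word_vectors):
            feats, _ = self.bilstm(word_vectors)       # (batch, seq, channels)
            weights = self.se(feats.mean(dim=1))       # channel weights via SE
            return self.emit(feats * weights.unsqueeze(1))

    scores = NerModel()(torch.randn(1, 7, 128))
    print(scores.shape)  # (1, 7, 5), to be decoded by a CRF (e.g. Viterbi)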
Example Two
FIG. 2 is a schematic structural diagram of a named entity recognition apparatus according to an embodiment of the present disclosure. As shown in FIG. 2, the apparatus 10 of this embodiment may comprise: an ELMo module 101, configured to segment the text in the target data information into words using the ELMo model in the trained named entity recognition model, and to semantically encode each of the resulting words to obtain a word vector for each word; a BiLSTM module 102, configured to extract, for each word, the entity feature sequence of the word from its word vector using the BiLSTM model in the trained named entity recognition model; a SENet module 103, configured to learn the weight of each entity feature in the entity feature sequence using the SENet model in the trained named entity recognition model; and a CRF module 104, configured to identify the entities of the words in the target data information using the CRF model in the trained named entity recognition model, based on the entity feature sequences of the words and the weights of their entity features.
In another embodiment, the named entity recognition apparatus may further comprise a processor and a memory, the processor executing the following program modules stored in the memory: the ELMo module 101, the BiLSTM module 102, the SENet module 103 and the CRF module 104, so as to accurately recognize the named entities in the text.
Example Three
This embodiment describes the named entity recognition method using industrial document data as an example. The method mainly comprises the following steps:
First, the initial models in the named entity recognition apparatus are trained with a named entity recognition sample data set, so that the trained ELMo, BiLSTM, SENet and CRF models all reach the preset accuracy requirements and work well together.
Second, the industrial document data is preprocessed, for example by data cleaning and entity labeling. Data cleaning removes, for example, unnecessary punctuation marks and spaces.
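For illustration only, the entity labeling could follow the common BIO tagging scheme; the sentence, entity types and tags below are invented examples, not data from the patent.

    # Hypothetical BIO-labeled sample: (word, tag) pairs for one sentence.
    sample = [("珠海", "B-LOC"), ("格力电器", "B-ORG"),
              ("2022年", "B-TIME"), ("发布", "O"), ("新产品", "O")]
    for word, tag in sample:
        print(word, tag)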
Third, the preprocessed industrial document data is input into the ELMo model. The ELMo model segments the text in the industrial document data into words and semantically encodes the words to obtain their word vectors. Because ELMo is a pre-trained language model, it can train dynamic semantic vectors of each word's context, so that words are represented as word vectors that take their context into account, which alleviates the problem of polysemy to some extent.
Fourth, the word vectors are input into the BiLSTM model. The BiLSTM model makes full use of context information to extract the entity feature sequences of the words from their word vectors, and a continuous representation of the context information is obtained.
Fifth, the SENet model integrated with the BiLSTM model learns the relationships among the feature channels of the BiLSTM model when the entity features are extracted, explicitly models the interdependence among the feature channels, recalibrates the feature channels, and learns the contribution of each feature channel to the whole (as its weight). This increases the weights of important features, reduces the weights of unimportant features, improves the interdependence among channels, and thus improves model performance and strengthens the representation capability of the network model.
FIG. 3 is a diagram of the SENet model according to an embodiment of the present application. The SENet model of this embodiment adds a global pooling layer, two fully connected (FC) layers and a Sigmoid normalization layer between the input and the output.
The SENet model acts much like an attention mechanism: it focuses on the weight vectors of the important words, and attending to the weights of the important words improves the named entity recognition result, so the SENet model can fully capture the weights of the important words and make full use of the information carried by the input.
Sixth, the entity feature sequence of each word and the weight of each entity feature in the sequence are input into the CRF model. The CRF obtains the probability sequences of all entity features corresponding to all words in the text, which form a probability matrix for the text, and the named entity corresponding to the maximum probability is taken as the named entity of each word. The CRF model combines the characteristics of the maximum entropy model and the hidden Markov model, and can obtain the maximum joint probability over the word sequence of the industrial document data, so as to determine the named entities of the words in the industrial document data.
Example Four
The present embodiment provides a computing device comprising a processor and a memory, the memory having stored therein a computer program which, when executed by the processor, carries out the steps of the named entity recognition method as described above.
In one embodiment, the computing device may include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Example Five
The present embodiment provides a storage medium storing a computer program which, when executed by a processor, performs the steps of the named entity recognition method as described above.
Any combination of one or more storage media may be employed. The storage medium may be a readable signal medium or a readable storage medium.
A readable storage medium may include, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the above. More specific examples (a non-exhaustive list) of the readable storage medium may include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Readable signal media may include a propagated data signal with a readable computer program embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, and may include, for example, an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A readable signal medium may also be any storage medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer program embodied on the storage medium may be transmitted using any appropriate medium, including by way of example, wirelessly, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
A computer program for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The computer program may execute entirely on the user's computing device, partly on the user's device, or entirely on a remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device over any kind of network (for example, a local area network or a wide area network), or may be connected to an external computing device (for example, over the internet using an internet service provider).
It is noted that the terms used herein merely describe particular embodiments and are not intended to limit the exemplary embodiments of the present application. As used in this specification, the terms "include" and/or "comprise" specify the presence of features, steps, operations, devices, components and/or combinations thereof.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances.
It should be understood that the exemplary embodiments herein may be embodied in many different forms and should not be construed as limited to only the embodiments set forth herein. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions. These embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of these exemplary embodiments to those skilled in the art, and should not be construed as limiting the present invention.
While the spirit and principles of the invention have been described with reference to several particular embodiments, it is to be understood that the invention is not limited to the disclosed embodiments. The division into aspects is for convenience only, and the features of those aspects may be combined to advantage. The invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A named entity recognition method, comprising:
segmenting the text in target data information into words using an ELMo model in a trained named entity recognition model, and semantically encoding each of the resulting words to obtain a word vector for each word;
for each word, extracting an entity feature sequence of the word from its word vector using a BiLSTM model in the trained named entity recognition model;
learning a weight of each entity feature in the entity feature sequence using a SENet model in the trained named entity recognition model; and
identifying entities of the words in the target data information using a CRF model in the trained named entity recognition model, based on the entity feature sequences of the words and the weights of their entity features.
2. The named entity recognition method of claim 1, wherein the method further comprises:
training the named entity recognition model using a named entity recognition sample set, comprising:
training the ELMo model with a semantic coding sample set, wherein the semantic coding sample set comprises a plurality of semantic coding samples, and each semantic coding sample comprises sample data information and word vectors of a plurality of words contained in its text.
3. The named entity recognition method of claim 2, wherein training the named entity recognition model with a named entity recognition sample set comprises:
training the BiLSTM model with a feature extraction sample set, wherein the feature extraction sample set comprises a plurality of feature extraction samples, and each feature extraction sample comprises the word vector of a word and the entity feature sequence of the word in the context of the sample data information.
4. The named entity recognition method of claim 2 or 3, wherein training the named entity recognition model with a named entity recognition sample set comprises:
training the SENet model with a feature weight sample set, wherein the feature weight sample set comprises a plurality of feature weight samples, and each feature weight sample comprises the entity feature sequence of a word in the context of the sample data information and the weight of each entity feature.
5. The named entity recognition method of any one of claims 2 to 4, wherein training a named entity recognition model using the named entity recognition sample set comprises:
training the CRF model with an entity screening sample set, wherein the entity screening sample set comprises a plurality of entity screening samples, and each entity screening sample comprises the entity feature sequence of a word in the context of the sample data information, the weight of each entity feature in that entity feature sequence, and the entities of a plurality of words in the sample data information.
6. The named entity recognition method according to claim 1, wherein segmenting the text in the target data information with the ELMo model in the trained named entity recognition model and semantically encoding the resulting words to obtain a word vector for each word comprises:
segmenting the text in the target data information into words with the ELMo model in the trained named entity recognition model, semantically encoding each of the resulting words to obtain an initial word vector for each word, and training a dynamic semantic vector of each word's context to obtain the word vector of each word.
7. The method of claim 1, wherein identifying the entities of the words in the target data information using a CRF model in a trained named entity recognition model based on the entity feature sequence of the words and the weight of each entity feature thereof comprises:
obtaining, with the CRF model in the trained named entity recognition model, the joint probabilities of the word sequences in the target data information based on the entity feature sequences of the words and the weights of their entity features, and taking the entities of the words corresponding to the maximum joint probability as the entities of the words in the target data information.
8. A named entity recognition apparatus, comprising:
an ELMo module, configured to segment the text in target data information into words using an ELMo model in a trained named entity recognition model, and to semantically encode each of the resulting words to obtain a word vector for each word;
a BiLSTM module, configured to extract, for each word, the entity feature sequence of the word from its word vector using a BiLSTM model in the trained named entity recognition model;
a SENet module, configured to learn the weight of each entity feature in the entity feature sequence using a SENet model in the trained named entity recognition model; and
a CRF module, configured to identify the entities of the words in the target data information using a CRF model in the trained named entity recognition model, based on the entity feature sequences of the words and the weights of their entity features.
9. A computing device, characterized in that it comprises a processor and a memory, in which a computer program is stored which, when being executed by the processor, carries out the steps of the named entity recognition method according to any one of claims 1 to 7.
10. A storage medium, characterized in that a computer program is stored which, when being executed by a processor, carries out the steps of the named entity recognition method according to any one of claims 1 to 7.
CN202210691397.8A 2022-06-17 2022-06-17 Named entity recognition method, device, computing equipment and storage medium Withdrawn CN115114924A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210691397.8A CN115114924A (en) 2022-06-17 2022-06-17 Named entity recognition method, device, computing equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210691397.8A CN115114924A (en) 2022-06-17 2022-06-17 Named entity recognition method, device, computing equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115114924A (en) 2022-09-27

Family

ID=83328840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210691397.8A Withdrawn CN115114924A (en) 2022-06-17 2022-06-17 Named entity recognition method, device, computing equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115114924A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287479A (en) * 2019-05-20 2019-09-27 平安科技(深圳)有限公司 Name entity recognition method, electronic device and storage medium
CN110427493A (en) * 2019-07-11 2019-11-08 新华三大数据技术有限公司 Electronic health record processing method, model training method and relevant apparatus
CN111274942A (en) * 2020-01-19 2020-06-12 国汽(北京)智能网联汽车研究院有限公司 Traffic cone identification method and device based on cascade network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
谢腾; 杨俊安; 刘辉: "Chinese Entity Recognition Based on the BERT-BiLSTM-CRF Model" (基于BERT-BiLSTM-CRF模型的中文实体识别), Computer Systems & Applications (计算机系统应用), no. 07, 15 July 2020 (2020-07-15) *
赵丽萍; 袁霄; 祝承; 赵晓琦; 杨仕虎; 梁平; 鲁小丫; 谭颖: "Research Progress on Residual Networks for Image Classification" (面向图像分类的残差网络进展研究), Computer Engineering and Applications (计算机工程与应用), no. 20, 31 December 2020 (2020-12-31) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112651245A (en) * 2020-12-28 2021-04-13 南京邮电大学 Sequence annotation model and sequence annotation method

Similar Documents

Publication Publication Date Title
Young et al. Recent trends in deep learning based natural language processing
CN110287278B (en) Comment generation method, comment generation device, server and storage medium
CN107679039B (en) Method and device for determining statement intention
Kim et al. Two-stage multi-intent detection for spoken language understanding
CN112131350B (en) Text label determining method, device, terminal and readable storage medium
US20180293228A1 (en) Device and method for converting dialect into standard language
CN112256860A (en) Semantic retrieval method, system, equipment and storage medium for customer service conversation content
US20180068221A1 (en) System and Method of Advising Human Verification of Machine-Annotated Ground Truth - High Entropy Focus
CN114580382A (en) Text error correction method and device
Mahajani et al. A comprehensive survey on extractive and abstractive techniques for text summarization
CN113128227A (en) Entity extraction method and device
Wang et al. Learning distributed word representations for bidirectional lstm recurrent neural network
CN113704460B (en) Text classification method and device, electronic equipment and storage medium
Arumugam et al. Hands-On Natural Language Processing with Python: A practical guide to applying deep learning architectures to your NLP applications
CN113128431B (en) Video clip retrieval method, device, medium and electronic equipment
Kuriyozov et al. Construction and evaluation of sentiment datasets for low-resource languages: The case of Uzbek
US20210004602A1 (en) Method and apparatus for determining (raw) video materials for news
Voloshyn et al. Sentiment analysis technology of English newspapers quotes based on neural network as public opinion influences identification tool
Zhao et al. Classification of natural language processing techniques for requirements engineering
CN111159405A (en) Irony detection method based on background knowledge
Lin et al. Multi-channel word embeddings for sentiment analysis
CN115114924A (en) Named entity recognition method, device, computing equipment and storage medium
Wu et al. Attention-based convolutional neural networks for chinese relation extraction
CN111666405B (en) Method and device for identifying text implication relationship
Harrat et al. Automatic identification methods on a corpus of twenty five fine-grained Arabic dialects

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220927