CN112417874A

CN112417874A - Named entity recognition method and device, storage medium and electronic device

Info

Publication number: CN112417874A
Application number: CN202011279896.3A
Authority: CN
Inventors: 唐光远; 李卓茜; 陈海波; 罗琴; 张俊杰; 李润静
Original assignee: Gree Electric Appliances Inc of Zhuhai; Zhuhai Lianyun Technology Co Ltd
Current assignee: Gree Electric Appliances Inc of Zhuhai; Zhuhai Lianyun Technology Co Ltd
Priority date: 2020-11-16
Filing date: 2020-11-16
Publication date: 2021-02-26

Abstract

The application discloses a named entity identification method and device, a storage medium and an electronic device. Wherein, the method comprises the following steps: acquiring a target text, wherein the target text is a text of a named entity to be extracted; and obtaining the named entity in the target text by utilizing the bidirectional semantic representation of the target text, and improving the recognition accuracy of the named entity by using a deep learning method. The method and the device solve the technical problem that the named entity in the related technology is low in extraction accuracy.

Description

Named entity recognition method and device, storage medium and electronic device

Technical Field

The application relates to the field of artificial intelligence, in particular to a named entity identification method and device, a storage medium and an electronic device.

Background

Named entity recognition is a fundamental and important task in the fields of information extraction and natural language processing. In the modern information society, people usually obtain needed and important information from a piece of news or characters, so that the named entity recognition technology is very important at this time, and can help people to quickly retrieve the needed information from the text. Generally, named entity recognition includes a person's name, an organization's name, a place name, a time, and so on. NER (named entity recognition) can extract the above entities from many unstructured texts, and it can also recognize a larger variety of entities, so the entities are defined here according to business needs.

At present, the recognition effect of the named entity in Chinese is not good compared with English, because Chinese recognition has certain difficulty compared with English.

In view of the above problems, no effective solution has been proposed.

Disclosure of Invention

The embodiment of the application provides a method and a device for identifying a named entity, a storage medium and an electronic device, which are used for at least solving the technical problem of low extraction accuracy of the named entity in the related technology.

According to an aspect of an embodiment of the present application, there is provided a method for identifying a named entity, including: acquiring a target text, wherein the target text is a text of a named entity to be extracted; and obtaining the named entity in the target text by utilizing the bidirectional semantic representation of the target text.

Optionally, when obtaining the named entity in the target text by using the bidirectional semantic representation of the target text includes performing the following processing, performing semantic coding on the target text by using a BERT semantic representation layer, inputting the semantic coding to a BilStm layer for decoding, outputting the semantic coding to an attention mechanism layer to obtain an attention weight, and optimizing sequence information transmitted by the attention layer through a CRF layer.

Optionally, when the BERT semantic representation layer is used for semantically coding the target text, inputting the target text into the BERT semantic representation layer through a text input layer; and carrying out semantic coding on the words in the target text by using a BERT model at the BERT semantic representation layer to obtain word vectors of the words in the target text.

Optionally, when the word vectors are input into the BilSTM layer for decoding, the word vectors of the words in the target text are input into the BilSTM layer to fuse the information of the context features, so as to obtain the text feature vectors.

Optionally, when the attention mechanism layer is output to obtain the attention weight, inputting the output text feature vector of the BilSTM layer into the attention mechanism layer; determining, by the attention mechanism layer, attention weights for keywords in the target text.

Optionally, when the sequence information transmitted from the attention layer is optimized through the CRF layer, the optimal joint probability in the sequence information is calculated by using the CRF layer, so as to obtain an optimized global feature sequence.

According to another aspect of the embodiments of the present application, there is also provided a device for identifying a named entity, including: the system comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a target text, and the target text is a text of a named entity to be extracted; and the identification unit is used for obtaining the named entity in the target text by utilizing the bidirectional semantic representation of the target text.

Optionally, the identification unit is further configured to: and performing semantic coding on the target text by using a BERT semantic representation layer, inputting the semantic coding into a BilSTM layer for decoding, outputting the semantic coding to an attention mechanism layer to obtain attention weight, and optimizing sequence information transmitted by the attention layer through a CRF layer.

Optionally, the identification unit is further configured to: when the BERT semantic representation layer is used for carrying out semantic coding on the target text, the target text is input to the BERT semantic representation layer through a text input layer; and carrying out semantic coding on the words in the target text by using a BERT model at the BERT semantic representation layer to obtain word vectors of the words in the target text.

Optionally, the identification unit is further configured to: and when the word vectors are input into the BilSTM layer for decoding, the word vectors of the words in the target text are input into the BilSTM layer to fuse the information of the context characteristics to obtain the text characteristic vectors.

Optionally, the identification unit is further configured to: when the attention mechanism layer is output to obtain the attention weight, inputting the text feature vector output by the BilSTM layer into the attention mechanism layer; determining, by the attention mechanism layer, attention weights for keywords in the target text.

Optionally, the identification unit is further configured to: when the sequence information transmitted by the attention layer is optimized through the CRF layer, the optimal joint probability in the sequence information is calculated by using the CRF layer, and the optimized global characteristic sequence is obtained.

According to another aspect of the embodiments of the present application, there is also provided a storage medium including a stored program which, when executed, performs the above-described method.

According to another aspect of the embodiments of the present application, there is also provided an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the above method through the computer program.

In the embodiment of the application, a target text is obtained, wherein the target text is a text of a named entity to be extracted; the named entities in the target text are obtained by utilizing the bidirectional semantic representation of the target text, the recognition accuracy of the named entities is improved by using a deep learning method, and the technical problem of low extraction accuracy of the named entities in the related technology can be solved.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:

FIG. 1 is a flow diagram of an alternative method of identifying a named entity according to an embodiment of the present application;

FIG. 2 is a schematic diagram of an alternative network layer for named entity extraction according to an embodiment of the present application;

FIG. 3 is a schematic diagram of an alternative named entity identification scheme in accordance with embodiments of the present application;

FIG. 4 is a schematic diagram of an alternative named entity recognition arrangement according to an embodiment of the present application;

and

fig. 5 is a block diagram of a terminal according to an embodiment of the present application.

Detailed Description

In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

The early named entity extraction method is mainly based on a rule and a statistical method, the effect is not ideal, if a machine learning method such as a Conditional Random Field (CRF), a Support Vector Machine (SVM), a hidden Markov and the like is adopted, the effect is general, even if models such as CNN \ RNN and the like are adopted for realization, according to one aspect of the embodiment of the application, the embodiment of the named entity recognition method is provided, and deep learning algorithms such as RNN and the like can be improved. Fig. 1 is a flowchart of an alternative named entity identification method according to an embodiment of the present application, and as shown in fig. 1, the method may include the following steps:

step S1, obtaining a target text, wherein the target text is the text of the named entity to be extracted.

And step S2, obtaining the named entity in the target text by using the bidirectional semantic representation of the target text.

Through the steps, a target text is obtained, wherein the target text is a text of the named entity to be extracted; the named entities in the target text are obtained by utilizing the bidirectional semantic representation of the target text, the recognition accuracy of the named entities is improved by using a deep learning method, and the technical problem of low extraction accuracy of the named entities in the related technology can be solved.

Semantic coding is carried out by utilizing the strong semantic representation capability of BERT (bidirectional Encoder retrieval from transformations), word vectors are trained, then the word vectors are input into a BilSTM layer for further decoding, and then the word vectors are output to an attention mechanism layer to obtain the attention weight of the learning words, and then the sequence information transmitted by the attention layer is further optimized by a CRF layer.

As an alternative example, the technical solution of the present application is further described below with reference to specific embodiments.

The scheme is a naming identification method based on BERT-BilSTM-CRF. Semantic coding is carried out by utilizing the strong semantic representation capability of BERT, word vectors are trained, then the word vectors are input into a BilSTM layer for further decoding, and then the word vectors are output to an attention mechanism layer to obtain the attention weight of the learned words, and then sequence information transmitted by the attention layer is further optimized by a CRF layer. The BilSTM layer in the scheme is shown in figure 2, and the overall model block diagram of the scheme is shown in figure 3. The specific implementation steps are as follows:

step 1, a text input layer and a BERT semantic representation layer: the decoding part carries out semantic representation by using a BERT model, converts words of the text into a vector form and obtains word vectors represented by the text.

Step 2, a BilSTM layer: for the task of named entity recognition, in order to obtain continuously characterized context information, the context information is difficult to utilize by using a traditional neural network model, so a BilSTM (bidirectional LSTM) model is adopted for training to obtain a text feature vector of information fused with context features.

Step 3, attention is paid to a mechanical layer: although the BILSTM model can capture the context information of the text, the semantic vector representing the weight of the related words is difficult to obtain, so the attention is used for controlling the weight of the important words to improve the effect of named entity recognition. In addition, in named entity recognition, not every word contributes greatly to correctly recognizing an entity, so that after the BilSTM layer, an attention mechanism layer is added, and information which is most relevant to current recognition can be more accurately searched by a model, so that the recognition accuracy can be improved.

Step 4, CRF layer: CRF (conditional random field) is a probabilistic graphical model. Conditional random fields can compute the optimal joint probability in a sequence. The CRF layer is used to obtain an optimized global signature sequence.

Semantic coding is carried out by utilizing the strong semantic representation capability of BERT (bidirectional Encoder retrieval from transformations), word vectors are trained, then the word vectors are input into a BilSTM layer for further decoding, and then the word vectors are output to an attention mechanism layer to obtain the attention weight of the learning words, and then the sequence information transmitted by the attention layer is further optimized by a CRF layer. Based on the BERT-BilSTM-CRF technology, the method avoids the defects of the traditional method, improves the accuracy rate of named entity recognition by using a deep learning method, and solves the problem of poor effect of named entity recognition of the traditional and machine learning methods.

It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.

Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present application.

According to another aspect of the embodiments of the present application, there is also provided a named entity recognition apparatus for implementing the above named entity recognition method. Fig. 4 is a schematic diagram of an alternative named entity recognition apparatus according to an embodiment of the present application, and as shown in fig. 4, the apparatus may include:

an obtaining unit 41, configured to obtain a target text, where the target text is a text of a named entity to be extracted; and the identifying unit 43 is configured to obtain the named entity in the target text by using the bidirectional semantic representation of the target text.

It should be noted that the obtaining unit 41 in this embodiment may be configured to execute step S1 in this embodiment, and the identifying unit 43 in this embodiment may be configured to execute step S2 in this embodiment.

Acquiring a target text through the module, wherein the target text is a text of a named entity to be extracted; the named entities in the target text are obtained by utilizing the bidirectional semantic representation of the target text, the recognition accuracy of the named entities is improved by using a deep learning method, and the technical problem of low extraction accuracy of the named entities in the related technology can be solved.

The general model block diagram of the scheme is shown in fig. 3. The specific implementation mode is as follows:

It should be noted here that the modules described above are the same as the examples and application scenarios implemented by the corresponding steps, but are not limited to the disclosure of the above embodiments. It should be noted that the modules as a part of the apparatus may run in a corresponding hardware environment, and may be implemented by software, or may be implemented by hardware, where the hardware environment includes a network environment.

According to another aspect of the embodiment of the present application, there is also provided a server or a terminal for implementing the above named entity identification method.

Fig. 5 is a block diagram of a terminal according to an embodiment of the present application, and as shown in fig. 5, the terminal may include: one or more processors 201 (only one shown), memory 203, and transmission means 205, as shown in fig. 5, the terminal may further comprise an input-output device 207.

The memory 203 may be configured to store software programs and modules, such as program instructions/modules corresponding to the method and apparatus for identifying a named entity in the embodiment of the present application, and the processor 201 executes various functional applications and data processing by running the software programs and modules stored in the memory 203, that is, implements the above-mentioned method for identifying a named entity. The memory 203 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 203 may further include memory located remotely from the processor 201, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 205 is used for receiving or sending data via a network, and can also be used for data transmission between a processor and a memory. Examples of the network may include a wired network and a wireless network. In one example, the transmission device 205 includes a Network adapter (NIC) that can be connected to a router via a Network cable and other Network devices to communicate with the internet or a local area Network. In one example, the transmission device 205 is a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.

Wherein the memory 203 is specifically used for storing application programs.

The processor 201 may call the application stored in the memory 203 via the transmission means 205 to perform the following steps:

acquiring a target text, wherein the target text is a text of a named entity to be extracted; and obtaining the named entity in the target text by utilizing the bidirectional semantic representation of the target text.

The processor 201 is further configured to perform the following steps:

when the named entity in the target text is obtained by utilizing the bidirectional semantic representation of the target text, the following processing is carried out, the semantic coding is carried out on the target text by utilizing a BERT semantic representation layer, the semantic coding is input into a BilSTM layer for decoding, the decoding is output to an attention mechanism layer to obtain the attention weight, and the sequence information transmitted by the attention layer is optimized by a CRF layer.

Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments, and this embodiment is not described herein again.

It can be understood by those skilled in the art that the structure shown in fig. 5 is only an illustration, and the terminal may be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palm computer, and a Mobile Internet Device (MID), a PAD, etc. Fig. 5 is a diagram illustrating a structure of the electronic device. For example, the terminal may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in FIG. 5, or have a different configuration than shown in FIG. 5.

Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.

Embodiments of the present application also provide a storage medium. Alternatively, in this embodiment, the storage medium may be a program code for executing the method for identifying a named entity.

Optionally, in this embodiment, the storage medium may be located on at least one of a plurality of network devices in a network shown in the above embodiment.

Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps:

Optionally, the storage medium is further arranged to store program code for performing the steps of:

Optionally, in this embodiment, the storage medium may include, but is not limited to: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.

The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.

The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or a part of or all or part of the technical solution contributing to the prior art may be embodied in the form of a software product stored in a storage medium, and including instructions for causing one or more computer devices (which may be personal computers, servers, network devices, or the like) to execute all or part of the steps of the method described in the embodiments of the present application.

In the above embodiments of the present application, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The foregoing is only a preferred embodiment of the present application and it should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered as the protection scope of the present application.

Claims

1. A method for identifying a named entity, comprising:

acquiring a target text, wherein the target text is a text of a named entity to be extracted;

and obtaining the named entity in the target text by utilizing the bidirectional semantic representation of the target text.

2. The method of claim 1, wherein obtaining the named entities in the target text using the bi-directional semantic representation of the target text comprises:

and performing semantic coding on the target text by using a BERT semantic representation layer, inputting the semantic coding into a BilSTM layer for decoding, outputting the semantic coding to an attention mechanism layer to obtain attention weight, and optimizing sequence information transmitted by the attention layer through a CRF layer.

3. The method of claim 2, wherein semantically encoding the target text with a BERT semantic representation layer comprises:

inputting the target text into a BERT semantic representation layer through a text input layer;

and carrying out semantic coding on the words in the target text by using a BERT model at the BERT semantic representation layer to obtain word vectors of the words in the target text.

4. The method of claim 2, wherein inputting into the BilSTM layer for decoding comprises:

and inputting the word vectors of the words in the target text into a BilSTM layer to fuse the information of the context characteristics to obtain the text characteristic vectors.

5. The method of claim 2, wherein outputting to the attention mechanism layer to obtain the attention weight comprises:

inputting the text feature vector output by the BilSTM layer into an attention mechanism layer;

determining, by the attention mechanism layer, attention weights for keywords in the target text.

6. The method of claim 2, wherein optimizing the sequence information conveyed by the attention layer over the CRF layer comprises:

and calculating the optimal joint probability in the sequence information by using the CRF layer to obtain an optimized global characteristic sequence.

7. An apparatus for identifying named entities, comprising:

the system comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a target text, and the target text is a text of a named entity to be extracted;

and the identification unit is used for obtaining the named entity in the target text by utilizing the bidirectional semantic representation of the target text.

8. The apparatus of claim 7, wherein the identification unit is further configured to:

9. A storage medium, characterized in that the storage medium comprises a stored program, wherein the program when executed performs the method of any of the preceding claims 1 to 6.

10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the method of any of the preceding claims 1 to 6 by means of the computer program.