CN112668333A - Named entity recognition method and device, and computer-readable storage medium


Info

Publication number
CN112668333A
Authority
CN
China
Prior art keywords: text, named entity, recognized, keyword, recognition model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910979122.2A
Other languages
Chinese (zh)
Inventor
孟函可
祝官文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201910979122.2A priority Critical patent/CN112668333A/en
Priority to PCT/CN2020/102094 priority patent/WO2021073179A1/en
Publication of CN112668333A publication Critical patent/CN112668333A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition

Abstract

Embodiments of this application provide a named entity recognition method and device. They relate to natural language processing (NLP) technology, can be applied to speech recognition in the field of artificial intelligence (AI), and are particularly suited to applications such as voice assistants. The method comprises: acquiring a text to be recognized; determining the scene type to which the named entity recognition model used to recognize named entities in the text is applied; inputting the text to be recognized and the scene type into the named entity recognition model; and acquiring the output information of the model to determine the named entities it recognized in the text according to the scene type. By embedding scene information into the input of the named entity recognition model, the method improves the probability that the model correctly recognizes named entities across different usage scenes.

Description

Named entity recognition method and device, and computer-readable storage medium
Technical Field
The present application relates to the field of named entity recognition technology, and in particular, to a method and apparatus for recognizing a named entity, and a computer-readable storage medium.
Background
Named entity recognition (NER), also known as entity recognition, is a fundamental task in natural language processing with a very wide range of applications. A named entity generally refers to an entity in text that has a particular meaning or strong referential value, and typically includes person names, place names, organization names, dates and times, proper nouns, and the like. An NER system extracts such entities from unstructured input text and can identify more entity classes according to business requirements. In existing named entity recognition methods, a recognition model is usually trained only for one specific application scene; supporting different scenes requires collecting a large corpus for each scene and training a separate model for each, so the training process is complex and the resulting models do not adapt well.
Disclosure of Invention
The application provides a named entity recognition method, device, and computer-readable storage medium that enable a single named entity recognition model to be applied to different scenes.
In a first aspect, the present application provides a method for recognizing named entities in text. A named entity (or simply an entity) is an item in the text with a specific meaning or strong referential value, and generally includes person names, place names, organization names, dates and times, proper nouns, and the like; in a broader sense it also covers numbers, currencies, addresses, and so on.
The named entity recognition method provided by the application can be applied to formal recognition scenarios, for example, providing named entity recognition of text for a voice assistant.
The method can also be applied while training the named entity recognition model: the named entities recognized in a training text are compared with the named entities labeled in that text in advance, and the parameters of the model are adjusted according to the comparison result.
Specifically, the method for identifying a named entity provided by the first aspect comprises the following steps:
acquiring a text to be recognized; determining the scene type to which the named entity recognition model used to recognize named entities in the text is applied; inputting the text to be recognized and the scene type into the named entity recognition model; and acquiring output information of the named entity recognition model to determine the named entities recognized in the text according to the scene type. Optionally, the text to be recognized may be text converted from speech, or a training sample text labeled with named entity tags in advance, and it may contain Chinese characters, numbers, symbols, English, and other characters.
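The following minimal sketch, written in Python, illustrates the four steps above; the `NerModel` class, its `predict` method, and the scene label `"tv_assistant"` are hypothetical placeholders introduced only for illustration and are not identifiers from this application.

```python
# Minimal sketch of the four steps above. All names here are illustrative
# placeholders, not identifiers taken from the application.

class NerModel:
    """Toy stand-in for a trained named entity recognition model."""

    def predict(self, tokens, scene_type):
        # A real model would return one BIO-style label per division unit,
        # conditioned on both the tokens and the scene type.
        return ["O"] * len(tokens)

def recognize_named_entities(text_to_recognize, scene_type, model):
    tokens = text_to_recognize.split()          # step 1: the acquired text to be recognized
    labels = model.predict(tokens, scene_type)  # steps 2-3: scene type determined and fed in with the text
    return list(zip(tokens, labels))            # step 4: output labels determine the recognized entities

print(recognize_named_entities("play your name", "tv_assistant", NerModel()))
```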
A named entity recognition model (also called an entity recognition model or a named entity recognition system) extracts named entities from the text to be recognized and can recognize more entity types according to business requirements. The model may follow existing implementation approaches such as feature-template-based methods or neural-network-based methods; in one optional example, it is a Word Embedding + LSTM (long short-term memory network) / BiLSTM (bidirectional LSTM) + CRF (conditional random field) model.
A scene is the actual situation in which the named entity recognition model is applied. Scene types may be classified by terminal type, and/or application software type, and/or usage type. For example, the terminal type may cover different kinds of terminal electronic devices such as televisions, mobile phones, and automobile consoles; the application software type may be a system-level type or a category of application software (such as movie or music applications); the usage type may distinguish, say, a voice assistant from an automatic response system. Scene types may be derived from any one of these classification dimensions or from a combination of them, for example "system voice assistant on a television" or "automatic response system of a shopping application"; this application places no particular limitation on the classification.
The named entity recognition model described above differs from related-art models in that, besides the text to be recognized, its input also includes the scene type to which it is applied. Because scene type information is part of the input during training and/or recognition, one model can serve different application scenes. For example, if a user says "change to your name" to a voice assistant, a television voice assistant will recognize "your name" as a movie title, while a mobile phone voice assistant may not; the scene types are different, so the recognition result output by the model depends on the scene type information in its input.
In one possible embodiment, before the text to be recognized and the scene type are input into the named entity recognition model, the method further comprises: labeling the text to be recognized with content indexes per division unit, where division units with the same content share the same content index; determining the scene type index corresponding to the scene type; and labeling the scene type index for each division unit, that is, each division unit (for example, each character or each word after segmentation) is labeled with the scene type index of the applied scene.
Correspondingly, the inputting the text to be recognized and the scene type into the named entity recognition model includes: and inputting the indexes marked by all the dividing units in the text to be recognized into the named entity recognition model.
A division unit is the basic unit into which a sentence is divided. For Chinese, a single Chinese character can serve as the most basic division unit, or the words produced by word segmentation can. Word segmentation may use existing tools such as the jieba segmenter; its goal is to split a sentence into words. Each division unit in the text to be recognized is labeled with a corresponding content index (also called a character index or word index), and identical characters or words receive the same index, for example the word "you" may have index 15 and the word "name" index 92.
The scene type index corresponding to a scene type can be configured manually in advance, for example set at the factory according to the type of terminal device. Indexes for different scene types may be preconfigured, e.g. scene type index 1 for the application scene of a voice assistant on a mobile phone terminal, scene type index 2 for a voice assistant on a television terminal, and so on.
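A small sketch of this labeling step, assuming a made-up vocabulary and scene table (the index values 15, 92, 1, and 2 echo the examples above; everything else is illustrative):

```python
# Illustrative labeling of content indexes and a scene type index per division unit.
# The vocabulary and scene table below are made-up examples, not part of the application.

word_index = {"play": 7, "you": 15, "name": 92}          # content indexes (same word -> same index)
scene_index = {"phone_assistant": 1, "tv_assistant": 2}  # preconfigured scene type indexes

def label_indexes(division_units, scene):
    scene_id = scene_index[scene]
    # Every division unit carries its own content index plus the shared scene type index.
    return [(unit, word_index.get(unit, 0), scene_id) for unit in division_units]

print(label_indexes(["play", "you", "name"], "tv_assistant"))
# [('play', 7, 2), ('you', 15, 2), ('name', 92, 2)]
```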
In one possible implementation, after the labeled indexes of all division units in the text to be recognized are input into the named entity recognition model, the model processes them as follows: for each division unit, convert each type of labeled index into a multi-dimensional vector; for each division unit, splice the multi-dimensional vectors converted from the different index types in sequence; input the splicing vectors of all division units into a sequence labeling model; and acquire the output of the sequence labeling model to obtain the labeling information of the named entities in the text to be recognized.
Optionally, the content index and the other index types used to represent natural-language division units (characters or words) can be converted into machine-readable multi-dimensional vectors using algorithms such as word embedding, distributed vectors, one-hot encoding, or improved variants of them.
Then, after the multi-dimensional vectors converted from the different index types are obtained, they are spliced per division unit. For example, for each word the vector converted from the content index and the scene type vector are concatenated in order to obtain that word's splicing vector: for the word "you", content index 15 is converted into a 50-dimensional vector w1; the application scene is the voice assistant of a mobile phone terminal with scene index 1, which is converted into a 20-dimensional vector w2; the splicing vector of "you" is then [w1, w2].
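The splicing step can be sketched as follows; the embedding tables are random stand-ins and the dimensions (50 for content, 20 for scene) follow the example above:

```python
import numpy as np

# Illustrative conversion of indexes into vectors and concatenation per division unit.
rng = np.random.default_rng(0)
content_embeddings = rng.normal(size=(1000, 50))   # content index -> 50-dimensional vector
scene_embeddings = rng.normal(size=(10, 20))       # scene type index -> 20-dimensional vector

def spliced_vector(content_idx, scene_idx):
    w1 = content_embeddings[content_idx]   # e.g. index 15 for the word "you"
    w2 = scene_embeddings[scene_idx]       # e.g. index 1 for the mobile phone voice assistant
    return np.concatenate([w1, w2])        # [w1, w2], a 70-dimensional splicing vector

print(spliced_vector(15, 1).shape)  # (70,)
```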
It should be noted that the sequence labeling model is a model that performs a sequence labeling task. In this application its input is a sequence of splicing vectors, each corresponding to the division unit at the same position, and its output is a label for each splicing vector indicating whether the corresponding division unit is part of a named entity.
In one possible example, different labels can also mark the named entity type and/or whether the corresponding division unit is the starting position. For example, the text to be recognized is "play your name", the word segmentation result is "play / you / name", and the scene type index is 1; the splicing vectors are input into the sequence labeling model and the output is "O B-movie I-movie", where label "O" means the division unit "play" is not a named entity, label "B-movie" means the division unit "you" belongs to a named entity of type movie and is its starting position, and label "I-movie" means the division unit "name" belongs to a named entity of type movie but is not its starting position.
In one possible embodiment, the sequence labeling model is a deep learning model, for example an RNN (recurrent neural network) + CRF model; specifically, it contains one or more layers of recurrent neural networks followed by a conditional random field. In the recurrent layers, each layer uses its neural network computing units to process the splicing vector of each division unit in turn and outputs a computation result vector for each division unit. The conditional random field receives a vector sequence consisting of the computation result vectors that the last recurrent layer produced for all division units, arranged in order, and outputs a label for each division unit indicating whether it is a named entity. In one possible embodiment, the RNN is an LSTM or a BiLSTM.
In a possible implementation, the deep learning model may first be obtained by unsupervised pre-training on corpora. Unsupervised pre-training trains a model on unlabeled corpora; for example, the raw corpus may be encoded and the training target is to reconstruct the raw corpus from its encoding (i.e. predict the original text), or, when the deep learning model is an LSTM + CRF model, the training target may be to output the next word given the words seen so far (i.e. next-word prediction).
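A compact sketch of the next-word-prediction variant of such pre-training, assuming PyTorch and a toy corpus of random token ids (the vocabulary size, dimensions, and single optimization step are illustrative, not values from the application):

```python
import torch
import torch.nn as nn

# Sketch of unsupervised pre-training as next-word prediction on unlabeled corpora.
vocab_size, embed_dim, hidden_dim = 100, 32, 64

class LstmLm(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):                 # tokens: (batch, seq_len)
        h, _ = self.lstm(self.embed(tokens))
        return self.out(h)                     # logits over the next word at each position

model = LstmLm()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
corpus = torch.randint(0, vocab_size, (8, 10))          # unlabeled token ids (toy corpus)

optimizer.zero_grad()
logits = model(corpus[:, :-1])                          # input: the words seen so far
loss = nn.functional.cross_entropy(                     # target: the next word
    logits.reshape(-1, vocab_size), corpus[:, 1:].reshape(-1))
loss.backward()
optimizer.step()
```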
In a possible implementation, before the labeled indexes of all division units in the text to be recognized are input into the named entity recognition model, the method further includes: labeling a part-of-speech index for each division unit, the part-of-speech index indicating the part of speech of that unit, such as verb, noun, adverb, or conjunction. Correspondingly, the labeled indexes input into the model for each division unit include the part-of-speech index. When indexes are converted into vectors, the part-of-speech index is also converted into a multi-dimensional vector; when the splicing vector is generated, the vectors of all index types (content index, part-of-speech index, and scene index) of a division unit are concatenated into one vector as that unit's splicing vector.
In a possible implementation, before the labeled indexes of all division units in the text to be recognized are input into the named entity recognition model, the method further includes: matching the text to be recognized against a preset named entity dictionary and determining all named entities matched in the text; and labeling a knowledge index for each division unit, the knowledge index representing information about the named entities that the division unit matched in the dictionary. Correspondingly, the labeled indexes input into the model for each division unit include the knowledge index.
It should be appreciated that, because the named entity dictionary is known knowledge, matching the text to be recognized against it allows the matching result to be fed to the named entity recognition model as known knowledge, which improves the model's recognition success rate. The knowledge index may mark the type of the matched named entity and/or whether the unit is at a starting position and/or how many times it was matched. For example, "you", "name", and "your name" in "your name" all match movie-name entities in the dictionary: "you" is labeled B2-I0-movie, meaning it was matched twice as the starting position of a movie-name entity, and "name" is labeled B1-I1-movie, meaning it was matched once as a starting position and once as a non-starting position.
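The dictionary matching and knowledge-index labeling can be sketched as below; the toy dictionary entries "you", "name", and "you name" stand in for the three matches of the "your name" example, and the B/I counting mirrors the B2-I0-movie / B1-I1-movie notation above:

```python
# Illustrative dictionary matching that produces a knowledge annotation per division unit,
# counting how often the unit appears at a starting (B) vs. non-starting (I) position of a match.

movie_dictionary = {"you", "name", "you name"}  # toy stand-in for a named entity dictionary of movie names

def knowledge_annotations(units):
    counts = [{"B": 0, "I": 0} for _ in units]
    # try every contiguous span of division units against the dictionary
    for start in range(len(units)):
        for end in range(start + 1, len(units) + 1):
            if " ".join(units[start:end]) in movie_dictionary:
                counts[start]["B"] += 1              # matched as a starting position
                for k in range(start + 1, end):
                    counts[k]["I"] += 1              # matched as a non-starting position
    return [f"B{c['B']}-I{c['I']}-movie" for c in counts]

print(knowledge_annotations(["play", "you", "name"]))
# ['B0-I0-movie', 'B2-I0-movie', 'B1-I1-movie']
```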
In one possible embodiment, before the text to be recognized and the scene type are input into the named entity recognition model, the method further comprises: acquiring a first text to be recognized corresponding to the current round of conversation, where the conversation may be a voice conversation or a text conversation, the current round refers to the most recent utterance sent by the other party interacting with the executing device, and the first text is obtained from the voice or text of that round; detecting whether a first keyword, such as "change to", exists in the first text, where the first keyword indicates that the first text is related to a multi-turn conversation scene, i.e. a scene with context, for example the user first asks to play one movie and then says "change to your name" (optionally, the first keyword also indicates that the intention of the first text is incomplete); and, if the first keyword exists in the first text, replacing it with a second keyword, for example replacing "change to" with "play", to obtain a second text to be recognized. Correspondingly, the named entity recognition model is further configured to recognize named entities in the second text according to at least the second keyword and the scene type, where the second keyword is related to the entity category of the recognized named entity; in the example above, the second keyword "play" is related to the movie entity category, so when the model receives the second text containing "play", the category of the recognized named entity is related to the category that "play" implies.
In a possible implementation, before detecting whether the first keyword exists in the first text, the method further includes: obtaining the intention analysis result of a third text to be recognized that corresponds to the previous round of conversation, so as to obtain the intention type of the third text (the executing device may obtain the previous round's intention analysis result, for example an intention type of "search video"); and judging whether the intention type of the third text is a designated intention type. Correspondingly, it is detected whether the first keyword exists in the first text only if the intention type of the third text is the designated intention type; that is, the detection is performed only when the previous round's intention belongs to one or several specific intention types. Optionally, the second keyword is determined according to the intention analysis result of the previous round: each intention type may correspond to one or more keywords from which the second keyword is selected, or the second keyword may be a word extracted from the previous round, for example a verb (such as "play") or a noun related to the entity category (such as "movie"). Optionally, the second keyword is a verb and/or a noun.
In one possible embodiment, detecting whether the first keyword exists in the first text includes: matching the first text with a preset regular expression and determining from the match whether the first keyword exists. In such an expression, "?" marks a character that may be absent, "|" expresses alternatives, and ".*" matches any character string, so sentence patterns such as "only need xxx", "change to xxx", and "select xxx" can all be matched successfully.
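A sketch of such keyword detection, using a hypothetical English approximation of the pattern (the original expression targets Chinese sentence patterns; this regex is an assumption introduced for illustration only):

```python
import re

# Hypothetical English approximation of the kind of regular expression described above.
first_keyword_pattern = re.compile(
    r"^(?:i )?(?:only (?:want|need)|change to|switch to|select) (?P<rest>.+)$")

def detect_first_keyword(first_text):
    # A successful match indicates the first text is related to a multi-turn conversation scene.
    return first_keyword_pattern.match(first_text.lower()) is not None

print(detect_first_keyword("change to your name"))   # True
print(detect_first_keyword("play your name"))        # False
```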
In a possible implementation manner, the entity category corresponding to the second keyword is: a movie class or a music class; the second keyword corresponding to the entity category of the movie class is "play" and/or "movie"; the second keyword corresponding to the entity category of the music category is "listen" and/or "music".
In one possible embodiment, after identifying a named entity in the second text using the named entity identification model, the method further comprises: searching the named entity identified by the named entity identification model aiming at the second text in the first text; if the named entity is not found in the first text, determining that the corresponding named entity is invalid; and if the named entity is found in the first text, determining that the corresponding named entity is valid.
The purpose of the above embodiment is to prevent recognized named entities that fall outside the original context. For example, if the first text is "change to ABC" and the second text is "play ABC", an entity recognized as "play AB" does not appear in the original first text: the keyword replacement has caused the entity to be recognized incorrectly, so it is treated as invalid.
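The keyword replacement and the validity check can be sketched together as follows; the keyword mapping and the recognized entities are simplified stand-ins:

```python
# Sketch of keyword replacement and the validity check described above.

def replace_keyword(first_text, first_keyword, second_keyword):
    # e.g. "change to ABC" -> "play ABC"
    return first_text.replace(first_keyword, second_keyword)

def entity_is_valid(first_text, recognized_entity):
    # An entity recognized in the second (rewritten) text is kept only
    # if it also appears in the original first text.
    return recognized_entity in first_text

first_text = "change to ABC"
second_text = replace_keyword(first_text, "change to", "play")   # "play ABC"
print(entity_is_valid(first_text, "ABC"))      # True  -> valid
print(entity_is_valid(first_text, "play AB"))  # False -> invalid, crosses the replaced keyword
```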
In a possible implementation, the scene types are classified according to the terminal type and/or the type of application software applied by the named entity recognition model.
In a possible implementation manner, the method is applied to an electronic device equipped with a sound receiver, such as a mobile phone terminal or a television terminal, and the obtaining of the text to be recognized includes: acquiring voice collected by the sound receiver, such as a voice instruction sent by a user; and converting the voice into a text to obtain the text to be recognized.
In a second aspect, the present application further provides a method for recognizing a named entity, including: acquiring a first text to be recognized corresponding to the current round of conversation; detecting whether a first keyword exists in the first text, where the first keyword indicates that the first text is related to a multi-turn conversation scene; if the first keyword exists in the first text, replacing it with a second keyword to obtain a second text to be recognized; and recognizing the named entity in the second text with a named entity recognition model according to at least the second keyword, where the second keyword is related to the entity category of the recognized named entity. It should be noted that this method may also be combined with the scene type to which the named entity recognition model is applied: the scene type is obtained before recognizing named entities in the second text, and the named entity is then recognized in the second text according to at least the second keyword and the scene type; those skilled in the art can combine the embodiments of the first aspect with those of the second aspect.
In a possible implementation, before detecting whether the first keyword exists in the first text, the method further includes: acquiring an intention analysis result of a third text to be identified corresponding to a previous round of the current round of conversation to obtain an intention type of the third text; judging whether the intention type of the third text is a specified intention type; accordingly, if the intention type of the third text is the designated intention type, it is detected whether a first keyword exists in the first text.
In one possible embodiment, after identifying a named entity in the second text using the named entity identification model, the method further comprises: searching the named entity identified by the named entity identification model aiming at the second text in the first text; if the named entity is not found in the first text, determining that the corresponding named entity is invalid; and if the named entity is found in the first text, determining that the corresponding named entity is valid.
Optionally, the second keyword is determined according to an intention analysis result corresponding to a previous round of conversation of the current round of conversation.
Optionally, the second keyword is a verb and/or a noun.
Optionally, the first keyword is used to indicate that the intention corresponding to the first text is incomplete.
Optionally, detecting whether the first keyword exists in the first text includes: matching the first text with a preset regular expression and determining from the match whether the first keyword exists in the first text.
Optionally, the entity category corresponding to the second keyword is: a movie class or a music class; the second keyword corresponding to the entity category of the movie class is "play" and/or "movie"; the second keyword corresponding to the entity category of the music category is "listen" and/or "music".
In a third aspect, the present application provides a named entity recognition apparatus, including: the first acquisition module is used for acquiring a text to be recognized; the first determination module is used for determining a scene type applied by a named entity recognition model for recognizing the named entity in the text to be recognized; the input module is used for inputting the text to be recognized and the scene type into the named entity recognition model; and the execution module is used for acquiring the output information of the named entity recognition model so as to determine the named entity recognized by the named entity recognition model in the text to be recognized according to the scene type.
In a possible embodiment, the apparatus further comprises: the division module is used for marking the text to be recognized with a content index according to a division unit before the text to be recognized and the scene type are input into the named entity recognition model, wherein the division units with the same content are marked by the same content index; a second determining module, configured to determine a scene type index corresponding to the scene type; labeling the scene type index for each dividing unit; correspondingly, the input module comprises: and the first input unit is used for inputting the indexes marked by all the dividing units in the text to be recognized into the named entity recognition model.
In one possible embodiment, the named entity recognition model includes: the first conversion unit is used for converting the marked indexes of different types into multi-dimensional vectors aiming at each division unit after the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model; the splicing unit is used for splicing a plurality of multi-dimensional vectors subjected to index conversion of different types in sequence aiming at each division unit; the second input unit is used for inputting the splicing vectors of all the division units in the text to be recognized into a sequence labeling model; and the first acquisition unit is used for acquiring the output result of the sequence labeling model so as to obtain the labeling information of the named entity in the text to be recognized.
In one possible embodiment, the sequence annotation model is a deep learning model, and the deep learning model includes: the cyclic neural network on each layer utilizes a neural network computing unit to sequentially compute the splicing vector of each division unit in the text to be recognized so as to output a computing result vector corresponding to each division unit; and the conditional random field is used for receiving a vector sequence, the vector sequence comprises a plurality of calculation result vectors which are arranged in sequence, the calculation result vectors are calculation result vectors of the last layer of the cyclic neural network in the one or more layers of cyclic neural networks aiming at all the division units, and a label used for identifying whether each division unit is a named entity is output.
Optionally, the recurrent neural network is a long-short term memory neural network or a bidirectional long-short term memory neural network.
Optionally, the deep learning model is obtained by performing unsupervised pre-training using corpora.
In a possible embodiment, the apparatus further comprises: the first labeling module is used for labeling a part-of-speech index for each division unit in the text to be recognized before the indexes labeled by all the division units in the text to be recognized are input into the named entity recognition model; correspondingly, the marked index of each division unit input into the named entity recognition model comprises the part-of-speech index.
In a possible embodiment, the apparatus further comprises: a third determining module, configured to match the text to be recognized with a preset named entity dictionary before inputting the indexes, labeled by all the dividing units, in the text to be recognized into the named entity recognition model, and determine all the named entities matched in the text to be recognized; the second labeling module is used for labeling a knowledge index aiming at each division unit in the text to be recognized, wherein the knowledge index is used for representing the information of the named entity matched with the corresponding division unit in the named entity recognition model; accordingly, the index to which each of the division units input to the named entity recognition model is labeled includes the knowledge index.
In a possible embodiment, the apparatus further comprises: the second obtaining module is used for obtaining a first text to be recognized corresponding to the current round of conversation before the text to be recognized and the scene type are input into the named entity recognition model; the detection module is used for detecting whether a first keyword exists in the first text, wherein the first keyword is used for indicating that the first text is related to a multi-turn conversation scene; the replacing module is used for replacing the first keyword in the first text with a second keyword under the condition that the first keyword exists in the first text to obtain a second text to be identified; correspondingly, the named entity recognition model is further configured to recognize a named entity in the second text according to at least the second keyword and the scene type, where the second keyword is related to an entity category corresponding to the recognized named entity.
In a possible implementation, before detecting whether the first keyword exists in the first text, the apparatus further includes: a third obtaining module, configured to obtain an intention analysis result of a third text to be identified, where the third text corresponds to a previous round of the current round of conversations, so as to obtain an intention type of the third text; the judging module is used for judging whether the intention type of the third text is a specified intention type; correspondingly, if the intention type of the third text is the designated intention type, the detection module performs detection whether the first keyword exists in the first text.
In a possible embodiment, the apparatus further comprises: a searching module, configured to search, in the first text, a named entity identified by the named entity identification model for the second text after identifying the named entity in the second text by using the named entity identification model; a fourth determining module, configured to determine that the corresponding named entity is invalid if the named entity is not found in the first text; a fifth determining module, configured to determine that the corresponding named entity is valid if the named entity is found in the first text.
Optionally, the second keyword is determined according to an intention analysis result corresponding to a previous round of conversation of the current round of conversation.
Optionally, the second keyword is a verb and/or a noun.
Optionally, the first keyword is used to indicate that the intention corresponding to the first text is incomplete.
In one possible embodiment, the detection module comprises: and the matching unit is used for matching in the first text by using a preset regular expression and determining whether the first keyword exists in the first text.
Optionally, the entity category corresponding to the second keyword is: a movie class or a music class; the second keyword corresponding to the entity category of the movie class is "play" and/or "movie"; the second keyword corresponding to the entity category of the music category is "listen" and/or "music".
Optionally, the scene type is classified according to a terminal type and/or a type of application software applied by the named entity recognition model.
In a possible implementation, the apparatus is applied to an electronic device configured with a sound receiver, and the obtaining module includes: the second acquisition unit is used for acquiring the voice collected by the voice receiver; and the second conversion unit is used for converting the voice into a text to obtain the text to be recognized.
In a fourth aspect, an embodiment of the present application further provides a device for identifying a named entity, where the device includes: the first acquisition module is used for acquiring a first text to be identified corresponding to the current round of conversation; the detection module is used for detecting whether a first keyword exists in the first text, wherein the first keyword is used for indicating that the first text is related to a multi-turn conversation scene; the replacing module is used for replacing the first keyword in the first text with a second keyword under the condition that the first keyword exists in the first text to obtain a second text to be identified; and the identification module is used for identifying the named entity in the second text by utilizing a named entity identification model at least according to the second keyword, wherein the second keyword is related to the entity category corresponding to the identified named entity.
In a possible embodiment, the apparatus further comprises: a second obtaining module, configured to obtain an intention analysis result of a third text to be identified, which corresponds to a previous round of conversation of the current round of conversation, before detecting whether a first keyword exists in the first text, so as to obtain an intention type of the third text; the judging module is used for judging whether the intention type of the third text is a specified intention type; accordingly, if the intention type of the third text is the designated intention type, the detection module performs detection of whether a first keyword exists in the first text.
In a possible embodiment, the apparatus further comprises: a searching module, configured to search, in the first text, a named entity identified by the named entity identification model for the second text after identifying the named entity in the second text by using the named entity identification model; a first determining module, configured to determine that the corresponding named entity is invalid if the named entity is not found in the first text; a second determining module, configured to determine that the corresponding named entity is valid if the named entity is found in the first text.
Optionally, the second keyword is determined according to an intention analysis result corresponding to a previous round of conversation of the current round of conversation.
Optionally, the second keyword is a verb and/or a noun.
Optionally, the first keyword is used to indicate that the intention corresponding to the first text is incomplete.
In one possible embodiment, the detection module comprises: a matching unit, configured to match the first text with a preset regular expression and determine whether the first keyword exists in the first text.
Optionally, the entity category corresponding to the second keyword is: a movie class or a music class; the second keyword corresponding to the entity category of the movie class is "play" and/or "movie"; the second keyword corresponding to the entity category of the music category is "listen" and/or "music".
In a fifth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when run on a computer, causes the computer to perform the method according to the first aspect.
In a sixth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when run on a computer, causes the computer to perform the method according to the second aspect.
In a seventh aspect, the present application provides a computer program for performing the method of the first aspect when the computer program is executed by a computer.
In an eighth aspect, the present application provides a computer program for performing the method of the second aspect when the computer program is executed by a computer.
In a possible implementation, the program in the seventh aspect or the eighth aspect may be stored in whole or in part on a storage medium packaged with the processor, or in part or in whole on a memory not packaged with the processor.
In a ninth aspect, the present application provides a named entity recognition apparatus, comprising: one or more processors; a memory; a plurality of application programs; and one or more computer programs, wherein the one or more computer programs are stored in the memory, the one or more computer programs comprising instructions which, when executed by the apparatus, cause the apparatus to perform the method of the first aspect.
In a tenth aspect, the present application provides a named entity recognition apparatus, comprising: one or more processors; a memory; a plurality of application programs; and one or more computer programs, wherein the one or more computer programs are stored in the memory, the one or more computer programs comprising instructions which, when executed by the apparatus, cause the apparatus to perform the method of the second aspect.
In each of the above aspects, embedding scene information into the input information of the named entity recognition model improves the probability that the model recognizes named entities correctly across different usage scenes.
Drawings
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application;
fig. 2 is a schematic flowchart of a method for identifying a named entity according to the present application;
FIG. 3 is a schematic structural diagram of a named entity recognition model provided herein;
FIG. 4 is a schematic flow chart illustrating another method for identifying named entities provided herein;
FIG. 5 is a schematic flow chart illustrating another method for identifying named entities provided herein;
fig. 6 is a schematic structural diagram of a named entity recognition device according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of another named entity recognition device according to an embodiment of the present application.
Detailed Description
The terminology used in the description of the embodiments section of the present application is for the purpose of describing particular embodiments of the present application only and is not intended to be limiting of the present application. Some embodiments provided by the present application may be applied to speech recognition in the field of Artificial Intelligence (AI), and may be related to Natural Language Processing (NLP), and may be specifically applied to applications such as a speech assistant.
The technical solution in the present application will be described below with reference to the accompanying drawings.
The application provides a named entity recognition method for recognizing named entities in text. A named entity (or simply an entity) is an item in the text with a specific meaning or strong referential value, and generally includes person names, place names, organization names, dates and times, proper nouns, and the like; in a broader sense it also covers numbers, currencies, addresses, and so on.
The named entity recognition method provided by the application can be applied to formal recognition scenarios, for example providing named entity recognition of text for an artificial-intelligence voice assistant. Fig. 1 shows a system architecture applied to a voice assistant. A user enters the voice assistant mode through a terminal such as a vehicle-mounted terminal, a computer terminal, or a mobile phone terminal. The voice assistant captures the speech, converts it into text with an Automatic Speech Recognition (ASR) module, and inputs the text into a Dialog Management (DM) module, which dispatches it to a Natural Language Understanding (NLU) module. After receiving the current sentence, the NLU module performs named entity recognition on it; optionally, the named entity recognition step is an instance of the recognition method provided in the embodiments of this application, including word segmentation, part-of-speech tagging, and named entity labeling (the sequence labeling model of the recognition method). The NLU module also contains modules for intent recognition and classification, slot filling, and so on, to understand the semantics of the text. The NLU module then returns the analysis result to the DM module, which uses a Natural Language Generation (NLG) module to generate the reply text according to the recognized semantics; a Text-To-Speech module converts the reply into voice and broadcasts it to the user. In the three application scenes shown in Fig. 1 (the voice assistants of a car terminal, a computer terminal, and a mobile phone terminal), the recognition results for the same input may differ, leading to different responses to the user. For example, for "change to your name", the mobile phone voice assistant recognizes no named entity and treats the utterance as chat, whereas the television voice assistant recognizes the named entity as the movie "your name" and switches to playing that movie.
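A minimal sketch of this dispatch flow; every function below is an illustrative placeholder for the ASR, NLU, DM, NLG, and TTS modules of Fig. 1, not an actual implementation:

```python
# Illustrative placeholders for the Fig. 1 pipeline: ASR -> DM -> NLU -> NLG (-> TTS).

def asr(audio):
    # speech -> text (stub)
    return "change to your name"

def nlu(text, scene_type):
    # word segmentation, POS tagging, NER, intent recognition, slot filling (stub)
    entities = [("your name", "movie")] if scene_type == "tv_assistant" else []
    return {"text": text, "entities": entities}

def nlg(analysis):
    # generate the reply text from the recognized semantics (stub)
    if analysis["entities"]:
        name, _ = analysis["entities"][0]
        return f"Playing the movie {name}."    # would then be sent to text-to-speech
    return "Sorry, I didn't catch that."

def dialog_manager(audio, scene_type):
    text = asr(audio)
    analysis = nlu(text, scene_type)           # DM dispatches the text to NLU
    return nlg(analysis)                       # DM generates the reply from the analysis

print(dialog_manager(b"...", "tv_assistant"))
print(dialog_manager(b"...", "phone_assistant"))
```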
The method can also be applied in a scenario where the named entity recognition model is being trained: after the named entities in a training text are recognized by the method, they are compared with the named entities labeled in the training text in advance, and the parameters of the model are adjusted according to the comparison result.
The named entity identification method according to the embodiment of the present application is described in detail below with reference to fig. 2. The method shown in fig. 2 includes steps 101 to 104, which are described in detail below.
Step 101, obtaining a text to be recognized.
Optionally, the text to be recognized may be text converted from speech, or a training sample text labeled with named entity tags in advance; it may contain Chinese characters, numbers, symbols, English, and other characters.
In an alternative example, the named entity identification method provided in the embodiment of the present application is applied to an electronic device equipped with a sound receiver, where the electronic device may specifically be a mobile terminal (e.g., a smart phone), a computer, a personal digital assistant, a wearable device, an in-vehicle device, an internet of things device, or another electronic device capable of receiving sound. In this example, the speech collected by the sound receiver is converted into text by the speech-to-text module shown in fig. 1, and the text to be recognized is obtained.
Step 102, determining a scene type applied by a named entity recognition model for recognizing the named entity in the text to be recognized.
A named entity recognition model (also called an entity recognition model or a named entity recognition system) extracts named entities from the text to be recognized and can recognize more entity types according to business requirements. The model may follow existing implementation approaches such as feature-template-based methods or neural-network-based methods; in one optional example, it is a Word Embedding + LSTM (long short-term memory network) / BiLSTM (bidirectional LSTM) + CRF (conditional random field) model.
In this application, a scene refers to the business situation in which the named entity recognition model is needed to recognize named entities. Scene types can be divided according to the terminal type and/or the application software type to which the model is applied. For example, the terminal may be a television, a mobile phone, an automobile console, or another kind of terminal electronic device; the application software may be system-level or a category of application software installed in the terminal's operating system (such as video or music applications); and the usage type may distinguish a voice assistant from an automatic response system, among others.
Step 103, inputting the text to be recognized and the scene type into the named entity recognition model.
And 104, acquiring output information of the named entity recognition model to determine the named entity recognized by the named entity recognition model in the text to be recognized according to the scene type.
The named entity recognition model according to the embodiments of the present application differs from related-art models in that, besides the text to be recognized, its input also includes the scene type to which it is applied. Because scene type information is part of the input during training and/or recognition, one model can serve different application scene types. For example, if a user says "change to your name" to a voice assistant, a television voice assistant will recognize "your name" as a movie title, while a mobile phone voice assistant may not; the scene types are different, so the recognition result output by the model depends on the scene type information in its input.
In an optional embodiment, before step 103 is executed, that is, before the text to be recognized and the scene type are input into the named entity recognition model, the method further includes the following steps 11 to 13:
and 11, marking the text to be recognized with a content index according to the division units, wherein the division units with the same content are marked with the same content index.
A division unit is the basic unit into which a sentence is divided. For Chinese, a single Chinese character can serve as the most basic division unit, or the words produced by word segmentation can. Word segmentation may use existing tools such as the jieba segmenter; its goal is to split a sentence into words. Each division unit in the text to be recognized is labeled with a corresponding content index (also called a character index or word index), and identical characters or words receive the same index, for example the word "you" may have index 15 and the word "name" index 92.
And step 12, determining a scene type index corresponding to the scene type.
The scene type index corresponding to a scene type can be configured manually in advance, for example set at the factory according to the type of terminal device. Indexes for different scene types may be preconfigured, e.g. scene type index 1 for the application scene of a voice assistant on a mobile phone terminal, scene type index 2 for a voice assistant on a television terminal, and so on. Optionally, the scene type index may be floating-point data; for example, the index of a general scene is 0 and the index of a high-recall scene is 1.0, so that the named entity recognition model can be actively migrated to a new scene by adjusting the scene type index between 0 and 1.
And step 13, marking scene type indexes for each division unit.
That is, each division unit (e.g., each word or each participle) is labeled with the applied scene type index.
Correspondingly, step 103 inputs the text to be recognized and the scene type into the named entity recognition model, which includes:
and 14, inputting indexes marked by all the division units in the text to be recognized into the named entity recognition model.
Further, after the step 14 is executed to input the indexes marked by all the dividing units in the text to be recognized into the named entity recognition model, the processing method of the named entity recognition model may include the following steps 21 to 23:
and step 21, respectively converting the marked different types of indexes into multi-dimensional vectors aiming at each division unit.
And 22, splicing a plurality of multi-dimensional vectors after index conversion of different types according to each division unit.
And step 23, inputting the splicing vectors of all the division units into the sequence labeling model and acquiring its output result to obtain the labeling information of the named entities in the text to be recognized.
Optionally, the content index and the other index types used to represent natural-language division units (characters or words) can be converted into machine-readable multi-dimensional vectors using algorithms such as word embedding, distributed vectors, one-hot encoding, or improved variants of them.
Then, after the multi-dimensional vectors converted from the different index types are obtained, they are spliced per division unit. For example, for each word the vector converted from the content index and the scene type vector are concatenated in order to obtain that word's splicing vector: for the word "you", the content index 15 is converted into a 50-dimensional vector w1, the application scene (the voice assistant of a mobile phone terminal, scene index 1) is converted into a 20-dimensional vector w2, and the splicing vector of "you" is [w1, w2].
It should be noted that the sequence labeling model is a model that performs a sequence labeling task. In this application its input is a sequence of splicing vectors, each corresponding to the division unit at the same position, and its output is a label for each splicing vector indicating whether the corresponding division unit is part of a named entity.
In an optional example, different labels can also mark the named entity type and/or whether the corresponding division unit is the starting position. For example, the text to be recognized is "play your name", the word segmentation result is "play / you / name", and the scene type index is 1; the splicing vectors are input into the sequence labeling model and the output is "O B-movie I-movie", where label "O" means the division unit "play" is not a named entity, label "B-movie" means the division unit "you" belongs to a named entity of type movie and is its starting position, and label "I-movie" means the division unit "name" belongs to a named entity of type movie but is not its starting position.
Optionally, the input information input to the named entity recognition model may further include a part-of-speech index and/or a knowledge index.
In an optional implementation manner, before the indexes marked for all the division units in the text to be recognized are input into the named entity recognition model, the method further includes: labeling a part-of-speech index for each division unit in the text to be recognized, where the part-of-speech index represents the part of speech of the corresponding division unit, such as verb, noun, adverb or conjunction; accordingly, the marked indexes of each division unit input into the named entity recognition model include the part-of-speech index. Correspondingly, when the indexes are converted into vectors, the part-of-speech index is also converted into a multi-dimensional vector, and when the splicing vector is generated, the vectors of all types of indexes (including the content index, the part-of-speech index and the scene index) of each division unit are spliced into one vector, which serves as the splicing vector of the corresponding division unit.
In an optional implementation manner, before the indexes marked for all the division units in the text to be recognized are input into the named entity recognition model, the method further includes: matching the text to be recognized against a preset named entity dictionary and determining all the named entities matched in the text to be recognized; and labeling a knowledge index for each division unit in the text to be recognized, where the knowledge index represents the information of the named entity matched by the corresponding division unit in the named entity dictionary. Accordingly, the marked indexes of each division unit input into the named entity recognition model include the knowledge index.
It should be appreciated that, since the named entity dictionary constitutes known knowledge, the text to be recognized can be matched against the named entity dictionary and the matching result input into the named entity recognition model as known knowledge, thereby improving the recognition success rate of the named entity recognition model. The matching result is input by using the knowledge index to mark the type and/or the starting position and/or the number of matches of the named entity. For example, "your", "name" and "your name" in "your name" all match movie name entities in the named entity dictionary; labeling "your" with the index B2-I0-movie indicates that "your" is matched twice as a starting position of a movie name entity, and labeling "name" with the index B1-I1-movie indicates that "name" is matched once as a starting position and once as a non-starting position of a movie name entity.
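The following sketch illustrates, under the assumptions of the example above, how knowledge labels of the form B&lt;n&gt;-I&lt;m&gt;-movie could be derived by matching a tokenized text against a small named entity dictionary; the dictionary content and the label format are illustrative only.

```python
# A minimal sketch of deriving knowledge labels from a named entity dictionary.
movie_dictionary = {("your",), ("name",), ("your", "name")}  # matched word tuples of type "movie"

def knowledge_labels(tokens):
    begin = [0] * len(tokens)   # times each token starts a matched entity
    inside = [0] * len(tokens)  # times each token appears at a non-starting position
    for start in range(len(tokens)):
        for end in range(start + 1, len(tokens) + 1):
            if tuple(tokens[start:end]) in movie_dictionary:
                begin[start] += 1
                for i in range(start + 1, end):
                    inside[i] += 1
    return [f"B{b}-I{i}-movie" for b, i in zip(begin, inside)]

print(knowledge_labels(["play", "your", "name"]))
# ['B0-I0-movie', 'B2-I0-movie', 'B1-I1-movie']
```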
In an alternative embodiment, the sequence annotation model is a deep learning model, for example an RNN (recurrent neural network) + CRF (conditional random field) model. Specifically, the deep learning model includes one or more layers of recurrent neural networks and a conditional random field.
Specifically, in one or more layers of recurrent neural networks, each layer of recurrent neural network calculates the splicing vector of each division unit in the text to be recognized in sequence by using a neural network calculation unit, so as to output a calculation result vector corresponding to each division unit.
The conditional random field is used for receiving a vector sequence that includes a plurality of calculation result vectors arranged in order, where the calculation result vectors are those output for all the division units by the last layer of recurrent neural network among the one or more layers of recurrent neural networks, and for outputting a label that identifies whether each division unit is a named entity.
In an alternative embodiment, the RNN may be an LSTM or a BiLSTM. The LSTM (Long Short-Term Memory) model is a kind of RNN (Recurrent Neural Network). The BiLSTM (Bidirectional Long Short-Term Memory) model is formed by combining a forward LSTM and a backward LSTM. Both LSTM and BiLSTM are commonly used to model context information in natural language processing tasks.
FIG. 3 is a schematic diagram of an example of a named entity recognition model using BiLSTM + CRF. As shown in FIG. 3, the named entity recognition model 300 includes a vector conversion module, a vector splicing module and a sequence labeling module 301, and the sequence labeling module 301 includes a bidirectional long short-term memory (BiLSTM) network and a CRF. The input of the vector conversion module is the various indexes of each word (division unit), including the scene (type) index, the part-of-speech index and the word (content) index; the vector conversion module maps each index into a vector using word embedding technology, and the vector splicing module splices the several vectors of each division unit together. The BiLSTM includes a layer of forward LSTM network and a layer of backward LSTM network, each layer containing a plurality of repeated LSTM neural units, and each neural unit calculates on the input vectors. For each division unit, the outputs of the forward LSTM neural unit and the backward LSTM neural unit are spliced together to obtain the calculation result vector of that division unit; all calculation result vectors are arranged in order into a vector sequence and input into the CRF to obtain the finally output labels t1, t2, ..., where each label indicates whether the corresponding division unit belongs to a named entity and may also identify whether the corresponding division unit is at the starting position in the named entity.
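A minimal sketch of such a BiLSTM + CRF sequence labeling module is given below, assuming PyTorch and the third-party pytorch-crf package; the hidden size, the tag count and the 70-dimensional input are illustrative, and the spliced input vectors are assumed to have been produced as described above.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # third-party pytorch-crf package (assumed available)

class BiLstmCrfTagger(nn.Module):
    """A minimal BiLSTM + CRF sequence labeling model over pre-spliced input vectors."""

    def __init__(self, input_dim=70, hidden_dim=128, num_tags=5):
        super().__init__()
        # Forward and backward LSTM layers; their outputs are concatenated per division unit.
        self.bilstm = nn.LSTM(input_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.emission = nn.Linear(2 * hidden_dim, num_tags)  # per-unit score for each tag
        self.crf = CRF(num_tags, batch_first=True)

    def loss(self, spliced_vectors, tags):
        emissions = self.emission(self.bilstm(spliced_vectors)[0])
        return -self.crf(emissions, tags)   # negative log-likelihood for training

    def decode(self, spliced_vectors):
        emissions = self.emission(self.bilstm(spliced_vectors)[0])
        return self.crf.decode(emissions)   # best tag sequence for each sentence

# Example: tag one sentence with three division units.
model = BiLstmCrfTagger()
tags = model.decode(torch.randn(1, 3, 70))  # e.g. [[0, 1, 2]] -> "O B-movie I-movie"
```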
In an alternative embodiment, by using the principle of transfer learning, a model obtained by unsupervised pre-training on corpora is used as the deep learning model. Transfer learning is a machine learning method in which a model pre-trained for one task is reused in another task, for example, the model developed for task A is used as the initial point in the process of developing a model for task B. In the embodiment of the present application, unsupervised pre-training is a method of training a model with unlabeled corpora; for example, the unsupervised pre-training task may be either of the following:
1) encoding an original sequence by using a sequence autoencoder and inputting the encoded sequence into the deep learning model to predict the original sequence;
2) in the case of the LSTM + CRF model, using the traditional language model task of predicting the next word as the pre-training task; a minimal sketch of this option is given after the list.
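The following sketch illustrates the second option under stated assumptions: the LSTM layer is pre-trained as a next-word language model on an unlabeled corpus, and its weights can then initialize the tagging model. The vocabulary size and dimensions are illustrative.

```python
import torch
import torch.nn as nn

VOCAB, EMB, HID = 10000, 70, 128

embed = nn.Embedding(VOCAB, EMB)
lstm = nn.LSTM(EMB, HID, batch_first=True)   # these weights can later initialize the tagger
next_word = nn.Linear(HID, VOCAB)
optimizer = torch.optim.Adam(
    list(embed.parameters()) + list(lstm.parameters()) + list(next_word.parameters()))

def pretrain_step(batch):
    """One language-model step; batch is a LongTensor of word indexes, shape (B, T)."""
    inputs, targets = batch[:, :-1], batch[:, 1:]          # predict token t+1 from tokens <= t
    logits = next_word(lstm(embed(inputs))[0])
    loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

print(pretrain_step(torch.randint(0, VOCAB, (8, 16))))  # illustrative unlabeled batch
```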
In an alternative embodiment, before inputting the text to be recognized and the scene type into the named entity recognition model, the method further includes: acquiring a first text to be recognized corresponding to the current round of conversation, where the conversation may be a voice conversation or a text conversation, the current round of conversation in the embodiment of the present application refers to the latest utterance sent by the other party to the executing party of the embodiment, and the first text to be recognized is acquired from the voice or the text of the current round of conversation; detecting whether a first keyword, for example "change to", exists in the first text, where the first keyword is used for indicating that the first text is related to a multi-turn conversation scene, that is, a scene with context (for example, a user first says "please play well" and then says "change to your name"), and optionally the first keyword is also used for indicating that the intention corresponding to the first text is incomplete; and, in the case that the first keyword exists in the first text, replacing the first keyword in the first text with a second keyword, for example replacing "change to" with "play", to obtain a second text to be recognized. Correspondingly, the named entity recognition model is further configured to recognize the named entity in the second text according to at least the second keyword and the scene type, where the second keyword is related to the entity category corresponding to the recognized named entity; in the above example the second keyword "play" is related to the movie entity category, so when the named entity recognition model receives the second text containing the second keyword, the entity category of the named entity it recognizes is related to the entity category corresponding to the second keyword.
In a possible implementation, before detecting whether the first keyword exists in the first text, the method further includes: acquiring an intention analysis result of a third text to be recognized corresponding to the previous round of conversation of the current round, so as to obtain the intention type of the third text (it should be noted that the executing party of the embodiment of the present application may obtain the intention analysis result of the previous round of conversation and thereby the corresponding intention type, for example an intention type of searching for videos); and judging whether the intention type of the third text is a specified intention type. Accordingly, whether the first keyword exists in the first text is detected only if the intention type of the third text is the specified intention type, that is, the detection is performed only when the intention type of the previous round is one or more specific intention types. Optionally, the second keyword is determined according to the intention analysis result corresponding to the previous round of conversation of the current round; for example, each intention type may correspond to one or more keywords, one of which is selected as the second keyword, or the second keyword may be a word extracted from the previous round of conversation, for example a verb (such as "play") or a noun related to the entity category (such as "movie"). Optionally, the second keyword is a verb and/or a noun.
In one possible implementation, detecting whether a first keyword exists in the first text includes: matching the first text with a preset regular expression and determining whether the first keyword exists in the first text. The regular expression is constructed over the collected multi-turn sentence patterns; in it, "?" indicates that the corresponding character is optional, "|" indicates alternation, and ".*" matches any character string, so that sentence patterns such as "only need xxx", "select xxx", "change to xxx" and the like can be successfully matched.
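A minimal sketch of this detection and replacement is shown below; because the patent's own pattern is written over Chinese sentence patterns, the regular expression used here is a hypothetical English approximation, and the second keyword "play" follows the earlier example.

```python
import re

# Hypothetical multi-turn sentence pattern (illustrative only).
MULTI_TURN_PATTERN = re.compile(r"^(?:just\s+|only\s+)?(?:change to|switch to|select)\s+(.+)$")

def rewrite_for_multi_turn(first_text, second_keyword="play"):
    """If the first text matches a multi-turn sentence pattern, replace the first
    keyword with the second keyword to build the second text to be recognized."""
    match = MULTI_TURN_PATTERN.match(first_text)
    if match is None:
        return first_text                      # no first keyword: keep the text as is
    return f"{second_keyword} {match.group(1)}"

print(rewrite_for_multi_turn("change to your name"))  # -> "play your name"
```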
In a possible implementation manner, the entity category corresponding to the second keyword is: a movie class or a music class; the second keyword corresponding to the entity category of the movie class is "play" and/or "movie"; the second keyword corresponding to the entity category of the music category is "listen" and/or "music".
In one possible embodiment, after identifying the named entity in the second text using the named entity identification model, the method further comprises: searching a named entity identified by the named entity identification model aiming at the second text in the first text; if the named entity is not found in the first text, determining that the corresponding named entity is invalid; if the named entity is found in the first text, the corresponding named entity is determined to be valid.
The purpose of the above embodiment is to prevent the recognized named entity from going beyond the original text. For example, the first text is "change to ABC", the second text is "play ABC", and the recognized named entity is "put AB"; this entity does not appear in the original first text, which means that after the keyword was replaced, the named entity was recognized incorrectly.
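A minimal sketch of this post-check, assuming a simple substring search in the first text, is:

```python
def entity_is_valid(first_text, named_entity):
    """An entity recognized in the keyword-replaced second text is only kept if it
    also appears verbatim in the original first text."""
    return named_entity in first_text

print(entity_is_valid("change to ABC", "ABC"))     # True  -> entity kept
print(entity_is_valid("change to ABC", "put AB"))  # False -> discarded as a recognition error
```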
In one possible embodiment, the scene types are classified according to the terminal type and/or the type of application software applied by the named entity recognition model.
The embodiment of the present application further provides another embodiment of a method for identifying a named entity, and it should be understood that for parts that are not described in detail in this embodiment, reference may be made to the detailed description of corresponding parts in the above embodiments. The method provided by the embodiment comprises the following steps 32-38:
step 32, acquiring a first text to be identified corresponding to the current round of conversation;
step 34, detecting whether a first keyword exists in the first text, wherein the first keyword is used for indicating that the first text is related to a multi-turn conversation scene;
step 36, replacing the first keyword in the first text with a second keyword under the condition that the first keyword exists in the first text, so as to obtain a second text to be identified;
and step 38, identifying the named entity in the second text by using a named entity identification model at least according to the second keyword, wherein the second keyword is related to the entity category corresponding to the identified named entity.
It should be noted that the named entity recognition model may also take into account the scene type to which it is applied: the scene type is obtained before the named entity in the second text is recognized by the named entity recognition model, and the named entity is recognized in the second text according to at least the second keyword and the scene type. Those skilled in the art are able to combine the embodiment provided in the first aspect and the embodiment provided in the second aspect.
In a possible implementation, before detecting whether the first keyword exists in the first text, the method further includes: acquiring an intention analysis result of a third text to be identified corresponding to a previous round of the current round of conversation to obtain an intention type of the third text; judging whether the intention type of the third text is a specified intention type; accordingly, if the intention type of the third text is the designated intention type, it is detected whether a first keyword exists in the first text.
In one possible embodiment, after identifying a named entity in the second text using the named entity identification model, the method further comprises: searching the named entity identified by the named entity identification model aiming at the second text in the first text; if the named entity is not found in the first text, determining that the corresponding named entity is invalid; and if the named entity is found in the first text, determining that the corresponding named entity is valid.
Optionally, the second keyword is determined according to an intention analysis result corresponding to a previous round of conversation of the current round of conversation.
Optionally, the second keyword is a verb and/or a noun.
Optionally, the first keyword is used to indicate that the intention corresponding to the first text is incomplete.
Optionally, the detecting whether the first text has the first keyword includes: and matching in the text to be recognized by using a preset regular expression, and determining whether the first keyword exists in the first text.
Optionally, the entity category corresponding to the second keyword is: a movie class or a music class; the second keyword corresponding to the entity category of the movie class is "play" and/or "movie"; the second keyword corresponding to the entity category of the music class is "listen" and/or "music".
Fig. 4 is an optional flowchart of the method for identifying a named entity according to the embodiment of the present application, and as shown in fig. 4, the method includes:
before the recognition method of the named entity processes the text to be recognized, keywords and multi-turn sentence patterns are extracted and collected manually or by corpora under a certain scene.
For example, the keywords for the movie type may be "play" and "movie", and the keywords for the music type may be "listen" and "song". Each collected category of keywords may correspond to a particular scene; the keywords serve as enhancement features for named entity recognition in the usage scene and improve the probability of recognizing the related named entities.
The multi-turn sentence pattern in the embodiment of the present application refers to a sentence pattern that contains incomplete intention information and needs the intention information of the preceding text to supplement it, for example "as long as ironmen", "select tomorrow", "change to your name", "select all good", and so on. Based on the collected multi-turn sentence patterns, a regular expression for the multi-turn sentence patterns is constructed.
It should be noted that the above steps of collecting keywords, collecting multi-turn sentence patterns and constructing regular expressions are relatively independent of the process in which the named entity recognition model recognizes the text to be recognized.
After the text to be recognized is obtained, the text is input into the recognition model of the named entity provided in the embodiment of the application.
Firstly, the collected regular expressions of the multi-turn sentence patterns are used to match the sentence to be replaced, and the sentence to be replaced is rewritten into a sentence containing the keywords, specifically by splicing keywords and/or replacing with keywords, that is, splicing the keywords into the original sentence and/or replacing part of the words in the original sentence with the keywords. For example, if the input text to be recognized is "change to your name", after being matched by the regular expression it is replaced with "play your name".
Second, the replaced text is input into the named entity recognition model. In the named entity recognition model, word segmentation is first performed on the replaced text to obtain "play/your/name", each word segment is labeled with indexes (including the word index, the part-of-speech index and the scene index), each index is then mapped into a vector, and after the vectors of the multiple indexes of each word segment are spliced, the result is input into the sequence labeling model. The sequence labeling model may adopt the structure of the BiLSTM + CRF model. For example, after the vector sequence of "play your name" is labeled by the sequence labeling model, the label of each word segment is determined to be "O B-movie I-movie", and the labels respectively represent a non-named entity, the starting position of a named entity, and a non-starting position of a named entity.
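A minimal sketch of turning such a label sequence back into named entities is shown below; joining with spaces is only for the translated English example, since the original Chinese division units would be joined without separators.

```python
def extract_entities(tokens, labels):
    """Collect named entities from a B/I/O label sequence produced by the sequence
    labeling model, e.g. "play your name" labeled "O B-movie I-movie"."""
    entities, current, current_type = [], [], None
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [token], label[2:]
        elif label.startswith("I-") and current:
            current.append(token)
        else:
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append((" ".join(current), current_type))
    return entities

print(extract_entities(["play", "your", "name"], ["O", "B-movie", "I-movie"]))
# [('your name', 'movie')]
```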
Third, the labeled named entity is structured; for example, if the recognized named entity is "ironmen second", it is structured (or normalized) into the named entity "ironmen 2". For the named entity "your name", the result of structuring is still "your name".
And finally, the unstructured named entity is matched in the original text to be recognized to check whether the recognized named entity exists in the original text, so as to avoid recognizing, due to the combination of the replaced keywords with the original text, named entities that do not exist in the original text. If the named entity is found, the recognized named entity is determined to be valid.
Fig. 5 is a schematic view of another alternative flow of the method for recognizing a named entity according to the embodiment of the present application. As shown in fig. 5, the method includes two parts, a training process and a recognition process. The training process is the process of training the named entity recognition model that is actually used, and the recognition process is the process of using the named entity recognition model. The two processes may be relatively independent, with the named entity recognition model trained in the training process being used as the named entity recognition model in the recognition process.
In the training process, the collected training corpus may be processed by corpus enhancement, which is a processing method that expands the corpus by replacing some keywords of the corpus with multi-turn sentence patterns. For example, if the corpus text is "play your name", the corpus-enhanced text is "change to your name"; corpus enhancement can be regarded as the reverse of matching regular expressions and replacing keywords in the text to be recognized. The preprocessing of the corpus includes word segmentation, knowledge extraction, part-of-speech judgment, index labeling and the like: word segmentation segments the corpus into words; knowledge extraction matches the corpus against a preset named entity dictionary, and the matched named entities can be input into the named entity recognition model as known knowledge during training; part-of-speech judgment determines the part of speech of each word segment; and index labeling labels the content (characters) of the words and the part-of-speech indexes, and can also label the scene type and/or knowledge indexes. The corpus preprocessing can process the original corpus or the enhanced corpus according to the current training target.
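A minimal sketch of corpus enhancement under these assumptions is shown below; the first keyword and the multi-turn sentence templates are illustrative.

```python
# Corpus enhancement: the reverse of the recognition-time replacement, rewriting a
# keyword sentence into multi-turn sentence patterns to expand the training corpus.
MULTI_TURN_TEMPLATES = ["change to {rest}", "just {rest}", "only need {rest}"]

def enhance(corpus_text, first_keyword="play "):
    """Generate extra training sentences from one corpus sentence such as 'play your name'."""
    if not corpus_text.startswith(first_keyword):
        return []
    rest = corpus_text[len(first_keyword):]
    return [template.format(rest=rest) for template in MULTI_TURN_TEMPLATES]

print(enhance("play your name"))
# ['change to your name', 'just your name', 'only need your name']
```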
After the indexes are built, the basic model and the preprocessed information are input, and the model is trained. The basic model may be configured as a BiLSTM + CRF model. The input to the model includes at least a scene type index and a word index for each word segment.
The output is the sequence of labels, namely the named entity label of each word segment.
In the recognition process, the above-mentioned intention or slot information may be obtained first; the above-mentioned intention may be the intention and slot information recognized by the intention recognition module and the slot filling module shown in fig. 1 when processing the sentences of the previous round or the previous n rounds. After the above intention and slot information are obtained, it may be determined whether the intention is related to the entity in the current text, for example, the above intention is to play a movie. When the text (a sentence) to be recognized of the current round is acquired, the named entity dictionary may first be used to recognize whether an entity exists in the current text; if the current text is, for example, "change to" followed by a title that the named entity dictionary matches as a TV series, it is determined that the entity in the current text to be recognized is related to the above intention, and keyword enhancement is required. Alternatively, the regular expression may be used for direct matching: if it is not matched, keyword feature enhancement of the text to be recognized is not needed; if it is matched, feature enhancement is performed.
The enhancement method is to use the regular expression to search for and match the multi-turn sentence pattern and replace it with a sentence pattern containing the keyword. For example, "change to your name" is replaced with "play your name".
After the text is replaced, the replaced text is preprocessed, including word segmentation, the above-mentioned knowledge extraction and the like, and the indexes are labeled. The indexes obtained after preprocessing and the scene type index are input into the named entity recognition model to obtain a labeling sequence, which labels whether each word segment is an entity.
After the labeling sequence is obtained, post-processing is performed, that is, it is checked whether the labeled named entity exists in the original text before replacement. If not, the named entity is a recognition error; if it does exist, the entity is structured and then output to subsequent downstream modules, such as the intention recognition module and the slot filling module in fig. 1.
The named entity recognition method provided by the embodiment of the application allows the same named entity recognition model to be applicable to different scenes, and is suitable for speech recognition in the field of artificial intelligence, for example voice assistant, AI customer service, AI chat and other applications of a terminal system or application software. In the embodiment of the application, the scene type information is added to the input of the named entity recognition model, so that the input value of the scene type can be dynamically adjusted when named entities are recognized; the entity recognition result can thus be adjusted to adapt to different scenes, providing the user with a recognition result that is more accurate and better suited to the applied scene.
It is to be understood that some or all of the steps or operations in the above-described embodiments are merely examples, and other operations or variations of various operations may be performed in the embodiments of the present application. Further, the various steps may be performed in an order different from that presented in the above-described embodiments, and possibly not all of the operations in the above-described embodiments need to be performed.
As shown in fig. 6, a schematic structural diagram of a named entity recognition device is provided, and the named entity recognition device 600 includes a first obtaining module 601, a first determining module 602, an input module 603, and an executing module 604.
The first acquisition module is used for acquiring a text to be recognized; the first determining module is used for determining the scene type applied by the named entity recognition model for recognizing the named entity in the text to be recognized; the input module is used for inputting the text to be recognized and the scene type into the named entity recognition model; and the execution module is used for acquiring the output information of the named entity recognition model so as to determine the named entity recognized by the named entity recognition model in the text to be recognized according to the scene type.
In one possible embodiment, the apparatus further comprises: the system comprises a division module, a content index module and a content recognition module, wherein the division module is used for marking the text to be recognized according to division units before inputting the text to be recognized and the scene type into a named entity recognition model, and the division units with the same content are marked by the same content index; the second determining module is used for determining a scene type index corresponding to the scene type; labeling a scene type index for each division unit; correspondingly, the input module comprises: and the first input unit is used for inputting the indexes marked by all the dividing units in the text to be recognized into the named entity recognition model.
In one possible implementation, the named entity recognition model includes: the first conversion unit is used for respectively converting the marked different types of indexes into multi-dimensional vectors aiming at each division unit after the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model; the splicing unit is used for splicing a plurality of multi-dimensional vectors subjected to index conversion of different types in sequence for each division unit; the second input unit is used for inputting the splicing vectors of all the division units in the text to be recognized into the sequence labeling model; and the first acquisition unit is used for acquiring the output result of the sequence labeling model so as to obtain the labeling information of the named entity in the text to be recognized.
In one possible embodiment, the sequence annotation model is a deep learning model, and the deep learning model includes: the method comprises the following steps that one or more layers of cyclic neural networks are adopted, each layer of cyclic neural network utilizes a neural network computing unit to sequentially compute the splicing vector of each division unit in a text to be recognized, and a computing result vector corresponding to each division unit is output; and the conditional random field is used for receiving a vector sequence, the vector sequence comprises a plurality of calculation result vectors which are arranged in sequence, the calculation result vectors are calculation result vectors of the last layer of cyclic neural network in one or more layers of cyclic neural networks aiming at all the division units, and a labeling label used for identifying whether each division unit is a named entity is output.
Optionally, the recurrent neural network is a long-short term memory neural network, or a bidirectional long-short term memory neural network.
Optionally, the deep learning model is obtained by performing unsupervised pre-training using corpora.
In one possible embodiment, the apparatus further comprises: the first labeling module is used for labeling a part-of-speech index for each division unit in the text to be recognized before the indexes labeled by all the division units in the text to be recognized are input into the named entity recognition model; accordingly, the index to which each division unit of the input named entity recognition model is tagged includes a part-of-speech index.
In one possible embodiment, the apparatus further comprises: the third determining module is used for matching the text to be recognized with a preset named entity dictionary and determining all the named entities matched in the text to be recognized before the indexes marked in all the division units in the text to be recognized are input into the named entity recognition model; the second labeling module is used for labeling a knowledge index aiming at each division unit in the text to be recognized, wherein the knowledge index is used for representing the information of the named entity matched with the corresponding division unit in the named entity recognition model; accordingly, the indexed per division unit of the input named entity recognition model comprises a knowledge index.
In one possible embodiment, the apparatus further comprises: the second acquisition module is used for acquiring a first text to be recognized corresponding to the current round of conversation before inputting the text to be recognized and the scene type into the named entity recognition model; the detection module is used for detecting whether a first keyword exists in the first text, wherein the first keyword is used for indicating that the first text is related to a multi-turn conversation scene; the replacing module is used for replacing the first keywords in the first text with the second keywords under the condition that the first keywords exist in the first text to obtain a second text to be identified; correspondingly, the named entity recognition model is further configured to recognize the named entity in the second text according to at least a second keyword and the scene type, where the second keyword is related to the entity category corresponding to the recognized named entity.
In a possible implementation, before detecting whether the first keyword exists in the first text, the apparatus further includes: the third acquisition module is used for acquiring an intention analysis result of a third text to be recognized corresponding to a previous conversation of the current conversation so as to obtain an intention type of the third text; the judging module is used for judging whether the intention type of the third text is the appointed intention type; accordingly, if the intention type of the third text is the designated intention type, the detection module performs detecting whether the first keyword exists in the first text.
In one possible embodiment, the apparatus further comprises: the searching module is used for searching the named entities identified by the named entity identification model aiming at the second text in the first text after identifying the named entities in the second text by utilizing the named entity identification model; a fourth determining module, configured to determine that a corresponding named entity is invalid if the named entity is not found in the first text; and the fifth determining module is used for determining that the corresponding named entity is valid if the named entity is found in the first text.
Optionally, the second keyword is determined according to an intention analysis result corresponding to a previous round of conversation of the current round of conversation.
Optionally, the second keyword is a verb and/or a noun.
Optionally, the first keyword is used to indicate that the intention corresponding to the first text is incomplete.
In one possible embodiment, the detection module comprises: and the matching unit is used for matching in the first text by using a preset regular expression and determining whether the first text has the first keyword.
Optionally, the entity category corresponding to the second keyword is: a movie class or a music class; the second keyword corresponding to the entity category of the movie class is "play" and/or "movie"; the second keyword corresponding to the entity category of the music category is "listen" and/or "music".
Optionally, the scene type is classified according to a terminal type and/or a type of application software to which the named entity recognition model is applied.
In one possible implementation, the apparatus is applied to an electronic device equipped with a sound receiver, and the acquisition module includes: the second acquisition unit is used for acquiring the voice collected by the sound receiver; and the second conversion unit is used for converting the voice into text so as to obtain the text to be recognized.
As shown in fig. 7, a schematic structural diagram of another named entity recognition device is also provided, where the named entity recognition device 700 includes a first obtaining module 701, a detecting module 702, a replacing module 703, and a recognizing module 704.
The first acquisition module is used for acquiring a first text to be identified corresponding to the current round of conversation; the detection module is used for detecting whether a first keyword exists in the first text, wherein the first keyword is used for indicating that the first text is related to a multi-turn conversation scene; the replacing module is used for replacing the first keyword in the first text with a second keyword under the condition that the first keyword exists in the first text to obtain a second text to be identified; and the identification module is used for identifying the named entity in the second text by utilizing a named entity identification model at least according to the second keyword, wherein the second keyword is related to the entity category corresponding to the identified named entity.
In a possible embodiment, the apparatus further comprises: a second obtaining module, configured to obtain an intention analysis result of a third text to be identified, which corresponds to a previous round of conversation of the current round of conversation, before detecting whether a first keyword exists in the first text, so as to obtain an intention type of the third text; the judging module is used for judging whether the intention type of the third text is a specified intention type; accordingly, if the intention type of the third text is the designated intention type, the detection module performs detection of whether a first keyword exists in the first text.
In a possible embodiment, the apparatus further comprises: a searching module, configured to search, in the first text, a named entity identified by the named entity identification model for the second text after identifying the named entity in the second text by using the named entity identification model; a first determining module, configured to determine that the corresponding named entity is invalid if the named entity is not found in the first text; a second determining module, configured to determine that the corresponding named entity is valid if the named entity is found in the first text.
Optionally, the second keyword is determined according to an intention analysis result corresponding to a previous round of conversation of the current round of conversation.
Optionally, the second keyword is a verb and/or a noun.
Optionally, the first keyword is used to indicate that the intention corresponding to the first text is incomplete.
In one possible embodiment, the detection module comprises: and the matching unit is used for matching in the text to be recognized by utilizing a preset regular expression and determining whether the first keyword exists in the first text.
Optionally, the entity category corresponding to the second keyword is: a movie class or a music class; the second keyword corresponding to the entity category of the movie class is "play" and/or "movie"; the second keyword corresponding to the entity category of the music category is "listen" and/or "music".
For the parts of the named entity recognition device 600 or the named entity recognition device 700 provided in the embodiments of the present application that are not described in detail, it will be apparent to those skilled in the art that the corresponding contents can be found in the description of the named entity recognition method provided in the embodiments of the present application, and they are therefore not described here again.
It should be understood that the division of the modules of the apparatus shown in fig. 6 or fig. 7 is merely a logical division; in actual implementation they may be wholly or partially integrated into one physical entity or physically separated. These modules may all be implemented as software invoked by a processing element, or entirely in hardware, or some modules may be implemented as software invoked by a processing element while others are implemented in hardware. For example, the determining module may be a separately arranged processing element, or may be integrated in a chip of the communication apparatus, such as a terminal, or may be stored in the memory of the communication apparatus in the form of a program that a processing element of the communication apparatus calls and executes to perform the functions of the above modules. The other modules are implemented similarly. In addition, all or part of the modules may be integrated together or implemented independently. The processing element described here may be an integrated circuit with signal processing capability. In implementation, each step of the above method or each of the above modules may be completed by an integrated logic circuit of hardware in the processor element or by instructions in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more Application Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs). As another example, when one of the above modules is implemented in the form of a program scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor capable of invoking programs. As another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).
In the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of the singular or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or multiple.
It should be noted that the method and apparatus for identifying a named entity provided in the embodiments of the present application are only examples, and the embodiments of the present application are not limited thereto.
Embodiments of the present application further provide a computer-readable storage medium, in which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute the method described in the above embodiments.
In addition, the present application also provides a computer program product, which includes a computer program, when the computer program product runs on a computer, the computer is caused to execute the method described in the above embodiments.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in accordance with the present application are generated, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic, digital subscriber line) or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk), among others.

Claims (30)

1. A method for identifying a named entity, comprising:
acquiring a text to be identified;
determining a scene type applied by a named entity recognition model for recognizing the named entity in the text to be recognized;
inputting the text to be recognized and the scene type into the named entity recognition model;
and acquiring output information of the named entity recognition model to determine the named entity recognized by the named entity recognition model in the text to be recognized according to the scene type.
2. The method of claim 1,
before inputting the text to be recognized and the scene type into the named entity recognition model, the method further comprises:
marking the text to be recognized with a content index according to a dividing unit, wherein the dividing units with the same content are marked by the same content index;
determining a scene type index corresponding to the scene type; labeling the scene type index for each dividing unit;
correspondingly, the inputting the text to be recognized and the scene type into the named entity recognition model includes:
and inputting the indexes marked by all the dividing units in the text to be recognized into the named entity recognition model.
3. The method according to claim 2, wherein after the index in which all the division units in the text to be recognized are labeled is input into the named entity recognition model, the processing method of the named entity recognition model comprises the following steps:
respectively converting the marked different types of indexes into multi-dimensional vectors aiming at each division unit;
for each division unit, sequentially splicing a plurality of multi-dimensional vectors subjected to index conversion of different types;
inputting the splicing vectors of all the division units in the text to be recognized into a sequence labeling model;
and acquiring an output result of the sequence labeling model to obtain labeling information of the named entity in the text to be recognized.
4. The method of claim 3, wherein the sequence annotation model is a deep learning model comprising:
the cyclic neural network on each layer utilizes a neural network computing unit to sequentially compute the splicing vector of each division unit in the text to be recognized so as to output a computing result vector corresponding to each division unit;
and the conditional random field is used for receiving a vector sequence, the vector sequence comprises a plurality of calculation result vectors which are arranged in sequence, the calculation result vectors are calculation result vectors of the last layer of the cyclic neural network in the one or more layers of cyclic neural networks aiming at all the division units, and a label used for identifying whether each division unit is a named entity is output.
5. The method of any of claims 1-4, wherein prior to entering the text to be recognized and the scene type into the named entity recognition model, the method further comprises: acquiring a first text to be identified corresponding to the current round of conversation; detecting whether a first keyword exists in the first text, wherein the first keyword is used for indicating that the first text is related to a multi-turn conversation scene; replacing the first keyword in the first text with a second keyword under the condition that the first keyword exists in the first text to obtain a second text to be identified; correspondingly, the inputting the text to be recognized and the scene type into the named entity recognition model includes: and identifying a named entity in the second text by using the named entity identification model at least according to the second keyword, wherein the second keyword is related to the entity category corresponding to the identified named entity.
6. A method for identifying a named entity, the method comprising:
acquiring a first text to be identified corresponding to the current round of conversation;
detecting whether a first keyword exists in the first text, wherein the first keyword is used for indicating that the first text is related to a multi-turn conversation scene;
replacing the first keyword in the first text with a second keyword under the condition that the first keyword exists in the first text to obtain a second text to be identified;
and identifying the named entity in the second text by using a named entity identification model at least according to the second keyword, wherein the second keyword is related to the entity category corresponding to the identified named entity.
7. The method of claim 6, wherein prior to detecting whether a first keyword is present in the first text, the method further comprises:
acquiring an intention analysis result of a third text to be identified corresponding to a previous round of the current round of conversation to obtain an intention type of the third text;
judging whether the intention type of the third text is a specified intention type;
accordingly, if the intention type of the third text is the designated intention type, it is detected whether a first keyword exists in the first text.
8. The method of claim 6 or 7, wherein after identifying a named entity in the second text using the named entity identification model, the method further comprises:
searching the named entity identified by the named entity identification model aiming at the second text in the first text;
if the named entity is not found in the first text, determining that the corresponding named entity is invalid;
and if the named entity is found in the first text, determining that the corresponding named entity is valid.
9. The method of any of claims 6-8, wherein detecting whether a first keyword is present in the first text comprises: and matching in the text to be recognized by using a preset regular expression, and determining whether the first keyword exists in the first text.
10. An apparatus for identifying named entities, the apparatus comprising:
the first acquisition module is used for acquiring a text to be recognized;
the first determination module is used for determining a scene type applied by a named entity recognition model for recognizing the named entity in the text to be recognized;
the input module is used for inputting the text to be recognized and the scene type into the named entity recognition model;
and the execution module is used for acquiring the output information of the named entity recognition model so as to determine the named entity recognized by the named entity recognition model in the text to be recognized according to the scene type.
11. The apparatus of claim 10, wherein the apparatus further comprises:
the division module is used for marking the text to be recognized with a content index according to a division unit before the text to be recognized and the scene type are input into the named entity recognition model, wherein the division units with the same content are marked by the same content index;
a second determining module, configured to determine a scene type index corresponding to the scene type; labeling the scene type index for each dividing unit;
correspondingly, the input module comprises:
and the first input unit is used for inputting the indexes marked by all the dividing units in the text to be recognized into the named entity recognition model.
12. The apparatus of claim 11, wherein the named entity recognition model comprises:
the first conversion unit is used for converting the marked indexes of different types into multi-dimensional vectors aiming at each division unit after the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model;
the splicing unit is used for splicing a plurality of multi-dimensional vectors subjected to index conversion of different types in sequence aiming at each division unit;
the second input unit is used for inputting the splicing vectors of all the division units in the text to be recognized into a sequence labeling model;
and the first acquisition unit is used for acquiring the output result of the sequence labeling model so as to obtain the labeling information of the named entity in the text to be recognized.
13. The apparatus of claim 12, in which the sequence annotation model is a deep learning model comprising:
the cyclic neural network on each layer utilizes a neural network computing unit to sequentially compute the splicing vector of each division unit in the text to be recognized so as to output a computing result vector corresponding to each division unit;
and the conditional random field is used for receiving a vector sequence, the vector sequence comprises a plurality of calculation result vectors which are arranged in sequence, the calculation result vectors are calculation result vectors of the last layer of the cyclic neural network in the one or more layers of cyclic neural networks aiming at all the division units, and a label used for identifying whether each division unit is a named entity is output.
14. The apparatus of claim 13, in which the recurrent neural network is a long-short term memory neural network, or a bidirectional long-short term memory neural network.
15. The apparatus according to claim 13 or 14, wherein the deep learning model is a model obtained by unsupervised pre-training using corpora.
16. The apparatus of any one of claims 11-15, wherein the apparatus further comprises:
the first labeling module is used for labeling a part-of-speech index for each division unit in the text to be recognized before the indexes labeled by all the division units in the text to be recognized are input into the named entity recognition model;
correspondingly, the marked index of each division unit input into the named entity recognition model comprises the part-of-speech index.
17. The apparatus of any one of claims 11-16, wherein the apparatus further comprises:
a third determining module, configured to match the text to be recognized with a preset named entity dictionary before inputting the indexes, labeled by all the dividing units, in the text to be recognized into the named entity recognition model, and determine all the named entities matched in the text to be recognized;
the second labeling module is used for labeling a knowledge index aiming at each division unit in the text to be recognized, wherein the knowledge index is used for representing the information of the named entity matched with the corresponding division unit in the named entity recognition model;
accordingly, the index to which each of the division units input to the named entity recognition model is labeled includes the knowledge index.
18. The apparatus of any one of claims 10-17, further comprising:
the second obtaining module is used for obtaining a first text to be recognized corresponding to the current round of conversation before the text to be recognized and the scene type are input into the named entity recognition model;
the detection module is used for detecting whether a first keyword exists in the first text, wherein the first keyword is used for indicating that the first text is related to a multi-turn conversation scene;
the replacing module is used for replacing the first keyword in the first text with a second keyword under the condition that the first keyword exists in the first text to obtain a second text to be identified;
correspondingly, the named entity recognition model is further configured to recognize a named entity in the second text according to at least the second keyword and the scene type, where the second keyword is related to an entity category corresponding to the recognized named entity.
19. The apparatus of any of claims 10-18, wherein the scene types are classified according to a type of terminal and/or a type of application software to which the named entity recognition model is applied.
20. The apparatus according to any one of claims 10-19, wherein the apparatus is applied to an electronic device configured with a sound receiver, and the obtaining module comprises:
the second acquisition unit is used for acquiring the voice collected by the voice receiver;
and the second conversion unit is used for converting the voice into a text to obtain the text to be recognized.
21. An apparatus for identifying named entities, the apparatus comprising:
a first obtaining module, configured to obtain a first text to be identified corresponding to the current round of conversation;
a detection module, configured to detect whether a first keyword exists in the first text, wherein the first keyword is used to indicate that the first text is related to a multi-turn conversation scene;
a replacing module, configured to replace the first keyword in the first text with a second keyword if the first keyword exists in the first text, to obtain a second text to be identified;
an identification module, configured to identify, by using a named entity identification model, a named entity in the second text according to at least the second keyword, wherein the second keyword is related to the entity category corresponding to the identified named entity.
22. The apparatus of claim 21, wherein the apparatus further comprises:
a second obtaining module, configured to obtain, before detecting whether the first keyword exists in the first text, an intention analysis result of a third text to be identified corresponding to a previous round of conversation of the current round of conversation, so as to obtain an intention type of the third text;
a judging module, configured to judge whether the intention type of the third text is a specified intention type; correspondingly, if the intention type of the third text is the specified intention type, the detection module performs the detection of whether the first keyword exists in the first text.
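A small sketch of the gating in claim 22: keyword detection on the current turn only runs when the previous turn's intention type belongs to the specified set. The intention labels are assumptions:

SPECIFIED_INTENT_TYPES = {"play_video", "play_music"}  # assumed labels

def should_detect_first_keyword(previous_turn_intent_type):
    return previous_turn_intent_type in SPECIFIED_INTENT_TYPES

print(should_detect_first_keyword("play_video"))     # True  -> run detection
print(should_detect_first_keyword("check_weather"))  # False -> skip it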
23. The apparatus of claim 21 or 22, wherein the apparatus further comprises:
a searching module, configured to search the first text for the named entity identified by the named entity identification model for the second text, after the named entity in the second text is identified by using the named entity identification model;
a first determining module, configured to determine that the corresponding named entity is invalid if the named entity is not found in the first text;
a second determining module, configured to determine that the corresponding named entity is valid if the named entity is found in the first text.
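Claim 23's validity check can be pictured as follows: an entity identified in the rewritten second text is kept only if it also occurs in the original first text, which guards against entities introduced solely by the inserted second keyword. This is a sketch only, with invented example strings:

def split_valid_entities(first_text, recognized_entities):
    """Entities found in the original first text are valid; the rest are
    treated as invalid (likely artifacts of the keyword substitution)."""
    valid, invalid = [], []
    for entity in recognized_entities:
        (valid if entity in first_text else invalid).append(entity)
    return valid, invalid

print(split_valid_entities("I meant the one called Titanic",
                           ["Titanic", "Avatar"]))
# (['Titanic'], ['Avatar'])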
24. The apparatus according to any of claims 21-23, wherein the second keyword is determined according to an intention analysis result corresponding to a previous round of conversation of the current round of conversation.
25. The apparatus of claim 24, wherein the second keyword is a verb and/or a noun.
26. The apparatus of any of claims 21-25, wherein the first keyword is used to indicate that an intention corresponding to the first text is incomplete.
27. The apparatus of any one of claims 21-26, wherein the detection module comprises:
a matching unit, configured to perform matching in the first text by using a preset regular expression, so as to determine whether the first keyword exists in the first text.
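For example, the preset regular expression of claim 27 might look like the pattern below; the concrete expressions a deployment would configure are not given in the claim, so this English pattern is only a stand-in:

import re

FIRST_KEYWORD_PATTERN = re.compile(r"\b(the (first|second|third) one|that one)\b")

def find_first_keyword(first_text):
    match = FIRST_KEYWORD_PATTERN.search(first_text)
    return match.group(0) if match else None

print(find_first_keyword("play the second one"))  # 'the second one'
print(find_first_keyword("play Titanic"))         # None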
28. The apparatus of any of claims 21-27, wherein the entity category corresponding to the second keyword is a movie category or a music category; the second keyword corresponding to the movie category is "play" and/or "movie"; and the second keyword corresponding to the music category is "listen" and/or "music".
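The mapping spelled out in claim 28 can be stored as a simple table; representing it as a dict is an implementation assumption:

SECOND_KEYWORDS = {
    "movie": ("play", "movie"),
    "music": ("listen", "music"),
}

def second_keywords_for(entity_category):
    # Either keyword (or both) may be used when rewriting the first text.
    return SECOND_KEYWORDS.get(entity_category, ())

print(second_keywords_for("movie"))  # ('play', 'movie')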
29. An apparatus for identifying named entities, the apparatus comprising:
one or more processors; a memory; a plurality of application programs; and one or more computer programs, wherein the one or more computer programs are stored in the memory, the one or more computer programs comprising instructions which, when executed by the apparatus, cause the apparatus to perform the method of any of claims 1-9.
30. A computer-readable storage medium, in which a computer program is stored which, when run on a computer, causes the computer to carry out the method according to any one of claims 1 to 9.
CN201910979122.2A 2019-10-15 2019-10-15 Named entity recognition method and device, and computer-readable storage medium Pending CN112668333A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910979122.2A CN112668333A (en) 2019-10-15 2019-10-15 Named entity recognition method and device, and computer-readable storage medium
PCT/CN2020/102094 WO2021073179A1 (en) 2019-10-15 2020-07-15 Named entity identification method and device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910979122.2A CN112668333A (en) 2019-10-15 2019-10-15 Named entity recognition method and device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN112668333A true CN112668333A (en) 2021-04-16

Family

ID=75400558

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910979122.2A Pending CN112668333A (en) 2019-10-15 2019-10-15 Named entity recognition method and device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN112668333A (en)
WO (1) WO2021073179A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108255816A (en) * 2018-03-12 2018-07-06 北京神州泰岳软件股份有限公司 A kind of name entity recognition method, apparatus and system
CN109858040A (en) * 2019-03-05 2019-06-07 腾讯科技(深圳)有限公司 Name entity recognition method, device and computer equipment
CN109918673A (en) * 2019-03-14 2019-06-21 湖北亿咖通科技有限公司 Semantic referee method, device, electronic equipment and computer readable storage medium
CN110096570A (en) * 2019-04-09 2019-08-06 苏宁易购集团股份有限公司 A kind of intension recognizing method and device applied to intelligent customer service robot
US20190258461A1 (en) * 2018-02-22 2019-08-22 Midea Group Co., Ltd. Machine generation of context-free grammar for intent deduction
CN110287283A (en) * 2019-05-22 2019-09-27 中国平安财产保险股份有限公司 Intent model training method, intension recognizing method, device, equipment and medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110035210A1 (en) * 2009-08-10 2011-02-10 Benjamin Rosenfeld Conditional random fields (crf)-based relation extraction system
CN106569998A (en) * 2016-10-27 2017-04-19 浙江大学 Text named entity recognition method based on Bi-LSTM, CNN and CRF
CN109902305A (en) * 2019-03-04 2019-06-18 上海宝尊电子商务有限公司 Template generation, search and text generation apparatus and method for based on name Entity recognition

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342909A (en) * 2021-08-06 2021-09-03 中科雨辰科技有限公司 Data processing system for identifying identical solid models
CN113722464A (en) * 2021-09-14 2021-11-30 国泰君安证券股份有限公司 System, method, device, processor and storage medium for realizing named entity recognition processing aiming at security intelligent customer service system
CN115146627A (en) * 2022-07-26 2022-10-04 平安科技(深圳)有限公司 Entity identification method and device, electronic equipment and storage medium
CN115146627B (en) * 2022-07-26 2023-05-02 平安科技(深圳)有限公司 Entity identification method, entity identification device, electronic equipment and storage medium
CN116798417A (en) * 2023-07-31 2023-09-22 成都赛力斯科技有限公司 Voice intention recognition method, device, electronic equipment and storage medium
CN116798417B (en) * 2023-07-31 2023-11-10 成都赛力斯科技有限公司 Voice intention recognition method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2021073179A1 (en) 2021-04-22

Similar Documents

Publication Publication Date Title
CN109241524B (en) Semantic analysis method and device, computer-readable storage medium and electronic equipment
CN108363790B (en) Method, device, equipment and storage medium for evaluating comments
WO2019200923A1 (en) Pinyin-based semantic recognition method and device and human-machine conversation system
CN112668333A (en) Named entity recognition method and device, and computer-readable storage medium
CN112100354B (en) Man-machine conversation method, device, equipment and storage medium
CN111460115B (en) Intelligent man-machine conversation model training method, model training device and electronic equipment
US20220092276A1 (en) Multimodal translation method, apparatus, electronic device and computer-readable storage medium
CN114580382A (en) Text error correction method and device
CN112036162A (en) Text error correction adaptation method and device, electronic equipment and storage medium
CN110096599B (en) Knowledge graph generation method and device
CN113326702B (en) Semantic recognition method, semantic recognition device, electronic equipment and storage medium
CN112699686A (en) Semantic understanding method, device, equipment and medium based on task type dialog system
CN114218945A (en) Entity identification method, device, server and storage medium
CN111079418A (en) Named body recognition method and device, electronic equipment and storage medium
CN114861758A (en) Multi-modal data processing method and device, electronic equipment and readable storage medium
CN111783424B (en) Text sentence dividing method and device
CN112581297A (en) Information pushing method and device based on artificial intelligence and computer equipment
CN110717316B (en) Topic segmentation method and device for subtitle dialog flow
CN112599211A (en) Medical entity relationship extraction method and device
CN116467417A (en) Method, device, equipment and storage medium for generating answers to questions
CN116976341A (en) Entity identification method, entity identification device, electronic equipment, storage medium and program product
CN114417891A (en) Reply sentence determination method and device based on rough semantics and electronic equipment
CN114239601A (en) Statement processing method and device and electronic equipment
CN113095082A (en) Method, device, computer device and computer readable storage medium for text processing based on multitask model
CN112183114A (en) Model training and semantic integrity recognition method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination