WO2021073179A1

WO2021073179A1 - Named entity identification method and device, and computer-readable storage medium

Info

Publication number: WO2021073179A1
Application number: PCT/CN2020/102094
Authority: WO
Inventors: 孟函可; 祝官文
Original assignee: 华为技术有限公司
Priority date: 2019-10-15
Filing date: 2020-07-15
Publication date: 2021-04-22
Also published as: CN112668333A

Abstract

A named entity identification method and device, which are related to natural language processing (NLP) technology, can be applied to speech recognition in the field of artificial intelligence (AI), and can be specifically applied to an application, such as a voice assistant. The named entity identification method comprises: acquiring text to be identified (101); determining a scenario type used by a named entity identification model for identifying a named entity in said text (102); inputting said text and the scenario type into the named entity identification model (103); and acquiring output information of the named entity identification model to determine a named entity identified from said text by means of the named entity identification model with respect to the scenario type (104). In the method, the probability of a named entity identification model identifying a named entity under different usage scenarios is improved by means of embedding scenario information into input information of the named entity identification model.

Description

Method and device for identifying named entity, and computer readable storage medium

Technical field

This application relates to the technical field of named entity recognition, and in particular to a method and device for named entity recognition and a computer-readable storage medium.

Background

Named entity recognition (NER), also known as entity recognition, is a basic task in natural language processing with a wide range of applications. A named entity generally refers to an entity in text that has a specific meaning or is strongly referential, such as a person name, place name, organization name, date or time, or other proper noun. An NER system extracts such entities from unstructured input text and can be extended to identify further entity types according to business needs. Existing named entity recognition methods in the related art usually train a named entity recognition model for only one specific application scenario. Supporting different scenarios requires a large amount of scenario-specific corpus for training, and a separate recognition model must be trained for each scenario. As a result, the training process is complicated and the models adapt poorly to other scenarios.

Summary of the application

This application provides a method and device for identifying a named entity, and a computer-readable storage medium, so that a named entity recognition model can be applied to different scenarios. Embedding scene information into the input information of the named entity recognition model improves the probability that the model recognizes a named entity in different usage scenarios.

In a first aspect, this application provides a method for identifying named entities in text. A named entity (or entity for short) is an entity in the text with a specific meaning or strong referentiality. It usually includes person names, place names, organization names, dates and times, and other proper nouns; in a broader sense, entities also include numbers, currencies, addresses, and the like.

The method for recognizing named entities provided in this application can be applied to formal recognition scenarios, for example, to provide text-based named entity recognition for voice assistants.

The method for identifying named entities provided in this application can also be applied when training the named entity recognition model. In that case, after the method is used to identify the named entities in a training text, the identified entities are compared with the named entities pre-marked in the training text, and the parameters of the named entity recognition model are adjusted according to the comparison result.

Specifically, the method for identifying named entities provided by the first aspect includes:

Obtaining text to be recognized; determining the scene type to which the named entity recognition model used to identify named entities in the text to be recognized is applied; inputting the text to be recognized and the scene type into the named entity recognition model; and obtaining output information of the named entity recognition model to determine the named entity that the model recognizes in the text to be recognized for the scene type. Optionally, the text to be recognized may be text obtained by converting speech into text, or a training sample text pre-marked with named entity tags. The text to be recognized may include Chinese characters, numbers, symbols, English characters, and the like.

The named entity recognition (NER) model (also referred to as a named entity recognition system) extracts named entities from the text to be recognized and can identify additional types of named entities according to business requirements. The named entity recognition model may use the feature-template-based methods or neural-network-based methods of the existing related art. In an optional example, the named entity recognition model is a word embedding + LSTM (long short-term memory network) or BiLSTM (bidirectional long short-term memory network) + CRF (conditional random field) model.

The scene is the actual situation in which the named entity recognition model is applied. Scene types can be classified according to the terminal type, and/or the application software type, and/or the usage type. For example, terminal types may include different kinds of terminal electronic devices such as TVs, mobile phones, and in-vehicle consoles; application software types may be system or application software types (such as movie or music software); and usage types may be voice assistants, automatic reply systems, and so on. Using any one of these classification methods, or a combination of several, the scene types are divided into multiple categories, for example the voice assistant of a TV system or the automatic reply system of a shopping application. This application places no specific restriction on the classification.

The named entity recognition model described above differs from the named entity recognition models in the related prior art. In the method provided in this application, the input information includes not only the text to be recognized but also information about the scene type to which the named entity recognition model is applied. Supplying the scene type information as input during training and/or recognition makes the named entity recognition model applicable to different application scene types. For example, suppose the user says "change to your name" to a voice assistant. For a TV voice assistant, "your name" will be recognized as a movie title, whereas for a mobile phone voice assistant it may not be. Because the model takes the scene type information in its input into account, it may output different recognition results when applied to different scene types.
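The interface implied by this example can be sketched as follows. This is only a toy stand-in, assuming a per-scene lookup table in place of the trained model described in this application; the class name `ToySceneAwareRecognizer`, the scene type names, and the lexicon contents are all hypothetical.

```python
from typing import Dict


class ToySceneAwareRecognizer:
    """Toy stand-in for a scene-aware recognizer (NOT the trained model of the text).

    It only illustrates the interface: the same text can yield different
    entities depending on the scene type passed alongside it.
    """

    def __init__(self, scene_lexicons: Dict[str, Dict[str, str]]) -> None:
        # Hypothetical mapping: scene type -> {entity text: entity type}.
        self.scene_lexicons = scene_lexicons

    def recognize(self, text: str, scene_type: str) -> Dict[str, str]:
        lexicon = self.scene_lexicons.get(scene_type, {})
        return {name: etype for name, etype in lexicon.items() if name in text}


recognizer = ToySceneAwareRecognizer({
    "tv_voice_assistant": {"your name": "movie"},
    "phone_voice_assistant": {},
})
text = "change to your name"
tv_result = recognizer.recognize(text, "tv_voice_assistant")
phone_result = recognizer.recognize(text, "phone_voice_assistant")
print(tv_result)     # {'your name': 'movie'}
print(phone_result)  # {}
```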

In a possible implementation manner, before the text to be recognized and the scene type are input into the named entity recognition model, the method further includes: marking the text to be recognized with content indexes, one per division unit, where division units with the same content are marked with the same content index; determining the scene type index corresponding to the scene type; and marking each division unit with the scene type index, that is, each division unit (for example, each character or each segmented word) is marked with the index of the scene type to which the model is applied.

Correspondingly, inputting the text to be recognized and the scene type into the named entity recognition model includes: inputting the marked indexes of all the division units in the text to be recognized into the named entity recognition model.

A division unit is the basic unit into which a sentence is divided. For Chinese, for example, either a single Chinese character or a word produced by word segmentation can serve as the basic division unit. Word segmentation tools exist in the related art, such as the jieba word segmentation tool; the goal of word segmentation is to divide a sentence into multiple words. Each division unit in the text to be recognized is marked with a corresponding content index (also called a character index or word index), and identical characters or words share the same index. For example, the index of "的" is 15, the index of "name" is 92, and so on.

The scene type index corresponding to a scene type may be configured manually in advance, for example set at the factory according to the type of terminal device on which the model is to be used. The indexes corresponding to different scene types may be a pre-configured mapping. For example, the scene type index is 1 for a voice assistant on a mobile phone terminal, 2 for a voice assistant on a TV terminal, and so on.
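The marking step can be sketched as below. This is a minimal illustration, assuming a plain dictionary as the vocabulary; the function name `mark_indexes`, the scene type names, the unknown-unit index 0, and every index value other than 15, 92, 1 and 2 are made up for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical pre-configured mapping; the values 1 and 2 follow the text.
SCENE_TYPE_INDEX: Dict[str, int] = {
    "phone_voice_assistant": 1,
    "tv_voice_assistant": 2,
}


def mark_indexes(units: List[str], vocab: Dict[str, int], scene_type: str,
                 unknown_index: int = 0) -> List[Tuple[int, int]]:
    """Return one (content_index, scene_type_index) pair per division unit."""
    scene_index = SCENE_TYPE_INDEX[scene_type]
    # Identical units get the same content index because they share a vocab entry.
    return [(vocab.get(unit, unknown_index), scene_index) for unit in units]


# Toy vocabulary: 15 and 92 follow the text, the other values are invented.
vocab = {"play": 7, "you": 31, "的": 15, "name": 92}
marked = mark_indexes(["play", "you", "的", "name"], vocab, "phone_voice_assistant")
print(marked)  # [(7, 1), (31, 1), (15, 1), (92, 1)]
```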

In a possible implementation manner, after the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model, the processing performed by the named entity recognition model includes: for each division unit, converting each of the marked indexes of different types into a multi-dimensional vector; for each division unit, splicing in order the multi-dimensional vectors converted from the different types of indexes to obtain a spliced vector; inputting the spliced vectors of all the division units into a sequence labeling model; and obtaining the output of the sequence labeling model to obtain the label information of the named entities in the text to be recognized.

Optionally, the conversion (mapping) of an index into a vector can use algorithms from the related prior art such as word embedding, distributional vectors, or one-hot encoding, or improved versions of these algorithms. In this way, a content index or another type of index associated with a division unit (a character or segmented word) of natural language is converted into a multi-dimensional vector that a machine can process.

After the multi-dimensional vectors converted from the different types of indexes are obtained, they are spliced per division unit. For example, for each character, the vector converted from the content index and the scene type vector are spliced in order to obtain the spliced vector of that character. Take the character "的": its content index is 15, which is converted into a 50-dimensional vector w1. If the named entity recognition model is applied to the voice assistant of a mobile phone terminal, the scene type index is 1, which is converted into a 10-dimensional vector w2. The spliced vector of "的" is then [w1, w2].
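The lookup-and-splice step for "的" can be sketched as below. This is a minimal sketch, assuming randomly initialized lookup tables in place of learned embeddings; the table sizes (100 content rows, 3 scene rows) and the helper names are hypothetical, while the dimensions 50 and 10 follow the example.

```python
import random
from typing import List


def make_table(num_rows: int, dim: int, seed: int) -> List[List[float]]:
    """Build a lookup table of random vectors (a learned table in a real model)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


content_table = make_table(num_rows=100, dim=50, seed=0)  # content index -> w1
scene_table = make_table(num_rows=3, dim=10, seed=1)      # scene type index -> w2


def splice(content_index: int, scene_index: int) -> List[float]:
    """Concatenate the two vectors in a fixed order: [w1, w2]."""
    w1 = content_table[content_index]
    w2 = scene_table[scene_index]
    return w1 + w2


spliced = splice(content_index=15, scene_index=1)
print(len(spliced))  # 60
```

Further index types, such as a part-of-speech index, would simply contribute one more vector to the concatenation.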

It should be noted that the sequence labeling model is a model that performs a sequence labeling task. In this application, the input of the sequence labeling model is a sequence of spliced vectors, each corresponding to the division unit at the same position, and the output is one label per spliced vector, indicating whether the corresponding division unit is a named entity.

In a possible example, different labels can also indicate the named entity type and/or whether the corresponding division unit is the starting position of a named entity. For example, the text to be recognized is "Play your name", the word segmentation result is "Play/you/的/name", and the scene type index is 1. When the spliced vectors are input into the sequence labeling model, the output is "O B-movie I-movie I-movie". The label "O" indicates that the corresponding division unit "Play" is not a named entity. The label "B-movie" indicates that the corresponding division unit "you" belongs to a named entity of type movie and is its starting position. The label "I-movie" indicates that the corresponding division units "的" and "name" belong to a named entity of type movie and are not its starting position.
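Turning such a label sequence back into entity spans can be sketched as follows. The function name `decode_labels` and the choice to join units with a space are assumptions made for the illustration; for Chinese characters an empty separator would be the natural choice.

```python
from typing import List, Optional, Tuple


def decode_labels(units: List[str], labels: List[str],
                  sep: str = " ") -> List[Tuple[str, str]]:
    """Group B-/I- labelled division units into (entity_text, entity_type) pairs."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the entity currently being built, if any.
        if current and current_type is not None:
            entities.append((sep.join(current), current_type))

    for unit, label in zip(units, labels):
        if label.startswith("B-"):
            flush()
            current, current_type = [unit], label[2:]
        elif label.startswith("I-") and current and label[2:] == current_type:
            current.append(unit)
        else:  # "O" or an inconsistent "I-" label ends the current entity
            flush()
            current, current_type = [], None
    flush()
    return entities


units = ["Play", "you", "的", "name"]
labels = ["O", "B-movie", "I-movie", "I-movie"]
decoded = decode_labels(units, labels)
print(decoded)  # [('you 的 name', 'movie')]
```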

In a possible implementation, the sequence labeling model is a deep learning model, for example an RNN (recurrent neural network) + CRF model. Specifically, the deep learning model includes one or more layers of recurrent neural networks and a conditional random field. Each layer of the recurrent neural network uses a neural network computing unit to process, in order, the spliced vector of each division unit in the text to be recognized and outputs a calculation result vector for each division unit. The conditional random field receives a vector sequence consisting of these calculation result vectors arranged in order, namely the calculation result vectors produced for all the division units by the single layer or, in a multi-layer network, by the last layer of the recurrent neural network, and outputs a label identifying whether each division unit is a named entity. In a possible implementation manner, the RNN may be an LSTM or a BiLSTM.
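The text does not say how the conditional random field selects its output labels. A common choice is Viterbi decoding over per-unit emission scores and label-to-label transition scores, sketched below under that assumption. The emission scores stand in for the calculation result vectors of the last recurrent layer, and every number is invented for the illustration.

```python
from typing import List

LABELS = ["O", "B-movie", "I-movie"]


def viterbi(emissions: List[List[float]], transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring label index path.

    emissions[t][j]: score of label j at position t.
    transitions[i][j]: score of moving from label i to label j.
    """
    num_labels = len(transitions)
    scores = list(emissions[0])
    backpointers: List[List[int]] = []
    for emission in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(num_labels):
            best_prev = max(range(num_labels),
                            key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_prev] + transitions[best_prev][j] + emission[j])
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)
    # Follow the back-pointers from the best final label.
    best = max(range(num_labels), key=lambda i: scores[i])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


# Invented scores for the four units "Play / you / 的 / name".
emissions = [
    [2.0, 0.5, 0.1],
    [0.2, 1.0, 1.2],
    [0.3, 0.2, 1.5],
    [0.4, 0.1, 1.4],
]
# The large penalty on O -> I-movie forbids an entity that lacks a start label.
transitions = [
    [0.4, 0.2, -10.0],
    [0.0, -1.0, 1.0],
    [0.0, -1.0, 0.8],
]
best_labels = [LABELS[i] for i in viterbi(emissions, transitions)]
print(best_labels)  # ['O', 'B-movie', 'I-movie', 'I-movie']
```

Although the second unit's emission scores slightly favor "I-movie", the transition penalty makes "B-movie" the better choice, which is the kind of label consistency a conditional random field contributes on top of the recurrent layers.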

In a possible implementation manner, the deep learning model may be a model obtained by unsupervised pre-training on a corpus. Unsupervised pre-training trains the model on an unlabeled corpus. One training method is to encode the original corpus, with the training goal that, given the encoded corpus as input, the model outputs the original corpus (that is, it predicts the original corpus). Alternatively, when the deep learning model is an LSTM+CRF model, the training goal can be to input a word and output the next word (that is, to predict the next word).

In a possible implementation manner, before the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model, the method further includes: marking each division unit in the text to be recognized with a part-of-speech index. The part-of-speech index indicates the part of speech of the corresponding division unit, such as verb, noun, adverb, or conjunction. Correspondingly, the marked indexes of each division unit input into the named entity recognition model include the part-of-speech index, and converting the indexes into vectors also includes converting the part-of-speech index into a multi-dimensional vector. When the spliced vector is generated for a division unit, the vectors of all index types (content index, part-of-speech index, and scene type index) are spliced into one vector that serves as the spliced vector of that division unit.

In a possible implementation manner, before the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model, the method further includes: matching the text to be recognized against a preset named entity dictionary to determine all named entities matched in the text to be recognized; and marking each division unit in the text to be recognized with a knowledge index, where the knowledge index indicates the information of the named entity matched by the corresponding division unit in the named entity recognition model. Correspondingly, the marked indexes of each division unit input into the named entity recognition model include the knowledge index.

It should be understood that the named entity dictionary is known knowledge. Matching the text to be recognized against the named entity dictionary therefore yields a result that can be input into the named entity recognition model as known knowledge, improving its recognition success rate. The matching result is marked by the knowledge index, which can record the type, and/or the starting position, and/or the number of matches of the named entity. For example, in "your name", the strings "you", "name", and "your name" each match a movie name entity in the named entity dictionary. "You" is therefore marked with the index B2-I0-movie, indicating that it matches movie name entities twice as the starting position. "Name" is marked with B1-I1-movie, indicating that it matches movie name entities once as the starting position and once as a non-starting position.
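The counting behind such knowledge indexes can be sketched as below. This is an illustration under assumptions: the dictionary contents are hypothetical, dictionary entries are stored as tuples of division units, and the "O" mark for units with no match and the "|" separator for multiple entity types are conventions invented here, not taken from the text.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def knowledge_indexes(units: List[str],
                      dictionary: Dict[Tuple[str, ...], str]) -> List[str]:
    """Mark each unit with B<starts>-I<non-starts>-<type> from dictionary matches."""
    # counts[position][entity_type] = [matches as start, matches as non-start]
    counts = [defaultdict(lambda: [0, 0]) for _ in units]
    for entry, entity_type in dictionary.items():
        n = len(entry)
        for start in range(len(units) - n + 1):
            if tuple(units[start:start + n]) == entry:
                counts[start][entity_type][0] += 1
                for pos in range(start + 1, start + n):
                    counts[pos][entity_type][1] += 1
    marks = []
    for per_type in counts:
        if not per_type:
            marks.append("O")  # assumed convention for "no dictionary match"
        else:
            marks.append("|".join(f"B{b}-I{i}-{t}"
                                  for t, (b, i) in sorted(per_type.items())))
    return marks


# Hypothetical dictionary: three movie names built from the example's units.
movie_dictionary = {
    ("you",): "movie",
    ("name",): "movie",
    ("you", "的", "name"): "movie",
}
marks = knowledge_indexes(["play", "you", "的", "name"], movie_dictionary)
print(marks)  # ['O', 'B2-I0-movie', 'B0-I1-movie', 'B1-I1-movie']
```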

In a possible implementation manner, before the text to be recognized and the scene type are input into the named entity recognition model, the method further includes: obtaining the first text to be recognized corresponding to the current round of conversation. The conversation may be a voice conversation or a text conversation. The current round of conversation in the embodiments of this application refers to the latest utterance sent by the other party to the executor of the embodiments, and the first text to be recognized is obtained from the voice or text of that round. The method then detects whether a first keyword, for example "change" or "replace", exists in the first text. The first keyword indicates that the first text is related to a multi-round conversation scene, that is, a scene that depends on context. For example, the user says "please play it all well" and then "change your name". Optionally, the first keyword also indicates that the intent corresponding to the first text is incomplete. If the first keyword exists in the first text, the first keyword in the first text is replaced with a second keyword, for example "change" is replaced with "play", to obtain the second text to be recognized. Correspondingly, the named entity recognition model is also used to identify a named entity in the second text according to at least the second keyword and the scene type. The second keyword is related to the entity category of the identified named entity: in the above example, the second keyword "play" is related to the movie entity category, so when the named entity recognition model receives the second text containing the second keyword, the entity category of the named entity it identifies is related to the entity category corresponding to the second keyword.

In a possible implementation manner, before detecting whether the first keyword exists in the first text, the method further includes: obtaining the intent analysis result of the third text to be recognized, which corresponds to the round of conversation preceding the current round, to obtain the intent type of the third text; and determining whether the intent type of the third text is a specified intent type. It should be noted that the executor of the embodiments of this application can obtain the intent analysis result of the previous round of conversation and from it the corresponding intent type, for example a video search intent. Correspondingly, only if the intent type of the third text is a specified intent type is the first text checked for the first keyword; that is, the detection is needed only when the intent type of the previous round is one of one or several specific intent types. Optionally, the second keyword is determined according to the intent analysis result of the previous round of conversation. For example, each intent type may correspond to one or more keywords, one of which is selected as the second keyword; alternatively, the second keyword may be a word extracted from the previous round of conversation, for example a verb (such as "play") or a noun related to the entity category (such as "movie"). Optionally, the second keyword is a verb and/or a noun.
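The gating on the previous round's intent and the subsequent keyword replacement can be sketched as follows. The intent names, the English first keywords, and the function name `rewrite_follow_up` are hypothetical; only the pairing of the movie category with "play" and the music category with "listen" comes from the text.

```python
from typing import Dict, Optional, Tuple

# Hypothetical specified intent types and the second keyword chosen for each.
INTENT_TO_SECOND_KEYWORD: Dict[str, str] = {
    "search_video": "play",
    "search_music": "listen",
}
# Hypothetical first keywords that signal a context-dependent follow-up.
FIRST_KEYWORDS: Tuple[str, ...] = ("change to", "replace with")


def rewrite_follow_up(first_text: str, previous_intent: Optional[str]) -> str:
    """Return the second text, or the first text unchanged if no rewrite applies."""
    second_keyword = INTENT_TO_SECOND_KEYWORD.get(previous_intent)
    if second_keyword is None:
        # The previous round is not a specified intent type: skip detection.
        return first_text
    lowered = first_text.lower()
    for first_keyword in FIRST_KEYWORDS:
        if lowered.startswith(first_keyword):
            return second_keyword + first_text[len(first_keyword):]
    return first_text


second_text = rewrite_follow_up("change to your name", "search_video")
print(second_text)  # play your name
```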

In a possible implementation manner, detecting whether the first keyword exists in the first text includes: matching a preset regular expression against the first text to determine whether the first keyword exists in the first text. For example, the regular expression is ([I want only]? Want|select [select out]?|[screen pick]select?|[变变变][为成]).*, where "[]" matches any one of the characters inside the brackets, "?" means that the preceding character is optional, "|" means "or", and ".*" matches any character string. With this regular expression, sentences such as "as long as xxx", "select xxx", "change to xxx", and "select xxx" all match successfully.
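The same detection idea can be sketched with Python's standard `re` module. The pattern below is a hypothetical English analogue written for this illustration, not the expression used in this application; it exercises the same constructs ("[]", "?", "|" and ".*").

```python
import re

# Hypothetical English pattern:
#   [Ss] matches either character, (I )? is optional, | separates alternatives,
#   and .* matches the remainder of the sentence.
FIRST_KEYWORD_PATTERN = re.compile(
    r"((I )?want|[Ss]elect|[Pp]ick|([Cc]hange|[Ss]witch) to).*"
)


def has_first_keyword(first_text: str) -> bool:
    """Return True if the text starts with one of the follow-up keywords."""
    return FIRST_KEYWORD_PATTERN.match(first_text) is not None


print(has_first_keyword("change to your name"))  # True
print(has_first_keyword("play your name"))       # False
```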

In a possible implementation manner, the entity category corresponding to the second keyword is movie or music; the second keyword corresponding to the movie entity category is "play" and/or "movie"; and the second keyword corresponding to the music entity category is "listen" and/or "music".

In a possible implementation manner, after the named entity recognition model is used to recognize the named entity in the second text, the method further includes: searching the first text for the named entity that the named entity recognition model recognized in the second text; if the named entity is not found in the first text, determining that the named entity is invalid; and if the named entity is found in the first text, determining that the named entity is valid.

The purpose of this embodiment is to rule out identified named entities that do not appear in the original text. For example, the first text is "Change to ABC", the second text is "Play ABC", and the recognized named entity is "Put AB", which does not appear in the first text. Such a recognition error is caused by the keyword replacement.
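This validity check amounts to a substring search in the first text, as in the following minimal sketch. The function name `validate_entities` is hypothetical, and the entity strings reuse the example above.

```python
from typing import Dict, List


def validate_entities(first_text: str, entities: List[str]) -> Dict[str, bool]:
    """Mark each entity recognized in the second text as valid only if it
    also appears verbatim in the original (first) text."""
    return {entity: entity in first_text for entity in entities}


first_text = "Change to ABC"
validity = validate_entities(first_text, ["ABC", "Put AB"])
print(validity)  # {'ABC': True, 'Put AB': False}
```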

In a possible implementation manner, the above-mentioned scene types are classified according to the type of terminal and/or the type of application software to which the named entity recognition model is applied.

In a possible implementation manner, the method is applied to an electronic device equipped with a sound receiver, such as a mobile phone terminal or a TV terminal, and obtaining the text to be recognized includes: obtaining the voice collected by the sound receiver, such as a voice command issued by the user; and converting the voice into text to obtain the text to be recognized.

In a second aspect, this application also provides a method for identifying named entities. The method includes: obtaining a first text to be recognized corresponding to the current round of conversation; detecting whether a first keyword exists in the first text, where the first keyword indicates that the first text is related to a multi-round conversation scene; if the first keyword exists in the first text, replacing the first keyword in the first text with a second keyword to obtain a second text to be recognized; and using a named entity recognition model to identify a named entity in the second text according to at least the second keyword, where the second keyword is related to the entity category of the identified named entity. It should be noted that this method can be combined with the scene type to which the named entity recognition model is applied: before the named entity recognition model is used to identify the named entity in the second text, the scene type is obtained, and the named entity in the second text is identified according to at least the second keyword and the scene type. Those skilled in the art are able to combine the embodiments provided in the first aspect with the embodiments provided in the second aspect.

In a possible implementation manner, before detecting whether the first keyword exists in the first text, the method further includes: obtaining the intent analysis result of the third text to be recognized, which corresponds to the round of conversation preceding the current round, to obtain the intent type of the third text; and determining whether the intent type of the third text is a specified intent type. Correspondingly, if the intent type of the third text is the specified intent type, whether the first keyword exists in the first text is detected.

Optionally, the second keyword is determined according to the intent analysis result corresponding to the previous round of the current round of conversation.

Optionally, the second keyword is a verb and/or noun.

Optionally, the first keyword is used to indicate that the intent corresponding to the first text is incomplete.

Optionally, detecting whether the first keyword exists in the first text includes: using a preset regular expression to perform matching in the text to be recognized to determine whether the first keyword exists in the first text.

Optionally, the entity category corresponding to the second keyword is: movie or music; the second keyword corresponding to the entity category of the movie is "play" and/or "movie"; The second keyword corresponding to the entity category of the music category is "listen" and/or "music".

In a third aspect, this application provides a device for identifying a named entity. The device includes: a first acquiring module, configured to acquire text to be recognized; a first determining module, configured to determine the scene type to which the named entity recognition model used to identify named entities in the text to be recognized is applied; an input module, configured to input the text to be recognized and the scene type into the named entity recognition model; and an execution module, configured to obtain output information of the named entity recognition model to determine the named entity that the named entity recognition model recognizes in the text to be recognized for the scene type.

In a possible implementation manner, the device further includes: a division module, configured to mark the text to be recognized with content indexes according to division units before the text to be recognized and the scene type are input into the named entity recognition model, where division units with the same content are marked with the same content index; and a second determining module, configured to determine the scene type index corresponding to the scene type and mark each division unit with the scene type index. Correspondingly, the input module includes: a first input unit, configured to input the marked indexes of all the division units in the text to be recognized into the named entity recognition model.

In a possible implementation manner, the named entity recognition model includes: a first conversion unit, configured to, after the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model, convert the marked indexes of different types into multi-dimensional vectors for each division unit; a splicing unit, configured to splice in order, for each division unit, the multi-dimensional vectors converted from the different types of indexes; a second input unit, configured to input the spliced vectors of all the division units in the text to be recognized into the sequence labeling model; and a first obtaining unit, configured to obtain the output of the sequence labeling model to obtain the label information of the named entities in the text to be recognized.

In a possible implementation manner, the sequence labeling model is a deep learning model that includes: one or more layers of recurrent neural networks, where each layer uses neural network computing units to process, in order, the spliced vector of each division unit in the text to be recognized and outputs a calculation result vector for each division unit; and a conditional random field, configured to receive a vector sequence consisting of the calculation result vectors arranged in order, namely those produced for all the division units by the last layer of the one or more layers of recurrent neural networks, and to output a label identifying whether each division unit is a named entity.

Optionally, the recurrent neural network is a long short-term memory network or a bidirectional long short-term memory network.

Optionally, the deep learning model is a model obtained by unsupervised pre-training using corpus.

In a possible implementation manner, the device further includes: a first labeling module, configured to mark each division unit in the text to be recognized with a part-of-speech index before the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model. Correspondingly, the marked indexes of each division unit input into the named entity recognition model include the part-of-speech index.

In a possible implementation manner, the device further includes: a third determining module, configured to, before the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model, match the text to be recognized against a preset named entity dictionary to determine all named entities matched in the text to be recognized; and a second labeling module, configured to mark each division unit in the text to be recognized with a knowledge index, where the knowledge index represents the information of the named entity matched by the corresponding division unit in the named entity recognition model. Correspondingly, the marked indexes of each division unit input into the named entity recognition model include the knowledge index.

In a possible implementation manner, the device further includes: a second acquiring module, configured to acquire the first text to be recognized corresponding to the current round of conversation before the text to be recognized and the scene type are input into the named entity recognition model; a detection module, configured to detect whether a first keyword exists in the first text, where the first keyword indicates that the first text is related to a multi-round conversation scene; and a replacement module, configured to, if the first keyword exists in the first text, replace the first keyword in the first text with a second keyword to obtain the second text to be recognized. Correspondingly, the named entity recognition model is also used to identify a named entity in the second text according to at least the second keyword and the scene type, where the second keyword is related to the entity category of the identified named entity.

In a possible implementation manner, the device further includes: a third obtaining module, configured to obtain, before it is detected whether the first keyword exists in the first text, the intent analysis result of the third text to be recognized corresponding to the previous round of conversation of the current round of conversation, so as to obtain the intent type of the third text; and a judgment module, configured to determine whether the intent type of the third text is a specified intent type; correspondingly, if the intent type of the third text is the specified intent type, the detection module detects whether the first keyword exists in the first text.

In a possible implementation manner, the device further includes: a search module, configured to, after a named entity in the second text is recognized by using the named entity recognition model, search the first text for the named entity recognized by the named entity recognition model for the second text; a fourth determining module, configured to determine that the corresponding named entity is invalid if the named entity is not found in the first text; and a fifth determining module, configured to determine that the corresponding named entity is valid if the named entity is found in the first text.

Optionally, the second keyword is a verb and/or noun.

In a possible implementation manner, the detection module includes: a matching unit, configured to perform matching in the first text by using a preset regular expression, and determine whether the first keyword exists in the first text.

Optionally, the scene type is classified according to the type of terminal to which the named entity recognition model is applied and/or the type of application software.

In a possible implementation manner, the device is applied to an electronic device equipped with a sound receiver, and the acquisition module includes: a second acquisition unit, configured to acquire the voice collected by the sound receiver; and a second conversion unit, configured to convert the voice into text to obtain the text to be recognized.

In a fourth aspect, an embodiment of the present application also provides a device for identifying a named entity. The device includes: a first obtaining module, configured to obtain the first text to be recognized corresponding to the current round of conversation; a detection module, configured to detect whether a first keyword exists in the first text, the first keyword being used to indicate that the first text is related to a multi-round conversation scenario; a replacement module, configured to, when the first keyword exists in the first text, replace the first keyword in the first text with a second keyword to obtain the second text to be recognized; and a recognition module, configured to use a named entity recognition model to recognize a named entity in the second text at least according to the second keyword, the second keyword being related to the entity category corresponding to the recognized named entity.

In a possible implementation manner, the device further includes: a second obtaining module, configured to obtain, before it is detected whether the first keyword exists in the first text, the intent analysis result of the third text to be recognized corresponding to the previous round of conversation of the current round of conversation, so as to obtain the intent type of the third text; and a judgment module, configured to determine whether the intent type of the third text is a specified intent type; correspondingly, if the intent type of the third text is the specified intent type, the detection module detects whether the first keyword exists in the first text.

In a possible implementation manner, the device further includes: a search module, configured to, after a named entity in the second text is recognized by using the named entity recognition model, search the first text for the named entity recognized by the named entity recognition model for the second text; a first determining module, configured to determine that the corresponding named entity is invalid if the named entity is not found in the first text; and a second determining module, configured to determine that the corresponding named entity is valid if the named entity is found in the first text.

Optionally, the second keyword is a verb and/or noun.

In a possible implementation manner, the detection module includes: a matching unit, configured to use a preset regular expression to perform matching in the text to be recognized, and determine whether the first keyword exists in the first text.

In a fifth aspect, the present application provides a computer-readable storage medium in which a computer program is stored; when the computer program runs on a computer, the computer is caused to execute the method described in the first aspect.

In a sixth aspect, the present application provides a computer-readable storage medium in which a computer program is stored; when the computer program runs on a computer, the computer is caused to execute the method described in the second aspect.

In a seventh aspect, the present application provides a computer program which, when executed by a computer, is used to execute the method described in the first aspect.

In an eighth aspect, the present application provides a computer program which, when executed by a computer, is used to execute the method described in the second aspect.

In a possible implementation manner, the program in the seventh aspect or the eighth aspect may be stored in whole or in part on a storage medium that is packaged with the processor, or may be stored in part or in whole on a memory that is not packaged with the processor.

In a ninth aspect, this application provides a named entity identification device, the device comprising: one or more processors; a memory; a plurality of application programs; and one or more computer programs, wherein the one or more computer programs are stored in the memory, and the one or more computer programs include instructions that, when executed by the device, cause the device to perform the method described in the first aspect.

In a tenth aspect, the present application provides a named entity identification device, the device comprising: one or more processors; a memory; a plurality of application programs; and one or more computer programs, wherein the one or more computer programs are stored in the memory, and the one or more computer programs include instructions that, when executed by the device, cause the device to perform the method described in the second aspect.

It can be seen that, in the above aspects, embedding the scene information into the input information of the named entity recognition model improves the probability that the named entity recognition model identifies named entities in different usage scenarios.

Description of the drawings

FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of the application;

FIG. 2 is a schematic flowchart of a method for identifying named entities provided by this application;

FIG. 3 is a schematic structural diagram of a named entity recognition model provided by this application;

FIG. 4 is a schematic flowchart of another method for identifying named entities provided by this application;

FIG. 5 is a schematic flowchart of another method for identifying named entities provided by this application;

FIG. 6 is a schematic structural diagram of a named entity identification device provided by an embodiment of this application;

FIG. 7 is a schematic structural diagram of another named entity identification device provided by an embodiment of this application.

Detailed description

The terminology used in the detailed description section of this application is only used to explain specific embodiments of this application and is not intended to limit this application. Some embodiments provided in this application relate to natural language processing (NLP) technology, can be applied to speech recognition in the field of artificial intelligence (AI), and can be specifically applied to applications such as voice assistants.

The technical solution in this application will be described below in conjunction with the accompanying drawings.

This application provides a method for identifying named entities in text. A named entity (or entity for short) refers to an entity with a specific meaning or strong referentiality in the text. It usually includes person names, place names, organization names, dates and times, proper nouns, and so on; in a broader sense, entities also include numbers, currencies, addresses, and so on.

The method for recognizing named entities provided in this application can be applied to formal recognition scenarios, for example, to provide text-based named entity recognition for artificial intelligence voice assistants. FIG. 1 shows a system architecture applied to voice assistants. Users enter the voice assistant mode through vehicle-mounted terminals, computer terminals, and mobile phone terminals. The voice assistant obtains voice, and the automatic speech recognition (ASR) module converts the voice into text. The text is input into the dialog management (DM) module, and the DM module distributes the text to the natural language understanding (NLU) module. After receiving the current sentence and its preceding context, the NLU module performs named entity recognition on the current sentence. Optionally, the named entity recognition module executes an example of the named entity recognition method provided in the embodiments of this application, including word segmentation, part-of-speech tagging, and named entity tagging (the sequence labeling model in the named entity recognition method provided by this application). In addition, the NLU module also includes modules such as intent recognition and classification and slot filling to understand the semantics of the text. The NLU module then returns the analysis result to the DM module. The DM module then uses the natural language generation (NLG) module to generate reply dialogue text according to the recognized semantics, and the speech synthesis (text-to-speech, TTS) module converts the reply dialogue text into speech and broadcasts it to the user.

In the three application scenarios shown in FIG. 1 (voice assistant for vehicle-mounted terminals, voice assistant for computer terminals, and voice assistant for mobile phone terminals), the named entity recognition results may differ between application scenarios, so that the terminal's response to the user's instruction also differs. For example, for the sentence "Change to your name", a mobile phone voice assistant recognizes no named entity and treats the sentence as small talk, whereas a TV voice assistant recognizes the named entity as the movie "Your Name" and switches to the movie "Your Name".

The method for recognizing named entities provided in this application can also be applied to a scenario in which a named entity recognition model is trained. When applied to the training process, after the method provided in this application is used to identify the named entities in the training text, the identified named entities are compared with the named entities pre-marked in the training text, and the parameters in the named entity recognition model are adjusted according to the comparison result.

The following describes the method for identifying a named entity in an embodiment of the present application in detail with reference to FIG. 2. The method shown in FIG. 2 includes steps 101 to 104, and these steps are respectively described in detail below.

Step 101: Obtain the text to be recognized.

Optionally, the text to be recognized may be text obtained by converting voice into text, or a training sample text pre-marked with named entity tags. The text to be recognized may include characters such as Chinese characters, numbers, symbols, and English letters.

In an optional example, the method for identifying a named entity provided in the embodiment of the present application is applied to an electronic device equipped with a sound receiver. The electronic device may specifically be a mobile terminal (for example, a smart phone), a computer, a personal digital assistant, a wearable device, an in-vehicle device, an Internet of Things device, or another electronic device that can receive sound. In this example, the voice collected by the sound receiver is converted into text by the voice-to-text module shown in FIG. 1 to obtain the text to be recognized.

Step 102: Determine the scene type to which the named entity recognition model used to recognize the named entity in the text to be recognized is applied.

In this application, a scenario refers to a business requirement scenario in which a named entity recognition model is needed to identify named entities. Scene types can be divided according to the terminal type and/or the application software type to which the named entity recognition model is applied. For example, the terminal may be one of different types of terminal electronic equipment such as televisions, mobile phones, and car consoles. The application software may be system-level software or application software of a given type (such as video or music) installed in the terminal's operating system. Usage types can be divided into voice assistants, automatic response systems, and so on. Through any one of the above classification methods or a combination of several of them, the scene types can be divided into multiple categories, for example, voice assistants for TV systems, artificial intelligence customer service for shopping software, voice assistants for video applications, and artificial intelligence voice assistants for smart home appliances. This application does not limit the specific categories of scenarios, which can be configured according to actual conditions.

Step 103: Input the text to be recognized and the scene type into the named entity recognition model.

Step 104: Obtain output information of the named entity recognition model to determine the named entity recognized by the named entity recognition model in the text to be recognized for the scene type.

The named entity recognition model described in the embodiments of this application differs from the named entity recognition models in the related art in that, in the method provided in this application, in addition to the text to be recognized, the input information also includes information on the scene type to which the named entity recognition model is applied. By using the scene type information as input information in the training process and/or the recognition process of the named entity recognition model, the named entity recognition model can be applied to different application scene types. For example, if the user says "change to your name" to the voice assistant, a TV voice assistant will recognize "your name" as the name of a movie, whereas a mobile phone voice assistant may not recognize "your name" as the name of a movie. This is because the named entity recognition model is applied to different scene types: based on the scene type information in the input information, the recognition results output by the model may differ.

In an optional implementation manner, before step 103 is performed, that is, before the text to be recognized and the scene type are input into the named entity recognition model, the method further includes the following steps 11 to 13:

Step 11: Mark the content index of the text to be recognized according to the division unit, wherein the division units with the same content are marked by the same content index.

Step 12: Determine the scene type index corresponding to the scene type.

The scene type index corresponding to the above scene type may be manually configured in advance. For example, the scene type index is set at the factory according to the type of terminal device to which the model is applied. The indexes corresponding to different scene types may be a pre-configured mapping. For example, for the application scenario of a voice assistant on a mobile phone terminal, the scene type index is 1; for the application scenario of a voice assistant on a TV terminal, the scene type index is 2; and so on. Optionally, the scene type index can be floating-point data. For example, the index of a general scene is 0 and the index of a high-recall scene is 1.0, so that the scene type index can be adjusted between 0 and 1 to actively migrate the named entity recognition model to a new scene.

Step 13: Mark the scene type index for each division unit.

That is, each division unit (for example, each character or each segmented word) is marked with the scene type index of the scene to which the model is applied.

Correspondingly, in step 103, inputting the text to be recognized and the scene type into the named entity recognition model includes:

Step 14: Input the marked indexes of all the division units in the text to be recognized into the named entity recognition model.
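
Steps 11 to 14 can be sketched in Python as follows. This is a minimal illustration only, not the implementation of this application: the names SCENE_TYPE_INDEX, mark_indexes, and vocab, as well as the scene labels, are hypothetical, and the scene type index values 1 and 2 simply follow the example given above.

```python
from typing import Dict, List, Tuple

# Hypothetical pre-configured mapping; the values 1 and 2 follow the example in the text.
SCENE_TYPE_INDEX: Dict[str, int] = {
    "phone_voice_assistant": 1,
    "tv_voice_assistant": 2,
}


def mark_indexes(units: List[str], scene_type: str,
                 vocab: Dict[str, int]) -> List[Tuple[int, int]]:
    """Return one (content index, scene type index) pair per division unit."""
    scene_idx = SCENE_TYPE_INDEX[scene_type]              # step 12
    marked = []
    for unit in units:
        # Step 11: identical units share one content index;
        # a unit seen for the first time gets the next free index.
        content_idx = vocab.setdefault(unit, len(vocab))
        marked.append((content_idx, scene_idx))            # step 13
    return marked                                          # step 14: the model input


vocab: Dict[str, int] = {}
pairs = mark_indexes(["change", "to", "your", "name", "to"],
                     "tv_voice_assistant", vocab)
```

In this sketch the two occurrences of "to" receive the same content index, and every division unit carries the same scene type index.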

Further, after performing step 14 to input the marked indexes of all the division units in the text to be recognized into the named entity recognition model, the processing method of the named entity recognition model may include the following steps 21 to 23:

Step 21: For each division unit, convert the marked indexes of different types into multi-dimensional vectors.

Step 22: For each division unit, a plurality of multi-dimensional vectors converted by different types of indexes are sequentially spliced.

Step 23: Obtain the output result of the sequence labeling model to obtain the labeling information of the named entities in the text to be recognized.

Furthermore, after the multi-dimensional vectors converted from the different types of indexes are obtained, they are spliced per division unit. For example, for each word, the vector converted from the content index and the vector converted from the scene type index are spliced in order to obtain the splicing vector of that word. For example, for the word "的", the content index is 15, which is converted into a 50-dimensional vector w1; the application scenario of the named entity recognition model is the voice assistant of a mobile phone terminal, whose index is 1, which is converted into a 20-dimensional vector w2; the splicing vector of the word "的" is then [w1, w2].
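
The conversion and splicing of steps 21 and 22 can be sketched as a table lookup followed by concatenation. The sketch below is an assumption-laden illustration in Python: the tables content_table and scene_table are filled with random values standing in for learned embeddings, their sizes (100 and 3 rows) are invented, and only the dimensions 50 and 20 and the indexes 15 and 1 come from the example above.

```python
import random
from typing import List

random.seed(0)


def make_table(rows: int, dim: int) -> List[List[float]]:
    """Build a stand-in embedding table with one `dim`-dimensional row per index."""
    return [[random.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(rows)]


# Hypothetical tables; in a trained model these rows would be learned parameters.
content_table = make_table(rows=100, dim=50)   # content index -> 50-dim vector w1
scene_table = make_table(rows=3, dim=20)       # scene type index -> 20-dim vector w2


def splice(content_idx: int, scene_idx: int) -> List[float]:
    """Look up both vectors and concatenate them in a fixed order: [w1, w2]."""
    w1 = content_table[content_idx]
    w2 = scene_table[scene_idx]
    return w1 + w2


# The example from the text: content index 15, scene type index 1.
vec = splice(content_idx=15, scene_idx=1)
```

The resulting splicing vector has 50 + 20 = 70 dimensions. A part-of-speech or knowledge vector, if used, would be appended in the same way.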

In an optional example, different tags can also be used to mark the named entity type and/or whether the corresponding division unit is the starting position of a named entity. For example, the text to be recognized is "Play your name", the word segmentation result is "Play/你/的/name", and the scene type index is 1. The splicing vectors are input into the sequence labeling model, and the output result is "O B-movie I-movie I-movie". The label "O" indicates that the corresponding division unit "Play" is not a named entity; the label "B-movie" indicates that the corresponding division unit "你" ("you") is part of a named entity of type movie and is the starting position of the named entity; and the label "I-movie" indicates that the corresponding division units "的" ("of") and "name" are part of a named entity of type movie and are not the starting position of the named entity.
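
How such a tag sequence is turned back into named entities is not spelled out in this application. One straightforward way, shown here only as a sketch with the hypothetical function name decode_bio, is to open a span at each "B-" tag, extend it with matching "I-" tags, and close it at "O".

```python
from typing import List, Optional, Tuple


def decode_bio(units: List[str], tags: List[str]) -> List[Tuple[str, List[str]]]:
    """Group B-/I- tagged division units into (entity type, units) spans."""
    entities: List[Tuple[str, List[str]]] = []
    current_type: Optional[str] = None
    current_units: List[str] = []
    for unit, tag in zip(units, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any span that is still open.
            if current_units:
                entities.append((current_type, current_units))
            current_type, current_units = tag[2:], [unit]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the open span of the same type.
            current_units.append(unit)
        else:
            # "O", or an I- tag that does not continue an open span: close and reset.
            if current_units:
                entities.append((current_type, current_units))
            current_type, current_units = None, []
    if current_units:
        entities.append((current_type, current_units))
    return entities


result = decode_bio(["Play", "你", "的", "name"],
                    ["O", "B-movie", "I-movie", "I-movie"])
```

For the example above, the sketch yields a single entity of type movie consisting of the division units "你", "的", and "name".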

Optionally, the input information input to the named entity recognition model may also include part-of-speech index and/or knowledge index.

In an optional implementation manner, before inputting the marked indexes of all the division units in the text to be recognized into the named entity recognition model, the method further includes: labeling each division unit in the text to be recognized with a part-of-speech index. The part-of-speech index is used to indicate the part of speech of the corresponding division unit, such as verb, noun, adverb, or conjunction; correspondingly, the indexes marked for each division unit and input into the named entity recognition model include the part-of-speech index. Correspondingly, converting the indexes into vectors also includes converting the part-of-speech index into a multi-dimensional vector. When the splicing vector is generated, for each division unit, the vectors of all types of indexes (including the content index, the part-of-speech index, and the scene type index) are spliced into one vector as the splicing vector of the corresponding division unit.

In an optional implementation manner, before inputting the marked indexes of all the division units in the text to be recognized into the named entity recognition model, the method further includes: matching the text to be recognized against a preset named entity dictionary to determine all named entities matched in the text to be recognized; and labeling each division unit in the text to be recognized with a knowledge index, where the knowledge index is used to indicate the information of the named entity matched by the corresponding division unit in the named entity recognition model; correspondingly, the indexes marked for each division unit and input into the named entity recognition model include the knowledge index.
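
The dictionary matching and knowledge index labeling can be sketched as follows. The application does not specify the matching strategy or the index values, so everything here is an assumption: the dictionary ENTITY_DICT, the index table KNOWLEDGE_INDEX (with 0 meaning "no match"), the longest-match-first strategy, and the joining of division units with spaces (for Chinese text the units would be joined without a separator).

```python
from typing import Dict, List

# Hypothetical preset named entity dictionary: surface form -> entity category.
ENTITY_DICT: Dict[str, str] = {"your name": "movie"}
# Hypothetical knowledge indexes; 0 is reserved for "no dictionary match".
KNOWLEDGE_INDEX: Dict[str, int] = {"movie": 1, "music": 2}


def knowledge_indexes(units: List[str]) -> List[int]:
    """Label each division unit with the knowledge index of the entry covering it."""
    n = len(units)
    labels = [0] * n
    i = 0
    while i < n:
        matched_end = None
        # Prefer the longest span starting at position i that is in the dictionary.
        for j in range(n, i, -1):
            if " ".join(units[i:j]) in ENTITY_DICT:
                matched_end = j
                break
        if matched_end is None:
            i += 1
            continue
        category = ENTITY_DICT[" ".join(units[i:matched_end])]
        for k in range(i, matched_end):
            labels[k] = KNOWLEDGE_INDEX[category]
        i = matched_end
    return labels


labels = knowledge_indexes(["play", "your", "name"])
```

Here "play" receives the index 0, while "your" and "name" both receive the index of the movie category, which the model can then use as an additional input feature.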

In an alternative embodiment, the sequence labeling model is a deep learning model, for example, an RNN (recurrent neural network) + CRF model. Specifically, the deep learning model includes one or more layers of recurrent neural networks and conditional random fields.

Specifically, in the one or more layers of recurrent neural networks, each layer uses neural network computing units to process, in order, the splicing vector of each division unit in the text to be recognized, so as to output the calculation result vector corresponding to each division unit.

The conditional random field receives a vector sequence and outputs, for each division unit, a label used to identify whether the division unit is a named entity. The vector sequence includes multiple calculation result vectors arranged in order, which are the calculation result vectors output for all the division units by the last layer of the one or more layers of recurrent neural networks.
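
This application does not state how the conditional random field selects the output labels. A common choice for linear-chain CRFs is Viterbi decoding, sketched below under that assumption. The per-unit scores (emissions) stand in for the calculation result vectors after projection to one score per tag, and all numeric values, the tag order O/B/I, and the function name viterbi are hypothetical.

```python
from typing import List


def viterbi(emissions: List[List[float]], transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][k]: score of tag k for division unit t.
    transitions[j][k]: score of moving from tag j to tag k.
    """
    num_tags = len(transitions)
    score = list(emissions[0])
    backpointers: List[List[int]] = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for k in range(num_tags):
            # Best previous tag for ending in tag k at this position.
            best_prev = max(range(num_tags),
                            key=lambda j: score[j] + transitions[j][k])
            new_score.append(score[best_prev] + transitions[best_prev][k] + emit[k])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag.
    best = max(range(num_tags), key=lambda k: score[k])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


# Tags: 0 = O, 1 = B-movie, 2 = I-movie. The O -> I transition is heavily penalized.
transitions = [[0.0, 0.0, -10.0],
               [0.0, -1.0, 0.5],
               [0.0, 0.0, 0.5]]
emissions = [[2.0, 0.1, 0.0],
             [0.1, 1.5, 1.0],
             [0.2, 0.3, 1.2],
             [0.1, 0.2, 1.4]]
path = viterbi(emissions, transitions)
```

With these illustrative scores the decoded sequence corresponds to "O B-movie I-movie I-movie". The transition scores are what allow the model to rule out sequences such as an "I-" tag directly following "O".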

In an alternative implementation manner, the RNN may be an LSTM or a BiLSTM. The LSTM (Long Short-Term Memory) model is a type of RNN (recurrent neural network). The BiLSTM (Bi-directional Long Short-Term Memory) model is a combination of a forward LSTM and a backward LSTM. Both LSTM and BiLSTM are often used to model contextual information in natural language processing tasks.

FIG. 3 is a schematic structural diagram of an example of a named entity recognition model using BiLSTM+CRF. As shown in FIG. 3, the named entity recognition model 300 includes a vector conversion module, a vector splicing module, and a sequence labeling module 301. The sequence labeling module 301 includes a bidirectional long short-term memory network (BiLSTM) and a CRF. The input of the vector conversion module is the various indexes of each word (division unit), including the scene (type) index, the part-of-speech index, and the word (content) index. The vector conversion module uses word embedding technology to map each index into a vector. The vector splicing module splices the multiple vectors of each division unit together and inputs the result into the BiLSTM. The BiLSTM includes a forward LSTM network and a backward LSTM network. Each layer of the LSTM network includes multiple repeated LSTM neural units, which are used to perform calculations on the input vectors. For each division unit, the outputs of the forward LSTM neural unit and the backward LSTM neural unit are spliced together to obtain the calculation result vector of the corresponding division unit. All the calculation result vectors are arranged in order into a vector sequence and input into the CRF to obtain the final output tags t1, t2, .... Each tag is used to indicate whether the corresponding division unit belongs to a named entity, and can also identify whether the corresponding division unit is at the starting position of the named entity.

In an optional implementation manner, the principle of transfer learning is used, and a model obtained by unsupervised pre-training on a corpus is used as the aforementioned deep learning model. Transfer learning is a machine learning method in which a pre-trained model is reused in another task. For example, the model developed for task A is used as the starting point and reused in the process of developing the model for task B. In the embodiments of this application, unsupervised pre-training is a way of training the model using an unlabeled corpus. For example, the task of unsupervised pre-training can be any of the following:

1) Use the autoencoder of the sequence to encode the original sequence, and input the encoded sequence into the deep learning model to predict the original sequence;

2) In the case of the LSTM+CRF model, the training task is the traditional language model task: predict the next word.

In an optional implementation manner, before the text to be recognized and the scene type are input into the named entity recognition model, the method further includes the following. The first text to be recognized corresponding to the current round of conversation is obtained. The conversation may be a voice conversation or a text conversation. In the embodiments of this application, the current round of conversation refers to the latest conversation sent by the other party to the executor of the embodiments of this application, and the first text to be recognized is obtained from the voice or text of this round of conversation. It is then detected whether a first keyword, for example "change" or "replace", exists in the first text. The first keyword is used to indicate that the first text is related to a multi-round conversation scenario, that is, a scenario with preceding context; for example, the user first says "please play it all well" and then says "change to your name". Optionally, the first keyword is also used to indicate that the intent corresponding to the first text is incomplete. When the first keyword exists in the first text, the first keyword in the first text is replaced with a second keyword, for example "change" is replaced with "play", to obtain the second text to be recognized. Correspondingly, the named entity recognition model is also used to identify named entities in the second text at least according to the second keyword and the scene type. The second keyword is related to the entity category corresponding to the identified named entity; in the above example, the second keyword "play" is related to the movie entity category. The named entity recognition model receives the second text containing the second keyword, and the entity category of the named entity it identifies is related to the entity category corresponding to the second keyword.

In a possible implementation manner, before detecting whether the first keyword exists in the first text, the method further includes: obtaining the intent analysis result of the third text to be recognized corresponding to the previous round of conversation of the current round of conversation, so as to obtain the intent type of the third text. It should be noted that the executor of the embodiments of this application can obtain the intent analysis result of the previous round of conversation and obtain the corresponding intent type, which may be, for example, an intent type of searching for videos. It is then determined whether the intent type of the third text is a specified intent type; correspondingly, if the intent type of the third text is the specified intent type, it is detected whether the first keyword exists in the first text. That is, only when the intent type of the previous round is one of one or several specific intent types is it necessary to detect whether the first keyword exists in the first text. Optionally, the second keyword is determined according to the intent analysis result corresponding to the previous round of conversation. For example, each intent type can correspond to one or more keywords, and one of these keywords is selected as the second keyword; alternatively, the second keyword may be a word extracted from the previous round of conversation, for example, a verb (such as "play") or a noun related to the entity category (such as "movie"). Optionally, the second keyword is a verb and/or noun.
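
The gating on the previous round's intent type and the selection of the second keyword can be sketched as a table lookup. The intent type names, the table INTENT_KEYWORDS, and the rule of taking the first candidate are all hypothetical; the application only states that each intent type can correspond to one or more keywords from which one is selected.

```python
from typing import Dict, List, Optional

# Hypothetical table: specified intent types -> candidate second keywords.
INTENT_KEYWORDS: Dict[str, List[str]] = {
    "search_video": ["play", "movie"],
    "search_music": ["listen", "music"],
}


def second_keyword_for(previous_intent: Optional[str]) -> Optional[str]:
    """Return a second keyword if the previous round's intent is a specified type.

    A return value of None means the intent is not a specified type, so the
    detection of the first keyword can be skipped for the current round.
    """
    candidates = INTENT_KEYWORDS.get(previous_intent or "")
    if not candidates:
        return None
    # Assumed selection rule: take the first configured keyword.
    return candidates[0]


keyword = second_keyword_for("search_video")
```

In this sketch, a previous round with a video search intent yields "play" as the second keyword, while an unrelated intent yields no keyword and therefore no replacement.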

In a possible implementation manner, detecting whether the first keyword exists in the first text includes: using a preset regular expression to perform matching in the first text to determine whether the first keyword exists in the first text. For example, the regular expression is ([I want only]? Want|select [select out]?|[screen pick]select?|[变变变][为成]).*, where "[]" means that matching any one of the characters in the brackets is sufficient, "?" means that the preceding character may be omitted, "|" means or, and ".*" means any character string. That is, for this regular expression, sentences such as "as long as xxx", "select xxx", "change to xxx", and "select xxx" can be matched successfully.
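
The detection and replacement of the first keyword can be sketched with Python's re module. The pattern below is a hypothetical English analogue written for illustration, not the regular expression of this application, and the names FIRST_KEYWORD and replace_first_keyword are likewise assumptions.

```python
import re
from typing import Optional

# Hypothetical English analogue of a multi-round sentence pattern:
# an optional "I", an optional "only", then "want"; or "select"; or "pick";
# or "change to" / "change into"; followed by the rest of the sentence.
FIRST_KEYWORD = re.compile(
    r"^(?P<kw>(?:i\s+)?(?:only\s+)?want|select|pick|change\s+(?:to|into))"
    r"\s+(?P<rest>.+)$",
    re.IGNORECASE,
)


def replace_first_keyword(first_text: str, second_keyword: str) -> Optional[str]:
    """Return the second text, or None if no first keyword is present."""
    match = FIRST_KEYWORD.match(first_text.strip())
    if match is None:
        return None
    # Replace the matched first keyword with the second keyword.
    return f"{second_keyword} {match.group('rest')}"


second_text = replace_first_keyword("Change to your name", "play")
```

When the function returns None, the first text is passed on unchanged; otherwise the returned second text is what the named entity recognition model receives.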

In a possible implementation manner, the entity category corresponding to the second keyword is movie or music; the second keyword corresponding to the movie entity category is "play" and/or "movie"; and the second keyword corresponding to the music entity category is "listening" and/or "music".

In a possible implementation manner, after the named entity recognition model is used to recognize the named entity in the second text, the method further includes: in the first text, searching for the named entity recognized by the named entity recognition model for the second text; If the named entity is not found in the first text, then the corresponding named entity is determined to be invalid; if the named entity is found in the first text, then the corresponding named entity is determined to be valid.
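
The validity check can be sketched as a search of the first text for each recognized entity. The use of plain substring search and the function name validate_entities are assumptions; the application only requires that the named entity be searched for in the first text.

```python
from typing import Dict, List


def validate_entities(first_text: str, entities: List[str]) -> Dict[str, bool]:
    """Mark each entity recognized in the second text as valid only if it
    also occurs in the first text (assumed here to mean substring search)."""
    return {entity: entity in first_text for entity in entities}


# First text: "change to your name"; second text: "play your name".
checked = validate_entities("change to your name", ["your name", "play your name"])
```

The check guards against entities that exist only because of the substituted second keyword: in this sketch, "play your name" contains the inserted word "play", is not found in the first text, and is therefore treated as invalid.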

The embodiment of the present application also provides an embodiment of another named entity identification method. It should be understood that for parts that are not described in detail in this embodiment, reference may be made to the specific description of the corresponding part in the foregoing embodiment. The method provided in this embodiment includes the following steps 32 to 38:

Step 32: Obtain the first text to be recognized corresponding to this round of conversation;

Step 34: Detect whether there is a first keyword in the first text, and the first keyword is used to indicate that the first text is related to multiple rounds of conversation scenarios;

Step 36: If the first keyword exists in the first text, replace the first keyword in the first text with a second keyword to obtain the second text to be recognized;

Step 38: Use a named entity recognition model to identify a named entity in the second text according to at least the second keyword, where the second keyword is related to the entity category corresponding to the identified named entity.

It should be noted that the named entity recognition model can be combined with the scene type to which it is applied: before the named entity recognition model is used to identify the named entity in the second text, the scene type to which the named entity recognition model is applied is obtained, and the named entity in the second text is identified at least according to the second keyword and the scene type. Those skilled in the art are able to combine the embodiments provided in the first aspect with the embodiments provided in the second aspect.

Optionally, the second keyword is a verb and/or noun.

FIG. 4 is a schematic diagram of an optional process of a named entity identification method provided by an embodiment of the application. As shown in FIG. 4, the method includes:

Before the named entity recognition method processes the text to be recognized, keywords and multi-round sentence patterns are collected for a given type of scene, either manually or from a corpus.

For example, the keywords for the movie category can be "play" and "movie", and the keywords for the music category can be "listen" and "song". Each type of collected keyword can correspond to a particular preceding-context scene; used as an enhanced feature for named entity recognition in that usage scenario, it improves the probability of recognizing the related named entities. For example, movie keywords can correspond to scene types such as a voice assistant on a TV terminal, and music keywords can correspond to scene types such as music software on a mobile phone terminal.

The multi-round sentence pattern described in the embodiments of this application refers to a sentence pattern that contains incomplete intent information and needs to be supplemented by the intent information of the preceding context. For example, the multi-round sentence pattern can be "as long as Iron Man", "Choose tomorrow", "Change to your name", "Choose all very well", and so on. Regular expressions for the multi-round sentence patterns are then constructed from the collected patterns. For example, the regular expression can be ([I want only]? Want|Select [Select out]?|[Screen Pick]Select?|[ Change and change][为成]).*.

It should be noted that the above steps of collecting keywords, collecting multiple rounds of sentence patterns, and constructing regular expressions are relatively independent of the recognition process of the text to be recognized by the named entity recognition model.

After the text to be recognized is obtained, the text is input into the recognition model of the named entity provided in the embodiment of the present application.

First, the regular expressions constructed from the collected multi-round sentence patterns are used to match the sentence to be replaced, and the matched sentence is replaced with a sentence that contains keywords. Specifically, this can include splicing in keywords and/or substituting keywords, that is, adding keywords to the original sentence and/or replacing some of the words in the original sentence with keywords. For example, if the input text to be recognized is "replace your name", then after the regular expression matches, it is replaced with "play your name".
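
The matching and replacement step can be sketched as follows. This is only an illustration under stated assumptions: the English lead-in phrases stand in for the collected multi-round sentence patterns, and the names `MULTI_ROUND_PATTERN`, `CATEGORY_KEYWORD`, and `enhance_text` are hypothetical and do not appear in the application.

```python
import re

# Hypothetical English stand-ins for the collected multi-round sentence patterns.
MULTI_ROUND_PATTERN = re.compile(
    r"^(?:i only want|change to|switch to|replace with|choose)\s+(?P<rest>.+)$",
    re.IGNORECASE,
)

# Hypothetical keyword (the "second keyword") for each entity category.
CATEGORY_KEYWORD = {"movie": "play", "music": "listen to"}


def enhance_text(first_text: str, category: str) -> str:
    """Replace a multi-round lead-in (the "first keyword") with a category keyword.

    The text is returned unchanged when no multi-round pattern matches.
    """
    match = MULTI_ROUND_PATTERN.match(first_text.strip())
    if match is None:
        return first_text
    return f"{CATEGORY_KEYWORD[category]} {match.group('rest')}"
```

Under these assumptions, `enhance_text("change to your name", "movie")` yields `"play your name"`, while a sentence that matches no pattern passes through untouched.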

Second, the replaced text is input into the named entity recognition model. In the model, word segmentation is first performed on the replaced text to obtain "play/you/的/name", and each segmented word is marked with indexes (including a word index, a part-of-speech index, and a scene index). Next, each index is mapped to a vector, and after the vectors of the multiple indexes of each segmented word are spliced together, they are input into the sequence labeling model. The sequence labeling model can adopt the structure of the above-mentioned BiLSTM+CRF model. For example, after the sequence labeling model labels the vector sequence of "Play your name", the labels of the segmented words are determined to be "O B-movie I-movie I-movie". These labels respectively indicate: not a named entity, the start position of a named entity, a non-start position of the named entity, and a non-start position of the named entity.
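
The index-to-vector mapping and splicing can be illustrated with the sketch below. The index tables, vector dimensions, and randomly initialised embedding tables are all assumptions made for illustration; a trained model would use learned embeddings, and the application does not specify dimensions.

```python
import random

# Hypothetical index tables; a real system would build these from its corpus.
WORD_INDEX = {"<unk>": 0, "play": 1, "you": 2, "的": 3, "name": 4}
POS_INDEX = {"verb": 0, "pronoun": 1, "particle": 2, "noun": 3}
SCENE_INDEX = {"tv_voice_assistant": 0, "phone_music_app": 1}


def make_table(size: int, dim: int, seed: int) -> list[list[float]]:
    """Create a randomly initialised embedding table (placeholder for trained weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(size)]


WORD_EMB = make_table(len(WORD_INDEX), dim=4, seed=0)
POS_EMB = make_table(len(POS_INDEX), dim=2, seed=1)
SCENE_EMB = make_table(len(SCENE_INDEX), dim=2, seed=2)


def encode(tokens: list[tuple[str, str]], scene: str) -> list[list[float]]:
    """Map each (word, part of speech) token to one spliced vector."""
    scene_vec = SCENE_EMB[SCENE_INDEX[scene]]
    spliced = []
    for word, pos in tokens:
        word_vec = WORD_EMB[WORD_INDEX.get(word, WORD_INDEX["<unk>"])]
        pos_vec = POS_EMB[POS_INDEX[pos]]
        # Splice (concatenate) the vectors of the different index types in a fixed order.
        spliced.append(word_vec + pos_vec + scene_vec)
    return spliced


tokens = [("play", "verb"), ("you", "pronoun"), ("的", "particle"), ("name", "noun")]
vectors = encode(tokens, "tv_voice_assistant")
```

Every segmented word receives the same scene vector, so changing the scene argument changes the input to the sequence labeling model without changing the text.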

Third, structure the marked named entity. For example, if the named entity is identified as "Iron Man Part 2", it will be structured (or normalized) as the named entity "Iron Man 2". For the named entity "your name", the result of the structured entity is still "your name".
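
One way to picture the structuring step is a small normalisation rule. The rule below ("<title> Part <n>" becomes "<title> <n>") and the name `structure_entity` are assumptions chosen to reproduce the example above; the application does not define how structuring is implemented.

```python
import re

# Hypothetical rule: "<title> Part <n>" is normalised to "<title> <n>".
_PART_SUFFIX = re.compile(r"\s+Part\s+(\d+)$", re.IGNORECASE)


def structure_entity(surface: str) -> str:
    """Return a normalised form of a recognised entity string."""
    return _PART_SUFFIX.sub(r" \1", surface.strip())
```

Entities that match no rule, such as "your name", are returned unchanged.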

Finally, the named entity in its form before structuring is matched against the original text to be recognized, to verify that the recognized named entity exists in the original text. This avoids recognizing a named entity that does not exist in the original text as a result of the replaced keyword combining with the original text. If the named entity exists in the original text, the recognized named entity is confirmed to be valid.
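
The verification step amounts to a substring check against the text before replacement. A minimal sketch, assuming plain string containment is sufficient (the function name `validate_entities` is hypothetical):

```python
def validate_entities(original_text: str, entities: list[str]) -> dict[str, bool]:
    """Mark each recognised entity as valid only if it occurs in the original text."""
    return {entity: entity in original_text for entity in entities}


# "play" was inserted by keyword replacement, so an entity containing it is rejected.
result = validate_entities("change to your name", ["your name", "play your"])
```

Here `result` marks "your name" as valid and "play your" as invalid, because the latter only exists in the replaced text.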

FIG. 5 is a schematic diagram of another optional process of the method for identifying a named entity provided by an embodiment of the application. As shown in FIG. 5, the method is divided into two parts: a training process and a recognition process. The training process is the process of training the actual named entity recognition model, and the recognition process is the process of using the named entity recognition model. The training process and the recognition process can be relatively independent, and the recognition model of the named entity trained in the training process is used as the recognition model of the named entity used in the recognition process.

In the training process, corpus enhancement can be applied to the collected training corpus. Corpus enhancement expands the corpus by replacing certain keywords in the corpus with the above-mentioned multi-round sentence patterns. For example, if the corpus text is "Play your name", the text of the enhanced corpus after corpus enhancement is "Change your name". Corpus enhancement can be regarded as the reverse of matching regular expressions and replacing keywords in the text to be recognized. Preprocessing of the corpus includes word segmentation, knowledge extraction, part-of-speech determination, index marking, and so on. Word segmentation splits the corpus into words. Knowledge extraction matches the corpus against a preset named entity dictionary; the matched named entities can be used as known knowledge and input into the named entity recognition model for training. Part-of-speech determination determines the part of speech of each segmented word. Index marking can mark an index for the content of the word itself (the text), for the part-of-speech tag, and for the scene type and/or knowledge. Corpus preprocessing can be applied to the original corpus or to the enhanced corpus, depending on the current training goal.
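
Corpus enhancement, as the reverse of keyword replacement, might look like the sketch below. The table `KEYWORD_TO_LEADINS` and the function `augment` are hypothetical, and the English lead-ins are stand-ins for the collected multi-round sentence patterns.

```python
import random

# Hypothetical table: keyword -> multi-round lead-ins that may replace it.
KEYWORD_TO_LEADINS = {"play": ["change to", "switch to", "I only want"]}


def augment(sentence: str, rng: random.Random) -> list[str]:
    """Return the original sentence plus one variant with its leading keyword swapped."""
    results = [sentence]
    for keyword, leadins in KEYWORD_TO_LEADINS.items():
        if sentence.lower().startswith(keyword + " "):
            # Keep everything after the keyword, so entity annotations still apply.
            rest = sentence[len(keyword) + 1:]
            results.append(f"{rng.choice(leadins)} {rest}")
    return results


augmented = augment("Play your name", random.Random(0))
```

Because only the leading keyword changes, the entity span "your name" and its labels carry over to the generated sentence.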

After the indexes are built, the preprocessed information is input into the basic model to train the model. The basic model can be configured as a BiLSTM+CRF model. The input of the model includes at least the scene type index and the word index of each segmented word.

The output is the sequence of labels, that is, the named entity label of each word.
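
To show how such a label sequence maps back to entities, the sketch below groups labels in the B-/I-/O style used in the "O B-movie I-movie I-movie" example. The function name `decode_bio` is hypothetical, and the handling of a stray I- label (it is dropped) is an assumption.

```python
def decode_bio(tokens: list[str], labels: list[str]) -> list[tuple[str, list[str]]]:
    """Group B-/I-/O labels into (entity_type, tokens) spans."""
    spans = []
    current_type = None
    current_tokens = []
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if current_type is not None:
                spans.append((current_type, current_tokens))
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_type == label[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- label that does not continue the open entity.
            if current_type is not None:
                spans.append((current_type, current_tokens))
            current_type, current_tokens = None, []
    if current_type is not None:
        spans.append((current_type, current_tokens))
    return spans


spans = decode_bio(["play", "you", "的", "name"], ["O", "B-movie", "I-movie", "I-movie"])
```

For the example sentence, this produces a single movie entity made up of the last three segmented words.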

In the recognition process, the intent or slot information of the preceding context can be obtained first. This is the intent and slot information recognized by the intent recognition module and the slot filling module shown in FIG. 1 when processing the previous round or the previous n rounds of sentences. After the preceding intent and slot information are obtained, it can be determined whether that intent is related to the current entity. For example, suppose the preceding intent is to play a movie, "your name". When the text (a sentence) to be recognized is obtained in this round, the named entity dictionary can first be used to identify whether the current text contains an entity. If the text is, for example, "It's good to change it" and the TV series "very good" is matched in the named entity dictionary, this indicates that the entity in the current text to be recognized is related to the preceding intent, and keywords need to be used for enhancement. Alternatively, regular expressions can be used for direct matching. If no regular expression matches, the text to be recognized does not need keyword feature enhancement; if a regular expression matches, the enhancement is performed.
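
The decision described above can be sketched as two routes: a dictionary lookup tied to the preceding intent, and a direct regular expression match. Everything named here is hypothetical, including the intent format ("play_movie"), the dictionary contents, and the English lead-in pattern.

```python
import re
from typing import Optional

# Hypothetical entity dictionary: surface form -> entity category.
ENTITY_DICT = {"your name": "movie", "iron man": "movie"}
# Hypothetical multi-round lead-in pattern.
LEADIN = re.compile(r"^(?:change to|switch to|i only want)\b", re.IGNORECASE)


def needs_enhancement(text: str, previous_intent: Optional[str]) -> bool:
    """Decide whether the current text should be enhanced with a keyword."""
    if previous_intent is None:
        return False  # no preceding context to borrow an intent from
    category = previous_intent.split("_")[-1]  # e.g. "play_movie" -> "movie"
    lowered = text.lower()
    # Route 1: the text mentions a dictionary entity of the same category.
    for surface, entity_category in ENTITY_DICT.items():
        if entity_category == category and surface in lowered:
            return True
    # Route 2: the text matches a multi-round sentence pattern.
    return LEADIN.search(text.strip()) is not None
```

A sentence with no dictionary entity and no lead-in, such as "what time is it", is passed to the model without enhancement.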

The enhancement method is to use regular expression search to match the above-mentioned multi-round sentence patterns and replace them with sentence patterns that contain keywords. For example, "Replace your name" is replaced with "Play your name".

After replacing the text, perform preprocessing on the replaced text, including word segmentation, the above-mentioned knowledge extraction, etc., and mark the index. The index obtained after the preprocessing and the index of the scene type are input into the recognition model of the named entity to obtain the labeling sequence. The labeling sequence is used to label whether each word segmentation is an entity.

After the label sequence is obtained, post-processing is performed, that is, checking whether the labeled named entity exists in the original text before replacement. If it does not exist, it is a misrecognition. If it exists, the entity is structured and output to the downstream modules, such as the intent recognition module and the slot filling module in FIG. 1.

The named entity recognition method provided in the embodiments of this application can apply the same named entity recognition model to different scenarios, and is suitable for speech recognition in the field of artificial intelligence, for example, in applications such as voice assistants of terminal systems or application software, AI customer service, and AI chat. The method relates to natural language processing technology and can effectively recognize the named entities of the corresponding scenario for different scenarios, which helps to improve the semantic recognition capability of natural language processing, reduce the labor cost of collecting corpus, and improve the generalization ability of semantic recognition. The embodiments of this application add "scene type" information to the input of the named entity recognition model, so that the input value of the "scene type" can be adjusted dynamically when named entities are recognized, thereby adapting the entity recognition result to different scenarios and providing users with more accurate recognition results that better fit the application scenario.

It can be understood that some or all of the steps or operations in the above-mentioned embodiments are only examples, and the embodiments of the present application may also perform other operations or variations of the operations. In addition, the steps may be executed in an order different from that presented in the foregoing embodiments, and it may not be necessary to perform all of the operations in the foregoing embodiments.

As shown in FIG. 6, a schematic structural diagram of a named entity recognition device is provided. The named entity recognition device 600 includes a first acquisition module 601, a first determination module 602, an input module 603, and an execution module 604.

The first obtaining module is used to obtain the text to be recognized; the first determining module is used to determine the scene type applied by the named entity recognition model used to recognize the named entity in the text to be recognized; the input module is used to input the text to be recognized and the scene type into the named entity recognition model; the execution module is used to obtain the output information of the named entity recognition model to determine the named entity recognized by the named entity recognition model in the text to be recognized for the scene type.

In a possible implementation manner, the device further includes: a division module, which is used to mark the text to be recognized with content indexes according to division units before the text to be recognized and the scene type are input into the named entity recognition model, where division units with the same content are marked with the same content index; and a second determination module, used to determine the scene type index corresponding to the scene type and to mark the scene type index for each division unit; correspondingly, the input module includes: a first input unit, used to input the marked indexes of all the division units in the text to be recognized into the named entity recognition model.
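
The marking performed by the division module and the second determination module can be sketched as follows. The function `mark_indexes` and the vocabularies are hypothetical; growing the content vocabulary on the fly is an assumption made to keep the example short, and a deployed model would more likely map unseen units to a reserved "unknown" index.

```python
def mark_indexes(units: list[str], scene: str,
                 content_vocab: dict[str, int],
                 scene_vocab: dict[str, int]) -> list[tuple[int, int]]:
    """Return one (content index, scene type index) pair per division unit."""
    scene_idx = scene_vocab[scene]
    marked = []
    for unit in units:
        # Units with identical content share one content index.
        content_idx = content_vocab.setdefault(unit, len(content_vocab))
        marked.append((content_idx, scene_idx))
    return marked


vocab: dict[str, int] = {}
marked = mark_indexes(["play", "play", "name"], "tv", vocab, {"tv": 0, "phone": 1})
```

The two occurrences of "play" receive the same content index, and every unit carries the same scene type index.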

In a possible implementation manner, the named entity recognition model includes: a first conversion unit, configured to, after the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model, convert, for each division unit, the marked indexes of different types into multi-dimensional vectors; a splicing unit, used to sequentially splice, for each division unit, the multiple multi-dimensional vectors converted from the different types of indexes; a second input unit, used to input the spliced vectors of all the division units in the text to be recognized into the sequence labeling model; and a first obtaining unit, used to obtain the output result of the sequence labeling model to obtain the labeling information of the named entity in the text to be recognized.

In a possible implementation, the sequence labeling model is a deep learning model, and the deep learning model includes: one or more layers of recurrent neural networks, where each layer of the recurrent neural network uses neural network computing units to process, in order, the spliced vector of each division unit in the text to be recognized, so as to output the calculation result vector corresponding to each division unit; and a conditional random field, used to receive a vector sequence and output a label identifying whether each division unit is a named entity, where the vector sequence includes multiple calculation result vectors arranged in order, and the multiple calculation result vectors are the calculation result vectors output for all the division units by the single layer, or by the last layer of the multiple layers, of the recurrent neural network.
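
The application does not state how the conditional random field selects its output labels. One common choice for a linear-chain conditional random field is Viterbi decoding, sketched below as an assumption; the per-position scores (`emissions`) stand in for the calculation result vectors of the last recurrent layer, and the transition scores are made up for the example.

```python
def viterbi(emissions: list[list[float]], transitions: list[list[float]]) -> list[int]:
    """Return the highest-scoring label sequence for a linear-chain CRF.

    emissions[t][j]   : score of label j at position t (non-empty list assumed)
    transitions[i][j] : score of moving from label i to label j
    """
    n_labels = len(transitions)
    scores = list(emissions[0])
    backpointers = []
    for emission in emissions[1:]:
        step_scores, step_back = [], []
        for j in range(n_labels):
            # Best previous label for reaching label j at this position.
            best_i = max(range(n_labels), key=lambda i: scores[i] + transitions[i][j])
            step_scores.append(scores[best_i] + transitions[best_i][j] + emission[j])
            step_back.append(best_i)
        scores = step_scores
        backpointers.append(step_back)
    # Trace the best path backwards from the best final label.
    best = max(range(n_labels), key=lambda j: scores[j])
    path = [best]
    for step_back in reversed(backpointers):
        best = step_back[best]
        path.append(best)
    path.reverse()
    return path


# Labels: 0 = O, 1 = B-movie, 2 = I-movie. The O -> I transition is heavily penalised.
TRANSITIONS = [[0.0, 0.0, -100.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
path = viterbi([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0], [0.0, 0.0, 2.0]], TRANSITIONS)
```

The transition scores are what let the conditional random field reject label sequences that the per-position scores alone might prefer, such as an I- label directly after O.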

Optionally, the recurrent neural network is a long short-term memory (LSTM) neural network or a bidirectional long short-term memory (BiLSTM) neural network.

In a possible implementation manner, the device further includes: a first labeling module, configured to mark a part-of-speech index for each division unit in the text to be recognized before the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model; correspondingly, the indexes marked for each division unit and input into the named entity recognition model include the part-of-speech index.

In a possible implementation manner, the device further includes: a third determining module, configured to match the text to be recognized against a preset named entity dictionary before the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model, so as to determine all named entities matched in the text to be recognized; and a second labeling module, used to mark a knowledge index for each division unit in the text to be recognized, where the knowledge index indicates information about the matched named entity to which the corresponding division unit belongs; correspondingly, the indexes marked for each division unit and input into the named entity recognition model include the knowledge index.
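
A knowledge index of this kind could be derived as in the sketch below. The encoding (0 for no match, 1 for the start of a matched entity, 2 for the inside of a matched entity) and the greedy longest-match lookup are assumptions; the application only states that the knowledge index carries information about the matched named entity.

```python
def knowledge_indexes(units: list[str], dictionary: set[tuple[str, ...]]) -> list[int]:
    """Label each division unit with 0 (no match), 1 (start of match) or 2 (inside match).

    Uses greedy longest-match-first lookup against a preset entity dictionary.
    """
    labels = [0] * len(units)
    max_len = max((len(entry) for entry in dictionary), default=0)
    i = 0
    while i < len(units):
        matched = 0
        # Try the longest candidate span first, then shorter ones.
        for length in range(min(max_len, len(units) - i), 0, -1):
            if tuple(units[i:i + length]) in dictionary:
                matched = length
                break
        if matched:
            labels[i] = 1
            for k in range(i + 1, i + matched):
                labels[k] = 2
            i += matched
        else:
            i += 1
    return labels


ENTITY_DICTIONARY = {("you", "的", "name"), ("iron", "man")}
knowledge = knowledge_indexes(["play", "you", "的", "name"], ENTITY_DICTIONARY)
```

The resulting list has one entry per division unit, so it can be marked alongside the content, part-of-speech, and scene type indexes.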

In a possible implementation manner, the device further includes: a second acquiring module, configured to acquire the first text to be recognized corresponding to the current session before the text and the scene type are input into the named entity recognition model; a detection module, used to detect whether the first keyword exists in the first text, where the first keyword is used to indicate that the first text is related to a multi-round conversation scenario; and a replacement module, used to replace the first keyword in the first text with the second keyword when the first keyword exists in the first text, so as to obtain the second text to be recognized; correspondingly, the named entity recognition model is also used to recognize the named entity in the second text at least according to the second keyword and the scene type, where the second keyword is related to the entity category corresponding to the recognized named entity.

In a possible implementation manner, the device further includes: a third acquiring module, configured to acquire, before it is detected whether the first keyword exists in the first text, the intent analysis result of the third text to be recognized corresponding to the round of conversation preceding the current round, so as to obtain the intent type of the third text; and a judgment module, used to determine whether the intent type of the third text is the specified intent type; correspondingly, if the intent type of the third text is the specified intent type, the detection module detects whether the first keyword exists in the first text.

In a possible implementation manner, the device further includes: a search module, which is used to search the first text for the named entity recognized by the named entity recognition model for the second text, after the named entity recognition model is used to recognize the named entity in the second text; a fourth determining module, used to determine that the corresponding named entity is invalid if the named entity is not found in the first text; and a fifth determining module, used to determine that the corresponding named entity is valid if the named entity is found in the first text.

Optionally, the second keyword is a verb and/or noun.

In a possible implementation manner, the detection module includes: a matching unit, configured to perform matching in the first text by using a preset regular expression to determine whether the first keyword exists in the first text.

Optionally, the entity category corresponding to the second keyword is movie or music; the second keyword corresponding to the movie entity category is "play" and/or "movie"; the second keyword corresponding to the music entity category is "listening" and/or "music".

In a possible implementation manner, the device is applied to an electronic device equipped with a sound receiver, and the acquisition module includes: a second acquisition unit for acquiring the voice collected by the sound receiver; and a second conversion unit for converting the voice into text to obtain the text to be recognized.

As shown in FIG. 7, there is also provided a schematic structural diagram of another named entity identification device. The named entity identification device 700 includes a first acquisition module 701, a detection module 702, a replacement module 703, and an identification module 704.

The first obtaining module is used to obtain the first text to be recognized corresponding to the current session; the detection module is used to detect whether there is a first keyword in the first text, where the first keyword is used to indicate that the first text is related to a multi-round conversation scenario; the replacement module is used to replace the first keyword in the first text with a second keyword when the first keyword exists in the first text, so as to obtain the second text to be recognized; the recognition module is used to recognize a named entity in the second text using a named entity recognition model at least according to the second keyword, where the second keyword is related to the entity category corresponding to the recognized named entity.

Optionally, the second keyword is a verb and/or noun.

For details of the named entity identification device 600 or the named entity identification device 700 provided in the embodiments of the present application, those skilled in the art can refer to the description of the named entity identification method provided in the embodiments of the present application. The corresponding content is apparent to those skilled in the art and will not be repeated here.

It should be understood that the division of the various modules of the device shown in FIG. 6 or FIG. 7 is only a division of logical functions; in actual implementation, the modules may be fully or partially integrated into one physical entity, or may be physically separated. These modules can all be implemented in the form of software called by processing elements, or all be implemented in the form of hardware; alternatively, some of the modules can be implemented in the form of software called by processing elements and the others in the form of hardware. For example, the determination module may be a separately established processing element, or it may be integrated in a communication device, such as a certain chip of a terminal. It may also be stored in the memory of the communication device in the form of a program, which a processing element of the communication device calls to execute the functions of the above modules. The implementation of the other modules is similar. In addition, all or some of these modules can be integrated together or implemented independently. The processing element described here may be an integrated circuit with signal processing capabilities. In the implementation process, each step of the above method or each of the above modules may be completed by an integrated logic circuit of hardware in the processing element or by instructions in the form of software.

For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more application specific integrated circuits (ASIC), one or more microprocessors (digital signal processor, DSP), or one or more field programmable gate arrays (FPGA). For another example, when one of the above modules is implemented in the form of a program scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor that can call programs. For another example, these modules can be integrated together and implemented in the form of a system-on-a-chip (SOC).

It should be understood that in this application, "a plurality of" refers to two or more than two, and other quantifiers are similar. "And/or" describes the corresponding relationship of the associated objects, indicating that there can be three types of relationships, for example, A and/or B, which can mean: A alone exists, A and B exist at the same time, and B exists alone. The character "/" generally indicates that the associated objects before and after are in an "or" relationship.

In this application, "at least one" refers to one or more, and "multiple" refers to two or more. "And/or" describes the association relationship of the associated objects, indicating that three relationships are possible. For example, A and/or B can mean: A exists alone, A and B exist at the same time, or B exists alone, where A and B can be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following items" or similar expressions refers to any combination of these items, including any combination of a single item or multiple items. For example, at least one of a, b, or c can mean: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c can be single or multiple.

It should be noted that the method and device for identifying named entities provided in the embodiments of the present application are merely examples, and the embodiments of the present application are not limited thereto.

The embodiments of the present application also provide a computer-readable storage medium in which a computer program is stored, and when the computer program is run on a computer, the computer executes the method described in the above-mentioned embodiment.

In addition, the embodiments of the present application also provide a computer program product, which includes a computer program, which when running on a computer, causes the computer to execute the method described in the foregoing embodiment.

The above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they can be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (such as coaxial cable, optical fiber, or digital subscriber line) or wireless (such as infrared, wireless, or microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk).

Claims

1. A method for identifying named entities, characterized in that the method includes:

Obtain the text to be recognized;

Determining the type of scene applied by the named entity recognition model used to recognize the named entity in the text to be recognized;

Inputting the to-be-recognized text and the scene type into the named entity recognition model;

Obtain the output information of the named entity recognition model to determine the named entity recognized by the named entity recognition model in the text to be recognized for the scene type.
2. The method of claim 1, wherein:

Before inputting the to-be-recognized text and the scene type into the named entity recognition model, the method further includes:

Marking the text to be recognized with a content index according to division units, wherein the division units with the same content are marked by the same content index;

Determine the scene type index corresponding to the scene type; mark the scene type index for each division unit;

Correspondingly, the inputting the text to be recognized and the scene type into the named entity recognition model includes:

Input the marked indexes of all the division units in the text to be recognized into the named entity recognition model.
3. The method according to claim 2, wherein after inputting the marked indexes of all the division units in the to-be-recognized text into the named entity recognition model, the processing method of the named entity recognition model comprises:

For each of the division units, convert the marked indexes of different types into multi-dimensional vectors;

For each of the division units, sequentially splicing multiple multi-dimensional vectors converted by different types of indexes;

Inputting the stitching vectors of all the division units in the text to be recognized into the sequence labeling model;

Obtain the output result of the sequence annotation model to obtain the annotation information of the named entity in the text to be recognized.
4. The method of claim 3, wherein the sequence labeling model is a deep learning model, and the deep learning model comprises:

One or more layers of recurrent neural network, where each layer of the recurrent neural network uses a neural network calculation unit to sequentially calculate the spliced vector of each of the division units in the text to be recognized, so as to output the calculation result vector corresponding to each of the division units;

A conditional random field, used to receive a vector sequence and output a label identifying whether each division unit is a named entity, where the vector sequence includes a plurality of the calculation result vectors arranged in order, and the plurality of calculation result vectors are the calculation result vectors output for all the division units by the last layer of the one or more layers of the recurrent neural network.
5. The method according to any one of claims 1 to 4, characterized in that, before the text to be recognized and the scene type are input into the named entity recognition model, the method further comprises: obtaining the first text to be recognized corresponding to the current session; detecting whether there is a first keyword in the first text, where the first keyword is used to indicate that the first text is related to a multi-round conversation scenario; if the first keyword exists in the first text, replacing the first keyword in the first text with a second keyword to obtain the second text to be recognized; correspondingly, inputting the text to be recognized and the scene type into the named entity recognition model includes: at least according to the second keyword, using the named entity recognition model to recognize a named entity in the second text, where the second keyword is related to the entity category corresponding to the recognized named entity.
6. A method for identifying named entities, characterized in that the method includes:

Obtain the first text to be recognized corresponding to this round of conversation;

Detecting whether there is a first keyword in the first text, where the first keyword is used to indicate that the first text is related to multiple rounds of conversation scenarios;

In the case where the first keyword exists in the first text, replacing the first keyword in the first text with a second keyword to obtain the second text to be recognized;

At least according to the second keyword, a named entity recognition model is used to identify a named entity in the second text, and the second keyword is related to the entity category corresponding to the identified named entity.
7. The method of claim 6, wherein before detecting whether the first keyword exists in the first text, the method further comprises:

Acquiring an intent analysis result of the third text to be recognized corresponding to the previous session of the current round of conversation, so as to obtain the intent type of the third text;

Determine whether the intent type of the third text is a designated intent type;

Correspondingly, if the intent type of the third text is the specified intent type, it is detected whether the first keyword exists in the first text.
8. The method according to claim 6 or 7, wherein after using the named entity recognition model to identify the named entity in the second text, the method further comprises:

In the first text, search for named entities recognized by the named entity recognition model for the second text;

If the named entity is not found in the first text, determining that the corresponding named entity is invalid;

If the named entity is found in the first text, it is determined that the corresponding named entity is valid.
9. The method according to any one of claims 6-8, wherein detecting whether the first keyword exists in the first text comprises: using a preset regular expression to perform matching in the text to be recognized, and determining whether the first keyword exists in the first text.
10. A device for identifying named entities, characterized in that the device includes:

The first obtaining module is used to obtain the text to be recognized;

The first determining module is configured to determine the type of scene applied by the named entity recognition model used to recognize the named entity in the text to be recognized;

An input module, configured to input the to-be-recognized text and the scene type into the named entity recognition model;

The execution module is configured to obtain output information of the named entity recognition model to determine the named entity recognized by the named entity recognition model in the text to be recognized for the scene type.
The device of claim 10, wherein the device further comprises:

The dividing module is configured to mark each division unit of the text to be recognized with a content index before the text to be recognized and the scene type are input into the named entity recognition model, wherein division units with the same content are marked with the same content index;

The second determining module is configured to determine the scene type index corresponding to the scene type, and to mark each division unit with the scene type index;

Correspondingly, the input module includes:

The first input unit is configured to input the marked indexes of all the division units in the text to be recognized into the named entity recognition model.
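The index marking described in this claim might look like the sketch below. It assumes, for illustration only, that a division unit is a single character and that the vocabularies are plain dictionaries; the function and parameter names are hypothetical.

```python
def mark_indexes(text, scene_type, content_vocab, scene_vocab):
    """Mark every division unit (assumed here: one character) with a content index and a scene type index."""
    scene_index = scene_vocab[scene_type]
    marked = []
    for unit in text:
        # Units with the same content share one content index;
        # an unseen unit receives the next free index.
        content_index = content_vocab.setdefault(unit, len(content_vocab))
        marked.append((content_index, scene_index))
    return marked
```

The same text marked for a different scene type yields the same content indexes but a different scene type index on every unit, which is how the scene information reaches the model input.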
The device of claim 11, wherein the named entity recognition model comprises:

The first conversion unit is configured to, after the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model, convert, for each division unit, each of the different types of marked indexes into a multi-dimensional vector;

The splicing unit is configured to splice in sequence, for each of the division units, the multi-dimensional vectors converted from the different types of indexes;

The second input unit is configured to input the spliced vectors of all the division units in the text to be recognized into a sequence labeling model;

The first acquiring unit is configured to acquire the output result of the sequence labeling model to obtain the labeling information of the named entity in the text to be recognized.
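The conversion and splicing steps can be illustrated with a small lookup-table sketch. The tables below are filled with random stand-in values, whereas a trained model would learn them; the table sizes, dimensions, and function names are assumptions made for the example.

```python
import random


def make_embedding_table(num_rows, dim, seed):
    """Build a stand-in lookup table; a trained model would learn these values."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_rows)]


def splice_unit_vectors(marked_units, tables):
    """For each division unit, look up one vector per index type and concatenate them in order."""
    spliced = []
    for indexes in marked_units:
        vector = []
        for index, table in zip(indexes, tables):
            vector.extend(table[index])
        spliced.append(vector)
    return spliced
```

With a 4-dimensional content table and a 2-dimensional scene table, each division unit ends up with a 6-dimensional spliced vector whose last two components are identical for all units that share a scene type.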
The device of claim 12, wherein the sequence labeling model is a deep learning model, and the deep learning model comprises:

One or more layers of recurrent neural network, wherein each layer of the recurrent neural network uses a neural network calculation unit to sequentially process the spliced vector of each of the division units in the text to be recognized, so as to output the calculation result vector corresponding to each of the division units;

A conditional random field, configured to receive a vector sequence comprising a plurality of the calculation result vectors arranged in order, the plurality of calculation result vectors being those output for all the division units by the last layer of the one or more layers of recurrent neural network, and to output a label used to identify whether each division unit is a named entity.
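The source does not state how the conditional random field selects its output labels. A common choice for a linear-chain conditional random field is Viterbi decoding, sketched below under that assumption; the emission and transition scores in the usage note are hypothetical values, and the recurrent layers that would produce the emissions are not shown.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label sequence for a linear-chain CRF.

    emissions[t][j]: score of label j for division unit t (e.g. derived from
    the calculation result vectors of the last recurrent layer).
    transitions[i][j]: score of moving from label i to label j.
    """
    num_labels = len(transitions)
    scores = list(emissions[0])
    backpointers = []
    for emission in emissions[1:]:
        step_scores, step_back = [], []
        for j in range(num_labels):
            # Best previous label for ending in label j at this position.
            best_prev = max(range(num_labels), key=lambda i: scores[i] + transitions[i][j])
            step_back.append(best_prev)
            step_scores.append(scores[best_prev] + transitions[best_prev][j] + emission[j])
        scores = step_scores
        backpointers.append(step_back)
    # Trace the best path back from the best final label.
    path = [max(range(num_labels), key=lambda j: scores[j])]
    for step_back in reversed(backpointers):
        path.append(step_back[path[-1]])
    path.reverse()
    return path
```

For example, with labels 0 = outside, 1 = entity start, 2 = entity continuation, and a strongly negative transition score from label 0 to label 2, the decoder does not place a continuation label directly after an outside label even when that label has the highest emission score at its position.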
The device according to claim 13, wherein the recurrent neural network is a long short-term memory (LSTM) neural network or a bidirectional long short-term memory (BiLSTM) neural network.
The device according to claim 13 or 14, wherein the deep learning model is a model obtained by unsupervised pre-training using a corpus.
The device according to any one of claims 11-15, wherein the device further comprises:

The first labeling module is configured to label each of the division units in the text to be recognized with a part-of-speech index before the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model;

Correspondingly, the index marked for each division unit input to the named entity recognition model includes the part-of-speech index.
The device according to any one of claims 11-16, wherein the device further comprises:

The third determining module is configured to match the text to be recognized against a preset named entity dictionary before the marked indexes of all the division units in the text to be recognized are input into the named entity recognition model, and to determine all named entities matched in the text to be recognized;

The second labeling module is configured to label each of the division units in the text to be recognized with a knowledge index, wherein the knowledge index is used to indicate, in the named entity recognition model, information of the named entity matched by the corresponding division unit;

Correspondingly, the index marked for each division unit input to the named entity recognition model includes the knowledge index.
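The dictionary matching and knowledge-index labeling of this claim could be sketched as below. The encoding of the knowledge index (0 for no match, otherwise a per-category number), the character-level division unit, and the longest-entry-first matching order are all assumptions made for illustration.

```python
# Hypothetical knowledge indexes: 0 means the unit is not part of any
# dictionary entry; other values identify the matched entry's category.
CATEGORY_INDEX = {"movie": 1, "music": 2}


def mark_knowledge_indexes(text, entity_dictionary):
    """Give each division unit (assumed: one character) the index of the dictionary entity covering it."""
    knowledge = [0] * len(text)
    # Try longer entries first so that a longer match is not overwritten by a shorter one.
    for name in sorted(entity_dictionary, key=len, reverse=True):
        start = text.find(name)
        while start != -1:
            end = start + len(name)
            if all(k == 0 for k in knowledge[start:end]):
                knowledge[start:end] = [CATEGORY_INDEX[entity_dictionary[name]]] * len(name)
            start = text.find(name, start + 1)
    return knowledge
```

The resulting list has one knowledge index per division unit and can be passed to the model alongside the content, scene type, and part-of-speech indexes.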
The device according to any one of claims 10-17, wherein the device further comprises:

The second obtaining module is configured to obtain the first text to be recognized corresponding to the current round of conversation before the text to be recognized and the scene type are input into the named entity recognition model;

A detection module, configured to detect whether a first keyword exists in the first text, wherein the first keyword is used to indicate that the first text relates to a multi-round conversation scenario;

The replacement module is configured to replace the first keyword in the first text with a second keyword when the first keyword exists in the first text, so as to obtain a second text to be recognized;

Correspondingly, the named entity recognition model is also used to identify a named entity in the second text at least according to the second keyword and the scene type, wherein the second keyword is related to the entity category corresponding to the recognized named entity.
The device according to any one of claims 10-18, wherein the scene type is classified according to the type of terminal to which the named entity recognition model is applied and/or the type of application software.
The device according to any one of claims 10-19, wherein the device is applied to an electronic device equipped with a sound receiver, and the first obtaining module comprises:

The second acquiring unit is configured to acquire the voice collected by the sound receiver;

The second conversion unit is used to convert the voice into text to obtain the text to be recognized.
A device for identifying named entities, characterized in that the device comprises:

The first obtaining module is configured to obtain the first text to be recognized corresponding to the current round of conversation;

A detection module, configured to detect whether a first keyword exists in the first text, wherein the first keyword is used to indicate that the first text relates to a multi-round conversation scenario;

The replacement module is configured to replace the first keyword in the first text with a second keyword when the first keyword exists in the first text, so as to obtain a second text to be recognized;

The recognition module is configured to recognize a named entity in the second text using a named entity recognition model at least according to the second keyword, wherein the second keyword is related to the entity category corresponding to the recognized named entity.
The device of claim 21, wherein the device further comprises:

The second acquiring module is configured to acquire, before detecting whether the first keyword exists in the first text, the intent analysis result of a third text to be recognized corresponding to the round of conversation preceding the current round, so as to obtain the intent type of the third text;

The judgment module is configured to judge whether the intent type of the third text is a specified intent type; correspondingly, if the intent type of the third text is the specified intent type, the detection module detects whether the first keyword exists in the first text.
The device according to claim 21 or 22, wherein the device further comprises:

The search module is configured to search the first text, after the named entity recognition model is used to identify the named entity in the second text, for the named entity recognized by the named entity recognition model in the second text;

A first determining module, configured to determine that the corresponding named entity is invalid if the named entity is not found in the first text;

The second determining module is configured to determine that the corresponding named entity is valid if the named entity is found in the first text.
The device according to any one of claims 21-23, wherein the second keyword is determined according to an intent analysis result corresponding to the previous round of the current round of conversation.
The device of claim 24, wherein the second keyword is a verb and/or a noun.
The device according to any one of claims 21-25, wherein the first keyword is used to indicate that the intent corresponding to the first text is incomplete.
The device according to any one of claims 21-26, wherein the detection module comprises:

The matching unit is configured to perform matching in the text to be recognized by using a preset regular expression to determine whether the first keyword exists in the first text.
The device according to any one of claims 21-27, wherein the entity category corresponding to the second keyword is a movie category or a music category; the second keyword corresponding to the movie category is "play" and/or "movie"; and the second keyword corresponding to the music category is "listen" and/or "music".
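Choosing the second keyword from the previous round's intent analysis result, and skipping detection when that intent is not a specified intent type, can be sketched as follows. The keyword values come from the claim above; the intent names, the choice of one keyword per category, and the function name are hypothetical.

```python
# Keyword values follow the claim above; picking "play" and "listen" rather
# than "movie" and "music" is an arbitrary choice for this sketch.
SECOND_KEYWORDS = {"movie": "play", "music": "listen"}

# Hypothetical intent type names mapped to entity categories.
INTENT_TO_CATEGORY = {"PLAY_MOVIE": "movie", "PLAY_MUSIC": "music"}


def select_second_keyword(previous_intent):
    """Return the second keyword for the previous round's intent, or None if it is not a specified intent type."""
    category = INTENT_TO_CATEGORY.get(previous_intent)
    if category is None:
        # Not a specified intent type: first-keyword detection is skipped.
        return None
    return SECOND_KEYWORDS[category]
```

A caller would run first-keyword detection and replacement only when this function returns a keyword.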
A device for identifying named entities, characterized in that the device includes:

One or more processors; a memory; multiple application programs; and one or more computer programs, wherein the one or more computer programs are stored in the memory and include instructions which, when executed by the device, cause the device to execute the method according to any one of claims 1-9.
A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium and, when run on a computer, causes the computer to execute the method according to any one of claims 1-9.