CN110781682B - Named entity recognition model training method, recognition method, device and electronic equipment - Google Patents


Info

Publication number
CN110781682B
CN110781682B (application CN201911010612.8A)
Authority
CN
China
Prior art keywords
word
recognition model
named entity
prediction
entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911010612.8A
Other languages
Chinese (zh)
Other versions
CN110781682A (en)
Inventor
郑孙聪
周博通
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201911010612.8A priority Critical patent/CN110781682B/en
Publication of CN110781682A publication Critical patent/CN110781682A/en
Application granted granted Critical
Publication of CN110781682B publication Critical patent/CN110781682B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Abstract

The invention provides a named entity recognition model training method, a recognition method, a device, an electronic device and a storage medium. The method comprises the following steps: acquiring a plurality of text sentences, and setting an annotation label for each word in each text sentence according to an entity dictionary; updating the weight parameters of the named entity recognition model according to the plurality of text sentences, and performing prediction processing on each text sentence with the test recognition model obtained after the updating, so as to obtain a prediction label for each word in each text sentence; when the annotation label corresponding to a word in a text sentence differs from its prediction label, setting an ambiguity label for the word; and updating the named entity recognition model according to the words without ambiguity labels in the plurality of text sentences. The method and the device can improve the training effect of the named entity recognition model.

Description

Named entity recognition model training method, recognition method, device and electronic equipment
Technical Field
The invention relates to artificial intelligence technology, and in particular to a named entity recognition model training method, a recognition method based on a named entity recognition model, a device, an electronic device, and a storage medium.
Background
Artificial Intelligence (AI) comprises theories, methods, techniques, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results. Natural Language Processing (NLP) is an important direction in artificial intelligence that mainly studies theories and methods for realizing efficient communication between humans and computers using natural language.
Named entity recognition is a branch of natural language processing and refers to recognizing the entities with particular meaning in text sentences, such as names of people and places. In the solutions provided in the related art, data is usually labeled according to an entity dictionary, a named entity recognition model is trained on the labeled data, and the trained model is then used to achieve the corresponding recognition purpose. However, the entity dictionary cannot cover all named entities, so the labeled data may be inaccurate, the accuracy of model training is low, and the obtained named entity recognition model has a poor recognition effect.
Disclosure of Invention
The embodiments of the invention provide a named entity recognition model training method, a recognition method based on a named entity recognition model, a device, an electronic device, and a storage medium, which can improve the accuracy of model training and of named entity recognition performed with the model.
The technical scheme of the embodiment of the invention is realized as follows:
the embodiment of the invention provides a named entity recognition model training method, which comprises the following steps:
acquiring a plurality of text sentences, and setting a label for each word in each text sentence according to an entity dictionary;
updating the weight parameters of the named entity recognition model according to the plurality of text sentences, and performing prediction processing on each text sentence with the test recognition model obtained after the updating, so as to obtain a prediction label for each word in each text sentence;
when the label corresponding to the word in the text sentence is different from the prediction label, setting an ambiguous label for the word;
and updating the named entity recognition model according to words without ambiguity labels in the plurality of text sentences.
The embodiment of the invention provides a recognition method based on a named entity recognition model, which comprises the following steps:
performing prediction processing on a text sentence through the named entity recognition model to obtain a prediction label for each word in the text sentence;
determining a word whose prediction label is an entity prediction label and whose entity position, included in the prediction label, is first as a first word;
traversing backwards from the first word in the text sentence;
when the prediction label corresponding to a traversed word is an entity prediction label, the entity position included in the prediction label is non-first, and the entity type included in the prediction label is the same as that of the first word, determining the traversed word as a non-first word;
and jointly determining the first word and the corresponding non-first word as a named entity.
The embodiment of the invention provides a named entity recognition model training device, which comprises:
the labeling module is used for acquiring a plurality of text sentences and setting labeling labels for all words in all the text sentences according to the entity dictionary;
the training prediction module is used for updating the weight parameters of the named entity recognition model according to the plurality of text sentences and performing prediction processing on each text sentence with the test recognition model obtained after the updating, so as to obtain a prediction label for each word in each text sentence;
the ambiguity setting module is used for setting an ambiguity label for the word when the label corresponding to the word in the text sentence is different from the prediction label;
and the updating module is used for updating the named entity recognition model according to the words without the ambiguity labels in the text sentences.
The embodiment of the invention provides a recognition device based on a named entity recognition model, which comprises:
the system comprises a prediction module, a search module and a recognition module, wherein the prediction module is used for performing prediction processing on a text sentence through a named entity recognition model to obtain a prediction tag of each character in the text sentence;
the first-word determining module is used for determining a word whose prediction label is an entity prediction label and whose entity position included in the prediction label is first as a first word;
the traversal module is used for traversing backwards from the first word in the text sentence;
the non-first-word determining module is used for determining a traversed word as a non-first word when the prediction label corresponding to the traversed word is an entity prediction label, the entity position included in the prediction label is non-first, and the entity type included in the prediction label is the same as that of the first word;
and the entity determining module is used for determining the first word and the corresponding non-first word as a named entity.
An embodiment of the present invention provides an electronic device, including:
a memory for storing executable instructions;
and the processor is used for implementing the named entity recognition model training method provided by the embodiment of the invention or the recognition method based on the named entity recognition model when the executable instruction stored in the memory is executed.
The embodiment of the invention provides a storage medium, which stores executable instructions and is used for causing a processor to execute the executable instructions so as to realize the named entity recognition model training method or the recognition method based on the named entity recognition model provided by the embodiment of the invention.
The embodiment of the invention has the following beneficial effects:
the method and the device for identifying the named entity provided by the embodiment of the invention set the label tag for each word in the text sentence, update the named entity identification model according to the text sentences to obtain the test identification model, obtain the prediction tag of each word in the text sentences according to the test identification model, set the ambiguity tag for the word with the label tag different from the prediction tag, and update the named entity identification model according to the word without the ambiguity tag.
Drawings
FIG. 1 is an alternative architecture diagram of a named entity recognition model training system according to an embodiment of the present invention;
FIG. 2A is a schematic diagram of an alternative architecture of a server according to an embodiment of the present invention;
FIG. 2B is a schematic diagram of an alternative architecture of a server provided by an embodiment of the invention;
FIG. 3 is an alternative architecture diagram of a named entity recognition model training apparatus according to an embodiment of the present invention;
FIG. 4A is an alternative flow chart of a named entity recognition model training method according to an embodiment of the present invention;
FIG. 4B is a schematic flow chart diagram illustrating an alternative method for training a named entity recognition model according to an embodiment of the present invention;
FIG. 4C is a schematic flow chart diagram illustrating an alternative method for training a named entity recognition model according to an embodiment of the present invention;
FIG. 5 is an alternative flow chart of the named entity recognition model-based recognition method according to an embodiment of the present invention;
fig. 6 is a schematic flowchart of another alternative method for training a named entity recognition model according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention will be further described in detail with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing embodiments of the invention only and is not intended to be limiting of the invention.
Before further detailed description of the embodiments of the present invention, terms and expressions mentioned in the embodiments of the present invention are explained, and the terms and expressions mentioned in the embodiments of the present invention are applied to the following explanations.
1) Named entity: an entity identified by a name, such as a person name, an organization name, or a place name.
2) Named Entity Recognition (NER): recognizing the named entities in text sentences. It can be realized by training a corresponding model, such as a Conditional Random Field (CRF) model, a Bi-directional Long Short-Term Memory (Bi-LSTM) network, or a Bidirectional Encoder Representations from Transformers (BERT) model, which is not limited in the embodiments of the present invention.
3) BIO labeling System: one way to label an element in a text sentence is to label the element as "B-X", "I-X", or "O", where "B" in "B-X" indicates that the entity position of the element is first, "I" in "I-X" indicates that the entity position of the element is not first, "X" in "B-X" and "I-X" indicates that the entity type of the element is X type, and "O" indicates that the element does not belong to any type. Wherein the elements may be words in a text sentence.
4) Remote supervision annotation data: data obtained by performing string matching on text sentences with a known entity dictionary, labeling a named entity contained in a text sentence with its corresponding type when it appears in the entity dictionary, and labeling the remaining, unmatched characters as well. For example, suppose the entity dictionary is a large set of person names, "Zhang San" is a named entity in the dictionary, and the text sentence is "Zhang San is a famous singer." (labeled character by character in the original Chinese). Matching against the dictionary yields the remote supervision annotation data "Zhang/B-PER San/I-PER is/O a/O famous/O singer/O ./O", where "PER" denotes the person type.
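The dictionary matching just described can be sketched as follows (the greedy longest-match strategy is an assumption; the patent does not specify the matching algorithm, and labels are per character to mirror the Chinese original):

```python
def bio_label(sentence, entity_dict):
    """Return one BIO tag per character by matching a known entity
    dictionary against the sentence (greedy longest match, an assumed
    strategy); entity_dict maps entity string -> entity type."""
    tags = ["O"] * len(sentence)
    i = 0
    while i < len(sentence):
        match = None
        # Pick the longest dictionary entry starting at position i.
        for entity, etype in entity_dict.items():
            if sentence.startswith(entity, i):
                if match is None or len(entity) > len(match[0]):
                    match = (entity, etype)
        if match:
            entity, etype = match
            tags[i] = f"B-{etype}"                       # entity position: first
            for j in range(i + 1, i + len(entity)):
                tags[j] = f"I-{etype}"                   # entity position: non-first
            i += len(entity)
        else:
            i += 1                                       # unmatched character
    return tags

# labels[0] == "B-PER" and labels[1:8] are all "I-PER"; the rest are "O"
labels = bio_label("ZhangSan is a singer", {"ZhangSan": "PER"})
```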
The inventors found that training data for the training task of a named entity recognition model is constructed mainly by manual annotation or automatic annotation. In automatic annotation, the data is labeled according to an entity dictionary, the recognition model is trained on the resulting remote supervision annotation data, and the trained recognition model is then used to achieve the corresponding recognition purpose. However, the entity dictionary is not comprehensive, so some important information in the text sentences is easily lost.
For example, suppose the entity dictionary is an unambiguous person-name dictionary, that is, every named entity in it refers only to a person, and suppose it includes "Zhang San" but not "Lian Qiao" (a person name that is also a common word, i.e., an ambiguous named entity; the literal reading of the Chinese name is the plant forsythia). Matching the text sentence "Zhang San and Lian Qiao starred in a movie." against the dictionary yields remote supervision annotation data in which "Zhang San" is labeled with the "PER" type but "Lian Qiao" is labeled with the "O" type, although in fact "Lian Qiao" should, like "Zhang San", be labeled with the "PER" type. In conclusion, labeling with an entity dictionary may produce incorrect remote supervision annotation data, so that model training performed on such data has a poor training effect and the obtained model has low recognition accuracy.
Embodiments of the present invention provide a method, a device, an electronic device, and a storage medium for training a named entity recognition model, which can improve accuracy of model training and recognition effect of the trained model.
Referring to fig. 1, fig. 1 is an alternative architecture diagram of a named entity recognition model training system 100 according to an embodiment of the present invention, in order to implement supporting a named entity recognition model training application, a terminal 400 (exemplary terminals 400-1 and 400-2 are shown) is connected to a server 200 through a network 300, the server 200 is connected to a database 500, and the network 300 may be a wide area network or a local area network, or a combination of the two.
The server 200 is configured to obtain an entity dictionary and a plurality of text sentences from the database 500, and set an annotation label for each word in each text sentence according to the entity dictionary; update the weight parameters of the named entity recognition model according to the plurality of text sentences, and perform prediction processing on each text sentence with the test recognition model obtained after the updating to obtain a prediction label for each word in each text sentence; when the annotation label corresponding to a word in a text sentence differs from its prediction label, set an ambiguity label for the word; and update the named entity recognition model according to the words without ambiguity labels in the plurality of text sentences.
The terminal 400 is configured to send a text sentence to the server 200. The server 200 is further configured to perform prediction processing on the text sentence through the updated named entity recognition model to obtain a prediction label for each word in the text sentence; determine a word whose prediction label is an entity prediction label and whose entity position included in the prediction label is first as a first word; traverse backwards from the first word in the text sentence; when the prediction label corresponding to a traversed word is an entity prediction label, the entity position included in the prediction label is non-first, and the entity type included in the prediction label is the same as that of the first word, determine the traversed word as a non-first word; determine the first word and the corresponding non-first word jointly as a named entity, and send the named entity to the terminal 400. The terminal 400 is further configured to display the text sentence and the text sentence labeled with the named entity on a graphical interface 410 (graphical interfaces 410-1 and 410-2 are shown as examples).
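The first-word/non-first-word grouping performed by the server can be sketched as follows (an illustrative reading of the steps above under the BIO scheme, not the actual implementation):

```python
def decode_entities(chars, tags):
    """Group B-X / I-X tags back into (entity_text, entity_type) spans:
    a B-X tag starts an entity (first word); subsequent I-X tags of the
    same type X extend it (non-first words)."""
    entities = []
    i = 0
    while i < len(tags):
        if tags[i].startswith("B-"):                 # first word of an entity
            etype = tags[i][2:]
            j = i + 1
            # extend while the traversed tag is I-<same type>
            while j < len(tags) and tags[j] == f"I-{etype}":
                j += 1
            entities.append(("".join(chars[i:j]), etype))
            i = j
        else:
            i += 1                                   # non-entity position
    return entities

# e.g. decode_entities(list("abcde"), ["B-PER","I-PER","O","O","O"])
# groups "ab" into one PER entity
```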
In some practical application scenarios of named entity recognition, the server 200 may process the named entity in the text sentence and then send the processing result to the terminal 400. For example, in a question-answering scenario, the server 200 may query a knowledge graph according to the recognized named entity to obtain a response sentence, and send the response sentence to the terminal 400.
The following continues to illustrate exemplary applications of the electronic device provided by embodiments of the present invention. The electronic device may be implemented as various types of terminal devices such as a notebook computer, a tablet computer, a desktop computer, a set-top box, a mobile device (e.g., a mobile phone, a portable music player, a personal digital assistant, a dedicated messaging device, a portable game device), and the like, and may also be implemented as a server. Next, an electronic device will be described as an example of a server.
Referring to fig. 2A, fig. 2A is a schematic diagram of an architecture of a server 200 (for example, the server 200 shown in fig. 1) provided by an embodiment of the present invention. The server 200 shown in fig. 2A includes: at least one processor 210, a memory 250, at least one network interface 220, and a user interface 230. The various components in the server 200 are coupled together by a bus system 240. It is understood that the bus system 240 is used to enable connected communication between these components. The bus system 240 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are all labeled as the bus system 240 in fig. 2A.
The processor 210 may be an integrated circuit chip having signal processing capabilities, such as a general-purpose processor, a Digital Signal Processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, where the general-purpose processor may be a microprocessor or any conventional processor.
The user interface 230 includes one or more output devices 231, including one or more speakers and/or one or more visual display screens, that enable the presentation of media content. The user interface 230 also includes one or more input devices 232, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
The memory 250 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 250 optionally includes one or more storage devices physically located remotely from processor 210.
The memory 250 includes volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile Memory may be a Read Only Memory (ROM), and the volatile Memory may be a Random Access Memory (RAM). The memory 250 described in embodiments of the invention is intended to comprise any suitable type of memory.
In some embodiments, memory 250 may be capable of storing data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 251 including system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks;
a network communication module 252 for communicating with other computing devices via one or more (wired or wireless) network interfaces 220, exemplary network interfaces 220 including: Bluetooth, Wireless Fidelity (WiFi), Universal Serial Bus (USB), etc.;
a presentation module 253 to enable presentation of information (e.g., a user interface for operating peripherals and displaying content and information) via one or more output devices 231 (e.g., a display screen, speakers, etc.) associated with the user interface 230;
an input processing module 254 for detecting one or more user inputs or interactions from one of the one or more input devices 232 and translating the detected inputs or interactions.
In some embodiments, the named entity recognition model training apparatus provided by the embodiments of the present invention may be implemented in software, and fig. 2A illustrates a named entity recognition model training apparatus 2550 stored in a memory 250, which may be software in the form of programs and plug-ins, and includes the following software modules: the annotation module 25501, the training prediction module 25502, the ambiguity setting module 25503 and the update module 25504 are logical and therefore can be arbitrarily combined or further split according to the implemented functions.
In some embodiments, the recognition device based on the named entity recognition model provided in the embodiments of the present invention may also be implemented in a software manner, referring to fig. 2B, the remaining parts of fig. 2B except the illustrated recognition device 2551 based on the named entity recognition model may be the same as those of fig. 2A, and are not described herein again. For the named entity recognition model based recognition means 2551 stored in the memory 250, which may be software in the form of programs and plug-ins or the like, the following software modules are included: prediction module 25511, first order determination module 25512, traversal module 25513, non-first order determination module 25514, and entity determination module 25515, which are logical and therefore can be arbitrarily combined or further split depending on the functionality implemented.
The functions of the respective modules will be explained below.
In other embodiments, the named entity recognition model training device and the recognition device based on the named entity recognition model provided in the embodiments of the present invention may be implemented in hardware. For example, the named entity recognition model training device provided in the embodiments of the present invention may be a processor in the form of a hardware decoding processor, programmed to execute the named entity recognition model training method provided in the embodiments of the present invention; likewise, the recognition device based on the named entity recognition model may be a processor in the form of a hardware decoding processor, programmed to execute the recognition method based on the named entity recognition model provided in the embodiments of the present invention. For example, a processor in the form of a hardware decoding processor may employ one or more Application-Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field-Programmable Gate Arrays (FPGAs), or other electronic components.
The named entity recognition model training method and the recognition method based on the named entity recognition model provided by the embodiment of the invention can be executed by the server, or can be executed by terminal equipment (for example, the terminal 400-1 and the terminal 400-2 shown in fig. 1), or can be executed by both the server and the terminal equipment.
The following describes a process of implementing the named entity recognition model training method by an embedded named entity recognition model training apparatus in an electronic device, in conjunction with the above-mentioned exemplary application and structure of the electronic device.
Referring to fig. 3 and fig. 4A, fig. 3 is a schematic architecture diagram of a named entity recognition model training device 2550 according to an embodiment of the present invention, which shows a process of implementing named entity recognition model training through a series of modules, and fig. 4A is a schematic flowchart of a named entity recognition model training method according to an embodiment of the present invention, and the steps shown in fig. 4A will be described with reference to fig. 3.
In step 101, a plurality of text sentences are obtained, and labeling labels are set for words in each text sentence according to an entity dictionary.
When training a named entity recognition model, firstly, a plurality of unlabelled text sentences are obtained, and labeling labels are set for words in each text sentence according to an entity dictionary, wherein the entity dictionary can be an unambiguous named entity dictionary, namely, named entities in the entity dictionary only have unique meanings, and in addition, the labeling labels can be set according to a BIO labeling system.
In some embodiments, the setting of the label tag for each word in each text sentence according to the entity dictionary may be implemented by: matching each text sentence according to an entity dictionary, and determining a named entity which is successfully matched in each text sentence; setting an entity label for the word included in the named entity; the entity labeling label comprises an entity position and an entity type of a corresponding word; and setting a non-entity labeling label for the words except the named entity in the text sentence.
For example, referring to fig. 3, in the tagging module 25501, matching processing is performed on each text sentence according to the entity dictionary, and the named entities successfully matched in each text sentence are determined. An entity annotation label corresponding to each named entity is set according to the entity dictionary; it comprises the entity position and entity type of the corresponding word, where the entity position is either first or non-first, and the entity type is, for example, the person type or another type. After a match succeeds, corresponding entity annotation labels are set for the words included in the named entities of the text sentence, and non-entity annotation labels are set for the other words of the text sentence. In this way, the unambiguous named entities in a text sentence can be determined and automatic annotation is realized.
In step 102, updating the weight parameters of the named entity recognition model according to the plurality of text sentences, and performing prediction processing on each text sentence according to the test recognition model obtained after the updating processing to obtain a prediction tag of each character in each text sentence.
Here, the weight parameters of the initial named entity recognition model may be updated according to all of the text sentences, or part of the text sentences may be used as training data, as described later. For ease of distinction, the model obtained through the updating in this step is called the test recognition model; prediction processing is performed on each text sentence with the test recognition model to obtain a prediction label for each word in each text sentence.
In step 103, when the label tag corresponding to the word in the text sentence is different from the prediction tag, an ambiguity tag is set for the word.
For example, referring to fig. 3, in the ambiguity setting module 25503, a tag corresponding to each word in a text sentence is compared with a predicted tag, and when the tag is different from the predicted tag, it is proved that the entity in which the word is located is ambiguous, and an ambiguity tag is set for the word.
In some embodiments, the above-mentioned setting an ambiguity label for a word in the text sentence when the annotation label corresponding to the word is different from the prediction label may be implemented in such a manner that: when the label corresponding to the word in the text sentence is the non-entity label and the corresponding prediction label is the entity prediction label, setting an ambiguous label for the word; wherein the entity prediction tag comprises an entity location and an entity type of the word.
In the embodiment of the present invention, when the annotation label corresponding to a word in the text sentence is a non-entity annotation label and the corresponding prediction label is an entity prediction label, the entity in which the word is located is determined to be ambiguous, and an ambiguity label is set for the word. Conversely, when the annotation label corresponding to the word is an entity annotation label and the corresponding prediction label is a non-entity prediction label, then, because the entity dictionary is an unambiguous dictionary, the entity annotation label is determined to be correct, the prediction of the test recognition model is determined to be incorrect, and no ambiguity label is set for the word. In this way, the accuracy of setting ambiguity labels is improved.
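This one-directional comparison rule can be sketched as follows (illustrative only; `"O"` stands for the non-entity label of the BIO scheme):

```python
def mark_ambiguous(annot_tags, pred_tags):
    """Return one flag per position: True where the dictionary annotation
    says non-entity ("O") but the test model predicts an entity tag.
    Only this direction triggers an ambiguity flag, because the entity
    labels of the unambiguous dictionary are trusted over the model."""
    return [a == "O" and p != "O" for a, p in zip(annot_tags, pred_tags)]
```

Words flagged `True` here receive the ambiguity label and are excluded from the subsequent model update.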
In step 104, the named entity recognition model is updated according to words without ambiguity labels in the text sentences.
After the tags of the words in the text sentences are compared, the initial named entity recognition model is updated according to the words without ambiguity labels in the text sentences; the specific updating operation is described later.
As can be known from the above exemplary implementation of fig. 4A, in the embodiment of the present invention, by recognizing words included in an ambiguous entity and setting an ambiguity label for such words, the words do not participate in the updating process of the named entity recognition model, so that the accuracy of training the named entity recognition model is improved.
In some embodiments, referring to fig. 4B, fig. 4B is another optional flowchart of the named entity recognition model training method provided in the embodiment of the present invention, and step 104 shown in fig. 4A may be implemented through steps 201 to 206, which will be described with reference to each step.
In step 201, adding a plurality of text sentences to a training set and a test set respectively; wherein the training set comprises a greater number of text sentences than the test set.
For example, the ratio of the number of text sentences in the training set to the number in the test set is set to 9:1. The training set serves as the training data of the named entity recognition model, and the test set is used to evaluate its training effect.
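A 9:1 split might look like the following minimal sketch (illustrative only; the patent does not specify how sentences are assigned to the two sets, so the shuffle-then-cut policy and all names are assumptions):

```python
import random

def split_train_test(sentences, train_ratio=0.9, seed=0):
    """Shuffle the corpus and split it into a training set and a test set
    at the given ratio (9:1 by default)."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    shuffled = sentences[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```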
In step 202, the text sentence in the training set is subjected to prediction processing through the named entity recognition model, so as to obtain a prediction tag of each word in the text sentence.
For example, referring to fig. 3, in the updating module 25504, each text sentence in the training set is subjected to prediction processing by the named entity recognition model, so as to obtain a prediction tag of each word in each text sentence.
In step 203, the difference between the annotation tag and the predictive tag corresponding to the word without the ambiguous tag is determined.
Because the entity containing a word with an ambiguity label is itself ambiguous, such words do not participate in the training of the named entity recognition model. In this step, the difference between the annotation tag and the prediction tag is determined only for words without an ambiguity label. In practical applications, numeric values may be preset for the different annotation and prediction tags, for example 1 for an entity tag and 0 for a non-entity tag, so that the difference between the annotation tag and the prediction tag can be computed numerically. Of course, the values may be further refined according to entity position and entity type, which is not limited in the embodiment of the present invention.
In step 204, performing back propagation in the named entity recognition model according to the difference, and updating the weight parameters of the named entity recognition model in the process of back propagation until all the text sentences in the training set are traversed.
And performing back propagation in the named entity recognition model according to the obtained difference, and calculating a gradient in the process of back propagation so as to update the weight parameters of the named entity recognition model along the gradient descending direction until all the text sentences in the training set are traversed, thereby obtaining the updated named entity recognition model as shown in an updating module 25504 of fig. 3.
In step 205, the effective recognition rate of the updated named entity recognition model is determined according to the test set.
For example, referring to fig. 3, in the updating module 25504, the updated named entity recognition model is tested against the test set to obtain the effective recognition rate.
In some embodiments, the above-mentioned determining the updated recognition efficiency of the named entity recognition model based on the test set may be implemented in such a way that: predicting the text sentences in the test set through the updated named entity recognition model to obtain prediction labels of all characters in the text sentences; determining the accuracy and recall rate of the updated named entity recognition model according to the label labels and the prediction labels corresponding to the characters without the ambiguous labels; and determining the effective identification rate of the updated named entity identification model according to the accuracy and the recall rate.
In the embodiment of the invention, the effective recognition rate may be the precision, the recall, or a fusion of the two. Specifically, each text sentence in the test set is subjected to prediction processing by the updated named entity recognition model to obtain a prediction tag for each word in each text sentence. Then the ratio (number of words whose annotation tag and prediction tag are both entity tags) / (number of words whose prediction tag is an entity tag) is computed and taken as the precision, and the ratio (number of words whose annotation tag and prediction tag are both entity tags) / (number of words whose annotation tag is an entity tag) is computed and taken as the recall. The precision and the recall are then fused into the effective recognition rate, for example by computing the value 2 × (precision × recall) / (precision + recall). In this way precision and recall are balanced, so that the computed effective recognition rate can effectively reflect the training effect of the named entity recognition model.
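The precision, recall and fused value described here can be sketched as follows (Python and the identifiers are assumptions, not from the patent; the fused value 2 × (precision × recall) / (precision + recall) is the familiar token-level F1 score, with "entity tag" meaning any tag other than "O"):

```python
def effective_rate(gold_tags, pred_tags, masked=None):
    """Token-level precision, recall and their harmonic-mean fusion,
    computed only over word positions without an ambiguity label."""
    masked = masked or set()  # indices of words carrying the MASK label
    tp = fp = fn = 0
    for i, (g, p) in enumerate(zip(gold_tags, pred_tags)):
        if i in masked:
            continue  # ambiguous words are excluded from evaluation
        if g != "O" and p != "O":
            tp += 1  # annotation and prediction are both entity tags
        elif g == "O" and p != "O":
            fp += 1  # predicted entity, annotated non-entity
        elif g != "O" and p == "O":
            fn += 1  # annotated entity, predicted non-entity
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

A stricter evaluation would additionally require the entity type and position to match; the sketch keeps only the counting scheme the paragraph describes.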
In step 206, when the effective recognition rate does not exceed the rate threshold, prediction processing is performed on the training set again according to the updated named entity recognition model, until the effective recognition rate exceeds the threshold.
A rate threshold is set, for example 70%. When the effective recognition rate does not exceed this threshold, the operations of steps 202-205 are performed again with the updated named entity recognition model, until the effective recognition rate exceeds the threshold.
As can be seen from the above exemplary implementation of fig. 4B, in the embodiment of the present invention, the training set and the test set are divided, the named entity recognition model is trained according to the training set, and the recognition efficiency is obtained according to the test set, and the training is continued when the recognition efficiency does not reach the standard, so that the training effect of model training is improved.
In some embodiments, referring to fig. 4C, fig. 4C is another optional flowchart of the named entity recognition model training method provided in the embodiment of the present invention, and step 102 shown in fig. 4A may be implemented by steps 301 to 302, which will be described in conjunction with the steps.
In step 301, adding a plurality of text sentences to N sentence sets on average; wherein N is an integer greater than 1.
For example, N may be 10, and multiple text sentences are added to 10 sentence sets on average, i.e., each sentence set includes the same number of text sentences.
In step 302, the weight parameters of the initial named entity recognition model are updated in turn according to N-1 of the sentence sets, and prediction processing is performed on the remaining sentence set according to the test recognition model obtained after the updating, until prediction tags are obtained for each word included in each sentence set.
For example, referring to fig. 3, in the training prediction module 25502, the weight parameters of the named entity recognition model are updated according to N-1 sentence sets, the remaining sentence set is predicted according to the test recognition model obtained after the updating, and these operations are repeated until the prediction tags of the words included in each sentence set are obtained.
It is worth noting that for different sets of N-1 sentences, the corresponding training prediction processes do not interfere with each other. For example, the sentence set includes sentence set 1 to sentence set 10, in a training process, the initial named entity recognition model is updated according to sentence set 1 to sentence set 9, and prediction processing is performed on sentence set 10 according to the obtained test recognition model, so as to obtain a prediction label of each word included in sentence set 10; in another training process, the initial named entity recognition model is updated according to the sentence sets 2 to 10, and the sentence set 1 is subjected to prediction processing according to the obtained test recognition model, so that the prediction label of each character included in the sentence set 1 is obtained.
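The rotation over sentence sets can be sketched with a small harness. This is a hypothetical scaffold: `train_and_predict` stands in for the actual model training and inference, which the patent leaves to the underlying named entity recognition model, and the sketch assumes the corpus size is divisible by N:

```python
def cross_predict(sentences, n, train_and_predict):
    """N-fold cross prediction: split the corpus into n equal folds; for each
    fold, train a fresh test recognition model on the other n-1 folds and
    predict tags for the held-out fold. `train_and_predict(train, held_out)`
    must return predicted tags for `held_out`."""
    fold_size = len(sentences) // n  # assumes len(sentences) % n == 0
    folds = [sentences[i * fold_size:(i + 1) * fold_size] for i in range(n)]
    predictions = []
    for i in range(n):
        # the n-1 training folds; each rotation trains independently,
        # so the runs do not interfere with one another
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        predictions.append(train_and_predict(train, folds[i]))
    return predictions
```

Because every sentence is held out exactly once, every word in the corpus ends up with a prediction tag produced by a model that never saw its sentence during training.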
In some embodiments, the above-mentioned process of updating the weight parameters of the named entity recognition model according to N-1 sentence sets may be implemented in such a way that: predicting text sentences contained in N-1 sentence sets through a named entity recognition model to obtain temporary prediction labels of characters contained in the text sentences; determining the difference between the labeling label corresponding to each word and the temporary prediction label; and performing back propagation in the named entity recognition model according to the difference, and updating the weight parameters of the named entity recognition model in the process of back propagation until all text sentences included in the N-1 sentence sets are traversed to obtain a test recognition model.
When updating the weight parameters of the named entity recognition model according to the N-1 sentence sets, firstly, predicting each text sentence included in the N-1 sentence sets through the named entity recognition model to obtain a prediction label of each word included in each text sentence, and naming the prediction label obtained here as a temporary prediction label for distinguishing from the subsequent prediction labels. And then, determining the difference between the label tag corresponding to each word in the N-1 sentence sets and the temporary prediction tag, performing backward propagation in the named entity recognition model according to the difference, and calculating the gradient in the backward propagation process, so as to update the weight parameter of the named entity recognition model along the gradient descending direction until all the words included in the N-1 sentence sets are traversed to obtain the test recognition model. The test recognition model is obtained through the method, so that the prediction label can be conveniently obtained through subsequent prediction, and the cross prediction of different sentence sets is realized.
As can be seen from the above exemplary implementation of fig. 4C, in the embodiment of the present invention, a plurality of text sentences are added to N sentence sets respectively, and cross-training prediction is performed, so as to obtain prediction tags of each word included in each sentence set.
In the following, a procedure of implementing the named entity recognition model based recognition method by an embedded named entity recognition model based recognition apparatus in an electronic device will be described in conjunction with the above-mentioned exemplary application and structure of the electronic device.
Referring to fig. 5, fig. 5 is an alternative flowchart of a recognition method based on a named entity recognition model according to an embodiment of the present invention, which will be described with reference to the steps shown.
In step 401, a text sentence is subjected to prediction processing by a named entity recognition model, and a prediction tag of each word in the text sentence is obtained.
Here, the text sentence is subjected to prediction processing by the trained named entity recognition model, so as to obtain a prediction tag of each word in the text sentence, wherein the prediction tag may be an entity prediction tag or a non-entity prediction tag.
In step 402, a word whose corresponding prediction tag is an entity prediction tag and whose entity position is the first bit included in the prediction tag is determined as a first word.
In the text sentence, a word whose corresponding prediction tag is an entity prediction tag and whose entity position included in that tag is first is determined as a first word. It is worth noting that a text sentence may contain at least two first words.
In step 403, a traversal is performed backward in the text sentence starting from the first word.
And for the first word in the text sentence, traversing backwards from the first word, and traversing one word at a time.
In step 404, when the prediction tag corresponding to the traversed word is an entity prediction tag, the entity position included in the prediction tag is a non-first word, and the entity type included in the prediction tag is the same as the first word, the traversed word is determined as the non-first word.
In the embodiment of the present invention, the continuous traversal condition is set as follows: the prediction tag corresponding to the word is an entity prediction tag, the entity position included by the prediction tag is non-first, and the entity type included by the prediction tag is the same as the first word. When the traversed word meets the continuous traversal condition, determining the word as a non-first-bit word corresponding to the first-bit word, and traversing backwards; and when the traversed word does not meet the continuous traversal condition, stopping traversal.
In step 405, the first word and the corresponding non-first word are determined together as a named entity.
For each first word, the first word and the corresponding non-first word are jointly determined as a named entity. On this basis, further processing may be performed according to the determined named entity, for example, when the text sentence is a question sentence, the query may be performed in the knowledge graph according to the determined named entity to obtain a response sentence, thereby implementing a semantic response, which is not limited in the embodiment of the present invention.
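Steps 402-405 amount to standard BIO decoding, which can be sketched as follows (Python and the function name are assumptions; the B-/I- tag prefixes follow the BIO tagging scheme the patent cites, with B-X marking the first word of an entity of type X and I-X a non-first word of the same type):

```python
def decode_entities(words, tags):
    """Decode per-word BIO tags into (entity_text, entity_type) pairs:
    start at each first word (B-X) and traverse backward-to-forward while
    the traversed tag reads I-X with the same entity type."""
    entities = []
    i = 0
    while i < len(tags):
        if tags[i].startswith("B-"):  # first word of an entity
            etype = tags[i][2:]
            j = i + 1
            # continue-traversal condition: entity tag, non-first position,
            # same entity type as the first word
            while j < len(tags) and tags[j] == "I-" + etype:
                j += 1
            entities.append(("".join(words[i:j]), etype))
            i = j  # traversal stops; resume scanning after the entity
        else:
            i += 1
    return entities
```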
As can be seen from the above exemplary implementation of fig. 5, the embodiment of the present invention performs recognition according to the trained named entity recognition model, so that the recognition effect is improved, and particularly, for some ambiguous entities, more accurate recognition can be achieved.
In the following, an exemplary application of the embodiments of the present invention in a practical application scenario will be described.
Referring to fig. 6, fig. 6 is another alternative flowchart of the named entity recognition model training method according to the embodiment of the present invention, where a dashed box represents data, and a solid box represents a specific operation, and for ease of understanding, the following description is made in a step form.
1) Match an entity dictionary against large-scale text sentences to obtain distant-supervision labeled data.
First, an unambiguous entity dictionary is obtained. The dictionary contains unambiguous named entities, such as the person names Zhang San and Li Si, and each named entity is given a corresponding entity annotation tag according to the BIO tagging scheme. Then, string matching is performed between the entity dictionary and the plurality of text sentences; when a text sentence contains a named entity from the entity dictionary, the corresponding entity annotation tags are set for the words of that named entity in the sentence. Words in the sentence that are not successfully matched are given the non-entity annotation tag. Once every word in every text sentence has an annotation tag, the distant-supervision labeled data is obtained.
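The string-matching annotation can be sketched as follows (illustrative Python; the greedy left-to-right, first-match-wins policy is an assumption, since the patent only states that string matching is performed):

```python
def distant_label(sentence, entity_dict):
    """Label each character of `sentence` by string-matching against an
    unambiguous entity dictionary {entity_string: entity_type}. Matched
    spans receive B-/I- entity annotation tags per the BIO scheme; every
    other character receives the non-entity tag "O"."""
    tags = ["O"] * len(sentence)
    for entity, etype in entity_dict.items():
        start = sentence.find(entity)
        while start != -1:
            span = range(start, start + len(entity))
            if all(tags[k] == "O" for k in span):  # avoid overlapping matches
                tags[start] = "B-" + etype          # first character of entity
                for k in span:
                    if k != start:
                        tags[k] = "I-" + etype      # non-first characters
            start = sentence.find(entity, start + 1)
    return tags
```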
2) And setting an ambiguous label by using an N-fold cross training prediction mode.
The distant-supervision data comprising the plurality of text sentences is divided equally into N parts, where N may be 10. The named entity recognition model is trained in turn on N-1 of the parts; for ease of distinction, the model obtained after this training is called the test recognition model. The remaining part is then subjected to prediction processing by the test recognition model to obtain prediction tags. The training and prediction are repeated in turn until a prediction tag is obtained for every word in every part. For each part, the annotation tag of each word is compared with its prediction tag, and words whose tags disagree undergo label specialization.
For example, suppose one of the data parts contains the text sentence "张三和连翘参演了这部电影。" ("Zhang San and Lianqiao starred in this movie.", where the name Lianqiao, 连翘, is also the Chinese name of the plant forsythia, hence the ambiguity). Its annotation labels are "张/B-PER 三/I-PER 和/O 连/O 翘/O 参/O 演/O 了/O 这/O 部/O 电/O 影/O 。/O". After cross-training prediction, the prediction labels of the sentence are "张/B-PER 三/I-PER 和/O 连/B-PER 翘/I-PER 参/O 演/O 了/O 这/O 部/O 电/O 影/O 。/O". The labels of "连翘" in the text sentence are therefore specialized by setting the ambiguity label MASK, so that the final labels of the sentence are: "张/B-PER 三/I-PER 和/O 连/MASK 翘/MASK 参/O 演/O 了/O 这/O 部/O 电/O 影/O 。/O".
Through N rounds of training and prediction, all ambiguous entities in this batch of distant-supervision labeled data carry MASK labels, yielding distant-supervision labeled data containing ambiguity labels.
3) And training the named entity recognition model based on a MASK training mode, so that the words corresponding to the ambiguous labels do not participate in updating the model weight parameters.
The training process of the named entity recognition model is as follows: given a text sentence, prediction processing is performed on it by the named entity recognition model to obtain a prediction tag for each word; the difference between the annotation tag and the prediction tag of each word is then computed, the difference is back-propagated through the named entity recognition model, and the gradient is computed during back propagation, so that the weight parameters of the model are updated in the direction of gradient descent. The larger the difference, the larger its influence on the weight parameters during training. Since MASK is an ambiguity label and the meaning of the entity formed by the corresponding words is not necessarily accurate, no difference is computed for words carrying the ambiguity label during training; such words therefore yield no gradient and do not update the weight parameters of the named entity recognition model, avoiding the influence of ambiguous entities on the model.
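The effect of the MASK label on the training difference can be illustrated as follows. This is a hedged sketch: the patent does not name a loss function, so per-word cross-entropy is an assumption, and all identifiers are illustrative:

```python
import math

def masked_loss(probs, gold_indices, mask_flags):
    """Average per-word cross-entropy in which words carrying the MASK
    ambiguity label are skipped entirely, so they contribute no difference
    (and hence no gradient) to the weight update. `probs[i]` is the model's
    probability distribution over tags for word i, and `gold_indices[i]`
    the index of its annotation tag."""
    total, count = 0.0, 0
    for p, gold, masked in zip(probs, gold_indices, mask_flags):
        if masked:
            continue  # ambiguous word: no loss term, no gradient
        total += -math.log(p[gold])
        count += 1
    return total / count if count else 0.0
```

In a real framework the same effect is typically achieved by zeroing the loss at masked positions before back propagation, so the gradient computed during back propagation is identically zero for those words.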
As can be seen from the above exemplary implementation of fig. 6, in the embodiment of the present invention, the ambiguous tags are set for the words included in the ambiguous entities, so that the words do not participate in the training process of the named entity recognition model, and the training effect of training the named entity recognition model is improved.
Continuing with the exemplary structure of the named entity recognition model training apparatus 2550 implemented as a software module provided by the embodiment of the present invention, in some embodiments, as shown in fig. 2A, the software module stored in the named entity recognition model training apparatus 2550 of the memory 250 may include: a labeling module 25501, configured to obtain a plurality of text sentences and set a label for each word in each text sentence according to an entity dictionary; the training prediction module 25502 is configured to update the weight parameters of the named entity recognition model according to the plurality of text sentences, and perform prediction processing on each text sentence according to the test recognition model obtained after the update processing to obtain a prediction tag of each character in each text sentence; an ambiguity setting module 25503, configured to set an ambiguity label for a word in the text sentence when a label tag corresponding to the word is different from a predicted tag; an updating module 25504, configured to update the named entity recognition model according to a word in the text sentence for which an ambiguous tag is not set.
In some embodiments, update module 25504 is further configured to: adding the text sentences to a training set and a test set respectively; wherein the number of the text sentences included in the training set is greater than that of the test set; predicting the text sentences in the training set through the named entity recognition model to obtain a prediction label of each character in the text sentences; determining the difference between a labeling label corresponding to the word without the set ambiguous label and a prediction label; performing back propagation in the named entity recognition model according to the difference, and updating the weight parameters of the named entity recognition model in the process of back propagation until all the text sentences in the training set are traversed; determining the effective recognition rate of the updated named entity recognition model according to the test set; and when the effective rate of recognition does not exceed the effective rate threshold, performing prediction processing on the training set again according to the updated named entity recognition model until the effective rate of recognition exceeds the effective rate threshold.
In some embodiments, the training prediction module 25502 is further configured to: adding a plurality of text sentences to N sentence sets on average; wherein N is an integer greater than 1; and updating the weight parameters of the named entity recognition model according to the N-1 sentence sets in turn, and predicting the rest 1 sentence set according to the test recognition model obtained after updating until the prediction label of each word included in each sentence set is obtained.
In some embodiments, the training prediction module 25502 is further configured to: predicting text sentences contained in N-1 sentence sets through a named entity recognition model to obtain temporary prediction labels of all characters contained in the text sentences; determining the difference between the labeling label corresponding to each word and the temporary prediction label; and performing back propagation in the named entity recognition model according to the difference, and updating the weight parameters of the named entity recognition model in the process of back propagation until all text sentences included in the N-1 sentence sets are traversed to obtain a test recognition model.
In some embodiments, the labeling module 25501 is further configured to: matching each text sentence according to an entity dictionary, and determining a named entity which is successfully matched in each text sentence; setting an entity labeling label for the characters included in the named entity; the entity labeling label comprises an entity position and an entity type of a corresponding word; and setting a non-entity labeling label for the words except the named entity in the text sentence.
In some embodiments, the ambiguity setting module 25503 is further configured to: when the label corresponding to the word in the text sentence is the non-entity label and the corresponding prediction label is the entity prediction label, setting an ambiguous label for the word; wherein the entity prediction tag comprises an entity location and an entity type of the word.
Continuing with the exemplary structure of the named entity recognition model based recognition device 2551 implemented as a software module provided by the embodiment of the present invention, in some embodiments, as shown in fig. 2B, the software modules stored in the named entity recognition model based recognition device 2551 of the memory 250 may include: the prediction module 25511 is configured to perform prediction processing on a text sentence through a named entity recognition model to obtain a prediction tag of each word in the text sentence; a first determining module 25512, configured to determine that a corresponding prediction tag is an entity prediction tag and a word whose entity position included in the prediction tag is a first word; a traversal module 25513, configured to traverse backward from the first word in the text sentence; a non-first determination module 25514, configured to determine the traversed word as a non-first word when the prediction tag corresponding to the traversed word is an entity prediction tag, the entity position included in the prediction tag is a non-first word, and the entity type included in the prediction tag is the same as the first word; an entity determining module 25515, configured to determine the first word and the corresponding non-first word together as a named entity.
Embodiments of the present invention provide a storage medium storing executable instructions, where the executable instructions are stored, and when executed by a processor, will cause the processor to execute a named entity recognition model training method provided by an embodiment of the present invention, for example, a named entity recognition model training method as shown in fig. 4A, 4B, or 4C, or will cause the processor to execute a named entity recognition model-based recognition method provided by an embodiment of the present invention, for example, a named entity recognition model-based recognition method as shown in fig. 5.
In some embodiments, the storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.
In some embodiments, the executable instructions may be in the form of a program, software module, script, or code written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may, but need not, correspond to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
In summary, the embodiments of the present invention identify an ambiguous entity in a text sentence, and set an ambiguous tag for a word in the entity, so that the ambiguous entity does not participate in a training process of a named entity recognition model, thereby improving a training effect of the named entity recognition model.
The above description is only an example of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present invention are included in the protection scope of the present invention.

Claims (10)

1. A named entity recognition model training method is characterized by comprising the following steps:
acquiring a plurality of text sentences, and setting a label for each word in each text sentence according to an entity dictionary;
updating the weight parameters of the named entity recognition model according to the plurality of text sentences, and predicting each text sentence according to the test recognition model obtained after updating to obtain a prediction label of each character in each text sentence;
when the label tag corresponding to the word in the text sentence is different from the prediction tag, setting an ambiguous tag for the word;
adding a plurality of text sentences to a training set and a test set respectively; wherein the training set comprises a number of the text sentences greater than the test set;
predicting the text sentences in the training set through the named entity recognition model to obtain prediction labels of all words in the text sentences;
determining the difference between the labeling label corresponding to the word without the ambiguous label and the prediction label;
performing back propagation in the named entity recognition model according to the difference, and updating the weight parameters of the named entity recognition model in the process of back propagation until all the text sentences in the training set are traversed;
determining the effective recognition rate of the updated named entity recognition model according to the test set;
and when the identification effective rate does not exceed the effective rate threshold, performing prediction processing on the training set again according to the updated named entity identification model until the identification effective rate exceeds the effective rate threshold.
2. The method for training the named entity recognition model according to claim 1, wherein the updating the weight parameters of the named entity recognition model according to the text sentences and the predicting each text sentence according to the test recognition model obtained after the updating to obtain the predicted label of each word in each text sentence comprises:
adding a plurality of text sentences to N sentence sets on average; wherein N is an integer greater than 1;
and updating the weight parameters of the named entity recognition model according to the N-1 sentence sets in turn, and predicting the remaining 1 sentence sets according to the test recognition model obtained after updating until the prediction label of each character included in each sentence set is obtained.
3. The method for training the named entity recognition model according to claim 2, wherein the updating of the weight parameters of the named entity recognition model according to the N-1 sentence sets comprises:
predicting text sentences contained in N-1 sentence sets through a named entity recognition model to obtain temporary prediction labels of all characters contained in the text sentences;
determining the difference between the labeling label corresponding to each word and the temporary prediction label;
and performing back propagation in the named entity recognition model according to the difference, and updating the weight parameters of the named entity recognition model in the process of back propagation until all text sentences included in the N-1 sentence sets are traversed to obtain a test recognition model.
4. The method for training the named entity recognition model according to claim 1, wherein the setting of labeling labels for each word in each text sentence according to the entity dictionary comprises:
matching each text sentence according to an entity dictionary, and determining a named entity which is successfully matched in each text sentence;
setting an entity labeling label for each word included in the named entity; wherein the entity labeling label comprises the entity position and entity type of the corresponding word;
and setting a non-entity labeling label for the words except the named entity in the text sentence.
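A minimal sketch of the dictionary matching in claim 4, assuming a BIO-style tag set (`B-`/`I-` for the entity position, the suffix for the entity type) and longest-match-first resolution, neither of which the claim itself mandates:

```python
def label_with_dictionary(sentence, entity_dict):
    """Match entity-dictionary entries against the sentence and set a
    labeling label for every word: B-<type> for the first word of a
    matched named entity, I-<type> for its non-first words, and the
    non-entity label 'O' for all other words."""
    labels = ["O"] * len(sentence)
    # Try longer entities first so they are not shadowed by substrings.
    for entity, etype in sorted(entity_dict.items(), key=lambda kv: -len(kv[0])):
        start = sentence.find(entity)
        while start != -1:
            if all(l == "O" for l in labels[start:start + len(entity)]):
                labels[start] = f"B-{etype}"       # entity position: first word
                for i in range(start + 1, start + len(entity)):
                    labels[i] = f"I-{etype}"       # entity position: non-first word
            start = sentence.find(entity, start + 1)
    return labels
```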
5. The method for training the named entity recognition model according to claim 4, wherein the setting an ambiguity label for a word when the labeling label corresponding to the word in the text sentence is different from the prediction label comprises:
when the labeling label corresponding to the word in the text sentence is a non-entity labeling label and the corresponding prediction label is an entity prediction label, setting an ambiguity label for the word;
wherein the entity prediction label comprises the entity position and entity type of the word.
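Claim 5's rule reduces to a per-word comparison; the sketch below assumes the same BIO-style labels and flags a word as ambiguous exactly when its dictionary labeling label is the non-entity label 'O' but its prediction label is an entity label:

```python
def mark_ambiguous(labeling_labels, prediction_labels):
    """Return one boolean per word: True when the dictionary-derived
    labeling label is non-entity ('O') while the model predicted an
    entity label, i.e. the word receives an ambiguity label."""
    return [lab == "O" and pred != "O"
            for lab, pred in zip(labeling_labels, prediction_labels)]
```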
6. A recognition method based on the named entity recognition model of any one of claims 1 to 5, comprising:
performing prediction processing on a text sentence through the named entity recognition model to obtain a prediction label of each word in the text sentence;
determining, as a first word, a word whose prediction label is an entity prediction label and whose entity position included in the prediction label is "first word";
traversing the words after the first word in the text sentence;
when the prediction label corresponding to a traversed word is an entity prediction label, the entity position included in the prediction label is "non-first word", and the entity type included in the prediction label is the same as that of the first word, determining the traversed word as a non-first word;
and determining the first word and the corresponding non-first words together as a named entity.
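The decoding in claim 6 (find a first word, then collect the following non-first words of the same entity type) can be sketched under the same assumed BIO scheme; the span-tuple return format is my own choice:

```python
def decode_entities(prediction_labels):
    """Scan the prediction labels: each B-<type> marks a first word; the
    run of I-<type> labels of the same entity type immediately after it
    supplies the non-first words. Returns (start, end_exclusive, type)
    spans, one per decoded named entity."""
    entities = []
    i = 0
    while i < len(prediction_labels):
        tag = prediction_labels[i]
        if tag.startswith("B-"):                   # entity position: first word
            etype = tag[2:]
            j = i + 1
            # Collect following words while they are non-first words of
            # the same entity type.
            while j < len(prediction_labels) and prediction_labels[j] == f"I-{etype}":
                j += 1
            entities.append((i, j, etype))
            i = j
        else:
            i += 1
    return entities
```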
7. A named entity recognition model training device, comprising:
the labeling module is used for acquiring a plurality of text sentences and setting labeling labels for all words in all the text sentences according to the entity dictionary;
the training prediction module is used for updating the weight parameters of the named entity recognition model according to the plurality of text sentences and predicting each text sentence according to the test recognition model obtained after updating so as to obtain a prediction label of each character in each text sentence;
the ambiguity setting module is used for setting an ambiguity label for a word when the labeling label corresponding to the word in the text sentence is different from the prediction label;
the updating module is used for respectively adding the text sentences to a training set and a test set; wherein the training set comprises a greater number of the text sentences than the test set;
predicting the text sentences in the training set through the named entity recognition model to obtain a prediction label of each word in the text sentences;
determining the difference between the labeling label and the prediction label corresponding to each word without an ambiguity label;
performing back propagation in the named entity recognition model according to the difference, and updating the weight parameters of the named entity recognition model in the process of back propagation until all the text sentences in the training set are traversed;
determining the effective recognition rate of the updated named entity recognition model according to the test set;
and when the effective recognition rate does not exceed the effective rate threshold, performing prediction processing on the training set again according to the updated named entity recognition model until the effective recognition rate exceeds the effective rate threshold.
8. A recognition apparatus based on the named entity recognition model of any one of claims 1 to 5, comprising:
the prediction module is used for performing prediction processing on a text sentence through the named entity recognition model to obtain a prediction label of each word in the text sentence;
a first determining module, configured to determine, as a first word, a word whose prediction label is an entity prediction label and whose entity position included in the prediction label is "first word";
the traversal module is used for traversing the words after the first word in the text sentence;
a non-first determining module, configured to determine a traversed word as a non-first word when the prediction label corresponding to the traversed word is an entity prediction label, the entity position included in the prediction label is "non-first word", and the entity type included in the prediction label is the same as that of the first word;
and the entity determining module is used for determining the first word and the corresponding non-first word as a named entity.
9. An electronic device, comprising:
a memory for storing executable instructions;
a processor for implementing the named entity recognition model training method of any one of claims 1 to 5, or the named entity recognition model-based recognition method of claim 6, when executing executable instructions stored in the memory.
10. A storage medium storing executable instructions for causing a processor to perform the method for training a named entity recognition model according to any one of claims 1 to 5 or the method for recognizing a named entity based on a named entity recognition model according to claim 6 when the processor executes the instructions.
CN201911010612.8A 2019-10-23 2019-10-23 Named entity recognition model training method, recognition method, device and electronic equipment Active CN110781682B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911010612.8A CN110781682B (en) 2019-10-23 2019-10-23 Named entity recognition model training method, recognition method, device and electronic equipment

Publications (2)

Publication Number Publication Date
CN110781682A CN110781682A (en) 2020-02-11
CN110781682B true CN110781682B (en) 2023-04-07

Family

ID=69386296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911010612.8A Active CN110781682B (en) 2019-10-23 2019-10-23 Named entity recognition model training method, recognition method, device and electronic equipment

Country Status (1)

Country Link
CN (1) CN110781682B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111597813A (en) * 2020-05-21 2020-08-28 上海创蓝文化传播有限公司 Method and device for extracting text abstract of short message based on named entity identification
CN111914561B (en) * 2020-07-31 2023-06-30 建信金融科技有限责任公司 Entity recognition model training method, entity recognition device and terminal equipment
CN112836508B (en) * 2021-01-29 2023-04-14 平安科技(深圳)有限公司 Information extraction model training method and device, terminal equipment and storage medium
CN113221574A (en) * 2021-05-31 2021-08-06 云南锡业集团(控股)有限责任公司研发中心 Named entity recognition method, device, equipment and computer readable storage medium
CN113610503A (en) * 2021-08-11 2021-11-05 中国平安人寿保险股份有限公司 Resume information processing method, device, equipment and medium
CN114004230B (en) * 2021-09-23 2022-07-05 杭萧钢构股份有限公司 Industrial control scheduling method and system for producing steel structure

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102314507A (en) * 2011-09-08 2012-01-11 北京航空航天大学 Recognition ambiguity resolution method of Chinese named entity
CN105975454A (en) * 2016-04-21 2016-09-28 广州精点计算机科技有限公司 Chinese word segmentation method and device of webpage text
CN107391485A (en) * 2017-07-18 2017-11-24 中译语通科技(北京)有限公司 Entity recognition method is named based on the Korean of maximum entropy and neural network model
CN107527073A (en) * 2017-09-05 2017-12-29 中南大学 The recognition methods of entity is named in electronic health record
CN108536679A (en) * 2018-04-13 2018-09-14 腾讯科技(成都)有限公司 Name entity recognition method, device, equipment and computer readable storage medium
CN109255119A (en) * 2018-07-18 2019-01-22 五邑大学 A kind of sentence trunk analysis method and system based on the multitask deep neural network for segmenting and naming Entity recognition
CN109635279A (en) * 2018-11-22 2019-04-16 桂林电子科技大学 A kind of Chinese name entity recognition method neural network based

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10558754B2 (en) * 2016-09-15 2020-02-11 Infosys Limited Method and system for automating training of named entity recognition in natural language processing


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Sum, John Pui-Fai et al. On objective function, regularizer, and prediction error of a learning algorithm for dealing with multiplicative weight noise. IEEE Transactions on Neural Networks, 2008, vol. 20, no. 1, pp. 124-138. *
Xie, Zecheng et al. Learning spatial-semantic context with fully convolutional recurrent network for online handwritten Chinese text recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, vol. 40, no. 8, pp. 1903-1917. *
Zhang, Hongrong. Research and implementation of key technologies for Chinese automatic summarization. China Master's Theses Full-text Database, Information Science and Technology, 2019, no. 01, I138-5128. *
Yang, Jinfeng et al. A review of named entity recognition and entity relation extraction in electronic medical records. Acta Automatica Sinica, 2014, vol. 40, no. 8, pp. 1537-1562. *


Similar Documents

Publication Publication Date Title
CN110781682B (en) Named entity recognition model training method, recognition method, device and electronic equipment
CN107908635B (en) Method and device for establishing text classification model and text classification
CN111401066B (en) Artificial intelligence-based word classification model training method, word processing method and device
CN108595629B (en) Data processing method and application for answer selection system
CN108932218B (en) Instance extension method, device, equipment and medium
CN111611797B (en) Method, device and equipment for marking prediction data based on Albert model
CN110245232B (en) Text classification method, device, medium and computing equipment
CN111563158B (en) Text ranking method, ranking apparatus, server and computer-readable storage medium
CN111694937A (en) Interviewing method and device based on artificial intelligence, computer equipment and storage medium
CN111143556A (en) Software function point automatic counting method, device, medium and electronic equipment
CN111539207B (en) Text recognition method, text recognition device, storage medium and electronic equipment
CN113392197A (en) Question-answer reasoning method and device, storage medium and electronic equipment
CN113221565A (en) Entity recognition model training method and device, electronic equipment and storage medium
CN110377910B (en) Processing method, device, equipment and storage medium for table description
CN109710523B (en) Visual draft test case generation method and device, storage medium and electronic equipment
CN116521872A (en) Combined recognition method and system for cognition and emotion and electronic equipment
CN111339760A (en) Method and device for training lexical analysis model, electronic equipment and storage medium
CN111143693B (en) Training method and device for feature processing model based on artificial intelligence
CN112925889B (en) Natural language processing method, device, electronic equipment and storage medium
CN112231373B (en) Knowledge point data processing method, apparatus, device and computer readable medium
CN115129858A (en) Test question classification model training method, device, equipment, medium and program product
CN112966085B (en) Man-machine conversation intelligent control method and device, electronic equipment and storage medium
CN115168577B (en) Model updating method and device, electronic equipment and storage medium
CN110750979B (en) Method for determining continuity of chapters and detection device
CN116167384A (en) Information processing method, information processing device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40022019

Country of ref document: HK

GR01 Patent grant