CN117709344A - Named entity recognition model training method, related method and related product - Google Patents


Info

Publication number: CN117709344A
Authority: CN (China)
Prior art keywords: training, entity, difference, task, text
Legal status: Pending
Application number: CN202311666327.8A
Other languages: Chinese (zh)
Inventors: 张家晟, 刘喜凯
Current Assignee: Shuhang Technology Beijing Co., Ltd.
Original Assignee: Shuhang Technology Beijing Co., Ltd.
Application filed by: Shuhang Technology Beijing Co., Ltd.
Priority: CN202311666327.8A
Publication: CN117709344A


Classifications

    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a training method for a named entity recognition model, a related method, and related products. The method comprises the following steps: acquiring training data, wherein the training data comprises a training text, first task content of a main task and second task content of a first auxiliary task, the first task content comprises determining first entity words in the training text and the entity types of the first entity words, and the second task content comprises determining the entity types contained in the training text; in the process of training a pre-training language model by using the training data, the pre-training language model executes the main task to obtain a first execution result, and executes the first auxiliary task to obtain a second execution result; determining a first difference between the first execution result and a first label of the main task; determining a second difference between the second execution result and a second label of the first auxiliary task; and updating parameters of the pre-training language model based on the first difference and the second difference to obtain the named entity recognition model.

Description

Named entity recognition model training method, related method and related product
Technical Field
The application relates to the technical field of natural language processing, in particular to a training method, a related method and related products of a named entity recognition model.
Background
As natural language processing technology has evolved, its applications have increased, including named entity recognition (named entity recognition, NER) on text based on natural language models. Before a natural language model is used to recognize the named entities in text, the model needs to be trained so that it has the capability of recognizing the named entities of the text. Therefore, how to train the natural language model is of great significance.
Disclosure of Invention
The application provides a training method for a named entity recognition model, a related method and related products, so as to train a model for recognizing the named entities of a text. The related method comprises a named entity recognition method, and the related products comprise: a training apparatus for the named entity recognition model, a named entity recognition apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
In a first aspect, a training method for a named entity recognition model is provided, the method comprising:
acquiring training data, wherein the training data comprises training text, a main task instruction and a first auxiliary task instruction, the main task instruction comprises first task content of a main task, the first task content comprises determining a first entity word in the training text and an entity type of the first entity word, the first entity word is a Named Entity (NE), the first auxiliary task instruction comprises second task content of the first auxiliary task, and the second task content comprises determining an entity type contained in the training text;
In the process of training a pre-training language model by using the training data, the pre-training language model executes the main task according to the first task content in the main task instruction to obtain a first execution result, and the pre-training language model executes the first auxiliary task according to the second task content in the first auxiliary task instruction to obtain a second execution result;
determining a first difference between the first execution result and a first label of the main task;
determining a second difference between the second execution result and a second label of the first auxiliary task;
and updating parameters of the pre-training language model based on the first difference and the second difference to obtain a named entity recognition model, wherein the named entity recognition model is used for recognizing the named entity of the text.
In combination with any of the embodiments of the present application, the training data further includes an example describing the entity type of a second entity word, the second entity word being a named entity, and the second entity word being different from the first entity word;
the pre-training language model executes the main task according to the first task content in the main task instruction to obtain a first execution result, and the method comprises the following steps:
And under the example prompt, the pre-training language model executes the main task according to the first task content in the main task instruction to obtain the first execution result.
In combination with any one of the embodiments of the present application, the training data further includes a second auxiliary task instruction, where the second auxiliary task instruction includes third task content of a second auxiliary task, and the third task content includes determining the first entity word from the training text;
before updating the parameters of the pre-trained language model based on the first difference and the second difference to obtain a named entity recognition model, the method further comprises:
in the process of training a pre-training language model by using the training data, the pre-training language model executes the second auxiliary task according to the third task content in the second auxiliary task instruction to obtain a third execution result;
determining a third difference between the third execution result and a third label of the second auxiliary task;
updating parameters of the pre-training language model based on the first difference and the second difference to obtain a named entity recognition model, wherein the named entity recognition model comprises:
And updating parameters of the pre-training language model based on the first difference, the second difference and the third difference to obtain the named entity recognition model.
In combination with any of the embodiments of the present application, the training data further includes a third auxiliary task instruction, the third auxiliary task instruction including fourth task content of a third auxiliary task, the fourth task content including determining an entity type of a word in the training text;
before updating the parameters of the pre-trained language model based on the first difference, the second difference, and the third difference to obtain the named entity recognition model, the method further includes:
in the process of training a pre-training language model by using the training data, the pre-training language model executes the third auxiliary task according to the fourth task content in the third auxiliary task instruction to obtain a fourth execution result;
determining a fourth difference between the fourth execution result and a fourth tag of the third auxiliary task;
updating parameters of the pre-training language model based on the first difference, the second difference and the third difference to obtain the named entity recognition model, wherein the method comprises the following steps:
And updating parameters of the pre-training language model based on the first difference, the second difference, the third difference and the fourth difference to obtain the named entity recognition model.
In combination with any one of the embodiments of the present application, updating parameters of the pre-training language model based on the first difference, the second difference, the third difference, and the fourth difference to obtain the named entity recognition model includes:
determining a loss of the pre-trained language model based on the first difference, the second difference, the third difference, and the fourth difference, the loss being positively correlated to the first difference, the second difference, the third difference, and the fourth difference;
and based on the loss, updating parameters of the pre-training language model to obtain the named entity recognition model.
In combination with any one of the embodiments of the present application, the training data further includes a preset entity type, and the fourth task content includes determining an entity type of the first entity word in the training text from the preset entity type.
In combination with any of the embodiments of the present application, the first execution result and the third execution result each include text describing the first entity word in a preset sentence structure.
In combination with any one of the embodiments of the present application, the training data further includes a preset entity type, and the first task content includes determining an entity type of the first entity word from the preset entity type.
In combination with any of the embodiments of the present application, the pre-trained language model is a text-to-text conversion model.
In a second aspect, there is provided a named entity recognition method, the method comprising:
acquiring a text to be identified;
and carrying out named entity recognition on the text to be recognized by using a named entity recognition model to obtain a named entity recognition result, wherein the named entity recognition model is obtained through training of the first aspect and any implementation mode thereof.
In combination with any one of the embodiments of the present application, after obtaining the named entity recognition result, the method further includes:
and determining a target mapping relation based on the named entity recognition result, wherein the target mapping relation is a mapping relation between a target entity word in the text to be recognized and a target entity type of the target entity word, and the target entity word is a named entity.
In combination with any embodiment of the application, the named entity recognition result describes candidate entity types of candidate entity words according to a preset sentence pattern, wherein the candidate entity words are named entities;
The determining the target mapping relation based on the named entity recognition result comprises the following steps:
dividing texts used for describing the same candidate entity words in the named entity recognition result into the same sub-texts to obtain at least one sub-text;
determining the candidate entity words in each sub-text and the candidate entity types in each sub-text based on the preset sentence patterns;
and determining the target mapping relation based on the candidate entity words and the candidate entity types.
In combination with any one of the embodiments of the present application, the determining the target mapping relationship based on the candidate entity word and the candidate entity type includes:
and aiming at the candidate entity words and the candidate entity types in the same sub-text, determining that the mapping relationship between the candidate entity words and the candidate entity types is the target mapping relationship under the condition that the candidate entity words belong to the text to be identified and the candidate entity types are preset entity types.
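By way of illustration only, the following minimal sketch shows one possible realisation of this post-processing, assuming the preset sentence pattern is of the form "A is a B" and using an illustrative set of preset entity types; the pattern, the type list and the function name are assumptions, not the format mandated by the application.

```python
import re

# Assumed preset sentence pattern: "<candidate entity word> is a <candidate type>".
PATTERN = re.compile(r"(?P<word>.+?) is a (?P<type>.+)")
PRESET_ENTITY_TYPES = {"dish", "meal", "location", "price", "grade"}

def target_mappings(recognition_result: str, text: str) -> dict:
    """Split the recognition result into sub-texts (one per candidate entity
    word), parse each sub-text against the preset sentence pattern, and keep
    a word-to-type mapping only if the candidate word occurs in the text to
    be recognized and the candidate type is a preset entity type."""
    mappings = {}
    for sub_text in recognition_result.split(","):
        match = PATTERN.fullmatch(sub_text.strip())
        if match is None:
            continue
        word, etype = match.group("word"), match.group("type").strip()
        if word in text and etype in PRESET_ENTITY_TYPES:
            mappings[word] = etype
    return mappings

result = "hamburgers is a dish, 1 mile is a location, inexpensive is a price"
print(target_mappings(result,
                      "Where can I buy inexpensive hamburgers and fries within 1 mile?"))
# {'hamburgers': 'dish', '1 mile': 'location', 'inexpensive': 'price'}
```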
In a third aspect, a training device for a named entity recognition model is provided, where the training device includes:
an acquisition unit, configured to acquire training data, wherein the training data comprises a training text, a main task instruction and a first auxiliary task instruction, the main task instruction comprises first task content of a main task, the first task content comprises determining a first entity word in the training text and an entity type of the first entity word, the first entity word is a named entity, the first auxiliary task instruction comprises second task content of the first auxiliary task, and the second task content comprises determining the entity types contained in the training text;
an execution unit, configured to, in the process of training a pre-training language model by using the training data, cause the pre-training language model to execute the main task on the training text according to the main task instruction to obtain a first execution result, and to execute the first auxiliary task on the training text according to the first auxiliary task instruction to obtain a second execution result;
a determining unit, configured to determine a first difference between the first execution result and a first tag of the primary task;
the determining unit is used for determining a second difference between the second execution result and a second label of the first auxiliary task;
And the updating unit is used for updating the parameters of the pre-training language model based on the first difference and the second difference to obtain a named entity recognition model, and the named entity recognition model is used for recognizing the named entity of the text.
In combination with any of the embodiments of the present application, the training data further includes an example describing the entity type of a second entity word, the second entity word being a named entity, and the second entity word being different from the first entity word;
the execution unit is specifically configured to:
and under the example prompt, the pre-training language model executes the main task according to the first task content in the main task instruction to obtain the first execution result.
In combination with any one of the embodiments of the present application, the training data further includes a second auxiliary task instruction, where the second auxiliary task instruction includes third task content of a second auxiliary task, and the third task content includes determining the first entity word from the training text;
the execution unit is further configured to execute the second auxiliary task according to the third task content in the second auxiliary task instruction to obtain a third execution result in a process of training a pre-training language model by using the training data;
The determining unit is further configured to determine a third difference between the third execution result and a third tag of the second auxiliary task;
the updating unit is specifically configured to update parameters of the pre-training language model based on the first difference, the second difference, and the third difference, so as to obtain the named entity recognition model.
In combination with any of the embodiments of the present application, the training data further includes a third auxiliary task instruction, the third auxiliary task instruction including fourth task content of a third auxiliary task, the fourth task content including determining an entity type of a word in the training text;
the execution unit is further configured to execute the third auxiliary task according to the fourth task content in the third auxiliary task instruction to obtain a fourth execution result in a process of training a pre-training language model by using the training data;
the determining unit is further configured to determine a fourth difference between the fourth execution result and a fourth tag of the third auxiliary task;
the updating unit is specifically configured to update parameters of the pre-training language model based on the first difference, the second difference, the third difference, and the fourth difference, so as to obtain the named entity recognition model.
In combination with any one of the embodiments of the present application, the updating unit is specifically configured to:
determining a loss of the pre-trained language model based on the first difference, the second difference, the third difference, and the fourth difference, the loss being positively correlated to the first difference, the second difference, the third difference, and the fourth difference;
and based on the loss, updating parameters of the pre-training language model to obtain the named entity recognition model.
In combination with any one of the embodiments of the present application, the training data further includes a preset entity type, and the fourth task content includes determining an entity type of the first entity word in the training text from the preset entity type.
In combination with any of the embodiments of the present application, the first execution result and the third execution result each include text describing the first entity word in a preset sentence structure.
In combination with any one of the embodiments of the present application, the training data further includes a preset entity type, and the first task content includes determining an entity type of the first entity word from the preset entity type.
In combination with any of the embodiments of the present application, the pre-trained language model is a text-to-text conversion model.
In a fourth aspect, there is provided a named entity recognition device, the named entity recognition device comprising:
the acquisition unit is used for acquiring the text to be identified;
and the identifying unit is used for carrying out named entity identification on the text to be identified by using a named entity identification model to obtain a named entity identification result, wherein the named entity identification model is obtained by training the first aspect and any implementation mode thereof.
In combination with any one of the embodiments of the present application, the named entity recognition device further includes: the determining unit is used for determining a target mapping relation based on the named entity recognition result, wherein the target mapping relation is a mapping relation between a target entity word in the text to be recognized and a target entity type of the target entity word, and the target entity word is a named entity.
In combination with any embodiment of the application, the named entity recognition result describes candidate entity types of candidate entity words according to a preset sentence pattern, wherein the candidate entity words are named entities;
the determining unit is specifically configured to:
dividing texts used for describing the same candidate entity words in the named entity recognition result into the same sub-texts to obtain at least one sub-text;
Determining the candidate entity words in each sub-text and the candidate entity types in each sub-text based on the preset sentence patterns;
and determining the target mapping relation based on the candidate entity words and the candidate entity types.
In combination with any embodiment of the present application, the determining unit is specifically configured to determine, for the candidate entity word and the candidate entity type in the same sub-text, that a mapping relationship between the candidate entity word and the candidate entity type is the target mapping relationship when the candidate entity word belongs to the text to be identified and the candidate entity type is a preset entity type.
In a fifth aspect, there is provided an electronic device comprising: a processor and a memory for storing computer program code, the computer program code comprising computer instructions;
when the processor executes the computer instructions, the electronic device performs the first aspect and any implementation thereof as described above, or performs the second aspect and any implementation thereof as described above.
In a sixth aspect, there is provided another electronic device comprising: a processor, a transmitting device, an input device, an output device, and a memory for storing computer program code, the computer program code comprising computer instructions;
when the processor executes the computer instructions, the electronic device performs the first aspect and any implementation thereof as described above, or performs the second aspect and any implementation thereof as described above.
In a seventh aspect, there is provided a computer readable storage medium having a computer program stored therein, the computer program comprising program instructions;
when the program instructions are executed by a processor, the processor is caused to perform the first aspect and any implementation thereof as described above, or to perform the second aspect and any implementation thereof as described above.
In an eighth aspect, there is provided a computer program product comprising a computer program or instructions; when the computer program or the instructions are run on a computer, the computer is caused to perform the first aspect and any implementation thereof as described above, or to perform the second aspect and any implementation thereof as described above.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
In the application, the training data comprises a training text, a main task instruction and a first auxiliary task instruction, wherein the main task instruction comprises the first task content of the main task, the first task content comprises determining a first entity word in the training text and an entity type of the first entity word, the first auxiliary task instruction comprises the second task content of the first auxiliary task, and the second task content comprises determining the entity types contained in the training text. Therefore, after acquiring the training data, the training device trains the pre-training language model by using the training data, so that the pre-training language model can execute the main task on the training text according to the first task content in the main task instruction to obtain a first execution result, and can execute the first auxiliary task on the training text according to the second task content in the first auxiliary task instruction to obtain a second execution result. By determining the difference between the first execution result and the first label of the main task, the training device can obtain a first difference that characterizes the effect of the pre-training language model executing the main task. By determining the difference between the second execution result and the second label of the first auxiliary task, the training device can obtain a second difference that characterizes the effect of the pre-training language model executing the first auxiliary task. Thus, by updating the parameters of the pre-training language model based on the first difference and the second difference, the training device can improve the effect of the pre-training language model executing the main task and the effect of the pre-training language model executing the first auxiliary task. The training device updates the parameters of the pre-training language model based on the first difference and the second difference to obtain the named entity recognition model, which can improve the accuracy of the named entity recognition model in recognizing the named entities of a text.
In addition, since the training object of the training method is a pre-training language model, the training method can obtain a named entity recognition model using only a small amount of training text; in other words, by utilizing the learning capability of the pre-training language model, the training method can obtain the named entity recognition model through few-shot (few shot) training.
Drawings
In order to more clearly describe the technical solutions in the embodiments or the background of the present application, the following description will describe the drawings that are required to be used in the embodiments or the background of the present application.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the description, serve to explain the technical aspects of the application.
Fig. 1 is a flow chart of a training method of a named entity recognition model according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a training architecture of a named entity recognition model according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating a named entity recognition method according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a training device for named entity recognition model according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a named entity recognition device according to an embodiment of the present application;
fig. 6 is a schematic hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the present application solution better understood by those skilled in the art, the following description will clearly and completely describe the technical solution in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The terms first, second and the like in the description and in the claims of the present application and in the above-described figures, are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The embodiment of the application provides a training method for a named entity recognition model and a named entity recognition method, wherein the execution subject of the training method for the named entity recognition model is a training device for the named entity recognition model (hereinafter simply referred to as the training device), and the training device can be any electronic device capable of executing the technical scheme disclosed in the method embodiments of the application. Optionally, the training device may be one of the following: a computer, a server.
It will be appreciated that the method of training the named entity recognition model may also be implemented by means of a processor executing computer program code. Embodiments of the present application are described below with reference to the accompanying drawings in the embodiments of the present application. Referring to fig. 1, fig. 1 is a flowchart of a training method of a named entity recognition model according to an embodiment of the present application.
101. Training data is acquired.
In this embodiment, the training data includes a training text, a main task instruction, and a first auxiliary task instruction. The training text may be any text including named entities. Optionally, the training text may be a sentence, for example: "Where can I buy inexpensive hamburgers and fries within 1 mile?"
The main task instruction includes the task content of the main task, which is referred to herein as first task content. Specifically, the first task content includes determining the first entity words in the training text and the entity types of the first entity words, where a first entity word is a named entity in the training text; that is, determining the first entity words in the training text means determining which words in the training text are named entities. For example, the main task instruction is: please extract the words that are named entities from the training text, together with the entity type of each extracted word. An entity type is the type of a named entity; for example, entity types include person name, place name, organization name, time and product name.
The first auxiliary task instruction includes task content of a first auxiliary task, which is referred to herein as second task content, and specifically, the second task content includes determining an entity type included in the training text, for example, the first auxiliary task instruction is: please extract the entity type in the training text.
It should be appreciated that the above-mentioned primary task instruction and the first auxiliary task instruction are both used to instruct the pre-training language model (pretrained language models, PLM) to perform the corresponding tasks, specifically, the primary task instruction is used to instruct the pre-training language model to perform the primary task based on the first task content, and the first auxiliary task instruction is used to instruct the pre-training language model to perform the first auxiliary task based on the second task content.
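By way of illustration, the following minimal sketch shows one possible way to render the main task instruction and the first auxiliary task instruction as prompts for a text-to-text model; the template wording, the function names and the entity-type list are illustrative assumptions, not a format mandated by the application.

```python
# A minimal sketch of rendering the two task instructions as prompts.
# The template strings and the preset entity types are assumptions.

PRESET_ENTITY_TYPES = ["restaurant name", "grade", "welfare facility",
                       "time", "dish", "meal", "location", "price"]

def build_main_task_prompt(training_text: str) -> str:
    """Main task: extract named entities and their entity types."""
    return ("Please extract the words that are named entities from the text "
            "below, and give the entity type of each extracted word.\n"
            f"Candidate entity types: {', '.join(PRESET_ENTITY_TYPES)}\n"
            f"Text: {training_text}")

def build_first_auxiliary_prompt(training_text: str) -> str:
    """First auxiliary task: list the entity types contained in the text."""
    return ("Please list the entity types that appear in the text below.\n"
            f"Text: {training_text}")

main_prompt = build_main_task_prompt(
    "Where can I buy inexpensive hamburgers and fries within 1 mile?")
aux_prompt = build_first_auxiliary_prompt(
    "There is an inexpensive four-star restaurant nearby.")
```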
In one implementation of acquiring training data, a training device receives training data input by a user through an input component. The input assembly includes at least one of: keyboard, mouse, touch screen, touch pad, audio input device.
In another implementation of acquiring training data, the training device receives training data sent by the terminal. The terminal may be any of the following: cell phone, computer, panel computer, server.
102. In the process of training the pre-training language model by using the training data, the pre-training language model executes the main task according to the first task content in the main task instruction to obtain a first execution result, and the pre-training language model executes the first auxiliary task according to the second task content in the first auxiliary task instruction to obtain a second execution result.
In the embodiment of the application, the pre-training language model is a pre-trained large natural language model, where a large natural language model refers to a natural language model with a complex structure and a large number of parameters. Optionally, the pre-training language model is a text-to-text conversion model (text-to-text transfer transformer, T5).
In the process of training the pre-training language model by using the training data, the pre-training language model executes the main task according to the first task content in the main task instruction to obtain the first execution result. Specifically, the pre-training language model determines the first entity words in the training text and the entity types of the first entity words according to the first task content in the main task instruction to obtain the first execution result. For example, the training text is: "Where can I buy inexpensive hamburgers and fries within 1 mile?" The pre-training language model processes the training text according to the main task instruction, and the first execution result is: hamburgers is a dish, fries is a dish, 1 mile is a location, and inexpensive is a price. That is, in the first execution result, hamburgers, fries, 1 mile and inexpensive are all first entity words, the entity type of hamburgers is dish, the entity type of fries is dish, the entity type of 1 mile is location, and the entity type of inexpensive is price.
In the process of training the pre-training language model by using the training data, the pre-training language model executes the first auxiliary task according to the second task content in the first auxiliary task instruction to obtain the second execution result. Specifically, the pre-training language model determines the entity types contained in the training text according to the second task content of the first auxiliary task instruction to obtain the second execution result. For example, the training text is: "There is an inexpensive four-star restaurant nearby." The pre-training language model processes the training text according to the first auxiliary task instruction, and the second execution result is: grade is an entity type that exists in the training text, location is an entity type that exists in the training text, and price is an entity type that exists in the training text. That is, in the second execution result, the training text contains the following three entity types: grade, location, price.
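For concreteness, the following sketch shows how such an execution result might be generated, assuming the pre-training language model is the T5 implementation from the Hugging Face transformers library (the application only optionally names a text-to-text conversion model); the prompt wording is illustrative.

```python
# A minimal sketch, assuming T5 from the transformers library as the
# pre-training language model. The prompt text is an assumed template.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

prompt = ("Please extract the words that are named entities from the text, "
          "and give the entity type of each extracted word.\n"
          "Text: Where can I buy inexpensive hamburgers and fries within 1 mile?")

input_ids = tokenizer(prompt, return_tensors="pt").input_ids
output_ids = model.generate(input_ids, max_new_tokens=64)
# The decoded string plays the role of the execution result.
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```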
103. And determining a first difference between the first execution result and the first label of the main task.
In this embodiment of the present application, the first label is the label of the main task. Specifically, the first label includes the true value (ground truth, GT) of the first entity words in the training text and the GT of the entity types of the first entity words. For example, the training text is: "Where can I buy inexpensive hamburgers and fries within 1 mile?" The first label includes: hamburgers is a dish, fries is a dish, 1 mile is a location, and inexpensive is a price. That is, the GT of the first entity words includes hamburgers, fries, 1 mile and inexpensive, and the GT of the entity types of the first entity words includes: the entity type of hamburgers is dish, the entity type of fries is dish, the entity type of 1 mile is location, and the entity type of inexpensive is price.
The first difference is the difference between the first execution result and the first label. The first difference characterizes the accuracy of the first execution result; that is, the first difference characterizes the effect of the pre-training language model executing the main task.
104. And determining a second difference between the second execution result and a second label of the first auxiliary task.
In this embodiment of the present application, the second label is the label of the first auxiliary task. Specifically, the second label includes the GT of the entity types in the training text. For example, the training text is: "There is an inexpensive four-star restaurant nearby." The second label includes grade, location and price; that is, according to the second label, the training text contains the following three entity types: grade, location, price.
The second difference is the difference between the second execution result and the second label. The second difference characterizes an accuracy of the second execution result, i.e., the second difference characterizes an effect of the pre-trained language model to execute the first auxiliary task.
105. And updating parameters of the pre-training language model based on the first difference and the second difference to obtain a named entity recognition model.
In the embodiment of the application, the named entity recognition model is used for recognizing the named entities of a text. Because the first difference characterizes the effect of the pre-training language model executing the main task, updating the parameters of the pre-training language model based on the first difference can improve the effect of the pre-training language model executing the main task. Because the second difference characterizes the effect of the pre-training language model executing the first auxiliary task, updating the parameters of the pre-training language model based on the second difference can improve the effect of the pre-training language model executing the first auxiliary task. Therefore, the training device updates the parameters of the pre-training language model based on the first difference and the second difference, so that both the effect of the pre-training language model executing the main task and the effect of the pre-training language model executing the first auxiliary task can be improved.
Because the first auxiliary task determines the entity types contained in the training text based on the semantic information of the training text, improving the effect of the pre-training language model executing the first auxiliary task enables the pre-training language model to better understand the semantics of entity types in text, which in turn helps the pre-training language model execute the main task. Therefore, when the training device updates the parameters of the pre-training language model based on the first difference and the second difference to obtain the named entity recognition model, the capability of the named entity recognition model to execute the main task can be improved; that is, the accuracy of the named entity recognition model in performing named entity recognition on text can be improved.
In one possible implementation, the training device determines a loss of the pre-training language model based on the first difference and the second difference, wherein the loss is positively correlated with both the first difference and the second difference. The training device then updates the parameters of the pre-training language model based on the loss through back propagation until the loss of the pre-training language model converges, so as to obtain the named entity recognition model.
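A minimal sketch of one such update step follows, under the same T5 assumption as above; here each difference is realised as the cross-entropy loss returned by a teacher-forced forward pass against the task label, and the total loss is their sum, which is positively correlated with both differences. All prompt and label strings and the hyperparameters are illustrative assumptions.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def task_difference(prompt: str, label: str) -> torch.Tensor:
    """Cross-entropy between the model's output and the task label,
    used here as the 'difference' between execution result and label."""
    enc = tokenizer(prompt, return_tensors="pt")
    label_ids = tokenizer(label, return_tensors="pt").input_ids
    return model(**enc, labels=label_ids).loss

first_difference = task_difference(
    "Please extract the named entities and their entity types.\n"
    "Text: Where can I buy inexpensive hamburgers and fries within 1 mile?",
    "hamburgers is a dish, fries is a dish, 1 mile is a location, "
    "inexpensive is a price")
second_difference = task_difference(
    "Please list the entity types that appear in the text.\n"
    "Text: There is an inexpensive four-star restaurant nearby.",
    "grade, location, price")

loss = first_difference + second_difference  # positively correlated with both
optimizer.zero_grad()
loss.backward()
optimizer.step()  # one parameter update; repeat until the loss converges
```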
In the embodiment of the application, the training data comprises a training text, a main task instruction and a first auxiliary task instruction, wherein the main task instruction comprises the first task content of the main task, the first task content comprises determining a first entity word in the training text and an entity type of the first entity word, the first auxiliary task instruction comprises the second task content of the first auxiliary task, and the second task content comprises determining the entity types contained in the training text. Therefore, after acquiring the training data, the training device trains the pre-training language model by using the training data, so that the pre-training language model can execute the main task on the training text according to the first task content in the main task instruction to obtain a first execution result, and can execute the first auxiliary task on the training text according to the second task content in the first auxiliary task instruction to obtain a second execution result. By determining the difference between the first execution result and the first label of the main task, the training device can obtain a first difference that characterizes the effect of the pre-training language model executing the main task. By determining the difference between the second execution result and the second label of the first auxiliary task, the training device can obtain a second difference that characterizes the effect of the pre-training language model executing the first auxiliary task. Thus, by updating the parameters of the pre-training language model based on the first difference and the second difference, the training device can improve the effect of the pre-training language model executing the main task and the effect of the pre-training language model executing the first auxiliary task. The training device updates the parameters of the pre-training language model based on the first difference and the second difference to obtain the named entity recognition model, which can improve the accuracy of the named entity recognition model in recognizing the named entities of a text.
In addition, since the training object of the training method is a pre-training language model, the training method can obtain a named entity recognition model using only a small amount of training text; in other words, by utilizing the learning capability of the pre-training language model, the training method can obtain the named entity recognition model through few-shot training.
As an optional implementation manner, the training data further includes a preset entity type, and the first task content includes determining a first entity word in the training text and determining an entity type of the first entity word from the preset entity types, where determining the entity type of the first entity word is determining the entity type of the first entity word from the preset entity types. For example, the preset entity types include: restaurant name, grade, welfare facility, time, dish, meal, location, price, then the first task content is determining a first entity word in the training text, and determining an entity type of the first entity word from the restaurant name, grade, welfare facility, time, dish, meal, location, price.
In this embodiment, when the training data further includes a preset entity type, and the first task content includes determining an entity type of the first entity word from the preset entity types, the main task is executed on the training text according to the first task content in the main task instruction, so that the pre-training language model can improve the capability of identifying the named entity of the preset entity type in the text, and further can improve the accuracy of identifying the named entity of the preset entity type in the text by the named entity identification model obtained through training.
In one possible implementation scenario, since the named entity recognition model has the ability to recognize named entities of the preset entity types in text, the named entity recognition model can be used to recognize the named entities of a specific domain by changing the preset entity types to the entity types of the named entities of that domain. For example, after the named entity recognition model is obtained through training, if the preset entity types are changed to the entity types of the fashion domain, the named entity recognition model may be used to recognize the named entities of the fashion domain; optionally, the entity types of the fashion domain may include: character, clothing, store name, grade, color, style, material, size. For another example, after the named entity recognition model is obtained through training, if the preset entity types are changed to the entity types of the legal domain, the named entity recognition model may be used to recognize the named entities of the legal domain; optionally, the entity types of the legal domain may include: person, time, place, motive, event.
As an alternative embodiment, the training data further comprises an example describing the entity type of a second entity word, wherein the second entity word is a named entity and the second entity word is different from the first entity word. For example, the training text is: "Inexpensive hamburgers and fries are available anywhere within 1 mile." The example includes: the noodle restaurant is a restaurant name, senior is a grade, group meal is a welfare facility, late night is a time, curry corner is a dish, sushi is a meal, Dick is a location, and $10 per person or less is a price. The first entity words in the training text include: hamburgers, fries, 1 mile, inexpensive; the second entity words in the example include: the noodle restaurant, senior, group meal, late night, curry corner, sushi, Dick, and $10 per person or less.
In this embodiment, the implementation process of executing the main task according to the first task content in the main task instruction by the pre-training language model to obtain the first execution result includes the following steps: under the example prompt, the pre-training language model executes the main task according to the first task content in the main task instruction to obtain a first execution result.
In the case that the training data includes examples, training the pre-training language model with the training data enables the pre-training language model to learn, through in-context learning (in-context learning, ICL), how to identify a named entity and the entity type of the named entity from the examples, thereby helping the pre-training language model execute the main task. Therefore, under the prompt of the examples, the pre-training language model executes the main task according to the first task content in the main task instruction to obtain the first execution result, which can improve the effect of the pre-training language model executing the main task, and can further improve the training effect and the training efficiency of the pre-training language model.
Optionally, the number of examples is the same for every entity type. For example, if there are 12 entity types in total among the examples in the training data, and there are 5 examples for each entity type, then there are 12 × 5 = 60 examples in the training data in total. In this way, the pre-training language model learns the different entity types in a more balanced manner.
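The following small sketch illustrates such balanced sampling; the helper name and the data layout are assumptions for illustration.

```python
import random

def sample_balanced_examples(examples_by_type: dict, per_type: int = 5) -> list:
    """Draw the same number of examples for every entity type so that the
    model's exposure to the different entity types is balanced.
    Assumes each type has at least `per_type` examples available."""
    pool = []
    for entity_type, examples in examples_by_type.items():
        pool.extend(random.sample(examples, per_type))
    return pool

# With 12 entity types and 5 examples per type, the pool holds 12 * 5 = 60 examples.
```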
As an alternative embodiment, the training data further comprises a second auxiliary task instruction, wherein the second auxiliary task instruction comprises the task content of the second auxiliary task, which is referred to herein as third task content. Specifically, the third task content includes determining the first entity words from the training text; in other words, the third task content includes finding out which words in the training text are named entities, i.e., finding the spans (span) of the named entities in the training text.
In this embodiment, the training device, prior to performing step 105, further performs the steps of:
2001. In the process of training the pre-training language model by using the training data, the pre-training language model executes the second auxiliary task according to the third task content in the second auxiliary task instruction to obtain a third execution result.
In the process of training the pre-training language model by using the training data, the pre-training language model executes the second auxiliary task according to the third task content in the second auxiliary task instruction to obtain the third execution result. Specifically, the pre-training language model determines the first entity words from the training text according to the third task content of the second auxiliary task instruction to obtain the third execution result. For example, the training text is: "Show me a restaurant with good ratings and low prices." The pre-training language model processes the training text according to the second auxiliary task instruction, and the third execution result is: good ratings is a first entity word, and low prices is a first entity word. That is, in the third execution result, the training text contains the following two first entity words: good ratings, low prices.
2002. And determining a third difference between the third execution result and a third label of the second auxiliary task.
In this embodiment of the present application, the third label is the label of the second auxiliary task. Specifically, the third label includes the GT of the first entity words in the training text. For example, the training text is: "Show me a restaurant with good ratings and low prices." The third label includes: good ratings, low prices; that is, according to the third label, the training text contains the following two first entity words: good ratings, low prices.
The third difference is the difference between the third execution result and the third label. The third difference characterizes an accuracy of the third execution result, i.e. the third difference characterizes an effect of the pre-trained language model to execute the second auxiliary task.
After determining the third difference, the training device performs the following steps in performing step 105:
2003. And updating parameters of the pre-training language model based on the first difference, the second difference and the third difference to obtain the named entity recognition model.
Because the third difference characterizes the effect of the pre-training language model executing the second auxiliary task, updating the parameters of the pre-training language model based on the third difference can improve the effect of the pre-training language model executing the second auxiliary task. Specifically, updating the parameters of the pre-training language model based on the third difference enables the pre-training language model to more accurately determine which words in the training text are named entities, that is, to more accurately determine the spans of the named entities in text. Because accurately determining the spans of the named entities in text helps the pre-training language model execute the main task, in step 2003 the training device updates the parameters of the pre-training language model based on the first difference and the second difference, and also updates them based on the third difference, thereby further improving the capability of the pre-training language model to execute the main task.
In one possible implementation, the training device determines a loss of the pre-training language model based on the first difference, the second difference and the third difference, wherein the loss is positively correlated with the first difference, the second difference and the third difference. The training device then updates the parameters of the pre-training language model based on the loss through back propagation until the loss of the pre-training language model converges, so as to obtain the named entity recognition model.
In this embodiment, the training data further comprises a second auxiliary task instruction, wherein the second auxiliary task instruction comprises the third task content of the second auxiliary task, and the third task content comprises determining the first entity words from the training text. Therefore, the training device trains the pre-training language model by using the training data, and the pre-training language model can execute the second auxiliary task on the training text according to the third task content in the second auxiliary task instruction to obtain the third execution result. By determining the difference between the third execution result and the third label of the second auxiliary task, the training device can obtain a third difference that characterizes the effect of the pre-training language model executing the second auxiliary task. Thus, the training device updates the parameters of the pre-training language model based on the first difference, the second difference and the third difference, so that the effect of the pre-training language model executing the main task, the effect of the pre-training language model executing the first auxiliary task and the effect of the pre-training language model executing the second auxiliary task can all be improved. The training device updates the parameters of the pre-training language model based on the first difference, the second difference and the third difference to obtain the named entity recognition model, which can improve the accuracy of the named entity recognition model in performing named entity recognition on text.
As an alternative embodiment, the first execution result and the third execution result each include text describing the first entity words in a preset sentence structure; for example, the preset sentence structure is "A is a named entity", where A is a first entity word. In this way, the pre-training language model can output the execution result of the second auxiliary task in the same manner as it outputs the execution result of the main task, which strengthens the promoting effect of the second auxiliary task on the main task, so that training the pre-training language model on the second auxiliary task improves the effect of the pre-training language model executing the main task to a greater extent.
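As an illustration, both execution results could be serialised with one sentence skeleton; the wording "A is a B" below is an assumed instance of the preset sentence structure, and the function names are hypothetical.

```python
def format_main_result(entities: dict) -> str:
    """Main task output: each first entity word with its entity type."""
    return ", ".join(f"{word} is a {etype}" for word, etype in entities.items())

def format_span_result(entity_words: list) -> str:
    """Second auxiliary task output: the same sentence skeleton, with the
    fixed phrase 'named entity' in place of a concrete entity type."""
    return ", ".join(f"{word} is a named entity" for word in entity_words)

print(format_main_result({"hamburgers": "dish", "1 mile": "location"}))
print(format_span_result(["good ratings", "low prices"]))
```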
As an alternative embodiment, the training data further comprises a third auxiliary task instruction, wherein the third auxiliary task instruction comprises the task content of the third auxiliary task, which is referred to herein as fourth task content. Specifically, the fourth task content includes determining the entity types of the first entity words in the training text; in other words, the fourth task content is to determine the entity types of the named entities given which words in the training text are named entities, that is, given the spans of the named entities in the training text, to determine the entity type of each span.
In this embodiment, the training device further performs the following steps before performing step 2003:
3001. In the process of training the pre-training language model by using the training data, the pre-training language model executes the third auxiliary task according to the fourth task content in the third auxiliary task instruction to obtain a fourth execution result.
In the process of training the pre-training language model by using the training data, the pre-training language model executes the third auxiliary task according to the fourth task content in the third auxiliary task instruction to obtain the fourth execution result. Specifically, the pre-training language model determines the entity types of the first entity words in the training text according to the fourth task content of the third auxiliary task instruction to obtain the fourth execution result. For example, the training text is: "Where is the best restaurant under 10 yuan in Minneapolis?", and the fourth task content is: please determine the entity types of the following words in the training text: best, Minneapolis, under 10 yuan. The pre-training language model processes the training text according to the third auxiliary task instruction, and the fourth execution result is: the entity type of best is grade, the entity type of Minneapolis is location, and the entity type of under 10 yuan is price.
3002. And determining a fourth difference between the fourth execution result and a fourth label of the third auxiliary task.
In this embodiment of the present application, the fourth label is the label of the third auxiliary task. Specifically, the fourth label includes the ground truth (GT) of the entity types of words in the training text. For example, the training text is: where is the best restaurant under 10 yuan in Minneapolis; the fourth task content is: please determine the entity types of the following words in the training text: best, Minneapolis, below 10 yuan; and the fourth label includes: the entity type of best is grade, the entity type of Minneapolis is location, and the entity type of below 10 yuan is price.
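By way of illustration only, the following minimal Python sketch shows how an (instruction, label) training sample for the third auxiliary task could be assembled from an annotated sentence; the exact instruction wording, the helper name build_entity_typing_sample and the label phrasing are assumptions for illustration, not part of the disclosed method.

```python
# A minimal sketch (assumed formats) of building the third auxiliary task
# instruction and its fourth label from an annotated training text.

def build_entity_typing_sample(text, annotations):
    """annotations: list of (entity_word, entity_type) pairs, e.g.
    [("best", "grade"), ("Minneapolis", "location"), ("below 10 yuan", "price")]."""
    words = ", ".join(w for w, _ in annotations)
    instruction = (
        "Please determine the entity types of the following words "
        f"in the training text: {words}. Training text: {text}"
    )
    # Fourth label: the ground-truth answer, written in the same sentence
    # structure the model is expected to produce.
    label = ", ".join(f"the entity type of {w} is {t}" for w, t in annotations)
    return instruction, label

inst, label = build_entity_typing_sample(
    "Where is the best restaurant under 10 yuan in Minneapolis?",
    [("best", "grade"), ("Minneapolis", "location"), ("below 10 yuan", "price")],
)
print(inst)
print(label)
```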
The fourth difference is the difference between the fourth execution result and the fourth label. The fourth difference characterizes the accuracy of the fourth execution result, i.e. the fourth difference characterizes the effect of the pre-trained language model to execute the third auxiliary task.
After determining the fourth difference, the training device performs the following steps in performing step 105:
3003. and updating parameters of the pre-training language model based on the first difference, the second difference, the third difference and the fourth difference to obtain the named entity recognition model.
Because the fourth difference characterizes the effect of the pre-training language model in executing the third auxiliary task, updating the parameters of the pre-training language model based on the fourth difference can improve the effect of the pre-training language model in executing the third auxiliary task. Specifically, based on the fourth difference, the parameters of the pre-training language model are updated so that the pre-training language model can more accurately determine the entity type of a named entity in the training text. Because accurately determining the entity type of a named entity in text is beneficial to executing the main task, in step 3003 the training device updates the parameters of the pre-training language model based on the first difference, the second difference and the third difference, and also updates the parameters of the pre-training language model based on the fourth difference, thereby further improving the capability of the pre-training language model to execute the main task.
In one possible implementation, the training device determines a loss of the pre-training language model based on the first difference, the second difference, the third difference, and the fourth difference, wherein the loss is positively correlated with the first difference, the second difference, the third difference, and the fourth difference. And updating parameters of the pre-training language model based on the loss of the pre-training language model in a back propagation mode until the loss of the pre-training language model converges to obtain a named entity recognition model.
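The following is a minimal sketch of this implementation under stated assumptions: the pre-training language model is taken to be a T5-style text-to-text model accessed through the Hugging Face transformers API, each difference is realized as the token-level cross-entropy loss of one task, and the loss of the model is the equal-weight sum of the four task losses, which is one possible loss that is positively correlated with each difference. The model name, prompts and labels are placeholders.

```python
# A minimal sketch of one update step (assumptions: T5-style text-to-text
# model; equal task weights; placeholder prompts and labels).
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def task_loss(instruction: str, label: str) -> torch.Tensor:
    # Token-level cross-entropy between the model output and the task label
    # stands in for the "difference" of that task.
    enc = tokenizer(instruction, return_tensors="pt", truncation=True)
    tgt = tokenizer(label, return_tensors="pt", truncation=True).input_ids
    return model(input_ids=enc.input_ids,
                 attention_mask=enc.attention_mask,
                 labels=tgt).loss

text = "Where is the best restaurant under 10 yuan in Minneapolis?"
tasks = [  # (instruction, label): main task and the three auxiliary tasks
    (f"Extract the named entities and their entity types. Text: {text}",
     "the entity type of best is grade, the entity type of Minneapolis is "
     "location, the entity type of below 10 yuan is price"),
    (f"Determine the entity types contained in the text. Text: {text}",
     "grade, location and price are entity types that exist in the text"),
    (f"Determine the named entities in the text. Text: {text}",
     "best is a named entity, Minneapolis is a named entity, "
     "below 10 yuan is a named entity"),
    (f"Determine the entity types of: best, Minneapolis, below 10 yuan. "
     f"Text: {text}",
     "the entity type of best is grade, the entity type of Minneapolis is "
     "location, the entity type of below 10 yuan is price"),
]
loss = sum(task_loss(i, l) for i, l in tasks)  # first + second + third + fourth
loss.backward()      # back propagation
optimizer.step()     # parameter update
optimizer.zero_grad()
```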
In this embodiment, the training data further comprises a third auxiliary task instruction, wherein the third auxiliary task instruction comprises the fourth task content of the third auxiliary task, the fourth task content comprising determining the entity type of the first entity word in the training text. Therefore, when the training device trains the pre-training language model by using the training data, the pre-training language model can execute the third auxiliary task according to the fourth task content in the third auxiliary task instruction to obtain a fourth execution result. The training device may obtain a fourth difference characterizing the effect of the pre-training language model in executing the third auxiliary task by determining the difference between the fourth execution result and a fourth label of the third auxiliary task. Thus, the training device updates the parameters of the pre-training language model based on the first difference, the second difference, the third difference and the fourth difference, so that the effects of the pre-training language model in executing the main task, the first auxiliary task, the second auxiliary task and the third auxiliary task can all be improved. The training device updates the parameters of the pre-training language model based on the first difference, the second difference, the third difference and the fourth difference to obtain the named entity recognition model, which can improve the accuracy with which the named entity recognition model recognizes named entities in text.
As an alternative embodiment, the training device determines the fourth difference by performing steps 3001 and 3002 before performing step 105, and performs the following steps in the course of performing step 105: and updating parameters of the pre-training language model based on the first difference, the second difference and the fourth difference to obtain the named entity recognition model. Therefore, the effect of executing the main task by the pre-training language model can be improved, the effect of executing the first auxiliary task by the pre-training language model can be improved, and the effect of executing the third auxiliary task by the pre-training language model can be improved. The training device updates the parameters of the pre-training language model based on the first difference, the second difference and the fourth difference to obtain the named entity recognition model, and the accuracy of the named entity recognition model on the named entity recognition of the text can be improved.
As an optional implementation manner, the training data further includes preset entity types, and the fourth task content includes determining the entity type of the first entity word in the training text from the preset entity types. For example, the preset entity types include: restaurant name, grade, welfare facility, time, dish, meal, location, price; in this case, the fourth task content is to determine the entity type of the first entity word in the training text from among restaurant name, grade, welfare facility, time, dish, meal, location and price.
In this embodiment, when the training data further includes the preset entity types and the fourth task content includes determining the entity type of the first entity word in the training text from the preset entity types, executing the third auxiliary task on the training text according to the fourth task content in the third auxiliary task instruction can improve the capability of the pre-training language model to identify named entities of the preset entity types in text, and can further improve the accuracy with which the trained named entity recognition model identifies named entities of the preset entity types in text.
Referring to fig. 2, fig. 2 is a schematic diagram of a training architecture of a named entity recognition model according to an embodiment of the present application. As shown in fig. 2, the named entity recognition model is obtained through training on a main task, a first auxiliary task, a second auxiliary task and a third auxiliary task, wherein the main task is a named entity recognition task, the first auxiliary task is an entity type extraction task, the second auxiliary task is a first entity word determination task, and the third auxiliary task is an entity type determination task.
As shown in fig. 2, the information required to perform the main task includes: a main task instruction, the training text, the preset entity types and an example, wherein the main task instruction includes the first task content of the main task. Specifically, the first task content is: please extract the first entity words that are named entities from the training text based on the knowledge of the example, and determine the entity type of each first entity word from the preset entity types. The training text is: where can inexpensive hamburgers and fries be purchased within 1 mile. The preset entity types include: restaurant name, grade, welfare facility, time, dish, meal, location, price. The example is: facial restaurant is a restaurant name, four-star is a grade, group meal is a welfare facility, late at night is a time, curry corner is a dish, sushi is a meal, Dick is a location, and 10 dollars or less per person is a price. The named entity recognition model executes the main task according to the main task instruction, and the first execution result obtained is: hamburger is a dish, French fries is a dish, within 1 mile is a location, and inexpensive is a price.
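As a concrete illustration of how the information shown in fig. 2 could be assembled into a single model input for the main task, the following sketch uses assumed prompt wording; the helper name build_main_task_input and the formatting are illustrative only, not mandated by this embodiment.

```python
# A minimal sketch (assumed prompt wording) of assembling the main task input
# from the main task instruction, the example and the training text.
PRESET_TYPES = ["restaurant name", "grade", "welfare facility", "time",
                "dish", "meal", "location", "price"]

def build_main_task_input(training_text: str, example: str) -> str:
    instruction = (
        "Please extract the named entities from the training text based on "
        "the knowledge of the example, and determine the entity type of each "
        "from the preset entity types: " + ", ".join(PRESET_TYPES) + "."
    )
    return instruction + "\nExample: " + example + "\nTraining text: " + training_text

prompt = build_main_task_input(
    "Where can inexpensive hamburgers and fries be purchased within 1 mile?",
    "sushi is a meal, late at night is a time, "
    "10 dollars or less per person is a price",
)
print(prompt)
```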
It should be appreciated that the example guides the pre-training language model in performing the main task by serving as a hint, but the training apparatus need not update the parameters of the pre-training language model based on the example.
Optionally, the first execution result is output in text form according to a preset first template; for example, the first template is "A is B", where A is the first entity word determined by the named entity recognition model from the training text, and B is the entity type of the first entity word determined by the named entity recognition model.
As shown in fig. 2, the information required to perform the first auxiliary task includes: a first auxiliary task instruction and the training text, wherein the first auxiliary task instruction includes the second task content of the first auxiliary task. Specifically, the second task content is: please determine the entity types contained in the training text. The training text is: is there an inexpensive four-star restaurant nearby. The named entity recognition model executes the first auxiliary task according to the first auxiliary task instruction, and the second execution result obtained is: grade is an entity type that exists in the training text, location is an entity type that exists in the training text, and price is an entity type that exists in the training text.
Optionally, the second execution result is output in text form according to a preset second template; for example, the second template is "C is an entity type that exists in the training text", where C is an entity type contained in the training text as determined by the named entity recognition model.
As shown in fig. 2, the information required to perform the second auxiliary task includes: a second auxiliary task instruction and the training text, wherein the second auxiliary task instruction includes the third task content of the second auxiliary task. Specifically, the third task content is: please determine the first entity words from the training text. The training text is: show me restaurants that are rated well but are inexpensive. The named entity recognition model executes the second auxiliary task according to the second auxiliary task instruction, and the third execution result obtained is: rated well is a named entity, and inexpensive is a named entity.
Optionally, the third execution result is output in text form according to a preset third template; for example, the third template is "D is a named entity", where D is the first entity word determined by the named entity recognition model from the training text.
As shown in fig. 2, the information required to perform the third auxiliary task includes: a third auxiliary task instruction, the training text and the preset entity types, wherein the third auxiliary task instruction includes the fourth task content of the third auxiliary task. Specifically, the fourth task content is: please determine, from the preset entity types and according to the training text, the entity types of the following first entity words: best, Minneapolis, below 10 yuan. The training text is: where is the best restaurant under 10 yuan in Minneapolis. The preset entity types include: restaurant name, grade, welfare facility, time, dish, meal, location, price. The named entity recognition model executes the third auxiliary task according to the third auxiliary task instruction, and the fourth execution result obtained is: the entity type of best is grade, the entity type of Minneapolis is location, and the entity type of below 10 yuan is price.
Optionally, the fourth execution result is output in text form according to a preset fourth template; for example, the fourth template is "the entity type of E is F", where E is the first entity word determined in the training text, and F is the entity type of the first entity word determined by the named entity recognition model.
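The four output templates described above can be summarized, purely for illustration and with assumed phrasing, as simple formatting functions:

```python
# Illustrative output templates for the four tasks (assumed phrasing).
def first_template(word, etype):   # main task: "A is B"
    return f"{word} is a {etype}"

def second_template(etype):        # first auxiliary task
    return f"{etype} is an entity type that exists in the training text"

def third_template(word):          # second auxiliary task: "D is a named entity"
    return f"{word} is a named entity"

def fourth_template(word, etype):  # third auxiliary task: "the entity type of E is F"
    return f"the entity type of {word} is {etype}"

print(first_template("hamburger", "dish"))
print(fourth_template("Minneapolis", "location"))
```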
After training to obtain a named entity recognition model based on the training architecture shown in fig. 2, the named entity recognition effect of the named entity recognition model is compared with the named entity recognition effect of state-of-the-art (SOTA) models on the CoNLL-2003 dataset, wherein the compared models include: design challenges and misconceptions in neural sequence labeling (DCAM), end-to-end sequence labeling via bi-directional LSTM-CNNs-CRF (ESLV), uncertainty-aware label refinement for sequence labeling (UALR), deep contextualized entity representations with entity-aware self-attention (DCER), a unified MRC framework for named entity recognition (AUMF), named entity recognition as dependency parsing (NERA), pre-training of deep bidirectional transformers for language understanding (LC-BERT), sequence-to-sequence pre-training for natural language generation, translation, and comprehension (LC-BART), a unified BART (bidirectional and auto-regressive transformers) framework for various NER subtasks (BARTNER), and lightweight low-resource NER via pluggable prompting (LightNER). Specific comparative effects are shown in table 1 below.
TABLE 1
In table 1, F1 is an index representing the accuracy of named entity recognition of a model: a named entity is recognized correctly only if the model correctly identifies the named entity in the text and correctly identifies its entity type; if the model does not correctly identify the named entity, or does not correctly identify its entity type, the recognition is counted as incorrect. span-F1 is an index representing the accuracy of span recognition of a model: a span is recognized correctly if the model correctly identifies the named entity in the text, and incorrectly otherwise. As can be seen from table 1, the performance of the named entity recognition model is at the same level as that of the SOTA models, and the span-F1 of the named entity recognition model is higher than that of BARTNER.
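As a concrete reading of the two indices, the following sketch scores a prediction as a set of (span, entity type) pairs for F1 and as a set of spans alone for span-F1; the set-matching, micro-averaged formulation shown is a common choice and an assumption here, not a statement of how table 1 was computed.

```python
# A minimal sketch of F1 over (span, type) pairs and span-F1 over spans alone.
def f1(pred, gold):
    pred, gold = set(pred), set(gold)
    tp = len(pred & gold)                     # correctly recognized items
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

pred = [("hamburger", "dish"), ("1 mile", "location")]
gold = [("hamburger", "dish"), ("1 mile", "price")]
print(f1(pred, gold))                                  # F1: span and type must match
print(f1([s for s, _ in pred], [s for s, _ in gold]))  # span-F1: span alone must match
```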
In addition, table 2 shows a comparison of the named entity recognition effect of the named entity recognition model trained based on the training architecture of fig. 2 with the named entity recognition effect of BARTNER in the case where the models are obtained through training on small samples.
TABLE 2
As shown in table 2, in the case of sampling 10 samples, 20 samples and 50 samples from the MIT Movie dataset as training samples, respectively, the accuracy of the named entity recognition model is significantly higher than that of BARTNER. In the case of sampling 10 samples, 20 samples and 50 samples from the MIT Restaurant dataset as training samples, respectively, the accuracy of the named entity recognition model is significantly higher than that of BARTNER. In the case of sampling 10 groups, 20 groups and 50 groups of samples from the air travel information system (ATIS) dataset as training samples, respectively, where 1 group of samples includes 100 to 200 samples, the accuracy of the named entity recognition model is slightly lower than that of BARTNER. From table 2, it can be seen that training on small samples based on the training architecture shown in fig. 2 achieves a good training effect, so that the recognition accuracy of the trained named entity recognition model is high.
Referring to fig. 3, fig. 3 is a flow chart of a named entity recognition method according to an embodiment of the present application. The execution subject of the named entity recognition method is a named entity recognition device (hereinafter simply referred to as the recognition device), where the recognition device may be any electronic device capable of executing the technical solution disclosed in the method embodiments of the present application. Optionally, the recognition device may be one of the following: a computer, a server.
301. And acquiring a text to be identified.
In the embodiments of the present application, the text to be recognized may be any text. Optionally, the text to be recognized is different from the training text.
In one implementation of obtaining text to be recognized, the recognition device receives text to be recognized entered by a user through an input component. In another implementation manner of obtaining the text to be recognized, the recognition device receives the text to be recognized sent by the terminal.
302. and carrying out named entity recognition on the text to be recognized by using a named entity recognition model to obtain a named entity recognition result.
In the embodiment of the application, the named entity recognition model is obtained through training by the named entity recognition model training method. The recognition device carries out named entity recognition on the text to be recognized by using the named entity recognition model, and can determine which words in the text to be recognized are named entities and the entity types of the words which are the named entities. Thus, the named entity recognition result obtained by performing step 302 includes which words in the text to be recognized are named entities and the entity types of these words that are named entities.
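A minimal sketch of steps 301 and 302 follows, assuming the trained named entity recognition model is a T5-style text-to-text model saved at a local path; the path and the prompt wording are assumptions for illustration.

```python
# A minimal sketch of inference (assumed local path and prompt wording).
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("./named_entity_recognition_model")
model = T5ForConditionalGeneration.from_pretrained("./named_entity_recognition_model")

def recognize(text: str) -> str:
    # Step 301: the text to be recognized; step 302: named entity recognition.
    prompt = f"Please extract the named entities and their entity types. Text: {text}"
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=64)
    return tokenizer.decode(out[0], skip_special_tokens=True)

print(recognize("Where can inexpensive hamburgers be purchased within 1 mile?"))
```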
In the embodiment of the application, after the recognition device obtains the text to be recognized, the named entity recognition model is utilized to perform named entity recognition on the text to be recognized, so that a named entity recognition result is obtained, and the accuracy of the named entity recognition result can be improved.
In one possible implementation scenario, the text to be identified is a text to be translated, and the identifying device uses the named entity identifying model to identify the named entity of the text to be identified under the condition that the identifying device receives an instruction for translating the text to be identified, so as to obtain a named entity identifying result. And then, based on the named entity recognition result, the text to be recognized is translated, so that the accuracy of translation can be improved.
In another possible implementation scenario, the text to be identified is a question posed by the user, and the identifying device uses the named entity identifying model to identify the named entity of the text to be identified under the condition that the identifying device receives the instruction for answering the text to be identified, so as to obtain the named entity identifying result. And then, the answer is carried out based on the named entity recognition result, so that the accuracy of the answer can be improved.
As an optional implementation manner, the identifying device further performs the following steps after obtaining the named entity identifying result:
4001. And determining a target mapping relation based on the named entity recognition result.
In the embodiment of the present application, the target mapping relationship is a mapping relationship between a target entity word in the text to be recognized and the target entity type of the target entity word, where the target entity word is a named entity. That is, the target mapping relationship is the mapping relationship between a named entity in the text to be recognized and its entity type. For example, the text to be recognized is: inexpensive hamburgers and fries can be purchased anywhere within 1 mile, and the following words in the text to be recognized are named entities: hamburger, French fries, 1 mile, inexpensive; accordingly, the target entity words include: hamburger, French fries, 1 mile, inexpensive. Since the entity type of hamburger is dish, the entity type of French fries is dish, the entity type of 1 mile is location, and the entity type of inexpensive is price, the target mapping relationship includes: the mapping relationship between hamburger and dish, the mapping relationship between French fries and dish, the mapping relationship between 1 mile and location, and the mapping relationship between inexpensive and price. In one possible implementation, the target mapping relationship is obtained by forming a word pair of the target entity word and the target entity type; for example, if the target entity word is hamburger and the target entity type is dish, the target mapping relationship may be (hamburger, dish). In another possible implementation, the target mapping relationship is obtained by establishing an index between the target entity word and the target entity type. Since the result text, i.e., the named entity recognition result, is obtained by performing named entity recognition on the text to be recognized, the recognition device can determine the target mapping relationship based on the result text.
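Both representations mentioned above can be illustrated with a short sketch; the variable names are placeholders.

```python
# A minimal sketch of the two representations of the target mapping
# relationship: a word pair per named entity, and an index from target
# entity word to target entity type.
recognized = [("hamburger", "dish"), ("French fries", "dish"),
              ("1 mile", "location"), ("inexpensive", "price")]

word_pairs = list(recognized)                        # e.g. ("hamburger", "dish")
index = {word: etype for word, etype in recognized}  # e.g. index["1 mile"] == "location"
print(word_pairs[0], index["1 mile"])
```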
In this embodiment, after the recognition device obtains the recognition result of the named entity of the text to be recognized, the recognition device determines the target mapping relationship based on the result text, so that the relationship between the target entity word and the target entity type in the text to be recognized can be more directly reflected through the target mapping relationship.
As an alternative implementation manner, the named entity recognition result describes the candidate entity type of a candidate entity word according to a preset sentence pattern, wherein the candidate entity word is a named entity. That is, the named entity recognition result describes the entity type of the named entity in text form; a named entity in the named entity recognition result is called a candidate entity word, and the entity type of the candidate entity word is called a candidate entity type. For example, if the named entity recognition result is: Parker is a person name, then Parker is a candidate entity word, and person name is a candidate entity type. The named entity recognition result describes the candidate entity type of the candidate entity word according to the preset sentence pattern; for example, the preset sentence pattern is "X is Y", where X is the candidate entity word and Y is the candidate entity type.
As an optional implementation manner, the result text describes the entity type of the entity word according to a preset sentence pattern;
The identification means performs the following steps in performing step 4001:
5001. and dividing texts used for describing the same candidate entity words in the named entity recognition result into the same sub-texts to obtain at least one sub-text.
By dividing the text describing the same candidate entity word into the same sub-text, the content in the same sub-text can be used for describing the same candidate entity word, and the content in different sub-texts can be used for describing different candidate entity words. In the case where the number of candidate entity words in the named entity recognition result is 1, the number of sub-texts is 1. And under the condition that the number of candidate entity words in the named entity recognition result is greater than 1, the number of the sub-texts is greater than 1. Specifically, the number of sub-texts is the same as the number of candidate entity words.
For example, the named entity recognition result includes: hamburger is a dish, 1 mile is a location. Here, hamburger and 1 mile are different candidate entity words. Therefore, "hamburger is a dish" is a sub-text describing the candidate entity word hamburger, and "1 mile is a location" is a sub-text describing the candidate entity word 1 mile.
5002. And determining the candidate entity words in each sub-text and the candidate entity types in each sub-text based on the preset sentence patterns.
Because the position of the candidate entity word and the position of the candidate entity type are both fixed in the preset sentence pattern, and each sub-text describes the candidate entity type of a candidate entity word according to the preset sentence pattern, the recognition device can determine the candidate entity word in the sub-text and the candidate entity type of the candidate entity word based on the preset sentence pattern. For example, the preset sentence pattern is "X is Y", where X is the candidate entity word and Y is the candidate entity type; according to the preset sentence pattern, the word before "is" is the candidate entity word, and the word after "is" is the candidate entity type. If "hamburger is a dish" is a sub-text obtained by dividing the named entity recognition result, it can be determined based on the preset sentence pattern that hamburger is the candidate entity word and dish is the candidate entity type.
5003. And determining the target mapping relation based on the candidate entity words and the candidate entity types.
In one possible implementation manner, for the candidate entity word and the candidate entity type in the same sub-text, when the candidate entity word belongs to the text to be identified and the candidate entity type is a preset entity type, determining that a mapping relationship between the candidate entity word and the candidate entity type is the target mapping relationship.
In this embodiment of the present application, the preset entity type may be set according to actual requirements. For example, if the named entity recognition model is required to recognize named entities and entity types in the legal field, the preset entity type may be an entity type in the legal field; if the named entity recognition model is required to recognize named entities and entity types in the e-commerce field, the preset entity type may be an entity type in the e-commerce field.
It should be appreciated that the preset entity types in the training method of the named entity recognition model are different from the preset entity types in the named entity recognition method. Specifically, in the training method of the named entity recognition model, the pre-training language model is trained based on the preset entity types to obtain the named entity recognition model, so that the named entity recognition model has the capability of recognizing named entities of the preset entity types; in the named entity recognition method, the preset entity types are the specific entity types whose named entities are expected to be recognized by using the named entity recognition model. For example, the preset entity types in the training method of the named entity recognition model include: restaurant name, grade, welfare facility, time, dish, meal, location, price; the preset entity types in the named entity recognition method include: character, clothing, store name, grade, color, style, material, size.
On the one hand, a candidate entity word may be different from the target entity word, so it is necessary to further judge whether the candidate entity word is a target entity word; specifically, if the candidate entity word belongs to the text to be recognized, the candidate entity word is a target entity word. On the other hand, the candidate entity type may not be a preset entity type, so it is further necessary to determine whether the candidate entity type is a preset entity type.
Moreover, because, for a candidate entity word and a candidate entity type in the same sub-text, the candidate entity type is the entity type of that candidate entity word, the mapping relationship between the candidate entity word and the candidate entity type can be determined to be the target mapping relationship in the case where the candidate entity word is a target entity word and the candidate entity type is a preset entity type.
Based on the above, the recognition device determines, for the candidate entity word and the candidate entity type in the same sub-text, that the mapping relationship between the candidate entity word and the candidate entity type is the target mapping relationship when the candidate entity word belongs to the text to be recognized and the candidate entity type is the preset entity type. Optionally, for the candidate entity word and the candidate entity type in the same sub-text, determining that the mapping relationship between the candidate entity word and the candidate entity type is not the target mapping relationship under the condition that the candidate entity word does not belong to the text to be recognized. Optionally, for the candidate entity word and the candidate entity type in the same sub-text, determining that the mapping relationship between the candidate entity word and the candidate entity type is not the target mapping relationship under the condition that the candidate entity type is not the preset entity type.
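Steps 5001 to 5003 can be illustrated with the following minimal sketch, which assumes the preset sentence pattern is "X is Y", assumes sub-texts are separated by commas in the named entity recognition result, and uses a placeholder set of preset entity types.

```python
# A minimal sketch of steps 5001-5003 under the stated assumptions.
PRESET_TYPES = {"dish", "location", "price"}

def target_mapping(result_text: str, text_to_recognize: str) -> dict:
    mapping = {}
    # 5001: divide the result text into one sub-text per candidate entity word.
    for sub_text in result_text.split(","):
        # 5002: per the pattern "X is Y", the words before "is" are the
        # candidate entity word and the words after "is" are the candidate type.
        word, _, etype = sub_text.strip().partition(" is ")
        # 5003: keep the pair only if the word occurs in the text to be
        # recognized and the type is one of the preset entity types.
        if word in text_to_recognize and etype in PRESET_TYPES:
            mapping[word] = etype
    return mapping

print(target_mapping("hamburger is dish, 1 mile is location",
                     "Where can inexpensive hamburgers be purchased within 1 mile?"))
```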
In another possible implementation manner, the recognition device determines, for a candidate entity word and a candidate entity type in the same sub-text, that a mapping relationship between the candidate entity word and the candidate entity type is a target mapping relationship in a case that the candidate entity word belongs to a text to be recognized.
In still another possible implementation manner, the identifying device determines, for the candidate entity word and the candidate entity type in the same sub-text, that the mapping relationship between the candidate entity word and the candidate entity type is a target mapping relationship when the candidate entity type is a preset entity type.
In yet another possible implementation manner, the identifying means determines a mapping relationship between the candidate entity word and the candidate entity type as the target mapping relationship.
In this embodiment, when the named entity recognition result describes the candidate entity type of the candidate entity word according to the preset sentence pattern, the text used for describing the same candidate entity word in the named entity recognition result is first divided into the same sub-text, so as to obtain at least one sub-text. Then, based on the preset sentence pattern, candidate entity words in each sub-text and candidate entity types in each sub-text can be determined, and further, a target mapping relation can be determined based on the candidate entity words in each sub-text and the candidate entity types in each sub-text.
It will be appreciated by those skilled in the art that in the above-described method of the specific embodiments, the written order of steps is not meant to imply a strict order of execution but rather should be construed according to the function and possibly inherent logic of the steps.
The foregoing details the method of embodiments of the present application, and the apparatus of embodiments of the present application is provided below.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a training device for a named entity recognition model according to an embodiment of the present application, where the training device 1 for a named entity recognition model includes: acquisition unit 11, execution unit 12, determination unit 13, update unit 14, specifically:
an obtaining unit 11, configured to obtain training data, where the training data includes a training text, a main task instruction, and a first auxiliary task instruction, the main task instruction includes first task content of a main task, the first task content includes determining a first entity word in the training text and an entity type of the first entity word, the first entity word is a named entity, the first auxiliary task instruction includes second task content of the first auxiliary task, and the second task content includes determining an entity type included in the training text;
The execution unit 12 is configured to execute the main task on the training text according to the main task instruction to obtain a first execution result in the training process of using the training data to train a pre-training language model, and execute the first auxiliary task on the training text according to the first auxiliary instruction to obtain a second execution result;
a determining unit 13, configured to determine a first difference between the first execution result and a first tag of the primary task;
the determining unit 13 is configured to determine a second difference between the second execution result and a second tag of the first auxiliary task;
an updating unit 14, configured to update parameters of the pre-training language model based on the first difference and the second difference, to obtain a named entity recognition model, where the named entity recognition model is used for performing named entity recognition on text.
In combination with any of the embodiments of the present application, the training data further includes an example of an entity type describing a second entity word, the second entity word being a named entity, and the second entity word being different from the first entity word;
the execution unit 12 is specifically configured to:
And under the example prompt, the pre-training language model executes the main task according to the first task content in the main task instruction to obtain the first execution result.
In combination with any one of the embodiments of the present application, the training data further includes a second auxiliary task instruction, where the second auxiliary task instruction includes third task content of a second auxiliary task, and the third task content includes determining the first entity word from the training text;
the execution unit 12 is further configured to execute the second auxiliary task according to the third task content in the second auxiliary task instruction to obtain a third execution result in a process of training a pre-training language model using the training data;
the determining unit 13 is further configured to determine a third difference between the third execution result and a third label of the second auxiliary task;
the updating unit 14 is specifically configured to update parameters of the pre-training language model based on the first difference, the second difference, and the third difference, to obtain the named entity recognition model.
In combination with any of the embodiments of the present application, the training data further includes a third auxiliary task instruction, the third auxiliary task instruction including fourth task content of a third auxiliary task, the fourth task content including determining an entity type of a word in the training text;
The execution unit 12 is further configured to execute the third auxiliary task according to the fourth task content in the third auxiliary task instruction to obtain a fourth execution result in a process of training a pre-training language model using the training data;
the determining unit 13 is further configured to determine a fourth difference between the fourth execution result and a fourth tag of the third auxiliary task;
the updating unit 14 is specifically configured to update parameters of the pre-training language model based on the first difference, the second difference, the third difference, and the fourth difference, so as to obtain the named entity recognition model.
In combination with any embodiment of the present application, the updating unit 14 is specifically configured to:
determining a loss of the pre-trained language model based on the first difference, the second difference, the third difference, and the fourth difference, the loss being positively correlated to the first difference, the second difference, the third difference, and the fourth difference;
and based on the loss, updating parameters of the pre-training language model to obtain the named entity recognition model.
In combination with any one of the embodiments of the present application, the training data further includes a preset entity type, and the fourth task content includes determining an entity type of the first entity word in the training text from the preset entity type.
In combination with any of the embodiments of the present application, the first execution result and the third execution result each include text describing the first entity word in a preset sentence structure.
In combination with any one of the embodiments of the present application, the training data further includes a preset entity type, and the first task content includes determining an entity type of the first entity word from the preset entity type.
In combination with any of the embodiments of the present application, the pre-trained language model is a text-to-text conversion model.
In the embodiment of the application, the training data comprises a training text, a main task instruction and a first auxiliary task instruction, wherein the main task instruction comprises first task content of the main task, the first task content comprises determining a first entity word in the training text and an entity type of the first entity word, the first auxiliary task instruction comprises second task content of the first auxiliary task, and the second task content comprises determining the entity type contained in the training text. Therefore, after the training device acquires the training data, the training device trains the pre-training language model by utilizing the training data, so that the pre-training language model can execute the main task on the training text according to the first task content in the main task instruction to obtain a first execution result, and the pre-training language model can execute the first auxiliary task on the training text according to the second task content in the first auxiliary instruction to obtain a second execution result. The training device can obtain a first difference used for representing the effect of the pre-training language model on executing the main task by determining the difference between the first execution result and the first label of the main task. The training means may obtain a second difference characterizing the effect of the pre-trained language model executing the first auxiliary task by determining the difference of the second execution result and the second label of the first auxiliary task. Thus, the training device updates the parameters of the pre-training language model based on the first difference and the second difference, so that the effect of the pre-training language model on executing the main task and the effect of the pre-training language model on executing the first auxiliary task can be improved. The training device updates the parameters of the pre-training language model based on the first difference and the second difference to obtain the named entity recognition model, and can improve the accuracy of the named entity recognition model in recognizing the named entity of the text.
In addition, since the training object of the training method is a pre-training language model, the training method can train to obtain a named entity recognition model under the condition of using a small amount of training text, in other words, the training method can train to obtain the named entity recognition model through a small sample by utilizing the learning capability of the pre-training language model.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a named entity recognition device according to an embodiment of the present application. The named entity recognition device 2 includes an obtaining unit 21 and an identifying unit 22; optionally, the named entity recognition device 2 further includes a determining unit 23. Specifically:
an acquisition unit 21 for acquiring a text to be recognized;
and the identifying unit 22 is configured to identify a named entity of the text to be identified by using a named entity identifying model, so as to obtain a named entity identifying result, where the named entity identifying model is obtained through training in the first aspect and any implementation manner thereof.
In combination with any one of the embodiments of the present application, the named entity recognition device further includes: the determining unit 23 is configured to determine, based on the named entity recognition result, a target mapping relationship, where the target mapping relationship is a mapping relationship between a target entity word in the text to be recognized and a target entity type of the target entity word, and the target entity word is a named entity.
In combination with any embodiment of the application, the named entity recognition result describes candidate entity types of candidate entity words according to a preset sentence pattern, wherein the candidate entity words are named entities;
the determining unit 23 is specifically configured to:
dividing texts used for describing the same candidate entity words in the named entity recognition result into the same sub-texts to obtain at least one sub-text;
determining the candidate entity words in each sub-text and the candidate entity types in each sub-text based on the preset sentence patterns;
and determining the target mapping relation based on the candidate entity words and the candidate entity types.
In combination with any embodiment of the present application, the determining unit 23 is specifically configured to determine, for the candidate entity word and the candidate entity type in the same sub-text, that a mapping relationship between the candidate entity word and the candidate entity type is the target mapping relationship when the candidate entity word belongs to the text to be identified and the candidate entity type is a preset entity type.
In the embodiment of the application, after the recognition device obtains the text to be recognized, the named entity recognition model is utilized to perform named entity recognition on the text to be recognized, so that a named entity recognition result is obtained, and the accuracy of the named entity recognition result can be improved.
In some embodiments, functions or modules included in the apparatus provided in the embodiments of the present application may be used to perform the methods described in the foregoing method embodiments, and specific implementations thereof may refer to descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
Fig. 6 is a schematic hardware structure of an electronic device according to an embodiment of the present application. The electronic device 3 comprises a processor 31, a memory 32. Optionally, the electronic device 3 further comprises input means 33 and output means 34. The processor 31, memory 32, input device 33, and output device 34 are coupled by connectors, including various interfaces, transmission lines or buses, etc., as the embodiments are not limited in this respect. It should be understood that in various embodiments of the present application, coupled is intended to mean interconnected by a particular means, including directly or indirectly through other devices, e.g., through various interfaces, transmission lines, buses, etc.
The processor 31 may comprise one or more processors, for example one or more central processing units (central processing unit, CPU), which in the case of a CPU may be a single core CPU or a multi core CPU. Alternatively, the processor 31 may be a processor group constituted by a plurality of CPUs, the plurality of processors being coupled to each other through one or more buses. In the alternative, the processor may be another type of processor, and the embodiment of the present application is not limited.
Memory 32 may be used to store computer program instructions as well as various types of computer program code for performing aspects of the present application. Optionally, the memory includes, but is not limited to, a random access memory (random access memory, RAM), a read-only memory (ROM), an erasable programmable read-only memory (erasable programmable read only memory, EPROM), or a portable read-only memory (compact disc read-only memory, CD-ROM) for associated instructions and data.
The input means 33 are for inputting data and/or signals and the output means 34 are for outputting data and/or signals. The input device 33 and the output device 34 may be separate devices or may be an integral device.
It will be appreciated that in the embodiment of the present application, the memory 32 may be used to store not only related instructions, but also related data, for example, the memory 32 may be used to store training data obtained through the input device 33, or the memory 32 may be further used to store a named entity recognition model obtained through the processor 31, etc., and the embodiment of the present application is not limited to the data specifically stored in the memory.
It will be appreciated that fig. 6 shows only a simplified design of an electronic device. In practical applications, the electronic device may further include other necessary elements, including but not limited to any number of input/output devices, processors, memories, etc., and all electronic devices that may implement the embodiments of the present application are within the scope of protection of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein. It will be further apparent to those skilled in the art that the descriptions of the various embodiments herein are provided with emphasis, and that the same or similar parts may not be explicitly described in different embodiments for the sake of convenience and brevity of description, and thus, parts not described in one embodiment or in detail may be referred to in the description of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
In the above embodiments, the implementation may be realized in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in or transmitted across a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital versatile disc (DVD)), or a semiconductor medium (e.g., a solid state disk (SSD)), or the like.
Those of ordinary skill in the art will appreciate that implementing all or part of the above-described method embodiments may be accomplished by a computer program to instruct related hardware, the program may be stored in a computer readable storage medium, and the program may include the above-described method embodiments when executed. And the aforementioned storage medium includes: a read-only memory (ROM) or a random access memory (random access memory, RAM), a magnetic disk or an optical disk, or the like.

Claims (18)

1. A method for training a named entity recognition model, the method comprising:
acquiring training data, wherein the training data comprises training texts, a main task instruction and a first auxiliary task instruction, the main task instruction comprises first task content of a main task, the first task content comprises determining first entity words in the training texts and entity types of the first entity words, the first entity words are named entities, the first auxiliary task instruction comprises second task content of the first auxiliary task, and the second task content comprises determining entity types contained in the training texts;
In the process of training a pre-training language model by using the training data, the pre-training language model executes the main task according to the first task content in the main task instruction to obtain a first execution result, and the pre-training language model executes the first auxiliary task according to the second task content in the first auxiliary instruction to obtain a second execution result;
determining a first difference between the first execution result and a first label of the main task;
determining a second difference between the second execution result and a second label of the first auxiliary task;
and updating parameters of the pre-training language model based on the first difference and the second difference to obtain a named entity recognition model, wherein the named entity recognition model is used for recognizing the named entity of the text.
2. The method of claim 1, wherein the training data further comprises an example of an entity type describing a second entity word, the second entity word being a named entity and the second entity word being different from the first entity word;
the pre-training language model executes the main task according to the first task content in the main task instruction to obtain a first execution result, and the method comprises the following steps:
And under the example prompt, the pre-training language model executes the main task according to the first task content in the main task instruction to obtain the first execution result.
3. The method of claim 1 or 2, wherein the training data further comprises a second auxiliary task instruction, the second auxiliary task instruction comprising third task content of a second auxiliary task, the third task content comprising determining the first entity word from the training text;
before updating the parameters of the pre-trained language model based on the first difference and the second difference to obtain a named entity recognition model, the method further comprises:
in the process of training a pre-training language model by using the training data, the pre-training language model executes the second auxiliary task according to the third task content in the second auxiliary task instruction to obtain a third execution result;
determining a third difference between the third execution result and a third label of the second auxiliary task;
updating parameters of the pre-training language model based on the first difference and the second difference to obtain a named entity recognition model, wherein the named entity recognition model comprises:
And updating parameters of the pre-training language model based on the first difference, the second difference and the third difference to obtain the named entity recognition model.
4. A method according to claim 3, wherein the training data further comprises a third auxiliary task instruction comprising fourth task content of a third auxiliary task, the fourth task content comprising determining an entity type of a word in the training text;
before updating the parameters of the pre-trained language model based on the first difference, the second difference, and the third difference to obtain the named entity recognition model, the method further includes:
in the process of training a pre-training language model by using the training data, the pre-training language model executes the third auxiliary task according to the fourth task content in the third auxiliary task instruction to obtain a fourth execution result;
determining a fourth difference between the fourth execution result and a fourth tag of the third auxiliary task;
updating parameters of the pre-training language model based on the first difference, the second difference and the third difference to obtain the named entity recognition model, wherein the method comprises the following steps:
And updating parameters of the pre-training language model based on the first difference, the second difference, the third difference and the fourth difference to obtain the named entity recognition model.
5. The method of claim 4, wherein updating parameters of the pre-trained language model based on the first difference, the second difference, the third difference, and the fourth difference to obtain the named entity recognition model comprises:
determining a loss of the pre-trained language model based on the first difference, the second difference, the third difference, and the fourth difference, the loss being positively correlated to the first difference, the second difference, the third difference, and the fourth difference;
and based on the loss, updating parameters of the pre-training language model to obtain the named entity recognition model.
6. The method of claim 4, wherein the training data further comprises preset entity types, and the fourth task content comprises determining the entity type of the first entity word in the training text from among the preset entity types.
7. The method of claim 3, wherein the first execution result and the third execution result each include text describing the first entity word in a preset sentence structure.
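One conceivable "preset sentence structure" for claim 7 is sketched below; the template is an assumption, as the claims leave the concrete wording open.

```python
# Hypothetical template used to verbalize each recognized entity word in the
# execution result; a fixed structure like this is what makes the result
# mechanically parseable later (see the sketch under claim 13).
TEMPLATE = "{word} is a {entity_type} entity"

def describe_entity(word: str, entity_type: str) -> str:
    return TEMPLATE.format(word=word, entity_type=entity_type)

print(describe_entity("Beijing", "LOCATION"))  # Beijing is a LOCATION entity
```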
8. The method of claim 1 or 2, wherein the training data further comprises preset entity types, and the first task content comprises determining the entity type of the first entity word from among the preset entity types.
9. The method of claim 1 or 2, wherein the pre-training language model is a text-to-text conversion model.
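A sketch of running an instruction through a text-to-text conversion model follows, assuming a Hugging Face T5 checkpoint stands in for the pre-training language model; claim 9 does not name a specific model or library.

```python
# Assumes the transformers library and the public "t5-small" checkpoint;
# both are stand-ins, since claim 9 only requires a text-to-text model.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

instruction = ("List every entity word and its entity type.\n"
               "Text: Zhang San lives in Shanghai.")
inputs = tokenizer(instruction, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```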
10. A named entity recognition method, the method comprising:
acquiring a text to be recognized; and
performing named entity recognition on the text to be recognized by using a named entity recognition model to obtain a named entity recognition result, wherein the named entity recognition model is trained by the method of any one of claims 1 to 9.
11. The method of claim 10, wherein after obtaining the named entity recognition result, the method further comprises:
determining a target mapping relationship based on the named entity recognition result, wherein the target mapping relationship is a mapping relationship between a target entity word in the text to be recognized and a target entity type of the target entity word, and the target entity word is a named entity.
12. The method of claim 11, wherein the named entity recognition result describes candidate entity types of candidate entity words in a preset sentence pattern, and the candidate entity words are named entities;
the determining the target mapping relationship based on the named entity recognition result comprises:
dividing text that describes the same candidate entity word in the named entity recognition result into the same sub-text, to obtain at least one sub-text;
determining the candidate entity word and the candidate entity type in each sub-text based on the preset sentence pattern; and
determining the target mapping relationship based on the candidate entity words and the candidate entity types.
13. The method of claim 12, wherein the determining the target mapping relationship based on the candidate entity words and the candidate entity types comprises:
for a candidate entity word and a candidate entity type in the same sub-text, determining the mapping relationship between the candidate entity word and the candidate entity type as the target mapping relationship when the candidate entity word belongs to the text to be recognized and the candidate entity type is one of the preset entity types.
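A minimal parsing sketch for claims 12 and 13 follows. The sub-text separator, the sentence pattern, and the preset entity types are assumptions made only so the control flow is concrete.

```python
import re

# Hypothetical preset sentence pattern matching "<word> is a <TYPE> entity".
SENTENCE_PATTERN = re.compile(r"(?P<word>.+?) is a (?P<entity_type>\w+) entity")

def extract_target_mappings(result: str, text_to_recognize: str,
                            preset_types: set) -> dict:
    mappings = {}
    # Claim 12: text describing the same candidate entity word forms one
    # sub-text; a ";" is assumed here to separate sub-texts.
    for sub_text in result.split(";"):
        match = SENTENCE_PATTERN.search(sub_text.strip())
        if match is None:
            continue
        word = match.group("word")
        entity_type = match.group("entity_type")
        # Claim 13: keep the mapping only if the candidate word occurs in the
        # text to be recognized and its type is one of the preset types.
        if word in text_to_recognize and entity_type in preset_types:
            mappings[word] = entity_type
    return mappings

print(extract_target_mappings(
    "Zhang San is a PERSON entity; Shanghai is a LOCATION entity",
    "Zhang San lives in Shanghai.",
    {"PERSON", "LOCATION"},
))
# {'Zhang San': 'PERSON', 'Shanghai': 'LOCATION'}
```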
14. A training device for a named entity recognition model, characterized in that the training device comprises:
an acquisition unit, used for acquiring training data, wherein the training data comprises a training text, a main task instruction and a first auxiliary task instruction, the main task instruction comprises first task content of a main task, the first task content comprises determining a first entity word in the training text and an entity type of the first entity word, the first entity word is a named entity, the first auxiliary task instruction comprises second task content of a first auxiliary task, and the second task content comprises determining entity types contained in the training text;
an execution unit, used for, in the process of training a pre-training language model by using the training data, causing the pre-training language model to execute the main task on the training text according to the main task instruction to obtain a first execution result, and to execute the first auxiliary task on the training text according to the first auxiliary task instruction to obtain a second execution result;
a determining unit, used for determining a first difference between the first execution result and a first label of the main task;
the determining unit is further used for determining a second difference between the second execution result and a second label of the first auxiliary task; and
an updating unit, used for updating parameters of the pre-training language model based on the first difference and the second difference to obtain a named entity recognition model, wherein the named entity recognition model is used for performing named entity recognition on text.
15. A named entity recognition device, characterized in that the named entity recognition device comprises:
an acquisition unit, used for acquiring a text to be recognized; and
a recognition unit, used for performing named entity recognition on the text to be recognized by using a named entity recognition model to obtain a named entity recognition result, wherein the named entity recognition model is trained by the method of any one of claims 1 to 9.
16. An electronic device, characterized by comprising: a processor and a memory, wherein the memory is used for storing computer program code, the computer program code comprises computer instructions, and when the processor executes the computer instructions, the electronic device is caused to perform the method of any one of claims 1 to 9, or the method of any one of claims 10 to 13.
17. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, the computer program comprises program instructions, and when the program instructions are executed by a processor, the processor is caused to perform the method of any one of claims 1 to 9, or the method of any one of claims 10 to 13.
18. A computer program product, characterized in that the computer program product comprises a computer program or instructions, and when the computer program or instructions are run on a computer, the computer is caused to perform the method of any one of claims 1 to 9, or the method of any one of claims 10 to 13.
CN202311666327.8A 2023-12-06 2023-12-06 Named entity recognition model training method, related method and related product Pending CN117709344A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311666327.8A CN117709344A (en) 2023-12-06 2023-12-06 Named entity recognition model training method, related method and related product

Publications (1)

Publication Number Publication Date
CN117709344A true CN117709344A (en) 2024-03-15

Family

ID=90163227

Country Status (1)

Country Link
CN (1) CN117709344A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination