CN115422936A - Entity identification method, entity identification device, computer equipment and storage medium - Google Patents
Entity identification method, entity identification device, computer equipment and storage medium
- Publication number
- CN115422936A (application CN202211032451.4A)
- Authority
- CN
- China
- Prior art keywords
- entity
- sentence
- target
- recognized
- recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Character Discrimination (AREA)
Abstract
The invention discloses an entity identification method, an entity identification device, computer equipment and a storage medium. The entity identification method comprises the following steps: acquiring an initial recognition sentence; performing entity extraction processing on the initial recognition sentence with a target encoder in an entity recognition model to obtain at least one character string to be recognized; constructing at least one sentence to be recognized from the at least one character string to be recognized with a target decoder in the entity recognition model; and calculating the probability distribution corresponding to each sentence to be recognized to obtain a target recognition sentence and the target entity type corresponding to it. This technical scheme improves the capability of recognizing new entity data.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to an entity identification method and apparatus, a computer device, and a storage medium.
Background
Entity identification is an information extraction technology mainly used to extract entity data, such as person names and place names, from text data. In the prior art, entity identification is generally realized with the BIO (B-begin, I-inside, O-outside) labeling scheme: each entity type has a B label and an I label, where B marks the beginning of an entity, I marks the remaining characters of the entity, and O marks characters that do not belong to any entity. However, a trained entity recognition model often needs to recognize the entity type of new entity data; for example, a model already trained to recognize person names may need to be further trained to recognize company names, while its ability to recognize person names must not degrade. The prior art has to retrain the entity recognition model on a considerable amount of new entity data, and because the new entity data outnumber the old entity data, the recognition capability for the old entity types is affected.
Disclosure of Invention
Embodiments of the present invention provide an entity identification method, an entity identification device, a computer device, and a storage medium, so as to solve the problem of poor recognition capability for new entity data in the existing entity identification process.
An entity identification method, comprising:
acquiring an initial recognition sentence;
adopting a target encoder in an entity recognition model to perform entity extraction processing on the initial recognition sentence to obtain at least one character string to be recognized;
adopting a target decoder in the entity recognition model to construct at least one character string to be recognized, and acquiring at least one sentence to be recognized;
and calculating the probability distribution corresponding to each sentence to be recognized, and acquiring a target recognition sentence and a target entity type corresponding to the target recognition sentence.
An entity identification apparatus comprising:
a sentence acquisition module for acquiring an initial recognition sentence;
the entity extraction module is used for adopting a target encoder in an entity recognition model to perform entity extraction processing on the initial recognition sentence to obtain at least one character string to be recognized;
a sentence construction module, configured to adopt a target decoder in the entity recognition model to construct at least one character string to be recognized, so as to obtain at least one sentence to be recognized;
and the type acquisition module is used for calculating the probability distribution corresponding to each sentence to be recognized and acquiring a target recognition sentence and a target entity type corresponding to the target recognition sentence.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the entity identification method when executing the computer program.
A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the above-mentioned entity identification method.
According to the entity identification method, the entity identification device, the computer equipment and the storage medium, an initial recognition sentence is first obtained, and a target encoder in an entity recognition model performs entity extraction processing on it to obtain at least one character string to be recognized, so that different sentences to be recognized can be constructed from these character strings; this extends the entity types and improves the accuracy of entity recognition. A target decoder in the entity recognition model then constructs at least one sentence to be recognized from the at least one character string to be recognized, so that new entity types can be added and the capability of recognizing new entity data is improved. Finally, the probability distribution corresponding to each sentence to be recognized is calculated, and the target recognition sentence and the target entity type corresponding to it are obtained, which further improves the capability of recognizing new entity data.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the description below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive labor.
FIG. 1 is a schematic diagram of an application environment of an entity identification method according to an embodiment of the invention;
FIG. 2 is a flowchart of an entity identification method according to an embodiment of the present invention;
FIG. 3 is another flow chart of a method for entity identification in an embodiment of the present invention;
FIG. 4 is another flow chart of a method for entity identification in accordance with an embodiment of the present invention;
FIG. 5 is another flow chart of a method for entity identification according to an embodiment of the present invention;
FIG. 6 is another flow chart of a method for entity identification in an embodiment of the present invention;
FIG. 7 is another flow chart of a method for entity identification in an embodiment of the present invention;
FIG. 8 is another flow chart of a method for entity identification in an embodiment of the present invention;
FIG. 9 is a diagram of an entity recognition device in accordance with an embodiment of the present invention;
FIG. 10 is a schematic diagram of a computer device according to an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
The entity identification method provided by the embodiment of the invention can be applied to the application environment shown in fig. 1. Specifically, the entity identification method is applied to an entity identification system, which comprises a client and a server as shown in fig. 1; the client and the server communicate through a network and are used for improving the capability of recognizing new entity data. The client, also called the user side, refers to a program that corresponds to the server and provides local services for the user. The client may be installed on, but is not limited to, various personal computers, laptops, smartphones, tablets, and portable wearable devices. The server may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
In an embodiment, as shown in fig. 2, an entity identification method is provided, which is described by taking the method applied to the server in fig. 1 as an example, and includes the following steps:
s201: an initial recognition sentence is obtained.
S202: and adopting a target encoder in the entity recognition model to perform entity extraction processing on the initial recognition sentence to obtain at least one character string to be recognized.
S203: and adopting a target decoder in the entity recognition model to construct at least one character string to be recognized, and acquiring at least one sentence to be recognized.
S204: and calculating the probability distribution corresponding to each sentence to be recognized, and acquiring the target recognition sentence and the target entity type corresponding to the target recognition sentence.
The initial recognition sentence is a sentence input by the user and includes entity data and non-entity data. Entity data refers to entities composed of characters in the initial recognition sentence; non-entity data refers to the remaining characters that do not form entities. The entity data can belong to different entity types. Optionally, the entity types include, but are not limited to, person name, company name, trade name, and place name. Illustratively, the initial recognition sentence is "world cup held at ABC in 2022", where "world cup" and "ABC" are entity data (for example, ABC may be a place name or a company name) and the remaining characters, such as "held at" and "in 2022", are non-entity data. It should be noted that this example is only for illustration and is not intended to specifically limit the entity data or the non-entity data.
As an example, in step S201, the server may receive an entity identification request sent by the client, perform parsing processing on the entity identification request, and obtain an initial identification sentence. The entity identification request refers to a request for the server to perform entity identification. In this example, the server obtains an initial recognition sentence to identify an entity type corresponding to the entity data in the initial recognition sentence in a subsequent step.
The entity recognition model refers to a preset model for entity recognition of the initial recognition sentence. It should be noted that the entity identification model can identify a plurality of entity types. Illustratively, the entity recognition model may be an entity recognition model that recognizes a person name, a company name, a trade name, and a place name. The character string to be recognized is a character string obtained by performing entity extraction processing on the initial recognition sentence.
As an example, in step S202, the entity recognition model may be a sequence-to-sequence model (Sequence to Sequence, Seq2Seq for short). Specifically, the entity recognition model may adopt an encoder-decoder structure.
Illustratively, in step S202, the server performs entity extraction processing on the initial recognition sentence with the target encoder in the entity recognition model based on a preset word segmentation tool, and obtains at least one character string to be recognized. Optionally, the preset word segmentation tool may be the jieba word segmentation tool or another existing tool capable of performing entity extraction on the initial recognition sentence. In this example, the at least one character string to be recognized is obtained so that its entity type can be determined in the subsequent steps.
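As a rough illustration of this extraction step, the sketch below uses the jieba segmenter mentioned above to produce candidate character strings; the function name and the length filter are assumptions made for illustration, not the patent's actual encoder implementation.

```python
# Illustrative sketch only: candidate extraction with the jieba segmenter
# mentioned above. The function name and the length filter are assumptions,
# not the patent's actual encoder implementation.
import jieba


def extract_candidate_strings(sentence: str, min_len: int = 2) -> list[str]:
    """Segment the initial recognition sentence and keep word-like spans as
    character strings to be recognized."""
    tokens = jieba.lcut(sentence)  # ordinary word segmentation
    return [tok for tok in tokens if len(tok) >= min_len and tok.strip()]


if __name__ == "__main__":
    print(extract_candidate_strings("2022年世界杯在ABC举行"))
```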
The sentence to be recognized is a sentence constructed by processing the character string to be recognized by using a target decoder in the entity recognition model, and the sentence is a sentence of which the entity type needs to be recognized.
As an example, in step S203, the server uses the target decoder in the entity recognition model to perform construction processing on the at least one character string to be recognized, that is, the target decoder constructs each character string to be recognized into a corresponding sentence to be recognized. In a specific embodiment, the target decoder includes a preset construction template, which is used to construct at least one sentence to be recognized from the at least one character string to be recognized. Illustratively, each sentence to be recognized includes a character string to be recognized and a preset entity tag. The server uses the target decoder to combine each character string to be recognized with a preset entity tag into a sentence to be recognized according to the preset construction template, so that the entity type corresponding to the character string can be recognized in the subsequent steps. It should be noted that constructing the sentences to be recognized in this way allows a character string to be recognized to be combined with a newly added preset entity tag, thereby extending the model to new entity types and improving the capability of recognizing new entity data.
The target entity type refers to an entity type corresponding to the target recognition sentence.
As an example, in step S204, the server calculates the probability distribution corresponding to each sentence to be recognized, that is, the probability that each sentence to be recognized matches the different entity tags, and obtains the target recognition sentence and the target entity type corresponding to it. Illustratively, the probability distribution describes how well the character strings to be recognized in each sentence match the different preset entity tags. In this example, after calculating the probability distribution corresponding to each sentence to be recognized, the server determines the sentence to be recognized with the maximum probability as the target recognition sentence, and obtains the target recognition sentence and its corresponding target entity type according to the preset entity tag contained in the target recognition sentence.
In this embodiment, the server first obtains the initial recognition sentence and uses the target encoder in the entity recognition model to perform entity extraction processing on it, obtaining at least one character string to be recognized, so that different sentences to be recognized can be constructed from these character strings; this extends the entity types and helps improve the accuracy of entity recognition. The target decoder in the entity recognition model then constructs at least one sentence to be recognized from the at least one character string to be recognized, so that new entity types can be added and the capability of recognizing new entity data is improved. Finally, the probability distribution corresponding to each sentence to be recognized is calculated, and the target recognition sentence and the target entity type corresponding to it are obtained, which further improves the capability of recognizing new entity data.
In an embodiment, as shown in fig. 3, in step S202, that is, using a target encoder in an entity recognition model, an entity extraction process is performed on an initial recognition sentence, and at least one character string to be recognized is obtained, including:
s301: and acquiring at least two continuous character strings from the initial recognition sentence based on a preset length by adopting a target encoder in the entity recognition model.
S302: from at least two consecutive character strings, at least one character string to be recognized is determined.
The target encoder refers to an encoder in the entity recognition model. The preset length refers to the character length set by the user. The continuous character string refers to a character string formed by continuous characters acquired in an initial recognition sentence.
As an example, in step S301, the server inputs the initial recognition sentence into the target encoder in the entity recognition model, and the target encoder obtains at least two continuous character strings from the initial recognition sentence based on a preset length. In this example, the initial recognition sentence may contain a large amount of non-entity data, that is, many of its characters may form non-entities. To avoid interference from the non-entity data, at least two continuous character strings are obtained from the initial recognition sentence based on the preset length, so that each continuous character string has the preset length and is more likely to be entity data, which improves the accuracy of subsequent entity recognition. The preset length can be set according to practical experience. Preferably, the preset length may be the length of the longest continuous character string obtained from the initial recognition sentence. Illustratively, the preset length is 10 characters.
As an example, in step S302, the server determines at least one character string to be recognized from the at least two continuous character strings. For example, the server may determine all of the continuous character strings as character strings to be recognized, or only some of them.
In this embodiment, the server uses a target encoder in the entity recognition model, obtains at least two continuous character strings from the initial recognition sentence based on a preset length, and determines at least one character string to be recognized from the at least two continuous character strings, so as to improve the probability that the at least two continuous character strings obtained from the initial recognition sentence are entity data, thereby improving the accuracy of subsequent entity recognition.
In one embodiment, as shown in fig. 4, in step S301, using a target encoder in the entity recognition model, obtaining at least two consecutive character strings from the initial recognition sentence based on a preset length includes:
s401: and analyzing the initial recognition sentence to obtain the target character.
S402: and acquiring Chinese characters with preset lengths from all target characters.
S403: at least two consecutive character strings are obtained from the chinese character.
The target character refers to a character obtained after the initial recognition sentence is analyzed.
As an example, in step S401, the server parses the initial recognition sentence, and acquires the target character. Optionally, the target characters include Chinese characters, punctuation characters, or other special characters. For example, the server may use a preset word segmentation tool, such as a jieba word segmentation tool, to parse the initial recognition sentence to obtain the target characters.
As an example, in step S402, the server obtains Chinese characters of the preset length from all the target characters, so that continuous character strings can be obtained from these Chinese characters in the subsequent step.
As an example, in step S403, after obtaining the Chinese characters of the preset length from all the target characters, the server obtains at least two continuous character strings from these Chinese characters. In this example, obtaining the continuous character strings only from the Chinese characters avoids determining punctuation characters or other special characters as continuous character strings, which improves the accuracy of subsequent entity detection.
In this embodiment, the server parses the initial recognition sentence to obtain the target characters, obtains Chinese characters of the preset length from all the target characters, that is, discards punctuation characters and other special characters so that only runs of continuous Chinese characters remain, and obtains at least two continuous character strings from these Chinese characters. This avoids determining punctuation characters or other special characters as continuous character strings and thereby improves the accuracy of subsequent entity detection.
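A minimal sketch of this three-step procedure is shown below; it assumes a regex-based character filter and a sliding window over the remaining Chinese characters, which are illustrative choices rather than the patent's prescribed implementation.

```python
# Minimal sketch of steps S401-S403, assuming a regex split on non-Chinese
# characters and a sliding window; the real target encoder may derive the
# continuous character strings differently.
import re


def continuous_chinese_spans(sentence: str, preset_len: int = 10) -> list[str]:
    # S401/S402: keep only runs of Chinese characters, so punctuation and
    # other special characters never appear inside a continuous string
    runs = re.findall(r"[\u4e00-\u9fff]+", sentence)
    spans = []
    # S403: enumerate continuous substrings up to the preset length
    for run in runs:
        for start in range(len(run)):
            stop = min(start + preset_len, len(run))
            for end in range(start + 1, stop + 1):
                spans.append(run[start:end])
    return spans
```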
In an embodiment, as shown in fig. 5, in step S203, a target decoder in the entity recognition model is used to perform a construction process on at least one character string to be recognized, and obtain at least one sentence to be recognized, including:
s501: and acquiring all entity labels corresponding to the character strings to be identified in the entity identification model.
S502: and according to a preset construction template, carrying out construction processing on the character string to be recognized and each entity tag to obtain at least one sentence to be recognized.
The entity label is a label corresponding to the entity type of the character string to be recognized.
As an example, in step S501, the entity identification model includes entity tags corresponding to different entity types, such as, but not limited to, entity tags corresponding to a person name, a company name, a trade name, and a place name respectively. It should be noted that the entity tag may be a historical entity tag in the entity identification model, or may be an entity tag newly added to the entity identification model. In this example, the server obtains all entity tags corresponding to the strings to be recognized in the entity recognition model, so as to construct at least one sentence to be recognized according to all entity tags corresponding to the strings to be recognized in the entity recognition model, thereby expanding a new entity type to improve the entity recognition capability of new entity data.
The preset construction template is a template used for constructing the character string to be recognized together with each entity tag to obtain at least one sentence to be recognized. The preset construction template includes preset fixed characters, which can be set according to practical experience.
As an example, in step S502, the server performs construction processing on the character string to be recognized and each entity tag according to the preset construction template, and obtains at least one sentence to be recognized. Specifically, the preset construction template may be "character string to be recognized + preset fixed characters + entity tag", where the preset fixed characters may be, for example, "is a" or other connecting characters. Suppose the initial recognition sentence is "world cup held at ABC in 2022", the character string to be recognized is "ABC", and the entity tags in the entity recognition model include those corresponding to company name, trade name, and place name. The server constructs the character string to be recognized "ABC" (or "world cup") together with the entity tags corresponding to company name, trade name, and place name according to the preset construction template, and the obtained sentences to be recognized include: "ABC is a company name", "ABC is a trade name", and "ABC is a place name", so that the probability distribution corresponding to each sentence to be recognized can be calculated in the subsequent steps to obtain the target recognition sentence and the target entity type corresponding to it.
It should be noted that the entity tags may further include a tag corresponding to non-entity data, constructed as "character string to be recognized + non-entity tag"; illustratively, the obtained sentence to be recognized is "ABC is not an entity".
In this embodiment, the server obtains all entity tags in the entity recognition model, and performs construction processing on the character string to be recognized and each entity tag according to a preset construction template to obtain at least one sentence to be recognized, so as to calculate probability distribution corresponding to each sentence to be recognized in subsequent steps, and obtain a target recognition sentence and a target entity type corresponding to the target recognition sentence.
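The construction template can be read as simple string concatenation; the sketch below follows the "ABC is a company name" example above, with the tag list and exact wording taken as assumptions rather than the patent's fixed choices.

```python
# Sketch of the construction step (S501-S502). The template wording
# "candidate + ' is a ' + tag" is inferred from the "ABC is a company name"
# example above; the tag list and non-entity wording are assumptions.
ENTITY_TAGS = ["person name", "company name", "trade name", "place name"]


def build_sentences_to_recognize(candidate: str, tags=ENTITY_TAGS) -> list[str]:
    sentences = [f"{candidate} is a {tag}" for tag in tags]
    sentences.append(f"{candidate} is not an entity")  # non-entity tag
    return sentences


# build_sentences_to_recognize("ABC")
# -> ["ABC is a person name", "ABC is a company name", "ABC is a trade name",
#     "ABC is a place name", "ABC is not an entity"]
```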
In one embodiment, as shown in fig. 6, in step S204, calculating a probability distribution corresponding to each sentence to be recognized, and obtaining a target recognition sentence and a target entity type corresponding to the target recognition sentence, includes:
s601: and converting the character sequence of each character string to be recognized into a mark sequence.
S602: and calculating the probability distribution corresponding to each sentence to be recognized based on the mark sequence to obtain a probability distribution result.
S603: and acquiring the target recognition sentence and a target entity type corresponding to the target recognition sentence based on the probability distribution result.
Wherein the marker sequence is a token sequence.
As an example, in step S601, the server performs lexical analysis on each character string to be recognized, obtains a character sequence of each character string to be recognized, and converts the character sequence of each character string to be recognized into a token sequence. Illustratively, the character sequence of each string to be recognized is converted into a token sequence using a lexical analysis tool. It should be noted that the lexical analysis tool may be an existing lexical analysis tool.
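As a stand-in for that lexical analysis tool, a character-level vocabulary lookup such as the following could turn a character sequence into a token sequence; the vocabulary and the unknown-token handling here are assumptions made only for illustration.

```python
# Minimal character-level tokenization sketch for step S601. The patent only
# says an existing lexical analysis tool may be used, so this vocabulary
# lookup is an assumed stand-in for that tool.
def to_token_sequence(text: str, vocab: dict[str, int], unk_id: int = 0) -> list[int]:
    return [vocab.get(ch, unk_id) for ch in text]


vocab = {ch: i + 1 for i, ch in enumerate("ABC公司名地点是一个")}
print(to_token_sequence("ABC是一个公司名", vocab))  # unknown characters map to 0
```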
The probability distribution result is a result obtained by calculating the probability distribution corresponding to each sentence to be recognized.
As an example, in step S602, the server calculates a probability distribution of each sentence to be recognized based on the tag sequence by using a log-loss function, and obtains a probability distribution result. The logarithmic loss function is a calculation function for calculating the probability distribution of each sentence to be recognized.
As an example, in step S603, the server acquires the target recognition sentence and the target entity type corresponding to the target recognition sentence based on the probability distribution result. Specifically, the server determines the sentence to be recognized corresponding to the maximum probability value obtained by calculating the logarithmic loss function in the probability distribution result as a target recognition sentence based on the probability distribution result, and obtains a target entity type corresponding to the target recognition sentence, thereby realizing the recognition of the entity type.
In this embodiment, the server converts the character sequence of each character string to be recognized into a token sequence, calculates the probability distribution corresponding to each sentence to be recognized based on the token sequence to obtain a probability distribution result, and finally obtains the target recognition sentence and the target entity type corresponding to it based on the probability distribution result. Deriving the target recognition sentence and the target entity type from the calculated probability distribution result improves the accuracy of the target entity type.
In one embodiment, as shown in fig. 7, in step S602, calculating a probability distribution corresponding to each sentence to be recognized based on the tag sequence, and obtaining a result of the probability distribution, includes:
s701: and acquiring the character probability corresponding to each character in each character string to be recognized based on the mark sequence.
S702: and calculating the probability distribution corresponding to each sentence to be recognized based on the character probability and each sentence to be recognized, and acquiring a probability distribution result.
The character probability refers to the probability corresponding to each character in each character string to be recognized.
As an example, in step S701, the server obtains the character probability corresponding to each character in each character string to be recognized based on the mark sequence. Illustratively, the server uses a preset character probability calculation formula, that is, the formula for calculating the character probability corresponding to each character in each character string to be recognized. Illustratively, the preset character probability calculation formula is log P(t_c | t_0, ..., t_{c-1}, x), where t_c is the c-th character of the character string to be recognized, c ranges over the characters of the character string to be recognized, and x denotes the characters of the initial recognition sentence.
As an example, in step S702, the server calculates the probability distribution corresponding to each sentence to be recognized based on the character probabilities and each sentence to be recognized, and obtains the probability distribution result. In this example, the server adopts a preset probability distribution calculation formula, that is, the formula for calculating the probability corresponding to each sentence to be recognized. Illustratively, the preset probability distribution calculation formula sums the character probabilities over the sentence, i.e. the sum over c = 1, ..., m of log P(t_c | t_0, ..., t_{c-1}, x), where m is the number of characters in the sentence to be recognized.
In this embodiment, the server obtains a character probability corresponding to each character in each character string to be recognized based on the tag sequence, calculates a probability distribution corresponding to each sentence to be recognized based on the character probability and each sentence to be recognized, obtains a probability distribution result, ensures accuracy of the probability distribution result, and further improves accuracy of the target entity type.
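Tying the character probabilities to the final selection, the sketch below sums the per-character log probabilities of each constructed sentence and picks the maximum; the probability values and the interface are assumptions made for illustration, not the patent's actual API.

```python
# Sketch of steps S701-S702 and the selection in S603: score each constructed
# sentence by summing its per-character log probabilities and keep the
# highest-scoring one. `char_log_probs` stands in for the decoder output
# log P(t_c | t_0, ..., t_{c-1}, x); the interface is assumed, not the
# patent's actual API.
import math
from typing import Sequence


def score_sentence(char_log_probs: Sequence[float]) -> float:
    # probability distribution value of one sentence: sum over its m characters
    return sum(char_log_probs)


def pick_target(candidates: dict) -> tuple:
    scored = {sent: score_sentence(lps) for sent, lps in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# toy example with made-up log probabilities
candidates = {
    "ABC is a place name": [math.log(p) for p in (0.9, 0.8, 0.85)],
    "ABC is a company name": [math.log(p) for p in (0.6, 0.5, 0.55)],
    "ABC is not an entity": [math.log(p) for p in (0.2, 0.3, 0.25)],
}
print(pick_target(candidates))  # -> ('ABC is a place name', ...)
```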
In one embodiment, as shown in fig. 8, before step S201, that is, before the initial recognition sentence is acquired, the entity recognition method includes:
s801: and acquiring a sentence to be trained, wherein the sentence to be trained carries an entity label.
S802: and inputting the sentence to be trained into an original encoder in the entity recognition model for training, and acquiring a training character string output by the original encoder.
S803: and inputting the training character string into an original decoder in the entity recognition model for training, and acquiring an entity prediction label output by the original decoder.
S804: and judging whether the model convergence condition is met or not according to the entity labeling label and the entity prediction label.
S805: and if the model convergence condition is met, determining the original encoder as a target encoder and determining the original decoder as a target decoder.
Wherein, the sentence to be trained is a sentence for training the entity recognition model. Illustratively, the sentence to be trained is a sentence input by the user. The entity labeling label is a label corresponding to the sentence to be trained.
As an example, in step S801, the server obtains a sentence to be trained and an entity tag corresponding to the sentence to be trained, so as to provide a training basis for subsequent model training.
The original encoder refers to an encoder in the entity recognition model. The training character string is a character string obtained by performing entity extraction processing on a sentence to be trained.
As an example, in step S802, the server inputs the sentence to be trained into the original encoder in the entity recognition model for training, and obtains the training character string output by the original encoder. In this example, the server may input the sentence to be trained into the original encoder and perform entity extraction on it based on a word segmentation tool preset in the original encoder to obtain the training character string. Optionally, the preset word segmentation tool may be the jieba word segmentation tool or another existing tool capable of performing entity extraction on the sentence to be trained.
The entity prediction tag is a tag for predicting an entity type.
As an example, in step S803, the server inputs the training character string into the original decoder in the entity recognition model for training, and obtains the entity prediction label output by the original decoder. Illustratively, the original decoder includes a preset construction template, which is used to construct at least one sentence to be recognized from the training character string and each entity tag. The server inputs the training character string into the original decoder; the original decoder constructs the training character string together with each entity tag into at least one sentence to be recognized according to the preset construction template, calculates the probability distribution corresponding to each sentence to be recognized, obtains a target recognition sentence and the target entity type corresponding to it, and outputs the entity prediction label corresponding to that target entity type.
The model convergence condition is a condition for determining whether the entity recognition model converges.
As an example, in step S804, the server determines whether the model convergence condition is satisfied according to the entity labeling label and the entity prediction label. Illustratively, the server calculates a target loss value between the entity labeling label and the entity prediction label according to a preset loss value calculation strategy and compares the target loss value with a loss threshold: if the target loss value is less than the loss threshold, the model convergence condition is satisfied; if the target loss value is greater than or equal to the loss threshold, the model convergence condition is not satisfied. The preset loss value calculation strategy is a strategy for calculating the loss value between the entity labeling label and the entity prediction label; it may be any existing loss value calculation strategy and is not limited herein. The loss threshold is a threshold set by the user for judging whether the model convergence condition is satisfied.
As an example, in step S805, if the model convergence condition is satisfied, the original encoder is determined as the target encoder and the original decoder is determined as the target decoder, so that entity recognition is performed with the target encoder and the target decoder in the entity recognition model, which improves the capability of recognizing new entity data.
In this embodiment, a sentence to be trained and the entity labeling label corresponding to it are obtained; the original encoder in the entity recognition model performs entity extraction on the sentence to be trained to obtain a training character string; the original decoder in the entity recognition model processes the training character string according to the preset construction template to obtain an entity prediction label; whether the model convergence condition is satisfied is judged according to the entity labeling label and the entity prediction label; and finally, when the model convergence condition is satisfied, the original encoder is determined as the target encoder and the original decoder is determined as the target decoder. In this way, a decoder capable of recognizing training character strings of different entity types is trained, new entity types are extended, and the capability of recognizing new entity data is improved.
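A minimal sketch of the convergence test in steps S804-S805 is given below, with the loss computation abstracted away; the concrete loss strategy and threshold value are placeholders, since the patent does not specify them.

```python
# Sketch of the convergence test in steps S804-S805, assuming a generic scalar
# loss between the entity prediction labels and the entity labeling labels;
# the loss function and threshold value are placeholders, not taken from the
# patent.
def has_converged(target_loss: float, loss_threshold: float = 0.01) -> bool:
    # model convergence condition: the target loss falls below the threshold
    return target_loss < loss_threshold


# during training: once has_converged(loss_value) is True, the original
# encoder/decoder are kept as the target encoder and target decoder.
```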
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by functions and internal logic of the process, and should not limit the implementation process of the embodiments of the present invention in any way.
In an embodiment, an entity identification apparatus is provided, and the entity identification apparatus corresponds to the entity identification method in the embodiment one to one. As shown in fig. 9, the entity recognition apparatus includes a sentence acquisition module 901, an entity extraction module 902, a sentence construction module 903, and a type acquisition module 904. The functional modules are explained in detail as follows:
a sentence obtaining module 901, configured to obtain an initial recognition sentence;
an entity extraction module 902, configured to perform entity extraction processing on the initial recognition sentence by using a target encoder in the entity recognition model, to obtain at least one character string to be recognized;
a sentence construction module 903, configured to adopt a target decoder in the entity recognition model to construct at least one character string to be recognized, and obtain at least one sentence to be recognized;
the type obtaining module 904 is configured to calculate a probability distribution corresponding to each sentence to be recognized, and obtain a target recognition sentence and a target entity type corresponding to the target recognition sentence.
Further, the entity extraction module 902 includes:
the continuous character sub-module is used for acquiring at least two continuous character strings from the initial recognition sentence based on a preset length by adopting a target encoder in the entity recognition model;
and the character string determining sub-module is used for determining at least one character string to be recognized from at least two continuous character strings.
Further, the consecutive characters sub-module includes:
the character acquisition unit is used for analyzing the initial recognition sentence to acquire a target character;
the Chinese character acquisition unit is used for acquiring Chinese characters with preset lengths from all target characters;
a character string obtaining unit for obtaining at least two continuous character strings from the Chinese characters.
Further, the sentence construction module 903 comprises:
the entity tag submodule is used for acquiring all entity tags corresponding to the character strings to be identified in the entity identification model;
and the sentence recognition submodule is used for carrying out construction processing on the character string to be recognized and each entity tag according to a preset construction template to obtain at least one sentence to be recognized.
Further, the type obtaining module 904 includes:
the sequence conversion sub-module is used for converting the character sequence of each character string to be recognized into a mark sequence;
the probability distribution submodule is used for calculating the probability distribution corresponding to each sentence to be recognized based on the mark sequence and acquiring a probability distribution result;
and the entity type submodule is used for acquiring the target recognition sentence and the target entity type corresponding to the target recognition sentence based on the probability distribution result.
Further, the probability distribution submodule includes:
the character probability acquiring unit is used for acquiring the character probability corresponding to each character in each character string to be recognized based on the marking sequence;
and the distribution result acquisition unit is used for calculating the probability distribution corresponding to each sentence to be recognized based on the character probability and each sentence to be recognized and acquiring the probability distribution result.
Further, the entity recognition device further comprises:
the entity labeling module is used for acquiring sentences to be trained, and the sentences to be trained carry entity labeling labels;
the target character string module is used for inputting the sentence to be trained into an original encoder in the entity recognition model for training, and acquiring a training character string output by the original encoder;
the recognition result acquisition module is used for inputting the training character string into an original decoder in the entity recognition model for training and acquiring an entity prediction label output by the original decoder;
the convergence condition module is used for judging whether the model convergence condition is met or not according to the entity labeling label and the entity prediction label;
and the target determining module is used for determining the original encoder as the target encoder and determining the original decoder as the target decoder when the model convergence condition is met.
For the specific definition of the entity identification device, reference may be made to the above definition of the entity identification method, which is not described herein again. The modules in the entity identification device can be wholly or partially implemented by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and the internal structure thereof may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operating system and the computer program to run on the non-volatile storage medium. The database of the computer device is used for storing data in the entity identification process. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an entity identification method.
In an embodiment, a computer device is provided, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, and when the processor executes the computer program, the entity identification method in the foregoing embodiments is implemented, and details are not repeated herein to avoid repetition. Alternatively, the processor implements the functions of each module/unit in the embodiment of the entity identification apparatus when executing the computer program, and is not described herein again to avoid repetition.
In an embodiment, a computer-readable storage medium is provided, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the entity identification method in the foregoing embodiment is implemented, and details are not repeated herein to avoid repetition. Alternatively, the computer program, when executed by the processor, implements the functions of each module/unit in the embodiment of the entity identifying apparatus, and is not described herein again to avoid repetition.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
The above-mentioned embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.
Claims (10)
1. An entity identification method, comprising:
acquiring an initial recognition sentence;
adopting a target encoder in an entity recognition model to perform entity extraction processing on the initial recognition sentence to obtain at least one character string to be recognized;
adopting a target decoder in the entity recognition model to construct at least one character string to be recognized, and acquiring at least one sentence to be recognized;
and calculating the probability distribution corresponding to each sentence to be recognized, and acquiring a target recognition sentence and a target entity type corresponding to the target recognition sentence.
2. The entity recognition method of claim 1, wherein said performing entity extraction on the initial recognition sentence by using a target encoder in the entity recognition model to obtain at least one character string to be recognized comprises:
acquiring at least two continuous character strings from the initial recognition sentence based on a preset length by adopting a target encoder in the entity recognition model;
and determining at least one character string to be recognized from at least two continuous character strings.
3. The entity recognition method of claim 2, wherein said obtaining at least two consecutive strings from the initial recognition sentence based on a preset length using a target encoder in the entity recognition model comprises:
analyzing the initial recognition sentence to obtain a target character;
acquiring Chinese characters with preset lengths from all the target characters;
at least two consecutive strings are obtained from the chinese character.
4. The entity recognition method of claim 1, wherein the constructing at least one character string to be recognized by using a target decoder in the entity recognition model to obtain at least one sentence to be recognized comprises:
acquiring all entity labels corresponding to the character strings to be identified in the entity identification model;
and according to a preset construction template, constructing the character string to be recognized and each entity tag to obtain at least one sentence to be recognized.
5. The entity recognition method according to claim 1, wherein said calculating a probability distribution corresponding to each of the sentences to be recognized to obtain a target recognition sentence and a target entity type corresponding to the target recognition sentence comprises:
converting the character sequence of each character string to be recognized into a mark sequence;
calculating the probability distribution corresponding to each sentence to be recognized based on the mark sequence to obtain a probability distribution result;
and acquiring a target recognition sentence and a target entity type corresponding to the target recognition sentence based on the probability distribution result.
6. The entity identification method according to claim 5, wherein said calculating a probability distribution corresponding to each sentence to be identified based on the tag sequence to obtain a probability distribution result comprises:
acquiring character probability corresponding to each character in each character string to be recognized based on the mark sequence;
and calculating the probability distribution corresponding to each sentence to be recognized based on the character probability and each sentence to be recognized, and acquiring a probability distribution result.
7. The entity recognition method of claim 1, wherein prior to said obtaining the initial recognition sentence, the entity recognition method comprises:
acquiring a sentence to be trained, wherein the sentence to be trained carries an entity label;
inputting the sentence to be trained into an original encoder in an entity recognition model for training, and acquiring a training character string output by the original encoder;
inputting the training character string into an original decoder in an entity recognition model for training, and acquiring an entity prediction label output by the original decoder;
judging whether a model convergence condition is met or not according to the entity labeling label and the entity prediction label;
and if the model convergence condition is met, determining the original encoder as a target encoder and determining the original decoder as a target decoder.
8. An entity identification apparatus, comprising:
a sentence acquisition module for acquiring an initial recognition sentence;
the entity extraction module is used for adopting a target encoder in an entity recognition model to perform entity extraction processing on the initial recognition sentence to obtain at least one character string to be recognized;
a sentence construction module, configured to adopt a target decoder in the entity recognition model to construct at least one character string to be recognized, so as to obtain at least one sentence to be recognized;
and the type acquisition module is used for calculating the probability distribution corresponding to each sentence to be recognized and acquiring a target recognition sentence and a target entity type corresponding to the target recognition sentence.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the entity identification method as claimed in any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the entity identification method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
--- | --- | --- | ---
CN202211032451.4A (CN115422936A) | 2022-08-26 | 2022-08-26 | Entity identification method, entity identification device, computer equipment and storage medium
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title
--- | --- | --- | ---
CN202211032451.4A (CN115422936A) | 2022-08-26 | 2022-08-26 | Entity identification method, entity identification device, computer equipment and storage medium
Publications (1)
Publication Number | Publication Date
--- | ---
CN115422936A | 2022-12-02
Family
ID=84199491
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
--- | --- | --- | ---
CN202211032451.4A (CN115422936A, pending) | | 2022-08-26 | 2022-08-26
Country Status (1)
Country | Link
--- | ---
CN (1) | CN115422936A (en)
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title
--- | --- | --- | --- | ---
CN115841115A | 2023-02-24 | 2023-03-24 | 山东云天安全技术有限公司 | Data supplementing method, storage medium and electronic equipment
Similar Documents
Publication | Title
--- | ---
CN111581229B | SQL statement generation method and device, computer equipment and storage medium
CN110795919B | Form extraction method, device, equipment and medium in PDF document
JP5901001B1 | Method and device for acoustic language model training
CN111444723B | Information extraction method, computer device, and storage medium
CN110909137A | Information pushing method and device based on man-machine interaction and computer equipment
CN108427707B | Man-machine question and answer method, device, computer equipment and storage medium
CN110688853B | Sequence labeling method and device, computer equipment and storage medium
CN113192516B | Voice character segmentation method, device, computer equipment and storage medium
CN114139551A | Method and device for training intention recognition model and method and device for recognizing intention
CN111583911B | Speech recognition method, device, terminal and medium based on label smoothing
CN112766319B | Dialogue intention recognition model training method, device, computer equipment and medium
CN112380837B | Similar sentence matching method, device, equipment and medium based on translation model
CN110175273B | Text processing method and device, computer readable storage medium and computer equipment
CN111695338A | Interview content refining method, device, equipment and medium based on artificial intelligence
CN111079410B | Text recognition method, device, electronic equipment and storage medium
CN113449081A | Text feature extraction method and device, computer equipment and storage medium
CN112446218A | Long and short sentence text semantic matching method and device, computer equipment and storage medium
CN114026556A | Semantic element prediction method, computer device and storage medium background
CN111400340A | Natural language processing method and device, computer equipment and storage medium
CN110633475A | Natural language understanding method, device and system based on computer scene and storage medium
CN112732884A | Target answer sentence generation method and device, computer equipment and storage medium
CN115422936A | Entity identification method, entity identification device, computer equipment and storage medium
CN110705211A | Text key content marking method and device, computer equipment and storage medium
WO2021217619A1 | Label smoothing-based speech recognition method, terminal, and medium
CN112395880B | Error correction method and device for structured triples, computer equipment and storage medium
Legal Events
Code | Title
--- | ---
PB01 | Publication
SE01 | Entry into force of request for substantive examination