CN113239697A - Entity recognition model training method and device, computer equipment and storage medium


Info

Publication number
CN113239697A
CN113239697A
Authority
CN
China
Prior art keywords
training
entity
synonymous
standard
similarity
Legal status
Granted
Application number
CN202110611212.3A
Other languages
Chinese (zh)
Other versions
CN113239697B (en)
Inventor
于凤英
王健宗
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202110611212.3A
Publication of CN113239697A
Application granted
Publication of CN113239697B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/279: Recognition of textual entities
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295: Named entity recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/237: Lexical tools
    • G06F 40/247: Thesauruses; Synonyms

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an entity recognition model training method and device, computer equipment, and a storage medium. The entity recognition model training method calculates the vector similarity between a training standard entity and each training synonymous entity with a word frequency algorithm, obtaining the sparse similarity of each training synonymous entity to the training standard entity; calculates the vector similarity between the training standard entity and each training synonymous entity with a semantic recognition model, obtaining the dense similarity corresponding to each training synonymous entity; screens the training synonymous entities according to the sparse similarity and the dense similarity to obtain target synonymous entities; processes the target synonymous entities with a batch gradient descent method to obtain a plurality of batch training sets; and sequentially uses the batch training sets to batch-train the BioBERT model, optimizing the loss function in the BioBERT model to obtain the entity recognition model, thereby improving the performance of the entity recognition model.

Description

Entity recognition model training method and device, computer equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a device for training an entity recognition model, computer equipment and a storage medium.
Background
As knowledge in various fields continues to develop, entity words acquire more and more synonyms, abbreviations, and the like, which makes reading and understanding difficult for users. Most existing technical schemes use models for recognition, but the accuracy of these existing models is low.
Disclosure of Invention
The embodiment of the invention provides an entity recognition model training method, an entity recognition model training device, computer equipment and a storage medium, and aims to solve the problem of low accuracy of the existing model.
An entity recognition model training method comprises the following steps:
obtaining training samples, wherein the training samples comprise training standard entities and a plurality of training synonymous entities corresponding to each training standard entity;
performing vector similarity calculation on the training standard entities and each training synonymous entity by adopting a word frequency algorithm to obtain the sparse similarity of each training synonymous entity and the training standard entities;
adopting a semantic recognition model to calculate the vector similarity of the training standard entity and each training synonymous entity to obtain the dense similarity corresponding to each training synonymous entity;
screening a target synonymous entity from the training synonymous entity according to the sparse similarity and the dense similarity;
processing the target synonymous entity by adopting a batch gradient descent method to obtain a plurality of batch training sets;
and sequentially using the batch training sets to perform batch training on the BioBERT model, optimizing a loss function in the BioBERT model, and obtaining an entity recognition model.
An entity recognition model training apparatus, comprising:
the training sample acquisition module is used for acquiring training samples, and each training sample comprises a training standard entity and a plurality of training synonymous entities corresponding to each training standard entity;
the sparse similarity obtaining module is used for calculating the vector similarity of the training standard entities and each training synonymous entity by adopting a word frequency algorithm to obtain the sparse similarity of each training synonymous entity and the training standard entities;
the dense similarity obtaining module is used for calculating the vector similarity of the training standard entities and each training synonymous entity by adopting a semantic recognition model to obtain the dense similarity corresponding to each training synonymous entity;
the target synonymous entity acquisition module is used for screening the training synonymous entities according to the sparse similarity and the dense similarity to obtain target synonymous entities;
the batch training set acquisition module is used for processing the target synonymous entity by adopting a batch gradient descent method to acquire a plurality of batch training sets;
and the entity recognition model obtaining module is used for sequentially using the batch training sets to perform batch training on the BioBERT model, optimizing the loss function in the BioBERT model, and obtaining the entity recognition model.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the entity recognition model training method when executing the computer program.
A computer-readable storage medium in which a computer program is stored which, when executed by a processor, implements the steps of the above entity recognition model training method.
According to the entity recognition model training method and device, the computer equipment, and the storage medium, the vector similarity between the training standard entity and each training synonymous entity is calculated with a word frequency algorithm, obtaining the sparse similarity of each training synonymous entity to the training standard entity; because the sparse similarity is obtained through the word frequency algorithm, training synonymous entities with higher sparse similarity can be used to train the model, ensuring higher accuracy of the trained model. The vector similarity between the training standard entity and each training synonymous entity is also calculated with a semantic recognition model, obtaining the dense similarity corresponding to each training synonymous entity; the semantic recognition model obtains the dense similarity quickly, so that training synonymous entities with higher semantic similarity to the training standard entity can be found subsequently, providing technical support for subsequent model training and ensuring higher accuracy of the trained model. Target synonymous entities are screened from the training synonymous entities according to the sparse similarity and the dense similarity, so that both the semantic information and the morphological information of the training synonymous entities are fully considered; this ensures that the entity recognition model trained on the target synonymous entities has better performance and higher accuracy, and greatly shortens model training time. The target synonymous entities are processed with a batch gradient descent method to obtain a plurality of batch training sets, reducing computational overhead and randomness. The batch training sets are used in turn to batch-train the BioBERT model, optimizing the loss function in the BioBERT model to obtain the entity recognition model; this reduces the amount of computation and the randomness, and because the target synonymous entities carry rich semantic and morphological information, the trained entity recognition model is guaranteed to have high accuracy and good performance.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
FIG. 1 is a diagram illustrating an application environment of a method for training an entity recognition model according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for training entity recognition models according to an embodiment of the present invention;
FIG. 3 is another flow chart of a method for training an entity recognition model according to an embodiment of the invention;
FIG. 4 is another flow chart of a method for training an entity recognition model according to an embodiment of the invention;
FIG. 5 is another flow chart of a method for training an entity recognition model according to an embodiment of the invention;
FIG. 6 is another flow chart of a method for training an entity recognition model according to an embodiment of the invention;
FIG. 7 is another flow chart of a method for training an entity recognition model according to an embodiment of the invention;
FIG. 8 is another flow chart of a method for training an entity recognition model in accordance with an embodiment of the present invention;
FIG. 9 is a diagram illustrating an embodiment of an entity recognition model training apparatus;
FIG. 10 is a schematic diagram of a computer device according to an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The entity recognition model training method provided by the embodiment of the invention can be applied to the application environment shown in fig. 1. Specifically, the entity recognition model training method is applied to an entity recognition model training system; the entity recognition model training system comprises the client and the server shown in fig. 1, and the client and the server communicate through a network to realize entity recognition model training and improve the accuracy of the model. The client, also called the user side, is a program that corresponds to the server and provides local services for the user. The client may be installed on, but is not limited to, various personal computers, laptops, smartphones, tablets, and portable wearable devices. The server may be implemented as a stand-alone server or as a server cluster consisting of a plurality of servers.
In an embodiment, as shown in fig. 2, an entity recognition model training method is provided, which is described by taking the example in which the method is applied to the server in fig. 1, and includes the following steps:
s201: training samples are obtained, wherein the training samples comprise training standard entities and a plurality of training synonymous entities corresponding to each training standard entity.
Training samples are samples used for training the entity recognition model. A training standard entity is an entity under the unified name used within the industry. A training synonymous entity is an alias of a training standard entity or an entity morphologically similar to the training standard entity. For example, when the training sample is a medically related sample, the training standard entity may be COVID-19, and the training synonymous entities may be covid-19 and MERS.
S202: and performing vector similarity calculation on the training standard entities and each training synonymous entity by adopting a word frequency algorithm to obtain the sparse similarity of each training synonymous entity and the training standard entity.
The word frequency algorithm is a common weighting technique used in information retrieval and text mining; it evaluates how often a word recurs across a set of domain documents in a corpus.
The sparse similarity is used to represent the degree of morphological similarity between each training synonymous entity and the training standard entity. Understandably, the higher the sparse similarity corresponding to a training synonymous entity, the more similar in form that training synonymous entity is to the training standard entity.
In this embodiment, the sparse similarity between the training synonymous entity and the training standard entity is obtained through the word frequency algorithm, so that the training synonymous entity with higher sparse similarity is used to train the model, and the accuracy of the trained model is ensured to be higher.
S203: and adopting a semantic recognition model to calculate the vector similarity of the training standard entity and each training synonymous entity to obtain the dense similarity corresponding to each training synonymous entity.
The semantic recognition model is a model trained in advance for recognizing training standard entities and training synonymous entities; it may be a pre-trained BERT model.
The dense similarity is used to represent the degree of semantic similarity between each training synonymous entity and the training standard entity. Understandably, the higher the dense similarity corresponding to a training synonymous entity, the more semantically similar that training synonymous entity is to the training standard entity.
In this embodiment, the dense similarity is obtained quickly by using the semantic recognition model, so that training synonymous entities with higher semantic similarity to the training standard entity can be found subsequently, providing technical support for subsequent model training and ensuring higher accuracy of the trained model.
S204: and screening the training synonymous entities according to the sparse similarity and the dense similarity to obtain the target synonymous entities.
Wherein, the target synonymous entity is a training synonymous entity for training the entity recognition model. In this embodiment, the target synonymous entity is a training synonymous entity with a large dense similarity and a large sparse similarity, so that semantic information and morphological information of the training synonymous entity are fully considered, and it is ensured that an entity recognition model obtained by training the target synonymous entity has better performance and higher accuracy, and the model training time can be greatly shortened.
In the embodiment, the target synonymous entity used for training the model is selected from the training synonymous entities through the dense similarity and the sparse similarity, so that the training duration of the model can be effectively shortened, semantic information and morphological information carried by the target synonymous entity are similar to those of a training standard entity, and the entity recognition model obtained by training the target synonymous entity is better in performance and higher in accuracy.
S205: and processing the target synonymous entity by adopting a batch gradient descent method to obtain a plurality of batch training sets.
The batch gradient descent method is an algorithm that divides the target synonymous entities into a plurality of batch training sets; the parameters of the model are updated by batch training, which reduces computational overhead and randomness. A batch training set is a set of target synonymous entities obtained by batch processing of the target synonymous entities. For example, if the number of target synonymous entities is 5 thousand, the target synonymous entities are divided into 5 batch training sets, and the number of target synonymous entities in each batch training set is 1 thousand.
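By way of illustration, a minimal sketch of this batching step in Python follows; the function name is illustrative, the target synonymous entities are assumed to be held in a list, and the shuffling step is a common mini-batch practice that the patent does not specify:

```python
import random

def make_batches(target_entities, batch_size, shuffle=True):
    """Divide the target synonymous entities into fixed-size batch training sets:
    e.g. 5,000 entities with batch_size=1,000 gives 5 batch training sets."""
    entities = list(target_entities)
    if shuffle:
        random.shuffle(entities)  # optional; not prescribed by the patent
    return [entities[i:i + batch_size]
            for i in range(0, len(entities), batch_size)]
```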
S206: and sequentially using the batch training sets to perform batch training on the BioBERT model, optimizing the loss function in the BioBERT model, and obtaining the entity recognition model.
Here, the loss function is the loss function corresponding to the pre-trained BioBERT model.
In this embodiment, the target synonymous entities are divided so that the pre-trained BioBERT model can be trained in batches, which reduces the amount of computation and the randomness; and because the target synonymous entities carry rich semantic and morphological information, the trained entity recognition model is guaranteed to have high accuracy and good performance.
In the entity recognition model training method provided by this embodiment, a word frequency algorithm is used to calculate the vector similarity between the training standard entity and each training synonymous entity, obtaining the sparse similarity of each training synonymous entity to the training standard entity; because the sparse similarity is obtained through the word frequency algorithm, training synonymous entities with higher sparse similarity can be used to train the model, ensuring higher accuracy of the trained model. A semantic recognition model is used to calculate the vector similarity between the training standard entity and each training synonymous entity, obtaining the dense similarity corresponding to each training synonymous entity; the semantic recognition model obtains the dense similarity quickly, so that training synonymous entities with higher semantic similarity to the training standard entity can be found subsequently, providing technical support for subsequent model training and ensuring higher accuracy of the trained model. Target synonymous entities are screened from the training synonymous entities according to the sparse similarity and the dense similarity, so that both the semantic information and the morphological information of the training synonymous entities are fully considered; this ensures that the entity recognition model trained on the target synonymous entities has better performance and higher accuracy, and greatly shortens model training time. The target synonymous entities are processed with a batch gradient descent method to obtain a plurality of batch training sets, reducing computational overhead and randomness. The training process is simple: the batch training sets are used in turn to batch-train the BioBERT model, optimizing the loss function in the BioBERT model to obtain the entity recognition model; this reduces the amount of computation and the randomness, and because the target synonymous entities carry rich semantic and morphological information, the trained entity recognition model is guaranteed to have high accuracy and good performance.
As an embodiment, as shown in fig. 3, in step S202, a word frequency algorithm is used to perform vector similarity calculation on training standard entities and each training synonymous entity, and obtaining sparse similarity between each training synonymous entity and the training standard entity includes:
s301: and respectively carrying out vector conversion processing on the training standard entities and each training synonymous entity by adopting a word frequency algorithm to obtain standard sparse vectors of the training standard entities and synonymous sparse vectors of each training synonymous entity.
S302: and performing inner product processing on each synonymous sparse vector and the standard sparse vector to obtain the sparse similarity of each synonymous sparse vector and the standard sparse vector.
The standard sparse vector is the sparse vector corresponding to the training standard entity and is used for representing the morphological information of the training standard entity. A sparse vector is a vector in which the number of elements with the value 0 exceeds the number of elements with nonzero values.
The synonymous sparse vector is a sparse vector corresponding to the training synonymous entity, and the synonymous sparse vector is used for representing morphological information of the training synonymous entity.
In this embodiment, the synonymous sparse vector and the standard sparse vector are obtained through the word frequency algorithm, so that the sparse similarity is obtained according to the synonymous sparse vector and the standard sparse vector, and therefore it is ensured that a training synonymous entity with a higher morphological similarity degree with a training standard entity can be found, and technical support can be provided for subsequent model training.
In the entity recognition model training method provided in this embodiment, a word frequency algorithm is used to perform vector transformation processing on a training standard entity and each training synonymous entity respectively, so as to obtain a standard sparse vector of the training standard entity and a synonymous sparse vector of each training synonymous entity. And performing inner product processing on each synonymous sparse vector and a standard sparse vector to obtain the sparse similarity of each synonymous sparse vector and the standard sparse vector, thereby ensuring that a training synonymous entity with higher morphological similarity with a training standard entity can be found, and providing technical support for subsequent model training.
As an embodiment, as shown in fig. 4, step S301, namely, performing vector transformation processing on a training standard entity and each training synonymous entity by using a word frequency algorithm, respectively, to obtain a standard sparse vector of the training standard entity and a synonymous sparse vector of each training synonymous entity, includes:
s401: and carrying out segmentation processing on the training standard entity and each training synonymous entity to respectively obtain the multi-element segmentation characters corresponding to the training standard entity and the multi-element segmentation characters corresponding to all the training synonymous entities.
A multi-element segmentation character is a character piece obtained by segmenting a training standard entity or a training synonymous entity at the character level. For example, segmenting the training standard entity COVID-19 may yield the multi-element segmentation characters CO, VI, D- and 19, or COVI and D-19, and so on. The multi-element segmentation characters of this embodiment are 2-element segmentation characters; that is, for the training standard entity COVID-19, the resulting multi-element segmentation characters are CO, VI, D- and 19.
In this embodiment, the training standard entity and each training synonymous entity are segmented at the character level, which provides technical support for subsequently calculating the similarity between the training standard entity and each training synonymous entity at the character level, ensures that the subsequently obtained synonymous sparse vectors are more accurate, and makes the trained model more accurate.
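By way of illustration, this 2-element segmentation can be sketched as follows; the function name is illustrative and assumes non-overlapping segments, matching the COVID-19 example above:

```python
def char_ngrams(entity: str, n: int = 2) -> list[str]:
    """Split an entity into consecutive, non-overlapping n-character pieces:
    char_ngrams("COVID-19") -> ['CO', 'VI', 'D-', '19']."""
    return [entity[i:i + n] for i in range(0, len(entity), n)]
```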
S402: processing the multi-element segmentation characters corresponding to the training standard entity by adopting a TF-IDF algorithm to obtain a standard sparse vector corresponding to the training standard entity; and processing the multi-element segmentation characters corresponding to each training synonymous entity by adopting a TF-IDF algorithm to obtain the synonymous sparse vector of each training synonymous entity.
The TF-IDF algorithm is a statistical analysis method for keywords, used to evaluate the importance of a word to a document set or a corpus. In this embodiment, the TF-IDF algorithm is used to determine the sparse vector of each training synonymous entity or training standard entity, so that the morphological similarity of the training synonymous entities to the training standard entities is clearly reflected; the model is then trained by combining this with the semantic similarity of the training synonymous entities or training standard entities, improving the performance of the model.
The entity recognition model training method provided by this embodiment segments the training standard entity and each training synonymous entity to obtain the multi-element segmentation characters corresponding to the training standard entity and those corresponding to all the training synonymous entities; this character-level segmentation provides technical support for subsequently calculating the similarity between the training standard entity and each training synonymous entity at the character level, ensuring that the subsequently obtained synonymous sparse vectors are more accurate and the trained model is more accurate. The TF-IDF algorithm is used to process the multi-element segmentation characters corresponding to the training standard entity, obtaining the standard sparse vector corresponding to the training standard entity; the TF-IDF algorithm is likewise used to process the multi-element segmentation characters corresponding to each training synonymous entity, obtaining the synonymous sparse vector of each training synonymous entity, which are subsequently combined with the semantic similarity of the training synonymous entities or training standard entities to train the model and improve its performance.
As an embodiment, as shown in fig. 5, in step S402, processing the multi-element segmented character by using a TF-IDF algorithm, and obtaining a standard sparse vector corresponding to a training standard entity and a synonymous sparse vector of each training synonymous entity, the method includes:
s501: processing the multi-element segmentation characters corresponding to the training standard entity by adopting a TF-IDF algorithm to obtain word frequencies and inverse document frequencies corresponding to the multi-element segmentation characters contained in the training standard entity; and processing the multi-element segmentation characters corresponding to each training synonymous entity by adopting a TF-IDF algorithm to obtain the word frequency and the inverse document frequency corresponding to the multi-element segmentation characters contained in the synonymous entity.
S502: and acquiring a standard sparse vector corresponding to the training standard entity based on the word frequency and the inverse document frequency corresponding to the multi-element segmentation characters contained in the training standard entity.
S503: and acquiring a synonymous sparse vector corresponding to each training synonymous entity based on the word frequency and the inverse document frequency corresponding to the multivariate segmentation characters contained in that training synonymous entity.
The term frequency refers to the number of occurrences of a multi-element segmentation character in the training sample. The inverse document frequency is a measure of the general importance of a multi-element segmentation character; it can be obtained by dividing the total number of documents by the number of documents containing the character and taking the logarithm of the quotient. The inverse document frequency of a multi-element segmentation character is

$$\mathrm{idf} = \log\frac{f}{d}$$

where f is the total number of training standard entities and training synonymous entities, and d is the number of training standard entities and training synonymous entities that contain the multi-element segmentation character.
Specifically, a character library is constructed according to the multi-element segmentation characters; determining the character position of each multi-element segmentation character in a character library; determining the word frequency corresponding to each multi-element segmentation character contained in the training standard entity according to the character position; combining the word frequencies corresponding to all the multi-element segmentation characters contained in the training standard entity to obtain a word frequency vector; calculating the inverse document frequency corresponding to each multi-element segmentation character contained in the training standard entity according to the inverse document frequency function; combining the inverse document frequencies corresponding to all the multivariate segmentation characters contained in the training standard entity to obtain an inverse document frequency vector; and multiplying the word frequency vector and the inverse document frequency vector to obtain a corresponding standard sparse vector corresponding to the training standard entity. The character library is a database containing multiple segmented characters. The process of obtaining the synonymous sparse vector corresponding to the training synonymous entity is the same as the process of obtaining the standard sparse vector corresponding to the training standard entity, and is not repeated herein.
Illustratively, the training standard entity is COVID-19 and the training synonymous entities are MERS and SARS-COV2. Performing 2-element segmentation on the training standard entity and the training synonymous entities yields a character library: CO, 1; VI, 2; D-, 3; 19, 4; ME, 5; RS, 6; SA, 7; RS, 8; -C, 9; OV, 10; V2, 11, where the numbers 1 to 11 denote the positions of the multi-element segmentation characters in the character library. The word frequency vector of the training standard entity is then the vector formed by the word frequencies of the multi-element segmentation characters composing it, [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]; the inverse document frequency vector of the training standard entity is likewise the vector formed by the inverse document frequencies of those characters.
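A minimal sketch of steps S501 to S503 under the definitions above (character library positions, idf = log(f/d), and the elementwise product of the word frequency and inverse document frequency vectors), reusing the char_ngrams helper from the earlier sketch; all function names are illustrative, and each distinct n-gram is assigned a single library position:

```python
import math
from collections import Counter

def build_char_library(entities, n=2):
    """Map each distinct multi-element segmentation character to its library position."""
    library = {}
    for entity in entities:
        for gram in char_ngrams(entity, n):
            library.setdefault(gram, len(library))
    return library

def tfidf_sparse_vector(entity, entities, library, n=2):
    """Word frequency vector times inverse document frequency vector, elementwise."""
    f = len(entities)  # total number of training standard and synonymous entities
    vec = [0.0] * len(library)
    for gram, tf in Counter(char_ngrams(entity, n)).items():
        d = sum(1 for e in entities if gram in char_ngrams(e, n))  # entities containing the gram
        vec[library[gram]] = tf * math.log(f / d)                  # idf = log(f / d)
    return vec

def sparse_similarity(synonym_vec, standard_vec):
    """Step S302: inner product of a synonymous sparse vector and the standard sparse vector."""
    return sum(a * b for a, b in zip(synonym_vec, standard_vec))
```

For the example above, entities = ["COVID-19", "MERS", "SARS-COV2"] produces a library over the listed bigrams, and tfidf_sparse_vector("COVID-19", ...) is nonzero only at the positions occupied by CO, VI, D- and 19.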
According to the entity recognition model training method provided by this embodiment, determining the standard sparse vector of the training standard entity and the synonymous sparse vectors of the training synonymous entities through the TF-IDF algorithm can greatly reduce the number of training samples and speed up model training, while determining the similarity between the training standard entity and the training synonymous entities at the character level ensures higher accuracy of the similarity. The similarity between the training standard entity and the training synonymous entities can therefore be determined better in the subsequent process.
As an embodiment, as shown in fig. 6, in step S203, a semantic recognition model is adopted to perform vector similarity calculation on the training standard entity and each training synonymous entity to obtain the dense similarity corresponding to each training synonymous entity, where the dense similarity calculation includes:
s601: and performing vector conversion processing on the training standard entity by adopting a semantic recognition model to obtain a standard dense vector of the training standard entity.
The standard dense vector is the dense vector corresponding to the training standard entity and is used for representing the semantic information of the training standard entity. A dense vector is the opposite of a sparse vector: the number of elements with the value 0 is smaller than the number of elements with nonzero values.
S602: adopting a semantic recognition model to perform vector conversion processing on each training synonymous entity to obtain a synonymous dense vector of each training synonymous entity;
the synonymy dense vector is a dense vector corresponding to the training synonymy entity, and the standard dense vector is used for representing semantic information of the training synonymy entity.
S603: and performing inner product processing on each synonymous dense vector with the standard dense vector to obtain the dense similarity of each synonymous dense vector to the standard dense vector.
In this embodiment, the dense similarity is obtained from the standard dense vector corresponding to the training standard entity and the synonymous dense vector corresponding to each training synonymous entity, so that training synonymous entities semantically more similar to the training standard entity can subsequently be screened out to train the model, ensuring that the trained model has higher accuracy.
In the entity recognition model training method provided by this embodiment, the semantic recognition model is used to obtain the standard dense vector of the training standard entity and the synonymous dense vector of each training synonymous entity, so that the standard dense vector and the synonymous dense vectors can be obtained quickly, improving training efficiency. Inner product processing is then performed on each synonymous dense vector with the standard dense vector to obtain the dense similarity of each synonymous dense vector to the standard dense vector, so that training synonymous entities semantically more similar to the training standard entity can subsequently be screened out to train the model, ensuring that the trained model has higher accuracy.
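A sketch of steps S601 to S603 using the HuggingFace transformers library follows; the specific checkpoint name and the use of the [CLS] embedding as the dense vector are assumptions, since the patent only states that the semantic recognition model may be a pre-trained BERT model:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumption: a public BioBERT checkpoint stands in for the pre-trained
# semantic recognition model described above.
tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-base-cased-v1.1")
encoder = AutoModel.from_pretrained("dmis-lab/biobert-base-cased-v1.1")

@torch.no_grad()
def dense_vector(entity: str) -> torch.Tensor:
    """Vector conversion (S601/S602): encode the entity, take the [CLS] embedding."""
    inputs = tokenizer(entity, return_tensors="pt")
    return encoder(**inputs).last_hidden_state[:, 0]

def dense_similarity(synonym: str, standard: str) -> float:
    """Step S603: inner product of the synonymous and standard dense vectors."""
    return (dense_vector(synonym) @ dense_vector(standard).T).item()
```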
As an embodiment, as shown in fig. 7, in step S204, the screening the training synonymous entities according to the sparse similarity and the dense similarity to obtain the target synonymous entity includes:
s701: acquiring target parameters, wherein the target parameters comprise an acquisition quantity parameter and a proportion parameter;
the target parameters are parameters for screening the target synonymous entities in the training synonymous entities.
The quantity parameter is a parameter that represents the desired quantity of target synonymous entities. The proportion parameter is a parameter indicating the proportion, among all target synonymous entities, of the target synonymous entities determined according to the dense similarity. For example, the proportion parameter may be 0.78 and the quantity of synonymous entities may be 100,000.
S702: and putting the first a training synonymous entities with the highest dense similarity into a first candidate entity set, wherein a is the product of a quantity parameter and a proportion parameter.
The first candidate entity set is an entity set obtained by screening all training synonymous entities according to the magnitude of the dense similarity; it comprises the first a training synonymous entities with the highest dense similarity, which helps improve the accuracy of the model.
For example, if the number parameter of the synonymous entities is k and the ratio parameter is α, the training synonymous entities included in the first candidate entity set are α k.
S703: putting the first b training synonymous entities with the highest sparse similarity into a second candidate entity set, wherein b is the difference of the quantity parameter minus a;
the second candidate entity set is an entity set obtained by screening all training synonymous entities according to the size of the sparse similarity, and the second candidate entity set comprises the first b training synonymous entities with the sparse similarity, so that the accuracy of the model is improved.
Illustratively, if the number parameter of the synonymous entities is k and the ratio parameter is α, the training synonymous entities included in the second candidate entity set are k- α k.
S704: and acquiring the target synonymous entities corresponding to the quantity parameters according to the first candidate entity set and the second candidate entity set.
In this embodiment, the training synonymous entities in the first candidate entity set and the training synonymous entities in the second candidate entity set are determined as target synonymous entities, so that samples for model training can be effectively reduced, and the efficiency of model training is improved.
Further, S704 includes: judging whether the same training synonymous entity exists in the first candidate entity set and the second candidate entity set; if the same training synonymous entity exists in the first candidate entity set and the second candidate entity set, deleting the same training synonymous entity from the first candidate entity set to obtain a third candidate entity set; counting the number of entities corresponding to the same training synonymous entity in the first candidate entity set and the second candidate entity set; and acquiring candidate synonymous entities corresponding to the entity quantity from training synonymous entities except the first candidate entity set and the second candidate entity set according to the dense similarity, and acquiring target synonymous entities according to the candidate synonymous entities, the second candidate entity set and the third candidate entity set.
And the third candidate entity set is the entity set obtained by deleting the same training synonymous entities in the first candidate entity set. The number of entities is the number corresponding to the same training synonymous entity in the first candidate entity set and the second candidate entity set.
In this embodiment, the training synonymous entities outside the first candidate entity set and the second candidate entity set are sorted from high dense similarity to low; the top-ranked training synonymous entities, equal in number to the counted entity quantity, are taken as candidate synonymous entities, and the training synonymous entities in the candidate synonymous entities, the second candidate entity set, and the third candidate entity set are determined as the target synonymous entities. In this way, more target synonymous entities with higher dense similarity are obtained, ensuring that the subsequently trained model has better performance.
For example, when the first candidate entity set includes the training synonymous entity MERS and the second candidate entity set also includes the training synonymous entity MERS, the same training synonymous entity exists in the first candidate entity set and the second candidate entity set. When the first candidate entity set does not include the training synonymous entity MERS but the second candidate entity set does, the same training synonymous entity does not exist in the two sets.
In the entity recognition model training method provided by this embodiment, the first a training synonymous entities with the highest dense similarity are determined as the first candidate entity set, which is beneficial to improving the accuracy of the model. And according to the quantity parameter and the proportion parameter of the synonymous entities, determining the first b training synonymous entities with the highest sparse similarity as a second candidate entity set, so that the accuracy of the model is improved. And acquiring the target synonymous entity corresponding to the synonymous entity quantity parameter according to the first candidate entity set and the second candidate entity set, so that the samples of model training can be effectively reduced, the efficiency of model training is improved, and the target synonymous entity is an entity with higher sparse similarity and dense similarity with the training standard entity, so that the accuracy of the model can be effectively improved.
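The selection procedure of steps S701 to S704, including the deduplication and backfill refinement described above, can be sketched as follows; the function and parameter names are illustrative, and dense_sim and sparse_sim are assumed to be per-entity similarity functions against the training standard entity (e.g. dense_sim=lambda e: dense_similarity(e, standard)):

```python
def select_target_synonyms(entities, dense_sim, sparse_sim, k, alpha):
    """Steps S701-S704: take the top a = alpha*k entities by dense similarity
    (first candidate set) and the top b = k - a by sparse similarity (second
    candidate set); duplicates are removed from the first set and replaced by
    the next-best entities by dense similarity, so exactly k targets remain."""
    a = int(alpha * k)
    by_dense = sorted(entities, key=dense_sim, reverse=True)
    first = by_dense[:a]                                             # first candidate set
    second = sorted(entities, key=sparse_sim, reverse=True)[:k - a]  # second candidate set

    third = [e for e in first if e not in second]   # third candidate set (duplicates removed)
    n_dup = len(first) - len(third)                 # entity quantity of duplicates
    chosen = set(third) | set(second)
    backfill = [e for e in by_dense if e not in chosen][:n_dup]
    return third + second + backfill
```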
As an embodiment, as shown in fig. 8, before step S206, that is, before sequentially using the batch training sets to batch-train the BioBERT model and optimizing the loss function in the BioBERT model to obtain the entity recognition model, the method further includes:
S801: calculating the target similarity corresponding to each training synonymous entity based on the sparse similarity and the dense similarity corresponding to that training synonymous entity.
The target similarity is obtained from the sparse similarity and the dense similarity. Understandably, because the target similarity takes both into account, the importance of the sparse similarity and the dense similarity can be balanced when subsequently training the model, effectively improving the accuracy of the trained model.
Specifically, the target similarity is calculated with the similarity function

$$s(m, n) = s_1(m, n) + \lambda\, s_2(m, n)$$

where m is the training standard entity; n is the training synonymous entity; s(m, n) is the target similarity; s_1(m, n) is the dense similarity; and s_2(m, n) is the sparse similarity. λ is a weight scalar obtained by logistic regression training, so that the importance of the sparse similarity and the dense similarity can be better balanced, effectively improving the accuracy of the trained model.
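As a sketch, this target similarity can be written as a small PyTorch module; treating λ as a learnable parameter optimized with the rest of the model is an assumption standing in for the patent's logistic-regression training of λ:

```python
import torch

class TargetSimilarity(torch.nn.Module):
    """s(m, n) = s1(m, n) + lambda * s2(m, n), with s1 the dense similarity and
    s2 the sparse similarity; lambda is a learnable weight scalar."""
    def __init__(self):
        super().__init__()
        self.lam = torch.nn.Parameter(torch.tensor(1.0))

    def forward(self, dense_sim: torch.Tensor, sparse_sim: torch.Tensor) -> torch.Tensor:
        return dense_sim + self.lam * sparse_sim
```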
S802: and determining the probability of each training synonymous entity according to the target similarity corresponding to the training synonymous entities and the target similarity corresponding to the target synonymous entities, and determining the marginal probability of the training synonymous entities based on the probabilities of the training synonymous entities.
The probability of a training synonymous entity is

$$P(n \mid m) = \frac{\exp\big(s(m, n)\big)}{\sum_{x=1}^{K} \exp\big(s(m, n_x)\big)}$$

where N_{1:K} denotes the target synonymous entities, i.e. target synonymous entities 1 to K (K can be understood as the quantity parameter), and n_x is a target synonymous entity. The marginal probability of the training synonymous entities of a training standard entity m is then

$$P'(m) = \sum_{n \in N_{1:K},\; n \equiv m} P(n \mid m)$$

where n ≡ m denotes that the target synonymous entity n is a synonym of the training standard entity m.
S803: a loss function is obtained based on the marginal probability of each training synonymous entity.
In this example, the loss function is obtained according to

$$\mathcal{L} = -\frac{1}{M} \sum_{i=1}^{M} \log P'(m_i)$$

where M is the number of training standard entities. In this embodiment, when the loss function is determined, target training entities with high sparse similarity and high dense similarity are used, so that the model can learn better morphological information and semantic information, improving the accuracy of the model.
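A sketch of the resulting training objective follows, assuming the probabilities are computed as the softmax over the K target candidates reconstructed above; tensor shapes and names are illustrative:

```python
import torch

def marginal_nll(scores: torch.Tensor, synonym_mask: torch.Tensor) -> torch.Tensor:
    """Steps S801-S803 for a batch of M training standard entities.

    scores:       (M, K) target similarities s(m, n_x) against the K target candidates.
    synonym_mask: (M, K) boolean, True where candidate x is a synonym of entity m.
    Returns L = -(1/M) * sum_i log P'(m_i).
    """
    probs = torch.softmax(scores, dim=1)           # P(n_x | m)
    marginal = (probs * synonym_mask).sum(dim=1)   # P'(m): marginalize over synonyms
    return -torch.log(marginal.clamp_min(1e-12)).mean()
```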
In the entity recognition model training method provided by this embodiment, the target similarity corresponding to each training synonymous entity is calculated based on its sparse similarity and dense similarity; the probability of each training synonymous entity is determined according to the target similarity corresponding to the training synonymous entities and the target similarity corresponding to the target synonymous entities, and the marginal probability of the training synonymous entities is determined based on those probabilities; and the loss function is obtained based on the marginal probability of each training synonymous entity. When determining the loss function, target training entities with higher sparse similarity and dense similarity are used, so that the model can learn better morphological information and semantic information, improving the accuracy of the model.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In an embodiment, an entity recognition model training apparatus is provided, and the entity recognition model training apparatus corresponds to the entity recognition model training methods in the above embodiments one to one. As shown in fig. 9, the entity recognition model training apparatus includes a training sample acquisition module 901, a sparse similarity acquisition module 902, a dense similarity acquisition module 903, a target synonymous entity acquisition module 904, a batch training set acquisition module 905, and an entity recognition model acquisition module 906. The functional modules are explained in detail as follows:
a training sample obtaining module 901, configured to obtain training samples, where a training sample includes training standard entities and a plurality of training synonymous entities corresponding to each training standard entity.
The sparse similarity obtaining module 902 is configured to perform vector similarity calculation on the training standard entities and each training synonymous entity by using a word frequency algorithm, and obtain sparse similarity between each training synonymous entity and the training standard entity.
The dense similarity obtaining module 903 is configured to perform vector similarity calculation on the training standard entities and each training synonymous entity by using a semantic recognition model to obtain dense similarity corresponding to each training synonymous entity.
And a target synonymous entity obtaining module 904, configured to obtain a target synonymous entity by screening from the training synonymous entities according to the sparse similarity and the dense similarity.
A batch training set obtaining module 905, configured to process the target synonymous entity by using a batch gradient descent method, and obtain multiple batch training sets.
And an entity recognition model obtaining module 906, configured to perform batch training on the BioBERT model by sequentially using the batch training sets, optimize the loss function in the BioBERT model, and obtain the entity recognition model.
Preferably, the sparse similarity obtaining module 902 includes: a sparse vector acquisition unit and a first inner product acquisition unit.
And the sparse vector acquisition unit is used for respectively carrying out vector conversion processing on the training standard entities and each training synonymous entity by adopting a word frequency algorithm to acquire the standard sparse vectors of the training standard entities and the synonymous sparse vectors of each training synonymous entity.
And the first inner product acquisition unit is used for carrying out inner product processing on each synonymous sparse vector and the standard sparse vector respectively to acquire the sparse similarity of each synonymous sparse vector and the standard sparse vector.
Preferably, the sparse vector acquisition unit includes: a character acquisition unit and a processing unit.
And the character acquisition unit is used for carrying out segmentation processing on the training standard entity and each training synonymous entity to respectively obtain the multi-element segmentation characters corresponding to the training standard entity and the multi-element segmentation characters corresponding to all the training synonymous entities.
The processing unit is used for processing the multi-element segmentation characters corresponding to the training standard entity by adopting a TF-IDF algorithm to obtain a standard sparse vector corresponding to the training standard entity; and processing the multi-element segmentation characters corresponding to each training synonymous entity by adopting a TF-IDF algorithm to obtain the synonymous sparse vector of each training synonymous entity.
Preferably, the processing unit comprises: the word frequency acquiring unit comprises a word frequency acquiring subunit, a first sparse subunit and a second sparse subunit.
The word frequency obtaining subunit is used for processing the multi-element segmentation characters corresponding to the training standard entity by adopting a TF-IDF algorithm to obtain the word frequency and the inverse document frequency corresponding to the multi-element segmentation characters contained in the training standard entity; and processing the multi-element segmentation characters corresponding to each training synonymous entity by adopting a TF-IDF algorithm to obtain the word frequency and the inverse document frequency corresponding to the multi-element segmentation characters contained in the synonymous entity.
And the first sparse subunit is used for acquiring a standard sparse vector corresponding to the training standard entity based on the word frequency and the inverse document frequency corresponding to the multivariate segmentation characters contained in the training standard entity.
And the second sparse subunit is used for acquiring the synonymous sparse vector corresponding to the training synonymous entity based on the word frequency and the inverse document frequency corresponding to the multivariate segmentation characters contained in the training synonymous entity.
Preferably, the dense similarity obtaining module 903 includes: the device comprises a first dense vector unit, a second dense vector unit and a dense similarity acquisition unit.
And the first dense vector unit is used for performing vector conversion processing on the training standard entity by adopting a semantic recognition model to obtain a standard dense vector of the training standard entity.
And the second dense vector unit is used for performing vector conversion processing on each training synonymous entity by adopting a semantic recognition model to obtain the synonymous dense vector of each training synonymous entity.
And the dense similarity acquisition unit is used for performing inner product processing on each synonymous dense vector with the standard dense vector to acquire the dense similarity of each synonymous dense vector to the standard dense vector.
Preferably, the target synonymous entity obtaining module 904 includes: the system comprises a target parameter acquisition unit, a first candidate entity set acquisition unit, a second candidate entity set acquisition unit and a target synonymous entity acquisition unit.
And the target parameter acquisition unit is used for acquiring target parameters, and the target parameters comprise acquisition quantity parameters and proportion parameters.
And the first candidate entity set acquisition unit is used for putting the first a training synonymous entities with the highest dense similarity into the first candidate entity set, wherein a is the product of the quantity parameter and the proportion parameter.
And the second candidate entity set acquisition unit is used for putting the first b training synonymous entities with the highest sparse similarity into the second candidate entity set, wherein b is the difference of the quantity parameter minus a.
And the target synonymous entity acquiring unit is used for acquiring the target synonymous entities corresponding to the quantity parameters according to the first candidate entity set and the second candidate entity set.
Preferably, the apparatus further comprises, operating before the entity recognition model obtaining module 906: a target similarity obtaining module, a probability obtaining module, and a loss function obtaining module.
And the target similarity obtaining module is used for calculating the target similarity corresponding to the training synonym based on the sparse similarity and the dense similarity corresponding to each training synonym entity.
And the probability acquisition module is used for determining the probability of each training synonymous entity according to the target similarity corresponding to the training synonymous entities and the target similarity corresponding to the target synonymous entities, and determining the marginal probability of the training synonymous entities based on the probability of the training synonymous entities.
And the loss function acquisition module is used for acquiring a loss function based on the marginal probability of each training synonymous entity.
For the specific definition of the entity recognition model training device, reference may be made to the above definition of the entity recognition model training method, which is not described herein again. The modules in the entity recognition model training device can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used to store training samples. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of entity recognition model training.
In an embodiment, a computer device is provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the steps of the entity recognition model training method in the foregoing embodiments are implemented, for example, steps S201 to S206 shown in fig. 2 or the steps shown in fig. 3 to fig. 8, which are not repeated here. Alternatively, when the processor executes the computer program, the functions of each module/unit in the embodiment of the entity recognition model training apparatus are implemented, such as the functions of the training sample acquisition module 901, the sparse similarity acquisition module 902, the dense similarity acquisition module 903, the target synonymous entity acquisition module 904, the batch training set acquisition module 905, and the entity recognition model acquisition module 906 shown in fig. 9, which are likewise not repeated here.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the steps of the entity recognition model training method in the foregoing embodiments, for example, steps S201 to S206 shown in fig. 2 or the steps shown in fig. 3 to fig. 8, which are not repeated here. Alternatively, when the computer program is executed by the processor, the functions of each module/unit in the embodiment of the entity recognition model training apparatus are implemented, such as the functions of the training sample acquisition module 901, the sparse similarity acquisition module 902, the dense similarity acquisition module 903, the target synonymous entity acquisition module 904, the batch training set acquisition module 905, and the entity recognition model acquisition module 906 shown in fig. 9, which are likewise not repeated here.
It will be understood by those skilled in the art that all or part of the processes of the methods in the embodiments described above may be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium, and when executed may include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the division of the functional units and modules described above is illustrated. In practical applications, the functions may be distributed among different functional units and modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to perform all or part of the functions described above.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention and are intended to be included within the scope of the present invention.

Claims (10)

1. An entity recognition model training method is characterized by comprising the following steps:
obtaining training samples, wherein the training samples comprise training standard entities and a plurality of training synonymous entities corresponding to each training standard entity;
performing vector similarity calculation on the training standard entities and each training synonymous entity by adopting a word frequency algorithm to obtain the sparse similarity of each training synonymous entity and the training standard entities;
adopting a semantic recognition model to calculate the vector similarity of the training standard entity and each training synonymous entity to obtain the dense similarity corresponding to each training synonymous entity;
screening target synonymous entities from the training synonymous entities according to the sparse similarity and the dense similarity;
processing the target synonymous entity by adopting a batch gradient descent method to obtain a plurality of batch training sets;
and sequentially adopting the batch training sets to perform batch training on the biobert model, optimizing a loss function in the biobert model, and obtaining an entity recognition model.
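By way of illustration only, the batching step of claim 1 might be realized as below; the shuffle and the batch size are choices of this sketch rather than limitations of the claim:

```python
import random

def make_batch_training_sets(target_synonyms, batch_size=32, seed=0):
    """Partition the screened target synonymous entities into batch
    training sets for batch-wise training of the biobert model."""
    samples = list(target_synonyms)
    random.Random(seed).shuffle(samples)  # shuffling is an assumption of this sketch
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]
```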
2. The entity recognition model training method of claim 1, wherein the performing vector similarity calculation on the training standard entity and each of the training synonymous entities by using a word frequency algorithm to obtain the sparse similarity between each of the training synonymous entities and the training standard entity comprises:
respectively carrying out vector conversion processing on a training standard entity and each training synonymous entity by adopting a word frequency algorithm to obtain a standard sparse vector of the training standard entity and a synonymous sparse vector of each training synonymous entity;
and performing inner product processing on each synonymous sparse vector and the standard sparse vector respectively to obtain the sparse similarity of each synonymous sparse vector and the standard sparse vector.
3. The entity recognition model training method of claim 2, wherein the performing vector conversion processing on the training standard entity and each training synonymous entity by adopting a word frequency algorithm to obtain the standard sparse vector of the training standard entity and the synonymous sparse vector of each training synonymous entity respectively comprises:
performing segmentation processing on the training standard entity and each training synonymous entity to obtain the multi-element segmentation characters corresponding to the training standard entity and the multi-element segmentation characters corresponding to each training synonymous entity respectively;
processing the multi-element segmentation characters corresponding to the training standard entity by adopting a TF-IDF algorithm to obtain the standard sparse vector corresponding to the training standard entity; and processing the multi-element segmentation characters corresponding to each training synonymous entity by adopting a TF-IDF algorithm to obtain the synonymous sparse vector of each training synonymous entity.
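For illustration, the segmentation into multi-element segmentation characters can be read as splitting each entity into overlapping character n-grams; the choice n = 2 below is an assumption of this sketch:

```python
def multi_element_segments(entity, n=2):
    """Split an entity string into overlapping character n-grams,
    one reading of the 'multi-element segmentation characters'."""
    return [entity[i:i + n] for i in range(len(entity) - n + 1)]

# e.g. multi_element_segments("heart attack") -> ['he', 'ea', 'ar', 'rt', ...]
```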
4. The entity recognition model training method of claim 3, wherein the processing the multi-element segmentation characters corresponding to the training standard entity by adopting a TF-IDF algorithm to obtain the standard sparse vector corresponding to the training standard entity, and the processing the multi-element segmentation characters corresponding to each training synonymous entity by adopting a TF-IDF algorithm to obtain the synonymous sparse vector of each training synonymous entity, comprise:
processing the multi-element segmentation characters corresponding to the training standard entity by adopting a TF-IDF algorithm to obtain the word frequency and the inverse document frequency corresponding to each multi-element segmentation character contained in the training standard entity; and processing the multi-element segmentation characters corresponding to each training synonymous entity by adopting a TF-IDF algorithm to obtain the word frequency and the inverse document frequency corresponding to each multi-element segmentation character contained in that training synonymous entity;
acquiring the standard sparse vector corresponding to the training standard entity based on the word frequency and the inverse document frequency corresponding to the multi-element segmentation characters contained in the training standard entity;
and acquiring the synonymous sparse vector corresponding to each training synonymous entity based on the word frequency and the inverse document frequency corresponding to the multi-element segmentation characters contained in that training synonymous entity.
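A non-limiting sketch of claims 3 and 4 using scikit-learn follows; TfidfVectorizer stands in for the TF-IDF computation over character n-grams, and fitting it on just these entities (rather than a larger corpus, which the method may well use) is an assumption of the sketch:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

def sparse_similarities(standard_entity, synonym_entities, n=2):
    """Build TF-IDF sparse vectors over character n-grams, then score each
    training synonymous entity by the inner product of its synonymous
    sparse vector with the standard sparse vector."""
    corpus = [standard_entity] + list(synonym_entities)
    vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(n, n))
    vectors = vectorizer.fit_transform(corpus)   # one sparse row per entity
    standard_vec = vectors[0]
    synonym_vecs = vectors[1:]
    # Inner product of each synonymous sparse vector with the standard one.
    return (synonym_vecs @ standard_vec.T).toarray().ravel()
```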
5. The entity recognition model training method of claim 1, wherein the using the semantic recognition model to perform vector similarity calculation on the training standard entity and each training synonymous entity to obtain the dense similarity corresponding to each training synonymous entity comprises:
adopting a semantic recognition model to perform vector conversion processing on a training standard entity to obtain a standard dense vector of the training standard entity;
adopting a semantic recognition model to perform vector conversion processing on each training synonymous entity to obtain a synonymous dense vector of each training synonymous entity;
and performing inner product processing on each synonymous dense vector and the standard dense vector respectively to obtain the dense similarity of each synonymous dense vector and the standard dense vector.
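For illustration, the dense similarity of claim 5 might be computed with a BERT-style encoder as below; the HuggingFace checkpoint name and the mean pooling over token embeddings are assumptions of this sketch, not specified by the claim:

```python
import torch
from transformers import AutoModel, AutoTokenizer

def dense_similarities(standard_entity, synonym_entities,
                       checkpoint="dmis-lab/biobert-base-cased-v1.1"):
    """Encode each entity with a semantic recognition model, mean-pool the
    token embeddings into a dense vector, and take inner products of each
    synonymous dense vector with the standard dense vector."""
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)
    texts = [standard_entity] + list(synonym_entities)
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state      # (batch, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)       # mask out padding tokens
    dense = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # mean pooling
    standard_vec, synonym_vecs = dense[0], dense[1:]
    return synonym_vecs @ standard_vec                 # inner products
```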
6. The entity recognition model training method of claim 1, wherein the screening of the training synonymous entities according to the sparse similarity and the dense similarity to obtain target synonymous entities comprises:
acquiring target parameters, wherein the target parameters comprise a quantity parameter and a proportion parameter;
putting the top a training synonymous entities ranked by dense similarity into a first candidate entity set, wherein a is the product of the quantity parameter and the proportion parameter;
putting the top b training synonymous entities ranked by sparse similarity into a second candidate entity set, wherein b is the quantity parameter minus a;
and obtaining, from the first candidate entity set and the second candidate entity set, the target synonymous entities whose number corresponds to the quantity parameter.
7. The entity recognition model training method of claim 1, wherein before the sequentially adopting the batch training sets to perform batch training on the biobert model, optimizing the loss function in the biobert model, and obtaining the entity recognition model, the method further comprises:
calculating the target similarity corresponding to each training synonymous entity based on the sparse similarity and the dense similarity corresponding to that training synonymous entity;
determining the probability of each training synonymous entity according to the target similarity corresponding to the training synonymous entities and the target similarity corresponding to the target synonymous entities, and determining the marginal probability of the training synonymous entities based on the probability of the training synonymous entities;
and obtaining a loss function based on the marginal probability of each training synonymous entity.
8. An entity recognition model training device, comprising:
the training sample acquisition module is used for acquiring training samples, and each training sample comprises a training standard entity and a plurality of training synonymous entities corresponding to each training standard entity;
the sparse similarity obtaining module is used for calculating the vector similarity of the training standard entities and each training synonymous entity by adopting a word frequency algorithm to obtain the sparse similarity of each training synonymous entity and the training standard entities;
the dense similarity obtaining module is used for calculating the vector similarity of the training standard entities and each training synonymous entity by adopting a semantic recognition model to obtain the dense similarity corresponding to each training synonymous entity;
the target synonymous entity acquisition module is used for screening the training synonymous entities according to the sparse similarity and the dense similarity to obtain target synonymous entities;
the batch training set acquisition module is used for processing the target synonymous entity by adopting a batch gradient descent method to acquire a plurality of batch training sets;
and the entity recognition model obtaining module is used for sequentially adopting the batch training sets to perform batch training on the biobert model, optimizing a loss function in the biobert model and obtaining the entity recognition model.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the entity recognition model training method according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the entity recognition model training method according to any one of claims 1 to 7.
CN202110611212.3A 2021-06-01 2021-06-01 Entity recognition model training method and device, computer equipment and storage medium Active CN113239697B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110611212.3A CN113239697B (en) 2021-06-01 2021-06-01 Entity recognition model training method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113239697A true CN113239697A (en) 2021-08-10
CN113239697B (en) 2023-03-24

Family

ID=77136257

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110611212.3A Active CN113239697B (en) 2021-06-01 2021-06-01 Entity recognition model training method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113239697B (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019184118A1 * 2018-03-26 2019-10-03 Ping An Technology (Shenzhen) Co., Ltd. Risk model training method and apparatus, a risk identification method and apparatus, and device and medium
CN110276075A * 2019-06-21 2019-09-24 Tencent Technology (Shenzhen) Co., Ltd. Model training method, name entity recognition method, device, equipment and medium
CN110795565A * 2019-09-06 2020-02-14 Tencent Technology (Shenzhen) Co., Ltd. Semantic recognition-based alias mining method, device, medium and electronic equipment
WO2021051557A1 * 2019-09-18 2021-03-25 Ping An Technology (Shenzhen) Co., Ltd. Semantic recognition-based keyword determination method and apparatus, and storage medium
CN111221981A * 2019-12-31 2020-06-02 Tencent Technology (Shenzhen) Co., Ltd. Method and device for training knowledge graph embedded model and computer storage medium
CN111339248A * 2020-02-12 2020-06-26 Ping An Technology (Shenzhen) Co., Ltd. Data attribute filling method, device, equipment and computer readable storage medium
CN112052958A * 2020-09-04 2020-12-08 Jingdong Digital Technology Holdings Co., Ltd. Model training method, device, equipment and computer readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wu Zongyou et al.: "A Survey of Text Mining Research on Electronic Medical Records", Journal of Computer Research and Development (计算机研究与发展) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114386422A * 2022-01-14 2022-04-22 Huai'an Innovation and Entrepreneurship Technology Service Center Intelligent aid decision-making method and device based on enterprise pollution public opinion extraction
CN114386422B * 2022-01-14 2023-09-15 Huai'an Innovation and Entrepreneurship Technology Service Center Intelligent auxiliary decision-making method and device based on enterprise pollution public opinion extraction
CN114580392A * 2022-04-29 2022-06-03 Zhongke Yuchen Technology Co., Ltd. Data processing system for identifying entity
CN114580392B * 2022-04-29 2022-07-29 Zhongke Yuchen Technology Co., Ltd. Data processing system for identifying entity


Similar Documents

Publication Publication Date Title
CN110598206B (en) Text semantic recognition method and device, computer equipment and storage medium
JP7343568B2 (en) Identifying and applying hyperparameters for machine learning
US11544474B2 (en) Generation of text from structured data
US20210295162A1 (en) Neural network model training method and apparatus, computer device, and storage medium
CN108536800B (en) Text classification method, system, computer device and storage medium
US20210232929A1 (en) Neural architecture search
CN109117480B (en) Word prediction method, word prediction device, computer equipment and storage medium
CN108520041B (en) Industry classification method and system of text, computer equipment and storage medium
JP2021532499A (en) Machine learning-based medical data classification methods, devices, computer devices and storage media
CN105069143B (en) Extract the method and device of keyword in document
CN108519971B (en) Cross-language news topic similarity comparison method based on parallel corpus
CN111026544B (en) Node classification method and device for graph network model and terminal equipment
CN113593611A (en) Voice classification network training method and device, computing equipment and storage medium
CN113239697B (en) Entity recognition model training method and device, computer equipment and storage medium
CN112395875A (en) Keyword extraction method, device, terminal and storage medium
CN112052331A (en) Method and terminal for processing text information
CN111160000B Composition automatic scoring method, device, terminal equipment and storage medium
US20210209447A1 (en) Information processing apparatus, control method, and program
WO2022251719A1 (en) Granular neural network architecture search over low-level primitives
CN110309281A (en) Answering method, device, computer equipment and the storage medium of knowledge based map
CN113963205A (en) Classification model training method, device, equipment and medium based on feature fusion
CN113011532A (en) Classification model training method and device, computing equipment and storage medium
CN112307048A (en) Semantic matching model training method, matching device, equipment and storage medium
CN114428838A (en) Content recall method and device, computer equipment and storage medium
CN112528621B (en) Text processing method, text processing model training device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant