CN113673245A - Entity identification method and device, electronic equipment and readable storage medium - Google Patents


Info

Publication number
CN113673245A
Authority
CN
China
Prior art keywords
entity
model
sentence
recognition model
entity recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110802245.6A
Other languages
Chinese (zh)
Inventor
黄江华
胡炎根
江会星
武威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN202110802245.6A
Publication of CN113673245A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The embodiments of the disclosure provide an entity identification method, an entity identification device, an electronic device, and a readable storage medium. The method comprises the following steps: obtaining a pre-trained entity recognition model and the model parameters of the entity recognition model; calling a matrix operation library to rewrite the model structure of the entity recognition model, and assigning the model parameters to the rewritten entity recognition model to generate a target entity recognition model; obtaining a sentence to be recognized; inputting the sentence to be recognized into the target entity recognition model; and calling the target entity recognition model to output the target entity words in the sentence to be recognized and the entity types of those words. The embodiments of the disclosure can ensure a high carrying capacity for the NER service under high-load traffic and improve the efficiency and accuracy of entity identification.

Description

Entity identification method and device, electronic equipment and readable storage medium
Technical Field
Embodiments of the present disclosure relate to the field of entity identification technologies, and in particular, to an entity identification method and apparatus, an electronic device, and a readable storage medium.
Background
NER (Named Entity Recognition) is one of the core fundamental technologies in the field of artificial intelligence; it aims to recognize named entities in unstructured text and classify them into predefined types. Since named entity recognition is one of the lowest-level, foundational modules in many service scenarios, high-load service traffic places high requirements on the throughput, performance, and recognition quality of the NER service.
Currently, a commonly used method for named entity recognition is based on a BERT (Bidirectional Encoder Representations from Transformers) NER model. Because GPU (Graphics Processing Unit) resources are scarce, the BERT-based NER model cannot be computed online at scale in real time, and the complexity of BERT also keeps its inference speed from meeting performance requirements, so the efficiency and accuracy of entity identification are low.
Disclosure of Invention
Embodiments of the present disclosure provide an entity identification method, an entity identification device, an electronic device, and a readable storage medium, so as to ensure a high carrying capacity for the NER service under high-load traffic and improve the efficiency and accuracy of entity identification.
According to a first aspect of embodiments of the present disclosure, there is provided an entity identification method, including:
obtaining a pre-trained entity recognition model and model parameters of the entity recognition model;
calling a matrix operation library to rewrite the model structure of the entity recognition model, and assigning the model parameters to the rewritten entity recognition model to generate a target entity recognition model;
obtaining a sentence to be identified;
inputting the sentence to be recognized into the target entity recognition model;
and calling the target entity recognition model to output the target entity words in the sentence to be recognized and the entity types of the target entity words.
Optionally, before the obtaining the entity recognition model trained in advance and the model parameters of the entity recognition model, the method further includes:
obtaining a sample statement;
preprocessing the sample sentence to generate a model training sentence;
and training an initial entity recognition model based on the model training sentences to obtain the entity recognition model.
Optionally, the preprocessing the sample sentence to generate a model training sentence includes:
and replacing the entity words in the sample sentence according to a preset probability to generate the model training sentence.
Optionally, the initial entity recognition model comprises: a first word vector acquisition layer, a second word vector acquisition layer, a word vector acquisition layer and a transition probability acquisition layer, wherein the model training sentence comprises at least two entity types corresponding to entity words in the training sentence,
training an initial entity recognition model based on the model training sentences to obtain the entity recognition model, including:
inputting the model training sentence into the initial entity recognition model;
calling the first word vector acquisition layer to acquire a first word vector of each word integrated in the model training sentence according to a left-to-right sequence;
calling the second word vector acquisition layer to acquire a second word vector of each word integrated in the model training sentence from right to left;
calling the word vector acquisition layer to acquire the word vector of each word in the model training sentence;
calling the probability transition matrix acquisition layer to process the first word vector, the second word vector and the word vector, acquiring a predicted value of each character and each word for each entity type, and acquiring the transition probability between the at least two entity types according to the predicted values;
calculating to obtain a loss value of the initial entity recognition model based on the transition probability;
and under the condition that the loss value is within a preset range, taking the trained initial entity recognition model as the entity recognition model.
Optionally, the invoking the target entity recognition model to output the target entity word in the sentence to be recognized and the entity type of the target entity word includes:
acquiring a target word vector of each word in the sentence to be recognized;
acquiring a target character vector of each character in the sentence to be recognized;
splicing the target word vector and the target character vector to generate a spliced vector;
inputting the spliced vector into the target entity recognition model;
and calling the target entity recognition model to process the spliced vector to obtain a target entity word contained in the sentence to be recognized and an entity type of the target entity word.
Optionally, the obtaining a target word vector of each word in the sentence to be recognized includes:
acquiring a first word vector of each word integrated in the sentence to be recognized according to a left-to-right sequence;
acquiring a second word vector of each word integrated in the sentence to be recognized according to the sequence from right to left;
and taking the first word vector and the second word vector as the target word vector.
According to a second aspect of embodiments of the present disclosure, there is provided an entity identifying apparatus including:
the entity recognition model acquisition module is used for acquiring a pre-trained entity recognition model and model parameters of the entity recognition model;
the target entity recognition model generation module is used for calling a matrix operation library to rewrite the model structure of the entity recognition model, endowing the model parameters to the rewritten entity recognition model and generating a target entity recognition model;
the sentence to be recognized acquiring module is used for acquiring sentences to be recognized;
the sentence to be recognized input module is used for inputting the sentence to be recognized into the target entity recognition model;
and the entity type acquisition module is used for calling the target entity recognition model to output the target entity words in the sentence to be recognized and the entity types of the target entity words.
Optionally, the apparatus further comprises:
the sample statement acquisition module is used for acquiring sample statements;
the model training sentence generating module is used for preprocessing the sample sentences to generate model training sentences;
and the entity recognition model acquisition module is used for training an initial entity recognition model based on the model training sentences to obtain the entity recognition model.
Optionally, the model training sentence generating module includes:
and the model training sentence generating unit is used for replacing the entity words in the sample sentences according to a preset probability to generate the model training sentences.
Optionally, the initial entity recognition model comprises: a first word vector acquisition layer, a second word vector acquisition layer, a word vector acquisition layer and a transition probability acquisition layer, wherein the model training sentence comprises at least two entity types corresponding to entity words in the training sentence,
the entity recognition model acquisition module includes:
a model training sentence input unit for inputting the model training sentence to the initial entity recognition model;
a first word vector obtaining unit, configured to call the first word vector obtaining layer to obtain a first word vector of each word integrated in the model training sentence according to a left-to-right sequence;
a second word vector obtaining unit, configured to invoke the second word vector obtaining layer to obtain a second word vector of each word integrated in the model training sentence according to a right-to-left order;
the word vector acquisition unit is used for calling the word vector acquisition layer to acquire the word vector of each word in the model training sentence;
a transition probability obtaining unit, configured to invoke the probability transition matrix obtaining layer to process the first word vector, the second word vector, and the word vector, obtain a predicted value of each character and each word for each entity type, and obtain a transition probability between the at least two entity types according to the predicted values;
a loss value calculation unit, configured to calculate a loss value of the initial entity identification model based on the transition probability;
and the entity recognition model obtaining unit is used for taking the trained initial entity recognition model as the entity recognition model under the condition that the loss value is within a preset range.
Optionally, the entity type obtaining module includes:
the target word vector acquiring unit is used for acquiring a target word vector of each word in the sentence to be recognized;
the target character vector acquiring unit is used for acquiring a target character vector of each character in the sentence to be recognized;
the splicing vector generating unit is used for splicing the target word vector and the target character vector to generate a splicing vector;
a splicing vector input unit, configured to input the splicing vector into the target entity recognition model;
and the entity type obtaining unit is used for calling the target entity recognition model to process the spliced vector to obtain the target entity words contained in the sentence to be recognized and the entity types of the target entity words.
Optionally, the target word vector obtaining unit includes:
the first word vector acquiring subunit is used for acquiring a first word vector of each word integrated in the sentence to be recognized according to a left-to-right sequence;
the second word vector acquiring subunit is used for acquiring a second word vector of each word integrated in the sentence to be recognized according to the sequence from right to left;
and the target word vector acquiring subunit is configured to use the first word vector and the second word vector as the target word vectors.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor implementing the entity identification method of any one of the above when executing the program.
According to a fourth aspect of embodiments of the present disclosure, there is provided a readable storage medium, wherein instructions, when executed by a processor of an electronic device, enable the electronic device to perform any one of the entity identification methods described above.
The embodiments of the disclosure provide an entity recognition method, an entity recognition device, an electronic device, and a readable storage medium. According to the embodiments of the disclosure, the recognition quality of NER is improved by adopting a pre-trained entity recognition model, and the inference process is implemented by rewriting the model with a matrix operation library. This solves the problem that the BERT-based NER model cannot be computed online at scale in real time, ensures a high carrying capacity for the NER service under high-load traffic, and improves the efficiency and accuracy of entity recognition.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings needed to be used in the description of the embodiments of the present disclosure will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
Fig. 1 is a flowchart illustrating steps of an entity identification method according to an embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating steps of another entity identification method provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a model processing flow provided by an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an entity identification apparatus according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of another entity identification apparatus according to an embodiment of the present disclosure.
Detailed Description
Technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present disclosure, belong to the protection scope of the embodiments of the present disclosure.
Example one
Referring to fig. 1, a flowchart illustrating steps of an entity identification method provided by an embodiment of the present disclosure is shown, and as shown in fig. 1, the entity identification method may include the following steps:
step 101: and obtaining a pre-trained entity recognition model and model parameters of the entity recognition model.
The embodiment of the disclosure can be applied to a scene of rewriting the entity identification model and carrying out entity identification by combining the matrix operation library.
The entity recognition model refers to a model trained in advance for recognizing an entity and an entity type of the entity in a sentence, and a training process of the entity recognition model will be described in detail in the following second embodiment, which is not described herein again.
Model parameters refer to parameters that are applied during the model reasoning process.
When the entity recognition is needed, a pre-trained entity recognition model and model parameters of the entity recognition model can be obtained.
After the entity recognition model trained in advance and the model parameters of the entity recognition model are acquired, step 102 is executed.
Step 102: and calling a matrix operation library to rewrite the model structure of the entity recognition model, endowing the rewritten entity recognition model with the model parameters, and generating the target entity recognition model.
In this example, the matrix operation library is Eigen, a C++ template library for linear algebra that can be used for operations on matrices and vectors and for numerical solvers.
After the entity recognition model trained in advance and the model parameters of the entity recognition model are obtained, a matrix operation library can be called to rewrite the model structure of the entity recognition model, and the model parameters are endowed to the rewritten entity recognition model, so that the target entity recognition model is obtained.
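The patent performs this rewrite with Eigen in C++. As an illustrative sketch only, the same idea — exporting the trained parameters and re-implementing the forward pass with bare matrix operations so that inference no longer depends on the training framework — can be shown in Python. All shapes, names, and values below are hypothetical, not the patent's actual model.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def vadd(a, b):
    return [x + y for x, y in zip(a, b)]

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

class RewrittenTagger:
    """Target model: the exported parameters are assigned to a
    hand-rolled structure that mirrors the trained network."""
    def __init__(self, params):
        self.W = params["W"]  # exported weight matrix
        self.b = params["b"]  # exported bias vector

    def predict(self, x):
        # One linear layer followed by softmax, written with
        # plain matrix operations only.
        return softmax(vadd(matvec(self.W, x), self.b))

# Hypothetical exported parameters: 2 labels, 3 input features.
params = {"W": [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]], "b": [0.1, -0.1]}
model = RewrittenTagger(params)
probs = model.predict([1.0, 2.0, 3.0])
```

In the patent's setting the rewrite targets C++/Eigen for throughput, but the structure of the sketch — export parameters, assign them to a rewritten forward pass — is the same.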
After the target entity recognition model is obtained, step 103 is performed.
Step 103: and acquiring a sentence to be recognized.
The sentence to be recognized refers to a sentence on which entity and entity-type recognition is to be performed.
In this example, the sentence to be recognized may be a sentence input by a user on the service platform, or may also be a sentence input by a service person, specifically, the obtaining manner of the sentence to be recognized may be determined according to a service requirement, which is not limited in this embodiment.
And after the model structure of the entity recognition model is rewritten by calling the matrix operation library, the model parameters are endowed to the rewritten entity recognition model, and the target entity recognition model is generated, the sentence to be recognized can be obtained.
After the statement to be recognized is acquired, step 104 is executed.
Step 104: and inputting the sentence to be recognized into the target entity recognition model.
After the sentence to be recognized is acquired, the sentence to be recognized can be input into the target entity recognition model, so that the sentence to be recognized can be processed through the target entity recognition model.
After the sentence to be recognized is input to the target entity recognition model, step 105 is performed.
Step 105: and calling the target entity recognition model to output the target entity words in the sentence to be recognized and the entity types of the target entity words.
After the sentence to be recognized is input to the target entity recognition model, the target entity recognition model may be called to process the sentence to be recognized, so as to obtain the target entity word and the entity type of the target entity word in the sentence to be recognized output by the target entity recognition model, and specifically, the process of the sentence to be recognized by the target entity recognition model is described in detail in the following embodiment two.
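The output of step 105 — entity words plus their entity types — is commonly produced by decoding a per-character tag sequence into spans. The patent does not spell out its tagging scheme, so the sketch below assumes hypothetical BIO-style tags purely for illustration.

```python
# Decode per-character BIO tags into (entity word, entity type) pairs.
# Tags like "B-FOOD"/"I-FOOD" are assumed, not taken from the patent.
def decode_entities(chars, labels):
    entities, cur, cur_type = [], [], None
    for ch, lab in zip(chars, labels):
        if lab.startswith("B-"):
            if cur:  # close any open entity before starting a new one
                entities.append(("".join(cur), cur_type))
            cur, cur_type = [ch], lab[2:]
        elif lab.startswith("I-") and cur and lab[2:] == cur_type:
            cur.append(ch)  # continue the current entity
        else:
            if cur:
                entities.append(("".join(cur), cur_type))
            cur, cur_type = [], None
    if cur:
        entities.append(("".join(cur), cur_type))
    return entities

chars = list("我想吃火锅")
labels = ["O", "O", "O", "B-FOOD", "I-FOOD"]
print(decode_entities(chars, labels))  # [('火锅', 'FOOD')]
```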
The embodiments of the disclosure improve the recognition quality of NER by adopting a pre-trained entity recognition model and implement the inference process by rewriting the model with a matrix operation library, which solves the problem that the BERT-based NER model cannot be computed online at scale in real time.
The entity recognition method provided by the embodiment of the disclosure includes the steps of obtaining a pre-trained entity recognition model and model parameters of the entity recognition model, calling a matrix operation library to rewrite a model structure of the entity recognition model, endowing the model parameters with the rewritten entity recognition model, generating a target entity recognition model, obtaining a sentence to be recognized, inputting the sentence to be recognized into the target entity recognition model, and calling the target entity recognition model to output a target entity word in the sentence to be recognized and an entity type of the target entity word. According to the embodiment of the disclosure, the recognition effect of the NER is improved by adopting the pre-trained entity recognition model, and the recognition inference process is realized based on the matrix operation library rewriting model, so that the problem that the BERT-based NER model cannot be calculated on line on a large scale in real time can be solved, the high bearing performance of the NER service can be ensured under the high-load flow, and the efficiency and the accuracy of entity recognition are improved.
Example two
Referring to fig. 2, a flowchart illustrating steps of another entity identification method provided by an embodiment of the present disclosure is shown, and as shown in fig. 2, the entity identification method may include the following steps:
step 201: a sample statement is obtained.
The embodiment of the disclosure can be applied to a scene of rewriting the entity identification model and carrying out entity identification by combining the matrix operation library.
The sample sentence is a sentence used for training to obtain the entity recognition model.
When the entity recognition model needs to be trained, sample sentences used for training the entity recognition model can be obtained.
After the sample statement is obtained, step 202 is performed.
Step 202: and preprocessing the sample sentence to generate a model training sentence.
After the sample sentence is obtained, it may be preprocessed to generate a model training sentence. The preprocessing may replace entity words in the sample sentence according to preset probabilities: a character or word in the sample sentence is randomly selected with probability P (15%); a selected entity word is replaced with a special mark symbol with probability 80%, is randomly replaced with another entity word with probability 10%, and remains unchanged with probability 10%, and the model is then trained to predict the actual entity word at that position.
After the pre-processing, the pre-processed sentence and the original sample sentence can be used together as a model training sentence.
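A minimal sketch of the replacement scheme described above, using the stated probabilities (select 15%; then mask 80% / random other entity 10% / unchanged 10%). The mask symbol, the tokens, and the replacement vocabulary are hypothetical placeholders.

```python
import random

MASK = "<MASK>"  # hypothetical special mark symbol

def corrupt(tokens, vocab, rng, p_select=0.15):
    """Replace each token according to the preset probabilities."""
    out = []
    for tok in tokens:
        if rng.random() < p_select:            # select with probability P (15%)
            r = rng.random()
            if r < 0.8:
                out.append(MASK)               # 80%: special mark symbol
            elif r < 0.9:
                out.append(rng.choice(vocab))  # 10%: random other entity
            else:
                out.append(tok)                # 10%: unchanged, still predicted
        else:
            out.append(tok)
    return out

sample = ["我", "想", "吃", "火", "锅"]
train_sent = corrupt(sample, vocab=["菜", "面"], rng=random.Random(42))
```

Both the corrupted sentence and the original sample sentence can then serve as model training sentences, as described above.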
After preprocessing the sample statements to generate model training statements, step 203 is performed.
Step 203: and training an initial entity recognition model based on the model training sentences to obtain the entity recognition model.
The initial entity recognition model refers to a built model which needs to be trained and is used for recognizing entity words in sentences and entity types of the entity words.
After obtaining the model training sentences, the initial entity recognition model may be trained based on the model training sentences to obtain the entity recognition model, and specifically, the training process may be described in detail in conjunction with the following specific implementation manner.
In a specific implementation manner of the present disclosure, the initial entity recognition model includes: a first word vector obtaining layer, a second word vector obtaining layer, a word vector obtaining layer, and a transition probability obtaining layer, where the model training sentence includes at least two entity types corresponding to entity words in the training sentence, where step 203 may include:
substep A1: inputting the model training sentence to the initial entity recognition model.
In this embodiment, the initial entity recognition model may include: a first word vector acquisition layer, a second word vector acquisition layer, a word vector acquisition layer and a transition probability acquisition layer, where the first word vector acquisition layer is a left-bigram layer, the second word vector acquisition layer is a right-bigram layer, the word vector acquisition layer is a unigram layer, and the transition probability acquisition layer is a softmax layer, as shown in fig. 3.
After the model training sentence is obtained, the model training sentence may be input to the initial entity recognition model, as shown in fig. 3, the model training sentence is "i want to eat a hot pot", and after the model training sentence is obtained, the model training sentence may be input to the initial entity recognition model.
After the model training sentences are input to the initial entity recognition model, substep A2, substep A3, and substep A4 are performed.
Substep A2: and calling the first word vector acquisition layer to acquire a first word vector of each word integrated in the model training sentence according to the sequence from left to right.
After the model training sentence is input into the initial entity recognition model, the first word vector acquisition layer may be invoked to acquire a first word vector for each bigram formed by combining adjacent characters of the model training sentence from left to right. As shown in fig. 3, the bigrams obtained from the model training sentence "i want to eat a hot pot" in left-to-right order are: "<B> me", "I want", "want to eat", "eat fire", and "hot pot"; the word vectors of these bigrams, i.e., the first word vectors, can be obtained from the first word vector acquisition layer.
Substep A3: and calling the second word vector acquisition layer to acquire a second word vector of each word integrated in the model training sentence from right to left.
After the model training sentence is input into the initial entity recognition model, the second word vector acquisition layer may be invoked to acquire a second word vector for each bigram formed by combining adjacent characters of the model training sentence from right to left. As shown in fig. 3, the bigrams obtained from the model training sentence "i want to eat a hot pot" in right-to-left order are: "I want", "want to eat", "eat fire", "hot pot", and "pot <B>"; the word vectors of these bigrams, i.e., the second word vectors, can be obtained from the second word vector acquisition layer.
Substep A4: and calling the word vector acquisition layer to acquire the word vector of each word in the model training sentence.
After the model training sentence is input into the initial entity recognition model, the word vector acquisition layer can be called to acquire a word vector for each character in the model training sentence. As shown in fig. 3, for the model training sentence "i want to eat a hot pot", the single characters it contains are "I", "want", "eat", "fire" and "pot", and the unigram layer is called to obtain the vector corresponding to each character.
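The three feature streams above — left bigrams padded with a begin marker, right bigrams padded at the end, and per-character unigrams — can be sketched as follows for "我想吃火锅" ("I want to eat hot pot"). The `<B>` boundary symbol follows the figure description; the function names are illustrative.

```python
# Build the unigram and bigram token streams that feed the three
# embedding layers; <B> marks the sentence boundary.
def unigrams(chars):
    return list(chars)

def left_bigrams(chars):
    padded = ["<B>"] + list(chars)
    return [padded[i] + padded[i + 1] for i in range(len(chars))]

def right_bigrams(chars):
    padded = list(chars) + ["<B>"]
    return [padded[i] + padded[i + 1] for i in range(len(chars))]

sent = "我想吃火锅"
print(unigrams(sent))       # ['我', '想', '吃', '火', '锅']
print(left_bigrams(sent))   # ['<B>我', '我想', '想吃', '吃火', '火锅']
print(right_bigrams(sent))  # ['我想', '想吃', '吃火', '火锅', '锅<B>']
```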
After the first word vector, the second word vector and the word vector are obtained, sub-step a5 is performed.
Substep A5: And calling the probability transition matrix acquisition layer to process the first word vector, the second word vector and the word vector, acquiring a predicted value of each character and each word for each entity type, and acquiring the transition probability between the at least two entity types according to the predicted values.
After the first word vector, the second word vector and the word vector are obtained, the probability transition matrix acquisition layer can be called to process them, so as to obtain a predicted value of each character and each word for each entity type, and the transition probability between the at least two entity types is obtained from the predicted values. Sequence labeling models based on CNNs and LSTMs usually attach a CRF on top of the sequence labeling layer to model the joint probability distribution of the labels in the whole sentence; however, adding a CRF to the NN model makes inference more time-consuming, while removing the CRF layer easily produces invalid label transitions (label skips). To address this problem, this embodiment corrects the label prediction results by learning the probability transition matrix between labels and applying it as follows:
s(X, y_i) = Logit(X, y_i) + λ · A(y_{i-1}, y_i)    (1)
[Formula (2) appears only as an image in the source text and is not reproduced here.]
In formula (1), X is a character, y_i is the label of the i-th character (i.e., its entity type), Logit(X, y_i) is the predicted score of the character for label y_i, A(y_{i-1}, y_i) is the transition probability from y_{i-1} to y_i, λ is a hyper-parameter balancing the two terms, and s(X, y_i) is the corrected score. By explicitly fusing the probability transition matrix in this way, the label skip problem is well relieved.
For the label transition probability matrix, this embodiment proposes two calculation methods: 1. directly counting the transition probabilities between all labels from the training data; 2. training the matrix parameters jointly with the other model parameters.
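A hedged sketch of calculation method 1 (counting transition probabilities from the training data) together with the correction of formula (1). The label names, the λ value, and the logit are made up for illustration.

```python
from collections import defaultdict

def count_transition_matrix(label_sequences):
    """Estimate A(y_prev, y) as a relative frequency over the training labels."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in label_sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    A = {}
    for prev, row in counts.items():
        total = sum(row.values())
        A[prev] = {cur: c / total for cur, c in row.items()}
    return A

def corrected_score(logit, A, y_prev, y, lam=0.5):
    """s(X, y_i) = Logit(X, y_i) + lambda * A(y_{i-1}, y_i)  -- formula (1)."""
    return logit + lam * A.get(y_prev, {}).get(y, 0.0)

train_labels = [["B-FOOD", "I-FOOD", "O"], ["B-FOOD", "I-FOOD", "I-FOOD"]]
A = count_transition_matrix(train_labels)
print(A["B-FOOD"])  # every observed B-FOOD transition goes to I-FOOD
print(corrected_score(1.2, A, "B-FOOD", "I-FOOD"))
```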
After obtaining the transition probabilities between at least two entity types, sub-step a6 is performed.
Substep A6: and calculating to obtain a loss value of the initial entity recognition model based on the transition probability.
After the transition probabilities between the at least two entity types are obtained, a loss value of the initial entity recognition model may be calculated based on the transition probabilities.
Substep A7: and under the condition that the loss value is within a preset range, taking the trained initial entity recognition model as the entity recognition model.
After the loss value is calculated, whether the loss value is within a preset range can be judged. If the loss value is within the preset range, the trained initial entity recognition model can be used as the final entity recognition model for recognizing entities and entity types. If the loss value is not within the preset range, the initial entity recognition model is trained with more model training sentences until the loss value falls within the preset range.
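The stopping condition of substeps A6 and A7 can be sketched as follows; `model_step` is a hypothetical callable (not named in the patent) that trains on a batch of model training sentences and returns the current loss value:

```python
def train_until_converged(model_step, sentences, loss_range=(0.0, 0.05), max_rounds=100):
    """Keep training until the loss value falls within the preset range
    (substep A7), with a round cap as a safety valve."""
    loss = float("inf")
    rounds = 0
    while not (loss_range[0] <= loss <= loss_range[1]) and rounds < max_rounds:
        loss = model_step(sentences)  # one training pass; returns current loss
        rounds += 1
    return loss, rounds
```

The preset range itself is a tuning choice; the patent only requires that training continue until the loss lands inside it.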
Step 204: and obtaining a pre-trained entity recognition model and model parameters of the entity recognition model.
Model parameters refer to the parameters that are applied during model inference.
After the entity recognition model is obtained through the training in the above steps, when the entity recognition is required, the entity recognition model trained in advance and the model parameters of the entity recognition model can be obtained.
Step 205: and calling a matrix operation library to rewrite the model structure of the entity recognition model, endowing the rewritten entity recognition model with the model parameters, and generating the target entity recognition model.
In this embodiment, the matrix operation library is Eigen, a C++ template library for linear algebra, which can be used to perform operations on matrices and vectors and to compute numerical solutions.
After the entity recognition model trained in advance and the model parameters of the entity recognition model are obtained, a matrix operation library can be called to rewrite the model structure of the entity recognition model, and the model parameters are endowed to the rewritten entity recognition model, so that the target entity recognition model is obtained.
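In the patent the rewrite is done with Eigen in C++; the idea — re-implementing the model structure as bare matrix operations, outside the training framework, and assigning the exported trained parameters to it — can be sketched in NumPy. The single linear scoring layer below is a simplifying assumption for illustration:

```python
import numpy as np

class RewrittenTagger:
    """Minimal sketch of the 'rewritten' inference model: the structure is
    re-implemented with plain matrix operations (Eigen in the patent; NumPy
    here) and the trained parameters are assigned to it at construction."""
    def __init__(self, W, b):
        self.W = np.asarray(W)  # (hidden_dim, num_labels), exported from the trained model
        self.b = np.asarray(b)  # (num_labels,)

    def logits(self, H):
        # H: (seq_len, hidden_dim) spliced vectors; one matmul per sentence
        return H @ self.W + self.b
```

Because inference reduces to a few dense matrix products, it avoids the overhead of running the full training framework online, which is the point of step 205.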
After the target entity recognition model is obtained, step 206 is performed.
Step 206: and acquiring a sentence to be recognized.
The sentence to be recognized refers to a sentence on which entity and entity type recognition is to be performed.
In this embodiment, the sentence to be recognized may be a sentence input by a user on the service platform, or a sentence input by service personnel. The specific manner of obtaining the sentence to be recognized may be determined according to service requirements, which is not limited in this embodiment.
After the matrix operation library is called to rewrite the model structure of the entity recognition model, the model parameters are assigned to the rewritten entity recognition model and the target entity recognition model is generated, the sentence to be recognized can be obtained.
After the statement to be recognized is acquired, step 207 and step 208 are executed.
Step 207: and acquiring a target word vector of each word in the sentence to be recognized.
Step 208: and acquiring a target character vector of each character in the sentence to be recognized.
After the sentence to be recognized is input to the target entity recognition model, a target word vector of each word in the sentence to be recognized and a target character vector of each character in the sentence to be recognized may be acquired.
The target character vector may be obtained as follows: a first character vector that integrates each character in the sentence to be recognized in left-to-right order and a second character vector that integrates each character in right-to-left order are acquired, and the first character vector and the second character vector form the target character vector.
After the target word vector and the target character vector are obtained, step 209 is performed.
Step 209: and splicing the target word vector and the target character vector to generate a spliced vector.
After the target word vector and the target character vector are obtained, they may be spliced to generate a spliced vector, and then step 210 is performed.
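The splicing of step 209 is a plain vector concatenation; a minimal sketch with hypothetical dimensions (the actual vector sizes depend on the trained model):

```python
import numpy as np

# Hypothetical dimensions: a 2-dim and a 3-dim target vector from steps 207/208.
vec_a = np.array([0.1, 0.2])
vec_b = np.array([0.3, 0.4, 0.5])

# Step 209: splice the two representations into one input vector for the model.
spliced = np.concatenate([vec_a, vec_b])
```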
Step 210: inputting the spliced vector into the target entity recognition model.
After the spliced vector is generated, it may be input into the target entity recognition model, and step 211 is performed.
Step 211: and calling the target entity recognition model to process the spliced vector to obtain a target entity word contained in the sentence to be recognized and an entity type of the target entity word.
After the splicing vector is input into the target entity recognition model, the target entity recognition model can be called to process the splicing vector so as to obtain a target entity word contained in the sentence to be recognized and an entity type of the target entity word.
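The final step maps the per-character label predictions back to entity words and entity types. A sketch assuming a BIO label scheme — the concrete tagging scheme is an assumption, as the patent does not fix one:

```python
def decode_entities(chars, labels):
    """Turn a predicted BIO label sequence into (entity_word, entity_type)
    pairs, i.e. the output of step 211."""
    entities, word, etype = [], "", None
    for ch, lab in zip(chars, labels):
        if lab.startswith("B-"):          # beginning of a new entity
            if word:
                entities.append((word, etype))
            word, etype = ch, lab[2:]
        elif lab.startswith("I-") and word and lab[2:] == etype:
            word += ch                    # continuation of the current entity
        else:                             # "O" or an inconsistent label
            if word:
                entities.append((word, etype))
            word, etype = "", None
    if word:
        entities.append((word, etype))
    return entities
```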
The entity recognition method provided by the embodiment of the disclosure obtains a pre-trained entity recognition model and model parameters of the entity recognition model, calls a matrix operation library to rewrite the model structure of the entity recognition model, assigns the model parameters to the rewritten entity recognition model to generate a target entity recognition model, obtains a sentence to be recognized, inputs the sentence to be recognized into the target entity recognition model, and calls the target entity recognition model to output the target entity words in the sentence to be recognized and the entity types of the target entity words. According to the embodiment of the disclosure, the recognition effect of NER is improved by adopting the pre-trained entity recognition model, and the inference process is re-implemented based on the matrix operation library, so that the problem that a BERT-based NER model cannot be computed online in real time at large scale can be solved, the high load-bearing capability of the NER service can be ensured under high traffic, and the efficiency and accuracy of entity recognition are improved.
EXAMPLE III
Referring to fig. 4, which illustrates a schematic structural diagram of an entity identification apparatus provided in an embodiment of the present disclosure, as shown in fig. 4, the entity identification apparatus 300 may include the following modules:
an entity recognition model obtaining module 310, configured to obtain a pre-trained entity recognition model and model parameters of the entity recognition model;
a target entity recognition model generation module 320, configured to invoke a matrix operation library to rewrite a model structure of the entity recognition model, and assign the model parameters to the rewritten entity recognition model to generate a target entity recognition model;
a sentence to be recognized obtaining module 330, configured to obtain a sentence to be recognized;
a sentence to be recognized input module 340, configured to input the sentence to be recognized to the target entity recognition model;
and an entity type obtaining module 350, configured to invoke the target entity recognition model to output the target entity word in the sentence to be recognized and the entity type of the target entity word.
The entity recognition device provided by the embodiment of the disclosure calls a matrix operation base to rewrite a model structure of an entity recognition model by obtaining a pre-trained entity recognition model and model parameters of the entity recognition model, gives the model parameters to the rewritten entity recognition model, generates a target entity recognition model, obtains a sentence to be recognized, inputs the sentence to be recognized to the target entity recognition model, and calls the target entity recognition model to output a target entity word in the sentence to be recognized and an entity type of the target entity word. According to the embodiment of the disclosure, the recognition effect of the NER is improved by adopting the pre-trained entity recognition model, and the recognition inference process is realized based on the matrix operation library rewriting model, so that the problem that the BERT-based NER model cannot be calculated on line on a large scale in real time can be solved, the high bearing performance of the NER service can be ensured under the high-load flow, and the efficiency and the accuracy of entity recognition are improved.
Example four
Referring to fig. 5, which shows a schematic structural diagram of another entity identification apparatus provided in an embodiment of the present disclosure, as shown in fig. 5, the entity identification apparatus 400 may include the following modules:
a sample sentence obtaining module 410, configured to obtain a sample sentence;
a model training sentence generating module 420, configured to preprocess the sample sentence to generate a model training sentence;
an entity recognition model obtaining module 430, configured to train an initial entity recognition model based on the model training sentence, so as to obtain the entity recognition model;
an entity recognition model obtaining module 440, configured to obtain a pre-trained entity recognition model and model parameters of the entity recognition model;
a target entity recognition model generation module 450, configured to invoke a matrix operation library to rewrite a model structure of the entity recognition model, and assign the model parameters to the rewritten entity recognition model to generate a target entity recognition model;
a sentence to be recognized obtaining module 460, configured to obtain a sentence to be recognized;
a sentence to be recognized input module 470, configured to input the sentence to be recognized to the target entity recognition model;
and an entity type obtaining module 480, configured to invoke the target entity recognition model to output the target entity word in the sentence to be recognized and the entity type of the target entity word.
Optionally, the model training sentence generating module 420 includes:
and the model training sentence generating unit is used for replacing the entity words in the sample sentences according to a preset probability to generate the model training sentences.
Optionally, the initial entity recognition model comprises: a first word vector acquisition layer, a second word vector acquisition layer, a word vector acquisition layer and a transition probability acquisition layer, wherein the model training sentence comprises at least two entity types corresponding to entity words in the training sentence,
the entity recognition model obtaining module 440 includes:
a model training sentence input unit for inputting the model training sentence to the initial entity recognition model;
a first word vector obtaining unit, configured to call the first word vector obtaining layer to obtain a first word vector of each word integrated in the model training sentence according to a left-to-right sequence;
a second word vector obtaining unit, configured to invoke the second word vector obtaining layer to obtain a second word vector of each word integrated in the model training sentence according to a right-to-left order;
the word vector acquisition unit is used for calling the word vector acquisition layer to acquire the word vector of each word in the model training sentence;
a transition probability obtaining unit, configured to invoke the probability transition matrix acquisition layer to process the first word vector, the second word vector and the word vector, obtain a predicted value of each character and each word on each entity type, and obtain the transition probabilities between the at least two entity types according to the predicted values;
a loss value calculation unit, configured to calculate a loss value of the initial entity recognition model based on the transition probabilities;
and the entity recognition model obtaining unit is used for taking the trained initial entity recognition model as the entity recognition model under the condition that the loss value is within a preset range.
Optionally, the entity type obtaining module 480 includes:
a target word vector obtaining unit 481, configured to obtain a target word vector of each word in the sentence to be recognized;
a target character vector obtaining unit 482, configured to obtain a target character vector of each character in the sentence to be recognized;
a spliced vector generating unit 483, configured to splice the target word vector and the target character vector to generate a spliced vector;
a spliced vector input unit 484, configured to input the spliced vector into the target entity recognition model;
and an entity type obtaining unit 485, configured to call the target entity recognition model to process the spliced vector, so as to obtain the target entity words contained in the sentence to be recognized and the entity types of the target entity words.
Optionally, the target character vector obtaining unit 482 includes:
a first character vector acquiring subunit, configured to acquire a first character vector that integrates each character in the sentence to be recognized in left-to-right order;
a second character vector acquiring subunit, configured to acquire a second character vector that integrates each character in the sentence to be recognized in right-to-left order;
and a target character vector acquiring subunit, configured to take the first character vector and the second character vector as the target character vector.
The entity recognition device provided by the embodiment of the disclosure calls a matrix operation base to rewrite a model structure of an entity recognition model by obtaining a pre-trained entity recognition model and model parameters of the entity recognition model, gives the model parameters to the rewritten entity recognition model, generates a target entity recognition model, obtains a sentence to be recognized, inputs the sentence to be recognized to the target entity recognition model, and calls the target entity recognition model to output a target entity word in the sentence to be recognized and an entity type of the target entity word. According to the embodiment of the disclosure, the recognition effect of the NER is improved by adopting the pre-trained entity recognition model, and the recognition inference process is realized based on the matrix operation library rewriting model, so that the problem that the BERT-based NER model cannot be calculated on line on a large scale in real time can be solved, the high bearing performance of the NER service can be ensured under the high-load flow, and the efficiency and the accuracy of entity recognition are improved.
An embodiment of the present disclosure also provides an electronic device, including: a processor, a memory and a computer program stored on the memory and executable on the processor, the processor implementing the entity identification method of the foregoing embodiments when executing the program.
Embodiments of the present disclosure also provide a readable storage medium, in which instructions, when executed by a processor of an electronic device, enable the electronic device to perform the entity identification method of the foregoing embodiments.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. In addition, embodiments of the present disclosure are not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the embodiments of the present disclosure as described herein, and any descriptions of specific languages are provided above to disclose the best modes of the embodiments of the present disclosure.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the disclosure may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the disclosure, various features of the embodiments of the disclosure are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that is, claimed embodiments of the disclosure require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of an embodiment of this disclosure.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
The various component embodiments of the disclosure may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. It will be understood by those skilled in the art that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in an entity identification device according to an embodiment of the present disclosure. Embodiments of the present disclosure may also be implemented as an apparatus or device program for performing a portion or all of the methods described herein. Such programs implementing embodiments of the present disclosure may be stored on a computer readable medium or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit embodiments of the disclosure, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. Embodiments of the disclosure may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera do not indicate any ordering. These words may be interpreted as names.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The above description is only for the purpose of illustrating the preferred embodiments of the present disclosure and is not to be construed as limiting the embodiments of the present disclosure, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the embodiments of the present disclosure are intended to be included within the scope of the embodiments of the present disclosure.
The above description is only a specific implementation of the embodiments of the present disclosure, but the scope of the embodiments of the present disclosure is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the embodiments of the present disclosure, and all the changes or substitutions should be covered by the scope of the embodiments of the present disclosure. Therefore, the protection scope of the embodiments of the present disclosure shall be subject to the protection scope of the claims.

Claims (10)

1. An entity identification method, comprising:
obtaining a pre-trained entity recognition model and model parameters of the entity recognition model;
calling a matrix operation library to rewrite the model structure of the entity identification model, and endowing the rewritten entity identification model with the model parameters to generate a target entity identification model;
obtaining a sentence to be identified;
inputting the sentence to be recognized into the target entity recognition model;
and calling the target entity recognition model to output the target entity words in the sentence to be recognized and the entity types of the target entity words.
2. The method of claim 1, wherein before the obtaining the pre-trained entity recognition model and the model parameters of the entity recognition model, further comprising:
obtaining a sample statement;
preprocessing the sample sentence to generate a model training sentence;
and training an initial entity recognition model based on the model training sentences to obtain the entity recognition model.
3. The method of claim 2, wherein preprocessing the sample sentence to generate a model training sentence comprises:
and replacing the entity words in the sample sentence according to a preset probability to generate the model training sentence.
4. The method of claim 2, wherein the initial entity recognition model comprises: a first word vector acquisition layer, a second word vector acquisition layer, a word vector acquisition layer and a transition probability acquisition layer, wherein the model training sentence comprises at least two entity types corresponding to entity words in the training sentence,
training an initial entity recognition model based on the model training sentences to obtain the entity recognition model, including:
inputting the model training sentence into the initial entity recognition model;
calling the first word vector acquisition layer to acquire a first word vector of each word integrated in the model training sentence according to a left-to-right sequence;
calling the second word vector acquisition layer to acquire a second word vector of each word integrated in the model training sentence from right to left;
calling the word vector acquisition layer to acquire the word vector of each word in the model training sentence;
calling the probability transition matrix acquisition layer to process the first word vector, the second word vector and the word vector, acquiring a predicted value of each character and each word on each entity type, and acquiring the transition probability between the at least two entity types according to the predicted values;
calculating to obtain a loss value of the initial entity recognition model based on the transition probability;
and under the condition that the loss value is within a preset range, taking the trained initial entity recognition model as the entity recognition model.
5. The method of claim 1, wherein the invoking the target entity recognition model to output the target entity word and the entity type of the target entity word in the sentence to be recognized comprises:
acquiring a target word vector of each word in the sentence to be recognized;
acquiring a target character vector of each character in the sentence to be recognized;
splicing the target word vector and the target character vector to generate a spliced vector;
inputting the spliced vector into the target entity recognition model;
and calling the target entity recognition model to process the spliced vector to obtain a target entity word contained in the sentence to be recognized and an entity type of the target entity word.
6. The method of claim 5, wherein the acquiring a target character vector of each character in the sentence to be recognized comprises:
acquiring a first character vector that integrates each character in the sentence to be recognized in left-to-right order;
acquiring a second character vector that integrates each character in the sentence to be recognized in right-to-left order;
and taking the first character vector and the second character vector as the target character vector.
7. An entity identification apparatus, comprising:
the entity recognition model acquisition module is used for acquiring a pre-trained entity recognition model and model parameters of the entity recognition model;
the target entity recognition model generation module is used for calling a matrix operation library to rewrite the model structure of the entity recognition model, endowing the model parameters to the rewritten entity recognition model and generating a target entity recognition model;
the sentence to be recognized acquiring module is used for acquiring sentences to be recognized;
the sentence to be recognized input module is used for inputting the sentence to be recognized into the target entity recognition model;
and the entity type acquisition module is used for calling the target entity recognition model to output the target entity words in the sentence to be recognized and the entity types of the target entity words.
8. The apparatus of claim 7, further comprising:
the sample statement acquisition module is used for acquiring sample statements;
the model training sentence generating module is used for preprocessing the sample sentences to generate model training sentences;
and the entity recognition model acquisition module is used for training an initial entity recognition model based on the model training sentences to obtain the entity recognition model.
9. An electronic device, comprising:
a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor implementing the entity identification method of any one of claims 1 to 6 when executing the program.
10. A readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the entity identification method of any of claims 1 to 6.
CN202110802245.6A 2021-07-15 2021-07-15 Entity identification method and device, electronic equipment and readable storage medium Pending CN113673245A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110802245.6A CN113673245A (en) 2021-07-15 2021-07-15 Entity identification method and device, electronic equipment and readable storage medium


Publications (1)

Publication Number Publication Date
CN113673245A true CN113673245A (en) 2021-11-19

Family

ID=78539354


Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108573306A (en) * 2017-03-10 2018-09-25 北京搜狗科技发展有限公司 Export method, the training method and device of deep learning model of return information
CN109857845A (en) * 2019-01-03 2019-06-07 北京奇艺世纪科技有限公司 Model training and data retrieval method, device, terminal and computer readable storage medium
CN110298019A (en) * 2019-05-20 2019-10-01 平安科技(深圳)有限公司 Name entity recognition method, device, equipment and computer readable storage medium
CN111104911A (en) * 2019-12-20 2020-05-05 湖南千视通信息科技有限公司 Pedestrian re-identification method and device based on big data training
CN111192091A (en) * 2019-12-30 2020-05-22 烟台中科网络技术研究所 Method for identifying communication enterprise group client members, storage medium and computer equipment
CN111506705A (en) * 2020-04-13 2020-08-07 北京奇艺世纪科技有限公司 Information query method and device and electronic equipment
CN111832303A (en) * 2019-04-12 2020-10-27 普天信息技术有限公司 Named entity identification method and device
CN111859964A (en) * 2019-04-29 2020-10-30 普天信息技术有限公司 Method and device for identifying named entities in sentences
CN111881681A (en) * 2020-06-16 2020-11-03 北京三快在线科技有限公司 Entity sample obtaining method and device and electronic equipment
CN112818691A (en) * 2021-02-01 2021-05-18 北京金山数字娱乐科技有限公司 Named entity recognition model training method and device


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
霍振朗: "《基于深度学习的命名实体识别研究》", 《中国优秀硕士学位论文全文数据库》 *

Similar Documents

Publication Publication Date Title
US11699275B2 (en) Method and system for visio-linguistic understanding using contextual language model reasoners
CN111414916B (en) Method and device for extracting and generating text content in image and readable storage medium
CN112199473A (en) Multi-turn dialogue method and device in knowledge question-answering system
CN114822812A (en) Character dialogue simulation method, device, equipment and storage medium
JP2022145623A (en) Method and device for presenting hint information and computer program
CN112069799A (en) Dependency syntax based data enhancement method, apparatus and readable storage medium
CN112420205A (en) Entity recognition model generation method and device and computer readable storage medium
CN114241524A (en) Human body posture estimation method and device, electronic equipment and readable storage medium
CN113780365A (en) Sample generation method and device
CN112735564A (en) Mental health state prediction method, mental health state prediction apparatus, mental health state prediction medium, and computer program product
CN112052681A (en) Information extraction model training method, information extraction device and electronic equipment
CN111475635A (en) Semantic completion method and device and electronic equipment
WO2023159945A1 (en) Multi-modal model training method and apparatus, image recognition method and apparatus, and electronic device
CN114792097B (en) Method and device for determining prompt vector of pre-training model and electronic equipment
CN113673245A (en) Entity identification method and device, electronic equipment and readable storage medium
CN114970666B (en) Spoken language processing method and device, electronic equipment and storage medium
CN113836297B (en) Training method and device for text emotion analysis model
CN116049597A (en) Pre-training method and device for multi-task model of webpage and electronic equipment
CN113688232B (en) Method and device for classifying bid-inviting text, storage medium and terminal
CN113392249A (en) Image-text information classification method, image-text classification model training method, medium, and apparatus
CN110598028B (en) Image classification method and device, storage medium and electronic equipment
CN110851600A (en) Text data processing method and device based on deep learning
CN113782001B (en) Specific field voice recognition method and device, electronic equipment and storage medium
CN114564562B (en) Question generation method, device, equipment and storage medium based on answer guidance
US20240220730A1 (en) Text data processing method, neural-network training method, and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 2021-11-19