CN111651992A - Named entity labeling method and device, computer equipment and storage medium - Google Patents

Named entity labeling method and device, computer equipment and storage medium Download PDF

Info

Publication number
CN111651992A
Authority
CN
China
Prior art keywords
vector
matrix
textcnn
training
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010333674.9A
Other languages
Chinese (zh)
Inventor
陈桢博
金戈
徐亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010333674.9A priority Critical patent/CN111651992A/en
Publication of CN111651992A publication Critical patent/CN111651992A/en
Priority to PCT/CN2020/118522 priority patent/WO2021212749A1/en
Pending legal-status Critical Current

Classifications

    • G06F 40/295: Physics; Computing; Electric digital data processing; Handling natural language data; Natural language analysis; Recognition of textual entities; Phrasal analysis; Named entity recognition
    • G06F 18/2415: Physics; Computing; Electric digital data processing; Pattern recognition; Classification techniques relating to the classification model, based on parametric or probabilistic models
    • G06N 3/045: Physics; Computing arrangements based on specific computational models; Computing arrangements based on biological models; Neural networks; Combinations of networks
    • G06N 3/08: Physics; Computing arrangements based on specific computational models; Computing arrangements based on biological models; Neural networks; Learning methods

Abstract

The application relates to the field of artificial intelligence and provides a named entity labeling method and related equipment, comprising the following steps: constructing word vectors for the sentences in a resume text; performing a multilayer convolution operation on the word vectors through the multilayer convolution layers of a TextCNN model to obtain a word vector matrix; computing a query vector, a key vector and a value vector from the word vector matrix; calculating an attention weight matrix between every two characters in the sentence, and adjusting it based on the value vector; and, based on the word vector matrix and the adjusted attention weight matrix, processing through a fully connected layer and inputting the result into a softmax classification layer fused with a Gaussian error for classification, to obtain a first named entity label for each character in the sentence. The present application enhances the ability to capture local context. In addition, the present application relates to the field of blockchain, in which the resume text may be stored.

Description

Named entity labeling method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of classification model technology, and in particular, to a named entity labeling method, apparatus, computer device, and storage medium.
Background
The Named Entity Recognition (NER) task identifies and classifies proper names such as person names, place names and organization names appearing in text, and is the basis of many natural language processing tasks such as information extraction, information retrieval and question answering systems. For example, in a resume recognition scenario, it is often necessary to identify named entities such as school names and place names in the resume text.
Named entity labeling is a necessary step in named entity recognition: it is the process of assigning a classification label to each character in a text. Although traditional deep learning methods achieve good results, they assign the same feature weight to long-distance features throughout a sentence, so their recognition accuracy on short-distance key features falls short of the ideal.
Disclosure of Invention
The main purpose of the application is to provide a named entity labeling method, apparatus, computer device and storage medium, so as to overcome the low recognition accuracy of short-distance key features in named entity labeling.
In order to achieve the above object, the present application provides a named entity labeling method, comprising the following steps:
obtaining a sentence in a resume text and constructing word vectors for the sentence;
performing a multilayer convolution operation on the word vectors through the multilayer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
computing a query vector, a key vector and a value vector from the word vector matrix based on the fully connected layer of the TextCNN model;
calculating an attention weight matrix between every two characters in the sentence according to the query vector and the key vector in combination with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
and, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into a softmax classification layer fused with a Gaussian error for classification, to obtain a first named entity label for each character in the sentence.
Further, the step of computing a query vector, a key vector and a value vector from the word vector matrix based on the fully connected layer of the TextCNN model includes:
calculating the query vector from the word vector matrix based on pre-trained query vector calculation parameters in the fully connected layer of the TextCNN model;
calculating the key vector from the word vector matrix based on pre-trained key vector calculation parameters in the fully connected layer of the TextCNN model;
and calculating the value vector from the word vector matrix based on pre-trained value vector calculation parameters in the fully connected layer of the TextCNN model.
Further, after the step of performing fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix and then inputting the result into a softmax classification layer fused with a Gaussian error for classification to obtain the first named entity label of each character in the sentence, the method includes:
adding the named entity labels obtained by classification to each character in the sentence to generate a first training sample;
sampling the first training sample with replacement to obtain a plurality of training sample sets, and training an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN submodels;
inputting the same unlabeled resume text into all the TextCNN submodels to output the named entity labeling result predicted by each TextCNN submodel;
and judging whether the named entity labeling results predicted by the TextCNN submodels are all the same; if so, verifying that the training of the TextCNN submodels is finished and that the first named entity label of each character in the sentence is correct.
Further, the step of obtaining a sentence in the resume text and constructing word vectors for the sentence includes:
acquiring the resume text;
inputting the resume text into a preset text detection model to detect each character area in the resume text, the text detection model being trained on the basis of a natural scene text detection model;
adding a marking frame outside each character area;
identifying each marking frame based on an image recognition technique, performing character recognition on the character content in each marking frame through a character recognition model to obtain the character information in each marking frame, and taking each piece of recognized character information as a sentence;
and constructing a word vector corresponding to each character in each sentence based on a preset word embedding model.
Further, the step of calculating an attention weight matrix between every two characters in the sentence according to the query vector and the key vector in combination with the Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector, includes:
calculating a weight matrix from the query vector and the key vector based on the corresponding weight matrix calculation parameters;
calculating the Gaussian deviation matrix from the query vector and the key vector based on the corresponding Gaussian deviation matrix calculation parameters;
summing the weight matrix and the Gaussian deviation matrix and performing normalization to obtain the attention weight matrix;
and multiplying the attention weight matrix by the value vector to adjust the attention weight matrix.
The present application further provides a named entity labeling apparatus, including:
an acquisition unit for acquiring a sentence in a resume text and constructing word vectors for the sentence;
a first calculation unit for performing a multilayer convolution operation on the word vectors through the multilayer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
a second calculation unit for computing a query vector, a key vector and a value vector from the word vector matrix based on the fully connected layer of the TextCNN model;
a third calculation unit for calculating an attention weight matrix between every two characters in the sentence according to the query vector and the key vector in combination with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
and a classification unit for performing fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix, and then inputting the result into a softmax classification layer fused with a Gaussian error for classification, to obtain a first named entity label for each character in the sentence.
Further, the second calculation unit includes:
a first calculation subunit for calculating the query vector from the word vector matrix based on pre-trained query vector calculation parameters in the fully connected layer of the TextCNN model;
a second calculation subunit for calculating the key vector from the word vector matrix based on pre-trained key vector calculation parameters in the fully connected layer of the TextCNN model;
and a third calculation subunit for calculating the value vector from the word vector matrix based on pre-trained value vector calculation parameters in the fully connected layer of the TextCNN model.
Further, the named entity labeling apparatus further includes:
a generating unit for adding the named entity labels obtained by classification to each character in the sentence to generate a first training sample;
a training unit for sampling the first training sample with replacement to obtain a plurality of training sample sets, and training an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN submodels;
an output unit for inputting the same unlabeled resume text into all the TextCNN submodels to output the named entity labeling result predicted by each TextCNN submodel;
and a verification unit for judging whether the named entity labeling results predicted by all the TextCNN submodels are the same, and, if so, verifying that the training of the TextCNN submodels is finished and that the first named entity label of each character in the sentence is correct.
Further, the acquisition unit includes:
an obtaining subunit for obtaining the resume text;
a detection subunit for inputting the resume text into a preset text detection model to detect each character area in the resume text, the text detection model being trained on the basis of a natural scene text detection model;
an adding subunit for adding a marking frame outside each character area;
a recognition subunit for identifying each marking frame based on an image recognition technique, performing character recognition on the character content in each marking frame through a character recognition model to obtain the character information in each marking frame, and taking each piece of recognized character information as a sentence;
and a construction subunit for constructing a word vector corresponding to each character in each sentence based on a preset word embedding model.
Further, the third calculation unit includes:
a fourth calculation subunit for calculating a weight matrix from the query vector and the key vector based on the corresponding weight matrix calculation parameters;
a fifth calculation subunit for calculating the Gaussian deviation matrix from the query vector and the key vector based on the corresponding Gaussian deviation matrix calculation parameters;
an addition subunit for summing the weight matrix and the Gaussian deviation matrix and performing normalization to obtain the attention weight matrix;
and an adjusting subunit for multiplying the attention weight matrix by the value vector to adjust the attention weight matrix.
The present application further provides a computer device comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of any one of the above methods when executing the computer program.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method of any of the above.
The named entity labeling method, apparatus, computer device and storage medium provided by the application comprise the following steps: obtaining a sentence in a resume text and constructing word vectors for the sentence; performing a multilayer convolution operation on the word vectors through the multilayer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix; computing a query vector, a key vector and a value vector from the word vector matrix based on the fully connected layer of the TextCNN model; calculating an attention weight matrix between every two characters in the sentence according to the query vector and the key vector in combination with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector; and, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into a softmax classification layer fused with a Gaussian error for classification, to obtain a first named entity label for each character in the sentence. The method introduces learnable Gaussian deviation weights: the center position of a local range and a moving window are introduced to calculate the Gaussian deviation, which is put into the softmax function to correct the locally strengthened weight distribution, thereby enhancing the ability to capture local context.
Drawings
FIG. 1 is a schematic diagram illustrating the steps of a named entity labeling method in an embodiment of the present application;
FIG. 2 is a block diagram of a named entity labeling apparatus in an embodiment of the present application;
FIG. 3 is a block diagram illustrating the structure of a computer device according to an embodiment of the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Referring to fig. 1, an embodiment of the present application provides a named entity labeling method, comprising the following steps:
Step S1, obtaining a sentence in a resume text and constructing word vectors for the sentence;
Step S2, performing a multilayer convolution operation on the word vectors through the multilayer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
Step S3, computing a query vector, a key vector and a value vector from the word vector matrix based on the fully connected layer of the TextCNN model;
Step S4, calculating an attention weight matrix between every two characters in the sentence according to the query vector and the key vector in combination with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
and Step S5, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and inputting the result into a softmax classification layer fused with a Gaussian error for classification, to obtain a first named entity label for each character in the sentence.
In this embodiment, the named entity labeling method is applied to scenarios in which named entities such as school names, company names and major information are automatically extracted from resume texts.
As described in Step S1, a resume text generally contains a plurality of sentences. In this embodiment, each sentence in the resume text is obtained and corresponding word vectors are constructed for it. It is understood that before the word vectors of a sentence are constructed, the sentence may be preprocessed; the preprocessing includes removing characters such as special symbols and stop words, converting the unformatted text into a format the algorithm can operate on. After preprocessing, the sentence is input into the embedding layer of the word embedding model, which converts each character in the sentence into a corresponding word vector (typically 300-dimensional). The word vector dictionary in the embedding layer is trained in advance with the Word2Vec or GloVe algorithm, and details are not repeated here.
As described in Step S2, the TextCNN model is an algorithm that applies a convolutional neural network to text classification and can effectively extract the information in the text. It has a forward layer and a backward layer for learning the preceding and following context, respectively, both connected to the output layer. In this embodiment, the TextCNN model applies several convolution layers to the input word vectors to obtain a word vector matrix. The convolution kernels of the TextCNN model are 1-dimensional, their length can be set to 2 or 3, the number of convolution channels is set to 128 in this scheme, and the convolution layer activation function is ReLU. Taking a sentence of length m as an example, it is converted into an m × 300 matrix by the embedding layer, and the multilayer convolution of the TextCNN model then outputs an m × 128 word vector matrix.
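By way of illustration only, this convolutional encoding step might look as follows in PyTorch; the kernel length of 3, the channel count of 128 and the 300-dimensional embeddings follow the figures given here, while the layer count, padding and batch handling are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class ConvEncoder(nn.Module):
    """Multilayer 1-D convolution over character embeddings (m x 300 -> m x 128)."""
    def __init__(self, emb_dim: int = 300, channels: int = 128, n_layers: int = 2):
        super().__init__()
        layers, in_ch = [], emb_dim
        for _ in range(n_layers):
            # kernel length 2-3 per the description; padding=1 keeps length m for size 3
            layers += [nn.Conv1d(in_ch, channels, kernel_size=3, padding=1), nn.ReLU()]
            in_ch = channels
        self.net = nn.Sequential(*layers)

    def forward(self, emb: torch.Tensor) -> torch.Tensor:
        # emb: (batch, m, 300); Conv1d expects (batch, channels, m)
        out = self.net(emb.transpose(1, 2))
        return out.transpose(1, 2)              # (batch, m, 128)

encoder = ConvEncoder()
sentence_emb = torch.randn(1, 20, 300)          # one sentence of m = 20 characters
word_vector_matrix = encoder(sentence_emb)      # shape (1, 20, 128)
```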
As described in Step S3, after the word vector matrix is obtained in the above step, it is processed by the fully connected layer of the TextCNN model to obtain three vectors: the query vector Q, the key vector K and the value vector V, all m × n matrices. Q, K and V are obtained by passing the same word vector matrix through the fully connected layer; the only difference lies in the calculation parameters. The query vector Q and the key vector K are constructed in order to calculate the influence weights between the characters of the same sentence: when a named entity is identified, the characters at other positions of the sentence must be consulted, so the influence weights of the other characters need to be taken into account. Q and K construct a similarity weight matrix that quantifies the influence relationship between the characters in the sentence.
It should be understood that the TextCNN model in this embodiment differs from existing models: it introduces, in the fully connected layer, calculation parameters for the query vector Q, the key vector K and the value vector V, as well as for the weight matrix and the Gaussian deviation matrix. During training of the TextCNN model, the optimal values of all of these calculation parameters are obtained through iterative training.
As described in Step S4, the attention weight matrix between every two characters in the sentence is obtained from the query vector and the key vector in combination with the Gaussian deviation matrix. In this embodiment, learnable Gaussian deviation weights are introduced: the center position of a local range and a moving window are introduced to calculate the Gaussian deviation, which is put into the softmax function to correct the locally strengthened weight distribution, thereby enhancing the ability to capture local context.
The attention weight matrix between every two characters scores each character against the whole sentence, and the score determines how important that character is to the characters in the other parts of the sentence. Specifically, the attention weight matrix can be obtained by multiplying the query vector by the key vector and normalizing the result. More specifically, after the query vector is multiplied by the key vector, the result is divided by sqrt(d), where d is the dimension of the key vector K, before normalization; this controls the distribution of the result so that large values do not cause excessive gradient updates, making the gradients more stable.
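The projection and scaling just described can be sketched as follows; the linear projections stand in for the pre-trained calculation parameters of the fully connected layer, and the dimension d = 128 is an assumption consistent with the m × 128 word vector matrix above.

```python
import torch
import torch.nn as nn

d = 128                                   # dimension of the key vector K (assumed)
proj_q = nn.Linear(128, d)                # query vector calculation parameters
proj_k = nn.Linear(128, d)                # key vector calculation parameters
proj_v = nn.Linear(128, d)                # value vector calculation parameters

def scaled_scores(word_vector_matrix: torch.Tensor) -> torch.Tensor:
    """Return the m x m score matrix M = Q K^T / sqrt(d) before normalization."""
    Q = proj_q(word_vector_matrix)        # (m, d)
    K = proj_k(word_vector_matrix)        # (m, d)
    # dividing by sqrt(d) keeps the dot products from producing oversized gradients
    return Q @ K.transpose(-2, -1) / d ** 0.5

C = torch.randn(20, 128)                  # word vector matrix of a sentence, m = 20
M = scaled_scores(C)                      # influence weights between every two characters
```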
As described in Step S5, the word vector matrix and the adjusted attention weight matrix are first added and processed through the fully connected layer to obtain a classification matrix, which is then input into the softmax classification layer. Classification with the softmax function outputs the probability of the BIOES label to which each character belongs; the label with the highest probability is output directly as the first named entity label of the character, or a CRF algorithm may be superimposed for label output.
In this embodiment, the BIOES labeling scheme is adopted: B marks the beginning of an entity, I the inside of an entity, O a non-entity, E the end of an entity, and S a single-character entity. Different types of named entities also need to be distinguished; for example, a character may be labeled as the beginning of a person's name or as the inside of a place name. An illustrative labeling is sketched below.
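As a purely hypothetical illustration (the sentence and the entity types are invented for this example):

```python
# Character-level BIOES labels for the invented sentence "张三毕业于北京大学":
# 张三 is a person name (PER), 北京大学 is an organization (ORG).
sentence = ["张", "三", "毕", "业", "于", "北", "京", "大", "学"]
labels   = ["B-PER", "E-PER", "O", "O", "O", "B-ORG", "I-ORG", "I-ORG", "E-ORG"]
```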
In an embodiment, as noted above, the TextCNN model differs from existing models in that calculation parameters for the query vector Q, the key vector K and the value vector V are introduced into the fully connected layer, and the optimal values of these parameters are obtained through iterative training when the TextCNN model is trained.
Therefore, the Step S3 of computing a query vector, a key vector and a value vector from the word vector matrix based on the fully connected layer of the TextCNN model includes:
calculating the query vector from the word vector matrix based on the pre-trained query vector calculation parameters in the fully connected layer of the TextCNN model;
calculating the key vector from the word vector matrix based on the pre-trained key vector calculation parameters in the fully connected layer of the TextCNN model; the query vector Q and the key vector K are constructed in order to calculate the influence weights between the characters of the same sentence;
and calculating the value vector from the word vector matrix based on the pre-trained value vector calculation parameters in the fully connected layer of the TextCNN model; the value vector is constructed in order to adjust the attention weight matrix.
In an embodiment, after the Step S5 of performing fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix, and then inputting the result into a softmax classification layer fused with a Gaussian error for classification to obtain the first named entity label of each character in the sentence, the method includes:
Step S6, adding the named entity labels obtained by classification to each character in the sentence to generate a first training sample;
Step S7, sampling the first training sample with replacement to obtain a plurality of training sample sets, and training an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN submodels;
Step S8, inputting the same unlabeled resume text into all the TextCNN submodels to output the named entity labeling result predicted by each TextCNN submodel;
Step S9, judging whether the named entity labeling results predicted by the TextCNN submodels are all the same; if so, verifying that the training of the TextCNN submodels is finished and that the first named entity label of each character in the sentence is correct.
It can be understood that the training set adopted by the TextCNN submodels consists of texts in the resume domain, so after iterative training the submodels are more targeted to this professional domain. Meanwhile, several groups of TextCNN submodels are trained simultaneously, and only when all of their results are the same can the training be verified as finally complete; the same agreement also indicates that the first named entity label of each character in the sentence is correct.
When the TextCNN submodels are subsequently used for named entity labeling, the same resume text can be input into several TextCNN submodels for prediction, and a labeling result is taken as the named entity labeling result of the resume text only when the results predicted by all the TextCNN submodels are the same. A sketch of this bootstrap-and-agree procedure follows.
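A minimal sketch of Steps S6 to S9, assuming a generic train_textcnn routine and submodels exposing a predict method; the function names and the submodel count are illustrative assumptions, not part of the application:

```python
import random
from typing import Callable, List, Sequence

def build_submodels(samples: Sequence, train_textcnn: Callable, n_models: int = 3) -> List:
    """Step S7: sample with replacement into n_models training sets, train one submodel each."""
    submodels = []
    for _ in range(n_models):
        bootstrap = [random.choice(samples) for _ in range(len(samples))]
        submodels.append(train_textcnn(bootstrap))
    return submodels

def verified_labels(submodels: List, resume_text: str):
    """Steps S8-S9: accept a labeling only if every submodel predicts the same result."""
    predictions = [m.predict(resume_text) for m in submodels]
    if all(p == predictions[0] for p in predictions):
        return predictions[0]     # training verified; labeling taken as correct
    return None                   # disagreement: labeling not verified
```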
In an embodiment, the Step S1 of obtaining a sentence in the resume text and constructing word vectors for the sentence includes:
Step S11, acquiring the resume text, which may be a Word electronic document or a picture.
It is emphasized that, to further ensure privacy and security, the resume text may also be stored in a node of a blockchain.
Step S12, inputting the resume text into a preset text detection model to detect each character area in the resume text. The text detection model is trained on the basis of a natural scene text detection model; it detects the areas in which text appears in the resume and is used only for locating those areas, not for recognizing the specific characters in them.
Step S13, adding a marking frame outside each character area. The marking frames make the corresponding character areas easier to recognize and reduce the subsequent recognition workload.
Step S14, identifying each marking frame based on an image recognition technique, performing character recognition on the character content in each marking frame through a character recognition model, and taking each piece of recognized character information as a sentence. Once each marking frame is identified, the character content in it can be recognized directly with the character recognition model, and the content of each frame is taken as one sentence.
Step S15, constructing a word vector corresponding to each character in each sentence based on a preset word embedding model. The word embedding model is trained with the Word2Vec or GloVe algorithm and converts the characters of each sentence into corresponding word vectors.
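For illustration, Steps S12 to S15 could be wired together as below; detect_character_areas, recognize_characters and embed are hypothetical stand-ins for the text detection model, the character recognition model and the word embedding model, none of which is specified concretely here:

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height of a marking frame

def sentences_to_word_vectors(resume_image,
                              detect_character_areas: Callable[[object], List[Box]],
                              recognize_characters: Callable[[object, Box], str],
                              embed: Callable[[str], List[List[float]]]):
    """Steps S12-S15: detect areas, frame them, recognize text, embed characters."""
    boxes = detect_character_areas(resume_image)          # Step S12/S13
    sentences = [recognize_characters(resume_image, b)    # Step S14
                 for b in boxes]
    # Step S15: one (typically 300-dimensional) vector per character of each sentence
    return [(s, embed(s)) for s in sentences]
```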
In an embodiment, the Step S4 of calculating an attention weight matrix between every two characters in the sentence according to the query vector and the key vector in combination with the Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector, includes:
Step S41, calculating a weight matrix M from the query vector and the key vector based on the corresponding weight matrix calculation parameters, and calculating the Gaussian deviation matrix G from the query vector and the key vector based on the corresponding Gaussian deviation matrix calculation parameters. The calculation parameters used for M differ from those used for G, and both sets are obtained when the TextCNN model is iteratively trained.
Step S42, summing the weight matrix and the Gaussian deviation matrix and performing normalization to obtain the attention weight matrix;
Step S43, multiplying the attention weight matrix by the value vector to adjust the attention weight matrix.
The Gaussian deviation matrix G adjusts the weight matrix M: it introduces the center position of a local range and a moving window to calculate a Gaussian deviation, which is put into the softmax function to correct the locally strengthened weight distribution, thereby enhancing the ability to capture local context.
Further, the attention weight matrix ATT is calculated from the weight matrix M and the Gaussian deviation matrix G as ATT(Q, K) = Softmax(M + G).
In order to adjust the attention weight matrix, it is multiplied by the value vector, i.e. ATT · V; the resulting attention weight matrix serves as the weights used in the calculation with the value vector V. It can be understood that during model training, under the supervised learning task, the optimization algorithm automatically tunes the parameters according to the results to obtain the optimal calculation parameters; in prediction, this makes it straightforward to obtain accurate Q and K matrices and hence an accurate attention weight matrix.
In an embodiment, the calculation process of obtaining the first named entity label of each character in the sentence, by performing fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix and then inputting the result into a softmax classification layer fused with a Gaussian error, is as follows:
the word vector matrix is combined with the adjusted attention weight matrix to obtain L1 = C + ATT · V, where C is the word vector matrix; fully connected layer processing then gives L2 = FC(L1); and finally the softmax classification layer yields the probability of the BIOES label of each character in the sentence, L3 = softmax(L2). The label with the highest probability is usually output as the labeling result for the character.
In this embodiment, a Gaussian deviation matrix G is added inside the softmax activation function of the softmax classification layer, where G is an L × L matrix and L is the number of characters in the sentence. Its element G_ij measures the closeness between character x_j and the predicted center position P_i, where D_i is the window size and the standard deviation of the Gaussian is set to half the window size.
The attention weight matrix between every two characters is:
ATT(Q, K) = Softmax(M + G)
G_ij is:
G_ij = -(j - P_i)^2 / (2 · (D_i / 2)^2)
P_i and D_i are calculated as follows. To keep P_i and D_i between 0 and L, a scaling factor L is applied. Since each center position depends on the corresponding query vector, a feed-forward mechanism is applied to convert that vector into a hidden state, which is mapped to a scalar with a linear mapping:
p_i = U_p^T · tanh(W_p · Q_i)
z_i = U_d^T · tanh(W_p · Q_i)
P_i = L · sigmoid(p_i), D_i = L · sigmoid(z_i)
where U_p, U_d and W_p are trainable linear mappings and Q_i is the query vector.
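A compact sketch of the Gaussian-biased attention and classification head described above, following the formulas as reconstructed; the tag count, layer sizes and batch handling are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianBiasedAttention(nn.Module):
    """Softmax(M + G) attention with a learnable Gaussian localness bias."""
    def __init__(self, d: int = 128, n_tags: int = 9):
        super().__init__()
        self.d = d
        self.q, self.k, self.v = (nn.Linear(d, d) for _ in range(3))
        self.w_p = nn.Linear(d, d)       # feed-forward to a hidden state (W_p)
        self.u_p = nn.Linear(d, 2)       # maps the hidden state to scalars p_i, z_i
        self.fc = nn.Linear(d, n_tags)   # classification over BIOES tags

    def forward(self, C: torch.Tensor) -> torch.Tensor:
        L = C.size(0)                                    # sentence length
        Q, K, V = self.q(C), self.k(C), self.v(C)
        M = Q @ K.T / self.d ** 0.5                      # L x L weight matrix
        p, z = self.u_p(torch.tanh(self.w_p(Q))).unbind(-1)
        P = L * torch.sigmoid(p)                         # predicted center positions P_i
        D = L * torch.sigmoid(z)                         # window sizes D_i
        j = torch.arange(L, dtype=C.dtype).unsqueeze(0)  # character positions
        G = -((j - P.unsqueeze(1)) ** 2) / (2 * (D.unsqueeze(1) / 2) ** 2)
        ATT = F.softmax(M + G, dim=-1)                   # ATT(Q, K) = Softmax(M + G)
        L1 = C + ATT @ V                                 # L1 = C + ATT . V
        return F.softmax(self.fc(L1), dim=-1)            # L3 = softmax(FC(L1))

probs = GaussianBiasedAttention()(torch.randn(20, 128))  # (20, 9) BIOES probabilities
```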
In this embodiment, learnable Gaussian error weights are introduced, and the center position of a local range and a moving window are introduced to calculate the Gaussian error, which is put into the softmax function to correct the locally strengthened weight distribution. This preserves long-distance dependencies while learning the neighborhood within a small range, enhancing the ability to capture local context.
Referring to fig. 2, an embodiment of the present application further provides a named entity labeling apparatus, including:
an acquisition unit 10 for acquiring a sentence in a resume text and constructing word vectors for the sentence;
a first calculation unit 20 for performing a multilayer convolution operation on the word vectors through the multilayer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
a second calculation unit 30 for computing a query vector, a key vector and a value vector from the word vector matrix based on the fully connected layer of the TextCNN model;
a third calculation unit 40 for calculating an attention weight matrix between every two characters in the sentence according to the query vector and the key vector in combination with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
and a classification unit 50 for performing fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix, and then inputting the result into a softmax classification layer fused with a Gaussian error for classification, to obtain a first named entity label for each character in the sentence.
In an embodiment, the second calculation unit 30 includes:
a first calculation subunit for calculating the query vector from the word vector matrix based on pre-trained query vector calculation parameters in the fully connected layer of the TextCNN model;
a second calculation subunit for calculating the key vector from the word vector matrix based on pre-trained key vector calculation parameters in the fully connected layer of the TextCNN model;
and a third calculation subunit for calculating the value vector from the word vector matrix based on pre-trained value vector calculation parameters in the fully connected layer of the TextCNN model.
In an embodiment, the named entity labeling apparatus further includes:
a generating unit for adding the named entity labels obtained by classification to each character in the sentence to generate a first training sample;
a training unit for sampling the first training sample with replacement to obtain a plurality of training sample sets, and training an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN submodels;
an output unit for inputting the same unlabeled resume text into all the TextCNN submodels to output the named entity labeling result predicted by each TextCNN submodel;
and a verification unit for judging whether the named entity labeling results predicted by all the TextCNN submodels are the same, and, if so, verifying that the training of the TextCNN submodels is finished and that the first named entity label of each character in the sentence is correct.
In an embodiment, the acquisition unit 10 includes:
an obtaining subunit for obtaining the resume text;
a detection subunit for inputting the resume text into a preset text detection model to detect each character area in the resume text, the text detection model being trained on the basis of a natural scene text detection model;
an adding subunit for adding a marking frame outside each character area;
a recognition subunit for identifying each marking frame based on an image recognition technique, performing character recognition on the character content in each marking frame through a character recognition model to obtain the character information in each marking frame, and taking each piece of recognized character information as a sentence;
and a construction subunit for constructing a word vector corresponding to each character in each sentence based on a preset word embedding model.
In an embodiment, the third calculation unit 40 includes:
a fourth calculation subunit for calculating a weight matrix from the query vector and the key vector based on the corresponding weight matrix calculation parameters;
a fifth calculation subunit for calculating the Gaussian deviation matrix from the query vector and the key vector based on the corresponding Gaussian deviation matrix calculation parameters;
an addition subunit for summing the weight matrix and the Gaussian deviation matrix and performing normalization to obtain the attention weight matrix;
and an adjusting subunit for multiplying the attention weight matrix by the value vector to adjust the attention weight matrix.
In this embodiment, please refer to corresponding parts in the above method embodiments for specific implementation of the units/sub-units, which will not be described herein again.
Referring to fig. 3, a computer device, which may be a server and whose internal structure may be as shown in fig. 3, is also provided in an embodiment of the present application. The computer device includes a processor, a memory, a network interface and a database connected by a system bus, with the processor providing computation and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database; the internal memory provides an environment for running the operating system and computer programs in the non-volatile storage medium. The database of the computer device stores text data, training data and the like. The network interface of the computer device communicates with external terminals through a network connection. The computer program, when executed by the processor, implements the named entity labeling method.
Those skilled in the art will appreciate that the architecture shown in fig. 3 is only a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects may be applied.
An embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements a named entity tagging method. It is to be understood that the computer-readable storage medium in the present embodiment may be a volatile-readable storage medium or a non-volatile-readable storage medium.
In summary, the named entity labeling method, apparatus, computer device and storage medium provided in the embodiments of the present application comprise: obtaining a sentence in a resume text and constructing word vectors for the sentence; performing a multilayer convolution operation on the word vectors through the multilayer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix; computing a query vector, a key vector and a value vector from the word vector matrix based on the fully connected layer of the TextCNN model; calculating an attention weight matrix between every two characters in the sentence according to the query vector and the key vector in combination with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector; and, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into a softmax classification layer fused with a Gaussian error for classification, to obtain a first named entity label for each character in the sentence. The method introduces learnable Gaussian deviation weights, introducing the center position of a local range and a moving window to calculate the Gaussian deviation, which is put into the softmax function to correct the locally strengthened weight distribution, thereby enhancing the ability to capture local context.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, storage, a database or another medium provided herein and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM) and direct Rambus dynamic RAM (DRDRAM).
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, apparatus, article, or method that includes the element.
The above description is only for the preferred embodiment of the present application and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application, or which are directly or indirectly applied to other related technical fields, are intended to be included within the scope of the present application.

Claims (10)

1. A named entity labeling method, characterized by comprising the following steps:
obtaining a sentence in a resume text and constructing word vectors for the sentence;
performing a multilayer convolution operation on the word vectors through the multilayer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
computing a query vector, a key vector and a value vector from the word vector matrix based on the fully connected layer of the TextCNN model;
calculating an attention weight matrix between every two characters in the sentence according to the query vector and the key vector in combination with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
and, based on the word vector matrix and the adjusted attention weight matrix, performing fully connected layer processing through the TextCNN model and then inputting the result into a softmax classification layer fused with a Gaussian error for classification, to obtain a first named entity label for each character in the sentence.
2. The named entity labeling method of claim 1, wherein the step of computing a query vector, a key vector and a value vector from the word vector matrix based on the fully connected layer of the TextCNN model comprises:
calculating the query vector from the word vector matrix based on pre-trained query vector calculation parameters in the fully connected layer of the TextCNN model;
calculating the key vector from the word vector matrix based on pre-trained key vector calculation parameters in the fully connected layer of the TextCNN model;
and calculating the value vector from the word vector matrix based on pre-trained value vector calculation parameters in the fully connected layer of the TextCNN model.
3. The named entity labeling method of claim 1, wherein, after the step of performing fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix and then inputting the result into a softmax classification layer fused with a Gaussian error for classification to obtain the first named entity label of each character in the sentence, the method comprises:
adding the named entity labels obtained by classification to each character in the sentence to generate a first training sample;
sampling the first training sample with replacement to obtain a plurality of training sample sets, and training an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN submodels;
inputting the same unlabeled resume text into all the TextCNN submodels to output the named entity labeling result predicted by each TextCNN submodel;
and judging whether the named entity labeling results predicted by the TextCNN submodels are all the same; if so, verifying that the training of the TextCNN submodels is finished and that the first named entity label of each character in the sentence is correct.
4. The named entity labeling method of claim 1, wherein the step of obtaining a sentence in a resume text and constructing word vectors for the sentence comprises:
acquiring the resume text, the resume text being stored in a blockchain;
inputting the resume text into a preset text detection model to detect each character area in the resume text, the text detection model being trained on the basis of a natural scene text detection model;
adding a marking frame outside each character area;
identifying each marking frame based on an image recognition technique, performing character recognition on the character content in each marking frame through a character recognition model to obtain the character information in each marking frame, and taking each piece of recognized character information as a sentence;
and constructing a word vector corresponding to each character in each sentence based on a preset word embedding model.
5. The named entity labeling method of claim 1, wherein the step of calculating an attention weight matrix between every two characters in the sentence according to the query vector and the key vector in combination with the Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector, comprises:
calculating a weight matrix from the query vector and the key vector based on the corresponding weight matrix calculation parameters;
calculating the Gaussian deviation matrix from the query vector and the key vector based on the corresponding Gaussian deviation matrix calculation parameters;
summing the weight matrix and the Gaussian deviation matrix and performing normalization to obtain the attention weight matrix;
and multiplying the attention weight matrix by the value vector to adjust the attention weight matrix.
6. A named entity labeling apparatus, characterized by comprising:
an acquisition unit for acquiring a sentence in a resume text and constructing word vectors for the sentence;
a first calculation unit for performing a multilayer convolution operation on the word vectors through the multilayer convolution layers of a pre-trained TextCNN model to obtain a word vector matrix;
a second calculation unit for computing a query vector, a key vector and a value vector from the word vector matrix based on the fully connected layer of the TextCNN model;
a third calculation unit for calculating an attention weight matrix between every two characters in the sentence according to the query vector and the key vector in combination with a Gaussian deviation matrix, and adjusting the attention weight matrix based on the value vector;
and a classification unit for performing fully connected layer processing through the TextCNN model based on the word vector matrix and the adjusted attention weight matrix, and then inputting the result into a softmax classification layer fused with a Gaussian error for classification, to obtain a first named entity label for each character in the sentence.
7. The named entity labeling apparatus of claim 6, wherein the second calculation unit comprises:
a first calculation subunit for calculating the query vector from the word vector matrix based on pre-trained query vector calculation parameters in the fully connected layer of the TextCNN model;
a second calculation subunit for calculating the key vector from the word vector matrix based on pre-trained key vector calculation parameters in the fully connected layer of the TextCNN model;
and a third calculation subunit for calculating the value vector from the word vector matrix based on pre-trained value vector calculation parameters in the fully connected layer of the TextCNN model.
8. The named entity labeling apparatus of claim 6, further comprising:
a generating unit for adding the named entity labels obtained by classification to each character in the sentence to generate a first training sample;
a training unit for sampling the first training sample with replacement to obtain a plurality of training sample sets, and training an initial TextCNN model on each training sample set to obtain a corresponding number of TextCNN submodels;
an output unit for inputting the same unlabeled resume text into all the TextCNN submodels to output the named entity labeling result predicted by each TextCNN submodel;
and a verification unit for judging whether the named entity labeling results predicted by all the TextCNN submodels are the same, and, if so, verifying that the training of the TextCNN submodels is finished and that the first named entity label of each character in the sentence is correct.
9. A computer device comprising a memory and a processor, the memory having stored therein a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any of claims 1 to 5.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 5.
CN202010333674.9A 2020-04-24 2020-04-24 Named entity labeling method and device, computer equipment and storage medium Pending CN111651992A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010333674.9A CN111651992A (en) 2020-04-24 2020-04-24 Named entity labeling method and device, computer equipment and storage medium
PCT/CN2020/118522 WO2021212749A1 (en) 2020-04-24 2020-09-28 Method and apparatus for labelling named entity, computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010333674.9A CN111651992A (en) 2020-04-24 2020-04-24 Named entity labeling method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111651992A true CN111651992A (en) 2020-09-11

Family

ID=72352510

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010333674.9A Pending CN111651992A (en) 2020-04-24 2020-04-24 Named entity labeling method and device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN111651992A (en)
WO (1) WO2021212749A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114496115B (en) * 2022-04-18 2022-08-23 北京白星花科技有限公司 Automatic generation method and system for entity relation label
CN114580424B (en) * 2022-04-24 2022-08-05 之江实验室 Labeling method and device for named entity identification of legal document
CN115564393A (en) * 2022-10-24 2023-01-03 深圳今日人才信息科技有限公司 Recruitment requirement similarity-based job recommendation method
CN116030014B (en) * 2023-01-06 2024-04-09 浙江伟众科技有限公司 Intelligent processing method and system for soft and hard air conditioner pipes
CN116611439B (en) * 2023-07-19 2023-09-19 北京惠每云科技有限公司 Medical information extraction method, device, electronic equipment and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3454260A1 (en) * 2017-09-11 2019-03-13 Tata Consultancy Services Limited Bilstm-siamese network based classifier for identifying target class of queries and providing responses thereof
CN110502738A (en) * 2018-05-18 2019-11-26 阿里巴巴集团控股有限公司 Chinese name entity recognition method, device, equipment and inquiry system
CN110222188B (en) * 2019-06-18 2023-04-18 深圳司南数据服务有限公司 Company notice processing method for multi-task learning and server
CN110298043B (en) * 2019-07-03 2023-04-07 吉林大学 Vehicle named entity identification method and system
CN111651992A (en) * 2020-04-24 2020-09-11 平安科技(深圳)有限公司 Named entity labeling method and device, computer equipment and storage medium

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021212749A1 (en) * 2020-04-24 2021-10-28 平安科技(深圳)有限公司 Method and apparatus for labelling named entity, computer device, and storage medium
CN112580822A (en) * 2020-12-16 2021-03-30 北京百度网讯科技有限公司 Countermeasure training method and apparatus for machine learning model, electronic device, and medium
CN112580822B (en) * 2020-12-16 2023-10-17 北京百度网讯科技有限公司 Countermeasure training method device for machine learning model, electronic equipment and medium
CN112580628A (en) * 2020-12-22 2021-03-30 浙江智慧视频安防创新中心有限公司 License plate character recognition method and system based on attention mechanism
CN112580628B (en) * 2020-12-22 2023-08-01 浙江智慧视频安防创新中心有限公司 Attention mechanism-based license plate character recognition method and system
CN112784015A (en) * 2021-01-25 2021-05-11 北京金堤科技有限公司 Information recognition method and apparatus, device, medium, and program
CN112784015B (en) * 2021-01-25 2024-03-12 北京金堤科技有限公司 Information identification method and device, apparatus, medium, and program
CN113312477A (en) * 2021-04-19 2021-08-27 上海快确信息科技有限公司 Semi-structure text classification scheme based on graph attention
CN113051897A (en) * 2021-05-25 2021-06-29 中国电子科技集团公司第三十研究所 GPT2 text automatic generation method based on Performer structure
CN113051897B (en) * 2021-05-25 2021-09-10 中国电子科技集团公司第三十研究所 GPT2 text automatic generation method based on Performer structure
CN113282707A (en) * 2021-05-31 2021-08-20 平安国际智慧城市科技股份有限公司 Data prediction method and device based on Transformer model, server and storage medium
CN113282707B (en) * 2021-05-31 2024-01-26 平安国际智慧城市科技股份有限公司 Data prediction method and device based on transducer model, server and storage medium

Also Published As

Publication number Publication date
WO2021212749A1 (en) 2021-10-28

Similar Documents

Publication Publication Date Title
CN111651992A (en) Named entity labeling method and device, computer equipment and storage medium
CN110377730B (en) Case-by-case classification method, apparatus, computer device, and storage medium
CN109992664B (en) Dispute focus label classification method and device, computer equipment and storage medium
CN110750965B (en) English text sequence labeling method, english text sequence labeling system and computer equipment
CN110704576B (en) Text-based entity relationship extraction method and device
Gupta et al. Integration of textual cues for fine-grained image captioning using deep CNN and LSTM
CN110427612B (en) Entity disambiguation method, device, equipment and storage medium based on multiple languages
CN112417887B (en) Sensitive word and sentence recognition model processing method and related equipment thereof
CN111814482B (en) Text key data extraction method and system and computer equipment
CN113849648A (en) Classification model training method and device, computer equipment and storage medium
CN111178358A (en) Text recognition method and device, computer equipment and storage medium
CN113742733A (en) Reading understanding vulnerability event trigger word extraction and vulnerability type identification method and device
CN115495553A (en) Query text ordering method and device, computer equipment and storage medium
CN113159013A (en) Paragraph identification method and device based on machine learning, computer equipment and medium
CN114647713A (en) Knowledge graph question-answering method, device and storage medium based on virtual confrontation
CN115587583A (en) Noise detection method and device and electronic equipment
Inunganbi et al. Handwritten Meitei Mayek recognition using three‐channel convolution neural network of gradients and gray
CN113536784A (en) Text processing method and device, computer equipment and storage medium
CN112183513A (en) Method and device for identifying characters in image, electronic equipment and storage medium
CN110929013A (en) Image question-answer implementation method based on bottom-up entry and positioning information fusion
CN113360644B (en) Text model retraining method, device, equipment and storage medium
CN113420116B (en) Medical document analysis method, device, equipment and medium
CN112989820A (en) Legal document positioning method, device, equipment and storage medium
CN117235234B (en) Object information acquisition method, device, computer equipment and storage medium
Yap et al. Enhancing BISINDO Recognition Accuracy Through Comparative Analysis of Three CNN Architecture Models

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code
Ref country code: HK
Ref legal event code: DE
Ref document number: 40033518
Country of ref document: HK

SE01 Entry into force of request for substantive examination