CN114298047A - Chinese named entity recognition method and system based on stroke convolution and word vectors - Google Patents
- Publication number
- CN114298047A (application CN202111641955.1A)
- Authority
- CN
- China
- Prior art keywords
- stroke
- vector
- word
- character
- chinese character
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Document Processing Apparatus (AREA)
Abstract
The invention provides a Chinese named entity recognition method and system based on stroke convolution and word vectors, which relate to the technical field of named entity recognition and comprise the following steps: acquiring a stroke sequence corresponding to each Chinese character in the text and a character feature vector of each Chinese character; inputting the stroke sequence into a stroke convolution neural network to obtain a stroke feature vector; setting a sliding window according to the maximum length of an entity in the text, and acquiring a word vector of each word in the sliding window through a self-attention mechanism; splicing the stroke feature vector, the word vector and the character feature vector of each Chinese character in the text, and inputting the spliced vector into a BiLSTM network to obtain the score of each Chinese character corresponding to each entity label; and determining an optimal entity label for each Chinese character in the text by means of a CRF model. The method takes into account the influence of a Chinese character's stroke sequence on the character, combines the stroke feature vector, the word vector and the character feature vector of each Chinese character, and then performs named entity recognition, thereby improving recognition performance.
Description
Technical Field
The invention relates to the technical field of named entity recognition, in particular to a Chinese named entity recognition method and system based on stroke convolution and word vectors.
Background
With the rapid development of internet technology, unstructured data is growing continuously, and the world is in a massive unstructured data era. How to efficiently manage data and extract effective information from unstructured data becomes a problem which needs to be solved urgently.
The purpose of Named Entity Recognition (NER) is to identify predefined named entities, such as person names, place names and organization names, in unstructured text; it is a basic core task for information retrieval and information extraction. Chinese NER is the branch of NER concerned with Chinese text, and it still faces many problems owing to the characteristics of Chinese characters. The main difficulties of Chinese NER are the following: 1) a Chinese character is often ambiguous, and its meaning may differ greatly between contexts; 2) unlike English text, Chinese text has no explicit entity boundary markers such as spaces; 3) research on Chinese NER started late, so labeled data sets are few and tend to cover a single domain.
Existing Chinese named entity recognition generally follows one of two approaches: word-based sequence labeling and character-based sequence labeling. A word-based labeling method first segments the text with a word segmentation tool and then performs entity recognition; because word boundaries are treated as entity boundaries, an error in the segmentation stage prevents the subsequent NER model from recognizing the entity correctly. A character-based sequence labeling method generally suffers from insufficient semantics, so the main question is how to make better use of word information. Some applicants introduce external vocabulary information on top of character-based sequence labeling and merge it into the vector representation at the input layer; this requires changing the model, and the external word vectors also lower training efficiency and ultimately reduce the accuracy of named entity recognition. Other applicants build an ELMo model based on stroke sequences alone on top of character-based sequence labeling, which falls short in the effectiveness and accuracy of named entity recognition.
Disclosure of Invention
In order to solve the above problems, the invention provides a Chinese named entity recognition method and system based on stroke convolution and word vectors.
In order to achieve the above object, the present invention provides a Chinese named entity recognition method based on stroke convolution and word vectors, comprising:
acquiring a stroke sequence corresponding to each Chinese character in the text and a character feature vector of each Chinese character;
inputting the stroke sequence into a stroke convolution neural network to obtain a stroke feature vector;
setting a sliding window according to the maximum length of the entity in the text, and acquiring a word vector of each word in the sliding window through a self-attention mechanism;
splicing the stroke feature vector, the word vector and the character feature vector of each Chinese character in the text, and inputting the spliced vector into a BiLSTM network to obtain the score of each Chinese character corresponding to each entity label;
and determining an optimal entity label for each Chinese character in the text by adopting a CRF model.
As a further improvement of the invention, a mapping table from Chinese characters to stroke sequences is constructed, and the stroke sequences corresponding to the Chinese characters are obtained through the mapping table.
As a further improvement of the present invention, the stroke convolution neural network convolves the stroke sequence by convolution kernels of different window sizes to obtain the stroke feature vector.
As a further improvement of the invention, the stroke convolution neural network obtains a stroke feature map through convolution with kernels of different window sizes, and performs maximum pooling and full connection on the feature map to obtain the stroke feature vector, wherein the formula is as follows:
wherein:
w represents the weights in convolutional neural network training;
$M_{t,t+k-1}$ represents the input features;
b represents the bias in convolutional neural network training.
As a further improvement of the invention, a classification loss function L(cls) is added in the stroke convolution neural network training process:
$$L(cls) = -\log P(z \mid X) = -\log \mathrm{softmax}(w \cdot s_{emb})$$
wherein:
X represents the input stroke sequence;
z represents the Chinese character label corresponding to the stroke sequence;
w represents a parameter in the network;
$s_{emb}$ represents the stroke feature vector.
As a further improvement of the present invention, acquiring a word vector of each word in the sliding window through a self-attention mechanism comprises:
calculating the similarity between every two words in the sliding window through the self-attention mechanism;
and acquiring the word vector of each word in the sliding window according to the similarity by means of a softmax function.
As a further improvement of the present invention,
for each Chinese character in the sliding window, generating a corresponding Query vector, a corresponding Key vector and a corresponding Value vector according to the character feature vector;
and calculating the dot product of the Query vector and the Key vector to obtain the score of each word, and multiplying the score by the Value vector of each word to obtain the word vector of the word in the sliding window.
As a further improvement of the present invention, determining an optimal entity label for each Chinese character in the text by means of the CRF model comprises:
defining the character sequence of the input text as $x = (x_1, x_2, \ldots, x_n)$ and the predicted tag sequence as $y = (y_1, y_2, \ldots, y_n)$;
defining the predicted score with which the BiLSTM network model labels the i-th character with the label $y_i$;
and taking the predicted tag sequence with the highest score as a final tag sequence, and acquiring the Chinese named entity according to the tag.
As a further improvement of the present invention, if the conditional probability of the predicted tag sequence with the highest score is also the highest, the predicted tag sequence with the highest score is taken as the final tag sequence.
The invention also provides a Chinese named entity recognition system based on stroke convolution and word vectors, which comprises a pre-preparation module, a stroke feature acquisition module, a word vector acquisition module, a label prediction module and an optimal label acquisition module;
the pre-preparation module is configured to:
acquiring a stroke sequence corresponding to each Chinese character in the text and a character feature vector of each Chinese character;
the stroke characteristic acquisition module is used for:
inputting the stroke sequence into a stroke convolution neural network to obtain a stroke feature vector;
the word vector acquisition module is configured to:
setting a sliding window according to the maximum length of the entity in the text, and acquiring a word vector of each word in the sliding window through a self-attention mechanism;
the label prediction module is configured to:
splicing the stroke feature vector, the word vector and the character feature vector of each Chinese character in the text, and inputting the spliced vector into a BiLSTM network to obtain the score of each Chinese character corresponding to each entity label;
the best tag obtaining module is configured to:
and determining an optimal entity label for each Chinese character in the text by adopting a CRF model.
Compared with the prior art, the invention has the beneficial effects that:
The invention takes the influence of a Chinese character's stroke sequence into account on the basis of the character-based sequence labeling method, combines the stroke feature vector, the word vector and the character feature vector of each Chinese character, and then performs named entity recognition, thereby improving recognition performance.
In obtaining the stroke feature vector, the method extracts the stroke features of a Chinese character by convolution, which suits the range of stroke counts of Chinese characters; meanwhile, convolution kernels of multiple window sizes are applied to the stroke sequence so as to obtain the most effective stroke feature vector.
In computing the word vector of a Chinese character, the word vector information in the sliding window is obtained through a self-attention mechanism, which compensates for the insufficient semantics and avoids the drop in prediction accuracy that occurs in the prior art when external vocabulary is introduced.
In the stroke convolution neural network training process, the classification loss function is added, so that the stroke convolution neural network training accuracy is improved.
Drawings
FIG. 1 is a flow chart of a Chinese named entity recognition method based on stroke convolution and word vectors according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a Chinese named entity recognition system based on stroke convolution and word vectors according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a stroke convolution neural network according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a model of a self-attention mechanism according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a bidirectional sequence model (BiLSTM) and a CRF model according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention is described in further detail below with reference to the attached drawing figures:
As shown in FIG. 1, the Chinese named entity recognition method based on stroke convolution and word vectors provided by the present invention includes:
S1, acquiring a stroke sequence corresponding to each Chinese character in the text and a character feature vector of each Chinese character;
wherein:
the stroke sequence of each Chinese character in the training set is acquired from a Chinese dictionary website during training, a mapping table from Chinese characters to stroke sequences is constructed, and the stroke sequence corresponding to each Chinese character in the text is obtained through the mapping table.
For example, as shown in FIG. 3, the stroke sequence obtained from the mapping table consists of basic strokes such as the left-falling stroke, the turning stroke and the horizontal stroke.
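The mapping table described above can be pictured as a plain dictionary lookup. The following is a minimal sketch, not the patent's implementation: the stroke names, the integer ids in `STROKE_IDS`, the two table entries and the function `strokes_for_text` are all hypothetical and chosen only for illustration.

```python
# Hypothetical stroke inventory: five basic stroke classes mapped to integer ids.
STROKE_IDS = {"heng": 1, "shu": 2, "pie": 3, "na": 4, "zhe": 5}

# Hypothetical mapping table from Chinese characters to stroke sequences.
# The described method builds this table from a Chinese dictionary website;
# here two entries are written out by hand for illustration.
STROKE_TABLE = {
    "大": ["heng", "pie", "na"],
    "人": ["pie", "na"],
}


def strokes_for_text(text, table=STROKE_TABLE, unknown=()):
    """Return, for each character in text, its sequence of stroke ids.

    Characters missing from the table get the `unknown` sequence
    (empty by default), which is an assumption of this sketch.
    """
    return [[STROKE_IDS[s] for s in table.get(ch, unknown)] for ch in text]
```

Each inner list is the integer-coded stroke sequence that would be passed on to the stroke convolution neural network in step S2.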
S2, inputting the stroke sequence into a stroke convolution neural network to obtain a stroke feature vector;
wherein:
as shown in FIG. 3, the stroke convolution neural network convolves the stroke sequence with convolution kernels of different window sizes to obtain a stroke feature map, and performs maximum pooling and full connection on the feature map to obtain the stroke feature vector, with the formula:
wherein:
w represents the weights in convolutional neural network training;
$M_{t,t+k-1}$ represents the input features;
b represents the bias in convolutional neural network training;
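The symbols w, M and b defined above fit a one-dimensional convolution that slides a kernel of window size k over the stroke embeddings and then max-pools over positions. A minimal sketch under that reading follows; the tanh activation, the zero padding of short stroke sequences and the function names are assumptions, and the fully connected layer is omitted.

```python
import math


def conv1d_max(stroke_embs, kernel, bias):
    """Slide one kernel of window size k over the stroke embeddings, then max-pool.

    stroke_embs: list of n stroke embedding vectors, each of length d
    kernel:      list of k weight vectors, each of length d (the weights w)
    bias:        scalar bias b
    """
    k, d = len(kernel), len(kernel[0])
    # Assumption: pad with zero vectors when the character has fewer than k strokes.
    padded = stroke_embs + [[0.0] * d] * max(0, k - len(stroke_embs))
    feats = []
    for t in range(len(padded) - k + 1):
        window = padded[t:t + k]  # the input features for positions t .. t+k-1
        z = bias + sum(w_j * m_j
                       for w_row, m_row in zip(kernel, window)
                       for w_j, m_j in zip(w_row, m_row))
        feats.append(math.tanh(z))  # tanh is an assumed activation
    return max(feats)  # max pooling over all window positions


def stroke_features(stroke_embs, kernels):
    """Apply kernels of several window sizes and collect one pooled feature each."""
    return [conv1d_max(stroke_embs, w, b) for w, b in kernels]
```

Using several kernels with different window sizes k yields one pooled value per kernel; stacking these values gives the vector that a fully connected layer would map to the stroke feature vector.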
In the invention, a classification loss function is added in the stroke convolution neural network training process to improve training accuracy, and the classification loss function is expressed as follows:
$$L(cls) = -\log P(z \mid X) = -\log \mathrm{softmax}(w \cdot s_{emb})$$
wherein:
X represents the input stroke sequence;
z represents the Chinese character label corresponding to the stroke sequence;
w represents a parameter in the network;
$s_{emb}$ represents the stroke feature vector.
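The classification loss can be read as a cross-entropy over candidate characters: the stroke feature vector is projected to one logit per character, and the loss is the negative log-probability of the true character. The sketch below assumes that reading; treating w as a matrix with one row per candidate character, and the name `classification_loss`, are assumptions.

```python
import math


def classification_loss(s_emb, W, z):
    """Compute -log softmax(W . s_emb)[z] for one character.

    s_emb: stroke feature vector of length d
    W:     one weight row per candidate character (classes x d), assumed shape
    z:     index of the true character
    """
    logits = [sum(w_j * s_j for w_j, s_j in zip(row, s_emb)) for row in W]
    m = max(logits)  # subtract the maximum for numerical stability
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_norm - logits[z]
```

With all-zero weights the predicted distribution is uniform, so the loss equals the logarithm of the number of candidate characters; training lowers it by raising the logit of the true character.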
S3, setting a sliding window according to the maximum length of an entity in the text, and acquiring a word vector of each word in the sliding window through a self-attention mechanism;
wherein:
the character-based sequence labeling method generally suffers from insufficient semantics; in order to make better use of word vector information, a self-attention (SA) mechanism is used to acquire the word vector information within a sliding window.
The maximum length of an entity in the training set is acquired during training and used as the size of the sliding window, and the similarity between every two characters in the sliding window is calculated through the self-attention mechanism; a softmax function is then applied to the similarities to obtain a word vector for each word in the sliding window.
Specifically, for each Chinese character in the sliding window, generating a corresponding Query vector, a corresponding Key vector and a corresponding Value vector according to the character feature vector;
and calculating the dot product of the Query vector and the Key vector to obtain the score of each word, and multiplying the score by the Value vector of each word to obtain the word vector of the word in the sliding window.
For example:
As shown in FIG. 4, if the text content is "Beijing City", $e_1$, $e_2$ and $e_3$ are the character feature vectors of the respective characters. A Query vector, a Key vector and a Value vector are generated for each character by multiplying its character feature vector $e_1$, $e_2$, $e_3$ by three weight matrices created in the training process. A score for each character is calculated through the dot product between the Query vector and the Key vector, and the score is then multiplied by the corresponding Value vector to obtain the word vector corresponding to each character in the sliding window, wherein the formula is as follows:
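One common form of this computation is dot-product attention with softmax-normalized scores. The sketch below applies it to the characters of a single sliding window; the division by the square root of the Key dimension, the helper names and the use of plain Python lists are assumptions, not details taken from the text.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def window_self_attention(char_vecs, Wq, Wk, Wv):
    """Self-attention over the character feature vectors of one sliding window."""
    Q = [matvec(Wq, e) for e in char_vecs]
    K = [matvec(Wk, e) for e in char_vecs]
    V = [matvec(Wv, e) for e in char_vecs]
    scale = math.sqrt(len(K[0]))  # scaling by sqrt(d_k) is an assumption
    out = []
    for q in Q:
        # Similarity of this character to every character in the window.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in K]
        weights = softmax(scores)
        # Weighted sum of Value vectors gives the word vector for this position.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

For a three-character window such as the example above, the function returns three vectors, one per character, each mixing information from all characters in the window.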
S4, splicing the stroke feature vector, the word vector and the character feature vector of each Chinese character in the text, and inputting the spliced vector into a BiLSTM network to obtain the score of each Chinese character corresponding to each entity label;
wherein:
the splicing is a direct concatenation along the vector dimension: if the stroke feature vector of a Chinese character has dimension 1 × 20, the word vector 1 × 30 and the character feature vector 1 × 60, the spliced feature vector has dimension 1 × 110.
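The dimension arithmetic of the splicing step can be checked with a few lines. This is a sketch with placeholder zero vectors; the function name `splice` is hypothetical.

```python
def splice(stroke_vec, word_vec, char_vec):
    """Concatenate the three per-character vectors in the order given in the text."""
    return list(stroke_vec) + list(word_vec) + list(char_vec)


# Placeholder vectors with the dimensions used in the example: 20, 30 and 60.
fused = splice([0.0] * 20, [0.0] * 30, [0.0] * 60)
```

The resulting vector has 20 + 30 + 60 = 110 components and is the per-character input to the BiLSTM network.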
BiLSTM (Bi-directional Long Short-Term Memory) is a bidirectional long short-term memory network. LSTM (Long Short-Term Memory) is an improved sequential network that alleviates the vanishing-gradient problem and makes effective use of long-distance information, but it captures sequence information in one direction only. Because context on both sides has an important influence on NER tasks, the present application adopts a BiLSTM network to acquire the context information;
as shown in FIG. 5, taking "beijing smith" as an example, the score of each character for each of multiple labels is obtained through forward LSTM calculation and reverse LSTM calculation, where the labels are preset and may include: address, time, person name, book name, etc.
S5, determining an optimal entity label for each Chinese character in the text by means of a CRF model.
Wherein:
adjacent tags in the NER task are strongly constrained: for example, a B-LOC tag (the start tag of an address) can only be followed by an I-LOC tag or an O tag, and not by other tags such as a B-PER tag (the start tag of a person name). Therefore, after sequence modeling by the BiLSTM network, a Conditional Random Field (CRF) is used to predict the tags of the entire sequence, specifically:
defining the character sequence of the input text as $x = (x_1, x_2, \ldots, x_n)$ and the predicted tag sequence as $y = (y_1, y_2, \ldots, y_n)$; $Y(x)$ represents the set of all possible tag sequences for the text;
defining the predicted score with which the BiLSTM network model labels the i-th character with the label $y_i$;
and taking the predicted tag sequence with the highest score as the final tag sequence, and acquiring the Chinese named entities according to the tags.
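The adjacency constraint described for the B-LOC tag can be written as a small predicate. This sketch encodes the rule exactly as the text states it for B- tags and adds, as an assumption, the usual rule that an I- tag may only continue an entity of the same type; a CRF typically learns such preferences as transition scores rather than hard rules, and the function name `allowed` is hypothetical.

```python
def allowed(prev_tag, next_tag):
    """Return True if next_tag may directly follow prev_tag."""
    if prev_tag.startswith("B-"):
        # As stated in the text: B-X is followed only by I-X or O.
        return next_tag in ("I-" + prev_tag[2:], "O")
    if next_tag.startswith("I-"):
        # Assumption: I-X can only continue an entity of the same type X.
        return prev_tag == "I-" + next_tag[2:]
    # O and B-* may follow O or any I-* tag.
    return True
```

A decoder could use such a predicate to assign a very low transition score to forbidden tag pairs.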
Further, a loss function may be set, such as:
If the conditional probability of the predicted tag sequence with the highest score is also the maximum, the predicted tag sequence with the highest score is taken as the final tag sequence.
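A common choice of loss for a BiLSTM-CRF, assumed here, is the negative log conditional probability of the gold tag sequence, where the score of a sequence is the sum of its per-character scores and of tag-to-tag transition scores. The transition matrix, the function names and the brute-force enumeration over all tag sequences (viable only for tiny examples) are assumptions of this sketch.

```python
import itertools
import math


def sequence_score(emissions, transitions, tags):
    """Sum of per-character scores plus transition scores along the tag sequence.

    emissions[i][t]:   score of tag t for character i (from the BiLSTM)
    transitions[a][b]: assumed score for moving from tag a to tag b
    """
    score = sum(emissions[i][t] for i, t in enumerate(tags))
    score += sum(transitions[a][b] for a, b in zip(tags, tags[1:]))
    return score


def neg_log_likelihood(emissions, transitions, gold):
    """-log p(gold | x), normalizing over every possible tag sequence."""
    n_tags = len(transitions)
    all_scores = [
        sequence_score(emissions, transitions, y)
        for y in itertools.product(range(n_tags), repeat=len(emissions))
    ]
    m = max(all_scores)  # log-sum-exp with the maximum subtracted for stability
    log_z = m + math.log(sum(math.exp(s - m) for s in all_scores))
    return log_z - sequence_score(emissions, transitions, gold)
```

Because the normalizer is shared by all sequences, the sequence with the highest score is also the one with the highest conditional probability, which is the relationship the paragraph above relies on.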
Finally, the optimal label sequence is found through a Viterbi algorithm, and the formula is as follows:
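Viterbi decoding finds the highest-scoring tag sequence by dynamic programming instead of enumerating every sequence. The sketch below assumes per-character scores from the BiLSTM and a tag-to-tag transition matrix; the names and the absence of start and end transitions are assumptions.

```python
def viterbi(emissions, transitions):
    """Return (best_path, best_score) over all tag sequences.

    emissions[i][t]:   score of tag t for character i
    transitions[a][b]: assumed score for moving from tag a to tag b
    """
    n_tags = len(transitions)
    best = list(emissions[0])  # best score of any path ending in each tag
    back = []                  # back-pointers, one list per later position
    for emit in emissions[1:]:
        prev_idx, new_best = [], []
        for cur in range(n_tags):
            # Best previous tag for reaching tag `cur` at this position.
            p = max(range(n_tags), key=lambda q: best[q] + transitions[q][cur])
            prev_idx.append(p)
            new_best.append(best[p] + transitions[p][cur] + emit[cur])
        back.append(prev_idx)
        best = new_best
    last = max(range(n_tags), key=lambda t: best[t])
    path = [last]
    for prev_idx in reversed(back):  # follow back-pointers from the end
        path.append(prev_idx[path[-1]])
    path.reverse()
    return path, best[last]
```

The returned path holds one tag index per character; mapping the indices back to labels such as B-LOC or O yields the recognized entities.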
As shown in FIG. 2, the present invention further provides a Chinese named entity recognition system based on stroke convolution and word vectors, which includes a pre-preparation module, a stroke feature acquisition module, a word vector acquisition module, a label prediction module, and an optimal label acquisition module;
a pre-preparation module to:
acquiring a stroke sequence corresponding to each Chinese character in the text and a character feature vector of each Chinese character;
a stroke characteristic acquisition module for:
inputting the stroke sequence into a stroke convolution neural network to obtain a stroke feature vector;
a word vector acquisition module to:
setting a sliding window according to the maximum length of an entity in the text, and acquiring a word vector of each word in the sliding window through a self-attention mechanism;
a label prediction module to:
splicing the stroke feature vector, the word vector and the character feature vector of each Chinese character in the text, and inputting the spliced vector into a BiLSTM network to obtain the score of each Chinese character corresponding to each entity label;
a best label acquisition module to:
and determining an optimal entity label for each Chinese character in the text by adopting a CRF model.
The invention has the advantages that:
The invention takes the influence of a Chinese character's stroke sequence into account on the basis of the character-based sequence labeling method, combines the stroke feature vector, the word vector and the character feature vector of each Chinese character, and then performs named entity recognition, thereby improving recognition performance.
In obtaining the stroke feature vector, the method extracts the stroke features of a Chinese character by convolution, which suits the range of stroke counts of Chinese characters; meanwhile, convolution kernels of multiple window sizes are applied to the stroke sequence so as to obtain the most effective stroke feature vector.
In computing the word vector of a Chinese character, the word vector information in the sliding window is obtained through a self-attention mechanism, which compensates for the insufficient semantics and avoids the drop in prediction accuracy that occurs in the prior art when external vocabulary is introduced.
In the stroke convolution neural network training process, the classification loss function is added, so that the stroke convolution neural network training accuracy is improved.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A Chinese named entity recognition method based on stroke convolution and word vectors, characterized by comprising the following steps:
acquiring a stroke sequence corresponding to each Chinese character in the text and a character feature vector of each Chinese character;
inputting the stroke sequence into a stroke convolution neural network to obtain a stroke feature vector;
setting a sliding window according to the maximum length of the entity in the text, and acquiring a word vector of each word in the sliding window through a self-attention mechanism;
splicing the stroke feature vector, the word vector and the character feature vector of each Chinese character in the text, and inputting the spliced vector into a BiLSTM network to obtain the score of each Chinese character corresponding to each entity label;
and determining an optimal entity label for each Chinese character in the text by adopting a CRF model.
2. The method of claim 1, wherein a mapping table from Chinese characters to stroke sequences is constructed, and the stroke sequence corresponding to each Chinese character is acquired through the mapping table.
3. The method of claim 1, wherein the stroke convolution neural network convolves the stroke sequence with convolution kernels of different window sizes to obtain the stroke feature vector.
4. The method of claim 3, wherein the stroke convolution neural network obtains a stroke feature map through convolution with kernels of different window sizes, and performs maximum pooling and full connection on the feature map to obtain the stroke feature vector, and the formula is as follows:
wherein:
w represents the weights in convolutional neural network training;
$M_{t,t+k-1}$ represents the input features;
b represents the bias in convolutional neural network training.
5. The method of claim 1, wherein a classification loss function L(cls) is added in the stroke convolution neural network training process:
$$L(cls) = -\log P(z \mid X) = -\log \mathrm{softmax}(w \cdot s_{emb})$$
wherein:
X represents the input stroke sequence;
z represents the Chinese character label corresponding to the stroke sequence;
w represents a parameter in the network;
$s_{emb}$ represents the stroke feature vector.
6. The method of claim 1, wherein acquiring a word vector of each word in the sliding window through a self-attention mechanism comprises:
calculating the similarity between every two words in the sliding window through the self-attention mechanism;
and acquiring a word vector of each word in the sliding window according to the similarity by means of a softmax function.
7. The method of claim 6, wherein the method comprises:
for each Chinese character in the sliding window, generating a corresponding Query vector, a corresponding Key vector and a corresponding Value vector according to the character feature vector;
and calculating the dot product of the Query vector and the Key vector to obtain the score of each word, and multiplying the score by the Value vector of each word to obtain the word vector of the word in the sliding window.
8. The method of claim 1, wherein determining an optimal entity label for each Chinese character in the text by means of the CRF model comprises:
defining the character sequence of the input text as $x = (x_1, x_2, \ldots, x_n)$ and the predicted tag sequence as $y = (y_1, y_2, \ldots, y_n)$;
defining the predicted score with which the BiLSTM network model labels the i-th character with the label $y_i$;
and taking the predicted tag sequence with the highest score as a final tag sequence, and acquiring the Chinese named entity according to the tag.
9. The method of claim 8, wherein, if the conditional probability of the predicted tag sequence with the highest score is also the maximum, the predicted tag sequence with the highest score is taken as the final tag sequence.
10. A system for implementing the Chinese named entity recognition method according to any one of claims 1 to 9, comprising a pre-preparation module, a stroke feature acquisition module, a word vector acquisition module, a label prediction module, and an optimal label acquisition module;
the pre-preparation module is configured to:
acquiring a stroke sequence corresponding to each Chinese character in the text and a character feature vector of each Chinese character;
the stroke characteristic acquisition module is used for:
inputting the stroke sequence into a stroke convolution neural network to obtain a stroke feature vector;
the word vector acquisition module is configured to:
setting a sliding window according to the maximum length of the entity in the text, and acquiring a word vector of each word in the sliding window through a self-attention mechanism;
the label prediction module is configured to:
splicing the stroke feature vector, the word vector and the character feature vector of each Chinese character in the text, and inputting the spliced vector into a BiLSTM network to obtain the score of each Chinese character corresponding to each entity label;
the best tag obtaining module is configured to:
and determining an optimal entity label for each Chinese character in the text by adopting a CRF model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111641955.1A CN114298047A (en) | 2021-12-29 | 2021-12-29 | Chinese named entity recognition method and system based on stroke convolution and word vectors |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111641955.1A CN114298047A (en) | 2021-12-29 | 2021-12-29 | Chinese named entity recognition method and system based on stroke convolution and word vectors |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114298047A true CN114298047A (en) | 2022-04-08 |
Family
ID=80972401
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111641955.1A Pending CN114298047A (en) | Chinese named entity recognition method and system based on stroke convolution and word vectors | 2021-12-29 | 2021-12-29 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114298047A (en) |
-
2021
- 2021-12-29 CN CN202111641955.1A patent/CN114298047A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114757184A (en) * | 2022-04-11 | 2022-07-15 | 中国航空综合技术研究所 | Method and system for realizing knowledge question answering in aviation field |
CN114757184B (en) * | 2022-04-11 | 2023-11-10 | 中国航空综合技术研究所 | Method and system for realizing knowledge question and answer in aviation field |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021147726A1 (en) | Information extraction method and apparatus, electronic device and storage medium | |
CN109753660B (en) | LSTM-based winning bid web page named entity extraction method | |
CN111160031A (en) | Social media named entity identification method based on affix perception | |
CN112541355B (en) | Entity boundary type decoupling few-sample named entity recognition method and system | |
CN108287911B (en) | Relation extraction method based on constrained remote supervision | |
CN113591483A (en) | Document-level event argument extraction method based on sequence labeling | |
CN113392209B (en) | Text clustering method based on artificial intelligence, related equipment and storage medium | |
CN111738169B (en) | Handwriting formula recognition method based on end-to-end network model | |
CN111475622A (en) | Text classification method, device, terminal and storage medium | |
CN114330354B (en) | Event extraction method and device based on vocabulary enhancement and storage medium | |
CN113177412A (en) | Named entity identification method and system based on bert, electronic equipment and storage medium | |
CN110276396B (en) | Image description generation method based on object saliency and cross-modal fusion features | |
CN108509423A (en) | A kind of acceptance of the bid webpage name entity abstracting method based on second order HMM | |
CN114091450A (en) | Judicial domain relation extraction method and system based on graph convolution network | |
CN115565177A (en) | Character recognition model training method, character recognition device, character recognition equipment and medium | |
CN113076758B (en) | Task-oriented dialog-oriented multi-domain request type intention identification method | |
CN113191150B (en) | Multi-feature fusion Chinese medical text named entity identification method | |
CN113360654B (en) | Text classification method, apparatus, electronic device and readable storage medium | |
CN114417874A (en) | Chinese named entity recognition method and system based on graph attention network | |
CN112699685B (en) | Named entity recognition method based on label-guided word fusion | |
CN114298047A (en) | Chinese named entity recognition method and system based on stroke convolution and word vectors | |
CN113076744A (en) | Cultural relic knowledge relation extraction method based on convolutional neural network | |
Li et al. | Review network for scene text recognition | |
CN115186670B (en) | Method and system for identifying domain named entities based on active learning | |
CN111737470A (en) | Text classification method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||