CN109800437B - Named entity recognition method based on feature fusion - Google Patents


Info

Publication number
CN109800437B
CN109800437B (application CN201910099671.0A)
Authority
CN
China
Prior art keywords
word
features
concept
feature
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910099671.0A
Other languages
Chinese (zh)
Other versions
CN109800437A (en)
Inventor
赵青
王丹
杜金莲
付利华
苏航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN201910099671.0A priority Critical patent/CN109800437B/en
Publication of CN109800437A publication Critical patent/CN109800437A/en
Application granted granted Critical
Publication of CN109800437B publication Critical patent/CN109800437B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Machine Translation (AREA)

Abstract

A named entity recognition method based on feature fusion, belonging to the field of computers. Text features of different granularities, namely conceptual features and non-conceptual word features, are extracted and fused, which improves the accuracy of named entity recognition and reduces the amount of computation. The method comprises a data preprocessing module, a feature construction module, a named entity network model training module and a named entity classifier module; the feature construction module comprises four sub-modules: semantic feature extraction, word feature extraction, character feature extraction and feature fusion. The method uses the sequential memory of an LSTM (Long Short-Term Memory) or GRU (Gated Recurrent Unit) neural network to take the context information of the named entity task into account, and finally predicts the entity category label with softmax. During model construction, sparsely labeled data can be used as the training set and the LSTM and GRU neural network models can be compared, ensuring that the entity recognition task achieves a satisfactory result.

Description

Named entity recognition method based on feature fusion
Technical Field
The invention belongs to the field of computers and relates to a named entity recognition method based on feature fusion.
Background
In recent years, with the wide application of artificial intelligence in natural language processing (NLP), efforts to mine domain knowledge have been increasing. Named entity recognition is a fundamental and vital step in building domain knowledge; it is required, for example, in knowledge graph construction, text retrieval, text classification and information extraction.
Named entity recognition (NER) can be seen as a sequence labeling task in which entities are located in text and assigned to a fixed set of categories. The two main approaches to the traditional NER problem are rule-based methods and supervised learning methods, with supervised learning dominating. Both assume, when finding tag sequences for candidate entities in a document, that the available training data is fully annotated (i.e., every entity contained in the document is marked). However, producing fully annotated training data in today's big-data era is very time-consuming and labor-intensive, and because of the specificity of most domain terms, the named entity recognition task still faces the following challenges: (1) most information in real life is semi-structured or unstructured, and much of it is descriptive, lacks structural information, and is thus ill-suited to knowledge discovery and extraction; (2) domain entities are structurally complex and the same concept has multiple expressions, for example in the medical domain chronic obstructive pulmonary disease may be abbreviated as COPD; (3) named entities are typically made up of multiple words, and considering only word features splits their semantic information. For these reasons, traditional named entity recognition methods are difficult to adapt to today's application scenarios.
At present, with the excellent performance of deep learning in many fields, it is applied more and more to the named entity recognition task, and deep learning methods perform better than traditional ones. However, most NER methods combined with deep learning target English, or rely on word vectors and character vectors without considering conceptual features.
In 2016, the paper "Neural Architectures for Named Entity Recognition" by Guillaume Lample et al., published at ACL, proposed a named entity recognition method combining a recurrent neural network (RNN) with a conditional random field (CRF) for recognizing English person names, place names, etc.; the method extracts word features and character features through the RNN and finally classifies the entities through the CRF.
In 2017, the paper "Chemical drug named entity recognition based on an attention mechanism" by Yang Pei et al., published in the Journal of Computer Research and Development, proposed an entity recognition method based on word features combined with an attention mechanism, in which an entity recognition classifier is trained with an LSTM (Long Short-Term Memory) network and the final entity tag classification result is generated with a CRF.
Although the above methods can complete the named entity recognition task, they all assume that no domain knowledge is available and learn features only from the training set. In real life, however, most domains possess partial domain knowledge; although imperfect, it can help recognize named entities better in sparse data and can, to a certain extent, reduce the huge amount of computation caused by inconsistent expression.
Disclosure of Invention
A named entity recognition method based on feature fusion is provided, comprising the following steps:
(1) A named entity recognition method based on feature fusion is provided; according to the concepts contained in the domain ontology, new words in a sparsely labeled corpus can be predicted, and entities that are expressed inconsistently but share the same concept can be given a unified expression, so accuracy is improved and computation cost is reduced.
(2) First, a CBOW model is used to extract semantic features from the preprocessed data; the semantic features comprise conceptual features and non-conceptual word features. From concepts, concept, word and character features are extracted; from non-concept words, word and character features are extracted directly.
(3) Second, feature fusion is performed on the extracted new feature set; it likewise comprises two parts, concept-based feature fusion and non-concept-word-based feature fusion, and the dimension of the concept features is reduced by calculating concept similarity.
(4) The sequential-memory characteristic of an LSTM or GRU (Gated Recurrent Unit) neural network model is used to extract the context information relevant to the named entity, with the new feature set serving as the input of the training model.
The principle of the invention: the named entity recognition method based on feature fusion not only adopts traditional word vector features and character vector features, but also considers the concept features and character position features contained in words. The concept features can reduce the dimension of the word vectors; according to the concepts contained in the domain ontology, new words can to a certain extent be predicted in a sparsely labeled corpus; finally, the neural network LSTM or GRU attends to context information, so the accuracy of named entity recognition can be improved considerably.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
a named entity identification method based on feature fusion comprises the following steps: the system comprises a data preprocessing module, a feature construction module, a named entity network model training module and a named entity classifier module. The feature construction module mainly extracts and fuses text features with different granularities, and specifically comprises four sub-modules, namely a semantic feature extraction module, a word feature extraction module, a character feature extraction module and a feature fusion module.
Semantic feature extraction module: semantic features comprise two parts, conceptual features and non-conceptual word features. A concept is a special domain term composed of multiple independent words that each carry semantics, e.g., chronic obstructive pulmonary disease; a non-conceptual word is a single-semantic vocabulary item, e.g., difficulty. For words that can be mapped to concepts in the domain ontology, concept features are extracted, since the word features of a concept cannot be extracted directly; finally the semantic features are extracted with a CBOW model.
Word feature extraction module: since a concept is composed of multiple words (for example, chronic pulmonary heart disease), the meaning of the concept is determined by the words it contains. To preserve the integrity of semantic information, extraction is divided into two cases, concept-based word feature extraction and non-concept-word feature extraction; the latter uses the same CBOW model as semantic feature extraction.
Character feature extraction module: the character is the smallest semantic unit of Chinese and also carries semantic information; the meaning of a word is determined by the characters it contains, and character-level semantics can to some extent predict new words, which helps infer entity categories. For example, the sum of the vectors of the two characters composing the word "pain" is close to the word vector of "pain" itself. Meanwhile, character position is critical: the same character in different positions can give two words completely different meanings. Therefore, to improve the accuracy of entity recognition, the method considers not only character features but also character position features.
Feature fusion module: the extracted conceptual features, word features and character features are fused into a new feature set. A new fusion method is proposed that considers two cases: for words whose concept can be extracted from the domain ontology, concept, word and character features are fused; for words whose concept cannot be extracted from the ontology, word features are extracted directly and fused with character features. Finally, feature dimensionality reduction is performed on the extracted conceptual features through the domain ontology, so the amount of computation is reduced while the accuracy of named entity recognition is improved; the fused features are used as the input of the model for training.
By extracting text features of different granularities and proposing a novel feature fusion method, the invention can fully learn the semantic information contained in the text and can address both the ambiguity of domain terms and the huge amount of computation caused by inconsistent expression.
Drawings
FIG. 1 is a diagram of an overall architecture of a named entity recognition method based on feature fusion;
FIG. 2 is a flow chart of a named entity recognition method based on feature fusion.
Detailed Description
Features and exemplary embodiments of various aspects of the invention are described in detail below.
The method of multi-granularity feature extraction and feature fusion is used for named entity recognition and is expected to improve recognition accuracy while reducing the amount of computation. The overall architecture, shown in FIG. 1, is divided into a data preprocessing module (1), a feature construction module (2), a named entity network model training module (3) and a named entity classifier module (4). The specific process flow is shown in FIG. 2.
Data preprocessing module (1): first, unlabeled data is added to the labeled training set to form a sparsely labeled corpus, and the domain ontology is loaded; second, the whole sparsely labeled corpus is segmented into shorter Chinese character strings at special symbols (punctuation marks, digits and space characters), and stop words are removed.
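The preprocessing step can be sketched in Python; the stop-word list, the splitting pattern and the sample sentence below are illustrative assumptions, not specified by the patent:

```python
import re

# A hypothetical stop-word list; the patent does not specify one.
STOP_WORDS = {"的", "了", "和"}

def preprocess(corpus: str) -> list[str]:
    """Split text into shorter strings at punctuation marks, digits and
    space characters, then drop stop words, as module (1) describes."""
    pieces = re.split(r"[0-9\s,.;:!?，。；：！？、()（）]+", corpus)
    return [p for p in pieces if p and p not in STOP_WORDS]

chunks = preprocess("患者有慢性阻塞性肺疾病, 病程3年。")
print(chunks)  # ['患者有慢性阻塞性肺疾病', '病程', '年']
```

The splitting pattern can be extended with whatever "special symbols" the corpus actually contains.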
Feature construction module (2): the module mainly extracts features with different granularities from the text and fuses the extracted features. More specifically, the method can be divided into semantic feature extraction, word feature extraction, character feature extraction and feature fusion.
Semantic feature extraction module (21): map the segmented character strings L = (L_1, …, L_n) to the ontology O and use the maximum matching method to find the length L_max of the longest initial matching semantic unit contained in each string (if L_max equals the string length L_len, the whole string is one semantic unit). L_max is then extracted from L, the segments on both sides of L_max become new character strings, and all segmented strings are defined as a semantic set {Y_1, …, Y_N} ∈ D, which contains a concept set and a non-concept vocabulary set {G_1, …, G_U} ∪ {F_1, …, F_V} ∈ Y. Semantic features are then extracted through a CBOW model, whose training goal is to maximize the following average log probability:

(1/N) Σ_{i=K}^{N-K} log Pr(Y_i | Y_{i-K}, …, Y_{i+K})

wherein K is the size of the context window of the target word in the data set D, and Y_i is a semantic unit in the data set D.
In CBOW, probability Pr (Y i |Y i-K ,...,Y i+K ) Is calculated by the following formula:
wherein y is 0 And y i For target semantics Y i Vector representations of input and output, and y 0 For the average vector representation of all contexts, W is the semantic dictionary.
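The maximum matching step of module (21) can be illustrated as follows; the toy ontology, its concept strings, and the character-by-character handling of non-concept text are assumptions made for this sketch:

```python
# Hypothetical mini-ontology; the patent uses a real domain ontology O.
ONTOLOGY = {"慢性阻塞性肺疾病", "肺疾病", "呼吸困难"}
MAX_LEN = max(len(c) for c in ONTOLOGY)

def max_match(s: str):
    """Forward maximum matching: at each position take the longest
    substring found in the ontology (a concept); otherwise emit a
    single character belonging to a non-concept segment."""
    out, i = [], 0
    while i < len(s):
        for L in range(min(MAX_LEN, len(s) - i), 0, -1):
            cand = s[i:i + L]
            if L > 1 and cand in ONTOLOGY:
                out.append((cand, "concept"))
                i += L
                break
        else:  # no ontology concept starts here
            out.append((s[i], "non-concept"))
            i += 1
    return out

print(max_match("患者有慢性阻塞性肺疾病"))
```

A production version would also merge adjacent non-concept characters back into strings before feeding them to the CBOW model.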
Word feature extraction module (22): word feature is divided into two cases, concept-based word feature extraction and non-concept word-based feature extraction.
Feature extraction based on concept words: since a concept is typically composed of multiple words, G = {C_1, …, C_N}, the meaning of the concept is determined by the words it contains, so the method extracts word features on the basis of the concept features. The specific formula is:

P_i = g_i + (1/g_n) Σ_{j=1}^{g_n} c_j

wherein g_i is the concept vector of concept G_i, c_j is the j-th word vector in G_i, g_n is the number of words contained in concept G_i, + is the vector addition operation, and P_i is obtained by adding the concept vector and the average word vector. According to past experimental experience, addition is simpler and faster than concatenation without losing accuracy, so vector addition is used in the following methods as well.
Feature extraction based on non-concept words employs the CBOW model of the semantic feature extraction module (21).
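The concept-based word feature P_i above (concept vector plus average word vector) is plain vector addition; the embedding dimension and random vectors below are placeholders for pretrained CBOW vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # toy embedding dimension (assumed)

# Hypothetical pretrained vectors: one concept vector g_i and the
# word vectors c_1..c_gn of the g_n words the concept contains.
g_i = rng.normal(size=dim)             # concept vector of G_i
word_vecs = rng.normal(size=(3, dim))  # c_j, j = 1..g_n with g_n = 3

# P_i = g_i + (1/g_n) * sum_j c_j : concept vector + average word vector.
P_i = g_i + word_vecs.mean(axis=0)
print(P_i.shape)  # (8,)
```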
Character feature extraction module (23): character features are also classified into two cases, conceptual word-based character feature extraction and non-conceptual word-based character feature extraction.
Character feature extraction based on concept words: character features are extracted on the basis of the extracted concept-and-word feature P_i. The specific formula is:

Q_i = P_i + (1/c_n) Σ_{k=1}^{c_n} z_k

wherein z_k is the k-th character vector of the concept word C_i, c_n is the number of characters contained in C_i, + is the vector addition operation, and Q_i is the sum of the concept vector, the average word vector and the average character vector. The character feature formula based on non-concept word features is:

w′_i = w_i + (1/f_n) Σ_{m=1}^{f_n} d_m

wherein w_i is the word vector representation of the non-concept word F_i, f_n is the number of characters in F_i, d_m is the m-th character vector, + is the vector addition operation, and w′_i is obtained by adding the non-concept word vector and the average character vector.
Since the meaning of a Chinese word generally depends on the positions of its characters (the same character expresses different meanings in different positions), extracting character position features allows the semantic information of a word to be inferred more accurately. Each character is marked B (beginning), I (middle) or E (end), which can be expressed as:

C_i^B = z_1,  C_i^I = (1/(c_n - 2)) Σ_{k=2}^{c_n-1} z_k,  C_i^E = z_{c_n}

wherein z_k is the k-th character vector of the word C_i and c_n is the number of characters it contains. The same expressions are used to extract the character position features of non-concept words.
Feature fusion (24): based on the feature extraction work, the feature fusion part is also considered in two cases, namely a feature fusion method based on concept and a feature fusion method based on non-concept words. The method fuses the extracted new feature sets through vector addition operation, mainly considers that the concept features are very important as word features in the named entity recognition task based on the partial domain ontology, and can directly extract the named entities with partial labels in the sparse labeled corpus, thereby reducing the calculated amount.
The concept-based feature fusion method fuses the extracted conceptual features, word features, character features and character position features:

V_i = Q_i + C_i^B + C_i^I + C_i^E

The non-concept-word feature fusion method fuses the extracted word features, character features and character position features:

U_i = w′_i + F_i^B + F_i^I + F_i^E

wherein Q_i is the concept-word-character feature, w′_i is the non-concept word vector plus the average character vector, f_n is the number of characters contained in the word F_i, F_i^B is the first character feature of F_i, F_i^I is the middle character feature, F_i^E is the last character feature, C_i^B, C_i^I and C_i^E are the corresponding position features of a concept word C_i, and V_i and U_i are the fused feature vectors.
Chinese domain terms are generally expressed inconsistently; especially in the medical domain, medical terms for the same concept have multiple expressions, for example chronic obstructive pulmonary disease may also be expressed as COPD. As data grow, this brings a huge amount of computation, so a method of calculating concept-feature similarity based on the ontology is adopted to reduce the dimension of the concept vectors. The rule is:

if R(g_i, o_i) > α and R(g_m, o_i) > α, then g_i and g_m are regarded as the same ontology concept o_i

wherein o_i is a concept feature in the ontology, g_i and g_m are concept features identified in the data set D, R(·,·) is the cosine similarity, and α is the similarity threshold. According to previous experiments, a threshold set too small easily causes false merges and one set too large easily causes misses, so the normal similarity threshold lies between 0.87 and 0.93; the recommended initial threshold is 0.9. The error is then minimized by gradient descent, i.e., the descending slope of the error function is calculated smoothly and continuously, the gradient becoming smaller as the error function approaches its minimum, and adjusting the step size reduces the risk of overshooting. In experiments the step size can be set to 0.01 and the threshold adjusted within the range 0.87 to 0.93 until the gradient reaches its minimum, which gives the optimal similarity threshold.
More specifically, the concept features are mapped to the domain ontology O. If two concepts g_i and g_m are close to an ontology concept o_i, the cosine similarity of g_i and g_m to o_i is calculated; if it is less than the similarity threshold α, g_i and g_m are each treated as independent concepts in the ontology; if it is greater than α, g_i and g_m can be considered the same concept, and g_i can be replaced by g_m or g_m by g_i, thereby reducing the dimension of the conceptual features and the amount of computation.
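The ontology-based concept merging can be sketched as below; the threshold α = 0.9 follows the recommended initial value in the text, while the toy 2-dimensional vectors are assumptions:

```python
import numpy as np

def cos(a, b):
    """Cosine similarity R(a, b)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def merge_concepts(o_i, g_i, g_m, alpha=0.9):
    """If both identified concept vectors are within cosine similarity
    alpha of the same ontology concept o_i, treat them as one concept
    (keep a single vector), reducing the concept-feature dimension;
    otherwise keep them as independent concepts."""
    if cos(g_i, o_i) > alpha and cos(g_m, o_i) > alpha:
        return [g_i]
    return [g_i, g_m]

o = np.array([1.0, 0.0])    # ontology concept o_i
a = np.array([0.99, 0.05])  # e.g. "慢性阻塞性肺疾病"
b = np.array([0.98, 0.08])  # e.g. "COPD"
print(len(merge_concepts(o, a, b)))  # 1
```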
Named entity network model training module (3): the fused features are used as the input of the model for training. Since named entity recognition is a sequence labeling task, context information is very important, so a neural network LSTM or GRU model with sequential memory is adopted as the training model. The specific formulas of the LSTM are as follows:
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)

f_t = σ(W_f x_t + U_f h_{t-1} + b_f)

o_t = σ(W_o x_t + U_o h_{t-1} + b_o)

c_t = f_t ⊙ c_{t-1} + i_t ⊙ tanh(W_c x_t + U_c h_{t-1} + b_c)

h_t = o_t ⊙ tanh(c_t)

wherein i_t, f_t and o_t denote the input, forget and output gates at time node t, σ is a nonlinear (sigmoid) function, and ⊙ denotes element-wise multiplication. The parameters of each control gate consist of two matrices and a bias vector, so the matrix parameters of the three control gates are W_i, U_i, W_f, U_f, W_o, U_o, and the bias parameters are b_i, b_f, b_o. The memory cell parameters of the LSTM are W_c, U_c and b_c. These parameters are updated and stored at each training step.
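A minimal NumPy sketch of one LSTM time step using exactly the parameters named above (W_i, U_i, b_i, etc.); the dimensions and random initialisation are illustrative, not the patent's trained model:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step implementing the gate equations above;
    p holds W_*, U_* and b_* for the i, f, o gates and memory cell c."""
    i_t = sigmoid(p["Wi"] @ x_t + p["Ui"] @ h_prev + p["bi"])
    f_t = sigmoid(p["Wf"] @ x_t + p["Uf"] @ h_prev + p["bf"])
    o_t = sigmoid(p["Wo"] @ x_t + p["Uo"] @ h_prev + p["bo"])
    c_t = f_t * c_prev + i_t * np.tanh(p["Wc"] @ x_t + p["Uc"] @ h_prev + p["bc"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

rng = np.random.default_rng(3)
d_in, d_h = 6, 4  # fused-feature and hidden dimensions (assumed)
p = {f"W{g}": rng.normal(size=(d_h, d_in)) for g in "ifoc"}
p.update({f"U{g}": rng.normal(size=(d_h, d_h)) for g in "ifoc"})
p.update({f"b{g}": np.zeros(d_h) for g in "ifoc"})

h, c = np.zeros(d_h), np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):  # a 5-step fused-feature sequence
    h, c = lstm_step(x_t, h, c, p)
print(h.shape)  # (4,)
```

In training, these parameters would be updated by backpropagation through time rather than fixed random values.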
Named entity classifier module (4): the final entity tag classification result is generated by the softmax classifier of the neural network LSTM or GRU model.

Claims (1)

1. A named entity recognition method based on feature fusion is characterized by comprising the following four modules: the system comprises a data preprocessing module (1), a feature construction module (2), a training named entity network model module (3) and a named entity classifier module (4);
(1) Data preprocessing module
Unlabeled data is added to the labeled training set to form a sparsely labeled corpus, and the domain ontology is loaded; the text to be processed is segmented into Chinese character strings at punctuation marks, digits and space characters, and stop words are removed;
(2) Feature construction module
The module is divided into feature extraction and feature fusion, and is specifically divided into four sub-modules: semantic feature extraction, word feature extraction, character feature extraction and feature fusion;
(3) Training named entity network model module
The fused features are trained as the input of the model; since named entity recognition is a sequence labeling task, context information is extracted to assist in inferring entity categories, so a neural network model LSTM or GRU with sequential memory is adopted as the training model;
(4) Named entity classifier module
Generating a final entity tag classification result according to a softmax classifier of a neural network LSTM or GRU model;
step (2), the concrete steps are as follows:
Semantic feature extraction (21): semantic features contain two parts: conceptual features and non-conceptual word features; a concept is a special domain term composed of multiple independent words that carry semantics; a non-concept word is a single-semantic vocabulary item; for words that can be mapped to concepts in the domain ontology, concept features are extracted, since the word features of a concept cannot be extracted directly;
The preprocessed corpus is first mapped to the domain ontology, and the data is segmented by the maximum matching method into a semantic set {Y_1, …, Y_N} ∈ D, which contains a concept set and a non-concept word set {G_1, …, G_U} ∪ {F_1, …, F_V} ∈ Y; second, a CBOW model is adopted to extract semantic features, the training goal of CBOW being to maximize the following average log probability:

(1/N) Σ_{i=K}^{N-K} log Pr(Y_i | Y_{i-K}, …, Y_{i+K})

wherein K is the size of the context window of the target word in the data set D, and Y_i is a semantic unit in the data set D;
In CBOW, the probability Pr(Y_i | Y_{i-K}, …, Y_{i+K}) is calculated by the following formula:

Pr(Y_i | Y_{i-K}, …, Y_{i+K}) = exp(y_iᵀ y_0) / Σ_{y∈W} exp(yᵀ y_0)

wherein y_0 and y_i are the input and output vector representations of the target semantic unit Y_i, y_0 being the average vector representation of all contexts, ᵀ denoting the transpose, and W is the semantic dictionary;
word feature extraction (22): word feature extraction is divided into two cases, concept-based word feature extraction and non-concept-based word feature extraction;
Concept-based word feature extraction extracts word features on the basis of concept features: since a concept G = {C_1, …, C_N} is composed of multiple words, the meaning of the concept is determined by the words it contains; the formula for concept-based word feature extraction is expressed as:

P_i = g_i + (1/g_n) Σ_{j=1}^{g_n} c_j

wherein g_i is the concept vector of concept G_i, c_j is the j-th word vector in G_i, g_n is the number of words contained in concept G_i, + is the vector addition operation, and P_i is obtained by adding the concept vector and the average word vector;
the non-conceptual word feature extraction method directly extracts word features by adopting a CBOW model of the semantic feature extraction module (21);
Character feature extraction (23): character features are extracted on the basis of concept words and on the basis of non-concept words; the character feature formula based on words in a concept is:

Q_i = P_i + (1/c_n) Σ_{k=1}^{c_n} z_k

wherein z_k is the k-th character vector of C_i, c_n is the number of characters contained in the concept word C_i, + is the vector addition operation, and Q_i is obtained by adding the concept vector, the average word vector and the average character vector;
The character feature formula based on non-concept word features is:

w′_i = w_i + (1/f_n) Σ_{m=1}^{f_n} d_m

wherein w_i is the word vector representation of the non-concept word F_i, f_n is the number of characters in F_i, d_m is the m-th character vector of F_i, + is the vector addition operation, and w′_i is obtained by adding the non-concept word vector and the average character vector;
In Chinese, the same character expresses different meanings in different positions, so extracting character position features also assists in inferring the semantic information of a word; the beginning, middle and end of a word are denoted B, I and E respectively, giving:

C_i^B = z_1,  C_i^I = (1/(c_n - 2)) Σ_{k=2}^{c_n-1} z_k,  C_i^E = z_{c_n}

wherein c_n is the number of characters in the word C_i, C_i^B is the first character feature of C_i, C_i^I is the middle character feature, and C_i^E is the last character feature;
extracting the position features of the characters of the non-conceptual feature words by adopting the same expression mode;
Feature fusion (24): feature fusion is likewise divided into two cases, conceptual feature fusion and non-concept word feature fusion; in a named entity recognition task based on a partial domain ontology, concept features are as important as word features, and some unlabeled named entities can be extracted directly from the sparsely labeled corpus, thereby reducing the amount of computation;
Conceptual feature fusion: the extracted conceptual features, word features, character features and character position features are fused:

V_i = Q_i + C_i^B + C_i^I + C_i^E

Non-concept word feature fusion: the extracted word features, character features and character position features are fused:

U_i = w′_i + F_i^B + F_i^I + F_i^E

wherein Q_i is the concept-word-character feature, w′_i is the non-concept word vector plus the average character vector, f_n is the number of characters contained in the word F_i, F_i^B is the first character feature of F_i, F_i^I is the middle character feature, F_i^E is the last character feature, C_i^B, C_i^I and C_i^E are the corresponding position features of a concept word C_i, and V_i and U_i are the fused feature vectors;
The dimension of the concept vectors is reduced by calculating the similarity of ontology concept features:

if R(g_i, o_i) > α and R(g_m, o_i) > α, then g_i and g_m are regarded as the same ontology concept o_i

wherein o_i is a concept feature in the ontology, g_i and g_m are concept features identified in the data set D, R(·,·) is the cosine similarity, and α is the similarity threshold; the optimal threshold is found by gradient descent, i.e., the descending slope of the error function is calculated smoothly and continuously, the gradient becoming smaller as the minimum is approached; when the gradient reaches its minimum, α is the optimal similarity threshold.
CN201910099671.0A 2019-01-31 2019-01-31 Named entity recognition method based on feature fusion Active CN109800437B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910099671.0A CN109800437B (en) 2019-01-31 2019-01-31 Named entity recognition method based on feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910099671.0A CN109800437B (en) 2019-01-31 2019-01-31 Named entity recognition method based on feature fusion

Publications (2)

Publication Number Publication Date
CN109800437A (en) 2019-05-24
CN109800437B true CN109800437B (en) 2023-11-14

Family

ID=66560740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910099671.0A Active CN109800437B (en) 2019-01-31 2019-01-31 Named entity recognition method based on feature fusion

Country Status (1)

Country Link
CN (1) CN109800437B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112329465A (en) * 2019-07-18 2021-02-05 株式会社理光 Named entity identification method and device and computer readable storage medium
CN110852359B (en) * 2019-07-24 2023-05-26 上海交通大学 Family tree identification method and system based on deep learning
CN110704640A (en) * 2019-09-30 2020-01-17 北京邮电大学 Representation learning method and device of knowledge graph
CN110866399B (en) * 2019-10-24 2023-05-02 同济大学 Chinese short text entity recognition and disambiguation method based on enhanced character vector
CN111489746B (en) * 2020-03-05 2022-07-26 国网浙江省电力有限公司 Power grid dispatching voice recognition language model construction method based on BERT
CN111539209B (en) * 2020-04-15 2023-09-15 北京百度网讯科技有限公司 Method and apparatus for entity classification
CN111832307A (en) * 2020-07-09 2020-10-27 北京工业大学 Entity relationship extraction method and system based on knowledge enhancement
CN112101028B (en) * 2020-08-17 2022-08-26 淮阴工学院 Multi-feature bidirectional gating field expert entity extraction method and system
CN112331332A (en) * 2020-10-14 2021-02-05 北京工业大学 Disease prediction method and system based on multi-granularity feature fusion
CN112257417A (en) * 2020-10-29 2021-01-22 重庆紫光华山智安科技有限公司 Multi-task named entity recognition training method, medium and terminal
CN113035362B (en) * 2021-02-26 2024-04-09 北京工业大学 Medical prediction method and system based on semantic graph network
CN113378569A (en) * 2021-06-02 2021-09-10 北京三快在线科技有限公司 Model generation method, entity identification method, model generation device, entity identification device, electronic equipment and storage medium
CN113361272B (en) * 2021-06-22 2023-03-21 海信视像科技股份有限公司 Method and device for extracting concept words of media asset title
CN113593709B (en) * 2021-07-30 2022-09-30 江先汉 Disease coding method, system, readable storage medium and device
CN114638222B (en) * 2022-05-17 2022-08-16 天津卓朗科技发展有限公司 Natural disaster data classification method and model training method and device thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107133220A (en) * 2017-06-07 2017-09-05 东南大学 Named entity recognition method for the geography field
CN107203511A (en) * 2017-05-27 2017-09-26 中国矿业大学 Network text named entity recognition method based on neural network probabilistic disambiguation
CN107885721A (en) * 2017-10-12 2018-04-06 北京知道未来信息技术有限公司 Named entity recognition method based on LSTM
EP3407208A1 (en) * 2017-05-22 2018-11-28 Fujitsu Limited Ontology alignment apparatus, program, and method

Also Published As

Publication number Publication date
CN109800437A (en) 2019-05-24

Similar Documents

Publication Publication Date Title
CN109800437B (en) Named entity recognition method based on feature fusion
Wang et al. Application of convolutional neural network in natural language processing
CN108959252B (en) Semi-supervised Chinese named entity recognition method based on deep learning
Gasmi et al. LSTM recurrent neural networks for cybersecurity named entity recognition
CN109858041B (en) Named entity recognition method combining semi-supervised learning with user-defined dictionary
CN111737496A (en) Power equipment fault knowledge map construction method
CN111046179B (en) Text classification method for open network question in specific field
CN109325231B (en) Method for generating word vector by multitasking model
CN111738003B (en) Named entity recognition model training method, named entity recognition method and medium
CN110263325B (en) Chinese word segmentation system
CN111666758B (en) Chinese word segmentation method, training device and computer readable storage medium
CN111611802B (en) Multi-field entity identification method
CN115238690A (en) Military field composite named entity identification method based on BERT
Deng et al. Self-attention-based BiGRU and capsule network for named entity recognition
CN114781375A (en) Military equipment relation extraction method based on BERT and attention mechanism
Aggarwal et al. Recurrent neural networks
CN112699685A (en) Named entity recognition method based on label-guided word fusion
Mankolli et al. Machine learning and natural language processing: Review of models and optimization problems
CN115169349A (en) Chinese electronic resume named entity recognition method based on ALBERT
Luan Information extraction from scientific literature for method recommendation
Qi et al. Semi-supervised sequence labeling with self-learned features
Pingili et al. Target-based sentiment analysis using a bert embedded model
CN113361277A (en) Medical named entity recognition modeling method based on attention mechanism
Yang et al. Hierarchical dialog state tracking with unknown slot values
Xu et al. A Data‐Driven Model for Automated Chinese Word Segmentation and POS Tagging

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant