CN109800437A - A named entity recognition method based on feature fusion - Google Patents
A named entity recognition method based on feature fusion
- Publication number
- CN109800437A (application number CN201910099671.0A)
- Authority
- CN
- China
- Prior art keywords
- word
- feature
- concept
- character
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
A named entity recognition method based on feature fusion belongs to the field of computers. It extracts and fuses, from two perspectives, text features of different granularities, concept features, and non-concept word features, thereby improving the accuracy of named entity recognition while reducing the amount of computation. The method comprises a data preprocessing module, a feature construction module, a named entity network model training module, and a named entity classifier module, where the feature construction module comprises four submodules: semantic feature extraction, word feature extraction, character feature extraction, and feature fusion. The method uses the sequential memory property of the LSTM (Long Short-Term Memory) or GRU (Gated Recurrent Unit) neural network model to take into account the contextual information of the named entity task, and finally predicts entity class labels with softmax. During model construction, sparse data can be used as the training set and the two neural network models, LSTM and GRU, can be compared, ensuring that the invention obtains satisfactory results on entity recognition tasks.
Description
Technical field
The invention belongs to the field of computers and relates to a named entity recognition method based on feature fusion.
Background technique
In recent years, with the wide application of artificial intelligence technology in the field of natural language processing (NLP), people have been exploring domain knowledge more and more deeply. Named entity recognition is the foundation of constructing domain knowledge and a vital step: tasks such as knowledge graph construction, text retrieval, text classification, and information extraction all require named entity recognition.
Named entity recognition (NER) can be regarded as a sequence labeling task: entities are found from the extracted information and classified into a fixed set of categories. The two main traditional approaches to the NER problem are rule-based learning and supervised learning, with supervised learning dominating. Both assume that the available training data are fully labeled (that is, every entity contained in a document is annotated) and then search the documents for the label sequences of candidate entities. However, in today's era of big data, using fully annotated data as a training set is very time-consuming and labor-intensive, and owing to the particularity of most domain terminology, present-day named entity recognition still faces the following challenges: (1) a large part of real-world data is semi-structured or unstructured, and much information is narrative, without structural information, unsuitable for knowledge discovery and extraction; (2) domain entities themselves are structurally complex and the same concept has multiple expressions; for example, in the medical field, Chronic Obstructive Pulmonary Disease can be abbreviated as COPD; (3) a named entity is usually composed of multiple words, and considering word features alone isolates the semantic information. Owing to the above problems, traditional named entity recognition methods are no longer well suited to today's application scenarios.
At present, with deep learning performing excellently in every field, its application to named entity recognition tasks is also increasing, and compared with conventional methods, deep learning methods perform better. However, most deep-learning-based NER methods target English or rely only on word vectors and character vectors, without considering concept features.
In 2016, the paper "Neural Architectures for Named Entity Recognition" by Guillaume Lample et al., published at ACL, proposed a named entity recognition method combining a recurrent neural network (RNN) and conditional random fields (CRF) for identifying English person names, place names, and so on. The method extracts word features and character features with an RNN and finally classifies entities with a CRF.
In 2017, the paper "Chemical drug named entity recognition based on an attention mechanism" by You Yangpei et al., published in the Journal of Computer Research and Development, proposed an entity recognition method based on word and character features combined with an attention mechanism. The method trains an entity recognition classifier with an LSTM (Long Short-Term Memory) neural network and finally generates the entity label classification results with a CRF.
Although the above methods can complete named entity recognition tasks, existing named entity recognition methods all assume that no domain knowledge is available and learn features only from the training set. In real life, however, most domains come with partial domain knowledge; although still imperfect, this domain knowledge can help us recognize named entities in sparse data better, and can also reduce, to a certain extent, the huge amount of computation caused by inconsistent expressions.
Summary of the invention
The contents of the present invention:
A named entity recognition method based on feature fusion, the method comprising:
1. A named entity recognition method based on feature fusion is proposed. The method can not only achieve the effect of predicting new words in a sparsely labeled corpus according to the concepts contained in a domain ontology, but can also adopt a unified expression for entities whose expressions are inconsistent yet share the same concept, which both improves accuracy and reduces computational cost.
2. First, semantic features are extracted from the preprocessed data with the CBOW model. Semantic features comprise concept features and non-concept word features: for concept features, concept, word, and character features are extracted; for non-concept word features, word and character features are extracted directly.
3. Next, the extracted new feature sets undergo feature fusion, which also comprises two parts: concept-based feature fusion and non-concept-word-based feature fusion. The dimensionality of the concept features is reduced by computing concept similarity.
4. The sequential memory characteristic of the LSTM or GRU (Gated Recurrent Unit) neural network model is used to extract the contextual information relevant to the named entities, and the new feature set serves as the input to the training model.
The principle of the present invention is a named entity recognition method based on feature fusion that uses not only traditional word vector features and character vector features but also the concept features contained in words and character position features. The concept features not only reduce the word vector dimensionality but, via the concepts contained in the ontology, also achieve to a certain extent the effect of predicting new words in the sparsely labeled corpus. Finally, the contextual information is attended to by an LSTM or GRU neural network, so that the accuracy of named entity recognition is well improved.
To achieve the above objects, the present invention adopts the following technical scheme:
A named entity recognition method based on feature fusion, comprising: a data preprocessing module, a feature construction module, a named entity network model training module, and a named entity classifier module. The feature construction module mainly extracts and fuses text features of different granularities and specifically comprises four submodules: a semantic feature extraction module, a word feature extraction module, a character feature extraction module, and a feature fusion module.
Semantic feature extraction module: semantic features comprise two parts, concept features and non-concept word features. A concept is an independent semantic vocabulary item composed of multiple specialized domain terms, for example, Chronic Obstructive Pulmonary Disease; a non-concept word is an individual semantic vocabulary item, for example, "difficult". Concept features are extracted for words that can be mapped to a concept in the domain ontology; word features are extracted directly for words from which no concept can be extracted. Semantic features are finally extracted with the CBOW model.
Word feature extraction module: since a concept is composed of multiple words, for example chronic cor pulmonale, the meaning of a concept is determined by the words it contains. To keep the semantic information intact, this method considers two aspects, extracting word features based on concepts and extracting word features based on non-concept words, where the extraction of non-concept word features uses the CBOW model, the same as semantic feature extraction.
Character feature extraction module: characters are the smallest semantic units of Chinese and also carry certain semantic information; the meaning of a word is determined by the characters it contains. Moreover, the semantic information of the characters themselves can also achieve, to a certain extent, the effect of predicting new words, which helps infer the entity class; for example, the sum of the vectors of the two characters in the Chinese word for "pain" is close to the vector of the word itself. Meanwhile, the positional information of a character is also crucial: the same characters in different positions may give two words completely different meanings. Therefore, to improve the accuracy of entity recognition, this method considers not only character features but also character position features.
Feature fusion module: first, the extracted concept features, word features, and character features are fused into a new feature set. Second, a new fusion method is proposed that mainly considers two cases: for words whose concept can be extracted from the domain ontology, the concept, word, and character features are fused; for words whose concept cannot be extracted from the ontology, word features are extracted directly and blended with the character features. Finally, dimensionality reduction is performed on the concept features extracted through the domain ontology, so that the amount of computation is reduced while the accuracy of named entity recognition is improved, and the fused features are used as the input to train the model.
The present invention extracts text features of different granularities and proposes a new feature fusion method, which can not only fully learn the semantic information contained in the text but also resolve the ambiguity of domain terminology and the huge amount of computation caused by inconsistent expressions.
Detailed description of the invention
Fig. 1 is the overall architecture diagram of the named entity recognition method based on feature fusion;
Fig. 2 is the flow chart of the named entity recognition method based on feature fusion;
Specific embodiment
The features and exemplary embodiments of various aspects of the present invention are described in detail below.
The present invention recognizes named entities by extracting features of different granularities and fusing them, aiming to improve the accuracy of named entity recognition while reducing the amount of computation. The overall architecture, shown in Fig. 1, is divided into a data preprocessing module (1), a feature construction module (2), a named entity network model training module (3), and a named entity classifier module (4). The specific method flow chart is shown in Fig. 2.
Data preprocessing module (1): first, unlabeled data are added to the labeled training set to form a sparsely labeled corpus, and the domain ontology is loaded; second, all sparsely labeled corpus text is cut into shorter Chinese character strings according to special characters (including punctuation marks, digits, and whitespace), and stop words are removed.
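The preprocessing step above can be sketched as follows; this is a minimal illustration, where the stopword list is a toy placeholder (a real implementation would load a full stopword file):

```python
import re

# Hypothetical mini stopword list; the patent does not specify one.
STOP_WORDS = {"的", "了", "和"}

def preprocess(text):
    # Cut the text on runs of punctuation, digits, and whitespace
    # (anything that is not a letter), keeping Chinese and Latin chunks.
    chunks = re.split(r"[\W\d_]+", text)
    # Keep non-empty chunks that are not stop words.
    return [c for c in chunks if c and c not in STOP_WORDS]

segments = preprocess("咳嗽 的 加重")  # the stop word 的 is dropped
```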
Feature construction module (2): this module mainly extracts features of different granularities from the text and fuses the extracted features. More specifically, it is divided into semantic feature extraction, word feature extraction, character feature extraction, and feature fusion.
Semantic feature extraction module (21): the segmented character string L = (L1, ..., Ln) is mapped onto the ontology O, and the maximum matching method finds the length Lmax of the longest initial substring that matches a semantic unit (if the maximum initial match length Lmax equals the string length Llen, the whole string is one semantic unit). Lmax is then extracted from L, and the strings on both sides of the match become new strings to be segmented. All segmented strings are defined as a semantic set {Y1, ..., YN} ∈ D, which comprises the concept set and the non-concept word set, {G1, ..., GN} ∪ {F1, ..., FN} ∈ Y. Semantic features are then extracted with the CBOW model, whose training objective is to maximize the following average log probability:
  (1/N) Σ_{i=1..N} log Pr(Yi | Yi−K, ..., Yi+K)
where K is the context window of the target word in data set D, and Yi is a semantic unit in data set D.
In CBOW, the probability Pr(Yi | Yi−K, ..., Yi+K) is computed by the following formula:
  Pr(Yi | Yi−K, ..., Yi+K) = exp(y0^T yi) / Σ_{y∈W} exp(y0^T y)
where y0 and yi are the input and output vector representations of the target semantic unit Yi, y0 being the average vector of all contexts, and W is the semantic dictionary.
Word feature extraction module (22): word features are considered in two cases, concept-based word feature extraction and non-concept-word-based feature extraction.
Concept-based feature extraction: since a concept is usually composed of multiple words, G = {C1, ..., CN}, and the meaning of a concept is determined by the words it contains, this method extracts word features on the basis of the concept features. The specific formula is as follows:
  Qi = gi + (1/gn) Σ_{j=1..gn} cj
where gi is the concept vector of concept Gi, cj is the j-th word vector, gn is the number of words contained in concept Gi, Qi is obtained by adding the concept vector to its average word vector, and + is the vector addition operation. According to previous experimental experience, compared with concatenation, addition is simpler and faster to compute without losing precision, so the following methods all compute with vector addition.
Non-concept-word feature extraction uses the CBOW model of the semantic feature extraction module (21).
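The concept-based word feature Qi = gi + average word vector can be sketched as a simple numpy computation; the 4-dimensional toy vectors below are illustrative stand-ins for vectors that would really come from CBOW:

```python
import numpy as np

def concept_word_feature(g_i, word_vectors):
    """Fuse a concept vector with the average of its word vectors by
    vector addition, as in Qi = gi + (1/gn) * sum(cj)."""
    return g_i + np.mean(word_vectors, axis=0)

g = np.array([1.0, 0.0, 1.0, 0.0])        # toy concept vector gi
words = np.array([[0.0, 2.0, 0.0, 2.0],   # toy word vectors c1, c2
                  [2.0, 0.0, 2.0, 0.0]])
q = concept_word_feature(g, words)        # -> [2. 1. 2. 1.]
```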
Character feature extraction module (23): character features are likewise considered in two cases, concept-word-based character feature extraction and non-concept-word-based character feature extraction.
Concept-word-based character feature extraction: character features are extracted on the basis of the already extracted concept and word feature Qi. The specific formula is as follows:
  Q'i = Qi + (1/cn) Σ_{k=1..cn} zk
where zk is the k-th character vector, cn is the number of characters contained in the concept word Ci, + is the vector addition operation, and Q'i is obtained by adding the concept vector, its average word vector, and its average character vector. The formula for extracting character features based on non-concept words is as follows:
  F'i = wi + (1/fn) Σ_{m=1..fn} dm
where wi is the word vector representation of the non-concept word Fi, fn is the number of characters contained in the non-concept word Fi, dm is the m-th character vector, + is the vector addition operation, and F'i is obtained by adding the non-concept word vector to its average character vector.
Since the meaning of a Chinese word generally depends on where its characters sit, and the same character in different positions expresses different meanings, extracting character position features allows the semantic information of a word to be inferred more accurately. We mark each character with B (beginning), I (middle), or E (end); the formula can be expressed as:
  pos(zk) = B if k = 1; I if 1 < k < cn; E if k = cn
The position features of the characters of non-concept words are expressed in the same way.
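The B/I/E position marking described above can be sketched as follows; how a single-character word is tagged is not specified in the patent, so treating it as B is an assumption of this sketch:

```python
def position_tags(word):
    """Tag each character of a word: first B, last E, the rest I."""
    n = len(word)
    if n == 1:
        return ["B"]  # assumption: a lone character counts as a beginning
    return ["B"] + ["I"] * (n - 2) + ["E"]

tags = position_tags("慢阻肺")  # -> ['B', 'I', 'E']
```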
Feature fusion (24): building on the feature extraction work, the feature fusion part is likewise considered in two cases, a concept-based feature fusion method and a non-concept-word-based feature fusion method. This method fuses the extracted new feature sets by the vector addition operation. The main consideration is that, in a named entity recognition task based on a specific domain ontology, concept features are just as important as word and character features: they can directly extract some of the unlabeled named entities from the sparsely labeled corpus, thereby reducing the amount of computation.
Concept-based feature fusion: we merge the extracted concept features, word features, character features, and character position features; the formula is as follows:
  Ti = gi + (1/gn) Σ_{j=1..gn} cj + (1/cn) Σ_{k=1..cn} zk + (1/cn) Σ_{k=1..cn} pk
Non-concept-word-based feature fusion: we merge the extracted word features, character features, and character position features; the formula is as follows:
  T'i = wi + (1/fn) Σ_{m=1..fn} dm + (1/fn) Σ_{m=1..fn} pm
where fn is the number of characters contained in the word Fi, pm is the position feature vector of the m-th character, p1 = B being the feature of the first character of Fi, pm = I (1 < m < fn) the middle character features, and pfn = E the feature of the last character.
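The fusion-by-addition step can be sketched as below. The 3-dimensional one-hot embeddings for the B/I/E position tags are an illustrative assumption (the patent does not specify how position tags are vectorized), and all toy vectors share one dimensionality so that addition is defined:

```python
import numpy as np

# Hypothetical one-hot embeddings for the B/I/E position tags.
POS = {"B": np.array([1.0, 0.0, 0.0]),
       "I": np.array([0.0, 1.0, 0.0]),
       "E": np.array([0.0, 0.0, 1.0])}

def fuse(concept_vec, word_vecs, char_vecs, tags):
    """Concept vector + average word vector + average character vector
    + average position vector, all by element-wise addition."""
    pos_vecs = np.stack([POS[t] for t in tags])
    return (concept_vec
            + np.mean(word_vecs, axis=0)
            + np.mean(char_vecs, axis=0)
            + np.mean(pos_vecs, axis=0))

g = np.array([1.0, 1.0, 1.0])
ws = np.array([[3.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
cs = np.array([[0.0, 0.0, 3.0], [3.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
fused = fuse(g, ws, cs, ["B", "I", "E"])
```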
Chinese domain terminology usually has the characteristic of inconsistent expression, especially in the medical field, where medical terms with the same concept can have many expressions; for example, Chronic Obstructive Pulmonary Disease can also be expressed as COPD. As the data grow, this brings a huge amount of computation. To address this problem, we use a method that computes concept feature similarity based on the ontology to reduce the dimensionality of the concept vectors; the formula is as follows:
  R(gi, gm) = same concept, if maxsimilarity(gi, oi) > α and maxsimilarity(gm, oi) > α; otherwise gi and gm remain independent concepts
where oi is a concept feature in the ontology, gi and gm are concept features recognized in data set D, R(·) is the relation between gi and gm, maxsimilarity(·) is the cosine similarity, and α is the similarity threshold. According to previous experiments, a threshold that is too small easily causes false merges, while one that is too large easily causes misses, so the similarity threshold usually lies between 0.87 and 0.93, with a recommended initial value of 0.9. The error is computed with the gradient descent method: the error function is made smooth and continuous so that the slope of the gradient descent can be computed; the closer to the minimum, the smaller the gradient, and the overshoot risk can be reduced by adjusting the step size. During the experiments the step size can be set to 0.01, and the threshold is adjusted within the range 0.87 to 0.93 until the slope of the gradient reaches its minimum, which gives the optimal similarity threshold.
More specifically, the concept features are mapped onto the domain ontology O. If two concepts gi and gm are both close to an ontology concept oi, the similarity distances from gi and gm to oi are computed by cosine similarity. If they are less than the similarity threshold α, then gi and gm are each an independent concept in the ontology; if they are greater than α, then gi and gm can be considered the same concept, and gi can be replaced with gm or gm with gi, thereby reducing the dimensionality of the concept features and the amount of computation.
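The similarity-based merging rule above can be sketched as follows; the toy vectors are illustrative, and a real system would compare concept vectors learned by CBOW against ontology concept vectors:

```python
import numpy as np

def cosine(a, b):
    # Standard cosine similarity between two vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_concept(g_i, g_m, o_i, alpha=0.9):
    """True if both recognized concepts g_i and g_m exceed the similarity
    threshold alpha with respect to the same ontology concept o_i, in which
    case one can replace the other to reduce dimensionality."""
    return cosine(g_i, o_i) > alpha and cosine(g_m, o_i) > alpha
```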
Named entity network model training module (3): the fused features are used as the input to train the model. Since named entity recognition is also called a sequence labeling task, contextual information is extremely important, and the training model uses the LSTM or GRU neural network model with a sequential memory function. The specific LSTM formulas are as follows:
  i_t = σ(W_i x_t + U_i h_{t−1} + b_i)
  f_t = σ(W_f x_t + U_f h_{t−1} + b_f)
  o_t = σ(W_o x_t + U_o h_{t−1} + b_o)
  c_t = f_t ⊙ c_{t−1} + i_t ⊙ tanh(W_c x_t + U_c h_{t−1} + b_c)
  h_t = o_t ⊙ tanh(c_t)
where i_t, f_t, and o_t are the input, forget, and output gates at time step t, σ is a nonlinear (sigmoid) function, and ⊙ is element-wise multiplication. The parameters of each gate consist of two matrices and a bias vector; therefore the matrix parameters of the three gates are W_i, U_i, W_f, U_f, W_o, U_o, with bias parameters b_i, b_f, b_o. The memory cell parameters of the LSTM are W_c, U_c, and b_c. All these parameters are updated at every step of training.
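One step of the LSTM gate equations above can be sketched in numpy; the randomly initialized parameters stand in for trained ones, and the toy sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 4, 3  # toy input and hidden sizes

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One (W, U, b) triple per gate i, f, o and for the memory cell c.
W = {g: rng.normal(scale=0.1, size=(d_h, d_in)) for g in "ifoc"}
U = {g: rng.normal(scale=0.1, size=(d_h, d_h)) for g in "ifoc"}
b = {g: np.zeros(d_h) for g in "ifoc"}

def lstm_step(x_t, h_prev, c_prev):
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])   # input gate
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])   # forget gate
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])   # output gate
    c_hat = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])
    c = f * c_prev + i * c_hat      # new memory cell
    h = o * np.tanh(c)              # new hidden state
    return h, c

h, c = lstm_step(rng.normal(size=d_in), np.zeros(d_h), np.zeros(d_h))
```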
Named entity classifier module (4): the final entity label classification results are generated by the softmax classifier of the LSTM or GRU neural network model.
Claims (2)
1. A named entity recognition method based on feature fusion, characterized by comprising the following four modules: a data preprocessing module (1), a feature construction module (2), a named entity network model training module (3), and a named entity classifier module (4);
(1) Data preprocessing module
Unlabeled data are added to the labeled training set to form a sparsely labeled corpus, and the domain ontology is loaded; the text to be processed is cut into Chinese character strings according to punctuation marks, digits, and whitespace, and stop words are removed;
(2) Feature construction module
The module is divided into feature extraction and feature fusion, specifically into four submodules: semantic feature extraction, word feature extraction, character feature extraction, and feature fusion;
(3) Named entity network model training module
The fused features are used as the input to train the model; since named entity recognition is also called a sequence labeling task, contextual information needs to be extracted to assist in inferring the entity class, so the training model uses the LSTM or GRU neural network model with a sequential memory function;
(4) Named entity classifier module
The final entity label classification results are generated by the softmax classifier of the LSTM or GRU neural network model.
2. The named entity recognition method based on feature fusion according to claim 1, characterized in that step (2) is specifically as follows:
Semantic feature extraction (21): semantic features comprise two parts: concept features and non-concept word features; a concept is an independent semantic vocabulary item composed of multiple specialized domain terms; a non-concept word is an individual semantic vocabulary item; concept features are extracted for words that can be mapped to a concept in the domain ontology, and word features are extracted directly for words from which no concept can be extracted;
The preprocessed corpus is first mapped onto the domain ontology, and the data are cut by the maximum matching method into the semantic set {Y1, ..., YN} ∈ D, comprising the concept set and the non-concept word set {G1, ..., GN} ∪ {F1, ..., FN} ∈ Y; next, semantic features are extracted with the CBOW model, whose training objective is to maximize the following average log probability:
  (1/N) Σ_{i=1..N} log Pr(Yi | Yi−K, ..., Yi+K)
where K is the context window of the target word in data set D and Yi is a semantic unit in data set D;
In CBOW, the probability Pr(Yi | Yi−K, ..., Yi+K) is computed by the following formula:
  Pr(Yi | Yi−K, ..., Yi+K) = exp(y0^T yi) / Σ_{y∈W} exp(y0^T y)
where y0 and yi are the input and output vector representations of the target semantic unit Yi, y0 being the average vector of all contexts, T denoting transposition, and W is the semantic dictionary;
Word feature extraction (22): word feature extraction is divided into two cases, concept-based word feature extraction and non-concept-word-based feature extraction;
Concept-based word feature extraction extracts word features on the basis of the concept features; since a concept is composed of multiple words, G = {C1, ..., CN}, the meaning of a concept is determined by the words it contains; the concept-based word feature extraction formula is expressed as:
  Qi = gi + (1/gn) Σ_{j=1..gn} cj
where gi is the concept vector of concept Gi, cj is the j-th word vector, gn is the number of words contained in concept Gi, Qi is obtained by adding the concept vector to its average word vector, and + is the vector addition operation;
The word feature extraction method for non-concept words directly extracts word features with the CBOW model of the semantic feature extraction module (21);
Character feature extraction (23): character features are extracted on the basis of concept words and on the basis of non-concept words; the formula for extracting character features from the words in a concept is as follows:
  Q'i = Qi + (1/cn) Σ_{k=1..cn} zk
where zk is the k-th character vector, cn is the number of characters contained in the concept word Ci, + is the vector addition operation, and Q'i is obtained by adding the concept vector, its average word vector, and its average character vector; the formula for extracting character features from non-concept words is as follows:
  F'i = wi + (1/fn) Σ_{m=1..fn} dm
where wi is the word vector representation of the non-concept word Fi, fn is the number of characters contained in the non-concept word Fi, dm is the m-th character vector, + is the vector addition operation, and F'i is obtained by adding the non-concept word vector to its average character vector;
In Chinese, the same character in different positions expresses different meanings, so extracting character position features also helps infer the semantic information of a word; each character is marked with B (beginning), I (middle), or E (end), expressed by the formula:
  pos(zk) = B if k = 1; I if 1 < k < cn; E if k = cn
where cn is the number of characters contained in the word Ci, B is the feature of the first character in Ci, I is a middle character feature of Ci, and E is the feature of the last character in Ci;
The position features of the characters of non-concept words are expressed in the same way;
Feature fusion (24): following the above, feature fusion is likewise divided into two cases, concept feature fusion and non-concept word feature fusion; the main consideration is that, in a named entity recognition task based on a specific domain ontology, concept features are as important as word and character features and can directly extract some of the unlabeled named entities from the sparsely labeled corpus, thereby reducing the amount of computation;
Concept feature fusion: the extracted concept features, word features, character features, and character position features are merged; the concept feature fusion formula is expressed as:
  Ti = gi + (1/gn) Σ_{j=1..gn} cj + (1/cn) Σ_{k=1..cn} zk + (1/cn) Σ_{k=1..cn} pk
Non-concept word feature fusion: the extracted word features, character features, and character position features are blended; the non-concept word feature fusion formula is expressed as:
  T'i = wi + (1/fn) Σ_{m=1..fn} dm + (1/fn) Σ_{m=1..fn} pm
where fn is the number of characters contained in the word Fi, pm is the position feature vector of the m-th character, p1 = B being the first character of Fi, pm = I (1 < m < fn) the middle character features, and pfn = E the last character feature;
The dimensionality of the concept vectors is reduced with the method of computing ontology concept feature similarity; the formula is as follows:
  R(gi, gm) = same concept, if maxsimilarity(gi, oi) > α and maxsimilarity(gm, oi) > α; otherwise gi and gm remain independent concepts
where oi is a concept feature in the ontology, gi and gm are concept features recognized in data set D, R(·) is the relation between gi and gm, maxsimilarity(·) is the cosine similarity, and α is the similarity threshold, with the initial threshold set to 0.9; the error is computed with the gradient descent method, that is, the error function is made smooth and continuous so that the slope of the gradient descent can be computed, the gradient becoming smaller as the minimum is approached; when the slope of the gradient reaches its minimum, the corresponding threshold is the optimal similarity threshold.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910099671.0A CN109800437B (en) | 2019-01-31 | 2019-01-31 | Named entity recognition method based on feature fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109800437A true CN109800437A (en) | 2019-05-24 |
CN109800437B CN109800437B (en) | 2023-11-14 |
Family
ID=66560740
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910099671.0A Active CN109800437B (en) | 2019-01-31 | 2019-01-31 | Named entity recognition method based on feature fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109800437B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107133220A (en) * | 2017-06-07 | 2017-09-05 | 东南大学 | Name entity recognition method in a kind of Geography field |
CN107203511A (en) * | 2017-05-27 | 2017-09-26 | 中国矿业大学 | A kind of network text name entity recognition method based on neutral net probability disambiguation |
CN107885721A (en) * | 2017-10-12 | 2018-04-06 | 北京知道未来信息技术有限公司 | A kind of name entity recognition method based on LSTM |
EP3407208A1 (en) * | 2017-05-22 | 2018-11-28 | Fujitsu Limited | Ontology alignment apparatus, program, and method |
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112329465B (en) * | 2019-07-18 | 2024-06-25 | 株式会社理光 | Named entity recognition method, named entity recognition device and computer readable storage medium |
CN112329465A (en) * | 2019-07-18 | 2021-02-05 | 株式会社理光 | Named entity identification method and device and computer readable storage medium |
CN110852359A (en) * | 2019-07-24 | 2020-02-28 | 上海交通大学 | Family tree identification method and system based on deep learning |
CN110852359B (en) * | 2019-07-24 | 2023-05-26 | 上海交通大学 | Family tree identification method and system based on deep learning |
CN110704640A (en) * | 2019-09-30 | 2020-01-17 | 北京邮电大学 | Representation learning method and device of knowledge graph |
CN110866399A (en) * | 2019-10-24 | 2020-03-06 | 同济大学 | Chinese short text entity identification and disambiguation method based on enhanced character vector |
CN110866399B (en) * | 2019-10-24 | 2023-05-02 | 同济大学 | Chinese short text entity recognition and disambiguation method based on enhanced character vector |
CN111489746A (en) * | 2020-03-05 | 2020-08-04 | 国网浙江省电力有限公司 | Power grid dispatching voice recognition language model construction method based on BERT |
CN111489746B (en) * | 2020-03-05 | 2022-07-26 | 国网浙江省电力有限公司 | Power grid dispatching voice recognition language model construction method based on BERT |
CN111539209A (en) * | 2020-04-15 | 2020-08-14 | 北京百度网讯科技有限公司 | Method and apparatus for entity classification |
CN111539209B (en) * | 2020-04-15 | 2023-09-15 | 北京百度网讯科技有限公司 | Method and apparatus for entity classification |
CN111832307A (en) * | 2020-07-09 | 2020-10-27 | 北京工业大学 | Entity relationship extraction method and system based on knowledge enhancement |
CN112101028A (en) * | 2020-08-17 | 2020-12-18 | 淮阴工学院 | Multi-feature bidirectional gating field expert entity extraction method and system |
CN112101028B (en) * | 2020-08-17 | 2022-08-26 | 淮阴工学院 | Multi-feature bidirectional gating field expert entity extraction method and system |
CN112015901A (en) * | 2020-09-08 | 2020-12-01 | 迪爱斯信息技术股份有限公司 | Text classification method and device and warning situation analysis system |
CN112331332A (en) * | 2020-10-14 | 2021-02-05 | 北京工业大学 | Disease prediction method and system based on multi-granularity feature fusion |
CN112257417A (en) * | 2020-10-29 | 2021-01-22 | 重庆紫光华山智安科技有限公司 | Multi-task named entity recognition training method, medium and terminal |
CN113035362A (en) * | 2021-02-26 | 2021-06-25 | 北京工业大学 | Medical prediction method and system based on semantic graph network |
CN113035362B (en) * | 2021-02-26 | 2024-04-09 | 北京工业大学 | Medical prediction method and system based on semantic graph network |
CN113378569A (en) * | 2021-06-02 | 2021-09-10 | 北京三快在线科技有限公司 | Model generation method, entity identification method, model generation device, entity identification device, electronic equipment and storage medium |
CN113361272B (en) * | 2021-06-22 | 2023-03-21 | 海信视像科技股份有限公司 | Method and device for extracting concept words of media asset title |
CN113361272A (en) * | 2021-06-22 | 2021-09-07 | 海信视像科技股份有限公司 | Method and device for extracting concept words of media asset title |
CN113593709B (en) * | 2021-07-30 | 2022-09-30 | 江先汉 | Disease coding method, system, readable storage medium and device |
CN113593709A (en) * | 2021-07-30 | 2021-11-02 | 江先汉 | Disease coding method, system, readable storage medium and device |
CN114925198A (en) * | 2022-04-11 | 2022-08-19 | 华东师范大学 | Knowledge-driven text classification method fusing character information |
CN114925198B (en) * | 2022-04-11 | 2024-07-12 | 华东师范大学 | Knowledge-driven text classification method integrating character information |
CN114638222A (en) * | 2022-05-17 | 2022-06-17 | 天津卓朗科技发展有限公司 | Natural disaster data classification method and model training method and device thereof |
CN115577117A (en) * | 2022-09-26 | 2023-01-06 | 兰州理工大学 | Ontology matching method based on true-false triple-connected neural network |
Also Published As
Publication number | Publication date |
---|---|
CN109800437B (en) | 2023-11-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109800437A (en) | Named entity recognition method based on feature fusion | |
CN109902145B (en) | Attention mechanism-based entity relationship joint extraction method and system | |
CN111738003B (en) | Named entity recognition model training method, named entity recognition method and medium | |
CN111046179B (en) | Text classification method for open network question in specific field | |
CN109858041B (en) | Named entity recognition method combining semi-supervised learning with user-defined dictionary | |
CN113268995B (en) | Chinese academy keyword extraction method, device and storage medium | |
CN108628823A (en) | Named entity recognition method combining attention mechanism and multi-task collaborative training | |
CN110263325B (en) | Chinese word segmentation system | |
CN113591483A (en) | Document-level event argument extraction method based on sequence labeling | |
CN111666758B (en) | Chinese word segmentation method, training device and computer readable storage medium | |
CN113673254B (en) | Knowledge distillation position detection method based on similarity maintenance | |
CN111222318A (en) | Trigger word recognition method based on two-channel bidirectional LSTM-CRF network | |
CN113515632A (en) | Text classification method based on graph path knowledge extraction | |
Song et al. | Classification of traditional chinese medicine cases based on character-level bert and deep learning | |
CN112699685A (en) | Named entity recognition method based on label-guided word fusion | |
CN111666752A (en) | Circuit teaching material entity relation extraction method based on keyword attention mechanism | |
CN114781375A (en) | Military equipment relation extraction method based on BERT and attention mechanism | |
CN115169349A (en) | Chinese electronic resume named entity recognition method based on ALBERT | |
CN111145914A (en) | Method and device for determining lung cancer clinical disease library text entity | |
CN113254602B (en) | Knowledge graph construction method and system for science and technology policy field | |
Cao et al. | Knowledge guided short-text classification for healthcare applications | |
CN113934835A (en) | Retrieval type reply dialogue method and system combining keywords and semantic understanding representation | |
Nouhaila et al. | Arabic sentiment analysis based on 1-D convolutional neural network | |
CN111813927A (en) | Sentence similarity calculation method based on topic model and LSTM | |
Mars et al. | Combination of DE-GAN with CNN-LSTM for Arabic OCR on Images with Colorful Backgrounds |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||