CN107102989A - Entity disambiguation method based on word vectors and convolutional neural networks - Google Patents

Entity disambiguation method based on word vectors and convolutional neural networks

Info

Publication number
CN107102989A
CN107102989A (application number CN201710373502.2A; granted as CN107102989B)
Authority
CN
China
Prior art keywords
entity
disambiguation
candidate
term vector
knowledge base
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710373502.2A
Other languages
Chinese (zh)
Other versions
CN107102989B (en)
Inventor
张雷
高扬
唐驰
谢俊元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CN201710373502.2A priority Critical patent/CN107102989B/en
Publication of CN107102989A publication Critical patent/CN107102989A/en
Application granted granted Critical
Publication of CN107102989B publication Critical patent/CN107102989B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/237: Lexical tools
    • G06F40/247: Thesauruses; Synonyms
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Abstract

The present invention provides an entity disambiguation method based on word vectors and convolutional neural networks, comprising four stages: an entity recognition stage, an entity semantic representation stage, a neural network training stage, and an entity classification stage. The method relies on word vectors trained with word2vec and on convolutional neural networks to construct semantic feature vectors for the context of the entity to be disambiguated and for the summary information of the candidate entities in the knowledge base, respectively. In the entity classification stage, the cosine similarity between the feature vectors is computed, and the candidate entity with the highest similarity is taken as the final target entity of the entity to be disambiguated. The method substantially improves the semantic representation of entities and thereby the accuracy of the subsequent disambiguation.

Description

Entity disambiguation method based on word vectors and convolutional neural networks
Technical field
The invention belongs to the field of Internet information technology, and relates in particular to an entity disambiguation method, more particularly to an entity disambiguation method based on word vectors and convolutional neural networks.
Background art
With the spread of the mobile Internet, platforms such as microblogs, blogs, post bars, forums, major news portals and government websites have greatly facilitated people's lives. The vast majority of the data on these platforms exists in unstructured or semi-structured form, which leaves a large number of ambiguous entities in the data. If these ambiguous entities can be disambiguated accurately, later use of the data becomes far more convenient.
Most mainstream entity disambiguation algorithms are built on the bag-of-words model, whose inherent limitations prevent them from fully exploiting the semantic information of the context, so there is still considerable room to improve disambiguation quality. Word embedding has been a focus of machine learning in recent years; its core idea is to construct a distributed representation for each word, which bridges the gap between words. Convolutional neural networks, a branch of neural network models, can effectively capture local features and then model globally. Modeling word embeddings with a convolutional neural network therefore yields semantic features more effective than the bag-of-words model. Moreover, thanks to local receptive fields and weight sharing, a convolutional neural network has far fewer parameters and trains quickly; the core of Google's AlphaGo, for instance, consists of two convolutional neural networks.
The present invention combines word vectors and convolutional neural networks: it constructs separate semantic representations for the context of the entity to be disambiguated and for the summary information of the knowledge-base entities, and trains a convolutional neural network for prediction, greatly improving the semantic descriptive power of entity contexts.
Summary of the invention
Object of the invention: addressing the difficulty that existing entity disambiguation methods have in exploiting the semantic information of the context, the present invention provides an entity disambiguation method based on word vectors and convolutional neural networks, intended to capture contextual semantic information to aid entity disambiguation.
Technical scheme:
An entity disambiguation method based on word vectors and convolutional neural networks, comprising the steps of:
Step 1: collect, according to the application scenario, a text set containing the entities to be disambiguated, preprocess the text set, and determine each entity to be disambiguated in the text set together with its contextual features;
Step 2: build, according to domain knowledge, a knowledge base of the entities to be disambiguated, search the knowledge base, and determine the candidate entity set of each entity to be disambiguated and the description features of each candidate entity in the set;
Step 3: take the word vectors of the nouns in a fixed-size window centered on the entity to be disambiguated and assemble them into a word vector matrix, used as the contextual semantic feature of the entity to be disambiguated; from the summary information of each entity in the knowledge base, take the word vectors of the 20 nouns with the highest TF-IDF weights and assemble them into a word vector matrix, used as the semantic feature of the knowledge-base entity;
Step 4: form a training set from the known unambiguous entities in the text joined with the knowledge-base target entities and candidate entities, input it to a convolutional neural network model for training, and adjust the parameters of the model;
Step 5: input each sample, formed by an entity to be disambiguated and its knowledge-base candidate entity set, to the convolutional neural network model obtained in step 4, obtaining semantic feature vectors for the entity to be disambiguated and for each knowledge-base entity in the candidate entity set;
Step 6: based on the semantic feature vectors, compute the cosine similarity between the entity to be disambiguated and each entity in the knowledge-base candidate entity set; take the candidate entity with the highest similarity as the final target entity of the entity to be disambiguated.
The preprocessing in step 1 performs part-of-speech tagging and word segmentation on the text set with the Chinese Academy of Sciences Chinese word segmenter ICTCLAS, then filters out stop words according to a stop-word list, and builds a noun dictionary for proper nouns and entity names that are otherwise hard to recognize.
In step 2, the Chinese word segmenter ICTCLAS is called to perform part-of-speech tagging and word segmentation on the entity descriptions in the knowledge base, and stop words are filtered out according to the stop-word list.
Assembling the word vectors of the nouns in a fixed-size window centered on the entity to be disambiguated into a word vector matrix in step 3 is specifically:
1) call Google's deep learning tool word2vec to train on the Chinese Wikipedia corpus, obtaining a word vector table L in which each word vector has 200 dimensions, every dimension being a real number;
2) for each noun w_i in the context context_e = {w_1, w_2, …, w_i, …, w_K} of the entity e to be disambiguated, look up the word vector table L to obtain its word vector v_i;
3) from the word vectors of the context words of entity e, build the context word vector matrix [v_1, v_2, v_3, …, v_i, …, v_K] of entity e;
4) end.
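The window-and-lookup procedure of steps 1)-3) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the toy 4-dimensional embedding table stands in for the 200-dimensional word2vec table L, and skipping out-of-vocabulary nouns is one convention the text does not specify.

```python
import numpy as np

# Toy embedding table standing in for the word2vec-trained table L
# (4 dimensions here for brevity; the patent uses 200).
EMBED_DIM = 4
rng = np.random.default_rng(0)
vector_table = {w: rng.standard_normal(EMBED_DIM)
                for w in ["apple", "company", "fruit", "phone", "tree"]}

def context_matrix(context_nouns, table, dim=EMBED_DIM):
    """Stack the word vectors of the window's nouns into a K x dim matrix.

    Nouns missing from the table are skipped (an assumption; zero
    vectors would be another reasonable convention).
    """
    rows = [table[w] for w in context_nouns if w in table]
    return np.vstack(rows) if rows else np.zeros((0, dim))

# Nouns from a window centered on the mention; "unknown" is dropped.
M = context_matrix(["apple", "phone", "company", "unknown"], vector_table)
print(M.shape)  # (3, 4)
```

Each row of M corresponds to one in-vocabulary noun of the window, in window order, which matches the matrix [v_1, …, v_K] described above.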
Taking, in step 3, the word vectors of the 20 nouns with the highest TF-IDF weights from the summary information of each entity in the knowledge base to form a word vector matrix is specifically:
1) for each noun w_i in the description features of each candidate entity e_i in the candidate entity set E = {e_1, e_2, …, e_n}, look up the word vector table L to obtain its word vector v_i;
2) from the word vectors of the nouns in the description features, build the word vector matrix of the entity description;
3) end.
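The TF-IDF selection described above (keep the 20 highest-weighted nouns of each entity description) can be sketched with a small self-contained scorer. The exact TF-IDF variant and the tie-breaking order are assumptions; the patent only states that the top-weighted nouns are kept.

```python
import math
from collections import Counter

def top_tfidf_nouns(doc_nouns, corpus_nouns, k=20):
    """Rank one description's nouns by TF-IDF and keep the top k.

    doc_nouns: the nouns of the description being scored.
    corpus_nouns: one noun list per entity description in the knowledge base.
    If the description has fewer than k distinct nouns, all are returned.
    """
    n_docs = len(corpus_nouns)
    tf = Counter(doc_nouns)                      # term frequency
    def idf(w):                                  # smoothed inverse document frequency
        df = sum(1 for d in corpus_nouns if w in d)
        return math.log(n_docs / (1 + df))
    scored = {w: tf[w] * idf(w) for w in tf}
    return [w for w, _ in sorted(scored.items(), key=lambda x: -x[1])][:k]

corpus = [["apple", "fruit", "tree"],
          ["apple", "phone", "company"],
          ["tree", "forest"]]
print(top_tfidf_nouns(corpus[1], corpus, k=2))  # ['phone', 'company']
```

"apple" occurs in two of the three descriptions, so its IDF (and hence its weight) is lowest and it is dropped first, which is exactly the behavior the weighting is meant to produce.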
The convolutional neural network training of step 4 is specifically:
1) the semantic feature of each entity to be disambiguated, together with the semantic features of its candidate entity set, forms one training sample and is input to the neural network model;
2) convolve the semantic feature of the entity to be disambiguated, with the number of convolution kernels (feature maps) set to 200 and the kernel size set to [2, 200], i.e. a matrix of height 2 and width 200;
3) apply 1-max pooling to the output of each convolution kernel, obtaining one feature per kernel;
4) the 200 kernel features form an intermediate result, which is fed to a fully connected layer of size 50, finally yielding a 50-dimensional semantic feature vector;
5) for the semantic features of the candidate entity set, first sum and average them, then feed the result to a fully connected layer, likewise of size 50, finally yielding a 50-dimensional semantic feature vector;
6) the loss function Loss_e of each training sample in the neural network is defined as:
Loss_e = max(0, 1 - sim(e, e_ε) + sim(e, e'))
where e_ε denotes the target entity of the entity e to be disambiguated and e' denotes any other candidate in the candidate entity set; the intent is to maximize the gap between the semantic-feature-vector similarity of the target entity and that of any other candidate entity;
the overall loss function is defined as: Loss = Σ_e Loss_e;
7) the parameters of the neural network are initialized from the uniform distribution U(-0.01, 0.01);
8) the activation function of the neural network is the hyperbolic tangent tanh;
9) the parameters of the neural network are updated by stochastic gradient descent;
10) end.
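The forward pass and ranking loss of steps 2)-7) can be sketched in plain NumPy. The dimensions (200-dimensional vectors, 200 kernels of size [2, 200], a 50-dimensional fully connected layer), the U(-0.01, 0.01) initialization, the tanh activation, and the hinge loss come from the text; the exact placement of the activations and the random toy input are assumptions, and gradient updates are omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
D, N_KERNELS, FC = 200, 200, 50

# Parameters initialized from U(-0.01, 0.01) as stated in step 7).
W_conv = rng.uniform(-0.01, 0.01, size=(N_KERNELS, 2, D))  # kernels of size [2, 200]
W_fc = rng.uniform(-0.01, 0.01, size=(FC, N_KERNELS))

def mention_feature(matrix):
    """Convolve the K x 200 context matrix with each [2, 200] kernel,
    apply 1-max pooling, then a tanh fully connected layer -> 50-d vector."""
    K = matrix.shape[0]
    conv = np.array([[np.tanh(np.sum(W_conv[f] * matrix[i:i + 2]))
                      for i in range(K - 1)]           # slide over adjacent noun pairs
                     for f in range(N_KERNELS)])
    pooled = conv.max(axis=1)                          # 1-max pooling -> 200-d
    return np.tanh(W_fc @ pooled)                      # fully connected -> 50-d

def hinge_loss(sim_target, sim_other):
    """Loss_e = max(0, 1 - sim(e, e_eps) + sim(e, e'))."""
    return max(0.0, 1.0 - sim_target + sim_other)

x = rng.standard_normal((10, D))  # a toy 10-noun context window
v = mention_feature(x)
print(v.shape)  # (50,)
```

The loss is zero once the target entity's similarity exceeds every other candidate's by a margin of 1, which is precisely the gap the training objective maximizes.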
The entity classification stage of step 6 is specifically:
1) read the semantic feature vector a of the entity e to be disambiguated from the file system;
2) read the set of semantic feature vectors B = {b_1, b_2, …, b_n} of the candidate entity set E = {e_1, e_2, …, e_n} from the file system;
3) traverse the candidate entity set and compute the cosine similarity between the feature vector of e and each feature vector in B;
4) choose the entity with the highest similarity as the final prediction;
5) end.
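The classification stage above amounts to an arg-max over cosine similarities. A minimal sketch (the 2-dimensional toy vectors are illustrative only; the patent's feature vectors are 50-dimensional):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity of two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_candidate(a, candidates):
    """Index and similarity of the candidate vector most similar to a."""
    sims = [cosine(a, b) for b in candidates]
    i = int(np.argmax(sims))
    return i, sims[i]

a = np.array([1.0, 0.0])                     # feature vector of the mention
B = [np.array([0.0, 1.0]),                   # orthogonal candidate
     np.array([1.0, 0.1]),                   # nearly parallel candidate
     np.array([-1.0, 0.0])]                  # opposite candidate
idx, sim = best_candidate(a, B)
print(idx)  # 1
```

np.argmax returns the first maximal index, so in the event of an exact similarity tie the earlier candidate in the set wins; the patent does not specify a tie-breaking rule.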
Beneficial effects: the entity disambiguation method of the invention based on word vectors and convolutional neural networks builds separate semantic representations for the entity to be disambiguated and for the knowledge-base candidate entities. A neural network model is trained on the training set; during disambiguation, the entity to be disambiguated is input to the trained model, and the candidate entity most similar to it is output as the final target entity.
Brief description of the drawings
To illustrate the present invention more clearly, the drawings required by the invention are briefly described below:
Fig. 1 is the flow chart of the entity disambiguation method of the invention based on word vectors and convolutional neural networks.
Fig. 2 is the structure diagram of the convolutional neural network model.
Fig. 3 is the flow chart of the entity classification stage.
Embodiment
The present invention is further described below with reference to the drawings.
The flow chart of the entity disambiguation method of the invention based on word vectors and convolutional neural networks is shown in Fig. 1.
Step 0 is the initial state of the entity disambiguation method of the invention;
In the entity recognition stage (steps 1-6):
Step 1 collects, according to the application scenario, a text set containing the entities to be disambiguated;
Step 2 builds, according to domain knowledge, the knowledge base of the entities to be disambiguated;
Step 3 calls the Chinese Academy of Sciences Chinese word segmenter ICTCLAS to perform part-of-speech tagging and word segmentation on the text set, then filters out stop words according to the stop-word list, and creates a noun dictionary for proper nouns and entity names that are otherwise hard to recognize;
Step 4 calls the Chinese word segmenter ICTCLAS to perform part-of-speech tagging and word segmentation on the entity descriptions in the knowledge base, and filters out stop words according to the stop-word list;
Step 5 determines, according to the application scenario, each entity of interest to be disambiguated and its contextual features;
Step 6 generates the candidate entities: the knowledge base is searched, and the mention of the entity to be disambiguated in the text is compared with the entity mentions in the knowledge base; if they are identical, those entities are regarded as candidate entities of the mention in the text, thereby determining the candidate entity set of each entity to be disambiguated and the description features of each candidate entity in the set;
In the entity semantic representation stage (steps 7-10):
Step 7 takes the word vectors of the nouns in a fixed-size window centered on the entity to be disambiguated and assembles them into a word vector matrix (after the text set has been part-of-speech tagged and segmented, the words tagged /n); the window size is 10;
1) call Google's deep learning tool word2vec to train on the Chinese Wikipedia corpus, obtaining a word vector table L in which each word vector has 200 dimensions, every dimension being a real number;
2) for each noun w_i in the context context_e = {w_1, w_2, …, w_i, …, w_K} of the entity e to be disambiguated, look up the word vector table L to obtain its word vector v_i;
3) from the word vectors of the context words of entity e, build the context word vector matrix [v_1, v_2, v_3, …, v_i, …, v_K] of entity e;
4) end.
Step 8 takes, from the summary information of each entity in the knowledge base, the word vectors of the 20 nouns with the highest TF-IDF weights to form a word vector matrix; if there are fewer than 20 nouns, all available nouns are taken;
1) for each noun w_i in the description features of each candidate entity e_i in the candidate entity set E = {e_1, e_2, …, e_n}, look up the word vector table L to obtain its word vector v_i;
2) from the word vectors of the nouns in the description features, build the word vector matrix of the entity description;
3) end.
Step 9 uses the word vector matrix of step 7 as the contextual semantic feature of the entity to be disambiguated;
Step 10 uses the word vector matrix of step 8 as the semantic feature of the knowledge-base entity;
In the neural network training stage (steps 11-12):
Step 11 forms the training set from the known unambiguous entities in the text joined with the knowledge-base entities;
Step 12 inputs the training set of step 11 to the convolutional neural network model for training and adjusts the parameters of the model;
1) the semantic representation of each entity to be disambiguated, together with the semantic features of its candidate entity set, forms one training sample and is input to the neural network model;
2) convolve the semantic feature of the entity to be disambiguated, with the number of convolution kernels (feature maps) set to 200 and the kernel size set to [2, 200], i.e. a matrix of height 2 and width 200;
3) apply 1-max pooling to the output of each convolution kernel, obtaining one feature per kernel;
4) the 200 kernel features form an intermediate result, which is fed to a fully connected layer of size 50, finally yielding a 50-dimensional semantic feature vector;
5) for the semantic features of the candidate entity set, first sum and average them, then feed the result to a fully connected layer, likewise of size 50, finally yielding a 50-dimensional semantic feature vector;
6) the loss function Loss_e of each training sample in the neural network is defined as:
Loss_e = max(0, 1 - sim(e, e_ε) + sim(e, e'))
where e_ε denotes the target entity of the entity e to be disambiguated and e' denotes any other candidate in the candidate entity set; the intent is to maximize the gap between the semantic-feature-vector similarity of the target entity and that of any other candidate entity;
the overall loss function is defined as: Loss = Σ_e Loss_e;
7) the parameters of the neural network are initialized from the uniform distribution U(-0.01, 0.01);
8) the activation function of the neural network is the hyperbolic tangent tanh;
9) the parameters of the neural network are updated by stochastic gradient descent;
10) end.
In the entity classification stage (steps 13-14):
Step 13 reads from the text the sample set of entities to be disambiguated and knowledge-base candidate entities;
Step 14 traverses the sample set read in step 13, inputs each sample to the convolutional neural network model trained in step 12, and outputs the classification results;
Step 15 is the end step of the entity disambiguation method of the invention based on word vectors and convolutional neural networks;
Fig. 2 is a detailed overview of the neural network structure of step 12 of the neural network training stage in Fig. 1, comprising the following parts:
Word vector matrices: the word vector matrix of the context of the entity to be disambiguated and the word vector matrix of the knowledge-base entity description features serve as the input of the convolutional neural network;
Convolutional layer: the context word vector matrix of the entity to be disambiguated is convolved with 200 different kernels, yielding one feature per kernel;
1-max pooling layer: 1-max pooling is applied to the features output by the convolutional layer, yielding a 200-dimensional intermediate result;
Fully connected layer: a fully connected layer of size 50 is attached to the above intermediate result; the summed and averaged word vectors of the knowledge-base candidate entities are likewise fed to a fully connected layer of size 50, yielding two 50-dimensional semantic feature vectors;
Similarity computation: the cosine similarity of the two semantic feature vectors is computed;
Fig. 3 describes in detail the flow of step 14 of the entity classification stage in Fig. 1:
Step 16 is the initial state of Fig. 3;
Step 17 reads the trained neural network model from the file system;
Step 18 reads from the text the sample set of entities to be disambiguated and knowledge-base candidate entities;
Step 19 inputs the sample set to the convolutional neural network model and, once the semantic feature vectors have been obtained, traverses the knowledge-base candidate entity set and computes the cosine similarity between the semantic feature vector of the entity to be disambiguated and that of each candidate entity;
Step 20 outputs the entity with the highest similarity as the final target entity;
Step 21 is the end state of Fig. 3;
Specifically: 1) read the semantic feature vector a of the entity e to be disambiguated from the file system;
2) read the set of semantic feature vectors B = {b_1, b_2, …, b_n} of the candidate entity set E = {e_1, e_2, …, e_n} from the file system;
3) traverse the candidate entity set and compute the cosine similarity between the feature vector of e and each feature vector in B;
4) choose the entity with the highest similarity as the final prediction;
5) end.
In summary, the present invention combines word vectors and convolutional neural networks: word vector matrices are constructed from the context of the entity to be disambiguated and from the summary information of the knowledge-base candidate entities, and are input to a convolutional neural network model; the model is trained and its parameters adjusted; in the prediction phase, the most similar entity is output as the target entity. This overcomes the lexical gap, and hence the insufficient semantic representation power, of the traditional bag-of-words model, and further improves the accuracy of entity disambiguation.
The above is only a preferred embodiment of the present invention. It should be noted that a person of ordinary skill in the art can make several improvements and modifications without departing from the principles of the invention, and such improvements and modifications should also be regarded as falling within the scope of protection of the invention.

Claims (7)

1. An entity disambiguation method based on word vectors and convolutional neural networks, characterized by comprising the steps of:
Step 1: collecting, according to the application scenario, a text set containing the entities to be disambiguated, preprocessing the text set, and determining each entity to be disambiguated in the text set together with its contextual features;
Step 2: building, according to domain knowledge, a knowledge base of the entities to be disambiguated, searching the knowledge base, and determining the candidate entity set of each entity to be disambiguated and the description features of each candidate entity in the set;
Step 3: taking the word vectors of the nouns in a fixed-size window centered on the entity to be disambiguated and assembling them into a word vector matrix, used as the contextual semantic feature of the entity to be disambiguated; from the summary information of each entity in the knowledge base, taking the word vectors of the 20 nouns with the highest TF-IDF weights and assembling them into a word vector matrix, used as the semantic feature of the knowledge-base entity;
Step 4: forming a training set from the known unambiguous entities in the text joined with the knowledge-base target entities and candidate entities, inputting it to a convolutional neural network model for training, and adjusting the parameters of the model;
Step 5: inputting each sample, formed by an entity to be disambiguated and its knowledge-base candidate entity set, to the convolutional neural network model obtained in step 4, obtaining semantic feature vectors for the entity to be disambiguated and for each knowledge-base entity in the candidate entity set;
Step 6: based on the semantic feature vectors, computing the cosine similarity between the entity to be disambiguated and each entity in the knowledge-base candidate entity set; taking the candidate entity with the highest similarity as the final target entity of the entity to be disambiguated.
2. The entity disambiguation method according to claim 1, characterized in that the preprocessing in step 1 performs part-of-speech tagging and word segmentation on the text set with the Chinese Academy of Sciences Chinese word segmenter ICTCLAS, then filters out stop words according to a stop-word list, and builds a noun dictionary for proper nouns and entity names that are otherwise hard to recognize.
3. The entity disambiguation method according to claim 1, characterized in that in step 2 the Chinese word segmenter ICTCLAS is called to perform part-of-speech tagging and word segmentation on the entity descriptions in the knowledge base, and stop words are filtered out according to the stop-word list.
4. The entity disambiguation method according to claim 1, characterized in that assembling the word vectors of the nouns in a fixed-size window centered on the entity to be disambiguated into a word vector matrix in step 3 is specifically:
1) call Google's deep learning tool word2vec to train on the Chinese Wikipedia corpus, obtaining a word vector table L in which each word vector has 200 dimensions, every dimension being a real number;
2) for each noun w_i in the context context_e = {w_1, w_2, …, w_i, …, w_K} of the entity e to be disambiguated, look up the word vector table L to obtain its word vector v_i;
3) from the word vectors of the context words of entity e, build the context word vector matrix [v_1, v_2, v_3, …, v_i, …, v_K] of entity e;
4) end.
5. The entity disambiguation method according to claim 1, characterized in that taking, in step 3, the word vectors of the 20 nouns with the highest TF-IDF weights from the summary information of each entity in the knowledge base to form a word vector matrix is specifically:
1) for each noun w_i in the description features of each candidate entity e_i in the candidate entity set E = {e_1, e_2, …, e_n}, look up the word vector table L to obtain its word vector v_i;
2) from the word vectors of the nouns in the description features, build the word vector matrix of the entity description;
3) end.
6. The entity disambiguation method according to claim 1, characterized in that the convolutional neural network training of step 4 proceeds as follows:
1) the semantic feature of each entity to be disambiguated, together with the semantic features of its candidate entity set, forms one training sample and is input to the neural network model;
2) convolve the semantic feature of the entity to be disambiguated, with the number of convolution kernels (feature maps) set to 200 and the kernel size set to [2, 200], i.e. a matrix of height 2 and width 200;
3) apply 1-max pooling to the output of each convolution kernel, obtaining one feature per kernel;
4) the 200 kernel features form an intermediate result, which is fed to a fully connected layer of size 50, finally yielding a 50-dimensional semantic feature vector;
5) for the semantic features of the candidate entity set, first sum and average them, then feed the result to a fully connected layer, likewise of size 50, finally yielding a 50-dimensional semantic feature vector;
6) the loss function Loss_e of each training sample in the neural network is defined as:
Loss_e = max(0, 1 - sim(e, e_ε) + sim(e, e'))
where e_ε denotes the target entity of the entity e to be disambiguated and e' denotes any other candidate in the candidate entity set; the intent is to maximize the gap between the semantic-feature-vector similarity of the target entity and that of any other candidate entity;
the overall loss function is defined as: Loss = Σ_e Loss_e;
7) the parameters of the neural network are initialized from the uniform distribution U(-0.01, 0.01);
8) the activation function of the neural network is the hyperbolic tangent tanh;
9) the parameters of the neural network are updated by stochastic gradient descent;
10) end.
7. The entity disambiguation method according to claim 1, characterized in that the entity classification stage of step 6 proceeds as follows:
1) read the semantic feature vector a of the entity e to be disambiguated from the file system;
2) read the set of semantic feature vectors B = {b_1, b_2, …, b_n} of the candidate entity set E = {e_1, e_2, …, e_n} from the file system;
3) traverse the candidate entity set and compute the cosine similarity between the feature vector of e and each feature vector in B;
4) choose the entity with the highest similarity as the final prediction;
5) end.
CN201710373502.2A 2017-05-24 2017-05-24 Entity disambiguation method based on word vector and convolutional neural network Active CN107102989B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710373502.2A CN107102989B (en) 2017-05-24 2017-05-24 Entity disambiguation method based on word vector and convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710373502.2A CN107102989B (en) 2017-05-24 2017-05-24 Entity disambiguation method based on word vector and convolutional neural network

Publications (2)

Publication Number Publication Date
CN107102989A true CN107102989A (en) 2017-08-29
CN107102989B CN107102989B (en) 2020-09-29

Family

ID=59670296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710373502.2A Active CN107102989B (en) 2017-05-24 2017-05-24 Entity disambiguation method based on word vector and convolutional neural network

Country Status (1)

Country Link
CN (1) CN107102989B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572892A (en) * 2014-12-24 2015-04-29 中国科学院自动化研究所 Text classification method based on recurrent convolutional networks
CN106295796A (en) * 2016-07-22 2017-01-04 浙江大学 Entity linking method based on deep learning
CN106547735A (en) * 2016-10-25 2017-03-29 复旦大学 Construction and usage of context-aware dynamic word and character vectors based on deep learning
CN106570170A (en) * 2016-11-09 2017-04-19 武汉泰迪智慧科技有限公司 Integrated text classification and named entity recognition method and system based on deep recurrent neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Du Jingjun et al., "Named entity disambiguation method based on Chinese Wikipedia", Journal of Hangzhou Dianzi University *

Cited By (81)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107562729A (en) * 2017-09-14 2018-01-09 云南大学 Party building text representation method based on neural network and theme enhancement
CN107562729B (en) * 2017-09-14 2020-12-08 云南大学 Party building text representation method based on neural network and theme enhancement
CN107730002A (en) * 2017-10-13 2018-02-23 国网湖南省电力公司 A kind of communication network shutdown remote control parameter intelligent fuzzy comparison method
CN107730002B (en) * 2017-10-13 2020-06-02 国网湖南省电力公司 Intelligent fuzzy comparison method for remote control parameters of communication gateway machine
CN107729509A (en) * 2017-10-23 2018-02-23 中国电子科技集团公司第二十八研究所 The chapter similarity decision method represented based on recessive higher-dimension distributed nature
CN107729509B (en) * 2017-10-23 2020-07-07 中国电子科技集团公司第二十八研究所 Discourse similarity determination method based on recessive high-dimensional distributed feature representation
CN110019792A (en) * 2017-10-30 2019-07-16 阿里巴巴集团控股有限公司 Text classification method and device and classifier model training method
CN108280061B (en) * 2018-01-17 2021-10-26 北京百度网讯科技有限公司 Text processing method and device based on ambiguous entity words
CN108280061A (en) * 2018-01-17 2018-07-13 北京百度网讯科技有限公司 Text processing method and device based on ambiguous entity words
US20190220749A1 (en) * 2018-01-17 2019-07-18 Beijing Baidu Netcom Science And Technology Co., Ltd. Text processing method and device based on ambiguous entity words
US11455542B2 (en) * 2018-01-17 2022-09-27 Beijing Baidu Netcom Science And Technology Co., Ltd. Text processing method and device based on ambiguous entity words
CN108304552A (en) * 2018-02-01 2018-07-20 浙江大学 A named entity linking method based on knowledge-base feature extraction
CN108335731A (en) * 2018-02-09 2018-07-27 辽宁工程技术大学 A computer-vision-based diet recommendation method for invalids
CN108399230A (en) * 2018-02-13 2018-08-14 上海大学 A Chinese financial news text classification method based on convolutional neural networks
CN108446269A (en) * 2018-03-05 2018-08-24 昆明理工大学 A kind of Word sense disambiguation method and device based on term vector
CN108573047A (en) * 2018-04-18 2018-09-25 广东工业大学 A kind of training method and device of Module of Automatic Chinese Documents Classification
CN108563766A (en) * 2018-04-19 2018-09-21 天津科技大学 The method and device of food retrieval
CN108959242A (en) * 2018-05-08 2018-12-07 中国科学院信息工程研究所 A target entity recognition method and device based on Chinese character and part-of-speech features
CN108647785A (en) * 2018-05-17 2018-10-12 普强信息技术(北京)有限公司 A kind of neural network method for automatic modeling, device and storage medium
CN108647191A (en) * 2018-05-17 2018-10-12 南京大学 A sentiment dictionary construction method based on supervised sentiment texts and term vectors
CN108804595B (en) * 2018-05-28 2021-07-27 中山大学 Short text representation method based on word2vec
CN108804595A (en) * 2018-05-28 2018-11-13 中山大学 A kind of short text representation method based on word2vec
CN110555208A (en) * 2018-06-04 2019-12-10 北京三快在线科技有限公司 ambiguity elimination method and device in information query and electronic equipment
CN108921213A (en) * 2018-06-28 2018-11-30 国信优易数据有限公司 A kind of entity classification model training method and device
CN108921213B (en) * 2018-06-28 2021-06-22 国信优易数据股份有限公司 Entity classification model training method and device
CN108805290A (en) * 2018-06-28 2018-11-13 国信优易数据有限公司 A kind of determination method and device of entity class
CN109101579B (en) * 2018-07-19 2021-11-23 深圳追一科技有限公司 Customer service robot knowledge base ambiguity detection method
CN109101579A (en) * 2018-07-19 2018-12-28 深圳追一科技有限公司 Customer service robot knowledge base ambiguity detection method
CN108920467A (en) * 2018-08-01 2018-11-30 北京三快在线科技有限公司 Polysemous word vector learning method and device, and search result display method
CN109325108B (en) * 2018-08-13 2022-05-27 北京百度网讯科技有限公司 Query processing method, device, server and storage medium
US11216618B2 (en) 2018-08-13 2022-01-04 Beijing Baidu Netcom Science And Technology Co., Ltd. Query processing method, apparatus, server and storage medium
CN109325108A (en) * 2018-08-13 2019-02-12 北京百度网讯科技有限公司 Query processing method, device, server and storage medium
CN109241294A (en) * 2018-08-29 2019-01-18 国信优易数据有限公司 A kind of entity link method and device
CN109214007A (en) * 2018-09-19 2019-01-15 哈尔滨理工大学 A Chinese sentence word sense disambiguation method based on convolutional neural networks
CN109299462A (en) * 2018-09-20 2019-02-01 武汉理工大学 Short text similarity calculating method based on multidimensional convolution feature
CN109614615A (en) * 2018-12-04 2019-04-12 联想(北京)有限公司 Entity matching method, device and electronic equipment
CN109740728A (en) * 2018-12-10 2019-05-10 杭州世平信息科技有限公司 A sentencing calculation method based on multiple neural network ensembles
CN109740728B (en) * 2018-12-10 2019-11-01 杭州世平信息科技有限公司 A sentencing calculation method based on multiple neural network ensembles
CN109635114A (en) * 2018-12-17 2019-04-16 北京百度网讯科技有限公司 Method and apparatus for handling information
CN109933788A (en) * 2019-02-14 2019-06-25 北京百度网讯科技有限公司 Type determination method, apparatus, device and medium
WO2020228376A1 (en) * 2019-05-16 2020-11-19 华为技术有限公司 Text processing method and model training method and apparatus
CN110598846B (en) * 2019-08-15 2022-05-03 北京航空航天大学 Hierarchical recurrent neural network decoder and decoding method
CN110598846A (en) * 2019-08-15 2019-12-20 北京航空航天大学 Hierarchical recurrent neural network decoder and decoding method
CN110705292B (en) * 2019-08-22 2022-11-29 成都信息工程大学 Entity name extraction method based on knowledge base and deep learning
CN110705292A (en) * 2019-08-22 2020-01-17 成都信息工程大学 Entity name extraction method based on knowledge base and deep learning
CN110569506A (en) * 2019-09-05 2019-12-13 清华大学 Medical named entity recognition method based on medical dictionary
CN110705295A (en) * 2019-09-11 2020-01-17 北京航空航天大学 Entity name disambiguation method based on keyword extraction
CN110705295B (en) * 2019-09-11 2021-08-24 北京航空航天大学 Entity name disambiguation method based on keyword extraction
CN110674304A (en) * 2019-10-09 2020-01-10 北京明略软件系统有限公司 Entity disambiguation method and device, readable storage medium and electronic equipment
CN110826331B (en) * 2019-10-28 2023-04-18 南京师范大学 Intelligent construction method of place name labeling corpus based on interactive and iterative learning
CN110826331A (en) * 2019-10-28 2020-02-21 南京师范大学 Intelligent construction method of place name labeling corpus based on interactive and iterative learning
CN110852106B (en) * 2019-11-06 2024-05-03 腾讯科技(深圳)有限公司 Named entity processing method and device based on artificial intelligence and electronic equipment
CN110852106A (en) * 2019-11-06 2020-02-28 腾讯科技(深圳)有限公司 Named entity processing method and device based on artificial intelligence and electronic equipment
CN110852108A (en) * 2019-11-11 2020-02-28 中山大学 Joint training method, apparatus and medium for entity recognition and entity disambiguation
CN113010633B (en) * 2019-12-20 2023-01-31 海信视像科技股份有限公司 Information interaction method and equipment
CN113010633A (en) * 2019-12-20 2021-06-22 海信视像科技股份有限公司 Information interaction method and equipment
CN111241298A (en) * 2020-01-08 2020-06-05 腾讯科技(深圳)有限公司 Information processing method, apparatus and computer readable storage medium
CN111241298B (en) * 2020-01-08 2023-10-10 腾讯科技(深圳)有限公司 Information processing method, apparatus, and computer-readable storage medium
CN111241824A (en) * 2020-01-09 2020-06-05 中国搜索信息科技股份有限公司 Method for identifying Chinese metaphor information
CN111310481A (en) * 2020-01-19 2020-06-19 百度在线网络技术(北京)有限公司 Speech translation method, device, computer equipment and storage medium
CN111597804A (en) * 2020-05-15 2020-08-28 腾讯科技(深圳)有限公司 Entity recognition model training method and related device
CN111597804B (en) * 2020-05-15 2023-03-10 腾讯科技(深圳)有限公司 Method and related device for training entity recognition model
CN111709243B (en) * 2020-06-19 2023-07-07 南京优慧信安科技有限公司 Knowledge extraction method and device based on deep learning
CN111709243A (en) * 2020-06-19 2020-09-25 南京优慧信安科技有限公司 Knowledge extraction method and device based on deep learning
CN112069826B (en) * 2020-07-15 2021-12-07 浙江工业大学 Vertical domain entity disambiguation method fusing topic model and convolutional neural network
CN112069826A (en) * 2020-07-15 2020-12-11 浙江工业大学 Vertical domain entity disambiguation method fusing topic model and convolutional neural network
CN112100356A (en) * 2020-09-17 2020-12-18 武汉纺织大学 Knowledge base question-answer entity linking method and system based on similarity
CN112257443A (en) * 2020-09-30 2021-01-22 华泰证券股份有限公司 MRC-based company entity disambiguation method combined with knowledge base
CN112257443B (en) * 2020-09-30 2024-04-02 华泰证券股份有限公司 MRC-based company entity disambiguation method combined with knowledge base
CN112464669B (en) * 2020-12-07 2024-02-09 宁波深擎信息科技有限公司 Stock entity word disambiguation method, computer device, and storage medium
CN112464669A (en) * 2020-12-07 2021-03-09 宁波深擎信息科技有限公司 Stock entity word disambiguation method, computer device and storage medium
CN112966117A (en) * 2020-12-28 2021-06-15 成都数之联科技有限公司 Entity linking method
CN112580351B (en) * 2020-12-31 2022-04-19 成都信息工程大学 Machine-generated text detection method based on self-information loss compensation
CN112580351A (en) * 2020-12-31 2021-03-30 成都信息工程大学 Machine-generated text detection method based on self-information loss compensation
CN113761218A (en) * 2021-04-27 2021-12-07 腾讯科技(深圳)有限公司 Entity linking method, device, equipment and storage medium
CN113761218B (en) * 2021-04-27 2024-05-10 腾讯科技(深圳)有限公司 Method, device, equipment and storage medium for entity linking
CN113283236B (en) * 2021-05-31 2022-07-19 北京邮电大学 Entity disambiguation method in complex Chinese text
CN113283236A (en) * 2021-05-31 2021-08-20 北京邮电大学 Entity disambiguation method in complex Chinese text
CN113704416B (en) * 2021-10-26 2022-03-04 深圳市北科瑞声科技股份有限公司 Word sense disambiguation method and device, electronic equipment and computer-readable storage medium
CN113704416A (en) * 2021-10-26 2021-11-26 深圳市北科瑞声科技股份有限公司 Word sense disambiguation method and device, electronic equipment and computer-readable storage medium
WO2023202170A1 (en) * 2022-04-21 2023-10-26 北京沃东天骏信息技术有限公司 Product word disambiguation method and apparatus

Also Published As

Publication number Publication date
CN107102989B (en) 2020-09-29

Similar Documents

Publication Publication Date Title
CN107102989A (en) A kind of entity disambiguation method based on term vector, convolutional neural networks
CN107133213B (en) Method and system for automatically extracting text abstract based on algorithm
US10997370B2 (en) Hybrid classifier for assigning natural language processing (NLP) inputs to domains in real-time
CN107944559B (en) Method and system for automatically identifying entity relationship
CN110032632A (en) Intelligent customer service answering method, device and storage medium based on text similarity
CN103927302B (en) A text classification method and system
CN109241283A (en) A text classification method based on multi-angle capsule network
CN106599029A (en) Chinese short text clustering method
CN106815252A (en) A kind of searching method and equipment
CN104778256B (en) A fast incremental clustering method for consultations in domain question-answering systems
CN113392209B (en) Text clustering method based on artificial intelligence, related equipment and storage medium
CN106407280A (en) Query target matching method and device
WO2017193685A1 (en) Method and device for data processing in social network
CN112163425A (en) Text entity relation extraction method based on multi-feature information enhancement
CN103646099A (en) Paper recommendation method based on multilayer graphs
CN110347776A (en) Point-of-interest name matching method, device, equipment and storage medium
CN105893362A (en) A method for acquiring knowledge point semantic vectors and a method and a system for determining correlative knowledge points
CN109582761A (en) A Chinese intelligent question answering method based on word similarity for network platforms
CN107092605A (en) A kind of entity link method and device
Liu et al. Structured alignment networks for matching sentences
CN110851570A (en) Unsupervised keyword extraction method based on Embedding technology
CN113761890A (en) BERT context sensing-based multi-level semantic information retrieval method
Fahrni et al. HITS' Monolingual and Cross-lingual Entity Linking System at TAC 2013.
Lu et al. Feature words selection for knowledge-based word sense disambiguation with syntactic parsing
Feifei et al. Bert-based Siamese Network for Semantic Similarity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant