CN108763192A - Entity relation extraction method and device for text-processing - Google Patents

Info

Publication number
CN108763192A
Authority
CN
China
Prior art keywords
entity
similarity
threshold value
predetermined threshold
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810348221.6A
Other languages
Chinese (zh)
Other versions
CN108763192B (en)
Inventor
朱耀邦
高翔
纪达麒
陈运文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Daguan Data Co ltd
Original Assignee
Information Technology (shanghai) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Technology (shanghai) Co Ltd
Priority to CN201810348221.6A
Publication of CN108763192A
Application granted
Publication of CN108763192B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

This application discloses an entity relation extraction method and device for text processing. The method includes: inputting a text to be processed; identifying the entities in the text to be processed, where the text contains multiple entities; screening the entities according to a preset sample to obtain the contextual features of an input instance; computing, from the contextual features, the context similarity between the input instance and each seed sample in a seed sample library; judging whether the context similarity exceeds a first preset threshold; if it does, counting the number of seed samples whose similarity exceeds the first preset threshold; judging whether that number exceeds a second preset threshold; and, if it does, taking the input instance as an entity relation instance obtained by the text processing. The application addresses the technical problem that rule-based methods achieve high precision but low recall.

Description

Entity relation extraction method and device for text-processing
Technical field
This application relates to the technical field of text processing, and in particular to an entity relation extraction method and device for text processing.
Background
With the rapid development of the Internet, it has become the main channel through which people obtain information, and the volume of text data on it is growing explosively. This text data contains abundant information and plays a very important role in building knowledge bases and knowledge graphs. Extracting the relevant knowledge manually, however, requires an enormous amount of work, so automatically extracting useful information by computer would be of great significance. Yet almost all text data on the Internet is unstructured data in natural-language form, which computers cannot process directly.
To solve this problem, information extraction technology emerged. It extracts structured data from unstructured text, including entities, relations between entities, events, and so on. Relation extraction is a key technique in the field of information extraction: the entities in a text are usually identified by named entity recognition, and the relations between the entities are then identified by relation extraction. Common relation extraction methods include rule-based, unsupervised, supervised, and semi-supervised methods. Rule-based methods have obvious shortcomings: a large number of rules must be written by hand, which is labor-intensive, hard to maintain, and does not extend well to other domains. Unsupervised methods cluster the text, but the results are often poor: recall and precision are both low, and considerable manual intervention is needed.
Relation classification based on traditional machine learning algorithms requires a large manually annotated training corpus, which is labor-intensive, and it solves neither domain portability nor the handling of new relations. Semi-supervised methods mainly use a small number of annotated instances as an initial seed set and then, through repeated iteration, extract similar instances from unstructured data to expand the seed set. No effective solution to the above problems has yet been proposed.
Summary of the invention
The main purpose of this application is to provide an entity relation extraction method and device for text processing, so as to solve the problem that rule-based methods have high precision but low recall.
To achieve this purpose, according to one aspect of this application, an entity relation extraction method for text processing is provided.
The entity relation extraction method for text processing according to this application includes: inputting a text to be processed; identifying the entities in the text to be processed, where the text contains multiple entities; screening the entities according to a preset sample to obtain the contextual features of an input instance; computing, from the contextual features, the context similarity between the input instance and each seed sample; judging whether the context similarity exceeds a first preset threshold; if it does, counting the number of seed samples whose similarity exceeds the first preset threshold; judging whether that number exceeds a second preset threshold; and, if it does, taking the input instance as an entity relation instance obtained by the text processing.
Further, before the entity relation extraction method starts, the method includes training a word vector model, specifically: training on a background corpus with the gensim tool to obtain the word vector model.
Further, identifying the entities in the text to be processed includes: obtaining the entities in the text to be processed with a named entity recognition method.
Further, screening the entities according to the preset sample to obtain the contextual features of the input instance includes: segmenting the text to be processed into words; tagging the segmentation result with parts of speech; filtering the part-of-speech tagging result to obtain candidate words; obtaining the target words among the candidate words with a context window; and forming the contextual features of the input instance from the target words.
Further, computing the context similarity between the input instance and a seed sample from the contextual features includes: substituting the contextual features into a preset formula to obtain the context similarity. The preset formula is:
where similarity denotes the context similarity.
To achieve the above purpose, according to another aspect of this application, an entity relation extraction device for text processing is provided.
The entity relation extraction device for text processing according to this application includes: an input module, which inputs a text to be processed; an identification module, which identifies the entities in the text to be processed, where the text contains multiple entities, and builds an input instance (entity 1, entity 2, input text); a screening module, which screens the entities according to a preset sample to obtain the contextual features of the input instance; a computing module, which computes, from the contextual features, the context similarity between the input instance and each seed sample; a first judgment module, which judges whether the context similarity exceeds a first preset threshold; a statistics module, which, if the similarity exceeds the first preset threshold, counts the number of seed samples whose similarity exceeds the first preset threshold; a second judgment module, which judges whether that number exceeds a second preset threshold; and a termination module, which, if that number exceeds the second preset threshold, takes the input instance as an entity relation instance obtained by the text processing.
Further, the entity relation extraction device also includes a training module for training a word vector model, specifically: training on a background corpus with the gensim tool to obtain the word vector model.
Further, the identification module includes an entity acquisition module, which obtains the entities in the text to be processed with a named entity recognition method.
Further, the screening module includes: a word segmentation module, for segmenting the text to be processed into words; a tagging module, which tags the segmentation result with parts of speech; a filtering module, which filters the part-of-speech tagging result to obtain candidate words; a target word acquisition module, for obtaining the target words among the candidate words with a context window; and a contextual feature generation module, for forming the contextual features of the input instance from the target words.
Further, the computing module includes a substitution module, for substituting the contextual features into the preset formula to obtain the context similarity.
In the embodiments of this application, a word vector model is combined with context similarity: the similarity between the input instance and the seed samples is computed and compared with preset thresholds to obtain the samples that meet the target. This achieves entity relation extraction, improves the recall of relation extraction, and thereby solves the technical problem that rule-based methods have high precision but low recall.
Description of the drawings
The accompanying drawings, which form part of this application, provide a further understanding of the application and make its other features, objects, and advantages more apparent. The drawings of the illustrative embodiments and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of the entity relation extraction method for text processing according to an embodiment of this application;
Fig. 2 is a schematic diagram of contextual feature generation according to an embodiment of this application;
Fig. 3 is a schematic diagram of the entity relation extraction device for text processing according to an embodiment of this application;
Fig. 4 is a schematic diagram of the screening module according to an embodiment of this application; and
Fig. 5 is an operational flowchart of the method according to an embodiment of this application.
Detailed description
To help those skilled in the art better understand the solution of this application, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of this application without creative effort shall fall within the scope of protection of this application.
It should be noted that the terms "include" and "have" in the description, the claims, and the above drawings, and any variants of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
It should be noted that, where no conflict arises, the embodiments of this application and the features in them may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
As shown in Fig. 1, this application relates to an entity relation extraction method for text processing. The method includes the following steps S101 to S108:
Step S101: input a text to be processed.
The text to be processed may contain structured data that needs to be extracted from unstructured text data, including but not limited to entities, relations between entities, and events.
Step S102: identify the entities in the text to be processed, where the text contains multiple entities.
The entities in the text to be processed are obtained with a named entity recognition method.
Step S103: screen the entities according to a preset sample to obtain the contextual features of an input instance.
As a preferred option of this embodiment, as shown in Fig. 2, step S103 includes the following steps S201 to S205:
Step S201: segment the text to be processed into words.
Step S202: tag the segmentation result with parts of speech.
Preferably, the segmentation results are tagged as noun, verb, adverb, and so on.
Step S203: filter the part-of-speech tagging result to obtain candidate words.
Preferably, only verbs and nouns are retained as candidate words.
Step S204: obtain the target words among the candidate words with a context window.
Preferably, the context [left1, right1, left2, right2] is obtained according to the context window (a, b, c, d), where left1, right1, left2, and right2 are, respectively, the a words to the left of entity 1, the b words to the right of entity 1, the c words to the left of entity 2, and the d words to the right of entity 2. If the actual number of words is smaller than the window size, all of the words are taken.
Step S205: form the contextual features of the input instance from the target words.
Step S104: compute, from the contextual features, the context similarity between the input instance and each seed sample.
Preferably, the contextual features are substituted into a preset formula to obtain the context similarity. The preset formula is:
where similarity denotes the context similarity.
Step S105: judge whether the context similarity exceeds a first preset threshold.
Step S106: if the similarity exceeds the first preset threshold, count the number of seed samples whose similarity exceeds the first preset threshold.
Step S107: judge whether the number of seed samples whose similarity exceeds the first preset threshold exceeds a second preset threshold.
Step S108: if that number exceeds the second preset threshold, take the input instance as an entity relation instance obtained by the text processing.
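The decision in steps S105 to S108 can be sketched as follows. This is a minimal illustration, assuming the similarities to all seed samples have already been computed; the function name `accept_instance` and its parameter names are hypothetical and do not appear in the application.

```python
from typing import Sequence


def accept_instance(similarities: Sequence[float],
                    sim_threshold: float,
                    count_threshold: float) -> bool:
    """Decide whether an input instance is kept as an entity relation instance.

    similarities: context similarity of the input instance to each seed sample.
    sim_threshold: the first preset threshold (applied to each similarity).
    count_threshold: the second preset threshold (applied to the match count).
    """
    # Steps S105/S106: count the seed samples whose similarity exceeds
    # the first preset threshold.
    matches = sum(1 for s in similarities if s > sim_threshold)
    # Steps S107/S108: keep the instance only if that count exceeds
    # the second preset threshold.
    return matches > count_threshold
```

For example, `accept_instance([0.9, 0.8, 0.1], 0.5, 1)` counts two matching seed samples and therefore keeps the instance.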
It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
According to an embodiment of this application, a device for implementing the above method is also provided. As shown in Fig. 3, the device includes: an input module 10, which inputs a text to be processed; an identification module 20, which identifies the entities in the text to be processed, where the text contains multiple entities; a screening module 30, which screens the entities according to a preset sample to obtain the contextual features of an input instance; a computing module 40, which computes, from the contextual features, the context similarity between the input instance and each seed sample; a first judgment module 50, which judges whether the context similarity exceeds a first preset threshold; a statistics module 60, which, if the similarity exceeds the first preset threshold, counts the number of seed samples whose similarity exceeds the first preset threshold; a second judgment module 70, which judges whether that number exceeds a second preset threshold; and a termination module 80, which, if that number exceeds the second preset threshold, takes the input instance as an entity relation instance obtained by the text processing.
As shown in Fig. 4, the screening module 30 includes: a word segmentation module 301, for segmenting the text to be processed into words; a tagging module 302, which tags the segmentation result with parts of speech; a filtering module 303, which filters the part-of-speech tagging result to obtain candidate words; a target word acquisition module 304, for obtaining the target words among the candidate words with a context window; and a contextual feature generation module 305, for forming the contextual features of the input instance from the target words.
As shown in Fig. 5, the operational flow of the method of the present invention is as follows:
Seed sample generation: rule templates are written from domain knowledge to identify the specified entity relation. The rule templates should be as strict as possible to ensure high precision, and should also cover as many ways of expressing the relation as possible. After the rules have identified candidate seed samples, the candidates are filtered manually to remove incorrect samples, which yields the final seed samples.
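A rule template of this kind might be sketched with regular expressions. The application does not specify the template format, so the "founder of" relation, the two English patterns, and the function name `candidate_seeds` below are purely hypothetical; the output would still need the manual filtering described above.

```python
import re

# Hypothetical, deliberately strict templates for a "founder of" relation.
# Two patterns are used so that more than one phrasing of the relation is covered.
FOUNDER_TEMPLATES = [
    re.compile(r"(?P<person>[A-Z][a-z]+(?: [A-Z][a-z]+)+) founded "
               r"(?P<org>[A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)*)"),
    re.compile(r"(?P<org>[A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)*) was founded by "
               r"(?P<person>[A-Z][a-z]+(?: [A-Z][a-z]+)+)"),
]


def candidate_seeds(sentence):
    """Return candidate seed samples (entity 1, entity 2, text) found by the templates."""
    seeds = []
    for pattern in FOUNDER_TEMPLATES:
        for match in pattern.finditer(sentence):
            seeds.append((match.group("person"), match.group("org"), sentence))
    return seeds
```

Strict patterns like these miss many phrasings, which is exactly the low-recall behaviour that the later expansion step is meant to compensate for.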
Training the word vector model: the word vector approach was proposed by Hinton in 1986. It represents a word as a low-dimensional real-valued vector, for example [0.179, -0.157, -0.117, 0.909, -0.532, ...]; this form is the word vector. In the word vector space, two points whose vectors form a small angle represent words that are semantically similar or related. Word vectors obtained with a better training algorithm reflect the semantic similarity between words more faithfully.
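The similarity similarity(X, Y) of word X and word Y is computed as the cosine of the angle between their word vectors:
$$\mathrm{similarity}(X, Y) = \cos\theta = \frac{X \cdot Y}{\lVert X \rVert \, \lVert Y \rVert}$$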
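The cosine computation can be written in a few lines of plain Python. This is a minimal sketch; returning 0.0 for a zero-length vector is an assumption made here to avoid division by zero and is not stated in the application.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length real-valued vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        # Assumption: treat a zero vector as unrelated to everything.
        return 0.0
    return dot / (norm_x * norm_y)
```

Vectors pointing in the same direction score 1, orthogonal vectors score 0, and opposite vectors score -1.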
This embodiment trains the word vectors with the gensim tool. The corpus used is a news corpus covering all domains, and the vectors have 128 dimensions.
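Training with gensim might look like the sketch below. The application names only gensim and the 128 dimensions; the use of the `Word2Vec` class, the remaining hyperparameter values, and the function name `train_word_vectors` are assumptions. The parameter name `vector_size` is the gensim 4.x spelling (gensim 3.x called it `size`).

```python
# Only the 128 dimensions come from the text; the other values are illustrative.
W2V_PARAMS = {"vector_size": 128, "window": 5, "min_count": 5, "workers": 4}


def train_word_vectors(tokenized_sentences, params=None):
    """Train a word2vec model on an iterable of token lists and return its vectors.

    tokenized_sentences: e.g. [["word", "segmented", "sentence"], ...],
    produced by running word segmentation over the background corpus.
    """
    # Third-party dependency, imported lazily so the module loads without it.
    from gensim.models import Word2Vec

    model = Word2Vec(sentences=tokenized_sentences, **(params or W2V_PARAMS))
    # model.wv maps each vocabulary word to its trained vector.
    return model.wv
```

With a small test corpus, `min_count` would need to be lowered (for example to 1), since words seen fewer than `min_count` times are dropped from the vocabulary.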
Sample contextual feature generation: a sample is a triple (entity 1, entity 2, text content). For a given sample, word segmentation, part-of-speech tagging, and named entity recognition are applied to the text content, giving a result of the form [w0/tag0, w1/tag1, ..., wi-1/tagi-1, entity 1, wi+1/tagi+1, ..., wj-1/tagj-1, entity 2, wj+1/tagj+1, ..., wk/tagk]. Part-of-speech filtering then retains only verbs and nouns. The context [left1, right1, left2, right2] is obtained according to the context window (a, b, c, d), where left1, right1, left2, and right2 are, respectively, the a words to the left of entity 1, the b words to the right of entity 1, the c words to the left of entity 2, and the d words to the right of entity 2. If the actual number of words is smaller than the window size, all of the words are taken. Finally, the trained word vector model gives the vector representation of the contextual features: [[vj-a, ..., vj-1], [vj+1, ..., vj+b], [vk-c, ..., vk-1], [vk+1, ..., vk+d]].
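The window extraction can be sketched as follows, assuming the text has already been segmented, tagged, and had its two entities located. The tag names "n" and "v", the function name `context_windows`, and the choice to let a window run past the other entity (while excluding the entity tokens themselves) are assumptions; the application does not settle these details.

```python
def context_windows(tagged, e1_idx, e2_idx, window=(2, 2, 2, 2),
                    keep_tags=("n", "v")):
    """Return [left1, right1, left2, right2] as lists of words.

    tagged: list of (word, tag) pairs for one sentence.
    e1_idx, e2_idx: positions of entity 1 and entity 2 in `tagged`.
    window: (a, b, c, d) sizes of the four context windows.
    keep_tags: part-of-speech tags retained by the filter (hypothetical names).
    """
    a, b, c, d = window

    def keep(i):
        # Drop the entity tokens themselves and every word that is
        # neither a noun nor a verb.
        return i not in (e1_idx, e2_idx) and tagged[i][1] in keep_tags

    def left(idx, size):
        words = [tagged[i][0] for i in range(idx) if keep(i)]
        # If fewer than `size` words exist, all of them are taken.
        return words[-size:] if size > 0 else []

    def right(idx, size):
        words = [tagged[i][0] for i in range(idx + 1, len(tagged)) if keep(i)]
        return words[:size]

    return [left(e1_idx, a), right(e1_idx, b), left(e2_idx, c), right(e2_idx, d)]
```

Each returned word would then be looked up in the trained word vector model to obtain the vector form of the contextual features.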
Sample similarity calculation: contextual features are generated for the candidate sample, and its similarity to each seed sample is computed in turn. Given the input candidate sample features [[wj-a, ..., wj-1], [wj+1, ..., wj+b], [wk-c, ..., wk-1], [wk+1, ..., wk+d]], the seed sample features [[vj-a, ..., vj-1], [vj+1, ..., vj+b], [vk-c, ..., vk-1], [vk+1, ..., vk+d]], and the weight vectors [[f1, ..., fa], [fa+1, ..., fa+b], [fa+b+1, ..., fa+b+c], [fa+b+c+1, ..., fa+b+c+d]], the similarity is computed with the following formula:
The actual lengths of the two feature vector windows are not necessarily the same. The part common to both is used when computing the numerator, and the actual size of the seed sample's feature vector window is used when computing the denominator.
It should be pointed out that the similarity here is not symmetric: it is the similarity of the candidate sample relative to the seed sample.
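Because the formula itself is not reproduced in this text, the sketch below is only one possible reading of the surrounding description, not the application's formula. It assumes the numerator is a weighted sum of word-to-word cosine similarities over the positions both windows share, and the denominator is the sum of the weights over the seed sample's actual window sizes. The names `cosine` and `context_similarity`, and the convention that each window is ordered from the word nearest the entity outward, are hypothetical.

```python
import math


def cosine(x, y):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny) if nx and ny else 0.0


def context_similarity(candidate, seed, weights):
    """Similarity of a candidate sample relative to a seed sample (assumed form).

    candidate, seed: four windows each, every window a list of word vectors
        ordered from the word nearest the entity outward.
    weights: four lists of floats, one weight per window position, each at
        least as long as the corresponding seed window.
    """
    numerator = 0.0
    denominator = 0.0
    for cand_win, seed_win, w_win in zip(candidate, seed, weights):
        # Numerator: only the positions present in both windows contribute.
        common = min(len(cand_win), len(seed_win))
        numerator += sum(w_win[i] * cosine(cand_win[i], seed_win[i])
                         for i in range(common))
        # Denominator: weights over the seed window's actual size.
        denominator += sum(w_win[:len(seed_win)])
    return numerator / denominator if denominator else 0.0
```

Under this reading the asymmetry noted above follows directly: swapping the two arguments leaves the numerator unchanged but changes the denominator whenever the window sizes differ.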
Seed sample expansion: for the input corpus, every document in it is traversed and split into sentences at major punctuation marks (full stop, question mark, and so on).
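Sentence splitting at major punctuation can be sketched with a regular expression. The exact set of marks is an assumption (Chinese and ASCII full stops, question marks, and exclamation marks), as is the function name `split_sentences`; a plain ASCII full stop will also split decimals and abbreviations, so a production splitter would need more care.

```python
import re

# One or more non-terminator characters followed by any run of terminators.
_SENTENCE = re.compile(r"[^。！？!?.]+[。！？!?.]*")


def split_sentences(document):
    """Split a document into sentences at major punctuation marks."""
    parts = _SENTENCE.findall(document)
    return [p.strip() for p in parts if p.strip()]
```

Each resulting sentence is then passed to named entity recognition to look for a pair of entities of the specified types.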
For each sentence, named entity recognition is performed first. If the sentence contains two entities of the specified types, a candidate sample (entity 1, entity 2, text content) is constructed; otherwise, processing moves on to the next sentence.
The contextual features of the candidate sample are constructed, the similarity between the candidate sample and each sample in the seed sample library is computed, and the samples whose similarity exceeds a given threshold are counted. If that count exceeds a given threshold (for example, 10% of the current number of seed samples), the candidate sample is added to the sample library; otherwise, processing moves on to the next candidate.
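The expansion loop can be sketched as below, with the similarity function passed in as a parameter. The 10% ratio is the example given in the text; the similarity threshold of 0.6, the function name `expand_seed_library`, and the choice to let newly added samples count as seeds for later candidates are assumptions.

```python
def expand_seed_library(candidates, seeds, similarity,
                        sim_threshold=0.6, ratio=0.10):
    """Return the seed library extended with the accepted candidate samples.

    candidates: candidate samples, in the order they are encountered.
    seeds: the initial seed samples.
    similarity: function (candidate, seed) -> float, similarity of the
        candidate relative to the seed.
    sim_threshold: similarity threshold (illustrative value).
    ratio: fraction of the current library size the match count must exceed.
    """
    library = list(seeds)
    for cand in candidates:
        # Count the library samples that are similar enough to the candidate.
        matches = sum(1 for s in library if similarity(cand, s) > sim_threshold)
        # Accept when the count exceeds the given share of the current library.
        if matches > ratio * len(library):
            library.append(cand)
    return library
```

Because the acceptance bar scales with the library size, a candidate must resemble a growing number of samples as the library expands.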
As can be seen from the above description, this application achieves the following technical effects. Entity pairs with the same entity relation tend to have similar contexts, so expanding the sample library on the basis of sample context similarity effectively improves the recall of relation extraction. The word vector model is trained on a large-scale general corpus, and computing context similarity on the basis of word vectors markedly improves generalization.
Obviously, those skilled in the art should understand that the modules or steps of this application described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device; alternatively, they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The application is therefore not limited to any specific combination of hardware and software.
The above are only preferred embodiments of this application and are not intended to limit it. For those skilled in the art, the application may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within its scope of protection.

Claims (10)

1. An entity relation extraction method for text processing, characterized by including:
Inputting a text to be processed;
Identifying the entities in the text to be processed, where the text to be processed contains multiple entities;
Screening the entities according to a preset sample to obtain the contextual features of an input instance;
Computing, from the contextual features, the context similarity between the input instance and each seed sample in a seed sample library;
Judging whether the context similarity exceeds a first preset threshold;
If the similarity exceeds the first preset threshold, counting the number of seed samples whose similarity exceeds the first preset threshold;
Judging whether the number of seed samples whose similarity exceeds the first preset threshold exceeds a second preset threshold;
If that number exceeds the second preset threshold, taking the input instance as an entity relation instance obtained by the text processing.
2. The entity relation extraction method according to claim 1, characterized in that, before the entity relation extraction method starts, the method includes:
Training a word vector model, specifically: training on a background corpus with the gensim tool to obtain the word vector model.
3. The entity relation extraction method according to claim 1, characterized in that identifying the entities in the text to be processed includes:
Obtaining the entities in the text to be processed with a named entity recognition method.
4. The entity relation extraction method according to claim 1, characterized in that screening the entities according to the preset sample to obtain the contextual features of the input instance includes:
Segmenting the text to be processed into words;
Tagging the segmentation result with parts of speech;
Filtering the part-of-speech tagging result to obtain candidate words;
Obtaining the target words among the candidate words with a context window;
Forming the contextual features of the input instance from the target words.
5. The entity relation extraction method according to claim 1, characterized in that computing the context similarity between the input instance and each seed sample from the contextual features includes:
Substituting the contextual features into a preset formula to obtain the context similarity;
The preset formula is:
where similarity denotes the context similarity.
6. An entity relation extraction device for text processing, characterized by including:
An input module, which inputs a text to be processed;
An identification module, which identifies the entities in the text to be processed, where the text to be processed contains multiple entities;
A screening module, which screens the entities according to a preset sample to obtain the contextual features of an input instance;
A computing module, which computes, from the contextual features, the context similarity between the input instance and each seed sample;
A first judgment module, which judges whether the context similarity exceeds a first preset threshold;
A statistics module, which, if the similarity exceeds the first preset threshold, counts the number of seed samples whose similarity exceeds the first preset threshold;
A second judgment module, for judging whether the number of seed samples whose similarity exceeds the first preset threshold exceeds a second preset threshold;
A termination module, which, if that number exceeds the second preset threshold, takes the input instance as an entity relation instance obtained by the text processing.
7. The entity relation extraction device according to claim 6, characterized in that the entity relation extraction device also includes: a training module for training a word vector model, specifically: training on a background corpus with the gensim tool to obtain the word vector model.
8. The entity relation extraction device according to claim 6, characterized in that the identification module includes:
An entity acquisition module, which obtains the entities in the text to be processed with a named entity recognition method.
9. The entity relation extraction device according to claim 6, characterized in that the screening module includes:
A word segmentation module, for segmenting the text to be processed into words;
A tagging module, which tags the segmentation result with parts of speech;
A filtering module, which filters the part-of-speech tagging result to obtain candidate words;
A target word acquisition module, for obtaining the target words among the candidate words with a context window;
A contextual feature generation module, for forming the contextual features of the input instance from the target words.
10. The entity relation extraction device according to claim 6, characterized in that the computing module includes:
A substitution module, for substituting the contextual features into the preset formula to obtain the context similarity.
CN201810348221.6A 2018-04-18 2018-04-18 Entity relation extraction method and device for text processing Active CN108763192B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810348221.6A CN108763192B (en) 2018-04-18 2018-04-18 Entity relation extraction method and device for text processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810348221.6A CN108763192B (en) 2018-04-18 2018-04-18 Entity relation extraction method and device for text processing

Publications (2)

Publication Number Publication Date
CN108763192A (en) 2018-11-06
CN108763192B (en) 2022-04-19

Family

ID=64011106

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810348221.6A Active CN108763192B (en) 2018-04-18 2018-04-18 Entity relation extraction method and device for text processing

Country Status (1)

Country Link
CN (1) CN108763192B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522399A (en) * 2018-11-20 2019-03-26 北京京东尚科信息技术有限公司 Method and apparatus for generating information
CN110909116A (en) * 2019-11-28 2020-03-24 中国人民解放军军事科学院军事科学信息研究中心 Entity set expansion method and system for social media
CN111488467A (en) * 2020-04-30 2020-08-04 北京建筑大学 Construction method and device of geographical knowledge graph, storage medium and computer equipment
CN113538075A (en) * 2020-04-14 2021-10-22 阿里巴巴集团控股有限公司 Data processing method, model training method, device and equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170032781A1 (en) * 2015-07-28 2017-02-02 Google Inc. Collaborative language model biasing
CN107203511A (en) * 2017-05-27 2017-09-26 中国矿业大学 Network text named entity recognition method based on neural network probability disambiguation
CN107463607A (en) * 2017-06-23 2017-12-12 昆明理工大学 Method for acquiring and organizing domain entity hypernym-hyponym relations by combining word vectors and bootstrapping learning
CN107784125A (en) * 2017-11-24 2018-03-09 中国银行股份有限公司 Entity relation extraction method and device
CN107861939A (en) * 2017-09-30 2018-03-30 昆明理工大学 Domain entity disambiguation method fusing word vectors and topic models
US10394886B2 (en) * 2015-12-04 2019-08-27 Sony Corporation Electronic device, computer-implemented method and computer program

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170032781A1 (en) * 2015-07-28 2017-02-02 Google Inc. Collaborative language model biasing
US10394886B2 (en) * 2015-12-04 2019-08-27 Sony Corporation Electronic device, computer-implemented method and computer program
CN107203511A (en) * 2017-05-27 2017-09-26 中国矿业大学 Network text named entity recognition method based on neural network probability disambiguation
CN107463607A (en) * 2017-06-23 2017-12-12 昆明理工大学 Method for acquiring and organizing domain entity hypernym-hyponym relations by combining word vectors and bootstrapping learning
CN107861939A (en) * 2017-09-30 2018-03-30 昆明理工大学 Domain entity disambiguation method fusing word vectors and topic models
CN107784125A (en) * 2017-11-24 2018-03-09 中国银行股份有限公司 Entity relation extraction method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LISHUANG LI et al.: "A distributed meta-learning system for Chinese entity relation extraction", NEUROCOMPUTING *
黄勋 et al.: "关系抽取技术研究综述" (A survey of relation extraction techniques), 《现代图书情报技术》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522399A (en) * 2018-11-20 2019-03-26 北京京东尚科信息技术有限公司 Method and apparatus for generating information
CN109522399B (en) * 2018-11-20 2022-08-12 北京京东尚科信息技术有限公司 Method and apparatus for generating information
CN110909116A (en) * 2019-11-28 2020-03-24 中国人民解放军军事科学院军事科学信息研究中心 Entity set expansion method and system for social media
CN110909116B (en) * 2019-11-28 2022-12-23 中国人民解放军军事科学院军事科学信息研究中心 Entity set expansion method and system for social media
CN113538075A (en) * 2020-04-14 2021-10-22 阿里巴巴集团控股有限公司 Data processing method, model training method, device and equipment
CN111488467A (en) * 2020-04-30 2020-08-04 北京建筑大学 Construction method and device of geographical knowledge graph, storage medium and computer equipment
CN111488467B (en) * 2020-04-30 2022-04-05 北京建筑大学 Construction method and device of geographical knowledge graph, storage medium and computer equipment

Also Published As

Publication number Publication date
CN108763192B (en) 2022-04-19

Similar Documents

Publication Publication Date Title
CN104199972B (en) Named entity relation extraction and construction method based on deep learning
CN108763192A (en) Entity relation extraction method and device for text-processing
CN108334495A (en) Short text similarity calculating method and system
CN104881458B (en) Method and device for labeling web page topics
CN108182976A (en) Clinical medicine information extraction method based on neural network
CN104035975B (en) Method for distantly supervised extraction of person relations using Chinese online resources
CN106909537B (en) Polysemous word analysis method based on topic model and vector space
CN109871955A (en) Method for extracting causal relations of aviation safety accidents
CN110175246A (en) Method for extracting notional words from video captions
EP3483747A1 (en) Preserving and processing ambiguity in natural language
CN112101031B (en) Entity identification method, terminal equipment and storage medium
CN110929520B (en) Unnamed entity object extraction method and device, electronic equipment and storage medium
CN111475622A (en) Text classification method, device, terminal and storage medium
CN112395395A (en) Text keyword extraction method, device, equipment and storage medium
Ren et al. Detecting the scope of negation and speculation in biomedical texts by using recursive neural network
CN113051914A (en) Enterprise hidden label extraction method and device based on multi-feature dynamic portrait
CN105760524A (en) Multi-level and multi-class classification method for science news headlines
CN104537280B (en) Protein interaction relation recognition method based on text relation similarity
CN111177375A (en) Electronic document classification method and device
CN115600605A (en) Method, system, equipment and storage medium for jointly extracting Chinese entity relationship
CN111428502A (en) Named entity labeling method for military corpus
CN111161861A (en) Short text data processing method and device for hospital logistics operation and maintenance
CN111368532B (en) Topic word embedding disambiguation method and system based on LDA
CN113076391A (en) Remote supervision relation extraction method based on multi-layer attention mechanism
CN113191118A (en) Text relation extraction method based on sequence labeling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Room 501, 502, 503, No. 66 Boxia Road, China (Shanghai) Pilot Free Trade Zone, Pudong New Area, Shanghai 201203

Patentee after: Daguan Data Co.,Ltd.

Address before: Room 515, Building Y1, No. 112, Liangxiu Road, Pudong New Area, Shanghai 201203

Patentee before: DATAGRAND INFORMATION TECHNOLOGY (SHANGHAI) Co.,Ltd.