CN109472033A - Entity relation extraction method and system in text, storage medium, and electronic device
- Publication number: CN109472033A (application CN201811376209.2A)
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The present invention relates to a method and system for extracting entity relations from text, a storage medium, and an electronic device. The method includes the following steps: obtaining an entity relation triple set, an entity and entity attribute set, and a concept set; obtaining a set of triples, each consisting of a sentence of a training text set and the two entities recognized in that sentence; performing distant supervision labeling to obtain relation sets that each include a sentence of the training text set, the two entities recognized in the sentence, the concepts corresponding to the two entities, and the relation between the two entities; inputting sentence vectors into an entity relation extraction model and training the model; and obtaining, for each sentence, a relation set that includes two entities, the concepts corresponding to the two entities, and the relation between the two entities. The method uses the semantic context information in the text to extract the relations between entities, and solves the wrong-label problem that exists in distant supervision.
Description
Technical field
The present invention relates to the technical field of text processing and information extraction, and in particular to a method and system for extracting entity relations from text, a storage medium, and an electronic device.
Background
In the past, people built large-scale knowledge bases, such as Wikipedia and DBpedia, from knowledge of the real world. These knowledge bases are widely used in fields such as artificial intelligence and natural language processing, for example in question answering systems and information extraction. A knowledge base contains a large number of fact triples; for example, (New York, CityOf, United States) represents the fact that "New York is a city of the United States". However, the facts contained in existing knowledge bases are limited and far from complete, and new facts are produced every day. How to annotate new facts and use them to complete a knowledge base has become an urgent problem. Annotating fact triples manually is time-consuming and laborious, so much current research has shifted to automatically annotating new facts from complex and diverse Internet resources. Within this work, extracting entity relations from large amounts of text is a very important and central task.
Although existing methods for extracting entity relations from text achieve good results with the help of distant supervision, the assumption behind distant supervision leads to wrong labels. The reason is that distant supervision assumes there is only one relation between a pair of entities, and treats every sentence in which the pair occurs as expressing that relation. In fact, when two entities appear in the same sentence, the sentence may not express the relation recorded in the knowledge base; it may express another relation, or merely reflect a common topic. The relation has to be judged from the semantic context of the sentence.
Summary of the invention
On this basis, the object of the present invention is to provide a method for extracting entity relations from text that uses the semantic context information in the text to extract the relations between entities, and thereby fundamentally solves the wrong-label problem that exists in distant supervision.
The present invention is achieved by the following scheme:
A method for extracting entity relations from text, including the following steps:
obtaining an entity relation triple set, an entity and entity attribute set, and a concept set;
obtaining a set of triples, each consisting of a sentence of a training text set and the two entities recognized in that sentence;
performing distant supervision labeling on the set of sentence and entity triples according to the entity relation triple set, the entity and entity attribute set, and the concept set, to obtain relation sets that each include a sentence of the training text set, the two entities recognized in the sentence, the concepts corresponding to the two entities, and the relation between the two entities, and putting the relation sets into a labeled training set;
obtaining, according to the labeled training set, vector representations of the words in the sentences of the training text set;
obtaining a sentence vector for each sentence of the training text set from the vector representations of the words in the sentence;
inputting the sentence vector of each sentence of the training text set into an entity relation extraction model, and training the entity relation extraction model according to the two entities labeled in the sentence, the concepts corresponding to the two labeled entities, and the relation between the two labeled entities;
obtaining a sentence vector for each sentence of a text set to be extracted;
inputting the sentence vector of each sentence of the text set to be extracted into the entity relation extraction model, and obtaining, for each sentence of the text set to be extracted, a relation set that includes two entities, the concepts corresponding to the two entities, and the relation between the two entities.
The entity relation extraction method in text of the present invention uses the concept and scope to which an entity belongs in its context to represent semantic context information, obtains a multi-concept, multi-relation entity relation training set according to the concept and scope, and builds the entity relation extraction model from that training set, thereby fundamentally solving the wrong-label problem that exists in distant supervision.
In one embodiment, performing distant supervision labeling on the set of sentence and entity triples of the training text set, according to the entity relation triple set, the entity and entity attribute set, and the concept set, includes:
performing context identification on the sentences of the training text set to obtain the concept corresponding to each of the two entities recognized in the sentence.
In one embodiment, after context identification is performed on the sentences of the training text set and the concepts corresponding to the two recognized entities are obtained, the method further includes the following steps:
matching the two entities recognized in a sentence of the training text set against the entity relation triple set;
if the matching fails, randomly selecting a relation from the entity relation triple set, generating a relation set that includes the sentence, the two labeled entities, the concepts corresponding to the two labeled entities, and the randomly selected relation, and putting this record into the labeled training set as a negative sample.
In one embodiment, the method further includes the following steps:
if the matching succeeds, generating a relation set that includes the sentence, the two labeled entities, the concepts corresponding to the two labeled entities, and the matched relation, and computing a confidence score for the matched relation; if the score exceeds a first set threshold, the record is put into the labeled training set as a positive sample, and if the score is below the first set threshold, the record is put into the labeled training set as a negative sample.
In one embodiment, computing a confidence score for the matched relation includes:
obtaining the degree of relatedness between the matched relation and the context of the sentence from the proportion of times the relation and the context of the sentence occur together in the corpus; the higher the relatedness, the higher the confidence score.
In one embodiment, the method further includes the following steps:
obtaining, from the generated relation sets, the multiple relation sets whose concepts are identical;
determining the degree of relatedness between each relation in the multiple relation sets and the context of the sentence;
and substituting the relation with the highest relatedness into the multiple relation sets as the new relation.
In one embodiment, after the relation with the highest relatedness is substituted into the multiple relation sets as the new relation, the method further includes the following steps:
deleting the multiple relation sets from the labeled training set;
putting the multiple relation sets that include the new relation into the labeled training set.
Further, the present invention also provides a system for extracting entity relations from text, including:
a first obtaining module, configured to obtain an entity relation triple set, an entity and entity attribute set, and a concept set;
a second obtaining module, configured to obtain a set of triples, each consisting of a sentence of a training text set and the two entities recognized in that sentence;
a distant supervision labeling module, configured to perform distant supervision labeling on the set of sentence and entity triples according to the entity relation triple set, the entity and entity attribute set, and the concept set, to obtain relation sets that each include a sentence of the training text set, the two entities recognized in the sentence, the concepts corresponding to the two entities, and the relation between the two entities, and to put the relation sets into a labeled training set;
an input representation module, configured to obtain, according to the labeled training set, vector representations of the words in the sentences of the training text set;
a first sentence representation module, configured to obtain a sentence vector for each sentence of the training text set from the vector representations of the words in the sentence;
an entity relation extraction model training module, configured to input the sentence vector of each sentence of the training text set into an entity relation extraction model, and to train the entity relation extraction model according to the two entities labeled in the sentence, the concepts corresponding to the two labeled entities, and the relation between the two labeled entities;
a second sentence representation module, configured to obtain a sentence vector for each sentence of a text set to be extracted;
an entity relation extraction module, configured to input the sentence vector of each sentence of the text set to be extracted into the entity relation extraction model, and to obtain, for each sentence of the text set to be extracted, a relation set that includes two entities, the concepts corresponding to the two entities, and the relation between the two entities.
The entity relation extraction method in text of the present invention uses the concept and scope to which an entity belongs in its context to represent semantic context information, obtains a multi-concept, multi-relation entity relation training set according to the concept and scope, and builds the entity relation extraction model from that training set, thereby fundamentally solving the wrong-label problem that exists in distant supervision.
Further, the present invention also provides a computer-readable medium on which a computer program is stored; when the computer program is executed by a processor, it implements the entity relation extraction method in text described in any of the above embodiments.
Further, the present invention also provides an electronic device, including a memory, a processor, and a computer program that is stored in the memory and executable by the processor; when the processor executes the computer program, it implements the entity relation extraction method in text described in any of the above embodiments.
For better understanding and implementation, the present invention is described in detail below with reference to the accompanying drawings.
Brief description of the drawings
Fig. 1 is a flowchart of the entity relation extraction method in text according to an embodiment;
Fig. 2 is a flowchart of relation matching on the training text set according to an embodiment;
Fig. 3 is a flowchart of distant supervision labeling according to an embodiment;
Fig. 4 is a flowchart of correcting the labeling results according to an embodiment;
Fig. 5 is a schematic diagram of the entity relation extraction model according to an embodiment;
Fig. 6 is a structural diagram of the entity relation extraction system in text according to an embodiment;
Fig. 7 is a structural diagram of the electronic device according to an embodiment.
Detailed description of the embodiments
Referring to Fig. 1, in one embodiment, the entity relation extraction method in text of the present invention includes the following steps:
Step S101: obtain an entity relation triple set, an entity and entity attribute set, and a concept set.
This embodiment selects Freebase as the base knowledge base. Freebase is a large-scale knowledge graph that contains more than 7,300 kinds of relations and more than 900 million entities. The Resource Description Framework (RDF) triples (entity 1, relation, entity 2) in Freebase are organized and stored in a computer as the entity relation triple set of this embodiment, denoted R, which contains triples such as (New York, CityOf, United States). In addition, the entities in Freebase and their attribute information are organized and stored in a computer as the entity and entity attribute set of this embodiment, denoted E; each entity may have zero or more attributes.
The scheme of this embodiment involves building and using a multi-concept knowledge base, so a concept dictionary must be prepared. A concept here is the conceptual domain to which an entity is judged to belong according to its context. The knowledge graph Probase contains millions of concepts, so it is used as the data source for the concept dictionary in this embodiment. The entities involved in each relation in the relation set R, together with their corresponding concepts, are organized and saved in a computer as the set of entities and their possible concepts, denoted C. An entity may have one or more concepts, for example (IBM, Company; Corporation; Client; Organization; Vendor; Supplier; …).
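As a minimal sketch of how the three resources R, E and C might be held in memory, the Python fragment below uses plain sets and dictionaries. The container layout and the helper `index_relations` are assumptions made for illustration; the patent only states that the sets are organized and stored in a computer. The toy entries come from the examples in the text.

```python
from collections import defaultdict

# R: entity relation triples (entity 1, relation, entity 2).
R = {("New York", "CityOf", "United States")}

# E: entity -> attributes (zero or more per entity). Left empty in this toy example.
E = {"New York": {}, "United States": {}, "IBM": {}}

# C: entity -> list of possible concepts (one or more per entity).
C = {"IBM": ["Company", "Corporation", "Client",
             "Organization", "Vendor", "Supplier"]}


def index_relations(triples):
    """Hypothetical helper: index R by entity pair for fast lookup."""
    index = defaultdict(set)
    for e1, relation, e2 in triples:
        index[(e1, e2)].add(relation)
    return index


relation_index = index_relations(R)
```

Indexing R by entity pair is a design choice that keeps the later matching step close to a dictionary lookup instead of a scan over all triples.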
Step S102: obtain the set of triples formed by the sentences of the training text set and the two entities recognized in each sentence.
This embodiment uses the New York Times text set as the training text set. For each news document d in the training text set D, the beginning and end of each sentence s are identified by punctuation marks, and the document is divided into sentences. To carry out the entity relation extraction task, the entities in s must also be recognized; the scheme of the present invention uses the existing natural language processing tool StanfordNLP for named entity recognition. If the number of entities recognized in s is not equal to 2, or a recognized entity is not in the set E, the sentence is considered invalid and discarded. Each sentence s that meets the conditions, together with the two recognized entities e1 and e2, is recorded as a triple (s, e1, e2) and stored in a computer, forming the set of sentence and entity triples, denoted SE, which may contain, for example, (New York is the most populous city in the United States, New York, United States).
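The filtering rule of step S102 can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: `split_sentences` uses a simple punctuation regex, and `toy_recognize` is a hypothetical lexicon lookup that stands in for the StanfordNLP named entity recognizer, whose API is not reproduced here.

```python
import re


def split_sentences(document):
    """Split a document into sentences at ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]


def toy_recognize(sentence, lexicon=("New York", "United States", "Paris")):
    """Hypothetical stand-in for NER: return lexicon names in order of appearance."""
    hits = [(sentence.find(name), name) for name in lexicon if name in sentence]
    return [name for _, name in sorted(hits)]


def build_se(documents, E, recognize):
    """Keep (s, e1, e2) only when exactly two entities are found and both are in E."""
    se = []
    for document in documents:
        for s in split_sentences(document):
            entities = recognize(s)
            if len(entities) != 2 or any(e not in E for e in entities):
                continue  # invalid sentence, discarded
            se.append((s, entities[0], entities[1]))
    return se


E = {"New York", "United States"}
docs = ["New York is the most populous city in the United States. "
        "Paris is far from New York."]
SE = build_se(docs, E, toy_recognize)
```

In this toy run the second sentence is dropped because "Paris" is not in E, even though two entities were recognized.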
Step S103: perform distant supervision labeling on the set of sentence and entity triples of the training text set according to the entity relation triple set, the entity and entity attribute set, and the concept set, to obtain relation sets that each include a sentence of the training text set, the two entities recognized in the sentence, the concepts corresponding to the two entities, and the relation between the two entities, and put the relation sets into the labeled training set.
The input of distant supervision includes the entity relation triple set R, the entity and entity attribute set E, and the concept set C. The set SE of sentence and entity triples of the training text set passes through three operations in sequence (concept identification, distant supervision, and relation confidence scoring) to complete the labeling. The result is a record (s, (e1, c1, r1, c2, e2)) that includes the sentence s of the training text set, the two entities e1 and e2 recognized in the sentence, the concepts c1 and c2 corresponding to the two entities, and the relation r1 between the two entities; the record is put into the labeled training set. The five-tuple relation (e1, c1, r1, c2, e2) is also obtained and put into the knowledge base KB to be built.
Step S104: obtain, according to the labeled training set, vector representations of the words in the sentences of the training text set.
In this step, the input includes the labeled training set T_train and the Wikipedia text corpus, and the output is the vector representation of each word.
To represent each word that occurs in the labeled training set T_train, two operations are needed: 1) represent each word with a word vector, and 2) strengthen the word vector with the positional relation between the word and the two entities in the sentence. To compute word vectors, the vocabulary must first be determined. In the scheme of the present invention, the words that occur 100 times or more in Wikipedia are kept to form the vocabulary. The open-source word2vec tool is then trained on the context information in the Wikipedia text corpus to obtain a word vector for each word, and the result is stored in a computer, denoted W; W is a set that maps each word to its word vector. The dimension of the word vectors and the size of the context window can be configured; to ensure computational efficiency, this embodiment sets the dimension to 50 and the window size to 3. Suppose there is a training sample (s, (e1, c1, r1, c2, e2)) whose sentence s contains n words, that is, s = {w1, w2, …, wn}, two of which correspond to the entities e1 and e2. The word vector v of each word is first obtained by looking it up in W; the distances dist1 and dist2 from each word to the entities e1 and e2 are then recorded and appended to the tail of v, giving a 52-dimensional word vector. Finally, the processed word vector sequence (v1, v2, …, vn) is used as the input for encoding the vector of sentence s.
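The construction of the 52-dimensional input can be sketched as below. Two points are assumptions not fixed by the text: the distance is taken as the signed token offset `i - entity_index`, and words missing from W fall back to a zero vector. The function name `position_features` is hypothetical.

```python
def position_features(tokens, e1_index, e2_index, W, dim=50):
    """Append the offsets to e1 and e2 to each word vector (dim + 2 values per word)."""
    unknown = [0.0] * dim  # assumed fallback for out-of-vocabulary words
    rows = []
    for i, word in enumerate(tokens):
        v = list(W.get(word, unknown))
        dist1 = float(i - e1_index)  # assumed: signed offset to entity e1
        dist2 = float(i - e2_index)  # assumed: signed offset to entity e2
        rows.append(v + [dist1, dist2])
    return rows


# Toy lookup table standing in for the word2vec result W.
W = {"IBM": [0.1] * 50, "hired": [0.2] * 50}
tokens = ["IBM", "hired", "Alice"]
rows = position_features(tokens, e1_index=0, e2_index=2, W=W)
```

Each row then has 50 + 2 = 52 values, matching the dimension stated in the text.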
Step S105: obtain a sentence vector for each sentence of the training text set from the vector representations of the words in the sentence.
In this step, the input is the word vectors of the words of each sentence in the samples of the labeled training set T_train, and the output is the sentence vector of each sentence.
Because every word in a sentence may carry feature information that matters for the entity relation extraction task, the feature information of all words in the sentence must be combined to represent the sentence, in preparation for extracting the relation between entities from the sentence. The word vector of each word has been obtained in step S104; the features in the multiple word vectors of a sentence now have to be extracted. There are many ways to extract features; the scheme of the present invention uses a convolutional neural network (CNN) model. Specifically, it uses the piecewise convolutional neural network model (PCNN), which can make effective use of the position information of the two entities in the sentence. The PCNN process consists of three main steps: 1) convolution, for which the stride and filter size must be set; 2) max pooling, in which the sentence is divided into three segments according to the positions of the two entities and max pooling is applied to each segment separately; 3) nonlinear activation and output. After these operations each input sentence is represented as a vector. The dimension of the vector can be configured freely; following the recommendation of earlier schemes, it can be set to 200.
Step S106: input the sentence vector of each sentence of the training text set into the entity relation extraction model, and train the entity relation extraction model according to the two entities labeled in the sentence, the concepts corresponding to the two labeled entities, and the relation between the two labeled entities.
Once every sentence in the labeled training set has been represented as a vector, the vectors can be used as the input of the entity relation extraction model M, and the parameters of the neural network model M are trained according to the three items labeled in each training sample: the entities, the concepts corresponding to the entities, and the relation between the entities.
Step S107: obtain a sentence vector for each sentence of the text set to be extracted.
Step S108: input the sentence vector of each sentence of the text set to be extracted into the entity relation extraction model, and obtain, for each sentence of the text set to be extracted, a relation set that includes two entities, the concepts corresponding to the two entities, and the relation between the two entities.
The entity relation extraction method in text of the present invention uses the concept and scope to which an entity belongs in its context to represent semantic context information, obtains a multi-concept, multi-relation entity relation training set according to the concept and scope, and builds the entity relation extraction model from that training set, thereby fundamentally solving the wrong-label problem that exists in distant supervision.
In one embodiment, performing distant supervision labeling on the set of sentence and entity triples of the training text set, according to the entity relation triple set, the entity and entity attribute set, and the concept set, includes:
performing context identification on the sentences of the training text set to obtain the concept corresponding to each of the two entities recognized in the sentence.
Let se be an element of the set SE, that is, a triple consisting of a sentence in some news document and the two entities in that sentence. First, concept identification is performed from the context for each of the two entities e1 and e2 in the sentence s, giving c1 and c2. Concept identification here is a classification problem and uses the Naive Bayes classification method; all possible concepts of entity e1 and entity e2 can be obtained by querying the set C.
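A minimal sketch of concept identification as Naive Bayes classification follows. The patent names the method but not its features or smoothing, so the bag-of-context-words features, the add-one smoothing, the class name `ConceptClassifier`, and the toy training examples are all assumptions.

```python
import math
from collections import Counter, defaultdict


class ConceptClassifier:
    """Multinomial Naive Bayes over context words (illustrative sketch)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # concept -> word frequencies
        self.concept_counts = Counter()          # concept -> number of examples
        self.vocabulary = set()

    def fit(self, examples):
        for context_words, concept in examples:
            self.concept_counts[concept] += 1
            self.word_counts[concept].update(context_words)
            self.vocabulary.update(context_words)

    def predict(self, context_words, candidates):
        """Pick the best concept among the candidates listed for the entity in C."""
        total = sum(self.concept_counts.values())
        best, best_score = None, -math.inf
        for concept in candidates:
            # Add-one smoothing on the prior and on every word likelihood.
            score = math.log((self.concept_counts[concept] + 1) / (total + len(candidates)))
            denominator = sum(self.word_counts[concept].values()) + len(self.vocabulary) + 1
            for word in context_words:
                score += math.log((self.word_counts[concept][word] + 1) / denominator)
            if score > best_score:
                best, best_score = concept, score
        return best


classifier = ConceptClassifier()
classifier.fit([
    (["shares", "revenue", "ceo"], "Company"),
    (["contract", "invoice", "delivery"], "Supplier"),
])
concept = classifier.predict(["revenue", "ceo"], candidates=["Company", "Supplier"])
```

Restricting `predict` to the candidates from C mirrors the statement that all possible concepts of an entity are obtained by querying C.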
Referring to Fig. 2, in one embodiment, after context identification is performed on the sentences of the training text set and the concepts corresponding to the two recognized entities are obtained, the method further includes the following steps:
Step S201: match the two entities recognized in a sentence of the training text set against the entity relation triple set.
Step S202: if the matching fails, randomly select a relation from the entity relation triple set, generate a relation set that includes the sentence, the two labeled entities, the concepts corresponding to the two labeled entities, and the randomly selected relation, and put this record into the labeled training set as a negative sample.
The relation triples in the entity relation triple set R are searched, using the entities e1 and e2 as keys to match against relation triples (e1, r, e2). If no match is found, the knowledge base is considered to hold no relation between e1 and e2; a relation r_random that exists in the triple set R is taken at random, and the labeled record (s, (e1, c1, r_random, c2, e2)) is generated and put into the labeled training set T_train as a negative sample.
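Steps S201 and S202 can be sketched as a lookup followed by a random draw. The function name `match_or_negative`, the injectable random generator, and the second toy triple are hypothetical; only the (New York, CityOf, United States) triple comes from the text.

```python
import random


def match_or_negative(s, e1, c1, e2, c2, R, rng=random):
    """Return (records, matched): matched relations, or one random-relation negative."""
    matched = sorted(r for (head, r, tail) in R if head == e1 and tail == e2)
    if matched:
        return [(s, (e1, c1, r, c2, e2)) for r in matched], True
    # No triple links e1 and e2: draw any relation that exists in R.
    all_relations = sorted({r for (_, r, _) in R})
    r_random = rng.choice(all_relations)
    return [(s, (e1, c1, r_random, c2, e2))], False


R = {
    ("New York", "CityOf", "United States"),
    ("IBM", "HeadquarteredIn", "Armonk"),  # hypothetical toy triple
}
hit, hit_ok = match_or_negative("s1", "New York", "City", "United States", "Country", R)
miss, miss_ok = match_or_negative("s2", "IBM", "Company", "United States", "Country", R,
                                  rng=random.Random(0))
```

Passing a seeded `random.Random` makes the negative sampling reproducible, which helps when the labeled training set needs to be rebuilt.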
In one embodiment, the method further includes the following steps:
if the matching succeeds, generating a relation set that includes the sentence, the two labeled entities, the concepts corresponding to the two labeled entities, and the matched relation, and computing a confidence score for the matched relation; if the score exceeds a first set threshold, the record is put into the labeled training set as a positive sample, and if the score is below the first set threshold, the record is put into the labeled training set as a negative sample.
Fig. 3 is a flowchart of distant supervision labeling in this embodiment. If a triple (e1, r, e2) is matched, a confidence score is computed for the matched relation r1. The score is based on the degree of relatedness between the relation r1 and the context in the sentence s, calculated from their co-occurrence; the higher the relatedness, the higher the confidence score. When the score exceeds the first set threshold, a five-tuple (e1, c1, r1, c2, e2) is generated, which states that when the concept of entity e1 is c1 and the concept of entity e2 is c2, there is a relation r1 between e1 and e2. The labeled record (s, (e1, c1, r1, c2, e2)) is generated and added to the labeled training set T_train as a positive sample. If the score is below the first set threshold, the labeled record (s, (e1, c1, r_random, c2, e2)) is added to the labeled training set T_train as a negative sample.
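The text does not give a formula for the co-occurrence based score, so the sketch below uses one possible reading: for each context word, take the share of its corpus occurrences that fall in sentences expressing the relation, then average over the context words. The count tables `cooccurrence` and `word_count`, the function names, and the threshold value are all hypothetical.

```python
def confidence(relation, context_words, cooccurrence, word_count):
    """Average share of each context word's occurrences that co-occur with the relation."""
    ratios = [
        cooccurrence.get((relation, word), 0) / word_count[word]
        for word in context_words
        if word_count.get(word, 0) > 0  # skip words never seen in the corpus
    ]
    return sum(ratios) / len(ratios) if ratios else 0.0


def is_positive(score, first_threshold):
    """A record becomes a positive sample only when the score exceeds the threshold."""
    return score > first_threshold


# Hypothetical corpus statistics.
cooccurrence = {("CityOf", "city"): 8, ("CityOf", "populous"): 2}
word_count = {"city": 10, "populous": 10}
score = confidence("CityOf", ["city", "populous", "unseen"], cooccurrence, word_count)
```

Any monotone relatedness measure (for example pointwise mutual information) could replace the averaged ratio without changing the thresholding logic.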
Referring to Fig. 4, in one embodiment the method further includes the following steps:
Step S401: obtain, from the generated relation sets, the multiple relation sets whose concepts are identical.
Step S402: determine the degree of relatedness between each relation in the multiple relation sets and the context of the sentence.
Step S403: substitute the relation with the highest relatedness into the multiple relation sets as the new relation.
After all triples in the set SE of sentence and entity triples of the training text set have been labeled, every labeled relation comes from Freebase. If the facts and relations contained in Freebase are themselves biased, they will introduce errors into later computation, so the positive samples in the labeling results need to be corrected and adjusted to improve the quality of the relation labels between entities. Take a labeled positive sample (s, (e1, c1, r1, c2, e2)). Earlier research assumes that the labeled relation r1 is correct; the scheme of the present invention assumes that the concepts c1 and c2 are labeled correctly, but that the correctness of the relation r1 needs to be verified and corrected. To reduce computational complexity, this embodiment first filters out a candidate relation set for each label. The filtering rule is that the relations of labeled records whose two concepts are respectively identical are added to the candidate relation list R1. For example, for the record (s1, (e3, c1, r2, c2, e4)), the concepts c1 and c2 are respectively identical to those of the record above, so the relation r2 expressed by entities e3 and e4 under concepts c1 and c2 is added to the candidate relation list R1. Next, the best relation has to be identified among the candidates. The degree of relatedness between each relation ri in R1 and the context in the sentence s is calculated, and the relation r_max with the highest relatedness is taken as the optimized relation label. The positive sample record (s, (e1, c1, r1, c2, e2)) is deleted from the labeled data set T, and the optimized record (s, (e1, c1, r_max, c2, e2)) is added as a new positive sample. Finally, the entities e1 and e2 are added or updated in the knowledge base KB to be built, and the five-tuple relation (e1, c1, r_max, c2, e2) is added.
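The correction step can be sketched as a filter over the labeled records followed by an argmax. The relatedness function is passed in as a parameter because the text does not define it; `correct_relation`, the relation names, and the toy scores are hypothetical.

```python
def correct_relation(record, labeled_records, relatedness):
    """Replace r1 by the candidate relation most related to the sentence context."""
    s, (e1, c1, r1, c2, e2) = record
    # Candidate list R1: relations of all records that share the concept pair (c1, c2).
    candidates = {r for _, (_, a, r, b, _) in labeled_records if a == c1 and b == c2}
    candidates.add(r1)
    r_max = max(sorted(candidates), key=lambda r: relatedness(r, s))
    return (s, (e1, c1, r_max, c2, e2))


labeled = [
    ("s0", ("A", "City", "CityOf", "Country", "B")),
    ("s1", ("C", "City", "CapitalOf", "Country", "D")),
    ("s2", ("E", "Person", "BornIn", "City", "F")),
]
toy_scores = {"CityOf": 0.2, "CapitalOf": 0.7, "BornIn": 0.9}
corrected = correct_relation(labeled[0], labeled, lambda r, s: toy_scores[r])
```

In the toy data "BornIn" has the highest score but is never considered, because its concept pair (Person, City) differs from (City, Country); this is the search-space reduction the candidate filter is meant to provide.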
Referring to Fig. 5, which is a schematic diagram of the entity relation extraction model M in an embodiment. In this embodiment, the labeled data set is randomly split in proportion into three parts, T_train (80% of the whole data set), T_valid (10%) and T_test (10%), which are the training set, validation set and test set respectively. The three data sets follow the same data distribution.
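A minimal sketch of the 80/10/10 random split follows. The function name `split_dataset` and the fixed seed are assumptions; integer arithmetic is used for the cut points so that rounding cannot shift a sample between parts.

```python
import random


def split_dataset(samples, seed=0):
    """Shuffle once, then cut into 80% train, 10% validation, remainder test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_valid = len(shuffled) // 10
    t_train = shuffled[:n_train]
    t_valid = shuffled[n_train:n_train + n_valid]
    t_test = shuffled[n_train + n_valid:]
    return t_train, t_valid, t_test


t_train, t_valid, t_test = split_dataset(range(100))
```

Shuffling before cutting is what lets the three parts be treated as draws from the same distribution.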
The parameters of the entity relation extraction model M are of two kinds: hyperparameters and general parameters. Four hyperparameters of the convolutional neural network need initial values: the batch size B = 100, the learning rate of stochastic gradient descent λ = 0.01, the dropout probability of neural network units ρ = 0.5, and the maximum number of times each sample is used n = 10. After the hyperparameters are set, training of the entity relation extraction model M begins. The prepared positive and negative samples are fed into the convolutional neural network in batches. For each sample, the error between the concept recognition result and the labeled concept category, and the error between the entity relation extraction result and the labeled entity relation, are recorded. The combined error of the convolutional neural network is minimized by the stochastic gradient descent algorithm, and the general parameters of model M are continually adjusted and saved. To detect problems with the model parameters in time and to verify the generalization ability of the model, in the scheme of the present invention, after every 5 batches of samples have been computed, the validation set T_valid prepared in advance is used to check whether the parameter settings of the current network model M are reasonable; if they are not, they are adjusted promptly.
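The training schedule described above can be sketched without committing to any deep learning framework. In this sketch, HyperParams, batches, and train are hypothetical names, train_step and validate are caller-supplied callbacks, and reading "maximum number of times each sample is used" as a number of passes over the data is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Sequence, TypeVar

S = TypeVar("S")


@dataclass(frozen=True)
class HyperParams:
    batch_size: int = 100         # B
    learning_rate: float = 0.01   # lambda, step size for stochastic gradient descent
    dropout: float = 0.5          # rho
    max_uses: int = 10            # n, read here as passes over the data (assumption)
    validate_every: int = 5       # batches between validation checks


def batches(samples: Sequence[S], size: int) -> Iterator[List[S]]:
    """Yield consecutive batches of at most `size` samples."""
    for start in range(0, len(samples), size):
        yield list(samples[start:start + size])


def train(
    samples: Sequence[S],
    hp: HyperParams,
    train_step: Callable[[List[S], HyperParams], None],
    validate: Callable[[], None],
) -> int:
    """Run hp.max_uses passes; return how many validation checks were made."""
    checks = 0
    seen_batches = 0
    for _ in range(hp.max_uses):
        for batch in batches(samples, hp.batch_size):
            train_step(batch, hp)  # hypothetical: one update on the combined error
            seen_batches += 1
            if seen_batches % hp.validate_every == 0:
                validate()         # check the current parameters on T_valid
                checks += 1
    return checks
```

With 1000 samples and the defaults above, each pass has 10 batches, so validation runs twice per pass.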
After training of the entity relation extraction model M is complete, the present invention uses two public test data sets: 1) the SemEval-2010 Task 8 data set, which contains 9 bidirectional relations and one undirected "other" relation, with 10,717 labeled samples in total; 2) the NYT10 data set, which contains 53 relations in total, one of which, "NA", indicates that two entities have no relation, with 20,202 labeled samples in total. The entity relation extraction model M is applied to each of the two data sets, and precision, recall, and F1 are computed.
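As a reminder of how these three metrics relate, the sketch below computes micro-averaged precision, recall, and F1 over predicted relation labels. Excluding the "NA" label from the counts is a common convention assumed here, not something the patent specifies, and precision_recall_f1 is an illustrative name.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    gold: Sequence[str],
    predicted: Sequence[str],
    negative_label: str = "NA",
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over non-negative relations."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    # A prediction counts as correct only if it names a real relation and matches.
    correct = sum(
        1 for g, p in zip(gold, predicted) if g == p and p != negative_label
    )
    n_predicted = sum(1 for p in predicted if p != negative_label)
    n_gold = sum(1 for g in gold if g != negative_label)
    precision = correct / n_predicted if n_predicted else 0.0
    recall = correct / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

F1 is the harmonic mean of precision and recall, so it is high only when both are high.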
The entity relation extraction method in text proposed by the present invention uses a multi-concept, multi-relation knowledge base, which can fundamentally reduce and resolve the problem of erroneous labels in the knowledge base. At the same time, the method can make effective use of the concept information of entities and, combined with the context in which the entities appear, exclude noise relations before relation extraction is carried out, which reduces the search space of relation extraction and improves its speed and precision.
Referring to Fig. 6, which is a schematic structural diagram of the entity relation extraction system in text in one embodiment of the present invention, the entity relation extraction system 600 in text includes:
A first acquisition module 601, configured to obtain an entity triple relation set, an entity and entity attribute set, and a concept set.
A second acquisition module 602, configured to obtain the sentences of the training text set and the triple relation set of the two entities recognized in each sentence.
A distant supervision labeling module 603, configured to perform distant supervision labeling on the sentences of the training text set and the triple relation sets of the two entities recognized in the sentences, according to the entity triple relation set, the entity and entity attribute set, and the concept set, to obtain relation sets each comprising a sentence of the training text set, the two entities recognized in the sentence, the concepts respectively corresponding to the two entities, and the relation between the two entities, and to put the relation sets into the labeled training set.
An input representation module 604, configured to obtain, according to the labeled training set, vector representations of the words in the sentences of the training text set.
A first sentence representation module 605, configured to obtain a sentence vector for each sentence of the training text set according to the vector representations of the words in the sentence;
An entity relation extraction model training module 606, configured to input the sentence vector of each sentence of the training text set into the entity relation extraction model, and to train the entity relation extraction model according to the two entities labeled in the sentence, the concepts respectively corresponding to the two labeled entities, and the relation between the two labeled entities;
A second sentence representation module 607, configured to obtain a sentence vector for each sentence of the text set to be extracted;
An entity relation extraction module 608, configured to input the sentence vector of each sentence of the text set to be extracted into the entity relation extraction model, to obtain, for each sentence of the text set to be extracted, a relation set comprising two entities, the concepts respectively corresponding to the two entities, and the relation between the two entities.
In one embodiment, the distant supervision labeling module 603 further includes a context recognition unit 6031, configured to perform context recognition on the sentences of the training text set to obtain the concepts respectively corresponding to the two entities recognized in each sentence.
In one embodiment, the distant supervision labeling module 603 further includes a matching unit 6032 and a random selection unit 6033. The matching unit 6032 is configured to match the two entities recognized in a sentence of the training text set against the entity triple relation set. The random selection unit 6033 is configured to, if the matching fails, randomly select a relation from the entity triple relation set, generate a relation set comprising the sentence, the two labeled entities, the concepts respectively corresponding to the two labeled entities, and the randomly selected relation, and put this data set into the labeled training set as a negative sample.
In one embodiment, the distant supervision labeling module 603 further includes a confidence scoring unit 6034, configured to, if the matching succeeds, generate a relation set comprising the sentence, the two labeled entities, the concepts respectively corresponding to the two labeled entities, and the matched relation, and to perform confidence scoring on the matched relation. If the score exceeds a first set threshold, the data set is put into the labeled training set as a positive sample; if the score is below the first set threshold, the data set is put into the labeled training set as a negative sample.
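The matching, random selection, and confidence scoring behavior of units 6032 to 6034 can be sketched as a single labeling function. Several simplifications here are assumptions for illustration only: the knowledge base is reduced to a dictionary mapping an entity pair to one relation, score is a caller-supplied confidence function, and a score exactly equal to the threshold is treated as negative because the text leaves that case open.

```python
import random
from typing import Callable, Dict, List, NamedTuple, Tuple


class Sample(NamedTuple):
    sentence: str
    e1: str
    c1: str
    relation: str
    c2: str
    e2: str
    positive: bool


def label_sentence(
    sentence: str,
    e1: str,
    c1: str,
    e2: str,
    c2: str,
    kb: Dict[Tuple[str, str], str],
    all_relations: List[str],
    score: Callable[[str, str], float],
    threshold: float,
    rng: random.Random,
) -> Sample:
    """Label one sentence by matching its entity pair against the knowledge base."""
    relation = kb.get((e1, e2))
    if relation is None:
        # Matching failed: draw a random relation and keep it as a negative sample.
        drawn = rng.choice(all_relations)
        return Sample(sentence, e1, c1, drawn, c2, e2, False)
    # Matching succeeded: the confidence score decides positive vs. negative.
    confident = score(relation, sentence) > threshold
    return Sample(sentence, e1, c1, relation, c2, e2, confident)
```

Passing the random generator in explicitly keeps the negative sampling reproducible, which is convenient when comparing labeling runs.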
In one embodiment, the distant supervision labeling module 603 further includes:
A relation set acquisition unit 6035, configured to obtain, from the generated relation sets, multiple relation sets having the same concepts.
A context correlation judging unit 6036, configured to judge the degree of correlation between each relation in the multiple relation sets and the context of the sentence.
A relation substitution unit 6037, configured to substitute the relation with the largest degree of correlation into the multiple relation sets as the new relation.
In one embodiment, the distant supervision labeling module 603 further includes:
A relation set deletion unit 6038, configured to delete the multiple relation sets from the labeled training set.
A relation set replacement unit 6039, configured to put the multiple relation sets containing the new relation into the labeled training set.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
The entity relation extraction system in text proposed by the present invention uses a multi-concept, multi-relation knowledge base, which can fundamentally reduce and resolve the problem of erroneous labels in the knowledge base. At the same time, the system can make effective use of the concept information of entities and, combined with the context in which the entities appear, exclude noise relations before relation extraction is carried out, which reduces the search space of relation extraction and improves its speed and precision.
The present invention also provides a computer-readable medium on which a computer program is stored; when executed by a processor, the program implements the entity relation extraction method in text of any one of the above embodiments.
Referring to Fig. 7, in one embodiment, the electronic device 700 of the present invention includes a memory 710, a processor 720, and a computer program stored in the memory 710 and executable by the processor 720. When the processor 720 executes the computer program, it implements the entity relation extraction method in text of any one of the above embodiments.
In this embodiment, the processor 720 may be one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field-programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components. The memory 710 may take the form of a computer program product implemented on one or more storage media containing program code (including but not limited to magnetic disk storage, CD-ROM, and optical storage). Computer-readable storage media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The embodiments described above express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art may make various modifications and improvements without departing from the concept of the present invention, and these all fall within the scope of protection of the present invention.
Claims (10)
1. An entity relation extraction method in text, characterized by comprising the following steps:
obtaining an entity triple relation set, an entity and entity attribute set, and a concept set;
obtaining the sentences of a training text set and the triple relation set of the two entities recognized in each sentence;
performing distant supervision labeling on the sentences of the training text set and the triple relation sets of the two entities recognized in the sentences, according to the entity triple relation set, the entity and entity attribute set, and the concept set, to obtain relation sets each comprising a sentence of the training text set, the two entities recognized in the sentence, the concepts respectively corresponding to the two entities, and the relation between the two entities, and putting the relation sets into a labeled training set;
obtaining, according to the labeled training set, vector representations of the words in the sentences of the training text set;
obtaining a sentence vector for each sentence of the training text set according to the vector representations of the words in the sentence;
inputting the sentence vector of each sentence of the training text set into an entity relation extraction model, and training the entity relation extraction model according to the two entities labeled in the sentence, the concepts respectively corresponding to the two labeled entities, and the relation between the two labeled entities;
obtaining a sentence vector for each sentence of a text set to be extracted;
inputting the sentence vector of each sentence of the text set to be extracted into the entity relation extraction model, to obtain, for each sentence of the text set to be extracted, a relation set comprising two entities, the concepts respectively corresponding to the two entities, and the relation between the two entities.
2. The entity relation extraction method in text according to claim 1, characterized in that performing distant supervision labeling on the sentences of the training text set and the triple relation sets of the two entities recognized in the sentences, according to the entity triple relation set, the entity and entity attribute set, and the concept set, comprises:
performing context recognition on the sentences of the training text set to obtain the concepts respectively corresponding to the two entities recognized in each sentence.
3. The entity relation extraction method in text according to claim 2, characterized in that, after performing context recognition on the sentences of the training text set and obtaining the concepts respectively corresponding to the two entities recognized in each sentence, the method further comprises the following steps:
matching the two entities recognized in a sentence of the training text set against the entity triple relation set;
if the matching fails, randomly selecting a relation from the entity triple relation set, generating a relation set comprising the sentence, the two labeled entities, the concepts respectively corresponding to the two labeled entities, and the randomly selected relation, and putting this data set into the labeled training set as a negative sample.
4. The entity relation extraction method in text according to claim 3, characterized by further comprising the following steps:
if the matching succeeds, generating a relation set comprising the sentence, the two labeled entities, the concepts respectively corresponding to the two labeled entities, and the matched relation, and performing confidence scoring on the matched relation; if the score exceeds a first set threshold, putting the data set into the labeled training set as a positive sample; if the score is below the first set threshold, putting the data set into the labeled training set as a negative sample.
5. The entity relation extraction method in text according to claim 4, characterized in that performing confidence scoring on the matched relation comprises:
obtaining the degree of correlation between the matched relation and the context in the sentence according to the ratio at which the relation appears together with the context of the sentence in the corpus, wherein the higher the degree of correlation, the higher the confidence score.
6. The entity relation extraction method in text according to claim 5, characterized by further comprising the following steps:
obtaining, from the generated relation sets, multiple relation sets having the same concepts;
judging the degree of correlation between each relation in the multiple relation sets and the context of the sentence;
substituting the relation with the largest degree of correlation into the multiple relation sets as the new relation.
7. The entity relation extraction method in text according to claim 6, characterized in that, after substituting the relation with the largest degree of correlation into the multiple relation sets as the new relation, the method further comprises the following steps:
deleting the multiple relation sets from the labeled training set;
putting the multiple relation sets containing the new relation into the labeled training set.
8. An entity relation extraction system in text, characterized by comprising:
a first acquisition module, configured to obtain an entity triple relation set, an entity and entity attribute set, and a concept set;
a second acquisition module, configured to obtain the sentences of a training text set and the triple relation set of the two entities recognized in each sentence;
a distant supervision labeling module, configured to perform distant supervision labeling on the sentences of the training text set and the triple relation sets of the two entities recognized in the sentences, according to the entity triple relation set, the entity and entity attribute set, and the concept set, to obtain relation sets each comprising a sentence of the training text set, the two entities recognized in the sentence, the concepts respectively corresponding to the two entities, and the relation between the two entities, and to put the relation sets into a labeled training set;
an input representation module, configured to obtain, according to the labeled training set, vector representations of the words in the sentences of the training text set;
a first sentence representation module, configured to obtain a sentence vector for each sentence of the training text set according to the vector representations of the words in the sentence;
an entity relation extraction model training module, configured to input the sentence vector of each sentence of the training text set into an entity relation extraction model, and to train the entity relation extraction model according to the two entities labeled in the sentence, the concepts respectively corresponding to the two labeled entities, and the relation between the two labeled entities;
a second sentence representation module, configured to obtain a sentence vector for each sentence of a text set to be extracted;
an entity relation extraction module, configured to input the sentence vector of each sentence of the text set to be extracted into the entity relation extraction model, to obtain, for each sentence of the text set to be extracted, a relation set comprising two entities, the concepts respectively corresponding to the two entities, and the relation between the two entities.
9. A computer-readable medium on which a computer program is stored, characterized in that:
the computer program, when executed by a processor, implements the entity relation extraction method in text according to any one of claims 1 to 7.
10. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable by the processor, characterized in that:
when the processor executes the computer program, the entity relation extraction method in text according to any one of claims 1 to 7 is implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811376209.2A CN109472033B (en) | 2018-11-19 | 2018-11-19 | Method and system for extracting entity relationship in text, storage medium and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109472033A (en) | 2019-03-15 |
CN109472033B (en) | 2022-12-06 |
Family
ID=65673074
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811376209.2A Active CN109472033B (en) | 2018-11-19 | 2018-11-19 | Method and system for extracting entity relationship in text, storage medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109472033B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105808525A (en) * | 2016-03-29 | 2016-07-27 | 国家计算机网络与信息安全管理中心 | Domain concept hypernym-hyponym relation extraction method based on similar concept pairs |
US20170098173A1 (en) * | 2011-08-08 | 2017-04-06 | Gravity.Com, Inc. | Entity analysis system |
CN106874261A (en) * | 2017-03-17 | 2017-06-20 | 中国科学院软件研究所 | A kind of domain knowledge collection of illustrative plates and querying method based on semantic triangle |
CN107169079A (en) * | 2017-05-10 | 2017-09-15 | 浙江大学 | A kind of field text knowledge abstracting method based on Deepdive |
CN108287911A (en) * | 2018-02-01 | 2018-07-17 | 浙江大学 | A kind of Relation extraction method based on about fasciculation remote supervisory |
Non-Patent Citations (1)
Title |
---|
周春 等: "基于概念语义相关性和LDA的文本标记算法", 《华南师范大学学报(自然科学版)》 * |
Cited By (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110070093A (en) * | 2019-04-08 | 2019-07-30 | 东南大学 | A kind of remote supervision Relation extraction denoising method based on confrontation study |
CN110209836A (en) * | 2019-05-17 | 2019-09-06 | 北京邮电大学 | Remote supervisory Relation extraction method and device |
CN110209836B (en) * | 2019-05-17 | 2022-04-26 | 北京邮电大学 | Remote supervision relation extraction method and device |
CN111475641A (en) * | 2019-08-26 | 2020-07-31 | 北京国双科技有限公司 | Data extraction method and device, storage medium and equipment |
CN110516252A (en) * | 2019-08-30 | 2019-11-29 | 京东方科技集团股份有限公司 | Data mask method, device, computer equipment and storage medium |
US11954439B2 (en) | 2019-08-30 | 2024-04-09 | Boe Technology Group Co., Ltd. | Data labeling method and device, and storage medium |
WO2021036968A1 (en) * | 2019-08-30 | 2021-03-04 | 京东方科技集团股份有限公司 | Data labeling method and device, and storage medium |
CN110674637A (en) * | 2019-09-06 | 2020-01-10 | 腾讯科技(深圳)有限公司 | Character relation recognition model training method, device, equipment and medium |
CN110674637B (en) * | 2019-09-06 | 2023-07-11 | 腾讯科技(深圳)有限公司 | Character relationship recognition model training method, device, equipment and medium |
CN110569366A (en) * | 2019-09-09 | 2019-12-13 | 腾讯科技(深圳)有限公司 | text entity relation extraction method and device and storage medium |
CN112579748A (en) * | 2019-09-30 | 2021-03-30 | 北京国双科技有限公司 | Method and device for extracting specific event relation from inquiry record |
CN110765231A (en) * | 2019-10-11 | 2020-02-07 | 南京摄星智能科技有限公司 | Chapter event extraction method based on common-finger fusion |
CN111241303A (en) * | 2020-01-16 | 2020-06-05 | 东方红卫星移动通信有限公司 | Remote supervision relation extraction method for large-scale unstructured text data |
CN111291554A (en) * | 2020-02-27 | 2020-06-16 | 京东方科技集团股份有限公司 | Labeling method, relation extracting method, storage medium, and computing device |
CN111291554B (en) * | 2020-02-27 | 2024-01-12 | 京东方科技集团股份有限公司 | Labeling method, relation extracting method, storage medium and arithmetic device |
CN111563374A (en) * | 2020-03-23 | 2020-08-21 | 北京交通大学 | Personnel social relationship extraction method based on judicial official documents |
CN111914553A (en) * | 2020-08-11 | 2020-11-10 | 民生科技有限责任公司 | Financial information negative subject judgment method based on machine learning |
CN111914553B (en) * | 2020-08-11 | 2023-10-31 | 民生科技有限责任公司 | Financial information negative main body judging method based on machine learning |
CN112507125A (en) * | 2020-12-03 | 2021-03-16 | 平安科技(深圳)有限公司 | Triple information extraction method, device, equipment and computer readable storage medium |
WO2022116417A1 (en) * | 2020-12-03 | 2022-06-09 | 平安科技(深圳)有限公司 | Triple information extraction method, apparatus, and device, and computer-readable storage medium |
CN112559770A (en) * | 2020-12-15 | 2021-03-26 | 北京邮电大学 | Text data relation extraction method, device and equipment and readable storage medium |
CN112613306A (en) * | 2020-12-31 | 2021-04-06 | 恒安嘉新(北京)科技股份公司 | Method, device, electronic equipment and storage medium for extracting entity relationship |
CN113051356A (en) * | 2021-04-21 | 2021-06-29 | 深圳壹账通智能科技有限公司 | Open relationship extraction method and device, electronic equipment and storage medium |
CN113268577A (en) * | 2021-06-04 | 2021-08-17 | 厦门快商通科技股份有限公司 | Training data processing method and device based on dialogue relation and readable medium |
CN113282717A (en) * | 2021-07-23 | 2021-08-20 | 北京惠每云科技有限公司 | Method and device for extracting entity relationship in text, electronic equipment and storage medium |
CN113806493B (en) * | 2021-10-09 | 2023-08-29 | 中国人民解放军国防科技大学 | Entity relationship joint extraction method and device for Internet text data |
CN113806493A (en) * | 2021-10-09 | 2021-12-17 | 中国人民解放军国防科技大学 | Entity relationship joint extraction method and device for Internet text data |
CN114139515A (en) * | 2021-10-18 | 2022-03-04 | 浙江香侬慧语科技有限责任公司 | Method, device, medium and equipment for generating rephrase text |
CN114519105A (en) * | 2021-12-24 | 2022-05-20 | 北京达佳互联信息技术有限公司 | Concept word determining method and device, electronic equipment and storage medium |
CN114637824A (en) * | 2022-03-18 | 2022-06-17 | 马上消费金融股份有限公司 | Data enhancement processing method and device |
CN114637824B (en) * | 2022-03-18 | 2023-12-01 | 马上消费金融股份有限公司 | Data enhancement processing method and device |
CN116205235A (en) * | 2023-05-05 | 2023-06-02 | 北京脉络洞察科技有限公司 | Data set dividing method and device and electronic equipment |
CN116205235B (en) * | 2023-05-05 | 2023-08-01 | 北京脉络洞察科技有限公司 | Data set dividing method and device and electronic equipment |
CN117909487A (en) * | 2024-03-20 | 2024-04-19 | 北方健康医疗大数据科技有限公司 | Medical question-answering service method, system, device and medium for old people |
Also Published As
Publication number | Publication date |
---|---|
CN109472033B (en) | 2022-12-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109472033A (en) | Entity relation extraction method and system in text, storage medium, electronic equipment | |
CN117033608B (en) | Knowledge graph generation type question-answering method and system based on large language model | |
CN110727779A (en) | Question-answering method and system based on multi-model fusion | |
CN111444344B (en) | Entity classification method, entity classification device, computer equipment and storage medium | |
CN111914558A (en) | Course knowledge relation extraction method and system based on sentence bag attention remote supervision | |
US20050027664A1 (en) | Interactive machine learning system for automated annotation of information in text | |
Jin et al. | Improving embedded knowledge graph multi-hop question answering by introducing relational chain reasoning | |
CN110968699A (en) | Logic map construction and early warning method and device based on event recommendation | |
CN109710744B (en) | Data matching method, device, equipment and storage medium | |
CN112328766B (en) | Knowledge graph question-answering method and device based on path search | |
CN110765277B (en) | Knowledge-graph-based mobile terminal online equipment fault diagnosis method | |
CN109933671A (en) | Construct method, apparatus, computer equipment and the storage medium of personal knowledge map | |
CN109522396B (en) | Knowledge processing method and system for national defense science and technology field | |
CN105677637A (en) | Method and device for updating abstract semantics database in intelligent question-answering system | |
CN115526236A (en) | Text network graph classification method based on multi-modal comparative learning | |
CN114117070A (en) | Method, system and storage medium for constructing knowledge graph | |
CN115390806A (en) | Software design mode recommendation method based on bimodal joint modeling | |
CN114840685A (en) | Emergency plan knowledge graph construction method | |
CN110413779B (en) | Word vector training method, system and medium for power industry | |
CN112749530B (en) | Text encoding method, apparatus, device and computer readable storage medium | |
CN112528003B (en) | Multi-item selection question-answering method based on semantic sorting and knowledge correction | |
CN116049376B (en) | Method, device and system for retrieving and replying information and creating knowledge | |
CN117216221A (en) | Intelligent question-answering system based on knowledge graph and construction method | |
Goldberg et al. | CASTLE: crowd-assisted system for text labeling and extraction | |
CN116484021A (en) | Method, device and storage medium for constructing leetcode question bank knowledge graph |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||