CN108874878A - System and method for constructing a knowledge graph - Google Patents

System and method for constructing a knowledge graph

Info

Publication number
CN108874878A
Authority
CN
China
Prior art keywords
relationship
subject
module
classifier
audit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810415531.5A
Other languages
Chinese (zh)
Other versions
CN108874878B (en
Inventor
李勇
倪博溢
周笑添
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhongan Information Technology Service Co ltd
Original Assignee
Zhongan Information Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongan Information Technology Service Co Ltd
Priority to CN201810415531.5A
Publication of CN108874878A
Application granted
Publication of CN108874878B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a system and method for constructing a knowledge graph, belonging to the technical fields of natural language processing and computer information processing. The system comprises: a crawler module, which crawls text and cleans the data; a basic labeling module, which performs basic labeling work including a subject completion operation; a candidate relationship extraction module, which extracts candidate relationships including candidate relationship sentences and/or relationship entity pairs; a feature extraction module, which performs feature extraction; a relationship classifier training module, which trains a model on the candidate relationship extraction results and the feature extraction results to build a relationship classifier; and a relationship auditing module, which audits the candidate sentence relationships produced by the relationship classifier and adjusts the relationship classifier according to the audit result. The invention achieves stronger relation extraction capability, reduces the cost of manual participation, and improves the efficiency of knowledge graph construction.

Description

A system and method for constructing a knowledge graph
Technical field
The present invention relates to the technical fields of natural language processing and computer information processing, and in particular to a system and method for constructing a knowledge graph.
Background
A knowledge graph is a form and specification of knowledge organization that is centered on natural language processing (NLP) and combines techniques from applied mathematics, graphics, and information visualization. Knowledge graphs now have mature applications in many areas of artificial intelligence, such as search engines, chatbots, intelligent healthcare, and smart hardware. Knowledge graphs are divided into domain knowledge graphs and general knowledge graphs; Google proposed the concept of the general knowledge graph in 2012. A general knowledge graph emphasizes breadth, and it is difficult to produce unified, global management at its ontology layer. Common general knowledge graphs include Freebase, DBpedia, and zhishi.me. A domain knowledge graph is a knowledge base system built for a specific domain to serve different business scenarios, with a certain depth and completeness. General and domain knowledge graphs are not contradictory but complementary: combining the breadth of a general knowledge graph with the depth of a domain knowledge graph yields a more complete knowledge graph.
A knowledge graph is an effective way of representing relationships: it links different kinds of information into a relational network. Through a knowledge graph, semantic understanding and reasoning can be achieved by deriving relationships. The basic form of a relationship is a triple such as <node, relation, node>, which can express that two entities stand in some relationship or that an entity has some attribute. For example: <Zhang San, parent, Li Si>, <Old Six, parent, Li Si>, <Zhang San, gender, male>, <Old Six, gender, female> => <Zhang San, spouse, Old Six>. The four triples state, in order, that Zhang San and Li Si are in a parent-child relationship, that Old Six and Li Si are also in a parent-child relationship, that Zhang San has the gender attribute male, and that Old Six has the gender attribute female. From these four facts it can be derived that Zhang San and Old Six are spouses.
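The derivation in this example can be sketched in a few lines of Python. This is only an illustration, not the patent's method: the tuple layout, the reading of "parent" as "subject is a parent of object", and the function name `infer_spouses` are all assumptions made for the sketch.

```python
from itertools import combinations

# Triples as (subject, relation, object). Reading "parent" as
# "subject is a parent of object" is an assumption of this sketch.
triples = {
    ("Zhang San", "parent", "Li Si"),
    ("Old Six", "parent", "Li Si"),
    ("Zhang San", "gender", "male"),
    ("Old Six", "gender", "female"),
}

def infer_spouses(facts):
    """Derive spouse triples: two parents of the same child with different genders."""
    gender = {s: o for s, r, o in facts if r == "gender"}
    parents_of = {}
    for s, r, o in facts:
        if r == "parent":
            parents_of.setdefault(o, set()).add(s)
    derived = set()
    for parents in parents_of.values():
        for a, b in combinations(sorted(parents), 2):
            if a in gender and b in gender and gender[a] != gender[b]:
                # The spouse relation is symmetric, so emit both directions.
                derived.add((a, "spouse", b))
                derived.add((b, "spouse", a))
    return derived

inferred = infer_spouses(triples)
```

The rule is deliberately naive; it only mirrors the toy example and would need more evidence in real data.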
The core step in building a knowledge graph is relation extraction. Existing schemes for building domain knowledge graphs fall mainly into two types. The first is top-down: an ontology-based data schema is created first, and relationship triples are then obtained by mapping high-quality structured data onto the graph. This method is highly reliable but very time-consuming and labor-intensive, requires strong domain knowledge as support, and generally cannot scale to large data volumes. The second is bottom-up: relations are extracted from public data sets by some technical means. Public data sets usually contain a small amount of semi-structured data and a large amount of unstructured data. For semi-structured data such as tables, lists, dictionaries, and infoboxes, a wrapper is generally used, with rules written according to the form in which the data is presented, to extract relationships. Relationships in unstructured plain text, however, appear in many different forms. For example, the following four sentences can all express a spousal relationship between A and B: 1. A and B got married. 2. A married B. 3. B married A. 4. C's father and mother are A and B. All four express the spousal relationship; although there are some regularities to follow, they are hard to handle with pattern rules alone, because relationships in unstructured text are often tied to the semantic features of the sentence.
Some existing schemes extract relationship triples with regular-expression templates. The advantage of this method is that it is accurate and reliable, but its drawbacks are obvious: first, the templates must be written by hand and cannot be automated; second, they only fit specific sentence patterns. Other schemes build on rule-based extraction by first performing manual rule learning to generate a new rule set, and then extracting unclassified relation patterns with the new rules. Although this improves rule-based extraction, it cannot be deployed automatically, and the rule learning stage requires constant manual review, so it is not a good solution. Extracting relationships from unstructured plain text to build a knowledge graph has always been a difficult problem.
Summary of the invention
To solve the problems in the prior art, embodiments of the present invention provide a system and method for constructing a knowledge graph. The technical solution is as follows:
In a first aspect, a system for constructing a knowledge graph is provided, including:
a crawler module, for crawling text and cleaning the data;
a basic labeling module, for performing basic labeling work that includes a subject completion operation;
a candidate relationship extraction module, for extracting candidate relationships including candidate relationship sentences and/or relationship entity pairs;
a feature extraction module, for performing feature extraction;
a relationship classifier training module, for training a model on the candidate relationship extraction results and the feature extraction results to build a relationship classifier;
a relationship auditing module, for auditing the candidate sentence relationships obtained by the relationship classifier and adjusting the relationship classifier according to the audit result.
With reference to the first aspect, in a first possible implementation, the system further includes:
a heuristic rule base, for setting heuristic rules for relationship extraction;
the relationship auditing module is configured to make the audit determination by combining the candidate sentence relationships obtained by the relationship classifier with the heuristic rules, and to adjust the relationship classifier according to the audit result.
With reference to the first possible implementation of the first aspect, in a second possible implementation, the system further includes:
a log analysis module, for mining original logs to obtain the heuristic rules; and/or for mining the audit results of the relationship auditing module to update the heuristic rules.
With reference to the first aspect and the first and second possible implementations of the first aspect, in third to fifth possible implementations, the system further includes:
a feature weight update module, for updating the weights of the relationship classifier according to the audit results of the relationship auditing module.
With reference to the first aspect, in a sixth possible implementation, the basic labeling module is configured to perform basic labeling work including word segmentation, part-of-speech tagging, named entity recognition, syntactic dependency parsing, and the subject completion operation.
With reference to the first aspect, in a seventh possible implementation, the feature extraction module is configured to extract word embedding features based on a neural network language model, lexical-level features based on co-occurrence sequences between words, and/or grammatical features based on syntactic structure.
With reference to the first aspect and the first, second, sixth, and seventh possible implementations of the first aspect, in eighth to eleventh possible implementations, the subject completion operation includes:
determining whether the sentence contains a subject;
if so, determining whether the subject is a referring pronoun; if so, determining whether the preceding sentence contains a subject; if so, determining whether that subject is an entity word; if so, completing the subject of the sentence with that subject;
if not, determining whether the preceding sentence contains a subject; if so, determining whether that subject is an entity word; if so, completing the subject of the sentence with that subject.
With reference to the first aspect and the first, second, sixth, and seventh possible implementations of the first aspect, in twelfth to fifteenth possible implementations, the relationship auditing module makes the candidate relationship audit determination by a voting mechanism and/or manual adjudication.
In a second aspect, a method for constructing a knowledge graph is provided, including:
crawling text and cleaning the data;
performing basic labeling work that includes a subject completion operation;
extracting candidate relationships including candidate relationship sentences and/or relationship entity pairs;
performing feature extraction;
training a model on the candidate relationship extraction results and the feature extraction results to build a relationship classifier;
auditing the candidate sentence relationships obtained by the relationship classifier, and adjusting the relationship classifier according to the audit result.
With reference to the second aspect, in a first possible implementation, the method further includes:
setting heuristic rules for relationship extraction;
auditing the candidate sentence relationships obtained by the relationship classifier and adjusting the relationship classifier according to the audit result then includes:
making the audit determination by combining the candidate sentence relationships obtained by the relationship classifier with the heuristic rules, and adjusting the relationship classifier according to the audit result.
With reference to the first possible implementation of the second aspect, in a second possible implementation, the method further includes:
mining original logs to obtain the heuristic rules; and/or
mining the audit results of the relationship auditing module to update the heuristic rules.
With reference to the second aspect and the first and second possible implementations of the second aspect, in third to fifth possible implementations, the method further includes:
updating the weights of the relationship classifier according to the audit results of the relationship auditing module.
With reference to the second aspect, in a sixth possible implementation, performing the basic labeling work that includes the subject completion operation includes:
performing basic labeling work including word segmentation, part-of-speech tagging, named entity recognition, syntactic dependency parsing, and the subject completion operation.
With reference to the second aspect, in a seventh possible implementation, performing feature extraction includes:
extracting word embedding features based on a neural network language model, lexical-level features based on co-occurrence sequences between words, and/or grammatical features based on syntactic structure.
With reference to the second aspect and the first, second, sixth, and seventh possible implementations of the second aspect, in eighth to eleventh possible implementations, the subject completion operation includes:
determining whether the sentence contains a subject;
if so, determining whether the subject is a referring pronoun; if so, determining whether the preceding sentence contains a subject; if so, determining whether that subject is an entity word; if so, completing the subject of the sentence with that subject;
if not, determining whether the preceding sentence contains a subject; if so, determining whether that subject is an entity word; if so, completing the subject of the sentence with that subject.
With reference to the second aspect and the first, second, sixth, and seventh possible implementations of the second aspect, in twelfth to fifteenth possible implementations, the candidate relationship audit determination is made by a voting mechanism and/or manual adjudication.
The beneficial effects of the technical solution provided by the embodiments of the present invention are as follows.
Compared with the prior art, the knowledge graph construction system and method provided by the embodiments of the present invention have the following beneficial effects:
1. Because a subject completion operation is included in the basic labeling work and is combined with crawling, the other basic labeling steps, candidate relationship extraction, feature extraction, statistical machine learning training, and relationship auditing, the knowledge graph construction system and method have stronger relation extraction capability, and extracting relationships from unstructured plain text to build a knowledge graph can be deployed conveniently and automatically;
2. The labeling approach combines a heuristic rule base with statistical machine learning, which avoids large-scale corpus annotation while still ensuring relatively high precision and recall;
3. Log analysis and weight updating give the system a continuous iterative learning capability, so that its relation extraction capability improves as the data volume grows.
In general, the knowledge graph construction system and method provided by the embodiments of the present invention combine subject completion with statistical machine learning based on a relationship classifier, iterate continuously, and optimize parameters, thereby achieving stronger relation extraction capability, reducing the cost of manual participation, and improving the efficiency of knowledge graph construction. Because of this relation extraction capability and processing efficiency, the scheme is particularly suitable for building knowledge graphs from unstructured plain text and has good application prospects in fields involving knowledge graphs. It should be noted that the embodiments focus on building company graphs in the financial domain as a practical reference; in principle, however, the scheme is applicable to building a knowledge graph for any domain and also provides a relatively new reference for building general knowledge graphs.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a structural diagram of the knowledge graph construction system provided by an embodiment of the present invention;
Fig. 2 is an example of a dependency structure;
Fig. 3 is an example of the subject completion algorithm;
Fig. 4 is a table of example lexical features of a sentence;
Fig. 5 is a schematic flow diagram of an example of building a knowledge graph through relation extraction according to an embodiment of the present invention;
Fig. 6 is an example of a knowledge graph built by the knowledge graph construction system provided by an embodiment of the present invention;
Fig. 7 is a structural diagram of the knowledge graph construction system provided by an embodiment of the present invention;
Fig. 8 is an example of a heuristic rule set;
Fig. 9 is a schematic diagram of the data processing flow in the system according to an embodiment of the present invention;
Fig. 10 is a flowchart of the knowledge graph construction method provided by an embodiment of the present invention;
Fig. 11 is a flowchart of the knowledge graph construction method provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
The knowledge graph construction system and method provided by the embodiments of the present invention build a knowledge graph through crawling and preprocessing of text, basic labeling, candidate relationship extraction, feature extraction, relationship classifier training, and relationship auditing. Because a subject completion operation is included in the basic labeling work and is combined with statistical machine learning based on a relationship classifier, with continuous iterative updating and parameter optimization, the scheme achieves stronger relation extraction capability, reduces the cost of manual participation, and improves the efficiency of knowledge graph construction. Because of this relation extraction capability and processing efficiency, the scheme is particularly suitable for building knowledge graphs from unstructured plain text and has good application prospects in fields involving knowledge graphs.
The knowledge graph construction system and method provided by the embodiments of the present invention are further described below with reference to specific embodiments.
Embodiment 1
Fig. 1 is a structural diagram of the knowledge graph construction system provided by an embodiment of the present invention. As shown in Fig. 1, the system includes the following components: a crawler module, a basic labeling module, a candidate relationship extraction module, a feature extraction module, a relationship classifier training module, and a relationship auditing module.
The crawler module crawls text and cleans the data. Specifically, the crawler fetches relevant material, and the cleaned text is passed as input to the basic labeling module.
The basic labeling module performs basic labeling work that includes the subject completion operation. Specifically, it performs word segmentation (word-seg), part-of-speech tagging (POS), named entity recognition (NER), syntactic dependency parsing (dep-parser), and the subject completion operation.
It should be noted that, in addition to the steps listed above, the basic labeling work performed by the basic labeling module may include any other natural language processing (NLP) labeling operation available in the prior art; the embodiments of the present invention place no particular limitation on this.
For example, the basic labeling module first splits the text into sentences at paragraph marks or punctuation marks, and then processes each sentence as a pipeline: word segmentation, part-of-speech tagging, named entity recognition, and dependency parsing, in that order.
In the NER step, a dictionary is combined with a model: the entity recognition model is a CRF model trained on data annotated through a crowdsourcing platform, and the final result is produced in combination with a domain dictionary. Entity nouns that word segmentation has split apart are restored according to the entity recognition result. For example, "Xiaomi Technology" may be split into "Xiaomi" and "Technology", but the pieces are later recombined into "Xiaomi Technology" according to the NER result. At this point two lists are available: the list of sentence tokens and the list holding the sentence's dependency structure.
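The recombination of split entity tokens can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `merge_entity_tokens`, the representation of NER output as a set of entity strings, and the assumption that tokens join without spaces (as in Chinese text) are all choices made for the sketch.

```python
def merge_entity_tokens(tokens, entities):
    """Re-join adjacent tokens that together spell a recognized entity.

    tokens:   list of strings produced by word segmentation
    entities: set of entity strings returned by NER (hypothetical input format)
    """
    merged, i = [], 0
    while i < len(tokens):
        match_end = None
        # Prefer the longest run of two or more tokens that forms a known entity.
        for j in range(len(tokens), i + 1, -1):
            if "".join(tokens[i:j]) in entities:
                match_end = j
                break
        if match_end is None:
            merged.append(tokens[i])
            i += 1
        else:
            merged.append("".join(tokens[i:match_end]))
            i = match_end
    return merged

# Assumed Chinese form of the company name used in the example above.
tokens = ["小米", "科技", "发布", "新", "手机"]
result = merge_entity_tokens(tokens, {"小米科技"})
```

For languages written with spaces, the join and the entity lookup would need a separator; that case is outside this sketch.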
A dependency structure is a tree with root as its root node that shows the dependency relation of each word in the sentence. Fig. 2 shows a typical dependency structure, in which ATT denotes an attributive relation, SBV a subject-verb relation, and VOB a verb-object (direct object) relation. Dependency analysis can be used to extract the backbone of a sentence, handle coordination, and so on. In this example the dependency tree is stored as a list.
Relation extraction first requires splitting long passages of text into sentences and then extracting candidate entity pairs and their associated features. When text is split into sentences, a sentence often lacks a subject or uses a referring word in place of the subject, even though the sentence itself carries strong relationship features. Therefore, after dependency parsing, the subject of the current sentence can be completed and filled in based on the dependency information of the sentence's context; this is the subject completion operation.
Fig. 3 shows the detailed flow of a preferred subject completion algorithm. The flow is as follows:
First, determine whether the sentence contains a subject.
If so, determine whether the subject is a referring pronoun; if so, determine whether the preceding sentence contains a subject; if so, determine whether that subject is an entity word; if so, complete the subject of the sentence with that subject.
If not, determine whether the preceding sentence contains a subject; if so, determine whether that subject is an entity word; if so, complete the subject of the sentence with that subject.
Subject completion is performed only when the above conditions are met; in all other cases no subject completion is performed.
That is, if a sentence lacks a subject or contains a referring word, dependency analysis of the sentence is combined with the semantic structure of the preceding sentence to fill in the subject. Take the text "Ma Yun was born in 1964; he is the main founder of Alibaba Group." From the dependency information, the subject of the first sentence is a person-name entity (person: Ma Yun). In the second sentence the subject is "he", the predicate is "is", and the object is "the main founder of Alibaba Group", whose modifier contains the entity word "Alibaba Group". The referring word in the second sentence can therefore be replaced with the subject entity word of the first sentence, giving "Ma Yun was born in 1964; Ma Yun is the main founder of Alibaba Group."
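The decision flow above can be sketched in Python. This is a simplified illustration under stated assumptions: the `Sentence` record, the `PRONOUNS` list, and the function `complete_subject` are hypothetical, and the subject and entity flags are assumed to come from the dependency parsing and NER steps described earlier.

```python
import re
from dataclasses import dataclass
from typing import Optional

PRONOUNS = {"he", "she", "it", "they"}  # illustrative pronoun list

@dataclass
class Sentence:
    text: str
    subject: Optional[str]           # subject found by dependency parsing, or None
    subject_is_entity: bool = False  # True if NER tagged the subject as an entity

def complete_subject(current: Sentence, previous: Optional[Sentence]) -> str:
    """Return the sentence text, with its subject completed when the conditions hold."""
    needs_completion = current.subject is None or current.subject.lower() in PRONOUNS
    if not needs_completion or previous is None:
        return current.text
    if previous.subject is None or not previous.subject_is_entity:
        return current.text
    if current.subject is None:
        # Missing subject: prepend the previous sentence's entity subject.
        return f"{previous.subject} {current.text}"
    # Pronoun subject: replace its first whole-word occurrence with the entity.
    pattern = rf"\b{re.escape(current.subject)}\b"
    return re.sub(pattern, lambda _: previous.subject, current.text, count=1)

first = Sentence("Ma Yun was born in 1964.", "Ma Yun", True)
second = Sentence("he is the main founder of Alibaba Group.", "he", False)
completed = complete_subject(second, first)
```

Only the immediately preceding sentence is consulted, matching the flow in the text; looking further back would be a different design.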
After the basic labeling work is completed, the processed data is passed to the candidate relationship extraction module.
The candidate relationship extraction module extracts candidate relationships including candidate relationship sentences and/or relationship entity pairs. Specifically, based on the output of the basic labeling module, it filters out the candidate sentences that may contain a relationship. The basic extraction process is as follows: first, determine whether the number of entities in the sentence exceeds a threshold; second, determine whether the entity types in the sentence match the entity types required by the relationship. A sentence that satisfies both conditions is a qualified candidate sentence. When a candidate sentence contains multiple entities, the Cartesian product is used together with the entity type requirements of the relationship to exhaustively generate all candidate relationship pairs.
For example, sentences containing two or more entities are selected, and the entity types must meet the requirements of the relationship being extracted. When extracting company-company relationships, both entities must be of the company entity type; when extracting company-person relationships, a qualifying sentence must contain at least one company entity and one person-name entity.
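The filtering and pair generation described above can be sketched as follows. The relation names, the type labels `COMPANY` and `PERSON`, and the function `candidate_pairs` are illustrative assumptions, not names from the patent.

```python
from itertools import product

# Entity types required on each side of a relationship (illustrative names).
RELATION_TYPES = {
    "company-company": ("COMPANY", "COMPANY"),
    "company-person": ("COMPANY", "PERSON"),
}

def candidate_pairs(entities, relation, min_entities=2):
    """Generate candidate entity pairs for one sentence.

    entities: list of (text, type) tuples found in the sentence
    relation: key into RELATION_TYPES
    """
    # Condition 1: the sentence must contain enough entities.
    if len(entities) < min_entities:
        return []
    left_type, right_type = RELATION_TYPES[relation]
    left = [text for text, etype in entities if etype == left_type]
    right = [text for text, etype in entities if etype == right_type]
    # Condition 2 is implicit: if either typed list is empty, no pairs result.
    # Cartesian product of the two typed lists, skipping self-pairs.
    return [(a, b) for a, b in product(left, right) if a != b]

sentence_entities = [
    ("Company A", "COMPANY"),
    ("Company B", "COMPANY"),
    ("Zhang San", "PERSON"),
]
company_pairs = candidate_pairs(sentence_entities, "company-company")
person_pairs = candidate_pairs(sentence_entities, "company-person")
```

Ordered pairs are kept in both directions here because a relationship such as shareholding is directional; an undirected relationship could deduplicate them.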
It should be noted that, in addition to candidate relationship sentences and/or relationship entity pairs, the candidate relationship data extracted here may include any other type of candidate relationship extraction available in the prior art; the embodiments of the present invention place no particular limitation on this.
After processing by the candidate relationship extraction module, the data is passed to the feature extraction module, which performs feature extraction. Specifically, the feature extraction module extracts word embedding features based on a neural network language model, lexical-level features based on co-occurrence sequences between words, and/or grammatical features based on syntactic structure. Word embedding means representing the semantic information of a word in distributed form as a dense, low-dimensional, real-valued vector. The word embedding feature is based on pre-trained word2vec word vectors: exploiting the translation invariance of distributed word vectors in the embedding space, the cosine distance between the embedding vectors of the two entity words is computed. Fig. 4 is a table of example lexical features of a sentence and illustrates the lexical-level features. Grammatical features are sentence structure features based on dependency analysis and part of speech, such as the dependency word D1 of entity word c1, the dependency word D2 of entity word c2, the part of speech POSD1 of D1, and the part of speech POSD2 of D2. For example, once the sentence sequence and the dependency information of each word in the sentence have been obtained, the feature extraction module extracts sentence context features such as the verb between the two entities, the word preceding the first entity, and the word following the second entity.
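Two of these feature types can be sketched in plain Python. The sketch makes several assumptions: it computes cosine similarity (the cosine distance mentioned above is commonly taken as one minus this value), the embedding vectors are toy values rather than real word2vec output, and the part-of-speech tag `"v"` for verbs and the feature names are invented for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def lexical_features(tokens, pos_tags, i1, i2):
    """Context features for two entities at token indices i1 < i2."""
    between = zip(tokens[i1 + 1:i2], pos_tags[i1 + 1:i2])
    return {
        "verbs_between": [tok for tok, pos in between if pos == "v"],
        "word_before_e1": tokens[i1 - 1] if i1 > 0 else None,
        "word_after_e2": tokens[i2 + 1] if i2 + 1 < len(tokens) else None,
    }

tokens = ["Recently", "Company A", "acquired", "Company B", "successfully"]
pos_tags = ["d", "n", "v", "n", "d"]
features = lexical_features(tokens, pos_tags, 1, 3)

# Toy embedding vectors standing in for pre-trained word vectors.
similarity = cosine_similarity([3.0, 4.0], [3.0, 4.0])
```

In practice the vectors would be looked up from a pre-trained embedding table keyed by the entity words.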
Next, the relationship classifier training module trains a model on the candidate relationship extraction results and the feature extraction results to build the relationship classifier. The relationship classifier is preferably a Bayes classifier. The classifier can be built in either of the following two ways:
Way one: first collect a small number of entity relationship instances, use the crawler to fetch related text in a targeted manner, manually label a small number of samples, and pre-train a relation extraction model; then train the model on the candidate relationship extraction results and the feature extraction results to build the relationship classifier.
Way two: directly train the model on the candidate relationship extraction results and the feature extraction results to build the relationship classifier.
For example, when the classifier is built using way one, a small number of company relationship pairs and company-person relationship pairs are compiled manually as training examples, together with sentences containing those relationships. This requires a small amount of manual labeling, but the labeling is not ongoing; it is only a preparation step for pre-training, and the small amount of manually labeled data is used to initialize the feature values. The steps for building the classifier are roughly:
a) convert the data set into a frequency table;
b) create a likelihood table by computing the probability that the relationship holds for each feature;
c) use the Bayes formula to compute the score that the relationship holds.
Note that in the present design a classifier only decides the positive or negative class of a single relationship; to decide among multiple relationships, multiple classifiers can be built in parallel.
The candidate sentence relationships obtained by the above relationship classifier are input to the relationship auditing module, which audits and determines them to obtain data results that satisfy the audit conditions; the relationship classifier is then adjusted according to these data results so as to optimize it.
The relationship classifier optimized through the above audit yields a series of relationship result data. The relationship entity pairs are stored in a relationship database, which serves as the basic knowledge carrier of the knowledge graph for upper-layer interface queries and for knowledge processing and reasoning; at this point the system has completed the construction of the knowledge graph. Illustratively, the extracted relationship triples are finally stored in the relationship database to establish a basic data platform: the Neo4j graph database is selected, the results are automatically stored in the database using the Cypher graph query language, and support for an upper-layer query interface is established.
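As a hedged illustration of storing a relationship triple with Cypher, the sketch below only builds a parameterized statement and its parameter map; sending it to Neo4j through a driver is left out. The `Entity` label, the `name` property, and the function name are assumptions for the example and are not specified by the embodiment.

```python
import re

def triple_to_cypher(subject, relation, obj):
    """Return (query, params) that upserts one (subject, relation, object) triple."""
    # Cypher does not allow a relationship type to be passed as a parameter,
    # so the type is validated and then interpolated into the statement.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", relation):
        raise ValueError("invalid relationship type: %r" % relation)
    query = (
        "MERGE (a:Entity {name: $subject}) "
        "MERGE (b:Entity {name: $object}) "
        "MERGE (a)-[:" + relation + "]->(b)"
    )
    return query, {"subject": subject, "object": obj}
```

Using `MERGE` rather than `CREATE` keeps repeated extraction of the same triple from producing duplicate nodes or edges.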
Fig. 5 is a schematic flow diagram of an example of building a knowledge graph by relation extraction according to an embodiment of the present invention, showing how the knowledge graph construction system obtains the final knowledge graph from plain text. Fig. 6 is an example of a knowledge graph constructed by the knowledge graph construction system provided by an embodiment of the present invention, showing a knowledge graph of shareholder relationships.
Embodiment 2
Fig. 7 is a structural schematic diagram of the knowledge graph construction system provided by Embodiment 2 of the present invention. As shown in Fig. 7, the knowledge graph construction system provided by this embodiment includes the following components: a crawler module, an NLP (natural language processing) basic annotation module, a candidate relationship extraction module, a feature extraction module, a relationship classifier training module, a heuristic rule base, a relationship auditing module, a log analysis module, and a feature weight update module.
Here the crawler module, the NLP (natural language processing) basic annotation module, the candidate relationship extraction module, the feature extraction module, and the relationship classifier training module are the same as the corresponding modules described in Embodiment 1, and are therefore not described again.
The heuristic rule base is used for setting the heuristic rules for relationship extraction.
Specifically, the heuristic rules may be a heuristic rule set that is configured manually, for example a set compiled by hand according to domain knowledge; they may also be summarized automatically by mining the original log, for example by performing sequence mining on all labeled sentences in the log and deriving heuristic rules automatically in combination with a suitable algorithm. Fig. 8 shows an example of a heuristic rule set.
The relationship auditing module is used for auditing and determining relationships by combining the candidate sentence relationships obtained by the relationship classifier with the heuristic rules of the heuristic rule base, and for adjusting the relationship classifier according to the audit result so as to optimize it. The audit and determination process can be carried out as follows:
The heuristic rules and the classifier are applied to a sentence at the same time, and the final relationship determination is made from their results by an arbitration mechanism, which uses a voting method, a manual adjudication method, or a combination of both. Illustratively, the relationship classifier and the heuristic rules each score and vote on the candidate entity pairs in a new, unclassified sentence. The classifier scoring rule is: if the classification score (classify_score) exceeds a certain threshold, a positive vote (+1) is cast; otherwise a negative vote (-1) is cast. The rule scoring mechanism is: if a certain rule is satisfied, a positive vote is cast; otherwise a negative vote is cast. All votes are then added together; if the final vote of the two modes is 0, the final audit decision is made by the relationship auditing module. In other words, when the heuristic rules and the classifier both give a determination, the ruling is given by voting; if the conflict cannot be resolved, the sentence is marked and passed to the log analysis module to await manual adjudication.
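The voting arbitration can be sketched as follows. The sketch reads the description as one vote per mode (one for the classifier, one for the rule set, which votes positive if any rule matches); that reading, the representation of rules as Python callables, the default threshold, and the returned labels are all assumptions for illustration.

```python
def arbitrate(classify_score, sentence, rules, threshold=0.5):
    """Combine the classifier vote and the heuristic-rule vote for one candidate."""
    # Classifier mode: +1 if the score exceeds the threshold, otherwise -1.
    classifier_vote = 1 if classify_score > threshold else -1
    # Rule mode: +1 if some heuristic rule is satisfied, otherwise -1.
    rule_vote = 1 if any(rule(sentence) for rule in rules) else -1
    total = classifier_vote + rule_vote
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    # A sum of 0 means the two modes disagree; defer to manual adjudication.
    return "manual_review"
```

For example, a candidate that the classifier scores highly but that matches no rule is returned as `manual_review`, which corresponds to the case that is marked for the log analysis module.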
As for the log analysis module, as mentioned above it can be used to mine the original log to obtain the heuristic rules of the heuristic rule base; in addition, it is also used to mine the audit results of the relationship auditing module so as to update the heuristic rules. The log analysis module mainly presents classifier scores and error cases visually, and heuristic manual rules are compiled into the rule base according to common error types, so that heuristic rules are mined and accuracy is improved. The above log mining process may use the PrefixSpan algorithm combined with clustering to summarize rules automatically, or the rules may be summarized manually; the present invention places no particular limitation on the method used to implement this process.
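To make the sequence mining step concrete, the following is a minimal sketch of PrefixSpan restricted to sequences of single items (for instance, the tokens of each labeled sentence in the log). It is an assumption-laden illustration: how sentences are tokenized, how frequent patterns are turned into heuristic rules, and the clustering step mentioned above are not shown.

```python
from collections import defaultdict

def prefixspan(sequences, min_support):
    """Return {pattern: support} for all frequent sequential patterns."""
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start offset) pairs, one per sequence
        # that contains the current prefix as a subsequence.
        occurrences = defaultdict(list)
        for seq_id, start in projected:
            seen = set()
            for pos in range(start, len(sequences[seq_id])):
                item = sequences[seq_id][pos]
                if item not in seen:
                    # Project on the first occurrence of the item in the suffix.
                    seen.add(item)
                    occurrences[item].append((seq_id, pos + 1))
        for item, next_projected in occurrences.items():
            if len(next_projected) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(next_projected)
                mine(pattern, next_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results
```

A frequent pattern such as an entity placeholder followed by a trigger word and a second entity placeholder could then be reviewed as a candidate heuristic rule.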
While the log analysis module is started, the feature weight update module may also be triggered after the relationship audit. The feature weight update module is used for updating the weights of the relationship classifier according to the audit results of the relationship auditing module. Illustratively, the feature weight update module recalculates the weights of the existing features according to the labeled sentences and inputs them into the relationship classifier to update its recognition capability. From the candidate relationship sentences that have passed relationship determination, the required features can be obtained effectively and used to update the weights of the classifier; these are fed back to the relationship classifier to realize iterative learning and give it better accuracy. The whole system thereby achieves automated iterative learning and gains stronger recognition capability as the data volume increases.
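One way to picture the weight update is as an incremental recount: each batch of audited, labeled sentences is folded into the feature counts and the weights are recomputed from the new counts. The sketch below is hypothetical; the class name, the use of a smoothed log-likelihood ratio as the "weight", and the smoothing constant are assumptions, not details given by the embodiment.

```python
import math
from collections import Counter

class FeatureWeights:
    """Keeps per-label feature counts and recomputes weights on demand."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.feature_counts = {True: Counter(), False: Counter()}
        self.label_counts = Counter()

    def update(self, audited):
        """Fold audited (features, label) pairs into the running counts."""
        for features, label in audited:
            self.label_counts[label] += 1
            self.feature_counts[label].update(set(features))

    def weight(self, feature):
        """Smoothed log-likelihood ratio log(P(f | positive) / P(f | negative))."""
        a = self.alpha
        p_pos = (self.feature_counts[True][feature] + a) / (self.label_counts[True] + 2 * a)
        p_neg = (self.feature_counts[False][feature] + a) / (self.label_counts[False] + 2 * a)
        return math.log(p_pos / p_neg)
```

Each audit round calls `update` with the newly confirmed sentences, so a feature that keeps appearing in confirmed positive relationships gains weight over successive iterations.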
It should be noted that the update iteration performed by the feature weight update module and the log analysis and mining performed by the log analysis module may be carried out simultaneously as described above, or sequentially: for example, the feature weight update module may perform the update iteration first and the log analysis module may then perform log analysis and mining, or the log analysis module may perform log analysis and mining first and the feature weight update module may then perform the update iteration. The embodiments of the present invention place no particular limitation on this.
A series of relationship result data is finally obtained through the above process. The relationship entity pairs are stored in a relationship database, which serves as the basic knowledge carrier of the knowledge graph for upper-layer interface queries and for knowledge processing and reasoning; at this point the system has completed the construction of the knowledge graph. Illustratively, the extracted relationship triples are finally stored in the relationship database to establish a basic data platform: the Neo4j graph database is selected, the results are automatically stored in the database using the Cypher graph query language, and support for an upper-layer query interface is established. Fig. 9 is a schematic diagram of the data processing flow in the system according to an embodiment of the present invention; the data processing flow executed by the above modules is as shown in Fig. 9. Returning to Fig. 5 and Fig. 6: Fig. 5 is a schematic flow diagram of an example of building a knowledge graph by relation extraction according to an embodiment of the present invention, showing how the knowledge graph construction system obtains the final knowledge graph from plain text; Fig. 6 is an example of a knowledge graph constructed by the knowledge graph construction system provided by an embodiment of the present invention, showing a knowledge graph of company ownership and membership relationships.
It is worth noting that, besides the manner described above, the above modules may also perform the corresponding operations in other ways; the embodiments of the present invention do not limit the specific manner.
Embodiment 3
Fig. 10 is a flow chart of the knowledge graph construction method provided by an embodiment of the present invention. As shown in Fig. 10, the knowledge graph construction method provided by the embodiment includes the following steps:
301, crawl text and perform data cleansing;
302, perform basic annotation work including a subject completion operation;
303, extract candidate relationships including candidate relationship sentences and/or relationship entity pairs;
304, perform feature extraction;
305, perform model training according to the candidate relationship extraction results and the feature extraction results, and construct a relationship classifier;
306, audit and determine the candidate sentence relationships obtained by the relationship classifier, and adjust the relationship classifier accordingly based on the audit result.
Embodiment 4
Fig. 11 is a flow chart of the knowledge graph construction method provided by an embodiment of the present invention. As shown in Fig. 11, the knowledge graph construction method provided by the embodiment includes the following steps:
401, crawl text and perform data cleansing.
402, perform basic annotation work comprising word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, and the subject completion operation.
Specifically, the subject completion operation includes:
judging whether the sentence contains a subject;
if so, judging whether the subject is a referring pronoun; if so, judging whether the preceding sentence contains a subject; if so, judging whether that subject is an entity word; if so, completing the subject of the sentence according to that subject;
if not, judging whether the preceding sentence contains a subject; if so, judging whether that subject is an entity word; if so, completing the subject of the sentence according to that subject.
403, extract candidate relationships including candidate relationship sentences and/or relationship entity pairs.
404, extract word embedding features based on a neural network language model, lexical-level features based on the co-occurrence order of words, and/or syntactic features based on syntactic structure.
405, perform model training according to the candidate relationship extraction results and the feature extraction results, and construct a relationship classifier.
406, mine the original log to obtain the heuristic rules.
407, audit and determine relationships by combining the candidate sentence relationships obtained by the relationship classifier with the heuristic rules, and adjust the relationship classifier accordingly based on the audit result.
Specifically, the candidate relationship audit and determination is performed by a voting mechanism and/or manual adjudication.
408, mine the audit results of the relationship auditing module and update the heuristic rules.
409, update the weights of the relationship classifier according to the audit results of the relationship auditing module.
It is worth noting that, besides the manner described in the above steps, steps 401 to 409 may also be implemented in other ways; the embodiments of the present invention do not limit the specific manner.
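The subject completion operation of step 402 reduces to a short decision procedure, sketched here under stated assumptions: the subject of each sentence is taken as already extracted by the dependency parse (`None` when absent), `PRONOUNS` is a hypothetical English stand-in for the set of referring pronouns, and `is_entity` is a hypothetical predicate backed by named entity recognition.

```python
from typing import Callable, Optional

# Hypothetical pronoun list; a real system would use its own language resources.
PRONOUNS = {"it", "he", "she", "they"}

def complete_subject(
    subject: Optional[str],
    prev_subject: Optional[str],
    is_entity: Callable[[str], bool],
) -> Optional[str]:
    """Return the subject to use for the current sentence."""
    # Completion is needed when the subject is missing or is a referring pronoun.
    needs_completion = subject is None or subject.lower() in PRONOUNS
    # Borrow the preceding sentence's subject only if it exists and is an entity word.
    if needs_completion and prev_subject is not None and is_entity(prev_subject):
        return prev_subject
    return subject
```

Both branches of the procedure (missing subject, pronoun subject) end in the same two checks on the preceding sentence, which is why they collapse into a single condition here.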
All of the above optional technical solutions may be combined in any manner to form optional embodiments of the present invention, which are not described one by one here.
It should be noted that when the knowledge graph construction system provided by the above embodiments triggers a knowledge graph construction service, the division into the above functional modules is only an example. In practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the system may be divided into different functional modules to complete all or part of the functions described above. In addition, the knowledge graph construction method and the knowledge graph construction system provided by the above embodiments belong to the same concept; for the specific implementation process, see the system embodiments, which is not repeated here.
Compared with the prior art, the knowledge graph construction system and method provided by the embodiments of the present invention have the following beneficial effects:
1. Because a subject completion operation is included in the basic annotation work and is combined with crawling, the other basic annotation operations, candidate relationship extraction, feature extraction, statistical machine learning training, relationship auditing and other operations, the knowledge graph construction system and method have stronger relation extraction capability, enabling automated and convenient deployment for extracting relationships from unstructured plain text to build a knowledge graph;
2. The annotation approach combining a heuristic rule base with statistical machine learning avoids large-scale corpus annotation while still ensuring relatively high precision and recall;
3. Log analysis and weight updating give the system a continuous iterative learning capability, so that it gains better relation extraction capability as the data volume increases.
In general, the knowledge graph construction system and method provided by the embodiments of the present invention combine subject completion with statistical machine learning using a relationship classifier, and iterate continuously to update and optimize parameters, thereby achieving stronger relation extraction capability, reducing the cost of manual participation, and improving the efficiency of building a knowledge graph. Precisely because of its stronger relation extraction capability and processing efficiency, this knowledge graph construction scheme is particularly suitable for building knowledge graphs from unstructured plain text and has good application prospects in fields involving knowledge graphs. It should be noted that the above embodiments focus on, and provide a practical reference for, the construction of company graphs in the financial field; in theory, however, the scheme provided by the embodiments of the present invention is applicable to the construction of a knowledge graph in any domain, and also offers a relatively new reference for the construction of general-purpose knowledge graphs.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The embodiments of the present application are described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present application.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the present invention; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (16)

1. A knowledge graph construction system, characterized by comprising:
a crawler module, for crawling text and performing data cleansing;
a basic annotation module, for performing basic annotation work including a subject completion operation;
a candidate relationship extraction module, for extracting candidate relationships including candidate relationship sentences and/or relationship entity pairs;
a feature extraction module, for performing feature extraction;
a relationship classifier training module, for performing model training according to the candidate relationship extraction results and the feature extraction results, and constructing a relationship classifier;
a relationship auditing module, for auditing and determining the candidate sentence relationships obtained by the relationship classifier, and adjusting the relationship classifier accordingly based on the audit result.
2. The system according to claim 1, characterized in that the system further comprises:
a heuristic rule base, for setting heuristic rules for relationship extraction;
the relationship auditing module being used for auditing and determining relationships by combining the candidate sentence relationships obtained by the relationship classifier with the heuristic rules, and adjusting the relationship classifier accordingly based on the audit result.
3. The system according to claim 2, characterized in that the system further comprises:
a log analysis module, for mining the original log to obtain the heuristic rules; and/or mining the audit results of the relationship auditing module to update the heuristic rules.
4. The system according to any one of claims 1 to 3, characterized in that the system further comprises:
a feature weight update module, for updating the weights of the relationship classifier according to the audit results of the relationship auditing module.
5. The system according to claim 1, characterized in that the basic annotation module is used for performing basic annotation work comprising word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, and the subject completion operation.
6. The system according to claim 1, characterized in that the feature extraction module is used for extracting word embedding features based on a neural network language model, lexical-level features based on the co-occurrence order of words, and/or syntactic features based on syntactic structure.
7. The system according to any one of claims 1, 2, 3, 5 and 6, characterized in that the subject completion operation comprises:
judging whether the sentence contains a subject;
if so, judging whether the subject is a referring pronoun; if so, judging whether the preceding sentence contains a subject; if so, judging whether that subject is an entity word; if so, completing the subject of the sentence according to that subject;
if not, judging whether the preceding sentence contains a subject; if so, judging whether that subject is an entity word; if so, completing the subject of the sentence according to that subject.
8. The system according to any one of claims 1, 2, 3, 5 and 6, characterized in that
the relationship auditing module performs the candidate relationship audit and determination by a voting mechanism and/or manual adjudication.
9. A knowledge graph construction method, characterized by comprising:
crawling text and performing data cleansing;
performing basic annotation work including a subject completion operation;
extracting candidate relationships including candidate relationship sentences and/or relationship entity pairs;
performing feature extraction;
performing model training according to the candidate relationship extraction results and the feature extraction results, and constructing a relationship classifier;
auditing and determining the candidate sentence relationships obtained by the relationship classifier, and adjusting the relationship classifier accordingly based on the audit result.
10. The method according to claim 9, characterized in that the method further comprises:
setting heuristic rules for relationship extraction;
wherein auditing and determining the candidate sentence relationships obtained by the relationship classifier, and adjusting the relationship classifier accordingly based on the audit result, comprises:
auditing and determining relationships by combining the candidate sentence relationships obtained by the relationship classifier with the heuristic rules, and adjusting the relationship classifier accordingly based on the audit result.
11. The method according to claim 10, characterized in that the method further comprises:
mining the original log to obtain the heuristic rules; and/or
mining the audit results of the relationship auditing module to update the heuristic rules.
12. The method according to any one of claims 9 to 11, characterized in that the method further comprises:
updating the weights of the relationship classifier according to the audit results of the relationship auditing module.
13. The method according to claim 9, characterized in that performing the basic annotation work including the subject completion operation comprises:
performing basic annotation work comprising word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, and the subject completion operation.
14. The method according to claim 9, characterized in that performing feature extraction comprises:
extracting word embedding features based on a neural network language model, lexical-level features based on the co-occurrence order of words, and/or syntactic features based on syntactic structure.
15. The method according to any one of claims 9, 10, 11, 13 and 14, characterized in that the subject completion operation comprises:
judging whether the sentence contains a subject;
if so, judging whether the subject is a referring pronoun; if so, judging whether the preceding sentence contains a subject; if so, judging whether that subject is an entity word; if so, completing the subject of the sentence according to that subject;
if not, judging whether the preceding sentence contains a subject; if so, judging whether that subject is an entity word; if so, completing the subject of the sentence according to that subject.
16. The method according to any one of claims 9, 10, 11, 13 and 14, characterized in that
the candidate relationship audit and determination is performed by a voting mechanism and/or manual adjudication.
CN201810415531.5A 2018-05-03 2018-05-03 Knowledge graph construction system and method Active CN108874878B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810415531.5A CN108874878B (en) 2018-05-03 2018-05-03 Knowledge graph construction system and method

Publications (2)

Publication Number Publication Date
CN108874878A true CN108874878A (en) 2018-11-23
CN108874878B CN108874878B (en) 2021-02-26

Family

ID=64327555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810415531.5A Active CN108874878B (en) 2018-05-03 2018-05-03 Knowledge graph construction system and method

Country Status (1)

Country Link
CN (1) CN108874878B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020156788A1 (en) * 2001-04-20 2002-10-24 Jia-Sheng Heh Method of constructing, editing, indexing, and matching up with information on the interner for a knowledge map
CN103678281A (en) * 2013-12-31 2014-03-26 北京百度网讯科技有限公司 Method and device for automatically labeling text
CN104809176A (en) * 2015-04-13 2015-07-29 中央民族大学 Entity relationship extraction method for the Tibetan language
CN106503254A (en) * 2016-11-11 2017-03-15 上海智臻智能网络科技股份有限公司 Corpus classification method, device and terminal
CN106815293A (en) * 2016-12-08 2017-06-09 中国电子科技集团公司第三十二研究所 System and method for constructing knowledge graph for information analysis
CN107463607A (en) * 2017-06-23 2017-12-12 昆明理工大学 Method for acquiring and organizing domain entity hypernym-hyponym relations by combining word vectors and bootstrapping learning

Cited By (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740149A (en) * 2018-12-11 2019-05-10 英大传媒投资集团有限公司 Synonym extraction method based on remote supervision
CN109740149B (en) * 2018-12-11 2019-12-13 英大传媒投资集团有限公司 remote supervision-based synonym extraction method
CN109670051A (en) * 2018-12-14 2019-04-23 北京百度网讯科技有限公司 Knowledge graph mining method, device, equipment and storage medium
CN109740026A (en) * 2019-01-11 2019-05-10 深圳市中电数通智慧安全科技股份有限公司 Smart city edge calculations platform and its management method, server and storage medium
CN109918475A (en) * 2019-01-24 2019-06-21 西安交通大学 Visual query method and query system based on medical knowledge graph
CN109918475B (en) * 2019-01-24 2021-01-19 西安交通大学 Visual query method and system based on medical knowledge graph
CN110134845A (en) * 2019-04-04 2019-08-16 平安科技(深圳)有限公司 Project public sentiment monitoring method, device, computer equipment and storage medium
CN110113314A (en) * 2019-04-12 2019-08-09 中国人民解放军战略支援部队信息工程大学 Network security domain knowledge graph construction method and device for dynamic threat analysis
CN110113314B (en) * 2019-04-12 2021-05-14 中国人民解放军战略支援部队信息工程大学 Network security domain knowledge graph construction method and device for dynamic threat analysis
CN110032649A (en) * 2019-04-12 2019-07-19 北京科技大学 Method and device for extracting relations between entities in TCM documents
CN111858867A (en) * 2019-04-30 2020-10-30 广东小天才科技有限公司 Incomplete corpus completion method and device
CN110197280A (en) * 2019-05-20 2019-09-03 中国银行股份有限公司 Knowledge graph construction method, device and system
CN110197280B (en) * 2019-05-20 2021-08-06 中国银行股份有限公司 Knowledge graph construction method, device and system
CN110347894A (en) * 2019-05-31 2019-10-18 平安科技(深圳)有限公司 Crawler-based knowledge graph processing method, device, computer equipment and storage medium
CN110516252B (en) * 2019-08-30 2022-12-09 京东方科技集团股份有限公司 Data annotation method and device, computer equipment and storage medium
CN110516252A (en) * 2019-08-30 2019-11-29 京东方科技集团股份有限公司 Data annotation method, device, computer equipment and storage medium
CN110569366A (en) * 2019-09-09 2019-12-13 腾讯科技(深圳)有限公司 text entity relation extraction method and device and storage medium
CN110569366B (en) * 2019-09-09 2023-05-23 腾讯科技(深圳)有限公司 Text entity relation extraction method, device and storage medium
CN110750994A (en) * 2019-10-23 2020-02-04 北京字节跳动网络技术有限公司 Entity relationship extraction method and device, electronic equipment and storage medium
CN110866125A (en) * 2019-11-14 2020-03-06 北京京航计算通讯研究所 Knowledge graph construction system based on bert algorithm model
CN111221976A (en) * 2019-11-14 2020-06-02 北京京航计算通讯研究所 Knowledge graph construction method based on bert algorithm model
CN111046190A (en) * 2019-11-28 2020-04-21 佰聆数据股份有限公司 Semantic graph-based big data label conflict detection method and system, storage medium and computer equipment
CN111046190B (en) * 2019-11-28 2021-03-26 佰聆数据股份有限公司 Semantic graph-based big data label conflict detection method and system, storage medium and computer equipment
CN112989032A (en) * 2019-12-17 2021-06-18 医渡云(北京)技术有限公司 Entity relationship classification method, apparatus, medium and electronic device
CN111177315B (en) * 2019-12-19 2023-04-28 北京明略软件系统有限公司 Knowledge graph updating method and device and computer readable storage medium
CN111177315A (en) * 2019-12-19 2020-05-19 北京明略软件系统有限公司 Knowledge graph updating method and device and computer readable storage medium
CN111177411A (en) * 2019-12-27 2020-05-19 赣州市智能产业创新研究院 Knowledge graph construction method based on NLP
CN111160536B (en) * 2020-01-02 2022-06-21 福州大学 Convolution embedding representation inference method based on fragmentation knowledge
CN111160536A (en) * 2020-01-02 2020-05-15 福州大学 Convolution embedding representation reasoning method based on fragmentation knowledge
CN111199802A (en) * 2020-01-10 2020-05-26 北京百度网讯科技有限公司 Electronic medical record data mining method, device, equipment and medium
CN111400395A (en) * 2020-02-17 2020-07-10 浙江大学 Knowledge graph crowdsourcing platform based on distributed ledger
CN111400395B (en) * 2020-02-17 2023-06-13 浙江大学 Knowledge graph crowdsourcing platform based on distributed ledger
CN113468335A (en) * 2020-03-30 2021-10-01 海信集团有限公司 Method and equipment for extracting entity implicit relationship
WO2021139257A1 (en) * 2020-06-24 2021-07-15 平安科技(深圳)有限公司 Method and apparatus for selecting annotated data, and computer device and storage medium
CN111651613A (en) * 2020-07-08 2020-09-11 海南大学 Knowledge graph embedding-based dynamic recommendation method and system
CN113672599B (en) * 2020-09-30 2023-05-23 华斌 Visual auxiliary decision-making method for government affair informatization project construction management
CN113672599A (en) * 2020-09-30 2021-11-19 华斌 Visual aid decision-making method for realizing government affair informatization project construction management by creating domain knowledge graph
CN112487197A (en) * 2020-11-06 2021-03-12 中科云谷科技有限公司 Method and device for constructing knowledge graph based on conference record and processor
US11861311B2 (en) 2020-12-09 2024-01-02 Beijing Wodong Tianjun Information Technology Co., Ltd. System and method for knowledge graph construction using capsule neural network
WO2022121651A1 (en) * 2020-12-09 2022-06-16 Beijing Wodong Tianjun Information Technology Co., Ltd. System and method for knowledge graph construction using capsule neural network
CN112270196A (en) * 2020-12-14 2021-01-26 完美世界(北京)软件科技发展有限公司 Entity relationship identification method and device and electronic equipment
WO2022237013A1 (en) * 2021-05-11 2022-11-17 西安交通大学 Entity relationship joint extraction-based legal knowledge graph construction method and device
CN113051374B (en) * 2021-06-02 2021-08-31 北京沃丰时代数据科技有限公司 Text matching optimization method and device
CN113051374A (en) * 2021-06-02 2021-06-29 北京沃丰时代数据科技有限公司 Text matching optimization method and device
CN113535981A (en) * 2021-07-21 2021-10-22 深圳证券信息有限公司 Method, system, electronic device and storage medium for analyzing announcement content
CN114138979B (en) * 2021-10-29 2022-09-16 中南民族大学 Cultural relic safety knowledge map creation method based on word expansion unsupervised text classification
CN114138979A (en) * 2021-10-29 2022-03-04 中南民族大学 Cultural relic safety knowledge map creation method based on word expansion unsupervised text classification
CN116303625A (en) * 2023-05-17 2023-06-23 之江实验室 Data query method and device, storage medium and electronic equipment
CN116303625B (en) * 2023-05-17 2023-07-21 之江实验室 Data query method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN108874878B (en) 2021-02-26

Similar Documents

Publication Publication Date Title
CN108874878A (en) A kind of building system and method for knowledge mapping
CN110825881B (en) Method for establishing electric power knowledge graph
US10496749B2 (en) Unified semantics-focused language processing and zero base knowledge building system
CN111339313A (en) Knowledge base construction method based on multi-mode fusion
CN110298032A (en) Text classification corpus labeling training system
CN109902159A (en) Intelligent O&M statement similarity matching method based on natural language processing
CN108388651A (en) Text classification method based on graph kernels and convolutional neural networks
US11113470B2 (en) Preserving and processing ambiguity in natural language
CN106776797A (en) Knowledge question-answering system based on ontology inference and its working method
WO2020010834A1 (en) Faq question and answer library generalization method, apparatus, and device
CN112559766B (en) Legal knowledge map construction system
CN112507699A (en) Remote supervision relation extraction method based on graph convolution network
US20170169355A1 (en) Ground Truth Improvement Via Machine Learned Similar Passage Detection
CN108665141B (en) Method for automatically extracting emergency response process model from emergency plan
CN112597316A (en) Interpretable reasoning question-answering method and device
CN113196277A (en) System for retrieving natural language documents
JP2018005690A (en) Information processing apparatus and program
JPH0816620A (en) Data sorting device/method, data sorting tree generation device/method, derivative extraction device/method, thesaurus construction device/method, and data processing system
CN114610846A (en) Knowledge graph expanding and complementing method for heuristic bionic knowledge grafting strategy
CN109522396A (en) Knowledge processing method and system for the defense science and technology field
CN113742396A (en) Mining method and device for object learning behavior pattern
CN112883182A (en) Question-answer matching method and device based on machine reading
CN116049376B (en) Method, device and system for retrieving and replying information and creating knowledge
CN111651528A (en) Open entity relation extraction method based on generative adversarial network
CN116561264A (en) Knowledge graph-based intelligent question-answering system construction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240306

Address after: Room 1179, W Zone, 11th Floor, Building 1, No. 158 Shuanglian Road, Qingpu District, Shanghai, 201702

Patentee after: Shanghai Zhongan Information Technology Service Co.,Ltd.

Country or region after: China

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee before: ZHONGAN INFORMATION TECHNOLOGY SERVICE Co.,Ltd.

Country or region before: China

TR01 Transfer of patent right

Effective date of registration: 20240415

Address after: Room 1179, W Zone, 11th Floor, Building 1, No. 158 Shuanglian Road, Qingpu District, Shanghai, 201702

Patentee after: Shanghai Zhongan Information Technology Service Co.,Ltd.

Country or region after: China

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee before: ZHONGAN INFORMATION TECHNOLOGY SERVICE Co.,Ltd.

Country or region before: China