CN110245131A - Entity alignment method and system for a knowledge graph, and storage medium thereof - Google Patents
- Publication number
- CN110245131A (application CN201910485558.6A)
- Authority
- CN
- China
- Prior art keywords
- entity
- alignment
- vector
- instance
- alignment model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/288—Entity relationship models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/06—Electricity, gas or water supply
Abstract
The invention discloses an entity alignment method and system for a knowledge graph, and a storage medium. The method includes: Step 1: a first entity alignment model is trained on first training data and a second entity alignment model is trained on second training data; in each training iteration, the trusted entity pairs obtained by the first entity alignment model are used to update the second training data, and the trusted entity pairs obtained by the second entity alignment model are used to update the first training data; iteration stops when the number of iterations reaches the set maximum, or when the output of the first entity alignment model and the output of the second entity alignment model are within the set threshold range, yielding the final entity alignment model. Step 2: the knowledge graph to be aligned is input into the final entity alignment model obtained in Step 1 to obtain the entity alignment result.
Description
Technical field
The invention belongs to the field of intelligent applications of electric power big data, and in particular relates to an entity alignment method and system for a knowledge graph, and a storage medium.
Background art
With the continuous development of big data technology, large volumes of data that have not yet been used effectively have accumulated, and the value of these data is drawing increasing attention from enterprises and academia. To converge data in a unified way and share it across applications, a knowledge graph of the data must be built, establishing a semantic network over the data that provides users with unified data services interoperable at the semantic level. However, because the data come from different systems, each system tends to describe the same object by its own rules, so the entities and relations extracted from different systems are heavily heterogeneous and redundant. Using entity alignment to clean up and merge the entities that refer to the same object, and thereby resolve the multiplicity of entities in the knowledge graph, is a key step in building a high-quality data knowledge graph.
Entity alignment aims to find entities in different datasets that refer to the same object and, by building coreference links such as owl:sameAs, to link these entities to a single object with a unified, globally unique identifier, achieving high-quality links between data sources and facilitating knowledge graph construction. Entity alignment methods fall into two main categories: methods based on attribute similarity, and methods that infer alignment relations between entities through knowledge representation learning. Inference based on attribute similarity judges mainly by whether the entities to be aligned have the same set of attributes and the same corresponding attribute values. Inference based on representation learning uses a modeling method to map the entities and relations of a knowledge graph into a low-dimensional dense vector space, and then performs computation and reasoning there.
However, when correspondences between entities in different datasets must be found, directly applying a knowledge representation model or attribute-similarity inference rarely achieves satisfactory results. Moreover, the methods in current use require large amounts of labeled entity alignment data, which in practice demands the involvement of many power-business experts and is difficult to achieve.
Summary of the invention
To solve the problems in the prior art, the present invention proposes an entity alignment method for a knowledge graph that fuses the entity alignment results of knowledge representation learning and of attribute-similarity inference, so that the two kinds of results complement each other and a better entity alignment effect on the data is achieved.
The technical solution adopted by the invention is an entity alignment method for a knowledge graph, comprising the following steps:
Step 1: train a first entity alignment model on first training data and a second entity alignment model on second training data. In each training iteration, the trusted entity pairs obtained by the first entity alignment model are used to update the second training data, and the trusted entity pairs obtained by the second entity alignment model are used to update the first training data. Iteration stops when the number of iterations reaches the set maximum, or when the output of the first entity alignment model and the output of the second entity alignment model are within the set threshold range, yielding the final entity alignment model;
Step 2: input the knowledge graph to be aligned into the final entity alignment model obtained in Step 1 to obtain the entity alignment result.
Further, when the method is used for entity alignment in a power grid knowledge graph, the first training data and the second training data are power-domain terminology data.
Further, the first training data is training data under the semantic feature view and includes a first aligned dataset and a first unaligned dataset; the second training data is training data under the attribute structure feature view and includes a second aligned dataset and a second unaligned dataset;
The first entity alignment model is a representation-learning entity alignment model; the second entity alignment model is an entity alignment model based on attribute similarity matching.
Further, Step 1 specifically includes:
The first entity alignment model is obtained by training on the first aligned dataset; the first unaligned dataset is predicted with the first entity alignment model to obtain the trusted entity pairs L'_se, which are put into the second aligned dataset, updating the second aligned dataset;
The second entity alignment model is obtained by training on the second aligned dataset; the second unaligned dataset is predicted with the second entity alignment model to obtain the trusted entity pairs L'_st, which are put into the first aligned dataset, updating the first aligned dataset.
Further, before Step 1 is carried out, the method also includes building the first entity alignment model, specifically:
The entities and relations in the knowledge graph are mapped into a vector space to obtain the mapping vector of each entity and of each relation, giving the vectors h, t and r of the head entity, the tail entity and the relation between them in each triple;
A loss function is constructed according to formula (1); iteration stops when the number of iterations reaches the set maximum or the loss value no longer changes, yielding the first entity alignment model:
[Formula (1) is not reproduced in this text.]
where (h, r, t) ∈ Δ denotes the set of all triples that actually exist once the knowledge graph is converted to triple form; (h', r', t') ∈ Δ' denotes the set of triples that do not exist in the knowledge graph, generated from positive triples by replacing the head entity or the tail entity; the loss learns the alignment relation between entities; h denotes the head entity vector, t the tail entity vector and r the relation vector; γ > 0 is the margin separating positive and negative entity pairs; the positive term expresses the alignment relation between the head entity vector h and the tail entity vector t of a positive triple, and the negative term expresses the alignment relation between the head entity vector h' and the tail entity vector t' of a negative triple;
The head entity vectors, tail entity vectors and relation vectors are updated iteratively according to formula (2):
[Formula (2) is not reproduced in this text.]
where dim is the dimension of the vector space, h_i is the i-th component of the head entity vector h, and μ is the learning rate.
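As a hedged reconstruction only: the definitions above (positive set Δ, negative set Δ', margin γ, learning rate μ, component index i) are consistent with the standard translation-based margin ranking loss and its gradient update. The squared L2 distance and the exact gradient form below are assumptions, not taken from the source.

```latex
% Assumed form of formula (1): margin ranking loss over positive/negative triples
L = \sum_{(h,r,t)\in\Delta}\ \sum_{(h',r',t')\in\Delta'}
    \max\bigl(0,\ \gamma + \lVert h + r - t \rVert_2^2 - \lVert h' + r' - t' \rVert_2^2\bigr)

% Assumed form of formula (2): per-component gradient step on a positive triple
h_i \leftarrow h_i - \mu \cdot 2\,(h_i + r_i - t_i), \quad
t_i \leftarrow t_i + \mu \cdot 2\,(h_i + r_i - t_i), \quad
r_i \leftarrow r_i - \mu \cdot 2\,(h_i + r_i - t_i), \qquad i = 1,\dots,\mathit{dim}
```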
Further, before Step 1 is carried out, the method also includes building the second entity alignment model, specifically:
A scoring function is constructed according to formula (6); the entity vectors and relation vectors of all candidate entity-relation pairs are scored with it, and the candidate entity of the highest-scoring pair is taken as the alignment target entity;
f_predict(h, r, t*) = (1 + w × Sim(h, t*)) ||h - t*||   (6)
where t* denotes the given entity and h denotes a candidate entity that may have a cross-network entity alignment relation with t*; all candidate entity-relation pairs (h, r, t*) are scored with the scoring function, and the highest-scoring candidate entity h is the alignment target entity; ||h - t*|| measures the semantic similarity based on representation learning, Sim(h, t*) denotes the attribute similarity, and w is the penalty strength;
Sim(h, t*) = average(sim(p_i))
where p_i is an attribute in the set of attributes shared by entity h and entity t*:
up(h, t*) = property1 ∩ property2   (3)
where property1 is the attribute set of entity h and property2 is the attribute set of entity t*;
The similarity sim(p_i) of an attribute p_i shared by entity h and entity t* is:
[The formula for sim(p_i) is not reproduced in this text.]
where p_i corresponds to the x-th attribute p1x of entity h, with attribute value v1x, and to the y-th attribute p2y of entity t*, with attribute value v2y; lcs(v1x, v2y) is the longest common subsequence of the two attribute values.
The invention also discloses an entity alignment system for a knowledge graph, comprising:
A co-training unit, configured to co-train the first entity alignment model and the second entity alignment model to obtain the final entity alignment model;
A reading unit, configured to obtain the knowledge graph to be aligned and input it into the final entity alignment model to obtain the entity alignment result.
Further, the co-training unit specifically: trains the first entity alignment model on the first training data and the second entity alignment model on the second training data; in each iteration, the trusted entity pairs obtained by the first entity alignment model are used to update the second training data, and the trusted entity pairs obtained by the second entity alignment model are used to update the first training data; iteration stops when the set maximum number of iterations is reached, or when the output of the first entity alignment model and the output of the second entity alignment model are within the set threshold range, yielding the final entity alignment model.
Further, when the system is used for entity alignment in a power grid knowledge graph, the first training data and the second training data are power-domain terminology data.
Further, the first training data is training data under the semantic feature view and includes a first aligned dataset and a first unaligned dataset; the second training data is training data under the attribute structure feature view and includes a second aligned dataset and a second unaligned dataset; the first entity alignment model is a representation-learning entity alignment model; the second entity alignment model is an entity alignment model based on attribute similarity matching.
Further, the first entity alignment model is obtained by training on the first aligned dataset; the first unaligned dataset is predicted with the first entity alignment model to obtain the trusted entity pairs L'_se, which are put into the second aligned dataset, updating the second aligned dataset;
The second entity alignment model is obtained by training on the second aligned dataset; the second unaligned dataset is predicted with the second entity alignment model to obtain the trusted entity pairs L'_st, which are put into the first aligned dataset, updating the first aligned dataset.
Further, the first entity alignment model is built as follows: the entities and relations in the knowledge graph are mapped into a vector space to obtain the mapping vector of each entity and of each relation, giving the vectors h, t and r of the head entity, the tail entity and the relation between them in each triple;
A loss function is constructed according to formula (1); iteration stops when the number of iterations reaches the set maximum or the loss value no longer changes, yielding the first entity alignment model:
[Formula (1) is not reproduced in this text.]
where (h, r, t) ∈ Δ denotes the set of all triples that actually exist once the knowledge graph is converted to triple form; (h', r', t') ∈ Δ' denotes the set of triples that do not exist in the knowledge graph, generated from positive triples by replacing the head entity or the tail entity; the loss learns the alignment relation between entities; h denotes the head entity vector, t the tail entity vector and r the relation vector; γ > 0 is the margin separating positive and negative entity pairs; the positive term expresses the alignment relation between the head entity vector h and the tail entity vector t of a positive triple, and the negative term expresses the alignment relation between the head entity vector h' and the tail entity vector t' of a negative triple;
The head entity vectors, tail entity vectors and relation vectors are updated iteratively according to formula (2):
[Formula (2) is not reproduced in this text.]
where dim is the dimension of the vector space, h_i is the i-th component of the head entity vector h, and μ is the learning rate.
Further, the second entity alignment model is built as follows: a scoring function is constructed according to formula (6); the entity vectors and relation vectors of all candidate entity-relation pairs are scored with it, and the candidate entity of the highest-scoring pair is taken as the alignment target entity;
f_predict(h, r, t*) = (1 + w × Sim(h, t*)) ||h - t*||   (6)
where t* denotes the given entity and h denotes a candidate entity that may have a cross-network entity alignment relation with t*; all candidate entity-relation pairs (h, r, t*) are scored with the scoring function, and the highest-scoring candidate entity h is the alignment target entity; ||h - t*|| measures the semantic similarity based on representation learning, Sim(h, t*) denotes the attribute similarity, and w is the penalty strength;
Sim(h, t*) = average(sim(p_i))
where p_i is an attribute in the set of attributes shared by entity h and entity t*:
up(h, t*) = property1 ∩ property2   (3)
where property1 is the attribute set of entity h and property2 is the attribute set of entity t*;
The similarity sim(p_i) of an attribute p_i shared by entity h and entity t* is:
[The formula for sim(p_i) is not reproduced in this text.]
where p_i corresponds to the x-th attribute p1x of entity h, with attribute value v1x, and to the y-th attribute p2y of entity t*, with attribute value v2y; lcs(v1x, v2y) is the longest common subsequence of the two attribute values.
The invention also discloses an entity alignment system for a knowledge graph, the system comprising a network interface, a memory and a processor; wherein,
The network interface is configured to receive and send signals while exchanging information with other external network elements;
The memory is configured to store computer program instructions that can run on the processor;
The processor is configured to execute the steps of the above entity alignment method for a knowledge graph when running the computer program instructions.
The invention also discloses a computer storage medium that stores a program of the entity alignment method for a knowledge graph; when the program is executed by at least one processor, the steps of the above entity alignment method for a knowledge graph are implemented.
Beneficial effects: the entity alignment method used by the invention, which combines semantic information with structure features, achieves good results on the entity alignment task for power grid data, and improves across the board on methods that infer entity alignment from semantic information alone or from attribute structure features alone.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the invention clearer, the invention is further described below with reference to specific embodiments.
Embodiment 1:
The entity alignment method used during knowledge graph construction in this embodiment includes the following steps:
Step 1: build the representation-learning entity alignment model:
Entity alignment using representation learning has two main steps:
First, the knowledge graphs to be aligned, KG1 and KG2, are mapped into a low-dimensional vector space to obtain their knowledge representations; then, based on a manually labeled entity alignment dataset N, the alignment relation between entities is learned. Let (e1, e2) denote any entity pair in N, i.e. (e1, e2) ∈ N, where e1 ∈ E1 and e2 ∈ E2. The alignment relation between entities of the two knowledge graphs is treated as a special relation r* = SameAs, so an entity alignment forms the triple (e1, SameAs, e2). Before training starts, the vector representations of the entities and relations of KG1 and KG2 are initialized from a uniform distribution.
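The uniform initialization mentioned above can be sketched as follows. This is a minimal illustration: the function name `init_embeddings`, the entity and relation names, and the bound 6/sqrt(dim) (a common choice for translation-based models) are all assumptions, since the source only says that a uniform distribution is used.

```python
import math
import random


def init_embeddings(names, dim, seed=0):
    """Return {name: vector} with components drawn uniformly from [-b, b]."""
    rng = random.Random(seed)
    bound = 6.0 / math.sqrt(dim)  # assumed bound, not specified by the source
    return {n: [rng.uniform(-bound, bound) for _ in range(dim)] for n in names}


# Hypothetical entities from KG1 and KG2, and hypothetical relations.
entities = init_embeddings(["kg1:transformer_01", "kg2:transformer_A"], dim=8)
relations = init_embeddings(["SameAs", "locatedIn"], dim=8, seed=1)
```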
The loss function of the vector space representation method is defined as:
[Formula (1) is not reproduced in this text.]
where (h, r, t) ∈ Δ is the set of positive triples, that is, all triples that actually exist once the knowledge graph is converted to triple form, and (h', r', t') ∈ Δ' is the set of negative triples, that is, triples that do not exist in the knowledge graph, generated from positive triples by replacing the head entity or the tail entity. In particular, for triples of the entity alignment relation SameAs, the entity substituted in when a negative triple is constructed should be an entity of the same type: either the head entity is randomly replaced with a same-type entity from the first data source, or the tail entity is replaced with a same-type entity from the second data source. Following the idea of translation models, the score function regards the tail entity t as the result of translating the head entity h by r, and measures the degree of match between the two entities. γ > 0 is the margin separating positive and negative entity pairs.
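The type-constrained negative sampling for SameAs triples described above can be sketched as follows. The function name `corrupt_sameas`, the data layout (entities grouped by type per data source), and the 50/50 choice between head and tail replacement are illustrative assumptions.

```python
import random


def corrupt_sameas(triple, kg1_by_type, kg2_by_type, type_of, known, rng):
    """Build one negative SameAs triple by swapping the head or the tail.

    The replacement has the same type as the entity it replaces and comes
    from the same data source (head: first source, tail: second source).
    Returns None when no valid replacement exists.
    """
    head, rel, tail = triple
    if rng.random() < 0.5:
        candidates = [(e, rel, tail) for e in kg1_by_type[type_of[head]] if e != head]
    else:
        candidates = [(head, rel, e) for e in kg2_by_type[type_of[tail]] if e != tail]
    # A negative triple must not be a triple that actually exists.
    candidates = [c for c in candidates if c not in known]
    return rng.choice(candidates) if candidates else None


# Hypothetical toy data: two transformers per data source.
type_of = {"a1": "transformer", "a2": "transformer",
           "b1": "transformer", "b2": "transformer"}
kg1_by_type = {"transformer": ["a1", "a2"]}
kg2_by_type = {"transformer": ["b1", "b2"]}
known = {("a1", "SameAs", "b1"), ("a2", "SameAs", "b2")}
negative = corrupt_sameas(("a1", "SameAs", "b1"), kg1_by_type, kg2_by_type,
                          type_of, known, random.Random(0))
```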
While the loss function is optimized, the relation vector of SameAs is constrained to be the zero vector, and iteration stops when the set maximum number of iterations is reached or the loss value no longer changes. In each iteration, the head entity vectors, tail entity vectors and relation vectors of the entity relations are updated; gradient descent can be used for the update, as follows:
[Formula (2) is not reproduced in this text.]
where dim is the dimension of the vector space, h_i is the i-th component of h, and μ is the learning rate.
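One training step with the SameAs vector held at zero can be sketched as follows. The squared L2 translation score and the hinge form are assumptions in the style of standard translation-based models, because formulas (1) and (2) are not available in this text; all function names are hypothetical.

```python
def sq_dist(h, r, t):
    """Squared L2 norm of h + r - t (translation-based score; an assumption)."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))


def margin_loss(pos, neg, gamma):
    """Hinge loss for one (positive, negative) triple pair."""
    return max(0.0, gamma + sq_dist(*pos) - sq_dist(*neg))


def sgd_step(pos, neg, gamma, mu, freeze_relation=False):
    """Update the vectors in place when the margin is violated."""
    if margin_loss(pos, neg, gamma) <= 0.0:
        return
    (h, r, t), (h2, r2, t2) = pos, neg
    for i in range(len(h)):
        # Both gradients are computed before any update, because the negative
        # triple usually shares vectors with the positive one.
        g_pos = 2.0 * (h[i] + r[i] - t[i])
        g_neg = 2.0 * (h2[i] + r2[i] - t2[i])
        h[i] -= mu * g_pos
        t[i] += mu * g_pos
        h2[i] += mu * g_neg
        t2[i] -= mu * g_neg
        if not freeze_relation:  # SameAs stays the zero vector when frozen
            r[i] -= mu * g_pos
            r2[i] += mu * g_neg


same_as = [0.0, 0.0]  # constrained to the zero vector
h, t, t_neg = [0.0, 0.0], [1.0, 0.0], [0.5, 0.0]
before = margin_loss((h, same_as, t), (h, same_as, t_neg), gamma=1.0)
sgd_step((h, same_as, t), (h, same_as, t_neg), gamma=1.0, mu=0.1,
         freeze_relation=True)
after = margin_loss((h, same_as, t), (h, same_as, t_neg), gamma=1.0)
```

In this toy case the step moves the aligned pair closer together and pushes the corrupted tail away, so the loss decreases while `same_as` is left untouched.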
Step 2: build the entity alignment model based on attribute similarity matching:
Define up(e1, e2) as the set of attributes shared by entities e1 and e2, that is:
up(e1, e2) = property1 ∩ property2   (3)
where property1 is the attribute set of entity e1 and property2 is the attribute set of entity e2.
The similarity sim(p_i) of an attribute p_i shared by the two entities is:
[Formula (4) is not reproduced in this text.]
where p_i corresponds to the x-th attribute p1x of entity e1, with attribute value v1x, and to the y-th attribute p2y of entity e2, with attribute value v2y; lcs(v1x, v2y) is the longest common subsequence of the two attribute values.
The similarity of entities e1 and e2 is the average similarity of their shared attributes:
Sim(e1, e2) = average(sim(p_i))   (5)
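The attribute similarity of formulas (3) to (5) can be sketched as follows. Because formula (4) is not available in this text, the normalization used here (LCS length divided by the length of the longer value) is an assumption; the function names and the sample attributes are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def attribute_sim(v1, v2):
    """Assumed normalization: LCS length over the longer value's length."""
    longest = max(len(v1), len(v2))
    return lcs_length(v1, v2) / longest if longest else 0.0


def entity_sim(attrs1, attrs2):
    """Average similarity over shared attribute names, formulas (3) and (5)."""
    shared = attrs1.keys() & attrs2.keys()
    if not shared:
        return 0.0
    return sum(attribute_sim(attrs1[p], attrs2[p]) for p in shared) / len(shared)


# Hypothetical device records from two systems.
e1 = {"name": "main transformer 1", "voltage": "110kV"}
e2 = {"name": "main transformer #1", "voltage": "110kV", "vendor": "ACME"}
similarity = entity_sim(e1, e2)
```

Only `name` and `voltage` are shared, so `vendor` does not affect the result.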
Given an entity t*, to infer the entity h that has a cross-network entity alignment relation with t*, all entity-relation pairs (h', SameAs, t*) are scored with a scoring function, and the highest-scoring h' is taken as the inference result. The scoring function is defined from the vector-representation similarity and the attribute similarity:
f_predict(h, r, t*) = (1 + w × Sim(h, t*)) ||h - t*||   (6)
where ||h - t*|| measures the semantic similarity based on representation learning and Sim(h, t*) denotes the attribute similarity. w is the penalty strength, 0 < w < 1, and its specific value is determined by the confidence in the attributes of the dataset.
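Formula (6) and the selection rule can be sketched as follows. The sketch takes the text literally: it uses the L2 norm for ||h - t*|| (an assumption) and picks the highest score. If an implementation treats ||h - t*|| as a plain distance rather than a similarity, the ranking direction would need to be revisited. The function names and candidate data are hypothetical.

```python
import math


def predict_score(h_vec, t_vec, attr_sim, w):
    """f_predict = (1 + w * Sim(h, t*)) * ||h - t*||, per formula (6)."""
    norm = math.sqrt(sum((a - b) ** 2 for a, b in zip(h_vec, t_vec)))
    return (1.0 + w * attr_sim) * norm


def best_candidate(candidates, t_vec, w):
    """candidates maps a name to (vector, attribute similarity with t*)."""
    return max(candidates,
               key=lambda n: predict_score(candidates[n][0], t_vec,
                                           candidates[n][1], w))


t_star = [0.0, 0.0]
candidates = {"h1": ([3.0, 4.0], 0.5), "h2": ([1.0, 0.0], 0.9)}
score_h1 = predict_score(*candidates["h1"][:1], t_star, candidates["h1"][1], w=0.2)
chosen = best_candidate(candidates, t_star, w=0.2)
```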
Step 3: co-training of the representation learning model and the attribute matching model:
The entity alignment method based on knowledge representation learning and the one based on attribute similarity infer entity alignment from two independent angles: semantic features and attribute structure features, respectively. In the entity alignment task for an industry knowledge graph, a co-training framework lets the inference results of the two views complement each other and achieves a better effect.
Specifically, the training data is divided into two mutually independent views: the semantic feature view and the structure feature view. From the small labeled aligned dataset X_se of the semantic feature view, training data for that view is generated and the representation-learning entity alignment model m_se is trained; m_se is used to predict on the unaligned dataset X_se' of the semantic feature view, and the trusted entity pairs L'_se are selected and put into the aligned data of the attribute structure feature view, updating its aligned dataset X_st.
Similarly, from the small labeled aligned dataset X_st of the attribute structure feature view, training data for that view is generated and the attribute-similarity matching entity alignment model m_st is trained; m_st is used to predict on the unaligned dataset X_st' of the attribute structure feature view to obtain the trusted entity pairs L'_st, which are put into the aligned data of the semantic feature view, updating its aligned dataset X_se.
The two models are iterated in this way until convergence.
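The co-training loop of Step 3 can be sketched as follows. The training and selection routines are passed in as callables; their names (`fit_se`, `pick_se`, and so on), the stopping test (no new trusted pairs, or a maximum iteration count) and the removal of newly trusted pairs from both unaligned pools are illustrative assumptions.

```python
def co_train(x_se, x_se_un, x_st, x_st_un,
             fit_se, fit_st, pick_se, pick_st, max_iters=10):
    """Co-train two alignment models that exchange trusted entity pairs."""
    x_se, x_se_un = set(x_se), set(x_se_un)
    x_st, x_st_un = set(x_st), set(x_st_un)
    m_se = m_st = None
    for _ in range(max_iters):
        m_se = fit_se(x_se)            # representation-learning model
        m_st = fit_st(x_st)            # attribute-similarity model
        l_se = pick_se(m_se, x_se_un)  # trusted pairs L'_se
        l_st = pick_st(m_st, x_st_un)  # trusted pairs L'_st
        if not l_se and not l_st:      # assumed convergence test
            break
        x_st |= l_se                   # semantic view teaches structure view
        x_se |= l_st                   # structure view teaches semantic view
        x_se_un -= l_se | l_st
        x_st_un -= l_se | l_st
    return m_se, m_st, x_se, x_st


# Toy stand-ins: each "model" is just the set it was trained on, and each
# view is confident about exactly one hypothetical pair.
seed_pairs = {("a1", "b1")}
pool = {("a2", "b2"), ("a3", "b3"), ("a4", "b9")}
m_se, m_st, x_se, x_st = co_train(
    seed_pairs, pool, seed_pairs, pool,
    fit_se=set, fit_st=set,
    pick_se=lambda model, un: {p for p in un if p == ("a2", "b2")},
    pick_st=lambda model, un: {p for p in un if p == ("a3", "b3")},
)
```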
Embodiment 2:
This embodiment uses the method of Embodiment 1 to perform entity alignment in a power grid knowledge graph. The first training data and the second training data used in this embodiment contain a large amount of power-domain terminology data. Specifically, the first training data is training data under the semantic feature view and includes a first aligned dataset and a first unaligned dataset; the second training data is training data under the attribute structure feature view and includes a second aligned dataset and a second unaligned dataset. The first entity alignment model is a representation-learning entity alignment model, and the second entity alignment model is an entity alignment model based on attribute similarity matching. The power-domain terminology data mentioned in this embodiment includes, but is not limited to, distribution transformer data and main transformer data.
The first entity alignment model is obtained by training on the first aligned dataset; the first unaligned dataset is predicted with the first entity alignment model to obtain the trusted entity pairs L'_se, which are put into the second aligned dataset, updating the second aligned dataset. The second entity alignment model is obtained by training on the second aligned dataset; the second unaligned dataset is predicted with the second entity alignment model to obtain the trusted entity pairs L'_st, which are put into the first aligned dataset, updating the first aligned dataset. Iteration stops when the number of iterations reaches the set maximum, or when the output of the first entity alignment model and the output of the second entity alignment model are within the set threshold range, yielding the final entity alignment model. The power grid knowledge graph to be aligned is input into the final entity alignment model to obtain the entity alignment result.
Embodiment 3:
This embodiment takes data from the full-service unified data center of Zhejiang Electric Power Company as an example. Multiple business systems are connected directly to the data center and their data is stored in the source-aligned layer, from which test data for one city over one week was extracted. In aligning device entities between the operation-and-inspection system and the materials system, the two systems record 1448 device entities in total: 1032 in the operation-and-inspection system and 876 in the materials system. 460 entities can be aligned between the two systems, of which 160 form the training set, 150 the validation set and 150 the test set.
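The counts above are mutually consistent if the total of 1448 is read as the number of distinct device entities after merging the alignable ones, which is an interpretation rather than something the text states. A quick check, with hypothetical variable names:

```python
inspection_entities = 1032   # operation-and-inspection system
materials_entities = 876     # materials system
alignable = 460              # entities present in both systems

# Inclusion-exclusion: entities counted in both systems are counted once.
distinct_total = inspection_entities + materials_entities - alignable

train, validation, test = 160, 150, 150
split_total = train + validation + test
```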
The baselines in the experiment are the attribute-similarity matching method LCS and the representation-learning entity alignment algorithms cross-KG and SEEA. The evaluation metrics are those commonly used for entity alignment: precision (P), recall (R) and the F1 score.
Precision indicates how accurate the extracted results are, and is defined as:
P = N_success / N_total
where N_total is the total number of inferred relations and N_success is the number of correct relations inferred by the algorithm.
Recall indicates the proportion of all existing alignment relations that are correctly inferred, and is defined as:
R = R_success / R_total
where R_success is the number of correctly inferred relations and R_total is the number of alignment relations that actually exist.
The F1 score combines precision and recall to reflect overall performance, and is defined as:
F1 = 2RP / (R + P)
The experimental results are shown in Table 1.
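The three metrics can be computed from sets of predicted and true alignment pairs, as in this minimal sketch. The function name and the sample pairs are hypothetical; returning 0.0 for empty inputs is a convention chosen here, not part of the source.

```python
def precision_recall_f1(predicted, gold):
    """Return (P, R, F1) for predicted versus true alignment pairs."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * r * p / (r + p) if r + p else 0.0
    return p, r, f1


predicted = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}
gold = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
p, r, f1 = precision_recall_f1(predicted, gold)
```

Here two of three predictions are correct and two of four true pairs are found, so P = 2/3, R = 1/2 and F1 = 4/7.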
Table 1. Experimental comparison of the method of the invention with the baseline methods
The experiment shows that the co-training method based on semantic information and structure features achieves good results on the entity alignment task for power grid data. The representation-learning entity alignment method is trained under the semantic view and the attribute-similarity entity alignment method under the structure view; the better alignment results under each view are added to the other view for iterative training, and the final precision, recall and F1 score improve substantially over either single view.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM and optical storage) that contain computer-usable program code.
The application is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps is executed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments of the invention may still be modified or equivalently replaced, and any modification or equivalent replacement that does not depart from the spirit and scope of the invention shall fall within the scope of the claims of the invention.
Claims (15)
1. An entity alignment method in a knowledge graph, characterized by comprising the following steps:
Step 1: train a first entity alignment model using first training data, and train a second entity alignment model using second training data; in each training iteration, the trusted entities obtained by the first entity alignment model are used to update the second training data, and the trusted entities obtained by the second entity alignment model are used to update the first training data; stop iterating when the number of iterations reaches a set maximum, or when the output of the first entity alignment model and the output of the second entity alignment model fall within a set threshold range, and obtain the final entity alignment model;
Step 2: input the knowledge graph to be aligned into the final entity alignment model obtained in Step 1 to obtain the entity alignment result.
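The co-training loop of Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Predictor` interface, the names `co_train`, `same_name`, and `same_length`, and the reading of the stopping rule as "the two models' trusted sets differ in at most `tol` pairs" are all hypothetical choices made for the example.

```python
from typing import Callable, Set, Tuple

Pair = Tuple[str, str]
# Hypothetical model interface: given the current aligned pairs and the
# unaligned candidates, return the candidate pairs the model trusts.
Predictor = Callable[[Set[Pair], Set[Pair]], Set[Pair]]


def co_train(predict_first: Predictor, predict_second: Predictor,
             aligned_first: Set[Pair], aligned_second: Set[Pair],
             candidates: Set[Pair], max_iters: int = 10, tol: int = 0):
    """Co-train two models by swapping their trusted pairs each round."""
    aligned_first, aligned_second = set(aligned_first), set(aligned_second)
    rounds = 0
    for rounds in range(1, max_iters + 1):
        trusted_first = predict_first(aligned_first, candidates)
        trusted_second = predict_second(aligned_second, candidates)
        # Each model's trusted pairs extend the other model's training data.
        aligned_second |= trusted_first
        aligned_first |= trusted_second
        # Assumed convergence test: outputs differ in at most `tol` pairs.
        if len(trusted_first ^ trusted_second) <= tol:
            break
    return aligned_first, aligned_second, rounds


# Toy stand-ins for the two views (not the models described in the claims).
def same_name(_aligned, candidates):
    return {(a, b) for a, b in candidates if a.lower() == b.lower()}


def same_length(_aligned, candidates):
    return {(a, b) for a, b in candidates if len(a) == len(b)}


CANDIDATES = {("Relay", "relay"), ("Bus", "bar"), ("Line", "LINE")}
```

Calling `co_train(same_name, same_length, set(), set(), CANDIDATES, max_iters=5, tol=1)` stops after one round because the two toy predictors disagree on only one pair; with `tol=0` the loop instead runs until `max_iters`.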
2. The entity alignment method in a knowledge graph according to claim 1, characterized in that: when the method is used for entity alignment in a power grid knowledge graph, the first training data and the second training data are electric power terminology data.
3. The entity alignment method in a knowledge graph according to claim 1, characterized in that: the first training data is training data from the semantic feature view and includes a first aligned data set and a first unaligned data set; the second training data is training data from the attribute-structure feature view and includes a second aligned data set and a second unaligned data set;
the first entity alignment model is an entity alignment model based on representation learning; the second entity alignment model is an entity alignment model based on attribute similarity matching.
4. The entity alignment method in a knowledge graph according to claim 3, characterized in that Step 1 specifically comprises:
training on the first aligned data set to obtain the first entity alignment model, using the first entity alignment model to predict on the first unaligned data set to obtain trusted entity pairs L'_se, and adding them to the second aligned data set, thereby updating the second aligned data set;
training on the second aligned data set to obtain the second entity alignment model, using the second entity alignment model to predict on the second unaligned data set to obtain trusted entity pairs L'_st, and adding them to the first aligned data set, thereby updating the first aligned data set.
5. The entity alignment method in a knowledge graph according to claim 1, characterized in that, before Step 1 is carried out, the method further comprises constructing the first entity alignment model, specifically:
mapping the entities and relations in the knowledge graph to a vector space to obtain the mapping vectors of the entities and of the relations in the knowledge graph, that is, the vectors h, t, and r corresponding to the head entity, the tail entity, and the relation between them in each triple;
constructing a loss function according to formula (1), and stopping iteration once the loss function has been iterated the set maximum number of times or has reached a constant final value, to obtain the first entity alignment model:
[formula (1) is not reproduced in this text]
where (h, r, t) ∈ Δ denotes the set of all positive triples that actually exist after the knowledge graph is converted into triple form; (h′, r′, t′) ∈ Δ′ denotes the set of negative triples, absent from the knowledge graph, that are generated by replacing the head entity or tail entity of a positive triple; [symbol not reproduced] denotes the alignment relation between the learning targets; h denotes the head entity vector, t the tail entity vector, and r the relation vector; γ > 0 is the margin used to separate positive and negative entity pairs; [symbol not reproduced] denotes the alignment relation between the head entity vector h and the tail entity vector t of a positive triple; and [symbol not reproduced] denotes the alignment relation between the head entity vector h′ and the tail entity vector t′ of a negative triple;
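the head entity vector, tail entity vector, and relation vector corresponding to each entity are updated iteratively according to formula (2):
$$
\begin{aligned}
h_i &= h_i - \mu \cdot 2 \cdot |t_i - h_i - r_i| \\
r_i &= r_i - \mu \cdot 2 \cdot |t_i - h_i - r_i| \\
t_i &= t_i - \mu \cdot 2 \cdot |t_i - h_i - r_i| \\
h'_i &= h'_i - \mu \cdot 2 \cdot |t'_i - h'_i - r'_i| \\
r'_i &= r'_i - \mu \cdot 2 \cdot |t'_i - h'_i - r'_i| \\
t'_i &= t'_i - \mu \cdot 2 \cdot |t'_i - h'_i - r'_i|
\end{aligned}
\tag{2}
$$
where dim is the dimension of the vector space, h_i denotes the i-th component of the head entity vector h, and μ is the learning rate.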
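The representation-learning model of claim 5 can be illustrated with a short sketch. Formula (1) is not reproduced in this text, so the hinge-style margin loss below, built on the distance ‖h + r − t‖, is an assumption modeled on common translation-based embedding methods; only `update_step` transcribes formula (2) as printed. The function names `distance`, `margin_loss`, and `update_step` are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def distance(h: Vector, r: Vector, t: Vector) -> float:
    """L2 norm of h + r - t (assumed scoring term; lower means more plausible)."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_loss(pos: Tuple[Vector, Vector, Vector],
                neg: Tuple[Vector, Vector, Vector],
                gamma: float) -> float:
    """Assumed hinge loss for one positive and one negative triple."""
    return max(0.0, gamma + distance(*pos) - distance(*neg))


def update_step(h: Vector, r: Vector, t: Vector,
                mu: float) -> Tuple[Vector, Vector, Vector]:
    """Component-wise update transcribed from formula (2) as printed."""
    step = [mu * 2 * abs(ti - hi - ri) for hi, ri, ti in zip(h, r, t)]
    new_h = [hi - s for hi, s in zip(h, step)]
    new_r = [ri - s for ri, s in zip(r, step)]
    new_t = [ti - s for ti, s in zip(t, step)]
    return new_h, new_r, new_t
```

Note that formula (2) as printed moves h, r, and t by the same step. In ordinary gradient descent on a squared distance ‖h + r − t‖², the step for t has the opposite sign to the step for h and r, so a practical implementation may differ from this literal transcription.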
6. The entity alignment method in a knowledge graph according to claim 1, characterized in that, before Step 1 is carried out, the method comprises constructing the second entity alignment model, specifically:
constructing a scoring function according to formula (6), scoring the entity vectors and relation vectors of all candidate entity-relation pairs with the scoring function, and taking the candidate entity of the highest-scoring candidate entity-relation pair as the aligned target entity;
$$ f_{\text{predict}}(h, r, t^*) = (1 + w \times Sim(h, t^*)) \, \lVert h - t^* \rVert \tag{6} $$
where t* denotes the given entity and h denotes a candidate entity that has a cross-network entity alignment relation with t*; the scoring function scores all candidate entity-relation pairs (h, r, t*), and the candidate entity h with the highest score is the aligned target entity; ‖h − t*‖ measures the semantic similarity based on representation learning, Sim(h, t*) denotes the attribute similarity, and w is the penalty strength;
$$ Sim(h, t^*) = \mathrm{average}(Sim(p_i)) $$
where p_i is the set of attributes shared by entity h and entity t*:
$$ Up(h, t^*) = property1 \cap property2 \tag{3} $$
where property1 is the attribute set of entity h and property2 is the attribute set of entity t*;
the similarity Sim(p_i) of the attribute set p_i shared by entity h and entity t* is:
[formula is not reproduced in this text]
where p_i corresponds to the x-th attribute p1x of entity h, with attribute value v1x, and to the y-th attribute p2y of entity t*, with attribute value v2y; lcs(v1x, v2y) is the longest common subsequence of the attribute values.
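The attribute-similarity scoring of claim 6 can be sketched as below. Formula (6), the averaging over shared attributes, and the use of the longest common subsequence come from the claim; the per-attribute similarity formula itself is not reproduced in this text, so the normalization `2 * |lcs| / (|v1| + |v2|)` is an assumption, as are the function names and the choice to match shared attributes by attribute name.

```python
import math
from typing import Dict, List


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def value_similarity(v1: str, v2: str) -> float:
    """Assumed normalization of the LCS length into the range [0, 1]."""
    total = len(v1) + len(v2)
    return 2 * lcs_length(v1, v2) / total if total else 0.0


def attribute_similarity(props1: Dict[str, str], props2: Dict[str, str]) -> float:
    """Average value similarity over the attributes both entities share."""
    shared = props1.keys() & props2.keys()  # intersection, as in formula (3)
    if not shared:
        return 0.0
    return sum(value_similarity(props1[p], props2[p]) for p in shared) / len(shared)


def f_predict(h_vec: List[float], t_vec: List[float],
              props_h: Dict[str, str], props_t: Dict[str, str], w: float) -> float:
    """Formula (6): (1 + w * Sim(h, t*)) * ||h - t*||."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(h_vec, t_vec)))
    return (1 + w * attribute_similarity(props_h, props_t)) * dist
```

To pick an aligned target for a given entity, one would evaluate `f_predict` for each candidate and select according to the rule stated in the claim.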
7. An entity alignment system in a knowledge graph, characterized by comprising:
a co-training unit, configured to co-train a first entity alignment model and a second entity alignment model to obtain a final entity alignment model;
a reading unit, configured to obtain the knowledge graph to be aligned and input it into the final entity alignment model to obtain the entity alignment result.
8. The entity alignment system in a knowledge graph according to claim 7, characterized in that the co-training unit is specifically configured to: train the first entity alignment model using first training data, and train the second entity alignment model using second training data; in each iteration, the trusted entities obtained by the first entity alignment model are used to update the second training data, and the trusted entities obtained by the second entity alignment model are used to update the first training data; stop iterating when the set maximum number of iterations is reached, or when the output of the first entity alignment model and the output of the second entity alignment model fall within the set threshold range, and obtain the final entity alignment model.
9. The entity alignment system in a knowledge graph according to claim 8, characterized in that: when the system is used for entity alignment in a power grid knowledge graph, the first training data and the second training data are electric power terminology data.
10. The entity alignment system in a knowledge graph according to claim 8, characterized in that: the first training data is training data from the semantic feature view and includes a first aligned data set and a first unaligned data set; the second training data is training data from the attribute-structure feature view and includes a second aligned data set and a second unaligned data set; the first entity alignment model is an entity alignment model based on representation learning; the second entity alignment model is an entity alignment model based on attribute similarity matching.
11. The entity alignment system in a knowledge graph according to claim 8, characterized in that: the first entity alignment model is obtained by training on the first aligned data set; the first entity alignment model is used to predict on the first unaligned data set to obtain trusted entity pairs L'_se, which are added to the second aligned data set, thereby updating the second aligned data set;
the second entity alignment model is obtained by training on the second aligned data set; the second entity alignment model is used to predict on the second unaligned data set to obtain trusted entity pairs L'_st, which are added to the first aligned data set, thereby updating the first aligned data set.
12. The entity alignment system in a knowledge graph according to claim 7, characterized in that the first entity alignment model is obtained as follows: the entities and relations in the knowledge graph are mapped to a vector space to obtain the mapping vectors of the entities and of the relations in the knowledge graph, that is, the vectors h, t, and r corresponding to the head entity, the tail entity, and the relation between them in each triple;
a loss function is constructed according to formula (1), and iteration stops once the loss function has been iterated the set maximum number of times or has reached a constant final value, yielding the first entity alignment model:
[formula (1) is not reproduced in this text]
where (h, r, t) ∈ Δ denotes the set of all triples that actually exist after the knowledge graph is converted into triple form; (h′, r′, t′) ∈ Δ′ denotes the set of triples, absent from the knowledge graph, that are generated by replacing the head entity or tail entity of a positive triple; [symbol not reproduced] denotes the alignment relation between the learning targets; h denotes the head entity vector, t the tail entity vector, and r the relation vector; γ > 0 is the margin used to separate positive and negative entity pairs; [symbol not reproduced] denotes the alignment relation between the head entity vector h and the tail entity vector t of a positive triple; and [symbol not reproduced] denotes the alignment relation between the head entity vector h′ and the tail entity vector t′ of a negative triple;
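the head entity vector, tail entity vector, and relation vector corresponding to each entity are updated iteratively according to formula (2):
$$
\begin{aligned}
h_i &= h_i - \mu \cdot 2 \cdot |t_i - h_i - r_i| \\
r_i &= r_i - \mu \cdot 2 \cdot |t_i - h_i - r_i| \\
t_i &= t_i - \mu \cdot 2 \cdot |t_i - h_i - r_i| \\
h'_i &= h'_i - \mu \cdot 2 \cdot |t'_i - h'_i - r'_i| \\
r'_i &= r'_i - \mu \cdot 2 \cdot |t'_i - h'_i - r'_i| \\
t'_i &= t'_i - \mu \cdot 2 \cdot |t'_i - h'_i - r'_i|
\end{aligned}
\tag{2}
$$
where dim is the dimension of the vector space, h_i denotes the i-th component of the head entity vector h, and μ is the learning rate.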
13. The entity alignment system in a knowledge graph according to claim 7, characterized in that the second entity alignment model is obtained as follows: a scoring function is constructed according to formula (6), the entity vectors and relation vectors of all candidate entity-relation pairs are scored with the scoring function, and the candidate entity of the highest-scoring candidate entity-relation pair is taken as the aligned target entity;
$$ f_{\text{predict}}(h, r, t^*) = (1 + w \times Sim(h, t^*)) \, \lVert h - t^* \rVert \tag{6} $$
where t* denotes the given entity and h denotes a candidate entity that has a cross-network entity alignment relation with t*; the scoring function scores all candidate entity-relation pairs (h, r, t*), and the candidate entity h with the highest score is the aligned target entity; ‖h − t*‖ measures the semantic similarity based on representation learning, Sim(h, t*) denotes the attribute similarity, and w is the penalty strength;
$$ Sim(h, t^*) = \mathrm{average}(Sim(p_i)) $$
where p_i is the set of attributes shared by entity h and entity t*:
$$ Up(h, t^*) = property1 \cap property2 \tag{3} $$
where property1 is the attribute set of entity h and property2 is the attribute set of entity t*;
the similarity Sim(p_i) of the attribute set p_i shared by entity h and entity t* is:
[formula is not reproduced in this text]
where p_i corresponds to the x-th attribute p1x of entity h, with attribute value v1x, and to the y-th attribute p2y of entity t*, with attribute value v2y; lcs(v1x, v2y) is the longest common subsequence of the attribute values.
14. An entity alignment system in a knowledge graph, characterized in that the system comprises a network interface, a memory, and a processor, wherein:
the network interface is configured to receive and send signals in the course of exchanging information with other external network elements;
the memory is configured to store computer program instructions executable on the processor;
the processor is configured to, when running the computer program instructions, perform the steps of the entity alignment method in a knowledge graph according to any one of claims 1 to 6.
15. A computer storage medium, characterized in that the computer storage medium stores a program of an entity alignment method in a knowledge graph, and when the program is executed by at least one processor, it implements the steps of the entity alignment method in a knowledge graph according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910485558.6A CN110245131A (en) | 2019-06-05 | 2019-06-05 | Entity alignment schemes, system and its storage medium in a kind of knowledge mapping |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110245131A (en) | 2019-09-17 |
Family
ID=67886142
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910485558.6A Pending CN110245131A (en) | 2019-06-05 | 2019-06-05 | Entity alignment schemes, system and its storage medium in a kind of knowledge mapping |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110245131A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110765276A (en) * | 2019-10-21 | 2020-02-07 | 北京明略软件系统有限公司 | Entity alignment method and device in knowledge graph |
CN110941722A (en) * | 2019-10-12 | 2020-03-31 | 中国人民解放军国防科技大学 | Knowledge graph fusion method based on entity alignment |
CN111191462A (en) * | 2019-12-30 | 2020-05-22 | 北京航空航天大学 | Method and system for realizing cross-language knowledge space entity alignment based on link prediction |
CN111813963A (en) * | 2020-09-10 | 2020-10-23 | 平安国际智慧城市科技股份有限公司 | Knowledge graph construction method and device, electronic equipment and storage medium |
CN112149400A (en) * | 2020-09-23 | 2020-12-29 | 腾讯科技(深圳)有限公司 | Data processing method, device, equipment and storage medium |
CN112258339A (en) * | 2020-09-29 | 2021-01-22 | 广东电力通信科技有限公司 | Data processing and storing method and system based on power grid system technology |
CN112328710A (en) * | 2020-11-26 | 2021-02-05 | 北京百度网讯科技有限公司 | Entity information processing method, entity information processing device, electronic equipment and storage medium |
CN112445876A (en) * | 2020-11-25 | 2021-03-05 | 中国科学院自动化研究所 | Entity alignment method and system fusing structure, attribute and relationship information |
CN112765370A (en) * | 2021-03-29 | 2021-05-07 | 腾讯科技(深圳)有限公司 | Entity alignment method and device of knowledge graph, computer equipment and storage medium |
CN113392220A (en) * | 2020-10-23 | 2021-09-14 | 腾讯科技(深圳)有限公司 | Knowledge graph generation method and device, computer equipment and storage medium |
WO2022242449A1 (en) * | 2021-05-18 | 2022-11-24 | 腾讯科技(深圳)有限公司 | Knowledge graph alignment model training method and apparatus, knowledge graph alignment method and apparatus, and device |
CN115828882A (en) * | 2022-09-23 | 2023-03-21 | 华能澜沧江水电股份有限公司 | Entity alignment method and system for risk linkage of dam safety knowledge base |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108694201A (en) * | 2017-04-10 | 2018-10-23 | 华为软件技术有限公司 | A kind of entity alignment schemes and device |
US20190019088A1 (en) * | 2017-07-14 | 2019-01-17 | Guangdong Shenma Search Technology Co., Ltd. | Knowledge graph construction method and device |
Non-Patent Citations (2)
Title |
---|
林翠萍: "《中文人名消歧算法研究》", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
苏佳林 等: "《融合语义和结构信息的知识图谱实体对齐》", 《山西大学学报(自然科学版)》 * |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110941722A (en) * | 2019-10-12 | 2020-03-31 | 中国人民解放军国防科技大学 | Knowledge graph fusion method based on entity alignment |
CN110941722B (en) * | 2019-10-12 | 2022-07-01 | 中国人民解放军国防科技大学 | Knowledge graph fusion method based on entity alignment |
CN110765276A (en) * | 2019-10-21 | 2020-02-07 | 北京明略软件系统有限公司 | Entity alignment method and device in knowledge graph |
CN111191462B (en) * | 2019-12-30 | 2022-02-22 | 北京航空航天大学 | Method and system for realizing cross-language knowledge space entity alignment based on link prediction |
CN111191462A (en) * | 2019-12-30 | 2020-05-22 | 北京航空航天大学 | Method and system for realizing cross-language knowledge space entity alignment based on link prediction |
CN111813963A (en) * | 2020-09-10 | 2020-10-23 | 平安国际智慧城市科技股份有限公司 | Knowledge graph construction method and device, electronic equipment and storage medium |
CN112149400A (en) * | 2020-09-23 | 2020-12-29 | 腾讯科技(深圳)有限公司 | Data processing method, device, equipment and storage medium |
CN112149400B (en) * | 2020-09-23 | 2021-07-27 | 腾讯科技(深圳)有限公司 | Data processing method, device, equipment and storage medium |
CN112258339A (en) * | 2020-09-29 | 2021-01-22 | 广东电力通信科技有限公司 | Data processing and storing method and system based on power grid system technology |
CN113392220A (en) * | 2020-10-23 | 2021-09-14 | 腾讯科技(深圳)有限公司 | Knowledge graph generation method and device, computer equipment and storage medium |
CN113392220B (en) * | 2020-10-23 | 2024-03-26 | 腾讯科技(深圳)有限公司 | Knowledge graph generation method and device, computer equipment and storage medium |
CN112445876A (en) * | 2020-11-25 | 2021-03-05 | 中国科学院自动化研究所 | Entity alignment method and system fusing structure, attribute and relationship information |
CN112445876B (en) * | 2020-11-25 | 2023-12-26 | 中国科学院自动化研究所 | Entity alignment method and system for fusing structure, attribute and relationship information |
CN112328710A (en) * | 2020-11-26 | 2021-02-05 | 北京百度网讯科技有限公司 | Entity information processing method, entity information processing device, electronic equipment and storage medium |
CN112765370A (en) * | 2021-03-29 | 2021-05-07 | 腾讯科技(深圳)有限公司 | Entity alignment method and device of knowledge graph, computer equipment and storage medium |
CN112765370B (en) * | 2021-03-29 | 2021-07-06 | 腾讯科技(深圳)有限公司 | Entity alignment method and device of knowledge graph, computer equipment and storage medium |
WO2022242449A1 (en) * | 2021-05-18 | 2022-11-24 | 腾讯科技(深圳)有限公司 | Knowledge graph alignment model training method and apparatus, knowledge graph alignment method and apparatus, and device |
CN115828882A (en) * | 2022-09-23 | 2023-03-21 | 华能澜沧江水电股份有限公司 | Entity alignment method and system for risk linkage of dam safety knowledge base |
CN115828882B (en) * | 2022-09-23 | 2023-06-16 | 华能澜沧江水电股份有限公司 | Entity alignment method and system oriented to dam safety knowledge base risk linkage |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110245131A (en) | Entity alignment schemes, system and its storage medium in a kind of knowledge mapping | |
CN106651247A (en) | Address area block matching method based on GIS topology analysis and address area block matching system thereof | |
CN103390037B (en) | Ten thousand people's Collaborative Plotting methods based on mobile terminal | |
CN110175909A (en) | A kind of enterprise's incidence relation determines method and system | |
CN108694201A (en) | A kind of entity alignment schemes and device | |
CN104798043A (en) | Data processing method and computer system | |
CN110415521A (en) | Prediction technique, device and the computer readable storage medium of traffic data | |
CN112070402A (en) | Data processing method, device and equipment based on map and storage medium | |
CN107633257B (en) | Data quality evaluation method and device, computer readable storage medium and terminal | |
CN107194672B (en) | Review distribution method integrating academic expertise and social network | |
CN104239594A (en) | Artificial environment model, Agent model and modeling method of Agent model | |
CN111737364A (en) | Safe multi-party data fusion and federal sharing method, device, equipment and medium | |
Marchal et al. | Modeling location choice of secondary activities with a social network of cooperative agents | |
CN108268512A (en) | A kind of tag queries method and device | |
Gao et al. | A multi-objective service composition method considering the interests of tri-stakeholders in cloud manufacturing based on an enhanced jellyfish search optimizer | |
CN105868478A (en) | Rotating mechanical equipment virtual assembly model and method based on context awareness | |
CN112541556A (en) | Model construction optimization method, device, medium, and computer program product | |
Yang et al. | Research on trans-region integrated traffic emergency dispatching technology based on multi-agent | |
CN112419810B (en) | Intelligent education method for accurate control based on adaptive cognitive interaction | |
CN111368060B (en) | Self-learning method, device and system for conversation robot, electronic equipment and medium | |
CN108648099A (en) | The intelligent aided design system of distribution planning | |
CN114462225A (en) | Rapid construction system for hybrid traffic simulation supporting environment under vehicle-road cooperation | |
Lynn et al. | Managing distributed cloud applications and infrastructure: A self-optimising approach | |
CN103475686B (en) | Communication data distribution system and communication data distribution method for electric analog | |
CN106547876A (en) | A kind of community discovery processing method propagated based on degree of membership label and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||