CN110245131A - Entity alignment method and system in a knowledge graph, and storage medium thereof - Google Patents

Info

Publication number
CN110245131A
Authority
CN
China
Prior art keywords
entity
alignment
vector
instance
alignment model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910485558.6A
Other languages
Chinese (zh)
Inventor
王渊
冯珺
徐海洋
冯烛明
樊华
王鑫
张淑娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHINA REALTIME DATABASE Co Ltd
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Anhui Electric Power Co Ltd
NARI Group Corp
Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Original Assignee
CHINA REALTIME DATABASE Co Ltd
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Anhui Electric Power Co Ltd
NARI Group Corp
Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CHINA REALTIME DATABASE Co Ltd, State Grid Corp of China SGCC, Electric Power Research Institute of State Grid Anhui Electric Power Co Ltd, NARI Group Corp, and Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Priority to CN201910485558.6A
Publication of CN110245131A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Abstract

The invention discloses an entity alignment method and system in a knowledge graph, and a storage medium thereof. The method includes: Step 1: training a first entity alignment model with first training data and a second entity alignment model with second training data; in each training iteration, the trusted entity pairs produced by the first entity alignment model are used to update the second training data, and the trusted entity pairs produced by the second entity alignment model are used to update the first training data; iteration stops when the number of iterations reaches a set maximum, or when the outputs of the first and second entity alignment models fall within a set threshold range, yielding the final entity alignment model. Step 2: inputting the knowledge graph to be aligned into the final entity alignment model obtained in Step 1 to obtain the entity alignment result.

Description

Entity alignment method and system in a knowledge graph, and storage medium thereof
Technical field
The invention belongs to the field of intelligent applications of electric power big data, and in particular relates to an entity alignment method and system in a knowledge graph, and a storage medium thereof.
Background
With the continuous development of big data technology, large volumes of data have accumulated that are not yet used effectively, and the value they contain is attracting growing attention from enterprises and academia. To achieve unified aggregation and shared application of data, a knowledge graph of the data must be built to establish a semantic network over the data and to provide users with unified data services that interoperate at the semantic level. However, because the data come from different systems, each of which often has its own rules for describing the same object, the entities and relations extracted from these systems are highly heterogeneous and redundant. Cleaning up and merging entities that refer to the same object through entity alignment, and thereby resolving the problem of duplicate entities in the knowledge graph, is a key step in building a high-quality data knowledge graph.
Entity alignment techniques aim to find entities in different data sets that refer to the same object and, by building coreference links such as owl:sameAs, to link these entities to a single object with a unified, globally unique identifier, achieving high-quality links between data sources and facilitating knowledge graph construction. Entity alignment methods fall into two main classes: methods based on attribute similarity, and methods that infer alignment relations between entities through knowledge representation learning. Attribute-similarity inference judges mainly by whether the entities to be aligned share the same attributes and corresponding attribute values. Representation-learning inference uses a modeling method to map the entities and relations of a knowledge graph into a low-dimensional dense vector space, and then performs computation and reasoning there.
However, when correspondences must be found between entities in different data sets, directly applying a knowledge representation model or attribute-similarity inference rarely achieves satisfactory results. Moreover, current methods require large amounts of labeled entity alignment data, which in practice would demand the participation of many power-business experts and is difficult to achieve.
Summary of the invention
To solve the problems in the prior art, the present invention proposes an entity alignment method in a knowledge graph that fuses the entity alignment results of knowledge representation learning and attribute-similarity inference, so that the two sets of results complement each other and a good entity alignment effect is achieved on the data.
The technical solution adopted by the invention is an entity alignment method in a knowledge graph, comprising the following steps:
Step 1: a first entity alignment model is trained with first training data, and a second entity alignment model is trained with second training data. In each training iteration, the trusted entity pairs obtained by the first entity alignment model update the second training data, and the trusted entity pairs obtained by the second entity alignment model update the first training data. Iteration stops when the number of iterations reaches a set maximum, or when the outputs of the first and second entity alignment models fall within a set threshold range, yielding the final entity alignment model;
Step 2: the knowledge graph to be aligned is input into the final entity alignment model obtained in Step 1 to obtain the entity alignment result.
Further, when the method is used to perform entity alignment in a power grid knowledge graph, the first training data and the second training data are electric power terminology data.
Further, the first training data are training data under the semantic feature view, comprising a first aligned data set and a first unaligned data set; the second training data are training data under the attribute structure feature view, comprising a second aligned data set and a second unaligned data set;
The first entity alignment model is an entity alignment model based on representation learning; the second entity alignment model is an entity alignment model based on attribute similarity matching.
Further, Step 1 specifically includes:
The first entity alignment model is obtained by training on the first aligned data set; the first entity alignment model is then used to predict on the first unaligned data set to obtain trusted entity pairs L'_se, which are put into the second aligned data set, updating the second aligned data set;
The second entity alignment model is obtained by training on the second aligned data set; the second entity alignment model is then used to predict on the second unaligned data set to obtain trusted entity pairs L'_st, which are put into the first aligned data set, updating the first aligned data set.
Further, before Step 1 is carried out, the method also includes building the first entity alignment model, specifically:
The entities and relations in the knowledge graph are mapped into a vector space to obtain the mapping vector of each entity and of each relation, giving the vectors h, t, and r for the head entity, tail entity, and connecting relation of each triple;
A loss function is constructed according to formula (1); iteration stops when the number of iterations reaches a set maximum or the loss value no longer changes, yielding the first entity alignment model:
Here, (h, r, t) ∈ Δ denotes the set of all triples that exist as facts after the knowledge graph is converted into triple form; (h', r', t') ∈ Δ' denotes the set of triples that do not exist in the knowledge graph, generated from positive triples by replacing the head entity or the tail entity. h denotes the head entity vector, t the tail entity vector, and r the relation vector; γ > 0 is the margin separating positive and negative entity pairs. The positive-triple term measures the alignment relation between the head entity vector h and the tail entity vector t of a positive triple, and the negative-triple term measures the alignment relation between the head entity vector h' and the tail entity vector t' of a negative triple;
The head entity vector, tail entity vector, and relation vector are updated iteratively according to formula (2):
In the formula, dim is the dimension of the vector space, h_i is the i-th component of the head entity vector h, and μ is the learning rate.
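The formula images for (1) and (2) are not reproduced in this text. Assuming the standard margin-based translation objective that the surrounding definitions describe (this form is an assumption, not confirmed wording of the patent), they would plausibly read as follows, with the symbol f_r and the distance form introduced here for illustration:

```latex
% Assumed reconstruction: f_r(h,t) = ||h + r - t|| is the usual translation-model score.
\begin{align}
L &= \sum_{(h,r,t)\in\Delta}\ \sum_{(h',r',t')\in\Delta'}
     \bigl[\gamma + f_r(h,t) - f_{r'}(h',t')\bigr]_{+},
     \qquad f_r(h,t) = \lVert h + r - t \rVert \tag{1}\\
h_i &\leftarrow h_i - \mu\,\frac{\partial L}{\partial h_i},
     \qquad i = 1,\dots,\mathit{dim} \tag{2}
\end{align}
```

Under the same assumption, the tail entity vector t and the relation vector r would be updated component-wise in the same way.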
Further, before Step 1 is carried out, the method also includes building the second entity alignment model, specifically:
A scoring function is constructed according to formula (6); the entity vectors and relation vectors of all candidate entity-relation pairs are scored with it, and the candidate entity of the highest-scoring pair is taken as the aligned target entity;
f_predict(h, r, t*) = (1 + w × Sim(h, t*)) ||h - t*||    (6)
Here, t* denotes the given entity and h denotes a candidate entity that may have a cross-network entity alignment relation with t*. All candidate entity-relation pairs (h, r, t*) are scored with the scoring function, and the highest-scoring candidate entity h is the aligned target entity. ||h - t*|| measures semantic similarity based on representation learning, Sim(h, t*) denotes attribute similarity, and w is the penalty strength;
Sim(h, t*) = average(sim(p_i))
p_i ranges over the set of attributes shared by entity h and entity t*:
up(h, t*) = property1 ∩ property2    (3)
In the formula, property1 is the attribute set of entity h and property2 is the attribute set of entity t*;
The similarity sim(p_i) of a shared attribute p_i of entity h and entity t* is:
In the formula, p_i corresponds to the x-th attribute p1x of entity h, with attribute value v1x, and to the y-th attribute p2y of entity t*, with attribute value v2y; lcs(v1x, v2y) is the longest common subsequence of the two attribute values.
The invention also discloses an entity alignment system in a knowledge graph, comprising:
A co-training unit, configured to co-train the first entity alignment model and the second entity alignment model to obtain the final entity alignment model;
A reading unit, configured to obtain the knowledge graph to be aligned and input it into the final entity alignment model to obtain the entity alignment result.
Further, the co-training unit specifically: trains the first entity alignment model with the first training data and the second entity alignment model with the second training data; in each iteration, the trusted entity pairs obtained by the first entity alignment model update the second training data, and the trusted entity pairs obtained by the second entity alignment model update the first training data; iteration stops when the set maximum number of iterations is reached, or when the outputs of the first and second entity alignment models fall within a set threshold range, yielding the final entity alignment model.
Further, when the system is used to perform entity alignment in a power grid knowledge graph, the first training data and the second training data are electric power terminology data.
Further, the first training data are training data under the semantic feature view, comprising a first aligned data set and a first unaligned data set; the second training data are training data under the attribute structure feature view, comprising a second aligned data set and a second unaligned data set. The first entity alignment model is an entity alignment model based on representation learning; the second entity alignment model is an entity alignment model based on attribute similarity matching.
Further, the first entity alignment model is obtained by training on the first aligned data set; the first entity alignment model is then used to predict on the first unaligned data set to obtain trusted entity pairs L'_se, which are put into the second aligned data set, updating the second aligned data set;
The second entity alignment model is obtained by training on the second aligned data set; the second entity alignment model is then used to predict on the second unaligned data set to obtain trusted entity pairs L'_st, which are put into the first aligned data set, updating the first aligned data set.
Further, the first entity alignment model is built as follows: the entities and relations in the knowledge graph are mapped into a vector space to obtain the mapping vector of each entity and of each relation, giving the vectors h, t, and r for the head entity, tail entity, and connecting relation of each triple;
A loss function is constructed according to formula (1); iteration stops when the number of iterations reaches a set maximum or the loss value no longer changes, yielding the first entity alignment model:
Here, (h, r, t) ∈ Δ denotes the set of all triples that exist as facts after the knowledge graph is converted into triple form; (h', r', t') ∈ Δ' denotes the set of triples that do not exist in the knowledge graph, generated from positive triples by replacing the head entity or the tail entity. h denotes the head entity vector, t the tail entity vector, and r the relation vector; γ > 0 is the margin separating positive and negative entity pairs. The positive-triple term measures the alignment relation between the head entity vector h and the tail entity vector t of a positive triple, and the negative-triple term measures the alignment relation between the head entity vector h' and the tail entity vector t' of a negative triple;
The head entity vector, tail entity vector, and relation vector are updated iteratively according to formula (2):
In the formula, dim is the dimension of the vector space, h_i is the i-th component of the head entity vector h, and μ is the learning rate.
Further, the second entity alignment model is built as follows: a scoring function is constructed according to formula (6); the entity vectors and relation vectors of all candidate entity-relation pairs are scored with it, and the candidate entity of the highest-scoring pair is taken as the aligned target entity;
f_predict(h, r, t*) = (1 + w × Sim(h, t*)) ||h - t*||    (6)
Here, t* denotes the given entity and h denotes a candidate entity that may have a cross-network entity alignment relation with t*. All candidate entity-relation pairs (h, r, t*) are scored with the scoring function, and the highest-scoring candidate entity h is the aligned target entity. ||h - t*|| measures semantic similarity based on representation learning, Sim(h, t*) denotes attribute similarity, and w is the penalty strength;
Sim(h, t*) = average(sim(p_i))
p_i ranges over the set of attributes shared by entity h and entity t*:
up(h, t*) = property1 ∩ property2    (3)
In the formula, property1 is the attribute set of entity h and property2 is the attribute set of entity t*;
The similarity sim(p_i) of a shared attribute p_i of entity h and entity t* is:
In the formula, p_i corresponds to the x-th attribute p1x of entity h, with attribute value v1x, and to the y-th attribute p2y of entity t*, with attribute value v2y; lcs(v1x, v2y) is the longest common subsequence of the two attribute values.
The invention also discloses an entity alignment system in a knowledge graph, the system comprising a network interface, a memory, and a processor; wherein,
The network interface is configured to receive and send signals while exchanging information with other external network elements;
The memory is configured to store computer program instructions that can run on the processor;
The processor is configured to execute the steps of the above entity alignment method in a knowledge graph when running the computer program instructions.
The invention also discloses a computer storage medium storing a program of the entity alignment method in a knowledge graph; when the program is executed by at least one processor, the steps of the above entity alignment method in a knowledge graph are implemented.
Beneficial effects: the entity alignment method used by the present invention, which combines semantic information with structural features, achieves good results on the entity alignment task for power grid data, and improves across the board on methods that infer entity alignment from semantic information alone or from structural attribute features alone.
Specific embodiment
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is further explained below with reference to specific embodiments.
Embodiment 1:
The entity alignment method used during knowledge graph construction in this embodiment comprises the following steps:
Step 1: build the entity alignment model based on representation learning:
Entity alignment using representation learning is divided into two main steps:
First, the knowledge graphs to be aligned, KG1 and KG2, are mapped into a low-dimensional vector space to obtain their knowledge representations. Then, based on a manually labeled entity alignment data set N, the alignment relation between entities is learned. Let (e1, e2) denote any entity pair in N, i.e., (e1, e2) ∈ N, where e1 ∈ E1 and e2 ∈ E2. The alignment relation between entities of the two knowledge graphs is treated as a special relation r* = SameAs, so an entity alignment forms the triple (e1, SameAs, e2). Before training starts, the vector representations of the entities and relations of KG1 and KG2 are initialized from a uniform distribution.
The loss function of the vector space representation method is defined as:
Here, (h, r, t) ∈ Δ is the set of positive triples, i.e., all triples that exist as facts after the knowledge graph is converted into triple form. (h', r', t') ∈ Δ' is the set of negative triples, i.e., triples that do not exist in the knowledge graph, generated from positive triples by replacing the head entity or the tail entity. In particular, for triples of the entity alignment relation SameAs, the entity substituted when constructing a negative triple should be an entity of the same type from the corresponding data source: the head entity is randomly replaced with a same-type entity from the first data source, or the tail entity is replaced with a same-type entity from the second data source. Following the idea of translation models, the tail entity t is regarded as the result of translating the head entity h by the relation r, which is used to measure the degree of matching between the two entities. γ > 0 is the margin separating positive and negative entity pairs.
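The negative-sampling rule for SameAs triples can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `corrupt_same_as`, the `type_of` mapping, and the `entities_by_source` layout are all hypothetical, and the sketch does not check whether the generated triple happens to be a known positive.

```python
import random

def corrupt_same_as(triple, type_of, entities_by_source, rng=random):
    """Build one negative example for a (e1, 'SameAs', e2) triple.

    Either the head is replaced by a same-type entity from the first data
    source, or the tail by a same-type entity from the second data source.
    All argument names and data layouts here are illustrative assumptions.
    """
    head, rel, tail = triple
    replace_head = rng.random() < 0.5
    source, old = (0, head) if replace_head else (1, tail)
    # Candidates: same data source, same entity type, different entity.
    candidates = [e for e in entities_by_source[source]
                  if type_of[e] == type_of[old] and e != old]
    if not candidates:
        return None  # no same-type replacement available
    new = rng.choice(candidates)
    return (new, rel, tail) if replace_head else (head, rel, new)
```

Restricting replacements to same-type entities keeps the negatives hard: a transformer swapped for another transformer is a more informative counterexample than a transformer swapped for an unrelated device.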
While the loss function is optimized, the relation vector SameAs is constrained to be the zero vector, and iteration stops when the set maximum number of iterations is reached or the loss value no longer changes. In each iteration, the head entity vector, tail entity vector, and relation vector of each entity-relation triple are updated; gradient descent on the vectors can be used for the update, as follows:
In the formula, dim is the dimension of the vector space, h_i is the i-th component of h, and μ is the learning rate.
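One training step can be sketched in NumPy as below. The formula images are not reproduced in this text, so the sketch assumes the usual translation-model score ||h + r - t|| and a hinge loss with margin γ; the function names and default values are illustrative, not taken from the patent.

```python
import numpy as np

def score(h, r, t):
    """Assumed translation-model distance ||h + r - t|| (smaller = better match)."""
    return float(np.linalg.norm(h + r - t))

def margin_loss(pos, neg, gamma=1.0):
    """Hinge loss [gamma + f(pos) - f(neg)]_+ for one positive/negative triple pair."""
    return max(0.0, gamma + score(*pos) - score(*neg))

def sgd_step(pos, neg, gamma=1.0, mu=0.01, same_as=False):
    """Update the float vectors of one positive/negative pair in place.

    With same_as=True the relation vectors are left untouched, so a SameAs
    relation initialised to zero stays the zero vector, as the text requires.
    """
    (h, r, t), (h2, r2, t2) = pos, neg
    if margin_loss(pos, neg, gamma) == 0.0:
        return  # margin already satisfied, gradient is zero
    d_pos, d_neg = h + r - t, h2 + r2 - t2
    # Gradient of a norm ||d|| with respect to d is d / ||d||.
    g_pos = d_pos / (np.linalg.norm(d_pos) + 1e-12)
    g_neg = d_neg / (np.linalg.norm(d_neg) + 1e-12)
    h -= mu * g_pos
    t += mu * g_pos
    h2 += mu * g_neg
    t2 -= mu * g_neg
    if not same_as:
        r -= mu * g_pos
        r2 += mu * g_neg
```

Both gradients are computed before any vector is modified, so the step remains correct when the negative triple shares its head or relation array with the positive one.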
Step 2: build the entity alignment model based on attribute similarity matching:
Define up(e1, e2) as the set of attributes shared by entities e1 and e2, i.e.:
up(e1, e2) = property1 ∩ property2    (3)
In the formula, property1 is the attribute set of entity e1 and property2 is the attribute set of entity e2.
The similarity sim(p_i) of an attribute p_i shared by the two entities is:
In the formula, p_i corresponds to the x-th attribute p1x of entity e1, with attribute value v1x, and to the y-th attribute p2y of entity e2, with attribute value v2y. lcs(v1x, v2y) is the longest common subsequence of the two attribute values.
The entity similarity of e1 and e2 is the average similarity of their shared attributes:
Sim(e1, e2) = average(sim(p_i))    (5)
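Formulas (3) to (5) can be sketched in Python as follows. The image for formula (4) is not reproduced in this text, so normalising the LCS length by the longer attribute value is an assumption; representing an entity's attributes as a dict and matching shared attributes by identical name are likewise illustrative choices.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two strings (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0]
        for j, ch_b in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ch_a == ch_b else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def attribute_similarity(v1, v2):
    """sim(p_i): LCS length divided by the longer value's length (assumed normalisation)."""
    if not v1 and not v2:
        return 1.0  # two empty values are treated as identical
    return lcs_length(v1, v2) / max(len(v1), len(v2))

def entity_similarity(attrs1, attrs2):
    """Sim(e1, e2): mean similarity over the attributes both entities share."""
    shared = attrs1.keys() & attrs2.keys()  # up(e1, e2) in formula (3)
    if not shared:
        return 0.0
    return sum(attribute_similarity(attrs1[p], attrs2[p]) for p in shared) / len(shared)
```

A subsequence-based measure tolerates inserted or dropped characters, which suits equipment names and model numbers that differ by separators or abbreviations between systems.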
Given an entity t*, to infer the entity h that has a cross-network entity alignment relation with t*, all entity-relation pairs (h', SameAs, t*) are scored with the scoring function, and the highest-scoring h' is taken as the inferred result. The scoring function is defined from the vector-representation similarity and the attribute similarity:
f_predict(h, r, t*) = (1 + w × Sim(h, t*)) ||h - t*||    (6)
Here, ||h - t*|| measures semantic similarity based on representation learning, and Sim(h, t*) denotes attribute similarity. w is the penalty strength, 0 < w < 1; its specific value is determined by the reliability of the attributes in the data set.
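Formula (6) and the candidate selection can be sketched as below. The function names, the `candidates` layout, and the default w = 0.5 are illustrative assumptions; the ranking follows the text literally by taking the highest score.

```python
import numpy as np

def predict_score(h_vec, t_vec, attr_sim, w=0.5):
    """Formula (6): (1 + w * Sim(h, t*)) * ||h - t*||."""
    return (1.0 + w * attr_sim) * float(np.linalg.norm(h_vec - t_vec))

def best_candidate(candidates, t_vec, w=0.5):
    """Return the id of the highest-scoring candidate, as the text specifies.

    `candidates` maps a candidate id to (embedding, attribute similarity with t*).
    """
    return max(candidates,
               key=lambda c: predict_score(candidates[c][0], t_vec, candidates[c][1], w))
```

Because ||h - t*|| grows as two embeddings move apart, the ranking direction is worth checking against the original formula before reusing this sketch.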
Step 3: co-training of the representation learning model and the attribute matching model:
The entity alignment methods based on knowledge representation learning and on attribute similarity infer entity alignment from two independent angles, semantic features and attribute structure features respectively. In the entity alignment task for an industry knowledge graph, using a co-training framework to let the inference results of the two views complement each other achieves better results.
Specifically, the training data are divided into two mutually independent views: the semantic feature view and the structural feature view. From the small labeled aligned data set X_se of the semantic feature view, the training data of the semantic feature view are generated and the representation-learning entity alignment model m_se is trained. m_se is then used to predict on the unaligned data set X_se' of the semantic feature view, and the trusted entity pairs L'_se are selected and put into the aligned data of the attribute structure feature view, updating that view's aligned data set X_st.
Similarly, from the small labeled aligned data set X_st of the attribute structure feature view, the training data of the attribute structure feature view are generated and the attribute-similarity-matching entity alignment model m_st is trained. m_st is then used to predict on the unaligned data set X_st' of the attribute structure feature view to obtain the trusted entity pairs L'_st, which are put into the aligned data of the semantic feature view, updating that view's aligned data set X_se.
The two models are iterated in this way until convergence.
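The co-training loop of Step 3 can be sketched as follows. This is a minimal sketch under stated assumptions: the trainer callables, the fixed confidence threshold, and the use of "no new trusted pairs" as the convergence test are illustrative choices, since the text only says that iteration stops at a maximum count or when the two models' outputs fall within a set threshold range.

```python
def co_train(train_se, train_st, x_se, x_st, unlabeled_se, unlabeled_st,
             max_iters=10, confidence=0.9):
    """Co-train a semantic-view model and an attribute-view model.

    train_se / train_st: hypothetical callables that fit a model on a set of
    aligned pairs and return a function mapping a pair to a confidence in [0, 1].
    x_se / x_st: labelled aligned pairs per view; unlabeled_*: candidate pairs.
    """
    x_se, x_st = set(x_se), set(x_st)
    unlabeled_se, unlabeled_st = set(unlabeled_se), set(unlabeled_st)
    m_se = m_st = None
    for _ in range(max_iters):
        m_se = train_se(x_se)
        m_st = train_st(x_st)
        trusted_se = {p for p in unlabeled_se if m_se(p) >= confidence}  # L'_se
        trusted_st = {p for p in unlabeled_st if m_st(p) >= confidence}  # L'_st
        new_st = trusted_se - x_st
        new_se = trusted_st - x_se
        if not new_st and not new_se:
            break  # neither view gained new pairs: treated here as convergence
        x_st |= new_st  # the semantic view teaches the attribute view
        x_se |= new_se  # the attribute view teaches the semantic view
        unlabeled_se -= trusted_se
        unlabeled_st -= trusted_st
    else:
        # Iteration budget exhausted: refit once on the final data sets.
        m_se, m_st = train_se(x_se), train_st(x_st)
    return m_se, m_st, x_se, x_st
```

Each view only ever receives labels proposed by the other view, which is what lets a small labeled seed set grow without additional expert annotation.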
Embodiment 2:
This embodiment performs entity alignment in a power grid knowledge graph using the method of Embodiment 1. The first training data and the second training data used in this embodiment contain a large amount of electric power terminology data. Specifically, the first training data of this embodiment are training data under the semantic feature view, comprising a first aligned data set and a first unaligned data set; the second training data are training data under the attribute structure feature view, comprising a second aligned data set and a second unaligned data set. The first entity alignment model of this embodiment is an entity alignment model based on representation learning, and the second entity alignment model is an entity alignment model based on attribute similarity matching. The electric power terminology data mentioned in this embodiment include, but are not limited to, distribution transformer data and main transformer data.
The first entity alignment model is obtained by training on the first aligned data set; it is then used to predict on the first unaligned data set to obtain trusted entity pairs L'_se, which are put into the second aligned data set, updating the second aligned data set. The second entity alignment model is obtained by training on the second aligned data set; it is then used to predict on the second unaligned data set to obtain trusted entity pairs L'_st, which are put into the first aligned data set, updating the first aligned data set. Iteration stops when the number of iterations reaches the set maximum, or when the outputs of the first and second entity alignment models fall within the set threshold range, yielding the final entity alignment model. The power grid knowledge graph to be aligned is input into the final entity alignment model to obtain the entity alignment result.
Embodiment 3:
This embodiment takes as its example the data of the full-service unified data center of Zhejiang Electric Power Company. In the data center, multiple business systems feed directly into the data center and are stored in its source-aligned layer, from which test data for one week in one city were extracted. In the task of aligning equipment entities between the operation and inspection system and the materials system, the two systems together record 1448 equipment entities: 1032 in the operation and inspection system and 876 in the materials system. Of these, 460 entities can be aligned across the two systems, split into a training set of 160, a validation set of 150, and a test set of 150.
The baseline methods in the experiment are LCS, a method based on attribute similarity matching, and cross-KG and SEEA, entity alignment algorithms based on representation learning. The evaluation metrics are those commonly used for entity alignment: precision (P), recall (R), and the F1 score.
Precision indicates how accurate the inferred results are, and is defined as:
P = N_success / N_total
Here, N_total is the total number of relations inferred, and N_success is the number of correct relations inferred by the algorithm.
Recall indicates the proportion of all existing alignment relations that were correctly inferred, and is defined as:
R = R_success / R_total
Here, R_success is the number of correctly inferred relations, and R_total is the number of all alignment relations that actually exist.
The F1 score combines precision and recall into a single metric that reflects overall performance, and is defined as:
F1 = 2RP / (R + P)
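The three metrics can be computed from sets of entity pairs as in the sketch below; the function name and the choice to represent alignments as sets of pairs are illustrative.

```python
def precision_recall_f1(predicted, gold):
    """Return (P, R, F1) for predicted alignment pairs against the true pairs."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)                 # N_success = R_success
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```

For example, four predicted pairs of which two are among three true pairs give P = 0.5, R = 2/3, and F1 = 4/7.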
The experimental results are shown in Table 1.
Table 1: Experimental comparison of the method of the present invention with the baseline methods
The experiment shows that the co-training method based on semantic information and structural features achieves good results on the entity alignment task for power grid data. The representation-learning entity alignment method is trained under the semantic view and the attribute-similarity entity alignment method under the structural view; the better alignment results from each view are added to the other view for further iterative training. The resulting precision, recall, and F1 score are all substantially higher than those obtained from a single view.
Those skilled in the art will understand that embodiments of this application may be provided as a method, a system, or a computer program product. Accordingly, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.
This application is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments of the invention may still be modified or equivalently replaced, and any modification or equivalent replacement that does not depart from the spirit and scope of the invention shall fall within the scope of the claims of the present invention.

Claims (15)

1. An entity alignment method in a knowledge graph, characterized by comprising the following steps:
Step 1: train a first entity alignment model using first training data and train a second entity alignment model using second training data; in each training iteration, the trusted entities obtained by the first entity alignment model are used to update the second training data, and the trusted entities obtained by the second entity alignment model are used to update the first training data; stop iterating when the number of iterations reaches a set maximum, or when the output of the first entity alignment model and the output of the second entity alignment model fall within a set threshold range, to obtain the final entity alignment model;
Step 2: input the knowledge graph to be aligned into the final entity alignment model obtained in Step 1 to obtain the entity alignment result.
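The iteration in Step 1 can be sketched as a short Python loop. This is only an illustration under stated assumptions: `co_train`, `fit_predict_first`, `fit_predict_second`, `max_iters` and `tol` are hypothetical names, each model is reduced to a callable that trains on a set of aligned pairs and returns its trusted pairs, and "outputs within a set threshold range" is read here as "the two outputs differ in at most `tol` pairs", which the claim does not spell out.

```python
from typing import Callable, Set, Tuple

Pair = Tuple[str, str]  # (entity from graph A, entity from graph B)
FitPredict = Callable[[Set[Pair]], Set[Pair]]  # train on pairs, return trusted pairs


def co_train(fit_predict_first: FitPredict, fit_predict_second: FitPredict,
             seed_first: Set[Pair], seed_second: Set[Pair],
             max_iters: int = 10, tol: int = 0):
    """Alternate two alignment models, exchanging trusted pairs each round."""
    data_first, data_second = set(seed_first), set(seed_second)
    iterations = 0
    for iterations in range(1, max_iters + 1):
        trusted_first = fit_predict_first(data_first)
        trusted_second = fit_predict_second(data_second)
        # Cross-update: each model's trusted pairs extend the other's training data.
        data_second |= trusted_first
        data_first |= trusted_second
        # Assumed stopping rule: the two outputs disagree on at most `tol` pairs.
        if len(trusted_first ^ trusted_second) <= tol:
            break
    return data_first, data_second, iterations


# Toy stand-ins for the two models (hypothetical, for demonstration only).
def toy_first(train: Set[Pair]) -> Set[Pair]:
    return set(train) | {("a2", "b2")}


def toy_second(train: Set[Pair]) -> Set[Pair]:
    return set(train) | {("a3", "b3")}
```

With the toy models above, `co_train(toy_first, toy_second, {("a1", "b1")}, {("a1", "b1")})` exchanges the pairs each model finds until both outputs coincide.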
2. The entity alignment method in a knowledge graph according to claim 1, characterized in that, when the method is used for entity alignment in a power grid knowledge graph, the first training data and the second training data are power-domain terminology data.
3. The entity alignment method in a knowledge graph according to claim 1, characterized in that the first training data is training data under the semantic feature view and includes a first aligned data set and a first unaligned data set; the second training data is training data under the attribute structure feature view and includes a second aligned data set and a second unaligned data set;
the first entity alignment model is an entity alignment model based on representation learning; the second entity alignment model is an entity alignment model based on attribute similarity matching.
4. The entity alignment method in a knowledge graph according to claim 3, characterized in that Step 1 specifically comprises:
training on the first aligned data set to obtain the first entity alignment model, using the first entity alignment model to predict on the first unaligned data set to obtain trusted entity pairs L′se, and adding them to the second aligned data set, thereby updating the second aligned data set;
training on the second aligned data set to obtain the second entity alignment model, using the second entity alignment model to predict on the second unaligned data set to obtain trusted entity pairs L′st, and adding them to the first aligned data set, thereby updating the first aligned data set.
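The exchange of trusted entity pairs between the two views could look roughly like the following. The claim does not say how a pair is judged "trusted", so the confidence threshold used here is an assumption, and `promote_trusted`, `predictions` and `threshold` are hypothetical names introduced only for this sketch.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (entity from graph A, entity from graph B)


def promote_trusted(predictions: Dict[Pair, float], unaligned: Set[Pair],
                    aligned_other: Set[Pair], threshold: float = 0.9):
    """Select trusted pairs from one view and add them to the other view's aligned set.

    predictions   -- model confidence for each candidate pair (assumed to be available)
    unaligned     -- this view's unaligned data set
    aligned_other -- the other view's aligned data set
    """
    # Assumption: a pair is "trusted" when its confidence reaches the threshold.
    trusted = {p for p in unaligned if predictions.get(p, 0.0) >= threshold}
    # The other view's aligned set grows by the trusted pairs; inputs are not mutated.
    return trusted, aligned_other | trusted
```

Calling this once with the first model's predictions and the second aligned data set, then once with the second model's predictions and the first aligned data set, gives one round of the two-way update.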
5. The entity alignment method in a knowledge graph according to claim 1, characterized in that, before Step 1, the method further comprises constructing the first entity alignment model, specifically:
mapping the entities and relations in the knowledge graph into a vector space to obtain the mapping vector of each entity and of each relation, and thereby the vectors h, t and r corresponding to the head entity, the tail entity and the relation between them in each triple;
constructing a loss function according to formula (1), and stopping iteration once the set maximum number of iterations is reached or the loss value no longer changes, to obtain the first entity alignment model:
[Formula (1) is not reproduced in this text.]
where (h, r, t) ∈ Δ denotes the set of all positive triples that actually exist after the knowledge graph is converted into triple form; (h′, r′, t′) ∈ Δ′ denotes the set of negative triples, not present in the knowledge graph, generated by replacing the head entity or the tail entity of a positive triple; the learning objective is the alignment relation between entities; h denotes the head entity vector, t the tail entity vector and r the relation vector; γ > 0 is the margin separating positive and negative entity pairs; the loss contains one term for the alignment relation between the head entity vector h and the tail entity vector t of a positive triple and one term for the alignment relation between the head entity vector h′ and the tail entity vector t′ of a negative triple (the symbols for these terms are not reproduced in this text);
the head entity vector, tail entity vector and relation vector corresponding to each entity are updated iteratively according to formula (2):
$$
\begin{aligned}
h_i &= h_i - 2\mu\,\lvert t_i - h_i - r_i \rvert \\
r_i &= r_i - 2\mu\,\lvert t_i - h_i - r_i \rvert \\
t_i &= t_i - 2\mu\,\lvert t_i - h_i - r_i \rvert \\
h'_i &= h'_i - 2\mu\,\lvert t'_i - h'_i - r'_i \rvert \\
r'_i &= r'_i - 2\mu\,\lvert t'_i - h'_i - r'_i \rvert \\
t'_i &= t'_i - 2\mu\,\lvert t'_i - h'_i - r'_i \rvert
\end{aligned}
\tag{2}
$$
where dim is the dimension of the vector space, h_i is the i-th component of the head entity vector h, and μ is the learning rate.
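Because formula (1) is not reproduced in this text, the sketch below assumes the common translation-based margin ranking loss, with the squared residual t − h − r as the triple score. It is not a transcription of formulas (1) or (2): `distance`, `margin_loss` and `sgd_step_positive` are hypothetical helper names, and the update shown is an ordinary gradient-descent step on the assumed score.

```python
from typing import List, Tuple

Vector = List[float]
Triple = Tuple[Vector, Vector, Vector]  # (h, r, t)


def distance(h: Vector, r: Vector, t: Vector) -> float:
    """Squared L2 norm of the translation residual t - h - r (assumed score)."""
    return sum((ti - hi - ri) ** 2 for hi, ri, ti in zip(h, r, t))


def margin_loss(pos: Triple, neg: Triple, gamma: float = 1.0) -> float:
    """Hinge loss: a positive triple should score at least gamma below a negative one."""
    return max(0.0, gamma + distance(*pos) - distance(*neg))


def sgd_step_positive(h: Vector, r: Vector, t: Vector, mu: float = 0.1) -> Triple:
    """One gradient-descent step, with learning rate mu, that reduces distance(h, r, t)."""
    residual = [ti - hi - ri for hi, ri, ti in zip(h, r, t)]
    # d(distance)/dh_i = d(distance)/dr_i = -2 * residual_i; d(distance)/dt_i = +2 * residual_i
    new_h = [hi + 2 * mu * e for hi, e in zip(h, residual)]
    new_r = [ri + 2 * mu * e for ri, e in zip(r, residual)]
    new_t = [ti - 2 * mu * e for ti, e in zip(t, residual)]
    return new_h, new_r, new_t
```

In this sketch γ plays the role of the margin described in the claim: once a negative triple is already more than γ worse than the positive one, the pair contributes zero loss.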
6. The entity alignment method in a knowledge graph according to claim 1, characterized in that, before Step 1, the method comprises constructing the second entity alignment model, specifically:
constructing a scoring function according to formula (6), scoring the entity vectors and relation vectors of all candidate entity-relation pairs with the scoring function, and taking the candidate entity of the highest-scoring candidate entity-relation pair as the aligned target entity;
$$
f_{\mathrm{predict}}(h, r, t^*) = \bigl(1 + w \times \mathrm{Sim}(h, t^*)\bigr)\,\lVert h - t^* \rVert \tag{6}
$$
where t* denotes the given entity and h denotes a candidate entity that has a cross-network entity alignment relation with t*; the scoring function scores all candidate entity-relation pairs (h, r, t*), and the candidate entity h with the highest score is the aligned target entity; ‖h − t*‖ measures the semantic similarity based on representation learning, Sim(h, t*) denotes the attribute similarity, and w is the penalty strength;
Sim(h, t*) = average(Sim(p_i))
p_i is the attribute set shared by entity h and entity t*:
Up(h, t*) = property1 ∩ property2 (3)
where property1 is the attribute set of entity h and property2 is the attribute set of entity t*;
the similarity sim(p_i) of the attribute set p_i shared by entity h and entity t* is:
[The formula for sim(p_i) is not reproduced in this text.]
where p_i corresponds to the x-th attribute p1x of entity h, with attribute value v1x, and to the y-th attribute p2y of entity t*, with attribute value v2y; lcs(v1x, v2y) is the longest common subsequence of the two attribute values.
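Formula (6) and the shared-attribute similarity can be illustrated with a small Python sketch. The formula for sim(p_i) is not reproduced in this text, so normalising the longest-common-subsequence length by the longer attribute value is an assumption made here; the function names, the dictionary representation of attributes and the default `w = 0.5` are likewise hypothetical.

```python
import math
from typing import Dict, List


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two strings (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def value_similarity(v1: str, v2: str) -> float:
    """Assumed normalisation: LCS length divided by the longer value's length."""
    if not v1 and not v2:
        return 1.0
    return lcs_length(v1, v2) / max(len(v1), len(v2))


def attribute_similarity(attrs_h: Dict[str, str], attrs_t: Dict[str, str]) -> float:
    """Average value similarity over the attributes shared by both entities."""
    shared = attrs_h.keys() & attrs_t.keys()  # property1 ∩ property2
    if not shared:
        return 0.0
    return sum(value_similarity(attrs_h[p], attrs_t[p]) for p in shared) / len(shared)


def predict_score(vec_h: List[float], vec_t: List[float],
                  attrs_h: Dict[str, str], attrs_t: Dict[str, str],
                  w: float = 0.5) -> float:
    """(1 + w * Sim(h, t*)) * ||h - t*||, following the shape of formula (6)."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_h, vec_t)))
    return (1 + w * attribute_similarity(attrs_h, attrs_t)) * dist
```

A candidate h would be scored against the given entity t* with `predict_score`, and the candidates then ranked by that score as the claim describes.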
7. An entity alignment system in a knowledge graph, characterized by comprising:
a co-training unit, configured to co-train a first entity alignment model and a second entity alignment model to obtain a final entity alignment model;
a reading unit, configured to obtain the knowledge graph to be aligned and input it into the final entity alignment model to obtain the entity alignment result.
8. The entity alignment system in a knowledge graph according to claim 7, characterized in that the co-training unit is specifically configured to: train the first entity alignment model using first training data and train the second entity alignment model using second training data; in each iteration, update the second training data with the trusted entities obtained by the first entity alignment model and update the first training data with the trusted entities obtained by the second entity alignment model; and stop iterating when the set maximum number of iterations is reached, or when the output of the first entity alignment model and the output of the second entity alignment model fall within a set threshold range, to obtain the final entity alignment model.
9. The entity alignment system in a knowledge graph according to claim 8, characterized in that, when the system is used for entity alignment in a power grid knowledge graph, the first training data and the second training data are power-domain terminology data.
10. The entity alignment system in a knowledge graph according to claim 8, characterized in that the first training data is training data under the semantic feature view and includes a first aligned data set and a first unaligned data set; the second training data is training data under the attribute structure feature view and includes a second aligned data set and a second unaligned data set; the first entity alignment model is an entity alignment model based on representation learning; the second entity alignment model is an entity alignment model based on attribute similarity matching.
11. The entity alignment system in a knowledge graph according to claim 8, characterized in that the first entity alignment model is obtained by training on the first aligned data set, the first entity alignment model is used to predict on the first unaligned data set to obtain trusted entity pairs L′se, and these are added to the second aligned data set, thereby updating the second aligned data set;
the second entity alignment model is obtained by training on the second aligned data set, the second entity alignment model is used to predict on the second unaligned data set to obtain trusted entity pairs L′st, and these are added to the first aligned data set, thereby updating the first aligned data set.
12. The entity alignment system in a knowledge graph according to claim 7, characterized in that the first entity alignment model is constructed as follows: the entities and relations in the knowledge graph are mapped into a vector space to obtain the mapping vector of each entity and of each relation, and thereby the vectors h, t and r corresponding to the head entity, the tail entity and the relation between them in each triple;
a loss function is constructed according to formula (1), and iteration stops once the set maximum number of iterations is reached or the loss value no longer changes, to obtain the first entity alignment model:
[Formula (1) is not reproduced in this text.]
where (h, r, t) ∈ Δ denotes the set of all triples that actually exist after the knowledge graph is converted into triple form; (h′, r′, t′) ∈ Δ′ denotes the set of triples, not present in the knowledge graph, generated by replacing the head entity or the tail entity of a positive triple; the learning objective is the alignment relation between entities; h denotes the head entity vector, t the tail entity vector and r the relation vector; γ > 0 is the margin separating positive and negative entity pairs; the loss contains one term for the alignment relation between the head entity vector h and the tail entity vector t of a positive triple and one term for the alignment relation between the head entity vector h′ and the tail entity vector t′ of a negative triple (the symbols for these terms are not reproduced in this text);
the head entity vector, tail entity vector and relation vector corresponding to each entity are updated iteratively according to formula (2):
$$
\begin{aligned}
h_i &= h_i - 2\mu\,\lvert t_i - h_i - r_i \rvert \\
r_i &= r_i - 2\mu\,\lvert t_i - h_i - r_i \rvert \\
t_i &= t_i - 2\mu\,\lvert t_i - h_i - r_i \rvert \\
h'_i &= h'_i - 2\mu\,\lvert t'_i - h'_i - r'_i \rvert \\
r'_i &= r'_i - 2\mu\,\lvert t'_i - h'_i - r'_i \rvert \\
t'_i &= t'_i - 2\mu\,\lvert t'_i - h'_i - r'_i \rvert
\end{aligned}
\tag{2}
$$
where dim is the dimension of the vector space, h_i is the i-th component of the head entity vector h, and μ is the learning rate.
13. The entity alignment system in a knowledge graph according to claim 7, characterized in that the second entity alignment model is constructed as follows: a scoring function is constructed according to formula (6), the entity vectors and relation vectors of all candidate entity-relation pairs are scored with the scoring function, and the candidate entity of the highest-scoring candidate entity-relation pair is taken as the aligned target entity;
$$
f_{\mathrm{predict}}(h, r, t^*) = \bigl(1 + w \times \mathrm{Sim}(h, t^*)\bigr)\,\lVert h - t^* \rVert \tag{6}
$$
where t* denotes the given entity and h denotes a candidate entity that has a cross-network entity alignment relation with t*; the scoring function scores all candidate entity-relation pairs (h, r, t*), and the candidate entity h with the highest score is the aligned target entity; ‖h − t*‖ measures the semantic similarity based on representation learning, Sim(h, t*) denotes the attribute similarity, and w is the penalty strength;
Sim(h, t*) = average(Sim(p_i))
p_i is the attribute set shared by entity h and entity t*:
Up(h, t*) = property1 ∩ property2 (3)
where property1 is the attribute set of entity h and property2 is the attribute set of entity t*;
the similarity sim(p_i) of the attribute set p_i shared by entity h and entity t* is:
[The formula for sim(p_i) is not reproduced in this text.]
where p_i corresponds to the x-th attribute p1x of entity h, with attribute value v1x, and to the y-th attribute p2y of entity t*, with attribute value v2y; lcs(v1x, v2y) is the longest common subsequence of the two attribute values.
14. An entity alignment system in a knowledge graph, characterized in that the system comprises a network interface, a memory and a processor; wherein
the network interface is configured to receive and send signals while exchanging information with other external network elements;
the memory is configured to store computer program instructions executable on the processor;
the processor is configured to perform, when running the computer program instructions, the steps of the entity alignment method in a knowledge graph according to any one of claims 1 to 6.
15. A computer storage medium, characterized in that the computer storage medium stores a program of an entity alignment method in a knowledge graph, and when the program is executed by at least one processor, the steps of the entity alignment method in a knowledge graph according to any one of claims 1 to 6 are implemented.
CN201910485558.6A 2019-06-05 2019-06-05 Entity alignment schemes, system and its storage medium in a kind of knowledge mapping Pending CN110245131A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910485558.6A CN110245131A (en) 2019-06-05 2019-06-05 Entity alignment schemes, system and its storage medium in a kind of knowledge mapping

Publications (1)

Publication Number Publication Date
CN110245131A (en) 2019-09-17

Family

ID=67886142

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910485558.6A Pending CN110245131A (en) 2019-06-05 2019-06-05 Entity alignment schemes, system and its storage medium in a kind of knowledge mapping

Country Status (1)

Country Link
CN (1) CN110245131A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108694201A (en) * 2017-04-10 2018-10-23 华为软件技术有限公司 A kind of entity alignment schemes and device
US20190019088A1 (en) * 2017-07-14 2019-01-17 Guangdong Shenma Search Technology Co., Ltd. Knowledge graph construction method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
林翠萍: "《中文人名消歧算法研究》", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
苏佳林 等: "《融合语义和结构信息的知识图谱实体对齐》", 《山西大学学报(自然科学版)》 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110941722A (en) * 2019-10-12 2020-03-31 中国人民解放军国防科技大学 Knowledge graph fusion method based on entity alignment
CN110941722B (en) * 2019-10-12 2022-07-01 中国人民解放军国防科技大学 Knowledge graph fusion method based on entity alignment
CN110765276A (en) * 2019-10-21 2020-02-07 北京明略软件系统有限公司 Entity alignment method and device in knowledge graph
CN111191462B (en) * 2019-12-30 2022-02-22 北京航空航天大学 Method and system for realizing cross-language knowledge space entity alignment based on link prediction
CN111191462A (en) * 2019-12-30 2020-05-22 北京航空航天大学 Method and system for realizing cross-language knowledge space entity alignment based on link prediction
CN111813963A (en) * 2020-09-10 2020-10-23 平安国际智慧城市科技股份有限公司 Knowledge graph construction method and device, electronic equipment and storage medium
CN112149400A (en) * 2020-09-23 2020-12-29 腾讯科技(深圳)有限公司 Data processing method, device, equipment and storage medium
CN112149400B (en) * 2020-09-23 2021-07-27 腾讯科技(深圳)有限公司 Data processing method, device, equipment and storage medium
CN112258339A (en) * 2020-09-29 2021-01-22 广东电力通信科技有限公司 Data processing and storing method and system based on power grid system technology
CN113392220A (en) * 2020-10-23 2021-09-14 腾讯科技(深圳)有限公司 Knowledge graph generation method and device, computer equipment and storage medium
CN113392220B (en) * 2020-10-23 2024-03-26 腾讯科技(深圳)有限公司 Knowledge graph generation method and device, computer equipment and storage medium
CN112445876A (en) * 2020-11-25 2021-03-05 中国科学院自动化研究所 Entity alignment method and system fusing structure, attribute and relationship information
CN112445876B (en) * 2020-11-25 2023-12-26 中国科学院自动化研究所 Entity alignment method and system for fusing structure, attribute and relationship information
CN112328710A (en) * 2020-11-26 2021-02-05 北京百度网讯科技有限公司 Entity information processing method, entity information processing device, electronic equipment and storage medium
CN112765370A (en) * 2021-03-29 2021-05-07 腾讯科技(深圳)有限公司 Entity alignment method and device of knowledge graph, computer equipment and storage medium
CN112765370B (en) * 2021-03-29 2021-07-06 腾讯科技(深圳)有限公司 Entity alignment method and device of knowledge graph, computer equipment and storage medium
WO2022242449A1 (en) * 2021-05-18 2022-11-24 腾讯科技(深圳)有限公司 Knowledge graph alignment model training method and apparatus, knowledge graph alignment method and apparatus, and device
CN115828882A (en) * 2022-09-23 2023-03-21 华能澜沧江水电股份有限公司 Entity alignment method and system for risk linkage of dam safety knowledge base
CN115828882B (en) * 2022-09-23 2023-06-16 华能澜沧江水电股份有限公司 Entity alignment method and system oriented to dam safety knowledge base risk linkage

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination