CN108875053A - Knowledge graph data processing method and device - Google Patents


Info

Publication number
CN108875053A
CN108875053A (Application CN201810688821.7A)
Authority
CN
China
Prior art keywords
entity
instance
vector
association
calculated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810688821.7A
Other languages
Chinese (zh)
Inventor
朱月梅
郑凯
段立新
江建军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guoxin Youe Data Co Ltd
Original Assignee
Guoxin Youe Data Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guoxin Youe Data Co Ltd
Priority to CN201810688821.7A
Publication of CN108875053A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

This application provides a knowledge graph data processing method and device. The scheme constructs a local subgraph corresponding to each entity and, by combining the vector set corresponding to the local subgraph, computes the entity's feature vector, so that the computed entity vector merges or embodies the vector characteristics of adjacent entities, adjusting or optimizing the entity's vector representation. The scheme improves the embedding quality of entities, so that subsequent knowledge graph construction and/or application performs better.

Description

Knowledge graph data processing method and device
Technical field
This application relates to the field of big data processing technology, and in particular to a knowledge graph data processing method and device.
Background technique
A knowledge graph, as a new form of knowledge representation and data management model, has important applications in fields such as natural language processing, question answering, and information retrieval. A knowledge graph is intended to describe the entities existing in the real world and the relationships between them. It is generally represented by triples, each consisting of a head entity, a tail entity, and a relationship; entities are interconnected through relationships, forming a net-like knowledge structure.
Entity embedding is a key technology for constructing knowledge graphs; its main purpose is to model entities and their relationships using low-dimensional vectors. The currently common entity embedding method is a lookup operation on an embedding matrix, retrieving from the original knowledge base the one-dimensional vector belonging to a specific entity. For the entity "Zhang San", for example, the retrieved one-dimensional vector corresponds to Zhang San's related information (such as birthplace and ID number).
Embedding in this way ignores the associations between entities and gives insufficient consideration to the reliability and strength of inter-entity relationships, resulting in poor embedding quality, so that subsequent knowledge graph construction and/or application is not ideal.
Summary of the invention
In view of this, the purpose of the embodiments of this application is to provide a knowledge graph data processing method and device that can fully consider the relationships between entities and improve entity embedding quality.
An embodiment of this application provides a knowledge graph data processing method that, for each entity among all or some of the entities in a knowledge graph, performs the following operations:
using the entity and at least one adjacent entity of the entity, constructing a local subgraph corresponding to the entity;
combining the original vectors representing each entity in the local subgraph to obtain the original vector set corresponding to the local subgraph;
based on the original vector set, computing the feature vector corresponding to the entity, where the feature vector can reflect the relationships between the entity and at least one other entity.
Optionally, the at least one adjacent entity is at least one entity directly connected to the entity.
Optionally, the feature vector corresponding to the entity is used to replace or update the original vector representing the entity.
Optionally, the method further includes: for at least one first entity and at least one second entity whose feature vectors have been computed, performing the following operation: using at least one first feature vector corresponding to the at least one first entity and at least one second feature vector corresponding to the at least one second entity, computing the association strength between the at least one first entity and the at least one second entity.
Optionally, the method further includes: using the computed association strength to construct or update the relationship between the at least one first entity and the at least one second entity.
Optionally, the computation of the association strength is performed by a decoder, and the decoder also uses a score function to assess the computed association strength.
Optionally, computing the feature vector corresponding to the entity based on the original vector set includes: inputting the original vector set into an encoder, and computing the feature vector using the encoder's internal parameters and weight information, where the encoder uses a multi-layer graph convolutional neural network, and the weight information reflects the known association strength between the entity and its at least one adjacent entity in the local subgraph.
Optionally, the computed association strength is compared with the known association strength of the at least one first entity and the at least one second entity, and the encoder is trained according to the comparison result to optimize the encoder's internal parameters.
An embodiment of this application also provides a knowledge graph data processing device, including:
a subgraph construction module, configured to construct the local subgraph corresponding to an entity, using the entity and at least one adjacent entity of the entity;
a set generation module, configured to combine the original vectors representing each entity in the local subgraph to obtain the original vector set corresponding to the local subgraph;
a vector computation module, configured to compute, based on the original vector set, the feature vector corresponding to the entity, the feature vector being able to reflect the relationships between the entity and at least one other entity.
Optionally, the device further includes:
an association computation module, configured to compute the association strength between at least one first entity and at least one second entity, using at least one first feature vector corresponding to the at least one first entity and at least one second feature vector corresponding to the at least one second entity.
The knowledge graph data processing method and device provided by the embodiments of this application solve the problem in the related art that the entity embedding method, by ignoring the associations between entities, yields poor embedding results and poor reliability and strength of inter-entity relationships. The method and device provided by the embodiments of this application fully consider the local graph structure of the knowledge graph: a local subgraph is constructed for an entity and its adjacent entities, and operations on the local subgraph yield the feature vector corresponding to the entity, so that the obtained feature vector can reflect the relationships between entities, improving the reliability and strength of inter-entity relationships and optimizing the embedding quality.
To make the above objects, features, and advantages of this application clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Detailed description of the invention
To illustrate the technical solutions of the embodiments of this application more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of this application and are therefore not to be construed as a limitation of scope; for those of ordinary skill in the art, other relevant drawings can also be obtained from these drawings without creative effort.
Fig. 1 shows a flowchart of a knowledge graph data processing method provided by an embodiment of this application;
Fig. 2 shows a schematic diagram of a combined encoder and decoder realizing iterative computation of feature vectors, provided by an embodiment of this application;
Fig. 3 shows a functional block diagram of a knowledge graph data processing device provided by an embodiment of this application;
Fig. 4 shows a structural schematic diagram of computer equipment provided by an embodiment of this application.
Specific embodiment
To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments of this application are described below clearly and completely with reference to the drawings in the embodiments of this application. Obviously, the described embodiments are only some, not all, of the embodiments of this application. The components of the embodiments of this application, as generally described and illustrated in the drawings herein, can be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of this application provided in the drawings is not intended to limit the claimed scope of this application, but merely represents selected embodiments of this application. All other embodiments obtained by those skilled in the art based on the embodiments of this application without creative work shall fall within the protection scope of this application.
Considering that the related art ignores the associations between entities in triple representations, resulting in poor embedding quality, an embodiment of this application provides a knowledge graph data processing method to improve the reliability and strength of inter-entity relationships, so that subsequent knowledge graph construction and/or application performs better.
As shown in Fig. 1, a flowchart of the knowledge graph data processing method provided by an embodiment of this application, the executing subject of the method may be a computer device. The method performs the following operations for each entity among all or some of the entities in a knowledge graph:
S101: using the entity and at least one adjacent entity of the entity, construct the local subgraph corresponding to the entity.
Here, the entities and their adjacent entities in the embodiments of this application may come from an original knowledge base, which can be Freebase, WordNet, YAGO, or another knowledge base. In the embodiments of this application, each entity in the original knowledge base serves as a node of the knowledge base and may have attribute information corresponding to it. A local subgraph constructed from an entity and at least one of its adjacent entities may include the connection relationships between the entity and the adjacent entities, and the constructed local subgraph corresponds to the entity. Similarly, for each adjacent entity, a local subgraph corresponding to that adjacent entity can be constructed from the adjacent entity and its own adjacent entities.
An adjacent entity may be an entity directly connected to the entity. For example, if entity A is directly connected to entity B, entity B can be called a level-1 adjacent node of entity A. An adjacent entity may also be an entity indirectly connected to the entity. For example, if entity A is directly connected to entity B, entity B is directly connected to entity C, and entity A is not directly connected to entity C, then entity A and entity C are indirectly connected through entity B; entity C can then be called a level-2 adjacent node of entity A, and so on for level-3 and level-4 adjacent nodes, etc.
In a specific operation, the adjacent nodes of an entity can be determined by a set adjacency level. For example, if the adjacency level is set to 1, the local subgraph is constructed only from the entity and the adjacent nodes directly adjacent to it, which simplifies computation; if the adjacency level is set to 2, the local subgraph includes the entity, the adjacent nodes directly adjacent to the entity, and the entity's level-2 adjacent nodes, and so on.
For a given entity, multiple local subgraphs may also be constructed, e.g., a local subgraph with adjacency level 1, one with adjacency level 2, one with adjacency level 3, etc. Local subgraphs can also be categorized by adjacency level: for a given entity, construct a local subgraph containing only the entity and its directly adjacent nodes, one containing only the entity and its level-2 adjacent nodes, one containing only the entity and its level-3 adjacent nodes, and so on.
There are many ways to construct a local subgraph; the construction method is not limited here.
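The level-based construction described above can be sketched as a breadth-first walk that stops at the chosen adjacency level. The sketch below is illustrative only (the patent does not prescribe a construction algorithm); the function and variable names are hypothetical, and the subgraph is represented simply as the set of entities it contains.

```python
from collections import deque

def local_subgraph(adjacency, entity, max_level=1):
    """Collect the entities of a local subgraph around `entity`.

    `adjacency` maps each entity to the set of entities it is directly
    connected to; `max_level` is the adjacency level described above
    (1 keeps only direct neighbours, 2 also keeps neighbours of
    neighbours, and so on). Returns the set of entities in the
    subgraph, including `entity` itself.
    """
    seen = {entity}
    frontier = deque([(entity, 0)])
    while frontier:
        node, level = frontier.popleft()
        if level == max_level:
            continue  # do not expand past the chosen adjacency level
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, level + 1))
    return seen
```

With adjacency level 1 only direct neighbours are kept; raising the level to 2 additionally pulls in neighbours of neighbours, matching the level-2 adjacent nodes described above.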
Here, in the embodiments of this application, an entity can be characterized by an entity vector. Since the entity may be described in written form in the original knowledge base, the originally collected data usually needs to be converted into a vector representation to facilitate computer processing, i.e., entities are encoded into a vector space, so that each entity is represented by a vector of that space. For the initial vectorized representation of the originally collected entities, i.e., mapping entities into the vector space, a common method or model can be chosen, such as an existing semantic mapping method, which is not restricted here.
The embodiments of this application do not restrict the order between completing the initial vector-space mapping of entities and constructing the entities' local subgraphs. For example, after obtaining the original data, the initial vectorization of entities can be performed first, and the corresponding local subgraphs constructed afterwards (based on known triples and/or the entities' positions in the vector space); alternatively, local subgraphs can be constructed for the nodes first (the relationships between nodes can be determined based on known triples), and the entities then mapped into the vector space.
Precisely because the current vector mapping of entities cannot sufficiently reflect the associations between entities, the embodiments of this application, by constructing the local subgraph corresponding to an entity, perform one or more rounds of iterative computation, so that the computed entity vector can merge or embody the vector characteristics of the adjacent entities, thereby optimizing the entity's original vector representation.
S102, combination indicate each former vector of each entity at least one Local Subgraphs, obtain at least one part Scheme corresponding former vector set.
As previously described, entity is indicated by vector, here by combination indicate Local Subgraphs in each entity it is each A original vector, obtains former vector set corresponding to Local Subgraphs, provides basis for the calculating of next step.
Former vector herein can be and map obtained initial vector by existing vector space, be also possible to by The expression vector of corresponding entity obtained by last round of interative computation.
Here, each former vector that each entity is indicated in Local Subgraphs is combined, is can be obtained and part Scheme corresponding former vector set.When the quantity of Local Subgraphs is multiple, it can choose all or part of Local Subgraphs, it is right In selected Local Subgraphs, the former vector of presentation-entity is combined, forms former vector set.
S103: based on the original vector set, compute the feature vector corresponding to the entity; the feature vector can reflect the relationships between the entity and at least one other entity.
Here, through the local subgraph, the entity and its adjacent entities are combined, and the resulting original vector set is used for computation. The feature vector thus obtained with reference to the local structure of the knowledge graph can reflect the relationships between entities, improving the reliability and strength of inter-entity relationships.
In an application embodiment, the above process of computing the entity's feature vector from the original vector set may be a cyclic iterative process: the feature vector computed for an entity in the current round can serve as that entity's original vector in the next round, and the next round's feature vector computation is then carried out based on that original vector. In a specific application, this iterative process can be implemented with an encoder and a decoder.
As shown in Fig. 2, the encoder in this embodiment can first receive the original vector sets of multiple entities and, based on the current weight information used in this round of iteration and the encoder's internal parameters, encode each original vector set into the feature vector corresponding to its entity. The multiple feature vectors can then be input to the decoder, which determines the association strengths between the multiple entities based on the similarities between the feature vectors. Finally, the weight information and the encoder's internal parameters can be adjusted according to the comparison between the determined association strengths and the known association strengths, and the adjusted weight information and parameters are fed back to the encoder for the next round of iteration, and so on.
In the knowledge graph data processing method provided by the embodiments of this application, the specific working process of the encoder is as follows:
The embodiments of this application may use the encoder to compute the target feature vector, where the encoder uses a multi-layer graph convolutional neural network. The original vector set corresponding to the local subgraph serves as the encoder's input.
The feature vector can be calculated using the following formula:

h_i^{(l+1)} = f( Σ_{j ∈ Neighbor(i)} p_{ij} · W^{(l)} · h_j^{(l)} )   (1)

where h_i^{(l)} denotes the input feature of layer l of the encoder, f(·) is a nonlinear activation function analogous to ReLU, W^{(l)} is the linear transformation matrix shared by all entities at layer l of the neural network (i.e., the encoder's internal parameter), and p_{ij} denotes the weight information measuring the association strength between entity e_i and entity e_j; Neighbor(i) refers to the set of all adjacent entities of entity e_i. The number of encoder layers is not limited here and can be set or adjusted as needed. After the computation of the encoder's last layer, the output result becomes the target feature vector.
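A minimal sketch of one encoder layer following formula (1), written in plain Python for readability (a real implementation would use a tensor library); the names are hypothetical and ReLU stands in for the unspecified activation f(·):

```python
def relu(v):
    """Elementwise ReLU, standing in for the activation f(.)."""
    return [max(0.0, x) for x in v]

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gcn_layer(h, W, p, neighbours, activation=relu):
    """One encoder layer following formula (1):
    h_i^(l+1) = f( sum over j in Neighbor(i) of p_ij * W^(l) * h_j^(l) ).

    `h` maps entity -> current feature vector, `W` is the layer's shared
    linear transformation, `p` maps (i, j) -> association weight, and
    `neighbours` maps entity -> list of adjacent entities.
    """
    out = {}
    dim = len(W)
    for i, adj in neighbours.items():
        acc = [0.0] * dim
        for j in adj:
            t = matvec(W, h[j])
            acc = [a + p[(i, j)] * x for a, x in zip(acc, t)]
        out[i] = activation(acc)
    return out
```

Stacking several such layers, with the last layer's output taken as the target feature vector, mirrors the multi-layer structure described above.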
The weight information is, in the weighted graph, the weighting of each entity with respect to its adjacent entities, defined as follows:

p_{ij} = σ(p_{ij}^{(l)′}) / Σ_{k ∈ Neighbor(i)} σ(p_{ik}^{(l)′})   (2)

where p_{ij}^{(l)′} refers to the raw weight between entity e_i and adjacent entity e_j, σ(·) refers to the sigmoid activation function used to obtain a probability variable, and p_{ij} refers to the aggregate weight value after normalization through the σ(·) function. For the first computation, initial values can be set, e.g., a uniformly distributed weighting: for the 5 adjacent entities of entity A, each adjacent entity is assigned the same weight, representing identical association strengths.
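The normalization in formula (2) can be sketched as follows; the dictionary layout for the raw weights p'_ij is an assumption for illustration:

```python
import math

def sigmoid(x):
    """The sigma(.) activation used to obtain a probability variable."""
    return 1.0 / (1.0 + math.exp(-x))

def neighbour_weights(raw, neighbours, i):
    """Compute p_ij for every neighbour j of entity i, per formula (2):
    each raw weight p'_ij is squashed with the sigmoid and then
    normalised over all neighbours of i, so the weights sum to 1.
    Equal raw weights yield the uniform initial distribution
    described above.
    """
    scores = {j: sigmoid(raw[(i, j)]) for j in neighbours[i]}
    total = sum(scores.values())
    return {j: s / total for j, s in scores.items()}
```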
For the above formula (1), considering that an excessively large W^{(l)} would weaken the influence of the weight information, the embodiments of this application may also constrain the length of W^{(l)}'s output using either of two activation functions, L2 regularization or the Squashing function, to avoid the phenomenon of the weight information being diminished. When the number of adjacent entities corresponding to an entity is small, the L2 regularization constraint can be chosen; when the number of adjacent entities corresponding to an entity is large, the Squashing function constraint can be chosen.
The L2 regularization constraint may be defined as follows:

g(h) = h / ||h||₂   (3)

The Squashing function constraint may be defined as follows:

g(h) = (||h||² / (1 + ||h||²)) · (h / ||h||)   (4)

In this way, substituting formula (3) or formula (4) into formula (1) yields the updated feature vector, expressed as:

h_i^{(l+1)} = g( f( Σ_{j ∈ Neighbor(i)} p_{ij} · W^{(l)} · h_j^{(l)} ) )   (5)

where the function g(·) denotes the L2 regularization constraint or the Squashing function constraint.
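The two constraints in formulas (3) and (4) can be sketched directly; applied to a layer's output vector, the L2 form fixes the length to 1, while the squashing form keeps the direction but maps the length into (0, 1):

```python
import math

def l2_normalise(v):
    """L2 constraint, formula (3): scale the vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return list(v) if n == 0 else [x / n for x in v]

def squash(v):
    """Squashing constraint, formula (4): keep the direction of v but
    map its length into (0, 1), with short vectors shrinking toward 0
    and long vectors approaching length 1."""
    sq = sum(x * x for x in v)
    n = math.sqrt(sq)
    if n == 0:
        return list(v)
    scale = sq / (1.0 + sq)
    return [scale * x / n for x in v]
```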
As it can be seen that using the encoding function of encoder the original vector of certain entity can be converted to features described above vector to Representation is measured, influence of the adjacent entities to the entity of the entity can be merged or embody, is also based on operation or more wheels Interative computation, the vector for advanced optimizing original vector indicate.
It, can be by the corresponding feature of the entity after the corresponding feature vector of entity is calculated in the embodiment of the present application Vector replacement or the former vector for updating presentation-entity, in this way, the corresponding former vector set of Local Subgraphs also changes therewith, and Based on former vector set, the corresponding feature vector of entity is calculated can also change therewith., it is understood that for more Secondary interative computation, after carrying out vector replacement or updating, the feature vector change of epicycle entity should when constituting next round operation The former vector of entity, and so on, i.e., by way of successive ignition, until the obtained corresponding feature vector of entity meet it is pre- If can also be that the strength of association between multiple entities reaches scoring it is required that the preset requirement, which can be to reach, analogizes number The assessed value of function can also be other preset requirements.
The association strengths of multiple entities can be computed by the decoder: the computed feature vectors representing the entities are input to the decoder, which performs the reverse operation to solve for the association strengths between entities; the computed association strengths can update the aforementioned weight information. Computing the relationships between entities from the feature vectors representing them can be done by existing methods, which are not restricted here.
For example: based on the local subgraphs, at least one first feature vector corresponding to at least one first entity and at least one second feature vector corresponding to at least one second entity are computed; based on the first and second feature vectors, the decoder computes the association strength between the first entity and the second entity, determining the relationship between the two. The association strength can express the relationship between entities, e.g., the larger the association strength, the closer the relationship or the more numerous the connections between the entities.
For example, for the two entities James Harden and Stephen Curry, suppose the originally collected information does not specify the relationship between them, or the specified relationship is incorrect. Local subgraphs are constructed for each of the two, the feature vectors representing the two entities are obtained and iteratively updated, and the weight information between the two is computed from the resulting feature vectors, so that the relationship and relationship strength between them can be determined. Further, based on the association strength determined above, the relationships between the entities of the local subgraphs can be constructed or updated.
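The patent leaves the decoder's similarity computation to existing methods; one common choice, shown here only as an assumed example, scores a pair of feature vectors by a sigmoid of their inner product, yielding an association strength in (0, 1):

```python
import math

def association_strength(u, v):
    """Decoder sketch: score a pair of feature vectors by their inner
    product and squash it with the sigmoid. Similar (aligned) vectors
    give a strength above 0.5; orthogonal vectors give exactly 0.5.
    The dot-product form is an assumption, not fixed by the patent."""
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 / (1.0 + math.exp(-dot))
```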
In implementation, the decoder can use a score function to assess the computed association strengths and feed the assessment result back to the encoder, realizing the training of the encoder and adjusting the encoder's internal parameter W^{(l)}. This can be realized by comparing the association strengths computed from the feature vectors with the known association strengths and adjusting the encoder's internal parameters according to the comparison result. For example, if the original information shows a strong association between a first entity and a second entity (for instance, the first entity and the second entity are chosen as each other's adjacent-entity inputs), and after computation by the encoder and decoder the recomputed association strength differs greatly from the known information (at least, that the two are known neighbors of each other), the result is fed back to the encoder and the parameters are adjusted. That is, the embodiments of this application can also find optimized internal parameters that make the comparison of the two association strengths as close as possible.
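The comparison between recomputed and known association strengths can be sketched as a simple squared-error objective over entity pairs; the squared-error form is an assumption, since the patent only requires that the two strengths be compared and brought as close as possible:

```python
def association_loss(predicted, known):
    """Compare recomputed association strengths with the known ones as
    a squared-error sum over entity pairs. A large value signals that
    the encoder's internal parameters W^(l) (and the weights p_ij)
    should be adjusted in the next training round."""
    return sum((predicted[pair] - known[pair]) ** 2 for pair in known)
```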
Based on the same inventive concept, an embodiment of this application provides a knowledge graph data processing device corresponding to the knowledge graph data processing method. Since the principle by which the device in the embodiments of this application solves the problem is similar to the above knowledge graph data processing method of the embodiments of this application, the implementation of the device may refer to the implementation of the method, and repeated descriptions are omitted.
As shown in Fig. 3, the structural schematic diagram of the knowledge graph data processing device provided by an embodiment of this application, the device specifically includes:
a subgraph construction module 301, configured to construct at least one local subgraph corresponding to an entity, using the entity and at least one adjacent entity of the entity;
a set generation module 302, configured to combine the original vectors representing each entity in the at least one local subgraph to obtain the original vector set corresponding to the at least one local subgraph;
a vector computation module 303, configured to compute, based on the original vector set, the feature vector corresponding to the entity, the feature vector being able to reflect the relationships between the entity and at least one other entity.
The at least one adjacent entity is at least one entity directly connected to the entity.
In one embodiment, the set generation module 302 is further configured to replace or update the original vector representing the entity with the feature vector corresponding to the entity.
In another embodiment, the above knowledge graph data processing device further includes:
an association strength computation module 304, configured to compute the association strength between at least one first entity and at least one second entity, using at least one first feature vector corresponding to the at least one first entity and at least one second feature vector corresponding to the at least one second entity.
In yet another embodiment, the above knowledge graph data processing device further includes:
a relationship update module 305, configured to construct or update the relationship between the at least one first entity and the at least one second entity using the computed association strength.
The computation of the association strength is performed by a decoder, and the decoder also uses a score function to assess the computed association strength.
In another embodiment, the vector computation module 303 is specifically configured to:
input the original vector set into an encoder, and compute the feature vector using the encoder's internal parameters and weight information, where the encoder uses a multi-layer graph convolutional neural network, and the weight information reflects the known association strength between the entity and its at least one adjacent entity in the local subgraph.
In another embodiment, the above knowledge graph data processing device further includes:
a parameter optimization module 306, configured to compare the computed association strength with the known association strength of the at least one first entity and the at least one second entity, train the encoder according to the comparison result, and optimize the encoder's internal parameters.
As shown in figure 4, for the schematic device of computer equipment provided by the embodiment of the present application, the computer equipment packet It includes:Processor 401, memory 402 and bus 403, the storage of memory 402 execute instruction, when the device is running, processor 401 It is communicated between memory 402 by bus 403, what is stored in the execution memory 402 of processor 401 executes instruction as follows:
Using at least one of entity and entity adjacent entities, at least one corresponding Local Subgraphs of entity are constructed;
Combination indicates each former vector of each entity at least one Local Subgraphs, and it is corresponding to obtain at least one Local Subgraphs Former vector set;
Based on former vector set, the corresponding feature vector of entity is calculated, feature vector is able to reflect entity and other Relationship between at least one entity.
Wherein, at least one adjacent entities is at least one entity being connected directly with entity.
In one embodiment, it in the processing that above-mentioned processor 401 executes, is replaced using the corresponding feature vector of entity Or update the former vector of presentation-entity.
In another embodiment, in the processing that above-mentioned processor 401 executes, for having calculated that feature vector extremely A few first instance and at least one second instance, perform the following operations:It is corresponding at least using at least one first instance One first eigenvector at least one second feature vector corresponding at least one second instance, calculate at least one first Strength of association between entity and at least one second instance.
In yet another embodiment, the processing executed by the processor 401 further includes: using the calculated association strength to construct or update the relationship between the at least one first entity and the at least one second entity.
Here, the calculation of the association strength is performed by a decoder, and the decoder also evaluates the calculated association strength using a score function.
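The application does not fix a particular score function, so the decoder sketch below assumes a sigmoid-squashed dot product, a common choice in knowledge graph embedding; the two feature vectors are made-up values:

```python
import numpy as np

def association_strength(h_first, h_second):
    """Decoder sketch: map two feature vectors to an association strength
    in (0, 1) via a sigmoid over their dot product (assumed score function)."""
    return 1.0 / (1.0 + np.exp(-(h_first @ h_second)))

h_first = np.array([0.2, 0.8, 0.1])   # feature vector of a first entity
h_second = np.array([0.5, 0.6, 0.9])  # feature vector of a second entity
s = association_strength(h_first, h_second)
# A strength above some threshold could then be used to construct or
# update the relation between the first and second entity.
print(s)
```

Any differentiable score function (e.g. a bilinear form) could play the same role; the sigmoid variant simply keeps the output interpretable as a strength between 0 and 1.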
In another embodiment, in the processing executed by the processor 401, computing the feature vector corresponding to the entity based on the original vector set includes: inputting the original vector set into an encoder and generating the feature vector using the encoder's internal parameters and weight information. The encoder uses a multilayer graph convolutional neural network, and the weight information reflects the known association strength between the entity and its at least one adjacent entity in the local subgraph.
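A minimal numerical sketch of such an encoder, assuming two graph-convolution layers with ReLU activations and a row-normalized weighted adjacency matrix whose entries stand in for the known association strengths; the dimensions and all weight values are invented for illustration:

```python
import numpy as np

def gcn_layer(H, A_w, W):
    """One graph-convolution layer: H' = ReLU(D^-1 (A_w + I) H W),
    where A_w carries the known association strengths as edge weights."""
    A_hat = A_w + np.eye(A_w.shape[0])        # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # normalise by weighted degree
    return np.maximum(0.0, D_inv @ A_hat @ H @ W)

rng = np.random.default_rng(0)
H0 = rng.random((3, 4))         # original vectors of 3 subgraph entities
A_w = np.array([[0.0, 0.9, 0.3],  # known association strengths (assumed)
                [0.9, 0.0, 0.0],
                [0.3, 0.0, 0.0]])
W1 = rng.random((4, 4))         # internal parameters of layer 1
W2 = rng.random((4, 4))         # internal parameters of layer 2
H2 = gcn_layer(gcn_layer(H0, A_w, W1), A_w, W2)  # two-layer encoder pass
print(H2.shape)  # (3, 4)
```

Each row of `H2` is a feature vector that already blends in the vectors of the entity's neighbours, weighted by how strongly they are known to be associated, which is exactly the "merging adjacent entities' vector characteristics" effect the scheme aims for.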
In another embodiment, the processing executed by the processor 401 further includes: comparing the calculated association strength with the known association strength between the at least one first entity and the at least one second entity, training the encoder according to the comparison result, and optimizing the encoder's internal parameters.
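The compare-and-train step can be illustrated on a single scalar parameter. The squared-error loss, the learning rate, and all numbers below are assumptions for the sketch; a real encoder would backpropagate the comparison error through the whole graph convolutional network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Dot products of (first entity, second entity) feature-vector pairs, and the
# known association strengths the model should reproduce (assumed values).
dots = np.array([1.2, -0.4, 2.0])
known = np.array([0.9, 0.1, 0.95])

scale = 0.5  # a stand-in for one trainable internal parameter
init_loss = np.sum((sigmoid(scale * dots) - known) ** 2)
for _ in range(300):  # gradient descent on the comparison error
    pred = sigmoid(scale * dots)
    grad = np.sum(2.0 * (pred - known) * pred * (1.0 - pred) * dots)
    scale -= 0.5 * grad
final_loss = np.sum((sigmoid(scale * dots) - known) ** 2)
print(f"comparison loss: {init_loss:.3f} -> {final_loss:.3f}")
```

The loop makes the calculated strengths track the known ones; in the full scheme the same comparison signal tunes the encoder's internal parameters rather than one scalar.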
An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by the processor 401, the steps of the above knowledge graph data processing method are executed.
Specifically, the storage medium may be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, the above knowledge graph data processing method can be executed. This addresses the problem in the related art that entity embedding methods ignore the connections between entities in triple representations, leading to poor reliability and strength of the relationships between entities, and thereby improves the reliability and strength of those relationships.
Next, the application effects of the knowledge graph data processing method and/or device provided by the embodiments of the present application are illustrated with an example.
As shown in Table 1, data from four original knowledge bases can be used as data sets. The FB15K data set provides world knowledge based on the Freebase knowledge base, such as film and sports knowledge; the WN18 data set comes from the WordNet knowledge base, which offers dictionaries and thesauri and mainly provides lexical semantic knowledge; the YAGO3 data set is based on YAGO and mainly provides knowledge about attributes of people. In the table, |E| denotes the number of entities, |R| the number of relations, #Train the number of training samples, and #Test the number of test samples.
Table 1: Statistics of the data sets in the example
Based on the above data sets, the knowledge graph data processing method provided by the embodiments of the present application is compared with common knowledge graph embedding models in the prior art. Tables 2 to 4 show the experimental results on the FB15K, WN18, and YAGO3 data sets, respectively, where MR (mean rank), MRR (mean reciprocal rank), and Hits@k (k ∈ {1, 3, 10}) are the evaluation metrics. MR denotes the average rank of the correct entity, MRR the mean reciprocal rank, and Hits@k the proportion of original triples ranked within the top k (k = 1, 3, or 10). The experimental results show that the embodiments of the present application achieve better entity embedding, with better reliability and strength of the relationships between entities.
Table 2: Experimental results on the FB15K data set
Table 3: Experimental results on the WN18 data set
Table 4: Experimental results on the YAGO3 data set
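The three evaluation metrics above can be computed directly from the ranks of the correct entities; a small self-contained sketch with made-up ranks (the rank values are illustrative, not the application's experimental data):

```python
import numpy as np

def mr_mrr_hits(ranks, ks=(1, 3, 10)):
    """MR, MRR, and Hits@k from the ranks of the correct entities."""
    ranks = np.asarray(ranks, dtype=float)
    mr = ranks.mean()                  # mean rank: lower is better
    mrr = (1.0 / ranks).mean()         # mean reciprocal rank: higher is better
    hits = {k: float((ranks <= k).mean()) for k in ks}  # share in top k
    return mr, mrr, hits

# Assumed ranks of the correct entity over five test triples.
mr, mrr, hits = mr_mrr_hits([1, 2, 4, 10, 50])
print(mr, round(mrr, 3), hits[10])  # 13.4 0.374 0.8
```

Note how MR is dominated by the single badly ranked triple (rank 50) while MRR and Hits@k are not, which is why the tables report all three.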
In the embodiments provided by the present application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative; for example, the division of units is only a logical functional division, and there may be other divisions in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connections shown or discussed may be indirect coupling or communication connections through some communication interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, the functional units in the embodiments provided by the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It should be noted that similar reference numerals and letters denote similar items in the following figures; therefore, once an item is defined in one figure, it need not be further defined or explained in subsequent figures. In addition, the terms "first", "second", "third", etc. are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
Finally, it should be noted that the embodiments described above are merely specific embodiments of the present application, intended to illustrate rather than limit its technical solution, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the technical field may, within the technical scope disclosed by the present application, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some technical features. Such modifications, variations, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be covered within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A knowledge graph data processing method, characterized in that the following operations are performed for each entity among all or part of the entities in a knowledge graph:
constructing at least one local subgraph corresponding to the entity using the entity and at least one adjacent entity of the entity;
combining the original vectors representing each entity in the at least one local subgraph to obtain an original vector set corresponding to the at least one local subgraph;
computing, based on the original vector set, a feature vector corresponding to the entity, the feature vector being able to reflect the relationship between the entity and at least one other entity.
2. The method according to claim 1, characterized in that the at least one adjacent entity is at least one entity directly connected to the entity.
3. The method according to claim 1, characterized in that the original vector representing the entity is replaced or updated with the feature vector corresponding to the entity.
4. The method according to claim 1, characterized by further comprising: for at least one first entity and at least one second entity whose feature vectors have been calculated, performing the following operations: calculating the association strength between the at least one first entity and the at least one second entity using at least one first feature vector corresponding to the at least one first entity and at least one second feature vector corresponding to the at least one second entity.
5. The method according to claim 4, characterized by further comprising: constructing or updating the relationship between the at least one first entity and the at least one second entity using the calculated association strength.
6. The method according to claim 4, characterized in that the calculation of the association strength is performed by a decoder, and the decoder also evaluates the calculation result of the association strength using a score function.
7. The method according to any one of claims 4 to 6, characterized in that computing the feature vector corresponding to the entity based on the original vector set comprises: inputting the original vector set into an encoder and generating the feature vector using the encoder's internal parameters and weight information, the encoder using a multilayer graph convolutional neural network, and the weight information reflecting the known association strength between the entity and the at least one adjacent entity of the entity in the local subgraph.
8. The method according to claim 7, characterized in that the calculated association strength is compared with the known association strength between the at least one first entity and the at least one second entity, the encoder is trained according to the comparison result, and the internal parameters of the encoder are optimized.
9. A knowledge graph data processing device, characterized by comprising:
a subgraph construction module, configured to construct at least one local subgraph corresponding to an entity using the entity and at least one adjacent entity of the entity;
a set generation module, configured to combine the original vectors representing each entity in the at least one local subgraph to obtain an original vector set corresponding to the at least one local subgraph;
a vector calculation module, configured to compute, based on the original vector set, a feature vector corresponding to the entity, the feature vector being able to reflect the relationship between the entity and at least one other entity.
10. The device according to claim 9, characterized by further comprising:
an association calculation module, configured to calculate the association strength between at least one first entity and at least one second entity using at least one first feature vector corresponding to the at least one first entity and at least one second feature vector corresponding to the at least one second entity.
CN201810688821.7A 2018-06-28 2018-06-28 A kind of knowledge mapping data processing method and device Pending CN108875053A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810688821.7A CN108875053A (en) 2018-06-28 2018-06-28 A kind of knowledge mapping data processing method and device


Publications (1)

Publication Number Publication Date
CN108875053A true CN108875053A (en) 2018-11-23

Family

ID=64296552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810688821.7A Pending CN108875053A (en) 2018-06-28 2018-06-28 A kind of knowledge mapping data processing method and device

Country Status (1)

Country Link
CN (1) CN108875053A (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109508419A (en) * 2018-11-23 2019-03-22 成都品果科技有限公司 A kind of recommended method and system of knowledge based study
WO2020107929A1 (en) * 2018-11-26 2020-06-04 厦门市美亚柏科信息股份有限公司 Method and terminal for obtaining associated information
CN109582802A (en) * 2018-11-30 2019-04-05 国信优易数据有限公司 A kind of entity embedding grammar, device, medium and equipment
CN109582802B (en) * 2018-11-30 2020-11-03 国信优易数据股份有限公司 Entity embedding method, device, medium and equipment
CN109978060A (en) * 2019-03-28 2019-07-05 科大讯飞华南人工智能研究院(广州)有限公司 A kind of training method and device of natural language element extraction model
CN109978060B (en) * 2019-03-28 2021-10-22 科大讯飞华南人工智能研究院(广州)有限公司 Training method and device of natural language element extraction model
CN110147414A (en) * 2019-05-23 2019-08-20 北京金山数字娱乐科技有限公司 Entity characterization method and device of knowledge graph
CN110659723A (en) * 2019-09-03 2020-01-07 腾讯科技(深圳)有限公司 Data processing method, device, medium and electronic equipment based on artificial intelligence
CN110659723B (en) * 2019-09-03 2023-09-19 腾讯科技(深圳)有限公司 Data processing method and device based on artificial intelligence, medium and electronic equipment
WO2021056770A1 (en) * 2019-09-27 2021-04-01 深圳市商汤科技有限公司 Image reconstruction method and apparatus, electronic device, and storage medium
CN112015792A (en) * 2019-12-11 2020-12-01 天津泰凡科技有限公司 Material duplicate code analysis method and device and computer storage medium
CN112015792B (en) * 2019-12-11 2023-12-01 天津泰凡科技有限公司 Material repeated code analysis method and device and computer storage medium
CN111934937A (en) * 2020-09-14 2020-11-13 中国人民解放军国防科技大学 Dependent network node importance degree evaluation method and device based on importance iteration
CN112487201A (en) * 2020-11-26 2021-03-12 西北工业大学 Knowledge graph representation method using shared parameter convolutional neural network
CN112487201B (en) * 2020-11-26 2022-05-10 西北工业大学 Knowledge graph representation method using shared parameter convolutional neural network
CN114064926A (en) * 2021-11-24 2022-02-18 国家电网有限公司大数据中心 Multi-modal power knowledge graph construction method, device, equipment and storage medium
CN114493856A (en) * 2022-04-11 2022-05-13 支付宝(杭州)信息技术有限公司 Method, system, apparatus and medium for processing data

Similar Documents

Publication Publication Date Title
CN108875053A (en) A kind of knowledge mapping data processing method and device
CN110263227B (en) Group partner discovery method and system based on graph neural network
EP3862893A1 (en) Recommendation model training method, recommendation method, device, and computer-readable medium
Zhang et al. Pasca: A graph neural architecture search system under the scalable paradigm
CN114048331A (en) Knowledge graph recommendation method and system based on improved KGAT model
CN106960219A (en) Image identification method and device, computer equipment and computer-readable medium
CN116261731A (en) Relation learning method and system based on multi-hop attention-seeking neural network
CN109313720A (en) The strength neural network of external memory with sparse access
CN112861936B (en) Graph node classification method and device based on graph neural network knowledge distillation
WO2022228425A1 (en) Model training method and apparatus
US20200074296A1 (en) Learning to search deep network architectures
CN112905801A (en) Event map-based travel prediction method, system, device and storage medium
AU2021236553A1 (en) Graph neural networks for datasets with heterophily
CN111931067A (en) Interest point recommendation method, device, equipment and medium
CN108804473A (en) The method, apparatus and Database Systems of data query
CN116664719B (en) Image redrawing model training method, image redrawing method and device
Bharti et al. EMOCGAN: a novel evolutionary multiobjective cyclic generative adversarial network and its application to unpaired image translation
KR20230095796A (en) Joint personalized search and recommendation with hypergraph convolutional networks
CN116681104B (en) Model building and realizing method of distributed space diagram neural network
CN112580733A (en) Method, device and equipment for training classification model and storage medium
CN114240555A (en) Click rate prediction model training method and device and click rate prediction method and device
CN113902010A (en) Training method of classification model, image classification method, device, equipment and medium
CN112131261A (en) Community query method and device based on community network and computer equipment
WO2022100607A1 (en) Method for determining neural network structure and apparatus thereof
CN114358257A (en) Neural network pruning method and device, readable medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 101-8, 1st floor, building 31, area 1, 188 South Fourth Ring Road West, Fengtai District, Beijing

Applicant after: Guoxin Youyi Data Co., Ltd

Address before: 100070, No. 188, building 31, headquarters square, South Fourth Ring Road West, Fengtai District, Beijing

Applicant before: SIC YOUE DATA Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20181123