CN114691877A - Ontology alignment method, apparatus, device and storage medium - Google Patents

Info

Publication number
CN114691877A
Authority
CN
China
Prior art keywords
entities
alignment
different
entity
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011622145.7A
Other languages
Chinese (zh)
Inventor
葛婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN202011622145.7A priority Critical patent/CN114691877A/en
Publication of CN114691877A publication Critical patent/CN114691877A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The application provides an ontology alignment method, apparatus, device and storage medium. The ontology alignment method comprises: aligning entities from different knowledge graphs according to attribute information of the entities, and determining the entities so aligned as aligned entities; associating the aligned entities with the ontologies in the different knowledge graphs according to the association relationships between entities and ontologies recorded in each knowledge graph, to obtain the association relationships between the aligned entities and the different ontologies; and aligning different ontologies associated with the same aligned entity according to the association relationships between the aligned entities and the different ontologies. With this scheme, ontology alignment is based on the association relationships of entities. Compared with the conventional approach of aligning graphs by computing similarity over various text information, it can find alignments whose text expressions differ greatly and that are hard to find from text similarity.

Description

Ontology alignment method, apparatus, device and storage medium
Technical Field
The present application relates to the field of knowledge graph technology, and in particular, to a method, an apparatus, a device, and a storage medium for ontology alignment.
Background
An ontology is an explicit formal specification of a shared conceptual model, and each ontology is a unique individual that actually exists. Ontologies are widely applied in fields such as the Semantic Web, knowledge and data engineering, and electronic commerce.
Because knowledge engineers with different backgrounds construct and maintain ontologies in similar or identical domains, content heterogeneity exists between different knowledge systems (e.g., databases and knowledge graphs). To share, reuse and interoperate knowledge across different knowledge systems, it is usually necessary to align the same ontologies across those systems, that is, to merge the content described by the same ontologies.
Current ontology alignment methods align ontologies in different knowledge systems mainly by synonym table matching or text similarity calculation on ontology names. In practice, however, the text expressions used for the same ontology in different knowledge systems may differ greatly, so synonym table matching or text similarity calculation cannot align them. Conversely, ontologies that share a name but do not represent the same ontology may be wrongly aligned by these methods, causing misjudgment. Misjudgment in ontology alignment introduces errors into the aligned knowledge graph and can further reduce accuracy in knowledge graph applications such as machine cognition, machine learning and content recommendation.
Therefore, there is a need for a highly accurate ontology alignment scheme.
Disclosure of Invention
An object of the embodiments of the present application is to provide an ontology alignment method, apparatus, device and storage medium, so as to solve the problem of low accuracy in current ontology alignment.
In order to solve the above technical problem, an embodiment of the present application provides the following technical solutions:
A first aspect of the present application provides an ontology alignment method, including:
aligning entities from different knowledge graphs according to attribute information of the entities, and determining the entities so aligned as aligned entities;
associating the aligned entities with the ontologies in the different knowledge graphs according to the association relationships between entities and ontologies recorded in each knowledge graph, to obtain the association relationships between the aligned entities and the different ontologies;
and aligning different ontologies associated with the same aligned entity according to the association relationships between the aligned entities and the different ontologies.
In some variations of the first aspect of the present application, aligning different ontologies associated with the same aligned entity according to the association relationships between the aligned entities and the different ontologies includes:
determining an aligned entity proportion for each ontology according to the association relationships between the aligned entities and the different ontologies, wherein the aligned entity proportion includes the ratio of the number of aligned entities to the number of all entities, the number of aligned entities being the number of aligned entities associated with the ontology, and the number of all entities being the number of all entities associated with the ontology;
and if the aligned entity proportions of the different ontologies associated with the same aligned entity are all greater than a preset proportion threshold, aligning the different ontologies.
In some variations of the first aspect of the present application, aligning different ontologies associated with the same aligned entity according to the association relationships between the aligned entities and the different ontologies includes:
determining the number of aligned entities associated with the different ontologies according to the association relationships between the aligned entities and the different ontologies;
and if the number of aligned entities is greater than a preset aligned entity number threshold, aligning the different ontologies.
In some variations of the first aspect of the present application, the aligning entities from different knowledge-graphs according to attribute information of the entities includes:
and aligning the entities from different knowledge graphs according to at least one of the attribute names, the attribute types and the attribute values of the entities.
In some variations of the first aspect of the present application, aligning the entities from different knowledge-graphs according to at least one of attribute names, attribute types, and attribute values of the entities comprises:
vectorizing attribute names of entities from different knowledge graphs to obtain attribute name vectors corresponding to the entities;
calculating the similarity between attribute name vectors corresponding to the entities;
and aligning the entities with the similarity larger than a first similarity threshold.
In some variations of the first aspect of the present application, aligning the entities from different knowledge-graphs according to at least one of attribute names, attribute types, and attribute values of the entities comprises:
if the entities from different knowledge graphs have the same attribute name, determining the attribute value similarity of the attribute name corresponding to different entities with the same attribute name;
and aligning the entities with the attribute value similarity larger than a second similarity threshold.
In some variations of the first aspect of the present application, the determining similarity of attribute values corresponding to the attribute names of different entities having the same attribute name includes:
if the attribute value corresponding to the attribute name is a numerical value or a unique identity identifier, determining the attribute value similarity of different entities with the same attribute name corresponding to the attribute name in a character matching mode;
and if the attribute value corresponding to the attribute name is a text, determining the attribute value similarity of different entities with the same attribute name corresponding to the attribute name by calculating the text similarity.
A second aspect of the present application provides an ontology alignment apparatus, comprising:
an entity alignment module, configured to align entities from different knowledge graphs according to attribute information of the entities, and to determine the entities so aligned as aligned entities;
an association module, configured to associate the aligned entities with the ontologies in the different knowledge graphs according to the association relationships between entities and ontologies recorded in each knowledge graph, to obtain the association relationships between the aligned entities and the different ontologies;
and an ontology alignment module, configured to align different ontologies associated with the same aligned entity according to the association relationships between the aligned entities and the different ontologies.
In some variations of the second aspect of the present application, the ontology alignment module comprises:
an aligned entity proportion determining unit, configured to determine an aligned entity proportion for each ontology according to the association relationships between the aligned entities and the different ontologies, where the aligned entity proportion includes the ratio of the number of aligned entities to the number of all entities, the number of aligned entities being the number of aligned entities associated with the ontology, and the number of all entities being the number of all entities associated with the ontology;
a first ontology alignment unit, configured to align the different ontologies associated with the same aligned entity if the aligned entity proportions of the different ontologies are all greater than a preset proportion threshold.
In some variations of the second aspect of the present application, the ontology alignment module comprises:
an aligned entity number determining unit, configured to determine the number of aligned entities associated with the different ontologies according to the association relationships between the aligned entities and the different ontologies;
and a second ontology alignment unit, configured to align the different ontologies if the number of aligned entities is greater than a preset aligned entity number threshold.
In some variations of the second aspect of the present application, the entity alignment module includes:
and the entity alignment unit is used for aligning the entities from different knowledge graphs according to at least one of the attribute names, the attribute types and the attribute values of the entities.
In some modified embodiments of the second aspect of the present application, the entity alignment unit includes:
an attribute name vector determining subunit, configured to vectorize the attribute names of entities from different knowledge graphs to obtain attribute name vectors corresponding to the entities;
the vector similarity determining subunit is used for calculating the similarity between attribute name vectors corresponding to the entities;
and the attribute name alignment subunit is used for aligning the entities with the similarity greater than the first similarity threshold.
In some modified embodiments of the second aspect of the present application, the entity alignment unit includes:
an attribute value similarity determining subunit, configured to determine, if entities from different knowledge graphs have the same attribute name, the attribute value similarity for that attribute name between the different entities;
and the attribute value alignment subunit is used for aligning the entities with the attribute value similarity greater than the second similarity threshold.
In some modified embodiments of the second aspect of the present application, the attribute value similarity determination subunit includes:
the character matching subunit is used for determining the attribute value similarity of different entities with the same attribute name corresponding to the attribute name in a character matching mode if the attribute value corresponding to the attribute name is a numerical value or a unique identity identifier;
and a text similarity calculation subunit, configured to determine, if the attribute value corresponding to the attribute name is text, the attribute value similarity for that attribute name between the different entities having the same attribute name by calculating text similarity.
A third aspect of the present application provides a device comprising: at least one processor, and at least one memory and a bus connected to the processor; the processor and the memory communicate with each other through the bus; the processor is configured to call program instructions in the memory to perform the ontology alignment method provided in the first aspect.
A fourth aspect of the present application provides a storage medium having a program stored thereon, the program, when executed by a processor, implementing the ontology alignment method provided by the first aspect.
By means of the above technical solutions, the present application has at least the following advantages:
The application provides an ontology alignment method, apparatus, device and storage medium. Entities from different knowledge graphs are first aligned according to attribute information of the entities, and the entities so aligned are determined as aligned entities. The aligned entities are then associated with the ontologies in the different knowledge graphs according to the association relationships between entities and ontologies recorded in each knowledge graph, to obtain the association relationships between the aligned entities and the different ontologies. Finally, different ontologies associated with the same aligned entity are aligned according to those association relationships. Compared with the prior art, the application abandons aligning ontologies by ontology name and text similarity, and instead aligns ontologies based on the association relationships of entities. Compared with the conventional approach of aligning graphs by computing similarity over various text information, it can find many alignments whose text expressions differ greatly and that are hard to find from text similarity, so even identical ontologies with very different text expressions can be aligned accurately. In addition, ontologies that share the same name but do not represent the same ontology can be distinguished based on the association relationships of entities, which avoids the misjudgment caused by aligning through text similarity and helps improve accuracy in knowledge graph applications such as content recommendation.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present application will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present application are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings and in which like reference numerals refer to similar or corresponding parts and in which:
FIG. 1 schematically illustrates a flowchart of an ontology alignment method provided by some embodiments of the present application;
FIG. 2 is a schematic diagram of ontology alignment based on entity association provided by some embodiments of the present application;
FIG. 3 is a schematic diagram of an ontology alignment apparatus provided by some embodiments of the present application;
FIG. 4 is a schematic diagram of a device provided by some embodiments of the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It is to be noted that, unless otherwise specified, technical or scientific terms used herein shall have the ordinary meaning as understood by those skilled in the art to which this application belongs.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in the embodiments of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; "a plurality of" generally means at least two, but does not exclude the case of at least one.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects and means that three relationships may exist, e.g., A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
It is also noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that an article or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such article or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the article or system that includes the element.
In addition, the terms "first" and "second", etc. are used to distinguish different objects, rather than to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
The embodiments of the present application provide an ontology alignment method, apparatus, device and storage medium, which are described below with reference to the accompanying drawings.
It should be noted that the ontology alignment method provided in the embodiments of the present application can align the ontologies of multiple knowledge graphs, where the number of knowledge graphs is an integer of two or more; when there are more than two knowledge graphs, ontology alignment can be performed pairwise. The following description uses the ontology alignment of two knowledge graphs as an example.
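As an illustrative sketch only, pairwise processing of more than two knowledge graphs could be scheduled as below. The function name `pairwise_alignment_plan` and the graph labels are hypothetical and do not come from the application, which does not prescribe any particular ordering of the pairs.

```python
from itertools import combinations


def pairwise_alignment_plan(graph_names):
    """Return every unordered pair of knowledge graphs to be aligned.

    Hypothetical helper: each returned pair would be handed to the
    two-graph alignment procedure described in the following steps.
    """
    return list(combinations(graph_names, 2))


if __name__ == "__main__":
    for left, right in pairwise_alignment_plan(["A", "B", "C"]):
        print(f"align ontologies of graph {left} with graph {right}")
```

With n knowledge graphs this yields n(n-1)/2 pairs, so the cost of pairwise alignment grows quadratically with the number of graphs.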
Referring to FIG. 1, which schematically illustrates a flowchart of an ontology alignment method provided in some embodiments of the present application, the ontology alignment method may include the following steps:
Step S101: aligning the entities from different knowledge graphs according to the attribute information of the entities, and determining the entities so aligned as aligned entities.
The knowledge graph can be constructed in advance, and comprises an ontology, an entity and an association relationship between the ontology and the entity.
When the step S101 is implemented specifically, the method may include:
and aligning the entities from different knowledge graphs according to at least one of the attribute names, the attribute types and the attribute values of the entities.
For example, entities from different knowledge-graphs may be aligned according to their attribute names, which may specifically include:
vectorizing attribute names of entities from different knowledge graphs to obtain attribute name vectors corresponding to the entities;
calculating the similarity between attribute name vectors corresponding to the entities;
and aligning the entities with the similarity larger than a first similarity threshold.
The attribute names of entities from different knowledge graphs may be vectorized using any existing text vectorization technique, which the embodiments of the present application do not limit.
The similarity between the attribute name vectors may be expressed by cosine similarity, Euclidean distance, Hamming distance, and the like, any of which can achieve the purpose of the embodiments of the present application; the embodiments of the present application do not specifically limit this.
The first similarity threshold may be flexibly set according to actual requirements, and the embodiment of the present application is not limited.
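The attribute-name variant above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the character-bigram vectorization, the function names, the input layout (a mapping from entity identifier to a list of attribute names) and the threshold value 0.8 are all hypothetical, since the application leaves the vectorization technique and the threshold open. Cosine similarity is one of the measures the text names.

```python
import math
from collections import Counter


def name_vector(attribute_names):
    """Assumed vectorization: character-bigram counts over all attribute names."""
    counts = Counter()
    for name in attribute_names:
        text = name.lower()
        counts.update(text[i:i + 2] for i in range(len(text) - 1))
    return counts


def cosine_similarity(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def align_by_attribute_names(entities_a, entities_b, first_threshold=0.8):
    """Return (id_a, id_b) pairs whose attribute-name vectors are similar enough."""
    pairs = []
    for id_a, names_a in entities_a.items():
        vec_a = name_vector(names_a)
        for id_b, names_b in entities_b.items():
            if cosine_similarity(vec_a, name_vector(names_b)) > first_threshold:
                pairs.append((id_a, id_b))
    return pairs
```

If a distance (Euclidean or Hamming) were used instead of cosine similarity, the comparison would flip: smaller distances indicate more similar entities.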
For another example, aligning entities from different knowledge graphs according to attribute names of the entities and corresponding attribute values may specifically include:
if the entities from different knowledge graphs have the same attribute name, determining the attribute value similarity of the attribute name corresponding to different entities with the same attribute name;
and aligning the entities with the attribute value similarity larger than a second similarity threshold.
Through this embodiment, once attribute names are matched, entities can be aligned based on the similarity of their attribute values, which improves the accuracy of entity alignment and, in turn, the accuracy of subsequent ontology alignment.
In addition to the foregoing embodiments, in some variations, the determining the similarity of the attribute values corresponding to the attribute names of different entities having the same attribute name may include:
if the attribute value corresponding to the attribute name is a numerical value or a unique identity identifier, determining the attribute value similarity of different entities with the same attribute name corresponding to the attribute name in a character matching mode;
and if the attribute value corresponding to the attribute name is a text, determining the attribute value similarity of different entities with the same attribute name corresponding to the attribute name by calculating the text similarity.
The unique identity identifier, i.e. the ID (identity), is a strong alignment attribute. It may be considered first when aligning entities, or given a higher weight, to improve the accuracy of entity alignment and thereby the accuracy of subsequent ontology alignment.
In addition, the text similarity may be determined by converting the texts into vectors, calculating the vector similarity, and taking the vector similarity as the text similarity. The similarity may be expressed by cosine similarity, Euclidean distance, Hamming distance, and the like, any of which can achieve the purpose of the embodiments of the present application; the embodiments of the present application do not specifically limit this.
The second similarity threshold may be flexibly set according to actual requirements, and the embodiment of the present application is not limited.
In addition, the attribute type may also be one of the factors of entity alignment, and may be applied together with the attribute name and the attribute value to align the entities from different knowledge graphs.
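The attribute-value variant of step S101 can be sketched as follows. This is an illustration under assumptions, not the method of the application: `difflib.SequenceMatcher` from the Python standard library stands in for whichever text similarity measure is chosen, the rule that every shared attribute must exceed the threshold is an assumed aggregation, and the names `attribute_value_similarity`, `entities_match`, `identifier_keys` and the threshold 0.9 are hypothetical.

```python
from difflib import SequenceMatcher


def attribute_value_similarity(value_a, value_b, is_identifier=False):
    """Similarity in [0, 1] between two values stored under the same attribute name."""
    numeric = isinstance(value_a, (int, float)) and isinstance(value_b, (int, float))
    if numeric or is_identifier:
        # Numerical values and unique identifiers: exact character matching.
        return 1.0 if str(value_a) == str(value_b) else 0.0
    # Free text: difflib's ratio is an assumed stand-in for a text similarity measure.
    return SequenceMatcher(None, str(value_a), str(value_b)).ratio()


def entities_match(entity_a, entity_b, second_threshold=0.9, identifier_keys=("id",)):
    """Align two entities if all shared attribute names have similar values (assumed rule)."""
    shared = entity_a.keys() & entity_b.keys()
    if not shared:
        return False
    return all(
        attribute_value_similarity(
            entity_a[key], entity_b[key], is_identifier=key in identifier_keys
        ) > second_threshold
        for key in shared
    )
```

A weighted score that gives identifier attributes a higher weight, as the text suggests, would be an equally valid aggregation; the all-attributes rule is used here only because it is the simplest to read.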
Step S102: associating the aligned entities with the ontologies in the different knowledge graphs according to the association relationships between entities and ontologies recorded in each knowledge graph, to obtain the association relationships between the aligned entities and the different ontologies.
For ease of understanding, refer to FIG. 2, a schematic diagram of ontology alignment based on entity association provided by some embodiments of the present application. As shown in FIG. 2, the ontology "item" in knowledge graph A corresponds to (i.e., is associated with) the entities "item a", "item b", "item c" and "item d", and the ontology "company accepting item" in knowledge graph B corresponds to (i.e., is associated with) the entities "item aa", "item bf", "item bb" and "item dd". Entity alignment finds that "item a" and "item aa", "item b" and "item bb", and "item d" and "item dd" can each be aligned, yielding the aligned entities "item a", "item b" and "item d" (names can be unified after entity alignment). Each aligned entity inherits the association relationships between its original entities and their ontologies, so the association relationships between the aligned entities and the ontologies from the different knowledge graphs are obtained.
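The inheritance of association relationships described for FIG. 2 can be sketched as follows. The data layout (a mapping from entity name to ontology name per graph), the function name and the choice to keep graph A's entity name as the unified name are assumptions made for illustration only.

```python
from collections import defaultdict


def associate_aligned_entities(aligned_pairs, entity_to_ontology_a, entity_to_ontology_b):
    """Map each aligned entity to the ontologies it inherits from both graphs."""
    associations = defaultdict(set)
    for entity_a, entity_b in aligned_pairs:
        unified_name = entity_a  # assumption: keep graph A's name after alignment
        associations[unified_name].add(("A", entity_to_ontology_a[entity_a]))
        associations[unified_name].add(("B", entity_to_ontology_b[entity_b]))
    return dict(associations)


# Data taken from the FIG. 2 example in the text.
graph_a = {name: "item" for name in ["item a", "item b", "item c", "item d"]}
graph_b = {name: "company accepting item"
           for name in ["item aa", "item bf", "item bb", "item dd"]}
pairs = [("item a", "item aa"), ("item b", "item bb"), ("item d", "item dd")]

fig2_associations = associate_aligned_entities(pairs, graph_a, graph_b)
```

Here every aligned entity ends up associated with both the ontology "item" from graph A and the ontology "company accepting item" from graph B, which is the input needed by step S103.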
Step S103: aligning different ontologies associated with the same aligned entity according to the association relationships between the aligned entities and the different ontologies.
After the entities are aligned (fused), they remain associated with the ontology structure (the association relationships were already established when each individual graph was built). After alignment, the same entity node points to the original unaligned entities and to the ontology nodes they belong to. Therefore, in the graph, one entity node can belong to several ontology nodes, and likewise two different ontologies can be associated with the same entity node. Ontologies that are associated with one another through many shared entity nodes describe the same concept. Ontologies are therefore aligned by applying thresholds to the proportion of aligned entities among all the entities of an ontology, and to the number of entities that point to the different ontologies.
For ease of understanding, continuing with reference to FIG. 2, the ontology names differ between the ontology structures of the two knowledge graphs. The "member" ontology and the "employee" ontology are the same ontology: their entities can be aligned according to employee numbers, and the two ontologies can then be aligned through the aligned entities. Similarly, the item entities can be aligned according to item numbers, and the ontology "item" in knowledge graph A and the ontology "company accepting item" in knowledge graph B can be aligned through the aligned item entities.
Specifically, in some embodiments, the step S103 may include:
determining an aligned entity proportion for each ontology according to the association relationships between the aligned entities and the different ontologies, wherein the aligned entity proportion includes the ratio of the number of aligned entities to the number of all entities, the number of aligned entities being the number of aligned entities associated with the ontology, and the number of all entities being the number of all entities associated with the ontology;
and if the aligned entity proportions of the different ontologies associated with the same aligned entity are all greater than a preset proportion threshold, aligning the different ontologies.
For example, referring to FIG. 2, after entity alignment the ontology "item" in knowledge graph A is associated with the entities "item a", "item b", "item c" and "item d", so the number of all entities is 4. Of these, "item a", "item b" and "item d" are aligned entities, so the number of aligned entities is 3 and the aligned entity proportion of the ontology "item" is 3/4 = 75%. Similarly, after entity alignment the ontology "company accepting item" in knowledge graph B is associated with the entities "item a", "item b", "item bf" and "item d", so the number of all entities is 4; "item a", "item b" and "item d" are aligned entities, so the number of aligned entities is 3 and the aligned entity proportion of the ontology "company accepting item" is 3/4 = 75%. Assuming the preset proportion threshold is 50%, the aligned entity proportions of the different ontologies "item" and "company accepting item", which are associated with the same aligned entities "item a", "item b" and "item d", are both greater than the threshold, so the ontologies "item" and "company accepting item" can be aligned.
Through this embodiment, the aligned entity proportion of each ontology serves as the ontology alignment factor for aligning ontologies in different graphs, which can effectively improve the accuracy of ontology alignment.
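The proportion-based decision can be checked with a short sketch that reproduces the FIG. 2 arithmetic. The function names and the set-based representation of "entities associated with an ontology" are hypothetical; the 50% threshold is the value assumed in the example above.

```python
def aligned_entity_proportion(ontology_entities, aligned_entities):
    """Ratio of aligned entities to all entities associated with one ontology."""
    total = len(ontology_entities)
    if total == 0:
        return 0.0
    return len(ontology_entities & aligned_entities) / total


def should_align_by_proportion(entities_a, entities_b, aligned_entities,
                               proportion_threshold=0.5):
    """Align two ontologies only if both proportions exceed the threshold."""
    return (
        aligned_entity_proportion(entities_a, aligned_entities) > proportion_threshold
        and aligned_entity_proportion(entities_b, aligned_entities) > proportion_threshold
    )


# Entity sets after entity alignment, as in the FIG. 2 example.
item_entities = {"item a", "item b", "item c", "item d"}
company_item_entities = {"item a", "item b", "item bf", "item d"}
aligned = {"item a", "item b", "item d"}

proportion_item = aligned_entity_proportion(item_entities, aligned)
proportion_company = aligned_entity_proportion(company_item_entities, aligned)
decision = should_align_by_proportion(item_entities, company_item_entities, aligned)
```

Both proportions evaluate to 0.75, which exceeds 0.5, so the two ontologies are aligned; raising the threshold to 0.8 would reject the same pair.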
In other embodiments, the step S103 may include:
determining the number of aligned entities associated with the different ontologies according to the association relationships between the aligned entities and the different ontologies;
and if the number of aligned entities is greater than a preset aligned entity number threshold, aligning the different ontologies.
This embodiment is similar to the above approach based on the aligned entity proportion, except that the number of aligned entities associated with the different ontologies is used as the ontology alignment factor: as long as that number is greater than the preset aligned entity number threshold, the different ontologies can be aligned, which likewise achieves the purpose of the embodiments of the present application.
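The count-based variant can be sketched in the same style. The function name, the set-based inputs and the threshold value 2 are assumptions chosen for illustration; the application only requires that the count exceed a preset threshold.

```python
def should_align_by_count(entities_a, entities_b, aligned_entities, count_threshold=2):
    """Align two ontologies if they share more aligned entities than the threshold."""
    shared_aligned = entities_a & entities_b & aligned_entities
    return len(shared_aligned) > count_threshold


# Entity sets after entity alignment, as in the FIG. 2 example.
item_entities = {"item a", "item b", "item c", "item d"}
company_item_entities = {"item a", "item b", "item bf", "item d"}
aligned = {"item a", "item b", "item d"}

count_decision = should_align_by_count(item_entities, company_item_entities, aligned)
```

An absolute count is simpler than a proportion but is sensitive to ontology size: a fixed threshold that suits small ontologies may be too permissive for ontologies with thousands of entities.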
In addition, when ontologies are aligned, a unique reserved name for the aligned ontology needs to be set first. For example, in the example of fig. 2, the reserved name after the ontologies "item" and "company accepting item" are aligned may be set to "item" or to another name, as long as the reserved name is unique.
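The uniqueness requirement on the reserved name can be illustrated as follows. This is a hedged sketch: the function `merge_ontologies`, the extra ontology name "person", and the choice to raise an error on a name clash are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def merge_ontologies(
    existing_names: Set[str], to_merge: List[str], reserved_name: str
) -> Dict[str, str]:
    """Map each merged ontology name to a reserved name that stays unique."""
    # The reserved name may reuse one of the merged names, but it must not
    # collide with any ontology that is not part of the merge.
    others = existing_names - set(to_merge)
    if reserved_name in others:
        raise ValueError(f"reserved name {reserved_name!r} is already in use")
    return {name: reserved_name for name in to_merge}


names = {"item", "company accepting item", "person"}
mapping = merge_ontologies(names, ["item", "company accepting item"], "item")
print(mapping)
```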
According to the ontology alignment method provided by at least one embodiment of the present application, entities from different knowledge graphs are first aligned according to the attribute information of the entities, and the aligned entities are determined to be alignment entities. Then, according to the association relationships between entities and ontologies recorded in each knowledge graph, the alignment entities are associated with the ontologies in the different knowledge graphs to obtain the association relationships between the alignment entities and the different ontologies. Finally, according to those association relationships, the different ontologies associated with the same alignment entity are aligned. Compared with the prior art, this approach abandons aligning ontologies based on ontology names and text similarity and instead aligns ontologies based on the association relationships between entities and ontologies. Compared with the traditional approach of aligning graphs by computing similarity over various kinds of text information, it can find many alignments whose textual expressions differ greatly and are therefore hard to detect from text similarity, so that even identical ontologies with very different textual expressions can be aligned accurately. In addition, where ontologies have the same name but do not represent the same ontology, they can be distinguished based on their association relationships, which avoids the misjudgment caused by aligning through text similarity and helps improve accuracy in knowledge graph applications such as content recommendation.
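The association and ontology alignment steps summarized above can be sketched end to end in Python. The sketch makes several assumptions that are not in the original: each knowledge graph is reduced to a mapping from entity ID to ontology name, the entity alignment step has already produced a list of aligned pairs, and the proportion for each ontology counts only the aligned entities it shares with the candidate ontology. All identifiers and sample data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple


def align_ontologies(
    entity_to_ontology_a: Dict[str, str],
    entity_to_ontology_b: Dict[str, str],
    aligned_pairs: List[Tuple[str, str]],
    proportion_threshold: float,
) -> List[Tuple[str, str]]:
    """Return ontology pairs whose aligned-entity proportions exceed the threshold."""
    # Association step: attach every aligned entity pair to the ontology it
    # belongs to in each graph, grouping pairs by (ontology in A, ontology in B).
    shared: Dict[Tuple[str, str], Set[Tuple[str, str]]] = defaultdict(set)
    for entity_a, entity_b in aligned_pairs:
        key = (entity_to_ontology_a[entity_a], entity_to_ontology_b[entity_b])
        shared[key].add((entity_a, entity_b))

    # Total number of entities associated with each ontology in each graph.
    total_a = Counter(entity_to_ontology_a.values())
    total_b = Counter(entity_to_ontology_b.values())

    # Ontology alignment step: both proportions must exceed the threshold.
    aligned_ontologies = []
    for (ontology_a, ontology_b), pairs in shared.items():
        proportion_a = len({a for a, _ in pairs}) / total_a[ontology_a]
        proportion_b = len({b for _, b in pairs}) / total_b[ontology_b]
        if proportion_a > proportion_threshold and proportion_b > proportion_threshold:
            aligned_ontologies.append((ontology_a, ontology_b))
    return sorted(aligned_ontologies)


graph_a = {"a1": "item", "a2": "item", "a3": "item", "a4": "item", "a5": "person"}
graph_b = {
    "b1": "company accepting item",
    "b2": "company accepting item",
    "b3": "company accepting item",
    "b4": "company accepting item",
    "b5": "staff",
}
aligned_pairs = [("a1", "b1"), ("a2", "b2"), ("a4", "b4")]

result = align_ontologies(graph_a, graph_b, aligned_pairs, proportion_threshold=0.5)
print(result)
```

Note that the ontology names "item" and "company accepting item" never enter the comparison; only the entity associations do, which is the point the paragraph above makes about avoiding name-based text similarity.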
The above embodiments provide an ontology alignment method; correspondingly, the present application also provides an ontology alignment apparatus. The ontology alignment apparatus provided by the embodiments of the present application can implement the ontology alignment method described above, and can be implemented by software, hardware, or a combination of software and hardware. For example, the ontology alignment apparatus may include integrated or separate functional modules or units to perform the corresponding steps of the above method. Please refer to fig. 3, which is a schematic diagram of an ontology alignment apparatus provided in some embodiments of the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant points, reference may be made to the corresponding descriptions of the method embodiments. The apparatus embodiments described below are merely illustrative.
As shown in fig. 3, the ontology alignment apparatus 10 may include:
an entity alignment module 101, configured to align entities from different knowledge graphs according to attribute information of the entities, and determine the aligned entities as alignment entities;
an association module 102, configured to associate the alignment entities with the ontologies in the different knowledge graphs according to the association relationships between entities and ontologies recorded in each knowledge graph, so as to obtain association relationships between the alignment entities and different ontologies;
and an ontology alignment module 103, configured to align different ontologies associated with the same alignment entity according to the association relationships between the alignment entities and the different ontologies.
In some variations of the embodiments of the present application, the ontology alignment module 103 includes:
an alignment entity proportion determining unit, configured to determine an alignment entity proportion corresponding to each ontology according to the association relationships between the alignment entities and different ontologies, where the alignment entity proportion is the ratio of the alignment entity number to the total entity number, the alignment entity number is the number of alignment entities associated with the ontology, and the total entity number is the number of all entities associated with the ontology;
and a first ontology alignment unit, configured to align different ontologies associated with the same alignment entity if the alignment entity proportions corresponding to the different ontologies are all greater than a preset proportion threshold.
In some variations of the embodiments of the present application, the ontology alignment module 103 includes:
an alignment entity number determining unit, configured to determine the number of alignment entities associated with different ontologies according to the association relationships between the alignment entities and the different ontologies;
and a second ontology alignment unit, configured to align the different ontologies if the number of the alignment entities is greater than a preset alignment entity number threshold.
In some variations of the embodiments of the present application, the entity alignment module 101 includes:
and the entity alignment unit is used for aligning the entities from different knowledge graphs according to at least one of the attribute names, the attribute types and the attribute values of the entities.
In some variations of the embodiments of the present application, the entity alignment unit includes:
the attribute name vector determining subunit is used for vectorizing the attribute names of the entities from different knowledge maps to obtain attribute name vectors corresponding to the entities;
the vector similarity determining subunit is used for calculating the similarity between attribute name vectors corresponding to the entities;
and the attribute name alignment subunit is used for aligning the entities with the similarity greater than the first similarity threshold.
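The three subunits above (vectorize attribute names, compute vector similarity, compare against the first similarity threshold) can be sketched as follows. The text here does not specify how attribute names are vectorized, so the binary bag-of-names vector over a shared vocabulary, the cosine similarity measure, the sample attribute names, and the threshold value 0.7 are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def attribute_name_vector(
    attribute_names: Sequence[str], vocabulary: Sequence[str]
) -> List[float]:
    """Binary vector marking which vocabulary attribute names the entity has."""
    present = set(attribute_names)
    return [1.0 if name in present else 0.0 for name in vocabulary]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Hypothetical attribute names of one entity from each knowledge graph.
entity_a = ["name", "budget", "start date", "owner"]
entity_b = ["name", "budget", "start date", "contractor"]

vocabulary = sorted(set(entity_a) | set(entity_b))
vec_a = attribute_name_vector(entity_a, vocabulary)
vec_b = attribute_name_vector(entity_b, vocabulary)

FIRST_SIMILARITY_THRESHOLD = 0.7  # assumed value
similarity = cosine_similarity(vec_a, vec_b)
is_aligned = similarity > FIRST_SIMILARITY_THRESHOLD
print(round(similarity, 2), is_aligned)
```

A learned embedding of the attribute names could be substituted for the binary vector without changing the similarity and threshold steps.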
In some variations of the embodiments of the present application, the entity alignment unit includes:
the attribute value similarity determining subunit is used for determining, if entities from different knowledge graphs have the same attribute name, the similarity of the attribute values corresponding to that attribute name for the different entities;
and the attribute value alignment subunit is used for aligning the entities with the attribute value similarity greater than the second similarity threshold.
In some variations of the embodiments of the present application, the attribute value similarity determining subunit includes:
the character matching subunit is used for determining, by character matching, the similarity of the attribute values corresponding to the attribute name for different entities having the same attribute name, if the attribute value corresponding to the attribute name is a numerical value or a unique identity identifier;
and the text similarity calculation subunit is used for determining, by calculating text similarity, the similarity of the attribute values corresponding to the attribute name for different entities having the same attribute name, if the attribute value corresponding to the attribute name is text.
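The two branches above can be sketched as a single dispatch function. The sketch assumes that character matching means exact string equality and uses `difflib.SequenceMatcher` from the Python standard library as a stand-in text similarity measure; the original does not name a specific measure. The `value_kind` labels and the sample values are hypothetical.

```python
from difflib import SequenceMatcher


def attribute_value_similarity(value_a: str, value_b: str, value_kind: str) -> float:
    """Similarity in [0, 1] of two attribute values that share an attribute name."""
    if value_kind in ("number", "identifier"):
        # Character matching: numerical values and unique IDs either match or not.
        return 1.0 if value_a.strip() == value_b.strip() else 0.0
    if value_kind == "text":
        # Text similarity: ratio of matching characters between the two strings.
        return SequenceMatcher(None, value_a, value_b).ratio()
    raise ValueError(f"unknown value kind: {value_kind!r}")


id_match = attribute_value_similarity("ID-001", "ID-001", "identifier")
id_mismatch = attribute_value_similarity("ID-001", "ID-002", "identifier")
text_score = attribute_value_similarity(
    "road repair project, phase one", "road repair project (phase 1)", "text"
)
print(id_match, id_mismatch, text_score)
```

Entities whose score exceeds the second similarity threshold would then be aligned; near-identical identifiers deliberately score 0.0 here, because a partial match on a unique ID is not evidence that two entities are the same.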
The ontology alignment apparatus 10 provided in the embodiments of the present application is based on the same inventive concept as the ontology alignment method provided in the foregoing embodiments and has the same beneficial effects, which are not described again here.
It should be noted that, in some implementations, the ontology alignment apparatus 10 includes a processor and a memory. The entity alignment module, the association module, the ontology alignment module, the alignment entity proportion determining unit, the first ontology alignment unit, the alignment entity number determining unit, the second ontology alignment unit, the attribute name vector determining subunit, the vector similarity determining subunit, the attribute name alignment subunit, the attribute value similarity determining subunit, the attribute value alignment subunit, the character matching subunit, and the text similarity calculation subunit are all stored in the memory as program units, and the processor executes the program units stored in the memory to implement the corresponding functions.
The processor includes a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided. By adjusting kernel parameters, entities from different knowledge graphs are aligned according to the attribute information of the entities and the aligned entities are determined to be alignment entities; the alignment entities are then associated with the ontologies in the different knowledge graphs according to the association relationships between entities and ontologies recorded in each knowledge graph, to obtain the association relationships between the alignment entities and the different ontologies; and the different ontologies associated with the same alignment entity are then aligned according to those association relationships, thereby aligning ontologies based on the association relationships between entities and ontologies.
An embodiment of the present application provides a storage medium on which a program is stored, where the program, when executed by a processor, implements the ontology alignment method provided in any of the above embodiments.
An embodiment of the present application provides a processor configured to run a program, where the program, when running, executes the ontology alignment method provided in any of the above embodiments.
An embodiment of the present application provides a device 20. As shown in fig. 4, the device includes at least one processor 201, and at least one memory 202 and a bus 203 connected to the processor 201; the processor 201 and the memory 202 communicate with each other through the bus 203; the processor 201 is configured to call program instructions in the memory 202 to perform the ontology alignment method provided in any of the embodiments described above. The device herein may be a server, a PC, a PAD, a mobile phone, etc.
The present application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps:
an ontology alignment method, comprising: aligning the entities from different knowledge graphs according to the attribute information of the entities, and determining the aligned entities as alignment entities; associating the alignment entities with the ontologies in the different knowledge graphs according to the association relationships between entities and ontologies recorded in each knowledge graph, to obtain the association relationships between the alignment entities and the different ontologies; and aligning different ontologies associated with the same alignment entity according to the association relationships between the alignment entities and the different ontologies.
Further, the aligning different ontologies associated with the same alignment entity according to the association relationship between the alignment entity and the different ontologies includes: determining an alignment entity proportion corresponding to each ontology according to the incidence relation between the alignment entities and different ontologies, wherein the alignment entity proportion comprises the ratio of the number of the alignment entities to the number of all entities, the number of the alignment entities is the number of the alignment entities associated with the ontology, and the number of all entities is the number of all entities associated with the ontology; and if the occupation ratios of the alignment entities corresponding to the different ontologies associated with the same alignment entity are all larger than a preset occupation ratio threshold value, aligning the different ontologies.
Further, the aligning different ontologies associated with the same alignment entity according to the association relationship between the alignment entity and the different ontologies includes: determining the number of the alignment entities associated with different ontologies according to the association relationship between the alignment entities and the different ontologies; and if the number of the alignment entities is greater than a preset alignment entity number threshold, aligning the different ontologies.
Further, the aligning the entities from different knowledge graphs according to the attribute information of the entities includes: and aligning the entities from different knowledge graphs according to at least one of the attribute names, the attribute types and the attribute values of the entities.
Further, the aligning the entities from the different knowledge-graphs according to at least one of the attribute names, the attribute types, and the attribute values of the entities includes: vectorizing attribute names of entities from different knowledge maps to obtain attribute name vectors corresponding to the entities; calculating the similarity between attribute name vectors corresponding to the entities; and aligning the entities with the similarity larger than a first similarity threshold.
Further, the aligning the entities from the different knowledge-graphs according to at least one of the attribute names, the attribute types, and the attribute values of the entities includes: if the entities from different knowledge graphs have the same attribute name, determining the attribute value similarity of the attribute name corresponding to different entities with the same attribute name; and aligning the entities with the attribute value similarity larger than a second similarity threshold.
Further, the determining the similarity of the attribute values corresponding to the attribute names of different entities having the same attribute name includes: if the attribute value corresponding to the attribute name is a numerical value or a unique identity identifier, determining the attribute value similarity of different entities with the same attribute name corresponding to the attribute name in a character matching mode; and if the attribute value corresponding to the attribute name is a text, determining the attribute value similarity of different entities with the same attribute name corresponding to the attribute name by calculating the text similarity.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a device includes one or more processors (CPUs), memory, and a bus. The device may also include input/output interfaces, network interfaces, and the like.
The memory may include forms of computer-readable media such as volatile memory, Random Access Memory (RAM), and/or non-volatile memory such as Read Only Memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip. The memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. An ontology alignment method, comprising:
aligning the entities from different knowledge maps according to the attribute information of the entities, and determining the aligned entities as aligned entities;
associating the aligned entity with the ontology in different knowledge maps according to the association relationship between the entity and the ontology recorded by each knowledge map respectively to obtain the association relationship between the aligned entity and the different ontology;
and aligning different ontologies associated with the same alignment entity according to the association relationship between the alignment entity and the different ontologies.
2. The method according to claim 1, wherein aligning different ontologies associated with the same alignment entity according to the association relationship between the alignment entity and the different ontologies comprises:
determining an alignment entity proportion corresponding to each ontology according to the incidence relation between the alignment entities and different ontologies, wherein the alignment entity proportion comprises the ratio of the number of the alignment entities to the number of all entities, the number of the alignment entities is the number of the alignment entities associated with the ontology, and the number of all entities is the number of all entities associated with the ontology;
and if the occupation ratios of the alignment entities corresponding to the different ontologies associated with the same alignment entity are all larger than a preset occupation ratio threshold value, aligning the different ontologies.
3. The method according to claim 1, wherein aligning different ontologies associated with the same alignment entity according to the association relationship between the alignment entity and the different ontologies comprises:
determining the number of the alignment entities associated with different ontologies according to the association relationship between the alignment entities and the different ontologies;
and if the number of the alignment entities is greater than a preset alignment entity number threshold, aligning the different ontologies.
4. The method of claim 1, wherein aligning entities from different knowledge-graphs according to attribute information of the entities comprises:
and aligning the entities from different knowledge graphs according to at least one of the attribute names, the attribute types and the attribute values of the entities.
5. The method of claim 4, wherein aligning entities from different knowledge-graphs according to at least one of attribute names, attribute types, and attribute values of the entities comprises:
vectorizing attribute names of entities from different knowledge maps to obtain attribute name vectors corresponding to the entities;
calculating the similarity between attribute name vectors corresponding to the entities;
and aligning the entities with the similarity larger than a first similarity threshold.
6. The method of claim 4, wherein aligning entities from different knowledge-graphs according to at least one of attribute names, attribute types, and attribute values of the entities comprises:
if the entities from different knowledge graphs have the same attribute name, determining the attribute value similarity of the attribute name corresponding to different entities with the same attribute name;
and aligning the entities with the attribute value similarity larger than a second similarity threshold.
7. The method of claim 6, wherein determining the similarity of the attribute values corresponding to the attribute names of different entities having the same attribute name comprises:
if the attribute value corresponding to the attribute name is a numerical value or a unique identity identifier, determining the attribute value similarity of different entities with the same attribute name corresponding to the attribute name in a character matching mode;
and if the attribute value corresponding to the attribute name is a text, determining the attribute value similarity of different entities with the same attribute name corresponding to the attribute name by calculating the text similarity.
8. An ontology alignment apparatus, comprising:
an entity alignment module, configured to align the entities from different knowledge graphs according to the attribute information of the entities and determine the aligned entities as alignment entities;
an association module, configured to associate the alignment entities with the ontologies in the different knowledge graphs according to the association relationships between entities and ontologies recorded in each knowledge graph, to obtain the association relationships between the alignment entities and the different ontologies;
and an ontology alignment module, configured to align different ontologies associated with the same alignment entity according to the association relationships between the alignment entities and the different ontologies.
9. An apparatus, comprising: at least one processor, and at least one memory and a bus connected to the processor; the processor and the memory communicate with each other through the bus; and the processor is configured to call program instructions in the memory to perform the method of any one of claims 1 to 7.
10. A storage medium, having stored thereon a program which, when executed by a processor, carries out the method of any one of claims 1 to 7.
CN202011622145.7A 2020-12-30 2020-12-30 Body alignment method, device, equipment and storage medium Pending CN114691877A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011622145.7A CN114691877A (en) 2020-12-30 2020-12-30 Body alignment method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011622145.7A CN114691877A (en) 2020-12-30 2020-12-30 Body alignment method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114691877A true CN114691877A (en) 2022-07-01

Family

ID=82134117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011622145.7A Pending CN114691877A (en) 2020-12-30 2020-12-30 Body alignment method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114691877A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116702899A (en) * 2023-08-07 2023-09-05 上海银行股份有限公司 Entity fusion method suitable for public and private linkage scene
CN116702899B (en) * 2023-08-07 2023-11-28 上海银行股份有限公司 Entity fusion method suitable for public and private linkage scene

Similar Documents

Publication Publication Date Title
CN109347787B (en) Identity information identification method and device
CN111898139B (en) Data reading and writing method and device and electronic equipment
CN108664812A (en) Information desensitization method, apparatus and system
CN110472438B (en) Transaction data processing and transaction inquiring method, device and equipment based on blockchain
CN112529694B (en) Credit granting processing method, device, equipment and system
TWI686703B (en) Method and device for data storage and business processing
CN112463991A (en) Historical behavior data processing method and device, computer equipment and storage medium
CN114338413A (en) Method and device for determining topological relation of equipment in network and storage medium
CN111310137B (en) Block chain associated data evidence storing method and device and electronic equipment
CN115905630A (en) Graph database query method, device, equipment and storage medium
CN111475503A (en) Virtual knowledge graph construction method and device
CN115599764A (en) Method, device and medium for migrating table data
CN114564571A (en) Graph data query method and system
CN114691877A (en) Body alignment method, device, equipment and storage medium
CN109901991A (en) A kind of method, apparatus and electronic equipment for analyzing exception call
CN111709327B (en) Fuzzy matching method and device based on OCR (optical character recognition)
CN112597105A (en) Processing method of file associated object, server side equipment and storage medium
CN110019544B (en) Data query method and system
CN113094414B (en) Method and device for generating circulation map
CN110245136B (en) Data retrieval method, device, equipment and storage equipment
CN112364181A (en) Insurance product matching degree determination method and device
CN112148782A (en) Market data access method and device
Cheng et al. An efficient service discovery algorithm for counting bloom filter-based service registry
CN111047415A (en) Clothing accessory order processing method, system, electronic equipment and storage medium
CN112182507A (en) Data quality measuring method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination