CN113934729A - Data management method based on knowledge graph, related equipment and medium - Google Patents

Info

Publication number
CN113934729A
Authority
CN
China
Prior art keywords
data
metadata
vertex
entity
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111224839.XA
Other languages
Chinese (zh)
Inventor
刘建林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN202111224839.XA priority Critical patent/CN113934729A/en
Publication of CN113934729A publication Critical patent/CN113934729A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application disclose a data management method based on a knowledge graph, related equipment, and a medium, and relate to the technical field of big data. The method may include: obtaining a plurality of resource data, obtaining at least one metadata corresponding to each resource data, and constructing at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata, and the metadata content corresponding to the metadata item; receiving a data query instruction submitted by a user client, and querying the triple data set to be queried indicated by the data query instruction; and determining an entity and a relationship corresponding to each triple data in the triple data set, and constructing a target knowledge graph for the data query instruction according to the entity and the relationship. The embodiments of the present application help improve the efficiency of data management. The present application may also involve blockchain technology; for example, the data can be written to a blockchain for use in scenarios such as data forensics.

Description

Data management method based on knowledge graph, related equipment and medium
Technical Field
The present application relates to the field of big data technologies, and in particular, to a data management method based on a knowledge graph, a related device, and a medium.
Background
With the advent of the big data era, data management has become increasingly important: data of all kinds play an important role in business management, and data volumes keep growing. At present, data is mainly managed by storing it in a relational database, with relational tables designed in advance according to a database triple model and an entity-relationship diagram (ER diagram) modeled. In practice, ER modeling is so complex that a complete and extensible ER model cannot be designed in advance, which makes subsequent use and change difficult and data management inefficient. Therefore, how to improve the efficiency of data management has become an urgent problem to be solved.
Disclosure of Invention
The embodiment of the application provides a data management method based on a knowledge graph, related equipment and a medium, which are beneficial to improving the efficiency of data management.
In one aspect, an embodiment of the present application discloses a data management method based on a knowledge graph, where the method includes:
obtaining a plurality of resource data, and obtaining at least one metadata corresponding to each resource data in the plurality of resource data, wherein each metadata comprises a metadata item and metadata content corresponding to the metadata item;
constructing at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata and the metadata content corresponding to the metadata item;
receiving a data query instruction submitted by a user client, and querying a triple data set to be queried indicated by the data query instruction from the at least one triple data;
and determining an entity and a relation corresponding to each triple data according to each triple data in the triple data set, and constructing a target knowledge graph aiming at the data query instruction according to the entity and the relation.
On the other hand, the embodiment of the application discloses a data management device based on a knowledge graph, and the device comprises:
an acquisition unit, configured to acquire a plurality of resource data and acquire at least one metadata corresponding to each resource data in the plurality of resource data, wherein each metadata comprises a metadata item and metadata content corresponding to the metadata item;
a processing unit, configured to construct at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata and the metadata content corresponding to the metadata item;
a receiving unit, configured to receive a data query instruction submitted by a user client and query a triple data set to be queried indicated by the data query instruction from the at least one triple data;
the processing unit is further configured to determine, according to each triplet data in the triplet data set, an entity and a relationship corresponding to each triplet data, and construct a target knowledge graph for the data query instruction according to the entity and the relationship.
In yet another aspect, an embodiment of the present application provides an electronic device, which includes a processor and a memory, where the memory is used to store a computer program, and the computer program includes program instructions, and the processor is configured to perform the following steps:
obtaining a plurality of resource data, and obtaining at least one metadata corresponding to each resource data in the plurality of resource data, wherein each metadata comprises a metadata item and metadata content corresponding to the metadata item;
constructing at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata and the metadata content corresponding to the metadata item;
receiving a data query instruction submitted by a user client, and querying a triple data set to be queried indicated by the data query instruction from the at least one triple data;
and determining an entity and a relation corresponding to each triple data according to each triple data in the triple data set, and constructing a target knowledge graph aiming at the data query instruction according to the entity and the relation.
In another aspect, an embodiment of the present application provides a computer-readable storage medium, in which computer program instructions are stored, and when executed by a processor, the computer program instructions are configured to perform the following steps:
obtaining a plurality of resource data, and obtaining at least one metadata corresponding to each resource data in the plurality of resource data, wherein each metadata comprises a metadata item and metadata content corresponding to the metadata item;
constructing at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata and the metadata content corresponding to the metadata item;
receiving a data query instruction submitted by a user client, and querying a triple data set to be queried indicated by the data query instruction from the at least one triple data;
and determining an entity and a relation corresponding to each triple data according to each triple data in the triple data set, and constructing a target knowledge graph aiming at the data query instruction according to the entity and the relation.
In yet another aspect, embodiments of the present application disclose a computer program product or a computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to execute the above-mentioned knowledge-graph-based data management method.
In the embodiments of the present application, the data management device can acquire a plurality of resource data, acquire at least one metadata corresponding to each resource data in the plurality of resource data, and construct at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata, and the metadata content corresponding to the metadata item; receive a data query instruction submitted by a user client, and query the triple data set to be queried indicated by the data query instruction from the at least one triple data; and determine an entity and a relationship corresponding to each triple data in the triple data set, and construct a target knowledge graph for the data query instruction according to the entity and the relationship. In this way, the relationships among the resource data are displayed by a knowledge graph built from the metadata, which realizes data management and improves its efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic diagram illustrating the effect of a knowledge-graph based data management system according to an embodiment of the present application;
FIG. 2 is a flow chart diagram of a method for knowledge-graph based data management according to an embodiment of the present application;
FIG. 3 is a diagram illustrating the effect of a knowledge-graph provided by an embodiment of the present application;
FIG. 4 is a diagram illustrating the effect of a knowledge-graph provided by an embodiment of the present application;
FIG. 5 is a diagram illustrating the effect of a knowledge-graph provided by an embodiment of the present application;
FIG. 6 is a diagram illustrating the effect of a knowledge-graph provided by an embodiment of the present application;
FIG. 7 is a schematic flow chart diagram of another knowledge-graph based data management method provided by the embodiments of the present application;
FIG. 8 is a schematic diagram illustrating an effect of a target knowledge graph display interface provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a knowledge-graph-based data management apparatus according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The application provides a data management scheme based on a knowledge graph. The scheme can acquire a plurality of resource data, acquire at least one metadata corresponding to each resource data in the plurality of resource data, and construct at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata, and the metadata content corresponding to the metadata item; receive a data query instruction submitted by a user client, and query the triple data set to be queried indicated by the data query instruction from the at least one triple data; and determine an entity and a relationship corresponding to each triple data in the triple data set, and construct a target knowledge graph for the data query instruction according to the entity and the relationship. The scheme displays the relationships among the resource data through a knowledge graph built from the metadata, thereby managing the metadata without designing relational tables and modeling an entity-relationship diagram (ER diagram) for the metadata in advance, which improves the efficiency of metadata management.
Metadata (also called intermediary data or relay data) in the present application is data that describes data (data about data). It mainly describes the properties of data and supports functions such as indicating storage locations, historical data, resource search, and file records.
The technical solution of the present application may be applied to an electronic device, where the electronic device may be a terminal, a server, or another device for performing data management based on a knowledge graph, which is not limited in the present application. The application is operational with numerous general purpose or special purpose computing system environments or configurations, for example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In a possible implementation manner, the technical solution of the present application may be applied to a data management system based on a knowledge graph. Please refer to FIG. 1, which is a schematic diagram illustrating the effect of the data management system based on a knowledge graph provided in an embodiment of the present application. The data management system based on the knowledge graph may comprise the electronic device and a user client. The electronic device may be configured to obtain a plurality of resource data and at least one metadata corresponding to each resource data, and construct corresponding triple data according to the at least one metadata. It is also configured to acquire the triple data set when receiving a data query instruction submitted by the user client, construct the corresponding target knowledge graph according to the triple data set, and return the constructed target knowledge graph to the user client so that the user client can display it. The user client may be configured to submit a data query instruction to the electronic device in response to a data query operation of the target object, and to receive and display the target knowledge graph returned by the electronic device. In this way, the relationships among the resource data are displayed through the knowledge graph, the metadata is managed, and the efficiency of data management is improved.
Based on the above description, the embodiment of the present application provides a data management method based on a knowledge graph. Referring to fig. 2, fig. 2 is a schematic flowchart of a data management method based on a knowledge graph according to an embodiment of the present application. The method may be performed by the above mentioned electronic device. The method may comprise steps S201-S204.
S201, acquiring a plurality of resource data, and acquiring at least one metadata corresponding to each resource data in the plurality of resource data.
The resource data may be data generated in business, technical implementation, management, and similar processes, such as documents, data tables, and the like, which is not limited herein. The metadata corresponding to the resource data may be data describing the resource data, and each resource data corresponds to at least one metadata.
Each metadata includes a metadata item and the metadata content corresponding to that metadata item. That is, one metadata may be composed of a metadata item and metadata content. For example, suppose the at least one metadata obtained for a form data is { creator: Zhang San; creation date: 2021-xx-xx; field: F1, F2, F3; storage address: C:\\oracle }. Then "creator", "creation date", "field", and "storage address" are the metadata items of the corresponding metadata, and "Zhang San", "2021-xx-xx", "F1", "F2", "F3", and "C:\\oracle" are the metadata content corresponding to those metadata items. Optionally, a plurality of metadata contents may correspond to the same metadata item.
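The item/content structure described above can be sketched in Python as follows. This is only an illustrative sketch: the `Metadata` class, the `contents_by_item` helper, and the sample values are assumptions introduced here and are not defined by the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metadata:
    """One metadata: a metadata item plus its metadata content (hypothetical type)."""
    item: str     # metadata item, e.g. "creator"
    content: str  # metadata content corresponding to that item


# Sample metadata for one form data, mirroring the example in the text.
form_metadata = [
    Metadata("creator", "Zhang San"),
    Metadata("creation date", "2021-xx-xx"),
    Metadata("field", "F1"),
    Metadata("field", "F2"),
    Metadata("field", "F3"),
    Metadata("storage address", r"C:\\oracle"),
]


def contents_by_item(metadata):
    """Group metadata contents under their metadata item.

    Several contents may share one item, as with "field" above.
    """
    grouped = {}
    for m in metadata:
        grouped.setdefault(m.item, []).append(m.content)
    return grouped
```

Keeping one record per (item, content) pair means a metadata item with several contents, such as "field", needs no special handling.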
In a possible implementation, at least one metadata corresponding to each resource data has a corresponding metadata type, and the metadata type may be technical metadata, business metadata, or management metadata.
The technical metadata is data for describing related concepts, relationships and rules of resource data in the technical field, and mainly comprises feature descriptions of data structures and data processing, and common technical metadata comprises: fields, storage locations, data models, database tables, field lengths, ETL scripts, SQL scripts, interface programs, data relationships, and the like; the business metadata is data for describing concepts, relationships and rules related to the resource data in the business field, and common business metadata includes: service definition, service terms, service rules, service indexes, and the like; management metadata is data used to describe related concepts, relationships and rules of resource data in the management domain, and common management metadata includes: data owner, data quality accountability, data security level, etc.
In a possible implementation manner, the at least one metadata corresponding to each resource data may be acquired based on a target metadata acquisition scheme constructed in advance. The target metadata acquisition scheme may indicate which resource data to acquire metadata for, which metadata items to acquire, and which data sources or databases to acquire the metadata from, which is not limited herein. Optionally, when the metadata acquisition scheme is constructed, an initial metadata acquisition scheme may be constructed according to metadata requirements and submitted to a review client for review. If the review passes, the initial metadata acquisition scheme is used as the target metadata acquisition scheme; if the review does not pass, the initial metadata acquisition scheme is adjusted and resubmitted for review until the review passes, and the scheme that passes review is used as the target metadata acquisition scheme. Optionally, the metadata may be collected manually according to the target metadata acquisition scheme, or acquired automatically from the data source indicated by the scheme through an interface such as an API. In some scenarios, when the metadata is acquired automatically through an interface such as an API, it may be acquired again at certain intervals to update the acquired metadata and keep it current.
S202, constructing at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata and the metadata content corresponding to the metadata item.
The triple data may include a first entity, a relationship, and a second entity, and may be represented as { first entity, relationship, second entity }. In some scenarios, the triple data may also be represented as { entity, attribute, attribute value }, which is not limited herein.
In one possible implementation, constructing at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata, and the metadata content corresponding to the metadata item includes the following steps: determining each resource data as a first entity, determining the metadata content corresponding to the metadata item as a second entity, and determining the relationship between the first entity and the second entity according to the metadata item corresponding to each metadata; and constructing at least one triple data for each resource data from the first entity, the second entity, and the relationship between the first entity and the second entity.
When each resource data is determined as a first entity, the first entity can be represented by the unique code or the data name corresponding to the resource data. When the metadata content corresponding to the metadata item is determined as a second entity, the second entity can be represented by the unique code or the content text corresponding to the metadata content. When the relationship between the first entity and the second entity is determined according to the metadata item corresponding to each metadata, the relationship can be represented by the unique code or the item name corresponding to the metadata item. The unique code may be a unique code corresponding to each resource data, metadata item, or metadata content, and may be composed of numbers, letters, or characters, which is not limited herein. The data name is the name of the resource data; the content text is the text of the metadata content in the metadata; the item name is the name of the metadata item in the metadata.
For example, suppose the at least one metadata corresponding to the resource data "2021-year employee information statistics table" contains a metadata item whose item name is "storage address" and whose metadata content is "C:\\oracle". The first entity may then be represented as "2021-year employee information statistics table", the second entity as "C:\\oracle", and the relationship between them as "storage address", so the triple data obtained for this resource data is { 2021-year employee information statistics table, storage address, C:\\oracle }.
It is understood that one triple data can be constructed for each metadata of each resource data, in which the first entity is the resource data, the second entity is the metadata content of the metadata, and the relationship is the metadata item of the metadata. In this way, at least one triple data corresponding to each resource data can be obtained.
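As a minimal sketch of this construction step, the following Python code maps each (metadata item, metadata content) pair of a resource data to one triple. The `Triple` type and the `build_triples` function are hypothetical names chosen for illustration, and entities are represented by data name and content text rather than by unique codes.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """{ first entity, relationship, second entity } (hypothetical type)."""
    first_entity: str   # the resource data (data name)
    relationship: str   # the metadata item (item name)
    second_entity: str  # the metadata content (content text)


def build_triples(resource_name, metadata):
    """Build one triple per metadata of a resource data.

    `metadata` is an iterable of (metadata item, metadata content) pairs.
    """
    return [Triple(resource_name, item, content) for item, content in metadata]


# Example values taken from the text; the "creator" pair is an added assumption.
triples = build_triples(
    "2021-year employee information statistics table",
    [("storage address", r"C:\\oracle"), ("creator", "Zhang San")],
)
```

Because no relational schema is fixed in advance, a new metadata item simply yields a triple with a new relationship name.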
In a possible implementation manner, if the triple data is constructed according to the resource data, a metadata item of technical metadata, and the metadata content corresponding to that metadata item, that is, according to the resource data and the technical metadata of the resource data, the triple data is technical triple data. If the triple data is constructed according to the resource data, a metadata item of business metadata, and the metadata content corresponding to that metadata item, that is, according to the resource data and the business metadata of the resource data, the triple data is business triple data. If the triple data is constructed according to the resource data, a metadata item of management metadata, and the metadata content corresponding to that metadata item, that is, according to the resource data and the management metadata of the resource data, the triple data is management triple data. Different types of triple data, such as technical triple data, business triple data, and management triple data, can thus be obtained, so that a triple data set can subsequently be obtained according to the type of the triple data.
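One way to picture this typing of triples is the sketch below, which assumes each metadata record already carries its metadata type. The `MetadataType` enum, the `group_triples_by_type` function, and the sample records are illustrative assumptions, not part of the application.

```python
from collections import defaultdict
from enum import Enum


class MetadataType(Enum):
    TECHNICAL = "technical"
    BUSINESS = "business"
    MANAGEMENT = "management"


def group_triples_by_type(resource_name, typed_metadata):
    """Build triples for one resource data and group them by metadata type.

    `typed_metadata` is an iterable of (metadata type, item, content) records.
    Returns a dict mapping each metadata type to its list of triples.
    """
    grouped = defaultdict(list)
    for metadata_type, item, content in typed_metadata:
        grouped[metadata_type].append((resource_name, item, content))
    return dict(grouped)


# Hypothetical records for a single resource data.
grouped = group_triples_by_type(
    "employee table",
    [
        (MetadataType.TECHNICAL, "storage address", r"C:\\oracle"),
        (MetadataType.TECHNICAL, "field", "F1"),
        (MetadataType.MANAGEMENT, "data owner", "Zhang San"),
    ],
)
```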
In a possible implementation manner, after the at least one triple data for each resource data is obtained, it may be stored in a blockchain, that is, the at least one triple data for each resource data is subjected to data uplink (on-chain) processing. The data uplink processing may include, but is not limited to, content certificate storage, hash certificate storage, link certificate storage, privacy certificate storage, and privacy certificate sharing, which is not limited herein. This improves data security and helps prevent the data from being tampered with.
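The application does not specify how the on-chain storage is performed. As a hedged illustration of the hash-based idea only, the sketch below computes a SHA-256 digest of a triple with Python's standard `hashlib`; such a digest could be recorded on a ledger and compared later to detect tampering. The function names and the JSON serialization are assumptions, and no real blockchain API is used.

```python
import hashlib
import json


def triple_digest(triple):
    """Return the SHA-256 hex digest of a triple's canonical JSON encoding."""
    payload = json.dumps(list(triple), ensure_ascii=False, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_untampered(triple, recorded_digest):
    """Check a triple against a digest recorded earlier (e.g. on a ledger)."""
    return triple_digest(triple) == recorded_digest


# Hypothetical triple and its recorded digest.
original = ("employee table", "storage address", r"C:\\oracle")
recorded = triple_digest(original)
```

Storing only the digest keeps the triple content off the ledger while still allowing later verification.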
S203, receiving a data query instruction submitted by a user client, and querying a triple data set to be queried indicated by the data query instruction from at least one triple data set.
The user client is any client for submitting a data query instruction. The user client logs in an account corresponding to the target object, and can submit a data query instruction to the electronic equipment in response to the data query operation of the target object. The data query operation may be a touch operation for a control for instructing to query the triple data set, or may be a voice instruction for instructing to query the triple data set, which is not limited herein.
The data query instruction may be an instruction for querying a triple data set to be queried, where the triple data set to be queried is a set of at least one triple data that the data query instruction indicates needs to query.
In one possible implementation, the data query instruction carries identification information indicating the triple data set to be queried. The identification information may be used to indicate which triple data belong to the triple data set to be queried. When the triple data set to be queried indicated by the data query instruction is queried from the at least one triple data, the triple data indicated by the identification information may be obtained from the at least one triple data. For example, the identification information may identify the category of each triple data in the triple data set to be queried: if the category is technical triple data, business triple data, management triple data, or the like, the triple data set belonging to the category indicated by the identification information may be queried from the at least one triple data. For another example, the identification information may identify the type of the resource data corresponding to the first entity of each triple data in the triple data set to be queried: if the type of the resource data is a document, a table, or the like, the triple data set corresponding to the type of resource data indicated by the identification information may be queried from the at least one triple data. The identification information may also identify other information, which is not limited herein.
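The two filtering examples above (by triple category and by resource type) can be sketched as a simple in-memory query. The `StoredTriple` record, its `category` and `resource_type` fields, and the `query_triples` function are hypothetical; the application does not prescribe a storage layout or query interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoredTriple:
    first_entity: str
    relationship: str
    second_entity: str
    category: str       # e.g. "technical", "business", "management"
    resource_type: str  # type of the first entity's resource data, e.g. "table"


def query_triples(store, category: Optional[str] = None,
                  resource_type: Optional[str] = None):
    """Return the triples matching the identification information.

    A criterion left as None is not applied.
    """
    return [
        t for t in store
        if (category is None or t.category == category)
        and (resource_type is None or t.resource_type == resource_type)
    ]


# Hypothetical store of previously constructed triples.
store = [
    StoredTriple("employee table", "storage address", r"C:\\oracle", "technical", "table"),
    StoredTriple("employee table", "data owner", "Zhang San", "management", "table"),
    StoredTriple("design doc", "creator", "Zhang San", "management", "document"),
]
```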
In a possible implementation manner, the data query instruction carries authority information of the target object and identification information indicating the triple data set to be queried. Querying the triple data set to be queried indicated by the data query instruction from the at least one triple data may include the following steps: detecting whether the authority information of the target object carried by the data query instruction indicates that the target object has the authority to query the triple data set indicated by the identification information; and, if the detection result is that the authority information indicates that the target object has the authority to query the triple data set indicated by the identification information, querying the triple data set to be queried indicated by the data query instruction from the at least one triple data. The target object is the object corresponding to the account logged in at the user client. The target object has corresponding authority information, which indicates which triple data the target object has the authority to query, that is, which triple data can be queried by the target object. The detection result is obtained by detecting whether the authority information of the target object carried by the data query instruction indicates that the target object has the authority to query the triple data set indicated by the identification information. The detection result may be that the authority information indicates that the target object has this authority, or that the authority information indicates that the target object does not have this authority.
And if the detection result is that the authority information indicates that the target object has the authority to query the triple data set indicated by the identification information, querying the triple data set to be queried indicated by the data query instruction from at least one triple data, namely querying the triple data set indicated by the identification information from at least one triple data.
Optionally, if the detection result is that the authority information indicates that the target object does not have the authority to query the triple data set indicated by the identification information, prompt information may be generated and sent to the user client corresponding to the target object, so as to prompt that the target object does not have the query authority. The prompt information may be text information prompting that the target object does not have the query authority. This prevents an object without authority from acquiring the triple data set, which improves the privacy and security of the data and avoids data leakage.
Optionally, the authority information may be determined according to the object group to which the target object belongs. Object groups may be divided according to information such as the responsibility, position, or department of the objects, and each object group may contain at least one object and has corresponding authority information. The authority information of the target object may then be determined by determining the object group to which the target object belongs and taking the authority information corresponding to that object group as the authority information of the target object.
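The authority check and the group-based authority lookup can be sketched as follows. This is a simplified illustration rather than the claimed implementation: the group names, the object names, the representation of authority information as a set of queryable categories, and the functions `permission_info` and `handle_query` are all assumptions. The sketch also looks the authority information up on the receiving side, whereas the text above describes it as being carried in the instruction.

```python
# Hypothetical object groups and the triple categories each group may query.
GROUP_PERMISSIONS = {
    "data engineering": {"technical", "service", "management"},
    "sales": {"service"},
}

# Hypothetical mapping from a target object (account) to its object group.
OBJECT_GROUP = {"alice": "data engineering", "bob": "sales"}


def permission_info(target_object):
    """Authority information of an object is that of the group it belongs to."""
    group = OBJECT_GROUP.get(target_object)
    return GROUP_PERMISSIONS.get(group, set())


def handle_query(target_object, requested_category, triples):
    """Return the requested triples, or prompt information if not authorised.

    Each triple is a tuple (first_entity, relation, second_entity, category).
    """
    if requested_category not in permission_info(target_object):
        return {"ok": False,
                "prompt": "You do not have permission to query this triple data set."}
    matched = [t for t in triples if t[3] == requested_category]
    return {"ok": True, "triples": matched}


sample_triples = [
    ("document A", "storage address", "C:\\oracle", "technical"),
    ("document A", "creator", "Zhang San", "management"),
]

allowed = handle_query("alice", "technical", sample_triples)
denied = handle_query("bob", "technical", sample_triples)
```

An unknown account falls back to an empty authority set, so it is denied by default.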
And S204, determining an entity and a relation corresponding to each triple data according to each triple data in the triple data set, and constructing a target knowledge graph aiming at the data query instruction according to the entity and the relation.
The entity may include a first entity and a second entity in each triplet data in the triplet data set, and the relationship may be a relationship in each triplet data in the triplet data set. The target knowledge graph may be a knowledge graph formed according to each triplet data in the queried triplet data set, and the target knowledge graph includes a plurality of vertices and edges connecting the vertices. For example, a first entity and a second entity in the triple data may be determined as corresponding vertices, and a relationship in the triple data may be determined as an edge between the vertices corresponding to the first entity and the second entity in the triple data, respectively.
In a possible implementation manner, determining an entity and a relationship corresponding to each triplet data according to each triplet data in the triplet data set, and constructing a target knowledge graph for the data query instruction according to the entity and the relationship may include the following steps: determining a first entity in each triplet of data in the triplet data set as a first vertex and determining a second entity in each triplet of data in the triplet data set as a second vertex; determining an edge between the first vertex and the second vertex based on a relationship between the first entity and the second entity in each triplet of data in the triplet data set; merging the same vertex in the first vertex and the second vertex to obtain a first target vertex; and constructing a target knowledge graph for the data query instruction according to the first vertex, the second vertex, the first target vertex and the edges among the vertices.
The first vertex is determined according to the first entity in the triple data, the second vertex is determined according to the second entity in the triple data, and the edge between the first vertex and the second vertex is determined according to the relationship in the corresponding triple data. It is understood that each triple data in the triple data set has a corresponding first vertex, second vertex, and edge. That is, if the triple data set includes a plurality of triple data, the triple data set has a plurality of first vertices, a second vertex corresponding to each first vertex (i.e., a plurality of second vertices), and an edge between each first vertex and its corresponding second vertex (i.e., a plurality of edges).
The first target vertex is obtained by merging identical vertices among the first vertices and the second vertices corresponding to the triple data in the triple data set. That is, if the first vertex corresponding to one triple data is the same as the second vertex corresponding to another triple data, these two vertices may be merged. Merging identical vertices means representing them as a single vertex in the knowledge graph. Therefore, by merging identical vertices among the first vertices and the second vertices, the resource data can be associated with one another, and the relationships between the resource data can be determined.
For example, please refer to fig. 3, which is a schematic diagram illustrating an effect of a knowledge graph provided in an embodiment of the present application. Suppose there are triple data A {document A, upstream document, document B} and triple data B {document B, creator, Zhang San}. The first entity "document A" in triple data A is determined as a first vertex, the second entity "document B" in triple data A is determined as the second vertex corresponding to that first vertex, and the relationship "upstream document" in triple data A is determined as the relationship between that first vertex and its second vertex. The first entity "document B" in triple data B is determined as another first vertex, the second entity "Zhang San" in triple data B is determined as the second vertex corresponding to that other first vertex, and the relationship "creator" in triple data B is determined as the relationship between that other first vertex and its second vertex. It can be seen that the second vertex corresponding to the second entity of triple data A and the first vertex corresponding to the first entity of triple data B are both "document B", so these identical vertices of the two triple data may be merged to obtain a first target vertex. The resulting knowledge graph including triple data A and triple data B is shown in fig. 3, where 301 is the merged first target vertex.
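As a minimal sketch of this construction, the following Python code builds the vertex and edge sets for triple data A and triple data B. It assumes that two vertices are "the same" when their entity names are equal, which the application does not state explicitly; the function name `build_graph` is hypothetical.

```python
def build_graph(triples):
    """Build a graph from (first_entity, relation, second_entity) triples.

    Vertices are stored in a set keyed by entity name, so a first vertex and
    a second vertex with the same name are merged into one vertex.
    """
    vertices = set()
    edges = []
    for first, relation, second in triples:
        vertices.add(first)
        vertices.add(second)
        edges.append((first, relation, second))
    return vertices, edges


triple_a = ("document A", "upstream document", "document B")
triple_b = ("document B", "creator", "Zhang San")

vertices, edges = build_graph([triple_a, triple_b])
# "document B" appears once in `vertices`: it is the merged first target vertex.
```

Without merging, the two triples would yield four vertices; with merging they yield three, and "document A" becomes connected to "Zhang San" through "document B".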
In a possible implementation manner, determining an entity and a relationship corresponding to each triplet data according to each triplet data in the triplet data set, and constructing a target knowledge graph for the data query instruction according to the entity and the relationship, may further include the following steps: determining a first entity in each triplet of data in the triplet data set as a first vertex and determining a second entity in each triplet of data in the triplet data set as a second vertex; determining an edge between the first vertex and the second vertex based on a relationship between the first entity and the second entity in each triplet of data in the triplet data set; merging the same vertexes in the second vertexes to obtain second target vertexes; and constructing a target knowledge graph aiming at the data query instruction according to the first vertex, the second target vertex and the edges among the vertexes.
The first vertex and the second vertex are as described above, and are not described again here.
The second target vertex is obtained by merging identical vertices among the second vertices corresponding to the triple data in the triple data set. That is, if the second vertex corresponding to one triple data is the same as the second vertex corresponding to another triple data, the two second vertices may be merged. Therefore, by merging identical vertices among the second vertices, the metadata contents of the resource data can be associated with one another, which facilitates determining the relationships among the attributes of the resource data.
For example, please refer to fig. 4, which is a schematic diagram illustrating an effect of a knowledge graph provided in an embodiment of the present application. Suppose there are triple data C {document A, creator, Zhang San} and triple data D {document B, creator, Zhang San}. The first entity "document A" in triple data C is determined as a first vertex, the second entity "Zhang San" in triple data C is determined as the second vertex corresponding to that first vertex, and the relationship "creator" in triple data C is determined as the relationship between that first vertex and its second vertex. The first entity "document B" in triple data D is determined as another first vertex, the second entity "Zhang San" in triple data D is determined as the second vertex corresponding to that other first vertex, and the relationship "creator" in triple data D is determined as the relationship between that other first vertex and its second vertex. It can be seen that the second vertex corresponding to the second entity of triple data C and the second vertex corresponding to the second entity of triple data D are both "Zhang San", so these identical second vertices of the two triple data may be merged to obtain a second target vertex. The resulting knowledge graph including triple data C and triple data D is shown in fig. 4, where 401 is the merged second target vertex.
In a possible implementation manner, determining an entity and a relationship corresponding to each triplet data according to each triplet data in the triplet data set, and constructing a target knowledge graph for the data query instruction according to the entity and the relationship, may further include the following steps: determining a first entity in each triplet of data in the triplet data set as a first vertex and determining a second entity in each triplet of data in the triplet data set as a second vertex; determining an edge between the first vertex and the second vertex based on a relationship between the first entity and the second entity in each triplet of data in the triplet data set; merging the same vertexes in the first vertexes to obtain a third target vertex; and constructing a target knowledge graph aiming at the data query instruction according to the first vertex, the second vertex, the third target vertex and the edges among the vertexes.
The first vertex and the second vertex are as described above, and are not described again here.
The third target vertex is obtained by merging identical vertices among the first vertices corresponding to the triple data in the triple data set. That is, if the first vertex corresponding to one triple data is the same as the first vertex corresponding to another triple data, the two first vertices may be merged. Merging identical vertices among the first vertices avoids having a plurality of first vertices corresponding to the same resource data in the resulting knowledge graph.
For example, please refer to fig. 5, which is a schematic diagram illustrating an effect of a knowledge graph provided in an embodiment of the present application. Suppose there are triple data E {document A, storage address, C:\oracle} and triple data F {document A, creator, Zhang San}. The first entity "document A" in triple data E is determined as a first vertex, the second entity "C:\oracle" in triple data E is determined as the second vertex corresponding to that first vertex, and the relationship "storage address" in triple data E is determined as the relationship between that first vertex and its second vertex. The first entity "document A" in triple data F is determined as another first vertex, the second entity "Zhang San" in triple data F is determined as the second vertex corresponding to that other first vertex, and the relationship "creator" in triple data F is determined as the relationship between that other first vertex and its second vertex. It can be seen that the first vertex corresponding to the first entity of triple data E and the first vertex corresponding to the first entity of triple data F are both "document A", so these identical first vertices of triple data E and triple data F may be merged to obtain a third target vertex. The resulting knowledge graph including triple data E and triple data F is shown in fig. 5, where 501 is the merged third target vertex.
In a possible implementation manner, after obtaining the first vertex, the second vertex, and the edge between the first vertex and the second vertex in each triplet data in the triplet data set, the first target vertex, the second target vertex, and the third target vertex may also be obtained, and then the target knowledge graph for the data query instruction may be constructed according to the first vertex, the second vertex, the first target vertex, the second target vertex, the third target vertex, and the edge between the vertices. That is, the same vertex of the multiple vertices (including the first vertices and the second vertices) corresponding to the triple data in the triple data set may be merged at the same time. Optionally, the first target vertex and the second target vertex may also be obtained, or the first target vertex and the third target vertex may also be obtained, or the second target vertex and the third target vertex may also be obtained, which is not limited herein.
For example, please refer to fig. 6, which is a schematic diagram illustrating an effect of a knowledge graph provided in an embodiment of the present application. Suppose there are triple data A {document A, upstream document, document B}, triple data B {document B, creator, Zhang San}, triple data C {document A, creator, Zhang San}, and triple data E {document A, storage address, C:\oracle}. If the first target vertex, the second target vertex, and the third target vertex are obtained from these triple data, the target knowledge graph for the data query instruction is constructed according to the first vertices, the second vertices, the first target vertex, the second target vertex, the third target vertex, and the edges between the vertices, as shown in fig. 6, where 601 is the first target vertex, 602 is the second target vertex, and 603 is the third target vertex.
It is to be understood that determining the first entity in each triple data in the triple data set as a first vertex, determining the second entity in each triple data as a second vertex, and determining the edge between the first vertex and the second vertex based on the relationship between the first entity and the second entity amounts to the following: for each resource data, the resource data is determined as the first vertex, the metadata content of the metadata corresponding to the resource data is determined as the second vertex, and the metadata item of that metadata is determined as the edge between the first vertex and the second vertex.
By adopting the embodiment of the application, a plurality of resource data can be obtained, at least one metadata corresponding to each resource data in the plurality of resource data is obtained, and at least one triple data for each resource data is constructed according to each resource data, the metadata item corresponding to each metadata, and the metadata content corresponding to the metadata item; a data query instruction submitted by a user client is received, and the triple data set to be queried indicated by the data query instruction is queried from the at least one triple data; and the entity and the relationship corresponding to each triple data are determined according to each triple data in the triple data set, and a target knowledge graph for the data query instruction is constructed according to the entity and the relationship. Therefore, by constructing a knowledge graph based on metadata, the relationships among the resource data are displayed, data management is realized, and the efficiency of data management is improved.
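The three kinds of merged vertex can be told apart by counting how often an entity occurs as a first entity and as a second entity. The sketch below applies this to the four triple data of the fig. 6 example. The function name `classify_merged_vertices` is hypothetical, name equality is assumed to define "the same vertex", and which entity each reference numeral in fig. 6 denotes is inferred from the definitions above rather than stated in the text.

```python
from collections import Counter


def classify_merged_vertices(triples):
    """Classify merged vertices from (first_entity, relation, second_entity) triples.

    Returns three sets:
      first_target  - entities that occur both as a first and as a second entity
      second_target - entities that occur as a second entity in several triples
      third_target  - entities that occur as a first entity in several triples
    """
    first_counts = Counter(t[0] for t in triples)
    second_counts = Counter(t[2] for t in triples)
    first_target = set(first_counts) & set(second_counts)
    second_target = {v for v, n in second_counts.items() if n > 1}
    third_target = {v for v, n in first_counts.items() if n > 1}
    return first_target, second_target, third_target


fig6_triples = [
    ("document A", "upstream document", "document B"),  # triple data A
    ("document B", "creator", "Zhang San"),             # triple data B
    ("document A", "creator", "Zhang San"),             # triple data C
    ("document A", "storage address", "C:\\oracle"),    # triple data E
]

first_target, second_target, third_target = classify_merged_vertices(fig6_triples)
```

An entity can in principle fall into more than one of these sets; in this example each set happens to contain a single, distinct entity.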
Referring to fig. 7, fig. 7 is a flowchart illustrating another method for data management based on a knowledge-graph according to an embodiment of the present application, which can be executed by the electronic device. The method may include the following steps.
S701, acquiring a plurality of resource data, and acquiring at least one metadata corresponding to each resource data in the plurality of resource data.
For the relevant description of step S701, reference may be made to the relevant description of step S201, which is not described herein again.
In a possible implementation manner, if the resource data includes first resource data and second resource data, the embodiment of the present application may further include obtaining a data source identifier carried in the first resource data, where the data source identifier is used to indicate an obtaining position of data in the first resource data; when the data source identification is used for indicating data in the second resource data, the first metadata corresponding to the data source identification is determined from the at least one metadata corresponding to the first resource data. Therefore, metadata used for indicating the data source in each resource data can be obtained, so that the resource data can be traced.
The first resource data may be any one of the acquired plurality of resource data, the second resource data may be any one of the acquired plurality of resource data except the first resource data, and data in the second resource data is a data source of the first resource data. The data source identifier is used to indicate an acquisition location of data in the first resource data, and if an acquisition location of a part of data in the first resource data is the second resource data, the data source identifier of the first resource data may indicate data in the second resource data, that is, the part of data in the first resource data is derived from data in the second resource data.
The first metadata is the metadata, among the at least one metadata corresponding to the first resource data, that corresponds to the data source identifier, that is, the metadata whose metadata content is the content indicated by the data source identifier. If the data source identifier indicates data in the second resource data, the metadata content of the first metadata is the second resource data. Accordingly, the metadata whose metadata content includes the second resource data is determined, from the at least one metadata corresponding to the first resource data, as the first metadata for the first resource data. Optionally, the first metadata may instead be determined as the metadata whose metadata content includes the second resource data and whose metadata item has an item name representing a data source; such an item name may be "data source", "data acquisition source", "upstream data", and the like, which is not limited herein. For example, if the data source identifier of table S indicates data in table R, the first metadata corresponding to the data source identifier is determined from the at least one metadata corresponding to table S, and the first metadata is {data source, table R}.
Optionally, if the data source identifier carried by the first resource data is empty or the first resource data does not carry the data source identifier, it may be indicated that the data in the first resource data is not derived from other resource data, and it is not necessary to determine the first metadata corresponding to the first resource data.
Optionally, the present application may also determine, directly from at least one metadata of the resource data, that the item name is used to represent the metadata of the data source as the first metadata, for example, the item name is "data source", "data acquisition source", "upstream data", and the like, which is not limited herein. That is, the corresponding first metadata is not determined by the data source identifier, but is determined directly according to the metadata item of the at least one metadata of the first resource data.
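The two ways of determining the first metadata can be sketched as one function. This is an illustration under assumptions: metadata is modelled as a list of (metadata item, metadata content) pairs, the function name `find_first_metadata` is hypothetical, and the set of item names is taken from the examples given in the text.

```python
# Item names that represent a data source, taken from the examples in the text.
SOURCE_ITEM_NAMES = {"data source", "data acquisition source", "upstream data"}


def find_first_metadata(metadata, source_resource=None):
    """Return the (item, content) pair that records the data source, or None.

    metadata:        list of (metadata item, metadata content) pairs
    source_resource: the resource indicated by the data source identifier; if
                     None, the match is made on the item name alone.
    """
    for item, content in metadata:
        if item not in SOURCE_ITEM_NAMES:
            continue
        if source_resource is None or content == source_resource:
            return (item, content)
    return None


table_s_metadata = [("creator", "Zhang San"), ("data source", "table R")]

by_identifier = find_first_metadata(table_s_metadata, source_resource="table R")
by_item_name = find_first_metadata(table_s_metadata)
no_source = find_first_metadata([("creator", "Zhang San")])
```

When a resource has no metadata recording a data source, the function returns `None`, which corresponds to the case where no first metadata needs to be determined.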
In a possible implementation manner, the resource data includes first table resource data and second table resource data, and this embodiment of the present application may further include, when it is detected that both the first table resource data and the second table resource data include a target field, acquiring second metadata corresponding to the target field from at least one piece of metadata corresponding to the first table resource data, and acquiring third metadata corresponding to the target field from at least one piece of metadata corresponding to the second table resource data. Therefore, the data relation determination of each field of the various table resource data can be facilitated by determining the same field in the fields included in each table resource data.
The first table resource data may be any table resource data among the obtained plurality of resource data, the second table resource data may be any table resource data among the obtained plurality of resource data other than the first table resource data, and both the first table resource data and the second table resource data include a target field. Table resource data is resource data in tabular form and includes a plurality of fields, where each line of data in the table resource data generally corresponds to one field. For example, if the lines of data in one table resource data respectively represent "employee name", "position", "age", and "gender", then "employee name", "position", "age", and "gender" each represent one field.
The target field is a field that is the same in the first table resource data and the second table resource data. The second metadata is the metadata corresponding to the target field among the at least one metadata corresponding to the first table resource data. The third metadata is the metadata corresponding to the target field among the at least one metadata corresponding to the second table resource data. The metadata corresponding to the target field may be metadata whose metadata content is the field name or the unique field code of the target field, so the metadata contents of the second metadata and the third metadata are both the field name or the unique field code of the target field. Optionally, the second metadata for the first table resource data may be determined, from the at least one metadata of the first table resource data, as the metadata whose metadata content includes the target field and whose metadata item has an item name representing field information, and the third metadata for the second table resource data may be determined in the same way from the at least one metadata of the second table resource data. The item name representing field information may be "field information", "field type", "field included", or the like, which is not limited herein. For example, if table S includes the field "employee name" and table R also includes the field "employee name", the second metadata {field, employee name} corresponding to the field "employee name" is determined from the at least one metadata corresponding to table S, and the third metadata {field, employee name} corresponding to the field "employee name" is determined from the at least one metadata corresponding to table R.
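A minimal sketch of detecting the target field and selecting the second and third metadata is given below. It assumes that metadata is a list of (metadata item, metadata content) pairs and that field metadata uses the item name "field", as in the example above; the function name and the extra sample fields are hypothetical.

```python
def field_metadata_for_shared_fields(first_meta, second_meta, field_item="field"):
    """Return (second_metadata, third_metadata) for fields common to both tables.

    first_meta / second_meta: lists of (metadata item, metadata content) pairs
    for the first and second table resource data respectively.
    """
    first_fields = {c for i, c in first_meta if i == field_item}
    second_fields = {c for i, c in second_meta if i == field_item}
    target_fields = first_fields & second_fields
    second_metadata = [(i, c) for i, c in first_meta
                       if i == field_item and c in target_fields]
    third_metadata = [(i, c) for i, c in second_meta
                      if i == field_item and c in target_fields]
    return second_metadata, third_metadata


# Hypothetical metadata for table S and table R.
table_s_meta = [("field", "employee name"), ("field", "position"),
                ("creator", "Zhang San")]
table_r_meta = [("field", "employee name"), ("field", "department")]

second_metadata, third_metadata = field_metadata_for_shared_fields(
    table_s_meta, table_r_meta)
```

Matching here is on the field name; a unique field code could be used as the metadata content instead without changing the logic.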
S702, determining each resource data as a first entity, determining the metadata content corresponding to the metadata item as a second entity, and determining the relationship between the first entity and the second entity according to the metadata item corresponding to each metadata.
S703, constructing at least one triple data for each resource data according to the first entity, the second entity and the relationship between the first entity and the second entity.
For the relevant description of steps S702 to S703, reference may be made to the relevant description of steps S202 to S203, which is not described herein again.
In a possible implementation manner, for a first metadata of the first resource data, the first resource data is determined as a first entity, a metadata content in the first metadata is determined as a second entity, and a relationship is determined according to a metadata item in the first metadata, thereby obtaining triple data corresponding to the first metadata. For a second resource data, the second resource data may be determined as a first entity in the triple data corresponding to any metadata corresponding to the second resource data, for example, for a table S, the first metadata is { data source, table R }, the triple data corresponding to the first metadata is { table S, data source, table R }, and for a table R, the table R may be determined as a first entity in the triple data corresponding to any metadata corresponding to table R.
In one possible implementation, corresponding triple data may be constructed for the second metadata of the first table resource data, and corresponding triple data may be constructed for the third metadata of the second table resource data. Specifically, the first table resource data is determined as a first entity, the metadata content in the second metadata is determined as a second entity, and the relationship is determined according to the metadata item in the second metadata, so that the triple data corresponding to the second metadata is obtained. Likewise, the second table resource data is determined as a first entity, the metadata content in the third metadata is determined as a second entity, and the relationship is determined according to the metadata item in the third metadata, so that the triple data corresponding to the third metadata is obtained. For example, if the second metadata corresponding to table S is {field, employee name}, the triple data corresponding to the second metadata of table S is {table S, field, employee name}; if the third metadata corresponding to table R is {field, employee name}, the triple data corresponding to the third metadata of table R is {table R, field, employee name}.
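Steps S702 and S703 can be sketched in a few lines of Python, using the table S and table R examples. The function name `build_triples` and the representation of metadata as (metadata item, metadata content) pairs are assumptions made for illustration.

```python
def build_triples(resource, metadata):
    """Construct (first entity, relationship, second entity) triples.

    resource: the resource data, used as the first entity
    metadata: list of (metadata item, metadata content) pairs; the item gives
              the relationship and the content gives the second entity
    """
    return [(resource, item, content) for item, content in metadata]


table_s_triples = build_triples(
    "table S", [("data source", "table R"), ("field", "employee name")])
table_r_triples = build_triples(
    "table R", [("field", "employee name")])
```

Each metadata entry of a resource yields exactly one triple, so a resource with n metadata entries contributes n triples.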
S704, receiving a data query instruction submitted by a user client, and querying a triple data set to be queried indicated by the data query instruction from at least one triple data set.
S705, determining an entity and a relation corresponding to each triple data according to each triple data in the triple data set, and constructing a target knowledge graph aiming at the data query instruction according to the entity and the relation.
For the relevant description of steps S704-S705, reference may be made to the relevant description of steps S203-S204, which is not described herein again.
In one possible implementation, corresponding first metadata is determined for the first resource data, and the metadata content of the first metadata is the second resource data. The first entity corresponding to the second resource data may be determined as a first vertex, and the second entity corresponding to the metadata content of the first metadata may be determined as a second vertex. The first entity corresponding to the first resource data, to which the first metadata belongs, may be determined as another first vertex, and the edge between that first vertex and the second vertex corresponding to the first metadata may be determined based on the relationship in the first metadata.
Then, identical vertices among the first vertices and the second vertices are merged to obtain a first target vertex; specifically, the first vertex corresponding to the second resource data may be merged with the second vertex corresponding to the first metadata. In this way, an edge is established between the vertex corresponding to the first resource data and the vertex corresponding to the second resource data, and this edge expresses the relationship that the data source of the first resource data is the second resource data. If this operation is performed for each resource data, the data source of each resource data can be determined and embodied in the knowledge graph, so that the data in the resource data can be traced.
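Once data-source edges are present in the graph, tracing amounts to following those edges upstream. The sketch below shows one way to do this; it is not taken from the application. The function name `trace_sources`, the relationship name "data source", and the additional resource "table Q" are hypothetical.

```python
def trace_sources(triples, resource, source_relation="data source"):
    """Return every upstream resource reachable from `resource`.

    triples: (first_entity, relation, second_entity) tuples; a triple with
    relation `source_relation` means the second entity is a data source of
    the first entity.
    """
    upstream = {}
    for first, relation, second in triples:
        if relation == source_relation:
            upstream.setdefault(first, []).append(second)

    seen = {resource}  # guards against cycles in the source relationships
    stack = [resource]
    order = []
    while stack:
        current = stack.pop()
        for source in upstream.get(current, []):
            if source not in seen:
                seen.add(source)
                order.append(source)
                stack.append(source)
    return order


lineage_triples = [
    ("table S", "data source", "table R"),
    ("table R", "data source", "table Q"),  # hypothetical further upstream table
    ("table S", "creator", "Zhang San"),
]

sources_of_s = trace_sources(lineage_triples, "table S")
```

Only edges carrying the data-source relationship are followed, so other metadata such as the creator does not appear in the result.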
In a possible implementation manner, corresponding second metadata is determined for the first table type resource data, and corresponding third metadata is determined for the second table type resource data, the metadata content of the second metadata is the same as the metadata content of the third metadata, and the metadata content is the target field. A second entity corresponding to the metadata content included in the second metadata may be determined as a corresponding second vertex, and a second entity corresponding to the metadata content included in the third metadata may be determined as another second vertex. And a first entity corresponding to the first table type resource data included in the second metadata may be determined as a first vertex, an edge between the first vertex and the second vertex corresponding to the second metadata may be determined based on a relationship in the second metadata, a first entity corresponding to the second table type resource data included in the third metadata may be determined as another first vertex, and an edge between the first vertex and the second vertex corresponding to the third metadata may be determined based on a relationship in the third metadata.
The identical vertices among the second vertices are then merged to obtain a second target vertex; specifically, the second vertex corresponding to the second metadata and the other second vertex corresponding to the third metadata may be merged to obtain the second target vertex. The vertex corresponding to the target field in the first table type resource data and the vertex corresponding to the target field in the second table type resource data are thereby merged, and the association between the first table type resource data and the second table type resource data through the shared target field can be found. If the above operation is performed for each table type resource data, the relationship between the fields of each resource data and the fields of other resource data can be determined and embodied in the knowledge graph, so as to discover the relationships between fields in the table type resource data.
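A minimal sketch of this shared-field case, assuming hypothetical table names and a hypothetical metadata item `has_field`: second vertices with the same metadata content are merged by grouping on that content, and any merged vertex reached from more than one table exposes an association between those tables.

```python
# Hypothetical triples; "has_field" is an assumed metadata item name.
triples = [
    ("customer_table", "has_field", "customer_id"),
    ("order_table", "has_field", "customer_id"),
    ("order_table", "has_field", "order_date"),
]

def merge_second_vertices(triples):
    """Map each distinct second entity to the set of first entities pointing at it."""
    merged = {}
    for first, _relation, second in triples:
        merged.setdefault(second, set()).add(first)
    return merged

merged = merge_second_vertices(triples)
# A field referenced by more than one table reveals an association between those tables.
shared = {field: tables for field, tables in merged.items() if len(tables) > 1}
```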
In one possible implementation, after the target knowledge-graph is constructed, the target knowledge-graph may be returned to the user client so that the user client can display the target knowledge-graph. For example, please refer to fig. 8, and fig. 8 is a schematic diagram illustrating an effect of a target knowledge graph display interface according to an embodiment of the present application. The target knowledge-graph display interface shown in fig. 8 includes a graph type selection area shown at 801, a search area shown at 802, and a knowledge-graph display area shown at 803.
The graph type selection area may be used to select which knowledge graph to acquire, such as a full metadata graph, a business metadata graph, a technical metadata graph, or a management metadata graph. The full metadata graph is a knowledge graph constructed from all the triple data, the business metadata graph is a knowledge graph constructed from the business triple data, the technical metadata graph is a knowledge graph constructed from the technical triple data, and the management metadata graph is a knowledge graph constructed from the management triple data. When the user client detects that the corresponding control is clicked, it may submit a data query instruction to the electronic device, and the identification information carried in the data query instruction may indicate the type of the triple data in the triple data set to be queried. For example, if the target object clicks the control for querying the business metadata graph, the user client submits a data query instruction to the electronic device, and the identification information carried in the data query instruction may indicate that each triple data in the triple data set to be queried is business triple data.
The graph display area may be the area that displays the target knowledge graph transmitted by the electronic device; the received target knowledge graph is displayed in this area. The search area may be used to search for any vertex in the knowledge graph, so that the position of the searched vertex in the target knowledge graph can be located quickly.
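One way to picture how the identification information selects a triple data set is the sketch below. It assumes, purely for illustration, that each stored triple carries a type tag; the `Triple` class, the tag values, and `select_triples` are hypothetical names rather than anything defined by the application.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    first: str
    relation: str
    second: str
    kind: str  # assumed tag: "business", "technical" or "management"

# Hypothetical stored triple data.
STORE = [
    Triple("orders_table", "business_owner", "sales_team", "business"),
    Triple("orders_table", "storage_format", "parquet", "technical"),
    Triple("orders_table", "retention", "3 years", "management"),
]

def select_triples(store, identification):
    """Return the triple data set indicated by the identification information."""
    if identification == "all":
        return list(store)
    return [t for t in store if t.kind == identification]

business = select_triples(STORE, "business")
```

Selecting `"all"` corresponds to the full metadata graph, while the other values restrict the graph to one category of triple data.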
By adopting the embodiments of the present application, a plurality of resource data can be obtained, together with at least one metadata corresponding to each resource data in the plurality of resource data, and at least one triple data can be constructed for each resource data according to that resource data, the metadata item corresponding to each metadata, and the metadata content corresponding to the metadata item. A data query instruction submitted by a user client is received, and the triple data set to be queried indicated by the data query instruction is queried from the at least one triple data. An entity and a relationship corresponding to each triple data are determined according to each triple data in the triple data set, and a target knowledge graph for the data query instruction is constructed according to the entity and the relationship. The relationships among the resource data are thus displayed by constructing a knowledge graph based on the metadata, which realizes management of the data and improves the efficiency of data management.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a data management apparatus based on a knowledge graph according to an embodiment of the present application. Optionally, the data management apparatus based on the knowledge-graph may be disposed in the electronic device. As shown in fig. 9, the data management apparatus based on a knowledge-graph described in the present embodiment may include:
an obtaining unit 901, configured to obtain a plurality of resource data, and obtain at least one metadata corresponding to each resource data in the plurality of resource data, where each metadata includes a metadata item and a metadata content corresponding to the metadata item;
a processing unit 902, configured to construct at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata, and the metadata content corresponding to the metadata item;
a receiving unit 903, configured to receive a data query instruction submitted by a user client, and query, from the at least one triple data, a triple data set to be queried, where the triple data set is indicated by the data query instruction;
the processing unit 902 is further configured to determine, according to each triplet data in the triplet data set, an entity and a relationship corresponding to each triplet data, and construct a target knowledge graph for the data query instruction according to the entity and the relationship.
In one implementation, the triple data includes a first entity, a relationship, and a second entity; the processing unit 902 is specifically configured to:
determining each resource data as a first entity, determining the metadata content corresponding to the metadata item as a second entity, and determining the relationship between the first entity and the second entity according to the metadata item corresponding to each metadata;
constructing at least one triple data for each resource data according to the first entity, the second entity, and the relationship between the first entity and the second entity.
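The triple construction described above can be sketched as follows, assuming the metadata of one resource is available as a mapping from metadata item to metadata content. The function name and the sample values are hypothetical.

```python
def build_triples(resource, metadata):
    """Build (first entity, relationship, second entity) triples for one resource.

    The resource is the first entity, each metadata item names the relationship,
    and the corresponding metadata content is the second entity.
    """
    return [(resource, item, content) for item, content in metadata.items()]

# Hypothetical resource and metadata, for illustration only.
triples = build_triples("orders_table", {"owner": "sales_team", "format": "parquet"})
```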
In one implementation, the processing unit 902 is specifically configured to:
determining the first entity in each triple data in the triple data set as a first vertex, and the second entity in each triple data in the triple data set as a second vertex;
determining an edge between the first vertex and the second vertex based on the relationship between the first entity and the second entity in each triple data in the triple data set;
merging the same vertex in the first vertex and the second vertex to obtain a first target vertex;
constructing the target knowledge-graph for the data query instruction from the first vertex, the second vertex, the first target vertex, and edges between the vertices.
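These graph-construction steps can be sketched with plain sets, under the assumption that two vertices are "the same" when their labels are equal. The function name and sample triples are hypothetical.

```python
def build_graph(triples):
    """Derive vertices, edges and merged target vertices from a triple data set."""
    first_vertices = {first for first, _, _ in triples}
    second_vertices = {second for _, _, second in triples}
    # A label that occurs both as a first and as a second vertex is merged
    # into a single (first) target vertex.
    target_vertices = first_vertices & second_vertices
    vertices = first_vertices | second_vertices
    edges = {(first, relation, second) for first, relation, second in triples}
    return vertices, edges, target_vertices

# Hypothetical triple data set.
vertices, edges, targets = build_graph([
    ("report", "data_source", "table"),
    ("table", "owner", "team"),
])
```

Using set union for the vertex collection means the merge happens implicitly; the intersection is computed only to make the merged vertices visible.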
In one implementation, the resource data includes a first resource data and a second resource data, and the processing unit 902 is further configured to:
acquiring a data source identifier carried in the first resource data, wherein the data source identifier is used for indicating an acquisition position of data in the first resource data;
when the data source identification is used for indicating data in the second resource data, determining first metadata corresponding to the data source identification from at least one metadata corresponding to the first resource data;
determining that a first entity corresponding to the second resource data is the first vertex, and determining that a second entity corresponding to metadata content included in the first metadata is the second vertex;
the processing unit 902 is specifically configured to:
and combining the first vertex corresponding to the second resource data and the second vertex corresponding to the first metadata to obtain the first target vertex.
In one implementation, the processing unit 902 is specifically configured to:
determining the first entity in each triple data in the triple data set as a first vertex, and the second entity in each triple data in the triple data set as a second vertex;
determining an edge between the first vertex and the second vertex based on the relationship between the first entity and the second entity in each triple data in the triple data set;
merging the same vertexes in the second vertexes to obtain second target vertexes;
constructing the target knowledge-graph for the data query instruction from the first vertex, the second target vertex, and edges between the respective vertices.
In one implementation, the resource data includes first table type resource data and second table type resource data; the processing unit 902 is further configured to:
when it is detected that the first table resource data and the second table resource data both include a target field, acquiring second metadata corresponding to the target field from at least one piece of metadata corresponding to the first table resource data, and acquiring third metadata corresponding to the target field from at least one piece of metadata corresponding to the second table resource data;
determining a second entity corresponding to the metadata content in the second metadata as a second vertex, and determining a second entity corresponding to the metadata content in the third metadata as another second vertex;
the processing unit 902 is specifically configured to:
and combining the second vertex corresponding to the second metadata and the other second vertex corresponding to the third metadata to obtain the second target vertex.
In one implementation manner, the data query instruction carries permission information of a target object and identification information used for indicating a triple data set to be queried; the processing unit 902 is specifically configured to:
detecting whether the authority information of the target object carried by the data query instruction indicates that the target object has the authority to query the triple data set indicated by the identification information;
and if the detection result is that the permission information indicates that the target object has the permission to query the triple data set indicated by the identification information, querying the triple data set to be queried indicated by the data query instruction from the at least one triple data.
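The permission check can be sketched as below. The layout of the data query instruction (a dictionary with `identification` and `permissions` keys), the type tag on each stored triple, and raising `PermissionError` on failure are all assumptions made for illustration; the application does not specify them here.

```python
# Each stored triple carries an assumed type tag as its fourth element.
STORE = [
    ("orders_table", "business_owner", "sales_team", "business"),
    ("orders_table", "storage_format", "parquet", "technical"),
]

def query_triples(store, instruction):
    """Return the requested triple data set only if the permission information allows it."""
    requested = instruction["identification"]
    if requested not in instruction["permissions"]:
        # Assumed failure behavior for this sketch.
        raise PermissionError(f"no permission to query {requested!r} triples")
    return [t[:3] for t in store if t[3] == requested]

allowed = query_triples(
    STORE, {"identification": "business", "permissions": {"business"}}
)
```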
Referring to fig. 10, fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device described in this embodiment includes: a processor 1001 and a memory 1002. Optionally, the electronic device may further include a network interface 1003 or a power supply module. The processor 1001, the memory 1002, and the network interface 1003 may exchange data with each other.
The processor 1001 may be a Central Processing Unit (CPU), or may be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor or any conventional processor.
The network interface 1003 may include an input device such as a control panel, a microphone, a receiver, etc., and/or an output device such as a display, a transmitter, etc., to name but a few. For example, in an application embodiment, the network interface may include a receiver and a transmitter.
The memory 1002 may include a read-only memory and a random access memory, and provides program instructions and data to the processor 1001. A portion of the memory 1002 may also include non-volatile random access memory. When the processor 1001 calls the program instruction, it is configured to:
obtaining a plurality of resource data, and obtaining at least one metadata corresponding to each resource data in the plurality of resource data, wherein each metadata comprises a metadata item and metadata content corresponding to the metadata item;
constructing at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata, and the metadata content corresponding to the metadata item;
receiving a data query instruction submitted by a user client, and querying, from the at least one triple data, a triple data set to be queried indicated by the data query instruction;
and determining an entity and a relationship corresponding to each triple data according to each triple data in the triple data set, and constructing a target knowledge graph for the data query instruction according to the entity and the relationship.
In one implementation, the triple data includes a first entity, a relationship, and a second entity; the processor 1001 is specifically configured to:
determining each resource data as a first entity, determining the metadata content corresponding to the metadata item as a second entity, and determining the relationship between the first entity and the second entity according to the metadata item corresponding to each metadata;
constructing at least one triple data for each resource data according to the first entity, the second entity, and the relationship between the first entity and the second entity.
In one implementation, the processor 1001 is specifically configured to:
determining the first entity in each triple data in the triple data set as a first vertex, and the second entity in each triple data in the triple data set as a second vertex;
determining an edge between the first vertex and the second vertex based on the relationship between the first entity and the second entity in each triple data in the triple data set;
merging the same vertex in the first vertex and the second vertex to obtain a first target vertex;
constructing the target knowledge-graph for the data query instruction from the first vertex, the second vertex, the first target vertex, and edges between the vertices.
In one implementation, the resource data includes a first resource data and a second resource data; the processor 1001 is further configured to:
acquiring a data source identifier carried in the first resource data, wherein the data source identifier is used for indicating an acquisition position of data in the first resource data;
when the data source identification is used for indicating data in the second resource data, determining first metadata corresponding to the data source identification from at least one metadata corresponding to the first resource data;
determining that a first entity corresponding to the second resource data is the first vertex, and determining that a second entity corresponding to metadata content included in the first metadata is the second vertex;
the processor 1001 is specifically configured to:
and combining the first vertex corresponding to the second resource data and the second vertex corresponding to the first metadata to obtain the first target vertex.
In one implementation, the processor 1001 is specifically configured to:
determining the first entity in each triple data in the triple data set as a first vertex, and the second entity in each triple data in the triple data set as a second vertex;
determining an edge between the first vertex and the second vertex based on the relationship between the first entity and the second entity in each triple data in the triple data set;
merging the same vertexes in the second vertexes to obtain second target vertexes;
constructing the target knowledge-graph for the data query instruction from the first vertex, the second target vertex, and edges between the respective vertices.
In one implementation, the resource data includes first table type resource data and second table type resource data; the processor 1001 is further configured to:
when it is detected that the first table resource data and the second table resource data both include a target field, acquiring second metadata corresponding to the target field from at least one piece of metadata corresponding to the first table resource data, and acquiring third metadata corresponding to the target field from at least one piece of metadata corresponding to the second table resource data;
determining a second entity corresponding to the metadata content in the second metadata as a second vertex, and determining a second entity corresponding to the metadata content in the third metadata as another second vertex;
the processor 1001 is specifically configured to:
and combining the second vertex corresponding to the second metadata and the other second vertex corresponding to the third metadata to obtain the second target vertex.
In one implementation manner, the data query instruction carries permission information of a target object and identification information used for indicating a triple data set to be queried; the processor 1001 is specifically configured to:
detecting whether the authority information of the target object carried by the data query instruction indicates that the target object has the authority to query the triple data set indicated by the identification information;
and if the detection result is that the permission information indicates that the target object has the permission to query the triple data set indicated by the identification information, querying the triple data set to be queried indicated by the data query instruction from the at least one triple data.
Optionally, the program instructions may also implement other steps of the method in the above embodiments when executed by the processor, and details are not described here.
The present application further provides a computer-readable storage medium, in which a computer program is stored, the computer program comprising program instructions, which, when executed by a processor, cause the processor to perform the above method, such as performing the above method performed by an electronic device, which is not described herein in detail.
Optionally, the storage medium, such as a computer-readable storage medium, referred to herein may be non-volatile or volatile.
Alternatively, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like. The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism and an encryption algorithm. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
It should be noted that, for simplicity of description, the above-mentioned embodiments of the method are described as a series of acts or combinations, but those skilled in the art should understand that the present application is not limited by the order of acts described, as some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing associated hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps performed in the embodiments of the methods described above. For example, the computer device may be a terminal, or may be a server.
The data management method based on a knowledge graph, the related device, and the medium provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present application, and the description of the embodiments is only intended to help understand the method and core idea of the present application. Meanwhile, a person skilled in the art may, according to the idea of the present application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A data management method based on a knowledge graph, characterized by comprising the following steps:
obtaining a plurality of resource data, and obtaining at least one metadata corresponding to each resource data in the plurality of resource data, wherein each metadata comprises a metadata item and metadata content corresponding to the metadata item;
constructing at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata, and the metadata content corresponding to the metadata item;
receiving a data query instruction submitted by a user client, and querying, from the at least one triple data, a triple data set to be queried indicated by the data query instruction;
and determining an entity and a relationship corresponding to each triple data according to each triple data in the triple data set, and constructing a target knowledge graph for the data query instruction according to the entity and the relationship.
2. The method of claim 1, wherein the triple data comprises a first entity, a relationship, and a second entity; the constructing at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata and the metadata content corresponding to the metadata item includes:
determining each resource data as a first entity, determining the metadata content corresponding to the metadata item as a second entity, and determining the relationship between the first entity and the second entity according to the metadata item corresponding to each metadata;
constructing at least one triple data for each resource data according to the first entity, the second entity, and the relationship between the first entity and the second entity.
3. The method of claim 2, wherein the determining, according to each triplet data in the triplet data set, an entity and a relationship corresponding to each triplet data, and constructing a target knowledge-graph for the data query instruction according to the entity and the relationship comprises:
determining the first entity in each triple data in the triple data set as a first vertex, and the second entity in each triple data in the triple data set as a second vertex;
determining an edge between the first vertex and the second vertex based on the relationship between the first entity and the second entity in each triple data in the triple data set;
merging the same vertex in the first vertex and the second vertex to obtain a first target vertex;
constructing the target knowledge-graph for the data query instruction from the first vertex, the second vertex, the first target vertex, and edges between the vertices.
4. The method of claim 3, wherein the resource data comprises a first resource data and a second resource data; the method further comprises the following steps:
acquiring a data source identifier carried in the first resource data, wherein the data source identifier is used for indicating an acquisition position of data in the first resource data;
when the data source identification is used for indicating data in the second resource data, determining first metadata corresponding to the data source identification from at least one metadata corresponding to the first resource data;
determining that a first entity corresponding to the second resource data is the first vertex, and determining that a second entity corresponding to metadata content included in the first metadata is the second vertex;
the merging the same vertex in the first vertex and the second vertex to obtain a first target vertex includes:
and combining the first vertex corresponding to the second resource data and the second vertex corresponding to the first metadata to obtain the first target vertex.
5. The method of claim 2, wherein the determining, according to each triplet data in the triplet data set, an entity and a relationship corresponding to each triplet data, and constructing a target knowledge-graph for the data query instruction according to the entity and the relationship comprises:
determining the first entity in each triple data in the triple data set as a first vertex, and the second entity in each triple data in the triple data set as a second vertex;
determining an edge between the first vertex and the second vertex based on the relationship between the first entity and the second entity in each triple data in the triple data set;
merging the same vertexes in the second vertexes to obtain second target vertexes;
constructing the target knowledge-graph for the data query instruction from the first vertex, the second target vertex, and edges between the respective vertices.
6. The method of claim 5, wherein the resource data comprises a first table type resource data and a second table type resource data; the method further comprises the following steps:
when it is detected that the first table resource data and the second table resource data both include a target field, acquiring second metadata corresponding to the target field from at least one piece of metadata corresponding to the first table resource data, and acquiring third metadata corresponding to the target field from at least one piece of metadata corresponding to the second table resource data;
determining a second entity corresponding to the metadata content in the second metadata as a second vertex, and determining a second entity corresponding to the metadata content in the third metadata as another second vertex;
the merging the same vertices in the second vertices to obtain second target vertices includes:
and combining the second vertex corresponding to the second metadata and the other second vertex corresponding to the third metadata to obtain the second target vertex.
7. The method according to claim 1, wherein the data query instruction carries authority information of a target object and identification information for indicating a triple data set to be queried; the querying the triple data set to be queried indicated by the data query instruction from the at least one triple data includes:
detecting whether the authority information of the target object carried by the data query instruction indicates that the target object has the authority to query the triple data set indicated by the identification information;
and if the detection result is that the permission information indicates that the target object has the permission to query the triple data set indicated by the identification information, querying the triple data set to be queried indicated by the data query instruction from the at least one triple data.
8. A knowledge-graph-based data management apparatus, comprising:
an acquisition unit, configured to obtain a plurality of resource data and obtain at least one metadata corresponding to each resource data in the plurality of resource data, wherein each metadata comprises a metadata item and metadata content corresponding to the metadata item;
a processing unit, configured to construct at least one triple data for each resource data according to each resource data, the metadata item corresponding to each metadata, and the metadata content corresponding to the metadata item;
a receiving unit, configured to receive a data query instruction submitted by a user client, and query, from the at least one triple data, a triple data set to be queried indicated by the data query instruction;
the processing unit is further configured to determine, according to each triplet data in the triplet data set, an entity and a relationship corresponding to each triplet data, and construct a target knowledge graph for the data query instruction according to the entity and the relationship.
9. An electronic device comprising a processor, a memory, wherein the memory is configured to store a computer program comprising program instructions, and wherein the processor is configured to invoke the program instructions to perform the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-7.
CN202111224839.XA 2021-10-20 2021-10-20 Data management method based on knowledge graph, related equipment and medium Pending CN113934729A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111224839.XA CN113934729A (en) 2021-10-20 2021-10-20 Data management method based on knowledge graph, related equipment and medium

Publications (1)

Publication Number Publication Date
CN113934729A true CN113934729A (en) 2022-01-14

Family

ID=79280996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111224839.XA Pending CN113934729A (en) 2021-10-20 2021-10-20 Data management method based on knowledge graph, related equipment and medium

Country Status (1)

Country Link
CN (1) CN113934729A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115357656A (en) * 2022-10-24 2022-11-18 太极计算机股份有限公司 Information processing method and device based on big data and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190018849A1 (en) * 2017-07-14 2019-01-17 Guangzhou Shenma Mobile Information Technology Co., Ltd. Information query method and apparatus
US20190018839A1 (en) * 2017-07-14 2019-01-17 Guangzhou Shenma Mobile Information Technology Co., Ltd. Knowledge map-based question-answer method, device, and storage medium
CN109992672A (en) * 2019-04-11 2019-07-09 华北科技学院 Knowledge mapping construction method based on disaster scene
CN110489561A (en) * 2019-07-12 2019-11-22 平安科技(深圳)有限公司 Knowledge mapping construction method, device, computer equipment and storage medium
CN111324609A (en) * 2020-02-17 2020-06-23 腾讯云计算(北京)有限责任公司 Knowledge graph construction method and device, electronic equipment and storage medium
CN111651614A (en) * 2020-07-16 2020-09-11 宁波方太厨具有限公司 Method and system for constructing medicated diet knowledge graph, electronic equipment and storage medium
CN111767440A (en) * 2020-09-03 2020-10-13 平安国际智慧城市科技股份有限公司 Vehicle portrayal method based on knowledge graph, computer equipment and storage medium
CN112269883A (en) * 2020-10-19 2021-01-26 北京希瑞亚斯科技有限公司 Personnel information query method and device, electronic equipment and storage medium
CN112948547A (en) * 2021-01-26 2021-06-11 中国石油大学(北京) Logging knowledge graph construction query method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110472068B (en) Big data processing method, equipment and medium based on heterogeneous distributed knowledge graph
US20120023586A1 (en) Determining privacy risk for database queries
US20230244653A1 (en) Semantic compliance validation for blockchain
EP3188051B1 (en) Systems and methods for search template generation
CN111046237B (en) User behavior data processing method and device, electronic equipment and readable medium
CN107015987B (en) Method and equipment for updating and searching database
CN110659282B (en) Data route construction method, device, computer equipment and storage medium
CN111914135A (en) Data query method and device, electronic equipment and storage medium
US9998450B2 (en) Automatically generating certification documents
CN112597168A (en) Processing method, device and platform of multi-source customer data and storage medium
US11113267B2 (en) Enforcing path consistency in graph database path query evaluation
CN116719799A (en) Environment-friendly data management method, device, computer equipment and storage medium
CN112328575B (en) Data asset blood-edge generation method and device and electronic equipment
CN113934729A (en) Data management method based on knowledge graph, related equipment and medium
CN109947797B (en) Data inspection device and method
CN112508119A (en) Feature mining combination method, device, equipment and computer readable storage medium
CN111611230A (en) Method and device for establishing main data system, computer equipment and storage medium
CN116701355A (en) Data view processing method, device, computer equipment and readable storage medium
CN116414854A (en) Data asset query method, device, computer equipment and storage medium
WO2019168677A1 (en) Multi-dimensional organization of data for efficient analysis
CN115733787A (en) Network identification method, device, server and storage medium
US9489438B2 (en) Systems and methods for visualizing master data services information
CN115017185A (en) Data processing method, device and storage medium
US20200201829A1 (en) Systems and methods for compiling a database
CN115423595B (en) File information processing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination