CN111639082B - Object storage management method and system of billion-level node scale knowledge graph based on Ceph - Google Patents


Info

Publication number
CN111639082B
CN111639082B
Authority
CN
China
Prior art keywords
data
graph
vertex
index
ceph
Prior art date
Legal status
Active
Application number
CN202010514803.4A
Other languages
Chinese (zh)
Other versions
CN111639082A (en)
Inventor
曹亮 (Cao Liang)
刘魁 (Liu Kui)
李超 (Li Chao)
Current Assignee
Chengdu University of Information Technology
Original Assignee
Chengdu University of Information Technology
Priority date
Filing date
Publication date
Application filed by Chengdu University of Information Technology filed Critical Chengdu University of Information Technology
Priority to CN202010514803.4A
Publication of CN111639082A
Application granted
Publication of CN111639082B
Status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2219 Large Object storage; Management thereof
    • G06F16/2228 Indexing structures
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The invention discloses an object storage management method and system for a billion-node-scale knowledge graph based on Ceph. The method comprises: constructing a graph storage architecture; acquiring entity data of a plurality of entities corresponding to a target service; generating the knowledge graph corresponding to the target service according to the entity data and storing it, with Ceph serving as the distributed resource store; adding an external index background mechanism; and using a distributed computing engine to decompose a large task into multiple subtasks, distribute the subtasks to different machines for execution, and aggregate the results after execution finishes, thereby providing large-scale data processing capacity that supports OLAP requirements so users can perform data analysis based on the knowledge graph. The invention also provides a corresponding object storage management system for the Ceph-based billion-node-scale knowledge graph. The scheme introduces a distributed resource manager, is scalable and highly available, can store and express massive knowledge, supports data volumes of billions of nodes, and is reliable, easy to use, and efficient.

Description

Object storage management method and system of billion-level node scale knowledge graph based on Ceph
Technical Field
The invention relates to the technical field of information processing, in particular to an object storage management method and system for a billion-level node scale knowledge graph based on Ceph.
Background
A knowledge graph describes knowledge resources and their carriers using visualization techniques, mining, analyzing, constructing, drawing, and displaying knowledge and the interrelations among knowledge resources and carriers. A knowledge graph can extract hidden knowledge from large-scale data to construct a graph-based data model. The ultimate purpose of the technology is to organize collected data into structured, reusable, rationally stored form for further use, and the storage format of the knowledge graph matches this requirement almost perfectly. A knowledge graph describes the entities or concepts existing in the real world and the associations among them: first, each entity is identified by a globally unique ID, analogous to each person having an identity-card number; second, attribute-value pairs characterize the intrinsic properties of entities, while relationships connect two entities and characterize the association between them.
The biggest defect of existing graph storage systems is that they are not truly distributed. In the big-data era, ever more data accumulates while the capacity of a single machine is limited, so data volumes beyond a single machine's bearing capacity are hard to process; the underlying storage falls far short of block-storage and object-storage approaches; graph query and graph analysis are inefficient; disaster tolerance and real-time performance are poor; and at the scale of hundreds of millions of nodes, dynamic capacity expansion is difficult and node-association queries are slow.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a method and system for object storage management of a billion-level node scale knowledge graph based on Ceph. The scheme can process knowledge-graph data at billion-node scale, supports large-scale graph data storage with elastic, linear expansion, offers high availability and fault tolerance, has OLTP and CRUD characteristics, and simultaneously supports OLAP data analysis and external indexing.
The purpose of the invention is realized by the following technical scheme:
the object storage management method of the billion-level node scale knowledge graph based on the Ceph comprises the following steps:
s1: constructing a graph storage architecture, acquiring entity data of a plurality of entities corresponding to a target service, generating a knowledge graph corresponding to the target service according to the entity data, storing the knowledge graph, taking Ceph as a distributed resource storage, constructing a Ceph cluster by using a small cluster consisting of a plurality of monitors by using a Client/Server architecture, and simultaneously storing graph data by using a plurality of OSD (on screen displays) under a single Monitor small cluster;
s2, constructing an external index background, namely mapping the knowledge map data into a fixed index data structure, using an elastic search/Solr retrieval engine as an external index plug-in to realize non-equivalent query, and simultaneously combining an efficient indexing mechanism to construct the external index background;
s3, constructing an integrated distributed computing engine framework, constructing a distributed computing engine by using a Spark computing engine framework, converting the graph relation into a Spark operator by using a graph X library, storing the graph data on the nodes of the Ceph cluster in a distributed manner by using RDD (resource description language) by using the graph X library, and respectively and correspondingly storing a vertex set and an edge set by using the vertex RDD and the edge RDD;
and S4, managing the graph storage architecture: on the basis of the graph storage architecture, the external index background, and the distributed computing engine, providing three-layer expansion query, data writing, data reading, cluster capacity expansion, metadata backup, metadata snapshot, online object analysis, and online analytical processing operations to realize management of the knowledge graph's data.
Specifically, the efficient indexing mechanism in step S2 includes a graph index and a vertex center index, where the graph index is a global index structure of the entire knowledge graph; the vertex center index is a local index structure built for each vertex.
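As an illustrative sketch (the class and method names below are hypothetical, not any real graph-database API), the two index types can be modeled as a global attribute-value to vertex-ID map for whole-graph equality retrieval, plus a sorted per-vertex list of (edge label, neighbor) pairs that supports leftmost matching:

```python
from collections import defaultdict
import bisect

class GraphIndexes:
    """Toy model of a global graph index plus per-vertex (vertex center) indexes."""

    def __init__(self):
        self.graph_index = defaultdict(set)    # (attribute, value) -> set of vertex IDs
        self.vertex_index = defaultdict(list)  # vertex ID -> sorted [(edge label, neighbor)]

    def add_vertex(self, vid, **attrs):
        for key, value in attrs.items():
            self.graph_index[(key, value)].add(vid)

    def add_edge(self, src, label, dst):
        bisect.insort(self.vertex_index[src], (label, dst))

    def find_vertices(self, attribute, value):
        # graph index: equality lookup over the entire knowledge graph
        return self.graph_index.get((attribute, value), set())

    def neighbors_by_label(self, vid, label):
        # vertex center index: leftmost match on the sorted (label, neighbor) list
        edges = self.vertex_index[vid]
        lo = bisect.bisect_left(edges, (label, ""))
        return [dst for (lbl, dst) in edges[lo:] if lbl == label]
```

The sorted per-vertex list is why only leftmost (prefix) matching is supported: the lookup key must start with the leading component of the stored sort order.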
Specifically, the step S3 further includes a partitioning operation, and specifically includes the following sub-steps:
s101, carrying out Hash partitioning on the vertex RDD according to the ID of the vertex, and distributing vertex data on a cluster in a multi-partition mode;
s102, partitioning the RDD according to a specified partition strategy, and distributing the edge data on the cluster in a multi-partition mode;
s103, storing a routing table for recording the relation between the vertex and all the side RDD partitions in the partitions of the vertex RDD, and when the side RDD needs vertex data, the vertex data is sent to the side RDD partitions by the vertex RDD according to the routing table.
Specifically, the data writing step in step S4 includes the following substeps:
s201, connecting a client to a Monitor, acquiring the Map information of the cluster, and requesting a corresponding main OSD data node;
s202, the main OSD data node writes the data of the other two replica nodes simultaneously, and waits for the main node and the other two replica nodes to finish the data writing state, and after the main node and the replica nodes are successful in writing state, a finishing signal is returned to the Client, and the data writing is finished.
Specifically, the cluster capacity expansion step in step S4 includes the following substeps:
s301, the Client connects the Monitor to obtain Map information of the cluster, the OSD1 of the new main node uploads a request to the Monitor, and the OSD2 node takes over the OSD1 node to become a temporary main node;
s302, the temporary main node OSD2 synchronizes the total data to the new main node OSD1, and the ClientIO read-write is directly connected with the temporary main node OSD2 for data read-write;
s303, the temporary main node OSD2 receives the read-write IO and writes the data in the other two copy nodes at the same time, and after the data in the temporary main node OSD2 and the data in the other two copy nodes are written successfully, a signal is returned to the Client, and the read-write of the Client IO is finished;
s304, if the OSD1 data of the nodes are synchronized, the temporary main node OSD2 uploads a request to the Monitor to give out the role of the main node, the OSD1 nodes become the main node again, and the OSD2 nodes become copy nodes;
and S305, at the same time, on the graph data level, after the node capacity expansion is realized, the graph data is cut according to a graph data cutting mode and is respectively stored on a plurality of machines.
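The role handover in S301–S304 can be reduced to a toy simulation (hypothetical names; this is not the Ceph recovery/backfill protocol, only the observable invariant that writes served by the temporary primary end up on the backfilled new primary):

```python
class ToyNode:
    """Minimal stand-in for an OSD node with a key-value object store."""
    def __init__(self, name, data=None):
        self.name = name
        self.data = dict(data or {})

def expand_cluster(osd2, writes, make_new_primary):
    """S301-S304 sketch: OSD2 serves IO as temporary primary while the new
    primary OSD1 is backfilled, then hands the primary role back."""
    osd1 = make_new_primary()      # S301: new, empty main node joins
    for oid, val in writes:        # S303: OSD2 keeps serving client writes
        osd2.data[oid] = val
    osd1.data.update(osd2.data)    # S302/S304: full-data sync to OSD1
    return osd1, osd2              # roles after: OSD1 main, OSD2 replica
```

After the handover both nodes hold identical data, including writes accepted during the expansion window.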
Specifically, the graph data cutting mode comprises two modes: point cutting and edge cutting. In the point-cut mode, the data is cut at the vertices of the graph: the cutting line passes through vertices, each edge is stored only once and appears on only one machine, and a vertex with many neighbor vertices may be distributed across several different machines. In the edge-cut mode, the data is cut at the graph edges: the cutting line passes only through edges connecting vertices, each vertex is stored only once, and the cut edges are distributed across several different machines.
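A minimal sketch of the two cutting modes (plain Python, with a simple modulo placement standing in for a real placement policy):

```python
def point_cut(edges, n_machines):
    """Point (vertex) cut: each edge is stored exactly once; endpoint
    vertices are replicated onto every machine holding one of their edges."""
    machines = [{"vertices": set(), "edges": []} for _ in range(n_machines)]
    for i, (src, dst) in enumerate(edges):
        m = machines[i % n_machines]   # place each edge on exactly one machine
        m["edges"].append((src, dst))
        m["vertices"].update((src, dst))
    return machines

def edge_cut(edges, n_machines):
    """Edge cut: each vertex is stored exactly once; an edge whose endpoints
    live on different machines is replicated onto both machines."""
    owner = lambda v: v % n_machines
    machines = [{"vertices": set(), "edges": []} for _ in range(n_machines)]
    for src, dst in edges:
        machines[owner(src)]["vertices"].add(src)
        machines[owner(dst)]["vertices"].add(dst)
        machines[owner(src)]["edges"].append((src, dst))
        if owner(dst) != owner(src):
            machines[owner(dst)]["edges"].append((src, dst))
    return machines
```

The trade-off the text describes falls out directly: point cut duplicates high-degree vertices but never edges, edge cut duplicates crossing edges but never vertices.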
Specifically, the metadata snapshot step in step S4 includes: effectively recovering a previous data state according to the metadata information, and restoring a program to a historical system state; saving system data at a specific time point and generating the system report for that time point; and exporting the snapshot data for offline work.
Specifically, the three-layer expansion query step in step S4 includes the following substeps:
S401, setting a user-given vertex set Vset as the basic data of the first-layer expansion query, setting the first-layer query filtering condition as a filtering condition ConditionA on vertex Label/vertex attributes, and performing the first-layer vertex expansion query;
S402, taking the vertex set satisfying the first-layer filtering condition as the basic data of the second-layer expansion query, setting the second-layer query filtering condition as a filtering condition ConditionB on edge Label/edge attributes, and performing the second-layer edge expansion query;
and S403, taking the edge set satisfying the second-layer filtering condition as the basic data of the third-layer expansion query, setting the attribute query condition, performing the third-layer attribute expansion query, and outputting the query result of the third-layer expansion query.
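The three-layer pipeline of S401–S403 can be sketched over a plain adjacency list (illustrative data model, not the system's storage format):

```python
def three_layer_query(vertices, adjacency, vset, condition_a, condition_b, condition_c):
    """vertices: {vid: attrs}; adjacency: {vid: [(edge_attrs, dst_vid)]}."""
    # S401: first layer, filter the seed vertex set by vertex ConditionA
    layer1 = [v for v in vset if condition_a(vertices[v])]
    # S402: second layer, expand along edges whose label/attributes pass ConditionB
    layer2 = [(src, edge, dst)
              for src in layer1
              for edge, dst in adjacency.get(src, [])
              if condition_b(edge)]
    # S403: third layer, check the target vertices' attributes and output
    return [(src, edge, dst) for src, edge, dst in layer2
            if condition_c(vertices[dst])]
```

The third layer is needed because the second-layer result carries only target-vertex IDs, so the target vertices' attribute conditions cannot be checked any earlier.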
The object storage management system of the Ceph-based billion-level node scale knowledge graph comprises a graph data storage module, a distributed computing module, an index module, and a metadata management module. The graph data storage module is used for storing the object data of the large-scale knowledge graph in a distributed mode and providing object storage, block device storage, and file system services;
the distributed computing module is used for decomposing a large task into a plurality of subtasks through Spark RDD in-memory computing, deploying the subtasks to different machines for execution, and summarizing the results after completion, so as to provide efficient large-scale data processing capacity that supports OLAP requirements and provides knowledge-graph-based data analysis for users;
the index module is used for mapping the knowledge graph data into a fixed index data structure and providing graph index, vertex center index, and external index functions for users;
the metadata management module is used for backup of metadata, snapshot of the metadata, program recovery, generation of a time point report and offline work of the system.
The invention has the beneficial effects that: the scheme adds a big-data distributed architecture and introduces a distributed resource manager; its main performance characteristics are scalability and high availability, embodied chiefly in the distributed cluster, the external index, data reliability, and the distributed resource manager. Meanwhile, the system can store and express massive knowledge, supports data volumes of billions of nodes, and is reliable, easy to use, and efficient.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is an overall distribution architecture diagram of the present invention.
FIG. 3 is a diagram of the distributed resource management architecture of the present invention.
FIG. 4 is a diagram of the integrated distributed computing engine architecture of the present invention.
FIG. 5 is a diagram of the external index plug architecture of the present invention.
Fig. 6 is a data writing flow chart of the present invention.
FIG. 7 is a flow chart of cluster capacity expansion of the present invention.
Fig. 8 is a functional block diagram of the system of the present invention.
Detailed Description
In order to more clearly understand the technical features, objects, and effects of the present invention, embodiments of the present invention will now be described with reference to the accompanying drawings.
In this embodiment, as shown in figs. 1-2, the object storage management method of the Ceph-based billion-level node scale knowledge graph includes the following steps:
step 1: and constructing a graph storage framework, acquiring entity data of a plurality of entities corresponding to the target service, and generating and storing a knowledge graph corresponding to the target service according to the entity data. As shown in fig. 3, for the overall distributed architecture, ceph is used as a distributed resource storage, a Client/Server architecture is adopted, a Ceph cluster is constructed by a small cluster composed of multiple monitors, and multiple OSDs are used for storing graph data under a single Monitor small cluster.
Step 2: first mapping the knowledge graph data into a fixed index data structure. In order to handle knowledge data at billion-node scale, as shown in fig. 5, an external index background mechanism is added, using the Elasticsearch/Solr search engine as an external index plug-in so that the index can also serve non-equality queries; combined with the efficient indexing mechanism, this constructs the external index background. The external index background and the index engine exchange data through an API.
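A hedged sketch (plain Python, not the Elasticsearch/Solr client API) of what "mapping graph data into a fixed index data structure" can look like: each vertex or edge is flattened into a fixed-schema document, over which a non-equality query such as a range scan can then run:

```python
def to_index_doc(element_id, kind, attrs):
    """Flatten a graph element into a fixed-schema index document."""
    doc = {"_id": element_id, "_kind": kind}   # fixed fields of the schema
    doc.update(attrs)                          # indexed attribute fields
    return doc

def range_query(docs, field, lo, hi):
    """A non-equality query the external index enables: lo <= field < hi."""
    return [d["_id"] for d in docs
            if field in d and lo <= d[field] < hi]
```

In a real deployment the documents would be pushed to Elasticsearch/Solr over its API and the range scan executed by the engine; the fixed schema is what makes that handoff possible.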
Step 3: constructing an integrated distributed computing engine framework: a distributed computing engine is built with the Spark computing engine framework; the GraphX library converts the graph relations into Spark operators and stores the graph data on the Ceph cluster nodes in a distributed manner as RDDs (Resilient Distributed Datasets), with the vertex RDD and the edge RDD respectively storing the vertex set and the edge set.
Step 4: managing the graph storage architecture: on the basis of the constructed graph storage architecture, external index background, and distributed computing engine, providing three-layer expansion query, data writing, data reading, cluster capacity expansion, metadata backup, metadata snapshot, online object analysis, and online analytical processing operations to realize management of the knowledge graph's data.
In this embodiment, the efficient indexing mechanism includes a graph index and a vertex center index. The graph index is a global index structure over the entire knowledge graph; by indexing the attributes of entities or edges it obtains better selectivity, speeding up graph traversal, and it performs equality retrieval through a fixed attribute combination composed of one attribute or a group of attributes. The vertex center index is a local index structure established for each vertex: in a large graph where a vertex may have thousands of edges or more, traversing a vertex requires filtering its edges and would otherwise be inefficient, so the vertex center index supports only leftmost matching.
For the index-based three-layer expansion query, first a user-given vertex set Vset is set as the basic data of the first-layer expansion query; the first-layer query filtering condition is set as the filtering condition ConditionA on vertex Label/vertex attributes, and the first-layer vertex expansion query is performed. Then the vertex set satisfying the first-layer filtering condition is taken as the basic data of the second-layer expansion query; the second-layer filtering condition is set as the filtering condition ConditionB on edge Label/edge attributes, and the second-layer edge expansion query is performed. This query returns only edges satisfying ConditionB, but the vertices attached to those edges carry only vertex IDs, without any attribute information, so whether they satisfy the attribute conditions is still undetermined. Finally, the edge set satisfying the second-layer filtering condition is taken as the basic data of the third-layer expansion query; the attribute query condition is set, the third-layer attribute expansion query is performed, and the query result is output. The efficient index thus takes effect.
In this embodiment, as shown in fig. 4, in order to support the OLAP requirement, a set of high-performance computing framework APIs is further extended and Spark is supported. The GraphX library converts the graph relations into Spark operators; GraphX stores graph data on the cluster nodes in a distributed manner as RDDs, using a vertex RDD (VertexRDD) and an edge RDD (EdgeRDD) to store the vertex set and the edge set. The vertex RDD distributes vertex data across the cluster in multiple partitions by hash-partitioning vertices by their IDs. The edge RDD is partitioned according to a specified partition strategy (PartitionStrategy), distributing edge data across the cluster in multiple partitions. In addition, the vertex RDD holds a routing table, which is the routing information from vertices to edge RDD partitions. The routing table lives in each partition of the vertex RDD and records the relation between its vertices and all edge RDD partitions. When an edge RDD partition needs vertex data, the vertex RDD sends the vertex data to that edge RDD partition according to the routing table. At this point, the graph data is stored as Spark RDDs.
At the Spark bottom layer, when an operator executes, a SparkContext is started. The SparkContext registers with the resource manager and applies to run Executor resources; the resource manager allocates the Executor resources and starts StandaloneExecutorBackend (task scheduling), and the Executors' running status is sent to the resource manager with the heartbeat. The SparkContext constructs a DAG graph, decomposes it into Stages, and sends the TaskSets to the TaskScheduler. The Executor applies to the SparkContext for Tasks, the TaskScheduler issues Tasks to the Executor to run, and the SparkContext issues the application code to the Executor; the Tasks run on the Executor, and all resources are released when the run finishes. In this way, efficient operations such as mapEdges, mapVertices, and aggregateMessages are achieved, responding quickly to data-analysis requirements.
In this embodiment, as shown in fig. 6, for data writing, a Client connects to the Monitor, obtains the cluster Map information, and requests the corresponding main OSD data node. The main OSD data node simultaneously writes the data to the other two replica nodes and waits for the write status of the main node and both replica nodes; after both report success, a completion signal is returned to the Client and the data write is finished. Data reading proceeds in the same way as data writing.
In this embodiment, as shown in fig. 7, for cluster capacity expansion, the Client connects to the Monitor to obtain the cluster Map information. Meanwhile, because the new main node OSD1 has no PG (Placement Group) data, it actively reports to the Monitor, and the OSD2 node is notified to take over temporarily as the main node. The temporary main node OSD2 synchronizes the full data to the new main node OSD1, while Client IO reads and writes connect directly to the temporary main node OSD2. The OSD2 node receives the read/write IO and simultaneously writes to the other two replica nodes; once all three copies are written successfully, a signal is returned to the Client and the Client IO read/write is finished. When the OSD1 node's data synchronization completes, the temporary main node OSD2 uploads a request to the Monitor to relinquish the main role; OSD1 becomes the main node and OSD2 becomes a replica node. Meanwhile, at the graph-data level, after capacity expansion the graph is cut; that is, the data is cut and stored on multiple machines. The first mode is the point cut, where the cutting line passes through a Vertex of the graph rather than an Edge: each edge is stored only once and appears on only one machine, while a vertex with many neighbors may be distributed across different machines. The second mode is the edge cut, where the cutting line passes only through the Edges connected to a vertex: each vertex is stored only once, and the cut edges may be stored on multiple machines. Cluster capacity expansion is then complete.
In this embodiment, an object storage management system of a Ceph-based billion-level node scale knowledge graph is also provided; the system includes a graph data storage module, a distributed computing module, an index module, and a metadata management module.
The graph data storage module is used for storing object data of the large-scale knowledge graph in a distributed mode and providing object storage, block device storage and file system services.
The distributed computing module is used for decomposing a large task into a plurality of subtasks through Spark RDD in-memory computing, deploying the subtasks to different machines for execution, and summarizing the results after completion, so as to provide efficient large-scale data processing capacity that supports OLAP requirements and provides knowledge-graph-based data analysis for users.
The index module is used for mapping the knowledge graph data into a fixed index data structure and providing graph index, vertex center index, and external index functions for users.
The metadata management module is used for backup of metadata, snapshot of the metadata, program recovery, generation of a time point report and offline work of the system.
In this embodiment, the whole scheme can be applied to an anti-fraud detection scenario. A heterogeneous network is constructed from user information, device information, and social relations, and the heterogeneous network graph is applied to user association analysis and anti-fraud detection. After data import, the number of nodes reaches the order of 1.1 billion and the relation data reaches the order of 50 billion, forming a complex heterogeneous network comprising 11 types of nodes and 13 types of edges. Suspicious users are screened by specific rules, and users having specific associations with them are checked; the network features and user features of the subnets formed by all users specifically associated with the suspicious users are examined; what associations can link a particular user together is analyzed; and data spanning up to 6 layers of association relations can be analyzed to complete a series of data-analysis tasks. In this 1.1-billion-node graph, the graph traversal and query response time of the scheme is 4 to 100 times faster than existing graph storage systems. The technical scheme is compared with existing graph storage solutions as follows:
TABLE 1: Data loading time
  The technical scheme   NEO4J-OFFLINE              NEO4J-CYPHER
  45,375 seconds         Not completed in 24 hours  Not completed in 24 hours
TABLE 2: Data storage size
  The technical scheme   NEO4J-OFFLINE   NEO4J-CYPHER
  609,375 MB             275,950 MB      1,276,175 MB
TABLE 3: Query performance
  The technical scheme   NEO4J-OFFLINE   NEO4J-CYPHER
  7.5 ms                 55.0 ms         34.1 ms
The foregoing shows and describes the basic principles and principal features of the present invention and its advantages. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above; the embodiments and the description only illustrate the principle of the invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, all of which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.

Claims (7)

1. The object storage management method of the knowledge graph with billion-level node scale based on Ceph is characterized by comprising the following steps:
s1: the method comprises the steps of constructing a graph storage architecture, obtaining entity data of a plurality of entities corresponding to a target service, generating a knowledge graph corresponding to the target service according to the entity data, storing the knowledge graph, taking Ceph as a distributed resource storage, constructing a Ceph cluster by using a small cluster composed of a plurality of monitors by using a Client and Server architecture, and simultaneously storing graph data by using a plurality of OSD under a single Monitor small cluster;
s2, constructing an external index background, namely mapping the knowledge map data into a fixed index data structure, using an elastic search engine and a Solr retrieval engine as external index plug-ins to realize non-equivalent query, and meanwhile, combining a high-efficiency index mechanism to construct the external index background; the efficient indexing mechanism comprises a graph index and a vertex center index, wherein the graph index is a global index structure of the whole knowledge graph; the vertex center index is a local index structure established for each vertex;
s3, constructing an integrated distributed computing engine framework, constructing a distributed computing engine by using a Spark computing engine framework, converting the graph relation into a Spark operator by using a graph X library, storing the graph data on the nodes of the Ceph cluster in a distributed manner by using RDD (resource description language) by using the graph X library, and respectively and correspondingly storing a vertex set and an edge set by using the vertex RDD and the edge RDD;
s4, managing a graph storage architecture, and providing three layers of expanded line query, data write-in, data reading, cluster expansion, metadata backup, metadata snapshot, online object analysis and online analysis processing operations to realize management of graph data of the knowledge graph on the basis of the graph storage architecture, an external index background and a distributed computation engine; the three-layer wire expansion inquiring step comprises the following substeps:
S401, taking a user-given vertex set Vset as the base data of the first-layer expansion query, setting the first-layer query filter to a condition ConditionA on vertex labels and vertex attributes, and performing the first-layer vertex expansion query;
S402, taking the vertex set satisfying the first-layer filter as the base data of the second-layer expansion query, setting the second-layer query filter to a condition ConditionB on edge labels and edge attributes, and performing the second-layer edge expansion query;
and S403, taking the edge set satisfying the second-layer filter as the base data of the third-layer expansion query, setting an attribute query condition, performing the third-layer attribute expansion query, and outputting the result of the third-layer expansion query.
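The three-layer expansion query of steps S401 to S403 can be read as a filter pipeline: vertices, then incident edges, then edge attributes. The following is a minimal in-memory sketch; the data layout, function names, and the concrete ConditionA/ConditionB predicates are illustrative assumptions, not the patented implementation:

```python
# Hypothetical sketch of the three-layer expansion query:
# layer 1 filters vertices, layer 2 filters incident edges,
# layer 3 projects edge attributes.

def expand_query(graph, vset, vertex_cond, edge_cond, attr_keys):
    # Layer 1 (S401): filter the given vertex set Vset by a
    # vertex label/attribute condition (ConditionA).
    layer1 = {vid for vid in vset if vertex_cond(graph['vertices'][vid])}
    # Layer 2 (S402): expand along edges leaving the surviving vertices,
    # keeping only edges that satisfy the edge condition (ConditionB).
    layer2 = [e for e in graph['edges']
              if e['src'] in layer1 and edge_cond(e)]
    # Layer 3 (S403): project the requested attributes from those edges.
    return [{k: e['props'].get(k) for k in attr_keys} for e in layer2]

g = {'vertices': {1: {'label': 'Person', 'props': {'age': 30}},
                  2: {'label': 'City', 'props': {}}},
     'edges': [{'src': 1, 'dst': 2, 'label': 'livesIn',
                'props': {'since': 2015}}]}
result = expand_query(g, {1, 2},
                      lambda v: v['label'] == 'Person',   # ConditionA
                      lambda e: e['label'] == 'livesIn',  # ConditionB
                      ['since'])
# result == [{'since': 2015}]
```

Each layer narrows the candidate set before the next layer runs, which is what keeps multi-hop expansion tractable at billion-node scale.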
2. The object storage management method of a Ceph-based billion-node-scale knowledge graph as claimed in claim 1, wherein step S3 further comprises a partitioning operation, specifically comprising the following sub-steps:
S101, hash-partitioning the vertex RDD by vertex ID, and distributing the vertex data over the cluster in multiple partitions;
S102, partitioning the edge RDD according to a specified partition strategy, and distributing the edge data over the cluster in multiple partitions;
S103, storing, in the partitions of the vertex RDD, a routing table recording the relation between the vertices and all edge RDD partitions; when an edge RDD needs vertex data, the vertex RDD sends the vertex data to the corresponding edge RDD partitions according to the routing table.
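Sub-steps S101 to S103 can be approximated with plain dictionaries in place of Spark RDDs. The partition count, the edge-partition strategy, and all names below are assumptions made for illustration only:

```python
# Toy model of claim 2: hash-partition vertices by ID (S101), partition
# edges by a chosen strategy (S102), and build a routing table that maps
# each vertex to every edge partition that references it (S103).

NUM_PARTS = 4

def vertex_partition(vid):
    # S101: hash partitioning of the vertex RDD by vertex ID
    return hash(vid) % NUM_PARTS

def edge_partition(src, dst):
    # S102: one possible strategy - hash on both endpoints
    return (hash(src) + hash(dst)) % NUM_PARTS

def build_routing_table(edges):
    # S103: for each vertex, record every edge partition that needs it,
    # so vertex data can later be shipped only where it is referenced.
    table = {}
    for src, dst in edges:
        p = edge_partition(src, dst)
        table.setdefault(src, set()).add(p)
        table.setdefault(dst, set()).add(p)
    return table

edges = [(1, 2), (1, 3), (2, 3)]
routing = build_routing_table(edges)
# routing[1] lists exactly the edge partitions holding edges incident to 1
```

In GraphX proper, this routing table is what lets a `VertexRDD` ship attributes to `EdgeRDD` partitions during joins instead of broadcasting every vertex everywhere.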
3. The object storage management method of a Ceph-based billion-node-scale knowledge graph as claimed in claim 1, wherein the data writing step of step S4 comprises the following sub-steps:
S201, the client connects to a Monitor, obtains the cluster Map information, and requests the corresponding primary OSD data node;
S202, the primary OSD data node simultaneously writes the data to the other two replica nodes and waits for the primary node and the two replica nodes to complete the write; after the primary node and the replica nodes have written successfully, a completion signal is returned to the client and the data write is finished.
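The write path of claim 3 (primary OSD replicating to two replicas before acknowledging the client) can be simulated in a few lines. This is an illustrative toy model, not Ceph's real RADOS API:

```python
# Simulated Ceph-style replicated write: the client sends the write to the
# primary OSD, which replicates to two replicas and signals completion only
# after every copy has acknowledged (strong write consistency).

class OSD:
    """Toy stand-in for a Ceph object storage daemon."""
    def __init__(self, name):
        self.name, self.store = name, {}
    def write(self, key, value):
        self.store[key] = value   # durably store the object (simulated)
        return True               # acknowledge

def client_write(cluster_map, key, value):
    # S201: the client has already fetched cluster_map from a Monitor
    # and located the primary OSD for the object.
    primary, replicas = cluster_map['primary'], cluster_map['replicas']
    # S202: primary writes locally and forwards to both replicas, then
    # returns the completion signal only after every copy has acked.
    acks = [primary.write(key, value)]
    acks += [r.write(key, value) for r in replicas]
    return all(acks)

osds = [OSD('osd.0'), OSD('osd.1'), OSD('osd.2')]
cluster_map = {'primary': osds[0], 'replicas': osds[1:]}
ok = client_write(cluster_map, 'obj1', b'payload')
```

The design choice the claim encodes is that the acknowledgement gates on all three copies, trading write latency for the guarantee that any surviving replica can serve a consistent read.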
4. The object storage management method of a Ceph-based billion-node-scale knowledge graph as claimed in claim 1, wherein the cluster expansion step of step S4 comprises the following sub-steps:
S301, the client connects to the Monitor to obtain the cluster Map information; the new primary node OSD1 sends a request to the Monitor, and the OSD2 node takes over from OSD1 as a temporary primary node;
S302, the temporary primary node OSD2 synchronizes the full data to the new primary node OSD1, while client I/O reads and writes connect directly to the temporary primary node OSD2;
S303, the temporary primary node OSD2 receives the read/write I/O and simultaneously writes the data to the other two replica nodes; after the data on the temporary primary node OSD2 and on the two replica nodes are written successfully, a signal is returned to the client and the client I/O completes;
S304, once the data on node OSD1 are synchronized, the temporary primary node OSD2 sends a request to the Monitor to relinquish the primary role; node OSD1 becomes the primary node again, and node OSD2 becomes a replica node;
and S305, meanwhile, at the graph data level, after node expansion the graph data are partitioned according to a graph cutting mode and stored on multiple machines respectively.
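The role handover in steps S301 to S304 can be captured as a small state machine: a temporary primary serves I/O while the new primary backfills, then yields once the data are in sync. The class below is a speculative sketch of that flow under stated assumptions (dictionaries as object stores, string role labels), not Ceph's actual peering logic:

```python
# Toy model of claim 4's expansion flow: OSD2 acts as temporary primary
# while the newly added OSD1 backfills; once OSD1 holds the same data,
# OSD2 relinquishes the primary role back to OSD1.

class PGState:
    """Illustrative placement-group state during cluster expansion."""
    def __init__(self, existing_data):
        self.osd1 = {}                    # new primary, still empty
        self.osd2 = dict(existing_data)   # holds the current data
        self.acting_primary = 'osd2'      # S301: OSD2 is temporary primary
    def backfill(self):
        # S302: temporary primary streams the full data set to OSD1
        self.osd1.update(self.osd2)
    def handover(self):
        # S304: only when OSD1 is fully synchronized does OSD2 yield
        if self.osd1 == self.osd2:
            self.acting_primary = 'osd1'

pg = PGState({'obj': b'x'})
pg.handover()   # no effect: OSD1 is not yet synchronized
pg.backfill()
pg.handover()   # OSD1 resumes the primary role
```

The point of the guard in `handover` is that the primary role never moves to a node that could serve stale reads.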
5. The object storage management method of the Ceph-based billion-node-scale knowledge graph according to claim 4, wherein the graph cutting modes comprise two modes, vertex cut and edge cut; in vertex-cut mode the graph data are cut at the vertices: the cut line passes through the vertices of the graph, each edge is stored only once and appears on only one machine, and vertices with many neighbor vertices are replicated across several different machines for storage; in edge-cut mode the graph data are cut at the edges: the cut line passes only through the edges connecting the vertices, each vertex is stored only once, and the cut edges are distributed across several different machines for storage.
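The two cutting modes of claim 5 can be contrasted with a short sketch. The assignment functions below (round-robin for edges, hashing for vertices) are illustrative assumptions; only the invariants match the claim: under vertex cut every edge is stored exactly once, under edge cut every vertex is stored exactly once. GraphX itself, for reference, uses vertex cut.

```python
def vertex_cut(edges, n_machines):
    # Vertex cut: every edge is assigned to exactly one machine, so
    # high-degree vertices implicitly appear on several machines.
    placement = {m: [] for m in range(n_machines)}
    for i, edge in enumerate(edges):
        placement[i % n_machines].append(edge)  # illustrative round-robin
    return placement

def edge_cut(vertices, edges, n_machines):
    # Edge cut: every vertex is assigned to exactly one machine; an edge
    # whose endpoints land on different machines is "cut" across them.
    home = {v: hash(v) % n_machines for v in vertices}
    cut = [(u, v) for (u, v) in edges if home[u] != home[v]]
    return home, cut

edges = [('a', 'b'), ('a', 'c'), ('b', 'c')]
vc = vertex_cut(edges, 2)                        # each edge stored once
home, cut = edge_cut(['a', 'b', 'c'], edges, 2)  # each vertex stored once
```

Vertex cut tends to win on power-law graphs such as knowledge graphs, because replicating a few hub vertices is cheaper than shipping messages over many cut edges.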
6. The object storage management method of a Ceph-based billion-node-scale knowledge graph according to claim 1, wherein the metadata snapshot step of step S4 comprises: effectively recovering a previous data state from the metadata information, and restoring the program to a historical operating state of the system; saving the system data of a specific point in time, and generating a system report for that point in time; and exporting the snapshot data for offline work.
7. An object storage management system of a Ceph-based billion-node-scale knowledge graph, implementing the object storage management method of a Ceph-based billion-node-scale knowledge graph according to any one of claims 1 to 6, comprising:
a graph data storage module for storing the object data of the large-scale knowledge graph in a distributed manner and providing object storage, block device storage and file system services;
a distributed computing module for decomposing a large task into multiple subtasks through Spark RDD in-memory computing, deploying the subtasks to different machines for execution, and aggregating the results after the subtasks complete, so as to provide efficient large-scale data processing capacity supporting OLAP (online analytical processing) requirements and knowledge-graph-based data analysis for users;
an index module for mapping the knowledge data into a fixed index data structure and providing graph index, vertex-centric index and external index functions to users;
and a metadata management module for metadata backup, metadata snapshot, program recovery, point-in-time report generation and offline work of the system.
CN202010514803.4A 2020-06-08 2020-06-08 Object storage management method and system of billion-level node scale knowledge graph based on Ceph Active CN111639082B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010514803.4A CN111639082B (en) 2020-06-08 2020-06-08 Object storage management method and system of billion-level node scale knowledge graph based on Ceph


Publications (2)

Publication Number Publication Date
CN111639082A CN111639082A (en) 2020-09-08
CN111639082B true CN111639082B (en) 2022-12-23

Family

ID=72329872

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010514803.4A Active CN111639082B (en) 2020-06-08 2020-06-08 Object storage management method and system of billion-level node scale knowledge graph based on Ceph

Country Status (1)

Country Link
CN (1) CN111639082B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112363979B (en) * 2020-09-18 2023-08-04 杭州欧若数网科技有限公司 Distributed index method and system based on graph database
CN112199048B (en) * 2020-10-20 2021-07-27 重庆紫光华山智安科技有限公司 Data reading method, system, device and medium
CN112632293B (en) * 2020-12-24 2024-03-26 北京百度网讯科技有限公司 Industry map construction method and device, electronic equipment and storage medium
CN112637067A (en) * 2020-12-28 2021-04-09 北京明略软件系统有限公司 Graph parallel computing system and method based on analog network broadcast
CN113778990A (en) * 2021-09-01 2021-12-10 百融至信(北京)征信有限公司 Method and system for constructing distributed graph database
CN115309947B (en) * 2022-08-15 2023-03-21 北京欧拉认知智能科技有限公司 Method and system for realizing online analysis engine based on graph

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9244976B1 (en) * 2010-12-16 2016-01-26 The George Washington University and Board of Regents Just-in-time analytics on large file systems and hidden databases
CN106372127A (en) * 2016-08-24 2017-02-01 云南大学 Spark-based diversity graph sorting method for large-scale graph data
CN107092639A (en) * 2017-02-23 2017-08-25 武汉智寻天下科技有限公司 A kind of search engine system
CN107247738A (en) * 2017-05-10 2017-10-13 浙江大学 A kind of extensive knowledge mapping semantic query method based on spark
CN107330125A (en) * 2017-07-20 2017-11-07 云南电网有限责任公司电力科学研究院 The unstructured distribution data integrated approach of magnanimity of knowledge based graphical spectrum technology
CN107329982A (en) * 2017-06-01 2017-11-07 华南理工大学 A kind of big data parallel calculating method stored based on distributed column and system
CN109582660A (en) * 2018-12-06 2019-04-05 深圳前海微众银行股份有限公司 Data consanguinity analysis method, apparatus, equipment, system and readable storage medium storing program for executing
CN109657072A (en) * 2018-12-13 2019-04-19 北京百分点信息科技有限公司 A kind of intelligent search WEB system and method applied to government's aid decision
CN110263225A (en) * 2019-05-07 2019-09-20 南京智慧图谱信息技术有限公司 Data load, the management, searching system of a kind of hundred billion grades of knowledge picture libraries
CN110377757A (en) * 2019-07-16 2019-10-25 北京海致星图科技有限公司 A kind of real time knowledge map construction system
CN110659292A (en) * 2019-09-21 2020-01-07 北京海致星图科技有限公司 Spark and Ignite-based distributed real-time graph construction and query method and system
CN110795417A (en) * 2019-10-30 2020-02-14 北京明略软件系统有限公司 System and method for storing knowledge graph
CN110888888A (en) * 2019-12-11 2020-03-17 北京明略软件系统有限公司 Personnel relationship analysis method and device, electronic equipment and storage medium


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Big data and stream processing platforms for Industry 4.0 requirements mapping for a predictive maintenance use case; Radhya Sahal et al.; Journal of Manufacturing Systems; Jan. 2020; vol. 54; pp. 138-151 *
Optimization and Implementation of Clustering Algorithms Based on Spark; Zhao Yuming et al.; Modern Electronics Technique; Apr. 15, 2020; vol. 43, no. 8; pp. 52-55, 59 *
A Survey of Knowledge Graph Data Management; Wang Xin et al.; Journal of Software; Apr. 19, 2019; vol. 30, no. 7; pp. 2139-2174 *

Also Published As

Publication number Publication date
CN111639082A (en) 2020-09-08

Similar Documents

Publication Publication Date Title
CN111639082B (en) Object storage management method and system of billion-level node scale knowledge graph based on Ceph
CN103116596B (en) System and method of performing snapshot isolation in distributed databases
JP4586019B2 (en) Parallel recovery with non-failing nodes
Ju et al. iGraph: an incremental data processing system for dynamic graph
CN107180113B (en) Big data retrieval platform
US20150032758A1 (en) High Performance Index Creation
CN111639114A (en) Distributed data fusion management system based on Internet of things platform
CN103793442A (en) Spatial data processing method and system
CN116662441A (en) Distributed data blood margin construction and display method
CN111930716A (en) Database capacity expansion method, device and system
CN117677943A (en) Data consistency mechanism for hybrid data processing
CN108228725A (en) GIS application systems based on distributed data base
CN111708894A (en) Knowledge graph creating method
CN114329096A (en) Method and system for processing native map database
Lwin et al. Non-redundant dynamic fragment allocation with horizontal partition in Distributed Database System
CN111177244A (en) Data association analysis method for multiple heterogeneous databases
CN111708895B (en) Knowledge graph system construction method and device
Yang From Google file system to omega: a decade of advancement in big data management at Google
CN114925075B (en) Real-time dynamic fusion method for multi-source time-space monitoring information
WO2010150750A1 (en) Database management device using key-value store with attributes, and key-value-store structure caching-device therefor
CN114443798A (en) Distributed management system and method for geographic information data
Saxena et al. Concepts of HBase archetypes in big data engineering
Vilaça et al. On the expressiveness and trade-offs of large scale tuple stores
CN112416944A (en) Method and equipment for synchronizing service data
CN113391916A (en) Organization architecture data processing method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant