CN107659626B

CN107659626B - Temporary metadata oriented separation storage method

Info

Publication number: CN107659626B
Application number: CN201710814016.XA
Authority: CN
Inventors: 陈榕; 陈海波; 臧斌宇; 管海兵
Original assignee: Shanghai Jiaotong University
Current assignee: Shanghai Jiaotong University
Priority date: 2017-09-11
Filing date: 2017-09-11
Publication date: 2020-09-15
Anticipated expiration: 2037-09-11
Also published as: CN107659626A

Abstract

The invention provides a temporary metadata oriented separation storage method, which comprises the following steps: step one: a data source sends a data stream, and a server receives the data stream and identifies the metadata corresponding to each piece of data; step two: the server distributes the stream data to the server on which it is to be stored; step three: the stream data that is to be stored locally is converted into a plurality of key-value pairs, the key-value pairs are inserted into a local key-value storage system, and information such as the memory position of each insertion is recorded; step four: the metadata and the key-value pair insertion information corresponding to each piece of data are combined and inserted into another local storage system that is friendly to garbage collection; step five: the metadata and the key-value pair insertion information are copied to a plurality of servers according to a certain strategy to serve as a cache. The invention avoids a large amount of data movement when metadata is deleted, thereby improving the working efficiency of the storage system.

Description

Temporary metadata oriented separation storage method

Technical Field

The invention relates to a separation storage method, in particular to a separation storage method facing temporary metadata.

Background

The stream processing model is increasingly important in big data applications. Popular big data processing platforms, including Spark and Flink, provide data stream processing functions. In this model, a plurality of data sources continuously generate data that forms a data stream, which is sent to a processing platform. Through the platform, the user can query the data generated in the most recent period of time in order to learn about the latest events (for example, querying the microblog posts that received the most likes in the last hour). Thus, all data and the time at which the data was generated (i.e., the metadata) need to be stored in the system until the data is too old to be used by any query. In actual use, it has been found that when data becomes too old, its metadata should be deleted to save space, but the data itself should remain in order to provide historical information for later queries. Because of this new property of stream processing, finding an efficient storage management method has become an important issue for improving the space and time efficiency of the system. An efficient method can quickly delete old metadata while preserving the overall performance of the query system.

RDF (Resource Description Framework) is a data format for representing graph structures. Each piece of data consists of a subject, a predicate and an object, and can be regarded as an edge from the subject to the object that carries the predicate as its label, so an RDF data set can be regarded as a directed graph. Owing to its strong descriptive capability, RDF is widely applied to resource modeling and data description in various fields. In recent years, data streams in RDF format have been used in a number of fields, including social network analytics and Internet of Things applications.

The key-value storage system is a widely used distributed storage method. When graph data is stored in a big data system, a vertex in the graph is usually used as the key and the corresponding edge set is used as the value. When existing systems process an RDF data stream, each edge and its time metadata are bound together for storage, which makes deleting old metadata very costly.
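As a rough illustration of the layout described above, the following Python sketch stores a small RDF graph with each subject vertex as the key and its outgoing edge set as the value. The sample triples and the `build_store` helper are hypothetical and are not taken from the invention.

```python
from collections import defaultdict

# Each RDF triple is (subject, predicate, object); the names are illustrative only.
triples = [
    ("alice", "follows", "bob"),
    ("alice", "likes", "post1"),
    ("bob", "follows", "carol"),
]


def build_store(triples):
    """Map each subject vertex to the list of its outgoing labelled edges."""
    store = defaultdict(list)
    for subject, predicate, obj in triples:
        # The edge runs from subject to obj and is labelled with the predicate.
        store[subject].append((predicate, obj))
    return dict(store)


store = build_store(triples)
```

If a timestamp were stored next to every edge in these value lists, removing stale timestamps would mean rewriting the lists themselves, which is the cost the background section refers to.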

Remote Direct Memory Access (RDMA) is a high-performance network communication technology, and can directly read and write the Memory of a Remote server without the participation of a target server CPU. Compared with traditional network communication, RDMA has the characteristics of low delay and high throughput rate. When the amount of data transferred is small, the network bandwidth occupancy is not high, and the delay of RDMA communication is kept at a stable and low level.

Therefore, how to design an efficient storage management method, which can not only delete old metadata instantly to save storage space, but also provide time information of data for query processing quickly, and make full use of new network communication technology and its characteristics, has become a technical problem to be solved by those skilled in the art.

Disclosure of Invention

Aiming at the defects in the prior art, the invention aims to provide a temporary metadata-oriented separation storage method, which can make full use of the characteristics of a high-performance network, separate and manage RDF stream data and RDF stream metadata, and avoid large data transfer caused by deleting the metadata, thereby improving the working efficiency of a storage system.

According to one aspect of the present invention, there is provided a method for split storage oriented to temporary metadata, comprising the steps of:

step one: the data source sends data streams, and the server receives the data streams and identifies the metadata corresponding to each piece of data;

step two: the server distributes the streaming data to the corresponding server to be stored;

step three: converting the streaming data which needs to be stored in the local into a plurality of key value pairs, inserting the key value pairs into a local key value pair storage system, and recording information such as the inserted memory position;

step four: combining metadata and key value pair insertion information corresponding to each piece of data, and inserting the metadata and the key value pair insertion information into another garbage recycling-friendly local storage system;

step five: copying the metadata and the key value pair insertion information to a plurality of servers according to a certain strategy to serve as a cache.

Preferably, the first step comprises the steps of: the data source selects a server to send the data stream, and the server listens to the data stream and converts the data stream into RDF format graph data and related metadata which can be recognized by the system.

Preferably, the second step comprises the steps of: after the server identifies the RDF graph data, judging whether the RDF graph data should be stored locally according to a graph partitioning algorithm used by the system; if not, the data is forwarded to the corresponding server, and the process is ended.

Preferably, the third step comprises the following steps:

step thirty-one: converting the RDF graph data into a plurality of key value pairs, and inserting the key value pairs into the underlying key value pair storage system;

step thirty-two: judging whether to modify the key value pair representing the index according to the insertion condition of the thirty-one step;

when the key A is not found in step thirty-one, this combination of subject and predicate appears for the first time in the whole RDF graph, and the predicate-subject index is modified at this moment;

when the key A is found in step thirty-one, this combination of subject and predicate has already appeared in the whole RDF graph and is already recorded in the current predicate-subject index, so the index does not need to be modified;

step thirty three: all modifications to the key value storage system in the thirty-one and thirty-two steps are to add an element to the value of a certain key, and the added element has an address in the storage system; for each piece of stream data, its corresponding temporary data, including time metadata and key-value pair insertion information, is packaged and provided as input to step four.

Preferably, the fourth step comprises the steps of: inserting temporary data corresponding to each piece of data into another local storage system, namely a 'separation storage method'; a storage system for storing temporary data uses a circular linked list data structure that is friendly to garbage collection.

Preferably, the step five comprises the following steps:

step fifty-one: checking each server of the data center; judging, according to the query information registered on the target server and the data source of the stream data, whether the target server needs to query the data source;

step fifty-two: if the target server needs to query the data source, all the information generated for the data in step thirty-three is sent to the storage space that the target server has reserved for the data source, where it serves as a cache for the query threads on that server;

step fifty-three: when the target server processes a query, it accesses the space reserved for the metadata to complete the query; after the query has been executed, temporary data that is too old is deleted to save space.

Compared with the prior art, the invention has the following beneficial effects:

First, the temporary metadata oriented separation storage method stores data and metadata in different subsystems, which keeps the two kinds of data from interfering with each other and allows each subsystem to be optimized for the characteristics of the data it holds. Compared with the traditional hybrid management method, the method avoids the mass data movement caused by deleting temporary data, and therefore has a great performance advantage.

Secondly, the metadata storage format designed by the invention fully considers the characteristics of a high performance network (RDMA), and can enable each server to use the locally stored temporary data to reduce the RDMA network communication times during query processing. Since network communication delay usually accounts for a large part of the total query processing delay, the design can well reduce the total query delay.

Thirdly, the metadata storage format designed by the invention is organized in time order, so that the server can use the circular linked list data structure to quickly delete old temporary data, which saves storage space.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:

FIG. 1 is a flow chart of the temporary metadata oriented separation storage method of the present invention.

FIG. 2 is a schematic diagram of the composition of an embodiment that uses the temporary metadata oriented separation storage method of the present invention.

Detailed Description

The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications can be made by persons skilled in the art without departing from the spirit of the invention. All such variations and modifications fall within the scope of the present invention.

The invention relates to a temporary metadata oriented separation storage method, which comprises the following steps:

step one: initial data is loaded when the system is started, and this data is stored in a distributed manner across all servers; the data source then sends data streams, and the server receives the data streams, identifies the metadata corresponding to each piece of data, and distinguishes the data from its time metadata;

The first step comprises the following steps:

step eleven: the data source selects a server and sends the data stream over the network; network technologies such as TCP (Transmission Control Protocol) and RDMA (Remote Direct Memory Access) can be used to ensure reliable delivery of the data;

step twelve: the server receives the data stream and converts the received data into RDF format graph data and related metadata (such as time information) which can be recognized by the system;

step thirteen: verifying that the RDF data is well-formed according to the relevant RDF ontology specification;

step fourteen: ensuring that the time information of the data increases monotonically in the order of the data stream, adjusting timestamps where necessary;
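Step fourteen does not say how a timestamp is adjusted. One simple possibility, shown here purely as an assumption, is to clamp any late-arriving timestamp to the latest timestamp already seen on the stream. The function name `enforce_monotonic` is hypothetical.

```python
def enforce_monotonic(timestamps):
    """Return a copy in which no timestamp is smaller than its predecessor.

    Assumption: an out-of-order timestamp is raised to the latest value seen
    so far. The patent only states that timestamps "can be adjusted".
    """
    adjusted = []
    latest = None
    for ts in timestamps:
        if latest is not None and ts < latest:
            ts = latest  # clamp a late arrival to the latest time seen so far
        adjusted.append(ts)
        latest = ts
    return adjusted
```

For example, `enforce_monotonic([1, 3, 2, 5, 4])` yields `[1, 3, 3, 5, 5]`, so every later element carries a timestamp at least as large as the one before it.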

In step one, each data stream is handled by exactly one server. The advantage is that the sequential semantics of a data stream can easily be guaranteed in a distributed environment, i.e., the elements of one data stream are processed in timestamp order. Meanwhile, different data streams are registered on different servers, which relieves the processing pressure on any single server.

The second step comprises the following steps:

Step twenty-one: after the stream data has been converted into the standard RDF format, the server determines the target storage server for the data from a hash value of the data; the hash value of the subject in the RDF data can be used for partitioning, which improves the spatial locality of query operations.

Step twenty-two: if the target storage server is the local server, the data is passed to the corresponding local processing module for insertion, and the method proceeds to step three.

Step twenty-three: if the target storage server is another server, the data is sent to that server over the network, and the process ends.

Partitioning the data with a hash algorithm has the advantage that the data is spread more evenly over the servers; meanwhile, because the hash value of the key information is used, data with the same key information is assigned to the same server, which improves data locality during queries.
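The routing decision of step two can be sketched as follows. The patent does not name a hash function; SHA-256 is used here only because it gives the same result on every server, and the names `target_server` and `route` are hypothetical.

```python
import hashlib


def target_server(subject: str, num_servers: int) -> int:
    """Pick the storage server for a triple from a hash of its subject."""
    digest = hashlib.sha256(subject.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer and reduce it modulo the
    # number of servers; triples with the same subject always land together.
    return int.from_bytes(digest[:8], "big") % num_servers


def route(triple, local_id: int, num_servers: int):
    """Return ("store_locally", owner) or ("forward", owner) for a triple."""
    owner = target_server(triple[0], num_servers)
    if owner == local_id:
        return ("store_locally", owner)  # step twenty-two
    return ("forward", owner)            # step twenty-three
```

Because the owner depends only on the subject, all edges leaving one vertex are stored on the same server, which is the locality property the text relies on.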

The third step comprises the following steps:

Step thirty-one: the RDF graph data is converted into a plurality of key value pairs, which are inserted into the underlying key value pair storage system. Specifically, for a key value pair to be inserted, let the key be A and the value be B. The key A is searched for in the storage system. If A is not found, the key A is inserted with B as its value; if A is found and its current value is C, then C and B are combined to form the new value for A.

Step thirty-two: whether to modify the key value pairs that represent indexes is decided according to the outcome of the insertion in step thirty-one. Here, the predicate-subject index used in the graph query system Wukong is taken as an example; for a given predicate, this index records all subjects in the whole RDF graph that have that predicate.

When the key A is not found in step thirty-one, this combination of subject and predicate appears for the first time in the whole RDF graph, and the predicate-subject index is modified. Specifically, the predicate is used as the key, its corresponding value is looked up, and the subject is added to that value to form the new value.

When the key A is found in step thirty-one, this combination of subject and predicate has already appeared in the whole RDF graph and is already recorded in the current predicate-subject index, so the index does not need to be modified.

Step thirty-three: every modification made to the key value pair storage system in steps thirty-one and thirty-two appends an element to the value of some key, and the appended element has an address in the storage system. For each piece of stream data, its corresponding temporary data, including the time metadata and the key-value pair insertion information, is packaged and provided as input to step four.
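Step three as a whole can be sketched as a single function. This is a minimal illustration under several assumptions: the store is a plain Python dict, the key A is taken to be the (subject, predicate) pair, the index is kept under a separate key in the same store, and an "address" is simply the position of the appended element in its value list. The name `insert_triple` and the key tags `"edge"` and `"index"` are hypothetical.

```python
def insert_triple(store, triple, timestamp):
    """Insert one RDF triple and return its temporary data for step four."""
    subject, predicate, obj = triple
    edge_key = ("edge", subject, predicate)   # key A in step thirty-one
    index_key = ("index", predicate)          # predicate-subject index key
    insertions = []

    # Step thirty-one: append the object to the value of key A,
    # creating the key if it does not exist yet.
    key_was_new = edge_key not in store
    store.setdefault(edge_key, []).append(obj)
    insertions.append((edge_key, len(store[edge_key]) - 1))

    # Step thirty-two: touch the index only when the subject and predicate
    # combination appears for the first time.
    if key_was_new:
        store.setdefault(index_key, []).append(subject)
        insertions.append((index_key, len(store[index_key]) - 1))

    # Step thirty-three: package time metadata with the insertion positions.
    return {"timestamp": timestamp, "insertions": insertions}
```

Note that the graph store itself never holds a timestamp; only the returned record does, which is what allows the metadata to be discarded later without touching the data.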

The fourth step comprises the following steps: the temporary data corresponding to each piece of data is inserted into another local storage system, which is the "separation storage" that gives the method its name; the storage system for temporary data may use a circular linked list data structure that is friendly to garbage collection. Specifically, the temporary data is stored in timestamp order, and each element of the circular linked list stores the temporary data that shares one timestamp. Temporary data whose timestamp is old and which is no longer used by any query can therefore have its memory reclaimed through the overwrite operation of the circular linked list. Compared with a traditional garbage collection mechanism, the method used in step four is designed specifically for the data stream processing scenario: since data stream queries usually need only the most recently generated data, the length of the circular linked list can be set small, so that useless metadata is reclaimed more promptly. Meanwhile, the method does not affect the insertion and query performance of the data.
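The overwrite-based reclamation of step four can be sketched with a fixed-size ring. This is an illustration only: a Python list stands in for the circular linked list, timestamps are assumed to be non-negative integers that never decrease (as step fourteen ensures), and the class name `TemporaryDataRing` is hypothetical.

```python
class TemporaryDataRing:
    """Fixed-size ring in which each slot holds the records of one timestamp."""

    def __init__(self, num_slots):
        # Each slot is either None or a (timestamp, records) pair.
        self.slots = [None] * num_slots

    def insert(self, timestamp, record):
        i = timestamp % len(self.slots)
        slot = self.slots[i]
        if slot is None or slot[0] != timestamp:
            # Overwriting the slot reclaims the older timestamp stored here.
            slot = (timestamp, [])
            self.slots[i] = slot
        slot[1].append(record)

    def get(self, timestamp):
        slot = self.slots[timestamp % len(self.slots)]
        if slot is not None and slot[0] == timestamp:
            return slot[1]
        return []  # never written, or already overwritten
```

With three slots, inserting a record for timestamp 3 reuses the slot of timestamp 0, so the old temporary data disappears without any separate deletion pass. A shorter ring reclaims metadata sooner, at the cost of a shorter queryable window.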

The fifth step comprises the following steps:

Step fifty-one: each server of the data center is checked. Whether the target server needs to query the data source is judged according to the query information registered on the target server and the data source of the stream data.

Step fifty-two: if the target server needs to query the data source, all the information generated for the data in step thirty-three is sent to the storage space that the target server has reserved for the data source, where it serves as a cache for the query threads on that server.

Step fifty-three: when the target server processes a query, it accesses the space reserved for the metadata to complete the query. After the query has been executed, temporary data that is too old is deleted to save space.

Step five plays a role of software caching, and the execution time of the query can be effectively reduced. It features copying metadata, rather than data, as a cache to multiple machines. The advantage is that the memory size occupied by the metadata is significantly smaller than the data size when the data stream flow rate is fast. Meanwhile, the data structure used in this step is similar to that described in step four, so that the obsolete metadata can be conveniently recycled.
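The selective replication of step five might look like the sketch below. The server representation (a dict with `name`, `registered_streams` and `cache` fields) and both function names are assumptions made for illustration; in the described system the copy would travel over the network rather than being appended to a local list.

```python
def replicate_metadata(servers, stream_id, temp_data):
    """Copy temp_data to every server that has a query registered on stream_id."""
    recipients = []
    for server in servers:
        # Steps fifty-one and fifty-two: only servers that query this stream
        # receive a copy, placed in the space reserved for that stream.
        if stream_id in server["registered_streams"]:
            server["cache"].setdefault(stream_id, []).append(temp_data)
            recipients.append(server["name"])
    return recipients


def expire(server, stream_id, oldest_needed):
    """Step fifty-three: drop cached temporary data older than any query needs."""
    cached = server["cache"].get(stream_id, [])
    server["cache"][stream_id] = [
        d for d in cached if d["timestamp"] >= oldest_needed
    ]
```

Only the small metadata records are copied; the graph data stays on its owning server.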

FIG. 2 is a schematic composition diagram of an embodiment of the present invention. As shown in FIG. 2, this embodiment mainly includes four software abstraction levels and seven main functional modules. Among them, the functional modules closely related to the present invention are a data stream receiving module, a graph storage module and a query execution module.

The data stream receiving module is responsible for converting the system input into a standardized RDF format and generating the corresponding metadata. The graph storage module is responsible for organizing the key value pair storage system according to the content of the invention, namely storing metadata separately, thereby providing efficient graph data query and reclamation. The query execution module is responsible for ensuring query consistency while improving query execution efficiency by means of the cache mechanism of step five.

Further specifically, the metadata distribution and query processing of this step takes advantage of the properties of high performance networking (RDMA). For a key-value pair in a storage system, typically only a small portion of the value needs to maintain metadata, and the remaining portion of the metadata is deleted because it is too old, so the metadata is relatively small. One remarkable characteristic of the RDMA communication mode is as follows: when the size of the transmitted data is small (e.g., less than 2000 bytes), the delay of the transmission remains low and substantially constant. By utilizing the above-mentioned characteristics of RDMA to deliver metadata, higher transmission efficiency and lower transmission delay can be achieved. In terms of query processing, since the amount of data involved in streaming queries is relatively small, processing is amenable to reading data directly from the target server to local using RDMA communication. The metadata information designed by the invention records the storage position of the data on the corresponding server, and can acquire the data through one-time communication, thereby reducing the network communication times required during query processing and improving the query processing performance.
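The effect of recording storage positions in the metadata can be illustrated with a small simulation. No real RDMA library is used here: `RemoteMemory` is a hypothetical stand-in whose `read` method counts how many one-sided reads are issued, and the cached metadata is assumed to be a list of (timestamp, offset) pairs for one key whose value grows only by appending.

```python
class RemoteMemory:
    """Stand-in for the value of one key on a remote server."""

    def __init__(self, values):
        self.values = values
        self.reads = 0  # number of simulated one-sided reads

    def read(self, offset, length):
        self.reads += 1
        return self.values[offset:offset + length]


def fetch_recent(remote, cached_metadata, start, end):
    """Fetch the elements inserted in [start, end] with a single read.

    Because elements are only ever appended in timestamp order, the elements
    of a time window occupy one contiguous range of offsets.
    """
    offsets = [off for ts, off in cached_metadata if start <= ts <= end]
    if not offsets:
        return []  # nothing in the window, so no communication is needed
    lo, hi = min(offsets), max(offsets)
    return remote.read(lo, hi - lo + 1)
```

Without the cached offsets, the querying server would first have to ask the remote server where the recent elements are, which costs at least one additional round trip.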

The storage management method in the invention is based on the separation of data and metadata. The storage structure used for the data does not support delete operations, which allows faster queries and insertions. Metadata is stored in a circular linked list structure called a metadata record table, with one metadata record table for each data stream. The metadata record table is organized in time order: a new element is inserted at the tail of the linked list, and stale metadata is deleted by removing elements from the head. For ease of lookup, each element in the linked list is a small hash table that records, for one basic unit of time, the locations in the storage system at which each key-value pair was modified. When the range of a query spans a plurality of basic time units, a plurality of hash tables need to be queried.
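A metadata record table along these lines can be sketched as follows. This is an assumption-laden illustration: `collections.deque` stands in for the circular linked list, a Python dict plays the role of each small hash table, time units are integers, and the class and method names are hypothetical.

```python
from collections import deque


class MetadataRecordTable:
    """Per-stream table: one small hash table per basic unit of time."""

    def __init__(self):
        self.entries = deque()  # (time_unit, {key: [positions]}), oldest first

    def record(self, time_unit, key, position):
        # New elements go to the tail; time units are assumed non-decreasing.
        if not self.entries or self.entries[-1][0] != time_unit:
            self.entries.append((time_unit, {}))
        self.entries[-1][1].setdefault(key, []).append(position)

    def delete_older_than(self, oldest_needed):
        # Stale metadata is dropped from the head, one whole time unit at a time.
        while self.entries and self.entries[0][0] < oldest_needed:
            self.entries.popleft()

    def lookup(self, key, start, end):
        # A range spanning several time units consults several hash tables.
        found = []
        for time_unit, table in self.entries:
            if start <= time_unit <= end:
                found.extend(table.get(key, []))
        return found
```

Deleting a time unit removes one element from the head and never touches the graph data, so queries on the data are not blocked while metadata is reclaimed.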

Compared with the prior art, the invention can identify and delete old metadata in time according to the query time ranges of the registered queries. Meanwhile, the characteristics of a high performance network (RDMA) are fully utilized, and the performance of the query system is improved. Compared with the traditional mixed storage management method, the management complexity caused by storing the data and the metadata together is avoided, so that the use efficiency of the storage system can be greatly improved.

The invention adopts a separation storage method facing temporary metadata instead of the traditional mixed storage management method, and the main reason is that the traditional storage management method usually causes larger extra cost when the storage space is recycled. The conventional hybrid storage management method has the following problems:

(1) it is difficult to delete the stale metadata and recycle the corresponding storage space. Mixed storage of data and metadata can cause a large amount of data migration when metadata is deleted, and queries on the key-value pairs during migration can be blocked;

(2) it is difficult to query efficiently for data generated in a certain time period. Because data and metadata are stored together, the data belonging to a given time interval is hard to locate, and the process may require multiple RDMA network communications, thereby increasing the delay of query processing.

Compared with the traditional method, the method for separating and storing temporary metadata has the following advantages that:

First, the data movement caused by the traditional method is effectively avoided: by managing data and metadata separately, the space occupied by metadata is reclaimed efficiently, and the reclamation has no effect on queries;

Second, the cost of distributing metadata is reduced by making efficient use of the characteristics of stream data and of RDMA networks, which effectively saves network bandwidth; at the same time, the metadata is represented with an RDMA-friendly data structure, which improves the performance of query processing.

In summary, the temporary metadata-oriented separation storage method provided by the present invention can identify the stale metadata in real time according to the query time range of the registered query, and delete the stale metadata with a small overhead. Meanwhile, the characteristics of a high performance network (RDMA) are fully utilized, and the performance of the query system is improved. Finally, the invention can avoid the management complexity caused by uniformly storing the data and the metadata, and can improve the use efficiency of the storage space to the maximum extent.

The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.

Claims

1. A method for separating and storing temporary metadata is characterized by comprising the following steps:

step five: copying the metadata and the key value pair insertion information to a plurality of servers according to a certain strategy to serve as cache; the third step comprises the following steps:

when the key A is not found in step thirty-two, the matching of the subject and the predicate appears for the first time in the whole RDF graph, and the predicate-subject index is modified at the moment;

when the key A is found in the step thirty-two, the matching of the subject and the predicate is known to appear in the whole RDF graph through the current predicate-subject index, and the index does not need to be modified at this time;

2. The method for separate storage oriented to temporary metadata according to claim 1, wherein said step one comprises the steps of: the data source selects a server to send the data stream, and the server listens to the data stream and converts the data stream into RDF format graph data and related metadata which can be recognized by the system.

3. The method for separate storage oriented to temporary metadata according to claim 1, wherein said step two comprises the steps of: after the server identifies the RDF graph data, judging whether the RDF graph data should be stored locally according to a graph partitioning algorithm used by the system; if not, the data is forwarded to the corresponding server, and the process is ended.

4. The method for separate storage oriented to temporary metadata according to claim 1, wherein said step four comprises the steps of: inserting temporary data corresponding to each piece of data into another local storage system, namely a 'separation storage method'; a storage system for storing temporary data uses a circular linked list data structure that is friendly to garbage collection.

5. The method for separate storage oriented to temporary metadata according to claim 1, wherein said step five comprises the steps of: