CN108984617A

CN108984617A - A method for implementing a metadata directory structure for a memory cloud

Info

Publication number: CN108984617A
Application number: CN201810604826.7A
Authority: CN
Inventors: 侯迪; 侯智琦; 齐勇; 王培健; 赵文嘉
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2018-06-13
Filing date: 2018-06-13
Publication date: 2018-12-11

Abstract

The invention provides a method for implementing a metadata directory structure for a memory cloud. The method uses a directory tree and separates the directory structure from the content of the directory nodes, which makes node content easier to change. Each directory node carries an additional pid field; the pid field of a child node points to the fileid field of its parent node, and a secondary index is added on the pid field. When a user searches the directory, the pid secondary index, the Key-Value model, and in-memory storage together improve directory retrieval efficiency, and mount efficiency is also greatly improved.

Description

A method for implementing a metadata directory structure for a memory cloud

Technical field

The present invention relates to the field of computing, and in particular to a method for implementing a metadata directory structure for a memory cloud.

Background art

The memory cloud (RAMCloud) is a new Key-Value storage system for data centers. It is a large-scale system built from the main memory of thousands of commodity servers. At all times, all information is kept in fast DRAM (dynamic random-access memory, commonly just called memory); memory replaces the hard disk of traditional systems, and the hard disk is used only for backup. A memory cloud cluster consists mainly of master servers (Master) and backup servers (Backup): a Master stores data and performs computation, while a Backup is used for fast recovery after the system restarts. The cluster also contains a coordinator (Coordinator), whose role is similar to that of the NameNode in the Hadoop Distributed File System; it is responsible for managing configuration information and related functions.

RAMCloud stores all data in DRAM, and its performance can be 100 to 1000 times higher than that of the best-performing disk-based storage systems today. In terms of access latency, under the RAMCloud design a process running on an application server needs only 5 to 10 μs to read a few hundred bytes over the network from a storage server in the same data center, whereas current systems typically take 0.5 to 10 ms, depending on whether the data is in the server's memory cache or on disk. Moreover, a multi-core storage server can serve at least 1,000,000 read requests per second, while the same machine in a disk-based system can serve only 1,000 to 10,000 requests per second. The memory cloud is therefore expected to be an important breakthrough in future storage.
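The ratios implied by these figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the paragraph above; the variable names are illustrative and not part of the patent.

```python
# Figures quoted in the background section, with latencies in microseconds
# so that the arithmetic stays in integers.
ramcloud_latency_us = (5, 10)        # 5 to 10 microseconds per small read
disk_latency_us = (500, 10_000)      # 0.5 to 10 milliseconds per small read

ramcloud_reads_per_s = 1_000_000     # at least one million reads per second
disk_reads_per_s = (1_000, 10_000)   # 1,000 to 10,000 requests per second

# Smallest and largest latency advantage implied by the two ranges.
latency_gain_min = disk_latency_us[0] // ramcloud_latency_us[1]
latency_gain_max = disk_latency_us[1] // ramcloud_latency_us[0]

# Smallest and largest throughput advantage.
throughput_gain_min = ramcloud_reads_per_s // disk_reads_per_s[1]
throughput_gain_max = ramcloud_reads_per_s // disk_reads_per_s[0]

print(latency_gain_min, latency_gain_max)
print(throughput_gain_min, throughput_gain_max)
```

The throughput ratio falls in the 100 to 1000 range stated in the text, while the latency ratio spans a somewhat wider range because both latency figures are themselves intervals.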

Traditional high-energy physics metadata is kept on disk, with the directory structure and metadata information stored in a MySQL database. This is adequate for retrieving the metadata directory structure generated by terabyte-scale data. At exabyte scale, however, the directory structure becomes much deeper, and retrieval performance clearly cannot meet applications' requirements for access to the metadata directory.

Summary of the invention

The object of the invention is to overcome the shortcomings of the prior art by providing a method for implementing a metadata directory structure for a memory cloud, in which the directory structure is separated from the directory node content, thereby solving the problem that existing memory cloud metadata directory structures are deep and retrieval performance is low.

To achieve the above object, the invention adopts the following scheme:

A method for implementing a metadata directory structure for a memory cloud, comprising the following steps:

1) Using the RAMCloud in-memory database, define a directory structure tree. Each node defines a fileid field and a pid field, data fields are assigned as needed, and the pid field of a child node points to the fileid field of its parent node;

2) Define a TABLE_INDEX table and a TABLE_DATA table to separate the directory structure from the directory information content. The TABLE_INDEX table stores the directory structure, with pid and name as the key and fileid as the value. The TABLE_DATA table stores node content, with fileid as the key and the node's other information packed into a unified struct as the value;

3) Add a secondary index mechanism: define the memory cloud index type IndexKey::IndexKeyRange, in which a compound key is formed from pid+name and from pid alone, and the compound value is the fileid together with a query result that is a set of fileids; define the range lookup type IndexLookup, and search subdirectories by chained IndexLookup lookups;

4) The key of the TABLE_DATA table is thus the value of the TABLE_INDEX table. When a user searches for a file, the directory structure is searched first: the directory containing the file is located and the fileid of the target node is obtained; the file content is then fetched from the TABLE_DATA table using that fileid as the key.
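The two-table layout and the lookup in step 4 can be illustrated with a minimal in-memory sketch. It uses plain Python dictionaries in place of RAMCloud tables. The names `NodeData`, `add_node`, `resolve`, and `stat` are hypothetical, and fileid 1 for the root is an assumption taken from the embodiment; none of this is the RAMCloud API.

```python
from dataclasses import dataclass

ROOT_FILEID = 1  # assumption: the root directory has fileid 1 and pid 0


@dataclass
class NodeData:
    """Value stored in TABLE_DATA: the node's remaining metadata in one record."""
    name: str
    nlink: int = 1
    atime: int = 0


# TABLE_INDEX: (pid, name) -> fileid   (directory structure only)
table_index = {}
# TABLE_DATA: fileid -> NodeData       (node content only)
table_data = {ROOT_FILEID: NodeData(name="/")}


def add_node(pid, name, fileid):
    """Register a child of `pid` in both tables."""
    if (pid, name) in table_index or fileid in table_data:
        raise ValueError("key and fileid must both be unique")
    table_index[(pid, name)] = fileid
    table_data[fileid] = NodeData(name=name)


def resolve(path):
    """Walk TABLE_INDEX one path component at a time and return the fileid."""
    fileid = ROOT_FILEID
    for name in filter(None, path.split("/")):
        fileid = table_index[(fileid, name)]  # KeyError if the path is missing
    return fileid


def stat(path):
    """Step 4: resolve the path, then use the fileid as the TABLE_DATA key."""
    return table_data[resolve(path)]


add_node(pid=1, name="B", fileid=3)
add_node(pid=3, name="D", fileid=5)
print(resolve("/B/D"), stat("/B/D").name)
```

Because node content lives only in `table_data`, fields can be added to `NodeData` without touching the structure table, which is the flexibility the separation is meant to provide.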

The method of the invention uses the Key-Value, non-relational, fully in-memory database RAMCloud and redesigns the traditional directory structure tree. Using a directory tree, it separates the directory structure from the directory node content, which makes node content easier to change. Each directory node carries an additional Parent_fileid field (pid for short); the pid field of a child node points to the fileid field of its parent node, and a secondary index is added on the pid field. When a user searches the directory, the pid secondary index, the Key-Value model, and in-memory storage together improve directory retrieval efficiency, and mount efficiency is also greatly improved.

The secondary index accelerates the retrieval of subdirectories: when a user searches for directory information, the pid secondary index, the Key-Value model, and full in-memory storage together improve directory retrieval efficiency.
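One way to picture the range lookup on pid is an ordered index over (pid, name) keys, in which all children of a directory occupy one contiguous key range. The sketch below is a hypothetical stand-in built on Python's `bisect` module. It does not use RAMCloud's IndexKey::IndexKeyRange or IndexLookup classes, and the function names are illustrative.

```python
import bisect

# Ordered list of ((pid, name), fileid) pairs: a stand-in for a sorted
# secondary index. Not the RAMCloud API.
index_entries = []


def add_entry(pid, name, fileid):
    """Insert an entry while keeping the list sorted by (pid, name)."""
    bisect.insort(index_entries, ((pid, name), fileid))


def lookup_range(first_key, last_key):
    """Return fileids whose key k satisfies first_key <= k < last_key."""
    lo = bisect.bisect_left(index_entries, (first_key,))
    hi = bisect.bisect_left(index_entries, (last_key,))
    return [fileid for _, fileid in index_entries[lo:hi]]


def list_children(pid):
    """All entries with this pid form one contiguous range of the index."""
    return lookup_range((pid, ""), (pid + 1, ""))


for pid, name, fileid in [(1, "A", 2), (1, "B", 3), (3, "D", 5), (3, "E", 6)]:
    add_entry(pid, name, fileid)

print(list_children(3))
```

Listing a directory then costs two binary searches plus the size of the result, rather than a scan over every key in the structure table.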

Brief description of the drawings

Fig. 1 is the abstract table structure diagram of the invention.

Fig. 2 is the secondary index architecture diagram of the invention.

Fig. 3 is the model diagram of an embodiment of the invention.

Detailed description of the embodiments

The invention is described in further detail below with reference to the drawings and specific embodiments, which are not intended to limit the invention.

As shown in Fig. 1, the specific steps of the method of the invention are as follows:

1) Replace the traditional disk-based MySQL database with the RAMCloud in-memory database;

2) Design the directory structure and the directory information content separately by defining two metadata tables, the TABLE_INDEX table and the TABLE_DATA table;

3) The TABLE_INDEX table stores the directory structure, with pid and name as the key and fileid as the value, which also guarantees the uniqueness of both key and value;

4) The TABLE_DATA table stores node content, with fileid as the key and the node's other information (name, nlink, atime, ...) packed into a unified struct as the value;

5) A secondary index mechanism is added at the same time: define the memory cloud index type IndexKey::IndexKeyRange; the original key formed from pid+name is converted into a compound key formed from pid+name and from pid alone, and the original value, a single fileid, is converted into a compound value consisting of the unique fileid and a query result that is a set of fileids; define the range lookup type IndexLookup, and search subdirectories by chained IndexLookup lookups, as shown in Fig. 2;

6) The key of the TABLE_DATA table is thus the value of the TABLE_INDEX table. When a user searches for a file, the directory structure is searched first: the directory containing the file is located and the fileid of the target node is obtained; the file content is then fetched from the TABLE_DATA table using that fileid as the key.
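Step 4 packs a node's remaining fields into one struct that serves as the TABLE_DATA value. A minimal sketch of such packing follows, using Python's `struct` module. The byte layout (a 64-byte name, a 32-bit link count, a 64-bit access time) is an assumption made for illustration; the patent does not specify field widths or the full field list.

```python
import struct

# Hypothetical fixed layout: 64-byte name, 32-bit nlink, 64-bit atime,
# little-endian with no padding. The patent does not define this layout.
NODE_FORMAT = "<64sIq"


def pack_node(name, nlink, atime):
    """Serialize the node fields into one byte string usable as a value."""
    return struct.pack(NODE_FORMAT, name.encode("utf-8"), nlink, atime)


def unpack_node(blob):
    """Inverse of pack_node; strips the NUL padding added to the name field."""
    raw_name, nlink, atime = struct.unpack(NODE_FORMAT, blob)
    return raw_name.rstrip(b"\x00").decode("utf-8"), nlink, atime


table_data = {}                                    # fileid -> packed record
table_data[5] = pack_node("D", 1, 1_528_848_000)   # example values only
print(unpack_node(table_data[5]))
```

A fixed-size record keeps every value the same length, at the cost of truncating names longer than the chosen field width; a real implementation might prefer a length-prefixed name.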

The invention is further described below through an embodiment, with reference to Fig. 3.

Take the directory structure /A, /B/D, /B/E, /C as an example. The root node / has fileid 1 and pid 0; pid 0 denotes the root node, whose parent directory node is empty. A has fileid 2 and pid 1, which links it to the root node /. Similarly, B has fileid 3 and pid 1, C has fileid 4 and pid 1, D has fileid 5 and pid 3, and E has fileid 6 and pid 3, which together form the directory tree.
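The example tree can be checked by following pid links from each node back to the root. The sketch below assumes that C has fileid 4, so that every fileid is unique. The record layout and the `full_path` helper are illustrative and not taken from the patent.

```python
# (name, fileid, pid) records for the example tree.
# Assumption: C has fileid 4 so that all fileids are distinct.
nodes = [
    ("/", 1, 0),
    ("A", 2, 1),
    ("B", 3, 1),
    ("C", 4, 1),
    ("D", 5, 3),
    ("E", 6, 3),
]

by_fileid = {fileid: (name, pid) for name, fileid, pid in nodes}


def full_path(fileid):
    """Follow pid links upward until the root (pid 0) is reached."""
    parts = []
    while True:
        name, pid = by_fileid[fileid]
        if pid == 0:  # root node: it has no parent
            break
        parts.append(name)
        fileid = pid
    return "/" + "/".join(reversed(parts))


paths = sorted(full_path(fileid) for fileid in by_fileid if fileid != 1)
print(paths)
```

Every path in the example is recovered from the fileid and pid fields alone, which shows that the two fields are sufficient to encode the tree.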

Each directory node has a fileid, and the parent_fileid (pid for short) of a child node points to the fileid of its parent node. A secondary index is added on pid; in this way the directory structure is stored and subdirectory retrieval is accelerated.

Finally, it should be noted that the above embodiments serve only to illustrate the technical solution of the invention and not to limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments of the invention may still be modified or equivalently replaced, and that any modification or equivalent replacement that does not depart from the spirit and scope of the invention shall fall within the scope of the present claims.

Claims

1. A method for implementing a metadata directory structure for a memory cloud, characterized by comprising the following steps:

1) using the RAMCloud in-memory database, defining a directory structure tree, wherein each node defines a fileid field and a pid field, data fields are assigned as needed, and the pid field of a child node points to the fileid field of its parent node;

2) defining a TABLE_INDEX table and a TABLE_DATA table to separate the directory structure from the directory information content, wherein the TABLE_INDEX table stores the directory structure, with pid and name as the key and fileid as the value, and the TABLE_DATA table stores node content, with fileid as the key and the node's other information packed into a unified struct as the value;

3) adding a secondary index mechanism: defining the memory cloud index type IndexKey::IndexKeyRange, in which a compound key is formed from pid+name and from pid alone, and the compound value is the fileid together with a query result that is a set of fileids; defining the range lookup type IndexLookup; and searching subdirectories by chained IndexLookup lookups;

4) the key of the TABLE_DATA table thus being the value of the TABLE_INDEX table, when a user searches for a file, first searching the directory structure to locate the directory containing the file and obtain the fileid of the target node, and then fetching the file content from the TABLE_DATA table using that fileid as the key.