CN106874383A - Decoupled distribution method for distributed file system metadata - Google Patents

Decoupled distribution method for distributed file system metadata

Info

Publication number
CN106874383A
Authority
CN
China
Prior art keywords
metadata
file
catalogue
directory
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710016284.7A
Other languages
Chinese (zh)
Other versions
CN106874383B (en)
Inventor
陆游游
舒继武
李思阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201710016284.7A priority Critical patent/CN106874383B/en
Publication of CN106874383A publication Critical patent/CN106874383A/en
Application granted granted Critical
Publication of CN106874383B publication Critical patent/CN106874383B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/13: File access structures, e.g. distributed indices
    • G06F16/134: Distributed indices
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/18: File system types
    • G06F16/182: Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a decoupled distribution method for distributed file system metadata, comprising: separating the metadata of a distributed file system to obtain directory metadata, directory entry metadata, and file metadata; storing the directory metadata centrally on a directory metadata index node, without any pointer to directory entries; performing directory operations against the directory index node; and splitting the directory entry metadata so that each part is stored on the same node as its associated file metadata, with a reverse index pointing to the directory metadata. The invention has the following advantages: it reduces the information exchanged between nodes when the distributed file system accesses metadata and lowers metadata access latency; in addition, by separating directory contents it decouples the strong association between files and directories, so that high throughput can be reached and the efficiency with which the distributed file system processes metadata is improved.

Description

Decoupled distribution method for distributed file system metadata
Technical field
The present invention relates to the field of computers, and in particular to a decoupled distribution method for distributed file system metadata.
Background
A distributed file system is a type of storage system that supports mass data storage and is widely used in data centers, supercomputing centers, and public cloud platforms. Compared with traditional centralized storage, distributed file systems have several advantages. First, storage can be scaled horizontally: capacity can be expanded dynamically by adding storage nodes, with access throughput increasing along with it. Second, distributed file systems offer more flexible fault-tolerance strategies than centralized storage, since replication and erasure codes can be used for distributed fault tolerance. A distributed file system can also build a large storage cluster from relatively inexpensive storage and computing devices to serve mass data. However, constrained by the file system access standard (POSIX), metadata access often becomes the performance bottleneck of a distributed file system. Metadata access frequently cannot meet the requirements of high throughput and low latency, yet in real systems more than half of all data accesses must pass through a metadata node. To address the scalability of distributed file system metadata, existing techniques mainly fall into the following three categories:
The first is a distributed metadata partitioning method based on the directory tree. This method divides the namespace of the distributed file system into subtrees by subdirectory; each subtree is stored independently on some node, and the storing node is adjusted dynamically according to access load. Its advantage is that placement can follow the load dynamically. However, it cannot solve the path traversal problem of file access: accessing a file requires accessing every directory along its path, and these directories are often not stored on the same node, which often causes large access latency.
The second is a hash-based metadata distribution method, which distributes the files within a directory across different nodes by hashing. Its advantage is that when a directory contains a large number of files, the file access load on any single node is reduced. However, it cannot solve the scalability problem of directories.
The third stores file metadata in a key-value database, exploiting the fast access and low latency of key-value databases. However, like the first method, it still suffers from the path lookup problem and therefore cannot solve the problem of high access latency.
To mitigate path lookup latency, these methods often cache metadata on the client, but this introduces considerable overhead for handling inconsistency, so the problem cannot be solved at its root.
Summary of the invention
The present invention aims to solve at least one of the above technical problems.
To this end, an object of the present invention is to propose a decoupled distribution method for distributed file system metadata, in order to solve the problems of low throughput and high latency in the metadata handling of distributed file systems.
To achieve the above object, an embodiment of the invention discloses a decoupled distribution method for distributed file system metadata, comprising the following steps: S1: separating the metadata of the distributed file system to obtain directory metadata, directory entry metadata, and file metadata; S2: placing the directory metadata on a directory index node; S3: splitting each directory entry according to the distribution of files, storing each directory entry on the node that stores its associated file, and establishing a reverse index pointing to the directory metadata.
Further, the directory operations include creating a directory, deleting a directory, reading a directory, obtaining all metadata of a directory, changing the user group of a directory, and changing the owner of a directory.
Further, the method also includes: providing an identifier that globally and uniquely determines a file; computing the hash value of the global identifier of the file to be accessed; and locating the node that stores the metadata according to the hash value.
Further, the identifier is the full path of the file.
Further, the method also includes: when a file or directory is created, creating, on the node where the file or directory is created, all directory entries of the parent directory path containing the file or directory; if some or all of those directory entries have already been created on that node, only the remaining directory entries are created.
Further, the method also includes: when a file is deleted, deleting the file's metadata on the node where the file resides, and deleting the item pointing to the file from the corresponding directory entry metadata on that node.
Further, the method also includes: when a read-directory or delete-directory operation is performed, accessing all metadata nodes to obtain all directory entries under the directory being read or deleted.
Further, the method also includes: providing a client cache, wherein the directory metadata cached on the client is used to determine, when the client creates a file, whether it has permission to create the file; when accessing the metadata of a file, the client first accesses the metadata of the directory to obtain the access permissions; and when the client has access permission, it accesses the metadata of the file.
Further, the method also includes: performing a change to the permissions of a directory, or a deletion of a directory, once the client caches of the directory metadata have expired.
According to the decoupled distribution method for distributed file system metadata of the embodiments of the invention, every metadata operation on a file accesses at most two nodes, and only one node when the directory metadata is cached, in which case the latency is a single network round trip (RTT). Because a key-value store is used, the time to fetch the metadata itself is very low and negligible compared with the RTT of Ethernet. The method can therefore effectively reduce the information exchanged between nodes when the distributed file system accesses metadata and lower metadata access latency. In addition, by separating directory contents it decouples the strong association between files and directories, so that high throughput can be reached and the efficiency with which the distributed file system processes metadata is improved.
Additional aspects and advantages of the invention will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart of the decoupled distribution method for distributed file system metadata according to an embodiment of the invention;
Fig. 2 is an overall system architecture diagram of one embodiment of the invention;
Fig. 3 is a schematic diagram of directory content partitioning and decoupling in one embodiment of the invention.
Detailed description
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals denote identical or similar elements, or elements having identical or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary, are intended only to explain the invention, and are not to be construed as limiting it.
These and other aspects of the embodiments of the invention will become clear from the following description and drawings. The description and drawings disclose particular implementations of the embodiments to show some of the ways in which the principles of the embodiments may be practiced, but it should be understood that the scope of the embodiments is not limited thereby. On the contrary, the embodiments of the invention include all changes, modifications, and equivalents falling within the spirit and scope of the appended claims.
The present invention is described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of the decoupled distribution method for distributed file system metadata according to an embodiment of the invention. As shown in Fig. 1, the method comprises the following steps:
S1: Separate the metadata of the distributed file system to obtain directory metadata, directory entry metadata, and file metadata.
S2: Place the directory metadata on a directory index node.
S3: Split each directory entry according to the distribution of files, store each directory entry on the node that stores its associated file, and establish a reverse index pointing to the directory metadata.
In one embodiment of the invention, the directory metadata of the file system is stored centrally on a single node. In this arrangement, the metadata of a directory index node contains no address pointing to directory entry metadata. Only the basic directory data is retained, including but not limited to the directory's creation time, permission flags, group identifier, and owner user ID. On this basis, most metadata operations that concern the directory index node's metadata and do not involve directory entry metadata are carried out only on the node that stores the directory index metadata. Directory operations include creating a directory, deleting a directory, reading a directory, obtaining all metadata of a directory, changing the user group of a directory, and changing the owner of a directory.
In one embodiment of the invention, the method also includes a hash-based distributed storage mechanism for file metadata. This mechanism allows the storage of and access to file metadata to be spread over multiple nodes in order to balance system load. The algorithm uses an identifier that globally and uniquely determines a file: when a client performs a metadata operation on a file, it computes the hash value of the globally unique identifier of the file to be accessed, locates the node on which the file is stored, and performs the metadata operation on that node. This ensures that any file metadata operation modifies the information of at most one metadata node.
In one embodiment of the invention, the identifier is the full path of the file.
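The hash-based placement described above can be sketched roughly as follows. This is a minimal illustration rather than the patented implementation: the function name `locate_node`, the choice of SHA-256, and the simple modulo mapping are all assumptions made for the example, and the full path is used as the identifier as in this embodiment.

```python
import hashlib


def locate_node(file_id: str, num_nodes: int) -> int:
    """Map a globally unique file identifier (here: the full path) to a node index.

    Hypothetical helper: SHA-256 and modulo placement are illustrative choices only.
    """
    digest = hashlib.sha256(file_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_nodes


# Every client computes the same node for the same path, so no lookup
# service is needed to find where a file's metadata lives.
node_index = locate_node("/home/user/report.txt", num_nodes=4)
```

Because the mapping depends only on the identifier and the node count, a metadata operation on one file touches a single file metadata node.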
In one embodiment of the invention, the method also includes a reverse storage method for directory entries. The directory entry metadata is split and assigned to multiple nodes, so that creating and deleting files incurs no extra node access overhead. The distribution works as follows: when a file or directory is created, no metadata needs to be modified on any node other than the node where the file is created; instead, all directory entries of the parent directory path containing the file or directory are created on the node where the file or directory is created. If some or all of these directory entries have already been created on that node, only the remaining ones are created. With this reverse storage method, creating or deleting a file does not cause modifications to multiple metadata nodes, which reduces the distributed synchronization overhead of accessing multiple nodes.
In one embodiment of the invention, when a file is deleted, the file's metadata on the node where the file resides is deleted, together with the item pointing to the file in the corresponding directory entry metadata on that node.
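A rough sketch of the reverse storage idea: each file metadata node keeps, next to the file metadata it stores, the fragment of the parent directory's entries that refers to those files. The class and method names are hypothetical, and in-memory dictionaries stand in for the key-value store.

```python
from collections import defaultdict


class FileMetadataNode:
    """Toy file metadata node holding file metadata plus its local directory entries."""

    def __init__(self):
        self.f_inodes = {}                 # (parent_dir_id, name) -> metadata dict
        self.d_entries = defaultdict(set)  # parent_dir_id -> names stored on this node

    def create_file(self, parent_dir_id, name, meta):
        key = (parent_dir_id, name)
        if key in self.f_inodes:
            raise FileExistsError(name)
        self.f_inodes[key] = dict(meta)
        # The directory entry is created locally; no other node is modified.
        self.d_entries[parent_dir_id].add(name)

    def delete_file(self, parent_dir_id, name):
        del self.f_inodes[(parent_dir_id, name)]
        entries = self.d_entries[parent_dir_id]
        entries.discard(name)
        if not entries:
            del self.d_entries[parent_dir_id]

    def list_dir(self, parent_dir_id):
        # Only the fragment of the directory that lives on this node.
        return sorted(self.d_entries.get(parent_dir_id, ()))
```

Both create and delete touch only this node's state, which is the property the embodiment relies on to avoid cross-node synchronization.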
In one embodiment of the invention, the method also includes: when a read-directory or delete-directory operation is performed, accessing all metadata nodes to obtain all directory entries under the directory being read or deleted.
In one embodiment of the invention, the method also includes: providing a client cache, wherein the directory metadata cached on the client is used to determine, when the client creates a file, whether it has permission to create the file; when accessing the metadata of a file, the client first accesses the metadata of the directory to obtain the access permissions; and when the client has access permission, it accesses the metadata of the file.
In one embodiment of the invention, the method also includes: performing a change to the permissions of a directory, or a deletion of a directory, once the client caches of the directory metadata have expired.
To help those skilled in the art further understand the invention, it is described in detail through the following example.
As shown in Fig. 2, a metadata node that stores directories is provided. This node handles all requests concerning directory metadata in the distributed file system. In the implementation, it uses an Ethernet-based RPC protocol and receives requests from clients through a standard POSIX access interface. It is divided into four modules, among them an access interface that executes metadata request operations, a key-value store that persists metadata to disk, and a directory content module that caches the directory contents of each node. When the metadata node starts, it scans every value in the key-value store to build the directory contents.
In handling directory metadata, the metadata of a directory comprises the directory's creation time, permission flags, group identifier, owner user ID, and globally unique identifier. The directory path is a variable-length string, and each of the remaining metadata items is a fixed-length 8-byte field. This metadata node supports the directory operations, namely creating a directory, deleting a directory, reading a directory, obtaining all metadata of a directory, changing the user group of a directory, and changing the owner of a directory.
The directory creation process begins when the node receives a create-directory request from a client. The request contains the path of the directory to create, the permission mode, and the user group identifier and user ID under which the client creates the directory. The permission mode comprises the read/write permissions of the directory owner, of users in the user group, and of other users. After the node receives the request, it first checks the permissions on the parent directory of the directory being created; once the creation conditions are confirmed to be met, it writes the directory's metadata into the store of the directory metadata node. The creation conditions are that the specified parent directory exists and that the client has access permission on it.
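The create-directory check can be sketched as follows. This is an illustrative model only: the class name, the dictionary that stands in for the key-value store, and the use of the POSIX write bits as the "access permission" test are assumptions, not details given in the text.

```python
import posixpath
import time


class DirectoryMetadataNode:
    """Toy stand-in for the single directory metadata node (names are hypothetical)."""

    def __init__(self):
        # path -> directory metadata; the root is owned by uid 0 / gid 0.
        self.dirs = {"/": {"mode": 0o755, "uid": 0, "gid": 0, "ctime": time.time()}}

    @staticmethod
    def _may_write(meta, uid, gid):
        # Select the owner, group, or other write bit of the permission mode.
        if uid == meta["uid"]:
            return bool(meta["mode"] & 0o200)
        if gid == meta["gid"]:
            return bool(meta["mode"] & 0o020)
        return bool(meta["mode"] & 0o002)

    def mkdir(self, path, mode, uid, gid):
        path = posixpath.normpath(path)
        parent = posixpath.dirname(path)
        if parent not in self.dirs:
            raise FileNotFoundError(parent)   # the parent directory must exist
        if not self._may_write(self.dirs[parent], uid, gid):
            raise PermissionError(parent)     # caller must hold access rights
        if path in self.dirs:
            raise FileExistsError(path)
        self.dirs[path] = {"mode": mode, "uid": uid, "gid": gid, "ctime": time.time()}
```

All checks and the final write happen on one node, so creating a directory needs a single request.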
For deleting a directory, reading all metadata of a directory, changing the user group of a directory, and changing directory attributes, the node, after receiving the client's request, first checks whether the directory exists and whether the client has access permission. Only after both are confirmed is the corresponding operation performed on the directory. One special case: when deleting a directory, it must be confirmed that its subdirectories and the files under them have already been deleted.
In the overall architecture, there is logically only one directory metadata node in the whole system. The metadata of all directories is stored on this one metadata node. To ensure its reliability, synchronous backup may be used, but distributed scaling may not.
Fig. 2 also depicts a node that stores file metadata. This metadata node handles all file-related metadata requests of the distributed file system and stores file system metadata in a distributed manner. It comprises a standard POSIX access interface supporting file operations, a metadata management layer that stores metadata by category, and a key-value store that persists metadata to disk.
The metadata of each file comprises the globally unique identifier of its parent directory as described in the first aspect, the file name, the access time of the file, the permissions of the file, the user group identifier of the file, the modification time of the file, the access time of the file content, the size of the file, the block size of the file, the global identifier of the metadata node that created the file, and the metadata node on which the file was created. The globally unique identifier of the parent directory and the file name are concatenated into a variable-length string. All other metadata fields are fixed-length 8-byte fields. This metadata node supports the file metadata operations, namely creating a file, deleting a file, obtaining the metadata of a file, changing the user group of a file, changing the owner of a file, changing the permissions of a file, reading and writing a file, and changing the size of a file.
File creation is divided into two stages. In the first stage, the client accesses the directory to which the new file belongs: it contacts the directory metadata node described in the first aspect and determines whether it has permission to create a file in that directory. Once the request is confirmed to be a legal creation request, the second stage begins: the client sends the request to a file metadata node, which uses the string formed by the globally unique identifier of the parent directory and the file name as the lookup key on that node. If no such file exists on the node, the file is created successfully.
For other file metadata operations, the file must first be located and the access permission then checked; if the client has access permission, the corresponding modification is made. Among these modifications, the metadata update for a file read or write can complete only after the file data has been written to, or successfully read from, the back-end storage system; this process needs to be defined as a transaction.
The metadata node is a network node based on RPC communication. On initialization it opens the corresponding service port and starts the database that stores directory metadata. Metadata operation requests are processed concurrently by multiple threads.
As shown in Fig. 2, there are logically multiple file metadata nodes in the whole system. For each file, the node that stores its metadata is determined by a computation over the string formed by the unique identifier of its parent directory and the file name. This computation uses a consistent hashing algorithm, which gives good scalability and also supports adding file metadata nodes dynamically.
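Consistent hashing over the (parent directory identifier, file name) string might look like the sketch below. The ring construction with virtual nodes, the `HashRing` name, the `/` separator in the key, and SHA-256 are all assumptions chosen for illustration; the text only states that a consistent hashing algorithm is used.

```python
import bisect
import hashlib


def _hash(text: str) -> int:
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node contributes `vnodes` points on the ring.
        self._ring = sorted((_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self._points = [point for point, _ in self._ring]

    def locate(self, parent_dir_id, name):
        key = f"{parent_dir_id}/{name}"
        # First ring point clockwise from the key's hash, wrapping around at the end.
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[i][1]


ring = HashRing(["fms0", "fms1", "fms2"])
owner = ring.locate(42, "report.txt")
```

The design choice that matters is that adding a node inserts new points without moving existing ones, so a key either stays where it was or moves to the new node; this is what allows file metadata nodes to be added dynamically.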
As shown in Fig. 2, the architecture also includes a distributed file system client with a cache. On the client, applications can access the distributed storage directly through the library provided by the distributed file system. On Linux, the library can be called through a user-space file system (FUSE), which lets the user mount the file system directly on the client. After the client starts, it builds a directory cache, which sits on top of the access library.
The client communicates over the RPC protocol and applies a different access strategy to each kind of operation. For directory metadata operations, including mkdir (create a directory), getattr (get the attributes of a file or directory; here only directory metadata is obtained), chmod (set file or directory permissions), and chown (change the owner of a file or directory), the client accesses the directory metadata node directly, supplying different parameters for different operations. The node processes the request as soon as it is received and returns the result to the client.
For file metadata operations, including open, read, write, getattr, truncate, and so on, the client first consults the directory cache to check whether it has access permission on the parent directory. If the directory cache holds no relevant information, the client first accesses the directory metadata node, caches the parent directory's metadata in its directory cache, and then checks the permission. Once permission is confirmed, the client sends the metadata operation request to the file metadata node.
Note in particular that for readdir (read directory contents), the client sends a read-directory-contents request to all file metadata nodes and the directory metadata node; each node returns the directory contents it stores, and the client merges them into an ordered list that it returns to the user. For rmdir (remove a directory), the client sends a request for the directory contents to all file metadata nodes and the directory metadata node; only when every returned result is empty is the deletion confirmed to be legal, and only then may the remove-directory request be issued to the file metadata node.
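The fan-out for readdir and the emptiness check for rmdir can be sketched as follows. In this simplified model each node is just a dictionary mapping a directory identifier to the names it stores, which replaces the RPC calls of the real system; the function names are hypothetical.

```python
def readdir(dir_id, directory_node, file_nodes):
    """Merge the directory fragments held by every metadata node into one ordered list."""
    names = set(directory_node.get(dir_id, ()))   # subdirectories
    for node in file_nodes:                       # files, spread over all file nodes
        names.update(node.get(dir_id, ()))
    return sorted(names)


def can_rmdir(dir_id, directory_node, file_nodes):
    """A directory may be removed only if every node reports no entries for it."""
    return not readdir(dir_id, directory_node, file_nodes)


directory_node = {1: ["sub"]}
file_nodes = [{1: ["b.txt"]}, {1: ["a.txt"]}, {}]
listing = readdir(1, directory_node, file_nodes)
```

These two operations are the ones that pay for the decoupling: they must contact every metadata node, whereas create, delete, and attribute operations contact at most two.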
Regarding the client cache, this example also needs to handle the following situation. Because some clients may modify directories while the system is running, cached directory metadata must be given an expiry time. A change to a directory's permissions or a deletion of a directory can proceed only after the cached copies of that directory's metadata on all clients have expired. Therefore, when a client sends either of these requests, the request is not executed immediately; the update or deletion is carried out only after the cache expiry period set by the directory metadata node has elapsed.
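One way to model this expiry rule is a lease: every time the directory metadata node hands out metadata for caching, it records when that cached copy will expire, and it refuses permission changes until then. The sketch below is an assumption-laden simplification: the class name, the injectable clock, and returning `False` instead of blocking the request until the timeout are illustrative choices, not details from the text.

```python
class LeasedDirectoryNode:
    """Toy directory metadata node that defers permission changes until leases expire."""

    def __init__(self, lease_seconds, clock):
        self.lease_seconds = lease_seconds
        self.clock = clock          # callable returning the current time in seconds
        self.modes = {}             # path -> permission mode
        self.lease_until = {}       # path -> latest expiry handed out to any client

    def read_for_cache(self, path):
        # The client may cache the result until `expires`.
        expires = self.clock() + self.lease_seconds
        self.lease_until[path] = max(self.lease_until.get(path, 0.0), expires)
        return self.modes[path], expires

    def chmod(self, path, mode):
        if self.clock() < self.lease_until.get(path, 0.0):
            return False            # some client cache may still be valid; retry later
        self.modes[path] = mode
        return True
```

With a 30-second lease, a chmod issued 10 seconds after a client cached the directory would be deferred, and would succeed once the 30 seconds have passed.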
The client shown in Fig. 2 also needs a configuration file that holds the global map. The global map contains the IP addresses of the directory metadata node and the file metadata nodes and the globally unique number of each file metadata node. Based on this map and numbering, the client can compute the location of a file's node with the hash algorithm and exchange information with the corresponding metadata node over the network. During initialization, the client reads in the global map and establishes heartbeat connections with the directory metadata node and the file metadata nodes, checking at regular intervals whether they are still working normally.
As shown in Fig. 3, the metadata must be partitioned as it is organized. This example defines four structures: the directory metadata (d-inode), the directory contents (d-entry), the file metadata (f-inode), and the file contents (f-content). In a traditional file system, a d-inode indexes f-inodes through d-entries, and an f-inode indexes the actual contents of a file. This example proposes a new way of organizing metadata, in which each node organizes its own directory contents. As shown in Fig. 2, through decoupling, the original d-entry is split up and assigned to each d-inode. This removes the strong coupling between files and directories that the d-entry introduces. In the implementation, when a file is created, the client first reads the parent directory to determine whether creation is allowed; as explained above, the parent directory's metadata can be obtained from the client's directory cache. The client then assigns the file to a file metadata node for storage, using a distribution method based on consistent hashing. When the file is stored, it is also added to the directory content cache of the file metadata node on which it resides, which avoids interaction with other nodes and reduces access latency. When the directory contents are needed, each node is accessed and the decoupled directory contents are aggregated into one complete directory listing that is returned to the application.
As shown in Fig. 2, the example also persists metadata through a key-value database. The name or globally unique identifier of a file or directory is used as the key, the metadata of the file or directory is used as the value, and the pair is stored in the key-value database.
On the directory metadata node, the key-value database uses the path as the lookup key and stores the remaining metadata as the value.
On a file metadata node, the string formed by the globally unique identifier of the parent directory and the file name is the lookup key, and the other metadata is stored as the value.
In the client of the distributed file system, metadata is cached in the same key-value form as on the directory metadata node. Specifically, when the metadata of a directory is stored, the directory path is the key and the other directory metadata is the value; in addition, a globally unique identifier for the directory is appended to the end of the key-value record, and this identifier is managed by the directory metadata node. When a file is created, the unique identifier of its parent directory together with the file name is the key, and the remaining file metadata is the value. Because files are distributed across the nodes, having a single node create and manage unique identifiers for all files would inevitably make that node a performance bottleneck. Therefore, this example combines the identifier of the node on which the file is created with an identifier, assigned by that node, that is unique on that node, to form a globally unique file identifier. This identifier does not change when the file is later modified, for example renamed or moved to another path. When the file's metadata is saved, this identifier is stored at the tail of the key-value record.
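The bottleneck-free identifier scheme can be sketched like this: each file metadata node hands out identifiers from its own local counter and prefixes them with its node identifier. The class name and the 32-bit split between node identifier and local counter are assumptions made for the example; the text does not specify a layout.

```python
import itertools


class FileIdAllocator:
    """Per-node allocator of globally unique file identifiers (illustrative layout)."""

    LOCAL_BITS = 32  # assumption: low 32 bits hold the node-local counter

    def __init__(self, node_id: int):
        self.node_id = node_id
        self._counter = itertools.count(1)

    def allocate(self) -> int:
        local = next(self._counter)
        # High bits identify the creating node, low bits are unique on that node.
        return (self.node_id << self.LOCAL_BITS) | local


first_id = FileIdAllocator(3).allocate()
```

Because no two nodes share a prefix, identifiers never collide across nodes and no central allocator has to be contacted when a file is created.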
In the internal data structure of key assignments storage, this example is deposited for different data, services using different key assignments Storage.Wherein, using the key assignments data storage storehouse based on B-tree, file metadata node is used based on Hash directory metadata node Key assignments data storage storehouse, in the client of distributed file system using based on internal memory database purchase catalogue metadata Caching.
To optimize metadata storage, this example uses variable-length keys and fixed-length values. This requires the set of fields of each kind of metadata to be fixed in advance. Every metadata field of the directory metadata has a fixed length, and every metadata field on the file metadata nodes also has a fixed length. On storage, a fixed-length field is written directly into the value of the key-value store, without serialization or deserialization. In addition, the method needs no extra in-memory data structure to cache metadata; metadata is cached directly in the cache of the key-value database.
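A fixed-length value layout of this kind can be sketched as follows. The field list, field widths, and byte order are hypothetical; the source only states that every field has a fixed length. Python has no raw in-memory record type, so the `struct` module stands in for writing fields at fixed offsets: the point of the sketch is that a single field can be read or updated at a known offset without parsing the whole value.

```python
import struct

# Hypothetical layout: mode, uid, gid (uint32 each), then size, mtime,
# and the global file ID at the tail (uint64 each). "<" disables padding.
FILE_META = struct.Struct("<IIIQQQ")
SIZE_OFFSET = 12  # three uint32 fields precede the size field


def pack_file_meta(mode, uid, gid, size, mtime, file_id) -> bytes:
    """Build the fixed-length value stored under a file's key."""
    return FILE_META.pack(mode, uid, gid, size, mtime, file_id)


def read_size(value: bytes) -> int:
    """Read one field at its fixed offset, leaving the rest untouched."""
    return struct.unpack_from("<Q", value, SIZE_OFFSET)[0]


def update_size(value: bytearray, size: int) -> None:
    """Overwrite one field in place at its fixed offset."""
    struct.pack_into("<Q", value, SIZE_OFFSET, size)
```

Every value produced by `pack_file_meta` has the same length (`FILE_META.size` bytes), which is what allows field access by offset.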
In the results achieved by the invention, every metadata operation on a file accesses at most two nodes. When the directory metadata is cached, only one node needs to be accessed, and the latency is a single network round-trip time (RTT). Because key-value storage is used, the time spent retrieving the metadata itself is very low and negligible compared with the RTT of an Ethernet network. The method therefore effectively reduces the information exchanged between nodes when the distributed file system accesses metadata and lowers metadata access latency. At the same time, by separating directory content and decoupling the strong association between files and directories, the method can reach very high throughput, improving the efficiency with which the distributed file system processes metadata.
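The access counts stated above can be illustrated with a small simulation. This is a sketch under assumptions: the classes `MetadataNode` and `Client`, the CRC32-based placement of files on nodes (chosen in the spirit of claims 3 and 4), and the metadata fields are all hypothetical, and each `get` call stands for one network round trip.

```python
import zlib


class MetadataNode:
    """Stand-in for a metadata server; counts every remote access."""

    def __init__(self):
        self.store = {}
        self.accesses = 0

    def get(self, key):
        self.accesses += 1  # one network round trip in a real deployment
        return self.store.get(key)


class Client:
    def __init__(self, dir_node, file_nodes):
        self.dir_node = dir_node
        self.file_nodes = file_nodes
        self.dir_cache = {}  # client-side cache of directory metadata

    def file_node_for(self, path):
        # Assumed placement rule: a hash of the full path picks the node.
        index = zlib.crc32(path.encode("utf-8")) % len(self.file_nodes)
        return self.file_nodes[index]

    def stat(self, path):
        parent, _, name = path.rpartition("/")
        parent = parent or "/"
        dir_meta = self.dir_cache.get(parent)
        if dir_meta is None:
            dir_meta = self.dir_node.get(parent)  # first access: directory node
            if dir_meta is None:
                raise FileNotFoundError(parent)
            self.dir_cache[parent] = dir_meta
        # Second access, or the only one on a cache hit: file metadata node.
        return self.file_node_for(path).get((dir_meta["guid"], name))


def total_accesses(client):
    """Sum of remote accesses across the directory node and all file nodes."""
    return client.dir_node.accesses + sum(n.accesses for n in client.file_nodes)
```

With a cold cache, `stat` touches the directory node and one file metadata node; once the parent directory is cached, a repeated `stat` touches only the file metadata node.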
In addition, the other components and functions of the decoupled metadata distribution method for a distributed file system according to the embodiments of the invention are known to those skilled in the art and, to avoid redundancy, are not described again here.
In this specification, a description that refers to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Schematic uses of these terms in this specification do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.
Although embodiments of the invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and spirit of the invention; the scope of the invention is defined by the claims and their equivalents.

Claims (9)

1. A decoupled distribution method for metadata of a distributed file system, characterized by comprising the following steps:
S1: separating the metadata of the distributed file system to obtain the metadata of directories, the metadata of directory entries, and the metadata of files;
S2: placing the metadata of the directories on a directory node;
S3: splitting each directory entry according to the distribution of the files, storing the associated directory entries on the nodes where the files are stored, and building a reverse index that points to the directory metadata.
2. The decoupled distribution method for metadata of a distributed file system according to claim 1, characterized in that the directory operations include creating a directory, deleting a directory, reading all metadata of a directory, obtaining a directory, changing the user group of a directory, and changing the user who owns a directory.
3. The decoupled distribution method for metadata of a distributed file system according to claim 1, characterized by further comprising:
providing a globally unique identifier that determines a file;
computing the hash value of the global identifier of the file to be accessed;
locating the node on which the metadata is stored according to the hash value.
4. The decoupled distribution method for metadata of a distributed file system according to claim 3, characterized in that the identifier is the full path of the file.
5. The decoupled distribution method for metadata of a distributed file system according to claim 1, characterized by further comprising:
when a file or directory is created, creating, on the node where the file or directory is created, all directory entries along the parent-directory path of the file or directory;
if all or some of the directory entries have already been created on the node, creating only the remaining directory entries.
6. The decoupled distribution method for metadata of a distributed file system according to claim 1, characterized by further comprising:
when a file is deleted, deleting the metadata on the node where the file resides, and deleting the item that points to the file from the corresponding directory-entry metadata on that node.
7. The decoupled distribution method for metadata of a distributed file system according to claim 1, characterized by further comprising:
when a directory is read or deleted, accessing all metadata nodes to obtain all directory entries under the directory being read or deleted.
8. The decoupled distribution method for metadata of a distributed file system according to claim 1, characterized by further comprising:
providing a client cache, wherein the directory metadata cached by the client is used, when the client creates a file, to determine whether the client has permission to create the file;
when accessing the metadata of a file, the client accesses the metadata of the directory to obtain access permission;
when the client has access permission, accessing the metadata of the file.
9. The decoupled distribution method for metadata of a distributed file system according to claim 8, characterized by further comprising:
invalidating the directory metadata cached on the client when the permissions of a directory in the directory metadata are changed and when a directory is deleted.
CN201710016284.7A 2017-01-10 2017-01-10 Decoupling distribution method of metadata of distributed file system Active CN106874383B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710016284.7A CN106874383B (en) 2017-01-10 2017-01-10 Decoupling distribution method of metadata of distributed file system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710016284.7A CN106874383B (en) 2017-01-10 2017-01-10 Decoupling distribution method of metadata of distributed file system

Publications (2)

Publication Number Publication Date
CN106874383A true CN106874383A (en) 2017-06-20
CN106874383B CN106874383B (en) 2019-12-20

Family

ID=59165495

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710016284.7A Active CN106874383B (en) 2017-01-10 2017-01-10 Decoupling distribution method of metadata of distributed file system

Country Status (1)

Country Link
CN (1) CN106874383B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101354726A (en) * 2008-09-17 2009-01-28 中国科学院计算技术研究所 Method for managing memory metadata of cluster file system
CN101692239A (en) * 2009-10-19 2010-04-07 浙江大学 Method for distributing metadata of distributed type file system
CN103150394A (en) * 2013-03-25 2013-06-12 中国人民解放军国防科学技术大学 Distributed file system metadata management method facing to high-performance calculation
CN103473337A (en) * 2013-09-22 2013-12-25 北京航空航天大学 Massive catalogs and files oriented processing method in distributed type storage system
CN104537023A (en) * 2014-12-19 2015-04-22 华为技术有限公司 Storage method and device for reverse index records

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PENG SUN: "MetaFlow: A Scalable Metadata Lookup Service for Distributed File Systems in Data Centers", 《IEEE TRANSACTIONS ON BIG DATA》 *
YAO SUN: "A Distributed Cache Framework for Metadata Service of Distributed File Systems", 《2013 19TH IEEE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS》 *

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107479827A (en) * 2017-07-24 2017-12-15 上海德拓信息技术股份有限公司 A kind of mixing storage system implementation method based on IO and separated from meta-data
CN109947730A (en) * 2017-07-25 2019-06-28 中兴通讯股份有限公司 Metadata restoration methods, device, distributed file system and readable storage medium storing program for executing
CN109947730B (en) * 2017-07-25 2024-02-02 中兴通讯股份有限公司 Metadata recovery method, device, distributed file system and readable storage medium
CN108595287A (en) * 2018-04-27 2018-09-28 新华三技术有限公司成都分公司 Data truncation method and device based on correcting and eleting codes
CN108595287B (en) * 2018-04-27 2021-11-05 新华三技术有限公司成都分公司 Data truncation method and device based on erasure codes
CN108984617A (en) * 2018-06-13 2018-12-11 西安交通大学 A kind of metadata catalog structure implementation method towards memory cloud
CN110727675A (en) * 2018-07-17 2020-01-24 阿里巴巴集团控股有限公司 Method and device for processing linked list
CN110727675B (en) * 2018-07-17 2023-06-27 阿里巴巴集团控股有限公司 Method and device for processing linked list
CN109783449A (en) * 2018-12-13 2019-05-21 深圳壹账通智能科技有限公司 Data query processing method, platform, system and readable storage medium storing program for executing
CN109783462A (en) * 2018-12-13 2019-05-21 创新科存储技术有限公司 A kind of data access method and device based on distributed file system
CN109783462B (en) * 2018-12-13 2021-01-05 创新科技术有限公司 Data access method and device based on distributed file system
CN110046133A (en) * 2019-04-12 2019-07-23 苏州浪潮智能科技有限公司 A kind of metadata management method, the apparatus and system of storage file system
WO2021004295A1 (en) * 2019-07-05 2021-01-14 中兴通讯股份有限公司 Metadata processing method and apparatus, and computer-readable storage medium
CN110704375A (en) * 2019-09-26 2020-01-17 深圳前海大数金融服务有限公司 File management method, device, equipment and computer storage medium
WO2021103600A1 (en) * 2019-11-29 2021-06-03 浪潮电子信息产业股份有限公司 Method, apparatus and device for deleting distributed system file, and storage medium
US12001397B2 (en) 2019-11-29 2024-06-04 Inspur Electronic Information Industry Co., Ltd. Method, apparatus and device for deleting distributed system file, and storage medium
CN111143293A (en) * 2019-12-22 2020-05-12 浪潮电子信息产业股份有限公司 Metadata acquisition method, device, equipment and computer readable storage medium
CN111143293B (en) * 2019-12-22 2022-06-07 浪潮电子信息产业股份有限公司 Metadata acquisition method, device, equipment and computer readable storage medium
CN111309677A (en) * 2020-02-11 2020-06-19 西安奥卡云数据科技有限公司 File management method and device of distributed file system
CN111309677B (en) * 2020-02-11 2023-05-23 西安奥卡云数据科技有限公司 File management method and device of distributed file system
CN111258508A (en) * 2020-02-16 2020-06-09 西安奥卡云数据科技有限公司 Metadata management method in distributed object storage
CN111400249A (en) * 2020-03-06 2020-07-10 深圳市瑞驰信息技术有限公司 File storage system and method easy for counting file number
CN111949619B (en) * 2020-07-21 2024-04-26 苏州元核云技术有限公司 Dynamic catalog generation method, system, electronic equipment and storage medium
CN111949619A (en) * 2020-07-21 2020-11-17 苏州元核云技术有限公司 Dynamic directory generation method, system, electronic device and storage medium
CN113190505A (en) * 2021-04-15 2021-07-30 网宿科技股份有限公司 Metadata management method, file storage system and server
CN113836143A (en) * 2021-09-28 2021-12-24 新华三大数据技术有限公司 Index creation method and device
CN113836143B (en) * 2021-09-28 2024-02-27 新华三大数据技术有限公司 Index creation method and device
CN114048185A (en) * 2021-11-18 2022-02-15 北京聚存科技有限公司 Method for transparently packaging, storing and accessing massive small files in distributed file system
CN115098466A (en) * 2022-07-18 2022-09-23 重庆紫光华山智安科技有限公司 Metadata management method and device, storage node and readable storage medium
CN116795296B (en) * 2023-08-16 2023-11-21 中移(苏州)软件技术有限公司 Data storage method, storage device and computer readable storage medium
CN116795296A (en) * 2023-08-16 2023-09-22 中移(苏州)软件技术有限公司 Data storage method, storage device and computer readable storage medium

Also Published As

Publication number Publication date
CN106874383B (en) 2019-12-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information
Inventor after: Lu Youyou, Shu Jiwu
Inventor before: Lu Youyou, Shu Jiwu, Li Siyang