WO2016018450A1 - Distributed segmented file systems - Google Patents

Distributed segmented file systems Download PDF

Info

Publication number
WO2016018450A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
file server
inode
server
segment
Prior art date
Application number
PCT/US2014/067945
Other languages
French (fr)
Inventor
Sudheer Kurichiyath
Original Assignee
Hewlett-Packard Development Company, L.P.
Priority date
Filing date
Publication date
Application filed by Hewlett-Packard Development Company, L.P. filed Critical Hewlett-Packard Development Company, L.P.
Publication of WO2016018450A1 publication Critical patent/WO2016018450A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 - File systems; File servers
    • G06F16/13 - File access structures, e.g. distributed indices
    • G06F16/134 - Distributed indices
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 - File systems; File servers
    • G06F16/18 - File system types
    • G06F16/182 - Distributed file systems

Definitions

  • the data may be generated by computing devices and may be generated for the use by the computing devices in the IT environments.
  • the generated data, or the data files are generally stored in storage devices that are managed by file systems.
  • the file systems control the storage and retrieval of data to and from storage devices.
  • the file systems may be implemented in a network- based data storage system in which the data or data files may be read, written, or accessed through a plurality of file servers.
  • Figure 1 illustrates a network environment implementing a distributed segmented file system based data storage system, according to an example of the present subject matter.
  • Figure 2 illustrates a distributed segmented file system based data storage system for creation of a shadow tree entry and inter-segment hard-links, according to an example of the present subject matter.
  • Figure 3 illustrates a method of accessing a file in a distributed segmented file system based data storage system, according to an example of the present subject matter.
  • Figure 4 illustrates a method of reading a file based on the shadow tree entry of a file, according to an example of the present subject matter.
  • Figure 5 illustrates a system environment for implementing a distributed segmented file system based data storage system, according to an example of the present subject matter.
  • the present subject matter relates to file servers and data storage systems that implement distributed segmented file systems, and methods of accessing files in distributed segmented file system based data storage systems.
  • the files herein, are understood as data files that are stored, or to be stored, in storage devices.
  • the files may be accessed for reading and writing data in the files.
  • a distributed segmented file system such as an IBRIX file system, deploys multiple file servers, where each file server has physical storage media segmented into contiguous units referred to as segments. Files in such a distributed segmented file system are distributed across the file servers, and at least the metadata, included in directory entries (dentries) and inodes, associated with the files and the directories thereof are controlled and stored in the segments in the file servers.
  • the file server may receive a file system call having a file identifier of the file, for example, an inode.
  • a file can be accessed by a file server through kernel virtual file system (VFS) operation interfaces.
  • the segment of the physical storage media of the file servers in which the inode resides may be determined.
  • the file server having the physical storage media that has the determined segment may be identified, and the file system call may be forwarded to the identified file server for accessing the file.
  • the file server having the determined segment may be identified based on a globally administered table that maps the segments to the file servers, for example, based on the internet protocol (IP) address of the file servers.
  • the file servers associated with the distributed segmented file system may communicate with each other over a network.
  • the file system call introduces an additional hop across the two file servers over the network for accessing the file each time. This additional hop for accessing the file may cause latency. This may affect the operation of applications, such as database transactions, mail applications, etc., which utilize the distributed segmented file system and are sensitive to latencies.
  • File servers and data storage systems that implement distributed segmented file systems, and methods of accessing files in distributed segmented file system based data storage systems are described herein.
  • additional hops are not introduced across the file servers during the file system calls even if the inode of a file to be accessed using one file server resides on a segment of another file server. With this, the latency, which is otherwise present due to the additional hop, can be avoided during file accessing. This facilitates improving the speed of file accessing in the data storage systems.
  • a data storage system implements a distributed segmented file system and is accessible through a plurality of file servers.
  • Each file server in the data storage system includes physical storage media which are segmented into contiguous units referred to as segments.
  • the file server may control metadata of files and of directories associated with the files, and the metadata may reside locally in the segments in the file servers.
  • the metadata of the files and directories may be included in dentries and inodes associated with the files and the directories.
  • a dentry of a file refers to an object that may include information, such as a name of the file and an inode number that points to the inode of the file and to a segment number in which the inode of the file resides.
  • the inode number may be a unique number that represents the inode.
  • the inode may include a variety of information of the file, such as type, length, access and modification times, delegation, location on disk, etc.
  • the data storage system includes a shared storage pool that may be composed of one or more storage devices commonly shared by the plurality of file servers for accessing, reading, and writing of files in the data storage system.
  • the shared storage pool may be abstracted to the plurality of file servers through Avatar Domain.
  • Avatar Domain, in addition to functioning as a logical volume or storage manager, logically owns on-disk inodes and data blocks associated with the files accessible through the plurality of file servers in the data storage system.
  • the on-disk inodes and data blocks may be physically stored in the shared storage pool. With the Avatar Domain owning the on-disk inodes and the data blocks of the files, the inodes residing in two different segments of two different file servers can be linked to a corresponding single on-disk inode in the Avatar Domain.
  • a dentry of the file to be accessed may be received by the first file server.
  • the received dentry may be stored in a segment of the first file server.
  • it may be identified whether the dentry of the file is pointing to an inode residing on a segment of another file server, let's say, a second file server.
  • the inode residing on the segment of the second file server may be pointing to, or hard-linked to, a corresponding on-disk inode of the file in the Avatar Domain.
  • a shadow tree entry of the file is created and stored in the first file server.
  • the shadow tree entry of the file is a file path associated with the inode on the segment in the second file server and linked to the on-disk inode of the file in the Avatar Domain.
  • the shadow tree entry may be stored in the segment of the first file server in which the dentry of the file resides.
  • the shadow tree entry thus stored in the segment of the first file server may have the inode of the file as its leaf node, which is linked to the corresponding on-disk inode in the Avatar Domain, thus forming inter-segment hard-links with the same on-disk inode.
  • the file can be directly accessed using the shadow tree entry in the first file server.
  • the accessing of the file through the first file server does not involve a hop to the second file server, which may otherwise be involved for determining the file path.
  • the accessing of the file through the first file server without the hop to the second file server does not cause latency, and thus may not affect the operation of applications, such as database transactions, mail applications, etc., utilizing the distributed segmented file system based data storage system.
  • the above systems and methods are further described with reference to Figures 1 to 5. It should be noted that the description and figures merely illustrate the principles of the present subject matter.
  • Figure 1 illustrates a network environment 100 implementing a distributed segmented file system based data storage system, according to an example of the present subject matter.
  • the data storage system includes a shared storage pool 102 that may be composed of one or more storage devices s1, s2, ..., sn where the on-disk inodes and the data blocks associated with files may be physically stored.
  • the shared storage pool 102 is a common storage that may be shared by a plurality of file servers 104-1 , 104-2, ... , 104-n, for the purpose of accessing, i.e., reading and writing, files on the shared storage pool 102.
  • the plurality of file servers 104-1 , 104-2, ... , 104-n, are hereinafter collectively referred to as file servers 104 and individually referred to as a file server 104.
  • the shared storage pool 102 is abstracted to the file servers 104 through Avatar Domain 106.
  • the Avatar Domain 106 is an abstraction layer or a logical layer that single-handedly owns the on-disk inodes and data blocks of the files in the shared storage pool 102.
  • the file servers 104 communicate with the Avatar Domain 106 for the purpose of accessing files from the shared storage pool 102.
  • each of the file servers 104 has local physical storage media that may be segmented into segments.
  • the segments in a file server 104 may be of equal size or unequal size, for example, from a few megabytes to a few gigabytes, depending on the segmentation.
  • the file servers 104 in the data storage system implement a distributed segmented file system, where the files to be accessed through the file servers 104 are distributed across different segments of the file servers 104.
  • the metadata of the files and the directories associated with the files may be controlled and stored in the segments of the file servers 104.
  • the metadata of the files and the directories may be in the form of dentries and inodes.
  • a dentry of a file may include information such as a name of the file and an inode number.
  • the inode number points to the inode of the file and the segment number in which the inode of the file resides.
  • An inode of a file may be represented by a unique number and may include information of the file, such as type, length, access and modification times, delegation, location on disk, etc.
  • the Avatar Domain 106 owns, or has the logical ownership of, the on-disk inodes and data blocks of the files stored in the shared storage pool 102 accessible by the file servers 104.
  • the inodes residing in two different segments of two different file servers 104 can point to a single on-disk inode in the shared storage pool 102 to form inter-segment hard-links with the same on-disk inode. Formation of inter-segment hard-links is described in detail through the description of Figure 2 later in the description.
  • the inter-segment hard-links form multiple access paths to the same file and allow accessing of the same file through different file servers 104, without hopping between the file servers 104 as described in the description herein.
  • the file servers 104 may communicate with each other and with the Avatar Domain 106 over a communication network 108.
  • the file servers 104 may access files on receiving a file access request from one or more client devices (not shown), over the communication network 108.
  • the client devices may include, but are not restricted to, laptop computers, desktop computers, notebooks, workstations, mainframe computers, smart phones, and personal digital assistants.
  • the communication network 108 may be a wireless network, a wired network, or a combination thereof.
  • the communication network 108 can also be an individual network or a collection of many such individual networks, interconnected with each other and functioning as a single large network, e.g., the Internet or an intranet.
  • the communication network 108 can be implemented as one of the different types of networks, such as intranet, local area network (LAN), wide area network (WAN), and the internet.
  • the communication network 108 may either be a dedicated network or a shared network, which represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), and Transmission Control Protocol/Internet Protocol (TCP/IP), to communicate with each other.
  • the communication network 108 may include a Global System for Mobile Communication (GSM) network, a Universal Mobile Telecommunications System (UMTS) network, or any other communication network that use any of the commonly used protocols, for example, Hypertext Transfer Protocol (HTTP) and Transmission Control Protocol/Internet Protocol (TCP/IP).
  • each of the file servers 104 includes processor(s) (not shown).
  • the processor(s) may be implemented as microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions.
  • the functions of processor(s) may be provided through the use of dedicated hardware as well as hardware capable of executing machine readable instructions.
  • the processor(s) in each file server 104 may be coupled to the segments of the respective file server 104 to perform functions associated with accessing of files from the shared storage pool 102.
  • the file servers 104 may include interface(s) (not shown).
  • the interface(s) may include a variety of commercially available interfaces, for example, interfaces for peripheral device(s), such as data input output devices, referred to as I/O devices, storage devices, and network devices.
  • the I/O device(s) may include Universal Serial Bus (USB) ports, Ethernet ports, host bus adaptors, and their corresponding device drivers.
  • the interface(s) may facilitate the communication of the file servers with various communication and computing devices and various communication networks, such as networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP) and Transmission Control Protocol/Internet Protocol (TCP/IP).
  • the file servers 104 may further include a memory (not shown) coupled to the processor(s).
  • the processor(s) may fetch and execute computer-readable instructions stored in the memory.
  • the memory may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, and flash memories.
  • Figure 2 illustrates a distributed segmented file system based data storage system for creation of a shadow tree entry and inter-segment hard-links, according to an example of the present subject matter.
  • Figure 2 shows two of the file servers, namely a first file server 104-1 and a second file server 104-2, which may communicate with the Avatar Domain 106 for accessing files stored in the shared storage pool 102.
  • the description herein describes the procedure of creation of a shadow tree entry and inter-segment hard-links for the purpose of accessing a file from the first file server 104-1 , when the inode of the file is originally residing in a segment of the second file server 104-2.
  • the first file server 104-1 may receive a file access request from a client device.
  • the file access request may include information of the file, for example, the file name, to be accessed through the first file server 104- 1 .
  • the first file server 104-1 may receive a dentry associated with the file. In an example implementation, for the IBRIX file system, the dentry of the file may be received based on a server lookup to an IBRIX global namespace that maps the file name with the metadata associated with the file.
  • the dentry may be stored in a segment in the first file server 104-1.
  • the dentry may be stored in segment 1, referenced by 204, in the first file server 104-1.
  • the dentry 202 includes the file name "Foo.c" and the inode number "17".
  • the inode number 17 may point to the segment number "3" indicative of a segment where the inode of the file resides.
  • the file server having the segment corresponding to the indicated segment number where the inode resides may be identified. This identification may be done using a globally administered table that maps the segments, or the segment numbers, to the file servers 104. As illustrated in Figure 2, the second file server 104-2 may be identified as the file server having the segment 3 where the inode number 17 resides. Segment 3 in the second file server 104-2 is referenced by 206.
  • the inode 17 residing on segment 3 of the second file server 104-2 is pointing to, or hard-linked to, a corresponding on-disk inode 17 in the Avatar Domain 106, thereby forming a hard-link between segment 3 and the on-disk inode 17.
  • the on-disk inode 17 is linked to the data blocks 1 to m that may be associated with the file.
  • the inode 17 residing in segment 3 of the second file server 104-2 is associated with a shadow tree entry which is indicative of a file path of the file corresponding to the inode 17.
  • the file path may include directories and sub-directories associated with the inode 17, from the root directory up to the inode 17.
  • the information of the file path associated with the inode 17 may be locally stored in segment 3.
  • the shadow tree entry is referenced by 208.
  • the first file server 104-1 creates a shadow tree entry which is the same as the shadow tree entry 208 in the second file server 104-2 and stores the shadow tree entry in the first file server 104-1.
  • the file "Foo.c", corresponding to the inode 17, can be accessed through the first file server 104-1 using the stored shadow tree entry. Such accessing does not involve hopping from the first file server 104-1 to the second file server 104-2.
  • the shadow tree entry 208 may be a persistent entry stored in a non-volatile memory in the first file server 104-1.
  • the shadow tree entry 208 may be a non-persistent entry stored in a volatile memory in the first file server 104-1.
  • the shadow tree entry 208 in segment 1 of the first file server 104-1 has inode 17 as the leaf node.
  • This inode 17 points to the on-disk inode 17 in the Avatar Domain 106, thereby forming a hard-link between segment 1 and the on-disk inode 17 in the Avatar Domain 106.
  • inter-segment hard-links are formed with the same on-disk inode 17 in the Avatar Domain 106.
  • a reference count of links of shadow tree entries in the plurality of file servers with each on-disk inode in the Avatar Domain 106 may be updated atomically based on a classic distributed lock manager. In an example implementation, a reference count of links of shadow tree entries in the plurality of file servers with each on-disk inode in the Avatar Domain 106 may be updated atomically based on IBRIX delegation. The reference count is incremented when a new segment hard-link is formed and is decremented when an existing segment hard-link is removed. Further, the inode in the Avatar Domain 106 is removed when the reference count is zero.
  • Figure 3 illustrates a method 300 of accessing a file in a distributed segmented file system based data storage system, according to an example of the present subject matter.
  • the order in which method 300 is described is not intended to be construed as a limitation, and some of the described method blocks can be combined in a different order to implement the method 300, or an alternative method.
  • the method 300 may be implemented in any suitable hardware, computer- readable instructions, or combination thereof.
  • the steps of the method 300 may be performed by either a computing device under the instruction of machine executable instructions stored on a non-transitory computer readable medium or by dedicated hardware circuits, microcontrollers, or logic circuits.
  • some examples are also intended to cover computer readable medium, for example, digital data storage media, which are machine or computer readable and encode machine-executable or computer-executable instructions, where said instructions perform some or all of the steps of the described method 300.
  • the method 300 includes receiving, by a first file server 104-1 of a plurality of file servers 104 in the data storage system, a directory entry (dentry) of a file to be accessed through the first file server 104-1.
  • the dentry of the file may be received on receiving a file access request from a client device.
  • the dentry may be stored in a segment of the first file server 104-1 .
  • the dentry may be received based on a server lookup to an IBRIX global namespace.
  • the method 300 includes identifying whether the dentry of the file to be accessed is pointing to an inode residing on a segment of a second file server 104-2, different from the first file server 104-1 .
  • the segment where the inode of the file resides may be determined based on the dentry.
  • the file server from amongst the file servers 104 in the data storage system that has the determined segment may be determined.
  • the inode residing in the second file server 104-2 may point to a corresponding on-disk inode of the file in Avatar Domain 106.
  • the Avatar Domain 106 is a logical layer through which a shared storage pool 102 is abstracted to the file servers 104.
  • the shared storage pool 102 may include one or more storage devices that are commonly shared by the file servers 104 for storing and accessing files.
  • the method 300 includes storing, in the first file server 104-1, a shadow tree entry of the file.
  • the shadow tree entry of the file is a file path associated with the inode of the file.
  • the shadow tree entry stored in the first file server 104-1 is linked to the on- disk inode of the file in the Avatar Domain 106.
  • the file can be accessed through the first file server 104-1 using the stored shadow tree entry, without hopping from the first file server 104-1 to the second file server 104-2.
  • before creating and storing the shadow tree entry in the first file server 104-1, it may be checked whether the shared storage pool 102 is abstracted through the Avatar Domain 106. If not, then the file accessing through the first file server 104-1 is based on hopping to the second file server 104-2, which introduces latency in file accessing.
  • a reference count of links of shadow tree entries in the plurality of file servers with each on-disk inode in the Avatar Domain may be updated.
  • the reference count of links may be updated atomically based on a classic distributed lock manager or based on IBRIX delegation.
  • Figure 4 illustrates a method 400 of reading a file based on the shadow tree entry of a file, according to an example of the present subject matter.
  • the order in which method 400 is described is not intended to be construed as a limitation, and some of the described method blocks can be combined in a different order to implement the method 400, or an alternative method.
  • the method 400 may be implemented in any suitable hardware, computer-readable instructions, or combination thereof.
  • the steps of the method 400 may be performed by either a computing device under the instruction of machine executable instructions stored on a non-transitory computer readable medium or by dedicated hardware circuits, microcontrollers, or logic circuits.
  • some examples are also intended to cover computer readable medium, for example, digital data storage media, which are machine or computer readable and encode machine-executable or computer-executable instructions, where said instructions perform some or all of the steps of the described method 400.
  • the method 400 includes determining, based on the inode in the dentry of the file to be accessed through a file server 104, whether the client device from which the file access request is received is delegated with read rights. If read rights are not delegated, then at block 404 the read rights may be obtained ("no" branch from block 402).
  • the method 400 includes determining whether the shadow tree entry associated with the inode of the file to be accessed through the file server 104 is stored in the file server 104 ("yes" branch from block 402). If the shadow tree entry is not stored in the file server 104, then at block 408 the file accessing takes a default read path ("no" branch from block 406) in which the file system call is transferred to that file server in which the inode of the file resides.
  • an on-disk inode in the Avatar Domain 106 is read ("yes" branch from block 406), which is linked with the shadow tree entry.
  • the reading of the on-disk inode is enabled by the segment hard-link between the file server 104 and the on-disk inode.
  • the data blocks associated with the on-disk inode are determined, and at block 414 a read access of the determined data blocks is provided to the file server 104.
  • the contents of the data blocks, read by the file server 104 are then provided to the client device by the file server 104.
  • FIG. 5 illustrates a system environment 500 for implementing a distributed segmented file system based data storage system, according to an example of the present subject matter.
  • the system environment 500 includes a processing resource 502 communicatively coupled to a non-transitory computer readable medium 504 through a communication link 506.
  • the processing resource 502 can be a processor of a file server 104 that implements a distributed segmented file system and utilizes the non- transitory computer readable medium 504 for accessing files in a data storage system.
  • the non-transitory computer readable medium 504 may be, for example, an internal memory device or an external memory device. In one implementation, the communication link 506 may be a direct communication link, such as any memory read/write interface. In another implementation, the communication link 506 may be an indirect communication link, such as a network interface. In such a case, the processing resource 502 may access the non-transitory computer readable medium 504 through a network 508.
  • the network 508 may be a single network or a combination of multiple networks and may use a variety of different communication protocols.
  • the processing resource 502 and the non-transitory computer readable medium 504 may also be communicating with data sources 510 over the network 508.
  • the data sources 510 may be a shared storage pool including one or more storage devices that are accessed by the processing resource 502 for storing and accessing files, in accordance with the present subject matter.
  • the data sources 510 are abstracted to the processing resource 502 through Avatar Domain 512 that single-handedly owns the on-disk inodes and the data blocks of the files stored in the data sources 510.
  • the non-transitory computer readable medium 504 includes a set of computer readable instructions, for example, to receive a dentry of a file to be accessed through the file server 104, identify whether the dentry of the file is pointing to an inode residing on a segment of another file server, and store, in the file server 104, a shadow tree entry of the file.
  • the set of computer readable instructions referred to as instructions hereinafter, can be accessed by the processing resource 502 through the communication link 506 and subsequently executed to perform acts for accessing files in the distributed segmented file system based data storage system.
  • a dentry of a file to be accessed through the file server 104 is received by the file server 104.
  • the dentry may be received based on a server lookup to an IBRIX global namespace and may be stored in a segment of the file server 104. Further, it may be identified whether the dentry of the file is pointing to an inode residing on a segment of another file server, wherein the inode in the other file server is pointing to an on-disk inode of the file in Avatar Domain 512.
  • a shadow tree entry of the file may be stored in the file server 104, wherein the shadow tree entry in the file server 104 is a file path associated with the inode of the file and linked to the on-disk inode of the file in the Avatar Domain 512.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present subject matter relates to distributed segmented file systems. In an implementation, a directory entry of a file to be accessed through a first file server is received by the first file server. It is identified whether the directory entry of the file to be accessed is pointing to an inode residing on a segment of a second file server, where the inode in the second file server is pointing to an on-disk inode of the file in Avatar Domain through which a shared storage pool is abstracted to a plurality of file servers. A shadow tree entry of the file is stored in the first file server, where the shadow tree entry in the first file server is a file path associated with the inode of the file and linked to the on-disk inode of the file in the Avatar Domain.

Description

DISTRIBUTED SEGMENTED FILE SYSTEMS
BACKGROUND
[0001] Advancements in information technology (IT) have led to exponential growth in data generation in IT environments. The data may be generated by computing devices and may be generated for use by the computing devices in the IT environments. The generated data, or the data files, are generally stored in storage devices that are managed by file systems. The file systems control the storage and retrieval of data to and from storage devices. The file systems may be implemented in a network-based data storage system in which the data or data files may be read, written, or accessed through a plurality of file servers.
BRIEF DESCRIPTION OF DRAWINGS
[0002] The detailed description is provided with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the drawings to reference like features and components.
[0003] Figure 1 illustrates a network environment implementing a distributed segmented file system based data storage system, according to an example of the present subject matter.
[0004] Figure 2 illustrates a distributed segmented file system based data storage system for creation of a shadow tree entry and inter-segment hard-links, according to an example of the present subject matter.
[0005] Figure 3 illustrates a method of accessing a file in a distributed segmented file system based data storage system, according to an example of the present subject matter.
[0006] Figure 4 illustrates a method of reading a file based on the shadow tree entry of a file, according to an example of the present subject matter.
[0007] Figure 5 illustrates a system environment for implementing a distributed segmented file system based data storage system, according to an example of the present subject matter.
DETAILED DESCRIPTION
[0008] The present subject matter relates to file servers and data storage systems that implement distributed segmented file systems, and methods of accessing files in distributed segmented file system based data storage systems. The files, herein, are understood as data files that are stored, or to be stored, in storage devices. The files may be accessed for reading and writing data in the files.
[0009] Recent advances in file systems have led to distributed segmented file systems that allow for easy increment of storage capacity and expansion of file systems. The distributed segmented file systems also permit data sharing, are easy to administer, and manage and operate effectively with large storage capacity and client loads. A distributed segmented file system, such as an IBRIX file system, deploys multiple file servers, where each file server has physical storage media segmented into contiguous units referred to as segments. Files in such a distributed segmented file system are distributed across the file servers, and at least the metadata, included in directory entries (dentries) and inodes, associated with the files and the directories thereof are controlled and stored in the segments in the file servers.
[0010] For accessing a file in the distributed segmented file system through a file server, the file server may receive a file system call having a file identifier of the file, for example, an inode. In case of in-kernel file servers, a file can be accessed by a file server through kernel virtual file system (VFS) operation interfaces. Based on the inode, the segment of the physical storage media of the file servers in which the inode resides may be determined. Subsequently, the file server having the physical storage media that has the determined segment may be identified, and the file system call may be forwarded to the identified file server for accessing the file. The file server having the determined segment may be identified based on a globally administered table that maps the segments to the file servers, for example, based on the internet protocol (IP) address of the file servers.
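To make the default routing step concrete, the sketch below shows how a globally administered segment-to-server table might be consulted to decide whether a file system call can be served locally or must be forwarded to another file server. This is a minimal illustration in Python; the table contents, the function name route_file_system_call, and the sample IP addresses are assumptions made for the example and are not taken from the IBRIX implementation.

```python
# Minimal sketch of the default routing step, with illustrative values only.

# Globally administered table mapping segment numbers to file server IP addresses.
SEGMENT_TO_SERVER = {
    1: "10.0.0.1",  # assumed: segments 1 and 2 reside on the first file server
    2: "10.0.0.1",
    3: "10.0.0.2",  # assumed: segment 3 resides on the second file server
}

def route_file_system_call(inode_segment: int, local_server_ip: str) -> str:
    """Return the IP address of the file server that should handle the call.

    If the segment holding the inode belongs to another file server, the call
    is forwarded, which is the additional network hop discussed above.
    """
    owner_ip = SEGMENT_TO_SERVER[inode_segment]
    return local_server_ip if owner_ip == local_server_ip else owner_ip

# A call arriving at the first server for an inode on segment 3 is forwarded.
print(route_file_system_call(3, "10.0.0.1"))  # prints "10.0.0.2"
```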
[001 1 ] The file servers associated with the distributed segmented file system may communicate with each other over a network. In a case where a file is to be accessed through one file server and the inode of the file resides in, or is pointing to, a segment of another file server, the file system call introduces an additional hop across the two file servers over the network for accessing the file each time. This additional hop for accessing the file may cause latency. This may affect the operation of applications, such as database transactions, mail applications, etc., which utilize the distributed segmented file system and are sensitive to latencies.
[0012] File servers and data storage systems that implement distributed segmented file systems, and methods of accessing files in distributed segmented file system based data storage systems, are described herein. In accordance with the present subject matter, additional hops are not introduced across the file servers during the file system calls even if the inode of a file to be accessed using one file server resides on a segment of another file server. With this, the latency, which is otherwise present due to the additional hop, can be avoided during file accessing. This facilitates improving the speed of file accessing in the data storage systems.
[0013] In an example implementation, a data storage system implements a distributed segmented file system and is accessible through a plurality of file servers. Each file server in the data storage system includes physical storage media which are segmented into contiguous units referred to as segments. The file server may control metadata of files and of directories associated with the files, and the metadata may reside locally in the segments in the file servers. The metadata of the files and directories may be included in dentries and inodes associated with the files and the directories. A dentry of a file refers to an object that may include information, such as a name of the file and an inode number that points to the inode of the file and to a segment number in which the inode of the file resides. The inode number may be a unique number that represents the inode. The inode may include a variety of information of the file, such as type, length, access and modification times, delegation, location on disk, etc.
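The dentry and inode metadata described above can be pictured with the minimal structures below. The class and field names are illustrative assumptions for this sketch and do not reflect the on-disk layout of any particular file system.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class Inode:
    """Illustrative inode: a unique number plus attributes of the file."""
    number: int
    file_type: str = "regular"
    length: int = 0
    atime: float = 0.0          # access time
    mtime: float = 0.0          # modification time
    disk_location: int = 0      # e.g., first data block of the file on disk

@dataclass
class Dentry:
    """Illustrative directory entry: file name plus a pointer to its inode."""
    name: str
    inode_number: int
    segment_number: int         # segment in which the inode resides

@dataclass
class Segment:
    """Illustrative segment holding the dentries and inodes stored in it."""
    number: int
    dentries: Dict[str, Dentry] = field(default_factory=dict)
    inodes: Dict[int, Inode] = field(default_factory=dict)
```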
[0014] In an example implementation, the data storage system includes a shared storage pool that may be composed of one or more storage devices commonly shared by the plurality of file servers for accessing, reading, and writing of files in the data storage system. The shared storage pool may be abstracted to the plurality of file servers through Avatar Domain. Avatar Domain, in addition to functioning as a logical volume or storage manager, logically owns on-disk inodes and data blocks associated with the files accessible through the plurality of file servers in the data storage system. The on-disk inodes and data blocks may be physically stored in the shared storage pool. With the Avatar Domain owning the on-disk inodes and the data blocks of the files, the inodes residing in two different segments of two different file servers can be linked to a corresponding single on-disk inode in the Avatar Domain.
[0015] In an example implementation, for the purpose of accessing a file through a file server, let's say, a first file server in a distributed segmented file system based data storage system, a dentry of the file to be accessed may be received by the first file server. The received dentry may be stored in a segment of the first file server. On receiving and storing the dentry, it may be identified whether the dentry of the file is pointing to an inode residing on a segment of another file server, let's say, a second file server. The inode residing on the segment of the second file server may be pointing to, or hard-linked to, a corresponding on-disk inode of the file in the Avatar Domain. On identifying that the dentry points to the inode residing on the segment of the second file server, a shadow tree entry of the file is created and stored in the first file server. The shadow tree entry of the file is a file path associated with the inode on the segment in the second file server and linked to the on-disk inode of the file in the Avatar Domain. The shadow tree entry may be stored in the segment of the first file server in which the dentry of the file resides. The shadow tree entry thus stored in the segment of the first file server may have the inode of the file as its leaf node, which is linked to the corresponding on-disk inode in the Avatar Domain, thus forming inter-segment hard-links with the same on-disk inode.
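A compact way to picture this flow is sketched below: when the dentry received by the first file server points to an inode on another server's segment, a shadow tree entry (the file path whose leaf is that inode) is recorded locally and linked to the single on-disk inode owned by the Avatar Domain. The AvatarDomain class, the dictionary layout, and the helper name store_shadow_tree_entry are assumptions made for illustration, not part of the described implementation.

```python
from typing import Dict, Set

class AvatarDomain:
    """Hypothetical stand-in for the layer owning on-disk inodes and data blocks."""
    def __init__(self) -> None:
        self.on_disk_inodes: Dict[int, dict] = {}

    def hard_link(self, inode_number: int) -> dict:
        # All segment inodes with this number link to the same on-disk inode.
        return self.on_disk_inodes.setdefault(inode_number, {"blocks": []})

def store_shadow_tree_entry(local_segment: dict,
                            inode_number: int,
                            inode_segment: int,
                            file_path: str,
                            local_segments: Set[int],
                            avatar: AvatarDomain) -> bool:
    """Store a shadow tree entry locally when the inode lives on a remote segment.

    file_path is the path from the root directory down to the file; its leaf
    inode is linked to the corresponding on-disk inode in the Avatar Domain.
    """
    if inode_segment in local_segments:
        return False  # the inode is local; no shadow tree entry is needed
    on_disk_inode = avatar.hard_link(inode_number)
    local_segment.setdefault("shadow_tree", {})[inode_number] = {
        "path": file_path,
        "on_disk_inode": on_disk_inode,  # inter-segment hard-link to same inode
    }
    return True
```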
[0016] Once the shadow tree entry of the file is stored in the first file server, the file can be directly accessed using the shadow tree entry in the first file server. With this, the accessing of the file through the first file server does not involve a hop to the second file server, which may otherwise be involved for determining the file path. The accessing of the file through the first file server without the hop to the second file server does not cause latency, and thus may not affect the operation of applications, such as database transactions, mail applications, etc., utilizing the distributed segmented file system based data storage system.
[0017] The above systems and methods are further described with reference to Figures 1 to 5. It should be noted that the description and figures merely illustrate the principles of the present subject matter. It is thus understood that various arrangements can be devised that, although not explicitly described or shown herein, embody the principles of the present subject matter. Moreover, all statements herein reciting principles, aspects, and implementations of the present subject matter, as well as specific examples thereof, are intended to encompass equivalents thereof.
[0018] Figure 1 illustrates a network environment 100 implementing a distributed segmented file system based data storage system, according to an example of the present subject matter. The data storage system includes a shared storage pool 102 that may be composed of one or more storage devices s1, s2, ..., sn where the on-disk inodes and the data blocks associated with files may be physically stored. The shared storage pool 102 is a common storage that may be shared by a plurality of file servers 104-1, 104-2, ..., 104-n, for the purpose of accessing, i.e., reading and writing, files on the shared storage pool 102. The plurality of file servers 104-1, 104-2, ..., 104-n, are hereinafter collectively referred to as file servers 104 and individually referred to as a file server 104.
[0019] As shown in Figure 1, the shared storage pool 102 is abstracted to the file servers 104 through Avatar Domain 106. The Avatar Domain 106 is an abstraction layer or a logical layer that single-handedly owns the on-disk inodes and data blocks of the files in the shared storage pool 102. The file servers 104 communicate with the Avatar Domain 106 for the purpose of accessing files from the shared storage pool 102.
[0020] Further, each of the file servers 104 has local physical storage media that may be segmented into segments. The segments in a file server 104 may be of equal size or unequal size, for example, from a few megabytes to a few gigabytes, depending on the segmentation. The file servers 104 in the data storage system implement a distributed segmented file system, where the files to be accessed through the file servers 104 are distributed across different segments of the file servers 104. In this, the metadata of the files and the directories associated with the files may be controlled and stored in the segments of the file servers 104. The metadata of the files and the directories may be in the form of dentries and inodes. A dentry of a file may include information such as a name of the file and an inode number. The inode number points to the inode of the file and the segment number in which the inode of the file resides. An inode of a file may be represented by a unique number and may include information of the file, such as type, length, access and modification times, delegation, location on disk, etc.
[0021] As mentioned earlier, the Avatar Domain 106 owns, or has the logical ownership of, the on-disk inodes and data blocks of the files stored in the shared storage pool 102 accessible by the file servers 104. With this, the inodes residing in two different segments of two different file servers 104 can point to a single on-disk inode in the shared storage pool 102 to form inter-segment hard-links with the same on-disk inode. Formation of inter-segment hard-links is described in detail through the description of Figure 2 later in the description. The inter-segment hard-links form multiple access paths to the same file and allow accessing of the same file through different file servers 104, without hopping between the file servers 104 as described in the description herein.
[0022] Further, in an example implementation, for accessing files from the shared storage pool 102, the file servers 104 may communicate with each other and with the Avatar Domain 106 over a communication network 108. The file servers 104 may access files on receiving a file access request from one or more client devices (not shown), over the communication network 108. The client devices may include, but are not restricted to, laptop computers, desktop computers, notebooks, workstations, mainframe computers, smart phones, and personal digital assistants.
[0023] The communication network 108 may be a wireless network, a wired network, or a combination thereof. The communication network 108 can also be an individual network or a collection of many such individual networks, interconnected with each other and functioning as a single large network, e.g., the Internet or an intranet. The communication network 108 can be implemented as one of the different types of networks, such as intranet, local area network (LAN), wide area network (WAN), and the internet. The communication network 108 may either be a dedicated network or a shared network, which represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), and Transmission Control Protocol/Internet Protocol (TCP/IP), to communicate with each other. In an example implementation, the communication network 108 may include a Global System for Mobile Communication (GSM) network, a Universal Mobile Telecommunications System (UMTS) network, or any other communication network that use any of the commonly used protocols, for example, Hypertext Transfer Protocol (HTTP) and Transmission Control Protocol/Internet Protocol (TCP/IP).
[0024] Further, each of the file servers 104 includes processor(s) (not shown). The processor(s) may be implemented as microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. The functions of processor(s) may be provided through the use of dedicated hardware as well as hardware capable of executing machine readable instructions. The processor(s) in each file server 104 may be coupled to the segments of the respective file server 104 to perform functions associated with accessing of files from the shared storage pool 102.
[0025] Apart from the processor(s), the file servers 104 may include interface(s) (not shown). The interface(s) may include a variety of commercially available interfaces, for example, interfaces for peripheral device(s), such as data input output devices, referred to as I/O devices, storage devices, and network devices. The I/O device(s) may include Universal Serial Bus (USB) ports, Ethernet ports, host bus adaptors, and their corresponding device drivers. The interface(s) may facilitate the communication of the file servers with various communication and computing devices and various communication networks, such as networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP) and Transmission Control Protocol/Internet Protocol (TCP/IP).
[0026] The file servers 104 may further include a memory (not shown) coupled to the processor(s). Among other capabilities, the processor(s) may fetch and execute computer-readable instructions stored in the memory. The memory may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, and flash memories.
[0027] Figure 2 illustrates a distributed segmented file system based data storage system for creation of a shadow tree entry and inter-segment hard-links, according to an example of the present subject matter. Figure 2 shows two of the file servers, namely a first file server 104-1 and a second file server 104-2, which may communicate with the Avatar Domain 106 for accessing files stored in the shared storage pool 102. The description herein describes the procedure of creation of a shadow tree entry and inter-segment hard-links for the purpose of accessing a file from the first file server 104-1 , when the inode of the file is originally residing in a segment of the second file server 104-2.
[0028] For the purpose of accessing a file from the shared storage pool 102, the first file server 104-1 may receive a file access request from a client device. The file access request may include information of the file, for example, the file name, to be accessed through the first file server 104-1. On receiving the file access request, the first file server 104-1 may receive a dentry associated with the file. In an example implementation, for the IBRIX file system, the dentry of the file may be received based on a server lookup to an IBRIX global namespace that maps the file name with the metadata associated with the file. The dentry may be stored in a segment in the first file server 104-1. As illustrated in Figure 2, the dentry, referenced by 202, may be stored in segment 1, referenced by 204, in the first file server 104-1. The dentry 202 includes the file name "Foo.c" and the inode number "17". The inode number 17 may point to the segment number "3" indicative of a segment where the inode of the file resides.
[0029] Further, based on the segment number indicated by the dentry, the file server having the segment corresponding to the indicated segment number where the inode resides may be identified. This identification may be done using a globally administered table that maps the segments, or the segment numbers, to the file servers 104. As illustrated in Figure 2, the second file server 104-2 may be identified as the file server having the segment 3 where the inode number 17 resides. Segment 3 in the second file server 104-2 is referenced by 206. Further, as shown, the inode 17 residing on segment 3 of the second file server 104-2 is pointing to, or hard-linked to, a corresponding on-disk inode 17 in the Avatar Domain 106, thereby forming a hard-link between segment 3 and the on-disk inode 17. The on-disk inode 17 is linked to the data blocks 1 to m that may be associated with the file.
[0030] In an example implementation, the inode 17 residing in segment 3 of the second file server 104-2 is associated with a shadow tree entry which is indicative of a file path of the file corresponding to the inode 17. The file path may include directories and sub-directories associated with the inode 17, from the root directory up to the inode 17. The information of the file path associated with the inode 17 may be locally stored in segment 3. The shadow tree entry is referenced by 208.
[0031] Now, since the inode 17 of the file to be accessed through the first file server 104-1 is residing in segment 3 in the second file server 104-2, the first file server 104-1 creates a shadow tree entry which is the same as the shadow tree entry 208 in the second file server 104-2 and stores the shadow tree entry in the first file server 104-1. By creating and storing the shadow tree entry associated with the inode 17 in the first file server 104-1, the file "Foo.c", corresponding to the inode 17, can be accessed through the first file server 104-1 using the stored shadow tree entry. Such accessing does not involve hopping from the first file server 104-1 to the second file server 104-2.
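For the concrete case of Figure 2, the hypothetical Dentry, AvatarDomain, and store_shadow_tree_entry sketches given earlier would be exercised roughly as follows; the path "/dir_a/Foo.c" and the other values are assumed purely to mirror the example, not taken from a real deployment.

```python
# Worked example mirroring Figure 2 (illustrative values only).
dentry_foo = Dentry(name="Foo.c", inode_number=17, segment_number=3)

avatar = AvatarDomain()
segment_1 = {"number": 1}        # local segment 1 on the first file server 104-1
local_segments_of_server_1 = {1}

created = store_shadow_tree_entry(
    local_segment=segment_1,
    inode_number=dentry_foo.inode_number,
    inode_segment=dentry_foo.segment_number,
    file_path="/dir_a/Foo.c",    # assumed path from the root directory to the file
    local_segments=local_segments_of_server_1,
    avatar=avatar,
)
# created is True: inode 17 resides on segment 3 of the second file server, so a
# shadow tree entry linked to on-disk inode 17 is now stored in segment 1.
```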
[0032] In an example implementation, the shadow tree entry 208 may be a persistent entry stored in a non-volatile memory in the first file server 104-1. In an example implementation, the shadow tree entry 208 may be a non-persistent entry stored in a volatile memory in the first file server 104-1.
[0033] As shown, the shadow tree entry 208 in segment 1 of the first file server 104-1 has inode 17 as the leaf node. This inode 17 points to the on-disk inode 17 in the Avatar Domain 106, thereby forming a hard-link between segment 1 and the on-disk inode 17 in the Avatar Domain 106. Thus, as shown in Figure 2, inter-segment hard-links are formed with the same on-disk inode 17 in the Avatar Domain 106.
[0034] Further, in an example implementation, a reference count of links of shadow tree entries in the plurality of file servers with each on-disk inode in the Avatar Domain 106 may be updated atomically based on a classic distributed lock manager. In an example implementation, a reference count of links of shadow tree entries in the plurality of file servers with each on-disk inode in the Avatar Domain 106 may be updated atomically based on IBRIX delegation. The reference count is incremented when a new segment hard-link is formed and is decremented when an existing segment hard-link is removed. Further, the inode in the Avatar Domain 106 is removed when the reference count is zero.
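The reference-count bookkeeping described above might look like the sketch below, where a process-local threading.Lock stands in for the classic distributed lock manager or IBRIX delegation mentioned in the text; the class and method names are assumptions made for illustration.

```python
import threading
from typing import Callable, Dict

class OnDiskInodeRefCounts:
    """Illustrative atomic reference counting of inter-segment hard-links."""

    def __init__(self) -> None:
        self._lock = threading.Lock()      # stand-in for a distributed lock manager
        self._counts: Dict[int, int] = {}  # on-disk inode number -> link count

    def link_added(self, inode_number: int) -> int:
        """Atomically increment the count when a new segment hard-link is formed."""
        with self._lock:
            self._counts[inode_number] = self._counts.get(inode_number, 0) + 1
            return self._counts[inode_number]

    def link_removed(self, inode_number: int,
                     remove_inode: Callable[[int], None]) -> int:
        """Atomically decrement the count; drop the on-disk inode at zero."""
        with self._lock:
            self._counts[inode_number] -= 1
            count = self._counts[inode_number]
            if count == 0:
                del self._counts[inode_number]
                remove_inode(inode_number)  # remove the inode in the Avatar Domain
            return count
```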
[0035] Figure 3 illustrates a method 300 of accessing a file in a distributed segmented file system based data storage system, according to an example of the present subject matter. The order in which method 300 is described is not intended to be construed as a limitation, and some of the described method blocks can be combined in a different order to implement the method 300, or an alternative method. Furthermore, the method 300 may be implemented in any suitable hardware, computer-readable instructions, or combination thereof.
[0036] The steps of the method 300 may be performed by either a computing device under the instruction of machine executable instructions stored on a non-transitory computer readable medium or by dedicated hardware circuits, microcontrollers, or logic circuits. Herein, some examples are also intended to cover computer readable medium, for example, digital data storage media, which are machine or computer readable and encode machine-executable or computer-executable instructions, where said instructions perform some or all of the steps of the described method 300.
[0037] With reference to method 300 as depicted in Figure 3, at block 302, the method 300 includes receiving, by a first file server 104-1 of a plurality of file servers 104 in the data storage system, a directory entry (dentry) of a file to be accessed through the first file server 104-1. The dentry of the file may be received on receiving a file access request from a client device. The dentry may be stored in a segment of the first file server 104-1. In an example implementation, for the IBRIX file system, the dentry may be received based on a server lookup to an IBRIX global namespace.
[0038] At block 304, the method 300 includes identifying whether the dentry of the file to be accessed is pointing to an inode residing on a segment of a second file server 104-2, different from the first file server 104-1. For the identification, the segment where the inode of the file resides may be determined based on the dentry. Then, the file server from amongst the file servers 104 in the data storage system that has the determined segment may be determined. The inode residing in the second file server 104-2 may point to a corresponding on-disk inode of the file in Avatar Domain 106. The Avatar Domain 106 is a logical layer through which a shared storage pool 102 is abstracted to the file servers 104. The shared storage pool 102 may include one or more storage devices that are commonly shared by the file servers 104 for storing and accessing files.
[0039] Further, at block 306, the method 300 includes storing, in the first file server 104-1, a shadow tree entry of the file. The shadow tree entry of the file is a file path associated with the inode of the file. The shadow tree entry stored in the first file server 104-1 is linked to the on-disk inode of the file in the Avatar Domain 106. As mentioned earlier, by storing the shadow tree entry in the first file server 104-1, the file can be accessed through the first file server 104-1 using the stored shadow tree entry, without hopping from the first file server 104-1 to the second file server 104-2.
[0040] In an example implementation, before creating and storing the shadow tree entry in the first file server 104-1, it may be checked whether the shared storage pool 102 is abstracted through the Avatar Domain 106. If not, then the file is accessed through the first file server 104-1 by hopping to the second file server 104-2, which introduces latency in file accessing.
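The sketch below retraces blocks 302 to 306, together with the Avatar Domain check of paragraph [0040], in Python. The Dentry and FileServer types and the access_file helper are hypothetical stand-ins introduced only for illustration and do not reflect the actual IBRIX data structures.

```python
from dataclasses import dataclass, field

@dataclass
class Dentry:
    path: str
    inode_number: int
    segment_id: int          # segment on which the inode of the file resides

@dataclass
class FileServer:
    name: str
    owned_segments: set
    avatar_domain_enabled: bool = True
    shadow_tree: dict = field(default_factory=dict)   # file path -> inode number

    def owns(self, segment_id):
        return segment_id in self.owned_segments

def access_file(first_server, dentry, all_servers):
    """Blocks 302-306 of method 300, plus the paragraph [0040] fallback check."""
    # Block 304: does the dentry point to an inode on another server's segment?
    if first_server.owns(dentry.segment_id):
        return ("local-read", first_server.name)

    owner = next(s for s in all_servers if s.owns(dentry.segment_id))

    # Paragraph [0040]: without the Avatar Domain abstraction, fall back to hopping.
    if not first_server.avatar_domain_enabled:
        return ("hop-and-read", owner.name)

    # Block 306: store the shadow tree entry (the file path linked to the on-disk
    # inode in the Avatar Domain) so later accesses avoid the hop entirely.
    first_server.shadow_tree[dentry.path] = dentry.inode_number
    return ("avatar-domain-read", first_server.name)

# Usage: a dentry received by server A (block 302) whose inode lives on a
# segment owned by server B.
a = FileServer("A", owned_segments={1})
b = FileServer("B", owned_segments={2})
print(access_file(a, Dentry("/exports/report.dat", inode_number=42, segment_id=2), [a, b]))
```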
[0041] In an example implementation, a reference count of links of shadow tree entries in the plurality of file servers with each on-disk inode in the Avatar Domain may be updated. The reference count of links may be updated atomically based on a classic distributed lock manager or based on IBRIX delegation.
[0042] Figure 4 illustrates a method 400 of reading a file based on the shadow tree entry of a file, according to an example of the present subject matter. The order in which method 400 is described is not intended to be construed as a limitation, and some of the described method blocks can be combined in a different order to implement the method 400, or an alternative method. Furthermore, the method 400 may be implemented in any suitable hardware, computer-readable instructions, or combination thereof.
[0043] The steps of the method 400 may be performed by either a computing device under the instruction of machine executable instructions stored on a non-transitory computer readable medium or by dedicated hardware circuits, microcontrollers, or logic circuits. Herein, some examples are also intended to cover computer readable media, for example, digital data storage media, which are machine or computer readable and encode machine-executable or computer-executable instructions, where said instructions perform some or all of the steps of the described method 400.
[0044] With reference to method 400 as depicted in Figure 4, at block 402, the method 400 includes determining, based on the inode in the dentry of the file to be accessed through a file server 104, whether the client device from which the file access request is received is delegated with read rights. If read rights are not delegated, then at block 404 the read rights may be obtained ("no" branch from block 402).
[0045] If read rights are delegated, then at block 406, the method 400 includes determining whether the shadow tree entry associated with the inode of the file to be accessed through the file server 104 is stored in the file server 104 ("yes" branch from block 402). If the shadow tree entry is not stored in the file server 104, then at block 408 the file accessing takes a default read path ("no" branch from block 406) in which the file system call is transferred to that file server in which the inode of the file resides.
[0046] If the shadow tree entry is stored in the file server 104, then at block 410 an on-disk inode in the Avatar Domain 106 is read ("yes" branch from block 406), which is linked with the shadow tree entry. The reading of the on-disk inode is enabled by the segment hard-link between the file server 104 and the on-disk inode. Further, at block 412, the data blocks associated with the on-disk inode are determined, and at block 414 a read access of the determined data blocks is provided to the file server 104. The contents of the data blocks, read by the file server 104, are then provided to the client device by the file server 104.
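A minimal Python sketch of this read path follows, assuming the shadow tree and the Avatar Domain can be modeled as simple in-memory mappings. The function and variable names are hypothetical, and the delegation and disk I/O steps are reduced to placeholders.

```python
def read_file(shadow_tree, avatar_domain, path, client_has_read_delegation):
    """Simplified walk through blocks 402-414 of method 400 (hypothetical names)."""
    # Blocks 402/404: make sure the requesting client holds read rights.
    if not client_has_read_delegation:
        client_has_read_delegation = obtain_read_rights()

    # Blocks 406/408: without a stored shadow tree entry, take the default read
    # path, i.e. the file system call is forwarded to the server owning the inode.
    inode_number = shadow_tree.get(path)
    if inode_number is None:
        return ("default-read-path", None)

    # Blocks 410-414: follow the segment hard-link to the on-disk inode in the
    # Avatar Domain, determine its data blocks, and read them from this server.
    on_disk_inode = avatar_domain[inode_number]
    data = [read_block(block_id) for block_id in on_disk_inode["data_blocks"]]
    return ("shadow-tree-read-path", data)

def obtain_read_rights():
    return True                                  # placeholder for block 404

def read_block(block_id):
    return f"<contents of block {block_id}>"     # placeholder for actual disk I/O

# Usage: the file server's local shadow tree maps the file path to inode 42,
# which the Avatar Domain resolves to two data blocks.
shadow_tree = {"/exports/report.dat": 42}
avatar_domain = {42: {"data_blocks": [7, 8]}}
print(read_file(shadow_tree, avatar_domain, "/exports/report.dat", True))
```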
[0047] Figure 5 illustrates a system environment 500 for implementing a distributed segmented file system based data storage system, according to an example of the present subject matter. The system environment 500 includes a processing resource 502 communicatively coupled to a non-transitory computer readable medium 504 through a communication link 506. In one implementation, the processing resource 502 can be a processor of a file server 104 that implements a distributed segmented file system and utilizes the non-transitory computer readable medium 504 for accessing files in a data storage system.
[0048] The non-transitory computer readable medium 504 may be, for example, an internal memory device or an external memory device. In one implementation, the communication link 506 may be a direct communication link, such as any memory read/write interface. In another implementation, the communication link 506 may be an indirect communication link, such as a network interface. In such a case, the processing resource 502 may access the non-transitory computer readable medium 504 through a network 508. The network 508 may be a single network or a combination of multiple networks and may use a variety of different communication protocols.
[0049] The processing resource 502 and the non-transitory computer readable medium 504 may also be communicating with data sources 510 over the network 508. The data sources 510 may be a shared storage pool including one or more storage devices that are accessed by the processing resource 502 for storing and accessing files, in accordance with the present subject matter. The data sources 510 are abstracted to the processing resource 502 through Avatar Domain 512, which single-handedly owns the on-disk inodes and the data blocks of the files stored in the data sources 510.
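As a rough illustration of that single-ownership model, the hypothetical AvatarDomain class below keeps the only mapping from on-disk inodes to data blocks, so every file server resolves blocks through it. The class and its methods are illustrative assumptions, not part of the described system.

```python
class AvatarDomain:
    """Hypothetical single owner of on-disk inodes and data blocks in the shared pool."""

    def __init__(self):
        self._inodes = {}          # inode number -> list of data block ids

    def register_inode(self, inode_number, data_blocks):
        # Only the Avatar Domain records where a file's blocks live.
        self._inodes[inode_number] = list(data_blocks)

    def data_blocks(self, inode_number):
        # Any file server holding a segment hard-link resolves blocks here,
        # without asking the file server on whose segment the inode was created.
        return self._inodes[inode_number]

# Usage: two file servers share one Avatar Domain over the storage pool.
domain = AvatarDomain()
domain.register_inode(42, data_blocks=[7, 8])
print(domain.data_blocks(42))      # either server can resolve inode 42 directly
```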
[0050] In one implementation, the non-transitory computer readable medium 504 includes a set of computer readable instructions, for example, to receive a dentry of a file to be accessed through the file server 104, identify whether the dentry of the file is pointing to an inode residing on a segment of another file server, and store, in the file server 104, a shadow tree entry of the file. The set of computer readable instructions, referred to as instructions hereinafter, can be accessed by the processing resource 502 through the communication link 506 and subsequently executed to perform acts for accessing files in the distributed segmented file system based data storage system.
[0051] For discussion purposes, the execution of the instructions by the processing resource 502 has been described with reference to various components introduced earlier with reference to the description of Figures 1 and 2.
[0052] On execution by the processing resource 502, a dentry of a file to be accessed through the file server 104 is received by the file server 104. The dentry may be received based on a server lookup to an IBRIX global namespace and may be stored in a segment of the file server 104. Further, it may be identified whether the dentry of the file is pointing to an inode residing on a segment of another file server, wherein the inode in the other file server is pointing to an on-disk inode of the file in Avatar Domain 512. Further, a shadow tree entry of the file may be stored in the file server 104, wherein the shadow tree entry in the file server 104 is a file path associated with the inode of the file and linked to the on-disk inode of the file in the Avatar Domain 512.
[0053] Although implementations of accessing files in a distributed segmented file system based data storage system have been described in language specific to structural features and/or methods, it is to be understood that the present subject matter is not necessarily limited to the specific features or methods described. Rather, the specific features and methods are disclosed and explained in the context of a few implementations for accessing files in a distributed segmented file system based data storage system.

Claims

I/We claim:
1. A file server that implements a distributed segmented file system, the file server comprising:
at least one segment for storing directory entries and inodes of one or more files stored in a shared storage pool, wherein the shared storage pool is abstracted to a plurality of file servers through Avatar Domain; and
a processor coupled to the at least one segment to:
identify whether a directory entry, in the at least one segment, of a file to be accessed through the file server is pointing to an inode residing on a segment of another file server, from amongst the plurality of file servers, wherein the inode in the other file server is pointing to an on-disk inode of the file in the Avatar Domain; and
store, in the file server, a shadow tree entry of the file, wherein the shadow tree entry in the file server is a file path associated with the inode of the file and linked to the on-disk inode of the file in the Avatar Domain.
2. The file server as claimed in claim 1, wherein the file server receives the directory entry of the file based on a server lookup to an IBRIX global namespace.
3. The file server as claimed in claim 1, wherein the shadow tree entry is stored in one of a non-volatile memory and a volatile memory in the file server.
4. A method of accessing a file in a distributed segmented file system based data storage system accessible through a plurality of file servers, wherein the method comprises:
receiving, by a first file server of the plurality of file servers, a directory entry of a file to be accessed through the first file server, wherein the directory entry is stored in a segment of the first file server; identifying whether the directory entry of the file to be accessed is pointing to an inode residing on a segment of a second file server from amongst the plurality of servers, wherein the inode in the second file server is pointing to an on-disk inode of the file in Avatar Domain through which a shared storage pool is abstracted to the plurality of file servers; and
storing, in the first file server, a shadow tree entry of the file, wherein the shadow tree entry in the first file server is a file path associated with the inode of the file and linked to the on-disk inode of the file in the Avatar Domain.
5. The method as claimed in claim 4, wherein the receiving the directory entry of the file is based on a server lookup to an IBRIX global namespace.
6. The method as claimed in claim 4 further comprising updating a reference count of links of shadow tree entries in the plurality of file servers with each on-disk inode in the Avatar Domain, wherein the reference count of links is updated atomically based on a classic distributed lock manager.
7. The method as claimed in claim 4 further comprising updating a reference count of links of shadow tree entries in the plurality of file servers with each on-disk inode in the Avatar Domain, wherein the reference count of links is updated atomically based on IBRIX delegation.
8. A non-transitory computer-readable medium comprising computer readable instructions that, when executed, cause a file server that implements a distributed segmented file system to:
receive a directory entry of a file to be accessed through the file server, wherein the directory entry is stored in a segment of the file server;
identify whether the directory entry of the file is pointing to an inode residing on a segment of another file server, wherein the inode in the other file server is pointing to an on-disk inode of the file in Avatar Domain through which a shared storage pool is abstracted to a plurality of file servers, including the file server and the other file server; and store, in the file server, a shadow tree entry of the file, wherein the shadow tree entry in the file server is a file path associated with the inode of the file and linked to the on-disk inode of the file in the Avatar Domain.
9. The non-transitory computer-readable medium as claimed in claim 8 comprising computer readable instructions that, when executed, cause the file server to store the shadow tree entry in one of a non-volatile memory and a volatile memory in the file server.
10. The non-transitory computer-readable medium as claimed in claim 8 comprising computer readable instructions that, when executed, cause the file server to receive the directory entry of the file based on a server lookup to an IBRIX global namespace.