WO2016130167A1 - Consistency check on namespace of an online file system - Google Patents

Consistency check on namespace of an online file system

Info

Publication number
WO2016130167A1
Authority
WO
WIPO (PCT)
Prior art keywords
inode
directory
dentry
file system
consistency check
Prior art date
Application number
PCT/US2015/025969
Other languages
French (fr)
Inventor
Anoop Kumar RAVEENDRAN
Santigopal MONDAL
Anand Andaneppa GANJIHAL
Sandya Srivilliputtur Mannarswamy
Original Assignee
Hewlett Packard Enterprise Development Lp
Priority date
Filing date
Publication date
Application filed by Hewlett Packard Enterprise Development Lp
Publication of WO2016130167A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots

Definitions

  • a file system is a means for organizing data on a storage device.
  • a file system may be used to control how data is stored and retrieved from a storage system. Since data is typically stored in the form of a file, a file system provides the basic structure for organizing files in a storage device. The file system keeps track of file locations (physical or virtual) on a storage medium.
  • FIG. 1 is a block diagram of an example computing device for performing a consistency check on namespace of an online file system
  • FIG. 2 is a block diagram of an example computing environment for performing a consistency check on namespace of an online file system
  • FIG. 3 is a flowchart of an example method of performing a consistency check on namespace of an online file system
  • FIG. 4 is a block diagram of an example system for performing a consistency check on namespace of an online file system.
  • a file system is an integral part of an operating system. It provides the underlying structure that a computing device uses to organize data on a storage medium.
  • a computer file or "file" is the basic component of a file system. Each piece of data on a storage device may be called a "file".
  • a file may contain data, such as text files, image files, video files, and the like, or it may be an executable file or program.
  • a file system consistency check may be required to be performed at regular intervals.
  • a file system hosting a large amount of data may need to be online and serve user requests while consistency checks are carried out.
  • an online consistency check performed on the namespace of a file system may take a few hours to multiple days depending on the size of the file system and number of file system objects.
  • the structure of a namespace tree may change at any time (for example, if a sub-tree is moved).
  • an inode table may be selected from an online file system. Further to selection of the inode table, a directory inode may be selected in the inode table, and a consistency check may be performed on a dentry of a directory that maps to the selected directory inode. In response to identifying an inconsistency with the dentry, consequent to the consistency check, the dentry may be recorded in a database.
  • FIG. 1 is a block diagram of an example computing device 100 for performing a consistency check on namespace of an online file system.
  • Computing device 100 may represent any type of computing system capable of reading machine-executable instructions. Examples of computing device 100 may include, without limitation, a server, a desktop computer, a notebook computer, a tablet computer, a thin client, a mobile device, a personal digital assistant (PDA), a phablet, and the like. In an example, computing device 100 may be a file storage system.
  • computing device 100 may be a data storage device or medium.
  • Computing device 100 may be a primary storage device such as, but not limited to, random access memory (RAM), read only memory (ROM), processor cache, or another type of dynamic storage device that may store information and machine-readable instructions that may be executed by a processor.
  • Computing device 100 may be a secondary storage device such as, but not limited to, a floppy disk, a hard disk, a CD-ROM, a DVD, a pen drive, a flash memory (e.g. USB flash drives or keys), a paper tape, an Iomega Zip drive, and the like.
  • Computing device 100 may be a tertiary storage device such as, but not limited to, a tape library, an optical jukebox, and the like.
  • storage device 102 may be a Direct Attached Storage (DAS) device, a Network Attached Storage (NAS) device, a tape drive, a magnetic tape drive, a data archival storage system, or a combination of these devices.
  • computing device 100 may include a file system 102, a directory module 104, a selection module 106, a file system check (fsck) module 108, and a database 110.
  • module may refer to a software component (machine readable instructions), a hardware component or a combination thereof.
  • a module may include, by way of example, components, such as software components, processes, tasks, coroutines, functions, attributes, procedures, drivers, firmware, data, databases, data structures, Application Specific Integrated Circuits (ASIC) and other computing devices.
  • a module may reside on a volatile or nonvolatile storage medium and be configured to interact with a processor of a computing device (e.g. 100).
  • file system 102 may be used for storage and retrieval of data from computing device 100.
  • Files in file system 102 may be organized by storing related files in a directory or sub-directory.
  • a directory or subdirectory is also a file.
  • the term "directory" (or "file directory"), as used herein, may include a file that contains references (for example, names) to other files.
  • a directory may be considered as a container for files.
  • an inode may be associated with each file of the file system 102.
  • An inode is a data structure, which is used to represent a file system object (for example, directory, file, etc.).
  • Each file or directory may be associated with an inode, which is identified by an integer number (i.e. an inode number).
  • An inode may store the attributes and disk block location(s) of a file system object's data.
  • an inode may store information about data blocks associated with a file or directory, or it may point to a data block map that points to the data blocks.
  • each inode in the file system has a unique number, and the file system may locate the contents of a file by its inode number.
  • An inode may also store information related to file ownership and file access permissions.
  • files in file system 102 may be organized by storing related files in a directory or sub-directory.
  • a directory may include one or more files.
  • each directory may be represented in the form of an inode table that maps directory and file names to inode numbers.
  • a directory may store filenames and their respective inode numbers (i.e. (filename, inode number)), in an on-disk page.
  • Each such page may be called a directory page.
  • A directory page, thus, is a container which may hold a finite set of filenames and their respective inode mappings, i.e. {filename, inode number}. Apart from {filename, inode number}, a directory page may hold other metadata relevant to an inode or a file.
  • the inodes of file system 102 may be placed in several tables (i.e. inode tables).
  • a file in the file system 102 may be represented by using multiple file names that may map to the same inode number. Any of these file names may be used to identify the inode number of the file. These file names may be called "links" (or pointers) to the file.
  • a "link count" in the inode may be used to track the number of directories that may contain a name-inode number mapping for that inode. If the link count in an inode is zero, it means that no directory points to the inode. If the link count in an inode is one, it means that the inode has only one name-inode number mapping. Likewise, if the inode has two name-inode number maps, its link count is considered to be two.
  • a parent directory in the file system 102 may be represented by ".." (dot dot), which maps to the inode of the parent directory.
  • a directory in the file system may be represented by "." (dot), which maps to the inode of the directory.
  • An empty directory in the file system 102 has a link count of two (i.e. 1+1): one link count for the parent directory (i.e. dot dot), and the other link count for the directory itself (i.e. dot).
  • File system 102 may be a local file system or a scale-out file system such as a shared file system or a network file system.
  • Examples of a shared file system may include a Storage Area Network (SAN) file system or a cluster file system.
  • Examples of a network file system may include a distributed file system or a distributed parallel file system.
  • Some non-limiting examples of file systems that may be used on storage device may include FAT (FAT12, FAT16, FAT32), NTFS, HFS and HFS+, HPFS, UFS, ext2, ext3, and ext4.
  • file system 102 is an online file system.
  • Directory module 104 may select an inode table from an online file system (for example, 102).
  • the directory module 104 may randomly select an inode table from the file system 102 for performing a consistency check on the namespace of the file system 102.
  • the directory module 104 may select each inode table of the file system 102 for carrying out a consistency check on the namespace of the file system 102.
  • Selection module 106 may select a directory inode from the selected inode table. In other words, once the directory module 104 selects an inode table from the file system 102, the selection module 106 may randomly select a directory inode from the selected inode table for performing a consistency check on the namespace of the file system 102. Likewise, the selection module 106 may select each directory inode of a selected inode table.
  • the file system 102 may include one or more directories.
  • Each file name-inode number mapping, i.e. {filename, inode number}, in a directory inode may be referred to as a directory entry or "dentry".
  • a file name in a directory entry may refer to name of a directory, a subdirectory, a file, or any other file object.
  • a mapping {. (dot): 2}, where "." (dot) is the file name of the parent directory and "2" is the inode number of the parent directory, may represent a dentry.
  • a mapping {home: 222}, where "home" is the file name of the home directory and "222" is the inode number of the home directory, may represent another dentry in the inode table.
  • a mapping {"rm": 444}, where "rm" is the file name of a file and "444" is the inode number of the file, may represent a dentry.
  • a dentry may be construed as a specific component in a file path.
  • in a file path such as "/abc/def", both "abc" and "def" may represent files, wherein "abc" is a directory file and "def" is an ordinary file.
  • "/", "abc", and "def" represent dentry objects.
  • all components in a file path represent dentry objects (or dentries).
  • File system check (fsck) module 108 may be used for checking and repairing file system inconsistencies on the computing device 100.
  • a file system (for example, 102) may become inconsistent, for example, due to power failure, a nonstandard shutdown, hardware failure, etc. This may cause inconsistencies and mismatched information relating to data blocks, free blocks, inodes, pointers, etc. in a file system.
  • Fsck module 108 may perform a consistency check on a dentry of a directory that maps to the directory inode selected by the selection module 106. In an example, such consistency check may involve determining whether the selected dentry includes an orphan inode i.e. an inode without any dentry pointing to it.
  • such determination may be made by verifying whether the link between the dentry and inode is valid. This may be carried out by verifying whether the back pointer from the inode is correctly pointing to the directory inode.
  • the fsck module may perform aforementioned consistency check for all dentries in the directory that maps to the selected directory inode.
  • a consistency check by fsck module 108 may involve determining whether the selected dentry is a dangling entry i.e. a directory entry that does not include a valid inode pointer, or includes a wrong pointer.
  • the fsck module may perform aforementioned consistency check for all dentries in the directory that maps to the selected directory inode. After the dentry-inode relation is verified, the fsck module may increment the link count maintained in-memory for the selected inode. At the end of checking all the inodes in the table, the fsck module 108 may build the in-memory link and apply the link counts on disk. After the inode table check is complete, if any inode having link count 0 is found, fsck module 108 may add the inode to the orphan inode list.
  • fsck module 108 may perform a consistency check related to a dentry along with a file metadata consistency check that may be carried out in parallel.
  • fsck module 108 may use a background thread, which is used for carrying out a file metadata consistency check, to perform a consistency check on a selected inode table.
  • the background thread may hand the work off to a directory checker thread after checking for the inode consistency.
  • the directory checker thread may come from a thread pool that may verify all the dentries in the inode tables and verify the link to the inodes.
  • the fsck module 108 may record such "corrupted" dentry in a database (for example, 110). Fsck module 108 may log a corrupted dentry along with its directory inode, to the database. In an instance, such database may be processed at the end of a consistency check, and dentries recorded therein may be corrected. For instance, if an orphan inode (i.e. an inode with a link count of zero) is identified in the database, fsck module 108 may add a new dentry to the directory page that points to a back pointer pointing to the directory inode of the orphan inode.
  • a corresponding inode with a link count of "0" (zero) may be identified in the database. If the parent directory inode number in the inode matches with the directory where an inconsistency is seen in its dentry, the dentry may be corrected to point to that inode and the link count on the inode may be corrected.
  • a connection between two directories in the file system 102 may be verified by setting a flag on the object state structure, which may be maintained to determine the consistency of the object, of an inode. Presence of such flag on a directory entry may indicate that a link is observed for this directory inode from a non-dot dot based dentry.
  • the directory may be added to an orphan inode list. Such directory may be corrected by adding a lost+found entry to the parent directory, which may be obtained through the back pointer found on the inode.
  • fsck module may act as a repository of all those dentries that may include an inconsistency.
  • the fsck module may also record the current directory of an inconsistent directory entry in the database 110.
  • fsck module 108 may correct the inconsistency with the directory entry that is recorded in the database 110.
  • FIG. 2 is a block diagram of an example computing environment 200 for performing a consistency check on namespace of an online file system.
  • Computing environment 200 may include nodes 202, 204, 206, 208, and 210, and a server 212.
  • a "node” may be a computing device (i.e. includes at least one processor), a storage device, a network device, or any combination thereof.
  • the number of nodes 202, 204, 206, 208 and 210, and server 212 shown in FIG. 2 is for the purpose of illustration only and their number may vary in other implementations.
  • computing environment 200 may represent a file storage system wherein nodes 202, 204, 206, 208, and 210 may serve as file storage nodes. In an instance, said file storage system may be a scale-out file system.
  • Nodes 202, 204, 206, 208, and 210 may each be a computing device such as a desktop computer, a notebook computer, a tablet computer, a mobile phone, personal digital assistant (PDA), a server, and the like.
  • Nodes 202, 204, 206, 208, and 210 may be a storage device.
  • the storage device may be a primary storage device such as, but not limited to, random access memory (RAM), read only memory (ROM), processor cache, or another type of dynamic storage device that may store information and machine-readable instructions that may be executed by a processor.
  • the storage device may be a secondary storage device such as, but not limited to, a floppy disk, a hard disk, a CD-ROM, a DVD, a pen drive, a flash memory (e.g. USB flash drives or keys), a paper tape, an Iomega Zip drive, and the like.
  • the storage device may be a tertiary storage device such as, but not limited to, a tape library, an optical jukebox, and the like.
  • the storage device may be a Direct Attached Storage (DAS) device, a Network Attached Storage (NAS) device, a tape drive, a magnetic tape drive, a data archival storage system, or a combination of these devices.
  • Nodes 202, 204, 206, 208, and 210 may communicate with each other and server 212, for example, via a computer network 224.
  • Computer network 224 may be a wireless or wired network.
  • Computer network 224 may include, for example, a Local Area Network (LAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Storage Area Network (SAN), a Campus Area Network (CAN), or the like.
  • computer network 224 may be a public network (for example, the Internet) or a private network (for example, an intranet).
  • server 212 may be analogous to computing device 100, in which like reference numerals correspond to the same or similar, though perhaps not identical, components.
  • components or reference numerals of FIG. 2 having a same or similarly described function in FIG. 1 are not being described in detail in connection with FIG. 2. Said components or reference numerals may be considered alike.
  • server 212 may include a file system 102, a directory module 104, a selection module 106, a file system check (fsck) module 108, and a database 110.
  • one or more nodes may include a portion of the file system 102.
  • file system 102 is an online file system.
  • directory module 104 may select an inode table in each of a plurality of nodes hosting an online file system (for example, 102). For instance, directory module 104 may select an inode table in each of a plurality of nodes, such as 202, 204, and 208. Upon such selection, the selection module may select a directory inode in each of the selected inode tables.
  • the file system check (fsck) module 108 may then perform, in parallel, a consistency check on a dentry in each directory that maps to one of the selected directory inodes. Further to the consistency check, if the fsck module 108 identifies an inconsistency with the dentry in any of the selected directories, the fsck module 108 may log the dentry in a database (for example, 110).
  • FIG. 3 is a flowchart of an example method 300 for performing a consistency check on namespace of an online file system.
  • the method 300 may at least partially be executed on a computing device 100 of FIG. 1 or server 212 of FIG. 2. However, other computing devices may be used as well.
  • an inode table may be selected (for example, by directory module 104), from an online file system (for example, 102).
  • a directory inode may be selected in the selected inode table (for example, by selection module 106).
  • a consistency check may be performed on a dentry (for example, by fsck module 108) of a directory that maps to the directory inode.
  • FIG. 4 is a block diagram of an example system 400 for performing a consistency check on namespace of an online file system.
  • System 400 includes a processor 402 and a machine-readable storage medium 404 communicatively coupled through a system bus.
  • system 400 may be analogous to computing device 100 of FIG. 1 or server 212 of FIG. 2.
  • Processor 402 may be any type of Central Processing Unit (CPU), microprocessor, or processing logic that interprets and executes machine-readable instructions stored in machine-readable storage medium 404.
  • Machine-readable storage medium 404 may be a random access memory (RAM) or another type of dynamic storage device that may store information and machine-readable instructions that may be executed by processor 402.
  • machine-readable storage medium 404 may be Synchronous DRAM (SDRAM), Double Data Rate (DDR), Rambus DRAM (RDRAM), Rambus RAM, etc. or a storage memory media such as a floppy disk, a hard disk, a CD-ROM, a DVD, a pen drive, and the like.
  • machine-readable storage medium 404 may be a non-transitory machine-readable medium.
  • Machine-readable storage medium 404 may store instructions 406, 408, and 410.
  • instructions 406 may be executed by processor 402 to select a directory inode in an inode table of an online file system. Instructions 408 may be executed by processor 402 to perform a consistency check on each directory entry of a directory that maps to the directory inode. Instructions 410 may be executed by processor 402 to record a directory entry in a database if, consequent to the consistency check, an inconsistency is identified with the directory entry.
  • FIG. 3 is shown as executing serially, however it is to be understood and appreciated that the present and other examples are not limited by the illustrated order.
  • the example systems of FIGS. 1, 2 and 4, and method of FIG. 3 may be implemented in the form of a computer program product including computer-executable instructions, such as program code, which may be run on any suitable computing device in conjunction with a suitable operating system (for example, Microsoft Windows, Linux, UNIX, and the like).
  • Embodiments within the scope of the present solution may also include program products comprising non-transitory computer-readable media for carrying or having computer-executable instructions or data structures stored thereon.
  • Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer.
  • Such computer-readable media can comprise RAM, ROM, EPROM, EEPROM, CD-ROM, magnetic disk storage or other storage devices, or any other medium which can be used to carry or store desired program code in the form of computer-executable instructions and which can be accessed by a general purpose or special purpose computer.
  • the computer readable instructions can also be accessed from memory and executed by a processor.
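The dentry checks and link-count bookkeeping described for fsck module 108 above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the patented implementation: the dictionary layout, the single `parent` back pointer per inode, the root inode number `ROOT`, and the function name `check_inode_table` are all hypothetical. Each dentry is checked for a dangling target and for a mismatched back pointer; verified links are counted in memory, and inodes left with a count of zero are reported as orphans.

```python
from collections import defaultdict

ROOT = 2  # assumed inode number of the root directory


def check_inode_table(inodes):
    """Check every dentry of every directory inode in one inode table.

    `inodes` maps inode number -> {"is_dir", "parent", "entries"}, where
    "parent" is the back pointer to the directory inode (an assumed layout).
    Returns (corrupted dentries, orphan inode numbers).
    """
    link_counts = defaultdict(int)  # in-memory link counts
    corrupted = []                  # stands in for the database of bad dentries
    for dir_ino, inode in inodes.items():
        if not inode["is_dir"]:
            continue
        for name, target in inode["entries"].items():
            child = inodes.get(target)
            if child is None:
                # dangling entry: the dentry has no valid inode behind it
                corrupted.append((dir_ino, name, "dangling"))
            elif child["parent"] != dir_ino:
                # back pointer does not lead to this directory inode
                corrupted.append((dir_ino, name, "bad back pointer"))
            else:
                link_counts[target] += 1  # link verified
    # inodes with no verified link are treated as orphans
    orphans = sorted(n for n in inodes if n != ROOT and link_counts[n] == 0)
    return corrupted, orphans


inodes = {
    2:   {"is_dir": True,  "parent": 2,   "entries": {"home": 222, "rm": 444, "gone": 999}},
    222: {"is_dir": True,  "parent": 2,   "entries": {"notes": 333}},
    333: {"is_dir": False, "parent": 2,   "entries": {}},  # wrong back pointer
    444: {"is_dir": False, "parent": 2,   "entries": {}},
    555: {"is_dir": False, "parent": 222, "entries": {}},  # no dentry points here
}
corrupted, orphans = check_inode_table(inodes)
```

In this toy table, inode 333 is never counted because its link could not be verified, so it appears both in the corrupted list (through its dentry) and in the orphan list. Note that the loop visits inodes in table order and never walks the namespace tree.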

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Some examples described herein relate to performing a consistency check on namespace of an online file system. In an example, an inode table may be selected from an online file system. A directory inode may be selected in the inode table, and a consistency check may be performed on a dentry of a directory that maps to the selected directory inode. In response to identifying an inconsistency with the dentry, consequent to the consistency check, the dentry may be recorded in a database.

Description

CONSISTENCY CHECK ON NAMESPACE OF AN ONLINE FILE SYSTEM
Background
[001] A file system is a means for organizing data on a storage device. In other words, a file system may be used to control how data is stored and retrieved from a storage system. Since data is typically stored in the form of a file, a file system provides the basic structure for organizing files in a storage device. The file system keeps track of file locations (physical or virtual) on a storage medium.
Brief Description of the Drawings
[002] For a better understanding of the solution, embodiments will now be described, purely by way of example, with reference to the accompanying drawings, in which:
[003] FIG. 1 is a block diagram of an example computing device for performing a consistency check on namespace of an online file system;
[004] FIG. 2 is a block diagram of an example computing environment for performing a consistency check on namespace of an online file system;
[005] FIG. 3 is a flowchart of an example method of performing a consistency check on namespace of an online file system; and
[006] FIG. 4 is a block diagram of an example system for performing a consistency check on namespace of an online file system.
Detailed Description
[007] A file system is an integral part of an operating system. It provides the underlying structure that a computing device uses to organize data on a storage medium. A computer file or "file" is the basic component of a file system. Each piece of data on a storage device may be called a "file". A file may contain data, such as text files, image files, video files, and the like, or it may be an executable file or program.
[008] Since a large amount of data generated these days is stored in computer files, a modern file system should be able to expand to support a namespace that may grow to millions or billions of files. To ensure consistent data, a file system consistency check may be required to be performed at regular intervals. However, in many instances, a file system hosting a large amount of data may need to be online and serve user requests while consistency checks are carried out. In such case, an online consistency check performed on the namespace of a file system may take a few hours to multiple days depending on the size of the file system and number of file system objects. Further, during an online consistency check, the structure of a namespace tree may change at any time (for example, if a sub-tree is moved). Further still, in case any disconnect is identified in a namespace tree, it may become difficult to correct a sub-tree unless the disconnection is corrected. This may make a file system consistency check a stop-resume-stop model leading to an exponential increase in the time taken to complete the consistency checking since in order to fix any disconnect in the namespace, the entire inode table may need to be walked. In such scenarios, it may be challenging to verify the namespace of a file system. In order to ensure that a namespace consistency check is not affected by changes occurring in the structure of the namespace, it may be desirable to have a mechanism of checking the namespace consistency without walking the namespace tree of a file system.
[009] The present disclosure describes various examples for performing a consistency check on namespace of an online file system. In an example, an inode table may be selected from an online file system. Further to selection of the inode table, a directory inode may be selected in the inode table, and a consistency check may be performed on a dentry of a directory that maps to the selected directory inode. In response to identifying an inconsistency with the dentry, consequent to the consistency check, the dentry may be recorded in a database.
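The flow outlined in paragraph [009] (select an inode table, select a directory inode, check its dentries, record inconsistent ones) might be sketched as below. This is only an illustration in Python: the table layout, the function name `check_namespace`, and the pluggable `check_dentry` predicate are assumptions introduced here, and a plain list stands in for the database.

```python
def check_namespace(inode_tables, check_dentry):
    """Visit inode tables directly instead of walking the namespace tree.

    Each table maps inode number -> {"is_dir": bool, "entries": {name: inode number}}
    (a hypothetical layout). `check_dentry` returns True for a consistent dentry.
    """
    database = []  # stands in for the database of inconsistent dentries
    for table in inode_tables:                  # select an inode table
        for dir_ino, inode in table.items():    # select a directory inode
            if not inode["is_dir"]:
                continue
            for name, target in inode["entries"].items():
                if not check_dentry(dir_ino, name, target):
                    database.append((dir_ino, name, target))
    return database


tables = [
    {2: {"is_dir": True, "entries": {"home": 222, "rm": 444}},
     444: {"is_dir": False, "entries": {}}},
    {222: {"is_dir": True, "entries": {"ghost": 999}}},
]
known = {ino for table in tables for ino in table}
# Example predicate: a dentry is consistent if its inode number exists somewhere.
bad = check_namespace(tables, lambda d, name, target: target in known)
```

Because each directory inode is processed on its own, a sub-tree being moved while the check runs does not invalidate the traversal order, which is the motivation given in paragraph [008].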
[0010] FIG. 1 is a block diagram of an example computing device 100 for performing a consistency check on namespace of an online file system. Computing device 100 may represent any type of computing system capable of reading machine-executable instructions. Examples of computing device 100 may include, without limitation, a server, a desktop computer, a notebook computer, a tablet computer, a thin client, a mobile device, a personal digital assistant (PDA), a phablet, and the like. In an example, computing device 100 may be a file storage system.
[0011] In an example, computing device 100 may be a data storage device or medium. Computing device 100 may be a primary storage device such as, but not limited to, random access memory (RAM), read only memory (ROM), processor cache, or another type of dynamic storage device that may store information and machine-readable instructions that may be executed by a processor. For example, Synchronous DRAM (SDRAM), Double Data Rate (DDR), Rambus DRAM (RDRAM), Rambus RAM, etc. Computing device 100 may be a secondary storage device such as, but not limited to, a floppy disk, a hard disk, a CD-ROM, a DVD, a pen drive, a flash memory (e.g. USB flash drives or keys), a paper tape, an Iomega Zip drive, and the like. Computing device 100 may be a tertiary storage device such as, but not limited to, a tape library, an optical jukebox, and the like. In another example, storage device 102 may be a Direct Attached Storage (DAS) device, a Network Attached Storage (NAS) device, a tape drive, a magnetic tape drive, a data archival storage system, or a combination of these devices.
[0012] In the example of FIG. 1, computing device 100 may include a file system 102, a directory module 104, a selection module 106, a file system check (fsck) module 108, and a database 110. The term "module" may refer to a software component (machine readable instructions), a hardware component or a combination thereof. A module may include, by way of example, components, such as software components, processes, tasks, coroutines, functions, attributes, procedures, drivers, firmware, data, databases, data structures, Application Specific Integrated Circuits (ASIC) and other computing devices. A module may reside on a volatile or nonvolatile storage medium and be configured to interact with a processor of a computing device (e.g. 100).
[0013] In general, file system 102 may be used for storage and retrieval of data from computing device 100. Files in file system 102 may be organized by storing related files in a directory or sub-directory. A directory or sub-directory is also a file. The term "directory" (or "file directory"), as used herein, may include a file that contains references (for example, names) to other files. Thus, a directory may be considered as a container for files.
[0014] In an example, an inode may be associated with each file of the file system 102. An inode is a data structure, which is used to represent a file system object (for example, directory, file, etc.). Each file or directory may be associated with an inode, which is identified by an integer number (i.e. an inode number). An inode may store the attributes and disk block location(s) of a file system object's data. In other words, an inode may store information about data blocks associated with a file or directory, or it may point to a data block map that points to the data blocks. Thus, each inode in the file system is identified by a unique number, and the file system may locate the contents of a file by its inode number. An inode may also store information related to file ownership and file access permissions.
[0015] As mentioned earlier, files in file system 102 may be organized by storing related files in a directory or sub-directory. Thus, a directory may include one or more files. In an example, each directory may be represented in the form of an inode table that maps directory and file names to inode numbers. In other words, a directory may store filenames and their respective inode numbers (i.e. {filename, inode number}) in an on-disk page. Each such page may be called a directory page. A directory page, thus, is a container which may hold a finite set of filename-to-inode mappings, i.e. {filename, inode number}. Apart from {filename, inode number}, a directory page may hold other metadata relevant to an inode or a file. The inodes of file system 102 may be placed in several tables (i.e. inode tables).
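By way of illustration only, the inode and directory page described above may be pictured with the following minimal Python sketch. The class names `Inode` and `DirectoryPage`, their fields, and the `capacity` limit are assumptions introduced for this example and are not taken from file system 102.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Inode:
    """Hypothetical in-memory view of an inode (field names are assumed)."""
    number: int                                       # inode number
    is_dir: bool = False                              # directory or ordinary file
    owner: str = "root"                               # file ownership
    mode: int = 0o644                                 # access permissions
    blocks: List[int] = field(default_factory=list)   # disk block locations
    link_count: int = 0                               # number of name mappings


@dataclass
class DirectoryPage:
    """Hypothetical directory page: a finite set of {filename, inode number}."""
    capacity: int = 4
    entries: Dict[str, int] = field(default_factory=dict)

    def add(self, filename: str, ino: int) -> bool:
        """Store a mapping; return False when the page has no room left."""
        if filename not in self.entries and len(self.entries) >= self.capacity:
            return False
        self.entries[filename] = ino
        return True

    def lookup(self, filename: str) -> Optional[int]:
        """Return the inode number for a filename, or None if absent."""
        return self.entries.get(filename)
```

In this sketch a full page simply rejects further entries; a real directory would presumably allocate another page, which is outside the scope of the example.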
[0016] In an example, a file in the file system 102 may be represented by using multiple file names that may map to the same inode number. Any of these file names may be used to identify the inode number of the file. These file names may be called "links" (or pointers) to the file. In an instance, for each inode in the file system 102, a "link count" in the inode may be used to track the number of directories that may contain a name-inode number mapping for that inode. If the link count in an inode is zero, it means that no directory points to the inode. If the link count in an inode is one, it means that the inode has only one name-inode number mapping. Likewise, if the inode has two name-inode number mappings, its link count is considered to be two.
[0017] In an example, a parent directory in the file system 102 may be represented by ".." (dot dot), which maps to the inode of the parent directory. Likewise, a directory in the file system may be represented by "." (dot), which maps to the inode of the directory. An empty directory in the file system 102 has a link count of two (i.e. 1+1): one link for the parent directory (i.e. dot dot), and the other for the directory itself (i.e. dot). Each subdirectory that is added to a directory increases the link count of the directory by one. For example, if a directory includes 4 subdirectories, its link count would be 6 (i.e. 2+4=6).
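The link-count rules of the two preceding paragraphs may be checked with a short Python sketch. The function names `directory_link_count` and `file_link_counts` are hypothetical, and the nested-dictionary layout used for directories is an assumption made only for this example.

```python
from collections import Counter
from typing import Dict


def directory_link_count(num_subdirectories: int) -> int:
    """Link count of a directory: two for an empty one, plus one per subdirectory."""
    if num_subdirectories < 0:
        raise ValueError("number of subdirectories cannot be negative")
    return 2 + num_subdirectories


def file_link_counts(directories: Dict[str, Dict[str, int]]) -> Counter:
    """Count, per inode number, how many name-inode number mappings refer to it.

    `directories` maps a directory name to its {filename: inode number} entries.
    """
    counts: Counter = Counter()
    for entries in directories.values():
        for ino in entries.values():
            counts[ino] += 1
    return counts
```

For instance, `directory_link_count(4)` evaluates to 6, matching the 2+4 example in the text.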
[0018] File system 102 may be a local file system or a scale-out file system such as a shared file system or a network file system. Examples of a shared file system may include a Storage Area Network (SAN) file system or a cluster file system. Examples of a network file system may include a distributed file system or a distributed parallel file system. Some non-limiting examples of file systems that may be used on storage device (example, 102) may include FAT (FAT12, FAT16, FAT32), NTFS, HFS and HFS+, HPFS, UFS, ext2, ext3, and ext4. In an example, file system 102 is an online file system.
[0019] Directory module 104 may select an inode table from an online file system (for example, 102). As mentioned earlier, a file system (for example, 102) may include one or more inode tables. In an example, the directory module 104 may randomly select an inode table from the file system 102 for performing a consistency check on the namespace of the file system 102. Likewise, the directory module 104 may select each inode table of the file system 102 for carrying out a consistency check on the namespace of the file system 102.
[0020] Selection module 106 may select a directory inode from the selected inode table. In other words, once the directory module 104 selects an inode table from the file system 102, the selection module 106 may randomly select a directory inode from the selected inode table for performing a consistency check on the namespace of the file system 102. Likewise, the selection module 106 may select each directory inode of a selected inode table.
[0021] As mentioned earlier, the file system 102 may include one or more directories. Each file name-inode number mapping (i.e. {filename, inode number}) in a directory inode may be referred to as a directory entry or "dentry". A file name in a directory entry may refer to the name of a directory, a subdirectory, a file, or any other file object. For example, a mapping {".": 2}, where "." (dot) is the file name of the directory itself and "2" is the inode number of that directory, may represent a dentry. In another example, a mapping {"home": 222}, where "home" is the file name of the home directory and "222" is the inode number of the home directory, may represent another dentry in the inode table. In a further example, a mapping {"rm": 444}, where "rm" is the file name of a file and "444" is the inode number of the file, may represent a dentry.
[0022] In an instance, a dentry may be construed as a specific component in a file path. For instance, in a path "/abc/def", both "abc" and "def" may represent files, wherein "abc" is a directory file and "def" is an ordinary file. In this case, "/", "abc", and "def" represent dentry objects. In other words, all components in a file path represent dentry objects (or dentries).
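As a minimal sketch of this view of a path, the following Python code splits an absolute path into its dentry components and walks them through name-to-inode mappings. The helper names `path_components` and `resolve`, and the dictionary-based directory layout, are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def path_components(path: str) -> List[str]:
    """Return the dentry components of an absolute path, root first."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    return ["/"] + [part for part in path.split("/") if part]


def resolve(path: str, root_ino: int,
            directories: Dict[int, Dict[str, int]]) -> Optional[int]:
    """Follow each dentry in turn; return the final inode number or None.

    `directories` maps a directory inode number to its {filename: inode number}.
    """
    ino = root_ino
    for name in path_components(path)[1:]:
        entries = directories.get(ino)
        if entries is None or name not in entries:
            return None  # a component is missing or is not a directory
        ino = entries[name]
    return ino
```

With directories `{2: {"abc": 10}, 10: {"def": 11}}` and root inode 2, resolving "/abc/def" visits the dentries "/", "abc", and "def" in that order.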
[0023] File system check (fsck) module 108 may be used for checking and repairing file system inconsistencies on the computing device 100. A file system (for example, 102) may become inconsistent, for example, due to power failure, a nonstandard shutdown, hardware failure, etc. This may cause inconsistencies and mismatched information relating to data blocks, free blocks, inodes, pointers, etc. in a file system. Fsck module 108 may perform a consistency check on a dentry of a directory that maps to the directory inode selected by the selection module 106. In an example, such consistency check may involve determining whether the selected dentry includes an orphan inode, i.e. an inode without any dentry pointing to it. In an instance, such determination may be made by verifying whether the link between the dentry and inode is valid. This may be carried out by verifying whether the back pointer from the inode is correctly pointing to the directory inode. In an instance, the fsck module 108 may perform the aforementioned consistency check for all dentries in the directory that maps to the selected directory inode.
[0024] In another example, a consistency check by fsck module 108 may involve determining whether the selected dentry is a dangling entry, i.e. a directory entry that does not include a valid inode pointer, or includes a wrong pointer. In an instance, the fsck module 108 may perform the aforementioned consistency check for all dentries in the directory that maps to the selected directory inode. After the dentry-inode relation is verified, the fsck module 108 may increment the link count maintained in-memory for the selected inode. At the end of checking all the inodes in the table, the fsck module 108 may build the in-memory link counts and apply them on disk. After the inode table check is complete, if any inode having a link count of 0 is found, fsck module 108 may add the inode to the orphan inode list.
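The dentry checks described in the two preceding paragraphs may be sketched as follows. This is an illustrative Python outline under stated assumptions, not the fsck module's actual logic: each inode is assumed to carry a single back pointer (`parent`), the reason labels "dangling" and "bad-back-pointer" are invented for the example, and hard links to one file from several directories are not modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Inode:
    number: int
    parent: int           # assumed back pointer: inode number of the directory
    is_dir: bool = False


def check_directory(dir_ino: int, entries: Dict[str, int],
                    inode_table: Dict[int, Inode],
                    link_counts: Dict[int, int]) -> List[Tuple[int, str, int, str]]:
    """Check every dentry of one directory and update in-memory link counts.

    Returns (directory inode, name, target inode number, reason) per problem.
    """
    problems = []
    for name, target in entries.items():
        if name in (".", ".."):
            continue                       # dot entries are skipped in this sketch
        inode = inode_table.get(target)
        if inode is None:
            problems.append((dir_ino, name, target, "dangling"))
        elif inode.parent != dir_ino:
            problems.append((dir_ino, name, target, "bad-back-pointer"))
        else:
            # dentry-inode relation verified: bump the in-memory link count
            link_counts[target] = link_counts.get(target, 0) + 1
    return problems


def find_orphans(inode_table: Dict[int, Inode], link_counts: Dict[int, int],
                 root_ino: int) -> List[int]:
    """After the table is checked, inodes with a link count of 0 are orphans."""
    return sorted(n for n in inode_table
                  if n != root_ino and link_counts.get(n, 0) == 0)
```

The root directory is excluded from the orphan list here because, in this simplified model, no ordinary dentry points to it.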
[0025] In an example, fsck module 108 may perform a consistency check related to a dentry along with a file metadata consistency check that may be carried out in parallel. In an instance, fsck module 108 may use a background thread, which is used for carrying out a file metadata consistency check, to perform a consistency check on a selected inode table. As part of the metadata consistency check, if an inode of type directory is encountered, the background thread may, after checking the inode for consistency, hand the directory off to a directory checker thread. The directory checker thread may come from a thread pool that may verify all the dentries in the inode tables and verify their links to the inodes.
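One way to picture the hand-off from the metadata scan to directory checker threads is the Python sketch below, which uses the standard `concurrent.futures.ThreadPoolExecutor`. The dictionary layout of an inode, the trivial metadata test, and the function names are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Set, Tuple


def check_inode_metadata(inode: dict) -> bool:
    """Stand-in metadata check (assumed): a size must not be negative."""
    return inode["size"] >= 0


def check_dentries(inode: dict, known: Set[int]) -> List[str]:
    """Directory checker task: names whose target inode number is unknown."""
    return [name for name, target in inode["entries"].items()
            if target not in known]


def scan_inode_table(inode_table: Dict[int, dict],
                     workers: int = 4) -> Tuple[List[int], Dict[int, List[str]]]:
    """Scan metadata in the calling thread; offload directories to a pool."""
    known = set(inode_table)
    bad_metadata: List[int] = []
    futures = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for ino, inode in inode_table.items():
            if not check_inode_metadata(inode):
                bad_metadata.append(ino)
            if inode["type"] == "dir":
                # hand the directory off so the scan can continue
                futures[ino] = pool.submit(check_dentries, inode, known)
    # leaving the with-block waits for all submitted tasks to finish
    bad_dentries = {ino: f.result() for ino, f in futures.items() if f.result()}
    return bad_metadata, bad_dentries
```

The scan thread never waits on an individual directory; results are gathered only after the whole table has been walked.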
[0026] In an example, further to a consistency check, if the fsck module 108 identifies an inconsistency with the dentry, it may record such "corrupted" dentry in a database (for example, 110). Fsck module 108 may log a corrupted dentry along with its directory inode, to the database. In an instance, such database may be processed at the end of a consistency check, and dentries recorded therein may be corrected. For instance, if an orphan inode (i.e. an inode with a link count of zero) is identified in the database, fsck module 108 may add a new dentry to the directory page that points to a back pointer pointing to the directory inode of the orphan inode. In another instance, for a corrupted dentry in the database, a corresponding inode with a link count of "0" (zero) may be identified in the database. If the parent directory inode number in the inode matches the directory where an inconsistency is seen in its dentry, the dentry may be corrected to point to that inode and the link count on the inode may be corrected.
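The second repair case above, in which a corrupted dentry is matched with a zero-link-count inode whose parent directory inode number agrees, may be sketched in Python as follows. The function name `repair_from_log` and the data layout are hypothetical, and picking the first matching orphan is a simplification introduced for this example; the text does not say how ties are resolved.

```python
from typing import Dict, List, Tuple


def repair_from_log(corrupted: List[Tuple[int, str]],
                    orphans: Dict[int, int],
                    directories: Dict[int, Dict[str, int]],
                    link_counts: Dict[int, int]) -> List[Tuple[int, str]]:
    """Re-point logged dentries at orphan inodes with a matching parent.

    corrupted:   (directory inode number, name) pairs read from the log
    orphans:     inode number -> parent directory inode number (back pointer)
    directories: directory inode number -> {name: inode number}, updated in place
    link_counts: inode number -> link count, updated in place
    Returns the logged dentries for which no matching orphan was found.
    """
    unresolved = []
    remaining = dict(orphans)          # do not consume the caller's mapping
    for dir_ino, name in corrupted:
        match = next((ino for ino, parent in remaining.items()
                      if parent == dir_ino), None)
        if match is None:
            unresolved.append((dir_ino, name))
            continue
        directories[dir_ino][name] = match                  # correct the dentry
        link_counts[match] = link_counts.get(match, 0) + 1  # correct the count
        del remaining[match]           # each orphan is adopted at most once
    return unresolved
```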
[0027] In an example, a connection between two directories in the file system 102 may be verified by setting a flag on the object state structure, which may be maintained to determine the consistency of the object, of an inode. Presence of such flag on a directory entry may indicate that a link is observed for this directory inode from a non-dot dot based dentry. In case the flag is not set for a directory inode, the directory may be added to an orphan inode list. Such directory may be corrected by adding a lost+found entry to the parent directory, which may be obtained through the back pointer found on the inode.
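The flag-based connectivity check may be illustrated with the Python sketch below. It is an outline under assumptions: the per-inode object state is reduced to a single boolean, "." entries are ignored alongside ".." entries, and the function name `find_disconnected_directories` is invented for the example.

```python
from typing import Dict, List


def find_disconnected_directories(directories: Dict[int, Dict[str, int]],
                                  root_ino: int) -> List[int]:
    """Return directory inodes that no ordinary dentry links to.

    `directories` maps a directory inode number to its {name: inode number}.
    """
    linked = {ino: False for ino in directories}   # one flag per directory inode
    for entries in directories.values():
        for name, target in entries.items():
            if name in (".", ".."):
                continue                            # dot entries do not set the flag
            if target in linked:
                linked[target] = True               # a link to this directory was seen
    # directories whose flag was never set are candidates for the orphan list
    return sorted(ino for ino, flag in linked.items()
                  if not flag and ino != root_ino)
```

A directory returned by this function still has a ".." entry naming its parent, which is the information the text relies on when adding a lost+found entry.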
[0028] In an example, further to a consistency check carried out on a dentry of a directory that maps to the selected directory inode, if an inconsistency is identified with the dentry, such dentry may be recorded in database 110, for instance, by fsck module 108. The database 110, therefore, may act as a repository of all those dentries that may include an inconsistency. The fsck module 108 may also record the current directory of an inconsistent directory entry in the database 110. In an instance, fsck module 108 may correct the inconsistency with the directory entry that is recorded in the database 110.
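A repository of inconsistent dentries of this kind could, for example, be kept in a small relational table. The Python sketch below uses the standard `sqlite3` module with an in-memory database; the table name `corrupted_dentry`, its columns, and the choice of SQLite are assumptions made for illustration, since the text does not specify how database 110 is implemented.

```python
import sqlite3
from typing import List, Optional, Tuple


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the hypothetical log of inconsistent dentries."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS corrupted_dentry ("
        " dir_ino INTEGER NOT NULL,"      # directory inode of the dentry
        " name TEXT NOT NULL,"            # file name in the dentry
        " target_ino INTEGER,"            # inode number the dentry points to
        " reason TEXT NOT NULL,"
        " PRIMARY KEY (dir_ino, name))"
    )
    return db


def record_dentry(db: sqlite3.Connection, dir_ino: int, name: str,
                  target_ino: Optional[int], reason: str) -> None:
    """Record one dentry; logging the same dentry twice keeps a single row."""
    db.execute("INSERT OR REPLACE INTO corrupted_dentry VALUES (?, ?, ?, ?)",
               (dir_ino, name, target_ino, reason))
    db.commit()


def pending(db: sqlite3.Connection) -> List[Tuple]:
    """Return all recorded dentries, ordered for later correction."""
    return db.execute(
        "SELECT dir_ino, name, target_ino, reason FROM corrupted_dentry"
        " ORDER BY dir_ino, name"
    ).fetchall()
```

Keying the table on the directory inode and name means a dentry found inconsistent by more than one check appears only once when the log is processed.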
[0029] FIG. 2 is a block diagram of an example computing environment 200 for performing a consistency check on namespace of an online file system. Computing environment 200 may include nodes 202, 204, 206, 208, and 210, and a server 212. As used herein, a "node" may be a computing device (i.e. includes at least one processor), a storage device, a network device, or any combination thereof. The number of nodes 202, 204, 206, 208 and 210, and server 212 shown in FIG. 2 is for the purpose of illustration only and their number may vary in other implementations. In some examples, computing environment 200 may represent a file storage system wherein nodes 202, 204, 206, 208, and 210 may serve as file storage nodes. In an instance, said file storage system may be a scale-out file system.
[0030] Nodes 202, 204, 206, 208, and 210 may each be a computing device such as a desktop computer, a notebook computer, a tablet computer, a mobile phone, personal digital assistant (PDA), a server, and the like. Nodes 202, 204, 206, 208, and 210 may each be a storage device. In an example, the storage device may be a primary storage device such as, but not limited to, random access memory (RAM), read only memory (ROM), processor cache, or another type of dynamic storage device that may store information and machine-readable instructions that may be executed by a processor, for example, Synchronous DRAM (SDRAM), Double Data Rate (DDR), Rambus DRAM (RDRAM), Rambus RAM, etc. In another example, the storage device may be a secondary storage device such as, but not limited to, a floppy disk, a hard disk, a CD-ROM, a DVD, a pen drive, a flash memory (e.g. USB flash drives or keys), a paper tape, an Iomega Zip drive, and the like. In another example, the storage device may be a tertiary storage device such as, but not limited to, a tape library, an optical jukebox, and the like. In another example, the storage device may be a Direct Attached Storage (DAS) device, a Network Attached Storage (NAS) device, a tape drive, a magnetic tape drive, a data archival storage system, or a combination of these devices.
[0031] Nodes 202, 204, 206, 208, and 210 may communicate with each other and server 212, for example, via a computer network 224. Computer network 224 may be a wireless or wired network. Computer network 224 may include, for example, a Local Area Network (LAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Storage Area Network (SAN), a Campus Area Network (CAN), or the like. Further, computer network 224 may be a public network (for example, the Internet) or a private network (for example, an intranet).
[0032] In an example, server 212 may be analogous to computing device 100, in which like reference numerals correspond to the same or similar, though perhaps not identical, components. For the sake of brevity, components or reference numerals of FIG. 2 having a same or similarly described function in FIG. 1 are not being described in detail in connection with FIG. 2. Said components or reference numerals may be considered alike.
[0033] In an example, server 212 may include a file system 102, a directory module 104, a selection module 106, a file system check (fsck) module 108, and a database 110. In an example, one or more nodes (202, 204, 206, 208, and/or 210) may include a portion of the file system 102. In an instance, file system 102 is an online file system.
[0034] In an example, directory module 104 may select an inode table in each of a plurality of nodes hosting an online file system (for example, 102). For instance, directory module 104 may select an inode table in each of a plurality of nodes, such as 202, 204, and 208. Upon such selection, the selection module 106 may select a directory inode in each of the selected inode tables. The file system check (fsck) module 108 may then perform, in parallel, a consistency check on a dentry in each directory that maps to one of the selected directory inodes. Further to the consistency check, if the fsck module 108 identifies an inconsistency with the dentry in any of the selected directories, the fsck module 108 may log the dentry in a database (for example, 110).
[0035] FIG. 3 is a flowchart of an example method 300 for performing a consistency check on namespace of an online file system. The method 300, which is described below, may at least partially be executed on a computing device 100 of FIG. 1 or server 212 of FIG. 2. However, other computing devices may be used as well. At block 302, an inode table may be selected (for example, by directory module 104), from an online file system (for example, 102). At block 304, a directory inode may be selected in the selected inode table (for example, by selection module 106). At block 306, a consistency check may be performed on a dentry (for example, by fsck module 108) of a directory that maps to the directory inode. At block 308, consequent to the consistency check by fsck module 108, if an inconsistency is identified with the dentry, fsck module 108 may record the dentry in a database (for example, 110).
[0036] FIG. 4 is a block diagram of an example system 400 for performing a consistency check on namespace of an online file system. System 400 includes a processor 402 and a machine-readable storage medium 404 communicatively coupled through a system bus. In an example, system 400 may be analogous to computing device 100 of FIG. 1 or server 212 of FIG. 2. Processor 402 may be any type of Central Processing Unit (CPU), microprocessor, or processing logic that interprets and executes machine-readable instructions stored in machine-readable storage medium 404. Machine-readable storage medium 404 may be a random access memory (RAM) or another type of dynamic storage device that may store information and machine-readable instructions that may be executed by processor 402. For example, machine-readable storage medium 404 may be Synchronous DRAM (SDRAM), Double Data Rate (DDR), Rambus DRAM (RDRAM), Rambus RAM, etc. or a storage memory media such as a floppy disk, a hard disk, a CD-ROM, a DVD, a pen drive, and the like. In an example, machine-readable storage medium 404 may be a non-transitory machine-readable medium. Machine-readable storage medium 404 may store instructions 406, 408, and 410. In an example, instructions 406 may be executed by processor 402 to select a directory inode in an inode table of an online file system. Instructions 408 may be executed by processor 402 to perform a consistency check on each directory entry of a directory that maps to the directory inode. Instructions 410 may be executed by processor 402 to record a directory entry in a database if, consequent to the consistency check, an inconsistency is identified with the directory entry.
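Blocks 302 to 308 of method 300 may be traced end to end with the following minimal Python sketch. Everything named here is an assumption for illustration: inode tables are plain dictionaries, a list stands in for the database, the only inconsistency detected is a dentry whose inode number is unknown, and selection is random as in the example of directory module 104 and selection module 106.

```python
import random
from typing import Dict, List, Optional, Tuple


def run_check(inode_tables: List[Dict[int, str]],
              directories: Dict[int, Dict[str, int]],
              rng: Optional[random.Random] = None
              ) -> Tuple[Optional[int], List[Tuple[int, str, int]]]:
    """Select a table and a directory inode, check its dentries, log problems.

    inode_tables: list of {inode number: "dir" or "file"}
    directories:  directory inode number -> {name: inode number}
    Returns the chosen directory inode and the logged (dir, name, target) rows.
    """
    rng = rng or random.Random()
    all_inodes = set().union(*inode_tables)        # every known inode number

    table = rng.choice(inode_tables)               # block 302: select inode table
    dir_inodes = [ino for ino, kind in table.items() if kind == "dir"]
    if not dir_inodes:
        return None, []
    dir_ino = rng.choice(dir_inodes)               # block 304: select directory inode

    log = []                                       # stand-in for the database
    for name, target in directories.get(dir_ino, {}).items():   # block 306
        if target not in all_inodes:
            log.append((dir_ino, name, target))    # block 308: record the dentry
    return dir_ino, log
```

Repeating the call until every table and directory inode has been visited would correspond to the alternative, described earlier, of selecting each inode table and each directory inode in turn.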
[0037] For the purpose of simplicity of explanation, the example method of FIG. 3 is shown as executing serially, however it is to be understood and appreciated that the present and other examples are not limited by the illustrated order. The example systems of FIGS. 1, 2 and 4, and method of FIG. 3 may be implemented in the form of a computer program product including computer-executable instructions, such as program code, which may be run on any suitable computing device in conjunction with a suitable operating system (for example, Microsoft Windows, Linux, UNIX, and the like). Embodiments within the scope of the present solution may also include program products comprising non-transitory computer-readable media for carrying or having computer-executable instructions or data structures stored thereon. Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer. By way of example, such computer-readable media can comprise RAM, ROM, EPROM, EEPROM, CD-ROM, magnetic disk storage or other storage devices, or any other medium which can be used to carry or store desired program code in the form of computer-executable instructions and which can be accessed by a general purpose or special purpose computer. The computer readable instructions can also be accessed from memory and executed by a processor.
[0038] It may be noted that the above-described examples of the present solution are for the purpose of illustration only. Although the solution has been described in conjunction with a specific embodiment thereof, numerous modifications may be possible without materially departing from the teachings and advantages of the subject matter described herein. Other substitutions, modifications and changes may be made without departing from the spirit of the present solution. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.

Claims

1. A method of performing a consistency check on namespace of an online file system, comprising:
selecting an inode table from an online file system;
selecting a directory inode in the inode table;
performing a consistency check on a dentry of a directory that maps to the selected directory inode; and
in response to identifying an inconsistency with the dentry, consequent to the consistency check, recording the dentry in a database.
2. The method of claim 1, wherein the performing comprises verifying whether a link between the dentry and an inode is valid.
3. The method of claim 2, wherein the verifying comprises verifying whether a back pointer from the inode is correctly pointing to directory inode of the dentry.
4. The method of claim 1, wherein the performing comprises determining whether the dentry maps to an invalid or incorrect inode.
5. The method of claim 1, wherein the consistency check on the dentry is performed in parallel to metadata consistency check on the inode table.
6. A system to perform a consistency check on namespace of an online file system, comprising:
a directory module to select an inode table in each of a plurality of nodes hosting an online file system;
a selection module to select a directory inode in each of the selected inode tables; and
a file system check (fsck) module to:
perform, in parallel, a consistency check on a dentry in each of a directory that individually maps to each of the selected directory inode; and
in response to identification of an inconsistency with the dentry, further to the consistency check, logging the dentry in a database.
7. The system of claim 6, wherein the inode table in each of a plurality of nodes is randomly selected.
8. The system of claim 6, wherein the file system is a scale-out file system.
9. The system of claim 6, wherein the dentry is logged along with directory inode of the dentry.
10. The system of claim 6, wherein the consistency check comprises the fsck module to:
verify whether a link between the dentry and an inode is valid; and
determine whether the dentry includes an invalid or incorrect inode.
11. A non-transitory machine-readable storage medium comprising instructions to perform a consistency check on namespace of an online file system, the instructions executable by a processor to:
select a directory inode in an inode table of an online file system;
perform a consistency check on each directory entry of a directory that maps to the directory inode; and
in response to identification of an inconsistency with a directory entry, consequent to the consistency check, record the directory entry in a database.
12. The storage medium of claim 11, wherein the inconsistency includes presence of an orphan inode in the inode table.
13. The storage medium of claim 11, wherein the inconsistency includes presence of a dangling directory entry in the directory.
14. The storage medium of claim 11, further comprising instructions to correct the inconsistency with the directory entry that is recorded in the database.
15. The storage medium of claim 11, wherein the instructions to record the directory entry in the database include instructions to record current directory of the directory entry in the database.
PCT/US2015/025969 2015-02-13 2015-04-15 Consistency check on namespace of an online file system WO2016130167A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN719CH2015 2015-02-13
IN719/CHE/2015 2015-02-13

Publications (1)

Publication Number Publication Date
WO2016130167A1 (en) 2016-08-18

Family

ID=56615368

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2015/025969 WO2016130167A1 (en) 2015-02-13 2015-04-15 Consistency check on namespace of an online file system

Country Status (1)

Country Link
WO (1) WO2016130167A1 (en)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15882248

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15882248

Country of ref document: EP

Kind code of ref document: A1