WO2015116023A1 - Online file system metadata analysis and correction - Google Patents

Online file system metadata analysis and correction

Info

Publication number
WO2015116023A1
WO2015116023A1 PCT/US2014/013280 US2014013280W WO2015116023A1 WO 2015116023 A1 WO2015116023 A1 WO 2015116023A1 US 2014013280 W US2014013280 W US 2014013280W WO 2015116023 A1 WO2015116023 A1 WO 2015116023A1
Authority
WO
WIPO (PCT)
Prior art keywords
file system
metadata
corrupt
online
corruption
Prior art date
Application number
PCT/US2014/013280
Other languages
French (fr)
Inventor
Rajagopal Chellam
Anand GANJIHAL
Santigopal MONDAL
Anoop KUMAR R.
Sandya SRIVILLIPUTTUR-MANNARSWAMY
Kumarswamy SUBBAKRISHNA
Original Assignee
Hewlett-Packard Development Company, L.P.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hewlett-Packard Development Company, L.P. filed Critical Hewlett-Packard Development Company, L.P.
Priority to PCT/US2014/013280 priority Critical patent/WO2015116023A1/en
Publication of WO2015116023A1 publication Critical patent/WO2015116023A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/14Details of searching files based on file metadata

Definitions

  • FIG. 1 illustrates a block diagram of a computing system for online file system metadata analysis and correction according to examples of the present disclosure
  • FIG. 2 illustrates a block diagram of a computing system for online file system metadata analysis and correction according to examples of the present disclosure
  • FIG. 3 illustrates a flow diagram of a method for online file system metadata analysis and correction according to examples of the present disclosure
  • FIG. 4 illustrates a flow diagram of a method for online file system metadata analysis and correction according to examples of the present disclosure
  • file systems running on enterprise and user computing systems have also increased in size. These file systems act as containers of data, and their sizes may be in the gigabytes, terabytes, and/or petabytes and may support millions or billions of file system objects.
  • the increases in file system sizes, along with the corresponding increases in the number of file system objects burden system resources and negatively impact users' experiences when the file systems are checked for file system consistency.
  • a file system consistency check (also referred to as "checker," "file system checker," or "fsck") refers to a software program or instructions that checks the consistency of a file system on a computing system.
  • the file system check analyzes a file system to discover and identify inconsistent states.
  • the file system check may also correct or fix the identified inconsistencies, problems, errors, etc.
  • file systems are checked for errors and inconsistencies while the file system is in an offline state (that is, while the file system is inaccessible to users, applications, etc.).
  • the duration of an offline file system check depends on the number of file system objects that need to be examined. Depending on the size of the file system and the number of related objects, the file system check may take hours or days to execute, rendering the computing system unusable and/or unavailable. For example, a typical offline file system check on a large file system, such as a 32 terabyte file system with 50 million objects, may take thirty or more hours. Similarly, checking a 32 terabyte file system with 100 million objects may take more than fifty hours. During this time, the file system remains offline and unavailable. Such long downtimes are typically unacceptable in enterprise storage environments, and they make offline file system checking non-scalable as file system sizes and object counts continue to grow.
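  • As a rough illustration of the figures quoted above, the sketch below converts them into an implied checking rate. The function name and the idea of expressing the cost as objects per second are assumptions made for illustration; because the quoted durations are lower bounds ("thirty or more," "more than fifty"), the computed rates are upper bounds.

```python
# Figures quoted in the text: (number of objects, minimum hours offline).
CASES = [
    (50_000_000, 30),
    (100_000_000, 50),
]


def objects_per_second(objects: int, hours: float) -> float:
    """Implied checking rate for a check that takes `hours` to cover `objects`."""
    return objects / (hours * 3600)


rates = [objects_per_second(n, h) for n, h in CASES]

for (n, h), rate in zip(CASES, rates):
    print(f"{n:,} objects in {h} h -> at most about {rate:.0f} objects/s")
```

  • At rates of this order, the downtime of an offline check grows roughly in proportion to the object count, which is the scaling problem the online approach is meant to avoid.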
  • a file system may be taken offline in order to perform the file system check. However, this may take many hours to complete, rendering the file system otherwise unusable, as discussed above.
  • Another solution provides for online file system checking by creating a snapshot of the file system and its associated metadata. The checking is then performed on the snapshot such that errors are identified and corrected. Once the checking is completed, including fixing any errors in the snapshot of the file system, the fixes are then applied back to the original file system from the snapshot file system.
  • a file system check may be performed on the snapshot with errors logged to a separate file. The file system may then be taken offline in order to apply any necessary fixes.
  • a method may include analyzing file system metadata utilized to bring the file system online (i.e., for mounting the file system) while the file system is offline to identify any corrupt file system metadata, and correcting the identified corrupt file system metadata while the file system is offline.
  • the method may further include analyzing file system metadata of the file system while the file system is online to identify any corrupt file system metadata, and logging the identified corrupt file system metadata into a corruption list.
  • the method may also include correcting the identified corrupt file system metadata logged in the corruption list while freezing the file system.
  • file system consistency checking occurs while the file system is online and accepting normal user file system operations. In this way, excessive downtime due to offline file system checking is avoided. Moreover, the file system may be corrected and "updated in place" rather than having to transition the file system into an offline state.
  • Metadata includes file tags (traditionally known as inode numbers in an example) which act as indirection mechanisms between directory entries and mcells.
  • the mcells represent traditional inodes.
  • a first type of metadata is file system metadata relating to metadata utilized in mounting the file system. This file system metadata is verified or checked prior to mounting the file system.
  • a second type of metadata is metadata relating to the operation of the file system after it is mounted, and is analyzed or checked while the file system is mounted, online, and operational.
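  • The tag indirection described above can be pictured with a minimal data model. This is a sketch only: the class names, field names, and the use of dictionaries for the tag directory and the mcell table are assumptions made for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Mcell:
    """Hypothetical stand-in for an mcell (a traditional inode)."""
    mcell_id: int
    tag: Optional[int]                 # back-pointer to the owning file tag
    records: Dict[str, object] = field(default_factory=dict)
    next_mcell: Optional[int] = None   # next mcell in the chain, if any


@dataclass
class DirectoryEntry:
    """A name in a directory; it stores a tag, not an mcell address."""
    name: str
    tag: int


def lookup(entry: DirectoryEntry,
           tag_directory: Dict[int, int],
           mcells: Dict[int, Mcell]) -> Optional[Mcell]:
    """Resolve directory entry -> tag -> primary mcell."""
    mcell_id = tag_directory.get(entry.tag)
    if mcell_id is None:
        return None
    return mcells.get(mcell_id)
```

  • Because the directory entry refers only to the tag, the tag-to-mcell link and the mcell-to-tag back-pointer can each be checked independently, which is what makes separate tag and mcell corruption lists meaningful.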
  • FIG. 1 illustrates a block diagram of a computing system 100 for online file system (FS) metadata analysis and correction according to examples of the present disclosure.
  • FIG. 1 includes particular components, modules, etc. according to various examples. However, in different embodiments, more, fewer, and/or other components, modules, arrangements of components/modules, etc. may be used according to the teachings described herein.
  • various components, modules, etc. described herein may be implemented as one or more software modules, hardware modules, special-purpose hardware (e.g., application specific hardware, application specific integrated circuits (ASICs), embedded controllers, hardwired circuitry, etc.), or some combination of these.
  • the computing system 100 may include any appropriate type of computing device, including for example smartphones, tablets, desktops, laptops, workstations, servers, smart monitors, smart televisions, digital signage, scientific instruments, retail point of sale devices, video walls, imaging devices, peripherals, or the like, including any suitable combinations thereof.
  • the computing system 100 may include a processing resource 102 that represents generally any suitable type or form of processing unit or units capable of processing data or interpreting and executing instructions.
  • the instructions may be stored on a non-transitory tangible computer-readable storage medium, such as a memory resource or on a separate device (not shown), or on any other type of volatile or non-volatile memory that stores instructions to cause a programmable processor to perform the techniques described herein.
  • the computing system 100 may include dedicated hardware, such as one or more integrated circuits, Application Specific Integrated Circuits (ASICs), Application Specific Special Processors (ASSPs), Field Programmable Gate Arrays (FPGAs), or any combination of the foregoing examples of dedicated hardware, for performing the techniques described herein.
  • the computing system 100 may include a pre-mount file system (FS) verification module 110, an online FS analysis module 112, and an FS correction module 120.
  • the modules described herein may be a combination of hardware and programming.
  • the programming may be processor executable instructions stored on a tangible memory resource such as a memory resource (not shown), and the hardware may include processing resource 102 for executing those instructions.
  • the memory resource can be said to store program instructions that when executed by the processing resource 102 implement the modules described herein.
  • Other modules may also be utilized as will be discussed further below in other examples.
  • the pre-mount FS verification module 110 verifies the FS metadata necessary for mounting the file system. For example, the pre-mount FS verification module analyzes the file system metadata of the file system while the file system is offline (i.e., before the file system is mounted on the operating system of the computing system) to identify any corrupt file system metadata for mounting the file system.
  • the pre-mount FS verification module 110 may also assign the volumes of the file system into appropriate service classes.
  • the pre-mount FS verification module 110 may validate and fix the per-volume superblock, metadata describing FS metadata, root directory metadata, and reserved bitfile metadata table (RBMT) pages.
  • the pre-mount FS verification module 110 may also check and/or correct domain attributes, domain mutable attributes, and volume attributes. These corrections are then synchronized across different volumes of the file system.
  • the pre-mount FS verification module 110 provides hooks into file system functions to ensure that metadata is checked before it is accessed by the operating system.
  • the pre-mount FS verification module 110 may also perform metadata verification for RBMT pages, including those RBMT pages that are accessed during the mounting process.
  • the pre-mount FS verification module 110 may also assign certain file system volumes into a special service class that accepts read/write/update file requests. In this way, a special volume designated as "FSCK volume" may be created and put into the active service class where new creates occur. In this case, the FSCK volume separates existing metadata that is yet to be checked from new metadata that is created during active file system operation.
  • While this example provides for a separate FSCK volume, other examples may use reserve space on each volume itself for online file system checking, to be utilized by active file system operations such as creates/writes/updates, which might require storage space to be allocated while the online file system checking is in process.
  • After the FS metadata analysis is performed by the pre-mount FS verification module 110, the pre-mount FS verification module then corrects any identified corrupt file system metadata while the file system is offline.
  • the online file system analysis module 112 checks the consistency of metadata and metadata pages of the file system.
  • the online FS analysis module 112 analyzes file system metadata of the file system while the file system is online (i.e., after the file system is mounted) to identify any corrupt file system metadata. At this analysis stage, no corrections to the metadata are made. However, in one example, the online FS analysis module 112 may fix any inconsistencies found in the page header of a metadata page. If page header inconsistencies are corrected at this stage, a free mcell list within the corrected pages is maintained and the links to the respective pages are also corrected. After verifying and correcting these free mcells, the corresponding mcells are cleaned on the BMT bitmap, which may be maintained per volume.
  • the online file system analysis module 112 maintains a page header state of verified and fixed on an auxiliary data structure, which may be maintained per volume. This state change is protected through a page-bit-lock so that a parallel on-demand analysis thread context does not perform any corrections on the same page for header verification purposes.
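  • One way to picture the page-bit-lock is a per-page lock guarding a per-page header state, so that a background thread and an on-demand thread never both correct the same header. The sketch below is illustrative only: the class, the state names, and the use of one `threading.Lock` per page in place of a lock bit are assumptions, not the disclosed implementation.

```python
import threading
from enum import Enum
from typing import Callable, List


class HeaderState(Enum):
    UNVERIFIED = 0
    VERIFIED = 1   # header was checked and found consistent
    FIXED = 2      # header was inconsistent and has been corrected


class PageHeaderTracker:
    """Hypothetical per-volume auxiliary structure tracking header state."""

    def __init__(self, page_count: int) -> None:
        self._states: List[HeaderState] = [HeaderState.UNVERIFIED] * page_count
        # One lock per page stands in for the page-bit-lock.
        self._locks = [threading.Lock() for _ in range(page_count)]

    def verify_header(self,
                      page: int,
                      is_consistent: Callable[[int], bool],
                      fix: Callable[[int], None]) -> HeaderState:
        """Check (and if needed correct) a page header exactly once."""
        with self._locks[page]:
            if self._states[page] is not HeaderState.UNVERIFIED:
                # Another thread already handled this page; do nothing.
                return self._states[page]
            if is_consistent(page):
                self._states[page] = HeaderState.VERIFIED
            else:
                fix(page)
                self._states[page] = HeaderState.FIXED
            return self._states[page]
```

  • Whichever thread takes the lock first performs the check and any fix; a later caller sees the recorded state and returns without touching the page.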
  • the online FS analysis module 112 logs the metadata identified as corrupt file system metadata in a corruption log, such as a tag corruption log or an mcell corruption log. In other examples, the logging functionality may be performed by other suitable modules, as described more fully below. Corruptions relating to tags are logged in the tag corruption log while corruptions relating to mcells are logged in the mcell corruption log.
  • the online file system analysis module 112 may check the tag entry for the mcell to determine whether the tag entry also points to the particular mcell. If not, the tag is identified as corrupt and is logged in the tag corruption list. If the tag on a particular mcell is not valid, then the invalid mcell is logged in the mcell corruption list. Further, if the link from the particular mcell to the corresponding tag and back from the tag to the mcell is valid, the tag is added to an in-process verification list. Once this tag is added to the list, other parallel threads (on-demand threads) may be prevented from acting on the same tag.
  • the records of the mcell (i.e., file statistics, extent maps of a file, a file block map table, etc.) are then verified, and if the records are not consistent, the mcell is logged in the mcell corruption list as being corrupt.
  • the auxiliary mcell bitmap that is based on the state of each of the primary mcells is updated, as are the corresponding mcells under each of the primary mcells. Once a primary mcell and its chain are verified, the corresponding tag may be marked as valid in the tag bitmap, and the tag may be removed from the in-process verification list.
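  • The link checks and logging described above can be sketched as follows. All names are hypothetical. The sketch assumes that a tag is "not valid" when it is missing or absent from the tag directory, models the tag bitmap as a set of verified tags, and reduces record verification to a single boolean; none of these details come from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Mcell:
    mcell_id: int
    tag: Optional[int]               # tag recorded on the mcell
    records_consistent: bool = True  # stand-in for the record checks


@dataclass
class AnalysisState:
    tag_corruption_list: List[int] = field(default_factory=list)
    mcell_corruption_list: List[int] = field(default_factory=list)
    in_process: Set[int] = field(default_factory=set)     # tags being verified
    verified_tags: Set[int] = field(default_factory=set)  # stands in for the tag bitmap


def verify_mcell(mcell: Mcell,
                 tag_directory: Dict[int, int],
                 state: AnalysisState) -> bool:
    """Check the mcell <-> tag link, then the mcell records. Log, never fix."""
    tag = mcell.tag
    if tag is None or tag not in tag_directory:
        # The tag on the mcell is not valid: log the mcell.
        state.mcell_corruption_list.append(mcell.mcell_id)
        return False
    if tag_directory[tag] != mcell.mcell_id:
        # The tag entry does not point back to this mcell: log the tag.
        state.tag_corruption_list.append(tag)
        return False
    # Link is valid in both directions: claim the tag while verifying.
    state.in_process.add(tag)
    try:
        if not mcell.records_consistent:
            state.mcell_corruption_list.append(mcell.mcell_id)
            return False
        state.verified_tags.add(tag)
        return True
    finally:
        state.in_process.discard(tag)
```

  • Note that the function only records findings. Consistent with the text, no correction is attempted during analysis; the two lists are consumed later, while the file system is frozen.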
  • the online file system analysis module 112 may also detect orphan tags by analyzing the tag directory for each mountable entity (i.e., file system, file set, etc.) and checking to determine whether each tag is verified by analyzing the tag bitmap. If any tag is determined to be an orphan (that is, not associated with a particular mcell), it is added to the tag corruption list.
  • the online file system analysis module 112 may also detect orphan mcells.
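  • Orphan detection can be sketched as a set difference between what exists and what the analysis pass marked as verified. The function names are hypothetical, the bitmaps are modeled as Python sets, and the orphan-mcell rule shown (an mcell never marked during verification) is an assumption, since the text does not spell out how orphan mcells are found.

```python
from typing import Dict, List, Set


def find_orphan_tags(tag_directory: Dict[int, int],
                     verified_tags: Set[int]) -> List[int]:
    """Tags present in the tag directory but never marked in the tag bitmap."""
    return sorted(tag for tag in tag_directory if tag not in verified_tags)


def find_orphan_mcells(all_mcell_ids: Set[int],
                       verified_mcells: Set[int]) -> List[int]:
    """Assumed rule: mcells never marked in the auxiliary mcell bitmap."""
    return sorted(all_mcell_ids - verified_mcells)
```

  • The results would be appended to the tag corruption list and the mcell corruption list respectively, for correction during the later freeze.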
  • the online FS analysis module 112 may handle any active file system operations that arrive during the analysis process (i.e., an on-demand analysis). For example, if a file system operation arrives during the analysis process, the online FS analysis module 112 checks whether the relevant metadata has already been verified using a verification status bit, which signifies whether the metadata has been verified. If verified, normal file system operation may continue. However, if the verification status bit indicates that the metadata has not been verified yet, the metadata is verified first in the context of the active file system thread and then the file system operation continues. In this way, the normal analysis flow is interrupted to process the new file system operation. The analysis of the remainder of the file system will then continue. User actions are therefore accommodated upon request rather than being delayed.
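  • The on-demand path can be sketched as a check-before-use wrapper around file system operations. Everything here is illustrative: the class and method names are hypothetical, the verification status bits are modeled as a set, a single lock serializes verification for simplicity, and the choice to let the operation proceed after logging a corruption is an assumption the text does not address.

```python
import threading
from typing import Callable, Iterable, List, Set, TypeVar

T = TypeVar("T")


class OnDemandVerifier:
    """Verify metadata at most once, from whichever thread needs it first."""

    def __init__(self, check: Callable[[int], bool]) -> None:
        self._check = check                # returns True if metadata is sound
        self._verified: Set[int] = set()   # stands in for verification status bits
        self._lock = threading.Lock()
        self.corruption_list: List[int] = []

    def ensure_verified(self, obj_id: int) -> None:
        with self._lock:
            if obj_id in self._verified:
                return                     # status bit already set
            if not self._check(obj_id):
                self.corruption_list.append(obj_id)
            self._verified.add(obj_id)

    def background_scan(self, obj_ids: Iterable[int]) -> None:
        """Background analysis: walk every object, skipping verified ones."""
        for obj_id in obj_ids:
            self.ensure_verified(obj_id)

    def file_operation(self, obj_id: int, operation: Callable[[int], T]) -> T:
        """On-demand path: verify in the caller's context, then proceed."""
        self.ensure_verified(obj_id)
        return operation(obj_id)
```

  • An object touched by a user operation is verified immediately in that thread, and the background scan later skips it because its status bit is already set.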
  • the functionality of the online FS analysis module 112 may be divided into separate modules.
  • the computing system may include a background online FS analysis module that performs the functionality of the online FS analysis module 112 as a background process. This enables users to continue to access the file system and its data while the analysis is being performed.
  • the computing system 100 may include an on-demand online FS analysis module. When a user requests information that has not yet been verified by the background online FS analysis module, the on-demand online FS analysis module performs an immediate check on the requested metadata so that it may be made available to the user.
  • the functionality of these modules may include the functionality of the online FS analysis module 112 and/or additional functionality as discussed below regarding FIG. 2.
  • the FS correction module 122 is responsible for correcting any metadata that is identified as corrupt during the analysis performed by the online FS analysis module 112.
  • the file system is temporarily placed in a hold state (or frozen).
  • the corrections are applied to the file system.
  • the FS correction module 122 may hold or freeze the online file system only once to apply necessary changes and corrections, rather than bringing it offline during the analysis phase as inconsistencies are detected.
  • the FS correction module 122 freezes or holds the file system and performs a file system flush.
  • the FS correction module 122 then corrects the tag corruption log entries and the mcell corruption log entries.
  • the FS correction module 122 may also perform a correction of the storage bitmap that tracks allocated disk space.
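  • The single freeze-flush-correct cycle described above might look like the sketch below. The `FileSystem` class is a toy stand-in with hypothetical `freeze` and `flush` methods, and the fix callbacks are placeholders; only the ordering (freeze once, flush, correct tag entries, correct mcell entries, correct the storage bitmap, release) is taken from the text.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


class FileSystem:
    """Toy file system that records the order of operations."""

    def __init__(self) -> None:
        self.frozen = False
        self.events: List[str] = []

    def flush(self) -> None:
        self.events.append("flush")

    @contextmanager
    def freeze(self) -> Iterator[None]:
        self.frozen = True
        self.events.append("freeze")
        try:
            yield
        finally:
            # Always release the hold, even if a correction fails.
            self.frozen = False
            self.events.append("thaw")


def apply_corrections(fs: FileSystem,
                      tag_log: List[int],
                      mcell_log: List[int],
                      fix_tag: Callable[[int], None],
                      fix_mcell: Callable[[int], None],
                      fix_storage_bitmap: Callable[[], None]) -> None:
    """Freeze once, flush, then drain both corruption logs."""
    with fs.freeze():
        fs.flush()
        for tag in tag_log:
            fix_tag(tag)
        for mcell_id in mcell_log:
            fix_mcell(mcell_id)
        fix_storage_bitmap()
    tag_log.clear()
    mcell_log.clear()
```

  • Deferring every correction to one frozen window is what keeps the hold brief: the expensive analysis has already happened while the file system was serving requests.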
  • FIG. 2 illustrates a block diagram of a computing system 200 for online file system (FS) metadata analysis and correction according to examples of the present disclosure.
  • FIG. 2 includes particular components, modules, etc. according to various examples. However, in different embodiments, more, fewer, and/or other components, modules, arrangements of components/modules, etc. may be used according to the teachings described herein.
  • various components, modules, etc. described herein may be implemented as one or more software modules, hardware modules, special-purpose hardware (e.g., application specific hardware, application specific integrated circuits (ASICs), embedded controllers, hardwired circuitry, etc.), or some combination of these.
  • the computing system 200 may include any appropriate type of computing device, including for example smartphones, tablets, desktops, laptops, workstations, servers, smart monitors, smart televisions, digital signage, scientific instruments, retail point of sale devices, video walls, imaging devices, peripherals, or the like, including any suitable combinations thereof.
  • the computing system 200 may include a processing resource 202 that represents generally any suitable type or form of processing unit or units capable of processing data or interpreting and executing instructions.
  • the instructions may be stored on a non-transitory tangible computer-readable storage medium, such as a memory resource or on a separate device (not shown), or on any other type of volatile or non-volatile memory that stores instructions to cause a programmable processor to perform the techniques described herein.
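  • the computing system 200 may include dedicated hardware, such as one or more integrated circuits, Application Specific Integrated Circuits (ASICs), Application Specific Special Processors (ASSPs), Field Programmable Gate Arrays (FPGAs), or any combination of the foregoing examples of dedicated hardware, for performing the techniques described herein.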
  • multiple processors and/or processing resources may be used, as appropriate, along with multiple memories, memory resources, and/or types of memory.
  • the computing system 200 may include a pre-mount FS verification module 210, a background online FS analysis module 216, an on-demand online FS verification module 218, an FS logging module 220, and an FS correction module 222.
  • the modules described herein may be a combination of hardware and programming.
  • the programming may be processor executable instructions stored on a tangible memory resource such as a memory resource (not shown), and the hardware may include processing resource 202 for executing those instructions.
  • the memory resource can be said to store program instructions that when executed by the processing resource 202 implement the modules described herein.
  • Other modules may also be utilized as will be discussed further below in other examples.
  • the pre-mount FS verification module 210 verifies the FS metadata necessary for mounting the file system. For example, the pre-mount FS verification module analyzes the file system metadata of the file system used during the mounting process while the file system is offline (i.e., before the file system is mounted on the operating system of the computing system) to identify any corrupt file system metadata for mounting the file system. The pre-mount FS verification module 210 may also assign the volumes of the file system into appropriate service classes.
  • the pre-mount FS verification module 210 may validate and fix the per-volume superblock, metadata describing FS metadata, root directory metadata, and reserved bitfile metadata table (RBMT) pages.
  • the pre-mount FS verification module 210 may also check and/or correct domain attributes, domain mutable attributes, and volume attributes. These corrections are then synchronized across different volumes of the file system.
  • the pre-mount FS verification module 210 provides hooks into file system functions to ensure that metadata is checked before it is accessed by the operating system.
  • the pre-mount FS verification module 210 may also perform metadata verification for RBMT pages, including those RBMT pages that are accessed during the mounting process.
  • the pre-mount FS verification module 210 may also assign certain file system volumes into a special service class that accepts read/write/update file requests.
  • a special volume designated as "FSCK volume" may be created and put into the active service class where new creates occur.
  • the FSCK volume separates existing metadata that is yet to be checked from new metadata that is created during active file system operation. While this example provides for a separate FSCK volume, other examples may use reserve space on each volume itself for online file system checking, to be utilized by active file system operations such as creates/writes/updates, which may require storage space to be allocated while the online file system checking is in process.
  • After the FS metadata analysis is performed by the pre-mount FS verification module 210, the pre-mount FS verification module then corrects any identified corrupt file system metadata while the file system is offline.
  • the background online FS analysis module 216 may perform functionality similar to the online file system analysis module 112 of FIG. 1 as described above. For example, the background online FS analysis module 216 analyzes file system metadata of the file system while the file system is online (i.e., after the file system is mounted) to identify any corrupt file system metadata. At this analysis stage, no corrections to the metadata are made. However, in one example, the background online FS analysis module 216 may fix any inconsistencies found in the page header of a metadata page. If page header inconsistencies are corrected at this stage, a free mcell list within the corrected pages is maintained and the links to the respective pages are also corrected. After verifying and correcting these free mcells, the corresponding mcells are cleaned on the mcell bitmap, which may be maintained per volume.
  • the background online FS analysis module 216 maintains a page header state of verified and fixed on an auxiliary data structure, which may be maintained per volume. This state change is protected through a page-bit-lock so that a parallel on-demand analysis thread context does not perform any corrections on the same page for header verification purposes.
  • the FS logging module 220 logs the metadata identified as corrupt file system metadata in a corruption log, such as a tag corruption log or an mcell corruption log. In other examples, the logging functionality may be performed by another suitable module, such as by the online FS analysis module 112 of FIG. 1. Corruptions relating to tags are logged in the tag corruption log while corruptions relating to mcells are logged in the mcell corruption log.
  • the background online FS analysis module 216 may check the tag entry for the mcell to determine whether the tag entry also points to the particular mcell. If not, the tag is identified as corrupt and is logged in the tag corruption list. If the tag on a particular mcell is not valid, then the invalid mcell is logged in the mcell corruption list. Further, if the link from the particular mcell to the corresponding tag and back from the tag to the mcell is valid, the tag is added to an in-process verification list. Once this tag is added to the list, other parallel threads (on-demand threads) may be prevented from acting on the same tag.
  • the records of the mcell (i.e., file statistics, extent maps of a file, a file block map table, etc.) are then verified, and if the records are not consistent, the mcell is logged in the mcell corruption list as being corrupt.
  • the auxiliary mcell bitmap that is based on the state of each of the primary mcells is updated, as are the corresponding mcells under each of the primary mcells. Once a primary mcell and its chain is verified, the corresponding tag may be marked as valid in the tag bitmap, and the tag may be removed from the in-process verification list.
  • the background online FS analysis module 216 may also detect orphan tags by analyzing the tag directory for each mountable entity (i.e., file system, file set, etc.) and checking to determine whether each tag is verified by analyzing the tag bitmap. If any tag is determined to be an orphan (that is, not associated with a particular mcell), it is added to the tag corruption list. Similarly, the background online FS analysis module 216 may also detect orphan mcells.
  • In one example, the on-demand online FS analysis module 218 handles any active file system operations that arrive during the analysis process (i.e., an on-demand analysis).
  • the on-demand online FS analysis module 218 checks whether the relevant metadata has already been verified using a verification status bit, which signifies whether the metadata has been verified. If verified, normal file system operation may continue. However, if the verification status bit indicates that the metadata has not been verified yet, the metadata is verified first in the context of the active file system thread and then the file system operation continues. In this way, the normal analysis flow is interrupted to process the new file system operation. The analysis of the remainder of the file system will then continue. User actions are therefore accommodated upon request rather than being delayed.
  • the FS correction module 222 is responsible for correcting any metadata that is identified as corrupt during the analysis performed by the background online FS analysis module 216 and/or the on-demand online FS analysis module 218, as logged in the tag corruption log and mcell corruption log by the FS logging module 220.
  • the file system is temporarily placed in a hold state (or frozen). During this brief time, the corrections are applied to the file system.
  • the FS correction module 222 may hold or freeze the online file system only once to apply necessary changes and corrections, rather than bringing it offline during the analysis phase as inconsistencies are detected.
  • the FS correction module 222 freezes or holds the file system and performs a file system flush.
  • the FS correction module 222 then corrects the tag corruption log entries and the mcell corruption log entries.
  • the FS correction module 222 may also perform a correction of the storage bitmap that tracks allocated disk space.
  • FIG. 3 illustrates a flow diagram of a method 300 for online file system metadata analysis and correction according to examples of the present disclosure.
  • the method 300 may be executed by a computing system or a computing device such as computing systems 100 and/or 200 of FIGs. 1 and 2 respectively.
  • method 300 may include: analyzing file system metadata for mounting the file system while the file system is offline to identify corrupt file system metadata (block 302); correcting the identified corrupt file system metadata while the file system is offline (block 304); analyzing file system metadata while the file system is online to identify corrupt file system metadata (block 306); logging the identified corrupt file system metadata into a corruption list (block 308); and correcting the identified corrupt file system metadata while freezing the file system (block 310).
  • the method 300 includes analyzing file system metadata for mounting the file system while the file system is offline to identify corrupt file system metadata.
  • a computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) analyzes file system metadata for mounting a file system while the file system is offline to identify any corrupt file system metadata.
  • a variety of consistency checks of the FS metadata and/or associated metadata pages can be performed, such as by the pre-mount FS verification module 110 of the computing system 100, for example. Any inconsistencies in the FS metadata may be corrected prior to mounting the file system and bringing it online. Moreover, during the pre-mount FS verification, volumes of the file system may also be assigned into appropriate service classes. The method 300 continues to block 304.
  • the method 300 includes correcting the identified corrupt file system metadata while the file system is offline.
  • the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) corrects the identified corrupt file system metadata while the file system is offline.
  • the method 300 continues to block 306.
  • the method 300 includes analyzing file system metadata while the file system is online to identify corrupt file system metadata.
  • the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) analyzes file system metadata of the file system while the file system is online to identify any corrupt file system metadata.
  • the analysis may be performed by the online FS analysis module 112 of FIG. 1 or by the background online FS analysis module 216 and the on-demand online FS analysis module 218 of FIG. 2.
  • analyzing the file system metadata includes performing, by the computing system, a background analysis of the file system metadata as described above. This includes checking the consistency of metadata and metadata pages of the file system. While the background analysis is being performed, if the computing system is interrupted, such as by a user action or request for data from the file system, an on-demand analysis of the file system metadata relating to the user action or request may be performed. In this way, the user action causes the computing system to suspend performing the background analysis on the set of metadata objects undergoing the on-demand analysis relating to the user action (e.g., a request for data). The background analysis of the file system metadata resumes following the completion of the on-demand analysis of the file system metadata relating to the user action.
  • the background analysis may continue in parallel with the on-demand analysis being performed relating to the user action if the metadata objects being analyzed are different.
  • any corrupt page headers for bitfile metadata table (BMT) (traditionally known as inode table) pages associated with the file system metadata may be corrected during the background analysis.
  • BMT bitfiie metadata table
  • the method 300 includes logging the identified corrupt file system metadata into a corruption list.
  • the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) logs the identified corrupt file system metadata into a corruption list.
  • the logging may be performed by the FS correction logging module 222 of FIG. 2 in one example.
  • the corruption list, for example, includes multiple corruption lists such as a tag corruption list and a mcell corruption list. In this case, corrupt tag metadata are logged in the tag corruption list, while corrupt mcell metadata are logged in the mcell corruption list.
  • the method 300 continues to block 310.
  • the method 300 includes correcting the identified corrupt file system metadata.
  • the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) corrects the identified corrupt file system metadata logged in the corruption list.
  • the correcting may be performed, for example, by the FS correction module 122 of FIG. 1 or the FS correction module 222 of FIG. 2.
  • the file system may be frozen or placed in a state of temporary hold in an example, during which time corrections may be applied to the metadata as indicated in the corruption log lists. In one example, correcting the identified corrupt file system metadata includes correcting the identified corrupt file system metadata logged into the tag corruption list and also correcting the identified corrupt file system metadata logged into the mcell corruption list.
  • the method 300 may include mounting, by the computing system, the file system on an operating system of the computing device to cause the file system to be online after correcting the identified corrupt file system metadata. It should be understood that the processes depicted in FIG. 3 represent illustrations, and that other processes may be added or existing processes may be removed, modified, or rearranged without departing from the scope and spirit of the present disclosure.
  • FIG. 4 illustrates a flow diagram of a method 400 for online file system metadata analysis and correction according to examples of the present disclosure.
  • the method 400 may be executed by a computing system or a computing device such as computing systems 100 and/or 200 of FIGs. 1 and 2 respectively.
  • method 400 may include: validating file system metadata prior to mounting the file system (block 402); analyzing the file system metadata after mounting the file system to identify any errors in the file system metadata while the file system is online (block 404); logging any identified errors in a tag corruption log or a mcell corruption log (block 406); and correcting the identified errors while the file system is in a temporary hold (block 408).
  • the method 400 includes validating file system metadata prior to mounting the file system.
  • a computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) validates the file system metadata for a file system prior to mounting the file system on an operating system of a computing system.
  • a variety of consistency checks of the FS metadata and/or associated metadata pages can be performed, such as by the pre-mount FS verification module 110 of the computing system 100, for example. Any inconsistencies in the FS metadata may be corrected prior to mounting the file system and bringing it online. Moreover, during the pre-mount FS verification, volumes of the file system may also be assigned into appropriate service classes. The method 400 continues to block 404.
  • the method 400 includes analyzing the file system metadata after mounting the file system to identify any errors in the file system metadata while the file system is online.
  • the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) analyzes file system metadata for the file system after mounting the file system on the computing system to identify any errors in the file system metadata while the file system is online.
  • the file system remains mounted and online during the analysis of the file system metadata.
  • the analysis may be performed by the online FS analysis module 112 of FIG. 1 or by the background online FS analysis module 216 and the on-demand online FS analysis module 218 of FIG. 2.
  • analyzing the file system metadata includes performing, by the computing system, a background analysis of the file system metadata as described above while the file system is online (that is, after it is mounted). This includes checking the consistency of metadata and metadata pages of the file system. While the background analysis is being performed, if the computing system is interrupted, such as by a user action or request for data from the file system, an on-demand analysis of the file system metadata relating to the user action or request may be performed. In this way, the user action causes the computing system to suspend performing the background analysis on the metadata objects undergoing the on-demand analysis of the file system metadata relating to the user action. The background analysis of the file system metadata resumes following the completion of the on-demand analysis of the file system metadata relating to the user action.
  • the background analysis can continue in parallel with the on-demand analysis being performed relating to a user action if the metadata objects being analyzed are different.
  • any corrupt page headers for bitfile metadata table (BMT) (traditionally inode table) pages associated with the file system metadata may be corrected during the background analysis.
  • BMT bitfile metadata table
  • the method 400 includes logging any identified errors in a tag corruption log or a mcell corruption log.
  • the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) logs any identified errors in one of a tag corruption log and a mcell corruption log.
  • the logging may be performed by the FS correction logging module 222 of FIG. 2 in one example.
  • the corruption list, for example, includes multiple corruption lists such as a tag corruption list and a mcell corruption list. In this case, corrupt tag metadata are logged in the tag corruption list, while corrupt mcell metadata are logged in the mcell corruption list.
  • the method 400 continues to block 408.
  • the method 400 includes correcting the identified errors while the file system is in a temporary hold.
  • the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) corrects the identified errors in the file system while the file system is in a temporary hold.
  • the correcting may be performed, for example, by the FS correction module 122 of FIG. 1 or the FS correction module 222 of FIG. 2.
  • the file system is frozen or placed in a state of temporary hold, during which time corrections may be applied to the metadata as indicated in the corruption log lists.
  • correcting the identified corrupt file system metadata includes correcting the identified corrupt file system metadata logged into the tag corruption list and also correcting the identified corrupt file system metadata logged into the mcell corruption list.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Examples of online file system metadata analysis and correction are disclosed. In one example implementation according to aspects of the present disclosure, a method may include analyzing file system metadata for mounting a file system while the file system is offline to identify any corrupt file system metadata, and correcting the identified corrupt file system metadata while the file system is offline. The method may further include analyzing file system metadata of the file system while the file system is online to identify any corrupt file system metadata, and logging the identified corrupt file system metadata into a corruption list. The method may also include correcting the identified corrupt file system metadata logged in the corruption list while freezing the file system.

Description

ONLINE FILE SYSTEM METADATA ANALYSIS AND CORRECTION
BACKGROUND
[0001] As computing devices such as laptops, smart phones, tablets, and other similar computing devices become more popular, increasing amounts of data are generated. These devices often rely on various remote and/or cloud based computing environments for access to data, applications, and other information, which may be stored on servers or other similar computing systems accessible via the Internet or another suitable network. As the amount of data has increased, so too have the demands for performance and system capabilities increased. Consequently, computing systems and their related file systems have grown in size and complexity.
BRIEF DESCRIPTION OF THE DRAWINGS
[0002] The following detailed description references the drawings, in which:
[0003] FIG. 1 illustrates a block diagram of a computing system for online file system metadata analysis and correction according to examples of the present disclosure;
[0004] FIG. 2 illustrates a block diagram of a computing system for online file system metadata analysis and correction according to examples of the present disclosure;
[0005] FIG. 3 illustrates a flow diagram of a method for online file system metadata analysis and correction according to examples of the present disclosure; and
[0006] FIG. 4 illustrates a flow diagram of a method for online file system metadata analysis and correction according to examples of the present disclosure.
DETAILED DESCRIPTION
[0007] With the rapid growth of big data, cloud based computing, and user generated content, file systems running on enterprise and user computing systems have also increased in size. These file systems act as containers of data, and their sizes may be in the gigabytes, terabytes, and/or petabytes and may support millions or billions of file system objects. However, the increases in file system sizes, along with the corresponding increases in the number of file system objects, burden system resources and negatively impact users' experiences when the file systems are checked for file system consistency.
[0008] A file system consistency check (also referred to as "checker," "file system checker," or "fsck") refers to a software program or instructions that checks the consistency of a file system on a computing system. In one example, the file system check analyzes a file system to discover and identify inconsistent states. The file system check may also correct or fix the identified inconsistencies, problems, errors, etc.
[0009] Typically, file systems are checked for errors and inconsistencies while the file system is in an offline state (that is, while the file system is inaccessible to users, applications, etc.). However, the duration of such offline file system checks depends on the number of file system objects that need to be examined. Depending on the size of the file system and the number of related objects, the file system check may take hours or days to execute, rendering the computing system unusable and/or unavailable. For example, a typical offline file system check on a large file system, such as a 32 terabyte file system with 50 million objects, may take thirty or more hours. Similarly, a 32 terabyte file system with 100 million objects may take more than fifty hours to perform file system checking, for example. During this time, the file system remains offline and unavailable. Such large downtimes are typically unacceptable in enterprise storage environments, and the large downtimes make offline file system checking non-scalable as file system sizes and numbers of objects continue to grow.
[0010] Various file system consistency checking solutions exist. In one example, a file system may be taken offline in order to perform the file system check. However, this may take many hours to complete, rendering the file system otherwise unusable, as discussed above. Another solution provides for online file system checking by creating a snapshot of the file system and its associated metadata. The checking is then performed on the snapshot such that errors are identified and corrected. Once the checking is completed, including fixing any errors in the snapshot of the file system, the fixes are then applied back to the original file system from the snapshot file system. Alternatively, a file system check may be performed on the snapshot with errors logged to a separate file. The file system may then be taken offline in order to apply any necessary fixes.
[0011] Various embodiments will be described below by referring to several examples of online file system (FS) metadata analysis and correction. In one example implementation according to aspects of the present disclosure, a method may include analyzing file system metadata utilized to bring the file system online (i.e., for mounting the file system) while the file system is offline to identify any corrupt file system metadata, and correcting the identified corrupt file system metadata while the file system is offline. The method may further include analyzing file system metadata of the file system while the file system is online to identify any corrupt file system metadata, and logging the identified corrupt file system metadata into a corruption list. The method may also include correcting the identified corrupt file system metadata logged in the corruption list while freezing the file system.
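For illustration only, the ordering of the phases in this example implementation might be sketched in Python as follows. Every name here (`FileSystem`, `check_and_correct`, the dictionary-based metadata objects and their `"corrupt"` flag) is a hypothetical stand-in and not part of the disclosure; the sketch shows only that mount metadata is fixed offline, that online analysis merely logs corruption, and that logged corruption is corrected during a single freeze.

```python
from dataclasses import dataclass, field


@dataclass
class FileSystem:
    """Toy model (assumption): metadata objects are dicts with 'id' and 'corrupt'."""
    mount_metadata: list      # metadata needed for mounting, checked offline
    runtime_metadata: list    # metadata checked while the file system is online
    online: bool = False
    frozen: bool = False
    events: list = field(default_factory=list)


def check_and_correct(fs: FileSystem) -> list:
    """Run the phases in order and return the corruption list that was built."""
    # Phase 1: offline check and correction of the metadata used for mounting.
    for obj in fs.mount_metadata:
        if obj["corrupt"]:
            obj["corrupt"] = False
            fs.events.append(("offline-fix", obj["id"]))

    # Mount: the file system is now online and available to users.
    fs.online = True
    fs.events.append(("mount", None))

    # Phase 2: online analysis; corruption is logged, not corrected.
    corruption_list = [obj for obj in fs.runtime_metadata if obj["corrupt"]]
    for obj in corruption_list:
        fs.events.append(("log", obj["id"]))

    # Phase 3: one freeze during which every logged correction is applied.
    fs.frozen = True
    fs.events.append(("freeze", None))
    for obj in corruption_list:
        obj["corrupt"] = False
        fs.events.append(("online-fix", obj["id"]))
    fs.frozen = False
    fs.events.append(("thaw", None))
    return corruption_list
```

In this sketch the file system is never taken offline after the mount; the only interruption to users is the interval between the `freeze` and `thaw` events.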
[0012] In some implementations, file system consistency checking occurs while the file system is online and accepting normal user file system operations. In this way, excessive downtime due to offline file system checking is avoided. Moreover, the file system may be corrected and "updated in place" rather than having to transition the file system into an offline state. These and other advantages will be apparent from the description that follows.
[0013] It should be understood that, throughout the disclosure, reference is made to a two tiered metadata scheme with two distinct types of metadata. Generally, metadata includes file tags (traditionally known as inode numbers in an example) which act as indirection mechanisms between directory entries and mcells. The mcells represent traditional inodes. A first type of metadata is file system metadata relating to metadata utilized in mounting the file system. This file system metadata is verified or checked prior to mounting the file system. A second type of metadata is metadata relating to the operation of the file system after it is mounted, and is analyzed or checked while the file system is mounted, online, and operational.
[0014] FIG. 1 illustrates a block diagram of a computing system 100 for online file system (FS) metadata analysis and correction according to examples of the present disclosure. FIG. 1 includes particular components, modules, etc. according to various examples. However, in different embodiments, more, fewer, and/or other components, modules, arrangements of components/modules, etc. may be used according to the teachings described herein. In addition, various components, modules, etc. described herein may be implemented as one or more software modules, hardware modules, special-purpose hardware (e.g., application specific hardware, application specific integrated circuits (ASICs), embedded controllers, hardwired circuitry, etc.), or some combination of these.
[0015] It should be understood that the computing system 100 may include any appropriate type of computing device, including for example smartphones, tablets, desktops, laptops, workstations, servers, smart monitors, smart televisions, digital signage, scientific instruments, retail point of sale devices, video walls, imaging devices, peripherals, or the like, including any suitable combinations thereof.
[0016] The computing system 100 may include a processing resource 102 that represents generally any suitable type or form of processing unit or units capable of processing data or interpreting and executing instructions. The instructions may be stored on a non-transitory tangible computer-readable storage medium, such as a memory resource or on a separate device (not shown), or on any other type of volatile or non-volatile memory that stores instructions to cause a programmable processor to perform the techniques described herein. Alternatively or additionally, the computing system 100 may include dedicated hardware, such as one or more integrated circuits, Application Specific Integrated Circuits (ASICs), Application Specific Special Processors (ASSPs), Field Programmable Gate Arrays (FPGAs), or any combination of the foregoing examples of dedicated hardware, for performing the techniques described herein. In some implementations, multiple processors and/or processing resources may be used, as appropriate, along with multiple memories, memory resources, and/or types of memory.
[0017] In addition to the processing resource 102, the computing system 100 may include a pre-mount file system (FS) verification module 110, an online FS analysis module 112, and an FS correction module 120. In one example, the modules described herein may be a combination of hardware and programming. The programming may be processor executable instructions stored on a tangible memory resource such as a memory resource (not shown), and the hardware may include processing resource 102 for executing those instructions. Thus the memory resource can be said to store program instructions that when executed by the processing resource 102 implement the modules described herein. Other modules may also be utilized as will be discussed further below in other examples.
[0018] The pre-mount FS verification module 110 verifies the FS metadata necessary for mounting the file system. For example, the pre-mount FS verification module analyzes the file system metadata of the file system while the file system is offline (i.e., before the file system is mounted on the operating system of the computing system) to identify any corrupt file system metadata for mounting the file system. The pre-mount FS verification module 110 may also assign the volumes of the file system into appropriate service classes.
[0019] A variety of consistency checks of the FS metadata and/or associated metadata pages can be performed on the metadata for mounting the file system. For example, the pre-mount FS verification module 110 may validate and fix per volume superblock, metadata describing FS metadata, root directory metadata, and reserved bitfile metadata table (RBMT) pages. The pre-mount FS verification module 110 may also check and/or correct domain attributes, domain mutable attributes, and volume attributes. These corrections are then synchronized across different volumes of the file system. In one example, the pre-mount FS verification module 110 provides hooks into file system functions to ensure that metadata is checked before it is accessed by the operating system. The pre-mount FS verification module 110 may also perform metadata verification for RBMT pages, including those RBMT pages that are accessed during the mounting process.
[0020] The pre-mount FS verification module 110 may also assign certain file system volumes into a special service class that accepts read/write/update file requests. In this way, a special volume designated as "FSCK volume" may be created and put into the active service class where new creates occur. In this case, the FSCK volume separates existing metadata that is yet to be checked from new metadata that is created during active file system operation. While this example provides for a separate FSCK volume, other examples may use reserve space on each volume itself for online file system checking to be utilized by active file system operations such as creates/writes/updates, which might utilize storage space to be allocated while the online file system checking is in process.
[0021] After the FS metadata analysis is performed by the pre-mount FS verification module 110, the pre-mount FS verification module 110 then corrects any identified corrupt file system metadata while the file system is offline.
[0022] The online file system analysis module 112 checks the consistency of metadata and metadata pages of the file system. In one example, the online FS analysis module 112 analyzes file system metadata of the file system while the file system is online (i.e., after the file system is mounted) to identify any corrupt file system metadata. At this analysis stage, no corrections to the metadata are made. However, in one example, the online FS analysis module 112 may fix any inconsistencies found in the page header of a metadata page. If page header inconsistencies are corrected at this stage, a free mcell list within the corrected pages is maintained and the links to the respective pages are also corrected. After verifying and correcting these free mcells, the corresponding mcells are cleaned on the BMT bitmap, which may be maintained per volume.
[0023] In an example, the online file system analysis module 112 maintains a page header state of verified and fixed on an auxiliary data structure, which may be maintained per volume. This state change is protected through a page-bit-lock so that a parallel on-demand analysis thread context does not perform any corrections on the same page for header verification purposes.
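One way such a per-page lock bit might guard the page header state is sketched below in Python. This is a minimal illustration under stated assumptions: the class `PageHeaderStates`, its method names, the `"unverified"` initial state, and the `"busy"` return value are all hypothetical, and the disclosure does not specify how the page-bit-lock is implemented.

```python
import threading


class PageHeaderStates:
    """Hypothetical per-volume auxiliary structure holding page header states."""

    def __init__(self, num_pages: int):
        self._guard = threading.Lock()            # protects the lock bits themselves
        self._lock_bits = [False] * num_pages     # one "page-bit-lock" per page
        self._state = ["unverified"] * num_pages  # becomes "verified" or "fixed"

    def try_lock(self, page: int) -> bool:
        """Set the page's lock bit; return False if another thread holds it."""
        with self._guard:
            if self._lock_bits[page]:
                return False
            self._lock_bits[page] = True
            return True

    def unlock(self, page: int) -> None:
        with self._guard:
            self._lock_bits[page] = False

    def verify_header(self, page: int, header_ok, fix) -> str:
        """Verify (and if needed fix) a page header at most once."""
        if not self.try_lock(page):
            return "busy"  # a parallel thread is already working on this page
        try:
            if self._state[page] == "unverified":
                if header_ok(page):
                    self._state[page] = "verified"
                else:
                    fix(page)
                    self._state[page] = "fixed"
            return self._state[page]
        finally:
            self.unlock(page)
```

Because the state is recorded while the lock bit is held, a second caller that arrives later sees the recorded state and does not repeat the correction.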
[0024] The online FS analysis module 112 logs the metadata identified as corrupt file system metadata in a corruption log, such as a tag corruption log or a mcell corruption log. In other examples, the logging functionality may be performed by other suitable modules, as described more fully below. Corruptions relating to tags are logged in the tag corruption log while corruptions relating to mcells are logged in the mcell corruption log.
[0025] For each mcell, the online file system analysis module 112 may check the tag entry for the mcell to determine whether the tag entry also points to the particular mcell. If not, the tag is identified as corrupt and is logged in the tag corruption list. If the tag on a particular mcell is not valid, then the invalid mcell is logged in the mcell corruption list. Further, if the link from the particular mcell to the corresponding tag and back from the tag to the mcell is valid, the tag is added to an in-process verification list. Once this tag is added to the list, other parallel threads (on-demand threads) may be prevented from acting on the same tag. The records of the mcell (i.e., file statistics, extent maps of a file, a file block map table, etc.) are then verified, and if the records are not consistent, the mcell is logged in the mcell corruption list as being corrupt.
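The per-mcell checks described above might be sketched as follows. This is an illustrative Python sketch under assumptions: tags are modeled as a dictionary from tag identifier to mcell identifier, each mcell is a dictionary with a `"tag"` field and a `"records_ok"` flag standing in for record verification, and the function name `analyze_mcells` is hypothetical. The in-process verification list is shown only as a set; in a real multi-threaded checker it would be shared with on-demand threads.

```python
def analyze_mcells(mcells: dict, tags: dict):
    """Return (tag_corruption_list, mcell_corruption_list, verified_tags)."""
    tag_corruption_list = []
    mcell_corruption_list = []
    verified_tags = set()
    in_process = set()  # stand-in for the in-process verification list

    for mcell_id, mcell in mcells.items():
        tag_id = mcell["tag"]

        # The tag recorded on the mcell is not a valid tag: the mcell is corrupt.
        if tag_id not in tags:
            mcell_corruption_list.append(mcell_id)
            continue

        # The tag entry does not point back to this mcell: the tag is corrupt.
        if tags[tag_id] != mcell_id:
            tag_corruption_list.append(tag_id)
            continue

        # The link is valid in both directions: verify the mcell's records
        # while the tag is held on the in-process list.
        in_process.add(tag_id)
        if mcell["records_ok"]:
            verified_tags.add(tag_id)
        else:
            mcell_corruption_list.append(mcell_id)
        in_process.discard(tag_id)

    return tag_corruption_list, mcell_corruption_list, verified_tags
```

Note that nothing is corrected here; the function only classifies each problem into one of the two lists for the later correction phase.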
[0026] After the tags and mcells are analyzed, the auxiliary mcell bitmap that is based on the state of each of the primary mcells is updated, as are the corresponding mcells under each of the primary mcells. Once a primary mcell and its chain are verified, the corresponding tag may be marked as valid in the tag bitmap, and the tag may be removed from the in-process verification list.
[0027] In one example, the online file system analysis module 112 may also detect orphan tags by analyzing the tag directory for each mountable entity (i.e., file system, file set, etc.) and checking to determine whether each tag is verified by analyzing the tag bitmap. If any tag is determined to be an orphan (that is, not associated with a particular mcell), it is added to the tag corruption list. Similarly, the online file system analysis module 112 may also detect orphan mcells.
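As a hedged illustration of the orphan tag scan, the Python sketch below walks a tag directory per mountable entity and tests each tag against a tag bitmap. The representation is an assumption: the bitmap is modeled as an integer whose bit `n` is set when tag `n` has been verified, and the names `is_verified` and `find_orphan_tags` are hypothetical.

```python
def is_verified(tag_bitmap: int, tag_id: int) -> bool:
    """True if bit `tag_id` is set in the (assumed integer) tag bitmap."""
    return bool((tag_bitmap >> tag_id) & 1)


def find_orphan_tags(tag_directories: dict, tag_bitmap: int) -> list:
    """Return (entity, tag_id) pairs for tags never marked valid in the bitmap."""
    orphans = []
    for entity, tag_ids in tag_directories.items():
        for tag_id in tag_ids:
            if not is_verified(tag_bitmap, tag_id):
                orphans.append((entity, tag_id))
    return orphans
```

Each pair returned by `find_orphan_tags` would then be appended to the tag corruption list.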
[0028] In one example, the online FS analysis module 112 may handle any active file system operations that arrive during the analysis process (i.e., an on-demand analysis). For example, if a file system operation arrives during the analysis process, the online FS analysis module 112 checks whether the relevant metadata has already been verified using a verification status bit, which signifies whether the metadata has been verified. If verified, normal file system operation may continue. However, if the verification status bit indicates that the metadata has not been verified yet, the metadata is verified first in the context of the active file system thread and then the file system operation continues. In this way, the normal analysis flow is interrupted to process the new file system operation. The analysis of the remainder of the file system will then continue. User actions are therefore accommodated upon request rather than being delayed.
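The interplay between the background pass and an on-demand check might look like the following Python sketch. It is a simplified model under assumptions: the class `OnlineAnalyzer`, its methods, and the per-object lock are hypothetical, the verification status bit is modeled as a boolean per metadata object, and `verify` is a caller-supplied function standing in for the actual consistency checks.

```python
import threading


class OnlineAnalyzer:
    """Hypothetical analyzer that verifies each metadata object at most once."""

    def __init__(self, object_ids, verify):
        self._verified = {obj_id: False for obj_id in object_ids}  # status bits
        self._locks = {obj_id: threading.Lock() for obj_id in object_ids}
        self._verify = verify
        self.log = []  # (context, object id) for each verification performed

    def _ensure_verified(self, obj_id, context):
        # The per-object lock keeps the background and on-demand paths
        # from verifying the same object concurrently.
        with self._locks[obj_id]:
            if not self._verified[obj_id]:
                self._verify(obj_id)
                self._verified[obj_id] = True
                self.log.append((context, obj_id))

    def background_pass(self):
        """Walk every object; already-verified objects are skipped."""
        for obj_id in self._verified:
            self._ensure_verified(obj_id, "background")

    def handle_operation(self, obj_id, operation):
        """Verify the touched metadata first if needed, then run the operation."""
        self._ensure_verified(obj_id, "on-demand")
        return operation(obj_id)
```

In this sketch a user operation on unverified metadata pays the verification cost once, in its own thread; later operations on the same object, and the background pass itself, see the status bit set and proceed immediately.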
[0029] In one example, the functionality of the online FS analysis module 112 may be divided into separate modules. For example, the computing system may include a background online FS analysis module that performs the functionality of the online FS analysis module 112 as a background process. This enables users to continue to access the file system and its data while the analysis is being performed. Additionally, the computing system 100 may include an on-demand online FS analysis module. When a user requests information that has not yet been verified by the background online FS analysis module, the on-demand online FS analysis module performs an immediate check on the requested metadata so that it may be made available to the user. The functionality of these modules may include the functionality of the online FS analysis module 112 and/or additional functionality as discussed below regarding FIG. 2.
[0030] The FS correction module 122 is responsible for correcting any metadata that is identified as corrupt during the analysis performed by the online FS analysis module 112. During the correction phase performed by the FS correction module 122, the file system is temporarily placed in a hold state (or frozen). During this brief time, the corrections are applied to the file system. By applying all of the corrections after the online FS analysis module 112 has completed the analysis of the file system, the FS correction module 122 may hold or freeze the online file system only once to apply necessary changes and corrections, rather than bringing it offline during the analysis phase as inconsistencies are detected.
[0031] During the correction phase, the FS correction module 122 freezes or holds the file system and performs a file system flush. The FS correction module 122 then corrects the tag corruption log entries and the mcell corruption log entries. The FS correction module 122 may also perform a correction of the storage bitmap that tracks allocated disk space.
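The correction phase might be sketched in Python as shown below. The sketch rests on several assumptions that are not in the disclosure: the class `ToyFileSystem`, the `temporary_hold` context manager, and the shape of the log entries (each entry already carries the repaired value) are hypothetical, and the storage bitmap is modeled as the set of blocks referenced by the mcells. How a repaired value is actually derived is outside the scope of this illustration.

```python
from contextlib import contextmanager


class ToyFileSystem:
    """Toy model (assumption): tags map to mcell ids, mcells map to block lists."""

    def __init__(self, tags, mcells):
        self.tags = dict(tags)
        self.mcells = dict(mcells)
        self.storage_bitmap = set()
        self.frozen = False
        self.freeze_count = 0
        self.flush_count = 0

    def flush(self):
        self.flush_count += 1


@contextmanager
def temporary_hold(fs):
    """Freeze the file system for the duration of the block, then release it."""
    fs.frozen = True
    fs.freeze_count += 1
    try:
        yield fs
    finally:
        fs.frozen = False


def apply_corrections(fs, tag_corruption_log, mcell_corruption_log):
    """Apply every logged correction during a single freeze."""
    with temporary_hold(fs):
        fs.flush()
        for tag_id, repaired_mcell_id in tag_corruption_log:
            fs.tags[tag_id] = repaired_mcell_id
        for mcell_id, repaired_blocks in mcell_corruption_log:
            fs.mcells[mcell_id] = list(repaired_blocks)
        # Rebuild the storage bitmap from the corrected mcells.
        fs.storage_bitmap = {b for blocks in fs.mcells.values() for b in blocks}
```

Batching both logs inside one `temporary_hold` block reflects the point made above that the file system is held only once, after the analysis has completed.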
[0032] FIG. 2 illustrates a block diagram of a computing system 200 for online file system (FS) metadata analysis and correction according to examples of the present disclosure. FIG. 2 includes particular components, modules, etc. according to various examples. However, in different embodiments, more, fewer, and/or other components, modules, arrangements of components/modules, etc. may be used according to the teachings described herein. In addition, various components, modules, etc. described herein may be implemented as one or more software modules, hardware modules, special-purpose hardware (e.g., application specific hardware, application specific integrated circuits (ASICs), embedded controllers, hardwired circuitry, etc.), or some combination of these.
[0033] It should be understood that the computing system 200 may include any appropriate type of computing device, including for example smartphones, tablets, desktops, laptops, workstations, servers, smart monitors, smart televisions, digital signage, scientific instruments, retail point of sale devices, video walls, imaging devices, peripherals, or the like, including any suitable combinations thereof.
[0034] The computing system 200 may include a processing resource 202 that represents generally any suitable type or form of processing unit or units capable of processing data or interpreting and executing instructions. The instructions may be stored on a non-transitory tangible computer-readable storage medium, such as a memory resource or on a separate device (not shown), or on any other type of volatile or non-volatile memory that stores instructions to cause a programmable processor to perform the techniques described herein. Alternatively or additionally, the computing system 100 may include dedicated hardware, such as one or more integrated circuits, Application Specific Integrated Circuits (ASICs), Application Specific Special Processors (ASSPs), Field Programmable Gate Arrays (FPGAs), or any combination of the foregoing examples of dedicated hardware, for performing the techniques described herein. In some implementations, multiple processors and/or processing resources may be used, as appropriate, along with multiple memories, memory resources, and/or types of memory.
[0035] In addition to the processing resource 202, the computing system 200 may include a pre-mount FS verification module 210, a background online FS analysis module 216, an on-demand online FS verification module 218, an FS logging module 220, and an FS correction module 222. In one example, the modules described herein may be a combination of hardware and programming. The programming may be processor executable instructions stored on a tangible memory resource such as a memory resource (not shown), and the hardware may include processing resource 202 for executing those instructions. Thus the memory resource can be said to store program instructions that when executed by the processing resource 202 implement the modules described herein. Other modules may also be utilized as will be discussed further below in other examples.
[0036] The pre-mount FS verification module 210 verifies the FS metadata necessary for mounting the file system. For example, the pre-mount FS verification module analyzes the file system metadata of the file system used during the mounting process while the file system is offline (i.e., before the file system is mounted on the operating system of the computing system) to identify any corrupt file system metadata for mounting the file system. The pre-mount FS verification module 210 may also assign the volumes of the file system into appropriate service classes.
[0037] A variety of consistency checks of the FS metadata and/or associated metadata pages can be performed on the metadata for mounting the file system. For example, the pre-mount FS verification module 210 may validate and fix per volume superblock, metadata describing FS metadata, root directory metadata, and reserved bitfile metadata table (RBMT) pages. The pre-mount FS verification module 210 may also check and/or correct domain attributes, domain mutable attributes, and volume attributes. These corrections are then synchronized across different volumes of the file system. In one example, the pre-mount FS verification module 210 provides hooks into file system functions to ensure that metadata is checked before it is accessed by the operating system. The pre-mount FS verification module 210 may also perform metadata verification for RBMT pages, including those RBMT pages that are accessed during the mounting process.
[0038] The pre-mount FS verification module 210 may also assign certain file system volumes into a special service class that accepts read/write/update file requests. In this way, a special volume designated as "FSCK volume" may be created and put into the active service class where new creates occur. In this case, the FSCK volume separates existing metadata that is yet to be checked from new metadata that is created during active file system operation. While this example provides for a separate FSCK volume, other examples may use reserve space on each volume itself for online file system checking to be utilized by active file system operations such as creates/writes/updates, which may utilize storage space to be allocated while the online file system checking is in process.
[0039] After the FS metadata analysis is performed by the pre-mount FS verification module 210, the pre-mount FS verification module 210 then corrects any identified corrupt file system metadata while the file system is offline.
[0040] The background online FS analysis module 216 may perform functionality similar to the online file system analysis module 112 of FIG. 1 as described above. For example, the background online FS analysis module 216 analyzes file system metadata of the file system while the file system is online (i.e., after the file system is mounted) to identify any corrupt file system metadata. At this analysis stage, no corrections to the metadata are made. However, in one example, the background online FS analysis module 216 may fix any inconsistencies found in the page header of a metadata page. If page header inconsistencies are corrected at this stage, a free mcell list within the corrected pages is maintained and the links to the respective pages are also corrected. After verifying and correcting these free mcells, the corresponding mcells are cleaned on the mcell bitmap, which may be maintained per volume.
[0041] In an example, the background online FS analysis module 216 maintains a page header state of verified and fixed on an auxiliary data structure, which may be maintained per volume. This state change is protected through a page-bit-lock so that a parallel on-demand analysis thread context does not perform any corrections on the same page for header verification purposes.
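By way of a non-limiting illustration, the page header state tracking of paragraph [0041] might be sketched in Python as follows. The names VolumeHeaderStates, HeaderState, and verify_header are hypothetical, and the use of one threading.Lock per page as a stand-in for the page-bit-lock is an assumption made for illustration; the disclosure does not specify the lock implementation or the layout of the auxiliary data structure.

```python
import threading
from enum import Enum


class HeaderState(Enum):
    UNVERIFIED = 0
    VERIFIED = 1   # header was checked and found consistent
    FIXED = 2      # header was checked, found inconsistent, and corrected


class VolumeHeaderStates:
    """Hypothetical per-volume auxiliary structure for page header state."""

    def __init__(self, page_count):
        self.states = [HeaderState.UNVERIFIED] * page_count
        # Assumption: one lock per page plays the role of the page-bit-lock.
        self._locks = [threading.Lock() for _ in range(page_count)]

    def verify_header(self, page_no, is_consistent, fix):
        """Verify (and fix if needed) one page header at most once.

        Returns True if this caller did the work, or False if another
        thread (background or on-demand) had already handled the page.
        """
        with self._locks[page_no]:
            if self.states[page_no] is not HeaderState.UNVERIFIED:
                return False
            if is_consistent(page_no):
                self.states[page_no] = HeaderState.VERIFIED
            else:
                fix(page_no)
                self.states[page_no] = HeaderState.FIXED
            return True
```

Under these assumptions, a background thread and an on-demand thread may both call verify_header for the same page, and only the first caller performs the header correction.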
[0042] The FS logging module 220 logs the metadata identified as corrupt file system metadata in a corruption log, such as a tag corruption log or a mcell corruption log. In other examples, the logging functionality may be performed by another suitable module, such as by the online FS analysis module 112 of FIG. 1. Corruptions relating to tags are logged in the tag corruption log while corruptions relating to mcells are logged in the mcell corruption log.
[0043] For each mcell, the background online FS analysis module 216 may check the tag entry for the mcell to determine whether the tag entry also points to the particular mcell. If not, the tag is identified as corrupt and is logged in the tag corruption list. If the tag on a particular mcell is not valid, then the invalid mcell is logged in the mcell corruption list. Further, if the link from the particular mcell to the corresponding tag and back from the tag to the mcell is valid, the tag is added to an in-process verification list. Once this tag is added to the list, other parallel threads (on-demand threads) may be prevented from acting on the same tag. The records of the mcell (i.e., file statistics, extent maps of a file, a file block map table, etc.) are then verified, and if the records are not consistent, the mcell is logged in the mcell corruption list as being corrupt.
[0044] After the tags and mcells are analyzed, the auxiliary mcell bitmap that is based on the state of each of the primary mcells is updated, as are the corresponding mcells under each of the primary mcells. Once a primary mcell and its chain are verified, the corresponding tag may be marked as valid in the tag bitmap, and the tag may be removed from the in-process verification list.
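As a non-limiting illustration, the tag and mcell cross-check of paragraphs [0043] and [0044] might be sketched in Python as follows. The dictionary layout (a tag directory mapping each tag to its primary mcell, and mcell entries carrying a "tag" field and a "records_ok" flag), the function name analyze_mcells, and the use of a set in place of the tag bitmap are all assumptions made for illustration and are not taken from the disclosure.

```python
def analyze_mcells(mcells, tag_dir):
    """Cross-check mcell-to-tag and tag-to-mcell links; log, do not correct.

    mcells:  hypothetical dict of mcell id -> {"tag": ..., "records_ok": bool}
    tag_dir: hypothetical dict of tag -> primary mcell id
    Returns (tag_corruption_list, mcell_corruption_list, verified_tags).
    """
    tag_corruption = []
    mcell_corruption = []
    verified_tags = set()   # stand-in for the tag bitmap
    in_process = set()      # tags currently being verified

    for mcell_id, mcell in mcells.items():
        tag = mcell["tag"]
        if tag not in tag_dir:
            # The tag recorded on the mcell is not valid.
            mcell_corruption.append(mcell_id)
            continue
        if tag_dir[tag] != mcell_id:
            # The tag entry does not point back to this mcell.
            tag_corruption.append(tag)
            continue

        # The link is valid in both directions: claim the tag, check records.
        in_process.add(tag)
        if mcell["records_ok"]:
            verified_tags.add(tag)
        else:
            mcell_corruption.append(mcell_id)
        in_process.discard(tag)

    return tag_corruption, mcell_corruption, verified_tags
```

In this sketch nothing is repaired during analysis; the two returned lists correspond to the tag corruption list and the mcell corruption list that are consumed later during the correction phase.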
[0045] In one example, the background online FS analysis module 216 may also detect orphan tags by analyzing the tag directory for each mountable entity (i.e., file system, file set, etc.) and checking to determine whether each tag is verified by analyzing the tag bitmap. If any tag is determined to be an orphan (that is, not associated with a particular mcell), it is added to the tag corruption list. Similarly, the background online FS analysis module 216 may also detect orphan mcells.
[0046] In one example, the on-demand online FS analysis module 218 handles any active file system operations that arrive during the analysis process (i.e., an on-demand analysis). For example, if a file system operation arrives during the analysis process as a result of a user action or request, the on-demand online FS analysis module 218 checks whether the relevant metadata has already been verified using a verification status bit, which signifies whether the metadata has been verified. If verified, normal file system operation may continue. However, if the verification status bit indicates that the metadata has not been verified yet, the metadata is verified first in the context of the active file system thread and then the file system operation continues. In this way, the normal analysis flow is interrupted to process the new file system operation. The analysis of the remainder of the file system will then continue. User actions are therefore accommodated upon request rather than being delayed.
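As a non-limiting illustration, the on-demand handling of an active file system operation might be sketched in Python as follows. The names MetadataObject, ensure_verified, and handle_fs_operation are hypothetical, and the per-object lock and the boolean "consistent" flag are assumptions made for illustration; the disclosure specifies only that a verification status bit is checked before the operation proceeds.

```python
import threading


class MetadataObject:
    """Hypothetical metadata object carrying a verification status bit."""

    def __init__(self, name, consistent=True):
        self.name = name
        self.consistent = consistent   # assumption: result of a real check
        self.verified = False          # the verification status bit
        self.lock = threading.Lock()   # keeps the two analysis paths apart


def ensure_verified(obj, corruption_log):
    """Verify obj at most once; callable from either analysis path.

    Returns True if this call performed the verification.
    """
    with obj.lock:
        if obj.verified:
            return False
        if not obj.consistent:
            # Log only; corrections are applied later, in the correction phase.
            corruption_log.append(obj.name)
        obj.verified = True
        return True


def handle_fs_operation(obj, operation, corruption_log):
    """On-demand path: verify in the caller's thread, then run the operation."""
    ensure_verified(obj, corruption_log)
    return operation(obj)
```

The background analysis could call ensure_verified on each object in turn, so an object already checked on demand would simply be skipped when the background pass reaches it.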
[0047] The FS correction module 222 is responsible for correcting any metadata that is identified as corrupt during the analysis performed by the background online FS analysis module 216 and/or the on-demand online FS analysis module 218, as logged in the tag corruption log and mcell corruption log by the FS logging module 220. During the correction phase performed by the FS correction module 222, the file system is temporarily placed in a hold state (or frozen). During this brief time, the corrections are applied to the file system. By applying all of the corrections after the background online FS analysis module 216 and the on-demand online FS analysis module 218 have completed the analysis of the file system, the FS correction module 222 may hold or freeze the online file system only once to apply necessary changes and corrections, rather than bringing it offline during the analysis phase as inconsistencies are detected.
[0048] During the correction phase, the FS correction module 222 freezes or holds the file system and performs a file system flush. The FS correction module 222 then corrects the tag corruption log entries and the mcell corruption log entries. The FS correction module 222 may also perform a correction of the storage bitmap that tracks allocated disk space.
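As a non-limiting illustration, the single freeze, flush, and correct sequence of the correction phase might be sketched in Python as follows. The FileSystem class, the frozen context manager, and the three fix_* callables are hypothetical placeholders; they do not correspond to any real file system interface, and the order of steps simply follows paragraph [0048].

```python
from contextlib import contextmanager


class FileSystem:
    """Hypothetical in-memory stand-in for an online file system."""

    def __init__(self):
        self.frozen = False
        self.events = []   # records the order of operations for inspection

    def flush(self):
        self.events.append("flush")


@contextmanager
def frozen(fs):
    """Place the file system in a temporary hold and always release it."""
    fs.frozen = True
    fs.events.append("freeze")
    try:
        yield fs
    finally:
        fs.frozen = False
        fs.events.append("thaw")


def run_correction_phase(fs, tag_log, mcell_log,
                         fix_tag, fix_mcell, fix_storage_bitmap):
    """Apply every logged correction during one freeze of the file system."""
    with frozen(fs):
        fs.flush()
        for tag in tag_log:
            fix_tag(tag)
        for mcell in mcell_log:
            fix_mcell(mcell)
        fix_storage_bitmap()
    # Assumption: logs are emptied once their entries have been applied.
    tag_log.clear()
    mcell_log.clear()
```

Because all corrections are deferred to this one call, the file system in the sketch is frozen exactly once, regardless of how many entries the two logs contain.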
[0049] FIG. 3 illustrates a flow diagram of a method 300 for online file system metadata analysis and correction according to examples of the present disclosure. The method 300 may be executed by a computing system or a computing device such as computing systems 100 and/or 200 of FIGS. 1 and 2 respectively.
[0050] In one example, method 300 may include: analyzing file system metadata for mounting the file system while the file system is offline to identify corrupt file system metadata (block 302); correcting the identified corrupt file system metadata while the file system is offline (block 304); analyzing file system metadata while the file system is online to identify corrupt file system metadata (block 306); logging the identified corrupt file system metadata into a corruption list (block 308); and correcting the identified corrupt file system metadata while freezing the file system (block 310).
[0051] At block 302, the method 300 includes analyzing file system metadata for mounting the file system while the file system is offline to identify corrupt file system metadata. For example, a computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) analyzes file system metadata for mounting a file system while the file system is offline to identify any corrupt file system metadata.
[0052] A variety of consistency checks of the FS metadata and/or associated metadata pages can be performed, such as by the pre-mount FS verification module 110 of the computing system 100, for example. Any inconsistencies in the FS metadata may be corrected prior to mounting the file system and bringing it online. Moreover, during the pre-mount FS verification, volumes of the file system may also be assigned into appropriate service classes. The method 300 continues to block 304.
[0053] At block 304, the method 300 includes correcting the identified corrupt file system metadata while the file system is offline. For example, the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) corrects the identified corrupt file system metadata while the file system is offline. The method 300 continues to block 306.
[0054] At block 306, the method 300 includes analyzing file system metadata while the file system is online to identify corrupt file system metadata. For example, the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) analyzes file system metadata of the file system while the file system is online to identify any corrupt file system metadata. In one example, the analysis may be performed by the online FS analysis module 112 of FIG. 1 or by the background online FS analysis module 216 and the on-demand online FS analysis module 218 of FIG. 2.
[0055] In one example, analyzing the file system metadata includes performing, by the computing system, a background analysis of the file system metadata as described above. This includes checking the consistency of metadata and metadata pages of the file system. While the background analysis is being performed, if the computing system is interrupted, such as by a user action or request for data from the file system, an on-demand analysis of the file system metadata relating to the user action or request may be performed. In this way, the user action causes the computing system to suspend performing the background analysis on the set of metadata objects undergoing the on-demand analysis relating to the user action (e.g., a request for data). The background analysis of the file system metadata resumes following the completion of the on-demand analysis of the file system metadata relating to the user action. The background analysis may continue in parallel with the on-demand analysis being performed relating to the user action if the metadata objects being analyzed are different. In an example, any corrupt page headers for bitfile metadata table (BMT) (traditionally known as inode table) pages associated with the file system metadata may be corrected during the background analysis. The method 300 continues to block 308.
[0056] At block 308, the method 300 includes logging the identified corrupt file system metadata into a corruption list. For example, the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) logs the identified corrupt file system metadata into a corruption list. The logging may be performed by the FS logging module 220 of FIG. 2 in one example. The corruption list, for example, includes multiple corruption lists such as a tag corruption list and a mcell corruption list. In this case, corrupt tag metadata are logged in the tag corruption list, while corrupt mcell metadata are logged in the mcell corruption list. The method 300 continues to block 310.
[0057] At block 310, the method 300 includes correcting the identified corrupt file system metadata. For example, the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) corrects the identified corrupt file system metadata logged in the corruption list. The correcting may be performed, for example, by the FS correction module 122 of FIG. 1 or the FS correction module 222 of FIG. 2. During the correcting phase, the file system may be frozen or placed in a state of temporary hold in an example, during which time corrections may be applied to the metadata as indicated in the corruption log lists. In one example, correcting the identified corrupt file system metadata includes correcting the identified corrupt file system metadata logged into the tag corruption list and also correcting the identified corrupt file system metadata logged into the mcell corruption list.
[0058] Additional processes also may be included. For example, the method 300 may include mounting, by the computing system, the file system on an operating system of the computing device to cause the file system to be online after correcting the identified corrupt file system metadata. It should be understood that the processes depicted in FIG. 3 represent illustrations, and that other processes may be added or existing processes may be removed, modified, or rearranged without departing from the scope and spirit of the present disclosure.
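As a non-limiting illustration, the ordering of blocks 302 through 310 of method 300, together with the optional mounting step, might be sketched in Python as follows. The function run_method_300, the dictionary representation of metadata, and the caller-supplied is_corrupt and repair callables are all hypothetical and are used only to make the sequencing concrete.

```python
def run_method_300(mount_metadata, online_metadata, is_corrupt, repair):
    """Walk through blocks 302-310 on hypothetical name -> value metadata.

    Returns a trace of (step, name) tuples showing the order of actions.
    """
    trace = []

    # Blocks 302 and 304: analyze and correct mount metadata while offline.
    offline_corrupt = [n for n, v in mount_metadata.items() if is_corrupt(v)]
    for name in offline_corrupt:
        mount_metadata[name] = repair(mount_metadata[name])
        trace.append(("offline-fix", name))

    # Optional step of paragraph [0058]: mounting brings the system online.
    trace.append(("mount", None))

    # Blocks 306 and 308: analyze while online; log findings, do not fix yet.
    corruption_list = [n for n, v in online_metadata.items() if is_corrupt(v)]
    trace.extend(("log", name) for name in corruption_list)

    # Block 310: apply all logged corrections during one temporary hold.
    trace.append(("freeze", None))
    for name in corruption_list:
        online_metadata[name] = repair(online_metadata[name])
        trace.append(("fix", name))
    trace.append(("thaw", None))
    return trace
```

For example, treating negative integers as corrupt values and abs as the repair, a corrupt superblock entry would be fixed before the "mount" step, while a corrupt tag entry would only be logged after mounting and then fixed between the "freeze" and "thaw" steps.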
[0059] FIG. 4 illustrates a flow diagram of a method 400 for online file system metadata analysis and correction according to examples of the present disclosure. The method 400 may be executed by a computing system or a computing device such as computing systems 100 and/or 200 of FIGS. 1 and 2 respectively.
[0060] In one example, method 400 may include: validating file system metadata prior to mounting the file system (block 402); analyzing the file system metadata after mounting the file system to identify any errors in the file system metadata while the file system is online (block 404); logging any identified errors in a tag corruption log or a mcell corruption log (block 406); and correcting the identified errors while the file system is in a temporary hold (block 408).
[0061] At block 402, the method 400 includes validating file system metadata prior to mounting the file system. For example, a computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) validates the file system metadata for a file system prior to mounting the file system on an operating system of a computing system.
[0062] A variety of consistency checks of the FS metadata and/or associated metadata pages can be performed, such as by the pre-mount FS verification module 110 of the computing system 100, for example. Any inconsistencies in the FS metadata may be corrected prior to mounting the file system and bringing it online. Moreover, during the pre-mount FS verification, volumes of the file system may also be assigned into appropriate service classes. The method 400 continues to block 404.
[0063] At block 404, the method 400 includes analyzing the file system metadata after mounting the file system to identify any errors in the file system metadata while the file system is online. For example, the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) analyzes file system metadata for the file system after mounting the file system on the computing system to identify any errors in the file system metadata while the file system is online. The file system remains mounted and online during the analysis of the file system metadata. In one example, the analysis may be performed by the online FS analysis module 112 of FIG. 1 or by the background online FS analysis module 216 and the on-demand online FS analysis module 218 of FIG. 2.
[0064] In one example, analyzing the file system metadata includes performing, by the computing system, a background analysis of the file system metadata as described above while the file system is online (that is, after it is mounted). This includes checking the consistency of metadata and metadata pages of the file system. While the background analysis is being performed, if the computing system is interrupted, such as by a user action or request for data from the file system, an on-demand analysis of the file system metadata relating to the user action or request may be performed. In this way, the user action causes the computing system to suspend performing the background analysis on the metadata objects undergoing the on-demand analysis of the file system metadata relating to the user action. The background analysis of the file system metadata resumes following the completion of the on-demand analysis of the file system metadata relating to the user action. The background analysis can continue in parallel with the on-demand analysis being performed relating to a user action if the metadata objects being analyzed are different. In an example, any corrupt page headers for bitfile metadata table (BMT) (traditionally inode table) pages associated with the file system metadata may be corrected during the background analysis. The method 400 continues to block 406.
[0065] At block 406, the method 400 includes logging any identified errors in a tag corruption log or a mcell corruption log. For example, the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) logs any identified errors in one of a tag corruption log and a mcell corruption log. The logging may be performed by the FS logging module 220 of FIG. 2 in one example. The corruption list, for example, includes multiple corruption lists such as a tag corruption list and a mcell corruption list. In this case, corrupt tag metadata are logged in the tag corruption list, while corrupt mcell metadata are logged in the mcell corruption list. The method 400 continues to block 408.
[0066] At block 408, the method 400 includes correcting the identified errors while the file system is in a temporary hold. For example, the computing system (e.g., computing system 100 of FIG. 1, computing system 200 of FIG. 2) corrects the identified errors in the file system while the file system is in a temporary hold. The correcting may be performed, for example, by the FS correction module 122 of FIG. 1 or the FS correction module 222 of FIG. 2. During the correcting phase, the file system is frozen or placed in a state of temporary hold, during which time corrections may be applied to the metadata as indicated in the corruption log lists. In one example, correcting the identified corrupt file system metadata includes correcting the identified corrupt file system metadata logged into the tag corruption list and also correcting the identified corrupt file system metadata logged into the mcell corruption list.
[0067] Additional processes also may be included, and it should be understood that the processes depicted in FIG. 4 represent illustrations, and that other processes may be added or existing processes may be removed, modified, or rearranged without departing from the scope and spirit of the present disclosure.
[0068] It should be emphasized that the above-described examples are merely possible examples of implementations and set forth for a clear understanding of the present disclosure. Many variations and modifications may be made to the above-described examples without departing substantially from the spirit and principles of the present disclosure. Further, the scope of the present disclosure is intended to cover any and all appropriate combinations and sub-combinations of all elements, features, and aspects discussed above. All such appropriate modifications and variations are intended to be included within the scope of the present disclosure, and all possible claims to individual aspects or combinations of elements or steps are intended to be supported by the present disclosure.

Claims

WHAT IS CLAIMED IS:
1. A method comprising:
analyzing, by a computing system, file system metadata for mounting a file system while the file system is offline to identify any corrupt file system metadata;
correcting, by the computing system, the identified corrupt file system metadata while the file system is offline;
analyzing, by the computing system, file system metadata of the file system while the file system is online to identify any corrupt file system metadata;
logging, by the computing system, the identified corrupt file system metadata into a corruption list; and
correcting, by the computing system, the identified corrupt file system metadata logged in the corruption list.
2. The method of claim 1, further comprising:
mounting, by the computing system, the file system on an operating system of the computing device to cause the file system to be online after correcting the identified corrupt file system metadata.
3. The method of claim 1, wherein the corruption list includes a tag corruption list and a mcell corruption list, and wherein the identified corrupt file system metadata is logged into one of the tag corruption list and the mcell corruption list.
4. The method of claim 3, wherein correcting the identified corrupt file system metadata includes correcting the identified corrupt file system metadata logged into the tag corruption list and correcting the identified corrupt file system metadata logged into the mcell corruption list, and wherein the correcting the identified corrupt file system metadata occurs while the file system is frozen.
5. The method of claim 1, wherein analyzing the file system metadata further comprises:
performing, by the computing system, a background analysis of the file system metadata; and
responsive to a user action, performing, by the computing system, an on-demand analysis of the file system metadata relating to the user action.
6. The method of claim 5, wherein performing the background analysis of the file system metadata further comprises:
correcting any corrupt page headers for bitfile metadata table (BMT) pages associated with the file system metadata.
7. The method of claim 5, wherein the user action causes the computing system to suspend performing the background analysis while performing the on-demand analysis of the file system metadata relating to the user action.
8. The method of claim 7, wherein performing the background analysis of the file system metadata resumes following the completion of the on-demand analysis of the file system metadata relating to the user action.
9. A system comprising:
a processing resource;
a pre-mount file system verification module, executable by the processing resource, to analyze file system metadata of a file system while the file system is offline to identify any corrupt file system metadata and to correct the identified corrupt file system metadata while the file system is offline;
an online file system analysis module, executable by the processing resource, to analyze file system metadata of the file system while the file system is online to identify any corrupt file system metadata; and
a file system correction module, executable by the processing resource, to correct the identified corrupt file system metadata while freezing the file system.
10. The system of claim 9, further comprising:
an on-demand file system analysis module, executable by the processing resource, to analyze the file system metadata of the file system while the file system is online to identify any corrupt file system metadata in file system metadata relating to a user action, the on-demand analysis being responsive to the user action.
11. The system of claim 10, wherein the on-demand file system analysis module interrupts the online file system analysis module responsive to the user action, and wherein the online file system analysis module resumes analyzing the file system metadata upon completion of the on-demand file system analysis.
12. The system of claim 9, further comprising:
a logging module, executable by the processing resource, to log the identified corrupt file system metadata into a corruption list, wherein the corruption list includes a tag corruption list and a mcell corruption list.
13. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to:
validate file system metadata for a file system prior to mounting the file system on an operating system of a computing system;
analyze file system metadata for the file system after mounting the file system on the computing system to identify any errors in the file system metadata while the file system is online;
log any identified errors in one of a tag corruption log and a mcell corruption log; and
correct the identified errors in the file system while the file system is in a temporary hold.
14. The non-transitory computer-readable storage medium of claim 13, wherein correcting the identified errors in the metadata page of the file system includes correcting the errors logged in the tag corruption log list and correcting the errors logged in the mcell corruption log list.
15. The non-transitory computer-readable storage medium of claim 13, wherein the file system remains mounted and online during the analysis of the file system metadata.
PCT/US2014/013280 2014-01-28 2014-01-28 Online file system metadata analysis and correction WO2015116023A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2014/013280 WO2015116023A1 (en) 2014-01-28 2014-01-28 Online file system metadata analysis and correction

Publications (1)

Publication Number Publication Date
WO2015116023A1 true WO2015116023A1 (en) 2015-08-06

Family

ID=53757438

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2014/013280 WO2015116023A1 (en) 2014-01-28 2014-01-28 Online file system metadata analysis and correction

Country Status (1)

Country Link
WO (1) WO2015116023A1 (en)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14881294

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14881294

Country of ref document: EP

Kind code of ref document: A1