WO2017054643A1 - Data rescue method and file server - Google Patents

Data rescue method and file server (一种数据抢救方法及文件服务器)

Info

Publication number
WO2017054643A1
WO2017054643A1 · PCT/CN2016/098862 · CN2016098862W
Authority
WO
WIPO (PCT)
Prior art keywords
disk
data
file server
new
data block
Prior art date
Application number
PCT/CN2016/098862
Other languages
English (en)
French (fr)
Inventor
王力涛
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Publication of WO2017054643A1 publication Critical patent/WO2017054643A1/zh

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 — Error detection; Error correction; Monitoring
    • G06F 11/07 — Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/08 — Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F 11/10 — Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's

Definitions

  • the embodiments of the present invention relate to the field of storage technologies, and in particular, to a data rescue method and a file server.
  • a file server at each physical location can manage one or more disks, including data disks and parity disks.
  • system data in a distributed file system is generally stored jointly on the data disks and the parity disks.
  • system data in a distributed file system is lost when more disks fail than the parity disks can protect against. At that point the faulty disk needs to be rescued promptly, and the system data in the distributed file system is then rescued from the data recovered from the faulty disk together with the data on the remaining non-faulty disks in the distributed file system.
  • the data in a disk can be logically divided into three levels: management metadata, directories and files, see Figure 2. The management metadata refers to the management information of the disk; it provides structured data about resources or data, that is, a structured description of the data, and may include, for example, root directory information and disk space management information.
  • when a disk fails, the data in the faulty disk usually needs to be scanned and copied to another new disk in order to rescue it. The scanning process has to repeatedly parse, from the management metadata of the faulty disk, the correspondence between directories and locations on the faulty disk, and then, from the directories, the correspondence between files and locations on the faulty disk, before the files can be copied to the new disk.
  • this complicated file parsing process makes data copying slow, which leads to slow recovery of the faulty disk and, in turn, slow rescue of the system data in the distributed file system.
  • the embodiments of the present invention provide a data rescue method and a file server, which can solve the prior-art problem that, when the data in a faulty disk is copied to a new disk, the copying is slow because file parsing is required, so that the faulty disk is rescued slowly.
  • a data rescue method including:
  • management metadata is created in the new disk, and the management metadata is management information of the new disk;
  • a node data block in the faulty disk is copied to the new disk, where the node data block includes a directory data block or a file data block;
  • the node data block copied from the faulty disk to the new disk is mounted onto the management metadata of the new disk.
  • the management metadata includes root directory information and disk management information.
  • the node data block includes a descriptor and data content; the storage space occupied by the descriptor is a preset number of bytes; the descriptor includes a start identifier and a first byte number; the start identifier is located at the header of the descriptor and indicates the start position of the node data block; and the first byte number indicates the number of bytes occupied by the data content.
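  • For illustration only (this layout is not claimed by the patent), the node data block described above could be modeled as follows; the magic value b"NODE", the 128-byte descriptor size, the big-endian length field and its position right after the magic are assumptions of this sketch:

```python
import struct
from dataclasses import dataclass

# Illustrative, assumed values; the patent leaves the concrete on-disk layout implementation-defined.
MAGIC = b"NODE"            # start identifier (a magic number) at the header of every descriptor
DESCRIPTOR_SIZE = 128      # the "preset number of bytes" occupied by the descriptor

@dataclass
class NodeDataBlock:
    """A directory or file data block: a fixed-size descriptor followed by its data content."""
    descriptor: bytes      # DESCRIPTOR_SIZE bytes, beginning with the start identifier
    data: bytes            # exactly "first byte number" bytes of content

    @classmethod
    def parse(cls, image: bytes, offset: int) -> "NodeDataBlock":
        if image[offset:offset + len(MAGIC)] != MAGIC:
            raise ValueError(f"no start identifier at offset {offset}")
        # The first byte number (length of the data content) is assumed to sit right after the magic.
        (data_len,) = struct.unpack_from(">I", image, offset + len(MAGIC))
        descriptor_end = offset + DESCRIPTOR_SIZE
        return cls(image[offset:descriptor_end], image[descriptor_end:descriptor_end + data_len])
```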
  • the copying of the node data block in the faulty disk to the new disk includes: searching the faulty disk for the start identifier; and, starting from the header of the descriptor, copying the descriptor of the preset number of bytes and the data content of the first byte number to the new disk.
  • the descriptor is an index node (inode).
  • the starting identifier is a magic number.
  • optionally, the method further includes: adding the new disk to a rescue disk pool; and copying the data in the new disk to a service disk in a service disk pool.
  • optionally, the method further includes: sending the data in the new disk to another file server.
  • optionally, the method further includes: receiving data sent by another file server; and saving the data to a service disk in the service disk pool.
  • a file server including:
  • a creating unit configured to create management metadata in a new disk when the data in the faulty disk needs to be rescued, where the management metadata is management information of the new disk;
  • a copy unit configured to copy a node data block in the faulty disk to the new disk, where the node data block includes a directory data block or a file data block;
  • a mounting unit configured to mount the node data block copied from the faulty disk to the new disk onto the management metadata of the new disk.
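  • As a rough, non-authoritative sketch of how the three units might cooperate (the FileServer class, its method names and the disk objects below are hypothetical and only illustrate the create-copy-mount flow described above):

```python
class FileServer:
    """Rescues a faulty disk by recreating management metadata, copying node blocks, then mounting them."""

    def rescue(self, faulty_disk, new_disk):
        metadata = self.create_management_metadata(new_disk)     # creating unit
        blocks = self.copy_node_blocks(faulty_disk, new_disk)    # copy unit
        self.mount(blocks, metadata)                             # mounting unit

    def create_management_metadata(self, new_disk):
        # Root directory information and disk space management information for the new disk only;
        # nothing is read from the faulty disk's (possibly damaged) metadata.
        return {"root_directory": [], "space_management": {"allocated": []}}

    def copy_node_blocks(self, faulty_disk, new_disk):
        # Directory data blocks and file data blocks are copied verbatim, with no file parsing.
        # faulty_disk.scan_node_blocks() and new_disk.write() are placeholder methods for the sketch.
        return [new_disk.write(block) for block in faulty_disk.scan_node_blocks()]

    def mount(self, blocks, metadata):
        # Attach the copied node data blocks to the new disk's management structure.
        metadata["root_directory"].extend(blocks)
```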
  • the management metadata includes root directory information and disk management information.
  • the node data block includes a descriptor and data content; the storage space occupied by the descriptor is a preset number of bytes; the descriptor includes a start identifier and a first byte number; the start identifier is located at the header of the descriptor and indicates the start position of the node data block; and the first byte number indicates the number of bytes occupied by the data content.
  • the copy unit is specifically configured to: search the faulty disk for the start identifier; and, starting from the header of the descriptor, copy the descriptor of the preset number of bytes and the data content of the first byte number to the new disk.
  • the descriptor is an index node (inode).
  • the starting identifier is a magic number.
  • a first processing unit configured to add the new disk to the rescue disk pool
  • the copy unit is further configured to copy data in the new disk to a service disk in the service disk pool.
  • optionally, the file server further includes:
  • a sending unit configured to send data in the new disk to another file server.
  • optionally, the file server further includes:
  • a receiving unit configured to receive data sent by another file server
  • a second processing unit configured to save the data received by the receiving unit to a service disk in the service disk pool.
  • the embodiments of the invention provide a data rescue method and a file server. When the data in a faulty disk needs to be rescued, management metadata is created in a new disk, the node data blocks in the faulty disk are copied to the new disk, and the node data blocks copied from the faulty disk to the new disk are then mounted onto the management metadata of the new disk, so that the data in the faulty disk is restored to the new disk. This avoids the complicated file parsing process of the prior art and improves the rescue speed of the faulty disk. It can therefore solve the prior-art problem that, when the data in a faulty disk is copied to a new disk, the copying is slow because file parsing is required, so that the faulty disk is rescued slowly.
  • FIG. 1 is a schematic structural diagram of a distributed file system according to an embodiment of the present invention;
  • FIG. 2 is a schematic diagram of the logical structure of a disk according to an embodiment of the present invention;
  • FIG. 3 is a schematic flowchart of a data rescue method according to an embodiment of the present invention;
  • FIG. 4 is a schematic diagram of a disk management structure according to an embodiment of the present invention;
  • FIG. 5 is a schematic structural diagram of disk data storage according to an embodiment of the present invention;
  • FIG. 6 is a schematic diagram of copying node data blocks in a faulty disk to a new disk according to an embodiment of the present invention;
  • FIG. 7 is a schematic structural diagram of a node data block according to an embodiment of the present invention;
  • FIG. 8 is a schematic diagram of the logical structure of disk data according to an embodiment of the present invention;
  • FIG. 9 is a schematic flowchart of another data rescue method according to an embodiment of the present invention;
  • FIG. 10 is a schematic diagram of a service disk pool and a rescue disk pool according to an embodiment of the present invention;
  • FIG. 11 is a schematic flowchart of another data rescue method according to an embodiment of the present invention;
  • FIG. 12 is a schematic flowchart of another data rescue method according to an embodiment of the present invention;
  • FIG. 13 is a schematic flowchart of another data rescue method according to an embodiment of the present invention;
  • FIG. 14 is a schematic structural diagram of a file server according to an embodiment of the present invention;
  • FIG. 15 is a schematic structural diagram of another file server according to an embodiment of the present invention;
  • FIG. 16 is a schematic structural diagram of another file server according to an embodiment of the present invention;
  • FIG. 17 is a schematic structural diagram of another file server according to an embodiment of the present invention;
  • FIG. 18 is a schematic structural diagram of another file server according to an embodiment of the present invention;
  • FIG. 19 is a schematic structural diagram of another file server according to an embodiment of the present invention;
  • FIG. 20 is a schematic structural diagram of another file server according to an embodiment of the present invention.
  • in the data rescue method provided by the following embodiments of the present invention, after management metadata is created in a new disk, the node data blocks in the faulty disk are copied directly to the new disk without any file parsing, and the copied node data blocks are mounted onto the management metadata of the new disk, so that the data in the faulty disk is restored to the new disk. This avoids the complicated file parsing process of the prior art and therefore improves the rescue speed of the faulty disk.
  • An embodiment of the present invention provides a data rescue method. Referring to FIG. 3, the method may include:
  • 101. When the data in a faulty disk needs to be rescued, the file server creates management metadata in a new disk, where the management metadata is the management information of the new disk.
  • the file server is a server that performs disk management in the distributed file system; it can be used to rescue the data in a faulty disk that it manages to a new disk, and it can also provide normal service processing for users.
  • when a disk managed by the file server fails and its data needs to be rescued, the file server can create management metadata on another, new disk.
  • the management metadata may include root directory information, disk space management information and the like; it describes the characteristics and attributes of an information resource or of the data itself, specifies how the digitized information is organized, and supports functions such as locating, discovery, certification, evaluation and selection.
  • in fact, for a local file system or an object file system, the management structure of every disk is the same; the difference lies only in the data managed by that structure. Management metadata can therefore be created on a new disk to establish the management structure of the new disk, and the directory data blocks and file data blocks in the faulty disk can be copied to the new disk and managed by the management structure created from that metadata, so that the data in the faulty disk is restored to the new disk and rescued. The management metadata in the faulty disk does not need to be copied to the new disk, no file parsing is performed based on it, and it can simply be discarded. A schematic diagram of the management structure corresponding to the management metadata in a disk is shown in FIG. 4.
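  • A minimal sketch of what creating the new disk's management metadata could look like, assuming (purely for illustration) a JSON-encoded metadata region at the head of the disk and a simple block count; the real management structure is not specified at this level of detail in the text:

```python
import json

def create_management_metadata(new_disk_path: str, disk_size: int, block_size: int = 4096) -> dict:
    """Write fresh management metadata (root directory + space management) to the new disk.

    Nothing is read from the faulty disk's metadata, which may be damaged; it is simply discarded.
    """
    metadata = {
        "root_directory": {"entries": []},              # empty root, populated when node blocks are mounted
        "space_management": {
            "block_size": block_size,
            "total_blocks": disk_size // block_size,
            "allocated": [],
        },
    }
    with open(new_disk_path, "r+b") as disk:
        disk.seek(0)                                    # assumed metadata region at the start of the disk
        disk.write(json.dumps(metadata).encode())
    return metadata
```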
  • in the prior art, the management metadata of a disk is generally backed up to other disks in advance, which consumes extra disk space and adds extra read/write load on those disks. Such a backup, moreover, only covers management metadata corruption caused by a physical failure of the faulty disk; when the management metadata is corrupted by software damage, both the source management metadata and the management metadata backed up to the other disks may be erroneous.
  • by discarding the management metadata of the faulty disk and creating management metadata on the new disk, the data rescue method provided by the embodiments of the present invention also avoids the prior-art backup of the management metadata of the faulty disk, which saves disk space; and whether the management metadata in the faulty disk is corrupted by hardware damage or by software damage, the rescue of the data in the faulty disk is not affected.
  • 102. The file server copies the node data blocks in the faulty disk to the new disk, where a node data block includes a directory data block or a file data block.
  • since the management metadata, directory data blocks and file data blocks are stored randomly on the disk (see FIG. 5), the data remaining on the disk apart from the management metadata consists of the directory data blocks and file data blocks managed by the management structure established from the management metadata; here the directory data blocks and file data blocks are collectively referred to as node data blocks.
  • after the management metadata has been created in the new disk, there is no longer any need to perform file parsing based on the management metadata in the faulty disk and to copy the data to the new disk by resolving the full path of each file; instead, the management metadata of the faulty disk and the file parsing can be bypassed, and the node data blocks in the faulty disk can be copied directly to the new disk.
  • the schematic diagram of the file server copying the node data block in the fault disk to the new disk through steps 101 and 102 can be seen in FIG. 6.
  • the node data block may include a descriptor and data content; the storage space occupied by the descriptor may be a preset number of bytes; the descriptor may include a start identifier and a first byte number, where the start identifier is located at the header of the descriptor and can be used to indicate the start position of the node data block, and the first byte number can be used to indicate the number of bytes occupied by the data content. The preset number of bytes occupied by the descriptor can be set as needed. A schematic diagram of the structure of a node data block is shown in FIG. 7.
  • the file server copying the node data block in the fault disk to the new disk may include:
  • the file server looks up the start identifier in the fault disk
  • the file server copies the descriptor of the preset number of bytes and the data content of the first byte number to the new disk starting from the header of the descriptor.
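  • Under the same assumed layout as in the earlier sketch (a fixed-size descriptor that starts with a magic number followed by a length field), that copy step could look roughly like this; failed_disk and new_disk are assumed to be file-like objects opened on the raw devices:

```python
import struct

MAGIC = b"NODE"            # assumed start identifier (magic number)
DESCRIPTOR_SIZE = 128      # assumed "preset number of bytes" occupied by the descriptor

def copy_one_node_block(failed_disk, new_disk, offset: int) -> int:
    """Copy the node data block whose descriptor starts at `offset`; return the number of bytes copied."""
    failed_disk.seek(offset)
    descriptor = failed_disk.read(DESCRIPTOR_SIZE)      # copying starts from the header of the descriptor
    if not descriptor.startswith(MAGIC):
        raise ValueError("offset does not point at a start identifier")
    # The "first byte number" (length of the data content) is assumed to follow the magic number.
    (data_length,) = struct.unpack_from(">I", descriptor, len(MAGIC))
    data = failed_disk.read(data_length)
    new_disk.write(descriptor)
    new_disk.write(data)
    return DESCRIPTOR_SIZE + data_length
```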
  • the descriptor can be an index node (inode)
  • the starting identifier can be a magic number
  • the descriptor index node may include a start identifier and a first byte number
  • the start identifier may be a magic number.
  • the file server can search the faulty disk for the start identifier, i.e. the magic number located at the header of a descriptor, and, starting from the first byte of the magic number, copy the descriptor of the preset number of bytes and the data content of the first byte number to the new disk, thereby copying the node data block to the new disk.
  • the file server may copy the node data block corresponding to a start identifier to the new disk each time a start identifier is found; alternatively, as sketched below, the file server may first scan the faulty disk for all start identifiers, build a node data block list from them, and then copy the node data blocks to the new disk according to that list. This is not limited in the embodiments of the present invention.
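  • The scan-first strategy could be sketched as below (same assumed magic number as above; the chunked read and the overlap handling are illustrative details, not part of the claims). Each returned offset can then be fed to a per-block copy routine such as the one sketched earlier:

```python
MAGIC = b"NODE"            # same assumed start identifier as in the earlier sketches

def find_start_identifiers(failed_disk_path: str, chunk_size: int = 1 << 20) -> list:
    """Scan the raw faulty disk once and return the offset of every start identifier found."""
    offsets, base = [], 0
    with open(failed_disk_path, "rb") as disk:
        previous_tail = b""
        while chunk := disk.read(chunk_size):
            window = previous_tail + chunk
            position = window.find(MAGIC)
            while position != -1:
                offsets.append(base - len(previous_tail) + position)
                position = window.find(MAGIC, position + 1)
            previous_tail = window[-(len(MAGIC) - 1):]   # keep an overlap so a magic split across chunks is not missed
            base += len(chunk)
    return offsets
```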
  • 103. The file server mounts the node data blocks copied from the faulty disk to the new disk onto the management metadata of the new disk.
  • after the file server has copied the node data blocks in the faulty disk to the new disk, referring to the logical structure of the data in the new disk shown in FIG. 8, the node data blocks in the new disk can be mounted onto the management metadata of the new disk; that is, the directory data blocks and file data blocks are handed over to the management structure, forming a complete disk storage system. At this point, files can be parsed from the management metadata of the new disk and the data in the new disk can be read normally; in effect, the data in the new disk is the rescued data of the faulty disk.
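  • Mounting can then amount to registering every copied block under the new disk's management structure; a toy sketch, assuming the in-memory metadata shape used in the earlier metadata sketch and copied blocks recorded as (offset, length) pairs:

```python
def mount_copied_blocks(metadata: dict, copied_blocks: list) -> None:
    """Attach every copied directory/file data block to the new disk's management metadata."""
    for offset, length in copied_blocks:
        entry = {"offset": offset, "length": length}
        metadata["root_directory"]["entries"].append(entry)      # hand the block over to the management structure
        metadata["space_management"]["allocated"].append(entry)  # and record its space as in use
```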
  • the fault disk is usually a disk in the distributed file system.
  • the system data in the distributed file system can be further rescued according to the data in the rescued fault disk. Therefore, referring to FIG. 9, the data rescue method provided by the embodiment of the present invention may further include:
  • 104. The file server adds the new disk to a rescue disk pool.
  • the new disk can also be added to the rescue disk pool and the rescue disk list managed by the data rescue module to perform system data rescue of the distributed file system.
  • the data rescue module is used to manage disks in the rescue disk pool and cannot manage other disks.
  • it should be noted that step 104 is described using, as an example, the case in which the same file server restores the data in the faulty disk to the new disk and adds the new disk to the rescue disk pool it manages in order to perform the system data rescue of the distributed file system; that is, the data rescue of the faulty disk and the system data rescue are performed by the same file server.
  • when a file server 1 and a file server 2 are located at the same physical location, restoring the data in the faulty disk to the new disk through file server 1, inserting the new disk into file server 2 and adding it to the rescue disk pool of file server 2 so as to perform the system data rescue of the distributed file system is a technical solution that is easily conceivable and also falls within the protection scope of the present invention.
  • 105. The file server copies the data in the new disk to a service disk in the service disk pool.
  • the file server obtains the rescue disk list of the rescue disk pool and, according to that list, copies the data in the new disk holding the restored faulty-disk data to the service disks in the service disk pool managed by the service data module, so as to rescue the system data in the distributed file system.
  • the service data module is used to manage disks in the service disk pool, and cannot manage other disks, such as disks in the rescue disk pool.
  • it should be noted that, in the data rescue method provided by the embodiments of the present invention, the data rescue module can only manage the disks in the rescue disk pool and cannot manage the disks in the service disk pool, and the service data module can only manage the disks in the service disk pool and cannot manage the disks in the rescue disk pool. That is, referring to FIG. 10, the rescue disk pool and the service disk pool in the distributed file system are isolated from each other, so the system data rescue and the services of the distributed file system run in parallel and do not interfere with each other.
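  • The isolation between the two pools can be pictured as two managers that never share disks; the module and pool names come from the text, while the data structure is an illustrative assumption:

```python
class DiskPool:
    """A set of disk identifiers owned by exactly one management module."""

    def __init__(self, name: str):
        self.name = name
        self.disks = set()

    def add(self, disk_id: str) -> None:
        self.disks.add(disk_id)

rescue_pool = DiskPool("rescue")       # managed only by the data rescue module
service_pool = DiskPool("service")     # managed only by the service data module

rescue_pool.add("disk-33")             # the new disk holding the rescued data
# Copying data from the rescue pool into the service pool is the only interaction between them,
# so normal service I/O on the service pool proceeds in parallel with the system data rescue.
```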
  • the data rescue method provided by the embodiments of the present invention can therefore improve the processing performance of the distributed file system by carrying out the system data rescue and the services of the distributed file system in parallel.
  • for example, if the service disk pool of the file server includes 32 service disks of which 4 are parity disks, and disks 1, 2, 3 and 4 have already failed, then when a fifth disk, for example disk 5, fails, the distributed file system loses system data. Therefore, when disk 5 fails, the file server can rescue the data in the faulty disk 5 to a new disk 33, add the new disk 33 to the rescue disk pool, and copy the data in the new disk 33 to the non-faulty disks in the service disk pool, so that the system data of the distributed file system is rescued from the data on the non-faulty disks in the service disk pool.
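  • Restated numerically (a trivial check of the example above, not an implementation):

```python
parity_disks = 4
failed = {1, 2, 3, 4}                  # four failures are still within the parity protection
assert len(failed) <= parity_disks
failed.add(5)                          # disk 5 fails as well
assert len(failed) > parity_disks      # system data would now be lost, so disk 5 is rescued to disk 33
```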
  • in addition, referring to FIG. 11, when the file server that performs the data rescue of the faulty disk and the file server that performs the system data rescue of the distributed file system are not at the same physical location, it is inconvenient to insert the new disk into the other file server at the other location; therefore, after step 103, the method provided by the embodiments of the present invention may further include:
  • 106. The file server sends the data in the new disk to another file server.
  • the file server can also send the data in the new disk to other file servers through the network, so as to perform system data rescue of the distributed file system through other file servers in other physical locations.
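  • A minimal sketch of shipping the rescued data to a remote file server; plain TCP streaming of the raw new-disk contents is an assumption here, since the text only says the data is sent over the network:

```python
import socket

def send_rescued_disk(new_disk_path: str, peer_host: str, peer_port: int, chunk_size: int = 1 << 20) -> None:
    """Stream the raw contents of the new (rescued) disk to another file server."""
    with socket.create_connection((peer_host, peer_port)) as connection, open(new_disk_path, "rb") as disk:
        while chunk := disk.read(chunk_size):
            connection.sendall(chunk)
```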
  • moreover, when the file server that performs the data rescue of the faulty disk and the file server that performs the system data rescue of the distributed file system are not at the same physical location, and the data rescue of the faulty disk has already been completed by the other file server, the other file server can send the rescued data of the faulty disk to the current file server over the network so that it is received onto the service disks in the service disk pool of the current file server, thereby rescuing the system data of the distributed file system. Accordingly, referring to FIG. 12, the data rescue method provided by the embodiments of the present invention may further include:
  • 107. The file server receives data sent by another file server.
  • 108. The file server saves the data to a service disk in the service disk pool.
  • the file server receives the data sent by the other file servers through steps 107-108, and saves the received data to the service disk in the service disk pool, thereby salvaging the system data in the distributed file system.
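  • A matching sketch of the receiving side, which writes whatever arrives onto a service disk in the service disk pool (again a bare TCP example; peer authentication, error handling and the choice of target service disk are omitted):

```python
import socket

def receive_into_service_disk(listen_port: int, service_disk_path: str, chunk_size: int = 1 << 20) -> None:
    """Accept one connection from another file server and save the received data to a service disk."""
    with socket.create_server(("", listen_port)) as server:
        connection, _address = server.accept()
        with connection, open(service_disk_path, "wb") as service_disk:
            while chunk := connection.recv(chunk_size):
                service_disk.write(chunk)
```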
  • it should be noted that receiving the data sent by other file servers through steps 107-108 and saving it to a service disk in the service disk pool, so as to rescue the system data in the distributed file system, has no fixed order with respect to the same file server performing the data rescue of the faulty disk through steps 101-106.
  • the embodiments of the invention provide a data rescue method. When the data in a faulty disk needs to be rescued, management metadata is created in a new disk, the node data blocks in the faulty disk are copied to the new disk, and the node data blocks copied from the faulty disk to the new disk are then mounted onto the management metadata of the new disk, so that the data in the faulty disk is restored. This avoids the complicated file parsing process of the prior art and improves the rescue speed of the faulty disk. It can therefore solve the prior-art problem that, when the data in a faulty disk is copied to a new disk, the copying is slow because file parsing is required, so that the faulty disk is rescued slowly.
  • referring to FIG. 13, when the data rescue of the faulty disk and the system data rescue of the distributed file system are performed by different file servers, and the first file server that performs the data rescue of the faulty disk and the second file server that performs the system data rescue of the distributed file system are not at the same physical location, the data rescue method provided by this embodiment of the present invention may include:
  • 201. When the data in a faulty disk needs to be rescued, the first file server creates management metadata in a new disk, where the management metadata is the management information of the new disk.
  • 202. The first file server copies the node data blocks in the faulty disk to the new disk, where a node data block includes a directory data block or a file data block.
  • 203. The first file server mounts the node data blocks copied from the faulty disk to the new disk onto the management metadata of the new disk.
  • for a detailed description of steps 201-203, refer to steps 101-103 in Embodiment 1; details are not repeated here.
  • 204. The first file server sends the data in the new disk to the second file server.
  • after rescuing the data of the faulty disk to the new disk, the first file server can send the data in the new disk over the network to the second file server, which is not at the same physical location, so that the system data rescue of the distributed file system is performed by the second file server.
  • 205. The second file server receives the data sent by the first file server.
  • the second file server receives the data in the new disk of the first file server, sent by the first file server.
  • 206. The second file server saves the data to a service disk in the service disk pool.
  • for a description of steps 205 and 206, refer to step 107 and step 108 in Embodiment 1; details are not repeated here.
  • the embodiments of the invention provide a data rescue method. When the data in a faulty disk needs to be rescued, management metadata is created in a new disk, the node data blocks in the faulty disk are copied to the new disk, and the node data blocks copied from the faulty disk to the new disk are then mounted onto the management metadata of the new disk, so that the data in the faulty disk is restored. This avoids the complicated file parsing process of the prior art and improves the rescue speed of the faulty disk. It can therefore solve the prior-art problem that, when the data in a faulty disk is copied to a new disk, the copying is slow because file parsing is required, so that the faulty disk is rescued slowly.
  • an embodiment of the present invention provides a file server 300. Referring to FIG. 14, the file server 300 may include:
  • the creating unit 301 is configured to create management metadata in a new disk when the data in a faulty disk needs to be rescued, where the management metadata is the management information of the new disk.
  • the copy unit 302 is configured to copy the node data block in the faulty disk to a new disk, and the node data block includes a directory data block or a file data block.
  • the mounting unit 303 is configured to mount the node data block copied from the failed disk to the new disk to the management metadata of the new disk.
  • the file server 300 is a server that performs disk management in a distributed file system; it can be used to rescue the data in a faulty disk that it manages to a new disk, and it can also provide normal service processing for users.
  • because the file server 300 provided by this embodiment of the present invention does not need to perform file parsing based on the management metadata in the faulty disk in order to copy the data in the faulty disk to the new disk, it avoids the complicated file parsing process of the prior art and improves the rescue speed of the faulty disk.
  • the management metadata may include root directory information and disk management information.
  • the node data block may include a descriptor and a data content, the storage space occupied by the descriptor is a preset number of bytes, the descriptor includes a start identifier and a first byte number, and the start identifier is located at a header of the descriptor for indicating The starting position of the node data block, the first byte number is used to indicate the number of bytes occupied by the data content.
  • the copy unit 302 can be specifically used to:
  • the descriptor of the preset number of bytes and the data content of the first byte number are copied to the new disk.
  • the descriptor may be an index node (inode), and the starting identifier may be a magic number.
  • the file server 300 may further include:
  • the first processing unit 304 can be configured to add a new disk to the rescue disk pool
  • the copy unit 302 can also be used to copy data in the new disk to the business disk in the service disk pool.
  • the file server 300 may further include:
  • the sending unit 305 can be configured to send data in the new disk to other file servers.
  • the file server 300 may further include:
  • the receiving unit 306 can be configured to receive data sent by another file server.
  • the second processing unit 307 can be configured to save the data received by the receiving unit 306 to the service disk in the service disk pool.
  • in the file server 300 provided by this embodiment of the present invention, the data rescue module can only manage the disks in the rescue disk pool and cannot manage the disks in the service disk pool, and the service data module can only manage the disks in the service disk pool and cannot manage the disks in the rescue disk pool.
  • that is, the rescue disk pool and the service disk pool are isolated from each other; therefore, the system data rescue and the services of the distributed file system run in parallel and do not interfere with each other.
  • An embodiment of the present invention provides a file server which, when the data in a faulty disk needs to be rescued, creates management metadata on a new disk, copies the node data blocks in the faulty disk to the new disk, and then mounts the node data blocks copied from the faulty disk to the new disk onto the management metadata of the new disk, so that the data in the faulty disk is restored. This avoids the complicated file parsing process of the prior art and improves the rescue speed of the faulty disk. It can therefore solve the prior-art problem that, when the data in a faulty disk is copied to a new disk, the copying is slow because file parsing is required, so that the faulty disk is rescued slowly.
  • the embodiment of the present invention provides a file server 400.
  • the file server 400 may include a processor 401, a memory 402, and a bus 403.
  • the bus 403 is used to connect the processor 401 and the memory 402.
  • the processor 401 executes the instructions in order to: create management metadata in a new disk when the data in a faulty disk needs to be rescued, where the management metadata is the management information of the new disk; copy the node data blocks in the faulty disk to the new disk, where a node data block includes a directory data block or a file data block; and mount the node data blocks copied from the faulty disk to the new disk onto the management metadata of the new disk.
  • the file server 400 here is a server that performs disk management in a distributed file system; it can be used to rescue the data in a faulty disk that it manages to a new disk, and it can also provide normal service processing for users.
  • because the file server 400 provided by this embodiment of the present invention does not need to perform file parsing based on the management metadata in the faulty disk in order to copy the data in the faulty disk to the new disk, it avoids the complicated file parsing process of the prior art and improves the rescue speed of the faulty disk.
  • the management metadata may include root directory information and disk management information; the node data block may include a descriptor and data content; the storage space occupied by the descriptor is a preset number of bytes; the descriptor includes a start identifier and a first byte number, where the start identifier is located at the header of the descriptor and indicates the start position of the node data block, and the first byte number indicates the number of bytes occupied by the data content.
  • executing the instruction by the processor 401 for copying the node data block in the faulty disk to the new disk may include:
  • the descriptor of the preset number of bytes and the data content of the first byte number are copied to the new disk.
  • the descriptor may be an index node (inode), and the starting identifier may be a magic number.
  • the processor 401 executing the instructions may further be configured to add the new disk to a rescue disk pool, and to copy the data in the new disk to a service disk in the service disk pool.
  • the file server 400 may further include:
  • the sender 404 can be used to send data in the new disk to other file servers.
  • the file server 400 may further include:
  • the receiver 405 can be configured to receive data sent by other file servers.
  • the processor 401 is further configured to save the data received by the receiver 405 to a service disk in the service disk pool.
  • in the file server 400 provided by this embodiment of the present invention, the data rescue module can only manage the disks in the rescue disk pool and cannot manage the disks in the service disk pool, and the service data module can only manage the disks in the service disk pool and cannot manage the disks in the rescue disk pool.
  • that is, the rescue disk pool and the service disk pool are isolated from each other; therefore, the system data rescue and the services of the distributed file system run in parallel and do not interfere with each other.
  • An embodiment of the present invention provides a file server which, when the data in a faulty disk needs to be rescued, creates management metadata on a new disk, copies the node data blocks in the faulty disk to the new disk, and then mounts the node data blocks copied from the faulty disk to the new disk onto the management metadata of the new disk, so that the data in the faulty disk is restored. This avoids the complicated file parsing process of the prior art and improves the rescue speed of the faulty disk. It can therefore solve the prior-art problem that, when the data in a faulty disk is copied to a new disk, the copying is slow because file parsing is required, so that the faulty disk is rescued slowly.
  • the disclosed file server and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative. The division into units is only a division by logical function; in an actual implementation there may be other ways of dividing them. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may be physically included separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
  • the above-described integrated unit implemented in the form of a software functional unit can be stored in a computer readable storage medium.
  • the software functional units described above are stored in a storage medium and include instructions for causing a computer device (which may be a personal computer, server, or network device, etc.) to perform some of the steps of the methods described in various embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data rescue method and a file server, relating to the field of storage technologies, capable of solving the prior-art problem that, when the data in a faulty disk is copied to a new disk, the copying is slow because file parsing is required, so that the faulty disk is rescued slowly. The specific solution is as follows: when the data in a faulty disk needs to be rescued, the file server creates management metadata in a new disk, the management metadata being the management information of the new disk (101); the file server copies the node data blocks in the faulty disk to the new disk, a node data block comprising a directory data block or a file data block (102); and the file server mounts the node data blocks copied from the faulty disk to the new disk onto the management metadata of the new disk (103).

Description

一种数据抢救方法及文件服务器 技术领域
本发明实施例涉及存储技术领域,尤其涉及一种数据抢救方法及文件服务器。
背景技术
在当今信息时代,通过文件系统存储和管理的数据成指数倍增长,为了有效地存储和管理数据,分布式文件系统将固定于一个物理地点的文件系统,扩展到多个物理地点,从而组成一个文件系统网络,并通过网络进行不同物理地点间的通信和数据传输。参见图1,每个物理地点的文件服务器可以管理一块或多块磁盘,其中的磁盘包括数据盘和校验盘。
分布式文件系统中的系统数据一般由数据磁盘和校验磁盘共同保存。当超出校验数量的磁盘出现故障时,分布式文件系统中的系统数据会丢失。此时,需要及时抢救故障盘,并根据已抢救的故障盘中的数据以及分布式文件系统中剩余的非故障盘中的数据,抢救分布式文件系统中的系统数据。
磁盘中的数据逻辑上可以划分为管理元数据、目录和文件三个层级,参见图2,其中的管理元数据是指磁盘的管理信息,用于提供关于资源或数据的一种结构化的数据,是对数据结构化的描述,例如可以包括根目录信息、磁盘空间管理信息等。当磁盘出现故障时,通常需要将故障盘中的数据扫描后拷贝到另一个新磁盘,以抢救故障盘中的数据。扫描过程需要不停地根据故障盘中的管理元数据解析目录与故障盘位置的对应关系,并根据目录解析文件与故障盘位置的对应关系,从而将文件拷贝至新磁盘。
上述复杂的文件解析过程使得数据拷贝速度慢,从而导致故障盘抢救速度慢,并且也进一步导致了分布式文件系统中的系统数据的抢救速度慢。
发明内容
本发明实施例提供一种数据抢救方法及文件服务器,能够解决现有技术中在将故障盘中的数据拷贝到新磁盘时,由于需要进行文件解析使得拷贝速度慢,从而导致故障盘抢救速度慢的问题。
为达到上述目的,本发明的实施例采用如下技术方案:
第一方面,提供一种数据抢救方法,包括:
当需要抢救故障盘中的数据时,在新磁盘中创建管理元数据,所述管理元数据为所述新磁盘的管理信息;
将所述故障盘中的节点数据块拷贝至所述新磁盘,所述节点数据块包括目录数据块或文件数据块;
将从所述故障盘拷贝至所述新磁盘的节点数据块挂载到所述新磁盘的管理元数据上。
结合第一方面,在第一方面的第一种可能的实现方式中,所述管理元数据包括根目录信息和磁盘管理信息。
结合第一方面或第一方面的第一种可能的实现方式,在第一方面的第二种可能的实现方式中,所述节点数据块包括描述符和数据内容,所述描述符占用的存储空间为预设字节数,所述描述符包括起始标识和第一字节数,所述起始标识位于所述描述符的头部,用于指示所述节点数据块的起始位置,所述第一字节数用于指示所述数据内容占用的字节数量。
结合第一方面的第二种可能的实现方式,在第一方面的第三种可能的实现方式中,所述将所述故障盘中的节点数据块拷贝至所述新磁盘包括:
在所述故障盘中查找所述起始标识;
从所述描述符的头部开始,将所述预设字节数的描述符和所述第一字节数的数据内容拷贝至所述新磁盘。
结合第一方面的第二或第三种可能的实现方式,在第一方面的第四种可能的实现方式中,所述描述符为索引节点inode。
结合第一方面的第四种可能的实现方式,在第一方面的第五种可能的实现方式中,所述起始标识为魔数字。
结合第一方面至第一方面的第五种可能的实现方式中的任意一种,在第一方面的第六种可能的实现方式中,所述方法还包括:
将所述新磁盘加入抢救磁盘池;
将所述新磁盘中的数据拷贝至所述业务磁盘池中的业务磁盘。
结合第一方面至第一方面的第五种可能的实现方式中的任意一种,在第一方面的第七种可能的实现方式中,所述方法还包括:
将所述新磁盘中的数据发送给其它文件服务器。
结合第一方面至第一方面的第七种可能的实现方式中的任意一种,在第一方面的第八种可能的实现方式中,所述方法还包括:
接收其它文件服务器发送的数据;
将所述数据保存至所述业务磁盘池中的业务磁盘。
第二方面,提供一种文件服务器,包括:
创建单元,用于当需要抢救故障盘中的数据时,在新磁盘中创建管理元数据,所述管理元数据为所述新磁盘的管理信息;
拷贝单元,用于将所述故障盘中的节点数据块拷贝至所述新磁盘,所述节点数据块包括目录数据块或文件数据块;
挂载单元,用于将从所述故障盘拷贝至所述新磁盘的节点数据块挂载到所述新磁盘的管理元数据上。
结合第二方面,在第二方面的第一种可能的实现方式中,所述管理元数据包括根目录信息和磁盘管理信息。
结合第二方面或第二方面的第一种可能的实现方式,在第二方面的第二种可能的实现方式中,所述节点数据块包括描述符和数据内容,所述描述符占用的存储空间为预设字节数,所述描述符包括起始标识和第一字节数,所述起始标识位于所述描述符的头部,用于指示所述节点数据块的起始位置,所述第一字节数用于指示所述数据内容占用的字节数量。
结合第二方面的第二种可能的实现方式,在第二方面的第三种可能的实现方式中,所述拷贝单元具体用于:
在所述故障盘中查找所述起始标识;
从所述描述符的头部开始,将所述预设字节数的描述符和所述第一字节数的数据内容拷贝至所述新磁盘。
结合第二方面的第二或第三种可能的实现方式,在第二方面的第四种可能的实现方式中,所述描述符为索引节点inode。
结合第二方面的第四种可能的实现方式中,在第二方面的第五种可能的实现方式中,所述起始标识为魔数字。
结合第二方面至第二方面的第五种可能的实现方式中的任意一种,在第 二方面的第六种可能的实现方式中,还包括:
第一处理单元,用于将所述新磁盘加入抢救磁盘池;
所述拷贝单元还用于,将所述新磁盘中的数据拷贝至所述业务磁盘池中的业务磁盘。
结合第二方面至第二方面的第五种可能的实现方式中的任意一种,在第二方面的第七种可能的实现方式中,还包括:
发送单元,用于将所述新磁盘中的数据发送给其它文件服务器。
结合第二方面至第二方面的第七种可能的实现方式中的任意一种,在第二方面的第八种可能的实现方式中,还包括:
接收单元,用于接收其它文件服务器发送的数据;
第二处理单元,用于将所述接收单元接收的所述数据保存至所述业务磁盘池中的业务磁盘。
本发明实施例提供一种数据抢救方法及文件服务器,在需要抢救故障盘中的数据时,通过在新磁盘中创建管理元数据,并将故障盘中的节点数据块拷贝至新磁盘,而后将从故障盘拷贝至新磁盘的节点数据块挂载到新磁盘的管理元数据上,从而将故障盘中的数据恢复至新磁盘,因而避免了现有技术中复杂的文件解析过程,提高了故障盘的抢救速度。因此,能够解决现有技术中在将故障盘中的数据拷贝到新磁盘时,由于需要进行文件解析使得拷贝速度慢,从而导致故障盘抢救速度慢的问题。
附图说明
为了更清楚地说明本发明实施例的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为本发明实施例提供的一种分布式文件系统结构示意图;
图2为本发明实施例提供的一种磁盘逻辑结构示意图;
图3为本发明实施例提供的一种数据抢救方法流程示意图;
图4为本发明实施例提供的一种磁盘管理结构示意图;
图5为本发明实施例提供的一种磁盘数据存放结构示意图;
图6为本发明实施例提供的一种将故障盘中的节点数据块拷贝至新磁盘的示意图;
图7为本发明实施例提供的一种节点数据块的结构示意图;
图8为本发明实施例提供的一种磁盘数据的逻辑结构示意图;
图9为本发明实施例提供的另一种数据抢救方法流程示意图;
图10为本发明实施例提供的一种业务磁盘池和抢救磁盘池分布示意图;
图11为本发明实施例提供的另一种数据抢救方法流程示意图;
图12为本发明实施例提供的另一种数据抢救方法流程示意图;
图13为本发明实施例提供的另一种数据抢救方法流程示意图;
图14为本发明实施例提供的一种文件服务器结构示意图;
图15为本发明实施例提供的另一种文件服务器结构示意图;
图16为本发明实施例提供的另一种文件服务器结构示意图;
图17为本发明实施例提供的另一种文件服务器结构示意图;
图18为本发明实施例提供的另一种文件服务器结构示意图;
图19为本发明实施例提供的另一种文件服务器结构示意图;
图20为本发明实施例提供的另一种文件服务器结构示意图。
具体实施方式
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。
本发明以下实施例提供的数据抢救方法,通过在新磁盘中创建管理元数据后,不进行文件解析过程而直接将故障盘中的节点拷贝至新磁盘,并将拷贝的节点挂载到新磁盘中的管理元数据上,从而将故障盘中的数据恢复至新磁盘,因而避免了现有技术中复杂的文件解析过程,从而提高了故障盘的抢救速度。
实施例1
本发明实施例提供一种数据抢救方法,参见图3,可以包括:
101、当需要抢救故障盘中的数据时,文件服务器在新磁盘中创建管理元 数据,管理元数据为新磁盘的管理信息。
其中,文件服务器是分布式文件系统中进行磁盘管理的服务器,可以用于抢救其管理的故障盘中的数据,进行分布式文件将其管理的故障盘的数据抢救至新磁盘,还可以为用户提供正常业务处理。当文件服务器管理的某一磁盘发生故障需要进行数据抢救时,文件服务器可以在另外一块新磁盘中创建管理元数据。其中的管理元数据可以包括根目录信息和磁盘空间管理信息等,用于描述信息资源或数据本身的特征和属性,规定数字化信息的组织,具有定位、发现、证明、评估、选择等功能。
实际上,对于本地文件系统或者对象文件系统来说,各个磁盘的管理结构是相同的,区别在于管理结构所管理的数据不同,因而可以在一个新磁盘上创建管理元数据从而建立新磁盘的管理结构,并将故障盘中的目录数据块和文件数据块拷贝至新磁盘中,通过管理元数据创建的管理结构进行管理,从而将故障盘中的数据恢复至新磁盘以抢救故障盘中的数据,而不用将故障盘中的管理元数据拷贝至新磁盘且不用根据故障盘中的管理元数据进行文件解析,而可以直接将故障盘中的管理元数据丢弃,从而将故障盘中的数据恢复至新磁盘。其中,磁盘中管理元数据对应的管理结构示意图可以参见图4。
需要说明的是,通过在新磁盘上创建元数据来恢复故障盘中的数据,可以避免现有技术中根据故障盘中的管理元数据进行文件解析,因而即使故障盘中的管理元数据损坏也不会影响故障盘中数据的抢救。
此外,现有技术中在进行故障盘抢救时,需要根据管理元数据进行复杂的文件解析过程,当管理元数据损坏时将导致文件解析失败,从而无法抢救故障盘中的数据。为此,故障盘中的管理元数据一般会事先备份于其它磁盘中,从而耗费了额外的磁盘空间,同时还额外增加了其它磁盘的读写操作,影响了其性能。另外,将管理元数据备份仅能解决故障盘由于物理故障导致的管理元数据损坏,当故障盘由于软件损坏导致管理元数据损坏时,则可能导致源管理元数据和备份至其它磁盘的管理元数据均是错误数据。
本发明实施例提供的数据抢救方法,通过丢弃故障盘中的管理元数据,并在新磁盘上创建管理元数据,还可以避免现有技术中对故障盘中管理元数据的备份,因而节省了磁盘空间,并且无论故障盘中的管理元数据是硬件损坏还是软件损坏,均不会影响故障盘中数据的抢救。
102、文件服务器将故障盘中的节点数据块拷贝至新磁盘,节点数据块包括目录数据块或文件数据块。
由于管理元数据、目录数据块和文件数据块随机存储于磁盘中的,参见图5,因而除了管理元数据以外,磁盘中的剩余数据为管理元数据建立的管理结构所管理的目录数据块和文件数据块,这里将目录数据块和文件数据块统称为节点数据块。
在新磁盘中创建管理元数据以后,不需要再根据故障盘中的管理元数据进行文件解析,从而按照文件完整路径解析的方式将故障盘中的数据拷贝至新磁盘中了,而可以绕过故障盘中的管理元数据和文件解析,直接将故障盘中的节点数据块拷贝至新磁盘中。其中,文件服务器通过步骤101和步骤102将故障盘中的节点数据块拷贝至新磁盘的示意图可以参见图6。
需要说明的是,在本步骤中,由于不需要根据故障盘中的管理元数据进行文件解析,从而将故障盘中的数据拷贝至新磁盘,因而避免了现有技术中复杂的文件解析过程,提高了故障盘的抢救速度。
其中,节点数据块可以包括描述符和数据内容,描述符占用的存储空间可以为预设字节数,描述符可以包括起始标识和第一字节数,起始标识位于描述符的头部,可以用于指示节点数据块的起始位置,第一字节数可以用于指示数据内容占用的字节数量。其中,描述字符占用的预设字节数具体可以根据需要进行设定。节点数据块的结构示意图可以参见图7。
具体的,文件服务器将故障盘中的节点数据块拷贝至新磁盘可以包括:
文件服务器在故障盘中查找起始标识;
文件服务器从描述符的头部开始,将预设字节数的描述符和第一字节数的数据内容拷贝至新磁盘。
示例性的,描述符可以为索引节点inode,起始标识可以为魔数字。即描述符索引节点,可以包括起始标识和第一字节数,起始标识可以为魔数字。文件服务器可以在故障盘中查找位于描述符头部的起始标识魔数字,并从魔数字的第一个字节开始将预设字节数的描述符和第一字节数的数据内容拷贝到新磁盘中,从而将节点数据块拷贝至新磁盘。
其中,文件服务器可以在故障盘中每查找到一个起始标识,就将该起始标识对应的节点数据块拷贝至新磁盘中;或者,文件服务器也可以扫描故障盘中 所有的起始标识,并根据该起始标识建立节点数据块列表,从而根据节点数据块列表将节点数据块拷贝至新磁盘,本发明实施例不做限定。
103、文件服务器将从故障盘拷贝至新磁盘的节点数据块挂载到新磁盘的管理元数据上。
文件服务器在将故障盘中的节点数据块拷贝至新磁盘后,参见图8所示的新磁盘中的数据的逻辑结构示意图,可以将新盘中的节点数据块挂载到新磁盘的管理元数据上,即将目录数据块和文件数据块交由管理结构管理,从而构成完整的磁盘存储系统。此时,可以根据新磁盘管理元数据进行文件解析从而正常读取新磁盘中的数据。实际上,新磁盘中的数据是抢救的故障盘中的数据。
在将节点数据块挂载到新磁盘中的管理元数据上之后,由于新磁盘中的管理的目录数据块和文件数据块均为故障盘中的目录数据块和文件数据块,因而此时已完全将故障盘中的数据恢复至新磁盘中,完成了故障盘的数据抢救。
进一步地,上述故障盘通常为分布式文件系统中的一个磁盘,在故障盘的数据抢救完成后,还可以根据已抢救的故障盘中的数据,继续抢救分布式文件系统中的系统数据。因而,参见图9,本发明实施例提供的数据抢救方法在步骤103之后,还可以包括:
104、文件服务器将新磁盘加入抢救磁盘池。
在新磁盘中创建管理元数据,将故障盘中的节点数据块拷贝至新磁盘并挂载在新磁盘的管理元数据上,从而将故障盘中的数据恢复至新磁盘,完成故障盘的数据抢救之后,还可以将该新磁盘加入数据抢救模块管理的抢救磁盘池及抢救磁盘列表,以进行分布式文件系统的系统数据抢救。其中,数据抢救模块用于管理抢救磁盘池中的磁盘,并不能管理其他磁盘。
值的注意的是,步骤104是以文件服务器将故障盘中的数据恢复至新磁盘并将新磁盘加入该文件服务器管理的抢救磁盘池,以进行分布式文件系统的系统数据抢救为例进行说明的,即通过同一文件服务器进行故障盘的数据抢救和进行系统数据抢救。当文件服务器1和文件服务器2位于同一物理地点时,通过文件服务器1将故障盘中的数据恢复至新磁盘,将该新磁盘插入文件服务器2,并加入文件服务器2的抢救磁盘池以进行分布式文件系统的系统数据抢救,是很容易想到的技术方案,也在本发明的保护范围内。
105、文件服务器将新磁盘中的数据拷贝至业务磁盘池中的业务磁盘。
文件服务器获取抢救磁盘池中的抢救磁盘列表,并根据该列表将恢复有故障盘数据的新磁盘中的数据,拷贝至业务数据模块管理的业务磁盘池中的业务磁盘,以抢救分布式文件系统中的系统数据。其中,业务数据模块用于管理业务磁盘池中的磁盘,并不能管理其他磁盘例如抢救磁盘池中的磁盘。
需要说明的是,在本发明实施例提供的数据抢救方法中,数据抢救模块只能管理抢救磁盘池中的磁盘,而不能管理业务磁盘池中的磁盘,且业务数据模块只能管理业务磁盘池中的磁盘,而不能管理抢救磁盘池中的磁盘,即参见图10,分布式文件系统中抢救磁盘池和业务磁盘池是相互隔离的,因而分布式文件系统的系统数据抢救和业务也是并行的,互不干扰。
而现有技术在进行数据抢救时需要暂停分布式文件系统的业务,否则将导致故障盘格式化或业务不一致,并且随着磁盘容量的扩增,故障盘的抢救时间越来越长,从而导致分布式文件系统的业务中断时间也越来越长,降低了分布式文件系统的处理性能。因而,本发明实施例提供的数据抢救方法通过将分布式文件系统的系统数据抢救和业务并行处理,可以提高分布式文件系统的处理性能。
示例性的,若文件服务器业务磁盘池中的业务磁盘包括32块磁盘,校验磁盘数量为4,且磁盘1、2、3、4已发生故障,当第5块盘例如磁盘5发生故障时,分布式文件系统会丢失系统数据。因而当磁盘5出现故障时,文件服务器可以将故障盘磁盘5中的数据抢救至新磁盘33中,并将新磁盘33加入抢救磁盘池,文件服务器将新磁盘33中的数据拷贝至业务磁盘池中的非故障磁盘中,从而根据业务磁盘池中的非故障磁盘中的数据抢救分布式文件系统的系统数据。
此外,参见图11,当进行故障盘的数据抢救的文件服务器和进行分布式系统的系统数据抢救的文件服务器不在同一物理地点时,不便于将新磁盘插入不在同一物理地点的其它文件服务器从而进行系统数据抢救,因而在步骤103之后,本发明实施例提供的方法还可以包括:
106、文件服务器将新磁盘中的数据发送给其它文件服务器。
在将故障盘的数据抢救至新磁盘之后,文件服务器还可以将新磁盘中的数据通过网络发送给其它文件服务器,以便于通过其它物理地点的其它文件服务器进行分布式文件系统的系统数据抢救。
另外,当进行故障盘的数据抢救的文件服务器和进行分布式系统的系统数据抢救的文件服务器不在同一物理地点,且已通过其它文件服务器完成了故障盘的数据抢救时,其它文件服务器还可以将已抢救的故障盘中的数据通过网络发送给当前文件服务器,以通过当前文件服务器业务磁盘池中的业务磁盘接收该数据,从而抢救分布式文件系统的系统数据。因而,参见图12,本发明实施例提供的数据抢救方法还可以包括:
107、文件服务器接收其它文件服务器发送的数据。
108、文件服务器将数据保存至业务磁盘池中的业务磁盘。
需要说明的是,文件服务器通过步骤107-108接收其它文件服务器发送的数据,并将接收到的数据保存至业务磁盘池中的业务磁盘,从而抢救分布式文件系统中的系统数据,与该同一文件服务器通过步骤101-106进行故障盘的数据抢救,并没有明确的先后关系。
本发明实施例提供一种数据抢救方法,在需要抢救故障盘中的数据时,通过在新磁盘中创建管理元数据,并将故障盘中的节点数据块拷贝至新磁盘,而后将从故障盘拷贝至新磁盘的节点数据块挂载到新磁盘的管理元数据上,从而恢复故障盘中的数据,因而避免了现有技术中复杂的文件解析过程,提高了故障盘的抢救速度。因此,能够解决现有技术中在将故障盘中的数据拷贝到新磁盘时,由于需要进行文件解析使得拷贝速度慢,从而导致故障盘抢救速度慢的问题。
实施例2
参见图13,当进行故障盘的数据抢救和分布式系统的系统数据抢救的文件服务器不是同一文件服务器,且进行故障盘数据抢救的第一文件服务器和进行分布式系统的系统数据抢救的第二文件服务器不在同一物理地点时,本发明实施例提供的一种数据抢救方法,可以包括:
201、当需要抢救故障盘中的数据时,第一文件服务器在新磁盘中创建管理元数据,管理元数据为新磁盘的管理信息。
202、第一文件服务器将故障盘中的节点数据块拷贝至新磁盘,节点数据块包括目录数据块或文件数据块。
203、第一文件服务器将从故障盘拷贝至新磁盘的节点数据块挂载到新磁盘的管理元数据上。
其中,步骤201-203的具体描述可以参见实施例1中的步骤101-103,这里不再赘述。
204、第一文件服务器将新磁盘中的数据发送给第二文件服务器。
第一文件服务器将故障盘的数据抢救至新磁盘后,可以通过网络将新磁盘中的数据发送给不在同一物理地点的第二文件服务器,以通过第二文件服务器进行分布式文件系统的系统数据抢救。
205、第二文件服务器接收第一文件服务器发送的数据。
第二文件服务器接收第一文件服务器发送的第一文件服务器的新磁盘中的数据。
206、第二文件服务器将数据保存至业务磁盘池中的业务磁盘。
其中,步骤205和步骤206的描述具体可以参见实施例1中的步骤107和步骤108,这里不再赘述。
本发明实施例提供一种数据抢救方法,在需要抢救故障盘中的数据时,通过在新磁盘中创建管理元数据,并将故障盘中的节点数据块拷贝至新磁盘,而后将从故障盘拷贝至新磁盘的节点数据块挂载到新磁盘的管理元数据上,从而恢复故障盘中的数据,因而避免了现有技术中复杂的文件解析过程,提高了故障盘的抢救速度。因此,能够解决现有技术中在将故障盘中的数据拷贝到新磁盘时,由于需要进行文件解析使得拷贝速度慢,从而导致故障盘抢救速度慢的问题。
实施例3
本发明实施例提供一种文件服务器300,参见图14,该文件服务器300可以包括:
创建单元301,用于当需要抢救故障盘中的数据时,在新磁盘中创建管理元数据,管理元数据为新磁盘的管理信息。
拷贝单元302,用于将故障盘中的节点数据块拷贝至新磁盘,节点数据块包括目录数据块或文件数据块。
挂载单元303,用于将从故障盘拷贝至新磁盘的节点数据块挂载到新磁盘的管理元数据上。
其中,文件服务器300是分布式文件系统中进行磁盘管理的服务器,可以 用于抢救其管理的故障盘中的数据,进行分布式文件将其管理的故障盘的数据抢救至新磁盘,还可以为用户提供正常业务处理。
本发明实施例提供的文件服务器300由于不需要根据故障盘中的管理元数据进行文件解析,从而将故障盘中的数据拷贝至新磁盘,因而避免了现有技术中复杂的文件解析过程,提高了故障盘的抢救速度。
其中,管理元数据可以包括根目录信息和磁盘管理信息。
节点数据块可以包括描述符和数据内容,描述符占用的存储空间为预设字节数,描述符包括起始标识和第一字节数,起始标识位于描述符的头部,用于指示节点数据块的起始位置,第一字节数用于指示数据内容占用的字节数量。
拷贝单元302可以具体用于:
在故障盘中查找起始标识;
从描述符的头部开始,将预设字节数的描述符和第一字节数的数据内容拷贝至新磁盘。
其中,描述符可以为索引节点inode,起始标识可以为魔数字。
此外,参见图15,文件服务器300还可以包括:
第一处理单元304,可以用于将新磁盘加入抢救磁盘池;
拷贝单元302还可以用于,将新磁盘中的数据拷贝至业务磁盘池中的业务磁盘。
另外,参见图16,文件服务器300还可以包括:
发送单元305,可以用于将新磁盘中的数据发送给其它文件服务器。
进一步地,参见图17,文件服务器300还可以包括:
接收单元306,可以用于接收其它文件服务器发送的数据;
第二处理单元307,可以用于将接收单元306接收的数据保存至业务磁盘池中的业务磁盘。
在本发明实施例提供的文件服务器300中,数据抢救模块只能管理抢救磁盘池中的磁盘,而不能管理业务磁盘池中的磁盘,且业务数据模块只能管理业务磁盘池中的磁盘,而不能管理抢救磁盘池中的磁盘,即抢救磁盘池和业务磁盘池是相互隔离的,因而分布式文件系统的系统数据抢救和业务也是并行的,互不干扰。
本发明实施例提供一种文件服务器,在需要抢救故障盘中的数据时,通过在新磁盘中创建管理元数据,并将故障盘中的节点数据块拷贝至新磁盘,而后将从故障盘拷贝至新磁盘的节点数据块挂载到新磁盘的管理元数据上,从而恢复故障盘中的数据,因而避免了现有技术中复杂的文件解析过程,提高了故障盘的抢救速度。因此,能够解决现有技术中在将故障盘中的数据拷贝到新磁盘时,由于需要进行文件解析使得拷贝速度慢,从而导致故障盘抢救速度慢的问题。
实施例4
本发明实施例提供一种文件服务器400,参见图18,该文件服务器400可以包括处理器401、存储器402和总线403,其中,该总线403用于连接处理器401和存储器402,该存储器402用于存储指令和数据,该处理器401执行该指令用于当需要抢救故障盘中的数据时,在新磁盘中创建管理元数据,管理元数据为新磁盘的管理信息,将故障盘中的节点数据块拷贝至新磁盘,节点数据块包括目录数据块或文件数据块,并将从故障盘拷贝至新磁盘的节点数据块挂载到新磁盘的管理元数据上。
这里的文件服务器400是分布式文件系统中进行磁盘管理的服务器,可以用于抢救其管理的故障盘中的数据,进行分布式文件将其管理的故障盘的数据抢救至新磁盘,还可以为用户提供正常业务处理。
本发明实施例提供的文件服务器400由于不需要根据故障盘中的管理元数据进行文件解析,从而将故障盘中的数据拷贝至新磁盘,因而避免了现有技术中复杂的文件解析过程,提高了故障盘的抢救速度。
其中,管理元数据可以包括根目录信息和磁盘管理信息,节点数据块可以包括描述符和数据内容,描述符占用的存储空间为预设字节数,描述符包括起始标识和第一字节数,起始标识位于描述符的头部,用于指示节点数据块的起始位置,第一字节数用于指示数据内容占用的字节数量。
可选的,处理器401执行该指令用于将故障盘中的节点数据块拷贝至新磁盘可以包括:
在故障盘中查找起始标识;
从描述符的头部开始,将预设字节数的描述符和第一字节数的数据内容拷贝至新磁盘。
其中,描述符可以为索引节点inode,起始标识可以为魔数字。
进一步地,处理器401执行该指令还可以用于,将新磁盘加入抢救磁盘池;
将新磁盘中的数据拷贝至业务磁盘池中的业务磁盘。
此外,参见图19,文件服务器400还可以包括:
发送器404,可以用于将新磁盘中的数据发送给其它文件服务器。
另外,参见图20,文件服务器400还可以包括:
接收器405,可以用于接收其它文件服务器发送的数据。
处理器401还可以用于,将接收器405接收到的数据保存至业务磁盘池中的业务磁盘。
在本发明实施例提供的文件服务器400中,数据抢救模块只能管理抢救磁盘池中的磁盘,而不能管理业务磁盘池中的磁盘,且业务数据模块只能管理业务磁盘池中的磁盘,而不能管理抢救磁盘池中的磁盘,即抢救磁盘池和业务磁盘池是相互隔离的,因而分布式文件系统的系统数据抢救和业务也是并行的,互不干扰。
本发明实施例提供一种文件服务器,在需要抢救故障盘中的数据时,通过在新磁盘中创建管理元数据,并将故障盘中的节点数据块拷贝至新磁盘,而后将从故障盘拷贝至新磁盘的节点数据块挂载到新磁盘的管理元数据上,从而恢复故障盘中的数据,因而避免了现有技术中复杂的文件解析过程,提高了故障盘的抢救速度。因此,能够解决现有技术中在将故障盘中的数据拷贝到新磁盘时,由于需要进行文件解析使得拷贝速度慢,从而导致故障盘抢救速度慢的问题。
在本申请所提供的几个实施例中,应该理解到,所揭露的文件服务器和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为 单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本发明各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理包括,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。
上述以软件功能单元的形式实现的集成的单元,可以存储在一个计算机可读取存储介质中。上述软件功能单元存储在一个存储介质中,包括若干指令用于使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本发明各个实施例所述方法的部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,简称ROM)、随机存取存储器(Random Access Memory,简称RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
最后应说明的是:以上实施例仅用于说明本发明的技术方案,而非对其限制;尽管参照前述实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的精神和范围。

Claims (18)

  1. 一种数据抢救方法,其特征在于,包括:
    当需要抢救故障盘中的数据时,在新磁盘中创建管理元数据,所述管理元数据为所述新磁盘的管理信息;
    将所述故障盘中的节点数据块拷贝至所述新磁盘,所述节点数据块包括目录数据块或文件数据块;
    将从所述故障盘拷贝至所述新磁盘的节点数据块挂载到所述新磁盘的管理元数据上。
  2. 根据权利要求1所述的方法,其特征在于,所述管理元数据包括根目录信息和磁盘管理信息。
  3. 根据权利要求1或2所述的方法,其特征在于,所述节点数据块包括描述符和数据内容,所述描述符占用的存储空间为预设字节数,所述描述符包括起始标识和第一字节数,所述起始标识位于所述描述符的头部,用于指示所述节点数据块的起始位置,所述第一字节数用于指示所述数据内容占用的字节数量。
  4. 根据权利要求3所述的方法,其特征在于,所述将所述故障盘中的节点数据块拷贝至所述新磁盘包括:
    在所述故障盘中查找所述起始标识;
    从所述描述符的头部开始,将所述预设字节数的描述符和所述第一字节数的数据内容拷贝至所述新磁盘。
  5. 根据权利要求3或4所述的方法,其特征在于,所述描述符为索引节点inode。
  6. 根据权利要求5所述的方法,其特征在于,所述起始标识为魔数字。
  7. 根据权利要求1-6任一项所述的方法,其特征在于,所述方法还包括:
    将所述新磁盘加入抢救磁盘池;
    将所述新磁盘中的数据拷贝至所述业务磁盘池中的业务磁盘。
  8. 根据权利要求1-6任一项所述的方法,其特征在于,所述方法还包括:
    将所述新磁盘中的数据发送给其它文件服务器。
  9. 根据权利要求1-8任一项所述的方法,其特征在于,所述方法还包括:
    接收其它文件服务器发送的数据;
    将所述数据保存至所述业务磁盘池中的业务磁盘。
  10. 一种文件服务器,其特征在于,包括:
    创建单元,用于当需要抢救故障盘中的数据时,在新磁盘中创建管理元数据,所述管理元数据为所述新磁盘的管理信息;
    拷贝单元,用于将所述故障盘中的节点数据块拷贝至所述新磁盘,所述节点数据块包括目录数据块或文件数据块;
    挂载单元,用于将从所述故障盘拷贝至所述新磁盘的节点数据块挂载到所述新磁盘的管理元数据上。
  11. 根据权利要求10所述的文件服务器,其特征在于,所述管理元数据包括根目录信息和磁盘管理信息。
  12. 根据权利要求10或11所述的文件服务器,其特征在于,所述节点数据块包括描述符和数据内容,所述描述符占用的存储空间为预设字节数,所述描述符包括起始标识和第一字节数,所述起始标识位于所述描述符的头部,用于指示所述节点数据块的起始位置,所述第一字节数用于指示所述数据内容占用的字节数量。
  13. 根据权利要求12所述的文件服务器,其特征在于,所述拷贝单元具体用于:
    在所述故障盘中查找所述起始标识;
    从所述描述符的头部开始,将所述预设字节数的描述符和所述第一字节数的数据内容拷贝至所述新磁盘。
  14. 根据权利要求12或13所述的文件服务器,其特征在于,所述描述符为索引节点inode。
  15. 根据权利要求14所述的文件服务器,其特征在于,所述起始标识为魔数字。
  16. 根据权利要求10-15任一项所述的文件服务器,其特征在于,还包括:
    第一处理单元,用于将所述新磁盘加入抢救磁盘池;
    所述拷贝单元还用于,将所述新磁盘中的数据拷贝至所述业务磁盘池中的业务磁盘。
  17. 根据权利要求10-15任一项所述的文件服务器,其特征在于,还包括:
    发送单元,用于将所述新磁盘中的数据发送给其它文件服务器。
  18. 根据权利要求10-17任一项所述的文件服务器,其特征在于,还包括:
    接收单元,用于接收其它文件服务器发送的数据;
    第二处理单元,用于将所述接收单元接收的数据保存至所述业务磁盘池中的业务磁盘。
PCT/CN2016/098862 2015-09-30 2016-09-13 一种数据抢救方法及文件服务器 WO2017054643A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510641030.5A CN105159790B (zh) 2015-09-30 2015-09-30 一种数据抢救方法及文件服务器
CN201510641030.5 2015-09-30

Publications (1)

Publication Number Publication Date
WO2017054643A1 true WO2017054643A1 (zh) 2017-04-06

Family

ID=54800652

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/098862 WO2017054643A1 (zh) 2015-09-30 2016-09-13 一种数据抢救方法及文件服务器

Country Status (2)

Country Link
CN (1) CN105159790B (zh)
WO (1) WO2017054643A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105159790B (zh) * 2015-09-30 2018-03-16 成都华为技术有限公司 一种数据抢救方法及文件服务器
CN110389855B (zh) * 2018-04-19 2021-12-28 浙江宇视科技有限公司 磁带库数据校验方法、装置、电子设备和可读存储介质
CN110989929A (zh) * 2019-11-22 2020-04-10 浪潮电子信息产业股份有限公司 一种mon服务迁移方法、装置、设备及可读存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102024016A (zh) * 2010-11-04 2011-04-20 天津曙光计算机产业有限公司 一种分布式文件系统快速数据恢复的方法
CN102081559A (zh) * 2011-01-11 2011-06-01 成都市华为赛门铁克科技有限公司 一种独立磁盘冗余阵列的数据恢复方法和装置
CN103534688A (zh) * 2013-05-29 2014-01-22 华为技术有限公司 数据恢复方法、存储设备和存储系统
US20140229763A1 (en) * 2013-01-22 2014-08-14 Tencent Technology (Shenzhen) Company Limited Disk fault tolerance method, device and system
CN105159790A (zh) * 2015-09-30 2015-12-16 成都华为技术有限公司 一种数据抢救方法及文件服务器

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3072722B2 (ja) * 1997-06-20 2000-08-07 ソニー株式会社 フラッシュメモリを用いるデータ管理装置及びデータ管理方法並びにフラッシュメモリを用いる記憶媒体
KR101801147B1 (ko) * 2011-08-30 2017-11-27 삼성전자주식회사 데이터 신뢰성을 개선하는 데이터 관리 방법 및 그에 따른 데이터 저장 장치

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102024016A (zh) * 2010-11-04 2011-04-20 天津曙光计算机产业有限公司 一种分布式文件系统快速数据恢复的方法
CN102081559A (zh) * 2011-01-11 2011-06-01 成都市华为赛门铁克科技有限公司 一种独立磁盘冗余阵列的数据恢复方法和装置
US20140229763A1 (en) * 2013-01-22 2014-08-14 Tencent Technology (Shenzhen) Company Limited Disk fault tolerance method, device and system
CN103534688A (zh) * 2013-05-29 2014-01-22 华为技术有限公司 数据恢复方法、存储设备和存储系统
CN105159790A (zh) * 2015-09-30 2015-12-16 成都华为技术有限公司 一种数据抢救方法及文件服务器

Also Published As

Publication number Publication date
CN105159790A (zh) 2015-12-16
CN105159790B (zh) 2018-03-16

Similar Documents

Publication Publication Date Title
US11163653B2 (en) Storage cluster failure detection
US9250824B2 (en) Backing up method, device, and system for virtual machine
US20190163591A1 (en) Remote Data Replication Method and System
CN101539873B (zh) 数据恢复的方法、数据节点及分布式文件系统
JP6264666B2 (ja) データ格納方法、データストレージ装置、及びストレージデバイス
CN106776130B (zh) 一种日志恢复方法、存储装置和存储节点
CN105814544B (zh) 用于支持分布式数据网格中的持久化分区恢复的系统和方法
JP2005242403A (ja) 計算機システム
CN108733311B (zh) 用于管理存储系统的方法和设备
WO2017132790A1 (zh) 数据恢复方法及存储设备
JP2004334574A (ja) ストレージの運用管理プログラム、運用管理方法及び管理計算機
CN107315659B (zh) 一种元数据的冗余备份方法及装置
CN109491609B (zh) 一种缓存数据处理方法、装置、设备及可读存储介质
CN111552437A (zh) 一种应用于分布式存储系统的快照方法及快照装置
CN102314503A (zh) 一种索引方法
WO2018113484A1 (zh) 一种多副本数据恢复方法及装置
WO2015085529A1 (zh) 数据复制方法、数据复制装置和存储设备
JP2005182683A (ja) データ転送方法及びシステム並びにプログラム
US11934280B2 (en) Use of cluster-level redundancy within a cluster of a distributed storage management system to address node-level errors
WO2017054643A1 (zh) 一种数据抢救方法及文件服务器
CN109407975B (zh) 写数据方法与计算节点以及分布式存储系统
CN105740049B (zh) 一种控制方法及装置
WO2018081960A1 (zh) 管理文件的方法、文件系统和服务器系统
CN113986450A (zh) 一种虚拟机备份方法及装置
JP6376626B2 (ja) データ格納方法、データストレージ装置、及びストレージデバイス

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16850267

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16850267

Country of ref document: EP

Kind code of ref document: A1