WO2010148988A1 - Method, device and system for taking over fault metadata server - Google Patents


Info

Publication number
WO2010148988A1
Authority
WO
WIPO (PCT)
Prior art keywords
local
failure
take over
image
metadata
Prior art date
Application number
PCT/CN2010/074042
Other languages
French (fr)
Chinese (zh)
Inventor
程菊生
徐涛
陈浩
钟吉林
Original Assignee
成都市华为赛门铁克科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 成都市华为赛门铁克科技有限公司
Publication of WO2010148988A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention provide a method, device and system for taking over a faulty metadata server. The method is applied to a metadata server cluster in which each metadata server comprises a local metadata tree and a neighbor metadata tree; the local metadata tree manages the local file system, and the neighbor metadata tree corresponds to the local metadata tree of a neighbor metadata server. The method includes the following steps: while the neighbor metadata server corresponding to the local metadata server is working normally, the local metadata server mirrors the local metadata tree of the neighbor metadata server in real time through its neighbor metadata tree; when the neighbor metadata server fails, the local metadata server takes over the failed neighbor metadata server by managing the neighbor metadata tree that has been mirrored in real time. The technical solution in the embodiments improves the reliability of the metadata servers and thereby the reliability of the whole distributed file storage system.

Description

Method, device and system for taking over a faulty metadata server
This application claims priority to Chinese patent application No. 200910150732.8, filed with the Chinese Patent Office on June 24, 2009 and entitled "Method, device and system for taking over a faulty metadata server", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of distributed file storage, and in particular to a method, device and system for taking over a faulty metadata server (Metadata Server, MDS).
Background
In recent years, distributed file systems have developed rapidly and are widely used in storage solutions. Because a distributed file system manages a very large number of files, on the order of hundreds of millions or even billions, dedicated metadata servers are needed to manage them. With the growth of storage scale and the development of cloud storage (a system that uses cluster applications, grid technology or distributed file systems to make a large number of storage devices of different types in a network work together through application software and jointly provide data storage and service access to the outside), a single metadata server can no longer meet user needs, and a cluster composed of multiple metadata servers is needed to manage the metadata.
In implementing the present invention, the inventors found at least the following problem in the prior art. A traditional distributed file system is divided into three parts: clients, metadata servers, and object storage servers (Object Storage Server, OSS). The metadata servers manage all metadata of the system. If a metadata server fails, the distributed file system cannot work. As the distributed file system grows, the number of metadata servers increases, the probability that a metadata server fails increases with it, and the reliability of the system becomes difficult to guarantee. How to improve the reliability of the metadata servers has therefore become a key issue in distributed file storage.
Summary of the Invention
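Embodiments of the present invention provide a method, device and system for taking over a faulty metadata server, so as to improve the reliability of the metadata servers and thereby the reliability of the whole distributed file storage system.
In one aspect, an embodiment of the present invention provides a method for taking over a faulty metadata server, applied to a metadata server cluster (MDS Cluster). Each metadata server includes a local metadata tree and a neighbor metadata tree; the local metadata tree is used to manage the local file system, and the neighbor metadata tree corresponds to the local metadata tree of a neighbor metadata server. The method includes: when the neighbor metadata server corresponding to the local metadata server works normally, the local metadata server mirrors, in real time and through the neighbor metadata tree, the local metadata tree in the neighbor metadata server, so as to back up the local file system in the neighbor metadata server; when the neighbor metadata server fails, the local metadata server manages the neighbor metadata tree that has been mirrored in real time, so as to take over the failed neighbor metadata server.
In another aspect, an embodiment of the present invention provides a device for taking over a faulty metadata server, applied to a metadata server cluster in which each metadata server includes a local metadata tree and a neighbor metadata tree; the local metadata tree is used to manage the local file system, and the neighbor metadata tree corresponds to the local metadata tree of a neighbor metadata server. The device includes: a real-time mirroring unit, configured to mirror, in real time and through the neighbor metadata tree, the local metadata tree in the neighbor metadata server when the neighbor metadata server corresponding to the local metadata server works normally, so as to back up the local file system in the neighbor metadata server; and a fault takeover unit, configured to manage the mirrored neighbor metadata tree when the neighbor metadata server fails, so as to take over the failed neighbor metadata server.
In yet another aspect, an embodiment of the present invention provides a system for taking over a faulty metadata server. The system includes: a client, configured to access the metadata of the metadata servers in a metadata server cluster; the metadata servers in the metadata server cluster, each of which includes a local metadata tree and a neighbor metadata tree, where the local metadata tree is used to manage the local file system and the neighbor metadata tree corresponds to the local metadata tree of a neighbor metadata server, and each of which is configured to mirror the local metadata tree in its neighbor metadata server in real time through the neighbor metadata tree while the neighbor works normally, so as to back up the local file system in the neighbor metadata server, and to manage the mirrored neighbor metadata tree when the neighbor metadata server fails, so as to take over the failed neighbor metadata server; and a storage server, which serves as the storage medium of the metadata server cluster and is configured to store the metadata of the metadata servers.
The above technical solution has the following beneficial effects: because the local metadata server mirrors the local metadata tree of the neighbor metadata server in real time while the neighbor works normally, and manages the mirrored neighbor metadata tree to take over the neighbor when it fails, the reliability of the metadata servers is improved, and with it the reliability of the whole distributed file storage system.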
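Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Figure 1 is a flowchart of a method for taking over a faulty metadata server according to an embodiment of the present invention;
Figure 2 is a schematic diagram of the forest file system according to an embodiment of the present invention;
Figure 3 is a schematic diagram of operation under normal conditions according to an embodiment of the present invention;
Figure 5 is a schematic diagram of operation when a metadata server fails and is taken over according to an embodiment of the present invention;
Figure 6 is a schematic diagram of a device for taking over a faulty metadata server according to an embodiment of the present invention;
Figure 7 is a schematic diagram of the composition of a system for taking over a faulty metadata server according to an embodiment of the present invention;
Figure 8 is a schematic diagram of the normal situation in a cluster according to an application example of the present invention;
Figure 9 is a schematic diagram of metadata server failure and takeover in a cluster according to an application example of the present invention.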
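Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.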
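Figure 1 is a flowchart of a method for taking over a faulty metadata server according to an embodiment of the present invention. The method is applied to a metadata server cluster in which each metadata server includes a local metadata tree and a neighbor metadata tree; the local metadata tree is used to manage the local file system, and the neighbor metadata tree corresponds to the local metadata tree of a neighbor metadata server. The method includes the following steps:
S101: when the neighbor metadata server corresponding to the local metadata server works normally, the local metadata server mirrors, in real time and through the neighbor metadata tree, the local metadata tree in the neighbor metadata server, so as to back up the local file system in the neighbor metadata server.
S102: when the neighbor metadata server fails, the local metadata server manages the neighbor metadata tree that has been mirrored in real time, so as to take over the failed neighbor metadata server.
Optionally, the method further includes generating a forest framework, which stores the distribution of the local metadata trees and neighbor metadata trees in the metadata server cluster and the relationships between the local metadata trees and the neighbor metadata trees.
While the neighbor metadata server corresponding to the local metadata server works normally, the attribute of the neighbor metadata tree is "mirror" and the attribute of the local metadata tree is "read-write". When the neighbor metadata server fails, the "mirror" attribute of the neighbor metadata tree is changed to "read-write".
Optionally, when the neighbor metadata server has failed and does not recover for a long time, the neighbor metadata tree is converted into a local metadata tree, which manages the local file system of the failed metadata server. In that case the method further includes generating a second neighbor metadata tree, which corresponds to the local metadata tree of a second neighbor metadata server and is used to mirror and back up the second neighbor metadata server; the local metadata tree of the second neighbor metadata server is the one that was originally mirrored by the neighbor metadata tree of the failed metadata server.
Optionally, after the fault of the neighbor metadata server is cleared, the local metadata server and the other metadata servers in the cluster are restored to the state they have when all servers work normally.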
The technical solution of the above method embodiment has the following beneficial effects: because the local metadata server mirrors the metadata of the neighbor metadata server in real time while the neighbor works normally, and manages the mirrored neighbor metadata tree to take over the neighbor when it fails, the reliability of the metadata servers is improved.
To take over faulty metadata servers and improve their reliability, the embodiments of the present invention use a "forest file system" to organize and manage metadata. The forest file system is a global file system which, as shown in Figure 2, includes a forest framework (Framework) and metadata trees (Metadata Tree).
The forest framework records the distribution of all metadata trees in the system and the relationships between them (parent-child, sibling, mirror, etc.). It also assigns a local file system to each metadata tree. One forest framework can manage multiple metadata trees.
A metadata tree is a subtree of the forest file system and is made up of the local file system of a metadata server. Each metadata tree is managed by one metadata server. Operations on metadata, such as reads and writes, are all completed through the metadata tree.
The operation attribute of a metadata tree can be one of the following:
(1) "Read-write" (Read-Write): read, write and other operations are allowed.
(2) "Read-only" (Read-Only): only reads are allowed; writes and other operations are not.
(3) "Mirror-only": only mirroring is allowed, for backing up metadata; read, write and other operations are not. Mirroring here means copying the data of a metadata tree in real time.
Depending on where it is stored, a metadata tree is either a local metadata tree (Local Metadata Tree) or a neighbor metadata tree (Neighbor Metadata Tree).
A local metadata tree is a metadata tree that its metadata server stores locally.
A neighbor metadata tree arises because the data of a metadata server is not only stored locally but also mirrored to a neighbor metadata server: a metadata tree held on one MDS is the mirror of another metadata tree. When a local metadata tree fails, the neighbor metadata tree takes over. Under normal conditions a neighbor metadata tree has only the "mirror" attribute and no "read-write" attribute. After takeover, the neighbor metadata tree is converted into a local metadata tree and has all the functions of a local metadata tree.
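The three operation attributes and the attribute change on takeover can be illustrated with a minimal Python sketch. The names `TreeAttr`, `MetadataTree`, `apply_mirror` and `take_over` are hypothetical and do not come from the patent, and the in-memory dictionary only stands in for a real local file system.

```python
from dataclasses import dataclass, field
from enum import Enum


class TreeAttr(Enum):
    """Operation attributes of a metadata tree (names are illustrative)."""
    READ_WRITE = "read-write"
    READ_ONLY = "read-only"
    MIRROR = "mirror"


@dataclass
class MetadataTree:
    tree_id: int
    attr: TreeAttr
    entries: dict = field(default_factory=dict)  # path -> metadata record

    def read(self, path):
        # A mirror-only tree serves no client reads.
        if self.attr is TreeAttr.MIRROR:
            raise PermissionError("mirror tree cannot be read by clients")
        return self.entries[path]

    def write(self, path, meta):
        # Only a read-write tree accepts client writes.
        if self.attr is not TreeAttr.READ_WRITE:
            raise PermissionError("tree is not read-write")
        self.entries[path] = meta

    def apply_mirror(self, path, meta):
        # Real-time mirroring: only a mirror tree accepts replicated updates.
        if self.attr is not TreeAttr.MIRROR:
            raise PermissionError("tree is not a mirror")
        self.entries[path] = meta

    def take_over(self):
        # On neighbor failure the "mirror" attribute becomes "read-write".
        self.attr = TreeAttr.READ_WRITE
```

In this sketch a neighbor tree refuses client reads and writes until `take_over()` is called, after which it behaves like a local tree holding the data it mirrored.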
Figure 2 is a schematic diagram of the forest file system according to an embodiment of the present invention. In Figure 2 there are three metadata trees, numbered 1, 2 and 3, each managed by one metadata server: MDS-1, MDS-2 and MDS-3 respectively. For redundant backup and fault takeover, the metadata servers can be arranged in a ring: every metadata server holds two metadata trees, one local metadata tree and one neighbor metadata tree. Each local metadata tree corresponds to exactly one neighbor metadata tree, held on the adjacent metadata server and mirrored in real time; no local metadata tree has zero or more than one neighbor metadata tree. The local and neighbor metadata trees of the metadata servers in Figure 2 are listed in Table 1.
Table 1: Local and neighbor metadata trees of each metadata server
Metadata server   Local metadata tree   Neighbor metadata tree
MDS-1             1                     3
MDS-2             2                     1
MDS-3             3                     2
As Table 1 shows, the local metadata tree in MDS-1 is tree 1 and its neighbor metadata tree is tree 3. Tree 3 is also the local metadata tree in MDS-3, so the neighbor metadata tree in MDS-1 corresponds to the local metadata tree in MDS-3. If MDS-3 fails, MDS-1 manages its neighbor metadata tree (corresponding to tree 3) to take over the failed MDS-3.
Note that the above is the preferred scheme of this embodiment. In practice, multiple neighbor metadata trees (for example two or three) can also be used to manage multiple metadata trees; the management method can follow the corresponding steps of this embodiment.
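The ring arrangement in Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `build_ring` and `takeover_server` are hypothetical, tree numbers are assumed to equal server numbers, and each server is assumed to mirror the tree of its predecessor in the ring, as Table 1 suggests.

```python
def build_ring(server_ids):
    """Assign each server its own tree as local tree and the tree of the
    previous server in the ring as neighbor (mirror) tree."""
    n = len(server_ids)
    return {
        sid: {"local": sid, "neighbor": server_ids[(i - 1) % n]}
        for i, sid in enumerate(server_ids)
    }


def takeover_server(ring, failed):
    """Return the server whose neighbor tree mirrors the failed server's
    local tree; that server takes over."""
    failed_tree = ring[failed]["local"]
    for sid, trees in ring.items():
        if sid != failed and trees["neighbor"] == failed_tree:
            return sid
    raise LookupError(f"no server mirrors tree {failed_tree}")


ring = build_ring([1, 2, 3])
print(ring[1])                   # {'local': 1, 'neighbor': 3}
print(takeover_server(ring, 3))  # 1
```

With three servers this reproduces Table 1, and a failure of MDS-3 is handled by MDS-1, matching the example in the text.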
As shown in Figure 3, the flow for taking over a faulty metadata server in this embodiment includes the following steps:
S301: generate the forest framework.
The forest framework records the distribution of all metadata trees in the system and the relationships between all metadata trees (parent-child, sibling, mirror, etc.). It is a file system built on top of the local file systems; it has a root, directories and branches, but is not responsible for managing or operating on files. Each metadata server can be associated with one tree, as shown in Table 2.
Table 2
Metadata server   Path
MDS1              /home
MDS2              / s
MDS3              /ec
MDS4              /v
In Table 2, the branch of metadata server MDS1 is "/home", that is, the path that leads to MDS1 is "/home"; the paths of the other metadata servers can be read from Table 2 in the same way. Note that one directory can correspond to one metadata server.
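One way to picture the path table kept by the forest framework is a mapping from directory prefix to metadata server. The sketch below is an assumption-laden illustration: the function `resolve` is hypothetical, longest-prefix matching is a design choice made here rather than something the patent specifies, and apart from "/home" the prefix used ("/srv") is a placeholder, not an entry from Table 2.

```python
def resolve(framework, path):
    """Return the metadata server responsible for `path`, choosing the
    longest directory prefix registered in the framework."""
    best = None
    for prefix, server in framework.items():
        base = prefix.rstrip("/")
        # Match the prefix itself or anything below it, on a path boundary.
        if path == prefix or path.startswith(base + "/"):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, server)
    if best is None:
        raise KeyError(path)
    return best[1]


# Hypothetical framework table; only "/home" appears in the source text.
framework = {"/home": "MDS1", "/srv": "MDS2"}
print(resolve(framework, "/home/alice/report.txt"))  # MDS1
```

Matching on a path boundary keeps "/homework" from being routed to the server that owns "/home".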
S302: generate the metadata trees.
(1) Generating a local metadata tree
Figure 4 is a schematic diagram of metadata trees under the forest framework of the forest file system according to an embodiment of the present invention. A metadata tree is a subtree of the forest file system and is in essence the local file system on a metadata server. Each metadata tree is managed by one metadata server. Operations on metadata, such as reads and writes, are all completed through the metadata tree. A metadata tree is a local file system, made up of the local root inode, index nodes (inode) and data blocks (data block); the only difference is that this local file system is one branch of the forest file system.
(2) Generating a neighbor metadata tree
A neighbor metadata tree is a mirror of the local metadata tree of another metadata server. While the metadata servers work normally, a neighbor metadata tree has only the "mirror" attribute and is used only for mirroring; it has no "read-write" attribute.
For which local metadata tree each neighbor metadata tree corresponds to, refer to Table 1.
In addition, if a local metadata tree becomes too large, the forest framework can create another metadata tree on a new metadata server and then migrate some of the directories of the local metadata tree to the new tree, forming the local metadata tree of the new metadata server. A neighbor metadata tree can be generated for the new server in the same way.
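S303: while the neighbor metadata server corresponding to the local metadata server works normally, mirror the local metadata tree of the neighbor metadata server in real time through the neighbor metadata tree.
For example, in the system shown in Figure 2, the neighbor metadata tree in MDS-1 mirrors the local metadata tree in MDS-3, the neighbor metadata tree in MDS-2 mirrors the local metadata tree in MDS-1, and the neighbor metadata tree in MDS-3 mirrors the local metadata tree in MDS-2. Through mirroring, the local file system of the neighbor metadata server is backed up.
S304: when the neighbor metadata server fails, manage the neighbor metadata tree to take over the failed neighbor metadata server.
The neighbor metadata server is the metadata server to which the neighbor metadata tree corresponds. For example, in Figure 2, for MDS-1 the neighbor metadata server is the one corresponding to its neighbor metadata tree, namely metadata server 3 (MDS-3).
When the neighbor metadata server fails, for example when metadata server 3 (MDS-3) fails, the "mirror" attribute of the neighbor metadata tree is changed to "read-write". MDS-1 can then perform operations on metadata tree 3 and manage the metadata of metadata server 3 on its behalf. Optionally, the flow may further include the following step: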
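S305: when the neighbor metadata server does not recover for a long time, convert the neighbor metadata tree into a local metadata tree, and generate a new neighbor metadata tree to back up the second neighbor metadata server.
For example, with reference to Figures 2 and 5, if MDS-1 fails and does not recover for a long time, the neighbor metadata tree on MDS-2, which mirrors the local metadata tree of MDS-1, is converted into a second local metadata tree of MDS-2 (local tree-2) and takes over the management of MDS-1's metadata.
At the same time, new neighbor metadata trees are generated to back up the second neighbor metadata server. For example, MDS-2 generates a new mirror of the local metadata tree of MDS-3 (which was originally mirrored by MDS-1), and MDS-3 generates a new mirror of local tree-2 of MDS-2.
Optionally, the flow may further include the following step:
S306: after the fault of the neighbor metadata server is cleared, restore the local metadata server and the other metadata servers in the cluster.
If the failed metadata server returns to normal, the metadata servers are restored to the state they have when all servers work normally; for example, the state shown in Figure 5 is restored to the state shown in Figure 2.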
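The rearrangement described for a long-term failure can be sketched as follows. This is one reading of the MDS-1 example, not a definitive algorithm: the function `promote_and_remirror` and the dictionary layout are hypothetical, and the rule for choosing which surviving server mirrors the promoted tree (the server that already mirrors the takeover server's own tree) is an assumption drawn from that single example.

```python
def promote_and_remirror(cluster, failed):
    """cluster maps server name -> {"local": [tree ids], "neighbor": [tree ids]}.
    Removes the failed server, promotes its mirrored tree on the takeover
    server, and re-creates the mirrors that were lost. Returns the taker."""
    lost = cluster.pop(failed)
    # The takeover server is the one holding a mirror of the failed server's tree.
    taker = next(
        name for name, trees in cluster.items()
        if set(lost["local"]) & set(trees["neighbor"])
    )
    own_before = list(cluster[taker]["local"])
    # Promote: the mirrored tree becomes an additional local tree.
    promoted = [t for t in cluster[taker]["neighbor"] if t in lost["local"]]
    cluster[taker]["neighbor"] = [
        t for t in cluster[taker]["neighbor"] if t not in promoted
    ]
    cluster[taker]["local"].extend(promoted)
    # Trees that were mirrored on the failed server need a new mirror.
    for t in lost["neighbor"]:
        if t not in cluster[taker]["local"] and t not in cluster[taker]["neighbor"]:
            cluster[taker]["neighbor"].append(t)
    # The promoted tree needs a mirror too: use the server that already
    # mirrors the taker's original local tree, if there is one.
    backup = next(
        (name for name, trees in cluster.items()
         if name != taker and set(own_before) & set(trees["neighbor"])),
        None,
    )
    if backup is not None:
        cluster[backup]["neighbor"].extend(promoted)
    return taker
```

Applied to the ring of Table 1 with MDS-1 failed, MDS-2 ends up with local trees 2 and 1 and a mirror of tree 3, while MDS-3 keeps local tree 3 and mirrors trees 2 and 1, so every tree still has exactly one mirror.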
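This embodiment proposes a method of taking over a faulty metadata server by means of mirroring: while the neighbor metadata server works normally, the neighbor metadata tree of the local metadata server mirrors the local metadata tree of the neighbor metadata server; when the neighbor metadata server fails, the neighbor metadata tree of the local metadata server takes over. The embodiment proposes a metadata management method based on a forest file system with local and neighbor metadata trees. The forest file system defines the metadata servers, the metadata trees, and the relationships between metadata servers and metadata trees and between local and neighbor metadata trees. The beneficial effect of this technical solution is that the reliability of the metadata of the distributed file system is greatly improved. If a metadata server in the cluster fails, the system can use a neighbor metadata tree to take over the failed server and so maintain system availability, provided that multiple metadata servers do not fail at the same time (in practice, the probability of simultaneous failures is very small). Under that condition, even in the extreme case where every metadata server except the last one has failed, the distributed file system remains available. This method greatly improves the reliability of the system.
Figure 6 is a schematic diagram of a device for taking over a faulty metadata server according to an embodiment of the present invention. The device is applied to a metadata server cluster in which each metadata server includes a local metadata tree and a neighbor metadata tree; the local metadata tree is used to manage the local file system, and the neighbor metadata tree corresponds to the local metadata tree of a neighbor metadata server. The device 70 includes:
a real-time mirroring unit 701, configured to mirror, in real time and through the neighbor metadata tree, the local metadata tree in the neighbor metadata server when the neighbor metadata server corresponding to the local metadata server works normally, so as to back up the local file system in the neighbor metadata server; and
a fault takeover unit 702, configured to manage the neighbor metadata tree that has been mirrored in real time when the neighbor metadata server fails, so as to take over the failed neighbor metadata server.
Optionally, the device 70 further includes a forest framework generation unit, configured to generate a forest framework that stores the distribution of the local metadata trees and neighbor metadata trees in the metadata server cluster and the relationships between the local metadata trees and the neighbor metadata trees.
The real-time mirroring unit is further configured so that, while the neighbor metadata server corresponding to the local metadata server works normally, the attribute of the neighbor metadata tree is "mirror" and the attribute of the local metadata tree is "read-write", and so that, when the neighbor metadata server fails, the "mirror" attribute of the neighbor metadata tree is changed to "read-write".
Optionally, the fault takeover unit 702 is further configured to convert the neighbor metadata tree into a local metadata tree when the neighbor metadata server has failed and does not recover for a long time, so as to manage the local file system of the failed metadata server. The fault takeover unit 702 is further configured to generate a second neighbor metadata tree, which corresponds to the local metadata tree of a second neighbor metadata server and is used to mirror and back up the second neighbor metadata server; the local metadata tree of the second neighbor metadata server is the one that was originally mirrored by the neighbor metadata tree of the failed metadata server.
The device 70 further includes a fault recovery unit, configured to restore the local metadata server and the other metadata servers in the cluster to their normal state after the fault is cleared.
The technical solution of the above device embodiment improves the reliability of the metadata servers and thereby the reliability of the distributed file storage system.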
Figure 7 is a schematic diagram of the composition of a system for taking over a faulty metadata server according to an embodiment of the present invention. The system includes:
a client 81, configured to access the metadata of the metadata servers in a metadata server cluster 82 and the data stored in a storage server 83;
the metadata server cluster 82, in which each metadata server includes a local metadata tree and a neighbor metadata tree, where the local metadata tree is used to manage the local file system and the neighbor metadata tree corresponds to the local metadata tree of a neighbor metadata server; each metadata server is configured to mirror the local metadata tree in its neighbor metadata server in real time through the neighbor metadata tree while the neighbor works normally, so as to back up the local file system in the neighbor metadata server, and to manage the mirrored neighbor metadata tree when the neighbor metadata server fails, so as to take over the failed neighbor metadata server; and
the storage server 83, which serves as the storage medium of the metadata server cluster and is configured to store the metadata of the metadata servers.
The following describes the system in more concrete terms.
1) Normal situation
Figure 8 is a schematic diagram of the normal situation in a cluster according to an application example of the present invention. The distributed storage system of this embodiment is divided into three parts: clients, the metadata server cluster, and storage servers. The metadata server cluster consists of multiple metadata servers. Each metadata server is responsible for one storage server.
2) Failure and takeover
Figure 9 is a schematic diagram of metadata server failure and takeover in a cluster according to an application example of the present invention. When a metadata server in the cluster fails, the system applies the method proposed in the embodiments and the neighbor metadata server takes over from the failed one. Because the neighbor metadata server already holds all of the latest metadata and directories of the failed server, it only needs to change the attribute of the corresponding neighbor metadata tree from "mirror-only" to "read-write" to take over the failed metadata server, and the system remains available.
3) Fault clearance and data recovery
Figure 10 is a schematic diagram of fault clearance and data recovery in a cluster according to an application example of the present invention. Once the metadata server fault has been cleared and the server has rejoined the cluster, the system applies the method proposed in the embodiments and synchronizes the latest data on the neighbor metadata server to the new node. The new node establishes its local metadata tree and, at the same time, mirrors the local metadata tree of its neighbor metadata server to itself as its neighbor metadata tree.
The technical solution of the above system embodiment improves the reliability of the metadata servers and thereby the reliability of the distributed file storage system. A person of ordinary skill in the art will understand that all or part of the steps of the above method embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, performs all or part of the above steps; the storage medium may be, for example, ROM/RAM.
The specific embodiments above describe the purpose, technical solutions and beneficial effects of the present invention in further detail. It should be understood that they are merely specific embodiments and are not intended to limit the scope of protection of the present invention; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

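1. A method for taking over a faulty metadata server, characterized in that the method is applied to a metadata server cluster, each metadata server includes a local metadata tree and a neighbor metadata tree, the local metadata tree is used to manage a local file system, and the neighbor metadata tree corresponds to the local metadata tree of a neighbor metadata server; the method includes:
when the neighbor metadata server corresponding to a local metadata server works normally, mirroring, by the local metadata server in real time through the neighbor metadata tree, the local metadata tree in the neighbor metadata server, so as to back up the local file system in the neighbor metadata server; and
when the neighbor metadata server fails, managing, by the local metadata server, the neighbor metadata tree that has been mirrored in real time, so as to take over the failed neighbor metadata server.
2. The method according to claim 1, characterized in that the method further includes generating a forest framework that stores the distribution of the local metadata trees and neighbor metadata trees in the metadata server cluster and the relationships between the local metadata trees and the neighbor metadata trees.
3. The method according to claim 2, characterized in that:
when the neighbor metadata server corresponding to the local metadata server works normally, the attribute of the neighbor metadata tree is "mirror" and the attribute of the local metadata tree is "read-write"; and
when the neighbor metadata server fails, the "mirror" attribute of the neighbor metadata tree is changed to "read-write".
4. The method according to claim 3, characterized in that the method further includes:
when the neighbor metadata server has failed and does not recover for a long time, converting the neighbor metadata tree into a local metadata tree to manage the local file system of the failed metadata server; and
generating a second neighbor metadata tree, where the second neighbor metadata tree corresponds to the local metadata tree of a second neighbor metadata server and is used to mirror and back up the second neighbor metadata server, and the local metadata tree of the second neighbor metadata server is the one originally mirrored by the neighbor metadata tree of the failed metadata server.
5. The method according to claim 1, characterized in that, after the fault of the neighbor metadata server is cleared, the local metadata server and the other metadata servers in the cluster are restored to their normal state.
6. A device for taking over a faulty metadata server, characterized in that the device is applied to a metadata server cluster, each metadata server includes a local metadata tree and a neighbor metadata tree, the local metadata tree is used to manage a local file system, and the neighbor metadata tree corresponds to the local metadata tree of a neighbor metadata server; the device includes:
a real-time mirroring unit, configured to mirror, in real time and through the neighbor metadata tree, the local metadata tree in the neighbor metadata server when the neighbor metadata server corresponding to the local metadata server works normally, so as to back up the local file system in the neighbor metadata server; and
a fault takeover unit, configured to manage the neighbor metadata tree that has been mirrored in real time when the neighbor metadata server fails, so as to take over the failed neighbor metadata server.
7. The device according to claim 6, characterized in that the device further includes:
a forest framework generation unit, configured to generate a forest framework, where the forest framework stores the distribution of the local metadata trees and neighbor metadata trees in the metadata server cluster and the relationships between the local metadata trees and the neighbor metadata trees.
8. The device according to claim 7, characterized in that the real-time mirroring unit is further configured so that:
when the neighbor metadata server corresponding to the local metadata server works normally, the attribute of the neighbor metadata tree is "mirror" and the attribute of the local metadata tree is "read-write"; and
when the neighbor metadata server fails, the "mirror" attribute of the neighbor metadata tree is changed to "read-write".
9. The device according to claim 8, characterized in that:
the fault takeover unit is further configured to convert the neighbor metadata tree into a local metadata tree when the neighbor metadata server has failed and does not recover for a long time, so as to manage the local file system of the failed metadata server; and
the fault takeover unit is further configured to generate a second neighbor metadata tree, where the second neighbor metadata tree corresponds to the local metadata tree of a second neighbor metadata server and is used to mirror and back up the second neighbor metadata server, and the local metadata tree of the second neighbor metadata server is the one originally mirrored by the neighbor metadata tree of the failed metadata server.
10. The device according to claim 6, characterized in that the device further includes:
a fault recovery unit, configured to restore the local metadata server and the other metadata servers in the cluster to their normal state after the fault is cleared.
11. A system for taking over a faulty metadata server, characterized in that the system includes:
a client, configured to access the metadata of the metadata servers in a metadata server cluster;
the metadata servers in the metadata server cluster, each of which includes a local metadata tree and a neighbor metadata tree, where the local metadata tree is used to manage a local file system and the neighbor metadata tree corresponds to the local metadata tree of a neighbor metadata server, and each of which is configured to mirror the local metadata tree in its neighbor metadata server in real time through the neighbor metadata tree when the neighbor metadata server works normally, so as to back up the local file system in the neighbor metadata server, and to manage the mirrored neighbor metadata tree when the neighbor metadata server fails, so as to take over the failed neighbor metadata server; and
a storage server, which serves as the storage medium of the metadata server cluster and is configured to store the metadata of the metadata servers.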
PCT/CN2010/074042 2009-06-24 2010-06-18 Method, device and system for taking over fault metadata server WO2010148988A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200910150732.8 2009-06-24
CN2009101507328A CN101577735B (en) 2009-06-24 2009-06-24 Method, device and system for taking over fault metadata server

Publications (1)

Publication Number Publication Date
WO2010148988A1 (en) 2010-12-29

Family

ID=41272521

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2010/074042 WO2010148988A1 (en) 2009-06-24 2010-06-18 Method, device and system for taking over fault metadata server

Country Status (2)

Country Link
CN (1) CN101577735B (en)
WO (1) WO2010148988A1 (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101577735B (en) * 2009-06-24 2012-04-25 成都市华为赛门铁克科技有限公司 Method, device and system for taking over fault metadata server
CN102523105B (en) * 2011-11-30 2014-03-26 广东电子工业研究院有限公司 Failure recovery method of data storage and applied data distribution framework thereof
CN102523114A (en) * 2011-12-15 2012-06-27 深圳市同洲视讯传媒有限公司 Media server disaster recovery method, media access gateway and system
CN102546776B (en) * 2011-12-27 2014-10-22 北京中科大洋科技发展股份有限公司 Method for realizing off-line reading files in SAN (Storage Area Networking) shared file system
WO2014008652A1 (en) * 2012-07-12 2014-01-16 华为技术有限公司 Metadata management method and device
CN104104648A (en) * 2013-04-02 2014-10-15 杭州信核数据科技有限公司 Storage device data visiting method, application server and network
CN103605584A (en) * 2013-10-22 2014-02-26 芜湖大学科技园发展有限公司 Method for mirroring metadata in electric power metadata management platform
US9720779B2 (en) * 2014-11-27 2017-08-01 Institute For Information Industry Backup system and backup method thereof
CN104994168B (en) * 2015-07-14 2018-05-01 苏州科达科技股份有限公司 Distributed storage method and distributed memory system
CN107528872B (en) * 2016-06-22 2020-07-24 杭州海康威视数字技术股份有限公司 Data recovery method and device and cloud storage system
CN106446197B (en) * 2016-09-30 2019-11-19 华为数字技术(成都)有限公司 A kind of date storage method, apparatus and system
CN106533754A (en) * 2016-11-08 2017-03-22 北京交通大学 Fault diagnosis method and expert system for college teaching servers
CN107402870B (en) * 2017-07-31 2020-10-16 苏州浪潮智能科技有限公司 Method and device for processing log segment in metadata server
CN107729178A (en) * 2017-09-28 2018-02-23 郑州云海信息技术有限公司 A kind of Metadata Service process takes over method and device
CN108880906A (en) * 2018-07-06 2018-11-23 郑州云海信息技术有限公司 A kind of fault recovery method of Metadata Service, server, client and system
CN111159786B (en) * 2019-12-29 2022-04-22 浪潮电子信息产业股份有限公司 Metadata protection method and device, electronic equipment and storage medium
CN111176898A (en) * 2019-12-29 2020-05-19 浪潮电子信息产业股份有限公司 Distributed file system MDS (maintenance description Server) fault switching method, device, equipment and medium
CN111639114A (en) * 2020-04-07 2020-09-08 北京邮电大学 Distributed data fusion management system based on Internet of things platform

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060112140A1 (en) * 2004-11-19 2006-05-25 Mcbride Gregory E Autonomic data caching and copying on a storage area network aware file system using copy services
CN101059807A (en) * 2007-01-26 2007-10-24 华中科技大学 Method and system for promoting metadata service reliability
US20090138444A1 (en) * 2007-11-22 2009-05-28 Electronics And Telecommunications Research Institute Method of searching metadata servers
CN101577735A (en) * 2009-06-24 2009-11-11 成都市华为赛门铁克科技有限公司 Method, device and system for taking over fault metadata server

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100571281C (en) * 2007-06-29 2009-12-16 清华大学 Great magnitude of data hierarchical storage method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU, YULING ET AL.: "Design and Implementation of Two-level Metadata Server in Small-Scale Cluster File system", WUHAN UNIVERSITY JOURNAL OF NATURAL SCIENCES, vol. 11, no. 6, 31 December 2006 (2006-12-31), pages 1939 - 1942 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408393A (en) * 2014-12-08 2015-03-11 张君 RFID label reading processed signal transmitting method directed towards bottled liquid food production
CN106027634A (en) * 2016-05-16 2016-10-12 白杨 Baiyang message port switch service
CN106027634B (en) * 2016-05-16 2019-06-04 白杨 Message port Exchange Service system

Also Published As

Publication number Publication date
CN101577735A (en) 2009-11-11
CN101577735B (en) 2012-04-25

Similar Documents

Publication Publication Date Title
WO2010148988A1 (en) Method, device and system for taking over fault metadata server
US20210064476A1 (en) Backup of partitioned database tables
US11036591B2 (en) Restoring partitioned database tables from backup
US11093356B2 (en) Automated self-healing database system and method for implementing the same
US11327949B2 (en) Verification of database table partitions during backup
US10942812B2 (en) System and method for building a point-in-time snapshot of an eventually-consistent data store
US11704207B2 (en) Methods and systems for a non-disruptive planned failover from a primary copy of data at a primary storage system to a mirror copy of the data at a cross-site secondary storage system without using an external mediator
US9703853B2 (en) System and method for supporting partition level journaling for synchronizing data in a distributed data grid
WO2020248507A1 (en) Container cloud-based system resource monitoring method and related device
EP3564835A1 (en) Data redistribution method and apparatus, and database cluster
WO2017050254A1 (en) Hot backup method, device and system
US11934670B2 (en) Performing various operations at the granularity of a consistency group within a cross-site storage solution
CN102761528A (en) System and method for data management
US20120278429A1 (en) Cluster system, synchronization controlling method, server, and synchronization controlling program
US20190386875A1 (en) Methods for managing storage virtual machine configuration changes in a distributed storage system and devices thereof
US10067841B2 (en) Facilitating n-way high availability storage services
CN101686261A (en) RAC-based redundant server system
CN103780433B (en) Self-healing type virtual resource configuration management data architecture
KR100970212B1 (en) Method and System of Dual Authentication Service for Measuring Obstacle using Dynamic Query Switch of Heterogeneous DB
CN201491023U (en) Redundancy server structure based on RAC
CN108369548A (en) The disaster recovery of cloud resource

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 10791494; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 10791494; Country of ref document: EP; Kind code of ref document: A1)