WO2022053033A1 - Active-active storage system and method for processing data thereof - Google Patents
- Publication number
- WO2022053033A1 (PCT/CN2021/117843)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- storage device
- file
- node
- data
- virtual
Classifications
- All under G—PHYSICS; G06—COMPUTING, CALCULATING OR COUNTING; G06F—ELECTRIC DIGITAL DATA PROCESSING:
- G06F3/065—Replication mechanisms
- G06F3/0617—Improving the reliability of storage systems in relation to availability
- G06F3/0619—Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
- G06F3/0626—Reducing size or complexity of storage systems
- G06F3/0643—Management of files
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
- G06F11/1469—Backup restoration techniques
- G06F2201/815—Virtual
- G06F2201/84—Using snapshots, i.e. a logical point-in-time copy of the data
Definitions
- the present application relates to the field of storage, and in particular, to a dual-active storage system and a method for processing data.
- the embodiments of the present application relate to a network storage cluster, such as a Network Attached Storage (NAS) cluster.
- in existing active-active implementations, when the first storage device receives write data, it writes the data locally and at the same time synchronizes it to the peer storage device as backup data, so that when the first storage device fails or is disconnected from the second storage device, the second storage device can use the backup data to take over the service of the first storage device and ensure that the service is not interrupted. This, however, is only active-active in the Active-Passive sense; a true Active-Active mode cannot be realized.
- the present application provides a dual-active storage system and a method for implementing the dual-active storage system, which are used to realize dual-active in Active-Active mode, so that storage devices in the dual-active storage system can access data in the same file system.
- a first aspect of the present application provides an active-active storage system.
- the active-active storage system includes a first storage device and a second storage device.
- the first storage device is configured to receive data of the first file sent by the client cluster to the file system, store the data of the first file, and send data of the first copy of the data of the first file to the second storage device.
- the second storage device is configured to receive the data of the second file sent by the client cluster to the file system, store the data of the second file, and send the second copy data of the second file to the first storage device.
- since both the first storage device and the second storage device can store file data through the same file system and can back up the file data of the peer, a dual-active storage system in Active-Active mode is realized.
- Traditional NAS devices also have file systems, but two storage devices in Active-Passive mode each have an independent file system. Both independent file systems occupy the computing/storage resources of their storage device, resulting in low resource utilization efficiency; they are also more complicated to manage, which is not true active-active.
- the first storage device and the second storage device have the same file system, which can improve resource utilization efficiency and reduce management complexity.
- whichever storage device a client sends an access request to, the request is served by the same file system; therefore, access efficiency for the client is also improved.
- the active-active storage system further includes a virtual node set, the virtual node set includes a plurality of virtual nodes, each virtual node is allocated computing resources, and the computing resources come from a physical node in the first storage device or the second storage device.
- the physical node may be a control node of the first storage device and the second storage device, or may be a CPU in the control node, or a core in the CPU.
- a virtual node is a logical concept that acts as a medium for resource allocation to achieve isolation of computing resources in the system. Under this resource management method, each virtual node is allocated independent computing resources, so the computing resources used by files/directories corresponding to different virtual nodes are also independent. Therefore, capacity expansion or reduction of the active-active storage system is facilitated, a lock-free mechanism between computing resources is enabled, and complexity is reduced.
- the active-active storage system further includes a management device, configured to create a global view that records the correspondence between each virtual node and the computing resources allocated to it; the management device is further configured to send the global view to the first storage device and the second storage device, which save the global view.
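As a minimal sketch of the global view just described, the correspondence between virtual nodes and their allocated computing resources can be pictured as a lookup table; all identifiers below (VNodeA, physical_node_A, device names) are illustrative assumptions, not taken from the patent.

```python
# Hypothetical global view: each virtual node maps to the physical node
# (and owning storage device) whose computing resources back it.
GLOBAL_VIEW = {
    "VNodeA": {"device": "storage_device_600", "node": "physical_node_A"},
    "VNodeB": {"device": "storage_device_600", "node": "physical_node_B"},
    "VNodeC": {"device": "storage_device_700", "node": "physical_node_C"},
    "VNodeD": {"device": "storage_device_700", "node": "physical_node_D"},
}

def resolve(vnode: str) -> str:
    """Look up the physical node whose computing resources back a virtual node."""
    return GLOBAL_VIEW[vnode]["node"]
```

Both storage devices hold an identical copy of this table, which is what lets either device forward a request for any virtual node.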
- the management device can be a software module installed on the first storage device or the second storage device, or an independent device.
- when the management device is a software module installed on the first storage device, after generating the global view it sends the global view to the first storage device and the second storage device for storage by interacting with other modules in the storage devices.
- the virtual nodes in the virtual node set are presented, through the global view, to the applications in the first storage device and the second storage device; an application can thus use the physical resources of the peer as if they were local resources, which facilitates interaction with peer physical nodes.
- when storing the data of the first file, the first storage device determines the first virtual node corresponding to the first file according to the address of the data of the first file, determines the computing resources allocated to the first virtual node according to the first virtual node and the global view, sends the data to the physical node corresponding to those computing resources, and the physical node stores the data of the first file in its memory.
- the first storage device can receive data of a file belonging to the physical node corresponding to any virtual node in the virtual node set and forward the received data to the physical node to which the file belongs. In this way, when writing data, the user does not need to know the actual storage location of the file and can operate on the file through any storage device.
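The routing step above can be sketched as follows; hashing the file address to pick a virtual node is an illustrative placement policy, not necessarily the patented method, and the node names are assumptions.

```python
import hashlib

# Global view as an ordered list of physical nodes, one per virtual node.
GLOBAL_VIEW = ["physical_node_A", "physical_node_B",
               "physical_node_C", "physical_node_D"]

def route_write(file_address: str) -> str:
    """Map a file address to a virtual node, then to its backing physical node."""
    digest = hashlib.sha256(file_address.encode()).hexdigest()
    vnode_index = int(digest, 16) % len(GLOBAL_VIEW)  # virtual node for the file
    return GLOBAL_VIEW[vnode_index]                   # forward the write here
```

Because every device shares the same view, `route_write` gives the same answer regardless of which storage device received the request.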
- the first virtual node has at least one backup virtual node, and the physical node corresponding to the first virtual node and the physical node corresponding to the backup virtual node are located in different storage devices.
- after determining the first virtual node corresponding to the first file, the first storage device further determines the backup virtual node corresponding to the first virtual node, determines the physical node corresponding to the backup virtual node according to the backup virtual node and the global view, and sends the first copy data to that physical node, which stores the first copy data.
- when the first storage device fails, its service can be taken over by means of the backup data, thereby improving the reliability of the system.
- the files and directories included in the file system are distributed among physical nodes corresponding to multiple virtual nodes in the virtual node set.
- distributing the files and directories of the file system among the physical nodes corresponding to multiple virtual nodes in the virtual node set specifically means distributing the files and directories of the file system to multiple physical nodes for processing.
- the physical resources of the first storage device and the second storage device can be fully utilized, and the file processing efficiency can be improved.
- each virtual node in the virtual node set is set with one or more fragment identifiers, and each directory and file in the file system is allocated a fragment identifier.
- the physical nodes in the first storage device and the second storage device distribute the directories and files to physical nodes corresponding to the virtual nodes to which the fragment identifiers belong according to the fragment identifiers of each directory and file.
- the files and directories included in the file system can be more conveniently distributed to the physical nodes of the first storage device and the second storage device by using the fragment identification.
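A sketch of fragment-identifier-based distribution, under assumptions: a directory/file name is hashed to a fragment (shard) identifier, and a shard view maps that identifier to the virtual node that handles the object. CRC32 hashing and round-robin shard assignment are illustrative choices not specified by the text.

```python
import zlib

SHARD_COUNT = 4096  # the preset shard number used in the embodiment
VNODES = ("VNodeA", "VNodeB", "VNodeC", "VNodeD")

# Shard view: every shard identifier belongs to exactly one virtual node.
SHARD_VIEW = {s: VNODES[s % len(VNODES)] for s in range(SHARD_COUNT)}

def fragment_id(name: str) -> int:
    """Fragment identifier allocated to a directory or file."""
    return zlib.crc32(name.encode()) % SHARD_COUNT

def owner_vnode(name: str) -> str:
    """Virtual node (hence physical node) to which the object is distributed."""
    return SHARD_VIEW[fragment_id(name)]
```

Any physical node can compute the same owner for a given name, which is what lets directories and files be spread across both storage devices without a central lookup service.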
- the first physical node in the first storage device is configured to receive a creation request for the first file, select a fragment identifier for the first file from the one or more fragment identifiers set for the virtual node corresponding to the first physical node, and create the first file in the storage device.
- when the second storage device fails or the link between the first storage device and the second storage device is disconnected, the first storage device is further configured to restore the second file based on the second copy data of the second file and take over the service sent by the client cluster to the second storage device.
- the service of the failed storage device can be taken over by means of the backup data, thereby improving the reliability of the system.
- the first storage device is further configured to delete the virtual node corresponding to the computing resource of the second storage device from the global view.
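The takeover step above can be sketched as a filter over the global view: when the peer device fails, the surviving device removes every virtual node whose computing resources came from the failed device (the dict layout is an illustrative assumption).

```python
def drop_failed_device(global_view: dict, failed_device: str) -> dict:
    """Remove virtual nodes backed by the failed storage device's resources."""
    return {vnode: info for vnode, info in global_view.items()
            if info["device"] != failed_device}
```

After this update, requests previously routed to the peer's virtual nodes are served locally from the restored backup copies.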
- the first storage device further has a first file system
- the second storage device further has a second file system
- Running both the local file system and the clustered file system on the same storage device provides users with multiple ways to access data in the storage device.
- a second aspect of the present application provides a method for implementing an active-active file system; the method includes steps for implementing each function performed by the first storage device and the second storage device in the active-active storage system provided in the first aspect.
- a third aspect of the present application provides a management device, configured to create a global view that records the correspondence between each virtual node and its allocated computing resources, and to send the global view to the first storage device and the second storage device for storage.
- the management device is used to monitor changes of the virtual nodes in the first storage device and the second storage device; when it detects that a new virtual node is added to the virtual cluster, or that a virtual node is deleted (for example, because the physical node corresponding to the virtual node fails), it updates the global view.
- the monitoring module can monitor changes of virtual nodes in the virtual node cluster in real time, so as to update the global view in time.
- a fourth aspect of the present application provides a storage medium for storing program instructions, where the program instructions are used to implement various functions provided by the management device provided in the third aspect.
- Figure 1 is an architecture diagram of a dual-active storage system in Active-Passive mode.
- FIG. 2 is an architectural diagram of a dual-active storage system in an Active-Active mode provided by an embodiment of the present application.
- FIG. 3A is a flowchart of a method for establishing an active-active storage system in an embodiment of the present application.
- FIG. 3B is a schematic diagram of various parameters generated in the process of constructing an active-active storage system in an embodiment of the present application.
- FIG. 4A is a flowchart of establishing a file system of the active-active storage system according to an embodiment of the present application.
- FIG. 4B is a schematic diagram of a dual-active system constructed in an embodiment of the present application.
- FIG. 5 is a flowchart of a method for creating a directory in a file system according to an embodiment of the present application.
- FIG. 6 is a flowchart of a method for querying a directory in a file system according to an embodiment of the present application.
- FIG. 7 is a flowchart of a method for creating a file in a file system according to an embodiment of the present application.
- FIG. 8 is a flowchart of a method for writing data in a file in a file system according to an embodiment of the present application.
- FIG. 9 is a flowchart of a method for writing data in a file in a file system according to an embodiment of the present application.
- FIG. 10 is a schematic diagram of a first storage device taking over services of the second storage device in an embodiment of the present application.
- FIG. 11 is a flowchart of a method for a first storage device to take over a service of the second storage device in an embodiment of the present application.
- the system 10 includes a first storage device 100 and a second storage device 200 .
- a first file system 102 is provided in the control node 101 of the first storage device 100 (the first storage device may include multiple control nodes; for convenience of description, only one is taken as an example), and a second file system 202 is provided in the control node 201 of the second storage device 200 (which may likewise include multiple control nodes; only one is described as an example).
- the first storage device 100 mounts the first file system 102 to the first client 300 .
- the second storage device 200 mounts the second file system 202 to the second client 400 .
- Each file system has a root directory.
- Mounting the file system on the client means that the storage device provides the root directory of its file system to the client, and the client places that root directory in the client's own file system; the client can then obtain the root directory of the storage device's file system and access that file system according to it.
- the first client 300 reads and writes data through the first file system 102 , and the written data is stored as the local data 103 .
- the backup data of the second storage device 200 is also stored in the first storage device 100 .
- the second client 400 reads and writes data through the second file system 202 , and the written data is stored as the local data 203 .
- the second storage device 200 also stores the backup data of the first storage device 100, that is, the peer backup data 204. In this way, after the first storage device 100 fails or its link with the second client is disconnected, the second client can use the peer backup data 204 to take over the services of the first client 300; that is, dual-active in Active-Passive mode is implemented.
- the first client 300 can only access the files in the first storage device 100 through the first file system.
- the second client 400 can only access the data in the second storage device 200 through the second file system and cannot access the data in the first storage device 100; that is, active-active in Active-Active mode cannot be implemented.
- the technical solutions provided by the embodiments of the present application set up a global view, which is a collection of virtual nodes; each virtual node in the global view is allocated computing resources, and the computing resources come from physical nodes of the first storage device and the second storage device. A physical node may be a controller in the first storage device or the second storage device, a CPU in a controller, a core in a CPU, or a server in a distributed storage system. Each physical node can obtain the global view, and in addition all physical nodes use the same file system, so the first client connected to the first storage device and the second client connected to the second storage device mount the same file system; the first client can therefore access data belonging to the file system in the second storage device through the file system and the global view.
- the system 500 includes a first storage device 600 and a second storage device 700 .
- the first storage device 600 includes a physical node A and a physical node B.
- the second storage device 700 includes a physical node C and a physical node D.
- the first storage device 600 and the second storage device 700 may include more physical nodes.
- this embodiment only takes that each storage device includes two physical nodes as an example for description.
- the first storage device 600 and the second storage device 700 respectively include persistent storage devices 601 and 701 composed of multiple storage disks for persistently storing data.
- based on the physical storage space provided by the storage disks of the persistent storage devices 601 and 701, the first storage device 600 and the second storage device 700 create a first volume 609 and a second volume 703, respectively.
- the first storage device 600 and the second storage device 700 may store data into persistent storage devices 601 and 701 according to the first volume 609 and the second volume 703, respectively.
- the storage disk can be, for example, a persistent storage medium such as a solid state disk (Solid State Disk, SSD), a hard disk drive (Hard Disk Drive, HDD).
- the structures of the physical node A, the physical node B, the physical node C, and the physical node D are the same, and only the structure of the node A is used as an example for description in this embodiment of the present application.
- the physical node A includes a processor 602 and a memory 603 .
- the memory 603 stores application program instructions (not shown) and data generated during the running of the processor.
- the processor 602 executes the application program instructions to implement the active-active function of the Active-Active mode provided by the embodiment of the present application.
- the memory 603 also stores a global view 604 , a file system 605 , cache data 606 and backup data 607 .
- each physical node includes two file systems: the clustered file system shared by all physical nodes, and the local file system of the individual physical node.
- the detailed introduction of other data in the memory 603 is introduced in conjunction with the method for implementing active-active, such as the flowcharts shown in FIG. 5 to FIG. 9 .
- the first client 800 is connected to the first storage device 600 to access data in the first storage device 600
- the second client 900 is connected to the second storage device 700 to access data in the second storage device 700.
- FIG. 3A is a flowchart of a method for establishing a global view provided by an embodiment of the present application.
- Step S301 the physical node A of the first storage device 600 receives a virtual cluster establishment request sent by the client.
- the first storage device is the master array, the physical node A in the first storage device 600 is the master node, and the physical node A processes the request.
- Step S302 the physical node A establishes a global view 604, and synchronizes the established global view 604 to physical nodes corresponding to other virtual nodes in the global view.
- after the first storage device 600 establishes a network connection with the second storage device 700, the first storage device 600 acquires the identifier and IP address of each physical node in the second storage device 700.
- the node A assigns a virtual identifier to each physical node in the first storage device 600 and the second storage device 700 to identify the virtual node, and establishes a global view to record the virtual identifiers of the virtual nodes.
- the computing resources of each physical node, such as processor resources and memory resources, are the computing resources allocated to the corresponding virtual node. In other embodiments, besides computing resources, other physical resources, such as bandwidth, may also be allocated to each virtual node.
- the physical resources allocated to each virtual node are independent of each other. This makes it convenient to expand a storage device: when new physical resources are added to a storage device, a new virtual node is generated from those resources, increasing the number of virtual nodes, and the newly added virtual node is added to the global view. In distributed storage, added servers serve as new physical resources from which virtual nodes are established, likewise increasing the number of virtual nodes in the global view. The established global view is shown as Vcluster in FIG. 3B.
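The expansion path just described can be sketched as appending a new virtual node to the global view; the naming scheme and dict layout are illustrative assumptions.

```python
def add_physical_resource(global_view: dict, device: str, node: str) -> dict:
    """Wrap a newly added physical node in a new virtual node and record it.

    Because virtual nodes own independent resources, existing entries need no
    rebalancing here; the new node simply joins the view.
    """
    new_vnode = f"VNode{len(global_view) + 1}"   # assumed naming convention
    updated = dict(global_view)
    updated[new_vnode] = {"device": device, "node": node}
    return updated
```

The updated view would then be synchronized to all physical nodes, exactly as the initial global view is in step S302.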
- virtual identifiers VNodeA and VNodeB are allocated to physical node A and physical node B of the first storage device 600, and virtual identifiers VNodeC and VNodeD are allocated to physical node C and physical node D of the second storage device 700.
- the node A stores the global view 604 in the memory 603 and the persistent storage device 601, and then synchronizes the global view 604 to the physical nodes corresponding to the other virtual nodes (physical nodes B, C, and D) and to the persistent storage medium 701 of the second storage device 700.
- Step S303: the physical node A generates a shard view according to the global view, and synchronizes the shard view to the physical nodes corresponding to the other virtual nodes in the virtual node cluster.
- a preset number of shards, such as 4096, is set for the virtual cluster, and these shards are evenly distributed to the virtual nodes in the global view 604; that is, a shard view is generated.
- the produced shard view is shown as the shard view in Figure 3B.
- the shard is used to distribute and store the directories and files of the file system 605 to the physical nodes corresponding to each virtual node in the global view 604. The specific function of the shard view will be described in detail below.
- after the shard view is generated, the physical node A stores it in the local memory 603 and the persistent storage medium 601, and synchronizes it to the physical nodes corresponding to the other virtual nodes (physical nodes B, C, and D) and to the persistent storage medium 701 of the second storage device 700.
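The shard-view generation in step S303 can be sketched with the numbers given above: 4096 shard identifiers spread evenly over four virtual nodes. Round-robin assignment is an assumed policy; the embodiment only requires an even spread.

```python
from collections import Counter

def build_shard_view(vnodes, shard_count=4096):
    """Evenly assign shard identifiers to virtual nodes (round robin)."""
    return {shard: vnodes[shard % len(vnodes)] for shard in range(shard_count)}

shard_view = build_shard_view(["VNodeA", "VNodeB", "VNodeC", "VNodeD"])
shards_per_vnode = Counter(shard_view.values())  # 1024 shards per virtual node
```

With four virtual nodes, each ends up owning exactly 4096 / 4 = 1024 shards, so directories and files hash onto all physical nodes with equal probability.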
- Step S304: the physical node A generates a data backup policy, and synchronizes the data backup policy to the physical nodes corresponding to the other virtual nodes in the virtual node cluster.
- a data backup policy may be set in this embodiment of the present application, that is, the generated data is backed up to multiple nodes.
- the backup strategy in the embodiment of the present application is three-copy backup of data: two copies are stored in two local physical nodes, and the third is stored in a physical node of the remote storage device.
- a set of backup nodes is set for each virtual node.
- For example, the backup nodes corresponding to virtual node Vnode A are set as virtual nodes Vnode B and Vnode C, the backup nodes corresponding to virtual node Vnode B are Vnode A and Vnode D, and the backup nodes corresponding to virtual node Vnode D are Vnode C and Vnode B.
- Physical node A stores the backup policy in the local memory 603 and the persistent storage medium 601, and synchronizes the backup policy to the physical nodes corresponding to the other virtual nodes and to the persistent storage medium 701 of the second storage device 700.
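- The three-copy backup policy above can be represented as a simple table. This is a minimal sketch using only the mappings listed in the text (the entry for Vnode C is not given above and is therefore omitted; `copy_targets` is a hypothetical helper):

```python
# Two copies stay on the local device and one goes to the remote device:
# Vnode A/B belong to the first storage device, Vnode C/D to the second.
backup_policy = {
    "Vnode A": ["Vnode B", "Vnode C"],
    "Vnode B": ["Vnode A", "Vnode D"],
    "Vnode D": ["Vnode C", "Vnode B"],
}

def copy_targets(vnode):
    """All three virtual nodes that receive a copy of data written to vnode."""
    return [vnode] + backup_policy[vnode]

# Data written to Vnode A is kept locally on Vnode A and Vnode B,
# with a third copy on the remote Vnode C.
```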
- the establishment of the virtual cluster in FIG. 3A is performed by a management module.
- In this embodiment, the management module being located in the first storage device is taken as an example. After the management module generates the file system and the global view, it sends them to the first storage device and the second storage device for storage.
- The management module may also be located on an independent third-party management device; after generating the file system and the global view, the third-party management device sends them to the first storage device and the second storage device for storage, so that each physical node can obtain the global view.
- In addition, a monitoring module monitors changes to the virtual nodes in the first storage device and the second storage device. When it detects that a new virtual node has been added to the virtual node cluster, or that a virtual node has been deleted, for example because the physical node corresponding to the virtual node has failed, the monitoring module notifies the management module to update the global view.
- the monitoring module may be located on the third-party management device, or may be located in the first storage device and the second storage device.
- When the first storage device is used as the primary storage device, the second storage device sends the monitored changes to the first storage device, and the management module in the first storage device updates the global view. In this way, the establishment of the virtual node cluster is completed.
- the first storage device 600 and the second storage device 700 can establish a file system according to the request of the client. Specifically, as shown in the flowchart of FIG. 4A .
- Step S401 physical node A receives a file system creation request.
- The first client 800 may send the file system creation request to either the first storage device 600 or the second storage device 700. If the first storage device 600 receives the file system creation request, the request is processed by physical node A. If the second storage device 700 receives the file system creation request, the second storage device 700 forwards it to physical node A of the first storage device 600 for processing.
- Step S402 the physical node A sets a root directory for the file system.
- When the master node sets the root directory, it first generates a mark for the root directory; by default, the mark of the root directory is "/". It then assigns identification information and a shard ID to the root directory. Since the shard view created on the master node has been synchronized to all nodes, the master node obtains the shard view from its own memory and selects a shard ID for the root directory from it. As shown in Figure 3B, each virtual node in the shard view is assigned multiple shard IDs. Therefore, in order to reduce cross-network and cross-node access, a shard ID is preferentially assigned to the root directory from the shard IDs of the virtual node Vnode A corresponding to physical node A. Because this is the root directory, no shard ID has been allocated yet; for example, shard 0 can be selected as the shard ID of the root directory.
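- The shard ID selection in this step (prefer the local virtual node's shards, then pick any free ID) might be sketched like this, assuming the contiguous shard ranges of Figure 3B; `pick_shard_id` and the `allocated` set are hypothetical names:

```python
def pick_shard_id(shard_view, local_vnode, allocated):
    """Prefer an unallocated shard ID from the local virtual node's range,
    to reduce cross-network and cross-node access; otherwise fall back to
    the ranges of the other virtual nodes."""
    ranges = [shard_view[local_vnode]] + [
        rng for vnode, rng in shard_view.items() if vnode != local_vnode
    ]
    for first, last in ranges:
        for sid in range(first, last + 1):
            if sid not in allocated:
                allocated.add(sid)
                return sid
    raise RuntimeError("no free shard ID")

view = {"Vnode A": (0, 1023), "Vnode B": (1024, 2047)}
root_shard = pick_shard_id(view, "Vnode A", allocated=set())
# Nothing has been allocated yet, so shard 0 is chosen for the root directory.
```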
- Step S403 the physical node A sends a file system mount command to the first client 800.
- After the root directory of the cluster file system is generated, in order to enable the first client 800 to access the file system, physical node A mounts the file system to the file system of the first client 800.
- Physical node A provides the root directory of the file system to the first client 800 through the mount command. When physical node A sends the mount command, it carries the parameter information of the root directory, that is, the handle information of the root directory, and the handle information carries the shard ID and identification information of the root directory.
- Step S404 the first client 800 mounts the cluster file system to the file system of the first client 800 according to the mount command.
- After the first client 800 receives the parameter information of the root directory of the file system, it generates a mount point on its own file system and records the parameter information of the root directory of the file system at the mount point; the mount point is a segment of storage space.
- The first client 800 may also perform data transmission with the first storage device through the file system 605. Users can select the file system to be accessed according to actual needs.
- Step S405 the physical node A allocates a virtual volume to the file system.
- Each newly created file system is allocated a virtual volume Vvolume 0 for storing the data written to the file system by the first client or the second client.
- Step S406 the physical node A creates a mirrored volume pair for the virtual volume.
- After the virtual volume Vvolume 0 is established, physical node A first creates a local volume based on the persistent storage medium 601, such as the first volume in FIG. 2, and a mirrored volume of the first volume is created in the second storage device 700, such as the second volume in FIG. 2.
- Step S407 the physical node A generates a disk flushing policy by recording the mirrored volume pair corresponding to the virtual volume.
- The generated disk flushing policy is shown in FIG. 3B: it records the mirrored volume pair (the first volume and the second volume) corresponding to the virtual volume of the file system.
- In order to ensure the reliability of the data, the data of the file system cached in the memory can be stored in the persistent storage medium 601 of the first storage device 600 and the persistent storage medium 701 of the second storage device 700, respectively.
- how to write the data in the memory into the persistent storage medium 601 and the persistent storage medium 701 according to the disk flushing strategy will be described in detail in FIG. 9 .
- After generating the disk flushing policy, physical node A stores it in the local memory 603 and the persistent storage medium 601, and synchronizes it to the physical nodes corresponding to the other virtual nodes and to the persistent storage medium 701 of the second storage device 700.
- FIG. 4B is a schematic diagram of the active-active storage system after the file system is created; that is, a cross-device file system, virtual volume, shard view, and global view are generated on the first storage device and the second storage device.
- directories and files can be created and accessed based on the file system.
- the process of creating a directory under the file system will be described with reference to the flowchart shown in FIG. 5 .
- Here, the root directory is taken as the parent directory, and the directory to be created is described as a subdirectory of the parent directory.
- the user may access the first storage device through the first client to create the subdirectory, and may also access the second storage device through the second client to create the subdirectory.
- When the first storage device mounts the file system to the first client, a path for the first client to access the file system is established. For example, if the first storage device mounts the file system to the first client through physical node A, the first client accesses the file system through physical node A.
- In order to realize active-active access, the second storage device also mounts the file system to the file system of the second client, thereby establishing a path for the second client to access the file system. Requests from the second client to access the file system are sent to the physical node on which the file system is mounted, such as physical node C.
- the following describes the process of creating a subdirectory by using the second client to send a subdirectory creation request to the second storage device as an example.
- Step S501 the second client sends a subdirectory creation request to the physical node C.
- the physical node C is the master node of the second storage device 700, that is, the node that mounts the file system to the second client.
- the subdirectory creation request includes parameter information of the parent directory and the name of the subdirectory.
- Step S502 the physical node C receives the creation request sent by the second client, and generates parameter information for the subdirectory according to the creation request.
- the parameter information includes the identification information and shard ID of the parent directory, the identification information is used to uniquely identify the subdirectory, and the identification information is, for example, the object ID in the NFS file system.
- Physical node C looks up the shard view, assigns a shard ID to the subdirectory from the shard IDs recorded in the shard view, and then creates the subdirectory in the physical node corresponding to the virtual node to which the shard ID belongs. It should be noted that each directory is assigned one shard ID, but one shard ID can be assigned to multiple directories.
- Generally, a shard ID is allocated to the subdirectory from the shard IDs of the virtual node corresponding to the physical node that receives the subdirectory creation request; that is, a shard ID is allocated to the subdirectory from the shards [2048, 3071] corresponding to the virtual node Vnode C of physical node C.
- If the number of directories corresponding to the shard IDs of virtual node Vnode C exceeds a preset threshold, the subdirectory is assigned a shard ID corresponding to another virtual node.
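- The threshold-based spillover described above can be sketched as follows; the per-shard directory counter and the function name are assumptions, not part of the embodiment:

```python
def assign_dir_shard(shard_view, local_vnode, dirs_per_shard, threshold):
    """Assign a shard ID to a new directory: prefer shards of the local
    virtual node, but once every local shard already holds `threshold`
    directories, spill over to another virtual node's shards."""
    order = [local_vnode] + [v for v in shard_view if v != local_vnode]
    for vnode in order:
        first, last = shard_view[vnode]
        for sid in range(first, last + 1):
            if dirs_per_shard.get(sid, 0) < threshold:
                dirs_per_shard[sid] = dirs_per_shard.get(sid, 0) + 1
                return sid
    raise RuntimeError("all shards are at the threshold")

view = {"Vnode C": (2048, 2049), "Vnode D": (3072, 3073)}
counts = {2048: 2, 2049: 2}          # Vnode C's shards are already full
sid = assign_dir_shard(view, "Vnode C", counts, threshold=2)
# Vnode C's shards hit the threshold, so a Vnode D shard is used instead.
```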
- Step S503 the physical node C creates the subdirectory.
- Creating the subdirectory includes generating a directory entry table (DET) and an Inode table for the subdirectory.
- The directory entry table is used to record, after the subdirectory is successfully created, the parameter information of subdirectories or files established under the subdirectory acting as a parent directory.
- The parameter information includes, for example, the name of the subdirectory, the identification information of the directory or file, and the shard ID.
- the Inode table is used to record the detailed information of the file subsequently created in the subdirectory, such as the file length of the file, the operation authority of the user on the file, the modification time of the file, and other information.
- Step S504 the physical node C determines, according to the parameter information of the parent directory, the physical node B in the first storage device to which the parent directory belongs.
- The parameter information of the parent directory includes the shard ID, and the virtual node corresponding to the shard ID can be determined to be virtual node Vnode B through the shard view; then, according to virtual node Vnode B, the corresponding physical node is determined to be physical node B in the first storage device.
- Step S505 the physical node C sends the parameter information of the subdirectory and the parameter information of the parent directory to the physical node B.
- Step S506 the physical node B finds the directory entry table of the parent directory according to the parameter information of the parent directory.
- the parent directory can be found according to the shard ID and the parent directory name in the parameter information of the parent directory.
- Step S507 the physical node B records the parameter information of the subdirectory in the directory entry table of the parent directory.
- Step S508 the physical node B first returns the parameters of the subdirectory to the physical node C, and the physical node C then returns the parameters of the subdirectory to the second client.
- For example, when accessing a directory such as /user1/favorite under the root directory of filesystem1, the client first queries the parameter information of user1 according to the parameter information of the root directory of filesystem1, that is, generates a request for querying user1; after obtaining the parameter information of user1, the client then queries the parameter information of favorite according to the parameter information of user1, that is, generates a request for querying favorite.
- the method for querying the parameter information of the directories of each level is the same.
- The following takes an upper-level directory as the parent directory and the directory to be queried as its subdirectory, and takes the case where physical node C of the second storage device receives the query request as an example, to illustrate the directory query process.
- Step S601 the second client sends a subdirectory query request to the physical node C.
- the query request carries the parameter information of the parent directory and the name of the subdirectory.
- the parameter information of the parent directory is, for example, the handle of the parent directory.
- the handle of the root directory is obtained from the file system of the client.
- the handle of the parent directory can be queried through a query request for querying the parent directory.
- The handle of the parent directory includes the identification information and shard ID of the parent directory.
- Step S602 the physical node C receives the query request sent by the second client, and determines the physical node B to which the parent directory belongs according to the query request.
- Specifically, physical node C obtains the shard ID of the parent directory from the parameter information of the parent directory, and obtains the virtual node to which the parent directory belongs according to the shard ID.
- Since physical node A has synchronized the created shard view to all nodes, physical node C obtains the shard view from its own memory, determines the virtual node to which the parent directory belongs according to the shard ID of the parent directory, and then determines the physical node corresponding to that virtual node.
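- The two-step lookup described above (shard ID to virtual node via the shard view, virtual node to physical node via the global view) can be sketched as follows; the function name and data shapes are assumptions:

```python
def resolve_home_node(shard_id, shard_view, global_view):
    """Determine the physical node that owns a directory or file from its
    shard ID, using the shard view and then the global view."""
    for vnode, (first, last) in shard_view.items():
        if first <= shard_id <= last:
            return global_view[vnode]
    raise KeyError(f"shard {shard_id} not found in shard view")

shard_view = {"Vnode A": (0, 1023), "Vnode B": (1024, 2047),
              "Vnode C": (2048, 3071), "Vnode D": (3072, 4095)}
global_view = {"Vnode A": "physical node A", "Vnode B": "physical node B",
               "Vnode C": "physical node C", "Vnode D": "physical node D"}
home = resolve_home_node(1500, shard_view, global_view)
# Shard 1500 falls in Vnode B's range, so the home node is physical node B.
```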
- Step S603 the physical node C sends the parameters of the parent directory and the name of the subdirectory to the physical node B where the parent directory is located.
- Step S604 the physical node B determines the directory entry table of the parent directory according to the parameters of the parent directory.
- Step S605 the physical node B obtains the parameter information of the subdirectory from the directory entry table of the parent directory.
- Step S606 the physical node B returns the parameter information of the subdirectory to the physical node C.
- Step S607 the physical node C returns the parameter information of the subdirectory to the second client.
- Figures 5 and 6 illustrate the case where the second client accesses the second storage device to create a subdirectory in the file system and to query the subdirectory; in practical applications, the first client can also access the first storage device to create and query the subdirectory.
- Through the above methods, the first client or the second client can obtain the parameter information of the subdirectory, and can then create files in the subdirectory according to the parameter information of the subdirectory.
- the following describes a process in which a user accesses the first storage device through the first client and creates a file in a subdirectory, as shown in FIG. 7 .
- Step S701 the client sends a file generation request to the physical node A.
- the file generation request carries the parameter information and file name of the subdirectory.
- In the directory creation or query process described above, physical node A has already sent the parameter information of the subdirectory to the client, so when the client needs to create a file in the subdirectory, the file generation request can carry the parameter information of the subdirectory and the file name of the file.
- Step S702 after receiving the file generation request, the physical node A determines the physical node D to which the subdirectory belongs according to the parameter information of the subdirectory.
- The manner of determining the physical node D to which the subdirectory belongs is the same as that of step S602 in FIG. 6, and is not repeated here.
- Step S703 the physical node A sends the parameter information of the subdirectory and the file name to the physical node D.
- Step S704 the physical node D determines whether the file has been created.
- Physical node D finds the subdirectory according to the shard ID and the subdirectory name in the parameters of the subdirectory, then finds the DET corresponding to the subdirectory and searches for the file name in the DET. If the file name exists, a file with the same name has already been created, and step S705 is executed; if it does not exist, the file can be created in the subdirectory, and step S706 is executed.
- Step S705 physical node D feeds back to physical node A that a file with the file name has already been created, and physical node A further feeds this back to the first client.
- the first client can further notify the user through a notification message that a file with the same file name already exists, and the user can perform further operations, such as modifying the file name, according to the prompt information.
- Step S706 the node D creates the file.
- When physical node D creates the file, it sets parameters for the file, such as assigning a shard ID and file identification information, and adds the shard ID and the file identification information to the DET of the subdirectory.
- As mentioned above, when the subdirectory is created, an Inode table is generated for it, and the Inode table is used to record the information of files generated under the subdirectory. Therefore, in this step, after physical node D creates the file, the information of the file is added to the Inode table of the subdirectory.
- The file information includes the length of the file, the user's operation permissions on the file, the modification time of the file, and other information.
- Step S707 the physical node D feeds back the file parameters.
- the physical node D first sends the feedback information to the node A, and the node A further feeds back the feedback information to the first client.
- In step S702, when physical node A determines that the home node of the subdirectory is physical node A itself, physical node A executes the above steps S704 to S707.
- the user can write data in the file.
- a user can write data in the file through a first client connected to the first storage device and a second client connected to the second storage device.
- the following describes a process in which the user accesses the first storage device through the first client and writes data to the file as an example, as shown in FIG. 8 .
- Step S801 physical node A receives a write request to the file.
- Since every node stores the file system, a user can access files in the file system through a client connected to any node.
- The write request carries address information of the file, where the address information includes the parameter information of the file, an offset address, and the data to be written.
- the parameter information of the file is the handle of the file, including a file system identifier, a file identifier, and a shard ID.
- Step S802 the physical node A determines the home node D of the file according to the write request.
- For the method of determining the home node D of the file according to the shard ID of the file, refer to step S602 in FIG. 6; details are not repeated here.
- Step S803 the physical node A forwards the write request to the physical node D.
- Step S804 the physical node D converts the access to the file system into the access to the virtual volume corresponding to the file system.
- Since the virtual volume created for the file system is recorded in each physical node, physical node D replaces the identifier of the file system in the write request with the identifier of the virtual volume.
- Step S805 the physical node D finds the file according to the file identifier and shard ID in the write request and updates the information of the file.
- Specifically, the inode entry corresponding to the file is found in the Inode table according to the inode number contained in the file identifier, and the file information recorded therein is updated; for example, the length and offset address of the file are updated according to the length and offset address of the data to be written carried in the write request, and the current time is recorded as the file update time.
- Step S806 the physical node D writes multiple copies of the data to be written according to a preset backup policy.
- As described above, a backup policy is established for the file system, and backup nodes are set for each node in the backup policy. According to the backup policy, it can be determined that the backup nodes of physical node D are physical node C and physical node B. After physical node D writes the to-be-written data into its local memory, it sends the to-be-written data to physical node C and physical node B, and physical node C and physical node B write the to-be-written data into their own memories.
- Step S807 after determining that the multi-copy writing is completed, the physical node D returns a write request completion message to the first client.
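- Steps S806 and S807 can be sketched as a multi-copy write that is acknowledged only after every copy lands; the dictionaries standing in for the nodes' memories and the function name are assumptions:

```python
def multi_copy_write(data, home_node, backup_nodes, memories):
    """Write data into the home node's memory and into each backup node's
    memory; the write request is acknowledged only after every copy lands."""
    for node in [home_node] + backup_nodes:
        memories.setdefault(node, []).append(data)
    return "write complete"

memories = {}
ack = multi_copy_write("block-0", "physical node D",
                       ["physical node C", "physical node B"], memories)
# Three copies now exist: on node D and on its backup nodes C and B.
```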
- Step S808 the physical node D persistently stores the data to be written.
- As mentioned above, the virtual volume of the file system corresponds to a mirrored volume pair: the first volume in the first storage device and the second volume in the second storage device.
- When physical node D determines that the data to be written needs to be flushed to persistent storage, it first obtains, according to the virtual volume recorded in the address of the data to be written, the mirrored volume pair corresponding to the virtual volume from the disk flushing policy, that is, the first volume in the first storage device and the second volume in the second storage device. Physical node D then writes the data to be written in its memory into the physical space corresponding to the second volume in the persistent storage 701 of the second storage device, and sends the memory address of the data to be written to the corresponding backup node in the first storage device, physical node B. According to the memory address, physical node B writes the data to be written stored in its own memory into the physical space corresponding to the first volume in the persistent storage 601 of the first storage device.
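- A minimal sketch of this flush step: the mirrored volume pair is resolved from the disk flushing policy and both members are written, one per storage device. The function and volume names are illustrative:

```python
def flush_to_persistent(virtual_volume, flush_policy, write_block):
    """Resolve the mirrored volume pair for the virtual volume and flush the
    cached data to both members: the second volume locally on the second
    storage device, and the first volume via the backup node on the first."""
    first_volume, second_volume = flush_policy[virtual_volume]
    write_block(second_volume)   # local flush in the second storage device
    write_block(first_volume)    # backup node flushes in the first device

written = []
flush_to_persistent("Vvolume 0",
                    {"Vvolume 0": ("first volume", "second volume")},
                    written.append)
# Both members of the mirrored pair receive the data.
```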
- FIG. 9 is a flowchart of a method for reading a file according to an embodiment of this application.
- the user can also access files in the file system through any client. This embodiment is described by taking the user reading the file through the second client as an example.
- Step S901 the physical node C receives the read request of the file.
- the read request carries the address information of the file, and the address information includes the parameter information of the file and the offset address, and the parameter information is the handle of the file, including the file system identifier, the file identifier, and the shard ID.
- the parameter information of the file has been acquired according to the method shown in FIG. 6 .
- Step S902 the physical node C determines the home node B of the file according to the read request.
- For the manner of determining the home node B of the file, refer to the description of step S602 in FIG. 6; details are not repeated here.
- Step S903 the physical node C forwards the read request to the home node B.
- Step S904 the physical node B converts the access of the read request to the file system into an access to the virtual volume of the file system.
- Step S905 the physical node B reads the file from the memory of the physical node B according to the address in the read request.
- Step S906 the physical node B returns the file.
- Step S907 when the file is not in the memory, physical node B reads the file from the persistent storage 601 according to the first volume in the first storage device corresponding to the virtual volume in the disk flushing policy, and returns the file to physical node C; physical node C then returns the file to the second client.
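- Steps S905 to S907 amount to a cache-first read with a fall-back to the first volume of the mirrored pair; a sketch with hypothetical helper names:

```python
def read_file(handle, memory_cache, flush_policy, read_from_volume):
    """Serve a read from the home node's memory if possible; on a miss,
    read from the volume recorded for the virtual volume in the disk
    flushing policy."""
    if handle in memory_cache:
        return memory_cache[handle]
    first_volume, _second_volume = flush_policy["Vvolume 0"]
    return read_from_volume(first_volume, handle)

policy = {"Vvolume 0": ("first volume", "second volume")}
hit = read_file("file-1", {"file-1": "cached data"}, policy, None)
miss = read_file("file-2", {}, policy,
                 lambda volume, h: f"read {h} from {volume}")
```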
- If, whenever the first storage device and the second storage device access files and directories, they forward the access request to the home node of the file or directory based on the shard ID, cross-device data access results, which affects access efficiency.
- In this embodiment of the application, since both the first storage device and the second storage device back up the data of the opposite end, when an access request for the data of the opposite end is received, the data to be accessed can be obtained from the local backup of the opposite end's data, without the need to obtain the accessed data from the opposite end, thereby improving the efficiency of data access.
- In addition, when one storage device fails, the backup data can be used to take over the services of the failed storage device.
- As shown in FIG. 10, when the link between the first storage device and the second storage device is disconnected, or the second storage device fails, the backup data of the second storage device stored in the first storage device can be used to take over the services of the second storage device.
- the following description takes the disconnection of the link between the first storage device and the second storage device as an example. Specifically, as shown in the flowchart shown in FIG. 11 .
- Step S111 the first storage device and the second storage device simultaneously detect the heartbeat of the opposite end.
- Step S112 when the heartbeat of the opposite end is not detected, the first storage device and the second storage device suspend the service being executed.
- Step S113 the first storage device and the second storage device modify the global view and the file system.
- When the heartbeat of the opposite end cannot be detected, the first storage device and the second storage device each prepare to take over the services of the opposite end and modify the global view and the file system: the virtual nodes corresponding to the physical nodes of the opposite end are deleted from the global view, and the backup nodes of the opposite end are deleted from the backup policy.
- the first storage device modifies the global view to (Vnode A, Vnode B)
- the second storage device modifies the global view to (Vnode C, Vnode D).
- In addition, in the shard view of the file system, the shards of the virtual nodes corresponding to the opposite end are reassigned to the virtual nodes corresponding to the local end.
- Step S114 both the first storage device and the second storage device send an arbitration request to the arbitration device.
- Step S115 the arbitration device arbitrates the first storage device to take over the service.
- the arbitration device may determine the device to take over the service according to the order of receiving the arbitration request, for example, the storage device corresponding to the first received arbitration request is used as the device to take over the service.
- Step S116 the arbitration device notifies the first storage device and the second storage device of the arbitration result respectively.
- Step S117 after receiving the notification, the second storage device releases the connection with the second client, that is, stops the execution of the service.
- Step S118 after receiving the notification, the first storage device migrates the IP address of the second storage array to itself and establishes a connection with the second client.
- Step S119 the first storage device takes over the services of the second storage array through the backup data of the second storage array.
- Since the backup data of the second storage array is stored in the first storage device, when the first storage device receives an access request for data in the second storage device, it can use the shard ID in the access request to direct the access requests of the first client and the second client for data in the second storage device to the backup data, so that the first client and the second client do not perceive the link interruption.
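- The redirection described above can be sketched as routing on the shard ID. The backup mapping below follows the example backup policy earlier in the text where available and is otherwise an assumption, as is the function name:

```python
def route_after_failover(shard_id, shard_view, surviving_vnodes, backup_of):
    """Route an access request after the peer link drops: if the shard's
    virtual node survived, serve it there; otherwise redirect to the
    surviving backup virtual node holding the peer's backup data, so
    clients do not perceive the interruption."""
    for vnode, (first, last) in shard_view.items():
        if first <= shard_id <= last:
            return vnode if vnode in surviving_vnodes else backup_of[vnode]
    raise KeyError(shard_id)

shard_view = {"Vnode A": (0, 1023), "Vnode B": (1024, 2047),
              "Vnode C": (2048, 3071), "Vnode D": (3072, 4095)}
surviving = {"Vnode A", "Vnode B"}                        # first device only
backup_of = {"Vnode C": "Vnode A", "Vnode D": "Vnode B"}  # assumed mapping
target = route_after_failover(2500, shard_view, surviving, backup_of)
# Shard 2500 belonged to the failed Vnode C; the request goes to Vnode A.
```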
- In this case, the written data is only written to the memory of a node in the first storage device, and is only stored in a volume of the first storage device.
Claims (22)
- 1. An active-active storage system, comprising a first storage device and a second storage device, wherein the first storage device is configured to receive data of a first file sent by a client cluster to a file system, store the data of the first file, and send first copy data of the data of the first file to the second storage device; and the second storage device is configured to receive data of a second file sent by the client cluster to the file system, store the data of the second file, and send second copy data of the second file to the first storage device.
- 2. The system according to claim 1, wherein the active-active storage system further comprises a virtual node set, the virtual node set comprises a plurality of virtual nodes, each virtual node is allocated computing resources, and the computing resources come from physical nodes in the first storage device or the second storage device.
- 3. The system according to claim 2, wherein the active-active storage system further comprises a management device, the management device is further configured to create a global view, and the global view is used to record a correspondence between each virtual node and the computing resources allocated to it; the management device is further configured to send the global view to the first storage device and the second storage device; the first storage device is further configured to store the global view; and the second storage device is further configured to store the global view.
- 4. The system according to claim 3, wherein when storing the data of the first file, the first storage device is specifically configured to: determine, according to an address of the data of the first file, a first virtual node corresponding to the first file; determine, according to the first virtual node and the global view, the computing resources allocated to the first virtual node; and based on the computing resources allocated to the first virtual node, send the data of the first file to the physical node corresponding to the computing resources, so that the physical node stores the data of the first file in the memory of the physical node.
- 5. The system according to claim 4, wherein the first virtual node has at least one backup virtual node, and the physical node corresponding to the first virtual node and the physical node corresponding to the backup virtual node are located in different storage devices; and the first storage device is further configured to: determine the backup virtual node corresponding to the first virtual node; determine, according to the backup virtual node and the global view, the physical node corresponding to the backup virtual node; and send the first copy data to the physical node corresponding to the backup virtual node, so that the physical node corresponding to the backup virtual node stores the first backup data in that physical node.
- 6. The system according to any one of claims 2 to 5, wherein the files and directories included in the file system are distributed among the physical nodes corresponding to the plurality of virtual nodes in the virtual node set.
- 7. The system according to claim 6, wherein each virtual node in the virtual node set is provided with one or more shard identifiers, each directory and file in the file system is allocated one shard identifier, and the physical nodes in the first storage device and the second storage device distribute each directory and file, according to its shard identifier, to the physical node corresponding to the virtual node to which the shard identifier belongs.
- 8. The system according to claim 7, wherein a first physical node in the first storage device is configured to receive a creation request for the first file, select a shard identifier for the first file from the one or more shard identifiers set for the virtual node corresponding to the first physical node, and create the first file in the storage device.
- 9. The system according to any one of claims 2 to 8, wherein after the second storage device fails or the link between the first storage device and the second storage device is disconnected, the first storage device is further configured to restore the second file based on the second copy data of the second file, and take over the services sent by the client cluster to the second storage device.
- 10. The system according to claim 9, wherein the first storage device is further configured to delete, from the global view, the virtual nodes corresponding to the computing resources of the second storage device.
- 11. The system according to claim 1, wherein the first storage device further has a first file system, and the second storage device further has a second file system.
- 12. A data processing method, applied to an active-active storage system, wherein the active-active storage system comprises a first storage device and a second storage device, and the method comprises: receiving, by the first storage device, data of a first file sent by a client cluster to a file system, storing the data of the first file, and sending first copy data of the data of the first file to the second storage device; and receiving, by the second storage device, data of a second file sent by the client cluster to the file system, storing the data of the second file, and sending second copy data of the second file to the first storage device.
- 13. The method according to claim 12, wherein the active-active storage system further comprises a virtual node set, the virtual node set comprises a plurality of virtual nodes, each virtual node is allocated computing resources, and the computing resources come from physical nodes in the first storage device or the second storage device.
- 根据权利要求13所述的方法,其特征在于,所述双活存储系统还包括管理设备,所述方法还包括:所述管理设备创建全局视图,所述全局视图用于记录每个虚拟节点与其分配的计算资源之间的对应关系;所述管理设备将所述全局视图发送给所述第一存储设备及所述第二存储设备;所述第一存储设备及所述第二存储设备保存所述全局视图。
- 根据权利要求14所述的方法,其特征在于,所述第一存储设备在存储所述第一文件的数据时包括:根据所述第一文件的数据的地址确定所述第一文件对应的第一虚拟节点;根据所述第一虚拟节点以及所述全局视图确定为所述第一虚拟节点分配的计算资源;基于为所述第一虚拟节点分配的计算资源,将所述第一文件的数据发送给所述计算资源对应的物理节点,由所述物理节点将所述第一文件的数据存储至所述物理节点的内存中。
- 根据权利要求15所述的方法,其特征在于,所述第一虚拟节点具有至少一个备份虚拟节点,所述第一虚拟节点对应的物理节点与所述备份虚拟节点对应的物理节点位于不同的存储设备中;所述方法还包括:所述第一存储设备确定所述第一虚拟节点对应的备份虚拟节点;所述第一存储设备根据所述备份虚拟节点以及所述全局视图确定所述备份虚拟节点对应的物理节点;所述第一存储设备将所述第一副本数据发送给所述备份虚拟节点对应的物理节点,由所述备份虚拟节点对应的物理节点将所述第一备份数据存储在所述物理节点中。
- 根据权利要求13-16任意一项所述的系统,其特征在于,所述文件系统所包括的文件和目录分布在所述虚拟节点集合中的多个虚拟节点对应的物理节点中。
- 根据权利要求17所述的方法,其特征在于,所述虚拟节点集合中的每个虚拟节点设置有一个或多个分片标识,所述文件系统中的每个目录及文件分配一个分片标识,所述第一存储设备和第二存储设备中的物理节点根据每个目录及文件的分片标识将所述目录和文件分布至所述分片标识所属的虚拟节点对应物理节点中。
- 根据权利要求18所述的方法,其特征在于,所述方法还包括:所述第一存储设备中的第一物理节点接收所述第一文件的创建请求,从为所述第一物理节点对应的虚拟节点设置的一个或多个分片标识中为所述第一文件选择一个分片标识,在所述第一存储设备中创建所述第一文件。
- 根据权利要求13-19任意一项所述的方法,其特征在于,所述方法还包括:当所述第二存储设备故障或与所述第一存储设备与所述第二存储设备之间的链路断开后,所述第一存储设备基于所述第二文件的第二副本数据恢复所述第二文件,并接管所述客户端集群发送给所述第二存储设备的业务。
- 根据权利要求20所述的方法,其特征在于,所述方法还包括所述第一存储设备从所述全局视图中删除所述第二存储设备的计算资源对应的虚拟节点。
- 根据权利要求13-21任意一项所述的方法,其特征在于,所述第一存储设备还具有第一文件系统,所述第二存储设备还具有第二文件系统。
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2023516240A JP2023541069A (ja) | 2020-09-11 | 2021-09-11 | Active-active storage system and data processing method thereof |
BR112023003725A BR112023003725A2 (pt) | 2020-09-11 | 2021-09-11 | Active-active storage system and data processing method thereof |
EP21866085.0A EP4198701A4 (en) | 2020-09-11 | 2021-09-11 | ACTIVE-ACTIVE STORAGE SYSTEM AND DATA PROCESSING METHOD BASED THEREON |
US18/178,541 US20230205638A1 (en) | 2020-09-11 | 2023-03-06 | Active-active storage system and data processing method thereof |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010955301 | 2020-09-11 | ||
CN202010955301.5 | 2020-09-11 | ||
CN202011628940.7A CN114168066A (zh) | 2020-09-11 | 2020-12-30 | Active-active storage system and data processing method thereof |
CN202011628940.7 | 2020-12-30 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/178,541 Continuation US20230205638A1 (en) | 2020-09-11 | 2023-03-06 | Active-active storage system and data processing method thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022053033A1 true WO2022053033A1 (zh) | 2022-03-17 |
Family
ID=76011613
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/117843 WO2022053033A1 (zh) | 2021-09-11 | Active-active storage system and data processing method thereof |
Country Status (6)
Country | Link |
---|---|
US (1) | US20230205638A1 (zh) |
EP (1) | EP4198701A4 (zh) |
JP (1) | JP2023541069A (zh) |
CN (2) | CN116466876A (zh) |
BR (1) | BR112023003725A2 (zh) |
WO (1) | WO2022053033A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116466876A (zh) * | 2020-09-11 | 2023-07-21 | Huawei Technologies Co., Ltd. | Storage system and data processing method |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103827843A (zh) * | 2013-11-28 | 2014-05-28 | Huawei Technologies Co., Ltd. | Data writing method, apparatus and system |
CN106133676A (zh) * | 2014-04-21 | 2016-11-16 | Hitachi, Ltd. | Storage system |
CN107220104A (zh) * | 2017-05-27 | 2017-09-29 | Zhengzhou Yunhai Information Technology Co., Ltd. | Virtual machine disaster recovery method and apparatus |
CN109964208A (zh) * | 2017-10-25 | 2019-07-02 | Huawei Technologies Co., Ltd. | Active-active storage system and address allocation method |
WO2020133473A1 (zh) * | 2018-12-29 | 2020-07-02 | Huawei Technologies Co., Ltd. | Data backup method, apparatus and system |
CN111542812A (zh) * | 2018-01-04 | 2020-08-14 | Hewlett Packard Enterprise Development LP | Enhanced cache memory allocation based on virtual node resources |
CN112860480A (zh) * | 2020-09-11 | 2021-05-28 | Huawei Technologies Co., Ltd. | Active-active storage system and data processing method thereof |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8346719B2 (en) * | 2007-05-17 | 2013-01-01 | Novell, Inc. | Multi-node replication systems, devices and methods |
CN102158546B (zh) * | 2011-02-28 | 2013-05-08 | Institute of Computing Technology, Chinese Academy of Sciences | Cluster file system and file service method thereof |
US9582559B1 (en) * | 2012-06-29 | 2017-02-28 | EMC IP Holding Company LLC | Multi-site storage system with replicated file system synchronization utilizing virtual block storage appliances |
US9880777B1 (en) * | 2013-12-23 | 2018-01-30 | EMC IP Holding Company LLC | Embedded synchronous replication for block and file objects |
US9069783B1 (en) * | 2013-12-31 | 2015-06-30 | Emc Corporation | Active-active scale-out for unified data path architecture |
US9430480B1 (en) * | 2013-12-31 | 2016-08-30 | Emc Corporation | Active-active metro-cluster scale-out for unified data path architecture |
SG11201701365XA (en) * | 2014-09-01 | 2017-03-30 | Huawei Tech Co Ltd | File access method and apparatus, and storage system |
CN106909307B (zh) * | 2015-12-22 | 2020-01-03 | Huawei Technologies Co., Ltd. | Method and apparatus for managing an active-active storage array |
CN108345515A (zh) * | 2017-01-22 | 2018-07-31 | China Mobile Group Sichuan Co., Ltd. | Storage method and apparatus, and storage system thereof |
US10521344B1 (en) * | 2017-03-10 | 2019-12-31 | Pure Storage, Inc. | Servicing input/output (‘I/O’) operations directed to a dataset that is synchronized across a plurality of storage systems |
CN108958984B (zh) * | 2018-08-13 | 2022-02-11 | Shenzhen SZZT Electronics Co., Ltd. | Ceph-based active-active synchronous online hot standby method |
2020
- 2020-12-30 CN CN202310146397.4A patent/CN116466876A/zh active Pending
- 2020-12-30 CN CN202110080368.3A patent/CN112860480B/zh active Active

2021
- 2021-09-11 BR BR112023003725A patent/BR112023003725A2/pt unknown
- 2021-09-11 EP EP21866085.0A patent/EP4198701A4/en active Pending
- 2021-09-11 WO PCT/CN2021/117843 patent/WO2022053033A1/zh unknown
- 2021-09-11 JP JP2023516240A patent/JP2023541069A/ja active Pending

2023
- 2023-03-06 US US18/178,541 patent/US20230205638A1/en active Pending
Non-Patent Citations (1)
Title |
---|
See also references of EP4198701A4 * |
Also Published As
Publication number | Publication date |
---|---|
CN112860480A (zh) | 2021-05-28 |
CN116466876A (zh) | 2023-07-21 |
EP4198701A4 (en) | 2024-01-10 |
EP4198701A1 (en) | 2023-06-21 |
JP2023541069A (ja) | 2023-09-27 |
US20230205638A1 (en) | 2023-06-29 |
CN112860480B (zh) | 2022-09-09 |
BR112023003725A2 (pt) | 2023-03-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8301654B2 (en) | Geographical distributed storage system based on hierarchical peer to peer architecture | |
US20200081807A1 (en) | Implementing automatic switchover | |
JP5047165B2 (ja) | Virtualized network storage system, network storage device, and virtualization method thereof | |
US9940154B2 (en) | Storage virtual machine relocation | |
US8046422B2 (en) | Automatic load spreading in a clustered network storage system | |
US9378258B2 (en) | Method and system for transparently replacing nodes of a clustered storage system | |
JP5066415B2 (ja) | Method and apparatus for file system virtualization | |
TWI467370B (zh) | Storage subsystem and storage system architecture for performing storage virtualization, and method thereof | |
US8429360B1 (en) | Method and system for efficient migration of a storage object between storage servers based on an ancestry of the storage object in a network storage system | |
US7836017B1 (en) | File replication in a distributed segmented file system | |
WO2006089479A1 (fr) | Method for managing data in a network storage system, and network storage system based on the method | |
US10320905B2 (en) | Highly available network filer super cluster | |
US8756338B1 (en) | Storage server with embedded communication agent | |
US8924656B1 (en) | Storage environment with symmetric frontend and asymmetric backend | |
CN109407975B (zh) | Data write method, compute node, and distributed storage system | |
US20230359564A1 (en) | Methods and Systems for Managing Race Conditions During Usage of a Remote Storage Location Cache in a Networked Storage System | |
US20240103744A1 (en) | Block allocation for persistent memory during aggregate transition | |
US11343308B2 (en) | Reduction of adjacent rack traffic in multi-rack distributed object storage systems | |
US20230205638A1 (en) | Active-active storage system and data processing method thereof | |
US11216204B2 (en) | Degraded redundant metadata, DRuM, technique | |
US11194501B2 (en) | Standby copies withstand cascading fails | |
CN111868704A (zh) | Method for accelerating access to a storage medium, and device therefor | |
JP6697101B2 (ja) | Information processing system | |
WO2012046585A1 (ja) | Distributed storage system, control method therefor, and program | |
CN114168066A (zh) | Active-active storage system and data processing method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21866085 Country of ref document: EP Kind code of ref document: A1 |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112023003725 Country of ref document: BR |
|
ENP | Entry into the national phase |
Ref document number: 2023516240 Country of ref document: JP Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2021866085 Country of ref document: EP Effective date: 20230315 |
|
ENP | Entry into the national phase |
Ref document number: 112023003725 Country of ref document: BR Kind code of ref document: A2 Effective date: 20230228 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |