WO2017020668A1 - Method and device for sharing a physical disk - Google Patents

Method and device for sharing a physical disk

Info

Publication number
WO2017020668A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
identification code
volume
identifier
data block
Prior art date
Application number
PCT/CN2016/087701
Other languages
English (en)
French (fr)
Inventor
张志炯
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2017020668A1

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02: Addressing or allocation; Relocation

Definitions

  • The present invention relates to the field of network distributed storage technologies, and in particular to a method and device for sharing physical disks.
  • IP-SAN: IP Storage Area Network.
  • FIG. 1 is a schematic diagram of a system structure of an existing IP-SAN technology.
  • As shown in FIG. 1, an IP-SAN system uses the iSCSI (Internet Small Computer System Interface) protocol to share a physical disk 10, in which a LUN (volume) 100 of a Target storage device is provided, with multiple Initiators (i.e., front-end devices); each Initiator can mount the volume 100 locally and provide it to applications as a local SCSI disk.
  • In the prior art, multiple initiators communicate with the same volume 100 through network connections and sessions. However, because a single volume 100 on the back-end storage device must be shared with multiple initiators at the same time, the storage performance available to the front-end devices is limited: when initiator 1, initiator 2, and initiator 3 concurrently access a single volume, they are actually accessing the same physical disk. When there are too many initiators, the single physical disk is overloaded and may block, so the initiators cannot access the volume normally. The prior art is therefore limited by its own network architecture; it cannot support concurrent access, and its reliability is low.
  • The embodiments of the invention provide a physical disk sharing method and device that map the volumes and snapshots of front-end devices onto multiple physical disks; each volume and snapshot corresponds to multiple different physical disks, so accessing the same volume or snapshot actually accesses multiple different physical disks rather than a single one.
  • A first aspect provides a method for sharing physical disks, used to share multiple physical disks with front-end devices over a network, where each physical disk has a network address. The method includes: organizing the volumes and snapshots to be allocated to front-end devices into multiple tree structures, and identifying any volume or snapshot in the tree structures by a three-element array comprising a tree identification code, a branch identification code, and a node identification code; striping each volume or snapshot into data blocks of a predetermined size, and identifying any data block in any volume or snapshot by a four-element array comprising a tree identification code, a branch identification code, a node identification code, and a data block identification code; performing a combination operation on the four-element array of each data block to obtain a unique identification code in one-to-one correspondence with the four-element array; and performing a distributed hash operation on the unique identification code to obtain the route identification code of the data block, mapping route identification codes one-to-one to the network addresses of the physical disks to form a routing relationship table, and storing it.
  • In a first possible implementation, the method further includes: acquiring a volume load request issued by a front-end device; selecting a volume or snapshot in the tree structures in response to the volume load request; and sending the unique identification code of each data block on the selected volume or snapshot to the front-end device.
  • In a second possible implementation, the method further includes: acquiring a write request sent by the front-end device, where the write request is generated when the front-end device performs a write operation on a data block in a volume and includes the unique identification code of the data block to be written and the data to be written; and using the unique identification code as the key and the data to be written as the value to form a key-value pair.
  • In a third possible implementation, the method further includes: performing a distributed hash operation on the unique identification code to obtain the corresponding route identification code; looking up the network address of the corresponding physical disk in the routing relationship table; and sending the key-value pair to the physical disk at that network address, so that the unique identification code in the pair is written to an empty slot in the disk's metadata area and the data to be written is written to the data area corresponding to that slot.
  • In a fourth possible implementation, the method further includes: acquiring a read request sent by the front-end device, where the read request is generated when the front-end device performs a read operation on a data block in a volume or snapshot and includes the unique identification code of the data block to be read; and using the unique identification code as the key of a key-value pair.
  • In a fifth possible implementation, the method further includes: performing a distributed hash operation on the unique identification code to obtain the corresponding route identification code; looking up the network address of the corresponding physical disk in the routing relationship table; checking whether the metadata area of that physical disk stores the unique identification code; if so, reading the data in the data area corresponding to the metadata slot storing the unique identification code and using it as the value of the key-value pair; if not, using misread-reminder data as the value, where the misread-reminder data reminds the front-end device that no data was written to this block before the read; and sending the key-value pair to the front-end device.
  • In a sixth possible implementation according to the first aspect or any of its first to fifth implementations, in the tree structures the volumes are located at leaf nodes and the snapshots at non-leaf nodes.
  • In a seventh possible implementation according to the first aspect or any of its first to fifth implementations, the data area of the physical disk is also striped into data blocks of the predetermined size.
  • In an eighth possible implementation, the size of the predetermined data block is 1 MB.
  • In a ninth possible implementation, the network address is a combination of an IP address and a port.
  • A second aspect provides a physical disk sharing device, configured to share multiple physical disks with front-end devices over a network, where each physical disk has a network address. The device includes: a first identification module, configured to organize the volumes and snapshots to be allocated to front-end devices into multiple tree structures and identify any volume or snapshot in the tree structures by a three-element array comprising a tree identification code, a branch identification code, and a node identification code; a second identification module, configured to stripe each volume or snapshot into data blocks of a predetermined size and identify any data block in any volume or snapshot by a four-element array comprising a tree identification code, a branch identification code, a node identification code, and a data block identification code; a first operation module, configured to perform a combination operation on the four-element array of each data block to obtain a unique identification code in one-to-one correspondence with the four-element array; a second operation module, configured to perform a distributed hash operation on the unique identification code to obtain the route identification code of the data block; and a storage module, configured to map route identification codes one-to-one to the network addresses of the physical disks to form a routing relationship table and store it.
  • In a first possible implementation of the second aspect, the device further includes: a receiving module, configured to acquire a volume load request issued by a front-end device; a selection module, configured to select a volume or snapshot in the tree structures in response to the volume load request; and a sending module, configured to send the unique identification code of each data block on the selected volume or snapshot to the front-end device.
  • In a second possible implementation, the device further includes a key-value pair generation module, where: the receiving module is further configured to acquire a write request sent by the front-end device, the write request being generated when the front-end device performs a write operation on a data block in a volume and including the unique identification code of the data block to be written and the data to be written; and the key-value pair generation module is configured to use the unique identification code as the key and the data to be written as the value to form a key-value pair.
  • In a third possible implementation, the device further includes a lookup module, where: the second operation module is configured to perform a distributed hash operation on the unique identification code to obtain the corresponding route identification code; the lookup module is configured to look up the network address of the corresponding physical disk in the routing relationship table according to the route identification code; and the sending module is further configured to send the key-value pair to the physical disk at that network address, so that the unique identification code in the pair is written to an empty slot in the disk's metadata area and the data to be written is written to the data area corresponding to that slot.
  • In a fourth possible implementation, the device further includes a key-value pair generation module, where: the receiving module is configured to acquire a read request sent by the front-end device, the read request being generated when the front-end device performs a read operation on a data block in a volume or snapshot and including the unique identification code of the data block to be read; and the key-value pair generation module is configured to use the unique identification code as the key of a key-value pair.
  • In a fifth possible implementation, the device further includes a lookup module, where: the second operation module is configured to perform a distributed hash operation on the unique identification code to obtain the corresponding route identification code; the lookup module is configured to look up the network address of the corresponding physical disk in the routing relationship table according to the route identification code, and to check whether the metadata area of that disk stores the unique identification code; if so, the key-value pair generation module is configured to read the data in the data area corresponding to the metadata slot storing the unique identification code and use it as the value of the pair; if not, the key-value pair generation module is configured to use the misread-reminder data as the value, where the misread-reminder data reminds the front-end device that no data was written to this block before the read; and the sending module is configured to send the key-value pair to the front-end device.
  • In an eighth possible implementation of the second aspect, the size of the predetermined data block is 1 MB.
  • In a ninth possible implementation of the second aspect, the network address is a combination of an IP address and a port.
  • Through the above solutions, each data block of a volume or snapshot is identified by a specific data structure via the four-element array, a corresponding route identification code is generated, and the route identification codes are mapped one-to-one to the network addresses of the physical disks, so that the volumes and snapshots of front-end devices are mapped onto multiple physical disks.
  • Each volume and snapshot corresponds to multiple different physical disks; accessing the same volume or snapshot therefore accesses multiple different physical disks rather than a single one, so concurrent access is supported and reliability is greatly improved.
  • FIG. 1 is a schematic structural diagram of the system of the existing IP-SAN technology.
  • FIG. 2 is a schematic structural diagram of a system according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of a method for sharing a physical disk according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a tree structure of a volume and a snapshot according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a format of a data block in a volume and a snapshot according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of data conversion of a method for sharing a physical disk according to an embodiment of the present invention.
  • FIG. 7 is another flowchart of a method for sharing a physical disk according to an embodiment of the present invention.
  • FIG. 8 is another flowchart of a method for sharing a physical disk according to an embodiment of the present invention.
  • FIG. 9 is a schematic diagram showing a data structure of a physical disk according to an embodiment of the present invention.
  • FIG. 10 is another flowchart of a method for sharing a physical disk according to an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of a device of a first embodiment of a physical disk sharing device according to the present invention.
  • Figure 12 is a block diagram showing the structure of a second embodiment of a physical disk sharing device of the present invention.
  • Referring first to FIG. 2, a schematic structural diagram of the system according to an embodiment of the present invention: the physical disk sharing method of the embodiment is applied in a server 20 and is used to share multiple physical disks 1, 2, 3, 4 with front-end devices 11, 12, 13 over a network. The server 20 has network connections to the front-end devices 11, 12, 13, and the physical disks 1, 2, 3, 4 are mounted on the server 20 locally or remotely over a network.
  • It is worth noting that in this embodiment the number of servers is one, but in optional embodiments of the present invention there may be multiple servers; that is, the physical disk sharing method of the embodiments may be applied to multiple servers simultaneously, which is not limited by the present invention.
  • Each physical disk 1, 2, 3, 4 has a network address.
  • Preferably, the network address can be an IP address plus a port number, where the IP address can specifically be the IP address of the server 20 and the port number is the one the server 20 assigns to the physical disk.
  • The front-end devices 11, 12, and 13 may specifically be VMs (Virtual Machines).
  • The numbers of front-end devices and physical disks are plural and may be chosen according to actual needs; they are not limited to the numbers shown in FIG. 2.
  • FIG. 3 is a flowchart of a method for sharing a physical disk according to an embodiment of the present invention. As shown in FIG. 3, the method for sharing a physical disk of the present invention includes the following steps:
  • Step 101: Organize the volumes and snapshots to be allocated to front-end devices into multiple tree structures, and identify any volume or snapshot in the tree structures by a three-element array comprising a tree identification code, a branch identification code, and a node identification code.
  • Step 102: Stripe each volume or snapshot into data blocks of a predetermined size, and identify any data block in any volume or snapshot by a four-element array comprising a tree identification code, a branch identification code, a node identification code, and a data block identification code.
  • Step 103: Perform a combination operation on the four-element array of each data block to obtain a unique identification code in one-to-one correspondence with the four-element array.
  • Step 104: Perform a distributed hash operation on the unique identification code to obtain the route identification code of the data block, and map route identification codes one-to-one to the network addresses of the physical disks to form a routing relationship table and store it.
  • For ease of understanding, refer to FIG. 4 and FIG. 5 together: FIG. 4 is a schematic diagram of the tree structure of volumes and snapshots according to an embodiment of the present invention, and FIG. 5 is a schematic diagram of the format of data blocks in volumes and snapshots.
  • As shown in FIG. 4, in step 101 the nodes of a tree structure are identified by a triple (tree, branch, node), where tree denotes the tree number, branch the branch, and node the node. For example, in tree 1, tree id = 1, so node a is (1,1,0), node b is (1,1,1), node c is (1,2,0), node d is (1,1,3), node e is (1,3,0), node f is (1,2,1), and node g is (1,3,1).
  • In tree 2, tree id = 2, so node a' is (2,1,0), node b' is (2,1,1), node c' is (2,2,0), node d' is (2,1,3), node e' is (2,3,0), node f' is (2,2,1), and node g' is (2,3,1).
  • After the tree structures are identified by triples, the leaf nodes of a tree can serve as volume nodes, the non-leaf nodes as snapshot nodes, and a node that spawns a branch as a new volume created from a snapshot. On a single branch of a tree, the node ids of snapshots and volumes differ and increase monotonically.
  • Referring to FIG. 5, as described in step 102, the volumes and snapshots can be further striped; the unit of each data block is preferably 1 MB, although one of ordinary skill in the art may set it to other values in other embodiments.
  • Each data block is numbered by block no (block sequence number); for example, the data blocks are numbered 0, 1, 2, 3, 4, ... in turn (how many numbers are available depends on the data length of block no).
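  • As a concrete illustration (a sketch, not from the patent; BLOCK_SIZE and locate are illustrative names), the following Python snippet shows how a byte offset inside a volume would map to a block no under 1 MB striping:

```python
BLOCK_SIZE = 1 << 20  # 1 MB, the stripe unit preferred in this embodiment

def locate(offset: int) -> tuple[int, int]:
    """Map a byte offset within a volume to (block no, offset inside the block)."""
    return offset // BLOCK_SIZE, offset % BLOCK_SIZE

assert locate(0) == (0, 0)                       # first byte -> block no 0
assert locate(3 * BLOCK_SIZE + 512) == (3, 512)  # 512 bytes into block no 3
```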
  • In step 102, each 1 MB data block produced by striping a volume or snapshot is encoded to generate a four-element array, whose encoding format is shown in Table 1: (tree id, block no, branch id, node id).
  • With this encoding, any 1 MB data block of a volume or snapshot can be identified. For example, for node a in FIG. 4, the code of the first data block of node a is (1, 1, 1, 0), and the code of the nth data block is (1, n, 1, 0).
  • Illustratively, the data lengths of tree id, block no, branch id, and node id can each be set to 4 bits.
  • In step 103, the four-element array is combined. For example, for the first data block (1, 1, 1, 0) of node a, the combination operation is specifically: convert each element of the four-element array into a binary number, express it as the 4-bit binary tuple (0001, 0001, 0001, 0000), and concatenate the elements in order; the resulting unique identification code is 0001000100010000. Because the combination operation is reversible, the four-element array and the unique identification code can be derived from each other.
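  • To make the combination operation concrete, here is a minimal Python sketch (illustrative, not part of the patent; the function names are assumptions) that packs the four-element array into the unique identification code and shows that the operation is reversible:

```python
FIELD_BITS = 4  # tree id, block no, branch id, node id are each 4 bits here

def pack_unique_id(tree_id: int, block_no: int, branch_id: int, node_id: int) -> int:
    """Combination operation: concatenate the 4-bit binary form of each
    element of the four-element array, in order, into one code."""
    uid = 0
    for field in (tree_id, block_no, branch_id, node_id):
        assert 0 <= field < (1 << FIELD_BITS), "field exceeds its 4-bit width"
        uid = (uid << FIELD_BITS) | field
    return uid

def unpack_unique_id(uid: int):
    """The combination is reversible: recover the four-element array."""
    node_id = uid & 0xF
    branch_id = (uid >> 4) & 0xF
    block_no = (uid >> 8) & 0xF
    tree_id = (uid >> 12) & 0xF
    return tree_id, block_no, branch_id, node_id

# First data block (1, 1, 1, 0) of node a:
uid = pack_unique_id(1, 1, 1, 0)
assert format(uid, "016b") == "0001000100010000" and uid == 4368
assert unpack_unique_id(uid) == (1, 1, 1, 0)
```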
  • In step 104, the distributed hash operation can be illustrated by the following example: for the unique identification code 0001000100010000 of the first data block (1, 1, 1, 0) of node a, first convert it to a decimal number: 2^4 + 2^8 + 2^12 = 16 + 256 + 4096 = 4368. A modulo operation can then be performed on 4368, such as 4368 mod n, where n is the number of physical disks; taking n = 4 physical disks by way of example, 4368 mod 4 = 0, so the route identification code of the first data block (1, 1, 1, 0) of node a is 0.
  • If there are 4 physical disks, the route identification code is one of 0, 1, 2, and 3. Thus, in the embodiments of the present invention, any four-element array corresponds to one route identification code through the distributed hash operation.
  • The server 20 can map route identification codes one-to-one to the network addresses of the physical disks to form a routing relationship table and store it. For example, the routing relationship table can be as shown in Table 2:
    Route identification code | Network address
    0 | 192.168.1.1:1000
    1 | 192.168.1.1:1001
    2 | 192.168.1.1:1002
    3 | 192.168.1.1:1003
  • Here, network address 192.168.1.1:1000 can point to physical disk 1, 192.168.1.1:1001 to physical disk 2, 192.168.1.1:1002 to physical disk 3, and 192.168.1.1:1003 to physical disk 4.
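  • The route computation and the routing relationship table of Table 2 can be sketched as follows (illustrative Python; the table literal mirrors Table 2, and route_id implements the mod-n hash of the example):

```python
def route_id(unique_id: int, n_disks: int) -> int:
    """Distributed hash used in the example: the unique identification
    code modulo the number of physical disks."""
    return unique_id % n_disks

# Routing relationship table from Table 2 (route id -> network address).
ROUTING_TABLE = {
    0: "192.168.1.1:1000",  # points to physical disk 1
    1: "192.168.1.1:1001",  # points to physical disk 2
    2: "192.168.1.1:1002",  # points to physical disk 3
    3: "192.168.1.1:1003",  # points to physical disk 4
}

def disk_address(unique_id: int) -> str:
    """Resolve a data block's unique identification code to the network
    address of the physical disk that stores it."""
    return ROUTING_TABLE[route_id(unique_id, len(ROUTING_TABLE))]

assert disk_address(4368) == "192.168.1.1:1000"  # route id 0 -> disk 1
assert disk_address(4401) == "192.168.1.1:1001"  # route id 1 -> disk 2
```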
  • Refer also to FIG. 6, a schematic diagram of data conversion in the method for sharing physical disks according to an embodiment of the present invention. When the number of physical disks is 4, taking multiple unique identification codes modulo 4 yields the four route identification codes 0, 1, 2, and 3, so the unique identification codes can be divided into four groups, each associated with one physical disk through the routing relationship table.
  • Note that when there are n physical disks, the route identification codes are 0, 1, 2, ..., n-1; the unique identification codes then fall into n groups and are associated with the n disks through the routing relationship table.
  • Therefore, in this embodiment, the data blocks in volumes and snapshots are identified by four-element arrays, and the data blocks of each volume and snapshot are associated with different physical disks according to those arrays. When a front-end device accesses different data blocks in the same volume or snapshot, it actually accesses multiple different physical disks rather than a single one, so the technical solution disclosed in this embodiment supports concurrent access, and reliability is greatly improved.
  • The specific ways in which a front-end device mounts a volume or snapshot locally and performs write and read operations are described in detail below.
  • Refer first to FIG. 7, another flowchart of the method for sharing physical disks according to an embodiment of the present invention. FIG. 7 describes the process by which a front-end device mounts a volume or snapshot locally; as shown in FIG. 7, the method further includes the following steps:
  • Step 104: Acquire a volume load request issued by the front-end device.
  • Step 105: Select a volume or snapshot in the tree structures in response to the volume load request.
  • Step 106: Send the unique identification code of each data block on the selected volume or snapshot to the front-end device, so that the front-end device can load the selected volume or snapshot locally.
  • Steps 104 to 106 above can all be performed by the server. When a front-end device needs to load a volume or snapshot locally, it issues a volume load request to the server; the server selects a volume or snapshot in the tree structures in response and sends the unique identification code of each data block on the selected volume or snapshot to the front-end device. The front-end device thereby obtains multiple unique identification codes and accesses the physical disks according to them in order to read and write.
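  • The server's side of this load flow reduces to enumerating the unique identification codes of the selected node's stripes, roughly as below (an illustrative sketch reusing pack_unique_id from the earlier sketch; n_blocks is an assumed parameter):

```python
def load_volume(tree_id: int, branch_id: int, node_id: int, n_blocks: int) -> list:
    """Steps 104-106 in miniature: answer a volume-load request with the
    unique identification code of every data block on the chosen node."""
    return [pack_unique_id(tree_id, b, branch_id, node_id) for b in range(n_blocks)]

# Mounting node g (1, 3, 1) of tree 1: the code of its block no 1 is 4401.
assert load_volume(1, 3, 1, n_blocks=2)[1] == 0b0001000100110001 == 4401
```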
  • Refer now to FIG. 8, another flowchart of the method for sharing physical disks according to an embodiment of the present invention. This flow mainly describes how the server writes to the physical disks when a front-end device writes to a locally loaded volume. As shown in FIG. 8, the method includes the following steps:
  • Step 106: Acquire a write request sent by the front-end device. The write request is generated when the front-end device performs a write operation on a data block in a locally loaded volume, and it includes the unique identification code of the data block to be written and the data to be written. For example, if the front-end device has mounted node g (1, 3, 1) of tree 1 as a local volume and writes to the first data block (1, 1, 3, 1) of that volume, the write request includes the unique identification code 0001000100110001 of the first data block (1, 1, 3, 1) of node g and the data to be written.
  • Step 107: Use the unique identification code as the key and the data to be written as the value to form a key-value pair.
  • Step 108: Perform a distributed hash operation on the unique identification code to obtain the corresponding route identification code. In this step, the distributed hash operation on the unique identification code 0001000100110001 first obtains its corresponding decimal number: 2^0 + 2^4 + 2^5 + 2^8 + 2^12 = 1 + 16 + 32 + 256 + 4096 = 4401. A modulo operation is then performed on 4401, such as 4401 mod 4, where 4 is the number of physical disks, giving route identification code 1.
  • Step 109: Look up the network address of the corresponding physical disk in the routing relationship table according to the route identification code. Specifically, referring to Table 2, the network address corresponding to route identification code 1 is 192.168.1.1:1001, which points to physical disk 2, so physical disk 2 can be located and written to.
  • Step 110: Send the key-value pair to the physical disk at that network address, so that the unique identification code in the pair is written to an empty slot in the disk's metadata area and the data to be written in the pair is written to the data area corresponding to that slot. The above steps can be performed by the server.
  • For ease of explanation, refer to FIG. 9, a schematic diagram of the data structure of a physical disk according to an embodiment of the present invention. The physical disk 70 includes a metadata area 701 and a data area 702; the metadata area 701 stores metadata, each item with a fixed data length, while the data area 702 stores data blocks, each also with a fixed data length, the block length being larger than the metadata length.
  • In a preferred embodiment, the data area of the physical disk 70 is also striped into data blocks of the predetermined size, which matches the data length of the blocks produced by striping volumes and snapshots; for example, both can be set to 1 MB.
  • The metadata area 701 has a mapping relationship with the data area 702; as the arrows in FIG. 9 show, knowing a metadata slot gives access to the data area corresponding to it: the K1 area corresponds to the V1 area, K2 to V2, K3 to V3, K4 to V4, and K5 to V5.
  • Thus in step 110, after the key-value pair is sent to the physical disk at the network address, only the metadata area of the disk needs to be searched: once an empty slot is found, the unique identification code is written into it, and the data to be written is written, according to the mapping between the metadata area and the data area, into the data area corresponding to that empty slot, completing the data write operation.
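  • A toy model of this write path, assuming the FIG. 9 layout of same-indexed metadata slots and data blocks (illustrative Python, not the patent's on-disk implementation):

```python
class PhysicalDisk:
    """Toy FIG. 9 layout: metadata slot i (a key) maps to data-area block i."""

    def __init__(self, n_slots: int):
        self.metadata: list = [None] * n_slots  # unique identification codes
        self.data: list = [b""] * n_slots       # the corresponding data blocks

    def write(self, unique_id: int, payload: bytes) -> None:
        """Step 110: put the key into the first empty metadata slot and the
        payload into the data-area block mapped to that slot."""
        slot = self.metadata.index(None)  # raises ValueError if the disk is full
        self.metadata[slot] = unique_id
        self.data[slot] = payload
```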
  • Refer now to FIG. 10, another flowchart of the method for sharing physical disks according to an embodiment of the present invention. This flow mainly describes how the server reads the physical disks when a front-end device reads a locally loaded volume or snapshot. As shown in FIG. 10, the method includes the following steps:
  • Step 206: Acquire a read request sent by the front-end device. The read request is generated when the front-end device performs a read operation on a data block in a locally loaded volume or snapshot, and it includes the unique identification code of the data block to be read. For example, if the front-end device has mounted node g (1, 3, 1) of tree 1 as a local volume and reads the first data block (1, 1, 3, 1) of that volume, the read request includes the unique identification code 0001000100110001 of the first data block (1, 1, 3, 1) of node g.
  • Step 207: Use the unique identification code as the key of a key-value pair.
  • Step 208: Perform a distributed hash operation on the unique identification code to obtain the corresponding route identification code. In this step, the distributed hash operation on the unique identification code 0001000100110001 first obtains its corresponding decimal number: 2^0 + 2^4 + 2^5 + 2^8 + 2^12 = 1 + 16 + 32 + 256 + 4096 = 4401. A modulo operation is then performed on 4401, such as 4401 mod 4, where 4 is the number of physical disks, giving route identification code 1.
  • Step 209: Look up the network address of the corresponding physical disk in the routing relationship table according to the route identification code. Specifically, referring to Table 2, the network address corresponding to route identification code 1 is 192.168.1.1:1001, which points to physical disk 2, so physical disk 2 can be located and read.
  • Step 210: Check whether the metadata area of the physical disk at that network address stores the unique identification code; if so, execute step 212; if not, execute step 211.
  • Step 212: Read the data in the data area corresponding to the metadata slot storing the unique identification code, and use that data as the value of the key-value pair.
  • Step 211: Use the misread-reminder data as the value of the key-value pair. The misread-reminder data reminds the front-end device that no data was written to this block before the read.
  • Step 213: Send the key-value pair to the front-end device.
  • In step 210 there are two cases. In the first, the metadata area of physical disk 2 stores the unique identification code: after steps 106 to 110 above, the metadata area of physical disk 2 stores the unique identification code 0001000100110001, which matches the unique identification code 0001000100110001 computed in step 208. Step 212 is then executed: the data in the data area corresponding to the metadata slot storing 0001000100110001 is read and used as the value of the key-value pair, and in step 213 the key-value pair is sent to the front-end device, which thereby obtains the data of the data block corresponding to the unique identification code 0001000100110001.
  • In the second case, the metadata area of physical disk 2 does not store the unique identification code 0001000100110001. In step 210 the misread-reminder data is then used as the value of the key-value pair, and in step 213 the pair is sent to the front-end device, which learns that the data block it wants to read was never written before the read. The misread-reminder data can be represented by a specific value, preferably set to 0.
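  • Continuing the PhysicalDisk sketch above, the read side of steps 210 to 213 might look like the following, with 0 as the misread-reminder value the embodiment prefers (illustrative only):

```python
MISS = b"\x00"  # misread-reminder data, represented here by the value 0

def read_block(disk: "PhysicalDisk", unique_id: int) -> bytes:
    """Steps 210-213 in miniature: look the key up in the metadata area; on
    a hit return the mapped data-area block, on a miss return the misread
    reminder so the front end knows the block was never written."""
    for slot, key in enumerate(disk.metadata):
        if key == unique_id:
            return disk.data[slot]
    return MISS

# After the write of steps 106-110, the read hits; an unknown key misses.
disk = PhysicalDisk(n_slots=8)
disk.write(4401, b"payload")
assert read_block(disk, 4401) == b"payload" and read_block(disk, 4368) == MISS
```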
  • Therefore, in this embodiment, on the basis of identifying the data blocks in volumes and snapshots by four-element arrays and associating the data blocks of each volume and snapshot with different physical disks accordingly, when a front-end device reads or writes different data blocks in the same volume or snapshot (reads apply to volumes and snapshots, writes only to volumes), it actually accesses multiple different physical disks rather than a single one.
  • Moreover, when a physical disk stops working, the server only needs to modify the routing relationship table and the remaining physical disks keep working normally. For example, if physical disk 4 fails, only routing relationship Table 2 needs to be modified so that route identification code 3 is associated with the network address 192.168.1.1:1002, that is, route identification codes 2 and 3 are both associated with physical disk 3 (merely as an example); physical disks 1, 2, and 3 then keep the system running. Therefore, when a limited number of physical disks fail, normal system operation can be guaranteed by modifying the routing relationship table.
  • Alternatively, when a physical disk stops working, the server can modify the distributed hash algorithm and the remaining physical disks keep working normally. For example, if physical disk 4 fails, it suffices to change the value of n from 4 to 3 in the distributed hash algorithm and form a new routing relationship table, so that physical disks 1, 2, and 3 keep the system running. Therefore, when a limited number of physical disks fail, normal system operation can be guaranteed by modifying the value of n in the distributed hash algorithm to match the number of working physical disks.
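  • Both recovery strategies amount to rebuilding the routing relationship table over the surviving disks; a minimal sketch under that assumption (illustrative Python, using the mod-n hash above):

```python
def rebuild_routing_table(survivor_addresses: list) -> dict:
    """After a disk failure, renumber route identification codes 0..n-1
    over the surviving disks so the mod-n hash resolves to a working disk."""
    return dict(enumerate(survivor_addresses))

# Physical disk 4 (192.168.1.1:1003) fails: rebuild with n = 3 survivors.
table = rebuild_routing_table(
    ["192.168.1.1:1000", "192.168.1.1:1001", "192.168.1.1:1002"])
assert table[4401 % len(table)] == "192.168.1.1:1000"  # 4401 mod 3 == 0
```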
  • The technical solution disclosed in this embodiment can therefore support concurrent access and greatly improves reliability.
  • An embodiment of the present invention further provides a physical disk sharing device, which can be disposed in the server 20 to implement the foregoing physical disk sharing method.
  • Optionally, the physical disk sharing device may also be disposed in multiple servers, which is not limited by the present invention.
  • FIG. 11 is a schematic structural diagram of a device in a first embodiment of a physical disk sharing device according to the present invention.
  • the physical disk sharing device 40 is configured to share a plurality of physical disks to the front-end device through a network, where each physical disk has a network address.
  • As shown in FIG. 11, the physical disk sharing device 40 includes:
  • a first identification module 401, configured to organize the volumes and snapshots to be allocated to front-end devices into multiple tree structures and identify any volume or snapshot in the tree structures by a three-element array comprising a tree identification code, a branch identification code, and a node identification code;
  • a second identification module 402, configured to stripe each volume or snapshot into data blocks of a predetermined size and identify any data block in any volume or snapshot by a four-element array comprising a tree identification code, a branch identification code, a node identification code, and a data block identification code;
  • a first operation module 403, configured to perform a combination operation on the four-element array of each data block to obtain a unique identification code in one-to-one correspondence with the four-element array;
  • a second operation module 404, configured to perform a distributed hash operation on the unique identification code to obtain the route identification code of the data block; and
  • a storage module 405, configured to map route identification codes one-to-one to the network addresses of the physical disks to form a routing relationship table and store it.
  • In this embodiment, the data blocks in volumes and snapshots are identified by four-element arrays and the data blocks of each volume and snapshot are associated with different physical disks accordingly; when a front-end device accesses different data blocks in the same volume or snapshot, it actually accesses multiple different physical disks rather than a single one, so the technical solution disclosed in this embodiment supports concurrent access, and reliability is greatly improved.
  • The specific ways in which a front-end device mounts a volume or snapshot locally and performs write and read operations are described in detail below.
  • Optionally, the physical disk sharing device 40 further includes: a receiving module 407, configured to acquire a volume load request issued by a front-end device; a selection module 406, configured to select a volume or snapshot in the tree structures in response to the volume load request; and a sending module 409, configured to send the unique identification code of each data block on the selected volume or snapshot to the front-end device.
  • The front-end device completes loading a volume or snapshot locally in the manner above.
  • Optionally, the physical disk sharing device 40 further includes a key-value pair generation module 410 and a lookup module 408, where: the receiving module 407 is further configured to acquire a write request sent by the front-end device, the write request being generated when the front-end device performs a write operation on a data block in a volume and including the unique identification code of the data block to be written and the data to be written; the key-value pair generation module 410 is configured to use the unique identification code as the key and the data to be written as the value to form a key-value pair; the second operation module 404 is configured to perform a distributed hash operation on the unique identification code to obtain the corresponding route identification code; the lookup module 408 is configured to look up the network address of the corresponding physical disk in the routing relationship table according to the route identification code; and the sending module 409 is further configured to send the key-value pair to the physical disk at that network address, so that the unique identification code in the pair is written to an empty slot in the disk's metadata area and the data to be written is written to the data area corresponding to that slot.
  • Therefore, when a front-end device writes to a local volume, it essentially writes the data to be written to the data areas of multiple physical disks.
  • Optionally, the receiving module 407 is further configured to acquire a read request sent by the front-end device, the read request being generated when the front-end device performs a read operation on a data block in a volume or snapshot and including the unique identification code of the data block to be read; the key-value pair generation module 410 is further configured to use the unique identification code as the key of a key-value pair; the second operation module 404 is configured to perform a distributed hash operation on the unique identification code to obtain the corresponding route identification code; the lookup module 408 is further configured to look up the network address of the corresponding physical disk in the routing relationship table according to the route identification code and to check whether the metadata area of that disk stores the unique identification code; if so, the key-value pair generation module 410 reads the data in the data area corresponding to the metadata slot storing the unique identification code and uses it as the value of the pair; if not, the key-value pair generation module 410 uses the misread-reminder data as the value, the misread-reminder data reminding the front-end device that no data was written to this block before the read; and the sending module 409 is configured to send the key-value pair to the front-end device.
  • Therefore, when a front-end device reads a local volume or snapshot, it essentially reads the data areas of multiple physical disks.
  • Optionally, in the tree structures, volumes are located at leaf nodes and snapshots at non-leaf nodes.
  • Optionally, the data area of the physical disk is also striped into data blocks of the predetermined size.
  • Optionally, the size of the predetermined data block is 1 MB.
  • Optionally, the network address is a combination of an IP address and a port.
  • Refer to FIG. 12, a schematic structural diagram of the second embodiment of the physical disk sharing device of the present invention. The physical disk sharing device 40 is configured to share multiple physical disks with front-end devices over a network, where each physical disk has a network address. The device 40 includes at least one processor 502, at least one network interface 503, a memory 501, and at least one communication bus 504; the memory 501 is configured to store program instructions, and the processor 502 is configured to:
  • execute the program instructions to organize the volumes and snapshots to be allocated to front-end devices into multiple tree structures and identify any volume or snapshot by a three-element array comprising a tree identification code, a branch identification code, and a node identification code; to stripe each volume or snapshot into data blocks of a predetermined size and identify any data block by a four-element array comprising a tree identification code, a branch identification code, a node identification code, and a data block identification code; to perform a combination operation on the four-element array of each data block to obtain a unique identification code in one-to-one correspondence with it; to perform a distributed hash operation on the unique identification code to obtain the route identification code of the data block; and to map route identification codes one-to-one to the network addresses of the physical disks to form a routing relationship table and store it.
  • In this embodiment, the data blocks in volumes and snapshots are identified by four-element arrays and the data blocks of each volume and snapshot are associated with different physical disks accordingly; when a front-end device accesses different data blocks in the same volume or snapshot, it actually accesses multiple different physical disks rather than a single one, so the technical solution disclosed in this embodiment supports concurrent access, and reliability is greatly improved.
  • The specific ways in which a front-end device mounts a volume or snapshot locally and performs write and read operations are described in detail below.
  • Optionally, the network interface 503 is configured to acquire a volume load request issued by a front-end device; the processor 502 is configured to execute the program instructions to select a volume or snapshot in the tree structures in response to the volume load request; and the network interface 503 is further configured to send the unique identification code of each data block on the selected volume or snapshot to the front-end device.
  • The front-end device completes loading a volume or snapshot locally in the manner above.
  • Optionally, the network interface 503 is further configured to acquire a write request sent by the front-end device, the write request being generated when the front-end device performs a write operation on a data block in a volume and including the unique identification code of the data block to be written and the data to be written; the processor 502 executes the program instructions to use the unique identification code as the key and the data to be written as the value to form a key-value pair, to perform a distributed hash operation on the unique identification code to obtain the corresponding route identification code, and to look up the network address of the corresponding physical disk in the routing relationship table according to the route identification code; and the network interface 503 is further configured to send the key-value pair to the physical disk at that network address, so that the unique identification code in the pair is written to an empty slot in the disk's metadata area and the data to be written is written to the data area corresponding to that slot.
  • Therefore, when a front-end device writes to a local volume, it essentially writes the data to be written to the data areas of multiple physical disks.
  • Optionally, the network interface 503 is configured to acquire a read request sent by the front-end device, the read request being generated when the front-end device performs a read operation on a data block in a volume or snapshot and including the unique identification code of the data block to be read; the processor 502 executes the program instructions to use the unique identification code as the key of a key-value pair, to perform a distributed hash operation on it to obtain the corresponding route identification code, and to look up the network address of the corresponding physical disk in the routing relationship table according to the route identification code; and the network interface 503 is configured to send the key-value pair to the front-end device.
  • Therefore, when a front-end device reads a local volume or snapshot, it essentially reads the data areas of multiple physical disks.
  • Optionally, in the tree structures, volumes are located at leaf nodes and snapshots at non-leaf nodes.
  • Optionally, the data area of the physical disk is also striped into data blocks of the predetermined size.
  • Optionally, the size of the predetermined data block is 1 MB.
  • Optionally, the network address is a combination of an IP address and a port.
  • Through the above solutions, each data block of a volume or snapshot is identified by a specific data structure via the four-element array, a corresponding route identification code is generated, and the route identification codes are mapped one-to-one to the network addresses of the physical disks, so that the volumes and snapshots of front-end devices are mapped onto multiple physical disks, each volume and snapshot corresponding to multiple different physical disks.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method and device for sharing physical disks are provided. The method includes the following steps: organizing the volumes and snapshots to be allocated to front-end devices into multiple tree structures, and identifying any volume or snapshot in the tree structures by a three-element array comprising a tree identification code, a branch identification code, and a node identification code; striping each volume or snapshot into data blocks of a predetermined size, and identifying any data block in any volume or snapshot in the tree structures by a four-element array comprising a tree identification code, a branch identification code, a node identification code, and a data block identification code; performing a combination operation on the four-element array of each data block to obtain a unique identification code; and performing a distributed hash operation on the unique identification code to obtain a route identification code, mapping route identification codes one-to-one to the network addresses of the physical disks to form a routing relationship table, and storing it. The present invention can support concurrent access, and system reliability is improved.

Description

Method and Device for Sharing a Physical Disk
This application claims priority to Chinese Patent Application No. 201510473671.4, filed with the Chinese Patent Office on August 5, 2015 and entitled "Method and Device for Sharing a Physical Disk", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of network distributed storage technologies, and in particular to a method and device for sharing physical disks.
Background
With the explosive growth of data volume, stored data files are becoming larger and larger, and traditional data storage methods can no longer meet the ever-growing demand. Distributed storage technologies have sprung up in recent years. Among the existing distributed storage technologies, IP-SAN (IP Storage Area Network) technology is one of the most widely used; it emerged against the backdrop of the current big-data trend and is gradually replacing traditional NAS and SAN storage devices.
As shown in FIG. 1, a schematic diagram of the system structure of the existing IP-SAN technology, an IP-SAN system uses the iSCSI (Internet Small Computer System Interface) protocol to share a physical disk 10, in which a LUN (volume) 100 of a Target storage device is provided, with multiple Initiators (i.e., front-end devices); each Initiator can mount the volume 100 locally and provide it to applications as a local SCSI disk.
In the prior art, multiple initiators communicate with the same volume 100 through network connections and sessions. However, because a single volume 100 on the back-end storage device must be shared with multiple initiators at the same time, the storage performance available to the front-end devices is limited: when initiator 1, initiator 2, and initiator 3 concurrently access a single volume, they are actually accessing the same physical disk. When there are too many initiators, the single physical disk is overloaded and may block, so the initiators cannot access the volume normally. The prior art is therefore limited by its own network architecture; it cannot support concurrent access, and its reliability is low.
Summary
The embodiments of the present invention provide a method and device for sharing physical disks that map the volumes and snapshots of front-end devices onto multiple physical disks. Each volume and snapshot corresponds to multiple different physical disks, so accessing the same volume or snapshot actually requires accessing multiple different physical disks rather than a single one; concurrent access to the physical disks is therefore supported, and reliability is also greatly improved.
A first aspect provides a method for sharing physical disks, used to share multiple physical disks with front-end devices over a network, where each physical disk has a network address. The method includes: organizing the volumes and snapshots to be allocated to front-end devices into multiple tree structures, and identifying any volume or snapshot in the tree structures by a three-element array comprising a tree identification code, a branch identification code, and a node identification code; striping each volume or snapshot into data blocks of a predetermined size, and identifying any data block in any volume or snapshot by a four-element array comprising a tree identification code, a branch identification code, a node identification code, and a data block identification code; performing a combination operation on the four-element array of each data block to obtain a unique identification code in one-to-one correspondence with the four-element array; and performing a distributed hash operation on the unique identification code to obtain the route identification code of the data block, mapping route identification codes one-to-one to the network addresses of the physical disks to form a routing relationship table, and storing it.
In a first possible implementation of the first aspect, the method further includes: acquiring a volume load request issued by a front-end device; selecting a volume or snapshot in the tree structures in response to the volume load request; and sending the unique identification code of each data block on the selected volume or snapshot to the front-end device.
According to the first possible implementation of the first aspect, in a second possible implementation, the method further includes: acquiring a write request sent by the front-end device, where the write request is generated when the front-end device performs a write operation on a data block in a volume and includes the unique identification code of the data block to be written and the data to be written; and using the unique identification code as the key and the data to be written as the value to form a key-value pair.
According to the first possible implementation of the first aspect, in a third possible implementation, the method further includes: performing a distributed hash operation on the unique identification code to obtain the corresponding route identification code; looking up the network address of the corresponding physical disk in the routing relationship table according to the route identification code; and sending the key-value pair to the physical disk at that network address, so that the unique identification code in the pair is written to an empty slot in the disk's metadata area and the data to be written is written to the data area corresponding to that slot.
According to the first possible implementation of the first aspect, in a fourth possible implementation, the method further includes: acquiring a read request sent by the front-end device, where the read request is generated when the front-end device performs a read operation on a data block in a volume or snapshot and includes the unique identification code of the data block to be read; and using the unique identification code as the key of a key-value pair.
According to the fourth possible implementation of the first aspect, in a fifth possible implementation, the method further includes: performing a distributed hash operation on the unique identification code to obtain the corresponding route identification code; looking up the network address of the corresponding physical disk in the routing relationship table according to the route identification code; checking whether the metadata area of that physical disk stores the unique identification code; if so, reading the data in the data area corresponding to the metadata slot storing the unique identification code and using it as the value of the key-value pair; if not, using misread-reminder data as the value, where the misread-reminder data reminds the front-end device that no data was written to this block before the read; and sending the key-value pair to the front-end device.
According to the first aspect or any of its first to fifth possible implementations, in a sixth possible implementation, in the tree structures the volumes are located at leaf nodes and the snapshots at non-leaf nodes.
According to the first aspect or any of its first to fifth possible implementations, in a seventh possible implementation, the data area of the physical disk is also striped into data blocks of the predetermined size.
According to the first aspect or any of its first to fifth possible implementations, in an eighth possible implementation, the size of the predetermined data block is 1 MB.
According to the first aspect or any of its first to fifth possible implementations, in a ninth possible implementation, the network address is a combination of an IP address and a port.
A second aspect provides a physical disk sharing device, configured to share multiple physical disks with front-end devices over a network, where each physical disk has a network address. The device includes: a first identification module, configured to organize the volumes and snapshots to be allocated to front-end devices into multiple tree structures and identify any volume or snapshot in the tree structures by a three-element array comprising a tree identification code, a branch identification code, and a node identification code; a second identification module, configured to stripe each volume or snapshot into data blocks of a predetermined size and identify any data block in any volume or snapshot by a four-element array comprising a tree identification code, a branch identification code, a node identification code, and a data block identification code; a first operation module, configured to perform a combination operation on the four-element array of each data block to obtain a unique identification code in one-to-one correspondence with the four-element array; a second operation module, configured to perform a distributed hash operation on the unique identification code to obtain the route identification code of the data block; and a storage module, configured to map route identification codes one-to-one to the network addresses of the physical disks to form a routing relationship table and store it.
In a first possible implementation of the second aspect, the device further includes: a receiving module, configured to acquire a volume load request issued by a front-end device; a selection module, configured to select a volume or snapshot in the tree structures in response to the volume load request; and a sending module, configured to send the unique identification code of each data block on the selected volume or snapshot to the front-end device.
According to the first possible implementation of the second aspect, in a second possible implementation, the device further includes a key-value pair generation module, where: the receiving module is further configured to acquire a write request sent by the front-end device, the write request being generated when the front-end device performs a write operation on a data block in a volume and including the unique identification code of the data block to be written and the data to be written; and the key-value pair generation module is configured to use the unique identification code as the key and the data to be written as the value to form a key-value pair.
According to the second possible implementation of the second aspect, in a third possible implementation, the device further includes a lookup module, where: the second operation module is configured to perform a distributed hash operation on the unique identification code to obtain the corresponding route identification code; the lookup module is configured to look up the network address of the corresponding physical disk in the routing relationship table according to the route identification code; and the sending module is further configured to send the key-value pair to the physical disk at that network address, so that the unique identification code in the pair is written to an empty slot in the disk's metadata area and the data to be written is written to the data area corresponding to that slot.
According to the first possible implementation of the second aspect, in a fourth possible implementation, the device further includes a key-value pair generation module, where: the receiving module is configured to acquire a read request sent by the front-end device, the read request being generated when the front-end device performs a read operation on a data block in a volume or snapshot and including the unique identification code of the data block to be read; and the key-value pair generation module is configured to use the unique identification code as the key of a key-value pair.
According to the fourth possible implementation of the second aspect, in a fifth possible implementation, the device further includes a lookup module, where: the second operation module is configured to perform a distributed hash operation on the unique identification code to obtain the corresponding route identification code; the lookup module is configured to look up the network address of the corresponding physical disk in the routing relationship table according to the route identification code, and to check whether the metadata area of that disk stores the unique identification code; if so, the key-value pair generation module reads the data in the data area corresponding to the metadata slot storing the unique identification code and uses it as the value of the pair; if not, the key-value pair generation module uses the misread-reminder data as the value, where the misread-reminder data reminds the front-end device that no data was written to this block before the read; and the sending module is configured to send the key-value pair to the front-end device.
According to the second aspect or any of its first to fifth possible implementations, in a sixth possible implementation, in the tree structures the volumes are located at leaf nodes and the snapshots at non-leaf nodes.
According to the second aspect or any of its first to fifth possible implementations, in a seventh possible implementation, the data area of the physical disk is also striped into data blocks of the predetermined size.
According to the second aspect or any of its first to fifth possible implementations, in an eighth possible implementation, the size of the predetermined data block is 1 MB.
According to the second aspect or any of its first to fifth possible implementations, in a ninth possible implementation, the network address is a combination of an IP address and a port.
Through the above solutions, in the embodiments of the present invention each data block of a volume or snapshot is identified by a specific data structure via the four-element array, a corresponding route identification code is generated, and the route identification codes are mapped one-to-one to the network addresses of the physical disks, so that the volumes and snapshots of front-end devices are mapped onto multiple physical disks. Each volume and snapshot corresponds to multiple different physical disks; accessing the same volume or snapshot therefore actually requires accessing multiple different physical disks rather than a single one, so concurrent access is supported and reliability is also greatly improved.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments. Apparently, the accompanying drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic diagram of the system structure of the existing IP-SAN technology;
FIG. 2 is a schematic diagram of the system structure according to an embodiment of the present invention;
FIG. 3 is a flowchart of the method for sharing physical disks according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the tree structure of volumes and snapshots according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the format of data blocks in volumes and snapshots according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of data conversion in the method for sharing physical disks according to an embodiment of the present invention;
FIG. 7 is another flowchart of the method for sharing physical disks according to an embodiment of the present invention;
FIG. 8 is another flowchart of the method for sharing physical disks according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of the data structure of a physical disk according to an embodiment of the present invention;
FIG. 10 is another flowchart of the method for sharing physical disks according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of the first embodiment of the physical disk sharing device of the present invention;
FIG. 12 is a schematic structural diagram of the second embodiment of the physical disk sharing device of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Refer first to FIG. 2, a schematic diagram of the system structure according to an embodiment of the present invention. As shown in FIG. 2, the physical disk sharing method of the embodiment is applied in a server 20 and is used to share multiple physical disks 1, 2, 3, 4 with front-end devices 11, 12, 13 over a network. The server 20 has network connections to the front-end devices 11, 12, 13, and the physical disks 1, 2, 3, 4 are mounted on the server 20 locally or remotely over a network.
It is worth noting that in this embodiment there is one server, but in optional embodiments of the present invention there may be multiple servers; that is, the physical disk sharing method of the embodiments may be applied to multiple servers simultaneously, which is not limited by the present invention.
Each physical disk 1, 2, 3, 4 has a network address. Preferably, the network address may be an IP address plus a port number, where the IP address may specifically be the IP address of the server 20 and the port number is the one the server 20 assigns to the physical disk.
Furthermore, the front-end devices 11, 12, 13 may specifically be VMs (Virtual Machines). In the embodiments of the present invention, the numbers of front-end devices and physical disks are plural and may be chosen according to actual needs; they are not limited to the numbers shown in FIG. 2.
Refer now to FIG. 3, a flowchart of the method for sharing physical disks according to an embodiment of the present invention. As shown in FIG. 3, the method includes the following steps:
Step 101: Organize the volumes and snapshots to be allocated to front-end devices into multiple tree structures, and identify any volume or snapshot in the tree structures by a three-element array comprising a tree identification code, a branch identification code, and a node identification code.
Step 102: Stripe each volume or snapshot into data blocks of a predetermined size, and identify any data block in any volume or snapshot by a four-element array comprising a tree identification code, a branch identification code, a node identification code, and a data block identification code.
Step 103: Perform a combination operation on the four-element array of each data block to obtain a unique identification code in one-to-one correspondence with the four-element array.
Step 104: Perform a distributed hash operation on the unique identification code to obtain the route identification code of the data block, and map route identification codes one-to-one to the network addresses of the physical disks to form a routing relationship table and store it.
For ease of understanding, refer to FIG. 4 and FIG. 5 together: FIG. 4 is a schematic diagram of the tree structure of volumes and snapshots according to an embodiment of the present invention, and FIG. 5 is a schematic diagram of the format of data blocks in volumes and snapshots according to an embodiment of the present invention.
As shown in FIG. 4, in step 101 the nodes of a tree structure are identified by a triple (tree, branch, node), where tree denotes the tree number, branch the branch, and node the node. For example, in tree 1, tree id = 1, so node a is (1,1,0), node b is (1,1,1), node c is (1,2,0), node d is (1,1,3), node e is (1,3,0), node f is (1,2,1), and node g is (1,3,1).
In tree 2, tree id = 2, so node a' is (2,1,0), node b' is (2,1,1), node c' is (2,2,0), node d' is (2,1,3), node e' is (2,3,0), node f' is (2,2,1), and node g' is (2,3,1).
After the tree structures are identified by triples, the leaf nodes of a tree can serve as volume nodes, the non-leaf nodes as snapshot nodes, and a node that spawns a branch as a new volume created from a snapshot. On a single branch of a tree, the node ids of snapshots and volumes differ and increase monotonically.
Referring to FIG. 5, as described in step 102, the volumes and snapshots can be further striped, with the unit of each data block preferably being 1 MB. In other embodiments, a person of ordinary skill in the art may set the unit of each data block to other values.
Each data block is numbered by block no (block sequence number); for example, the data blocks are numbered 0, 1, 2, 3, 4, ... in turn (how many numbers are available depends on the data length of block no).
In step 102, each 1 MB data block produced by striping a volume or snapshot is encoded to generate a four-element array, whose encoding format is shown in Table 1:
tree id | block no | branch id | node id
Table 1
With this encoding, any 1 MB data block of a volume or snapshot can be identified. For example, for node a in FIG. 4, the code of the first data block of node a is (1, 1, 1, 0), and the code of the nth data block is (1, n, 1, 0).
Illustratively, the data lengths of tree id, block no, branch id, and node id can each be set to 4 bits. In step 103, the four-element array can be combined; for example, for the first data block (1, 1, 1, 0) of node a, the combination operation is specifically: convert each element of the four-element array into a binary number, express it as the 4-bit binary tuple (0001, 0001, 0001, 0000), and concatenate the elements in order; the resulting unique identification code is 0001000100010000.
Because the above combination operation is reversible, the four-element array and the unique identification code can be derived from each other.
In step 104, the distributed hash operation can be illustrated by the following example:
For the unique identification code 0001000100010000 of the first data block (1, 1, 1, 0) of node a, convert the unique identification code 0001000100010000 to a decimal number:
2^4 + 2^8 + 2^12 = 16 + 256 + 4096 = 4368
Further, a modulo operation can be performed on 4368, such as 4368 mod n, where n is the number of physical disks.
Here, by way of example, n = 4 may be defined, i.e., there are 4 physical disks, so 4368 mod 4 = 0.
Therefore the route identification code of the first data block (1, 1, 1, 0) of node a is 0.
If there are 4 physical disks, the route identification code is one of 0, 1, 2, and 3. Thus, in the embodiments of the present invention, any four-element array corresponds to one route identification code through the distributed hash operation.
The server 20 can map route identification codes one-to-one to the network addresses of the physical disks to form a routing relationship table and store it. For example, the routing relationship table may be as shown in Table 2:
Route identification code | Network address
0 | 192.168.1.1:1000
1 | 192.168.1.1:1001
2 | 192.168.1.1:1002
3 | 192.168.1.1:1003
Table 2
Here, network address 192.168.1.1:1000 may point to physical disk 1, 192.168.1.1:1001 to physical disk 2, 192.168.1.1:1002 to physical disk 3, and 192.168.1.1:1003 to physical disk 4.
Refer also to FIG. 6, a schematic diagram of data conversion in the method for sharing physical disks according to an embodiment of the present invention. As shown in FIG. 6, when there are 4 physical disks, taking multiple unique identification codes modulo 4 yields the four route identification codes 0, 1, 2, and 3, so the unique identification codes fall into four groups, namely group 301, group 302, group 303, and group 304; each group corresponds to one route identification code and is thereby associated, through the routing relationship table 30, with one physical disk.
It is worth noting that when there are n physical disks, the route identification codes are 0, 1, 2, ..., n-1; the unique identification codes then fall into n groups and are associated with the n disks through the routing relationship table.
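As a sketch of this grouping (illustrative Python, not from the patent), the unique identification codes partition into n groups keyed by route identification code:

```python
from collections import defaultdict

def group_by_route(unique_ids, n_disks: int):
    """FIG. 6 in miniature: unique identification codes fall into n groups,
    one per route identification code (here the mod-n distributed hash)."""
    groups = defaultdict(list)
    for uid in unique_ids:
        groups[uid % n_disks].append(uid)
    return groups

# With 4 disks, the codes split into the four groups with route ids 0..3.
groups = group_by_route([4368, 4369, 4370, 4371, 4401], n_disks=4)
assert sorted(groups) == [0, 1, 2, 3] and 4401 in groups[1]
```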
Therefore, in this embodiment, the data blocks in volumes and snapshots are identified by four-element arrays, and the data blocks of each volume and snapshot are associated with different physical disks according to those arrays. When a front-end device accesses different data blocks in the same volume or snapshot, it actually accesses multiple different physical disks rather than a single one, so the technical solution disclosed in this embodiment supports concurrent access, and reliability is also greatly improved.
The specific ways in which a front-end device mounts a volume or snapshot locally and performs write and read operations are described in detail below.
Refer first to FIG. 7, another flowchart of the method for sharing physical disks according to an embodiment of the present invention. FIG. 7 describes the process by which a front-end device mounts a volume or snapshot locally; as shown in FIG. 7, the method further includes the following steps:
Step 104: Acquire a volume load request issued by the front-end device.
Step 105: Select a volume or snapshot in the tree structures in response to the volume load request.
Step 106: Send the unique identification code of each data block on the selected volume or snapshot to the front-end device, so that the front-end device can load the selected volume or snapshot locally.
Steps 104 to 106 above can all be performed by the server. When a front-end device needs to load a volume or snapshot locally, it issues a volume load request to the server; the server selects a volume or snapshot in the tree structures in response and sends the unique identification code of each data block on the selected volume or snapshot to the front-end device. The front-end device thereby obtains multiple unique identification codes and accesses the physical disks according to them in order to read and write.
Refer now to FIG. 8, another flowchart of the method for sharing physical disks according to an embodiment of the present invention. This flow mainly describes how the server writes to the physical disks when a front-end device writes to a locally loaded volume. As shown in FIG. 8, the method includes the following steps:
Step 106: Acquire a write request sent by the front-end device. The write request is generated when the front-end device performs a write operation on a data block in a locally loaded volume, and it includes the unique identification code of the data block to be written and the data to be written. For example, if the front-end device has mounted node g (1, 3, 1) of tree 1 as a local volume and writes to the first data block (1, 1, 3, 1) of that volume, the write request includes the unique identification code 0001000100110001 of the first data block (1, 1, 3, 1) of node g and the data to be written.
Step 107: Use the unique identification code as the key and the data to be written as the value to form a key-value pair.
Step 108: Perform a distributed hash operation on the unique identification code to obtain the corresponding route identification code. In this step, the distributed hash operation on the unique identification code 0001000100110001 first obtains its corresponding decimal number:
2^0 + 2^4 + 2^5 + 2^8 + 2^12 = 1 + 16 + 32 + 256 + 4096 = 4401
A modulo operation is then performed on 4401, such as 4401 mod 4, where 4 is the number of physical disks, giving route identification code 1.
Step 109: Look up the network address of the corresponding physical disk in the routing relationship table according to the route identification code. Specifically, referring to Table 2, the network address corresponding to route identification code 1 is 192.168.1.1:1001, which points to physical disk 2, so physical disk 2 can be located and written to.
Step 110: Send the key-value pair to the physical disk at that network address, so that the unique identification code in the pair is written to an empty slot in the disk's metadata area and the data to be written in the pair is written to the data area corresponding to that slot.
The above steps can be performed by the server.
For ease of explanation, refer to FIG. 9, a schematic diagram of the data structure of a physical disk according to an embodiment of the present invention. As shown in FIG. 9, the physical disk 70 includes a metadata area 701 and a data area 702; the metadata area 701 stores metadata, each item with a fixed data length, while the data area 702 stores data blocks, each also with a fixed data length, the block length being larger than the metadata length. In a preferred embodiment of the present invention, the data area of the physical disk 70 is also striped into data blocks of the predetermined size, which matches the data length of the blocks produced by striping volumes and snapshots; for example, both can be set to 1 MB.
Moreover, in the physical disk 70 there is a mapping between the metadata area 701 and the data area 702; as the arrows in FIG. 9 show, knowing a metadata slot gives access to the data area corresponding to it: the K1 area corresponds to the V1 area, K2 to V2, K3 to V3, K4 to V4, and K5 to V5.
Thus in step 110, after the key-value pair is sent to the physical disk at the network address, only the metadata area of the disk needs to be searched: once an empty slot is found, the unique identification code is written into it, and the data to be written is written, according to the mapping between the metadata area and the data area, into the data area corresponding to that empty slot, completing the data write operation.
以下请参见图10,图10是根据本发明实施例的物理磁盘的共享方法的另一流程图。本流程主要用于说明前端设备对加载到本地的卷或快照进行读取操作时,服务器对物理磁盘进行读取操作的具体方法,如图8所示,该方法包括以下步骤:
步骤206:获取前端设备发送的读取请求。其中读取请求是前端设备对加载到本地的卷或快照中的数据块进行读取操作时产生的,读取请求包括所要读取的卷或快照中的数据块对应的唯一识别码。举例而言,若前端设备挂载了树1的g节点(1,3,1)作为本地的卷,并对该卷的第一个数据块(1,1,3,1)进行读取操作,故此时读取请求中包括g节点第一个数据块(1,1,3,1)的唯一识别码0001000100110001。
步骤207:将唯一识别码作为键值关系中的键。
步骤208:对唯一识别码进行分布式哈希运算,获取对应的路由识别码。在本步骤中,针对唯一识别码0001000100110001进行分布式哈希运算,首先获取其对应的十进制数:
20+24+25+28+212=1+16+32+256+4096=4401
进一步对4401作取模运算,如4401mod 4,其中4为物理磁盘的个数,得到路由识别码为1。
步骤209:根据路由识别码查找路由关系表对应的物理磁盘的网络地址。具体地,可参见表2,由于路由识别码1对应的网络地址是192.168.1.1:1001,而该网络地址指向物理磁盘2,故可定位到物理磁盘2,对物理磁盘2进行读取。
步骤210:查找网络地址对应的物理磁盘的元数据区域是否存储有唯一识别码,如果有,执行步骤212,如果没有,执行步骤211。
步骤212:读取存储有唯一识别码的元数据区域对应的数据区域的数据,并将该数据作为键值关系中的值。
步骤211:将误读提醒数据作为键值关系中的值。其中误读提醒数据用于提醒前端设备本数据块在读取之前并没有写入数据。
步骤213:发送键值关系至前端设备。
上述步骤可由服务器执行。
In step 210 there are two cases. In the first case, the metadata region of physical disk 2 stores the unique identification code; for example, after steps 106 to 110 above, the metadata region of physical disk 2 stores the unique identification code 0001000100110001. In step 210, the unique identification code 0001000100110001 is found in the metadata region of physical disk 2 and matches the unique identification code 0001000100110001 computed in step 208, so step 212 is performed: the data in the data region corresponding to the metadata region storing the unique identification code 0001000100110001 is read and used as the value in the key-value relationship. In step 213, the key-value relationship is sent to the front-end device, so that the front-end device obtains the data of the data block corresponding to the unique identification code 0001000100110001.
In the second case, the metadata region of physical disk 2 does not store the unique identification code 0001000100110001. In step 211, the misread reminder data is used as the value in the key-value relationship, and in step 213 the key-value relationship is sent to the front-end device, so that the front-end device learns that no data was written to the data block to be read before the read. The misread reminder data may be represented by a specific value, preferably 0.
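A minimal sketch of this read path (steps 206 to 213), reusing the illustrative disk model from the write sketch and representing the misread reminder data by 0, as suggested above:

```python
# Minimal sketch of the read path; the disk model is illustrative.

MISREAD_REMINDER = 0   # value returned when the block was never written

class PhysicalDisk:
    def __init__(self, n_slots: int):
        self.meta = [None] * n_slots
        self.data = [None] * n_slots

    def read(self, uid: int):
        """Steps 210-212: look the key up in the metadata region and return
        the mapped data, or the misread reminder (step 211) if absent."""
        if uid in self.meta:
            return self.data[self.meta.index(uid)]
        return MISREAD_REMINDER

disk2 = PhysicalDisk(1024)
disk2.meta[0], disk2.data[0] = 0b0001000100110001, b"..."

assert disk2.read(0b0001000100110001) == b"..."   # case 1: found
assert disk2.read(0b0001000100110010) == 0        # case 2: never written
```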
Therefore, in this embodiment, on the basis of identifying the data blocks in the volumes and snapshots by quaternary arrays and associating the data blocks of each volume and snapshot with different physical disks according to those arrays, when the front-end device reads or writes different data blocks of the same volume or snapshot (read operations apply to both volumes and snapshots, while write operations apply only to volumes), it in fact accesses multiple different physical disks rather than a single physical disk.
Moreover, in this embodiment of the present invention, when a physical disk fails, the server only needs to modify the routing relationship table, and the system can continue to operate on the remaining physical disks. For example, if physical disk 4 fails, it suffices to modify routing relationship table 2 so that routing identification code 3 is associated with the network address 192.168.1.1:1002, that is, routing identification codes 2 and 3 are both associated with physical disk 3 (this is merely an example); physical disks 1, 2 and 3 can then keep the system running normally. Therefore, when a limited number of physical disks fail, normal operation can still be guaranteed by modifying the routing relationship table.
Alternatively, when a physical disk fails, the server may modify the distributed hash algorithm so that the remaining physical disks keep working. For example, if physical disk 4 fails, the value of n in the distributed hash algorithm only needs to be changed from 4 to 3 and a new routing relationship table formed, so that physical disks 1, 2 and 3 keep the system running normally. Therefore, when a limited number of physical disks fail, normal operation can also be guaranteed by modifying the value of n in the distributed hash algorithm to match the number of physical disks that are still working.
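Both recovery options amount to a small change to the routing state; a minimal, purely illustrative sketch:

```python
# Minimal sketch of the two recovery options; addresses are illustrative.

table = {0: "192.168.1.1:1000", 1: "192.168.1.1:1001",
         2: "192.168.1.1:1002", 3: "192.168.1.1:1003"}

# Option 1: keep n = 4 and remap the failed disk's routing identification
# code onto a surviving disk (codes 2 and 3 both point to physical disk 3).
table[3] = table[2]

# Option 2: change n to the number of healthy disks and rebuild the table;
# routing identification codes are then computed as uid % 3.
healthy = ["192.168.1.1:1000", "192.168.1.1:1001", "192.168.1.1:1002"]
table = {rid: addr for rid, addr in enumerate(healthy)}
```

Note that option 2 changes the result of the modulo operation for existing unique identification codes, so it implies redistributing previously written data blocks among the remaining disks, whereas option 1 only redirects the failed disk's share.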
The technical solution disclosed in this embodiment can therefore support concurrent access and greatly improves reliability.
An embodiment of the present invention further provides a physical disk sharing apparatus, which may be arranged in the server 20 to implement the above physical disk sharing method.
Optionally, the physical disk sharing apparatus may also be arranged across multiple servers, which is not limited by the present invention.
Refer to FIG. 11, which is a schematic structural diagram of a first embodiment of the physical disk sharing apparatus of the present invention. The physical disk sharing apparatus 40 is used to share multiple physical disks with a front-end device through a network, where each physical disk has a network address. As shown in FIG. 11, the physical disk sharing apparatus 40 includes:
a first identification module 401, configured to organize multiple volumes and snapshots to be allocated to the front-end device into multiple tree structures, and to identify any volume or snapshot in the multiple tree structures by a ternary array including a tree identification code, a branch identification code and a node identification code;
a second identification module 402, configured to stripe each volume or snapshot into data blocks of a predetermined size, and to identify any data block in any volume or snapshot in the multiple tree structures by a quaternary array including the tree identification code, the branch identification code, the node identification code and a data block identification code;
a first operation module 403, configured to perform the combination operation on the quaternary array corresponding to each data block to obtain the unique identification code in one-to-one correspondence with the quaternary array;
a second operation module 404, configured to perform the distributed hash operation on the unique identification code to obtain the routing identification code corresponding to the data block;
a storage module 405, configured to map the routing identification codes to the network addresses of the physical disks in a one-to-one manner to form and store the routing relationship table.
In this embodiment, the data blocks in the volumes and snapshots are identified by quaternary arrays, and the data blocks of each volume and snapshot are associated with different physical disks according to those arrays. When the front-end device accesses different data blocks of the same volume or snapshot, it in fact accesses multiple different physical disks rather than a single physical disk, so the technical solution disclosed in this embodiment supports concurrent access and greatly improves reliability.
The following describes in detail how the front-end device mounts a volume or snapshot locally and how write and read operations are performed.
Optionally, the physical disk sharing apparatus 40 further includes: a receiving module 407, configured to obtain a volume load request sent by the front-end device; a selection module 406, configured to select a volume or snapshot in the multiple tree structures in response to the volume load request; and a sending module 409, configured to send the unique identification code corresponding to each data block of the selected volume or snapshot to the front-end device.
The front-end device loads a volume or snapshot locally in the manner described above.
Optionally, the physical disk sharing apparatus 40 further includes a key-value relationship generation module 410 and a lookup module 408. The receiving module 407 is further configured to obtain a write request sent by the front-end device, where the write request is generated when the front-end device performs a write operation on a data block in the volume and includes the unique identification code corresponding to the data block of the volume to be written and the data to be written. The key-value relationship generation module 410 is configured to use the unique identification code as the key and the data to be written as the value to form a key-value relationship. The second operation module 404 is configured to perform the distributed hash operation on the unique identification code to obtain the corresponding routing identification code. The lookup module 408 is configured to look up the network address of the corresponding physical disk in the routing relationship table according to the routing identification code. The sending module 409 is further configured to send the key-value relationship to the physical disk corresponding to the network address, so as to write the unique identification code in the key-value relationship into an empty position of the metadata region of the physical disk and write the data to be written in the key-value relationship into the data region corresponding to that empty position of the metadata region.
Therefore, when the front-end device writes to a local volume, the data to be written is in fact written into the data regions of multiple physical disks.
Optionally, the receiving module 407 is further configured to obtain a read request sent by the front-end device, where the read request is generated when the front-end device performs a read operation on a data block in the volume or snapshot and includes the unique identification code corresponding to the data block of the volume or snapshot to be read. The key-value relationship generation module 410 is further configured to use the unique identification code as the key in a key-value relationship. The second operation module 404 is configured to perform the distributed hash operation on the unique identification code to obtain the corresponding routing identification code. The lookup module 408 is further configured to look up the network address of the corresponding physical disk in the routing relationship table according to the routing identification code, and to check whether the metadata region of the physical disk corresponding to the network address stores the unique identification code. If so, the key-value relationship generation module 410 reads the data in the data region corresponding to the metadata region storing the unique identification code and uses that data as the value in the key-value relationship; if not, it uses the misread reminder data as the value in the key-value relationship, where the misread reminder data reminds the front-end device that no data was written to this data block before the read. The sending module 409 is configured to send the key-value relationship to the front-end device.
Therefore, when the front-end device reads a local volume or snapshot, it in fact reads the data regions of multiple physical disks.
Optionally, in the tree structures, the volumes are arranged at leaf nodes of the tree structures and the snapshots are arranged at non-leaf nodes.
Optionally, the data regions of the physical disks are also striped into data blocks of the predetermined size.
Optionally, the size of the predetermined-size data blocks is 1M.
Optionally, the network address is a combination of an IP address and a port.
Refer to FIG. 12, which is a schematic structural diagram of a second embodiment of the physical disk sharing apparatus of the present invention. The physical disk sharing apparatus 40 is used to share multiple physical disks with a front-end device through a network, where each physical disk has a network address. The physical disk sharing apparatus 40 includes at least one processor 502, at least one network interface 503, a memory 501 and at least one communication bus 504. The memory 501 is configured to store program instructions, and the processor 502 is configured to:
execute the program instructions to organize multiple volumes and snapshots to be allocated to the front-end device into multiple tree structures, and to identify any volume or snapshot in the multiple tree structures by a ternary array including a tree identification code, a branch identification code and a node identification code;
execute the program instructions to stripe each volume or snapshot into data blocks of a predetermined size, and to identify any data block in any volume or snapshot in the multiple tree structures by a quaternary array including the tree identification code, the branch identification code, the node identification code and a data block identification code;
execute the program instructions to perform the combination operation on the quaternary array corresponding to each data block to obtain the unique identification code in one-to-one correspondence with the quaternary array;
execute the program instructions to perform the distributed hash operation on the unique identification code to obtain the routing identification code corresponding to the data block;
execute the program instructions to map the routing identification codes to the network addresses of the physical disks in a one-to-one manner to form and store the routing relationship table.
In this embodiment, the data blocks in the volumes and snapshots are identified by quaternary arrays, and the data blocks of each volume and snapshot are associated with different physical disks according to those arrays. When the front-end device accesses different data blocks of the same volume or snapshot, it in fact accesses multiple different physical disks rather than a single physical disk, so the technical solution disclosed in this embodiment supports concurrent access and greatly improves reliability.
The following describes in detail how the front-end device mounts a volume or snapshot locally and how write and read operations are performed.
Optionally, the network interface 503 is configured to obtain a volume load request sent by the front-end device; the processor 502 is configured to execute the program instructions to select a volume or snapshot in the multiple tree structures in response to the volume load request; and the network interface 503 is further configured to send the unique identification code corresponding to each data block of the selected volume or snapshot to the front-end device.
The front-end device loads a volume or snapshot locally in the manner described above.
Optionally, the network interface 503 is further configured to obtain a write request sent by the front-end device, where the write request is generated when the front-end device performs a write operation on a data block in the volume and includes the unique identification code corresponding to the data block of the volume to be written and the data to be written. The processor 502 executes the program instructions to use the unique identification code as the key and the data to be written as the value to form a key-value relationship, to perform the distributed hash operation on the unique identification code to obtain the corresponding routing identification code, and to look up the network address of the corresponding physical disk in the routing relationship table according to the routing identification code. The network interface 503 is further configured to send the key-value relationship to the physical disk corresponding to the network address, so as to write the unique identification code in the key-value relationship into an empty position of the metadata region of the physical disk and write the data to be written in the key-value relationship into the data region corresponding to that empty position of the metadata region.
Therefore, when the front-end device writes to a local volume, the data to be written is in fact written into the data regions of multiple physical disks.
Optionally, the network interface 503 is configured to obtain a read request sent by the front-end device, where the read request is generated when the front-end device performs a read operation on a data block in the volume or snapshot and includes the unique identification code corresponding to the data block of the volume or snapshot to be read. The processor 502 executes the program instructions to use the unique identification code as the key in a key-value relationship, to perform the distributed hash operation on the unique identification code to obtain the corresponding routing identification code, to look up the network address of the corresponding physical disk in the routing relationship table according to the routing identification code, and to check whether the metadata region of the physical disk corresponding to the network address stores the unique identification code. If so, the data in the data region corresponding to the metadata region storing the unique identification code is read and used as the value in the key-value relationship; if not, the misread reminder data is used as the value in the key-value relationship, where the misread reminder data reminds the front-end device that no data was written to this data block before the read. The network interface 503 is configured to send the key-value relationship to the front-end device.
Therefore, when the front-end device reads a local volume or snapshot, it in fact reads the data regions of multiple physical disks.
Optionally, in the tree structures, the volumes are arranged at leaf nodes of the tree structures and the snapshots are arranged at non-leaf nodes.
Optionally, the data regions of the physical disks are also striped into data blocks of the predetermined size.
Optionally, the size of the predetermined-size data blocks is 1M.
Optionally, the network address is a combination of an IP address and a port.
As described above, in the embodiments of the present invention, each data block of the volumes and snapshots is identified with a specific data structure through a quaternary array, a corresponding routing identification code is generated, and the routing identification codes are mapped one-to-one to the network addresses of the physical disks. The volumes and snapshots of the front-end device are thereby mapped to multiple physical disks, each volume and snapshot corresponding to multiple different physical disks. Accessing the same volume or snapshot in fact requires accessing multiple different physical disks rather than a single physical disk, so concurrent access is supported and reliability is greatly improved.
The above descriptions are merely embodiments of the present invention and are not intended to limit the patent scope of the present invention. Any equivalent structural or equivalent process transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, shall likewise fall within the patent protection scope of the present invention.

Claims (20)

  1. A method for sharing physical disks, wherein the method is used to share multiple physical disks with a front-end device through a network, each of the physical disks having a network address, and the method comprises:
    organizing multiple volumes and snapshots to be allocated to the front-end device into multiple tree structures, and identifying any volume or snapshot in the multiple tree structures by a ternary array comprising a tree identification code, a branch identification code and a node identification code;
    striping each volume or snapshot into data blocks of a predetermined size, and identifying any data block in any volume or snapshot in the multiple tree structures by a quaternary array comprising the tree identification code, the branch identification code, the node identification code and a data block identification code;
    performing a combination operation on the quaternary array corresponding to each data block to obtain a unique identification code in one-to-one correspondence with the quaternary array;
    performing a distributed hash operation on the unique identification code to obtain a routing identification code corresponding to the data block, and mapping the routing identification codes to the network addresses of the physical disks in a one-to-one manner to form and store a routing relationship table.
  2. The method according to claim 1, wherein the method further comprises:
    obtaining a volume load request sent by the front-end device;
    selecting a volume or snapshot in the multiple tree structures in response to the volume load request;
    sending the unique identification code corresponding to each data block of the selected volume or snapshot to the front-end device.
  3. The method according to claim 2, wherein the method further comprises:
    obtaining a write request sent by the front-end device, wherein the write request is generated when the front-end device performs a write operation on a data block in the volume, and the write request comprises the unique identification code corresponding to the data block of the volume to be written and data to be written;
    using the unique identification code as a key and the data to be written as a value to form a key-value relationship.
  4. The method according to claim 3, wherein the method further comprises:
    performing the distributed hash operation on the unique identification code to obtain the corresponding routing identification code;
    looking up the network address of the corresponding physical disk in the routing relationship table according to the routing identification code;
    sending the key-value relationship to the physical disk corresponding to the network address, so as to write the unique identification code in the key-value relationship into an empty position of a metadata region of the physical disk and to write the data to be written in the key-value relationship into a data region corresponding to the empty position of the metadata region.
  5. The method according to claim 2, wherein the method further comprises:
    obtaining a read request sent by the front-end device, wherein the read request is generated when the front-end device performs a read operation on a data block in the volume or snapshot, and the read request comprises the unique identification code corresponding to the data block of the volume or snapshot to be read;
    using the unique identification code as a key in a key-value relationship.
  6. The method according to claim 5, wherein the method further comprises:
    performing the distributed hash operation on the unique identification code to obtain the corresponding routing identification code;
    looking up, in the routing relationship table, the network address of the physical disk corresponding to the routing identification code;
    checking whether the metadata region of the physical disk corresponding to the network address stores the unique identification code; if so, reading the data in the data region corresponding to the metadata region storing the unique identification code and using the data as the value in the key-value relationship; if not, using misread reminder data as the value in the key-value relationship, wherein the misread reminder data is used to remind the front-end device that no data was written to this data block before the read;
    sending the key-value relationship to the front-end device.
  7. The method according to any one of claims 1 to 6, wherein in the tree structures, the volumes are arranged at leaf nodes of the tree structures and the snapshots are arranged at non-leaf nodes of the tree structures.
  8. The method according to any one of claims 1 to 6, wherein data regions of the physical disks are also striped into data blocks of the predetermined size.
  9. The method according to claim 8, wherein the size of the data blocks of the predetermined size is 1M.
  10. The method according to any one of claims 1 to 6, wherein the network address is a combination of an IP address and a port.
  11. An apparatus for sharing physical disks, wherein the apparatus is used to share multiple physical disks with a front-end device through a network, each of the physical disks having a network address, and the apparatus comprises:
    a first identification module, configured to organize multiple volumes and snapshots to be allocated to the front-end device into multiple tree structures, and to identify any volume or snapshot in the multiple tree structures by a ternary array comprising a tree identification code, a branch identification code and a node identification code;
    a second identification module, configured to stripe each volume or snapshot into data blocks of a predetermined size, and to identify any data block in any volume or snapshot in the multiple tree structures by a quaternary array comprising the tree identification code, the branch identification code, the node identification code and a data block identification code;
    a first operation module, configured to perform a combination operation on the quaternary array corresponding to each data block to obtain a unique identification code in one-to-one correspondence with the quaternary array;
    a second operation module, configured to perform a distributed hash operation on the unique identification code to obtain a routing identification code corresponding to the data block;
    a storage module, configured to map the routing identification codes to the network addresses of the physical disks in a one-to-one manner to form and store a routing relationship table.
  12. The apparatus according to claim 11, wherein the apparatus further comprises:
    a receiving module, configured to obtain a volume load request sent by the front-end device;
    a selection module, configured to select a volume or snapshot in the multiple tree structures in response to the volume load request;
    a sending module, configured to send the unique identification code corresponding to each data block of the selected volume or snapshot to the front-end device.
  13. The apparatus according to claim 12, wherein the apparatus further comprises a key-value relationship generation module, wherein:
    the receiving module is further configured to obtain a write request sent by the front-end device, wherein the write request is generated when the front-end device performs a write operation on a data block in the volume, and the write request comprises the unique identification code corresponding to the data block of the volume to be written and data to be written;
    the key-value relationship generation module is configured to use the unique identification code as a key and the data to be written as a value to form a key-value relationship.
  14. The apparatus according to claim 13, wherein the apparatus further comprises a lookup module, wherein:
    the second operation module is configured to perform the distributed hash operation on the unique identification code to obtain the corresponding routing identification code;
    the lookup module is configured to look up the network address of the corresponding physical disk in the routing relationship table according to the routing identification code;
    the sending module is further configured to send the key-value relationship to the physical disk corresponding to the network address, so as to write the unique identification code in the key-value relationship into an empty position of a metadata region of the physical disk and to write the data to be written in the key-value relationship into a data region corresponding to the empty position of the metadata region.
  15. The apparatus according to claim 12, wherein the apparatus further comprises a key-value relationship generation module, wherein:
    the receiving module is configured to obtain a read request sent by the front-end device, wherein the read request is generated when the front-end device performs a read operation on a data block in the volume or snapshot, and the read request comprises the unique identification code corresponding to the data block of the volume or snapshot to be read;
    the key-value relationship generation module is configured to use the unique identification code as a key in a key-value relationship.
  16. The apparatus according to claim 15, wherein the apparatus further comprises a lookup module, wherein:
    the second operation module is configured to perform the distributed hash operation on the unique identification code to obtain the corresponding routing identification code;
    the lookup module is configured to look up, in the routing relationship table, the network address of the physical disk corresponding to the routing identification code, and to check whether the metadata region of the physical disk corresponding to the network address stores the unique identification code; if so, the key-value relationship generation module is configured to read the data in the data region corresponding to the metadata region storing the unique identification code and use the data as the value in the key-value relationship; if not, the key-value relationship generation module is configured to use misread reminder data as the value in the key-value relationship, wherein the misread reminder data is used to remind the front-end device that no data was written to this data block before the read;
    the sending module is configured to send the key-value relationship to the front-end device.
  17. The apparatus according to any one of claims 11 to 16, wherein in the tree structures, the volumes are arranged at leaf nodes of the tree structures and the snapshots are arranged at non-leaf nodes of the tree structures.
  18. The apparatus according to any one of claims 11 to 16, wherein data regions of the physical disks are also striped into data blocks of the predetermined size.
  19. The apparatus according to claim 18, wherein the size of the data blocks of the predetermined size is 1M.
  20. The apparatus according to any one of claims 11 to 16, wherein the network address is a combination of an IP address and a port.
PCT/CN2016/087701 2015-08-05 2016-06-29 Method and apparatus for sharing a physical disk WO2017020668A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510473671.4 2015-08-05
CN201510473671.4A CN105138281B (zh) 2015-08-05 Method and apparatus for sharing a physical disk

Publications (1)

Publication Number Publication Date
WO2017020668A1 (zh)

Family

ID=54723642

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/087701 WO2017020668A1 (zh) 2015-08-05 2016-06-29 一种物理磁盘的共享方法及装置

Country Status (2)

Country Link
CN (1) CN105138281B (zh)
WO (1) WO2017020668A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112667172A (zh) * 2021-01-19 2021-04-16 南方电网科学研究院有限责任公司 Disk operation method, apparatus, system, storage medium and computing device
CN114880256A (zh) * 2017-04-14 2022-08-09 华为技术有限公司 Data processing method, storage system and switching device

Families Citing this family (5)

Publication number Priority date Publication date Assignee Title
CN105138281B (zh) 2015-08-05 2018-12-07 华为技术有限公司 Method and apparatus for sharing a physical disk
CN105677252B (zh) 2016-01-06 2019-06-07 华为技术有限公司 Data reading method, data processing method and related storage device
CN107122141A (zh) 2017-05-08 2017-09-01 郑州云海信息技术有限公司 Disk traffic control method and apparatus
CN107291400B (zh) 2017-06-30 2020-07-28 苏州浪潮智能科技有限公司 Snapshot volume relationship simulation method and apparatus
CN109981768B (zh) 2019-03-21 2021-12-07 上海霄云信息科技有限公司 IO multipath planning method and device in a distributed network storage system

Citations (5)

Publication number Priority date Publication date Assignee Title
CN103180852A (zh) * 2012-08-09 2013-06-26 华为技术有限公司 Distributed data processing method and apparatus
CN103561057A (zh) * 2013-10-15 2014-02-05 深圳清华大学研究院 Data storage method based on distributed hash table and erasure codes
CN103608784A (zh) * 2013-06-26 2014-02-26 华为技术有限公司 Network volume creation method, data storage method, storage device and storage system
CN104049690A (zh) * 2014-06-10 2014-09-17 浪潮电子信息产业股份有限公司 Design method for a key application host to handle a high-concurrency service model
CN105138281A (zh) * 2015-08-05 2015-12-09 华为技术有限公司 Method and apparatus for sharing a physical disk

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US6912643B2 (en) * 2002-08-19 2005-06-28 Aristos Logic Corporation Method of flexibly mapping a number of storage elements into a virtual storage element


Also Published As

Publication number Publication date
CN105138281B (zh) 2018-12-07
CN105138281A (zh) 2015-12-09


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 16832156; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 16832156; Country of ref document: EP; Kind code of ref document: A1)