WO2014101896A1 - Method and system for sharing storage resources - Google Patents

Method and system for sharing storage resources

Info

Publication number
WO2014101896A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage
read
data block
data
partition
Prior art date
Application number
PCT/CN2013/091253
Other languages
English (en)
French (fr)
Inventor
顾炯炯
闵小勇
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to JP2015549981A priority Critical patent/JP6019513B2/ja
Priority to EP13869766.9A priority patent/EP2930910B1/en
Priority to EP16194969.8A priority patent/EP3188449B1/en
Priority to ES13869766.9T priority patent/ES2624412T3/es
Priority to CN201380002608.1A priority patent/CN103797770B/zh
Publication of WO2014101896A1 publication Critical patent/WO2014101896A1/zh
Priority to US14/754,378 priority patent/US9733848B2/en
Priority to US15/675,226 priority patent/US10082972B2/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • G06F3/0619Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604Improving or facilitating administration, e.g. storage management
    • G06F3/0607Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629Configuration or reconfiguration of storage systems
    • G06F3/0632Configuration or reconfiguration of storage systems by initialisation or re-initialisation of storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0662Virtualisation aspects
    • G06F3/0665Virtualisation aspects at area level, e.g. provisioning of virtual or logical volumes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0689Disk arrays, e.g. RAID, JBOD
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1097Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices

Definitions

  • The present invention relates to the field of communications technologies, and in particular, to a method and system for sharing storage resources.
  • Background art:
  • Server cluster systems integrate computing resources, storage resources, and network resources and, using virtualization and other technologies, provide them to users over the network in the form of applications such as virtual machines (VMs), computing power, and storage capacity leasing.
  • The storage resources provided by a server cluster system usually come from different devices, so their sources are diverse: they include the server nodes' own storage resources as well as independently deployed storage resources, such as a storage area network (SAN) or a storage server.
  • Embodiments of the present invention provide a method and system for sharing storage resources, so as to integrate and share heterogeneous storage resources, thereby improving storage resource utilization.
  • An embodiment of the present invention provides a method for sharing storage resources, applied to a server cluster system, where the server cluster system includes a server node and a network storage node, the server node includes a hard disk, and the network storage node includes a storage array. The method includes: dividing the storage resources of the hard disk and the storage array into multiple storage partitions, where the multiple storage partitions form a shared storage resource pool; allocating a read/write control module to each storage partition; generating global partition information, where the global partition information records the correspondence between each storage partition in the shared storage resource pool and its read/write control module; receiving a storage request message and determining the storage partition corresponding to the storage request message; determining, according to the global partition information, the read/write control module corresponding to that storage partition; and sending the storage request message to the determined read/write control module, so that the read/write control module performs the operation requested by the storage request message.
  • Optionally, determining the storage partition corresponding to the storage request message includes: determining a user volume ID of the data to be operated on by the storage request message and a logical block address (LBA) of at least one data block of that data; and determining the storage partition corresponding to the at least one data block according to the user volume ID and the LBA of the at least one data block.
  • Optionally, the method further includes: establishing metadata for each storage partition of the shared storage resource pool, where the metadata of each storage partition records the correspondence between the storage partition ID and the IDs of the data blocks assigned to that storage partition. Determining the storage partition corresponding to the at least one data block according to the user volume ID and the LBA then includes: determining the ID of the at least one data block according to the user volume ID and the LBA of the at least one data block, querying the metadata of each storage partition, and determining the ID of the storage partition corresponding to the at least one data block.
  • Optionally, determining the storage partition corresponding to the at least one data block according to the user volume ID and the LBA of the at least one data block includes: forming a key for each data block from the user volume ID and the LBA of that data block, calculating the value corresponding to each key, and determining the storage partition corresponding to each data block according to that value.
  • Optionally, receiving the storage request message includes: receiving a command to create a user volume, where the command indicates the size of the user volume. Determining the user volume ID of the data to be operated on and the LBA of the at least one data block then includes: allocating a user volume ID for the user volume, determining the size of the initial storage resource allocated to the user volume according to the size of the user volume, and determining the LBA of at least one data block according to the size of the initial storage resource; the storage partition corresponding to the at least one data block is determined according to the user volume ID and the LBA of the at least one data block.
  • Optionally, receiving the storage request message includes: receiving a write data operation request; determining, according to the file name carried in the write data operation request, the user volume ID corresponding to the current write operation; dividing the data to be written into multiple data blocks to be written; and allocating an LBA to each data block to be written.
  • Optionally, receiving the storage request message includes: receiving a read data operation request, where the read data operation request carries a file name and an offset of the data to be read; determining, according to the file name carried in the read data operation request, the user volume ID corresponding to the current read operation; and determining, according to the offset information of the data to be read, the LBAs of multiple data blocks to be read. The storage partition corresponding to each data block to be read is determined according to the user volume ID corresponding to the current read operation and the LBA of each data block to be read; the read/write control modules corresponding to those storage partitions are determined according to the global partition information; multiple data block read commands are generated, where each data block read command corresponds to one data block to be read and carries the ID of that data block; and each data block read command is sent to the read/write control module corresponding to the data block to be read, so that the module reads the data block to be read.
  • An embodiment of the present invention further provides a server cluster system, where the server cluster system includes a server node and a network storage node; the server node includes a hard disk, and the network storage node includes a storage array. A distributed storage controller runs on the server node, and the distributed storage controller includes: a metadata controller, configured to divide the storage resources of the hard disk and the storage array into multiple storage partitions, where the multiple storage partitions form a shared storage resource pool; to allocate a read/write control module to each storage partition; to generate global partition information, where the global partition information records the correspondence between each storage partition in the shared storage resource pool and its read/write control module; and to deliver the global partition information to a virtual block service module;
  • the virtual block service module, configured to receive a storage request message from the service layer, determine the storage partition corresponding to the storage request message, determine, according to the global partition information, the read/write control module corresponding to that storage partition, and send the storage request message to the determined read/write control module;
  • the read/write control module, configured to perform, on the hard disk or the network storage node, the operation requested by the storage request message.
  • Optionally, the read/write control module includes an object storage agent and a network storage agent.
  • The metadata controller is specifically configured to allocate the object storage agent as the read/write control module to the storage partitions formed from the local hard disk, and to allocate the network storage agent as the read/write control module to the storage partitions formed from the storage array.
  • The object storage agent is configured to receive a storage request message, determine the physical address corresponding to the storage request message, and perform, on the hard disk according to the physical address, the operation requested by the storage request message.
  • The network storage agent is configured to receive a storage request message, determine the logical address on the network storage node corresponding to the storage request message, and perform, on the storage array according to the logical address, the operation requested by the storage request message.
  • Optionally, the virtual block service module is specifically configured to determine the user volume ID of the data to be operated on by the storage request message and the logical block address (LBA) of at least one data block of that data, and to determine the storage partition corresponding to the at least one data block according to the user volume ID and the LBA of the at least one data block.
  • Optionally, the virtual block service module is specifically configured to establish metadata for each storage partition of the shared storage resource pool, where the metadata of each storage partition records the correspondence between the storage partition ID and the IDs of the data blocks assigned to that storage partition; to determine the ID of the at least one data block according to the user volume ID and the LBA of the at least one data block; to query the metadata of each storage partition; and to determine the ID of the storage partition corresponding to the at least one data block.
  • Optionally, the virtual block service module is specifically configured to combine the user volume ID and the LBA of each data block into a key for that data block, calculate the value corresponding to each key, and determine the storage partition corresponding to each data block according to that value.
  • Optionally, the virtual block service module is specifically configured to receive a command to create a user volume, where the command indicates the size of the user volume; to allocate a user volume ID for the user volume; to determine the size of the initial storage resource allocated to the user volume according to the size of the user volume; to determine the LBA of at least one data block according to the size of the initial storage resource; and to determine the storage partition corresponding to the at least one data block according to the user volume ID and the LBA of the at least one data block.
  • Optionally, the virtual block service module is specifically configured to receive a write data operation request; determine, according to the file name carried in the write data operation request, the user volume ID corresponding to the current write operation; divide the data to be written into multiple data blocks to be written and allocate an LBA to each of them; determine, according to the user volume ID corresponding to the current write operation and the LBA of each data block to be written, the storage partition corresponding to each data block to be written; determine, according to the global partition information, the read/write control modules corresponding to those storage partitions; generate multiple data block write commands, where each data block write command corresponds to one data block to be written and carries the data block to be written and its ID; and send each data block write command to the read/write control module corresponding to that data block.
  • Optionally, the virtual block service module is specifically configured to receive a read data operation request, where the read data operation request carries a file name and an offset of the data to be read; determine, according to the file name carried in the read data operation request, the user volume ID corresponding to the current read operation; determine the LBAs of multiple data blocks to be read according to the offset information of the data to be read; determine, according to the user volume ID corresponding to the current read operation and the LBA of each data block to be read, the storage partition corresponding to each data block to be read; determine, according to the global partition information, the read/write control modules corresponding to those storage partitions; generate multiple data block read commands, where each data block read command corresponds to one data block to be read and carries the ID of that data block; and send each data block read command to the read/write control module corresponding to that data block.
  • Optionally, the metadata controller is further configured to determine the respective deployment of the object storage agent and the network storage agent on the server nodes, generate view information of the read/write control modules according to the determined deployment, where the view information of the read/write control modules indicates the server node on which each read/write control module is deployed, and send the view information of the read/write control modules to the virtual block service module; the virtual block service module is configured to determine the routing information of the read/write control module according to the view information of the read/write control modules, and send the storage request message to the determined read/write control module.
  • Optionally, the metadata controller is specifically configured to deploy the object storage agent on the server nodes in the server cluster system that have hard disk resources, and to deploy the network storage agent on server nodes with a small load in the server cluster system.
  • Optionally, the metadata controller is further configured to collect the available storage resources of the hard disks of the server nodes and the available storage resources of the storage arrays of the network storage nodes, and to divide the available storage resources of the hard disks and the storage arrays into the multiple storage partitions.
  • An embodiment of the present invention further provides a computer.
  • An embodiment of the present invention further provides a computer storage medium.
  • In the embodiments of the present invention, the storage resources of the hard disk and the storage array are divided into multiple storage partitions that form a shared storage resource pool, a read/write control module is allocated to each storage partition, and global partition information is generated that records the correspondence between each storage partition in the shared storage resource pool and its read/write control module. When a storage request message is subsequently received, the storage partition corresponding to the storage request message can be determined, and the corresponding read/write control module can be found according to the global partition information, so that heterogeneous storage resources are integrated and shared.
  • FIG. 1 is a schematic block diagram of a server cluster system according to an embodiment of the present invention;
  • FIG. 2 is a schematic diagram of the partitioning of shared storage resources according to an embodiment of the present invention;
  • FIG. 3 is a flowchart of using a shared storage resource according to an embodiment of the present invention;
  • FIG. 4 is another flowchart of using a shared storage resource according to an embodiment of the present invention;
  • FIG. 5 is still another flowchart of using a shared storage resource according to an embodiment of the present invention;
  • FIG. 6 is another schematic block diagram of a server cluster system according to an embodiment of the present invention;
  • FIG. 7 is a composition diagram of a computer according to an embodiment of the present invention.
  • The technical solution provided by the embodiments of the present invention integrates heterogeneous storage resources by deploying a distributed storage controller on the servers, and achieves the integration and utilization of heterogeneous storage resources without the need to separately purchase heterogeneous storage convergence devices, improving the cost-effectiveness of the solution.
  • To integrate storage resources horizontally, and in particular to integrate and utilize heterogeneous storage resources, the embodiments of the present invention deploy a distributed storage controller on the servers to form a cluster-wide shared storage resource pool in which storage resources are uniformly allocated and managed. In this way, fast and simple integration of heterogeneous storage resources can be achieved, the various storage resources can be used efficiently, costs are saved, and waste of resources is avoided.
  • The heterogeneous storage resources in the embodiments of the present invention refer to two or more different types of storage devices.
  • The first type of storage device is a local hard disk carried by the server node itself, such as a solid state disk (SSD), a mechanical hard disk (HD), or a hybrid hard disk (HHD).
  • The second type of storage device is a storage device of a network storage node, such as a SAN storage device or a network attached storage (NAS) device, where the network storage node is a hardware device external to the server rather than a device carried by the server itself.
  • FIG. 1 is a composition diagram of a server cluster system provided by an embodiment of the present invention.
  • The server cluster system communicates with an application client or a storage management center through a network layer; the server cluster system is composed of server nodes and network storage nodes.
  • In FIG. 1, a SAN storage device is used as an example of the network storage node.
  • There may be one or more server nodes and one or more network storage nodes; FIG. 1 takes two SAN storage nodes as an example.
  • The physical devices of each server node include a CPU, memory, a network, hard disks, and so on; the physical devices of a network storage node include a storage array and the controller of the storage array.
  • The CPUs and memory of the server nodes provide computing resources for the applications accessing the server cluster system; these devices are collectively referred to as the computing resources of the server cluster system and form the basis of the computing layer.
  • The hard disks of the server nodes and the storage arrays of the network storage nodes form the storage resource layer and are collectively referred to as the storage resources of the server cluster system.
  • The server cluster provides computing resources to different applications for external use; for example, a WEB application or a HADOOP distributed cluster system may run on the server cluster.
  • The computing resources of the server cluster can be further abstracted into multiple virtual machines, with different applications running on each virtual machine, or with multiple virtual machines forming a virtual machine cluster that serves the same application.
  • The embodiments of the present invention do not limit the specific implementation.
  • When an application runs, its related data may be stored on the storage resources of the server cluster system, that is, on the hard disks of the server nodes or in the storage arrays of the SAN nodes, or simultaneously on both the hard disks of the server nodes and the storage arrays of the SAN nodes.
  • The server cluster system of the embodiments of the present invention further runs a distributed storage controller, which divides the storage resources provided by the hard disks of the server nodes and the storage arrays of the network storage nodes (such as SANs) into multiple storage partitions that constitute the shared storage resource pool of the server cluster system.
  • An application running on the server cluster can obtain and use distributed storage resource blocks from the shared storage resource pool, which ensures high utilization and balanced use of the storage resources and thereby improves their read and write efficiency.
  • Preferably, the distributed storage controller is implemented by software modules installed on the hardware of the servers, which avoids purchasing separate hardware devices as storage control devices and makes the solution more economical and cost-effective.
  • The distributed storage controller in the embodiments of the present invention is a general term for the storage control function modules running on the server nodes.
  • The distributed storage controller provided as a solution may include different functional modules, but in an actual deployment each server node may run different functional modules of the distributed storage controller according to its function and the deployment strategy. That is, depending on the deployment strategy of the server cluster, a server node may run all of the functional modules of the distributed storage controller or only some of them. The specific deployment modes are described in detail below.
  • The distributed storage controller is mainly used to provide a data access interface to the computing resources of the server cluster system, and to manage and perform read/write control over the shared storage resources of the server cluster system.
  • Functionally, the distributed storage controller may be divided into the following modules: a metadata controller (MDC), configured to acquire the storage resources of the local hard disks of the server nodes and the storage resources of the storage arrays of the network storage nodes, divide these storage resources into multiple storage partitions, allocate a storage partition identifier to each storage partition, and form the multiple storage partitions into a shared storage resource pool, so that applications running on the server cluster system can use the shared storage resources.
  • Specifically, the MDC may perform a health check on the hard disk resources of the server nodes and on the storage arrays of the network storage nodes, and collect the available storage resources to form the shared storage resource pool.
  • The MDC may divide the resources into storage partitions of the same size, for example, in units of 10 GB.
  • The storage resource information collected by the MDC may include: the capacity and ID of each hard disk, the ID of the server where each hard disk is located, the capacity and ID of each logical storage unit (LUN) in each storage array, and the ID of the network storage node where each LUN is located. For example: LUN 3, LUN capacity 50 GB, SAN ID 1.
  • For example, the MDC divides the storage resources of Disk1-3 and LUN1-3 into multiple storage partitions; the storage partitions may be of equal or unequal size. For example, divided in units of 10 GB, the storage resources form 30 storage partitions of 10 GB each, with partition identifiers 1-30, that is, P1-P30. The MDC forms P1-P30 into a shared storage resource pool, where P1-P15 are composed of the storage resources of the hard disks of the server nodes and P16-P30 are composed of the storage resources of the storage array of the SAN node. That is, the shared storage resource pool includes two types of storage partitions: the first type is P1-P15 and the second type is P16-P30.
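  • The following minimal sketch (in Python, with illustrative device names and sizes; not the patented implementation itself) shows how an MDC of this kind might divide the collected storage resources into fixed-size partitions that form the shared pool:

```python
# A minimal sketch of dividing collected storage resources (server-node
# disks and SAN LUNs) into fixed-size storage partitions. All names and
# sizes are illustrative assumptions matching the running example.

PARTITION_SIZE_GB = 10

def divide_into_partitions(devices):
    """devices: list of (device_id, capacity_gb, source_type) tuples."""
    partitions = []
    pid = 1
    for device_id, capacity_gb, source_type in devices:
        for _ in range(capacity_gb // PARTITION_SIZE_GB):
            partitions.append({
                "partition_id": f"P{pid}",
                "source_device": device_id,
                "source_type": source_type,  # "local_disk" or "san_lun"
                "size_gb": PARTITION_SIZE_GB,
            })
            pid += 1
    return partitions

# Three 50 GB local disks and three 50 GB LUNs yield 30 partitions:
# P1-P15 from the local disks and P16-P30 from the SAN storage array.
pool = divide_into_partitions(
    [("Disk1", 50, "local_disk"), ("Disk2", 50, "local_disk"),
     ("Disk3", 50, "local_disk"), ("LUN1", 50, "san_lun"),
     ("LUN2", 50, "san_lun"), ("LUN3", 50, "san_lun")])
assert len(pool) == 30
```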
  • The distributed storage controller further includes read/write control modules: the object storage delegate (OSD) and the SAN storage agent (SSA). The OSD performs read/write control over the storage resources of the hard disks of the server nodes, that is, it stores data to and retrieves data from the local hard disks of the server nodes (in this embodiment, for example, it performs read/write control over storage partitions P1-P15). The SSA performs read/write control over the storage resources of the storage arrays of the SAN nodes, that is, it stores data to and retrieves data from the storage arrays of the SAN nodes (in this embodiment, for example, it performs read/write control over storage partitions P16-P30).
  • The OSD and the SSA are both functional modules of the distributed storage controller. The MDC may determine how the OSD and the SSA are deployed in the server cluster system based on the deployment of the storage resources. Specifically, the MDC may run an OSD on each server node that has a local hard disk; it may run an SSA on each server node, or deploy SSAs on server nodes with a smaller load according to the load condition of each server node. For example, the MDC can uniformly calculate the computing-resource load on all server nodes and, weighting by the capacity of the storage array of each SAN storage node, generate the global SSA deployment information.
  • For example, the MDC runs OSD1 on server node 1, runs OSD2 on server node 2, and runs SSA1 on server node 2.
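  • A hedged sketch of this placement policy follows: an OSD wherever local disks exist, and SSAs placed on the least-loaded nodes, weighted by SAN array capacity. The specific weighting scheme is an assumption for illustration:

```python
# A sketch, under stated assumptions, of the deployment policy described
# above: run an OSD on every server node that has local disks, and assign
# each SAN array's SSA to the currently least-loaded node, treating array
# capacity as added load weight.

def place_agents(nodes, san_arrays):
    """nodes: {node_id: {"disks": [...], "load": float}};
    san_arrays: {array_id: capacity_gb}."""
    deployment = {"osd": {}, "ssa": {}}
    for node_id, info in nodes.items():
        if info["disks"]:                 # OSD wherever local disks exist
            deployment["osd"][node_id] = info["disks"]
    loads = {n: info["load"] for n, info in nodes.items()}
    for array_id, capacity in sorted(san_arrays.items(),
                                     key=lambda kv: -kv[1]):
        target = min(loads, key=loads.get)   # least-loaded node
        deployment["ssa"].setdefault(target, []).append(array_id)
        loads[target] += capacity            # heavier arrays add more weight
    return deployment
```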
  • After deploying the OSDs and SSAs, the MDC may record OSD view information and SSA view information. The OSD view information records which server each OSD is deployed on, to indicate the route to the OSD; the OSD view may further include the state of each OSD and which disks are managed by each OSD. The SSA view information records which server each SSA is deployed on, to indicate the route to the SSA; it may further include the state of each SSA and which SAN storage array LUNs are managed by each SSA.
  • Table 1 and Table 2 below show the OSD view information and the SSA view information, respectively (the rows are illustrative, matching the deployment example above):

Table 1 OSD view information
OSD ID | Server node   | State  | Managed disks
OSD1   | Server node 1 | Normal | DISK1, DISK2
OSD2   | Server node 2 | Normal | DISK3

Table 2 SSA view information
SSA ID | Server node   | State  | Managed LUNs
SSA1   | Server node 2 | Normal | LUN1-LUN3

  • Tables 1 and 2 describe the view information of the OSDs and the SSAs separately; those skilled in the art may also combine the two into the view information of a single read/write control module.
  • After dividing the storage partitions, the MDC may configure a corresponding read/write control module for each storage partition. The allocation process can be flexible and is determined by the MDC according to the partitioning of the storage resources and the actual running load.
  • For example: P1-P10 are correspondingly deployed on server node 1, and OSD1 running on server node 1 serves as the read/write control module of these storage partitions; P11-P20 are correspondingly deployed on server node 2, and OSD2 running on server node 2 serves as their read/write control module; P21-P30 are correspondingly deployed on server node 2, and SSA1 running on server node 2 serves as their read/write control module.
  • The MDC may also generate global partition information (the embodiments of the present invention take a global partition information table as an example). The global partition information table records the distribution of the storage partitions in the server cluster system, as shown in FIG. 2 and Table 3, including the read/write control module (OSD or SSA) corresponding to each storage partition. The global partition information table may also record information about the source storage device corresponding to each storage partition, such as a disk number or physical address information.
  • For example, the read/write control module corresponding to P1 is OSD1, the source storage unit corresponding to P1 is DISK1 in SERVER1, and the source physical address corresponding to P1 is 100-199:

Table 3 Global partition information table (excerpt)
Partition ID | Read/write control module | Source storage unit | Source physical address
P1           | OSD1                      | DISK1 in SERVER1    | 100-199
  • The distributed storage controller also includes a virtual block service (VBS) module.
  • After generating the global partition information table and the read/write control module view information, the MDC may send them to the VBS. The VBS obtains an I/O view according to the information sent by the MDC. The I/O view is a sub-table of the global partition information table that indicates the actual read/write control module of each storage partition, that is, it contains the mapping between storage partitions and read/write control modules. The I/O view may be sent directly by the MDC to the VBS, or may be generated by the VBS according to the global partition information table delivered by the MDC.
  • The VBS may run on each server node in the server cluster system as a storage driver layer, providing a block access interface, such as a SCSI-based block device access interface, to the application modules of the server cluster system.
  • After receiving a data read/write request delivered by an application, the VBS determines the storage partition to be read or written by the request, determines, according to the I/O view, the read/write control module (OSD or SSA) corresponding to the storage partition addressed by the current request, and sends the read/write request to the corresponding read/write control module to complete the reading or writing of the data.
  • The VBS may further support the management of global metadata. The global metadata records the global usage of the storage partitions in the shared storage resource pool of the server cluster system and the metadata of each storage partition. The global usage includes information about the occupied storage partitions and about the free storage partitions; the metadata of each storage partition indicates the allocation of that storage partition.
  • The storage partitions are allocated using block-based storage allocation; that is, the usage unit of each storage partition is the data block, whether the partition is being read, written, or allocated. For example, a storage partition is allocated to a user volume in units of data blocks. In the embodiments of the present invention each storage partition has a size of 10 GB, which can be equally divided into 10240 data blocks (Blocks); when data is read from or written to a storage partition, the reads and writes are likewise performed in units of data blocks. Therefore, the metadata of each storage partition specifically records the correspondence between that storage partition and the IDs of the data blocks allocated from it, each storage partition being allocated multiple data blocks. The data blocks may or may not be of equal size; the embodiments of the present invention take a data block size of 1 MB as an example.
  • The ID of each data block in the embodiments of the present invention may be composed of the ID of the user volume corresponding to the data block, or of the ID of the user volume corresponding to the data block together with a logical block address (LBA).
  • The metadata of each storage partition is, for example, as shown in Table 4 (entries illustrative):

Table 4 Metadata of a storage partition
Data block ID (Key: user volume ID + LBA) | Storage partition ID (Value)
vol1 + LBA1                               | P1
vol1 + LBA2                               | P2
  • The correspondence between a storage partition and its allocated data blocks may take the form of a Key-Value index, where the ID of a data block is the Key (for example, the Key is derived from the identifier of the user volume and the logical block address of the data block) and the storage partition ID is the Value.
  • The VBS may also determine the correspondence directly by using an algorithm instead of maintaining Table 4.
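  • A minimal sketch of such a Key-Value lookup follows, assuming a SHA-1 hash and modulo placement; the embodiments only require that a deterministic value computed from the key identify the partition:

```python
import hashlib

# A minimal sketch: the key is formed from the user volume ID and the data
# block's normalized LBA, and a hash of the key selects one of the N storage
# partitions. SHA-1 and modulo placement are illustrative assumptions.

NUM_PARTITIONS = 30  # P1-P30 in the running example

def block_key(user_volume_id: str, block_index: int) -> str:
    return f"{user_volume_id}:{block_index}"

def partition_for_block(user_volume_id: str, block_index: int) -> str:
    h = hashlib.sha1(block_key(user_volume_id, block_index).encode())
    value = int.from_bytes(h.digest()[:8], "big")
    return f"P{value % NUM_PARTITIONS + 1}"
```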
  • At startup, the VBS can obtain the storage resource allocation information by traversing the hard disks of the server nodes and the disks of the storage arrays of the SAN nodes, and initialize the storage metadata according to the global partition information table delivered by the MDC.
  • As noted above, the distributed storage controller further includes the read/write control modules: the OSD, which performs read/write control over the hard disk resources of the server nodes, and the SSA, which performs read/write control over the storage resources of the storage arrays of the network storage nodes.
  • The OSD mainly receives the read/write commands of the VBS and stores data to, and retrieves data from, the hard disks of the server nodes.
  • The SSA mainly receives the read/write commands of the VBS and stores data to, and retrieves data from, the storage arrays of the SAN nodes.
  • The SSA implements a proxy for the SAN devices on the host: a storage information view of each physical SAN device is established in the SSA, access to each physical SAN/NAS device goes through its proxy, and the SSA adds iSCSI interface capabilities.
  • The SSA may also maintain the correspondence between the unified physical addresses and the original LUN addresses on the SAN nodes, and may determine, according to that correspondence, the original LUN address corresponding to a read/write request.
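  • A sketch of this address mapping, with an assumed table layout (unified address ranges mapped to a SAN node, a LUN, and an offset within the LUN):

```python
# A hedged sketch of the SSA's address translation: a table from the
# cluster's unified physical addresses to (SAN node, original LUN, offset),
# consulted before the request is issued to the SAN device. The table
# layout and the example entry are assumptions for illustration.

class SanStorageAgent:
    def __init__(self):
        # (start, end) of unified range -> (san_node_id, lun_id, lun_base)
        self.address_map = {(0, 1023): ("SAN1", "LUN1", 0)}

    def translate(self, unified_addr: int):
        for (start, end), (san, lun, base) in self.address_map.items():
            if start <= unified_addr <= end:
                return san, lun, base + (unified_addr - start)
        raise KeyError(f"no LUN mapping for address {unified_addr}")
```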
  • Because the distributed storage controller runs in the server cluster system described above, its MDC, VBS, OSD, and SSA can integrate and utilize heterogeneous storage resources: the various heterogeneous storage resources form a cluster-wide shared resource pool in which storage resources are uniformly allocated and managed, which improves storage resource utilization; and multiple storage partitions can be read or written simultaneously, which improves read/write performance and the overall efficiency of the system.
  • As shown in FIG. 3, the process of creating a user volume in a server cluster system in which heterogeneous storage resources are integrated, according to an embodiment of the present invention, is as follows:
  • S301: The VBS deployed on one server node in the server cluster system receives a command for creating a user volume sent by the application side.
  • Specifically, the manager of an application (for example, a virtual machine) initiates the command to create a user volume, and the command is forwarded to the VBS of any one of the server nodes; preferably, the VBS of the server node where the computing resources of the virtual machine that initiated the command are located receives the command to create the user volume.
  • The server cluster system in the embodiments of the present invention further provides an active/standby VBS function: after receiving the command to create the user volume, a VBS may further determine whether it is the primary VBS in the server cluster and, if not, forward the command to create the user volume to the primary VBS.
  • The deployment of the VBS is flexible. The VBSs installed on the server nodes of the server cluster system may be left undivided into primary and standby, in which case the configuration and functions of all VBSs are equivalent; alternatively, one VBS in the server cluster system may be selected as the primary VBS, with the other VBSs serving as standby VBSs. The primary VBS implements the allocation of user volumes/data blocks and the metadata management of the storage partitions, while a standby VBS queries the primary VBS for metadata and performs operations according to the commands of the primary VBS. The embodiments of the present invention take a server cluster system with active and standby VBSs as an example.
  • S302: The primary VBS queries the global metadata according to the size information of the volume indicated by the command for creating the user volume and determines whether the remaining resources of the shared storage resource pool meet the requirement. If yes, it creates the user volume: it determines a volume identifier (ID) of the user volume, allocates initial storage partitions to the user volume, and records the identifier of the user volume and the information of the allocated initial storage partitions in the metadata of those initial storage partitions.
  • Specifically, if the command to create the user volume specifies the ID of the user volume, the primary VBS directly uses that ID; if the command does not specify the ID of the user volume, the VBS allocates an ID for the user volume.
  • The VBS further allocates initial storage partitions to the user volume, that is, selects some storage partitions from the idle storage partitions as the initial storage partitions of the user volume. The size of the initial storage resources of the user volume may be allocated flexibly according to the capacity of the user volume specified by the command for creating the user volume; the capacity specified by the command may also be used directly as the capacity of the initial storage partitions.
  • For example, if the command to create a user volume requests a 5 GB user volume, the VBS may allocate the full 5 GB to the user volume as initial storage, that is, divide the 5 GB into 5120 data blocks of 1 MB and distribute the 5120 data blocks across the storage partitions P1-P30, so that the size of the initial storage is 5 GB. Alternatively, the VBS may use a thin allocation mode and, according to the actual situation of the shared storage resource pool, allocate only part of the storage resources to the user volume: for example, allocate 1 GB of initial storage resources, dividing the 1 GB into 1024 data blocks of 1 MB and distributing the 1024 data blocks across the storage partitions P1-P30, so that the initial storage size is 1 GB.
  • The VBS records the user volume ID and the information of the allocated initial storage partitions into the metadata information of each initial storage partition in the global metadata. Optionally, when allocating the initial storage partitions for the user volume, the VBS also allocates a corresponding source physical address to each data block of the user volume.
  • The primary VBS mounts the user volume and, after the mounting succeeds, generates a virtual storage device.
  • S305: The primary VBS returns the global metadata to the MDC in the server cluster system, so that the MDC updates the global partition information table according to the global metadata.
  • Step S305 is optional, and the order in which it is performed is also flexible.
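  • A hedged sketch of the volume-creation path above, using the thin allocation mode and reusing block_key and partition_for_block from the earlier sketch; the helper names are hypothetical:

```python
# A sketch, under stated assumptions, of creating a user volume: allocate a
# volume ID, thin-provision an initial set of 1 MB data blocks, and record
# each block's partition in the global metadata.

BLOCK_MB = 1

def create_user_volume(volume_id: str, requested_gb: int,
                       initial_gb: int, metadata: dict):
    """Thin allocation: provision initial_gb now (e.g. 1 GB of a 5 GB volume)."""
    num_blocks = initial_gb * 1024 // BLOCK_MB
    for block_index in range(num_blocks):
        partition = partition_for_block(volume_id, block_index)
        # record the block against its partition in the global metadata
        metadata.setdefault(partition, []).append(
            block_key(volume_id, block_index))
    return {"volume_id": volume_id, "size_gb": requested_gb,
            "provisioned_gb": initial_gb}
```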
  • As shown in FIG. 4, the process of writing user data in a server cluster system integrated with heterogeneous storage resources, according to an embodiment of the present invention, is as follows:
  • S401: The VBS receives a write data operation request delivered by an application; the write data operation request carries the file name and the data to be written itself.
  • S402: The VBS determines, according to the file name carried in the write data operation request, the user volume ID corresponding to the current write operation. Optionally, the VBS may further calculate the size of the data to be written.
  • The VBS allocates an LBA for the data to be written (this allocation is optional at this step; the VBS may choose not to allocate an LBA for the data to be written here).
  • S403: The VBS divides the data to be written into multiple data blocks and allocates an LBA to each of the data blocks.
  • The VBS may divide the data to be written uniformly according to a unit size, for example 1 MB, that is, according to the usage unit of each storage partition. In this embodiment, the VBS divides 1 GB of data to be written into 1024 data blocks of 1 MB each; if the remainder of the data to be written is less than 1 MB, the size of the last data block is the actual size of the remainder. The VBS also assigns a corresponding LBA to each data block, e.g.:
  • Block 1: LBA 0000-1024
  • Block 2: LBA 1025-2048
  • S404: The VBS determines a corresponding storage partition for each of the data blocks.
  • Specifically, the VBS first determines the logical block address (LBA) of each data block to be written, and then combines the user volume ID and the LBA of each data block into the key of that data block; according to a distributed storage algorithm, such as a hash algorithm, the VBS determines the corresponding storage partition for each data block. The LBA here may be a value derived from the original LBA; for example, the LBA 0000-1024 corresponding to Block 1 corresponds to 1, and the LBA 1025-2048 corresponding to Block 2 corresponds to 2.
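  • A sketch of steps S403-S405 together: split the data into 1 MB blocks, normalize each block's LBA to an index (0000-1024 becomes 1, 1025-2048 becomes 2, and so on), and build one write command per block. Byte-addressed LBAs and the reuse of partition_for_block from the earlier sketch are assumptions:

```python
# A hedged sketch of building data block write commands: each command
# carries the block ID (user volume ID plus normalized block LBA), the
# target partition, and the payload itself.

BLOCK_BYTES = 1024 * 1024

def build_write_commands(volume_id: str, start_lba: int, data: bytes):
    commands = []
    for i in range(0, len(data), BLOCK_BYTES):
        lba = start_lba + i                    # byte-addressed, an assumption
        block_index = lba // BLOCK_BYTES + 1   # normalized index as in the text
        commands.append({
            "block_id": f"{volume_id}:{block_index}",
            "partition": partition_for_block(volume_id, block_index),
            "payload": data[i:i + BLOCK_BYTES],  # last block may be shorter
        })
    return commands
```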
  • S405: The VBS generates multiple data block write commands, where each data block corresponds to one data block write command, and each data block write command carries the data block to be written and the ID of the data block to be written (for example, the Block ID is composed of the user volume ID and the LBA of the data block to be written). This step may also be performed after the subsequent steps are completed; the specific implementation imposes no ordering restriction.
  • S406: The VBS determines, according to the storage partition corresponding to each data block, the read/write control module corresponding to each data block. Specifically, the VBS determines the read/write control module corresponding to each data block according to the global partition information table.
  • S407: The VBS sends each data block write command to the read/write control module corresponding to the data block, so that the read/write control module corresponding to each data block writes the data block to the storage hardware resources.
  • Specifically, if the OSD receives a data block write command, the OSD queries its own saved data block metadata according to the ID of the data block to be written to determine whether this is the first operation on that data block ID. If it is the first operation, the OSD allocates an actual physical address for the data block to be written, writes the data block to the disk location corresponding to the physical address, and updates its data block metadata to record the correspondence between the ID of the data block and the physical address. If it is not the first operation, the OSD queries its data block metadata according to the ID of the data block to be written, determines the physical address corresponding to the data block, and writes the data block to the queried physical address.
  • If the SSA receives a data block write command, the SSA queries its own saved data block metadata according to the ID of the data block to be written to determine whether this is the first operation on that data block ID. If it is the first operation, the SSA allocates, for the data block to be written, an actual logical address on the storage array of the SAN storage node, that is, a LUN address, writes the data block to the location corresponding to the LUN address, and updates its data block metadata to record the correspondence between the ID of the data block and the LUN address. If it is not the first operation, the SSA queries its data block metadata according to the ID of the data block to be written, determines the LUN address corresponding to the data block, and writes the data block to the queried LUN address.
  • Optionally, the OSD or the SSA may write the data block to a local cache layer, that is, return a response message as soon as the write operation hits the cache, to improve storage efficiency.
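  • A sketch of the OSD write handling just described: a first write allocates a physical address and records it in the block metadata, and later writes overwrite in place. The bump-pointer allocator and the in-memory stand-in for the disk are assumptions for illustration:

```python
# A minimal sketch of the OSD's write path under stated assumptions.

class ObjectStorageDelegate:
    def __init__(self):
        self.block_metadata = {}   # block_id -> physical address
        self.next_free = 0         # bump-pointer allocator (an assumption)
        self.disk = {}             # physical address -> payload (stand-in)

    def write_block(self, block_id: str, payload: bytes):
        if block_id not in self.block_metadata:      # first operation on ID
            self.block_metadata[block_id] = self.next_free
            self.next_free += len(payload)
        # subsequent writes reuse the recorded physical address
        self.disk[self.block_metadata[block_id]] = payload
        return "ok"   # could be acknowledged from a cache layer, as above
```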
  • As shown in FIG. 5, the process of reading user data in a server cluster system integrated with heterogeneous storage resources, according to an embodiment of the present invention, is as follows:
  • S501: The VBS receives a read data operation request delivered by an application; the read data operation request carries a file name and offset information of the data to be read.
  • S502: The VBS determines, according to the file name carried in the read data operation request, the user volume ID corresponding to the current read operation, and determines the LBA of the data to be read according to the offset information of the data to be read.
  • S503: The VBS determines, according to the ID of the user volume and the LBA of the data to be read, multiple data blocks to be read. The ID of each data block to be read is composed of the user volume ID and the LBA of that data block; the LBA of each data block may be determined according to the size and the offset of the data to be read.
  • S504: The VBS determines a corresponding storage partition for each data block to be read. Specifically, the VBS first determines the logical block address (LBA) of each data block to be read, then combines the user volume ID and the LBA of each data block into the key of that data block, and determines the corresponding storage partition for each data block according to a distributed storage algorithm, such as a hash algorithm.
  • S505: The VBS generates multiple data block read commands, where each data block to be read corresponds to one data block read command, and each data block read command carries the ID of the data block to be read (for example, the Block ID is composed of the user volume ID and the LBA of the data block to be read).
  • S506: The VBS determines, according to the storage partition corresponding to each data block, the read/write control module corresponding to each data block. Specifically, the VBS determines the read/write control module corresponding to each data block according to the global partition information table.
  • S507: The VBS sends each data block read command to the read/write control module corresponding to the data block, so that the read/write control module corresponding to each data block reads the data block to be read from the storage hardware resources.
  • Specifically, if the OSD receives a data block read command, the OSD queries its saved data block metadata according to the ID of the data block to be read, determines the physical address corresponding to the data block, and reads the data block from the disk location corresponding to that physical address.
  • If the SSA receives a data block read command, the SSA determines, according to its saved data block metadata, the logical address of the data block to be read on the storage array of the SAN storage node, that is, the LUN address, and reads the data block from the location corresponding to that LUN address.
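  • The read side is symmetric; a sketch reusing the ObjectStorageDelegate class from the earlier write-path sketch, where the block ID alone is resolved to a physical address through the OSD's block metadata:

```python
# A minimal sketch of the OSD read path: the read command carries only the
# block ID, which the OSD resolves through its block metadata.

def read_block(osd: ObjectStorageDelegate, block_id: str) -> bytes:
    phys = osd.block_metadata[block_id]  # KeyError if the block was never written
    return osd.disk[phys]
```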
  • The embodiments of the present invention provide a cluster system with converged computing and storage, which solves the complicated operation and high cost caused by using a dedicated SAN in the prior art. There may be multiple storage devices, and a cache can be deployed on each storage device, which greatly improves the scalability of the storage-side cache at the hardware level. Storage resources do not depend on computing resources and can be increased or decreased independently, which enhances the scalability of the system. The disk and cache resources are virtualized into a shared resource pool that is shared by all computation, so that all computing and storage can participate when data is read and written, and concurrency improves the storage performance of the system.
  • the embodiment of the present invention provides a cluster system for computing storage convergence, the high-speed data exchange network is used for communication, thereby further speeding up data exchange.
  • FIG. 6 is another composition diagram of a server cluster system according to an embodiment of the present invention. The server cluster system includes server nodes 1 and 2 and a network storage node, that is, a SAN device of manufacturer A. Server node 1 includes hard disks 1 and 2, server node 2 includes hard disk 3, and the network storage node includes a storage array, that is, LUN1 and LUN2. A distributed storage controller runs on the server nodes, and the distributed storage controller includes:
  • A metadata controller, deployed on two server nodes in this embodiment, where the MDC deployed on server node 1 is the primary MDC and the MDC deployed on server node 2 is the standby MDC. The metadata controller is configured to divide the storage resources of the hard disks and the storage array into multiple storage partitions, where the multiple storage partitions constitute a shared storage resource pool, allocate a read/write control module to each storage partition, generate global partition information that records the correspondence between each storage partition in the shared storage resource pool and its read/write control module, and deliver the global partition information to the virtual block service module.
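A toy version of this partitioning step, assuming fixed 10 GB partitions as in the embodiment and a flat table standing in for the global partition information (the Device record and the table layout are assumptions made for the example):

```python
from dataclasses import dataclass

PARTITION_SIZE_GB = 10  # the embodiment divides resources in 10 GB units

@dataclass
class Device:
    name: str         # e.g. "server1/disk1" or "san1/lun1"
    capacity_gb: int
    module: str       # read/write control module, e.g. "OSD1" or "SSA1"

def build_global_partition_info(devices):
    """Divide every device into fixed-size partitions and record, for each
    partition ID, its read/write control module and source device."""
    table, pid = {}, 1
    for dev in devices:
        for _ in range(dev.capacity_gb // PARTITION_SIZE_GB):
            table[f"P{pid}"] = {"module": dev.module, "source": dev.name}
            pid += 1
    return table

pool = build_global_partition_info([
    Device("server1/disk1", 50, "OSD1"),
    Device("san1/lun1", 50, "SSA1"),
])
print(len(pool))  # 10 partitions, P1-P10
```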
  • A virtual block service module: in this embodiment a VBS is deployed on each server node. The VBS faces the service layer, receives a storage request message, determines the storage partition corresponding to the storage request message, determines, according to the global partition information, the read/write control module corresponding to that storage partition, and sends the storage request message to the determined read/write control module.
  • A read/write control module, configured to execute, facing the hard disk or the network storage node, the operation requested by the storage request message.
  • In this embodiment, OSD1 and OSD2 are deployed on server node 1, and OSD3, SSA1, and SSA2 are deployed on server node 2. OSD1 performs read/write control on hard disk 1, OSD2 performs read/write control on hard disk 2, OSD3 performs read/write control on hard disk 3, SSA1 performs read/write control on LUN1, and SSA2 performs read/write control on LUN2.
  • The metadata controller is further configured to determine the deployment of the object storage agents and the network storage agents on the server nodes, generate view information of the read/write control modules according to the determined deployment, where the view information indicates the server node on which each read/write control module is deployed, and deliver the view information of the read/write control modules to the virtual block service module. Further, the metadata controller is specifically configured to deploy the object storage agents on server nodes in the server cluster system that have hard disk resources, and to deploy the network storage agents on lightly loaded server nodes in the server cluster system. In this embodiment, the metadata controller deploys SSA1 and SSA2 on server node 2. The virtual block service module is specifically configured to determine routing information of a read/write control module according to the view information of the read/write control modules, and to send the storage request message to the determined read/write control module. The embodiment shown in FIG. 6 may also execute the method described in any one of FIG. 3 to FIG. 5, and details are not described herein again.
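A sketch of such a placement rule, under the assumption that load is a known per-node number and that one OSD is created per local disk; the function, its inputs, and the returned view layout are hypothetical:

```python
def plan_module_deployment(nodes):
    """Place one OSD per local disk on the node that owns it, and place the
    SSAs on the least-loaded node. `nodes` maps node name -> {"disks": [...],
    "load": float}. Returns view information: module name -> hosting node."""
    view = {}
    osd_seq = 1
    for node, info in sorted(nodes.items()):
        for _disk in info["disks"]:
            view[f"OSD{osd_seq}"] = node
            osd_seq += 1
    lightest = min(nodes, key=lambda n: nodes[n]["load"])
    for i in (1, 2):                      # e.g. SSA1/SSA2 for LUN1/LUN2
        view[f"SSA{i}"] = lightest
    return view

print(plan_module_deployment({
    "server1": {"disks": ["disk1", "disk2"], "load": 0.7},
    "server2": {"disks": ["disk3"], "load": 0.3},
}))
# {'OSD1': 'server1', 'OSD2': 'server1', 'OSD3': 'server2',
#  'SSA1': 'server2', 'SSA2': 'server2'}
```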
  • FIG. 7 is a schematic structural diagram of a computer according to an embodiment of the present invention.
  • The computer of the embodiment of the present invention may include: a processor 701, a memory 702, a system bus 704, and a communication interface 705. The processor 701, the memory 702, and the communication interface 705 are connected by the system bus 704 and communicate with one another through it.
  • The processor 701 may be a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention.
  • The memory 702 may be a high-speed RAM or a non-volatile memory, for example, at least one disk memory.
  • The memory 702 is configured to store a computer-executable instruction 703. Specifically, the computer-executable instruction 703 may include program code. When the computer runs, the processor 701 executes the computer-executable instruction 703 and may thereby perform the method provided by any one of the embodiments of the present invention. More specifically, if the distributed storage controller described in the embodiments of the present invention is implemented by computer code, the computer performs the functions of the distributed storage controller of the embodiments of the present invention. It should be understood that, in the embodiments of the present invention, "B corresponding to A" means that B is associated with A, and that B can be determined according to A.
  • However, it should also be understood that determining B according to A does not mean that B is determined based only on A; B may also be determined based on A and/or other information.
  • Those of ordinary skill in the art will appreciate that the units and algorithm steps of the various examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of the various examples have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered to be beyond the scope of the present invention.
  • In the embodiments provided in this application, it should be understood that the disclosed system may be implemented in other ways. For example, the system embodiment described above is only illustrative; the division into units is only a division by logical function, and there may be other division manners in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may also be an electrical, mechanical, or other form of connection. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present invention.
  • In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware, or in the form of a software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as a standalone product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present invention essentially, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • The foregoing descriptions are merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any modification or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention, which shall be subject to the protection scope of the claims.

Abstract

Embodiments of the present invention provide a method and system for sharing storage resources. The storage resources of hard disks and of a storage array are divided into multiple storage partitions that constitute a shared storage resource pool; a read/write control module is allocated to each storage partition, and global partition information is generated to record the correspondence between each storage partition in the shared storage resource pool and its read/write control module. When a storage request message is subsequently received, the storage partition corresponding to the storage request message can be determined; the read/write control module corresponding to that storage partition can be determined according to the global partition information; and the storage request message can finally be sent to the determined read/write control module, so that the read/write control module executes the operation requested by the storage request message. The embodiments of the present invention achieve fast and simple convergence of heterogeneous storage resources, make efficient use of various storage resources, save costs, and avoid wasting resources.
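By way of illustration only, the sketch below strings these steps together: a flat table standing in for the global partition information, and a dispatch function that resolves a storage request to its read/write control module. All names and table layouts here are hypothetical, not the patented implementation:

```python
class ReadWriteModule:
    def __init__(self, name):
        self.name = name
    def execute(self, request):
        return f"{self.name} handled {request['op']} for block {request['block_id']}"

# Global partition information: partition ID -> responsible module (assumed layout).
modules = {"OSD1": ReadWriteModule("OSD1"), "SSA1": ReadWriteModule("SSA1")}
global_partition_info = {"P1": "OSD1", "P2": "SSA1"}

def handle_storage_request(request):
    """Route a storage request as the abstract describes: find the partition,
    look up its read/write control module, and forward the request."""
    partition = request["partition"]          # e.g. derived from volume ID + LBA
    module = modules[global_partition_info[partition]]
    return module.execute(request)

print(handle_storage_request({"op": "write", "block_id": "vol1-42", "partition": "P2"}))
```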


Claims

1. A method for sharing storage resources, applied to a server cluster system, wherein the server cluster system comprises a server node and a network storage node, the server node comprises a hard disk, and the network storage node comprises a storage array, the method comprising: dividing storage resources of the hard disk and the storage array into multiple storage partitions, wherein the multiple storage partitions constitute a shared storage resource pool; allocating a read/write control module to each storage partition; generating global partition information, wherein the global partition information records a correspondence between each storage partition in the shared storage resource pool and its read/write control module; receiving a storage request message and determining a storage partition corresponding to the storage request message; determining, according to the global partition information, the read/write control module corresponding to the storage partition corresponding to the storage request message; and sending the storage request message to the determined read/write control module, so that the read/write control module executes an operation requested by the storage request message.

2. The method according to claim 1, wherein the determining a storage partition corresponding to the storage request message comprises: determining an ID of a user volume on which data to be operated on by the storage request message resides and a logical block address (LBA) of at least one data block of the data to be operated on; and determining, according to the user volume ID and the LBA of the at least one data block, a storage partition corresponding to the at least one data block.

3. The method according to claim 2, further comprising: establishing metadata of each storage partition of the shared storage resource pool, wherein the metadata of each storage partition records a correspondence between the ID of that storage partition and the IDs of the data blocks allocated to that storage partition; and wherein the determining, according to the user volume ID and the LBA of the at least one data block, a storage partition corresponding to the at least one data block comprises: determining an ID of the at least one data block according to the user volume ID and the LBA of the at least one data block, and querying the metadata of each storage partition to determine an ID of the storage partition corresponding to the at least one data block.

4. The method according to claim 2, wherein the determining, according to the user volume ID and the LBA of the at least one data block, a storage partition corresponding to the at least one data block comprises: combining the user volume ID and the LBA of each data block into a key value of that data block, computing a value corresponding to the key value of each data block, and determining, according to the value, the storage partition corresponding to each data block.

5. The method according to claim 2, 3, or 4, wherein the receiving a storage request message comprises: receiving a command for creating a user volume, wherein the command for creating a user volume indicates a size of the user volume; and wherein the determining an ID of a user volume on which the data to be operated on resides and an LBA of at least one data block of the data to be operated on, and the determining, according to the user volume ID and the LBA of the at least one data block, a storage partition corresponding to the at least one data block, comprise: allocating the user volume ID to the user volume; determining, according to the size of the user volume, a size of initial storage resources allocated to the user volume, and determining an LBA of at least one data block according to the size of the initial storage resources; and determining, according to the user volume ID and the LBA of the at least one data block, a storage partition corresponding to the at least one data block.
6. The method according to claim 5, wherein the determining a size of initial storage resources allocated to the user volume comprises: the size of the initial storage resources is less than or equal to the size of the user volume indicated by the command for creating the user volume.

7. The method according to claim 2, 3, or 4, wherein the receiving a storage request message comprises: receiving a write data operation request; and wherein the determining an ID of a user volume on which the data to be operated on resides and an LBA of at least one data block of the data to be operated on, and the determining, according to the user volume ID and the LBA of the at least one data block, a storage partition corresponding to the at least one data block, comprise: determining, according to a file name carried in the write data operation request, a user volume ID corresponding to the current write operation; dividing data to be written into multiple data blocks to be written, and allocating an LBA to each data block to be written; and determining, according to the user volume ID corresponding to the current write operation and the LBA of each data block to be written, a storage partition corresponding to each data block to be written.

8. The method according to claim 7, wherein the data to be written is divided into the multiple data blocks to be written of equal size.

9. The method according to claim 8, wherein the determining, according to the global partition information, a read/write control module corresponding to the storage request message, and the sending the storage request message to the determined read/write control module so that the read/write control module executes the operation requested by the storage request message, comprise: determining, according to the global partition information, the read/write control module corresponding to the storage partition corresponding to each data block to be written; generating multiple data block write commands, wherein each data block write command corresponds to one data block to be written, and each data block write command carries the data block to be written and an ID of the data to be written; and separately sending each data block write command to the read/write control module corresponding to that data block to be written, so that the read/write control module corresponding to each data block to be written writes that data block into storage hardware resources.

10. The method according to claim 2, 3, or 4, wherein the receiving a storage request message comprises: receiving a read data operation request, wherein the read data operation request carries a file name and an offset of data to be read; and wherein the determining an ID of a user volume on which the data to be operated on resides and an LBA of at least one data block of the data to be operated on, and the determining, according to the user volume ID and the LBA of the at least one data block, a storage partition corresponding to the at least one data block, comprise: determining, according to the file name carried in the read data operation request, a user volume ID corresponding to the current read operation; determining LBAs of multiple data blocks to be read according to offset information of the data to be read; and determining, according to the user volume ID corresponding to the current read operation and the LBA of each data block to be read, a storage partition corresponding to each data block to be read.

11. The method according to claim 10, wherein multiple data blocks to be read of equal size are determined according to the offset information of the data to be read.

12. The method according to claim 10, wherein the determining, according to the global partition information, a read/write control module corresponding to the storage request message, and the sending the storage request message to the determined read/write control module so that the read/write control module executes the operation requested by the storage request message, comprise: determining, according to the global partition information, the read/write control modules corresponding to the storage partitions corresponding to the multiple data blocks to be read; generating multiple data block read commands, wherein each data block read command corresponds to one data block to be read, and each data block read command carries the data block to be read and an ID of the data block to be read; and separately sending each data block read command to the read/write control module corresponding to that data block to be read, so that the read/write control module corresponding to each data block to be read reads that data block.
13. A server cluster system, comprising a server node and a network storage node, wherein the server node comprises a hard disk, the network storage node comprises a storage array, and a distributed storage controller runs on the server node, the distributed storage controller comprising: a metadata controller, configured to divide storage resources of the hard disk and the storage array into multiple storage partitions that constitute a shared storage resource pool, allocate a read/write control module to each storage partition, generate global partition information that records a correspondence between each storage partition in the shared storage resource pool and its read/write control module, and deliver the global partition information to a virtual block service module; the virtual block service module, configured to face the service layer, receive a storage request message, determine a storage partition corresponding to the storage request message, determine, according to the global partition information, the read/write control module corresponding to the storage partition corresponding to the storage request message, and send the storage request message to the determined read/write control module; and the read/write control module, configured to execute, facing the hard disk or the network storage node, an operation requested by the storage request message.

14. The system according to claim 13, wherein the read/write control module comprises an object storage agent and a network storage agent; the metadata controller is specifically configured to allocate the object storage agent as the read/write control module for the storage partitions composed of local hard disks, and allocate the network storage agent as the read/write control module for the storage partitions composed of the storage array; the object storage agent is configured to receive a storage request message, determine a physical address corresponding to the storage request message, and execute, on the hard disk according to the physical address, the operation requested by the storage request message; and the network storage agent is configured to receive a storage request message, determine a logical address of the network storage node corresponding to the storage request message, and execute, on the storage array according to the logical address, the operation requested by the storage request message.

15. The system according to claim 13, wherein the virtual block service module is specifically configured to determine an ID of a user volume on which data to be operated on by the storage request message resides and an LBA of at least one data block of the data to be operated on, and determine, according to the user volume ID and the LBA of the at least one data block, a storage partition corresponding to the at least one data block.

16. The system according to claim 15, wherein the virtual block service module is specifically configured to establish metadata of each storage partition of the shared storage resource pool, wherein the metadata of each storage partition records a correspondence between the ID of that storage partition and the IDs of the data blocks allocated to it, determine an ID of the at least one data block according to the user volume ID and the LBA of the at least one data block, and query the metadata of each storage partition to determine an ID of the storage partition corresponding to the at least one data block.

17. The system according to claim 15, wherein the virtual block service module is specifically configured to combine the user volume ID and the LBA of each data block into a key value of that data block, compute a value corresponding to the key value of each data block, and determine, according to the value, the storage partition corresponding to each data block.

18. The system according to any one of claims 13 to 17, wherein the virtual block service module is specifically configured to receive a command for creating a user volume, wherein the command indicates a size of the user volume, allocate the user volume ID to the user volume, determine, according to the size of the user volume, a size of initial storage resources allocated to the user volume, determine an LBA of at least one data block according to the size of the initial storage resources, and determine, according to the user volume ID and the LBA of the at least one data block, a storage partition corresponding to the at least one data block.

19. The system according to any one of claims 13 to 17, wherein the virtual block service module is specifically configured to receive a write data operation request, determine, according to a file name carried in the write data operation request, a user volume ID corresponding to the current write operation, divide data to be written into multiple data blocks to be written and allocate an LBA to each data block to be written, determine, according to the user volume ID corresponding to the current write operation and the LBA of each data block to be written, a storage partition corresponding to each data block to be written, determine, according to the global partition information, the read/write control module corresponding to the storage partition corresponding to each data block to be written, generate multiple data block write commands, wherein each data block write command corresponds to one data block to be written and carries the data block to be written and an ID of the data to be written, and separately send each data block write command to the read/write control module corresponding to that data block to be written.

20. The system according to any one of claims 13 to 17, wherein the virtual block service module is specifically configured to receive a read data operation request carrying a file name and an offset of data to be read, determine, according to the file name carried in the read data operation request, a user volume ID corresponding to the current read operation, determine LBAs of multiple data blocks to be read according to offset information of the data to be read, determine, according to the user volume ID corresponding to the current read operation and the LBA of each data block to be read, a storage partition corresponding to each data block to be read, determine, according to the global partition information, the read/write control modules corresponding to the storage partitions corresponding to the multiple data blocks to be read, generate multiple data block read commands, wherein each data block read command corresponds to one data block to be read and carries the data block to be read and an ID of the data block to be read, and separately send each data block read command to the read/write control module corresponding to that data block to be read.
21. The system according to any one of claims 13 to 17, wherein the metadata controller is further configured to determine the deployment of the object storage agent and the network storage agent on the server nodes, generate view information of the read/write control modules according to the determined deployment, wherein the view information indicates the server node on which each read/write control module is deployed, and deliver the view information of the read/write control modules to the virtual block service module; and the virtual block service module is specifically configured to determine routing information of a read/write control module according to the view information of the read/write control modules, and send the storage request message to the determined read/write control module.

22. The system according to claim 21, wherein the metadata controller is specifically configured to deploy the object storage agent on server nodes in the server cluster system that have hard disk resources, and to deploy the network storage agent on lightly loaded server nodes in the server cluster system.

23. The system according to claim 21, wherein the metadata controller is further configured to collect available storage resources of the hard disks of the server nodes and available storage resources of the storage array of the network storage node, and divide the available storage resources of the hard disks and the storage array into multiple storage partitions.

24. A computer, comprising: a processor, a memory, a bus, and a communication interface, wherein the memory is configured to store a computer-executable instruction, the processor is connected to the memory through the bus, and when the computer runs, the processor executes the computer-executable instruction stored in the memory, so that the computer performs the method for sharing storage resources according to any one of claims 1 to 12.

25. A computer-readable medium, comprising a computer-executable instruction, wherein when a processor of a computer executes the computer-executable instruction, the computer performs the method for sharing storage resources according to any one of claims 1 to 12.
PCT/CN2013/091253 2012-12-31 2013-12-31 一种共享存储资源的方法和系统 WO2014101896A1 (zh)

Priority Applications (7)

Application Number Priority Date Filing Date Title
JP2015549981A JP6019513B2 (ja) 2012-12-31 2013-12-31 記憶リソースを共有する方法およびシステム
EP13869766.9A EP2930910B1 (en) 2012-12-31 2013-12-31 Method and system for sharing storage resources
EP16194969.8A EP3188449B1 (en) 2012-12-31 2013-12-31 Method and system for sharing storage resource
ES13869766.9T ES2624412T3 (es) 2012-12-31 2013-12-31 Procedimiento y sistema para compartir recursos de almacenamiento
CN201380002608.1A CN103797770B (zh) 2012-12-31 2013-12-31 一种共享存储资源的方法和系统
US14/754,378 US9733848B2 (en) 2012-12-31 2015-06-29 Method and system for pooling, partitioning, and sharing network storage resources
US15/675,226 US10082972B2 (en) 2012-12-31 2017-08-11 Method and system for pooling, partitioning, and sharing network storage resources

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
PCT/CN2012/088109 WO2014101218A1 (zh) 2012-12-31 2012-12-31 一种计算存储融合的集群系统
CNPCT/CN2012/088109 2012-12-31

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/754,378 Continuation US9733848B2 (en) 2012-12-31 2015-06-29 Method and system for pooling, partitioning, and sharing network storage resources

Publications (1)

Publication Number Publication Date
WO2014101896A1 true WO2014101896A1 (zh) 2014-07-03

Family

ID=49866757

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/CN2012/088109 WO2014101218A1 (zh) 2012-12-31 2012-12-31 一种计算存储融合的集群系统
PCT/CN2013/091253 WO2014101896A1 (zh) 2012-12-31 2013-12-31 一种共享存储资源的方法和系统

Family Applications Before (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/088109 WO2014101218A1 (zh) 2012-12-31 2012-12-31 一种计算存储融合的集群系统

Country Status (6)

Country Link
US (4) US10481804B2 (zh)
EP (2) EP3188449B1 (zh)
JP (1) JP6019513B2 (zh)
CN (1) CN103503414B (zh)
ES (1) ES2624412T3 (zh)
WO (2) WO2014101218A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111209253A (zh) * 2019-12-30 2020-05-29 河南创新科信息技术有限公司 分布式存储设备性能提升方法、装置及分布式存储设备
CN111459679A (zh) * 2020-04-03 2020-07-28 宁波艾欧迪互联科技有限公司 一种用于5g通信测试仪表测试数据的并行处理方法
CN111786930A (zh) * 2019-04-03 2020-10-16 上海宽带技术及应用工程研究中心 虚拟现实的数据共享系统、方法、装置、终端、及介质

Families Citing this family (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190028542A1 (en) * 2016-02-03 2019-01-24 Surcloud Corp. Method and device for transmitting data
CN105657066B (zh) * 2016-03-23 2019-06-14 天津书生云科技有限公司 用于存储系统的负载再均衡方法及装置
CN105872031B (zh) * 2016-03-26 2019-06-14 天津书生云科技有限公司 存储系统
WO2014101218A1 (zh) 2012-12-31 2014-07-03 华为技术有限公司 一种计算存储融合的集群系统
US9882984B2 (en) 2013-08-02 2018-01-30 International Business Machines Corporation Cache migration management in a virtualized distributed computing system
WO2015176262A1 (zh) * 2014-05-22 2015-11-26 华为技术有限公司 一种节点互连装置、资源控制节点和服务器系统
CN104135514B (zh) * 2014-07-25 2017-10-17 英业达科技有限公司 融合式虚拟化存储系统
CN109918021B (zh) * 2014-11-05 2022-01-07 超聚变数字技术有限公司 数据处理方法和装置
CN104486444A (zh) * 2014-12-30 2015-04-01 北京天云融创软件技术有限公司 云管理平台的异构api转化系统
US10425352B2 (en) * 2015-03-09 2019-09-24 International Business Machines Corporation Policy driven storage hardware allocation
JP6448779B2 (ja) * 2015-05-14 2019-01-09 株式会社日立製作所 サーバストレージシステムを含んだ計算機システム
US10346237B1 (en) * 2015-08-28 2019-07-09 EMC IP Holding Company LLC System and method to predict reliability of backup software
CN107211003B (zh) * 2015-12-31 2020-07-14 华为技术有限公司 分布式存储系统及管理元数据的方法
CN107851062A (zh) * 2015-12-31 2018-03-27 华为技术有限公司 一种主机集群中缓存管理方法及主机
US10073725B2 (en) * 2016-02-11 2018-09-11 Micron Technology, Inc. Distributed input/output virtualization
CN106657356A (zh) * 2016-12-29 2017-05-10 郑州云海信息技术有限公司 一种云存储系统的数据写入方法、装置及云存储系统
US10768986B2 (en) 2017-01-06 2020-09-08 International Business Machines Corporation Management and utilization of storage capacities in a converged system
US10824355B2 (en) 2017-01-10 2020-11-03 International Business Machines Corporation Hierarchical management of storage capacity and data volumes in a converged system
US10938901B2 (en) 2017-01-11 2021-03-02 International Business Machines Corporation Management and utilization of data volumes in a converged system
US10394454B2 (en) * 2017-01-13 2019-08-27 Arm Limited Partitioning of memory system resources or performance monitoring
CN106844052A (zh) * 2017-01-22 2017-06-13 郑州云海信息技术有限公司 一种基于Windows Server构建融合集群的方法及装置
CN106919456A (zh) * 2017-03-01 2017-07-04 郑州云海信息技术有限公司 一种实现服务器串联的模块
US10454844B2 (en) * 2017-03-08 2019-10-22 A10 Networks, Inc. Dynamic capacity planning for application delivery platform across multiple cloud deployment
WO2019071595A1 (zh) * 2017-10-13 2019-04-18 华为技术有限公司 分布式块存储系统中数据存储方法、装置及计算机可读存储介质
CN107807794B (zh) * 2017-10-31 2021-02-26 新华三技术有限公司 一种数据存储方法和装置
CN107729536B (zh) * 2017-10-31 2020-09-08 新华三技术有限公司 一种数据存储方法和装置
CN108235751B (zh) 2017-12-18 2020-04-14 华为技术有限公司 识别对象存储设备亚健康的方法、装置和数据存储系统
US11194746B2 (en) * 2017-12-22 2021-12-07 Seagate Technology Llc Exchanging drive information
CN109039743B (zh) * 2018-08-03 2022-05-10 陕西中光电信高科技有限公司 分布式存储ceph群集网络的集中管理方法
CN109120556B (zh) * 2018-08-21 2019-07-09 广州市品高软件股份有限公司 一种云主机访问对象存储服务器的方法及系统
US11106378B2 (en) 2018-11-21 2021-08-31 At&T Intellectual Property I, L.P. Record information management based on self describing attributes
US11042411B2 (en) * 2019-03-15 2021-06-22 Toshiba Memory Corporation Data storage resource management
CN112099728B (zh) * 2019-06-18 2022-09-16 华为技术有限公司 一种执行写操作、读操作的方法及装置
CN111158595B (zh) * 2019-12-27 2023-05-23 中国建设银行股份有限公司 企业级异构存储资源调度方法及系统
CN111625401B (zh) * 2020-05-29 2023-03-21 浪潮电子信息产业股份有限公司 基于集群文件系统的数据备份方法、装置及可读存储介质
CN113946276A (zh) * 2020-07-16 2022-01-18 北京达佳互联信息技术有限公司 集群中的磁盘管理方法、装置及服务器
CN111949217A (zh) * 2020-08-21 2020-11-17 广东韶钢松山股份有限公司 超融合一体机及其软件定义存储sds处理方法和系统
CN112948300B (zh) * 2021-01-19 2023-02-10 浙江大华技术股份有限公司 服务器、存算一体设备以及服务器系统
US11016688B1 (en) * 2021-01-06 2021-05-25 Open Drives LLC Real-time localized data access in a distributed data storage system
CN113031858B (zh) * 2021-02-10 2022-09-20 山东英信计算机技术有限公司 一种基于多双活存储的数据处理方法、系统及介质
CN113342509B (zh) * 2021-08-03 2021-12-07 北京达佳互联信息技术有限公司 数据处理方法、装置、电子设备及存储介质
CN113821165B (zh) * 2021-08-20 2023-12-22 济南浪潮数据技术有限公司 一种分布式集群融合存储方法、系统及设备

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1652090A (zh) * 2005-02-23 2005-08-10 北京邦诺存储科技有限公司 网络存储系统中的数据管理方法及其构建的网络存储系统
CN102223409A (zh) * 2011-06-13 2011-10-19 浪潮(北京)电子信息产业有限公司 一种网络存储资源应用系统及方法
CN102521063A (zh) * 2011-11-30 2012-06-27 广东电子工业研究院有限公司 一种适用于虚拟机迁移和容错的共享存储方法

Family Cites Families (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6389503B1 (en) 1997-08-04 2002-05-14 Exabyte Corporation Tape drive emulation by removable disk drive and media formatted therefor
US6829610B1 (en) * 1999-03-11 2004-12-07 Microsoft Corporation Scalable storage system supporting multi-level query resolution
US6732166B1 (en) 1999-05-28 2004-05-04 Intel Corporation Method of distributed resource management of I/O devices in a network cluster
JP2001337850A (ja) 2000-05-25 2001-12-07 Hitachi Ltd 記憶装置および記憶装置の制御方法
WO2003007154A2 (en) * 2001-07-09 2003-01-23 Cable & Wireless Internet Services, Inc. Methods and systems for shared storage virtualization
CN1602480A (zh) 2001-12-10 2005-03-30 单球体有限公司 管理附装在数据网络上的存储器资源
US7379990B2 (en) 2002-08-12 2008-05-27 Tsao Sheng Ted Tai Distributed virtual SAN
JP2005539309A (ja) * 2002-09-16 2005-12-22 ティギ・コーポレイション 記憶システムアーキテクチャおよび多重キャッシュ装置
US7624170B2 (en) * 2002-09-26 2009-11-24 International Business Machines Corporation Integrated storage appliance
US7565566B2 (en) 2003-04-23 2009-07-21 Dot Hill Systems Corporation Network storage appliance with an integrated switch
US7320083B2 (en) 2003-04-23 2008-01-15 Dot Hill Systems Corporation Apparatus and method for storage controller to deterministically kill one of redundant servers integrated within the storage controller chassis
US7380039B2 (en) * 2003-12-30 2008-05-27 3Tera, Inc. Apparatus, method and system for aggregrating computing resources
JP4718285B2 (ja) 2005-09-22 2011-07-06 株式会社日立製作所 ファイル管理機能を備えたコンピュータシステム、ストレージ装置およびファイル管理方法
CN101169725A (zh) * 2006-10-23 2008-04-30 国际商业机器公司 随需个人计算机供应系统和方法
US8091087B2 (en) * 2007-04-20 2012-01-03 Microsoft Corporation Scheduling of new job within a start time range based on calculated current load and predicted load value of the new job on media resources
US8706914B2 (en) * 2007-04-23 2014-04-22 David D. Duchesneau Computing infrastructure
US8396937B1 (en) * 2007-04-30 2013-03-12 Oracle America, Inc. Efficient hardware scheme to support cross-cluster transactional memory
US9824006B2 (en) * 2007-08-13 2017-11-21 Digital Kiva, Inc. Apparatus and system for object-based storage solid-state device
US20090049236A1 (en) * 2007-08-15 2009-02-19 Hitachi, Ltd. System and method for data protection management for network storage
CN101374192A (zh) 2007-09-26 2009-02-25 北京数字太和科技有限责任公司 一种利用数字电视网络下载并存储多媒体数据的方法
JP2009223442A (ja) * 2008-03-13 2009-10-01 Hitachi Ltd ストレージシステム
RU2507703C2 (ru) * 2008-05-21 2014-02-20 Телефонактиеболагет Л М Эрикссон (Пабл) Объединение ресурсов в сервере центра коммутации с кластером с электронными платами
CN100555206C (zh) * 2008-05-27 2009-10-28 中国科学院计算技术研究所 一种绑定计算资源和存储资源的装置
CN101730313A (zh) 2008-10-10 2010-06-09 中国移动通信集团公司 多载波移动通信系统中的通信方法、基站以及通信系统
US8525925B2 (en) * 2008-12-29 2013-09-03 Red.Com, Inc. Modular digital camera
JP5286192B2 (ja) * 2009-08-12 2013-09-11 株式会社日立製作所 ストレージシステムの容量を管理する管理計算機及びストレージシステムの容量管理方法
EP2476055B1 (en) 2009-09-08 2020-01-22 SanDisk Technologies LLC Apparatus, system, and method for caching data on a solid-state storage device
US20110087833A1 (en) 2009-10-08 2011-04-14 Advanced Micro Devices, Inc. Local nonvolatile write-through cache for a data server having network-based data storage, and related operating methods
US20110153570A1 (en) * 2009-12-18 2011-06-23 Electronics And Telecommunications Research Institute Data replication and recovery method in asymmetric clustered distributed file system
US8290919B1 (en) * 2010-08-27 2012-10-16 Disney Enterprises, Inc. System and method for distributing and accessing files in a distributed storage system
CN102480791B (zh) 2010-11-30 2014-05-21 普天信息技术研究院有限公司 一种协作多点传输的调度方法
CN102076096B (zh) 2011-01-12 2013-08-28 上海华为技术有限公司 一种 CoMP的实现方法、装置及基站
CN102164177A (zh) 2011-03-11 2011-08-24 浪潮(北京)电子信息产业有限公司 一种集群共享存储池的方法、装置及系统
CN102520883B (zh) 2011-12-12 2015-05-20 杭州华三通信技术有限公司 一种数据存取方法及其装置
US20150019792A1 (en) * 2012-01-23 2015-01-15 The Regents Of The University Of California System and method for implementing transactions using storage device support for atomic updates and flexible interface for managing data logging
CN102664923A (zh) 2012-03-30 2012-09-12 浪潮电子信息产业股份有限公司 一种利用Linux全局文件系统实现共享存储池的方法
CN102739771A (zh) * 2012-04-18 2012-10-17 上海和辰信息技术有限公司 一种支持服务融合的云应用集成管理平台和方法
WO2014000271A1 (zh) 2012-06-29 2014-01-03 华为技术有限公司 一种pcie交换系统、装置及交换方法
WO2014101218A1 (zh) 2012-12-31 2014-07-03 华为技术有限公司 一种计算存储融合的集群系统
US10313251B2 (en) * 2016-02-01 2019-06-04 Netapp, Inc. Methods and systems for managing quality of service in a networked storage environment
US10048896B2 (en) * 2016-03-16 2018-08-14 Netapp, Inc. Methods and systems for determining performance capacity of a resource of a networked storage environment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1652090A (zh) * 2005-02-23 2005-08-10 北京邦诺存储科技有限公司 网络存储系统中的数据管理方法及其构建的网络存储系统
CN102223409A (zh) * 2011-06-13 2011-10-19 浪潮(北京)电子信息产业有限公司 一种网络存储资源应用系统及方法
CN102521063A (zh) * 2011-11-30 2012-06-27 广东电子工业研究院有限公司 一种适用于虚拟机迁移和容错的共享存储方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2930910A4 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111786930A (zh) * 2019-04-03 2020-10-16 上海宽带技术及应用工程研究中心 虚拟现实的数据共享系统、方法、装置、终端、及介质
CN111209253A (zh) * 2019-12-30 2020-05-29 河南创新科信息技术有限公司 分布式存储设备性能提升方法、装置及分布式存储设备
CN111209253B (zh) * 2019-12-30 2023-10-24 河南创新科信息技术有限公司 分布式存储设备性能提升方法、装置及分布式存储设备
CN111459679A (zh) * 2020-04-03 2020-07-28 宁波艾欧迪互联科技有限公司 一种用于5g通信测试仪表测试数据的并行处理方法
CN111459679B (zh) * 2020-04-03 2023-10-27 宁波大学 一种用于5g通信测试仪表测试数据的并行处理方法

Also Published As

Publication number Publication date
US11042311B2 (en) 2021-06-22
EP3188449B1 (en) 2018-09-19
ES2624412T3 (es) 2017-07-14
EP3188449A1 (en) 2017-07-05
US20140189128A1 (en) 2014-07-03
US10082972B2 (en) 2018-09-25
JP2016507814A (ja) 2016-03-10
US20200065010A1 (en) 2020-02-27
WO2014101218A1 (zh) 2014-07-03
CN103503414B (zh) 2016-03-09
US10481804B2 (en) 2019-11-19
EP2930910A4 (en) 2015-11-25
US20150301759A1 (en) 2015-10-22
EP2930910A1 (en) 2015-10-14
US20170336998A1 (en) 2017-11-23
US9733848B2 (en) 2017-08-15
CN103503414A (zh) 2014-01-08
EP2930910B1 (en) 2017-02-22
JP6019513B2 (ja) 2016-11-02

Similar Documents

Publication Publication Date Title
US10082972B2 (en) Method and system for pooling, partitioning, and sharing network storage resources
US10708356B2 (en) Cloud computing system and method for managing storage resources therein
CN103797770B (zh) 一种共享存储资源的方法和系统
KR102457611B1 (ko) 터넌트-어웨어 스토리지 쉐어링 플랫폼을 위한 방법 및 장치
US9483187B2 (en) Quality of service implementation in a networked storage system with hierarchical schedulers
KR102044023B1 (ko) 키 값 기반 데이터 스토리지 시스템 및 이의 운용 방법
KR20200017363A (ko) 호스트 스토리지 서비스들을 제공하기 위한 NVMe 프로토콜에 근거하는 하나 이상의 호스트들과 솔리드 스테이트 드라이브(SSD)들 간의 관리되는 스위칭
KR20140111589A (ko) 가상 기계들을 지원하는 플래시―기반 캐싱 해결책에서의 동적인 캐시 공유를 위한 시스템, 방법 및 컴퓨터―판독가능한 매체
US11262916B2 (en) Distributed storage system, data processing method, and storage node
US20150269187A1 (en) Apparatus and method for providing virtual machine image file
WO2013004136A1 (zh) 分布式存储方法、装置和系统
TW201220060A (en) Latency reduction associated with a response to a request in a storage system
JP7467593B2 (ja) リソース割振り方法、記憶デバイス、および記憶システム
US11409454B1 (en) Container ownership protocol for independent node flushing
US11079961B1 (en) Storage system with write-via-hash functionality for synchronous replication of logical storage volumes
US9755986B1 (en) Techniques for tightly-integrating an enterprise storage array into a distributed virtualized computing environment
CN109814805B (zh) 存储系统中分条重组的方法及分条服务器
US11327895B1 (en) Protocol for processing requests that assigns each request received by a node a sequence identifier, stores data written by the request in a cache page block, stores a descriptor for the request in a cache page descriptor, and returns a completion acknowledgement of the request
KR20120063946A (ko) 대용량 통합 메모리를 위한 메모리 장치 및 이의 메타데이터 관리 방법
JP7107981B2 (ja) 計算機システム
CN112714910B (zh) 分布式存储系统及计算机程序产品

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13869766

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2015549981

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

REEP Request for entry into the european phase

Ref document number: 2013869766

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2013869766

Country of ref document: EP