CN105657066B - Load rebalancing method and device for storage system - Google Patents

Info

Publication number
CN105657066B
Authority
CN
China
Prior art keywords
storage
node
memory
memory node
storage region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610173784.7A
Other languages
Chinese (zh)
Other versions
CN105657066A (en
Inventor
王东临
金友兵
莫仲华
齐宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shusheng Information Technology Co ltd
Original Assignee
TIANJIN SURSEN CLOUD TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TIANJIN SURSEN CLOUD TECHNOLOGY Co Ltd filed Critical TIANJIN SURSEN CLOUD TECHNOLOGY Co Ltd
Priority to CN201610173784.7A priority Critical patent/CN105657066B/en
Publication of CN105657066A publication Critical patent/CN105657066A/en
Priority to PCT/CN2017/077758 priority patent/WO2017162179A1/en
Priority to US16/139,712 priority patent/US10782898B2/en
Priority to US16/378,076 priority patent/US20190235777A1/en
Application granted granted Critical
Publication of CN105657066B publication Critical patent/CN105657066B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004 Server selection for load balancing
    • H04L67/1008 Server selection for load balancing based on parameters of servers, e.g. available memory or workload

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a load rebalancing method and device for a storage system. The method comprises: monitoring the load status between at least two storage nodes; and, when the load of one storage node is detected to exceed a predetermined threshold, adjusting the storage regions managed by the relevant storage nodes among the at least two storage nodes. According to embodiments of the present invention, actual data migration can be avoided when load is rebalanced, so that normal service data is not affected.

Description

Load rebalancing method and device for storage system
Technical field
The present invention relates to the technical field of data storage systems, and more particularly to a load rebalancing method and device for a storage system.
Background art
As the scale of computer applications grows, so does the demand for storage space. Accordingly, integrating the storage resources (such as storage media) of multiple devices into a single storage pool that provides storage services has become the mainstream approach. In a traditional storage system, the storage pool is usually composed of multiple distributed storage nodes connected through a TCP/IP network. Fig. 1 shows a schematic architecture diagram of a prior-art storage system. As shown in Fig. 1, in a traditional storage system each storage node S is connected to the TCP/IP network (implemented by a core switch) through an access network switch. Each storage node is an independent physical server, and each server has several storage media of its own. The storage nodes are connected through a storage network such as an IP network to form a storage pool.
On the other side of the core switch, each computing node C is likewise connected to the TCP/IP network (implemented by the core switch) through an access network switch, so as to access the entire storage pool over the TCP/IP network.
However, in a traditional storage system, dynamic balancing requires physically migrating data between storage nodes in order to achieve balance.
Further, in a traditional storage system, when a user writes data, the data may be distributed evenly across the storage nodes, in which case the load and data occupancy of the storage nodes are fairly balanced. In the following situations, however, data imbalance arises:
(1) Owing to the data allocation algorithm and the characteristics of the user data itself, the data is not evenly distributed across the storage nodes, so that some storage nodes are heavily loaded while others are lightly loaded;
(2) Capacity expansion: expansion is usually achieved by adding new nodes, and the load of a newly added storage node is 0. Part of the data on the existing storage nodes must be physically migrated to the new node in order to rebalance the load among the storage nodes.
Fig. 2 shows a schematic diagram of data migration during load rebalancing between storage nodes in a traditional storage system based on a TCP/IP network. In this example, part of the data stored on the more heavily loaded storage node S1 is migrated to the more lightly loaded storage node S2, which specifically involves migrating data between the storage media of the two storage nodes, as shown by dashed arrow 201. As can be seen, load rebalancing between storage nodes over a TCP/IP network consumes a large amount of disk read/write performance and network bandwidth, which degrades the read/write performance of normal service data.
Summary of the invention
In view of this, one object of embodiments of the present invention is to provide an efficient load rebalancing scheme for a storage system.
According to embodiments of the present invention, the storage system may include a storage network, at least two storage nodes, and at least one storage device, the at least two storage nodes and the at least one storage device each being connected to the storage network, and each of the at least one storage device including at least one storage medium, wherein all the storage media included in the storage system constitute a storage pool, the storage network is configured such that each storage node can access every storage medium without going through another storage node, the storage pool is divided into at least two storage regions in units of storage media, and each storage node is responsible for managing zero or more storage regions.
According to one aspect of the present invention, a load rebalancing method for the aforementioned storage system is provided. The method includes: monitoring the load status between the at least two storage nodes; and, when the load of one storage node is detected to exceed a predetermined threshold, adjusting the storage regions managed by the relevant storage nodes among the at least two storage nodes.
According to another aspect of the present invention, a load rebalancing device for the aforementioned storage system is provided. The device includes: a monitoring module for monitoring the load status between the at least two storage nodes; and an adjustment module for adjusting, when the load imbalance is detected to exceed a predetermined threshold, the storage regions managed by the relevant storage nodes among the at least two storage nodes.
Further, monitoring the load status between the at least two storage nodes may include monitoring one or more of the following performance parameters of the at least two storage nodes: the number of IOPS requests of a storage node; the throughput of a storage node; the CPU usage of a storage node; the memory usage of a storage node; and the storage space utilization of the storage media managed by a storage node.
Further, the predetermined threshold may be expressed as one of, or a combination of several of, the thresholds specified for the respective performance parameters.
Further, the threshold specified for each performance parameter may include: the deviation between the value of that parameter at the storage node where it is highest and its value at the storage node where it is lowest; or the deviation between the value of that parameter at the storage node where it is highest and the average value of that parameter over all storage nodes; or a specified value for that performance parameter.
In one embodiment, the predetermined threshold may be set to one or more of the following: the deviation between the number of IOPS requests of the storage node with the most IOPS and that of the storage node with the fewest IOPS is 30% of the number of IOPS requests of the storage node with the fewest IOPS; the deviation between the number of IOPS requests of the storage node with the most IOPS and the average number of IOPS requests of all storage nodes is 20% of that average; the storage space utilization of any storage medium is 0%; the storage space utilization of any storage medium is 90%; or the difference in storage space utilization between the storage medium with the highest utilization and the storage medium with the lowest utilization managed by any storage node is greater than 20%.
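The two IOPS-based example thresholds above can be illustrated with a minimal Python sketch. The `NodeLoad` record, the function names, and the decision to treat each threshold as a strict "greater than" comparison are assumptions made for illustration; the text does not prescribe a data structure or a comparison operator.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class NodeLoad:
    """Hypothetical per-node load sample; only IOPS is modelled here."""
    name: str
    iops: float


def max_min_exceeded(loads, ratio=0.30):
    """True if (highest - lowest) IOPS is more than `ratio` of the lowest IOPS."""
    values = [n.iops for n in loads]
    lowest, highest = min(values), max(values)
    return (highest - lowest) > ratio * lowest


def max_avg_exceeded(loads, ratio=0.20):
    """True if (highest - average) IOPS is more than `ratio` of the average."""
    values = [n.iops for n in loads]
    average = mean(values)
    return (max(values) - average) > ratio * average


def needs_rebalance(loads):
    """Combine the two example thresholds: either one triggers an adjustment."""
    return max_min_exceeded(loads) or max_avg_exceeded(loads)
```

With nodes at 1000, 900, and 950 IOPS neither threshold is crossed, whereas a freshly added node at 0 IOPS next to loaded nodes trips the maximum/minimum check, which matches the capacity-expansion case described in the background.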
According to embodiments of the present invention, each of the at least two storage regions is composed of at least one storage block, where a storage block is either one complete storage medium or a part of one storage medium.
In one embodiment, adjusting the storage regions may include: adjusting a configuration table of the storage regions managed by the relevant storage nodes, the at least two storage nodes determining the storage regions they manage according to the configuration table.
In one embodiment, each of the at least two storage regions is composed of at least one storage block, a storage block being one complete storage medium, and adjusting the storage regions may include: exchanging a storage medium in a first storage region of the at least two storage regions with a storage medium in a second storage region; or deleting a storage medium from the first storage region and adding the deleted storage medium to the second storage region; or evenly adding new storage media or new storage regions connected to the storage network to the at least two storage regions; or merging some of the at least two storage regions.
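Since these adjustments only change which media belong to which region, they can be sketched as edits to an in-memory configuration table. The table layout (a dict mapping a region name to a list of medium identifiers), identifiers such as `dev34/m1`, and all function names are hypothetical; interpreting "evenly" as "add each new medium to the region that currently has the fewest media" is likewise an assumption.

```python
def swap_media(table, region_a, medium_a, region_b, medium_b):
    """Exchange one medium of region_a with one medium of region_b."""
    ia = table[region_a].index(medium_a)
    ib = table[region_b].index(medium_b)
    table[region_a][ia], table[region_b][ib] = medium_b, medium_a


def move_medium(table, medium, src_region, dst_region):
    """Delete a medium from src_region and add it to dst_region."""
    table[src_region].remove(medium)
    table[dst_region].append(medium)


def add_media_evenly(table, new_media):
    """Add each new medium to whichever region currently holds the fewest."""
    for medium in new_media:
        target = min(table, key=lambda region: len(table[region]))
        table[target].append(medium)


def merge_regions(table, keep, absorb):
    """Merge region `absorb` into region `keep` and drop `absorb`."""
    table[keep].extend(table.pop(absorb))
```

None of these functions touches the data on a medium; only the table entries change, which is the point of the scheme.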
In one embodiment, adjusting the storage regions managed by the relevant storage nodes among the at least two storage nodes includes: having an administrator of the storage system manually determine how the storage regions managed by the relevant storage nodes are adjusted; or determining the adjustment by means of a configuration file; or determining the adjustment according to the load of the storage nodes. The adjustment may specify the part of the storage regions to be migrated and the target storage node to which it is to be migrated.
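For the load-based option, one possible policy is sketched below in Python: hand one region from the most heavily loaded node to the most lightly loaded node. This greedy rule, the `region_owner` and `node_load` dictionaries, and the function names are assumptions for illustration only; the text leaves the selection policy open.

```python
def plan_adjustment(region_owner, node_load):
    """Return (region, target_node) or None if no move is possible.

    region_owner: dict mapping region name -> managing storage node
    node_load:    dict mapping storage node -> load figure (e.g. IOPS)
    """
    busiest = max(node_load, key=node_load.get)
    idlest = min(node_load, key=node_load.get)
    if busiest == idlest:
        return None
    # Regions currently managed by the busiest node, in a stable order.
    candidates = sorted(r for r, node in region_owner.items() if node == busiest)
    if not candidates:
        return None
    return candidates[0], idlest


def apply_adjustment(region_owner, plan):
    """Reassign management of the region; no stored data is copied."""
    region, target = plan
    region_owner[region] = target
```

A node that manages zero regions, such as a newly added one, appears in `node_load` with a low load and is a natural target.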
Further, the storage network may include at least one storage switching device, and all of the at least two storage nodes and the at least one storage medium are connected to the storage switching device through storage channels. A storage channel may be a SAS channel or a PCI/e channel, and the storage switching device may be a SAS switch or a PCI/e switch.
Further, the storage device may be a JBOD; and/or the storage medium may be a hard disk, flash memory, SRAM, or DRAM.
Further, the interface of the storage medium may be a SAS interface, SATA interface, PCI/e interface, DIMM interface, NVMe interface, SCSI interface, or AHCI interface.
According to embodiments of the present invention, each storage node may correspond to one or more computing nodes, and each storage node and its corresponding computing nodes are located on the same server.
According to embodiments of the present invention, a storage node may be a virtual machine of the server, a container, or a module running directly on the physical operating system of the server; and/or a computing node may be a virtual machine of the server, a container, or a module running directly on the physical operating system of the server.
According to embodiments of the present invention, a storage node's management of its storage regions may mean that: each storage node can read and write only the storage regions it manages; or each storage node can write only the storage regions it manages, but can read both the storage regions it manages and those managed by other storage nodes.
According to a further aspect of the present invention, a computer-readable storage medium is provided, the computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions performing any of the aforementioned methods when run by a processor. For example, the computer-readable program code portions include: a first executable portion for monitoring the load status between the at least two storage nodes; and a second executable portion for adjusting, when the load of one storage node is detected to exceed a predetermined threshold, the storage regions managed by the relevant storage nodes among the at least two storage nodes.
Embodiments of the present invention provide a storage node load rebalancing scheme that supports migration of storage regions: by reallocating control of the storage regions among the storage nodes, the load of the storage nodes is rebalanced directly, avoiding the impact that a migration process would have on normal service data.
These and other advantages and features of the present invention will become apparent from the following detailed description taken in conjunction with the accompanying drawings, in which like elements are denoted by like numerals throughout the several drawings described below.
Brief description of the drawings
Fig. 1 shows a schematic architecture diagram of a prior-art storage system;
Fig. 2 shows a schematic diagram of the principle of load rebalancing between storage nodes in a prior-art storage system;
Fig. 3A shows a schematic architecture diagram of a specific storage system constructed according to an embodiment of the present invention;
Fig. 3B shows a schematic architecture diagram of a specific storage system constructed according to another embodiment of the present invention;
Fig. 4 shows a flowchart of a load rebalancing method for a storage system according to an embodiment of the present invention;
Fig. 5 shows a schematic diagram of the principle of implementing load rebalancing according to an embodiment of the present invention;
Fig. 6 shows a schematic diagram of the principle of implementing load rebalancing according to another embodiment of the present invention; and
Fig. 7 shows a block diagram of a load rebalancing device for a storage system according to an embodiment of the present invention.
Detailed description of the embodiments
The present disclosure is described more fully below with reference to the accompanying drawings, in which embodiments of the disclosure are shown. These embodiments may, however, be implemented in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these examples are provided so that this disclosure will be thorough and complete and will fully convey the scope of the disclosure to those skilled in the art.
Various embodiments of the present invention are described in detail below, by way of example, with reference to the accompanying drawings.
Fig. 3A shows a schematic architecture diagram of a storage system according to an embodiment of the present invention. The storage system includes a storage network; storage nodes connected to the storage network; and storage devices likewise connected to the storage network. Each storage device includes at least one storage medium. For example, a storage device commonly used by the inventors can hold 45 storage media. The storage network is configured such that each storage node can access all the storage media without going through another storage node. In Fig. 3A the storage network is illustrated as a SAS switch, but it should be understood that the storage network may also be a SAS set, or take other forms discussed below. Fig. 3A schematically shows three storage nodes, namely storage node S1, storage node S2, and storage node S3, each directly connected to the SAS switch. The storage system shown in Fig. 3A includes physical servers 31, 32, and 33, each connected to the storage devices through the storage network. Physical server 31 includes computing nodes C11 and C12 and storage node S1 co-located on it; physical server 32 includes computing nodes C21 and C22 and storage node S2 co-located on it; and physical server 33 includes computing nodes C31 and C32 and storage node S3 co-located on it. The storage system shown in Fig. 3A includes storage devices 34, 35, and 36; storage device 34 includes storage medium 1, storage medium 2, and storage medium 3 located in it, storage device 35 includes storage medium 1, storage medium 2, and storage medium 3 located in it, and storage device 36 includes storage medium 1, storage medium 2, and storage medium 3 located in it.
With the storage system provided by embodiments of the present invention, each storage node can access all the storage media without going through another storage node, so that all the storage media of the present invention are in effect shared by all the storage nodes, achieving the effect of a global storage pool. Further, the storage network is configured such that each storage node is responsible for managing only fixed storage media at any one time, and guarantees that one storage medium is never written by multiple storage nodes simultaneously, which would corrupt data; in this way each storage node can access the storage media it manages without going through another storage node, and the integrity of the data stored in the storage system is guaranteed. In addition, the constructed storage pool may be divided into at least two storage regions, each storage node being responsible for managing zero or more storage regions. Referring to Fig. 3A, different background patterns schematically illustrate the storage regions managed by the storage nodes: the storage media included in the same storage region and the storage node responsible for managing it are shown with the same background pattern. Specifically, storage node S1 is responsible for managing the first storage region, which includes storage medium 1 in storage device 34, storage medium 1 in storage device 35, and storage medium 1 in storage device 36; storage node S2 is responsible for managing the second storage region, which includes storage medium 2 in storage device 34, storage medium 2 in storage device 35, and storage medium 2 in storage device 36; and storage node S3 is responsible for managing the third storage region, which includes storage medium 3 in storage device 34, storage medium 3 in storage device 35, and storage medium 3 in storage device 36.
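The Fig. 3A assignment can be written down as a small Python mapping, which also makes the "one medium, one managing node" rule easy to check. Representing a medium as a `(device, medium)` tuple and the helper name `media_with_multiple_managers` are illustrative choices, not part of the described system.

```python
from collections import Counter

DEVICES = (34, 35, 36)

# Storage node -> storage media it manages, as (storage device, medium number).
FIG_3A_REGIONS = {
    "S1": [(dev, 1) for dev in DEVICES],  # first storage region
    "S2": [(dev, 2) for dev in DEVICES],  # second storage region
    "S3": [(dev, 3) for dev in DEVICES],  # third storage region
}


def media_with_multiple_managers(regions):
    """Return the media that appear under more than one storage node."""
    counts = Counter(m for media in regions.values() for m in media)
    return sorted(m for m, c in counts.items() if c > 1)
```

An empty result means no medium can be written by two nodes at once under this table; each region here also spans all three storage devices.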
Meanwhile it can be seen from the above description that compared with the prior art (wherein memory node is located at storage medium side, Or strictly speaking, storage medium is the built-in disk of physical machine where memory node), in the embodiment of the present invention, memory node institute Physical machine independently of storage equipment, storage equipment more as connection storage medium with storage network a channel.
Such mode so that when needing to carry out dynamic equilibrium, without by physical data in different storage mediums It is migrated, it is only necessary to by configuring the storage region (or storage medium) for balancing different memory nodes and being managed.
In an alternative embodiment of the invention, storage-node side further comprises calculate node, and calculate node and storage Node is arranged in a physical server, which connect with storage equipment by storing network.Utilize the present invention Calculate node and memory node are located to the gathering storage system of same physical machine, from overall structure constructed by embodiment For, it is possible to reduce the quantity of required physical equipment, to reduce cost.Meanwhile calculate node can also be arrived in local IP access Its storage resource desired access to.In addition, since calculate node and memory node to be aggregated on same physical server, two Data exchange can be as simple as only shared drive between person, and performance is especially excellent.
In storage system provided in an embodiment of the present invention, calculate node to the I/O data path length between storage medium It include: (1) storage medium to memory node;And (2) memory node is to the calculate node for being aggregated in same physical server (cpu bus access).And in contrast, the storage system of the prior art shown in Fig. 1, calculate node is between storage medium I/O data path length includes: (1) storage medium to memory node;(2) memory node to storage network insertion network switch; (3) network insertion network switch is stored to core network switches;(4) core network switches to calculate network insertion network switch; And (5) calculate network insertion network switch to calculate node.Obviously, the total data road of the storage system of embodiment of the present invention Diameter is only close to (1) item of heritage storage system.That is, storage system provided in an embodiment of the present invention, by I/O data road The ultimate attainment compression of electrical path length can greatly improve the I/O channel performance of storage system, and practical operational effect is very close In the channel I/O of read-write local hard drive.
In an embodiment of the present invention, a storage node may be a virtual machine of a physical server, a container, or a module running directly on the physical operating system of the server, and a computing node may likewise be a virtual machine of the same physical server, a container, or a module running directly on the physical operating system of the server. In one embodiment, each storage node may correspond to one or more computing nodes.
Specifically, one physical server may be divided into multiple virtual machines, one of which serves as the storage node while the others serve as computing nodes; alternatively, a module on the physical OS may serve as the storage node in order to achieve better performance.
In an embodiment of the present invention, the virtualization technology forming the virtual machines may be KVM, Xen, VMware, or Hyper-V virtualization technology, and the container technology forming the containers may be Docker, Rocket, Odin, Chef, LXC, Vagrant, Ansible, Zone, Jail, or Hyper-V container technology.
In an embodiment of the present invention, each storage node is responsible for managing only fixed storage media at any one time, and one storage medium is never written by multiple storage nodes simultaneously, so as to avoid data conflicts; in this way each storage node can access the storage media it manages without going through another storage node, and the integrity of the data stored in the storage system is guaranteed.
In an embodiment of the present invention, all the storage media in the system may be divided according to storage logic. Specifically, the storage pool of the whole system may be divided into a logical storage hierarchy of storage regions, storage groups, and storage blocks, where the storage block is the smallest storage unit. In an embodiment of the present invention, the storage pool may be divided into at least two storage regions.
In an embodiment of the present invention, each storage region may be divided into at least one storage group. In a preferred embodiment, each storage region is divided into at least two storage groups.
In some embodiments, storage regions and storage groups may be merged, so that one level of the storage hierarchy is omitted.
In an embodiment of the present invention, each storage region (or storage group) may be composed of at least one storage block, where a storage block may be one complete storage medium or a part of one storage medium. In order to build redundant storage within a storage region, each storage region (or storage group) may be composed of at least two storage blocks, such that when any one storage block fails, the complete stored data can be computed from the remaining storage blocks in the group. The redundant storage mode may be multiple copies, redundant array of independent disks (RAID), or erasure coding. In an embodiment of the present invention, the redundant storage mode may be established through the ZFS file system. In an embodiment of the present invention, in order to withstand hardware failures of storage devices or storage media, the multiple storage blocks included in each storage region (or storage group) are not located in the same storage medium, or even in the same storage device. In an embodiment of the present invention, no two storage blocks included in a storage region (or storage group) are located in the same storage medium or storage device. In another embodiment of the present invention, the number of storage blocks of the same storage region (or storage group) located in the same storage medium or storage device is preferably less than or equal to the redundancy of the redundant storage. For example, when the redundant storage uses RAID 5, the redundancy is 1, so at most 1 storage block of the same storage group is located in the same storage device; for RAID 6, the redundancy is 2, so at most 2 storage blocks of the same storage group are located in the same storage device.
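The placement rule in the last example (blocks of one storage group per storage device must not exceed the redundancy) can be checked with a short Python sketch. The `REDUNDANCY` table only encodes the two RAID levels named in the text, and the block representation as `(device, medium)` tuples and the function name are hypothetical.

```python
from collections import Counter

# Redundancy of the two modes named in the text.
REDUNDANCY = {"RAID5": 1, "RAID6": 2}


def placement_ok(blocks, mode):
    """Check one storage group.

    blocks: list of (storage_device, storage_medium) tuples, one per block.
    Returns True if no storage device holds more blocks of this group
    than the redundancy of the chosen mode.
    """
    per_device = Counter(device for device, _ in blocks)
    return max(per_device.values(), default=0) <= REDUNDANCY[mode]
```

If the rule holds, losing any single storage device removes no more blocks than the redundancy can reconstruct.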
In an embodiment of the present invention, each storage node can read and write only the storage regions it manages. Since read operations by multiple storage nodes on the same storage block do not conflict with one another, whereas simultaneous writes by multiple storage nodes to one storage block easily conflict, in another embodiment each storage node can write only the storage regions it manages but can read both the storage regions it manages and those managed by other storage nodes; that is, write operations are local while read operations can be global.
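The two access modes can be expressed as a small policy object. This is a minimal Python sketch; the class name `AccessPolicy`, the `global_read` flag, and the `region_owner` mapping are assumptions introduced for illustration.

```python
class AccessPolicy:
    """Decides whether a storage node may read or write a storage region."""

    def __init__(self, region_owner, global_read=False):
        # region_owner: dict mapping region name -> managing storage node
        self.region_owner = region_owner
        # False: nodes read only their own regions; True: reads are global.
        self.global_read = global_read

    def can_write(self, node, region):
        """Writes are always local: only the managing node may write."""
        return self.region_owner.get(region) == node

    def can_read(self, node, region):
        """Reads are local, or global when the second mode is enabled."""
        if region not in self.region_owner:
            return False
        return self.global_read or self.region_owner[region] == node
```

Keeping writes local in both modes is what prevents two nodes from writing the same storage block at the same time.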
In one embodiment, the storage system may further include a storage control node, connected to the storage network, for determining the storage regions managed by each storage node. In another embodiment, each storage node may include a storage allocation module for determining the storage regions managed by that storage node, which may be realized through communication and a coordination algorithm among the storage allocation modules of the storage nodes.
In one embodiment, when a storage node is detected to have failed, some or all of the other storage nodes may be configured to take over the storage regions previously managed by the failed storage node. For example, the storage regions managed by the failed storage node may be taken over by one of the other storage nodes, or may be taken over by at least two other storage nodes, each of which takes over part of the storage regions managed by the failed storage node, for example with the at least two other storage nodes taking over different storage groups in the storage region.
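The takeover by several surviving nodes can be sketched as follows in Python. Distributing the failed node's storage groups round-robin is an assumed policy, and the `group_owner` mapping and function name are hypothetical; the text only requires that each surviving node take over part of the failed node's storage.

```python
from itertools import cycle


def take_over(group_owner, failed, survivors):
    """Return a new group -> node mapping with `failed` replaced.

    group_owner: dict mapping storage group name -> managing storage node
    failed:      name of the failed storage node
    survivors:   storage nodes that take over (one or more)
    """
    if not survivors:
        raise ValueError("no surviving storage node to take over")
    targets = cycle(sorted(survivors))
    updated = dict(group_owner)  # leave the caller's table untouched
    for group in sorted(updated):
        if updated[group] == failed:
            updated[group] = next(targets)
    return updated
```

Passing a single survivor reproduces the first case in the text, where one node takes over everything the failed node managed.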
In one embodiment, the storage medium may include, but is not limited to, a hard disk, flash memory, SRAM, DRAM, NVMe, or other forms, and the access interface of the storage medium may include, but is not limited to, a SAS interface, SATA interface, PCIe interface, DIMM interface, NVMe interface, SCSI interface, or AHCI interface.
In an embodiment of the present invention, the storage network may include at least one storage switching device, and a storage node accesses a storage medium through data exchange between the included storage switching devices. Specifically, the storage nodes and the storage media are each connected to the storage switching device through storage channels.
In an embodiment of the present invention, the storage switching device may be a SAS switch or a PCIe switch; correspondingly, the storage channel may be a SAS (Serial Attached SCSI) channel or a PCIe channel.
Taking the SAS channel as an example, compared with a traditional IP-based storage scheme, a scheme based on SAS switching offers high performance, large bandwidth, and a large number of disks per device. Used together with a host bus adapter (HBA) or the SAS interface on a server motherboard, the storage provided by the SAS system can easily be accessed by multiple connected servers at the same time.
Specifically, a SAS switch is connected to a storage device by a SAS cable, and the storage device is also connected to its storage media through SAS interfaces; for example, the SAS channel is connected to each storage medium inside the storage device (a SAS switching chip may be provided inside the storage device). The bandwidth of a SAS network can reach 24 Gb or 48 Gb, which is tens of times that of Gigabit Ethernet and several times that of expensive 10 Gigabit Ethernet. At the link layer, SAS improves on an IP network by about an order of magnitude. At the transport layer, TCP's three-way handshake to open a connection and four-way handshake to close it carry high overhead, and TCP's delayed acknowledgement and slow start mechanisms can sometimes cause delays on the order of 100 milliseconds, whereas the latency of the SAS protocol is only a fraction (one over several tens) of that of TCP, so the performance gain is even larger. In short, a SAS network has a huge advantage over Ethernet-based TCP/IP in both bandwidth and latency. Those skilled in the art will appreciate that the performance of a PCIe channel can also meet the needs of the system.
In an embodiment of the present invention, the storage network may include at least two storage switching devices, and each storage node can connect to any storage device, and thus to its storage media, through any one of the storage switching devices. When any storage switching device, or a storage channel connected to a storage switching device, fails, the storage nodes read and write data on the storage devices through the other storage switching devices.
Referring to Fig. 3B, a specific storage system 30 constructed according to an embodiment of the present invention is shown. The storage devices in storage system 30 are built as multiple JBODs 307-310, each connected by SAS data cables to two SAS switches 305 and 306; these two SAS switches form the switching core of the storage network included in the storage system. The front end consists of at least two servers 301 and 302, each connected to the two SAS switches 305 and 306 through an HBA device (not shown) or a SAS interface on the motherboard. A basic network connection exists between the servers for monitoring and communication. Each server hosts a storage node that uses information obtained from the SAS links to manage some or all of the disks in all the JBODs. Specifically, the JBOD disks can be divided into different storage groups using the storage regions, storage groups, and storage blocks described above in this application. Each storage node manages one or more such storage groups. When redundant storage is used within each storage group, the metadata of the redundant storage can reside on the disks, so that the redundant storage can be identified directly from the disks by other storage nodes.
In the exemplary storage system 30 shown, a storage node may install a monitoring and management module responsible for monitoring the state of local storage and of the other servers. When an entire JBOD or a disk in a JBOD becomes abnormal, data reliability is ensured by the redundant storage. When a server fails, the management module in the storage node on another, pre-configured server identifies locally, from the data on the disks, the disks originally managed by the storage node of the failed server and takes them over. The storage services originally provided by the storage node of the failed server are continued by the storage node on the new server. A new, highly available global storage pool structure is thus realized.
As can be seen, the exemplary storage system 30 provides a storage pool that is controllable from multiple points and globally accessible. On the hardware side, multiple servers provide services externally and JBODs hold the disks. Each JBOD is connected to both SAS switches, and the two switches are in turn connected to the HBA cards of the servers, which ensures that all disks in the JBODs can be accessed by all servers. The redundant SAS links also ensure high availability of the links.
Locally on each server, redundant storage technology is used to select disks from each JBOD to form the redundant storage, so that the loss of a single JBOD does not make data unavailable. When a server fails, the module that monitors overall state schedules another server, which accesses the disks managed by the failed server's storage node through the SAS channels and quickly takes over the disks the failed node was responsible for, achieving highly available global storage.
Although Fig. 3B is illustrated with JBODs holding the disks, it should be understood that the embodiment of the present invention shown in Fig. 3B also supports storage devices other than JBODs. In addition, the above description treats one (entire) storage medium as one storage block; it applies equally to the case where part of a storage medium is used as one storage block.
Fig. 4 shows a flowchart of an access control method 40 for an exemplary storage system according to an embodiment of the present invention.
In step S401, the load state among the at least two storage nodes included in the storage system is monitored.
In step S402, when the load of a storage node is detected to exceed a predetermined threshold, the storage regions managed by the relevant storage nodes among the at least two storage nodes are adjusted. A relevant storage node may be a storage node that causes the load imbalance, and may also be determined by the adjustment strategy for the storage regions. Adjusting the storage regions may mean redistributing the storage blocks involved among the storage nodes, or adding, merging, or deleting storage regions, and so on. The configuration table of the storage regions managed by the relevant storage nodes can be adjusted, and the at least two storage nodes determine the storage regions they manage according to that configuration table. The adjustment of the configuration table can be performed by the storage control node included in the storage system or by the storage distribution module included in a storage node, as described above.
In one embodiment, the load state among the at least two storage nodes may be monitored with respect to one or more of the following performance parameters: the number of read/write operation requests per second (IOPS) of the storage node, the throughput of the storage node, the CPU usage of the storage node, the memory usage of the storage node, and the occupancy of the storage media managed by the storage node.
In one embodiment, each node may periodically monitor its own performance parameters and periodically query the data of the other nodes, then generate a globally unified rebalancing scheme, either from a predefined rebalancing scheme or dynamically by an algorithm, and finally each node executes the scheme. In another embodiment, the storage system includes a monitoring node independent of storage node S1, storage node S2, and storage node S3, or the aforementioned storage control node or storage distribution module, to monitor the performance parameters of each storage node.
In one embodiment, imbalance can be determined using predefined (configurable) thresholds; for example, the rebalancing mechanism is triggered when the deviation of the IOPS counts between nodes exceeds a certain range. For example, for IOPS, the IOPS count of the storage node with the largest IOPS count can be compared with that of the storage node with the smallest IOPS count, and when the deviation between the two is greater than 30% of the latter, adjustment of the storage regions is triggered. For example, a storage medium managed by the storage node with the largest IOPS count is exchanged with a storage medium managed by the storage node with the smallest IOPS count; for instance, the storage medium with the highest occupancy managed by the former is exchanged with the storage medium with the highest occupancy managed by the latter.
Alternatively, the IOPS count of the storage node with the largest IOPS count can be compared with the average IOPS count of all storage nodes, and when the deviation between the two is greater than 20% of the latter, adjustment of the storage regions is triggered, so that the adjusted storage region allocation does not immediately trigger rebalancing again.
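The two IOPS-based trigger conditions described above can be written out as a short sketch. The function names and the dictionary layout (node name mapped to its IOPS count) are assumptions for illustration; the 30% and 20% ratios are the exemplary values from the text.

```python
def max_min_trigger(iops, ratio=0.30):
    """Trigger when (largest - smallest) exceeds ratio * smallest."""
    hi, lo = max(iops.values()), min(iops.values())
    return (hi - lo) > ratio * lo

def max_avg_trigger(iops, ratio=0.20):
    """Trigger when (largest - average) exceeds ratio * average."""
    hi = max(iops.values())
    avg = sum(iops.values()) / len(iops)
    return (hi - avg) > ratio * avg
```

For instance, with IOPS counts of 1400, 1000, and 1000 on three nodes, both conditions hold; with 1200, 1000, and 1100, neither does.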
It should be understood that the predetermined thresholds of 20% and 30% given above for indicating load imbalance are merely exemplary; other thresholds may be defined according to the application and its requirements. Similarly, for other performance parameters, such as the throughput of a storage node, the CPU usage of a storage node, the memory usage of a storage node, and the occupancy of the storage media managed by a storage node, thresholds for triggering load rebalancing between storage nodes may also be predefined.
It should also be understood that although the predetermined threshold for determining imbalance is described above as being expressed by one of the specified thresholds of the individual performance parameters, such as the IOPS count, the predetermined threshold may also be expressed by a combination of several of those specified thresholds. For example, load rebalancing of the storage nodes is triggered only when the IOPS count of a storage node reaches its specified threshold and the throughput of the storage node also reaches its specified threshold.
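A combined threshold of this kind might be expressed as follows. This is only a sketch: the function name `combined_trigger`, the metric names, and the numeric limits in the usage note are hypothetical, and "reaches its specified threshold" is interpreted here as greater than or equal to.

```python
def combined_trigger(metrics, limits, required):
    """Return True only if every parameter named in `required`
    has reached its own specified threshold.

    metrics:  parameter name -> current value for one storage node
    limits:   parameter name -> specified threshold for that parameter
    required: names of the parameters that must all reach their limits
    """
    return all(metrics[name] >= limits[name] for name in required)
```

For example, with limits of 1000 for `iops` and 500 for `throughput`, a node at 1200 IOPS and 400 throughput does not trigger rebalancing when both parameters are required, but does when only `iops` is required.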
In one embodiment, adjusting (rebalancing) the storage regions may mean reassigning storage media managed by a heavily loaded storage node to a storage region managed by a lightly loaded storage node. This may include, for example, exchanging storage media; deleting a storage medium from a storage region managed by the heavily loaded storage node and adding it to a storage region managed by the lightly loaded storage node; adding new storage media or new storage regions that join the storage network evenly to at least two storage regions (for example, when the storage system is expanded); or merging some of the storage regions among at least two storage regions (for example, when a storage node fails). In one embodiment, the adjustment (rebalancing) of the storage regions may use a dynamic algorithm: for example, the various load data of each storage medium and each storage node are weighted to obtain a single load index, and a rebalancing scheme is then computed that moves the smallest number of disk groups so that the system no longer exceeds the predetermined threshold.
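The dynamic approach can be sketched as a weighted load index plus a simple planner. This is an illustrative greedy heuristic, not the algorithm of the patent, and it does not guarantee the minimum number of moves; the weights, metric names, and function names are all assumptions.

```python
# Hypothetical weights for combining normalized load data into one index.
WEIGHTS = {"iops": 0.5, "throughput": 0.3, "cpu": 0.2}

def load_index(metrics):
    """Weighted sum of the load data of one disk group."""
    return sum(WEIGHTS[name] * metrics[name] for name in WEIGHTS)

def plan_moves(group_load, owner, nodes, limit):
    """Greedy plan: repeatedly move one disk group from the most loaded
    node to the least loaded node until their gap is within `limit`.

    group_load: disk group -> load index
    owner:      disk group -> managing storage node
    Returns a list of (group, source node, target node) tuples.
    """
    owner = dict(owner)  # work on a copy; the caller's table is untouched
    moves = []
    while True:
        load = {n: 0.0 for n in nodes}
        for group, node in owner.items():
            load[node] += group_load[group]
        src = max(nodes, key=load.get)
        dst = min(nodes, key=load.get)
        gap = load[src] - load[dst]
        if gap <= limit:
            return moves
        # Only groups lighter than the gap shrink it; pick the one that
        # leaves the two nodes closest to each other.
        candidates = [g for g, n in owner.items()
                      if n == src and 0 < group_load[g] < gap]
        if not candidates:
            return moves
        best = min(candidates, key=lambda g: abs(gap - 2 * group_load[g]))
        owner[best] = dst
        moves.append((best, src, dst))
```

Because only ownership entries change, a plan produced this way involves no data copying; each move is a change of the managing node for one disk group.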
In one embodiment, each storage node may periodically monitor the performance parameters of the storage media it manages and periodically query the performance parameters of the storage media managed by other nodes, and thresholds indicating load imbalance are defined on the storage media performance parameters. For example, the threshold may be that the storage space utilization of any storage medium is 0% (a new disk has been added), that the storage space utilization of any storage medium is 90% (a disk is about to fill up), or that the difference between the storage media with the highest and lowest storage space utilization in the storage system is greater than 20% of the latter. It should be understood that the predetermined thresholds of 0%, 90%, and 20% given above for indicating load imbalance are merely exemplary.
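The three media-level conditions can be checked together, as in the sketch below. The function name, the reason strings, and the use of "at least 90%" for the nearly-full condition are assumptions; the 0%, 90%, and 20% values are the exemplary ones from the text.

```python
def media_triggers(utilization):
    """utilization: storage medium -> space utilization in [0.0, 1.0].
    Returns the list of imbalance conditions that currently hold.
    """
    reasons = []
    values = list(utilization.values())
    if any(u == 0.0 for u in values):
        reasons.append("new disk added")
    if any(u >= 0.90 for u in values):
        reasons.append("disk nearly full")
    hi, lo = max(values), min(values)
    if hi - lo > 0.20 * lo:
        reasons.append("utilization spread")
    return reasons
```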
Fig. 5 is a schematic diagram of load rebalancing in the storage system shown in Fig. 3A according to an embodiment of the present invention. Assume that at a certain moment the load of storage node S1 in the storage system is very high, the storage media it manages include storage medium 1 at storage device 34, storage medium 1 at storage device 35, and storage medium 1 at storage device 36 (as shown in Fig. 3A), and its total storage space will soon be used up, while the load of storage node S3 is very low and the storage media it manages have ample free space.
In a traditional storage network, each storage node can only access the storage regions directly connected to itself. During rebalancing, the data on the heavily loaded storage node therefore has to be copied to the lightly loaded node; this involves a large amount of data copying, puts additional load on the storage regions and the network, and affects I/O access to normal business data. For example, data has to be read from one or more storage media managed by storage node 1, the data read then has to be written to one or more storage media managed by storage node 3, and finally the disk space holding that data on the storage media managed by storage node 1 is released to achieve load balancing.
However, according to an embodiment of the present invention, because each of the storage nodes S1, S2, and S3 included in the storage system can access all storage regions through the storage network, storage regions can be migrated between storage nodes by transferring access rights to storage media; that is, the storage regions managed by the relevant storage nodes can be regrouped. During rebalancing, the data in each storage region no longer needs to be copied. For example, as shown in Fig. 5, storage medium 2 at storage device 34, originally managed by storage node 3, is assigned to storage node 1, while storage medium 1 at storage device 34, originally managed by storage node 1, is assigned to storage node 3, thereby balancing the remaining storage space between storage node 1 and storage node 3. In this process only the configurations of storage node 1 and storage node 3 need to be modified, which can be completed in a very short time and does not affect the read/write performance of the user's business data.
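The exchange in Fig. 5 amounts to swapping two entries in an ownership table, with no data movement. The sketch below assumes a hypothetical table keyed by (storage device, storage medium) pairs; the key layout and the function name are illustrative only.

```python
def swap_media(owner, medium_a, medium_b):
    """Exchange the managing storage nodes of two storage media.
    Only the ownership table changes; no stored data is copied.
    """
    owner[medium_a], owner[medium_b] = owner[medium_b], owner[medium_a]
    return owner

# Ownership before rebalancing, following the Fig. 5 example:
# key = (storage device, storage medium), value = managing storage node.
owner = {
    (34, 1): "S1",
    (34, 2): "S3",
    (35, 1): "S1",
}
swap_media(owner, (34, 1), (34, 2))
```

After the call, storage medium 1 at storage device 34 is managed by S3 and storage medium 2 at the same device by S1, while the entry for storage device 35 is unchanged.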
Fig. 6 is a schematic diagram of load rebalancing in the storage system shown in Fig. 3A according to another embodiment of the present invention. Unlike Fig. 5, in Fig. 6, when the load of storage node S1 is detected to be high and the load of storage node S2 is detected to be low, storage medium 2 at storage device 35, originally managed by storage node 2, can be assigned to storage node 1, while storage medium 1 at storage device 34, originally managed by storage node 1, is assigned to storage node 2, thereby balancing the remaining storage space between storage node 1 and storage node 2.
In another embodiment, when storage media expansion is detected, the newly added storage media can, for example, be distributed evenly among the storage nodes to be managed by them, for instance in the order in which they were added, thereby maintaining load balance between the storage nodes.
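One simple way to distribute newly added media evenly in order of addition is round-robin assignment, sketched below. Round-robin is one possible reading of "evenly, in the order of addition", and the function and identifiers are hypothetical.

```python
from itertools import cycle

def assign_new_media(new_media, nodes):
    """Assign newly added storage media to storage nodes in turn,
    following the order in which the media were added.
    """
    return {medium: node for medium, node in zip(new_media, cycle(nodes))}
```

For example, three new disks added in the order d1, d2, d3 to a system with nodes S1 and S2 would be assigned to S1, S2, and S1 respectively.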
It should be understood that although the above two embodiments achieve load rebalancing by scheduling storage media between different storage nodes, the same approach can be applied to scheduling storage regions between storage nodes. For example, in the case of storage expansion, when what is detected as being added is a storage region, the added storage regions can be assigned to the storage nodes in the order in which they were added.
In addition, as shown in Fig. 5 and Fig. 6, when the load of storage node S1 is detected to be very high, the configuration between compute nodes and storage nodes in the storage system can also be modified, so that one or more of the compute nodes that originally stored data through storage node S1, such as compute node C12, store data through another storage node, such as storage node S2. In this case a compute node may need to access a storage node located outside the physical server on which the compute node runs. The compute node can then be left where it is physically and access the storage regions on the remote storage node through a remote access protocol such as iSCSI (as shown in Fig. 5); alternatively, the compute node can be migrated while the storage regions managed by the relevant storage nodes are adjusted (as shown in Fig. 6), which may require first shutting down the compute node to be migrated.
It should be understood that the numbers of storage nodes, storage devices, storage media, and storage regions in the storage systems discussed above with reference to Figs. 3 to 6 are only illustrative. A storage system according to an embodiment of the present invention may include at least two storage nodes, a storage network, and at least one storage device connected to the at least two storage nodes through the storage network; each of the at least one storage device may include at least one storage medium, and the storage network may be configured so that each storage node can access all storage media without going through other storage nodes.
According to an embodiment of the present invention, each storage region is managed by one of the storage nodes. After a storage node starts, it automatically connects to the storage regions it manages and imports them; once this is complete, it can provide storage services to upper-layer compute nodes.
When a load imbalance between storage nodes is detected, it is necessary to determine, for the more heavily loaded storage node, which part of its storage regions needs to be migrated and which storage node that part should be migrated to.
The part of the storage regions to migrate can be determined in several ways. In one embodiment, an administrator decides manually which storage regions need to be migrated. In one embodiment, a configuration file is used: a migration priority is configured in advance for each storage region, and when migration is needed, the one or more storage blocks, storage groups, or storage media with the highest priority among the storage regions currently managed by the storage node are selected for migration. In one embodiment, migration is based on the load of the storage blocks, storage groups, or storage media included in a storage region; for example, each storage node can monitor the load of the storage blocks, storage groups, or storage media in the storage regions it manages, collecting information such as IOPS, throughput, and I/O latency, and combine all of this information with weights to select the part of the storage regions that needs to be migrated.
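The configuration-based and load-based selection methods might look like the following sketch. The weights, the metric names, the assumption that metrics are normalized to [0, 1], and the convention that a larger number means higher priority are all illustrative choices, not taken from the patent.

```python
# Hypothetical weights for combining normalized per-group load data.
WEIGHTS = {"iops": 0.5, "throughput": 0.3, "latency": 0.2}

def weighted_load(sample):
    """Combine IOPS, throughput, and I/O latency into one score."""
    return sum(WEIGHTS[name] * sample[name] for name in WEIGHTS)

def pick_by_load(stats):
    """stats: storage group -> normalized load data.
    Returns the group with the highest weighted load.
    """
    return max(stats, key=lambda group: weighted_load(stats[group]))

def pick_by_priority(priority, count=1):
    """priority: storage group -> preconfigured migration priority
    (larger means migrate first, by assumption).
    Returns the `count` groups with the highest priority.
    """
    return sorted(priority, key=priority.get, reverse=True)[:count]
```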
The storage node to which the storage region should be migrated can also be determined in several ways. In one embodiment, an administrator decides manually which storage node to migrate to. In one embodiment, a configuration file is used: a list of migration targets, such as a list of storage nodes in priority order, is configured in advance for each storage region, and once it is determined that the storage region (or part of it) needs to be migrated, the migration destination is selected by going through the target list in order. Note that when this approach is used, it should be ensured that the target storage node does not become overloaded after the migration. In one embodiment, the target storage node is selected according to the load of the storage nodes: the load of each storage node can be monitored, for example by collecting CPU usage, memory usage, and network bandwidth usage, and all of this information is combined with weights to select the storage node to which the storage region should be migrated. For example, each storage node can report its own load to the other storage nodes periodically or aperiodically, and when migration is needed, the storage node that needs to migrate data preferentially selects the least loaded of the other storage nodes as the target.
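Target selection can be sketched in the same spirit. Both functions below are hypothetical: the first walks a preconfigured target list and skips nodes that would be overloaded after the migration (the `cap` parameter is an assumed overload limit), and the second simply picks the least loaded of the other nodes.

```python
def choose_from_list(targets, node_load, extra_load, cap):
    """Walk the preconfigured target list in priority order and return
    the first node whose load stays within `cap` after taking on
    `extra_load`; return None if no listed node qualifies.
    """
    for node in targets:
        if node_load[node] + extra_load <= cap:
            return node
    return None

def choose_least_loaded(node_load, source):
    """Return the least loaded storage node other than `source`."""
    others = {n: load for n, load in node_load.items() if n != source}
    return min(others, key=others.get)
```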
After the storage region (or part of it) to be migrated and the target storage node to which its management right will move have been determined, the actual migration process can be confirmed and started by an administrator of the storage system, or started by a program. Note that the migration process should minimize its impact on upper-layer compute nodes. For example, migration can be performed when the application load is lowest, such as at midnight (assuming the load is lowest at that time). If it is determined that a compute node must be shut down during migration, this should be done, as far as possible, while utilization of that compute node is low. A migration strategy can be configured in advance to control the migration order and the degree of concurrency when multiple storage regions, or multiple parts of one storage region, need to be migrated. When migration of a storage region begins, the write or read operations of the relevant storage nodes on the relevant storage regions can be configured as necessary to guarantee data integrity, for example by writing all cached data to disk. After the storage region has been migrated to the target storage node, that storage node needs to perform the necessary initialization before the storage region can be accessed by upper-layer compute nodes. After the migration completes, the load should be monitored again to confirm that it is balanced.
As mentioned above, the storage system may include a storage control node, connected to the storage network, for determining the storage regions managed by each of the at least two storage nodes; alternatively, each storage node may include a storage distribution module for determining the storage regions managed by that storage node, and data can be shared between the storage distribution modules.
In one embodiment, the storage control node or storage distribution module records the list of storage regions each storage node is responsible for. After a storage node starts, it queries the storage control node or storage distribution module for the storage regions it manages and then scans these storage regions to complete initialization. When it is determined that a storage region needs to be migrated, the storage control node or storage distribution module modifies the storage region lists of the relevant storage nodes and then notifies the storage nodes to complete the actual switchover as required.
For example, suppose that in SAS storage system 30 storage region 1 needs to be migrated from storage node A to storage node B. The migration process may then include the following steps:
1) delete storage region 1 from the managed storage region list of storage node A;
2) on storage node A, force all cached data to be flushed to storage region 1;
3) on storage node A, close (or reset) via SAS commands the SAS links between storage node A and all storage media in storage region 1;
4) add storage region 1 to the managed storage region list of storage node B;
5) on storage node B, open (or reset) via SAS commands the SAS links between storage node B and all storage media in storage region 1;
6) storage node B scans all storage media in storage region 1 and completes initialization; and
7) applications access the data in storage region 1 through storage node B.
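The ordering of steps 1) to 6) can be sketched as follows. This is a minimal in-memory illustration: the managed-region table is a plain dictionary, and the cache flush, SAS link, and scan operations are recorded as placeholder step names rather than performed, since the actual commands depend on the hardware and are not specified here. All names are hypothetical.

```python
def migrate_region(region, managed, src, dst):
    """Move management of `region` from node `src` to node `dst`.

    managed: storage node -> list of storage regions it manages.
    Returns the ordered list of (node, action) steps that were taken.
    Hardware-facing actions are placeholders that are only recorded.
    """
    steps = []
    managed[src].remove(region)                  # step 1
    steps.append((src, "remove from managed list"))
    steps.append((src, "flush cached data"))     # step 2 (placeholder)
    steps.append((src, "close SAS links"))       # step 3 (placeholder)
    managed[dst].append(region)                  # step 4
    steps.append((dst, "add to managed list"))
    steps.append((dst, "open SAS links"))        # step 5 (placeholder)
    steps.append((dst, "scan and initialize"))   # step 6 (placeholder)
    return steps
```

Once the function returns, step 7) corresponds to applications reaching the region through the target node. Note that the region appears in neither managed list between steps 1) and 4), so no two nodes are ever recorded as managing it at the same time.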
It should be noted that although, for simplicity of explanation, the method of the present invention is shown and described as a series of actions, the claimed subject matter is not limited by the order in which these actions are executed: some actions may occur in a different order from that shown and described here, or concurrently with other actions, and some actions may include several sub-steps that may be interleaved in time. Moreover, not all illustrated actions may be necessary to implement the method according to the appended claims. Furthermore, the description of the foregoing steps does not exclude the method from including additional steps that may achieve additional effects. It should also be understood that method steps described in different embodiments or processes may be combined with or substituted for one another.
Fig. 7 shows a block diagram of a load rebalancing apparatus 70 for a storage system according to an embodiment of the present invention. The load rebalancing apparatus 70 may include: a monitoring module 701 for monitoring the load state among the at least two storage nodes; and an adjustment module 702 for adjusting the storage regions managed by the relevant storage nodes among the at least two storage nodes when the load imbalance is detected to exceed a predetermined threshold.
It should be understood that the modules described for apparatus 70 correspond to the steps of method 40 described with reference to Fig. 4. The operations and features described above with respect to Fig. 4 therefore apply equally to apparatus 70 and the modules it contains, and the repeated content is not described again here.
According to an embodiment of the present invention, apparatus 70 may be implemented at each storage node, or in a scheduling device for multiple storage nodes.
The teachings of the present invention may also be implemented as a computer program product on a computer-readable storage medium, including computer program code which, when executed by a processor, enables the processor to carry out the load rebalancing scheme for a storage system described in the embodiments herein in accordance with the methods of the embodiments of the present invention. The computer storage medium may be any tangible medium, such as a floppy disk, CD-ROM, DVD, hard disk drive, or even a network medium.
An embodiment of the present invention provides a storage node load rebalancing scheme that supports migration of storage media or storage regions. Rebalancing is achieved by directly redistributing control of storage media or storage regions among the storage nodes, which avoids affecting normal business data during migration and significantly improves the efficiency of storage node load rebalancing.
It should be understood that although one implementation form of the embodiments of the present invention described above may be a computer program product, the methods or apparatuses of the embodiments may be implemented in software, hardware, or a combination of software and hardware. The hardware part may be implemented with dedicated logic; the software part may be stored in memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will understand that the above methods and devices may be implemented using computer-executable instructions and/or processor control code, for example code provided on a carrier medium such as a disk, CD, or DVD-ROM, in programmable memory such as read-only memory (firmware), or on a data carrier such as an optical or electrical signal carrier. The methods and apparatuses of the present invention may be implemented by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field-programmable gate arrays and programmable logic devices; by software executed by various types of processors; or by a combination of such hardware circuits and software, such as firmware.
It should be understood that although several modules or sub-modules of the apparatus are mentioned in the detailed description above, this division is merely exemplary and not mandatory. In fact, according to exemplary embodiments of the present invention, the features and functions of two or more of the modules described above may be implemented in one module. Conversely, the features and functions of one module described above may be further divided and implemented by multiple modules.
It should also be understood that, in order not to obscure the embodiments of the present invention, the specification describes only some key, and not necessarily essential, techniques and features, and may not explain some features that those skilled in the art are able to implement.
The above are merely preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (17)

1. A load rebalancing method for a storage system, the storage system comprising a storage network, at least two storage nodes, and at least one storage device, the at least two storage nodes and the at least one storage device each being connected to the storage network, each of the at least one storage device comprising at least one storage medium, wherein all storage media included in the storage system constitute a storage pool, the storage network is configured such that each storage node can access all of the storage media without going through other storage nodes, the storage pool is divided into at least two storage regions, each storage node is responsible for managing zero or more storage regions, and each storage node can write only to the storage regions it manages;
the storage system further comprising:
a storage control node, connected to the storage network, for determining the storage regions managed by each of the at least two storage nodes; or
the storage node further comprising:
a storage allocation module, for determining the storage regions managed by that storage node;
the method comprising:
monitoring the load state among the at least two storage nodes; and
when the load of a storage node is detected to exceed a predetermined threshold, adjusting the storage regions managed by the relevant storage nodes among the at least two storage nodes.
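The monitor-then-adjust flow of claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation: the `StorageNode` class, the single normalized `load` score, and the policy of handing one region to the least-loaded other node are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    name: str
    load: float                                   # hypothetical load score in [0, 1]
    regions: list = field(default_factory=list)   # names of the regions this node manages


def rebalance(nodes, threshold):
    """Hand one region from each overloaded node to the least-loaded other node.

    Only the management assignment changes; no data is copied, because every
    node can reach every storage medium through the storage network.
    Returns the reassignments as (region, from_node, to_node) tuples.
    """
    moves = []
    for node in nodes:
        others = [n for n in nodes if n is not node]
        if node.load <= threshold or not node.regions or not others:
            continue
        target = min(others, key=lambda n: n.load)
        if target.load >= node.load:
            continue  # no lighter node available, so moving would not help
        region = node.regions.pop()
        target.regions.append(region)
        moves.append((region, node.name, target.name))
    return moves


# Hypothetical two-node system: node-a is above the 0.8 threshold.
nodes = [
    StorageNode("node-a", load=0.9, regions=["region-1", "region-2"]),
    StorageNode("node-b", load=0.2, regions=["region-3"]),
]
moves = rebalance(nodes, threshold=0.8)
```

The sketch does not re-measure load after a reassignment; a real monitor would presumably sample again before deciding on further moves.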
2. The method according to claim 1, wherein the storage control node or the storage allocation module records a storage region list of the storage regions managed by each of the at least two storage nodes, and adjusting the storage regions managed by the relevant storage nodes among the at least two storage nodes comprises:
modifying the storage region list of the relevant storage nodes.
3. The method according to claim 1, wherein monitoring the load state among the at least two storage nodes comprises monitoring one or more of the following performance parameters of the at least two storage nodes:
the number of IOPS requests of a storage node;
the throughput of a storage node;
the CPU usage of a storage node;
the memory usage of a storage node; and
the storage space utilization of the storage media managed by a storage node.
4. The method according to claim 3, wherein the predetermined threshold is expressed by one or a combination of the respective specified thresholds of the performance parameters.
5. The method according to claim 4, wherein the respective specified thresholds of the performance parameters comprise:
for each performance parameter, the deviation between the parameter value of the storage node with the highest value and the parameter value of the storage node with the lowest value; or
for each performance parameter, the deviation between the parameter value of the storage node with the highest value and the average value of that parameter across the storage nodes; or
a designated value for each performance parameter.
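The three kinds of threshold listed in claim 5 can be sketched in Python as follows. The function names and the sample IOPS figures are hypothetical, and treating a "combination" of thresholds as a logical OR is an assumption; the claims do not say how the thresholds are combined.

```python
from statistics import mean


def max_min_deviation(values):
    """Deviation between the highest and the lowest node for one parameter."""
    return max(values.values()) - min(values.values())


def max_mean_deviation(values):
    """Deviation between the highest node and the average over all nodes."""
    return max(values.values()) - mean(list(values.values()))


def exceeds(values, *, max_min=None, max_mean=None, absolute=None):
    """Return True if any configured limit is exceeded (OR is an assumption)."""
    checks = []
    if max_min is not None:
        checks.append(max_min_deviation(values) > max_min)
    if max_mean is not None:
        checks.append(max_mean_deviation(values) > max_mean)
    if absolute is not None:
        checks.append(max(values.values()) > absolute)
    return any(checks)


# Hypothetical IOPS request counts per storage node.
iops = {"node-a": 9000, "node-b": 3000, "node-c": 3000}
overloaded = exceeds(iops, max_min=5000)
```

The same functions apply unchanged to throughput, CPU usage, memory usage, or storage space utilization, as long as each parameter is given as one value per node.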
6. The method according to claim 1, wherein each of the at least two storage regions consists of at least one storage block, a storage block being one complete storage medium or a part of one storage medium.
7. The method according to claim 6, wherein the adjustment of the storage regions comprises: adjusting a configuration table of the storage regions managed by the relevant storage nodes, the at least two storage nodes determining the storage regions they manage according to the configuration table.
8. The method according to claim 1, wherein each of the at least two storage regions consists of at least one storage block, a storage block being one complete storage medium, and wherein the adjustment of the storage regions comprises:
exchanging a storage medium in a first storage region of the at least two storage regions with a storage medium in a second storage region; or
deleting a storage medium from the first storage region and adding the deleted storage medium to the second storage region; or
adding a new storage medium or a new storage region connected to the storage network evenly among the at least two storage regions; or
merging some of the at least two storage regions.
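The four adjustment operations of claim 8 can be sketched on a simple mapping from region name to the storage media it contains. The function names and the media labels are hypothetical, and "evenly" is interpreted here as always adding to the currently smallest region, which is one possible reading rather than the one the patent prescribes.

```python
def swap_media(regions, first, medium_a, second, medium_b):
    """Exchange medium_a in region `first` with medium_b in region `second`."""
    i = regions[first].index(medium_a)
    j = regions[second].index(medium_b)
    regions[first][i], regions[second][j] = medium_b, medium_a


def move_medium(regions, source, medium, target):
    """Delete a medium from one region and add it to another."""
    regions[source].remove(medium)
    regions[target].append(medium)


def add_media_evenly(regions, new_media):
    """Spread new media over the regions, smallest region first."""
    for medium in new_media:
        smallest = min(regions, key=lambda name: len(regions[name]))
        regions[smallest].append(medium)


def merge_regions(regions, keep, absorb):
    """Merge region `absorb` into region `keep` and drop `absorb`."""
    regions[keep].extend(regions.pop(absorb))


# Hypothetical pool: three regions built from whole disks d1..d4.
regions = {"r1": ["d1", "d2"], "r2": ["d3"], "r3": ["d4"]}
move_medium(regions, "r1", "d2", "r2")
```

Because a storage block here is a complete storage medium, each operation only edits the region membership; which node manages which region is tracked separately.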
9. The method according to any one of claims 1 to 8, wherein adjusting the storage regions managed by the relevant storage nodes among the at least two storage nodes comprises: an administrator of the storage system manually determining the adjustment scheme for the storage regions managed by the relevant storage nodes; or
determining the adjustment scheme for the storage regions managed by the relevant storage nodes by means of a configuration file; or
determining the adjustment scheme for the storage regions managed by the relevant storage nodes according to the load conditions of the storage nodes,
wherein the adjustment scheme comprises the portion of a storage region to be migrated and the target storage node to which it is to be migrated.
10. The method according to any one of claims 1 to 8, wherein the storage network comprises at least one storage switching device, and all of the at least two storage nodes and the at least one storage medium are connected to the storage switching device through storage channels.
11. The method according to claim 10, wherein the storage channel is a SAS channel or a PCI/e channel, and the storage switching device is a SAS switch or a PCI/e switch.
12. The method according to any one of claims 1 to 8, wherein the storage device is a JBOD; and/or
the storage medium is a hard disk, flash memory, SRAM, or DRAM; and/or the interface of the storage medium is a SAS interface, SATA interface, PCI/e interface, DIMM interface, NVMe interface, SCSI interface, or AHCI interface.
13. The method according to any one of claims 1 to 8, wherein each storage node corresponds to one or more compute nodes, and each storage node and its corresponding compute nodes are located on the same server.
14. The method according to claim 13, wherein the storage node is a virtual machine of the server, a container, or a module running directly on the physical operating system of the server; and/or
the compute node is a virtual machine of the server, a container, or a module running directly on the physical operating system of the server.
15. The method according to any one of claims 1 to 8, wherein the management by a storage node of the storage regions it manages comprises:
each storage node can read and write only the storage regions it manages; or
each storage node can write only the storage regions it manages, but can read both the storage regions it manages and the storage regions managed by other storage nodes.
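The two access modes of claim 15 can be expressed as a small permission check. In this sketch the `ReadPolicy` enum, the `owner_of` mapping from region to managing node, and the function names are hypothetical names introduced for illustration.

```python
from enum import Enum


class ReadPolicy(Enum):
    OWN_ONLY = "own_only"        # a node reads only the regions it manages
    ALL_REGIONS = "all_regions"  # a node may also read regions managed by others


def may_write(node, region, owner_of):
    """Writes are allowed only to regions the node itself manages."""
    return owner_of.get(region) == node


def may_read(node, region, owner_of, policy):
    """Reads follow the configured policy; unknown regions are rejected."""
    if region not in owner_of:
        return False
    return policy is ReadPolicy.ALL_REGIONS or owner_of[region] == node


# Hypothetical assignment of regions to managing nodes.
owner_of = {"region-1": "node-a", "region-2": "node-b"}
can_cross_read = may_read("node-a", "region-2", owner_of, ReadPolicy.ALL_REGIONS)
```

Under either policy a region has exactly one writer at a time, so reassigning a region during rebalancing amounts to updating `owner_of`.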
16. A load rebalancing apparatus for a storage system, the storage system comprising a storage network, at least two storage nodes, and at least one storage device, the at least two storage nodes and the at least one storage device each being connected to the storage network,
each of the at least one storage device comprising at least one storage medium, wherein all storage media included in the storage system constitute a storage pool, the storage network is configured such that each storage node can access all of the storage media without going through other storage nodes, the storage pool is divided into at least two storage regions, and each storage node is responsible for managing zero or more storage regions,
the load rebalancing apparatus comprising:
a monitoring module, for monitoring the load state among the at least two storage nodes; and
an adjusting module, for adjusting the storage regions managed by the relevant storage nodes among the at least two storage nodes when the load of a storage node is detected to exceed a predetermined threshold.
17. A computer-readable storage medium having computer-readable program code stored therein, wherein the computer-readable program code, when run by a processor, performs the method according to any one of claims 1 to 15.
CN201610173784.7A 2011-10-11 2016-03-23 Load rebalancing method and device for storage systems Active CN105657066B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201610173784.7A CN105657066B (en) 2016-03-23 2016-03-23 Load rebalancing method and device for storage systems
PCT/CN2017/077758 WO2017162179A1 (en) 2016-03-23 2017-03-22 Load rebalancing method and apparatus for use in storage system
US16/139,712 US10782898B2 (en) 2016-02-03 2018-09-24 Data storage system, load rebalancing method thereof and access control method thereof
US16/378,076 US20190235777A1 (en) 2011-10-11 2019-04-08 Redundant storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610173784.7A CN105657066B (en) 2016-03-23 2016-03-23 Load rebalancing method and device for storage systems

Publications (2)

Publication Number Publication Date
CN105657066A CN105657066A (en) 2016-06-08
CN105657066B true CN105657066B (en) 2019-06-14

Family

ID=56495388

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610173784.7A Active CN105657066B (en) 2011-10-11 2016-03-23 Load rebalancing method and device for storage systems

Country Status (2)

Country Link
CN (1) CN105657066B (en)
WO (1) WO2017162179A1 (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105657066B (en) * 2016-03-23 2019-06-14 天津书生云科技有限公司 Load rebalancing method and device for storage systems
CN107423301B (en) * 2016-05-24 2021-02-23 华为技术有限公司 Data processing method, related equipment and storage system
CN106375427A (en) * 2016-08-31 2017-02-01 浪潮(北京)电子信息产业有限公司 Link redundancy optimization method for distributed SAN (Storage Area Network) storage system
CN108111566B (en) * 2016-11-25 2020-11-06 杭州海康威视数字技术股份有限公司 Cloud storage system capacity expansion method and device and cloud storage system
CN106990919A (en) * 2017-03-04 2017-07-28 郑州云海信息技术有限公司 The memory management method and device of automatic separating fault disk
CN107193502B (en) * 2017-05-27 2021-04-06 郑州云海信息技术有限公司 Storage service quality guarantee method and device
CN109788006B (en) * 2017-11-10 2021-08-24 阿里巴巴集团控股有限公司 Data equalization method and device and computer equipment
US11432194B2 (en) * 2018-10-22 2022-08-30 Commscope Technologies Llc Load measurement and load balancing for packet processing in a long term evolution evolved node B
CN111290699B (en) * 2018-12-07 2023-03-14 杭州海康威视系统技术有限公司 Data migration method, device and system
CN111381766B (en) * 2018-12-28 2022-08-02 杭州海康威视系统技术有限公司 Method for dynamically loading disk and cloud storage system
CN111078153B (en) * 2019-12-20 2023-08-01 同方知网数字出版技术股份有限公司 Distributed storage method based on file
CN113190167A (en) * 2020-01-14 2021-07-30 伊姆西Ip控股有限责任公司 Method for managing computing device, electronic device and computer storage medium
US11061571B1 (en) * 2020-03-19 2021-07-13 Nvidia Corporation Techniques for efficiently organizing and accessing compressible data
CN111464602B (en) * 2020-03-24 2023-04-18 平安银行股份有限公司 Flow processing method and device, computer equipment and storage medium
CN111552441B (en) * 2020-04-29 2023-02-28 重庆紫光华山智安科技有限公司 Data storage method and device, main node and distributed system
CN112747688A (en) * 2020-12-24 2021-05-04 山东大学 Discrete manufacturing external quality information collection device based on ultrasonic detection positioning and application thereof
CN113986522A (en) * 2021-08-29 2022-01-28 中盾创新数字科技(北京)有限公司 Load balancing-based distributed storage server capacity expansion system
CN113741819A (en) * 2021-09-15 2021-12-03 第四范式(北京)技术有限公司 Method and device for hierarchical storage of data
CN117041256B (en) * 2023-10-08 2024-02-02 深圳市连用科技有限公司 Network data transmission and storage method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101582013A (en) * 2009-06-10 2009-11-18 成都市华为赛门铁克科技有限公司 Method, device and system for processing storage hotspots in distributed storage
CN104657316A (en) * 2015-03-06 2015-05-27 北京百度网讯科技有限公司 Server
CN104850634A (en) * 2015-05-22 2015-08-19 中国联合网络通信集团有限公司 Data storage node adjustment method and system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4681374B2 (en) * 2005-07-07 2011-05-11 株式会社日立製作所 Storage management system
CN101827120A (en) * 2010-02-25 2010-09-08 浪潮(北京)电子信息产业有限公司 Cluster storage method and system
CN105657066B (en) * 2016-03-23 2019-06-14 天津书生云科技有限公司 Load rebalancing method and device for storage systems
US9374314B2 (en) * 2012-02-26 2016-06-21 Palo Alto Research Center Incorporated QoS aware balancing in data centers
CN103503414B (en) * 2012-12-31 2016-03-09 华为技术有限公司 A kind of group system calculating storage and merge
CN104238955B (en) * 2013-06-20 2018-12-25 杭州迪普科技股份有限公司 A kind of device and method of storage resource virtualization distribution according to need

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on dynamic load balancing algorithms in cloud storage systems; Tian Langjun et al.; Computer Engineering; 2013-10-31; Vol. 39, No. 10; pp. 19-23

Also Published As

Publication number Publication date
WO2017162179A1 (en) 2017-09-28
CN105657066A (en) 2016-06-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PD01 Discharge of preservation of patent

Date of cancellation: 20210523

Granted publication date: 20190614

CP01 Change in the name or title of a patent holder

Address after: Room 645dd18, aviation industry support center No.1, Baohang Road, Tianjin Binhai New Area Airport Economic Zone, 300308

Patentee after: Tianjin Zhongcheng Star Technology Co.,Ltd.

Address before: Room 645dd18, aviation industry support center No.1, Baohang Road, Tianjin Binhai New Area Airport Economic Zone, 300308

Patentee before: TIANJIN SURDOC Corp.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20210714

Address after: 100089 No. 4060, podium, 4th floor, 69 Zizhuyuan Road, Haidian District, Beijing

Patentee after: Beijing Shusheng cloud Technology Co.,Ltd.

Address before: Room 645dd18, aviation industry support center No.1, Baohang Road, Tianjin Binhai New Area Airport Economic Zone, 300308

Patentee before: Tianjin Zhongcheng Star Technology Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20220425

Address after: 1101-13, 11th floor, building 1, courtyard 1, Shangdi 10th Street, Haidian District, Beijing 100085

Patentee after: Beijing Shusheng Information Technology Co.,Ltd.

Address before: 100089 No. 4060, podium, 4th floor, 69 Zizhuyuan Road, Haidian District, Beijing

Patentee before: Beijing Shusheng cloud Technology Co.,Ltd.

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Load rebalancing method and device for storage systems

Effective date of registration: 20230317

Granted publication date: 20190614

Pledgee: Bank of Hangzhou Limited by Share Ltd. Beijing branch

Pledgor: Beijing Shusheng Information Technology Co.,Ltd.

Registration number: Y2023110000102

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20230425

Granted publication date: 20190614

Pledgee: Bank of Hangzhou Limited by Share Ltd. Beijing branch

Pledgor: Beijing Shusheng Information Technology Co.,Ltd.

Registration number: Y2023110000102

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Load rebalancing method and device for storage systems

Effective date of registration: 20230428

Granted publication date: 20190614

Pledgee: Bank of Hangzhou Co.,Ltd. Beijing Chaoyang Wenchuang Sub branch

Pledgor: Beijing Shusheng Information Technology Co.,Ltd.

Registration number: Y2023110000179