CN110633053B - Storage capacity balancing method, object storage method and device - Google Patents

Storage capacity balancing method, object storage method and device

Info

Publication number
CN110633053B
Authority
CN
China
Prior art keywords
storage
nodes
node
hash
capacity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910867983.1A
Other languages
Chinese (zh)
Other versions
CN110633053A (en
Inventor
刘萌
陈志德
黎莉
刘廷永
谢文辉
杨程
朱志强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mach Valley Technology Co ltd
Original Assignee
Beijing Mach Valley Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mach Valley Technology Co ltd filed Critical Beijing Mach Valley Technology Co ltd
Priority to CN201910867983.1A priority Critical patent/CN110633053B/en
Publication of CN110633053A publication Critical patent/CN110633053A/en
Application granted granted Critical
Publication of CN110633053B publication Critical patent/CN110633053B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0613 Improving I/O performance in relation to throughput
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0662 Virtualisation aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a storage capacity balancing method, an object storage method and an object storage device. The storage capacity balancing method comprises the following steps: forming a hash ring, wherein the hash ring is composed of N hash values; dividing the hash ring into M equal parts, wherein each equal part of the hash ring forms a virtual node; allocating X virtual nodes to a to-be-allocated storage node according to real-time capacity information of the storage nodes; hashing the X virtual nodes onto the hash ring; forming a hash index record, wherein the hash index record comprises the VNode_ID of the virtual node, the corresponding hash value and the OSD_ID of the to-be-allocated storage node; and repeating the allocation, hashing and record-forming steps for all storage nodes in the storage system. Because the method refers to the real-time capacity information of both the to-be-allocated storage node and all storage nodes in the storage system when allocating virtual nodes, the write volume of each storage node is balanced automatically and the capacity balance of the storage nodes is ensured to the greatest extent.

Description

Storage capacity balancing method, object storage method and device
Technical Field
The invention relates to the technical field of data storage, in particular to a storage capacity balancing method, an object storage method and an object storage device.
Background
To meet the demands of mass object storage, a distributed storage system is generally composed of a plurality of storage nodes (OSDs) that together provide high-performance write and query capabilities to the outside.
When a large number of small files must be stored, for example pictures captured by video capture devices, the industry generally adopts hash writing: a hash value is calculated from the object ID, and different hash values correspond to different storage nodes, so that objects are written to the storage nodes evenly in number.
However, because the objects are numerous and their sizes range from tens of KB (1 KB = 2^10 bytes) to several MB (1 MB = 2^20 bytes), the same number of objects can occupy very different amounts of storage on different storage nodes. This easily leads to capacity imbalance among the storage nodes, so that some storage nodes go out of service because they are full and the throughput of the entire system drops.
Disclosure of Invention
The invention provides a storage capacity balancing method, an object storage method and a device, which can prevent the capacity of each storage node in a storage system from being unbalanced.
The invention provides a storage capacity balancing method, which comprises the following steps: (1) forming a hash ring, wherein the hash ring is composed of N hash values; (2) dividing the hash ring into M equal parts, wherein each equal part of the hash ring forms a virtual node; (3) dynamically allocating, in real time, X virtual nodes to a to-be-allocated storage node according to the current storage capacity and the remaining storage capacity of the to-be-allocated storage node and the current total storage capacity and the total remaining storage capacity of all storage nodes in the storage system where the to-be-allocated storage node is located; (4) hashing the X virtual nodes onto the hash ring; (5) forming a hash index record, wherein the hash index record comprises the virtual node number VNode_ID of the virtual node, the corresponding hash value and the storage node number OSD_ID of the to-be-allocated storage node; (6) taking a storage node in the storage system that has not been allocated virtual nodes as the to-be-allocated storage node and repeating steps (3) to (5);
when the ClusterMap is refreshed, repeating steps (3) to (6), wherein the ClusterMap is used for storing the storage node number OSD_ID of each storage node in the storage system together with its current storage capacity and remaining storage capacity;
step (3) comprises: allocating X virtual nodes to the to-be-allocated storage node according to the formula X = M*(K*OSDCapacity/TotalCapacity + (1-K)*OSDFreeCapacity/TotalFreeCapacity), wherein K is a balance factor, K is greater than 0 and less than 1, OSDCapacity is the storage capacity of the to-be-allocated storage node, OSDFreeCapacity is the remaining storage capacity of the to-be-allocated storage node, TotalCapacity is the total storage capacity of all storage nodes in the storage system where the to-be-allocated storage node is located, and TotalFreeCapacity is the total remaining storage capacity of all storage nodes in the storage system.
Optionally, refreshing the ClusterMap includes: each storage node in the storage system periodically sends its storage node number OSD_ID, its current storage capacity and its remaining storage capacity to a storage gateway; and the storage gateway refreshes the storage node number OSD_ID, the current storage capacity and the remaining storage capacity of each storage node in the ClusterMap.
Further, step (4) includes: calculating a first hash value from the virtual node number VNode_ID of the virtual node and the storage node number OSD_ID of the to-be-allocated storage node; and taking the first hash value as the mapping start position of the virtual node.
Optionally, a ratio of M to the number of storage nodes in the storage system is greater than or equal to 100.
The invention provides an object storage method, which comprises the following steps: receiving a storage object, wherein the storage object comprises time information, space information and content information; calculating a second hash value from the time information and the space information, wherein the second hash value is equal to one hash value in the hash ring of the storage capacity balancing method; looking up, in the hash index record of the storage capacity balancing method, the storage node number OSD_ID of the to-be-allocated storage node corresponding to the second hash value; and storing the content information in the target storage node pointed to by that storage node number OSD_ID.
Further, the object storage method further comprises: encoding the storage location of the content information in the target storage node together with the storage node number OSD_ID of the target storage node to form the object ID of the storage object.
The invention provides an object storage device, comprising: a receiving unit for receiving a storage object including time information, space information and content information; a calculating unit for calculating a second hash value from the time information and the space information, where the second hash value is equal to one hash value in the hash ring of the storage capacity balancing method; a record query unit for looking up, in the hash index record of the storage capacity balancing method, the storage node number OSD_ID of the to-be-allocated storage node corresponding to the second hash value; and a storage unit for storing the content information in the target storage node pointed to by that storage node number OSD_ID.
Further, the object storage device further includes: a coding unit for encoding the storage location of the content information in the target storage node together with the storage node number OSD_ID of the target storage node to form the object ID of the storage object.
In the storage capacity balancing method provided by the invention, a hash ring composed of N hash values is first formed and divided equally into M parts, that is, M virtual nodes. The virtual nodes are then allocated to the storage nodes in the storage system: the number of virtual nodes allocated to a storage node is obtained from the storage capacity and the remaining capacity of that storage node and from the total storage capacity and the total remaining capacity of all storage nodes in the storage system. The allocated virtual nodes are hashed onto the hash ring, that is, scattered over the hash ring, and the virtual node number VNode_ID of each virtual node, the corresponding hash value and the storage node number OSD_ID of the storage node to which the virtual node is allocated are saved as a hash index record, so that the corresponding storage node can be looked up from a hash value. Proceeding in this way yields hash index records for all hash values. Because the calculation of the number of virtual nodes allocated to a storage node refers to the real-time capacity information of that storage node and of all storage nodes in the storage system, the write volume of each storage node is balanced automatically and the capacity balance of the storage nodes is ensured to the greatest extent.
In addition, the object storage method and the object storage device calculate a hash value from the time information and the space information carried by the received storage object, and use that hash value to search the hash index record formed by the storage capacity balancing method, thereby obtaining the storage node number OSD_ID of the corresponding storage node and storing the content information of the storage object in the designated location. Because the capacity balancing method automatically balances the write volume of each storage node according to its real-time capacity, storage objects are written to suitable locations and the capacity of the storage nodes stays balanced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of a storage capacity balancing method according to an embodiment of the present invention;
Fig. 2 is a detailed flowchart of hashing a virtual node onto a hash ring in accordance with the present invention;
Fig. 3 is a flowchart of an object storage method according to an embodiment of the present invention;
Fig. 4 is a flowchart illustrating the object storage method shown in Fig. 3;
Fig. 5 is a block diagram of an object storage device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the technical solution of the present invention clearer, embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a storage capacity balancing method according to an embodiment of the present invention, and as shown in fig. 1, the storage capacity balancing method includes the following steps:
Step 101, forming a hash ring, wherein the hash ring is composed of N hash values.
Specifically, the method may be executed by a storage gateway, which is responsible for storage control in the storage system. The value of N may be selected according to the total storage capacity of the storage system. In practice, the hash value obtained for each storage object through the specified hash calculation has a corresponding value in the hash ring; that is, each storage object can be mapped to one hash value of the hash ring.
Step 102, dividing the hash ring into M equal parts, wherein each equal part of the hash ring forms a virtual node.
Specifically, each virtual node (VNode) contains a plurality of hash values (N divided by M), and the size of a virtual node, i.e., how many hash values it contains, is determined by the choice of M.
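As a minimal sketch of steps 101 and 102, the ring can be modelled as the integers 0 to N-1 and cut into M slices of N/M consecutive hash values. The function names, the integer model of the ring and the requirement that M divides N exactly are assumptions made for this illustration; the description does not fix them.

```python
def vnode_size(n: int, m: int) -> int:
    """Number of hash values in each virtual node (N divided by M)."""
    if m <= 0 or n % m != 0:
        raise ValueError("this sketch assumes M > 0 and that M divides N exactly")
    return n // m


def equal_parts(n: int, m: int) -> list[range]:
    """Cut the ring 0..N-1 into M consecutive slices of equal size."""
    size = vnode_size(n, m)
    return [range(i * size, (i + 1) * size) for i in range(m)]


parts = equal_parts(n=20, m=4)
print(len(parts), list(parts[1]))  # 4 slices; the second holds 5, 6, 7, 8, 9
```

Choosing a larger M for the same N gives more, smaller virtual nodes.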
Step 103, allocating X virtual nodes to the to-be-allocated storage node according to the storage capacity and the remaining storage capacity of the to-be-allocated storage node, and the total storage capacity and the total remaining storage capacity of all storage nodes in the storage system where the to-be-allocated storage node is located.
Specifically, the to-be-allocated storage node is a storage node in the storage system that is waiting for virtual node allocation. X of the M virtual nodes are allocated to it; that is, all storage objects that map to hash values contained in those X virtual nodes are ultimately stored in the to-be-allocated storage node.
For convenience of description, the four quantities above are denoted by names. OSDCapacity is the storage capacity of the to-be-allocated storage node, OSDFreeCapacity is the remaining storage capacity of the to-be-allocated storage node, TotalCapacity is the total storage capacity of all storage nodes in the storage system where the to-be-allocated storage node is located, and TotalFreeCapacity is the total remaining storage capacity of all storage nodes in the storage system.
Thus, the allocation rule depends on four parameters: OSDCapacity, OSDFreeCapacity, TotalCapacity and TotalFreeCapacity. TotalCapacity is obtained by summing the OSDCapacity of all storage nodes in the storage system, and similarly TotalFreeCapacity is obtained by summing the OSDFreeCapacity of all storage nodes in the storage system.
During operation of the storage system, the storage capacity and the remaining capacity of each storage node change. Taking the current storage capacity and remaining capacity of the storage nodes into account when allocating virtual nodes, under a suitable rule, allows the capacity of the storage nodes to be used more reasonably and prevents too many virtual nodes from being allocated to storage nodes that are nearly full.
Step 104, hashing the X virtual nodes onto the hash ring.
Specifically, in order to make the X virtual nodes as scattered as possible, they need to be hashed onto the hash ring.
Step 105, forming a hash index record, wherein the hash index record comprises the VNode_ID of the virtual node, the corresponding hash value and the OSD_ID of the to-be-allocated storage node.
Specifically, the hash index record exists so that, once a storage object has been mapped to a hash value on the hash ring, the corresponding VNode_ID (virtual node number) and OSD_ID (storage node number) can be found from that hash value. Each hash value therefore corresponds to one hash index record.
Step 106, judging whether any storage node in the storage system has not yet been allocated virtual nodes; if so, executing step 107; if not, ending.
Step 107, taking a storage node that has not been allocated virtual nodes as the to-be-allocated storage node, and executing step 103.
Specifically, a storage system includes a plurality of storage nodes. For each storage node, the number of virtual nodes to allocate is calculated, the allocated virtual nodes are hashed onto the hash ring, and the correspondence between each hash value and the VNode_ID and OSD_ID is recorded, until all storage nodes have completed the calculation, hashing and recording.
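The loop over steps 104 to 107 can be sketched as follows, taking the per-node counts X as already computed. This is an illustration under stated assumptions, not the patented implementation: the `HashIndexRecord` type, the SHA-256 based `first_hash`, the wrap-around at the end of the ring and the rule that the first virtual node to reach a hash value keeps it are all hypothetical choices that the description leaves open.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class HashIndexRecord:
    vnode_id: int    # VNode_ID of the virtual node
    hash_value: int  # hash value on the ring covered by that virtual node
    osd_id: int      # OSD_ID of the storage node that owns the virtual node


def first_hash(vnode_id: int, osd_id: int, n: int) -> int:
    """Assumed hash: SHA-256 of "OSD_ID:VNode_ID", reduced modulo N."""
    digest = hashlib.sha256(f"{osd_id}:{vnode_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n


def build_hash_index(vnode_counts: dict[int, int], n: int, m: int) -> dict[int, HashIndexRecord]:
    """Run steps 104 and 105 for each storage node in turn (steps 106 and 107)."""
    size = n // m  # hash values per virtual node
    index: dict[int, HashIndexRecord] = {}
    for osd_id, x in vnode_counts.items():   # next node without virtual nodes
        for vnode_id in range(1, x + 1):     # VNode_IDs 1..X for this node
            start = first_hash(vnode_id, osd_id, n)
            for offset in range(size):
                h = (start + offset) % n     # wrap around the ring
                # Assumption: the first virtual node to reach a hash value keeps it.
                index.setdefault(h, HashIndexRecord(vnode_id, h, osd_id))
    return index


index = build_hash_index({1: 60, 2: 40}, n=1000, m=100)
print(len(index), "of 1000 hash values have a record")
```

Because the start positions in this sketch are pseudo-random, virtual nodes can overlap and some hash values may be left without a record; a real system would need a rule for both cases, which the text does not spell out.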
In the storage capacity balancing method provided in the embodiment of the present invention, a hash ring composed of N hash values is first formed and divided equally into M parts, that is, M virtual nodes. The virtual nodes are then allocated to the storage nodes in the storage system: the number of virtual nodes allocated to a storage node is obtained from the storage capacity and the remaining capacity of that storage node and from the total storage capacity and the total remaining capacity of all storage nodes in the storage system. The allocated virtual nodes are hashed onto the hash ring, that is, scattered over the hash ring, and the VNode_ID of each virtual node, the corresponding hash value and the OSD_ID of the storage node to which the virtual node is allocated are saved as a hash index record, so that the corresponding storage node can be looked up from a hash value. Proceeding in this way yields hash index records for all hash values. Because the calculation of the number of virtual nodes allocated to a storage node refers to the real-time capacity information of that storage node and of all storage nodes in the storage system, the write volume of each storage node is balanced automatically and the capacity balance of the storage nodes is ensured to the greatest extent.
In addition, capacity imbalance easily causes some storage nodes to fill up. Data then has to be migrated off the nearly full storage nodes through manual intervention, and the migration occupies a large amount of system resources and can noticeably reduce the service capacity of the system. By balancing the capacity of the storage nodes, the method avoids this unnecessary manual maintenance cost.
In the method for balancing storage capacity provided in the foregoing embodiment, step 103 may specifically include: distributing X virtual nodes for the storage nodes to be distributed according to a formula, wherein the formula is as follows:
X=M*(K*OSDCapacity/TotalCapacity+(1-K)*OSDFreeCapacity/TotalFreeCapacity)
K is a balance factor, K is more than 0 and less than or equal to 1, OSDCapacity is the storage capacity of the to-be-allocated storage node, OSDFreeCapacity is the remaining storage capacity of the to-be-allocated storage node, TotalCapacity is the total storage capacity of all storage nodes in the storage system where the to-be-allocated storage node is located, and TotalFreeCapacity is the total remaining storage capacity of all storage nodes in the storage system.
During operation of the storage system, the ratio of OSDCapacity to TotalCapacity represents the share of the to-be-allocated storage node's capacity in the total capacity of all storage nodes, and the ratio of OSDFreeCapacity to TotalFreeCapacity represents the share of its remaining capacity in the total remaining capacity of all storage nodes. K is a balance factor; it is a constant and usually takes the value 1/2. When the calculation of X should give priority to remaining capacity, K is set below 1/2; the smaller K is, the more virtual nodes are allocated to storage nodes with larger remaining capacity. When the calculation of X should give priority to total capacity, K is set above 1/2; the larger K is, the more virtual nodes are allocated to storage nodes with larger capacity.
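The formula can be tried on a small, made-up cluster. In the sketch below the function name, the rounding of X to the nearest integer and the three example nodes are assumptions for illustration only; the description gives the formula but not how a fractional X is handled.

```python
def vnode_count(m: int, k: float, osd_capacity: float, osd_free: float,
                total_capacity: float, total_free: float) -> int:
    """X = M*(K*OSDCapacity/TotalCapacity + (1-K)*OSDFreeCapacity/TotalFreeCapacity)."""
    if not 0 < k <= 1:
        raise ValueError("K must satisfy 0 < K <= 1")
    share = k * osd_capacity / total_capacity + (1 - k) * osd_free / total_free
    return round(m * share)  # rounding is an assumption of this sketch


# Three hypothetical nodes of equal size (100 units) with 80, 50 and 20 units free.
nodes = {1: (100, 80), 2: (100, 50), 3: (100, 20)}
total_capacity = sum(cap for cap, _ in nodes.values())
total_free = sum(free for _, free in nodes.values())
counts = {osd: vnode_count(300, 0.5, cap, free, total_capacity, total_free)
          for osd, (cap, free) in nodes.items()}
print(counts)  # {1: 130, 2: 100, 3: 70}
```

With M = 300 and K = 1/2, the emptiest node receives 130 virtual nodes and the fullest 70. Lowering K to 0.1 raises the first node's count to 154, while raising K to 0.9 lowers it to 106, which matches the behaviour described for K below and above 1/2.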
Of course, the formula for calculating X is not limited thereto, and any formula known to those skilled in the art that can calculate X from OSDCapacity, OSDFreeCapacity, TotalCapacity and TotalFreeCapacity can be used in the present invention.
The storage capacity balancing method provided in the foregoing embodiment may further include: when the ClusterMap is refreshed, repeating steps 103 to 107, wherein the ClusterMap is used for storing the OSD_ID, the corresponding storage capacity and the remaining storage capacity of each storage node in the storage system.
Specifically, as objects are continually stored and deleted, the remaining capacity of each storage node in the storage system changes dynamically. In addition, storage nodes are added to the storage system or taken offline from time to time. This information is recorded in the ClusterMap table, which is refreshed periodically so that it holds the latest storage capacity and remaining storage capacity for each OSD_ID (storage node number). The refresh interval can be set according to actual needs.
When the ClusterMap is refreshed, steps 103 to 107 are repeated; that is, according to the latest storage capacity and remaining capacity, virtual nodes are re-allocated to each storage node, the virtual nodes are re-hashed, and the hash index records are re-formed, so that the allocation of virtual nodes follows changes in the storage capacity of the storage nodes.
In the above method, the step of refreshing the ClusterMap may specifically include: each storage node in the storage system periodically sends its OSD_ID, its storage capacity and its remaining storage capacity to the storage gateway; and the storage gateway refreshes the OSD_ID, the corresponding storage capacity and the remaining storage capacity of each storage node in the ClusterMap.
Specifically, the storage gateway is the device in charge of storage management in the storage system, and it is the storage gateway that refreshes the ClusterMap. Each storage node periodically sends its OSD_ID (storage node number), storage capacity and remaining capacity to the storage gateway, and the storage gateway replaces the corresponding information in the ClusterMap with the latest information received.
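A minimal in-memory sketch of the table kept by the storage gateway might look like this. The `OsdReport` and `ClusterMap` classes and their method names are hypothetical; timers, networking and the removal of offline nodes are left out.

```python
from dataclasses import dataclass


@dataclass
class OsdReport:
    osd_id: int    # OSD_ID (storage node number)
    capacity: int  # storage capacity
    free: int      # remaining storage capacity


class ClusterMap:
    """Holds the latest report per storage node, as kept by the storage gateway."""

    def __init__(self) -> None:
        self._reports: dict[int, OsdReport] = {}

    def refresh(self, report: OsdReport) -> None:
        # Replace whatever was stored for this OSD_ID with the latest information.
        self._reports[report.osd_id] = report

    def totals(self) -> tuple[int, int]:
        """Return (TotalCapacity, TotalFreeCapacity) over all known nodes."""
        reports = self._reports.values()
        return sum(r.capacity for r in reports), sum(r.free for r in reports)


cluster_map = ClusterMap()
cluster_map.refresh(OsdReport(1, 100, 80))
cluster_map.refresh(OsdReport(2, 100, 50))
cluster_map.refresh(OsdReport(1, 100, 60))  # a later report from node 1
print(cluster_map.totals())  # (200, 110)
```

After each refresh, the totals feed the allocation formula again, which is what lets the number of virtual nodes per storage node follow capacity changes.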
In the storage capacity balancing method provided in the foregoing embodiment, step 104 hashes the X virtual nodes onto the hash ring. Fig. 2 is a detailed flowchart of hashing a virtual node onto the hash ring in an embodiment of the present invention. As shown in Fig. 2, step 104 may specifically include the following steps:
Step 1041, calculating a first hash value from the VNode_ID of the virtual node and the OSD_ID of the to-be-allocated storage node.
Specifically, assume that the VNode_IDs (virtual node numbers) of the X virtual nodes are 1 to X and that the OSD_ID (storage node number) of the to-be-allocated storage node is 1. With a suitable hash algorithm, the VNode_ID and the OSD_ID are substituted into the calculation to obtain the first hash value of virtual node 1, the first hash value of virtual node 2, and so on. Each first hash value should be a hash value on the hash ring.
It should be noted that: the "first" here and the "second" described in the following embodiments are merely used to make the two "hash values" different from each other in name.
Step 1042, using the first hash value as the mapping start position of the virtual node.
As described above, each virtual node contains a plurality of hash values, and the first of them serves as the mapping start position. For example, if a virtual node contains 5 hash values and the calculated first hash value is 1, the mapping start position of the virtual node is 1 and the virtual node covers the 5 hash values 1, 2, 3, 4 and 5.
Of course, the hashing method of the virtual node is not limited thereto, and may be other hashing methods known to those skilled in the art.
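Steps 1041 and 1042 can be illustrated with a short sketch. The use of SHA-256, the "OSD_ID:VNode_ID" key format and the wrap-around at the end of the ring are assumptions; the description only requires "an appropriate hash algorithm" whose result lies on the ring.

```python
import hashlib


def first_hash(vnode_id: int, osd_id: int, n: int) -> int:
    """Step 1041: derive a ring position from VNode_ID and OSD_ID (assumed SHA-256)."""
    digest = hashlib.sha256(f"{osd_id}:{vnode_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n


def covered_hash_values(start: int, size: int, n: int) -> list[int]:
    """Step 1042: the virtual node covers `size` hash values beginning at `start`."""
    return [(start + offset) % n for offset in range(size)]


print(covered_hash_values(start=1, size=5, n=100))   # [1, 2, 3, 4, 5]
print(covered_hash_values(start=98, size=5, n=100))  # [98, 99, 0, 1, 2]
start = first_hash(vnode_id=1, osd_id=1, n=100)
print(0 <= start < 100)  # True
```

The first call reproduces the example from the text (5 hash values starting at 1); the second shows the assumed wrap-around when a virtual node starts near the end of the ring.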
In the above embodiment, N and M are generally chosen so that N is much larger than M. This makes the hashing of massive numbers of storage objects more uniform and effectively prevents local hot spots from forming on storage nodes during concurrent reading.
In addition, in the above formula, the ratio of M to the number of storage nodes in the storage system may be greater than or equal to 100. The larger the value of M, the more sensitive the balancing algorithm, namely X = M*(K*OSDCapacity/TotalCapacity + (1-K)*OSDFreeCapacity/TotalFreeCapacity). If M is too small, a change in (K*OSDCapacity/TotalCapacity + (1-K)*OSDFreeCapacity/TotalFreeCapacity) is not readily reflected in X. That is, the larger M is, the finer the granularity at which the balancing algorithm can adjust, and the easier it is to hash storage objects evenly.
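The effect of M on granularity can be seen with a small numeric check. The helper names, the rounding to the nearest integer and the example shares are assumptions for illustration.

```python
def x_for(m: int, share: float) -> int:
    """X for a node whose weighted share (the bracketed term) is `share`."""
    return round(m * share)  # rounding is an assumption of this sketch


def m_is_large_enough(m: int, node_count: int) -> bool:
    """Check the suggested ratio of M to the number of storage nodes."""
    return m / node_count >= 100


# A node's share drifts from 0.333 to 0.337 between two ClusterMap refreshes.
print(x_for(30, 0.333), x_for(30, 0.337))      # 10 10: the change is invisible
print(x_for(3000, 0.333), x_for(3000, 0.337))  # 999 1011: the change shows up
print(m_is_large_enough(3000, 30))             # True
```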
Fig. 3 is a flowchart of an object storage method according to an embodiment of the present invention; fig. 4 is a flowchart illustrating the object storage method shown in fig. 3, and as shown in fig. 3 and fig. 4, the method includes the following steps:
step 301, receiving a storage object, wherein the storage object comprises time information, space information and content information.
Specifically, the storage object usually has a certain service attribute, for example, a picture acquired by the image acquisition device includes attributes such as acquisition time, acquisition location, acquisition content, and the like, the pictures are stored in the storage node and are used for querying when needed, and the query requirement usually includes the service attribute, so that the service attribute needs to be retained when the storage object is stored. The present invention provides that the step of receiving the storage object also receives time information, space information and content information of the storage object. The step execution subject may be a device in the storage system responsible for storage management, such as a storage gateway.
Step 302, calculating and generating a second hash value according to the time information and the space information, wherein the second hash value is equal to one hash value in the hash ring of the storage capacity balancing method.
Specifically, by setting a hash (hash) algorithm so that the second hash value calculated and generated from the time information and the space information is equal to the hash value in the hash ring, the storage object generates a series of second hash values corresponding to the hash value in the hash ring by hash transform.
Step 303, searching the OSD _ ID of the to-be-divided storage node corresponding to the second hash value in the hash index record of the storage capacity balancing method.
As described in the embodiment of the storage capacity balancing method, each to-be-sorted storage node is allocated with X virtual nodes, and after the virtual nodes are hashed to the hash ring, a hash index record of the corresponding relationship between the hash value and the OSD _ ID of the to-be-sorted storage node is generated, and the OSD _ ID (storage node number) of the to-be-sorted storage node corresponding to the second hash value can be obtained by querying the record.
Step 304, storing the content information into the target storage node pointed to by the OSD_ID of the to-be-allocated storage node.
Specifically, the storage node pointed to by the queried storage node number OSD_ID is taken as the target storage node, and the received content information of the storage object is stored in it. This completes the process of storing the storage object into a storage node.
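Step 304 can be illustrated with in-memory stand-ins for the storage nodes. In this sketch each node is modelled as a single append-only byte buffer, and `NODES` and `store_content` are hypothetical names; a real storage node would write to a file on disk or over the network.

```python
from typing import Dict

# Hypothetical stand-ins for storage nodes: OSD_ID -> contents of the node's file.
NODES: Dict[int, bytearray] = {1: bytearray(), 2: bytearray(), 3: bytearray()}


def store_content(osd_id: int, content: bytes) -> int:
    """Append content to the target node's file and return its byte offset."""
    target = NODES[osd_id]   # the target storage node pointed to by OSD_ID
    offset = len(target)     # position of the first byte of the new object
    target.extend(content)
    return offset
```

Returning the offset is a design choice of this sketch: it gives the caller the storage location of the content information within the target node.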
In the object storage method provided by this embodiment of the invention, a hash value is calculated from the time information and space information carried by the received storage object, and this hash value is used to search the hash index record formed by the storage capacity balancing method, so as to obtain the OSD_ID of the corresponding storage node; the content information of the storage object is then stored at the designated location. Because the capacity balancing method automatically balances the write volume of each storage node according to its real-time capacity, the storage object is written to a suitable location and the capacities of the storage nodes remain balanced.
In addition, because the hash value calculated from the time information and the space information is equal to a hash value on the hash ring, each hash value on the hash ring corresponds to the time information and space information of storage objects, so storage objects are hashed according to time and space, two service-related attributes. When storage objects are queried in batches by time and space attributes, the objects are evenly distributed across the storage nodes in the time and space dimensions, so the storage nodes can be read concurrently to the greatest extent and the query speed is increased.
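The concurrent batch read described above might look like the following sketch. It assumes two caller-supplied functions, `lookup` (hash value to OSD_ID) and `read_node` (fetch the objects for a list of hash values from one node); both are hypothetical placeholders, and a thread pool is only one possible way to read the nodes in parallel.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def batch_query(
    hash_values: Iterable[int],
    lookup: Callable[[int], int],
    read_node: Callable[[int, List[int]], list],
) -> Dict[int, list]:
    """Group the queried hash values by storage node and read the nodes in parallel."""
    by_node: Dict[int, List[int]] = defaultdict(list)
    for h in hash_values:
        by_node[lookup(h)].append(h)
    # One worker per involved node, so no node waits behind another.
    with ThreadPoolExecutor(max_workers=max(1, len(by_node))) as pool:
        futures = {osd: pool.submit(read_node, osd, hs) for osd, hs in by_node.items()}
        return {osd: fut.result() for osd, fut in futures.items()}
```

The more evenly the queried objects are spread over the storage nodes, the closer the total query time comes to the read time of a single node.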
The object storage method described in the above embodiment may further include: encoding the storage location of the content information in the target storage node together with the OSD_ID of the target storage node to form an object ID of the storage object.
Specifically, a storage object is typically stored in the storage node in the form of a file, and the storage location is the offset of the first byte of the storage object relative to the first byte of the file header. When the storage object is stored, the device responsible for storage management in the storage system, such as a storage gateway, can obtain this offset. To facilitate querying storage objects already stored in storage nodes, the obtained offset and the OSD_ID (storage node number) of the target storage node may be encoded as the object ID of the storage object. Thus, when the storage object is queried, the second hash value is calculated from the time information and space information, the hash index record is searched to obtain the OSD_ID, the storage location of the storage object is obtained from the OSD_ID and the object ID, and the storage object is then read from that location.
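One possible encoding of the object ID is a single 64-bit integer that packs the OSD_ID into the high bits and the offset into the low bits. The text does not specify the encoding, so the 16-bit and 48-bit field widths and the function names below are assumptions made for this sketch.

```python
OSD_BITS = 16     # assumed width of the OSD_ID field
OFFSET_BITS = 48  # assumed width of the offset field


def encode_object_id(osd_id: int, offset: int) -> int:
    """Pack the OSD_ID and the byte offset into one integer object ID."""
    if not 0 <= osd_id < (1 << OSD_BITS):
        raise ValueError("osd_id out of range")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset out of range")
    return (osd_id << OFFSET_BITS) | offset


def decode_object_id(object_id: int) -> tuple:
    """Recover (OSD_ID, offset) from an object ID produced by encode_object_id."""
    return object_id >> OFFSET_BITS, object_id & ((1 << OFFSET_BITS) - 1)
```

With such an encoding, the storage location can be recovered from the object ID alone once the target node is known, without any additional index.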
Fig. 5 is a block diagram of an object storage apparatus according to an embodiment of the present invention. As shown in Fig. 5, the apparatus includes: a receiving unit 41 for receiving a storage object including time information, space information and content information; a calculation unit 42 for calculating a second hash value from the time information and the space information, the second hash value being equal to one hash value in the hash ring of the storage capacity balancing method; a record query unit 43 for searching the hash index record of the storage capacity balancing method for the OSD_ID of the to-be-allocated storage node corresponding to the second hash value; and a storage unit 44 for storing the content information into the target storage node pointed to by the OSD_ID of the to-be-allocated storage node.
The method executed by each unit of the apparatus has been described in detail in the object storage method above and is not repeated here.
In the object storage apparatus provided in this embodiment, the calculation unit calculates a hash value from the time information and space information carried by the storage object received by the receiving unit, the record query unit uses this hash value to search the hash index record formed by the storage capacity balancing method so as to obtain the OSD_ID of the corresponding storage node, and the storage unit then stores the content information of the storage object at the designated location. Because the capacity balancing method automatically balances the write volume of each storage node according to its real-time capacity, the storage object is written to a suitable location and the capacities of the storage nodes remain balanced.
In addition, because the calculation unit calculates, from the time information and the space information, a hash value equal to a hash value on the hash ring, each hash value on the hash ring corresponds to the time information and space information of storage objects, so storage objects are hashed according to time and space, two service-related attributes. When storage objects are queried in batches by time and space attributes, the objects are evenly distributed across the storage nodes in the time and space dimensions, so the storage nodes can be read concurrently to the greatest extent and the query speed is increased.
The object storage apparatus provided in the foregoing embodiment may further include: an encoding unit for encoding the storage location of the content information in the target storage node together with the OSD_ID of the target storage node to form the object ID of the storage object.
The method executed by the encoding unit has been described in detail in the above embodiment of the object storage method and is not repeated here. The object ID formed by the encoding unit facilitates querying the storage object.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A method for balancing storage capacity, comprising:
(1) forming a hash ring, wherein the hash ring is composed of N hash values;
(2) dividing the hash ring into M equal parts, wherein each equal part of the hash ring forms a virtual node;
(3) dynamically allocating, in real time, X virtual nodes to a to-be-allocated storage node according to the current storage capacity and the remaining storage capacity of the to-be-allocated storage node and the current total storage capacity and the total remaining storage capacity of all the storage nodes in the storage system where the to-be-allocated storage node is located;
(4) hashing the X virtual nodes onto the hash ring;
(5) forming a hash index record, wherein the hash index record comprises the virtual node number VNode_ID of the virtual node, the corresponding hash value and the storage node number OSD_ID of the to-be-allocated storage node;
(6) repeating steps (3) to (5), taking a storage node in the storage system that has not been allocated virtual nodes as the to-be-allocated storage node;
when the CluserMap is refreshed, repeating steps (3) to (6), wherein the CluserMap is used for storing the storage node number OSD_ID of each storage node in the storage system together with its current storage capacity and remaining storage capacity;
the step (3) comprises: dynamically allocating, in real time, X virtual nodes to the to-be-allocated storage node according to a formula; the formula is:
X = M * (K * OSDCapacity / TotalCapacity + (1 - K) * OSDFreeCapacity / TotalFreeCapacity),
wherein K is a balance factor with 0 < K < 1, OSDCapacity is the storage capacity of the to-be-allocated storage node, OSDFreeCapacity is the remaining storage capacity of the to-be-allocated storage node, TotalCapacity is the total storage capacity of all the storage nodes in the storage system where the to-be-allocated storage node is located, and TotalFreeCapacity is the total remaining storage capacity of all the storage nodes in the storage system.
2. The method of claim 1, wherein K is greater than 0 and less than or equal to 1/2.
3. The method of claim 1, wherein the CluserMap refresh comprises: each storage node in the storage system periodically sending its storage node number OSD_ID, its current storage capacity and its remaining storage capacity to a storage gateway; and the storage gateway refreshing the storage node number OSD_ID, the corresponding current storage capacity and the corresponding remaining storage capacity of each storage node in the CluserMap.
4. The method according to any one of claims 1 to 3, wherein the step (4) comprises: calculating a first hash value according to the virtual node number VNode_ID of the virtual node and the storage node number OSD_ID of the to-be-allocated storage node; and taking the first hash value as the mapping starting position of the virtual node.
5. The method of any of claims 1-3, wherein the ratio of M to the number of storage nodes in the storage system is greater than or equal to 100.
6. An object storage method, comprising:
receiving a storage object, wherein the storage object comprises time information, space information and content information;
calculating and generating a second hash value according to the time information and the space information, wherein the second hash value is equal to one hash value in the hash ring of any one of claims 1-5;
searching the hash index record of any one of claims 1 to 5 for the storage node number OSD_ID of the to-be-allocated storage node corresponding to the second hash value;
and storing the content information into a target storage node pointed to by the storage node number OSD_ID of the to-be-allocated storage node.
7. The method of claim 6, further comprising: encoding the storage location of the content information in the target storage node together with the storage node number OSD_ID of the target storage node to form the object ID of the storage object.
8. An object storage device, comprising:
a receiving unit for receiving a storage object including time information, space information and content information;
a calculation unit, configured to calculate a second hash value according to the time information and the space information, where the second hash value is equal to one hash value in the hash ring according to any one of claims 1 to 5;
a record query unit, configured to search the hash index record of any one of claims 1 to 5 for the storage node number OSD_ID of the to-be-allocated storage node corresponding to the second hash value;
and a storage unit, configured to store the content information into a target storage node pointed to by the storage node number OSD_ID of the to-be-allocated storage node.
9. The apparatus of claim 8, further comprising: an encoding unit, configured to encode the storage location of the content information in the target storage node together with the storage node number OSD_ID of the target storage node to form the object ID of the storage object.
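
As an illustration of the allocation formula in claim 1, the following sketch computes X for a small cluster. The three nodes, their capacities and the values M = 400 and K = 0.5 are made-up figures, the function names are hypothetical, and the claims do not state how a fractional X is rounded, so the result is left as a float.

```python
from typing import Dict, Tuple


def virtual_node_share(m: int, k: float, osd_capacity: float, osd_free: float,
                       total_capacity: float, total_free: float) -> float:
    """X = M * (K * OSDCapacity / TotalCapacity + (1 - K) * OSDFreeCapacity / TotalFreeCapacity)."""
    if not 0 < k < 1:
        raise ValueError("K must satisfy 0 < K < 1")
    return m * (k * osd_capacity / total_capacity + (1 - k) * osd_free / total_free)


def allocate(m: int, k: float, nodes: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Compute X for every node; nodes maps OSD_ID -> (capacity, free capacity)."""
    total_capacity = sum(cap for cap, _ in nodes.values())
    total_free = sum(free for _, free in nodes.values())
    return {
        osd: virtual_node_share(m, k, cap, free, total_capacity, total_free)
        for osd, (cap, free) in nodes.items()
    }


# Made-up example cluster: OSD_ID -> (capacity, free capacity), arbitrary units.
EXAMPLE_NODES = {"osd-1": (100.0, 50.0), "osd-2": (100.0, 100.0), "osd-3": (200.0, 50.0)}
EXAMPLE_SHARES = allocate(400, 0.5, EXAMPLE_NODES)
```

With these figures osd-1 receives 100 virtual nodes while osd-2 and osd-3 each receive 150: osd-2 is favoured for its free space and osd-3 for its total capacity. Because the two weights K and 1 - K sum to 1, the shares of all nodes always add up to M.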
CN201910867983.1A 2019-09-16 2019-09-16 Storage capacity balancing method, object storage method and device Active CN110633053B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910867983.1A CN110633053B (en) 2019-09-16 2019-09-16 Storage capacity balancing method, object storage method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910867983.1A CN110633053B (en) 2019-09-16 2019-09-16 Storage capacity balancing method, object storage method and device

Publications (2)

Publication Number Publication Date
CN110633053A CN110633053A (en) 2019-12-31
CN110633053B true CN110633053B (en) 2020-07-10

Family

ID=68971206

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910867983.1A Active CN110633053B (en) 2019-09-16 2019-09-16 Storage capacity balancing method, object storage method and device

Country Status (1)

Country Link
CN (1) CN110633053B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111475105B (en) * 2020-03-11 2024-05-03 平安科技(深圳)有限公司 Monitoring data storage method, monitoring data storage device, monitoring data server and storage medium
CN117370275A (en) * 2022-07-01 2024-01-09 中兴通讯股份有限公司 File method, server, storage node, file storage system and client

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102843403A (en) * 2011-06-23 2012-12-26 盛大计算机(上海)有限公司 File processing method based on distributed file system, system, and client
CN105721532A (en) * 2014-12-26 2016-06-29 乐视网信息技术(北京)股份有限公司 Node management method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8504766B2 (en) * 2010-04-15 2013-08-06 Netapp, Inc. Methods and apparatus for cut-through cache management for a mirrored virtual volume of a virtualized storage system
CN107562531B (en) * 2016-06-30 2020-10-09 华为技术有限公司 Data equalization method and device
CN107018197A (en) * 2017-04-13 2017-08-04 南京大学 A kind of holding load dynamic retractility mobile awareness Complex event processing method in a balanced way
CN107197035A (en) * 2017-06-21 2017-09-22 中国民航大学 A kind of compatibility dynamic load balancing method based on uniformity hash algorithm
CN110049091A (en) * 2019-01-10 2019-07-23 阿里巴巴集团控股有限公司 Date storage method and device, electronic equipment, storage medium
CN110096227B (en) * 2019-03-28 2023-04-18 北京奇艺世纪科技有限公司 Data storage method, data processing device, electronic equipment and computer readable medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
改进的云存储系统数据分布策略;周敬利,周正达;《计算机应用》;20120427;第309-312页 *

Also Published As

Publication number Publication date
CN110633053A (en) 2019-12-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant