CN107085501B - Data storage method, data migration method, data storage device and data migration device - Google Patents


Info

Publication number
CN107085501B
Authority
CN
China
Prior art keywords
node
resource
storage
resource node
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610087865.5A
Other languages
Chinese (zh)
Other versions
CN107085501A (en)
Inventor
刘志辉
管国辰
林武康
徐鑫
张仪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201610087865.5A
Publication of CN107085501A
Application granted
Publication of CN107085501B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms

Abstract

The embodiments of the present application disclose a data storage method, a data migration method and corresponding devices, and relate to the field of computer technology. The data storage method comprises: acquiring an identifier of a data block to be stored; determining, according to the identifier, a first mapping point of the data block on a preset closed ring, wherein the preset closed ring is formed by a preset number of points, and each point on the ring corresponds to the identifier of one data block or of one resource node; obtaining, according to the storage range of each resource node, a first resource node corresponding to the first mapping point, wherein the storage range of a resource node is the range, on the preset closed ring, of the mapping points of the data blocks that can be stored by the storage node corresponding to that resource node; and storing the data block to the storage node corresponding to the first resource node. Applying this scheme to data storage reduces the extent to which resources other than the storage nodes limit the amount of data that can be stored.

Description

Data storage method, data migration method, data storage device and data migration device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for data storage and data migration.
Background
With the rapid development of computer technology, more and more data needs to be stored in practical applications. A file system based on a single storage device can hardly meet user requirements, so distributed file systems are widely used. A distributed file system contains multiple storage nodes, which may be located on the same physical machine or on different physical machines; because of its abundant storage resources, it can meet user requirements to a greater extent than a file system based on a single storage device.
Taking HDFS (Hadoop Distributed File System) as an example, when a file is stored in the distributed file system, the metadata of the data stored on each storage node is kept in the memory of a specific node of the system. Because the memory of that specific node is limited, the amount of metadata it can hold is limited, and therefore the amount of data that the distributed file system can store is also limited: once the specific node has no remaining resources for storing metadata, no more data can be stored even if the storage nodes still have free space. It can be seen that, in the prior art, data storage is limited not only by the storage resources of the storage nodes but also by resources other than the storage nodes.
Disclosure of Invention
The embodiments of the present application disclose a data storage method, a data migration method and corresponding devices, which aim to reduce the extent to which resources other than the storage nodes limit the amount of data that can be stored.
In order to achieve the above object, an embodiment of the present application discloses a data storage method, where the method includes:
acquiring an identifier of a data block to be stored;
determining a first mapping point of the data block to be stored on a preset closed ring according to the identifier of the data block to be stored, wherein the preset closed ring is a closed ring formed by a preset number of points, one point on the preset closed ring corresponds to the identifier of one data block or the identifier of one resource node, and a resource node is a sub-node obtained by dividing a storage node, based on the state of its storage resources, according to a preset node division rule;
obtaining a first resource node corresponding to the first mapping point according to the storage range of each resource node, wherein the storage range of a resource node is the range, on the preset closed ring, of the mapping points of the data blocks that can be stored by the storage node corresponding to that resource node;
and storing the data block to be stored to a storage node corresponding to the first resource node.
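The four steps above can be illustrated with a minimal sketch. It rests on assumptions that the text does not fix: the ring size `RING_SIZE`, the use of MD5 as the hash, the representation of storage ranges as a sorted list of range start points, and all function and variable names are hypothetical.

```python
import hashlib
from bisect import bisect_right

RING_SIZE = 2 ** 16  # assumed preset number of points on the closed ring


def mapping_point(identifier: str, ring_size: int = RING_SIZE) -> int:
    """Map an identifier to a point on the ring (MD5 is an assumed choice)."""
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ring_size


def find_resource_node(point, ranges):
    """Return the resource node whose storage range contains `point`.

    `ranges` is a list of (start_point, resource_node) sorted by start_point.
    Each range runs from its start up to the next start; the last range
    wraps around the ring to the first start.
    """
    starts = [start for start, _ in ranges]
    idx = bisect_right(starts, point) - 1  # -1 selects the wrapping last range
    return ranges[idx][1]


def store_block(block_id, data, ranges, node_of, storage):
    """Store `data` on the storage node behind the matching resource node."""
    point = mapping_point(block_id)
    resource_node = find_resource_node(point, ranges)
    storage_node = node_of[resource_node]
    storage.setdefault(storage_node, {})[block_id] = data
    return resource_node, storage_node
```

In this sketch the location of a block is recomputed from its identifier on every access, so no per-block location record has to be kept in the memory of a dedicated node.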
In a specific implementation manner of the present application, the storage range of each resource node is determined by the following method:
determining mapping points of the resource nodes on the preset closed circular ring according to the identifiers of the resource nodes;
dividing the preset closed ring into a plurality of point segments according to the determined mapping point position of each resource node and a preset segmentation rule to obtain the point segment corresponding to each resource node, wherein the number of the divided point segments is equal to that of the resource nodes, and the point segments are not overlapped;
and determining the range of the points in the point segment corresponding to each resource node as the storage range of that resource node.
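As an illustration of this determination, the sketch below assumes one possible segmentation rule, which the text leaves open: each resource node's point segment starts at its own mapping point and ends just before the next mapping point in ring order. It also assumes that the mapping points of the resource nodes are distinct. The function name and the `(start, length)` representation are hypothetical.

```python
def storage_ranges(node_points, ring_size):
    """Split the ring into one non-overlapping point segment per resource node.

    `node_points` maps resource node -> mapping point (assumed distinct).
    Returns resource node -> (start_point, length_in_points).
    """
    ordered = sorted(node_points.items(), key=lambda item: item[1])
    ranges = {}
    for i, (node, start) in enumerate(ordered):
        next_start = ordered[(i + 1) % len(ordered)][1]
        # Modulo handles the segment that wraps past the end of the ring;
        # a single node owns the whole ring.
        length = (next_start - start) % ring_size or ring_size
        ranges[node] = (start, length)
    return ranges
```

With this rule the number of segments equals the number of resource nodes and the segment lengths add up to the ring size, which matches the two constraints stated above.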
In a specific implementation manner of the present application, the determining, according to the identifier of the data block to be stored, a first mapping point of the data block to be stored on a preset closed circular ring includes:
calculating the hash value of the identifier of the data block to be stored;
and determining a first mapping point of the data block to be stored on a preset closed circular ring according to the hash value obtained by the calculation.
In a specific implementation manner of the present application, the preset node partition rule for the storage resource status of the storage node includes:
acquiring the capacity of available storage resources in each storage node;
and dividing each storage node according to the preset total number of resource nodes and the obtained capacity, to obtain the resource nodes corresponding to each storage node.
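One way to read this division rule is a capacity-proportional split. The sketch below is an assumption, not the rule defined by the application: it gives each storage node a share of the preset total proportional to its available capacity and uses a largest-remainder step so that the shares add up exactly. The function name and the generated resource node identifiers are hypothetical.

```python
def divide_storage_nodes(capacities, total_resource_nodes):
    """Assign resource nodes to storage nodes in proportion to capacity.

    `capacities` maps storage node -> available capacity (any unit).
    Returns storage node -> list of hypothetical resource node identifiers.
    """
    total_capacity = sum(capacities.values())
    if total_capacity <= 0:
        raise ValueError("no available storage capacity")
    exact = {node: total_resource_nodes * cap / total_capacity
             for node, cap in capacities.items()}
    counts = {node: int(share) for node, share in exact.items()}
    leftover = total_resource_nodes - sum(counts.values())
    # Give the remaining resource nodes to the largest fractional shares.
    by_fraction = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for node in by_fraction[:leftover]:
        counts[node] += 1
    return {node: [f"{node}-r{i}" for i in range(count)]
            for node, count in counts.items()}
```

A storage node with three times the available capacity of another then backs three times as many resource nodes, so it receives a correspondingly larger part of the ring.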
In a specific implementation manner of the present application, the data storage method further includes:
receiving a resource node increase request, wherein the resource node increase request comprises: identification of resource nodes to be added;
determining a second mapping point of the resource node to be added on the preset closed circular ring according to the identifier of the resource node to be added;
obtaining the two adjacent resource nodes whose mapping points lie on either side of the second mapping point and are closest to the second mapping point;
calculating the length of the storage range of the resource node to be added according to the total number of resource nodes and the total number of points on the closed ring, wherein the total number of resource nodes is equal to the sum of the number of current resource nodes and the number of resource nodes to be added;
obtaining the storage range of the resource node to be added from the storage ranges of the adjacent resource nodes and the calculated length, according to a preset multiple linear regression algorithm, and adjusting the storage ranges of the adjacent resource nodes, wherein the sum of the storage range of the resource node to be added and the adjusted storage ranges of the adjacent resource nodes is equal to the sum of the storage ranges of the adjacent resource nodes before adjustment;
and migrating the data block from the storage node corresponding to the adjacent resource node to the storage node corresponding to the resource node to be added according to the storage range of the resource node to be added and the adjusted storage range of the adjacent resource node.
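The node-addition steps above can be sketched as follows. Two points are assumptions rather than statements of the application: the length is taken as the number of points on the ring divided by the total number of resource nodes, and the "preset multiple linear regression algorithm", which this passage does not detail, is replaced by a plain proportional split between the two neighbours. Ranges are hypothetical `(start, length)` pairs, with the predecessor's range ending where the successor's begins.

```python
def add_resource_node(pred, succ, total_points, current_nodes, added_nodes=1):
    """Carve a range for a new resource node out of its two neighbours.

    Returns (new_pred_range, new_node_range, new_succ_range).
    """
    target = total_points // (current_nodes + added_nodes)
    pred_start, pred_len = pred
    succ_start, succ_len = succ
    if target >= pred_len + succ_len:
        raise ValueError("neighbours are too small to donate the target length")
    # Stand-in for the regression step: each neighbour gives up points in
    # proportion to the length of its current range.
    from_pred = target * pred_len // (pred_len + succ_len)
    from_succ = target - from_pred
    new_range = ((succ_start - from_pred) % total_points, target)
    new_pred = (pred_start, pred_len - from_pred)
    new_succ = ((succ_start + from_succ) % total_points, succ_len - from_succ)
    return new_pred, new_range, new_succ


def in_range(point, rng, total_points):
    """True if `point` lies in the (start, length) range, allowing wrap-around."""
    start, length = rng
    return (point - start) % total_points < length


def blocks_to_migrate(block_points, new_range, total_points):
    """Identifiers of blocks whose mapping point now belongs to the new node."""
    return sorted(block_id for block_id, point in block_points.items()
                  if in_range(point, new_range, total_points))
```

Only blocks held by the two adjacent resource nodes can fall into the new range, so the migration touches the storage nodes behind those two neighbours and leaves every other resource node unchanged.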
In a specific implementation manner of the present application, the data storage method further includes:
receiving a duplicate resource node setup request for a second resource node;
and selecting a resource node from the resource nodes corresponding to the target storage node as a copy resource node of the second resource node, wherein the target storage node is a storage node different from the storage node corresponding to the second resource node.
In a specific implementation manner of the present application, the target storage node is a storage node located in a physical machine other than a target physical machine, wherein the target physical machine is the physical machine where the storage node corresponding to the second resource node is located.
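A minimal sketch of this selection is given below. The two lookup tables `node_of` (resource node to storage node) and `machine_of` (storage node to physical machine), the function name, and the random choice among eligible candidates are assumptions; the text only requires that the duplicate sit on a different storage node and, in this implementation manner, on a different physical machine.

```python
import random


def choose_duplicate(resource_node, node_of, machine_of, rng=random):
    """Pick a duplicate resource node hosted on another physical machine."""
    home_machine = machine_of[node_of[resource_node]]
    candidates = sorted(
        candidate for candidate, storage_node in node_of.items()
        if machine_of[storage_node] != home_machine
    )
    if not candidates:
        raise LookupError("no storage node on a different physical machine")
    return rng.choice(candidates)  # the selection policy is an assumption
```

Placing the duplicate on another physical machine means that the loss of one machine does not remove both copies of the data in a storage range.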
In a specific implementation manner of the present application, after obtaining the first resource node corresponding to the first mapping point according to the storage range of each resource node, the method further includes:
judging whether the first resource node has a duplicate resource node;
and if so, storing the data block to be stored to the storage node corresponding to the duplicate resource node of the first resource node.
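The check can be sketched as below, on the reading that the block is written to the storage node of the first resource node and, when a duplicate resource node exists, to the duplicate's storage node as well. The `duplicate_of` table and all names are hypothetical.

```python
def store_with_duplicate(block_id, data, first_node, duplicate_of, node_of, storage):
    """Write a block to the first resource node and to its duplicate, if any.

    `duplicate_of` maps resource node -> duplicate resource node.
    Returns the list of storage nodes that received the block.
    """
    targets = [first_node]
    duplicate = duplicate_of.get(first_node)
    if duplicate is not None:
        targets.append(duplicate)
    for resource_node in targets:
        storage.setdefault(node_of[resource_node], {})[block_id] = data
    return [node_of[resource_node] for resource_node in targets]
```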
In a specific implementation manner of the present application, the data storage method further includes:
receiving a resource node deletion request aiming at a resource node to be deleted;
judging whether a duplicate resource node of the resource node to be deleted exists or not;
and if so, directly deleting the resource node to be deleted, and adjusting, according to the storage range of the resource node to be deleted, the storage ranges of the two resource nodes whose mapping points lie on either side of a third mapping point, which corresponds to the resource node to be deleted, and are closest to the third mapping point.
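A sketch of the deletion step follows. How the freed range is shared between the two neighbours is not stated in this passage, so the even split used here is an assumption, as are the function name and the `(start, length)` representation of contiguous ranges. The case where no duplicate resource node exists is not covered by this passage, and the sketch simply refuses it.

```python
def delete_resource_node(pred, deleted, succ, total_points, has_duplicate):
    """Remove a resource node and hand its range to its two neighbours.

    Returns (new_pred_range, new_succ_range).
    """
    if not has_duplicate:
        # Not described in this passage; left out of the sketch on purpose.
        raise NotImplementedError("deletion without a duplicate is not sketched")
    pred_start, pred_len = pred
    succ_start, succ_len = succ
    _, deleted_len = deleted
    to_pred = deleted_len // 2          # assumed even split of the freed range
    to_succ = deleted_len - to_pred
    new_pred = (pred_start, pred_len + to_pred)
    new_succ = ((succ_start - to_succ) % total_points, succ_len + to_succ)
    return new_pred, new_succ
```

Because the duplicate resource node still holds the data, the node can be removed at once and the neighbours only need their ranges extended to cover the freed points.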
In order to achieve the above object, an embodiment of the present application discloses a data migration method, where the method includes:
receiving a resource node increase request, wherein the resource node increase request comprises: identification of resource nodes to be added;
determining a fourth mapping point of the resource node to be added on a preset closed ring according to the identifier of the resource node to be added, wherein the preset closed ring is a closed ring formed by a preset number of points, one point on the preset closed ring corresponds to the identifier of one data block or the identifier of one resource node, and the resource node is a sub-node obtained by dividing according to a preset node division rule aiming at the storage resource state of a storage node;
obtaining the two adjacent resource nodes whose mapping points lie on either side of the fourth mapping point and are closest to the fourth mapping point;
calculating the length of the storage range of the resource node to be added according to the total number of resource nodes and the total number of points on the closed ring, wherein the total number of resource nodes is equal to the sum of the number of current resource nodes and the number of resource nodes to be added;
obtaining the storage range of the resource node to be added from the storage ranges of the adjacent resource nodes and the calculated length, according to a preset multiple linear regression algorithm, and adjusting the storage ranges of the adjacent resource nodes, wherein the sum of the storage range of the resource node to be added and the adjusted storage ranges of the adjacent resource nodes is equal to the sum of the storage ranges of the adjacent resource nodes before adjustment;
and migrating the data block from the storage node corresponding to the adjacent resource node to the storage node corresponding to the resource node to be added according to the storage range of the resource node to be added and the adjusted storage range of the adjacent resource node.
In a specific implementation manner of the present application, the determining, according to the identifier of the resource node to be added, a fourth mapping point of the resource node to be added on a preset closed circular ring includes:
calculating the hash value of the identifier of the resource node to be added;
and determining a fourth mapping point of the resource node to be added on a preset closed circular ring according to the hash value obtained by calculation.
In a specific implementation manner of the present application, the preset node partition rule for the storage resource status of the storage node includes:
acquiring the capacity of available storage resources in each storage node;
and dividing each storage node according to the preset total number of resource nodes and the obtained capacity, to obtain the resource nodes corresponding to each storage node.
In a specific implementation manner of the present application, the data migration method further includes:
receiving a duplicate resource node setup request for a third resource node;
and selecting a resource node from the resource nodes corresponding to the target storage node as a copy resource node of the third resource node, wherein the target storage node is a storage node different from the storage node corresponding to the third resource node.
In a specific implementation manner of the present application, the target storage node is a storage node located in a physical machine other than a target physical machine, wherein the target physical machine is the physical machine where the storage node corresponding to the third resource node is located.
In a specific implementation manner of the present application, the data migration method further includes:
receiving a resource node deletion request aiming at a resource node to be deleted;
judging whether a duplicate resource node of the resource node to be deleted exists or not;
and if so, directly deleting the resource node to be deleted, and adjusting, according to the storage range of the resource node to be deleted, the storage ranges of the two resource nodes whose mapping points lie on either side of the mapping point corresponding to the resource node to be deleted and are closest to that mapping point.
In order to achieve the above object, an embodiment of the present application discloses a data storage device, including:
the identification obtaining module is used for obtaining the identification of the data block to be stored;
the first mapping point determining module is configured to determine, according to the identifier of the data block to be stored, a first mapping point of the data block to be stored on a preset closed ring, where the preset closed ring is a closed ring formed by a preset number of points, one point on the preset closed ring corresponds to the identifier of one data block or the identifier of one resource node, and the resource node is a child node obtained by dividing according to a preset node division rule for a storage resource state of a storage node;
a resource node obtaining module, configured to obtain, according to a storage range of each resource node, a first resource node corresponding to the first mapping point, where the storage range of each resource node is: the range of mapping points of the data blocks which can be stored by the storage nodes corresponding to the resource nodes on the preset closed circular ring is within the preset closed circular ring;
and the first data storage module is used for storing the data block to be stored to the storage node corresponding to the first resource node.
In a specific implementation manner of the present application, the data storage device further includes:
the storage range determining module is used for determining the storage range of each resource node;
wherein the storage range determining module comprises:
the mapping point determining submodule is used for determining the mapping point of each resource node on the preset closed circular ring according to the identifier of each resource node;
the point segment obtaining submodule is used for dividing the preset closed ring into a plurality of point segments according to the determined mapping point position of each resource node and a preset segmentation rule to obtain the point segment corresponding to each resource node, wherein the number of the divided point segments is equal to that of the resource nodes, and each point segment is not overlapped;
and the storage range determining submodule is used for determining the range of the points in the point segment corresponding to each resource node as the storage range of that resource node.
In a specific implementation manner of the present application, the first mapping point determining module includes:
the first hash value calculation submodule is used for calculating the hash value of the identifier of the data block to be stored;
and the first mapping point determining submodule is used for determining a first mapping point of the data block to be stored on a preset closed ring according to the hash value obtained by calculation.
In a specific implementation manner of the present application, the preset node partition rule for the storage resource status of the storage node includes:
acquiring the capacity of available storage resources in each storage node;
and dividing each storage node according to the preset total number of resource nodes and the obtained capacity, to obtain the resource nodes corresponding to each storage node.
In a specific implementation manner of the present application, the data storage device further includes:
a first request receiving module, configured to receive a resource node increase request, where the resource node increase request includes: identification of resource nodes to be added;
the second mapping point determining module is used for determining a second mapping point of the resource node to be added on the preset closed circular ring according to the identifier of the resource node to be added;
the first adjacent resource node determining module is used for obtaining two adjacent resource nodes, wherein the mapping points are respectively positioned at two sides of the second mapping point, and the mapping points are closest to the second mapping point;
the first length calculation module is used for calculating the length of the storage range of the resource node to be added according to the total number of resource nodes and the total number of points on the closed ring, wherein the total number of resource nodes is equal to the sum of the number of current resource nodes and the number of resource nodes to be added;
a first storage range adjusting module, configured to obtain a storage range of the resource node to be added according to the storage range and the length of the adjacent resource node and according to a preset multiple linear regression algorithm, and adjust the storage range of the adjacent resource node, where a sum of the storage range of the resource node to be added and the storage range of the adjacent resource node after adjustment is equal to a sum of the storage ranges of the adjacent resource nodes before adjustment;
and the first data migration module is used for migrating data blocks from the storage nodes corresponding to the adjacent resource nodes to the storage nodes corresponding to the resource nodes to be added according to the storage range of the resource nodes to be added and the adjusted storage range of the adjacent resource nodes.
In a specific implementation manner of the present application, the data storage device further includes:
a second request receiving module, configured to receive a duplicate resource node setting request for a second resource node;
and the first resource node selection module is used for selecting a resource node from the resource nodes corresponding to the target storage node as a copy resource node of the second resource node, wherein the target storage node is a storage node different from the storage node corresponding to the second resource node.
In a specific implementation manner of the present application, the target storage node is a storage node located in a physical machine other than a target physical machine, wherein the target physical machine is the physical machine where the storage node corresponding to the second resource node is located.
In a specific implementation manner of the present application, the data storage device further includes:
a first resource node judgment module, configured to judge whether the first resource node has a duplicate resource node after the resource node obtaining module obtains the first resource node;
and the second data storage module is used for storing the data block to be stored to the storage node corresponding to the duplicate resource node of the first resource node when the judgment result of the first resource node judgment module is yes.
In a specific implementation manner of the present application, the data storage device further includes:
a third request receiving module, configured to receive a resource node deletion request for a resource node to be deleted;
the second resource node judgment module is used for judging whether a copy resource node of the resource node to be deleted exists or not;
and the second storage range adjusting module is used for directly deleting the resource node to be deleted under the condition that the judgment result of the second resource node judging module is yes, and adjusting the storage ranges of two resource nodes, wherein the mapping points are positioned at two sides of a third mapping point corresponding to the resource node to be deleted, and the mapping points are closest to the third mapping point, according to the storage range of the resource node to be deleted.
In order to achieve the above object, an embodiment of the present application discloses a data migration apparatus, including:
a fourth request receiving module, configured to receive a resource node increase request, where the resource node increase request includes: identification of resource nodes to be added;
a fourth mapping point determining module, configured to determine, according to the identifier of the resource node to be added, a fourth mapping point of the resource node to be added on a preset closed circular ring, where the preset closed circular ring is a closed circular ring formed by a preset number of points, one point on the preset closed circular ring corresponds to an identifier of one data block or an identifier of one resource node, and the resource node is a child node obtained by partitioning according to a preset node partitioning rule for a storage resource state of a storage node;
a second adjacent resource node determining module, configured to obtain two adjacent resource nodes where mapping points are located on two sides of the fourth mapping point respectively and a mapping point is closest to the fourth mapping point;
the second length calculation module is used for calculating the length of the storage range of the resource node to be added according to the total number of resource nodes and the total number of points on the closed ring, wherein the total number of resource nodes is equal to the sum of the number of current resource nodes and the number of resource nodes to be added;
a third storage range adjusting module, configured to obtain a storage range of the resource node to be added according to the storage range and the length of the adjacent resource node and according to a preset multiple linear regression algorithm, and adjust the storage range of the adjacent resource node, where a sum of the storage range of the resource node to be added and the storage range of the adjacent resource node after adjustment is equal to a sum of the storage ranges of the adjacent resource nodes before adjustment;
and the second data migration module is used for migrating the data block from the storage node corresponding to the adjacent resource node to the storage node corresponding to the resource node to be added according to the storage range of the resource node to be added and the adjusted storage range of the adjacent resource node.
In a specific implementation manner of the present application, the fourth mapping point determining module includes:
the second hash value calculation submodule is used for calculating the hash value of the identifier of the resource node to be added;
and the fourth mapping point determining submodule is used for determining a fourth mapping point of the resource node to be added on a preset closed ring according to the hash value obtained by calculation.
In a specific implementation manner of the present application, the preset node partition rule for the storage resource status of the storage node includes:
acquiring the capacity of available storage resources in each storage node;
and dividing each storage node according to the preset total number of resource nodes and the obtained capacity, to obtain the resource nodes corresponding to each storage node.
In a specific implementation manner of the present application, the data migration apparatus further includes:
a fifth request receiving module, configured to receive a duplicate resource node setting request for the third resource node;
and the second resource selection module is used for selecting a resource node from the resource nodes corresponding to the target storage node as a copy resource node of the third resource node, wherein the target storage node is a storage node different from the storage node corresponding to the third resource node.
In a specific implementation manner of the present application, the target storage node is a storage node located in a physical machine other than a target physical machine, wherein the target physical machine is the physical machine where the storage node corresponding to the third resource node is located.
In a specific implementation manner of the present application, the data migration apparatus further includes:
a sixth request receiving module, configured to receive a resource node deletion request for a resource node to be deleted;
the third resource node judgment module is used for judging whether a copy resource node of the resource node to be deleted exists or not;
and a fourth storage range adjusting module, configured to, if the judgment result of the third resource node judgment module is yes, directly delete the resource node to be deleted, and adjust, according to the storage range of the resource node to be deleted, the storage ranges of the two resource nodes whose mapping points lie on either side of the mapping point corresponding to the resource node to be deleted and are closest to that mapping point.
As can be seen from the above, in the scheme provided in the embodiments of the present application, a closed ring formed by a preset number of points is set in advance. When data is stored, the identifier of the data block to be stored is obtained, the mapping point of the data block on the preset closed ring is determined according to that identifier, the resource node corresponding to the mapping point is obtained according to the storage range of each resource node and the position of the mapping point, and finally the data block is stored in the storage node corresponding to the obtained resource node. The resource node corresponding to a piece of data is part of that data's metadata, and under this scheme it can be obtained by calculation from the identifier of the data, so the extent to which resources other than the storage nodes limit the amount of data that can be stored is reduced.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a first data storage method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a first closed circular ring according to an embodiment of the present application;
Fig. 3 is a schematic flowchart of a method for determining a storage range of a resource node according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a second closed circular ring according to an embodiment of the present application;
Fig. 5 is a schematic flowchart of a second data storage method according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a third closed circular ring according to an embodiment of the present application;
Fig. 7 is a schematic flowchart of a third data storage method according to an embodiment of the present application;
Fig. 8 is a schematic flowchart of a data migration method according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of a first data storage device according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of an apparatus for determining a storage range of a resource node according to an embodiment of the present application;
Fig. 11 is a schematic structural diagram of a second data storage device according to an embodiment of the present application;
Fig. 12 is a schematic structural diagram of a third data storage device according to an embodiment of the present application;
Fig. 13 is a schematic structural diagram of a data migration apparatus according to an embodiment of the present application.
Detailed Description
Because data storage in the prior art is limited by resources other than the storage nodes, the present application provides a data storage method and device to reduce the limitation that such resources impose on data storage.
The data storage method provided by the embodiments of the present application will be generally described below.
The data storage method comprises the following steps:
acquiring an identifier of a data block to be stored;
determining a first mapping point of the data block to be stored on a preset closed circular ring according to the identifier of the data block to be stored;
acquiring a first resource node corresponding to the first mapping point according to the storage range of each resource node;
and storing the data block to be stored to a storage node corresponding to the first resource node.
The preset closed circular ring is a closed circular ring formed by a preset number of points, and one point on the preset closed circular ring corresponds to the identifier of one data block or the identifier of one resource node. A resource node is a child node obtained by dividing a storage node according to a preset node division rule for the storage resource state of the storage node. The storage range of each resource node is the range, on the preset closed circular ring, of the mapping points of the data blocks that can be stored by the storage node corresponding to that resource node.
In view of that one physical machine may correspond to one storage node or a plurality of storage nodes, the data storage method provided in the embodiment of the present application may be applied to a distributed file system or a non-distributed file system, which is not limited in the present application.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 is a schematic flowchart of a first data storage method provided in an embodiment of the present application, where the method includes:
S101: Obtain the identifier of the data block to be stored.
The data block referred to in the embodiments of the present application may be understood as a piece of data, a file, a part of a file, or the like, and may also be understood as data corresponding to the smallest data storage unit when data is stored.
S102: Determine a first mapping point of the data block to be stored on a preset closed circular ring according to the identifier of the data block to be stored.
In the solution provided in the embodiment of the present application, a closed circular ring is preset, and the preset closed circular ring is formed by a preset number of points. It can be understood that, so that the points form a well-closed ring, the preset number may be a large number, for example 2^32. In addition, one point on the preset closed circular ring corresponds to the identifier of one data block or the identifier of one resource node. Specifically, refer to Fig. 2, which is a schematic diagram of a first closed circular ring provided in an embodiment of the present application.
The resource node is a child node obtained by dividing according to a preset node division rule for a storage resource state of a storage node, and it should be noted that the storage node may be one or multiple storage nodes, and in addition, one storage node may correspond to one physical machine, or multiple storage nodes correspond to one physical machine, which is not limited in this application.
In particular, from different partitioning perspectives, the states of the storage resources may correspond to different types, for example, the storage resource states may be an idle state, a stored data state, an available state, an unavailable state, and the like.
As will be appreciated by those skilled in the art, a hashing algorithm maps a binary value of arbitrary length to a smaller binary value of fixed length, referred to as a hash value. A hash value is an extremely compact and, in practice, effectively unique numeric representation of a piece of data. For a piece of plaintext, even if only one letter is changed, the generated hash value will be different, and it is computationally essentially impossible to find two pieces of plaintext corresponding to the same hash value.
In view of the above characteristics of the hash value, and the requirement of the embodiment of the present application that one point on the preset closed circular ring corresponds to an identifier of one data block or an identifier of one resource node, in a preferred implementation manner of the present application, when determining a first mapping point of a data block to be stored on the preset closed circular ring according to the identifier of the data block to be stored, a hash value of the identifier of the data block to be stored may be calculated first, and then the first mapping point of the data block to be stored on the preset closed circular ring is determined according to the hash value obtained by the calculation, so as to ensure that the point on the preset closed circular ring corresponds to the data block to be stored uniquely.
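The mapping described above can be sketched as follows. This is only a minimal illustration, not the method fixed by the application: the choice of SHA-1, the use of the first 8 bytes of the digest, and the names `RING_SIZE` and `mapping_point` are all assumptions, since the text only requires that a hash value of the identifier be calculated and mapped onto the ring.

```python
import hashlib

# Preset number of points on the closed circular ring (2^32 is the example value in the text).
RING_SIZE = 2 ** 32


def mapping_point(identifier: str, ring_size: int = RING_SIZE) -> int:
    """Map the identifier of a data block or resource node to a point on the ring."""
    # Assumption: SHA-1 is used here only as an example of "a hash value".
    digest = hashlib.sha1(identifier.encode("utf-8")).digest()
    # Read the first 8 bytes as an unsigned integer and fold it onto the ring.
    return int.from_bytes(digest[:8], "big") % ring_size


point = mapping_point("block-0001")  # "block-0001" is a hypothetical identifier
```

The same function can be applied to resource node identifiers, which is what S301 and S106 rely on.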
In an optional implementation manner of the present application, the preset node partition rule for the storage resource state of the storage node may be that the storage node is partitioned by:
acquiring the capacity of available storage resources in each storage node;
and dividing each storage node according to the preset total number of resource nodes and the obtained capacity, to obtain the resource nodes corresponding to each storage node.
For example, there are four storage nodes currently, and the total amount of the preset resource nodes is: 8, the free storage resource capacity of the four storage nodes and the division result are shown in the following table.
                          Storage node 1   Storage node 2   Storage node 3   Storage node 4
Capacity                  1G               2G               4G               1G
Number of resource nodes  1                2                4                1
Of course, the present application is described only by way of example, and the manner of dividing the storage nodes in practical application is not limited to this.
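The table above is consistent with a capacity-proportional division, which can be sketched as follows. The function name `divide_storage_nodes` and the way rounding leftovers are distributed (largest fractional share first) are assumptions made for illustration; the text does not specify how non-integer shares are handled.

```python
from typing import Dict


def divide_storage_nodes(capacities: Dict[str, int], total_resource_nodes: int) -> Dict[str, int]:
    """Split a preset total of resource nodes across storage nodes by available capacity."""
    total_capacity = sum(capacities.values())
    if total_capacity <= 0:
        raise ValueError("no available storage capacity")
    # Whole-number share of each storage node, proportional to its capacity.
    counts = {
        name: cap * total_resource_nodes // total_capacity
        for name, cap in capacities.items()
    }
    # Assumption: resource nodes left over by rounding go to the largest fractional shares.
    leftover = total_resource_nodes - sum(counts.values())
    by_fraction = sorted(
        capacities,
        key=lambda name: capacities[name] * total_resource_nodes % total_capacity,
        reverse=True,
    )
    for name in by_fraction[:leftover]:
        counts[name] += 1
    return counts


# Capacities in GB, taken from the table above.
example = divide_storage_nodes({"node1": 1, "node2": 2, "node3": 4, "node4": 1}, 8)
```

With the capacities from the table, `example` gives storage nodes 1 to 4 the counts 1, 2, 4 and 1 respectively.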
S103: Obtain the first resource node corresponding to the determined first mapping point according to the storage range of each resource node.
The storage range of each resource node is the range, on the preset closed circular ring, of the mapping points of the data blocks that can be stored by the storage node corresponding to that resource node.
It should be noted that, since a storage node may be divided into one or more resource nodes, and each resource node has its corresponding storage range, for a storage node, it may correspond to multiple storage ranges.
In an optional implementation manner of the present application, the storage range of each resource node may be determined in the manner shown in Fig. 3. Fig. 3 is a schematic flowchart of a method for determining a storage range of a resource node according to an embodiment of the present application, where the method includes:
S301: Determine the mapping point of each resource node on the preset closed circular ring according to the identifier of each resource node.
When determining the mapping point of the resource node on the preset closed circular ring, the mapping point may also be determined according to the hash value of the identifier of the resource node, and the determination method is similar to the method mentioned in S102, and is not described here again.
S302: Divide the preset closed circular ring into a plurality of point segments according to the determined positions of the mapping points of the resource nodes and a preset segmentation rule, to obtain the point segment corresponding to each resource node.
When the preset closed circular ring is divided, the number of the point segments obtained by dividing is equal to the number of the resource nodes, and the point segments are not overlapped.
Preferably, the preset segmentation rule may select a segmentation rule that minimizes the storage capacity fluctuation of each resource node, so that the storage capacities of the resource nodes are relatively balanced. For example, segmentation may be performed according to a multiple linear regression algorithm, and so on.
The storage amount of a resource node can be understood as the amount of data, among the data stored in the storage node corresponding to the resource node, whose mapping points on the preset closed circular ring fall within the storage range of the resource node.
S303: Determine the range of the points in the point segment corresponding to each resource node as the storage range of that resource node.
Specifically, referring to fig. 4, fig. 4 is a schematic diagram of a second closed circular ring provided in this embodiment, where N1.1, N2.1, N2.2, and so on are mapping points of each resource node on the closed circular ring, NP20000 represents a value of the mapping point N2.1, and NP15000-NP25000 are storage ranges of resource nodes corresponding to the mapping point N2.1.
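One possible segmentation rule for S302 and S303 can be sketched as follows. This is an assumption, not the rule prescribed by the application: each boundary is placed halfway between two adjacent mapping points, which reproduces the range 15000 to 25000 for a node mapped to 20000 only if its neighbours sit at 10000 and 30000. Those neighbour values, the ring size of 40000, and the half-open `[start, end)` convention are all hypothetical.

```python
from typing import Dict, Tuple

Range = Tuple[int, int]  # half-open [start, end); start >= end means the range wraps past 0


def storage_ranges(points: Dict[str, int], ring_size: int) -> Dict[str, Range]:
    """Cut the ring into one non-overlapping point segment per resource node.

    Requires at least two resource nodes with distinct mapping points.
    """
    ordered = sorted(points.items(), key=lambda item: item[1])
    count = len(ordered)
    ranges: Dict[str, Range] = {}
    for index, (name, point) in enumerate(ordered):
        previous_point = ordered[index - 1][1]
        next_point = ordered[(index + 1) % count][1]
        # Distances are taken modulo the ring size so that the segment may wrap around 0.
        start = (previous_point + ((point - previous_point) % ring_size) // 2) % ring_size
        end = (point + ((next_point - point) % ring_size) // 2) % ring_size
        ranges[name] = (start, end)
    return ranges


ranges = storage_ranges({"N1.1": 10000, "N2.1": 20000, "N2.2": 30000}, 40000)
```

The number of segments equals the number of resource nodes and adjacent segments share only a boundary, as required in S302.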
S104: Store the data block to be stored in the storage node corresponding to the first resource node.
Since the resource node is a child node obtained by dividing according to a preset node division rule for the storage resource state of the storage node, the corresponding relationship between the resource node and the storage node can be obtained according to the division result. And further, under the condition that the resource node is determined, the storage node corresponding to the resource node can be found according to the corresponding relation.
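Steps S103 and S104 amount to a range lookup followed by a table lookup, which can be sketched as follows. The half-open range convention, the example ranges, and the `owner` table that records which storage node each resource node was divided from are assumptions made for illustration.

```python
from typing import Dict, Tuple

Range = Tuple[int, int]  # half-open [start, end); start >= end means the range wraps past 0


def in_range(point: int, start: int, end: int) -> bool:
    """Return True if the point lies in [start, end) on the ring."""
    if start < end:
        return start <= point < end
    # Wrapping range: it covers [start, ring end) and [0, end).
    return point >= start or point < end


def locate(point: int, ranges: Dict[str, Range], owner: Dict[str, str]) -> Tuple[str, str]:
    """Find the resource node whose storage range holds the point, and its storage node."""
    for resource_node, (start, end) in ranges.items():
        if in_range(point, start, end):
            return resource_node, owner[resource_node]
    raise LookupError("no resource node covers point %d" % point)


# Hypothetical ranges and division result.
ranges = {"N1.1": (0, 15000), "N2.1": (15000, 25000), "N2.2": (25000, 0)}
owner = {"N1.1": "storage-node-1", "N2.1": "storage-node-2", "N2.2": "storage-node-2"}
target = locate(17000, ranges, owner)
```

Here `target` names resource node N2.1 and the storage node it was divided from, so the data block mapped to point 17000 would be written to that storage node.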
As can be seen from the above, in the scheme provided in this embodiment, a closed circular ring formed by a preset number of points is preset for data storage. After the identifier of a data block to be stored is obtained, the mapping point of the data block on the preset closed circular ring is determined according to the identifier, the resource node corresponding to that mapping point is obtained according to the storage range of each resource node and the position of the mapping point, and the data block is finally stored in the storage node corresponding to the obtained resource node. The resource node corresponding to a piece of data is part of the metadata of that data. When the scheme provided by the embodiment of the present application is applied to data storage, this resource node can be calculated from the identifier of the data, so that the limitation imposed on the data storage amount by resources other than the storage nodes can be reduced.
It can be understood by those skilled in the art that, as factors such as user requirements change, the current storage nodes may no longer meet the user requirements, so that storage nodes and resource nodes need to be added.
In view of this, in a specific implementation manner of the present application, referring to fig. 5, a flowchart of a second data storage method is provided, and in this embodiment, compared with the foregoing embodiment, the data storage method further includes:
S105: Receive a resource node increase request.
The resource node increase request includes at least the identifier of the resource node to be added, and may also include other information, which is not limited in this application.
It should be noted that the resource node to be added may be a resource node that is a duplicate resource node of an existing resource node, or may also be a resource node that is not a duplicate resource node of an existing resource node, and the application is not limited thereto.
S106: Determine a second mapping point of the resource node to be added on the preset closed circular ring according to the identifier of the resource node to be added.
The hash value may also be considered when determining the second mapping point of the resource node to be added on the preset closed circle, and the determination method is similar to the method mentioned in S102, and is not described herein again.
S107: Obtain the two adjacent resource nodes whose mapping points are located on either side of the second mapping point and are closest to the second mapping point.
The adjacent resource nodes on the two sides of the mapping point of the resource node to be added may be searched for in a clockwise or a counterclockwise direction, which is not limited in the present application.
On the preset closed circular ring, a plurality of mapping points may be found on each side of the mapping point of the resource node to be added. To minimize the amount of migrated data, however, it is preferable to select on each side only the one mapping point that is closest to the mapping point of the resource node to be added.
S108: Calculate the length of the storage range of the resource node to be added according to the total number of points on the closed circular ring and the total number of resource nodes.
The total number of resource nodes is equal to the sum of the number of current resource nodes and the number of resource nodes to be added.
Assuming that the total number of points on the closed circular ring is 10000, the number of current resource nodes is 199, and the number of resource nodes to be added is 1, the length of the storage range of the resource node to be added is 10000 / (199 + 1) = 50 points.
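The calculation in S108 can be written out as a short sketch. The function name `storage_range_length` and the use of integer division when the ring size is not an exact multiple of the node count are assumptions; the text only gives the evenly divisible example.

```python
def storage_range_length(total_points: int, current_nodes: int, nodes_to_add: int) -> int:
    """Length of the storage range of a resource node to be added (S108)."""
    total_nodes = current_nodes + nodes_to_add
    # Assumption: any remainder is dropped when the division is not exact.
    return total_points // total_nodes


length = storage_range_length(10000, 199, 1)
```

For the figures in the example above, `length` is 50 points.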
S109: Obtain the storage range of the resource node to be added from the storage ranges of the adjacent resource nodes and the calculated length, using a preset multiple linear regression algorithm, and adjust the storage ranges of the adjacent resource nodes.
The sum of the storage range of the resource node to be added and the adjusted storage ranges of the adjacent resource nodes is equal to the sum of the storage ranges of the adjacent resource nodes before adjustment.
As can be seen from the above, only the storage range of the adjacent resource node changes after adjustment, and the storage ranges of other resource nodes do not change.
Assume that the storage ranges of the two adjacent resource nodes are:
adjacent resource node A: [100, 200]; adjacent resource node B: (200, 350],
and that the length of the storage range of the resource node to be added is 50 points. The calculated storage range of the resource node to be added may then be [190, 240], and the adjusted storage ranges of the two adjacent resource nodes are:
adjacent resource node A: [100, 190); adjacent resource node B: (240, 350].
S110: Migrate data blocks from the storage nodes corresponding to the adjacent resource nodes to the storage node corresponding to the resource node to be added, according to the storage range of the resource node to be added and the adjusted storage ranges of the adjacent resource nodes.
Continuing with the foregoing example, when data migration is performed, data within [190, 200] of the neighboring resource node a and data within (200, 240] of the neighboring resource node B need to be migrated to the resource node to be added.
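The range adjustment and the migrated sub-ranges in this example can be sketched as follows. The sketch does not implement the multiple linear regression algorithm, whose details the text does not give; it takes the new node's range as an input and only derives the adjusted neighbour ranges and the parts to migrate. Ranges are represented with inclusive integer endpoints, so (200, 350] becomes (201, 350); that representation, the function name `split_for_new_node`, and the lack of wrap-around handling are assumptions.

```python
from typing import Tuple

Range = Tuple[int, int]  # inclusive integer endpoints, no wrap-around


def split_for_new_node(a: Range, b: Range, new: Range) -> Tuple[Range, Range, Range, Range]:
    """Return (adjusted A, adjusted B, part migrated from A, part migrated from B)."""
    a_start, a_end = a
    b_start, b_end = b
    new_start, new_end = new
    # The new range must straddle the boundary between the two adjacent resource nodes.
    if not (a_start < new_start <= a_end and b_start <= new_end < b_end):
        raise ValueError("new range must overlap both adjacent ranges")
    adjusted_a = (a_start, new_start - 1)
    adjusted_b = (new_end + 1, b_end)
    from_a = (new_start, a_end)  # data that leaves adjacent resource node A
    from_b = (b_start, new_end)  # data that leaves adjacent resource node B
    return adjusted_a, adjusted_b, from_a, from_b


adjusted_a, adjusted_b, from_a, from_b = split_for_new_node((100, 200), (201, 350), (190, 240))
```

Only the points in `from_a` and `from_b` change owner; every other resource node keeps its range, which is why the amount of migrated data stays small.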
As can be seen from the above, in the scheme provided in this embodiment, when a resource node is added, the storage range of the resource node to be added is determined only from the adjacent resource nodes whose mapping points are located on either side of the mapping point of the resource node to be added, and only the storage ranges of those adjacent resource nodes are adjusted. Few resource nodes, and therefore few storage nodes, are involved, so the data migration amount is greatly reduced compared with the prior art.
In practical application, in order to prevent data loss caused by computer failure and other reasons, duplicate resource nodes can be set for existing resource nodes.
In a specific implementation manner of the present application, the data storage method further includes:
receiving a duplicate resource node setup request for a second resource node;
and selecting the resource node from the resource nodes corresponding to the target storage node as a copy resource node of the second resource node.
The target storage node is a storage node different from the storage node corresponding to the second resource node.
Specifically, among the data stored in the storage node corresponding to the duplicate resource node, the part corresponding to the duplicate resource node is the same as the part corresponding to the second resource node among the data stored in the storage node corresponding to the second resource node.
To improve data security, the target storage node may further be a storage node located in a physical machine other than the target physical machine, where the target physical machine is the physical machine in which the storage node corresponding to the second resource node is located.
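The selection of a duplicate resource node can be sketched as follows. The two lookup tables, the preference for a different physical machine with a fallback to any other storage node, and the tie-break by smallest name are assumptions for illustration; the text only requires that the target storage node differ from the storage node of the second resource node.

```python
from typing import Dict, Optional


def pick_replica(
    resource_node: str,
    node_storage: Dict[str, str],     # resource node -> storage node (hypothetical table)
    storage_machine: Dict[str, str],  # storage node -> physical machine (hypothetical table)
) -> Optional[str]:
    """Choose a duplicate resource node hosted on a different storage node."""
    source_storage = node_storage[resource_node]
    source_machine = storage_machine[source_storage]
    # Candidates must belong to a storage node other than the source storage node.
    candidates = [n for n, s in node_storage.items() if s != source_storage]
    # Prefer candidates whose storage node sits in another physical machine.
    other_machine = [n for n in candidates if storage_machine[node_storage[n]] != source_machine]
    pool = other_machine or candidates
    # Assumption: ties are broken by name so that the choice is deterministic.
    return min(pool) if pool else None


node_storage = {"N1.1": "S1", "N2.1": "S2", "N2.2": "S2", "N3.1": "S3"}
storage_machine = {"S1": "M1", "S2": "M1", "S3": "M2"}
replica = pick_replica("N2.1", node_storage, storage_machine)
```

In this hypothetical layout `replica` is N3.1, because S3 is the only storage node outside physical machine M1.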
Specifically, referring to fig. 6, fig. 6 is a schematic diagram of a third closed circular ring provided in the embodiment of the present application, where the schematic diagram includes mapping points of resource nodes that are copies of multiple resource nodes, for example, N2.1 copies, N3.1 copies, and the like.
After the duplicate resource node is set for the second resource node, when a storage node needs to be deleted, the data stored in the storage node to be deleted is also stored in other storage nodes, so that data migration is not needed. Based on this, in a specific implementation manner of the present application, the data storage method may further include: receiving a resource node deletion request for a resource node to be deleted; judging whether a duplicate resource node of the resource node to be deleted exists; and if so, directly deleting the resource node to be deleted and adjusting, according to the storage range of the resource node to be deleted, the storage ranges of the two resource nodes whose mapping points are located on either side of, and closest to, the third mapping point corresponding to the resource node to be deleted.
In summary, setting duplicate resource nodes for the resource nodes not only improves the security of data, but also reduces the data migration amount when storage nodes are deleted, which is beneficial to the balanced distribution of data.
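The deletion case can be sketched as follows. How the freed range is shared between the two adjacent resource nodes is not specified in the text, so splitting it at its middle point is purely an assumption, as are the inclusive integer range representation, the function name `absorb_deleted_range`, and the lack of wrap-around handling.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]  # inclusive integer endpoints, no wrap-around


def absorb_deleted_range(
    previous: Range,
    deleted: Range,
    following: Range,
    has_replica: bool,
    split: Optional[int] = None,
) -> Tuple[Range, Range]:
    """Return the adjusted ranges of the two neighbours of a deleted resource node."""
    if not has_replica:
        # Without a duplicate resource node the direct deletion path does not apply.
        raise RuntimeError("no duplicate resource node exists")
    deleted_start, deleted_end = deleted
    if previous[1] + 1 != deleted_start or deleted_end + 1 != following[0]:
        raise ValueError("ranges must be contiguous")
    if split is None:
        # Assumption: the freed range is divided at its middle point.
        split = (deleted_start + deleted_end) // 2
    if not deleted_start - 1 <= split <= deleted_end:
        raise ValueError("split must fall inside the deleted range")
    return (previous[0], split), (split + 1, following[1])


new_previous, new_following = absorb_deleted_range((100, 189), (190, 240), (241, 350), True)
```

After the call the two neighbours together cover exactly the points that the three resource nodes covered before the deletion.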
In another specific implementation manner of the present application, referring to fig. 7, a flowchart of a third data storage method is provided, and compared with the foregoing embodiment, in this embodiment, after obtaining a first resource node corresponding to a first mapping point according to a storage range of each resource node (S103), the method further includes:
S111: Judge whether the first resource node has a duplicate resource node, and if so, execute S112.
S112: Store the data block to be stored in the storage node corresponding to the duplicate resource node of the first resource node.
It should be noted that the first resource node obtained in S103 may itself be regarded as a duplicate resource node of its duplicate resource node. That is, if S101-S103 determine that the data block to be stored belongs to a duplicate resource node, the data block needs to be stored in the storage node corresponding to that duplicate resource node and, at the same time, in the storage node corresponding to the resource node of which it is a duplicate.
As can be seen from the above, in the scheme provided in this embodiment, the data block to be stored is stored not only in the storage node corresponding to the obtained resource node, but also in the storage node corresponding to the duplicate resource node of the obtained resource node, so that data security is improved.
Corresponding to the data storage method, the embodiment of the application also provides a data migration method.
Fig. 8 is a schematic flowchart of a data migration method according to an embodiment of the present application, where the method includes:
S801: Receive a resource node increase request.
The resource node increase request includes at least the identifier of the resource node to be added.
S802: Determine a fourth mapping point of the resource node to be added on the preset closed circular ring according to the identifier of the resource node to be added.
The preset closed ring is a closed ring formed by a preset number of points, one point on the preset closed ring corresponds to an identifier of one data block or an identifier of one resource node, and the resource node is a sub-node obtained by dividing according to a preset node division rule aiming at a storage resource state of a storage node.
In a specific implementation manner of the present application, the preset node partition rule for the storage resource state of the storage node may be that the storage node is partitioned by:
acquiring the capacity of available storage resources in each storage node;
and dividing each storage node according to the preset total number of resource nodes and the obtained capacity, to obtain the resource nodes corresponding to each storage node.
In a preferred implementation manner of the present application, when determining the fourth mapping point of the resource node to be added on the preset closed circular ring according to the identifier of the resource node to be added, the hash value of the identifier of the resource node to be added may be calculated first, and then the fourth mapping point of the resource node to be added on the preset closed circular ring may be determined according to the hash value obtained by the calculation.
S803: Obtain the two adjacent resource nodes whose mapping points are located on either side of the fourth mapping point and are closest to the fourth mapping point.
S804: Calculate the length of the storage range of the resource node to be added according to the total number of points on the closed circular ring and the total number of resource nodes.
The total number of resource nodes is equal to the sum of the number of current resource nodes and the number of resource nodes to be added.
S805: Obtain the storage range of the resource node to be added from the storage ranges of the adjacent resource nodes and the calculated length, using a preset multiple linear regression algorithm, and adjust the storage ranges of the adjacent resource nodes.
The sum of the storage range of the resource node to be added and the adjusted storage ranges of the adjacent resource nodes is equal to the sum of the storage ranges of the adjacent resource nodes before adjustment.
S806: Migrate data blocks from the storage nodes corresponding to the adjacent resource nodes to the storage node corresponding to the resource node to be added, according to the storage range of the resource node to be added and the adjusted storage ranges of the adjacent resource nodes.
In practical application, in order to prevent data loss caused by computer failure and other reasons, duplicate resource nodes can be set for existing resource nodes. In view of this, in a specific implementation manner of the present application, the data migration method may further include:
receiving a duplicate resource node setup request for a third resource node;
and selecting the resource node from the resource nodes corresponding to the target storage node as a copy resource node of the third resource node.
The target storage node is a storage node different from the storage node corresponding to the third resource node.
To improve data security, the target storage node may further be a storage node located in a physical machine other than the target physical machine, where the target physical machine is the physical machine in which the storage node corresponding to the third resource node is located.
After the duplicate resource node is set for the third resource node, when a storage node needs to be deleted, the data stored in the storage node to be deleted is also stored in other storage nodes, so that data migration is not needed. Based on this, in a specific implementation manner of the present application, the data migration method may further include: receiving a resource node deletion request for a resource node to be deleted; judging whether a duplicate resource node of the resource node to be deleted exists; and if so, directly deleting the resource node to be deleted and adjusting, according to the storage range of the resource node to be deleted, the storage ranges of the two resource nodes whose mapping points are located on either side of, and closest to, the fourth mapping point corresponding to the resource node to be deleted.
It should be noted that, the steps and terms used in the present embodiment have the same or similar meanings as the corresponding steps and terms in the embodiments shown in fig. 1 to 7, and are not described herein.
As can be seen from the above, in the scheme provided in this embodiment, when a resource node is added, the storage range of the resource node to be added is determined only from the adjacent resource nodes whose mapping points are located on either side of the mapping point of the resource node to be added, and only the storage ranges of those adjacent resource nodes are adjusted. Few resource nodes, and therefore few storage nodes, are involved, so the data migration amount is greatly reduced compared with the prior art.
Corresponding to the data storage method, the embodiment of the application provides a data storage device.
Fig. 9 is a schematic structural diagram of a first data storage device according to an embodiment of the present application, where the first data storage device includes:
an identifier obtaining module 901, configured to obtain an identifier of a data block to be stored;
a first mapping point determining module 902, configured to determine, according to an identifier of the data block to be stored, a first mapping point of the data block to be stored on a preset closed circular ring, where the preset closed circular ring is a closed circular ring formed by a preset number of points, one point on the preset closed circular ring corresponds to the identifier of one data block or the identifier of one resource node, and the resource node is a child node obtained by partitioning according to a preset node partitioning rule for a storage resource state of a storage node;
a resource node obtaining module 903, configured to obtain, according to a storage range of each resource node, a first resource node corresponding to the first mapping point, where the storage range of each resource node is the range, on the preset closed circular ring, of the mapping points of the data blocks that can be stored by the storage node corresponding to that resource node;
a first data storage module 904, configured to store the data block to be stored to a storage node corresponding to the first resource node.
Specifically, the data storage device may further include:
a storage range determining module, configured to determine the storage range of each resource node.
In an alternative implementation manner of the present application, referring to fig. 10, a schematic structural diagram of an apparatus for determining a storage range of a resource node is provided.
In this implementation, the storage range determining module may include:
a mapping point determining submodule 1001 configured to determine, according to an identifier of each resource node, a mapping point of each resource node on the preset closed circular ring;
a point segment obtaining submodule 1002, configured to divide the preset closed circular ring into multiple point segments according to a preset segmentation rule and according to the determined position of the mapping point of each resource node, so as to obtain a point segment corresponding to each resource node, where the number of the divided point segments is equal to the number of the resource nodes, and each point segment is not overlapped;
a storage range determining submodule 1003, configured to determine the range of the points in the point segment corresponding to each resource node as the storage range of that resource node.
Specifically, the first mapping point determining module 902 may include:
a first hash value calculating submodule, configured to calculate the hash value of the identifier of the data block to be stored;
and a first mapping point determining submodule, configured to determine a first mapping point of the data block to be stored on the preset closed circular ring according to the calculated hash value.
Specifically, the preset node partition rule for the storage resource state of the storage node may include:
acquiring the capacity of available storage resources in each storage node;
and dividing each storage node according to the preset total number of resource nodes and the obtained capacity, to obtain the resource nodes corresponding to each storage node.
As can be seen from the above, in the scheme provided in this embodiment, a closed circular ring formed by a preset number of points is preset for data storage. After the identifier of a data block to be stored is obtained, the mapping point of the data block on the preset closed circular ring is determined according to the identifier, the resource node corresponding to that mapping point is obtained according to the storage range of each resource node and the position of the mapping point, and the data block is finally stored in the storage node corresponding to the obtained resource node. The resource node corresponding to a piece of data is part of the metadata of that data. When the scheme provided by the embodiment of the present application is applied to data storage, this resource node can be calculated from the identifier of the data, so that the limitation imposed on the data storage amount by resources other than the storage nodes can be reduced.
In a specific implementation manner of the present application, referring to fig. 11, a schematic structural diagram of a second data storage device is provided, and in this embodiment, compared with the foregoing embodiment, the data storage device further includes:
a first request receiving module 905, configured to receive a resource node increase request, where the resource node increase request includes: identification of resource nodes to be added;
a second mapping point determining module 906, configured to determine, according to the identifier of the resource node to be added, a second mapping point of the resource node to be added on the preset closed circular ring;
a first adjacent resource node determining module 907, configured to obtain the two adjacent resource nodes whose mapping points are located on either side of the second mapping point and are closest to the second mapping point;
a first length calculating module 908, configured to calculate the length of the storage range of the resource node to be added according to the total number of resource nodes and the total number of points on the closed circular ring, where the total number of resource nodes is equal to the sum of the current number of resource nodes and the number of resource nodes to be added;
a first storage range adjusting module 909, configured to obtain the storage range of the resource node to be added from the storage ranges of the adjacent resource nodes and the calculated length, using a preset multiple linear regression algorithm, and to adjust the storage ranges of the adjacent resource nodes, where the sum of the storage range of the resource node to be added and the adjusted storage ranges of the adjacent resource nodes is equal to the sum of the storage ranges of the adjacent resource nodes before adjustment;
a first data migration module 910, configured to migrate a data block from a storage node corresponding to the adjacent resource node to a storage node corresponding to the resource node to be added according to the storage range of the resource node to be added and the adjusted storage range of the adjacent resource node.
As can be seen from the above, in the scheme provided in this embodiment, when a resource node is added, the storage range of the resource node to be added is determined only from the adjacent resource nodes whose mapping points lie on either side of its mapping point, and only the storage ranges of those adjacent resource nodes are adjusted. Few resource nodes, and therefore few storage nodes, are involved, so the data migration amount is greatly reduced compared with the prior art.
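The node-addition flow can be illustrated with a simplified sketch. Two points are assumptions rather than the method of the embodiment: the length is taken as the number of ring points divided by the total number of resource nodes, and the preset multiple linear regression algorithm, which is not detailed in this section, is replaced by a plain even split that takes about half of the new range from each neighbour. Ranges are half-open `(start, end)` pairs that do not wrap, and all function names are hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end) on the ring, non-wrapping


def range_length(total_points: int, current_nodes: int, nodes_to_add: int = 1) -> int:
    """Length of the new storage range: ring points per resource node."""
    return total_points // (current_nodes + nodes_to_add)


def insert_between(pred: Range, succ: Range, length: int) -> Tuple[Range, Range, Range]:
    """Carve a new range of `length` points around the boundary of two neighbours.

    Stand-in for the regression-based adjustment: an even split.
    Returns (adjusted_pred, new_range, adjusted_succ).
    """
    a, b = pred
    b2, c = succ
    if b != b2:
        raise ValueError("neighbouring ranges must be contiguous")
    from_pred = min(length // 2, b - a)           # points given up by pred
    from_succ = min(length - from_pred, c - b)    # points given up by succ
    new_range = (b - from_pred, b + from_succ)
    return (a, new_range[0]), new_range, (new_range[1], c)


def blocks_to_migrate(block_points: List[int], new_range: Range) -> List[int]:
    """Mapping points of blocks that now fall in the new node's range."""
    lo, hi = new_range
    return [p for p in block_points if lo <= p < hi]
```

Only blocks whose mapping points fall inside the new range move, and they come from the two neighbouring resource nodes alone; the combined extent of the three ranges after adjustment equals that of the two neighbours before it.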
In an optional implementation manner of the present application, the data storage device may further include:
a second request receiving module, configured to receive a duplicate resource node setting request for a second resource node;
and the first resource node selection module is used for selecting a resource node from the resource nodes corresponding to the target storage node as a copy resource node of the second resource node, wherein the target storage node is a storage node different from the storage node corresponding to the second resource node.
Specifically, the target storage node may further be a storage node located in a physical machine other than a target physical machine, where the target physical machine is the physical machine in which the storage node corresponding to the second resource node is located.
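The selection constraints above can be sketched as a filter over candidate resource nodes. The `placement` table, the function name `pick_replica`, and the tie-breaking rule (smallest identifier) are assumptions for illustration; the embodiment only requires a different storage node and, in the stricter variant, a different physical machine.

```python
from typing import Dict, Optional, Tuple

# resource node -> (storage node, physical machine); hypothetical layout table
Placement = Dict[str, Tuple[str, str]]


def pick_replica(primary: str, placement: Placement) -> Optional[str]:
    """Choose a duplicate resource node on a different storage node,
    preferring one on a different physical machine."""
    p_storage, p_machine = placement[primary]
    candidates = [rn for rn, (storage, _) in placement.items()
                  if rn != primary and storage != p_storage]
    other_machine = [rn for rn in candidates if placement[rn][1] != p_machine]
    pool = other_machine or candidates
    # Deterministic tie-break (assumption): smallest identifier.
    return min(pool) if pool else None
```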
Based on the above situation, in another specific implementation manner of the present application, referring to fig. 12, a schematic structural diagram of a third data storage device is provided, and compared with the foregoing embodiment, in this embodiment, the data storage device further includes:
a first resource node determining module 911, configured to determine whether the first resource node has a duplicate resource node after the resource node obtaining module obtains the first resource node;
a second data storage module 912, configured to store the data block to be stored to a storage node corresponding to a duplicate resource node of the first resource node if the determination result of the first resource node determining module is yes.
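The write path with a duplicate resource node can be sketched as below. The in-memory dictionaries standing in for storage nodes and the names `store_block`, `replica_of` and `node_to_storage` are hypothetical and used only to make the example self-contained.

```python
from typing import Dict, List


def store_block(block_id: str, data: bytes, primary: str,
                replica_of: Dict[str, str],
                node_to_storage: Dict[str, str],
                storage: Dict[str, Dict[str, bytes]]) -> List[str]:
    """Store a block on the primary's storage node and, if the primary has a
    duplicate resource node, on that node's storage node as well."""
    targets = [primary]
    replica = replica_of.get(primary)
    if replica is not None:
        targets.append(replica)
    for resource_node in targets:
        storage.setdefault(node_to_storage[resource_node], {})[block_id] = data
    # Storage nodes that now hold the block.
    return [node_to_storage[rn] for rn in targets]
```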
In an optional implementation manner of the present application, the data storage device may further include:
a third request receiving module, configured to receive a resource node deletion request for a resource node to be deleted;
the second resource node judgment module is used for judging whether a copy resource node of the resource node to be deleted exists or not;
and the second storage range adjusting module is used for, when the judgment result of the second resource node judgment module is yes, directly deleting the resource node to be deleted and adjusting, according to the storage range of the resource node to be deleted, the storage ranges of the two resource nodes whose mapping points are located on either side of the third mapping point corresponding to the resource node to be deleted and are closest to the third mapping point.
As can be seen from the above, in the scheme provided in this embodiment, the data block to be stored is not only stored in the storage node corresponding to the obtained resource node, but also stored in the storage node corresponding to the duplicate resource node of the obtained resource node, so data security is improved.
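The deletion case can be sketched in the same simplified range model. How the freed range is shared between the two neighbours is not stated here, so the even split below is an assumption, as is returning `None` when no duplicate resource node exists (that case is not covered by the modules above). The name `delete_resource_node` is hypothetical.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]  # half-open [start, end), non-wrapping


def delete_resource_node(pred: Range, victim: Range, succ: Range,
                         has_replica: bool) -> Optional[Tuple[Range, Range]]:
    """Delete `victim` directly when it has a duplicate resource node and
    hand its storage range to the two neighbouring resource nodes."""
    if not has_replica:
        return None  # case without a duplicate is outside this sketch
    a, b = pred
    b2, c = victim
    c2, d = succ
    if b != b2 or c != c2:
        raise ValueError("ranges must be contiguous")
    mid = b + (c - b) // 2  # even split of the freed range (assumption)
    return (a, mid), (mid, d)
```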
Corresponding to the data migration method, the embodiment of the application also provides a data migration device.
Fig. 13 is a schematic structural diagram of a data migration apparatus according to an embodiment of the present application, where the apparatus includes:
a fourth request receiving module 1301, configured to receive a resource node addition request, where the resource node addition request includes: identification of resource nodes to be added;
a fourth mapping point determining module 1302, configured to determine, according to the identifier of the resource node to be added, a fourth mapping point of the resource node to be added on a preset closed circular ring, where the preset closed circular ring is a closed circular ring formed by a preset number of points, one point on the preset closed circular ring corresponds to an identifier of one data block or an identifier of one resource node, and the resource node is a child node obtained by dividing according to a preset node division rule for a storage resource state of a storage node;
a second adjacent resource node determining module 1303, configured to obtain two adjacent resource nodes whose mapping points are located on two sides of the fourth mapping point respectively and whose mapping points are closest to the fourth mapping point;
a second length calculating module 1304, configured to calculate the length of the storage range of the resource node to be added according to the total number of resource nodes and the total number of points on the closed circular ring, where the total number of resource nodes is equal to the sum of the current number of resource nodes and the number of resource nodes to be added;
a third storage range adjusting module 1305, configured to obtain the storage range of the resource node to be added from the storage ranges of the adjacent resource nodes and the calculated length, using a preset multiple linear regression algorithm, and to adjust the storage ranges of the adjacent resource nodes, where the sum of the storage range of the resource node to be added and the adjusted storage ranges of the adjacent resource nodes is equal to the sum of the storage ranges of the adjacent resource nodes before adjustment;
a second data migration module 1306, configured to migrate, according to the storage range of the resource node to be added and the adjusted storage range of the adjacent resource node, a data block from the storage node corresponding to the adjacent resource node to the storage node corresponding to the resource node to be added.
Specifically, the fourth mapping point determining module 1302 may include:
the second hash value calculation submodule is used for calculating the hash value of the identifier of the resource node to be added;
and the fourth mapping point determining submodule is used for determining a fourth mapping point of the resource node to be added on a preset closed ring according to the hash value obtained by calculation.
Specifically, the preset node partition rule for the storage resource state of the storage node may include:
acquiring the capacity of available storage resources in each storage node;
and dividing each storage node according to the preset total number of resource nodes and the obtained capacity, to obtain the resource nodes corresponding to each storage node.
Specifically, the data migration apparatus may further include:
a fifth request receiving module, configured to receive a duplicate resource node setting request for the third resource node;
and the second resource selection module is used for selecting a resource node from the resource nodes corresponding to the target storage node as a copy resource node of the third resource node, wherein the target storage node is a storage node different from the storage node corresponding to the third resource node.
Specifically, the target storage node may be a storage node located in a physical machine other than a target physical machine, where the target physical machine is the physical machine in which the storage node corresponding to the third resource node is located.
Specifically, the data migration apparatus may further include:
a sixth request receiving module, configured to receive a resource node deletion request for a resource node to be deleted;
the third resource node judgment module is used for judging whether a copy resource node of the resource node to be deleted exists or not;
and a fourth storage range adjusting module, configured to, if the determination result of the third resource node determining module is yes, directly delete the resource node to be deleted and adjust, according to the storage range of the resource node to be deleted, the storage ranges of the two resource nodes whose mapping points are located on either side of the fourth mapping point corresponding to the resource node to be deleted and are closest to the fourth mapping point.
As can be seen from the above, in the scheme provided in this embodiment, when a resource node is added, the storage range of the resource node to be added is determined only from the adjacent resource nodes whose mapping points lie on either side of its mapping point, and only the storage ranges of those adjacent resource nodes are adjusted. Few resource nodes, and therefore few storage nodes, are involved, so the data migration amount is greatly reduced compared with the prior art.
Since the device embodiments are basically similar to the method embodiments, they are described briefly; for relevant details, refer to the corresponding description of the method embodiments.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Those skilled in the art will appreciate that all or part of the steps in the above method embodiments may be implemented by a program to instruct relevant hardware to perform the steps, and the program may be stored in a computer-readable storage medium, which is referred to herein as a storage medium, such as: ROM/RAM, magnetic disk, optical disk, etc.
The above description is only for the preferred embodiment of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application are included in the protection scope of the present application.

Claims (28)

1. A method of data storage, the method comprising:
acquiring an identifier of a data block to be stored;
determining a first mapping point of the data block to be stored on a preset closed ring according to the identifier of the data block to be stored, wherein the preset closed ring is a closed ring formed by a preset number of points, one point on the preset closed ring corresponds to the identifier of one data block or the identifier of one resource node, and the resource node is a sub-node obtained by dividing according to a preset node division rule aiming at the storage resource state of the storage node; the determining a first mapping point of the data block to be stored on a preset closed circular ring according to the identifier of the data block to be stored includes: calculating the hash value of the identifier of the data block to be stored; determining a first mapping point of the data block to be stored on a preset closed circular ring according to the hash value obtained by calculation;
obtaining a first resource node corresponding to the first mapping point according to the storage range of each resource node, wherein the storage range of each resource node is: the range, on the preset closed circular ring, of the mapping points of the data blocks that the storage node corresponding to the resource node can store;
and storing the data block to be stored to a storage node corresponding to the first resource node.
2. The method of claim 1, wherein the storage scope of each resource node is determined by:
determining mapping points of the resource nodes on the preset closed circular ring according to the identifiers of the resource nodes;
dividing the preset closed ring into a plurality of point segments according to the determined mapping point position of each resource node and a preset segmentation rule to obtain the point segment corresponding to each resource node, wherein the number of the divided point segments is equal to that of the resource nodes, and the point segments are not overlapped;
and determining the range of points in the point segment corresponding to each resource node as the storage range of that resource node.
3. The method of claim 1, wherein the preset node partitioning rule for the storage resource status of the storage node comprises:
acquiring the capacity of available storage resources in each storage node;
and dividing each storage node according to the total amount of the preset resource nodes and the obtained capacity to obtain the resource nodes corresponding to each storage node.
4. The method according to any one of claims 1-3, further comprising:
receiving a resource node increase request, wherein the resource node increase request comprises: identification of resource nodes to be added;
determining a second mapping point of the resource node to be added on the preset closed circular ring according to the identifier of the resource node to be added;
obtaining two adjacent resource nodes of which mapping points are respectively positioned at two sides of the second mapping point and the mapping points are closest to the second mapping point;
calculating the length of the storage range of the resource node to be added according to the total number of resource nodes and the total number of points on the closed circular ring, wherein the total number of resource nodes is equal to the sum of the current number of resource nodes and the number of resource nodes to be added;
obtaining the storage range of the resource node to be added from the storage ranges of the adjacent resource nodes and the calculated length, using a preset multiple linear regression algorithm, and adjusting the storage ranges of the adjacent resource nodes, wherein the sum of the storage range of the resource node to be added and the adjusted storage ranges of the adjacent resource nodes is equal to the sum of the storage ranges of the adjacent resource nodes before adjustment;
and migrating the data block from the storage node corresponding to the adjacent resource node to the storage node corresponding to the resource node to be added according to the storage range of the resource node to be added and the adjusted storage range of the adjacent resource node.
5. The method of claim 1, further comprising:
receiving a duplicate resource node setup request for a second resource node;
and selecting a resource node from the resource nodes corresponding to the target storage node as a copy resource node of the second resource node, wherein the target storage node is a storage node different from the storage node corresponding to the second resource node.
6. The method of claim 5,
the target storage node is: a storage node located in a physical machine other than a target physical machine, wherein the target physical machine is the physical machine in which the storage node corresponding to the second resource node is located.
7. The method according to claim 5 or 6, wherein after obtaining the first resource node corresponding to the first mapping point according to the storage range of each resource node, the method further comprises:
judging whether the first resource node has a copy resource node;
and if so, storing the data block to be stored to a storage node corresponding to the copy resource node of the first resource node.
8. The method of claim 5 or 6, further comprising:
receiving a resource node deletion request aiming at a resource node to be deleted;
judging whether a duplicate resource node of the resource node to be deleted exists or not;
and if so, directly deleting the resource node to be deleted, and adjusting, according to the storage range of the resource node to be deleted, the storage ranges of the two resource nodes whose mapping points are located on either side of a third mapping point corresponding to the resource node to be deleted and are closest to the third mapping point.
9. A method of data migration, the method comprising:
receiving a resource node increase request, wherein the resource node increase request comprises: identification of resource nodes to be added;
determining a fourth mapping point of the resource node to be added on a preset closed ring according to the identifier of the resource node to be added, wherein the preset closed ring is a closed ring formed by a preset number of points, one point on the preset closed ring corresponds to the identifier of one data block or the identifier of one resource node, and the resource node is a sub-node obtained by dividing according to a preset node division rule aiming at the storage resource state of a storage node;
obtaining two adjacent resource nodes of which mapping points are respectively positioned at two sides of the fourth mapping point and the mapping points are closest to the fourth mapping point;
calculating the length of the storage range of the resource node to be added according to the total number of resource nodes and the total number of points on the closed circular ring, wherein the total number of resource nodes is equal to the sum of the current number of resource nodes and the number of resource nodes to be added;
obtaining the storage range of the resource node to be added from the storage ranges of the adjacent resource nodes and the calculated length, using a preset multiple linear regression algorithm, and adjusting the storage ranges of the adjacent resource nodes, wherein the sum of the storage range of the resource node to be added and the adjusted storage ranges of the adjacent resource nodes is equal to the sum of the storage ranges of the adjacent resource nodes before adjustment;
and migrating the data block from the storage node corresponding to the adjacent resource node to the storage node corresponding to the resource node to be added according to the storage range of the resource node to be added and the adjusted storage range of the adjacent resource node.
10. The method according to claim 9, wherein the determining a fourth mapping point of the resource node to be added on a preset closed circular ring according to the identifier of the resource node to be added comprises:
calculating the hash value of the identifier of the resource node to be added;
and determining a fourth mapping point of the resource node to be added on a preset closed circular ring according to the hash value obtained by calculation.
11. The method of claim 9, wherein the preset node partitioning rule for the storage resource status of the storage node comprises:
acquiring the capacity of available storage resources in each storage node;
and dividing each storage node according to the preset total number of resource nodes and the obtained capacity, to obtain the resource nodes corresponding to each storage node.
12. The method according to any one of claims 9-11, further comprising:
receiving a duplicate resource node setup request for a third resource node;
and selecting a resource node from the resource nodes corresponding to the target storage node as a copy resource node of the third resource node, wherein the target storage node is a storage node different from the storage node corresponding to the third resource node.
13. The method of claim 12,
the target storage node is: a storage node located in a physical machine other than a target physical machine, wherein the target physical machine is the physical machine in which the storage node corresponding to the third resource node is located.
14. The method according to any one of claims 9-11, further comprising:
receiving a resource node deletion request aiming at a resource node to be deleted;
judging whether a duplicate resource node of the resource node to be deleted exists or not;
and if so, directly deleting the resource node to be deleted, and adjusting, according to the storage range of the resource node to be deleted, the storage ranges of the two resource nodes whose mapping points are located on either side of a fourth mapping point corresponding to the resource node to be deleted and are closest to the fourth mapping point.
15. A data storage device, characterized in that the device comprises:
the identification obtaining module is used for obtaining the identification of the data block to be stored;
the first mapping point determining module is configured to determine, according to the identifier of the data block to be stored, a first mapping point of the data block to be stored on a preset closed ring, where the preset closed ring is a closed ring formed by a preset number of points, one point on the preset closed ring corresponds to the identifier of one data block or the identifier of one resource node, and the resource node is a child node obtained by dividing according to a preset node division rule for a storage resource state of a storage node; the first mapping point determining module includes: the first hash value calculation submodule, used for calculating the hash value of the identifier of the data block to be stored; and the first mapping point determining submodule, used for determining a first mapping point of the data block to be stored on a preset closed circular ring according to the hash value obtained by calculation;
a resource node obtaining module, configured to obtain, according to a storage range of each resource node, a first resource node corresponding to the first mapping point, where the storage range of each resource node is: the range, on the preset closed circular ring, of the mapping points of the data blocks that the storage node corresponding to the resource node can store;
and the first data storage module is used for storing the data block to be stored to the storage node corresponding to the first resource node.
16. The apparatus of claim 15, further comprising:
the storage range determining module is used for determining the storage range of each resource node;
wherein the storage range determining module comprises:
the mapping point determining submodule is used for determining the mapping point of each resource node on the preset closed circular ring according to the identifier of each resource node;
the point segment obtaining submodule is used for dividing the preset closed ring into a plurality of point segments according to the determined mapping point position of each resource node and a preset segmentation rule to obtain the point segment corresponding to each resource node, wherein the number of the divided point segments is equal to that of the resource nodes, and each point segment is not overlapped;
and the storage range determining submodule is used for determining the range of points in the point segment corresponding to each resource node as the storage range of that resource node.
17. The apparatus of claim 15, wherein the preset node partition rule for the storage resource status of the storage node comprises:
acquiring the capacity of available storage resources in each storage node;
and dividing each storage node according to the total amount of the preset resource nodes and the obtained capacity to obtain the resource nodes corresponding to each storage node.
18. The apparatus according to any one of claims 15-17, further comprising:
a first request receiving module, configured to receive a resource node increase request, where the resource node increase request includes: identification of resource nodes to be added;
the second mapping point determining module is used for determining a second mapping point of the resource node to be added on the preset closed circular ring according to the identifier of the resource node to be added;
the first adjacent resource node determining module is used for obtaining two adjacent resource nodes, wherein the mapping points are respectively positioned at two sides of the second mapping point, and the mapping points are closest to the second mapping point;
the first length calculation module is used for calculating the length of the storage range of the resource node to be added according to the total number of resource nodes and the total number of points on the closed circular ring, wherein the total number of resource nodes is equal to the sum of the current number of resource nodes and the number of resource nodes to be added;
a first storage range adjusting module, configured to obtain a storage range of the resource node to be added according to the storage range and the length of the adjacent resource node and according to a preset multiple linear regression algorithm, and adjust the storage range of the adjacent resource node, where a sum of the storage range of the resource node to be added and the storage range of the adjacent resource node after adjustment is equal to a sum of the storage ranges of the adjacent resource nodes before adjustment;
and the first data migration module is used for migrating data blocks from the storage nodes corresponding to the adjacent resource nodes to the storage nodes corresponding to the resource nodes to be added according to the storage range of the resource nodes to be added and the adjusted storage range of the adjacent resource nodes.
19. The apparatus of claim 15, further comprising:
a second request receiving module, configured to receive a duplicate resource node setting request for a second resource node;
and the first resource node selection module is used for selecting a resource node from the resource nodes corresponding to the target storage node as a copy resource node of the second resource node, wherein the target storage node is a storage node different from the storage node corresponding to the second resource node.
20. The apparatus of claim 19,
the target storage node is: a storage node located in a physical machine other than a target physical machine, wherein the target physical machine is the physical machine in which the storage node corresponding to the second resource node is located.
21. The apparatus of claim 19 or 20, further comprising:
a first resource node determining module, configured to determine whether the first resource node has a duplicate resource node after the resource node obtaining module obtains the first resource node;
and the second data storage module is used for storing the data block to be stored to the storage node corresponding to the duplicate resource node of the first resource node under the condition that the judgment result of the first resource node judgment module is yes.
22. The apparatus of claim 19 or 20, further comprising:
a third request receiving module, configured to receive a resource node deletion request for a resource node to be deleted;
the second resource node judgment module is used for judging whether a copy resource node of the resource node to be deleted exists or not;
and the second storage range adjusting module is used for, when the judgment result of the second resource node judgment module is yes, directly deleting the resource node to be deleted and adjusting, according to the storage range of the resource node to be deleted, the storage ranges of the two resource nodes whose mapping points are located on either side of the third mapping point corresponding to the resource node to be deleted and are closest to the third mapping point.
23. An apparatus for data migration, the apparatus comprising:
a fourth request receiving module, configured to receive a resource node increase request, where the resource node increase request includes: identification of resource nodes to be added;
a fourth mapping point determining module, configured to determine, according to the identifier of the resource node to be added, a fourth mapping point of the resource node to be added on a preset closed circular ring, where the preset closed circular ring is a closed circular ring formed by a preset number of points, one point on the preset closed circular ring corresponds to an identifier of one data block or an identifier of one resource node, and the resource node is a child node obtained by partitioning according to a preset node partitioning rule for a storage resource state of a storage node;
a second adjacent resource node determining module, configured to obtain two adjacent resource nodes where mapping points are located on two sides of the fourth mapping point respectively and a mapping point is closest to the fourth mapping point;
the second length calculation module is used for calculating the length of the storage range of the resource node to be added according to the total number of resource nodes and the total number of points on the closed circular ring, wherein the total number of resource nodes is equal to the sum of the current number of resource nodes and the number of resource nodes to be added;
a third storage range adjusting module, configured to obtain a storage range of the resource node to be added according to the storage range and the length of the adjacent resource node and according to a preset multiple linear regression algorithm, and adjust the storage range of the adjacent resource node, where a sum of the storage range of the resource node to be added and the storage range of the adjacent resource node after adjustment is equal to a sum of the storage ranges of the adjacent resource nodes before adjustment;
and the second data migration module is used for migrating the data block from the storage node corresponding to the adjacent resource node to the storage node corresponding to the resource node to be added according to the storage range of the resource node to be added and the adjusted storage range of the adjacent resource node.
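The neighbor lookup, length calculation and range adjustment recited in claim 23 can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the claim: the ring size `RING_SIZE`, the function names, the reading of the length as the number of ring points divided by the total number of resource nodes, and the proportional split in `split_ranges`, which merely stands in for the preset multiple linear regression algorithm that the claim names but does not detail. Storage ranges are modeled as lengths only.

```python
from bisect import bisect_left

RING_SIZE = 2 ** 16  # assumed number of points on the closed ring


def neighbors(points, new_point):
    """Return the closest existing mapping points on either side of new_point.

    `points` is a sorted list of distinct mapping points that does not
    contain new_point; the ring wraps around at both ends.
    """
    i = bisect_left(points, new_point)
    pred = points[(i - 1) % len(points)]  # nearest point before new_point
    succ = points[i % len(points)]        # nearest point after new_point
    return pred, succ


def range_length(current_nodes, added_nodes=1, ring_size=RING_SIZE):
    """Length of the new node's storage range (assumed: ring points / total nodes)."""
    return ring_size // (current_nodes + added_nodes)


def split_ranges(pred_len, succ_len, new_len):
    """Carve new_len out of the two neighbors in proportion to their lengths.

    This proportional rule is a hypothetical stand-in for the regression
    step; it only preserves the invariant that the three lengths after
    adjustment sum to the two neighbor lengths before adjustment.
    """
    total = pred_len + succ_len
    from_pred = new_len * pred_len // total
    from_succ = new_len - from_pred
    return pred_len - from_pred, succ_len - from_succ, new_len
```

Data blocks whose ring positions fall in the portion removed from each neighbor would then be migrated to the storage node backing the new resource node; that step depends on the block index and is left out of the sketch.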
24. The apparatus of claim 23, wherein the fourth mapping point determining module comprises:
a second hash value calculation submodule, configured to calculate a hash value of the identifier of the resource node to be added;
and a fourth mapping point determining submodule, configured to determine, according to the calculated hash value, the fourth mapping point of the resource node to be added on the preset closed circular ring.
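As an illustration of claim 24, the following Python sketch maps an identifier to a point on the ring by hashing it. The choice of SHA-256, the ring size of 2^32 points, the reduction by modulo, and the name `mapping_point` are all assumptions; the claim only requires that the mapping point be determined from a hash value of the identifier.

```python
import hashlib

RING_SIZE = 2 ** 32  # assumed preset number of points on the closed ring


def mapping_point(identifier: str, ring_size: int = RING_SIZE) -> int:
    """Map a resource node (or data block) identifier to a ring position."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer,
    # then fold it onto the ring.
    return int.from_bytes(digest[:8], "big") % ring_size
```

Because the function is deterministic, the same identifier always lands on the same point, which is what allows data block identifiers and resource node identifiers to share one ring.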
25. The apparatus of claim 23, wherein the preset node partitioning rule for the storage resource state of the storage node comprises:
acquiring the capacity of available storage resources in each storage node;
and dividing each storage node according to the preset total number of resource nodes and the acquired capacity, to obtain the resource nodes corresponding to each storage node.
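One possible reading of the partitioning rule in claim 25 is that each storage node receives a share of the preset total number of resource nodes proportional to its available capacity. The Python sketch below follows that reading; the largest-remainder rounding, the function name `partition_storage_nodes`, and the `name#index` identifier scheme are assumptions made for illustration and are not stated in the claim.

```python
def partition_storage_nodes(capacities, total_resource_nodes):
    """Divide storage nodes into resource nodes in proportion to capacity.

    capacities: dict mapping storage node name -> available capacity.
    Returns a dict mapping storage node name -> list of resource node ids.
    """
    total_capacity = sum(capacities.values())
    if total_capacity <= 0:
        raise ValueError("no available capacity")

    counts, remainders = {}, {}
    for name, cap in capacities.items():
        # Integer share plus the remainder used for rounding below.
        counts[name], remainders[name] = divmod(
            total_resource_nodes * cap, total_capacity
        )

    # Hand the resource nodes lost to rounding to the largest remainders
    # (ties broken by name so the result is deterministic).
    leftover = total_resource_nodes - sum(counts.values())
    by_remainder = sorted(capacities, key=lambda n: (-remainders[n], n))
    for name in by_remainder[:leftover]:
        counts[name] += 1

    return {
        name: [f"{name}#{i}" for i in range(counts[name])]
        for name in capacities
    }
```

With this rule a storage node holding half of the available capacity backs half of the resource nodes, so its share of the ring tracks its share of the capacity.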
26. The apparatus according to any one of claims 23-25, further comprising:
a fifth request receiving module, configured to receive a copy resource node setting request for a third resource node;
and a second resource selection module, configured to select a resource node from the resource nodes corresponding to a target storage node as the copy resource node of the third resource node, where the target storage node is a storage node different from the storage node corresponding to the third resource node.
27. The apparatus of claim 26, wherein
the target storage node is a storage node located in a physical machine other than a target physical machine, the target physical machine being the physical machine where the storage node corresponding to the third resource node is located.
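The placement constraints of claims 26 and 27 can be sketched as a simple filter: the copy resource node must belong to a different storage node and, under claim 27, to a different physical machine. The `ResourceNode` record, its field names, and the first-match selection policy in `pick_copy_node` are hypothetical; the claims do not say how to choose among several eligible candidates.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ResourceNode:
    node_id: str
    storage_node: str       # storage node this resource node was divided from
    physical_machine: str   # physical machine hosting that storage node


def pick_copy_node(
    primary: ResourceNode, candidates: Iterable[ResourceNode]
) -> Optional[ResourceNode]:
    """Return the first candidate satisfying both placement constraints."""
    for node in candidates:
        other_storage = node.storage_node != primary.storage_node
        other_machine = node.physical_machine != primary.physical_machine
        if other_storage and other_machine:
            return node
    return None  # no eligible copy resource node exists
```

Keeping the copy on another physical machine means the failure of one machine cannot remove both the resource node and its copy.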
28. The apparatus of any one of claims 23-25, further comprising:
a sixth request receiving module, configured to receive a resource node deletion request for a resource node to be deleted;
a third resource node judging module, configured to judge whether a copy resource node of the resource node to be deleted exists;
and a fourth storage range adjusting module, configured to, if the judgment result of the third resource node judging module is yes, directly delete the resource node to be deleted and adjust, according to the storage range of the resource node to be deleted, the storage ranges of the two resource nodes whose mapping points are located on either side of, and closest to, the fourth mapping point corresponding to the resource node to be deleted.
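The deletion path of claim 28 can be sketched as follows. The sketch assumes that storage ranges are tracked as lengths, that the freed range is split evenly between the two neighbors, and that `order` lists the resource nodes sorted by mapping point; none of these details is fixed by the claim, and the case where no copy resource node exists is outside this claim and is simply reported as `False`.

```python
def delete_with_copy(order, lengths, copies, victim):
    """Delete `victim` if it has a copy node and hand its range to its neighbors.

    order:   list of resource node ids sorted by mapping point (ring order).
    lengths: dict node id -> storage range length.
    copies:  dict node id -> list of copy resource node ids.
    Returns True if the node was deleted, False if it has no copy node.
    """
    if not copies.get(victim):
        return False  # handling without a copy node is not covered here
    if len(order) < 2:
        raise ValueError("cannot delete the only resource node")

    i = order.index(victim)
    pred = order[i - 1]                 # wraps to the last node when i == 0
    succ = order[(i + 1) % len(order)]  # wraps to the first node at the end

    freed = lengths.pop(victim)
    order.remove(victim)
    # Assumed rule: split the freed range evenly between the two neighbors.
    lengths[pred] += freed // 2
    lengths[succ] += freed - freed // 2
    return True
```

The total length across all remaining resource nodes is unchanged by the deletion, so every point on the ring still falls within some storage range.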

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610087865.5A CN107085501B (en) 2016-02-16 2016-02-16 Data storage method, data migration method, data storage device and data migration device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610087865.5A CN107085501B (en) 2016-02-16 2016-02-16 Data storage method, data migration method, data storage device and data migration device

Publications (2)

Publication Number Publication Date
CN107085501A (en) 2017-08-22
CN107085501B (en) 2020-04-03

Family

ID=59615235

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610087865.5A Active CN107085501B (en) 2016-02-16 2016-02-16 Data storage method, data migration method, data storage device and data migration device

Country Status (1)

Country Link
CN (1) CN107085501B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109063121B (en) * 2018-08-01 2024-04-05 平安科技(深圳)有限公司 Data storage method, device, computer equipment and computer storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5721899A (en) * 1994-11-16 1998-02-24 Fujitsu Limited Retrieval apparatus using compressed trie node and retrieval method thereof
CN1783082A (en) * 2004-11-30 2006-06-07 微软公司 Method and system for maintaining namespace consistency with a file system
CN104050270A (en) * 2014-06-23 2014-09-17 成都康赛信息技术有限公司 Distributed storage method based on consistent Hash algorithm
CN104283966A (en) * 2014-10-22 2015-01-14 浪潮(北京)电子信息产业有限公司 Data distribution algorithm and device of cloud storage system
CN104298541A (en) * 2014-10-22 2015-01-21 浪潮(北京)电子信息产业有限公司 Data distribution algorithm and data distribution device for cloud storage system
CN104391863A (en) * 2014-10-23 2015-03-04 中国建设银行股份有限公司 Data storage method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
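Title
Zhou Jingli, "Improved data distribution strategy for cloud storage systems" (改进的云存储系统数据分布策略), Journal of Computer Applications (《计算机应用》), 2012-02-01, vol. 32, no. 2, p. 310, first to last paragraph, and p. 311, paragraphs 1 to 2 *
Zhou Jingli, "Improved data distribution strategy for cloud storage systems" (改进的云存储系统数据分布策略), Journal of Computer Applications (《计算机应用》), 2012, vol. 32, no. 2 *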

Also Published As

Publication number Publication date
CN107085501A (en) 2017-08-22

Similar Documents

Publication Publication Date Title
KR102103130B1 (en) Method and device for writing service data to blockchain, and method for determining service subset
CN110287197B (en) Data storage method, migration method and device
CN107622091B (en) Database query method and device
CN110489059B (en) Data cluster storage method and device and computer equipment
CN110347651B (en) Cloud storage-based data synchronization method, device, equipment and storage medium
US9733835B2 (en) Data storage method and storage server
TWI694700B (en) Data processing method and device, user terminal
CN110222030B (en) Dynamic database capacity expansion method and storage medium
CN106909556B (en) Memory cluster storage balancing method and device
CN108563698B (en) Region merging method and device for HBase table
CN103440345A (en) Distributed database extension method and distributed database extension system based on relational database
KR20160124885A (en) Data processing method and apparatus
EP4339798A2 (en) Redistributing table data in database cluster
CN111522811B (en) Database processing method and device, storage medium and terminal
US9898518B2 (en) Computer system, data allocation management method, and program
US9805109B2 (en) Computer, control device for computer system, and recording medium
CN107085501B (en) Data storage method, data migration method, data storage device and data migration device
CN107656980B (en) Method applied to distributed database system and distributed database system
CN111459913B (en) Capacity expansion method and device of distributed database and electronic equipment
CN111046004B (en) Data file storage method, device, equipment and storage medium
US10614055B2 (en) Method and system for tree management of trees under multi-version concurrency control
Huang et al. Optimizing data partition for scaling out NoSQL cluster
CN113641686B (en) Data processing method, data processing apparatus, electronic device, storage medium, and program product
CN107092604B (en) File processing method and device
CN109787899B (en) Data partition routing method, device and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant