CN111756828A - Data storage method, device and equipment - Google Patents

Data storage method, device and equipment

Info

Publication number
CN111756828A
Authority
CN
China
Prior art keywords
disk
data object
node
resource domain
disk resource
Prior art date
Legal status
Granted
Application number
CN202010567658.6A
Other languages
Chinese (zh)
Other versions
CN111756828B (en)
Inventor
樊云龙
颜秉珩
Current Assignee
Guangdong Inspur Smart Computing Technology Co Ltd
Original Assignee
Guangdong Inspur Big Data Research Co Ltd
Application filed by Guangdong Inspur Big Data Research Co Ltd
Priority to CN202010567658.6A (patent CN111756828B)
Publication of CN111756828A
Priority to PCT/CN2021/076920 (patent WO2021253853A1)
Application granted
Publication of CN111756828B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data storage method in which a disk resource domain policy divides the disk resources of a node into two or more disk resource domains and defines the correspondence between objects and disk resource domains. During disk mapping, the target disk resource domain corresponding to an object is determined first, and a hash algorithm then determines which disk within that domain the object is mapped to, which yields the mapping between the object and the disk. By setting a disk resource domain policy, the method prevents an object from being mapped at random to any disk in the node and achieves purposeful mapping: an object can only be mapped to disks in its corresponding disk resource domain. This makes resource allocation more flexible and lets the storage performance of the distributed storage system be exploited more fully. The application also provides a data storage apparatus, a device, and a readable storage medium, whose technical effects correspond to those of the method.

Description

Data storage method, device and equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data storage method, an apparatus, a device, and a readable storage medium.
Background
Sheepdog is an emerging open-source distributed storage system. It adopts a fully symmetric architecture with no central node such as a metadata service, and it provides storage services as a single whole by interconnecting a large number of commodity PC servers over a network.
Unlike other distributed storage designs, Sheepdog keeps no metadata; that is, it does not record the node on which an object is stored. Instead, Sheepdog computes the mapping from an object to its storage location with a hash algorithm during the data storage process.
Computing an object's storage location with a hash algorithm has a drawback when mapping objects to disks: the hash algorithm distributes objects at random over all disks in the node, so objects cannot be organized according to a rule and mapped purposefully. For example, if each node has 4 disks, the hash algorithm distributes objects at random over all 4 disks, and the storage range of an object cannot be restricted to only disk 1 and disk 2 of the node.
Thus, in current distributed storage systems the mapping from an object to a disk is determined by a hash algorithm, and objects are distributed at random over any disk of the node. This resource allocation scheme is too rigid and limits the storage performance of the distributed storage system.
Disclosure of Invention
The application aims to provide a data storage method, apparatus, device, and readable storage medium to solve the problem that current distributed storage systems determine the mapping from an object to a disk through a hash algorithm, which makes resource allocation too rigid and limits storage performance. The specific scheme is as follows:
In a first aspect, the present application provides a data storage method, including:
determining a data object to be stored;
determining a target node to which the data object is mapped, and acquiring a disk resource domain policy of the target node, wherein the disk resources of the target node are divided into two or more disk resource domains, and the disk resource domain policy comprises a correspondence between data objects and disk resource domains and a correspondence between disks and disk resource domains;
determining the mapping between the data object and a disk by applying a consistent hash algorithm to the target disk resource domain corresponding to the data object;
and storing the data object according to the mapping between the data object and the disk.
Preferably, determining the mapping between the data object and the disk by applying a consistent hash algorithm to the target disk resource domain corresponding to the data object includes:
constructing a hash ring from the target disk resource domain corresponding to the data object;
calculating a hash value of the name of the data object by using the consistent hash algorithm;
determining the position of the data object on the hash ring according to the hash value;
and determining the mapping between the data object and the disk according to the position of the data object on the hash ring.
Preferably, before acquiring the disk resource domain policy of the target node, the method further includes:
setting the disk resource domain policy of the target node, in which high-performance disks and low-performance disks are divided into different disk resource domains.
Preferably, storing the data object according to the mapping between the data object and the disk includes:
determining storage location information of the data object according to the mapping between the data object and the disk, and storing the data object according to the storage location information, wherein the storage location information comprises a disk resource domain number, a disk number, and a virtual node number.
Preferably, determining the target node to which the data object is mapped includes:
acquiring a node resource domain policy of the current cluster, wherein the node resources of the current cluster are divided into two or more node resource domains, and the node resource domain policy comprises a correspondence between data objects and node resource domains and a correspondence between nodes and node resource domains;
and determining the mapping between the data object and a node by applying a consistent hash algorithm to the target node resource domain corresponding to the data object, to obtain the target node to which the data object is mapped.
Preferably, before acquiring the node resource domain policy of the current cluster, the method further includes:
setting the node resource domain policy of the current cluster, in which nodes in different fault domains are divided into the same node resource domain.
In a second aspect, the present application provides a data storage apparatus comprising:
an object determination module: for determining a data object to be stored;
a policy acquisition module: for determining a target node to which the data object is mapped and acquiring a disk resource domain policy of the target node, wherein the disk resources of the target node are divided into two or more disk resource domains, and the disk resource domain policy comprises a correspondence between data objects and disk resource domains and a correspondence between disks and disk resource domains;
a mapping relation determination module: for determining the mapping between the data object and a disk by applying a consistent hash algorithm to the target disk resource domain corresponding to the data object;
a storage module: for storing the data object according to the mapping between the data object and the disk.
In a third aspect, the present application provides a data storage device, comprising:
a memory: for storing a computer program;
a processor: for executing said computer program for implementing the steps of the data storage method as described above.
In a fourth aspect, the present application provides a readable storage medium having stored thereon a computer program for implementing the steps of the data storage method as described above when executed by a processor.
The data storage method provided by the application comprises the following steps: determining a data object to be stored; determining a target node to which the data object is mapped, and acquiring a disk resource domain policy of the target node, wherein the disk resources of the target node are divided into two or more disk resource domains, and the disk resource domain policy comprises a correspondence between data objects and disk resource domains and a correspondence between disks and disk resource domains; determining the mapping between the data object and a disk by applying a consistent hash algorithm to the target disk resource domain corresponding to the data object; and storing the data object according to the mapping between the data object and the disk.
In this method, the disk resource domain policy divides the disk resources of the node into two or more disk resource domains and sets the correspondence between data objects and disk resource domains. When a data object is mapped to a disk, the target disk resource domain corresponding to the data object is determined first, and the hash algorithm then determines which disk within that domain the data object is mapped to, which yields the mapping between the object and the disk. By setting a disk resource domain policy, the method prevents a data object from being mapped at random to any disk in the node and achieves purposeful mapping: a data object can only be mapped to a disk in its corresponding disk resource domain. This makes resource allocation more flexible and lets the storage performance of the distributed storage system be exploited more fully.
In addition, the present application also provides a data storage apparatus, a device, and a readable storage medium, whose technical effects correspond to those of the above method and are not described again here.
Drawings
To explain the embodiments of the present application and the prior art more clearly, the drawings needed for their description are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the object distribution on a conventional hash ring based on a consistent hash algorithm;
FIG. 2 is a flowchart of a first embodiment of a data storage method provided in the present application;
FIG. 3 is a schematic diagram of a partitioning of disk resources provided in the present application;
FIG. 4 is a schematic diagram of hash rings to which a disk resource domain policy is applied, provided in the present application;
FIG. 5 is a schematic diagram of the object distribution on hash rings to which a disk resource domain policy is applied, provided in the present application;
FIG. 6 is a detailed flowchart of S203 in the first embodiment of a data storage method provided in the present application;
FIG. 7 is a flowchart of a second embodiment of a data storage method provided in the present application;
FIG. 8 is a functional block diagram of an embodiment of a data storage apparatus provided in the present application.
Detailed Description
In order that those skilled in the art will better understand the disclosure, the following detailed description will be given with reference to the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As previously described, unlike other distributed storage designs, Sheepdog keeps no metadata; that is, it does not record the location at which an object is stored. Sheepdog calculates the mapping between an object and its storage location through a consistent hash algorithm, and the mapping can be defined as: object storage location = hash(object name). The hash value computed from the object name is then located on the hash ring, which determines the storage location of the object, as shown in FIG. 1.
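The mapping "object storage location = hash(object name)" can be sketched as follows. This is only an illustration: the choice of SHA-1, the ring size of 2^32, and the name `ring_position` are assumptions and are not taken from the application.

```python
import hashlib

RING_BITS = 32  # assumed ring size: positions lie in [0, 2**32)


def ring_position(object_name: str, bits: int = RING_BITS) -> int:
    """Return a deterministic position in [0, 2**bits) for an object name."""
    digest = hashlib.sha1(object_name.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer, then fold it into the ring range.
    return int.from_bytes(digest[:8], "big") % (1 << bits)


pos = ring_position("object 1")
```

Because the position depends only on the object name, no per-object location record needs to be kept; any client can recompute it.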
FIG. 1 is a schematic diagram of a hash ring, in which triangles represent objects and circles represent virtual nodes. In the consistent hash algorithm, both the virtual nodes corresponding to physical nodes and the virtual disks corresponding to physical disks are referred to as virtual nodes on the hash ring. The disk hash process is used as the example below.
As shown in FIG. 1, the range [0, 2^n) forms a hash ring. Assume that a certain physical node has 3 physical disks and that, under the consistent hash rule, each physical disk corresponds to 4 virtual nodes. Virtual nodes are named by "physical disk number + virtual node number"; for example, the virtual nodes of physical disk 1 are denoted vnode1.1, vnode1.2, vnode1.3, and vnode1.4. The virtual nodes are distributed randomly and uniformly at different positions on the hash ring. Assume that there are 8 objects to be stored, denoted object 1, object 2, …, object 8. A hash value is calculated from each object name, and the position of the object on the hash ring is determined by that hash value. Under the consistent hash algorithm, the objects are distributed at random over the different physical disks. In the final allocation shown in FIG. 1, object 1 is allocated to physical disk 1 and object 5 to physical disk 3; the allocation of the other objects is not described one by one.
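A minimal sketch of the conventional ring described above, with 3 physical disks, 4 virtual nodes per disk, and 8 objects. Several details are assumptions made for illustration only: virtual node positions are derived by hashing the virtual node name (the application only says they are spread randomly and uniformly), an object is assigned to the first virtual node clockwise from its own position, and SHA-1 with a 2^32 ring is used. The resulting placement will therefore not reproduce FIG. 1.

```python
import bisect
import hashlib


def ring_hash(key, bits=32):
    """Assumed hash function: SHA-1 folded into [0, 2**bits)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (1 << bits)


def build_ring(disks, vnodes_per_disk=4):
    """Return a sorted list of (position, vnode_name, disk) tuples."""
    ring = []
    for disk in disks:
        for v in range(1, vnodes_per_disk + 1):
            name = f"vnode{disk}.{v}"  # "physical disk number + virtual node number"
            ring.append((ring_hash(name), name, disk))
    ring.sort()
    return ring


def locate(ring, object_name):
    """Find the first vnode at or after the object's position, wrapping around."""
    positions = [p for p, _, _ in ring]
    i = bisect.bisect_left(positions, ring_hash(object_name)) % len(ring)
    return ring[i]


ring = build_ring([1, 2, 3])
placement = {f"object {i}": locate(ring, f"object {i}")[2] for i in range(1, 9)}
```

Every object may land on any of the three disks, which is exactly the lack of control that the application sets out to address.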
Because Sheepdog has multiple physical nodes, each with multiple physical disks, two layers of hashing are needed in practice to calculate the location of an object. The first layer is a hash ring formed by all physical nodes in the cluster, and it determines the physical node to which the object is allocated. Once that physical node is determined, all of its physical disks form a second hash ring, and the physical disk to which the object is allocated is calculated from the hash value of the object. That is, the first-layer hash yields the node location of the object, and the second-layer hash yields its disk location; together, the two layers determine the location of an object.
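The two-layer lookup can be sketched as below. The helper names (`build_ring`, `pick`, `two_layer_locate`), the cluster layout, the hash function, and the clockwise-successor rule are all assumptions for illustration, not details from the application.

```python
import bisect
import hashlib


def ring_hash(key, bits=32):
    """Assumed hash function: SHA-1 folded into [0, 2**bits)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (1 << bits)


def build_ring(members, vnodes_per_member=4):
    """Place several virtual nodes per member on a ring of (position, member)."""
    return sorted(
        (ring_hash(f"{m}#{v}"), m) for m in members for v in range(vnodes_per_member)
    )


def pick(ring, key):
    """Return the member owning the first virtual node at or after the key."""
    positions = [p for p, _ in ring]
    return ring[bisect.bisect_left(positions, ring_hash(key)) % len(ring)][1]


def two_layer_locate(cluster, object_name):
    # First layer: a ring of all physical nodes selects the node.
    node = pick(build_ring(sorted(cluster)), object_name)
    # Second layer: a ring of that node's disks selects the disk.
    disk = pick(build_ring([f"{node}/disk{d}" for d in cluster[node]]), object_name)
    return node, disk


cluster = {"node1": [1, 2, 3, 4], "node2": [1, 2, 3, 4], "node3": [1, 2, 3, 4]}
node, disk = two_layer_locate(cluster, "object 1")
```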
However, the two-layer hash mapping described above has a drawback: object mappings cannot be organized according to rules. For example, how can object 1 be distributed only on physical node 1 and physical node 2, and object 2 only on physical node 2 and physical node 3? Further, how can object 1 be distributed only on physical disks 1 and 2 of physical node 1 and physical disks 3 and 4 of physical node 2? The two-layer hash mapping of conventional distributed storage systems cannot solve this problem.
In view of the above problems, the present application provides a data storage method, apparatus, device, and readable storage medium. By setting a disk resource domain policy, they prevent a data object from being mapped at random to any disk inside a node and achieve purposeful mapping, so that a data object can only be mapped to a disk in its corresponding disk resource domain. This makes resource allocation more flexible and lets the storage performance of the distributed storage system be exploited more fully.
Referring to fig. 2, a first embodiment of a data storage method provided in the present application is described below, where the first embodiment includes:
S201, determining a data object to be stored;
S202, determining a target node to which the data object is mapped, and acquiring a disk resource domain policy of the target node, wherein the disk resources of the target node are divided into two or more disk resource domains, and the disk resource domain policy comprises a correspondence between data objects and disk resource domains and a correspondence between disks and disk resource domains;
S203, determining the mapping between the data object and a disk by applying a consistent hash algorithm to the target disk resource domain corresponding to the data object;
S204, storing the data object according to the mapping between the data object and the disk.
This embodiment defines the concept of a resource domain (domain) without changing the hash mapping. Taking the disk resource domain as an example, a disk resource domain defines a set of disk resources. In this embodiment, the disks inside a node are divided into two or more disk resource domains, and the disk resource domain policy describes the specific division, that is, which disk resource domain each disk belongs to (the correspondence between disks and disk resource domains). The disk resource domain policy also describes a custom mapping rule, that is, a directed mapping policy between objects and disk resource domains (the correspondence between data objects and disk resource domains).
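The two correspondences held by a disk resource domain policy can be pictured with a small data structure. This is a hypothetical sketch: the class name, field names, the four-disk layout, and the use of an explicit per-object table are assumptions, since the application does not specify how the correspondences are represented.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DiskResourceDomainPolicy:
    # Correspondence between disks and disk resource domains.
    disk_to_domain: Dict[int, str]
    # Correspondence between data objects and disk resource domains (assumed form).
    object_to_domain: Dict[str, str]

    def disks_in(self, domain: str) -> List[int]:
        """List the disks that belong to a disk resource domain."""
        return sorted(d for d, dom in self.disk_to_domain.items() if dom == domain)

    def candidate_disks(self, object_name: str) -> List[int]:
        """List the only disks an object is allowed to be mapped to."""
        return self.disks_in(self.object_to_domain[object_name])


policy = DiskResourceDomainPolicy(
    disk_to_domain={1: "domain-1", 2: "domain-1", 3: "domain-2", 4: "domain-2"},
    object_to_domain={"object 1": "domain-1", "object 2": "domain-2"},
)
```

With this layout, `policy.candidate_disks("object 1")` restricts object 1 to disk 1 and disk 2, the restriction that plain hashing over all four disks cannot express.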
In summary, a disk resource domain is a collection of disk resources, and a disk resource domain policy describes both the partitioning into disk resource domains and the mapping policy from objects to disk resource domains. To better explain these two concepts, a specific application is taken as an example below:
Assume that the disk resources shown in FIG. 1 are divided into two disk resource domains, denoted domain-1 and domain-2, with the division result shown in FIG. 3: the virtual nodes drawn as white circles in FIG. 3, namely vnode1.1, vnode2.2, vnode3.2, and vnode3.3, are assigned to domain-1, and the virtual nodes drawn as black circles are assigned to domain-2. Assume that the naming convention for virtual nodes is "physical disk number + disk resource domain number + number of the virtual node within the disk resource domain". The naming result is shown in FIG. 3; for example, vnode3.3 in FIG. 1 is named vnode3.1.2 in FIG. 3.
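The renaming rule can be illustrated with a short sketch. It assumes that virtual nodes are renumbered within each (disk, domain) pair in their original FIG. 1 order, which is consistent with the one example given (vnode3.3 becomes vnode3.1.2) but is otherwise not stated in the application; the function name is hypothetical.

```python
from collections import defaultdict

# Virtual nodes assigned to domain-1 in the example; all others go to domain-2.
DOMAIN_1 = {"vnode1.1", "vnode2.2", "vnode3.2", "vnode3.3"}


def rename_vnodes(old_names, domain_1_members):
    """Map FIG. 1 style names to 'disk.domain.index' style names."""
    counters = defaultdict(int)
    renamed = {}
    for old in old_names:
        disk = old[len("vnode"):].split(".")[0]
        domain = 1 if old in domain_1_members else 2
        counters[(disk, domain)] += 1  # index within this disk and domain
        renamed[old] = f"vnode{disk}.{domain}.{counters[(disk, domain)]}"
    return renamed


fig1_names = [f"vnode{d}.{v}" for d in (1, 2, 3) for v in (1, 2, 3, 4)]
renamed = rename_vnodes(fig1_names, DOMAIN_1)
```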
A hash ring is constructed for each of the two disk resource domains. In effect, the hash ring in FIG. 1 is split into two hash rings according to the partition result, as shown in FIG. 4: the virtual nodes of domain-1 form hash ring 1, and the virtual nodes of domain-2 form hash ring 2.
The disk resource domain policy also sets the correspondence between objects and disk resource domains. For the 8 objects shown in FIG. 1, as shown in FIG. 3, object 1, object 3, object 4, and object 5 correspond to domain-1, and the other objects correspond to domain-2. Then, following the consistent hash algorithm, each object is placed on its corresponding hash ring according to the hash value of its name, as shown in FIG. 5.
Comparing FIG. 1 and FIG. 5 shows that neither the mapping rule of the objects nor the positions of the virtual nodes have changed; only the names of the virtual nodes have changed. Therefore, merely by defining disk resource domains, the hash ring shown in FIG. 1 can be split into two or more hash rings, and the consistent hash distribution policy of objects on each resulting ring is unchanged.
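The following toy sketch shows a ring being split by domain while every virtual node keeps its position. The positions (multiples of 30 on a ring of 360), the ordering of virtual nodes around the ring, and the clockwise-successor lookup are invented for illustration and do not come from the figures.

```python
import bisect

# Hypothetical ring layout: 12 virtual nodes at evenly spaced toy positions.
ORDER = ["vnode1.1", "vnode2.1", "vnode3.1", "vnode1.2", "vnode2.2", "vnode3.2",
         "vnode1.3", "vnode2.3", "vnode3.3", "vnode1.4", "vnode2.4", "vnode3.4"]
RING = [(i * 30, name) for i, name in enumerate(ORDER)]
DOMAIN_1 = {"vnode1.1", "vnode2.2", "vnode3.2", "vnode3.3"}


def split_by_domain(ring, domain_1_members):
    """Split one sorted ring into per-domain rings without moving any vnode."""
    rings = {"domain-1": [], "domain-2": []}
    for pos, vnode in ring:
        key = "domain-1" if vnode in domain_1_members else "domain-2"
        rings[key].append((pos, vnode))
    return rings


def successor(ring, position):
    """Return the first (position, vnode) at or after position, wrapping around."""
    positions = [p for p, _ in ring]
    return ring[bisect.bisect_left(positions, position) % len(ring)]


rings = split_by_domain(RING, DOMAIN_1)
```

An object hashed to position 100 and assigned to domain-1 resolves to vnode2.2 at position 120, while the same position looked up on the domain-2 ring resolves to vnode1.3 at position 180; the lookup rule itself is identical on both rings.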
Thus a disk resource domain partitions the disk resources, and the virtual nodes of each resulting disk resource domain form a complete hash ring of their own. From this perspective, a disk resource domain is a collection of virtual node resources, and different combinations of virtual nodes can be realized by defining different disk resource domain policies.
Specifically, S203 above, that is, determining the mapping between the data object and the disk by applying a consistent hash algorithm to the target disk resource domain corresponding to the data object, includes the following steps, as shown in FIG. 6:
S601, constructing a hash ring from the target disk resource domain corresponding to the data object;
S602, calculating a hash value of the name of the data object by using the consistent hash algorithm;
S603, determining the position of the data object on the hash ring according to the hash value;
S604, determining the mapping between the data object and the disk according to the position of the data object on the hash ring.
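Steps S601 to S604 can be sketched as a single function. The hash function, the number of virtual nodes per disk, the derivation of virtual node positions from their names, and the function name `map_object_to_disk` are assumptions for illustration.

```python
import bisect
import hashlib


def ring_hash(key, bits=32):
    """Assumed hash function: SHA-1 folded into [0, 2**bits)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (1 << bits)


def map_object_to_disk(object_name, domain, domain_disks, vnodes_per_disk=4):
    # S601: build the hash ring from the target disk resource domain only.
    ring = sorted(
        (ring_hash(f"vnode{disk}.{domain}.{v}"), disk)
        for disk in domain_disks
        for v in range(1, vnodes_per_disk + 1)
    )
    # S602: hash the name of the data object.
    h = ring_hash(object_name)
    # S603: locate the object's position on the ring (first vnode at or after h).
    i = bisect.bisect_left([p for p, _ in ring], h) % len(ring)
    # S604: the virtual node at that position identifies the disk.
    return ring[i][1]


disk = map_object_to_disk("object 1", domain=1, domain_disks=[1, 2])
```

Because the ring contains only virtual nodes of the target domain, the result can never be a disk outside that domain.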
In the data storage method provided by this embodiment, the disk resource domain policy divides the disk resources of a node into two or more disk resource domains and sets the correspondence between data objects and disk resource domains. When a data object is mapped to a disk, the target disk resource domain corresponding to the data object is determined first, and the hash algorithm then determines which disk within that domain the data object is mapped to, which yields the mapping between the object and the disk. By setting a disk resource domain policy, the method prevents a data object from being mapped at random to any disk in the node and achieves purposeful mapping: a data object can only be mapped to specific disks, namely the disks in the disk resource domain corresponding to the data object. This makes resource allocation more flexible and lets the storage performance of the distributed storage system be exploited more fully.
A second embodiment of the data storage method provided by the present application is described in detail below. It is based on the first embodiment and extends it to a certain extent.
Specifically, the first embodiment only describes applying a disk resource domain policy in the disk mapping process; this embodiment additionally applies a node resource domain policy in the node mapping process. Referring to FIG. 7, the second embodiment specifically includes:
S701, determining a data object to be stored;
S702, acquiring a node resource domain policy of the current cluster, wherein the node resources of the current cluster are divided into two or more node resource domains, and the node resource domain policy comprises a correspondence between data objects and node resource domains and a correspondence between nodes and node resource domains;
S703, determining the mapping between the data object and a node by applying a consistent hash algorithm to the target node resource domain corresponding to the data object, to obtain the target node to which the data object is mapped;
S704, acquiring a disk resource domain policy of the target node, wherein the disk resources of the target node are divided into two or more disk resource domains, and the disk resource domain policy comprises a correspondence between data objects and disk resource domains and a correspondence between disks and disk resource domains;
S705, determining the mapping between the data object and a disk by applying a consistent hash algorithm to the target disk resource domain corresponding to the data object;
S706, storing the data object according to the mapping between the data object and the disk.
In this embodiment, a node resource domain policy and a disk resource domain policy are defined for the two layers of hash mapping of the distributed storage system, respectively: the node resource domain policy contains the partition information of the node resources, and the disk resource domain policy contains the partition information of the disk resources. The first-layer hash mapping determines the node location of the object within the node resource domain, and the second-layer hash mapping determines the disk location of the object within the disk resource domain of that node. A directed mapping policy from node to disk can therefore be implemented by defining the node resource domain policy and the disk resource domain policy in a configuration file, similar to the rule policy in Ceph.
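A configuration-driven version of the two-layer directed mapping might look like the sketch below. The JSON layout, every key name, the domain names, and the helper functions are hypothetical; the application states only that the policies are kept in a configuration file. The hash function and successor rule are likewise assumptions.

```python
import bisect
import hashlib
import json

# Hypothetical configuration format holding both policies.
POLICY_JSON = """
{
  "node_domains": {"nd-1": ["node1", "node2"], "nd-2": ["node3", "node4"]},
  "object_node_domain": {"object 1": "nd-1", "object 2": "nd-2"},
  "disk_domains": {
    "node1": {"dd-1": [1, 2], "dd-2": [3, 4]},
    "node2": {"dd-1": [3, 4], "dd-2": [1, 2]},
    "node3": {"dd-1": [1, 2], "dd-2": [3, 4]},
    "node4": {"dd-1": [1, 2], "dd-2": [3, 4]}
  },
  "object_disk_domain": {"object 1": "dd-1", "object 2": "dd-2"}
}
"""


def ring_hash(key, bits=32):
    """Assumed hash function: SHA-1 folded into [0, 2**bits)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (1 << bits)


def pick(members, key, vnodes=4):
    """Consistent-hash choice of one member from a resource domain."""
    ring = sorted((ring_hash(f"{m}#{v}"), m) for m in members for v in range(vnodes))
    i = bisect.bisect_left([p for p, _ in ring], ring_hash(key)) % len(ring)
    return ring[i][1]


def locate(policy, object_name):
    # First layer: hash only over the object's node resource domain.
    node_domain = policy["object_node_domain"][object_name]
    node = pick(policy["node_domains"][node_domain], object_name)
    # Second layer: hash only over the object's disk resource domain on that node.
    disk_domain = policy["object_disk_domain"][object_name]
    disk = pick(policy["disk_domains"][node][disk_domain], object_name)
    return node, disk


policy = json.loads(POLICY_JSON)
node, disk = locate(policy, "object 1")
```

With this configuration, object 1 can only land on disks 1 and 2 of node1 or disks 3 and 4 of node2, which is the kind of restriction posed as an open question for conventional two-layer hashing.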
As a preferred embodiment, disk resource domains may be divided according to the characteristics of the disks. Specifically, before the disk resource domain policy of the target node is acquired, the method further includes: setting the disk resource domain policy of the target node, dividing high-performance disks and low-performance disks into different disk resource domains, and storing the disk resource domain policy in a configuration file. For example, high-performance storage media are placed in one disk resource domain and low-performance storage media in another, so that tiered storage can be realized through this policy.
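One way to derive such a tiered policy is sketched below. The classification of media types (NVMe and SSD as high-performance, everything else as low-performance), the domain names, and the function name are assumptions for illustration only.

```python
HIGH_PERFORMANCE_MEDIA = {"nvme", "ssd"}  # assumed classification of media types


def tier_policy(disk_media):
    """Group disks into two disk resource domains by media type."""
    policy = {"high-performance": [], "low-performance": []}
    for disk, media in sorted(disk_media.items()):
        if media.lower() in HIGH_PERFORMANCE_MEDIA:
            policy["high-performance"].append(disk)
        else:
            policy["low-performance"].append(disk)
    return policy


policy = tier_policy({1: "ssd", 2: "hdd", 3: "nvme", 4: "hdd"})
```

Objects that need fast access would then be associated with the high-performance domain, and the remaining objects with the low-performance domain.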
As a preferred embodiment, before the obtaining the node resource domain policy of the current cluster, the method further includes: setting a node resource domain strategy of the current cluster, dividing nodes in different fault domains into the same node resource domain, and storing the node resource domain strategy in a configuration file.
For example, assume that node1, node2 and node3 are in the same rack. If the rack goes down, all three nodes go down together and any data copies stored on them fail at the same time; to address this problem, node1, node2 and node3 are usually grouped into the same fault domain. In this embodiment, node1, node2 and node3 are assigned to different node resource domains by a custom rule; that is, nodes located in different fault domains are divided into the same node resource domain. Because each node resource domain has its own hash ring, copies of an object cannot exist on these three nodes at the same time, so defining the node resource domain policy realizes the function of a fault domain.
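One way to read this rule is as a check on the node resource domain policy: no node resource domain should contain two nodes from the same fault domain. The sketch below performs that check. It is an interpretation rather than something the text prescribes, and the rack names, node4 to node6, the domain names and the function `conflicts` are hypothetical.

```python
from collections import defaultdict

# Hypothetical topology: the fault domain (here, a rack) of each node.
FAULT_DOMAIN_OF = {
    "node1": "rackA", "node2": "rackA", "node3": "rackA",
    "node4": "rackB", "node5": "rackB", "node6": "rackB",
}

# Hypothetical node resource domain policy: same-rack nodes are spread over domains.
NODE_DOMAINS = {
    "nd0": ["node1", "node4"],
    "nd1": ["node2", "node5"],
    "nd2": ["node3", "node6"],
}


def conflicts(node_domains, fault_domain_of):
    """Return {node domain: {fault domain: [nodes]}} for every fault domain
    that contributes more than one node to the same node resource domain."""
    found = {}
    for domain, nodes in node_domains.items():
        by_fault_domain = defaultdict(list)
        for node in nodes:
            by_fault_domain[fault_domain_of[node]].append(node)
        clash = {fd: ns for fd, ns in by_fault_domain.items() if len(ns) > 1}
        if clash:
            found[domain] = clash
    return found
```

An empty result means that every node resource domain draws its nodes from distinct racks, so a single rack failure removes at most one node from any domain's hash ring.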
It can be seen that the data storage method provided by this embodiment keeps the two-layer hash mapping unchanged while partitioning and integrating the nodes, and the disks on the nodes, according to certain rules to form node resource domains and disk resource domains. The method allows the user to define node resource domain policies and disk resource domain policies and maps objects to specific nodes and to disks with specific characteristics according to those policies, so that the physical resources in the distributed storage system can be used more flexibly and functions such as fault domains and tiered storage can be realized.
In the following, a data storage device provided by an embodiment of the present application is described; the data storage device described below and the data storage method described above may be referred to in correspondence with each other.
As shown in fig. 8, the data storage device of the present embodiment includes:
the object determination module 801: for determining a data object to be stored;
the policy acquisition module 802: for determining a target node to which the data object is mapped and acquiring the disk resource domain policy of the target node, wherein the disk resources of the target node are divided into more than two disk resource domains, and the disk resource domain policy includes a correspondence between data objects and disk resource domains and a correspondence between disks and disk resource domains;
the mapping relation determination module 803: for determining the mapping relationship between the data object and a disk by applying a consistent hash algorithm to the target disk resource domain corresponding to the data object;
the storage module 804: for storing the data object according to the mapping relationship between the data object and the disk.
The data storage device of this embodiment is used to implement the foregoing data storage method, so its specific implementation can be found in the foregoing embodiments of the data storage method. For example, the object determination module 801, the policy acquisition module 802, the mapping relation determination module 803 and the storage module 804 are used to implement steps S201, S202, S203 and S204 of the foregoing data storage method, respectively. Their specific implementations may therefore be found in the descriptions of the corresponding embodiments and are not repeated here.
In addition, since the data storage device of this embodiment implements the foregoing data storage method, its effects correspond to those of the method and are not repeated here.
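The division into modules 801 to 804 could be sketched as a single class whose methods correspond to the four modules. This is a hypothetical composition for illustration only: the class `DataStorageDevice`, the SHA-1 based `ring_hash`, the in-memory `disks` dictionary and the `object_rule` callable are assumptions, and the sketch takes the target node as already determined, so it only covers the disk layer. The `Location` record holds the three fields that claim 4 lists as storage location information.

```python
import bisect
import hashlib
from dataclasses import dataclass


def ring_hash(key: str) -> int:
    """Ring position of a key (SHA-1 is an arbitrary, assumed choice)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


@dataclass(frozen=True)
class Location:
    disk_domain: str   # disk resource domain number
    disk: str          # disk number
    vnode: int         # virtual node number


class DataStorageDevice:
    """Hypothetical composition of modules 801 to 804 for one target node."""

    def __init__(self, disk_policy, object_rule, vnodes=16):
        self.disk_policy = disk_policy   # disk resource domain -> list of disks
        self.object_rule = object_rule   # callable: object name -> disk resource domain
        self.vnodes = vnodes
        self.disks = {}                  # in-memory stand-in for the real disks

    def determine_object(self, name, payload):          # module 801
        return name, payload

    def acquire_policy(self, name):                     # module 802
        domain = self.object_rule(name)
        return domain, self.disk_policy[domain]

    def determine_mapping(self, name, domain, disks):   # module 803
        points = sorted(
            (ring_hash(f"{d}#{v}"), d, v) for d in disks for v in range(self.vnodes)
        )
        hashes = [p[0] for p in points]
        idx = bisect.bisect_right(hashes, ring_hash(name)) % len(points)
        _, disk, vnode = points[idx]
        return Location(domain, disk, vnode)

    def store(self, name, payload, location):           # module 804
        self.disks.setdefault(location.disk, {})[name] = payload
        return location

    def put(self, name, payload):
        name, payload = self.determine_object(name, payload)
        domain, disks = self.acquire_policy(name)
        location = self.determine_mapping(name, domain, disks)
        return self.store(name, payload, location)
```

For example, `DataStorageDevice({"fast": ["ssd0", "ssd1"], "slow": ["hdd0", "hdd1"]}, rule).put("hot/a", b"x")` returns a `Location` whose disk belongs to whichever domain the caller-supplied `rule` selects for that name.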
In addition, the present application also provides data storage equipment, including:
a memory: for storing a computer program;
a processor: for executing said computer program for implementing the steps of the data storage method as described above.
Finally, the present application provides a readable storage medium having stored thereon a computer program for implementing the steps of the data storage method as described above when executed by a processor.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is brief, and the relevant details can be found in the description of the method.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The solutions provided in the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the descriptions of the above embodiments are intended only to help in understanding the method of the present application and its core ideas. Meanwhile, a person skilled in the art may, according to the ideas of the present application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (9)

1. A method of storing data, comprising:
determining a data object to be stored;
determining a target node to which the data object is mapped, and acquiring a disk resource domain policy of the target node, wherein the disk resources of the target node are divided into more than two disk resource domains, and the disk resource domain policy comprises a correspondence between data objects and disk resource domains and a correspondence between disks and disk resource domains;
determining the mapping relationship between the data object and a disk by applying a consistent hash algorithm to the target disk resource domain corresponding to the data object; and
storing the data object according to the mapping relationship between the data object and the disk.
2. The method of claim 1, wherein determining the mapping relationship between the data object and the disk by using a consistent hashing algorithm on the target disk resource domain corresponding to the data object comprises:
constructing a hash ring according to the target disk resource domain corresponding to the data object;
calculating a hash value of the name of the data object by using a consistent hash algorithm;
determining the position of the data object on the hash ring according to the size of the hash value; and
determining the mapping relationship between the data object and the disk according to the position of the data object on the hash ring.
3. The method of claim 2, further comprising, before the acquiring of the disk resource domain policy of the target node:
setting the disk resource domain policy of the target node, and dividing high-performance disks and low-performance disks into different disk resource domains.
4. The method of claim 3, wherein storing the data object according to the mapping relationship between the data object and the disk comprises:
determining storage location information of the data object according to the mapping relationship between the data object and the disk, and storing the data object according to the storage location information, wherein the storage location information comprises a disk resource domain number, a disk number and a virtual node number.
5. The method of any of claims 1-4, wherein the determining the target node to which the data object maps comprises:
acquiring a node resource domain policy of a current cluster, wherein the node resources of the current cluster are divided into more than two node resource domains, and the node resource domain policy comprises a correspondence between data objects and node resource domains and a correspondence between nodes and node resource domains; and
determining the mapping relationship between the data object and a node by applying a consistent hash algorithm to the target node resource domain corresponding to the data object, to obtain the target node to which the data object is mapped.
6. The method of claim 5, further comprising, before the acquiring of the node resource domain policy of the current cluster:
setting the node resource domain policy of the current cluster, and dividing nodes in different fault domains into the same node resource domain.
7. A data storage device, comprising:
an object determination module: for determining a data object to be stored;
a policy acquisition module: for determining a target node to which the data object is mapped and acquiring a disk resource domain policy of the target node, wherein the disk resources of the target node are divided into more than two disk resource domains, and the disk resource domain policy comprises a correspondence between data objects and disk resource domains and a correspondence between disks and disk resource domains;
a mapping relation determination module: for determining the mapping relationship between the data object and a disk by applying a consistent hash algorithm to the target disk resource domain corresponding to the data object;
a storage module: for storing the data object according to the mapping relationship between the data object and the disk.
8. Data storage equipment, comprising:
a memory: for storing a computer program;
a processor: for executing said computer program for implementing the steps of the data storage method according to any one of claims 1 to 6.
9. A readable storage medium, having stored thereon a computer program for implementing the steps of the data storage method according to any one of claims 1-6 when executed by a processor.
CN202010567658.6A 2020-06-19 2020-06-19 Data storage method, device and equipment Active CN111756828B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010567658.6A CN111756828B (en) 2020-06-19 2020-06-19 Data storage method, device and equipment
PCT/CN2021/076920 WO2021253853A1 (en) 2020-06-19 2021-02-19 Data storage method, device and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010567658.6A CN111756828B (en) 2020-06-19 2020-06-19 Data storage method, device and equipment

Publications (2)

Publication Number Publication Date
CN111756828A true CN111756828A (en) 2020-10-09
CN111756828B CN111756828B (en) 2023-07-14

Family

ID=72675828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010567658.6A Active CN111756828B (en) 2020-06-19 2020-06-19 Data storage method, device and equipment

Country Status (2)

Country Link
CN (1) CN111756828B (en)
WO (1) WO2021253853A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112199176A (en) * 2020-10-16 2021-01-08 济南浪潮数据技术有限公司 Service processing method, device and related equipment
CN112230861A (en) * 2020-10-26 2021-01-15 金钱猫科技股份有限公司 Data storage method and terminal based on consistent hash algorithm
WO2021253853A1 (en) * 2020-06-19 2021-12-23 广东浪潮智慧计算技术有限公司 Data storage method, device and apparatus
CN116204137A (en) * 2023-05-04 2023-06-02 苏州浪潮智能科技有限公司 Distributed storage system, control method, device and equipment based on DPU

Citations (18)

Publication number Priority date Publication date Assignee Title
US8046561B1 (en) * 2006-12-22 2011-10-25 Emc Corporation Methods and apparatus for selecting a storage zone for a content unit
CN102880428A (en) * 2012-08-20 2013-01-16 华为技术有限公司 Distributed RAID (redundant array of independent disks) establishing method and device
CN103136114A (en) * 2011-11-30 2013-06-05 华为技术有限公司 Storage method and storage device
CN103645859A (en) * 2013-11-19 2014-03-19 华中科技大学 Disk array caching method for virtual SSD and SSD isomerous mirror image
CN103905540A (en) * 2014-03-25 2014-07-02 浪潮电子信息产业股份有限公司 Object storage data distribution mechanism based on two-sage Hash
CN103929500A (en) * 2014-05-06 2014-07-16 刘跃 Method for data fragmentation of distributed storage system
CN104102709A (en) * 2014-07-14 2014-10-15 浪潮(北京)电子信息产业有限公司 Disk management method and database management system
CN106201355A (en) * 2016-07-12 2016-12-07 腾讯科技(深圳)有限公司 Data processing method and device and storage system
US20160371020A1 (en) * 2015-06-16 2016-12-22 Vmware, Inc. Virtual machine data placement in a virtualized computing environment
WO2016206198A1 (en) * 2015-06-26 2016-12-29 北京百度网讯科技有限公司 Storage system
CN107832017A (en) * 2017-11-14 2018-03-23 中国石油集团川庆钻探工程有限公司地球物理勘探公司 A kind of method and device for improving geological data storage IO performances
US20180248758A1 (en) * 2017-02-27 2018-08-30 Dell Products L.P. Storage isolation domains for converged infrastructure information handling systems
CN110058822A (en) * 2019-04-26 2019-07-26 北京计算机技术及应用研究所 A kind of disk array transverse direction expanding method
CN110083312A (en) * 2019-04-28 2019-08-02 联想(北京)有限公司 Disk expansion method, device and computer equipment
CN110096227A (en) * 2019-03-28 2019-08-06 北京奇艺世纪科技有限公司 Date storage method, data processing method, device, electronic equipment and computer-readable medium
CN110347675A (en) * 2019-06-05 2019-10-18 阿里巴巴集团控股有限公司 A kind of date storage method and device
CN110489059A (en) * 2019-07-11 2019-11-22 平安科技(深圳)有限公司 The method, apparatus and computer equipment of data cluster storage
WO2020083106A1 (en) * 2018-10-25 2020-04-30 华为技术有限公司 Node expansion method in storage system and storage system

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US10114716B2 (en) * 2015-11-20 2018-10-30 International Business Machines Corporation Virtual failure domains for storage systems
CN106055706B (en) * 2016-06-23 2019-08-06 杭州迪普科技股份有限公司 A kind of cache resources storage method and device
CN112352216B (en) * 2018-06-30 2022-06-14 华为技术有限公司 Data storage method and data storage device
CN111756828B (en) * 2020-06-19 2023-07-14 广东浪潮大数据研究有限公司 Data storage method, device and equipment

Non-Patent Citations (4)

Title
FENG-BIN SUN等: "A new reliability growth model with dual-time domain — A hard disk drive perspective", 2015 ANNUAL RELIABILITY AND MAINTAINABILITY SYMPOSIUM (RAMS) *
曾碧卿等: "分布式计算中可扩展的并行I/O数据分配策略研究", 《小型微型计算机系统》 *
曾碧卿等: "分布式计算中可扩展的并行I/O数据分配策略研究", 《小型微型计算机系统》, no. 10, 21 October 2005 (2005-10-21) *
章宏灿;薛巍;: "一种双均衡的集群存储资源映射方法", 清华大学学报(自然科学版)网络.预览, no. 10 *

Cited By (6)

Publication number Priority date Publication date Assignee Title
WO2021253853A1 (en) * 2020-06-19 2021-12-23 广东浪潮智慧计算技术有限公司 Data storage method, device and apparatus
CN112199176A (en) * 2020-10-16 2021-01-08 济南浪潮数据技术有限公司 Service processing method, device and related equipment
CN112199176B (en) * 2020-10-16 2023-01-17 济南浪潮数据技术有限公司 Service processing method, device and related equipment
CN112230861A (en) * 2020-10-26 2021-01-15 金钱猫科技股份有限公司 Data storage method and terminal based on consistent hash algorithm
CN116204137A (en) * 2023-05-04 2023-06-02 苏州浪潮智能科技有限公司 Distributed storage system, control method, device and equipment based on DPU
CN116204137B (en) * 2023-05-04 2023-08-04 苏州浪潮智能科技有限公司 Distributed storage system, control method, device and equipment based on DPU

Also Published As

Publication number Publication date
CN111756828B (en) 2023-07-14
WO2021253853A1 (en) 2021-12-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant